In the last blog post we set up CloudFront. Done deal, right? Wrong!

Just because we set up a CDN does not mean we did it right. If you look carefully at the output of our curl request for the sample image, it's missing an important HTTP response header, cache-control, and there is no age header either.

curl -I https://cdn.awsary.com/blog/Arch_Amazon-CloudFront_64%405x.png

HTTP/2 200
content-type: image/png
content-length: 159035
date: Sun, 26 Feb 2023 22:23:01 GMT
last-modified: Sun, 26 Feb 2023 22:19:18 GMT
etag: "9f169178643a2f885f8697610dc677de"
x-amz-server-side-encryption: AES256
accept-ranges: bytes
server: AmazonS3
x-cache: RefreshHit from cloudfront
via: 1.1 e4fc537726e6de98f17edd9f0158561a.cloudfront.net (CloudFront)
x-amz-cf-pop: LIS50-C1
x-amz-cf-id: 4gkSGILVYL3LXKyDUZ2TIVCRCv0rQPnKjutZsmzZTLoLDpeTGY6guA==
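To make that check repeatable, here is a minimal sketch of a shell helper. The function name missing_cache_headers is made up for this example and is not part of curl. It reads response headers from stdin and prints whichever of cache-control and age are absent.

```shell
# Print the caching headers that are absent from the response
# headers read on stdin, one header name per line.
missing_cache_headers() {
  # curl prints CRLF line endings, so drop the carriage returns first
  headers=$(tr -d '\r')
  for name in cache-control age; do
    printf '%s\n' "$headers" | grep -qi "^${name}:" || echo "$name"
  done
}

# Usage (performs a real request):
#   curl -sI 'https://cdn.awsary.com/blog/Arch_Amazon-CloudFront_64%405x.png' | missing_cache_headers
```

For the response shown above it would print both names, since neither header is present.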

Looking at blog posts and documentation, most people suggest setting up a Lambda@Edge function that sets these values, or going to S3 and configuring them there, but then it needs to be done object by object. I thought about configuring an S3 event to trigger a Lambda whenever an S3 object is saved, to automate this second approach, but there has to be a better way, right?!
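For reference, the manual S3 route for a single object could look roughly like the sketch below. It assumes the AWS CLI is installed and configured; the function name set_cache_control, the bucket and the key are hypothetical. It copies the object onto itself while replacing its metadata, which is a common way to add a Cache-Control value to an object that already exists.

```shell
# Rewrite one object's metadata in place so that S3 serves it with a
# Cache-Control header.
# Arguments: bucket, key, Cache-Control value, content type.
# With --metadata-directive REPLACE the existing metadata is replaced,
# so the content type is passed again explicitly.
set_cache_control() {
  aws s3 cp "s3://$1/$2" "s3://$1/$2" \
    --metadata-directive REPLACE \
    --cache-control "$3" \
    --content-type "$4"
}

# Usage (bucket and key are hypothetical):
#   set_cache_control my-bucket blog/example.png 'public, max-age=31536000' image/png
```

This is exactly the object-by-object work mentioned above, which is why it is not very appealing as a long-term solution.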

Reading the Terraform module documentation I did not find any hint, but since it's open source, why not go one step deeper and take a look at the code?

Lines 140, 145, 146 and 147 intrigued me; they look like something that will suit me.

I eventually found the CloudFront managed cache policies, and CachingOptimized looks like exactly what I'm looking for. It supports Gzip and Brotli compression, so let's give it a try.

Add a line of code:

cache_policy_id = "658327ea-f89d-4fab-a63d-7e88639e58f6"

And do a terraform apply:

Terraform used the selected providers to generate the following execution plan. Resource actions are indicated with the following symbols:
  ~ update in-place

Terraform will perform the following actions:

  # module.cloudfront.aws_cloudfront_distribution.this[0] will be updated in-place
  ~ resource "aws_cloudfront_distribution" "this" {
        id                             = "E6DJC1VIGZDGG"
        tags                           = {}
        # (19 unchanged attributes hidden)

      ~ default_cache_behavior {
          + cache_policy_id            = "658327ea-f89d-4fab-a63d-7e88639e58f6"
            # (12 unchanged attributes hidden)

            # (1 unchanged block hidden)
        }

        # (4 unchanged blocks hidden)
    }

Plan: 0 to add, 1 to change, 0 to destroy.

I ran into an error:

Error: updating CloudFront Distribution (E6DJC1VIGZDGG): InvalidArgument: The parameter ForwardedValues cannot be used when a cache policy is associated to the cache behavior.

It looks like, to enable this cache policy, we need to disable the forwarding of headers and query strings. Once again the code, this time on line 150, looks like a good hint. Let's try it.

use_forwarded_values = false

Run terraform apply again and, this time, no more errors.

Remember that CloudFront is a constellation of servers all over the world, and it takes some time for new settings to apply. Check your console to see if the distribution is still Deploying before running any tests.
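If you prefer the terminal over the console, a minimal sketch like the one below can do the waiting for you. It assumes the AWS CLI is installed and has credentials that can read the distribution; the wrapper name wait_for_distribution is made up for this example.

```shell
# Block until the given CloudFront distribution reports that it is deployed.
# Argument: the distribution ID.
wait_for_distribution() {
  aws cloudfront wait distribution-deployed --id "$1"
}

# Usage, with the distribution ID from the plan output:
#   wait_for_distribution E6DJC1VIGZDGG && echo "deployed"
```

The wait command polls the distribution status on your behalf and returns once the change has propagated, or fails if it gives up first.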

Let's get the file again and see if we hit or miss the cache:

curl -I https://cdn.awsary.com/blog/Arch_Amazon-CloudFront_64%405x.png

HTTP/2 200
content-type: image/png
content-length: 159035
last-modified: Sun, 26 Feb 2023 22:19:18 GMT
x-amz-server-side-encryption: AES256
accept-ranges: bytes
server: AmazonS3
date: Mon, 27 Feb 2023 22:14:04 GMT
etag: "9f169178643a2f885f8697610dc677de"
vary: Accept-Encoding
x-cache: Hit from cloudfront
via: 1.1 9286764bc0c8327719870fa33a225c9a.cloudfront.net (CloudFront)
x-amz-cf-pop: LIS50-C1
x-amz-cf-id: 83ThgLNzjzgmkb_VmaqZ1UyPQCb6qHDJDyfc8cHqH90RpWRaPUZleA==
age: 214

Now we are seeing x-cache: Hit from cloudfront all the time, where before it was very inconsistent, and we can see an age header as well. 🥳
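A single request does not say much about consistency, so here is a small sketch for tallying the x-cache values over several requests. The function name tally_x_cache is made up for this example. It reads one or more header dumps from stdin and prints a count per distinct x-cache value, most frequent first.

```shell
# Count the distinct x-cache values found in the response headers
# read on stdin and print "<count> <value>", most frequent first.
tally_x_cache() {
  tr -d '\r' |
    awk -F': ' 'tolower($1) == "x-cache" { n[$2]++ }
                END { for (v in n) print n[v], v }' |
    sort -rn
}

# Usage (performs ten real requests):
#   url='https://cdn.awsary.com/blog/Arch_Amazon-CloudFront_64%405x.png'
#   for i in 1 2 3 4 5 6 7 8 9 10; do curl -sI "$url"; done | tally_x_cache
```

With the cache policy in place, the expectation is a single line dominated by Hit from cloudfront, though the exact mix depends on which edge location answers.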

While content is being cached on the CDN and requests to S3 have been reduced significantly, without a cache-control header we depend on each browser's implementation to decide whether or not to cache locally on the client side.

I ran some simple requests in Chrome and Safari, and while it looks like Chrome is caching content locally, Safari is being stricter with the headers and not caching without cache-control set. 😅

I'm not satisfied, especially because we are going to fetch mainly from the iOS app, and the networking APIs that SwiftUI calls will probably behave similarly to Safari's.

Let's try manually setting these values and run another test.

min_ttl = 1
default_ttl = 31536000
max_ttl = 86400

No luck. These TTL settings only control how long CloudFront itself keeps an object; they do not add a cache-control header to the response. It looks like for now we will just have caching at the CDN, avoiding requests to S3, but to actually get a cache-control header we need to either set it on S3 or use a Lambda@Edge function. If you have any other suggestion, feel free to open an issue or a pull request.

I will consider this a half-win for today. 😜