Some websites have a high Time to First Byte (TTFB) due to server-side issues. To check whether your website has such an issue, run it through WebPageTest from the location geographically closest to your servers, with the fastest available connection setting. A First Byte time above 1.5 seconds for such a run points to a possible server-side issue.
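For a quick local approximation of the same check, a minimal sketch using only the Python standard library is shown here. It is not a substitute for WebPageTest: the number depends on where the script runs and on the network in between. The function names and the `User-Agent` value are made up for illustration; only the 1.5 second threshold comes from the text above.

```python
import http.client
import time
from urllib.parse import urlsplit

SLOW_TTFB_SECONDS = 1.5  # threshold suggested in the text above


def measure_ttfb(url, timeout=10.0):
    """Return seconds from starting the request until the first body byte arrives.

    The timing includes DNS lookup, TCP connect and TLS handshake.
    """
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc, timeout=timeout)
    else:
        conn = http.client.HTTPConnection(parts.netloc, timeout=timeout)
    path = parts.path or "/"
    if parts.query:
        path += "?" + parts.query
    start = time.perf_counter()
    try:
        conn.request("GET", path, headers={"User-Agent": "ttfb-check"})
        response = conn.getresponse()  # returns once status line and headers are read
        response.read(1)               # wait for the first byte of the body
        return time.perf_counter() - start
    finally:
        conn.close()


def looks_server_side_slow(ttfb_seconds, threshold=SLOW_TTFB_SECONDS):
    """True when a measured TTFB exceeds the threshold."""
    return ttfb_seconds > threshold


if __name__ == "__main__":
    # Hypothetical URL; replace with a page from your own website.
    ttfb = measure_ttfb("https://www.example.com/")
    print(f"TTFB: {ttfb:.3f}s, slow: {looks_server_side_slow(ttfb)}")
```

Running this from a machine close to the origin keeps network latency from dominating the measurement.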
The ideal way to address this is to find the root cause of the issue within the infrastructure, code, config or database on the server. But often, due to legacy, resource, expertise or time constraints, caching the entire HTML becomes a lucrative option to evaluate.
HTML Caching Considerations
A lot of HTML pages need not change from visitor to visitor and may be updated only after days, weeks or months. Some examples: lead generation landing pages, blog post pages (like this one), product documentation, etc. Often, the most popular pages of a website fall in this category and can be cached. In such cases, caching the HTML offers an easier way to resolve server-side slowness issues.
If your website’s most popular pages need not differ from visitor to visitor and are not updated every minute, caching them may offer a notable speed boost.
Caching HTML at CDN Edge
It is common practice to cache HTML at the origin with Redis or Varnish. Both offer powerful caching and cache invalidation capabilities. For example, Varnish Edge Side Includes lets one cache parts of a page while fetching the rest from the origin. But not every HTML page requires such powerful configurability, and pages that do not can be cached at the CDN edge. Caching at the CDN edge offers the following benefits over caching at the origin:
Implementing HTML Caching with AWS CloudFront CDN
I set up HTML caching for certain sections of this website a month ago. In fact, this post’s HTML is being served by CloudFront CDN. Below are the high-level details of the configuration:
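One way to confirm that a page really is coming from the edge is to look at the `X-Cache` response header, which CloudFront sets to values such as `Hit from cloudfront` or `Miss from cloudfront`. The sketch below is a rough helper under that assumption; the function names and the returned labels are invented for illustration.

```python
import urllib.request


def cloudfront_cache_status(x_cache_header):
    """Classify an X-Cache header value such as 'Hit from cloudfront'."""
    if not x_cache_header:
        return "unknown"
    value = x_cache_header.strip().lower()
    if "cloudfront" not in value:
        return "unknown"
    if value.startswith("hit") or value.startswith("refreshhit"):
        return "hit"
    if value.startswith("miss"):
        return "miss"
    return "other"


def fetch_cache_status(url):
    """Send a HEAD request and classify the X-Cache header of the response."""
    request = urllib.request.Request(url, method="HEAD")
    with urllib.request.urlopen(request, timeout=10) as response:
        return cloudfront_cache_status(response.headers.get("X-Cache"))


if __name__ == "__main__":
    # Hypothetical URL; use a page served through your own distribution.
    print(fetch_cache_status("https://www.example.com/post/some-post/"))
```

The first request after a deploy or invalidation is usually a miss, so it is worth requesting the page twice before drawing conclusions.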
a. Create a CloudFront Distribution to cache HTML
It is ideal to use a separate CloudFront distribution for HTML from the one that serves the static files (CSS, JS, images). This is because we invalidate the CDN cache for HTML differently than we do for static files. More on this later.
b. Create Behaviors for Cacheable Path Patterns
By default, CloudFront CDN determines whether it should cache an object by looking at the HTTP cache headers in the origin’s response. To cache HTML for certain path patterns, we should create a separate behavior for those path patterns, distinct from the default behavior.
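To make the idea of path-pattern behaviors concrete, here is a small sketch of how a request path could be mapped to a behavior: patterns are tried in order, the first match wins, and anything unmatched falls through to the default. The patterns and behavior names are hypothetical (only `/post/*` appears in this post), and Python's `fnmatchcase` is only an approximation of CloudFront's own wildcard matching.

```python
from fnmatch import fnmatchcase

# Hypothetical behaviors, listed in precedence order.
BEHAVIORS = [
    ("/post/*", "cache-html"),
    ("/docs/*", "cache-html"),
]
DEFAULT_BEHAVIOR = "default"


def pick_behavior(path, behaviors=BEHAVIORS, default=DEFAULT_BEHAVIOR):
    """Return the name of the first behavior whose pattern matches the path.

    Matching is case-sensitive; '*' matches any run of characters.
    """
    for pattern, name in behaviors:
        if fnmatchcase(path, pattern):
            return name
    return default


if __name__ == "__main__":
    for path in ("/post/html-caching/", "/docs/setup", "/cart"):
        print(path, "->", pick_behavior(path))
```

Keeping the cacheable patterns narrow means that anything personalized or frequently changing continues to follow the default behavior.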
For each of the non-default behaviors, we should specify the TTL values. These values determine how long HTML matching these URL patterns is cached (irrespective of its HTTP cache header value). For example, the TTL values can be set so that the CDN caches URLs matching a certain pattern for 7 days.
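As a rough model of why the TTL values can override the origin's cache header: CloudFront keeps the origin's `max-age` within the behavior's Minimum TTL and Maximum TTL, and falls back to the Default TTL when the origin sends no cache header. Pinning all three to 7 days (604800 seconds) is one way to get a fixed 7-day cache; that this is the exact configuration used here is an assumption. The function below is a simplified illustration that only handles `max-age`, not directives such as `no-store` or `private`.

```python
SEVEN_DAYS = 7 * 24 * 60 * 60  # 604800 seconds


def effective_ttl(origin_max_age, min_ttl, default_ttl, max_ttl):
    """Simplified model of how long the edge keeps an object, in seconds.

    origin_max_age is the origin's Cache-Control max-age, or None when the
    origin sends no cache header at all.
    """
    if origin_max_age is None:
        return default_ttl
    # Clamp the origin's value into the [min_ttl, max_ttl] range.
    return max(min_ttl, min(origin_max_age, max_ttl))


if __name__ == "__main__":
    # With all three TTLs pinned to 7 days, the origin's max-age has no effect.
    print(effective_ttl(60, SEVEN_DAYS, SEVEN_DAYS, SEVEN_DAYS))
    print(effective_ttl(None, SEVEN_DAYS, SEVEN_DAYS, SEVEN_DAYS))
```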
c. Invalidating CDN Cache
We cannot use file-name based cache busting for HTML files like we do for static files. This means we have to explicitly invalidate the HTML CDN cache during website updates, changes to pages, etc. This can be done via the AWS CLI and can be integrated within build & deploy scripts.
aws cloudfront create-invalidation --distribution-id <DISTRIBUTION_ID> --paths "/post/*"
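If the deploy script is written in Python rather than shell, the same invalidation can be issued through boto3's `create_invalidation` call. The sketch below assumes boto3 is installed and AWS credentials are configured; the helper names and the `deploy-<timestamp>` caller reference format are made up for illustration.

```python
import time


def build_invalidation_batch(paths, caller_reference=None):
    """Build the InvalidationBatch structure that CloudFront expects.

    CallerReference must be unique per request; a timestamp is used here
    as a hypothetical default.
    """
    if caller_reference is None:
        caller_reference = f"deploy-{int(time.time())}"
    return {
        "Paths": {"Quantity": len(paths), "Items": list(paths)},
        "CallerReference": caller_reference,
    }


def invalidate(distribution_id, paths):
    """Request an invalidation for the given paths on one distribution."""
    import boto3  # third-party AWS SDK, assumed to be installed

    client = boto3.client("cloudfront")
    return client.create_invalidation(
        DistributionId=distribution_id,
        InvalidationBatch=build_invalidation_batch(paths),
    )


if __name__ == "__main__":
    # Placeholder ID; substitute the ID of the distribution that serves HTML.
    invalidate("<DISTRIBUTION_ID>", ["/post/*"])
```

Calling this as the last step of a deploy keeps the edge cache from serving HTML that predates the release.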
If some of the most popular pages on your website are anonymous static pages that do not change frequently, caching them at the CDN edge can boost their performance more than caching them at the origin. However, it is important to work out their cache invalidation in sync with the website’s build & update process.