Over the last decade, web performance optimization has been governed by one indisputable guideline: the best request is no request. A very humble rule, easy to interpret. Every eliminated network call for a resource improves performance. Every src attribute spared, every link element dropped. But everything has changed now that HTTP/2 is available, hasn’t it? Designed for the modern web, HTTP/2 is more efficient in responding to a larger number of requests than its predecessor. So the question is: does the old rule of reducing requests still hold up?
What has changed with HTTP/2?
The first version of HTTP required a new TCP connection for every request. Updates in HTTP/1.1 tried to overcome this limitation: clients are able to use one TCP connection for multiple resources, but still have to download them in sequence. This so-called “head of line blocking” makes waterfall charts actually look like waterfalls:
Also, most browsers started to open multiple TCP connections in parallel, limited to a rather low number per domain. Even with such optimizations, HTTP/1.1 is not well-suited to the considerable number of resources of today’s websites. Hence the saying “The best request is no request.” TCP connections are costly and take time. This is why we use things like concatenation, image sprites, and inlining of resources: avoid new connections, and reuse existing ones.
HTTP/2 is fundamentally different from HTTP/1.1. It uses a single TCP connection and allows more resources to be downloaded in parallel than its predecessor. Think of this single TCP connection as one broad tunnel through which data is sent in frames. On the client, the frames are reassembled into their original resources. Using a couple of link elements to transfer style sheets is now practically as efficient as bundling all of your style sheets into one file.
All streams share the same connection, so they also share bandwidth. Depending on the number of resources, this might mean that individual resources take longer to reach the client on low-bandwidth connections.
This also means that resource prioritization is not done as easily as it was with HTTP/1.1, where the order of resources in the document had an impact on when they began to download. With HTTP/2, everything happens at the same time! The HTTP/2 spec contains information on stream prioritization, but at the time of this writing, placing control over prioritization in developers’ hands is still in the distant future.
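The bandwidth sharing described above can be illustrated with a deliberately idealized model. The sketch below assumes that all parallel downloads split the available bandwidth equally and ignores latency, TCP slow start, and any prioritization; the function names and the sizes and bandwidth used are hypothetical, not measurements.

```python
def sequential_finish_times(sizes_kb, bandwidth_kb_per_s):
    """Finish time of each resource when they download one after another."""
    finish_times = []
    elapsed = 0.0
    for size in sizes_kb:
        elapsed += size / bandwidth_kb_per_s
        finish_times.append(elapsed)
    return finish_times


def shared_finish_times(sizes_kb, bandwidth_kb_per_s):
    """Finish time of each resource when all active downloads share bandwidth equally."""
    order = sorted(range(len(sizes_kb)), key=lambda i: sizes_kb[i])
    finish_times = [0.0] * len(sizes_kb)
    elapsed = 0.0
    transferred = 0.0  # kB already received by every download that is still active
    for position, index in enumerate(order):
        active = len(sizes_kb) - position
        # The smallest remaining download finishes next; it only gets 1/active of the bandwidth.
        elapsed += (sizes_kb[index] - transferred) * active / bandwidth_kb_per_s
        transferred = sizes_kb[index]
        finish_times[index] = elapsed
    return finish_times


# Hypothetical example: four 10 kB files over a 50 kB/s link.
sizes = [10, 10, 10, 10]
sequential = sequential_finish_times(sizes, 50)
shared = shared_finish_times(sizes, 50)
print("sequential:", sequential)
print("shared:    ", shared)
```

In this model the last byte arrives at the same moment either way, but when the connection is shared no single file is complete before the others, which is why individual resources can appear slower.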
The best request is no request: cherry-picking
So what can we do to overcome the lack of waterfall resource prioritization? What about not wasting bandwidth? Think back to the first rule of performance optimization: the best request is no request. Let’s reinterpret the rule.
For example, consider a typical webpage (in this case, from Dynatrace). The screenshot below shows a piece of online documentation consisting of different components: main navigation, a footer, breadcrumbs, a sidebar, and the main article.
On other pages of the same site, we have things like a masthead, social media outlets, galleries, or other components. Each component is defined by its own markup and style sheet.
In HTTP/1.1 environments, we would typically combine all component style sheets into one CSS file. The best request is no request: one TCP connection to transfer all the CSS necessary, even for pages the user hasn’t seen yet. This can result in a huge CSS file.
The problem is compounded when a site uses a library like Bootstrap, which has reached the 300 kB mark, and adds site-specific CSS on top of it. In some cases, the amount of CSS actually required by a given page is less than 10% of the amount loaded:
The Dynatrace documentation example shown in figure 3 is built with the company’s own style library, which is tailored to the site’s specific needs as opposed to Bootstrap, which is offered as a general purpose solution. All components in the company style library combined add up to 80 kB of CSS. The CSS actually used on the page is divided among eight of those components, totaling 8.1 kB. So even though the library is tailored to the specific needs of the website, the page still uses only around 10% of the CSS it downloads.
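The share of used CSS follows directly from the two figures quoted above (80 kB downloaded, 8.1 kB used). A minimal sketch of that arithmetic, with a hypothetical helper name:

```python
def used_share(used_kb, total_kb):
    """Fraction of the downloaded CSS that the page actually uses."""
    return used_kb / total_kb


library_kb = 80.0  # all components of the style library combined
page_kb = 8.1      # CSS of the eight components used on the page

share = used_share(page_kb, library_kb)
unused_kb = library_kb - page_kb
print(f"{share:.1%} used, {unused_kb:.1f} kB transferred without being needed")
```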
HTTP/2 allows us to be much more picky when it comes to the files we want to transmit. The request itself is not as costly as it is in HTTP/1.1, so we can safely use more link elements, pointing directly to the components used on that particular page:
<link rel="stylesheet" href="/css/base.css">
<link rel="stylesheet" href="/css/typography.css">
<link rel="stylesheet" href="/css/layout.css">
<link rel="stylesheet" href="/css/navbar.css">
<link rel="stylesheet" href="/css/article.css">
<link rel="stylesheet" href="/css/footer.css">
<link rel="stylesheet" href="/css/sidebar.css">
<link rel="stylesheet" href="/css/breadcrumbs.css">
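A list like the one above does not have to be maintained by hand. As a rough sketch, assuming the build or templating step knows which components a page renders, the link elements could be generated per page. The `stylesheet_links` helper and the `/css` root are assumptions for illustration, not part of any particular framework.

```python
from html import escape


def stylesheet_links(components, css_root="/css"):
    """Return one link element per component style sheet, in the given order."""
    lines = []
    for name in components:
        href = escape(f"{css_root}/{name}.css", quote=True)
        lines.append(f'<link rel="stylesheet" href="{href}">')
    return "\n".join(lines)


# Components used by the documentation page in the example.
documentation_page = [
    "base", "typography", "layout", "navbar",
    "article", "footer", "sidebar", "breadcrumbs",
]

markup = stylesheet_links(documentation_page)
print(markup)
```

A page with a different set of components would pass a different list and receive only the style sheets it needs.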
The first image shows that including the time required for the browser to establish the initial connection, the bundle needs about 700 ms to download on regular 3G connections. The second image shows timing values for one CSS file out of the eight that make up the page. The beginning of the response (TTFB) takes as long, but since the file is a lot smaller (less than 1 kB), the content is downloaded almost immediately.
This might not seem impressive when looking at only one resource. But as shown below, since all eight style sheets are downloaded in parallel, we still can save a great deal of transfer time when compared to the bundle approach.
When running the same page through webpagetest.org on regular 3G, we can see a similar pattern. The full bundle (main.css) starts to download just after 1.5 s (yellow line) and takes 1.3 s to download; the time to first meaningful paint is around 3.5 seconds (green line):
When we split up the CSS bundle, each style sheet starts to download at 1.5 s (yellow line) and takes 315–375 ms to finish. As a result, we can reduce the time to first meaningful paint by more than one second (green line):
Per our measurements, the difference between bundled and split files has more impact on slow 3G than on regular 3G. On slow 3G, the bundle needs a total of 4.5 s to be downloaded, resulting in a time to first meaningful paint at around 7 s:
The same page with split files on slow 3G connections via webpagetest.org results in meaningful paint (green line) occurring 4 s earlier:
The interesting thing is that what was considered a performance anti-pattern in HTTP/1.1—using lots of references to resources—becomes a best practice in the HTTP/2 era. Plus, the rule stays the same! The meaning changes slightly.
The best request is no request: drop files and code your users don’t need!
Gzip (and Brotli) yield higher compression ratios when there is repetition in the data they are compressing. This means that a Gzipped bundle typically has a much smaller footprint than the same files Gzipped individually. So if you are going to download a whole set of files anyway, the compression ratio of bundled assets might outperform that of single files downloaded in parallel. Test accordingly.
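One way to test this is to compress the same style sheets once as a bundle and once individually, then compare the totals. The sketch below uses Python's standard gzip module on synthetic, hypothetical component CSS; real style sheets will produce different numbers, so treat it as a template for measuring your own files rather than as a result.

```python
import gzip


def make_component_css(name, rule_count=20):
    """Build synthetic CSS for a hypothetical component; declarations repeat across components."""
    rules = [
        f".{name}__item-{i} {{ margin: 0; padding: 8px 16px; color: #333; font-family: sans-serif; }}"
        for i in range(rule_count)
    ]
    return "\n".join(rules).encode("utf-8")


components = ["navbar", "article", "footer", "sidebar"]
files = {name: make_component_css(name) for name in components}

# Each file compressed on its own, as it would be when served separately.
split_size = sum(len(gzip.compress(css)) for css in files.values())

# All files concatenated and compressed once, as a bundle.
bundle_size = len(gzip.compress(b"\n".join(files.values())))

print(f"split files: {split_size} bytes, bundle: {bundle_size} bytes")
```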
Also, be aware of your user base. While HTTP/2 has been widely adopted, some of your users might be limited to HTTP/1.1 connections. They will suffer from split resources.
The best request is no request: caching and versioning
To this point with our example, we’ve seen how to optimize the first visit to a page. The bundle is split up into separate files and the client receives only what it needs to display on a page. This gives us the chance to look into something people tend to neglect when optimizing for performance: subsequent visits.
On subsequent visits we want to avoid re-transferring assets unnecessarily. HTTP headers like Cache-Control (and their implementation in servers like Apache and NGINX) allow us to store files on the user’s disk for a specified amount of time. Some CDN servers default that to a few minutes, others to a few hours or even days. The idea is that during a session, users shouldn’t have to download what they have already downloaded before (unless they’ve cleared their cache in the interim). For example, the following Cache-Control header directive makes sure the file is stored in any cache available, for 600 seconds.
Cache-Control: public, max-age=600
We can leverage Cache-Control to be much more strict. In our first optimization we decided to cherry-pick resources and be choosy about what we transfer to the client, so let’s store these resources on the machine for a long period of time:
Cache-Control: public, max-age=31536000
The number above is one year in seconds. The benefit of setting a high Cache-Control max-age value is that the asset will be stored by the client for a long period of time. The screenshot below shows a waterfall chart of the first visit. Every asset of the HTML file is requested:
With properly set Cache-Control headers, a subsequent visit will result in fewer requests. The screenshot below shows that all assets requested on our test domain don’t trigger a request. Assets from another domain with improperly set Cache-Control headers still trigger a request, as do resources which haven’t been found:
When it comes to invalidating the cached asset (which, incidentally, is one of the two hardest things in computer science), we simply use a new asset instead. Let’s see how that would work with our example. Browsers cache by URL, so a new file name triggers a new download. Previously, we split up our code base into reasonable chunks. A version indicator makes sure that each file name stays unique:
<link rel="stylesheet" href="/css/header.v1.css">
<link rel="stylesheet" href="/css/article.v1.css">
After a change to our article styles, we would modify the version number:
<link rel="stylesheet" href="/css/header.v1.css">
<link rel="stylesheet" href="/css/article.v2.css">
An alternative to keeping track of the file’s version is to set a revision hash based on the file’s content with automation tools.
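A revision hash of this kind can be derived from the file's bytes, so the name changes only when the content does. The following is a minimal sketch rather than the behavior of any specific build tool; the `revisioned_name` helper, the choice of SHA-256, and the eight-character digest length are assumptions for illustration.

```python
import hashlib
from pathlib import PurePosixPath


def revisioned_name(path, content, digest_length=8):
    """Insert a short content hash before the file extension of a URL path."""
    digest = hashlib.sha256(content).hexdigest()[:digest_length]
    original = PurePosixPath(path)
    return str(original.with_name(f"{original.stem}.{digest}{original.suffix}"))


old_css = b"article { color: #333; }"
new_css = b"article { color: #111; }"

old_name = revisioned_name("/css/article.css", old_css)
new_name = revisioned_name("/css/article.css", new_css)
print(old_name)
print(new_name)
```

Identical content always maps to the same name, so unchanged files keep their cached copies across deployments.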
It’s OK to store your assets on the client for a long period of time. However, your HTML should be more transient in most cases. Typically, the HTML file contains the information about which resources to download. Should you want your resources to change (such as loading article.v2.css instead of article.v1.css, as we just saw), you’ll need to update references to them in your HTML. Popular CDN servers cache HTML for no longer than six minutes, but you can decide what’s better suited for your application.
And again, the best request is no request: store files on the client as long as possible, and don’t request them over the wire ever again. Recent Firefox and Edge editions even sport an immutable directive for Cache-Control, targeting this pattern specifically.
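Put together, the caching policy described in this section could be expressed as a small rule that picks a Cache-Control value per path. This is a sketch under assumptions: the `cache_control_for` helper is hypothetical, versioned assets are recognized by the `.v1.css` naming style used above, the six-minute value for HTML mirrors the CDN figure mentioned earlier, and `immutable` is ignored by browsers that do not support it.

```python
import re

ONE_YEAR_SECONDS = 365 * 24 * 60 * 60  # 31536000
SIX_MINUTES_SECONDS = 6 * 60
DEFAULT_SECONDS = 600

# Matches names such as /css/article.v2.css (assumed versioning scheme).
VERSIONED_ASSET = re.compile(r"\.v\d+\.\w+$")


def cache_control_for(path):
    """Choose a Cache-Control header value for a URL path."""
    if VERSIONED_ASSET.search(path):
        # A new version gets a new name, so the old one can be kept indefinitely.
        return f"public, max-age={ONE_YEAR_SECONDS}, immutable"
    if path.endswith(".html") or path.endswith("/"):
        # HTML references the current asset names and must stay fresh.
        return f"public, max-age={SIX_MINUTES_SECONDS}"
    return f"public, max-age={DEFAULT_SECONDS}"


for example in ["/css/article.v2.css", "/docs/index.html", "/img/logo.png"]:
    print(example, "->", cache_control_for(example))
```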
HTTP/2 has been designed from the ground up to address the inefficiencies of HTTP/1. Triggering a large number of requests in an HTTP/2 environment is no longer inherently bad for performance; transferring unnecessary data is.
To reach the full potential of HTTP/2, we have to look at each case individually. An optimization that might be good for one website can have a negative effect on another. With all the benefits that come with HTTP/2, the golden rule of performance optimization still applies: the best request is no request. Only this time we take a look at the actual amount of data transferred.
Only transfer what your users actually need. Nothing more, nothing less.