- Embrace performance budgets and learn to live within them. For mobile, aim for a JS budget of < 170KB minified/compressed. Uncompressed this is still ~0.7MB of code. Budgets are critical to success, however, they can’t magically fix perf in isolation. Team culture, structure and enforcement matter. Building without a budget invites performance regressions and failure.
The web is bloated by user “experience”
When users access your site you’re probably sending down a lot of files, many of which are scripts. From a web browsers’ perspective this looks a little bit like this:
A large factor of this is how long it takes to download code on a mobile network and then process it on a mobile CPU.
Let’s look at mobile networks.
This chart from OpenSignal shows how consistently 4G networks are globally available and the average connection speed users in each country experience. As we can see, many countries still experience lower connection speeds than we may think.
Not only can that 350 KB of script for a median website from earlier take a while to download, the reality is if we look at popular sites, they actually ship down a lot more script than this:
Sites today will often send the following in their JS bundles:
- A client-side framework or UI library
- A state management solution (e.g. Redux)
- Polyfills (often for modern browsers that don’t need them)
- Full libraries vs. only what they use (e.g. all of lodash, Moment + locales)
- A suite of UI components (buttons, headers, sidebars etc.)
This code adds up. The more there is, the longer it will take for a page to load.
Loading a web page is like a film strip that has three key moments.
There’s: Is it happening? Is it useful? And, is it usable?
Is it happening is the moment you’re able to deliver some content to the screen. (has the navigation started? has the server started responding?)
Is it useful is the moment when you’ve painted text or content that allows the user to derive value from the experience and engage with it.
And then is it usable is the moment when a user can start meaningfully interacting with the experience and have something happen.
I mentioned this term “‘interactive” earlier, but what does that mean?
Whether a user clicks on a link, or scrolls through a page, they need to see that something is actually happening in response to their actions. An experience that can’t deliver on this will frustrate your users.
When a browser runs many of the events you’re probably going to need, it’s likely going to do it on the same thread that handles user input. This thread is called the main thread.
Avoid blocking the main thread as much as possible. For more, see “Why web developers need to care about interactivity”
Measuring the Time-to-Interactive of Google News on mobile, we observe large variance between a high-end getting interactive in ~7s vs. a low-end device getting interactive in 55s. So, what is a good target for interactivity?
When it comes to Time to Interactive, we feel your baseline should be getting interactive in under five seconds on a slow 3G connection on a median mobile device. “But, my users are all on fast networks and high-end phones!” …are they? You may be on “fast” coffee-shop WiFi but effectively only getting 2G or 3G speeds. Variability matters.
Interactivity impacts a lot of things. It can be impacted by a person loading your website on a mobile data plan, or coffee shop WiFi, or just them being on the go with intermittent connectivity.
The above scenario is an accurate depiction of what happens in Chrome when it processes everything that you send down (yes, it’s a giant emoji factory).
That means we have to be fast at the network transmission and the processing side of things for our scripts.
As of Chrome 66, V8 compiles code on a background thread, reducing compile time by up to 20%. But parse and compile are still very costly and it’s rare to see a large script actually execute in under 50ms, even if compiled off-thread.
They might take the same amount of time to download, but when it comes to processing we’re dealing with very different costs.
One of the reasons why they start to add up and matter is mobile.
Mobile is a spectrum.
If we’re fortunate, we may have a high-end or median end phone. The reality is that not all users will have those devices.
At the top we have high-end devices, like the iPhone 8, which process script relatively quickly. Lower down we have more average phones like the Moto G4 and the <$100 Alcatel 1X. Notice the difference in processing times?
Android phones are getting cheaper, not faster, over time. These devices are frequently CPU poor with tiny L2/L3 cache sizes. You are failing your average users if you’re expecting them to all have high-end hardware.
Let’s look at a more practical version of this problem using script from a real-world site. Here’s the JS processing time for CNN.com:
Now that WebPageTest supports the Alcatel 1X (~$100 phone sold in the U.S, sold out at launch) we can also look at filmstrips for CNN loading on mid-lower end hardware. Slow doesn’t even begin to describe this:
This hints that maybe we need to stop taking fast networks and fast devices for granted.
Some users are not going to be on a fast network or have the latest and greatest phone, so it’s vital that we start testing on real phones and real networks . Variability is a real issue.
“Variability is what kills the User Experience” — Ilya Grigorik. Fast devices can actually sometimes be slow. Fast networks can be slow, variability can end up making absolutely everything slow.
When variance can kill the user experience, developing with a slow baseline ensures everyone (both on fast and slow setups) benefits. If your team can take a look at their analytics and understand exactly what devices your users are actually accessing your site with, that’ll give you a hint at what devices you should probably have in the office to test your site with.
webpagetest.org/easy has a number of Moto G4 preconfigured under the “Mobile” profiles. This is valuable in case you’re unable to purchase your own set of median-class hardware for testing.
There’s a number of profiles set up there that you can use today that already have popular devices pre-configured. For example, we’ve got some median mobile devices like the Moto G4 ready for testing.
It’s also important to test on representative networks. Although I’ve talked about how important low end and median phones are, Brian Holt made this great point: it’s really important to know your audience.
Not every site needs to perform well on 2G on a low-end phone. That said, aiming for a high level of performance across the entire spectrum is not a bad thing to do.
You may have a wide range of users on the higher end of the spectrum, or in a different part of the spectrum. Just be aware of the data behind your sites, so that you can make a reasonable call on how much all of this matters.
Take advantage of caching for repeat visits. Parse time is critical for phones with slow CPUs.
If you’re a back-end or full stack developer, you know that you kind of get what you pay for with respect to CPU, disk, and network.
The shape of success is whatever let’s us send the least amount of script to users while still giving them a useful experience. Code-splitting is one option that makes this possible.
Code-splitting can be done at the page level, route level, or component level. It’s something that’s well supported by many modern libraries and frameworks through bundlers like webpack and Parcel. Guides to accomplish this are available for React, Vue.js and Angular.
Many large teams have seen big wins off the back of investing in code-splitting recently.
In an effort to rewrite their mobile web experiences to try to make sure users were able to interact with their sites as soon as possible, both Twitter and Tinder saw anywhere up to a 50% improvement in their Time to Interactive when they adopted aggressive code splitting.
Another thing many of these sites have done is adopted auditing as part of their workflow.
Bundle auditing often highlights opportunities to swap heavy dependencies (like Moment.js and its locales) for lighter alternatives (such as date-fns).
If you use webpack, you may find our repository of common library issues in bundles useful.
Measure, Optimize, Monitor, and Repeat.
Lighthouse is a tool baked into the Chrome Developer Tools. It’s also available as a Chrome extension. It gives you an in-depth performance analysis that highlights opportunities to improve performance.
Another thing you can do is make sure you’re not shipping unused code down to your users:
Tip: With coverage recording, you can interact with your app and DevTools will update how much of your bundles were used.
This can be valuable for identifying opportunities to split up scripts and defer the loading of non-critical ones until they’re needed.
What this means is that when a user navigates to other views in the experience, there’s a good chance it’s already in the browser cache, and so they experience much more reduced costs in terms of booting scripts up and getting interactive.
If you care about performance or you’ve ever worked on a performance patch for your site, you know that sometimes you could end up working on a fix, only to come back a few weeks later and find someone on your team was working on a feature and unintentionally broke the experience. It goes a little like this:
Thankfully, there are ways we can we can try to work around this, and one way is having a performance budget in place.
Performance budgets are critical because they keep everybody on the same page. They create a culture of shared enthusiasm for constantly improving the user experience and team accountability.
Budgets define measurable constraints to allow a team to meet their performance goals. As you have to live within the constraints of budgets, performance is a consideration at each step, as opposed to an after-thought.
Based on the work by Tim Kadlec, metrics for perf budgets can include:
- Milestone timings — timings based on the user-experience loading a page (e.g Time-to-Interactive). You’ll often want to pair several milestone timings to accurately represent the complete story during page load.
- Rule-based metrics — scores generated by tools such as Lighthouse or WebPageTest. Often, a single number or series to grade your site.
Alex Russell had a tweet-storm about performance budgets with a few salient points worth noting:
- “Leadership buy-in is important. The willingness to put feature work on hold to keep the overall user experience good defines thoughtful management of technology products.”
- “Performance is about culture supported by tools. Browsers optimize HTML+CSS as much as possible. Moving more of your work into JS puts the burden on your team and their tools”
- “Budgets aren’t there to make you sad. They’re there to make the organization self-correct. Teams need budgets to constrain decision space and help hitting them”
Everyone who impacts the user-experience of a site has a relationship to how it performs.
Performance is more often a cultural challenge than a technical one.
Discuss performance during planning sessions and other get-togethers. Ask business stakeholders what their performance expectations are. Do they understand how perf can impact the business metrics they care about? Ask eng. teams how they plan to address perf bottlenecks. While the answers here can be unsatisfactory, they get the conversation started.
Here’s an action plan for performance:
– Create your performance vision. This is a one-page agreement on what business stakeholders and developers consider “good performance”
– Set your performance budgets. Extract key performance indicators (KPIs) from the vision and set realistic, measurable targets from them. e.g. “Load and get interactive in 5s”. Size budgets can fall out of this. e.g “Keep JS < 170KB minified/compressed”
– Create regular reports on KPIs. This can be a regular report sent out to the business highlighting progress and success.
What about tooling for perf budgets? You can setup Lighthouse scoring budgets in continuous integration with the Lighthouse CI project:
Embracing performance budgets encourages teams to think seriously about the consequences of any decisions they make from early on in the design phases right through to the end of a milestone.
Looking for further reference? The U.S Digital Service document their approach to tracking performance with Lighthouse by setting goals and budgets for metrics like Time-to-Interactive.
Each site should have access to both lab and field performance data.
The first is Long Tasks — an API enabling you to gather real-world telemetry on tasks (and their attributed scripts) that last longer than 50 milliseconds that could block the main thread. You can record these tasks and log them back to your analytics.
First Input Delay (FID) is a metric that measures the time from when a user first interacts with your site (i.e. when they tap a button) to the time when the browser is actually able to respond to that interaction. FID is still an early metric, but we have a polyfill available for it today that you can check out.
Get fast, stay fast.
Performance is a journey. Many small changes can lead to big gains.
In the end, your users will thank you.