Making Facebook 2x Faster

By Jason Sobel on Thursday, February 18, 2010 at 3:25pm

Everyone knows the internet is better when it's fast. At Facebook, we strive to make our site as responsive as possible; we've run experiments that prove users view more pages and get more value out of the site when it runs faster. Google and Microsoft presented similar conclusions for their properties at the 2009 O'Reilly Velocity Conference.

So how do we go about making Facebook faster? The first thing we have to get right is a way to measure our progress. We want to optimize for users seeing pages as fast as possible so we look at the three main components that contribute to the performance of a page load: network time, generation time, and render time.

Components Explained


Network time represents how long a user is waiting while data is transmitted between their computer and Facebook. We can't completely control network time since some users are on slower connections than others, but we can reduce the number of bytes required to load a page; fewer bytes means less network time. The 5 main contributors to network time are bytes of cookies, HTML, CSS, JavaScript, and images.

Generation time captures how long it takes from when our webserver receives a request from the user to the time it sends back a response. This metric measures the efficiency of our code itself and also our webserver, caching, database, and network hardware. Reducing generation time is totally under our control and is accomplished through cleaner, faster code and constantly improving our backend architectures.

Render time measures how much time the user's web browser needs to process a response from Facebook and display the resultant web page. Like network time, we are somewhat constrained here by the performance and behavior of the various browsers but much is still under our control. The less we send back to the user, the faster the browser can display results, so minimizing bytes of HTML, CSS, JavaScript, and images also helps with render time. Another simple way to reduce render time is to execute as little JavaScript as possible before showing the page to the user.

The three metrics I describe are effective at capturing individual components of user perceived performance, but we wanted to roll them up into one number that would give us a high level sense of how fast the site is. We call this metric Time-to-Interact (TTI for short), and it is our best sense of how long the user has to wait for the important contents of a page to become visible and usable. On our homepage, for example, TTI measures the time it takes for the newsfeed to become visible.

First Steps


From early 2008 to mid 2009, we spent a lot of time following the best practices laid out by pioneers in the web performance field to try and improve TTI. For anyone serious about making a web site faster, Steve Souders's compilations are must-reads: High Performance Web Sites and Even Faster Web Sites. We also developed some impressive technologies of our own to measure and improve the performance of Facebook as described at the 2009 O’Reilly Velocity Conference by two Facebook engineers, David Wei and Changhao Jiang.

By June of 2009 we had made significant improvements, cutting median render time in half for users in the United States. This was great progress, but in the meantime, Facebook had exploded in popularity all across the globe and we needed to start thinking about a worldwide audience. We decided to measure TTI at the 75th percentile for all users as a better way to represent how fast the site felt. After looking at the data, we set an ambitious goal to cut this measurement in half by 2010; we had about six months to make Facebook twice as fast.

Six Months and Counting...


On closer inspection, our measurements told us that pages were primarily slow because of network and render time. Our generation time definitely had (and still has) significant room to improve but it wouldn't provide the same bang for the buck. So we devoted most of our engineering effort towards two goals: drastically cutting down the bytes of cookies, HTML, CSS, and JavaScript required by a Facebook page while also developing new frameworks and methodologies that would allow the browser to show content to the user as quickly as possible.

Cutting back on cookies required a few engineering tricks but was pretty straightforward; over six months we reduced the average cookie bytes per request by 42% (before gzip). To reduce HTML and CSS, our engineers developed a new library of reusable components (built on top of XHP) that would form the building blocks of all our pages. Before the development of this component library, each page would rely on a lot of custom HTML and CSS even though many pages shared similar features and functionality. With the component library, it’s easy to optimize our HTML in one place and see it pay off all across the site. Another benefit is that, since the components share CSS rules, once a user has downloaded some CSS it’s very likely those rules will be reused on the next page instead of needing to download an entirely new set. Due to these efforts, we cut our average CSS bytes per page by 19% (after gzip) and HTML bytes per page by 44% (before gzip). These dramatic reductions mean we get our content to users faster and browsers can process it more quickly.

Cutting back on JavaScript was another challenging problem. Facebook feels like a dynamic and engaging site in large part due to the JavaScript functionality we've created, but as we added more and more features, we wrote more and more JavaScript which users have to download to use the site. Remember that downloading and executing JavaScript are two of the main issues we need to combat to improve network and render time. To address this problem our engineers took a step back and looked at what we were using JavaScript to accomplish. We noticed that a relatively small set of functionality could be used to build a large portion of our features yet we were implementing them in similar-but-different ways. This common functionality could be provided in a very small, efficient library that is also cacheable on the user's computer. We set out to rewrite our core interactions on top of this new library, called Primer, and saw a massive 40% decrease (after gzip) in average JavaScript bytes per page. Since Primer is downloaded quickly and then cached for use on future page views, it also means that features built exclusively on Primer are immediately usable when they appear on the screen; there's no need to wait for further JavaScript to download. An example of such a feature is our feedback interface which allows users to comment on, like, and share content and appears all across Facebook.

Another project I'd like to highlight requires a little more setup. As described earlier, the traditional model for loading a web page involves a user sending a request to a server, the server generating a response which is sent back to the browser, and the browser converting the response in to something the user can see and interact with. If you think about this model there is a glaring problem. Let's say it takes a few hundred milliseconds for the server to completely prepare and send a response back to the user. While the server is chugging through its work the browser is just sitting there uselessly, waiting for something to do and generally being lazy. What if we could pipeline this whole procedure? Wouldn't it be great if the server could do a little bit of work, say in ten or fifty milliseconds, and then send a partial response back to the browser which can then start downloading JavaScript and CSS or even start displaying some content? Once the server has done some more processing and has produced another bit of output it can send that back to the browser as well. Then we just repeat the process until the server has nothing left to do. We've overlapped a significant portion of the generation time with the render time which will reduce the overall TTI experienced by the user.

Over the last few months we've implemented exactly this ability for Facebook pages. We call the whole system BigPipe and it allows us to break our web pages up in to logical blocks of content, called Pagelets, and pipeline the generation and render of these Pagelets. Looking at the home page, for example, think of the newsfeed as one Pagelet, the Suggestions box another, and the advertisement yet another. BigPipe not only reduces the TTI of our pages but also makes them seem even faster to users since seeing partial content earlier feels faster than seeing complete content a little bit later.

Success!


I'm pleased to say that on December 22nd, as a result of these and other efforts, we declared victory on our goal to make the site twice as fast. We even had 9 whole days to spare!

After hitting the 2x site speed goal the team celebrated with t-shirts. And dinner (not pictured).After hitting the 2x site speed goal the team celebrated with t-shirts. And dinner (not pictured).
I hope that you've personally experienced and appreciated the improvements we've made in site speed and that this post has given you some insight in to how we think about and approach performance projects at Facebook. Stay tuned for more details on many of the projects I mention here in future blog posts and industry conferences. In 2010 look for the site to get even faster as we tackle new challenges!

Jason, an engineer at Facebook, wants to remind you that perf graphs go the wrong way.