Keeping Up

By Jason Sobel on Friday, December 21, 2007 at 1:01pm

Almost two million new users from around the world sign up for Facebook each week—and we couldn't be happier. It's tremendously rewarding to see so many people find what we work on useful and fun. As we continue to add new users and features, however, the load on our thousands of servers continues to increase at a pretty astounding rate. A few weeks ago we reached full capacity in our California datacenters. In the past we handled this problem by purchasing a few dozen servers, hooking them up, and getting on with our lives, but this time we didn't have it so easy. We'd actually run out of space in our datacenters for new machines.

Fortunately we saw this problem coming a long time ago and started work on a new datacenter in Virginia. Now, we identify whether a user would be better off talking to the east coast datacenter or a west coast data center. For people in Europe and the eastern half of the US, it's noticeably faster to talk to a server in Virginia than in California. For these users we direct them to Virginia whenever they're browsing the site and not making any changes.

Whenever that person goes to change some data—uploading a photo album, or changing profile info for example—we send them off to California so that all our modifying operations happen in the same location. This decision was made to prevent two or more modifications from conflicting with each other and messing up our data. It might sound like we're forcing our users to go to California a lot but only about 10% of our traffic causes a modifying operation. MySQL has a great replication feature that allows us to, in real time, stream all the modifications happening on a California MySQL server to another one in Virginia. Replication happens so fast, even across the country, that the Virginia servers are almost never more than one or two seconds behind the California servers.

Even though all of the modification happens in California and streams instantly to Virginia, we were faced with another problem. Although Facebook's data is stored in MySQL database servers, we use a large number of memcached servers to store copies of the data. Memcached is much faster and able to keep up with requests quicker than the databases themselves can keep up. We had to figure out a way for memcached servers to replicate data concurrently with the MySQL databases. Because of various technical limitations of our architecture there was no easy way to do so.

Fortunately MySQL is open source software, meaning we can actually change the way it works by modifying the code. We did just that—embedding extra information in to the MySQL replication stream that allows us to properly update memcached in Virginia. This ensures that the cache and the database are always in sync. Over the last seven months a great team of Facebook employees has been building new software and setting up new servers like I described above. Over Thanksgiving we finally flipped the switch and since then almost 30% of our traffic has been served from Virginia.

The east coast datacenter is a great first step towards keeping Facebook fast and reliable as the site grows. Going forward we have lots of exciting plans to expand our infrastructure and improve performance so no user ever has to sit around waiting for a page to load.

Jason, a Facebook engineer, is all about the east coast.