Research Publications at Facebook

Giving people the power to share and connect requires constant innovation. At Facebook, we solve technical problems no one else has seen because no one else has built a social network of this size.
Working at the intersection of research and engineering to make the world more open and connected is one of the best things about being at Facebook right now.

2014

Question Answering with Subgraph Embeddings

Empirical Methods in Natural Language Processing (EMNLP)

#TagSpace: Semantic Embeddings from Hashtags

Jason Weston, Sumit Chopra, Keith Adams
Empirical Methods in Natural Language Processing (EMNLP)

Practical Lessons from Predicting Clicks on Ads at Facebook

International Workshop on Data Mining for Online Advertising (ADKDD)

Streamed Approximate Counting of Distinct Elements

Daniel Ting
ACM Conference on Knowledge Discovery and Data Mining (KDD)

A Hitchhiker’s Guide to Fast and Efficient Data Reconstruction in Erasure-coded Data Centers

K V Rashmi, Nihar B Shah, Dikang Gu, Hairong Kuang, Dhruba Borthakur, Kannan Ramchandran
ACM Conference on Applications, Technologies, Architectures, and Protocols for Computer Communications (SIGCOMM)

Fastpass: A Centralized “Zero-Queue” Datacenter Network

Jonathan Perry, Amy Ousterhout, Hari Balakrishnan, Devavrat Shah, Hans Fugal
ACM Conference on Applications, Technologies, Architectures, and Protocols for Computer Communications (SIGCOMM)

Perceiving, learning, and exploiting object affordances for autonomous pile manipulation

Dubi Katz, Arun Venkatraman, Moslem Kazemi, J. Andrew Bagnell, Anthony Stentz
Autonomous Robots

PANDA: Pose Aligned Networks for Deep Attribute Modeling

Conference on Computer Vision and Pattern Recognition (CVPR)

Fast Database Restarts at Facebook

ACM Special Interest Group on Management of Data (SIGMOD)

Collaborative Hashing

Xianglong Liu, Junfeng He, Bo Lang
Conference on Computer Vision and Pattern Recognition (CVPR)

Adaptive HTER Estimation for Document-Specific MT Post-Editing

Fei Huang, Jianming Xu, Abraham Ittycheriah, Salim Roukos
Annual Conference of the Association for Computational Linguistics (ACL)

vCacheShare: Automated Server Flash Cache Space Management in a Virtualization Environment

Fei Meng, Li Zhou, Xiaosong Ma, Sandeep Uttamchandani, Deng Liu
USENIX Annual Technical Conference (ATC)

There is no Fork: an Abstraction for Efficient, Concurrent, and Concise Data Access

Simon Marlow, Louis Brandy, Jon Coens, Jon Purdy
ACM SIGPLAN International Conference on Functional Programming (ICFP)

Head Tracking for the Oculus Rift

IEEE International Conference on Robotics and Automation (ICRA)

Topic-based Clusters in Egocentric Networks on Facebook

Lilian Weng, Thomas Lento
AAAI Conference on Weblogs and Social Media (ICWSM)

Rumor Cascades

AAAI Conference on Weblogs and Social Media (ICWSM)

Joint Inference of Multiple Label Types in Large Networks

International Conference on Machine Learning (ICML)

Growing Closer on Facebook: Changes in Tie Strength Through Site Use

Moira Burke, Robert Kraut
ACM Conference on Human Factors in Computing Systems (CHI)

Incentives to Participate in Online Research: An Experimental Examination of "Surprise" Incentives

Andrew Tresolini Fiore, Coye Cheshire, Lindsay Shaw Taylor, G.A. Mendelsohn
ACM Conference on Human Factors in Computing Systems (CHI)

Visually Impaired Users on an Online Social Network

ACM Conference on Human Factors in Computing Systems (CHI)

Room for Interpretation: The Role of Self-Esteem and CMC in Romantic Couple Conflict

Lauren Scissors, Michael E. Roloff, Darren Gergle
ACM Conference on Human Factors in Computing Systems (CHI)

Designing and Deploying Online Field Experiments

International World Wide Web Conference (WWW)

Can cascades be predicted?

International World Wide Web Conference (WWW)

Personalized Collaborative Clustering

Yisong Yue, Chong Wang, Khalid El-Arini, Carlos Guestrin
International World Wide Web Conference (WWW)

Deduplicating a Places Database

International World Wide Web Conference (WWW)

Libra: Divide and Conquer to Verify Forwarding Tables in Huge Networks

James Hongyi Zeng, Shidong Zhang, Fei Ye, Vimal Kumar, Mickey Ju, Junda Liu, Nick McKeown, Amin Vahdat
USENIX Symposium on Networked Systems Design and Implementation (NSDI)

Help is on the Way: Patterns of Responses to Resource Requests on Facebook

Cliff Lampe, Rebecca Gray, Andrew Tresolini Fiore, Nicole Ellison
ACM Conference on Computer Supported Cooperative Work and Social Computing (CSCW)

The Role of Founders in Building Online Groups

Robert Kraut, Andrew Tresolini Fiore
ACM Conference on Computer Supported Cooperative Work and Social Computing (CSCW)

Romantic Partnerships and the Dispersion of Social Ties: A Network Analysis of Relationship Status on Facebook

Lars Backstrom, Jon Kleinberg
ACM Conference on Computer Supported Cooperative Work and Social Computing (CSCW)

Analysis of HDFS Under HBase: A Facebook Messages Case Study

Tyler Harter, Dhruba Borthakur, Siying Dong, Amitanand Aiyer, Liyin Tang, Andrea C. Arpaci-Dusseau, Remzi H. Arpaci-Dusseau
USENIX Conference on File and Storage Technologies (FAST)

The Essence of Reynolds

ACM Symposium on Principles of Programming Languages (POPL)

2013

Counterfactual Reasoning and Learning Systems: The Example of Computational Advertising

Leon Bottou, Jonas Peters, Joaquin Quiñonero Candela, Denis Charles, Max Chickering, Elon Portugaly, Dipankar Ray, Patrice Simard, Ed Snelson
Journal of Machine Learning Research (JMLR)

Using Web Text to Improve Keyword Spotting in Speech

Ankur Gandhe, Long Qin, Florian Metze, Alexander Rudnicky, Ian Lane, Matthias Eck
Automatic Speech Recognition and Understanding Workshop (ASRU)

An Analysis of Facebook Photo Caching

ACM Symposium on Operating Systems Principles (SOSP)

Virtual Network Diagnosis as a Service

Wenfei Wu, Guohui Wang, Aditya Akella, Anees Shaikh
ACM Symposium on Cloud Computing (SoCC)

Scuba: Diving into Data at Facebook

Oleksandr Barykin, Bhuwan Chopra, Ciprian Gerea, Josh Metzler, Subbu Subramanian, Janet Wiener, David Reiss, Daniel Merl
International Conference on Very Large Data Bases (VLDB)

XORing Elephants: Novel Erasure Codes for Big Data

Maheshwaran Sathiamoorthy, Megasthenis Asteris, Dimitris Papailiopoulos, Alexandros G. Dimakis, Ramkumar Vadali, Scott Chen, Dhruba Borthakur
International Conference on Very Large Data Bases (VLDB)

Unicorn: A System for Searching the Social Graph

International Conference on Very Large Data Bases (VLDB)

Weighted Hashing for Fast Large Scale Similarity Search

Qifan Wang, Dan Zhang, Luo Si
ACM International Conference on Information and Knowledge Management (CIKM)

Reciprocal Hash Tables for Nearest Neighbor Search

Xianglong Liu, Junfeng He, Bo Lang
AAAI Conference on Artificial Intelligence (AAAI)

Graph Cluster Randomization: Network Exposure to Multiple Universes

Johan Ugander, Brian Karrer, Lars Backstrom, Jon Kleinberg
ACM Conference on Knowledge Discovery and Data Mining (KDD)

MI2LS: Multi-Instance Learning from Multiple Information Sources

Dan Zhang, Jingrui He, Richard Lawrence
ACM Conference on Knowledge Discovery and Data Mining (KDD)

Uncertainty in Online Experiments with Dependent Data: An Evaluation of Bootstrap Methods

Eytan Bakshy, Dean Eckles
ACM Conference on Knowledge Discovery and Data Mining (KDD)

Semantic Hashing using Tags and Topic Modeling

Qifan Wang, Dan Zhang, Luo Si
ACM Special Interest Group on Information Retrieval Conference (SIGIR)

Selection Effects in Online Sharing: Consequences for Peer Adoption

Sean Taylor, Eytan Bakshy, Sinan Aral
ACM Conference on Electronic Commerce (EC)

Calling All Facebook Friends: Exploring requests for help on Facebook

Nicole Ellison, Rebecca Gray, Jessica Vitak, Cliff Lampe, Andrew Tresolini Fiore
AAAI Conference on Weblogs and Social Media (ICWSM)

Families on Facebook

Moira Burke, Lada Adamic, Karyn Marciniak
AAAI Conference on Weblogs and Social Media (ICWSM)

The Anatomy of Large Facebook Cascades

AAAI Conference on Weblogs and Social Media (ICWSM)

Self-censorship on Facebook

Sauvik Das, Adam D. I. Kramer
AAAI Conference on Weblogs and Social Media (ICWSM)

Development and Deployment at Facebook

Dror Feitelson, Eitan Frachtenberg, Kent Beck
IEEE Internet Computing 17(4)

TAO: Facebook's Distributed Data Store for the Social Graph

Nathan Bronson, Zachary Amsden, George Cabrera III, Prasad Chakka, Peter Dimov, Hui Ding, Jack Ferris, Anthony Giardullo, Sachin Kulkarni, Harry Li, Mark Marchukov, Dmitri Petrov, Lovro Puzar, Yee Jiun Song, Venkat Venkataramani
USENIX Annual Technical Conference (ATC)

A Solution to the Network Challenges of Data Recovery in Erasure-coded Distributed Storage Systems: A Study on the Facebook Warehouse Cluster

K V Rashmi, Nihar B Shah, Dikang Gu, Hairong Kuang, Dhruba Borthakur, Kannan Ramchandran
USENIX Workshop on Hot Topics in Storage and File Systems (HotStorage)

LinkBench: a Database Benchmark based on the Facebook Social Graph

Tim Armstrong, Nagavamsi Ponnekanti, Dhruba Borthakur, Mark Callaghan
ACM Special Interest Group on Management of Data (SIGMOD/PODS)

MILEAGE: Multiple Instance LEArning with Global Embedding

Dan Zhang, Jingrui He, Luo Si, Richard Lawrence
International Conference on Machine Learning (ICML)

Representing Documents Through Their Readers

Khalid El-Arini, Min Xu, Emily Fox, Carlos Guestrin
ACM Conference on Knowledge Discovery and Data Mining (KDD)

Speeding up Large-Scale Learning with a Social Prior

Deepayan Chakrabarti, Ralf Herbrich
ACM Conference on Knowledge Discovery and Data Mining (KDD)

Machine Learning Paradigms for Speech Recognition: An Overview

IEEE/ACM Transactions on Audio, Speech, and Language Processing

Latent Credibility Analysis

Jeff Pasternack, Dan Roth
International World Wide Web Conference (WWW)

Subgraph Frequencies: Mapping the Empirical and Extremal Geography of Large Graph Collections

Johan Ugander, Lars Backstrom, Jon Kleinberg
International World Wide Web Conference (WWW)

CopyCatch: Stopping Group Attacks by Spotting Lockstep Behavior in Social Networks

Alex Beutel, Tom Wanhong Xu, Venkatesan Guruswami, Christopher Palow, Christos Faloutsos
International World Wide Web Conference (WWW)

Facebook's Data Center Network Architecture

Nathan Farrington, Alexey Andreyev
IEEE Optical Interconnects Conference (OI)

Hash Bit Selection: a Unified Solution for Selection Problems in Hashing

Xianglong Liu, Junfeng He, Bo Lang, Shih-Fu Chang
Conference on Computer Vision and Pattern Recognition (CVPR)

Quantifying the Invisible Audience in Social Networks

Michael Bernstein, Eytan Bakshy, Moira Burke, Brian Karrer
ACM Conference on Human Factors in Computing Systems (CHI)

Gender, Topic, and Audience Response: An Analysis of User-Generated Content on Facebook

Yi-Chia Wang, Moira Burke, Robert Kraut
ACM Conference on Human Factors in Computing Systems (CHI)

Storage and Performance Optimization of Long Tail Key Access in a Social Network

John Liang, Yang 'James' Luo, Mark Drayton, Rajesh Nishtala, Richard Liu, Nick Hammer, Jason Taylor, Bill Jia
International Workshop on Cloud Data and Platforms (Cloud DP)

Scaling Memcache at Facebook

Rajesh Nishtala, Hans Fugal, Steven Grimm, Marc Kwiatkowski, Herman Lee, Harry Li, Ryan McElroy, Michael Paleczny, Daniel Peek, Paul Saab, David Stafford, Tony Tung, Venkat Venkataramani
USENIX Symposium on Networked Systems Design and Implementation (NSDI)

Using Facebook after Losing a Job: Differential Benefits of Strong and Weak Ties

Moira Burke, Robert Kraut
ACM Conference on Computer Supported Cooperative Work and Social Computing (CSCW)

Arrival and Departure Dynamics in Social Networks

Shaomei Wu, Atish Das Sarma, Alex Fabrikant, Andrew Tomkins, Silvio Lattanzi
ACM International Conference on Web Search and Data Mining (WSDM)

Balanced Label Propagation for Partitioning Massive Graphs

Johan Ugander, Lars Backstrom
ACM International Conference on Web Search and Data Mining (WSDM)

Characterizing and Curating Conversation Threads: Expansion, Focus, Volume, Re-entry

Lars Backstrom, Jon Kleinberg, Lillian Lee, Cristian Danescu-Niculescu-Mizil
ACM International Conference on Web Search and Data Mining (WSDM)

2012

The HipHop Compiler for PHP

ACM International Conference on Object Oriented Programming Systems, Languages, and Applications (OOPSLA)

Workload Analysis of a Large-Scale Key-Value Store

Berk Atikoglu, Yuehai Xu, Eitan Frachtenberg, Song Jiang, Michael Paleczny
ACM International Conference on Measurement and Modeling of Computer Systems (SIGMETRICS)

DeTail: Reducing the Flow Completion Time Tail in Datacenter Networks

David Zats, Tathagata Das, Prashanth Mohan, Dhruba Borthakur, Randy Katz
ACM Conference on Applications, Technologies, Architectures, and Protocols for Computer Communications (SIGCOMM)

Active Sampling for Entity Matching

Kedar Bellare, Suresh Iyengar Parthasarathy, Aditya Parameswaran, Vibhor Rastogi
ACM Conference on Knowledge Discovery and Data Mining (KDD)

Four Degrees of Separation

Lars Backstrom, Paolo Boldi, Marco Rosa, Johan Ugander, Sebastiano Vigna
ACM Web Science Conference (WebSci)

Storage Infrastructure Behind Facebook Messages: Using HBase at Scale

Amitanand Aiyer, Mikhail Bautin, Guoqiang Jerry Chen, Pritam Damania, Prakash Khemani, Kannan Muthukkaruppan, Karthik Ranganathan, Nicolas Spiegelberg, Liyin Tang, Madhuwanti Vaidya
IEEE International Conference on Data Engineering (ICDE)

Power and Performance Evaluation of Memcached on the TILEPro64 Architecture

Mateusz Berezecki, Eitan Frachtenberg, Michael Paleczny, Ken Steele
Sustainable Computing: Informatics and Systems

The spread of emotion via Facebook

ACM Conference on Human Factors in Computing Systems (CHI)

Thermal Design in the Open Compute Datacenter

Eitan Frachtenberg, Dan Lee, Marco Magarelli, Veerendra Mulay, Jay Park
IEEE Intersociety Conference on Thermal and Thermomechanical Phenomena in Electronic Systems (ITHERM)

PACMan: Coordinated Memory Caching for Parallel Jobs

Ganesh Ananthanarayanan, Ali Ghodsi, Andrew Wang, Dhruba Borthakur, Srikanth Kandula, Scott Shenker, Ion Stoica
USENIX Symposium on Networked Systems Design and Implementation (NSDI)

Structural Diversity in Social Contagion

Johan Ugander, Lars Backstrom, Cameron Marlow, Jon Kleinberg
Proceedings of the National Academy of Sciences (PNAS)

The Role of Social Networks in Information Diffusion

International World Wide Web Conference (WWW)

Bootstrapping Data Arrays of Arbitrary Order

Art B. Owen, Dean Eckles
The Annals of Applied Statistics (AOAS)

Predicting Memcache Throughput using Simulation and Modeling

Steven Hart, Eitan Frachtenberg, Mateusz Berezecki
IEEE Symposium on Theory of Modeling and Simulation (TMS)

2011

High-efficiency server design

Eitan Frachtenberg, Ali Heydari, Hu Li, Amir Michael, Jacob Na, Avery Nisbet, Pierluigi Sarti
ACM Conference on Supercomputing (ICS)

Performance of an online translation tool when applied to patient educational material

Raman R. Khanna, Leah S. Karliner, Matthias Eck, Eric Vittinghoff, Christopher J. Koenig, Margaret C. Fang
Journal of Hospital Medicine, November/December 2011

Phonetic Classification Using Controlled Random Walks

Katrin Kirchhoff, Andrei Alexandrescu
Conference of the International Speech Communication Association (Interspeech)

Dimensions of Self-Expression in Facebook Status Updates

Adam D. I. Kramer, Cindy K. Chung
AAAI International Conference on Weblogs and Social Media (ICWSM)

Center of Attention: How Facebook Users Allocate Attention across Friends

Lars Backstrom, Eytan Bakshy, Jon Kleinberg, Itamar Rosenn, Thomas Lento
AAAI International Conference on Weblogs and Social Media (ICWSM)

Location³: How Users Share and Respond to Location-Based Data on Social Networking Sites

Jonathan Chang, Eric Sun
AAAI International Conference on Weblogs and Social Media (ICWSM)

Many-core key-value store

Mateusz Berezecki, Eitan Frachtenberg, Michael Paleczny, Ken Steele
International Green Computing Conference (IGCC)

YSmart: Yet Another SQL-to-MapReduce Translator

Rubao Lee, Yin Huai, Tian Luo, Fusheng Wang, Yongqiang He, Xiaodong Zhang
International Conference on Distributed Computing Systems (ICDCS)

Apache Hadoop Goes Realtime at Facebook

Dhruba Borthakur, Nicolas Spiegelberg, Hairong Kuang, Aravind Menon, Sam Rash, Rodrigo Schmidt, Amitanand Aiyer, Jonathan Gray, Kannan Muthukkaruppan, Karthik Ranganathan, Dmytro Molkov, Joydeep Sen Sarma
ACM Special Interest Group on Management of Data (SIGMOD)

Facebook Immune System

Karan Mangla, Roger Chen, Tao Stein
Workshop on Social Network Systems (SNS)

FATE and DESTINI: A Framework for Cloud Recovery Testing

Haryadi S. Gunawi, Thanh Do, Pallavi Joshi, Peter Alvaro, Joseph M. Hellerstein, Andrea C. Arpaci-Dusseau, Remzi H. Arpaci-Dusseau, Koushik Sen, Dhruba Borthakur
USENIX Symposium on Networked Systems Design and Implementation (NSDI)

Social Capital on Facebook: Differentiating Uses and Users

ACM Conference on Human Factors in Computing Systems (CHI)

RCFile: A Fast and Space-efficient Data Placement Structure in MapReduce-based Warehouse Systems

Yongqiang He, Namit Jain, Zheng Shao, Rubao Lee, Yin Huai, Xiaodong Zhang, Zhiwei Xu
IEEE International Conference on Data Engineering (ICDE)

Network Bucket Testing

Jon Kleinberg, Lars Backstrom
International World Wide Web Conference (WWW)

2010

Finding a needle in Haystack: Facebook's photo storage

Peter Vajgel, Sanjeev Kumar, Harry Li, Doug Beaver, Jason Sobel
USENIX Symposium on Operating Systems Design and Implementation (OSDI)

Data warehousing and analytics infrastructure at Facebook

Namit Jain, Dhruba Borthakur, Raghotham Murthy, Zheng Shao, Suresh Antony, Ashish Thusoo, Joydeep Sen Sarma, Hao Liu
ACM Special Interest Group on Management of Data (SIGMOD)

Tools for Collecting Speech Corpora via Mechanical-Turk

Ian Lane, Alex Waibel, Matthias Eck, Kay Rottmann
CSLDAMT '10 Proceedings of the NAACL HLT 2010 Workshop on Creating Speech and Language Data with Amazon's Mechanical Turk

Not-so-latent Dirichlet allocation: collapsed Gibbs sampling using human judgments

Conference of the North American Chapter of the Association for Computational Linguistics (NAACL)

Job Scheduling for Multi-User MapReduce Clusters

Matei Zaharia, Dhruba Borthakur, Joydeep Sen Sarma, Khaled Elmeleegy, Scott Shenker, Ion Stoica
ACM European Conference on Computer Systems (EUROSYS)

Social Network Activity and Social Well-Being

ACM Conference on Human Factors in Computing Systems (CHI)

An Unobtrusive Behavioral Model of “Gross National Happiness”

ACM Conference on Human Factors in Computing Systems (CHI)

2009

Hive - A Warehousing Solution Over a Map-Reduce Framework

Prasad Chakka, Namit Jain, Zheng Shao, Raghotham Murthy
International Conference on Very Large Data Bases (VLDB)

Gesundheit! Modeling Contagion through Facebook News Feed

AAAI Conference on Weblogs and Social Media (ICWSM)