facebook technology stack

Upload: husain-kapadia

Post on 16-May-2015

Category:

Technology


TRANSCRIPT

Page 1: Facebook Technology Stack

FACEBOOK

Page 2: Facebook Technology Stack

Introduction

Facebook is synonymous with social networking. People have been “facebooking” each other for about 7 years now, making Facebook the most used social network, with over 500 million users worldwide.

50% of our active users log on to Facebook on any given day
Average user has 130 friends
People spend over 700 billion minutes per month on Facebook
There are over 900 million objects that people interact with (pages, groups, events and community pages)
Average user is connected to 80 community pages, groups and events
Average user creates 90 pieces of content each month
More than 30 billion pieces of content (web links, news stories, blog posts, notes, photo albums, etc.) shared each month

http://www.facebook.com/press/info.php?statistics

Page 3: Facebook Technology Stack

Facebook’s scaling challenge

Here are a few factoids to give you an idea of the scaling challenge that Facebook has to deal with:

Facebook serves 570 billion page views per month (according to Google Ad Planner).
There are more photos on Facebook than all other photo sites combined (including sites like Flickr).
More than 3 billion photos are uploaded every month.
Facebook’s systems serve 1.2 million photos per second. This doesn’t include the images served by Facebook’s CDN.
More than 25 billion pieces of content (status updates, comments, etc.) are shared every month.
Facebook has more than 30,000 servers (and this number is from last year!)

All data is from 2009.

Page 4: Facebook Technology Stack

How Does Facebook Work? The Nuts and Bolts

The Front End

Linux & Apache
PHP
Memcache
Haystack
BigPipe

Page 5: Facebook Technology Stack

LAMP stack

Facebook uses Linux, but has optimized it for its own purposes (especially in terms of network throughput).

Facebook still uses PHP, but it has built a compiler (HipHop) for it so that PHP can be turned into native code on its web servers, thus boosting performance.

Page 6: Facebook Technology Stack

Haystack

There are more than 20 billion uploaded photos on Facebook, and each one is saved in four different resolutions, resulting in more than 80 billion photos.

And it’s not just about being able to handle billions of photos: performance is critical. Facebook serves around 1.2 million photos per second.

Haystack is Facebook’s high-performance photo storage/retrieval system, a highly scalable object store used to serve Facebook’s immense amount of photos.

Strictly speaking, Haystack is an object store, so it doesn’t necessarily have to store photos.

http://www.niallkennedy.com/blog/2009/04/facebook-haystack.html

Page 7: Facebook Technology Stack

Haystack

Haystack stores photo data inside 10 GB buckets, with 1 MB of metadata for every GB stored.

The Haystack index stores metadata about the one needle it needs to find within the Haystack.

Incoming requests for a given photo asset are interpreted as before, but now contain a direct reference to the storage offset containing the appropriate data.
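The offset-based lookup described above can be sketched roughly as follows. This is a minimal illustration, not Facebook's implementation: a `bytearray` stands in for the large on-disk bucket, and the names `TinyHaystack`, `put` and `get` are hypothetical.

```python
from typing import Dict, Tuple


class TinyHaystack:
    """Toy append-only object store with an in-memory offset index.

    Hypothetical sketch: a bytearray stands in for the on-disk bucket.
    """

    def __init__(self) -> None:
        self._bucket = bytearray()                    # all objects appended back to back
        self._index: Dict[str, Tuple[int, int]] = {}  # key -> (offset, size)

    def put(self, key: str, data: bytes) -> Tuple[int, int]:
        """Append data to the bucket and record where it landed."""
        offset = len(self._bucket)
        self._bucket.extend(data)
        self._index[key] = (offset, len(data))
        return self._index[key]

    def get(self, key: str) -> bytes:
        """One index lookup, then one direct read at the stored offset."""
        offset, size = self._index[key]
        return bytes(self._bucket[offset:offset + size])


store = TinyHaystack()
store.put("photo-1", b"first image bytes")
location = store.put("photo-2", b"second image bytes")
photo = store.get("photo-2")
```

The point of the design is that a read needs no directory walk: the small index stays in memory, and the request goes straight to the right offset inside one large file.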

Page 8: Facebook Technology Stack

BigPipe

Pipelining web pages for high performance

BigPipe is a dynamic web page serving system that Facebook has developed. Facebook uses it to serve each web page in sections (called “pagelets”) for optimal performance.

BigPipe is a fundamental redesign of the dynamic web page serving system. The general idea is to pipeline pagelets through several execution stages inside web servers and browsers.

Page 9: Facebook Technology Stack

BigPipe

BigPipe breaks the page generation process into several stages.

The first three stages are executed by the web server, and the last four stages are executed by the browser.

Each pagelet must go through all these stages sequentially, but BigPipe enables several pagelets to be executed simultaneously in different stages.
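A rough sketch of the pagelet idea, under heavy simplification: the server flushes a page skeleton first, generates pagelets concurrently, and sends each one as soon as it is ready. It does not reproduce BigPipe's actual stages. The function `serve_page` and the client-side JavaScript function `fill` are hypothetical names used only for illustration.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import Callable, Dict, Iterator


def serve_page(pagelets: Dict[str, Callable[[], str]]) -> Iterator[str]:
    """Yield the page skeleton first, then each pagelet as soon as it is ready."""
    # Flush the skeleton with one empty placeholder per pagelet.
    placeholders = "".join(f'<div id="{name}"></div>' for name in pagelets)
    yield f"<html><body>{placeholders}"

    # Generate pagelets concurrently; emit each one when its content is done.
    with ThreadPoolExecutor(max_workers=4) as pool:
        futures = {pool.submit(render): name for name, render in pagelets.items()}
        for future in as_completed(futures):
            name = futures[future]
            # fill() is a hypothetical browser-side function that injects the content.
            yield f'<script>fill("{name}", {future.result()!r});</script>'

    yield "</body></html>"


chunks = list(serve_page({
    "news_feed": lambda: "stories",
    "chat": lambda: "friends online",
}))
```

Because pagelets are emitted in completion order, a slow pagelet does not hold back the ones that finish earlier.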

Page 10: Facebook Technology Stack

Pagelets in Facebook home page. Each rectangle corresponds to one pagelet.

Page 11: Facebook Technology Stack

Memcache

Free & open source, high-performance, distributed memory object caching system

Memcached is an in-memory key-value store for small chunks of arbitrary data (strings, objects) from results of database calls, API calls, or page rendering.

The system uses a client–server architecture. The servers maintain a key–value associative array; the clients populate this array and query it.

The servers keep the values in RAM; if a server runs out of RAM, it discards the oldest values.

Clients can read each other's cached data.
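The caching pattern described above can be sketched as follows. This is a single-process toy, not the memcached server or any of its client libraries. `TinyCache`, `cached_lookup` and `slow_query` are hypothetical names, and eviction here is assumed to be least recently used, bounded by item count rather than by RAM.

```python
from collections import OrderedDict
from typing import Callable, List, Optional


class TinyCache:
    """Bounded in-memory key-value store that evicts the least recently used key."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self._items: "OrderedDict[str, str]" = OrderedDict()

    def get(self, key: str) -> Optional[str]:
        if key not in self._items:
            return None
        self._items.move_to_end(key)          # mark as recently used
        return self._items[key]

    def set(self, key: str, value: str) -> None:
        self._items[key] = value
        self._items.move_to_end(key)
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)   # discard the oldest entry


def cached_lookup(cache: TinyCache, key: str, load: Callable[[str], str]) -> str:
    """Try the cache first; on a miss, run the expensive call and store the result."""
    value = cache.get(key)
    if value is None:
        value = load(key)
        cache.set(key, value)
    return value


db_calls: List[str] = []


def slow_query(key: str) -> str:
    """Stand-in for a database call; records each time it is actually invoked."""
    db_calls.append(key)
    return f"row for {key}"


cache = TinyCache(capacity=2)
first = cached_lookup(cache, "user:1", slow_query)
second = cached_lookup(cache, "user:1", slow_query)   # served from the cache
cached_lookup(cache, "user:2", slow_query)
cached_lookup(cache, "user:3", slow_query)            # cache is full: user:1 is evicted
```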

Page 12: Facebook Technology Stack

How Does Facebook Work?

The Back End

Thrift (protocol)
Cassandra (database)
Scribe (log server)
HipHop for PHP

Page 13: Facebook Technology Stack

Thrift

Facebook’s backend services are written in a variety of different programming languages including C++, Java, Python, and Erlang.

Thrift allows for easy exchange of data (variables, objects) between applications written in different languages.

The Thrift protocol offers cross-language serialization. It combines a software stack with a code generation engine to build services that work efficiently and seamlessly between C++, C#, Java, Perl, Python, PHP, Erlang, Haskell, Cocoa and Ruby.

http://www.thrift.pl/Thrift-tutorial-how-it-works.html
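To illustrate what cross-language serialization means in general, the sketch below writes a record as numbered fields in a fixed byte layout that a program in any language could parse. It is not Thrift's wire format and does not use Thrift's generated code. The schema, field ids and function names are assumptions made up for this example.

```python
import struct
from typing import Dict

# Hypothetical schema: field id -> field name, as an interface definition might declare it.
USER_SCHEMA: Dict[int, str] = {1: "name", 2: "city"}

HEADER = ">HI"  # big-endian: 2-byte field id, 4-byte payload length


def encode(record: Dict[str, str]) -> bytes:
    """Write each field as header + UTF-8 payload, in field-id order."""
    out = bytearray()
    for field_id, name in sorted(USER_SCHEMA.items()):
        payload = record[name].encode("utf-8")
        out += struct.pack(HEADER, field_id, len(payload)) + payload
    return bytes(out)


def decode(data: bytes) -> Dict[str, str]:
    """Read fields back by walking the headers; no Python-specific format involved."""
    record: Dict[str, str] = {}
    pos = 0
    while pos < len(data):
        field_id, length = struct.unpack_from(HEADER, data, pos)
        pos += struct.calcsize(HEADER)
        record[USER_SCHEMA[field_id]] = data[pos:pos + length].decode("utf-8")
        pos += length
    return record


wire = encode({"name": "Ada", "city": "Oslo"})
roundtrip = decode(wire)
```

Since the layout depends only on field ids, lengths and byte order, a reader written in C++ or Java could decode the same bytes with the same schema.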

Page 14: Facebook Technology Stack

The Back End

Scribe (log server) is a server for aggregating log data streamed in real time from many other servers. It is useful for logging a wide array of data and is built on top of Thrift.

Cassandra is a database management system designed to handle large amounts of data spread out across many servers. It powers Facebook’s Inbox Search feature and provides a structured key-value store with eventual consistency.

HipHop for PHP is a source code transformer for PHP script code and was created to save server resources. HipHop transforms PHP source code into optimized C++. After doing this, it uses g++ to compile it to machine code.

Page 15: Facebook Technology Stack

Reliability

Unlike other social networks such as Friendster, MySpace, and Twitter, all of which have run into serious scalability issues at different points during their growth, Facebook has been mostly reliable throughout its rise.

In actuality, Facebook uses JavaScript heavily and relies on its own in-house PHP extension called XHP, on HipHop (which optimizes PHP), and on many more technologies.

A lot of technologies have been developed by Facebook in-house to serve their own needs, for example Cassandra.

Page 16: Facebook Technology Stack

Facebook has also widgetized large portions of their application, meaning that widgets can be written in an appropriate language instead of simply using PHP. These widgets interface with the other parts of the application through the use of internal APIs.

Like many other big sites, Facebook uses a content delivery network (CDN) to help serve static content.

And then of course there is the huge data center Facebook is building in Oregon to help it scale out with even more servers.

Page 17: Facebook Technology Stack

Facebook’s love affair with open source

Not only is Facebook using (and contributing to) open source software such as Linux, Memcached, MySQL, Hadoop, Hive, and many others, it has also made much of its internally developed software available as open source.

Examples of open source projects that originated from inside Facebook include HipHop, Cassandra, Thrift and Scribe. Facebook also relies on open source projects that originated elsewhere, such as Cfengine and Varnish.

Facebook has also open-sourced Tornado, a high-performance web server framework developed by the team behind FriendFeed (which Facebook bought in August 2009).

Check out - http://developers.facebook.com/opensource/

Page 18: Facebook Technology Stack

Gradual releases and dark launches

Facebook has a system, Gatekeeper, that lets it run different code for different sets of users.

This lets Facebook do gradual releases of new features, activate certain features only for Facebook employees, etc.

Gatekeeper also lets Facebook do something called “dark launches”, which is to activate elements of a certain feature behind the scenes before it goes live.
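Gatekeeper's internals are not described here, so the following is only a sketch of how gating of this kind is commonly done: a stable hash assigns each user to one of 100 buckets, and a gate admits either employees only or a chosen percentage of buckets. The names `Gate`, `bucket` and `is_enabled` are hypothetical.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Gate:
    """Hypothetical feature gate: who is allowed to see a feature."""
    name: str
    rollout_percent: int = 0       # share of all users who get the feature
    employees_only: bool = False   # restrict the feature to staff


def bucket(feature: str, user_id: int) -> int:
    """Map a (feature, user) pair to a stable bucket in the range 0..99."""
    digest = hashlib.sha256(f"{feature}:{user_id}".encode()).hexdigest()
    return int(digest, 16) % 100


def is_enabled(gate: Gate, user_id: int, is_employee: bool = False) -> bool:
    """Decide whether this user runs the new code path."""
    if gate.employees_only:
        return is_employee
    return bucket(gate.name, user_id) < gate.rollout_percent


staff_gate = Gate("new_messages", employees_only=True)
ramp_gate = Gate("new_profile", rollout_percent=10)
staff_sees_it = is_enabled(staff_gate, user_id=7, is_employee=True)
```

Because the bucket is derived from a hash rather than a random draw, a user keeps the same answer on every request, and raising `rollout_percent` only ever adds users.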

Page 19: Facebook Technology Stack

Facebook Platform

The Facebook Platform provides a set of APIs and tools which enable third-party developers to integrate with the "open graph".

Graph API is the core of Facebook Platform, enabling developers to read and write data to Facebook.

Page 20: Facebook Technology Stack

The Graph API

The Graph API presents a simple, consistent view of the Facebook social graph, uniformly representing objects in the graph (e.g., people, photos, events, and pages) and the connections between them (e.g., friend relationships, shared content, and photo tags).

RESTful API for accessing data on the Facebook graph.
OAuth 2.0-based authentication.
JSON modeling of objects and connections.

Every object in the social graph has a unique ID. You can access the properties of an object by requesting https://graph.facebook.com/ID

Alternatively, people and pages with usernames can be accessed using their username as an ID. All responses are JSON objects.
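The addressing scheme above can be sketched without touching the network. The helper names `object_url` and `parse_object` are hypothetical, the sample response body is made up for illustration, and authentication is left out entirely.

```python
import json
from urllib.parse import quote

GRAPH_ROOT = "https://graph.facebook.com/"


def object_url(object_id: str) -> str:
    """Build the URL for an object, given its numeric ID or its username."""
    return GRAPH_ROOT + quote(str(object_id), safe="")


def parse_object(body: str) -> dict:
    """Responses are JSON objects, so decoding yields a plain dictionary."""
    return json.loads(body)


# Hypothetical response body; a real response would carry more fields.
sample_body = '{"id": "12345", "name": "Example Page"}'

url = object_url("12345")
obj = parse_object(sample_body)
```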

Specifications - http://developers.facebook.com/docs/api

Page 21: Facebook Technology Stack

Facebook Markup Language

FBML is an evolved subset of HTML with some elements removed. It allows Facebook application developers to customize the "look and feel" of their applications, to a limited extent. It is the specification of how to encode content so that Facebook's servers can read and publish it. FBML plays an important role in building applications.

FBML is used to tap in to various Facebook elements when building applications.

It operates a lot like HTML and it gives the ability to do various tasks with ease, such as:

sending a user e-mail
creating a two-column form
embedding Flash video
creating a dashboard
posting on a wall
displaying a header
etc.

Page 22: Facebook Technology Stack

FBML

Facebook also allows the use of regular HTML tags, such as <a href="#"></a>, which is used to generate a hyperlink, and many more HTML tags for building applications.

http://developers.facebook.com/docs/reference/fbml/

Page 23: Facebook Technology Stack

Facebook features

Chat
Status Updates
Credits
URL shortener
Facebook Live
Usernames
Networks, Groups, and Like Pages
Easter eggs
News Feed
Wall
Notifications
Gifts
Poke
Lite

Page 24: Facebook Technology Stack

Facebook’s New Messages

The new Messages interweaves your chats, texts and emails. It’s a central place to control all of your private communication, both on and off Facebook.

Simply put, it can be a single inbox for all of your messages, no matter how you choose to send them.

A facebook.com Email Address
SMS From Facebook
Chat History

Page 25: Facebook Technology Stack

Facebook Connect

Facebook Connect is a set of APIs from Facebook that enable Facebook members to log onto third-party websites, applications, mobile devices and gaming systems with their Facebook identity.