The 5 Principles of Google's Cloud
Posted on 15-Jan-2015
ENTERPRISE ARCHITECTURE
THE 5 PRINCIPLES OF GOOGLE’S ”CLOUD”
Patrik Svensson, 2011, ptrksvnssn@gmail.com
Thursday, 12 May 2011
THE VISION OF GOOGLE
THE 5 PRINCIPLES
• Everything is a service (or an application in Android)
• Relentless technical focus (thinking at nanoscale)
• Data centers are the foundation
• Code is king, Data is king kong
• Identify and keep track of your users
#1 EVERYTHING IS A SERVICE (OR AN APPLICATION)
#2 RELENTLESS TECHNICAL FOCUS
• Jedis build their own lightsabres
• Parallelize, Distribute, Cache, Compress, Redundantize everything
• Latency is VERY evil
Source: http://www.flickr.com/photos/60994749@N07/5557591956/
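Three of the verbs in the list above (parallelize, cache, compress) can be sketched in a few lines. This is only an illustration using the Python standard library, not Google's implementation; `render`, `serve` and `serve_many` are hypothetical names invented for the example.

```python
import zlib
from concurrent.futures import ThreadPoolExecutor
from functools import lru_cache


@lru_cache(maxsize=128)
def render(page_id: int) -> bytes:
    """Hypothetical expensive computation; repeated requests hit the cache."""
    return (f"page {page_id}: " + "lorem ipsum " * 200).encode("utf-8")


def serve(page_id: int) -> bytes:
    """Compress the response, trading a little CPU for fewer bytes sent."""
    return zlib.compress(render(page_id))


def serve_many(page_ids):
    """Handle independent requests in parallel; results keep the input order."""
    with ThreadPoolExecutor(max_workers=4) as pool:
        return list(pool.map(serve, page_ids))


responses = serve_many([1, 2, 3, 1])
print(len(render(1)), "bytes raw,", len(responses[0]), "bytes compressed")
```

Distribution and redundancy need more than one machine, so they are left out of this single-process sketch.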
EXAMPLE: ”NUMBERS EVERYONE SHOULD KNOW”
1,000,000 ns = 1 ms
1,000,000,000 ns = 1 s
Source: Jeff Dean, ”Designs, Lessons and Advice from Building Large Distributed Systems”
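The two conversions above are enough to turn a latency into an upper bound on how many such operations can run back to back in one second. A minimal sketch follows; the three sample latencies are the figures commonly quoted from the cited talk and should be treated as illustrative, since the transcript does not preserve the table itself. The function names are made up for the example.

```python
NS_PER_MS = 1_000_000        # 1,000,000 ns = 1 ms
NS_PER_S = 1_000_000_000     # 1,000,000,000 ns = 1 s


def ns_to_ms(latency_ns: float) -> float:
    """Convert nanoseconds to milliseconds."""
    return latency_ns / NS_PER_MS


def ops_per_second(latency_ns: float) -> float:
    """Upper bound on strictly sequential operations per second."""
    return NS_PER_S / latency_ns


# Illustrative values (assumption: as commonly quoted from the cited talk).
latencies_ns = {
    "main memory reference": 100,
    "round trip within same data center": 500_000,
    "disk seek": 10_000_000,
}

for name, ns in latencies_ns.items():
    print(f"{name}: {ns_to_ms(ns)} ms, at most {ops_per_second(ns):,.0f} ops/s")
```

Read this way, a single disk seek costs as much time as roughly a hundred thousand memory references, which is the point of thinking at nanoscale.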
#3 DATA CENTERS ARE THE FOUNDATION
ECONOMIES OF SCALE
• ~40 data centers in 2009, 1,000,000 machines
Source: http://techcrunch.com/2008/04/11/where-are-all-the-google-data-centers/
#4 CODE IS KING, DATA IS KING KONG
DATA CENTERS, DATA, CODE, CONTROL, USERS

Enterprise Architecture
• Data centers: "We need: Cooling, Power, Perimeter Networks, Containers, Racks, Switches & Hardware at low cost that scale"
• Data: "We need: One Distributed File System, One Distributed Shared Memory, & common data formats to get scale and low cost"
• Code: "We need to build applications and services, application-, integration- & data platforms, parallel computing platforms & use an open source OS, upon our data center/data platform"
• Control: "We need scheduling, synchronization, lock services, i.e. various forms of control mechanisms for data and code"
• Users: "We need to identify our users to be able to interact, differentiate and customize the user experience"

Technical Architecture, i.e. which technologies do we use
• Data centers: Google Container-based Data Centers
• Data: GFS, BigTable, Protocol Buffers
• Code: Linux; Protocol Buffers, JSON; Python, Java, C++; Sawzall, Dremel, Percolator; MapReduce; App Engine, Gmail, Search, Index
• Control: GFS master, Google Work Queue, Chubby, NetScaler, Google HTTP Server, (Spanner)
• Users: OpenID, OAuth, Google Accounts available for most services
• Android, Chrome

Implementation Architecture, i.e. how do we implement the technologies
ABOUT DATA
[Bar chart. Categories: Structured, Numerical; Unstructured, Textual; Communication, Traffic. Value labels: +20 Petabyte/day, ~10 Terabyte/day, ~2.5 Terabyte]
"Google's mission is to organize the world's
information and make it available to all"
DATA CENTER ”ENTRY”
• The same entry to each Data Center
• ~50 caching (using Squid)
• Built their own HTTP servers/farms
Source: Ed Austin, ”The Anatomy of the Google Architecture”
INSIDE THE CONTAINERS
• Customized commodity servers, in customized racks in containers (+1000 servers), organized into clusters
• All containers are ”cloned” and look the same
Source: Jeff Dean, ”Designs, Lessons and Advice from Building Large Distributed Systems”
THE SAME HW, OS AND FILESYSTEM EVERYWHERE
Source: Jeff Dean, ”Designs, Lessons and Advice from Building Large Distributed Systems”
BIGTABLE AS DATABASE
Source: Jeff Dean, ”Designs, Lessons and Advice from Building Large Distributed Systems”
BIGTABLE IS COLUMN-BASED
Source: Jeff Dean, ”Designs, Lessons and Advice from Building Large Distributed Systems”
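A column-based store in the style of BigTable can be pictured as a sparse, sorted map from (row key, column, timestamp) to a value. The sketch below is a toy in-memory model of that idea only; the class `TinyColumnStore`, its methods, and the row and column names are hypothetical and are not the real BigTable API.

```python
from collections import defaultdict


class TinyColumnStore:
    """Toy model: row key -> column ("family:qualifier") -> versioned cells."""

    def __init__(self):
        self._rows = defaultdict(lambda: defaultdict(list))

    def put(self, row, column, timestamp, value):
        """Store a new version of a cell; versions are kept newest first."""
        cells = self._rows[row][column]
        cells.append((timestamp, value))
        cells.sort(key=lambda cell: cell[0], reverse=True)

    def get(self, row, column):
        """Return the latest value of a cell, or None if it does not exist."""
        cells = self._rows.get(row, {}).get(column, [])
        return cells[0][1] if cells else None

    def scan_column(self, column):
        """Yield (row, latest value) for rows having the column, in key order."""
        for row in sorted(self._rows):
            if column in self._rows[row]:
                yield row, self._rows[row][column][0][1]


store = TinyColumnStore()
store.put("com.example.www", "contents:html", 1, "old page")
store.put("com.example.www", "contents:html", 2, "new page")
store.put("com.example.blog", "contents:html", 1, "blog page")
print(store.get("com.example.www", "contents:html"))
print(list(scan_column) if False else list(store.scan_column("contents:html")))
```

Rows only hold the columns that were actually written, which is why such a table can be very wide and still sparse.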
BIGTABLE NEEDS GFS
• Use GFS to store data and logs
Source: Jeff Dean, ”Designs, Lessons and Advice from Building Large Distributed Systems”
MAPREDUCE - A PARALLEL COMPUTING PLATFORM
Source: Jeff Dean, ”Designs, Lessons and Advice from Building Large Distributed Systems”
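The programming model behind MapReduce can be sketched on a single machine as three phases: map each input to key/value pairs, group the pairs by key, then reduce each group. The word count below is a minimal illustration of that model only, assuming a thread pool in place of a cluster; all function names are made up for the example.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def map_words(document):
    """Map phase: emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(mapped):
    """Shuffle phase: group all emitted values by their key."""
    groups = defaultdict(list)
    for pairs in mapped:
        for key, value in pairs:
            groups[key].append(value)
    return groups


def reduce_counts(key, values):
    """Reduce phase: collapse all values for one key into a single result."""
    return key, sum(values)


def word_count(documents):
    # Documents are independent, so the map phase can run in parallel.
    with ThreadPoolExecutor() as pool:
        mapped = list(pool.map(map_words, documents))
    groups = shuffle(mapped)
    return dict(reduce_counts(key, values) for key, values in groups.items())


counts = word_count(["code is king", "data is king kong"])
print(counts)
```

In a real deployment the map and reduce calls run on many machines and the shuffle moves data over the network; here everything stays in one process.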
ABOUT CODING AT GOOGLE
• Linux as operating system everywhere - it is open source and highly customized for this purpose (Android is also a highly customized version of Linux)
• Serialization/Integration - Protocol Buffers (RPC) run at nano speed and are used internally for ”everything”; JSON and RESTful interfaces are used for external APIs
• Application-oriented programming languages - mainly Python, Java and C++
• Data-oriented programming languages - Percolator, Sawzall, Dremel for various data processing tasks (so specialised tools for data!)
• The business applications - Gmail, Search, App Engine etc. - are built upon the data center infrastructure, the data platform and the layers listed above
#5 IDENTIFY AND KEEP TRACK OF YOUR USERS
• You need a Google account to start Android properly
• OpenSocial is a collaborative effort to compete against Facebook
• OpenID is an identity standard and OAuth is a standard for authorizing services
• Google is identifying and tracking every step you take within their domains