TRANSCRIPT
Commodity Data Center Design
James Hamilton, 2007-04-17
[email protected] | http://research.microsoft.com/~JamesRH
Background and Biases
• 15+ years in database engine development teams
– Lead architect on IBM DB2
– Architect on SQL Server
• Led core engine teams over the years, including SQL clients, optimizer, SQL compiler, XML, full-text search, execution engine, protocols, etc.
• Led the Exchange Hosted Services team
– Email anti-spam, anti-virus, and archiving for 2M+ seats
– ~700 servers in 10 data centers worldwide
• Currently architect on the Windows Live Core team
• Automation & redundancy are the only way to:
– Reduce costs
– Improve the rate of innovation
– Reduce operational failures and downtime
Commodity Data Center Growth
• Software as a Service
– Services without unique value-add are going off-premise
• Payroll, security, etc. all went years ago
– Substantial economies of scale
• Services at 10^5+ systems under management rather than ~10^2
– IT outsourcing is also centralizing compute centers
• Commercial High-Performance Computing
– Leverage falling costs of H/W in deep data analysis
– Better understand customers, optimize the supply chain, …
• Consumer Services
– Google estimated at over 450 thousand systems in more than 25 data centers (NY Times)
• Basic observation:
– No single system can reliably reach 5 9's (redundant H/W is needed, with resultant S/W complexity); see the sketch below
– With S/W redundancy, the most economic H/W solution is large numbers of commodity systems
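A minimal sketch of the standard availability arithmetic behind the basic observation, assuming independent failures. The 99.9% per-node figure is an illustrative assumption, not a number from the talk.

```python
# Sketch: why software redundancy over commodity hardware can beat one
# "reliable" box. Assumes independent failures; the 99.9% per-node
# availability is illustrative, not from the slides.

def redundant_availability(node_availability: float, replicas: int) -> float:
    """Availability of a service that stays up as long as any one replica is up."""
    return 1.0 - (1.0 - node_availability) ** replicas

single = 0.999                             # one commodity node: three 9's
print(redundant_availability(single, 1))   # 0.999
print(redundant_availability(single, 2))   # 0.999999  -> six 9's on paper
print(redundant_availability(single, 3))   # 0.999999999
```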
An Idea Whose Time Has Come
• Nortel steel enclosure: containerized telecom equipment
• Sun Project Black Box: 242 systems in a 20' container
• Rackable Systems Concentro: 1,152 systems in a 40' container (9,600 cores / 3.5 PB)
• Rackable Systems container cooling model
• Caterpillar portable power
• Datatainer
• ZoneBox
• Google WillPower project (Will Whitted)
• Internet Archive Petabox (Brewster Kahle)
Cooling, Feedback, & Air Handling Gains
• Tighter control of air flow increases delta-T and overall system efficiency (see the airflow sketch below)
• Expect increased use of special enclosures, variable-speed fans, and warm machine rooms
• CRACs closer to servers give a tighter temperature-control feedback loop
• Containers take this one step further, with very little air in motion, variable-speed fans, and a tight feedback loop between CRAC and rack
[Slide figures: Intel, Intel, Verari]
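A hedged sketch of the thermodynamics behind the delta-T point above: for a fixed heat load, Q = m_dot x c_p x delta-T, so raising the supply-to-return temperature difference proportionally reduces the air the CRACs and fans must move. The 20 kW rack load and the delta-T values are illustrative assumptions.

```python
# Sketch of why tighter air-flow control (higher delta-T) cuts air-handling
# work: heat removed Q = m_dot * c_p * dT, so for a fixed server heat load
# the required mass flow of air falls as delta-T rises. Numbers below are
# illustrative assumptions, not figures from the talk.

CP_AIR = 1.006e3          # specific heat of air, J/(kg*K)

def airflow_kg_per_s(heat_load_w: float, delta_t_k: float) -> float:
    """Mass flow of air needed to carry heat_load_w at a given delta-T."""
    return heat_load_w / (CP_AIR * delta_t_k)

rack_load = 20_000.0      # 20 kW rack (assumed)
for dT in (10.0, 20.0):   # loose vs. tight air-flow management
    print(f"delta-T {dT:>4.0f} K -> {airflow_kg_per_s(rack_load, dT):.2f} kg/s of air")
# Doubling delta-T halves the air that must be moved.
```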
Shipping Container as Data Center Module
• Data center module
– Contains network gear, compute, storage, & cooling
– Just plug in power, network, & chilled water
• Increased cooling efficiency
– Variable water & air flow
– Better air-flow management (higher delta-T)
– 80% air-handling power reductions (Rackable Systems); see the fan-power sketch below
• Bring your own data center shell
– Just a central networking, power, cooling, security & admin center
– Grow beyond existing facilities
– Can be stacked 3 to 5 high
– Fewer regulatory issues (e.g., no building permit)
– Avoids (for now) building floor-space taxes
• Meet seasonal load requirements
• Single customs clearance on import
• Single FCC compliance certification
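A small sketch of the fan affinity laws often cited for variable-speed fans: airflow scales with fan speed, while fan power scales roughly with the cube of speed, so moving less air saves far more than proportionally. This is background physics, not a derivation of the 80% figure above, which is Rackable Systems' number.

```python
# Fan affinity laws (cube rule): fan power relative to a baseline, given
# airflow relative to that baseline. Illustrative model only.

def relative_fan_power(relative_flow: float) -> float:
    """Fan power relative to baseline, assuming power ~ flow**3."""
    return relative_flow ** 3

for flow in (1.0, 0.75, 0.6, 0.5):
    print(f"{flow:.0%} of baseline airflow -> {relative_fan_power(flow):.0%} of baseline fan power")
# e.g. moving 60% of the air takes ~22% of the fan power under this model.
```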
Unit of Data Center Growth
• One at a time:
– 1 system
– Racking & networking: 14 hrs ($1,330)
• Rack at a time:
– ~40 systems
– Install & networking: ¾ hour ($60)
– Considerably more efficient & now the unit of growth in efficient centers (per-system cost worked below)
• Container at a time:
– ~1,000 systems
– No packaging to remove
– No floor space required
– Requires power, network, & cooling only
• Containers are weatherproof & transportable
• Data center construction takes 24+ months
– New builds & DC expansion require regulatory approval
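A worked per-system install cost comparison using the figures on this slide. The container case is left qualitative since the slide gives no dollar figure for it.

```python
# Per-system install cost from the slide's figures:
# one-at-a-time racking vs. rack-at-a-time installation.

one_at_a_time_cost = 1330.0          # $ (and 14 hrs) per single system
rack_cost, rack_systems = 60.0, 40   # $ per install of a ~40-system rack

print(f"one at a time : ${one_at_a_time_cost:,.2f} per system")
print(f"rack at a time: ${rack_cost / rack_systems:,.2f} per system")  # $1.50
# Roughly 900x cheaper per system. A ~1,000-system container needs only power,
# network, and chilled-water hookup, so its per-system install cost is lower still.
```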
Manufacturing & H/W Admin. Savings
• Factory racking, stacking & packing is much more efficient
– Robotics and/or inexpensive labor
• Avoid layers of packaging
– Systems -> packing box -> pallet -> container
– Materials cost, wastage, and labor at the customer site
• Data center power & cooling require expensive consulting contracts
– Data centers are still custom crafted rather than prefab units
– Move the skill set to the module manufacturer, who designs power & cooling once
– Installation designed to meet module power, network, & cooling specs
• More space efficient
– Power densities in excess of 1,250 W/sq ft (floor-space arithmetic sketched below)
– Rooftop or parking lot installation acceptable (with security)
– Stack 3 to 5 high
• Service-free
– H/W admin contracts can exceed 25% of system cost
– Sufficient redundancy that the module just degrades over time
• At end of service, return for remanufacture & recycling
– 20% to 50% of system outages are caused by admin error (A. Brown & D. Patterson)
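A short floor-space comparison using the >1,250 W/sq ft figure above. The 1 MW load and the 150 W/sq ft conventional density are illustrative assumptions (the 100-200 W/sq ft range for conventional rooms appears later in the talk).

```python
# Floor area needed for an assumed 1 MW of IT load at conventional vs.
# containerized power densities. Assumed numbers are marked as such.

load_w = 1_000_000.0                 # 1 MW of critical IT load (assumed)
conventional_density = 150.0         # W/sq ft (assumed, mid-range conventional)
container_density = 1250.0           # W/sq ft (from the slide)

print(f"conventional: {load_w / conventional_density:,.0f} sq ft")   # ~6,667
print(f"container   : {load_w / container_density:,.0f} sq ft")      # 800
```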
DC Location Flexibility & Portability
• Dynamic data center
– Inexpensive intermodal transit anywhere in the world
– Move the data center to cheap power & networking
– Install capacity where needed
– Conventional data centers cost upwards of $150M & take 24+ months to design & build
• Political/social issues
– USA PATRIOT Act concerns and other national interests can require local data centers
• Build out a massively distributed data center fabric
– Install satellite data centers near consumers
Systems & Power Density
• Estimating data center power density is difficult (15+ year horizon)
– Power is 40% of DC costs
• Power + mechanical: 55% of cost
– Shell is roughly 15% of DC cost
– Cheaper to waste floor space than power
• Typically 100 to 200 W/sq ft
• Rarely as high as 350 to 600 W/sq ft
– Modular DC eliminates the shell/power trade-off
• Add modules until the power is absorbed
• 480VAC to the container
– High-efficiency DC distribution within
– High voltage to the rack can save >5% over the 208VAC approach
• Over 20% of entire DC cost is in power redundancy
– Batteries able to supply up to 12 min at some facilities
– N+2 generation
• Instead, build more, smaller, cheaper data centers
• Eliminate redundant power & the bulk of shell costs (cost fractions sketched below)
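Rough arithmetic putting the slide's cost fractions against the "$150M+" conventional build figure from the earlier slide. Applying these fractions to a single $150M facility is an illustrative assumption, and the categories overlap, so they are not meant to sum to 100%.

```python
# Cost-fraction arithmetic from the slide applied to an assumed $150M build.
# Note: categories overlap (power + mechanical includes power distribution).

dc_cost = 150e6
fractions = {
    "power distribution":         0.40,  # "power is 40% of DC costs"
    "power + mechanical":         0.55,  # superset including the cooling plant
    "shell (building)":           0.15,
    "power redundancy (UPS/gen)": 0.20,  # "over 20% of entire DC cost"
}
for item, frac in fractions.items():
    print(f"{item:26s} ~${dc_cost * frac / 1e6:,.0f}M")
# Eliminating redundant power and most of the shell, as the slide proposes,
# targets on the order of a third of total facility cost.
```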
Where do you Want to Compute Today?
Slides posted to: http://research.microsoft.com/~JamesRH