Enabling Datacenter Servers to Scale Out Economically and Sustainably
IDEAL (Intelligent Design of Efficient Architectures Laboratory)
Department of Electrical and Computer Engineering
University of Florida
Presented by Chao Li
MICRO-46, Dec. 2013, Davis, CA
Chao Li, Yang Hu, Ruijin Zhou, Ming Liu, Longjun Liu, Jingling Yuan, Tao Li
Talk Overview
1. Background and Motivation
2. Oasis: Design and Prototype
3. Optimized Oasis Operation
4. Evaluation and Discussion
[Section thumbnails: Power Control Hub schematic (Inverter, PLC, HMI, MPPT, Sensors, Switch Panel, Charger, Utility Power, L/N/G); cost breakdown pie charts (legend: Others, Mechanical, HMI, Battery, Inverter, PLC, Solar Panel; slices 4%, 7%, 9%, 13%, 16%, 22%, 29% and 5%, 14%, 76%); Server Racks; control table for P Load > P Renewable with conditions on DB and upsC and outputs Switch ('Y'/'N') and cpuFreq ('Highest', 'Low', 'Lowest', 'TBD')]
Datacenter Footprint Continues to Expand
• Horizontal scaling (scale out) has gained increasing attention
[Figure: % increase in cloud infrastructure capacity in 2013; axis ticks 0 to 160]
[1] DCD Industry Census 2012: Energy, http://www.dcd-intelligence.com/
Everything is in the Cloud
Ever-increasing user data
Endless data processing
More servers are needed!
The Power Provisioning Capacity Problem
• Datacenters are power-constrained:
  – Limited power capacity headroom
[Charts: "Run out of power capacity in 2012?" (30 / 70); "Capacity expanded in the last 5 years?" (80 / 20)]
Power Capacity Constraints
[Figure: power delivery path. Labels: Automatic Transfer Switch (ATS), Power Panel / Switch Gear, Uninterruptible Power Supply, Power Distribution Units, Server Clusters]
Existing Solutions
Preference for different solutions [1]
Consolidate Servers | 66%
Deploy Containers | 10%
Upgrade Equipment | 42%
Build New Datacenters | 29%
Lease Colocation | 24%
Move to the Cloud | 30%
Solution categories: Improve Efficiency, Facility Construction, Third-Party Solutions
[1] The Uptime Institute 2012 Data Center Industry Survey, 2012
Existing Solutions
Schemes | Problems
Improve Efficiency | Power under-provisioning issue and low performance
Facility Construction | High capital investment and long construction lead time
Third-Party Solutions | Not suitable for large-scale enterprise datacenters
Energy and Environmental Problems
• Server energy consumption:
  – 1.8% of global electricity usage
  – Might triple within 8 years [1]
300 ~ 400 TWh in 2012
1000 ~ 1400 TWh in 2020
The increase in server energy demand (2012 - 2013) [2]
USA | 8 TWh
China | 3 TWh
U.K. | 2 TWh
Japan | 2 TWh
Brazil | 2 TWh
France | 1 TWh
Benelux | 1 TWh
Canada | 1 TWh
Germany | 1 TWh
Russia | 1 TWh
Australia | 1 TWh
India | 1 TWh
[1] C. Belady, Projecting Annual New Datacenter Construction Market Size, Global Foundation Services, 2011
[2] DCD Industry Census 2012: Energy, http://www.dcd-intelligence.com/
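As a rough check on the "might triple within 8 years" projection, the sketch below computes the compound annual growth rate implied by the 2012 and 2020 ranges quoted on this slide. The function name `implied_annual_growth` is illustrative, and pairing the low ends and the high ends of the two ranges is an assumption, not something the talk states.

```python
def implied_annual_growth(start_twh: float, end_twh: float, years: int) -> float:
    """Compound annual growth rate that takes start_twh to end_twh in `years` years."""
    if start_twh <= 0 or end_twh <= 0 or years <= 0:
        raise ValueError("inputs must be positive")
    return (end_twh / start_twh) ** (1.0 / years) - 1.0


if __name__ == "__main__":
    # Ranges quoted on the slide: 300~400 TWh (2012) and 1000~1400 TWh (2020).
    # Pairing low with low and high with high is an assumption.
    for start, end in [(300, 1000), (400, 1400)]:
        rate = implied_annual_growth(start, end, 8)
        print(f"{start} -> {end} TWh over 8 years: {rate:.1%} per year")
```

For reference, an exact tripling over 8 years corresponds to roughly 15% growth per year.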
Energy and Environmental Problems
[Figure: % Performing Carbon Monitoring, by region (y-axis 0% to 40%): Russia, France, Italy, Brazil, Spain, China, Mexico, Nordics, Canada, Turkey, Benelux, USA, Germany, India, UK, Japan]
Hurricane Sandy, 2012 (Northeastern US)
Typhoon Haiyan, 2013 (Southeast Asia)
• The greenhouse effect and climate change
• 1 MW data center → 10~15 kt CO2 yearly
• Datacenters are carbon-constrained:
  – Must cap carbon emissions
Renewable Energy Powered Systems
Many IT companies are starting to integrate non-conventional clean energy solutions
Green Computing - Related Work
• Prior work mainly focuses on managing solar/wind power
  – Supply/load co-scheduling [ASPLOS’13, HPCA’13]
  – Supply-aware job scheduling [EuroSys’12]
  – Supply-driven load migration [ISCA’12]
  – Avoid shedding critical load [ASPLOS’11]
  – Optimal power allocation [HPCA’11]
• We explore carbon-conscious capacity expansion schemes
  – Scalable, sustainable, and economical power provisioning
Talk Overview
1. Background and Motivation
2. Oasis: Design and Prototype
3. Optimized Oasis Operation
4. Evaluation and Discussion
Utility Power Over-Provisioning (Conventional)
[Figure: datacenter facility. Labels: Server Racks, Power Distribution Units (PDUs), A/C Systems, Energy Storage Cabinets/UPS, Generators, Switch Gears]
Centralized Power Capacity Expansion (Conventional)
[Figure: datacenter facility with centralized solar integration. Labels: Solar Array, Inverter, Server Racks, Power Distribution Units (PDUs), A/C Systems, Energy Storage Cabinets/UPS, Generators, Switch Gears]
Scale-Out Models
Models \ Metrics | Carbon Emission | Capacity Scalability | Cost of Utility Power | Cost of Green Power
Utility Over-Provisioning | Poor | Poor | High | N/A
Centralized Expansion | Good | Poor | Reduced | High
Ideal Power Provisioning | Good | Good | Reduced | Reduced
• Oasis: green energy solutions + pay-as-you-grow model
  – Adds green power budget directly to server racks
  – Gradually increases green power capacity
We Leverage Modular Power Sources
• Distributed Battery System
  ① Coarse-grained [Figure labels: Battery Cabinet, Rack Triplets]
  ② Fine-grained [Figure label: Battery Cabinet]
• Solar Module with Micro-inverters [Figure labels: DC, AC]
Distributed Incremental Integration (Architecture of Oasis)
Micro-inverters
Solar Array
Distributed Battery Cabinet
Oasis Power Control Hub
Oasis Implementation: An Overview
[Figure: Oasis prototype overview. Labels: Servers, Rack Power Strip, Cluster-Level Power Management Agent, Network Switch, Ethernet, ModBus, Power Control Hub (Inverter, PLC, HMI, MPPT, Sensors, Switch Panel, Charger), Utility Power (L/N/G)]
• Power Ctrl. Hub
  – PLC: manages sensors and switchgears
  – HMI: communication gateway of Oasis
• Power Mgmt. Agent
  – Sends/receives power management signals
  – Coordinates power supply and server load
Oasis Implementation: An Overview
Oasis Node
• Server Nodes: Intel i…-based micro server
• Power Mgmt. Agent: AMD low-power node (… W)
• Power Ctrl. Hub: PLC, HMI, Inverter, …
• Battery Chassis: Nine … Ah lead-acid batteries
Hybrid Power Supply Scheme
• Stored solar energy
  – Release solar energy when batteries are fully charged
  – Charge batteries with solar power when the SOC is low
• Utility power supply
  – The primary energy source on cloudy days or at night
[Photo: Roof-mounted solar panels in our lab]
[Figure: Battery charging and discharging scenarios. Battery Voltage (V), 11.5 to 13, vs. Time (Seconds), 0 to 3500; annotations: Switch to Utility, Voltage Drop, Switch to Solar]
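The switching behaviour in the battery trace can be sketched as a simple hysteresis rule: fall back to utility power when the battery voltage sags, and return to solar once it has recovered. This is a minimal sketch, not the prototype's controller; the class name `SourceSwitch` and both voltage thresholds are hypothetical values chosen inside the 11.5 V to 13 V range of the plot.

```python
from dataclasses import dataclass


@dataclass
class SourceSwitch:
    """Hysteresis switch between stored solar energy and utility power."""
    low_v: float = 11.8    # hypothetical: below this, stop discharging the battery
    high_v: float = 12.8   # hypothetical: above this, the battery has recovered
    source: str = "solar"  # current power source: "solar" or "utility"

    def update(self, battery_v: float) -> str:
        """Feed one battery voltage sample and return the selected source."""
        if self.source == "solar" and battery_v <= self.low_v:
            self.source = "utility"
        elif self.source == "utility" and battery_v >= self.high_v:
            self.source = "solar"
        # Between the two thresholds the previous choice is kept,
        # which avoids rapid toggling around a single cut-off voltage.
        return self.source


if __name__ == "__main__":
    switch = SourceSwitch()
    for volts in [12.9, 12.4, 11.7, 12.2, 12.9]:
        print(volts, switch.update(volts))
```

Using two thresholds instead of one is a design choice of this sketch: a relay array that toggles on every small voltage fluctuation would wear out quickly.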
Power Control Hub - I
• Monitors power supply status
  – Emergency alert
  – Battery capacity check
  – Health status assessment
• Two monitoring approaches
  – To the power mgmt. agent
  – To the HMI touch screen display
[Photo: Inside the Pwr. Ctrl. Hub]
Power Control Hub - II
• Bridges power supply and load
  – Send/Receive control signals
  – Send/Store monitored data
[Photo: Inside the Pwr. Ctrl. Hub]
[Figure: The Power Control Hub as communication gateway. Labels: Server Clusters, Ethernet, Power Control Hub, HMI, PLC, ModBus TCP/Server, ModBus TCP/Client, RS-232/485, Actuator, Battery, Solar, Utility]
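For readers unfamiliar with ModBus TCP, the sketch below builds a standard "Read Holding Registers" request and decodes the matching response using only the Python standard library. The frame layout (MBAP header plus function code 0x03) follows the public Modbus specification; the register address, the unit ID, and the idea that monitored values live in holding registers are assumptions for illustration, since the talk does not give the hub's register map.

```python
import struct

READ_HOLDING_REGISTERS = 0x03  # standard Modbus function code


def build_read_request(transaction_id: int, unit_id: int,
                       start_addr: int, count: int) -> bytes:
    """Build a Modbus TCP request that reads `count` holding registers."""
    pdu = struct.pack(">BHH", READ_HOLDING_REGISTERS, start_addr, count)
    # MBAP header: transaction id, protocol id (always 0), length, unit id.
    # The length field counts the unit id byte plus the PDU.
    mbap = struct.pack(">HHHB", transaction_id, 0, len(pdu) + 1, unit_id)
    return mbap + pdu


def parse_read_response(frame: bytes) -> list:
    """Return the 16-bit register values carried by a read response."""
    _tid, protocol_id, _length, _unit = struct.unpack(">HHHB", frame[:7])
    if protocol_id != 0:
        raise ValueError("not a Modbus TCP frame")
    function, second = struct.unpack(">BB", frame[7:9])
    if function & 0x80:
        # Exception responses set the high bit; the next byte is the error code.
        raise ValueError(f"Modbus exception code {second}")
    byte_count = second
    return list(struct.unpack(f">{byte_count // 2}H", frame[9:9 + byte_count]))


if __name__ == "__main__":
    # Hypothetical register 0x0010 on unit 1, two registers requested.
    print(build_read_request(1, 1, 0x0010, 2).hex())
```

In a deployment, the request bytes would be written to a TCP socket connected to the ModBus TCP server (conventionally port 502) and the reply passed to `parse_read_response`.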
Power Control Hub - III
• Performs power supply switch
  – Switch between solar power and utility power
  – Leverage high-voltage relay array controlled by a PLC
• Two switching modes
[Photo: Inside the Pwr. Ctrl. Hub]
Power Management Agent (PMA)
• Adaptive power source switching
  – Manages utility power usage (affects carbon footprint)
  – Manages solar energy and battery usage
• Supply-aware server load tuning
  – Dynamic voltage and frequency scaling (DVFS)
  – Trigger VM migration/checkpointing if necessary
Can be implemented as either middleware or external control node
[Figure: Server System (The Load). Labels: PCH, ModBus TCP/Server, ModBus TCP/Client, Server OS, PMA (as server node), PMA (as middleware), Workload]
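Supply-aware load tuning with DVFS can be illustrated as picking the fastest CPU frequency whose power draw still fits the available supply. The sketch below is a simplified stand-in for the agent's logic, not its actual implementation: the function name `pick_frequency` and the frequency/power table are hypothetical, and returning `None` stands for the "trigger VM migration/checkpointing if necessary" case.

```python
from typing import Dict, Optional


def pick_frequency(power_budget_w: float,
                   freq_power_table: Dict[float, float]) -> Optional[float]:
    """Return the highest frequency (GHz) whose power (W) fits the budget.

    Returns None when even the lowest frequency exceeds the budget, which a
    caller could treat as the signal to migrate or checkpoint the workload.
    """
    feasible = [freq for freq, watts in freq_power_table.items()
                if watts <= power_budget_w]
    return max(feasible) if feasible else None


if __name__ == "__main__":
    # Hypothetical per-node operating points: frequency in GHz -> power in W.
    table = {1.2: 30.0, 1.8: 42.0, 2.4: 60.0}
    for budget in (65.0, 50.0, 20.0):
        print(budget, pick_frequency(budget, table))
```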
Talk Overview
1. Background and Motivation
2. Oasis: Design and Prototype
3. Optimized Oasis Operation
4. Evaluation and Discussion
Ozone: Optimized Oasis Operation (O3)
• Server load: Adjust demand level
• Power supply: Switch power source
• Battery systems: Stored green energy usage
• Design trade-offs: Performance acceptable? Carbon footprint satisfied? Battery cost reduced? Availability guaranteed?
Backup Capacity
• Capping green energy usage for each discharge cycle
  – The stored green energy level affects backup time
  – Should avoid low state of charge (SOC)
[Figure: battery SOC from 0% to 100%, split into Reserved Capacity and Flexible Capacity. Annotations: Limited emergency handling capability; Relatively longer recharge time; Limited green energy delivery]
• Use different power management schemes at different SOC
  – Abundant stored energy? (60% ~ 100% SOC)
  – Not enough stored energy? (20% ~ 60% SOC)
  – Should avoid low SOC (i.e., SOC < 20%)
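The three SOC ranges on this slide map naturally onto a small classifier that a power manager could consult before choosing a scheme. This is a minimal sketch under stated assumptions: the function name `soc_band` and the band labels are illustrative, and assigning exactly 60% to the "abundant" band is a choice made here because the slide lists 60% in both ranges.

```python
def soc_band(soc: float) -> str:
    """Classify a battery state of charge given as a fraction in [0, 1]."""
    if not 0.0 <= soc <= 1.0:
        raise ValueError("soc must be between 0 and 1")
    if soc < 0.20:
        return "avoid"      # SOC < 20%: should not be discharged further
    if soc < 0.60:
        return "scarce"     # 20% ~ 60%: not enough stored energy
    return "abundant"       # 60% ~ 100%: abundant stored energy


if __name__ == "__main__":
    for level in (0.85, 0.45, 0.10):
        print(f"{level:.0%}: {soc_band(level)}")
```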
Discharge Budget
• Discharge throughput model
  – The total energy that can be cycled through a battery is fixed
[Figure: # of Cycles and Throughput (kWh); legend: Cycles to Failure, Throughput]
• Capping the aggregated discharge throughput
  – Predicting lifetime based on the remaining throughput
  – Capping battery discharge to avoid over-use
Manage solar energy usage based on $D_{budget} - D_{aggregated}$, where
$D_{aggregated} = \sum_{t} D_{Ah}$
$D_{budget} = D_{rated} \cdot T / Lifetime$
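One way to read the discharge budget is as an allowance that accrues in proportion to elapsed time: after a fraction T/Lifetime of the desired battery lifetime, at most that fraction of the rated throughput should have been discharged. The sketch below implements that reading; it is an interpretation of the slide rather than the authors' code, and the function name, the units (Ah and days), and the clamping at zero are all assumptions.

```python
from typing import Iterable


def remaining_discharge_budget(rated_throughput_ah: float,
                               elapsed_days: float,
                               target_lifetime_days: float,
                               discharged_ah: Iterable[float]) -> float:
    """Remaining discharge allowance in Ah, clamped at zero.

    budget     = rated throughput * elapsed time / target lifetime
    aggregated = sum of all Ah discharged so far
    """
    if target_lifetime_days <= 0:
        raise ValueError("target lifetime must be positive")
    aggregated = sum(discharged_ah)
    budget = rated_throughput_ah * elapsed_days / target_lifetime_days
    return max(0.0, budget - aggregated)


if __name__ == "__main__":
    # Hypothetical battery: 10,000 Ah rated throughput, 10-year target life,
    # one year elapsed, 700 Ah discharged so far.
    print(remaining_discharge_budget(10_000, 365, 3650, [400, 300]))
```

A result of zero would mean the battery has been used faster than the lifetime target allows, so further discharge should be deferred.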
Supply/Load Control of Ozone
• Coordinating server load and power supply switch
  – Based on the capacity level of stored green energy
  – Based on the aggregated stored green energy usage
 | Discharge Budget > 0 | Discharge Budget = 0
Flexible Capacity > 0 | Give Priority to Releasing Stored Solar Energy (Use DVFS if necessary) | Switch to Utility
Flexible Capacity = 0 | Give Priority to Server Power Capping (Use battery if necessary) | Switch to Utility
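The two-by-two decision table on this slide can be written directly as a small function. This is a sketch of the table only: the function name `ozone_action` and the returned action labels are illustrative, and how flexible capacity and discharge budget are measured is left to the caller.

```python
def ozone_action(flexible_capacity: float, discharge_budget: float) -> str:
    """Map the slide's two conditions to a supply/load control action."""
    if discharge_budget <= 0:
        # Budget exhausted: both rows of the table switch to utility power.
        return "switch_to_utility"
    if flexible_capacity > 0:
        # Stored solar energy is available: release it first, DVFS if needed.
        return "release_stored_solar"
    # No flexible capacity left: cap server power first, battery if needed.
    return "cap_server_power"


if __name__ == "__main__":
    for flex, budget in [(5.0, 2.0), (0.0, 2.0), (5.0, 0.0), (0.0, 0.0)]:
        print(f"flexible={flex}, budget={budget}: {ozone_action(flex, budget)}")
```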
Talk Overview
1. Background and Motivation
2. Oasis: Design and Prototype
3. Optimized Oasis Operation
4. Impact of Oasis Design
Job Latency vs. Battery Life
• Ozone seeks a balance between supply tuning and load tuning
  – Battery-based design (Oasis-B) emphasizes performance
  – Load-scaling-based design (Oasis-L) emphasizes battery lifetime
[Figure: Job Delay (0% to 8%) for Oasis-B, Oasis-L, and Ozone across workloads Sort, WCount, PRank, Nutch, Bayes, Kmeans, Web, Media, YCSB, SWtest, and Avg.]
[Figure: Lifetime (Years, 0 to 7) for Oasis-B, Oasis-L, and Ozone]
Battery Backup Time
• Ozone also maintains the best battery backup capacity
  – Under different levels of renewable power variability
[Figures: Backup Capacity (0% to 100%) for Oasis-B, Oasis-L, and Ozone; panels: High solar power variability, Low solar power variability]
Cost Projection
• Solar systems and batteries are major cost components
  – PCH: < 4% total cost
• Oasis could result in 25% less total CapEx
  – Depending on the hardware cost trend
[Figure: cost breakdown pie charts for the Scaled-down Prototype and the Large-scale Deployment; legend: Solar, Battery, PCH, Others]
[Figure: Normalized Cost (0 to 1) vs. Year (0 to 10th); series: Oasis with 6%/year Solar Cost Decline, Oasis with 12%/year Solar Cost Decline, Conventional Centralized Integration]
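The intuition behind the pay-as-you-grow saving can be shown with a toy model: if hardware prices fall every year, buying capacity in yearly increments costs less than buying all of it up front. The sketch below is not the talk's cost model and does not reproduce its 25% figure; it assumes equal yearly increments, ignores batteries, the PCH, installation, and discounting, and the function name `incremental_cost` is illustrative.

```python
def incremental_cost(annual_decline: float, years: int,
                     unit_price: float = 1.0) -> float:
    """Cost of buying 1/years of the capacity each year at a declining price.

    The result is normalized to an up-front purchase of the full capacity at
    `unit_price`, which costs exactly `unit_price` in this model.
    """
    if years <= 0 or not 0.0 <= annual_decline < 1.0:
        raise ValueError("need years > 0 and 0 <= annual_decline < 1")
    share = 1.0 / years
    return sum(share * unit_price * (1.0 - annual_decline) ** year
               for year in range(years))


if __name__ == "__main__":
    # The 6% and 12% yearly declines are the two trends named in the plot legend.
    for decline in (0.06, 0.12):
        cost = incremental_cost(decline, 10)
        print(f"{decline:.0%}/year decline: normalized 10-year cost {cost:.2f}")
```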
Conclusions
• A distributed, incremental green energy integration method could reduce capital expenditure by 25%
• Balancing power supply control and server load control can further improve the design trade-offs
• IT can be the enabler of sustainability: Expanding datacenters using green energy in the big data era!
• Integrating modular green energy sources allows data centers to scale out sustainably
February 15-19, 2014
http://hpca20.ece.ufl.edu/
Welcome!
Green Computing