making sense of thermoelectrics for processor thermal …scale.engin.brown.edu/pubs/islped15.pdf ·...

Making Sense of Thermoelectrics for Processor Thermal Management and Energy Harvesting

Sriram Jayakumar
School of Engineering
Brown University
Providence, RI 02912

Email: sriram [email protected]

Sherief Reda
School of Engineering
Brown University
Providence, RI 02912

Email: sherief [email protected]

Abstract: A thermoelectric (TE) device can be used as a heat pump that consumes electric power to cool a processor chip, or it can be used as a heat engine that generates electricity from the heat dissipated during processor operation. To better understand the use of TE devices, we develop a fully instrumented processor-based system with controllable TE devices. We first examine the use of TE devices for energy harvesting. We identify a pitfall in previous works that can lead to wrong conclusions about the use of thermoelectric generators (TEGs): we demonstrate that TEGs increase the processor's leakage power, which offsets their harvested power. For thermoelectric cooling (TEC), we elucidate the intricate relationships between the processor power, thermoelectric power, and fan power. We propose a dynamic thermal management (DTM) scheme that maximizes performance under thermal constraints and given total power budgets by controlling the processor's dynamic voltage and frequency scaling (DVFS), TEC current, and fan speed. For the evaluated thermal constraints, our results demonstrate an average 19.6% performance improvement at the cost of an additional 9.15 W of cooling power compared to standard DVFS+fan DTM techniques.

I. INTRODUCTION

The performance of most modern processors is thermally constrained: the frequency and voltage of operation are controlled to prevent high temperatures that can compromise reliability. Modern processors make use of turbo modes, where a dynamic thermal management (DTM) system boosts the frequency of operation based on the available thermal slack. In addition to thermal constraints, it is also important for computing systems to operate within a power budget to minimize energy consumption and to prolong operational time, especially for battery-operated systems.

Thermoelectric (TE) devices are made of an array of p-type and n-type semiconductor (e.g., bismuth telluride) thermocouples that are connected electrically in series but thermally in parallel. In processor-based systems, the TE device is usually inserted between the processor and the heat sink. TE devices can operate as thermoelectric coolers (TECs) by pumping heat from the chip side to the heat sink side, where the amount of heat pumping is linearly proportional to the TEC's electric current. The TEC's electric current also leads to Joule heating, which must be dissipated by the heat sink as well. A TE device can also be used as a thermoelectric generator (TEG), where it generates electricity from the heat dissipated during chip operation. In this case the TEG uses the natural thermal gradient across it, from the hot chip to the colder sink, to generate a voltage that is proportional to the gradient.

TE devices provide an exciting opportunity for computing systems: they can enable higher performance levels through improved cooling, and/or they can be used as electrical generators that harvest the waste heat of computer chips. In this paper we create a real system that uses a TE device with a 22 nm quad-core processor to elucidate the intricacies of using TE devices with processors. The contributions of this paper are as follows.

• We examine the use of TEGs in a processor context. We analyze the thermal impact of using TEGs to extract energy and its ramifications on the leakage power of the processor. Compared to a standard setup which does not use TEGs, we demonstrate an increase in leakage power when TEGs are used, and as a result we conclude that previous studies overestimated the benefits of TEGs because the harvested power is offset by an increase in leakage power.

• To demonstrate TE potential as a TEC, we first characterize our system under a wide range of settings for dynamic voltage and frequency scaling (DVFS), fan speed and TEC current, and elucidate the intricate relationships between the processor power, thermoelectric power, and fan power. Based on these relationships, we devise a DTM algorithm that maximizes performance subject to thermal constraints for a given total (computing and cooling) power budget. Our algorithm controls DVFS, fan speed and TEC current to achieve its aims.

• We implement our methods on a real system based on a 22 nm Ivy Bridge Intel Core i5 processor (CPU) equipped with a TE device and a traditional fan. The system is fully instrumented to monitor the CPU power consumption and internal thermal sensors, and to control the TEC's supply current and the fan speed. All measurements are collected on the system itself, providing an ideal testbed for DTM algorithms. Given a thermal constraint, our experiments demonstrate an average 19.6% performance increase with an additional 9.15 W of TEC cooling compared to standard DVFS+fan DTM techniques across a number of benchmarks.

The organization of this paper is as follows. In Section II we provide an overview of the relevant work in the literature. Section III describes the setup that is used for our analyses and experiments. In Section IV we analyze the use of TE devices for energy harvesting, and in Section V we analyze their use for cooling and propose a new DTM method. In Section VI, we summarize the main conclusions of this work.

II. BACKGROUND

Cooling Processors. To analyze the impact of TEC cooling on processors, Koester et al. examined the use of TECs to cool hot spots of processors as a function of TEC current [11]. Chowdhury et al. demonstrated the first use of a high figure-of-merit superlattice TEC for cooling a processor [5]. Compared to bulk TE devices, superlattice or thin-film TECs achieve a higher figure of merit by increasing the electrical conductivity and reducing the thermal conductivity through nanostructuring fabrication methods. Chaparro et al. evaluated the prospect of thin-film TECs for dynamic thermal management in a simulation environment, where the DVFS setting of a core is increased to leverage the thermal slack that is created when a TEC is enabled [3]. Alexandrov et al. provided compact models for thin-film TECs and used the models to simulate the transient time required to establish steady-state temperatures [2], while Sullivan et al. analyzed the use of multiple TECs to cool multiple hot spots [12]. Paterna and Reda investigated the use of per-core thin-film TECs to mitigate dark silicon problems in multi-core processors [8]. Dousti and Pedram formulated the problem of identifying the TEC current and fan speed that minimize the total power subject to a thermal constraint [6], [7]. Their model formulation is based on design information; the formulation is then solved using active-set sequential quadratic programming, and the results are verified in simulation.

Energy Harvesting. A number of works examine the use of TE devices as TEGs in processor-based systems [1], [13], [4]. Using simulations and models, Choday et al. evaluated the energy harvested from a TEG and the cooling achieved by a TEC as a function of the workload running on a processor [4]. Alexandrov et al. demonstrated a system in which the TE device is used for energy harvesting when the chip temperature is low, but is used for cooling when the chip's temperature increases [1]. Wu analyzed the harvested power and the impact on the processor's reliability of the temperature increase caused by the TEG [13].

Shortcomings. We observe that previous works have a number of shortcomings. First, many ignore or miscalculate the impact of the additional thermal resistance of the TEG devices, which leads to a higher processor temperature than predicted by simulations. None of the previous works considered the impact of this temperature increase on the leakage power of the processor. This higher leakage can offset the benefits of TEGs, as we will demonstrate in this paper.

Second, all thin-film TEC papers use or assume substandard thermal grease for the thermal interface material (TIM), with a thermal conductivity of 1.75 W/mK in the models. As a result, replacing the standard thermal grease with the thin-film TE device (with an effective thermal conductivity of 17 W/mK) leads to a smaller thermal resistance, even in passive mode, than the baseline case without a TEC. Consequently, the passive cooling effect of thin-film TECs may not occur when good TIMs are used. In fact, most modern processors use higher-quality TIMs: silver-based compounds reach a thermal conductivity of 8.5 W/mK, and indium-based solders used in high-power processors reach 30 to 50 W/mK [10]. These TIMs are also cheaper than TE devices.
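To make the TIM argument concrete, the following minimal sketch compares one-dimensional conduction resistances, R = t / (kA), for the conductivities quoted above. The layer thickness and area are hypothetical and are held equal for every material purely for illustration; they are not values from our setup, and a real thin-film TE layer need not be as thin as a grease layer.

```python
def slab_resistance(thickness_m, conductivity_w_mk, area_m2):
    """One-dimensional conduction resistance of a uniform layer, in K/W."""
    return thickness_m / (conductivity_w_mk * area_m2)


# Thermal conductivities quoted in the text (W/mK).
conductivities = {
    "thermal grease": 1.75,
    "thin-film TE (effective)": 17.0,
    "silver-based TIM": 8.5,
    "indium solder (low)": 30.0,
    "indium solder (high)": 50.0,
}

# Hypothetical geometry, identical for all layers: 50 um thick over 1 cm^2.
THICKNESS_M = 50e-6
AREA_M2 = 1e-4

resistances = {
    name: slab_resistance(THICKNESS_M, k, AREA_M2)
    for name, k in conductivities.items()
}

for name, r in resistances.items():
    print(f"{name:26s} {r:.4f} K/W")
```

Under these assumptions the thin-film TE layer only looks favorable against the 1.75 W/mK grease; against an indium solder of equal geometry it has the higher resistance.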

III. SYSTEM SETUP AND DESIGN

To assess the use of TE devices in computing systems, we put together a fully instrumented setup (shown in Figure 1) that consists of the following main components.

• A motherboard that hosts a 22 nm FinFET-based Intel i5-3450S Ivy Bridge processor. The processor comes with a standard heat sink and a fan whose rotational speed can be controlled from 500 rpm to 2600 rpm. The system runs Ubuntu 12.04.

• To measure the processor's power consumption, we intercept the 12 V DC supply lines to the processor with a 1 mV/A shunt resistor and use an Agilent A34410A digital multimeter to measure the electric current to the processor. The measurements are relayed over USB to our system. An additional Agilent A34410A digital multimeter is used to measure the output voltage when the TE device is used in TEG mode.

• RMT 1MC06-142-03AN25 TE devices. Each device is 26.5 mm × 12 mm, with 0.4 mm thermocouple thickness. We special-ordered thin AlN ceramic pads instead of standard Al2O3 ceramic pads to reduce the thermal resistance of the pads. The entire device has a thickness of 0.9 mm.

• A programmable Tektronix 4205 power supply, used to control the current to the TE device in cooling mode. The power supply is connected over USB to the motherboard to enable real-time programming from our DTM system.

Throughout the paper we consider two setups for comparison: the baseline system without the TE devices, shown in Figure 2.a, and the setup of Figure 2.b, in which the TE devices are placed between the heat spreader and the heat sink and fan. We use two TE devices that are thermally in parallel but electrically in series, which reduces the passive thermal resistance and the processor's temperature. The two TE devices occupy an area of 26.5 mm × 24 mm.

IV. MAKING SENSE OF TEGS FOR ENERGY HARVESTING

Fig. 1. Setup for studying use of TE devices. Labeled components: TEC programmable current supply; CPU power monitor; CPU, TE device and fan assembly.

Fig. 2. Processor and heat removal assembly with and without TECs: (a) processor die, heat spreader, heat sink & fan; (b) processor die, heat spreader, TECs, heat sink & fan. Silver-based TIM (Arctic Silver 5) is used at the interfaces of all objects.

Compared to the baseline system without TE devices, the higher thermal resistance arising from the insertion of the TE device in the heat removal path creates a challenging problem because it increases the temperature and leakage power of the processor. Previous works analyzed the increase in temperature and its impact on reliability [1], [13], [4], but did not analyze the impact on leakage. Analyzing leakage power is the most important issue because it can offset the benefits of using TEGs for energy harvesting. To assess this possibility, we create a 4-threaded microkernel application with an extremely stable nature. We run this workload on the baseline system at various DVFS settings and record the temperatures from the sensors and the power of the processor after reaching steady state. We fix the fan speed at 2600 rpm. We repeat the same runs after inserting the TEG into the system, and again record the processor power, the temperature sensors, and the TEG output voltage. The TEG open-circuit output voltage is equal to $nS\Delta T$, where $n$ is the number of thermocouples electrically in series, $S$ is the Seebeck coefficient, and $\Delta T$ is the thermal gradient across the TEG. The maximum power generation occurs when the load resistance matches the TEG's internal resistance. However, the internal resistance is a function of the TEG's temperature. Thus, for every experiment, the TEG output voltage is recorded for the open-circuit case (denoted $V_1$) and under a load $R_L$ (denoted $V_2$), and the maximum TEG power is then given by $V_1^2 / (4 R_L (V_1/V_2 - 1))$ [9]. This maximum power will be delivered to an external load only when the load's impedance matches the TEG's internal impedance. Our open-circuit voltage ranged from 1.70 to 2.57 V, and the loaded voltage ranged from 0.72 to 1.05 V with a 10 Ω load resistor.
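The two-measurement procedure above can be sketched in a few lines. The loaded voltage is the output of a voltage divider between the TEG's internal resistance and the load, so the internal resistance is R_L (V_1/V_2 - 1) and the matched-load power is V_1^2 divided by four times that resistance. The function names are illustrative, and pairing the low ends and the high ends of the two reported voltage ranges is an assumption, since the text reports ranges rather than per-run pairs.

```python
def teg_internal_resistance(v_open, v_load, r_load):
    """Internal resistance (ohm) inferred from the voltage divider with the load."""
    return r_load * (v_open / v_load - 1.0)


def teg_max_power(v_open, v_load, r_load):
    """Maximum deliverable power (W), reached when the load is matched."""
    return v_open ** 2 / (4.0 * teg_internal_resistance(v_open, v_load, r_load))


R_LOAD = 10.0  # ohm, the load resistor quoted in the text

# Assumption: range endpoints are paired low-with-low and high-with-high.
p_low_mw = 1e3 * teg_max_power(1.70, 0.72, R_LOAD)
p_high_mw = 1e3 * teg_max_power(2.57, 1.05, R_LOAD)
print(f"max TEG power: {p_low_mw:.1f} mW to {p_high_mw:.1f} mW")
```

With these rounded voltages the sketch lands within about 1 mW of the harvested-power range reported for Figure 3.b.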

Fig. 3. Impact of TEGs on temperature and leakage power. (a) Temperatures in standard setup vs. TEG setup: TEG CPU temperature and baseline CPU temperature (°C) versus DVFS (1.6 to 2.8 GHz). (b) TEG vs. leakage power increase: TEG harvested power and additional leakage power (mW) versus DVFS (GHz).

Fig. 4. Quantifying leakage for evaluated Core i5 processor: CPU power (W) versus temperature (°C), with fitted curve $y = 6\times10^{-5}x^3 - 0.0079x^2 + 0.45x + 21$.

We plot in Figure 3.a the temperatures of the baseline setup and of the setup where the TEG is connected to a load, as a function of DVFS. The open-circuit TEG exhibits worse results. The plot shows that inserting the TE device leads to an increase in temperature of about 10.15 to 14.85 °C. We then plot in Figure 3.b the difference in the processor's power between the TE setup and the baseline setup, together with the maximum power harvested from the TEG. The results show that the power harvested by the TEG is offset by a much larger increase in the processor's leakage power. In the plot, the TEG has a peak power generation in the range of 54 to 113 mW depending on DVFS, but the leakage increase ranges from 945 to 1866 mW. Thus, there is no net power generation compared to the baseline setup without the TEG. TEG power generation is ultimately in conflict with leakage: the generated voltage is linearly proportional to $\Delta T$, and a higher $\Delta T$ implies a higher temperature for the processor, whereas leakage is exponentially dependent on temperature. That is why the gap between power generation and leakage widens with increasing DVFS. Our results demonstrate that previous work ignored the crucial impact of the TEG on the leakage power of the processor. Despite this negative result, future thinner TE devices with a higher figure of merit may mitigate this problem by reducing the thermal impact on the processor and increasing the TEG's energy generation.
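As a rough worked check of the net balance, the endpoints of the two reported ranges can be subtracted directly. Pairing the low ends with each other and the high ends with each other is an assumption; it is suggested, but not stated, by the remark that both quantities grow with DVFS.

```latex
\begin{aligned}
P_{\text{net}} &= P_{\text{harvested}} - \Delta P_{\text{leakage}} \\
\text{low end:}\quad  P_{\text{net}} &\approx 54 - 945 = -891\ \text{mW} \\
\text{high end:}\quad P_{\text{net}} &\approx 113 - 1866 = -1753\ \text{mW}
\end{aligned}
```

Under this pairing the added leakage is roughly 16 to 18 times the harvested power at both ends of the range.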

Our results show that it is important to model or measure the leakage power of the processor. We observe that using the TE device in TEC mode together with the fan makes it possible to quantify the entire leakage power profile of the processor as a function of its temperature. We execute our stable microkernel application and fix the DVFS setting at 2.8 GHz. We then sweep the fan speed (500 to 2600 rpm) and the current of the TEC (0 to 1.5 A) and record the average temperature from the thermal sensors and the processor's total power at steady state for every combination. Figure 4 gives the power of the processor as a function of its average temperature. Note that the changes in the processor's power are attributed entirely to leakage power, since the dynamic power is fixed by the stable nature of our application and the fixed voltage setting of DVFS.
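The cubic fit printed in Figure 4 can be used to estimate how much leakage changes between two operating temperatures. The sketch below evaluates that polynomial; the coefficients are the rounded values shown in the figure, the fit applies only to this processor at 2.8 GHz, and extrapolating outside the plotted 35 to 75 °C range is not supported by the data. The function names are illustrative.

```python
def fitted_cpu_power(temp_c):
    """CPU power (W) from the cubic fit printed in Figure 4; temp_c in deg C."""
    return 6e-5 * temp_c ** 3 - 0.0079 * temp_c ** 2 + 0.45 * temp_c + 21.0


def leakage_change(temp_from_c, temp_to_c):
    """Change in CPU power (W) attributed to leakage between two temperatures."""
    return fitted_cpu_power(temp_to_c) - fitted_cpu_power(temp_from_c)


# Example: a 10 deg C rise inside the fitted range.
delta_w = leakage_change(50.0, 60.0)
print(f"50 -> 60 C adds about {delta_w:.2f} W of leakage")
```

Because dynamic power is held fixed in this experiment, differences of the fitted curve can be read as leakage differences, which is how the sketch interprets them.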

V. MAKING SENSE OF TECS FOR DTM

DVFS and fan speeds have been the main knobs for DTM, where DVFS is typically increased until a thermal constraint is reached, and the fan speed is adjusted depending on the sensed temperatures. In TEC-based systems, a heat sink and a fan are still required to remove the heat pumped from the processor as well as the TEC's own heat. Our goal is to devise DTM systems that leverage three knobs: DVFS settings, TEC currents and fan speeds. We observe that the presence of TECs can lead to large power consumption for cooling. Given that most computing systems have a power budget, we consider a DTM system that seeks to maximize performance subject to both a total power budget and a thermal constraint, where the total power budget covers both computing by the processor and cooling by the TEC and fan. We first analyze the effectiveness of the TEC and fan in Subsection V-A, and then analyze the relationship between DVFS and TEC-based cooling in Subsection V-B, which leads to our devised DTM method in Subsection V-C.

A. Cooling: TEC and Fan

In this subsection our objective is to study the impact of the settings for the TEC and the fan on the temperature of the processor and on total power. The amount of heat $q_c$ that a TEC can pump is controlled by its input current, $I$, where at steady state

$$q_c = S T_c I - K \Delta T - \tfrac{1}{2} I^2 R, \tag{1}$$

where $T_c$ is the temperature of the cold side, $K$ is the TEC thermal conductance, $\Delta T$ is the thermal gradient across the TEC, and $R$ is the internal electrical resistance of the TEC. The power consumption of the TEC is equal to

$$P_{tec} = S \Delta T I + I^2 R. \tag{2}$$
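Equations 1 and 2 are easy to evaluate side by side. The sketch below does so for a few currents; every numeric parameter in it is hypothetical and chosen only to exercise the formulas, not measured on our TE devices, and $T_c$ and $\Delta T$ are held fixed even though in a real system they shift with the current.

```python
def heat_pumped(s, k, r, t_cold, d_t, current):
    """Equation 1: heat (W) absorbed at the cold side of the TEC."""
    return s * t_cold * current - k * d_t - 0.5 * current ** 2 * r


def tec_power(s, r, d_t, current):
    """Equation 2: electrical power (W) consumed by the TEC."""
    return s * d_t * current + current ** 2 * r


# Hypothetical module-level parameters (not values from the paper).
S = 0.05      # Seebeck coefficient, V/K
K = 0.5       # thermal conductance, W/K
R = 2.0       # electrical resistance, ohm
T_COLD = 320.0  # cold-side temperature, K
D_T = 10.0      # thermal gradient across the TEC, K

for current in (0.0, 0.5, 1.0, 1.5):
    q = heat_pumped(S, K, R, T_COLD, D_T, current)
    p = tec_power(S, R, D_T, current)
    print(f"I = {current:.1f} A: q_c = {q:6.2f} W, P_tec = {p:5.2f} W")
```

At zero current the pumped heat is negative, which corresponds to heat simply conducting across the passive device along the gradient.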

For the fan, the power consumption $P_{fan}$ scales as $P_{fan} \propto \omega^3$, where $\omega$ is its rotational speed. In computer systems, the fan is connected to a 12 V supply line from the motherboard, and the speed is adjusted using pulse width modulation, which lowers the effective voltage of the fan.
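The cubic relation means that fan power can be scaled from a single reference point. The sketch below assumes that the upper end of the fan power range reported later in this section (1.75 W) is reached at the maximum speed of 2600 rpm; that pairing is an assumption, and a real fan will deviate from the ideal cubic law.

```python
def fan_power_w(rpm, ref_rpm=2600.0, ref_power_w=1.75):
    """Fan power (W) under the ideal cubic law, scaled from one reference point."""
    return ref_power_w * (rpm / ref_rpm) ** 3


for rpm in (500, 1750, 2600):
    print(f"{rpm:4d} rpm -> {fan_power_w(rpm):.3f} W")
```

Halving the speed cuts the modeled power by a factor of eight, which is why the fan contributes little to the total power at low speeds.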

To study the impact of $I$ and $\omega$, we use our stable 4-threaded microkernel application to activate all cores at the highest DVFS setting. We then apply various combinations of $\omega$ and $I$ to the fan and TEC, respectively. For each combination, we wait until steady state and record the average processor temperature from its four sensors, the power consumption of the TEC and the power consumption of the fan. These measurements are averaged over 30 seconds at steady state. From the results, we make the following four key observations.

First, increasing the fan speed always reduces the temperature and leakage of the processor, though with diminishing returns, as shown in Figure 5. However, increasing the TEC current first leads to a reduction in temperature because of the higher heat pumping given by Equation 1, but large TEC currents later lead to larger Joule heating in the TEC, as given by Equation 2, which increases the temperature of the heat sink and the processor [5]. This latter increase is not shown in the figure because it requires currents in excess of 2 A in our setup.
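The turnaround in the first observation can be read off Equation 1 under a simplifying assumption that the text does not make: if $T_c$ and $\Delta T$ are treated as fixed while $I$ varies, the pumped heat is a downward parabola in $I$.

```latex
\frac{\partial q_c}{\partial I} = S T_c - I R = 0
\quad\Longrightarrow\quad
I^{*} = \frac{S T_c}{R},
\qquad
q_c(I^{*}) = \frac{S^2 T_c^2}{2R} - K \Delta T .
```

Beyond $I^{*}$ the Joule term grows faster than the Peltier term, so additional current pumps less heat. In a real system $T_c$ and $\Delta T$ move with $I$ and the heat sink must also reject $P_{tec}$, so the measured turning point need not coincide with this idealized $I^{*}$.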

Fig. 5. Die average temperature as a function of fan speed. Axes: fan speed (rpm), TEC current (A), temperature (°C).

Fig. 6. Total system power as a function of fan speed and TEC current. Axes: fan speed (rpm), TEC current (A), total system power (W).

Second, we plot in Figure 6 the total system power (i.e., processor, fan and TEC) as a function of TEC current and fan speed. The figure shows that activating the TEC with $I = 0.3$ A and the fan with $\omega = 1750$ rpm minimizes the total power, which is 7.5% lower than at the setting that minimizes just the cooling power (i.e., $I = 0$ A and $\omega = 500$ rpm). This result is attained from savings in the processor's leakage power despite the joint power consumption of the TEC and fan. Thus, operating at these settings should be the default for DTM, because operating with less TEC current or fan speed will simultaneously increase total power consumption and the processor's temperature. Note that this point will be slightly impacted by the nature of the workload and DVFS. It is also worth noting that TECs have a much larger operational and power consumption range than fans. Thus, fans are likely to reach their maximum limits much earlier than TECs. For instance, while the fan in our system consumes 0 to 1.75 W depending on its speed, the TEC consumes 0 to 29 W depending on its current and thermal gradient.

Third, there is a subtle interaction between the fan speed and the TEC power consumption. For a fixed TEC current, increasing the fan speed reduces the TEC power consumption, because increasing the fan speed reduces the temperature gradient across the TEC, which reduces the TEC's voltage and power consumption as given by Equation 2. In our results, increasing the fan speed from 500 rpm to 2600 rpm reduced the TEC power consumption by 8 to 11% depending on the TEC current and CPU power.

Fourth, TECs offer much more rapid control of the processor's temperature than regular fans, because the TEC cold side is in direct contact with the processor and the TEC is a solid-state device with a faster reaction time than the fan's mechanical motor. We illustrate the dynamic behavior in Subsection V-C.

B. DVFS and Cooling Power Tradeoff

Fig. 7. Temperature contours as a function of DVFS setting and TEC current. Axes: DVFS (1.6 to 2.8 GHz) and TEC current (0 to 2 A); contour labels from 26 °C to 62 °C.

Using the same setup as in the previous subsection, we plot in Figure 7 the thermal contours as a function of the processor DVFS setting and TEC current (for a fixed fan speed of 1750 rpm). The plot gives the trade-off between processor performance, as given by DVFS, and the TEC current needed to achieve a target temperature. From the figure, we observe that for a fixed frequency, the contours become increasingly spaced as current increases. For instance, at 2.8 GHz, reducing the temperature by 4 °C from 62 °C to 58 °C requires an additional 0.175 A, but reducing it from 42 °C to 38 °C requires an additional 0.3 A. This matches the observation in the previous subsection that increasing TEC current has diminishing returns. Another insight from the contour plot is that for a fixed temperature, as CPU DVFS increases, more and more TEC current is required. The effect is more pronounced at lower temperatures than at higher temperatures. For instance, increasing the frequency from 2.00 GHz to 2.80 GHz requires an additional 0.4 A at the 50 °C contour, but an additional 0.5 A at the 38 °C contour. The reason for this behavior is that lower temperature thresholds require higher heat pumping by the TEC, which necessitates increasing its current. However, the TEC's Joule heating increases as a byproduct, as given in Equation 2, which requires an additional amount of current to maintain the same temperature. Thus, maintaining a fixed temperature requires disproportionately more power from the TEC to sustain performance increases. DVFS, though, delivers a faster response than the TEC because it leads to a fast (10 to 20 µs) reduction in the processor's power.
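The diminishing returns quoted above can be expressed as marginal costs. The following worked arithmetic uses only the figures given in the text and treats each interval as linear, which is a simplification.

```latex
\begin{aligned}
\text{at 2.8 GHz, } 62 \to 58\,^{\circ}\text{C}: &\quad \frac{0.175\ \text{A}}{4\,^{\circ}\text{C}} \approx 0.044\ \text{A}/^{\circ}\text{C} \\
\text{at 2.8 GHz, } 42 \to 38\,^{\circ}\text{C}: &\quad \frac{0.3\ \text{A}}{4\,^{\circ}\text{C}} = 0.075\ \text{A}/^{\circ}\text{C} \\
\text{on the } 50\,^{\circ}\text{C contour}, 2.0 \to 2.8\ \text{GHz}: &\quad \frac{0.4\ \text{A}}{0.8\ \text{GHz}} = 0.5\ \text{A/GHz} \\
\text{on the } 38\,^{\circ}\text{C contour}, 2.0 \to 2.8\ \text{GHz}: &\quad \frac{0.5\ \text{A}}{0.8\ \text{GHz}} = 0.625\ \text{A/GHz}
\end{aligned}
```

Each degree of cooling near 40 °C therefore costs about 1.7 times the current of a degree near 60 °C, and each GHz costs 25% more current on the colder contour.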

Fig. 8. Impact of power budget on performance and its breakdown. (a) Impact of power budget on DVFS: maximum DVFS (GHz) versus total power budget (W) for T = 65 °C and T = 45 °C. (b) Breakdown of power budget: CPU, TEC and fan power (W) versus total power budget (W).

We plot the maximum attained DVFS as a function of the power budget in Figure 8.a for two thermal constraints, 65 °C and 45 °C. We also plot in Figure 8.b the breakdown of the power budget among the CPU, TEC and fan as a function of the power budget for the 45 °C case. The plots show that lower temperature constraints require higher power budgets to deliver more TEC current for cooling. This is why each of the plots starts at a different point. Consider a fixed temperature constraint: at lower frequencies, increasing the power budget slightly gives larger gains in performance than increasing the power budget by the same amount at a higher frequency. Additionally, for lower temperature constraints, the growth rate of frequency as a function of power budget is lower. Compare the $T_{max}$ = 45 °C and $T_{max}$ = 65 °C plots: at the lowest power budget available, the 45 °C plot starts out growing more slowly than the 65 °C plot. The power breakdown also shows that the TEC consumes an expanding portion of the total power, one that changes at a higher rate than the CPU portion as the power budget and performance increase. For instance, at 30 W, the TEC consumes 13% of the total budget, but at 40 W, the TEC consumes 18% of the total power budget.
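In absolute terms, the two percentages quoted above translate as follows. This is simple arithmetic on the stated figures, assuming the budget is fully used at both points.

```latex
\begin{aligned}
P_{tec}(30\ \text{W}) &\approx 0.13 \times 30 = 3.9\ \text{W} \\
P_{tec}(40\ \text{W}) &\approx 0.18 \times 40 = 7.2\ \text{W} \\
\frac{\Delta P_{tec}}{\Delta P_{budget}} &\approx \frac{7.2 - 3.9}{40 - 30} = 0.33
\end{aligned}
```

Roughly a third of the additional 10 W goes to the TEC, well above its 13% share at the 30 W point.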

C. Dynamic Thermal Management

The goal of DTM is to maximize performance under a given thermal constraint $T_{max}$ and power budget $P_{max}$. Based on our analysis in Subsection V-A, our DTM always uses lower bounds of $I_{min} = 0.3$ A for the TEC and $\omega_{min} = 1750$ rpm for the fan. We limit $I$ to $I_{max} = 1.5$ A and $\omega$ to $\omega_{max} = 2600$ rpm. Our DTM is invoked every 1 second, and at every invocation $k$, it measures the maximum sensor temperature across all cores, $T(k)$, and the total (TEC, CPU and fan) power consumption, $P(k)$. Utilizing the insights from our analysis in Subsection V-A and Subsection V-B, the DTM then sets DVFS, the TEC current and the fan speed based on the following cases.

• case #1: if $T(k) < T_{max}$ and $P(k) < P_{max}$ then increase DVFS if possible, else decrease cooling if $I > I_{min}$ or $\omega > \omega_{min}$.

• case #2: if $T(k) \le T_{max}$ and $P(k) > P_{max}$ then decrease cooling if $I > I_{min}$ or $\omega > \omega_{min}$, else decrease DVFS.

• case #3: if $T(k) > T_{max}$ and $P(k) < P_{max}$ then increase cooling if $I < I_{max}$ or $\omega < \omega_{max}$, else decrease DVFS.

• case #4: if $T(k) > T_{max}$ and $P(k) \ge P_{max}$ then decrease DVFS.

Case 1 seeks to boost performance, but if the maximum DVFS setting is reached then it will attempt to decrease cooling to reduce power consumption. In case 2, the DTM reduces the cooling to meet the power budget, which is reasonable given the availability of thermal slack, but if the TEC and fan are at their minima, then the DTM is forced to reduce DVFS. In case 3, the slack in the power budget is used to increase cooling to reduce the thermal violation, but if cooling is at its maximum, then DVFS has to be decreased. The last case is when the thermal and power budget constraints are both violated, which triggers a reduction in DVFS. When we change the cooling level, we always prioritize the TEC over the fan. We use a P-controller for the adjustment of the DVFS, $I$ and $\omega$ settings, though a PI controller is also possible.
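The four cases can be sketched as a single decision step. This is a minimal illustration, not the controller used in our experiments: it replaces the P-controller with fixed, hypothetical step sizes, assumes a list of DVFS levels taken from the axis ticks of the figures, and interprets "prioritize the TEC over the fan" as adjusting the TEC current first in both directions. The current and speed bounds come from the text; all names are illustrative.

```python
from dataclasses import dataclass, replace

I_MIN, I_MAX = 0.3, 1.5        # TEC current bounds from the text (A)
W_MIN, W_MAX = 1750, 2600      # fan speed bounds from the text (rpm)
I_STEP, W_STEP = 0.1, 50       # hypothetical fixed steps (not from the paper)
FREQS_GHZ = (1.6, 1.8, 2.0, 2.2, 2.4, 2.6, 2.8)  # assumed DVFS levels


@dataclass(frozen=True)
class Knobs:
    freq_idx: int    # index into FREQS_GHZ
    tec_a: float     # TEC current (A)
    fan_rpm: int     # fan speed (rpm)


def faster(k):
    """Next DVFS level, or None if already at the highest one."""
    if k.freq_idx + 1 < len(FREQS_GHZ):
        return replace(k, freq_idx=k.freq_idx + 1)
    return None


def slower(k):
    """Previous DVFS level (unchanged at the lowest one)."""
    return replace(k, freq_idx=max(0, k.freq_idx - 1))


def more_cooling(k):
    """One step more cooling, TEC before fan; None if both are at maximum."""
    if k.tec_a < I_MAX:
        return replace(k, tec_a=min(I_MAX, round(k.tec_a + I_STEP, 3)))
    if k.fan_rpm < W_MAX:
        return replace(k, fan_rpm=min(W_MAX, k.fan_rpm + W_STEP))
    return None


def less_cooling(k):
    """One step less cooling, TEC before fan; None if both are at minimum."""
    if k.tec_a > I_MIN:
        return replace(k, tec_a=max(I_MIN, round(k.tec_a - I_STEP, 3)))
    if k.fan_rpm > W_MIN:
        return replace(k, fan_rpm=max(W_MIN, k.fan_rpm - W_STEP))
    return None


def dtm_step(k, temp_c, power_w, t_max, p_max):
    """Return the knob settings for the next period, following cases 1 to 4."""
    if temp_c < t_max and power_w < p_max:       # case 1
        return faster(k) or less_cooling(k) or k
    if temp_c <= t_max and power_w > p_max:      # case 2
        return less_cooling(k) or slower(k)
    if temp_c > t_max and power_w < p_max:       # case 3
        return more_cooling(k) or slower(k)
    if temp_c > t_max and power_w >= p_max:      # case 4
        return slower(k)
    return k  # boundary combinations not listed in the text: hold settings


# One invocation: too hot with power headroom, so cooling is raised (case 3).
k = Knobs(freq_idx=3, tec_a=0.3, fan_rpm=1750)
k = dtm_step(k, temp_c=52.0, power_w=35.0, t_max=50.0, p_max=40.0)
print(FREQS_GHZ[k.freq_idx], k.tec_a, k.fan_rpm)
```

The sketch holds the current settings for the few boundary combinations that the four cases do not list, such as $T(k) = T_{max}$ with $P(k) < P_{max}$; that choice is an assumption.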

Fig. 9. DTM under various maximum temperature and power budgets for povray from SPEC CPU2006. Panels, each versus time (0 to 270 s): temperature (°C), total power (W), TEC current I (A), and DVFS (GHz). Phase labels in the figure: Tmax=55 and Tmax=65; Pmax=40 and Pmax=30.

In the first experiment we illustrate our DTM approach on our real system. We launch four instances of the povray benchmark from the SPEC CPU2006 suite to keep all cores utilized. We show four metrics over time in Figure 9: the maximum sensor temperature, the total power consumption, the TEC current setting, and the DVFS setting. For space considerations, we skip the fan, which operates between 1750 and 2600 rpm. At the beginning we set a thermal threshold of 50 °C and a 40 W power budget; thus, DVFS increases while the TEC and fan are engaged to bring the temperature below the threshold. After 1 minute, we impose a power budget of 30 W. As a result, the TEC current has to be scaled back, which forces DVFS to decrease to avoid violating the thermal constraint. After 2 minutes, we relax the thermal constraint to 65 °C, which enables the TEC current to decrease, creating room for DVFS to increase within the same budget. Finally, after 3 minutes, we increase the power budget to 40 W, which enables further improvements to DVFS with no need to engage the TEC. The traces show that our DTM controller is effectively able to maximize DVFS subject to time-varying thermal constraints and power budgets.

In the second experiment we apply our method to five SPEC CPU2006 benchmarks: astar, bzip2, calculix, gcc and tonto. We select these benchmarks because they display the most interesting variations in power and temperature during execution. We consider a DTM scenario with $T_{max}$ = 45 °C and $P_{max}$ = 45 W. We compare the setup where the TEC is used (Figure 2.b) and the case where the TEC is not inserted (Figure 2.a). For the latter setup, we consider a similar DTM method in which we only use DVFS and the fan. We report in Table I the average DVFS setting throughout execution, the percentage of runtime during which the benchmark had thermal violations, and the average cooling power, measured as the sum of the TEC power and fan power for the TEC setup and as just the fan power for the no-TEC setup. Our results show that using the TEC-based DTM boosts DVFS on average by 19.6% with an additional 9.15 W on average for cooling. Thus, TEC-based cooling alleviates the thermal constraints on computing, which enables DVFS to increase; however, this performance benefit comes at increased cooling power consumption.
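The two headline figures can be recomputed from the rows of Table I. The sketch below averages the per-benchmark values, rounds the averages to two decimals as the table does, and then takes the relative DVFS gain and the cooling power difference; computing the gain from the rounded averages is an assumption about how the 19.6% figure was obtained.

```python
# Rows of Table I: benchmark -> (mean DVFS in GHz, mean cooling power in W).
with_tec = {
    "astar": (2.73, 8.61),
    "bzip2": (2.64, 10.70),
    "gcc": (2.71, 10.25),
    "tonto": (2.58, 11.69),
    "calculix": (2.45, 12.30),
}
no_tec = {
    "astar": (2.47, 1.41),
    "bzip2": (2.22, 1.58),
    "gcc": (2.40, 1.52),
    "tonto": (1.94, 1.65),
    "calculix": (1.92, 1.64),
}


def column_mean(rows, col):
    """Mean of one column, rounded to two decimals as in the table."""
    return round(sum(r[col] for r in rows.values()) / len(rows), 2)


dvfs_tec, cool_tec = column_mean(with_tec, 0), column_mean(with_tec, 1)
dvfs_base, cool_base = column_mean(no_tec, 0), column_mean(no_tec, 1)

dvfs_gain_pct = 100.0 * (dvfs_tec - dvfs_base) / dvfs_base
extra_cooling_w = cool_tec - cool_base
print(f"DVFS gain: {dvfs_gain_pct:.1f}%  extra cooling: {extra_cooling_w:.2f} W")
```

Averaging the per-benchmark gains instead of taking the gain of the averages would give a somewhat higher percentage, because tonto and calculix gain far more than astar and gcc.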

VI. CONCLUSIONS

In this paper we studied the effectiveness of using thermoelectric devices for both energy harvesting and dynamic thermal management in processors. We first analyzed their use as TEGs and concluded that TEGs increase the processor's temperature and leakage power, and that the additional leakage power reduces their benefit as energy harvesters, as demonstrated on our 22 nm multi-core processor. For their role as TECs, we argued that TECs have to be examined alongside DTM methods such as fans and DVFS. Consequently, we elaborated the relationship between these three control methods and proposed a DTM method that simultaneously determines the values for DVFS, TEC current and fan speed to maximize performance subject to thermal constraints and power budgets. We implemented our method using state-of-the-art infrastructure, and we concluded that using thermoelectrics as TECs can provide boosts to performance with additional power consumption. In our experiments with a real 22 nm quad-core processor, we demonstrated a performance improvement of about 19.6% at the cost of additional cooling power compared to standard DVFS+fan DTM techniques.

TABLE I. Comparison of DTM with TEC and without TEC for Tmax = 45 °C and a total power budget of 45 W. Mean DVFS is reported in GHz. Mean cooling power gives the sum of TEC and fan power. Thermal viol. is the percentage of time the benchmark spent above the maximum temperature during execution.

             DTM with TEC                           DTM (no TEC inserted)
benchmark    mean DVFS  thermal     mean cool       mean DVFS  thermal     mean cool
             (GHz)      viol. (%)   pwr (W)         (GHz)      viol. (%)   pwr (W)
astar        2.73       1.10        8.61            2.47       0.00        1.41
bzip2        2.64       0.00        10.70           2.22       0.00        1.58
gcc          2.71       1.10        10.25           2.40       0.00        1.52
tonto        2.58       0.00        11.69           1.94       0.00        1.65
calculix     2.45       0.00        12.30           1.92       0.00        1.64
Average      2.62       0.44        10.71           2.19       0.00        1.56

Acknowledgments: This research is partially supported by NSF grants 0952866 and 1305148.

REFERENCES

[1] B. Alexandrov, K. Z. Ahmed, and S. Mukhopadhyay, “An on-chip autonomous thermoelectric energy management system for energy-efficient active cooling,” in ISLPED, 2014, pp. 51–56.

[2] B. Alexandrov, O. Sullivan, S. Kumar, and S. Mukhopadhyay, “Prospects of active cooling with integrated super-lattice based thin-film thermoelectric devices for mitigating hotspot challenges in microprocessors,” in ASP-DAC, 2012, pp. 633–638.

[3] P. Chaparro, J. Gonzalez, Q. Cai, and G. Chrysler, “Dynamic Thermal Management Using Thin-Film Thermoelectric Cooling,” in ISLPED, 2009, pp. 111–116.

[4] S. H. Choday, K.-W. Kwon, and K. Roy, “Workload dependent evaluation of thin-film thermoelectric devices for on-chip cooling and energy harvesting,” in ICCAD, 2014, pp. 535–541.

[5] I. Chowdhury et al., “On-Chip Cooling by Superlattice-based Thin-Film Thermoelectrics,” Nature Nanotechnology, vol. 4, no. 4, pp. 235–238, 2009.

[6] M. Dousti and M. Pedram, “Platform-Dependent, Leakage-Aware Control of the Driving Current of Embedded Thermoelectric Coolers,” in ISLPED, 2013, pp. 311–316.

[7] M. Dousti and M. Pedram, “Power-Aware Deployment and Control of Forced-Convection and Thermoelectric Coolers,” in DAC, 2014, pp. 1–6.

[8] F. Paterna and S. Reda, “Mitigating Dark Silicon Problems Using Superlattice-based Thermoelectric Coolers,” in DATE, 2013, pp. 1–4.

[9] D. M. Rowe and G. Min, “Evaluation of Thermoelectric Modules for Power Generation,” Journal of Power Sources, vol. 73, no. 2, pp. 193–198, 1998.

[10] E. Samson et al., “Interface Material Selection and a Thermal Management Technique in Second-Generation Platforms Built on Intel Centrino Mobile Technology,” IEEE Technology Journal, vol. 09, no. 1, pp. 75–86, 2005.

[11] G. J. Snyder et al., “Hot spot cooling using embedded thermoelectric coolers,” in Semiconductor Thermal Measurement and Management Symposium, 2006, pp. 135–143.

[12] O. Sullivan, M. Gupta, S. Mukhopadhyay, and K. S., “Array of thermoelectric coolers for on-chip thermal management,” Journal of Electronic Packaging, ASME, vol. 134, no. 021005, pp. 1–8, 2012.

[13] C.-J. Wu, “Architectural Thermal Energy Harvesting Opportunities for Sustainable Computing,” IEEE Computer Architecture Letters, vol. 13, no. 2, pp. 65–68, 2014.
