Closed-Loop Modeling of Power and Temperature Profiles of FPGAs

Kanupriya Gulati, Sunil P. Khatri, Peng Li

Department of ECE, Texas A&M University, College Station

Introduction

• Due to the increasing density of FPGAs
  – Power is now a zeroth-order design constraint
• During operation, the two components of power consumption are
  – Dynamic power
    • Temperature independent
  – Static power
    • Gate leakage
      – Largely temperature independent
    • Sub-threshold leakage
      – Exponential dependence on junction temperature
• The resulting positive feedback loop (higher temperature increases leakage power, which raises temperature further) could cause
  – Non-convergence (thermal runaway)
  – Convergence above a safe junction temperature (thermal breakdown)

Feedback loop (figure): increase in dynamic power → increase in temperature → increase in leakage power → further increase in temperature

Our Approach

• Our approach is design and FPGA device specific
• Partition the placed and routed FPGA design into n² grid regions
• For each grid region, at the given temperature
  – Compute total power (dynamic and leakage power)
    • Dynamic power computed based on logic in the region
    • Leakage power computed using fast and accurate macromodels
• From the power of the n² grid regions, compute the new thermal profile
  – Compute the increase in temperature for each grid region
  – If the change in temperature in all grid regions is less than ε, stop and declare convergence
  – If there is no convergence and the new temperature in any grid region exceeds a threshold value, declare thermal breakdown
  – Else recompute the leakage power of each grid region using the new temperature value and iterate
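The iteration described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `toy_leak` and `toy_thermal` are made-up stand-ins for the leakage macromodels and the thermal model, and their constants are arbitrary. The default values of `eps`, `t_max` and `t_init` follow the numbers quoted later in the experimental setup (0.001°C, 110°C, 27°C); `max_iter` is an added safety cap that is not part of the slides.

```python
import math

def closed_loop(p_dyn, leak_at, thermal, t_init=27.0, eps=1e-3,
                t_max=110.0, max_iter=1000):
    """Iterate power -> temperature -> leakage until convergence or breakdown.

    p_dyn:   per-region dynamic power in W (temperature independent)
    leak_at: function (region_index, temperature) -> leakage power in W
    thermal: function (list of per-region total power) -> list of temperatures
    """
    temp = [t_init] * len(p_dyn)
    for iteration in range(1, max_iter + 1):
        # Total power per region at the current temperature estimate.
        p_tot = [p_dyn[r] + leak_at(r, temp[r]) for r in range(len(p_dyn))]
        temp_new = thermal(p_tot)
        delta = max(abs(a - b) for a, b in zip(temp_new, temp))
        if delta < eps:
            return "converged", temp_new, iteration
        if max(temp_new) > t_max:
            return "thermal breakdown", temp_new, iteration
        temp = temp_new  # recompute leakage at the new temperatures
    return "no convergence", temp, max_iter

# Hypothetical stand-in models (NOT the macromodels or thermal model of the talk).
def toy_leak(region, temp):
    return 0.05 * math.exp(0.02 * (temp - 27.0))

def toy_thermal(p_tot, ambient=27.0, r_self=10.0, r_mutual=1.0):
    total = sum(p_tot)
    return [ambient + r_self * p + r_mutual * (total - p) for p in p_tot]

status, temps, iters = closed_loop([0.5, 1.0, 0.5, 0.2], toy_leak, toy_thermal)
print(status, iters, [round(t, 2) for t in temps])
```

Passing the models in as functions keeps the loop itself independent of how leakage and temperature are actually computed.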

Our Approach – Flowchart

Our Approach – Dynamic Power

• Compute using the XPower tool from Xilinx
  – XPower reads the design data file and computes the activity estimate α
  – After synthesis, placement and routing of the design, we compute the maximum operating frequency fckt
  – XPower has the node and wire capacitance values, so Pdyn = C · Vdd² · fckt · α
  – Find the contribution of grid region (i, j) to Pdyn
    • For each LUT in grid region (i, j), we compute
      – Probability of the output being logic 1: P1 = (Σ Vk)/16
        » where Vk is the logic value stored in the kth SRAM cell of the LUT
      – Probability of the output switching: Psw = 2 · P1 · (1 - P1)
    • Average probability of switching in the grid region: P(i, j) = (Σ Psw)/q
      – where q is the number of LUTs in the grid region
    • Pdyn(i, j) = Pdyn · P(i, j) / Σ P(i, j), with the sum in the denominator taken over all grid regions
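As a rough illustration of the per-region split, the sketch below computes P1 and Psw for 4-input LUTs and distributes a given total Pdyn across regions in proportion to P(i, j). The function names, the dictionary layout and the two example LUT configurations are assumptions made for this example; the total Pdyn would come from XPower and is simply set to 1.0 W here.

```python
def lut_switch_prob(sram_bits):
    """Psw = 2 * P1 * (1 - P1) for one 4-input LUT, given its 16 SRAM bits."""
    if len(sram_bits) != 16:
        raise ValueError("a 4-input LUT stores exactly 16 bits")
    p1 = sum(sram_bits) / 16  # probability that the output is logic 1
    return 2 * p1 * (1 - p1)

def region_dynamic_power(p_dyn_total, luts_by_region):
    """Split the total dynamic power across regions in proportion to P(i, j).

    luts_by_region maps (i, j) -> list of 16-bit LUT configurations.
    """
    avg = {}
    for region, luts in luts_by_region.items():
        # Average switching probability over the q LUTs in this region.
        avg[region] = (sum(lut_switch_prob(b) for b in luts) / len(luts)
                       if luts else 0.0)
    norm = sum(avg.values())
    if norm == 0.0:
        return {region: 0.0 for region in avg}
    return {region: p_dyn_total * p / norm for region, p in avg.items()}

# Illustrative LUT contents: a 4-input AND and a 4-input XOR.
AND4 = [0] * 15 + [1]
XOR4 = [bin(k).count("1") % 2 for k in range(16)]

shares = region_dynamic_power(1.0, {(0, 0): [AND4, XOR4], (0, 1): [XOR4]})
print(shares)
```

Because the shares are normalized by the sum of P(i, j) over all regions, they always add up to the total Pdyn.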

Our Approach – Static Power

Figures: NMOS pass-gate gate leakage states; NMOS pass-gate sub-threshold leakage states (including L2′ leakage); LUT implementation using a 16:1 MUX

Our Approach – Static Power

• Pre-compute leakage using SPICE for
  – LUT
    • SRAM configuration data is known
    • Each of the 31 pass gates in the LUT is in one of
      – 4 states (L1, L2, L3 or L2′) contributing to sub-threshold leakage, or
      – 4 states (K1, K2, K3 or K4) contributing to gate leakage
      – Remaining states have negligible leakage contribution
    • But we do not know the f1, f2, f3 and f4 inputs to the LUT
      – Take the average over the 16 possible input combinations
  – SRAM cell in LUT (stored 1 and 0)
  – D flip-flop (output 1 and 0)
  – MUX

Figure: logic block in the FPGA

Our Approach – Total Power

• Generate temperature-dependent leakage macromodels for
  – LUT (L states), D flip-flop, SRAM and MUX
    • Pre-compute the leakage values at 3 different temperatures and fit an exponential curve
    • Gate leakage (for K states) is largely temperature independent
  – Leakage is quickly and accurately estimated for the logic block at any temperature
    • Maximum 3% error when compared to explicit SPICE runs
    • 4 orders of magnitude faster
• Compute leakage for grid region (i, j) at any temperature, Plkg(i, j, T)
  – Take the sum of the leakages of all LUTs, D flip-flops, SRAMs and MUXes in region (i, j) at temperature T = temp(i, j)
• Total power Ptot(i, j, T) = Pdyn(i, j) + Plkg(i, j, T)
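A minimal sketch of such a macromodel follows, assuming the exponential has the form P(T) = a · exp(b · T) and is fitted by least squares on ln P. The slides do not state the functional form or the fitting method, so both are assumptions, and the three sample points and component counts below are invented purely for illustration.

```python
import math

def fit_exponential(temps, leakages):
    """Least-squares fit of ln(P) = ln(a) + b * T; returns (a, b)."""
    n = len(temps)
    logs = [math.log(p) for p in leakages]
    mean_t = sum(temps) / n
    mean_l = sum(logs) / n
    b = (sum((t - mean_t) * (l - mean_l) for t, l in zip(temps, logs))
         / sum((t - mean_t) ** 2 for t in temps))
    a = math.exp(mean_l - b * mean_t)
    return a, b

def region_leakage(counts, models, gate_leak, temp):
    """Plkg(i, j, T): sum of component macromodels plus constant gate leakage."""
    return gate_leak + sum(
        counts[name] * models[name][0] * math.exp(models[name][1] * temp)
        for name in counts
    )

# Made-up leakage samples (W) at three temperatures (deg C); not SPICE data.
samples = {
    "LUT":  ([25.0, 60.0, 100.0], [1.0e-7, 2.9e-7, 9.5e-7]),
    "SRAM": ([25.0, 60.0, 100.0], [2.0e-8, 5.5e-8, 1.8e-7]),
}
models = {name: fit_exponential(t, p) for name, (t, p) in samples.items()}

counts = {"LUT": 400, "SRAM": 6400}  # hypothetical contents of one region
p_lkg = region_leakage(counts, models, gate_leak=1.0e-5, temp=45.0)
p_tot = 0.02 + p_lkg                 # Ptot = Pdyn + Plkg, with Pdyn = 20 mW assumed
print(p_lkg, p_tot)
```

Once the coefficients are fitted, evaluating the leakage at a new temperature costs one exponential per component type, which is where the speedup over repeated circuit simulation comes from.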

Our Approach – Temperature Computation

• We use the following approach
  – “Critical path analysis considering temperature, power supply variations and temperature induced leakage”, P. Li, ISQED 2006
  – Assume a 1 W power consumption in grid region (i, j)
    • Table Zij(k, l) indicates the resulting temperature at grid region (k, l)
  – We precompute n² such Zij tables, each with n² entries
  – We know the total power consumption of each grid region
    • Thus, we find the new temperature, temp_new(i, j), at the (i, j)th grid region by superposition
• Details of the thermal model
  – Circuit discretized into n² grid regions
  – 15 layers of metal/dielectric are modeled
    • Assuming a metallization percentage for each layer, the thermal conductivity of each layer is computed
  – Model includes heat dissipation due to heat sinks
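The superposition step can be sketched as below. It assumes each table entry Zij(k, l) is interpreted as the temperature rise above ambient at (k, l) per watt dissipated in (i, j), so that contributions add linearly; the slides only say "resulting temperature", so this reading is an assumption. The toy Z tables, which decay with grid distance, are invented for the example and are not the output of the cited thermal model.

```python
def superpose(z, p_tot, t_ambient):
    """Return temp_new[k][l] = t_ambient + sum over (i, j) of Zij(k, l) * Ptot(i, j).

    z[(i, j)][k][l]: temperature rise at (k, l) per 1 W dissipated in (i, j).
    p_tot[i][j]:     total power of grid region (i, j) in W.
    """
    n = len(p_tot)
    temp_new = [[t_ambient] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            table = z[(i, j)]
            for k in range(n):
                for l in range(n):
                    temp_new[k][l] += table[k][l] * p_tot[i][j]
    return temp_new

def toy_z_tables(n, peak=5.0):
    """Hypothetical tables: rise falls off with Manhattan distance from the source."""
    return {
        (i, j): [[peak / (1 + abs(i - k) + abs(j - l)) for l in range(n)]
                 for k in range(n)]
        for i in range(n) for j in range(n)
    }

z = toy_z_tables(2)
temps = superpose(z, [[1.0, 0.0], [0.0, 2.0]], t_ambient=27.0)
print(temps)
```

With n² tables of n² entries each, one temperature update costs on the order of n⁴ multiply-adds, with no thermal solve inside the loop.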

Endgame and Experimental Setup

• Endgame
  – Find the absolute difference between temp(i, j) and temp_new(i, j)
  – Declare convergence when the maximum difference over all grid points is < 0.001°C
  – If temp_new(i, j) > 110°C and there is no convergence, we declare thermal breakdown
• Setup
  – Applied our methodology to 10 designs, implemented on a Xilinx Virtex-4 XC4VLX200 FPGA device
  – Synthesized, placed and routed using Xilinx ISE 8.1i
  – Initial temperature set at 27°C
  – n = 16
  – To the best of our knowledge, no other existing work reports final converged temperature and power numbers for FPGA designs after closing the dependence loop between leakage and temperature
  – We therefore compared our final temperatures against a full-chip 3D thermal modeling and simulation tool
    • Maximum (average) error in temperature was 2.52% (1.05%) for the DMA benchmark
    • Our approach is faster by ~40× per iteration

Results

Temperature Profile for Circuit DMA

Circuits operating at 450 MHz

Conclusions

• Developed a technique to simultaneously model (in an FPGA)
  – Power consumption
  – Temperature
• Used fast and accurate macromodels for leakage estimation
  – Over all circuit components of a logic block, at all temperatures
    • Less than 3% error compared to SPICE
    • Up to 4 orders of magnitude speedup
• Approach
  – Partition the FPGA design (placed and routed) into 16×16 grid regions
  – Compute total power consumption (dynamic and leakage) for each region
  – Find the thermal profile of the IC under this power consumption
    • Using pre-computed power-to-temperature tables
  – New thermal information is used to update the leakage power consumption
  – Steps iterated until the temperature converges (for all grid regions), or exceeds a safe value (for any grid region)
• Final temperature obtained from our method
  – Compared to a full-chip 3D temperature estimation tool
  – Shows max. (avg.) error of 2.52% (1.05%) for the DMA benchmark