A 180mV FFT Processor Using Subthreshold Circuit Techniques Alice Wang and Anantha Chandrakasan Massachusetts Institute of Technology


Page 1: A 180mV FFT Processor Using Subthreshold Circuit Techniques

A 180mV FFT Processor Using Subthreshold Circuit Techniques

Alice Wang and Anantha Chandrakasan
Massachusetts Institute of Technology

Page 2: A 180mV FFT Processor Using Subthreshold Circuit Techniques

Extreme Sensor Networking

Emerging Sensor Applications

Enabler: Self-Powered Sensor System

[Block diagram: Sensor & A/D; Sensor-Specific Cores (e.g., FFT); Sensor DSP Processor; RF; Energy Scavenger]

Operating Room of the Future (courtesy of John Guttag)

Machine Monitoring (courtesy of ABB)

Target Tracking & Detection (courtesy of ARL)

System Power < 10 µW for Energy Scavenging

Page 3: A 180mV FFT Processor Using Subthreshold Circuit Techniques

Design Considerations

• For emerging low-performance microsensor applications, computing speed is not critical. Energy dissipation per function must be minimized.

• Traditional low-power design is optimized for the worst-case operating scenario.

• Significant diversity in operating scenarios:
  • Operating modes: threshold detection (low-activity), source detection (medium-activity), localization and classification (high-activity)
  • Event statistics
  • User-specified latency and quality

• The node must be energy aware and able to adapt energy consumption over a variety of operating scenarios.

Page 4: A 180mV FFT Processor Using Subthreshold Circuit Techniques

Energy Aware FFT Architecture

• Energy aware FFT architecture scales gracefully from 128 to 1024 point lengths and supports 8b and 16b precision.

[Figure: FFT architecture block diagram. Twiddle ROMs (addressed by W address) supply $W = e^{-j 2\pi k n / N}$ to the butterfly datapath, which reads A and B from data memory (A address, B address) and computes $X = A + B \cdot W$ and $Y = A - B \cdot W$. Data memory has four banks (Bank #1: parity odd, Bank #2: parity even, Bank #3: parity odd, Bank #4: parity even), selected with MSB = 1 or MSB = 0. Control logic signals: clk, enable, FFT length, bit precision, datain, dataout, dataready.]
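The butterfly relation on this slide can be sketched in a few lines of Python. This is a minimal floating-point illustration of a textbook radix-2 decimation-in-time FFT built from the X = A + B·W, Y = A − B·W butterfly; it is not the chip's fixed-point 8b/16b datapath, and the function names `butterfly` and `fft_radix2` are hypothetical.

```python
import cmath


def butterfly(a: complex, b: complex, w: complex) -> tuple[complex, complex]:
    """One radix-2 butterfly: X = A + B*W, Y = A - B*W."""
    t = b * w
    return a + t, a - t


def fft_radix2(x: list[complex]) -> list[complex]:
    """Textbook in-place decimation-in-time FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    data = list(x)
    bits = n.bit_length() - 1
    # Bit-reverse the input order so the butterflies can run in place.
    for i in range(n):
        j = int(format(i, f"0{bits}b")[::-1], 2) if bits else 0
        if i < j:
            data[i], data[j] = data[j], data[i]
    span = 1
    while span < n:
        for start in range(0, n, 2 * span):
            for k in range(span):
                # Twiddle factor W = exp(-j*2*pi*k / (2*span)) for this stage.
                w = cmath.exp(-2j * cmath.pi * k / (2 * span))
                a, b = data[start + k], data[start + k + span]
                data[start + k], data[start + k + span] = butterfly(a, b, w)
        span *= 2
    return data
```

In hardware the twiddle factors would come from a ROM rather than being recomputed with `cmath.exp`; the sketch recomputes them only to stay self-contained.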

Page 5: A 180mV FFT Processor Using Subthreshold Circuit Techniques

Bit-scalable Baugh-Wooley Multiplier

[Figure: bit-scalable Baugh-Wooley multiplier with inputs X{15:0} and Y{15:0} and output Z{31:0}. One adder block is used only in 16-bit mode; another is used in both 8-bit and 16-bit modes. Labels also show X{7:0}, Y{7:0} and zero-valued inputs.]

• Fine-grained gating reduces activity factor and achieves energy savings with minimal area overhead.

• Bit-precision scaling architectures are used in the butterfly datapath, data memory and Twiddle ROMs.
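As a rough illustration of the multiplier named on this slide, the following Python sketch models the standard Baugh-Wooley scheme for signed two's-complement multiplication at a selectable width `n`. It shows only the arithmetic (inverted sign-bit partial products plus two correction constants); the fine-grained gating and the shared 8-bit/16-bit adder structure of the actual design are not modelled, and the function name `baugh_wooley` is hypothetical.

```python
def baugh_wooley(a: int, b: int, n: int = 16) -> int:
    """Signed n-bit x n-bit multiply using Baugh-Wooley partial products.

    Returns the product interpreted as a signed 2n-bit value.
    """
    mask = (1 << n) - 1
    a &= mask  # two's-complement encoding of a in n bits
    b &= mask
    a_bits = [(a >> i) & 1 for i in range(n)]
    b_bits = [(b >> j) & 1 for j in range(n)]

    acc = 0
    for i in range(n):
        for j in range(n):
            pp = a_bits[i] & b_bits[j]
            # Partial products involving exactly one sign bit are inverted.
            if (i == n - 1) != (j == n - 1):
                pp ^= 1
            acc += pp << (i + j)

    # Correction constants: +2^n and +2^(2n-1), everything modulo 2^(2n).
    acc += (1 << n) + (1 << (2 * n - 1))
    acc &= (1 << (2 * n)) - 1

    # Reinterpret the 2n-bit result as a signed integer.
    if acc >> (2 * n - 1):
        acc -= 1 << (2 * n)
    return acc
```

Calling `baugh_wooley(x, y, 8)` and `baugh_wooley(x, y, 16)` mimics the two precisions the slide mentions, although here the width is a plain function argument rather than a gated portion of one array.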

Page 6: A 180mV FFT Processor Using Subthreshold Circuit Techniques

Variable FFT Length

• Dedicated memory structure contains an MSB and parity-bit crossbar to avoid read/write hazards.

• The energy aware control logic scales the number of butterflies with FFT length.

[Figure: memory read path. A address and B address select among four 128x32b banks (two parity-even, two parity-odd) using the MSB (MSB = 0 or MSB = 1) and the parity bit to produce operands A and B.]
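One way to see why a parity bit helps separate the two butterfly operands is the sketch below. It assumes a textbook in-place radix-2 complex FFT, in which the A and B addresses of every butterfly differ in exactly one bit and therefore always have opposite parity. The chip's actual address mapping and butterfly schedule are not given on the slide, so the helper names and the butterfly counts here are illustrative only.

```python
def parity(addr: int) -> int:
    """Return 1 if addr has an odd number of set bits, else 0."""
    return bin(addr).count("1") & 1


def butterfly_pairs(n: int):
    """Yield (stage, a_addr, b_addr) for a textbook in-place radix-2 FFT."""
    if n < 2 or n & (n - 1):
        raise ValueError("n must be a power of two >= 2")
    for stage in range(n.bit_length() - 1):
        span = 1 << stage
        for a_addr in range(n):
            if not a_addr & span:
                # The partner address differs from a_addr in exactly one bit.
                yield stage, a_addr, a_addr | span


def butterfly_count(n: int) -> int:
    """Number of butterflies, (n / 2) * log2(n), counted by enumeration."""
    return sum(1 for _ in butterfly_pairs(n))


if __name__ == "__main__":
    for n in (128, 256, 512, 1024):
        conflict_free = all(parity(a) != parity(b) for _, a, b in butterfly_pairs(n))
        print(n, butterfly_count(n), conflict_free)
```

Because the two operands never share a parity, a parity-odd bank and a parity-even bank can be read in the same cycle without contention, and the butterfly count shrinks with the FFT length.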

Page 7: A 180mV FFT Processor Using Subthreshold Circuit Techniques

FFT Energy/Performance Contours

$$E_{\text{Switching}} = a \cdot C_L \cdot V_{DD}^2$$

$$E_{\text{Leakage}} = I_S \cdot 10^{-V_{th}/S} \cdot V_{DD} \cdot T$$

• The optimal VDD for the 1024-point, 16b FFT is estimated from switching and leakage models for a 0.18µm process.

[Figure: energy/performance contours over threshold voltage (Vth) and supply voltage (VDD), with the optimal (VDD, Vth) point marked.]

Exploit Subthreshold Operation for Sensor Circuits
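The two energy terms can be swept numerically to locate a minimum-energy supply voltage. The sketch below is a toy model under stated assumptions, not the authors' estimation flow: the operation time T is assumed to scale as k·C_L·VDD/I_on with a subthreshold drive current I_on = I_S·10^((VDD − Vth)/S), and every parameter value (a, k, S, Vth, unit C_L and I_S) is hypothetical, picked so that the minimum lands near the 400 mV region rather than fitted to the 0.18µm process. The sweep stops at Vth because the assumed drive-current model only holds in subthreshold.

```python
def switching_energy(vdd: float, a: float = 0.1, c_l: float = 1.0) -> float:
    """E_Switching = a * C_L * VDD^2 (arbitrary units)."""
    return a * c_l * vdd ** 2


def leakage_energy(vdd: float, vth: float = 0.45, s: float = 0.09,
                   i_s: float = 1.0, k: float = 700.0, c_l: float = 1.0) -> float:
    """E_Leakage = I_S * 10^(-Vth/S) * VDD * T (arbitrary units)."""
    i_leak = i_s * 10 ** (-vth / s)
    # ASSUMPTION (not on the slide): subthreshold drive current and an
    # operation time T proportional to C_L * VDD / I_on, scaled by k.
    i_on = i_s * 10 ** ((vdd - vth) / s)
    t = k * c_l * vdd / i_on
    return i_leak * vdd * t


def total_energy(vdd: float) -> float:
    return switching_energy(vdd) + leakage_energy(vdd)


def optimal_vdd(v_min: float = 0.10, v_max: float = 0.45, step: float = 0.001) -> float:
    """Brute-force sweep of VDD; returns the supply with the lowest total energy."""
    steps = int(round((v_max - v_min) / step))
    candidates = [v_min + i * step for i in range(steps + 1)]
    return min(candidates, key=total_energy)


if __name__ == "__main__":
    v = optimal_vdd()
    print(f"minimum-energy VDD in this toy model: {v * 1000:.0f} mV")
```

Lowering VDD cuts switching energy quadratically but stretches T exponentially, so leakage energy eventually dominates; the minimum sits where the two trends balance.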

Page 8: A 180mV FFT Processor Using Subthreshold Circuit Techniques

Optimum Power Supply

[Figure: estimated energy (nJ, 0 to 1400) versus VDD (0.2 to 1 V). The estimated minimum energy point is at VDD = 400mV for Vth = 450mV.]

• There is a trade-off between leakage and switching energy as frequency, VDD and activity factor are varied.

• The FFT design focuses on achieving supply voltages well below 400mV to investigate the minimum energy point.

Page 9: A 180mV FFT Processor Using Subthreshold Circuit Techniques

Min-Max Sizing Curve

[Figure: min-max sizing curves of Wp (µm, 0 to 60) versus VDD (mV) for an inverter with a minimum-sized Wn. Panels: typical transistor (VDD from 50 to 200 mV, curves Wp(max) and Wp(min)) and process corners (VDD from 100 to 400 mV, Wp(max) at the SF corner and Wp(min) at the FS corner). Inverter insets for Wp(max) and Wp(min), with inputs 0 and 1, mark the drive current and leakage current.]

• The minimum supply voltage is limited by the effect of process variations.

• Inverter sizing analysis and minimum supply voltage analysis are performed at the corners.

Page 10: A 180mV FFT Processor Using Subthreshold Circuit Techniques

Tiny XOR at 100mV

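[Figure: tiny XOR schematic (inputs A and B, output Z) with idle and drive currents marked for A=1, B=0, Z=1, and the simulated voltage level at Z (mV, 0 to 100) over 0 to 4m for the input combinations A=1 B=0, A=0 B=1, A=0 B=0 and A=1 B=1.]

• Leakage through the parallel devices causes the tiny XOR to fail at 100mV.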

Page 11: A 180mV FFT Processor Using Subthreshold Circuit Techniques

Transmission Gate XOR

[Figure: transmission gate XOR schematic (inputs A and B, output Z) with idle, drive and weak drive currents marked for A=1, B=0, Z=1, and the simulated voltage level at Z (mV, 0 to 100) over 0 to 4m for the input combinations A=1 B=0, A=0 B=1, A=0 B=0 and A=1 B=1.]

• Balanced number of devices reduces the effects of leakage and process variations.

Page 12: A 180mV FFT Processor Using Subthreshold Circuit Techniques

Sneak Leakage and Stacked Devices

[Figure: conventional adder sum circuit with inputs A, B and Cin, internal node P and output Sum, shown for A=0, B=0, Cin=0. Annotations mark idle and drive currents, parallel leakage, a sneak leakage path and stacked devices.]

• Traditional circuits suffer from effects such as parallel leakage, stacked devices, and sneak leakage paths.

Page 13: A 180mV FFT Processor Using Subthreshold Circuit Techniques

Subthreshold Library Methodology

[Figure: subthreshold versions of the adder sum circuit (inputs A, B and Cin, internal node P, output Sum), annotated with the three fixes: add buffers, reduce parallel devices, drive device gates.]

• Buffering, reducing parallel devices, and driving device gates are methods used in subthreshold standard cell logic design.

Page 14: A 180mV FFT Processor Using Subthreshold Circuit Techniques

Sizing Tradeoffs - SRAM cell

[Figure: plot of WN1/WP1 (0 to 50) versus VDD (100 to 500 mV) with curves "max N1" (FS corner) and "min N1" (SF corner). Two SRAM cell schematics (P1, P2, N1, N2, access devices N3 and N4, wordline WL, bitline BL and its complement, storage nodes HI and LO) are annotated with the bias conditions WL=1, BL=0, complementary BL=1, HI=1 (∆VHI), LO=0 (∆VLO) and WL=1, both bitlines=1, HI=1, LO=0 (∆VLO).]

Write condition trade-off:

• WN3/WP1 large: write '0' into HI at the SF corner

• WN2/WP2 small: write '1' into LO at the FS corner

Increasing WN2 prevents the memory cell from being rewritten during a read access.

Page 15: A 180mV FFT Processor Using Subthreshold Circuit Techniques

Tristate Write Access

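[Figure: tristate latch-based write circuit with write bitline WBL, storage node M and write wordline WWL with its complement, plus a plot of Wp (µm, 0 to 60) versus VDD (100 to 500 mV) showing Wp(max) at the SF corner and Wp(min) at the FS corner.]

• Tristate latch-based write access achieves low voltage operation at process corners.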

Page 16: A 180mV FFT Processor Using Subthreshold Circuit Techniques

Read Bitline at 100mV

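Precharge Read

[Figure: precharge read circuit (precharge device Wpre clocked by ϕpre; cells M0, M1, ... gated by read wordlines RWL0, RWL1, ... onto the read bitline RBL) with simulated RBL voltage (0 to 100 mV, 1m to 1.5m) for the output-low and output-high cases.]

Data dependent leakage:

• Worst case output-high: M0=0, M1-M127=1

• Worst case output-low: M0=1, M1-M127=0

Tristate Read

[Figure: tristate read circuit (cells M0, M1, M2 enabled by RWL0, RWL1, RWL2 onto RBL) with simulated RWL0 and RBL waveforms (0 to 100 mV, 0 to 6m), including the RBL output-low case.]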
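A back-of-the-envelope check suggests why a shared 128-cell bitline struggles at 100mV. The sketch below assumes an ideal subthreshold device whose on/off current ratio is 10^(VDD/S), with a hypothetical swing S of 90 mV/decade, and ignores DIBL, stacking and process variation. It compares the current of the one selected cell with the summed leakage of the other cells holding the opposite value; the function names are illustrative.

```python
def on_off_ratio(vdd_mv: float, swing_mv_per_decade: float = 90.0) -> float:
    """Ideal subthreshold I_on / I_off for one device (assumed model)."""
    return 10 ** (vdd_mv / swing_mv_per_decade)


def bitline_margin(vdd_mv: float, cells: int = 128,
                   swing_mv_per_decade: float = 90.0) -> float:
    """Drive current of the selected cell divided by the total leakage of
    the (cells - 1) unselected cells that hold the opposite data value."""
    return on_off_ratio(vdd_mv, swing_mv_per_decade) / (cells - 1)


if __name__ == "__main__":
    # Worst case from the slide: one cell drives, 127 cells leak the other way.
    print(f"128 cells at 100 mV: margin = {bitline_margin(100):.2f}")
    # The same device driving against a single leaking neighbour.
    print(f"  2 cells at 100 mV: margin = {bitline_margin(100, cells=2):.2f}")
```

A margin below 1 means the combined leakage exceeds the drive current, so the bitline level is set by the unselected cells; with only one leaking neighbour the same device keeps a margin above 10 in this model.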

Page 17: A 180mV FFT Processor Using Subthreshold Circuit Techniques

Hierarchical-Read Bitline

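[Figure: hierarchical-read bitline. Cells M0 to M127 are paired (M0/M1, M2/M3, ..., M126/M127) into 2:1 muxes selected by A0, and later mux levels selected by A1, A2, ..., A6 drive RBL. Inset: mux schematic with inputs M0 and M1, select A0 and output Z. Simulated A0 and RBL waveforms (0 to 100 mV, 0 to 6m) for M0=0, M1-M127=1, A1-A6=0.]

• The hierarchical-read bitline eliminates parallel leakage and stacked devices.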
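Functionally, the hierarchical read behaves like a binary tree of 2:1 muxes, with each address bit steering one level of the tree. The following Python sketch is a behavioural model only (no leakage or timing), and the function name `hierarchical_read` is hypothetical.

```python
def hierarchical_read(cells: list[int], address: int) -> int:
    """Read one bit through a binary tree of 2:1 muxes.

    Level 0 is steered by address bit A0, level 1 by A1, and so on,
    so every mux only ever chooses between two inputs.
    """
    if not cells or len(cells) & (len(cells) - 1):
        raise ValueError("number of cells must be a power of two")
    level = list(cells)
    bit = 0
    while len(level) > 1:
        sel = (address >> bit) & 1
        # Each 2:1 mux forwards its left (sel=0) or right (sel=1) input.
        level = [level[i + sel] for i in range(0, len(level), 2)]
        bit += 1
    return level[0]


if __name__ == "__main__":
    # Worst-case pattern from the slide: M0 = 0, M1..M127 = 1.
    memory = [0] + [1] * 127
    print(hierarchical_read(memory, 0), hierarchical_read(memory, 1))
```

With 128 cells the tree has seven levels (A0 to A6) and 127 muxes in total, and no node is ever driven by more than two sources, in contrast to a shared bitline loaded by all 128 cells.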

Page 18: A 180mV FFT Processor Using Subthreshold Circuit Techniques

Latch-Write and Hierarchical-Read Memory

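[Figure: latch-write and hierarchical-read memory. Latches (latch0 to latch5 shown) are written through write wordline pairs WWL0, WWL1, ..., WWL127 and read through muxes addressed by A0 to A6 onto RBL.]

• Muxes are daisy-chained for compact layout area.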

Page 19: A 180mV FFT Processor Using Subthreshold Circuit Techniques

Custom Subthreshold FFT

Process Details

• 0.18µm CMOS process

• 6 layer metal

• 628k transistors

Design Flow

• Custom subthreshold logic cells

• Custom Skill-based memory generators and multipliers

• Skill code place-and-route

[Figure: die photo, 2.6 mm x 2.1 mm, showing the data memory, twiddle ROMs, butterfly datapath and control logic.]

Page 20: A 180mV FFT Processor Using Subthreshold Circuit Techniques

180 mV Supply Demonstration

[Figure: oscilloscope capture showing DataReady, DataOutput[1-0] and the output clock.]

• The FFT processor achieves 180mV operation for 16-bit, 1024-point operation. The clock frequency is 164 Hz.

Page 21: A 180mV FFT Processor Using Subthreshold Circuit Techniques

Energy-Scalability Measurements

[Figure: energy (nJ, log scale from 10^1 to 10^3) versus VDD (200 to 900 mV) for 128-, 256-, 512- and 1024-point FFTs, in two panels: 8-bit processing and 16-bit processing.]

• The FFT is able to operate at 128, 256, 512 and 1024-point FFT lengths and 8b and 16b precisions.

• 8b processing leads to operation at a larger minimum VDD due to reduced activity factor.

Page 22: A 180mV FFT Processor Using Subthreshold Circuit Techniques

Energy Estimation

[Figure: measured and estimated energy (nJ, 0 to 1000) versus VDD (200 to 900 mV) for the 1024-point, 16 bit FFT, and clock frequency (100 Hz to 10 MHz, log scale) versus VDD (200 to 900 mV).]

• The FFT operates between VDD=180mV-900mV and clock frequency of 164Hz-6MHz.

• The minimum energy dissipated is 155nJ/FFT at 350 mV for a 1024-point 16b FFT. The clock frequency is 10kHz and the FFT processor dissipates 0.6µW.
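As a rough consistency check on the quoted minimum-energy figures, energy per FFT divided by average power gives the time per FFT, and multiplying by the clock frequency gives an approximate cycle count. The values below use the rounded numbers from the slide, so the results are order-of-magnitude estimates rather than reported measurements.

```latex
\begin{align*}
T_{\text{FFT}} &= \frac{E_{\text{FFT}}}{P}
  = \frac{155\,\text{nJ}}{0.6\,\mu\text{W}} \approx 0.26\,\text{s} \\
N_{\text{cycles}} &= T_{\text{FFT}} \cdot f_{\text{clk}}
  \approx 0.26\,\text{s} \times 10\,\text{kHz} \approx 2.6 \times 10^{3}
\end{align*}
```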

Page 23: A 180mV FFT Processor Using Subthreshold Circuit Techniques

Conclusions

• Subthreshold operation at the optimal supply voltage and clock frequency is necessary to minimize energy dissipation of digital circuits in wireless sensor applications.

• Process variations limit the minimum supply voltage operation of CMOS circuits.

• Subthreshold logic and memory design methodology minimizes parallel leakage, stacked devices and sneak leakage effects.

• Demonstrated a 180mV FFT Processor using subthreshold circuit techniques.