A 16-Bit Low-Power Microcontroller with Monolithic MEMS-LC Clocking

Eric D. Marsman¹, Robert M. Senger¹, Michael S. McCorquodale², Matthew R. Guthaus¹, Rajiv A. Ravindran¹, Ganesh S. Dasika¹, Scott A. Mahlke¹, Richard B. Brown³

¹University of Michigan, ²Mobius Microsystems, ³University of Utah

NSF ERC for Wireless Integrated MicroSystems (WIMS)

IEEE International Symposium on Circuits and Systems
May 23rd – May 26th, 2005, Kobe, Japan




Overview

• Motivation
• Microsystem Architecture
  – Microcontroller
  – Clock Generation
  – Dynamic Frequency Scaling (DFS)
• Microsystem Measured Results
  – Microcontroller
  – Compiler Utilization
  – Instruction Level Power Modeling
  – Clock Generation
  – DFS
• Future Directions
• Conclusion


Motivation

Wireless Integrated Microsystems (WIMS)

• Environmental Sensors
  – Gas Chromatograph
  – Heavy Metals
• Biomedical Implants
  – Cochlear Implant
  – Deep Brain Implants


Motivation (cont)

• Power minimization
  – Frequency scaling
  – Voltage scaling
  – Memory architecture
  – Process technology
  – Leakage current mitigation

Commercially available cores

Core               Process   Frequency   No. Bits   Core Power
ARM7TDMI           0.18µm    88MHz       32         22mW
Tensilica Xtensa   0.18µm    200MHz      32         80mW
MIPS32 M4K         0.13µm    300MHz      32         84mW
Infineon C166S     0.18µm    80MHz       16         160mW


Microsystem Architecture

• 16-bit, 3-stage pipeline
• Software-controlled register interface to clock generator
• Peripheral communication interfaces for flexibility

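[Figure: microsystem block diagram. Blocks: Register Files; Fetch, Decode and Execute pipeline stages; Memory Management Unit; Boot ROM; 64KB SRAM; Loop Cache; USART; SPI; Timer; Test Interface; CMOS-MEMS Clock Generator; 64KB External Memory. The figure also carries X3 and X2 multiplicity labels.]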


Microcontroller Architecture

• Primarily a Load-Store architecture
• 77 instructions, 8 addressing modes
• Data and address registers split into two windows
• Hardware support for one level of interrupts and subroutines
• Banked memory architecture with additional external memory interface
  – Energy/area tradeoffs compared to single 64kB bank
• Low-power loop cache for commonly executed instructions

[Figure: normalized area and power (0 to 1) versus RAM structure ('banks' x 'size in kB') for 1x64, 2x32, 4x16, 8x8, 16x4 and 32x2. Annotation: 15.9% more area, 69.2% less power.]


Monolithic Clock Generation

• Complementary, cross-coupled, negative-transconductance tank
• Frequency trimming via modulation of tail current with vtrim
• CMOS compatible
• 1.056GHz oscillation frequency
• Buffer amplifier removes amplitude variation

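[Figure: clock generator schematic. Labels: L and C tank elements, R, tail-current trim input vtrim, a 16fo node, and a chain of D flip-flops (DFF) ending at a 2fo output.]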


Dynamic Frequency Scaling

• Fully synthesized logic, no custom design
• Synchronization chain ensures glitch-free output
• Optional external clock input

[Figure: DFS block diagram. A Clock Divider built from D flip-flops and driven by 2f0 produces f0, f1, ..., f15. Clock Sel chooses one of these as fclk. A Clock Synchronizer (FF0, FF1, ..., FF4), together with the External Clock and External Clock Sel inputs, produces the System Clock that goes to the clock tree.]

$f_n = \dfrac{f_0}{2^n}, \quad n = 1, 2, \ldots, 15$
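The divider relation can be tabulated with a short sketch. It assumes that the "16fo" label on the clock generator slide means the 1.056GHz oscillator runs at 16 times f0, which gives f0 = 66MHz and matches the 0.002 – 66MHz output range reported in the clock results. The names `REFERENCE_HZ`, `F0_HZ` and `divider_outputs` are illustrative only and do not come from the slides.

```python
# Nominal values read off the slides: a 1056 MHz LC reference,
# assumed here to be 16 * f0 (the "16fo" label in the schematic).
REFERENCE_HZ = 1056e6
F0_HZ = REFERENCE_HZ / 16  # 66 MHz, the fastest selectable clock


def divider_outputs(f0_hz, stages=15):
    """Return [f_0, f_1, ..., f_stages] where f_n = f_0 / 2**n."""
    return [f0_hz / 2 ** n for n in range(stages + 1)]


outputs = divider_outputs(F0_HZ)
for n, f_hz in enumerate(outputs):
    print(f"f{n:<2d} = {f_hz / 1e6:10.6f} MHz")
```

Under this assumption the slowest output, f15, is about 2kHz, and the first few outputs (66, 33, 16.5, 8.25, 4.125MHz) line up with the frequencies labeled on the DFS results slide.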


Dynamic Frequency Scaling (cont)

• Glitch suppression example

[Figure: timing diagram showing 2f0, f0, f1, f2, Clock Sel (stepping through f0, f2, f1), FF0.Q, FF4.Q and fclk, with the suppressed glitch marked.]


Microsystem Measured Results

• TSMC 0.18µm MM/RF bulk CMOS
• 3.5 million transistors
• Operates up to 92MHz
• 33.9mW core power consumption @ 92MHz & 1.8V
• 1.4mW core power consumption @ 10MHz & 1.1V
• 17.28mW MEMS clock source power consumption @ 1.8V
• 740µW sleep power consumption @ 1.1V

[Figure: die photo with a 3.54mm dimension label. Labeled regions: four 16KB SRAM banks, PIPELINE, CLK, PERIPHERALS, CACHE, ANALOG, TEST.]
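As a rough sanity check on the two measured operating points, the sketch below scales the 92MHz, 1.8V figure by the usual first-order dynamic power relation P ∝ V²f. This ignores leakage and any change in switching activity, so it is an approximation and not the authors' model. The function name `scaled_dynamic_power` is hypothetical.

```python
def scaled_dynamic_power(p_ref_mw, v_ref, f_ref_mhz, v_new, f_new_mhz):
    """Scale a reference power by (V_new / V_ref)**2 * (f_new / f_ref)."""
    return p_ref_mw * (v_new / v_ref) ** 2 * (f_new_mhz / f_ref_mhz)


# Measured points from the slide: 33.9 mW @ 92 MHz, 1.8 V
# and 1.4 mW @ 10 MHz, 1.1 V.
predicted_mw = scaled_dynamic_power(33.9, 1.8, 92.0, 1.1, 10.0)

# mW divided by MHz gives nJ per clock cycle.
energy_per_cycle_fast_nj = 33.9 / 92.0
energy_per_cycle_slow_nj = 1.4 / 10.0

print(f"predicted power at 10 MHz, 1.1 V: {predicted_mw:.2f} mW")
print(f"energy per cycle: {energy_per_cycle_fast_nj:.3f} nJ -> "
      f"{energy_per_cycle_slow_nj:.3f} nJ")
```

The scaled estimate comes out near 1.38mW, close to the measured 1.4mW, and the energy per cycle drops from roughly 0.37nJ to 0.14nJ between the two points.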


Microcontroller Measured Results

• Static loop cache utilization provides 4 to 20% energy savings

• Vdd scaling across different frequencies allows for adjustment to program workload requirements

[Figure: Power vs. Vdd across frequency ranges. Core power (mW, 0 to 60) versus core Vdd (1.10V to 2.20V) for Chip #1 to Chip #4, with curves at 10MHz, 50MHz and 90MHz.]

[Figure: Loop cache energy savings. Measured power (mW, 0 to 45) for SRAM only versus SRAM and loop cache, with percentage savings (0% to 45%), for benchmarks Data1, Data2, Data3, Data4, Fetch1 and Fetch2. Bars are annotated with loop cache (LC) access rates of 56%, 93%, 29%, 23%, 28% and 23%, listed here in extraction order.]


WIMS C Compiler

• Windowed versus non-windowed machine
  – 19% reduction in power consumption
  – 30% performance improvement
• Dynamic instruction placement in 512B loop cache achieves 43% energy savings over static placement

[Figure: Energy savings in 64B loop cache. % energy savings (0 to 60), dynamic versus static placement, for benchmarks sha, epic, gsmdec, rasta, rawc, rawd, cjpeg, djpeg, blowfish, unepic, gsmenc, rijndael and pegwitdec, plus the average.]


Instruction Level Power Modeling

• Divide ISA into groups of similar instructions

• noops model inter-instruction pipeline switching

• Account for memory access energy separately

Energy per instruction group

Instruction Group   Energy (nJ)     Instruction Group   Energy (nJ)
add-sub             0.2403          win swap            0.1832
shift               0.1950          load imm            0.1961
boolean             0.2127          branch-nt           0.1720
compare             0.2082          branch-t            0.5741
multiply            2.7702          jmp abs             0.5372
divide              2.7160          jmp rel             0.4020
copy                0.2127          jmp abs sub         0.5658
bit                 0.6137          jmp rel sub         0.3527
load abs            0.5249          return              0.3700
load rel            0.3661          swi                 0.5585
store abs           0.4427
store rel           0.3070          noop                0.1931

Memory access energy

                Ext Mem (nJ) [1]   Loop (nJ)   MMR (nJ)   Boot Rom (nJ)
inst fetch      -0.0554            -0.0507     -          -0.0420
bit [2]         -0.1643            -0.1615     -0.1909    -
load abs [2]    -0.0976            -0.1016     -0.0877    -
load rel [2]    -0.1039            -0.1039     -0.1091    -
store abs [2]   -0.0411            -0.0461     -0.0427    -
store rel [2]   -0.0525            -0.0633     -0.0575    -

[1] Excludes memory access energy as this is memory dependent
[2] Fetch energy counted separately
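One way to use the per-group energies is to weight them by how often each group executes. The sketch below does only that: it sums the base energies from the "Energy per instruction group" table and leaves out the memory access terms, since the slide does not explain their sign convention. The instruction mix in `loop_mix` is invented for illustration and is not measured data, and `estimate_energy_nj` is a hypothetical helper name.

```python
# Base energy per instruction group in nJ, copied from the slide.
ENERGY_NJ = {
    "add-sub": 0.2403, "shift": 0.1950, "boolean": 0.2127,
    "compare": 0.2082, "multiply": 2.7702, "divide": 2.7160,
    "copy": 0.2127, "bit": 0.6137, "load abs": 0.5249,
    "load rel": 0.3661, "store abs": 0.4427, "store rel": 0.3070,
    "win swap": 0.1832, "load imm": 0.1961, "branch-nt": 0.1720,
    "branch-t": 0.5741, "jmp abs": 0.5372, "jmp rel": 0.4020,
    "jmp abs sub": 0.5658, "jmp rel sub": 0.3527, "return": 0.3700,
    "swi": 0.5585, "noop": 0.1931,
}


def estimate_energy_nj(counts):
    """Sum base energy over a {group: executed count} mapping."""
    unknown = set(counts) - set(ENERGY_NJ)
    if unknown:
        raise KeyError(f"unknown instruction groups: {sorted(unknown)}")
    return sum(ENERGY_NJ[group] * n for group, n in counts.items())


# Hypothetical mix: a 100-iteration loop that loads, adds, stores,
# compares, and branches back (taken 99 times, not taken once).
loop_mix = {
    "load rel": 100, "add-sub": 100, "store rel": 100,
    "compare": 100, "branch-t": 99, "branch-nt": 1,
}
total_nj = estimate_energy_nj(loop_mix)
print(f"estimated base energy: {total_nj:.4f} nJ")
```

A fuller model would add the memory access energy for the region each fetch, load and store touches, as the slide's third bullet indicates.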


Clock Generation Results

• No external reference
• No PLL/DLL
• High frequency accuracy
• Low start-up latency
• Low temperature coefficient
• Broad operating temperature range
• Low jitter
• Minimal area overhead (3% of die)
• Low Power
• All Si technology

Metric/Parameter                  LC Clock
Reference frequency               1056MHz
Output frequencies                0.002 – 66MHz
Frequency accuracy across lot     ±0.75%
Frequency precision (no trim)     ±2%
Trimmed frequency accuracy        100ppm
Worst case duty cycle             48/52
Worst case RMS period jitter      <300ppm
Temperature stability             ±0.9% (-40 to 100°C)
Max. operation temperature        150°C
Power supply                      1.8V
Bias current                      9.6mA
Power dissipation                 17.28mW
Min. operating power              7.2mW
Start-up latency (25°C/125°C)     18ns/28ns
Si footprint                      0.3mm²
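The power rows of the table are related by P = V × I, which the following sketch checks. The bias current at the minimum operating power is an inference that assumes the same 1.8V supply; the slide does not state it.

```python
SUPPLY_V = 1.8   # power supply, from the table
BIAS_MA = 9.6    # bias current, from the table

# V * mA gives mW: this reproduces the 17.28 mW power dissipation row.
power_mw = SUPPLY_V * BIAS_MA

# Assumption: the 7.2 mW minimum operating power is also at 1.8 V,
# which would correspond to a 4 mA bias current.
implied_min_bias_ma = 7.2 / SUPPLY_V

print(f"power dissipation: {power_mw:.2f} mW")
print(f"implied minimum bias current: {implied_min_bias_ma:.1f} mA")
```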


MEMS Fabrication

• Post processing etch using PAD cut
• Suspended inductor
• Varactor etch unsuccessful
  – No etch chemistry for MiM oxy-nitride dielectric
  – Use transconductance modulation instead


DFS Results

• Glitch-free switching
• Switching latency is 5/(2f0), or 37.45ns for this implementation

[Figure: glitch-free frequency switching, with traces labeled 33MHz, 16MHz, 8MHz, 4MHz and 1MHz.]
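The latency figure can be reproduced with a one-line calculation. The sketch reads "5/2f0" as five periods of the 2f0 clock, which fits the five synchronizer flip-flops (FF0 to FF4) shown on the DFS slide; that reading is an interpretation, and `switching_latency_ns` is a hypothetical helper name.

```python
def switching_latency_ns(f0_mhz, sync_stages=5):
    """Latency of sync_stages periods of the 2*f0 clock, in ns."""
    return sync_stages / (2.0 * f0_mhz) * 1e3


# Nominal f0 of 66 MHz (1056 MHz / 16, assuming the "16fo" label
# on the clock generator slide means the oscillator runs at 16 * f0).
nominal_ns = switching_latency_ns(66.0)

# f0 implied by the 37.45 ns quoted on the slide.
implied_f0_mhz = 5 / (2 * 37.45e-9) / 1e6

print(f"latency at nominal 66 MHz: {nominal_ns:.2f} ns")
print(f"f0 implied by 37.45 ns:    {implied_f0_mhz:.2f} MHz")
```

The nominal value comes out near 37.9ns, while the quoted 37.45ns corresponds to an f0 of about 66.8MHz, roughly 1% above nominal.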


Future Directions

• Add DSP for Cochlear Implants and other bio-medical devices
• Include ring oscillator for a lower power alternative
• ISA improvements to reduce compiler bottlenecks
  – Address register support
  – Separate data and address register windows
  – DMA instructions
• Decrease sleep mode power
• Explore Microsystem design in advanced technologies

[Figure: Preliminary next generation system. Die layout with a 3.0mm dimension label, showing four 8KB SRAM banks and CACHE, PIPELINE, I/O, DSP and CLK blocks.]


Conclusion

• Described a highly-functional, low-power Microsystem ideally suited for remote and bio-medical applications

• DFS allows on-the-fly, low-latency adaptation to workload requirements from 33.9mW @ 90MHz to 1.4mW @ 10MHz or sleep mode at 740µW

• Monolithic clock reference decreases system size, cost, and power consumption compared to other techniques

• Power-aware compiler takes advantage of low-power architectural features to achieve maximum power reduction


Acknowledgements

• NSF ERC for WIMS
• MOSIS Educational Program
• Artisan Components
• TSMC
• Cadence
• Synopsys
• Mentor Graphics
• Coventor