Ultralow-Voltage Design and Technology of Silicon-on-Thin-Buried-Oxide (SOTB) CMOS for Highly Energy Efficient Electronics in IoT Era
Nobuyuki Sugii Low-power Electronics Association & Project (LEAP)
27 August, 2014
International Symposium on Leading-edge SOI Technologies at KIT
http://www.leap.or.jp/public_html/eindex.html http://tia-nano.jp/en/index.html
Acknowledgments
This work is supported by the New Energy and Industrial Technology Development Organization and the Ministry of Economy, Trade, and Industry of Japan.
Universities and national institute in collaboration with LEAP:
• The University of Tokyo
• The University of Electro-Communications
• Shibaura Institute of Technology
• Kyoto University
• Kyoto Institute of Technology
• Keio University
• Osaka University
• Tokyo University of Science
• National Institute of Advanced Industrial Science and Technology (AIST)
Staff of Renesas Electronics for chip fabrication
LEAP SOTB group members
Key Messages and Questions
• IoT: a great number (exceeding trillions in '20) of tiny electronics and huge network traffic.
• Tiny electronics should be self-powered (zero-sum energy). MEP operation, although slow, is recommended.
• SOTB offers highly energy-efficient CMOS operation with considerably high (for IoT nodes) performance.
• How about chips in large systems?
• Scaling does not significantly decrease energy, because of leakage, and prefers higher fCLK.
• Is a further decrease of Vdd at MEP possible?
• Is retro scaling of fCLK possible? Solutions?
MEP: Minimum Energy Point
What is MEP operation?
Any transistor should work under the condition of "lowest energy per operation."
Power: $P = C V_{dd}^2 f + I_{leak} V_{dd}$ (first term: AC power; second term: leakage)
Energy per operation: $E = C V_{dd}^2 + I_{leak} V_{dd} / (a f)$
a: activity
[Figure: energy per operation versus Vdd, with the MEP marked. B. Zhai et al., VLSI Symp. (2006)]
"MEP operation," near Vdd = 0.4 V, maximizes efficiency but is very slow.
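The energy expression above has a minimum because the active term falls with Vdd while the leakage term grows as the circuit slows down. The following is a minimal sketch of that trade-off, not the model used in the cited work. It assumes a purely exponential subthreshold on-current, so that Ileak/Ion = exp(-Vdd/(n kT/q)), and a clock frequency f = Ion/(k_delay C Vdd). The names `k_delay`, `n`, and all default values are hypothetical choices for illustration.

```python
import math

THERMAL_VOLTAGE = 0.0259  # kT/q near room temperature, in volts


def energy_per_operation(vdd, c=1e-12, a=0.1, n=1.5, k_delay=200.0):
    """Energy per operation E = C*Vdd^2 + Ileak*Vdd/(a*f), in joules.

    Assumptions (not from the slides): the on-current follows a purely
    exponential subthreshold law, so Ileak/Ion = exp(-Vdd/(n*kT/q)), and
    the clock frequency is f = Ion / (k_delay * C * Vdd), where k_delay
    is a lumped, hypothetical fitting factor.
    """
    leak_to_on = math.exp(-vdd / (n * THERMAL_VOLTAGE))
    active = c * vdd ** 2
    # Substituting f gives Ileak*Vdd/(a*f) = C*Vdd^2 * (k_delay/a) * (Ileak/Ion).
    leakage = active * (k_delay / a) * leak_to_on
    return active + leakage


def find_mep(v_min=0.10, v_max=1.20, step=0.005):
    """Return (Vdd, E) at the minimum of E on a uniform voltage grid."""
    count = int(round((v_max - v_min) / step)) + 1
    grid = [v_min + i * step for i in range(count)]
    return min(((v, energy_per_operation(v)) for v in grid), key=lambda p: p[1])


if __name__ == "__main__":
    v_mep, e_mep = find_mep()
    print(f"Vdd at MEP ~ {v_mep:.3f} V, E ~ {e_mep * 1e12:.3f} pJ")
```

With these placeholder parameters the minimum falls between 0.3 V and 0.4 V. In this simplified model, lowering the activity `a` or raising `k_delay` increases the leakage share and moves the minimum to a higher Vdd.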
MEP operation is very slow
[Figure: operating frequency versus Vdd, with the MEP marked ("MEP here"). A. Wang et al., IEEE JSSC 40, p. 310 (2005)]
Important parameters to optimize energy
Miniaturization increases Vdd@MEP
D. Bol, Thesis, Université catholique de Louvain (2008)
A. Chandrakasan et al., Proc. IEEE 98, 191 (2010)
Leakage should decrease with scaling for MEP operation.
The high-performance flavor also prefers higher Vdd (when E is determined by Eactive, leakage power can be disregarded!).
Higher fCLK and variability increase E
Higher fCLK degrades efficiency
Kao et al., JSSC 37, p. 1545 (2002)
This trend is still unchanged.
D. Blaauw et al., ISCAS 2006, p. 32 (2006)
Variability-tolerant design increases power (area) and decreases speed (logic depth).
Adaptive control is efficient
[Figures: Miyazaki et al., ISSCC, p. 40 (2002); P. Flatresse et al., ISSCC, 24.3 (2013). Labels: No Knob; Single Knob; Dual Knobs; Bulk FBB; FDSOI ZBB; FDSOI FBB; "EDP, not MEP!"]
"Leakage (Vth) control = adaptive body biasing" (ABB), depending on a (activity), is a key technology in terms of the power-speed tradeoff. ABB minimizes energy!
fCLK increase already stopped
[Figure: historical trends of transistor count (increasing), fCLK, power, and Perf./CLK. Source: "The Free Lunch Is Over," http://www.gotw.ca/publications/concurrency-ddj.htm. Questions marked on the plot: "Why does it not decrease?" and "Why does it not increase?"]
Interest as a device engineer
• Operation should be done at MEP.
• More performance (not fCLK) at MEP.
• Smaller variability and leakage can decrease Vdd at MEP.
• Adaptive control capability is significant.
• High reliability at low Vdd (soft error). Already proven for thin-BOX FDSOI.
• New steep transistors in the long term. TFETs are improving their performance year by year.
SOTB (Silicon on Thin BOX)
Features: ultrathin SOI / BOX; low-impurity channel; ultrathin BOX; Vth adjustment Imp.; same layout as bulk.
Benefits:
- Excellent SCE immunity
- Suppression of Vth variation
- Less Vth sensitivity to tSOI
- Back-gate bias control
- Vth control (multiple Vth)
- Easy design porting
[Figure: cross section of an SOTB chip. Ultralow-voltage core circuit: NMOS and PMOS on ultrathin SOI (12 nm) and ultrathin BOX (10 nm), with n GP and p GP in a deep n well on a p substrate; terminals Vbp, Vbn, Vdd (core), GND. Hybrid-bulk I/O circuit: NMOS and PMOS in n well and p well on the p substrate; terminals Vcc (I/O), GND.]
R. Tsuchiya et al., IEDM 2004. / N. Sugii et al., T-ED 57, p. 835. / Y. Yamamoto et al., VLSI 2012-3.
Device/process technology
H. Makiyama et al., IEEE IMFEDK, p. 42 (2011). Y. Yamamoto et al., VLSI 2012 & 2013
• Stay at the 65-nm node (low leakage; we can go smaller if leakage can be reduced).
• Gate stack: stay with the poly-Si gate and use high-k to tune the EWF (LP to LSTP options).
• Impurity-profile tuning below the BOX (multiple Vth control)
• High-quality epitaxial growth (small R and small on-current variation)
• Hybrid bulk for I/O and ESD
"65 nm is the best node for IoT," R. Aitken (ARM), VLSI 2014
SOTB integration (SRAM & logic)
2-Mbit 6T-SRAM; bit cell area: 0.54 μm2; same layout as bulk.
The smallest transistors in ULSI (of each technology) are in SRAM.
[Figure: cross-sectional TEM of the SRAM region, with the PD, PU, and PG transistors marked. Labels: poly-Si; SiON with Hf or Al incorporation for "quarter-gap" work function; epi; SOI ~12 nm; BOX = 10 nm; GP ~10^18 /cm3; Si-sub.]
Benchmark of Vth variation
FDSOI (SOTB) has better AVT values than other transistor structures.
AVT is proportional to gate-oxide thickness Tinv.
[Figure: AVT benchmark. Data from conference papers (2010-2012)]
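As a reading aid for the AVT benchmark, the sketch below assumes the usual Pelgrom-plot definition, in which the standard deviation of Vth scales as AVT/sqrt(L·W). The gate dimensions are those quoted for the 1M-transistor array in this talk (L = 0.06 μm, W = 0.14 μm); the AVT values are hypothetical placeholders, not measured data.

```python
import math


def sigma_vth_mv(avt_mv_um, length_um, width_um):
    """Standard deviation of Vth in mV, assuming sigma = AVT / sqrt(L * W)."""
    if length_um <= 0 or width_um <= 0:
        raise ValueError("gate dimensions must be positive")
    return avt_mv_um / math.sqrt(length_um * width_um)


if __name__ == "__main__":
    # Hypothetical AVT values in mV*um, chosen only to show the scaling.
    for avt in (1.0, 2.0, 3.0):
        sigma = sigma_vth_mv(avt, length_um=0.06, width_um=0.14)
        print(f"AVT = {avt:.1f} mV*um -> sigma(Vth) = {sigma:.1f} mV")
```

Under this relation, halving AVT has the same effect on the Vth spread as quadrupling the gate area, which is why a small AVT matters most for the minimum-size transistors in SRAM.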
1M-Variability of Vth and Ion
[Figure: cumulative probability plots of Vth (V) and Ion (μA) at VDD = 1.2 V for 1M transistors (L = 0.06 μm, W = 0.14 μm), SOTB versus bulk. Annotations: same worst leakage; ×2 worst performance.]
• The distribution has no tail (normal distribution).
• SOTB has twice the worst-case on-current of a bulk transistor with the same worst-case leakage.
Y. Yamamoto et al., VLSI Tech. 2013.
0.37-V SRAM (2 Mbit) operation
Significant Vmin improvement thanks to small variability of SOTB
[Figure: measured fail-bit count (σ) versus VDD (V). Vmin = 0.37 V; annotation: -0.4 V.]
Y. Yamamoto et al., VLSI Tech. 2013.
Speed and leakage control by back bias
[Figure: 2-Mbit SRAM. Access time (nsec) versus VDD (V) in active mode, and cell leakage current (pA) in active and standby modes. Annotations: 5.5 nsec; 1/200 by RBB of 1.3 V.]
• Operates at ≫10 MHz at VDD = 0.4 V
• 1.2-pA cell leakage current in standby mode
Y. Yamamoto et al., VLSI Tech. 2013.
Robustness against temperature
[Figure: SRAM stability simulation. Vmin (V) versus |Vb| (V) at RT and 80°C, showing the read limit and the write limit.]
■ Vth↑ by RBB → Vmin recovers to <0.4 V
■ High temperature → Vth↓ → read margin degrades
Y. Yamamoto et al., VLSI Tech. 2013.
SOTB is fast and efficient at low Vdd
Measured data:
• 40% smaller delay at 0.4 V (same VT), where VT = (Vthn - Vthp)/2
• Same Cload; improved Ieff/Ioff characteristics (S and DIBL)
Makiyama et al., Jpn. J. Appl. Phys. 53, 04EC07 (2014)
Small local delay variability
Small N dependence: the variability is mainly global (N: number of ring-oscillator stages).
Makiyama et al., Jpn. J. Appl. Phys. 53, 04EC07 (2014)
Soft error immunity
6T-SRAM (Courtesy: Prof. Hashimoto (Osaka Univ.), NSREC 2014, to be published in TNS)
• Single event upset (neutron): > one-order improvement
• Multiple-cell upset (neutron): > two-order improvement
FF (Courtesy: Prof. Kobayashi (Kyoto Inst. Tech.), RADECS 2013, to be published in TNS)
• Alpha: > two-order improvement
• Neutron: > one-order improvement
Design environment
SOTB circuits can be designed through the conventional flow. Only slight modifications are necessary.
[Figure: design flow]
• Inputs: RTL; timing constraints
• Logic synthesis (Design Compiler)
• Logic verification (Formality)
• Place and route (IC Compiler (SoC Encounter))
• Timing analysis (PrimeTime)
• Physical verification (Calibre): DRC, LVS, ANT, dummy
• Output: mask layout
• Supporting data: libraries; P&R tech. lib.; parasitic RC (STAR RC); verification rules; SPICE (BSIM4, HiSIM-SOTB); std. cell layout (Virtuoso/Composer)
Silicon verification OK!
Very low energy in 32-bit MCU
[Figure: chip layout, 1.43 mm × 1.47 mm. SOTB ULV circuit: 32-bit RISC CPU; 6T-SRAM, 144 kByte (four SRAM blocks); 50k-gate logic; SPI, GP, UART, and ROM interfaces. Bulk I/O circuit.]
CPU: in-order five-stage pipeline; Harvard architecture (instruction and data on separate buses); instruction and data caches (not implemented in this chip).
Ishibashi et al., COOL Chips XVII (2014)
Power line structure
Ishibashi et al., COOL Chips XVII (2014)
Very low energy in 32-bit MCU
[Figure: plot with the annotation "-0.3 V"]
Sleep current control
A VBB generator with very small current consumption (<< μA) has been designed, thanks to the negligibly small substrate leakage.
Perpetuum Mobile Computing
Energy benchmarking
Vdd@MEP is around 0.3-0.5 V regardless of technology. Frequency is not high at MEP, though FDSOI (SOTB/UTBB) has a significant speed gain.

Reference     ISSCC 2010    ISSCC 2012   ISSCC 2012       ISSCC 2014   COOL Chips'14   To be (2014)
Technology    180-nm Bulk   32-nm Bulk   22-nm Tri Gate   28-nm UTBB   65-nm SOTB      65-nm SOTB
IP            32-bit M3     IA-32        32-way SIMD*     DSP          32-bit RISC     DSP
Vdd range     0.35-0.75 V   0.28-1.2 V   0.28-1.1 V       0.39-1.3 V   0.22-1.2 V      0.2-1.2 V
Vdd @ MEP     0.4 V         0.45 V       0.28 V           0.53 V       0.35 V          0.55 V
f @ MEP       73 kHz        60 MHz       17 MHz           460 MHz      14 MHz          -
Emin          28.9 pJ       170 pJ       6.5 pJ*          62 pJ        13.4 pJ         -

*highly parallel processing (154 GOPs/W)
fCLK retro scaling possible?
How can we increase the portion of such highly-efficient "dedicated" engines in a chip?
Data cited from B. Brodersen (S3S 2013) & ISSCC 2013 papers
"Dedicated" means ultra-parallelism dedicated design.
Type                 Micro proc.   Mobile proc. (DSP)   "Dedicated" (BB, MPEG)
# op. per cycle      27            500                  5,000
fCLK                 3 GHz         80 MHz               25 MHz
Perf. (GOPS)         81            40                   125
P (W)                95            0.2                  2.3m
Efficiency (OP/nJ)   0.85          200                  54,348
Energy (pJ)          1173          5.0                  0.0184
Vdd (V)              ~1.0?         ~0.9?                0.55
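The derived rows of the table above (performance, efficiency, energy per operation) follow from the operations per cycle, fCLK, and power. The short sketch below recomputes them from the quoted inputs; the function and dictionary names are illustrative only.

```python
def efficiency_metrics(ops_per_cycle, f_clk_hz, power_w):
    """Return (GOPS, OP/nJ, pJ per operation) for one design point."""
    ops_per_second = ops_per_cycle * f_clk_hz
    gops = ops_per_second / 1e9
    ops_per_nj = ops_per_second / power_w / 1e9  # OP/J converted to OP/nJ
    pj_per_op = power_w / ops_per_second * 1e12  # J/OP converted to pJ/OP
    return gops, ops_per_nj, pj_per_op


# (operations per cycle, fCLK in Hz, power in W), as quoted in the table.
DESIGNS = {
    "Micro proc.": (27, 3e9, 95.0),
    "Mobile proc. (DSP)": (500, 80e6, 0.2),
    '"Dedicated" (BB, MPEG)': (5000, 25e6, 2.3e-3),
}

if __name__ == "__main__":
    for name, (ops, f_clk, power) in DESIGNS.items():
        gops, eff, energy = efficiency_metrics(ops, f_clk, power)
        print(f"{name}: {gops:.0f} GOPS, {eff:.3g} OP/nJ, {energy:.3g} pJ/op")
```

The "dedicated" engine reaches the highest throughput at the lowest clock frequency because its parallelism (operations per cycle) is about 185 times that of the microprocessor, which is the sense in which fCLK can be scaled back without losing performance.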
Efficiency of Top 500 Supercomputers
Decreasing E as low as possible, down to the MEP, is mandatory even in HP applications.
[Figure: power (MW) versus performance (Pflops) of the Top 500 supercomputers as of Nov. 2013, http://www.top500.org/. Annotations: "400 MW for Exa flops!"; "Mega solar power plant"; "Can improve E by two orders?"]
Many GPUs are already used. What comes next?
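The "400 MW for Exa flops" annotation can be restated as an energy per floating-point operation. The arithmetic below is only a unit conversion of that figure, taking 1 Exa flops as 10^18 operations per second; the 4 MW value is what a two-order improvement would imply, not a number from the slide.

```latex
E_{\mathrm{flop}} = \frac{P}{R}
  = \frac{400 \times 10^{6}\ \mathrm{W}}{10^{18}\ \mathrm{flop/s}}
  = 4 \times 10^{-10}\ \mathrm{J}
  = 0.4\ \mathrm{nJ\ per\ flop}
  \quad (\text{equivalently } 2.5\ \mathrm{Gflops/W})

\text{Two orders lower: } E_{\mathrm{flop}} = 4\ \mathrm{pJ}
  \;\Rightarrow\; P = 4 \times 10^{-12}\ \mathrm{J} \times 10^{18}\ \mathrm{flop/s} = 4\ \mathrm{MW}
```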
Higher-efficiency approach
Tiny electronics
• Higher performance with slower fCLK (MEP)
• Zero-sum power with harvesting
• Non-volatile logic eliminating leakage
• Intermittent operation
• Reduce communication data rate (compression)
• Low-voltage I/O
• Dedicated multiple components?
• .... still many issues
Large systems
• Higher performance with slower fCLK (MEP)
• Energy efficient and wide-band memory?
• Wide-band yet slower fCLK communication (I/F?)
• Reduce sequential fraction (Amdahl's law)
• Reconfigurable interconnect
• Dedicated multiple components?
• .... still many issues
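The item on reducing the sequential fraction refers to Amdahl's law: if a fraction s of the work cannot be parallelized, N parallel units give a speedup of 1/(s + (1 - s)/N). The sketch below evaluates that standard formula; the sample values of s and N are arbitrary and not taken from the talk.

```python
def amdahl_speedup(sequential_fraction, n_units):
    """Speedup over a single unit when only the parallel part scales with n_units."""
    if not 0.0 <= sequential_fraction <= 1.0:
        raise ValueError("sequential_fraction must be within [0, 1]")
    if n_units < 1:
        raise ValueError("n_units must be at least 1")
    parallel_fraction = 1.0 - sequential_fraction
    return 1.0 / (sequential_fraction + parallel_fraction / n_units)


if __name__ == "__main__":
    # Arbitrary sample points: how much of a 100-unit array is actually usable.
    for s in (0.0, 0.01, 0.1):
        print(f"s = {s:.2f}: speedup with 100 units = {amdahl_speedup(s, 100):.1f}")
```

This is why slowing fCLK toward the MEP and recovering throughput by parallelism only works when the sequential fraction is small: the speedup can never exceed 1/s, however many units are added.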
Examples: efficient components
Flex Power FPGA (Nat. Inst. AIST)
• Reconfigurable logic minimizing E by VBB
• Fine-grain VBB control of each LUT and MUX
• HVT & LVT by VBB: ~1/100 leakage
[Figure: FPGA fabric with the critical path, LVT, and HVT marked.]
Cool Mega Array (Keio University)
• Processing elements and micro controller
[Figure: efficiency (MOPS/mW) versus frequency (MHz) for an alpha blender, bulk versus SOTB. Labelled conditions: Vdd = 0.3 V, 0.4 V, and 0.8 V; VBB = -0.4 V, -0.2 V, 0 V, 0.2 V, and 0.4 V.]
Atom SW (LEAP)
• Reconfigurable wiring with Atom SW
• Reconfigurable offloader reduces CLK cycle and E
[Figure: Atom SW in the ON and OFF states. Labels: Ru; Cu; solid electrolyte.]

Data compress.       CLK cycle   E (nJ)
CPU: MSP430          177         2.89
Offloader: Atom SW   8.3         0.135
ratio                0.047       0.044
Summary
• Serious issue in the IoT era: energy saving.
• CMOS circuits should operate at MEP at any time.
• Hetero integration of "Dedicated" (parallel) engines is expected.
• Super steep transistors are promising in the long term.
• In the short term, thin-BOX FDSOI (SOTB) is our recommendation.
• Roughly one order of improvement in energy. Why not use FDSOI?
Thank you for your attention!
Appendix
Publications relating to SOTB (from Hitachi/Renesas, Tokyo University, and LEAP)
R. Tsuchiya et al., IEDM 2004, p. 631. [SOTB]
T. Ohtou et al., Electron Device Letters, 28, p. 740 (2007). [Variability simulation]
T. Ishigaki et al., SSDM 2007, p. 886. [SOTB/bulk hybrid]
T. Ishigaki et al., Jpn. J. Appl. Phys., 47 (4), p. 2585 (2008). [SOTB/bulk hybrid]
R. Tsuchiya et al., IEDM 2007, p. 475. [SRAM, back-bias effect]
Y. Morita et al., VLSI Technology 2008, p. 166. [Variability, multi-Vth]
T. Ishigaki et al., ESSDERC 2008, p. 198. [RO, back-bias effect]
T. Ishigaki et al., Solid-State Electronics, 53, p. 717 (2009). [RO, back-bias effect]
N. Sugii et al., SSDM 2008, p. 880. [Variability]
N. Sugii et al., Jpn. J. Appl. Phys., 48 (4), 04C043 (2009). [Variability]
N. Sugii et al., IEDM 2008, p. 249. [Variability]
R. Tsuchiya et al., VLSI 2009, p. 150. [SRAM, analog, reliability]
N. Sugii et al., Trans. Electron Devices, 57, p. 835 (2010). [Variability]
T. Ishigaki et al., IRPS 2010, p. 1049. [HC & NBTI reliability]
T. Hiramoto et al., SOI Conference 2010, p. 170 (2010). [Variability]
T. Ishigaki et al., Trans. Electron Devices, 58, p. 1197 (2011). [HC & NBTI reliability]
J. Nishimura et al., IEEE ULIS 2011 (2011). [RTN]
A. Shima et al., Jpn. J. Appl. Phys., 50 (4), 04DC06 (2011). [Metal S/D process]
H. Makiyama et al., IEEE IMFEDK, p. 42 (2011). [Vth design]
H. Makiyama et al., EUROSOI 2012. [Vth design]
Y. Yamamoto et al., VLSI 2012. [Low-Vdd operation]
T. Mizutani et al., SNW 2012. [Variability (on-current)]
T. Mizutani et al., Jpn. J. Appl. Phys., 52, 04CC02 (2013). [Variability (S factor)]
Y. Yamamoto et al., VLSI 2013, T212. [Low-Vdd SRAM]
H. Makiyama et al., IEDM 2013, 33.2. [RO delay VBB control]
H. Makiyama et al., Jpn. J. Appl. Phys., 53, 04EC07 (2014), open access. [RO delay variability]
N. Sugii et al., JLPEA 2014, 4, 65-76; doi:10.3390/jlpea4020065 (open). [Review]
K. Ishibashi et al., COOL Chips XVII, Yokohama 2014. [CPU results]