NCTU Memory Systems IEE5011 FALL 2013 1
A Survey of DDR4 SDRAM Design Improvement Methods
16 January 2014
Edmund Leong 梁文禎 0260814
Overview
• Introduction - DDR4 Specifications
• A Far End Crosstalk Cancellation Method
• Driver Design
• A Low Jitter DLL Design
• Fast Parallel CRC and DBI Calculation Method
• Conclusion
DDR4 Specifications (1/3)
• $P = \alpha C_L V_{DD}^2 f$
• DDR → DDR4
• f ↑ 8x
• VDD ↓ 2.75x
• Based on the simplified equation, power consumption is still increasing.
• Other methods are introduced to reduce power consumption.
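The scaling argument on this slide can be checked with a few lines of arithmetic. The sketch below plugs the two quoted ratios (f up 8x, VDD down 2.75x) into P = α C_L VDD² f. It assumes α and C_L stay unchanged between generations; the function name is illustrative only.

```python
def relative_power(freq_ratio: float, vdd_ratio: float) -> float:
    """Return P_new / P_old for P = alpha * C_L * VDD^2 * f.

    Assumes alpha (activity factor) and C_L (load capacitance) are unchanged,
    so only the frequency and supply-voltage ratios matter.
    """
    return freq_ratio * vdd_ratio ** 2


# Ratios quoted on the slide: f rises 8x, VDD falls by a factor of 2.75.
ratio = relative_power(freq_ratio=8.0, vdd_ratio=1 / 2.75)
print(f"P_new / P_old = {ratio:.3f}")
```

Under these assumptions the ratio comes out slightly above 1, which agrees with the remark that power consumption still increases.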
DDR4 Specifications (2/3)
• Change from center tapped termination (CTT)/SSTL to pseudo open drain (POD)
• This reduces the VDD-to-GND current path when DQ is logic high.
DDR4 Specifications (3/3)
• Data Bus Inversion (DBI)
• 2.5V Vpp for word lines
• CRC protection
• CA parity error detection
• Point to point topology
Far End Crosstalk Cancellation Method (1/4)
• Crosstalk cancellation methods:
  • Circuit implementation
  • Wider spacing between signal traces
  • Use of via stub capacitance
Far End Crosstalk Cancellation Method (2/4)
• Far end crosstalk can be reduced by using via stubs
• Inter Symbol Interference (ISI) is not affected
• Resonant frequency is over 10GHz
$$V_{FEXT}(t) = \frac{t_{flight}}{2}\left(\frac{C_m}{C} - \frac{L_m}{L}\right)\frac{dV_{agg}}{dt}$$
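Read directly from the formula, far end crosstalk vanishes when C_m/C equals L_m/L. The sketch below evaluates the coefficient in front of dV_agg/dt for two cases. All numeric values are made up for illustration, and the idea that the via stub acts by raising C_m/C toward L_m/L is an interpretation of the slide, not a quoted result.

```python
def fext_coefficient(t_flight_s: float, cm_over_c: float, lm_over_l: float) -> float:
    """Coefficient multiplying dV_agg/dt in the V_FEXT formula, in seconds."""
    return 0.5 * t_flight_s * (cm_over_c - lm_over_l)


T_FLIGHT = 1e-9   # hypothetical 1 ns flight time
LM_OVER_L = 0.10  # hypothetical inductive coupling ratio

# Capacitive coupling weaker than inductive coupling: nonzero (negative) FEXT.
before = fext_coefficient(T_FLIGHT, cm_over_c=0.06, lm_over_l=LM_OVER_L)
# Extra mutual capacitance balances the two ratios: the coefficient goes to zero.
after = fext_coefficient(T_FLIGHT, cm_over_c=0.10, lm_over_l=LM_OVER_L)
print(before, after)
```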
Far End Crosstalk Cancellation Method (3/4)
• The 8-port S-parameters are measured up to 20GHz with a vector network analyzer
• Resonance caused by the stub starts around 15GHz
Far End Crosstalk Cancellation Method (4/4)
Driver Design (1/5)
• Type 0 – standard termination
• Type I – switched termination pre-emphasis
• Type II – constant termination de-emphasis
Driver Design (2/5)
• Type 0 – standard termination
• R2 is always open
• Always driving with Rs termination
• No boost to high frequency content
Driver Design (3/5)
• Type I – switched termination pre-emphasis
• R2 termination is active only during a transition bit
• Termination during transition is Rs||R2.
• Termination during non-transition is Rs only.
• Level of pre-emphasis is controlled by Rs and R2
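As a minimal sketch of the Type I behavior described above, the snippet below computes the termination seen on a transition bit (Rs||R2) and on a non-transition bit (Rs only). The resistor values and function names are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def parallel(r_a: float, r_b: float) -> float:
    """Equivalent resistance of two resistors in parallel, in ohms."""
    return r_a * r_b / (r_a + r_b)


def type1_termination(rs: float, r2: float, transition: bool) -> float:
    """Type I driver termination: Rs||R2 on a transition bit, Rs otherwise."""
    return parallel(rs, r2) if transition else rs


RS, R2 = 40.0, 120.0  # hypothetical resistor values in ohms
on_transition = type1_termination(RS, R2, transition=True)
on_hold = type1_termination(RS, R2, transition=False)
print(on_transition, on_hold)
```

The lower resistance during a transition means a stronger drive on the edge, which is the pre-emphasis; the ratio between the two values sets its level.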
Driver Design (4/5)
• Type II – constant termination de-emphasis
• R1 and R2 in series form a network whose Thevenin equivalent resistance is Rs.
• Transition bit driven by Rs.
• Non-transition bit driven by the R1-R2 network.
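The "constant termination" idea can be illustrated with a Thevenin calculation. The sketch assumes R1 connects to VDDQ and R2 to ground, with the output taken at the node between them; this topology, the supply value, and the resistor values are assumptions for illustration and are not given on the slide.

```python
def thevenin_divider(vddq: float, r1: float, r2: float) -> tuple:
    """Thevenin voltage and resistance of R1 (to VDDQ) in series with R2 (to GND).

    Seen from the node between the two resistors. The topology is assumed.
    """
    v_th = vddq * r2 / (r1 + r2)
    r_th = r1 * r2 / (r1 + r2)
    return v_th, r_th


# Hypothetical values: the network presents 40 ohms, matching an Rs of 40 ohms,
# while driving a reduced (de-emphasized) level instead of the full rail.
v_th, r_th = thevenin_divider(vddq=1.2, r1=60.0, r2=120.0)
print(f"V_th = {v_th:.2f} V, R_th = {r_th:.1f} ohm")
```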
Driver Design (5/5)
Driver and Termination    Best termination value, Rs – Rt (Ω)    Best eye width (ps)
Type 0, VDDQT             40 – 60                                187
Type I, VDDQT             40 – 60                                187
Type II, VDDQT            40 – 120                               232
Type II, CTT              40 – 120                               228
Simulation of DDR4 2400MT/s, 1 DIMM per channel
• With optimized resistor values, the choice between VDDQT and CTT termination has minimal effect on the performance of the net
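To put the eye widths in context, they can be expressed as a fraction of the unit interval (UI) at 2400MT/s. The sketch below does that arithmetic with the table values; the helper name is illustrative.

```python
def unit_interval_ps(data_rate_mtps: float) -> float:
    """Unit interval (one bit time) in picoseconds for a data rate in MT/s."""
    return 1e6 / data_rate_mtps


UI_PS = unit_interval_ps(2400)  # roughly 416.7 ps

# Best eye widths (ps) taken from the table above.
eye_widths_ps = {
    "Type 0, VDDQT": 187,
    "Type I, VDDQT": 187,
    "Type II, VDDQT": 232,
    "Type II, CTT": 228,
}
for name, width in eye_widths_ps.items():
    print(f"{name}: {width} ps = {width / UI_PS:.1%} of UI")
```

On these numbers the Type II driver opens the eye from a bit under half of the UI to a bit over half.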
Low Jitter DLL Design (1/4)
Low Jitter DLL Design (2/4)
• Conventional Charge Pump
$$I_D = \frac{1}{2}\mu_{n,p} C_{ox} \frac{W}{L}\left(V_{GS} - V_{TH}\right)^2\left(1 + \lambda V_{DS}\right)$$
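One plausible reading of this equation, not stated explicitly on the slide, is that the (1 + λ V_DS) term makes the charge pump current depend on the output voltage. The sketch below evaluates the square-law model at two drain-source voltages. All device values are hypothetical and chosen only to show the trend.

```python
def drain_current(k_prime: float, w_over_l: float, v_ov: float,
                  lam: float, v_ds: float) -> float:
    """Square-law drain current with channel-length modulation.

    k_prime is mu * C_ox in A/V^2 and v_ov is the overdrive V_GS - V_TH.
    """
    return 0.5 * k_prime * w_over_l * v_ov ** 2 * (1 + lam * v_ds)


# Hypothetical device values.
K_PRIME, W_OVER_L, V_OV, LAMBDA = 200e-6, 10.0, 0.2, 0.1

i_low = drain_current(K_PRIME, W_OVER_L, V_OV, LAMBDA, v_ds=0.3)
i_high = drain_current(K_PRIME, W_OVER_L, V_OV, LAMBDA, v_ds=0.9)
print(f"current change over the V_DS range: {i_high / i_low - 1:.1%}")
```

With a fixed gate bias, the current changes by a few percent as V_DS moves, so the up and down currents of a simple pump would not stay matched across the control voltage range.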
Low Jitter DLL Design (3/4)
• New Charge Pump design
Low Jitter DLL Design (4/4)
                            TVLSI’10         JSSC’11            Proposed’13
Process (nm)                54               130                90
DRAM Interface              GDDR3            DDR                DDR4
Supply (V)                  1.8              1.2                1.2
DLL Type                    ADDLL            ADDLL              ADDLL
Frequency (GHz)             1.4              0.11-1.4           1.6
Peak-to-peak jitter (ps)    29 @ 1.4 GHz     15.11 @ 1.4 GHz    12.33 @ 1.6 GHz
Power (mW)                  29.5 @ 1 GHz     74.4 @ 1.4 GHz     33.6 @ 1.6 GHz
Area (mm²)                  0.11             0.387              0.047
Proposed – Design and Diagnostics of Electronic Circuits & Systems (DDECS), IEEE International Symposium 2013
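Because the jitter figures are quoted at different clock frequencies, one way to compare them is as a fraction of the clock period. The sketch below does this normalization with the table values; the normalization itself is an added convenience and does not appear in the source.

```python
def jitter_fraction(jitter_ps: float, freq_ghz: float) -> float:
    """Peak-to-peak jitter as a fraction of the clock period."""
    period_ps = 1000.0 / freq_ghz
    return jitter_ps / period_ps


# (peak-to-peak jitter in ps, frequency in GHz) from the comparison table.
designs = {
    "TVLSI'10": (29.0, 1.4),
    "JSSC'11": (15.11, 1.4),
    "Proposed'13": (12.33, 1.6),
}
for name, (jitter_ps, freq_ghz) in designs.items():
    print(f"{name}: {jitter_fraction(jitter_ps, freq_ghz):.2%} of the clock period")
```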
Fast Parallel CRC and DBI Calculation Method (1/7)
• DDR4 introduces CRC using the ATM-8 HEC polynomial
• CRC calculation is based on DBI-inverted data
• DDR4 adds the CRC value at the end of the data burst
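For reference, the ATM HEC generator polynomial is x^8 + x^2 + x + 1 (0x07). The sketch below is a generic bit-serial software CRC-8 over bytes with that polynomial, using a zero initial value and MSB-first processing. Those conventions are assumptions for illustration; it does not reproduce the DDR4 bit mapping or the parallel hardware discussed in the following slides.

```python
ATM8_POLY = 0x07  # x^8 + x^2 + x + 1, with the x^8 term implicit


def crc8_atm(data: bytes, crc: int = 0) -> int:
    """Bit-serial CRC-8 with the ATM HEC polynomial, MSB first, no final XOR."""
    for byte in data:
        crc ^= byte
        for _ in range(8):
            if crc & 0x80:
                # Top bit set: shift out and subtract (XOR) the polynomial.
                crc = ((crc << 1) ^ ATM8_POLY) & 0xFF
            else:
                crc = (crc << 1) & 0xFF
    return crc


print(hex(crc8_atm(b"123456789")))
```

Every output bit of such a CRC is an XOR of a fixed subset of the input bits, which is why the hardware version can be built as a set of XOR trees.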
Fast Parallel CRC and DBI Calculation Method (2/7)
• CLmin = tCore + Max(0, tCRC – tPrep) + tAlign
• tCalc + Flight time 1 + Flight time 2 < 4nCK
• In 3.2Gbps DDR4, the calculation time constraint is about 1.2ns
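The 4nCK window can be worked out from the data rate. At 3.2Gbps with two bits per clock, tCK is 0.625ns, so 4nCK is 2.5ns. The sketch below computes this and the remaining calculation budget. The combined flight time of 1.3ns is a hypothetical value picked so that the result matches the quoted figure of about 1.2ns; it is not given in the source.

```python
def tck_ns(data_rate_gbps: float) -> float:
    """Clock period in ns for a double-data-rate interface (two bits per clock)."""
    return 2.0 / data_rate_gbps


def calc_budget_ns(data_rate_gbps: float, flight_total_ns: float, n_ck: int = 4) -> float:
    """Time left for tCalc once both flight times are subtracted from n_ck clocks."""
    return n_ck * tck_ns(data_rate_gbps) - flight_total_ns


window = 4 * tck_ns(3.2)
budget = calc_budget_ns(3.2, flight_total_ns=1.3)  # hypothetical flight total
print(f"4nCK = {window} ns, tCalc budget = {budget:.1f} ns")
```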
Fast Parallel CRC and DBI Calculation Method (3/7)
• (a) has internal nodes that do not swing full rail.
  • Vdd-Vth swing
• (c) has internal nodes with full rail swing
  • Inverter to prevent a long chain of transmission gates
Fast Parallel CRC and DBI Calculation Method (4/7)
• DBI is activated when more than half of the DQ bits are 0.
• The inputs of each CRC calculation are determined by bit mapping (e.g. gray boxes).
• Serial DBI and CRC calculations are too inefficient
• A parallel method is needed
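The DBI rule stated above can be sketched for a single 8-bit DQ group as follows. The DBI# signal is modeled as active low (0 means the byte was inverted), which follows the DBI# naming used in these slides but is otherwise an assumption of this sketch; the function name is illustrative.

```python
def dbi_encode(dq_byte: int) -> tuple:
    """Return (transmitted_byte, dbi_n) for one 8-bit DQ group.

    dbi_n is modeled as active low: 0 means the byte was inverted.
    """
    dq_byte &= 0xFF
    zeros = 8 - bin(dq_byte).count("1")
    if zeros > 4:  # more than half of the DQ bits are 0
        return (~dq_byte) & 0xFF, 0
    return dq_byte, 1


print(dbi_encode(0x00))  # eight zeros: inverted
print(dbi_encode(0x0F))  # exactly four zeros: sent as is
```

Since the CRC covers the transmitted (possibly inverted) data, a serial flow would have to finish this decision for every burst before the CRC could start, which is the inefficiency the slide points out.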
Fast Parallel CRC and DBI Calculation Method (5/7)
• CRC starts with all DBI bits = 0
• For each CRC[i], information needed for post-processing CRC+DBI correction:
  • Inclusion of DBI#[k] in CRC[i]
  • Oddness of DQ bits associated with burst k and CRC[i]
  • Actual DBI#[k]
• D[k] = self’ * Odd * DBI#[k]’ + self * Even * DBI#[k] + self * Odd
• CRC_new[i] = CRC[i] xor D[0] xor … xor D[7]
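The correction term can be checked by brute force on a simplified model of one burst. The sketch treats one CRC bit as the XOR of a masked subset of an 8-bit DQ group plus, optionally, the DBI# bit itself ("self"). It assumes the speculative CRC is computed on the non-inverted data with DBI# taken as 0, that DBI# = 0 means the data is inverted, and that every DBI# term in D[k] refers to the same burst k. These modeling choices are assumptions, not details confirmed by the source.

```python
def parity(x: int) -> int:
    """XOR of all bits of x."""
    return bin(x).count("1") & 1


def correction(self_in: int, odd: int, dbi_n: int) -> int:
    """D[k] = self' * Odd * DBI#' + self * Even * DBI# + self * Odd."""
    even = odd ^ 1
    return ((self_in ^ 1) & odd & (dbi_n ^ 1)) | (self_in & even & dbi_n) | (self_in & odd)


def check_all() -> bool:
    """Compare speculative CRC bit plus correction against the direct result."""
    for data in range(256):
        for mask in range(256):          # which DQ bits feed this CRC bit
            for self_in in (0, 1):       # whether DBI# itself feeds this CRC bit
                for dbi_n in (0, 1):
                    sent = data if dbi_n else data ^ 0xFF
                    actual = parity(sent & mask) ^ (self_in & dbi_n)
                    speculative = parity(data & mask)  # DBI# assumed 0, data not inverted
                    if actual != speculative ^ correction(self_in, parity(mask), dbi_n):
                        return False
    return True


print(check_all())
```

Under these assumptions the speculative value XORed with D[k] always equals the directly computed value, so the XOR tree can start before the DBI decision is known.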
Fast Parallel CRC and DBI Calculation Method (6/7)
Stage    Input Slots    CRC[i] input    Empty Slots
1        64             37              27
2        32             19              13
3        16             10              6
4        8              5               3
5        4              3               1
6        2              2               0
[Figure: XOR tree producing CRC_new[i]; diagram labels: 32, 6]
• DBI#[k] inputs into third stage of XOR tree
• Critical path is one tXOR more than XOR tree
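The table follows from simple counting if the tree is built from 2-input XOR gates: each stage halves the number of slots, and the number of live inputs halves with rounding up. The sketch below regenerates the rows under that assumption, starting from the 37 inputs and 64 slots in the first row; the function name is illustrative.

```python
def xor_tree_occupancy(n_inputs: int, n_slots: int) -> list:
    """Rows of (stage, input slots, used inputs, empty slots) for a binary XOR tree.

    Assumes 2-input XOR gates, so each stage halves the slot count and the
    number of live signals is halved, rounding up.
    """
    rows = []
    stage = 1
    while n_slots >= 2:
        rows.append((stage, n_slots, n_inputs, n_slots - n_inputs))
        n_inputs = (n_inputs + 1) // 2
        n_slots //= 2
        stage += 1
    return rows


for row in xor_tree_occupancy(n_inputs=37, n_slots=64):
    print(row)
```

The third stage has 6 empty slots on this count, which is where the slide places the DBI#[k] inputs.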
Fast Parallel CRC and DBI Calculation Method (7/7)
Conclusion
• The DDR4 specifications require very high speeds, which places importance on signal integrity
• Transmission line theory is important for impedance matching in termination to reduce reflections and crosstalk.
• Crosstalk can be minimized with closely placed via stubs
• Driver design with constant termination de-emphasis can widen the eye diagram
• A good DLL design is needed to reduce jitter
• Parallel CRC + DBI calculations can relax speed constraints
References
• E. Desjardins (2012, Sept. 12). JEDEC Announces Publication of DDR4 Standard [Online]. Available: http://www.jedec.org/news/pressreleases/jedec-announces-publication-ddr4-standard
• DDR4 SDRAM, JEDEC standard JESD79-4, Sept. 2012.
• D. Wang (2013, Dec. 3). Why migrate to DDR4? [Online]. Available: http://www.eetimes.com/document.asp?doc_id=1280577
• H. Goto (2010, Aug. 16). Towards Next-Generation 4Gbps DDR4 Memory [Online]. Available: http://pc.watch.impress.co.jp/docs/column/kaigai/20100816_387444.html
• C-M. Nieh, J. Park, “Far-end Crosstalk Cancellation using Via Stub for DDR4 Memory Channel,” in IEEE 63rd Electronic Components and Technology Conference (ECTC), pp. 2035-2040, 2013.
• N. Pham, D. Dreps, R. Mandrekar, N. Na, “Driver Design for DDR4 Memory Subsystems,” in IEEE 19th Electrical Performance of Electronic Packaging and Systems (EPEPS), pp. 297-300, 2010.
• Y-H. Tu, K-H. Cheng, H-Y. Wei, H-Y. Huang, “A Low Jitter Delay-Locked-Loop Applied for DDR4,” in IEEE 16th Design and Diagnostics of Electronic Circuits and Systems (DDECS), pp. 98-101, 2013.
• Hsiang-Hui Chang, Jung-Yu Chang, Chun-Yi Kuo, Shen-Iuan Liu, “A 0.7-2GHz Self-Calibrated Multiphase Delay-Locked Loop,” IEEE Journal of Solid-State Circuits, vol. 41, no. 5, May 2006.
• W-J. Yun, H-W. Lee, D. Shin, and S. Kim, “A 3.57 Gbps Low Jitter All Digital DLL with Dual DCC Circuit for GDDR3 DRAM in 54nm CMOS Technology,” in IEEE Trans. on VLSI, 2010.
• Y-S. Kim, S-K. Lee, H-J. Park, and J-Y. Sim, “A 110MHz to 1.4GHz locking 40-phase all-digital DLL,” in IEEE Journal of Solid-State Circuits, vol. 46, no. 2, pp. 435-444, Feb. 2011.
• J. Moon, J. S. Kih, “Fast Parallel CRC & DBI Calculation for High-speed Memories: GDDR5 and DDR4,” in IEEE International Symposium on Circuits and Systems (ISCAS), pp. 317-320, 2011.
• K. Lin, C. Wu, “A Low-cost Realization of Multiple-input Exclusive-OR gates,” ASIC Conference and Exhibit, Proceedings of the 8th Annual IEEE International, pp. 307-310, Sept. 1995.