a 10gb/s 10mm on-chip serial link in 65nm cmos featuring...
TRANSCRIPT
Symposia on VLSI Technology and Circuits
A 10Gb/s 10mm On-Chip Serial Link in 65nm CMOS Featuring a Half-Rate
Time-Based Decision Feedback Equalizer
Po-Wei Chiu, Somnath Kundu, Qianying Tang, and Chris H. Kim
University of Minnesota, Minneapolis, MN
Outline
• Background
• Proposed Time-Based Decision
Feedback Equalizer (DFE)
• 65nm Test Chip Details
• BER Measurements
• Conclusion
Slide 1
On-Chip Interconnect Scaling
Slide 2
• Interconnect length not scaling down at same rate as
transistor scaling
• Interconnect power larger fraction of total chip power
Interconnect Design Challenges
Slide 3
• A 2cm x 2cm processor requires interconnect lengths as long as 15 mm
• Signal loss due to RC parasitics limits performance
Standard Solution: Repeater
Slide 4
• Pros: Improved RC delay, digital implementation, good CAD tool support
• Cons: Floorplan disruption, increased power consumption
[2] L. Zhang, et al., TVLSI, 2009.
La
ten
cy (
ns
)
Length (mm)5 10 15
0
1
1.5
0.5
20
n=1n=2
n=3n=5
Equalizer
Repeater (no equalization)
21 nn-1
0
TX/RX Equalization Techniques
Slide 5
TX RX
Channel
Freq.
Gain
CTLE + DFEFFE
Time
Vo
lt.
Time
Vo
lt.
• Feed Forward Equalizer (FFE)
• Continuous Time Linear Equalizer (CTLE)
• Decision Feedback Equalizer (DFE)
Equalization Techniques for On-Chip Interconnects
Slide 6
[3] D. Walter, et al., ISSCC, 2012. [4] S. Lee, et al., ISSCC, 2013.
VDD
VSS
INP
IPEIDRV
D+ D- Dd- Dd+
VOUT
VDD
VSS
INN
Active Inductor
OUTN
OUTP
• Left: Utilizes capacitors to pre-distort signal
• Right: Utilizes current mode logic to pre-distort
signal, inductor provides frequency peaking
Conventional DFE Implementation
Slide 7
Current mode logic
OutputVRX(t) ΣΣΣΣ
W1W2WN
Z-1 ...
...
Z-1 Z-1
VDFEx[n]
IBIAS W1 WN...
Slicer
W2
Proposed Time-Based DFE
Slide 8
Delay line
OutputVRX(t) ΣΣΣΣ
W1W2WN
Z-1 ...
...
Z-1 Z-1
x[n]
W1W2WN
∆tPD=Phase DetectorPD
CLK ∆t
DFE Comparison
Slide 9
Time-based DFE
Pros
Cons
Inverter based, scalable to large number of taps, low
power consumption
Moderate speed
CLK PD
Conventional DFE
Ultra high speed (>20Gb/s)
Analog intensive, headroom issues, limited number of
taps, large power consumption
Time-Based DFE Operation
Slide 10
VRX(t)
w/ DFE
w/o DFE
1 0 0
W1 W2
NREF
NDFE
W1 W2 W1W1 W2
0
Input
Decision Results
Time-Based DFE Operation
Slide 11
VRX(t)
w/ DFE
w/o DFE
Input
D=1 D=0
Optimizing Time-Based DFE
Slide 12
x[n]
OutputPD
TREF
T1 T2 T3
TREF TREF
Optimizing Time-Based DFE
Slide 13
x[n]
OutputPD
TREF
T1 T2
2TREF-T3
• Performs same time-based comparison with fewer
delay stages � low power consumption
• All delay stages are identical
Delay Stage Implementation
Slide 14
Output
Z-1 Z-1
x[n]
PD
TREF
TRX TW1X1
Analog Control
Digital Control
T-W2X2
W2 W1
∆tPDΣΣΣΣVRX(t)
Delay Stage w/ Analog Control
Slide 15
2X
1X
4X
VRX
VRX
VRX
VRX
VRX VRX
4X 4X
∆ D
ela
y (
ps
)VRX (V)
65 nm GP CMOS, 25°C
0 0.2 0.4 0.6 0.8 1.0 1.2
0
5
10
15
1X 2X 4X
• Delay controlled by analog signal
Delay Stage w/ Digital Control
Slide 16
∆ D
ela
y (
ps
)
w<5:0>0 20 40 60
0
2
4
6
8 65nm GP CMOS, 25°C
1X 2X 4X
2X
1X
4X
D·w<5>
D·w<4>
D·w<2>
D·w<3>
D·w<1> D·w<0>4X 4X
D=0
D=1
• Inverters for coarse control, capacitors for fine control
• Capacitor connected to VDD for wider delay range
65nm Test Chip Diagram
• TX: Half rate (i.e. Fclk=5GHz for 10 Gbps) FFE
• RX: Transimpedance amp.(TIA), time-based DFE
• Testing features: 15 bit PRBS, in-situ eye-diagram
monitor
Slide 17
Time-Based DFE Implementation
• Transmission gate resistor for impedance matching
• Half rate operation
• Zero-offset aperture phase detector (PD)
Slide 18
-W2
CLK(5GHz)
TRXPD
D[n]
TW1X1FF FF
PD
D[n+1]
FF FF
TREF
TIA2X4X 8X1X
To+∆T
To
VRX(t) W1
W1
-W2
T-W2X2
TRX TW1X1
TREF T-W2X2
CLK(5GHz)
To+∆T
To
D Q
RST
Zero-offset aperture PD
Eye-Diagram Measurement
Slide 19
• Typical BER eye diagram
– Y-axis: Voltage offset
– X-axis: Unit Interval
[4] S. Lee, et al., ISSCC, 2013.
Eye-Diagram Measurement
Slide 20
• Typical BER eye diagram
– Y-axis: Voltage offset
– X-axis: Unit Interval
• TB-DFE BER eye diagram
– Y-axis: Time offset
– X-axis: Unit Interval
[4] S. Lee, et al., ISSCC, 2013.
In-situ BER Eye-Diagram Monitor
• 6-bit programmable delay to sweep the X,Y-axes
• BER monitor compares TX data with RX data
Slide 2111
b
Co
un
ter
Para
lle
l to
seri
al
CLK PDPhase Delay
∆T
BER Monitor
D
D' Error count
TREF DFE
TRX DFE
Err
215-1 PRBS
UI
T
Measured BER Eye Diagram
Slide 22
• Y-axis is time offset, not voltage offset
Measured BER Bathtub
• Eye width=0.43 UI @ BER=10-12 with DFE
Slide 23
-0.6 0
Phase (UI)
w/o DFE
w/ DFE
-0.9 -0.3 0.6 0.90.310
-12
10-9
10-6
10-3
Bit
Err
or
Ra
te
65nm GP, 1.2V, 25ºC
1.2-1.2
0.43UI
Die Photo and Feature Summary
Slide 24
Performance Comparison
Slide 25
ISSCC'09 [5] ISSCC'12 [6] ISSCC'13 [7] VLSI'15 [8]
Technology 90nm 65nm 65nm 65nm
TX and RXCurrent mode
driver+TIAVoltage mode
driver+sense amp.Current mode
driver+sense amp.CTLE-based
repeater
Data Rate 4Gb/s 10Gb/s 3Gb/s 4Gb/s
Throughput
(Gb/s/µm)2 2.56 0.75 4
Link Length 10mm 6mm 10mm 2.5mm+2.5mm
BER Bathtub < 10E-6 < 10E-12
TIA 14.4
DFE 30.9
FFE 31.9
174 9.5 48.4
Voltage modedriver+TIA
2
Energy
Efficiency
(fJ/b/mm)
35.6
This work
65nm
10Gb/s
10mm
Features 2-tap TB-DFENo DFE No DFE No DFENo DFE
< 10E-12 < 10E-12< 10E-12BER Eye Yes (< 10E-6) No Yes (< 10E-12) Yes (< 10E-11)No
[5] B. Kim, et al., ISSCC, 2009. [6] D. Walter, et al., ISSCC, 2012. [7] S. Lee, et al., ISSCC, 2013.
[8] M. Chen et al., VLSI, 2015.
Conclusion
• A 10mm, 10 Gbps on-chip serial link implemented in 65nm GP
• An all-digital inverter based 2-tap half-rate time-based DFE
• In-situ BER eye diagram monitor
Slide 26