Status of GTK ASIC - TDCpix
22 Nov 2011
G. Aglieri, M. Fiorini, P. Jarron, J. Kaplon, A. Kluge, E. Martin, M. Noy, L. Perktold, K. Poltorak
[Block diagram labels, pixel column and end of column: pixel cell x 45 (config pixel, 5 bit trimDAC); driver & line & receiver; DLL 0 with CP & PD, clkdll = 320 MHz; fineHitRegister 0; syncRegister; fineTimeStampEncoder; pixelGroupFifo (depth = 3); group EOC 0 to 8; columnFifoController; column 0, column 1, double column 1; quarterchipFifo & frameInserter; controller; serializer. Bus widths before encoding: 5 address + 5 pileup, 32 fineRise, 32 fineTrail, 2x12+1 coarseRise, 2x4+1 coarseTrail. Bus widths after encoding: 5 add + 5 pil, 5 fineRise, 5 fineTrail, 12+1 coarseRise, 6+1 coarseTrail, 2 group collision. Legend: marked blocks = SEU protected.]
TDCpix ASIC block diagram (60 bit serial/4 LVDS pairs parallel)
April 2, 2012
[Diagram labels: 9+1x temp; pixel column; end of column]
Flip-flop area: 23 cell units * (0.40 µm x 4.8 µm) * (648 + 152 + 373/10) FF = 37000 µm² = 124 µm * 300 µm
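The area figure above can be checked with a few lines of arithmetic. The sketch below only re-evaluates the numbers printed on the slide; the variable names are invented for illustration, and the term 373/10 is taken exactly as written.

```python
# Numerical check of the flip-flop area estimate quoted on the slide.
# All names are illustrative; the numbers are copied from the slide.
CELL_UNITS_PER_FF = 23             # cell units per flip-flop
CELL_UNIT_W_UM = 0.40              # cell unit width [um]
CELL_UNIT_H_UM = 4.8               # cell unit height [um]
FF_COUNT = 648 + 152 + 373 / 10    # flip-flop count term, taken as written

ff_area_um2 = CELL_UNITS_PER_FF * CELL_UNIT_W_UM * CELL_UNIT_H_UM
total_area_um2 = ff_area_um2 * FF_COUNT
block_area_um2 = 124 * 300         # footprint quoted as 124 um x 300 um

print(f"area per FF : {ff_area_um2:.2f} um^2")
print(f"total       : {total_area_um2:.0f} um^2")
print(f"124 x 300   : {block_area_um2} um^2")
```

The computed total lands slightly below the rounded 37000 µm² on the slide, and the 124 µm x 300 µm footprint is slightly above it.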
[Block diagram labels, continued: 8 bit thresholdDAC column & 3 bit bias DAC; ConfigDoubleCol; bandgap and bandgap override; global DACs; test pulse; clkDll; config/status chip state machine; 2.4/3.2 Gbit/s CML driver; parallelOut 4 x SLVS 480/640 Mbit/s; double column 0; quarter chip RO 1, 2, 3; hitArbiter 0 & edge detector, hA 1, hA 2 to hA 8 (signals: hit; parallel_load & daq_rdy); coarseHitRegister 0 (2 x 32, 2 x (13 + 5)); coarseTimeStampEncoder (13 rise + 5 trail); group EOC 1, 2; coarseTimeStampServer 0 (coarseTimeStamp, 12 bit).]
Word width: 5 rise + 5 trail + 12+1 rise + 6+1 trail + 5 add + 5 pil + 2 col = 42 bit; 42 + 4 add = 46 bit; 46 + 4 add = 50 bit
[Block diagram labels, read-out chain: columnMux 9 to 1; columnFifo (depth = 6); data formatter & comma & frame inserter; 8b10b encoder (60 bit out); serializer clocked by clkserial/2; serialTime, serialTimeMux 90 to 48 (7x12 + 1x6), clocked by clksync or clkserialTime; analogMonitor mux (2, 1 temp); multiSerialPower; clksync & enableclk; clkmultiserial; quarter chip RO 0.]
is located in synchronous logic; clk divider needs synchronous reset with respect to receiving clock domain (clkmultiserial)
[Block diagram labels, I/O and double columns: clksync; SLVS 320 MHz; SLVS ≥ 320 Mbit/s; analog DC; SLVS world; double column 2 to double column 19; reset_global (SLVS); reset_corsecnt (SLVS); 648 FF @ 2 depth.]
Rate labels on the read-out chain, given as avg. nominal rate at 750 MHz beam (104 Mhit/s per chip) / rate with 2.4 Gbit/s serializer, in Mhit/s: 0.3/0.44, 2.7/4, 27/40
152 FF @ 4 depth
[Block diagram labels, quarter chip: 4x45, 45; serializer controller; FIFO overflow status (min. 40 FIFOs, 1 FIFO overflow bit, optional overflow count); sync registers clocked by clksync & enableclk; quarterChipMux 10 to 1 (90); clkmultiserial or clktest (48).]
[Clock diagram: qchip clock divider & clk distribution. Labels: clkDigital = 320/480 MHz (CMOS DC / SLVS, ext); PLL and PLL override; clkSerial = 2.4/3.2 GHz; clkserial/2; dividers /2, /5, /6, /8; mux paths a, b, c, d, e, f; outputs clkmultiserial, clksync, clkFIFOread.]
Path d is doubled so as to have one direct link from clkserial/2 to clkfiforead.
[Diagram labels: d, muxmode, PLL]
Modes: serialPLL2.4 / serialPLL3.2 / ext320 / ext480 / PLLoverride
abc: /001010/110*010/111*011/101*010
8 modes = 3 bits
clkInDigital = 20 / 26.66 / 320 / 480 / 320 MHz
clkPLL = 2.4 / 3.2 / - / - / 0.32 GHz
clksync = 240 (10) / 320 (10) / 320* (0) / 240* (2) / 32 (1) MHz
clkFIFOread = 40 (60) / 53 (60) / 27 (12) / 40 (12) / 5.3 (60) MHz
clkmultiserial = 480 / 640 / 320 / 480 / 64 MHz
clkconfig = 320 / 320 / 320 / 240 / 320 MHz (? if PLL runs on 480??)
() = division factor; * can also be 0 or 1 to change clksync in TDC
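The clkFIFOread entries in the mode table follow from dividing the source clock by the division factor given in parentheses. A minimal sketch of that arithmetic is shown below, assuming the source clock is clkPLL in the PLL modes and clkInDigital in the external modes; the dictionary layout and function name are illustrative only.

```python
# Sketch: reproduce the clkFIFOread column of the mode table.
# Assumption: source clock = clkPLL (PLL modes) or clkInDigital (ext modes).

def divided_clock_mhz(source_mhz: float, division: int) -> float:
    """Return the output frequency of an integer clock divider."""
    if division <= 0:
        raise ValueError("division factor must be positive")
    return source_mhz / division

# mode name -> (source clock [MHz], clkFIFOread division factor)
FIFO_READ_SETTINGS = {
    "serialPLL2.4": (2400.0, 60),
    "serialPLL3.2": (3200.0, 60),
    "ext320": (320.0, 12),
    "ext480": (480.0, 12),
    "PLLoverride": (320.0, 60),
}

for mode, (source_mhz, division) in FIFO_READ_SETTINGS.items():
    f_read = divided_clock_mhz(source_mhz, division)
    print(f"{mode:13s} clkFIFOread = {f_read:6.2f} MHz (/{division})")
```

The results agree with the rounded values 40 / 53 / 27 / 40 / 5.3 MHz listed for clkFIFOread.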
Pixel = column * 45 + row
Pixel group = column * 9 + group; group 0 contains pixel 0
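The two numbering rules above can be written as small helper functions. This is a sketch with invented function names; five pixels per group follows from 45 rows and 9 groups per column, while the mapping of a row to its group (five consecutive rows per group) is an assumption that the slide only hints at with "group 0 contains pixel 0".

```python
# Sketch of the pixel and pixel-group numbering given on the slide.
ROWS_PER_COLUMN = 45
GROUPS_PER_COLUMN = 9
PIXELS_PER_GROUP = ROWS_PER_COLUMN // GROUPS_PER_COLUMN  # 5

def pixel_index(column: int, row: int) -> int:
    """Pixel = column * 45 + row."""
    if not 0 <= row < ROWS_PER_COLUMN:
        raise ValueError("row out of range")
    return column * ROWS_PER_COLUMN + row

def pixel_group_index(column: int, group: int) -> int:
    """Pixel group = column * 9 + group."""
    if not 0 <= group < GROUPS_PER_COLUMN:
        raise ValueError("group out of range")
    return column * GROUPS_PER_COLUMN + group

def group_of_row(row: int) -> int:
    """Assumption: each group holds five consecutive rows."""
    return row // PIXELS_PER_GROUP

print(pixel_index(1, 0), pixel_group_index(2, 8), group_of_row(44))
```

With ten columns per quarter chip, `pixel_group_index` spans 0 to 89, which matches the 90 pixel groups addressed by the 7 bit address field.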
Floor plan (dimensions in µm, two versions of the drawing)
Vertical dimensions:
• Pixel matrix: 13500 µm
• EoColumn bias: 1800 µm
• TL rx: 70 µm
• hitArbiter & DLL, SM, fine registers & coarse units, pixel group FIFOs, column FIFO: 2477 µm
• Quarter chip read-out: 500 µm
• Clk_dll & reset_cc: 72 µm
• Configuration: 292 µm
• Test pulse distributor: 72 µm
• Serializer & PLL: 500 µm
• IO row: ~400 µm
• Total: > 19683 µm
Width: 12000 µm; corners: 125 µm
Blocks (width x height):
• double column analog 0 to 19, double col. dig. 0 to 19
• qchip 0, qchip 1, qchip 2, qchip 3 (& qconfig): 3000 x 500 each
• clk buffer & reset buffer & 20 clk dll buffers: 12000 x 40 (first version), 11900 x 72 (second version)
• test pulse distributor: 11900 x 72
• config: 12000 x 300 (first version), 11900 x 292 (second version)
• PLL & 4 x Serializer & clk divider: 8000 x 500
• BGana: 1400 x 300
• Bgdig & temp: 1500 x 300
• Band gap 1: 100 x 400
• Test pads: 215 x 700 (first version), 215 x 500 = 9 pads (second version)
• IO row: 12000 x 400 (158 pads)
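The quoted total height of more than 19683 µm can be reproduced by adding the individual vertical dimensions from the floor plan. The sketch below is only a bookkeeping check; which items enter the sum is an assumption, chosen here because it reproduces the slide's total exactly.

```python
# Sketch: sum of the vertical dimensions of the floor plan, in um.
# Assumption: these ten items are the ones stacked along the chip height.
HEIGHTS_UM = {
    "pixel matrix": 13500,
    "EoColumn bias": 1800,
    "TL rx": 70,
    "hitArbiter, DLL, registers, FIFOs": 2477,
    "quarter chip read-out": 500,
    "clk_dll & reset_cc": 72,
    "configuration": 292,
    "test pulse distributor": 72,
    "serializer & PLL": 500,
    "IO row": 400,
}

total_height_um = sum(HEIGHTS_UM.values())
print(f"total height: {total_height_um} um")
```

Since the IO row is given as roughly 400 µm, the result is a lower bound, consistent with the "greater than" sign on the slide.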
Top level schematic
• is what you call left 0,2,4,6,8
• and right 1,3,5,7?
TOP_KP_Serializer_4Cells
• Only one pair of GND/VDD in symbol
– Expecting: 4 pairs for each serializer + 1 pair for PLL
– Where is sub connected to?
– Question: which power supply is used for CML driver.
– What are the drivers for the test pads?
– Where is txResetA/B/C & testEnable connected to?
– pll_dac_code = pll_select_i<3:0>?
– pll_bandwidth_sel = pll_select_r<1:0>?
– pll_test_enable = pll_pfd_test_en?
Config
• sdout = serial_conf_out?
• missing:
– enable_term
– enable_term_serial_conf_in -> fixed or defaulted by register
– slvs_out_cset_multi_serial<3:0>
– slvs_out_cset_clkout<3:0>
– slvs_out_cset_serial_conf_out<3:0> -> fixed or defaulted by register
– pix_strobe stobe/b <19:0> is missing
• is clk_mux_sel<0> = mux_sel_a 1 = b …
config
• Where is pll_test_a/b going?
• where is reset_cc_cmd going?
• is sdin_m_a/b/c serial_conf_in
• where pll_instant_lock_m_a/b/c to?
• where pll_test_mux_a/b<3:0> to?
• e_co = err_fc?
dll clock fanout
• dll_clk_to_logic tdc_clk_dll_tologic?
• dll_clk tdc_clk_dll
Colpair1
• no power connections
• err_tc = ei?
BandgapRHfinal
• Analog band gap reference?
• frcref connect to bandgapoverride?
– to what kind of pad?
• vhigh/vlow/lsub to where?
tempinter
• is this reference digital and temp out?
• vhigh/vlow/lsub to where?
• RTR = temp_adj_int
• no override? for reference
• to which kind of pad is temp going?
• Where to connect NX and SX in schematic of encounter blocks
TDC
• gnd power domains for tdc and dll are not separated, pins missing
• why is not NX SX connection?
[Clock diagram, repeated on eight slides: qchip clock divider & clk distribution. Labels: clkDigital = 320/480 MHz (CMOS DC / LVDS, LVDS + 2 CMOS, ext); PLL and PLL override; clkSerial = 2.4/3.2 GHz; clkserial/2; dividers /2, /5, /6, /8; mux paths a, b, c, d, e, f (muxmode); outputs clkmultiserial, clksync, clkFIFOread, clkconfig. Path d is doubled so as to have one direct link from clkserial/2 to clkfiforead.]
serialPLL2_4lowClkSync = “001010”
serialPLL2_4HighClkSync = “000010”
serialPLL3_2 = “001000”
ext320lowClkSync = “H110H0” (H/L don’t care)
ext320highClkSync = “H100H0” (H/L don’t care)
ext480lowClkSync = “H110H1” (H/L don’t care)
ext480highClkSync = “H100H1” (H/L don’t care)
PLLoverride320 = “101010” (H/L don’t care)
[Block diagram labels: 10 x columnFifo (clksync) and 10 x serialReg (clksync), 10 * 46 and 10 x 9 serial reg; qchip_mux_x3; register_qchip_word_x3 (sync register, clksync & enableclk); 46 + 4 add = 50; data_formatter_komma_frame_inserter_x3; 8b10b encoder enc8b10bx6_x3 (48 in, 60 out); serializer on clkserial/2 with 2.4/3.2 Gbit/s CML driver; serial_time_x3 mux 90 to 48 (clksync); multi_serial_x3 with 4 x LVDS 480/640 Mbit/s (enable, clkmultiserial); clkmultiserial or clktest; qchip_controller_x3; clock_enable_generator_x3 (clksync, clkfifo, enableclk); pattern_control_x3 (clksync); config_register_rw (clksync); config_global (clkconfig); reset_synchronizer_x3_reset (clksync); reset_synchronizer_x3_reset_coarse_counter (clkdll); rate label 27/40 Mhit/s.]
Qchip block diagram
[Decoder block diagram labels: sync10b8b with clkword (recovered), 10 bit; dec10b8b_x6, 48 bit; serial_time_decoder, 48 bit; decoder_multi_serial, 48 bit, clkmultiserial; multi_serial_compare; blocks clocked by clkword.]
2011.10.24
serialTimeMux 12 to 1 (clksync)
serialTime multiplexing: 9x10; 7x12 + 1x6; 8x6 = 48
(116/6) * 25 ns = 500 ns; 6400 ns / 500 ns = 12.8
SerialMux block diagram
[Reset and clock distribution diagram. Floor plan labels: IO row; PLL & 4 x Serializer; config; qchip 0 to qchip 3; double column 0, hit arbiter 0 & 1. Reset labels: reset_sync; reset_sync_s1_qchip 0 to 3; reset_sync_s2; reset_sync_s3; reset_synchronizer (clk_sync); reset_flip_flop (clk_sync, clk_dll); dll statemachine 0; reset_cc; reset_cc_s1_dc 0 to 19 & reset_cc_long_s1_qc 0 to 3; reset_cc_s2; reset_synchronizer & edge detector (clk_dll). Clock labels: clk_dll_dc_0 to 19; clk_dll_sm; clk_dll_config; clk_dll_qc 0 to 3; clk_sync_qc_0 to 3; clk_sync 0,1,2,3.]
Power domains
[Reset synchronizer diagram: chain of D flip-flops clocked by clk_sync; signals cmd_reset_sync and reset_synchronizer_sync. Timing window: min: clk_prop + hold; max: clk_prop + clk_cycle - setup.]
Reset sources and the resets they drive:
*) pin reset_all_n -> reset_sync, reset_dll, reset_config, reset_bandgap_n
*) cmd_reset_all -> reset_sync, reset_dll, reset_config, reset_bandgap_n
*) cmd_reset_sync -> reset_sync
*) cmd_reset_dll -> reset_dll (to dll_state_machine)
*) cmd_reset_config -> reset_config
*) cmd_reset_bandgap -> reset_bandgap_n
[Diagram labels: clk_config, cmd_config; reset_bandgap_n, cmd_reset_bandgap]
Reset scheme
[Diagram: D flip-flop chain clocked by clk_sync, with voter.]
Digital logic: high active reset. From outside and analog blocks: low active reset.
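The reset list of this slide reads as a fan-out table from reset sources (one pin and several commands) to internal reset nets. The sketch below models it as a lookup, assuming each source simply asserts the nets listed after it; the direction of the mapping is inferred from the slide layout, and the function name is hypothetical.

```python
# Sketch: reset source -> set of internal reset nets it asserts.
# Assumption: the slide lists "source, then driven resets" on each line.
ALL_RESETS = frozenset(
    {"reset_sync", "reset_dll", "reset_config", "reset_bandgap_n"}
)

RESET_FANOUT = {
    "reset_all_n": ALL_RESETS,            # pin
    "cmd_reset_all": ALL_RESETS,
    "cmd_reset_sync": frozenset({"reset_sync"}),
    "cmd_reset_dll": frozenset({"reset_dll"}),   # goes to dll_state_machine
    "cmd_reset_config": frozenset({"reset_config"}),
    "cmd_reset_bandgap": frozenset({"reset_bandgap_n"}),
}

def asserted_resets(active_sources):
    """Return the union of reset nets asserted by the active sources."""
    nets = set()
    for source in active_sources:
        nets |= RESET_FANOUT[source]
    return nets

print(sorted(asserted_resets(["cmd_reset_sync", "cmd_reset_dll"])))
```

Polarity is deliberately left out of this model: the slide states that digital logic uses high active resets while external and analog resets are low active.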
IOs
• south end of chip:
– (12 mm - 2 corners * 0.215 mm) / 0.073 mm pitch = 158
– if possible only one row; optional: two rows with power pins in the 2nd row (longer bond wires)
– bond pads 200 µm long x ~70 µm wide
• east and west end:
– area accessible when sensor bonded: x mm pads
– area not accessible when sensor bonded: x mm pads available for test pads in the EOC area
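The pad count for the south edge is the usable edge length divided by the pad pitch, rounded down. A minimal sketch with the slide's numbers as defaults (the function name is illustrative):

```python
import math

def pads_in_row(chip_width_mm: float = 12.0,
                corner_mm: float = 0.215,
                pitch_mm: float = 0.073) -> int:
    """Number of pads that fit in one row between the two corner cells."""
    usable_mm = chip_width_mm - 2 * corner_mm
    return math.floor(usable_mm / pitch_mm)

print(pads_in_row())
```

The same function can be reused for the east and west edges once the accessible lengths, left as "x mm" on the slide, are known.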
I/O list (name: type, number of pads)
Operation:
• clk_dig: lvds_in, 2
• clk_dll: lvds_in, 2
• serial_conf_in: lvds_in, 2
• reset_coarse_frame_count: lvds_in, 2
• reset_global: lvds_in, 2
• (reset_dll): cmos_in#, (1)
• serial_conf_out: lvds_out, 2
• reset_bandgap: cmos_in, 1
• serial_out<3 downto 0>: CML_out, 8
• temp<1 downto 0>: analog_out, 2
• test_pulse_in: diff analog_in, 2
• multiSerial_out<15 downto 0>: lvds_out, 32
• clk_multiserial or clk_test: lvds_out, 2
Mode:
• bandgap_override: analogInOut, 1
• (mode_parallel_out): cmos_in, (1)
• clockMuxMode: cmos_in, 3
# possibly LVDS
Test:
• Test_out <37 downto 0>: cmos or analog or lvds, 38
• Test_in <39 downto 0>: cmos or analog or lvds, 40
• mux_analog_out: analog, 1
Optional:
• address <3 downto 0>: cmos, (4)
• Jtag_trst: cmos, (1)
• Jtag_tck: cmos, (1)
• Jtag_tms: cmos, (1)
• Jtag_tdi: cmos, (1)
• Jtag_tdo: cmos, (1)
• C_chan <7 downto 0>: cmos, ?
• (seu): lvds_out, (2)
Power:
• VDDanalog1.2: 13
• VDDtdc1.2: 6
• VDDdigital1.2: 7
• VDDserializer (min. 3 pairs/serializer): 12
• VDDlvds2.5: 1
• VDDlvdsMultiSerial2.5: 4
• GNDanalog1.2: 13
• GNDtdc1.2: 6
• GNDdigital1.2: 7
• GNDserializer (min. 3 pairs/serializer): 12
• GNDlvds2.5: 1
• GNDlvdsMultiSerial2.5: 2
Pad count: (12 - 2 * 0.215 mm) / 0.073 mm = 158; 152 w/o ()
I/O
power densities
examples:
M1: Idc = 3.12*(W-0.06) = 3.12*(1.4+0.7*(wd-1.4)-0.06); for 50 um -> 99 mA, 1.98 mA/um
M1: Irs = 7.52*(W-0.06)*sqrt(1.19+3.53/(W-0.06)) = 7.52*(1.4+0.7*(wd-1.4)-0.06)*sqrt(1.19+3.53/(1.4+0.7*(wd-1.4)-0.06)); for 50 um -> 300 mA, 6 mA/um
M2,3: for 50 um -> Idc = 111 mA, 2.22 mA/um
MG,MQ,LM: for 50 um -> Idc = 191 mA, 3.82 mA/um
SIOVDD, SIOVSS
SIOVDD: M2: 23 um + MA: 17 um
M2: Idc = 3.12*(W-0.06) = 3.12*(1.4+0.7*(wd-1.4)-0.06)
MA: Idc = 5.63*(W-0.27) = 5.63*(1.4+0.7*(wd-1.4)-0.27)
for 50 um -> 99 mA, 1.98 mA/um
Idc = 51 mA (M2) + 68 mA (MA) = 119 mA
SIOVDD, SIOVSS pads, thin oxide: two big traces of M2 and MA run vertically, with bars horizontally. These bars even help for power density but abut to the next cell. We cannot have that, as we have many different power domains. There are separation cells, but they are 71 um wide and would waste space. Modification of the SIOVDD and SIOVSS is needed to avoid short circuits.
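The current limits above all use the same width derating, W = 1.4 + 0.7*(wd - 1.4), followed by a layer-specific coefficient and offset. The sketch below evaluates the formulas as printed for the SIOVDD example (M2 at 23 um plus MA at 17 um); function names are illustrative, and the results are in mA, the unit the 50 um examples are quoted in.

```python
import math

def effective_width_um(drawn_width_um: float) -> float:
    """Width derating used in all formulas on the slide."""
    return 1.4 + 0.7 * (drawn_width_um - 1.4)

def idc_m2_ma(drawn_width_um: float) -> float:
    """DC current limit with the 3.12 coefficient (formula as printed)."""
    return 3.12 * (effective_width_um(drawn_width_um) - 0.06)

def idc_ma_ma(drawn_width_um: float) -> float:
    """DC current limit for layer MA (formula as printed)."""
    return 5.63 * (effective_width_um(drawn_width_um) - 0.27)

def irs_m1_ma(drawn_width_um: float) -> float:
    """Irs limit for M1 (formula as printed)."""
    w = effective_width_um(drawn_width_um) - 0.06
    return 7.52 * w * math.sqrt(1.19 + 3.53 / w)

siovdd_total_ma = idc_m2_ma(23) + idc_ma_ma(17)
print(f"M2 23 um: {idc_m2_ma(23):.1f} mA, MA 17 um: {idc_ma_ma(17):.1f} mA")
print(f"SIOVDD total: {siovdd_total_ma:.1f} mA, Irs(50 um): {irs_m1_ma(50):.0f} mA")
```

Evaluated at 50 um, the 3.12 coefficient gives roughly 110 mA, which is closer to the 111 mA quoted for M2,3 than to the 99 mA quoted for M1, so the M1 coefficient on the slide may differ from the one printed.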
Data format
• Nominal transmission: 2.4 Gbit/s
• High speed: 3.2 Gbit/s
• All words: 48 bits (6 bytes) long
• 8b10b encoded bit stream: 60 bits
– data word
– frame word
– idle (komma) word: no hits available to transmit
– sync word: after reset and after each force_sync command (can be sent repetitively)
• Header contains frame counter every 2048 clk_dll cycles (at 320 MHz ~ every 6.4 µs)
• Data contains dynamic range up to 6.4 µs + 1 rollover counter bit
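Two numbers on this slide follow directly from the clock and coding parameters: the frame period and the encoded word length. A small sketch, with illustrative function names:

```python
def frame_period_us(clk_dll_mhz: float, cycles_per_frame: int = 2048) -> float:
    """Frame period in microseconds for a given clk_dll frequency."""
    return cycles_per_frame / clk_dll_mhz

def encoded_bits(payload_bits: int = 48) -> int:
    """Length of a word after 8b10b encoding (10 line bits per 8 data bits)."""
    if payload_bits % 8:
        raise ValueError("payload must be a whole number of bytes")
    return payload_bits * 10 // 8

print(f"frame period at 320 MHz: {frame_period_us(320):.2f} us")
print(f"encoded word length    : {encoded_bits()} bit")
```

At a clk_dll of 240 MHz the same function gives about 8.53 µs, the worst-case frame length used later for sizing the hit counter.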
Data format-hit word normal mode (48 bit)
• --qchip_word -> data_out
• --(47) Status/data selector 1 bit
• --(46..40) Address 7 bit (90 pixel groups)
• --(39..35) Address-hit arbiter 5 bit
• --(34..30) Address pileup 5 bit
• --(29) Leading coarse time selector 1 bit
• --(28..17) Leading coarse time 12 bit: 1 bit rollover indicator + 2048 (11 bit) * 3.125 ns = 6.4 µs
• --(16..12) Leading fine time 5 bit: 98 ps -> 3.125 ns
• --(11) Trailing coarse time selector 1 bit
• --(10..5) Trailing coarse time 6 bit: 64 * 3.125 ns = 200 ns
• --(4..0) Trailing fine time 5 bit: 98 ps -> 3.125 ns
• --Total 48 bit
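The bit layout of the hit word can be exercised with a small pack/unpack pair. This is a sketch for a test bench or decoder, not the chip's HDL; the field names are shortened from the slide, and bit 47 is simply treated as a one-bit field since the slide does not say which value marks a data word.

```python
# Sketch: pack and unpack the 48 bit hit word from the bit ranges on the slide.
# Field names are illustrative abbreviations of the slide's labels.
HIT_WORD_FIELDS = [  # (name, msb, lsb)
    ("status_data_selector", 47, 47),
    ("address", 46, 40),
    ("address_hit_arbiter", 39, 35),
    ("address_pileup", 34, 30),
    ("leading_coarse_selector", 29, 29),
    ("leading_coarse_time", 28, 17),
    ("leading_fine_time", 16, 12),
    ("trailing_coarse_selector", 11, 11),
    ("trailing_coarse_time", 10, 5),
    ("trailing_fine_time", 4, 0),
]

def pack_hit_word(fields: dict) -> int:
    """Build the 48 bit word; missing fields default to 0."""
    word = 0
    for name, msb, lsb in HIT_WORD_FIELDS:
        width = msb - lsb + 1
        value = fields.get(name, 0)
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name} does not fit in {width} bit")
        word |= value << lsb
    return word

def unpack_hit_word(word: int) -> dict:
    """Split a 48 bit word back into its fields."""
    return {
        name: (word >> lsb) & ((1 << (msb - lsb + 1)) - 1)
        for name, msb, lsb in HIT_WORD_FIELDS
    }

example = {"address": 89, "leading_coarse_time": 2047, "leading_fine_time": 31}
print(hex(pack_hit_word(example)))
```

The field widths add up to 48 bit, which is a quick consistency check on the listed bit ranges.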
(45..39) Address 7 bit (90 pixel groups)
• 10 columns, each with 9 pixel groups, to be addressed:
• Column 0: pixel group 0,1,2,3,…,7,8
• Column 1: pixel group 9,10,11,12,13..17
• Column 2: pixel group 18,19,20,21,..26
• ….
• pixels in pixel group are one hot encoded
– example pixel 2: “00010”
Frame word (ex frame 0)
• word_frame0(27 downto 0) <= frame_counter;
• word_frame0(36 downto 28) <= hit_counter;
• word_frame0(42 downto 37) <= qchip_collision_count;
• word_frame0(46 downto 43) <= “0000”; -- not used
• word_frame0(47) <= '1'; --format bit
• --(47) status bit 1 bit
• --(46..43) not used = ‘0’ 4 bit
• --(42..37) # of collisions in previous frame 6 bits
• -- 2**6=64, 3.3 MHz*10*6.4us=211 hits --> count to 64 allows 1/3 of hits to collide
• --(36..28) # of hits in previous frame 9 bits
• -- hits per qchip and frame = 130 Mhits/s/4*6.4us=208 ->
• -- max hits in frame: worst case: clk_dll = 240 MHz, clk_sync = 320 MHz, clk serial 3200 MHz -> 2048 / 240 MHz = 8.53 us frame length -> transmission cycles in one frame: 8.53 us * (3200 MHz / 60) = 455 -> 9 bit
• --(27..0) framecounter 28 bit
• -- 2**28*6.4us=1718s
• word_frame1 suppressed
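The frame word assignments and the sizing arguments in the comments can be cross-checked numerically. The sketch below packs a frame word from the listed bit ranges and recomputes the worst-case hit count and the frame counter wrap-around time; function and variable names are illustrative.

```python
# Sketch: frame word layout and counter sizing, using the slide's numbers.

def pack_frame_word(frame_counter: int, hit_counter: int,
                    qchip_collision_count: int) -> int:
    """Build word_frame0: bit 47 = '1', bits 46..43 unused ('0000')."""
    if not 0 <= frame_counter < (1 << 28):
        raise ValueError("frame_counter needs 28 bit")
    if not 0 <= hit_counter < (1 << 9):
        raise ValueError("hit_counter needs 9 bit")
    if not 0 <= qchip_collision_count < (1 << 6):
        raise ValueError("qchip_collision_count needs 6 bit")
    return ((1 << 47) | (qchip_collision_count << 37)
            | (hit_counter << 28) | frame_counter)

# Worst case from the slide: clk_dll = 240 MHz, serial clock 3200 MHz.
worst_frame_us = 2048 / 240                     # frame length in us
max_words_per_frame = worst_frame_us * 3200 / 60
wrap_around_s = 2**28 * 6.4e-6                  # frame counter range at 6.4 us

print(f"worst-case frame length : {worst_frame_us:.2f} us")
print(f"max words per frame     : {max_words_per_frame:.0f}")
print(f"frame counter wraps at  : {wrap_around_s:.0f} s")
```

The worst-case word count stays below 2**9 = 512, which is the argument for a 9 bit hit counter.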
sync link word (48 bit), sent after reset for 1024 clk cycles: SUPPRESSED
• 6 * Komma K28.5
• Total 6 * 48 bit
sync slot word (48 bit), sent after reset or reset_cmd 2^16 = 65536 times -> @ 320 MHz: 3.125 ns * 6 * 65536 = 1.23 milliseconds
• 5 * Komma K28.5 + 1 K27.7; K27.7 is sent after 5 Kommas
• Total 6 * 48 bit
idle word (48 bit)
• 5 * Komma K28.5 + 1 K27.7; K27.7 is sent after 5 Kommas
• Total 6 * 48 bit
Do we need these values in frame?
• Seu_counter
• FIFO_overflow_counter
• Error_info
• Status_info
• Checksum
Configuration: qChip
• --configuration register
• send_k_sync_requ <= configuration_data_int_in(0);
• send_k_word_requ <= configuration_data_int_in(1);
• k_word_type <= configuration_data_int_in(5 downto 2);
• enable_serial_time_int <= configuration_data_int_in(6);
• enable_test_pattern_int <= configuration_data_int_in(7);
• new_data_testpattern <= configuration_data_int_in(8); --01 transition counts
• -- this bit acts as write command for the data_testpattern fifo, each 01 transition is used as write cmd
• data_testpattern <= configuration_data_int_in(9+data_test_pattern_width-1 downto 9);
• -- 54 bit: data_testpattern -> write data into cell 54 of data_testpattern -- 48 data + 6 k indicator
• --> subsequent writing moves write pointer of FIFO so that all 8 FIFO cells can be written
• --> when test pattern FIFO is used, all 8 FIFO cells are read and pushed into the data stream, thus the data stream consists of a multiple of 8 data words.
Configuration: TDC
Configuration: DLL
Configuration: EOC bias
Configuration: pixel
Configuration: config
G. Aglieri
Status
• schematic or hdl
• simulation pre-layout / pre-synthesis
• layout & extraction
• simulation post-layout / parasitics back annotated
• DRC & LVS
• schematic integrated in top
• layout integrated in top
• simulation integrated in top
• SEU simulation
Clock tree & divider 60 bit 5 pads
Implementation data transmission 60b
• Using GBT running at 20 MHz, but modifying data shift length to 60
• Problem: GBT has 3 parallel multiplexed shift registers, 60/3 = 20
GBT can be modified to 2 SR of 30 bits each, first clock divider from 3 to 2, additional high speed dividers
• 20 MHz in -> 2.4 Gbit/s -> 40 Mwords/s (+21 % (132 Mhits/s); +54 % (104 Mhits/s))
• 2400 / 320 = 7.5 ! 2400 / 8 = 300 MHz
• Programmable divider: 10 (240) / 5! (480) / 60 (40) for synchronous read logic
• Programmable divider: 8 (300), 6 (400) for FIFO write and state machines
[Diagram: 20 MHz -> PLL -> 2.4 GHz -> clock divider]
• 1.2 GHz serial mux & shift
• 40 MHz parallel_load (/60)
• Fifo read: 40 MHz (60) / 240 MHz (10) / 480 MHz (5!)
• Fifo write: 240 MHz (10) / 300 MHz (8) / 400 (6)
• Synchronous parallel read-FIFO frequency: serialFrequ * n / 50 [MHz] = 48 (1) / 96 (2) / 144 (3) / 192 / 240 (10) / 288 / 336 / 384 / 432 / 480 (5!)
• Fast counter:
• /2 = 1.2 GHz serial mux & shift
• /5 /2 = 240 MHz fifo read
• /5 /2 = 240; /2 /4 = 300 MHz; /3 /2 = 400 MHz: state machines, all FIFOs & chipFIFOwrite
Implementation data transmission; 60bit/5IO
• Multi Serial 60 bit:
– 60 bits (8b10b); 5 I/O pairs
– FIFO read-frequency for 50% contingency on 132 Mhits/s -> 50 MHz / quarter chip * 60 bit / 5 pairs (10 bits serializer) -> 3000 / 5 = 600 MHz per LVDS pair
– Input frequency comes from PLL or from outside, either 2.4 Gbit/s on pad or 480 MHz for all pads & synchronous logic
– if synchronous logic works with 480 MHz only: 480 MHz * 5 = 2400 Mbit/s / 60 -> 40 Mhits/s (+21 % (132 Mhits/s), +54 % (104 Mhit/s))
– Worst case
• synchronous logic works with 320 MHz only: 320 MHz * 5 = 1600 Mbit/s / 60 -> 26.7 Mhits/s (-19 % (132 Mhits/s), +3 % (104 Mhit/s))
• synchronous logic works with 240 MHz only: 240 MHz * 5 = 1200 Mbit/s / 60 -> 20 Mhits/s (-39 % (132 Mhits/s), -23 % (104 Mhit/s))
Implementation data transmission 60b
• Using GBT running at 26.66 MHz
• 26.66 MHz in -> 3.2 Gbit/s -> 53 Mwords/s (+61 % (132 Mhits/s); +105 % (104 Mhits/s))
• 3200 / 320 = 10
• Programmable divider: 10 (320)
[Diagram: 26.66 MHz -> PLL -> 3.2 GHz -> clock divider, 3.2 GHz]
• 53 MHz parallel_load (/60)
• Fifo read: 53 MHz (60) / 320 MHz (10) / 640 MHz (5!)
• Fifo write: 320 MHz (10) / 400 MHz (8) / 533.33 (6)
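The percentage margins quoted on the data transmission slides compare the word rate of one link (line rate divided by the 60 bit encoded word) with the hit rate of one quarter chip (chip rate divided by four). A minimal sketch of that comparison, with illustrative function names:

```python
def link_word_rate_mhz(line_rate_mbit_s: float, bits_per_word: int = 60) -> float:
    """Words per second (in MHz) that one serial link can carry."""
    return line_rate_mbit_s / bits_per_word

def margin_percent(line_rate_mbit_s: float, chip_rate_mhits_s: float,
                   quarter_chips: int = 4) -> float:
    """Link capacity relative to the hit rate of one quarter chip, in percent."""
    per_qchip = chip_rate_mhits_s / quarter_chips
    return (link_word_rate_mhz(line_rate_mbit_s) / per_qchip - 1.0) * 100.0

for line_rate in (3200, 2400, 1600, 1200):
    m132 = margin_percent(line_rate, 132)
    m104 = margin_percent(line_rate, 104)
    print(f"{line_rate} Mbit/s: {m132:+.0f} % (132 Mhits/s), {m104:+.0f} % (104 Mhits/s)")
```

The results reproduce the slide's figures to within one percentage point; the small differences come from the slides rounding the word rate before taking the ratio.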
• Which test pads for building blocks?
– TDC inputs.
• Can they be put in 2nd row? or on the side?
• How much space for EOC? 4.5 mm + pad row = 5 mm
• How much space of ASIC not under sensor minus corner / 73 µm * 2 is # test pads
Test pads
• divided PLL output on test pad
Chip assembly
• Global floor planning
• Placement of pixel matrix, TDC, EOC, pad ring, configuration, auxiliary blocks
• Power routing
• Global functionality simulation
• DRC, LVS
• Top level schematic
• Chip size compatibility with sensor, dicing, bump bonding
Block assembly
• Pixel matrix (Virtuoso)
– Pixel cell, in-pixel configuration, in-pixel DACs
• EOC blocks (Encounter)
– TDC, hitArbiter, FIFOreadout, quadConfiguration, chipConfiguration
• Global blocks (Virtuoso/Encounter, depending on competency)
– Serializer, IO ring, band gap, temperature
Verification sequence
• Test patterns
– From hit generator or
– From configuration pattern
• Individual blocks
– Behavioral/functional
– Layout DRC/LVS
– Timing back annotated, worst/best case (libraries)
• Local top level (i.e. TDC, FIFO read-out, full configuration)
– Full functional back annotated with test patterns
• Global top level (pixel matrix & digital & serializer)
– Full functional back annotated (digital) with test patterns & simulated configuration & HDL modeled analog front-end & HDL modeled DLL
• Functional simulation
• SEU simulation
– Mixed mode simulation on interface: transmission line & receiver & hitArbiter
– DRC/LVS (if possible full chip)
• Global system test bench (pattern generator, verification of data output, assertions)
Pixel cell & matrix
• Pixel cell
– Pre-amplifier, discriminator, transmission line driver
– In pixel DAC
– In pixel configuration
– Qualification
• analog: extraction, connectivity, crosstalk sensitivity
• config: functionality, connectivity
• Pixel matrix
– Top level schematic
– column layout
– transmission lines
– Transmission line receiver
• placement
• Translation to 1.7 OA
• Qualification
– extraction, simulation
– power routing
– test pulse routing
– biasing DACs
– bias routing
– configuration routing
– Bias monitoring & mux
– Qualification
• analog: extraction, connectivity, crosstalk sensitivity, power drop
• config: functionality, connectivity
Pixel cell & matrix
• Analog End-of-column
– Column DAC
– Column DAC control
• Temperature/radiation diodes
– ADC
– direct output
TDC
• Delay line
– Delay line, charge pump, loop filter
– State machine
– Qualification
• DLL, operation margins, startup, extraction
• Top level, including state machine
TDC
• TDC
– Floorplanning
– Delay line
– 32-5 encoder
• synthesis, layout, simulation
– fine hit registers
• Layout, simulation, qualification with routing effects
– coarse counter
• concept
• synthesis
• qualification
– hit arbiters & edge detector
• schematic, simulation, layout
• Qualification
– State machine
– placement, routing, interconnection bus
– Verification of power consumption
– power routing TDC & compatibility with pixel matrix/global power routing
– Qualification
• extraction, functionality, crosstalk, power routing, top level, mixed mode
– Top level schematics
– Functional simulation (startup & time tag)
– Timing simulation with hitArbiterController & FIFO controller & serial read-out controller
HitArbiter
• Test bench
• Remove demonstrator problems
– Double hits, varying delays, pileUp address
• Move to OA, 1.7
• Simulate back annotation with test bench, define efficiency
• Place/Route compatible with space and power routing
Configuration
• Global configuration master
• QuadConfiguration
• PixelConfiguration
– SEU simulation
– DLL & pixel cell functional verification with real configuration data
– Place & route (Encounter)
FIFO read-out
• read-out
– VHDL system level simulation, occupancy, definition of FIFO depths
– FIFO controller (SEU hard)
– FIFO
Task
• PLL & Serializer & driver
• Band Gap
• LVDS 500 Mbit/s driver / receiver, rad tolerant
• 200 µm pad opening on all pads
Pad library
• Pad modification for all pads required to have large bond pads.
• Special 70 µm LVDS pads?
LVDS pads
• Have never been tested or simulated in detail to higher than 200 MHz;
• Pads in demonstrator have a known radiation issue; for us with 100 krad should not be a problem
• New pads are going to be tested but are not faster; they have been optimized for below 200 MHz!
PLL & Serializer
• Use GBT as template
– 4 * serializer + 1 PLL @ 4.8 Gbit/s = 750 mW
• Use GBT only with 2.4 Gbit/s nominal
• Redesign clock divider
• Move from LM to DM
– Only power and capacitors on top 5 layers
• Change aspect ratio from 1 mm x 1 mm to 0.5 mm x 2 mm
• Separate PLL from serializer
• Implement 4 clock dividers (10/8/6/2 (Mux))
• Change SR length to 2*25
• Use only 2 Mux inputs
• Outputs are CML; are optical components compatible with CML? If not, find converters.
Pad ring
• Definition of power domains
• Break pad ring
• Connect to power stripes
• Implement elongated pads
Power domains
• VDDanalog1.2
– pixel matrix only
– consumption 50%: 1.6 W, 1.3 A, ≥ 13 pins
• VDDtdc1.2
– DLL, fine time registers
• VDDdigital1.2
– synthesized logic
– VDDtdc & VDDdigital consumption 50%: 1.6 W, 1.3 A, ≥ 13 pins
• VDDserializer1.2? 4 * 150 mA, min. 6 pads; Paulo: min. 3 pairs per serializer, min. 12 pairs
• VDDlvds2.5
– clkdll, serialConfigIn/Out, resetCoarseCnt
– 1 pin
• VDDlvdsmultiserial2.5
– 4 groups of 5 pads (should be physically grouped together)
– min. 2 pins.
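The pin counts above follow from the supply current of each domain. The sketch below recomputes the current from power and nominal voltage and divides it by the number of pins; the function names are illustrative, and the resulting per-pin figure is an observation from the slide's numbers rather than a stated design rule.

```python
def supply_current_a(power_w: float, voltage_v: float) -> float:
    """Supply current for a given power and supply voltage."""
    return power_w / voltage_v

def current_per_pin_ma(total_current_a: float, pins: int) -> float:
    """Average current per supply pin in mA."""
    if pins <= 0:
        raise ValueError("need at least one pin")
    return total_current_a * 1000.0 / pins

analog_a = supply_current_a(1.6, 1.2)           # VDDanalog1.2: 1.6 W at 1.2 V
analog_ma_per_pin = current_per_pin_ma(analog_a, 13)
serializer_ma_per_pad = current_per_pin_ma(4 * 0.150, 6)

print(f"VDDanalog1.2 : {analog_a:.2f} A, {analog_ma_per_pin:.0f} mA per pin")
print(f"VDDserializer: {serializer_ma_per_pad:.0f} mA per pad")
```

Both cases come out at roughly 100 mA per pin, which suggests that this is the per-pin budget behind the "≥ 13 pins" and "min. 6 pads" figures.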
TODO
hit Arbiter
• ✔ convert to 1.8
– all libraries moved to 1.8 directory structure
– ✔ comparison simulation in new environment with non-modified and non-ported logic gives same output statistics
• seems to have more setup and hold violations in asynchronous part?
– ✔ resynthesized and again placed and routed
• don’t touch only for hitArbiter but not for hitArbiterController -> additional buffers added -> verify
• routing blockage needed to be modified as min distance to placed cells seems to have changed in 1.8
• simulation of 1s gives similar results compared to 1.7 (see files /projects/IBM_CMOS8/gtk2010/V4.0/workAreas/akluge/log_hitArbiterAndContoller2010a/ncsim.log.V3.1000 and ncsim.log.1000.onlyreport)
– added *lib and time_model output reports
– DRC and LVS OK
• ✔ output driver stronger
• buffer 1 add testoutput from rx
• ✔ try implementing test inputs, 6th channel for TDC testing
• ✔ reset synchronizer flipflop
• ✔ put daq rdy through FF and possibly adapt to GL readout
– see next slide
• ✔ add daq_ready input to top, presently is connected to ‘1’ internally
• still 2 DRC errors -> will be investigated once fully embedded in GL TDC
• verify constraints.sdc (hold, GL comment)
• signoff: verify full flow output and rerun verilog
• 20120515 pins moved to outside of block
hitArbiterController2010a 20120215
• The parallel_load signal is activated for one clock cycle after the first rising edge of the clock after the hit, as it assumes that the hitRegisters (fine and coarse) have been loaded and are ready to be transferred after 1 clk cycle to the synchronisation registers.
• The lead/trail_edge_trigger signals send the hit trail/fall edges with constant latency and are reset upon parallel_load & rising edge clock; thus they are at least 1 clock cycle and at most 2 clk cycles long, if daq_ready is active.
• Lead/trail_edge_trigger is at least one clock cycle and up to two clk cycles long, but depending on clk phase relation can be active during one or two clock rising edges.
[Schematic: chain of set/reset D flip-flops clocked by clk_int (clk_ro_int = clk_int). Signals: hit_in = set_lead_edge_int; lead_edge_int = set_block; block_int = lead_edge_trigger = block_pileup; set_trail_edge_int; trail_edge_int = set_trail_edge_trigger_present; trail_edge_trigger_present = block_hit = trail_edge_trigger; reset_lead_trail_edge_int; parallel_load_requ; parallel_load; parallel_load_i_out; reset_pileup_int; reset_pileup_i; daq_ready (daq_rdy); reset; hit_int.]
Timing annotations:
• latency = ~0, pulse width = 1-2 clk cycles
• latency = 0-1 clk cycles, pulse width = 1-2 clk cycles
• latency = 1-2 clk cycles, pulse width = 1 clk cycle
single_hit_i_flush event: ‘1’ in normal encoded operation mode; ‘0’ in serial read-out mode and ‘1’ for one single clock cycle when the hit has been read out and a new hit can be accepted
new: not yet implemented, not yet simulated
hitArbiterController2010a 20120319
• Text bullets identical to the 20120215 version.
[Schematic: same flip-flop chain, with the flip-flops reset by reset_int; additional flip-flops clocked by clk_int generate reset_int from reset and daq_ready_int from daq_ready.]
Timing annotations:
• latency = ~0, pulse width = 1-2 clk cycles
• latency = 0-1 clk cycles, pulse width = 1-2 clk cycles
• latency = 1-2 clk cycles, pulse width = 1 clk cycle
single_hit_i_flush event: ‘1’ in normal encoded operation mode; ‘0’ in serial read-out mode and ‘1’ for one single clock cycle when the hit has been read out and a new hit can be accepted
hitArbiter2010a: test input
Red line -> can act as test input to TDC -> if only the red line is implemented, the resulting address code would be that of the last registered address.
How could the test pulse be routed there? How could it be enabled?
[Diagram labels: enable, test_pulse]
DLL state machine and lock detector
• ✔ implement cmd reset of DLL
– reset without register reset
• bit in config register is ORed with reset in the dll_state_machine2010_block. Signal is not connected to reset of configuration.
– verify that each DLL can be reset individually and possibly stage automatic power-on reset
• ✔ correct error in serial reg, reset should appear only upon read
– use GL register
• ✔ config synchronizer in DLL state machine, discuss what the latest decision is
• ✔ reset synchronizer flip-flop
• modify size of block
• modify position of I/O
• verify constraints.sdc, Lukas timing simulation & (hold, GL comment)
• signoff: redo simulation and check log files
qchip
• implement edge detector on falling edge of reset_coarse_count -> synchronisation to clk_sync
• route clk_sync, reset_coarse_count and reset_sync to double columns
• place Gianluca FIFO in functional model
• put PLL and serializer as functional/schematic model in simulation
• write automatic clk selector test bench
• link directly Karolina files in simulation and delete copy
• go through code and provide automatic test bench and documentation
• add mode bit to multiserial to switch off output drivers
• add clk_test_multiserial selector
• start synthesis and constraints (multi cycle clock)
• place & route
• adapt charge pump functional model parameters to Karolina design
• clk_enable_generator
– is dependent on clk_fifo; clk_enable can jitter forward and backward if clk_fifo_read’event is at the same time as clk_sync’event; add state machine and counter on clk_sync after scanning phase of clk_fifo_read
• connect reset on write pulse to reset? or individual data bit in config register for local reset
• verify constraints.sdc (hold, GL comment)
• gate clk_multi_serial and/or switch off multi_serial output drivers
qchip
• during synthesis some triplication has been merged; look for "merged" in log file
qchip
qchip
Please also note the first item in the list: that assumption is made for the correct de-assertion of the holdingData flag.
qchip
• delay of sel_mux = 3
• no acknowledge needed
• send enable_serial_time to each column -> 10 outputs
• verify that arbitrary phase in clock_divider creates no problem
top
• top level schematic
• reset generator
• clock distributor
• SLVS wire bond pads
• increasing wire bond pad size
• adapting IO for vdd gnd breaking
• temperature sensor
• band gap
Notes
• from here on notes and old block diagrams
Implementation data transmission 50b
• Using GBT running at 20 MHz, but modifying data shift length to 50
• Problem: GBT has 3 parallel multiplexed shift registers, 50/3 = 16.7 -> GBT needs to be modified to 2 SR of 25 bits each, first clock divider from 3 to 2, additional high-speed dividers
• 20 MHz in -> 2.4 Gbit/s -> 48 Mwords/s (+45 % (132 Mhits/s); +84 % (104 Mhits/s))
• 2400 / 320 = 7.5 (!) -> 2400 / 8 = 300 MHz
• Programmable divider: 10 (240) / 5! (480) / 50 (48) for synchronous read logic
• Programmable divider: 8 (300), 6 (400) for FIFO write and state machines
• Synchronous parallel read-FIFO frequency:
• serialFrequ * n / 50 [MHz] = 48 (1) / 96 (2) / 144 (3) / 192 / 240 (10) / 288 / 336 / 384 / 432 / 480 (5!)
• Fast counter:
• /2 = 1.2 GHz serial mux & shift
• /5 /2 = 240 MHz FIFO read
• /5 /2 = 240 MHz; /2 /4 = 300 MHz; /3 /2 = 400 MHz state machines, all FIFOs & chip FIFO write
[Block diagram: 20 MHz -> PLL -> 2.4 GHz -> clock divider. Labels: 1.2 GHz serial mux & shift; 48 MHz parallel_load (/50); 48 MHz (50) / 240 MHz (10) / 480 MHz (5!); 240 MHz (10) / 300 MHz (8) / 400 MHz (6); FIFO read; FIFO write.]
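The divider arithmetic on this slide can be reproduced with a few lines of Python. This is a minimal sketch: the 2400 MHz serial clock, the 50-bit word and the divider values come from the slide, while the function names are made up for this example.

```python
SERIAL_MHZ = 2400  # 2.4 GHz serial clock from the PLL (from the slide)


def divided_clock(divider, serial_mhz=SERIAL_MHZ):
    """Clock frequency in MHz after an integer divider."""
    return serial_mhz / divider


def integer_divider_for(target_mhz, serial_mhz=SERIAL_MHZ):
    """Integer divider producing target_mhz, or None if there is none."""
    if serial_mhz % target_mhz != 0:
        return None
    return serial_mhz // target_mhz


def read_fifo_frequencies(word_bits=50, max_n=10, serial_mhz=SERIAL_MHZ):
    """serialFrequ * n / word_bits for n = 1..max_n, in MHz."""
    return [serial_mhz * n // word_bits for n in range(1, max_n + 1)]


if __name__ == "__main__":
    for d in (5, 6, 8, 10, 50):
        print(f"/{d}: {divided_clock(d):.0f} MHz")
    print("divider for 320 MHz:", integer_divider_for(320))
    print("read-FIFO options:", read_fifo_frequencies())
```

The `None` returned for 320 MHz corresponds to the "2400 / 320 = 7.5 (!)" remark: the 320 MHz DLL clock cannot be derived from 2.4 GHz with an integer divider, whereas 300 MHz (divider 8) can.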
Implementation data transmission 50b; 5 pairs
• Multi Serial 50 bit:
– 50 bits (8b10b); 5 I/O pairs
– FIFO read frequency for 50 % contingency on 132 Mhits/s: 50 MHz / quarter chip * 50 bit / 5 pairs (10-bit serializer) = 500 MHz per LVDS pair; 2400 / 5 = 480 MHz
– Input frequency comes from PLL or from outside, either 2.4 Gbit/s on pad or 480 MHz for all pads & synchronous logic
– Worst case
• synchronous logic works with 320 MHz only: 320 MHz * 5 = 1600 Mbit/s / 50 = 32 Mhits/s (-4 % (132 Mhits/s); +23 % (104 Mhits/s))
• synchronous logic works with 240 MHz only: 240 MHz * 5 = 1200 Mbit/s / 50 = 24 Mhits/s (-27 % (132 Mhits/s); -8 % (104 Mhits/s))
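The hit-rate and margin figures above follow a simple pattern, sketched here in Python. The assumption, taken from the "132/4 = 33 Mwords/s" line in the notes, is that the chip-level hit rate is shared equally by four quarter-chip read-outs; the function names are invented for this illustration.

```python
def hit_rate_mhits(sync_mhz, pairs, word_bits):
    """Hits per second (in Mhits/s) one quarter chip can ship out."""
    return sync_mhz * pairs / word_bits


def margin_percent(rate_mhits, chip_demand_mhits, quarters=4):
    """Head-room of rate_mhits over the per-quarter-chip demand, in percent.

    Assumes the chip-level demand is split evenly over `quarters` read-outs.
    """
    demand = chip_demand_mhits / quarters
    return 100 * (rate_mhits / demand - 1)


if __name__ == "__main__":
    for sync_mhz in (320, 240):
        rate = hit_rate_mhits(sync_mhz, pairs=5, word_bits=50)
        print(sync_mhz, "MHz:", rate, "Mhits/s,",
              f"{margin_percent(rate, 132):+.1f} % @132,",
              f"{margin_percent(rate, 104):+.1f} % @104")
```

With these assumptions the 320 MHz case at 132 Mhits/s comes out at about -3 % rather than the -4 % quoted, so the slide may have used slightly different inputs; the other three margins agree after rounding.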
Implementation data transmission; 60 bit / 4 IO
• Multi Serial 60 bit:
– 60 bits (8b10b); 4 I/O pairs
– FIFO read frequency for 50 % contingency on 132 Mhits/s: 50 MHz / quarter chip * 60 bit / 4 pairs (10-bit serializer) = 750 MHz per LVDS pair; 2400 / 4 = 600 MHz
– Input frequency comes from PLL or from outside, either 2.4 Gbit/s on pad or 480 MHz for all pads & synchronous logic
– synchronous logic works with 600 MHz: 600 MHz * 4 = 2400 Mbit/s / 60 = 40 Mhits/s (+21 % (132 Mhits/s); +54 % (104 Mhits/s))
– Worst case
• synchronous logic works with 320 MHz only: 320 MHz * 4 = 1280 Mbit/s / 60 = 21.3 Mhits/s (-35 % (132 Mhits/s); -18 % (104 Mhits/s))
• synchronous logic works with 240 MHz only: 240 MHz * 4 = 960 Mbit/s / 60 = 16 Mhits/s (-52 % (132 Mhits/s); -38 % (104 Mhits/s))
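The comparison between the per-pair rate needed for a 50 MHz FIFO read and the rate available from a 2.4 Gbit/s budget can be written out as below. This is a sketch only; the numbers are those on the 50-bit / 5-pair and 60-bit / 4-pair slides, and the function names are hypothetical.

```python
def required_pair_mhz(read_mhz, word_bits, pairs):
    """Bit rate per LVDS pair needed to ship read_mhz words of word_bits."""
    return read_mhz * word_bits / pairs


def available_pair_mhz(total_mbit, pairs):
    """Bit rate per pair when total_mbit is spread evenly over the pairs."""
    return total_mbit / pairs


def achievable_read_mhz(total_mbit, word_bits):
    """Word (FIFO read) rate that total_mbit can actually sustain."""
    return total_mbit / word_bits


if __name__ == "__main__":
    for word_bits, pairs in ((50, 5), (60, 4)):
        need = required_pair_mhz(50, word_bits, pairs)
        have = available_pair_mhz(2400, pairs)
        words = achievable_read_mhz(2400, word_bits)
        print(f"{word_bits} bit / {pairs} pairs: need {need:.0f} MHz, "
              f"have {have:.0f} MHz, sustain {words:.0f} Mwords/s")
```

In both layouts the 50 MHz target (50 % contingency on 132 Mhits/s) slightly exceeds what 2.4 Gbit/s can carry, which is why the slides quote 48 and 40 Mwords/s respectively.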
Implementation data transmission 50b
• Using GBT running at 26.66 MHz
• 26.66 MHz in -> 3.2 Gbit/s -> 64 Mwords/s (+93 % (132 Mhits/s); +145 % (104 Mhits/s))
• 3200 / 320 = 10
• Programmable divider: 10 (320)
[Block diagram: 26.66 MHz -> PLL -> 3.2 GHz -> clock divider. Labels: 3.2 GHz; 64 MHz parallel_load (/50); 320 MHz (10) / 640 MHz (5!); 320 MHz (10) / 400 MHz (8) / 533.33 MHz (6); FIFO read; FIFO write.]
Implementation data transmission 50b
• If only 2 SR in serializer it will not run at 40 MHz
• Using GBT running at 40 MHz
• 40 MHz in -> 4.8 Gbit/s -> 96 Mwords/s (+190 % (132 Mhits/s); +270 % (104 Mhits/s))
• 4800 / 320 = 15; 2400 / 8 = 300 MHz
• Programmable divider: 10 (480) / 8 (600) / 6 (800) / 16 (300) / 12 (400) / [15 (320)]
[Block diagram: 40 MHz -> PLL -> 4.8 GHz -> clock divider. Labels: 4.8 GHz; 96 MHz parallel_load (/50); 480 MHz (10) / 640 MHz (5!); 480 MHz (10) / 600 MHz (8) / 800 MHz (6); 400 MHz (12) / 300 MHz (16); FIFO read; FIFO write.]
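The three GBT reference-clock options (20, 26.66 and 40 MHz) can be compared side by side. The sketch below assumes the serial rate is 120 times the reference clock, as implied by the "40 MHz in; 4.8 Gbit/s out" figure in these notes, and a 50-bit word; exact fractions are used so that 26.66 MHz is treated as 80/3 MHz. Function names are illustrative only.

```python
from fractions import Fraction

GBT_MULTIPLIER = 120  # assumed: 40 MHz in -> 4.8 Gbit/s out
WORD_BITS = 50        # 50-bit word per hit after 8b10b encoding


def serial_mbit(ref_mhz):
    """Serial bit rate in Mbit/s for a given reference clock."""
    return Fraction(ref_mhz) * GBT_MULTIPLIER


def word_rate_mwords(ref_mhz):
    """Words per second (Mwords/s) at that serial rate."""
    return serial_mbit(ref_mhz) / WORD_BITS


def divider_for(target_mhz, ref_mhz):
    """Serial-clock divider needed for target_mhz (may be fractional)."""
    return serial_mbit(ref_mhz) / target_mhz


if __name__ == "__main__":
    for ref in (20, Fraction(80, 3), 40):
        div = divider_for(320, ref)
        kind = "integer" if div.denominator == 1 else "non-integer"
        print(f"{float(ref):.2f} MHz: {float(serial_mbit(ref)):.0f} Mbit/s, "
              f"{float(word_rate_mwords(ref)):.0f} Mwords/s, "
              f"320 MHz divider = {float(div)} ({kind})")
```

Only the 20 MHz option needs a non-integer divider (7.5) to reach the 320 MHz DLL clock; the other two give 10 and 15.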
[Diagram labels: "Clock tree ÷r", repeated six times]
Notes on data transmission
• 1 GHz beam: 132 Mhits/s per chip
• 750 MHz: 105 Mhits/s per chip
• 132 Mhits/s * 40 bits = 5.28 Gbit/s
• 4 serializers: 5.28 / 4 = 1.32 Gbit/s; 132 / 4 = 33 Mwords/s
• 8b10b: 1.32 * 10 / 8 = 1.65 Gbit/s; 132 / 4 = 33 Mwords/s
• +20 % contingency: 1.65 * 1.2 = 1.98 Gbit/s; 132 * 1.2 / 4 = 39.6 Mwords/s
• = 51 % contingency for 750 MHz & 105 Mhits/s
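The bandwidth budget above can be retraced step by step. A minimal sketch, using only the numbers from the list (132 and 105 Mhits/s, 40 bits per hit, 4 serializers, 8b10b overhead, 20 % contingency); the function and key names are chosen for this example.

```python
def link_budget(chip_mhits=132.0, bits_per_hit=40, links=4, contingency=0.20):
    """Per-link bandwidth budget following the steps on the slide."""
    raw_gbit = chip_mhits * bits_per_hit / 1000.0   # whole chip, unencoded
    per_link_gbit = raw_gbit / links                # split over serializers
    encoded_gbit = per_link_gbit * 10 / 8           # 8b10b overhead
    budget_gbit = encoded_gbit * (1 + contingency)  # add contingency
    budget_mwords = chip_mhits * (1 + contingency) / links
    return {
        "raw_gbit": raw_gbit,
        "per_link_gbit": per_link_gbit,
        "encoded_gbit": encoded_gbit,
        "budget_gbit": budget_gbit,
        "budget_mwords": budget_mwords,
    }


def contingency_at(chip_mhits, budget_mwords, links=4):
    """Contingency the same budget gives at a different chip hit rate."""
    return budget_mwords * links / chip_mhits - 1


if __name__ == "__main__":
    budget = link_budget()
    for name, value in budget.items():
        print(f"{name}: {value:.2f}")
    extra = contingency_at(105, budget["budget_mwords"])
    print(f"contingency at 105 Mhits/s: {100 * extra:.0f} %")
```

The last line reproduces the 51 % figure: a budget sized for 132 Mhits/s plus 20 % leaves about half again as much head-room at 105 Mhits/s.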
• Approach with two clock domains for last FIFO stage
• 320 MHz * 8 = 2.56 Gbit/s
• FIFO read frequency: 2560 / 50 = 51.2 MHz
• 320 / 51.2 = 6.25 (no integer) -> FIFO read cannot run on 320 MHz clock
• 2nd clock needed to read last FIFO; if so, then serial frequency = read_frequency * 50
• 2.56 Gbit/s is arbitrarily chosen
• Clock_dll = 320 MHz, clock_digital = 320 MHz, clock_serial = 2.56 GHz with division by 50.
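The integer-ratio argument for the second clock domain can be checked directly. This sketch uses exact fractions and the 50-bit word from the notes; the function names are hypothetical.

```python
from fractions import Fraction


def fifo_read_mhz(serial_mbit, word_bits=50):
    """Read frequency of the last FIFO: one word per word_bits serial bits."""
    return Fraction(serial_mbit, word_bits)


def shares_clock_domain(digital_mhz, serial_mbit, word_bits=50):
    """True if the read frequency is an integer division of digital_mhz."""
    ratio = Fraction(digital_mhz) / fifo_read_mhz(serial_mbit, word_bits)
    return ratio.denominator == 1


if __name__ == "__main__":
    for serial in (2560, 3200):
        read = fifo_read_mhz(serial)
        print(f"{serial} Mbit/s: read at {float(read)} MHz, "
              f"320 / read = {float(Fraction(320) / read)}, "
              f"single domain: {shares_clock_domain(320, serial)}")
```

At 2.56 Gbit/s the ratio is 6.25, so the last FIFO needs its own read clock; at 3.2 Gbit/s the ratio is 5 and everything can stay on the 320 MHz clock.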
Notes on data transmission
• Last FIFO read & write clock different
[Block diagram: 320 MHz or any -> PLL -> 2.56 GHz (1.28 GHz) -> clock divider. Labels: 2.56 GHz (1.28 GHz); 51.2 MHz parallel_load; "If possible 320 MHz but not required".]
Notes on data transmission
• All blocks on 320 MHz
• 3.2 Gbit/s -> 64 Mwords/s (+93 % (132 Mhits/s); +150 % (104 Mhits/s))
[Block diagram: 320 MHz -> PLL -> 3.2 GHz -> clock divider. Labels: 3.2 GHz; 64 MHz parallel_load; "If possible 320 MHz but not required".]
Notes on data transmission
• Parallel out: max 4 x 2 pins per quarter chip (40/4 = 10)
• Data without 8b10b decoding
• 320M / 40 * 4 = 32 Mwords/s (+23 %; 104 Mhits/s)
• 450M / 40 * 4 = 45 Mwords/s (+73 %; 104 Mhits/s)
• 480M / 40 * 4 = 48 Mwords/s (+84 %; 104 Mhits/s; +45 %; 132 Mhits/s)
• 320 MHz clock domain compatible with 320M / 355M / 400 / 457 / 533, otherwise read frequency of last FIFO is different from 320 MHz -> two clock domains.
• With 8b10b decoding: 4 IO is inconvenient, either 5, same data rate as above, or
• Unbalanced transmission (50/4 = 12.5)
• @ 320 MHz: 1.28 Gbit/s; 320M / 50 * 4 = 25.6 Mwords/s (-2 %; 104 Mhits/s)
• @ 450 MHz: 1.8 Gbit/s; 450M / 50 * 4 = 36 Mwords/s (+38 %; 104 Mhits/s)
• 2 clock domains at last FIFO required
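One reading of the "compatible with 320M / 355M / 400 / 457 / 533" remark is that the word rate (pin frequency * 4 pairs / 40 bits) must be an integer division of the 320 MHz clock, which gives pin frequencies of 3200 / n MHz. The sketch below works under that assumption, which is an interpretation rather than something the notes state; the function names are invented.

```python
from fractions import Fraction


def word_rate_mwords(pin_mhz, pairs=4, word_bits=40):
    """Words per second (Mwords/s) shipped over `pairs` parallel pairs."""
    return Fraction(pin_mhz) * pairs / word_bits


def fits_single_domain(pin_mhz, domain_mhz=320, pairs=4, word_bits=40):
    """True if domain_mhz divided by the word rate is an integer."""
    ratio = Fraction(domain_mhz) / word_rate_mwords(pin_mhz, pairs, word_bits)
    return ratio.denominator == 1


def compatible_pin_frequencies(domain_mhz=320, pairs=4, word_bits=40,
                               dividers=(10, 9, 8, 7, 6)):
    """Pin frequencies whose word rate is domain_mhz / n for each n."""
    return [Fraction(domain_mhz * word_bits, pairs * n) for n in dividers]


if __name__ == "__main__":
    print([round(float(f), 1) for f in compatible_pin_frequencies()])
    for pin in (320, 400, 450, 480):
        print(pin, float(word_rate_mwords(pin)), fits_single_domain(pin))
```

Under this reading 450 MHz and 480 MHz pin clocks give 45 and 48 Mwords/s, neither of which divides 320 MHz evenly, so they would need the second clock domain at the last FIFO.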
Notes on data transmission
• GBT:
• 40 MHz in; 4.8 Gbit/s out, stream 120 bits.
• Block is 1 mm x 1 mm -> aspect ratio not good for us.
• 4 serializers + 1 PLL = 750 mW @ 40 MHz
• If used like it is:
• Running at 20 MHz gives 2.4 Gbit/s
• Our 50 bits data stream needs to be reformatted to 120 bits.
• Top level metals contain power and capacitors -> move to LM seems possible.
Data transmission
• Using GBT running at 20 MHz with 120 bit serializer word length
• Needs a demultiplexer 5 * 40 bits to get from 40 bits words to 100 before or after FIFO and then 8b10b encoding to 120 bit, additional control needed
• 20 MHz in -> 2.4 Gbit/s -> 48 Mwords/s (+45 % (132 Mhits/s); +84 % (104 Mhits/s))
• 2400 / 320 = 7.5 (!) -> 2400 / 8 = 300 MHz
• Programmable divider: 10 (240) / 5! (480) for synchronous logic
• Synchronous parallel read-FIFO frequency:
• serialFrequ * n / 120 [MHz] = 20 (1) / 40 (2) / 60 (3) / 80 / 100 / 120 / 140 / 160 / 180 / 200 / 220 / 240 / … 300 / 400 / 480 (5!) / ..
[Block diagram: 20 MHz -> PLL -> 2.4 GHz -> clock divider. Labels: 2.4 GHz; 20 MHz parallel_load (/120); 240 MHz / 480 MHz.]
I have another question about the clocks sent to the GTK: in an earlier talk about this subject we had agreed on sending, by means of optical links, the "high quality" clock for the GTK TDCs and the "digital clock" for the serializers. I found PLLs from IDT which have ps jitter and would do the job of redriving the 40 MHz clock, multiplied when needed, to the GTK ASIC very well. But the peak jitter figures of the optical transceivers, for instance of the Finisar 4.2 Gbps which I was thinking of using, are in the range of tens of ps; even 120 ps for the Zarlink 2.5 Gbps. It is not clear how many sigmas they use to define the maximum. Should we worry?