Part 7 Special-Purpose Subsystems Lecture 25 : Clock Subsystems
IEE5043 Spring 2016 數位積體電路 Digital Integrated Circuits
Professor Wei Hwang ( 黃 威 教授 ) Department of Electronics Engineering National Chiao Tung University [email protected]
Topics covered in a typical DIC Course
3
Outline
  Introduction
  Synchronous Systems
    Timing Issues
    Clock generation
      Phase-Locked Loops (PLL)
      Delay-Locked Loops (DLL)
    Clock distribution
  Asynchronous Systems
  Summary
4
Issues in Timing
Source from : McLaughlin
5
Issues in Timing
6
Some Definitions
  Signals that can only transition at predetermined times with respect to a clock signal are called {syn, meso, plesio}chronous.
  An asynchronous signal can transition at any arbitrary time.
7
Some Definitions (contd)
  Synchronous signal: exactly the same frequency as the local clock, with a fixed phase offset to that clock.
  Mesochronous signal: exactly the same frequency as the local clock, but an unknown phase offset.
  Plesiochronous signal: frequency nominally the same as the local clock, but slightly different.
  Mesochronous and plesiochronous concepts are very useful for designing systems with long interconnections and/or multiple clock domains.
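The three definitions can be read as a small decision procedure. The sketch below is only an illustration: the `Signal` record, the `classify` function, and the 200 ppm tolerance used to decide "nominally the same frequency" are hypothetical choices, not taken from the lecture. Anything outside the tolerance is simply treated as asynchronous with respect to the local clock, which is also an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Signal:
    freq_hz: float                    # transition frequency of the incoming signal
    phase_offset_s: Optional[float]   # fixed, known offset to the local clock, or None if unknown


def classify(sig: Signal, local_freq_hz: float, ppm_tolerance: float = 200.0) -> str:
    """Classify a signal relative to the local clock (ppm_tolerance is a hypothetical threshold)."""
    if sig.freq_hz == local_freq_hz:
        # Same frequency: a known fixed phase makes it synchronous, otherwise mesochronous.
        return "synchronous" if sig.phase_offset_s is not None else "mesochronous"
    rel_ppm = abs(sig.freq_hz - local_freq_hz) / local_freq_hz * 1e6
    if rel_ppm <= ppm_tolerance:
        # Nominally the same frequency, but slightly different.
        return "plesiochronous"
    # Assumption: no usable frequency relation, so treat it as asynchronous.
    return "asynchronous"


local = 1e9
kinds = [
    classify(Signal(1e9, 0.0), local),
    classify(Signal(1e9, None), local),
    classify(Signal(1e9 * (1 + 50e-6), None), local),
    classify(Signal(2e9, None), local),
]
```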
8
Mesochronous Interconnect
9
Mesochronous Communication
10
Plesiochronous Communication
Deals only marginally with fast variations in data delay
11
Synchronous Pipelined Datapath
12
Anisochronous Interconnect
13
The Timing Uncertainty
Positive and Negative Clock Skew
14
Data Structure with feedback
15
Clock Subsystem
Skew and Jitter in Synchronous Clock distribution
17
18
Clocked Elements 1 – Master-Slave latch
19
Clocked Elements 2 – Pulsed-Clock latch
20
Clocked Elements 3 – Sense-Amp Flip-Flop
22
Clock Generation
  Low frequency:
    Buffer the input clock and drive it to all registers
  High frequency:
    Buffer delay introduces large skew relative to the input clock
      Makes it difficult to sample input data
    Distributing a very fast clock on a PCB is hard
23
Zero-Delay Buffer
If the periodic clock is delayed by Tc, it is indistinguishable from the original clock
Build feedback system to guarantee this delay
Phase-Locked Loop (PLL)
Delay-Locked Loop (DLL)
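The zero-delay idea rests on periodicity: a clock delayed by exactly one period Tc looks the same as the original. A minimal sketch, assuming an ideal 50% duty-cycle clock and integer picosecond timestamps (the function names and the 1000 ps period are hypothetical):

```python
def clock(t_ps: int, period_ps: int) -> int:
    """Ideal 50% duty-cycle clock: 1 during the first half of each period, else 0."""
    return 1 if (t_ps % period_ps) < period_ps // 2 else 0


def delayed_clock(t_ps: int, period_ps: int, delay_ps: int) -> int:
    """The same clock observed after a buffer that adds delay_ps of delay."""
    return clock(t_ps - delay_ps, period_ps)


PERIOD_PS = 1000                   # hypothetical clock period
samples = range(0, 5000, 25)       # sample instants over five periods

# A delay of exactly one period is indistinguishable from no delay at all.
full_period_matches = all(
    delayed_clock(t, PERIOD_PS, PERIOD_PS) == clock(t, PERIOD_PS) for t in samples
)
# Any other delay (here 300 ps) shows up as skew.
partial_delay_matches = all(
    delayed_clock(t, PERIOD_PS, 300) == clock(t, PERIOD_PS) for t in samples
)
```

A PLL or DLL is the feedback loop that holds the total buffer delay at this one-period point despite process, voltage, and temperature variation.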
24
Phase-Locked Loop (PLL)
System
Linear Model
Why PLL?
  The PLL is widely used in consumer, computer, and communication applications:
    Frequency synthesis
    Clock/data recovery
    Clock de-skewing
    Demodulator
    ...
26
PLL-Based Synchronization
Charge Pump PLL (analog)
27
DPLL Architecture (mixed mode)
28
PFD Circuit
29
DCO Circuit
30
31
All-Digital PLL (ADPLL)
32
Detailed Block Diagram of ADPLL
33
PLL Versions
The advantages and disadvantages of different types of PLL

                        LPLL     DPLL         ADPLL
Design methodology      Analog   Mixed mode   Digital
Design cycle            Slow     Slow         Fast
Noise rejection         Poor     Poor         Good
Output frequency        High     High         Low
Oscillator resolution   High     High         Low
Lock cycle              Slow     Slow         Quick
Power                   Large    Large        Small
Area                    Large    Large        Design dependent
35
Delay-Locked Loop
  Delays the input clock rather than creating a new clock with an oscillator
    Cannot perform frequency multiplication
    More stable and easier to design
      1st order rather than 2nd
  State variable is now time (T)
    Locks when loop delay is exactly Tc
    Deviations of ∆T from locked value
36
Delay-Locked Loop (DLL)
System
Linear Model
37
Delay-Locked Loop (DLL)
Phase detector gain:
$$K_{pd} = \frac{I_{pd}(s)}{\Phi_{err}(s)} = \frac{I_{cp}}{2\pi}$$
38
39
40
DLL Loop Dynamics
  Closed-loop transfer function of the DLL:
$$H(s) = \frac{\Delta T_{out}(s)}{\Delta T_{in}(s)} = \frac{1}{1 + s\tau}$$
  This is a first-order system
  τ is the time constant (inverse of bandwidth):
$$\tau = \frac{1}{K_{pd} K_I K_{vcdl}} = \frac{C\,T_c}{I_{cp} K_{vcdl}}$$
    Choose at least 10 Tc for the continuous-time approximation
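As a rough numerical check of the loop dynamics, the sketch below evaluates the time constant τ = C·Tc / (Icp·Kvcdl) and the step response of the first-order transfer function H(s) = 1/(1 + sτ). All component values (1 ns period, 1 pF loop capacitor, 20 µA charge-pump current, 500 ps/V delay-line gain) are hypothetical and chosen only so that τ lands comfortably above 10 Tc.

```python
import math


def dll_time_constant(c_farads: float, t_c: float, i_cp: float, k_vcdl: float) -> float:
    """tau = C * Tc / (Icp * Kvcdl), with Kvcdl in seconds of delay per volt."""
    return c_farads * t_c / (i_cp * k_vcdl)


def step_response(tau: float, t: float) -> float:
    """Fraction of a step in delta-T_in that has appeared at delta-T_out after time t."""
    return 1.0 - math.exp(-t / tau)


# Hypothetical design values, for illustration only.
T_C = 1e-9         # clock period: 1 ns
C_LOOP = 1e-12     # loop filter capacitor: 1 pF
I_CP = 20e-6       # charge-pump current: 20 uA
K_VCDL = 500e-12   # voltage-controlled delay line gain: 500 ps/V

tau = dll_time_constant(C_LOOP, T_C, I_CP, K_VCDL)
cycles_per_tau = tau / T_C                   # how many clock periods one time constant spans
meets_guideline = tau >= 10 * T_C            # the "at least 10 Tc" rule of thumb
settled_after_one_tau = step_response(tau, tau)
```

With these assumed values the time constant spans on the order of a hundred clock periods, so treating the sampled loop as continuous-time is reasonable.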
All Digital Multiphase Delay-Locked Loop (AD-MDLL)
41
ADMDLL Architecture
42
43
Phase Alignment
If the DDR clock is aligned to the transmitted clock, it must be shifted by 90º before sampling
Use PLL
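The 90º figure is a quarter of the clock period. In a double-data-rate link each bit lasts half a period, so a quarter-period shift places the sampling edge in the middle of the bit. A minimal arithmetic sketch, with a hypothetical 2000 ps clock period and hypothetical function names:

```python
def quarter_period_delay_ps(period_ps: int) -> int:
    """Delay that corresponds to a 90-degree phase shift of the clock."""
    return period_ps // 4


def ddr_bit_time_ps(period_ps: int) -> int:
    """DDR transfers one bit per clock edge, so each bit lasts half a period."""
    return period_ps // 2


PERIOD_PS = 2000                                   # hypothetical clock period
shift_ps = quarter_period_delay_ps(PERIOD_PS)      # 90-degree shift
bit_center_ps = ddr_bit_time_ps(PERIOD_PS) // 2    # middle of one data bit
```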
44
Mesochronous Clocking
  As speeds increase, it is difficult to keep clock and data aligned
    Mismatches in trace lengths
    Mismatches in propagation speeds
    Differences in clock vs. data drivers
  Mesochronous: clock and data have the same frequency but unknown phase
    Use PLL/DLL to realign the clock to each data channel
45
Phase Calibration Loop
Special phase detector compares clock & data phase
Clock Distribution
Clock Tree Distribution
Central Clock Spine Distribution
49
Clocking Architectures
50
Clock Distribution
  On a small chip, the clock distribution network is just a wire
    And possibly an inverter for clkb
  On practical chips, the RC delay of the wire resistance and gate load is very long
    Variations in this delay cause the clock to reach different elements at different times
    This is called clock skew
  Most chips use repeaters to buffer the clock and equalize the delay
    Reduces but doesn't eliminate skew
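To see why unbuffered wire delay turns into skew, the sketch below uses the Elmore estimate for a distributed RC wire driving a lumped load, R_wire·(C_wire/2 + C_load). This model is a swapped-in first-order approximation, not something taken from the slides, and the per-millimetre resistance and capacitance and the two branch lengths are hypothetical.

```python
import math


def elmore_delay(length_mm: float, r_per_mm: float, c_per_mm: float, c_load: float) -> float:
    """Elmore delay (seconds) of a distributed RC wire driving a lumped load capacitance."""
    r_wire = r_per_mm * length_mm
    c_wire = c_per_mm * length_mm
    return r_wire * (c_wire / 2 + c_load)


# Hypothetical wire and load parameters, for illustration only.
R_PER_MM = 100.0      # ohms per mm
C_PER_MM = 0.2e-12    # farads per mm
C_LOAD = 0.4e-12      # farads of gate load at the end of each branch

near = elmore_delay(3.0, R_PER_MM, C_PER_MM, C_LOAD)   # shorter branch
far = elmore_delay(3.5, R_PER_MM, C_PER_MM, C_LOAD)    # longer branch
skew = far - near                                      # arrival-time difference

# With no load, delay grows with the square of length, which is why long wires need repeaters.
ratio = elmore_delay(6.0, R_PER_MM, C_PER_MM, 0.0) / elmore_delay(3.0, R_PER_MM, C_PER_MM, 0.0)
```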
51
Example
  Skew comes from differences in gate and wire delay
    With the right buffer sizing, clk1 and clk2 could ideally arrive at the same time
    But power supply noise changes buffer delays
    clk2 and clk3 will always see RC skew
  [Figure: gclk buffered to clk1, clk2, and clk3; labeled wire lengths 3 mm, 3.1 mm, and 0.5 mm; labeled loads 1.3 pF, 0.4 pF, and 0.4 pF]
52
Review: Skew Impact
  [Figure: flip-flops F1 and F2 clocked by clk with combinational logic (CL) between Q1 and D2; timing diagrams over the cycle Tc mark tpcq, tpdq, tsetup, tskew, tccq, tcd, and thold]
$$t_{pd} \le T_c - \underbrace{\left(t_{pcq} + t_{setup} + t_{skew}\right)}_{\text{sequencing overhead}}$$
$$t_{cd} \ge t_{hold} - t_{ccq} + t_{skew}$$
  Ideally the full cycle is available for work
  Skew adds sequencing overhead
  Increases hold time too
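The two flip-flop constraints, t_pd ≤ Tc − (t_pcq + t_setup + t_skew) and t_cd ≥ t_hold − t_ccq + t_skew, can be evaluated directly. The sketch below does so with and without skew; every timing value is a hypothetical number in picoseconds, picked only to make the effect visible.

```python
def max_logic_delay(t_c: int, t_pcq: int, t_setup: int, t_skew: int) -> int:
    """Largest propagation delay the logic may have (setup constraint)."""
    return t_c - (t_pcq + t_setup + t_skew)


def min_contamination_delay(t_hold: int, t_ccq: int, t_skew: int) -> int:
    """Smallest contamination delay the logic must have (hold constraint)."""
    return t_hold - t_ccq + t_skew


# Hypothetical flip-flop and clock parameters, in picoseconds.
T_C, T_PCQ, T_SETUP = 1000, 80, 50
T_HOLD, T_CCQ = 40, 30
T_SKEW = 60

tpd_no_skew = max_logic_delay(T_C, T_PCQ, T_SETUP, 0)
tpd_with_skew = max_logic_delay(T_C, T_PCQ, T_SETUP, T_SKEW)
tcd_no_skew = min_contamination_delay(T_HOLD, T_CCQ, 0)
tcd_with_skew = min_contamination_delay(T_HOLD, T_CCQ, T_SKEW)
```

Skew hurts both ways: it is subtracted from the time available for logic and added to the minimum delay needed to avoid a hold violation.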
53
Solutions
  Reduce clock skew
    Careful clock distribution network design
    Plenty of metal wiring resources
  Analyze clock skew
    Only budget actual, not worst-case, skews
    Local vs. global skew budgets
  Tolerate clock skew
    Choose circuit structures insensitive to skew
54
Clock Distribution Networks
  Ad hoc
  Grids
  H-tree
  Hybrid
55
Clock Grids
  Use grid on two or more levels to carry clock
  Make wires wide to reduce RC delay
  Ensures low skew between nearby points
  But possibly large skew across die
56
Alpha Clock Grids
57
H-Trees
  Fractal structure
    Gets clock arbitrarily close to any point
    Matched delay along all paths
  Delay variations cause skew
  A and B might see big skew
  [Figure: H-tree with points A and B marked]
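The fractal structure and the matched path lengths can be checked with a short recursive construction. This is a geometric sketch only: the function name, the starting half-width of 4.0 (arbitrary units), and the three levels are hypothetical, and equal wire length stands in for the matched delay of the ideal, variation-free case.

```python
def h_tree_leaves(x, y, half, levels, path=0.0):
    """Return (x, y, wire_length_from_root) for every leaf of an H-tree centered at (x, y)."""
    if levels == 0:
        return [(x, y, path)]
    leaves = []
    for dx in (-half, half):        # run along the horizontal bar of the H
        for dy in (-half, half):    # then along one of the vertical bars
            leaves += h_tree_leaves(
                x + dx, y + dy, half / 2, levels - 1, path + abs(dx) + abs(dy)
            )
    return leaves


leaves = h_tree_leaves(0.0, 0.0, 4.0, levels=3)
leaf_count = len(leaves)                               # 4 leaves per level, so 4**3
distinct_points = {(lx, ly) for lx, ly, _ in leaves}   # leaves tile the area without overlap
path_lengths = {length for _, _, length in leaves}     # one value if all paths match
```

Every leaf sees the same wire length, so any skew between two leaves such as A and B comes from delay variation along their separate branches, not from the nominal geometry.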
58
Itanium 2 H-Tree
  Four levels of buffering:
    Primary driver
    Repeater
    Second-level clock buffer
    Gater
  Route around obstructions
59
Hybrid Networks
  Use H-tree to distribute clock to many points
  Tie these points together with a grid
  Ex: IBM Power4, PowerPC
    H-tree drives 16-64 sector buffers
    Buffers drive a total of 1024 points
    All points shorted together with grid
61
Self-timed and Asynchronous Design
  Functions of the clock in synchronous design
    Acts as completion signal
    Ensures the correct ordering of events
  Truly asynchronous design
    Ordering of events is implicit in logic
    Completion is ensured by careful timing analysis
  Self-timed design
    Completion ensured by completion signal
    Ordering imposed by handshaking protocol
62
Synchronous Datapath
Source from : McLaughlin
63
Self-Timed Pipelined Datapath
Source from : McLaughlin
64
Hand-Shaking Protocol
65
Event Logic – The Muller-C Element
66
2-Phase Handshake Protocol
  Advantage: fast, with a minimal number of signaling events (important for global interconnect)
  Disadvantage: edge-sensitive, has state
67
Example: Self-timed FIFO
  All 1s or 0s -> pipeline empty
  Alternating 1s and 0s -> pipeline full
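The empty and full patterns can be reproduced with a behavioral model of the control path. The sketch below assumes a micropipeline-style chain in which each stage is a Muller C-element fed by the previous stage's output and the inverted output of the next stage; the slide's actual circuit is not available, so this structure, the four-stage depth, and all function names are assumptions.

```python
def c_element(a: int, b: int, prev: int) -> int:
    """Muller C-element: the output follows the inputs when they agree, otherwise it holds."""
    return a if a == b else prev


def settle(stages, req_in: int, ack_out: int):
    """Re-evaluate the C-element chain until no output changes (2-phase signaling)."""
    c = list(stages)
    changed = True
    while changed:
        changed = False
        for i in range(len(c)):
            left = req_in if i == 0 else c[i - 1]              # request from the previous stage
            right = ack_out if i == len(c) - 1 else c[i + 1]   # acknowledge from the next stage
            new = c_element(left, 1 - right, c[i])
            if new != c[i]:
                c[i] = new
                changed = True
    return c


def occupancy(stages, ack_out: int) -> int:
    """Number of stored tokens: one per place where neighboring control values differ."""
    seq = list(stages) + [ack_out]
    return sum(1 for a, b in zip(seq, seq[1:]) if a != b)


state, req = [0, 0, 0, 0], 0       # all 0s: empty pipeline
for _ in range(4):                 # producer toggles its request four times
    req ^= 1
    state = settle(state, req, ack_out=0)   # consumer never acknowledges
```

After four pushes with no acknowledgment the stage outputs alternate, and a fifth toggle of the request leaves the first stage unchanged, which is how the producer sees that the FIFO is full.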
68
2-Phase Protocol
69
Example
70
Example
71
Example
72
Example
73
4-Phase Handshake Protocol
Source from : McLaughlin
74
4-Phase Handshake Protocol
75
Self-Resetting Logic
76
Asynchronous-Synchronous Interface
77
Synchronizers and Arbiters
  Arbiter: circuit that decides which of 2 events occurred first
  Synchronizer: arbiter with clock φ as one of the inputs
  Problem: the circuit HAS to make a decision in limited time; which decision is not important
  Caveat: it is impossible to ensure correct operation
  But we can decrease the error probability at the expense of delay
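The trade between error probability and delay is usually quantified with the standard synchronizer mean-time-between-failures model, MTBF = exp(t_wait/τ) / (T0 · f_clk · f_data). This formula is a commonly used model brought in here for illustration, not one shown on the slides, and the regeneration time constant, window parameter, and frequencies below are hypothetical.

```python
import math


def synchronizer_mtbf(t_wait: float, tau: float, t0: float, f_clk: float, f_data: float) -> float:
    """Mean time between synchronization failures, in seconds.

    t_wait: time allowed for metastability to resolve before the output is used
    tau:    regeneration time constant of the synchronizing latch
    t0:     width of the vulnerable window around the sampling edge
    """
    return math.exp(t_wait / tau) / (t0 * f_clk * f_data)


# Hypothetical parameters, for illustration only.
TAU = 20e-12       # 20 ps regeneration time constant
T0 = 20e-12        # 20 ps vulnerable window
F_CLK = 1e9        # 1 GHz sampling clock
F_DATA = 100e6     # 100 MHz average rate of asynchronous input transitions
T_CYCLE = 1.0 / F_CLK

one_cycle = synchronizer_mtbf(1 * T_CYCLE, TAU, T0, F_CLK, F_DATA)
two_cycles = synchronizer_mtbf(2 * T_CYCLE, TAU, T0, F_CLK, F_DATA)
improvement = two_cycles / one_cycle   # gain from waiting one extra cycle
```

Each extra cycle of waiting multiplies the MTBF by exp(T_cycle/τ), so the failure rate can be made very small but never zero, at the cost of added latency.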