clock buffers

Post on 29-Nov-2015

Introduction to Digital VLSI Design

Clock

Lecturer: Gil Rahav

Semester B, EE Dept., BGU.

Freescale Semiconductors Israel

Reference Signal in Sequential Elements.

o Most critical signal in synchronous designs

o Reference for all the timing measurements

o Single clock signal serves multiple flops/latches

Clock tree concepts

For a successful signal launch and capture (T = clock period):

i) Tck-q(FF1) + Tdly3 ≤ T + Skew - Tsetup(FF2)

ii) Tck-q(FF1) + Tdly3 ≥ Skew + Thold(FF2)

Clock Insertion Delay = Tdly1, Tdly2

Clock Skew = Tdly2 - Tdly1
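Conditions (i) and (ii) can be checked numerically. The sketch below is a minimal illustration, assuming Tdly1 and Tdly2 are the clock insertion delays to FF1 and FF2 and Tdly3 is the data-path delay between them. The function name and all numbers are hypothetical and are not taken from the slides.

```python
def check_launch_capture(t_ckq_ff1, t_dly3, t_setup_ff2, t_hold_ff2,
                         t_dly1, t_dly2, period):
    """Evaluate conditions (i) and (ii) for one FF1 -> FF2 path.

    t_dly1 / t_dly2: clock insertion delays to FF1 / FF2 (assumed meaning).
    t_dly3: data-path delay from FF1 to FF2. All values share one time unit.
    """
    skew = t_dly2 - t_dly1                     # Clock Skew = Tdly2 - Tdly1
    data_arrival = t_ckq_ff1 + t_dly3          # left-hand side of (i) and (ii)
    setup_slack = (period + skew - t_setup_ff2) - data_arrival  # (i) holds if >= 0
    hold_slack = data_arrival - (skew + t_hold_ff2)             # (ii) holds if >= 0
    return {
        "skew": skew,
        "setup_slack": setup_slack,
        "hold_slack": hold_slack,
        "ok": setup_slack >= 0 and hold_slack >= 0,
    }


# Illustrative values in picoseconds (hypothetical).
result = check_launch_capture(t_ckq_ff1=100, t_dly3=600, t_setup_ff2=50,
                              t_hold_ff2=30, t_dly1=400, t_dly2=450,
                              period=1000)
print(result)
```

A negative slack in either entry means the corresponding condition fails for that path.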

Clock Tree Concepts

Rise Skew :- Skew calculated based on Rise edge at Clock Root

Fall Skew :- Skew calculated based on Fall edge at Clock Root

Triggering edge Skew :- Skew calculated based on arrival times of the active edge at the clock pins

Transition time :- The time a signal takes to transition from 20% to 80% of its full value.
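The three skew definitions can be sketched from per-pin arrival times. This example assumes skew is reported as the latest arrival minus the earliest arrival across all clock pins, and that each pin's triggering edge has the same polarity as the root edge that produced it (no inversion in the tree). The pin names and arrival times are hypothetical.

```python
def skew(arrivals):
    """Skew across clock pins: latest arrival minus earliest arrival."""
    return max(arrivals.values()) - min(arrivals.values())


# Hypothetical arrival times (ps) at three clock pins, measured from the root.
rise_arrival = {"ff_a": 410, "ff_b": 425, "ff_c": 418}  # rise edge at the root
fall_arrival = {"ff_a": 405, "ff_b": 431, "ff_c": 402}  # fall edge at the root
active_edge = {"ff_a": "rise", "ff_b": "rise", "ff_c": "fall"}  # edge each pin uses

rise_skew = skew(rise_arrival)
fall_skew = skew(fall_arrival)

# Triggering-edge skew: for each pin, take the arrival of the edge it responds to.
trig_arrival = {
    pin: (rise_arrival if edge == "rise" else fall_arrival)[pin]
    for pin, edge in active_edge.items()
}
trig_skew = skew(trig_arrival)

print(rise_skew, fall_skew, trig_skew)
```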

Clock Tree Architecture

H-Tree Mesh

H-Tree Clock Network

[Figure: clock gating. Labels: clock, idle condition, gated clock.]

Can insert clock gating at multiple levels in the clock tree

Can shut off an entire subtree if all gating conditions are satisfied

• If the paths are perfectly balanced, clock skew is zero
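The balanced-path property of an H-tree can be illustrated with a small sketch. It models an idealized H-tree in which every branch point splits two ways and the branch length halves every two levels, and it uses wire length as a stand-in for delay. Both are simplifying assumptions, and the function name is hypothetical.

```python
def h_tree_leaf_paths(levels, half_span=1.0):
    """Return the root-to-leaf wire length for every leaf of an idealized H-tree.

    Each level splits every endpoint into two equal branches; the branch
    length is halved after every second level (horizontal, then vertical).
    """
    paths = [0.0]
    length = half_span
    for level in range(levels):
        # every current endpoint grows two branches of the same length
        paths = [p + length for p in paths for _ in range(2)]
        if level % 2 == 1:
            length /= 2.0
    return paths


paths = h_tree_leaf_paths(levels=6)
# Every leaf sees the same wire length, so the length-based skew is zero.
print(len(paths), max(paths) - min(paths))
```

In a real design, unequal loads, buffer delays and process variation break this symmetry, which is why the zero-skew result applies only to the perfectly balanced case.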

Clock Tree Synthesis :-

i) Achieve Insertion Delay number

ii) Achieve Skew Targets

iii) Maintain Transition limits

iv) Limit Power Numbers

How do the three structures compare?

SoC Clock Distribution Network

[Figure: SoC clock distribution. Labels: external clock, PLL, global clock net, IP core or module, core-internal clock net.]

Core-internal clock driver/PLL: buffer, frequency multiply, align

Example: H-Tree

Restle et al., The clock distribution of the Power4 microprocessor, ISSCC 2002

[Figure: PLL in the clock distribution loop. Labels: ref clk in, ref clk out, PLL, PLL out, clock distribution, feedback, bypass; numbered points 1 to 3.]

PLL Block Diagram

[Figure: PLL block diagram. Blocks: phase detector, charge pump, loop filter, VCO, divide by N. Signals: reference clock, local clock, system clock, Up, Down, vcont.]

Power reduction in Clock trees – Clock Gating

What’s the Problem?

• Clock variation

• Difference in arrival times to flops

• Static reasons = skew:

- Unequal wire length

- Unequal buffer delay

- Unequal load

- IR drop

- In-die process variation

• Dynamic reasons = drift and jitter:

- Switching load → supply voltage variation, IR drop, coupling

- Temperature

Two Methods of Clock Distribution Networks

[Figure: two clock distribution networks, (a) and (b), each feeding IP 1 through IP 9, with the clock delay plotted for each. Captions: "Zero skew at clock inputs to IP cores" and "Zero skew at the flip-flops".]

Assume a perfectly balanced clock tree!

Clock Tree Synthesis

• When?

Clock Tree synthesis is needed for nets which have high fanout:

➢ Clocks

➢ Asynchronous resets

➢ Scan signals which feed all the Flip-flops in the design

Clock Tree Synthesis

• Why?

➢ Minimal skew.

➢ Minimal insertion delay.

➢ DRC (Design Rule Constraints) – max transition, max capacitance

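Clock Tree Synthesis

• Why?

➢ Minimum Skew – Hold violation

[Figure: launch and capture flops with path delays A, B and C. The condition on this slide is consistent with A being the clock path to the launching flop, B the data path, and C the clock path to the capturing flop.]

Hold time violation when A + B < C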

Clock Tree Synthesis

• Why?

➢ Minimum Skew – Setup violation

[Figure: launch and capture flops with path delays A, B and C, as on the hold-violation slide.]

Setup time violation when A + B > C + T
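The two violation conditions can be applied directly to a path. The sketch below assumes A is the clock delay to the launching flop, B is the data-path delay, C is the clock delay to the capturing flop, and T is the clock period. That reading is an interpretation of the A/B/C labels, and the function name and numbers are hypothetical. As on the slides, setup time, hold time and clock-to-Q delay are ignored.

```python
def classify_path(a, b, c, period):
    """Return the violations for one launch/capture pair.

    a: clock delay to the launching flop (assumed meaning of A)
    b: data-path delay between the flops (assumed meaning of B)
    c: clock delay to the capturing flop (assumed meaning of C)
    """
    violations = []
    if a + b < c:
        violations.append("hold")    # data races past the late capture edge
    if a + b > c + period:
        violations.append("setup")   # data misses the next capture edge
    return violations


# Illustrative delays in picoseconds (hypothetical).
print(classify_path(a=200, b=100, c=400, period=1000))   # capture clock much later
print(classify_path(a=300, b=1200, c=400, period=1000))  # long data path
print(classify_path(a=300, b=600, c=350, period=1000))   # balanced clock paths
```

Reducing the difference between A and C, which is the skew, widens the range of data-path delays B that satisfies both conditions.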

Clock Tree Synthesis

• Why?

➢ Minimum Insertion Delay / Buffer Stages

A large insertion delay increases power and also increases skew because of on-chip variation (OCV).

[Figure labels: total skew, timing violation, add buffers on logic path, total power and area.]

Clock Tree Synthesis

• Symmetric Clock Tree

➢ Every flop in a symmetric clock tree, traced back to the clock tree root, passes through the same number of levels and the same cell references at each level.

➢ The clock tree is balanced at one specific corner, and that balance should hold at all corners.

➢ An asymmetric tree results in increased skew variation across corners.

Clock Tree Synthesis

• Asymmetric Clock Tree

An asymmetric clock tree is used for non-clock signals such as asynchronous resets and DFT signals.

Asymmetric clock tree features:

➢ Requires max delay and max transition constraints.

➢ Relaxed constraints for skew.

Review: Synchronous Timing Basics

• Under ideal conditions (i.e., when tclk1 = tclk2)

T ≥ tc-q + tplogic + tsu

thold ≤ tcdlogic + tcdreg

• Under real conditions, the clock signal can have both spatial (clock skew) and temporal (clock jitter) variations

• Skew is constant from cycle to cycle (by definition); skew can be positive (clock and data flowing in the same direction) or negative (clock and data flowing in opposite directions)

• Jitter causes T to change on a cycle-by-cycle basis

[Figure: register R1 -> combinational logic -> register R2, with input In and clock clk arriving at tclk1 and tclk2. Register parameters: tc-q, tsu, thold, tcdreg. Logic parameters: tplogic, tcdlogic.]
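The two ideal-case constraints translate into a minimum clock period and a hold margin. The sketch below is a minimal illustration using hypothetical delay values in picoseconds. The function name and the conversion to a maximum frequency are additions for illustration only.

```python
def ideal_timing(t_cq, t_plogic, t_su, t_hold, t_cdlogic, t_cdreg):
    """Return (minimum period, hold margin) when tclk1 = tclk2."""
    t_min = t_cq + t_plogic + t_su              # T >= tc-q + tplogic + tsu
    hold_margin = t_cdlogic + t_cdreg - t_hold  # >= 0 means thold is met
    return t_min, hold_margin


# Hypothetical register and logic delays in picoseconds.
t_min_ps, hold_margin_ps = ideal_timing(t_cq=80, t_plogic=700, t_su=40,
                                        t_hold=30, t_cdlogic=60, t_cdreg=50)
f_max_ghz = 1000 / t_min_ps  # a period in ps maps to a frequency in GHz
print(t_min_ps, hold_margin_ps, round(f_max_ghz, 3))
```

The period constraint depends on the slowest (propagation) delays, while the hold constraint depends on the fastest (contamination) delays, so the two are checked with different numbers.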

Sources of Clock Skew and Jitter in Clock Network

[Figure: clock network driven by a PLL, with seven numbered sources of skew and jitter: clock generation, clock drivers, power supply, interconnect, capacitive load, capacitive coupling, temperature.]

• Skew

- manufacturing device variations in clock drivers

- interconnect variations

- environmental variations (power supply and temperature)

• Jitter

- clock generation

- capacitive loading and coupling

- environmental variations (power supply and temperature)

Positive Clock Skew

[Figure: registers R1 and R2 with combinational logic between them (input In); clk reaches R1 at tclk1 and, through a delay, R2 at tclk2. Timing diagram: clock edges 1 to 4, with intervals T, T + δ (δ > 0), and δ + thold.]

• Clock and data flow in the same direction

• δ > 0: Improves performance, but makes thold harder to meet. If thold is not met (race conditions), the circuit malfunctions independent of the clock period!

T: T + δ ≥ tc-q + tplogic + tsu, so T ≥ tc-q + tplogic + tsu - δ

thold: thold + δ ≤ tcdlogic + tcdreg, so thold ≤ tcdlogic + tcdreg - δ

Negative Clock Skew

[Figure: registers R1 and R2 with combinational logic between them (input In); clk reaches R2 at tclk2 and, through a delay, R1 at tclk1. Timing diagram: clock edges 1 to 4, with intervals T and T + δ (δ < 0).]

• Clock and data flow in opposite directions

• δ < 0: Degrades performance, but thold is easier to meet (eliminating race conditions)

T: T + δ ≥ tc-q + tplogic + tsu, so T ≥ tc-q + tplogic + tsu - δ

thold: thold + δ ≤ tcdlogic + tcdreg, so thold ≤ tcdlogic + tcdreg - δ
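The opposite effects of positive and negative skew show up when the same two constraints are evaluated over a range of δ. The sketch below uses hypothetical delay values in picoseconds. The function name and the "race" label are illustrative choices.

```python
def timing_with_skew(delta, t_cq, t_plogic, t_su, t_hold, t_cdlogic, t_cdreg):
    """Return (minimum period, hold margin) for a clock skew delta."""
    t_min = t_cq + t_plogic + t_su - delta                # T >= tc-q + tplogic + tsu - delta
    hold_margin = (t_cdlogic + t_cdreg - delta) - t_hold  # >= 0 means thold is met
    return t_min, hold_margin


# Hypothetical register and logic delays in picoseconds.
params = dict(t_cq=80, t_plogic=700, t_su=40, t_hold=30, t_cdlogic=60, t_cdreg=50)

rows = []
for delta in (-50, 0, 50, 100):
    t_min, hold_margin = timing_with_skew(delta, **params)
    status = "race" if hold_margin < 0 else "ok"
    rows.append((delta, t_min, hold_margin, status))
    print(rows[-1])
```

With these numbers, increasing δ shortens the minimum period but shrinks the hold margin until it goes negative, while a negative δ lengthens the period and enlarges the hold margin.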

Clock Jitter

• Jitter causes T to vary on a cycle-by-cycle basis

[Figure: register R1 and combinational logic (input In) clocked by clk at tclk; each clock edge bounding the period T can move between -tjitter and +tjitter.]

T: T - 2tjitter ≥ tc-q + tplogic + tsu, so T ≥ tc-q + tplogic + tsu + 2tjitter

• Jitter directly reduces the performance of a sequential circuit
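One way to see where the factor of 2 comes from, assuming each clock edge can move by at most tjitter in either direction: the worst case is a launching edge that arrives late followed by a capturing edge that arrives early. The symbols t_launch and t_capture are introduced here for illustration only.

```latex
\begin{align*}
t_{\text{launch}}  &= 0 + t_{jitter} \\
t_{\text{capture}} &= T - t_{jitter} \\
t_{\text{capture}} - t_{\text{launch}} &= T - 2\,t_{jitter} \;\ge\; t_{c\text{-}q} + t_{plogic} + t_{su} \\
\Rightarrow\quad T &\ge t_{c\text{-}q} + t_{plogic} + t_{su} + 2\,t_{jitter}
\end{align*}
```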

Combined Impact of Skew and Jitter

[Figure: registers R1 and R2 with combinational logic between them (input In); clock arrival times tclk1 and tclk2. Timing diagram: intervals T and T + δ (δ > 0), with a jitter of tjitter on the clock edges.]

• Constraints on the minimum clock period (δ > 0)

T ≥ tc-q + tplogic + tsu - δ + 2tjitter

thold ≤ tcdlogic + tcdreg - δ - 2tjitter

• δ > 0 with jitter: Degrades performance, and makes thold even harder to meet. (The acceptable skew is reduced by jitter.)
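The statement that jitter reduces the acceptable skew follows from rearranging the hold constraint for δ. The sketch below is a minimal illustration using hypothetical delay values in picoseconds. Both function names are illustrative.

```python
def min_period(t_cq, t_plogic, t_su, delta, t_jitter):
    """T >= tc-q + tplogic + tsu - delta + 2*tjitter."""
    return t_cq + t_plogic + t_su - delta + 2 * t_jitter


def max_skew_for_hold(t_hold, t_cdlogic, t_cdreg, t_jitter):
    """Largest delta that satisfies thold <= tcdlogic + tcdreg - delta - 2*tjitter."""
    return t_cdlogic + t_cdreg - t_hold - 2 * t_jitter


# Hypothetical delays in picoseconds.
budget_no_jitter = max_skew_for_hold(t_hold=30, t_cdlogic=60, t_cdreg=50, t_jitter=0)
budget_jitter = max_skew_for_hold(t_hold=30, t_cdlogic=60, t_cdreg=50, t_jitter=20)

# Spend the whole (reduced) skew budget and see what period is left.
period = min_period(t_cq=80, t_plogic=700, t_su=40,
                    delta=budget_jitter, t_jitter=20)
print(budget_no_jitter, budget_jitter, period)
```

With these numbers, 20 ps of jitter halves the skew budget, and the period gained from the remaining positive skew is cancelled by the 2tjitter penalty.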

Clock Distribution Networks

• Clock skew and jitter can ultimately limit the performance of a digital system, so designing a clock network that minimizes both is important

• In many high-speed processors, a majority of the dynamic power is dissipated in the clock network.

• To reduce dynamic power, the clock network must support clock gating (disabling the clock to shut down units)

• Clock distribution techniques

• Balanced paths (H-tree network, matched RC trees)

- In the ideal case, can eliminate skew

- Could take multiple cycles for the clock signal to propagate to the leaves of the tree

• Clock grids

- Typically used in the final stage of the clock distribution network

- Minimizes absolute delay (not relative delay)

Dealing with Clock Skew and Jitter

• To minimize skew, balance clock paths using H-tree or other clock distribution structures.

• If possible, route data and clock in opposite directions; this eliminates races at the cost of performance.

• Using gated clocks to reduce dynamic power consumption makes jitter worse.

• Shield clock wires (route power lines, VDD or GND, next to clock lines) to minimize or eliminate coupling with neighboring signal nets.

• Use dummy fills to reduce skew by reducing variations in interconnect capacitances due to interlayer dielectric thickness variations.

• Beware of temperature and supply rail variations and their effects on skew and jitter. Power supply noise fundamentally limits the performance of clock networks.
