Predictable Success Useful Skew Application Note Presentation IC Compiler Version Z-2007.03-SP3 07/31/2007

Upload: gpraveenroy

Post on 08-Nov-2014

Page 1: usefulskewappnote_v10_1491

Predictable Success

Useful Skew Application Note Presentation

IC Compiler Version Z-2007.03-SP3

07/31/2007

Page 2: usefulskewappnote_v10_1491

© 2007 Synopsys, Inc. (2)

Contents

• Overview of Useful Skew
• Useful Skew in IC Compiler
• Running Useful Skew
• Analyzing and Debugging Useful Skew Results
• Case Study

Page 4: usefulskewappnote_v10_1491

What is Useful Skew?

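[Figure: circuit with ports IN and OUT and clock CLK. Path slacks are (-1), (+2), (-1), and (0); the (-1) paths are labeled "Paths with negative slack" and the (+2) path is labeled "Path with positive slack".]

• Most timing violations are fixed by data path optimization

• With useful skew, you fix timing violations by adjusting clock arrival times at the registers or latches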

Page 5: usefulskewappnote_v10_1491

Fixing Timing Violations By Using Clock Skew

[Figure, before: circuit with ports IN and OUT and clock CLK. Path slacks are (-1), (+2), (-1), and (0), labeled "Paths with negative slack" and "Path with positive slack".]

[Figure, after: all path slacks are (0). Clock pin callouts: "Increase clock arrival time at this pin", "Decrease clock arrival time at this pin", and "No change at this pin".]

• Fixed timing violations

• Increased clock skew
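The before and after slack values can be reproduced with a small worked example. The sketch below is illustrative only: the chain topology IN -> U1 -> U2 -> U3 -> OUT and the register names are assumptions (the figure does not survive in this transcript), and the one-line slack model is the usual setup relation, not skew_opt's internal algorithm.

```python
# Assumed topology (not stated on the slide): IN -> U1 -> U2 -> U3 -> OUT.
# Each entry: (launch point, capture point, setup slack in ns).
paths = [
    ("IN", "U1", -1.0),
    ("U1", "U2", +2.0),
    ("U2", "U3", -1.0),
    ("U3", "OUT", 0.0),
]

# Clock arrival shift per register in ns; ports are not shifted.
# +1.0 = later clock at U1, -1.0 = earlier clock at U2, 0.0 = no change at U3.
shift = {"U1": +1.0, "U2": -1.0, "U3": 0.0}


def adjusted_slacks(paths, shift):
    """Setup slack grows with the capture-clock shift and shrinks with the launch-clock shift."""
    return [
        (src, dst, slack + shift.get(dst, 0.0) - shift.get(src, 0.0))
        for src, dst, slack in paths
    ]


for src, dst, slack in adjusted_slacks(paths, shift):
    print(f"{src} -> {dst}: {slack:+.1f}")
```

Under these assumptions every path ends at zero slack, at the cost of 2 ns of skew between U1 and U2.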

Page 6: usefulskewappnote_v10_1491

Two Approaches to Implementing Useful Skew

• Apply useful skew on your design before clock is synthesized

+ Clock tree synthesis can achieve larger latency adjustment targets; design can have more useful skew

– At pre clock tree synthesis stage, parasitics are estimated based on virtual routing with more scope for miscorrelation

– Factors such as timing derate on clock path are not considered since the clock is ideal

• Apply useful skew incrementally to fix timing violations in the post clock tree synthesis or post route stage

+ After detail routing, timing should be most accurate; therefore applying useful skew should be effective

– Clock tree optimization can only make small latency adjustments

– Pre route clock tree optimization allows sizing, relocation and delay insertion. However, the ability to use these techniques to meet the latency adjustment is limited.

– Post route clock tree optimization only allows sizing

Page 7: usefulskewappnote_v10_1491

Known Limitations of the Useful Skew Approach

• The useful skew approach cannot improve timing for
    Looping paths from a register to itself
    Feedthrough paths from an input port to an output port

[Figure: circuit with ports IN, OUT, IN2, and OUT2 and clock CLK; path slacks (-1), (+2), (-1), and (0). Callouts mark a feedthrough path and a looping path from a register to itself.]

Page 8: usefulskewappnote_v10_1491

Contents

• Overview of Useful Skew
• Useful Skew in IC Compiler
    Overview of Useful Skew in IC Compiler
    Analyzing the Timing of the Design
    Determining the Pins to Be Optimized
    Understanding the Solution File Generated by skew_opt
    Sourcing the Solution File
    Prerequisites for Running Useful Skew
    How skew_opt Works With Clock Gating, I/O Paths, Hold Fixing, and Scan Chains
    Known Issues With the Current Useful Skew Implementation
• Running Useful Skew
• Analyzing and Debugging Useful Skew Results
• Case Study

Page 9: usefulskewappnote_v10_1491

Overview of Useful Skew in IC Compiler

Analyzes the timing of the design

Determines the pins to be optimized

Determines the optimal solution

Writes the solution to a file

Sources the solution file onto the design

A simple look at what skew_opt does “under the hood”

Page 10: usefulskewappnote_v10_1491

Analyzing the Timing of the Design

• Multiple paths to each end point
• Multiple paths from each start point
• Determine interclock relationships

[Figure: registers clocked by two clocks, CK1 and CK2.]

Page 11: usefulskewappnote_v10_1491

Determining the Pins to Be Optimized

• Based on the paths to be optimized, skew_opt determines the pins whose latency should be adjusted

• The following pins are not optimized:

“Fixed”: nonoptimized pins that are not written into the solution file
    • I/O ports
    • Nonstop pins
        Clock pins of clock-gating cells
        Clock pins of registers with generated clock definitions
        Explicit nonstop pins

“Fragile”: nonoptimized pins that are written into the solution file
    • Unconstrained pins
    • Clock pins inside interface logic models (ILMs)
    • Level-sensitive latches
        Clock pins of level-sensitive latches
        Clock pins of registers on paths to or from level-sensitive latches

• When skew_opt_optimize_to_clock_gates is false (default is true), registers generating enable signals for clock gates are not optimized. See slide 49 for more details

Page 12: usefulskewappnote_v10_1491

Nonstop Pins Are Not Optimized by skew_opt

• skew_opt sets float pin exceptions on clock pins whose latency needs to be adjusted (set_clock_tree_exceptions -float)

• Clock tree synthesis stops traversal when it sees this exception on clock pins

    The portion of the clock structure beyond these pins is not optimized for skew, causing incorrect results

[Figure: clock CLK feeds an integrated clock-gating cell (ICG) whose output is ECLK; registers U1 and U2 lie beyond it. Callout on the ICG clock pin: if a float pin exception is set on this pin, the registers U1 and U2 are not considered part of the clock tree.]

Page 13: usefulskewappnote_v10_1491

Pins Inside ILMs Are Not Optimized by skew_opt

[Figure: top level with clock CLK and ports IN and OUT, containing an interface logic model (ILM) block with pins P1 and O1.]

Clock tree synthesis can only adjust latency to ILM clock pins

Clock tree synthesis at top level cannot adjust latencies to pins inside ILMs; skew_opt therefore considers them as “fragile” pins and sets the same float pin exception on these pins

Page 15: usefulskewappnote_v10_1491

Understanding the Useful Skew Solution File

check_error -reset

__scl 0.129037 {STACK_BLK/MEM_reg_5__9_/CK}

__scte -float_pin_capacitance 0 -float_pin_max_delay_rise -0.09 -float_pin_min_delay_rise -0.09 -float_pins {STACK_BLK/MEM_reg_5__9_/CK}

__sicdo -balance_group { CLOCK }

__scl: Clock latency set by skew_opt. If clock tree synthesis is able to implement the clock exceptions defined by skew_opt, you should expect the propagated clock latency on this pin to be very close to this value (__scl is aliased to set_clock_latency in the Tcl file)

__scte: Clock exception equivalent to the clock latency set by skew_opt. See "Determining the Clock Exception Values" for how skew_opt derives the clock exception value from the clock latency values (__scte is aliased to set_clock_tree_exceptions in the Tcl file)

__sicdo: Interclock delay balancing options set by skew_opt based on the timing relationship between the clock domains (__sicdo is aliased to set_inter_clock_delay_options in the Tcl file)
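When reading a solution file by hand, it can help to expand the three aliases back to their full command names. The following is a minimal sketch, assuming each command sits on one line and starts with its alias; the helper name expand_alias is hypothetical and not part of IC Compiler.

```python
# Alias-to-command mapping as described for the solution file.
ALIASES = {
    "__scl": "set_clock_latency",
    "__scte": "set_clock_tree_exceptions",
    "__sicdo": "set_inter_clock_delay_options",
}


def expand_alias(line):
    """Replace a leading alias with the full command name; other lines pass through."""
    stripped = line.strip()
    head, _, rest = stripped.partition(" ")
    full = ALIASES.get(head)
    return f"{full} {rest}" if full else stripped


print(expand_alias("__scl 0.129037 {STACK_BLK/MEM_reg_5__9_/CK}"))
```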

Page 16: usefulskewappnote_v10_1491

Why Three Sets of Tcl Commands?

• set_clock_latency
    Understood by the timer; can be used to measure skew_opt QoR
    Clock tree synthesis does not honor clock latencies set at clock pins

• set_clock_tree_exceptions
    Not understood by the timer
    Clock tree synthesis honors these constraints

• set_inter_clock_delay_options
    Interclock delay constraints based on skew_opt analysis

Page 17: usefulskewappnote_v10_1491

Sourcing the Solution File

• By default, all three sets of Tcl commands are sourced:
    set_clock_latency
    set_clock_tree_exceptions
    set_inter_clock_delay_options

• Use the following variable settings to control which Tcl commands are sourced from the solution file:
    skew_opt_skip_ideal_clocks
    skew_opt_skip_propagated_clocks
    skew_opt_skip_clock_balancing

• For example:
    If skew_opt_skip_ideal_clocks is set to true, set_clock_latency commands are not sourced
    If skew_opt_skip_propagated_clocks is set to true, set_clock_tree_exceptions commands are not sourced
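The relationship between the three variables and the commands that get sourced can be summarized as a small truth table. This is a sketch only: the function name sourced_commands is hypothetical, and the pairing of skew_opt_skip_clock_balancing with set_inter_clock_delay_options is taken from the variable descriptions later in this note rather than from this slide.

```python
def sourced_commands(skip_ideal_clocks=False,
                     skip_propagated_clocks=False,
                     skip_clock_balancing=False):
    """Return the command families sourced from the solution file for a given setting."""
    commands = []
    if not skip_ideal_clocks:
        commands.append("set_clock_latency")
    if not skip_propagated_clocks:
        commands.append("set_clock_tree_exceptions")
    if not skip_clock_balancing:
        commands.append("set_inter_clock_delay_options")
    return commands


# Default: all three families are sourced.
print(sourced_commands())
# Post clock tree synthesis flow: keep the propagated clocks, skip ideal latencies.
print(sourced_commands(skip_ideal_clocks=True))
```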

Page 18: usefulskewappnote_v10_1491

Determining the Clock Exception Values

[Figure: circuit with ports IN and OUT, clock CLK, and registers U1, U2, and U3. Path slacks are (-1), (+2), (-1), and (0), labeled "Paths with negative slack" and "Path with positive slack". Clock pin callouts: "Increase clock arrival time at this pin" and "Decrease clock arrival time at this pin".]

SDC:

set_clock_latency 4.0 CLK

skew_opt clock latencies:

set_clock_latency 5.0 U1/CK
set_clock_latency 3.0 U2/CK
set_clock_latency 4.0 U3/CK

Calculating the clock exception value:

1. Find min (all clock latency values)
2. Float pin value for pin = (Min latency - latency specified for pin)

skew_opt clock exceptions:

set_clock_tree_exceptions -2.0 -float_pin U1/CK
set_clock_tree_exceptions 0.0 -float_pin U2/CK
set_clock_tree_exceptions -1.0 -float_pin U3/CK
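The two-step calculation of the float pin values can be checked with a few lines of code. This sketch uses the latencies from the slide (5.0, 3.0, and 4.0 ns on U1/CK, U2/CK, and U3/CK); the function name float_pin_values is hypothetical.

```python
# Clock latencies assigned by skew_opt in the slide example, in ns.
latencies = {"U1/CK": 5.0, "U2/CK": 3.0, "U3/CK": 4.0}


def float_pin_values(latencies):
    """Float pin value = minimum latency over all pins - latency specified for the pin."""
    min_latency = min(latencies.values())
    return {pin: min_latency - value for pin, value in latencies.items()}


for pin, value in float_pin_values(latencies).items():
    print(f"{pin}: {value:+.1f}")
```

The pin with the smallest latency always gets a float pin value of zero, and every other pin gets a negative value.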

Page 20: usefulskewappnote_v10_1491

Prerequisites for Running Useful Skew (1/4)

1. Check your clock tree for missing or incorrect constraints or definitions using check_clock_tree

2. Check for preexisting exceptions such as ignore, stop, or float pins set on your clock tree by using report_clock_tree -exceptions

• skew_opt does not consider clock exceptions during analysis and optimization. They are honored only during clock tree synthesis and clock tree optimization

• If the pin with a preexisting exception is an optimizable endpoint, skew_opt overrides the ignore, float, or stop pin exception with the new float pin exception

• Preexisting nonstop exceptions are not overridden

Page 21: usefulskewappnote_v10_1491

Prerequisites for Running Useful Skew (2/4)

3. Ensure that the constraints are correct

a. Use a quick run of clock tree synthesis to determine and apply clock latencies before running skew_opt

• By default, update_clock_latency does not create a set_clock_latency command for generated clocks. Because clock tree synthesis balances the registers on the master clock with those on the generated clock, it is essential that the clock latency of the generated clock be specified before running skew_opt

• You can do one of the following:
    Manually apply the clock latency of the master clock to the generated clock
    Use set_latency_adjustment_options to set the latency of the generated clock with respect to its master before running update_clock_latency

b. Running update_clock_latency after a skew_opt flow is incorrect

• Median latency calculated will be incorrect (all clock pins have clock exceptions set by skew_opt solution)

• Changing the constraints after skew_opt will lead to convergence issues

Page 22: usefulskewappnote_v10_1491

Prerequisites for Running Useful Skew (3/4)

3. Ensure that the constraints are correct (continued)

c. Ensure that the clock latency specifications on clock pins in the clock structure are correct

• For example, the clock pins of clock gating cells

d. Remove any ideal latencies set on the clock network (remove_ideal_latency)

e. Ensure that the clocks are ideal before running pre clock tree synthesis skew_opt flow (remove_propagated_clock)

f. Ensure that high-fanout nets are marked as ideal or run high-fanout net synthesis on these nets

g. Remove any pin_load constraints set on clock ports. For example,

set_load -min -pin_load 0.0 Clk

Page 23: usefulskewappnote_v10_1491

Prerequisites for Running Useful Skew (4/4)

4. Optimize the design to minimize timing violations
    Useful skew can impact global skew and insertion delay
    The smaller the useful skew introduced, the smaller the impact on clock tree synthesis metrics

Page 24: usefulskewappnote_v10_1491

Sample Script: Preparing the Design for skew_opt

#All clocks are ideal before CTS
open_mw_cel placed_cel
remove_propagated_clock [all_fanout -clock]
remove_propagated_clock {*}
remove_ideal_latency -all
remove_ideal_network -all

#Run clock_opt to get updated latencies
set_inter_clock_delay_balance -balance_groups {clk1 clk2}
set_latency_adjustment_options -from_clock clk1 -to_clock vclk
clock_opt -inter_clock_balance -update_clock_latency
write_sdc updated.sdc
sh grep set_clock_latency updated.sdc > updated.sdc.1
sh grep get_clock updated.sdc.1 > updated.tcl
close_mw_cel

#Load updated constraints into placed CEL and optimize the design
#before running skew_opt
open_mw_cel placed_cel
source updated.tcl
extract_rc -estimate
remove_propagated_clock [all_fanout -clock]
remove_propagated_clock {*}
remove_ideal_latency -all
remove_ideal_network -all
place_opt

Note on set_latency_adjustment_options: generated clock latencies are not updated by update_clock_latency

Page 26: usefulskewappnote_v10_1491

Using skew_opt on Designs That Have Clock Gates: Scenario 1

[Figure: clock network with labels CLK, ICG, ECLK, U1, and U2.]

• The ideal latency specified for CLK is considered as the clock arrival time for all the pins on that clock domain, such as the clock pins of U1, U2, and the integrated clock gating (ICG) cell

    The enable timing seen by skew_opt is therefore optimistic

    • After clock tree synthesis, the clock arrival time at the ICG clock pin will be less than that at the clock pin of U2 (the clock pin of the ICG is a nonstop pin for clock tree synthesis)

    To avoid this, you can explicitly set the clock latency at the clock pins of the clock gates, taking into consideration the delay from the clock gate to the endpoints

    • Use a quick run of clock tree synthesis to determine these latencies

By default, skew_opt adjusts the clock arrival time at the clock pins of U1 and U2, and leaves the clock arrival time at the ICG unchanged

Page 27: usefulskewappnote_v10_1491

Using skew_opt on Designs That Have Clock Gates: Scenario 2

[Figure: clock network with labels CLK, ICG, ECLK, U1, and U2.]

• In the above scenario, the register generating the enable signal for the clock gate has a data path to registers that are gated by the same clock gate

• By default, skew_opt adjusts the latency to the clock pins of both U1 and U2

During clock tree synthesis, the float pin exception on these pins causes the clock arrival time at the clock gate to change (as compared to the initial quick run of clock tree synthesis to estimate the clock arrival time at the clock gate), thus invalidating the skew_opt solution

• When skew_opt_optimize_to_clock_gates is set to false, skew_opt does not optimize the latency on the clock pin of the register generating the enable signal

Page 28: usefulskewappnote_v10_1491

skew_opt and I/O timing (-fix_boundary_pins)

• By default, skew_opt optimizes boundary paths

Registers on boundary paths therefore might have adjusted clock latencies

• With the -fix_boundary_pins option, skew_opt keeps the clock arrival times for registers on boundary paths unchanged

Page 29: usefulskewappnote_v10_1491

skew_opt and Hold Timing

• By default, skew_opt optimizes for setup

    Optimizing for setup can degrade hold
    skew_opt minimizes the latency adjustment to minimize impact on hold

• When both the -setup and -hold options are specified, skew_opt tracks WNS for both setup and hold for each startpoint and endpoint; the worst WNS governs the solution

• Specify minimum libraries for more realistic hold timing analysis
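The setup and hold trade-off can be illustrated on a single register-to-register path. This is a simplified sketch, not skew_opt's algorithm: it assumes the standard relation that delaying the capture clock relative to the launch clock by delta adds delta to the setup slack and removes delta from the hold slack. The slack values and the function name are hypothetical.

```python
def shift_capture_clock(setup_slack, hold_slack, delta):
    """Delay the capture clock by delta ns relative to the launch clock."""
    return setup_slack + delta, hold_slack - delta


# Hypothetical path: setup fails by 1.0 ns, hold passes with 0.5 ns to spare.
setup, hold = shift_capture_clock(setup_slack=-1.0, hold_slack=0.5, delta=1.0)

# With both setup and hold tracked, the worse of the two slacks governs.
governing = min(setup, hold)
print(setup, hold, governing)
```

In this example the full 1.0 ns shift fixes setup but creates a hold violation, which is why a smaller latency adjustment can be preferable.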

Page 30: usefulskewappnote_v10_1491

skew_opt and Scan Chains

• In a skew_opt flow, there will be larger “real” skew between clock pins after clock tree synthesis

optimize_dft currently assumes zero skew between clock pins and can lead to larger hold violations

Page 31: usefulskewappnote_v10_1491

Known Issues With the Current Useful Skew Implementation

• Current solution excludes paths that end on nonstop pins such as
    Clock gating cells
    Registers with a generated clock at the output
    Pins with explicit nonstop exception

    • This is because setting a float pin exception on these clock pins causes clock tree synthesis not to traverse beyond these pins

• Pins inside interface logic models are not optimized
    Clock tree synthesis cannot adjust latencies to pins inside ILMs

• Level-sensitive latch support is limited
    Will not optimize paths to or from level-sensitive latches

Page 32: usefulskewappnote_v10_1491

Contents

• Overview of Useful Skew
• Useful Skew in IC Compiler
• Running Useful Skew
    Known Issues With the Current Useful Skew Implementation
    Useful Skew User Interface
• Analyzing and Debugging Useful Skew Results
• Case Study

Page 33: usefulskewappnote_v10_1491

Useful Skew Flows: Pre Clock Tree Synthesis and Post Clock Tree Synthesis

Starting point for both flows: IC Compiler placed CEL view (prepared for useful skew)

Pre Clock Tree Synthesis Flow

1. skew_opt
2. clock_opt -inter_clock_balance
3. route_opt

Post Clock Tree Synthesis Flow

1. clock_opt -inter_clock_balance -no_clock_route
2. set skew_opt_skip_ideal_clocks true, then skew_opt
    This setting is required to avoid losing the propagated attribute that is annotated on the clocks by compile_clock_tree
3. optimize_clock_tree
4. route_opt

Postroute flow has QoR limitations

Page 34: usefulskewappnote_v10_1491

Sample Script: Pre Clock Tree Synthesis skew_opt Flow

#All clocks are ideal before clock tree synthesis
remove_propagated_clock [all_fanout -clock]
remove_propagated_clock {*}
remove_ideal_latency -all
remove_ideal_network -all

#Run skew_opt
skew_opt

#Run clock_opt
set_inter_clock_delay_balance -balance_group {clk1 clk2}
set_clock_tree_options -gate_sizing true -gate_relocation true \
    -buffer_sizing true -delay_insertion false -buffer_relocation true
clock_opt -inter_clock_balance

#Run route_opt
set_fix_hold [all_clocks]
route_opt

Page 35: usefulskewappnote_v10_1491

Sample Script: Post Clock Tree Synthesis skew_opt Flow

#All clocks are ideal before clock tree synthesis
remove_propagated_clock [all_fanout -clock]
remove_propagated_clock {*}
remove_ideal_latency -all
remove_ideal_network -all

#Run clock_opt
set_inter_clock_delay_balance -balance_group {clk1 clk2}
set_clock_tree_options -gate_sizing true -gate_relocation true \
    -buffer_sizing true -delay_insertion false -buffer_relocation true
clock_opt -inter_clock_balance -no_clock_route

#Run skew_opt followed by clock tree optimization
set skew_opt_skip_ideal_clocks true
skew_opt
optimize_clock_tree
route_group -all_clock_nets

#Run route_opt
set_fix_hold [all_clocks]
route_opt

Page 37: usefulskewappnote_v10_1491

Variables for skew_opt

• Variables that control which Tcl commands are sourced from the solution file

skew_opt_skip_ideal_clocks

skew_opt_skip_propagated_clocks

skew_opt_skip_clock_balancing

These variables do not affect the solution generated by skew_opt. They control only which Tcl commands are sourced in from the solution file.

Page 38: usefulskewappnote_v10_1491

Description of Variables

• skew_opt_skip_ideal_clocks
    Default: false
    When set to true, skew_opt does not set ideal clock latencies on the clock pins

• skew_opt_skip_propagated_clocks
    Default: false
    When set to true, skew_opt does not set clock exceptions on the clock pins

• skew_opt_skip_clock_balancing
    Default: false
    When set to true, skew_opt does not set interclock balancing options

Page 39: usefulskewappnote_v10_1491

skew_opt Command Options

skew_opt

-setup

-hold

-pins pin_list

-fix_boundary_pins

-ignore_boundary_paths

-path_groups path_group_list

-output file_name

-no_auto_source

-no_optimization

-setup_margin setup_margin_value

-hold_margin hold_margin_value

-adjustment_limit adjustment_limit_value

-decrease_factor decrease_factor_value

-improvement_threshold improvement_threshold_value

-resolution resolution_value

Page 40: usefulskewappnote_v10_1491

Description of Options (1/4)

-setup

Optimize WNS for setup constraints; on by default. When only the -setup option is specified, skew_opt optimizes for setup but minimizes impact on hold

-hold

Optimize WNS for hold constraints; off by default. When both -setup and -hold are specified, the setup solution is constrained by the hold slack. The setup improvement achieved might be smaller than when only the -setup option is used.

-pins pin_list

Specifies a list of pins to optimize; by default, all adjustable clock pins are considered for optimization

-fix_boundary_pins

Do not optimize clock arrival time for registers on boundary paths

Page 41: usefulskewappnote_v10_1491

Description of Options (2/4)

-ignore_boundary_paths

Do not consider I/O paths during optimization. However, the tool can degrade I/O paths while optimizing other register-to-register paths. By default, I/O paths are included

-path_groups path_group_list

Specifies the path groups considered for optimization; by default, all path groups are considered

-output file_name

Specifies a file name for the solution file; by default, the solution file name is skew_opt.tcl

-no_auto_source

Do not source the solution file at the end of skew_opt; by default, the solution file is sourced at the end of skew_opt

Page 42: usefulskewappnote_v10_1491

Description of Options (3/4)

-setup_margin setup_margin_value

The margin is subtracted from the setup slack to allow you to influence skew_opt to improve paths with positive slack. Default is 0 ns. Unit is ns.

-hold_margin hold_margin_value

The margin is subtracted from the hold slack to allow you to influence skew_opt to improve paths with positive slack. Default is 0 ns. Unit is ns.

-adjustment_limit adjustment_limit_value

Sets a limit on the latency adjustment that can be set on any pin. Default is no limit. Unit is ns.

-decrease_factor decrease_factor_value

Sets a fractional limit on latency decreases by using a value between zero and one. Default is 0.5. For designs with many clock tree levels, a larger decrease factor (e.g. 0.75) might yield more slack improvement.
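The effect of the margin and adjustment limit options can be pictured with simple arithmetic. The sketch below is an interpretation, not the tool's implementation: it assumes a margin is subtracted directly from the slack the optimizer sees, and that -adjustment_limit bounds increases and decreases symmetrically. The function names and sample values are hypothetical.

```python
def effective_slack(slack, margin=0.0):
    """Slack as seen by the optimizer after the margin is subtracted (ns)."""
    return slack - margin


def clamp_adjustment(adjustment, limit=None):
    """Bound a latency adjustment to +/- limit ns (assumed symmetric); None means no limit."""
    if limit is None:
        return adjustment
    return max(-limit, min(limit, adjustment))


# A path with +0.25 ns slack looks violating once a 0.5 ns margin is applied,
# so the optimizer has a reason to improve it.
print(effective_slack(0.25, margin=0.5))

# A requested +1.075 ns increase is cut back to the 0.5 ns limit.
print(clamp_adjustment(1.075, limit=0.5))
```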

Page 43: usefulskewappnote_v10_1491

Description of Options (4/4)

-improvement_threshold improvement_threshold_value

Do not generate a solution if the solution cannot improve timing QoR (WNS) by at least this value. Default is 0.01 ns. Unit is ns.

-resolution resolution_value

Snaps the clock tree exception value to a multiple of this value. Default is 0.001 ns. Unit is ns. The minimum allowed value is 0.0001 ns.

-no_optimization

Use the clock latencies set at the clock pins. For example, the tool takes the set_clock_latency commands you specified and converts them into clock exceptions; by default, this is disabled
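Snapping a clock tree exception value to the -resolution grid can be sketched as follows. The rounding direction is an assumption (round to the nearest multiple); the slide only states that the value is snapped to a multiple of the resolution. Decimal is used so that the result is an exact multiple rather than a binary floating-point approximation.

```python
from decimal import Decimal, ROUND_HALF_EVEN


def snap_to_resolution(value, resolution="0.001"):
    """Snap value (ns) to the nearest multiple of resolution (ns). Rounding mode is assumed."""
    step = Decimal(str(resolution))
    multiples = (Decimal(str(value)) / step).to_integral_value(rounding=ROUND_HALF_EVEN)
    return multiples * step


# With the default 0.001 ns resolution.
print(snap_to_resolution("-0.0894"))
# With a coarser, hypothetical 0.01 ns resolution.
print(snap_to_resolution("0.129037", "0.01"))
```

A coarser resolution reduces the number of distinct target latencies that clock tree synthesis has to implement.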

Page 44: usefulskewappnote_v10_1491

Contents

• Overview of Useful Skew
• Useful Skew in IC Compiler
• Running Useful Skew
• Analyzing and Debugging Useful Skew Results
    Measuring Useful Skew QoR Without Running Clock Tree Synthesis
    Understanding the Log File
    Debugging QoR Degradation in a Useful Skew Flow
• Case Study

Page 45: usefulskewappnote_v10_1491

Measuring Useful Skew QoR Without Running Clock Tree Synthesis (Pre Clock Tree Synthesis Only)

Pre Clock Tree Synthesis Flow

Starting point: IC Compiler placed CEL view (prepared for useful skew)

1. skew_opt -no_auto_source
    Run skew_opt without sourcing the solution file

2. set skew_opt_skip_propagated_clocks true
   source skew_opt.tcl
    Set the variable to disable loading of clock exceptions in the solution file, then source the solution file

3. report_timing
    Analyze timing with ideal clock latencies defined by skew_opt

4. Timing acceptable?
    Y: If timing improvement is acceptable, continue with the skew_opt flow (useful skew flow)
    N: Run skew_opt with a different set of options or go with the default flow

Page 47: usefulskewappnote_v10_1491

Understanding the Log File: Initial Analysis

Using boundary paths.
Adjusting boundary pins.
Using all clock pins.
Using all path-groups.

2744398 initial constraints
==================================================
30850 loop constraints
32 feedthrough constraints
1367137 non-worst setup constraints
0 non-worst hold constraints
--------------------------------------------------
346379 remaining setup constraints
0 remaining hold constraints

31176 initial pins
==================================================
557 latencies at I/O ports
0 latencies at clock-gating cells
9 latencies at level-sensitive latches
0 latencies inside interface logic models
--------------------------------------------------
30610 latencies will be optimized
566 latencies will be kept fixed

Annotations:

• The opening "Using ..." and "Adjusting ..." lines indicate the paths and pins that skew_opt will work on
• skew_opt processes the constraints to determine the ones it will work on
• Based on the constraints, skew_opt determines the pins to work with
• Pins on clock-gating cells, I/O ports, ILMs, and level-sensitive latches are excluded
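The pin accounting in this part of the log can be cross-checked mechanically. The sketch below parses the pin summary lines quoted above and confirms that the excluded categories add up to the "kept fixed" count and that the remainder matches the "will be optimized" count. The parsing approach is an assumption about the log layout based only on this excerpt.

```python
import re

# Pin summary lines copied from the log excerpt.
log = """\
31176 initial pins
557 latencies at I/O ports
0 latencies at clock-gating cells
9 latencies at level-sensitive latches
0 latencies inside interface logic models
30610 latencies will be optimized
566 latencies will be kept fixed
"""

counts = {}
for line in log.splitlines():
    match = re.match(r"\s*(\d+)\s+(.+)", line)
    if match:
        counts[match.group(2).strip()] = int(match.group(1))

# Sum the per-category exclusions (I/O ports, clock gates, latches, ILMs).
excluded = sum(
    value
    for label, value in counts.items()
    if label.startswith(("latencies at", "latencies inside"))
)

print("excluded:", excluded)
print("optimized:", counts["initial pins"] - excluded)
```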

Page 48: usefulskewappnote_v10_1491

Understanding the Log File: Settings

Settings

--------

setup_margin = 0 (ns)

hold_margin = 0 (ns)

adjustment_limit = 1e+30 (ns)

decrease_factor = 0.5

improvement_threshold = 0.01 (ns)

resolution = 0.001 (ns)

Log file indicates all the variable settings

Page 49: usefulskewappnote_v10_1491

Understanding the Log File: Optimization and Results

Optimizing latencies:

Setup WNS   Setup CNS   Hold WNS    Hold CNS
---------   ---------   --------    --------
-9.167e-01  -1.982e+02  +0.000e+00  +0.000e+00
-8.718e-01  -1.693e+02  +0.000e+00  +0.000e+00
-8.290e-01  -1.577e+02  +0.000e+00  +0.000e+00
.
.
-2.316e-01  -1.791e+01  +0.000e+00  +0.000e+00

Minimizing latency adjustments:

Setup WNS   Setup CNS   Hold WNS    Hold CNS
---------   ---------   --------    --------
-2.310e-01  -1.791e+01  +0.000e+00  +0.000e+00
-2.350e-01  -1.870e+01  +0.000e+00  +0.000e+00

Maximum latency increase: +1.075 --> +0.061
Maximum latency decrease: -0.500 --> -0.089

Annotations:

• The "Optimizing latencies" table indicates the starting QoR and the improvement after each iteration
• Cumulative negative slack (CNS): the sum of negative slacks for all the constraints skew_opt is considering
• Final QoR achieved by skew_opt
• Latency adjustments are minimized; this might have a small impact on the setup WNS
• The last two lines indicate that latency increases have been reduced from 1.075 ns to 0.061 ns and latency decreases from 0.5 ns to 0.089 ns. The smaller the latency adjustment, the easier it is for clock tree synthesis to meet this target
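The improvement reported in these tables can be quantified with a short calculation. The sketch below uses the setup WNS and CNS values quoted in the log (the elided middle rows are omitted) and compares the overall gain against the default improvement threshold of 0.01 ns. Whether skew_opt itself applies the threshold to exactly this quantity is an assumption.

```python
# (setup WNS, setup CNS) rows in ns, copied from the log excerpt.
optimizing = [
    (-9.167e-01, -1.982e+02),
    (-8.718e-01, -1.693e+02),
    (-8.290e-01, -1.577e+02),
    (-2.316e-01, -1.791e+01),
]
minimizing = [
    (-2.310e-01, -1.791e+01),
    (-2.350e-01, -1.870e+01),
]
IMPROVEMENT_THRESHOLD = 0.01  # default value of -improvement_threshold, in ns

start_wns = optimizing[0][0]
final_wns = minimizing[-1][0]

# Overall setup WNS gain from the first to the last reported row.
wns_gain = final_wns - start_wns
# Setup WNS given back while minimizing the latency adjustments.
cost_of_minimizing = optimizing[-1][0] - final_wns

print(f"WNS gain: {wns_gain:.4f} ns")
print(f"WNS cost of minimizing adjustments: {cost_of_minimizing:.4f} ns")
print("exceeds threshold:", wns_gain >= IMPROVEMENT_THRESHOLD)
```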

Page 50: usefulskewappnote_v10_1491

Understanding the Log File: Optimization and Results

Writing set_clock_latency commands ... done.

There are clocks shared by data ports and sink pins in this design

Using a phase resolution of 1 ps will achieve 89% dominant phase.

The number of unique phases will be 207.

Writing set_clock_tree_exception commands ... done.

Writing set_inter_clock_delay_options commands ... done.

Sourcing optimizations from "skew_opt.tcl".

--> sourcing set_clock_latency

--> sourcing set_clock_tree_exceptions

--> sourcing set_inter_clock_delay_options

skew_opt completed successfully.

Log file indicates which settings are sourced

I/O constraints have been specified with respect to a real clock (instead of a virtual clock)

Page 51: usefulskewappnote_v10_1491

Understanding the Log File: skew_opt Unable to Optimize the Design

Optimizing latencies:

Setup WNS Setup CNS Hold WNS Hold CNS

--------- --------- -------- --------

+0.000e+00 +0.000e+00 +0.000e+00 +0.000e+00

+0.000e+00 +0.000e+00 +0.000e+00 +0.000e+00

Resources used for optimization:

1.22e-04 cpu hours

0.00e+00 gigabytes

This design could not be further optimized.

When QoR improvement is less than the threshold, skew_opt does not generate a solution file

Page 53: usefulskewappnote_v10_1491

Tips for Debugging QoR Degradation in a Useful Skew Flow (1/2)

1. Review the skew_opt log file
    Check if the starting QoR reported by skew_opt is what you expect
    Check if skew_opt is able to improve the QoR

2. Run timing analysis before and after skew_opt and after running clock tree synthesis
    Use report_qor and report_timing commands
    Will help identify where the degradation occurs

3. Create path groups to isolate paths that skew_opt will not optimize
    Feedthrough paths
    Nonstop clock pins such as clock pins of integrated clock gating cells

Page 54: usefulskewappnote_v10_1491

Tips for Debugging QoR Degradation in a Useful Skew Flow (2/2)

4. Write out clock latencies before and after clock tree synthesis
   - Comparing the two helps identify whether miscorrelation between skew_opt and clock tree synthesis is the cause of the degradation

Page 55: usefulskewappnote_v10_1491

Debugging QoR Degradation in a Useful Skew (Pre Clock Tree Synthesis) Flow

Inputs: postroute CEL (baseline flow) and postroute CEL (skew_opt flow)

1. Is the skew_opt flow timing worse than the baseline?
   N: Stop; there is no degradation to debug.
   Y: Go to step 2.

2. Is the skew_opt flow timing after clock tree synthesis worse than the baseline?
   N: Indicates a correlation issue between post clock tree synthesis and postroute timing.
   Y: Go to step 3.

3. Does post skew_opt timing correlate with post clock tree synthesis timing?*
   N: Indicates clock tree synthesis is not able to implement the skew_opt solution. Check whether the interclock balancing tool issued any messages about clocks it could not balance.
   Y: Go to step 4.

4. Did you follow the prerequisites for the skew_opt flow?
   N: Rerun the flow after following the recommended methodology.
   Y: skew_opt should not degrade QoR; file a STAR.

*See the next slide for details on checking whether the clock tree synthesis implementation correlates with the skew_opt solution.
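The flowchart on this slide can be written down as a small decision function. This is one reading of the chart: the branch assignments are inferred from the slide text, and the function and argument names are hypothetical.

```python
def diagnose(postroute_worse, post_cts_worse,
             cts_matches_skew_opt, prerequisites_followed):
    """Walk the debugging flowchart and return the suggested conclusion.

    Each argument is a boolean answer to one question in the chart,
    asked in the order the chart asks them.
    """
    if not postroute_worse:
        return "No degradation relative to the baseline flow."
    if not post_cts_worse:
        return ("Correlation issue between post clock tree synthesis "
                "and postroute timing.")
    if not cts_matches_skew_opt:
        return ("Clock tree synthesis could not implement the skew_opt "
                "solution; check the interclock balancing messages.")
    if not prerequisites_followed:
        return "Rerun the flow after following the recommended methodology."
    return "skew_opt should not degrade QoR; file a STAR."

print(diagnose(True, True, False, True))
```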

Page 56: usefulskewappnote_v10_1491

Comparing the Clock Tree Synthesis Implementation to the skew_opt Solution in a Pre Clock Tree Synthesis Useful Skew Flow

Post-clock tree synthesis clock timing report (report_clock_timing -nosplit -type latency -nworst 1000000)

Compare the clock latency in the skew_opt solution file to the clock timing after clock tree synthesis*

skew_opt solution file (skew_opt.tcl)

Possible convergence issues
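Once the target latencies from the solution file and the achieved latencies from the clock timing report are available as pin-to-latency tables, the comparison itself is simple. The sketch below assumes both have already been extracted into dictionaries; the pin names, values, and the 0.05 ns tolerance are hypothetical, and parsing of either file is deliberately left out because their exact formats are not shown here.

```python
def latency_mismatches(target, actual, tolerance=0.05):
    """List sink pins whose achieved latency misses the skew_opt target.

    target and actual map pin name -> latency in ns. Returns tuples of
    (pin, target, actual, delta); pins absent from the report get None
    for actual and delta.
    """
    mismatches = []
    for pin, want in target.items():
        got = actual.get(pin)
        if got is None:
            mismatches.append((pin, want, None, None))
        elif abs(got - want) > tolerance:
            mismatches.append((pin, want, got, got - want))
    # Largest deviations first; pins missing from the report go last.
    mismatches.sort(key=lambda m: -abs(m[3]) if m[3] is not None else 0.0)
    return mismatches

# Hypothetical data for illustration only.
target = {"u_a/reg_0/CP": 1.80, "u_a/reg_1/CP": 2.10, "u_b/reg_0/CP": 1.50}
actual = {"u_a/reg_0/CP": 1.82, "u_a/reg_1/CP": 1.85}
for pin, want, got, delta in latency_mismatches(target, actual):
    print(pin, want, got, delta)
```

A large number of mismatched pins would point to the convergence issue noted on this slide.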

Page 57: usefulskewappnote_v10_1491

Contents

• Overview of Useful Skew
• Useful Skew in IC Compiler
• Running Useful Skew
• Analyzing and Debugging Useful Skew Results
• Case Study

Page 58: usefulskewappnote_v10_1491

Case Study: Design Information, Initial Timing, Timing with Baseline Flow

• Design information: 65 nm, 325K instances

• Initial timing (after place_opt)

Clock   WNS        TNS          # Violating paths   # Hold violations
Clk1    -0.35 ns   -661.73 ns   6000                43067
Clk2     0.87 ns      0.00 ns      0                  261

• Final timing with baseline flow (flow without skew_opt)

Clock   WNS        TNS          # Violating paths   # Hold violations
Clk1    -0.76 ns   -720.39 ns   5114                41849
Clk2     0.86 ns      0.00 ns      0                  282

Page 59: usefulskewappnote_v10_1491

Case Study: Preparing the Design for skew_opt (1/4)

• Run check_clock_tree on the design

icc_shell> check_clock_tree

1

• Check for clock exceptions on the clock structure

report_clock_tree -exceptions

Clock Tree Exceptions Summary

=============================

1. Clock: Clk1

.

Implicit ignore pins: 34

Default sink pins: 36948

.

.

2. Clock: Clk2

.

Default sink pins: 350

.

.

The implicit ignore pins connect to clock pins of gates with unconnected outputs; they should not impact the skew_opt flow

Page 60: usefulskewappnote_v10_1491

Case Study: Preparing the Design for skew_opt (2/4)

• Check for interclock relationships

icc_shell> report_timing -from Clk1 -to Clk2

****************************************

Report : timing

-path full

-delay max

-max_paths 1

Design : test_design

Version: Z-2007.03-ICC-SP3-CS1

Date : Tue Jul 17 20:21:21 2007

****************************************

* Some/all delay information is back-annotated.

# A fanout number of 1000 was used for high fanout net computations.

Operating Conditions: WCIND Library: tsmc65lp_108125

No paths.

1

icc_shell> report_timing -from Clk2 -to Clk1

.

.

Page 61: usefulskewappnote_v10_1491

Case Study: Preparing the Design for skew_opt (3/4)

• Do a quick clock tree synthesis run to estimate clock latencies

icc_shell> clock_opt -inter_clock_balance -update_clock_latency

.

.

============= Clock Tree Summary ==============

Clock Sinks CTBuffers ClkCells Skew LongestPath TotalDRC BufferArea

-----------------------------------------------------------------------------------

Clk1 36957 1059 1786 0.306 2.139 0 7213.620

Clk2 350 42 71 0.034 1.820 0 108.800

Updating the latencies on clock objects.(*psynopt*)

Information: Latency computed from clock Clk1 will be applied on clock Clk1. (CTS-530)

Information: Updating the latency of clock Clk1 to 1.804981 (max) 0.735533 (min). (CTS-531)

Information: Latency computed from clock Clk2 will be applied on clock Clk2. (CTS-530)

Information: Updating the latency of Clk2 to 2.000936 (max) 0.848920 (min). (CTS-531)

• Check for high-fanout nets

icc_shell> all_high_fanout -nets

1
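The latency values reported by the quick clock tree synthesis run are the ones applied with set_clock_latency before skew_opt is run. As a rough sketch, a hypothetical helper like the one below could pull the max and min latencies out of the "Updating the latency" messages and emit the corresponding commands, rounded to two decimals. It relies only on the message wording shown on this slide.

```python
import re

# Matches "Updating the latency of [clock ]<name> to <max> (max) <min> (min)".
MSG = re.compile(r"Updating the latency of (?:clock )?(\S+) to "
                 r"([\d.]+) \(max\) ([\d.]+) \(min\)")

def latency_commands(log_text):
    """Build set_clock_latency commands from the latency update messages."""
    commands = []
    for clock, lat_max, lat_min in MSG.findall(log_text):
        commands.append(
            f"set_clock_latency -max {float(lat_max):.2f} [get_clocks {clock}]")
        commands.append(
            f"set_clock_latency -min {float(lat_min):.2f} [get_clocks {clock}]")
    return commands

log = """
Information: Updating the latency of clock Clk1 to 1.804981 (max) 0.735533 (min).
Information: Updating the latency of Clk2 to 2.000936 (max) 0.848920 (min).
"""
print("\n".join(latency_commands(log)))
```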

Page 62: usefulskewappnote_v10_1491

Case Study: Preparing the Design for skew_opt (4/4)

• Create path groups (not done for this case study)

group_path -name INPUTS -from [all_inputs] -to [all_registers]

group_path -name OUTPUTS -from [all_registers] -to [all_outputs]

group_path -name REG2REG -from [all_registers] -to [all_registers]

group_path -name FEEDTHROUGH -from [all_inputs] -to [all_outputs]

group_path -name ENABLE -to $enable_pins

Page 63: usefulskewappnote_v10_1491

Case Study: Running skew_opt

• Script to run useful skew flow

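# Apply the clock latencies estimated by the quick clock tree synthesis run
set_clock_latency -max 1.80 [get_clocks Clk1]
set_clock_latency -min 0.74 [get_clocks Clk1]
set_clock_latency -max 2.00 [get_clocks Clk2]
set_clock_latency -min 0.85 [get_clocks Clk2]

# Return the clocks to ideal (non-propagated) mode
remove_propagated_clock [all_fanout -clock]
remove_propagated_clock {*}

# Remove ideal latency and ideal network settings
remove_ideal_latency -all
remove_ideal_network -all

# Estimate parasitics
extract_rc -estimate

# Report QoR before and after skew_opt
report_qor
skew_opt
report_qor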

Page 64: usefulskewappnote_v10_1491

Case Study: Check skew_opt Log

report_qor before skew_opt:

Timing Path Group 'Clk1'
-----------------------------------
Critical Path Slack: -0.35
Critical Path Clk Period: 4.00
Total Negative Slack: -662.24
No. of Violating Paths: 6001.00
No. of Hold Violations: 43068.00

Timing Path Group 'Clk2'
-----------------------------------
Critical Path Slack: 0.87
Critical Path Clk Period: 8.00
Total Negative Slack: 0.00
No. of Violating Paths: 0.00
No. of Hold Violations: 261.00

skew_opt log:

.
.
Setup WNS Setup CNS Hold WNS Hold CNS
--------- --------- -------- --------
-3.525e-01 -7.615e+03 +0.000e+00 +0.000e+00
-2.090e-01 -3.749e+01 +0.000e+00 +0.000e+00

Minimizing latency adjustments:

Setup WNS Setup CNS Hold WNS Hold CNS
--------- --------- -------- --------
-2.090e-01 -3.749e+01 +0.000e+00 +0.000e+00
-2.090e-01 -3.749e+01 +0.000e+00 +0.000e+00

report_qor after skew_opt:

Timing Path Group 'Clk1'
-----------------------------------
Levels of Logic: 32.00
Critical Path Length: 3.87
Critical Path Slack: -0.21
Critical Path Clk Period: 4.00
Total Negative Slack: -7.01
No. of Violating Paths: 1016.00
No. of Hold Violations: 44286.00
-----------------------------------

Timing Path Group 'Clk2'
-----------------------------------
Levels of Logic: 9.00
Critical Path Length: 2.80
Critical Path Slack: 0.87
Critical Path Clk Period: 8.00
Total Negative Slack: 0.00
No. of Violating Paths: 0.00
No. of Hold Violations: 269.00
-----------------------------------

Should correlate: the Critical Path Slack from report_qor before skew_opt (-0.35) and the starting Setup WNS in the skew_opt log (-3.525e-01)

Should correlate: the Critical Path Slack from report_qor after skew_opt (-0.21) and the final Setup WNS in the skew_opt log (-2.090e-01)
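The correlation check on this slide can be automated by reading the critical path slack out of the report_qor text and comparing it with the Setup WNS value from the skew_opt log. The sketch below uses hypothetical helper names and relies only on the report_qor line layout shown here; the 0.01 ns tolerance is an assumed value.

```python
import re

GROUP = re.compile(r"\s*Timing Path Group\s+\W*(\w+)")
FIELD = re.compile(r"\s*([^:]+?):\s*(-?\d+(?:\.\d+)?)\s*$")

def parse_report_qor(text):
    """Return {path_group: {field_name: value}} from report_qor output."""
    groups, current = {}, None
    for line in text.splitlines():
        group = GROUP.match(line)
        if group:
            current = groups.setdefault(group.group(1), {})
            continue
        field = FIELD.match(line)
        if field and current is not None:
            current[field.group(1)] = float(field.group(2))
    return groups

def wns_correlates(qor, skew_opt_wns, tolerance=0.01):
    """Compare the worst slack over all path groups with skew_opt's WNS."""
    worst = min(group["Critical Path Slack"] for group in qor.values())
    return abs(worst - skew_opt_wns) <= tolerance

report = """
Timing Path Group 'Clk1'
-----------------------------------
Critical Path Slack: -0.35
Total Negative Slack: -662.24
Timing Path Group 'Clk2'
-----------------------------------
Critical Path Slack: 0.87
Total Negative Slack: 0.00
"""
print(wns_correlates(parse_report_qor(report), -3.525e-01))  # True
```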

Page 65: usefulskewappnote_v10_1491

Case Study: Running clock_opt and route_opt

• Script to run clock_opt and route_opt

clock_opt -inter_clock_balance

report_qor

set_fix_hold [all_clocks]

route_opt

report_qor

Page 66: usefulskewappnote_v10_1491

Case Study: QoR After clock_opt

Timing Path Group 'Clk1'

-----------------------------------

Levels of Logic: 18.00

Critical Path Length: 3.80

Critical Path Slack: -0.48

Critical Path Clk Period: 4.00

Total Negative Slack: -885.28

No. of Violating Paths: 6494.00

No. of Hold Violations: 38738.00

-----------------------------------

Timing Path Group 'Clk2'

-----------------------------------

Levels of Logic: 9.00

Critical Path Length: 2.95

Critical Path Slack: 0.81

Critical Path Clk Period: 8.00

Total Negative Slack: 0.00

No. of Violating Paths: 0.00

No. of Hold Violations: 274.00

-----------------------------------

Page 67: usefulskewappnote_v10_1491

Case Study: Check Final QoR

Useful skew flow

Timing Path Group 'Clk1'
-----------------------------------
Levels of Logic: 11.00
Critical Path Length: 0.75
Critical Path Slack: -0.58
Critical Path Clk Period: 4.00
Total Negative Slack: -359.50
No. of Violating Paths: 3483.00
No. of Hold Violations: 40152.00
-----------------------------------

Timing Path Group 'Clk2'
-----------------------------------
Levels of Logic: 9.00
Critical Path Length: 2.92
Critical Path Slack: 1.00
Critical Path Clk Period: 8.00
Total Negative Slack: 0.00
No. of Violating Paths: 0.00
No. of Hold Violations: 281.00
-----------------------------------

Baseline flow

Timing Path Group 'Clk1'
-----------------------------------
Levels of Logic: 11.00
Critical Path Length: 0.78
Critical Path Slack: -0.76
Critical Path Clk Period: 4.00
Total Negative Slack: -720.42
No. of Violating Paths: 5113.00
No. of Hold Violations: 41851.00
-----------------------------------

Timing Path Group 'Clk2'
-----------------------------------
Levels of Logic: 9.00
Critical Path Length: 2.93
Critical Path Slack: 0.86
Critical Path Clk Period: 8.00
Total Negative Slack: 0.00
No. of Violating Paths: 0.00
No. of Hold Violations: 282.00
-----------------------------------
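To put the two Clk1 results on this slide side by side, the reduction of each metric relative to the baseline flow can be computed directly. The helper below is a hypothetical illustration; the numbers are the Clk1 values from the two report_qor listings.

```python
def relative_improvement(baseline, useful_skew):
    """Fractional reduction in magnitude relative to the baseline value."""
    return (abs(baseline) - abs(useful_skew)) / abs(baseline)

# Clk1 metrics as (baseline flow, useful skew flow).
clk1 = {
    "Critical Path Slack": (-0.76, -0.58),
    "Total Negative Slack": (-720.42, -359.50),
    "No. of Violating Paths": (5113, 3483),
    "No. of Hold Violations": (41851, 40152),
}

for metric, (baseline, useful_skew) in clk1.items():
    gain = relative_improvement(baseline, useful_skew)
    print(f"{metric}: {gain * 100:.1f}% lower than baseline")
# Critical Path Slack: 23.7% lower than baseline
# Total Negative Slack: 50.1% lower than baseline
# No. of Violating Paths: 31.9% lower than baseline
# No. of Hold Violations: 4.1% lower than baseline
```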

Page 68: usefulskewappnote_v10_1491
