5. Clock Tree Synthesis (CTS)
Jaidev Kaushik
Azenda
1. Need of CTS
2. Clock Terminology
6. CTS flow
ASIC PD Flow
What is CTS?
Deliver the clock to all sequential elements at almost the same time, with
proper buffering, meeting the given constraints (skew, insertion delay)
without DRC violations (max transition, max capacitance).
In the VLSI flow, CTS is performed after placement and before the
routing of signal nets.
[Figure: CLK distributed to the CK pins of D flip-flops]
Why CTS ?
Within most VLSI circuits, data
transfer between sequential elements is
synchronized by the processing clock.
CTS Terminology
• Clock Skew (Global Skew, Local Skew, Positive/Negative Skew, Useful Skew)
• No. of Levels
Clock Skew
Clock skew is the difference in the arrival time of a clock signal at the clock pins of two
different sequential elements. Global skew is the maximum such difference across all sinks
of the clock.
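The skew terms above can be illustrated with a small calculation. The sketch below is only a toy model: the arrival times, flop names, and launch/capture pairs are made-up values, and the sign convention (positive skew when the capture clock arrives later than the launch clock) is the commonly used one, assumed here rather than taken from these slides.

```python
# Hypothetical clock arrival times (ns) at flip-flop clock pins.
arrival_ns = {"ff_a": 1.02, "ff_b": 1.10, "ff_c": 0.97, "ff_d": 1.05}

# Hypothetical launch -> capture pairs that share a data path.
timing_paths = [("ff_a", "ff_b"), ("ff_c", "ff_d")]


def global_skew(arrivals):
    """Largest arrival difference between any two sinks of the clock."""
    return max(arrivals.values()) - min(arrivals.values())


def path_skew(arrivals, launch, capture):
    """Signed skew of one path: positive if the capture clock is later."""
    return arrivals[capture] - arrivals[launch]


def local_skew(arrivals, paths):
    """Largest skew magnitude among sinks that actually talk to each other."""
    return max(abs(path_skew(arrivals, l, c)) for l, c in paths)


print(f"global skew: {global_skew(arrival_ns):.2f} ns")
print(f"local skew:  {local_skew(arrival_ns, timing_paths):.2f} ns")
```

Local skew is never larger than global skew, because it only looks at a subset of the sink pairs.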
Source Latency: It is the time taken by the clock signal to propagate from its ideal waveform
origin point to the clock definition point in the design.
Network Latency: It is the time taken by the clock signal to propagate from the clock
definition point in the design to the clock pin of the sequential device.
Pre-CTS, clock latency is the delay that is assumed to exist between the clock source and
the flip-flop clock pin.
It is not the actual delay, but a delay specified by the user to account for the clock
delay that will exist once the clock tree has been built and routed.
The timing analyzer uses this information to determine clock arrival times in the
absence of propagated clocking, i.e. during pre-CTS.
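As a rough illustration of how an ideal (pre-CTS) clock arrival is formed, the sketch below adds a source latency and a network latency to the clock edge time. The latency values and flop names are hypothetical. Because the same assumed latency is applied to every sink, the skew in this ideal model comes out as zero.

```python
# Assumed (user-specified) latencies in ns; hypothetical values.
SOURCE_LATENCY_NS = 0.4   # ideal waveform origin -> clock definition point
NETWORK_LATENCY_NS = 0.8  # clock definition point -> flip-flop clock pin


def ideal_clock_arrival(edge_time_ns, source_ns, network_ns):
    """Arrival of a clock edge at a sink when the clock is still ideal."""
    return edge_time_ns + source_ns + network_ns


sinks = ["ff_a", "ff_b", "ff_c"]  # hypothetical flip-flops
arrivals = {
    s: ideal_clock_arrival(0.0, SOURCE_LATENCY_NS, NETWORK_LATENCY_NS)
    for s in sinks
}

# Every sink sees the same assumed latency, so ideal skew is zero.
ideal_skew = max(arrivals.values()) - min(arrivals.values())
print(arrivals, ideal_skew)
```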
Clock Latency in SDC
Here are some example commands that specify source and network latencies.
# Specify a network latency (no -source option) of 0.8ns for
# rise, fall, max and min:
set_clock_latency 0.8 [get_clocks CLK_CONFIG]
# Specify a source latency:
set_clock_latency 1.9 -source [get_clocks SYS_CLK]
Actual : Clock Insertion Delay
Once CTS is completed, i.e. in the post-CTS design, the actual delay from the clock source point to
the clock sink points can be calculated. These are typically called (actual) clock insertion
delays at those points.
Until this point, we normally use set_clock_latency in the SDC (an assumed value) to account for the
clock insertion delay.
Technically, the clock latency in the SDC and the actual clock insertion delay describe the same
quantity. The difference is that the estimated latency is hardcoded to one value for all sequential
elements, whereas the actual insertion delays of the sequential elements are likely to differ
slightly from one another.
Timing : Ideal Clock to Propagated Clock
Common Path
Read CTS SDC: the clock tree begins at the SDC-defined clock pin and ends at the stop pins
of the flops
Route Clock Tree (optional; can also be done during signal net routing)
# Define clock buffers and inverters
set_ccopt_property buffer_cells {CLKBUFX4 CLKBUFX8 CLKBUFX16 PBUFX2}
set_ccopt_property inverter_cells {CLKINVX4 CLKINVX8 CLKINVX16 PINVX1}
# Define preferred metal layer
setNanoRouteMode -quiet -routeTopRoutingLayer 9
ccopt_check_prerequisites
set_ccopt_property max_fanout 50
create_ccopt_clock_tree_spec -immediate
#setDelayCalMode -engine aae
# Create clock tree
ccopt_design
############################################
# postCTS optimization (hold)
############################################
setDontUse DLY* false
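The max_fanout setting used in the flow also bounds how shallow the tree can be, which relates to the "No. of Levels" term. The sketch below computes an idealized lower bound on the number of driver levels for a given sink count. It is a back-of-the-envelope model, not what the tool computes: it ignores transition, capacitance, and placement, and the function name is made up for illustration.

```python
def min_driver_levels(num_sinks, max_fanout):
    """Smallest number of driver levels (root driver included) such that
    no driver exceeds max_fanout. Idealized lower bound only."""
    if num_sinks < 1 or max_fanout < 2:
        raise ValueError("need at least 1 sink and a fanout limit of 2+")
    levels = 1
    reachable = max_fanout  # sinks reachable with the current depth
    while reachable < num_sinks:
        reachable *= max_fanout
        levels += 1
    return levels


for sinks in (50, 51, 2500, 2501, 100000):
    print(sinks, "sinks ->", min_driver_levels(sinks, 50), "level(s)")
```

A real clock tree is usually deeper than this bound, since drivers are also limited by load and slew.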
Why Clock Buffers (Inverter Pairs)?
Non-clock cells may be moved to non-ideal locations (to be optimized later)
Skew Reports
H-tree: because of its balanced construction, it is easy to reduce clock skew in the
H-tree clock structure.
A disadvantage of this approach is that the fixed clock plan makes it
difficult to adjust register placement. It is rigid when fine-tuning the clock tree.
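The balanced construction can be seen in a small geometric sketch. The function below recursively builds an idealized H-tree and records the Manhattan wire length from the root to every sink. The coordinates, span, and function name are illustrative assumptions; real H-trees also have to deal with buffers, blockages, and unequal loads.

```python
def h_tree_sinks(x, y, half_span, levels, dist=0.0):
    """Return (sink_x, sink_y, wire_length_from_root) for an ideal H-tree.

    Each level draws one 'H': a horizontal bar from the center to the left
    and right, then vertical strokes up and down to four new centers.
    """
    if levels == 0:
        return [(x, y, dist)]
    sinks = []
    for dx in (-half_span, half_span):
        for dy in (-half_span, half_span):
            # Center -> corner costs half_span horizontally + half_span vertically.
            sinks += h_tree_sinks(x + dx, y + dy, half_span / 2,
                                  levels - 1, dist + 2 * half_span)
    return sinks


sinks = h_tree_sinks(0.0, 0.0, half_span=4.0, levels=2)
lengths = {length for _, _, length in sinks}
print(len(sinks), "sinks, distinct root-to-sink wire lengths:", lengths)
```

Every sink ends up at the same wire length from the root, which is why skew is easy to control, and also why the structure leaves little freedom once the plan is fixed.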
Conventional CTS Distribution
It is the most widely used approach for dealing with design complexity.
The tree can be very deep in both buffer and clock-gating levels. Most of the sinks in the design
share very little of their path back to the clock root.
[Figure: Clock distributed to a row of flip-flops (FF)]
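To make the idea of a shared (common) path concrete, the sketch below walks two sinks back to the clock root and returns the part of the path they have in common. The tree and all instance names are hypothetical.

```python
# Hypothetical clock tree as a child -> driver map.
parent = {
    "buf_1": "clk_root",
    "buf_2": "clk_root",
    "icg_1": "buf_1",
    "ff_a": "icg_1",
    "ff_b": "icg_1",
    "ff_c": "buf_2",
}


def path_from_root(node):
    """List of nodes from the clock root down to the given node."""
    path = [node]
    while path[-1] in parent:
        path.append(parent[path[-1]])
    return path[::-1]


def common_path(sink_a, sink_b):
    """Nodes shared by both root-to-sink paths, starting at the root."""
    shared = []
    for node_a, node_b in zip(path_from_root(sink_a), path_from_root(sink_b)):
        if node_a != node_b:
            break
        shared.append(node_a)
    return shared


print(common_path("ff_a", "ff_b"))  # share the path down to icg_1
print(common_path("ff_a", "ff_c"))  # share only the root
```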
Clock Mesh Distribution
The spine tree (fishbone) arrangement makes it easy to reduce the skew, but it is
heavily influenced by process parameters and may have problems with phase delay.
Clock Gating
Common technique for reducing clock power by shutting off the clock
to modules by a clock enable signal.
Suppose you gate the clock with a plain AND gate. The rising edge of EN
may come at any time and may not coincide with a clock edge. In that
case the output of the AND gate will be 1 for less time than the
clock's high phase, and you end up with a glitch in your clock
signal.
To avoid this, a special kind of clock gating cell is used that
synchronizes EN with a clock edge. These are called integrated
clock gating cells, or ICGs.
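The glitch described above can be reproduced with a tiny discrete-time model. In this sketch the clock period, the enable timing, and the helper names are all arbitrary choices for illustration: EN rises in the middle of a clock high phase, and the AND output shows one shortened pulse.

```python
def clock(t, period=10):
    """Square wave: high for the first half of each period."""
    return 1 if (t % period) < period // 2 else 0


def pulse_widths(wave):
    """Lengths of consecutive runs of 1s in a sampled waveform."""
    widths, run = [], 0
    for value in wave:
        if value:
            run += 1
        elif run:
            widths.append(run)
            run = 0
    if run:
        widths.append(run)
    return widths


times = range(30)
clk = [clock(t) for t in times]
# EN rises at t=3, in the middle of the first clock high phase.
en = [1 if 3 <= t < 20 else 0 for t in times]
gated = [c & e for c, e in zip(clk, en)]

print("clock pulses:", pulse_widths(clk))    # [5, 5, 5]
print("gated pulses:", pulse_widths(gated))  # [2, 5]
```

The first gated pulse is only 2 time steps wide instead of 5, which is the kind of runt pulse that can violate minimum pulse width requirements at the flops.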
Clock Gating
So we turn off the clock when it is not needed by using clock-gating cells.
There are two clock gating styles available: latch-free and latch-based.
In the latch-free style, the gated clock output can terminate prematurely or generate multiple
clock pulses if the enable changes while the clock is active.
This restriction makes it inappropriate for single-clock flip-flop based designs.
Integrated Clock Gating Cell (ICG)
This style adds a level-sensitive latch to the design to hold the enable signal from
the active edge of the clock until the inactive edge of the clock.
[Figures: Using AND Gate with High EN; Using OR Gate with High EN]
Latch Based Clock Gating
This style adds a level-sensitive latch to the design to hold the enable signal from the
active edge of the clock until the inactive edge of the clock.
Since the latch captures the state of the enable signal and holds it until the complete
clock pulse has been generated, the enable signal need only be stable around the
rising edge of the clock.
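A minimal behavioral sketch of the latch-based style follows, assuming an AND-type gate with a latch that is transparent while the clock is low. The timing values and function names are illustrative only, and a plain AND gate is included for comparison. The enable changes in the middle of a clock high phase twice, yet the latch-based output contains only full-width pulses.

```python
def clock(t, period=10):
    """Square wave: high for the first half of each period."""
    return 1 if (t % period) < period // 2 else 0


def pulse_widths(wave):
    """Lengths of consecutive runs of 1s in a sampled waveform."""
    widths, run = [], 0
    for value in wave + [0]:  # trailing 0 flushes the last run
        if value:
            run += 1
        elif run:
            widths.append(run)
            run = 0
    return widths


def latch_based_gate(clk, en):
    """Latch is transparent while clk is low and holds while clk is high."""
    held_en, gated = 0, []
    for c, e in zip(clk, en):
        if c == 0:
            held_en = e  # latch follows EN during the inactive phase
        gated.append(c & held_en)
    return gated


times = range(40)
clk = [clock(t) for t in times]
# EN rises at t=3 and falls at t=23, both during a clock high phase.
en = [1 if 3 <= t < 23 else 0 for t in times]

plain_and = [c & e for c, e in zip(clk, en)]
latched = latch_based_gate(clk, en)

print("plain AND pulses:  ", pulse_widths(plain_and))  # [2, 5, 3]
print("latch-based pulses:", pulse_widths(latched))    # [5, 5]
```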
Project Challenges
●While creating the clock spec, some nets had a fanout of more than 500, which means
ideal networks were still present after placement.
●We updated the SDC file and made those nets non-ideal.
●In the initial netlist, the clock input transition was too slow (0.800, while the target was
0.889). So we set it to 0.890 in all SDC files.
●In the updated netlist, it was fixed automatically.
●dont_touch cells and dont_touch networks were present in the design. We found
a Tcl script and ran it to remove all dont_touch cells and dont_touch networks:
●check_dont_touch_clock_tree_nets
●remove_dont_touch_clock_tree_nets
●check_cts_clock_cells
●remove_dont_touch_clock_cells
●To check all clock spec requirements, we used the command
ccopt_check_prerequisites
Clock Exceptions
1. Nonstop pins: Nonstop pins are pins that would normally be considered endpoints of the
clock tree, but ICC traces through them to find the clock tree endpoints. The clock pins of
sequential cells driving generated clocks are implicit nonstop pins. In addition, ICC supports
user-defined (explicit) nonstop pins.
2. Exclude pins: Exclude pins are clock tree endpoints that are excluded from clock tree timing
calculations and optimizations. ICC uses exclude pins only in calculations and optimizations for
design rule constraints.
3. Float pins: Float pins are clock pins that have special insertion delay requirements. ICC adds
the float pin delay (positive or negative) to the calculated insertion delay up to this pin.
4. Stop pins: A stop pin is an explicitly specified end pin of a clock tree. Unlike default clock
sinks, a stop pin can be the input pin of a non-sequential cell. Clock tree synthesis treats a stop pin
as a clock sink.
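The effect of float and exclude pins on balancing can be sketched with a simplified model. This is not how ICC implements it internally; the pin names, delays, and data layout are hypothetical. The float pin delay is added to the calculated insertion delay, and exclude pins are left out of the skew calculation.

```python
# Hypothetical clock tree endpoints with calculated insertion delays (ns).
endpoints = [
    {"pin": "ff_a/CK", "insertion_ns": 1.00, "kind": "sink"},
    {"pin": "ff_b/CK", "insertion_ns": 1.04, "kind": "sink"},
    # Float pin: assumed 0.30 ns of extra clock delay inside the macro.
    {"pin": "macro/CLK", "insertion_ns": 0.70, "kind": "float", "float_ns": 0.30},
    # Exclude pin: not balanced, so its short delay does not count as skew.
    {"pin": "test_out/A", "insertion_ns": 0.40, "kind": "exclude"},
]


def effective_insertion(endpoint):
    """Calculated insertion delay plus the float pin delay, if any."""
    return endpoint["insertion_ns"] + endpoint.get("float_ns", 0.0)


def balanced_skew(points):
    """Skew over all endpoints that take part in timing balancing."""
    delays = [effective_insertion(p) for p in points if p["kind"] != "exclude"]
    return max(delays) - min(delays)


print(f"skew seen by balancing: {balanced_skew(endpoints):.2f} ns")
```

Without the float delay and the exclusion, the same endpoints would appear to have 0.64 ns of skew instead of 0.04 ns.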
Few more Terms
On Chip Variations (OCV)
Timing Derates
Cross Talk
Thank You