Clock Tree Synthesis

The document discusses clock tree synthesis (CTS). It provides an introduction to CTS, including its objectives and basic terminology. It discusses clock routing algorithms like H-tree and X-tree algorithms. It also covers clock distribution techniques, inputs required for CTS, the general CTS process, and effects of performing CTS. The presentation aims to provide an overview of CTS.

Uploaded by

Mudit Agarwal
Copyright
© All Rights Reserved

Clock Tree Synthesis

October 06, 2012


SmartPlay Overview

“To be a leading service provider of End to End Solutions enabled by Innovative Business Models that provide Value, Quality and Execution excellence to our Customers”

Semiconductor

Digital Analog Wireless Software System Design

World-wide Sales

Common Support Functions (HR/Staffing/Ops/Finance)

Common Infrastructure
2
Confidential
Agenda
• Introduction To CTS
• Objective
• Basic Terminologies
• Clock Routing Algorithms
• Clock Distribution Techniques
• Checklist before doing CTS
• Inputs Required for CTS
• General Steps for CTS
• ICC commands for performing CTS
• Effect of CTS
• Checklist after CTS
• Hands Off
SmartPlay Proprietary & Confidential 3
Introduction to CTS
• In the VLSI flow, CTS is performed after placement and before the routing of signal nets.

Introduction to CTS (cont.)
• The clock is propagated after placement because the exact physical locations of cells and modules are needed for clock propagation, which in turn allows accurate delay and operating-frequency estimates.

• The clock is propagated before signal routing so that the clock router can make optimum use of all routing resources, which leads to minimum skew as well as low dynamic power dissipation.

Introduction to CTS (cont.)
Within most VLSI circuits, data transfer between sequential elements is synchronized by the processing clock.

Before CTS, all clock pins are driven by a single clock source with high fan-out and high load.

Introduction to CTS (cont.)
CTS is the process of inserting buffers/inverters along the clock paths of the ASIC design to balance the clock delay to all clock inputs.

CTS is performed to balance skew and minimize insertion delay.

CTS Goals
• Given a clock source and n sinks.

• Connect all sinks to the clock source by an interconnect network (tree or non-tree) so as to minimize:
  • Clock skew = $\max_{i,j} |t_i - t_j|$, where $t_i$ is the clock arrival time at sink $i$
  • Delay = $\max_i t_i$
  • Power dissipation
  • Total wirelength
  • Noise and coupling effects
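
The first two goals can be evaluated directly from a list of clock arrival times. The sketch below is only an illustration: the function names and the sample arrival times are made up, not taken from any tool.

```python
from itertools import combinations


def clock_skew(arrival_times):
    """Largest |t_i - t_j| over all pairs of sinks (i, j)."""
    return max(abs(a - b) for a, b in combinations(arrival_times, 2))


def clock_delay(arrival_times):
    """Largest arrival time t_i over all sinks."""
    return max(arrival_times)


# Hypothetical arrival times at four sinks, in ns.
arrivals_ns = [0.82, 0.95, 0.88, 0.91]
print("skew  =", clock_skew(arrivals_ns))
print("delay =", clock_delay(arrivals_ns))
```

Because the maximum pairwise difference always occurs between the earliest and the latest sink, the skew can equally be computed as `max(arrival_times) - min(arrival_times)`.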

Clock Skew
Clock skew is the maximum difference in the arrival time of a clock signal at the clock pins of two different sequential elements.

Figure showing both local skew and global skew

Clock Skew (cont.)
There are two types of clock skew:
• Local skew: the difference in the arrival of the clock signal at the clock pins of related flops (flops with a timing path between them) in the same clock domain.

• Global skew: the difference in the arrival of the clock signal at the clock pins of non-related flops in the same clock domain. It is also defined as the difference between the shortest and the longest clock path delay reaching any two sequential elements of the same clock domain in the overall design.
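
A small sketch of the difference between the two, assuming local skew is reported as the worst difference over launch/capture pairs and global skew as longest minus shortest clock path. The flop names, arrival times and pair list are hypothetical.

```python
def global_skew(arrival):
    """Longest minus shortest clock arrival over all flops in the domain."""
    times = list(arrival.values())
    return max(times) - min(times)


def local_skew(arrival, related_pairs):
    """Worst arrival difference over flop pairs that share a timing path."""
    return max(abs(arrival[a] - arrival[b]) for a, b in related_pairs)


# Hypothetical clock arrival times (ns) at four flops of one clock domain.
arrival_ns = {"ff_a": 1.00, "ff_b": 1.04, "ff_c": 1.20, "ff_d": 1.22}
# Assumed launch/capture pairs: only these flops talk to each other.
related_pairs = [("ff_a", "ff_b"), ("ff_c", "ff_d")]

print("global skew =", global_skew(arrival_ns))
print("local skew  =", local_skew(arrival_ns, related_pairs))
```

In this example the global skew is much larger than the local skew, because the flops with the largest arrival difference do not exchange data.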

Clock Skew (cont.)
Clock skew is also classified as +ve and –ve skew:
• Positive skew: the capture clock arrives later than the launch clock. Data and clock travel in the same direction. +ve skew helps setup timing but can lead to hold violations.

• Negative skew: the capture clock arrives earlier than the launch clock. Data and clock travel in opposite directions. -ve skew helps hold timing but can lead to setup violations.

• Beneficial skew: the clock is skewed intentionally to resolve violations.
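
The effect of skew on setup and hold can be checked with the usual single-cycle slack equations. The sketch below assumes skew = capture arrival minus launch arrival; all delay values are invented for illustration.

```python
def setup_slack(period, skew, clk_to_q, comb_max, setup_time):
    """Required time (next capture edge, shifted by skew) minus latest data arrival."""
    return (period + skew) - (clk_to_q + comb_max + setup_time)


def hold_slack(skew, clk_to_q, comb_min, hold_time):
    """Earliest data arrival minus hold requirement at the (skewed) capture edge."""
    return (clk_to_q + comb_min) - (hold_time + skew)


# Hypothetical path: 2.0 ns period, 0.2 ns clk-to-q, 1.7 ns / 0.1 ns max / min logic delay.
for skew in (-0.15, 0.0, 0.15, 0.30):
    s = setup_slack(2.0, skew, clk_to_q=0.2, comb_max=1.7, setup_time=0.2)
    h = hold_slack(skew, clk_to_q=0.2, comb_min=0.1, hold_time=0.1)
    print(f"skew={skew:+.2f} ns  setup slack={s:+.2f} ns  hold slack={h:+.2f} ns")
```

Increasing the skew raises the setup slack and lowers the hold slack by the same amount, which is why a small positive skew can fix the setup violation on this path while a larger one turns it into a hold violation.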

Figure showing both +ve skew and –ve skew

Clock Latency
• It is the delay that is assumed to exist between the clock source and the flip-flop clock pin during the pre-CTS stage.

• It is used before clock routing, when the clock is ideal.

• It is not the actual delay but a delay specified by the user to account for the clock delay that will exist once the clock tree has been built and routed.

• The timing analyzer uses this information to determine clock arrival times in the absence of propagated clocking, i.e. during pre-CTS.

Clock Latency (cont.)
There are two terms associated with latency:
• Source latency: the time taken by the clock signal to propagate from its ideal waveform origin point to the clock definition point in the design.
• Network latency: the time taken by the clock signal to propagate from the clock definition point in the design to the clock pin of the sequential device.
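
As a rough illustration of how the two terms combine for an ideal (pre-CTS) clock, the sketch below adds them to an ideal edge time to get the assumed arrival at a flop clock pin. The class name and the numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class IdealClock:
    """Pre-CTS clock model: latencies are user estimates, not measured delays."""
    source_latency_ns: float   # waveform origin -> clock definition point
    network_latency_ns: float  # clock definition point -> flop clock pin

    def arrival_at_sink(self, edge_time_ns=0.0):
        """Assumed arrival of an ideal clock edge at every sink."""
        return edge_time_ns + self.source_latency_ns + self.network_latency_ns


clk = IdealClock(source_latency_ns=0.5, network_latency_ns=1.2)
print(clk.arrival_at_sink())     # edge launched at 0 ns
print(clk.arrival_at_sink(2.0))  # edge launched at 2 ns
```

With an ideal clock every sink sees the same arrival time, so the skew is zero by construction until the clock is propagated.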

Figure showing source latency and network latency

Insertion Delay
Once CTS is complete (post-CTS), the actual delay from the clock source point to each clock sink point can be calculated. These delays are typically called insertion delays.

Uncertainty
To be written

Jitter
To be written

Clock-gating
• The clock tree consumes more than 50% of dynamic power.
• So the clock is turned off when it is not needed, using clock-gating cells.

• There are two clock-gating styles available:
1) Latch-based clock gating
2) Latch-free clock gating

Latch-free Clock-gating
• It uses a simple AND or OR gate.
• The gated clock output can terminate prematurely or generate multiple clock pulses if the enable signal changes while the clock is in its active phase.
• This restriction makes it inappropriate for single-clock flip-flop-based designs.

Figure: latch-free clock gating

Latch-based Clock-gating
• This style adds a level-sensitive latch to the design to hold the enable signal from the active edge of the clock until the inactive edge of the clock.
• Since the latch captures the state of the enable signal and holds it until the complete clock pulse has been generated, the enable signal need only be stable around the rising edge of the clock.
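
The difference between the two gating styles can be seen in a small behavioural model. The sketch assumes an AND-type gate with an active-high clock and a latch that is transparent while the clock is low; the waveforms are sampled twice per clock phase and are made up for the example.

```python
def latch_free_gate(clk, en):
    """Plain AND gate: any change of `en` while clk is high reaches the output."""
    return [c & e for c, e in zip(clk, en)]


def latch_based_gate(clk, en):
    """AND gate fed by a latch that is transparent while clk is low."""
    held = 0
    gated = []
    for c, e in zip(clk, en):
        if c == 0:
            held = e  # latch follows enable only during the inactive phase
        gated.append(c & held)
    return gated


# Two samples per clock phase; enable changes in the middle of high phases.
clk = [0, 0, 1, 1, 0, 0, 1, 1, 0, 0, 1, 1]
en  = [1, 1, 1, 0, 0, 0, 0, 1, 1, 1, 1, 1]

print("latch-free :", latch_free_gate(clk, en))
print("latch-based:", latch_based_gate(clk, en))
```

In the latch-free output the first pulse is cut short and a late partial pulse appears in the second cycle, while the latch-based output contains only complete pulses or none.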

Clock Routing Algorithms
• How to minimize skew:
Distribute the clock signal in such a way that the interconnections carrying the clock signal to the functional sub-blocks are equal in length.

• Several clock routing algorithms exist that try to achieve this goal:
  • H-tree based algorithm
  • X-tree based algorithm
  • MMM algorithm
  • Fish-bone algorithm

H-Tree Clock Routing

H-tree Algorithm
Minimizes skew by making the interconnections to sequential elements equal in length
• Symmetric pattern
• The skew is 0, assuming delay is directly proportional to wire length

Can be used when terminals are evenly distributed
• However, this is never the case in practice (due to blockages and so on)
• So strict (pure) H-trees are rarely used
• However, they are still popular for top-level clock network design
• It uses a lot of routing resources.
• Power dissipation is also high.
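
The equal-length property can be checked with a short recursive construction. This is a minimal sketch of an idealized H-tree on an empty die (no blockages); the function name, the coordinates and the rule of halving the segment length after every horizontal/vertical pair are assumptions made for the example.

```python
def h_tree(x, y, length, levels, horizontal=True, path=0.0):
    """Return (sink_x, sink_y, wire_length_from_root) for every leaf of the tree."""
    if levels == 0:
        return [(x, y, path)]
    dx, dy = (length, 0.0) if horizontal else (0.0, length)
    # Halve the segment length after each horizontal + vertical pair.
    next_length = length if horizontal else length / 2
    sinks = []
    for sign in (-1, 1):
        sinks += h_tree(x + sign * dx, y + sign * dy, next_length,
                        levels - 1, not horizontal, path + length)
    return sinks


sinks = h_tree(0.0, 0.0, length=4.0, levels=4)
print(len(sinks), "sinks")
print("distinct root-to-sink wire lengths:", {s[2] for s in sinks})
```

Every sink ends up at the same wire length from the root, so under the slide's assumption that delay is proportional to wire length the skew is zero.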

X-tree Algorithm
An alternative tree structure with a smaller delay

• Assumes non-rectilinear routing is possible

Although apparently better than the H-tree, it may cause crosstalk due to the close proximity of wires.

Like H-trees, it is applicable only to very special structures

Not applicable in general

Method of Means and Medians (MMM)
Follows a strategy very similar to the H-tree.

Recursively partition the terminals into two sets of equal size (median). Then connect the center of mass of the whole circuit to the centers of mass of the two sub-circuits (mean).

Clock skew is only minimized heuristically. The resulting tree may not have zero skew.

The basic algorithm ignores blockages and produces a non-rectilinear tree. Some wires may also intersect.
• In the second phase, each wire can be converted so that it consists only of rectilinear segments and avoids blockages.
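
The first phase described above can be sketched in a few lines. The code below is an illustration only: it alternates the cut direction between x and y (an assumption of this sketch), ignores blockages, and returns straight, possibly non-rectilinear segments.

```python
def center_of_mass(points):
    """Mean x and mean y of a set of terminals."""
    n = len(points)
    return (sum(x for x, _ in points) / n, sum(y for _, y in points) / n)


def mmm(points, split_on_x=True):
    """Return wire segments ((x1, y1), (x2, y2)) of the basic MMM tree."""
    if len(points) <= 1:
        return []
    axis = 0 if split_on_x else 1
    ordered = sorted(points, key=lambda p: p[axis])
    mid = len(ordered) // 2              # median split into two halves
    root = center_of_mass(ordered)
    segments = []
    for half in (ordered[:mid], ordered[mid:]):
        segments.append((root, center_of_mass(half)))  # mean-to-mean wire
        segments += mmm(half, not split_on_x)
    return segments


# Hypothetical terminals at the corners of a 2 x 2 square.
terminals = [(0, 0), (0, 2), (2, 0), (2, 2)]
for seg in mmm(terminals):
    print(seg)
```

For this symmetric input the result looks like a one-level H-tree; for irregular terminal placements the wire lengths differ from sink to sink, which is why the skew is only minimized heuristically.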

Fish-Bone Algorithm
• The clock driver drives all the clock pins directly.

• Skew is caused by differing interconnect lengths and loads.

• If the clock driver delay is much larger than the interconnect delays, the skew will be minimal but the insertion delay will be large.

Figure: implementation of the fish-bone algorithm in a design

Conventional CTS Distribution
It is the most widely used approach for dealing with design complexity.
The buffer and clock-gating levels are very deep.
Most of the sinks in the design share very little of their path back to the clock root.

The impact of on-chip variation (OCV) is very high.

Clock-Mesh Distribution
It has extremely shallow logic depth below the mesh, usually just a single buffer or clock gate directly driving the sinks.

It has a large shared path from the clock root to the mesh.

The impact of on-chip variation is minimal.

It uses a very dense mesh fabric.

Ultra-low skew values can be achieved.

Clock-Mesh Distribution (cont.)
It exhibits high power dissipation.

The design logic attached to the mesh fabric is divided into relatively small bins, each containing a cluster or sub-cluster amount of logic. Within a bin, the clock can be connected to the logic by a fish-bone or comb structure.

It is not well suited to designs with RAMs, ROMs and other hard blockages.

Figure: clock routing in a sub-cluster by fish-bone. The dark black net is the clock mesh.

Multi-Source CTS Distribution
It has a moderate depth for both buffer and clock-gating levels.
The multiple clock sources are located at the bottom of the mesh grid, and all the structure above the mesh forms a shared path back to the root clock buffer.

The impact of on-chip variation is greater than for a clock mesh but less than for conventional CTS.

Multi-Source CTS Distribution (cont.)
The mesh fabric is one or two orders of magnitude less dense than that of clock-mesh distribution.

Its power dissipation is the same as that of conventional CTS. It allows greater clock-gating depth, thus saving more power.

It offers much larger logic groupings that are themselves small clock trees, so each logic grouping can have its own clock tree structure.

Checklist before doing CTS
Placement – Completed

Power ground nets – Pre-routed

Estimated congestion – Acceptable

Estimated timing – Acceptable (setup slack should be ~0 ns)

Estimated max transition/capacitance – No violations

Inputs Required for CTS
Detailed placement database

Targets for latency and skew, if specified

Buffers/inverters for building the clock tree

Clock tree DRC (max transition, max capacitance, max fanout, number of buffer levels)

Steps used by CTS Algorithms
Create virtual clusters by identifying leaf cells that are in close proximity to each other.

Leaf cells that are far from any cluster are moved to the nearest cluster.

The number of leaf cells per cluster is user-defined.

Once the clusters and their locations are determined, buffer insertion begins such that the clock propagation delay to each cluster is equal and the clock skew within each cluster is minimized.

The smaller the cluster, the lower the skew, but more clock buffering levels are required.
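
The clustering step can be illustrated with a simple greedy grouping. This is only a stand-in and not how any particular CTS tool works: the `radius` parameter, the greedy order and the sample coordinates are all assumptions of the sketch.

```python
import math


def centroid(points):
    n = len(points)
    return (sum(x for x, _ in points) / n, sum(y for _, y in points) / n)


def distance(a, b):
    return math.hypot(a[0] - b[0], a[1] - b[1])


def cluster_sinks(sinks, radius, max_size):
    """Group nearby leaf cells, then fold isolated ones into the nearest cluster."""
    clusters = []
    for sink in sinks:
        near = [c for c in clusters
                if len(c) < max_size and distance(sink, centroid(c)) <= radius]
        if near:
            min(near, key=lambda c: distance(sink, centroid(c))).append(sink)
        else:
            clusters.append([sink])
    # Second pass: leaf cells left alone join the nearest cluster that has room.
    groups = [c for c in clusters if len(c) > 1]
    for lone in (c for c in clusters if len(c) == 1):
        with_room = [c for c in groups if len(c) < max_size]
        if with_room:
            min(with_room, key=lambda c: distance(lone[0], centroid(c))).append(lone[0])
        else:
            groups.append(lone)
    return groups


# Hypothetical leaf-cell locations: two tight groups and one outlier at (5, 5).
leaf_cells = [(0, 0), (1, 0), (0, 1), (10, 10), (11, 10), (5, 5)]
for cluster in cluster_sinks(leaf_cells, radius=2.0, max_size=4):
    print(cluster, "buffer near", centroid(cluster))
```

Lowering `max_size` or `radius` gives more, smaller clusters, which matches the trade-off above: less skew inside each cluster but more buffers and buffering levels above them.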
ICC commands for performing CTS
As explained in text file

Effect of CTS
• Clock buffers are added

• Congestion may increase

• Non-clock cells may be moved to non-ideal locations

• Timing and max cap/tran violations can be introduced

Checklist After CTS
• Skew report

• Clock tree report

• Timing report for setup and hold

• Power and area report

Output of CTS
• Database with a properly built clock tree in the design

Any Questions

Thank You
