ClockGating Cts

The document discusses clock gating techniques to reduce power consumption. Clock trees can consume over 50% of dynamic power. There are two main types of clock gating styles - latch-based and latch-free. Latch-based clock gating uses a latch to ensure a clean clock signal is delivered to flip-flops, making it better suited for single-clock flip-flop designs. Latch-free clock gating directly gates the clock with combinational logic and may prematurely truncate the clock or generate extra pulses, so it is less suitable. RTL clock gating identifies groups of flip-flops that can be gated together using a common enable signal to turn off their clock power when inactive.

Uploaded by

Srikanth Reddy Sarabudla

Clock Gating

Clock trees can consume more than 50% of a design's dynamic power. The components of this power are:

1) Power consumed by combinatorial logic whose values are changing on each clock edge

2) Power consumed by flip-flops and

3) The power consumed by the clock buffer tree in the design.

It is good design practice to turn off the clock when it is not needed. Modern EDA tools support
automatic clock gating: they identify the circuits where clock gating can be inserted.

RTL clock gating works by identifying groups of flip-flops which share a common enable control signal.
Traditional methodologies use this enable term to control the select of a multiplexer connected to the D
port of the flip-flop, or to control the clock enable pin on a flip-flop with clock enable capability. RTL
clock gating instead uses this enable signal to control a clock gating circuit which is connected to the
clock ports of all of the flip-flops with the common enable term. Therefore, if a bank of flip-flops which
share a common enable term has RTL clock gating implemented, the clock pins of those flip-flops stop
toggling and the flip-flops consume essentially no dynamic power as long as this enable signal is false.
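
The grouping step described above can be sketched in a few lines of Python. This is only an illustration of the idea, not how any particular EDA tool implements it: the register tuples, the `group_by_enable` helper and the `min_bank_size` threshold are all assumptions introduced here.

```python
from collections import defaultdict


def group_by_enable(registers, min_bank_size=2):
    """Group flip-flops that share the same clock and enable signal.

    registers: iterable of (flop_name, clock_name, enable_name) tuples;
    enable_name is None for flops that have no enable term.
    min_bank_size is a hypothetical threshold: smaller banks stay ungated.
    """
    banks = defaultdict(list)
    for flop, clock, enable in registers:
        if enable is not None:
            banks[(clock, enable)].append(flop)
    # Keep only banks large enough to be worth one clock gating circuit.
    return {key: flops for key, flops in banks.items() if len(flops) >= min_bank_size}


# Made-up register list, for illustration only.
registers = [
    ("data_reg[0]", "clk", "load_en"),
    ("data_reg[1]", "clk", "load_en"),
    ("data_reg[2]", "clk", "load_en"),
    ("status_reg", "clk", "irq_en"),
    ("count_reg", "clk", None),
]

gated_banks = group_by_enable(registers)
for (clock, enable), flops in gated_banks.items():
    print(f"one clock gate on {clock}, enabled by {enable}, drives {flops}")
```

Each returned bank corresponds to one clock gating circuit whose output feeds the clock ports of every flip-flop in that bank.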

There are two types of clock gating styles available. They are:

1) Latch-based clock gating

2) Latch-free clock gating.

Target Skew
Target skew is the skew value that the CTS engine tries to achieve while building a balanced clock tree.
This post discusses the factors that drive the choice of target skew for a design and how those factors affect the design's QoR.
Designers tend to aim for zero skew and a perfectly balanced clock tree, but zero skew is not good for the design overall. Why? Think in terms of latency, buffer count, dynamic power and congestion.
For zero skew, the overall latency of the design increases, because more clock buffers/inverters are needed to balance the flops. This can lengthen the uncommon clock path (which is more prone to OCV) and lead to high dynamic power dissipation, since all the flops and buffers toggle at the same time. Each clock net also takes double the routing resources because of the NDR settings applied to clock nets, so congestion increases as smaller skews are targeted.
As technology shrinks, closing timing across corners becomes more critical. Skew has a direct impact on setup and hold. The main motive for targeting zero skew is hold timing across all corners. With an optimal choice of skew number (target skew), we can reduce clock power consumption and clock buffer/inverter count, and significantly reduce congestion.

How do we analyze the target skew value?

Choosing the target skew takes multiple experiments: build the clock tree with a given target skew while keeping the constraints (SDC) constant, then compare the different skew numbers on latency, power and congestion.
The hold timing condition for a flop-to-flop path is:

Figure 1: Timing Path

Tck->q + Tcomb > TSkew + THold

We can re-write this as,

TSkew < Tck->q + Tcomb - THold
For the worst-case hold scenario, suppose Tcomb = 0 (flops sitting very close together, with no logic in the path).
The equation above then becomes
TSkew < Tck->q - THold
Assume the flop delay in the worst case is 100 ps and the hold time is 30 ps. Then
TSkew < 70 ps
which means there is scope for up to ~70 ps of skew without degrading hold timing in the worst case.
So we run multiple iterations, setting the target skew in a range of ±30 ps, and analyse each run for the
factors above as well as for timing (NFE) in all the corners.

NFE -> No. of failing endpoints
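
The skew budget worked out above can be reproduced with a small Python sketch. The function names are illustrative assumptions; the numbers are the 100 ps flop delay and 30 ps hold time used in the text.

```python
def max_hold_safe_skew(t_clk_to_q_ps, t_hold_ps, t_comb_ps=0.0):
    """Upper bound on skew from TSkew < Tck->q + Tcomb - THold."""
    return t_clk_to_q_ps + t_comb_ps - t_hold_ps


def hold_is_met(t_skew_ps, t_clk_to_q_ps, t_hold_ps, t_comb_ps=0.0):
    """True when the given skew satisfies the hold condition (strict inequality)."""
    return t_skew_ps < max_hold_safe_skew(t_clk_to_q_ps, t_hold_ps, t_comb_ps)


# Worst case from the text: Tcomb = 0, Tck->q = 100 ps, THold = 30 ps.
budget_ps = max_hold_safe_skew(t_clk_to_q_ps=100.0, t_hold_ps=30.0)
print(f"hold-safe skew budget: {budget_ps} ps")

# Candidate target skews to check against the budget (values chosen for illustration).
for target_ps in (10.0, 30.0, 90.0):
    print(target_ps, hold_is_met(target_ps, t_clk_to_q_ps=100.0, t_hold_ps=30.0))
```

This only checks the hold budget for one path; the latency, power, congestion and NFE comparison across corners still has to come from the actual CTS experiments.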

Temperature Inversion
Temperature inversion is a phenomenon seen at lower (smaller) technology nodes in which the delay
of a cell decreases as temperature rises, contrary to the behaviour at higher (larger) nodes, where
delay increases with temperature.
Let's unfold this.
Consider the MOSFET drive current equation (saturation region):

ID = (1/2) * u * Cox * (W/L) * (VGS - Vt)^2

So ID is proportional to the mobility u and to the square of the overdrive voltage (VGS - Vt). We can
conclude that the delay of a cell depends on two factors: the mobility and the threshold voltage (Vt)
of the transistor.

How mobility & Vt depend upon temperature

As temperature rises, the atoms of the silicon lattice vibrate more and scatter the charge carriers
more often, so carrier mobility decreases and the delay of a cell increases.

The threshold voltage also decreases as temperature rises: the number of minority carriers in the
substrate increases, so a lower gate voltage than usual is enough to form the channel.

To summarise, an increase in temperature makes the delay of a cell

• decrease, due to the decrease in threshold voltage,

• increase, due to the decrease in mobility.
So the delay of a cell may increase or decrease, depending on which factor, mobility or threshold
voltage, dominates the final current.

When the overdrive voltage (VGS - Vt) is large (higher nodes, with high VGS), the decrease in
threshold voltage with temperature changes the overdrive voltage by only a small fraction, so the
mobility factor dominates and the delay of a cell increases as temperature rises.
When the overdrive voltage (VGS - Vt) is small (smaller nodes, with lower VGS), the decrease in
threshold voltage with rising temperature is a large fraction of the overdrive voltage and dominates,
so the delay of a cell decreases as temperature rises.

Note that temperature inversion comes into the picture at lower nodes (lower voltages), with more
prominent effects on HVT cells.
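
The competition between mobility and threshold voltage can be illustrated with a toy Python model. Everything numeric here is an assumption chosen only to show the trend: mobility is taken to fall as T^-1.5, Vt is taken to drop by 1 mV per degree C, and delay is taken to be inversely proportional to the square-law drive current. Real libraries are characterized with far more detailed models.

```python
def relative_drive_current(vgs, vt_ref, temp_c, temp_ref_c=25.0,
                           mobility_exponent=1.5, vt_tempco_v_per_c=-0.001):
    """Drive current relative to an arbitrary scale: mobility * (VGS - Vt)^2.

    mobility_exponent and vt_tempco_v_per_c are illustrative assumptions.
    """
    temp_k = temp_c + 273.15
    temp_ref_k = temp_ref_c + 273.15
    mobility = (temp_k / temp_ref_k) ** (-mobility_exponent)  # falls with temperature
    vt = vt_ref + vt_tempco_v_per_c * (temp_c - temp_ref_c)   # falls with temperature
    return mobility * (vgs - vt) ** 2


def relative_delay(vgs, vt_ref, temp_c):
    """Cell delay taken as inversely proportional to drive current (fixed load)."""
    return 1.0 / relative_drive_current(vgs, vt_ref, temp_c)


# Hypothetical operating points: (label, VGS in volts, Vt at 25 C in volts).
for label, vgs, vt_ref in (("large overdrive", 1.8, 0.50), ("small overdrive", 0.7, 0.45)):
    cold = relative_delay(vgs, vt_ref, temp_c=-40.0)
    hot = relative_delay(vgs, vt_ref, temp_c=125.0)
    trend = "slower when hot" if hot > cold else "faster when hot (inversion)"
    print(f"{label}: {trend}")
```

With these assumed values the large-overdrive case is mobility dominated and the small-overdrive case is Vt dominated, which is the behaviour described above.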

ICG Optimization
The previous post described the ICG enable timing problem. To overcome it, we use the ICG optimization
technique in the pre-CTS stage.
ICG optimization is executed during the place stage and performs:
• Dummy CTS
• ICG splitting
• Clock-aware placement
Dummy CTS
In dummy CTS, the tool builds a dummy clock tree to identify the critical ICGs. It is called a dummy clock
tree because the tool discards it and builds the actual clock tree in the CTS stage.
The benefits of dummy CTS are:
• Accurate identification of the ICG enable timing-critical paths.
• Accurate data path optimization of the timing-critical ICG enable paths in the place stage.
• Effective ICG splitting and clock-aware placement (discussed below).
Note that all CTS-related settings (clock tree exceptions, NDR rules, layers, etc.) must be applied before
running the place stage, so that the dummy clock tree correlates with the actual clock tree as closely as
possible for optimum ICG optimization.

ICG Splitting
After dummy CTS, ICG optimization performs ICG splitting. If an ICG cell drives many registers, its
downstream latency increases, which leads to more timing-critical enable paths.
The tool identifies the critical ICGs from the dummy CTS and splits them, i.e. it clones one ICG into several
ICGs and places them near the registers they drive, reducing the ICG downstream latency for better ICG enable timing.
ICG splitting is timing driven, which means that only ICGs with enable timing violations are split.
Figure 1: ICG Splitting
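
As a rough picture of timing-driven splitting, the Python sketch below clones only the ICGs whose enable slack is negative and hands each clone a group of nearby registers. It is a toy model under stated assumptions, not the algorithm of any real place-and-route tool: the `Icg` record, the `max_fanout` limit and the sort-by-coordinate grouping are all invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class Icg:
    name: str
    enable_slack_ps: float  # negative means the enable path violates setup
    sinks: list             # (register_name, x, y) tuples driven by this ICG


def split_critical_icgs(icgs, max_fanout):
    """Split only ICGs with negative enable slack into clones of <= max_fanout sinks."""
    result = []
    for icg in icgs:
        if icg.enable_slack_ps >= 0 or len(icg.sinks) <= max_fanout:
            result.append(icg)  # timing is met or fanout is already small: leave as is
            continue
        # Crude stand-in for clustering: order sinks by position, cut into chunks.
        ordered = sorted(icg.sinks, key=lambda s: (s[1], s[2]))
        for start in range(0, len(ordered), max_fanout):
            clone_name = f"{icg.name}_split{start // max_fanout}"
            # The slack copied here is stale until timing is re-analysed.
            result.append(Icg(clone_name, icg.enable_slack_ps, ordered[start:start + max_fanout]))
    return result


def centroid(icg):
    """Suggested location for an ICG: the centre of the registers it drives."""
    xs = [x for _, x, _ in icg.sinks]
    ys = [y for _, _, y in icg.sinks]
    return (sum(xs) / len(xs), sum(ys) / len(ys))


# Made-up example: icg_a violates enable timing, icg_b does not.
icgs = [
    Icg("icg_a", -40.0, [("f0", 10, 10), ("f1", 12, 14), ("f2", 14, 12),
                         ("f3", 90, 80), ("f4", 92, 84), ("f5", 94, 82)]),
    Icg("icg_b", 15.0, [("g0", 50, 50), ("g1", 52, 50), ("g2", 54, 50),
                        ("g3", 56, 50), ("g4", 58, 50)]),
]
split = split_critical_icgs(icgs, max_fanout=3)
for icg in split:
    print(icg.name, len(icg.sinks), centroid(icg))
```

Placing each clone at the centroid of its registers is the simplest reading of "near the registers they drive"; a real tool weighs this against legal placement sites and the enable path itself.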

Clock aware Placement

In the last stage of ICG optimization the tool performs clock-aware placement: after the timing-driven
splitting of ICGs, it places the ICGs with critical enable timing near their register clusters (as shown in Figure 1).

Note that ICG optimization may increase dynamic power dissipation in the design, since splitting adds ICG cells to the clock network.

ICG Enable Timing Problem

Integrated Clock Gating (ICG) cells are used to reduce dynamic power dissipation in the
design and are enabled by control logic. To get a glitch-free output from an ICG cell, the
timing requirements (setup/hold) at its enable pin must be met.

Figure 1: ICG cell

In the figure above, the ICG cell drives multiple flops and is enabled by the control logic
flop R1. L2 and L3 are the latencies from the clock port to the ICG and to the flops,
respectively. The ICG cell latency (the latency from the ICG output clock to the flops) is therefore

ICG latency = L3-L2

In principle one ICG cell can drive any number of flops, but as the number of flops driven
by an ICG cell grows, the tool adds more buffers in the clock path to balance the clock
tree, which increases the ICG latency.

As L3 increases, the ICG latency increases. The clock reaches the ICG earlier than it reaches
the flops by the ICG latency, and the clock period is fixed, so less of the clock period is
left to meet setup timing at the EN pin.

So we can conclude: the larger the ICG latency, the more critical the ICG enable timing.

It is always advisable to address ICG timing in the place/pre-CTS stage, because after CTS it
can be too late to fix ICG timing violations.

Pre-CTS timing analysis uses ideal clock latency for all clock pins, which means
L2 = L3 and the ICG latency is 0.

With an ICG latency of 0, the ICG enable timing analysis is too optimistic: the ICG cell
appears to get the full clock period to meet setup at the enable pin, whereas after CTS it
gets only the clock period minus the ICG latency. So the actual ICG violations are not seen
pre-CTS and therefore are not fixed in the design.
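
The optimism of the pre-CTS view can be put into numbers with a short Python sketch. The function and all delay values are illustrative assumptions; the sketch also assumes that the launching flop R1 is balanced with the other flops, so that it sees latency L3 after CTS while the ICG clock pin sees L2.

```python
def icg_enable_setup_slack(clock_period, launch_latency, icg_clock_latency,
                           clk_to_q, enable_path_delay, icg_setup):
    """Setup slack at the ICG enable pin (all values in ps)."""
    arrival = launch_latency + clk_to_q + enable_path_delay
    required = clock_period + icg_clock_latency - icg_setup
    return required - arrival


# Hypothetical numbers for one enable path.
period, clk_to_q, path_delay, setup = 1000.0, 100.0, 700.0, 50.0

# Pre-CTS: ideal clocks, so L2 = L3 and the ICG latency is 0.
pre_cts = icg_enable_setup_slack(period, 0.0, 0.0, clk_to_q, path_delay, setup)

# Post-CTS: the flops (including R1) sit at L3, the ICG clock pin at L2.
l2, l3 = 200.0, 500.0
post_cts = icg_enable_setup_slack(period, l3, l2, clk_to_q, path_delay, setup)

print(f"pre-CTS slack:  {pre_cts} ps")
print(f"post-CTS slack: {post_cts} ps")
print(f"slack lost = ICG latency = L3 - L2 = {pre_cts - post_cts} ps")
```

In this made-up case a path that looks comfortably met pre-CTS becomes a violation after CTS, and the slack lost is exactly the ICG latency L3 - L2.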

To overcome this problem, ICG optimization is the recommended technique for designs with
critical ICG enable timing. The next post discusses the ICG optimization technique and how
it is executed.

------------------------------------
Latch-free clock gating
The latch-free clock gating style uses a simple AND or OR gate (depending on the edge on which the
flip-flops are triggered). If the enable signal goes inactive in the middle of a clock pulse, or toggles
multiple times during it, the gated clock output can either terminate prematurely or generate multiple
clock pulses. This restriction makes the latch-free clock gating style inappropriate for our single-clock
flip-flop based design.

Latch-based clock gating

The latch-based clock gating style adds a level-sensitive latch to the design to hold the enable signal from
the active edge of the clock until the inactive edge of the clock. Since the latch captures the state of the
enable signal and holds it until the complete clock pulse has been generated, the enable signal need only
be stable around the rising edge of the clock, just as in the traditional ungated design style.
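
The difference between the two styles can be seen in a small waveform simulation. This is a behavioural sketch under stated assumptions (rising-edge flip-flops, an AND gate, a latch that is transparent while the clock is low, and a hand-made enable waveform), not a model of any specific library cell.

```python
def latch_free_gate(clk, en):
    """Latch-free style: the clock is simply ANDed with the enable."""
    return [c & e for c, e in zip(clk, en)]


def latch_based_gate(clk, en):
    """Latch-based style: the enable passes through a latch that is
    transparent while clk is low and holds its value while clk is high."""
    gated = []
    latched_en = 0
    for c, e in zip(clk, en):
        if c == 0:
            latched_en = e  # latch transparent during the inactive clock phase
        gated.append(c & latched_en)
    return gated


def high_pulse_widths(wave):
    """Lengths (in time steps) of each run of consecutive 1s."""
    widths, run = [], 0
    for bit in wave:
        if bit:
            run += 1
        elif run:
            widths.append(run)
            run = 0
    if run:
        widths.append(run)
    return widths


# Three clock cycles, two time steps per phase. The enable drops in the middle
# of the first high phase and rises in the middle of the second one.
clk = [0, 0, 1, 1, 0, 0, 1, 1, 0, 0, 1, 1]
en = [1, 1, 1, 0, 0, 0, 0, 1, 1, 1, 1, 1]

free = latch_free_gate(clk, en)
latched = latch_based_gate(clk, en)
print("latch-free :", free, high_pulse_widths(free))
print("latch-based:", latched, high_pulse_widths(latched))
```

The latch-free output contains pulses only one step wide (a truncated pulse and an extra runt pulse), while every pulse of the latch-based output has the full two-step width.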

Synthesis tools need dedicated clock gating cells in the library. The availability of such cells and their
automatic insertion by EDA tools make clock gating a simple low-power technique to apply.
An advantage of this method is that clock gating does not require modifications to the RTL description.
