Power Management Techniques for Soft IP

Peter Greenhalgh
CPU Group
ARM
[email protected]

ABSTRACT
Both dynamic and static leakage power are becoming a significant design issue in
130nm and 90nm technology processes. High power consumption not only reduces
battery life in mobile devices, but also requires more costly IC packaging to deal with
heat dissipation. Designers are required to consider power management techniques
throughout the design flow. However, power management in soft IP presents a
difficult challenge as any power savings must be compatible across a range of target
process technologies and design tool flows.

This paper covers design techniques that can be used in soft IP (using the
ARM1176JZF-S™ and ARM926EJ-S™ processors as examples) to manage dynamic
and static power that are compatible with current tool flows and across multiple
technologies. For dynamic power reduction, RTL clock gating, architectural clock
gating and Dynamic Voltage and Frequency scaling are examined, while static power
consumption is addressed with a Dormant Power Mode.

SNUG Europe 2004 -1 - greenhalgh_final.doc

Table of Contents

1 – Introduction
2 – RTL Clock Gating
3 – Architectural Clock Gating
4 – Comparing the Benefits of RTL Clock Gating and Architectural Clock Gating
5 – Dormant Power Mode
6 – Dynamic Voltage and Frequency Scaling
7 – Summary
8 – Acknowledgements

Table of Figures
Figure 2.1 – Circuit with no clock gating (re-circulating mux)
Figure 2.2 – Circuit with clock gating using an integrated clock gating cell
Figure 3.1 – Architectural clock gating
Figure 3.2 – Architectural clock gate in the ARM1176JZF-S processor
Figure 4.1 – Power consumption of different clock gating approaches on the
ARM926EJ-S processor using Dhrystone loop 4
Figure 5.1 – Dormant Mode voltage domains and clamping logic
Figure 6.1 – ARM1176JZF-S RTL Hierarchy for Dynamic Voltage and Frequency
Scaling
Figure 6.2 – Power and energy benefit from Dynamic Voltage and Frequency Scaling

1 Introduction
Chip power consumption is increasing at technology process geometries of 130nm
and below due to greater transistor density. High power consumption reduces battery
life in mobile devices, requires more costly IC packaging to deal with heat dissipation
and can affect long term reliability of the part.

To reduce the upward trend in power consumption, custom circuit techniques that are
library and technology specific have been developed in combination with library and
technology independent approaches.

Soft IP presents a further challenge because the target technology is not known when
the IP is designed. The IP must therefore support the introduction of custom logic as
well as incorporate technology-independent techniques.

Furthermore, at process geometries of 90nm and below, static leakage power is
becoming increasingly significant. Therefore the IP should provide techniques to
mitigate both static and dynamic power consumption.

The ARM1176JZF-S and ARM926EJ-S microprocessors are designs delivered to
licensees in a synthesisable soft IP form. The ARM926EJ-S processor supports both
RTL and architectural clock gating. At the time of writing the ARM1176JZF-S is the
latest ARM microprocessor; in addition to RTL and architectural clock gating, it also
supports a Dormant Power Mode and Dynamic Voltage and Frequency Scaling.
These power saving techniques have been chosen to complement any custom circuit
or implementation approaches that a licensee may have developed to save power.

2 RTL Clock Gating

RTL clock gating has been supported by tools such as Synopsys Power Compiler for
several years to reduce dynamic power. This design approach uses an enable signal
coded into the RTL to control whether or not a group of registers is updated.

In most cases a clock gate is inferred by Power Compiler to replace the re-circulating
mux that is coded into the RTL. The following diagrams demonstrate a circuit with
no clock gating (figure 2.1) and one using an integrated clock gating cell (figure 2.2):

Figure 2.1 – Circuit with no clock gating (re-circulating mux).
[Diagram labels: Data, Clock, Enable Generation Logic, a 0/1 mux feeding a D/Q register.]

Figure 2.2 – Circuit with clock gating using an integrated clock gating cell.
[Diagram labels: Data, Clock, Enable Generation Logic, Scan over-ride, Latch, Integrated
Clock Gating Cell, D/Q registers.]
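
The functional equivalence of the two circuits can be illustrated with a small
behavioural model. The following Python sketch is not RTL and is not taken from the
ARM designs; the function names and the stimulus are hypothetical. It only shows that
both styles produce the same register values, while the gated version delivers far
fewer clock events to the register.

```python
def recirculating_mux_register(data, enable, initial=0):
    """Register clocked every cycle; a mux feeds Q back when enable is low."""
    q, trace, clock_events = initial, [], 0
    for d, en in zip(data, enable):
        q = d if en else q      # mux selects new data or the held value
        clock_events += 1       # the register clock pin toggles every cycle
        trace.append(q)
    return trace, clock_events


def clock_gated_register(data, enable, initial=0):
    """Register whose clock is suppressed by a gating cell when enable is low."""
    q, trace, clock_events = initial, [], 0
    for d, en in zip(data, enable):
        if en:                  # the gating cell passes this clock edge
            q = d
            clock_events += 1
        trace.append(q)
    return trace, clock_events


# Hypothetical stimulus: the register is only updated in three of six cycles.
data = [5, 7, 7, 9, 1, 1]
enable = [1, 0, 0, 1, 0, 1]
mux_trace, mux_clocks = recirculating_mux_register(data, enable)
gated_trace, gated_clocks = clock_gated_register(data, enable)
print(mux_trace == gated_trace, mux_clocks, gated_clocks)
```

In this stimulus the register values match cycle for cycle, but the gated register
sees half as many clock events, which is where the dynamic power saving comes from.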

RTL clock gating saves power both in the switching of the registers being gated and
also in the clock network between the clock gate and the registers. Depending on the
cell library up to 50% of the capacitance of the clock tree occurs in the clock network
between the clock gate and the registers and in the clock input pins of the registers.
Given that it is typical for the clock tree to consume around 33% of the power of the
standard cell logic, significant power savings can be made by preventing this network
from switching.
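
As a rough illustration of these figures, the sketch below combines the two numbers
quoted above. It assumes that clock power scales linearly with switched capacitance and
introduces a hypothetical parameter, idle_fraction, for the share of cycles in which
the enable is low; neither assumption comes from measured data.

```python
CLOCK_TREE_SHARE = 0.33   # clock tree share of standard cell power (from the text)
GATED_FRACTION = 0.50     # upper bound on clock tree capacitance below the clock gate


def max_clock_saving(idle_fraction):
    """Upper bound on the fraction of standard cell power saved in the clock network.

    idle_fraction is an assumed value: the fraction of cycles in which the
    gated part of the clock network does not switch.
    """
    return CLOCK_TREE_SHARE * GATED_FRACTION * idle_fraction


# If the gated network never switched, about 16.5% of standard cell power
# would be saved in the clock network alone.
print(f"{max_clock_saving(1.0):.3f}")
print(f"{max_clock_saving(0.5):.4f}")
```

This bound excludes the additional saving from the gated registers themselves no
longer switching internally.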

However RTL clock gating requires discipline from the designer in identifying
suitable control signals for the clock gate. Where there is naturally a re-circulating
mux this is not an issue. In other cases, a reasonable number of registers (usually four
or more) needs to be identified for the gain in dynamic power from clock gating to
overcome the static leakage penalty of the clock gate and the logic that generates the
enable signal. Nevertheless, RTL clock gating is an effective way of reducing
dynamic power and the ARM1176JZF-S and ARM926EJ-S processors use this
approach on the majority of their registers.

A further benefit of RTL clock gating is that removing the re-circulating mux from the
datapath can improve performance on the register-to-register path. However, there is a
penalty: the enable signal must arrive at the clock gate some time before the end of the
cycle to allow the clock to propagate from the clock gate to the register. This can make
the enable generation logic complicated to design from a timing perspective, especially
since the criticality of the enable signal is not accurately known until after Clock Tree
Synthesis has been performed on a placed design.

3 Architectural Clock Gating

Architectural clock gating is a coarser-grained method of gating the clock from
registers and can be used either at a block level (e.g. a DMA block) or at the top level
of the design.

Architectural clock gating provides an incremental benefit over register clock gating.
Indeed if the target block already uses a large amount of RTL clock gating, the main
benefit will be in gating out a larger section of the clock tree.

For example, in the following diagram, Block 2 is architecturally clock gated and
Block 1 is not. When the architectural clock gate is in operation, the RTL clock gating
inside Block 2 should already have disabled the clock. Therefore the incremental
power saving of architectural clock gating over RTL clock gating is on the free-running
registers (where either a suitable enable signal could not be identified or there
are too few registers that can be grouped together to make clock gating worthwhile)
and on the clock network leading to the RTL clock gating cells.

Figure 3.1 – Architectural clock gating.
[Diagram labels: IP Block containing Block 1 and Block 2, CLK, Architectural Clock Gate,
RTL Clock Gate, RTL Clock Gating, D/Q registers.]

Because an architectural clock gate sits higher up the clock tree than a conventional
RTL clock gate, the enable signal has a smaller portion of the cycle in which to set up
at the clock gate. Therefore the enable must come directly out of a register or traverse
a very small number of gates. This can limit the usefulness of architectural clock gates
for circuits that require complex enables.

On the ARM1176JZF-S microprocessor, architectural clock gating is used to remove
the main clock (CLKIN) from the entire microprocessor. The enable signal that
controls this is based on the following conditions:

• There are no instructions being executed in the integer core or the Vector
Floating Point unit.
• There are no operations in progress on the bus interface unit.
• There are no DMA operations in progress into the internal Tightly Coupled
Memories.
• The ARM1176JZF-S processor is not in debug state.
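
The conditions above amount to a single AND of idle indications. The sketch below
models that decision in Python; the class and field names are hypothetical and do not
correspond to signal names in the ARM1176JZF-S RTL.

```python
from dataclasses import dataclass


@dataclass
class CoreStatus:
    """Hypothetical snapshot of the activity indications listed above."""
    integer_core_busy: bool
    vfp_busy: bool
    bus_interface_busy: bool
    tcm_dma_busy: bool
    in_debug_state: bool


def clock_may_be_gated(status: CoreStatus) -> bool:
    """Return True only when every listed condition for gating CLKIN holds."""
    return not (
        status.integer_core_busy
        or status.vfp_busy
        or status.bus_interface_busy
        or status.tcm_dma_busy
        or status.in_debug_state
    )


idle = CoreStatus(False, False, False, False, False)
dma_active = CoreStatus(False, False, False, True, False)
print(clock_may_be_gated(idle), clock_may_be_gated(dma_active))
```

Any single source of activity is enough to keep the clock running, which is why the
enable can be kept to a shallow piece of logic close to a register output.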

The problem with soft IP is that the IP creator does not know what the insertion delay
of the clock will be, as this is a function of the target library and the quality of the
clock tree insertion tool. The risk is that if the clock tree insertion delay approaches
one clock cycle, the enable could arrive at the architectural clock gate at the same time
as the next clock edge, thereby causing a glitch on the clock. Therefore the prudent
approach is for the IP creator to design clock circuitry that tolerates any clock tree
insertion delay.

On the ARM1176JZF-S processor a glitch is prevented by using an unbalanced, free-running
version of the main clock called FREECLKIN to synchronise the enable
signal coming into the architectural clock gate.

Figure 3.2 – Architectural clock gate in the ARM1176JZF-S microprocessor.
[Diagram labels: SoC Clock, CLKIN, FREECLKIN, Unbalanced Clock Tree, Synchronisation
Circuitry, Integer Pipeline Empty Logic, Pipeline Empty Signal, Synchronised Enable
Signals, Architectural Clock Gate (Integrated Clock Gating Cell) with Enable, CLK In and
CLK Out, Balanced Clock Tree, SoC and ARM1176JZ(F)-S boundaries.]

Provided that the FREECLKIN signal is not balanced with respect to the CLKIN signal,
the clock tree can have any insertion delay and there is no risk of a glitch.

4 Comparing the Benefits of RTL Clock Gating and Architectural Clock Gating

A normalized comparison of the power benefits of mixing the different clock gating
approaches is shown using the ARM926EJ-S microprocessor running loop 4 of the
industry standard Dhrystone benchmark. The ARM926EJ-S microprocessor was used
to produce these figures rather than the ARM1176JZF-S microprocessor as the use of
architectural clock gating is an implementation choice with the ARM926EJ-S
processor whereas it is a fundamental part of the clocking strategy of the
ARM1176JZF-S processor, and cannot be removed.

The architectural clock gating approach used in the ARM926EJ-S processor differs
from the ARM1176JZF-S processor in that as well as being able to gate off the entire
clock from the microprocessor there are clock gates on eleven other major blocks in
the design. These include the coprocessor interface, the instruction caches, the data
caches, the instruction tightly coupled memories, the data tightly coupled memories,
and the main integer core. Therefore the ARM926EJ-S processor is ideal for
comparing the benefit of block level architectural clock gating with RTL clock gating.

Note that each clock gating experiment required a complete Physical Compiler
synthesis and Astro CTS/route and consequently there will be minor variations
between each run:

Clock gating approach                  Normalized power
No Clock Gating                        100%
Architectural Clock Gating Only         94%
RTL Clock Gating Only                   74%
RTL and Architectural Clock Gating      69%

Figure 4.1 – Power consumption of different clock gating approaches on the
ARM926EJ-S microprocessor using Dhrystone loop 4

Dhrystone is a reasonably strenuous benchmark for a microprocessor because it
typically fits completely inside the caches and therefore requires no processor bus
transactions. Indeed, the diagram shows that only a small benefit is gained from the
coarse-grained architectural clock gating approach. This is because during the fourth
loop of Dhrystone the main integer core and caches are running continuously and only
a few blocks (such as the coprocessor interface) are idle and can be clock gated out.
However, the combination of RTL and architectural clock gating still provides a useful
31% reduction in power compared to using no clock gating at all.
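
The reductions quoted here follow directly from the values in Figure 4.1. The short
Python sketch below reproduces the arithmetic; the dictionary keys and function name
are chosen for illustration only.

```python
# Normalized power from Figure 4.1 (percent of the design with no clock gating).
power = {
    "none": 100,
    "architectural": 94,
    "rtl": 74,
    "rtl_and_architectural": 69,
}


def reduction(before, after):
    """Fractional power reduction when moving from 'before' to 'after'."""
    return (before - after) / before


# Combined saving relative to no clock gating.
total_saving = reduction(power["none"], power["rtl_and_architectural"])
# Incremental saving of adding architectural gating to a design that
# already uses RTL clock gating.
incremental = reduction(power["rtl"], power["rtl_and_architectural"])
print(f"{total_saving:.0%} total, {incremental:.1%} incremental")
```

The incremental figure is small for this workload, which is consistent with the
observation that the core and caches are busy throughout the loop.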

Note that the power savings discussed in this section are only applicable to dynamic
power. Yet static power consumption is becoming increasingly significant at
technology process geometries of 90nm and below. There are many circuit
techniques to deal with static power consumption; however, there are also
technology-independent approaches that an IP creator can use. The use of a Dormant
Power Mode is one such technique.

5 Dormant Power Mode

In order to combat static leakage power it may be advantageous to remove power
from large blocks of logic that are not in use and can return to full functionality a
relatively small time period after power is returned. For example the ARM1176JZF-S
microprocessor implements a power down mode (Dormant Mode) which allows the
power to be removed from the standard cells while the main cache RAMs remain
powered up. This allows the power for the standard cell logic in the processor to be
removed when no useful work is occurring but state can be returned relatively quickly
when required without the latency and power penalty of refilling the processor caches
from external memory.

Depending on the leakage characteristics of the cache RAMs, Dormant Mode can be
highly advantageous for leakage reduction. Indeed, if leakage management is of
critical importance, nominal or even high threshold transistors can be used in the
caches while the logic is implemented using mixed threshold standard cells for
optimum performance.

However, in the case of the ARM1176JZF-S microprocessor, Dormant Mode is not
trivial to implement. The complexities of Dormant Mode can be split into a hardware
component and a software component.

5.1 Hardware Considerations

To implement Dormant Mode successfully, the following must be considered:

• All inputs (including the clock) into the RAMs must be clamped when the
standard cell power is removed to prevent erroneous data being written to the
RAMs. A signal called RAMCLAMP is used for this purpose and the design
is in normal operation when RAMCLAMP is set to logic zero.
• All clamping logic must be placed only in the RAM power domain.
• Clocks must be held at a known state during Dormant Mode and a rising edge
avoided when coming out of Dormant Mode to prevent erroneous data being
clocked into the cache RAMs.
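
The clock requirement in the last point can be illustrated with a small waveform model.
The Python sketch below assumes that clamping forces a signal to logic zero, which is an
assumption made for illustration and is not stated above; the function names are
hypothetical. It shows that releasing RAMCLAMP while the incoming clock is high creates
a rising edge at the RAM, whereas releasing it while the clock is low does not.

```python
def clamp_waveform(signal, ramclamp, clamp_value=0):
    """Force the signal to clamp_value whenever RAMCLAMP is asserted (logic one)."""
    return [clamp_value if c else s for s, c in zip(signal, ramclamp)]


def rising_edges(waveform):
    """Count 0 -> 1 transitions seen by the RAM clock pin."""
    return sum(1 for a, b in zip(waveform, waveform[1:]) if a == 0 and b == 1)


# RAMCLAMP is asserted during Dormant Mode and released for the last two samples.
ramclamp = [1, 1, 1, 0, 0]

clock_high_at_release = [1, 1, 1, 1, 1]   # incoming clock held high
clock_low_at_release = [0, 0, 0, 0, 0]    # incoming clock held low

unsafe = clamp_waveform(clock_high_at_release, ramclamp)
safe = clamp_waveform(clock_low_at_release, ramclamp)
print(rising_edges(unsafe), rising_edges(safe))
```

With a clamp-to-zero scheme, holding the clock low until after the clamp is released
is what avoids the spurious edge in this model.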

The following diagram illustrates the Dormant Mode voltage domains and clamping
logic.

Figure 5.1 – Dormant Mode voltage domains and clamping logic
[Diagram labels: SoC, ARM1176JZF-S, VDD Cell Domain (Standard Cell Logic, Clock Signal,
Data Signal), VDD RAM Domain (Clamping Logic, RAMCLAMP, Clamped Data Signal, Clamped
Clock Signal, RAM Blocks).]

5.2 Software Considerations

Because power is removed from the standard cells in Dormant Mode, all
microprocessor configuration will be lost. So in order to successfully enter and return
from Dormant Mode the following must be copied to and from external memory:

• ARM general purpose and status registers.
• CP15 (system control coprocessor) and CP14 (debug) registers.
• DMA state.
• VFP (floating point coprocessor) state.
• Cache state bits for the cache RAMs.

Depending on the cache write policy (write through/write back) and the AMBA™
AXI processor bus frequency, entry and exit from Dormant Mode takes between 5,000
and 10,000 clock cycles. This cycle overhead means that accurate modelling of the
software environment is required to get the best performance and static leakage power
saving from Dormant Mode. However, the voltage domain divide between the RAMs
and the standard cell logic was chosen purely to simplify the implementation process.
It would be possible to place the registers used for storing the cache state bits for the
RAMs in the RAM voltage domain, thereby improving the latency to return from
Dormant Mode. Naturally, this would complicate implementation and possibly
reduce performance, but the benefit in reduced latency may make it worthwhile.
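
To put the entry and exit overhead in perspective, the cycle counts can be converted to
time for a given clock frequency. The Python sketch below does this conversion; the
300MHz clock is an assumed value chosen purely for illustration and is not a quoted
ARM1176JZF-S operating point.

```python
def overhead_us(cycles, clock_mhz):
    """Convert a cycle count to microseconds at the given clock frequency in MHz."""
    return cycles / clock_mhz


ASSUMED_CLOCK_MHZ = 300  # hypothetical frequency, for illustration only

best_case = overhead_us(5_000, ASSUMED_CLOCK_MHZ)
worst_case = overhead_us(10_000, ASSUMED_CLOCK_MHZ)
print(f"{best_case:.1f} us to {worst_case:.1f} us")
```

An idle period has to be comfortably longer than this overhead before entering
Dormant Mode can pay back, which is why the software environment needs to be modelled.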

6 Dynamic Voltage and Frequency Scaling

To achieve further improvements in power reduction without resorting to custom
circuit techniques, Dynamic Voltage and Frequency Scaling can be used.

Dynamic Voltage and Frequency Scaling is effective because of the following two
facts:

• The amount of energy required to complete a task is proportional to the
square of the supply voltage.
• The maximum frequency of any CMOS circuit is proportional to the supply
voltage.

So if the supply voltage is decreased there is a square-law reduction in the energy
needed to complete a given task. However, the task takes longer to complete because of
the linear reduction in frequency. Therefore, the principal gain with Dynamic Voltage
and Frequency Scaling is with respect to dynamic power consumption. However, any
reduction in supply voltage also results in a proportional reduction in static power.
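
Taken together, the two proportionalities above imply that dynamic power falls with
the cube of the supply voltage when frequency is scaled with it. The Python sketch
below works this through for a normalized voltage ratio; it is an idealised first-order
model based only on those two statements, and the function name is hypothetical.

```python
def scale_operating_point(voltage_ratio):
    """First-order DVFS model relative to the nominal operating point.

    voltage_ratio is the scaled supply voltage divided by the nominal voltage.
    """
    energy = voltage_ratio ** 2    # energy per task is proportional to V^2
    frequency = voltage_ratio      # maximum frequency is proportional to V
    run_time = 1 / frequency       # the same task takes proportionally longer
    power = energy / run_time      # average power = energy / time, i.e. V^3
    return {"energy": energy, "frequency": frequency,
            "run_time": run_time, "power": power}


half_voltage = scale_operating_point(0.5)
print(half_voltage)
```

In this model, halving the voltage quarters the energy per task, doubles the run
time and reduces average dynamic power to one eighth. Real silicon deviates from the
ideal because frequency is not exactly linear in voltage.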

The ARM1176JZF-S microprocessor supports Dynamic Voltage and Frequency
Scaling as a part of the design. Although splitting the design up into multiple voltage
domains is mostly an implementation problem, the structure of the RTL can crucially
affect compatibility with EDA tools.

To ease implementation the RTL was designed using the following rules:

1 – The top-level logical hierarchy should correlate with the voltage domains in the
design.
2 – Clocks should not cross voltage domains inside of the IP.
3 – All data signals that cross between voltage domains inside the processor or on the
processor interface should be synchronized and level shifted.
4 – All outputs from the core voltage domain should be clamped to allow the power to
be removed (Dormant Mode).

By following these rules the ARM1176JZF-S microprocessor looks as follows in a
SoC:

Figure 6.1 – ARM1176JZF-S RTL Hierarchy for Dynamic Voltage and Frequency
Scaling
[Diagram labels: Vram domain (RAMs); Vcore domain (Core logic, Coprocessor, BIST,
Embedded Trace Macrocell, Vcore half of the AMBA AXI Register Slices); Vsoc domain
(Vsoc half of the AMBA AXI Register Slices, Energy Management Controller, Interrupt
Controller, Debug Logic, Clock & Reset Signals, AMBA AXI Buses including clocks and
resets); Level Shift and Clamp Logic on each domain boundary.]

The AMBA AXI Register Slices are a method of pipelining the main processor buses.
AXI allows register slices to be placed in the interconnect with no impact at all on the
available bandwidth. By splitting these register slices and placing one half in the core
voltage domain and one half in the SoC voltage domain, the synchronous timing
interface to the ARM1176JZF-S microprocessor from the SoC looks the same no
matter what the core voltage is. This approach ensures that the first rule is met and
that the top-level logical hierarchy of the ARM1176JZF-S microprocessor correlates
with the voltage domains being implemented. Furthermore, the ARM1176JZF-S
microprocessor conforms to the second rule since the AMBA AXI clocks required for
the bus interface unit are only used inside the Vsoc module of the AMBA AXI register
slice.

Other interfaces in the design from the core voltage domain to the SoC voltage
domain (such as the debug interface) are asynchronous at the boundary of the
microprocessor and are already handled by synchronization circuitry internally. This
means that the asynchronous interfaces only need to be level shifted and not
resynchronized.

Where possible, it is advantageous for implementation complexity and maximum
power saving to keep peripherals such as coprocessors in the core voltage domain.
However, if the peripherals also communicate with blocks of IP in the SoC voltage
domain then some effort may be required from the IP integrator to ensure that timing
and functionality are maintained across the voltage boundary.

In order for Dynamic Voltage and Frequency Scaling to work in a SoC environment,
there must be some form of energy management controller on the SoC that controls
the voltage domains in the SoC. Also, the operating system must understand when
the voltage and frequency can scale depending on the current and future workload of
the microprocessor.

The benefits of Dynamic Voltage and Frequency Scaling are substantial. For
example, on an ARM926EJ-S test chip fabricated on the TSMC CL013G process, the
following measurements were obtained from the test silicon when running the
Dhrystone benchmark:

Operating point     Power    Energy
300MHz, 1.21V       100%     100%
225MHz, 1.03V        56%      75%
150MHz, 0.81V        23%      46%
75MHz, 0.69V          9%      36%

Figure 6.2 – Power and energy benefit from Dynamic Voltage and Frequency Scaling

The voltage numbers in the diagrams above represent the minimum voltage that
would still allow the benchmark to successfully complete at the target frequency on
the test silicon.

The diagrams clearly show a large benefit from reducing the supply voltage.
However, the energy diagram is possibly more useful as it accounts for the increased
time to execute a task due to the reduction in frequency.

Even though the benchmark took twice as long to run at 150MHz as at 300MHz, this
may not be an issue in some applications, and the 54% reduction in energy required to
complete the task is highly desirable.
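
The energy column of Figure 6.2 can be cross-checked against the power column. The
Python sketch below assumes that benchmark run time is inversely proportional to clock
frequency, which matches the statement that the run took twice as long at 150MHz; the
variable names are illustrative.

```python
# (frequency in MHz, normalized power in percent) from Figure 6.2.
measurements = [(300, 100), (225, 56), (150, 23), (75, 9)]
NOMINAL_MHZ = 300


def normalized_energy(freq_mhz, power_percent):
    """Energy = power x time, with run time assumed proportional to 1/frequency."""
    relative_run_time = NOMINAL_MHZ / freq_mhz
    return power_percent * relative_run_time


energies = [round(normalized_energy(f, p)) for f, p in measurements]
print(energies)
```

The derived values match the reported energy figures of 100%, 75%, 46% and 36% to
the nearest percent.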

Dynamic Voltage and Frequency Scaling is a complex technique which requires
consideration from the RTL design stage all the way through implementation and
integration and into the operating system. However, the energy savings of Dynamic
Voltage and Frequency Scaling easily offset these complications.

7 Summary

The techniques that have been described in this paper to save power (static and
dynamic) are complementary and applicable across all types of soft IP. They range
from the highly automated (RTL clock gating) to the more complex (Dynamic Voltage
and Frequency Scaling), but each can provide incremental savings in power across all
technology processes, thereby extending battery life in mobile devices and also
potentially reducing IC package costs. It is up to the IP creator to decide which
techniques are required depending on the target application and then to put in place
the infrastructure to allow them to be successfully implemented and integrated into a
system chip.

8 Acknowledgements
Thanks to everyone in ARM who reviewed this paper prior to release.

4 pages
Clock Gating For Power Optimization in ASIC Design Cycle: Theory & Practice
No ratings yet
Clock Gating For Power Optimization in ASIC Design Cycle: Theory & Practice
50 pages
Clock Gating Circuits PDF
100% (1)
Clock Gating Circuits PDF
4 pages
Synopsys Eclypse
No ratings yet
Synopsys Eclypse
5 pages
Automatic Clock Gating For Power Reductionl
No ratings yet
Automatic Clock Gating For Power Reductionl
11 pages
Dees ST-Microelectronics Stradale Primosole, Viale Andrea Dona Universita' Di Catania 1-95 121 CATANIA Italy 1-95 125 CATANIA Italy
No ratings yet
Dees ST-Microelectronics Stradale Primosole, Viale Andrea Dona Universita' Di Catania 1-95 121 CATANIA Italy 1-95 125 CATANIA Italy
4 pages
Clock Gating
No ratings yet
Clock Gating
2 pages
Clock Gating
No ratings yet
Clock Gating
2 pages
Low Power Register Design With Integration Clock Gating and Power Gating
No ratings yet
Low Power Register Design With Integration Clock Gating and Power Gating
6 pages
Clock Gating
No ratings yet
Clock Gating
11 pages
Different Low Power Techniques: Trade-Offs Associated With The Various Power Management Techniques
No ratings yet
Different Low Power Techniques: Trade-Offs Associated With The Various Power Management Techniques
2 pages
Power Optimized Programmable Embedded Controller
No ratings yet
Power Optimized Programmable Embedded Controller
11 pages
Clock Gating
100% (1)
Clock Gating
4 pages
Clock Distribution
100% (2)
Clock Distribution
52 pages
A Low-Power FPGA Based On Autonomous Fine-Grain Power Gating
No ratings yet
A Low-Power FPGA Based On Autonomous Fine-Grain Power Gating
13 pages
Clock Issues in Deep Submircron Design
100% (1)
Clock Issues in Deep Submircron Design
50 pages
ASIC-System On Chip-VLSI Design - Clock Gating
No ratings yet
ASIC-System On Chip-VLSI Design - Clock Gating
4 pages
Clock Gating: Smart Use Ensures Smart Returns
No ratings yet
Clock Gating: Smart Use Ensures Smart Returns
4 pages

Power Management Techniques for Soft IP

Furthermore, at process geometries of 90nm and below static leakage power is

The ARM1176JZF-S and ARM926EJ-S microprocessors are designs delivered to

2 RTL Clock Gating

Figure 2.1 – Circuit with no clock gating (re-circulating mux).
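As a rough behavioural illustration of the difference between the re-circulating mux of Figure 2.1 and a gated clock, the sketch below models one enabled register both ways and counts how many clock edges reach the flop. It is a toy model under stated assumptions, not the paper's circuit: the function name `simulate` and the input sequences are hypothetical.

```python
def simulate(enables, data):
    """Model one enabled register two ways and count clock edges at the flop.

    Illustrative only: names and stimulus are assumptions, not from the paper.
    """
    q_mux = 0            # register built with a re-circulating mux
    q_gated = 0          # register fed by a gated clock
    mux_clock_edges = 0
    gated_clock_edges = 0
    for en, d in zip(enables, data):
        # Re-circulating mux: the flop is clocked every cycle and simply
        # reloads its own output when the enable is low.
        q_mux = d if en else q_mux
        mux_clock_edges += 1
        # Gated clock: the flop only receives a clock edge when enabled.
        if en:
            q_gated = d
            gated_clock_edges += 1
        # Both styles must hold the same value after every cycle.
        assert q_mux == q_gated
    return q_mux, mux_clock_edges, gated_clock_edges


enables = [1, 0, 0, 1, 0, 0, 0, 1]
data = [5, 9, 9, 7, 1, 2, 3, 4]
final_value, mux_edges, gated_edges = simulate(enables, data)
print(final_value, mux_edges, gated_edges)
```

Both registers end with the same contents, but the gated version sees a clock edge only on the cycles where the enable is high, which is where the dynamic power saving comes from.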

3 Architectural Clock Gating

Figure 3.1 – Architectural clock gating.

On the ARM1176JZF-S microprocessor, architectural clock gating is used to remove

On the ARM1176JZF-S processor a glitch is prevented by using an unbalanced, free-

Figure 3.2 – Architectural clock gate in the ARM1176JZF-S microprocessor. (Diagram labels: Unbalanced Clock Tree, Architectural Clock Gate, Balanced Clock Tree.)

4 Comparing the Benefits of RTL Clock Gating and Architectural Clock Gating

Architectural Clock Gating Only: 94%
RTL Clock Gating Only: 74%
RTL and Architectural Clock Gating: 69%

Figure 4.1 – Power consumption of different clock gating approaches on the

Dhrystone is a reasonably strenuous benchmark for a microprocessor because it

5 Dormant Power Mode

5.1 Hardware Considerations

To implement Dormant Mode successfully, the following must be considered:

Figure 5.1 – Dormant Mode voltage domains and clamping logic. (Diagram labels: Standard Cell Logic, Clock, VDD RAM, Data Signal.)

5.2 Software Considerations

• ARM general purpose and status registers.

6 Dynamic Voltage and Frequency Scaling

• The amount of energy required to complete a task is proportional to the

So if the supply voltage is decreased there is a square-law reduction in energy to

The ARM1176JZF-S microprocessor supports Dynamic Voltage and Frequency

(Figure labels: Level Shift & Clamps; Level Shift and Clamp Logic.)

Where possible, it is advantageous for implementation complexity and maximum

300MHz, 1.21V 100% 100%

225MHz, 1.03V 56% 75%

150MHz, 0.81V 23% 46%

75MHz, 0.69V 9% 36%
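The two percentage columns above have lost their headings in this copy. As a rough cross-check, and purely as an assumption, the sketch below treats the first column as power relative to the 300MHz, 1.21V point and the second as energy per task. It compares both with the usual first-order CMOS scaling rules, where dynamic power scales with f·V² and energy per task scales with V². The function name `first_order_model` and the column interpretation are illustrative and are not taken from the paper.

```python
# Operating points and percentages exactly as listed in the table above.
# ASSUMPTION: col1 = relative power, col2 = relative energy per task.
TABLE = [
    (300, 1.21, 100, 100),
    (225, 1.03, 56, 75),
    (150, 0.81, 23, 46),
    (75, 0.69, 9, 36),
]


def first_order_model(freq_mhz, volts, ref_freq_mhz=300, ref_volts=1.21):
    """Return (f*V^2, V^2) as percentages of the reference operating point."""
    v_squared = (volts / ref_volts) ** 2
    power_pct = 100 * (freq_mhz / ref_freq_mhz) * v_squared
    energy_pct = 100 * v_squared
    return power_pct, energy_pct


rows = []
for freq, volts, col1, col2 in TABLE:
    power_pct, energy_pct = first_order_model(freq, volts)
    rows.append((freq, volts, col1, power_pct, col2, energy_pct))
    print(f"{freq}MHz, {volts}V: listed {col1}% / {col2}%, "
          f"model {power_pct:.0f}% / {energy_pct:.0f}%")
```

Under these assumptions the model lands within a few percentage points of every listed value. Exact agreement is not expected, since the simple model ignores static leakage, among other things.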

Dynamic voltage and frequency scaling is a complex technique which requires
