Low-Power VLSI Circuits and Systems
Ajit Pal
Computer Science and Engineering
Indian Institute of Technology Kharagpur
Kharagpur
West Bengal
India
Several years ago, I introduced a graduate course entitled “Low Power VLSI Circuits
and Systems” (CS60054) to our students at IIT Kharagpur. Although the
course became very popular among students, the lack of a suitable textbook was
sorely felt. To overcome this problem, I began to hand out lecture notes, which
were highly appreciated by the students. Over the years, those lecture notes have
gradually evolved into this book. The book is intended as a first-level course on
VLSI circuits for graduate and senior undergraduate students. While a basic course
on digital circuits is a prerequisite, no background in the area of VLSI circuits is
necessary to use this book. Each chapter opens with an abstract and keywords
and closes with a chapter summary, review questions, and references, to meet the
pedagogical requirements of a textbook. This will help students in
understanding the topics covered and also help instructors while teaching the
subject. The book comprises the following 12 chapters covering different aspects
of digital VLSI circuit design, with particular emphasis on low-power aspects. A
chapter-wise summary of coverage is given below.
Chapter 1: Introduction
This chapter begins with the historical background that led to the development of
present-day VLSI circuits. In the next section, Sect. 1.2, the importance of low
power in high-performance and battery-operated embedded systems is highlighted.
Various sources of power dissipation are identified in Sect. 1.3. Low-power design
methodologies are introduced in Sect. 1.4.
The structure of various types of MOS transistors that can be obtained after
fabrication is presented in Sect. 3.1. In Sect. 3.2, characteristics of MOS transistors are
explained with the help of the fluid model, which helps in understanding the operation of
a MOS transistor without going into the details of device physics. The three
modes of operation, namely accumulation, depletion, and inversion, are discussed
in Sect. 3.3. Electrical characteristics of MOS transistors are explained in detail in
Sect. 3.4. The use of MOS transistors as switches is explored in Sect. 3.5.
Various sources of power dissipation in MOS circuits are presented in this chapter. It
begins by explaining the difference between power and energy. How short-circuit
power dissipation takes place in CMOS circuits is explained, and an expression
for it is derived, in Sect. 6.1. Switching power dissipation in CMOS circuits is
considered in Sect. 6.2, and an expression for it is derived. Switching activity for
different types of gates is calculated, and that for dynamic CMOS circuits is highlighted.
An expression for power dissipation due to charge sharing is also derived. Section 6.3
presents glitching power dissipation along with techniques to reduce it. Sources of leakage
power dissipation, such as subthreshold leakage and gate leakage, are introduced in
Sect. 6.4, together with techniques to reduce them. Various mechanisms that
affect the subthreshold leakage current are also highlighted.
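As a rough illustration of the switching-activity calculation mentioned above, the following Python sketch uses the widely quoted relation for average switching power, alpha * C_L * V_DD^2 * f. It assumes a static CMOS gate with independent, equiprobable inputs; the function names, the load capacitance, the supply voltage, and the clock frequency are hypothetical values chosen only for this example, not figures from the book.

```python
from itertools import product


def switching_activity(gate, n_inputs):
    """Probability of a 0->1 output transition for a static CMOS gate.

    Assumption of this sketch: inputs are independent and each is 1 with
    probability 0.5, so alpha = P(out = 0) * P(out = 1).
    """
    outputs = [gate(*bits) for bits in product((0, 1), repeat=n_inputs)]
    p_one = sum(outputs) / len(outputs)
    return (1 - p_one) * p_one


def switching_power(alpha, c_load, vdd, freq):
    """Average switching power in watts: alpha * C_L * V_DD^2 * f."""
    return alpha * c_load * vdd ** 2 * freq


def nand2(a, b):
    return 1 - (a & b)


def xor2(a, b):
    return a ^ b


alpha_nand = switching_activity(nand2, 2)
alpha_xor = switching_activity(xor2, 2)

# Hypothetical operating point: 50 fF load, 1.2 V supply, 500 MHz clock.
p_nand = switching_power(alpha_nand, 50e-15, 1.2, 500e6)
print(alpha_nand, alpha_xor, p_nand)
```

Under these assumptions the two-input NAND gate has a lower switching activity than the two-input XOR gate, which is why the gate type matters when estimating switching power.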
This chapter discusses various voltage scaling techniques, starting with static voltage
scaling. The challenges involved in supply voltage scaling for low power
are highlighted. The distinction between constant field and constant voltage
scaling is explained in detail. First, device feature size scaling, a physical-level
approach to overcoming the loss in performance, is discussed in Sect. 7.1. The
short-channel effect arising from feature size scaling is introduced. In Sect. 7.2,
architecture-level approaches such as parallelism and pipelining for static voltage
scaling are discussed. The relevance of multi-core for low power is explained. Static
As multiple threshold voltages are used to minimize leakage power, various
approaches for the fabrication of multiple threshold voltage transistors are first
presented in Sect. 9.1. The variable threshold voltage CMOS (VTCMOS) approach for
leakage power minimization is discussed in Sect. 9.2. The transistor stacking approach,
based on the stack effect, to minimize standby leakage power is highlighted in
Sect. 9.3. How run-time leakage power can be minimized by using the
multiple-threshold-voltage CMOS (MTCMOS) approach is discussed in Sect. 9.4. Section 9.5 addresses
the power-gating technique to minimize leakage power and various issues related
to power-gating approaches are highlighted. How a power management approach can
be used to reduce leakage power dissipation and how it can be combined with the
dynamic voltage scaling approach is explained. Isolation strategy is highlighted in
Sect. 9.6. State retention strategy is introduced in Sect. 9.7. Power-gating
controllers are discussed in Sect. 9.8. Power management techniques are considered in
Sect. 9.9. The dual-Vt assignment technique is introduced in detail in Sect. 9.10. The
delay-constrained dual-Vt technique is presented in Sect. 9.11 and the energy-constrained
dual-Vt technique is considered in Sect. 9.12. The dynamic Vt scaling technique is
introduced in Sect. 9.13.
Section 10.1 introduces adiabatic charging which forms the basis of adiabatic cir-
cuits. The difference between adiabatic charging and conventional charging of a
capacitor is explained. As amplification is a fundamental operation performed by
electronic circuits to increase the current or voltage drive, adiabatic amplifica-
tion is presented in Sect. 10.2. The steps of realization of adiabatic logic gates are
explained and illustrated with the help of an example. Adiabatic logic gates are
introduced in Sect. 10.3. Realization of a pulsed power supply, which is the most
fundamental building block of an adiabatic logic circuit, is introduced in Sect. 10.4.
The realizations of both synchronous and asynchronous pulsed power supplies are
explained. How stepwise charging and discharging can be used to minimize power
dissipation is explained in Sect. 10.5. Various partially adiabatic circuits such as
efficient charge recovery logic (ECRL), positive feedback adiabatic logic (PFAL),
and 2N-2N2P are introduced and compared in Sect. 10.6.
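To make the contrast between conventional, adiabatic, and stepwise charging concrete, here is a minimal Python sketch based on standard textbook approximations rather than on derivations taken from this book. It assumes an ideal capacitor charged through a fixed series resistance, a ramp time much longer than RC for the adiabatic case, and full settling at every step for the stepwise case. All function names and component values are hypothetical.

```python
def conventional_loss(c, v):
    """Energy dissipated when a capacitor is charged abruptly to v: C*V^2/2."""
    return 0.5 * c * v ** 2


def adiabatic_loss(r, c, v, t_ramp):
    """Approximate dissipation for a constant-current ramp of duration t_ramp.

    Uses (R*C / T) * C * V^2, which is only a good approximation
    when t_ramp is much larger than r * c.
    """
    return (r * c / t_ramp) * c * v ** 2


def stepwise_loss(c, v, n_steps):
    """Dissipation when the supply rises in n_steps equal steps: C*V^2/(2N)."""
    return c * v ** 2 / (2 * n_steps)


# Hypothetical values: 1 kOhm switch resistance, 1 pF load, 1 V swing.
R, C, V = 1e3, 1e-12, 1.0

e_conv = conventional_loss(C, V)
e_adia = adiabatic_loss(R, C, V, t_ramp=100 * R * C)
e_step = stepwise_loss(C, V, n_steps=4)
print(e_conv, e_adia, e_step)
```

With a single step the stepwise formula reduces to the conventional case, and the dissipation falls as the ramp is slowed or the number of steps grows, which is the trade-off between energy and speed that adiabatic circuits exploit.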
This chapter discusses a few design techniques and proposes an architectural power
management method to optimize the battery lifetime and to obtain the maximum
number of cycles per recharge. Section 11.1 introduces the so-called battery gap, which
depicts the ever-increasing power requirement versus the actual rate of growth of
the energy density of battery technology. An overview of different battery
technologies is provided in Sect. 11.2. Section 11.3 introduces different characteristics of
a rechargeable battery. The underlying process of battery discharge is explained
in Sect. 11.4. Different approaches of battery modeling are briefly introduced in
Sect. 11.5. Realizations of battery-driven systems are presented in Sect. 11.6. As an
example of a battery-aware system, Sect. 11.7 presents battery-aware sensor net-
works.
This chapter introduces different software optimization techniques for low power.
Power-aware software does not require any additional hardware; instead, the
software itself is optimized to minimize the energy consumed in its execution.
The optimization techniques can be broadly classified into two categories:
machine independent and machine dependent. Machine-independent optimization
techniques are independent of the processor architecture and can be used for any
processor. Both types of optimization are discussed here. Various sources of power
dissipation in the computer hardware are highlighted in Sect. 12.1. Machine-independent
software optimization approaches are discussed in Sect. 12.2. Various loop optimization
techniques have been combined with DVFS to achieve a larger reduction in energy
dissipation; this is discussed in detail in Sect. 12.3. The power-aware software
prefetching approach, which exploits the architectural features of the target processor
and the hardware platform, is discussed in detail in Sect. 12.4.
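As one example of the loop optimizations referred to above, the sketch below shows loop unrolling in Python. It is illustrative only: the function names and the unroll factor of 4 are assumptions made for this example, and the transformation is normally applied by a compiler to lower-level code, where it reduces loop-control overhead. In interpreted Python the benefit may be small or absent.

```python
def scale_original(values, k):
    """Reference loop: one multiplication and one loop test per element."""
    out = [0] * len(values)
    for i in range(len(values)):
        out[i] = values[i] * k
    return out


def scale_unrolled(values, k):
    """Same computation with the loop body unrolled by a factor of 4."""
    n = len(values)
    out = [0] * n
    main_end = n - n % 4  # largest multiple of 4 not exceeding n
    for i in range(0, main_end, 4):
        out[i] = values[i] * k
        out[i + 1] = values[i + 1] * k
        out[i + 2] = values[i + 2] * k
        out[i + 3] = values[i + 3] * k
    # Remainder loop handles the last n % 4 elements.
    for i in range(main_end, n):
        out[i] = values[i] * k
    return out


data = list(range(10))
print(scale_unrolled(data, 3) == scale_original(data, 3))
```

The remainder loop is what keeps the transformed code correct when the trip count is not a multiple of the unroll factor.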
Acknowledgements
I am indebted to the editorial team at Springer, especially Kamiya Khatter, for helping
to shape the raw manuscript of the book into its present form. I am also grateful to
Ms. Zaenab Khan, Crest Premedia Solutions Private Limited, Pune, for her patience
during the production workflow of the manuscript and for resolving all my queries. I
am thankful to my wife Alpana, my younger daughter Amrita, her husband Shiladitya,
my elder daughter Aditi, and her husband Arjun for their help and encouragement
in going through this daunting task of writing a book.
Contents
1 Introduction
1.1 Introduction
1.2 Historical Background [1]
1.3 Why Low Power? [2]
1.4 Sources of Power Dissipations [3]
1.4.1 Dynamic Power
1.4.2 Static Power
1.5 Low-Power Design Methodologies
1.6 Chapter Summary
1.7 Review Questions
References
Index
About the Author
List of Figures
Fig. 8.4 The code morphing software mediates between x86 software and the Crusoe processor. BIOS basic input/output system, VLIW very long instruction word
Fig. 8.5 Flowchart of a program with a branch
Fig. 8.6 Encoder and decoder blocks to reduce switching activity
Fig. 8.7 Encoder and decoder for Gray code
Fig. 8.8 One-hot encoding
Fig. 8.9 Bus-inversion encoding
Fig. 8.10 Encoder and decoder of bus-inversion encoding. CLK clock signal, INV invalid
Fig. 8.11 T0 encoding
Fig. 8.12 T0 encoder and decoder. CLK clock signal, MUX multiplexer, INC increment
Fig. 8.13 Power reduction using clock gating
Fig. 8.14 Clock-gating mechanism. EN enable, CLK global clock, CLKG gated clock
Fig. 8.15 a Clock gating using AND gate, b clock gating using OR gate, c glitch propagation through the AND gate, and d glitch propagation through the OR gate. EN enable, CLK global clock, CLKG gated clock
Fig. 8.16 a Clock gating using a level-sensitive, low-active latch along with an AND gate and b clock gating using a level-sensitive, low-active latch along with an OR gate. EN enable, CLK global clock, CLKG gated clock
Fig. 8.17 Clock gating the register file of a processor. EN enable, CLK global clock, CLKG gated clock, ALU arithmetic logic unit
Fig. 8.18 a Synchronous load-enabled register bank and b clock-gated version of the register bank. EN enable, CLK global clock, CLKG gated clock, MUX multiplexer
Fig. 8.19 Basic structure of a finite-state machine. PI primary input, PO primary output, PS previous state, NS next state
Fig. 8.20 Gated-clock version of the finite-state machine. PI primary input, PO primary output, PS previous state, NS next state, EN enable, CLK clock, CLKG gated clock
Fig. 8.21 State-transition diagram of a finite-state machine (FSM)
Fig. 8.22 Gated-clock implementation of the finite-state machine (FSM) of Fig. 8.20. CLK clock, CLKG gated clock, EN enable
Fig. 8.23 State-transition diagram of a modulo-6 counter
Fig. 8.24 State-transition diagram of the “11111” sequence detector
Fig. 8.25 a An example finite-state machine (FSM) and b decomposed FSM into two FSMs
Fig. 8.26 a An example circuit and b operand isolation. CLK clock signal, AS activation signal
Fig. 11.10 Three approaches to task scheduling with voltage scaling
Fig. 11.11 Schematic diagram of a clustered sensor network
Fig. 11.12 Schematic diagram of a clustered sensor network with sensor nodes
Fig. 11.13 Protocol operation of assisted-LEACH
Fig. 12.1 Simplified schematic diagram of a computer system
Fig. 12.2 Codes after “before inlining” and “after inlining”
Fig. 12.3 Codes after “before code hoisting” and “after code hoisting”
Fig. 12.4 Dead-store elimination
Fig. 12.5 Dead-code elimination
Fig. 12.6 Loop-invariant computation
Fig. 12.7 Loop unrolling
Fig. 12.8 Loop unrolling, where n = 10,000 and uf = 8. a Original code. b Transformed code
Fig. 12.9 Loop tiling, where n = 10,000 and block = 32. a Original code. b Transformed code
Fig. 12.10 Loop permutation, where n = 256. a Original code. b Transformed code
Fig. 12.11 Strength reduction, where n = 10,000. a Original code. b Transformed code
Fig. 12.12 Loop fusion, where n = 10,000. a Original code. b Transformed code
Fig. 12.13 Loop peeling, where n = 10,000. a Original code. b Transformed code
Fig. 12.14 Loop unswitching, where n = 10,000. a Original code. b Transformed code
Fig. 12.15 3D Jacobi’s kernel
Fig. 12.16 3D Jacobi’s kernel with software prefetching
Fig. 12.17 General structure of a program with software prefetching
Fig. 12.18 General structure of power-aware software prefetching program (PASPP)
Fig. 12.19 3D Jacobi’s kernel with power-aware software prefetching
Fig. 12.20 Detailed power dissipation at different units for three versions of 3D Jacobi’s kernel
List of Tables