
Signal Processing (E.g. For Multimedia and Wireless Communications)

This document discusses two kinds of computation and how each is optimized for power consumption. Stream-based signal processing gains nothing from throughput beyond its real-time constraint, while general purpose processing is bursty and benefits from higher speed. Architecture-level and circuit-level optimizations that tailor a design to the computation type can significantly improve energy efficiency. The central technique is lowering the supply voltage, which reduces switching energy quadratically but increases delay.

Uploaded by

Harinath Reddy


Lecture 2 - 225 C
Architecture and System Level Optimization of Power Consumption

Two Kinds of Computation

• Signal processing (e.g. for multimedia and wireless communications)
  • Stream-based computation
  • No advantage in obtaining throughput in excess of the real-time constraint
• General purpose processing (for downloaded code)
  • Bursty: mostly idle, with bursts of computation
  • Faster is better

Potential of computation specific energy optimization

• Conventional general purpose processors
  • Clock rate is everything ... somehow we'll get the power in and out
  • 10-100 watts, 100-1000 Mops = 0.01 Mops/mW
• Energy optimized but general purpose
  • Keep the generality, but reduce the energy as much as possible, e.g. StrongARM
  • 0.5 watts, 130 Mops = 0.3 Mops/mW
• Energy optimized and dedicated
  • 100 Mops/mW

Switching Energy

[Figure: CMOS inverter with supply Vdd, input Vin, and output Vout driving load capacitance CL]

• Energy/transition = $C_L V_{dd}^2$
• Power = Energy/transition × f = $C_L V_{dd}^2 f$
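The two switching formulas and the Mops/mW efficiency figures can be checked with a few lines of Python. This is a minimal sketch: the function names and the 1 pF / 40 MHz example node are illustrative assumptions, not values from the slides.

```python
def switching_energy(c_load: float, vdd: float) -> float:
    """Energy drawn per output transition, in joules: C_L * Vdd^2."""
    return c_load * vdd ** 2


def switching_power(c_load: float, vdd: float, freq: float) -> float:
    """Average switching power in watts: C_L * Vdd^2 * f."""
    return switching_energy(c_load, vdd) * freq


def mops_per_mw(mops: float, watts: float) -> float:
    """Energy efficiency in Mops per milliwatt."""
    return mops / (watts * 1000.0)


if __name__ == "__main__":
    # Hypothetical node: 1 pF switched at 40 MHz from a 5 V supply.
    print(switching_power(1e-12, 5.0, 40e6), "W")
    # Energy-optimized general purpose point from the slide: 0.5 W, 130 Mops.
    print(mops_per_mw(130, 0.5), "Mops/mW")
```

The 0.5 W, 130 Mops point works out to 0.26 Mops/mW, which the slide rounds to 0.3.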

Power-Delay Product

[Figure: normalized power-delay product vs. Vdd (volts) on logarithmic axes, for a 51 stage ring oscillator and an 8-bit adder]

• $P \times t_d = E_t = C_L V_{dd}^2$
• Quadratic dependence: $\frac{E(V_{dd}=2)}{E(V_{dd}=5)} = \frac{C_L (2)^2}{C_L (5)^2}$, so $E(V_{dd}=2) \approx 0.16\,E(V_{dd}=5)$
• Strong function of voltage ($V^2$ dependence)
• Relatively independent of logic function and style

Normalized Delay vs. Supply Voltage

[Figure: normalized delay vs. Vdd (volts), 2.0 µm technology, for a multiplier, clock generator, ring oscillator, microcoded DSP chip, adder, and adder (SPICE)]

• $T_d = \frac{C_L V_{dd}}{I}$
• Lowering $V_{dd}$ reduces energy but increases delays

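The quadratic energy dependence and the first-order delay model can be written out directly. A minimal sketch follows; the function names are illustrative, and the drive current is treated as a given number even though in a real device it also falls as the supply is lowered.

```python
def energy_ratio(v_low: float, v_high: float) -> float:
    """Ratio of switching energies at two supplies: (V_low / V_high)^2."""
    return (v_low / v_high) ** 2


def first_order_delay(c_load: float, vdd: float, current: float) -> float:
    """First-order gate delay T_d = C_L * Vdd / I, in seconds."""
    return c_load * vdd / current


if __name__ == "__main__":
    # Dropping the supply from 5 V to 2 V, as on the power-delay slide.
    print(energy_ratio(2.0, 5.0))
```

For the 2 V and 5 V supplies quoted on the slide the ratio is (2/5)² = 0.16, matching the stated E(Vdd=2) ≈ 0.16 E(Vdd=5).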
Architecture Trade-offs - Reference Datapath

[Figure: reference datapath built from latches A, B, and C, an adder, and a comparator, with latches clocked at rate 1/T. Area = 636 x 833 µm²]

• Critical path delay ⇒ $T_{adder} + T_{comparator}$ (= 25 ns) ⇒ $f_{ref}$ = 40 MHz
• Total capacitance being switched = $C_{ref}$
• $V_{dd} = V_{ref}$ = 5 V
• Power for reference datapath = $P_{ref} = C_{ref} V_{ref}^2 f_{ref}$

from [Chandrakasan92] (IEEE JSSC)

Parallel Datapath

[Figure: parallel datapath with two copies of the adder-comparator datapath, each with its own latches clocked at rate 1/2T, and an output multiplexer switching at rate 1/T. Area = 1476 x 1219 µm²]

• The clock rate can be reduced by half with the same throughput ⇒ $f_{par} = f_{ref}/2$
• $V_{par} = V_{ref}/1.7$, $C_{par} = 2.15\,C_{ref}$
• $P_{par} = (2.15\,C_{ref})(V_{ref}/1.7)^2 (f_{ref}/2) \approx 0.36\,P_{ref}$

The More Parallel the Better??

[Figure: normalized power vs. Vdd (volts, 1 to 5) at fixed throughput, running from the minimal-area design at high voltage to a minimal-power point at lower voltage]

• Capacitance overhead starts to dominate at "high" levels of parallelism and results in an optimum voltage

Pipelined Datapath

[Figure: pipelined datapath with latches A, B, C1, C2, and a pipeline latch P between the adder and the comparator, all clocked at rate 1/T. Area = 640 x 1081 µm²]

• Critical path delay is less ⇒ max[$T_{adder}$, $T_{comparator}$]
• Keeping clock rate constant: $f_{pipe} = f_{ref}$
• Voltage can be dropped ⇒ $V_{pipe} = V_{ref}/1.7$
• Capacitance slightly higher: $C_{pipe} = 1.15\,C_{ref}$
• $P_{pipe} = (1.15\,C_{ref})(V_{ref}/1.7)^2 f_{ref} \approx 0.39\,P_{ref}$
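The parallel and pipelined power figures follow from P = C V² f with each term expressed relative to the reference datapath. The sketch below is illustrative: the function name is an assumption, and it uses 2.9 V as the rounded value of 5 V / 1.7 ≈ 2.94 V, which is what reproduces the 0.36 and 0.39 quoted on the slides (the unrounded voltage gives about 0.37 and 0.40).

```python
def normalized_power(c_ratio: float, vdd: float, f_ratio: float,
                     v_ref: float = 5.0) -> float:
    """Power relative to the reference datapath: (C/C_ref) * (V/V_ref)^2 * (f/f_ref)."""
    return c_ratio * (vdd / v_ref) ** 2 * f_ratio


if __name__ == "__main__":
    # Parallel: 2.15x capacitance, half the clock rate, supply lowered to 2.9 V.
    print(round(normalized_power(2.15, 2.9, 0.5), 2))
    # Pipelined: 1.15x capacitance, same clock rate, supply lowered to 2.9 V.
    print(round(normalized_power(1.15, 2.9, 1.0), 2))
```

Both architectures spend extra capacitance (area) to buy timing slack, then convert that slack into a lower supply voltage; the quadratic voltage term is what makes the trade worthwhile.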

Architecture Summary for a Simple Datapath

Architecture type                                  Voltage   Area   Power
Simple datapath (no pipelining or parallelism)     5 V       1      1
Pipelined datapath                                 2.9 V     1.3    0.39
Parallel datapath                                  2.9 V     3.4    0.36
Pipeline-Parallel                                  2.0 V     3.7    0.2

Algorithmic Transformations

[Figure: recursive filter with input XN, output YN, an adder, a multiplier by A, and a delay D in the feedback path (left); the loop-unrolled version, which produces YN-1 and YN from XN-1 and XN in one iteration with a delay of 2D (right)]

                Original   Loop unrolled
Ceff            1          2
Voltage         5          5
Throughput      1          2
Power           25         25

Ceff = effective normalized capacitance (mult-add capacitance = 1)

• Loop unrolling does not reduce power consumption

from [Chandrakasan95] (IEEE TCAD)

Loop Unrolling Enables Other Transformations

[Figure: the unrolled filter after algebraic transformations and constant propagation, with multipliers A, A, and A² and a delay of 2D (left); the same structure after pipelining, with additional delays D inserted (right)]

After algebraic transformations & constant propagation:
• Ceff = 3
• Voltage = 3.7
• Throughput = 2
• Power = 20 (20% reduction)

After pipelining:
• Ceff = 3
• Voltage = 2.9
• Throughput = 2
• Power = 12.5 (x2 reduction)

Speed vs. Power Optimization

[Figure: speedup, voltage, capacitance, and power (fixed throughput) vs. unrolling factor (1 to 7)]

• Area can be traded for higher throughput or lower power
• ARBITRARY SPEEDUP vs. FINITE POWER REDUCTION
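The power numbers on the transformation slides are consistent with a simple normalized model: power at fixed sample rate equals Ceff × Voltage², divided by the throughput factor because the clock can be slowed by that factor. This formula is an interpretation that reproduces the slide values, not something the slides state explicitly; the function name is illustrative.

```python
def power_at_fixed_throughput(c_eff: float, voltage: float,
                              throughput: float) -> float:
    """Normalized power when the clock is slowed by `throughput` to keep the
    sample rate constant: C_eff * V^2 / throughput (assumed model)."""
    return c_eff * voltage ** 2 / throughput


if __name__ == "__main__":
    designs = {
        "original":              (1, 5.0, 1),
        "loop unrolled":         (2, 5.0, 2),
        "algebraic + const prop": (3, 3.7, 2),
        "pipelined":             (3, 2.9, 2),
    }
    for name, (c_eff, v, t) in designs.items():
        print(f"{name:24s} {power_at_fixed_throughput(c_eff, v, t):6.2f}")
```

Under this model unrolling alone leaves power at 25, while the later transformations give roughly 20.5 and 12.6, in line with the slide's 20 and 12.5. The saving comes entirely from the lower voltage that the shorter critical path allows.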

Multiple Supply Voltage Systems: Filter Example

[Figure: filter data-flow graph scheduled over control steps 1 to 10, with multiplications and additions assigned to 5 V, 3 V, and 2.4 V supplies]

• Power (5V) / Power (5V, 3V, 2.4V) = 1.5
• Similar approach to logic design proposed in [Usami95]

from [Raje95]

Time-multiplexed Architectures

[Figure: parallel busses for I and Q carrying I0, I1, I2 and Q0, Q1, Q2 with one sample per period T, compared with a time-shared bus for I,Q carrying I0, Q0, I1, Q1, ... with one sample per T/2; plots of signal value (-20 to 30) vs. time (sample number) for both cases]

• Can destroy signal correlations and increase the switching activity

Optimizing Multiplications

A = IN * 0011
B = IN * 0111

Shift-add form:
A = (IN >> 4) + (IN >> 3)
B = (IN >> 4) + (IN >> 3) + (IN >> 2)

With the common sub-expression shared:
A = (IN >> 4) + (IN >> 3)
B = A + (IN >> 2)

[Figure: number of shift-add operations (6 to 16) vs. scaling factor αq (1.10 to 1.25), for "Only Scaling" and for "Scaling & Common Sub-expression"]

Number Representation

[Figure: transition probability vs. bit number (0 to 14) for two's complement and for sign magnitude, each with rapidly varying and slowly varying inputs]

• Sign-extension activity significantly reduced using sign-magnitude representation

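The common sub-expression optimization can be illustrated with plain integer shifts. This is a sketch with hypothetical function names; it treats the coefficient bits 0011 and 0111 as the binary fractions 3/16 and 7/16, which is what the shift amounts on the slide imply.

```python
def scale_a(x: int) -> int:
    """A = IN * 0.0011b, one addition."""
    # Parentheses matter: '+' binds tighter than '>>' in Python.
    return (x >> 4) + (x >> 3)


def scale_b_direct(x: int) -> int:
    """B = IN * 0.0111b computed on its own, two additions."""
    return (x >> 4) + (x >> 3) + (x >> 2)


def scale_b_shared(a: int, x: int) -> int:
    """B computed by reusing A, one addition."""
    return a + (x >> 2)


if __name__ == "__main__":
    x = 1024
    a = scale_a(x)
    print(a, scale_b_direct(x), scale_b_shared(a, x))
```

Computing A and B separately costs three additions; sharing A reduces the total to two while producing the same result, since the shifted terms are identical in both forms.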
Two's Complement vs. Sign-Magnitude

[Figure: transition activity (0.0 to 1.0) vs. bit number (0 to 12) at nodes SUMA and SUMB of an adder chain fed by IN, for two's complement and for sign-magnitude]

• Two's complement datapath has a significantly higher glitching activity

Reducing Activity by Reordering Inputs

[Figure: two orderings of the additions of IN, IN >> 7, and IN >> 8, related by associativity and commutativity, each with intermediate sum SUM1 and final sum SUM2; transition probability (0.0 to 0.5) vs. bit number (0 to 14) at SUM1 and SUM2 for each ordering]

• 30% reduction in switching energy
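Why sign-extension bits dominate the activity of a two's complement bus can be seen by counting bit flips for a small signal that keeps crossing zero. The sketch below is illustrative: the 16-bit word width, the sample sequence, and the function names are assumptions chosen for the example.

```python
def twos_complement(value: int, bits: int = 16) -> int:
    """Encode a signed integer as a two's complement word."""
    return value & ((1 << bits) - 1)


def sign_magnitude(value: int, bits: int = 16) -> int:
    """Encode a signed integer as a sign bit plus magnitude."""
    sign = (1 << (bits - 1)) if value < 0 else 0
    return sign | abs(value)


def count_toggles(words: list) -> int:
    """Total number of bit flips between consecutive words."""
    return sum(bin(a ^ b).count("1") for a, b in zip(words, words[1:]))


if __name__ == "__main__":
    samples = [1, -1, 2, -2, 1, -1]  # small signal oscillating around zero
    tc = count_toggles([twos_complement(s) for s in samples])
    sm = count_toggles([sign_magnitude(s) for s in samples])
    print("two's complement toggles:", tc)
    print("sign-magnitude toggles:  ", sm)
```

In two's complement every zero crossing flips all the sign-extension bits, whereas in sign-magnitude only the sign bit and a few low-order bits change.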

Resource Sharing Can Increase Activity

[Figure: Counter 1 and Counter 2 driving separate busses BUS1 and BUS2, or one shared bus; number of bus transitions per cycle (0 to 10) vs. skew between counter outputs (0 to 250), with the no-bus-sharing level marked for comparison]

• Number of bus transitions per cycle = 2 (1 + 1/2 + 1/4 + ... + 1/128) ≈ 4

Memory Architecture

[Figure: 4 bit display interface. Serial access: memory cell array with row decoding, read 4 bits at a time at rate f, voltage = 3 V. Parallel access: memory cell array with row decoding, read as 8 nibbles at rate f/8 into a latch and multiplexed out at rate f, voltage = 1.1 V]
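The bus-transition formula can be checked by stepping a binary counter through one full period and counting bit flips. This is a small sketch; the function name is illustrative, and the 8-bit width is inferred from the series ending at 1/128.

```python
def average_toggles_per_cycle(bits: int = 8) -> float:
    """Average number of output bits that flip per increment of a
    free-running binary counter, measured over one full period."""
    period = 1 << bits
    mask = period - 1
    total = sum(bin(n ^ ((n + 1) & mask)).count("1") for n in range(period))
    return total / period


if __name__ == "__main__":
    per_counter = average_toggles_per_cycle(8)
    # Two counters, each on its own bus.
    print(2 * per_counter)
```

Bit k of the counter flips once every 2^k cycles, which gives the series 1 + 1/2 + ... + 1/128 per counter, and two independent busses give the factor of 2. When the counters share one bus, consecutive bus values come from different counters, so this sequential correlation is lost and the activity depends on the skew between them.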

General Purpose computing - Do we just optimize power?

Power = (Energy / Operation) × (Operations / Second)

NO! What is important?

• Operations per battery life: minimize energy consumed per operation
and
• Operations per second: maximize throughput ≡ operations/second

The complete subsystem should be optimized

Generic system topology:

[Figure: CPU (0.2-2 W), support ICs such as crystal, PLD, and PROM (0.1-2 W), glue logic (0.05-0.3 W), main memory (0.2-2 W), and I/O interface (0.05-1 W), connected by a bus]

• Power dissipation is distributed

Proposed Design Methodology - (Tom Burd, Anthony Stratakos and Trevor Pering)

• Instruction set architecture
• Energy efficient system organization
• Dynamically adjust throughput to user's needs
• Apply energy efficient circuit and architecture design techniques
⇒ Energy efficient processor system

Demonstration Vehicle

Redesign the InfoPad processor subsystem

[Figure: current subsystem on a processor bus (45 mW): clock oscillator (45 mW), ARM60 (120 mW), PLD (400 mW), I/O interface (40 mW), and SRAM 128k x 8 (600 mW)]

• Current system: 10 MIPS @ 1.2 W

Processor Usage Model

[Figure: desired throughput vs. time, with peaks of compute-intensive and low-latency computation, stretches of background and high-latency computation, and a ceiling set by the top speed of the processor]

• Not always computing

Simplest Approach: Compute ASAP

[Figure: delivered throughput (80 MIPS) vs. time, showing excess throughput above the desired level]

• Wake up → Compute ASAP → Go to idle/sleep mode
• Always high throughput
• Always high energy

Another Approach: Reduce Clock Frequency

[Figure: delivered throughput vs. time, reduced below 80 MIPS by lowering fCLK; the frequency is set by the user, as with the PowerBook control panel (Slow to Fast)]

• Energy remains unchanged... while throughput & power scale down with fCLK
• Reducing power dissipation is not always equivalent to reducing energy consumption

Clock rate reduction doesn't help energy consumption

• Energy is independent of clock rate
  • Number of operations = $N_{ops}$
  • Energy/operation = $CV^2$
  • Total energy = $CV^2 \times N_{ops}$
• Reducing the clock rate only degrades throughput, with no savings in battery life, unless the voltage is changed

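The difference between reducing power and reducing energy can be made concrete with a toy model of a fixed job. This is a sketch under stated assumptions: one operation per clock cycle, a hypothetical effective capacitance of 1 nF per operation, and illustrative function and variable names.

```python
def run_job(n_ops: float, c_eff: float, vdd: float, f_clk: float):
    """Return (total energy in J, run time in s, average power in W)
    for a job of n_ops operations, assuming one operation per clock."""
    energy = c_eff * vdd ** 2 * n_ops   # Total energy = C V^2 * N_ops
    run_time = n_ops / f_clk
    return energy, run_time, energy / run_time


if __name__ == "__main__":
    fast = run_job(1e6, 1e-9, 3.3, 80e6)
    slow = run_job(1e6, 1e-9, 3.3, 40e6)   # same voltage, half the clock
    print("fast:", fast)
    print("slow:", slow)
```

Halving the clock halves the average power and doubles the run time, so the energy taken from the battery for the job is unchanged. Only lowering the voltage changes the CV² term.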
Dynamic Voltage Scaling

[Figure: delivered throughput vs. time, tracking the desired throughput up to the peak]

• Reduce throughput & fCLK, reduce energy/operation
• Dynamically scale energy with clock rate
• Extend battery life by up to 10x with the same hardware
• Key: Process scheduler determines operating point.

Scale Energy with Throughput, fCLK

[Figure: energy (Watts/MIPS, normalized 0 to 1.0) vs. throughput (∝ fCLK, normalized 0 to 1.0). With constant supply voltage the energy stays at the 3.3 V level. With reduced supply voltage, where circuit speed tracks fCLK, the energy falls toward the 1.2 V level, a ~10x energy reduction]

Normalized data (simulated, 0.6 µm process)

Minimal Hardware Implementation

Modify existing DC-DC converter feedback loop [Stratakos]
• 10 msec per frequency transition
• Clock tracks over process and temp.

[Figure: feedback loop. A frequency register, set by a special load-register instruction, holds the desired frequency; it is compared with the ring oscillator frequency fCLK, and the difference ∆f drives the DC-DC converter, whose output VDD supplies the ring oscillator]

• Add register to ISA

DVS in Practice

Fixed throughput, energy/operation:
• Throughput = 10 MIPS: energy/op. = 1 nJ/inst. (10 mW)
• Throughput = 80 MIPS: energy/op. = 9 nJ/inst. (720 mW)

Occasionally demand peak throughput:
• Peak throughput = 80 MIPS
• Average energy/op. ≈ 1 nJ/inst

(Peak throughput 11% of the time... average energy/op = 2 nJ/inst)
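The power figures in parentheses follow from power = throughput × energy per instruction. The sketch below checks them and also tries one possible reading of the 11% remark, a time-weighted mean of the two operating points; that reading is an assumption, not something the slide states, and the function names are illustrative.

```python
def power_watts(mips: float, nj_per_inst: float) -> float:
    """Power = (instructions per second) * (joules per instruction)."""
    return (mips * 1e6) * (nj_per_inst * 1e-9)


def time_weighted_energy(peak_fraction: float, low_nj: float,
                         peak_nj: float) -> float:
    """Assumed interpretation: mean energy/op weighted by time spent
    at each operating point."""
    return (1.0 - peak_fraction) * low_nj + peak_fraction * peak_nj


if __name__ == "__main__":
    print(power_watts(10, 1.0))   # low-throughput operating point
    print(power_watts(80, 9.0))   # peak-throughput operating point
    print(time_weighted_energy(0.11, 1.0, 9.0))
```

The two operating points give 10 mW and 720 mW as on the slide, and the time-weighted mean comes to 1.88 nJ/inst, which rounds to the quoted 2 nJ/inst.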

Main Memory: IC Design

Use existing low-power memory block [Burstein]
• 4 kByte block: 3.2 mm², 0.6 µm
• Access time = 22 ns
• Energy/access = 120 pJ

Design 64 kByte IC:
• Access time ~ 40 ns
• Energy/access ~ 300 pJ
• 5-10x better than commercial

Key: SRAM must be DVS compatible.

Main Memory: Architecture

Standard memory architecture design
[Figure: four 8-bit-wide SRAMs accessed together to supply a 32-bit word]

Proposed memory architecture design
[Figure: four 32-bit-wide SRAMs on a shared 32-bit bus]

• Only activate one SRAM → power reduced by 4x
• Micro-power bus driver makes the power cost of the extra bus load negligible

Self-timed Approach for Eliminating Glitching

[Figure: timing sequence from INACTIVE through Block Selected to Output Valid, with output enable OEN gating the 32-bit data output]

• Output remains tri-stated until sense-amp/latch has resolved data
• Enable tri-state drivers after sense-amp outputs are valid to eliminate glitching on the data bus

Glitch Free, Low Swing RAM Bitslice

[Figure: RAM bitslice with row decode, cell columns, precharge (PRE) devices, bitline pairs B0 and B1, column select/cascode amp (SEL0, SEL1), and sense-amp driving Data Out]

• Bitlines precharged to Vdd - Vtn

Achievable energy levels

10 MIPS, 1 nJ/inst. (10 mW) ⇔ 80 MIPS, 9 nJ/inst. (720 mW)

[Figure: energy per instruction by block: DC-DC converter (100 pJ), LP-ARM CPU (500 pJ), processor bus (<< 100 pJ), I/O interface (100 pJ), 0.5 MB SRAM of 8 ICs (300 pJ)]

• Improves energy efficiency by an order of magnitude

Critical circuit - High efficiency DC-DC conversion using a switching regulator

[Figure: buck converter. Input Vin with capacitor Cin; pass device M1 (gate Vg1) and synchronous rectifier M2 (gate Vg2) drive node X (voltage Vx, capacitance Cx); inductor Lf (current ILf) and capacitor Cf filter the output Vdd across load RL]

• Arbitrary Vdd (< Vin) generated using the buck converter
• $V_{dd} = V_{in} \times$ (duty cycle at node X)

Chief sources of inefficiencies:
⇒ Conduction loss ($I^2 R$)
⇒ Switching loss ($C_x V_{in}^2 f_s$ and $L_s I^2 f_s$)
⇒ Gate-drive loss ($C_g V_{in}^2 f_s$)

from [Stratakos94] (IEEE PESC)
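The ideal buck relation and the three loss terms can be put into a short calculator. This is a sketch only: the function names and the component values in the example are hypothetical, and a single current I is used in both the conduction and inductive terms, as the slide's notation suggests.

```python
def buck_output_voltage(v_in: float, duty: float) -> float:
    """Ideal buck converter: Vdd = Vin * duty cycle at node X."""
    return v_in * duty


def duty_for_output(v_out: float, v_in: float) -> float:
    """Duty cycle needed for a target output (requires v_out < v_in)."""
    return v_out / v_in


def converter_losses(i: float, r: float, c_x: float, c_g: float,
                     l_s: float, v_in: float, f_s: float) -> dict:
    """Loss terms from the slide, in watts."""
    return {
        "conduction": i ** 2 * r,
        "switching": c_x * v_in ** 2 * f_s + l_s * i ** 2 * f_s,
        "gate_drive": c_g * v_in ** 2 * f_s,
    }


if __name__ == "__main__":
    d = duty_for_output(1.2, 3.3)          # hypothetical 3.3 V input
    print(d, buck_output_voltage(3.3, d))
    print(converter_losses(i=1.0, r=0.1, c_x=1e-9, c_g=2e-9,
                           l_s=1e-9, v_in=5.0, f_s=1e6))
```

Every term except conduction loss scales linearly with the switching frequency fs, which is why the techniques on the following slides target the switching and gate-drive terms.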

Soft-Switching Eliminates $C_x V^2 f$ Loss

[Figure: buck converter with pass device M1 (gate Vgp), rectifier M2 (gate Vgn), node capacitance Cx, inductor Lf, capacitor Cf, and load RL; waveforms of Vx, ILf relative to Iout, |Vgsp|, and Vgsn vs. time, marking the PASS DEVICE ON and RECTIFIER ON intervals]

• Dead-time when neither FET conducts
• Current reverses
• Lf charges and discharges Cx
• FETS ARE SWITCHED WITH VDS = 0

What happens when Iout changes?

[Figure: Vgn and Vx waveforms for two cases. Iout ↓ ⇒ Cx discharges slowly, and the rectifier discharges Cx. Iout ↑ ⇒ Cx discharges quickly, leading to body diode conduction]

• Inverter node transition times depend on Iout
• Typical schemes use a fixed dead time set by gate delays
• Adaptive dead-time control needed for varying Iout

Switcher Design: Power Transistor Sizing

[Figure: normalized FET losses vs. gate width, showing $P_{gd} = a f_s W$, $P_{cl} = b/W$, and $P_{total} = P_{gd} + P_{cl}$ with its minimum at $W_{opt}$]

• Minimize $P_{total} = P_{gate\text{-}drive} + P_{conduction\ loss}$
• $W_{opt} = \sqrt{\dfrac{b}{a \cdot f_s}}$

Low Voltage Support Circuitry: Level Converter

[Figure: level converter with devices M1 to M4 (labeled sizes 4/3, 24/2, 8/2, 4/2), output enable OEN, input VIN swinging 0 ↔ VddL, and output VOUT swinging 0 ↔ VddH]

• Tri-stateable output driver
• Compatibility with 3.3V/5V standard components
• $(V_{OH})_{IN}$ = 1.1 V to 5 V and $(V_{OH})_{OUT}$ = 1.1 V to 5 V
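The optimum gate width follows from setting the derivative of P_total = a·f_s·W + b/W to zero, which gives a·f_s = b/W² and hence W_opt = sqrt(b / (a·f_s)). A small numerical check is sketched below; the coefficient values are arbitrary illustrative numbers, not device data, and the function names are assumptions.

```python
import math


def total_fet_loss(w: float, a: float, b: float, f_s: float) -> float:
    """P_total = gate-drive loss (a * f_s * W) + conduction loss (b / W)."""
    return a * f_s * w + b / w


def optimal_width(a: float, b: float, f_s: float) -> float:
    """Width that minimizes P_total: sqrt(b / (a * f_s))."""
    return math.sqrt(b / (a * f_s))


if __name__ == "__main__":
    a, b, f_s = 1.0, 16.0, 4.0          # arbitrary normalized coefficients
    w_opt = optimal_width(a, b, f_s)
    print(w_opt, total_fet_loss(w_opt, a, b, f_s))
    # Neighboring widths should give a higher total loss.
    print(total_fet_loss(0.95 * w_opt, a, b, f_s),
          total_fet_loss(1.05 * w_opt, a, b, f_s))
```

At the optimum the gate-drive and conduction losses are equal, which is the crossing point of the two curves in the loss plot.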

Other uses of adaptive DC-DC converters

• Adaptive supplies
• Self-timed circuits
• Adaptation to varying algorithmic workloads

Adaptive Power Supply Voltages

[Figure: self-timed processor between input and output FIFOs with registers (REG); a power supply control block sets the supply VDD(t)]

• Exploit data dependent computation times to vary the supply

from [Nielsen94] (IEEE Transactions on VLSI Systems)

But Self-timed Circuits are Expensive...

[Figure: differential gate with complementary inputs IN/INB and outputs OUT/OUTB, supplied from Vdd with current source I]

• Guaranteed transition for every operation: $\alpha_{0 \to 1} = 1$
• Use synchronous DSP instead

Critical path based voltage optimization

[Figure: an equivalent critical path is driven from the regulated supply and its output is compared, using a comparator and reference VDD_Ref, to produce the regulated voltage to the DSP]

• Feedback adjusts the regulated voltage to the point where the equivalent critical path is about to fail

from [Macken90]

Case Study: A Portable Multimedia I/O Terminal

[Figure: InfoPad terminal block diagram with antenna, radio modem, protocol module (2 mW), video decompression module (2 mW), text/graphics frame-buffer module (1 mW), pen digitizer, and speech codec]

• Protocol, ECC, buffering, video decompression, and I/O
• (InfoPad Terminal developed at U.C. Berkeley)

Chipset Summary (1.2-µm, Vt = 0.7-0.9V)

Chip Description                             Area (mm x mm)   Minimum Supply Voltage   Power at 1.5V
Protocol                                     9.4 x 9.1        1.1V                     1.9 mW
Frame-buffer SRAM (for 640x480 display)      7.8 x 6.5        1.1V                     1 mW
Video Controller                             6.7 x 6.4        1.1V                     150 µW
Luminance Decompression                      8.5 x 6.7        1.1V                     115 µW
Chrominance Decompression                    8.5 x 9.0        1.1V                     100 µW
Color Space Conversion and Triple DAC        4.1 x 4.7        1.3V                     1.1 mW

from [Chandrakasan94]

Video Decompression Module

[Figure: video controller (demultiplex, NTSC timing, frame-buffer control, LUT control, variable sized packets, synchronization) feeding luminance decompression (Y; ping-pong frame-buffer, lookup table) and chrominance decompression (I, Q; ping-pong frame-buffer, lookup table), followed by a color space translator from digital YIQ to analog RGB]

• 100 µWatts compared to commercial 1 Watt - Why??

Digital YIQ -> Digital RGB

$$\begin{bmatrix} R \\ G \\ B \end{bmatrix} = \begin{bmatrix} D_{11} & D_{12} & D_{13} \\ D_{21} & D_{22} & D_{23} \\ D_{31} & D_{32} & D_{33} \end{bmatrix} \begin{bmatrix} Y \\ I \\ Q \end{bmatrix}$$

Optimized matrix multiplication (6 mults -> 8 adds)
• Hardwired shift-add operations
• Coefficient scaling to minimize shift-add operations
• Exploit multiple coefficients multiplied with the same input

Color Space Translator and Triple DAC

Key Features:
• Digital YIQ -> Analog RGB
• Optimized multiplications
• Number representation
• Optimized time-sharing
• Integrated low-voltage DACs

[Figure: IN → matrix computation (add tree) → saturation → DACR, DACG, DACB]

• Power @ 1.3V: 0.93 mW
• Clock rate: 2.5 MHz
• Size: 4.1 mm x 4.7 mm
• 1.2 µm technology

Power reduction approaches which make up the factor of 10,000 improvement

Design Consideration     Approach                                         Power Reduction
Frequency                14 MHz -> 2.5 MHz                                5.6
Supply Voltage           5 V -> 1.5 V                                     11
Library Optimization     Minimum sized devices, single phase clocking     2-3
Matrix Multiplication    Hardwired shift-add, coefficient optimization    7
Resource Allocation      Fully parallel implementation                    1.5-2
Number Representation    Sign-magnitude                                   1.2
Off Chip Drivers         Integrate processing and DAC                     1.4
Bitwidth                 8 bits -> 6 bits                                 1.3

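Because the individual reductions multiply, the table can be sanity-checked by taking the product of its factors. The sketch below does this for the low and high ends of the quoted ranges; the variable names are illustrative, and treating the factors as independent multipliers is an assumption.

```python
import math

# (low, high) power reduction factors, copied from the table.
FACTORS = {
    "frequency 14 MHz -> 2.5 MHz": (5.6, 5.6),
    "supply voltage 5 V -> 1.5 V": (11.0, 11.0),
    "library optimization":        (2.0, 3.0),
    "matrix multiplication":       (7.0, 7.0),
    "resource allocation":         (1.5, 2.0),
    "number representation":       (1.2, 1.2),
    "off chip drivers":            (1.4, 1.4),
    "bitwidth 8 -> 6 bits":        (1.3, 1.3),
}

low_total = math.prod(low for low, _ in FACTORS.values())
high_total = math.prod(high for _, high in FACTORS.values())

if __name__ == "__main__":
    print(f"combined reduction: {low_total:.0f}x to {high_total:.0f}x")
    # The first two rows follow directly from P = C V^2 f.
    print(14 / 2.5, (5 / 1.5) ** 2)
```

The listed factors multiply to roughly 2,800x to 5,700x, the same order of magnitude as the 10,000x headline (1 W commercial vs. 100 µW). The frequency and voltage rows alone account for about 62x of it.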
Summary

• Signal statistics can be exploited to minimize the number of transitions required to perform a given function
• Architectural voltage scaling is a key technique for low-voltage operation
• Variable power supply reduces power and buffering trades latency for power
• Orders of magnitude of power reduction are possible
