
Power management architecture of the 2nd generation Intel® Core™ microarchitecture, formerly codenamed Sandy Bridge

Efi Rotem - Sandy Bridge power architect


Alon Naveh, Doron Rajwan,
Avinash Ananthakrishnan, Eli Weissmann

Hot Chips Aug-2011


Agenda
 Power management overview
 Intel® Turbo Boost Technology 2.0
 Thermal management
 Energy efficiency
 Average power management
 Platform view
 Summary

High CPU and PG performance


Power and energy efficiency

Sandy Bridge - Hot Chips 2011 2


Power management overview



Sandy Bridge power mgmt ID card
• Sandy Bridge is:
− 1-4 CPU cores + PG
− Integrated System Agent (SA)
− Sliced LLC shared by all cores/PG
− Ring interconnect + power management link
• Package Control Unit (PCU):
− On chip logic and embedded controller running power management firmware
− Communicates internally with cores, ring and SA
− Monitors physical conditions: voltage, temperature, power consumption
− Controls power states: CPU and PG voltage and frequency
− Controls voltage regulators, DDR and system
• External power management interface
− Accepts external inputs: system power management requests and limits
− Power and temperature reading
− MSR, MMIO and PECI system bus

[Figure: die block diagram with System Agent (DMI, PCI Express*, IMC, Display, PCU), four Core + LLC slices and Graphics. The PCU connects to the VR over SVID and to the EC over PECI.]



Voltage and frequency domains
• Two independent variable power planes:
− CPU cores, ring and LLC
  · Embedded power gates - each core can be turned off individually
  · Cache power gating - turn off portions or all cache at deeper sleep states
− Graphics processor
  · Can be varied or turned off when not active
• Shared frequency for all IA32 cores and ring
• Independent frequency for PG
• Fixed programmable power plane for System Agent
− Optimize SA power consumption
− System On Chip functionality and PCU logic
− Periphery: DDR, PCIe, Display

[Figure: die power planes: VCC Periphery, VCC SA, VCC Core (ungated) feeding four gated VCC Core domains through embedded power gates, and VCC Graphics.]



Power performance fundamentals
 Maximize user experience under multiple constraints
 User Experience (May have different preferences):
− Throughput performance
− Responsiveness - burst performance
− CPU / PG performance
− Battery life / Energy bills
− Ergonomics (acoustic noise, heat)
 Optimizing around Constraints to meet user preferences
− Silicon capabilities
− System Thermo-Mechanical capabilities
− Power delivery capabilities
− S/W and Operating system explicit control
− Workload and usage

Rich set of control knobs for the user and system designer
to tailor power - performance preferences



Platform Power management features topology
Layers, from top to bottom:
• S/W: operating system, PG driver, BIOS, Embedded Controller and user preferences
• Power / perf optimization control: power/performance optimization algorithms; milliseconds to seconds control algorithms
• Real time events control: PCU “kernel” – mission critical power management events; C-state control, P-state transitions and latency sensitive actions
• Physical layer control: thermal sensing, maximum current control, physical layer communication
• Platform control: DDR thermal, Voltage Regulator optimization, hot sensors etc.



Intel® Turbo Boost Technology 2.0



Power metering
 Power management is based on power metering
 Sandy Bridge implements a digital power meter
 3rd generation of power metering in Intel® products
 Active power - Event counters track main building blocks activities
− 100 Micro arch. event counters - apply active energy cost to each event
− CPU, PG, Ring, Cache, and I/O
 Static power – Leakage and idle as a function of voltage and temperature
 Used for power management algorithms
 Architecturally exposed to software and system
 For the use of S/W or system embedded controller
[Figure: predicted vs. actual power for CPU, PG and package. Average accuracy 0.9%, STDEV 0.6%.]
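The metering scheme in the bullets above can be sketched roughly as follows. This is a minimal illustration, not the PCU firmware: the event names, the per-event energy costs and the leakage formula are all assumptions chosen only to show the structure (active energy from weighted event counts, plus static power as a function of voltage and temperature).

```python
import math

# Hypothetical per-event active energy costs in nanojoules (illustrative values only).
EVENT_COST_NJ = {"uops_retired": 1.0, "llc_access": 5.0, "ring_hop": 0.5}


def active_power_w(event_counts, interval_s, costs=EVENT_COST_NJ):
    """Charge each counted event its energy cost and average over the interval."""
    energy_j = sum(n * costs[name] * 1e-9 for name, n in event_counts.items())
    return energy_j / interval_s


def static_power_w(voltage_v, temp_c, p_ref_w=2.0, v_ref=1.0,
                   t_ref_c=50.0, t_scale_c=40.0):
    """Assumed leakage model: quadratic in voltage, exponential in temperature."""
    return p_ref_w * (voltage_v / v_ref) ** 2 * math.exp((temp_c - t_ref_c) / t_scale_c)


def metered_power_w(event_counts, interval_s, voltage_v, temp_c):
    """Total estimate = active (event based) + static (leakage and idle)."""
    return active_power_w(event_counts, interval_s) + static_power_w(voltage_v, temp_c)
```

A real meter would calibrate the costs per block (CPU, PG, ring, cache, I/O) against measurements, which is presumably how the accuracy quoted in the figure was obtained.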



What is CPU Turbo
• P-state: a voltage/frequency pair (ACPI terminology)
• P1 is guaranteed frequency
− CPU and PG simultaneous heavy load at worst case conditions
− Actual power has high dynamic range
• P0 is max possible frequency
• Pn is the energy efficient state
• OS controls the Pn-P1 range
• P1-P0 has significant frequency range (GHz)
• P1 to P0 range is fully H/W controlled
− User preferences and policies
− Single thread or lightly loaded applications
− GFX <> CPU balancing

[Figure: frequency ladder from LFM up through Pn and P1 to P0 (1C). Pn to P1 are the OS visible states; P1 to P0 is the H/W controlled “Turbo” range; thermal control acts below Pn.]



What is Turbo
 Turbo enabled product specifications
[Table columns: CPU P1, CPU P0, PG P1, PG P0, TDP (total package sustained power).]

Source: http://www.intel.com/Assets/PDF/datasheet/324692.pdf



New concept: thermal capacitance
Classic model: steady-state thermal resistance. Design guide for steady state.
New model: steady-state thermal resistance AND dynamic thermal capacitance. PG and CPU sharing.

Example:
Cp_Al ≈ 0.9 J/(g·K)
100 g heat sink heated by a 35 W CPU → 100 s
[Figure: temperature vs. time. Left: classic model response. Right: more realistic response to power changes.]
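The arithmetic behind the aluminum heat sink example can be sketched as below. It reads the slide's 100 s as a heating time and ignores heat flowing out of the heat sink (an adiabatic assumption), so the result is an upper bound on the temperature rise rather than a measured value.

```python
def heat_capacity_j_per_k(mass_g, cp_j_per_g_k):
    """Thermal capacitance of a lump of material: C = m * cp."""
    return mass_g * cp_j_per_g_k


def temperature_rise_k(power_w, seconds, capacity_j_per_k):
    """Adiabatic upper bound: all dissipated energy stays in the heat sink."""
    return power_w * seconds / capacity_j_per_k


capacity = heat_capacity_j_per_k(100.0, 0.9)      # 100 g of aluminum -> 90 J/K
rise = temperature_rise_k(35.0, 100.0, capacity)  # 3500 J / 90 J/K, roughly 39 K
```

The point of the example is the time scale: tens of seconds pass before a 35 W load moves the heat sink temperature substantially, and that interval is the headroom the new model exploits.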



New concept: thermal capacitance
• Managing of energy budget rolling average
• Heat sink capacity time constant – few sec.
• Short time constants for power delivery

$E_{n+1} = \alpha E_n + (1 - \alpha)\,(\mathrm{TDP}_n - P_n)\,\Delta t_n$


• Package energy sharing between CPU and PG
• Multiple sources of controls
• Software or external embedded controller

PCU manages energy budgets over multiple time constants


Accumulated energy during idle period used when needed
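The rolling-average energy budget can be sketched directly from the recurrence on this slide. The values of alpha, TDP and the time step below are illustrative assumptions; the slide does not give them, and the PCU keeps several such budgets with different time constants.

```python
def update_budget(e_n, alpha, tdp_w, power_w, dt_s):
    """One step of E[n+1] = alpha*E[n] + (1 - alpha)*(TDP - P[n])*dt."""
    return alpha * e_n + (1.0 - alpha) * (tdp_w - power_w) * dt_s


def run_budget(powers_w, alpha=0.9, tdp_w=45.0, dt_s=1.0, e0=0.0):
    """Apply the recurrence to a sequence of power samples; return the budget trace."""
    e = e0
    trace = []
    for p in powers_w:
        e = update_budget(e, alpha, tdp_w, p, dt_s)
        trace.append(e)
    return trace


idle = run_budget([10.0] * 200)                 # idling below TDP builds a positive budget
turbo = run_budget([56.0] * 200, e0=idle[-1])   # running above TDP spends it
```

With a constant power P the budget settles at (TDP − P)·Δt: positive while idling, negative when power stays above TDP, which is the condition under which a controller would have to pull frequency back.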



Intel® Turbo Boost Technology 2.0 - Dynamic
• After idle periods, the system accumulates “energy budget” and can accommodate high power/performance for up to a minute
• In steady state conditions the power stabilizes on TDP, possibly at higher than nominal frequency
• Use accumulated energy budget to enhance user experience
• Build up thermal budget during idle periods

[Figure: power vs. time. From sleep or low power, C0/P0 (Turbo) rises to a max power of 1.2-1.3X TDP and settles to “TDP” within 30-60 sec; exponential average over 5 sec / 30-60 sec.]



Usage Scenario: Responsive Behavior
• Interactive work benefits from Intel® Turbo Boost 2.0
• Idle periods intermixed with user actions
Photo editing
• Open image
• Process
• View
• Balance colors
• Red eye removal
• Contrast
• Filters
• Etc.

[Figure labels: user interactive actions; image open and process.]

Turbo controls in action
• Max Icc (hard limit, power delivery)
− Voltage Regulator reported capability
− CURRENT_CONFIG_CONTROL MSR: allows programmability
• TURBO_POWER_LIMIT control MSR
− Enables and locks
− Package power limit 2 – instantaneous
− Package power limit 1 time interval
− Package power limit 1 clamp bit
− Package power limit 1 – power
• Also:
− Individual power controls available
− Explicit frequency control
− Config TDP
− User / OEM / OS preference

[Figure: power vs. time in C0/P0. Actual instantaneous power is bounded by max Icc and power limit 2; its exponential average over the PL1 time is held at power limit 1.]
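The fields named above (limit, enable, clamp bit, time interval, lock) can be illustrated with a small decoder. This sketch assumes the register the slide calls TURBO_POWER_LIMIT follows the bit layout that the Intel Software Developer's Manual documents for MSR_PKG_POWER_LIMIT, with units taken from MSR_RAPL_POWER_UNIT; that correspondence is an assumption and should be checked against the manual for the part in question. The function operates on raw 64-bit values only and does not read any hardware.

```python
def decode_units(raw_units):
    """Assumed MSR_RAPL_POWER_UNIT layout: power unit bits 3:0, time unit bits 19:16."""
    power_unit_w = 1.0 / (1 << (raw_units & 0xF))
    time_unit_s = 1.0 / (1 << ((raw_units >> 16) & 0xF))
    return power_unit_w, time_unit_s


def decode_limit(field, power_unit_w, time_unit_s):
    """Decode one 32-bit limit field: power 14:0, enable 15, clamp 16, window 23:17."""
    y = (field >> 17) & 0x1F   # window exponent
    z = (field >> 22) & 0x3    # window fraction, in quarters
    return {
        "power_w": (field & 0x7FFF) * power_unit_w,
        "enabled": bool(field & (1 << 15)),
        "clamp": bool(field & (1 << 16)),
        "window_s": (1 << y) * (1.0 + z / 4.0) * time_unit_s,
    }


def decode_pkg_power_limit(raw, raw_units):
    """Split the 64-bit value into PL1 (low half), PL2 (high half) and the lock bit."""
    pu, tu = decode_units(raw_units)
    return {
        "pl1": decode_limit(raw & 0xFFFFFFFF, pu, tu),
        "pl2": decode_limit((raw >> 32) & 0x7FFFFFFF, pu, tu),
        "locked": bool((raw >> 63) & 1),
    }
```

On Linux, raw values of this kind are usually obtained through the msr driver or, more conveniently, through the powercap sysfs interface rather than decoded by hand.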



Intel® Turbo Boost Technology 2.0 - Package
 Power specification is defined for the entire package
 Monolithic die – power budget shared by CPU and PG
 Sum of component power at or below specifications

[Figure: core power [W] vs. PG power [W]. Applications lie between heavy CPU workloads and heavy graphics workloads, along the line of total package power, CPU + PG = const.]
 Energy budget split dynamically according to user preference
 Control algorithm translates energy headroom to turbo bins

[Figure: core power [W] vs. PG power [W] for one application under three settings of the IA preference MSR and GT preference MSR.
CPU preference: power budget is assigned to the CPU.
Balanced: power budget split between CPU and PG.
PG preference: power budget is assigned to the PG.]
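One way to picture the preference-driven split is the sketch below. The single `cpu_weight` knob is a hypothetical stand-in for the IA and GT preference MSRs, and the redistribution rule is an assumption; the slides do not describe the actual control algorithm or how it maps headroom to turbo bins.

```python
def split_budget(package_w, cpu_demand_w, pg_demand_w, cpu_weight):
    """Split a package budget between CPU and PG.

    cpu_weight is hypothetical: 1.0 = CPU preference, 0.0 = PG preference,
    0.5 = balanced. Budget the preferred side does not need goes to the other.
    """
    cpu_share = cpu_weight * package_w
    cpu = min(cpu_demand_w, cpu_share)
    pg = min(pg_demand_w, package_w - cpu)
    cpu = min(cpu_demand_w, package_w - pg)   # hand unused PG budget back to the CPU
    return cpu, pg


cpu_pref = split_budget(45.0, 40.0, 30.0, cpu_weight=1.0)
balanced = split_budget(45.0, 40.0, 30.0, cpu_weight=0.5)
pg_pref = split_budget(45.0, 40.0, 30.0, cpu_weight=0.0)
```

In every case the sum of the two grants stays at or below the package limit, which is the constraint the slide states; only the operating point along the CPU + PG = const line moves.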
Turbo in action – measurements
• Four core 45W 2.2 up to 3.5 GHz Sandy Bridge example
• Running CPU and PG simultaneous workloads
• Control power management knobs on the fly using a control utility

After an idle period, turbo to 56 W for ~20 sec, then stabilize at TDP = 45 W. Frequency varies.
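The measured behavior (56 W for roughly 20 s, then 45 W) is consistent with an exponentially averaged power limit. The sketch below estimates how long turbo can last under that model. The 5 W idle power and the 13 s time constant are assumptions picked to land near the measured duration; neither value comes from the slides.

```python
import math


def turbo_duration_s(p_turbo_w, p_idle_w, pl1_w, tau_s):
    """Time until an exponential average that starts at p_idle_w reaches pl1_w."""
    if p_turbo_w <= pl1_w:
        return math.inf  # the average can never exceed the limit
    return tau_s * math.log((p_turbo_w - p_idle_w) / (p_turbo_w - pl1_w))


def simulate_turbo_s(p_turbo_w, p_idle_w, pl1_w, tau_s, dt_s=0.001):
    """Step the same exponential average numerically as a cross-check.

    Only call this with p_turbo_w > pl1_w, otherwise the loop never ends.
    """
    avg, t = p_idle_w, 0.0
    gain = 1.0 - math.exp(-dt_s / tau_s)
    while avg < pl1_w:
        avg += gain * (p_turbo_w - avg)
        t += dt_s
    return t


duration = turbo_duration_s(56.0, 5.0, 45.0, tau_s=13.0)
```

A shorter idle period leaves the average closer to the limit when the load starts, so the same formula with a higher starting value gives a shorter burst.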



Energy Efficient P-State - optimizing MIPS / Watt
• Scaling frequency and voltage up is not energy efficient
− Cubic increase in power for linear increase in frequency and performance
− Used to get raw performance at the cost of increased energy consumption
• Not all workloads gain performance from frequency
− For example: many memory accesses → poor performance scalability
• “Wait slowly” → lower frequency at memory-bound intervals
− Save energy to be used for core-bound phases
− Or just save energy with minimal performance impact
• Continuously generate “scalability” metric
− Drop frequency if scalability is low
• User preference control
− Max performance – ignore energy cost
− Balanced – lower frequency at memory-bound intervals
− Max energy savings – limited turbo

[Figure: workload scalability over time; performance scaling axis from 0.86 to 1.]

Impacts active energy - Small impact on battery life
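The “wait slowly” argument can be made concrete with a toy model. Everything here is an assumption for illustration: run time is split into a part that scales with frequency and a memory-stall part that does not, power is taken as proportional to the cube of frequency, and the scalability metric is defined as the share of a frequency increase that shows up as speedup. The slides do not define the metric the PCU actually computes.

```python
def run_time_s(freq_ghz, compute_gcycles, memory_s):
    """Compute time shrinks with frequency; memory stall time does not."""
    return compute_gcycles / freq_ghz + memory_s


def energy_j(freq_ghz, compute_gcycles, memory_s, k_w_per_ghz3=1.0):
    """Assume power = k * f^3 for the whole run."""
    return k_w_per_ghz3 * freq_ghz ** 3 * run_time_s(freq_ghz, compute_gcycles, memory_s)


def scalability(f_lo, f_hi, compute_gcycles, memory_s):
    """Fraction of the relative frequency increase that becomes speedup (0..1)."""
    speedup = run_time_s(f_lo, compute_gcycles, memory_s) / run_time_s(f_hi, compute_gcycles, memory_s)
    return (speedup - 1.0) / (f_hi / f_lo - 1.0)


core_bound = scalability(2.0, 3.0, compute_gcycles=10.0, memory_s=0.0)
memory_bound = scalability(2.0, 3.0, compute_gcycles=2.0, memory_s=9.0)
```

For the memory-bound case, dropping from 3 GHz to 2 GHz costs about 3% in run time while cutting energy by roughly a factor of three under this model, which is the trade the Balanced policy is making.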



Average Power Management



Sandy Bridge average power control
Core level (thread level coordination through a HW coordinated per-thread interface):
• Core C0: active states P0 - P1 - Pn
• Core C1: only snoops supported
• Core C3: core caches flushed
• Core C6: Vcc-gated
Package level (HW coordinated; System Agent and Ring + LLC):
• Package C3
• Package C6: clock off + low VCC; retention voltage
• Package C7: LLC flushed, usage based close/open algorithms
• Pop-up: DDR self-refresh
• C3/6/7: DDR clock off, IO clock off, Display Engine in energy efficient screen refresh mode



Improved C-state Latency and energy efficiency
• “Interrupt storms” seen on real systems
− Performance impact: entry and exit latency
− Energy impact: transition power and energy overhead
• High transition rate: demote (auto-demotion from C6 to C1/3)
− 8-15% perf (MM07, Sysmark)
• Low transition rate: stop demotion (auto-un-demotion)
− Aggressive demotion can be enabled
− 45-200 mW power savings measured on Sysmark and media applications

[Figure: core level C6 entries and C-state breaks over time for cores 0-3 on BAPCo* MobileMark 07 Pro, showing C-state storm ranges and silence ranges.]
[Figure: core/package C-state (C0, C1/3, C6) vs. time.]
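The demotion idea can be sketched as a rate check over a sliding window. The window length, the break threshold and the choice of C1 as the demoted state are assumptions for illustration; the slides only say that a high transition rate triggers demotion and a low one stops it.

```python
from collections import deque


class AutoDemotion:
    """Demote deep C-state requests while wake events ("breaks") arrive in a storm."""

    def __init__(self, window_s=0.010, max_breaks=8):
        self.window_s = window_s      # assumed observation window
        self.max_breaks = max_breaks  # assumed storm threshold
        self.breaks = deque()

    def on_break(self, now_s):
        """Record one C-state break (interrupt or other wake event)."""
        self.breaks.append(now_s)

    def choose_state(self, requested, now_s):
        """Return the state to enter, given the OS request and recent break rate."""
        while self.breaks and now_s - self.breaks[0] > self.window_s:
            self.breaks.popleft()
        if requested == "C6" and len(self.breaks) > self.max_breaks:
            return "C1"               # storm: avoid C6 entry/exit latency and energy
        return requested              # quiet: un-demote and honor the request


policy = AutoDemotion()
for i in range(20):
    policy.on_break(i * 0.00025)      # 20 breaks within 5 ms
during_storm = policy.choose_state("C6", now_s=0.005)
after_silence = policy.choose_state("C6", now_s=1.0)
```

The trade-off is the one on the slide: demoting during storms recovers performance lost to transition latency, while un-demoting during silence recovers the power savings of the deep state.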



Thermal management



Package thermal management
 On die thermal sensors

 12 sensors on each CPU core + PG, ring and SA
 Operating range 50-100 °C
 Temperature reporting
 Maximum reading of each functional block and
maximum reading of the total chip
 Used for:
 Critical thermal protection
 Notification, throttle and shutdown
 Programmable throttle temperature
 Leakage calculation of power meter
 PCU optimization algorithms
 External system controls (e.g. Fan control)
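The reporting and protection steps above can be sketched as a max-reduction followed by threshold checks. The block names and all three thresholds are hypothetical; the slides state only that the throttle temperature is programmable and that notification, throttle and shutdown exist.

```python
def block_maxima(readings_c):
    """Maximum sensor reading per functional block (dict of block -> list of °C)."""
    return {block: max(values) for block, values in readings_c.items()}


def package_max(readings_c):
    """Maximum reading of the total chip."""
    return max(block_maxima(readings_c).values())


def thermal_action(temp_c, notify_c=90.0, throttle_c=98.0, shutdown_c=125.0):
    """Map a temperature to an action; thresholds here are assumed placeholders."""
    if temp_c >= shutdown_c:
        return "shutdown"
    if temp_c >= throttle_c:
        return "throttle"
    if temp_c >= notify_c:
        return "notify"
    return "ok"


readings = {
    "core0": [71.0, 84.5, 79.0],
    "core1": [68.0, 92.5, 80.0],
    "pg": [66.0, 70.5],
    "sa": [60.0],
}
hottest = package_max(readings)
action = thermal_action(hottest)
```

The same per-block maxima would also feed the leakage term of the power meter and the externally visible temperature used for fan control.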



System thermal management
 Digital DDR power meter for thermal prediction
 Count DDR read and write and calculate power
 Maximum bandwidth control to prevent critical heating
 Initiates double DDR refresh rate at high temperature
 Supports DDR thermal sensor
 For a more accurate DDR temperature reading
 Voltage Regulator thermal sensing
 Hot and critical conditions using in and out of band communication
 Digital package temperature reporting
 Used by external agent for system fan control



Power efficient memory controller
 DDR power management
 Aggressive DDR power savings policies, configurable by PCU
− Normal power down
− Pre-charge Power down
− PLL off
 Self Refresh
 Configurable policies for entering Self Refresh, based on package
power states, controlled by PCU
 IO clock controls – power down



Platform power management



Platform power management - SVID
 SVID – Serial Voltage ID
 New serial bus to control external Voltage Regulators
 Three wires serial bus – control multiple VRs
 Control VR voltage – continuous fine-grained optimization
 Optimize voltage for changing conditions
 Optimize VR power savings mode - minimize power losses
 Power States to optimize VR efficiency
 A function of current consumption and sleep states
 Read VR parameters for PCU algorithms use
 Load line resistance, max Icc and temperature



Platform power management – PECI
 PECI – A new platform control interface
 Connects the PCU to external embedded controller
 Report - PCU communicates out to the embedded controller:
 Individual component and max package temperature
 Individual and total package energy consumption
 Other power management status information
 Used for fan control and plat
 Control:
 Package power – instantaneous and sustained (PL1-PL2)
 Other power management settings and preferences
 Used by the Embedded Controller to manage total system power



Summary and conclusions
Sandy Bridge is built to maximize user experience under constraints
 Throughput performance – Turbo over long time window
 Responsiveness – Turbo dynamically for short duration
 User guided CPU / PG performance balancing
 Battery life / Energy bills – Tight control of active and idle power states
 A rich set of controls available to S/W, the operating system and the system embedded controller allows:
 User preferences where tradeoffs exist
 Small form factor platforms
 Improved ergonomics - lower acoustic noise and heat



Turbo roadmap evolution
Merom/Penryn (mobile only), dual core die
• Control: CPU core C-state
• Key new capabilities: 1-2 turbo bins when the other core is asleep
• Turbo behavior: single core turbo

Nehalem / Westmere: Clarksfield (mobile), Lynnfield/Clarkdale (desktop), quad core die
• Control: CPU core C-states; CPU power - platform iMon
• Key new capabilities: turbo controlled within power limit; multi-core turbo; more turbo if cores are asleep
• Turbo behavior: single, dual and quad core turbo

Nehalem / Westmere: Arrandale, dual core die
• Control: CPU core C-states; CPU power - platform iMon; PG power - platform iMon; package power
• Key new capabilities: PG dynamic frequency; driver controlled power sharing between CPU and PG (mobile)
• Turbo behavior: single core, dual core and graphics turbo

Sandy Bridge, dual and quad core die
• Control: CPU core C-states; CPU/PG/package power; built-in power monitoring; digital power meter; power budget management; platform control (EC / VR)
• Key new capabilities: HW controlled power sharing between CPU and PG; brief turbo above TDP (dynamic turbo); more platform control via PECI 3.0 and SVID

[Figure: turbo behavior per core (0-3) and GT for each generation. Illustrative only. Does not represent actual number of turbo bins.]
