Power and performance tuning
Energy efficiency is increasingly important in enterprise and data center environments, and it
adds another set of tradeoffs to the mix of configuration options.
Windows Server 2016 is optimized for excellent energy efficiency with minimum performance
impact across a wide range of customer workloads. Processor Power Management (PPM) Tuning
for the Windows Server Balanced Power Plan describes the workloads used for tuning the
default parameters in Windows Server 2016, and provides suggestions for customized tunings.
This section expands on energy-efficiency tradeoffs to help you make informed decisions if you
need to adjust the default power settings on your server. However, the majority of server
hardware and workloads should not require administrator power tuning when running Windows
Server 2016.
Your server's energy efficiency ratio is a useful metric that incorporates both power and
performance information. Energy efficiency is the ratio of the work that is done to the
average power that is required during a specified amount of time.
You can use this metric to set practical goals that respect the tradeoff between power and
performance. In contrast, a goal of 10 percent energy savings across the data center fails to
capture the corresponding effects on performance and vice versa.
Similarly, if you tune your server to increase performance by 5 percent, and that results in 10
percent higher energy consumption, the total result might or might not be acceptable for your
business goals. The energy efficiency metric allows for more informed decision making than
power or performance metrics alone.
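As a worked illustration of the metric, the following sketch computes energy efficiency as work
divided by average power, and then the relative change for the 5 percent and 10 percent case
described above. The function names and the sample figures (requests served, watts) are
illustrative assumptions and are not part of any Windows tooling.

```python
def energy_efficiency(work_done, average_power_watts):
    """Return work per watt over the measurement interval."""
    if average_power_watts <= 0:
        raise ValueError("average power must be positive")
    return work_done / average_power_watts


def relative_efficiency_change(perf_gain, power_gain):
    """Fractional change in efficiency for fractional gains in performance and power."""
    return (1.0 + perf_gain) / (1.0 + power_gain) - 1.0


# Hypothetical baseline: 1,000,000 requests served at an average of 400 W.
baseline = energy_efficiency(1_000_000, 400.0)  # 2500 requests per watt
# Hypothetical tuned server: 5 percent more work at 10 percent more power.
tuned = energy_efficiency(1_050_000, 440.0)
change = relative_efficiency_change(0.05, 0.10)  # about -0.045
print(f"baseline={baseline:.1f}, tuned={tuned:.1f}, change={change:.1%}")
```

In this hypothetical case, efficiency falls by roughly 4.5 percent even though raw performance
improves, which is the kind of tradeoff the metric is meant to expose.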
If your server has the necessary support, you can use the power metering and budgeting features
in Windows Server 2016 to view system-level energy consumption by using Performance
Monitor.
One way to determine whether your server has support for metering and budgeting is to review
the Windows Server Catalog. If your server model qualifies for the new Enhanced Power
Management qualification in the Windows Hardware Certification Program, it is guaranteed to
support the metering and budgeting functionality.
Another way to check for metering support is to manually look for the counters in Performance
Monitor. Open Performance Monitor, select Add Counters, and then locate the Power Meter
counter group.
If named instances of power meters appear in the box labeled Instances of Selected Object,
your platform supports metering. The Power counter that shows power in watts appears in the
selected counter group. The exact derivation of the power data value is not specified. For
example, it could be an instantaneous power draw or an average power draw over some time
interval.
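If the Power Meter counters are present, one way to collect readings outside the Performance
Monitor UI is the typeperf command-line tool. The sketch below builds such a command and
averages the readings from its CSV output. The counter path \Power Meter(*)\Power is assumed
from the counter group and counter names described above, and the sample capture, host name,
and instance name are hypothetical.

```python
import csv
import io


def typeperf_command(samples=10, interval_seconds=1):
    """Build a typeperf command line that samples every Power Meter instance."""
    return ["typeperf", r"\Power Meter(*)\Power",
            "-si", str(interval_seconds), "-sc", str(samples)]


def average_watts(csv_text, column=1):
    """Average the readings in one column of typeperf CSV text.

    Column 0 is the timestamp; column 1 is the first counter instance.
    """
    readings = []
    rows = csv.reader(io.StringIO(csv_text))
    next(rows, None)  # skip the header row of counter paths
    for row in rows:
        if len(row) <= column:
            continue  # status lines or blank lines in the captured output
        try:
            readings.append(float(row[column]))
        except ValueError:
            continue  # a missed sample is reported as a non-numeric cell
    if not readings:
        raise ValueError("no power readings found")
    return sum(readings) / len(readings)


# Hypothetical capture from a single power meter instance.
sample = (
    '"(PDH-CSV 4.0)","\\\\SERVER01\\Power Meter(Power Meter (0))\\Power"\n'
    '"01/15/2024 10:00:00.000","212.000000"\n'
    '"01/15/2024 10:00:01.000","218.000000"\n'
    '"01/15/2024 10:00:02.000","215.000000"\n'
)
print(average_watts(sample))
```

Because the derivation of the counter value is not specified, treat an average computed this
way as an approximation and use the same method for every configuration you compare.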
If your server platform does not support metering, you can use a physical metering device
connected to the power supply input to measure system power draw or energy consumption.
To establish a baseline, you should measure the average power required at various system load
points, from idle to 100 percent (maximum throughput) to generate a load line. The following
figure shows load lines for three sample configurations:
You can use load lines to evaluate and compare the performance and energy consumption of
configurations at all load points. In this particular example, it is easy to see what the best
configuration is. However, there can easily be scenarios where one configuration works best for
heavy workloads and one works best for light workloads.
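The comparison can also be done numerically. The following sketch, which uses made-up
measurements for two hypothetical configurations, computes throughput per watt at each load
point and reports which configuration is most efficient there. The data layout and function
names are assumptions chosen for illustration.

```python
def efficiency_by_load(load_line):
    """Map each load point to throughput per watt.

    load_line: dict of load percent -> (throughput, average watts).
    """
    return {load: work / watts for load, (work, watts) in load_line.items()}


def best_config_per_load(load_lines):
    """For each load point, name the configuration with the highest efficiency."""
    best = {}
    for name, line in load_lines.items():
        for load, eff in efficiency_by_load(line).items():
            if load not in best or eff > best[load][1]:
                best[load] = (name, eff)
    return {load: name for load, (name, _) in sorted(best.items())}


# Hypothetical measurements: load percent -> (operations per second, watts).
load_lines = {
    "config_a": {25: (250, 100), 50: (500, 160), 100: (1000, 320)},
    "config_b": {25: (250, 125), 50: (500, 170), 100: (1000, 250)},
}
print(best_config_per_load(load_lines))
```

With these sample numbers, config_a is more efficient at 25 and 50 percent load and config_b is
more efficient at 100 percent load, which is the mixed outcome described above.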
Important
To ensure an accurate analysis, make sure that all local apps are closed before you run
PowerCfg.exe.
Shortened timer tick rates, drivers that lack power management support, and excessive CPU
utilization are a few of the behavioral issues that are detected by the powercfg /energy
command. This tool provides a simple way to identify and fix power management issues,
potentially resulting in significant cost savings in a large datacenter.
For more info about PowerCfg.exe, see Using PowerCfg to Evaluate System Energy Efficiency.
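As a minimal sketch of how the energy report might be scripted, the helper below builds a
powercfg /energy command line with an explicit report path and trace duration. The helper name,
the report path, and the 120-second duration are illustrative assumptions.

```python
def energy_report_command(output_path="energy-report.html", duration_seconds=60):
    """Build the argument list for a powercfg energy-efficiency trace."""
    if duration_seconds <= 0:
        raise ValueError("duration must be positive")
    return ["powercfg", "/energy",
            "/output", output_path,
            "/duration", str(duration_seconds)]


# Hypothetical report location and a two-minute observation window.
cmd = energy_report_command(r"C:\Temp\energy-report.html", 120)
print(" ".join(cmd))
```

The resulting command is intended to be run from an elevated prompt on the server under test,
for example by passing the list to subprocess.run.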
| Plan | Description | Common applicable scenarios | Implementation highlights |
|---|---|---|---|
| Balanced (recommended) | Default setting. Targets good energy efficiency with minimal performance impact. | General computing | Matches capacity to demand. Energy-saving features balance power and performance. |
| High Performance | Increases performance at the cost of high energy consumption. Power and thermal limitations, operating expenses, and reliability considerations apply. | Low latency apps and app code that is sensitive to processor performance changes | Processors are always locked at the highest performance state (including “turbo” frequencies). All cores are unparked. Thermal output may be significant. |
| Power Saver | Limits performance to save energy and reduce operating cost. Not recommended without thorough testing to make sure performance is adequate. | Deployments with limited power budgets and thermal constraints | Caps processor frequency at a percentage of maximum (if supported), and enables other energy-saving features. |
These power plans exist in Windows for alternating current (AC) and direct current (DC)
powered systems, but we will assume that servers are always using an AC power source.
For more info on power plans and power policy configurations, see Power Policy Configuration
and Deployment in Windows.
Note
Some server manufacturers have their own power management options available through the
BIOS settings. If the operating system does not have control over the power management,
changing the power plans in Windows will not affect system power and performance.
The following sections describe ways to tune some specific processor power management
parameters to meet goals not addressed by the three built-in plans. If you need to understand a
wider array of power parameters, see Power Policy Configuration and Deployment in Windows.
Note
The EPB register is only supported in Intel Westmere and later processors.
For Intel Nehalem and AMD processors, Turbo is disabled by default on P-state-based platforms.
However, if a system supports Collaborative Processor Performance Control (CPPC), which is a
new alternative mode of performance communication between the operating system and the
hardware (defined in ACPI 5.0), Turbo may be engaged if the Windows operating system
dynamically requests the hardware to deliver the highest possible performance levels.
To enable or disable the Turbo Boost feature, the Processor Performance Boost Mode parameter
must be configured by the administrator or by the default parameter settings for the chosen
power plan. Processor Performance Boost Mode has five allowable values, as shown in Table 5.
For P-state-based control, the choices are Disabled, Enabled (Turbo is available to the hardware
whenever nominal performance is requested), and Efficient (Turbo is available only if the EPB
register is implemented).
For CPPC-based control, the choices are Disabled, Efficient Enabled (Windows specifies the
exact amount of Turbo to provide), and Aggressive (Windows asks for “maximum performance”
to enable Turbo).
The following commands enable Processor Performance Boost Mode on the current power plan
(specify the policy by using a GUID alias):
syntax
Powercfg -setacvalueindex scheme_current sub_processor PERFBOOSTMODE 1
Powercfg -setactive scheme_current
Important
You must run the powercfg -setactive command to enable the new settings. You do not need to
reboot the server.
To set this value for power plans other than the currently selected plan, you can use aliases such
as SCHEME_MAX (Power Saver), SCHEME_MIN (High Performance), and
SCHEME_BALANCED (Balanced) in place of SCHEME_CURRENT. Replace scheme_current
in the powercfg -setactive commands previously shown with the desired alias to enable
that power plan.
For example, to adjust the Boost Mode in the Power Saver plan and make Power Saver the
current plan, run the following commands:
syntax
Powercfg -setacvalueindex scheme_max sub_processor PERFBOOSTMODE 1
Powercfg -setactive scheme_max
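When the same change has to be applied to several plans, the two-command pattern can be
generated rather than typed by hand. The following sketch maps friendly plan names to the
scheme aliases listed above and returns the command pair. The helper name and the dictionary
keys are illustrative assumptions.

```python
# Friendly names (hypothetical) mapped to the powercfg scheme aliases.
SCHEME_ALIASES = {
    "current": "scheme_current",
    "balanced": "scheme_balanced",
    "high_performance": "scheme_min",
    "power_saver": "scheme_max",
}


def set_processor_setting(plan, setting_alias, value):
    """Return the two powercfg commands that set an AC value and activate the plan."""
    scheme = SCHEME_ALIASES[plan]
    return [
        f"powercfg -setacvalueindex {scheme} sub_processor {setting_alias} {value}",
        f"powercfg -setactive {scheme}",
    ]


for command in set_processor_setting("power_saver", "PERFBOOSTMODE", 1):
    print(command)
```

The second command is always emitted because the new value does not take effect until the plan
is activated.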
The values for the Minimum Processor Performance State and Maximum Processor
Performance State parameters are expressed as a percentage of maximum processor frequency,
with a value in the range 0 – 100.
If your server requires ultra-low latency, invariant CPU frequency (for example, for repeatable
testing), or the highest performance levels, you might not want the processors switching to
lower-performance states. For such a server, you can set the minimum processor performance
state to 100 percent by using the following commands:
syntax
Powercfg -setacvalueindex scheme_current sub_processor PROCTHROTTLEMIN 100
Powercfg -setactive scheme_current
If your server requires lower energy consumption, you might want to cap the processor
performance state at a percentage of maximum. For example, you can restrict the processor to 75
percent of its maximum frequency by using the following commands:
syntax
Powercfg -setacvalueindex scheme_current sub_processor PROCTHROTTLEMAX 75
Powercfg -setactive scheme_current
Note
Capping processor performance at a percentage of maximum requires processor support. Check
the processor documentation to determine whether such support exists, or view the Performance
Monitor counter % of maximum frequency in the Processor group to see if any frequency caps
were applied.
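To make the percentage concrete, the sketch below checks that a value falls in the allowed
range and converts a cap into an approximate frequency. The 3000 MHz maximum is a hypothetical
figure, and the frequency actually delivered depends on the performance states the processor
supports.

```python
def check_percent(value):
    """Raise if a performance-state percentage is outside 0 to 100."""
    if not 0 <= value <= 100:
        raise ValueError(f"percentage out of range: {value}")
    return value


def capped_frequency_mhz(max_frequency_mhz, max_percent):
    """Approximate frequency ceiling for a given maximum performance state."""
    return max_frequency_mhz * check_percent(max_percent) / 100


# Hypothetical processor with a 3000 MHz maximum frequency, capped at 75 percent.
print(capped_frequency_mhz(3000, 75))  # 2250.0
```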
Processor Performance Increase Threshold defines the utilization value above which a
processor’s performance state will increase. Larger values slow the rate of increase for
the performance state in response to increased activities.
Processor Performance Decrease Threshold defines the utilization value below which
a processor’s performance state will decrease. Larger values increase the rate of decrease
for the performance state during idle periods.
Processor Performance Increase Policy and Processor Performance Decrease Policy
determine which performance state should be set when a change happens. The “Single” policy
moves to the next adjacent state. “Rocket” jumps directly to the maximum or minimum
performance state. “Ideal” tries to find a balance between power and performance.
For example, if your server requires ultra-low latency but you still want to benefit from low
power during idle periods, you could quicken the performance state increase for any increase in
load and slow the decrease when load goes down. The following commands set the increase
policy to “Rocket” for a faster state increase, and set the decrease policy to “Single”. The
increase and decrease thresholds are set to 10 and 8, respectively.
syntax
Powercfg.exe -setacvalueindex scheme_current sub_processor PERFINCPOL 2
Powercfg.exe -setacvalueindex scheme_current sub_processor PERFDECPOL 1
Powercfg.exe -setacvalueindex scheme_current sub_processor PERFINCTHRESHOLD 10
Powercfg.exe -setacvalueindex scheme_current sub_processor PERFDECTHRESHOLD 8
Powercfg.exe /setactive scheme_current
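To build intuition for how the thresholds and policies interact, the following toy model steps
through a series of utilization samples. It is a deliberately simplified sketch and not the
algorithm Windows uses: it models only the “Single” and “Rocket” policies, assumes five
performance states, and uses a made-up utilization trace.

```python
def next_state(index, utilization, state_count,
               inc_threshold=10, dec_threshold=8,
               inc_policy="rocket", dec_policy="single"):
    """Return the next performance-state index in a toy model.

    Index 0 is the lowest-performance state; state_count - 1 is the highest.
    """
    top = state_count - 1
    if utilization > inc_threshold and index < top:
        return top if inc_policy == "rocket" else index + 1
    if utilization < dec_threshold and index > 0:
        return 0 if dec_policy == "rocket" else index - 1
    return index


def simulate(utilizations, state_count=5, start=0, **settings):
    """Apply next_state to each utilization sample and collect the state indices."""
    states, index = [], start
    for utilization in utilizations:
        index = next_state(index, utilization, state_count, **settings)
        states.append(index)
    return states


# Hypothetical utilization samples (percent): a burst followed by idle time.
trace = [2, 40, 60, 5, 3, 9, 1]
print(simulate(trace))                       # [0, 4, 4, 3, 2, 2, 1]
print(simulate(trace, inc_policy="single"))  # [0, 1, 2, 1, 0, 0, 0]
```

In the first run the state jumps to the top as soon as utilization exceeds 10 and then steps
down one state at a time, which mirrors the intent of the Rocket and Single combination shown
above.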
Cores that are parked generally do not have any threads scheduled, and they will drop into very
low power states when they are not processing interrupts, DPCs, or other strictly affinitized
work. The remaining cores are responsible for the remainder of the workload. Core parking can
potentially increase energy efficiency during lower usage.
For most servers, the default core-parking behavior provides a reasonable balance of throughput
and energy efficiency. On processors where core parking may not show as much benefit on
generic workloads, it can be disabled by default.
If your server has specific core parking requirements, you can control the number of cores that
are available to park by using the Processor Performance Core Parking Maximum Cores
parameter or the Processor Performance Core Parking Minimum Cores parameter in
Windows Server 2016.
Core parking isn’t always optimal when one or more active threads are affinitized to a
non-trivial subset of CPUs in a NUMA node (that is, more than 1 CPU, but less than the entire
set of CPUs on the node). When the core parking algorithm picks cores to unpark in response
to an increase in workload intensity, it may not always pick the cores within the active
affinitized subset (or subsets), and thus may end up unparking cores that won’t actually be
utilized.
The values for these parameters are percentages in the range 0 – 100. The Processor
Performance Core Parking Maximum Cores parameter controls the maximum percentage of
cores that can be unparked (available to run threads) at any time, while the Processor
Performance Core Parking Minimum Cores parameter controls the minimum percentage of
cores that can be unparked. To turn off core parking, set the Processor Performance Core
Parking Minimum Cores parameter to 100 percent by using the following commands:
syntax
Powercfg -setacvalueindex scheme_current sub_processor CPMINCORES 100
Powercfg -setactive scheme_current
To reduce the number of schedulable cores to 50 percent of the maximum count, set the
Processor Performance Core Parking Maximum Cores parameter to 50 as follows:
syntax
Powercfg -setacvalueindex scheme_current sub_processor CPMAXCORES 50
Powercfg -setactive scheme_current
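As a rough illustration of what these percentages mean on a given machine, the sketch below
converts them into core counts. The rounding rule (round up, and always leave at least one core
unparked) is an assumption made for the example, and the 16-core server is hypothetical.

```python
import math


def unparked_core_range(total_cores, min_percent=0, max_percent=100):
    """Return (min_cores, max_cores) that may be unparked under the two settings."""
    for value in (min_percent, max_percent):
        if not 0 <= value <= 100:
            raise ValueError(f"percentage out of range: {value}")
    # Assumption: fractions round up and at least one core is always unparked.
    min_cores = max(1, math.ceil(total_cores * min_percent / 100))
    max_cores = max(min_cores, math.ceil(total_cores * max_percent / 100))
    return min_cores, max_cores


# Hypothetical 16-core server.
print(unparked_core_range(16, min_percent=100))  # (16, 16): core parking is off
print(unparked_core_range(16, max_percent=50))   # (1, 8): at most half the cores run
```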
Utility Distribution is enabled by default for the Balanced power plan for some processors. It can
reduce processor power consumption by lowering the requested CPU frequencies of workloads
that are in a reasonably steady state. However, Utility Distribution is not necessarily a good
algorithmic choice for workloads that are subject to high activity bursts or for programs where
the workload quickly and randomly shifts across processors.
For such workloads, we recommend disabling Utility Distribution by using the following
commands:
syntax
Powercfg -setacvalueindex scheme_current sub_processor DISTRIBUTEUTIL 0
Powercfg -setactive scheme_current