Understanding Processor Utilization with IBM PowerVM

Dennis Massanari / IBM


Balaji Pagadala / Oracle

June 12, 2012

Dennis Massanari June 12, 2012


Index
1. Introduction
2. Reference Documentation
3. Known AIX and Firmware Issues
4. What Information to Collect
4.1 Collecting Hardware and LPAR Configuration Information
4.1.1 prtconf command:
4.1.2 lparstat commands:
4.2 Collecting Performance Information
4.2.1 Sample Processor Utilization Data
4.2.3 Differences in CPU Consumption by Hypervisor at Low and High Utilization
5. Collecting Detailed Performance Data
5.1 curt Performance Report
5.2 tprof Performance Report


1. Introduction
A number of customers running Oracle RAC on IBM AIX have opened SRs after seeing higher
than expected total system utilization when running the Oracle 11.2 Clusterware and database
stack on an otherwise idle AIX system, for example when the database instances are ONLINE but
not being used (referred to here as an “idle RAC cluster”). Because virtualized environments
are frequently used on modern processors, it is important to understand how to properly assess
utilization in these environments.

There are probably several factors leading to the number of related SRs being opened:

1. There appears to be higher utilization when running Oracle Clusterware on AIX than
on other platforms. Oracle and IBM are working together to better understand and, if
possible, correct this problem. This work is still in progress.
2. There is a known bug in the Oracle 11.2 Clusterware release where higher CPU
utilization is seen even when the database instances are idle. Oracle patch
13498267 needs to be applied for this problem.
3. There is a known firmware and AIX problem resulting in additional CPU utilization in
some cases when using shared LPARs compared to dedicated LPARs. When the proper
code levels are installed, there is only a very slight performance difference between
dedicated and shared LPARs when looking at the overhead of an idle RAC cluster.
4. In some cases customers may not be correctly assessing the system utilization on their
virtualized systems.

The purpose of this paper is to address the third and fourth items in this list, by documenting the
recommended code levels and explaining how to properly evaluate the utilization on a
virtualized AIX system.

2. Reference Documentation
The following paper provides very detailed information useful in understanding processor
utilization in a virtualized environment.

Understanding Processor Utilization on POWER Systems – AIX
http://www.ibm.com/developerworks/wikis/display/WikiPtype/Understanding+Processor+Utilization+on+POWER+Systems+-+AIX

The following paper provides detailed information on monitoring processor performance when
IBM's Energy Saving features, which modify the CPU frequency, are in use. It is not clear to
what extent customers use these features, but they are important to keep in mind when
evaluating a customer's processor utilization: if they are in use and the correct
performance tools are not used, the reported processor utilization can be skewed.

CPU frequency monitoring using lparstat
http://www.ibm.com/developerworks/wikis/display/WikiPtype/CPU+frequency+monitoring+using+lparstat


3. Known AIX and Firmware Issues


A problem was identified and fixed that caused excessive CPU usage in an LPAR in
shared processor mode. The following list indicates the APAR required to fix this problem on the
specified Technology Levels (TL) and Service Packs (SP).

APARs for WAITPROC IDLE LOOPING CONSUMES CPU:


IV01111 AIX 6.1 TL05 if before SP08 (fixed in SP08)
IV06197 AIX 6.1 TL06 if before SP07 (fixed in SP07)
IV10172 AIX 6.1 TL07 if before SP02 (fixed in SP02)
IV09133 AIX 7.1 TL00 if before SP05 (fixed in SP05)
IV10484 AIX 7.1 TL01 if before SP02 (fixed in SP02)
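
As a rough illustration of how the list above can be applied, the following sketch maps a level string in the format printed by `oslevel -s` (release-TL-SP-build) to the APAR that applies. The function name `apar_for_level` and the sample level strings are assumptions made for this example; they are not part of AIX or of the original paper, and levels outside the list simply report "not in list".

```shell
#!/bin/sh
# apar_for_level is a hypothetical helper, not an AIX command.
# Input: a level string in `oslevel -s` format, e.g. 6100-06-05-1115.
# Output: the APAR from the list above, "fixed", or "not in list".
apar_for_level() {
    # Split release-TL-SP-build on "-" into its first three fields.
    rel=$(echo "$1" | cut -d- -f1)
    tl=$(echo "$1" | cut -d- -f2)
    sp=$(echo "$1" | cut -d- -f3)
    case "$rel-$tl" in
        6100-05) apar=IV01111; fixed=8 ;;
        6100-06) apar=IV06197; fixed=7 ;;
        6100-07) apar=IV10172; fixed=2 ;;
        7100-00) apar=IV09133; fixed=5 ;;
        7100-01) apar=IV10484; fixed=2 ;;
        *) echo "not in list"; return 0 ;;
    esac
    # Service pack fields carry leading zeros; strip them before comparing.
    sp_num=$(echo "$sp" | sed 's/^0*//')
    if [ "${sp_num:-0}" -lt "$fixed" ]; then
        echo "$apar"
    else
        echo "fixed"
    fi
}

# Illustrative inputs only; the build-date fields are made up.
apar_for_level "6100-06-05-1115"   # prints IV06197
apar_for_level "7100-01-02-1150"   # prints fixed
```

On a real system the argument would typically come from `oslevel -s`; a "not in list" result only means the level is not covered by the list in this paper.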

A hypervisor problem was identified and fixed that caused the hypervisor to delay dispatching a
partition even though it was ready to run. This added latency that adversely affected
performance. This problem can affect POWER7 systems running any level of Ax720 firmware
prior to Ax720_101. However, updating to the latest available firmware is recommended.

If required, AIX and firmware fixes can be obtained from IBM Support Fix Central:
http://www-933.ibm.com/support/fixcentral/main/System+p/AIX

4. What Information to Collect


When a customer opens an SR related to higher than expected system utilization on AIX, the
following information should be included in what is requested from the customer.

4.1 Collecting Hardware and LPAR Configuration Information


Output from the following commands should be collected to understand the system hardware and
firmware levels and the LPAR configuration. Sample output, and descriptions of what to look for
are listed below.
prtconf command - Displays system configuration information.
lparstat command - Reports logical partition (LPAR) related information and statistics.

4.1.1 prtconf command:


The prtconf command displays system configuration information. The information of interest
here is in the top of the prtconf report so only the first 30 lines are displayed below.

# prtconf|head -30
System Model: IBM,9117-MMB
Machine Serial Number: 1017D8P
Processor Type: PowerPC_POWER7
Processor Implementation Mode: POWER 7
Processor Version: PV_7_Compat
Number Of Processors: 2
Processor Clock Speed: 3108 MHz
CPU Type: 64-bit
Kernel Type: 64-bit
LPAR Info: 5 prd1117_vclient2_el9-89-163
Memory Size: 32768 MB
Good Memory Size: 32768 MB
Platform Firmware level: AM730_066
Firmware Version: IBM,AM730_066
Console Login: enable
Auto Restart: true
Full Core: false

Network Information
Host Name: rac82
IP Address: 9.47.89.163
Sub Netmask: 255.255.255.0
Gateway: 9.47.89.1
Name Server:
Domain Name:

Paging Space Information


Total Paging Space: 16384MB
Percent Used: 2%

prtconf command sample output

Of primary interest are the System Model, Processor Type, Processor Clock Speed, and Platform
Firmware level.

4.1.2 lparstat commands:


The lparstat command reports logical partition (LPAR) related information and statistics. Here
the command is being used with the “-i” option to collect configuration information. Later it will
be used with other options to collect performance information.

# lparstat -i
Node Name : rac82
Partition Name : prd1117_vclient2_el9-89-163
Partition Number : 5
Type : Shared-SMT-4
Mode : Capped
Entitled Capacity : 2.00
Partition Group-ID : 32773
Shared Pool ID : 0
Online Virtual CPUs : 2
Maximum Virtual CPUs : 4
Minimum Virtual CPUs : 2
Online Memory : 32768 MB
Maximum Memory : 32768 MB
Minimum Memory : 32768 MB
Variable Capacity Weight : 0
Minimum Capacity : 2.00
Maximum Capacity : 4.00
Capacity Increment : 0.01
Maximum Physical CPUs in system : 64
Active Physical CPUs in system : 32
Active CPUs in Pool : 32
Shared Physical CPUs in system : 32
Maximum Capacity of Pool : 3200
Entitled Capacity of Pool : 1700
Unallocated Capacity : 0.00
Physical CPU Percentage : 100.00%
Unallocated Weight : 0
Memory Mode : Dedicated
Total I/O Memory Entitlement : -
Variable Memory Capacity Weight : -
Memory Pool ID : -
Physical Memory in the Pool : -
Hypervisor Page Size : -
Unallocated Variable Memory Capacity Weight: -
Unallocated I/O Memory entitlement : -
Memory Group ID of LPAR : -
Desired Virtual CPUs : 2
Desired Memory : 32768 MB
Desired Variable Capacity Weight : 0
Desired Capacity : 2.00
Target Memory Expansion Factor : -
Target Memory Expansion Size : -
Power Saving Mode : Disabled
lparstat -i sample output

The key values from this output related to processor utilization are described below. Some
settings of these values can impact the processor utilization. The following section, which
describes what performance commands to run, shows how to assess the impact of some of these settings.

Type:
Indicates whether the LPAR is using dedicated or shared CPU resource and if SMT is
turned ON. The Type is displayed in the format [Shared | Dedicated] [ -SMT ] [ -# ]

The following list explains the different Type formats:

• Shared - Indicates that the LPAR is running in the Shared processor mode. In
shared mode virtual processors from other LPARs may be time-sliced on the same
physical processor.
• Dedicated - Indicates that the LPAR is running in the dedicated processor mode.
In dedicated mode the LPAR is not time-sliced with virtual processors from other
LPARs on the physical processor. There is a special optional mode of dedicated,
called donating, in which if the LPAR is idle other virtual processors can
“borrow” the physical processor, but once the LPAR is no longer idle it resumes
on the physical processor without time-slicing.
• SMT[-#] - Indicates that the LPAR has SMT mode turned ON and the number of
SMT threads. If only “SMT” is shown the number of threads per processor is 2. If
the number of threads is greater than 2, then the number of threads is also
displayed. Example: SMT or SMT-2 = SMT with 2 threads; SMT-4 = SMT with
4 threads.

Mode:
Indicates whether the LPAR processor capacity is capped, or uncapped, which allows it to
consume idle cycles from the shared pool. A dedicated LPAR is either capped or donating.

Entitled Capacity
The number of processing units this LPAR is entitled (guaranteed) to receive. If the
LPAR mode is capped, then the LPAR cannot consume more than the entitled capacity.
If the LPAR is uncapped, then if needed it can consume additional resources (up to the
number of virtual processors, i.e., each virtual processor can consume up to 1 full physical
processor) as long as there are available resources on the system not required to meet the
Entitled Capacity of other LPARs.

Online Virtual CPUs


Number of CPUs (virtual engines) currently online.

Target Memory Expansion Factor


The target memory expansion factor configured for the LPAR. If the target memory
expansion factor is not displayed, or if “-” is displayed, then Active Memory Expansion is
not enabled.

Active Memory Expansion performs memory compression and expansion using CPU cycles;
therefore, when it is enabled, additional CPU will be consumed. If the system is memory bound
but has spare CPU cycles, then using Active Memory Expansion can increase the system throughput.

Power Saving Mode

The power saving mode for the LPAR. When power saving mode is enabled, the
system may operate at a lower frequency during periods when the processor utilization is
low. When operating at a lower frequency, the processor consumes less power.

4.2 Collecting Performance Information


To determine the processor utilization, collecting output from the following commands is
recommended. Depending on the LPAR configuration, some can later be ignored if not applicable
to the customer configuration.
1. vmstat -t -w <interval> <# samples>
2. lparstat -h <interval> <# samples>
3. lparstat -E <interval> <# samples>
4. sar -P ALL <interval> <# samples>
5. mpstat -s <interval> <# samples>
6. If using Active Memory Expansion (AME): lparstat -c <interval> <# samples>


Sample output from these recommended commands is listed below, along with explanations of
the output. In some cases multiple commands for displaying similar information are included.

Analysis of output from these commands will allow you to:

1. determine the portion of a physical processor being consumed
2. determine if additional overhead is being reported due to low utilization effects
3. determine if the machine is running in power saving mode and, if so, what the normalized
CPU utilization is
4. determine if additional CPU is being consumed for AME

Note:
On AIX, whenever per CPU utilization is shown it is calculated as a percentage of the
physical processor consumption of each individual physical processor. Because of this,
when looking at the per CPU utilization the sum of the utilization percentages (USR,
SYS, WAIT and IDLE) will always be 100%, even if the physical processor utilization
on that processor is much lower than 100%. If this isn’t understood it can be misleading,
and lead to a conclusion that the system utilization is higher than it actually is.

In contrast to the per CPU utilization, the System Level (ALL) CPU utilization
percentages are relative to Entitled Capacity and, in Uncapped mode, relative to the
Physical Capacity being consumed (Physc) once it is greater than the Entitled Capacity.
This is more intuitive than the way the per CPU utilization is displayed.

Examples of this are included below where applicable.

4.2.1 Sample Processor Utilization Data


The recommended commands were run concurrently to collect the following set of sample data,
and the results are explained. The one exception is `lparstat -c`, for which output from another
system with AME enabled was used.

For the examples only one sample was collected to keep the output short; normally it would be
desirable to collect multiple samples over a longer time.

# vmstat 10 1
System configuration: lcpu=8 mem=32768MB ent=2.00

kthr    memory              page              faults        cpu
----- --------------- ----------------- ------------- ---------------------
 r  b     avm     fre  re pi po fr sr cy  in    sy   cs us sy id wa   pc   ec
 5  0 1978175 6035436   0  0  0  0  0  0  91 23654 6163  7  7 86  0 0.44 21.8
Relevant columns:
pc
Number of physical processors consumed. Displayed only if the partition is running with
shared processors.
ec
The percentage of entitled capacity consumed. Displayed only if the partition is running
with shared processors. Because the time base over which this data is computed can vary,
the entitled capacity percentage can sometimes exceed 100%. This excess is noticeable
only with small sampling intervals.

vmstat command sample output

# sar -P ALL 10 1

AIX rac82 1 7 00F617D84C00 04/11/12

System configuration: lcpu=8 ent=2.00 mode=Capped

18:45:43 cpu    %usr    %sys    %wio   %idle   physc   %entc
18:45:53   0      50      47       0       4    0.25    12.4
           1      27      22       0      50    0.09     4.3
           2       0       1       0      99    0.05     2.4
           3       0       0       0     100    0.05     2.4
           4       0      11       0      89    0.00     0.0
           5       0      10       0      90    0.00     0.0
           6       0      10       0      90    0.00     0.0
           7       0      59       0      41    0.00     0.0
           U       -       -       0      78    1.57    78.3
           -       7       7       0      86    0.43    21.7
Relevant columns:
physc
Reports the number of physical processors consumed. This data will be reported if the
partition is dedicated and enabled for donation, or is running with shared processors or
simultaneous multithreading enabled.
%entc
Reports the percentage of entitled capacity consumed. This will be reported only if the
partition is running with shared processors. Because the time base over which this data is
computed can vary, the entitled capacity percentage can sometimes exceed 100%. This
excess is noticeable only with small sampling intervals.
Note:
The last line indicates system-wide statistics for all processors, and the line with cpuid U
indicates the system-wide Unused capacity
sar command sample output
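
The difference between per CPU percentages and actual physical consumption can be checked with a little arithmetic on the sar sample. The sketch below weights each logical CPU's busy percentage (%usr + %sys) by the physc value on the same row; `busy_physc` is a hypothetical helper name used only for this illustration.

```shell
#!/bin/sh
# busy_physc is a hypothetical helper, not an AIX command.
# It reads rows of the form "cpu %usr %sys %wio %idle physc %entc" and
# weights each row's busy percentage by the physical processor share
# that the logical CPU actually consumed.
busy_physc() {
    awk '{ total += ($2 + $3) / 100 * $6 } END { printf "%.2f\n", total }'
}

# Per-CPU rows copied from the sar sample output.
busy_physc <<'EOF'
0 50 47 0 4 0.25 12.4
1 27 22 0 50 0.09 4.3
2 0 1 0 99 0.05 2.4
3 0 0 0 100 0.05 2.4
4 0 11 0 89 0.00 0.0
5 0 10 0 90 0.00 0.0
6 0 10 0 90 0.00 0.0
7 0 59 0 41 0.00 0.0
EOF
# prints 0.29
```

The result, about 0.29 of a physical processor, is close to the system-wide line (7% user plus 7% system of an entitlement of 2.00, or 0.28). Note that cpu7 shows 59% system time but contributes nothing, because its physc is 0.00.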

# lparstat -h 10 1

System configuration: type=Shared mode=Capped smt=4 lcpu=8 mem=32768MB psize=32 ent=2.00

%user  %sys  %wait  %idle physc %entc  lbusy  vcsw phint  %hypv hcalls
----- ----- ------ ------ ----- ----- ------ ----- ----- ------ ------
  7.4   6.8    0.0   85.8  0.43  21.7    8.9  2144     2   88.0  14188
Relevant columns:
lbusy
Indicates the percentage of logical processor(s) utilization that occurred while executing
at the user and system level.
vcsw
Indicates the number of virtual context switches that are virtual-processor hardware
preemptions.
phint
Indicates the number of phantom (targeted to another shared partition in this pool)
interruptions received.
%hypv
Indicates the percentage of physical processor consumption spent making hypervisor
calls.
hcalls
Indicates the average number of hypervisor calls that were started.
lparstat -h command sample output from not busy system

# lparstat -E 10 1

System configuration: type=Shared mode=Capped smt=4 lcpu=8 mem=32768MB ent=2.00 Power=Disabled

Physical Processor Utilisation:

 --------Actual--------                ------Normalised------
 user   sys    wait   idle   freq           user   sys    wait   idle
 ----   ----   ----   ----   ---------      ----   ----   ----   ----
 0.147  0.137  0.000  1.716  3.1GHz[100%]   0.147  0.137  0.000  1.716
With the -E option, lparstat reports Scaled Processor Utilization Resource Register (SPURR) based
utilization metrics if run on a SPURR-capable processor. This output is useful when the Power
Saving Mode is enabled.
The Normalised columns indicate what the utilization would be if the machine were running at
full processor speed, and are therefore a better indicator of the available capacity (since the
frequency is restored to full speed as the processor becomes busier).
See the reference “CPU frequency monitoring using lparstat” for more details.
lparstat -E command sample output

# lparstat -c 10 1
System is not running in AME memory mode, -c flag is not valid

If AME is not enabled (the target memory expansion factor is not displayed, or “-” is displayed),
then no statistics are returned.

Below is sample output from another system, under load, with AME enabled.

System configuration: type=Shared mode=Capped mmode=Ded-E smt=4 lcpu=80 mem=40960MB tmem=32768MB psize=32 ent=20.00

%user %sys %wait %idle physc %entc lbusy vcsw phint %xcpu xphysc dxm
----- ----- ------ ------ ----- ----- ------ ----- ----- ------ ------ ------
18.3 4.1 4.0 73.6 7.06 35.3 10.3 2654 87 0.8 0.0576 0
21.3 4.0 3.1 71.6 7.79 39.0 11.5 3799 121 0.3 0.0240 0
21.3 4.1 3.2 71.4 8.02 40.1 11.8 8983 112 0.6 0.0502 0
24.8 3.4 2.6 69.2 8.89 44.5 13.4 4433 120 1.4 0.1286 0
17.5 3.4 2.4 76.7 6.57 32.9 11.2 6329 119 0.4 0.0295 0
60.1 5.6 3.4 30.8 16.25 81.2 41.9 14123 614 1.3 0.2085 0
25.7 20.7 8.3 45.3 11.99 60.0 31.4 12320 336 37.0 4.4403 0
13.3 10.6 4.5 71.6 7.44 37.2 15.9 8107 171 29.3 2.1798 0
17.1 3.5 5.7 73.7 6.53 32.7 11.4 4205 118 6.3 0.4125 0
23.9 5.2 6.7 64.2 8.89 44.4 15.3 9940 149 5.4 0.4772 0
25.5 3.1 4.8 66.5 8.91 44.6 13.9 4390 142 1.3 0.1120 0
16.7 3.8 4.1 75.4 6.81 34.1 11.6 2980 87 3.8 0.2603 0
27.9 8.6 3.7 59.8 10.99 54.9 17.1 9501 158 10.1 1.1112 0
18.6 7.4 3.6 70.4 8.27 41.3 13.2 4801 121 12.4 1.0275 0
21.0 4.5 3.5 71.0 8.12 40.6 12.6 3548 111 5.7 0.4610 0
20.2 4.9 2.7 72.3 7.93 39.6 13.5 4683 115 2.3 0.1842 0
26.7 4.2 2.4 66.7 9.37 46.9 15.6 3412 136 0.6 0.0537 0
43.0 16.7 2.7 37.6 16.00 80.0 29.0 9829 293 17.3 2.7726 0
15.2 8.3 5.0 71.5 7.19 36.0 15.7 17264 212 21.5 1.5424 0
23.9 6.2 7.4 62.5 9.29 46.4 16.0 11682 203 12.9 1.1936 0
Relevant columns:
%xcpu
Indicates the percentage of utilization (relative to the overall CPU consumption by the
logical partition, in other words relative to the value in the physc column) for the Active
Memory Expansion (AME) activity.
xphysc
Indicates the number of physical processors used for the Active Memory Expansion
activity.
dxm
Indicates the size of the expanded memory deficit for the LPAR in MB.
lparstat -c command sample output
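
Because %xcpu is expressed relative to physc, the xphysc column can be approximated from the other two. The following sketch does that arithmetic for two rows of the sample output; `xphysc_estimate` is a hypothetical helper name, and the small differences from the reported values are assumed to come from rounding in the displayed columns.

```shell
#!/bin/sh
# xphysc_estimate is a hypothetical helper, not an AIX command.
# $1 = %xcpu (percentage of physc spent on AME), $2 = physc.
# Prints the estimated number of physical processors used for AME.
xphysc_estimate() {
    awk -v x="$1" -v p="$2" 'BEGIN { printf "%.2f\n", x / 100 * p }'
}

xphysc_estimate 37.0 11.99   # prints 4.44 (reported xphysc: 4.4403)
xphysc_estimate 0.8 7.06     # prints 0.06 (reported xphysc: 0.0576)
```

In the first row shown, more than a third of the partition's processor consumption during that interval went to memory compression and expansion.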

# mpstat -s 10 1

System configuration: lcpu=8 ent=2.0 mode=Capped

     Proc0                           Proc4
    43.13%                           0.27%
  cpu0    cpu1    cpu2    cpu3    cpu4    cpu5    cpu6    cpu7
24.79%   8.65%   4.86%   4.83%   0.06%   0.06%   0.06%   0.10%

The -s flag displays simultaneous multithreading thread utilization. It is available only when
mpstat runs in a simultaneous multithreading enabled partition.

Proc# - shows virtual processors
cpu# - shows logical processors
In this case, Proc0 has logical processors 0,1,2,3 and Proc4 has logical processors 4,5,6,7.

The percentages show the percentage of a physical processor consumed. This is comparable to the
sar command physc output, only broken out by virtual and logical processor.

mpstat -s command sample output
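
As a quick consistency check on this output, the logical processor percentages under a virtual processor should add up to that virtual processor's figure, and the virtual processors together should match physc. The sketch below only does the addition; `sum_pct` is a hypothetical helper name introduced for this example.

```shell
#!/bin/sh
# sum_pct is a hypothetical helper, not an AIX command.
# It adds the percentages passed as arguments; "%" signs are dropped.
sum_pct() {
    echo "$@" | tr -d '%' |
        awk '{ for (i = 1; i <= NF; i++) s += $i; printf "%.2f\n", s }'
}

# Logical processors cpu0 to cpu3 under Proc0.
sum_pct 24.79% 8.65% 4.86% 4.83%   # prints 43.13
# Both virtual processors together.
sum_pct 43.13% 0.27%               # prints 43.40
```

The total of 43.40% of one physical processor corresponds to the physc value of 0.43 reported by sar and lparstat for the same interval.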

4.2.3 Differences in CPU Consumption by Hypervisor at Low and High Utilization

# lparstat -h 10 1

System configuration: type=Shared mode=Capped smt=4 lcpu=8 mem=32768MB psize=32 ent=2.00

%user  %sys  %wait  %idle physc %entc  lbusy  vcsw phint  %hypv hcalls
----- ----- ------ ------ ----- ----- ------ ----- ----- ------ ------
  7.4   6.8    0.0   85.8  0.43  21.7    8.9  2144     2   88.0  14188
lparstat -h command sample output from not busy system

# lparstat -h 10 1

System configuration: type=Shared mode=Capped smt=4 lcpu=8 mem=32768MB psize=32 ent=2.00

%user  %sys  %wait  %idle physc %entc  lbusy  vcsw phint  %hypv hcalls
----- ----- ------ ------ ----- ----- ------ ----- ----- ------ ------
 96.1   3.9    0.0    0.0  2.00 100.0  100.0     4    62    0.4  24855
lparstat -h command sample output from busy system

The two preceding tables show `lparstat -h` output on a not busy and a busy system. Note
that on the not busy system the percentage of physical processor consumption spent making
hypervisor calls (%hypv) is very high. In the busy case, %hypv is very small. The important
thing to note is that when the system is not busy (when the wait process is running) it gives
control back to the hypervisor. This results in additional overhead; however, as the system
becomes busier this overhead diminishes.

Note: at this time there is a known problem impacting the accuracy of the reported %hypv.
In some cases the percentage reported is too high. At this time there is no APAR to resolve
this.

The lparstat command with the -H option shows a detailed breakdown of the %hypv. Note in the
following example that the majority of the %hypv time is in the cede hypervisor call. This is
the wait process giving control back to the hypervisor.

--------------------------------------------------------------------------------
Hypervisor            Number of  %Total Time  %Hypervisor    Avg Call    Max Call
Call                      Calls        Spent   Time Spent    Time(ns)    Time(ns)

remove                    92300          0.0          0.0         746       10812
read                      14680          0.0          0.0         192        4937
nclear_mod                    0          0.0          0.0           0           0
page_init                 71804          0.0          0.0         564        9625
clear_ref                     0          0.0          0.0           0           0
protect                    1424          0.0          0.0         890        9031
put_tce                    2175          0.0          0.0         962        8468
xirr                       4013          0.0          0.0         988        8968
eoi                        3985          0.0          0.0         382        4468
ipi                          30          0.0          0.0         662        5687
cppr                       3970          0.0          0.0         339        6093
asr                           0          0.0          0.0           0           0
others                        3          0.0          0.0        1041        1812
enter                    135742          0.0          0.0         391       11750
cede                     266805         98.2         99.9      882874    44025468
migrate_dma                   0          0.0          0.0           0           0
put_rtce                      0          0.0          0.0           0           0
confer                       16          0.0          0.0          95         500
prod                      21771          0.0          0.0         631        9781
get_ppp                      70          0.0          0.0        1261        7312
set_ppp                       0          0.0          0.0           0           0
purr                          0          0.0          0.0           0           0
pic                          70          0.0          0.0         185         875
bulk_remove               13090          0.0          0.0        2137       11531
send_crq                      0          0.0          0.0           0           0
copy_rdma                     0          0.0          0.0           0           0
get_tce                       0          0.0          0.0           0           0
send_logical_lan              0          0.0          0.0           0           0
add_logicl_lan_buf            0          0.0          0.0           0           0
--------------------------------------------------------------------------------
lparstat -H command sample output

5. Collecting Detailed Performance Data


This section describes the AIX curt and tprof performance tools for getting a more detailed view
of the processor utilization. Curt provides a summary of the overhead by processing category and
at the process or thread level. Tprof provides a summary at the process and thread level, and at
the subroutine level. Normally a customer would only use these tools if additional data was
requested by IBM or Oracle support, who would be involved in interpreting the detailed results.

5.1 curt Performance Report


The curt tool produces a number of statistics related to processor (CPU) utilization and
process/thread/pthread activity. An AIX trace is collected and then the curt tool processes the
trace data to show utilization.

Please refer to the AIX documentation for details on the curt command and its options:
http://pic.dhe.ibm.com/infocenter/aix/v7r1/index.jsp?topic=%2Fcom.ibm.aix.cmds%2Fdoc%2Faixcmds1%2Fcurt.htm

Following is an example of creating a curt report.

# trace -a -L 4294967296 -r PURR -T 268435184 -J curt -o /bkup/trace_curt.raw;sleep $TIME;trcstop
# curt -r PURR -i /bkup/trace_curt.raw -o curt_report.out

trace command options:
-a - Runs the trace daemon asynchronously (i.e., as a background task).
-T - In-memory trace buffer size.
-J curt - Event groups to include. “curt” includes the special set of trace hooks needed to generate a curt report.
-r PURR - Includes the PURR register, which curt can use to calculate CPU times more accurately.
curt command options:
-r PURR - Uses the PURR register to calculate CPU times.

Sample section from curt report:

:
System Summary
--------------
 processing       percent       percent
 total time    total time     busy time
     (msec)  (incl. idle)  (excl. idle)  processing category
===========  ============  ============  ===================
   54973.37         11.18         64.12  APPLICATION
   15478.07          3.15         18.05  SYSCALL
    1004.39          0.20          1.17  HCALL
    2618.87          0.53          3.05  KPROC (excluding IDLE and NFS)
       0.01          0.00          0.00  NFS
   10176.75          2.07         11.87  FLIH
     134.84          0.03          0.16  SLIH
    1349.85          0.27          1.57  DISPATCH (all procs. incl. IDLE)
     418.87          0.09          0.49  IDLE DISPATCH (only IDLE proc.)
-----------  ------------  ------------
   85736.15         17.44        100.00  CPU(s) busy time
  405795.13         82.56                IDLE
-----------  ------------
  491531.28                              TOTAL
:
Sample section from curt report

The curt report includes the First Level Interrupt Handler (FLIH) time. This is noted because it is
not accounted for in the tprof report mentioned below.
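
The two percentage columns in the curt System Summary are simple ratios of the processing time column. The sketch below reproduces them for two categories from the sample report; `curt_pct` is a hypothetical helper name introduced only for this illustration.

```shell
#!/bin/sh
# curt_pct is a hypothetical helper, not part of curt.
# $1 = category time (msec), $2 = CPU busy time (msec), $3 = total time (msec).
# Prints "percent of total time" and "percent of busy time".
curt_pct() {
    awk -v c="$1" -v b="$2" -v t="$3" \
        'BEGIN { printf "%.2f %.2f\n", c / t * 100, c / b * 100 }'
}

# APPLICATION row: 54973.37 msec out of 85736.15 busy and 491531.28 total.
curt_pct 54973.37 85736.15 491531.28   # prints 11.18 64.12
# HCALL row.
curt_pct 1004.39 85736.15 491531.28    # prints 0.20 1.17
```

Reading the two columns together helps avoid overstating a category: APPLICATION is 64.12% of the busy time but only 11.18% of the total time, because the system was idle for 82.56% of the trace.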

5.2 tprof Performance Report


The tprof command is a versatile profiler that provides a detailed profile of CPU usage by every
process ID and name. It further profiles at the application level, routine level, and even to the
source statement level and provides both a global view and a detailed view. In addition, the tprof
command can profile kernel extensions, stripped executable programs, and stripped libraries. It
does subroutine-level profiling for most executable programs on which the stripnm command
produces a symbols table.

Please refer to the tprof documentation for details on this command:
http://pic.dhe.ibm.com/infocenter/aix/v7r1/index.jsp?topic=%2Fcom.ibm.aix.cmds%2Fdoc%2Faixcmds5%2Ftprof.htm

The tprof command can run in different modes. In the following example it is run in offline
mode which allows specifying a larger trace buffer and logfile.

#--- create links between ps command names and Oracle binaries
#--- (CRS_HOME, ORACLE_HOME and TIME must be set before running these steps;
#--- TIME is the trace duration in seconds)
for F in `ps -elf|egrep "${CRS_HOME}|${ORACLE_HOME}"|grep -v grep|sed -e "s|.* /|/|"|sed -e "s| .*||g"|sort|uniq`
do
N=`basename "$F"`
ln -s "$F" "$N"
done
#--- first run tprof in online mode to generate the symbol file
tprof -c -tskeujl -R -T 268435184 -p ocssd.bin,crsd.bin,ohasd.bin -x sleep 1
rm sleep.prof
rm sleep.ctrc
mv sleep.csyms sleep.syms #--- save the symbols file for next step
#--- generate large trace for tprof to process
trace -a -r PURR -L 2147483648 -T 268435184 -J tprof -o /bkup/trace_tprof.raw;sleep $TIME;trcstop
#--- run tprof in offline mode against the saved symbols and the collected trace
tprof -R -tskeujl -p ocssd.bin,crsd.bin,ohasd.bin -F -r sleep
mv sleep.prof tprof_report.out
tprof command options:
-R - Uses samples weighted by the PURR register.
-p - Enables process level profiling of the process names specified in the process list.
-F - Uses offline mode.
-r sleep - Root string; identifies the root name of the input and output files (for example sleep.syms).

Following is a sample section from a tprof report.

:
Process Freq Total Kernel User Shared Other Java
======= ==== ===== ====== ==== ====== ===== ====
wait 4 88.92 88.91 0.01 0.00 0.00 0.00
./ocssd.bin 18 1.76 1.17 0.00 0.58 0.000 0.00
./oraagent.bin 67 1.15 0.56 0.31 0.28 0.00 0.00
./orarootagent.bin 385 1.06 0.66 0.17 0.23 0.00 0.00
./ohasd.bin 20 0.91 0.62 0.01 0.29 0.000 0.00
./gipcd.bin 6 0.88 0.61 0.00 0.27 0.00 0.00
./crsd.bin 23 0.76 0.52 0.00 0.23 0.00 0.00
./evmd.bin 8 0.67 0.47 0.00 0.20 0.00 0.00
./octssd.bin 8 0.65 0.46 0.00 0.19 0.00 0.00
//bin/sh 2253 0.46 0.41 0.02 0.01 0.01 0.00
/ora112/grid/perl/bin/perl 223 0.14 0.07 0.05 0.02 0.01 0.00
ora_lms1_orcl_4 1 0.14 0.11 0.02 0.00 0.00 0.00
ora_lms0_gpdb_4 1 0.14 0.11 0.02 0.00 0.00 0.00
ora_lms1_gpdb_4 1 0.13 0.11 0.02 0.00 0.00 0.00
ora_lms0_orcl_4 1 0.13 0.11 0.02 0.00 0.00 0.00
asm_dia0_+ASM4 1 0.12 0.07 0.04 0.01 0.00 0.00
gil 4 0.12 0.11 0.00 0.00 0.00 0.00
ora_dia0_gpdb_4 1 0.11 0.06 0.04 0.01 0.00 0.00
ora_dia0_orcl_4 1 0.11 0.06 0.04 0.01 0.00 0.00
asm_lms0_+ASM4 1 0.09 0.07 0.02 0.00 0.00 0.00
/usr/sbin/lsattr 502 0.09 0.07 0.01 0.00 0.00 0.00
/usr/sbin/sshd 275 0.08 0.05 0.02 0.01 0.00 0.00
/sbin/acfsutil.bin 115 0.07 0.03 0.00 0.00 0.04 0.00
ora_lmon_gpdb_4 1 0.07 0.02 0.05 0.00 0.00 0.00
ora_lmon_orcl_4 1 0.06 0.02 0.04 0.00 0.00 0.00
/usr/bin/awk 386 0.06 0.05 0.00 0.00 0.00 0.00
/usr/bin/pwd 272 0.05 0.04 0.00 0.00 0.00 0.00
asm_lmon_+ASM4 1 0.05 0.02 0.02 0.00 0.00 0.00
/usr/bin/ps 170 0.03 0.03 0.00 0.00 0.00 0.00
/bin/sh 161 0.03 0.03 0.00 0.00 0.00 0.00
./cssdagent 11 0.03 0.02 0.00 0.01 0.00 0.00
ora_j000_orcl_4 27 0.03 0.01 0.01 0.00 0.01 0.00
./cssdmonitor 9 0.03 0.02 0.00 0.01 0.00 0.00
asm_lmd0_+ASM4 1 0.03 0.02 0.01 0.00 0.00 0.00
/usr/sbin/instfix 5 0.02 0.01 0.01 0.00 0.00 0.00
:
Sample section from tprof report
