Performance Management Best Practices
Sergio Bonilla
IBM Tivoli Storage Productivity Center development,
San Jose, CA
Second Edition (November 2009)
This edition applies to Version 4, Release 1, of IBM Tivoli Storage Productivity Center
Table of Contents
1 Notices
1.1 Legal Notice
1.2 Trademarks
1.3 Acknowledgement
1.4 Other IBM Tivoli Storage Productivity Center Publications
2 IBM Tivoli Storage Productivity Center Performance Management
2.1 Overview
2.2 Disk Performance and Fabric Performance
2.3 Performance Metrics
3 Setup and Configuration
3.1 Performance Data Collection
3.1.1 Add CIMOM
3.1.2 Discover Subsystems or Switches
3.1.3 Probe the Device
3.1.4 Create Threshold Alerts
3.1.5 Create Performance Monitor
3.1.6 Check Performance Monitor Status
3.2 Retention for Performance Data
3.3 Common Issues
3.3.1 General Issues
3.3.2 ESS and DS Related Issues
3.3.3 DS4000/DS5000 Related Issues
3.3.4 HDS Related Issues
4. Top Reports and Graphs a Storage Administrator May Want to Run
4.1 Tabular Reports
4.2 Drill Up and Drill Down
4.3 Historic Charts
4.4 Batch Reports
4.5 Constraint Violation Reports
4.6 Top Hit Reports
5. SAN Planner and Storage Optimizer
6. Summary
7. References
Appendix A Available Metrics
Appendix B Available Thresholds
Appendix C DS3000, DS4000 and DS5000 Metrics
1 Notices
1.1 Legal Notice
IBM may have patents or pending patent applications covering subject matter described in this
document. The furnishing of this document does not give you any license to these patents. You
can send license inquiries, in writing, to:
IBM Director of Licensing, IBM Corporation, North Castle Drive Armonk, NY 10504-1785 U.S.A.
The following paragraph does not apply to the United Kingdom or any other country where such
provisions are inconsistent with local law: INTERNATIONAL BUSINESS MACHINES
CORPORATION PROVIDES THIS PUBLICATION "AS IS" WITHOUT WARRANTY OF ANY
KIND, EITHER EXPRESS OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
WARRANTIES OF NONINFRINGEMENT, MERCHANTABILITY OR FITNESS FOR A
PARTICULAR PURPOSE. Some states do not allow disclaimer of express or implied warranties
in certain transactions, therefore, this statement may not apply to you.
This information could include technical inaccuracies or typographical errors. Changes are
periodically made to the information herein; these changes will be incorporated in new editions of
the publication. IBM may make improvements and/or changes in the product(s) and/or the
program(s) described in this publication at any time without notice.
Any references in this information to non-IBM Web sites are provided for convenience only and
do not in any manner serve as an endorsement of those Web sites. The materials at those Web
sites are not part of the materials for this IBM product and use of those Web sites is at your own
risk.
IBM may use or distribute any of the information you supply in any way it believes appropriate
without incurring any obligation to you.
Information concerning non-IBM products was obtained from the suppliers of those products, their
published announcements or other publicly available sources. IBM has not tested those products
and cannot confirm the accuracy of performance, compatibility or any other claims related to non-
IBM products. Questions on the capabilities of non-IBM products should be addressed to the
suppliers of those products.
This information contains examples of data and reports used in daily business operations. To
illustrate them as completely as possible, the examples include the names of individuals,
companies, brands, and products. All of these names are fictitious and any similarity to the
names and addresses used by an actual business enterprise is entirely coincidental.
COPYRIGHT LICENSE:
This information contains sample application programs in source language, which illustrate
programming techniques on various operating platforms. You may copy, modify, and distribute
these sample programs in any form without payment to IBM, for the purposes of developing,
using, marketing or distributing application programs conforming to the application programming
interface for the operating platform for which the sample programs are written. These examples
have not been thoroughly tested under all conditions. IBM, therefore, cannot guarantee or imply
reliability, serviceability, or function of these programs. You may copy, modify, and distribute
these sample programs in any form without payment to IBM for the purposes of developing, using,
marketing, or distributing application programs conforming to IBM's application programming
interfaces.
1.2 Trademarks
The following terms are trademarks or registered trademarks of the International Business
Machines Corporation in the United States or other countries or both:
AIX®, DB2®, DS4000, DS6000, DS8000, iSeries, Passport Advantage®, pSeries®, Tivoli®,
Tivoli Storage®, WebSphere®
UNIX is a registered trademark of the Open Group in the United States and other countries.
Java, Solaris, and all Java-based trademarks and logos are trademarks of Sun Microsystems, Inc.
in the United States, other countries, or both.
Intel is a registered trademark of the Intel Corporation or its subsidiaries in the United States and
other countries.
Linux is a trademark of Linus Torvalds in the United States, other countries, or both.
HiCommand is a registered trademark of Hitachi Data Systems Corporation.
Brocade and the Brocade logo are trademarks or registered trademarks of Brocade
Communications Systems, Inc., in the United States and/or in other countries.
Cisco is a registered trademark of Cisco Systems, Inc. and/or its affiliates in the U.S. and certain
other countries.
Engenio and the Engenio logo are trademarks or registered trademarks of LSI Logic Corporation.
Other company, product, or service names may be trademarks or service marks of others.
1.3 Acknowledgement
The materials in this document have been collected from work in the IBM Tivoli Storage
Productivity Center development lab, from other labs within IBM, from experience in the field at
customer locations, and from contributions offered by people who have discovered valuable tips
and documented the solutions.
Many people have helped with the materials included in this document, too many to
acknowledge properly here, but special thanks go to Xin Wang, who compiled the original
version of this document. This document is a source of information for advanced configuration
help and basic best practices for users wanting to get started quickly with Tivoli Storage
Productivity Center.
2 IBM Tivoli Storage Productivity Center Performance
Management
2.1 Overview
There are three main functions in IBM Tivoli Storage Productivity Center performance
management: performance data collection, performance thresholds/alerts, and performance
reports. The product can collect performance data for devices - storage subsystems and fibre
channel switches - with CIM agents that are at least SMI-S 1.1 compliant, and store the data in
the database for a user-defined retention period. It can set thresholds for important
performance metrics and, when any boundary condition is crossed, notify the user via email,
SNMP, or other alerting mechanisms. Lastly, it can generate reports and historic trend
charts, and help analyze performance bottlenecks by drilling down to the components that
violated thresholds and the affected hosts. In combination, these functions can be used to
monitor a complicated storage network environment, to catch warning signs of system failure
early, and to do capacity planning as the overall workload grows.
The IBM Tivoli Storage Productivity Center Standard Edition (5608-WC0) includes
performance management for both subsystems and switches, while IBM Tivoli Storage
Productivity Center for Disk (5608-WC4) covers storage subsystems only. IBM Tivoli Storage
Productivity Center Basic Edition (5608-WB1) and IBM Tivoli Storage Productivity Center for Data
(5608-WC3) do not include the performance management function.
Front-end I/O metrics are a measure of the traffic between the servers and the storage
subsystem, and are characterized by
relatively fast hits in the cache, as well as occasional cache misses that go all the way to the
RAID arrays on the back end. Back-end I/O metrics are a measure of all traffic between the
subsystem cache and the disks in the RAID arrays in the backend of the subsystem. Most
storage subsystems give metrics for both kinds of I/O operations, front- and back-end. We need
to always be clear whether we are looking at throughput and response time at the front-end (very
close to system level response time as measured from a server), or the throughput and response
time at the back-end (just between cache and disk).
The main front-end throughput metrics are:
• Total I/O Rate (overall)
• Read I/O Rate (overall)
• Write I/O Rate (overall)
The corresponding front-end response time metrics are:
• Overall Response Time
• Read Response Time
• Write Response Time
The main back-end throughput metrics are:
• Total Backend I/O Rate (overall)
• Backend Read I/O Rate (overall)
• Backend Write I/O Rate (overall)
The corresponding back-end response time metrics are:
• Overall Backend Response Time
• Backend Read Response Time
• Backend Write Response Time
It is important to remember that response times taken in isolation from throughput rates are
not terribly useful, because it is common for components with negligible throughput rates
to exhibit large (bad) response times. In essence, those bad response times are not significant
to the overall operation of the storage environment if they occurred for only a handful of I/O
operations. It is therefore necessary to understand which throughput and response
time combinations are significant and which can be ignored. To help in this determination, IBM
Tivoli Storage Productivity Center V4.1.1 has introduced a metric called Volume Utilization
Percentage. This metric is based on both the I/O Rate and the Response Time of a storage
volume and is an approximate measure of the amount of time the volume was busy reading and
writing data. It is therefore safe to ignore bad average response time values for volumes with
very low utilization percentages; conversely, the volumes with the highest utilization percentages
are the most important for the smooth operation of the storage environment, and it is most
important that they exhibit good response times. When implementing storage tiering using 10K
or 15K RPM drives, or even SSDs, the most highly utilized volumes should be considered for
placement on the best performing underlying media.
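
To make the utilization idea concrete, here is a minimal sketch, in Python, of one plausible way
to approximate such a figure from the two inputs the metric is based on. The function name and
the arithmetic are illustrative assumptions, not the product's published formula:

    # Illustrative approximation only, not the product's exact computation:
    # a volume servicing io_rate operations per second, each taking
    # resp_time_ms milliseconds, is busy io_rate * resp_time_ms
    # milliseconds out of every 1000.
    def approx_volume_utilization_pct(io_rate: float, resp_time_ms: float) -> float:
        busy_ms_per_sec = io_rate * resp_time_ms
        return min(busy_ms_per_sec / 1000.0 * 100.0, 100.0)

    print(approx_volume_utilization_pct(50, 4))   # 20.0 -> moderately busy volume
    print(approx_volume_utilization_pct(2, 80))   # 16.0 -> "bad" response time, low utilization

The second example shows why a large average response time on a nearly idle volume can
safely be ignored.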
Furthermore, it is advantageous to track any growth or change in the throughput rates and
response times. It frequently happens that I/O rates grow over time, and that response times
increase as the I/O rates increase. This relationship is what "capacity planning" is all about: as
I/O rates and response times increase, you can use these trends to project when additional
storage performance (as well as capacity) will be required.
Depending on the particular storage environment, it may be that throughput or response times
change drastically from hour to hour or day to day. There may be periods when the values fall
outside the expected range of values. In that case, other performance metrics can be used to
understand what is happening. Here are some additional metrics that can be used to make sense
of throughput and response times.
• Total Cache Hit percentage
• Read Cache Hit Percentage
• NVS Full Percentage
• Read Transfer Size (KB/Op)
• Write Transfer Size (KB/Op)
Low cache hit percentages can drive up response times, since a cache miss requires access to
the backend storage. Low hit percentages will also tend to increase the utilization percentage of
the backend storage, which may adversely affect the back-end throughput and response times.
High NVS Full Percentage (also known as Write-cache Delay Percentage) can drive up the write
response times. High transfer sizes usually indicate more of a batch workload, in which case the
overall data rates are more important than the I/O rates and the response times.
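
As a simple illustration of the first point, every read miss must be staged from the backend, so
the backend read load implied by a front-end read rate can be estimated from the hit
percentage. This is back-of-the-envelope arithmetic under that assumption, not a product
calculation:

    # Estimate the back-end read rate implied by the front-end read rate
    # and the read cache hit percentage (misses go to the RAID arrays).
    def backend_read_rate(frontend_read_rate: float, read_hit_pct: float) -> float:
        return frontend_read_rate * (1.0 - read_hit_pct / 100.0)

    # Dropping from a 90% to a 60% hit ratio quadruples the back-end read load:
    print(backend_read_rate(1000, 90))  # 100.0 ops/s reach the back end
    print(backend_read_rate(1000, 60))  # 400.0 ops/s reach the back end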
In addition to the front-end and back-end metrics, many storage subsystems provide additional
metrics to measure the traffic between the subsystem and host computers, and to measure the
traffic between the subsystem and other subsystems when linked in remote-copy relationships.
Such fibre channel port-based metrics (primarily I/O rates, data rates, and response times) are
available for ESS, DS6000, DS8000, and SVC subsystems. ESS, DS6000, and DS8000
subsystems provide an additional breakdown among FCP, FICON, and PPRC operations at each
port. SVC subsystems provide an additional breakdown, at each SVC port, among
communications with host computers, backend managed disks, other nodes within the local
SVC cluster, and remote SVC clusters.
Similar to the Volume Utilization Percentage mentioned earlier, IBM Tivoli Storage Productivity
Center V4.1.1 has also introduced the Port Utilization Percentage metric (available for ESS,
DS6000, and DS8000 storage subsystems). The Port Utilization Percentage is an approximate
measure of the amount of time a port was busy, and can be used to identify over-utilized and
under-utilized ports on the subsystem for potential port balancing. For subsystems where port
utilizations are not available, the simpler Port Bandwidth Percentage metrics provide a measure
of the approximate bandwidth utilization of a port, based on the port’s negotiated speed, and can
be used in a similar fashion. Beware, however, that the Port Bandwidth Percentages can
falsely indicate port under-utilization when a performance bottleneck elsewhere in the fabric, or
at the port's communication partner, limits the traffic that reaches the port.
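
The idea behind the bandwidth metrics can be sketched as follows. The 100 MB/s-per-Gbps
payload factor reflects the 8b/10b encoding of 1/2/4/8 Gbps fibre channel links and, like the
function name, is an assumption for illustration rather than the product's exact computation:

    # Approximate bandwidth utilization of a port from its measured data
    # rate and its negotiated speed (8b/10b encoding: roughly 100 MB/s of
    # payload per Gbps of line rate on 1/2/4/8 Gbps FC links).
    def port_bandwidth_pct(data_rate_mb_per_sec: float, negotiated_gbps: float) -> float:
        usable_mb_per_sec = negotiated_gbps * 100.0
        return data_rate_mb_per_sec / usable_mb_per_sec * 100.0

    print(port_bandwidth_pct(320.0, 4))  # a 4 Gbps port moving 320 MB/s is ~80% utilized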
For fibre-channel switches, the important metrics are Total Port Packet Rate and Total Port
Data Rate, which provide the traffic pattern over a particular switch port, as well as the Port
Bandwidth Percentage metrics providing indicators of bandwidth usage based on port speeds.
When there are lost frames from the host to the switch port, or from the switch port to a storage
device, the dumped frame rate on the port can be monitored.
All these metrics can be monitored via reports or graphs in IBM Tivoli Storage Productivity
Center. Also there are several metrics for which you can define thresholds and receive alerts
when measured values do not fall within acceptable boundaries. Some examples of supported
thresholds are:
• Total I/O Rate and Total Data Rate Thresholds
• Total Backend I/O Rate and Total Backend Data Rate Thresholds
• Read Backend Response Time and Write Backend Response Time Thresholds
• Total Port I/O Rate (Packet Rate) and Data Rate Thresholds
• Overall Port Response Time Threshold
• Port Send Utilization Percentage and Port Receive Utilization Percentage Thresholds
• Port Send Bandwidth Percentage and Port Receive Bandwidth Percentage Thresholds
Please see Appendix A for a complete list of performance metrics that IBM Tivoli Storage
Productivity Center supports and Appendix B for a complete list of thresholds supported.
The important thing is to monitor the throughput and response time patterns over time for your
environment, and to develop an understanding of normal and expected behaviors. Then you can
set threshold boundaries to alert you when anomalies in the expected behavior are detected,
and you can use the performance reports and graphs to investigate deviations from normal
patterns or to track trends in workload changes.
3 Setup and Configuration
3.1 Performance Data Collection
3.1.1 Add CIMOM
• Check "Test CIMOM connectivity before adding" to make sure the CIMOM is up and
can be connected to
When a CIMOM is added successfully, it will appear in the list of CIMOMs on the Administrative
Services -> Data Sources -> CIMOM Agents panel, and the status should be green.
The alternative to adding the CIMOM manually is to have the CIMOM automatically discovered
via SLP. However, configuring and performing a CIMOM discovery via SLP falls outside the
scope of this paper.
3.1.4 Create Threshold Alerts
Tivoli Storage Productivity Center ships with several default thresholds enabled (see
Appendix B for a full list of supported thresholds); these apply to metrics that do not change
much from environment to environment. Metrics such as throughput and response time,
however, can vary widely depending on the type of workload, the model of hardware, the
amount of cache memory, and so on, so there are no recommended values to set. Boundary
values for these thresholds have to be determined in each particular environment by
establishing a base-line of the normal and expected performance behavior of the devices in the
environment. After the base-line is determined, thresholds can be defined to
trigger when the measured performance behavior falls outside the normally expected range.
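
One simple way to turn such a base-line into candidate boundary values is sketched below. It
assumes you have a history of samples for a metric (for example, exported from a report); the
two-sigma and three-sigma margins are arbitrary assumptions to be tuned to your tolerance for
false alerts:

    # Derive candidate stress boundaries from a base-line of samples.
    from statistics import mean, stdev

    def candidate_stress_boundaries(samples):
        m, s = mean(samples), stdev(samples)
        return {"warning stress": m + 2 * s, "critical stress": m + 3 * s}

    history = [850, 910, 780, 990, 875, 930, 805, 960]  # e.g. hourly Total I/O Rate samples
    print(candidate_stress_boundaries(history))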
Thresholds are device type and component type specific, meaning that each threshold may
apply to only a subset of all supported device types and to only a subset of supported component
types for each device type. Every threshold is associated with a particular metric; checking that
metric’s value at each collection interval determines whether the threshold is violated or not.
To create an alert for subsystem thresholds, go to Disk Manager -> Alerting -> Storage
Subsystem Alerts, then right-click and select Create Storage Subsystem Alert (see Figure 1):
• Alert tab – In the triggering condition area, select a triggering condition from the
drop-down list (threshold alerts have names ending with "Threshold"), ensure that the
threshold is enabled via the checkbox in the upper right corner, and then enter the
threshold boundary values for the desired boundary conditions. Tivoli Storage
Productivity Center V4.1.1 allows decimal values for the threshold boundaries; prior
versions allow only integer values.
• Alert tab – Some thresholds are associated with an optional filter condition, which is
displayed in the triggering condition area. If displayed, you can enable or disable the filter,
and if it is enabled, can set the filter boundary condition. When the filter is enabled and
its condition is met, violations of this threshold are ignored.
• Alert tab – In the alert suppression area, select whether to trigger alerts for both critical
and warning conditions, for critical conditions only, or not at all. Suppressed alerts will
not generate alert log entries or cause any action defined in the triggered action area
to be taken, but the violations will still be visible in the constraint violation reports.
• Alert tab – In the alert suppression area, select whether to suppress repeating alerts.
You may either suppress alerts until the triggering condition has been violated
continuously for a specified length of time or to suppress subsequent violations for a
length of time after the initial violation. Alerts suppressed will still be visible in the
constraint violation reports.
• Alert tab – In the triggered action area, select one of the following actions: SNMP trap,
TEC/OMNIbus event, login notification, Windows event log, run script, or email.
• Storage subsystem tab – move the subsystem(s) you want to monitor into the right-
hand panel (Selected subsystems). Make sure these are the subsystems for which you
will define performance monitors.
• Save the alert with a name.
Figure 1. Threshold alert creation panel for storage subsystems.
To create an alert for switch thresholds, go to Fabric Manager -> Alerting -> Switch Alerts,
then right-click and select Create Switch Alert, and follow the same steps as described above
for subsystems.
There are a few points that need to be addressed in order to understand threshold settings:
1. There are two types of boundaries for each threshold: the upper boundary (stress)
and the lower boundary (idle). When a metric's value exceeds the upper boundary or
falls below the lower boundary, an alert is triggered.
2. There are two levels of alerts, warning and critical. The combination of boundary type
and level generates four different threshold settings: critical stress, warning
stress, warning idle, and critical idle. Most threshold values are in descending order
(critical stress has the highest value, indicating high stress on the device, and
critical idle has the lowest value); Cache Holding Time is the only threshold in
ascending order. (See the sketch after this list.)
3. If you are only interested in receiving alerts for certain boundaries, leave the other
boundaries blank. The performance manager only checks boundary conditions that
have input values, so no alerts are sent for a condition that is left blank.
4. The storage subsystem alerts will be displayed under IBM Tivoli Storage
Productivity Center -> Alerting -> Alert Logs -> All, as well as under Storage
Subsystem. Another important way to look at the exception data is through the
constraint violation reports, described in section 4.5.
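
The boundary semantics in points 1-3 can be summarized in a small sketch. This illustrates the
descending-order rule and the skipping of blank boundaries; it is not the product's
implementation:

    # Classify one measured value against the four threshold settings.
    # A boundary left as None is simply not checked, mirroring the behavior
    # of boundaries left blank in the alert definition.
    def classify(value, critical_stress=None, warning_stress=None,
                 warning_idle=None, critical_idle=None):
        if critical_stress is not None and value >= critical_stress:
            return "critical stress"
        if warning_stress is not None and value >= warning_stress:
            return "warning stress"
        if critical_idle is not None and value <= critical_idle:
            return "critical idle"
        if warning_idle is not None and value <= warning_idle:
            return "warning idle"
        return "normal"

    # Only stress boundaries supplied; idle conditions are never triggered:
    print(classify(1200, critical_stress=1000, warning_stress=800))  # critical stress
    print(classify(5, warning_idle=10))                              # warning idle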
3.1.5 Create Performance Monitor
A performance data collection is defined via a mechanism called a monitor job, which can be
run manually (immediately), scheduled for one-time execution, or scheduled for repeated
execution, as desired.
A monitor job can run successfully only after the device has been probed successfully. To
create a performance monitor on a subsystem, go to Disk Manager -> Monitoring ->
Subsystem Performance Monitors, then right-click and select Create Performance Monitor.
• Storage Subsystem tab - move the subsystem(s) you want to monitor into the right-hand
panel (Selected subsystems)
• Sampling and scheduling tab – enter how frequently the data should be collected and
saved (the smaller the interval length, the more granular the performance data), when
the monitor will run, and the duration of the collection
• Save the monitor with a name
To create a performance monitor for a switch, go to Fabric Manager -> Monitoring -> Switch
Performance Monitors and follow the same steps as above, substituting switch for storage
subsystem.
The monitor will start at the scheduled time, but the first performance sample data will be
collected a few minutes later. For example, if your monitor is scheduled to start at 9 am and
collect with an interval length of 5 minutes, the first performance data might be inserted into the
database 10-15 minutes later, and the second sample will be inserted after 5 more minutes.
Only after the first sample data is inserted into the database (in this case, around 9:10 or
9:15 am) will you be able to view the performance reports.
Because of this, there are some best practices for setting up the schedule and duration of a
performance monitor:
1. Monitor duration – if a monitor is intended to run for a long time, choose to run it
indefinitely. The performance monitor is optimized such that running indefinitely is
more efficient than running, stopping, then starting again. One thing to note for an
indefinite monitor: its job log will not contain the "# of records inserted into database"
message that appears for monitors that run for a defined duration.
2. You should only have one performance monitor defined per storage device. If you
want to run the same monitor at different workload periods, set the duration to be 1
hour less than the difference between the two starting points. This gives the
collection engine one hour to finish the first collection and shut down properly. For
example, if you want to start a monitor at 12 am and 12 pm on the same day, the
duration of the 12 am collection has to be 11 hours or less, so the monitor can start
again at 12 pm successfully.
3. The same is true for a repeated run. If you want to run the same monitor daily, be
sure the duration of the monitor is 23 hours or less. If you want to run the same
monitor weekly, the duration of the monitor needs to be 7x24 - 1 = 167 hours or
less. (See the sketch after this list.)
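
The duration rule in points 2 and 3 reduces to subtracting the one-hour shutdown allowance
from the repeat interval, as in this trivial sketch:

    # Longest safe duration for a monitor that repeats every N hours,
    # leaving the collection engine one hour to finish and shut down.
    def max_duration_hours(repeat_interval_hours: int) -> int:
        return repeat_interval_hours - 1

    print(max_duration_hours(12))      # two runs per day -> 11 hours
    print(max_duration_hours(24))      # daily run        -> 23 hours
    print(max_duration_hours(7 * 24))  # weekly run       -> 167 hours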
During a performance sample collection, the hourly and daily summaries for each performance
metric are computed from the sample data. The summary data reflects the performance
characteristics of the component over longer time periods, while the sample data shows the
performance at a specific moment.
One more thing to note about a performance monitor and its sample data: the clock on the
server might differ from the clock on the device. The performance monitor always uses the
device time on the samples it collects, then converts it into the time zone (if different) of the
IBM Tivoli Storage Productivity Center server.
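
As a minimal sketch of that conversion (Python 3.9+ zoneinfo; the zone names and timestamp
are made-up examples):

    # Re-express a sample stamped in the device's local time in the
    # time zone of the TPC server.
    from datetime import datetime
    from zoneinfo import ZoneInfo

    device_time = datetime(2009, 11, 5, 9, 5, tzinfo=ZoneInfo("America/New_York"))
    server_zone = ZoneInfo("America/Los_Angeles")
    print(device_time.astimezone(server_zone))  # 2009-11-05 06:05:00-08:00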
3.1.6 Check Performance Monitor Status
When the performance monitor job starts to run, it begins collecting performance data for the
device. You should check the status of the monitor job to make sure it starts and continues
running. Expand Subsystem Performance Monitors, click on the monitor, and check the status
of the job you want to view. If the status is blue, the monitor is running without issues. If the
status is yellow, check the warning messages; the monitor will continue to run despite them.
For example, if there are "missing a sample data" warning messages, the monitor will continue
to run; only if the monitor misses all the data it should collect will the status turn red and the
monitor fail. In this case, you can click on the job and, on the right-hand panel, click on the
storage subsystem to view the job log for more information. Normally the job log will contain
error messages for a failed collection. If the status is green, the monitor completed
successfully.
There are a few common issues that may lead to failed data collection. See section 3.3 for
details.
3.2 Retention for Performance Data
Storage subsystem performance aggregated data size = NumSS * NumV * (24 * Rh + Rd) * 200 bytes
Switch performance sample size = NumSw * NumPt * CR * 24 * Rs * 150 bytes
Switch performance aggregated data size = NumSw * NumPt * (24 * Rh + Rd) * 150 bytes
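
A small sizing sketch based on these formulas follows; the variable interpretations are
assumptions (NumSS/NumV: number of subsystems and volumes per subsystem; NumSw/NumPt:
switches and ports per switch; CR: collected samples per hour, e.g. 12 for a 5-minute interval;
Rs/Rh/Rd: retention in days for sample, hourly, and daily data):

    # Estimate database space for switch performance data plus the
    # aggregated subsystem data, using the formulas above.
    def switch_sample_bytes(num_sw, ports_per_sw, samples_per_hour, rs_days):
        return num_sw * ports_per_sw * samples_per_hour * 24 * rs_days * 150

    def switch_aggregated_bytes(num_sw, ports_per_sw, rh_days, rd_days):
        return num_sw * ports_per_sw * (24 * rh_days + rd_days) * 150

    def subsystem_aggregated_bytes(num_ss, vols_per_ss, rh_days, rd_days):
        return num_ss * vols_per_ss * (24 * rh_days + rd_days) * 200

    # 2 switches x 64 ports, 5-minute samples kept 14 days,
    # hourly data kept 30 days, daily data kept 365 days:
    total = (switch_sample_bytes(2, 64, 12, 14)
             + switch_aggregated_bytes(2, 64, 30, 365))
    print(f"{total / 2**20:.1f} MiB")  # ~93.7 MiB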
3.3.4 HDS Related Issues
Tivoli Storage Productivity Center 4.1.1 is capable of collecting some performance statistics from
HDS subsystems with HDvM 6.2, but there are currently known limitations to the performance
metrics being returned. As such, Tivoli Storage Productivity Center 4.1.1 does not claim support
for monitoring HDS subsystems with HDvM 6.2.
For more information regarding this limitation, please see:
http://www-01.ibm.com/support/docview.wss?uid=swg21406692
4. Top Reports and Graphs a Storage Administrator May Want to Run
4.1 Tabular Reports
The most straightforward way to view performance data is to go through the corresponding
component manager’s reporting function to view the most recent data. For subsystem
performance reports, go to Disk Manager -> Reporting -> Storage Subsystem Performance,
then select one of the options to view the data.
The report type options, as shown in Figure 2, are:
• By Subsystem – for box level aggregate/averages for ESS, DS, SVC, and SMI-S BSP
• By Controller – for ESS clusters, and DS and select SMI-S BSP controllers
• By Array – for ESS and DS arrays
• By Volume – for ESS, DS, and SMI-S BSP volumes/LUNs, and SVC vdisks
• By Port – for ESS, DS, SVC, and SMI-S BSP FC ports on the storage box
• By I/O Group – for SVC I/O groups
• By Node – for SVC nodes
• By Managed Disk Group – for SVC managed disk groups
• By Managed Disk – for SVC managed disks
Figure 2. IBM Tivoli Storage Productivity Center 4.1.1 Performance Reports Options for Disk
Manager.
On the right-hand panel, the available metrics for the particular type of device appear in the
Included Columns list, and the user can choose which metrics to include in or exclude from the
performance report. The user can also use the Selection button to pick components specific to
the device type, and use the Filter button to define filter criteria of their choosing.
Select the "display latest performance data" option to generate a report on the most recent
data. Historic reports can be created by choosing either a date/time range or the number of
days in the past to include in the report. You may display sample, hourly, or daily data for
either the latest or historic reports.
If the selection is saved, this customized report will show up under IBM Tivoli Storage
Productivity Center -> Reporting -> My Reports -> [Admin]'s Reports, where [Admin] is the
login name used to define the report. See the IBM Tivoli Storage Productivity Center 4.1.1
User's Guide for more information regarding this topic.
For switch performance reports, go to Fabric Manager -> Reporting -> Switch Performance.
A report may be created in a similar fashion as a subsystem report. The supported report types
are:
• By Switch
• By Port
4.3 Historic Charts
Historic charts show trends over time rather than periodic snapshots of performance. The key
is to monitor normal operations with key metrics, develop an understanding of expected
behaviors, and then track the behavior for either performance anomalies or simple growth in
the workload.
See Figure 3 for an example of the throughput chart (Total I/O Rate) for a DS8000. This data is
an hourly summary of the I/O rate for this DS8000 over the past day. The data can also easily
be exported into other formats for analysis.
4.4 Batch Reports
Batch reports are created under IBM Tivoli Storage Productivity Center -> My Reports ->
Batch Reports, by selecting a report type and the included columns for that report type. The
panel is similar to the tabular report panel in section 4.1 (see Figure 2) and features the same
options.
On the Options tab, select the agent on which to run the report (this determines the location of
the output file), and choose the type of output file to generate (see Figure 5), such as a CSV file
that may be imported into a spreadsheet, or an HTML file. Then choose when and how often
you want the job to run on the When to Run tab, and save the batch report. When the batch
report is run, the output file location is recorded in the batch job's log.
For additional information regarding batch reports, see the IBM Tivoli Storage Productivity
Center V4.1.1 info center:
http://publib.boulder.ibm.com/infocenter/tivihelp/v4r1/index.jsp?topic=/com.ibm.tpc_V411.doc/fqz0_c_batch_reports.html
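
Because CSV output is easy to post-process, a batch report can feed simple scripts. A sketch
follows; the file name and column headings are assumptions for illustration and must match
the columns actually selected in the batch report definition:

    # Rank the rows of an exported batch report by total I/O rate.
    import csv

    with open("subsystem_performance.csv", newline="") as f:
        rows = list(csv.DictReader(f))

    # Skip rows with empty or N/A metric values before converting to float.
    rows = [r for r in rows if r["Total I/O Rate (overall)"] not in ("", "N/A")]
    rows.sort(key=lambda r: float(r["Total I/O Rate (overall)"]), reverse=True)
    for r in rows[:10]:
        print(r["Volume"], r["Total I/O Rate (overall)"])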
Figure 5. Choose the agent to run the report on and the type of output to generate.
Figure 4. Constraint Violation Reports Options
4.6 Top Hit Reports
One more way to view performance reports for subsystem devices is to look at the top 25
volumes with the highest values for a given performance metric (for cache hits, the lowest).
Here are the reports available for subsystems under IBM Tivoli Storage Productivity Center ->
Reporting -> System Reports -> Disk:
• Top Active Volumes Cache Hit Performance
• Top Volumes Data Rate Performance
• Top Volumes Disk Performance
• Top Volumes I/O Rate Performance
• Top Volumes Response Performance
For example, the Top Volumes I/O Rate Performance report will show the 25 busiest volumes
by I/O rate. The main metrics shown in this report are:
• Overall read/write/total I/O rate
• Overall read/write/total data rate
• Read/write/overall response time
Similar top hit reports are available for switches under IBM Tivoli Storage Productivity Center
-> Reporting -> System Reports -> Fabric:
• Top Switch Ports Data Rate Performance
• Top Switch Ports Packet Rate Performance
Figure 6. Top hits reporting choices.
These reports help the user quickly look up the top hit volumes/ports for bottleneck analysis.
One caveat is that these top reports are based on the latest sample data, and in some cases
may not reflect a problem on a component over a longer period. For example, if the daily
average I/O rate is high for a volume but the last sample data is normal, the volume may not
show up on the top 25 reports. Another complication in storage performance data is data wrap,
that is, from one sample interval to the next, a metric value may appear extremely large; this
will also skew these top reports. It is also possible to see some volumes in these reports with
low or no I/O (0 or N/A values for their metrics) if fewer than 25 volumes have high I/O.
There are also other predefined reports. The same System Reports -> Disk node contains
reports such as "Array Performance" and "Subsystem Performance". Those predefined reports
are provided with the product and also show the latest sample data.
5. SAN Planner and Storage Optimizer
The SAN Planner helps plan the allocation of new volumes, taking into account:
• Total amount of space required
• Minimum number, maximum number, and sizes of volumes
• Workload requirements
• Contention from other workloads
Only subsystems that have been discovered and probed will show up in the SAN Planner. To use
the SAN Planner, the user needs to define the capacity and the workload profile of the new
volumes to be allocated. A few standard workload profiles are provided. Once performance data
has been collected, you can use historic performance data to define a profile to be used for new
volumes whose workloads will be similar to some existing volumes. See the following link for
more information: IBM Tivoli Storage Productivity Center User’s Guide, Chapter 4. Managing
storage resources, under the section titled “Planning and modifying storage configurations”
http://publib.boulder.ibm.com/infocenter/tivihelp/v4r1/topic/com.ibm.tpc_V411.doc/fqz0_usersguid
e_v411.pdf
While the SAN Planner tries to identify the RAID arrays or pools with the least workload in
order to recommend where to create new volumes, the Storage Optimizer examines existing
volumes to determine whether there are any performance bottlenecks. The Storage Optimizer
then goes through several scenarios to determine whether the performance bottlenecks can be
eliminated by moving the problem volumes to other pools. The Storage Optimizer supports
ESS, DS8000, DS6000, DS4000, and SVC in IBM Tivoli Storage Productivity Center 4.1.1.
Additional information regarding the Storage Optimizer may be found in the IBM Tivoli Storage
Productivity Center User’s Guide, Chapter 4. Managing storage resources, under the section
titled “Optimizing storage configurations”.
6. Summary
This paper gives an overview of the performance monitoring and management functions
available in IBM Tivoli Storage Productivity Center 4.1.1. It lays out the configuration steps
necessary to start a performance monitor, to set a threshold, and to generate useful reports
and charts for problem diagnosis. It also interprets a small number of performance metrics.
The reporting of those metrics can form the foundation for capacity planning and performance
tuning.
7. References
IBM Tivoli Storage Productivity Center Support Site
http://www-01.ibm.com/software/sysmgmt/products/support/IBMTotalStorageProductivityCenterStandardEdition.html
IBM Tivoli Storage Productivity Center Information Center
http://publib.boulder.ibm.com/infocenter/tivihelp/v4r1/index.jsp
IBM Tivoli Storage Productivity Center Installation and Configuration Guide
http://publib.boulder.ibm.com/infocenter/tivihelp/v4r1/topic/com.ibm.tpc_V411.doc/fqz0_t_installing_main.html
IBM Tivoli Storage Productivity Center V4.1.1 User's Guide
http://publib.boulder.ibm.com/infocenter/tivihelp/v4r1/topic/com.ibm.tpc_V411.doc/fqz0_usersguide_v411.pdf
IBM Tivoli Storage Productivity Center V4.1.1 Messages
http://publib.boulder.ibm.com/infocenter/tivihelp/v4r1/index.jsp?topic=/com.ibm.tpc_V411.doc/tpcmsg41122.html
IBM TotalStorage Productivity Center V3.1 Problem Determination Guide
http://publib.boulder.ibm.com/infocenter/tivihelp/v4r1/topic/com.ibm.itpc.doc/tpcpdg31.htm
IBM TotalStorage Productivity Center V3.3.2/4.1 Hints and Tips Guide
http://www-01.ibm.com/support/docview.wss?rs=40&context=SSBSEX&context=SSMN28&context=SSMMUP&context=SS8JB5&context=SS8JFM&dc=DB700&dc=DA4A10&uid=swg27008254&loc=en_US&cs=utf-8&lang=en
IBM TotalStorage Productivity Center V3.3 SAN Storage Provisioning Planner White Paper
ftp://ftp.software.ibm.com/common/ssi/sa/wh/n/tsw03026usen/TSW03026USEN.PDF
IBM Tivoli Storage Productivity Center V4.1 Storage Optimizer White Paper
http://www-01.ibm.com/support/docview.wss?uid=swg21389271
Appendix A Available Metrics
This table lists each metric's name and internal metric type ID, the types of components for
which the metric is available, and a description. The SMI-S BSP device type mentioned below
refers to any storage subsystem that is managed via a CIMOM which supports SMI-S 1.1 with
the Block Server Performance (BSP) subprofile.
Metrics that require specific versions of IBM Tivoli Storage Productivity Center are noted in
parentheses.
27
Metric Device/Component Description
Metric
Type Type
ESS/DS6K/DS8K write operations, for a particular
Subsystem component over a time interval.
SVC VDisk
SVC Node Note: SVC Node and SVC
SVC I/O Group Subsystem support requires
SVC MDisk Group v3.1.3 or above.
SVC Subsystem
SMI-S BSP Volume
SMI-S BSP Controller
SMI-S BSP Subsystem
ESS/DS6K/DS8K Volume Average number of I/O
ESS/DS6K/DS8K Array operations per second for non-
Total I/O Rate
807 ESS/DS6K/DS8K Controller sequential read and write
(normal)
ESS/DS6K/DS8K operations, for a particular
Subsystem component over a time interval.
ESS/DS6K/DS8K Volume Average number of I/O
ESS/DS6K/DS8K Array operations per second for
Total I/O Rate
808 ESS/DS6K/DS8K Controller sequential read and write
(sequential)
ESS/DS6K/DS8K operations, for a particular
Subsystem component over a time interval.
ESS/DS6K/DS8K Volume
ESS/DS6K/DS8K Array
Average number of I/O
ESS/DS6K/DS8K Controller
operations per second for both
ESS/DS6K/DS8K
sequential and non-sequential
Subsystem
read and write operations, for a
SVC VDisk
Total I/O Rate particular component over a
809 SVC Node
(overall) time interval.
SVC I/O Group
SVC MDisk Group
Note: SVC Node and SVC
SVC Subsystem
Subsystem support requires
SMI-S BSP Volume
v3.1.3 or above.
SMI-S BSP Controller
SMI-S BSP Subsystem
Average number of write
SVC VDisk
Global Mirror Write operations per second issued to
SVC Node
I/O Rate 937 the Global Mirror secondary
SVC I/O Group
(3.1.3) site, for a particular component
SVC Subsystem
over a time interval.
Average percentage of write
operations issued by the Global
Mirror primary site which were
serialized overlapping writes, for
Global Mirror SVC VDisk a particular component over a
Overlapping Write SVC Node time interval. For SVC 4.3.1
938
Percentage SVC I/O Group and later, some overlapping
(3.1.3) SVC Subsystem writes are processed in parallel
(are not serialized), so are
excluded. For earlier SVC
versions, all overlapping writes
were serialized.
Global Mirror SVC VDisk Average number of serialized
939
Overlapping Write SVC Node overlapping write operations per
28
Metric Device/Component Description
Metric
Type Type
I/O Rate SVC I/O Group second encountered by the
(3.1.3) SVC Subsystem Global Mirror primary site, for a
particular component over a
time interval. For SVC 4.3.1
and later, some overlapping
writes are processed in parallel
(are not serialized), so are
excluded. For earlier SVC
versions, all overlapping writes
were serialized.
Average number of read
operations per second that were
ESS/DS6K/DS8K Volume
issued via the High
ESS/DS6K/DS8K Array
HPF Read I/O Rate Performance FICON (HPF)
943 ESS/DS6K/DS8K Controller
(4.1.1) feature of the storage
ESS/DS6K/DS8K
subsystem, for a particular
Subsystem
component over a particular
time interval.
Average number of write
operations per second that were
ESS/DS6K/DS8K Volume
issued via the High
ESS/DS6K/DS8K Array
HPF Write I/O Rate Performance FICON (HPF)
944 ESS/DS6K/DS8K Controller
(4.1.1) feature of the storage
ESS/DS6K/DS8K
subsystem, for a particular
Subsystem
component over a particular
time interval.
Average number of read and
write operations per second that
ESS/DS6K/DS8K Volume
were issued via the High
ESS/DS6K/DS8K Array
Total HPF I/O Rate Performance FICON (HPF)
945 ESS/DS6K/DS8K Controller
(4.1.1) feature of the storage
ESS/DS6K/DS8K
subsystem, for a particular
Subsystem
component over a particular
time interval.
The percentage of all I/O
ESS/DS6K/DS8K Volume operations that were issued via
HPF I/O ESS/DS6K/DS8K Array the High Performance FICON
Percentage 946 ESS/DS6K/DS8K Controller (HPF) feature of the storage
(4.1.1) ESS/DS6K/DS8K subsystem for a particular
Subsystem component over a particular
time interval.
Average number of track
ESS/DS6K/DS8K Volume
transfer operations per second
PPRC Transfer ESS/DS6K/DS8K Array
for Peer-to-Peer Remote Copy
Rate 947 ESS/DS6K/DS8K Controller
usage, for a particular
(4.1.1) ESS/DS6K/DS8K
component over a particular
Subsystem
time interval.
Cache Hit Percentages
ESS/DS6K/DS8K Volume Percentage of cache hits for
Read Cache Hits ESS/DS6K/DS8K Array non-sequential read operations,
810
(normal) ESS/DS6K/DS8K Controller for a particular component over
ESS/DS6K/DS8K a time interval.
29
Metric Device/Component Description
Metric
Type Type
Subsystem
ESS/DS6K/DS8K Volume
Percentage of cache hits for
ESS/DS6K/DS8K Array
Read Cache Hits sequential read operations, for a
811 ESS/DS6K/DS8K Controller
(sequential) particular component over a
ESS/DS6K/DS8K
time interval.
Subsystem
ESS/DS6K/DS8K Volume
ESS/DS6K/DS8K Array
Percentage of cache hits for
ESS/DS6K/DS8K Controller
both sequential and non-
ESS/DS6K/DS8K
sequential read operations, for a
Subsystem
particular component over a
Read Cache Hits SVC VDisk
812 time interval.
(overall) SVC Node
SVC I/O Group
Note: SVC Node and SVC
SVC Subsystem
Subsystem support requires
SMI-S BSP Volume
v3.1.3 or above.
SMI-S BSP Controller
SMI-S BSP Subsystem
ESS/DS6K/DS8K Volume
Percentage of cache hits for
ESS/DS6K/DS8K Array
Write Cache Hits non-sequential write operations,
813 ESS/DS6K/DS8K Controller
(normal) for a particular component over
ESS/DS6K/DS8K
a time interval.
Subsystem
ESS/DS6K/DS8K Volume
Percentage of cache hits for
ESS/DS6K/DS8K Array
Write Cache Hits sequential write operations, for
814 ESS/DS6K/DS8K Controller
(sequential) a particular component over a
ESS/DS6K/DS8K
time interval.
Subsystem
ESS/DS6K/DS8K Volume
ESS/DS6K/DS8K Array
Percentage of cache hits for
ESS/DS6K/DS8K Controller
both sequential and non-
ESS/DS6K/DS8K
sequential write operations, for
Subsystem
a particular component over a
Write Cache Hits SVC VDisk
815 time interval.
(overall) SVC Node
SVC I/O Group
Note: SVC Node and SVC
SVC Subsystem
Subsystem support requires
SMI-S BSP Volume
v3.1.3 or above.
SMI-S BSP Controller
SMI-S BSP Subsystem
ESS/DS6K/DS8K Volume
Percentage of cache hits for
ESS/DS6K/DS8K Array
Total Cache Hits non-sequential read and write
816 ESS/DS6K/DS8K Controller
(normal) operations, for a particular
ESS/DS6K/DS8K
component over a time interval.
Subsystem
ESS/DS6K/DS8K Volume
Percentage of cache hits for
ESS/DS6K/DS8K Array
Total Cache Hits sequential read and write
817 ESS/DS6K/DS8K Controller
(sequential) operations, for a particular
ESS/DS6K/DS8K
component over a time interval.
Subsystem
Total Cache Hits ESS/DS6K/DS8K Volume Percentage of cache hits for
818
(overall) ESS/DS6K/DS8K Array both sequential and non-
30
Metric Device/Component Description
Metric
Type Type
ESS/DS6K/DS8K Controller sequential read and write
ESS/DS6K/DS8K operations, for a particular
Subsystem component over a time interval.
SVC VDisk
SVC Node Note: SVC Node and SVC
SVC I/O Group Subsystem support requires
SVC Subsystem v3.1.3 or above.
SMI-S BSP Volume
SMI-S BSP Controller
SMI-S BSP Subsystem
Readahead SVC VDisk Percentage of all read cache
Percentage of SVC Node hits which occurred on
890
Cache Hits SVC I/O Group prestaged data, for a particular
(3.1.3) SVC Subsystem component over a time interval.
Percentage of all write cache
Dirty-Write SVC VDisk
hits which occurred on already
Percentage of SVC Node
891 dirty data in the cache, for a
Cache Hits SVC I/O Group
particular component over a
(3.1.3) SVC Subsystem
time interval.
Data Rates
ESS/DS6K/DS8K Volume
ESS/DS6K/DS8K Array
ESS/DS6K/DS8K Controller Average number of megabytes
ESS/DS6K/DS8K (2^20 bytes) per second that
Subsystem were transferred for read
SVC VDisk operations, for a particular
Read Data Rate 819 SVC Node component over a time interval.
SVC I/O Group
SVC MDisk Group Note: SVC Node and SVC
SVC Subsystem Subsystem support requires
SMI-S BSP Volume v3.1.3 or above.
SMI-S BSP Controller
SMI-S BSP Subsystem
ESS/DS6K/DS8K Volume
ESS/DS6K/DS8K Array
ESS/DS6K/DS8K Controller Average number of megabytes
ESS/DS6K/DS8K (2^20 bytes) per second that
Subsystem were transferred for write
SVC VDisk operations, for a particular
Write Data Rate 820 SVC Node component over a time interval.
SVC I/O Group
SVC MDisk Group Note: SVC Node and SVC
SVD Subsystem Subsystem support requires
SMI-S BSP Volume v3.1.3 or above.
SMI-S BSP Controller
SMI-S BSP Subsystem
ESS/DS6K/DS8K Volume Average number of megabytes
ESS/DS6K/DS8K Array (2^20 bytes) per second that
ESS/DS6K/DS8K Controller were transferred for read and
Total Data Rate 821
ESS/DS6K/DS8K write operations, for a particular
Subsystem component over a time interval.
SVC VDisk
31
Metric Device/Component Description
Metric
Type Type
SVC Node Note: SVC Node and SVC
SVC I/O Group Subsystem support requires
SVC MDisk Group v3.1.3 or above.
SVC Subsystem
SMI-S BSP Volume
SMI-S BSP Controller
SMI-S BSP Subsystem
Response Times
ESS/DS6K/DS8K Volume
ESS/DS6K/DS8K Array
ESS/DS6K/DS8K Controller Average number of milliseconds
ESS/DS6K/DS8K that it took to service each read
Subsystem operation, for a particular
SVC VDisk component over a time interval.
Read Response
822 SVC Node
Time
SVC I/O Group Note: SVC VDisk, Node, I/O
SVC MDisk Group Group, MDisk Group, and
SVC Subsystem Subsystem support requires
SMI-S BSP Volume v3.1.3 or above.
SMI-S BSP Controller
SMI-S BSP Subsystem
ESS/DS6K/DS8K Volume
ESS/DS6K/DS8K Array
ESS/DS6K/DS8K Controller Average number of milliseconds
ESS/DS6K/DS8K that it took to service each write
Subsystem operation, for a particular
SVC VDisk component over a time interval.
Write Response
823 SVC Node
Time
SVC I/O Group Note: SVC VDisk, Node, I/O
SVC MDisk Group Group, MDisk Group, and
SVC Subsystem Subsystem support requires
SMI-S BSP Volume v3.1.3 or above.
SMI-S BSP Controller
SMI-S BSP Subsystem
ESS/DS6K/DS8K Volume
ESS/DS6K/DS8K Array
Average number of milliseconds
ESS/DS6K/DS8K Controller
that it took to service each I/O
ESS/DS6K/DS8K
operation (read and write), for a
Subsystem
particular component over a
SVC VDisk
Overall Response time interval.
824 SVC Node
Time
SVC I/O Group
Note: SVC VDisk, Node, I/O
SVC MDisk Group
Group, MDisk Group, and
SVC Subsystem
Subsystem support requires
SMI-S BSP Volume
v3.1.3 or above.
SMI-S BSP Controller
SMI-S BSP Subsystem
SVC VDisk The peak (worst) response time
Peak Read
SVC Node among all read operations, for a
Response Time 940
SVC I/O Group particular component over a
(3.1.3)
SVC Subsystem time interval.
Peak Write 941 SVC VDisk The peak (worst) response time
32
Metric Device/Component Description
Metric
Type Type
Response Time SVC Node among all write operations, for a
(3.1.3) SVC I/O Group particular component over a
SVC Subsystem time interval.
The average number of
additional milliseconds it took to
Global Mirror SVC VDisk service each secondary write
Secondary Write SVC Node operation for Global Mirror,
942
Lag SVC I/O Group beyond the time needed to
(3.1.3) SVC Subsystem service the primary writes, for a
particular component over a
particular time interval.
This is the percentage of the
average response time
(read+write) which can be
attributed to delays from the
host systems. This is provided
as an aid to diagnose slow
Overall Host
SVC VDisk hosts and poorly performing
Attributed
SVC Node fabrics. This value is based on
Response Time 948
SVC I/O Group the time taken for hosts to
Percentage
SVC Subsystem respond to transfer-ready
(4.1.1)
notifications from the SVC
nodes (for read) and the time
taken for hosts to send the write
data after the node has
responded to a transfer-ready
notification (for write).
Transfer Sizes
ESS/DS6K/DS8K Volume
ESS/DS6K/DS8K Array
ESS/DS6K/DS8K Controller
Average number of KB per I/O
ESS/DS6K/DS8K
for read operations, for a
Subsystem
particular component over a
SVC VDisk
time interval.
Read Transfer Size 825 SVC Node
SVC I/O Group
Note: SVC Node and SVC
SVC MDisk Group
Subsystem support requires
SVC Subsystem
v3.1.3 or above.
SMI-S BSP Volume
SMI-S BSP Controller
SMI-S BSP Subsystem
ESS/DS6K/DS8K Volume
ESS/DS6K/DS8K Array
Average number of KB per I/O
ESS/DS6K/DS8K Controller
for write operations, for a
ESS/DS6K/DS8K
particular component over a
Subsystem
time interval.
Write Transfer Size 826 SVC VDisk
SVC Node
Note: SVC Node and SVC
SVC I/O Group
Subsystem support requires
SVC MDisk Group
v3.1.3 or above.
SVC Subsystem
SMI-S BSP Volume
33
Metric Device/Component Description
Metric
Type Type
SMI-S BSP Controller
SMI-S BSP Subsystem
ESS/DS6K/DS8K Volume
ESS/DS6K/DS8K Array
ESS/DS6K/DS8K Controller
Average number of KB per I/O
ESS/DS6K/DS8K
for read and write operations,
Subsystem
for a particular component over
SVC VDisk
Overall Transfer a time interval.
827 SVC Node
Size
SVC I/O Group
Note: SVC Node and SVC
SVC MDisk Group
Subsystem support requires
SVC Subsystem
v3.1.3 or above.
SMI-S BSP Volume
SMI-S BSP Controller
SMI-S BSP Subsystem
Record Mode Reads
Average number of I/O
ESS/DS6K/DS8K Volume operations per second for
Record Mode Read
828 ESS/DS6K/DS8K Array record mode read operations,
I/O Rate
ESS/DS6K/DS8K Controller for a particular component over
a time interval.
Percentage of cache hits for
ESS/DS6K/DS8K Volume
Record Mode Read record mode read operations,
829 ESS/DS6K/DS8K Array
Cache Hits for a particular component over
ESS/DS6K/DS8K Controller
a time interval.
Cache Transfers
Average number of I/O
ESS/DS6K/DS8K Volume operations (track transfers) per
ESS/DS6K/DS8K Array second for disk to cache
ESS/DS6K/DS8K Controller transfers, for a particular
Disk to Cache I/O
830 SVC VDisk component over a time interval.
Rate
SVC Node
SVC I/O Group Note: SVC VDisk, Node, I/O
SVC Subsystem Group, and Subsystem support
requires v3.1.3 or above.
Average number of I/O
ESS/DS6K/DS8K Volume operations (track transfers) per
ESS/DS6K/DS8K Array second for cache to disk
ESS/DS6K/DS8K Controller transfers, for a particular
Cache to Disk I/O
831 SVC VDisk component over a time interval.
Rate
SVC Node
SVC I/O Group Note: SVC VDisk, Node, I/O
SVC Subsystem Group, and Subsystem support
requires v3.1.3 or above.
Write-cache Constraints
ESS/DS6K/DS8K Volume Percentage of I/O operations
ESS/DS6K/DS8K Array that were delayed due to write-
ESS/DS6K/DS8K Controller cache space constraints or
Write-cache Delay
832 ESS/DS6K/DS8K other conditions, for a particular
Percentage
Subsystem component over a time interval.
SVC VDisk (The ratio of delayed operations
SVC Node to total I/Os.)
34
Metric Device/Component Description
Metric
Type Type
SVC I/O Group
SVC Subsystem Note: SVC VDisk, Node, I/O
Group, and Subsystem support
requires v3.1.3 or above.
Average number of I/O
ESS/DS6K/DS8K Volume
operations per second that were
ESS/DS6K/DS8K Array
delayed due to write-cache
ESS/DS6K/DS8K Controller
space constraints or other
ESS/DS6K/DS8K
Write-cache Delay conditions, for a particular
833 Subsystem
I/O Rate component over a time interval.
SVC VDisk
SVC Node
Note: SVC VDisk, Node, I/O
SVC I/O Group
Group, and Subsystem support
SVC Subsystem
requires v3.1.3 or above.
Percentage of write operations
Write-cache SVC VDisk
that were delayed due to lack of
Overflow SVC Node
894 write-cache space, for a
Percentage SVC I/O Group
particular component over a
(3.1.3) SVC Subsystem
time interval.
Average number of tracks per
SVC VDisk
Write-cache second that were delayed due
SVC Node
Overflow I/O Rate 895 to lack of write-cache space, for
SVC I/O Group
(3.1.3) a particular component over a
SVC Subsystem
time interval.
Percentage of write operations
Write-cache Flush- SVC VDisk
that were processed in Flush-
through SVC Node
896 through write mode, for a
Percentage SVC I/O Group
particular component over a
(3.1.3) SVC Subsystem
time interval.
Average number of tracks per
SVC VDisk
Write-cache Flush- second that were processed in
SVC Node
through I/O Rate 897 Flush-through write mode, for a
SVC I/O Group
(3.1.3) particular component over a
SVC Subsystem
time interval.
Percentage of write operations
Write-cache Write- SVC VDisk
that were processed in Write-
through SVC Node
898 through write mode, for a
Percentage SVC I/O Group
particular component over a
(3.1.3) SVC Subsystem
time interval.
Average number of tracks per
SVC VDisk
Write-cache Write- second that were processed in
SVC Node
through I/O Rate 899 Write-through write mode, for a
SVC I/O Group
(3.1.3) particular component over a
SVC Subsystem
time interval.
Miscellaneous Computed Values

Cache Holding Time (834)
  Components: ESS/DS6K/DS8K Controller, Subsystem
  Description: Average cache holding time, in seconds, for I/O data in this subsystem controller (cluster). Shorter time periods indicate adverse performance.

CPU Utilization (900; 3.1.3)
  Components: SVC Node, I/O Group, Subsystem
  Description: Average utilization percentage of the CPU(s) for a particular component over a time interval.

Non-Preferred Node Usage Percentage (949; 4.1.1)
  Components: SVC VDisk, I/O Group
  Description: The overall percentage of I/O performed or data transferred via the non-preferred nodes of the VDisks, for a particular component over a particular time interval.

Volume Utilization (978; 4.1.1)
  Components: ESS/DS6K/DS8K Volume; SVC VDisk
  Description: The approximate utilization percentage of a particular volume over a time interval, i.e. the average amount of time that the volume was busy reading or writing data.
Backend Read Data Rate (838)
  Components: ESS/DS6K/DS8K Rank, Array, Controller, Subsystem; SVC MDisk, MDisk Group, Node, I/O Group, Subsystem
  Description: Average number of megabytes (2^20 bytes) per second that were transferred for read operations, for a particular component over a time interval.
  Note: SVC Node, I/O Group, and Subsystem support requires v3.1.3 or above.

Backend Write Data Rate (839)
  Components: ESS/DS6K/DS8K Rank, Array, Controller, Subsystem; SVC MDisk, MDisk Group, Node, I/O Group, Subsystem
  Description: Average number of megabytes (2^20 bytes) per second that were transferred for write operations, for a particular component over a time interval.
  Note: SVC Node, I/O Group, and Subsystem support requires v3.1.3 or above.

Total Backend Data Rate (840)
  Components: ESS/DS6K/DS8K Rank, Array, Controller, Subsystem; SVC MDisk, MDisk Group, Node, I/O Group, Subsystem
  Description: Average number of megabytes (2^20 bytes) per second that were transferred for read and write operations, for a particular component over a time interval.
  Note: SVC Node, I/O Group, and Subsystem support requires v3.1.3 or above.
Response Times

Backend Read Response Time (841)
  Components: ESS/DS6K/DS8K Rank, Array, Controller, Subsystem; SVC MDisk, MDisk Group, Node, I/O Group, Subsystem
  Description: Average number of milliseconds that it took to service each read operation, for a particular component over a time interval. For SVC, this is the external response time of the MDisks.
  Note: SVC Node, I/O Group, and Subsystem support requires v3.1.3 or above.

Backend Write Response Time (842)
  Components: ESS/DS6K/DS8K Rank, Array, Controller, Subsystem; SVC MDisk, MDisk Group, Node, I/O Group, Subsystem
  Description: Average number of milliseconds that it took to service each write operation, for a particular component over a time interval. For SVC, this is the external response time of the MDisks.
  Note: SVC Node, I/O Group, and Subsystem support requires v3.1.3 or above.

Overall Backend Response Time (843)
  Components: ESS/DS6K/DS8K Rank, Array, Controller, Subsystem; SVC MDisk, MDisk Group, Node, I/O Group, Subsystem
  Description: Average number of milliseconds that it took to service each I/O operation (read and write), for a particular component over a time interval. For SVC, this is the external response time of the MDisks.
  Note: SVC Node, I/O Group, and Subsystem support requires v3.1.3 or above.

Backend Read Queue Time (844)
  Components: SVC MDisk, MDisk Group, Node, I/O Group, Subsystem
  Description: Average number of milliseconds that each read operation spent on the queue before being issued to the backend device, for a particular MDisk or MDisk Group over a time interval.
  Note: SVC Node, I/O Group, and Subsystem support requires v3.1.3 or above.

Backend Write Queue Time (845)
  Components: SVC MDisk, MDisk Group, Node, I/O Group, Subsystem
  Description: Average number of milliseconds that each write operation spent on the queue before being issued to the backend device, for a particular MDisk or MDisk Group over a time interval.
  Note: SVC Node, I/O Group, and Subsystem support requires v3.1.3 or above.

Overall Backend Queue Time (846)
  Components: SVC MDisk, MDisk Group, Node, I/O Group, Subsystem
  Description: Average number of milliseconds that each I/O operation (read and write) spent on the queue before being issued to the backend device, for a particular MDisk or MDisk Group over a time interval.
  Note: SVC Node, I/O Group, and Subsystem support requires v3.1.3 or above.
Peak Backend Read Response Time (950; 4.1.1)
  Components: SVC MDisk, MDisk Group, Node, I/O Group, Subsystem
  Description: The peak (worst) response time among all read operations, for a particular component over a time interval. For SVC, this is the external response time of the MDisks.

Peak Backend Write Response Time (951; 4.1.1)
  Components: SVC MDisk, MDisk Group, Node, I/O Group, Subsystem
  Description: The peak (worst) response time among all write operations, for a particular component over a time interval. For SVC, this is the external response time of the MDisks.

Peak Backend Read Queue Time (952; 4.1.1)
  Components: SVC MDisk, MDisk Group, Node, I/O Group, Subsystem
  Description: The lower bound on the peak (worst) queue time for read operations, for a particular component over a time interval. The queue time is the amount of time that the read operation spent on the queue before being issued to the backend device.

Peak Backend Write Queue Time (953; 4.1.1)
  Components: SVC MDisk, MDisk Group, Node, I/O Group, Subsystem
  Description: The lower bound on the peak (worst) queue time for write operations, for a particular component over a time interval. The queue time is the amount of time that the write operation spent on the queue before being issued to the backend device.
Transfer Sizes

Backend Read Transfer Size (847)
  Components: ESS/DS6K/DS8K Rank, Array, Controller, Subsystem; SVC MDisk, MDisk Group, Node, I/O Group, Subsystem
  Description: Average number of KB per I/O for read operations, for a particular component over a time interval.
  Note: SVC Node, I/O Group, and Subsystem support requires v3.1.3 or above.

Backend Write Transfer Size (848)
  Components: ESS/DS6K/DS8K Rank, Array, Controller, Subsystem; SVC MDisk, MDisk Group, Node, I/O Group, Subsystem
  Description: Average number of KB per I/O for write operations, for a particular component over a time interval.
  Note: SVC Node, I/O Group, and Subsystem support requires v3.1.3 or above.

Overall Backend Transfer Size (849)
  Components: ESS/DS6K/DS8K Rank, Array, Controller, Subsystem; SVC MDisk, MDisk Group, Node, I/O Group, Subsystem
  Description: Average number of KB per I/O for read and write operations, for a particular component over a time interval.
  Note: SVC Node, I/O Group, and Subsystem support requires v3.1.3 or above.
Disk Utilization

Disk Utilization Percentage (850)
  Components: ESS/DS6K/DS8K Array
  Description: The approximate utilization percentage of a particular rank over a time interval, i.e. the average percent of time that the disks associated with the array were busy.

Sequential I/O Percentage (851)
  Components: ESS/DS6K/DS8K Array
  Description: Percent of all I/O operations performed for a particular array over a time interval that were sequential operations.
Total Port Packet Rate (857)
  Components: Switch Port, Switch
  Description: Average number of packets per second for send and receive operations, for a particular port over a time interval.
Port to Host Send I/O Rate (901; 3.1.3)
  Components: SVC Port, Node, I/O Group, Subsystem
  Description: Average number of exchanges (I/Os) per second sent to host computers by a particular component over a time interval.

Port to Host Receive I/O Rate (902; 3.1.3)
  Components: SVC Port, Node, I/O Group, Subsystem
  Description: Average number of exchanges (I/Os) per second received from host computers by a particular component over a time interval.

Total Port to Host I/O Rate (903; 3.1.3)
  Components: SVC Port, Node, I/O Group, Subsystem
  Description: Average number of exchanges (I/Os) per second transmitted between host computers and a particular component over a time interval.

Port to Disk Send I/O Rate (904; 3.1.3)
  Components: SVC Port, Node, I/O Group, Subsystem
  Description: Average number of exchanges (I/Os) per second sent to storage subsystems by a particular component over a time interval.

Port to Disk Receive I/O Rate (905; 3.1.3)
  Components: SVC Port, Node, I/O Group, Subsystem
  Description: Average number of exchanges (I/Os) per second received from storage subsystems by a particular component over a time interval.

Total Port to Disk I/O Rate (906; 3.1.3)
  Components: SVC Port, Node, I/O Group, Subsystem
  Description: Average number of exchanges (I/Os) per second transmitted between storage subsystems and a particular component over a time interval.

Port to Local Node Send I/O Rate (907; 3.1.3)
  Components: SVC Port, Node, I/O Group, Subsystem
  Description: Average number of exchanges (I/Os) per second sent to other nodes in the local SVC cluster by a particular component over a time interval.

Port to Local Node Receive I/O Rate (908; 3.1.3)
  Components: SVC Port, Node, I/O Group, Subsystem
  Description: Average number of exchanges (I/Os) per second received from other nodes in the local SVC cluster by a particular component over a time interval.

Total Port to Local Node I/O Rate (909; 3.1.3)
  Components: SVC Port, Node, I/O Group, Subsystem
  Description: Average number of exchanges (I/Os) per second transmitted between other nodes in the local SVC cluster and a particular component over a time interval.

Port to Remote Node Send I/O Rate (910; 3.1.3)
  Components: SVC Port, Node, I/O Group, Subsystem
  Description: Average number of exchanges (I/Os) per second sent to nodes in the remote SVC cluster by a particular component over a time interval.
Port to Remote Node Receive I/O Rate (911; 3.1.3)
  Components: SVC Port, Node, I/O Group, Subsystem
  Description: Average number of exchanges (I/Os) per second received from nodes in the remote SVC cluster by a particular component over a time interval.

Total Port to Remote Node I/O Rate (912; 3.1.3)
  Components: SVC Port, Node, I/O Group, Subsystem
  Description: Average number of exchanges (I/Os) per second transmitted between nodes in the remote SVC cluster and a particular component over a time interval.

Port FCP Send I/O Rate (979; 4.1.1)
  Components: ESS/DS6K/DS8K Port
  Description: Average number of send operations per second using the FCP protocol, for a particular port over a time interval.

Port FCP Receive I/O Rate (980; 4.1.1)
  Components: ESS/DS6K/DS8K Port
  Description: Average number of receive operations per second using the FCP protocol, for a particular port over a time interval.

Total Port FCP I/O Rate (981; 4.1.1)
  Components: ESS/DS6K/DS8K Port
  Description: Average number of send and receive operations per second using the FCP protocol, for a particular port over a time interval.

Port FICON Send I/O Rate (954; 4.1.1)
  Components: ESS/DS6K/DS8K Port
  Description: Average number of send operations per second using the FICON protocol, for a particular port over a time interval.

Port FICON Receive I/O Rate (955; 4.1.1)
  Components: ESS/DS6K/DS8K Port
  Description: Average number of receive operations per second using the FICON protocol, for a particular port over a time interval.

Total Port FICON I/O Rate (956; 4.1.1)
  Components: ESS/DS6K/DS8K Port
  Description: Average number of send and receive operations per second using the FICON protocol, for a particular port over a time interval.

Port PPRC Send I/O Rate (957; 4.1.1)
  Components: ESS/DS6K/DS8K Port, Subsystem
  Description: Average number of send operations per second for Peer-to-Peer Remote Copy usage, for a particular port over a time interval.

Port PPRC Receive I/O Rate (958; 4.1.1)
  Components: ESS/DS6K/DS8K Port, Subsystem
  Description: Average number of receive operations per second for Peer-to-Peer Remote Copy usage, for a particular port over a time interval.

Total Port PPRC I/O Rate (959; 4.1.1)
  Components: ESS/DS6K/DS8K Port, Subsystem
  Description: Average number of send and receive operations per second for Peer-to-Peer Remote Copy usage, for a particular port over a time interval.
Data Rates

Port Send Data Rate (858)
  Components: ESS/DS6K/DS8K Port, Subsystem; SVC Port, Node, I/O Group, Subsystem; SMI-S BSP Port; Switch Port, Switch
  Description: Average number of megabytes (2^20 bytes) per second that were transferred for send (read) operations, for a particular port over a time interval.
  Note: ESS/DS6K/DS8K Subsystem and SVC Port, Node, I/O Group, and Subsystem support requires v3.1.3 or above; SMI-S BSP Port support requires v3.3.

Port Receive Data Rate (859)
  Components: ESS/DS6K/DS8K Port, Subsystem; SVC Port, Node, I/O Group, Subsystem; SMI-S BSP Port; Switch Port, Switch
  Description: Average number of megabytes (2^20 bytes) per second that were transferred for receive (write) operations, for a particular port over a time interval.
  Note: ESS/DS6K/DS8K Subsystem and SVC Port, Node, I/O Group, and Subsystem support requires v3.1.3 or above; SMI-S BSP Port support requires v3.3.

Total Port Data Rate (860)
  Components: ESS/DS6K/DS8K Port, Subsystem; SVC Port, Node, I/O Group, Subsystem; SMI-S BSP Port; Switch Port, Switch
  Description: Average number of megabytes (2^20 bytes) per second that were transferred for send and receive operations, for a particular port over a time interval.
  Note: ESS/DS6K/DS8K Subsystem and SVC Port, Node, I/O Group, and Subsystem support requires v3.1.3 or above; SMI-S BSP Port support requires v3.3.

Port Peak Send Data Rate (861)
  Components: Switch Port, Switch
  Description: Peak number of megabytes (2^20 bytes) per second that were sent by a particular port over a time interval.

Port Peak Receive Data Rate (862)
  Components: Switch Port, Switch
  Description: Peak number of megabytes (2^20 bytes) per second that were received by a particular port over a time interval.

Port to Host Send Data Rate (913; 3.1.3)
  Components: SVC Port, Node, I/O Group, Subsystem
  Description: Average number of megabytes (2^20 bytes) per second sent to host computers by a particular component over a time interval.
Port to Host Receive Data Rate (914; 3.1.3)
  Components: SVC Port, Node, I/O Group, Subsystem
  Description: Average number of megabytes (2^20 bytes) per second received from host computers by a particular component over a time interval.

Total Port to Host Data Rate (915; 3.1.3)
  Components: SVC Port, Node, I/O Group, Subsystem
  Description: Average number of megabytes (2^20 bytes) per second transmitted between host computers and a particular component over a time interval.

Port to Disk Send Data Rate (916; 3.1.3)
  Components: SVC Port, Node, I/O Group, Subsystem
  Description: Average number of megabytes (2^20 bytes) per second sent to storage subsystems by a particular component over a time interval.

Port to Disk Receive Data Rate (917; 3.1.3)
  Components: SVC Port, Node, I/O Group, Subsystem
  Description: Average number of megabytes (2^20 bytes) per second received from storage subsystems by a particular component over a time interval.

Total Port to Disk Data Rate (918; 3.1.3)
  Components: SVC Port, Node, I/O Group, Subsystem
  Description: Average number of megabytes (2^20 bytes) per second transmitted between storage subsystems and a particular component over a time interval.

Port to Local Node Send Data Rate (919; 3.1.3)
  Components: SVC Port, Node, I/O Group, Subsystem
  Description: Average number of megabytes (2^20 bytes) per second sent to other nodes in the local SVC cluster by a particular component over a time interval.

Port to Local Node Receive Data Rate (920; 3.1.3)
  Components: SVC Port, Node, I/O Group, Subsystem
  Description: Average number of megabytes (2^20 bytes) per second received from other nodes in the local SVC cluster by a particular component over a time interval.

Total Port to Local Node Data Rate (921; 3.1.3)
  Components: SVC Port, Node, I/O Group, Subsystem
  Description: Average number of megabytes (2^20 bytes) per second transmitted between other nodes in the local SVC cluster and a particular component over a time interval.

Port to Remote Node Send Data Rate (922; 3.1.3)
  Components: SVC Port, Node, I/O Group, Subsystem
  Description: Average number of megabytes (2^20 bytes) per second sent to nodes in the remote SVC cluster by a particular component over a time interval.

Port to Remote Node Receive Data Rate (923; 3.1.3)
  Components: SVC Port, Node, I/O Group, Subsystem
  Description: Average number of megabytes (2^20 bytes) per second received from nodes in the remote SVC cluster by a particular component over a time interval.

Total Port to Remote Node Data Rate (924; 3.1.3)
  Components: SVC Port, Node, I/O Group, Subsystem
  Description: Average number of megabytes (2^20 bytes) per second transmitted between nodes in the remote SVC cluster and a particular component over a time interval.
Port FCP Send Data Rate (982; 4.1.1)
  Components: ESS/DS6K/DS8K Port
  Description: Average number of megabytes (2^20 bytes) per second sent using the FCP protocol, for a particular port over a time interval.

Port FCP Receive Data Rate (983; 4.1.1)
  Components: ESS/DS6K/DS8K Port
  Description: Average number of megabytes (2^20 bytes) per second received using the FCP protocol, for a particular port over a time interval.

Total Port FCP Data Rate (984; 4.1.1)
  Components: ESS/DS6K/DS8K Port
  Description: Average number of megabytes (2^20 bytes) per second sent or received using the FCP protocol, for a particular port over a time interval.

Port FICON Send Data Rate (960; 4.1.1)
  Components: ESS/DS6K/DS8K Port
  Description: Average number of megabytes (2^20 bytes) per second sent using the FICON protocol, for a particular port over a time interval.

Port FICON Receive Data Rate (961; 4.1.1)
  Components: ESS/DS6K/DS8K Port
  Description: Average number of megabytes (2^20 bytes) per second received using the FICON protocol, for a particular port over a time interval.

Total Port FICON Data Rate (962; 4.1.1)
  Components: ESS/DS6K/DS8K Port
  Description: Average number of megabytes (2^20 bytes) per second sent or received using the FICON protocol, for a particular port over a time interval.

Port PPRC Send Data Rate (963; 4.1.1)
  Components: ESS/DS6K/DS8K Port, Subsystem
  Description: Average number of megabytes (2^20 bytes) per second sent for Peer-to-Peer Remote Copy usage, for a particular port over a time interval.

Port PPRC Receive Data Rate (964; 4.1.1)
  Components: ESS/DS6K/DS8K Port, Subsystem
  Description: Average number of megabytes (2^20 bytes) per second received for Peer-to-Peer Remote Copy usage, for a particular port over a time interval.

Total Port PPRC Data Rate (965; 4.1.1)
  Components: ESS/DS6K/DS8K Port, Subsystem
  Description: Average number of megabytes (2^20 bytes) per second transferred for Peer-to-Peer Remote Copy usage, for a particular port over a time interval.
Response Times

Port Send Response Time (863)
  Components: ESS/DS6K/DS8K Port, Subsystem
  Description: Average number of milliseconds that it took to service each send (read) operation, for a particular port over a time interval.
  Note: ESS/DS6K/DS8K Subsystem support requires v3.1.3 or above.

Port Receive Response Time (864)
  Components: ESS/DS6K/DS8K Port, Subsystem
  Description: Average number of milliseconds that it took to service each receive (write) operation, for a particular port over a time interval.
  Note: ESS/DS6K/DS8K Subsystem support requires v3.1.3 or above.

Overall Port Response Time (865)
  Components: ESS/DS6K/DS8K Port, Subsystem
  Description: Average number of milliseconds that it took to service each operation (send and receive), for a particular port over a time interval.
  Note: ESS/DS6K/DS8K Subsystem support requires v3.1.3 or above.
Port to Local Node Send Response Time (925; 3.1.3)
  Components: SVC Node, I/O Group, Subsystem
  Description: Average number of milliseconds it took to service each send operation to another node in the local SVC cluster, for a particular component over a time interval. For SVC, this is the external response time of the transfers.

Port to Local Node Receive Response Time (926; 3.1.3)
  Components: SVC Node, I/O Group, Subsystem
  Description: Average number of milliseconds it took to service each receive operation from another node in the local SVC cluster, for a particular component over a time interval. For SVC, this is the external response time of the transfers.

Total Port to Local Node Response Time (927; 3.1.3)
  Components: SVC Node, I/O Group, Subsystem
  Description: Average number of milliseconds it took to service each send or receive operation between another node in the local SVC cluster and a particular component over a time interval. For SVC, this is the external response time of the transfers.

Port to Local Node Send Queued Time (928; 3.1.3)
  Components: SVC Node, I/O Group, Subsystem
  Description: Average number of milliseconds that each send operation issued to another node in the local SVC cluster spent on the queue before being issued, for a particular component over a time interval.

Port to Local Node Receive Queued Time (929; 3.1.3)
  Components: SVC Node, I/O Group, Subsystem
  Description: Average number of milliseconds that each receive operation from another node in the local SVC cluster spent on the queue before being issued, for a particular component over a time interval.

Total Port to Local Node Queued Time (930; 3.1.3)
  Components: SVC Node, I/O Group, Subsystem
  Description: Average number of milliseconds that each operation issued to another node in the local SVC cluster spent on the queue before being issued, for a particular component over a time interval.

Port to Remote Node Send Response Time (931; 3.1.3)
  Components: SVC Node, I/O Group, Subsystem
  Description: Average number of milliseconds it took to service each send operation to a node in the remote SVC cluster, for a particular component over a time interval. For SVC, this is the external response time of the transfers.

Port to Remote Node Receive Response Time (932; 3.1.3)
  Components: SVC Node, I/O Group, Subsystem
  Description: Average number of milliseconds it took to service each receive operation from a node in the remote SVC cluster, for a particular component over a time interval. For SVC, this is the external response time of the transfers.

Total Port to Remote Node Response Time (933; 3.1.3)
  Components: SVC Node, I/O Group, Subsystem
  Description: Average number of milliseconds it took to service each send or receive operation between a node in the remote SVC cluster and a particular component over a time interval. For SVC, this is the external response time of the transfers.

Port to Remote Node Send Queued Time (934; 3.1.3)
  Components: SVC Node, I/O Group, Subsystem
  Description: Average number of milliseconds that each send operation issued to a node in the remote SVC cluster spent on the queue before being issued, for a particular component over a time interval.

Port to Remote Node Receive Queued Time (935; 3.1.3)
  Components: SVC Node, I/O Group, Subsystem
  Description: Average number of milliseconds that each receive operation from a node in the remote SVC cluster spent on the queue before being issued, for a particular component over a time interval.

Total Port to Remote Node Queued Time (936; 3.1.3)
  Components: SVC Node, I/O Group, Subsystem
  Description: Average number of milliseconds that each operation issued to a node in the remote SVC cluster spent on the queue before being issued, for a particular component over a time interval.
Port FCP Send Response Time (985; 4.1.1)
  Components: ESS/DS6K/DS8K Port
  Description: Average number of milliseconds it took to service all send operations using the FCP protocol, for a particular port over a time interval.

Port FCP Receive Response Time (986; 4.1.1)
  Components: ESS/DS6K/DS8K Port
  Description: Average number of milliseconds it took to service all receive operations using the FCP protocol, for a particular port over a time interval.

Overall Port FCP Response Time (987; 4.1.1)
  Components: ESS/DS6K/DS8K Port
  Description: Average number of milliseconds it took to service all I/O operations using the FCP protocol, for a particular port over a time interval.

Port FICON Send Response Time (966; 4.1.1)
  Components: ESS/DS6K/DS8K Port
  Description: Average number of milliseconds it took to service all send operations using the FICON protocol, for a particular port over a time interval.

Port FICON Receive Response Time (967; 4.1.1)
  Components: ESS/DS6K/DS8K Port
  Description: Average number of milliseconds it took to service all receive operations using the FICON protocol, for a particular port over a time interval.

Overall Port FICON Response Time (968; 4.1.1)
  Components: ESS/DS6K/DS8K Port
  Description: Average number of milliseconds it took to service all I/O operations using the FICON protocol, for a particular port over a time interval.

Port PPRC Send Response Time (969; 4.1.1)
  Components: ESS/DS6K/DS8K Port, Subsystem
  Description: Average number of milliseconds it took to service all send operations for Peer-to-Peer Remote Copy usage, for a particular port over a time interval.

Port PPRC Receive Response Time (970; 4.1.1)
  Components: ESS/DS6K/DS8K Port, Subsystem
  Description: Average number of milliseconds it took to service all receive operations for Peer-to-Peer Remote Copy usage, for a particular port over a time interval.

Overall Port PPRC Response Time (971; 4.1.1)
  Components: ESS/DS6K/DS8K Port, Subsystem
  Description: Average number of milliseconds it took to service all I/O operations for Peer-to-Peer Remote Copy usage, for a particular port over a time interval.
Transfer Sizes

Port Send Transfer Size (866)
  Components: ESS/DS6K/DS8K Port, Subsystem; SMI-S BSP Port
  Description: Average number of KB sent per I/O by a particular port over a time interval.
  Note: ESS/DS6K/DS8K Subsystem support requires v3.1.3 or above; SMI-S BSP Port requires v3.3.

Port Receive Transfer Size (867)
  Components: ESS/DS6K/DS8K Port, Subsystem; SMI-S BSP Port
  Description: Average number of KB received per I/O by a particular port over a time interval.
  Note: ESS/DS6K/DS8K Subsystem support requires v3.1.3 or above; SMI-S BSP Port requires v3.3.

Overall Port Transfer Size (868)
  Components: ESS/DS6K/DS8K Port, Subsystem; SMI-S BSP Port
  Description: Average number of KB transferred per I/O by a particular port over a time interval.
  Note: ESS/DS6K/DS8K Subsystem support requires v3.1.3 or above.

Port Send Packet Size (869)
  Components: Switch Port, Switch
  Description: Average number of KB sent per packet by a particular port over a time interval.

Port Receive Packet Size (870)
  Components: Switch Port, Switch
  Description: Average number of KB received per packet by a particular port over a time interval.

Overall Port Packet Size (871)
  Components: Switch Port, Switch
  Description: Average number of KB transferred per packet by a particular port over a time interval.
Special Computed Values

Port Send Utilization Percentage (972; 4.1.1)
  Components: ESS/DS6K/DS8K Port
  Description: Average amount of time that the port was busy sending data, over a particular time interval.

Port Receive Utilization Percentage (973; 4.1.1)
  Components: ESS/DS6K/DS8K Port
  Description: Average amount of time that the port was busy receiving data, over a particular time interval.

Overall Port Utilization Percentage (974; 4.1.1)
  Components: ESS/DS6K/DS8K Port
  Description: Average amount of time that the port was busy sending or receiving data, over a particular time interval.

Port Send Bandwidth Percentage (975; 4.1.1)
  Components: ESS/DS8K Port; SVC Port; Switch Port
  Description: The approximate bandwidth utilization percentage for send operations by this port, over a particular time interval, based on its current negotiated speed.

Port Receive Bandwidth Percentage (976; 4.1.1)
  Components: ESS/DS8K Port; SVC Port; Switch Port
  Description: The approximate bandwidth utilization percentage for receive operations by this port, over a particular time interval, based on its current negotiated speed.

Overall Port Bandwidth Percentage (977; 4.1.1)
  Components: ESS/DS8K Port; SVC Port; Switch Port
  Description: The approximate bandwidth utilization percentage for send and receive operations by this port, over a particular time interval.
Error Rates

Error Frame Rate (872)
  Components: Switch Port, Switch
  Description: The number of frames per second that were received in error by a particular port over a time interval.

Dumped Frame Rate (873)
  Components: Switch Port, Switch
  Description: The number of frames per second that were lost due to a lack of available host buffers, for a particular port over a time interval.

Link Failure Rate (874)
  Components: Switch Port, Switch
  Description: The number of link errors per second that were experienced by a particular port over a time interval.

Loss of Sync Rate (875)
  Components: Switch Port, Switch
  Description: The average number of times per second that synchronization was lost, for a particular component over a particular time interval.

Loss of Signal Rate (876)
  Components: Switch Port, Switch
  Description: The average number of times per second that the signal was lost, for a particular component over a particular time interval.

CRC Error Rate (877)
  Components: Switch Port, Switch
  Description: The average number of frames received per second in which the CRC in the frame did not match the CRC computed by the receiver, for a particular component over a particular time interval.

Short Frame Rate (878)
  Components: Switch Port, Switch
  Description: The average number of frames received per second that were shorter than 28 octets (24 header + 4 CRC), not including any SOF/EOF bytes, for a particular component over a particular time interval.

Long Frame Rate (879)
  Components: Switch Port, Switch
  Description: The average number of frames received per second that were longer than 2140 octets (24 header + 4 CRC + 2112 data), not including any SOF/EOF bytes, for a particular component over a particular time interval.

Encoding Disparity Error Rate (880)
  Components: Switch Port, Switch
  Description: The average number of disparity errors received per second, for a particular component over a particular time interval.

Discarded Class3 Frame Rate (881)
  Components: Switch Port, Switch
  Description: The average number of class-3 frames per second that were discarded by a particular component over a particular time interval.

F-BSY Frame Rate (882)
  Components: Switch Port, Switch
  Description: The average number of F-BSY frames per second that were generated by a particular component over a particular time interval.

F-RJT Frame Rate (883)
  Components: Switch Port, Switch
  Description: The average number of F-RJT frames per second that were generated by a particular component over a particular time interval.
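Several of the metrics above are derived values rather than directly sampled counters: transfer sizes divide the bytes moved by the operations completed, and the bandwidth percentages relate a measured data rate to the port's negotiated link speed. The following minimal Python sketch illustrates those relationships. The function names, the raw-counter inputs, and the rule of thumb of roughly 100 MB/s of usable payload per negotiated Gbit/s of Fibre Channel link speed are illustrative assumptions, not TPC internals.

    MIB = 2 ** 20  # the data-rate metrics above count megabytes of 2^20 bytes

    def send_transfer_size_kb(bytes_sent: int, send_ops: int) -> float:
        """Average KB sent per I/O over the interval (Port Send Transfer Size style)."""
        return (bytes_sent / 1024.0) / send_ops if send_ops else 0.0

    def send_data_rate_mbps(bytes_sent: int, interval_s: float) -> float:
        """Average MB (2^20 bytes) per second over the interval (Port Send Data Rate style)."""
        return bytes_sent / MIB / interval_s

    def send_bandwidth_pct(data_rate_mbps: float, negotiated_gbps: float) -> float:
        """Approximate send bandwidth utilization (Port Send Bandwidth Percentage style).
        Assumes roughly 100 MB/s of usable payload per negotiated Gbit/s (FC rule of thumb)."""
        return 100.0 * data_rate_mbps / (negotiated_gbps * 100.0)

    # Example: 1536 MiB sent over a 300-second interval on a 4 Gbps port
    rate = send_data_rate_mbps(1536 * MIB, 300.0)   # ~5.1 MB/s
    print(rate, send_bandwidth_pct(rate, 4.0))      # ~1.3 percent of the link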
Appendix B Available Thresholds
This table lists the threshold name, the types of components for which each threshold is available, and a description. The SMI-S BSP device type mentioned in the table below refers to any storage subsystem that is managed via a CIMOM which supports SMI-S 1.1 with the Block Server Performance (BSP) subprofile.
Thresholds that require specific versions of IBM Tivoli Storage Productivity Center are noted in parentheses.
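The comma-separated boundary values quoted throughout this table follow a fixed order (critical stress, warning stress, warning idle, critical idle), with -1 disabling a boundary, and the optional filters suppress violations on nearly idle components. As a rough illustration of how such a check behaves, here is a minimal Python sketch; the function and parameter names are hypothetical, and this is not TPC source code.

    def check_threshold(value, boundaries, io_rate=None, io_rate_filter=None):
        """Return 'critical', 'warning', or None for a single sample.
        boundaries = (critical stress, warning stress, warning idle, critical idle);
        a boundary of -1 is disabled."""
        crit_stress, warn_stress, warn_idle, crit_idle = boundaries
        # The filter ignores violations when the component is nearly idle,
        # e.g. response-time spikes while fewer than 5 I/Os per second flow.
        if io_rate_filter is not None and io_rate is not None and io_rate < io_rate_filter:
            return None
        if crit_stress != -1 and value >= crit_stress:
            return "critical"
        if warn_stress != -1 and value >= warn_stress:
            return "warning"
        if crit_idle != -1 and value <= crit_idle:
            return "critical"
        if warn_idle != -1 and value <= warn_idle:
            return "warning"
        return None

    # Backend Read Response Time with the suggested boundaries 35,25,-1,-1 ms
    # and the pre-populated filter of 5 I/Os per second:
    print(check_threshold(28.0, (35, 25, -1, -1), io_rate=40, io_rate_filter=5))  # warning
    print(check_threshold(28.0, (35, 25, -1, -1), io_rate=2, io_rate_filter=5))   # None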
Backend Read Response Time (841; 4.1.1)
  Components: ESS/DS6K/DS8K Array; SVC MDisk
  Description: Sets thresholds on the average number of milliseconds that it took to service each array and MDisk read operation. The Backend Read Response Time metric for each array or MDisk is checked against the threshold boundaries for each collection interval. Though this threshold is disabled by default, suggested boundary values of 35,25,-1,-1 are pre-populated. In addition, a filter is available for this threshold which will ignore any boundary violations if the Backend Read I/O Rate is less than a specified filter value. The pre-populated filter value is 5.
Backend Write Response Time (842; 4.1.1)
  Components: ESS/DS6K/DS8K Array; SVC MDisk
  Description: Sets thresholds on the average number of milliseconds that it took to service each array and MDisk write operation. The Backend Write Response Time metric for each array or MDisk is checked against the threshold boundaries for each collection interval. Though this threshold is disabled by default, suggested boundary values of 120,80,-1,-1 are pre-populated. In addition, a filter is available for this threshold which will ignore any boundary violations if the Backend Write I/O Rate is less than a specified filter value. The pre-populated filter value is 5.
Overall Backend Response Time (843)
  Components: SVC MDisk
  Description: Sets thresholds on the average number of milliseconds that it took to service each MDisk I/O operation, measured at the MDisk level. The Total Response Time (external) metric for each MDisk is checked against the threshold boundaries for each collection interval. This threshold is disabled by default. In addition, a filter is available for this threshold which will ignore any boundary violations if the Total Backend I/O Rate is less than a specified filter value. The pre-populated filter value is 10.
Backend Read Queue Time (844; 4.1.1)
  Components: SVC MDisk
  Description: Sets thresholds on the average number of milliseconds that each read operation spent on the queue before being issued to the backend device. The Backend Read Queue Time metric for each MDisk is checked against the threshold boundaries for each collection interval. Though this threshold is disabled by default, suggested boundary values of 5,3,-1,-1 are pre-populated. In addition, a filter is available for this threshold which will ignore any boundary violations if the Backend Read I/O Rate is less than a specified filter value. The pre-populated filter value is 5. Violation of these threshold boundaries means that the SVC deems the MDisk to be overloaded. A queue algorithm determines the number of concurrent I/O operations that the SVC will send to a given MDisk; if there is any queuing (other than perhaps during a backup process), this suggests that performance can be improved by resolving the queuing issue.
Backend Write Queue Time (845; 4.1.1)
  Components: SVC MDisk
  Description: Sets thresholds on the average number of milliseconds that each write operation spent on the queue before being issued to the backend device. The Backend Write Queue Time metric for each MDisk is checked against the threshold boundaries for each collection interval. Though this threshold is disabled by default, suggested boundary values of 5,3,-1,-1 are pre-populated. In addition, a filter is available for this threshold which will ignore any boundary violations if the Backend Write I/O Rate is less than a specified filter value. The pre-populated filter value is 5. Violation of these threshold boundaries means that the SVC deems the MDisk to be overloaded. A queue algorithm determines the number of concurrent I/O operations that the SVC will send to a given MDisk; if there is any queuing (other than perhaps during a backup process), this suggests that performance can be improved by resolving the queuing issue.
Peak Backend Write Response Time (951; 4.1.1)
  Components: SVC Node
  Description: Sets thresholds on the peak (worst) response time among all MDisk write operations by a node. The Backend Peak Write Response Time metric for each node is checked against the threshold boundaries for each collection interval. This threshold is enabled by default, with default boundary values of 30000,10000,-1,-1. Violation of these threshold boundaries means that the SVC cache is having to apply a "partition limit" for a given MDisk group; that is, the destage data from the SVC cache for this MDisk group is causing the cache to fill up (writes are being received faster than they can be destaged to disk). If delays reach 30 seconds or more, the SVC will switch into "short term mode", in which writes are no longer cached for the MDisk Group.
Controller Thresholds

Total I/O Rate (overall) (809)
  Components: SMI-S BSP Subsystem; ESS/DS6K/DS8K Controller; SVC I/O Group
  Description: Sets thresholds on the average number of I/O operations per second for read and write operations, for the subsystem controllers (clusters) or I/O Groups. The Total I/O Rate metric for each controller or I/O Group is checked against the threshold boundaries for each collection interval. These thresholds are disabled by default.
Total Data Rate (821)
  Components: SMI-S BSP Subsystem; ESS/DS6K/DS8K Controller; SVC I/O Group
  Description: Sets thresholds on the average number of MB per second for read and write operations, for the subsystem controllers (clusters) or I/O Groups. The Total Data Rate metric for each controller or I/O Group is checked against the threshold boundaries for each collection interval. These thresholds are disabled by default.
Write-cache Delay Percentage (832)
  Components: ESS/DS6K/DS8K Controller; SVC Node
  Description: Sets thresholds on the percentage of I/O operations that were delayed due to write-cache space constraints. The Write-cache Delay Percentage metric for each controller or node is checked against the threshold boundaries for each collection interval. This threshold is enabled by default, with default boundaries of 10, 3, -1, -1. In addition, a filter is available for this threshold which will ignore any boundary violations if the Write-cache Delay I/O Rate is less than a specified filter value. The pre-populated filter value is 10 I/Os per second.
Cache Holding Time (834)
  Components: ESS/DS6K/DS8K Controller
  Description: Sets thresholds on the average cache holding time, in seconds, for I/O data in the subsystem controllers (clusters). Shorter time periods indicate adverse performance. The Cache Holding Time metric for each controller is checked against the threshold boundaries for each collection interval. This threshold is enabled by default, with default boundaries of 30, 60, -1, -1.
CPU Utilization (900; 3.1.3)
  Components: SVC Node
  Description: Sets thresholds on the average utilization percentage of the CPU(s) in the SVC nodes. The CPU Utilization metric for each node is checked against the threshold boundaries for each collection interval. This threshold is enabled by default, with default boundaries of 90,75,-1,-1.
Non-Preferred Node Usage Percentage (949; 4.1.1)
  Components: SVC I/O Group
  Description: Sets thresholds on the Non-Preferred Node Usage Percentage of an I/O Group. This metric of each I/O Group is checked against the threshold boundaries at each collection interval. This threshold is disabled by default. In addition, a filter is available for this threshold which will ignore any boundary violations if the Total I/O Rate of the I/O Group is less than a specified filter value.
Port Thresholds

Total Port I/O Rate (854)
  Components: ESS/DS6K/DS8K Port; SVC Port (3.1.3); SMI-S BSP Port
  Description: Sets thresholds on the average number of I/O operations per second for send and receive operations, for the ports. The Total I/O Rate metric for each port is checked against the threshold boundaries for each collection interval. This threshold is disabled by default.
Total Port Packet Rate (857)
  Components: Switch Port
  Description: Sets thresholds on the average number of packets per second for send and receive operations, for the ports. The Total Port Packet Rate metric for each port is checked against the threshold boundaries for each collection interval. This threshold is disabled by default.
Total Port Data Rate (860)
  Components: ESS/DS6K/DS8K Port; SVC Port (3.1.3); SMI-S BSP Port; Switch Port
  Description: Sets thresholds on the average number of MB per second for send and receive operations, for the ports. The Total Data Rate metric for each port is checked against the threshold boundaries for each collection interval. This threshold is disabled by default.
Overall Port Response Time (865)
  Components: ESS/DS6K/DS8K Port
  Description: Sets thresholds on the average number of milliseconds that it took to service each I/O operation (send and receive) for ports. The Total Response Time metric for each port is checked against the threshold boundaries for each collection interval. This threshold is disabled by default.
Port to Local Node Send Response Time (925; 4.1.1)
  Components: SVC Node
  Description: Sets thresholds on the average number of milliseconds it took to service each send operation to another node in the local SVC cluster. The Port to Local Node Send Response Time metric for each node is checked against the threshold boundaries for each collection interval. This threshold is enabled by default, with default boundary values of 3,1.5,-1,-1. Violation of these threshold boundaries means that it is taking too long to send data between nodes (on the fabric), and suggests either congestion around these FC ports, or an internal SVC microcode problem.
Port to Local Node Receive Response Time (926; 4.1.1)
  Components: SVC Node
  Description: Sets thresholds on the average number of milliseconds it took to service each receive operation from another node in the local SVC cluster. The Port to Local Node Receive Response Time metric for each node is checked against the threshold boundaries for each collection interval. This threshold is enabled by default, with default boundary values of 1,0.5,-1,-1. Violation of these threshold boundaries means that it is taking too long to send data between nodes (on the fabric), and suggests either congestion around these FC ports, or an internal SVC microcode problem.
Port to Local Node Send Queue Time (928; 4.1.1)
  Components: SVC Node
  Description: Sets thresholds on the average number of milliseconds that each send operation issued to another node in the local SVC cluster spent on the queue before being issued. The Port to Local Node Send Queued Time metric for each node is checked against the threshold boundaries for each collection interval. This threshold is enabled by default, with default boundary values of 2,1,-1,-1. Violation of these threshold boundaries means that the node has to wait too long to send data to other nodes (on the fabric), and suggests congestion on the fabric.
Port to Local Node Receive Queue Time (929; 4.1.1)
  Components: SVC Node
  Description: Sets thresholds on the average number of milliseconds that each receive operation issued to another node in the local SVC cluster spent on the queue before being issued. The Port to Local Node Receive Queued Time metric for each node is checked against the threshold boundaries for each collection interval. This threshold is enabled by default, with default boundary values of 1,0.5,-1,-1. Violation of these threshold boundaries means that the node has to wait too long to receive data from other nodes (on the fabric), and suggests congestion on the fabric.
Port Send Utilization Percentage (972; 4.1.1)
  Components: ESS/DS6K/DS8K Port
  Description: Sets thresholds on the average amount of time that ports are busy sending data. The Port Send Utilization Percentage metric for each port is checked against the threshold boundaries for each collection interval. This threshold is disabled by default.
Port Receive Utilization Percentage (973; 4.1.1)
  Components: ESS/DS6K/DS8K Port
  Description: Sets thresholds on the average amount of time that ports are busy receiving data. The Port Receive Utilization Percentage metric for each port is checked against the threshold boundaries for each collection interval. This threshold is disabled by default.
Port Send Bandwidth Percentage (975; 4.1.1)
  Components: ESS/DS8K Port; SVC Port; Switch Port
  Description: Sets thresholds on the average port bandwidth utilization percentage for send operations. The Port Send Bandwidth Percentage metric is checked against the threshold boundaries for each collection interval. This threshold is enabled by default, with default boundaries 85,75,-1,-1.
Port Receive Bandwidth Percentage (976; 4.1.1)
  Components: ESS/DS8K Port; SVC Port; Switch Port
  Description: Sets thresholds on the average port bandwidth utilization percentage for receive operations. The Port Receive Bandwidth Percentage metric is checked against the threshold boundaries for each collection interval. This threshold is enabled by default, with default boundaries 85,75,-1,-1.
Error Frame Rate (872)
  Components: Switch Port
  Description: Sets thresholds on the average number of frames per second received in error for the switch ports. The Error Frame Rate metric for each port is checked against the threshold boundary for each collection interval. This threshold is disabled by default.
Link Failure Rate (874)
  Components: Switch Port
  Description: Sets thresholds on the average number of link errors per second experienced by the switch ports. The Link Failure Rate metric for each port is checked against the threshold boundary for each collection interval. This threshold is disabled by default.
Appendix C DS3000, DS4000 and DS5000 Metrics
This table lists the metrics supported by DS3000, DS4000 and DS5000 subsystems, a description of each metric, and the reports that will include the metrics.
Older DS3000, DS4000, and DS5000 subsystems managed by Engenio providers (e.g. 10.50.G0.04) support only a subset of the following metrics in their reports. Later levels of DS3000, DS4000, and DS5000 subsystems managed by LSI SMI-S Provider 1.3 and above (e.g. 10.06.GG.33) support additional metrics; these are denoted by an asterisk. For more information regarding supported DS3000, DS4000, and DS5000 subsystems and their related providers, please see:
http://www-01.ibm.com/support/docview.wss?rs=40&context=SSBSEX&q1=subsystem&uid=swg21384734&loc=en_US&cs=utf-8&lang=en
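Many of the metrics in this table are percentages or averages computed from the raw counters that the provider returns for each collection interval. As a simple illustration of the arithmetic behind a metric such as Total Cache Hits (overall), here is a minimal Python sketch; the counter names are hypothetical, and this is not TPC source code.

    def total_cache_hit_pct(read_hits, write_hits, read_ops, write_ops):
        """Percentage of all read and write operations satisfied from cache."""
        total_ops = read_ops + write_ops
        return 100.0 * (read_hits + write_hits) / total_ops if total_ops else 0.0

    # 9,000 of 12,000 reads and 3,500 of 4,000 writes hit cache -> 78.125 percent
    print(total_cache_hit_pct(9000, 3500, 12000, 4000))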
[...] for a particular component over a time interval.
  Reports: Controller Cache Performance*, Top Active Volumes Cache Hit Performance*

* Total Cache Hits (overall)
  Description: Percentage of cache hits for both sequential and non-sequential read and write operations, for a particular component over a time interval.
  Reports: By Volume*, By Controller*, By Subsystem*, Controller Cache Performance*, Top Active Volumes Cache Hit Performance*
Read Data Rate
  Description: Average number of megabytes (2^20 bytes) per second that were transferred for read operations, for a particular component over a time interval.
  Reports: By Volume, By Controller*, By Subsystem, Controller Performance*, Subsystem Performance, Top Active Volumes Cache Hit Performance, Top Volumes Data Rate Performance, Top Volumes I/O Rate Performance

Write Data Rate
  Description: Average number of megabytes (2^20 bytes) per second that were transferred for write operations, for a particular component over a time interval.
  Reports: By Volume, By Controller*, By Subsystem, Controller Performance*, Subsystem Performance, Top Active Volumes Cache Hit Performance, Top Volumes Data Rate Performance, Top Volumes I/O Rate Performance

Total Data Rate
  Description: Average number of megabytes (2^20 bytes) per second that were transferred for read and write operations, for a particular component over a time interval.
  Reports: By Volume, By Controller*, By Subsystem, Controller Performance*, Subsystem Performance, Top Active Volumes Cache Hit Performance, Top Volumes Data Rate Performance, Top Volumes I/O Rate Performance
Read Transfer Size
  Description: Average number of KB per I/O for read operations, for a particular component over a time interval.
  Reports: By Volume, By Controller*, By Subsystem

Write Transfer Size
  Description: Average number of KB per I/O for write operations, for a particular component over a time interval.
  Reports: By Volume, By Controller*, By Subsystem

Overall Transfer Size
  Description: Average number of KB per I/O for read and write operations, for a particular component over a time interval.
  Reports: By Volume, By Controller*, By Subsystem
* Port Send I/O Rate
  Description: Average number of I/O operations per second for send operations, for a particular port over a time interval.
  Reports: By Port*, Port Performance*

* Port Receive I/O Rate
  Description: Average number of I/O operations per second for receive operations, for a particular port over a time interval.
  Reports: By Port*, Port Performance*

* Total Port I/O Rate
  Description: Average number of I/O operations per second for send and receive operations, for a particular port over a time interval.
  Reports: By Port*, Port Performance*

* Port Send Data Rate
  Description: Average number of megabytes (2^20 bytes) per second that were transferred for send (read) operations, for a particular port over a time interval.
  Reports: By Port*, Port Performance*

* Port Receive Data Rate
  Description: Average number of megabytes (2^20 bytes) per second that were transferred for receive (write) operations, for a particular port over a time interval.
  Reports: By Port*, Port Performance*

* Total Port Data Rate
  Description: Average number of megabytes (2^20 bytes) per second that were transferred for send and receive operations, for a particular port over a time interval.
  Reports: By Port*, Port Performance*

* Port Send Transfer Size
  Description: Average number of KB sent per I/O by a particular port over a time interval.
  Reports: By Port*, Port Performance*

* Port Receive Transfer Size
  Description: Average number of KB received per I/O by a particular port over a time interval.
  Reports: By Port*, Port Performance*

* Overall Port Transfer Size
  Description: Average number of KB transferred per I/O by a particular port over a time interval.
  Reports: By Port*, Port Performance*