Isilon InsightIQ User Guide
4.1.4
September 2021
Rev. 04
Notes, cautions, and warnings
NOTE: A NOTE indicates important information that helps you make better use of your product.
CAUTION: A CAUTION indicates either potential damage to hardware or loss of data and tells you how to avoid
the problem.
WARNING: A WARNING indicates a potential for property damage, personal injury, or death.
© 2009 - 2021 Dell Inc. or its subsidiaries. All rights reserved. Dell, EMC, and other trademarks are trademarks of Dell Inc. or its subsidiaries.
Other trademarks may be trademarks of their respective owners.
Contents
    Publish a report on demand
Scheduling Performance reports
    Create a scheduled Performance report
    Disable a scheduled Performance report
    Enable a scheduled Performance report
    Modify a scheduled Performance report
    Delete a scheduled Performance report
    Troubleshooting scheduled Performance report issues
Interpreting Performance reports
    Active Clients report
    Average Pending Disk Operations Count report
    External Network Throughput Rate report
    Overall Cache Hit Rate report
    Overall Cache Throughput Rate report
    Pending Disk Operations Latency report
    Protocol Operations Average Latency report
1
Introduction to this guide
This section contains the following topics:
Topics:
• About this guide
• Where to go for support
Introduction to InsightIQ
InsightIQ provides tools to monitor and analyze historical data from Isilon clusters.
In the InsightIQ web application, you can view standard or customized Performance reports and File System reports to
monitor and analyze Isilon cluster activity. You can create customized reports to view information about storage
cluster hardware, software, and protocol operations. You can publish, schedule, and share reports, and you can export the data
to a third-party application.
You can create and view specific InsightIQ reports to identify or confirm the cause of a performance issue. For example, if users
report client connectivity issues during certain dates, you can create a report that indicates the cause of the issue. InsightIQ
enables you to correlate data across present and historical conditions.
If you modify the Isilon cluster environment, you can measure the effects of those changes by creating an InsightIQ report that
compares the past performance with the performance after the changes were made. For example, if you add 20 new clients,
you can create a report to compare the performance before the change and after the change. This comparison allows you to
determine the impact of the additional clients on system performance.
Create customized reports that help to identify bottlenecks or inefficiencies in Isilon cluster systems and workflows. For
example, if you want to verify that all clients can access the cluster quickly and efficiently, you can create a report that
measures which client connections are faster or slower. You can then modify the report to add breakouts and filters to identify
the cause of the slower connections.
Customized reports provide specific information about cluster operations. For example, if you recently deployed an Isilon cluster,
you might want to view a customized report that illustrates how the cluster and its individual components are performing.
A review of past performance trends can help you predict future trends and needs. For example, if you deployed an Isilon cluster
for a data-archival project 6 months ago, you might want to estimate when the cluster reaches its maximum capacity. You can
customize a report to illustrate capacity usage by day, week, or month to estimate when the cluster reaches capacity.
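The capacity estimate described above can be approximated outside InsightIQ with a simple linear projection. The following
Python sketch is illustrative only. It assumes that you have exported daily used-capacity figures from a report; the function
name, the sample values, and the straight-line growth assumption (based on the first and last samples only) are hypothetical
and are not part of InsightIQ.

```python
from datetime import date, timedelta


def estimate_full_date(samples, total_capacity_tb):
    """Project when used capacity reaches total capacity.

    samples: list of (date, used_tb) tuples, oldest first (hypothetical export).
    Returns the projected date, or None if usage is not growing.
    """
    (first_day, first_used), (last_day, last_used) = samples[0], samples[-1]
    elapsed_days = (last_day - first_day).days
    if elapsed_days <= 0:
        return None
    # Average growth between the first and last samples (straight-line assumption).
    growth_per_day = (last_used - first_used) / elapsed_days
    if growth_per_day <= 0:
        return None
    remaining_tb = total_capacity_tb - last_used
    days_left = remaining_tb / growth_per_day
    return last_day + timedelta(days=round(days_left))


# Made-up figures: 100 TB used on January 1, 130 TB used on January 31.
usage = [(date(2021, 1, 1), 100.0), (date(2021, 1, 31), 130.0)]
projected = estimate_full_date(usage, total_capacity_tb=200.0)
print(projected)
```

A projection like this is only as good as the assumption that growth stays constant, so treat the result as a rough planning aid.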
You can install InsightIQ on a Linux computer or on a virtual machine. Configure settings such as the IP address of InsightIQ and
the administrator password in the InsightIQ console interface.
6 InsightIQ overview
Performance reports and File System reports
Monitor EMC Isilon clusters with customizable reports.
Performance reports include information about the cluster activity and capacity. The information can help you confirm that
storage clusters perform as expected, and help you identify the specific cause of a performance-related issue to investigate.
File System reports include information about the cluster file system, including deduplication, quotas, and usable capacity. The
information can help you identify the types of data that are stored and where the data is located.
With InsightIQ, you can create live reports and view them through the InsightIQ web application. You can create live reports for
Performance reports and File System reports. You can modify the attributes of live reports as you view the reports, including
the period, breakouts, and filters.
You can view Performance reports as a PDF file that is generated based on a report schedule that the administrator sets up.
InsightIQ can be configured to send a PDF file as an email attachment. You can use a PDF file to verify cluster health or to
distribute InsightIQ information to individuals who do not have access to the InsightIQ web application.
Report configuration
InsightIQ reports are configured by using modules, breakouts, and filter rules.
A module is a section of a report that displays information about a cluster. A breakout can be applied to a module so that
users can view information about separate cluster components. A filter rule can be applied to a module so that users can view
information about a specific component across the entire report. Filter rules can be combined into collections that are called
filters. Filters can be saved and applied to multiple reports.
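The relationship between modules, breakouts, filter rules, and filters can be pictured as a small data model. The following
Python sketch is a conceptual illustration only; the class names, field names, and sample values are hypothetical and do not
correspond to any InsightIQ API or file format.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(frozen=True)
class FilterRule:
    """Restricts data to one component, for example one node or one client."""
    component_type: str  # illustrative values: "node", "client", "protocol"
    value: str


@dataclass
class ReportFilter:
    """A named, reusable collection of filter rules."""
    name: str
    rules: List[FilterRule] = field(default_factory=list)


@dataclass
class Module:
    """One section of a report, optionally split by a breakout."""
    name: str
    breakout: Optional[str] = None  # None means that no breakout is applied


@dataclass
class Report:
    """A report: a list of modules plus an optional filter for the whole report."""
    name: str
    modules: List[Module] = field(default_factory=list)
    report_filter: Optional[ReportFilter] = None


# A saved filter can be attached to more than one report.
busy_client = ReportFilter("busy-client", [FilterRule("client", "10.0.0.15")])
report = Report(
    name="Throughput by client",
    modules=[Module("External Network Throughput Rate", breakout="Client")],
    report_filter=busy_client,
)
print(report.modules[0].breakout, len(report.report_filter.rules))
```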
3
Cluster monitoring
This section contains the following topics:
Topics:
• View cluster status
• View aggregate cluster status
• View the status of a cluster
• Monitoring errors
Monitoring errors
Identify and resolve problems that are encountered when monitoring a cluster.
To resolve the following errors, you must be logged in to InsightIQ with an administrator account.
Authorization error
    InsightIQ is not authorized to communicate with the monitored cluster. To resolve this issue, ensure that
    InsightIQ is authenticating with the correct username and password.
Connection error
    InsightIQ is unable to connect to the monitored cluster. If the Monitoring Status of the cluster is yellow,
    InsightIQ is trying to resolve this issue. If the Monitoring Status of the cluster is red, InsightIQ is unable
    to repair the connection to the cluster, and you must resolve this issue manually. To resolve this issue, reestablish
    network connectivity between InsightIQ and the monitored cluster. After the connection is reestablished,
    InsightIQ can connect to the monitored cluster automatically.
Database busy
    InsightIQ is unable to add cluster data to the InsightIQ data store because the maximum number of
    data store connections is already in use. Cluster monitoring is temporarily paused until the necessary
    resources are freed. If this error persists for several minutes or more, open an SSH connection to
    InsightIQ and restart InsightIQ by running the following command:
        iiq_restart
Data store connection error
    InsightIQ is unable to connect to the InsightIQ data store. Cluster monitoring is suspended until this issue
    is resolved. To resolve this issue, ensure network connectivity between InsightIQ and the data store.
    InsightIQ tries to connect to the data store until the network connection is reestablished.
Data store full
    The InsightIQ data store does not have sufficient space to continue monitoring. To resolve this issue, free
    space on the InsightIQ data store, and then restart InsightIQ.
Data retrieval delayed
    The most recent data that InsightIQ has retrieved from the cluster is more than 15 minutes old. If
    InsightIQ recently began or resumed monitoring the cluster, this issue commonly resolves itself as
    InsightIQ retrieves more recent data. If InsightIQ also reports one or more other errors, resolve those
    errors. If the error persists after all other InsightIQ errors are resolved, contact Isilon Technical Support.
FSA connection error
    InsightIQ is unable to retrieve File System Analytics reports from the cluster. To resolve this issue, ensure
    that NFSv3 is enabled on the monitored cluster and that InsightIQ can connect to the cluster. After this
    issue is resolved, InsightIQ automatically resumes retrieving File System Analytics reports.
GUID error
    The globally unique identifier (GUID) associated with the monitored cluster in InsightIQ has changed.
    The hostname or IP address that is associated with the cluster might be reassigned to another cluster.
    To resolve this issue, ensure that the hostname or IP address of the monitored cluster matches the
    hostname or IP address that InsightIQ accesses.
License error
    InsightIQ is not licensed on the monitored cluster. To resolve this issue, configure a valid InsightIQ license
    on the cluster.
Privilege error
    The InsightIQ user does not have sufficient privileges to retrieve the specified data from the cluster. For
    information about adding privileges to InsightIQ, see KB 194112.
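The 15-minute threshold behind the Data retrieval delayed error can be expressed as a simple timestamp comparison. The
following Python sketch is illustrative only and assumes that you already know the timestamp of the newest sample; the
function name is hypothetical, and the sketch does not query InsightIQ.

```python
from datetime import datetime, timedelta, timezone

STALE_AFTER = timedelta(minutes=15)  # threshold stated in the error description


def is_retrieval_delayed(latest_sample_time, now=None):
    """Return True if the newest sample is more than 15 minutes old."""
    now = now or datetime.now(timezone.utc)
    return now - latest_sample_time > STALE_AFTER


# Fixed reference time so that the example is repeatable.
now = datetime(2021, 9, 1, 12, 0, tzinfo=timezone.utc)
print(is_retrieval_delayed(now - timedelta(minutes=20), now))  # True
print(is_retrieval_delayed(now - timedelta(minutes=5), now))   # False
```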
4
Performance reports
This section contains the following topics:
Topics:
• Performance reports overview
• Live Performance Reports overview
• Live Performance reports
• Publish reports
• Scheduling Performance reports
• Interpreting Performance reports
NOTE: If InsightIQ is monitoring a OneFS 8.2.0 cluster, filtering might not work properly on some reports.
InsightIQ downsampling
InsightIQ downsamples the data that it collects for Performance reports.
Downsampling means that InsightIQ converts higher-resolution data to lower-resolution data by adding or averaging several
higher-resolution samples. InsightIQ collects data samples at different intervals, depending on the type of data collected.
InsightIQ averages or adds samples together to create a single, low-resolution sample that represents a 10-minute period, and
preserves the minimum and maximum values for that period. InsightIQ deletes the averaged samples after 24 hours, to limit the
size of the data store.
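The downsampling idea can be sketched in a few lines of Python. This is a conceptual illustration under stated assumptions,
not InsightIQ's implementation: the function name, the input format (timestamp in seconds, value), and the choice between
averaging and adding through a mode argument are all hypothetical.

```python
from collections import defaultdict

BUCKET_SECONDS = 600  # one low-resolution sample per 10-minute period


def downsample(samples, mode="average"):
    """Collapse (timestamp_seconds, value) samples into 10-minute buckets.

    mode: "average" to average the samples, "sum" to add them together.
    Returns {bucket_start: {"value": ..., "min": ..., "max": ...}}.
    """
    buckets = defaultdict(list)
    for timestamp, value in samples:
        bucket_start = timestamp - (timestamp % BUCKET_SECONDS)
        buckets[bucket_start].append(value)

    result = {}
    for bucket_start, values in sorted(buckets.items()):
        total = sum(values)
        combined = total if mode == "sum" else total / len(values)
        # Keep the extremes so that peaks are not lost in the combined value.
        result[bucket_start] = {
            "value": combined,
            "min": min(values),
            "max": max(values),
        }
    return result


# Three samples fall in the first 10 minutes, one in the next period.
raw = [(0, 10.0), (300, 30.0), (599, 20.0), (600, 50.0)]
print(downsample(raw))
```

Preserving the minimum and maximum alongside the combined value is what allows a low-resolution sample to still show
short-lived spikes.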
Table 2. Allowable data values by module
Module: Active Clients; Connected Clients; Protocol Operations Average Latency
Notes: You can view maximum and minimum values when you filter by protocol.
Cluster Performance report overview
The Cluster Performance report displays the general health and status of the cluster.
You can focus the modules on individual nodes or clients to view more detailed information about the effects on cluster
performance.
Here is a list of modules that are included in this report:
● External Network Throughput Rate
● Protocol Operations Rate
● CPU % Use
● Active Clients
● Connected Clients
● Jobs
● Job Workers
6. In the External Network Throughput Rate module, click the Client breakout.
To learn more about breakouts, refer to the Breakouts topic.
To remove focus on a breakout, click the None breakout.
On the left side of the module, you can now see a list of all clients that were using the cluster's network throughput during
the date range you specified. The client at the top of the list was using the highest percentage of network throughput.
7. Click to select the client that was using the most throughput during the date range.
To remove focus on a client, click the client filter button at the top of the module.
All applicable modules in the report are now showing information that is focused on the client you selected, and data from all
other clients is removed from the modules.
With the information you gathered, you can determine that a specific client might be causing performance issues on the cluster.
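The ordering that the Client breakout applies (highest share of throughput first) can be reproduced on exported data. The
following Python sketch is illustrative only; the function name, the client addresses, and the byte counts are made up, and
the dictionary stands in for whatever format you export from the report.

```python
def rank_clients(bytes_by_client):
    """Return (client, percent_of_total) pairs, busiest client first."""
    total = sum(bytes_by_client.values())
    if total == 0:
        return []
    ranked = sorted(bytes_by_client.items(), key=lambda item: item[1], reverse=True)
    return [(client, 100.0 * used / total) for client, used in ranked]


# Hypothetical per-client throughput totals for the selected date range.
exported = {"10.0.0.11": 250, "10.0.0.12": 700, "10.0.0.13": 50}
for client, share in rank_clients(exported):
    print(f"{client}: {share:.1f}%")
```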
● Disk Throughput Rate
● Disk Operations Rate
● Active Clients
● Connected Clients
6. In the Node Summary module, click the node with the highest Total I/O.
To remove focus on a node, click the node filter button at the top of the module.
All applicable modules in the report are now focused on the node you filtered. Data from all other nodes is removed from the
modules.
7. In the External Network Throughput Rate module, click the Client breakout.
To learn more about breakouts, refer to the Breakouts topic.
To remove focus on a breakout, click the None breakout.
All applicable modules in the report are now showing information that is focused on the node you filtered and the client
breakout you selected.
In this example, you can determine which node had high I/O, and focus the modules to determine which client has the greatest
effect on the node performance.
4. In the Date Range fields, select a range to display, then click View Report.
The date range should include the entire period of time when the cluster was behaving unexpectedly. The default setting is
to view the last hour of data.
The modules in the report are now populated with data for the cluster and date range you selected.
5. In the External Network Throughput Rate module, check the throughput rate during the time in question.
NOTE: High throughput values indicate times when the cluster's network was under heavy load.
6. In the External Network Throughput Rate module, click the Client breakout.
To learn more about breakouts, refer to the Breakouts topic.
To remove focus on a breakout, click the None breakout.
On the left side of the module, you can now see a list of all clients that were using the cluster's network throughput during
the timeframe you specified. The client at the top of the list was using the highest percentage of network throughput.
7. Click to select the client that was using the most throughput during the date range.
To remove focus on a client, click the client filter button at the top of the module.
All applicable modules in the report are now showing information that is focused on the client you selected, and data from all
other clients is removed from the modules.
8. In the Protocol Operations Average Latency module, click the Protocol breakout.
On the left side of the module, you can now see a list of all protocols in use during the date range you specified, with the
protocol at the top experiencing the most latency.
With the information you gathered, you can determine which client and protocol might be causing network latency issues on the
cluster.
On the left side of the module, you can now see a list of all protocols being used by the client you selected. The protocol at
the top of the list experienced the highest average latency during the date range you specified.
With the information you gathered, you can determine the client with the highest Total I/O, and the protocols the client used to
access the cluster.
On the left side of the module, you can now see a list of all disks that were not idle during the date range you specified. The
disk at the top of the list had the most activity during the date range you selected.
6. In the Disk Throughput Rate module, click the Disk breakout.
On the left side of the module, you can now see a list of all disks that were being written to or read from during the date
range. The disk at the top of the list had the highest percentage throughput.
With the information you gathered, you can determine which disk has the highest throughput and might be causing performance
issues on the cluster.
● Contended File System Events Rate
● Locked File System Events Rate
● Blocking File System Events Rate
The date range should include the entire period of time when the cluster was behaving unexpectedly. The default setting is
to view the last hour of data.
The modules in the report are now populated with data for the cluster and date range you selected.
5. In the Overall Cache Hit Rate module, check the hit rates of the L1, L2, and L3 caches during the time in question.
NOTE: If L3 cache is enabled and has a higher hit rate than L2 or L1 cache, cluster performance might be affected.
6. In the Overall Cache Throughput Rate module, select L3 Starts in the breakout dropdown and click the Node breakout.
To learn more about breakouts, refer to the Breakouts topic.
To remove focus on a breakout, click the None breakout.
On the left side of the module, you can now see a list of all nodes that were requesting information from L3 cache during the
date range you specified. The node at the top of the list made the most requests to L3 cache.
With the information you gathered, you can determine which node is making the most cache requests, what type of cache
requests are being made, and how the cache requests affect performance on the OneFS cluster.
With the information you gathered, you can determine how close you are to reaching the cluster's total capacity and estimate
when additional capacity might be needed.
Configure the Deduplication Performance report
You can view cluster capacity, deduplicated data, and saved storage space within the Deduplication Performance report.
For this example, you are researching how much storage space deduplication has saved on the cluster.
1. Click the Performance Reporting tab.
2. On the Live Performance Reporting page, click the Cluster drop-down list and select the cluster that you want to monitor.
3. Click the Report Type drop-down list and select Deduplication.
4. In the Date Range fields, select a range to display, then click View Report.
The date range should include the entire period of time when the cluster was behaving unexpectedly. The default setting is
to view the last hour of data.
The modules in the report are now populated with data for the cluster and date range you selected.
5. In the Deduplication Summary (Physical) module, view the Space Saved.
With the information you gathered, you can determine how much storage space (physical) you have saved with deduplication.
With the information you gathered, you can determine which job type and job id might have caused the event.
6. Click to select the service that was using the most CPU during the date range.
To remove focus, click the service filter button at the top of the module.
All applicable modules in the report are now showing information that is focused on the service you selected, and data from
all other services is removed from the modules.
7. In the CPU Usage Rate module, click the Node breakout.
To learn more about breakouts, refer to the Breakouts topic.
To remove focus on a breakout, click the None breakout.
On the left side of the module, you can now see a list of all nodes that were using CPU time during the date range you
specified. The node at the top of the list was using the highest percentage of CPU.
With the information you gathered, you can determine which job and service had high CPU usage and what node originated the
request.
In a live Performance report, you can view the status of the monitored cluster. This information includes the cluster's current
date and time, the number of nodes that are deployed, a client-activity overview, a network throughput overview, the capacity
usage, and CPU utilization.
Create a live Performance report from a template that is based on the default settings.
    In the Select a Starting Point for Your Performance Report section, click Create from Blank Report.
Create a live Performance report based on a Saved Performance report.
    In the Saved Performance Reports section, click a saved report.
Create a live Performance report based on one of the standard reports.
    In the Standard Reports section, click the name of a Standard report.
    ● Node Performance report template
    ● Cluster Performance report template
    ● Client Performance report template
    ● Network Performance report template
    ● File System Performance report template
    ● Disk Performance report template
    ● Cluster Capacity report template
    ● File System Cache Performance report template
    ● Cluster Events report template
    ● Deduplication report template
    ● Jobs and Services report template
6. In the Select the Data You Want to See section, specify the Performance modules that you want to include in the report.
Add a Performance module
    a. In any unassigned Performance module area, from the Select a Module for this Position
       drop-down list, select a performance module.
    b. If all existing Performance module areas are assigned, at the bottom of the section, click
       Add another performance module.
Modify a Performance module
    ● In an assigned Performance module area, from the Performance module drop-down list,
      select a different Performance module.
Delete a Performance module
    ● In an assigned Performance module area, click Delete this performance module.
Repeat this step for each performance module that you want to include.
View a Live Performance report
View a Live Performance report.
1. In the InsightIQ web application, click the Dashboard tab.
2. In the InsightIQ Dashboard view, locate the monitored cluster that you want to view information about.
3. Click the Performance details link that is located beside the cluster name.
The Live Performance Reporting view for the cluster appears.
4. Optional: In the Date Range field, specify the start date and time and the end date and time for the period that you want to
view.
5. Select the Report Type that you want to view from the list.
You can select a Standard report or a Saved report.
6. Click View Report.
7. Scroll down to view the report.
Option
    Description
Change the name of the performance report
    Type the new name in the Performance Report Name field.
Add a performance module
    a. In the Select the Data You Want to See section, click Add another performance module.
    b. From the Select a Module for this Position list, select a performance module.
Modify a performance module
    a. Locate the performance module that you want to modify.
    b. From the performance module list, select a new performance module.
Remove a performance module
    a. Locate the performance module that you want to remove.
    b. In the performance module area, click Delete this performance module.
6. Complete changes to the performance report.
● If you want to save the report but continue editing it, click Save.
● If you want to save the report and then immediately apply the report to a monitored cluster and view the results, click
Finish.
Delete one report
    Beside the name of the report that you want to delete, click Delete.
Delete multiple reports at once
    Beside the names of the reports that you want to delete, select the Report checkbox, and
    then, in the Select an action list, select Delete selected reports.
Delete all live Performance reports
    Select the top-level checkbox that is next to the Report column title, and then, in the
    Select an action list, select Delete selected reports.
A Confirm Delete dialog box appears and prompts you to confirm that you want to delete the performance reports.
5. Click Delete.
Publish reports
Publish a report as a PDF.
InsightIQ generates a PDF, sometimes referred to as a generated Performance report, at each time that is specified in a
scheduled report. You can configure InsightIQ to send the completed PDF as an email attachment to up to 10 recipients.
You can also download the PDFs or view the published reports in the InsightIQ web application.
3. In the Generated Reports area, in the row for the generated report that you want to view, click View.
The report PDF appears.
Send one published report
    In the row for the published report that you want to send, click Send.
Send multiple reports in a single email
    In the rows for the reports that you want to send, select the check boxes, and then, in the
    Select an action list, select Send Selected reports.
Send all reports in a single email
    Select the top-level check box that is next to the Report column title, and then, in the
    Select an action drop-down list, select Send Selected reports.
Delete one published report
    Beside the name of the published report that you want to delete, click Delete.
Delete multiple published reports
    Beside the names of the reports that you want to delete, select the check boxes, and then, in the
    Select an action list, select Delete Selected reports.
Delete all published reports
    Select the top-level check box in the Generated Reports area, and then, in the Select an
    action list, click Delete selected reports.
A confirmation dialog box appears and prompts you to confirm that you want to delete the report.
4. Click OK.
Publish a report on demand
Manually publish a scheduled performance report.
1. In the InsightIQ web application, click the Performance Reporting tab and then click Manage Performance Reporting on
the Performance Reporting ribbon.
The Manage Performance Reporting view appears.
2. Click the Scheduled Performance Reports tab.
The Scheduled Performance Reporting view appears.
3. In the Report Schedules table, in the Actions column, click Generate for the report that you want to publish.
The Generate link is unavailable while InsightIQ publishes the report.
4. In the Manage Performance Reporting view, click the Generated Reports Archive tab.
The published report is listed in the Generated Reports table.
Performance reports are based on the time settings of the computer or virtual machine that is running InsightIQ, not the
time settings on the monitored cluster.
Create a report schedule from a template based on the default settings.
    Click Create from Blank Report.
Create a report schedule from a template based on a user-created Performance report.
    In the Saved Reports area, click the name of a report.
Create a report schedule from a template based on one of the standard reports included with InsightIQ.
    In the Standard Reports area, click the name of a report.
The Create a New Performance Report page reloads and displays report configuration options.
3. In the Build Your New Performance Report area, in the Performance Report Name field, type a name for the report
schedule.
4. Select the Scheduled Performance Report checkbox.
If you want to also create a live performance report that has the same properties as this static performance report schedule,
select both the Scheduled Performance Report and Live Performance Reporting checkboxes.
Generate one or more reports every day
    Select Hourly, specify the number of hours to elapse before the next report is
    generated, and specify the time when each report is generated for the first time each
    day. For example, if you configure InsightIQ to generate a report every 7 hours starting
    at 1:30 PM, the report is generated daily at 1:30 PM and 8:30 PM.
Generate no more than one report per day. Optionally, suspend report generation for a number of days.
    Select Daily, specify the number of days that elapse before the next report is
    generated, and specify the time of day when the report is generated.
Generate no more than one report per day. Optionally, suspend report generation for a number of weeks.
    Select Weekly, specify the number of weeks that elapse before the next report is
    generated, and specify one or more days of the week and the time of day when the
    report is generated.
Generate no more than one report per month
    Select Monthly, specify the number of months that elapse before the next report is
    generated, and specify the day of the month on which the report is generated. Reports
    are always generated at 11:59 PM on the specified day.
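The Hourly example (every 7 hours starting at 1:30 PM) can be checked with a short calculation. The following Python sketch
is illustrative only; the function name is hypothetical, and it assumes, consistent with the example, that the sequence
restarts at the configured first time each day, so that no run carries over past midnight.

```python
from datetime import datetime, timedelta


def daily_run_times(first_run, interval_hours):
    """List the times at which an Hourly schedule generates reports in one day.

    first_run: first run of the day in 24-hour "HH:MM" form.
    interval_hours: number of hours between reports.
    """
    if interval_hours <= 0:
        raise ValueError("interval_hours must be positive")
    start = datetime.strptime(first_run, "%H:%M")
    end_of_day = start.replace(hour=0, minute=0) + timedelta(days=1)
    runs = []
    current = start
    while current < end_of_day:
        runs.append(current.strftime("%I:%M %p"))
        current += timedelta(hours=interval_hours)
    return runs


print(daily_run_times("13:30", 7))
```

With these inputs the third candidate time would be 3:30 AM on the following day, which is why only two reports are
generated per day.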
8. If you want to send reports that are generated from this schedule to one or more email addresses, specify the addresses.
a. Select the Email this report as a PDF attachment each time it is generated checkbox.
b. Type one or more email addresses in the Report Recipients box.
Separate each address with a comma, a space, or a semicolon. You can specify up to 10 email addresses.
9. In the Insert the Data You Want to See area, specify the modules that you want to view in the report.
Add module
    a. If all existing module areas are assigned, click Add another performance module.
       NOTE: Unassigned modules contain a Select a Module for this Position list.
    b. In an unassigned module area, from the Select a Module for this Position list, select a module.
Modify a module
    ● In an assigned module area, from the module list, select a different module.
Repeat this step for each module that you want to include.
10. If you want to apply a breakout to a module, in a module area, in the Display Options area, click the name of the breakout
that you want to apply.
Repeat this step for each module that you want to apply a breakout to.
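The rules for the Report Recipients box in step 8 (addresses separated by a comma, a space, or a semicolon, with a maximum
of 10) can be checked before you paste a long list into the field. The following Python sketch is illustrative only; the
function name and the sample addresses are hypothetical, and the sketch does not validate the addresses themselves.

```python
import re

MAX_RECIPIENTS = 10  # limit stated in the procedure


def parse_recipients(text):
    """Split a recipients string on commas, spaces, or semicolons."""
    addresses = [part for part in re.split(r"[,;\s]+", text.strip()) if part]
    if len(addresses) > MAX_RECIPIENTS:
        raise ValueError(
            f"{len(addresses)} addresses given; the limit is {MAX_RECIPIENTS}"
        )
    return addresses


print(parse_recipients("ops@example.com, storage@example.com;admin@example.com"))
```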
Disable a scheduled Performance report
Disable a scheduled Performance report to temporarily prevent InsightIQ from publishing Performance reports.
You can disable a scheduled Performance report if you want to temporarily prevent InsightIQ from publishing Performance
reports, but do not want to permanently delete the report. A disabled report can be enabled so that InsightIQ publishes
Performance reports again.
1. Click Performance Reporting > Manage Performance Reporting.
The Manage Performance Reporting page appears with the Live Performance Reporting tab displayed.
2. Click the Scheduled Performance Reports tab.
The Scheduled Performance Reports view appears. A list of all saved report schedules is displayed in the Report
Schedules table.
3. In the Report Schedules table, specify which report schedule you want to disable.
Disable a single report schedule.
    In the row of the report schedule you want to disable, click the black double-bar icon in the
    Next Run (UTC) column.
Disable multiple report schedules.
    In the rows of the report schedules you want to disable, select the check boxes, and then, in the
    Select an action list, select Pause selected report schedules.
Disable all report schedules.
    In the header row, select the check box next to Report, and then, in the Select an action list,
    select Pause selected report schedules.
In the Next Run (UTC) column of the report schedules, the status changes to Paused. The icon changes to a right-pointing
triangle, and InsightIQ does not generate reports for the specified schedules.
Enable a single report schedule: In the row of the report schedule you want to enable, click the black right-pointing triangle
icon in the Next Run (UTC) column.
Enable multiple report schedules: In the rows of the report schedules you want to enable, select the checkboxes. In the Select
an action list, select Resume selected report schedules.
Enable all report schedules: In the header row, select the checkbox next to Report. In the Select an action list, select
Resume selected report schedules.
The next date on which InsightIQ publishes a report appears in the Next Run (UTC) column of the schedules.
The Manage Performance Reporting view appears with the Live Performance Reporting tab shown.
2. Click the Scheduled Performance Reports tab.
The Scheduled Performance Reports tab appears, and displays a list of all saved report schedules.
3. In the Scheduled Reports area, in the row for the scheduled report that you want to modify, click Edit.
The Edit this Performance Report view appears.
4. Modify the schedule settings as necessary, and then click Save.
Delete one report schedule: In the row for the report schedule that you want to delete, click Delete.
Delete multiple report schedules: In the rows for the report schedules that you want to delete, select the checkboxes, and then,
in the Select an action list, select Delete selected report schedules.
Delete all report schedules: Select the highest checkbox in the Scheduled Reports area, and then, in the Select an
action list, click Delete selected report schedules.
A dialog box appears and prompts you to confirm that you want to delete the report schedule or schedules.
4. Click Delete.
● Configure InsightIQ email settings. InsightIQ might not be authorized to send outbound email through the organization's
email server.
● Reduce the size of the published reports by removing breakouts from the report schedule. Many email servers reject email
messages that are larger than a certain limit. The email server might be rejecting the reports because they are too large.
● Configure the email program to allow InsightIQ to send email messages. The email program might be filtering email messages
that InsightIQ sends.
Depending on usage details, exceeding the recommendations in OneFS 7.1.0 and earlier can lead to dropped connections or
other performance problems. Generally speaking, if the number of active connections exceeds 90 percent of the limits, add
nodes to the cluster when possible.
In OneFS 7.1.1 and later, overallocation is handled differently. Connection requests are denied when the total number of clients
equals 90 percent of the total recommended limit for the cluster. If there are many active clients, and if that quantity is near the
supported limit, check each node to ensure that the load is evenly distributed. Verify that no single node is oversubscribed.
Keep these limits in mind when you perform maintenance on a node so that you do not exceed the pool or cluster limit when you
take nodes offline.
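The 90 percent guideline described above can be expressed as a simple check. The following Python sketch is illustrative only: the function names, node names, connection counts, and limits are hypothetical, and the actual recommended limits for a cluster come from the OneFS documentation, not from this example.

```python
def exceeds_guideline(active: int, recommended_limit: int, percent: int = 90) -> bool:
    """Return True if active connections exceed `percent` of the recommended limit."""
    if recommended_limit <= 0:
        raise ValueError("recommended_limit must be positive")
    # Integer arithmetic avoids floating-point rounding at the boundary.
    return active * 100 > recommended_limit * percent


def nodes_over_guideline(active_by_node: dict, per_node_limit: int, percent: int = 90) -> list:
    """Return the sorted names of nodes whose active connections exceed the guideline."""
    return sorted(
        node
        for node, active in active_by_node.items()
        if exceeds_guideline(active, per_node_limit, percent)
    )


# Hypothetical values, for illustration only.
print(exceeds_guideline(active=950, recommended_limit=1000))
print(nodes_over_guideline({"node-1": 95, "node-2": 40, "node-3": 91}, per_node_limit=100))
```

Running the per-node check before you take a node offline, with the connections of that node redistributed across the remaining nodes, gives a rough indication of whether maintenance would push any node past the guideline.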
For example, if you have an application that uses a particular SmartConnect zone, you can view the throughput on those nodes
to observe that application's performance. This information can be useful, but it is not a good indicator of potential performance
problems. There are many factors that affect the actual throughput rates over the network, many of which are unrelated to the
cluster itself.
Break out the data for a selected cache by cache data type and node. If you want to determine if input-output is balanced
across nodes or node pools, examining node details can be helpful.
Also, InsightIQ provides specific charts for throughput rates of the individual L1, L2, and L3 caches.
By comparing this graph to the disk, protocol, and network throughput rates, you can determine the actual impact of cache
throughput on system performance.
InsightIQ also provides specific graphs for throughput rates of the individual L1, L2, and L3 caches.
Protocol Operations Average Latency report
The Protocol Operations Average Latency report displays the average amount of time that is required for protocols to process
incoming operations.
You can also break out this information by node, protocol, client, or operation class. The Protocol Operations Average Latency
report is especially helpful when you break out the details. For example, view the latency by protocol, and then by
operation class for each protocol, to check for specific operations that might be causing excessive latency.
For the SMB protocol, the Change Notify operation can be the source of high latency. You might also see that namespace
operations introduce significant latency, which can be alleviated with SSDs. HTTP transactions also exhibit high latency and
skew the results. Filter out HTTP traffic, unless you are specifically interested in that data.
5
File System reports
This section contains the following topics:
Topics:
• File System reports
• Capacity reports overview
• Deduplication reports
• File System reports overview
• Quota reports
Allocated Capacity: The storage capacity of nodes that belong to node pools that include three or more nodes. In the OneFS
command-line interface, Allocated Capacity is referred to as size in the output of the isi_status command. If all node pools
include at least three nodes, the Allocated Capacity is the same as the Total Capacity.
Estimated Additional Protection Overhead: The estimated amount of data protection overhead that is required to protect data if
the cluster reaches capacity.
Capacity Forecast
Displays the amount of data that can be added to the cluster before the cluster reaches capacity. This report module displays the
following values.
Allocated Capacity: The storage capacity of nodes that belong to node pools that include three or more nodes. In the OneFS
command-line interface, Allocated Capacity is referred to as size in the output of the isi_status command. If all node pools
include at least three nodes, the Allocated Capacity is the same as the Total Capacity.
Estimated Additional Protection Overhead: The estimated amount of additional data protection overhead that is required to
protect data if the cluster is filled to capacity.
Estimated Usable Capacity: The amount of data that can be added to the cluster before the cluster reaches capacity.
Forecast data: The breakout of information that is shown in the Forecast chart. The Forecast data includes the
following information.
● Calculation Range: The highlighted range of data that is used to calculate the forecast.
● Forecast Usage: The projected Total Usage over time.
● Standard Deviation: The standard deviation of the Forecast Usage calculation.
● Outliers: Data points that are significantly outside of the range of the bulk of the Calculation Range data. Depending on
the frequency and amount of variation, these points can have a major impact on the accuracy of the Forecast Usage data.
Overhead estimated based on FSA report: Specifies which FSA report to base the Estimated Additional Protection Overhead on.
The estimated overhead is calculated based on the ratio of data protection that is required to protect user data at the
time that the FSA report was created.
Provisioned Capacity: The storage capacity of nodes that belong to node pools.
Remaining Capacity: The total amount of storage capacity that is unused on the cluster.
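The Standard Deviation and Outliers values in the Forecast data can be illustrated with a short Python sketch. This guide does not describe how InsightIQ identifies outliers, so the rule used here (flag any sample that lies more than two sample standard deviations from the mean) is an assumption, and the function name and usage figures are hypothetical.

```python
from statistics import mean, stdev


def flag_outliers(samples, threshold=2.0):
    """Return samples that lie more than `threshold` standard deviations from the mean."""
    if len(samples) < 2:
        return []
    center = mean(samples)
    spread = stdev(samples)  # sample standard deviation
    if spread == 0:
        return []
    return [s for s in samples if abs(s - center) / spread > threshold]


# Hypothetical usage samples (for example, TB used per day) in a calculation range.
usage = [10, 11, 9, 10, 10, 11, 9, 30]
print(flag_outliers(usage))
```

A single extreme sample also inflates the standard deviation itself, which is one reason a small number of outliers can noticeably affect a forecast.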
Directories
Displays disk-usage data and file counts per directory, recursively. You can sort the information by clicking a column heading in
the table view. Re-sorting causes the chart on the left to reload and display a visual representation of the specified information.
You can click pathnames to view more specific information about any subdirectories contained in the directory. You can also
create a filter for a specific directory that you can use to filter data in other InsightIQ views.
NOTE: The information that is displayed in this module always refers to the current state of the cluster and is not
influenced by the selected date range.
Quota Browser
Deduplicated Data: The amount of data that has been deduplicated.
Paths: Displays whether a quota has been applied on or under a directory. The selected directory determines
which quotas are displayed in the Quotas Applied On and Under Directory module. This module reflects
the current cluster configuration.
Quotas Applied On and Under Directory: Displays the quotas that have been applied on or under the selected directory. The
Select a report list determines the point in time that this module represents, regardless of the Date Range selected.
Accessed Time: Breaks out data by when a file was last accessed.
Directory: Breaks out data by the size of each directory.
File Extension: Breaks out data by file name extension.
Logical Size: Breaks out data by logical file size. Logical file size calculations include only data, and do not include
data-protection overhead.
Changed Time: Breaks out data by when a file was last changed.
Node Pool: Breaks out data by the size of each node pool.
Physical Size: Breaks out data by physical size. Physical size calculations include data-protection overhead.
Tier: Breaks out data by the size of each node tier.
User Attribute: Breaks out data by a user-defined attribute, if an attribute is defined. You can define attributes through
the OneFS command-line interface.
Deduplication reports
Deduplication reports allow you to view historical and current information about data that has been deduplicated with the
SmartDedupe software module.
You can view information about specific deduplication jobs or the current state of deduplication on the cluster. Deduplication job
information is not cumulative. For example, if you run a deduplication job that deduplicates 10 GB of data, and then later run a
deduplication job that deduplicates another 10 GB of data, both data points on the graph will report that only 10 GB of data was
deduplicated.
To view deduplication reports, you must activate a SmartDedupe license on the monitored cluster. For more information about
SmartDedupe, see the OneFS Web Administration Guide and OneFS CLI Administration Guide.
NOTE: Deduplication information is also included in Performance reports. In Performance reports, deduplication information
is cumulative.
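The difference between the per-job values in Deduplication reports and the cumulative values in Performance reports can be illustrated with the two 10 GB jobs from the example above. This Python sketch only demonstrates the arithmetic; the variable names are hypothetical and do not correspond to any InsightIQ interface.

```python
from itertools import accumulate

# Amount of data deduplicated by each job, in GB (values from the example above).
per_job_gb = [10, 10]

# A cumulative view adds each job to the running total.
cumulative_gb = list(accumulate(per_job_gb))

print("Per job:   ", per_job_gb)
print("Cumulative:", cumulative_gb)
```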
NOTE: If your FSA report includes a directory name that has spaces, you must enclose the directory path in quotes.
5. If you are viewing a data properties report, and want to compare the data for the selected reporting date with data for
another reporting date, from the Compare to list, select the other reporting date.
6. Click View Report.
The file system modules that are associated with the specified report appear.
7. If you want to apply a breakout, in a file system module area, in the Breakout by area, click the name of a breakout.
The breakouts appear below the chart in the file system module section.
8. If you want to apply a filter, in a file system module area, click the green button of the filter you want to apply.
For example, if you want to filter by write events, click write.
To remove a filter rule from a report, click the red button of the filter you want to remove in either the Data Filters section
or a file system module section where the filter is applied.
For example, if you want to stop filtering by write events, click Event:write.
Quota reports
Quota reports display information about quotas created through the SmartQuotas software module.
Quota reports can be useful if you want to compare the data usage of a directory to the quota limits for that directory over
time. This information can help you predict when a directory is likely to reach its quota limit.
Historical quota data is generated according to the quota reports that are generated by OneFS. The granularity of historical
quota data depends on how often quota reports are created. For information about configuring how frequently quota reports are
generated, see the OneFS Web Administration Guide and OneFS CLI Administration Guide.
To view quota reports, activate a SmartQuotas license on the monitored cluster.
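As a rough illustration of predicting when a directory might reach its quota limit, the following Python sketch fits a straight line to historical usage samples and extrapolates to the limit. This is not a feature of InsightIQ or OneFS: the function name, the sample data, and the assumption of linear growth are all hypothetical.

```python
def days_until_limit(samples, limit):
    """Estimate the day on which usage reaches `limit`, using a least-squares line.

    `samples` is a list of (day, usage) pairs. Returns None if usage is not growing.
    """
    n = len(samples)
    if n < 2:
        raise ValueError("at least two samples are required")
    mean_x = sum(x for x, _ in samples) / n
    mean_y = sum(y for _, y in samples) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in samples)
    if sxx == 0:
        raise ValueError("samples must span more than one day")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in samples)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    if slope <= 0:
        return None  # usage is flat or shrinking, so the limit is never reached
    return (limit - intercept) / slope


# Hypothetical quota report data: (day, GB used) with a 200 GB limit.
print(days_until_limit([(0, 100), (1, 110), (2, 120)], limit=200))
```

Because the granularity of historical quota data depends on how often quota reports are created, an estimate like this is only as good as the number and spacing of the available samples.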
Modules
Modules show collections of information about the cluster.
Modules are preconfigured collections of information about the monitored cluster. You can add modules to, and remove modules
from, any Performance report.
NOTE: Depending on which version of the OneFS operating system the monitored cluster is running, certain InsightIQ
features may not be available.
Contended File System Events Rate: Shows the number of file contention events, such as lock contention or read/write
contention, occurring in the file system per second. You can optionally break out this data by path or node.
CPU % Use: Shows the average CPU usage for all nodes in the monitored cluster. As some nodes may
consume significantly more or less CPU resources than others, the average reflects the sum
of the individual CPU-usage averages for each node.
NOTE: You can optionally break out this data by node. This breakout indicates the average
CPU usage of each node. For example, at 10:52:22 AM on January 10, 2017, the specified
node was using 14.35% of the total available node CPU capacity.
CPU Usage Rate: Shows the rate of CPU time that is consumed on all cores per second. You can optionally
break out this data by job, node, or service.
Deadlocked File System Events Rate: Shows the number of file system deadlock events that the file system is processing per
second. This information can be useful if you want to identify a specific file state that might be
contributing to performance issues. Deadlocked events occur regularly during normal cluster
operation, and the file system is designed to detect and break them. You can optionally break
out this data by path or node.
Deduplication Summary (Logical): Shows the amount of space that deduplication has saved on the cluster and the amount
of data that has been deduplicated. This module refers to the logical space and data. The
file metadata and protection overhead are not considered. This module is available only for
clusters running OneFS 7.1.1 or later.
Deduplication Summary (Physical): Shows the amount of space that deduplication has saved on the cluster and the amount of
data that has been deduplicated. This module refers to the estimated physical space and data.
The file metadata and protection overhead are considered. This module is available only for
clusters running OneFS 7.1.1 or later.
External Network Packets Rate: Shows the total number of packets that passed through the external network interfaces in the
monitored cluster. You can optionally break out this data by direction, interface, or node.
External Network Throughput Rate: Shows the total amount of data that passed through the external network interfaces in
the monitored cluster. You can optionally break out this data by interface, direction, client,
operation class, protocol, or node.
File System Events Rate: Shows the number of file system events, or operations, such as read, write, lookup, or rename,
that the file system is servicing per second. You can optionally break out this data by direction,
operation class, path, node, or event.
File System Throughput Rate: Shows the rate at which data is being read from and written to the file system.
Job Workers: Shows the number of active and assigned workers on the cluster. An active worker is a worker
that is performing a system job. An assigned worker is a worker that has been assigned to a
system job but is not currently performing the job. You can optionally break out this data by
job name or job ID.
Jobs: Shows the number of active and inactive jobs on the cluster. An active job is a system job
that workers perform. An inactive job is a system job that has been assigned workers, but the
workers are not currently performing the job. You can optionally break out this data by job
name or job ID.
L1 and L2 Cache Prefetch Throughput Rate: Shows the amount of data that was prefetched for L1 and L2 and how much of the
prefetched data was requested.
● Starts - Indicates the amount of data that was requested.
● Hits - Indicates the amount of requested data that was available.
L1 Cache Throughput Rate: Shows the amount of data that was requested from the L1 cache and how much of the
requested data was available in the L1 cache.
● Starts - Indicates the amount of data that was requested.
● Hits - Indicates the amount of requested data that was available.
● Waits - Indicates the amount of requested data that existed in cache but was not available
because the data was in use.
● Misses - Indicates the amount of requested data that did not exist in cache.
● Prefetch Hits - Indicates the amount of data that was requested from prefetch.
L2 Cache Throughput Rate: Shows the amount of data that was requested from the L2 cache and how much of the
requested data was available in the L2 cache.
● Starts - Indicates the amount of data that was requested.
● Hits - Indicates the amount of requested data that was available.
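One way to summarize the Starts and Hits values of a cache module is as a hit ratio: the fraction of requested data that was available in the cache. The following Python sketch assumes that both values are expressed in the same unit. The function name and sample values are hypothetical, and the ratio is not a value that InsightIQ is documented to report.

```python
def hit_ratio(starts: float, hits: float) -> float:
    """Return the fraction of requested data (starts) that was available (hits)."""
    if starts < 0 or hits < 0:
        raise ValueError("starts and hits must be non-negative")
    if starts == 0:
        return 0.0  # nothing was requested, so there is no meaningful ratio
    return hits / starts


# Hypothetical sample: 200 MB/s requested, 150 MB/s served from cache.
print(f"{hit_ratio(200, 150):.0%}")
```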
Breakouts
Refine information in reports with a breakout.
Modules in reports can provide a breakout of data by category to refine the scope of information.
You can apply breakouts to modules to view the individual contributions of various performance characteristics. You can apply only
one breakout to a module at a time.
Breakouts provide heat maps that display variations of color to represent each component's contribution to overall performance.
The darker the color on a heat map, the greater the activity for that component. Heat maps help you to visualize performance
trends and to identify periods of constrained performance. If you hover the mouse pointer over any location on a heat map,
InsightIQ shows data for the specified component at that moment in time. Breakouts are sorted by component based on level of
activity, with the most active elements at the top of the list.
Breakouts can be useful when trying to determine the source of a performance issue. For example, if you break out the CPU
usage module by node, you can then view the individual CPU usage of each node.
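The ordering that a breakout uses, with the most active components at the top, can be sketched in a few lines of Python. The node names and CPU values below are hypothetical sample data, not output from InsightIQ.

```python
# Hypothetical average CPU usage per node, in percent.
cpu_by_node = {"node-1": 14.35, "node-2": 62.10, "node-3": 37.80}

# Sort components by activity, most active first, as a breakout list does.
ranked = sorted(cpu_by_node.items(), key=lambda item: item[1], reverse=True)

for node, cpu in ranked:
    print(f"{node}: {cpu:.2f}%")
```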
Breakout types
Breakouts provide information on the individual components within modules.
Op Class (Operation Class): Reports data by the type of operation performed. The following operation types are
supported:
● Read - file and stream reading
● Write - file and stream writing
● Create - file, link, node, stream, and directory creation
● Delete - file, link, node, stream, and directory deletion
● Namespace Read - attribute, statistic, ACL read, lookup, and directory read
● Namespace Write - rename, set attribute, set permission, time, and write ACL
● File State - open, close, lock, acquire, release, break, check, and notify
● Session State - negotiate, inquire, and change protocol connections and sessions
● Other - file system information and other operations that are not categorized
Path: Reports data by individual file name or directory name.
Protocol: Reports data by protocol (for example, NFS).
Service: Reports data by service.
Tier: Reports data by tier.
Applying a breakout
Apply a breakout to a report.
1. In the InsightIQ web application, open any Performance report or File System Analytics report.
Filters
Refine a large data set with a filter.
You can create and manage filters, each of which contains one or more filter rules. Apply filters to any Performance report or
File System Analytics report to limit the information that is displayed to specified parameters.
If you apply a filter rule to a report, all the Performance modules in the report display information that is constrained by the filter
rule.
For example, you can apply a filter rule to show information about a specific node in a cluster. All the Performance modules in
the report display data about that node.
While breakouts show all components in a category, filters can limit the number of contributors in a category. For example,
you can break out a Performance module by protocol. Breakouts appear beneath Performance modules but do not modify the
Performance module. Filter rules limit what data is displayed in Performance modules. Filters are customized collections of filter
rules that you can create, save, and apply to various reports.
A filter can contain both a filter rule for a specific node in a cluster and a filter rule for a specific client accessing that cluster.
Applying this filter causes all Performance modules in a report to display information about the interactions for only that node
and client. You can save filter rules and then apply them to specific reports.
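The way a filter that contains several rules narrows the data can be sketched as follows. In this Python example, a record is kept only if it matches every rule in the filter, which mirrors the node-and-client example above. The record layout, field names, and values are hypothetical and do not represent InsightIQ's internal data model.

```python
def apply_filter(records, rules):
    """Keep only the records that match every rule in the filter."""
    return [
        record
        for record in records
        if all(record.get(field) == value for field, value in rules.items())
    ]


# Hypothetical performance samples.
records = [
    {"node": 1, "client": "10.0.0.5", "ops": 120},
    {"node": 1, "client": "10.0.0.9", "ops": 80},
    {"node": 2, "client": "10.0.0.5", "ops": 45},
]

# A filter with two rules: one for a node and one for a client.
node_and_client = {"node": 1, "client": "10.0.0.5"}

print(apply_filter(records, node_and_client))
```

A filter with a single rule, such as only the node rule, would keep both records for that node, which corresponds to applying one filter rule to a report.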
Filter rules
Rules for InsightIQ filters.
To define a filter rule, specify the match parameter. For some rules, InsightIQ automatically generates a list of valid parameters.
For other rules, type a valid parameter.
NOTE: If InsightIQ is monitoring a OneFS 8.2.0 cluster, the statistics, breakouts, filters, and report data might not display
properly.
Input values for the following Data Filter rules.
File Extension: The file name extension, for example, .xml. To filter files without file name extensions, specify the following:
(none)
User Attribute: The name of a user attribute that is defined on the cluster. All characters are valid. Only available in the
File System reports.
You can select conditional values for the following Filter rules.
Create a filter
Create a filter that consists of one or more rules.
Specify the rules to include in a filter, then save the filter and apply it to reports.
1. On the Live Performance Reporting view, below the Date Range control, click Create/manage data filters.
The Data Filters control appears.
2. On the Data Filters control toolbar, click Add Rule.
3. From the Type drop-down list, select an attribute type that you want to filter on.
4. In the Match field, select or type a value for the attribute you selected.
Repeat steps 2 through 4 as needed until you have built and applied all the rules that you want to include in the filter.
5. Click Apply.
The filter is applied to all Performance modules in the report. Each rule is indicated
6. From the Manage menu, click Save.
The Save Filter As dialog box appears.
7. In the Please enter a filter name field, type a name for the filter, and then click OK.
The filter is saved, and the name of the filter appears in the Current filter field, indicating that the filter is applied globally
across all Performance modules in the report.
Modify a filter
Modify and remove the filter rules that are contained in a filter.
1. On the Live Performance Reporting view, click Create/manage data filters.
The Data Filters view appears.
2. On the Manage menu, point to Load filter, and then click the name of the filter that you want to modify.
The individual filter rules appear in the Filter section, and InsightIQ applies the filter globally across all Performance modules
in the report.
3. Modify filter settings as needed.
Modify a filter rule:
a. In the Match field for the filter rule that you want to modify, select or type a filter value.
Remove a filter rule:
a. Click either the Type list or Match field for the filter rule you want to remove.
b. Click Delete Rule.
Save a new instance of the filter with a different name:
a. On the Manage menu, click Save as.
The Save Filter dialog box appears.
b. In the Please enter a filter name field, type a name for the filter, and then click OK.
Delete a filter
Permanently delete a filter.
1. On the Live Performance Reporting view, click Create/manage data filters.
The Data Filters view appears.
2. On the Manage menu, point to Load Filter, and then click the name of the filter that you want to delete.
The individual filter rules appear in the Filter section, and InsightIQ applies the filter globally across all Performance modules
in the report.
3. On the Manage menu, click Delete.
NOTE: To remove an individual rule from a filter while keeping the rest of the filter intact, with the filter loaded, click
the rule that you want to delete. Click the Delete Rule button at the top of the rule list, and then save the filter.
The Delete Filter dialog box appears and prompts you to confirm that you want to delete the filter.
4. Click Yes.