NetApp ONTAP 9 - Cluster - Administration
ONTAP 9
NetApp
October 21, 2024
• System Manager is included with ONTAP software as a web service, enabled by default,
and accessible by using a browser.
• The name of System Manager has changed beginning with ONTAP 9.6. In ONTAP 9.5 and
earlier, it was called OnCommand System Manager. Beginning with ONTAP 9.6, it is called
System Manager.
• If you are using the classic System Manager (available only in ONTAP 9.7 and earlier), refer
to System Manager Classic (ONTAP 9.0 to 9.7)
Using the System Manager Dashboard, you can view at-a-glance information about important alerts and
notifications, the efficiency and capacity of storage tiers and volumes, the nodes that are available in a cluster,
the status of the nodes in an HA pair, the most active applications and objects, and the performance metrics of
a cluster or a node.
With System Manager you can perform many common tasks, such as the following:
• Create a cluster, configure a network, and set up support details for the cluster.
• Configure and manage storage objects, such as disks, local tiers, volumes, qtrees, and quotas.
• Configure protocols, such as SMB and NFS, and provision file sharing.
• Configure protocols such as FC, FCoE, NVMe, and iSCSI for block access.
• Create and configure network components, such as subnets, broadcast domains, data and management
interfaces, and interface groups.
• Set up and manage mirroring and vaulting relationships.
• Perform cluster management, storage node management, and storage virtual machine (storage VM)
management operations.
• Create and configure storage VMs, manage storage objects associated with storage VMs, and manage
storage VM services.
• Monitor and manage high-availability (HA) configurations in a cluster.
• Configure service processors to remotely log in, manage, monitor, and administer the node, regardless of
the state of the node.
System Manager terminology
System Manager uses different terminology than the CLI for some key ONTAP functionality.
• Local tier – a set of physical solid-state drives or hard-disk drives you store your data on. You might know
these as aggregates. In fact, if you use the ONTAP CLI, you will still see the term aggregate used to
represent a local tier.
• Cloud tier – storage in the cloud used by ONTAP when you want to have some of your data off premises
for one of several reasons. If you are thinking of the cloud part of a FabricPool, you’ve already figured it
out. And if you are using a StorageGRID system, your cloud might not be off premises at all. (A cloud-like
experience on premises is called a private cloud.)
• Storage VM – a virtual machine running within ONTAP that provides storage and data services to your
clients. You might know this as an SVM or a vserver.
• Network interface – an address and properties assigned to a physical network port. You might know this
as a logical interface (LIF).
• Pause – an action that halts operations. In versions of System Manager before ONTAP 9.8, this action
was referred to as quiesce.
Beginning with ONTAP 9.12.1, System Manager is fully integrated with BlueXP.
With BlueXP, you can manage your hybrid multicloud infrastructure from a single control plane
while retaining the familiar System Manager dashboard.
Steps
1. Point the web browser to the IP address of the cluster management network interface:
If the cluster uses a self-signed digital certificate, the browser might display a warning indicating that the
certificate is not trusted. You can either acknowledge the risk to continue the access or install a Certificate
Authority (CA) signed digital certificate on the cluster for server authentication.
2. Optional: If you have configured an access banner by using the CLI, then read the message that is
displayed in the Warning dialog box, and choose the required option to proceed.
This option is not supported on systems on which Security Assertion Markup Language (SAML)
authentication is enabled.
◦ If you do not want to continue, click Cancel, and close the browser.
◦ If you want to continue, click OK to navigate to the System Manager login page.
3. Log in to System Manager by using your cluster administrator credentials.
Beginning with ONTAP 9.11.1, when you log in to System Manager, you can specify the
locale. The locale specifies certain localization settings, such as language, currency, time
and date format, and similar settings. For ONTAP 9.10.1 and earlier, the locale for System
Manager is detected from the browser. To change the locale for System Manager, you have
to change the locale of the browser.
4. Optional: Beginning with ONTAP 9.12.1, you can specify your preference for the appearance of System
Manager:
a. In the upper right corner of System Manager, click to manage user options.
b. Position the System Theme toggle switch to your preference:
Related information
Managing access to web services
Accessing a node’s log, core dump, and MIB files by using a web browser
With a NetApp License File, you do not have to add separate feature license keys. You download the NetApp License File from the NetApp Support
Site.
If you already have license keys for some features and you are upgrading to ONTAP 9.10.1, you can continue
to use those license keys.
Steps
1. Select Cluster > Settings.
2. Under Licenses, select .
3. Select Browse. Choose the NetApp License File you downloaded.
4. If you have license keys you want to add, select Use 28-character license keys and enter the keys.
Node configuration details include the node name, system serial number, system ID, system model, ONTAP
version, MetroCluster information, SP/BMC network information, and encryption configuration information.
Steps
1. Click Cluster > Overview.
2. Click to display the drop-down menu.
3. Select Download configuration.
4. Select the HA pairs, then click Download.
Tags can be added when you create a cluster, or they can be added later.
You define a tag by specifying a key and associating a value to it using the format “key:value”. For example:
“dept:engineering” or “location:san-jose”.
The following should be considered when you create tags:
• Keys have a minimum length of one character and cannot be null. Values can be null.
• A key can be paired with multiple values by separating the values with a comma, for example,
“location:san-jose,toronto”
• Tags can be used for multiple resources.
• Keys must start with a lowercase letter.
Steps
To manage tags, perform the following steps:
To edit a tag:
a. Modify the content in the Key and Values (optional) fields.
b. Click Save.
To receive alerts about firmware updates, you must be registered with Active IQ Unified
Manager. Refer to Active IQ Unified Manager documentation resources.
Steps
1. In System Manager, select Support.
2. Click on the following links to perform procedures:
◦ Case Number: See details about the case.
◦ Go to NetApp Support Site: Navigate to the My AutoSupport page on the NetApp Support Site to
view knowledge base articles or submit a new support case.
◦ View My Cases: Navigate to the My Cases page on the NetApp Support Site.
◦ View Cluster Details: View and copy information you will need when you submit a new case.
Beginning with ONTAP 9.10.1, you can use System Manager to enable telemetry logging. When telemetry
logging is allowed, messages that are logged by System Manager are given a specific telemetry identifier that
indicates the exact process that triggered the message. All messages that are issued relating to that process
have the same identifier, which consists of the name of the operational workflow and a number (for example
"add-volume-1941290").
If you experience performance problems, you can enable telemetry logging, which allows support personnel to
more easily identify the specific process for which a message was issued. When telemetry identifiers are
added to the messages, the log file is only slightly enlarged.
Steps
1. In System Manager, select Cluster > Settings.
2. In the UI Settings section, click the check box for Allow telemetry logging.
Beginning with ONTAP 9.13.1, you can specify the maximum capacity that can be allocated for all volumes in a
storage VM. You can enable the maximum capacity when you add a storage VM or when you edit an existing
storage VM.
Steps
1. Select Storage > Storage VMs.
2. Perform one of the following:
◦ To add a storage VM, click .
◦ To edit a storage VM, click next to the name of the storage VM, and then click Edit.
3. Enter or modify the settings for the storage VM, and select the check box labeled "Enable maximum
capacity limit".
4. Specify the maximum capacity size.
5. Specify the percentage of the maximum capacity you want to use as a threshold to trigger alerts.
6. Click Save.
Edit the maximum capacity limit of a storage VM
Beginning with ONTAP 9.13.1, you can edit the maximum capacity limit of an existing storage VM, if the
maximum capacity limit has been enabled already.
Steps
1. Select Storage > Storage VMs.
2. Click next to the name of the storage VM, and then click Edit.
The check box labeled "Enable maximum capacity limit" is already checked.
To disable the maximum capacity limit:
1. Uncheck the check box.
2. Click Save.
To modify the maximum capacity limit:
1. Specify the new maximum capacity size. (You cannot specify a size that is less than the already
allocated space in the storage VM.)
2. Specify the new percentage of the maximum capacity you want to use as a threshold to trigger alerts.
3. Click Save.
Related information
• View the maximum capacity limit of a storage VM
• Capacity measurements in System Manager
• Manage SVM capacity limits
• Beginning with ONTAP 9.10.1, System Manager lets you view historical data about the cluster’s capacity
and projections about how much capacity will be used or available in the future. You can also monitor the
capacity of local tiers and volumes.
• Beginning with ONTAP 9.12.1, System Manager displays the amount of committed capacity for a local tier.
• Beginning with ONTAP 9.13.1, you can enable a maximum capacity limit for a storage VM and set a
threshold to trigger alerts when the used storage reaches a certain percentage of the maximum capacity.
Measurements of used capacity are displayed differently depending on your ONTAP version.
Learn more in Capacity measurements in System Manager.
View the capacity of a cluster
You can view capacity measurements for a cluster on the Dashboard in System Manager.
Steps
1. In System Manager, click Dashboard.
2. In the Capacity section, you can view the following:
In System Manager, capacity representations do not account for root storage tier
(aggregate) capacities.
3. Click the chart to view more details about the capacity of the cluster.
◦ The top chart displays the physical capacity: the size of physical used, reserved, and available space.
◦ The bottom chart displays the logical capacity: the size of client data, Snapshot copies, and clones, and
the total logical used space.
◦ Data reduction ratio for only the client data (Snapshot copies and clones are not included).
◦ Overall data reduction ratio.
You can view details about the capacity of local tiers. Beginning with ONTAP 9.12.1, the Capacity view also
includes the amount of committed capacity for a local tier, enabling you to determine whether you need to add
capacity to the local tier to accommodate the committed capacity and avoid running out of free space.
Steps
1. Click Storage > Tiers.
2. Select the name of the local tier.
3. On the Overview page, in the Capacity section, the capacity is shown in a bar chart with three
measurements:
◦ Used and reserved capacity
◦ Available capacity
◦ Committed capacity (beginning with ONTAP 9.12.1)
4. Click the chart to view details about the capacity of the local tier.
◦ The top bar chart displays physical capacity: the size of physical used, reserved, and available space.
◦ The bottom bar chart displays logical capacity: the size of client data, Snapshot copies, and clones,
and the total of logical used space.
Below the bar charts are measurements of data reduction ratios:
◦ Data reduction ratio for only the client data (Snapshot copies and clones are not included).
◦ Overall data reduction ratio.
Optional actions
• If the committed capacity is larger than the capacity of the local tier, you might consider adding capacity to
the local tier before it runs out of free space. See Add capacity to a local tier (add disks to an aggregate).
• You can also view the storage that specific volumes use in the local tier by selecting the Volumes tab.
You can view how much storage is used by the volumes in a storage VM and how much capacity is still
available. The total measurement of used and available storage is called "capacity across volumes".
Steps
1. Select Storage > Storage VMs.
2. Click on the name of the storage VM.
3. Scroll to the Capacity section, which shows a bar chart with the following measurements:
◦ Physical used: Sum of physical used storage across all volumes in this storage VM.
◦ Available: Sum of available capacity across all volumes in this storage VM.
◦ Logical used: Sum of logical used storage across all volumes in this storage VM.
For more details about the measurements, see Capacity measurements in System Manager.
Beginning with ONTAP 9.13.1, you can view the maximum capacity limit of a storage VM.
Steps
1. Select Storage > Storage VMs.
◦ In the row for the storage VM, view the Maximum Capacity column which contains a bar chart that
shows the used capacity, available capacity, and maximum capacity.
◦ Click the name of the storage VM. On the Overview tab, scroll to view the maximum capacity,
allocated capacity, and capacity alert threshold values in the left column.
Related information
• Edit the maximum capacity limit of a storage VM
• Capacity measurements in System Manager
Nodes
• You can view the front and rear views.
• For models with an internal disk shelf, you can also view the disk layout in the front view.
• You can view the following platforms:
[Table of supported platforms, including the AFF C250; some entries are marked with an asterisk.]
* Install the latest patch releases to view these devices.
Ports
• You will see a port highlighted in red if it is down.
• When you hover over the port, you can view the status of a port and other details.
• You cannot view console ports.
Notes:
◦ For ONTAP 9.10.1 and earlier, you will see SAS ports highlighted in red when they are disabled.
◦ Beginning with ONTAP 9.11.1, you will see SAS ports highlighted in red only if they are in an error
state or if a cabled port that is being used goes offline. The ports appear in white if they are offline
and uncabled.
FRUs
Information about FRUs appears only when the state of a FRU is non-optimal.
Adapter cards
• Cards with defined part number fields display in the slots if external cards have been inserted.
• Ports display on the cards.
• For a supported card, you can view images of that card. If the card is not in the list of supported part
numbers, then a generic graphic appears.
Disk shelves
• You can display the front and rear views.
• You can view the following disk shelf models:
◦ ONTAP 9.9.1 and later: all shelves that have not been designated as "end of service" or
"end of availability"
◦ ONTAP 9.8: DS4243, DS4486, DS212C, DS2246, DS224C, and NS224
Shelf ports
• You can view port status.
• You can view remote port information if the port is connected.
Shelf FRUs
• PSU failure information displays.
Storage switches
• The display shows switches that act as storage switches used to connect shelves to nodes.
• Beginning with ONTAP 9.9.1, System Manager displays information about a switch that acts as both a
storage switch and a cluster switch, which can also be shared between nodes of an HA pair.
• The following information displays:
◦ Switch name
◦ IP address
◦ Serial number
◦ SNMP version
◦ System version
• You can view the following storage switch models:
◦ ONTAP 9.11.1 or later: Cisco Nexus 3232C, Cisco Nexus 9336C-FX2, Mellanox SN2100
◦ ONTAP 9.9.1 and 9.10.1: Cisco Nexus 3232C, Cisco Nexus 9336C-FX2
◦ ONTAP 9.8: Cisco Nexus 3232C
Beginning with ONTAP 9.12.1, you can view the following cabling information:
• Cabling between controllers, switches, and shelves when no storage bridges are used
• Connectivity that shows the IDs and MAC addresses of the ports on either end of the cable
Add nodes to a cluster
You can increase the size and capabilities of your cluster by adding new nodes.
Steps
1. Select Cluster > Overview.
The new controllers are shown as nodes connected to the cluster network but are not in the cluster.
2. Select Add.
◦ The nodes are added into the cluster.
◦ Storage is allocated implicitly.
Steps
1. Select (Return to classic version).
2. Select Configurations > Cluster Expansion.
When you reboot or shut down a node, its HA partner automatically executes a takeover.
This procedure applies to FAS, AFF, and current ASA systems. If you have an ASA r2 system
(ASA A1K, ASA A70, or ASA A90), follow these steps to shut down and reboot a node. ASA r2
systems provide a simplified ONTAP experience specific to SAN-only customers.
Steps
1. Select Cluster > Overview.
2. Under Nodes, select .
3. Select the node and then select Shut down, Reboot, or Edit Service Processor.
If a node has been rebooted and is waiting for giveback, the Giveback option is also available.
If you select Edit Service Processor, you can choose Manual to input the IP address, subnet mask and
gateway, or you can choose DHCP for dynamic host configuration.
Rename nodes
Beginning with ONTAP 9.14.1, you can rename a node from the cluster overview page.
This procedure applies to FAS, AFF, and current ASA systems. If you have an ASA r2 system
(ASA A1K, ASA A70, or ASA A90), follow these steps to rename a node. ASA r2 systems
provide a simplified ONTAP experience specific to SAN-only customers.
Steps
1. Select Cluster. The cluster overview page displays.
2. Scroll down to the Nodes section.
3. Next to the node that you want to rename, select , and select Rename.
4. Modify the node name, and then select Rename.
License management
ONTAP licensing overview
A license is a record of one or more software entitlements. Beginning with ONTAP 9.10.1,
all licenses are delivered as a NetApp license file (NLF), which is a single file that enables
multiple features. Beginning in May 2023, all AFF systems (both A-series and C-series)
and FAS systems are sold with either the ONTAP One software suite or the ONTAP Base
software suite, and beginning in June 2023, all ASA systems are sold with ONTAP One
for SAN. Each software suite is delivered as a single NLF, replacing the separate NLF
bundles first introduced in ONTAP 9.10.1.
ONTAP One contains all available licensed functionality. It contains a combination of the contents of the former
Core bundle, Data Protection bundle, Security and Compliance bundle, Hybrid Cloud bundle, and Encryption
bundle, as shown in the table. Encryption is not available in restricted countries.
• Data Protection bundle: SnapMirror (asynchronous, synchronous, Business Continuity), SnapCenter,
SnapMirror S3 for NetApp targets
• Hybrid Cloud bundle: SnapMirror cloud, SnapMirror S3 for non-NetApp targets
• Encryption bundle: NetApp Volume Encryption, Trusted Platform Module
ONTAP One does not include any of NetApp’s cloud-delivered services, including the following:
• BlueXP tiering
• Cloud Insights
• BlueXP backup
• Data governance
If you have existing systems that are currently under NetApp support but have not been upgraded to ONTAP
One, the existing licenses on those systems are still valid and continue to work as expected. For example, if
the SnapMirror license is already installed on existing systems, it is not necessary to upgrade to ONTAP One
to get a new SnapMirror license. However, if you do not have a SnapMirror license installed on an existing
system, the only way to get that license is to upgrade to ONTAP One for an additional fee.
Beginning in June 2023, ONTAP systems using 28-character license keys can also upgrade to the ONTAP
One or ONTAP Base compatibility bundle.
ONTAP Base is an optional software suite that’s an alternative to ONTAP One for ONTAP systems. It is for
specific use cases where data protection technologies such as SnapMirror and SnapCenter, as well as security
features like Autonomous Ransomware Protection, are not required, such as non-production systems for dedicated test or
development environments. Additional licenses cannot be added to ONTAP Base. If you want additional
licenses, such as SnapMirror, you must upgrade to ONTAP One.
• Encryption bundle: NetApp Volume Encryption, Trusted Platform Module
ONTAP One for SAN is available for ASA A-series and C-series systems. This is the only software suite
available for SAN. ONTAP One for SAN contains the following licenses:
In ONTAP 8.2 through ONTAP 9.9.1, license keys are delivered as 28-character strings, and there is one key
per ONTAP feature. You use the ONTAP CLI to install license keys if you are using ONTAP 8.2 through ONTAP
9.9.1.
ONTAP 9.10.1 supports installing 28-character license keys using System Manager or the CLI.
However, if an NLF license is installed for a feature, you cannot install a 28-character license
key over the NetApp license file for the same feature. For information about installing NLFs or
license keys using System Manager, see Install ONTAP licenses.
Related information
How to get an ONTAP One license when the system has NLFs already
How to verify ONTAP Software Entitlements and related License Keys using the Support Site
The SnapMirror cloud and SnapMirror S3 licenses are not included with ONTAP One. They are
part of the ONTAP One Compatibility bundle, which is available at no charge if you have ONTAP One
but must be requested separately.
Steps
You can download ONTAP One license files for systems with existing NetApp license file bundles and for
systems with 28-character license keys that have been converted to NetApp license files on systems running
ONTAP 9.10.1 and later. For a fee, you can also upgrade systems from ONTAP Base to ONTAP One.
When your request is processed, you will receive an email from [email protected] with the
subject “NetApp Software Licensing Notification for SO# [SO Number]” and the email will include a
PDF attachment that includes your license serial number.
You will need to wait at least 2 hours for the licenses to generate.
In releases earlier than ONTAP 9.10.1, ONTAP features are enabled with license keys.
Steps
If you have already downloaded NetApp license files or license keys, you can use System Manager or the
ONTAP CLI to install NLFs and 28-character license keys.
CLI
1. Add one or more license keys:
The following example installs licenses from the local node "/mroot/etc/lic_file" if the file exists at this
location:
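A representative form of the command (assuming the -use-license-file parameter used for NetApp license file installation):

cluster1::> system license add -use-license-file true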
The following example adds a list of licenses with the keys AAAAAAAAAAAAAAAAAAAAAAAAAAAA
and BBBBBBBBBBBBBBBBBBBBBBBBBBBB to the cluster:
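A representative form of the command, using the -license-code parameter:

cluster1::> system license add -license-code AAAAAAAAAAAAAAAAAAAAAAAAAAAA,BBBBBBBBBBBBBBBBBBBBBBBBBBBB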
Related information
• Man page for system license add command.
You can manage the licenses installed on your system, including viewing the license serial number, checking the status of a
license, and removing a license.
Steps
How you view details about a license depends on what version of ONTAP you are using and whether you use
System Manager or the ONTAP CLI.
CLI
1. Display details about an installed license:
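The system license show command displays this information. For example:

cluster1::> system license show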
Delete a license
System Manager - ONTAP 9.8 and later
1. To delete a license, select Cluster > Settings.
2. Under Licenses, select .
3. To delete a legacy license key, select Features, select the licensed feature you want to delete, and then
select Delete legacy key.
4. To delete a specific license package across all of the nodes in the cluster, click the Packages tab,
select the software license package that you want to delete, and then click Delete.
CLI
1. Delete a license:
The following example deletes a license named CIFS and serial number 1-81-
0000000000000000000123456 from the cluster:
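A representative form of the command (assuming the -serial-number and -package parameters):

cluster1::> system license delete -serial-number 1-81-0000000000000000000123456 -package CIFS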
The following example deletes from the cluster all of the licenses under the installed-license Core
Bundle for serial number 123456789:
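A representative form (assuming the -installed-license parameter used for license bundles):

cluster1::> system license delete -serial-number 123456789 -installed-license "Core Bundle"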
Related information
ONTAP CLI commands for managing licenses
ONTAP command reference
License types
A package can have one or more of the following license types installed in the cluster. The system license
show command displays the installed license type or types for a package.
A standard license is a node-locked license. It is issued for a node with a specific system serial number
(also known as a controller serial number). A standard license is valid only for the node that has the
matching serial number.
Installing a standard, node-locked license entitles a node to the licensed functionality. For the cluster to use
licensed functionality, at least one node must be licensed for the functionality. Using licensed functionality
on a node that does not have an entitlement for the functionality might put you out of compliance.
A site license is not tied to a specific system serial number. When you install a site license, all nodes in the
cluster are entitled to the licensed functionality. The system license show command displays site
licenses under the cluster serial number.
If your cluster has a site license and you remove a node from the cluster, the node does not carry the site
license with it, and it is no longer entitled to the licensed functionality. If you add a node to a cluster that has
a site license, the node is automatically entitled to the functionality granted by the site license.
An evaluation license is a temporary license that expires after a certain period of time (indicated by the
system license show command). It enables you to try certain software functionality without purchasing
an entitlement. It is a cluster-wide license, and it is not tied to a specific serial number of a node.
If your cluster has an evaluation license for a package and you remove a node from the cluster, the node
does not carry the evaluation license with it.
Licensed method
It is possible to install both a cluster-wide license (the site or demo type) and a node-locked license (the
license type) for a package. Therefore, an installed package can have multiple license types in the cluster.
However, to the cluster, there is only one licensed method for a package. The licensed method field of the
system license status show command displays the entitlement that is being used for a package. The
command determines the licensed method as follows:
• If a package has only one license type installed in the cluster, the installed license type is the licensed
method.
• If a package does not have any licenses installed in the cluster, the licensed method is none.
• If a package has multiple license types installed in the cluster, the licensed method is determined in the
following priority order of license type: site, license, and demo.
For example:
◦ If you have a site license, a standard license, and an evaluation license for a package, the licensed
method for the package in the cluster is site.
◦ If you have a standard license and an evaluation license for a package, the licensed method for the
package in the cluster is license.
◦ If you have only an evaluation license for a package, the licensed method for the package in the cluster
is demo.
You can use the ONTAP CLI system license commands to manage feature licenses
for the cluster. You use the system feature-usage commands to monitor feature
usage.
The following table lists some of the common CLI commands for managing licenses and links to the command
man pages for additional information.
Related information
• ONTAP command reference
• Knowledge Base article: ONTAP 9.10.1 and later licensing overview
• Use System Manager to install a NetApp license file
Related information
For details about CLI syntax and usage, see the
ONTAP command reference documentation.
Cluster administrators administer the entire cluster and the storage virtual machines
(SVMs, formerly known as Vservers) it contains. SVM administrators administer only their
own data SVMs.
Cluster administrators can administer the entire cluster and its resources. They can also set up data SVMs and
delegate SVM administration to SVM administrators. The specific capabilities that cluster administrators have
depend on their access-control roles. By default, a cluster administrator with the “admin” account name or role
name has all capabilities for managing the cluster and SVMs.
SVM administrators can administer only their own SVM storage and network resources, such as volumes,
protocols, LIFs, and services. The specific capabilities that SVM administrators have depend on the access-
control roles that are assigned by cluster administrators.
The ONTAP command-line interface (CLI) continues to use the term Vserver in the output, and
vserver as a command or parameter name has not changed.
You can enable or disable a web browser’s access to System Manager. You can also
view the System Manager log.
You can control a web browser’s access to System Manager by using vserver services web modify
-name sysmgr -vserver cluster_name -enabled [true|false].
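For example, a command of the following form disables browser access on a cluster named cluster1 (the cluster name here is illustrative):

cluster1::> vserver services web modify -name sysmgr -vserver cluster1 -enabled false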
System Manager logging is recorded in the /mroot/etc/log/mlog/sysmgr.log files of the node that
hosts the cluster management LIF at the time System Manager is accessed. You can view the log files by
using a browser. The System Manager log is also included in AutoSupport messages.
What the cluster management server is
Upon failure of its home network port, the cluster management LIF automatically fails over to another node in
the cluster. Depending on the connectivity characteristics of the management protocol you are using, you might
or might not notice the failover. If you are using a connectionless protocol (for example, SNMP) or have a
limited connection (for example, HTTP), you are not likely to notice the failover. However, if you are using a
long-term connection (for example, SSH), then you will have to reconnect to the cluster management server
after the failover.
When you create a cluster, all of the characteristics of the cluster management LIF are configured, including its
IP address, netmask, gateway, and port.
Unlike a data SVM or node SVM, a cluster management server does not have a root volume or host user
volumes (though it can host system volumes). Furthermore, a cluster management server can only have LIFs
of the cluster management type.
If you run the vserver show command, the cluster management server appears in the output listing for that
command.
Types of SVMs
A cluster consists of four types of SVMs, which help in managing the cluster and its
resources and data access to the clients and applications.
A cluster contains the following types of SVMs:
• Admin SVM
The cluster setup process automatically creates the admin SVM for the cluster. The admin SVM represents
the cluster.
• Node SVM
A node SVM is created when the node joins the cluster, and the node SVM represents the individual nodes
of the cluster.
• System SVM (advanced)
A system SVM is automatically created for cluster-level communications in an IPspace.
• Data SVM
A data SVM represents the data serving SVMs. After the cluster setup, a cluster administrator must create
data SVMs and add volumes to these SVMs to facilitate data access from the cluster.
A cluster must have at least one data SVM to serve data to its clients.
Unless otherwise specified, the term SVM refers to a data (data-serving) SVM.
You can access the cluster directly from a console that is attached to a node’s serial port.
Steps
1. At the console, press Enter.
2. At the login prompt, enter the user name of an administrative account, and then press Enter.
3. Enter the password for the admin or administrative user account, and then press Enter.
You can issue SSH requests to an ONTAP cluster to perform administrative tasks. SSH is
enabled by default.
Before you begin
• You must have a user account that is configured to use ssh as an access method.
The -application parameter of the security login commands specifies the access method for a
user account. The security login man pages contain additional information.
• If you use an Active Directory (AD) domain user account to access the cluster, an authentication tunnel for
the cluster must have been set up through a CIFS-enabled storage VM, and your AD domain user account
must also have been added to the cluster with ssh as an access method and domain as the
authentication method.
If the cluster management LIF resides on the node, it shares this limit with the node management LIF.
If the rate of incoming connections is higher than 10 per second, the service is temporarily disabled for 60
seconds.
• ONTAP supports only the AES and 3DES encryption algorithms (also known as ciphers) for SSH.
AES is supported with 128, 192, and 256 bits in key length. 3DES is 56 bits in key length as in the original
DES, but it is repeated three times.
• When FIPS mode is on, SSH clients should negotiate with Elliptic Curve Digital Signature Algorithm
(ECDSA) public key algorithms for the connection to be successful.
• If you want to access the ONTAP CLI from a Windows host, you can use a third-party utility such as
PuTTY.
• If you use a Windows AD user name to log in to ONTAP, you should use the same uppercase or lowercase
letters that were used when the AD user name and domain name were created in ONTAP.
AD user names and domain names are not case-sensitive. However, ONTAP user names are case-
sensitive. Case mismatch between the user name created in ONTAP and the user name created in AD
results in a login failure.
When SSH multifactor authentication is enabled, users are authenticated by using a public key and a
password.
• Beginning with ONTAP 9.4, you can enable SSH multifactor authentication for LDAP and NIS remote
users.
• Beginning with ONTAP 9.13.1, you can optionally add certificate validation to the SSH authentication
process to enhance login security. To do this, associate an X.509 certificate with the public key that an
account uses. If you log in using SSH with both an SSH public key and an X.509 certificate, ONTAP checks
the validity of the X.509 certificate before authenticating with the SSH public key. SSH login is refused if
that certificate is expired or revoked, and the SSH public key is automatically disabled.
• Beginning with ONTAP 9.14.1, ONTAP administrators can add Cisco Duo two-factor authentication to the
SSH authentication process to enhance login security. Upon first login after you enable Cisco Duo
authentication, users will need to enroll a device to serve as an authenticator for SSH sessions.
• Beginning with ONTAP 9.15.1, administrators can Configure dynamic authorization to provide additional
adaptive authentication to SSH users based on the user’s trust score.
Steps
1. From a host with access to the ONTAP cluster’s network, enter the ssh command in one of the following
formats:
◦ ssh username@hostname_or_IP [command]
◦ ssh -l username hostname_or_IP [command]
If you are using an AD domain user account, you must specify username in the format of
domainname\\AD_accountname (with double backslashes after the domain name) or
"domainname\AD_accountname" (enclosed in double quotation marks and with a single backslash after the
domain name).
hostname_or_IP is the host name or the IP address of the cluster management LIF or a node management
LIF. Using the cluster management LIF is recommended. You can use an IPv4 or IPv6 address.
$ ssh [email protected]
Password:
cluster1::> cluster show
Node Health Eligibility
--------------------- ------- ------------
node1 true true
node2 true true
2 entries were displayed.
The following examples show how the user account named “john” from the domain named “DOMAIN1” can
issue an SSH request to access a cluster whose cluster management LIF is 10.72.137.28:
$ ssh DOMAIN1\\[email protected]
Password:
cluster1::> cluster show
Node Health Eligibility
--------------------- ------- ------------
node1 true true
node2 true true
2 entries were displayed.
$ ssh -l "DOMAIN1\john" 10.72.137.28 cluster show
Password:
Node Health Eligibility
--------------------- ------- ------------
node1 true true
node2 true true
2 entries were displayed.
The following example shows how the user account named “joe” can issue an SSH MFA request to access a
cluster whose cluster management LIF is 10.72.137.32:
$ ssh [email protected]
Authenticated with partial success.
Password:
cluster1::> cluster show
Node Health Eligibility
--------------------- ------- ------------
node1 true true
node2 true true
2 entries were displayed.
Related information
Administrator authentication and RBAC
Beginning with ONTAP 9.5, you can view information about previous logins, unsuccessful
attempts to log in, and changes to your privileges since your last successful login.
Security-related information is displayed when you successfully log in as an SSH admin user. You are alerted
about the following conditions:
If any of the information displayed is suspicious, you should immediately contact your security
department.
To obtain this information when you log in, the following prerequisites must be met:
• Your login attempt must be successful.
The following restrictions and considerations apply to SSH login security information:
However, alerts about changes to the role of the user account cannot be displayed for these users. Also,
users belonging to an AD group that has been provisioned as an admin account in ONTAP cannot view the
count of unsuccessful login attempts that occurred since the last time they logged in.
• The information maintained for a user is deleted when the user account is deleted from ONTAP.
• The information is not displayed for connections to applications other than SSH.
The following examples demonstrate the type of information displayed after you log in.
• These messages are displayed if there have been unsuccessful attempts to log in since the last successful
login:
• These messages are displayed if there have been unsuccessful attempts to log in and your privileges were
modified since the last successful login:
As a security best practice, Telnet and RSH are disabled by default. To enable the cluster
to accept Telnet or RSH requests, you must enable the service in the default
management service policy.
Telnet and RSH are not secure protocols; you should consider using SSH to access the cluster. SSH provides
a secure remote shell and interactive network session. For more information, refer to Access the cluster using
SSH.
About this task
• ONTAP supports a maximum of 50 concurrent Telnet or RSH sessions per node.
If the cluster management LIF resides on the node, it shares this limit with the node management LIF.
If the rate of incoming connections is higher than 10 per second, the service is temporarily disabled for 60
seconds.
ONTAP 9.6 or later
Steps
1. Confirm that the RSH or Telnet security protocol is enabled:
a. If the RSH or Telnet security protocol is enabled, continue to the next step.
b. If the RSH or Telnet security protocol is not enabled, use one of the following commands to enable it:
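These take roughly the following form, assuming the security protocol command family:

security protocol modify -application telnet -enabled true
security protocol modify -application rsh -enabled true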
Steps
1. Enter the advanced privilege mode:
set advanced
3. Create a new management firewall policy based on the mgmt management firewall policy:
4. Enable Telnet or RSH in the new management firewall policy:
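A sketch of steps 3 and 4, using illustrative policy names (the exact firewall policy syntax can vary by ONTAP release):

system services firewall policy clone -policy mgmt -destination-policy mgmt1
system services firewall policy create -vserver cluster1 -policy mgmt1 -service telnet -allow-list 0.0.0.0/0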
You can issue Telnet requests to the cluster to perform administrative tasks. Telnet is
disabled by default.
Telnet and RSH are not secure protocols; you should consider using SSH to access the cluster. SSH provides
a secure remote shell and interactive network session. For more information, refer to Access the cluster using
SSH.
• You must have a cluster local user account that is configured to use Telnet as an access method.
The -application parameter of the security login commands specifies the access method for a
user account. For more information, see the security login man pages.
If the cluster management LIF resides on the node, it shares this limit with the node management LIF.
If the rate of incoming connections is higher than 10 per second, the service is temporarily disabled for 60
seconds.
• If you want to access the ONTAP CLI from a Windows host, you can use a third-party utility such as
PuTTY.
• RSH commands require advanced privileges.
ONTAP 9.6 or later
Steps
1. Confirm that the Telnet security protocol is enabled:
• Telnet must already be enabled in the management firewall policy that is used by the cluster or node
management LIFs so that Telnet requests can go through the firewall.
By default, Telnet is disabled. The system services firewall policy show command with
the -service telnet parameter displays whether Telnet has been enabled in a firewall policy. For
more information, see the system services firewall policy man pages.
• If you use IPv6 connections, IPv6 must already be configured and enabled on the cluster, and firewall
policies must already be configured with IPv6 addresses.
The network options ipv6 show command displays whether IPv6 is enabled. The system
services firewall policy show command displays firewall policies.
Steps
1. From an administration host, enter the following command:
telnet hostname_or_IP
hostname_or_IP is the host name or the IP address of the cluster management LIF or a node
management LIF. Using the cluster management LIF is recommended. You can use an IPv4 or IPv6
address.
Example of a Telnet request
The following example shows how the user named “joe”, who has been set up with Telnet access, can issue a
Telnet request to access a cluster whose cluster management LIF is 10.72.137.28:
Data ONTAP
login: joe
Password:
cluster1::>
You can issue RSH requests to the cluster to perform administrative tasks. RSH is not a
secure protocol and is disabled by default.
Telnet and RSH are not secure protocols; you should consider using SSH to access the cluster. SSH provides
a secure remote shell and interactive network session. For more information, refer to Access the cluster using
SSH.
• You must have a cluster local user account that is configured to use RSH as an access method.
The -application parameter of the security login commands specifies the access method for a
user account. For more information, see the security login man pages.
If the cluster management LIF resides on the node, it shares this limit with the node management LIF.
If the rate of incoming connections is higher than 10 per second, the service is temporarily disabled for 60
seconds.
ONTAP 9.6 or later
Steps
1. Confirm that the RSH security protocol is enabled:
• RSH must already be enabled in the management firewall policy that is used by the cluster or node
management LIFs so that RSH requests can go through the firewall.
By default, RSH is disabled. The system services firewall policy show command with the -service
rsh parameter displays whether RSH has been enabled in a firewall policy. For more information, see
the system services firewall policy man pages.
• If you use IPv6 connections, IPv6 must already be configured and enabled on the cluster, and firewall
policies must already be configured with IPv6 addresses.
The network options ipv6 show command displays whether IPv6 is enabled. The system
services firewall policy show command displays firewall policies.
Steps
1. From an administration host, enter the following command:
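The request takes roughly the following form (a sketch; the -l option supplies the login name, optionally as username:password):

rsh hostname_or_IP -l username:password command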
hostname_or_IP is the host name or the IP address of the cluster management LIF or a node
management LIF. Using the cluster management LIF is recommended. You can use an IPv4 or IPv6
address.
Example of an RSH request
The following example shows how the user named “joe”, who has been set up with RSH access, can issue an
RSH request to run the cluster show command:
admin_host$
If you set the privilege level (that is, the -privilege parameter of the set command) to advanced, the
prompt includes an asterisk (*), for example:
cluster_name::*>
About the different shells for CLI commands overview (cluster administrators only)
The cluster has three different shells for CLI commands, the clustershell, the nodeshell,
and the systemshell. The shells are for different purposes, and they each have a different
command set.
• The clustershell is the native shell that is started automatically when you log in to the cluster.
It provides all the commands you need to configure and manage the cluster. The clustershell CLI help
(triggered by ? at the clustershell prompt) displays available clustershell commands. The man
command_name command in the clustershell displays the man page for the specified clustershell
command.
• The nodeshell is a special shell for commands that take effect only at the node level.
The nodeshell CLI help (triggered by ? or help at the nodeshell prompt) displays available nodeshell
commands. The man command_name command in the nodeshell displays the man page for the specified
nodeshell command.
Many commonly used nodeshell commands and options are tunneled or aliased into the clustershell and
can be executed also from the clustershell.
• The systemshell is a low-level shell that is used only for diagnostic and troubleshooting purposes.
The systemshell and the associated “diag” account are intended for low-level diagnostic purposes. Their
access requires the diagnostic privilege level and is reserved only for technical support to perform
troubleshooting tasks.
Nodeshell options that are supported in the clustershell can be accessed by using the vserver options
clustershell command. To see these options, you can do one of the following:
If you enter a nodeshell or legacy command or option in the clustershell, and the command or option has an
equivalent clustershell command, ONTAP informs you of the clustershell command to use.
If you enter a nodeshell or legacy command or option that is not supported in the clustershell, ONTAP informs
you of the “not supported” status for the command or option.
You can obtain a list of available nodeshell commands by using the CLI help from the nodeshell.
Steps
1. To access the nodeshell, enter the following command at the clustershell’s system prompt:
2. Enter the following command in the nodeshell to see the list of available nodeshell commands:
[commandname] help
commandname is the name of the command whose availability you want to display. If you do not include
commandname, the CLI displays all available nodeshell commands.
Example of displaying available nodeshell commands
The following example accesses the nodeshell of a node named node2 and displays information for the
nodeshell command environment:
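A sketch of such a session, assuming the system node run command is used to enter the nodeshell:

cluster1::> system node run -node node2
node2> environment help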
Commands in the CLI are organized into a hierarchy by command directories. You can
run commands in the hierarchy either by entering the full command path or by navigating
through the directory structure.
When using the CLI, you can access a command directory by typing the directory’s name at the prompt and
then pressing Enter. The directory name is then included in the prompt text to indicate that you are interacting
with the appropriate command directory. To move deeper into the command hierarchy, you type the name of a
command subdirectory followed by pressing Enter. The subdirectory name is then included in the prompt text
and the context shifts to that subdirectory.
You can navigate through several command directories by entering the entire command. For example, you can
display information about disk drives by entering the storage disk show command at the prompt. You can
also run the command by navigating through one command directory at a time, as shown in the following
example:
cluster1::> storage
cluster1::storage> disk
cluster1::storage disk> show
You can abbreviate commands by entering only the minimum number of letters in a command that makes the
command unique to the current directory. For example, to abbreviate the command in the previous example,
you can enter st d sh. You can also use the Tab key to expand abbreviated commands and to display a
command’s parameters, including default parameter values.
You can use the top command to go to the top level of the command hierarchy, and the up command or ..
command to go up one level in the command hierarchy.
Commands and command options preceded by an asterisk (*) in the CLI can be executed only
at the advanced privilege level or higher.
Rules for specifying values in the CLI
Most commands include one or more required or optional parameters. Many parameters
require you to specify a value for them. A few rules exist for specifying values in the CLI.
• A value can be a number, a Boolean specifier, a selection from an enumerated list of predefined values, or
a text string.
Some parameters can accept a comma-separated list of two or more values. Comma-separated lists of
values do not need to be in quotation marks (" "). Whenever you specify text, a space, or a query character
(when not meant as a query or text starting with a less-than or greater-than symbol), you must enclose the
entity in quotation marks.
• The CLI interprets a question mark (“?”) as the command to display help information for a particular
command.
• Some text that you enter in the CLI, such as command names, parameters, and certain values, is not case-
sensitive.
For example, when you enter parameter values for the vserver cifs commands, capitalization is
ignored. However, most parameter values, such as the names of nodes, storage virtual machines (SVMs),
aggregates, volumes, and logical interfaces, are case-sensitive.
• If you want to clear the value of a parameter that takes a string or a list, you specify an empty set of
quotation marks ("") or a dash ("-").
• The hash sign (“#”), also known as the pound sign, indicates a comment for a command-line input; if used,
it should appear after the last parameter in a command line.
The CLI ignores the text between “#” and the end of the line.
In the following example, an SVM is created with a text comment. The SVM is then modified to delete the
comment:
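A sketch of such a sequence (the SVM name, volume, and aggregate shown here are illustrative):

cluster1::> vserver create -vserver vs0 -rootvolume root_vs0 -aggregate aggr1 -rootvolume-security-style unix -comment "My SVM"
cluster1::> vserver modify -vserver vs0 -comment ""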
In the following example, a command-line comment that uses the “#” sign indicates what the command does.
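For instance:

cluster1::> cluster show  # displays the health and eligibility of the cluster nodes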
Each CLI session keeps a history of all commands issued in it. You can view the command history of the
session that you are currently in. You can also reissue commands.
To view the command history, you can use the history command.
To reissue a command, you can use the redo command with one of the following arguments:
• A string that matches part of a previous command. For example, if the only volume command you have
run is volume show, you can use the redo volume command to reexecute the command.
• The numeric ID of a previous command, as listed by the history command. For example, you can use the
redo 4 command to reissue the fourth command in the history list.
• A negative offset from the end of the history list. For example, you can use the redo -2 command to
reissue the command that you ran two commands ago.
For example, to redo the command that is third from the end of the command history, you would enter the
following command:
cluster1::> redo -3
The command at the current command prompt is the active command. Using keyboard
shortcuts enables you to edit the active command quickly. These keyboard shortcuts are
similar to those of the UNIX tcsh shell and the Emacs editor.
The following table lists the keyboard shortcuts for editing CLI commands. “Ctrl-” indicates that you press and
hold the Ctrl key while typing the character specified after it. “Esc-” indicates that you press and release the
Esc key and then type the character specified after it.
• Move the cursor back by one character: Ctrl-B or the Back arrow
• Move the cursor forward by one character: Ctrl-F or the Forward arrow
• Move the cursor forward by one word: Esc-F
• Remove the word before the cursor, and save it in the cut buffer: Ctrl-W
• Yank the content of the cut buffer, and push it into the command line at the cursor: Ctrl-Y
• Remove the character before the cursor: Ctrl-H or Backspace
• Replace the current content of the command line with the next entry on the history list (with each
repetition of the keyboard shortcut, the history cursor moves to the next entry): Ctrl-N, Esc-N, or the
Down arrow
ONTAP commands and parameters are defined at three privilege levels: admin,
advanced, and diagnostic. The privilege levels reflect the skill levels required in
performing the tasks.
• admin
Most commands and parameters are available at this level. They are used for common or routine tasks.
• advanced
Commands and parameters at this level are used infrequently, require advanced knowledge, and can
cause problems if used inappropriately.
You use advanced commands or parameters only with the advice of support personnel.
• diagnostic
Diagnostic commands and parameters are potentially disruptive. They are used only by support personnel
to diagnose and fix problems.
You can set the privilege level in the CLI by using the set command. Changes to privilege level settings
apply only to the session you are in. They are not persistent across sessions.
Steps
1. To set the privilege level in the CLI, use the set command with the -privilege parameter.
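For example, the following command sets the session to the advanced privilege level:

cluster1::> set -privilege advanced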
You can set display preferences for a CLI session by using the set command and rows
command. The preferences you set apply only to the session you are in. They are not
persistent across sessions.
About this task
You can set the following CLI display preferences:
If the preferred number of rows is not specified, it is automatically adjusted based on the actual height of
the terminal. If the actual height is undefined, the default number of rows is 24.
Steps
1. To set CLI display preferences, use the set command.
To set the number of rows the screen displays in the current CLI session, you can also use the rows
command.
For more information, see the man pages for the set command and rows command.
For example, you can limit the display to 50 rows in the current session.
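A representative command, using the -rows parameter of the set command:

cluster1::> set -rows 50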
The management interface supports queries and UNIX-style patterns and wildcards to
enable you to match multiple values in command-parameter arguments.
The following table describes the supported query operators:
Operator Description
* Wildcard that matches all entries.
For example, the command volume show -volume *tmp* displays a list of all volumes whose
names include the string tmp.
! NOT operator.
Indicates a value that is not to be matched; for example, !vs0 indicates not to match the value
vs0.
| OR operator.
Separates two values that are to be compared; for example, vs0 | vs2 matches either vs0 or
vs2. You can specify multiple OR statements; for example, a | b* | *c* matches the entry a,
any entry that starts with b, and any entry that includes c.
.. Range operator.
For example, 5..10 matches any value from 5 to 10.
< Less-than operator.
For example, <20 matches any value that is less than 20.
>= Greater-than-or-equal-to operator.
For example, >=5 matches any value that is greater than or equal to 5.
An extended query must be specified as the first argument after the command name, before any
other parameters.
For example, the command volume modify {-volume *tmp*} -state offline sets
offline all volumes whose names include the string tmp.
If you want to parse query characters as literals, you must enclose the characters in double quotes (for
example, "<10", "0..100", "*abc*", or "a|b") for the correct results to be returned.
You must enclose raw file names in double quotes to prevent the interpretation of special characters. This also
applies to special characters used by the clustershell.
You can use multiple query operators in one command line. For example, the command volume show
-size >1GB -percent-used <50 -vserver !vs1 displays all volumes that are greater than 1 GB in
size, less than 50% utilized, and not in the storage virtual machine (SVM) named “vs1”.
Related information
Keyboard shortcuts for editing CLI commands
You can use extended queries to match and perform operations on objects that have
specified values.
You specify extended queries by enclosing them within curly brackets ({}). An extended query must be
specified as the first argument after the command name, before any other parameters. For example, to set
offline all volumes whose names include the string tmp, you run the command in the following example:
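That is, you run:

cluster1::> volume modify {-volume *tmp*} -state offline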
Extended queries are generally useful only with modify and delete commands. They have no meaning in
create or show commands.
The combination of queries and modify operations is a useful tool. However, it can potentially cause confusion
and errors if implemented incorrectly. For example, using the (advanced privilege) system node image
modify command to set a node’s default software image automatically sets the other software image not to be
the default. The command in the following example is effectively a null operation:
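A command of the following form (an illustrative sketch of the pattern being described, querying on the -isdefault field of the image being modified) behaves this way:

cluster1::*> system node image modify {-node node1 -isdefault true} -isdefault false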
This command sets the current default image as the non-default image, then sets the new default image (the
previous non-default image) to the non-default image, resulting in the original default settings being retained.
To perform the operation correctly, you can use the command as given in the following example:
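An illustrative sketch, querying on a field other than the one being modified:

cluster1::*> system node image modify {-node node1 -iscurrent false} -isdefault true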
When you use the -instance parameter with a show command to display details, the
output can be lengthy and include more information than you need. The -fields
parameter of a show command enables you to display only the information you specify.
For example, running volume show -instance is likely to result in several screens of information. You can
use volume show -fields fieldname[,fieldname…] to customize the output so that it includes only
the specified field or fields (in addition to the default fields that are always displayed). You can use -fields ?
to display valid fields for a show command.
The following example shows the output difference between the -instance parameter and the -fields
parameter:
cluster1::> volume show -instance
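For comparison, a -fields form of the same command might look like the following (the field names shown are illustrative):
cluster1::> volume show -fields size,state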
You can take advantage of the positional parameter functionality of the ONTAP CLI to
increase efficiency in command input. You can query a command to identify parameters
that are positional for the command.
• A positional parameter is a parameter that does not require you to specify the parameter name before
specifying the parameter value.
• A positional parameter can be interspersed with nonpositional parameters in the command input, as long
as it observes its relative sequence with other positional parameters in the same command, as indicated in
the command_name ? output.
• A positional parameter can be a required or optional parameter for a command.
• A parameter can be positional for one command but nonpositional for another.
Using the positional parameter functionality in scripts is not recommended, especially when the
positional parameters are optional for the command or have optional parameters listed before
them.
You can identify a positional parameter in the command_name ? command output. A positional parameter has
square brackets surrounding its parameter name, in one of the following formats:
For example, when displayed as the following in the command_name ? output, the parameter is positional for
the command it appears in:
• [-lif] <lif-name>
• [[-lif] <lif-name>]
However, when displayed as the following, the parameter is nonpositional for the command it appears in:
• -lif <lif-name>
• [-lif <lif-name>]
In the following example, the volume create ? output shows that three parameters are positional for the
command: -volume, -aggregate, and -size.
cluster1::> volume create ?
-vserver <vserver name> Vserver Name
[-volume] <volume name> Volume Name
[-aggregate] <aggregate name> Aggregate Name
[[-size] {<integer>[KB|MB|GB|TB|PB]}] Volume Size
[ -state {online|restricted|offline|force-online|force-offline|mixed} ]
Volume State (default: online)
[ -type {RW|DP|DC} ] Volume Type (default: RW)
[ -policy <text> ] Export Policy
[ -user <user name> ] User ID
...
[ -space-guarantee|-s {none|volume} ] Space Guarantee Style (default:
volume)
[ -percent-snapshot-space <percent> ] Space Reserved for Snapshot
Copies
...
In the following example, the volume create command is specified without taking advantage of the
positional parameter functionality:
cluster1::> volume create -vserver svm1 -volume vol1 -aggregate aggr1 -size 1g
-percent-snapshot-space 0
The following examples use the positional parameter functionality to increase the efficiency of the command
input. The positional parameters are interspersed with nonpositional parameters in the volume create
command, and the positional parameter values are specified without the parameter names. The positional
parameters are specified in the same sequence indicated by the volume create ? output. That is, the value
for -volume is specified before that of -aggregate, which is in turn specified before that of -size.
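Sketches of equivalent commands, reusing the values from the previous example and only the ordering rules described above:
cluster1::> volume create -vserver svm1 vol1 aggr1 1g -percent-snapshot-space 0
cluster1::> volume create vol1 aggr1 1g -vserver svm1 -percent-snapshot-space 0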
ONTAP manual (man) pages explain how to use ONTAP CLI commands. These pages
are available at the command line and are also published in release-specific command
references.
At the ONTAP command line, use the man command_name command to display the manual page of the
specified command. If you do not specify a command name, the manual page index is displayed. You can use
the man man command to view information about the man command itself. You can exit a man page by
entering q.
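For example, to display the manual page for the volume show command:
cluster1::> man volume show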
Refer to the command reference for your version of ONTAP 9 to learn about the admin-level and advanced-
level ONTAP commands available in your release.
Manage CLI sessions
You can record a CLI session into a file with a specified name and size limit, then upload
the file to an FTP or HTTP destination. You can also display or delete files in which you
previously recorded CLI sessions.
A record of a CLI session ends when you stop the recording or end the CLI session, or when the file reaches
the specified size limit. The default file size limit is 1 MB. The maximum file size limit is 2 GB.
Recording a CLI session is useful, for example, if you are troubleshooting an issue and want to save detailed
information or if you want to create a permanent record of space usage at a specific point in time.
Steps
1. Start recording the current CLI session into a file by using the system script start command.
For more information about using the system script start command, see the man page.
ONTAP starts recording your CLI session into the specified file.
2. Proceed with your tasks in the CLI session.
3. When you are finished, stop recording the session by using the system script stop command.
For more information about using the system script stop command, see the man page.
ONTAP stops recording your CLI session.
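An illustrative recorded session (the file name is a placeholder, and exact parameters might vary by release):
cluster1::> system script start -name cli_session_log
cluster1::> cluster show
cluster1::> system script stop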
You use the system script commands to manage records of CLI sessions.
If you want to…                                                  Use this command…
Upload a record of a CLI session to an FTP or HTTP destination   system script upload
Display information about records of CLI sessions                system script show
Delete a record of a CLI session                                 system script delete
Related information
ONTAP command reference
The timeout value specifies how long a CLI session remains idle before being automatically terminated. The
CLI timeout value is cluster-wide. That is, every node in a cluster uses the same CLI timeout value.
You use the system timeout commands to manage the automatic timeout period of CLI sessions.
If you want to…                                                  Use this command…
Display the automatic timeout period for CLI sessions            system timeout show
Modify the automatic timeout period for CLI sessions             system timeout modify
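For example, a sketch that sets the idle timeout to 30 minutes:
cluster1::> system timeout modify -timeout 30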
Related information
ONTAP command reference
You can display node names, whether the nodes are healthy, and whether they are
eligible to participate in the cluster. At the advanced privilege level, you can also display
whether a node holds epsilon.
Steps
1. To display information about the nodes in a cluster, use the cluster show command.
If you want the output to show whether a node holds epsilon, run the command at the advanced privilege
level.
cluster1::> cluster show
Node Health Eligibility
--------------------- ------- ------------
node1 true true
node2 true true
node3 true true
node4 true true
The following example displays detailed information about the node named “node1” at the advanced privilege
level:
Node: node1
Node UUID: a67f9f34-9d8f-11da-b484-000423b6f094
Epsilon: false
Eligibility: true
Health: true
You can display a cluster’s unique identifier (UUID), name, serial number, location, and
contact information.
Steps
1. To display a cluster’s attributes, use the cluster identity show command.
Modify cluster attributes
You can modify a cluster’s attributes, such as the cluster name, location, and contact
information as needed.
About this task
You cannot change a cluster’s UUID, which is set when the cluster is created.
Steps
1. To modify cluster attributes, use the cluster identity modify command.
The -name parameter specifies the name of the cluster. The cluster identity modify man page
describes the rules for specifying the cluster’s name.
The -contact parameter specifies the contact information such as a name or e-mail address.
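A sketch using placeholder values for the parameters described above:
cluster1::> cluster identity modify -name cluster2 -contact "[email protected]"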
You can display the status of cluster replication rings to help you diagnose cluster-wide
problems. If your cluster is experiencing problems, support personnel might ask you to
perform this task to assist with troubleshooting efforts.
Steps
1. To display the status of cluster replication rings, use the cluster ring show command at the advanced
privilege level.
cluster1::> set -privilege advanced
Warning: These advanced commands are potentially dangerous; use them only
when directed to do so by support personnel.
Do you wish to continue? (y or n): y
Quorum and epsilon are important measures of cluster health and function that together
indicate how clusters address potential communications and connectivity challenges.
Quorum is a precondition for a fully functioning cluster. When a cluster is in quorum, a simple majority of nodes
are healthy and can communicate with each other. When quorum is lost, the cluster loses the ability to
accomplish normal cluster operations. Only one collection of nodes can have quorum at any one time because
all of the nodes collectively share a single view of the data. Therefore, if two non-communicating nodes are
permitted to modify the data in divergent ways, it is no longer possible to reconcile the data into a single data
view.
Each node in the cluster participates in a voting protocol that elects one node master; each remaining node is
a secondary. The master node is responsible for synchronizing information across the cluster. When quorum is
formed, it is maintained by continual voting. If the master node goes offline and the cluster is still in quorum, a
new master is elected by the nodes that remain online.
Because there is the possibility of a tie in a cluster that has an even number of nodes, one node has an extra
fractional voting weight called epsilon. If the connectivity between two equal portions of a large cluster fails, the
group of nodes containing epsilon maintains quorum, assuming that all of the nodes are healthy. For example,
the following illustration shows a four-node cluster in which two of the nodes have failed. However, because
one of the surviving nodes holds epsilon, the cluster remains in quorum even though there is not a simple
majority of healthy nodes.
Epsilon is automatically assigned to the first node when the cluster is created. If the node that holds epsilon
becomes unhealthy, takes over its high-availability partner, or is taken over by its high-availability partner, then
epsilon is automatically reassigned to a healthy node in a different HA pair.
Taking a node offline can affect the ability of the cluster to remain in quorum. Therefore, ONTAP issues a
warning message if you attempt an operation that will either take the cluster out of quorum or else put it one
outage away from a loss of quorum. You can disable the quorum warning messages by using the cluster
quorum-service options modify command at the advanced privilege level.
In general, assuming reliable connectivity among the nodes of the cluster, a larger cluster is more stable than a
smaller cluster. The quorum requirement of a simple majority of half the nodes plus epsilon is easier to
maintain in a cluster of 24 nodes than in a cluster of two nodes.
A two-node cluster presents some unique challenges for maintaining quorum. Two-node clusters use cluster
HA, in which neither node holds epsilon; instead, both nodes are continuously polled to ensure that if one node
fails, the other has full read-write access to data, as well as access to logical interfaces and management
functions.
System volumes are FlexVol volumes that contain special metadata, such as metadata
for file services audit logs. These volumes are visible in the cluster so that you can fully
account for storage use in your cluster.
System volumes are owned by the cluster management server (also called the admin SVM), and they are
created automatically when file services auditing is enabled.
You can view system volumes by using the volume show command, but most other volume operations are
not permitted. For example, you cannot modify a system volume by using the volume modify command.
This example shows four system volumes on the admin SVM, which were automatically created when file
services auditing was enabled for a data SVM in the cluster:
cluster1::> volume show -vserver cluster1
Vserver Volume Aggregate State Type Size Available
Used%
--------- ------------ ------------ ---------- ---- ---------- ----------
-----
cluster1 MDV_aud_1d0131843d4811e296fc123478563412
aggr0 online RW 2GB 1.90GB
5%
cluster1 MDV_aud_8be27f813d7311e296fc123478563412
root_vs0 online RW 2GB 1.90GB
5%
cluster1 MDV_aud_9dc4ad503d7311e296fc123478563412
aggr1 online RW 2GB 1.90GB
5%
cluster1 MDV_aud_a4b887ac3d7311e296fc123478563412
aggr2 online RW 2GB 1.90GB
5%
4 entries were displayed.
Manage nodes
After a cluster is created, you can expand it by adding nodes to it. You add only one node
at a time.
What you’ll need
• If you are adding nodes to a multiple-node cluster, all the existing nodes in the cluster must be healthy
(indicated by cluster show).
• If you are adding nodes to a two-node switchless cluster, you must convert your two-node switchless
cluster to a switch-attached cluster using a NetApp supported cluster switch.
• If you are adding a second node to a single-node cluster, the second node must have been installed, and
the cluster network must have been configured.
• If the cluster has SP automatic configuration enabled, the subnet specified for the SP must have available
resources to allow the joining node to use the specified subnet to automatically configure the SP.
• You must have gathered the following information for the new node’s node management LIF:
◦ Port
◦ IP address
◦ Netmask
◦ Default gateway
Nodes must be in even numbers so that they can form HA pairs. After you start to add a node to the cluster,
you must complete the process. The node must be part of the cluster before you can start to add another node.
Steps
1. Power on the node that you want to add to the cluster.
The node boots, and the Node Setup wizard starts on the console.
2. Exit the Node Setup wizard.
The Node Setup wizard exits, and a login prompt appears, warning that you have not completed the setup
tasks.
3. Log in to the admin account.
4. Start the Cluster Setup wizard:
cluster setup
::> cluster setup
You can return to cluster setup at any time by typing "cluster setup".
To accept a default or omit a question, do not enter a value....
For more information on setting up a cluster using the setup GUI, see the System Manager
online help.
5. Press Enter to use the CLI to complete this task. When prompted to create a new cluster or join an existing
one, enter join.
If the ONTAP version running on the new node is different from the version running on the existing cluster, the
system reports a System checks Error: Cluster join operation cannot be performed at
this time error. This is the expected behavior. To continue, run the cluster add-node -allow-mixed-version-join
new_node_name command at the advanced privilege level from an existing node in the cluster.
6. Follow the prompts to set up the node and join it to the cluster:
◦ To accept the default value for a prompt, press Enter.
◦ To enter your own value for a prompt, enter the value, and then press Enter.
7. Repeat the preceding steps for each additional node that you want to add.
Related information
Mixed version ONTAP clusters
Remove nodes from the cluster
You can remove unwanted nodes from a cluster, one node at a time. After you remove a
node, you must also remove its failover partner. If you are removing a node, then its data
becomes inaccessible or erased.
Before you begin
The following conditions must be satisfied before removing nodes from the cluster:
If you do not remove the node and its HA partner from the SLM reporting-nodes list, access to the LUNs
previously on the node can be lost even though the volumes containing the LUNs were moved to another
node.
It is recommended that you issue an AutoSupport message to notify NetApp technical support that node
removal is underway.
You must not perform operations such as cluster remove-node, cluster unjoin, and
node rename when an automated ONTAP upgrade is in progress.
All system and user data, from all disks that are connected to the node, must be made
inaccessible to users before removing a node from the cluster. If a node was incorrectly unjoined
from a cluster, contact NetApp Support for assistance with options for recovery.
Steps
1. Change the privilege level to advanced:
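For example:
set -privilege advanced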
3. If a node on the cluster holds epsilon and that node is going to be unjoined, move epsilon to a node that is
not going to be unjoined:
a. Move epsilon from the node that is going to be unjoined
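A sketch, mirroring the epsilon command shown later in the reboot procedure (node names are placeholders; assigning epsilon to a remaining node is assumed as the follow-on substep):
cluster modify -node <unjoining_node> -epsilon false
cluster modify -node <remaining_node> -epsilon true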
The master node is the node that holds processes such as “mgmt”, “vldb”, “vifmgr”, “bcomd”, and “crs”.
5. If the node you want to remove is the current master node, then enable another node in the cluster to be
elected as the master node:
a. Make the current master node ineligible to participate in the cluster:
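Mirroring the eligibility command shown in substep b below:
cluster modify -node <node_name> -eligibility false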
When the master node becomes ineligible, one of the remaining nodes is elected by the cluster quorum
as the new master.
b. Make the previous master node eligible to participate in the cluster again:
cluster modify -node <node_name> -eligibility true
6. Log into the remote node management LIF or the cluster-management LIF on a node other than the one
that is being removed.
7. Remove the node from the cluster:
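A sketch using the command named below (the node name is a placeholder):
cluster remove-node -node <node_name>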
If you have a mixed version cluster and you are removing the last lower version node, use the
-skip-last-low-version-node-check parameter with these commands.
◦ You must also remove the node’s failover partner from the cluster.
◦ After the node is removed and before it can rejoin a cluster, you must use boot menu option (4) Clean
configuration and initialize all disks or option (9) Configure Advanced Drive Partitioning to erase the
node’s configuration and initialize all disks.
A failure message is generated if you have conditions that you must address before removing the
node. For example, the message might indicate that the node has shared resources that you must
remove or that the node is in a cluster HA configuration or storage failover configuration that you must
disable.
If the node is the quorum master, the cluster will briefly lose and then return to quorum. This quorum
loss is temporary and does not affect any data operations.
8. If a failure message indicates error conditions, address those conditions and rerun the cluster remove-
node or cluster unjoin command.
The node is automatically rebooted after it is successfully removed from the cluster.
9. If you are repurposing the node, erase the node configuration and initialize all disks:
a. During the boot process, press Ctrl-C to display the boot menu when prompted to do so.
b. Select the boot menu option (4) Clean configuration and initialize all disks.
10. Return to admin privilege level:
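For example:
set -privilege admin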
11. Repeat the preceding steps to remove the failover partner from the cluster.
Access a node’s log, core dump, and MIB files by using a web browser
The Service Processor Infrastructure (spi) web service is enabled by default to enable a
web browser to access the log, core dump, and MIB files of a node in the cluster. The
files remain accessible even when the node is down, provided that the node is taken over
by its partner.
What you’ll need
• The cluster management LIF must be up.
You can use the management LIF of the cluster or a node to access the spi web service. However, using
the cluster management LIF is recommended.
The network interface show command displays the status of all LIFs in the cluster.
• You must use a local user account to access the spi web service; domain user accounts are not
supported.
• If your user account does not have the “admin” role (which has access to the spi web service by default),
your access-control role must be granted access to the spi web service.
The vserver services web access show command shows what roles are granted access to which
web services.
• If you are not using the “admin” user account (which includes the http access method by default), your
user account must be set up with the http access method.
The security login show command shows user accounts' access and login methods and their
access-control roles.
• If you want to use HTTPS for secure web access, SSL must be enabled and a digital certificate must be
installed.
The system services web show command displays the configuration of the web protocol engine at
the cluster level.
The “admin” role is granted access to the spi web service by default, and the access can be disabled
manually (services web access delete -vserver cluster_name -name spi -role admin).
Steps
1. Point the web browser to the spi web service URL in one of the following formats:
◦ http://cluster-mgmt-LIF/spi/
◦ https://cluster-mgmt-LIF/spi/
2. When prompted by the browser, enter your user account and password.
After your account is authenticated, the browser displays links to the /mroot/etc/log/,
/mroot/etc/crash/, and /mroot/etc/mib/ directories of each node in the cluster.
If a node is hanging at the boot menu or the boot environment prompt, you can access it
only through the system console (also called the serial console). You can access the
system console of a node from an SSH connection to the node’s SP or to the cluster.
About this task
Both the SP and ONTAP offer commands that enable you to access the system console. However, from the
SP, you can access only the system console of its own node. From the cluster, you can access the system
console of any node in the cluster.
Steps
1. Access the system console of a node:
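A sketch of the two access paths (both commands are referenced elsewhere in this section; the node names are placeholders):
From the SP CLI of the node:    SP node1> system console
From the ONTAP CLI of any node: cluster1::> system node run-console -node node2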
LOADER>
LOADER> boot_ontap
...
*******************************
* *
* Press Ctrl-C for Boot Menu. *
* *
*******************************
...
Connection to 123.12.123.12 closed.
SP node2>
The following example shows the result of entering the system node run-console command from ONTAP
to access the system console of node2, which is hanging at the boot environment prompt. The boot_ontap
command is entered at the console to boot node2 to ONTAP. Ctrl-D is then pressed to exit the console and
return to ONTAP.
LOADER>
LOADER> boot_ontap
...
*******************************
* *
* Press Ctrl-C for Boot Menu. *
* *
*******************************
...
A node’s root volume is a FlexVol volume that is installed at the factory or by setup
software. It is reserved for system files, log files, and core files. The directory name is
/mroot, which is accessible only through the systemshell by technical support. The
minimum size for a node’s root volume depends on the platform model.
A node’s root volume contains special directories and files for that node. The root aggregate contains the root
volume. A few rules govern a node’s root volume and root aggregate.
Storing user data in the root volume increases the storage giveback time between nodes in an HA pair.
◦ You can move the root volume to another aggregate. See Relocate root volumes to new aggregates.
• The root aggregate is dedicated to the node’s root volume only.
ONTAP prevents you from creating other volumes in the root aggregate.
A warning message appears when a node’s root volume has become full or almost full. The node cannot
operate properly when its root volume is full. You can free up space on a node’s root volume by deleting core
dump files, packet trace files, and root volume Snapshot copies.
Steps
1. Display the node’s core dump files and their names:
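A sketch, assuming the system node coredump show command:
system node coredump show -node nodename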
nodename is the name of the node whose root volume space you want to free up.
5. Display and delete the node’s packet trace files through the nodeshell:
a. Display all files in the node’s root volume:
ls /etc
b. If any packet trace files (*.trc) are in the node’s root volume, delete them individually:
rm /etc/log/packet_traces/file_name.trc
6. Identify and delete the node’s root volume Snapshot copies through the nodeshell:
a. Identify the root volume name:
vol status
The root volume is indicated by the word “root” in the “Options” column of the vol status command
output.
node1*> vol status
exit
The root replacement procedure migrates the current root aggregate to another set of disks without disruption.
You can change the location of the root volume to a new aggregate in the following scenarios:
• When the root aggregates are not on the disk you prefer
• When you want to rearrange the disks connected to the node
• When you are performing a shelf replacement of the EOS disk shelves
Steps
1. Set the privilege level to advanced:
◦ -node
Specifies the node that owns the root aggregate that you want to migrate.
◦ -disklist
Specifies the list of disks on which the new root aggregate will be created. All disks must be spares and
owned by the same node. The minimum number of disks required is dependent on the RAID type.
◦ -raid-type
Specifies the RAID type of the root aggregate. The default value is raid-dp.
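A sketch combining the parameters above (the command name system node migrate-root is an assumption, and the disk names are placeholders):
system node migrate-root -node node1 -disklist 1.0.1,1.0.2,1.0.3,1.0.4,1.0.5 -raid-type raid-dp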
Results
If all of the pre-checks are successful, the command starts a root volume replacement job and exits. Expect the
node to restart.
You might need to start or stop a node for maintenance or troubleshooting reasons. You
can do so from the ONTAP CLI, the boot environment prompt, or the SP CLI.
Using the SP CLI command system power off or system power cycle to turn off or power-cycle a node
might cause an improper shutdown of the node (also called a dirty shutdown) and is not a substitute for a
graceful shutdown using the ONTAP system node halt command.
You can reboot a node in normal mode from the system prompt. A node is configured to boot from the boot
device, such as a PC CompactFlash card.
Steps
1. If the cluster contains four or more nodes, verify that the node to be rebooted does not hold epsilon:
a. Set the privilege level to advanced:
set -privilege advanced
b. Determine which node holds epsilon:
cluster show
c. If the node to be rebooted holds epsilon, then remove epsilon from the node:
cluster modify -node node_name -epsilon false
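2. Reboot the node by using the system node reboot command, for example (a sketch; the node name is a placeholder):
system node reboot -node <node_name>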
If you do not specify the -skip-lif-migration parameter, the command attempts to migrate data and
cluster management LIFs synchronously to another node prior to the reboot. If the LIF migration fails or
times out, the rebooting process is aborted, and ONTAP displays an error to indicate the LIF migration
failure.
The node begins the reboot process. The ONTAP login prompt appears, indicating that the reboot process
is complete.
You can boot the current release or the backup release of ONTAP when you are at the boot environment
prompt of a node.
Steps
1. Access the boot environment prompt from the storage system prompt by using the system node halt
command.
To boot…                         Enter…
The current release of ONTAP     boot_ontap
The backup release of ONTAP      boot_backup
If you are unsure about which image to use, you should use boot_ontap in the first instance.
You can shut down a node if it becomes unresponsive or if support personnel direct you to do so as part of
troubleshooting efforts.
Steps
1. If the cluster contains four or more nodes, verify that the node to be shut down does not hold epsilon:
a. Set the privilege level to advanced:
set -privilege advanced
b. Determine which node holds epsilon:
cluster show
c. If the node to be shut down holds epsilon, then remove epsilon from the node:
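As in the reboot procedure above:
cluster modify -node node_name -epsilon false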
2. Use the system node halt command to shut down the node.
If you do not specify the -skip-lif-migration parameter, the command attempts to migrate data and
cluster management LIFs synchronously to another node prior to the shutdown. If the LIF migration fails or
times out, the shutdown process is aborted, and ONTAP displays an error to indicate the LIF migration
failure.
You can manually trigger a core dump with the shutdown by using the -dump parameter.
The following example shuts down the node named “node1” for hardware maintenance:
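A sketch of such a command (the -reason text is illustrative):
cluster1::> system node halt -node node1 -reason "hardware maintenance"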
Manage a node by using the boot menu
You can use the boot menu to correct configuration problems on a node, reset the admin
password, initialize disks, reset the node configuration, and restore the node
configuration information back to the boot device.
If an HA pair is using encrypting SAS or NVMe drives (SED, NSE, FIPS), you must follow the
instructions in the topic Returning a FIPS drive or SED to unprotected mode for all drives within
the HA pair prior to initializing the system (boot options 4 or 9). Failure to do this may result in
future data loss if the drives are repurposed.
Steps
1. Reboot the node to access the boot menu by using the system node reboot command at the system
prompt.
2. During the reboot process, press Ctrl-C to display the boot menu when prompted to do so.
The node displays the following options for the boot menu:
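The menu typically resembles the following (wording and option count can vary slightly by release):
(1) Normal Boot.
(2) Boot without /etc/rc.
(3) Change password.
(4) Clean configuration and initialize all disks.
(5) Maintenance mode boot.
(6) Update flash from backup config.
(7) Install new software first.
(8) Reboot node.
(9) Configure Advanced Drive Partitioning.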
Boot menu option (2) Boot without /etc/rc is obsolete and has no effect on the system.
To…                                        Select…
Continue to boot the node in normal mode   1) Normal Boot
Initialize the node's disks and create a root volume for the node   4) Clean configuration and initialize all disks
This menu option erases all data on the disks of the node and resets your node configuration to the factory default settings.
Only select this menu item after the node has been removed from a
cluster (unjoined) and is not joined to another cluster.
For a node with internal or external disk shelves, the root volume on
the internal disks is initialized. If there are no internal disk shelves,
then the root volume on the external disks is initialized.
If the node you want to initialize has disks that are partitioned for
root-data partitioning, the disks must be unpartitioned before the
node can be initialized; see 9) Configure Advanced Drive
Partitioning and Disks and aggregates management.
If the ONTAP software on the boot device does not include support
for the storage array that you want to use for the root volume, you
can use this menu option to obtain a version of the software that
supports your storage array and install it on the node.
Reboot the node                            8) Reboot node
You can display the attributes of one or more nodes in the cluster, for example, the name,
owner, location, model number, serial number, how long the node has been running,
health state, and eligibility to participate in a cluster.
Steps
1. To display the attributes of a specified node or about all nodes in a cluster, use the system node show
command.
cluster1::> system node show -node node1
Node: node1
Owner: Eng IT
Location: Lab 5
Model: model_number
Serial Number: 12345678
Asset Tag: -
Uptime: 23 days 04:42
NVRAM System ID: 118051205
System ID: 0118051205
Vendor: NetApp
Health: true
Eligibility: true
Differentiated Services: false
All-Flash Optimized: true
Capacity Optimized: false
QLC Optimized: false
All-Flash Select Optimized: false
SAS2/SAS3 Mixed Stack Support: none
You can modify the attributes of a node as required. The attributes that you can modify
include the node’s owner information, location information, asset tag, and eligibility to
participate in the cluster.
About this task
A node’s eligibility to participate in the cluster can be modified at the advanced privilege level by using the
–eligibility parameter of the system node modify or cluster modify command. If you set a node’s
eligibility to false, the node becomes inactive in the cluster.
You cannot modify node eligibility locally. It must be modified from a different node. Node
eligibility also cannot be modified in a cluster HA configuration.
You should avoid setting a node’s eligibility to false, except for situations such as restoring the
node configuration or prolonged node maintenance. SAN and NAS data access to the node
might be impacted when the node is ineligible.
Steps
1. Use the system node modify command to modify a node’s attributes.
cluster1::> system node modify -node node1 -owner "Joe Smith" -assettag
js1234
Rename a node
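A sketch of the command, based on the system node rename man page referenced below (the -node parameter and the names shown are assumptions):
cluster1::> system node rename -node <node_name> -newname <new_node_name>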
The -newname parameter specifies the new name for the node. The system node rename man page
describes the rules for specifying the node name.
If you want to rename multiple nodes in the cluster, you must run the command for each node individually.
For fault tolerance and nondisruptive operations, it is highly recommended that you configure
your cluster with high-availability (HA) pairs.
If you choose to configure or upgrade a single-node cluster, you should be aware of the following:
You can configure iSCSI SAN hosts to connect directly to a single node or to connect through one or more IP
switches. The node can have multiple iSCSI connections to the switch.
Direct-attached single-node configurations
In direct-attached single-node configurations, one or more hosts are directly connected to the node.
Ways to configure FC and FC-NVMe SAN hosts with single nodes
You can configure FC and FC-NVMe SAN hosts with single nodes through one or more fabrics. N-Port ID
Virtualization (NPIV) is required and must be enabled on all FC switches in the fabric. You cannot directly
attach FC or FC-NVMe SAN hosts to single nodes without using an FC switch.
In single-fabric single-node configurations, multipathing software is not required if you only have a single path
from the host to the node.
The FC target ports (0a, 0c, 0b, 0d) in the illustrations are examples. The actual port numbers vary depending
on the model of your storage node and whether you are using expansion adapters.
Related information
NetApp Technical Report 4684: Implementing and Configuring Modern SANs with NVMe-oF
Beginning with ONTAP 9.2, you can use the ONTAP CLI to perform an automated update of a single-node
cluster. Because single-node clusters lack redundancy, updates are always disruptive. Disruptive upgrades
cannot be performed using System Manager.
Steps
1. Delete the previous ONTAP software package:
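A sketch (the version string is a placeholder):
cluster image package delete -version <previous_ONTAP_version>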
3. Verify that the software package is available in the cluster package repository:
cluster image package show-repository
WARNING: There are additional manual upgrade validation checks that must
be performed after these automated validation checks have completed...
The software upgrade estimate displays details about each component to be updated, and the estimated
duration of the upgrade.
If an issue is encountered, the update pauses and prompts you to take corrective action.
You can use the cluster image show-update-progress command to view details about any
issues and the progress of the update. After correcting the issue, you can resume the
update by using the cluster image resume-update command.
cluster image show-update-progress
The node is rebooted as part of the update and cannot be accessed while rebooting.
If your cluster is not configured to send messages, a copy of the notification is saved locally.
It is a best practice to configure SP/BMC and the e0M management interface on a subnet
dedicated to management traffic. Running data traffic over the management network can
cause performance degradation and routing problems.
The management Ethernet port on most storage controllers (indicated by a wrench icon on the rear of the
chassis) is connected to an internal Ethernet switch. The internal switch provides connectivity to SP/BMC and
to the e0M management interface, which you can use to access the storage system via TCP/IP protocols like
Telnet, SSH, and SNMP.
If you plan to use both the remote management device and e0M, you must configure them on the same IP
subnet. Since these are low-bandwidth interfaces, the best practice is to configure SP/BMC and e0M on a
subnet dedicated to management traffic.
If you cannot isolate management traffic, or if your dedicated management network is unusually large, you
should try to keep the volume of network traffic as low as possible. Excessive ingress broadcast or multicast
traffic may degrade SP/BMC performance.
Some storage controllers, such as the AFF A800, have two external ports, one for BMC and the
other for e0M. For these controllers, there is no requirement to configure BMC and e0M on the
same IP subnet.
You can enable cluster-level, automatic network configuration for the SP (recommended).
You can also leave the SP automatic network configuration disabled (the default) and
manage the SP network configuration manually at the node level. A few considerations
exist for each case.
The SP automatic network configuration enables the SP to use address resources (including the IP address,
subnet mask, and gateway address) from the specified subnet to set up its network automatically. With the SP
automatic network configuration, you do not need to manually assign IP addresses for the SP of each node. By
default, the SP automatic network configuration is disabled; this is because enabling the configuration requires
that the subnet to be used for the configuration be defined in the cluster first.
If you enable the SP automatic network configuration, the following scenarios and considerations apply:
• If the SP has never been configured, the SP network is configured automatically based on the subnet
specified for the SP automatic network configuration.
• If the SP was previously configured manually, or if the existing SP network configuration is based on a
different subnet, the SP network of all nodes in the cluster are reconfigured based on the subnet that you
specify in the SP automatic network configuration.
The reconfiguration could result in the SP being assigned a different address, which might have an impact
on your DNS configuration and its ability to resolve SP host names. As a result, you might need to update
your DNS configuration.
• A node that joins the cluster uses the specified subnet to configure its SP network automatically.
• The system service-processor network modify command does not enable you to change the SP
IP address.
When the SP automatic network configuration is enabled, the command only allows you to enable or
disable the SP network interface.
◦ If the SP automatic network configuration was previously enabled, disabling the SP network interface
results in the assigned address resource being released and returned to the subnet.
◦ If you disable the SP network interface and then reenable it, the SP might be reconfigured with a
different address.
If the SP automatic network configuration is disabled (the default), the following scenarios and considerations
apply:
• If the SP has never been configured, SP IPv4 network configuration defaults to using IPv4 DHCP, and IPv6
is disabled.
A node that joins the cluster also uses IPv4 DHCP for its SP network configuration by default.
• The system service-processor network modify command enables you to configure a node’s SP
IP address.
A warning message appears when you attempt to manually configure the SP network with addresses that
are allocated to a subnet. Ignoring the warning and proceeding with the manual address assignment might
result in a scenario with duplicate addresses.
If the SP automatic network configuration is disabled after having been enabled previously, the following
scenarios and considerations apply:
• If the SP automatic network configuration has the IPv4 address family disabled, the SP IPv4 network
defaults to using DHCP, and the system service-processor network modify command enables
you to modify the SP IPv4 configuration for individual nodes.
• If the SP automatic network configuration has the IPv6 address family disabled, the SP IPv6 network is
also disabled, and the system service-processor network modify command enables you to
enable and modify the SP IPv6 configuration for individual nodes.
• The subnet you want to use for the SP automatic network configuration must already be defined in the
cluster and must have no resource conflicts with the SP network interface.
The network subnet show command displays subnet information for the cluster.
• If you want to use IPv6 connections for the SP, IPv6 must already be configured and enabled for ONTAP.
The network options ipv6 show command displays the current state of IPv6 settings for ONTAP.
Steps
1. Specify the IPv4 or IPv6 address family and name for the subnet that you want the SP to use by using the
system service-processor network auto-configuration enable command.
2. Display the SP automatic network configuration by using the system service-processor network
auto-configuration show command.
3. If you subsequently want to disable or reenable the SP IPv4 or IPv6 network interface for all nodes that are
in quorum, use the system service-processor network modify command with the -address
-family [IPv4|IPv6] and -enable [true|false] parameters.
When the SP automatic network configuration is enabled, you cannot modify the SP IP address for a node
that is in quorum. You can only enable or disable the SP IPv4 or IPv6 network interface.
If a node is out of quorum, you can modify the node’s SP network configuration, including the SP IP
address, by running system service-processor network modify from the node and confirming
that you want to override the SP automatic network configuration for the node. However, when the node
joins the quorum, the SP automatic reconfiguration takes place for the node based on the specified subnet.
If you do not have automatic network configuration set up for the SP, you must manually
configure a node’s SP network for the SP to be accessible by using an IP address.
What you’ll need
If you want to use IPv6 connections for the SP, IPv6 must already be configured and enabled for ONTAP. The
network options ipv6 commands manage IPv6 settings for ONTAP.
You can configure the SP to use IPv4, IPv6, or both. The SP IPv4 configuration supports static and DHCP
addressing, and the SP IPv6 configuration supports static addressing only.
If the SP automatic network configuration has been set up, you do not need to manually configure the SP
network for individual nodes, and the system service-processor network modify command allows
you to only enable or disable the SP network interface.
Steps
1. Configure the SP network for a node by using the system service-processor network modify
command.
◦ The -address-family parameter specifies whether the IPv4 or IPv6 configuration of the SP is to be
modified.
◦ The -enable parameter enables the network interface of the specified IP address family.
◦ The -dhcp parameter specifies whether to use the network configuration from the DHCP server or the
network address that you provide.
You can enable DHCP (by setting -dhcp to v4) only if you are using IPv4. You cannot enable DHCP
for IPv6 configurations.
◦ The -ip-address parameter specifies the public IP address for the SP.
A warning message appears when you attempt to manually configure the SP network with addresses
that are allocated to a subnet. Ignoring the warning and proceeding with the manual address
assignment might result in a duplicate address assignment.
◦ The -netmask parameter specifies the netmask for the SP (if using IPv4.)
◦ The -prefix-length parameter specifies the network prefix-length of the subnet mask for the SP (if
using IPv6.)
◦ The -gateway parameter specifies the gateway IP address for the SP.
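An illustrative command combining these parameters (-node identifies the node; the addresses mirror the sample output shown later in this section):
system service-processor network modify -node node1 -address-family IPv4 -enable true -dhcp none -ip-address 192.168.123.98 -netmask 255.255.255.0 -gateway 192.168.123.1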
2. Configure the SP network for the remaining nodes in the cluster by repeating step 1.
3. Display the SP network configuration and verify the SP setup status by using the system
service-processor network show command with the -instance or -fields setup-status parameters.
The SP setup status for a node can be one of the following:
Node: node1
Address Type: IPv4
Interface Enabled: true
Type of Device: SP
Status: online
Link Status: up
DHCP Status: none
IP Address: 192.168.123.98
MAC Address: ab:cd:ef:fe:ed:02
Netmask: 255.255.255.0
Prefix Length of Subnet Mask: -
Router Assigned IP Address: -
Link Local IP Address: -
Gateway IP Address: 192.168.123.1
Time Last Updated: Thu Apr 10 17:02:13 UTC 2014
Subnet Name: -
Enable IPv6 Router Assigned Address: -
SP Network Setup Status: succeeded
SP Network Setup Failure Reason: -
cluster1::>
The SP API is a secure network API that enables ONTAP to communicate with the SP
over the network. You can change the port used by the SP API service, renew the
certificates the service uses for internal communication, or disable the service entirely.
You need to modify the configuration only in rare situations.
About this task
• The SP API service uses port 50000 by default.
You can change the port value if, for example, you are in a network setting where port 50000 is used for
communication by another networking application, or you want to differentiate between traffic from other
applications and traffic generated by the SP API service.
• The SSL and SSH certificates used by the SP API service are internal to the cluster and not distributed
externally.
In the unlikely event that the certificates are compromised, you can renew them.
You only need to disable the SP API service in rare situations, such as in a private LAN where the SP is
not configured or used and you want to disable the service.
If the SP API service is disabled, the API does not accept any incoming connections. In addition,
functionality such as network-based SP firmware updates and network-based SP “down system” log
collection becomes unavailable. The system switches to using the serial interface.
Steps
1. Switch to the advanced privilege level by using the set -privilege advanced command.
2. Modify the SP API service configuration:
If you want to…                              Use the following command…
Change the port used by the SP API service   system service-processor api-service modify with the
                                             -port parameter
Renew the SSL and SSH certificates used      For ONTAP 9.5 or later, use system service-processor
by the SP API service for internal           api-service renew-internal-certificate. For ONTAP 9.4
communication                                and earlier, use system service-processor api-service
                                             renew-certificates.
Disable or reenable the SP API service       system service-processor api-service modify with the
                                             -is-enabled {true|false} parameter
3. Display the SP API service configuration by using the system service-processor api-service
show command.
You can manage a node remotely using an onboard controller, called a Service Processor
(SP) or Baseboard Management Controller (BMC). This remote management controller is
included in all current platform models. The controller stays operational regardless of the
operating state of the node.
The following platforms support BMC instead of SP:
• FAS 8700
• FAS 8300
• FAS27x0
• AFF A800
• AFF A700s
• AFF A400
• AFF A320
• AFF A220
• AFF C190
About the SP
The Service Processor (SP) is a remote management device that enables you to access,
monitor, and troubleshoot a node remotely.
The key capabilities of the SP include the following:
• The SP enables you to access a node remotely to diagnose, shut down, power-cycle, or reboot the node,
regardless of the state of the node controller.
The SP is powered by a standby voltage, which is available as long as the node has input power from at
least one of its power supplies.
You can log in to the SP by using a Secure Shell client application from an administration host. You can
then use the SP CLI to monitor and troubleshoot the node remotely. In addition, you can use the SP to
access the serial console and run ONTAP commands remotely.
You can access the SP from the serial console or access the serial console from the SP. The SP enables
you to open both an SP CLI session and a separate console session simultaneously.
For instance, when a temperature sensor becomes critically high or low, ONTAP triggers the SP to shut
down the motherboard gracefully. The serial console becomes unresponsive, but you can still press Ctrl-G
on the console to access the SP CLI. You can then use the system power on or system power
cycle command from the SP to power on or power-cycle the node.
• The SP monitors environmental sensors and logs events to help you take timely and effective service
actions.
The SP monitors environmental sensors such as the node temperatures, voltages, currents, and fan
speeds. When an environmental sensor has reached an abnormal condition, the SP logs the abnormal
readings, notifies ONTAP of the issue, and sends alerts and “down system” notifications as necessary
through an AutoSupport message, regardless of whether the node can send AutoSupport messages.
The SP also logs events such as boot progress, Field Replaceable Unit (FRU) changes, events generated
by ONTAP, and SP command history. You can manually invoke an AutoSupport message to include the SP
log files that are collected from a specified node.
Other than generating these messages on behalf of a node that is down and attaching additional diagnostic
information to AutoSupport messages, the SP has no effect on the AutoSupport functionality. The
AutoSupport configuration settings and message content behavior are inherited from ONTAP.
The SP does not rely on the -transport parameter setting of the system node
autosupport modify command to send notifications. The SP only uses the Simple Mail
Transport Protocol (SMTP) and requires its host’s AutoSupport configuration to include mail
host information.
If SNMP is enabled, the SP generates SNMP traps to configured trap hosts for all “down system” events.
• The SP has a nonvolatile memory buffer that stores up to 4,000 events in a system event log (SEL) to help
you diagnose issues.
The SEL stores each audit log entry as an audit event. It is stored in onboard flash memory on the SP. The
event list from the SEL is automatically sent by the SP to specified recipients through an AutoSupport
message.
◦ Hardware events detected by the SP—for example, sensor status about power supplies, voltage, or
other components
◦ Errors detected by the SP—for example, a communication error, a fan failure, or a memory or CPU
error
◦ Critical software events sent to the SP by the node—for example, a panic, a communication failure, a
boot failure, or a user-triggered “down system” as a result of issuing the SP system reset or
system power cycle command
• The SP monitors the serial console regardless of whether administrators are logged in or connected to the
console.
When messages are sent to the console, the SP stores them in the console log. The console log persists
as long as the SP has power from either of the node power supplies. Because the SP operates with
standby power, it remains available even when the node is power-cycled or turned off.
• Hardware-assisted takeover is available if the SP is configured.
• The SP API service enables ONTAP to communicate with the SP over the network.
The service enhances ONTAP management of the SP by supporting network-based functionality such as
using the network interface for the SP firmware update, enabling a node to access another node’s SP
functionality or system console, and uploading the SP log from another node.
You can modify the configuration of the SP API service by changing the port the service uses, renewing the
SSL and SSH certificates that are used by the service for internal communication, or disabling the service
entirely.
The following diagram illustrates access to ONTAP and the SP of a node. The SP interface is accessed
through the Ethernet port (indicated by a wrench icon on the rear of the chassis):
• The BMC completely controls the environmental monitoring of power supply elements, cooling elements,
temperature sensors, voltage sensors, and current sensors. The BMC reports sensor information to
ONTAP through IPMI.
• Some of the high-availability (HA) and storage commands are different.
• The BMC does not send AutoSupport messages.
Automatic firmware updates are also available when running ONTAP 9.2 GA or later with the following
requirements:
• BMC firmware revision 1.15 or later must be installed.
A manual update is required to upgrade BMC firmware from 1.12 to 1.15 or later.
ONTAP includes an SP firmware image that is called the baseline image. If a new version
of the SP firmware becomes subsequently available, you have the option to download it
and update the SP firmware to the downloaded version without upgrading the ONTAP
version.
• The SP automatic update functionality is enabled by default, allowing the SP firmware to be automatically
updated in the following scenarios:
The ONTAP upgrade process automatically includes the SP firmware update, provided that the SP
firmware version bundled with ONTAP is newer than the SP version running on the node.
ONTAP detects a failed SP automatic update and triggers a corrective action to retry the
SP automatic update up to three times. If all three retries fail, see the Knowledge Base
article Health Monitor SPAutoUpgradeFailedMajorAlert SP upgrade
fails - AutoSupport Message.
◦ When you download a version of the SP firmware from the NetApp Support Site and the downloaded
version is newer than the one that the SP is currently running
◦ When you downgrade or revert to an earlier version of ONTAP
The SP firmware is automatically updated to the newest compatible version that is supported by the
ONTAP version you reverted or downgraded to. A manual SP firmware update is not required.
You have the option to disable the SP automatic update functionality by using the system service-
processor image modify command. However, it is recommended that you leave the functionality
enabled. Disabling the functionality can result in suboptimal or nonqualified combinations between the
ONTAP image and the SP firmware image.
• ONTAP enables you to trigger an SP update manually and specify how the update should take place by
using the system service-processor image update command.
You can update the SP firmware to a downloaded package by specifying the package file name. The
(advanced privilege) system image package show command displays all package files (including the files for
the SP firmware package) that are available on a node.
◦ Whether to use the baseline SP firmware package for the SP update (-baseline)
You can update the SP firmware to the baseline version that is bundled with the currently running
version of ONTAP.
If you use some of the more advanced update options or parameters, the BMC’s
configuration settings may be temporarily cleared. After reboot, it can take up to 10 minutes
for ONTAP to restore the BMC configuration.
• ONTAP enables you to display the status for the latest SP firmware update triggered from ONTAP by using
the system service-processor image update-progress show command.
Any existing connection to the SP is terminated when the SP firmware is being updated. This is the case
whether the SP firmware update is automatically or manually triggered.
Related information
NetApp Downloads: System Firmware and Diagnostics
When the SP/BMC uses the network interface for firmware updates
An SP firmware update that is triggered from ONTAP with the SP running version 1.5,
2.5, 3.1, or later supports using an IP-based file transfer mechanism over the SP network
interface.
An SP firmware update over the network interface is faster than an update over the serial interface. It reduces
the maintenance window during which the SP firmware is being updated, and it is also nondisruptive to ONTAP
operation. The SP versions that support this capability are included with ONTAP. They are also available on the
NetApp Support Site and can be installed on controllers that are running a compatible version of ONTAP.
When you are running SP version 1.5, 2.5, 3.1, or later, the following firmware upgrade behaviors apply:
• An SP firmware update that is automatically triggered by ONTAP defaults to using the network interface for
the update; however, the SP automatic update switches to using the serial interface for the firmware update
if one of the following conditions occurs:
◦ The SP network interface is not configured or not available.
◦ The IP-based file transfer fails.
◦ The SP API service is disabled.
Regardless of the SP version you are running, an SP firmware update triggered from the SP CLI always uses
the SP network interface for the update.
Related information
NetApp Downloads: System Firmware and Diagnostics
Accounts that can access the SP
When you try to access the SP, you are prompted for credentials. Cluster user accounts
that are created with the service-processor application type have access to the SP
CLI on any node of the cluster. SP user accounts are managed from ONTAP and
authenticated by password. Beginning with ONTAP 9.9.1, SP user accounts must have
the admin role.
User accounts for accessing the SP are managed from ONTAP instead of the SP CLI. A cluster user account
can access the SP if it is created with the -application parameter of the security login create
command set to service-processor and the -authmethod parameter set to password. The SP supports
only password authentication.
You must specify the -role parameter when creating an SP user account.
• In ONTAP 9.9.1 and later releases, you must specify admin for the -role parameter, and any
modifications to an account require the admin role. Other roles are no longer permitted for security
reasons.
◦ If you are upgrading to ONTAP 9.9.1 or later releases, see Change in user accounts that can access
the Service Processor.
◦ If you are reverting to ONTAP 9.8 or earlier releases, see Verify user accounts that can access the
Service Processor.
• In ONTAP 9.8 and earlier releases, any role can access the SP, but admin is recommended.
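A sketch of creating such an account (the account name and the -user-or-group-name parameter are assumptions; -application, -authmethod, and -role follow the requirements above):
cluster1::> security login create -user-or-group-name sp-user1 -application service-processor -authmethod password -role admin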
By default, the cluster user account named “admin” includes the service-processor application type and
has access to the SP.
ONTAP prevents you from creating user accounts with names that are reserved for the system (such as “root”
and “naroot”). You cannot use a system-reserved name to access the cluster or the SP.
You can display current SP user accounts by using the -application service-processor parameter of
the security login show command.
You can log in to the SP of a node from an administration host to perform node
management tasks remotely.
What you’ll need
The following conditions must be met:
• The administration host you use to access the SP must support SSHv2.
• Your user account must already be set up for accessing the SP.
To access the SP, your user account must have been created with the -application parameter of the
security login create command set to service-processor and the -authmethod parameter
set to password.
If the SP is configured to use an IPv4 or IPv6 address, and if five SSH login attempts from a host fail
consecutively within 10 minutes, the SP rejects SSH login requests and suspends the communication with the
IP address of the host for 15 minutes. The communication resumes after 15 minutes, and you can try to log in
to the SP again.
ONTAP prevents you from creating or using system-reserved names (such as “root” and “naroot”) to access
the cluster or the SP.
Steps
1. From the administration host, log in to the SP:
ssh username@SP_IP_address
The SP prompt appears, indicating that you have access to the SP CLI.
The following examples show how to use the IPv6 global address or IPv6 router-advertised address to log in to
the SP on a node that has SSH set up for IPv6 and the SP configured for IPv6.
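Two illustrative invocations, using a global address and a router-advertised address respectively (the addresses and user name are placeholders):

ssh admin@fd22:8b1e:b255:202::1231
ssh admin@fd22:8b1e:b255:202:2a0:98ff:fe01:7d5b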
You can access the SP from the system console (also called serial console) to perform
monitoring or troubleshooting tasks.
About this task
This task applies to both the SP and the BMC.
Steps
1. Access the SP CLI from the system console by pressing Ctrl-G at the prompt.
2. Log in to the SP CLI when you are prompted.
The SP prompt appears, indicating that you have access to the SP CLI.
3. Exit the SP CLI and return to the system console by pressing Ctrl-D, and then press Enter.
cluster1::>
You can open an SP CLI session to manage a node remotely and open a separate SP
console session to access the console of the node. The SP console session mirrors
output displayed in a concurrent system console session. The SP and the system
console have independent shell environments with independent login authentication.
Understanding how the SP CLI, SP console, and system console sessions are related helps you manage a
node remotely. The following describes the relationship among the sessions:
• Only one administrator can log in to the SP CLI session at a time; however, the SP enables you to open
both an SP CLI session and a separate SP console session simultaneously.
The SP CLI is indicated with the SP prompt (SP>). From an SP CLI session, you can use the SP system
console command to initiate an SP console session. At the same time, you can start a separate SP CLI
session through SSH. If you press Ctrl-D to exit from the SP console session, you automatically return to
the SP CLI session. If an SP CLI session already exists, a message asks you whether to terminate the
existing SP CLI session. If you enter “y”, the existing SP CLI session is terminated, enabling you to return
from the SP console to the SP CLI. This action is recorded in the SP event log.
In an ONTAP CLI session that is connected through SSH, you can switch to the system console of a node
by running the ONTAP system node run-console command from another node.
• For security reasons, the SP CLI session and the system console session have independent login
authentication.
When you initiate an SP console session from the SP CLI (by using the SP system console command),
you are prompted for the system console credential. When you access the SP CLI from a system console
session (by pressing Ctrl-G), you are prompted for the SP CLI credential.
• The SP console session and the system console session have independent shell environments.
The SP console session mirrors output that is displayed in a concurrent system console session. However,
the concurrent system console session does not mirror the SP console session.
The SP console session does not mirror output of concurrent SSH sessions.
By default, the SP accepts SSH connection requests from administration hosts of any IP
addresses. You can configure the SP to accept SSH connection requests from only the
administration hosts that have the IP addresses you specify. The changes you make
apply to SSH access to the SP of any nodes in the cluster.
Steps
1. Grant SP access to only the IP addresses you specify by using the system service-processor ssh
add-allowed-addresses command with the -allowed-addresses parameter.
◦ The value of the -allowed-addresses parameter must be specified in the format of address
/netmask, and multiple address/netmask pairs must be separated by commas, for example,
10.98.150.10/24, fd20:8b1e:b255:c09b::/64.
◦ When you change the default by limiting SP access to only the IP addresses you specify, ONTAP
prompts you to confirm that you want the specified IP addresses to replace the “allow all” default
setting (0.0.0.0/0, ::/0).
◦ The system service-processor ssh show command displays the IP addresses that can access
the SP.
2. If you want to block a specified IP address from accessing the SP, use the system service-processor
ssh remove-allowed-addresses command with the -allowed-addresses parameter.
If you block all IP addresses from accessing the SP, the SP becomes inaccessible from any administration
hosts.
The following examples show the default setting for SSH access to the SP, change the default by limiting SP
access to only the specified IP addresses, remove the specified IP addresses from the access list, and then
restore SP access for all IP addresses:
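A representative sketch of the sequence (the IP address is illustrative and the show output is abbreviated); removing the last allowed address triggers the warning shown below:

cluster1::> system service-processor ssh show
  Allowed Addresses: 0.0.0.0/0, ::/0

cluster1::> system service-processor ssh add-allowed-addresses -allowed-addresses 192.168.1.202/24

cluster1::> system service-processor ssh show
  Allowed Addresses: 192.168.1.202/24

cluster1::> system service-processor ssh remove-allowed-addresses -allowed-addresses 192.168.1.202/24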
Warning: If all IP addresses are removed from the allowed address list, all IP
         addresses will be denied access. To restore the "allow all" default,
         use the "system service-processor ssh add-allowed-addresses
         -allowed-addresses 0.0.0.0/0, ::/0" command. Do you want to
         continue? {y|n}: y
The online help displays the SP/BMC CLI commands and options.
About this task
This task applies to both the SP and the BMC.
Steps
1. To display help information for the SP/BMC commands, enter the following:
SP> help
date - print date and time
exit - exit from the SP command line interface
events - print system events and event information
help - print command help
priv - show and set user mode
sp - commands to control the SP
system - commands to control the system
version - print SP version
BMC> system
system acp - acp related commands
system battery - battery related commands
system console - connect to the system console
system core - dump the system core and reset
system cpld - cpld commands
system log - print system console logs
system power - commands controlling system power
system reset - reset the system using the selected firmware
system sensors - print environmental sensors status
system service-event - print service-event status
system fru - fru related commands
system watchdog - system watchdog commands
BMC>
2. To display help information for the option of an SP/BMC command, enter help before or after the SP/BMC
command.
The following example shows the SP CLI online help for the SP events command.
SP> help events
events all - print all system events
events info - print system event log information
events newest - print newest system events
events oldest - print oldest system events
events search - search for and print system events
The following example shows the BMC CLI online help for the BMC system power command.
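A representative listing follows; the exact set of subcommands varies by platform and BMC firmware version:

BMC> system power help
system power cycle  - power cycle the system (off, then on)
system power off    - power the system off
system power on     - power the system on
system power status - print system power status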
You can manage a node remotely by accessing its SP and running SP CLI commands to
perform node-management tasks. For several commonly performed remote node-
management tasks, you can also use ONTAP commands from another node in the
cluster. Some SP commands are platform-specific and might not be available on your
platform.
• Display available SP commands or subcommands of a specified SP command
  SP command: help [command]
• Display events that are logged by the SP
  SP command: events {all | info | newest number | oldest number | search keyword}
• Display SP status and network configuration information
  SP command: sp status [-v | -d]
  BMC command: bmc status [-v | -d]
  ONTAP command: system service-processor show
  The -v option displays SP statistics in verbose form. The -d option adds the SP debug log to the display.
• Display the power status for the controller of a node
  SP command: system power status
  ONTAP command: system node power show
• Display the FRU data history log
  SP command: system fru log show (advanced privilege level)
• Turn the node on or off, or perform a power-cycle (turning the power off and then back on)
  SP command: system power on, system power off
  ONTAP command: system node power on (advanced privilege level)
  The standby power stays on to keep the SP running without interruption. During the power-cycle, a brief pause occurs before power is turned back on.
• Create a core dump and reset the node
  SP or BMC command: system core
  These commands have the same effect as pressing the Non-maskable Interrupt (NMI) button on a node, causing a dirty shutdown of the node and forcing a dump of the core files when halting the node. These commands are helpful when ONTAP on the node is hung or does not respond to commands such as system node shutdown. The generated core dump files are displayed in the output of the system node coredump show command. The SP stays operational as long as the input power to the node is not interrupted.
• Reboot the node using the selected BIOS firmware image
  SP or BMC command: system reset
  If no BIOS firmware image is specified, the current image is used for the reboot. The SP stays operational as long as the input power to the node is not interrupted.
• Display the status of battery firmware automatic update, or enable or disable battery firmware automatic update upon next SP boot
  SP command: system battery auto_update [status | enable | disable] (advanced privilege level)
• Compare the current battery firmware image against a specified firmware image
  SP command: system battery verify [image_URL] (advanced privilege level)
  If image_URL is not specified, the default battery firmware image is used for comparison.
• Update the SP firmware by using the image at the specified location
  SP command: sp update image_URL
  BMC command: bmc update image_URL
  ONTAP command: system service-processor image update
  image_URL must not exceed 200 characters.
About the threshold-based SP sensor readings and status values of the system sensors command
output
Examples of threshold-based sensors include sensors for the system temperatures, voltages, currents, and fan
speeds. The specific list of threshold-based sensors depends on the platform.
Threshold-based sensors have the following thresholds, displayed in the output of the SP system sensors command: lower critical (LCR), lower noncritical (LNC), upper noncritical (UNC), and upper critical (UCR).
A sensor reading between LNC and LCR or between UNC and UCR means that the component is showing
signs of a problem and a system failure might occur as a result. Therefore, you should plan for component
service soon.
A sensor reading below LCR or above UCR means that the component is malfunctioning and a system failure
is about to occur. Therefore, the component requires immediate attention.
The following diagram illustrates the severity ranges that are specified by the thresholds:
You can find the reading of a threshold-based sensor under the Current column in the system sensors
command output. The system sensors get sensor_name command displays additional details for the
specified sensor. As the reading of a threshold-based sensor crosses the noncritical and critical threshold
ranges, the sensor reports a problem of increasing severity. When the reading exceeds a threshold limit, the
sensor’s status in the system sensors command output changes from ok to nc (noncritical) or cr (critical)
depending on the exceeded threshold, and an event message is logged in the SEL event log.
Some threshold-based sensors do not have all four threshold levels. For those sensors, the missing thresholds
show na as their limits in the system sensors command output, indicating that the particular sensor has no
limit or severity concern for the given threshold and the SP does not monitor the sensor for that threshold.
SP node1> system sensors
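The output resembles the following (truncated; the sensor names and values are illustrative and vary by platform):

Sensor Name      | Current    | Unit       | Status| LCR       | LNC       | UNC       | UCR
-----------------+------------+------------+-------+-----------+-----------+-----------+-----------
CPU0_Temp_Margin | -55.000    | degrees C  | ok    | na        | na        | -5.000    | 0.000
In_Flow_Temp     | 32.000     | degrees C  | ok    | 0.000     | 10.000    | 42.000    | 52.000
5V               | 5.002      | Volts      | ok    | 4.246     | 4.490     | 5.490     | 5.758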
Example of the system sensors sensor_name command output for a threshold-based sensor
The following example shows the result of entering system sensors get sensor_name in the SP CLI for
the threshold-based sensor 5V:
SP node1> system sensors get 5V
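Representative output (the field layout and values are illustrative and vary by platform):

Locating sensor record...
Sensor ID              : 5V (0x13)
Entity ID              : 7.97
Sensor Type (Threshold): Voltage
Sensor Reading         : 5.002 (+/- 0) Volts
Status                 : ok
Lower critical         : 4.246
Lower non-critical     : 4.490
Upper non-critical     : 5.490
Upper critical         : 5.758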
About the discrete SP sensor status values of the system sensors command output
Discrete sensors do not have thresholds. Their readings, displayed under the Current
column in the SP CLI system sensors command output, do not carry actual meanings
and thus are ignored by the SP. The Status column in the system sensors command
output displays the status values of discrete sensors in hexadecimal format.
Examples of discrete sensors include sensors for the fan, power supply unit (PSU) fault, and system fault. The
specific list of discrete sensors depends on the platform.
You can use the SP CLI system sensors get sensor_name command for help with interpreting the status
values for most discrete sensors. The following examples show the results of entering system sensors get
sensor_name for the discrete sensors CPU0_Error and IO_Slot1_Present:
SP node1> system sensors get IO_Slot1_Present
Locating sensor record...
Sensor ID : IO_Slot1_Present (0x74)
Entity ID : 11.97
Sensor Type (Discrete): Add-in Card
States Asserted : Availability State
[Device Present]
Although the system sensors get sensor_name command displays the status information for most
discrete sensors, it does not provide status information for the System_FW_Status, System_Watchdog,
PSU1_Input_Type, and PSU2_Input_Type discrete sensors. You can use the following information to interpret
these sensors' status values.
System_FW_Status
The System_FW_Status sensor’s condition appears in the form of 0xAABB. You can combine the information
of AA and BB to determine the condition of the sensor.
The BB portion can have the following values:
• 1F: BIOS is starting up
• 20: LOADER is running
• 2F: ONTAP is running
For instance, the System_FW_Status sensor status 0x042F means "system firmware progress (04), ONTAP is
running (2F)."
System_Watchdog
The System_Watchdog sensor's condition is also reported as a hexadecimal value, such as 0x0080.
For instance, the System_Watchdog sensor status 0x0880 means that a watchdog timeout occurred and caused a system power cycle.
For direct current (DC) power supplies, the PSU1_Input_Type and PSU2_Input_Type sensors do not apply. For alternating current (AC) power supplies, the sensors' status indicates the PSU input voltage type. For instance, the PSU1_Input_Type sensor status 0x0280 means that the PSU type is 110V.
ONTAP provides commands for managing the SP, including the SP network
configuration, SP firmware image, SSH access to the SP, and general SP administration.
• Disable the SP automatic network configuration for the IPv4 or IPv6 address family of the subnet specified for the SP: system service-processor network auto-configuration disable
• Manually configure the SP network for a node: system service-processor network modify
• Display the SP network configuration: system service-processor network show
  Displaying complete SP network details requires the -instance parameter. The output includes the following:
  ◦ The configured address family (IPv4 or IPv6) and whether it is enabled
  ◦ The remote management device type
  ◦ The current SP status and link status
  ◦ Network configuration, such as IP address, MAC address, netmask, prefix-length of subnet mask, router-assigned IP address, link local IP address, and gateway IP address
  ◦ The time the SP was last updated
  ◦ The name of the subnet used for SP automatic configuration
  ◦ Whether the IPv6 router-assigned IP address is enabled
  ◦ SP network setup status
  ◦ Reason for the SP network setup failure
• Modify the SP API service configuration: system service-processor api-service modify (advanced privilege level)
  This includes changing the port used by the SP API service and enabling or disabling the SP API service.
• Display the SP API service configuration: system service-processor api-service show
• Renew the SSL and SSH certificates used by the SP API service for internal communication:
  ◦ For ONTAP 9.5 or later: system service-processor api-service renew-internal-certificates
  ◦ For ONTAP 9.4 or earlier: system service-processor api-service renew-certificates
• Enable or disable the SP automatic firmware update: system service-processor image modify
• Manually download an SP firmware image on a node: system node image get
• Display the status for the latest SP firmware update triggered from ONTAP: system service-processor image update-progress show
• Block the specified IP addresses from accessing the SP: system service-processor ssh remove-allowed-addresses
• Display the IP addresses that can access the SP: system service-processor ssh show
• Display general SP information: system service-processor show
  Displaying complete SP information requires the -instance parameter. The output includes the following:
  ◦ The remote management device type
  ◦ The current SP status
  ◦ Whether the SP network is configured
  ◦ Network information, such as the public IP address and the MAC address
  ◦ The SP firmware version and Intelligent Platform Management Interface (IPMI) version
  ◦ Whether the SP firmware automatic update is enabled
• Generate and send an AutoSupport message that includes the SP log files collected from a specified node: system node autosupport invoke-splog
• Display the allocation map of the collected SP log files in the cluster, including the sequence numbers for the SP log files that reside in each collecting node: system service-processor log show-allocations
Related information
ONTAP command reference
• Display or modify the details of the currently installed BMC firmware image: system service-processor image show/modify
• Display the status for the latest BMC firmware update: system service-processor image update-progress show
• Enable the automatic network configuration for the BMC to use an IPv4 or IPv6 address on the specified subnet: system service-processor network auto-configuration enable
• Disable the automatic network configuration for an IPv4 or IPv6 address on the subnet specified for the BMC: system service-processor network auto-configuration disable
• Display the BMC automatic network configuration: system service-processor network auto-configuration show
For commands that are not supported by the BMC firmware, an error message is returned indicating that the command is not supported.
You can log into the BMC using SSH. The following commands are supported from the
BMC command line.
• system: Display a list of all commands.
• system power status: Print system power status.
• system fru show [id]: Dump all or selected field replaceable unit (FRU) info.
NTP is always enabled. However, configuration is still required for the cluster to synchronize with an external
time source. ONTAP enables you to manage the cluster’s NTP configuration in the following ways:
• You can associate a maximum of 10 external NTP servers with the cluster (cluster time-service
ntp server create).
◦ For redundancy and quality of time service, you should associate at least three external NTP servers
with the cluster.
◦ You can specify an NTP server by using its IPv4 or IPv6 address or fully qualified host name.
◦ You can manually specify the NTP version (v3 or v4) to use.
By default, ONTAP automatically selects the NTP version that is supported for a given external NTP
server.
If the NTP version you specify is not supported for the NTP server, time exchange cannot take place.
◦ At the advanced privilege level, you can specify an external NTP server that is associated with the
cluster to be the primary time source for correcting and adjusting the cluster time.
• You can display the NTP servers that are associated with the cluster (cluster time-service ntp
server show).
• You can modify the cluster’s NTP configuration (cluster time-service ntp server modify).
• You can disassociate the cluster from an external NTP server (cluster time-service ntp server
delete).
• At the advanced privilege level, you can reset the configuration by clearing all external NTP servers'
association with the cluster (cluster time-service ntp server reset).
A node that joins a cluster automatically adopts the NTP configuration of the cluster.
In addition to using NTP, ONTAP also enables you to manually manage the cluster time. This capability is
helpful when you need to correct erroneous time (for example, a node’s time has become significantly incorrect
after a reboot). In that case, you can specify an approximate time for the cluster until NTP can synchronize with
an external time server. The time you manually set takes effect across all nodes in the cluster.
You can manually manage the cluster time in the following ways:
• You can set or modify the time zone, date, and time on the cluster (cluster date modify).
• You can display the current time zone, date, and time settings of the cluster (cluster date show).
Job schedules do not adjust to manual cluster date and time changes. These jobs are
scheduled to run based on the current cluster time when the job was created or when the job
most recently ran. Therefore, if you manually change the cluster date or time, you must use the
job show and job history show commands to verify that all scheduled jobs are queued
and completed according to your requirements.
You use the cluster time-service ntp server commands to manage the NTP servers for the cluster.
You use the cluster date commands to manage the cluster time manually.
Beginning with ONTAP 9.5, you can configure your NTP server with symmetric authentication.
The following commands enable you to manage the NTP servers for the cluster:
• Associate the cluster with an external NTP server with symmetric authentication (available in ONTAP 9.5 or later): cluster time-service ntp server create -server server_ip_address -key-id key_id
• Enable symmetric authentication for an existing NTP server (an existing NTP server can be modified to enable authentication by adding the required key ID): cluster time-service ntp server modify -server server_name -key-id key_id
• Configure a shared NTP key: cluster time-service ntp key create -id shared_key_id -type shared_key_type -value shared_key_value
• Display information about the NTP servers that are associated with the cluster: cluster time-service ntp server show
• Modify the configuration of an external NTP server that is associated with the cluster: cluster time-service ntp server modify
• Dissociate an NTP server from the cluster: cluster time-service ntp server delete
• Reset the configuration by clearing all external NTP servers' association with the cluster: cluster time-service ntp server reset
  This command requires the advanced privilege level.
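For example, the following sketch associates an NTP server by using symmetric authentication; the server address, key ID, and key value are illustrative:

cluster1::> cluster time-service ntp key create -id 1 -type sha1 -value "<shared_key_value>"
cluster1::> cluster time-service ntp server create -server 10.10.10.1 -key-id 1
cluster1::> cluster time-service ntp server show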
The following commands enable you to manage the cluster time manually:
• Display the time zone, date, and time settings for the cluster: cluster date show
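For example, a sketch of checking and then adjusting the cluster time zone and time with cluster date modify (the values shown are illustrative):

cluster1::> cluster date show
cluster1::> cluster date modify -timezone America/New_York
cluster1::> cluster date modify -date "01/01/2025 12:00:00"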
Related information
ONTAP command reference
ONTAP enables you to configure a login banner or a message of the day (MOTD) to
communicate administrative information to CLI users of the cluster or storage virtual
machine (SVM).
A banner is displayed in a console session (for cluster access only) or an SSH session (for cluster or SVM
access) before a user is prompted for authentication such as a password. For example, you can use the
banner to display a warning message such as the following to someone who attempts to log in to the system:
$ ssh admin@cluster1-01
This system is for authorized users only. Your IP Address has been logged.
Password:
An MOTD is displayed in a console session (for cluster access only) or an SSH session (for cluster or SVM
access) after a user is authenticated but before the clustershell prompt appears. For example, you can use the
MOTD to display a welcome or informational message such as the following that only authenticated users will
see:
$ ssh admin@cluster1-01
Password:
You can create or modify the content of the banner or MOTD by using the security login banner
modify or security login motd modify command, respectively, in the following ways:
• You can use the CLI interactively or noninteractively to specify the text to use for the banner or MOTD.
The interactive mode, launched when the command is used without the -message or -uri parameter,
enables you to use newlines (also known as end of lines) in the message.
The noninteractive mode, which uses the -message parameter to specify the message string, does not
support newlines.
• You can upload content from an FTP or HTTP location to use for the banner or MOTD.
• You can configure the MOTD to display dynamic content.
Examples of what you can configure the MOTD to display dynamically include the cluster or SVM name, the login user name, and the time of the last successful login (see the escape sequences described in the security login motd modify man page).
The banner does not support dynamic content.
You can manage the banner and MOTD at the cluster or SVM level:
• The banner configured for the cluster is also used for all SVMs that do not have a banner defined. If a cluster-level banner has been configured, it is overridden by the SVM-level banner for the given SVM.
• The MOTD configured for the cluster is displayed by default for all SVM logins, along with any SVM-level MOTD. In this case, users logging in to the SVM will see two MOTDs, one defined at the cluster level and the other at the SVM level.
  ◦ The cluster-level MOTD can be enabled or disabled on a per-SVM basis by the cluster administrator. If the cluster administrator disables the cluster-level MOTD for an SVM, a user logging in to the SVM does not see the cluster-level MOTD.
Create a banner
You can create a banner to display a message to someone who attempts to access the
cluster or SVM. The banner is displayed in a console session (for cluster access only) or
an SSH session (for cluster or SVM access) before a user is prompted for authentication.
Steps
1. Use the security login banner modify command to create a banner for the cluster or SVM:
◦ To include newlines (also known as end of lines) in the message, use the command without the -message or -uri parameter to launch the interactive mode for editing the banner.
◦ To upload content from a location to use for the banner, use the -uri parameter to specify the content's FTP or HTTP location.
A banner created by using the -uri parameter is static. It is not automatically refreshed to reflect
subsequent changes of the source content.
The banner created for the cluster is displayed also for all SVMs that do not have an existing banner. Any
subsequently created banner for an SVM overrides the cluster-level banner for that SVM. Specifying the
-message parameter with a hyphen within double quotes ("-") for the SVM resets the SVM to use the
cluster-level banner.
2. Verify that the banner has been created by displaying it with the security login banner show
command.
Specifying the -message parameter with an empty string ("") displays banners that have no content.
Specifying the -message parameter with "-" displays all (admin or data) SVMs that do not have a banner
configured.
The following example uses the interactive mode to create a banner for the "svm1" SVM:
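A session similar to the following sketch creates the banner; the interactive prompt text shown here is approximate:

cluster1::> security login banner modify -vserver svm1

Enter the login banner for Vserver "svm1".
Enter a blank line to terminate input. Press Ctrl-C to abort.

The svm1 SVM is reserved for authorized users only!

cluster1::>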
The following example displays the banners that have been created:
cluster1::> security login banner show
Vserver: cluster1
Message
--------------------------------------------------------------------------
---
Authorized users only!
Vserver: svm1
Message
--------------------------------------------------------------------------
---
The svm1 SVM is reserved for authorized users only!
cluster1::>
Related information
Managing the banner
You can manage the banner at the cluster or SVM level. The banner configured for the
cluster is also used for all SVMs that do not have a banner message defined. A
subsequently created banner for an SVM overrides the cluster banner for that SVM.
Choices
• Manage the banner at the cluster level:
  ◦ Remove the banner for all (cluster and SVM) logins by setting the cluster-level banner to an empty string (""):
    security login banner modify -vserver cluster_name -message ""
  ◦ Override a banner created by an SVM administrator by modifying the SVM banner message.
• Manage the banner at the SVM level:
  ◦ Suppress the banner supplied by the cluster administrator so that no banner is displayed for the SVM by setting the SVM banner to an empty string:
    security login banner modify -vserver svm_name -message ""
  ◦ Use the cluster-level banner when the SVM currently uses an SVM-level banner by setting the SVM banner to "-":
    security login banner modify -vserver svm_name -message "-"
Create an MOTD
Steps
1. Use the security login motd modify command to create an MOTD for the cluster or SVM:
   ◦ To include newlines (also known as end of lines), use the command without the -message or -uri parameter to launch the interactive mode for editing the MOTD.
   ◦ To upload content from a location to use for the MOTD, use the -uri parameter to specify the content's FTP or HTTP location.
The security login motd modify man page describes the escape sequences that you can use to
enable the MOTD to display dynamically generated content.
An MOTD created by using the -uri parameter is static. It is not automatically refreshed to reflect
subsequent changes of the source content.
An MOTD created for the cluster is displayed also for all SVM logins by default, along with an SVM-level
MOTD that you can create separately for a given SVM. Setting the -is-cluster-message-enabled
parameter to false for an SVM prevents the cluster-level MOTD from being displayed for that SVM.
2. Verify that the MOTD has been created by displaying it with the security login motd show
command.
Specifying the -message parameter with an empty string ("") displays MOTDs that are not configured or
have no content.
See the security login motd modify command man page for a list of parameters to use to enable the MOTD
to display dynamically generated content. Be sure to check the man page specific to your ONTAP version.
The following example uses the interactive mode to create an MOTD for the "svm1" SVM that uses escape sequences to display dynamically generated content:
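A session similar to the following sketch creates the MOTD; the interactive prompt text shown here is approximate:

cluster1::> security login motd modify -vserver svm1

Enter the message of the day for Vserver "svm1".
Enter a blank line to terminate input. Press Ctrl-C to abort.

Welcome to the \n SVM. Your user ID is '\N'. Your last successful login was \L.

cluster1::>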
The following example displays the MOTDs that have been created:
cluster1::> security login motd show
Vserver: cluster1
Is the Cluster MOTD Displayed?: true
Message
--------------------------------------------------------------------------
---
Greetings!
Vserver: svm1
Is the Cluster MOTD Displayed?: true
Message
--------------------------------------------------------------------------
---
Welcome to the \n SVM. Your user ID is '\N'. Your last successful login
was \L.
You can manage the message of the day (MOTD) at the cluster or SVM level. By default,
the MOTD configured for the cluster is also enabled for all SVMs. Additionally, an SVM-
level MOTD can be configured for each SVM. The cluster-level MOTD can be enabled or
disabled for each SVM by the cluster administrator.
For a list of escape sequences that can be used to dynamically generate content for the MOTD, see the
command reference.
Choices
• Manage the MOTD at the cluster level:
  ◦ Change the MOTD for all logins when no SVM-level MOTDs are configured by modifying the cluster-level MOTD:
    security login motd modify -vserver cluster_name { [-message "text"] | [-uri ftp_or_http_addr] }
  ◦ Remove the MOTD for all logins when no SVM-level MOTDs are configured by setting the cluster-level MOTD to an empty string (""):
    security login motd modify -vserver cluster_name -message ""
  ◦ Have every SVM display the cluster-level MOTD instead of using the SVM-level MOTD by setting a cluster-level MOTD, then setting all SVM-level MOTDs to an empty string with the cluster-level MOTD enabled.
  ◦ Have an MOTD displayed for only selected SVMs, and use no cluster-level MOTD, by setting the cluster-level MOTD to an empty string and then setting SVM-level MOTDs for the selected SVMs.
  ◦ Use the same SVM-level MOTD for all (data and admin) SVMs by setting the cluster and all SVMs to use the same MOTD.
  ◦ Have a cluster-level MOTD optionally available to all SVMs, but not displayed for cluster logins, by setting a cluster-level MOTD and disabling its display for the cluster:
    security login motd modify -vserver cluster_name { [-message "text"] | [-uri ftp_or_http_addr] } -is-cluster-message-enabled false
  ◦ Remove all MOTDs at the cluster and SVM levels when only some SVMs have both cluster-level and SVM-level MOTDs by setting the cluster and all SVMs to use an empty string for the MOTD:
    security login motd modify -vserver * -message ""
  ◦ Modify the MOTD only for the SVMs that have a non-empty string, when other SVMs use an empty string, and when a different MOTD is used at the cluster level, by using extended queries to modify the MOTD selectively:
    security login motd modify { -vserver !"cluster_name" -message !"" } { [-message "text"] | [-uri ftp_or_http_addr] }
  ◦ Display all MOTDs that contain specific text (for example, "January" followed by "2015") anywhere in a single or multiline message, even if the text is split across different lines, by using a query to display MOTDs:
    security login motd show -message *"January"*"2015"*
  ◦ Interactively create an MOTD that includes multiple and consecutive newlines (also known as end of lines, or EOLs): in the interactive mode, press the space bar followed by Enter to create a blank line without terminating the input for the MOTD.
• Manage the MOTD at the SVM level:
  ◦ Use only the cluster-level MOTD for the SVM, when the SVM already has an SVM-level MOTD, by setting the SVM-level MOTD to an empty string and then having the cluster administrator enable the cluster-level MOTD for the SVM.
  ◦ Not have the SVM display any MOTD, when both the cluster-level and SVM-level MOTDs are currently displayed for the SVM, by setting the SVM-level MOTD to an empty string and then having the cluster administrator disable the cluster-level MOTD for the SVM.
Job categories
There are three categories of jobs that you can manage: server-affiliated, cluster-affiliated, and private.
• Server-Affiliated jobs
These jobs are queued by the management framework to a specific node to be run.
• Cluster-Affiliated jobs
These jobs are queued by the management framework to any node in the cluster to be run.
• Private jobs
These jobs are specific to a node and do not use the replicated database (RDB) or any other cluster
mechanism. The commands that manage private jobs require the advanced privilege level or higher.
Commands for managing jobs
When you enter a command that invokes a job, typically, the command informs you that the job has been
queued and then returns to the CLI command prompt. However, some commands instead report job progress
and do not return to the CLI command prompt until the job has been completed. In these cases, you can press
Ctrl-C to move the job to the background.
• Display the list of private jobs: job private show (advanced privilege level)
• Display information about completed private jobs: job private show-completed (advanced privilege level)
• Display information about the initialization state for job managers: job initstate show (advanced privilege level)
• Resume a paused private job: job private resume (advanced privilege level)
• Stop a job: job stop
You can use the event log show command to determine the outcome of a completed job.
Related information
ONTAP command reference
Job schedules do not adjust to manual changes to the cluster date and time. These jobs are scheduled to run
based on the current cluster time when the job was created or when the job most recently ran. Therefore, if you
manually change the cluster date or time, you should use the job show and job history show commands
to verify that all scheduled jobs are queued and completed according to your requirements.
If the cluster is part of a MetroCluster configuration, then the job schedules on both clusters must be identical.
Therefore, if you create, modify, or delete a job schedule, you must perform the same operation on the remote
cluster.
• Create a cron schedule: job schedule cron create
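For example, the following sketch creates a cron schedule that runs every Saturday at 3:00 a.m.; the schedule name and timing are illustrative:

cluster1::> job schedule cron create -name weekly_saturday_3am -dayofweek "Saturday" -hour 3 -minute 0
cluster1::> job schedule cron show -name weekly_saturday_3am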
Related information
ONTAP command reference
Configuration backup files are archive files (.7z) that contain information for all
configurable options that are necessary for the cluster, and the nodes within it, to operate
properly.
These files store the local configuration of each node, plus the cluster-wide replicated configuration. You use
configuration backup files to back up and restore the configuration of your cluster.
Each healthy node in the cluster includes a node configuration backup file, which contains all of the configuration information and metadata necessary for the node to operate as a healthy member of the cluster.
Cluster configuration backup files include an archive of all of the node configuration backup files in the cluster, plus the replicated cluster configuration information (the replicated database, or RDB file). Cluster configuration backup files enable you to restore the configuration of the entire cluster, or of any node in the cluster. The cluster configuration backup schedules create these files automatically and store them on several nodes in the cluster.
Configuration backup files contain configuration information only. They do not include any user
data. For information about restoring user data, see Data Protection.
Three separate schedules automatically create cluster and node configuration backup
files and replicate them among the nodes in the cluster.
The configuration backup files are automatically created according to the following schedules:
• Every 8 hours
• Daily
• Weekly
At each of these times, a node configuration backup file is created on each healthy node in the cluster. All of
these node configuration backup files are then collected in a single cluster configuration backup file along with
the replicated cluster configuration and saved on one or more nodes in the cluster.
You can use the system configuration backup settings commands to manage
configuration backup schedules.
These commands are available at the advanced privilege level.
• Change the settings for a configuration backup schedule: system configuration backup settings modify
  This includes the following:
  ◦ Specifying a remote URL (HTTP, HTTPS, FTP, FTPS, or TFTP) where the configuration backup files will be uploaded in addition to the default locations in the cluster
  ◦ Specifying a user name to be used to log in to the remote URL
  ◦ Setting the number of backups to keep for each configuration backup schedule
  When you use HTTPS in the remote URL, use the -validate-certification option to enable or disable digital certificate validation. Certificate validation is disabled by default.
  The web server to which you are uploading the configuration backup file must have PUT operations enabled for HTTP and POST operations enabled for HTTPS. For more information, see your web server's documentation.
• Set the password to be used to log in to the remote URL: system configuration backup settings set-password
• View the settings for the configuration backup schedule: system configuration backup settings show
You use the system configuration backup commands to manage cluster and node
configuration backup files.
These commands are available at the advanced privilege level.
• Copy a configuration backup file from a node to another node in the cluster: system configuration backup copy
• Upload a configuration backup file from a node in the cluster to a remote URL (FTP, HTTP, HTTPS, TFTP, or FTPS): system configuration backup upload
  When you use HTTPS in the remote URL, use the -validate-certification option to enable or disable digital certificate validation. Certificate validation is disabled by default.
• Download a configuration backup file from a remote URL to a node in the cluster, and, if specified, validate the digital certificate: system configuration backup download
  When you use HTTPS in the remote URL, use the -validate-certification option to enable or disable digital certificate validation. Certificate validation is disabled by default.
• Rename a configuration backup file on a node in the cluster: system configuration backup rename
• View the node and cluster configuration backup files for one or more nodes in the cluster: system configuration backup show
You use a configuration backup file located at a remote URL or on a node in the cluster to
recover a node configuration.
About this task
You can use either a cluster or node configuration backup file to restore a node configuration.
Step
1. Make the configuration backup file available to the node for which you need to restore the configuration.
If you previously re-created the cluster, you should choose a configuration backup file that was created
after the cluster recreation. If you must use a configuration backup file that was created prior to the cluster
recreation, then after recovering the node, you must re-create the cluster again.
Restore the node configuration using a configuration backup file
You restore the node configuration using the configuration backup file that you identified
and made available to the recovering node.
About this task
You should only perform this task to recover from a disaster that resulted in the loss of the node’s local
configuration files.
Steps
1. Change to the advanced privilege level:
   set -privilege advanced
2. If the node is healthy, then at the advanced privilege level of a different node, use the cluster modify
command with the -node and -eligibility parameters to mark it ineligible and isolate it from the
cluster.
If the node is not healthy, then you should skip this step.
This example modifies node2 to be ineligible to participate in the cluster so that its configuration can be
restored:
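A sketch of that command (the prompt reflects the advanced privilege level set in step 1):

cluster1::*> cluster modify -node node2 -eligibility false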
3. Use the system configuration recovery node restore command at the advanced privilege level
to restore the node’s configuration from a configuration backup file.
If the node lost its identity, including its name, then you should use the -nodename-in-backup
parameter to specify the node name in the configuration backup file.
This example restores the node’s configuration using one of the configuration backup files stored on the
node:
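A sketch of the restore command; the backup file name shown is illustrative and the warning text is abridged:

cluster1::*> system configuration recovery node restore -backup cluster1.8hour.2024-02-22.18_15_00.7z

Warning: This command overwrites local configuration files with files
         contained in the specified backup file. Do you want to
         continue? {y|n}: y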
4. If you marked the node ineligible, then use the system configuration recovery cluster sync
command to mark the node as eligible and synchronize it with the cluster.
5. If you are operating in a SAN environment, use the system node reboot command to reboot the node
and reestablish SAN quorum.
You use the configuration from either a node in the cluster or a cluster configuration
backup file to recover a cluster.
Steps
1. Choose a type of configuration to recover the cluster.
◦ A node in the cluster
If the cluster consists of more than one node, and one of the nodes has a cluster configuration from
when the cluster was in the desired configuration, then you can recover the cluster using the
configuration stored on that node.
In most cases, the node containing the replication ring with the most recent transaction ID is the best
node to use for restoring the cluster configuration. The cluster ring show command at the
advanced privilege level enables you to view a list of the replicated rings available on each node in the
cluster.
If you cannot identify a node with the correct cluster configuration, or if the cluster consists of a single
node, then you can use a cluster configuration backup file to recover the cluster.
If you are recovering the cluster from a configuration backup file, any configuration changes made
since the backup was taken will be lost. You must resolve any discrepancies between the configuration
backup file and the present configuration after recovery. See Knowledge Base article ONTAP
Configuration Backup Resolution Guide for troubleshooting guidance.
2. If you chose to use a cluster configuration backup file, then make the file available to the node you plan to
use to recover the cluster.
If the configuration backup file is located on a node in the cluster:
a. Use the system configuration backup show command at the advanced privilege level to find a cluster configuration backup file that was created when the cluster was in the desired configuration.
b. If the cluster configuration backup file is not located on the node you plan to use to recover the cluster, then use the system configuration backup copy command to copy it to the recovering node.
To restore a cluster configuration from an existing configuration after a cluster failure, you
re-create the cluster using the cluster configuration that you chose and made available to
the recovering node, and then rejoin each additional node to the new cluster.
About this task
You should only perform this task to recover from a disaster that resulted in the loss of the cluster’s
configuration.
If you are re-creating the cluster from a configuration backup file, you must contact technical
support to resolve any discrepancies between the configuration backup file and the configuration
present in the cluster.
If you are recovering the cluster from a configuration backup file, any configuration changes
made since the backup was taken will be lost. You must resolve any discrepancies between the
configuration backup file and the present configuration after recovery. See the Knowledge Base
article ONTAP Configuration Backup Resolution Guide for troubleshooting guidance.
Steps
1. Disable storage failover for each HA pair:
   storage failover modify -node node_name -enabled false
   You only need to disable storage failover once for each HA pair. When you disable storage failover for a node, storage failover is also disabled on the node's partner.
2. Halt each node except the recovering node:
   system node halt -node node_name -reason "text"
   Warning: Are you sure you want to halt the node? {y|n}: y
3. Set the privilege level to advanced:
   set -privilege advanced
4. On the recovering node, use the system configuration recovery cluster recreate command
to re-create the cluster.
This example re-creates the cluster using the configuration information stored on the recovering node:
5. If you are re-creating the cluster from a configuration backup file, verify that the cluster recovery is still in
progress:
You do not need to verify the cluster recovery state if you are re-creating the cluster from a healthy node.
7. For each node that needs to be joined to the re-created cluster, do the following:
a. From a healthy node on the re-created cluster, rejoin the target node:
This example rejoins the “node2” target node to the re-created cluster:
cluster1::*> system configuration recovery cluster rejoin -node node2
Warning: This command will rejoin node "node2" into the local
cluster, potentially overwriting critical cluster
configuration files. This command should only be used
to recover from a disaster. Do not perform any other
recovery operations while this operation is in progress.
This command will cause node "node2" to reboot.
Do you want to continue? {y|n}: y
b. Verify that the target node is healthy and has formed quorum with the rest of the nodes in the cluster:
The target node must rejoin the re-created cluster before you can rejoin another node.
8. If you re-created the cluster from a configuration backup file, set the recovery status to be complete:
10. If the cluster consists of only two nodes, use the cluster ha modify command to reenable cluster HA.
11. Use the storage failover modify command to reenable storage failover for each HA pair.
If cluster-wide quorum exists, but one or more nodes are out of sync with the cluster, then
you must synchronize the node to restore the replicated database (RDB) on the node and
bring it into quorum.
Step
1. From a healthy node, use the system configuration recovery cluster sync command at the
advanced privilege level to synchronize the node that is out of sync with the cluster configuration.
This example synchronizes a node (node2) with the rest of the cluster:
Warning: This command will synchronize node "node2" with the cluster
configuration, potentially overwriting critical cluster
configuration files on the node. This feature should only be
used to recover from a disaster. Do not perform any other
recovery operations while this operation is in progress. This
command will cause all the cluster applications on node
"node2" to restart, interrupting administrative CLI and Web
interface on that node.
Do you want to continue? {y|n}: y
All cluster applications on node "node2" will be restarted. Verify that
the cluster applications go online.
Result
The RDB is replicated to the node, and the node becomes eligible to participate in the cluster.
Core dump files and reports are stored in the /mroot/etc/crash/ directory of a node. You can display
the directory content by using the system node coredump commands or a web browser.
You can save the core dump content and upload the saved file to a specified location or to technical support. ONTAP prevents you from initiating the saving of a core dump file during a takeover, an aggregate relocation, or a giveback.
You use the system node coredump config commands to manage the configuration of core dumps, the
system node coredump commands to manage the core dump files, and the system node coredump
reports commands to manage application core reports.
• Configure core dumps: system node coredump config modify
• Display the configuration settings for core dumps: system node coredump config show
• Display basic information about core dumps: system node coredump show
• Manually trigger a core dump when you reboot a node: system node reboot with both the -dump and -skip-lif-migration-before-reboot parameters
  The -skip-lif-migration-before-reboot parameter specifies that LIF migration prior to a reboot will be skipped.
• Manually trigger a core dump when you shut down a node: system node halt with both the -dump and -skip-lif-migration-before-shutdown parameters
  The -skip-lif-migration-before-shutdown parameter specifies that LIF migration prior to a shutdown will be skipped.
• Save all unsaved core dumps that are on a specified node: system node coredump save-all
• Generate and send an AutoSupport message with a core dump file you specify: system node autosupport invoke-core-upload
• Display status information about core dumps: system node coredump status
• Delete all unsaved core dumps or all saved core files on a node: system node coredump delete-all
• Display application core dump reports: system node coredump reports show
• Delete an application core dump report: system node coredump reports delete
Related information
ONTAP command reference
Local tiers (also called aggregates) are containers for the disks managed by a node. You can use local tiers to
isolate workloads with different performance demands, to tier data with different access patterns, or to
segregate data for regulatory purposes.
• For business-critical applications that need the lowest possible latency and the highest possible
performance, you might create a local tier consisting entirely of SSDs.
• To tier data with different access patterns, you can create a hybrid local tier, deploying flash as high-
performance cache for a working data set, while using lower-cost HDDs or object storage for less
frequently accessed data.
◦ A Flash Pool consists of both SSDs and HDDs.
◦ A FabricPool consists of an all-SSD local tier with an attached object store.
• If you need to segregate archived data from active data for regulatory purposes, you can use a local tier
consisting of capacity HDDs, or a combination of performance and capacity HDDs.
Working with local tiers (aggregates)
Related information
• Manage FabricPool cloud tiers
You can use System Manager or the ONTAP CLI to add local tiers (aggregates), manage
their usage, and add capacity (disks) to them.
You can perform the following tasks:
To add a local tier, you follow a specific workflow. You determine the number of disks or disk partitions that
you need for the local tier and decide which method to use to create the local tier. You can add local tiers
automatically by letting ONTAP assign the configuration, or you can manually specify the configuration.
For existing local tiers, you can rename them, set their media costs, or determine their drive and RAID
group information. You can modify the RAID configuration of a local tier and assign local tiers to storage
VMs (SVMs).
You can determine which volumes reside on a local tier and how much space they use on a local tier. You can control how much space volumes can use. You can relocate local tier ownership within an HA pair. You can also delete a local tier.
System Manager workflow
Use System Manager to add (create) a local tier
System Manager creates local tiers based on recommended best practices for configuring local tiers.
Beginning with ONTAP 9.11.1, you can decide to configure local tiers manually if you want a different
configuration than the one recommended during the automatic process to add a local tier.
CLI workflow
Use the CLI to add (create) an aggregate
Beginning with ONTAP 9.2, ONTAP can provide recommended configurations when you create
aggregates (auto-provisioning). If the recommended configurations, based on best practices, are
appropriate in your environment, you can accept them to create the aggregates. Otherwise, you can
create aggregates manually.
Determine the number of disks or disk partitions required for a local tier (aggregate)
You must have enough disks or disk partitions in your local tier (aggregate) to meet
system and business requirements. You should also have the recommended number of
hot spare disks or hot spare disk partitions to minimize the potential of data loss.
Root-data partitioning is enabled by default on certain configurations. Systems with root-data partitioning
enabled use disk partitions to create local tiers. Systems that do not have root-data partitioning enabled use
unpartitioned disks.
You must have enough disks or disk partitions to meet the minimum number required for your RAID policy and
enough to meet your minimum capacity requirements.
In ONTAP, the usable space of the drive is less than the physical capacity of the drive. You can
find the usable space of a specific drive and the minimum number of disks or disk partitions
required for each RAID policy in the Hardware Universe.
The procedure you follow depends on the interface you use—System Manager or the CLI:
System Manager
Use System Manager to determine usable space of disks
Steps
1. Go to Storage > Tiers
2. Click next to the name of the local tier.
3. Select the Disk Information tab.
CLI
Use the CLI to determine usable space of disks
Step
1. Display spare disk information:
   storage aggregate show-spare-disks
In addition to the number of disks or disk partitions necessary to create your RAID group and meet your
capacity requirements, you should also have the minimum number of hot spare disks or hot spare disk
partitions recommended for your aggregate:
• For all flash aggregates, you should have a minimum of one hot spare disk or disk partition.
The AFF C190 defaults to no spare drive. This exception is fully supported.
• For non-flash homogenous aggregates, you should have a minimum of two hot spare disks or disk
partitions.
• For SSD storage pools, you should have a minimum of one hot spare disk for each HA pair.
• For Flash Pool aggregates, you should have a minimum of two spare disks for each HA pair. You can find
more information on the supported RAID policies for Flash Pool aggregates in the Hardware Universe.
• To support the use of the Maintenance Center and to avoid issues caused by multiple concurrent disk
failures, you should have a minimum of four hot spares in multi-disk carriers.
Related information
NetApp Hardware Universe
You can let ONTAP add local tiers automatically by accepting its recommended configurations, or you can add local tiers manually.
When a local tier is created automatically, ONTAP analyzes available spare disks in the cluster and generates
a recommendation about how spare disks should be used to add local tiers according to best practices.
ONTAP displays the recommended configurations. You can accept the recommendations or add the local tiers
manually.
If any of the following disk conditions are present, they must be addressed before accepting the
recommendations from ONTAP:
• Missing disks
• Fluctuation in spare disk numbers
• Unassigned disks
• Non-zeroed spares
• Disks undergoing maintenance testing
The storage aggregate auto-provision man page contains more information about these
requirements.
In many cases, the recommended layout of the local tier will be optimal for your environment. However, if your
cluster is running ONTAP 9.1 or earlier, or your environment includes the following configurations, you must
create the local tier using the manual method.
Beginning with ONTAP 9.11.1, you can manually add local tiers with System Manager.
Related information
• ONTAP command reference
Add local tiers automatically (create aggregates with auto-provisioning)
If the best-practice recommendation that ONTAP provides for automatically adding a local
tier (creating an aggregate with auto-provisioning)
is appropriate in your environment, you can accept the recommendation and let ONTAP
add the local tier.
Before you begin
Disks must be owned by a node before they can be used in a local tier (aggregate). If your cluster is not
configured to use automatic disk ownership assignment, you must assign ownership manually.
System Manager
Steps
1. In System Manager, click Storage > Tiers.
2. From the Tiers page, click + Add Local Tier to create a new local tier:
The Add Local Tier page shows the recommended number of local tiers that can be created on the
nodes and the usable storage available.
System Manager displays the following information beginning with ONTAP 9.8:
◦ Local tier name (you can edit the local tier name beginning with ONTAP 9.10.1)
◦ Node name
◦ Usable size
◦ Type of storage
Beginning with ONTAP 9.10.1, additional information is displayed:
If you want to manually configure the local tiers and not use the recommendations from System Manager, proceed to Add a local tier (create aggregate) manually.
5. (Optional): If the Onboard Key Manager has been installed, you can configure it for encryption.
Check the Configure Onboard Key Manager for encryption check box.
a. Enter a passphrase.
b. Enter the passphrase again to confirm it.
c. Save the passphrase for future use in case the system needs to be recovered.
d. Back up the key database for future use.
6. Click Save to create the local tier and add it to your storage solution.
CLI
You run the storage aggregate auto-provision command to generate aggregate layout
recommendations. You can then create aggregates after reviewing and approving ONTAP
recommendations.
You can also display a detailed summary by using the -verbose option, which displays the following
reports:
• Per node summary of new aggregates to create, discovered spares, and remaining spare disks and
partitions after aggregate creation
• New data aggregates to create with counts of disks and partitions to be used
• RAID group layout showing how spare disks and partitions will be used in new data aggregates to be
created
• Details about spare disks and partitions remaining after aggregate creation
If you are familiar with the auto-provision method and your environment is correctly prepared, you can use
the -skip-confirmation option to create the recommended aggregate without display and
confirmation. The storage aggregate auto-provision command is not affected by the CLI
session -confirmations setting.
The storage aggregate auto-provision man page contains more information about the
aggregate layout recommendations.
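As an illustration, the invocations described in the steps below look like this (treat them as a sketch; output varies by configuration):
storage aggregate auto-provision
storage aggregate auto-provision -verbose
storage aggregate auto-provision -skip-confirmation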
Steps
1. Run the storage aggregate auto-provision command with the desired display options.
◦ no options: Display standard summary
◦ -verbose option: Display detailed summary
◦ -skip-confirmation option: Create recommended aggregates without display or confirmation
2. Perform one of the following steps:
If you want to accept the recommendations from ONTAP, review the display of recommended aggregates, and then respond to the prompt to create the recommended aggregates.
If you want to manually configure the local tiers and not use the recommendations from ONTAP, proceed to Add a local tier (create aggregate) manually.
Related information
• ONTAP command reference
If you do not want to add a local tier (create an aggregate) using the best-practice
recommendations from ONTAP, you can perform the process manually.
Before you begin
Disks must be owned by a node before they can be used in a local tier (aggregate). If your cluster is not
configured to use automatic disk ownership assignment, you must assign ownership manually.
System Manager
Beginning with ONTAP 9.11.1, if you do not want to use the configuration recommended by System
Manager to create a local tier, you can specify the configuration you want.
Steps
1. In System Manager, click Storage > Tiers.
2. From the Tiers page, click + Add Local Tier to create a new local tier:
The Add Local Tier page shows the recommended number of local tiers that can be created on the
nodes and the usable storage available.
3. When System Manager displays the storage recommendation for the local tier, click Switch to
Manual Local Tier Creation in the Spare Disks section.
The Add Local Tier page displays fields that you use to configure the local tier.
4. In the first section of the Add Local Tier page, complete the following:
a. Enter the name of the local tier.
b. (Optional): Check the Mirror this local tier check box if you want to mirror the local tier.
c. Select a disk type.
d. Select the number of disks.
5. In the RAID Configuration section, complete the following:
a. Select the RAID type.
b. Select the RAID group size.
c. Click RAID allocation to view how the disks are allocated in the group.
6. (Optional): If the Onboard Key Manager has been installed, you can configure it for encryption in the
Encryption section of the page. Check the Configure Onboard Key Manager for encryption check
box.
a. Enter a passphrase.
b. Enter the passphrase again to confirm it.
c. Save the passphrase for future use in case the system needs to be recovered.
d. Back up the key database for future use.
7. Click Save to create the local tier and add it to your storage solution.
CLI
Before you create aggregates manually, you should review disk configuration options and simulate
creation.
Then you can issue the storage aggregate create command and verify the results.
configuration, it is recommended that your data partitions be assigned to different nodes.
The procedure for creating aggregates on systems with root-data partitioning and root-data-data
partitioning enabled is the same as the procedure for creating aggregates on systems using unpartitioned
disks. If root-data partitioning is enabled on your system, you should use the number of disk partitions for
the -diskcount option. For root-data-data partitioning, the -diskcount option specifies the count of
disks to use.
When creating multiple aggregates for use with FlexGroups, aggregates should be as
close in size as possible.
The storage aggregate create man page contains more information about aggregate creation
options and requirements.
Steps
1. View the list of spare disk partitions to verify that you have enough to create your aggregate:
Data partitions are displayed under Local Data Usable. A root partition cannot be used as a
spare.
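The simulation referenced in step 3 can be run with the -simulate option of storage aggregate create; a minimal sketch, assuming a hypothetical aggregate named aggr_data1 owned by node1:
storage aggregate create -aggregate aggr_data1 -node node1 -raidtype raid_dp -diskcount 10 -simulate true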
3. If any warnings are displayed from the simulated command, adjust the command and repeat the
simulation.
4. Create the aggregate:
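For example, omitting -simulate from the sketch above performs the actual creation (all names are hypothetical):
storage aggregate create -aggregate aggr_data1 -node node1 -raidtype raid_dp -diskcount 10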
Related information
• ONTAP command reference
After you have created local tiers (aggregates), you can manage how they are used.
You can perform the following tasks:
• Determine drive and RAID group information for a local tier (aggregate)
• Assign local tiers (aggregates) to storage VMs (SVMs)
• Determine which volumes reside on a local tier (aggregate)
• Determine and control a volume’s space usages in a local tier (aggregate)
• Determine space usage in a local tier (aggregate)
• Relocate local tier (aggregate) ownership within an HA pair
• Delete a local tier (aggregate)
You can rename a local tier (aggregate). The method you follow depends on the interface
you use—System Manager or the CLI:
System Manager
Use System Manager to rename a local tier (aggregate)
Beginning with ONTAP 9.10.1, you can modify the name of a local tier (aggregate).
Steps
1. In System Manager, click Storage > Tiers.
2. Click the menu icon next to the name of the local tier.
3. Select Rename.
4. Specify a new name for the local tier.
CLI
Use the CLI to rename a local tier (aggregate)
Step
1. Using the CLI, rename the local tier (aggregate):
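A minimal sketch, assuming an existing local tier named aggr1 being renamed to aggr_data1:
storage aggregate rename -aggregate aggr1 -newname aggr_data1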
Beginning with ONTAP 9.11.1, you can use System Manager to set the media cost of a
local tier (aggregate).
Steps
1. In System Manager, click Storage > Tiers, then click Set Media Cost in the desired local tier (aggregate)
tiles.
2. Select active and inactive tiers to enable comparison.
3. Enter a currency type and amount.
When you enter or change the media cost, the change is made in all media types.
On systems freshly installed with ONTAP 9.4 or later and systems reinitialized with
ONTAP 9.4 or later, fast zeroing is used to zero drives.
With fast zeroing, drives are zeroed in seconds. This is done automatically before provisioning and greatly
reduces the time it takes to initialize the system, create aggregates, or expand aggregates when spare drives
are added.
Fast zeroing is not supported on systems upgraded from ONTAP 9.3 or earlier. ONTAP 9.4 or
later must be freshly installed or the system must be reinitialized. In ONTAP 9.3 and earlier,
drives are also automatically zeroed by ONTAP; however, the process takes longer.
If you need to manually zero a drive, you can use one of the following methods. In ONTAP 9.4 and later,
manually zeroing a drive also takes only seconds.
CLI command
Use a CLI command to fast-zero drives
Steps
1. Enter the CLI command:
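For example, the storage disk zerospares command zeroes the non-zeroed spare drives:
storage disk zerospares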
Steps
1. From the boot menu, select one of the following options:
◦ (4) Clean configuration and initialize all disks
◦ (9a) Unpartition all disks and remove their ownership information
◦ (9b) Clean configuration and initialize node with whole disks
Disks must be owned by a node before they can be used in a local tier (aggregate).
About this task
• If you are manually assigning ownership in an HA pair that is not being initialized and does not have only
DS460C shelves, use option 1.
• If you are initializing an HA pair that has only DS460C shelves, use option 2 to manually assign ownership
for the root drives.
Option 1: Most HA pairs
For an HA pair that is not being initialized and does not have only DS460C shelves, use this procedure to
manually assign ownership.
Steps
1. Use the CLI to display all unowned disks:
You can use the wildcard character to assign more than one disk at once. If you are reassigning a
spare disk that is already owned by a different node, you must use the “-force” option.
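A minimal sketch of displaying unowned disks and then assigning one, using hypothetical disk and node names:
storage disk show -container-type unassigned
storage disk assign -disk 1.0.3 -owner node1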
Option 2: An HA pair with only DS460C shelves
For an HA pair that you are initializing and that only has DS460C shelves, use this procedure to manually
assign ownership for the root drives.
After HA pair initialization (boot up), automatic assignment of disk ownership is automatically enabled
and uses the half-drawer policy to assign ownership to the remaining drives (other than the root
drives) and any drives added in the future, such as replacing failed disks, responding to a “low
spares” message, or adding capacity.
Learn about the half-drawer policy in the topic About automatic assignment of disk ownership.
• RAID requires a minimum of 10 drives for each HA pair (5 for each node) when using NL-SAS drives
larger than 8 TB in a DS460C shelf.
Steps
1. If your DS460C shelves are not fully populated, complete the following substeps; otherwise, go to the
next step.
a. First, install drives in the front row (drive bays 0, 3, 6, and 9) of each drawer.
Installing drives in the front row of each drawer allows for proper air flow and prevents
overheating.
b. For the remaining drives, evenly distribute them across each drawer.
Fill drawer rows from front to back. If you don’t have enough drives to fill rows, then install them in
pairs so that drives occupy the left and right side of a drawer evenly.
The following illustration shows the drive bay numbering and locations in a DS460C drawer.
2. Log into the clustershell using the node-management LIF or cluster-management LIF.
3. Manually assign the root drives in each drawer to conform to the half-drawer policy using the following
substeps:
The half-drawer policy has you assign the left half of a drawer’s drives (bays 0 to 5) to node A, and
the right half of a drawer’s drives (bays 6 to 11) to node B.
You can use the wildcard character to assign more than one disk at a time.
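A sketch of the half-drawer assignment, using hypothetical drive IDs for bays 0 and 6 of a drawer and nodes named node_A and node_B:
storage disk assign -disk 1.1.0 -owner node_A
storage disk assign -disk 1.1.6 -owner node_B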
Determine drive and RAID group information for a local tier (aggregate)
Some local tier (aggregate) administration tasks require that you know what types of
drives compose the local tier, their size, checksum, and status, whether they are shared
with other local tiers, and the size and composition of the RAID groups.
Step
1. Show the drives for the aggregate, by RAID group:
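A minimal sketch, assuming a local tier named aggr1:
storage aggregate show-status -aggregate aggr1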
The drives are displayed for each RAID group in the aggregate.
You can see the RAID type of the drive (data, parity, dparity) in the Position column. If the Position
column displays shared, then the drive is shared: if it is an HDD, it is a partitioned disk; if it is an SSD, it is
part of a storage pool.
Example: A Flash Pool aggregate using an SSD storage pool and data partitions
Usable Physical
Position Disk Pool Type RPM Size Size Status
-------- ---------- ---- ----- ------ -------- -------- -------
shared 2.0.1 0 SAS 10000 472.9GB 547.1GB (normal)
shared 2.0.3 0 SAS 10000 472.9GB 547.1GB (normal)
shared 2.0.5 0 SAS 10000 472.9GB 547.1GB (normal)
shared 2.0.7 0 SAS 10000 472.9GB 547.1GB (normal)
shared 2.0.9 0 SAS 10000 472.9GB 547.1GB (normal)
shared 2.0.11 0 SAS 10000 472.9GB 547.1GB (normal)
Usable Physical
Position Disk Pool Type RPM Size Size Status
-------- ---------- ---- ----- ------ -------- -------- -------
shared 2.0.13 0 SSD - 186.2GB 745.2GB (normal)
shared 2.0.12 0 SSD - 186.2GB 745.2GB (normal)
If you assign one or more local tiers (aggregates) to a storage virtual machine (storage
VM or SVM, formerly known as Vserver), then you can use only those local tiers to
contain volumes for that storage VM (SVM).
What you’ll need
The storage VM and the local tiers you want to assign to that storage VM must already exist.
Steps
1. Check the list of local tiers (aggregates) already assigned to the SVM:
The aggregates currently assigned to the SVM are displayed. If there are no aggregates assigned, “-” is
displayed.
The listed aggregates are assigned to or removed from the SVM. If the SVM already has volumes that use
an aggregate that is not assigned to the SVM, a warning message is displayed, but the command is
completed successfully. Any aggregates that were already assigned to the SVM and that were not named
in the command are unaffected.
Example
In the following example, the aggregates aggr1 and aggr2 are assigned to SVM svm1:
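A sketch of the commands for this example, assuming the vserver show and vserver add-aggregates commands:
vserver show -fields aggr-list
vserver add-aggregates -vserver svm1 -aggregates aggr1,aggr2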
You might need to determine which volumes reside on a local tier (aggregate) before
performing operations on the local tier, such as relocating it or taking it offline.
Steps
1. To display the volumes that reside on an aggregate, enter
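For example, assuming an aggregate named aggr1:
volume show -aggregate aggr1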
You can determine which FlexVol volumes are using the most space in a local tier
(aggregate) and specifically which features within the volume.
The volume show-footprint command provides information about a volume’s footprint, or its space usage
within the containing aggregate.
The volume show-footprint command shows details about the space usage of each volume in an
aggregate, including offline volumes. This command bridges the gap between the output of the volume
show-space and aggregate show-space commands. All percentages are calculated as a percent of
aggregate size.
The following example shows the volume show-footprint command output for a volume called testvol:
cluster1::> volume show-footprint testvol
Vserver : thevs
Volume : testvol
The following table explains some of the key rows of the output of the volume show-footprint command
and what you can do to try to decrease space usage by that feature:
• Volume Data Footprint: The total amount of space used in the containing aggregate by a volume’s data in the active file system and the space used by the volume’s Snapshot copies. This row does not include reserved space. To decrease it, delete data from the volume or delete Snapshot copies from the volume.
• Volume Guarantee: The amount of space reserved by the volume in the aggregate for future writes. The amount of space reserved depends on the guarantee type of the volume. To decrease it, change the type of guarantee for the volume to none.
• Flexible Volume Metadata: The total amount of space used in the aggregate by the volume’s metadata files. There is no direct method to control this.
• Delayed Frees: Blocks that ONTAP used for performance and cannot be immediately freed. For SnapMirror destinations, this row has a value of 0 and is not displayed. There is no direct method to control this.
• File Operation Metadata: The total amount of space reserved for file operation metadata. There is no direct method to control this.
• Total Footprint: The total amount of space that the volume uses in the aggregate. It is the sum of all of the rows. To decrease it, use any of the methods used to decrease space used by a volume.
Related information
NetApp Technical Report 3483: Thin Provisioning in a NetApp SAN or IP SAN Enterprise Environment
You can view how much space is used by all volumes in one or more local tiers
(aggregates) so that you can take actions to free more space.
WAFL reserves a percentage of the total disk space for aggregate level metadata and performance. The space
used for maintaining the volumes in the aggregate comes out of the WAFL reserve and cannot be changed.
In aggregates smaller than 30 TB, WAFL reserves 10% of the total disk space for aggregate level metadata
and performance.
Beginning with ONTAP 9.12.1, in aggregates that are 30 TB or larger, the amount of reserved disk space for
aggregate level metadata and performance is reduced, resulting in 5% more usable space in aggregates. The
availability of this space savings varies based on your platform and version of ONTAP.
You can view space usage by all volumes in one or more aggregates with the aggregate show-space
command. This helps you see which volumes are consuming the most space in their containing aggregates so
that you can take actions to free more space.
The used space in an aggregate is directly affected by the space used in the FlexVol volumes it contains.
Measures that you take to increase space in a volume also affect space in the aggregate.
Beginning with ONTAP 9.15.1, two new metadata counters are available. Together with changes
to several existing counters, you can get a clearer view of the amount of user data allocated.
See Determine space usage in a volume or aggregate for more information.
The following rows are included in the aggregate show-space command output:
• Volume Footprints
The total of all volume footprints within the aggregate. It includes all of the space that is used or reserved
by all data and metadata of all volumes in the containing aggregate.
• Aggregate Metadata
The total file system metadata required by the aggregate, such as allocation bitmaps and inode files.
• Snapshot Reserve
The amount of space reserved for aggregate Snapshot copies, based on volume size. It is considered
used space and is not available to volume or aggregate data or metadata.
• Snapshot Reserve Unusable
The amount of space originally allocated for aggregate Snapshot reserve that is unavailable for aggregate
Snapshot copies because it is being used by volumes associated with the aggregate. Can occur only for
aggregates with a non-zero aggregate Snapshot reserve.
• Total Used
The sum of all space used or reserved in the aggregate by volumes, metadata, or Snapshot copies.
• Total Physical Used
The amount of space being used for data now (rather than being reserved for future use). Includes space
used by aggregate Snapshot copies.
The following example shows the aggregate show-space command output for an aggregate whose
Snapshot reserve is 5%. If the Snapshot reserve was 0, the row would not be displayed.
Aggregate : wqa_gx106_aggr1
Related Information
• Knowledge Base article: Space Usage
• Free up 5% of your storage capacity by upgrading to ONTAP 9.12.1
You can change the ownership of local tiers (aggregates) among the nodes in an HA pair
without interrupting service from the local tiers.
Both nodes in an HA pair are physically connected to each other’s disks or array LUNs. Each disk or array
LUN is owned by one of the nodes.
Ownership of all disks or array LUNs within a local tier (aggregate) changes temporarily from one node to the
other when a takeover occurs. However, local tier relocation operations can also permanently change the
ownership (for example, if done for load balancing). The ownership changes without any data-copy processes
or physical movement of the disks or array LUNs.
• Because volume count limits are validated programmatically during local tier relocation operations, it is not
necessary to check for this manually.
If the volume count exceeds the supported limit, the local tier relocation operation fails with a relevant error
message.
• You should not initiate local tier relocation when system-level operations are in progress on either the
source or the destination node; likewise, you should not start these operations during the local tier
relocation.
◦ Takeover
◦ Giveback
◦ Shutdown
◦ Another local tier relocation operation
◦ Disk ownership changes
◦ Local tier or volume configuration operations
◦ Storage controller replacement
◦ ONTAP upgrade
◦ ONTAP revert
• If you have a MetroCluster configuration, you should not initiate local tier relocation while disaster recovery
operations (switchover, healing, or switchback) are in progress.
• If you have a MetroCluster configuration and initiate local tier relocation on a switched-over local tier, the
operation might fail because it exceeds the DR partner’s volume limit count.
• You should not initiate local tier relocation on aggregates that are corrupt or undergoing maintenance.
• Before initiating the local tier relocation, you should save any core dumps on the source and destination
nodes.
Steps
1. View the aggregates on the node to confirm which aggregates to move and ensure they are online and in
good condition:
The following command shows six aggregates on the four nodes in the cluster. All aggregates are online.
Node1 and Node3 form an HA pair and Node2 and Node4 form an HA pair.
cluster::> storage aggregate show
Aggregate Size Available Used% State #Vols Nodes RAID Status
--------- -------- --------- ----- ------- ------ ------ -----------
aggr_0 239.0GB 11.13GB 95% online 1 node1 raid_dp,
normal
aggr_1 239.0GB 11.13GB 95% online 1 node1 raid_dp,
normal
aggr_2 239.0GB 11.13GB 95% online 1 node2 raid_dp,
normal
aggr_3 239.0GB 11.13GB 95% online 1 node2 raid_dp,
normal
aggr_4 239.0GB 238.9GB 0% online 5 node3 raid_dp,
normal
aggr_5 239.0GB 239.0GB 0% online 4 node4 raid_dp,
normal
6 entries were displayed.
The following command moves the aggregates aggr_1 and aggr_2 from Node1 to Node3. Node3 is
Node1’s HA partner. The aggregates can be moved only within the HA pair.
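A sketch of the relocation command for this example, using the storage aggregate relocation start command:
storage aggregate relocation start -aggregate-list aggr_1,aggr_2 -node node1 -destination node3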
3. Monitor the progress of the aggregate relocation with the storage aggregate relocation show
command:
The following command shows the progress of the aggregates that are being moved to Node3:
cluster::> storage aggregate relocation show -node node1
Source Aggregate Destination Relocation Status
------ ----------- ------------- ------------------------
node1
aggr_1 node3 In progress, module: wafl
aggr_2 node3 Not attempted yet
2 entries were displayed.
node1::storage aggregate>
When the relocation is complete, the output of this command shows each aggregate with a relocation
status of “Done”.
You can delete a local tier (aggregate) if there are no volumes on the local tier.
The storage aggregate delete command deletes a storage aggregate. The command fails if there are
volumes present on the aggregate. If the aggregate has an object store attached to it, then in addition to
deleting the aggregate, the command deletes the objects in the object store as well. No changes are made to
the object store configuration as part of this command.
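For example, deleting a hypothetical empty local tier named aggr_old:
storage aggregate delete -aggregate aggr_old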
There are specific ONTAP commands for relocating aggregate ownership within an HA
pair.
Related information
• ONTAP command reference
• To display the size of the cache for all Flash Pool aggregates: storage aggregate show -fields hybrid-cache-size-total -hybrid-cache-size-total >0
• To display disk information and status for an aggregate: storage aggregate show-status
• To display the root aggregates in the cluster: storage aggregate show -has-mroot true
• To display basic information and status for aggregates: storage aggregate show
• To display the type of storage used in an aggregate: storage aggregate show -fields storage-type
• To change the RAID type for an aggregate: storage aggregate modify -raidtype
Related information
• ONTAP command reference
You can add disks to a local tier and add drives to a node or shelf.
• Add drives to a node or shelf
• Correct misaligned spare partitions
To add capacity to a local tier (expand an aggregate), you must first identify which local
tier you want to add to, determine how much new storage is needed, install new disks,
assign disk ownership, and create a new RAID group, if needed.
You can use either System Manager or the CLI to add capacity.
Methods to create space in a local tier (aggregate)
If a local tier (aggregate) runs out of free space, various problems can result that range
from loss of data to disabling a volume’s guarantee. There are multiple ways to make
more space in a local tier.
All of the methods have various consequences. Prior to taking any action, you should read the relevant section
in the documentation.
The following are some common ways to make space in a local tier, in order from least to most consequential:
You can add disks to a local tier (aggregate) so that it can provide more storage to its
associated volumes.
System Manager (ONTAP 9.8 and later)
Use System Manager to add capacity (ONTAP 9.8 and later)
Beginning with ONTAP 9.12.1, you can use System Manager to view the committed
capacity of a local tier to determine if additional capacity is required for the local tier. See
Monitor capacity in System Manager.
Steps
1. Click Storage > Tiers.
2. Click the menu icon next to the name of the local tier to which you want to add capacity.
3. Click Add Capacity.
If there are no spare disks that you can add, then the Add Capacity option is not
shown, and you cannot increase the capacity of the local tier.
4. Perform the following steps, based on the version of ONTAP that is installed:
Beginning with ONTAP 9.11.1:
1. Select the disk type and number of disks.
2. If you want to add disks to a new RAID group, check the check box. The RAID allocation is displayed.
3. Click Save.
5. (Optional) The process takes some time to complete. If you want to run the process in the
background, select Run in Background.
6. After the process completes, you can view the increased capacity amount in the local tier information
at Storage > Tiers.
You can add capacity to a local tier (aggregate) by adding capacity disks.
About this task
You perform this task only if you have installed ONTAP 9.7 or earlier. If you installed ONTAP 9.8 or later,
refer to Use System Manager to add capacity (ONTAP 9.8 or later).
Steps
1. (For ONTAP 9.7 only) Click (Return to classic version).
2. Click Hardware and Diagnostics > Aggregates.
3. Select the aggregate to which you want to add capacity disks, and then click Actions > Add
Capacity.
You should add disks that are of the same size as the other disks in the aggregate.
CLI
Use the CLI to add capacity
The procedure for adding partitioned disks to an aggregate is similar to the procedure for adding
unpartitioned disks.
When you provision partitions, you must ensure that you do not leave the node without a drive with both
partitions as spare. If you do, and the node experiences a controller disruption, valuable information about
the problem (the core file) might not be available to provide to the technical support.
Do not use the disklist command to expand your aggregates. This could cause partition
misalignment.
Steps
1. Show the available spare storage on the system that owns the aggregate:
You can use the -is-disk-shared parameter to show only partitioned drives or only unpartitioned
drives.
cl1-s2::> storage aggregate show-spare-disks -original-owner cl1-s2
-is-disk-shared true
cl1-s2::> storage aggregate show-status -aggregate data_1
You can see the result of the storage addition without actually provisioning any storage. If any
warnings are displayed from the simulated command, you can adjust the command and repeat the
simulation.
cl1-s2::> storage aggregate add-disks -aggregate aggr_test
-diskcount 5 -simulate true
When creating a Flash Pool aggregate, if you are adding disks with a different checksum than the
aggregate, or if you are adding disks to a mixed checksum aggregate, you must use the
-checksumstyle parameter.
If you are adding disks to a Flash Pool aggregate, you must use the -disktype parameter to specify
the disk type.
You can use the -disksize parameter to specify a size of the disks to add. Only disks with
approximately the specified size are selected for addition to the aggregate.
cl1-s2::> storage aggregate add-disks -aggregate data_1 -raidgroup
new -diskcount 5
6. Verify that the node still has at least one drive with both the root partition and the data partition as
spare:
cl1-s2::> storage aggregate show-spare-disks -original-owner cl1-s2
-is-disk-shared true
You add drives to a node or shelf to increase the number of hot spares or to add space to
a local tier (aggregate).
Before you begin
The drive you want to add must be supported by your platform. You can confirm using the NetApp Hardware
Universe.
The minimum number of drives you should add in a single procedure is six. Adding a single drive might reduce
performance.
Steps to install the drives
1. Check the NetApp Support Site for newer drive and shelf firmware and Disk Qualification Package files.
If your node or shelf does not have the latest versions, update them before installing the new drive.
Drive firmware is automatically updated (nondisruptively) on new drives that do not have current firmware
versions.
The correct slots for adding drives vary depending on the platform model and ONTAP
version. In some cases you need to add drives to specific slots in sequence. For example, in
an AFF A800 you add the drives at specific intervals, leaving clusters of empty slots,
whereas in an AFF A220 you add new drives to the next empty slots running from the
outside toward the middle of the shelf.
Refer to the steps in Before you begin to identify the correct slots for your configuration in the NetApp
Hardware Universe.
When the drive’s activity LED is solid, it means that the drive has power. When the drive’s activity LED is
blinking, it means that the drive has power and I/O is in progress. If the drive firmware is automatically
updating, the LED blinks.
The new drives are not recognized until they are assigned to a node. You can assign the new drives
manually, or you can wait for ONTAP to automatically assign the new drives if your node follows the rules
for drive auto-assignment.
8. After the new drives have all been recognized, verify that they have been added and their ownership is
specified correctly.
storage aggregate show-spare-disks
You should see the new drives, owned by the correct node.
2. Optionally (for ONTAP 9.3 and earlier only), zero the newly added drives:
Drives that have been used previously in an ONTAP local tier (aggregate) must be zeroed before they can
be added to another aggregate. In ONTAP 9.3 and earlier, zeroing can take hours to complete, depending
on the size of the non-zeroed drives in the node. Zeroing the drives now can prevent delays in case you
need to quickly increase the size of a local tier. This is not an issue in ONTAP 9.4 or later, where drives
are zeroed using fast zeroing which takes only seconds.
Results
The new drives are ready. You can add them to a local tier (aggregate), place them onto the list of hot spares,
or add them when you create a new local tier.
When you add partitioned disks to a local tier (aggregate), you must leave a disk with
both the root and data partition available as a spare for every node. If you do not and
your node experiences a disruption, ONTAP cannot dump the core to the spare data
partition.
Before you begin
You must have both a spare data partition and a spare root partition on the same type of disk owned by the
same node.
Steps
1. Using the CLI, display the spare partitions for the node:
Note which disk has a spare data partition (spare_data) and which disk has a spare root partition
(spare_root). The spare partition will show a non-zero value under the Local Data Usable or Local
Root Usable column.
2. Replace the disk with a spare data partition with the disk with the spare root partition:
You can copy the data in either direction; however, copying the root partition takes less time to complete.
4. After the replacement operation is complete, display the spares again to confirm that you have a full spare
disk:
You should see a spare disk with usable space under both “Local Data Usable” and “Local Root
Usable”.
Example
You display your spare partitions for node c1-01 and see that your spare partitions are not aligned:
c1::> storage disk replace -disk 1.0.1 -replacement 1.0.10 -action start
While you are waiting for the replacement operation to finish, you display the progress of the operation:
After the replacement operation is complete, confirm that you have a full spare disk:
ie2220::> storage aggregate show-spare-disks -original-owner c1-01
Manage disks
A hot spare disk is a disk that is assigned to a storage system and is ready for use, but is
not in use by a RAID group and does not hold any data.
If a disk failure occurs within a RAID group, the hot spare disk is automatically assigned to the RAID group to
replace the failed disks. The data of the failed disk is reconstructed on the hot spare replacement disk in the
background from the RAID parity disk. The reconstruction activity is logged in the /etc/messages file and an
AutoSupport message is sent.
If the available hot spare disk is not the same size as the failed disk, a disk of the next larger size is chosen
and then downsized to match the size of the disk that it is replacing.
Maintaining the proper number of spares for disks in multi-disk carriers is critical for optimizing storage
redundancy and minimizing the amount of time that ONTAP must spend copying disks to achieve an optimal
disk layout.
You must maintain a minimum of two hot spares for multi-disk carrier disks at all times. To support the use of
the Maintenance Center and to avoid issues caused by multiple concurrent disk failures, you should maintain
at least four hot spares for steady state operation, and replace failed disks promptly.
If two disks fail at the same time with only two available hot spares, ONTAP might not be able to swap the
contents of both the failed disk and its carrier mate to the spare disks. This scenario is called a stalemate. If
this happens, you are notified through EMS messages and AutoSupport messages. When the replacement
carriers become available, you must follow the instructions that are provided by the EMS messages.
For more information, see the Knowledge Base article RAID Layout Cannot Be Autocorrected - AutoSupport
message
How low spare warnings can help you manage your spare disks
By default, warnings are issued to the console and logs if you have fewer than one hot
spare drive that matches the attributes of each drive in your storage system.
You can change the threshold value for these warning messages to ensure that your system adheres to best
practices.
Step
1. Set the option to “2”:
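A sketch of this change, assuming the raid.min_spare_count RAID option and the storage raid-options modify command; confirm the exact parameter names in the command reference for your release:
storage raid-options modify -node node1 -name raid.min_spare_count -value 2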
Beginning with ONTAP 9.2, a new root-data partitioning option is available from the Boot
Menu that provides additional management features for disks that are configured for root-
data partitioning.
The following management features are available under the Boot Menu Option 9.
This option is useful if your system is configured for root-data partitioning and you need to reinitialize it with
a different configuration.
◦ Your system is not configured for root-data partitioning and you would like to configure it for root-data
partitioning
◦ Your system is incorrectly configured for root-data partitioning and you need to correct it
◦ You have an AFF platform or a FAS platform with only SSDs attached that is configured for the
previous version of root-data partitioning and you want to upgrade it to the newer version of root-data
partitioning to gain increased storage efficiency
• Clean configuration and initialize node with whole disks
The Disk Qualification Package (DQP) adds full support for newly qualified drives. Before
you update drive firmware or add new drive types or sizes to a cluster, you must update
the DQP. A best practice is to also update the DQP regularly; for example, every quarter
or semi-annually.
You need to download and install the DQP in the following situations:
For example, if you already have 1-TB drives and add 2-TB drives, you need to check for the latest DQP
update.
Related information
NetApp Downloads: Disk Qualification Package
You can view disk ownership to determine which node controls the storage. You can also view the partition
ownership on systems that use shared disks.
You can select a non-default policy for automatically assigning disk ownership or disable automatic
assignment of disk ownership.
If your cluster is not configured to use automatic disk ownership assignment, you must assign ownership
manually.
You can set the ownership of the container disk or the partitions manually or by using auto-assignment—
just as you do for unpartitioned disks.
A disk that has failed completely is no longer considered by ONTAP to be a usable disk, and you can
immediately disconnect the disk from the shelf.
ONTAP writes disk ownership information to the disk. Before you remove a spare disk or its shelf from a
node, you should remove its ownership information so that it can be properly integrated into another node.
The default auto-assignment policy is based on platform-specific characteristics, or the DS460C shelf if your
HA pair has only these shelves, and it uses one of the following methods (policies) to assign disk ownership:
• shelf: All disks in the shelf are assigned to node A. Applies to entry-level systems in an HA pair configuration with one stack of two or more shelves, and MetroCluster configurations with one stack per node, two or more shelves.
• split shelf: Disks on the left side of the shelf are assigned to node A and disks on the right side to node B. Partial shelves on HA pairs are shipped from the factory with disks populated from the shelf edge toward the center. This policy falls under the “default” value for the -autoassign-policy parameter of the storage disk option command for applicable platform and shelf configurations. Applies to most AFF platforms and some MetroCluster configurations.
• stack: All disks in the stack are assigned to node A. Applies to stand-alone entry-level systems and all other configurations.
• half-drawer: All drives in the left half of a DS460C drawer (drive bays 0 to 5) are assigned to node A; all drives in the right half of a drawer (drive bays 6 to 11) are assigned to node B. This policy falls under the “default” value for the -autoassign-policy parameter of the storage disk option command for applicable platform and shelf configurations. Applies to HA pairs with only DS460C shelves, after HA pair initialization (boot up). After an HA pair boots up, automatic assignment of disk ownership is automatically enabled and uses the half-drawer policy to assign ownership to the remaining drives (other than the root drives/container drives that have the root partition) and any drives added in the future. When initializing an HA pair with only DS460C shelves, automatic assignment of disk ownership is not supported; you must manually assign ownership for drives containing root/container drives that have the root partition by conforming to the half-drawer policy. If your HA pair has DS460C shelves in addition to other shelf models, the half-drawer policy is not used; the default policy used is dictated by platform-specific characteristics.
• You can display the current auto-assignment settings (on/off) with the storage disk option show
command.
• You can disable automatic assignment by using the storage disk option modify command.
• If the default auto-assignment policy is not desirable in your environment, you can specify (change) the
bay, shelf, or stack assignment method using the -autoassign-policy parameter in the storage
disk option modify command.
The half-drawer and split-shelf default auto-assignment policies are unique because they
cannot be set by users like the bay, shelf, and stack policies can.
In Advanced Drive Partitioning (ADP) systems, to make auto-assign work on half-populated shelves, drives
must be installed in the correct shelf bays based on what type of shelf you have:
• If your shelf is not a DS460C shelf, install drives equally on the far left side and far right side moving toward
the middle. For example, six drives in bays 0-5 and six drives in bays 18-23 of a DS224C shelf.
• If your shelf is a DS460C shelf, install drives in the front row (drive bays 0, 3, 6, and 9) of each drawer. For
the remaining drives, evenly distribute them across each drawer by filling drawer rows from front to back. If
you don’t have enough drives to fill rows, then install them in pairs so that drives occupy the left and right
side of a drawer evenly.
Installing drives in the front row of each drawer allows for proper air flow and prevents overheating.
If drives are not installed in the correct shelf bays on half-populated shelves, when a container
drive fails and is replaced, ONTAP does not auto-assign ownership. In this case, assignment of
the new container drive needs to be done manually. After you have assigned ownership for the
container drive, ONTAP automatically handles any drive partitioning and partitioning
assignments that are required.
In some situations where auto-assignment will not work, you need to manually assign disk ownership using the
storage disk assign command:
• If you disable auto-assignment, new disks are not available as spares until they are manually assigned to a
node.
• If you want disks to be auto-assigned and you have multiple stacks or shelves that must have different
ownership, one disk must have been manually assigned on each stack or shelf so that automatic
ownership assignment works on each stack or shelf.
• If auto-assignment is enabled and you manually assign a single drive to a node that isn’t specified in the
active policy, auto-assignment stops working and an EMS message is displayed.
You can view disk ownership to determine which node controls the storage. You can also
view the partition ownership on systems that use shared disks.
Steps
1. Display the ownership of physical disks:
cluster::> storage disk show -ownership
Disk Aggregate Home Owner DR Home Home ID Owner ID DR
Home ID Reserver Pool
-------- --------- -------- -------- -------- ---------- -----------
----------- ----------- ------
1.0.0 aggr0_2 node2 node2 - 2014941509 2014941509 -
2014941509 Pool0
1.0.1 aggr0_2 node2 node2 - 2014941509 2014941509 -
2014941509 Pool0
1.0.2 aggr0_1 node1 node1 - 2014941219 2014941219 -
2014941219 Pool0
1.0.3 - node1 node1 - 2014941219 2014941219 -
2014941219 Pool0
2. If you have a system that uses shared disks, you can display the partition ownership:
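A minimal sketch, assuming the -partition-ownership view of the storage disk show command:
storage disk show -partition-ownership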
You can use the storage disk option modify command to select a non-default
policy for automatically assigning disk ownership or to disable automatic assignment of
disk ownership.
Learn about automatic assignment of disk ownership.
Steps
1. Modify automatic disk assignment:
a. If you want to select a non-default policy, enter:
▪ Use stack as the autoassign_policy to configure automatic ownership at the stack or loop
level.
▪ Use shelf as the autoassign_policy to configure automatic ownership at the shelf level.
▪ Use bay as the autoassign_policy to configure automatic ownership at the bay level.
b. If you want to disable automatic disk ownership assignment, enter:
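A minimal sketch of both forms, assuming a node named node1 (substitute shelf or bay for the policy as needed):
storage disk option modify -node node1 -autoassign-policy stack
storage disk option modify -node node1 -autoassign off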
If your HA pair is not configured to use automatic disk ownership assignment, you must
manually assign ownership. If you are initializing an HA pair that has only DS460C
shelves, you must manually assign ownership for the root drives.
About this task
• If you are manually assigning ownership in an HA pair that is not being initialized and does not have only
DS460C shelves, use option 1.
• If you are initializing an HA pair that has only DS460C shelves, use option 2 to manually assign ownership
for the root drives.
Option 1: Most HA pairs
For an HA pair that is not being initialized and does not have only DS460C shelves, use this procedure to
manually assign ownership.
Steps
1. Use the CLI to display all unowned disks:
You can use the wildcard character to assign more than one disk at once. If you are reassigning a
spare disk that is already owned by a different node, you must use the “-force” option.
Option 2: An HA pair with only DS460C shelves
For an HA pair that you are initializing and that only has DS460C shelves, use this procedure to manually
assign ownership for the root drives.
After HA pair initialization (boot up), automatic assignment of disk ownership is automatically enabled
and uses the half-drawer policy to assign ownership to the remaining drives (other than the root
drives) and any drives added in the future, such as replacing failed disks, responding to a “low
spares” message, or adding capacity.
Learn about the half-drawer policy in the topic About automatic assignment of disk ownership.
• RAID requires a minimum of 10 drives for each HA pair (5 for each node) when using NL-SAS drives
larger than 8 TB in a DS460C shelf.
Steps
1. If your DS460C shelves are not fully populated, complete the following substeps; otherwise, go to the
next step.
a. First, install drives in the front row (drive bays 0, 3, 6, and 9) of each drawer.
Installing drives in the front row of each drawer allows for proper air flow and prevents
overheating.
b. For the remaining drives, evenly distribute them across each drawer.
Fill drawer rows from front to back. If you don’t have enough drives to fill rows, then install them in
pairs so that drives occupy the left and right side of a drawer evenly.
The following illustration shows the drive bay numbering and locations in a DS460C drawer.
2. Log into the clustershell using the node-management LIF or cluster-management LIF.
3. Manually assign the root drives in each drawer to conform to the half-drawer policy using the following
substeps:
The half-drawer policy has you assign the left half of a drawer’s drives (bays 0 to 5) to node A, and
the right half of a drawer’s drives (bays 6 to 11) to node B.
You can use the wildcard character to assign more than one disk at a time.
You can manually assign the ownership of the container disk or the partitions on
Advanced Drive Partitioning (ADP) systems. If you are initializing an HA pair that only has
DS460C shelves, you must manually assign ownership for the container drives that will
include root partitions.
About this task
• The type of storage system you have determines which method of ADP is supported, root-data (RD) or
root-data-data (RD2).
FAS storage systems use RD and AFF storage systems use RD2.
• If you are manually assigning ownership in an HA pair that is not being initialized and does not have only
DS460C shelves, use option 1 to manually assign disks with root-data (RD) partitioning or use option 2 to
manually assign disks with root-data-data (RD2) partitioning.
• If you are initializing an HA pair that has only DS460C shelves, use option 3 to manually assign ownership
for the container drives that have the root partition.
Option 1: Manually assign disks with root-data (RD) partitioning
For root-data partitioning, there are three owned entities (the container disk and the two partitions)
collectively owned by the HA pair.
Steps
1. Use the CLI to display the current ownership for the partitioned disk:
3. Enter the appropriate command, depending on which ownership entity you want to assign ownership
for:
If any of the ownership entities are already owned, then you must include the “-force” option.
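A sketch of the display and assignment forms for root-data partitioning, assuming a hypothetical disk 1.0.5 and a node named node1; omitting the partition parameters assigns the container disk, while -data true and -root true select the data and root partitions:
storage disk show -disk 1.0.5 -partition-ownership
storage disk assign -disk 1.0.5 -owner node1
storage disk assign -disk 1.0.5 -owner node1 -data true
storage disk assign -disk 1.0.5 -owner node1 -root true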
Option 2: Manually assign disks with root-data-data (RD2) partitioning
For root-data-data partitioning, there are four owned entities (the container disk and the three partitions)
collectively owned by the HA pair. Root-data-data partitioning creates one small partition as the root
partition and two larger, equally sized partitions for data.
Steps
1. Use the CLI to display the current ownership for the partitioned disk:
3. Enter the appropriate command, depending on which ownership entity you want to assign ownership
for:
If any of the ownership entities are already owned, then you must include the “-force” option.
Option 3: Manually assign DS460C container drives that have the root partition
If you are initializing an HA pair that has only DS460C shelves, you must manually assign ownership for
the container drives that have the root partition by conforming to the half-drawer policy.
After HA pair initialization (boot up), automatic assignment of disk ownership is automatically enabled
and uses the half-drawer policy to assign ownership to the remaining drives (other than the container
drives that have the root partition) and any drives added in the future, such as replacing failed drives,
responding to a “low spares” message, or adding capacity.
• Learn about the half-drawer policy in the topic About automatic assignment of disk ownership.
Steps
1. If your DS460C shelves are not fully populated, complete the following substeps; otherwise, go to the
next step.
a. First, install drives in the front row (drive bays 0, 3, 6, and 9) of each drawer.
Installing drives in the front row of each drawer allows for proper air flow and prevents
overheating.
b. For the remaining drives, evenly distribute them across each drawer.
Fill drawer rows from front to back. If you don’t have enough drives to fill rows, then install them in
pairs so that drives occupy the left and right side of a drawer evenly.
The following illustration shows the drive bay numbering and locations in a DS460C drawer.
2. Log into the clustershell using the node-management LIF or cluster-management LIF.
3. For each drawer, manually assign the container drives that have the root partition by conforming to
the half-drawer policy using the following substeps:
The half-drawer policy has you assign the left half of a drawer’s drives (bays 0 to 5) to node A, and
the right half of a drawer’s drives (bays 6 to 11) to node B.
You can use the wildcard character to assign more than one drive at a time.
This procedure is designed for nodes for which no data local tier (aggregate) has been created from the
partitioned disks.
Steps
All commands are entered at the cluster shell.
The output shows that half of the data partitions are owned by one node and half are owned by the other
node. All of the data partitions should be spare.
Root Physical
Disk Type RPM Checksum Usable
Usable Size
--------------------------- ----- ------ -------------- --------
-------- --------
1.0.0 BSAS 7200 block 753.8GB
0B 828.0GB
1.0.1 BSAS 7200 block 753.8GB
73.89GB 828.0GB
1.0.5 BSAS 7200 block 753.8GB
0B 828.0GB
1.0.6 BSAS 7200 block 753.8GB
0B 828.0GB
1.0.10 BSAS 7200 block 753.8GB
0B 828.0GB
1.0.11 BSAS 7200 block 753.8GB
0B 828.0GB
set advanced
3. For each data partition owned by the node that will be the passive node, assign it to the active node:
storage disk assign -force -data true -owner active_node_name -disk disk_name
You do not need to include the partition as part of the disk name.
You would enter a command similar to the following example for each data partition you need to reassign:
storage disk assign -force -data true -owner cluster1-01 -disk 1.0.3
4. Confirm that all of the partitions are assigned to the active node.
1.0.11 BSAS 7200 block 753.8GB
0B 828.0GB
set admin
6. Create your data aggregate, leaving at least one data partition as spare:
This procedure is designed for nodes for which no data local tier (aggregate) has been created from the
partitioned disks.
Steps
All commands are entered at the cluster shell.
The output shows that half of the data partitions are owned by one node and half are owned by the other
node. All of the data partitions should be spare.
set advanced
3. For each data1 partition owned by the node that will be the passive node, assign it to the active node:
You do not need to include the partition as part of the disk name
4. For each data2 partition owned by the node that will be the passive node, assign it to the active node:
You do not need to include the partition as part of the disk name
5. Confirm that all of the partitions are assigned to the active node:
1.0.3 BSAS 7200 block 753.8GB
0B 828.0GB
1.0.4 BSAS 7200 block 753.8GB
0B 828.0GB
1.0.5 BSAS 7200 block 753.8GB
0B 828.0GB
1.0.6 BSAS 7200 block 753.8GB
0B 828.0GB
1.0.7 BSAS 7200 block 753.8GB
0B 828.0GB
1.0.8 BSAS 7200 block 753.8GB
0B 828.0GB
1.0.9 BSAS 7200 block 753.8GB
0B 828.0GB
1.0.10 BSAS 7200 block 753.8GB
0B 828.0GB
1.0.11 BSAS 7200 block 753.8GB
0B 828.0GB
set admin
7. Create your data aggregate, leaving at least one data partition as spare:
8. Alternatively, you can use ONTAP’s recommended aggregate layout, which includes best practices for RAID
group layout and spare counts:
ONTAP writes disk ownership information to the disk. Before you remove a spare disk or
its shelf from a node, you should remove its ownership information so that it can be
properly integrated into another node.
If the disk is partitioned for root-data partitioning and you are running ONTAP 9.10.1 or later,
contact NetApp Technical Support for assistance in removing ownership. For more information
see the Knowledge Base article: Failed to remove the owner of disk.
You cannot remove ownership from a disk that is being used in a local tier (aggregate).
Steps
1. If disk ownership automatic assignment is on, use the CLI to turn it off:
4. If the disk is partitioned for root-data partitioning and you are running ONTAP 9.9.1 or earlier, remove
ownership from the partitions:
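A sketch of the ownership-removal commands, assuming a hypothetical spare disk 1.0.3; the first form applies to an unpartitioned spare, and the -data and -root forms remove ownership from the partitions described in step 4:
storage disk removeowner -disk 1.0.3
storage disk removeowner -disk 1.0.3 -data true
storage disk removeowner -disk 1.0.3 -root true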
5. If you previously turned off automatic assignment of disk ownership, turn it on after the disk has been
removed or reassigned:
A disk that has completely failed is no longer counted by ONTAP as a usable disk, and
you can immediately disconnect the disk from the disk shelf. However, you should leave a
partially failed disk connected long enough for the Rapid RAID Recovery process to
complete.
About this task
If you are removing a disk because it has failed or because it is producing excessive error messages, you
should not use the disk again in this or any other storage system.
Steps
1. Use the CLI to find the disk ID of the failed disk:
If the disk does not appear in the list of failed disks, it might have partially failed, with a Rapid RAID
Recovery in process. In this case, you should wait until the disk is present in the list of failed disks (which
means that the Rapid RAID Recovery process is complete) before removing the disk.
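A minimal sketch that lists broken disks so you can note the failed disk ID:
storage disk show -broken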
3. Remove the disk from the disk shelf, following the instructions in the hardware guide for your disk shelf
model.
Disk sanitization
This functionality is available through the nodeshell in all ONTAP 9 releases, and starting with ONTAP 9.6 in
maintenance mode.
The disk sanitization process uses three successive default or user-specified byte overwrite patterns for up to
seven cycles per operation. The random overwrite pattern is repeated for each cycle.
Depending on the disk capacity, the patterns, and the number of cycles, the process can take several hours.
Sanitization runs in the background. You can start, stop, and display the status of the sanitization process. The
sanitization process contains two phases: the "Formatting phase" and the "Pattern overwrite phase".
Formatting phase
The operation performed for the formatting phase depends on the class of disk being sanitized, as shown in
the following table:
When the sanitization process is complete, the specified disks are in a sanitized state. They are not returned to
spare status automatically. You must return the sanitized disks to the spare pool before the newly sanitized
disks are available to be added to another aggregate.
Disk sanitization is not supported for all disk types. In addition, there are circumstances in
which disk sanitization cannot be performed.
• It is not supported on all SSD part numbers.
For information about which SSD part numbers support disk sanitization, see the Hardware Universe.
If the sanitization process is interrupted, ONTAP takes action to return the disks that were being sanitized to a
known state, but you must also take action before the sanitization process can finish.
Disk sanitization is a long-running operation. If the sanitization process is interrupted by power failure, system
panic, or manual intervention, the sanitization process must be repeated from the beginning. The disk is not
designated as sanitized.
If the formatting phase of disk sanitization is interrupted, ONTAP must recover any disks that were corrupted by
the interruption. After a system reboot and once every hour, ONTAP checks for any sanitization target disk that
did not complete the formatting phase of its sanitization. If any such disks are found, ONTAP recovers them.
The recovery method depends on the type of the disk. After a disk is recovered, you can rerun the sanitization
process on that disk; for HDDs, you can use the -s option to specify that the formatting phase is not repeated
again.
Tips for creating and backing up local tiers (aggregates) containing data to be sanitized
If you are creating or backing up local tiers (aggregates) to contain data that might need
to be sanitized, following some simple guidelines will reduce the time it takes to sanitize
your data.
• Make sure your local tiers containing sensitive data are not larger than they need to be.
If they are larger than needed, sanitization requires more time, disk space, and bandwidth.
• When you back up local tiers containing sensitive data, avoid backing them up to local tiers that also contain
large amounts of nonsensitive data.
This reduces the resources required to move nonsensitive data before sanitizing sensitive data.
Sanitize a disk
Sanitizing a disk allows you to remove data from a disk or a set of disks on
decommissioned or inoperable systems so that the data can never be recovered.
Two methods are available to sanitize disks using the CLI:
Sanitize a disk with “maintenance mode” commands (ONTAP 9.6 and later releases)
Beginning with ONTAP 9.6, you can perform disk sanitization in maintenance mode.
You must use the storage encryption disk sanitize command to sanitize an SED.
Steps
1. Boot into maintenance mode.
a. Exit the current shell by entering halt.
2. If the disks you want to sanitize are partitioned, unpartition each disk:
The command to unpartition a disk is only available at the diag level and should be
performed only under NetApp Support supervision. It is highly recommended that you
contact NetApp Support before you proceed.
You can also refer to the Knowledge Base article How to unpartition a spare drive in
ONTAP
disk sanitize start [-p pattern1|-r [-p pattern2|-r [-p pattern3|-r]]] [-c
cycle_count] disk_list
Do not turn off power to the node, disrupt the storage connectivity, or remove target
disks while sanitizing. If sanitizing is interrupted during the formatting phase, the
formatting phase must be restarted and allowed to finish before the disks are sanitized
and ready to be returned to the spare pool. If you need to abort the sanitization
process, you can do so by using the disk sanitize abort command. If the
specified disks are undergoing the formatting phase of sanitization, the abort does not
occur until the phase is complete.
-p pattern1 -p pattern2 -p pattern3 specifies a cycle of one to three user-defined hex byte
overwrite patterns that can be applied in succession to the disks being sanitized. The default pattern
is three passes, using 0x55 for the first pass, 0xaa for the second pass, and 0x3c for the third pass.
-r replaces a patterned overwrite with a random overwrite for any or all of the passes.
-c cycle_count specifies the number of times that the specified overwrite patterns are applied. The
default value is one cycle. The maximum value is seven cycles.
disk_list specifies a space-separated list of the IDs of the spare disks to be sanitized.
5. After the sanitization process is complete, return the disks to spare status for each disk:
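For example, for each disk (disk_name is a placeholder):
disk sanitize release disk_name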
Sanitize a disk with “nodeshell” commands (all ONTAP 9 releases)
For all versions of ONTAP 9, when disk sanitization is enabled using nodeshell commands, some low-
level ONTAP commands are disabled. After disk sanitization is enabled on a node, it cannot be disabled.
If the disks are partitioned, neither partition can be in use in a local tier (aggregate).
You must use the storage encryption disk sanitize command to sanitize an SED.
Steps
1. If the disks you want to sanitize are partitioned, unpartition each disk:
The command to unpartition a disk is only available at the diag level and should be
performed only under NetApp Support supervision. It is highly recommended that
you contact NetApp Support before you proceed. You can also refer to the
Knowledge Base article How to unpartition a spare drive in ONTAP.
2. Enter the nodeshell for the node that owns the disks you want to sanitize:
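For example, assuming the node is named cluster1-01:
system node run -node cluster1-01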
options licensed_feature.disk_sanitization.enable on
disk sanitize start [-p pattern1|-r [-p pattern2|-r [-p pattern3|-r]]] [-c
cycle_count] disk_list
Do not turn off power to the node, disrupt the storage connectivity, or remove target
disks while sanitizing. If sanitizing is interrupted during the formatting phase, the
formatting phase must be restarted and allowed to finish before the disks are sanitized
and ready to be returned to the spare pool. If you need to abort the sanitization
process, you can do so by using the disk sanitize abort command. If the specified
disks are undergoing the formatting phase of sanitization, the abort does not occur
until the phase is complete.
-r replaces a patterned overwrite with a random overwrite for any or all of the passes.
-c cycle_count specifies the number of times that the specified overwrite patterns are applied.
The default value is one cycle. The maximum value is seven cycles.
disk_list specifies a space-separated list of the IDs of the spare disks to be sanitized.
7. After the sanitization process is complete, return the disks to spare status:
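For example, for each disk (disk_name is a placeholder):
disk sanitize release disk_name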
exit
10. Determine whether all of the disks were returned to spare status:
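For example, a representative way to list the disks currently in the spare container:
storage disk show -container-type spare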
If…    Then…
All of the sanitized disks are listed as spares    You are done. The disks are sanitized and in spare status.
Some of the sanitized disks are not listed as spares    Complete the following steps:
a. Enter advanced privilege mode:
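For example, from the clustershell:
set -privilege advanced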
Result
The specified disks are sanitized and designated as hot spares. The serial numbers of the sanitized disks are
written to /etc/log/sanitized_disks.
The specified disks’ sanitization logs, which show what was completed on each disk, are written to
/mroot/etc/log/sanitization.log.
You can use the storage disk and storage aggregate commands to manage your
disks.
Display the disk RAID type, current usage, and RAID group by aggregate    storage aggregate show-status
Display the RAID type, current usage, aggregate, and RAID group, including spares, for physical disks    storage disk show -raid
Display the pre-cluster (nodescope) drive name for a disk (advanced)    storage disk show -primary-paths
Illuminate the LED for a particular disk or shelf    storage disk set-led
Display the checksum type for a specific disk    storage disk show -fields checksum-compatibility
Display the checksum type for all spare disks    storage disk show -fields checksum-compatibility -container-type spare
Display disk connectivity and placement information    storage disk show -fields disk,primary-port,secondary-name,secondary-port,shelf,bay
Display the pre-cluster disk names for specific disks    storage disk show -disk diskname -fields diskpathnames
Display the list of disks in the maintenance center    storage disk show -maintenance
Stop an ongoing sanitization process on one or more specified disks    system node run -node nodename -command disk sanitize
Retrieve authentication keys from all linked key management servers    security key-manager restore
Related information
• ONTAP command reference
You use the storage aggregate and volume commands to see how space is being
used in your aggregates and volumes and their Snapshot copies.
Aggregates, including details about used and available space percentages, Snapshot reserve size, and other space usage information    storage aggregate show; storage aggregate show-space -fields snap-size-total,used-including-snapshot-reserve
How disks and RAID groups are used in an aggregate, and RAID status    storage aggregate show-status
The amount of disk space that would be reclaimed if you deleted a specific Snapshot copy    volume snapshot compute-reclaimable
Related information
• ONTAP command reference
You use the storage shelf show command to display configuration and error
information for your disk shelves.
If you want to display…    Use this command…
Detailed information for a specific shelf, including stack ID    storage shelf show -shelf
Power information, including PSUs (power supply units), current sensors, and voltage sensors    storage shelf show -power
Related information
• ONTAP command reference
You can perform various procedures to manage RAID configurations in your system.
• Aspects of managing RAID configurations:
◦ Default RAID policies for local tiers (aggregates)
◦ RAID protection levels for disks
• Drive and RAID group information for a local tier (aggregate)
◦ Determine drive and RAID group information for a local tier (aggregate)
• RAID configuration conversions
◦ Convert from RAID-DP to RAID-TEC
◦ Convert from RAID-TEC to RAID-DP
• RAID group sizing
◦ Considerations for sizing RAID groups
◦ Customize the size of your RAID group
Either RAID-DP or RAID-TEC is the default RAID policy for all new local tiers
(aggregates). The RAID policy determines the parity protection you have in the event of a
disk failure.
RAID-DP provides double-parity protection in the event of a single or double disk failure. RAID-DP is the
default RAID policy for the following local tier (aggregate) types:
RAID-TEC is supported on all disk types and all platforms, including AFF. Local tiers that contain larger disks
have a higher possibility of concurrent disk failures. RAID-TEC helps to mitigate this risk by providing triple-
parity protection so that your data can survive up to three simultaneous disk failures. RAID-TEC is the default
RAID policy for capacity HDD local tiers with disks that are 6 TB or larger.
• RAID-DP: minimum of 5 disks
• RAID-TEC: minimum of 7 disks
ONTAP supports three levels of RAID protection for local tiers (aggregates). The level of
RAID protection determines the number of parity disks available for data recovery in the
event of disk failures.
With RAID protection, if there is a data disk failure in a RAID group, ONTAP can replace the failed disk with a
spare disk and use parity data to reconstruct the data of the failed disk.
• RAID4
With RAID4 protection, ONTAP can use one spare disk to replace and reconstruct the data from one failed
disk within the RAID group.
• RAID-DP
With RAID-DP protection, ONTAP can use up to two spare disks to replace and reconstruct the data from
up to two simultaneously failed disks within the RAID group.
• RAID-TEC
With RAID-TEC protection, ONTAP can use up to three spare disks to replace and reconstruct the data
from up to three simultaneously failed disks within the RAID group.
Some local tier (aggregate) administration tasks require that you know what types of
drives compose the local tier, their size, checksum, and status, whether they are shared
with other local tiers, and the size and composition of the RAID groups.
Step
1. Show the drives for the aggregate, by RAID group:
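For example, assuming an aggregate named aggr1:
storage aggregate show-status -aggregate aggr1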
The drives are displayed for each RAID group in the aggregate.
You can see the RAID type of the drive (data, parity, dparity) in the Position column. If the Position
column displays shared, then the drive is shared: if it is an HDD, it is a partitioned disk; if it is an SSD, it is
part of a storage pool.
Example: A Flash Pool aggregate using an SSD storage pool and data partitions
Usable Physical
Position Disk Pool Type RPM Size Size Status
-------- ---------- ---- ----- ------ -------- -------- -------
shared 2.0.1 0 SAS 10000 472.9GB 547.1GB (normal)
shared 2.0.3 0 SAS 10000 472.9GB 547.1GB (normal)
shared 2.0.5 0 SAS 10000 472.9GB 547.1GB (normal)
shared 2.0.7 0 SAS 10000 472.9GB 547.1GB (normal)
shared 2.0.9 0 SAS 10000 472.9GB 547.1GB (normal)
shared 2.0.11 0 SAS 10000 472.9GB 547.1GB (normal)
Usable Physical
Position Disk Pool Type RPM Size Size Status
-------- ---------- ---- ----- ------ -------- -------- -------
shared 2.0.13 0 SSD - 186.2GB 745.2GB (normal)
shared 2.0.12 0 SSD - 186.2GB 745.2GB (normal)
If you want the added protection of triple-parity, you can convert from RAID-DP to RAID-
TEC. RAID-TEC is recommended if the size of the disks used in your local tier
(aggregate) is greater than 4 TiB.
What you’ll need
The local tier (aggregate) that is to be converted must have a minimum of seven disks.
Steps
1. Verify that the aggregate is online and has a minimum of six disks:
storage aggregate show-status -aggregate aggregate_name
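The conversion itself is performed with the storage aggregate modify command; for example:
storage aggregate modify -aggregate aggregate_name -raidtype raid_tec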
If you reduce the size of your local tier (aggregate) and no longer need triple parity, you
can convert your RAID policy from RAID-TEC to RAID-DP and reduce the number of
disks you need for RAID parity.
What you’ll need
The maximum RAID group size for RAID-TEC is larger than the maximum RAID group size for RAID-DP. If the
largest RAID-TEC group size is not within the RAID-DP limits, you cannot convert to RAID-DP.
Steps
1. Verify that the aggregate is online and has a minimum of six disks:
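A sketch of the verification and conversion commands (aggregate_name is a placeholder):
storage aggregate show-status -aggregate aggregate_name
storage aggregate modify -aggregate aggregate_name -raidtype raid_dp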
Configuring an optimum RAID group size requires a trade-off of factors. You must decide
which factors—speed of RAID rebuild, assurance against risk of data loss due to drive
failure, optimizing I/O performance, and maximizing data storage space—are most
important for the local tier (aggregate) that you are configuring.
When you create larger RAID groups, you maximize the space available for data storage for the same amount
of storage used for parity (also known as the “parity tax”). On the other hand, when a disk fails in a larger RAID
group, reconstruction time is increased, impacting performance for a longer period of time. In addition, having
more disks in a RAID group increases the probability of a multiple disk failure within the same RAID group.
HDD or array LUN RAID groups
You should follow these guidelines when sizing your RAID groups composed of HDDs or array LUNs:
• All RAID groups in an local tier (aggregate) should have the same number of disks.
While the number of disks in different RAID groups on one local tier can differ by up to 50%, this might
lead to performance bottlenecks in some cases, so it is best avoided.
• The recommended range of RAID group disk numbers is between 12 and 20.
The reliability of performance disks can support a RAID group size of up to 28, if needed.
• If you can satisfy the first two guidelines with multiple RAID group disk numbers, you should choose the
larger number of disks.
The SSD RAID group size can be different from the RAID group size for the HDD RAID groups in a Flash Pool
local tier (aggregate). Usually, you should ensure that you have only one SSD RAID group for a Flash Pool
local tier, to minimize the number of SSDs required for parity.
You should follow these guidelines when sizing your RAID groups composed of SSDs:
• All RAID groups in a local tier (aggregate) should have a similar number of drives.
The RAID groups do not have to be exactly the same size, but you should avoid having any RAID group
that is less than one half the size of other RAID groups in the same local tier when possible.
• For RAID-DP, the recommended range of RAID group size is between 20 and 28.
You can customize the size of your RAID groups to ensure that your RAID group sizes
are appropriate for the amount of storage you plan to include for a local tier (aggregate).
About this task
For standard local tiers (aggregates), you change the size of RAID groups for each local tier separately. For
Flash Pool local tiers, you can change the RAID group size for the SSD RAID groups and the HDD RAID
groups independently.
The following list outlines some facts about changing the RAID group size:
• By default, if the number of disks or array LUNs in the most recently created RAID group is less than the
new RAID group size, disks or array LUNs will be added to the most recently created RAID group until it
reaches the new size.
• All other existing RAID groups in that local tier remain the same size, unless you explicitly add disks to
them.
• You can never cause a RAID group to become larger than the current maximum RAID group size for the
local tier.
• You cannot decrease the size of already created RAID groups.
• The new size applies to all RAID groups in that local tier (or, in the case of a Flash Pool local tier, all RAID
groups for the affected RAID group type—SSD or HDD).
Steps
1. Use the applicable command:
Change the maximum size of any other RAID groups    storage aggregate modify -aggregate aggr_name -maxraidsize size
Examples
The following command changes the maximum RAID group size of the aggregate n1_a4 to 20 disks or array
LUNs:
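Following the syntax shown above, this would be:
storage aggregate modify -aggregate n1_a4 -maxraidsize 20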
The following command changes the maximum RAID group size of the SSD cache RAID groups of the Flash
Pool aggregate n1_cache_a2 to 24:
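A sketch of that command, assuming the SSD cache RAID group size is set with the -cache-raid-group-size parameter:
storage aggregate modify -aggregate n1_cache_a2 -cache-raid-group-size 24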
You can perform various procedures to manage Flash Pool tiers (aggregates) in your
system.
• Caching policies
◦ Flash Pool local tier (aggregate) caching policies
◦ Manage Flash Pool caching policies
• SSD partitioning
◦ Flash Pool SSD partitioning for Flash Pool local tiers (aggregates) using storage pools
• Candidacy and cache size
◦ Determine Flash Pool candidacy and optimal cache size
• Flash Pool creation
◦ Create a Flash Pool local tier (aggregate) using physical SSDs
◦ Create a Flash Pool local tier (aggregate) using SSD storage pools
Caching policies for the volumes in a Flash Pool local tier (aggregate) let you deploy
Flash as a high performance cache for your working data set while using lower-cost
HDDs for less frequently accessed data. If you are providing cache to two or more Flash
Pool local tiers, you should use Flash Pool SSD partitioning to share SSDs across the
local tiers in the Flash Pool.
Caching policies are applied to volumes that reside in Flash Pool local tiers. You should understand how
caching policies work before changing them.
In most cases, the default caching policy of “auto” is the best caching policy to use. The caching policy should
be changed only if a different policy provides better performance for your workload. Configuring the wrong
caching policy can severely degrade volume performance; the performance degradation could increase
gradually over time.
Caching policies combine a read caching policy and a write caching policy. The policy name concatenates the
names of the read caching policy and the write caching policy, separated by a hyphen. If there is no hyphen in
the policy name, the write caching policy is “none”, except for the “auto” policy.
Read caching policies optimize for future read performance by placing a copy of the data in the cache in
addition to the stored data on HDDs. For read caching policies that insert data into the cache for write
operations, the cache operates as a write-through cache.
Data inserted into the cache by using the write caching policy exists only in cache; there is no copy in HDDs.
Flash Pool cache is RAID protected. Enabling write caching makes data from write operations available for
reads from cache immediately, while deferring writing the data to HDDs until it ages out of the cache.
If you move a volume from a Flash Pool local tier to a single-tier local tier, it loses its caching policy; if you later
move it back to a Flash Pool local tier, it is assigned the default caching policy of “auto”. If you move a volume
between two Flash Pool local tiers, the caching policy is preserved.
You can use the CLI to change the caching policy for a volume that resides on a Flash Pool local tier by using
the -caching-policy parameter with the volume create command.
When you create a volume on a Flash Pool local tier, by default, the “auto” caching policy is assigned to the
volume.
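For example, a sketch of setting a caching policy explicitly at creation time (the SVM, volume, and local tier names are placeholders):
volume create -vserver vs1 -volume vol1 -aggregate fp_aggr1 -size 100g -caching-policy none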
Using the CLI, you can perform various procedures to manage Flash Pool caching
policies in your system.
• Preparation
◦ Determine whether to modify the caching policy of Flash Pool local tiers (aggregates)
• Caching policies modification
◦ Modify caching policies of Flash Pool local tiers (aggregates)
◦ Set the cache-retention policy for Flash Pool local tiers (aggregates)
Determine whether to modify the caching policy of Flash Pool local tiers (aggregates)
You can assign cache-retention policies to volumes in Flash Pool local tiers (aggregates)
to determine how long the volume data remains in the Flash Pool cache. However, in
some cases changing the cache-retention policy might not impact the amount of time the
volume’s data remains in the cache.
About this task
If your data meets any of the following conditions, changing your cache-retention policy might not have an
impact:
Steps
The following steps check for the conditions that must be met by the data. The task must be done using the
CLI in advanced privilege mode.
4. Determine the Cacheable Read and Project Cache Alloc of the volume:
If the hit rate of the volume is greater than the Cacheable Read, then your workload does not reread
random blocks cached in the SSDs.
7. Compare the volume’s current cache size to the Project Cache Alloc.
If the current cache size of the volume is greater than the Project Cache Alloc, then the size of your
volume cache is too small.
Modify caching policies of Flash Pool local tiers (aggregates)
You should modify the caching policy of a volume only if a different caching policy is
expected to provide better performance. You can modify the caching policy of a volume
on a Flash Pool local tier (aggregate).
What you’ll need
You must determine whether you want to modify your caching policy.
Step
1. Use the CLI to modify the volume’s caching policy:
Example
The following example modifies the caching policy of a volume named “vol2” to the policy “none”:
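A sketch of that command, assuming the volume belongs to an SVM named vs1:
volume modify -vserver vs1 -volume vol2 -caching-policy none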
Set the cache-retention policy for Flash Pool local tiers (aggregates)
You can assign cache-retention policies to volumes in Flash Pool local tiers (aggregates).
Data in volumes with a high cache-retention policy remains in cache longer and data in
volumes with a low cache-retention policy is removed sooner. This increases
performance of your critical workloads by making high priority information accessible at a
faster rate for a longer period of time.
What you’ll need
You should know whether your system has any conditions that might prevent the cache-retention policy from
having an impact on how long your data remains in cache.
Steps
Use the CLI in advanced privilege mode to perform the following steps:
ONTAP Version    Command
ONTAP 9.0, 9.1    priority hybrid-cache set volume_name read-cache=read_cache_value write-cache=write_cache_value cache-retention-priority=cache_retention_policy
4. Verify that the volume’s cache-retention policy is changed to the option you selected.
5. Return the privilege setting to admin:
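For example:
set -privilege admin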
Flash Pool SSD partitioning for Flash Pool local tiers (aggregates) using storage pools
If you are providing cache to two or more Flash Pool local tiers (aggregates), you should
use Flash Pool Solid-State Drive (SSD) partitioning. Flash Pool SSD partitioning allows
SSDs to be shared by all the local tiers that use the Flash Pool. This spreads the cost of
parity over multiple local tiers, increases SSD cache allocation flexibility, and maximizes
SSD performance.
For an SSD to be used in a Flash Pool local tier, the SSD must be placed in a storage pool. You cannot use
SSDs that have been partitioned for root-data partitioning in a storage pool. After the SSD is placed in the
storage pool, the SSD can no longer be managed as a stand-alone disk and cannot be removed from the
storage pool unless you destroy the local tiers associated with the Flash Pool and you destroy the storage
pool.
SSD storage pools are divided into four equal allocation units. SSDs added to the storage pool are divided into
four partitions and one partition is assigned to each of the four allocation units. The SSDs in the storage pool
must be owned by the same HA pair. By default, two allocation units are assigned to each node in the HA pair.
Allocation units must be owned by the node that owns the local tier it is serving. If more Flash cache is required
for local tiers on one of the nodes, the default number of allocation units can be shifted to decrease the number
on one node and increase the number on the partner node.
You use spare SSDs to add to an SSD storage pool. If the storage pool provides allocation units to Flash Pool
local tiers owned by both nodes in the HA pair, then the spare SSDs can be owned by either node. However, if
the storage pool provides allocation units only to Flash Pool local tiers owned by one of the nodes in the HA
pair, then the SSD spares must be owned by that same node.
The following illustration is an example of Flash Pool SSD partitioning. The SSD storage pool provides cache
to two Flash Pool local tiers:
Storage pool SP1 is composed of five SSDs and a hot spare SSD. Two of the storage pool’s allocation units
are allocated to Flash Pool FP1, and two are allocated to Flash Pool FP2. FP1 has a cache RAID type of
RAID4. Therefore, the allocation units provided to FP1 contain only one partition designated for parity. FP2 has
a cache RAID type of RAID-DP. Therefore, the allocation units provided to FP2 include a parity partition and a
double-parity partition.
In this example, two allocation units are allocated to each Flash Pool local tier. However, if one Flash Pool local
tier required a larger cache, you could allocate three of the allocation units to that Flash Pool local tier, and only
one to the other.
Before converting an existing local tier (aggregate) to a Flash Pool local tier, you can
determine whether the local tier is I/O bound and the best Flash Pool cache size for your
workload and budget. You can also check whether the cache of an existing Flash Pool
local tier is sized correctly.
What you’ll need
You should know approximately when the local tier you are analyzing experiences its peak load.
Steps
1. Enter advanced mode:
set advanced
2. If you need to determine whether an existing local tier (aggregate) would be a good candidate for
conversion to a Flash Pool aggregate, determine how busy the disks in the aggregate are during a period
of peak load, and how that is affecting latency:
-counter disk_busy|user_read_latency -interval 1 -iterations 60
You can decide whether reducing latency by adding Flash Pool cache makes sense for this aggregate.
The following command shows the statistics for the first RAID group of the aggregate “aggr1”:
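A representative form of that command (the RAID group instance path /aggr1/plex0/rg0 is an assumed example):
statistics show-periodic -object disk:raid_group -instance /aggr1/plex0/rg0 -counter disk_busy|user_read_latency -interval 1 -iterations 60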
AWA begins collecting workload data for the volumes associated with the specified aggregate.
set admin
Allow AWA to run until one or more intervals of peak load have occurred. AWA collects workload statistics
for the volumes associated with the specified aggregate, and analyzes data for up to one rolling week in
duration. Running AWA for more than one week will report only on data collected from the most recent
week. Cache size estimates are based on the highest loads seen during the data collection period; the load
does not need to be high for the entire data collection period.
set advanced
7. Stop AWA:
set admin
You create a Flash Pool local tier (aggregate) by enabling the feature on an existing local
tier composed of HDD RAID groups, and then adding one or more SSD RAID groups to
that local tier. This results in two sets of RAID groups for that local tier: SSD RAID groups
(the SSD cache) and HDD RAID groups.
About this task
After you add an SSD cache to an local tier to create a Flash Pool local tier, you cannot remove the SSD cache
to convert the local tier back to its original configuration.
By default, the RAID level of the SSD cache is the same as the RAID level of the HDD RAID groups. You can
override this default selection by specifying the “raidtype” option when you add the first SSD RAID groups.
Using fewer RAID groups in the SSD cache reduces the number of parity disks required, but larger RAID
groups require RAID-DP.
• You must have determined the RAID level you want to use for the SSD cache.
• You must have determined the maximum cache size for your system and determined that adding SSD
cache to your local tier will not cause you to exceed it.
• You must have familiarized yourself with the configuration requirements for Flash Pool local tiers.
Steps
You can create a Flash Pool aggregate using System Manager or the ONTAP CLI.
System Manager
Beginning with ONTAP 9.12.1, you can use System Manager to create a Flash Pool local tier using
physical SSDs.
Steps
1. Select Storage > Tiers then select an existing local HDD storage tier.
2. Select the options menu, then select Add Flash Pool Cache.
3. Select Use dedicated SSDs as cache.
4. Select a disk type and the number of disks.
5. Choose a RAID type.
6. Select Save.
7. Locate the storage tier, then select the options menu.
8. Select More Details. Verify that Flash Pool shows as Enabled.
CLI
Steps
1. Mark the local tier (aggregate) as eligible to become a Flash Pool aggregate:
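For example, assuming an aggregate named aggr1:
storage aggregate modify -aggregate aggr1 -hybrid-enabled true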
If this step does not succeed, determine write-caching eligibility for the target aggregate.
2. Add the SSDs to the aggregate by using the storage aggregate add command (an example sketch follows this list).
◦ You can specify the SSDs by ID or by using the diskcount and disktype parameters.
◦ If the HDDs and the SSDs do not have the same checksum type, or if the aggregate is a mixed-
checksum aggregate, then you must use the checksumstyle parameter to specify the
checksum type of the disks you are adding to the aggregate.
◦ You can specify a different RAID type for the SSD cache by using the raidtype parameter.
◦ If you want the cache RAID group size to be different from the default for the RAID type you are
using, you should change it now, by using the -cache-raid-group-size parameter.
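A minimal sketch of step 2, assuming an aggregate named aggr1 and four available spare SSDs (the name and count are illustrative):
storage aggregate add aggr1 -disktype SSD -diskcount 4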
Create a Flash Pool local tier (aggregate) using SSD storage pools
Overview of creating a Flash Pool local tier (aggregate) using SSD storage pools
You can perform various procedures to create a Flash Pool local tier (aggregate) using
SSD storage pools:
• Preparation
◦ Determine whether a Flash Pool local tier (aggregate) is using an SSD storage pool
• SSD storage pool creation
◦ Create an SSD storage pool
◦ Add SSDs to an SSD storage pool
• Flash Pool creation using SSD storage pools
◦ Create a Flash Pool local tier (aggregate) using SSD storage pool allocation units
◦ Determine the impact to cache size of adding SSDs to an SSD storage pool
Determine whether a Flash Pool local tier (aggregate) is using an SSD storage pool
You can configure a Flash Pool (local tier) aggregate by adding one or more allocation
units from an SSD storage pool to an existing HDD local tier.
You manage Flash Pool local tiers differently when they use SSD storage pools to provide their cache than
when they use discrete SSDs.
Step
1. Display the aggregate’s drives by RAID group:
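For example, assuming an aggregate named aggr1:
storage aggregate show-status -aggregate aggr1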
If the aggregate is using one or more SSD storage pools, the value for the Position column for the SSD
RAID groups is displayed as Shared, and the name of the storage pool is displayed next to the RAID
group name.
You can provision cache by converting an existing local tier (aggregate) to a Flash Pool
local tier (aggregate) by adding solid state drives (SSDs).
You can create solid state drive (SSD) storage pools to provide SSD cache for two to four Flash Pool local tiers
(aggregates). Flash Pool aggregates enable you to deploy flash as high performance cache for your working
data set while using lower-cost HDDs for less frequently accessed data.
• The SSDs used in the storage pool should be the same size.
System Manager
Use System Manager to add an SSD cache (ONTAP 9.12.1 and later)
Beginning with ONTAP 9.12.1, you can use System Manager to add an SSD cache.
Steps
1. Click Cluster > Disks and then click Show/Hide.
2. Select Type and verify that spare SSDs exist on the cluster.
3. Click Storage > Tiers and click Add Storage Pool.
4. Select the disk type.
5. Enter a disk size.
6. Select the number of disks to add to the storage pool.
7. Review the estimated cache size.
Use the CLI procedure if you are using an ONTAP version later than ONTAP 9.7 but
earlier than ONTAP 9.12.1.
Steps
1. Click Return to classic version.
2. Click Storage > Aggregates & Disks > Aggregates.
3. Select the local tier (aggregate), and then click Actions > Add Cache.
4. Select the cache source as "storage pools" or "dedicated SSDs".
5. Click Switch to the new experience.
6. Click Storage > Tiers to verify the size of the new aggregate.
CLI
Use the CLI to create an SSD storage pool
Steps
1. Determine the names of the available spare SSDs:
The SSDs used in a storage pool can be owned by either node of an HA pair.
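A sketch of listing the spare SSDs and creating a pool from them (the pool name and disk names are placeholders):
storage disk show -container-type spare -type SSD
storage pool create -storage-pool sp1 -disk-list 2.0.13,2.0.14,2.0.15,2.0.16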
storage pool show -storage-pool sp_name
Results
After the SSDs are placed into the storage pool, they no longer appear as spares on the cluster, even though
the storage provided by the storage pool has not yet been allocated to any Flash Pool caches. You cannot add
SSDs to a RAID group as discrete drives; their storage can be provisioned only by using the allocation units of
the storage pool to which they belong.
Create a Flash Pool local tier (aggregate) using SSD storage pool allocation units
You can configure a Flash Pool local tier (aggregate) by adding one or more allocation
units from an SSD storage pool to an existing HDD local tier.
Beginning with ONTAP 9.12.1, you can use the redesigned System Manager to create a Flash Pool local tier
using storage pool allocation units.
Any allocation unit from the storage pool that you want to use must be owned by the same node that owns
the Flash Pool local tier.
• You must have determined how much cache you want to add to the local tier.
You add cache to the local tier by allocation units. You can increase the size of the allocation units later by
adding SSDs to the storage pool if there is room.
• You must have determined the RAID type you want to use for the SSD cache.
After you add a cache to the local tier from SSD storage pools, you cannot change the RAID type of the
cache RAID groups.
• You must have determined the maximum cache size for your system and determined that adding SSD
cache to your local tier will not cause you to exceed it.
You can see the amount of cache that will be added to the total cache size by using the storage pool
show command.
• You must have familiarized yourself with the configuration requirements for Flash Pool local tier.
After you add an SSD cache to a local tier to create a Flash Pool local tier, you cannot remove the SSD cache
to convert the local tier back to its original configuration.
System Manager
Beginning with ONTAP 9.12.1, you can use System Manager to create a Flash Pool local tier using SSD storage pool allocation units.
Steps
1. Click Storage > Tiers and select an existing local HDD storage tier.
2. Click the options menu and select Add Flash Pool Cache.
3. Select Use Storage Pools.
4. Select a storage pool.
5. Select a cache size and RAID configuration.
6. Click Save.
7. Locate the storage tier again and click the options menu.
8. Select More Details and verify that the Flash Pool shows as Enabled.
CLI
Steps
1. Mark the aggregate as eligible to become a Flash Pool aggregate:
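For example, assuming an aggregate named aggr1:
storage aggregate modify -aggregate aggr1 -hybrid-enabled true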
If this step does not succeed, determine write-caching eligibility for the target aggregate.
If you want the RAID type of the cache to be different from that of the HDD RAID groups, you must
change it when you enter this command by using the raidtype parameter.
You do not need to specify a new RAID group; ONTAP automatically puts the SSD cache into
separate RAID groups from the HDD RAID groups.
You cannot set the RAID group size of the cache; it is determined by the number of SSDs in the
storage pool.
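A sketch of adding the cache, assuming an aggregate named aggr1, a storage pool named sp1, and two allocation units:
storage aggregate add aggr1 -storage-pool sp1 -allocation-units 2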
The cache is added to the aggregate and the aggregate is now a Flash Pool aggregate. Each
allocation unit added to the aggregate becomes its own RAID group.
The size of the cache is listed under Total Hybrid Cache Size.
Related information
NetApp Technical Report 4070: Flash Pool Design and Implementation Guide
Determine the impact to cache size of adding SSDs to an SSD storage pool
If adding SSDs to a storage pool causes your platform model’s cache limit to be
exceeded, ONTAP does not allocate the newly added capacity to any Flash Pool local
tiers (aggregates). This can result in some or all of the newly added capacity being
unavailable for use.
About this task
When you add SSDs to an SSD storage pool that has allocation units already allocated to Flash Pool local tiers
(aggregates), you increase the cache size of each of those local tiers and the total cache on the system. If
none of the storage pool’s allocation units have been allocated, adding SSDs to that storage pool does not
affect the SSD cache size until one or more allocation units are allocated to a cache.
Steps
1. Determine the usable size of the SSDs you are adding to the storage pool:
2. Determine how many allocation units remain unallocated for the storage pool:
3. Calculate the amount of cache that will be added by applying the following formula:
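For steps 1 and 2, representative commands are shown below (the disk name is a placeholder; the usable size appears in the storage disk show output):
storage disk show -disk 2.0.20
storage pool show-available-capacity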
When you add solid state drives (SSDs) to an SSD storage pool, you increase the
storage pool’s physical and usable sizes and allocation unit size. The larger allocation
unit size also affects allocation units that have already been allocated to local tiers
(aggregates).
What you’ll need
You must have determined that this operation will not cause you to exceed the cache limit for your HA pair.
ONTAP does not prevent you from exceeding the cache limit when you add SSDs to an SSD storage pool, and
doing so can render the newly added storage capacity unavailable for use.
The SSDs you add to the storage pool must be the same size as the disks currently used in the storage pool.
System Manager
Beginning with ONTAP 9.12.1, you can use System Manager to add SSDs to an SSD storage pool.
Steps
1. Click Storage > Tiers and locate the Storage Pools section.
2. Locate the storage pool, click the options menu, and select Add Disks.
3. Choose the disk type and select the number of disks.
4. Review the estimated cache size.
CLI
Steps
1. Optional: View the current allocation unit size and available storage for the storage pool:
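A sketch of this step and of the addition itself, assuming a storage pool named sp1 and two spare SSDs (disk names are placeholders):
storage pool show -storage-pool sp1 -instance
storage pool add -storage-pool sp1 -disk-list 2.0.20,2.0.21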
The system displays which Flash Pool aggregates will have their size increased by this operation and
by how much, and prompts you to confirm the operation.
ONTAP provides the storage pool command for managing SSD storage pools.
Display how much cache would be added to the overall cache capacity for both RAID types (allocation unit data size)    storage pool show -instance
Display the unallocated allocation units for a storage pool    storage pool show-available-capacity
Change the ownership of one or more allocation units of a storage pool from one HA partner to the other    storage pool reassign
Related information
• ONTAP command reference
The cloud tier can be located on NetApp StorageGRID or ONTAP S3 (beginning with ONTAP 9.8), or one of
the following service providers:
• Alibaba cloud
• Amazon S3
• Amazon Commercial Cloud Services
• Google Cloud
• IBM cloud
• Microsoft Azure Blob Storage
Beginning with ONTAP 9.7, additional object store providers that support generic S3 APIs can
be used by selecting the S3_Compatible object store provider.
Related information
See also the NetApp Cloud Tiering documentation.
The performance tier uses high-performance primary storage, such as an all flash (all SSD) aggregate
of the storage system.
◦ Infrequently accessed (“cold”) data is stored in the cloud tier, also known as the capacity tier.
The cloud tier uses an object store that is less costly and does not require high performance.
• You have the flexibility to specify the tier in which data should be stored.
You can specify one of the supported tiering policy options at the volume level. The options enable you to
efficiently move data across tiers as data becomes hot or cold.
• You can choose one of the supported object stores to use as the cloud tier for FabricPool.
• You can monitor the space utilization in a FabricPool-enabled aggregate.
• You can see how much data in a volume is inactive by using inactive data reporting.
• You can reduce the on-premises footprint of the storage system.
You save physical space when you use a cloud-based object store for the cloud tier.
ONTAP 9.2
ONTAP 9.4
• You must be running ONTAP 9.4 or later releases for the following FabricPool functionality:
◦ The auto tiering policy
◦ Specifying the tiering minimum cooling period
◦ Inactive data reporting (IDR)
◦ Using Microsoft Azure Blob Storage for the cloud as the cloud tier for FabricPool
◦ Using FabricPool with ONTAP Select
ONTAP 9.5
• You must be running ONTAP 9.5 or later releases for the following FabricPool functionality:
◦ Specifying the tiering fullness threshold
◦ Using IBM Cloud Object Storage as the cloud tier for FabricPool
◦ NetApp Volume Encryption (NVE) of the cloud tier, enabled by default.
ONTAP 9.6
• You must be running ONTAP 9.6 or later releases for the following FabricPool functionality:
◦ The all tiering policy
◦ Inactive data reporting enabled manually on HDD aggregates
◦ Inactive data reporting enabled automatically for SSD aggregates when you upgrade to ONTAP 9.6
and at the time an aggregate is created, except on low-end systems with fewer than 4 CPUs, less
than 6 GB of RAM, or a WAFL-buffer-cache size of less than 3 GB.
ONTAP monitors system load, and if the load remains high for 4 continuous minutes, IDR is disabled
and is not automatically reenabled. You can reenable IDR manually; however, manually enabled IDR is
not automatically disabled.
◦ Using Alibaba Cloud Object Storage as the cloud tier for FabricPool
◦ Using Google Cloud Platform as the cloud tier for FabricPool
◦ Volume move without cloud tier data copy
ONTAP 9.7
• You must be running ONTAP 9.7 or later releases for the following FabricPool functionality:
◦ Non transparent HTTP and HTTPS proxy to provide access to only whitelisted access points, and to
provide auditing and reporting capabilities.
◦ FabricPool mirroring to tier cold data to two object stores simultaneously
◦ FabricPool mirrors on MetroCluster configurations
◦ NDMP dump and restore enabled by default on FabricPool attached aggregates.
If the backup application uses a protocol other than NDMP, such as NFS or SMB, all
data being backed up in the performance tier becomes hot and can affect tiering of that
data to the cloud tier. Non-NDMP reads can cause data migration from the cloud tier
back to the performance tier.
ONTAP 9.8
• You must be running ONTAP 9.8 or later for the following FabricPool functionality:
◦ Cloud retrieval
◦ FabricPool with SnapLock Enterprise. FabricPool with SnapLock Enterprise requires a Feature Product
Variance Request (FPVR). To create an FPVR, please contact your sales team.
◦ Minimum cooling period maximum of 183 days
◦ Object tagging using user-created custom tags
◦ HDD FabricPool aggregates
HDD FabricPools are supported with SAS, FSAS, BSAS and MSATA disks only on systems with 6 or
more CPU cores.
ONTAP 9.10.1
• You must be running ONTAP 9.10.1 or later for the following FabricPool functionality:
◦ PUT throttling
◦ Temperature-sensitive storage efficiency (TSSE).
ONTAP 9.12.1
• You must be running ONTAP 9.12.1 or later for the following FabricPool functionality:
◦ SVM Migrate
◦ Support for FabricPool, FlexGroup, and SVM-DR working in conjunction. (Prior to 9.12.1 any two of
these features worked together, but not all three in conjunction.)
ONTAP 9.14.1
• You must be running ONTAP 9.14.1 or later for the following FabricPool functionality:
◦ Cloud Write
◦ Aggressive Readahead
Platforms
• FabricPool is supported on all platforms capable of running ONTAP 9.2 except for the following:
◦ FAS8020
◦ FAS2554
◦ FAS2552
◦ FAS2520
• On AFF systems, you can only use SSD aggregates for FabricPool.
• On FAS systems, you can use either SSD or HDD aggregates for FabricPool.
• On Cloud Volumes ONTAP and ONTAP Select, you can use either SSD or HDD aggregates for FabricPool.
Using SSD aggregates is recommended.
Flash Pool aggregates, which contain both SSDs and HDDs, are not supported.
Cloud tiers
FabricPool supports using the following object stores as the cloud tier:
Glacier Flexible Retrieval and Glacier Deep Archive are not supported.
• The object store “bucket” (container) you plan to use must have already been set up, must have at least 10
GB of storage space, and must not be renamed.
• HA pairs that use FabricPool require intercluster LIFs to communicate with the object store.
• You cannot detach a cloud tier from a local tier after it is attached; however, you can use FabricPool mirror
to attach a local tier to a different cloud tier.
Storage efficiencies such as compression, deduplication, and compaction are preserved when moving data to
the cloud tier, reducing required object storage capacity and transport costs.
Beginning in ONTAP 9.15.1, FabricPool supports Intel QuickAssist Technology (QAT4) which
provides more aggressive, and more performant, storage efficiency savings.
Aggregate inline deduplication is supported on the local tier, but associated storage efficiencies are not carried
over to objects stored on the cloud tier.
When using the All volume tiering policy, storage efficiencies associated with background deduplication
processes might be reduced as data is likely to be tiered before the additional storage efficiencies can be
applied.
FabricPool requires a capacity-based license when attaching third-party object storage providers (such as
Amazon S3) as cloud tiers for AFF and FAS systems. A BlueXP Tiering license is not required when using
StorageGRID or ONTAP S3 as the cloud tier or when tiering with Cloud Volumes ONTAP, Amazon FSx for
NetApp ONTAP, or Azure NetApp files.
StorageGRID’s consistency controls affect how the metadata that StorageGRID uses to track objects is
distributed between nodes and the availability of objects for client requests. NetApp recommends using
the default, read-after-new-write, consistency control for buckets used as FabricPool targets.
Do not use the available consistency control for buckets used as FabricPool targets.
When tiering data that is accessed by SAN protocols, NetApp recommends using private clouds, like ONTAP
S3 or StorageGRID, due to connectivity considerations.
You should be aware that when using FabricPool in a SAN environment with a Windows host, if
the object storage becomes unavailable for an extended period of time when tiering data to the
cloud, files on the NetApp LUN on the Windows host might become inaccessible or disappear.
See the Knowledge Base article During FabricPool S3 object store unavailable Windows SAN
host reported filesystem corruption.
Quality of Service
• If you use throughput floors (QoS Min), the tiering policy on the volumes must be set to none before the
aggregate can be attached to FabricPool.
Other tiering policies prevent the aggregate from being attached to FabricPool. A QoS policy will not
enforce throughput floors when FabricPool is enabled.
FabricPool supports StorageGRID’s Information Lifecycle Management policies only for data replication
and erasure coding to protect cloud tier data from failure. However, FabricPool does not support advanced
ILM rules such as filtering based on user metadata or tags. ILM typically includes various movement and
deletion policies. These policies can be disruptive to the data in the cloud tier of FabricPool. Using
FabricPool with ILM policies that are configured on object stores can result in data loss.
• 7-Mode data transition using the ONTAP CLI commands or the 7-Mode Transition Tool
• FlexArray Virtualization
• RAID SyncMirror, except in a MetroCluster configuration
• SnapLock volumes when using ONTAP 9.7 and earlier releases
• Tape backup using SMTape for FabricPool-enabled aggregates
• The Auto Balance functionality
• Volumes using a space guarantee other than none
With the exception of root SVM volumes and CIFS audit staging volumes, FabricPool does not support
attaching a cloud tier to an aggregate that contains volumes using a space guarantee other than none. For
example, a volume using a space guarantee of volume (-space-guarantee volume) is not supported.
FabricPool tiering policies determine when or whether the user data blocks of a volume in FabricPool are
moved to the cloud tier, based on the volume “temperature” of hot (active) or cold (inactive). The volume
“temperature” increases when it is accessed frequently and decreases when it is not. Some tiering policies
have an associated tiering minimum cooling period, which sets the time that user data in a volume of
FabricPool must remain inactive for the data to be considered “cold” and moved to the cloud tier.
After a block has been identified as cold, it is marked as eligible to be tiered. A daily background tiering scan
looks for cold blocks. When enough 4KB blocks from the same volume have been collected, they are
concatenated into a 4MB object and moved to the cloud tier based on the volume tiering policy.
Data in volumes using the all tiering policy is immediately marked as cold and begins tiering to
the cloud tier as soon as possible. It does not need to wait for the daily tiering scan to run.
You can use the volume object-store tiering show command to view the tiering status of a
FabricPool volume. For more information, see the Command Reference.
The FabricPool tiering policy is specified at the volume level. Four options are available:
• The snapshot-only tiering policy (the default) moves user data blocks of the volume Snapshot copies
that are not associated with the active file system to the cloud tier.
The tiering minimum cooling period is 2 days. You can modify the default setting for the tiering minimum
cooling period with the -tiering-minimum-cooling-days parameter in the advanced privilege level of
the volume create and volume modify commands. Valid values are 2 to 183 days using ONTAP 9.8
and later. If you are using a version of ONTAP earlier than 9.8, valid values are 2 to 63 days.
• The auto tiering policy, supported only on ONTAP 9.4 and later releases, moves cold user data blocks in
both the Snapshot copies and the active file system to the cloud tier.
The default tiering minimum cooling period is 31 days and applies to the entire volume, for both the active
file system and the Snapshot copies.
You can modify the default setting for the tiering minimum cooling period with the -tiering-minimum
-cooling-days parameter in the advanced privilege level of the volume create and volume modify
commands. Valid values are 2 to 183 days.
• The all tiering policy, supported only with ONTAP 9.6 and later, moves all user data blocks in both the
active file system and Snapshot copies to the cloud tier. It replaces the backup tiering policy.
The all volume tiering policy should not be used on read/write volumes that have normal client traffic.
The tiering minimum cooling period does not apply because the data moves to the cloud tier as soon as
the tiering scan runs, and you cannot modify the setting.
• The none tiering policy keeps a volume’s data in the performance tier and does not move cold data to the cloud
tier.
Setting the tiering policy to none prevents new tiering. Volume data that has previously been moved to the
cloud tier remains in the cloud tier until it becomes hot and is automatically moved back to the local tier.
The tiering minimum cooling period does not apply because the data never moves to the cloud tier, and
you cannot modify the setting.
When cold blocks in a volume with a tiering policy set to none are read, they are made hot and written to
the local tier.
The volume show command output shows the tiering policy of a volume. A volume that has never been used
with FabricPool shows the none tiering policy in the output.
What happens when you modify the tiering policy of a volume in FabricPool
You can modify the tiering policy of a volume by performing a volume modify operation. You must
understand how changing the tiering policy might affect how long it takes for data to become cold and be
moved to the cloud tier.
• Changing the tiering policy from snapshot-only or none to auto causes ONTAP to send user data
blocks in the active file system that are already cold to the cloud tier, even if those user data blocks were
not previously eligible for the cloud tier.
• Changing the tiering policy to all from another policy causes ONTAP to move all user blocks in the active
file system and in the Snapshot copies to the cloud as soon as possible. Prior to ONTAP 9.8, blocks
needed to wait until the next tiering scan ran.
• Changing the tiering policy from auto to snapshot-only or none does not cause active file system
blocks that are already moved to the cloud tier to be moved back to the performance tier.
Volume reads are needed for the data to be moved back to the performance tier.
• Any time you change the tiering policy on a volume, the tiering minimum cooling period is reset to the
default value for the policy.
• Unless you explicitly specify a different tiering policy, a volume retains its original tiering policy when it is
moved in and out of a FabricPool-enabled aggregate.
However, the tiering policy takes effect only when the volume is in a FabricPool-enabled aggregate.
• The existing value of the -tiering-minimum-cooling-days parameter for a volume moves with the
volume unless you specify a different tiering policy for the destination.
If you specify a different tiering policy, then the volume uses the default tiering minimum cooling period for
that policy. This is the case whether the destination is FabricPool or not.
• You can move a volume across aggregates and at the same time modify the tiering policy.
• You should pay special attention when a volume move operation involves the auto tiering policy.
Assuming that both the source and the destination are FabricPool-enabled aggregates, the following table
summarizes the outcome of a volume move operation that involves policy changes related to auto:
When you move a volume that has a tiering policy of…    And you change the tiering policy with the move to…    Then after the volume move…
auto or all    snapshot-only    All data is moved to the performance tier.
• Beginning with ONTAP 9.8, a clone volume always inherits both the tiering policy and the cloud retrieval
policy from the parent volume.
In releases earlier than ONTAP 9.8, a clone inherits the tiering policy from the parent except when the
parent has the all tiering policy.
• If the parent volume has the never cloud retrieval policy, its clone volume must have either the never
cloud retrieval policy or the all tiering policy, and a corresponding cloud retrieval policy default.
• The parent volume cloud retrieval policy cannot be changed to never unless all its clone volumes have a
cloud retrieval policy never.
When you clone volumes, keep the following best practices in mind:
• The -tiering-policy option and the tiering-minimum-cooling-days option of the clone control only
the tiering behavior of blocks unique to the clone. Therefore, we recommend using tiering settings on the
parent FlexVol that move either the same amount of data as, or less data than, any of the clones.
• The cloud retrieval policy on the parent FlexVol should either move the same amount of data as, or
move more data than, the retrieval policy of any of the clones.
FabricPool cloud data retrieval is controlled by tiering policies that determine data retrieval from the cloud tier
to performance tier based on the read pattern. Read patterns can be either sequential or random.
The following table lists the tiering policies and the cloud data retrieval rules for each policy.
Beginning with ONTAP 9.8, the cloud migration control cloud-retrieval-policy option overrides the
default cloud migration or retrieval behavior controlled by the tiering policy.
The following table lists the supported cloud retrieval policies and their retrieval behavior.
Configure FabricPool
Configuring FabricPool helps you manage which storage tier (the local performance tier
or the cloud tier) data should be stored based on whether the data is frequently
accessed.
The preparation required for FabricPool configuration depends on the object store you use as the cloud tier.
The FabricPool license you might have used in the past is changing and is being retained
only for configurations that aren’t supported within BlueXP. Starting August 21, 2021,
Cloud Tiering BYOL licensing was introduced for tiering configurations that are supported
within BlueXP using the Cloud Tiering service.
Learn more about the new Cloud Tiering BYOL licensing.
Configurations that are supported by BlueXP must use the Digital Wallet page in BlueXP to license tiering for
ONTAP clusters. This requires you to set up a BlueXP account and set up tiering for the particular object
storage provider you plan to use. BlueXP currently supports tiering to the following object storage: Amazon S3,
Azure Blob storage, Google Cloud Storage, S3-compatible object storage, and StorageGRID.
You can download and activate a FabricPool license using System Manager if you have one of the
configurations that is not supported within BlueXP:
The FabricPool license is a cluster-wide license. It includes an entitled usage limit that you purchase for object
storage that is associated with FabricPool in the cluster. The usage across the cluster must not exceed the
capacity of the entitled usage limit. If you need to increase the usage limit of the license, you should contact
your sales representative.
A term-based FabricPool license with 10 TB of free capacity is available for first-time FabricPool orders for
existing cluster configurations not supported within BlueXP. Free capacity is not available with perpetual
licenses.
A license is not required if you use NetApp StorageGRID or ONTAP S3 for the cloud tier. Cloud Volumes
ONTAP does not require a FabricPool license, regardless of the provider you are using.
This task is supported only by uploading the license file to the cluster using System Manager.
Steps
1. Download the NetApp License File (NLF) for the FabricPool license from the NetApp Support Site.
2. Perform the following actions using System Manager to upload the FabricPool license to the cluster:
a. In the Cluster > Settings pane, on the Licenses card, click .
b. On the License page, click .
c. In the Add License dialog box, click Browse to select the NLF you downloaded, and then click Add to
upload the file to the cluster.
Related information
ONTAP FabricPool (FP) Licensing Overview
Unless you plan to disable certificate checking for StorageGRID, you must install a
StorageGRID CA certificate on the cluster so that ONTAP can authenticate with
StorageGRID as the object store for FabricPool.
About this task
ONTAP 9.4 and later releases enable you to disable certificate checking for StorageGRID.
Steps
1. Contact your StorageGRID administrator to obtain the StorageGRID system’s CA certificate.
2. Use the security certificate install command with the -type server-ca parameter to install
the StorageGRID CA certificate on the cluster.
The fully qualified domain name (FQDN) you enter must match the custom common name on the
StorageGRID CA certificate.
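For example, a minimal sketch of the installation command, assuming the certificate is installed at the cluster (admin SVM) level and pasted when the command prompts for it; the SVM name is a placeholder:

cluster1::> security certificate install -type server-ca -vserver cluster1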
To update an expired certificate, the best practice is to use a trusted CA to generate the new server certificate.
In addition, you should ensure that the certificate is updated on the StorageGRID server and on the ONTAP
cluster at the same time to keep any downtime to a minimum.
Related information
StorageGRID Resources
Unless you plan to disable certificate checking for ONTAP S3, you must install an ONTAP
S3 CA certificate on the cluster so that ONTAP can authenticate with ONTAP S3 as the
object store for FabricPool.
Steps
1. Obtain the ONTAP S3 system’s CA certificate.
2. Use the security certificate install command with the -type server-ca parameter to install
the ONTAP S3 CA certificate on the cluster.
The fully qualified domain name (FQDN) you enter must match the custom common name on the ONTAP
S3 CA certificate.
To update an expired certificate, the best practice is to use a trusted CA to generate the new server certificate.
In addition, you should ensure that the certificate is updated on the ONTAP S3 server and on the ONTAP
cluster at the same time to keep any downtime to a minimum.
Related information
S3 configuration
Setting up FabricPool involves specifying the configuration information of the object store
(StorageGRID, ONTAP S3, Alibaba Cloud Object Storage, Amazon S3, Google Cloud
Storage, IBM Cloud Object Storage, or Microsoft Azure Blob Storage for the cloud) that
you plan to use as the cloud tier for FabricPool.
If you are running ONTAP 9.2 or later, you can set up StorageGRID as the cloud tier for
FabricPool. When tiering data that is accessed by SAN protocols, NetApp recommends
using private clouds, like StorageGRID, due to connectivity considerations.
Considerations for using StorageGRID with FabricPool
• You need to install a CA certificate for StorageGRID, unless you explicitly disable certificate checking.
• You must not enable StorageGRID object versioning on the object store bucket.
• A FabricPool license is not required.
• If a StorageGRID node is deployed in a virtual machine with storage assigned from a NetApp AFF system,
confirm that the volume does not have a FabricPool tiering policy enabled.
Disabling FabricPool tiering for volumes used with StorageGRID nodes simplifies troubleshooting and
storage operations.
Never use FabricPool to tier any data related to StorageGRID back to StorageGRID itself.
Tiering StorageGRID data back to StorageGRID increases troubleshooting and operational
complexity.
Procedures
You can set up StorageGRID as the cloud tier for FabricPool with ONTAP System Manager or the ONTAP CLI.
System Manager
1. Click Storage > Tiers > Add Cloud Tier and select StorageGRID as the object store provider.
2. Complete the requested information.
3. If you want to create a cloud mirror, click Add as FabricPool Mirror.
A FabricPool mirror provides a method for you to seamlessly replace a data store, and it helps to ensure
that your data is available in the event of a disaster.
CLI
1. Specify the StorageGRID configuration information by using the storage aggregate object-
store config create command with the -provider-type SGWS parameter.
◦ The storage aggregate object-store config create command fails if ONTAP cannot
access StorageGRID with the provided information.
◦ You use the -access-key parameter to specify the access key for authorizing requests to the
StorageGRID object store.
◦ You use the -secret-password parameter to specify the password (secret access key) for
authenticating requests to the StorageGRID object store.
◦ If the StorageGRID password is changed, you should update the corresponding password stored
in ONTAP immediately.
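For example, a minimal sketch that follows the same command pattern as the other provider examples in this section; the object store name, server, bucket, and credential values are placeholders:

cluster1::> storage aggregate object-store config create
-object-store-name my_sgws_store -provider-type SGWS
-server sgws.example.com -container-name my-sgws-bucket
-access-key <key> -secret-password <password>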
2. Display and verify the StorageGRID configuration information by using the storage aggregate
object-store config show command.
The storage aggregate object-store config modify command enables you to modify the
StorageGRID configuration information for FabricPool.
If you are running ONTAP 9.8 or later, you can set up ONTAP S3 as the cloud tier for
FabricPool.
What you’ll need
You must have the ONTAP S3 server name and the IP address of its associated LIFs on the remote cluster.
Creating intercluster LIFs for remote FabricPool tiering
Procedures
You can set up ONTAP S3 as the cloud tier for FabricPool with ONTAP System Manager or the ONTAP CLI.
System Manager
1. Click Storage > Tiers > Add Cloud Tier and select ONTAP S3 as the object store provider.
2. Complete the requested information.
3. If you want to create a cloud mirror, click Add as FabricPool Mirror.
A FabricPool mirror provides a method for you to seamlessly replace a data store, and it helps to ensure
that your data is available in the event of a disaster.
CLI
1. Add entries for the S3 server and LIFs to your DNS server.
◦ If you use an external DNS server: give the S3 server name and IP addresses to the DNS server administrator.
◦ If you use your local system’s DNS hosts table: enter the following command:

dns host create -vserver svm_name -address ip_address -hostname s3_server_name
2. Specify the ONTAP S3 configuration information by using the storage aggregate object-
store config create command with the -provider-type ONTAP_S3 parameter.
◦ The storage aggregate object-store config create command fails if the local
ONTAP system cannot access the ONTAP S3 server with the information provided.
◦ You use the -access-key parameter to specify the access key for authorizing requests to the
ONTAP S3 server.
◦ You use the -secret-password parameter to specify the password (secret access key) for
authenticating requests to the ONTAP S3 server.
◦ If the ONTAP S3 server password is changed, you should immediately update the corresponding
password stored in the local ONTAP system.
Doing so enables access to the data in the ONTAP S3 object store without interruption.
3. Display and verify the ONTAP_S3 configuration information by using the storage aggregate
object-store config show command.
The storage aggregate object-store config modify command enables you to modify the
ONTAP_S3 configuration information for FabricPool.
If you are running ONTAP 9.6 or later, you can set up Alibaba Cloud Object Storage as
the cloud tier for FabricPool.
Considerations for using Alibaba Cloud Object Storage with FabricPool
• A BlueXP tiering license is required when tiering to Alibaba Cloud Object Storage.
• On AFF and FAS systems and ONTAP Select, FabricPool supports the following Alibaba Object Storage
Service classes:
◦ Alibaba Object Storage Service Standard
◦ Alibaba Object Storage Service Infrequent Access
Contact your NetApp sales representative for information about storage classes not listed.
Steps
1. Specify the Alibaba Cloud Object Storage configuration information by using the storage aggregate
object-store config create command with the -provider-type AliCloud parameter.
◦ The storage aggregate object-store config create command fails if ONTAP cannot
access Alibaba Cloud Object Storage with the provided information.
◦ You use the -access-key parameter to specify the access key for authorizing requests to the Alibaba
Cloud Object Storage object store.
◦ If the Alibaba Cloud Object Storage password is changed, you should update the corresponding
password stored in ONTAP immediately.
Doing so enables ONTAP to access the data in Alibaba Cloud Object Storage without interruption.
2. Display and verify the Alibaba Cloud Object Storage configuration information by using the storage
aggregate object-store config show command.
The storage aggregate object-store config modify command enables you to modify the
Alibaba Cloud Object Storage configuration information for FabricPool.
If you are running ONTAP 9.2 or later, you can set up Amazon S3 as the cloud tier for
FabricPool. If you are running ONTAP 9.5 or later, you can set up Amazon Commercial
Cloud Services (C2S) for FabricPool.
Considerations for using Amazon S3 with FabricPool
• A BlueXP tiering license is required when tiering to Amazon S3.
• It is recommended that the LIF that ONTAP uses to connect with the Amazon S3 object server be on a 10
Gbps port.
• On AFF and FAS systems and ONTAP Select, FabricPool supports the following Amazon S3 storage
classes:
◦ Amazon S3 Standard
◦ Amazon S3 Standard - Infrequent Access (Standard - IA)
◦ Amazon S3 One Zone - Infrequent Access (One Zone - IA)
◦ Amazon S3 Intelligent-Tiering
◦ Amazon Commercial Cloud Services
◦ Beginning with ONTAP 9.11.1, Amazon S3 Glacier Instant Retrieval (FabricPool does not support
Glacier Flexible Retrieval or Glacier Deep Archive)
Contact your sales representative for information about storage classes not listed.
• On Cloud Volumes ONTAP, FabricPool supports tiering from General Purpose SSD (gp2) and Throughput
Optimized HDD (st1) volumes of Amazon Elastic Block Store (EBS).
Steps
1. Specify the Amazon S3 configuration information by using the storage aggregate object-store
config create command with the -provider-type AWS_S3 parameter.
◦ You use the -auth-type CAP parameter to obtain credentials for C2S access.
When you use the -auth-type CAP parameter, you must use the -cap-url parameter to specify the
full URL to request temporary credentials for C2S access.
◦ The storage aggregate object-store config create command fails if ONTAP cannot
access Amazon S3 with the provided information.
◦ You use the -access-key parameter to specify the access key for authorizing requests to the
Amazon S3 object store.
◦ You use the -secret-password parameter to specify the password (secret access key) for
authenticating requests to the Amazon S3 object store.
◦ If the Amazon S3 password is changed, you should update the corresponding password stored in
ONTAP immediately.
cluster1::> storage aggregate object-store config create
-object-store-name my_aws_store -provider-type AWS_S3
-server s3.amazonaws.com -container-name my-aws-bucket
-access-key DXJRXHPXHYXA9X31X3JX
2. Display and verify the Amazon S3 configuration information by using the storage aggregate object-
store config show command.
The storage aggregate object-store config modify command enables you to modify the
Amazon S3 configuration information for FabricPool.
If you are running ONTAP 9.6 or later, you can set up Google Cloud Storage as the cloud
tier for FabricPool.
Steps
1. Specify the Google Cloud Storage configuration information by using the storage aggregate object-
store config create command with the -provider-type GoogleCloud parameter.
◦ The storage aggregate object-store config create command fails if ONTAP cannot
access Google Cloud Storage with the provided information.
◦ You use the -access-key parameter to specify the access key for authorizing requests to the Google
Cloud Storage object store.
◦ If the Google Cloud Storage password is changed, you should update the corresponding password
stored in ONTAP immediately.
Doing so enables ONTAP to access the data in Google Cloud Storage without interruption.
2. Display and verify the Google Cloud Storage configuration information by using the storage aggregate
object-store config show command.
The storage aggregate object-store config modify command enables you to modify the
Google Cloud Storage configuration information for FabricPool.
If you are running ONTAP 9.5 or later, you can set up IBM Cloud Object Storage as the
cloud tier for FabricPool.
Considerations for using IBM Cloud Object Storage with FabricPool
• A BlueXP tiering license is required when tiering to IBM Cloud Object Storage.
• It is recommended that the LIF that ONTAP uses to connect with the IBM Cloud object server be on a 10
Gbps port.
Steps
1. Specify the IBM Cloud Object Storage configuration information by using the storage aggregate
object-store config create command with the -provider-type IBM_COS parameter.
◦ The storage aggregate object-store config create command fails if ONTAP cannot
access IBM Cloud Object Storage with the provided information.
◦ You use the -access-key parameter to specify the access key for authorizing requests to the IBM
Cloud Object Storage object store.
◦ You use the -secret-password parameter to specify the password (secret access key) for
authenticating requests to the IBM Cloud Object Storage object store.
◦ If the IBM Cloud Object Storage password is changed, you should update the corresponding password
stored in ONTAP immediately.
Doing so enables ONTAP to access the data in IBM Cloud Object Storage without interruption.
2. Display and verify the IBM Cloud Object Storage configuration information by using the storage
aggregate object-store config show command.
The storage aggregate object-store config modify command enables you to modify the IBM
Cloud Object Storage configuration information for FabricPool.
Set up Azure Blob Storage for the cloud as the cloud tier
If you are running ONTAP 9.4 or later, you can set up Azure Blob Storage for the cloud as
the cloud tier for FabricPool.
Considerations for using Microsoft Azure Blob Storage with FabricPool
• A BlueXP tiering license is required when tiering to Azure Blob Storage.
• A FabricPool license is not required if you are using Azure Blob Storage with Cloud Volumes ONTAP.
• It is recommended that the LIF that ONTAP uses to connect with the Azure Blob Storage object server be
on a 10 Gbps port.
• FabricPool currently does not support Azure Stack, which provides on-premises Azure services.
• At the account level in Microsoft Azure Blob Storage, FabricPool supports only hot and cool storage tiers.
FabricPool does not support blob-level tiering. It also does not support tiering to Azure’s archive storage
tier.
Steps
1. Specify the Azure Blob Storage configuration information by using the storage aggregate object-
store config create command with the -provider-type Azure_Cloud parameter.
◦ The storage aggregate object-store config create command fails if ONTAP cannot
access Azure Blob Storage with the provided information.
◦ You use the -azure-account parameter to specify the Azure Blob Storage account.
◦ You use the -azure-private-key parameter to specify the access key for authenticating requests
to Azure Blob Storage.
◦ If the Azure Blob Storage password is changed, you should update the corresponding password stored
in ONTAP immediately.
Doing so enables ONTAP to access the data in Azure Blob Storage without interruption.
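For example, a minimal sketch using placeholder object store, container, account, and key values; depending on your setup, additional parameters (such as the server endpoint) might also be required:

cluster1::> storage aggregate object-store config create
-object-store-name my_azure_store -provider-type Azure_Cloud
-container-name my-azure-container -azure-account myazureaccount
-azure-private-key <access-key>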
2. Display and verify the Azure Blob Storage configuration information by using the storage aggregate
object-store config show command.
The storage aggregate object-store config modify command enables you to modify the
Azure Blob Storage configuration information for FabricPool.
If you are running ONTAP 9.7 or later, you can set up a mirrored FabricPool on a
MetroCluster configuration to tier cold data to object stores in two different fault zones.
About this task
• FabricPool in MetroCluster requires that the underlying mirrored aggregate and the associated object store
configuration be owned by the same MetroCluster configuration.
• You cannot attach an aggregate to an object store that is created in the remote MetroCluster site.
• You must create object store configurations on the MetroCluster configuration that owns the aggregate.
Step
1. Specify the object store configuration information on each MetroCluster site by using the storage
aggregate object-store config create command.
In this example, FabricPool is required on only one cluster in the MetroCluster configuration. Two object
store configurations are created for that cluster, one for each object store bucket.
storage aggregate object-store config create
-object-store-name mcc1-ostore-config-s1 -provider-type SGWS
-server <SGWS-server-1> -container-name <SGWS-bucket-1>
-access-key <key> -secret-password <password> -encrypt <true|false>
-provider <provider-type> -is-ssl-enabled <true|false> -ipspace <IPSpace>
This example sets up FabricPool on the second cluster in the MetroCluster configuration.
storage aggregate object-store config create
-object-store-name mcc2-ostore-config-s1 -provider-type SGWS
-server <SGWS-server-1> -container-name <SGWS-bucket-3>
-access-key <key> -secret-password <password> -encrypt <true|false>
-provider <provider-type> -is-ssl-enabled <true|false> -ipspace <IPSpace>

storage aggregate object-store config create
-object-store-name mcc2-ostore-config-s2 -provider-type SGWS
-server <SGWS-server-2> -container-name <SGWS-bucket-4>
-access-key <key> -secret-password <password> -encrypt <true|false>
-provider <provider-type> -is-ssl-enabled <true|false> -ipspace <IPSpace>
Before you attach an object store to a local tier, you can test the object store’s latency
and throughput performance by using object store profiler.
Before you begin
• You must add the cloud tier to ONTAP before you can use it with the object store profiler.
• You must be in advanced privilege mode in the ONTAP CLI.
Steps
1. Start the object store profiler:
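Based on the profiler commands listed in the FabricPool command reference later in this section, a minimal sketch of starting the profiler and then checking its status; the object store and node names are placeholders:

cluster1::*> storage aggregate object-store profiler start
-object-store-name my_store -node node1

cluster1::*> storage aggregate object-store profiler show
-object-store-name my_store -node node1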
After setting up an object store as the cloud tier, you specify the local tier (aggregate) to
use by attaching it to FabricPool. In ONTAP 9.5 and later, you can also attach local tiers
(aggregates) that contain qualified FlexGroup volume constituents.
About this task
Attaching a cloud tier to a local tier is a permanent action. A cloud tier cannot be unattached from a local tier
after being attached. However, you can use FabricPool mirror to attach a local tier to a different cloud tier.
When you use System Manager to set up a local tier for FabricPool, you can create the local tier
and set it up to use for FabricPool at the same time.
Steps
You can attach a local tier (aggregate) to a FabricPool object store with ONTAP System Manager or the
ONTAP CLI.
System Manager
1. Navigate to Storage > Tiers, select a cloud tier, then click .
2. Select Attach local tiers.
3. Under Add as Primary verify that the volumes are eligible to attach.
4. If necessary, select Convert volumes to thin provisioned.
5. Click Save.
CLI
To attach an object store to an aggregate with the CLI:
1. Optional: To see how much data in a volume is inactive, follow the steps in Determining how much
data in a volume is inactive by using inactive data reporting.
Seeing how much data in a volume is inactive can help you decide which aggregate to use for
FabricPool.
2. Attach the object store to an aggregate by using the storage aggregate object-store
attach command.
If the aggregate has never been used with FabricPool and it contains existing volumes, then the
volumes are assigned the default snapshot-only tiering policy.
You can use the allow-flexgroup true option to attach aggregates that contain FlexGroup
volume constituents.
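For example, a minimal sketch of attaching an object store to an aggregate, using placeholder names:

cluster1::> storage aggregate object-store attach -aggregate aggr1
-object-store-name my_store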
3. Display the object store information and verify that the attached object store is available by using the
storage aggregate object-store show command.
Beginning with ONTAP 9.8, you can tier data to local object storage using ONTAP S3.
Tiering data to a local bucket provides a simple alternative to moving data to a different local tier. This
procedure uses an existing bucket on the local cluster, or you can let ONTAP automatically create a new
storage VM and a new bucket.
Keep in mind that once you attach to a local tier (aggregate) the cloud tier cannot be unattached.
An S3 license is required for this workflow, which creates a new S3 server and new bucket, or uses existing
ones. This license is included in ONTAP One. A FabricPool license is not required for this workflow.
Step
1. Tier data to a local bucket: click Tiers, select a tier, then click .
2. If necessary, enable thin provisioning.
3. Choose an existing tier or create a new one.
4. If necessary, edit the existing tiering policy.
Manage FabricPool
To help you with your storage tiering needs, ONTAP enables you to display how much
data in a volume is inactive, add or move volumes to FabricPool, monitor the space
utilization for FabricPool, or modify a volume’s tiering policy or tiering minimum cooling
period.
Determine how much data in a volume is inactive by using inactive data reporting
Seeing how much data in a volume is inactive enables you to make good use of storage
tiers. Information in inactive data reporting helps you decide which aggregate to use for
FabricPool, whether to move a volume in to or out of FabricPool, or whether to modify the
tiering policy of a volume.
What you’ll need
You must be running ONTAP 9.4 or later to use the inactive data reporting functionality.
• You cannot enable inactive data reporting in instances where FabricPool cannot be enabled, including the
following:
◦ Root aggregates
◦ MetroCluster aggregates running ONTAP versions earlier than 9.7
◦ Flash Pool (hybrid) aggregates or SnapLock aggregates
• Inactive data reporting is enabled by default on aggregates where any volumes have adaptive compression
enabled.
• Inactive data reporting is enabled by default on all SSD aggregates in ONTAP 9.6.
• Inactive data reporting is enabled by default on FabricPool aggregates in ONTAP 9.4 and ONTAP 9.5.
• You can enable inactive data reporting on non-FabricPool aggregates using the ONTAP CLI, including
HDD aggregates, beginning with ONTAP 9.6.
Procedure
You can determine how much data is inactive with ONTAP System Manager or the ONTAP CLI.
System Manager
1. Choose one of the following options:
◦ When you have existing HDD aggregates, navigate to Storage > Tiers and click for the
aggregate on which you want to enable inactive data reporting.
◦ When no cloud tiers are configured, navigate to Dashboard and click the Enable inactive data
reporting link under Capacity.
CLI
To enable inactive data reporting with the CLI:
1. If the aggregate for which you want to see inactive data reporting is not used in FabricPool, enable
inactive data reporting for the aggregate by using the storage aggregate modify command with
the -is-inactive-data-reporting-enabled true parameter.
You need to explicitly enable the inactive data reporting functionality on an aggregate that is not used
for FabricPool.
You cannot and do not need to enable inactive data reporting on a FabricPool-enabled aggregate
because the aggregate already comes with inactive data reporting. The
-is-inactive-data-reporting-enabled parameter does not work on FabricPool-enabled aggregates.
2. To display how much data is inactive on a volume, use the volume show command with the
-fields performance-tier-inactive-user-data,performance-tier-inactive-user-
data-percent parameter.
◦ The performance-tier-inactive-user-data-percent field displays what percent of the
data is inactive across the active file system and Snapshot copies.
◦ For an aggregate that is not used for FabricPool, inactive data reporting uses the tiering policy to
decide how much data to report as cold.
▪ For the none tiering policy, 31 days is used.
▪ For the snapshot-only and auto tiering policies, inactive data reporting uses tiering-minimum-
cooling-days.
▪ For the ALL policy, inactive data reporting assumes the data will tier within a day.
Until the period is reached, the output shows “-” for the amount of inactive data instead of a
value.
◦ On a volume that is part of FabricPool, what ONTAP reports as inactive depends on the tiering
policy that is set on a volume.
▪ For the none tiering policy, ONTAP reports the amount of the entire volume that is inactive for
at least 31 days. You cannot use the -tiering-minimum-cooling-days parameter with
the none tiering policy.
▪ For the ALL, snapshot-only, and auto tiering policies, inactive data reporting is not
supported.
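As a combined sketch of steps 1 and 2 above, using placeholder aggregate and SVM names:

cluster1::> storage aggregate modify -aggregate aggr1
-is-inactive-data-reporting-enabled true

cluster1::> volume show -vserver vs1 -fields
performance-tier-inactive-user-data,performance-tier-inactive-user-data-percent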
You can add volumes to FabricPool by creating new volumes directly in the FabricPool-
enabled aggregate or by moving existing volumes from another aggregate to the
FabricPool-enabled aggregate.
When you create a volume for FabricPool, you have the option to specify a tiering policy. If no tiering policy is
specified, the created volume uses the default snapshot-only tiering policy. For a volume with the
snapshot-only or auto tiering policy, you can also specify the tiering minimum cooling period.
Steps
1. Create a new volume for FabricPool by using the volume create command.
◦ The -tiering-policy optional parameter enables you to specify the tiering policy for the volume:
▪ snapshot-only (default)
▪ auto
▪ all
▪ backup (deprecated)
▪ none
◦ The -cloud-retrieval-policy optional parameter, introduced in ONTAP 9.8, controls how user data is
retrieved from the cloud tier:
▪ default
The tiering policy determines what data is pulled back, so there is no change to cloud data retrieval
with the default cloud-retrieval-policy. This means the behavior is the same as in pre-ONTAP 9.8
releases:
▪ If the tiering policy is none or snapshot-only, then “default” means that any client-driven
data read is pulled from the cloud tier to the performance tier.
▪ If the tiering policy is auto, then any client-driven random read is pulled but not sequential
reads.
▪ If the tiering policy is all, then no client-driven data is pulled from the cloud tier.
▪ on-read
All client-driven data reads are pulled from the cloud tier to the performance tier.
▪ never
No client-driven data is pulled from the cloud tier.
▪ promote
▪ For tiering policy none, all cloud data is pulled from the cloud tier to the performance tier.
▪ For tiering policy snapshot-only, all active file system data is pulled from the cloud tier to the
performance tier.
◦ The -tiering-minimum-cooling-days optional parameter in the advanced privilege level enables
you to specify the tiering minimum cooling period for a volume that uses the snapshot-only or auto
tiering policy.
Beginning with ONTAP 9.8, you can specify a value between 2 and 183 for the tiering minimum cooling
days. If you are using a version of ONTAP earlier than 9.8, you can specify a value between 2 and 63
for the tiering minimum cooling days.
cluster1::*> volume create -vserver myVS -aggregate myFabricPool
-volume myvol1 -tiering-policy auto -tiering-minimum-cooling-days 45
Related information
FlexGroup volumes management
When you move a volume to FabricPool, you have the option to specify or change the
tiering policy for the volume with the move. Beginning with ONTAP 9.8, when you move a
non-FabricPool volume with inactive data reporting enabled, FabricPool uses a heat map
to read tierable blocks, and moves cold data to the capacity tier on the FabricPool
destination.
What you’ll need
You must understand how changing the tiering policy might affect how long it takes for data to become cold
and be moved to the cloud tier.
You should not use the -tiering-policy option on volume move if you are using ONTAP 9.8 and you want
FabricPools to use inactive data reporting information to move data directly to the capacity tier. Using this
option causes FabricPools to ignore the temperature data and instead follow the move behavior of releases
prior to ONTAP 9.8.
Step
1. Use the volume move start command to move a volume to FabricPool.
The -tiering-policy optional parameter enables you to specify the tiering policy for the volume.
◦ snapshot-only (default)
◦ auto
◦ all
◦ none
cluster1::> volume move start -vserver vs1 -volume myvol2
-destination-aggregate dest_FabricPool -tiering-policy none
Beginning with ONTAP 9.14.1, you can enable and disable writing directly to the cloud on
a new or existing volume in a FabricPool to allow NFS clients to write data directly to the
cloud without waiting for tiering scans. SMB clients still write to the performance tier in a
cloud write enabled volume. Cloud-write mode is disabled by default.
Having the ability to write directly to the cloud is helpful for cases like migrations, where more data is
transferred to the cluster than the cluster can support on the local tier. Without cloud-write
mode, during a migration, smaller amounts of data are transferred, then tiered, then transferred and tiered
again, until the migration is complete. Using cloud-write mode, this type of management is no longer required
because the data is never transferred to the local tier.
Steps
1. Set the privilege level to advanced:
The following example creates a volume named vol1 with cloud write enabled on the FabricPool local tier
(aggr1):
Steps
1. Set the privilege level to advanced:
set -privilege advanced
The following example modifies a volume named vol1 with cloud write enabled on the FabricPool local tier
(aggr1):
Steps
1. Set the privilege level to advanced:
The following example creates a volume named vol1 with cloud write enabled:
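A sketch of what such commands might look like, assuming the cloud-write capability is exposed through the -is-cloud-write-enabled volume parameter at the advanced privilege level (the parameter name is an assumption, since it is not shown in this section); the volume, SVM, and aggregate names are placeholders:

cluster1::> set -privilege advanced

cluster1::*> volume create -vserver vs1 -volume vol1 -aggregate aggr1
-is-cloud-write-enabled true

cluster1::*> volume modify -vserver vs1 -volume vol1
-is-cloud-write-enabled false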
Beginning with ONTAP 9.14.1, you can enable and disable aggressive read-ahead mode
on volumes in FabricPools that provide support for media and entertainment, such as
movie streaming workloads. Aggressive read-ahead mode is available in ONTAP 9.14.1
on all on-premises platforms that support FabricPool. The feature is disabled by default.
About this task
The aggressive-readahead-mode option has two values:
• file_prefetch: the system reads the entire file into memory ahead of the client application.
Steps
1. Set the privilege level to advanced:
The following example creates a volume named vol1 with aggressive read-ahead enabled with the
file_prefetch option:
Steps
1. Set the privilege level to advanced:
The following example modifies a volume named vol1 to disable aggressive read-ahead mode:
Steps
1. Set the privilege level to advanced:
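A sketch of what such commands might look like, assuming the setting is exposed through the -aggressive-readahead-mode volume parameter and that none is the second supported value used to disable the feature (both assumptions, since only file_prefetch is listed above); the volume, SVM, and aggregate names are placeholders:

cluster1::> set -privilege advanced

cluster1::*> volume create -vserver vs1 -volume vol1 -aggregate aggr1
-aggressive-readahead-mode file_prefetch

cluster1::*> volume modify -vserver vs1 -volume vol1
-aggressive-readahead-mode none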
Beginning with ONTAP 9.8, FabricPool supports object tagging using user-created
custom tags to enable you to classify and sort objects for easier management. If you are
a user with the admin privilege level, you can create new object tags, and modify, delete,
and view existing tags.
You can create a new object tag when you want to assign one or more tags to new
objects that are tiered from a new volume you create. You can use tags to help you
classify and sort tiering objects for easier data management. Beginning with ONTAP 9.8,
you can use System Manager to create object tags.
About this task
You can set tags only on FabricPool volumes attached to StorageGRID. These tags are retained during a
volume move.
Keys must contain only alphanumeric characters and underscores, and the maximum number of
characters allowed is 127.
Procedure
You can assign object tags with ONTAP System Manager or the ONTAP CLI.
System Manager
1. Navigate to Storage > Tiers.
2. Locate a storage tier with volumes you want to tag.
3. Click the Volumes tab.
4. Locate the volume you want to tag and in the Object Tags column select Click to enter tags.
5. Enter a key and value.
6. Click Apply.
CLI
1. Use the volume create command with the -tiering-object-tags option to create a new
volume with the specified tags. You can specify multiple tags in comma-separated pairs:
The following example creates a volume named fp_volume1 with three object tags.
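A minimal sketch, using three hypothetical key=value tags and placeholder SVM and aggregate names:

cluster1::> volume create -vserver vs0 -volume fp_volume1 -aggregate aggr1
-tiering-object-tags project=fabricpool,type=abc,content=data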
You can change the name of a tag, replace tags on existing objects in the object store, or
add a different tag to new objects that you plan to add later.
About this task
Using the volume modify command with the -tiering-object-tags option replaces existing tags with
the new value you provide.
Procedure
System Manager
1. Navigate to Storage > Tiers.
2. Locate a storage tier with volumes containing tags you want to modify.
3. Click the Volumes tab.
4. Locate the volume with tags you want to modify, and in the Object Tags column click the tag name.
5. Modify the tag.
6. Click Apply.
CLI
1. Use the volume modify command with the -tiering-object-tags option to modify an existing
tag.
The following example changes the name of the existing tag type=abc to type=xyz.
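A minimal sketch with hypothetical tag values; because the modify command replaces the existing tag list, every tag you want to keep must be specified again:

cluster1::> volume modify -vserver vs0 -volume fp_volume1
-tiering-object-tags project=fabricpool,type=xyz,content=data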
Delete a tag
You can delete object tags when you no longer want them set on a volume or on objects
in the object store.
Procedure
You can delete object tags with ONTAP System Manager or the ONTAP CLI.
System Manager
1. Navigate to Storage > Tiers.
2. Locate a storage tier with volumes containing tags you want to delete.
3. Click the Volumes tab.
4. Locate the volume with tags you want to delete, and in the Object Tags column click the tag name.
5. To delete the tag, click the trash can icon.
6. Click Apply.
CLI
1. Use the volume modify command with the -tiering-object-tags option followed by an empty
value ("") to delete an existing tag.
You can view the existing tags on a volume to see what tags are available before
appending new tags to the list.
Step
1. Use the volume show command with the tiering-object-tags option to view existing tags on a
volume.
◦ true — the object tagging scanner has not yet run or needs to run again for this volume
◦ false — the object tagging scanner has completed tagging for this volume
◦ <-> — the object tagging scanner is not applicable for this volume. This happens for volumes that are
not residing on FabricPools.
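A minimal sketch of the step above, displaying the tags for the volumes in a placeholder SVM:

cluster1::> volume show -vserver vs0 -fields tiering-object-tags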
You need to know how much data is stored in the performance and cloud tiers for
FabricPool. That information helps you determine whether you need to change the tiering
policy of a volume, increase the FabricPool licensed usage limit, or increase the storage
space of the cloud tier.
Steps
1. Monitor the space utilization for FabricPool-enabled aggregates by using one of the following commands to
display the information:
◦ Details of space utilization within an aggregate, including the object store’s referenced capacity:
storage aggregate show-space with the -instance parameter
◦ Space utilization of the object stores that are attached to the aggregates, including how much license
space is being used: storage aggregate object-store show-space
In addition to using CLI commands, you can use Active IQ Unified Manager (formerly OnCommand Unified
Manager), along with FabricPool Advisor, which is supported on ONTAP 9.4 and later clusters, or System
Manager to monitor the space utilization.
The following example shows ways of displaying space utilization and related information for FabricPool:
cluster1::> storage aggregate show-space -instance
Aggregate: MyFabricPool
...
Aggregate Display Name:
MyFabricPool
...
Total Object Store Logical Referenced
Capacity: -
Object Store Logical Referenced Capacity
Percentage: -
...
Object Store
Size: -
Object Store Space Saved by Storage
Efficiency: -
Object Store Space Saved by Storage Efficiency
Percentage: -
Total Logical Used
Size: -
Logical Used
Percentage: -
Logical Unreferenced
Capacity: -
Logical Unreferenced
Percentage: -
Aggregate: MyFabricPool
...
Composite: true
Capacity Tier Used Size:
...
cluster1::> volume show-footprint
Vserver : vs1
Volume : rootvol
Vserver : vs1
Volume : vol
• To increase the FabricPool licensed usage limit, contact your NetApp or partner sales representative
(NetApp Support).
• To increase the storage space of the cloud tier, contact the provider of the object store that you use
for the cloud tier.
Manage storage tiering by modifying a volume’s tiering policy or tiering minimum cooling period
You can change the tiering policy of a volume to control whether data is moved to the
cloud tier when it becomes inactive (cold). For a volume with the snapshot-only or
auto tiering policy, you can also specify the tiering minimum cooling period that user data
must remain inactive before it is moved to the cloud tier.
What you’ll need
Changing a volume to the auto tiering policy or modifying the tiering minimum cooling period requires ONTAP
9.4 or later.
Changing the tiering policy might affect how long it takes for data to become cold and be moved to the cloud
tier.
What happens when you modify the tiering policy of a volume in FabricPool
Steps
1. Modify the tiering policy for an existing volume by using the volume modify command with the
-tiering-policy parameter:
◦ snapshot-only (default)
◦ auto
◦ all
◦ none
2. If the volume uses the snapshot-only or auto tiering policy and you want to modify the tiering minimum
cooling period, use the volume modify command with the -tiering-minimum-cooling-days
optional parameter in the advanced privilege level.
You can specify a value between 2 and 183 for the tiering minimum cooling days. If you are using a version
of ONTAP earlier than 9.8, you can specify a value between 2 and 63 for the tiering minimum cooling days.
Example of modifying the tiering policy and the tiering minimum cooling period of a volume
The following example changes the tiering policy of the volume “myvol” in the SVM “vs1” to auto and the
tiering minimum cooling period to 45 days:
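Based on the parameters described in the steps above, the command would look similar to the following sketch (advanced privilege is assumed for the cooling-days option):

cluster1::*> volume modify -vserver vs1 -volume myvol
-tiering-policy auto -tiering-minimum-cooling-days 45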
This video shows a quick overview of using System Manager to archive a volume to a
cloud tier with FabricPool.
NetApp video: Archiving volumes with FabricPool (backup + volume move)
Related information
NetApp TechComm TV: FabricPool playlist
You can change a volume’s default tiering policy for controlling user data retrieval from
the cloud tier to performance tier by using the -cloud-retrieval-policy option
introduced in ONTAP 9.8.
What you’ll need
• Modifying a volume using the -cloud-retrieval-policy option requires ONTAP 9.8 or later.
• You must have the advanced privilege level to perform this operation.
• You should understand the behavior of tiering policies with -cloud-retrieval-policy.
Step
1. Modify the tiering policy behavior for an existing volume by using the volume modify command with the
-cloud-retrieval-policy option:
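For example, a minimal sketch that sets the on-read retrieval behavior on a volume; the SVM and volume names are placeholders and advanced privilege is assumed:

cluster1::*> volume modify -vserver vs1 -volume vol1
-cloud-retrieval-policy on-read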
Beginning with ONTAP 9.8, if you are a cluster administrator at the advanced privilege
level, you can proactively promote data to the performance tier from the cloud tier using a
combination of the tiering-policy and the cloud-retrieval-policy setting.
You might do this if you want to stop using FabricPool on a volume, or if you have a snapshot-only tiering
policy and you want to bring restored Snapshot copy data back to the performance tier.
You can proactively retrieve all data on a FabricPool volume in the Cloud and promote it
to the performance tier.
Step
1. Use the volume modify command to set tiering-policy to none and cloud-retrieval-policy
to promote.
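For example, a minimal sketch using placeholder names, with advanced privilege assumed:

cluster1::*> volume modify -vserver vs1 -volume vol1 -tiering-policy none
-cloud-retrieval-policy promote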
You can proactively retrieve active file system data from a restored Snapshot copy in the
cloud tier and promote it to the performance tier.
Step
1. Use the volume modify command to set tiering-policy to snapshot-only and cloud-
retrieval-policy to promote.
You can check the status of performance tier promotion to determine when the operation
is complete.
Step
1. Use the volume object-store command with the tiering option to check the status of the
performance tier promotion.
volume object-store tiering show v1 -instance
Vserver: vs1
Volume: v1
Node Name: node1
Volume DSID: 1023
Aggregate Name: a1
State: ready
Previous Run Status: completed
Aborted Exception Status: -
Time Scanner Last Finished: Mon Jan 13 20:27:30 2020
Scanner Percent Complete: -
Scanner Current VBN: -
Scanner Max VBNs: -
Time Waiting Scan will be scheduled: -
Tiering Policy: snapshot-only
Estimated Space Needed for Promotion: -
Time Scan Started: -
Estimated Time Remaining for scan to complete: -
Cloud Retrieve Policy: promote
Beginning with ONTAP 9.8, you can trigger a tiering scan request at any time when you
prefer not to wait for the default tiering scan.
Step
1. Use the volume object-store command with the trigger option to request migration and tiering.
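The exact syntax is not shown above; assuming the trigger verb lives alongside the show verb used earlier in this section (an assumption), a sketch might look like this, with placeholder SVM and volume names:

cluster1::*> volume object-store tiering trigger -vserver vs1 -volume vol1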
To ensure data is accessible in data stores in the event of a disaster, and to enable you to
replace a data store, you can configure a FabricPool mirror by adding a second data
store to synchronously tier data to two data stores. You can add a second data store to
new or existing FabricPool configurations, monitor the mirror status, display FabricPool
mirror details, promote a mirror, and remove a mirror. You must be running ONTAP 9.7 or
later.
Create a FabricPool mirror
To create a FabricPool mirror, you attach two object stores to a single FabricPool. You
can create a FabricPool mirror either by attaching a second object store to an existing,
single object store FabricPool configuration, or you can create a new, single object store
FabricPool configuration and then attach a second object store to it. You can also create
FabricPool mirrors on MetroCluster configurations.
What you’ll need
• You must have already created the two object stores using the storage aggregate object-store
config command.
• If you are creating FabricPool mirrors on MetroCluster configurations:
◦ You must have already set up and configured the MetroCluster
◦ You must have created the object store configurations on the selected cluster.
If you are creating FabricPool mirrors on both clusters in a MetroCluster configuration, you must have
created object store configurations on both of the clusters.
◦ If you are not using on premises object stores for MetroCluster configurations, you should ensure that
one of the following scenarios exists:
▪ Object stores are in different availability zones
▪ Object stores are configured to keep copies of objects in multiple availability zones
The procedure for creating a FabricPool mirror is the same for both MetroCluster and non-MetroCluster
configurations.
Steps
1. If you are not using an existing FabricPool configuration, create a new one by attaching an object store to
an aggregate using the storage aggregate object-store attach command.
2. Attach a second object store to the aggregate using the storage aggregate object-store mirror
command.
This example attaches a second object store to an aggregate to create a FabricPool mirror.
cluster1::> storage aggregate object-store mirror -aggregate aggr1 -name
my-store-2
When you replace a primary object store with a mirror, you might have to wait for the
mirror to resync with the primary data store.
About this task
If the FabricPool mirror is in sync, no entries are displayed.
Step
1. Monitor mirror resync status using the storage aggregate object-store show-resync-status
command.
Complete
Aggregate Primary Mirror Percentage
--------- ----------- ---------- ----------
aggr1 my-store-1 my-store-2 40%
You can display details about a FabricPool mirror to see what object stores are in the
configuration and whether the object store mirror is in sync with the primary object store.
Step
1. Display information about a FabricPool mirror using the storage aggregate object-store show
command.
This example displays the details about the primary and mirror object stores in a FabricPool mirror.
This example displays details about the FabricPool mirror, including whether the mirror is degraded due to
a resync operation.
You can reassign the object store mirror as the primary object store by promoting it.
When the object store mirror becomes the primary, the original primary automatically
becomes the mirror.
What you’ll need
• The FabricPool mirror must be in sync
• The object store must be operational
Step
1. Promote an object store mirror by using the storage aggregate object-store modify
-aggregate command.
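A sketch, assuming the mirror is promoted by changing its mirror type to primary with a -mirror-type parameter (an assumption; only the -aggregate parameter is named above), with placeholder aggregate and object store names:

cluster1::> storage aggregate object-store modify -aggregate aggr1
-name my-store-2 -mirror-type primary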
You can remove a FabricPool mirror if you no longer need to replicate an object store.
What you’ll need
The primary object store must be operational; otherwise, the command fails.
Step
1. Remove an object store mirror in a FabricPool by using the storage aggregate object-store
unmirror -aggregate command.
cluster1::> storage aggregate object-store unmirror -aggregate aggr1
You can use FabricPool mirror technology to replace one object store with another one.
The new object store does not have to use the same cloud provider as the original object
store.
About this task
You can replace the original object store with an object store that uses a different cloud provider. For instance,
your original object store might use AWS as the cloud provider, but you can replace it with an object store that
uses Azure as the cloud provider, and vice versa. However, the new object store must retain the same object
size as the original.
Steps
1. Create a FabricPool mirror by adding a new object store to an existing FabricPool using the storage
aggregate object-store mirror command.
2. Monitor the mirror resync status using the storage aggregate object-store show-resync-
status command.
Complete
Aggregate Primary Mirror Percentage
--------- ----------- ---------- ----------
aggr1 my-AWS-store my-AZURE-store 40%
3. Verify the mirror is in sync using the storage aggregate object-store show -fields mirror-
type,is-mirror-degraded command.
aggregate object-store-name mirror-type is-mirror-degraded
-------------- ----------------- ------------- ------------------
aggr1 my-AWS-store primary -
my-AZURE-store mirror false
4. Swap the primary object store with the mirror object store using the storage aggregate object-
store modify command.
5. Display details about the FabricPool mirror using the storage aggregate object-store show
-fields mirror-type,is-mirror-degraded command.
This example displays the information about the FabricPool mirror, including whether the mirror is
degraded (not in sync).
6. Remove the FabricPool mirror using the storage aggregate object-store unmirror command.
7. Verify that the FabricPool is back in a single object store configuration using the storage aggregate
object-store show -fields mirror-type,is-mirror-degraded command.
Replace a FabricPool mirror on a MetroCluster configuration
2. Remove the object store mirror from the FabricPool by using the storage aggregate object-store
unmirror command.
3. You can force tiering to resume on the primary data store after you remove the mirror data store by using
the storage aggregate object-store modify with the -force-tiering-on-metrocluster
true option.
The absence of a mirror interferes with the replication requirements of a MetroCluster configuration.
4. Create a replacement object store by using the storage aggregate object-store config
create command.
5. Add the object store mirror to the FabricPool mirror using the storage aggregate object-store
mirror command.
storage aggregate object-store mirror -aggregate aggr1 -name
mcc1_ostore3-mc
6. Display the object store information using the storage aggregate object-store show command.
7. Monitor the mirror resync status using the storage aggregate object-store show-resync-
status command.
Complete
Aggregate Primary Mirror Percentage
--------- ----------- ---------- ----------
aggr1 mcc1_ostore1-mc mcc1_ostore3-mc 40%
You use the storage aggregate object-store commands to manage object stores
for FabricPool. You use the storage aggregate commands to manage aggregates for
FabricPool. You use the volume commands to manage volumes for FabricPool.
• Delete the configuration of an object store: storage aggregate object-store config delete
• Attach a second object store to a new or existing FabricPool as a mirror: storage aggregate
object-store mirror with the -aggregate and -name parameters in the admin privilege level
• Remove an object store mirror from an existing FabricPool mirror: storage aggregate object-store
unmirror with the -aggregate and -name parameters in the admin privilege level
• Promote an object store mirror to replace a primary object store in a FabricPool mirror configuration:
storage aggregate object-store modify with the -aggregate parameter in the admin privilege level
• Test the latency and performance of an object store without attaching the object store to an aggregate:
storage aggregate object-store profiler start with the -object-store-name and -node
parameters in the advanced privilege level
• Monitor the object store profiler status: storage aggregate object-store profiler show with the
-object-store-name and -node parameters in the advanced privilege level
• Abort the object store profiler when it is running: storage aggregate object-store profiler
abort with the -object-store-name and -node parameters in the advanced privilege level
• Attach an object store to an aggregate for using FabricPool: storage aggregate object-store attach
• Attach an object store to an aggregate that contains a FlexGroup volume for using FabricPool: storage
aggregate object-store attach with the allow-flexgroup true option
• Display details of the object stores that are attached to FabricPool-enabled aggregates: storage
aggregate object-store show
• Display the aggregate fullness threshold used by the tiering scan: storage aggregate object-store
show with the -fields tiering-fullness-threshold parameter in the advanced privilege level
• Display space utilization of the object stores that are attached to FabricPool-enabled aggregates:
storage aggregate object-store show-space
• Enable inactive data reporting on an aggregate that is not used for FabricPool: storage aggregate
modify with the -is-inactive-data-reporting-enabled true parameter
• Display whether inactive data reporting is enabled on an aggregate: storage aggregate show with the
-fields is-inactive-data-reporting-enabled parameter
• Display information about how much user data is cold within an aggregate: storage aggregate
show-space with the -fields performance-tier-inactive-user-data,performance-tier-
inactive-user-data-percent parameter
• Move a volume in to or out of FabricPool: volume move start. You use the -tiering-policy optional
parameter to specify the tiering policy for the volume.
• Modify the threshold for reclaiming unreferenced space (the defragmentation threshold) for FabricPool:
storage aggregate object-store modify with the -unreclaimed-space-threshold parameter
in the advanced privilege level
• Modify the threshold for the percent full the aggregate becomes before the tiering scan begins tiering data
for FabricPool: storage aggregate object-store modify with the -tiering-fullness-threshold
parameter in the advanced privilege level
• Display the threshold for reclaiming unreferenced space for FabricPool: storage aggregate
object-store show or storage aggregate object-store show-space with the
-unreclaimed-space-threshold parameter in the advanced privilege level
The SVM’s name and UUID remain unchanged after migration, as well as the data LIF name, IP address, and
object names, such as the volume name. The UUID of the objects in the SVM will be different.
The diagram depicts the typical workflow for an SVM migration. You start an SVM migration from the
destination cluster. You can monitor the migration from either the source or the destination. You can perform a
manual cutover or an automatic cutover. An automatic cutover is performed by default.
SVM migration platform support
When migrating from an AFF cluster to a FAS cluster with hybrid aggregates, auto volume
placement will attempt to perform a like-to-like aggregate match. For example, if the source
cluster has 60 volumes, the volume placement will try to find an AFF aggregate on the
destination to place the volumes. When there is not sufficient space on the AFF aggregates, the
volumes will be placed on aggregates with non-flash disks.
Network infrastructure performance requirements for TCP round trip time (RTT) between the source
and the destination cluster
Depending on the ONTAP version installed on the cluster, the network connecting the source and destination
clusters must have a maximum round trip time as indicated:
Source | Destination | ONTAP 9.14.1 | ONTAP 9.13.1 | ONTAP 9.12.1 | ONTAP 9.11.1 and earlier (values are maximum RTT in milliseconds)
AFF | AFF | 400 | 200 | 100 | 100
FAS | FAS | 80 | 80 | 80 | N/A
FAS | AFF | 80 | 80 | 80 | N/A
AFF | FAS | 80 | 80 | 80 | N/A
Prerequisites
Before initiating an SVM migration, you must meet the following prerequisites:
Best practice
When performing an SVM migration, it is a best practice to leave 30% CPU headroom on both the source
cluster and the destination cluster to enable the CPU workload to execute.
SVM operations
You should check for operations that can conflict with an SVM migration:
The table indicates the ONTAP features supported by SVM data mobility and the ONTAP releases in which
support is available.
For information about ONTAP version interoperability between a source and destination in an SVM migration,
see Compatible ONTAP versions for SnapMirror relationships.
• IPv6 LIFs: Not supported
• iSCSI SAN: Not supported
• Job schedule replication: ONTAP 9.11.1. In ONTAP 9.10.1, job schedules are not replicated during
migration and must be manually created on the destination. Beginning with ONTAP 9.11.1, job schedules
used by the source are replicated automatically during migration.
• Load-sharing mirrors: Not supported
• MetroCluster SVMs: Not supported. Although SVM migrate does not support MetroCluster SVM migration,
you might be able to use SnapMirror asynchronous replication to migrate an SVM in a MetroCluster
configuration. You should be aware that the process described for migrating an SVM in a MetroCluster
configuration is not a non-disruptive method.
• NetApp Aggregate Encryption (NAE): Not supported. Migration is not supported from an unencrypted
source to an encrypted destination.
• NDMP configurations: Not supported
• NetApp Volume Encryption (NVE): ONTAP 9.10.1
• NFS and SMB audit logs: ONTAP 9.13.1. Audit log redirect is only available in cloud-mode. For
on-premises SVM migration with audit enabled, you should disable audit on the source SVM and then
perform the migration.
• Onboard key manager (OKM) with Common Criteria mode enabled on source cluster: Not supported
• Qtrees: ONTAP 9.14.1
• Quotas: ONTAP 9.14.1
• S3: Not supported
• SMB protocol: ONTAP 9.12.1. SMB migrations are disruptive and require a client refresh post migration.
• SnapMirror cloud relationships: ONTAP 9.12.1. Beginning with ONTAP 9.12.1, when you migrate an SVM
with SnapMirror cloud relationships, the destination cluster must have the SnapMirror cloud license
installed, and it must have enough capacity available to support moving the capacity in the volumes that
are being mirrored to the cloud.
• SnapMirror asynchronous destination: ONTAP 9.12.1
• SnapMirror asynchronous source: ONTAP 9.11.1
◦ Transfers can continue as normal on FlexVol SnapMirror relationships during most of the migration.
◦ Any ongoing transfers are canceled during cutover and new transfers fail during cutover and they
cannot be restarted until the migration completes.
◦ Scheduled transfers that were canceled or missed during the migration are not automatically started
after the migration completes.
• SnapMirror active sync: Not supported
• SnapMirror SVM peer relationships: ONTAP 9.12.1
• SnapMirror SVM disaster recovery: Not supported
• SnapMirror synchronous: Not supported
• Snapshot copy: ONTAP 9.10.1
• Tamperproof Snapshot copy locking: ONTAP 9.14.1. Tamperproof Snapshot copy locking is not equivalent
to SnapLock. SnapLock remains unsupported.
• Virtual IP LIFs/BGP: Not supported
• Virtual Storage Console 7.0 and later: Not supported. VSC is part of the ONTAP Tools for VMware vSphere
virtual appliance beginning with VSC 7.0.
• Volume clones: Not supported
• vStorage: Not supported. Migration is not allowed when vStorage is enabled. To perform a migration,
disable the vStorage option, and then reenable it after migration is completed.
The following table indicates volume operations supported within the migrating SVM based on migration state:
Snapshot copy attributes modify Allowed Allowed Not supported
Snapshot copy autodelete modify Allowed Allowed Not supported
Snapshot copy create Allowed Allowed Not supported
Snapshot copy delete Allowed Allowed Not supported
Restore file from Snapshot copy Allowed Allowed Not supported
Migrate an SVM
After an SVM migration has completed, clients are cut over to the destination cluster
automatically and the unnecessary SVM is removed from the source cluster. Automatic
cutover and automatic source cleanup are enabled by default. If necessary, you can
disable client auto-cutover to suspend the migration before cutover occurs and you can
also disable automatic source SVM cleanup.
• You can use the -auto-cutover false option to suspend the migration when automatic client cutover
normally occurs and then manually perform the cutover later.
• You can use the advanced privilege -auto-source-cleanup false option to disable the removal of the
source SVM after cutover and then trigger source cleanup manually later, after cutover.
By default, clients are cut over to the destination cluster automatically when the migration is complete, and the
unnecessary SVM is removed from the source cluster.
Steps
1. From the destination cluster, run the migration prechecks:
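The precheck and the migration itself are both run from the destination cluster with the vserver migrate start command. The following is a minimal sketch; the SVM name (svm1) and source cluster name (source_cluster) are placeholders, and you should verify the exact syntax in the command reference for your ONTAP release:

dest_cluster::> vserver migrate start -vserver svm1 -source-cluster source_cluster -check-only true

When the prechecks pass, the same command without -check-only true starts the migration, and vserver migrate show reports its progress:

dest_cluster::> vserver migrate start -vserver svm1 -source-cluster source_cluster
dest_cluster::> vserver migrate show -vserver svm1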
Migrate an SVM with automatic client cutover disabled
You can use the -auto-cutover false option to suspend the migration when automatic client cutover normally
occurs and then manually perform the cutover later. See Manually cutover clients after SVM migration.
Steps
1. From the destination cluster, run the migration prechecks:
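As a sketch with placeholder names (confirm the syntax for your release), the prechecks and a start with automatic cutover disabled might look like the following, with the cutover triggered manually later by the vserver migrate cutover command:

dest_cluster::> vserver migrate start -vserver svm1 -source-cluster source_cluster -check-only true
dest_cluster::> vserver migrate start -vserver svm1 -source-cluster source_cluster -auto-cutover false
dest_cluster::> vserver migrate cutover -vserver svm1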
Migrate an SVM with automatic source cleanup disabled
You can use the advanced privilege -auto-source-cleanup false option to disable the removal of the source SVM
after cutover and then trigger source cleanup manually later, after cutover. See Manually remove source SVM.
Steps
1. From the destination cluster, run the migration prechecks:
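A sketch with placeholder names (confirm the syntax for your release); because -auto-source-cleanup is an advanced privilege option, you might need to run set -privilege advanced first. The source cleanup is later triggered manually with vserver migrate source-cleanup:

dest_cluster::> vserver migrate start -vserver svm1 -source-cluster source_cluster -check-only true
dest_cluster::> vserver migrate start -vserver svm1 -source-cluster source_cluster -auto-source-cleanup false
dest_cluster::> vserver migrate source-cleanup -vserver svm1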
The status displays ready-for-source-cleanup when SVM migration cutover is complete, and it is ready to
remove the SVM on the source cluster.
In addition to monitoring the overall SVM migration with the vserver migrate show
command, you can monitor the migration status of the volumes the SVM contains.
Steps
1. Check volume migration status:
dest_clust> vserver migrate show-volume
Pause migration
You can pause an SVM migration before client cutover starts by using the vserver migrate pause
command.
Some configuration changes are restricted when a migration operation is in progress; however, beginning with
ONTAP 9.12.1, you can pause a migration to fix some restricted configurations and for some failed states so
that you can fix configuration issues that might have caused the failure. Some of the failed states that you can
fix when you pause SVM migration include the following:
• setup-configuration-failed
• migrate-failed
Steps
1. From the destination cluster, pause the migration:
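For example, with a placeholder SVM name of svm1:

dest_cluster::> vserver migrate pause -vserver svm1
dest_cluster::> vserver migrate show -vserver svm1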
Resume migrations
When you’re ready to resume a paused SVM migration or when an SVM migration has failed, you can use the
vserver migrate resume command.
Step
1. Resume SVM migration:
2. Verify that the SVM migration has resumed, and monitor the progress:
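For example, again using svm1 as a placeholder SVM name:

dest_cluster::> vserver migrate resume -vserver svm1
dest_cluster::> vserver migrate show -vserver svm1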
If you need to cancel an SVM migration before it completes, you can use the vserver
migrate abort command. You can cancel an SVM migration only when the operation
is in the paused or failed state. You cannot cancel an SVM migration when the status is
“cutover-started” or after cutover is complete. You cannot use the abort option when an
SVM migration is in progress.
Steps
1. Check the migration status:
dest_cluster> vserver migrate show -vserver <vserver name>
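If the status shows a paused or failed state, you can then cancel the migration; for example, with a placeholder SVM name:

dest_cluster::> vserver migrate abort -vserver svm1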
The migration status shows migrate-aborting while the cancel operation is in progress. When the cancel
operation completes, no migration status is displayed.
HA pair management
HA pair management overview
Cluster nodes are configured in high-availability (HA) pairs for fault tolerance and
nondisruptive operations. If a node fails or if you need to bring a node down for routine
maintenance, its partner can take over its storage and continue to serve data from it. The
partner gives back storage when the node is brought back online.
The HA pair controller configuration consists of a pair of matching FAS/AFF storage controllers (local node and
partner node). Each of these nodes is connected to the other’s disk shelves. When one node in an HA pair
encounters an error and stops processing data, its partner detects the failed status of the partner and takes
over all data processing from that controller.
Takeover is the process in which a node assumes control of its partner’s storage. A takeover can be initiated when any of the following occurs:
• A software or system failure occurs on a node that leads to a panic. The HA pair controllers automatically
fail over to their partner node. After the partner has recovered from the panic and booted up, the node
automatically performs a giveback, returning the partner to normal operation.
• A system failure occurs on a node, and the node cannot reboot. For example, when a node fails because
of a power loss, HA pair controllers automatically fail over to their partner node and serve data from the
surviving storage controller.
If the storage for a node also loses power at the same time, a standard takeover is not possible.
• Heartbeat messages are not received from the node’s partner. This could happen if the partner
experienced a hardware or software failure (for example, an interconnect failure) that did not result in a
panic but still prevented it from functioning correctly.
• You halt one of the nodes without using the -f or -inhibit-takeover true parameter.
In a two-node cluster with cluster HA enabled, halting or rebooting a node using the ‑inhibit
‑takeover true parameter causes both nodes to stop serving data unless you first disable
cluster HA and then assign epsilon to the node that you want to remain online.
• You reboot one of the nodes without using the ‑inhibit‑takeover true parameter. (The ‑onboot
parameter of the storage failover command is enabled by default.)
• The remote management device (Service Processor) detects failure of the partner node. This is not
applicable if you disable hardware-assisted takeover.
You can also manually initiate takeovers with the storage failover takeover command.
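For example, a minimal sketch that takes over the storage of a node named node2 from its partner and then monitors progress (node names are placeholders):

cluster::> storage failover takeover -ofnode node2
cluster::> storage failover show-takeover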
Beginning in ONTAP 9.9.1, the following resiliency and diagnostic additions improve cluster operation:
• Port monitoring and avoidance: In two-node switchless cluster configurations, the system avoids ports
that experience total packet loss (connectivity loss). In ONTAP 9.8 and earlier, this functionality was
available only in switched configurations.
• Automatic node failover: If a node cannot serve data across its cluster network, that node should not own
any disks. Instead its HA partner should take over, if the partner is healthy.
• Commands to analyze connectivity issues: Use the following command to display which cluster paths
are experiencing packet loss: network interface check cluster-connectivity show
How hardware-assisted takeover works
Enabled by default, the hardware-assisted takeover feature can speed up the takeover
process by using a node’s remote management device (Service Processor).
When the remote management device detects a failure, it quickly initiates the takeover rather than waiting for
ONTAP to recognize that the partner’s heartbeat has stopped. If a failure occurs without this feature enabled,
the partner waits until it notices that the node is no longer giving a heartbeat, confirms the loss of heartbeat,
and then initiates the takeover.
The hardware-assisted takeover feature uses the following process to avoid that wait:
1. The remote management device monitors the local system for certain types of failures.
2. If a failure is detected, the remote management device immediately sends an alert to the partner node.
3. Upon receiving the alert, the partner initiates takeover.
The partner node might generate a takeover depending on the type of alert it receives from the remote
management device (Service Processor).
Related information
Hardware-assisted (HWassist) takeover - Resolution guide
Automatic takeovers may also occur if one of the nodes becomes unresponsive.
Automatic giveback occurs by default. If you would rather control the impact of giveback on clients, you can
disable automatic giveback by using the storage failover modify -auto-giveback false -node <node>
command. Before performing the automatic giveback (regardless of what triggered it), the partner node waits
for a fixed amount of time as controlled by the -delay-seconds parameter of the storage failover
modify command. The default delay is 600 seconds. By delaying the giveback, the process results in two brief
outages: one during takeover and one during giveback.
This process avoids the single, prolonged outage that would otherwise occur.
If the automatic giveback fails for any of the non-root aggregates, the system automatically makes two
additional attempts to complete the giveback.
During the takeover process, the automatic giveback process starts before the partner node is
ready for the giveback. When the time limit of the automatic giveback process expires and the
partner node is still not ready, the timer restarts. As a result, the time between the partner node
being ready and the actual giveback being performed might be shorter than the automatic
giveback time.
When a node takes over its partner, it continues to serve and update data in the partner’s aggregates and
volumes.
1. If the negotiated takeover is user-initiated, ownership of the aggregates is relocated from the partner node
to the node that is performing the takeover. A brief outage occurs as the current owner of each aggregate
(except for the root aggregate) changes over to the takeover node. This outage is briefer than an outage that
occurs during a takeover without aggregate relocation.
A negotiated takeover cannot occur in the case of a panic. A takeover can result
from a failure not associated with a panic. A failure is experienced when communication is
lost between a node and its partner, also called a heartbeat loss. If a takeover occurs
because of a failure, the outage might be longer because the partner node needs time to
detect the heartbeat loss.
◦ You can monitor the progress using the storage failover show‑takeover command.
◦ You can avoid the aggregate relocation during this takeover instance by using the ‑bypass
‑optimization parameter with the storage failover takeover command.
Aggregates are relocated serially during planned takeover operations to reduce client outage. If
aggregate relocation is bypassed, longer client outage occurs during planned takeover events.
2. If the user-initiated takeover is a negotiated takeover, the target node gracefully shuts down, followed by
takeover of the target node’s root aggregate and any aggregates that were not relocated in the first step.
3. Data LIFs (logical interfaces) migrate from the target node to the takeover node, or to any other node in the
cluster based on LIF failover rules. You can avoid the LIF migration by using the ‑skip‑lif-migration
parameter with the storage failover takeover command. In the case of a user-initiated takeover,
data LIFs are migrated before storage takeover begins. In the event of a panic or failure, depending upon
your configuration, data LIFs could be migrated with the storage, or after takeover is complete.
4. Existing SMB sessions are disconnected when takeover occurs.
Due to the nature of the SMB protocol, all SMB sessions are disrupted (except for SMB 3.0
sessions connected to shares with the Continuous Availability property set). SMB 1.0 and
SMB 2.x sessions cannot reconnect open file handles after a takeover event; therefore,
takeover is disruptive and some data loss could occur.
5. SMB 3.0 sessions that are established to shares with the Continuous Availability property enabled can
reconnect to the disconnected shares after a takeover event. If your site uses SMB 3.0 connections to
Microsoft Hyper-V and the Continuous Availability property is enabled on the associated shares, takeovers
are non-disruptive for those sessions.
If the node that is performing the takeover panics within 60 seconds of initiating takeover, the following events
occur:
The local node returns ownership to the partner node when issues are resolved, when the partner node boots
up, or when giveback is initiated.
The following process takes place in a normal giveback operation. In this discussion, Node A has taken over
Node B. Any issues on Node B have been resolved and it is ready to resume serving data.
1. Any issues on Node B are resolved and it displays the following message: Waiting for giveback
2. The giveback is initiated by the storage failover giveback command or by automatic giveback if the
system is configured for it. This initiates the process of returning ownership of Node B’s aggregates and
volumes from Node A back to Node B.
3. Node A returns control of the root aggregate first.
4. Node B completes the process of booting up to its normal operating state.
5. As soon as Node B reaches the point in the boot process where it can accept the non-root aggregates,
Node A returns ownership of the other aggregates, one at a time, until giveback is complete. You can
monitor the progress of the giveback by using the storage failover show-giveback command.
The storage failover show-giveback command does not (nor is it intended to)
display information about all operations occurring during the storage failover giveback
operation. You can use the storage failover show command to display additional
details about the current failover status of the node, such as if the node is fully functional,
takeover is possible, and giveback is complete.
I/O resumes for each aggregate after giveback is complete for that aggregate, which reduces its overall
outage window.
ONTAP automatically assigns an HA policy of CFO (controller failover) or SFO (storage failover) to an
aggregate. This policy determines how storage failover operations occur for the aggregate and its volumes.
The two options, CFO and SFO, determine the aggregate control sequence ONTAP uses during storage
failover and giveback operations.
Although the terms CFO and SFO are sometimes used informally to refer to storage failover (takeover and
giveback) operations, they actually represent the HA policy assigned to the aggregates. For example, the
terms SFO aggregate or CFO aggregate simply refer to the aggregate’s HA policy assignment.
• Aggregates created on ONTAP systems (except for the root aggregate containing the root volume) have an
HA policy of SFO. Manually initiated takeover is optimized for performance by relocating SFO (non-root)
aggregates serially to the partner before takeover. During the giveback process, aggregates are given back
serially after the taken-over system boots and the management applications come online, enabling the
node to receive its aggregates.
• Because aggregate relocation operations entail reassigning aggregate disk ownership and shifting control
from a node to its partner, only aggregates with an HA policy of SFO are eligible for aggregate relocation.
• The root aggregate always has an HA policy of CFO and is given back at the start of the giveback
operation. This is necessary to allow the taken-over system to boot. All other aggregates are given back
serially after the taken-over system completes the boot process and the management applications come
online, enabling the node to receive its aggregates.
Changing the HA policy of an aggregate from SFO to CFO is a Maintenance mode operation.
Do not modify this setting unless directed to do so by a customer support representative.
Background updates of the disk firmware will affect HA pair takeover, giveback, and aggregate relocation
operations differently, depending on how those operations are initiated.
The following list describes how background disk firmware updates affect takeover, giveback, and aggregate
relocation:
• If a background disk firmware update occurs on a disk on either node, manually initiated takeover
operations are delayed until the disk firmware update finishes on that disk. If the background disk firmware
update takes longer than 120 seconds, takeover operations are aborted and must be restarted manually
after the disk firmware update finishes. If the takeover was initiated with the ‑bypass‑optimization
parameter of the storage failover takeover command set to true, the background disk firmware
update occurring on the destination node does not affect the takeover.
• If a background disk firmware update is occurring on a disk on the source (or takeover) node and the
takeover was initiated manually with the ‑options parameter of the storage failover takeover
command set to immediate, takeover operations start immediately.
• If a background disk firmware update is occurring on a disk on a node and it panics, takeover of the
panicked node begins immediately.
• If a background disk firmware update is occurring on a disk on either node, giveback of data aggregates is
delayed until the disk firmware update finishes on that disk.
• If the background disk firmware update takes longer than 120 seconds, giveback operations are aborted
and must be restarted manually after the disk firmware update completes.
• If a background disk firmware update is occurring on a disk on either node, aggregate relocation operations
are delayed until the disk firmware update finishes on that disk. If the background disk firmware update
takes longer than 120 seconds, aggregate relocation operations are aborted and must be restarted
manually after the disk firmware update finishes. If aggregate relocation was initiated with the -override
-destination-checks of the storage aggregate relocation command set to true, the
background disk firmware update occurring on the destination node does not affect aggregate relocation.
To receive prompt notification if the takeover capability becomes disabled, you should configure your system to
enable automatic email notification for the “takeover impossible” EMS messages listed below (a configuration sketch follows the list):
• ha.takeoverImpVersion
• ha.takeoverImpLowMem
• ha.takeoverImpDegraded
• ha.takeoverImpUnsync
• ha.takeoverImpIC
• ha.takeoverImpHotShelf
• ha.takeoverImpNotDef
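One way to do this is with an EMS filter and an email notification destination. The following is a minimal sketch, assuming a destination named ha_admins, the address [email protected], and a wildcard that matches the events above; check the exact event names and syntax against your release:

cluster::> event notification destination create -name ha_admins -email [email protected]
cluster::> event filter create -filter-name takeover-impossible
cluster::> event filter rule add -filter-name takeover-impossible -type include -message-name ha.takeoverImp*
cluster::> event notification create -filter-name takeover-impossible -destinations ha_admins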
Disable automatic giveback (the default setting is true):
storage failover modify ‑node nodename ‑auto‑giveback false

Disable automatic giveback after takeover on panic (this setting is enabled by default):
storage failover modify ‑node nodename ‑auto‑giveback‑after‑panic false

Delay automatic giveback for a specified number of seconds (the default is 600). This option determines the minimum time that a node remains in takeover before performing an automatic giveback:
storage failover modify ‑node nodename ‑delay‑seconds seconds
How variations of the storage failover modify command affect automatic giveback
The operation of automatic giveback depends on how you configure the parameters of the storage failover
modify command.
The following table lists the default settings for the storage failover modify command parameters that
apply to takeover events not caused by a panic.
-onreboot (possible values: true | false; default: true)
The following table describes how combinations of the -onreboot and -auto-giveback parameters affect
automatic giveback for takeover events not caused by a panic.
The -auto-giveback parameter controls giveback after panic and all other automatic takeovers. If the
-onreboot parameter is set to true and a takeover occurs due to a reboot, then automatic giveback is
always performed, regardless of whether the -auto-giveback parameter is set to true.
The -onreboot parameter applies to reboots and halt commands issued from ONTAP. When the -onreboot
parameter is set to false, a takeover does not occur in the case of a node reboot. Therefore, automatic
giveback cannot occur, regardless of whether the -auto-giveback parameter is set to true. A client
disruption occurs.
The effects of automatic giveback parameter combinations that apply to panic situations.
The following table lists the storage failover modify command parameters that apply to panic
situations:
-auto-giveback-after-panic (possible values: true | false; default: true; privilege: advanced)
The following table describes how parameter combinations of the storage failover modify command
affect automatic giveback in panic situations.
• -onpanic true, -auto-giveback false, -auto-giveback-after-panic false: automatic giveback does not occur.
• -onpanic false: automatic giveback does not occur. If -onpanic is set to false, takeover/giveback does not
occur, regardless of the value set for -auto-giveback or -auto-giveback-after-panic.
A takeover can result from a failure not associated with a panic. A failure is experienced when
communication is lost between a node and its partner, also called a heartbeat loss. If a takeover
occurs because of a failure, giveback is controlled by the -onfailure parameter instead of the
-auto-giveback-after-panic parameter.
When a node panics, it sends a panic packet to its partner node. If for any reason the panic
packet is not received by the partner node, the panic can be misinterpreted as a failure. Without
receipt of the panic packet, the partner node knows only that communication has been lost, and
does not know that a panic has occurred. In this case, the partner node processes the loss of
communication as a failure instead of a panic, and giveback is controlled by the -onfailure
parameter (and not by the -auto-giveback-after-panic parameter).
For details on all storage failover modify parameters, see the ONTAP manual pages.
The time required to perform the takeover varies.
Before you issue the storage failover command with the immediate option, you must migrate the
data LIFs to another node by using the following command: network interface migrate-
all -node node
If you specify the storage failover takeover ‑option immediate command without
first migrating the data LIFs, data LIF migration from the node is significantly delayed even if the
skip‑lif‑migration‑before‑takeover option is not specified.
Similarly, if you specify the immediate option, negotiated takeover optimization is bypassed even
if the bypass‑optimization option is set to false.
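Putting the two commands together, a hedged sketch of an immediate takeover of a node named node1 (node names are placeholders):

cluster::> network interface migrate-all -node node1
cluster::> storage failover takeover -ofnode node1 -option immediate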
You should move epsilon if you expect that any manually initiated takeovers could result in your storage
system being one unexpected node failure away from a cluster-wide loss of quorum.
This can occur if the node being taken over holds epsilon or if the node with epsilon is not healthy. To maintain
a more resilient cluster, you can transfer epsilon to a healthy node that is not being taken over.
Typically, this would be the HA partner.
Only healthy and eligible nodes participate in quorum voting. To maintain cluster-wide quorum, more than N/2
votes are required (where N represents the sum of healthy, eligible, online nodes). In clusters
with an even number of online nodes, epsilon adds additional voting weight toward maintaining quorum for the
node to which it is assigned.
Although cluster formation voting can be modified by using the cluster modify
‑eligibility false command, you should avoid this except for situations such as restoring
the node configuration or prolonged node maintenance. If you set a node as ineligible, it stops
serving SAN data until the node is reset to eligible and rebooted. NAS data access to the node
might also be affected when the node is ineligible.
Steps
1. Verify the cluster state and confirm that epsilon is held by a healthy node that is not being taken over:
a. Change to the advanced privilege level, confirming that you want to continue when the advanced mode
prompt appears (*>):
cluster show
If the node you want to take over does not hold epsilon, proceed to Step 4.
2. Remove epsilon from the node that you want to take over:
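A minimal sketch, assuming node1 is the node being taken over and node2 is its healthy HA partner; epsilon changes require the advanced privilege level:

cluster::> set -privilege advanced
cluster::*> cluster modify -node node1 -epsilon false
cluster::*> cluster modify -node node2 -epsilon true
cluster::*> cluster show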
Prior to performing a giveback, you must remove the failed drives in the taken-over system as
described in Disks and aggregates management.
If giveback is interrupted
If the takeover node experiences a failure or a power outage during the giveback process, that process stops
and the takeover node returns to takeover mode until the failure is repaired or the power is restored.
However, this depends upon the stage of giveback in which the failure occurred. If the node encountered
failure or a power outage during partial giveback state (after it has given back the root aggregate), it will not
return to takeover mode. Instead, the node returns to partial-giveback mode. If this occurs, complete the
process by repeating the giveback operation.
If giveback is vetoed
If giveback is vetoed, you must check the EMS messages to determine the cause. Depending on the reason or
reasons, you can decide whether you can safely override the vetoes.
The storage failover show-giveback command displays the giveback progress and shows which
subsystem vetoed the giveback, if any. Soft vetoes can be overridden, while hard vetoes cannot be, even if
forced. The following table summarizes the soft vetoes that should not be overridden, along with recommended
workarounds.
You can review the EMS details for any giveback vetoes by using the following command:
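As a sketch (verify the exact event pattern and syntax for your release), you can filter the event log for giveback-related events:

cluster::> event log show -node * -event gb*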
Disk Check: All failed or bypassed disks should be removed before attempting giveback. If disks are sanitizing, you should wait until the operation completes.

Lock Manager: Gracefully shut down the SMB applications that have open files, or move those volumes to a different aggregate.

RAID: Check the EMS messages to determine the cause of the veto. If the veto is due to nvfile, bring the offline volumes and aggregates online. If the veto is due to mirror resync, mirror verify, or offline disks, the veto can be overridden and the operation restarts after giveback.

Disk Inventory: Troubleshoot to identify and resolve the cause of the problem.

Volume Move Operation: Troubleshoot to identify and resolve the cause of the problem. This veto prevents the volume move operation from aborting during the important cutover phase. If the job is aborted during cutover, the volume might become inaccessible.
You can manually initiate a giveback on a node in an HA pair to return storage to the original owner after
completing maintenance or resolving
any issues that caused the takeover.
Give back storage even if the partner is not in the waiting for giveback mode:
storage failover giveback ‑ofnode nodename ‑require‑partner‑waiting false

Give back storage even if processes are vetoing the giveback operation (force the giveback):
storage failover giveback ‑ofnode nodename ‑override‑vetoes true

Give back only the CFO aggregates (the root aggregate):
storage failover giveback ‑ofnode nodename ‑only‑cfo‑aggregates true

Monitor the progress of giveback after you issue the giveback command:
storage failover show‑giveback
Node     Partner  Takeover Possible  State Description
node1    node2    -                  Waiting for giveback
node2    node1    false              In takeover, Auto giveback will be initiated in number of seconds
5. Display all the disks that belong to the partner node (Node2) that the takeover node (Node1) can detect:
The following command displays all disks belonging to Node2 that Node1 can detect:
cluster::> storage disk show -home node2 -ownership
6. Confirm that the takeover node (Node1) controls the partner node’s (Node2) aggregates:
During takeover, the “is-home” value of the partner node’s aggregates is false.
7. Give back the partner node’s data service after it displays the “Waiting for giveback” message:
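For example, assuming Node1 took over Node2 as in this procedure:

cluster::> storage failover giveback -ofnode node2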
8. Enter either of the following commands to observe the progress of the giveback operation:
storage failover show-giveback
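The second command referenced in step 8 is assumed here to be storage failover show, which reports the overall failover state while show-giveback reports per-aggregate giveback progress:

storage failover show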
9. Proceed, depending on whether you saw the message that giveback was completed successfully:
The following list describes the node states that the storage failover show command displays.
State: Connected to partner_name, Automatic takeover disabled.
Meaning: The HA interconnect is active and can transmit data to the partner node. Automatic takeover of the partner is disabled.

State: Waiting for partner_name, Giveback of partner spare disks pending.
Meaning: The local node cannot exchange information with the partner node over the HA interconnect. Giveback of SFO aggregates to the partner is done, but partner spare disks are still owned by the local node.

State: Waiting for partner_name. Waiting for partner lock synchronization.
Meaning: The local node cannot exchange information with the partner node over the HA interconnect, and is waiting for partner lock synchronization to occur.

State: Waiting for partner_name. Waiting for cluster applications to come online on the local node.
Meaning: The local node cannot exchange information with the partner node over the HA interconnect, and is waiting for cluster applications to come online.

State: Takeover scheduled. target node relocating its SFO aggregates in preparation of takeover.
Meaning: Takeover processing has started. The target node is relocating ownership of its SFO aggregates in preparation for takeover.

State: Takeover scheduled. target node has relocated its SFO aggregates in preparation of takeover.
Meaning: Takeover processing has started. The target node has relocated ownership of its SFO aggregates in preparation for takeover.

State: Takeover scheduled. Waiting to disable background disk firmware updates on local node. A firmware update is in progress on the node.
Meaning: Takeover processing has started. The system is waiting for background disk firmware update operations on the local node to complete.

State: Relocating SFO aggregates to taking over node in preparation of takeover.
Meaning: The local node is relocating ownership of its SFO aggregates to the taking-over node in preparation for takeover.

State: Relocated SFO aggregates to taking over node. Waiting for taking over node to takeover.
Meaning: Relocation of ownership of SFO aggregates from the local node to the taking-over node has completed. The system is waiting for takeover by the taking-over node.

State: Relocating SFO aggregates to partner_name. Waiting to disable background disk firmware updates on the local node. A firmware update is in progress on the node.
Meaning: Relocation of ownership of SFO aggregates from the local node to the taking-over node is in progress. The system is waiting for background disk firmware update operations on the local node to complete.

State: Relocating SFO aggregates to partner_name. Waiting to disable background disk firmware updates on partner_name. A firmware update is in progress on the node.
Meaning: Relocation of ownership of SFO aggregates from the local node to the taking-over node is in progress. The system is waiting for background disk firmware update operations on the partner node to complete.

State: Connected to partner_name. Previous takeover attempt was aborted because reason. Local node owns some of partner’s SFO aggregates. Reissue a takeover of the partner with the ‑bypass‑optimization parameter set to true to takeover remaining aggregates, or issue a giveback of the partner to return the relocated aggregates.
Meaning: The HA interconnect is active and can transmit data to the partner node. The previous takeover attempt was aborted because of the reason displayed under reason. The local node owns some of its partner’s SFO aggregates. Either reissue a takeover of the partner node, setting the ‑bypass‑optimization parameter to true to takeover the remaining SFO aggregates, or perform a giveback of the partner to return relocated aggregates.

State: Connected to partner_name. Previous takeover attempt was aborted. Local node owns some of partner’s SFO aggregates. Reissue a takeover of the partner with the ‑bypass‑optimization parameter set to true to takeover remaining aggregates, or issue a giveback of the partner to return the relocated aggregates.
Meaning: The HA interconnect is active and can transmit data to the partner node. The previous takeover attempt was aborted. The local node owns some of its partner’s SFO aggregates. Either reissue a takeover of the partner node, setting the ‑bypass‑optimization parameter to true to takeover the remaining SFO aggregates, or perform a giveback of the partner to return relocated aggregates.

State: Waiting for partner_name. Previous takeover attempt was aborted because reason. Local node owns some of partner’s SFO aggregates. Reissue a takeover of the partner with the "‑bypass‑optimization" parameter set to true to takeover remaining aggregates, or issue a giveback of the partner to return the relocated aggregates.
Meaning: The local node cannot exchange information with the partner node over the HA interconnect. The previous takeover attempt was aborted because of the reason displayed under reason. The local node owns some of its partner’s SFO aggregates. Either reissue a takeover of the partner node, setting the ‑bypass‑optimization parameter to true to takeover the remaining SFO aggregates, or perform a giveback of the partner to return relocated aggregates.

State: Waiting for partner_name. Previous takeover attempt was aborted. Local node owns some of partner’s SFO aggregates. Reissue a takeover of the partner with the "‑bypass‑optimization" parameter set to true to takeover remaining aggregates, or issue a giveback of the partner to return the relocated aggregates.
Meaning: The local node cannot exchange information with the partner node over the HA interconnect. The previous takeover attempt was aborted. The local node owns some of its partner’s SFO aggregates. Either reissue a takeover of the partner node, setting the ‑bypass‑optimization parameter to true to takeover the remaining SFO aggregates, or perform a giveback of the partner to return relocated aggregates.

State: Connected to partner_name. Previous takeover attempt was aborted because failed to disable background disk firmware update (BDFU) on local node.
Meaning: The HA interconnect is active and can transmit data to the partner node. The previous takeover attempt was aborted because the background disk firmware update on the local node was not disabled.

State: Connected to partner_name. Previous takeover attempt was aborted because reason.
Meaning: The HA interconnect is active and can transmit data to the partner node. The previous takeover attempt was aborted because of the reason displayed under reason.

State: Waiting for partner_name. Previous takeover attempt was aborted because reason.
Meaning: The local node cannot exchange information with the partner node over the HA interconnect. The previous takeover attempt was aborted because of the reason displayed under reason.

State: Connected to partner_name. Previous takeover attempt by partner_name was aborted because reason.
Meaning: The HA interconnect is active and can transmit data to the partner node. The previous takeover attempt by the partner node was aborted because of the reason displayed under reason.

State: Connected to partner_name. Previous takeover attempt by partner_name was aborted.
Meaning: The HA interconnect is active and can transmit data to the partner node. The previous takeover attempt by the partner node was aborted.

State: Waiting for partner_name. Previous takeover attempt by partner_name was aborted because reason.
Meaning: The local node cannot exchange information with the partner node over the HA interconnect. The previous takeover attempt by the partner node was aborted because of the reason displayed under reason.

State: Previous giveback failed in module: module name. Auto giveback will be initiated in number of seconds seconds.
Meaning: The previous giveback attempt failed in module module_name. Auto giveback will be initiated in number of seconds seconds.

State: Node owns partner’s aggregates as part of the non-disruptive controller upgrade procedure.
Meaning: The node owns its partner’s aggregates due to the non-disruptive controller upgrade procedure currently in progress.

State: Connected to partner_name. Node owns aggregates belonging to another node in the cluster.
Meaning: The HA interconnect is active and can transmit data to the partner node. The node owns aggregates belonging to another node in the cluster.

State: Connected to partner_name. Waiting for partner lock synchronization.
Meaning: The HA interconnect is active and can transmit data to the partner node. The system is waiting for partner lock synchronization to complete.

State: Connected to partner_name. Waiting for cluster applications to come online on the local node.
Meaning: The HA interconnect is active and can transmit data to the partner node. The system is waiting for cluster applications to come online on the local node.

State: Non-HA mode, reboot to use full NVRAM.
Meaning: Storage failover is not possible. The HA mode option is configured as non_ha.

State: Non-HA mode. Reboot node to activate HA.
Meaning: Storage failover is not possible.
You should only disable storage failover if required as part of a maintenance procedure.
• In a two-node cluster, cluster HA ensures that the failure of one node does not disable the
cluster. However, if you do not disable cluster HA before using the -inhibit-takeover
true parameter, both nodes stop serving data.
• If you attempt to halt or reboot a node before disabling cluster HA, ONTAP issues a warning
and instructs you to disable cluster HA.
• You migrate LIFs (logical interfaces) to the partner node that you want to remain online.
• If on the node you are halting or rebooting there are aggregates you want to keep, you move them to the
node that you want to remain online.
Steps
1. Verify both nodes are healthy:
cluster show
2. Migrate all LIFs from the node that you will halt or reboot to the partner node:
network interface migrate-all -node node_name
3. If on the node you will halt or reboot there are aggregates you want to keep online when the node is down,
relocate them to the partner node; otherwise, go to the next step.
a. Show the aggregates on the node you will halt or reboot:
storage aggregate show -node node_name
cluster::> storage aggregate show -node node1
Aggregate       Size     Available  Used%  State   #Vols  Nodes  RAID Status
--------------  -------  ---------  -----  ------  -----  -----  ----------------
aggr0_node_1_0  744.9GB  32.68GB    96%    online      2  node1  raid_dp, normal
aggr1           2.91TB   2.62TB     10%    online      8  node1  raid_dp, normal
aggr2           4.36TB   3.74TB     14%    online     12  node1  raid_dp, normal
test2_aggr      2.18TB   2.18TB      0%    online      7  node1  raid_dp, normal
4 entries were displayed.
For example, aggregates aggr1, aggr2 and test2_aggr are being moved from node1 to node2:
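The relocation command mirrors the one used later to return the aggregates; for example:

storage aggregate relocation start -node node1 -destination node2 -aggregate-list aggr1,aggr2,test2_aggr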
5. Halt or reboot and inhibit takeover of the target node, by using the appropriate command:
◦ system node halt -node node_name -inhibit-takeover true
◦ system node reboot -node node_name -inhibit-takeover true
In the command output, you will see a warning asking whether you want to proceed; enter y.
6. Verify that the node that is still online is in a healthy state (while the partner is down):
cluster show
For the online node, true appears in the Health column.
In the command output, you will see a warning that cluster HA is not configured. You can
ignore the warning at this time.
7. Perform the actions that required you to halt or reboot the node.
8. Boot the offlined node from the LOADER prompt:
boot_ontap
9. Verify both nodes are healthy:
cluster show
In the command output, you will see a warning that cluster HA is not configured. You can
ignore the warning at this time.
After both nodes are healthy, return any aggregates that you relocated earlier to their original node. For example, aggregates aggr1, aggr2, and test2_aggr are moved from node2 back to node1:

storage aggregate relocation start -node node2 -destination node1 -aggregate-list aggr1,aggr2,test2_aggr
How System Manager uses the REST API and API log
There are several ways that REST API calls are issued by System Manager to ONTAP.
When does System Manager issue API calls
Here are the most important examples of when System Manager issues ONTAP REST API calls.
System Manager automatically issues API calls in the background to refresh the displayed information, such as
on the dashboard page.
One or more API calls are issued when you display a specific storage resource or a collection of resources
from the System Manager UI.
An API call is issued when you add, modify, or delete an ONTAP resource from the System Manager UI.
You can also manually reissue an API call by clicking a log entry. This displays the raw JSON output from the
call.
More information
The most recent entries are displayed at the bottom of the page.
2. On the left, click DASHBOARD and observe the new entries being created for the API calls issued to
refresh the page.
3. Click STORAGE and then click Qtrees.
This causes System Manager to issue a specific API call to retrieve a list of the Qtrees.
4. Locate the log entry describing the API call which has the form:
GET /api/storage/qtrees
You will see additional HTTP query parameters included with the entry, such as max_records.
5. Click the log entry to reissue the GET API call and display the raw JSON output.
Example
{
  "records": [
    {
      "svm": {
        "uuid": "19507946-e801-11e9-b984-00a0986ab770",
        "name": "SMQA",
        "_links": {
          "self": {
            "href": "/api/svm/svms/19507946-e801-11e9-b984-00a0986ab770"
          }
        }
      },
      "volume": {
        "uuid": "1e173258-f98b-11e9-8f05-00a0986abd71",
        "name": "vol_vol_test2_dest_dest",
        "_links": {
          "self": {
            "href": "/api/storage/volumes/1e173258-f98b-11e9-8f05-00a0986abd71"
          }
        }
      },
      "id": 1,
      "name": "test2",
      "security_style": "mixed",
      "unix_permissions": 777,
      "export_policy": {
        "name": "default",
        "id": 12884901889,
        "_links": {
          "self": {
            "href": "/api/protocols/nfs/export-policies/12884901889"
          }
        }
      },
      "path": "/vol_vol_test2_dest_dest/test2",
      "_links": {
        "self": {
          "href": "/api/storage/qtrees/1e173258-f98b-11e9-8f05-00a0986abd71/1"
        }
      }
    }
  ],
  "num_records": 1,
  "_links": {
    "self": {
      "href": "/api/storage/qtrees?max_records=20&fields=*&name=!%22%22"
    }
  }
}
Copyright information
Copyright © 2024 NetApp, Inc. All Rights Reserved. Printed in the U.S. No part of this document covered by
copyright may be reproduced in any form or by any means—graphic, electronic, or mechanical, including
photocopying, recording, taping, or storage in an electronic retrieval system—without prior written permission
of the copyright owner.
Software derived from copyrighted NetApp material is subject to the following license and disclaimer:
THIS SOFTWARE IS PROVIDED BY NETAPP “AS IS” AND WITHOUT ANY EXPRESS OR IMPLIED
WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY
AND FITNESS FOR A PARTICULAR PURPOSE, WHICH ARE HEREBY DISCLAIMED. IN NO EVENT SHALL
NETAPP BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE
GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
NetApp reserves the right to change any products described herein at any time, and without notice. NetApp
assumes no responsibility or liability arising from the use of products described herein, except as expressly
agreed to in writing by NetApp. The use or purchase of this product does not convey a license under any
patent rights, trademark rights, or any other intellectual property rights of NetApp.
The product described in this manual may be protected by one or more U.S. patents, foreign patents, or
pending applications.
LIMITED RIGHTS LEGEND: Use, duplication, or disclosure by the government is subject to restrictions as set
forth in subparagraph (b)(3) of the Rights in Technical Data -Noncommercial Items at DFARS 252.227-7013
(FEB 2014) and FAR 52.227-19 (DEC 2007).
Data contained herein pertains to a commercial product and/or commercial service (as defined in FAR 2.101)
and is proprietary to NetApp, Inc. All NetApp technical data and computer software provided under this
Agreement is commercial in nature and developed solely at private expense. The U.S. Government has a non-
exclusive, non-transferrable, nonsublicensable, worldwide, limited irrevocable license to use the Data only in
connection with and in support of the U.S. Government contract under which the Data was delivered. Except
as provided herein, the Data may not be used, disclosed, reproduced, modified, performed, or displayed
without the prior written approval of NetApp, Inc. United States Government license rights for the Department
of Defense are limited to those rights identified in DFARS clause 252.227-7015(b) (FEB 2014).
Trademark information
NETAPP, the NETAPP logo, and the marks listed at http://www.netapp.com/TM are trademarks of NetApp, Inc.
Other company and product names may be trademarks of their respective owners.