ONTAP Cluster Fundamentals
1
Welcome
The ONTAP Cluster Fundamentals course:
▪ Is for cluster administrators of any experience level
▪ Is divided into five modules:
▪ Clusters
▪ Management
▪ Networking
▪ Storage Virtual Machines
▪ Maintenance
The ONTAP Cluster Fundamentals course is written for cluster administrators of any
experience level. The course is divided into five modules, with each module based on
a specific topic. The course is followed by a final assessment.
2
(Learning map, Foundational to Intermediate: ONTAP NAS Fundamentals, ONTAP Data Protection Fundamentals, ONTAP SMB Administration, ONTAP NFS Administration, ONTAP Data Protection Administration, ONTAP Compliance Solutions Administration)
The location marker indicates the course that you are attending. You should complete
this course before you attend the ONTAP Cluster Administration course.
3
How to Complete This Course
Instructions
ONTAP Cluster Fundamentals Pre-Assessment
▪ If you achieved 80% or greater:
▪ Review any of the ONTAP Cluster Fundamentals modules (optional)
▪ Take the final assessment
▪ If you received a list of recommended course modules:
▪ Study the recommended course modules, or study all course modules
▪ Take the final assessment
4
ONTAP Cluster Fundamentals:
Clusters
5
Course Modules
1. Clusters
2. Management
3. Networking
4. Storage Virtual Machines
5. Maintenance
The ONTAP Cluster Fundamentals course has been divided into five modules, each
module based on a specific topic. You can take the modules in any order. However,
NetApp recommends that you take Clusters first, Management second, Networking
third, Storage Virtual Machines fourth, and Maintenance fifth.
This module was written for cluster administrators and provides an introduction to the
concept of a cluster.
6
About This Module
This module focuses on enabling you to do the following:
▪ Identify the components that make up a cluster
▪ Describe the cluster configurations that are supported
▪ Create and configure a cluster
▪ Describe the physical storage components
▪ Describe the Write Anywhere File Layout (WAFL) file system
This module identifies and describes the components that make up a cluster. The
module also describes the supported cluster configurations and details the steps that
are required to create and configure a cluster. Then the module discusses the
physical storage components and the Write Anywhere File Layout file system, also
known as the WAFL file system.
7
NetApp ONTAP Is the Foundation for Your Data Fabric
(Diagram: Data Fabric data mobility across departments or remote offices)
Data Fabric powered by NetApp weaves hybrid cloud mobility with uniform data
management.
For more information about Data Fabric, see the Welcome to Data Fabric video. A link
to this video is available in the Resources section.
8
Lesson 1
Cluster Components
9
Harness the Power of the Hybrid Cloud
This lesson introduces NetApp ONTAP 9 data management software and the
components that make up a cluster.
A basic knowledge of the components helps you to understand how ONTAP can
simplify the transition to the modern data center.
10
Clusters
(Diagram: FAS and All Flash FAS nodes connected by a cluster interconnect)
You might be wondering, “What exactly is a cluster?” To answer that question, this
lesson examines the components individually, but begins with a high-level view.
A cluster is one or more FAS controllers or All Flash FAS controllers that run ONTAP.
A controller running ONTAP is called a “node.” In clusters with more than one node, a
cluster interconnect is required so that the nodes appear as one cluster.
A cluster can be a mix of various FAS and All Flash FAS models, depending on the
workload requirements. Also, nodes can be added to or removed from a cluster as
workload requirements change. For more information about the number and types of
nodes, see the Hardware Universe at hwu.netapp.com. A link is provided in the
module resources.
11
Nodes
What a node consists of:
▪ A FAS or All Flash FAS controller running ONTAP software:
▪ Network ports
▪ Expansion slots
▪ Nonvolatile memory (NVRAM or NVMEM)
▪ Disks
A node consists of a FAS controller or an All Flash FAS controller that is running
ONTAP software. The controller contains network ports, expansion slots, and
NVRAM or NVMEM. Disks are also required. The disks can be internal to the
controller or in a disk shelf.
For information about specific controller models, see the product documentation on
the NetApp Support site, or see the Hardware Universe.
12
High-Availability Pairs
▪ Characteristics of high-availability (HA) pairs:
▪ Two connected nodes that form a partnership
▪ Connections to the same disk shelves
▪ Ability of surviving node to take control of failed partner’s disks
(Diagram: a FAS8060 with an internal interconnect, connected to shared disk shelves)
In multinode clusters, high-availability (HA) pairs are used. An HA pair consists of two
nodes that are connected to form a partnership. The nodes of the pair are connected to
the same shelves. Each node owns its disks. However, if either node fails, the partner node can take control of all the disks, both its own and its partner’s.
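As a quick illustration (not part of the original course slides, and output formatting varies by ONTAP version), the clustershell can confirm HA pair status; the cluster name cluster1 and the node names are examples only:
cluster1::> storage failover show
Node           Partner        Takeover Possible  State Description
-------------- -------------- -----------------  --------------------------
cluster1-01    cluster1-02    true               Connected to cluster1-02
cluster1-02    cluster1-01    true               Connected to cluster1-01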
13
Networks
▪ Cluster interconnect:
▪ Connection of nodes
▪ Private network
▪ Management network:
▪ For cluster administration
▪ Management and data may be on a shared Ethernet network
▪ Data network:
▪ One or more networks that are used for data access from clients or hosts
▪ Ethernet, FC, or converged network
In multinode clusters, nodes need to communicate with each other over a cluster
interconnect. In a two-node cluster, the interconnect can be switchless. When more
than two nodes are added to a cluster, a private cluster interconnect using switches is
required.
For clients and hosts to access data, a data network is also required. The data network
can be composed of one or more networks that are primarily used for data access by
clients or hosts. Depending on the environment, there might be an Ethernet, FC, or
converged network. These networks can consist of one or more switches, or even
redundant networks.
14
Ports and Logical Interfaces
(Diagram: physical ports e2a and e3a on a node)
Nodes have various physical ports that are available for cluster traffic, management
traffic, and data traffic. These ports need to be configured appropriately for the
environment.
Ethernet ports can be used directly or combined by using interface groups. Also,
physical Ethernet ports and interface groups can be segmented by using virtual
LANs, or VLANs. Interface groups and VLANs are called virtual ports, and virtual
ports are treated similarly to physical ports.
A logical interface, or LIF, represents a network access point to a node in the cluster.
A LIF can be associated with a physical port, an interface group, or a VLAN to
interface with the management network or data network.
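For example (an illustrative sketch; field names and output vary by ONTAP version), the clustershell can list the physical ports on a node and the LIFs that are bound to them. Here cluster1 and cluster1-01 are example names:
cluster1::> network port show -node cluster1-01
cluster1::> network interface show -fields home-node,home-port,address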
15
ONTAP Storage Architecture
(Diagram: the physical layer, an aggregate built from RAID groups of disks)
The ONTAP storage architecture uses a dynamic virtualization engine, where data
volumes are dynamically mapped to physical space.
Disks are grouped into RAID groups. An aggregate is a collection of physical disk
space that contains one or more RAID groups. Each aggregate has a RAID
configuration and a set of assigned disks. The disks, RAID groups, and aggregates
make up the physical storage layer.
Within each aggregate, you can create one or more FlexVol volumes. A FlexVol
volume is an allocation of disk space that is a portion of the available space in the
aggregate. A FlexVol volume can contain files or LUNs. The FlexVol volumes, files,
and LUNs make up the logical storage layer.
16
Physical Storage
▪ Disk:
▪ Disk ownership can be assigned to one controller.
▪ A disk can be used as a spare or added to a
RAID group.
▪ RAID group:
▪ A RAID group is a collection of disks.
▪ Data is striped across the disks.
▪ Aggregate:
▪ One or more RAID groups can be used to form
an aggregate.
▪ An aggregate is owned by one controller.
There are three parts that make up the physical storage on a node.
When a disk enters the system, the disk is unowned. Ownership is automatically or
manually assigned to a single controller. After ownership is assigned, a disk will be
marked as spare until the disk is used to create an aggregate or added to an existing
aggregate.
A RAID group is a collection of disks across which client data is striped and stored.
To support the differing performance and data sharing needs, you can group the
physical data storage resources into one or more aggregates. Aggregates can contain
one or more RAID groups, depending on the desired level of performance and
redundancy. Although aggregates can be owned by only one controller, aggregates
can be relocated to the HA partner for service or performance reasons.
17
Logical Storage
▪ Storage virtual machine (SVM):
▪ Container for data volumes
▪ Client data is accessed through a LIF
▪ LIF:
▪ Representation of the network address that is associated with a port
▪ Access to client data
A storage virtual machine, or SVM, contains data volumes and logical interfaces, or
LIFs. The data volumes store client data which is accessed through a LIF.
A volume is a logical data container that might contain files or LUNs. ONTAP software
provides three types of volumes: FlexVol volumes, FlexGroup volumes, and Infinite
volumes. Volumes contain file systems in a NAS environment and LUNs in a SAN
environment.
A LIF represents the IP address or worldwide port name (WWPN) that is associated
with a port. Data LIFs are used to access client data.
18
SVM with FlexVol Volumes
▪ FlexVol volume:
▪ Representation of the file system in a NAS environment
▪ Container for LUNs in a SAN environment
▪ Qtree:
▪ Partitioning of FlexVol volumes into smaller segments
▪ Management of quotas, security style, and CIFS opportunistic lock (oplock) settings
(Diagram: an SVM containing a FlexVol volume with qtrees Q1, Q2, and Q3 and a LUN; clients and hosts connect through data LIFs)
An SVM can contain one or more FlexVol volumes. In a NAS environment, volumes
represent the file system where clients store data. In a SAN environment, a LUN is
created in the volumes for a host to access.
Qtrees can be created to partition a FlexVol volume into smaller segments, much like
directories. Qtrees can also be used to manage quotas, security styles, and CIFS
opportunistic lock settings, or oplock settings.
A LUN is a logical unit that represents a SCSI disk. In a SAN environment, the host
operating system controls the reads and writes for the file system.
19
FlexGroup Volumes
▪ A scale-out NAS container constructed from a group of FlexVol volumes,
which are called “constituents.”
▪ Constituents are placed evenly across the cluster to automatically and
transparently share a traffic load.
For more information about FlexGroup volumes, see the Scalability and Performance
Using FlexGroup Volumes Power Guide.
20
SVM with Infinite Volume
▪ Infinite Volume:
▪ One scalable volume that can store up to 2 billion files and tens of petabytes of data
▪ Several constituents
▪ Constituent roles:
▪ The data constituents store data.
▪ The namespace constituent tracks file names, directories, and the file's physical data location.
▪ The namespace mirror constituent is a data protection mirror copy of the namespace constituent.
(Diagram: an SVM with an Infinite Volume whose data (D), namespace (NS), and namespace mirror (M) constituents are spread across the cluster; clients connect through a data LIF)
An SVM can contain one infinite volume. An infinite volume appears to a NAS client
as a single, scalable volume that can store up to 2 billion files and tens of petabytes of
data. Each infinite volume consists of several, typically dozens, of separate
components called constituents.
The data constituents, shown on the slide in blue, store the file’s physical data.
Clients are not aware of the data constituents and do not interact directly with them.
When a client requests a file from an infinite volume, the node retrieves the file's data
from a data constituent and returns the file to the client.
Each infinite volume has one namespace constituent, shown on the slide in green.
The namespace constituent tracks file names, directories, and the file's physical data
location. Clients are also not aware of the namespace constituent and do not interact
directly with the namespace constituent.
A namespace mirror constituent, shown on the slide in red, is a data protection mirror
copy of the namespace constituent. It provides data protection of the namespace
constituent and support for incremental tape backup of infinite volumes.
For more information about infinite volumes, see the Infinite Volumes Management
Guide.
21
Knowledge Check
▪ Match each term with the term’s function.
22
Knowledge Check
▪ Which three are network types? (Choose three.)
▪ Cluster interconnect
▪ Management network
▪ Data network
▪ HA network
23
Lesson 2
Cluster Configurations
24
Consolidate Across Environments with ONTAP 9
Simplify data management for any application, anywhere
ONTAP 9
(Deployment options: storage array, converged, heterogeneous, SDS, near cloud, and cloud)
ONTAP is mostly known as the data management software that runs on FAS and All
Flash FAS controllers. ONTAP 9 has many deployment options to choose from.
ONTAP can be deployed on engineered systems, which includes FAS and All Flash
FAS; converged systems, which includes FAS and All Flash FAS as part of a FlexPod
solution; third-party or E-Series storage arrays that use FlexArray virtualization
software; or near the cloud with NetApp Private Storage (NPS), which uses FAS or All
Flash FAS systems.
Whichever deployment type you choose, you manage ONTAP in much the same
way, for a variety of applications. Although the ONTAP Cluster Fundamentals course
focuses on ONTAP clusters using FAS or All Flash FAS, the knowledge is also
applicable to all the deployment options.
25
Supported Cluster Configurations
Single-node, two-node switchless, multinode switched, and MetroCluster
26
Single-Node Cluster
▪ Single-node cluster:
▪ Special implementation of a cluster that runs on a
standalone node
▪ Appropriate when your workload requires only one
node and does not need nondisruptive operations
▪ Use case: Data protection for a remote office
Some features and operations are not supported for single-node clusters. Because
single-node clusters operate in a standalone mode, storage failover and cluster high
availability are not available. If the node goes offline, clients cannot access data
stored in the cluster. Also, any operation that requires more than one node cannot be
performed. For example, you cannot move volumes, perform most copy operations,
or back up cluster configurations to other nodes.
27
Understanding HA Pairs
▪ HA pairs provide hardware redundancy to
do the following:
▪ Perform nondisruptive operations and upgrades
▪ Provide fault tolerance
▪ Enable a node to take over its partner’s storage and
later give back the storage
▪ Eliminate most hardware components and cables as
single points of failure
▪ Improve data availability
A storage system has various single points of failure, such as certain cables or
hardware components. An HA pair greatly reduces the number of single points of
failure. If a failure occurs, the partner can take over and continue serving data until
the failure is fixed. The controller failover function provides continuous data
availability and preserves data integrity for client applications and users.
28
HA Interconnect
(Diagram: Node 1 and Node 2 joined by the HA interconnect, with primary and standby connections to Node 1 storage and Node 2 storage. Note: Multipath HA redundant storage connections are not shown.)
This example uses a standard FAS8080 EX HA pair with native DS4246 disk shelves.
The controllers in the HA pair are connected through an HA interconnect that consists
of adapters and cables. When the two controllers are in the same chassis, adapters
and cabling are not required because connections are made through an internal
interconnection. To validate an HA configuration, use the Hardware Universe.
For multipath HA support, redundant primary and secondary connections are also
required. For simplicity, these connections are not shown on the slide. Multipath HA is
required on all HA pairs except for some FAS2500 series system configurations,
which use single-path HA and lack the redundant standby connections.
29
Two-Node Cluster Interconnect
In a two-node switchless cluster, ports are connected directly between nodes.
(Diagram: cluster interconnect ports on a FAS8060, with four onboard 10-GbE ports per controller)
In clusters with more than one node, a cluster interconnect is required. This example
shows a FAS8060 system that has two controllers installed in the chassis. Each
controller has a set of four onboard 10-GbE ports that can be used to connect to the
cluster interconnect.
30
Switched Clusters
(Diagram: a cluster interconnect with two switches joined by Inter-Switch Links (ISLs))
If your workload requires more than two nodes, the cluster interconnect requires
switches. The cluster interconnect requires two dedicated switches for redundancy
and load balancing. Inter-Switch Links (ISLs) are required between the two switches.
There should always be at least two cluster connections, one to each switch, from
each node. The required connections vary, depending on the controller model.
After the cluster interconnect is established, you can add more nodes as your
workload requires.
For more information about the maximum number and models of controllers
supported, see the Hardware Universe.
For more information about the cluster interconnect and connections, see the Network
Management Guide.
31
MetroCluster
Benefits of MetroCluster software:
▪ Zero data loss
▪ Failover protection
▪ Nondisruptive upgrades
32
MetroCluster Configurations
In a two-node configuration, each site or data center contains a cluster that consists
of a single node. The nodes in a two-node MetroCluster configuration are not
configured as an HA pair. However, because all storage is mirrored, a switchover
operation can be used to provide nondisruptive resiliency similar to that found in a
storage failover in an HA pair.
In a four-node configuration, each site or data center contains a cluster that consists
of an HA pair. A four-node MetroCluster configuration protects data on a local level
and on a cluster level.
For more information about the MetroCluster configurations, see the MetroCluster
Management and Disaster Recovery Guide.
33
Knowledge Check
▪ Which cluster configuration provides a cost-effective,
nondisruptively scalable solution?
▪ Single-node
▪ Two-node switchless
▪ Multi-node switched
▪ MetroCluster
34
Knowledge Check
▪ What is the maximum number of cluster switches that can be used in a
multinode switched cluster configuration?
▪ One
▪ Two
▪ Three
▪ Four
What is the maximum number of cluster switches that can be used in a multinode
switched cluster configuration?
35
Lesson 3
Create and Configure a Cluster
36
Creating a Cluster
▪ Cluster creation methods:
▪ Cluster setup wizard, using the CLI
▪ Guided Cluster Setup, using OnCommand
System Manager
After installing the hardware, you can set up the cluster by using the cluster setup wizard (via
the CLI) or, in ONTAP 9.1 and later, by using the Guided Cluster Setup (via OnCommand
System Manager).
Before you set up a cluster, you should use a cluster setup worksheet to record the values that
you will need during the setup process. Worksheets are available on the NetApp Support
website.
Whichever method you choose, you begin by using the CLI to enter the cluster setup wizard
from a single node in the cluster. The cluster setup wizard prompts you to configure the node
management interface. Next, the cluster setup wizard asks whether you want to complete the
setup wizard by using the CLI.
If you press Enter, the wizard continues using the CLI to guide you through the configuration.
When you are prompted, enter the information that you collected on the worksheet. After
creating the cluster, you use the node setup wizard to join nodes to the cluster one at a time.
The node setup wizard helps you to configure each node's node-management interface.
It is recommended that, after you complete the cluster setup and add all the nodes, you
configure additional settings, such as the cluster time and AutoSupport.
If you choose to use the Guided Cluster Setup, instead of the CLI, use your web browser to
connect to the node management IP that you configured on the first node. When prompted,
enter the information that you collected on the worksheet. The Guided Cluster Setup discovers
all the nodes in the cluster and configures them at the same time.
For more information about setting up a cluster, see the Software Setup Guide.
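As a hedged sketch of the CLI path (the exact prompts differ between ONTAP versions), the wizard is launched from the console of the first node, and the same command is used on each additional node to join the cluster. The elided prompts ask for the values recorded on the cluster setup worksheet:
::> cluster setup
Welcome to the cluster setup wizard. ...
Do you want to create a new cluster or join an existing cluster? {create, join}: create
...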
37
Cluster Administration
▪ Cluster administrators administer
the entire cluster:
▪ All cluster resources
▪ SVM creation and management
▪ Access control and roles
▪ Resource delegation
▪ Login credentials:
▪ The default user name is “admin.”
▪ Use the password that was created
during cluster setup.
You access OnCommand System Manager through a web browser by entering the
cluster administration interface IP address that was created during cluster setup. You
log in as cluster administrator to manage the entire cluster. You manage all cluster
resources, the creation and management of SVMs, access control and roles, and
resource delegation.
To log in to the cluster, you use the default user name “admin” and the password that
you configured during cluster creation.
38
Managing Resources in a Cluster
OnCommand System Manager:
▪ Visual representation of the available resources
▪ Wizard-based resource creation
▪ Best-practice configurations
▪ Limited advanced operations
The CLI:
▪ Manual or scripted commands
▪ Manual resource creation that might require many steps
▪ Ability to focus and switch between specific objects quickly
There are many tools that can be used to create and manage cluster resources, each
with their own advantages and disadvantages. This slide focuses on two tools.
The CLI can also be used to create and configure resources. Commands are entered
manually or through scripts. Instead of the wizards that are used in System Manager,
the CLI might require many manual commands to create and configure a resource.
Although manual commands give the administrator more control, manual commands
are also more prone to mistakes that can cause issues. One advantage of using the
CLI is that the administrator can quickly switch focus without having to move through
System Manager pages to find different objects.
39
Knowledge Check
▪ In OnCommand System Manager, which user name do you use to
manage a cluster?
▪ admin
▪ administrator
▪ root
▪ vsadmin
In OnCommand System Manager, which user name do you use to manage a cluster?
40
Knowledge Check
▪ In the CLI, which user name do you use to manage a cluster?
▪ admin
▪ administrator
▪ root
▪ vsadmin
41
Lesson 4
Physical Storage
42
ONTAP Storage Architecture
(Diagram: the physical layer, an aggregate built from RAID groups of disks)
This lesson focuses on the physical storage layer. The physical storage layer consists
of disks, RAID groups, and the aggregate.
43
Disk Types
ONTAP Disk Type | Disk Class        | Industry-Standard Disk Type | Description
BSAS            | Capacity          | SATA                        | Bridged SAS-SATA disks
FSAS            | Capacity          | NL-SAS                      | Near-line SAS
mSATA           | Capacity          | SATA                        | SATA disk in multidisk carrier storage shelf
SAS             | Performance       | SAS                         | Serial-attached SCSI
SSD             | Ultra-performance | SSD                         | Solid-state drive
ATA             | Capacity          | SATA                        | FC-connected Serial ATA
FC-AL           | Performance       | FC                          | Fibre Channel
LUN             | Not applicable    | LUN                         | Array LUN
VMDISK          | Not applicable    | VMDK                        | Virtual Machine Disks that VMware ESX formats and manages
At the lowest level, data is stored on disks. The disks that are most commonly used
are SATA disks for capacity, SAS disks for performance, and solid-state drives, or
SSDs, for ultra-performance.
The LUN disk type is not the same as a LUN that is created in a FlexVol volume. The
LUN disk type appears when the FlexArray storage virtualization software presents
an array LUN to ONTAP.
44
Identifying Disks
(Diagram: a DS4246 disk shelf with its shelf ID)
In all storage systems, disks are named to enable the quick location of a disk. The
example identifies disk 1.0.22 located in a DS4246 shelf.
ONTAP assigns the stack ID, which is unique across the cluster. The shelf ID is set
on the storage shelf when the shelf is added to the stack or loop. The bay is the
position of the disk within its shelf.
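For example (illustrative output; columns vary by version, and the cluster and node names are placeholders), the clustershell shows the shelf and bay that make up each disk name:
cluster1::> storage disk show -fields shelf,bay,owner
disk    shelf  bay  owner
------  -----  ---  -----------
1.0.22  0      22   cluster1-01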
45
Array LUNs
▪ Array LUNs are presented to ONTAP using FlexArray storage virtualization software:
▪ An array LUN is created on the enterprise storage array and presented to ONTAP.
▪ Array LUNs can function as hot spares or be assigned to aggregates.
(Diagram: an E-Series or enterprise storage array presenting array LUNs)
Like disks, array LUNs can be used to create an aggregate. With the FlexArray
storage virtualization software licenses, you enable an enterprise storage array to
present an array LUN to ONTAP. An array LUN uses an FC connection type.
The way that ONTAP treats an array LUN is similar to the way it treats a typical disk.
When array LUNs are in use, the aggregates are configured with RAID 0. RAID
protection for the array LUN is provided by the enterprise storage array, not ONTAP.
Also, the aggregate can contain only other array LUNs. The aggregate cannot contain
hard disks or SSDs.
For more information about array LUNs, see the FlexArray Virtualization
Implementation Guides.
46
Disks and Aggregates
▪ What happens when a disk is inserted into a system:
▪ The disk is initially “unowned.”
▪ By default, disk ownership is assigned automatically.
▪ Disk ownership can be changed.
(Diagram: unowned disks become spares that can then be used in an aggregate)
When a disk is inserted into a storage system’s disk shelf or a new shelf is added, the
disk is initially unowned. By default, the controller takes ownership of the disk. In an
HA pair, only one of the controllers can own a particular disk, but ownership can be
manually assigned to either controller.
When an aggregate is created or disks are added to an aggregate, the spare disks
are used.
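As an illustration (example disk and node names; the commands are standard, but options vary by version), unowned disks can be listed and then assigned manually from the clustershell:
cluster1::> storage disk show -container-type unassigned
cluster1::> storage disk assign -disk 1.0.22 -owner cluster1-01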
47
RAID Groups
▪ Disks are added to RAID groups
within an aggregate.
▪ Disks must be the same type:
▪ SAS, SATA, or SSD
▪ Array LUNs
When an aggregate is created or disks are added to an aggregate, the disks are
grouped into one or more RAID groups. Disks within a RAID group protect each other
in the event of a disk failure. Disk failure is discussed on the next slide.
Disks within a RAID group or aggregate must be the same type and usually the same
speed.
You should always provide enough hot spares for each disk type. That way, if a disk
in the group fails, the data can be reconstructed on a spare disk.
48
RAID Types
▪ RAID 4:
▪ RAID 4 provides a parity disk to protect the data in the event of a single-disk failure.
▪ RAID 4 data aggregates require a minimum of three disks.
▪ RAID-DP:
▪ RAID-DP provides two parity disks to protect the data in the event of a double-disk failure.
▪ RAID-DP data aggregates require a minimum of five disks.
▪ RAID-TEC:
▪ RAID-TEC provides three parity disks to protect the data in the event of a triple-disk failure.
▪ RAID-TEC data aggregates require a minimum of seven disks.
(Diagram: data disks plus a parity disk, a double-parity disk, and a triple-parity disk)
Three primary RAID types are used in ONTAP: RAID 4, RAID-DP, and RAID-TEC.
RAID 4 provides a parity disk to protect data in the event of a single-disk failure. If a
data disk fails, the system uses the parity information to reconstruct the data on a
spare disk. When you create a RAID 4 data aggregate, a minimum of three disks are
required.
RAID-DP technology provides two parity disks to protect data in the event of a
double-disk failure. If a second disk fails or becomes unreadable during
reconstruction when RAID 4 is in use, the data might not be recoverable. With RAID-
DP technology, a second parity disk can also be used to recover the data. When you
create a RAID-DP data aggregate, a minimum of five disks are required. RAID-DP is
the default for most disk types.
RAID-TEC technology provides three parity disks to protect data in the event of a
triple-disk failure. As disks become increasingly larger, RAID-TEC can be used to
reduce exposure to data loss during long rebuild times. When you create a RAID-TEC
data aggregate, a minimum of seven disks are required. RAID-TEC is the default for
SATA and near-line SAS hard disks that are 6 TB or larger.
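For example (a sketch with hypothetical aggregate and node names; defaults and minimums depend on disk type and ONTAP version), a RAID-DP data aggregate can be created from the clustershell by specifying the RAID type explicitly:
cluster1::> storage aggregate create -aggregate aggr1_data -node cluster1-01 -diskcount 10 -raidtype raid_dp
cluster1::> storage aggregate show -aggregate aggr1_data -fields raidtype,size,state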
49
Aggregates
▪ Aggregates are composed of RAID groups that contain disks or array LUNs:
▪ All RAID groups must be the same RAID type.
▪ Aggregates contain the same disk type.
(Diagram: a storage system aggregate with plex0 (pool 0) containing RAID group rg0)
To support the differing security, backup, performance, and data sharing needs of
your users, you can group the physical data storage resources on your storage
system into one or more aggregates. You can then design and configure these
aggregates to provide the appropriate level of performance and redundancy.
Each aggregate has its own RAID configuration, plex structure, and set of assigned
disks or array LUNs. Aggregates can contain multiple RAID groups, but the RAID
type and disk type must be the same.
Aggregates contain a single copy of data, which is called a plex. A plex contains all
the RAID groups that belong to the aggregate. Plexes can be mirrored by using the
SyncMirror software, which is most commonly used in MetroCluster configurations.
Each plex is also assigned a pool of hot spare disks.
50
Aggregate Types
Each node of an HA pair requires three disks to be used for a RAID-DP root
aggregate, which is created when the system is first initialized. The root aggregate
contains the node’s root volume, named vol0, which contains configuration
information and log files. ONTAP prevents you from creating other volumes in the root
aggregate.
Aggregates for user data are called non-root aggregates or data aggregates. Data
aggregates must be created before any data SVMs or FlexVol volumes. When you
are creating data aggregates, the default is RAID-DP with a minimum of five disks for
most disk types. The aggregate can contain hard disks, SSDs, or array LUNs.
51
Advanced Disk Partitioning
▪ Advanced Disk Partitioning (ADP):
▪ Shared disks for more efficient resource use
▪ Reduced root aggregate disk consumption requirements
▪ Partitioning types:
▪ Root-data
▪ Root-data-data (not shown)
▪ Default configuration for:
▪ Entry-level FAS2xxx systems
▪ All Flash FAS systems
(Diagram: 12 disks, each divided into a small root partition that holds the node root aggregates, root parity, and root spares, and a larger data partition that holds the user data aggregate, data parity, and data spares)
All nodes require a dedicated root aggregate of three disks, and a spare disk should
be provided for each node. Therefore, a 12-disk, entry-level system, as shown here,
would require at least eight disks before a data aggregate could even be created.
This configuration creates a challenge for administrators because the four remaining
disks do not meet the five-disk minimum for a RAID DP data aggregate.
ADP reserves a small slice from each disk to create the root partition that can be used
for the root aggregates and hot spares. The remaining larger slices are configured as
data partitions that can be used for data aggregates and hot spares. The partitioning
type that is shown is called root-data partitioning. A second type of partitioning that is
called root-data-data partitioning creates one small partition as the root partition
and two larger, equally sized partitions for data.
ADP is the default configuration for entry-level systems and for All Flash FAS
systems. Different ADP configurations and partitioning types are available, depending
on the controller model, disk type, disk size, or RAID type.
For more information about ADP configurations, see the Hardware Universe.
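As an illustrative check (the commands are standard, but the output depends on the platform and ONTAP version), partitioned spare capacity on an ADP system can be viewed from the clustershell; shared (partitioned) disks report a container type of shared:
cluster1::> storage aggregate show-spare-disks -original-owner cluster1-01
cluster1::> storage disk show -fields container-type,owner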
52
Hybrid Aggregates
Flash Pool aggregate
A Flash Pool aggregate combines SAS or SATA disks and SSDs to provide a high-
performance aggregate that is more economical than an SSD aggregate. The SSDs
provide a high-performance cache for the active dataset of the data volumes that are
provisioned on the Flash Pool aggregate. The cache offloads random read operations
and repetitive random write operations to improve response times and overall
throughput for disk I/O-bound data access operations.
Flash Pool can improve workloads that use online transactional processing, or OLTP,
for example a database application’s data. Flash Pool does not improve performance
of predominantly sequential workloads.
53
Hybrid Aggregates
FabricPool aggregate
Storing data in tiers can enhance the efficiency of your storage system. FabricPool
stores data in a tier based on whether the data is frequently accessed. ONTAP
automatically moves inactive data to lower-cost cloud storage, which makes more
space available on primary storage for active workloads.
For more information about FabricPool aggregates, see the Disks and Aggregates
Power Guide.
54
Knowledge Check
▪ What is the minimum number of disks that are required to create a
RAID-DP data aggregate (excluding hot spares)?
▪ Two
▪ Three
▪ Four
▪ Five
▪ Six
What is the minimum number of disks that are required to create a RAID-DP data
aggregate (excluding hot spares)?
55
Knowledge Check
▪ What does a Flash Pool aggregate contain?
▪ Hard disks only
▪ Solid state drives (SSDs) only
▪ Hard disks for data storage and SSDs for caching
▪ Hard disks and SSDs that are used for data storage
56
Lesson 5
WAFL
Lesson 5, WAFL.
57
Write Anywhere File Layout
Write Anywhere File Layout (WAFL) file system:
▪ Organizes blocks of data on disk into files
▪ FlexVol volumes represent the file system
(Diagram: a FlexVol volume with an inode file whose inodes point to data blocks A through E)
The Write Anywhere File Layout, or WAFL, file system organizes blocks of data on
disks into files. The logical container, which is a FlexVol volume, represents the file
system.
The WAFL file system stores metadata in inodes. The term “inode” refers to index
nodes. Inodes are pointers to the blocks on disk that hold the actual data. Every file
has an inode, and each volume has a hidden inode file, which is a collection of the
inodes in the volume.
58
NVRAM and Write Operations
▪ What happens when a host or client writes to the storage system:
▪ The system simultaneously writes to system memory and logs the data in NVRAM.
▪ If the system is part of an HA pair, the system also mirrors the log to the partner.
▪ The write can safely be acknowledged because the NVRAM is battery-backed memory.
When a host or client writes to the storage system, the system simultaneously writes
to system memory and logs the data in NVRAM. If the system is part of an HA pair,
the system also simultaneously mirrors the logs to the partner.
After the write is logged in battery-backed NVRAM and mirrored to the HA pair, the
system can safely acknowledge the write to the host or client.
The system does not write the data to disk immediately. The WAFL file system
caches the writes in system memory. Write operations are sent to disk, with other
write operations in system memory, at a consistency point, or CP. The system only
uses the data that is logged in NVRAM during a system failure, so after the data is
safely on disk, the logs are flushed from NVRAM.
59
Consistency Points
Certain circumstances trigger a CP:
▪ A ten-second timer runs out.
▪ An NVRAM buffer fills up and it is time to flush the writes to disk.
▪ A Snapshot copy is created.
(Diagram: blocks A, B, C, D, E, and a new block D’)
WAFL optimizes all the incoming write requests in system memory before committing
the write requests to disk. The point at which the system commits the data in memory
to disk is called a consistency point because the data in system memory and disks is
consistent then.
A CP occurs at least once every 10 seconds or when the NVRAM buffer is full,
whichever comes first. CPs can also occur at other times, for example when a
Snapshot copy is created.
60
Direct Write Operation
(Diagram: client access through a network interface card (NIC) or host-bus adapter (HBA) to system memory and NVRAM, with NVRAM mirrored over the HA interconnect to the partner)
When a write request is sent from the client, the storage system receives the request
through a network interface card (NIC) or a host-bus adapter (HBA). In this case, the
write is to a volume that is on the node and therefore has direct access. The write is
simultaneously processed into system memory, logged in NVRAM, and mirrored to
the NVRAM of the partner node of the HA pair. After the write has been safely logged,
the write is acknowledged to the client. The write is sent to storage at the next CP.
61
Indirect Write Operation
If a write request is sent from the client to a volume that is on a different node, the
write request accesses the volume indirectly.
The write request is processed by the node to which the volume is connected. The
write is redirected, through the cluster interconnect, to the node that owns the volume.
The write is simultaneously processed into system memory, logged in NVRAM, and
mirrored to the NVRAM of the partner node of the HA pair. After the write has been
safely logged, the write is acknowledged to the client. The write is sent to storage at
the next CP.
62
Direct Cache Read Operation
When a read request is sent from the client, the storage system receives the request
through a NIC or an HBA. In this case, the read is from a volume that is on the node
and therefore has direct access. The system first checks to see if the data is still in
system memory, which is called read cache. If the data is still in cache, the system
serves the data to the client.
63
Direct Disk Read Operation
If the data is not in cache, the system retrieves the data into system memory. After the
data is cached, the system serves the data to the client.
64
Indirect Read Operation
If a read request is sent from the client to a volume that is on a different node, the
read request accesses the volume indirectly.
The read is processed by the node to which the volume is connected. The read is
redirected, through the cluster interconnect, to the node that owns the volume. As
with the direct read, the system that owns the volume checks system memory first. If
the data is in cache, the system serves the data to the client. Otherwise, the system
needs to retrieve the data from disk first.
65
Knowledge Check
▪ Match each term with the term’s function.
66
Knowledge Check
▪ When a client reads or writes to a volume that is on the node that the
client is connected to, access is said to be:
▪ Direct for both reads and writes
▪ Direct for reads, indirect for write
▪ Direct for writes, indirect for reads
▪ Indirect for reads and writes
When a client reads or writes to a volume that is on the node that the client is
connected to, access is said to be:
67
Knowledge Check
▪ When a client reads or writes to a volume that is on a node other
than the node that the client is connected to, access is said to be:
▪ Direct for both reads and writes
▪ Direct for reads, indirect for write
▪ Direct for writes, indirect for reads
▪ Indirect for reads and writes
When a client reads or writes to a volume that is on a node other than the node that
the client is connected to, access is said to be:
68
Resources
▪ Welcome to Data Fabric video:
http://www.netapp.com/us/campaigns/data-fabric/index.aspx
▪ NetApp product documentation:
http://mysupport.netapp.com/documentation/productsatoz/index.html
▪ Hardware Universe:
http://hwu.netapp.com
Resources
69
ONTAP Cluster Fundamentals:
Management
70
Course Modules
1. Clusters
2. Management
3. Networking
4. Storage Virtual Machines
5. Maintenance
The ONTAP Cluster Fundamentals course has been divided into five modules, each
module based on a specific topic. You can take the modules in any order. However,
NetApp recommends that you take Clusters first, Management second, Networking
third, Storage Virtual Machines fourth, and Maintenance fifth.
This module was written for cluster administrators and provides an introduction to the
concept of managing a cluster.
71
About This Module
This module focuses on enabling you to do the following:
▪ Define the role of a cluster administrator
▪ Manage a cluster
▪ List the cluster-configuration options
▪ Monitor a cluster
In this module, you learn about the role of a cluster administrator, the methods that
are used to manage a cluster, and the options for configuration. You also learn about
the ways to monitor a cluster.
72
Lesson 1
Cluster Administration
73
Administrators
▪ Tasks of cluster administrators:
▪ Administer the entire cluster
▪ Administer the cluster’s storage virtual
machines (SVMs)
▪ Can set up data SVMs and delegate SVM
administration to SVM administrators
Cluster administrators administer the entire cluster and the storage virtual machines,
or SVMs, that the cluster contains. Cluster administrators can also set up data SVMs
and delegate SVM administration to SVM administrators.
SVM administrators administer only their own data SVMs. SVM administrators can
configure certain storage and network resources, such as volumes, protocols,
services, and logical interfaces, or LIFs. What an SVM administrator is allowed to
configure is based on how the cluster administrator has configured the SVM
administrator’s user account.
74
Admin SVM
Admin SVM:
▪ Automatic creation during the cluster creation process
▪ Representation of the cluster
▪ Primary access point for administration of nodes, resources, and data SVMs
▪ Not a server of data
▪ A cluster must have at least one data SVM to serve data to its clients.
(Diagram: the admin SVM and its cluster management LIF)
The admin SVM is automatically created during cluster creation process. There is
only one admin SVM, which represents the cluster. Through the cluster management
LIF, you can manage any node, resource, or data SVM. Also, the cluster management LIF is configured to be able to fail over to any node in the cluster.
The admin SVM cannot serve data. A cluster must have at least one data SVM to
serve data to its clients. Unless otherwise specified, the term SVM typically refers to a
data-serving SVM, which applies to both SVMs with FlexVol volumes and SVMs with
Infinite Volume. Also, in the CLI, SVMs are displayed as Vservers.
75
Accessing the Cluster
The CLI:
▪ Console access through a node’s serial port
▪ Secure Shell (SSH) access through the cluster management LIF IP address
▪ Telnet or Remote Shell (RSH) access is disabled by default
OnCommand System Manager:
▪ Web service in ONTAP
▪ Accessible with a browser and the cluster management LIF IP address
You can enter commands in the CLI from a console. You use the serial port or Secure
Shell, or SSH, and the IP address of the cluster management LIF. If the cluster
management LIF is unavailable, one of the node management LIFs can be used.
SSH is enabled by default. SSH and the cluster management LIF are the
recommended access methods. Although Telnet and Remote Shell, or RSH, are
supported, they are not secure protocols and are therefore disabled by default. If
Telnet or RSH is required in your environment, see the steps to enable these
protocols in the System Administration Guide.
If you prefer to use a GUI instead, you can use OnCommand System Manager.
OnCommand System Manager is included with ONTAP as a web service and is
enabled by default. To use a web browser to access System Manager, point the
browser to the IP address of the cluster management LIF.
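For example (the IP address and cluster name are placeholders), an SSH session to the cluster management LIF drops you into the clustershell:
ssh admin@192.168.0.101
Password:
cluster1::> cluster show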
76
Node Root Aggregate and Volume
▪ Node root aggregate (aggr0):
▪ Requirement for every node in the cluster
▪ Contains only the node root volume; ONTAP prevents you from creating other volumes in the root aggregate.
(Diagram: an HA pair, Node 1 and Node 2, each with its own root aggregate)
A common question about clustering is, “How can several individual nodes appear as
one cluster?” The answer involves two parts. The first part of the answer involves
each node’s requirements for resources. The second part of the answer involves the
way that the cluster uses those resources. This slide discusses the node resources.
Every node in the cluster requires an aggregate that is dedicated to the node. This
aggregate is called the node root aggregate. By default, the aggregate is named
aggr0, but the name might include the node name also. The purpose of the node root
aggregate is to store the node root volume. ONTAP prevents you from creating other
volumes in the root aggregate.
By default, the node root volume is named vol0. The node root volume contains
special directories and files for the node. The special files include resources that the
node requires for proper operation, log files for troubleshooting, and cluster-wide
configuration database information. Because this volume is so critical to the node,
user data should never be stored in the node root volume.
77
Replicated Database
▪ Replicated database (RDB):
▪ Basis of clustering
▪ An instance on each node in the cluster
▪ In use by several processes
▪ Replication rings:
▪ Consistency
▪ Healthy cluster links among all nodes
(Diagram: four nodes in two HA pairs joined by the cluster interconnect, each node with its own vol0)
This slide explains the second part of the answer, how the cluster uses the dedicated
node resources.
Clustering is how nodes maintain a configuration with each other. The basis of
clustering is the replicated database, or RDB. Replication is communicated over the
dedicated cluster interconnect.
An instance of the RDB is maintained on each node in the cluster. Several processes
use the RDB to ensure consistent data across the cluster. The processes that the
RDB collects data for include the management, volume location, logical interface,
SAN, and configuration replication services.
Replication rings are sets of identical processes that run on all nodes in the cluster.
Replication rings are used to maintain consistency. Each process maintains its own
ring, which is replicated over the cluster interconnect. Replication requires healthy
cluster links among all nodes; otherwise, file services can become unavailable.
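As an illustration (the command typically requires the advanced privilege level, and the ring names shown depend on the ONTAP version), the health of the RDB replication rings can be checked from the clustershell. The output lists one ring per process (for example, mgmt and vldb) with its online state and master node:
cluster1::> set -privilege advanced
cluster1::*> cluster ring show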
78
Knowledge Check
1. The admin SVM is created to manage the cluster and serve
data to the cluster administrators.
a. True
b. False
The admin SVM is created to manage the cluster and serve data to the cluster
administrators.
79
Knowledge Check
2. Where is a cluster’s configuration information stored?
a. In the first node’s root volume
b. In every node’s root volume
c. In the first SVM’s root volume
d. In every SVM’s root volume
80
Lesson 2
Managing Clusters
81
Managing Resources in a Cluster
OnCommand System Manager:
▪ Visual representation of the available resources
▪ Wizard-based resource creation
▪ Best-practice configurations
▪ Limited advanced operations
The CLI:
▪ Manual or scripted commands
▪ Manual resource creation that might require many steps
▪ Ability to focus and switch between specific objects quickly
There are many tools that can be used to create and manage cluster resources, each
with their own advantages and disadvantages. This slide focuses on two tools.
The CLI can also be used to create and configure resources. Commands are entered
manually or through scripts. Instead of the wizards that are used in System Manager,
the CLI might require many manual commands to create and configure a resource.
Although manual commands give the administrator more control, manual commands
are also more prone to mistakes that can cause issues. One advantage of using the
CLI is that the administrator can quickly switch focus without having to move through
System Manager pages to find different objects.
82
Clustershell
The default CLI, or shell, in ONTAP is called the “clustershell.”
Clustershell features:
▪ Inline help
▪ Online manual pages
▪ Command history
▪ Ability to reissue a command
▪ Keyboard shortcuts
▪ Queries and UNIX-style patterns
▪ Wildcards
The cluster has different CLIs or shells that are used for different purposes. This
course focuses on the clustershell, which is the shell that starts automatically when
you log in to the cluster.
Clustershell features include inline help, an online manual, history and redo
commands, and keyboard shortcuts. The clustershell also supports queries and
UNIX-style patterns. Wildcards enable you to match multiple values in command-
parameter arguments.
83
Using the CLI
▪ Command structure:
▪ Cluster name at prompt
▪ Hierarchy of commands in command directories
▪ Choice of command path or directory structure
▪ Directory name at prompt
▪ Context-sensitive help

login as: admin
Using keyboard-interactive authentication.
Password:
cluster1::> cluster show
Node                  Health  Eligibility
--------------------- ------- ------------
cluster1-01           true    true
cluster1-02           true    true
2 entries were displayed.
cluster1::> cluster
cluster1::cluster> show
Node                  Health  Eligibility
--------------------- ------- ------------
cluster1-01           true    true
cluster1-02           true    true
2 entries were displayed.
cluster1::cluster> ?
  contact-info>       Manage contact information for the cluster.
  create              Create a cluster
  date>               Manage cluster's date and time setting
  ha>                 Manage high-availability configuration
  identity>           Manage the cluster's attributes, including name and serial number
  image>              Manage cluster images for automated nondisruptive update
  join                Join an existing cluster using the specified member's IP address or by cluster name
  log-forwarding>     Manage the cluster's log forwarding configuration
  peer>               Manage cluster peer relationships
  setup               Setup wizard
  show                Display cluster node members
  statistics>         Display cluster statistics
  time-service>       Manage cluster time services
cluster1::cluster> top
cluster1::>
The CLI provides a command-based mechanism that is similar to the UNIX tcsh shell.
You start at the prompt, which displays the cluster name. Commands in the CLI are
organized into a hierarchy by command directories. You can run commands in the
hierarchy either by entering the full command path or by navigating through the
directory structure. The directory name is included in the prompt text to indicate that
you are interacting with the appropriate command directory.
To display context-sensitive help, use the question mark. To return to the top of the
menu, use the top command.
84
Privilege Levels in the CLI
Admin:
▪ Most commands and parameters
▪ Default level
Advanced:
▪ Infrequently used commands and parameters
▪ Advanced knowledge requirements
▪ Possible problems from inappropriate use
▪ Advice of support personnel
CLI commands and parameters are defined at privilege levels. The privilege levels
reflect the skill levels that are required to perform the tasks.
Most commands and parameters are available at the admin level. The admin level is
the default level that is used for common tasks.
Commands and parameters at the advanced level are used infrequently. Advanced
commands and parameters require advanced knowledge and can cause problems if
used inappropriately. You should use advanced commands and parameters only with
the advice of support personnel.
To change privilege levels in the CLI, you use the set command. An asterisk appears
in the command prompt to signify that you are no longer at the admin level. Changes
to privilege level settings apply only to the session that you are in. The changes are
not persistent across sessions. After completing a task that requires the advanced
privilege, you should change back to admin privilege to avoid entering potentially
dangerous commands by mistake.
There is also a diagnostic privilege level, which is not listed on this slide. Diagnostic
commands and parameters are potentially disruptive to the storage system. Only
support personnel should use diagnostic commands to diagnose and fix problems.
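For example (the prompt behavior is standard, but the warning wording varies by version), switching to the advanced privilege level and back looks like this in the clustershell:
cluster1::> set -privilege advanced
Warning: These advanced commands are potentially dangerous; use them only when
         directed to do so by NetApp personnel.
Do you want to continue? {y|n}: y
cluster1::*> set -privilege admin
cluster1::>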
85
Navigating OnCommand System Manager
Main window for ONTAP 9.3 or greater
Your version of OnCommand System Manager might look a little different, depending
on the version of ONTAP software that runs on your cluster. The example that is
displayed here is from a cluster that runs ONTAP 9.3.
After you log in to System Manager, the main window opens. You can use the Guided
Problem Solving, Technical Support Chat, or Help menus at any time. Click the Setup
icon to manage users, roles, and other cluster settings.
The default view is of the cluster dashboard, which can display cluster details such as
alerts and notifications, health, and performance.
You use the navigation menu on the left side to manage the cluster. For example,
under Storage, you find SVMs and Volumes.
86
Navigating OnCommand System Manager
Main window before ONTAP 9.3
In ONTAP versions before ONTAP 9.3, the navigation menu is below the title bar.
After you log in to OnCommand System Manager, the main window opens. You can
use Help at any time. The default view is of the cluster dashboard, which is similar to
the dashboard for ONTAP 9.3, as previously shown.
87
OnCommand Management Portfolio
(Diagram: the OnCommand portfolio: System Manager, Cloud Manager, Unified Manager, Workflow Automation, API Services and Service Level Manager, and Insight, spanning small, midsize, and enterprise environments and private, public, and hybrid clouds)
Besides the CLI and OnCommand System Manager, there are other products in the
OnCommand management portfolio that you can use to manage storage resources in
a cluster.
88
Knowledge Check
1. What is another name for the default CLI in ONTAP?
a. Systemshell
b. Clustershell
c. Vservershell
d. Rootshell
89
Knowledge Check
2. Which LIF should be used to access OnCommand System
Manager?
a. cluster LIF
b. cluster management LIF
c. node management LIF
d. SVM management LIF
90
Lesson 3
Configuring Clusters
91
Configuring Clusters
The cluster might require some initial configuration, depending on the environment.
This lesson discusses access control, date and time, licenses, jobs and schedules,
and alerts.
92
Managing Cluster Access
You can control access to the cluster and enhance security by managing user
accounts, access methods, and access-control roles.
You can create, modify, lock, unlock, or delete a cluster user account or an
SVM user account. You can also reset a user's password or display
information for all user accounts.
You must specify the methods, by application, that enable a user account to access
the storage system. A user can be assigned one or more access methods.
Examples of the access methods include HTTP, ONTAPI (the ONTAP API), SSH, the console, and the Service Processor.
Role-based access control, or RBAC, limits users' administrative access to the level
that is granted for their role. RBAC enables you to manage users based on the role
that users are assigned to. ONTAP provides several predefined access-control roles.
You can also create additional access-control roles, modify them, delete them, or
specify account restrictions for users of a role.
93
Predefined Cluster Roles
admin, autosupport, backup, read-only, none
ONTAP provides several predefined roles for the cluster. The admin role is the cluster
superuser, which has access to all commands. The admin role can also create roles,
modify created roles, or delete created roles.
The remaining predefined cluster roles are used for applications, services, or auditing
purposes. The autosupport role includes a predefined AutoSupport account that is
used by AutoSupport OnDemand. Backup applications can use the backup role. The
read-only and none roles are used for auditing purposes.
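As an illustrative check (output varies by version, and the admin SVM name cluster1 is an example), the predefined cluster roles and their command restrictions can be listed from the clustershell:
cluster1::> security login role show -vserver cluster1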
94
Predefined SVM Roles
vsadmin, vsadmin-volume, vsadmin-protocol, vsadmin-backup, vsadmin-read-only
Each SVM can have its own user and administration authentication domain. After you
create the SVM and user accounts, you can delegate the administration of an SVM to
an SVM administrator. The predefined vsadmin role is the SVM superuser and is
assigned by default. The vsadmin typically manages its own user account’s local password and key information.
The remaining predefined SVM roles have progressively fewer capabilities. These
SVM roles can be used for applications, services, or auditing purposes.
95
User Accounts
You can manage users from the CLI or OnCommand System Manager. There are two
preconfigured users, admin and AutoSupport.
To add a user, click Add and enter the user name and password. You then add user
login methods. Click Add in the Add User dialog box and then select the application,
authentication method, and role. You can select predefined roles, or you can create
custom roles. Also, you need to repeat the user login methods process for each
application.
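For example (a sketch with a hypothetical user name; the parameter names shown are typical of ONTAP 9 but differ in older releases), a read-only SSH user can be created from the clustershell, repeating the command with a different -application value for each additional access method:
cluster1::> security login create -user-or-group-name monitor1 -application ssh -authentication-method password -role readonly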
96
Date and Time
Problems can occur when the cluster time is inaccurate. ONTAP software enables
you to manually set the time zone, date, and time on the cluster. However, you should
configure the Network Time Protocol, or NTP, servers to synchronize the cluster time.
To configure the date and time, click Edit, select the time zone from the menu, enter
the NTP address in the time server field, and click Add. Adding the NTP server
automatically configures all the nodes in the cluster, but each node needs to be
synchronized individually. It might take a few minutes for all the nodes in the cluster to
be synchronized.
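For reference, the equivalent CLI configuration might look similar to the following sketch (the time zone and NTP server name are examples only):
   cluster1::> cluster date modify -timezone America/New_York
   cluster1::> cluster time-service ntp server create -server ntp1.example.com
   cluster1::> cluster date show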
97
Licenses
▪ A license is a record of
software entitlements.
▪ Before ONTAP 9.3, each
cluster required
a cluster-based
license key.
▪ Certain features or
services might require
additional licenses.
▪ Feature licenses are
issued as packages.
To add a license package, click Add and then enter the license keys or license files.
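From the CLI, a license package can typically be added with a command similar to the following sketch (the license key shown is a placeholder, not a valid key):
   cluster1::> system license add -license-code AAAAAAAAAAAAAAAAAAAAAAAAAAAA
   cluster1::> system license show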
98
Schedules
Schedules for tasks:
▪ Basic schedules are
recurring.
▪ Interval schedules are run
at intervals.
▪ Advanced schedules are
run at a specific instance
(month, day, hour, and
minute).
Many tasks can be configured to run on specified schedules. For example, volume
Snapshot copies can be configured to run on specified schedules. These schedules
are similar to UNIX cron schedules.
You manage schedules from the protection menu in OnCommand System Manager.
In the Schedules pane, you can create schedules, edit schedules, or delete
schedules.
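As an illustrative sketch, a recurring cron-style schedule can also be created from the CLI with a command similar to the following (the schedule name and times are examples):
   cluster1::> job schedule cron create -name daily_8pm -hour 20 -minute 0
   cluster1::> job schedule show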
99
Jobs
▪ Are asynchronous
tasks
▪ Are managed by the
job manager
▪ Are typically long-
running operations
▪ Are placed in a job
queue
A job is any asynchronous task that the job manager manages. Jobs are typically
long-running volume operations such as copy, move, and mirror. Jobs are placed in a
job queue.
You can monitor the Current Jobs and view the Job History.
100
AutoSupport
▪ Is an integrated
monitoring and
reporting technology
▪ Checks the health of
NetApp systems
▪ Should be enabled on
each node of a cluster
101
Knowledge Check
1. Which name is the name of a predefined cluster role?
a. admin
b. vsadmin
c. svmadmin
d. root
102
Knowledge Check
2. Match the feature with one of the functions that the feature provides.
103
Lesson 4
Monitoring Clusters
104
Monitoring Clusters
Resources Performance
Alerting Reporting
Reasons to monitor your storage might include the provisioning and protection of
resources, alerting the administrator about an event, and gathering performance-
related information. You might also monitor storage for use reporting and trend
reporting.
This lesson focuses on monitoring resources. This lesson also introduces some of the
software in the OnCommand management portfolio for monitoring the other items.
105
Active IQ
▪ Dashboard
▪ Inventory of NetApp
systems
▪ Health summary and
trends
▪ Storage efficiency and risk
advisors
▪ Upgrade Advisor
▪ Active IQ mobile app
(iOS and Android)
You can access Active IQ from NetApp Support or through the Active IQ
mobile app.
106
Using Unified Manager to Monitor
Manage cluster resources at scale
107
OnCommand Portfolio
(Diagram: the OnCommand portfolio positioned by complexity of configuration)
▪ OnCommand Insight: performance, capacity, and configuration, with a strong ROI story. Target audience: large enterprises and service providers.
▪ Manage at scale, automate storage processes, and data protection. Target audience: midsize to large enterprise customers.
There are several management tools to choose from. Examine the use cases and
target audiences of these products.
108
Knowledge Check
1. Which OnCommand product can you use to monitor space
use in a heterogeneous environment?
a. System Manager
b. Unified Manager
c. Insight
d. Performance Manager
109
Resources
▪ NetApp product documentation:
http://mysupport.netapp.com/documentation/productsatoz/index.html
▪ Hardware Universe:
http://hwu.netapp.com
Resources
110
ONTAP Cluster Fundamentals:
Networking
111
1. Clusters
2. Management
3. Networking
4. Storage Virtual Machines
Course
5. Maintenance
Modules
The ONTAP Cluster Fundamentals course has been divided into five modules, each
module based on a specific topic. You can take the modules in any order. However,
NetApp recommends that you take Clusters first, Management second, Networking
third, Storage Virtual Machines fourth, and Maintenance fifth.
This module was written for cluster administrators and provides an introduction to the
concept of networking in a cluster.
112
This module focuses on enabling you to do the following:
▪ List the type of networks that are used by clusters
▪ Identify the types of network ports
▪ Describe IPspaces, broadcast domains, and subnets
About This
Module ▪ Describe network interfaces and their features
In this module, you learn about the networks, ports, IPspaces, broadcast domains,
subnets, and network interfaces that clusters use.
113
Lesson 1
Networks
Lesson 1, networks.
114
Networks: Management and Data
▪ Cluster interconnect:
▪ Connection of nodes
▪ Private network
▪ Management network:
▪ For cluster administration
▪ Management and data may be on a shared Ethernet network
▪ Data network:
▪ One or more networks that are used for data access from clients or hosts
▪ Ethernet, FC, or converged network
This module further examines the networking of a cluster. You can get started by
examining the different types of networks.
In multinode clusters, nodes need to communicate with each other over a cluster
interconnect. In a two-node cluster, the interconnect can be switchless. When more
than two nodes are added to a cluster, a private cluster interconnect using switches is
required.
For clients and hosts to access data, a data network is also required. The data network
can be composed of one or more networks that are primarily used for data access by
clients or hosts. Depending on the environment, there might be an Ethernet, FC, or
converged network. These networks can consist of one or more switches, or even
redundant networks.
115
Cluster Interconnect
FAS8060
In a two-node switchless cluster, ports are connected between nodes.
Onboard 10-GbE cluster interconnect ports: 4 x ports on a FAS8060
This example shows a FAS8060, which has two controllers installed in the chassis.
Each controller has a set of four onboard 10-GbE ports that are used to connect to the
cluster interconnect.
116
Cluster Interconnect
Private cluster interconnect with Inter-Switch Links (ISLs) between Cluster Switch A and Cluster Switch B
For more than two nodes, a private cluster interconnect is required. There must be
two dedicated switches, for redundancy and load balancing. Inter-Switch Links, or
ISLs, are required between the two switches. There should always be at least
two cluster connections, one to each switch, from each node. The connections
that are required vary, depending on the controller model and cluster size. The
connections might require all four ports.
For more information about the maximum number and models of controllers that are
supported, see the Hardware Universe at hwu.netapp.com. For more information
about the cluster interconnect and connections, see the Network Management
Guide. Links are provided in the course resources.
117
Management Network
(Diagram: Cluster Switch A and Cluster Switch B of the cluster interconnect, with their management ports connected to the management network)
You should also connect the management ports of the cluster switches to the
management network for configuration and management of the cluster switches.
118
Data Networks
▪ Ethernet network:
▪ Ethernet ports
▪ Support for NFS, CIFS, and iSCSI protocols
▪ FC network:
▪ FC ports
▪ Support for FC protocol
▪ Converged network:
▪ Unified Target Adapter (UTA) ports
▪ Support for NFS, CIFS, iSCSI, and FCoE protocols
Data Network
The data network might consist of one or more networks. The required networks
depend on which protocols the clients use.
An Ethernet network connects Ethernet ports, which support the NFS, CIFS, and
iSCSI protocols. An FC network connects FC ports, which support the FC protocol. A
converged network combines Ethernet and FC networks into one network. Converged
networks connections use Unified Target Adapter ports, or UTA ports, on the nodes to
enable support for NFS, CIFS, iSCSI, and FCoE protocols.
119
Knowledge Check
1. Which network type requires a private network?
a. Cluster interconnect
b. Management network
c. Data network
d. HA network
120
Knowledge Check
2. Which port speed is supported for a cluster interconnect?
a. 1 Gbps
b. 8 Gbps
c. 10 Gbps
d. 16 Gbps
121
Lesson 2
Network Ports
122
Network Ports and Interfaces
Virtual ports:
▪ Virtual LANs (VLANs): a0a-50, a0a-80
▪ Interface group: a0a
Physical ports: e2a, e3a
Nodes have various physical ports that are available for cluster traffic, management
traffic, and data traffic. These ports need to be configured appropriately for the
environment. In this example, Ethernet ports are shown; physical ports also include
FC ports and UTA ports.
Physical Ethernet ports can be used directly or combined by using interface groups.
Also, physical Ethernet ports and interface groups can be segmented by using virtual
LANs, or VLANs. Interface groups and VLANS are considered virtual ports but are
treated similar to physical ports.
Unless specified, the term “network port” includes physical ports, interface groups,
and VLANs.
123
Physical Ports
Controllers support a range of ports. Each model has several onboard ports. This
example shows a FAS8060 that contains two controllers in an HA pair configuration.
On the right, there are two Ethernet ports reserved for management purposes. To the
left of the management ports are four 1-GbE ports that can be used for data or
management. To the left of the 1-GbE ports are four UTA2 data ports, which can be
configured as either 10-GbE ports or 16-Gbps FC ports. And lastly, there are four 10-
GbE cluster interconnect ports.
Controllers might also have expansion slots to increase the number of ports by
adding network interface cards (NICs), FC host bus adapters (HBAs), or UTAs.
124
Physical Port Identification
▪ Ethernet port name: e<location><letter>
▪ Examples:
▪ e0i is the first onboard 1-GbE port on this controller.
▪ e2a would be the first port on the NIC in slot 2.
▪ FC port name: <location><letter>
▪ Examples:
▪ 0a is the first onboard FC port on a controller.
▪ 3a is the first port on the host bus adapter (HBA) in slot 3.
▪ UTA2 ports have an Ethernet name and an FC name: e<location><letter> and <location><letter>
▪ Examples:
▪ e0e/0e is the first onboard UTA2 port on this controller.
▪ e4a/4a is the first port on the UTA card in slot 4.
Port names consist of two or three characters that describe the port's type and
location.
Ethernet port names consist of three characters. The first character is a lowercase “e,”
to represent Ethernet. The second character represents the location; onboard ports
are labeled zero and expansion cards are labeled by slot number. The third character
represents the order of the ports. The slide shows some examples.
FC port names consist of only two characters. FC port names do not begin with the
lowercase “e,” but otherwise FC port names are named in the same manner as
Ethernet port names. The slide shows some examples. However, the controller model
pictured on the slide does not have any dedicated FC ports.
UTA2 ports are unique. Physically, a UTA2 port is a single port but the UTA2 port can
be configured as either a 10-GbE converged Ethernet port or as a 16-Gbps FC port.
Therefore, UTA2 ports are labeled with both the Ethernet name and the FC name.
The slide shows some examples.
125
Interface Groups
▪ Combine one or more Ethernet interfaces
▪ Interface group modes:
▪ Single-mode (active-standby)
▪ Static multimode (active-active)
▪ Dynamic multimode using Link Aggregation Control Protocol (LACP)
▪ Naming syntax: a<number><letter>, for example, a0a
NOTE: Vendors might use other terms for combining Ethernet interfaces.
(Diagram: a 10-GbE multimode ifgrp and a 1-GbE single-mode ifgrp with active and standby links)
Interface groups (ifgrps) combine one or more Ethernet interfaces, which can be
implemented in one of three ways.
In single-mode, one interface is active and the other interfaces are inactive until the
active link goes down. The standby paths are only used during a link failover.
In static multimode, all links are active. Therefore, static multimode provides link
failover and load balancing features. Static multimode complies with the IEEE
802.3ad (static) standard and works with any switch that supports the combining of
Ethernet interfaces. However, static multimode does not have control packet
exchange.
Dynamic multimode is similar to static multimode, except that it complies with the
IEEE 802.3ad (dynamic) standard. When switches that support Link Aggregation
Control Protocol, or LACP, are used, the switch can detect a loss of link status and
dynamically route data. NetApp recommends that when you are configuring interface
groups, you use dynamic multimode with LACP and compliant switches.
All modes support the same number of interfaces per groups, but the interfaces in the
group should always be the same speed and type. The naming syntax for interface
groups is the letter “a,” followed by a number, followed by a letter; for example, a0a.
Vendors might use terms such as link aggregation, port aggregation, trunking,
bundling, bonding, teaming, or EtherChannel.
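The following sketch shows how a dynamic multimode (LACP) interface group might be created from the CLI; the node and port names are placeholders, and matching switch-side configuration is also required:
   cluster1::> network port ifgrp create -node node1 -ifgrp a0a -distr-func ip -mode multimode_lacp
   cluster1::> network port ifgrp add-port -node node1 -ifgrp a0a -port e2a
   cluster1::> network port ifgrp add-port -node node1 -ifgrp a0a -port e3a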
126
VLANs
(Diagram: VLAN e0i-170 spanning Switch 1 and Switch 2, with a router and a management switch)
A physical Ethernet port or interface group can be subdivided into multiple VLANs.
VLANs provide logical segmentation of networks by creating separate broadcast
domains. VLANs can span multiple physical network segments, as shown in the
diagram. VLANs are used because they provide better network security and reduce
network congestion.
Each VLAN has a unique tag that is communicated in the header of every packet. The
switch must be configured to support VLANs and the tags that are in use. The VLAN's
ID is used in the name of the VLAN when it is created. For example, VLAN "e0i-170"
is a VLAN with tag 170, which is in the management VLAN, and it is configured on
physical port e0i.
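For example, the VLAN in this diagram might be created from the CLI with a command similar to the following (assuming the switch ports are already configured for tag 170):
   cluster1::> network port vlan create -node node1 -vlan-name e0i-170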
127
Network Ports
(Diagram: physical ports, some of which are combined into interface groups)
So you’re probably asking yourself, “What type of network port should I use?” The
answer depends on your environment.
Environments that use interface groups typically use VLANs also, for segmentation of
the network. This segmentation is common for service providers that have multiple
clients that require the bandwidth that interface groups provide and the security that
VLANs provide.
And lastly, it is not uncommon for different types of ports to be used in mixed
environments that have various workloads. For example, an environment might use
interface groups with VLANs that are dedicated to NAS protocols, a VLAN that is
dedicated to management traffic, and physical ports for FC traffic.
128
Knowledge Check
1. How would you describe port e3a/3a?
a. The first Ethernet port in expansion slot 3
b. The first UTA2 port in expansion slot 3
c. The third Ethernet port of expansion card A
d. The third UTA2 port in expansion slot 3
129
Lesson 3
IPspaces
Lesson 3, IPspaces.
131
IPspace Components
(Diagram: an IPspace contains a broadcast domain with a subnet and ports; a storage virtual machine (SVM) LIF resides on a port. Subnet IP addresses: 192.168.0.1 - 192.168.0.100; LIF address: 192.168.0.101)
ONTAP has a set of features that work together to enable multitenancy. Before
looking at the individual components in depth, consider how they interact with each
other.
When you create a logical interface, or LIF, on the SVM, the LIF represents a network
access point to the node. The IP address for the LIF can be assigned manually. If a
subnet is specified, the IP address is automatically assigned from the pool of
addresses in the subnet. This assignment works in much the same way that a
Dynamic Host Configuration Protocol (DHCP) server assigns IP addresses.
132
IPspaces
(Diagram: a storage service provider cluster with three IPspaces. Default IPspace: SVM_1 at 192.168.0.5; Company A IPspace: SVM_A1 at 10.1.2.5; Company B IPspace: SVM_B1 at 10.1.2.5. The "cluster" IPspace is not shown.)
The IPspace feature enables the configuration of one cluster so that clients can access the
cluster from more than one administratively separate network domain. Clients can access the
cluster even if those clients are using the same IP address subnet range. This feature enables
separation of client traffic for privacy and security.
An IPspace defines a distinct IP address space in which SVMs reside. Ports and IP addresses
that are defined for an IPspace are applicable only within that IPspace. A distinct routing table is
maintained for each SVM within an IPspace; therefore, no cross-SVM or cross-IPspace traffic
routing occurs.
During the cluster creation, a default IPspace was created. If you are managing storage for one
organization, then you do not need to configure additional IPspaces. If you are managing
storage for multiple organizations on one cluster and you are certain your customers do not
have conflicting networking configurations, you do not need to configure additional IPspaces.
The primary use case for this feature is the storage service provider that needs to connect
customers that are using overlapping IP addresses or ranges. In this example, both Company A
and Company B are using 10.1.2.5 as an IP address for their servers. The service provider
starts the configuration by creating two IPspaces, one for company A and the other for company
B. When the service provider creates SVMs for customer A, they are created in IPspace A.
Likewise, when the service provider creates SVMs for customer B, they are created in IPspace
B.
An IPspace that is named “cluster” that contains the cluster interconnect broadcast domain is
also created automatically during cluster initialization. The “cluster” IPspace is not shown on
this slide.
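As a representative sketch of this workflow, the service provider might create the IPspaces from the CLI roughly as follows (the IPspace names are placeholders; the SVMs are then created in their respective IPspaces):
   cluster1::> network ipspace create -ipspace IPspace_A
   cluster1::> network ipspace create -ipspace IPspace_B
   cluster1::> network ipspace show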
133
Broadcast Domains
Default
Broadcast Domain
Company A
Broadcast Domain
Company B
Broadcast Domain
A broadcast domain enables you to group network ports that belong to the same layer
2 network. Broadcast domains are commonly used when a system administrator
wants to reserve specific network ports for use by a certain client or group of clients.
Broadcast domains should include network ports from many nodes in the cluster to
provide high availability for the connections to SVMs. A network port can exist in only
one broadcast domain.
This example extends the IPspace example from the previous slide. The default
IPspace, which is automatically created with the cluster, contains the first network
ports from each node. The system administrator created two broadcast domains
specifically to support the customer IPspaces. The broadcast domain for Company
A’s IPspace contains only network ports from the first two nodes. The broadcast
domain for Company B’s IPspace contains one network port from each of the nodes
in the cluster.
A broadcast domain that is named “cluster” that contains the cluster interconnect
ports is also created automatically during cluster initialization. Also, although only
physical ports are used in the example, interface groups and VLANs are also
supported.
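A broadcast domain for one of the customer IPspaces might be created with a command similar to the following sketch (the IPspace, MTU, and port names are examples):
   cluster1::> network port broadcast-domain create -ipspace IPspace_A -broadcast-domain bd_A -mtu 1500 -ports node1:e0e,node2:e0e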
134
Subnets
▪ Default Broadcast Domain: subnet 192.168.0.1 to 192.168.0.100
▪ Company A Broadcast Domain: subnet 10.1.2.5 to 10.1.2.20
▪ Company B Broadcast Domain: subnet 10.1.2.5 to 10.1.2.100
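For illustration, a subnet such as Company A's might be defined with a command similar to the following sketch (the network and range values are examples):
   cluster1::> network subnet create -ipspace IPspace_A -broadcast-domain bd_A -subnet-name subnet_A -subnet 10.1.2.0/24 -ip-ranges 10.1.2.5-10.1.2.20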
135
Knowledge Check
1. What does a broadcast domain contain?
a) Physical ports only
b) Network ports (physical, interface group, or VLAN)
c) Logical interfaces (LIFs)
d) A pool of IP addresses
136
Lesson 4
Network Interfaces
137
Network Ports and Interfaces
Physical ports: e2a, e3a
138
Logical Interfaces
LIFs are managed by the cluster administrators, who can create, view, modify,
migrate, or delete LIFs. An SVM administrator can only view the LIFs associated with
the SVM.
The properties of LIFs include the SVM that the LIF is associated with, the role, the
protocols that the LIF supports, the home node, the home port, and the network address
information. Depending on the type of LIF, there might be an associated failover
policy and group, firewall policy, and load-balancing options.
139
LIF Roles
Cluster LIFs provide an interface to the cluster interconnect, which carries the “intracluster”
traffic between nodes in a cluster. Cluster LIFs are node scoped, meaning they can fail over to
other ports in the cluster broadcast domain but the ports must be on the same node. Cluster
LIFs cannot be migrated or failed over to a different node. Also, cluster LIFs must always be
created on 10-GbE network ports.
The cluster management LIF provides a single management interface for the entire cluster. The
cluster management LIF is cluster-wide, meaning the cluster management LIF can fail over to
any network port, on any node in the cluster, that is in the proper broadcast domain.
Data LIFs provide an interface for communication with clients and are associated with a specific
SVM. Multiple data LIFs from different SVMs can reside on a single network port, but a data LIF
can be associated with only one SVM. Data LIFs that are assigned NAS protocol access can
migrate or fail over throughout the cluster. Data LIFs that are assigned SAN protocol access do
not fail over, but can be moved offline to a different node in the cluster.
Intercluster LIFs provide an interface for cross-cluster communication, backup, and replication.
Intercluster LIFs are also node scoped and can only fail over or migrate to network ports on the
same node. When creating intercluster LIFs, you must create one on each node in the cluster.
Node management LIFs provide a dedicated interface for managing a particular node. Typically
cluster management LIFs are used to manage the cluster and any individual node. Therefore,
node management LIFs are typically only used for system maintenance when a node becomes
inaccessible from the cluster.
140
Data LIFs
▪ NAS data LIFs:
▪ Multiprotocol (NFS, CIFS, or both)
▪ Manually or automatically assigned IP addresses
▪ Failover or migration to any node in the cluster
▪ SAN data LIFs:
▪ Single-protocol (FC or iSCSI):
▪ FC LIF is assigned a WWPN when created.
▪ iSCSI LIF IP addresses can be manually or automatically assigned.
▪ No failover
▪ Restrictions on migration
Data LIFs that are assigned a NAS protocol follow slightly different rules than LIFs
that are assigned a SAN protocol.
Data LIFs that are assigned with NAS protocol access are often called NAS LIFs.
NAS LIFs are created so that clients can access data from a specific SVM. They are
multiprotocol and can be assigned NFS, CIFS, or both. When the LIF is created, you
can manually assign an IP address or specify a subnet so that the address is
automatically assigned. NAS LIFs can fail over or migrate to any node in the cluster.
Data LIFs that are assigned with SAN protocol access are often called SAN LIFs.
SAN LIFs are created so that a host can access LUNs from a specific SVM. SAN LIFs
are single-protocol and can be assigned either the FC or iSCSI protocol. When a LIF
is created that is assigned the FC protocol, a WWPN is automatically assigned. When
a LIF is created that is assigned the iSCSI protocol, you can either manually assign
an IP address or specify a subnet, and the address is automatically assigned.
Although SAN Data LIFs do not fail over, they can be migrated. However, there are
restrictions on migration.
For more information about migrating SAN LIFs, see the SAN Administration Guide.
141
LIF Movement
Migrate:
▪ The process of moving a LIF from one network port to another network port
▪ A nondisruptive operation (NDO) for:
▪ Maintenance
▪ Performance
Fail Over:
▪ The automatic migration of a LIF from one network port to another network port
▪ Link failures
▪ Component failure
▪ Nondisruptive upgrade (NDU)
Revert:
▪ Return of a failed-over or migrated LIF back to its home port
▪ Process:
▪ Manual
▪ Automatic, if configured to be automatic
Migration is the process of moving a LIF from one network port to another network
port. The destination depends on the role the LIF has been assigned or in the case of
data LIFs, the protocol. Migrating a LIF is considered a nondisruptive operation, or
NDO. Typically LIFs are migrated before maintenance is performed, for example to
replace a part. LIFs might also be migrated manually or automatically for performance
reasons, for example when a network port becomes congested with traffic.
You can revert a LIF to its home port after the LIF fails over or is migrated to a
different network port. You can revert a LIF manually or automatically. If the home
port of a particular LIF is unavailable, the LIF remains at its current port and is not
reverted.
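For illustration, migrating a NAS data LIF and later reverting it to its home port might look similar to the following sketch (the SVM, LIF, node, and port names are placeholders):
   cluster1::> network interface migrate -vserver svm1 -lif data_lif1 -destination-node node2 -destination-port e0e
   cluster1::> network interface revert -vserver svm1 -lif data_lif1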
142
LIF Failover
Failover Policies
Configuring LIF failover involves creating the failover group, modifying the LIF to use the
failover group, and specifying a failover policy.
A failover group contains a set of network ports from one or more nodes in a cluster. The
network ports that are present in the failover group define the failover targets that are available
for the LIF. Failover groups are broadcast domain–based and are automatically created when
you create a broadcast domain. The ”Cluster” failover group contains only cluster LIFs. The
”Default” failover group can have cluster management LIFs, node management LIFs,
intercluster LIFs, and NAS data LIFs assigned to it. User-defined failover groups can be created
when the automatic failover groups do not meet your requirements. For example, a user-
defined failover group can define only a subset of the network ports that are available in the
broadcast domain.
LIF failover policies are used to restrict the list of network ports within a failover group that are
available as failover targets for a LIF. Usually, you should accept the default policy when you
create a LIF. For example, the cluster management LIF can use any node in the cluster to
perform management tasks, so the cluster management LIF is created by default with the
“broadcast-domain-wide” failover policy.
The node management LIFs and cluster LIFs are set to the “local-only” failover policy because
failover ports must be on the same local node.
NAS data LIFs are set to be system defined. This setting enables you to keep two active data
connections from two unique nodes when performing software updates. This setting also
enables rolling upgrades to be performed.
SAN data LIFs are configured as disabled. This configuration cannot be changed, so SAN data
LIFs do not fail over.
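A user-defined failover group and a failover policy might be applied to a NAS data LIF roughly as follows (all object names are examples, and the automatically created groups and default policies are usually sufficient):
   cluster1::> network interface failover-groups create -vserver svm1 -failover-group fg_nas -targets node1:e0e,node2:e0e
   cluster1::> network interface modify -vserver svm1 -lif data_lif1 -failover-group fg_nas -failover-policy broadcast-domain-wide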
143
Knowledge Check
1. Which two items can a logical interface represent?
(Choose two.)
a) An IP address
b) A WWPN
c) A VLAN
d) An interface group
144
Knowledge Check
2. Match the LIF role with the default LIF failover policy.
145
Resources
▪ NetApp product documentation:
http://mysupport.netapp.com/documentation/productsatoz/index.html
▪ Hardware Universe:
http://hwu.netapp.com
Resources
146
ONTAP Cluster Fundamentals:
Storage Virtual Machines
147
1. Clusters
2. Management
3. Networking
4. Storage Virtual Machines
Course
5. Maintenance
Modules
The ONTAP Cluster Fundamentals course has been divided into five modules, each
module based on a specific topic. You can take the modules in any order. However,
NetApp recommends that you take Clusters first, Management second, Networking
third, Storage Virtual Machines fourth, and Maintenance fifth.
This module was written for cluster administrators and provides an introduction to the
concept of storage virtual machines.
148
This module focuses on enabling you to do the following:
▪ Describe the benefits, components, and features of storage
virtual machines (SVMs)
▪ Describe FlexVol volumes and efficiency features
About This ▪ Create and manage SVMs
Module
In this module, you learn about the benefits, components, and features of storage
virtual machines (SVMs). You learn about FlexVol volumes and efficiency features.
You also learn how to create and manage SVMs.
149
Lesson 1
Storage Virtual Machines
150
Data SVM
▪ Stored in data SVMs:
▪ Data volumes that serve client data
▪ Logical interfaces (LIFs) that serve client data
▪ Data SVM volume types:
▪ FlexVol volumes
▪ FlexGroup volumes
▪ Infinite volumes
A data SVM contains data volumes and logical interfaces, or LIFs, that serve data to
clients. Unless otherwise specified, the term SVM refers to data SVM. In the CLI,
SVMs are displayed as Vservers.
151
SVM Benefits
▪ Secure multitenancy:
▪ Partitioning of a storage system
▪ Isolation of data and management
▪ No data flow among SVMs in cluster
▪ Unified storage:
▪ SVMs with FlexVol volumes
▪ NAS protocols: CIFS and NFS
▪ SAN protocols: iSCSI and FC (FCoE included)
▪ Scalability:
▪ Adding and removing SVMs as needed
▪ Modifying SVMs for data throughput and storage requirements on demand
152
SVM Considerations
SVM creation tools:
▪ System Manager
▪ The CLI
SVM use cases:
▪ Configuring secure multitenancy
▪ Separating resources and workloads
You must set up at least one data access SVM per cluster, which involves planning
the setup, understanding requirements, and creating and configuring the SVM.
NetApp recommends using OnCommand System Manager to create an SVM.
The reasons for creating an SVM depend on the use case or workload requirements.
Usually, only a single SVM is needed. Sometimes, for example when the customer is
a service provider, SVMs can be created for each tenant. Other use cases include
separating different storage domains, meeting network requirements, configuring data
protection domains, or managing different workloads.
When creating more than one SVM, you cannot move resources such as volumes or
LIFs between different SVMs nondisruptively.
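If the CLI is used instead of System Manager, creating a data SVM might look similar to the following sketch (the SVM, root volume, aggregate, and IPspace names are placeholders, and exact parameters vary by ONTAP release):
   cluster1::> vserver create -vserver svm_a1 -rootvolume svm_a1_root -aggregate aggr1 -rootvolume-security-style unix -ipspace IPspace_A
   cluster1::> vserver show -vserver svm_a1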
153
SVM with FlexVol Volumes
▪ FlexVol volume:
▪ Representation of the file system in a NAS environment
▪ Container for LUNs in a SAN environment
▪ Qtree:
▪ Partitioning of FlexVol volumes into smaller segments
▪ Management of quotas, security style, and CIFS opportunistic lock (oplock) settings
An SVM can contain one or more FlexVol volumes. In a NAS environment, volumes
represent the file system where clients store data. In a SAN environment, a LUN is
created in the volumes for a host to access.
Qtrees can be created to partition a FlexVol volume into smaller segments, much like
directories. Qtrees can also be used to manage quotas, security styles, and CIFS
opportunistic lock settings, or oplock settings.
A LUN is a logical unit that represents a SCSI disk. In a SAN environment, the host
operating system controls the reads and writes for the file system.
154
SVM Root Volume
Cluster
When the SVM is created, an SVM root volume is also created, which serves as the
NAS clients’ entry point to the namespace provided by an SVM. NAS clients' data
access depends on the health of the root volume in the namespace. In contrast, SAN
clients' data access is independent of the root volume's health in the namespace.
You should not store user data in the root volume of an SVM.
155
Data LIFs
Data LIFs that are assigned a NAS protocol follow slightly different rules than LIFs
that are assigned a SAN protocol.
Data LIFs that are assigned with NAS protocol access are often called NAS LIFs.
NAS LIFs are created so that clients can access data from a specific SVM. They are
multiprotocol and can be assigned NFS, CIFS, or both. When the LIF is created, you
can manually assign an IP address or specify a subnet so that the address is
automatically assigned. NAS LIFs can fail over or migrate to any node in the cluster.
Data LIFs that are assigned with SAN protocol access are often called SAN LIFs.
SAN LIFs are created so that a host can access LUNs from a specific SVM. SAN LIFs
are single-protocol and can be assigned either the FC or iSCSI protocol. When a LIF
is created that is assigned the FC protocol, a WWPN is automatically assigned. When
a LIF is created that is assigned the iSCSI protocol, you can either manually assign
an IP address or specify a subnet, and the address is automatically assigned.
Although SAN Data LIFs do not fail over, they can be migrated. However, there are
restrictions on migration.
For more information about migrating SAN LIFs, see the SAN Administration Guide.
156
Administration
▪ Cluster administrator:
▪ Administer the entire cluster and the SVMs it contains
▪ Set up data SVMs and delegate SVM administration to SVM administrators
▪ Aggregates and network ports: Can perform all system administration tasks
▪ SVMs: Can create, view, modify, or delete
▪ Access-control: Can create, view, modify, or delete
▪ Volumes: Can create, view, modify, move, or delete
▪ LIFs: Can create, view, modify, migrate, or delete LIFs
▪ SVM administrator:
▪ Administer only their own data SVMs
▪ Set up storage and network resources, such as volumes, protocols, LIFs, and services
▪ Aggregates and network ports: Have a limited view
▪ SVMs: Are assigned to an SVM by the cluster administrator
▪ Access-control: Can manage their own user account local password and key information
▪ Volumes: Can create, view, modify, or delete
▪ LIFs: Can only view the LIFs associated with their assigned SVM
Note: SVM administrators cannot log in to System Manager.
Cluster administrators administer the entire cluster and the SVMs it contains. They can
also set up data SVMs and delegate SVM administration to SVM administrators. This
list is a list of common tasks, but the specific capabilities that cluster administrators
have depend on their access-control roles.
SVM administrators administer only their own data SVMs storage and network
resources, such as volumes, protocols, LIFs, and services. This list is a list of common
tasks, but the specific capabilities that SVM administrators have depend on the access-
control roles that are assigned by cluster administrators.
It should be noted, when the cluster administrator creates an SVM administrator, they
also need to create a management LIF for the SVM. The SVM administrator or
management software uses this LIF to log in to the SVM. For example, SnapDrive data
management software would use this LIF. SVM administrators cannot log in to System
Manager. SVM administrators are required to manage the SVM by using the CLI.
157
Knowledge Check
1. Match each term with the term’s function.
SVM’s root volume Serves as the NAS clients’ entry point to the namespace
158
Knowledge Check
2. Using the default configuration, which items can an SVM
administrator create?
a. Aggregate
b. SVM
c. Volume
d. LIF
159
Lesson 2
FlexVol Volumes
160
FlexVol Volumes
Write Anywhere File Layout (WAFL) file system:
▪ Organizes blocks of data on disk into files
▪ FlexVol volumes represent the file system
FlexVol Volume
Inode file
Inode Inode
A B C D E
The Write Anywhere File Layout, or WAFL, file system organizes blocks of data on
disks into files. The logical container, which is a FlexVol volume, represents the file
system.
The WAFL file system stores metadata in inodes. The term “inode” refers to index
nodes. Inodes are pointers to the blocks on disk that hold the actual data. Every file
has an inode, and each volume has a hidden inode file, which is a collection of the
inodes in the volume.
161
Volumes in Aggregates
▪ Aggregate:
▪ 4KB blocks
▪ WAFL reserves 10%
▪ Volume:
▪ Provisioning types:
▪ Thick: volume guarantee = volume
▪ Thin: volume guarantee = none
▪ Dynamic mapping to physical space
(Diagram: FlexVol 1, FlexVol 2, and FlexVol 3, each with an inode file, striped across RAID groups RG1 and RG2 in the aggregate)
The WAFL file system writes data in 4KB blocks that are contained in the aggregate.
When the aggregate is created, WAFL reserves 10 percent of capacity for overhead.
The remainder of the aggregate is available for volume creation.
A FlexVol volume is a collection of disk space that is provisioned from the available
space within an aggregate. FlexVol volumes are loosely tied to their aggregates.
FlexVol volumes are striped across all the disks of the aggregate, regardless of the
volume size. In this example, the blue block that is labeled “vol1” represents the inode
file for the volume, and the other blue blocks contain the user data.
When a volume is created, the volume guarantee setting must be configured. The
volume guarantee setting is the same as the space reservations. If space is reserved
for the volume, the volume is said to be thick-provisioned. If space is not reserved
during creation, the volume is said to be thin-provisioned. FlexVol volumes are
dynamically mapped to physical space. Whether the volume is thick-provisioned or
thin-provisioned, blocks are not consumed until data is written to the storage system.
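As an illustration of the volume guarantee setting, a thin-provisioned and a thick-provisioned volume might be created roughly as follows (the SVM, volume, aggregate, and size values are examples):
   cluster1::> volume create -vserver svm1 -volume vol_thin -aggregate aggr1 -size 10GB -space-guarantee none -junction-path /vol_thin
   cluster1::> volume create -vserver svm1 -volume vol_thick -aggregate aggr1 -size 10GB -space-guarantee volume -junction-path /vol_thick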
162
Volume Footprint
▪ User data is written to a volume.
▪ Metadata is internal tracking for the file system, inodes, and features.
▪ The Snapshot reserve is counted as used space even if there are no Snapshot copies in the reserve.
(Diagram: the volume footprint in the aggregate, comparing guarantee = volume with guarantee = none against the volume size)
A volume footprint is the amount of space that a volume is using in the aggregate. The
volume footprint consists of the space that is used by user data, snapshot copies, and
metadata. The metadata includes metadata that resides in the aggregate rather than in
the volume itself. For this reason, a volume might take up more space in the aggregate
than ONTAP advertises to the client.
When a volume is created, the client sees the total volume size, regardless of the
volume guarantee settings. For example, if you create a 10GB volume, the client sees
the full 10GB, regardless of whether the space is available.
If the volume guarantee is set to “volume,” the volume footprint inside the aggregate
includes the total reserved space. If another thick provisioned volume is created, the
volume could only be the size of the remaining aggregate free space.
With a guarantee of “none,” the volume size is not limited by the aggregate size. In fact,
each volume could, if necessary, be larger than the containing aggregate. The storage
that is provided by the aggregate is used only as data is written to the volume.
Thin provisioning enables you to overcommit the storage object that supplies its storage.
A storage object is said to be overcommitted if the objects it supplies storage to are
collectively larger than the amount of physical storage it can currently supply.
Overcommitting a storage object can increase your storage efficiency. However,
overcommitting also requires that you take an active role in monitoring your free space
to prevent writes from failing due to lack of space.
163
Snapshot Copy Technology
Create Snapshot copy 1
(Diagram: a file or LUN with blocks A, B, and C; Snapshot Copy 1 points to the same blocks)
Understanding the technology that is used to create a Snapshot copy helps you to
understand how space is used. This understanding also helps you understand features
such as FlexClone volumes, deduplication, and compression.
164
Snapshot Copy Technology
Continue writing data
(Diagram: the changed contents of block C are written to a new location as block D; Snapshot Copy 1 still points to blocks A, B, and C)
When ONTAP writes changes to disk, the changed version of block C gets written to
a new location. In this example, D is written to a new location. ONTAP changes the
pointers rather than moving data.
In this way, the file system avoids the parity update changes that are required if new
data is written to the original location. If the WAFL file system updated the same
block, the system would have to perform multiple parity reads to be able to update
both parity disks. The WAFL file system writes the changed block to a new location,
again writing in complete stripes and without moving or changing the original data
blocks.
165
Snapshot Copy Technology
Create Snapshot copy 2
(Diagram: Snapshot Copy 1 points to blocks A, B, and C; Snapshot Copy 2 points to the active file system blocks A, B, and D)
When ONTAP creates another Snapshot copy, the new Snapshot copy points only to
the active file system blocks A, B, and D. Block D is the new location for the changed
contents of block C. ONTAP does not move any data; the system keeps building on
the original active file system. Because the method is simple, the method is good for
disk use. Only new and updated blocks use additional block space.
166
Snapshot Copy Technology
Restore from a Snapshot copy
(Diagram: Snapshot Copy 1 and Snapshot Copy 2; the pointers from the good Snapshot copy are promoted to the active file system)
Assume that after the Snapshot copy was created, the file or LUN became corrupted,
which affected logical block D. If the block is physically bad, RAID can manage the
issue without recourse to the Snapshot copies. In this example, block D became
corrupted because part of the file was accidentally deleted and you want to restore
the file.
To easily restore data from a Snapshot copy, use the SnapRestore feature.
SnapRestore technology does not copy files; SnapRestore technology moves
pointers from files in the good Snapshot copy to the active file system. The pointers
from that Snapshot copy are promoted to become the active file system pointers.
When a Snapshot copy is restored, all Snapshot copies that were created after that
point in time are destroyed. The system tracks links to blocks on the WAFL system.
When no more links to a block exist, the block is available for overwrite and is
considered free space.
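For illustration, creating a Snapshot copy and later restoring a volume from it with SnapRestore might look similar to the following sketch (object names are placeholders, and a SnapRestore license may be required):
   cluster1::> volume snapshot create -vserver svm1 -volume vol1 -snapshot snap_daily_1
   cluster1::> volume snapshot restore -vserver svm1 -volume vol1 -snapshot snap_daily_1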
167
Volume Efficiency
Deduplication:
▪ Elimination of duplicate data blocks
▪ Inline or postprocess
▪ Inline deduplication for All Flash FAS and Flash Pool systems to reduce the number of writes to the solid-state drives (SSDs)
Data Compression:
▪ Compression of redundant data blocks
▪ Inline or postprocess
▪ Two compression methods:
▪ Secondary: 32KB compression groups
▪ Adaptive: 8KB compression groups, which improves read performance
Data Compaction:
▪ Store more data in less space
▪ Inline
▪ Enabled by default on All Flash FAS systems (optional on FAS systems)
ONTAP provides three features that can increase volume efficiency: deduplication, data
compression, and data compaction. You can use these features together or
independently on a FlexVol volume to reduce the amount of physical storage that a
volume requires.
To reduce the amount of physical storage that is required, deduplication eliminates the
duplicate data blocks, data compression compresses redundant data blocks, and data
compaction increases storage efficiency by storing more data in less space. Depending
on the version of ONTAP and the type of disks that are used for the aggregate,
deduplication and data compression can be run inline or postprocess. Data compaction
is inline only.
Inline deduplication can reduce writes to solid-state drives (SSDs), and is enabled by
default on all new volumes that are created on the All Flash FAS systems. Inline
deduplication can also be enabled on new and existing Flash Pool volumes.
Data compression combines multiple 4KB [kilobytes] WAFL blocks into compression
groups before the compression process starts. There are two data compression
methods that can be used. The secondary method uses 32KB [kilobytes] compression
groups. The adaptive method uses 8KB compression groups, which helps to improve
the read performance of the storage system.
Inline data compaction stores multiple user data blocks and files within a single 4KB
block on a system that is running ONTAP software. Inline data compaction is enabled
by default on All Flash FAS systems, and you can optionally enable it on volumes on
FAS systems.
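A hedged CLI sketch of enabling these efficiency features on a volume might look similar to the following (names are placeholders, and the available options depend on the platform and ONTAP release):
   cluster1::> volume efficiency on -vserver svm1 -volume vol1
   cluster1::> volume efficiency modify -vserver svm1 -volume vol1 -inline-dedupe true -compression true -inline-compression true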
168
Deduplication
▪ Deduplication:
▪ Elimination of duplicate data blocks to reduce the amount of physical storage
▪ Volume-level
▪ Postprocess example:
▪ File A is ~20KB, using five blocks
▪ File B is ~12KB, using three blocks
(Diagram: duplicate 4KB blocks in files A and B are freed in the aggregate after deduplication)
In file A, the first and fourth blocks contain duplicate data, so one of the blocks can be
eliminated. The second block in file B also contains the same duplicate data, which
can be eliminated. Deduplication eliminates duplicate blocks within the volume,
regardless of the file.
169
Aggregate-Level Inline Deduplication
▪ Aggregate-level inline deduplication:
▪ Performs cross-volume sharing for volumes belonging to the same aggregate
▪ Is enabled by default on all newly created volumes on All Flash FAS systems that run ONTAP 9.2 or greater
▪ A cross-volume shared block is owned by the FlexVol volume that first wrote the block.
Enhanced for ONTAP 9.3
Beginning with ONTAP 9.2, you can perform cross-volume sharing in volumes that
belong to the same aggregate using aggregate-level inline deduplication. Aggregate-
level inline deduplication is enabled by default on all newly created volumes on All
Flash FAS (AFF) systems running ONTAP 9.2 or greater. Cross-volume sharing is
not supported on Flash Pool and HDD systems.
170
Data Compression
▪ Compression:
▪ Compression of redundant data blocks to reduce the amount of physical storage
▪ Volume-level
▪ Example:
▪ File A is ~20KB, using five blocks
▪ File B is ~12KB, using three blocks
(Diagram: the 4KB blocks of files A and B are combined into compression groups and compressed in the aggregate)
This example starts exactly where the previous example started, except postprocess
data compression is enabled.
Data compression first combines several blocks into compression groups. In this
example, the 32KB compression group is made up of these eight 4KB [kilobytes]
blocks. The data compression algorithm identifies redundant patterns, which can be
compressed. The algorithm continues to find redundancies and compress them. After
everything has been compressed, all that remains on disk are the fully compressed
blocks.
171
Inline Data Compaction
▪ Stores multiple logical I/Os or files in a single physical 4KB block
▪ For small I/O or files, less than 4KB
▪ Increases efficiency of adaptive (8KB) compression
▪ Compresses 4KB I/Os
▪ Enabled by default on All Flash FAS systems
▪ Optional for FAS systems
Data compaction takes I/Os that normally consume a 4KB block on physical storage
and packs multiple such I/Os into one physical 4KB block.
This increases space savings for very small I/Os and files, less than 4KB, that have a
lot of free space.
To increase efficiency, data compaction is done after inline adaptive compression and
inline deduplication.
Compaction is enabled by default for All Flash FAS systems shipped with ONTAP 9.
Optionally, a policy can be configured for Flash Pool and HDD-only aggregates.
172
All Flash FAS Inline Storage Efficiency Workflow
1. Inline zero-block deduplication: Detects all-zero blocks. Updates only metadata, not user data.
2. Inline adaptive compression: Compresses 8KB blocks written to storage. Is aligned with the I/O size used with most databases.
3. Inline deduplication: Deduplicates incoming blocks against recently written blocks. Is used in conjunction with background (post-write) deduplication to achieve maximum space savings.
4. Inline data compaction: Combines two or more small logical blocks into a single 4KB physical block.
Data compaction is an inline operation that occurs after inline compression and inline
deduplication. On an All Flash FAS system, the order of execution follows the steps
shown here.
In the first step, inline zero-block deduplication detects all-zero blocks. No user data is
written to physical storage during this step. Only metadata and reference counts are
updated.
In the second step, inline adaptive compression compresses 8KB logical blocks into
4KB physical blocks. Inline adaptive compression is very efficient in determining
compressibility of the data and doesn't waste a lot of CPU cycles trying to compress
incompressible data.
In the last step, inline adaptive data compaction combines multiple logical blocks that
are less than 4KB into a single 4KB physical block to maximize savings. It also tries to
compress any 4KB logical blocks that are skipped by inline compression to gain
additional compression savings.
173
All Flash FAS Storage Efficiency Example
(Diagram: incoming I/Os from Vol A, Vol B, and Vol C)
▪ Without compression: 11 blocks
▪ After inline adaptive compression: 8 blocks
Without data compression or data compaction, the incoming I/Os would consume a
total of eleven 4KB blocks on physical storage. The 1KB I/Os from Vol C each require
a 4KB block because the minimum block size in WAFL is 4KB.
If inline adaptive compression is used, the 50% compressible 8KB I/O from Vol A is
compressed to 4KB. The two 80% compressible 8KB I/Os from Vol A and the three
1KB I/Os from Vol C also consume 4KB each on the physical storage because of the
WAFL 4K block size. The result totals eight 4KB blocks on physical storage.
If inline adaptive data compaction is used after the inline adaptive compression, the
two 80% compressible 8KB I/Os from Vol A are packed into a single 4KB block. The
two 55% compressible 4KB I/Os from Vol B are packed into another 4KB block. And
the three 1KB I/Os from Vol C are packed into another 4KB block. The result totals
four 4KB blocks on physical storage.
174
Moving Volumes
▪ Where and how volumes can be moved:
▪ To any aggregate in the cluster
▪ Only within the SVM
▪ Nondisruptively to the client
▪ Use cases:
▪ Capacity: Move a volume to an aggregate with more space
▪ Performance: Move a volume to an aggregate with different performance characteristics
▪ Servicing: Move volumes to newly added nodes or from nodes that are being retired
FlexVol volumes can be moved from one aggregate or node to another within the
same SVM. A volume move does not disrupt client access during the move.
You can move volumes for capacity use, for example when more space is needed.
You can move volumes to change performance characteristics, for example from a
controller with hard disks to one that uses SSDs. You can move volumes during
service periods, for example to a newly added controller or from a controller that is
being retired.
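A nondisruptive volume move might be started and monitored from the CLI roughly as follows (the SVM, volume, and aggregate names are examples):
   cluster1::> volume move start -vserver svm1 -volume vol1 -destination-aggregate aggr2
   cluster1::> volume move show -vserver svm1 -volume vol1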
175
Cloning Volumes
(Diagram: a FlexClone volume shares blocks A, B, and C with its parent volume; changed blocks B' and C' are written to new locations in the aggregate)
A read/write FlexClone volume can be split from the parent volume, for example to
move the clone to a different aggregate. Splitting a read/write FlexClone volume from
its parent requires the duplication of the shared blocks and removes any space
optimizations that are currently used by the FlexClone volume. After the split, both the
FlexClone volume and the parent volume require the full space allocation determined
by their volume guarantees. The FlexClone volume becomes a normal FlexVol
volume.
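For illustration, creating a FlexClone volume and later splitting it from its parent might look similar to the following sketch (a FlexClone license is assumed, and all names are placeholders):
   cluster1::> volume clone create -vserver svm1 -flexclone vol1_clone -parent-volume vol1
   cluster1::> volume clone split start -vserver svm1 -flexclone vol1_clone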
176
Knowledge Check
1. Which storage efficiency feature removes duplicate blocks?
a) Thin provisioning
b) Snapshot copy
c) Deduplication
d) Compression
177
Knowledge Check
2. Data can be written to a Snapshot copy.
a) True
b) False
178
Knowledge Check
3. Data can be written to a FlexClone volume.
a) True
b) False
179
Lesson 3
Creating and Managing SVMs
180
SVM Setup Workflow
Step 1: SVM basic details
▪ SVM details:
▪ SVM name
▪ IPspace
▪ Volume Type
▪ Data Protocols
▪ Default Language
▪ Root volume security style
▪ Root aggregate (root
volume location)
In the first step, you specify details about the SVM. Next you specify the Domain
Name Server, or DNS, configuration information.
The next steps depend on the protocols that you choose here. In this example, the
user has chosen CIFS, NFS and iSCSI, which require separate steps for NAS
protocols and SAN protocols.
181
SVM Setup Workflow
Step 2: Configure NAS protocols
Configure CIFS or
NFS protocols:
▪ Configuration of data LIFs
▪ CIFS server configuration
▪ Network Information Service
(NIS) server configuration
(optional, for NFS)
▪ Provisioning (optional):
▪ Volume for CIFS storage
▪ Volume for NFS storage
If you choose either CIFS or NFS, you configure those protocols in Step 2. First, you
specify information about the data LIFs. If you choose the CIFS protocol, you specify
the CIFS server information. If you choose the NFS protocol, you might want to
specify the Network Information Service (NIS) server information if applicable.
Optionally, you can also have the wizard provision storage. You can specify those
details before continuing.
182
SVM Setup Workflow
Step 3: Configure SAN protocols
If you also choose either iSCSI or FC, you configure those protocols in Step 3. In the
example, the user chose iSCSI. If you choose FC, the steps are similar.
First, you specify information about the data LIFs. Optionally, you can also have the
wizard provision storage. You can specify those details before continuing.
183
SVM Setup Workflow
Step 4: Configure SVM administration
SVM administrator
details (optional):
▪ User name and password
▪ Configuration of
management LIF for SVM
In the final step, you are asked to optionally create an SVM administrator for use by
host-side applications like SnapDrive software and SnapManager software. Data LIFs
that are assigned the CIFS or NFS protocols enable management access by default.
For environments where only iSCSI or FC protocols are chosen and host-side
applications like SnapDrive and SnapManager are used, a dedicated SVM
management LIF is required.
184
Editing an SVM
Cluster administration
After the SVM setup is complete, you can add or remove protocols, configure
resource allocation, or edit the name services properties.
By default, administrators can create a volume or move a volume within the SVM to
any aggregate in the cluster. To enable or prevent an SVM from using a particular
aggregate in the cluster, you edit the Resource Allocation properties. When the
“Delegate volume creation” option is selected, you can select aggregates to delegate
volume creation to those aggregates.
185
Volume Properties
Now that the SVM has been created, you can create, edit, resize, delete, clone, or
move volumes within the SVM. You can also configure efficiency features or
performance features, using storage quality of service, or QoS. Also, you can protect
volumes by using snapshot copies, mirrors, and vaults.
186
Configuring SVMs
In addition to volumes, you can allocate and configure other storage resources. You
can also create and apply policies and configure SVM data protection features. You
can also configure other settings such as protocols, security,
services, users, and groups.
For more information about configuring SVMs, see the Logical Storage
Management Guide.
187
Policy-Based Management
These examples are only two of the policies that you encounter in ONTAP. The
advantage of policy-based management is that when you create a policy, you can
apply the policy to any appropriate resource, either automatically or manually. Without
policy-based management, you would have to enter these settings for each individual
resource separately.
188
Knowledge Check
1. How can you change the configuration to prevent an SVM from
creating a volume on a particular aggregate?
a) Modify the aggregate settings
b) Modify the SVM settings
c) Modify the volume settings
d) Modify the user policy
189
Resources
▪ NetApp product documentation:
http://mysupport.netapp.com/documentation/productsatoz/index.html
▪ Hardware Universe:
http://hwu.netapp.com
Resources
190
ONTAP Cluster Fundamentals:
Maintenance
191
1. Clusters
2. Management
3. Networking
4. Storage Virtual Machines
Course
5. Maintenance
Modules
The ONTAP Cluster Fundamentals course has been divided into five modules, each
module based on a specific topic. You can take the modules in any order. However,
NetApp recommends that you take Clusters first, Management second, Networking
third, Storage Virtual Machines fourth, and Maintenance fifth.
This module was written for cluster administrators and provides an introduction to the
concept of servicing and maintaining clusters.
192
This module focuses on enabling you to do the following:
▪ Upgrade cluster hardware and software
▪ Describe the performance features and monitoring tools
▪ Describe the tools and features that are used to identify and
About This resolve cluster issues
Module
This module discusses how to maintain the health of a cluster. You learn about
hardware and software upgrades, performance maintenance, cluster issues, and the
tools that can be used to maintain clusters.
193
Lesson 1
Nondisruptive Upgrades
194
Nondisruptive Upgrades and Operations
HA pairs and the ONTAP architecture make many of these nondisruptive operations
possible.
195
Upgrade Advisor
Upgrade Advisor, which is part of NetApp Active IQ, simplifies the process of planning
ONTAP upgrades. NetApp strongly recommends that you generate an upgrade plan
from Upgrade Advisor before upgrading your cluster.
When you submit your system identification and target release to Upgrade Advisor,
the tool compares AutoSupport data about your cluster to known requirements and
limitations of the target release. Upgrade Advisor then generates an upgrade plan
(and optionally a back-out plan) with recommended preparation and execution
procedures.
196
Rolling Upgrade
To perform a software upgrade in a cluster that consists of two or more nodes:
1. The HA partner takes over control of the storage resources.
2. The node that is being upgraded is taken offline.
3. The node is upgraded after a reboot.
4. When the upgrade is complete, the partner node gives back control to the original node.
5. The process is repeated on the other node of the HA pair.
6. The process is repeated on additional HA pairs.
(Figure: an HA pair in which Node 1 is offline while Node 2 has taken over Node 1's storage resources, data aggregates, and volumes.)
Rolling upgrades can be performed on clusters of two or more nodes, but rolling
upgrades are run on one node of an HA pair at a time.
For a rolling upgrade, the partner node must first perform a storage takeover of the
node that is being upgraded. The node that is being upgraded is taken offline and
upgraded while its partner controls the storage resources. When the node upgrade is
complete, the partner node gives control back to the original owning node. The
process is repeated, this time on the partner node. Each additional HA pair is
upgraded in sequence until all HA pairs are running the target version.
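For reference, a minimal CLI sketch of the takeover and giveback steps that frame a
rolling upgrade (the node name node1 is hypothetical, and available options vary by
ONTAP version):
::> storage failover show                     (verify that the HA pair is healthy)
::> storage failover takeover -ofnode node1   (the partner takes over node1's storage)
    ... upgrade and reboot node1 ...
::> storage failover giveback -ofnode node1   (return the storage to node1)
::> cluster image show                        (confirm the ONTAP version on each node)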
197
Batch Upgrade
Batch upgrades can be performed on clusters of eight or more nodes. Unlike rolling
upgrades, batch upgrades can be run on more than one HA pair at a time.
To perform a batch upgrade, the cluster is separated into two batches, each of which
contains multiple HA pairs. In the first batch, one node in each HA pair is taken offline
and upgraded while the partner nodes take over the storage. When the upgrade is
completed for the first half of all the HA pairs, the partner nodes give control back to
the original owning nodes. Then the process is repeated, this time on the partner
nodes. The process then begins on the second batch.
198
Software Upgrade with System Manager
If you are upgrading ONTAP and you prefer a UI, you can use OnCommand
System Manager to perform an automated, nondisruptive upgrade. Alternatively, you
can use the CLI to perform upgrades.
199
Automated Upgrade
Stage 1 Stage 2 Stage 3
Select Validate Update
The automated upgrades that are performed by using System Manager consist of
three stages. The stages are select, validate, and update.
In the first stage, you select the ONTAP software image. The current version details
are displayed for each of the nodes or HA pairs. System Manager enables you to
select an already available software image for the update or to download a software
image from the NetApp Support site and add the image for the update.
In the second stage, you view and validate the cluster against the software image
version for the update. A pre-update validation checks whether the cluster is in a state
that is ready for an update. If the validation is completed with errors, a table displays
the status of the various components and the required corrective action for the errors.
You can perform the update only when the validation is completed successfully.
In the third and final stage, you update all the nodes in the cluster, or an HA pair in
the cluster, to the selected version of the software image. The default upgrade type
can be rolling or batch. The upgrade type that is performed depends on the number of
nodes in the cluster. While the update is in progress, you can choose to pause and
then either cancel or resume the update. If an error occurs, the update is paused and
an error message is displayed with the remedial steps. You can choose to either
resume the update after performing the remedial steps or cancel the update. You can
view the table with the node name, uptime, state, and ONTAP version when the
update is successfully completed.
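The same select, validate, and update stages are also available from the CLI. The
following is a hedged sketch in which the web server URL and the target version are
placeholders:
::> cluster image package get -url http://webserver.example.com/ontap_image.tgz   (select: add the software image)
::> cluster image validate -version 9.x        (validate the cluster against the image)
::> cluster image update -version 9.x          (update; rolling or batch is chosen based on cluster size)
::> cluster image show-update-progress         (monitor the update)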
200
Nondisruptive Hardware Maintenance
To perform hardware maintenance in a cluster that consists of two or more nodes:
1. The HA partner takes over control of the storage resources.
2. The node that is being serviced is taken offline and powered off.
3. After the node has been serviced, the node is powered on.
4. When the node is back online, the partner node gives back control to the original node.
(Figure: an HA pair with Node 1 offline and powered off while Node 2 serves both nodes' storage resources.)
For hardware maintenance, the partner node must first perform a storage takeover of
the node that will be serviced. The node can now be taken offline and powered off.
After the node has been serviced, the node is powered on. After the node has come
back online and is healthy, the partner node gives control back to the original owning
node. The process can be repeated, this time on the partner node, if necessary.
201
Nondisruptive Addition of Nodes to a Cluster
To add nodes to a healthy multinode cluster:
1. Verify that the nodes are configured as HA pairs and connected to the cluster interconnect.
2. Power on both nodes of the HA pair.
3. Start the Cluster Setup wizard on one of the nodes.
4. Use the join command and follow the wizard.
5. Repeat Steps 3 and 4 on the partner node.

::> cluster setup
You can enter the following commands at any time:
"help" or "?" - if you want to have a question clarified,
"back" - if you want to change previously answered questions, and
"exit" or "quit" - if you want to quit the cluster setup wizard.
Any changes you made before quitting will be saved.
You can return to cluster setup at any time by typing "cluster setup".
To accept a default or omit a question, do not enter a value.

Do you want to create a new cluster or join an existing cluster?
{create, join}: join
Nodes must be added from HA pairs that are connected to the cluster interconnect.
Nodes are joined to the cluster one at a time. Power on both nodes of the HA pair that
you want to add to the cluster. After the nodes boot, use a console connection to start
the Cluster Setup wizard on one of the nodes. Use the join command and follow the
wizard. After the node has been joined to the cluster, repeat the steps for the partner
node and any additional nodes that you want to add.
202
Cluster Expansion
ONTAP 9.2 or greater
Beginning with ONTAP 9.2, clusters can also be expanded nondisruptively using
System Manager. System Manager automatically detects any new compatible nodes,
whether the cluster configuration is switchless or switched.
203
Knowledge Check
1. Which two upgrade types can group HA pairs that are
upgraded together? (Choose two.)
a. Rolling upgrade
b. Batch upgrade
c. Automated upgrade
d. Hardware upgrade
Which two upgrade types can group HA pairs that are upgraded together?
204
Knowledge Check
2. What are the three phases of an automated upgrade?
(Choose three.)
a. Select
b. Validate
c. Failover
d. Update
205
Lesson 2
Cluster Performance
206
Performance Considerations
▪ Workloads
▪ I/O operation types:
▪ Random
▪ Sequential
The storage system sends and receives information in units that are called I/O
operations. I/O operations can be categorized as either random or sequential. Random
operations are usually small, lack any pattern, and happen quickly; database
operations are one example. In contrast, sequential operations are large, with
multiple parts that must be accessed in a particular order; video files are one example.
Some applications have more than one dataset. For example, a database
application’s data files and log files might have different requirements. Data
requirements might also change over time. For example, data might start with specific
requirements but as the data ages, those requirements might change.
Also, if more than one application is sharing the storage resources, each workload
might need to have quality of service, or QoS, restrictions imposed. The QoS
restrictions prevent applications or tenants from being either bullies or victims.
207
Analyzing I/O
IOPS
Applications with a random I/O profile, such as databases and email servers, usually
have requirements that are based on an IOPS value.
208
Analyzing I/O
Throughput
Applications with a sequential I/O profile, such as video or audio streaming, file
servers, and disk backup targets, usually have requirements that are based on an
MBps value.
209
Analyzing I/O
Latency
Latency is the measurement of how long a storage system takes to process an I/O
task. Smaller latency time values are better.
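A simple worked example (the numbers are hypothetical) shows how the metrics relate:
throughput is roughly IOPS multiplied by the I/O size, and latency describes how long
each individual operation takes.
10,000 IOPS x 4 KB per operation = approximately 40 MBps   (random, small-block workload)
   200 IOPS x 1 MB per operation = approximately 200 MBps  (sequential, large-block workload)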
210
ONTAP Performance
You must balance the need for performance and the need for resilience:
▪ More disks per RAID group increase performance.
▪ Fewer disks per RAID group increase resilience.
(Figure: Protect Data, Use Space Efficiently, Always follow best practices.)
When creating aggregates and the underlying RAID group, you must balance the
need for performance and the need for resilience. By adding more disks per RAID
group, you increase performance by spreading the workload across more disks, but at
the cost of resiliency. In contrast, adding fewer disks per RAID group increases the
resiliency because the parity has less data to protect, but at the cost of performance.
By following best practices when you add storage to an aggregate, you optimize
aggregate performance. Also, you should choose the right disk type for the workload
requirements.
211
Performance of Disk Types
(Figure: disk types arranged by performance and cost per GB. Use solid-state drives (SSDs) for ultra-performance at high IOPS and high cost per GB; use SAS for performance; flash acceleration improves the performance of capacity disks.)
The proper disk type depends on the performance or capacity requirements of the
workload.
When a workload requires the largest capacity at the lowest cost with lower
performance, SATA disks should be used.
When a workload requires the highest performance at the lowest cost with lower
capacity, solid-state drives (SSDs) should be used.
When a workload requires a balance of capacity and performance, SAS disks should
be used.
Sometimes, a workload might require large amounts of capacity at the lowest cost but
at a higher performance than SATA or SAS provides. To improve the performance of
high-capacity hard disks, Flash Cache or a Flash Pool can be used.
212
Virtual Storage Tier
Flash Cache:
▪ Controller-level cache
▪ Flash Cache modules in the expansion slots of a node
▪ Improved response time for repeated, random reads
▪ Simple use; no additional administration
▪ Cache for all volumes on the controller
Flash Pool:
▪ Storage-level cache
▪ Hybrid aggregates of hard disks and SSDs
▪ Improved response time for repeated, random reads and overwrites
▪ Consistent performance across storage failover events
▪ Cache for all volumes that are on the aggregate
The Virtual Storage Tier provides two flash acceleration methods to improve the
performance of FAS storage systems.
Flash Pool uses both hard disks and SSDs in a hybrid aggregate to provide storage-
level flash acceleration. Flash Pool is an ideal option for workloads that require
acceleration of repeated random reads and random overwrites, for example database
and transactional applications. Because Flash Pool is at the storage level, rather than
in the expansion slot of a controller, the cache remains available even during storage
failover or giveback. Like Flash Cache, the Flash Pool feature is simple to use,
because acceleration is automatically provided to volumes that are on the Flash Pool
aggregate.
213
SSDs in Flash Pool
(Figure: an SSD storage pool in which each SSD is divided into four partitions; a row of partitions forms an allocation unit.)
When adding SSDs to a Flash Pool aggregate, you add the SSDs to form a RAID
group dedicated to caching. Alternatively, you can use Flash Pool SSD partitioning,
also known as Advanced Drive Partitioning. Flash Pool SSD partitioning enables you
to group SSDs together into an SSD storage pool from which partitions are allocated
to multiple Flash Pool aggregates. This grouping spreads the cost of the parity SSDs
over more aggregates, increases SSD allocation flexibility, and maximizes SSD
performance. The storage pool is associated with an HA pair, and can be composed
of SSDs owned by either node in the HA pair.
When you add an SSD to a storage pool, the SSD becomes a shared SSD, and the
SSD is divided into four partitions. The SSD storage pool is made up of rows of these
partitions, which are called allocation units. Each allocation unit represents 25 percent
of the total storage capacity of the storage pool. Each allocation unit contains one
partition from each SSD in the storage pool. Allocation units are added to a Flash
Pool cache as a single RAID group. By default, for storage pools associated with an
HA pair, two allocation units are assigned to each of the HA partners. However, you
can reassign the allocation units to the other HA partner if necessary.
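As a hedged CLI sketch (the pool name sp1, the disk names, and the aggregate name
aggr1 are hypothetical, and options vary by ONTAP version), creating an SSD storage
pool and assigning one allocation unit to a Flash Pool aggregate might look like the
following:
::> storage pool create -storage-pool sp1 -disk-list 1.0.22,1.0.23,1.0.24,1.0.25   (group SSDs into a storage pool)
::> storage pool show                                                (view allocation units and ownership)
::> storage aggregate modify -aggregate aggr1 -hybrid-enabled true   (allow the aggregate to become a Flash Pool)
::> storage aggregate add-disks -aggregate aggr1 -storage-pool sp1 -allocation-units 1   (add one allocation unit as cache)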
214
Cluster Performance
Adding and relocating resources
Relocating resources nondisruptively:
▪ Moving an aggregate between the nodes of an HA pair
▪ Moving volumes, LUNs, and LIFs within an SVM
▪ Creating a FlexClone of a volume or LUN
(Figure: a four-node cluster with an HA pair of SATA storage and an HA pair of SAS storage; volumes can be moved between the tiers.)
We have been discussing performance at the node level. We also need to discuss
performance at the cluster level.
After some time, the administrator needs to add a volume for a database application.
The SATA disks do not meet the requirements for this new workload. The
administrator decides, for future growth, to nondisruptively add another HA pair with
SAS disks. With new nodes with SAS disks active in the cluster, the administrator can
nondisruptively move the volume to the faster disks.
The slide shows some other nondisruptive resource relocation actions that are
commonly performed in a cluster.
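For example, a hedged sketch of a nondisruptive volume move from the SATA tier to the
SAS tier (the SVM, volume, and aggregate names are hypothetical):
::> volume move start -vserver svm1 -volume db_vol -destination-aggregate aggr_sas_01
::> volume move show                                 (monitor the state and cutover of the move)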
215
Cluster Performance
All Flash FAS
All Flash FAS FlashEssentials features:
▪ Coalesced writes to free blocks
▪ A random read I/O processing path
▪ A highly parallelized processing architecture
▪ Built-in quality of service (QoS)
▪ Inline data reduction and compression
(Figure: the cluster now includes SATA, SAS, and SSD storage, with an All Flash FAS HA pair added as a high-performance tier.)
The administrator has a new workload that has high performance requirements. For
easier management of the various workload types, the administrator decides to create
in the cluster a new high-performance tier that uses All Flash FAS controllers.
NetApp FlashEssentials is the power behind the performance and efficiency of All
Flash FAS. All Flash FAS uses high-end or enterprise-level controllers with an all-
flash personality, which supports SSDs only. The slide shows some of the
FlashEssentials features. For more information about All Flash FAS and
FlashEssentials, see Using All Flash FAS with ONTAP on the NetApp Support site. A
link is provided in the module resources.
216
Storage QoS
Storage QoS can deliver consistent performance for mixed workloads and mixed tenants.
Monitor, isolate, and limit workloads of storage objects:
▪ Volume
▪ LUN
▪ File
▪ SVM
(Figure: two SVMs, SVM1 and SVM2, sharing the same cluster resources.)
The storage QoS feature can be configured to prevent user workloads or tenants from
affecting each other. The feature can be configured to isolate and throttle resource-
intensive workloads. The feature can also enable critical applications to achieve
consistent performance expectations. QoS policies are created to monitor, isolate,
and limit workloads of such storage objects as volumes, LUNs, files and SVMs.
Policies are throughput limits that can be defined in terms of IOPS or megabytes per
second.
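A hedged CLI sketch of a throughput limit (the policy group, SVM, and volume names
are hypothetical):
::> qos policy-group create -policy-group pg_gold -vserver svm1 -max-throughput 5000iops
::> volume modify -vserver svm1 -volume db_vol -qos-policy-group pg_gold
::> qos statistics performance show      (observe IOPS, throughput, and latency per policy group)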
217
Monitoring Cluster Performance
Using OnCommand System Manager
System Manager has built-in cluster performance monitoring from the main window.
The cluster performance charts enable you to view latency, IOPS, and throughput.
218
Monitoring Cluster Performance
Using OnCommand Unified Manager
The Performance Dashboard provides various performance metrics for each cluster
that Unified Manager is monitoring.
219
OnCommand Portfolio
(Figure: the OnCommand portfolio arranged by complexity of configuration. OnCommand Insight covers performance, capacity, and configuration with a strong ROI story; its target audience is large enterprises and service providers. OnCommand Unified Manager provides management at scale, automation of storage processes, and data protection; its target audience is midsize to large enterprise customers.)
220
Knowledge Check
1. Match each term with the term’s function.
221
Knowledge Check
2. When you create a Flash Pool, which two options are
supported? (Choose two.)
a. SATA disks with SSDs
b. SAS disks with SSDs
c. Array LUNs with SSDs on FAS only
d. Array LUNs with SSDs on All Flash FAS only
When you create a Flash Pool, which two options are supported?
222
Knowledge Check
3. When Flash Pool SSD partitioning is used, how many
partitions are created by default?
a. Two partitions; one per node
b. Three partitions; one per node plus a parity partition
c. Four partitions; two per node
d. Five partitions; two per node plus a parity partition
When Flash Pool SSD partitioning is used, how many partitions are created by
default?
223
Lesson 3
Identifying Issues
224
Common Issues
Understanding the topics and best practices covered in the ONTAP Cluster
Fundamentals course is essential to keeping a cluster healthy and working
continuously without disruptions. But components can fail, configurations change, and
performance can suffer due to over-utilization or configuration issues.
225
Active IQ
▪ Dashboard
▪ Inventory of NetApp
systems
▪ Health summary and
trends
▪ Storage efficiency and
risk advisors
Active IQ provides predictive analytics and proactive support for your hybrid cloud.
Along with an inventory of NetApp systems, you are provided with a predictive health
summary, trends, and a system risk profile.
You can access Active IQ from NetApp Support or through the Active IQ mobile app.
226
Alerts
Tools to monitor system:
▪ System Manager
▪ Unified Manager
▪ Event management
system (EMS)
▪ AutoSupport
!!
In the example, there is an alert from System Manager that needs to be diagnosed.
When there is an alert or event, first try the solution that the monitoring software
suggests.
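For example, a hedged CLI sketch for reviewing recent events and health alerts
(available parameters and output vary by ONTAP version):
::> event log show -severity ERROR     (recent EMS events at the ERROR severity)
::> system health alert show           (outstanding health alerts and suggested actions)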
227
Component Failure
LEDs to observe:
▪ Controllers
▪ Drives
▪ Switches
▪ Ports
Items to inspect:
▪ Cables
▪ Connections
▪ Power
Common cluster CLI commands:
▪ cluster show
▪ system node show
(Figure: a controller's attention LED.)
There are a few basic actions that you can take to assess the situation. The actions
are not listed in any particular order on the slide.
Observe the LEDs on the controllers, drives, switches, and ports.
Inspect the cables, connections, and power.
Analyze the cluster, nodes, and resources by using common CLI commands such as
cluster show and node show.
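A minimal sketch of those checks, with storage failover status added as a related
check you might also run:
::> cluster show              (node health and eligibility)
::> system node show          (node model, uptime, and health)
::> storage failover show     (HA takeover and giveback status)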
228
Disk Failures
▪ ONTAP continually monitors disks.
▪ Place a suspect disk in prefail mode.
ONTAP continually monitors disks to assess their performance and health. This
monitoring is often called “predictive failure” in the storage industry.
When ONTAP encounters certain errors or behaviors from a disk, ONTAP takes the
disk offline temporarily or takes the disk out of service to run further tests. While the disk
is offline, ONTAP reads from other disks in the RAID group while writes are logged.
When the offline disk is ready to come back online, ONTAP resynchronizes the RAID
group and brings the disk online. This process generally takes a few minutes and incurs
a negligible performance effect.
Disks can sometimes display small problems that do not interfere with normal operation,
but the problems can be a sign that the disk might fail soon. The maintenance center
provides a way to put these disks under increased scrutiny. When a suspect disk is in
the maintenance center, the disk is subjected to several tests. If the disk passes all of
the tests, ONTAP redesignates the disk as a spare; if the disk fails any tests, ONTAP
fails the disk. By default, ONTAP puts a suspect disk into the maintenance center
automatically only if there are two or more spares available for that disk.
When ONTAP determines that a disk has exceeded error thresholds, ONTAP can
perform rapid RAID recovery. ONTAP removes the disk from its RAID group for testing
and, if necessary, fails the disk. Spotting disk errors quickly helps prevent multiple disk
failures and enables problem disks to be replaced. By performing the rapid RAID
recovery process on a suspect disk, ONTAP avoids long rebuilding time, performance
degradation, and potential data loss due to additional disk failure during reconstruction.
229
Disk Failures
Spare disk selection
(Figure: a larger replacement disk is downsized to match the failed disk; its unused capacity is not available.)
ONTAP always tries to choose a hot spare that exactly matches the failed or failing
disk. If an exact match is not available, ONTAP uses the best available spare, or
ONTAP puts the RAID group into a degraded mode. Understanding how ONTAP
chooses an appropriate spare when there is no matching spare enables you to
optimize the spare allocation for your environment.
First, if the available hot spares are not the correct size, ONTAP uses the hot spare
that is the next larger size, if there is one. The replacement disk is downsized to
match the size of the disk that it is replacing; the extra capacity is not available.
Next, if the available hot spares are not the correct speed, ONTAP uses a hot spare
that is a different speed. Using disks with different speeds in the same aggregate is
not optimal. Replacing a disk with a slower disk can cause performance degradation,
and replacing a disk with a faster disk is not cost-effective.
Finally, if no spare exists with an equivalent disk type or checksum type, the RAID
group that contains the failed disk enters degraded mode. ONTAP does not combine
effective disk types or checksum types within a RAID group.
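To review the spare and failed disks in a cluster, a hedged CLI sketch (available
parameters vary by ONTAP version):
::> storage aggregate show-spare-disks   (available spare disks per node)
::> storage disk show -broken            (failed disks that are awaiting replacement)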
230
Configuration
Config Advisor
▪ ONTAP features:
▪ Validation of shelf cabling
▪ Validation of ONTAP and switches setup
▪ Firmware revision checks
▪ Support for MetroCluster, FlexPod, and
7-Mode Transition Tool (7MTT) transitions
▪ Config Advisor AutoSupport
Config Advisor contains more than 300 configuration checks that can be used to
validate setup or operational configuration. Config Advisor contains checks for
cabling, shelf setup, and the latest firmware validation. Config Advisor also contains
several checks to validate network switches and the setup of ONTAP.
Config Advisor has three major components that collect data, analyze data, and
present the findings. For consistency in the display of alerts, the results are shown in
a table format similar to My AutoSupport. There is also a visual depiction of the shelf
and storage layout to better emphasize connectivity issues.
231
Performance
Ways to minimize performance issues:
▪ Correctly size and follow best practices for the specific workload.
▪ Verify the supported minimums and maximums.
▪ Adhere to the ONTAP storage platform mixing rules.
▪ Check compatibility of components, host OS, applications, and ONTAP version.
Potential performance issues:
▪ Controller: Resource over-utilization, ONTAP version, offline, or rebooting
▪ Storage: Disk types, aggregate configuration, volume movement, and free space
▪ Networking: Configuration, LIF location, port saturation, port speeds, or indirect access
▪ Host or clients: Application, drivers, network adapter, or user knowledge
As the saying goes, prevention is the best medicine. Start with a properly sized
system and follow best practices for ONTAP, the host operating system, and the
application. Verify that the supported minimums, maximums, and mixing rules are
adhered to. Always use the NetApp Interoperability Matrix Tool (or IMT) to check
compatibility of components, host OS, applications, and ONTAP.
Things can change over time, and issues can arise. Performance issues can occur for
many different reasons, and analysis can be complex. Performance analysis is beyond
the scope of a fundamentals course, but some components that might be related to
performance issues are listed here.
232
Storage Utilization
Ways to minimize use issues:
▪ Use the appropriate volume and LUN
settings for the workload requirements.
▪ Monitor free space to prevent offline
volumes and LUNs.
▪ Monitor the number of Snapshot copies.
▪ Select the appropriate efficiency settings.
When you provision storage, use the appropriate volume and LUN settings for the
workload requirements. There are best practices guides for ONTAP, host operating
systems, and applications.
When a resource such as a volume or a LUN runs out of space, ONTAP protects the
currently stored data by taking the resource offline. To prevent resources from going
offline, you should monitor the free space in aggregates, volumes, and LUNs. You
also need to monitor the number of Snapshot copies and their retention period
because they share space with user data in the volume.
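A hedged CLI sketch for monitoring free space and Snapshot copies (the SVM and volume
names are hypothetical):
::> storage aggregate show-space                            (space usage in each aggregate)
::> volume show -vserver svm1 -fields size,available,used   (free space in each volume)
::> volume snapshot show -vserver svm1 -volume vol1         (Snapshot copies and the space that they consume)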
233
NetApp Support
▪ NetApp Support:
mysupport.netapp.com
▪ Hardware Universe:
hwu.netapp.com
▪ NetApp Interoperability
Matrix Tool (IMT):
mysupport.netapp.com/
matrix
234
Knowledge Check
1. A disk has experienced errors. What does ONTAP do if at least two
matching spares are available?
a. Immediately halts I/O and takes the disk offline.
b. Immediately halts I/O and rebuilds the disk to a spare.
c. Places the disk in the maintenance center and assesses the disk.
d. Enters degraded mode for 24 hours while the disk is being repaired.
A disk has experienced errors. What does ONTAP do if at least two matching spares
are available?
235
Knowledge Check
2. You require more UTA ports on a controller. Where do you find the
correct UTA expansion card?
a. MyAutoSupport
b. NetApp Interoperability Matrix Tool (IMT)
c. Hardware Universe
d. The expansion card vendor’s website
You require more UTA ports on a controller. Where do you find the correct UTA
expansion card?
236
Knowledge Check
3. You require more CNA ports on your host. Where do you find a
supported CNA card?
a. MyAutoSupport
b. NetApp Interoperability Matrix Tool (IMT)
c. Hardware Universe
d. The expansion card vendor’s website
You require more CNA ports on your host. Where do you find a supported CNA card?
237
Resources
▪ NetApp product documentation:
http://mysupport.netapp.com/documentation/productsatoz/index.html
▪ Hardware Universe:
http://hwu.netapp.com
Resources
238
Thank You!
Thank you.
239