Implementing Nutanix Storage Services
Understanding Nutanix Core Datapath Architecture
Jaya Bodkhey
Information Security & Automation Engineer
@jayabodkhey
Course Outline
Nutanix core datapath architecture
Nutanix Volumes services
Nutanix Files services
Nutanix Objects services
Storage Provisioning
Assigning storage capacity to computing devices
Manual & automated
Implemented in a computing environment
Scalable
Failure handling
Evolution of Hyperconverged Infrastructure
1990s – SAN storage based datacenter infrastructure
2000s – Storage virtualization
2009 – Hyperconverged infrastructure
SAN Based Datacenter Architecture
Clients → Client Access LAN → Application Server → Storage Area Network (Fibre Channel / iSCSI protocol) → Storage Devices
Storage Virtualization
Clients see logical drives; a virtualization layer maps each logical drive to a volume carved from a storage pool built on the physical storage devices.
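A minimal Python sketch of this layering (not Nutanix code, with made-up names and sizes): physical disks are aggregated into a storage pool, and logical volumes are carved from the pool and presented to clients as logical drives.

```python
from dataclasses import dataclass

@dataclass
class PhysicalDisk:
    name: str
    capacity_gb: int

@dataclass
class Volume:
    # A logical volume, presented upward as a logical drive.
    name: str
    size_gb: int

@dataclass
class StoragePool:
    # Aggregates the capacity of many physical disks.
    disks: list
    allocated_gb: int = 0

    def capacity_gb(self) -> int:
        return sum(d.capacity_gb for d in self.disks)

    def carve_volume(self, name: str, size_gb: int) -> Volume:
        # Check that the pool still has room before handing out a volume.
        if self.allocated_gb + size_gb > self.capacity_gb():
            raise ValueError("storage pool exhausted")
        self.allocated_gb += size_gb
        return Volume(name, size_gb)

pool = StoragePool(disks=[PhysicalDisk("d0", 960), PhysicalDisk("d1", 960)])
vol1 = pool.carve_volume("Vol 1", 500)   # logical drive for one client
vol2 = pool.carve_volume("Vol 2", 800)   # logical drive for another client
print(pool.capacity_gb(), pool.allocated_gb)  # 1920 1300
```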
Nutanix Hyperconverged Architecture
Virtualization
Lower cost
Software Driven
Performance
Flexibility
Hard Disk Drive
Electro-mechanical, magnetic disk storage
2.5” and 3.5” form factors
SAS and SATA interfaces
Pros
- Cost effective, higher storage capacity, longer lifespan
Cons
- Slower, larger physical size, higher power consumption, noisy, prone to mechanical failure
Solid State Drive
Based on flash memory
NAND, NOR (SLC, MLC, TLC, QLC)
2.5” and M.2 form factors
SAS, SATA, NVMe interfaces
Pros
- Higher speed, no moving/mechanical parts, power efficient
Cons
- Costlier, sensitive to temperature conditions, faster wear (limited write endurance)
Types of Data Storage
Block storage, file storage, and object storage
Storage Topologies
Direct Attached Storage – SCSI, SAS, NVMe
Network Attached Storage – NFS, CIFS
Storage Area Networks – iSCSI, FC
Hypervisor
Virtual machine monitor (VMM)
Software to create and run virtual machines
Multiple VMs simultaneously
Abstracts a computer's software from its hardware
Type 1 and Type 2 hypervisors
Provides speed, efficiency, flexibility, and portability
Nutanix Distributed Storage Fabric (DSF)
Appears to the hypervisor as a centralized storage array
I/O is handled locally for better performance
Acropolis Distributed Storage Fabric (ADSF)
Stores user data across different storage tiers on different nodes
Supports snapshots, clones, deduplication, compression, and erasure coding
DSF High-level Filesystem
vDisk – a file in DSF; the hypervisor sees it as a disk/file
Container – a group of VMs/files; the hypervisor sees it as a datastore
Storage Pool – a group of physical storage devices; N/A – transparent to the hypervisor
Storage Devices – the underlying physical drives
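The mapping above can be sketched in Python to show how the constructs nest. The class and attribute names below are illustrative only, not a Nutanix API.

```python
from dataclasses import dataclass, field

@dataclass
class VDisk:
    # A vDisk is a file in DSF; the hypervisor sees it as a disk/file.
    name: str
    size_gb: int

@dataclass
class Container:
    # Holds the vDisks of a group of VMs/files; exposed to the hypervisor as a datastore.
    name: str
    vdisks: list = field(default_factory=list)

@dataclass
class StoragePool:
    # Groups physical storage devices; transparent to the hypervisor.
    devices: list
    containers: list = field(default_factory=list)

pool = StoragePool(devices=["ssd0", "ssd1", "hdd0", "hdd1"])
ctr = Container(name="Container-1")
ctr.vdisks.append(VDisk(name="vm1-disk0", size_gb=100))
pool.containers.append(ctr)
print(pool)
```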
DSF Low-level Filesystem
Blocks – typically 4–8 KB; guest filesystem (logical)
Slices – typically 4–8 KB; NDFS (logical)
Extents – 1 MB; NDFS (logical)
Extent groups – 1 or 4 MB; stored as a file on disk
Storage devices – N GB or TB (physical)
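A worked example of the size arithmetic, assuming 4 KB guest blocks, 1 MB extents, and 4 MB extent groups; the real metadata lookup is far richer than this.

```python
BLOCK_SIZE = 4 * 1024            # guest filesystem block, typically 4-8 KB
EXTENT_SIZE = 1 * 1024**2        # 1 MB logical extent
EXTENT_GROUP_SIZE = 4 * 1024**2  # extent groups are 1 or 4 MB files on disk

def locate(block_number: int) -> tuple[int, int, int]:
    """Map a guest block number to (extent, offset-in-extent, extent-group).

    Purely illustrative arithmetic, not the actual DSF metadata lookup.
    """
    byte_offset = block_number * BLOCK_SIZE
    extent_index = byte_offset // EXTENT_SIZE
    offset_in_extent = byte_offset % EXTENT_SIZE
    extent_group_index = byte_offset // EXTENT_GROUP_SIZE
    return extent_index, offset_in_extent, extent_group_index

print(locate(0))     # (0, 0, 0)
print(locate(300))   # (1, 180224, 0): second extent, still the first extent group
```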
DSF Features
Snapshot, clone, VM disk deduplication, compression, erasure coding, and backup
Intelligent Data Placement Algorithm
Used to tier data across different classes of storage devices
Most frequently used data is placed in memory or on flash
Data is placed on the node local to the VM
Extents are stored close to the node running the user VM (see the sketch below)
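A toy placement heuristic that captures the two ideas on this slide: hot data goes to the flash tier, and the node running the VM is preferred for locality. Node names and capacities are invented for illustration; this is not the actual placement algorithm.

```python
# Hypothetical free-capacity map: node -> {tier: free GB}.
cluster = {
    "node-a": {"ssd": 200, "hdd": 4000},
    "node-b": {"ssd": 500, "hdd": 6000},
}

def place_extent(vm_node: str, hot: bool, size_gb: int = 1) -> tuple[str, str]:
    """Pick a (node, tier) for an extent: local node first, flash for hot data."""
    tier = "ssd" if hot else "hdd"
    # Prefer the node running the user VM (data locality) ...
    candidates = [vm_node] + [n for n in cluster if n != vm_node]
    for node in candidates:
        if cluster[node][tier] >= size_gb:
            cluster[node][tier] -= size_gb
            return node, tier
    # ... and fall back to the other tier if the preferred one is full everywhere.
    other = "hdd" if tier == "ssd" else "ssd"
    for node in candidates:
        if cluster[node][other] >= size_gb:
            cluster[node][other] -= size_gb
            return node, other
    raise RuntimeError("cluster out of space")

print(place_extent("node-a", hot=True))    # ('node-a', 'ssd')
print(place_extent("node-a", hot=False))   # ('node-a', 'hdd')
```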
Types of Failures
Disk failure – disk removed, unresponsive, failed, or showing I/O errors
Controller VM failure – a CVM power action causing the CVM to be unavailable
Node failure – hardware or software failure within the node
Disk Failure Handling
Stargate marks the disk offline on identifying errors
S.M.A.R.T. data and a drive self test confirm the disk state
A Curator scan checks metadata to restore the data through re-replication
All nodes, CVMs, and disks take part in re-replication (see the sketch below)
VMs see no impact
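A simplified model of the flow described above, using an invented metadata map: the failed disk is marked offline, a metadata scan finds under-replicated extent groups, and copies are restored on healthy disks so the rebuild work is shared across the cluster.

```python
RF = 2  # replication factor: two copies of every extent group

# Hypothetical metadata map: extent group -> disks holding its replicas.
metadata = {
    "eg-1": {"disk-a1", "disk-b2"},
    "eg-2": {"disk-a1", "disk-c3"},
    "eg-3": {"disk-b2", "disk-c3"},
}
healthy_disks = {"disk-a1", "disk-b2", "disk-c3", "disk-d4"}

def handle_disk_failure(failed: str) -> None:
    """Mark the disk offline, then re-replicate under-replicated extent groups."""
    healthy_disks.discard(failed)          # the failed disk is marked offline
    for eg, replicas in metadata.items():  # Curator-style metadata scan
        replicas.discard(failed)
        while len(replicas) < RF:
            # Any healthy disk not already holding a copy may receive one,
            # so rebuild work is spread across the cluster.
            target = min(healthy_disks - replicas)
            replicas.add(target)
            print(f"re-replicating {eg} onto {target}")

handle_disk_failure("disk-a1")
```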
CVM Failure Handling
CVM autopathing redirects storage I/O to a healthy remote CVM
The original local CVM takes over once it is back online
CVM multipathing
VM latency may be affected, depending on the I/O load (see the sketch below)
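A toy model of autopathing under these assumptions: I/O is forwarded to a remote CVM while the local CVM is down, and returns to the local path once it recovers. CVM names are illustrative.

```python
class CvmAutopathing:
    """Toy autopathing model; not Nutanix code, names are made up."""

    def __init__(self, local_cvm: str, remote_cvms: list):
        self.local_cvm = local_cvm
        self.remote_cvms = remote_cvms
        self.up = {local_cvm: True, **{c: True for c in remote_cvms}}

    def route_io(self) -> str:
        if self.up[self.local_cvm]:
            return self.local_cvm          # normal path: the local CVM
        for cvm in self.remote_cvms:       # autopath over the network;
            if self.up[cvm]:               # latency may rise under heavy I/O
                return cvm
        raise RuntimeError("no CVM available")

paths = CvmAutopathing("cvm-a", ["cvm-b", "cvm-c"])
paths.up["cvm-a"] = False
print(paths.route_io())    # 'cvm-b' while cvm-a is down
paths.up["cvm-a"] = True
print(paths.route_io())    # back to 'cvm-a' once it recovers
```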
Node Failure Handling
VMs receive a High Availability (HA) event and are restarted (sketched below)
VMs resume normal operation after the restart
A Curator scan is performed to find replicas
On a prolonged node failure, the CVM on that node is removed from the metadata ring
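A sketch of the HA restart step, assuming a simple memory-based admission check; node and VM names are invented.

```python
# Hypothetical inventory: node -> free memory (GB) and the VMs it runs.
nodes = {
    "node-1": {"free_gb": 64, "vms": {"vm-a": 16, "vm-b": 8}},
    "node-2": {"free_gb": 96, "vms": {"vm-c": 32}},
    "node-3": {"free_gb": 48, "vms": {}},
}

def ha_restart(failed_node: str) -> None:
    """Restart the failed node's VMs on surviving nodes with enough memory."""
    displaced = nodes.pop(failed_node)["vms"]
    for vm, mem in displaced.items():
        target = max(nodes, key=lambda n: nodes[n]["free_gb"])
        if nodes[target]["free_gb"] < mem:
            print(f"{vm}: not enough capacity, left powered off")
            continue
        nodes[target]["free_gb"] -= mem
        nodes[target]["vms"][vm] = mem
        print(f"{vm} restarted on {target}")

ha_restart("node-1")
```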
Redundancy Factor and Replication Factor
Redundancy factor – the capability of the Nutanix cluster to withstand failure; RF=2 or RF=3 tolerates N-1 node/drive failures
Replication factor – the number of data copies on the Nutanix cluster; RF2 or RF3 keeps N copies
RF2 vs. RF3
RF2 – redundancy across 2 components; sustains a single drive/node failure; requires a minimum of 3 nodes
RF3 – redundancy across 3 components; sustains two simultaneous drive/node failures; requires a minimum of 5 nodes
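The comparison can be reduced to a small helper that derives copies, tolerated failures, and minimum node count from the replication factor; the formulas below simply reproduce the values in the table above.

```python
def rf_properties(rf: int) -> dict:
    """Derive the RF2/RF3 table values from the replication factor (copies of data)."""
    if rf not in (2, 3):
        raise ValueError("Nutanix supports RF2 and RF3")
    return {
        "data_copies": rf,
        "failures_tolerated": rf - 1,   # drives/nodes that can fail at once
        "minimum_nodes": 2 * rf - 1,    # 3 nodes for RF2, 5 nodes for RF3
    }

print(rf_properties(2))  # {'data_copies': 2, 'failures_tolerated': 1, 'minimum_nodes': 3}
print(rf_properties(3))  # {'data_copies': 3, 'failures_tolerated': 2, 'minimum_nodes': 5}
```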
Rack Fault Tolerance
The cluster has rack awareness
Can sustain one or two rack failures
Redundant copies of guest VM data and metadata are maintained
Aware of the physical mapping of racks and blocks
Block Fault Tolerance
A block is a rack-mountable enclosure that contains Nutanix nodes
Redundant copies of data and metadata are kept on different blocks
Opt-in block fault tolerance
Best-effort block fault tolerance
Guest VMs can continue to run despite a block failure (see the placement sketch below)
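A best-effort placement sketch for block (or rack) awareness: replicas are spread across distinct blocks or racks when the topology allows it. The topology map is invented for illustration.

```python
# Hypothetical topology: disk -> (block, rack). Labels are made up.
topology = {
    "disk-1": ("block-A", "rack-1"),
    "disk-2": ("block-A", "rack-1"),
    "disk-3": ("block-B", "rack-1"),
    "disk-4": ("block-C", "rack-2"),
}

def pick_replicas(rf: int, domain: str = "block") -> list:
    """Best-effort placement: spread rf copies across distinct blocks (or racks)."""
    idx = 0 if domain == "block" else 1
    chosen, used = [], set()
    for disk, location in topology.items():
        if location[idx] not in used:
            chosen.append(disk)
            used.add(location[idx])
        if len(chosen) == rf:
            return chosen
    return chosen  # fewer fault domains than rf: fall back to best effort

print(pick_replicas(2, "block"))  # ['disk-1', 'disk-3'] - two different blocks
print(pick_replicas(2, "rack"))   # ['disk-1', 'disk-4'] - two different racks
```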
Rebuild and Curator
Disk/Node Failure, unplanned failure
Multiple copies of data and metadata
- Granularity of metadata
- Choice of peers for RF
- Handling rebuild
Curator metadata management
Curator scan
Module Summary
Storage provisioning
Evolution of hyperconverged infrastructure
Basics of storage
Nutanix hyperconverged architecture
Distributed storage fabric
Types of failures
Failure handling
Redundancy and replication factors
Block and rack fault tolerance
Rebuild process