MODULE 5 - BLOCK, FILE, AND OBJECT-BASED STORAGE SYSTEMS

PARTICIPANT GUIDE

Table of Contents

Module Objectives

Block-Based Storage Systems

Block-Based Storage System Overview
Block Storage System Components
Cache Operations
Block Storage System Disk Drive Protocols
Use Case - Block-Based Storage in the Cloud

Knowledge Check

File-Based Storage Systems

File Systems and Network File Sharing
File-Based Storage Systems: Network Attached Storage (NAS)
General Purpose Servers Vs. NAS Systems
NAS Components
Scale-Up NAS
Scale-Out NAS
Network File Sharing Access Protocols
NAS I/O Operation
Use-Case for Scale-Out NAS: Data Lake

Knowledge Check

Object-Based Storage Systems

Drivers for Object-Based Storage
What is an Object in Object-Based Storage?
Hierarchical File System Vs. Flat Address Space
Components of Object-Based Storage Device
Key Features of OSD Storage Systems

Module 5-Block, File, and Object-based Storage Systems

© Copyright 2022 Dell Inc.


Use Case: Cloud-Based Storage
Use Case: Cloud-based Object Storage Gateway

Knowledge Check

Unified Storage Systems

Drivers For Unified Storage Systems
Unified Storage System Architecture

Concepts in Practice

Module 5-Block, File and Object-based Storage Systems - Appendix

Appendix: Storage Controller
Appendix: Deficiencies of General Purpose Server File Sharing
Appendix: NAS System Components
Appendix: Scale-Out NAS
Appendix: Scale-Out NAS Operation
Appendix: Use-Case for Scale-Out NAS: Data Lake
Appendix: NAS vs. Object-Based Storage
Appendix: OSD Controller Operations
Appendix: Key Features of OSD Systems


Module Objectives

The main objectives of the module are to:

→ Describe the architecture, components, operations, and use of each
storage system type.
→ Explain the advantages of unified storage systems to store and
serve block and file-based data.


Block-Based Storage Systems


Block-Based Storage System Overview

Figure: Block-based storage system LUNs appear to the server as internal physical disks.

Data is stored on disk devices in blocks containing a fixed number of bytes.
Typically, a data block contains 512 or 4,096 bytes.

Block-based storage systems:

• Store raw data only.
• Rely on file system software¹ that is maintained in the host operating system.
• Can have either a scale-up or scale-out architecture.
• Offer data protection, replication, and other capabilities.

¹ A file system adds an organizational structure to the block data.
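
The fixed block size described above can be illustrated with a short Python sketch. It maps a byte range onto the block numbers that hold it. The function name `blocks_for_range` is hypothetical and chosen only for illustration; the block sizes are the 512-byte and 4,096-byte values mentioned in this section.

```python
def blocks_for_range(offset: int, length: int, block_size: int = 512) -> range:
    """Return the block numbers that hold bytes [offset, offset + length)."""
    if offset < 0 or length <= 0 or block_size <= 0:
        raise ValueError("offset must be >= 0; length and block_size must be > 0")
    first = offset // block_size                 # block holding the first byte
    last = (offset + length - 1) // block_size   # block holding the last byte
    return range(first, last + 1)


# The same 100-byte request touches different blocks for each block size.
small_blocks = list(blocks_for_range(1000, 100, block_size=512))
large_blocks = list(blocks_for_range(1000, 100, block_size=4096))
print(small_blocks, large_blocks)
```

With 512-byte blocks the request spans two blocks, while with 4,096-byte blocks it fits inside one. This is only a model of the addressing arithmetic, not of how any particular storage system lays out data.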


Block Storage System Components

All block-storage systems are designed using four main storage system
components, regardless of manufacturer or model: front-end ports, cache,
back-end ports, and the disk array. All storage components are contained
within or connected to the storage controllers.

Front-end Ports

Hosts connect to the storage system only through the front-end ports. They have
no direct access to any other device in the storage system. Front-end ports are
attached to front-end controllers and provide the connectivity protocol logic,
such as Fibre Channel or iSCSI, or, for mainframe hosts, ESCON or FICON.
Depending on the host interface connectivity configuration, front-end port block
I/O is processed through either or both storage controllers.

The image shows four hosts, each with one FC HBA port. Dual redundant
connections to both storage controllers are still achieved because the hosts
connect through a Fibre Channel Storage Area Network (SAN). For complete,
end-to-end connection redundancy, each host should be equipped with at least
two FC HBA ports.


Cache

Storage system controllers contain high-speed DRAM. Used as an active cache,
the DRAM buffers inbound data before it is written to disk, and buffers outbound
data from disks to hosts. Placing DRAM between the storage system disks and the
front-end ports increases I/O performance.

Both storage controllers contain the same amount of embedded cache in the form
of DRAM modules. The cache sits in the path between the disk array back-end
ports and the host front-end ports. In this location, all read and write block I/O
must pass through the cache, which increases read and write I/O performance.

Adding more cache memory to the controllers increases I/O performance further,
which helps the system scale as hosts and storage are added.


Back-end Ports

Back-end ports connect through the link control cards to the shelves of physical
disk devices in each disk array enclosure (DAE). Dual redundant connections
between the controllers and all DAEs provide disk I/O reliability. Unused back-end
ports are available to connect additional DAEs to the system.


Disk Array

The disk array contains traditional hard disk drives or solid state drives. These
devices are only accessible by the storage system controllers.

The disk array provides physical block storage capacity to the storage system. The
devices are arranged in shelves within a DAE.

Adding DAEs scales the system by increasing both storage capacity and I/O
performance.


Cache Operations

How cache is used to increase host read I/O performance:

• Host sends a read request to the front-end port. If the requested data is in
cache, the data is sent quickly from its cache location through the front-end
port back to the host. Fetching requested data that is already in cache is
known as a read hit².
• Host sends a read request to the front-end port. If the requested data is not in
cache, the request is forwarded through the back-end ports and the link
controller to the disk devices. Data is fetched from the disks and takes the same
route back to the host. Fetching requested data that is not already in cache is
known as a read miss³.

Figure: Comparison of cache read hit and read miss operations.

Cache write I/O operations:

• Write-Back Operation: Write data that is in the cache of both controllers
must eventually be written to the disk devices. Writing to disk can be done
later because deferring it has no impact on host I/O performance. Typically,
the storage system schedules writes to disk during low host activity or idle
time.
• Write-Through Operation: Write data passes through the cache and is
immediately written to the disk devices. An acknowledgment is sent to the host
after the data is written to disk. Since data is committed to disk as it arrives, the
risk of data loss is low. However, the write response time is longer because the
host must wait for the disk write instead of benefiting from the speed of
storing data in cache before de-staging to disk.

² The storage controller stores requested data in cache to increase the chances
of a read hit when the data is next requested.
³ A read miss requires read data to traverse the longer end-to-end I/O path to disk.
The data must be processed at each storage system component, which adds
latency. Also, disk device access times are very slow compared to cache memory.
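
The read and write behaviors above can be modeled in a few lines of Python. This is a minimal sketch under stated assumptions, not a description of any real controller: the class `CachedController` and its attributes are hypothetical, plain dictionaries stand in for DRAM cache and disk, and cache eviction and dual-controller mirroring are left out.

```python
class CachedController:
    """Toy storage controller: one dict stands in for DRAM cache, one for disk."""

    def __init__(self, write_back: bool = True):
        self.write_back = write_back
        self.cache = {}      # block number -> data held in cache
        self.disk = {}       # block number -> data committed to disk
        self.dirty = set()   # cached blocks not yet written to disk
        self.hits = 0
        self.misses = 0

    def read(self, block: int):
        if block in self.cache:
            self.hits += 1               # read hit: served from cache
            return self.cache[block]
        self.misses += 1                 # read miss: take the longer path to disk
        data = self.disk.get(block)
        if data is not None:
            self.cache[block] = data     # keep a copy for the next request
        return data

    def write(self, block: int, data: bytes) -> None:
        self.cache[block] = data
        if self.write_back:
            self.dirty.add(block)        # write-back: de-stage to disk later
        else:
            self.disk[block] = data      # write-through: commit to disk now

    def flush(self) -> None:
        """De-stage all dirty blocks, as a system might during idle time."""
        for block in sorted(self.dirty):
            self.disk[block] = self.cache[block]
        self.dirty.clear()


back = CachedController(write_back=True)
back.disk[7] = b"old"
back.read(7)                  # first read is a miss
back.read(7)                  # second read is a hit
back.write(7, b"new")
on_disk_before_flush = back.disk[7]
back.flush()
on_disk_after_flush = back.disk[7]

through = CachedController(write_back=False)
through.write(7, b"new")      # reaches disk immediately
```

In the write-back case the disk still holds the old data until `flush()` runs, which is the window in which a failure could lose data. In the write-through case the disk is updated as part of the write itself.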


Block Storage System Disk Drive Protocols

Intelligent storage systems provide support for a variety of disk devices of different
speeds and types, such as FC, SATA, SAS, and solid state (SSD) disk devices.
They also support using a mix of SSD, FC, or SATA within the same storage
system. Additionally, enterprise storage systems support disk devices that use the
NVM Express (NVMe) protocol.

FC

Fibre Channel is a high-speed block data transfer protocol. Along with high
performance, FC guarantees in-order delivery of block data that is read from
the disk device. FC disk drives are designed to favor high performance over
storage capacity.

SAS

Serial Attached SCSI is a block disk protocol that replaced parallel SCSI disk drive
connectivity. SAS disk drives are designed for midrange block I/O storage
applications, balancing performance with higher storage capacity requirements.

SATA

Serial Advanced Technology Attachment is a block disk protocol that is typically
used in less demanding block storage I/O applications. SATA disk drives are less
expensive and provide higher storage capacity than FC or SAS disk drives.

NVMe

Nonvolatile Memory Express is an optimized block disk connectivity protocol
that provides the highest block I/O transfer rates and lowest latency of all
block-based storage protocols. It is designed to allow enterprise-class SSD
storage devices to operate at their maximum performance.


Use Case - Block-Based Storage in the Cloud

Figure labels: Virtual Machines Running Business Applications, Block-based
Storage Volumes, Block-based Storage System.

To develop prototypes or to quickly scale to meet user demand, organizations may
move their applications to a public cloud. To ensure proper functioning of the
application and provide acceptable performance, service providers offer
block-based storage in the cloud.

The service providers enable consumers to create block-based storage volumes
and attach them to the virtual machine instances. After the volumes are attached,
consumers can create file systems on these volumes and run applications the way
they would in an on-premises data center.


Knowledge Check

1. What is a characteristic of block data storage?
a. Data bytes written to disk devices are stored in blocks of a consistent size.
b. Data bytes written to disk devices are stored in blocks of variable size.
c. Data blocks written to disk devices are stored in bytes of variable size.
d. Data blocks written to disk devices are stored in bytes of a consistent size.


File-Based Storage Systems


File Systems and Network File Sharing

Applications access data in the form of files. A file has metadata⁴ so that an
application or user can correctly access and use the raw block data of the file.
Metadata adds other important information that is associated with the file data
type.

Figure (two hosts, each labeled Windows Operating System and NTFS File
System): With file sharing enabled, folders virtually become a part of the
hierarchy of another file system. Shared folders appear as locally stored.

⁴ In computer files, metadata is additional data that describes the raw data in the
file. For example, when a digital photo editor opens a file, it first reads the metadata
to ensure the raw data is a digital photo in the correct format, such as JPG or PNG.
The photo editor also reads the metadata to understand details about the image,
such as its height and width, pixel density, and the type of compression used to
store the raw data on disk.


• File names, extensions, and metadata are organized and maintained by the
host operating system in the form of a file system⁵.
• Each server has its own file system. The file system is only accessible to that
server.
− File sharing functionality is integrated but must be enabled. Enabling file
sharing allows access by other hosts. The creator or owner of each file
determines the type of access to be given to other users.

To learn about the limitations of using general-purpose servers for file sharing,
see Appendix: Deficiencies of General Purpose Server File Sharing.

⁵ A file system is a logical representation of how an operating system manages
where and how data is stored on disk drives. Files are typically stored in folders,
and folders are organized in a hierarchical tree-like structure that can be directly
accessed or searched sequentially for files. A file system also contains metadata
about file and folder size, names, file data location on disk drives, date and time
accessed, modified, etc. File metadata also describes how an application or user
can access the raw data in the correct format. There are different file system types
that add additional features and functions, such as deduplication, compression,
distributed access across clustered hosts, and rapid search capabilities.
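
As a small illustration of the file system metadata described above, the following Python sketch writes a file and then asks the operating system for some of the metadata it keeps about that file. The file name `report.txt` and its contents are made up for the example; `os.stat` is the standard library call that returns size and timestamp information.

```python
import os
import tempfile
import time

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "report.txt")
    with open(path, "wb") as f:
        f.write(b"raw file data")

    # The file system tracks this information alongside the raw data.
    info = os.stat(path)
    name, extension = os.path.splitext(os.path.basename(path))
    file_size = info.st_size                 # size in bytes
    modified = time.ctime(info.st_mtime)     # last modification time
    print(name, extension, file_size, modified)
```

The application never needs to know which disk blocks hold the data. It works with names, sizes, and timestamps, and the file system resolves those to block locations.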


File-Based Storage Systems: Network Attached Storage (NAS)

Figure labels: Clients, LAN, Application Servers, NAS System.

File-based storage systems are purpose-built, high-performance, highly scalable
platforms that take the place of general-purpose servers to store and share file
data. These specialized storage systems are known as Network Attached
Storage (NAS) systems.

NAS provides the advantages of server consolidation by storing all file data from
the general-purpose servers in its own file systems. File server consolidation
makes it easier to manage the storage.

A NAS system:

• Centralizes and optimizes file sharing operations, administration, and
management.
• Uses a specialized operating system that is optimized for file I/O.
• Enables Linux, UNIX, and Windows users to share data more efficiently.


General Purpose Servers Vs. NAS Systems

A NAS system is optimized for file-serving functions such as storing, retrieving, and
accessing files for applications and clients, as shown in the image:

• A general-purpose server can be used to host any application because it runs a
general-purpose operating system.
• Unlike a general-purpose server, a NAS device is dedicated to file serving.
• A NAS device has a specialized operating system that is dedicated to file
serving by using industry-standard protocols. NAS vendors also support
features such as clustering⁶ for high availability, scalability, and performance.

⁶ The clustering feature enables multiple NAS controllers, heads, or nodes to
function as a single entity. The workload can be distributed across all the available
nodes. Clustering enables NAS to support massive workloads.


Figure: NAS servers do not run user applications or access user peripheral devices.


NAS Components

NAS systems consist of NAS controllers and storage. For smaller applications,
NAS controllers can reside in the same physical unit as the storage controllers and
disk array.

• A NAS controller consists of:
− CPU, memory, network adapters, and so on.
− A specialized operating system.
• Storage:
− Supports different types of storage devices.
− Can be integrated with the NAS system or be connected through a SAN.

Figure: The image on the left shows an integrated NAS system. The image on the
right shows multiple NAS servers with external SAN storage.

To learn more about NAS system components, see Appendix: NAS System
Components.


Scale-Up NAS

Figure labels: Storage, Scale-Up NAS, NAS Head(s).

Scale-up architecture provides the ability to grow capacity and performance
independently. For example, if you added only storage to a system, you would be
scaling up capacity. Likewise, if you added only NAS controllers, which contain
CPU and memory, you would be scaling up performance.

Scale-up NAS systems have a fixed capacity ceiling, which limits their scalability.
The performance of these systems starts degrading when they approach the
capacity limit.


Scale-Out NAS


Scale-out is the ability to grow capacity and performance simultaneously. For
example, adding a node to a NAS system adds CPU, memory, network adapters,
and storage, so adding a node is an example of scale-out.

Scale-out NAS:

• Pools multiple NAS servers or nodes in a cluster to work as a single NAS
device.
• Scales performance and capacity non-disruptively.
• Creates a single file system that runs on all nodes in the cluster.
− Clients, which are connected to any node, can access the entire file system.
− File system grows dynamically as nodes are added.
• Stripes data across nodes with mirror or parity protection.
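
To make the idea of striping with parity protection concrete, here is a minimal Python sketch. It assumes a simple single-parity XOR scheme, which is only one possible approach and is not taken from any specific scale-out NAS product. The function names `stripe_with_parity` and `rebuild_chunk` are hypothetical.

```python
from functools import reduce


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def stripe_with_parity(data: bytes, data_nodes: int):
    """Split data into one equal chunk per data node, plus one XOR parity chunk."""
    chunk_len = -(-len(data) // data_nodes)            # ceiling division
    padded = data.ljust(chunk_len * data_nodes, b"\0")  # pad the last chunk
    chunks = [padded[i * chunk_len:(i + 1) * chunk_len] for i in range(data_nodes)]
    parity = reduce(xor_bytes, chunks)
    return chunks, parity


def rebuild_chunk(surviving: list, parity: bytes) -> bytes:
    """Recover one missing chunk from the surviving chunks and the parity chunk."""
    return reduce(xor_bytes, surviving, parity)


chunks, parity = stripe_with_parity(b"scale-out NAS data", data_nodes=3)
# Pretend the node holding chunks[1] has failed.
recovered = rebuild_chunk([chunks[0], chunks[2]], parity)
print(recovered == chunks[1])
```

Because the parity chunk is the XOR of all data chunks, XOR-ing it with the surviving chunks yields the missing one. Mirror protection would instead keep a full second copy of each chunk on another node.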


To learn more about scale-out NAS, see Appendix: Scale-Out NAS.


Network File Sharing Access Protocols

Different methods can be used to access files on a NAS system. The most
common methods are:

CIFS/SMB

Common Internet File System (CIFS) is a client/server application protocol that
enables client programs to make requests for files and services on remote
computers over TCP/IP. It is a non-proprietary version of the Microsoft Windows
Server Message Block (SMB) protocol.

The CIFS protocol enables remote clients to gain access to files on a server. CIFS
enables file sharing with other clients by using special locks. CIFS provides the
following features to ensure data integrity:
• Uses file and record locking to prevent users from overwriting the work of
another user on a file or a record.
• Supports fault tolerance and can automatically restore connections and reopen
files that were open prior to an interruption.

CIFS is a stateful protocol because the CIFS server maintains connection
information regarding every connected client.

Users refer to remote file systems with an easy-to-use file-naming scheme:
\\server\share or \\servername.domain.suffix\share.
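
The naming scheme above can be taken apart with Python's standard `pathlib` module, which understands this style of path. The path `\\server\share\reports\q1.txt` is an invented example. The sketch only parses the name and does not contact any server.

```python
from pathlib import PureWindowsPath

path = PureWindowsPath(r"\\server\share\reports\q1.txt")

# For this style of path, the "drive" is the \\server\share portion.
server, share = path.drive.lstrip("\\").split("\\")
folders = path.parts[1:-1]   # folders below the share
file_name = path.name

print(server, share, folders, file_name)
```

Splitting the name this way shows the two pieces a client needs before any file request is made: which server to contact and which share on that server to open.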

NFS

Network File System (NFS) is a client/server protocol for file sharing that is
commonly used on UNIX systems. NFS was originally based on the connectionless
User Datagram Protocol (UDP). It uses a machine-independent model to represent
user data. It also uses Remote Procedure Call (RPC) for interprocess
communication between two computers.

The NFS protocol provides a set of RPCs to access a remote file system for the
following operations:
• Searching files and directories.
• Opening, reading, writing to, and closing a file.
• Changing file attributes.
• Modifying file links and directories.

NFS creates a connection between the client and the remote system to transfer
data.

HDFS

Hadoop Distributed File System (HDFS) is supported by many of the scale-out
NAS vendors. HDFS requires programmatic access because the file system cannot
be mounted. All HDFS communication is layered on top of the TCP/IP protocol.
HDFS has a primary/secondary architecture. An HDFS cluster consists of a single
NameNode that acts as the primary server.

The NameNode keeps in-memory maps of every file, including the file locations,
all the blocks within each file, and the DataNodes on which the blocks reside. The
NameNode is responsible for managing the file system namespace and controlling
access to the files by clients. DataNodes act as secondary nodes that serve
read/write requests and perform block creation, deletion, and replication as
directed by the NameNode.

HDFS:

• Is a file system that spans multiple nodes in a cluster and enables user data to
be stored in files.
• Presents a traditional hierarchical file organization so that users or applications
can manipulate (create, rename, move, or remove) files and directories.
• Presents a streaming interface to run any application of choice using the
MapReduce framework.
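
The division of work between the NameNode and the DataNodes can be sketched with a toy in-memory model in Python. This is not the Hadoop API. The class name, the `blk_` identifiers, the replication factor of 2, and the round-robin placement are all assumptions made for illustration, and the placement policy in real HDFS is more involved.

```python
class NameNode:
    """Toy namespace manager: tracks which DataNodes hold each block of a file."""

    def __init__(self, datanodes, replication=2):
        self.datanodes = list(datanodes)
        self.replication = replication
        self.block_map = {}   # file path -> list of (block ID, [DataNode names])
        self._next = 0        # counter used to name and place blocks

    def allocate(self, path: str, num_blocks: int):
        """Record where each block of a new file should be stored."""
        placements = []
        for _ in range(num_blocks):
            block_id = f"blk_{self._next}"
            holders = [
                self.datanodes[(self._next + r) % len(self.datanodes)]
                for r in range(self.replication)
            ]
            placements.append((block_id, holders))
            self._next += 1
        self.block_map[path] = placements
        return placements

    def locate(self, path: str):
        """Answer a client lookup; the client then reads from the DataNodes."""
        return self.block_map[path]


nn = NameNode(["dn1", "dn2", "dn3"], replication=2)
nn.allocate("/logs/app.log", num_blocks=3)
for block_id, holders in nn.locate("/logs/app.log"):
    print(block_id, holders)
```

The model holds only metadata. File contents would live on the DataNodes, which is why losing the block map would make the stored blocks unusable even though the data itself still exists.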


FTP

• FTP is a client/server protocol that enables data transfer over an IP network.
• An FTP server and an FTP client communicate with each other using TCP as
the transport protocol.
• FTP uses a set of commands and arguments to log in to the remote FTP server
and to access, manipulate, and transfer shared files and file metadata.


NAS I/O Operation

Network file I/O operations differ between scale-up and scale-out NAS
configurations.

Scale-Up NAS I/O Operation

The figure illustrates an I/O operation in a scale-up NAS system. The process of
handling read/write requests in a scale-up NAS environment is as follows:
1. The requestor (client) packages an I/O request into TCP/IP and forwards it
through the network stack. The NAS system receives this request from the
network.
2. The NAS system converts the I/O request into an appropriate physical storage
request, which is a block-level I/O. The NAS system then performs the operation
on the physical storage.
3. When the NAS system receives data from the storage, it processes and
repackages the data into an appropriate file protocol response.
4. The NAS system packages this response into TCP/IP again and forwards it to
the client through the network.
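
The four steps above can be walked through with a toy Python model. It is a sketch under several assumptions: the class `ToyNas` is hypothetical, a dictionary stands in for a file protocol request carried over TCP/IP, another dictionary stands in for physical block storage, and the 4,096-byte block size is chosen only for the example.

```python
BLOCK_SIZE = 4096  # assumed block size for this sketch


class ToyNas:
    def __init__(self):
        self.blocks = {}       # block number -> bytes (stands in for physical storage)
        self.file_table = {}   # file name -> (list of block numbers, file length)

    def store(self, name: str, data: bytes) -> None:
        """Split file data into blocks and remember which blocks belong to the file."""
        numbers = []
        for i in range(0, len(data), BLOCK_SIZE):
            number = len(self.blocks)
            self.blocks[number] = data[i:i + BLOCK_SIZE]
            numbers.append(number)
        self.file_table[name] = (numbers, len(data))

    def handle_read(self, request: dict) -> dict:
        # Step 1 has already happened: the request arrived from the network.
        # Step 2: convert the file request into block-level reads.
        numbers, length = self.file_table[request["file"]]
        data = b"".join(self.blocks[n] for n in numbers)
        # Steps 3 and 4: repackage the data into a file-level response.
        return {"file": request["file"], "length": length, "data": data}


nas = ToyNas()
nas.store("notes.txt", b"x" * 5000)
response = nas.handle_read({"file": "notes.txt"})
print(response["length"], len(nas.blocks))
```

The client only ever names a file. The translation to block numbers happens inside the NAS system, which is the key difference from block-based access, where the host itself addresses blocks.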


Scale-Out NAS I/O Operation

Figure: Scale-out NAS.

The figure illustrates an I/O operation in a scale-out NAS system. A scale-out NAS
consists of multiple NAS nodes, and each of these nodes has functionality similar
to that of a NameNode or a DataNode. In some proprietary scale-out NAS
implementations, each node may function as both a NameNode and a DataNode,
typically to provide Hadoop integration. All the NAS nodes in scale-out NAS are
clustered.

Write Operation

1. Client sends a file to the NAS.
2. Node to which the client is connected receives the file.
3. File is striped across the nodes.

Read Operation

1. Client requests a file.
2. Node to which the client is connected receives the request.
3. The node retrieves and rebuilds the file and gives it to the client.
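
The write and read flows above can be sketched as a toy cluster in Python. The class `ScaleOutCluster`, the node names, the 4-byte chunk size, and the round-robin chunk placement are all hypothetical choices for illustration. Protection (mirror or parity) is left out to keep the flow visible.

```python
class ScaleOutCluster:
    def __init__(self, node_names, chunk_size=4):
        self.nodes = {name: {} for name in node_names}  # node -> {(file, index): chunk}
        self.chunk_size = chunk_size
        self.layout = {}   # file -> number of chunks (metadata shared by all nodes)

    def write(self, via_node: str, file: str, data: bytes) -> None:
        """The receiving node stripes the file across every node in the cluster."""
        if via_node not in self.nodes:
            raise KeyError(via_node)
        names = list(self.nodes)
        chunks = [data[i:i + self.chunk_size]
                  for i in range(0, len(data), self.chunk_size)]
        for index, chunk in enumerate(chunks):
            self.nodes[names[index % len(names)]][(file, index)] = chunk
        self.layout[file] = len(chunks)

    def read(self, via_node: str, file: str) -> bytes:
        """The receiving node gathers the chunks and rebuilds the file."""
        if via_node not in self.nodes:
            raise KeyError(via_node)
        names = list(self.nodes)
        return b"".join(self.nodes[names[i % len(names)]][(file, i)]
                        for i in range(self.layout[file]))


cluster = ScaleOutCluster(["node1", "node2", "node3"])
cluster.write("node1", "video.bin", b"0123456789AB")   # client connected to node1
rebuilt = cluster.read("node3", "video.bin")           # another client on node3
print(rebuilt)
```

The file is written through one node and read back through a different one, which mirrors the point that clients connected to any node can access the entire file system.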


To learn more about scale-out NAS I/O operation, see Appendix: Scale-Out NAS
Operation.


Use-Case for Scale-Out NAS: Data Lake

The data lake represents a change from the linear data flow model. As data
increases in value, enterprise-wide data storage is transformed into a hub for data
ingestion and consumption systems. This data hub enables enterprises to bring
analytics to data and avoid the high cost of multiple systems, storage, and time for
ingestion and analysis.

The scale-out data lake:

• Accepts data from various sources like file shares, archives, web applications,
devices, and the cloud.
• Enables data access for uses ranging from conventional purposes to mobile,
analytics, and cloud applications.
• Scales to meet the demands of consolidation and growth as technology
evolves.
• Provides a tiering capability that enables organizations to manage costs without
setting up specialized infrastructures.

To learn more about deploying scale-out NAS for data lakes, see Appendix:
Use-Case for Scale-Out NAS: Data Lake.


Knowledge Check

1. What is a problem with using general-purpose servers for network file sharing?
a. File system incompatibilities when sharing files with clients that use
different operating systems.
b. File system incompatibilities when sharing files with clients that are virtual
machines.
c. File system incompatibilities when sharing files with clients that also serve
files.
d. File system incompatibilities when sharing files with clients that use the S3
or HDFS protocols.


Object-Based Storage Systems


Drivers for Object-Based Storage

Listed are the key drivers for object-based storage adoption:

• Amount of data created annually is growing exponentially and more than 90% of
data generated is unstructured.
− Rapid adoption of third platform technologies leads to significant growth of
data.
− Longer data retention due to regulatory compliance also leads to data
growth.
• Data must be instantly accessible through a variety of devices from anywhere in
the world.
• Traditional storage solutions are inefficient in managing this data and in
handling the growth.

To learn more about NAS vs object-based storage systems in high growth and
access demand environments, see Appendix: NAS vs. Object-Based Storage.

What is an Object in Object-Based Storage?

An object is the fundamental unit of object-based storage.

Most object-based storage systems support API integration with software-defined
data center and cloud environments.

• Objects contain user data, related metadata, and user-defined attributes of the
data, such as retention, access pattern, and others.
− Object metadata or attributes are used to optimize search, retention policies,
and automated deletion of objects.
• Each object is identified by a unique object ID. The object ID allows access to
objects without specifying the storage location.
− The object ID is generated by applying specialized algorithms to the data. This
fingerprinting process guarantees unique object identification. Any change
to the object data results in a new object ID.
• An object storage database is used to track where objects and metadata are
stored. For retrieval, the database uses the object ID to access the storage
location records for the object and its metadata.
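
The object ID and database lookup described above can be sketched as follows. This is a minimal illustration, not how any particular product works: the class name FlatObjectStore is hypothetical, SHA-256 is an assumed stand-in for the "specialized algorithms", and an in-memory dictionary stands in for the object storage database.

```python
import hashlib


class FlatObjectStore:
    """Toy object store: a flat mapping of object ID -> (data, metadata)."""

    def __init__(self):
        # Stand-in for the object storage database that tracks where
        # each object and its metadata are stored.
        self._objects = {}

    def put(self, data: bytes, metadata: dict) -> str:
        # Fingerprint the content. SHA-256 is an assumption; the guide
        # only says "specialized algorithms".
        object_id = hashlib.sha256(data).hexdigest()
        self._objects[object_id] = (data, dict(metadata))
        return object_id

    def get(self, object_id: str) -> bytes:
        # Callers supply only the ID, never a storage location.
        return self._objects[object_id][0]

    def metadata(self, object_id: str) -> dict:
        return self._objects[object_id][1]


store = FlatObjectStore()
first_id = store.put(b"quarterly report v1", {"retention_days": 365})
second_id = store.put(b"quarterly report v2", {"retention_days": 365})
```

Because the ID is derived from the content, the changed data in the second call yields a different ID, while storing identical data again would return the same ID.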

Instead of storing data in a hierarchical structure of folders and files, object data is stored across a
logically flat data repository.

Hierarchical File System Vs. Flat Address Space

An object storage device (OSD) stores data using a flat address space where
objects exist at the same level, and one object cannot be placed inside another
object. Therefore, there is no hierarchy of directories and files, and billions of
objects can be stored in a single namespace.

• A hierarchical file system organizes data in the form of files and directories.
− Limits the number of files that can be stored.
• An OSD uses a non-hierarchical, flat address space that enables storing a large
number of objects without having to maintain an absolute path to each object.
− Enables the OSD to meet the scale-out storage indexing requirements of
cloud computing, big data, and data analytics environments.

The image on the left shows indexed data stored in a hierarchical file system. On the right is an
example of data stored in a flat, non-hierarchical address space.

Components of Object-Based Storage Device

The OSD system is composed of one or more controllers. A controller is a server
that runs the OSD operating environment and provides services to store, retrieve,
and manage data in the system. Typically, OSD controllers are inexpensive
x86-based servers. Each controller provides both compute and storage resources. OSD
systems scale linearly in performance and capacity by adding controllers.

An OSD consists of controllers that connect to each other and to dedicated disk devices through an
internal network. Client systems connect to the OSD over the IP network.

An OSD system typically comprises three key components:

• OSD controllers
• Internal network
• Storage

To learn more about OSD system controllers, see Appendix: OSD Controller Operations.

Key Features of OSD Storage Systems

Object-based storage devices have these features:

Feature                      Description

Scale-out architecture       Provides linear scalability; nodes are added independently
                             to the cluster to scale massively.

Multitenancy                 Enables multiple applications/clients to be served from the
                             same infrastructure.

Metadata-driven policy       Intelligently drives data placement, protection, and data
                             services based on the service requirements.

Global namespace             Abstracts storage from the application and provides a
                             common view that is independent of location, making
                             scaling seamless.

Flexible data access method  Supports REST/SOAP APIs for web/mobile access, and file
                             sharing protocols (CIFS and NFS) for file service access.

Automated system management  Provides auto-configuring and auto-healing capabilities to
                             reduce administrative complexity and downtime.

Data protection:             Objects are protected using either replication or erasure
Geo distribution             coding, and the copies are distributed across different
                             locations.

For additional information about OSD system features, see Appendix: Key Features of OSD Systems.

Use Case: Cloud-Based Storage

OSD enables multitenant, scalable cloud storage. Cloud storage provides
geographic distribution of data with unified and universal access, policy-based data
placement, and massive scalability. It also enables data access through web
service or file access protocols and provides automated data protection to manage
large amounts of data.

With the growing adoption of cloud computing, cloud service providers can
leverage OSD to offer storage-as-a-service, backup-as-a-service, and archive-as-
a-service to their consumers.

Use Case: Cloud-based Object Storage Gateway

The lack of standardized cloud storage APIs has made the gateway a crucial
component for cloud adoption. Service providers offer cloud-based object storage
with interfaces such as REST or SOAP (see footnote 7). However, most business applications
access storage resources through block-based iSCSI or FC interfaces, or file-
based interfaces, such as NFS or CIFS.

OSD gateways provide a translation layer between iSCSI, FC, NFS, CIFS
interfaces, and the cloud provider’s REST API.

Figure: Servers requiring file and block storage access (application server,
virtualization server) connect through the OSD gateway to cloud storage.

The OSD gateway:

• Presents file and block-based storage interfaces to applications.
• Performs protocol conversion to send data directly to cloud storage.
• Encrypts the data before transmitting it to cloud storage.
• Supports deduplication and compression.
• Maintains a local cache to reduce latency for remote storage access.
• Provides a data management layer to determine what data to send to cloud
storage or cache locally.
• Can be a physical appliance or a virtual appliance that runs gateway software.

7 SOAP is a messaging protocol for sending and receiving structured information
within networked web services. SOAP uses the XML message format, and
application layer protocols such as HTTP for transport. Where HTTP cannot be
used, Simple Mail Transfer Protocol (SMTP) can be used for message
transmission.
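
A few of the gateway behaviors listed above (protocol conversion, compression, and a local cache) can be sketched as follows. This is an illustrative toy under stated assumptions: the class ObjectGatewaySketch is hypothetical, a dictionary stands in for the cloud provider's REST API, the cache policy is assumed to be least recently used, and encryption and deduplication are left out.

```python
import zlib
from collections import OrderedDict


class ObjectGatewaySketch:
    """Toy gateway: file-style paths in, compressed objects out."""

    def __init__(self, cloud: dict, cache_entries: int = 2):
        self.cloud = cloud                  # stand-in for the provider's REST API
        self.cache = OrderedDict()          # local cache, least recently used first
        self.cache_entries = cache_entries

    def _remember(self, path: str, data: bytes) -> None:
        self.cache[path] = data
        self.cache.move_to_end(path)
        while len(self.cache) > self.cache_entries:
            self.cache.popitem(last=False)  # evict the least recently used entry

    def write_file(self, path: str, data: bytes) -> None:
        # Protocol conversion: a file write becomes an object PUT keyed by path.
        self.cloud[path] = zlib.compress(data)
        self._remember(path, data)

    def read_file(self, path: str) -> bytes:
        if path in self.cache:              # a cache hit avoids the remote round trip
            self.cache.move_to_end(path)
            return self.cache[path]
        data = zlib.decompress(self.cloud[path])
        self._remember(path, data)
        return data


cloud = {}
gw = ObjectGatewaySketch(cloud)
gw.write_file("/exports/a.txt", b"a" * 1000)
```

The application only ever sees file paths; what reaches the cloud dictionary is a compressed object, and repeated reads of recently used files are served locally.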

Knowledge Check

1. What is true about how object data is stored in an OSD system?
a. Object data is stored in a logically flat data repository. Object data and
metadata may be stored in different locations.
b. Object data is stored in a logically flat data repository. Object data and
metadata are stored together.
c. Object data is stored in a logically hierarchical data repository. Object data
and metadata are stored together.
d. Object data is stored in a logically hierarchical data repository. Object data
and metadata may be stored in different locations.

2. What is an attribute of an OSD gateway?
a. Presents file and block-based storage interfaces to applications.
b. Presents iSCSI and Fibre Channel-based storage interfaces to
applications.
c. Presents HTTP and XML storage interfaces to applications.
d. Presents REST and SOAP-based storage interfaces to applications.

Unified Storage Systems

Drivers For Unified Storage Systems

Midrange and large enterprise IT departments support a growing number of
different applications and users. These applications and users require the IT
department to provide and manage their storage capacity, scale, connectivity, and
protocol demands.

• The IT department must configure, administer, and manage an increasing
number of separate block, file, and object-based storage systems.
• Providing increasing storage and services across separate systems is
expensive and complex.
• IT personnel must master using different interfaces, software tools, and
procedures to manage the storage infrastructure.

Unified Storage System Architecture

Unified storage systems converge block, file, and object storage as well as
configuration and management into a single platform. Unified storage lowers the
administration and management impact of a growing storage infrastructure. Data
storage and storage access remain transparent for applications and users.

Key benefits of unified storage systems are:

• Reduced number of separate storage systems.
− Lower environmental, acquisition, personnel, and administration and
maintenance expenses.
• Reduced configuration, administration, and management complexity.
• Integration with a software-defined environment provides the right storage
access for all users and applications.
• Increased utilization, with no stranded capacity. Unified storage eliminates the
capacity utilization penalty.

The image on the left shows separate block, file, and object storage systems. The image on the
right shows them converged into a unified storage system.

Concepts in Practice

Dell PowerStore

Dell PowerStore is a unified storage platform that stores and serves block and file
data. Designed for growth, PowerStore can scale up by adding storage capacity
and scale out by adding storage controllers. The PowerStore platform provides:

• A scalable, single platform for block and file data storage and services.
• Active/active storage controllers with end-to-end NVMe storage connectivity.
• Flash or Storage Class Memory (SCM) internal storage devices.
• Integrated data optimization and protection services.

Dell PowerMax Series

Dell PowerMax is a unified storage platform that stores and serves block and file
data. Designed for growth, PowerMax can scale up by adding storage capacity
and scale out by adding storage controllers.

The PowerMax storage architecture offers:

• Up to 15M IOPS, 350 GB/s throughput (187 K IOPS per rack unit).
• Active/active storage controllers with end-to-end NVMe storage connectivity.
• Automated I/O recognition and data placement across NAND flash and SCM
media to maximize performance with no management overhead.
• End-to-end efficient data encryption and FIPS 140-2 validated Data at Rest
Encryption.
• Integrated data optimization and protection services.

Dell PowerScale

PowerScale is a family of scale-out NAS products based on the OneFS operating
environment. Available as all-flash, hybrid, and archive models, they achieve high
scalability by pooling multiple nodes into a clustered NAS system that can store
petabytes of file data. These NAS products are optimized for file sharing
and object data storage.

OneFS creates a single file system that spans all nodes in a PowerScale
cluster.

• Protocols: SMB (1, 2, 2.1, 3.x), NFS (v3, v4.0), FTP, SFTP, FTPS, S3, HDFS,
HTTP.
• Scalability per file system namespace is 66 PB.
• Integrated data optimization and protection services.
• Scalability per cluster: 252 Nodes.

Dell ECS

Dell ECS object scale storage appliances provide a hyperscale storage
infrastructure that is designed to support modern applications. ECS provides a
scalable, high availability architecture. It provides universal accessibility with
support for object data, and the HDFS file system. The ECS platform provides:

• Hyperscale storage infrastructure.
• Universal accessibility with support for object and HDFS.
• Automated I/O recognition and data placement across NAND flash and SCM
media to maximize performance with no management overhead.
• A single platform for web, mobile, big data, and social media applications.
• Data optimization and protection services.

Dell Cloud Tiering Appliance

The Dell Cloud Tiering Appliance (CTA) is packaged in the form of a virtual
appliance. CTA is used to optimize primary file storage by automatically moving
inactive files to secondary storage based on policies. Secondary storage can be
lower-cost drives, such as SAS or SATA drives, or other platforms, including
public and private clouds. CTA can also provide block-level LUN data archiving for
Dell EMC Unity storage systems.

Files that are moved appear as if they are still on primary storage. File tiering
dramatically improves storage efficiency, and backup and restore time. File
archiving onto storage with WORM functionality can support additional business
requirements such as compliance and retention.

• Tier or archive and recall file data.
• Automatically migrate files.
• Perform orphan file management.
• Simulate the potential effect of inactive file migration policies before starting the
process.
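
The idea of simulating an inactive-file migration policy before running it can be sketched as below. This is a hypothetical illustration and not the CTA policy engine: the FileRecord fields, the simulate_tiering function, and the 180-day threshold are all assumptions chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class FileRecord:
    path: str
    size_bytes: int
    days_since_access: int


def simulate_tiering(files, inactive_after_days):
    """Return the files a policy would move and the primary capacity freed."""
    to_move = [f for f in files if f.days_since_access >= inactive_after_days]
    freed = sum(f.size_bytes for f in to_move)
    return to_move, freed


# Hypothetical inventory of a primary file share.
files = [
    FileRecord("/share/plan.docx", 2_000_000, 3),
    FileRecord("/share/2019/audit.pdf", 8_000_000, 400),
    FileRecord("/share/2020/logs.tar", 50_000_000, 200),
]
moved, freed_bytes = simulate_tiering(files, inactive_after_days=180)
```

Nothing is moved here; the function only reports which files match the policy and how much primary capacity the migration would release.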

Module 5-Block, File and Object-based Storage
Systems - Appendix

Appendix: Storage Controller

A storage controller is the device within the storage system that operates and
manages all of the functional components of the system. Storage systems should
have a minimum of two storage controllers. This arrangement provides operational
redundancy and enhances I/O performance and scalability. Storage Controllers:

• Contain front-end ports to connect to a SAN or directly to servers.
• Contain back-end ports that connect to the internal disk array enclosures.
• Implement hardware RAID to logically segment disk drives, and present the
segments to servers as logical block disk devices or LUNs.
• Contain and manage the DRAM cache memory that accelerates host and
internal disk array read and write I/O.

Storage controllers are sometimes referred to as RAID controllers, or by different
names assigned by the system manufacturer. For example, Dell Unity controllers
are named Storage Processor A and Storage Processor B (SPA, SPB). Dell
PowerMax controllers are named Director 1 and Director 2.

Appendix: Deficiencies of General Purpose Server File
Sharing

The benefits of network file sharing degrade as more general-purpose servers are
added to share file system data with each other and with client systems. The major
issues with network file serving are:

• Lack of scalability: File server processing demands and complexity increase
rapidly as more servers and growing file systems are added to the sharing pool.
Performance decreases as the sharing environment grows.
• File system incompatibilities across operating systems: Windows file
systems are based on different protocols than Linux file systems. Cross
operating system file sharing requires complex folder and file metadata mapping
and conversions where Windows and Linux applications and users must access
files across their native file systems.
• Complex file sharing administration and data maintenance: Each file
server and its file systems, folders, and files must be administered individually.
Administering and maintaining servers and file systems individually is prone to
errors that can compromise data integrity, access, and security.

Appendix: NAS System Components

A NAS system consists of two components, controller and storage.

• NAS Controllers: A NAS controller is a compute system that contains
components such as network, memory, and CPU resources. A specialized
operating system optimized for file serving is installed on the controller. Each
controller may connect to all storage in the system. The controllers can be
active/active, with all controllers accessing the storage, or active/passive with
some controllers performing all the I/O processing while others act as spares. A
spare is used for I/O processing if an active controller fails. The controller is
responsible for configuration of RAID sets, creating LUNs, installing file
systems, and exporting the file shares on the network.
• File Data Storage: Similar to general purpose servers, block-based storage is
used to store NAS raw file data and metadata. Block storage controllers and
devices are integrated into the NAS system enclosure. To provide scalability
and higher capacities, midrange to enterprise NAS systems connect to external
high performance block-based storage systems over iSCSI or FC SAN.
• Disk Devices: Integrated NAS, and NAS heads can use different types of disk
devices to support mixed I/O performance and capacity requirements. NAS
systems of each type can support SSD, SAS, and SATA disk devices
simultaneously to add data tiering capabilities.

Appendix: Scale-Out NAS

The scale-out NAS implementation pools multiple NAS nodes together in a cluster.

• A node may consist of either the NAS head or the storage or both. The cluster
performs the NAS operation as a single entity.
• Scale-out NAS has the capability to scale resources by adding nodes to a
cluster.
− Nodes can be added to the cluster for more performance or storage capacity
without causing any downtime.
• The cluster works as a single NAS device and is managed centrally.

• All information is shared among nodes, so the entire file system is accessible by
clients connecting to any node in the cluster.
− Scale-out NAS stripes data across all nodes in a cluster along with mirror or
parity protection. As nodes are added, the file system grows dynamically,
and data is evenly distributed across the nodes.
• As data is sent from clients to the cluster, the data is divided and allocated to
different nodes in parallel.

Scale-out NAS clusters use separate internal and external networks for back-end
and front-end connectivity respectively. An internal network provides connections
for intra-cluster communication, and an external network connection enables clients
to access and share file data.

Each node in the cluster connects to the internal network. The internal network
offers high throughput and low latency and uses high-speed networking
technology, such as InfiniBand or Gigabit Ethernet. To enable clients to access a
node, the node must be connected to the external Ethernet network. Redundant
internal or external networks may be used for high availability.

Appendix: Scale-Out NAS Operation

New nodes can be added as required. As new nodes are added, the file system
grows dynamically and is evenly distributed to each node. As the client sends a file
to store on the NAS system, the file is evenly striped across the nodes.

When a client writes data, even though that client is connected to only one node,
the write operation occurs in multiple nodes in the cluster. This operation is also
true for read operations.

A client is connected to only one node at a time. However, when that client
requests a file from the cluster, the node to which the client is connected does not
have the entire file locally on its drives. The connected node retrieves and rebuilds
the file using the back-end InfiniBand network.
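
The write and read behavior described above can be sketched as round-robin striping across nodes. This is a simplified illustration under assumptions: the stripe and rebuild functions and the 4-byte stripe unit are hypothetical, nodes are modelled as in-memory lists, and the mirror or parity protection that a real cluster adds is omitted.

```python
def stripe(data: bytes, node_count: int, stripe_unit: int = 4):
    """Distribute fixed-size stripe units across nodes round-robin."""
    nodes = [[] for _ in range(node_count)]
    for index, offset in enumerate(range(0, len(data), stripe_unit)):
        nodes[index % node_count].append(data[offset:offset + stripe_unit])
    return nodes


def rebuild(nodes) -> bytes:
    """Reassemble the file by reading stripe units back in round-robin order."""
    chunks = []
    longest = max(len(units) for units in nodes)
    for row in range(longest):
        for units in nodes:
            if row < len(units):
                chunks.append(units[row])
    return b"".join(chunks)


# A client writes one file; its stripe units land on three nodes.
nodes = stripe(b"scale-out NAS stripes file data", node_count=3)
```

No single node holds the whole file, so the node serving a read has to collect stripe units from its peers, which is the role of the back-end network in the text above.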

Appendix: Use-Case for Scale-Out NAS: Data Lake

By limiting the number of parallel, linear data flows, the enterprises can:

• Consolidate vast amounts of their data into a single store, or a data lake,
through a native and simple ingestion process.
• Perform analytics on the data to provide detailed insight.
− Actions can be taken based on this insight in an iterative manner.
• Eliminate the cost of having discrete silos or islands of information spread
across the enterprises.

Scale-out NAS can provide the storage platform for this data lake. Scale-out
NAS enhances this paradigm by providing scaling capabilities in terms of
capacity, performance, security, and protection.

Appendix: NAS vs. Object-Based Storage

In addition to increasing amounts of data, there has also been a significant shift in
how people want and expect to access data. The rising adoption rate of
smartphones, tablets, and other mobile devices, combined with increasing
acceptance of them in enterprise workplaces, has resulted in an expectation for
on-demand access to data from anywhere and on any type of device.

Traditional storage solutions such as NAS, which is a dominant solution for storing
unstructured data, are limited:

• They cannot scale to the capacities required or provide universal access across
geographically dispersed locations.
• Data growth adds high overhead to the NAS in terms of managing a large number
of permissions and nested directories.
− File systems require more management as they scale and are limited in size.
− Performance degrades as the NAS file system size increases. Increasing file
system metadata capacity, a requirement of many new applications, is also
limited.

Object-based storage systems meet these challenges and can help manage data
growth at lower cost. Object-based storage provides extended
metadata capabilities and is highly scalable to keep up with rapidly growing data
storage and user access demands.

Appendix: OSD Controller Operations

The OSD controllers provide two key functions:

• Metadata Service: The metadata service is responsible for generating the
object ID from the contents (may also include other attributes of data) of a file. It
also maintains the mapping of the object IDs and the file system namespace. In
some implementations, the metadata service runs inside an application server.
• Storage Service: The storage service manages a set of disks on which the
user data is stored.

The OSD controllers connect to the storage via an internal network. The internal
network provides inter-controller, and controller-to-storage connectivity. The
application server accesses the controller to store and retrieve data over an
external network. OSD typically uses low-cost, high-density disk drives to store
objects. As more capacity is required, more disk drives can be added to the
system.
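
The split between the two controller functions can be sketched as two cooperating classes. This is a hedged toy model: the class and method names are hypothetical, SHA-256 is an assumed ID algorithm, disks are modelled as dictionaries, and the rule that picks a disk from the object ID is invented for the example.

```python
import hashlib


class StorageService:
    """Manages a set of disks; each disk is modelled as a dict."""

    def __init__(self, disk_count: int):
        self.disks = [{} for _ in range(disk_count)]

    def _disk_for(self, object_id: str) -> dict:
        # Hypothetical placement rule: pick a disk from the ID's hex value.
        return self.disks[int(object_id, 16) % len(self.disks)]

    def write(self, object_id: str, data: bytes) -> None:
        self._disk_for(object_id)[object_id] = data

    def read(self, object_id: str) -> bytes:
        return self._disk_for(object_id)[object_id]


class MetadataService:
    """Generates object IDs and maps namespace names to them."""

    def __init__(self, storage: StorageService):
        self.storage = storage
        self.namespace = {}

    def store(self, name: str, data: bytes) -> str:
        object_id = hashlib.sha256(data).hexdigest()
        self.storage.write(object_id, data)
        self.namespace[name] = object_id
        return object_id

    def retrieve(self, name: str) -> bytes:
        return self.storage.read(self.namespace[name])


metadata_service = MetadataService(StorageService(disk_count=4))
object_id = metadata_service.store("reports/q1.pdf", b"example contents")
```

The application server in this model talks only to the metadata service by name; which disk holds the data is a detail of the storage service.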

Appendix: Key Features of OSD Systems

Additional details for each OSD feature are:

• Scale-out architecture: Scalability has always been the most important
characteristic of enterprise storage systems, since the rationale of consolidating
storage assumes that the system can easily grow with aggregate demand. OSD
is based on a distributed scale-out architecture where each node in the cluster
contributes its resources to the total amount of space and performance.
Nodes are added independently to the cluster, which provides massive scaling to
support petabytes and even exabytes of capacity with billions of objects, making
it suitable for cloud environments.
• Multi-tenancy: Enables multiple applications to be securely served from the
same infrastructure. Each application is securely partitioned and data is neither
co-mingled nor accessible by other tenants. This feature is ideal for businesses
providing cloud services for multiple customers or departments within an
enterprise.
• Metadata-driven policy: Metadata and policy-based information management
capabilities combine to intelligently automate and drive data placement, data
protection, and other data services (compression, deduplication, retention, and
deletion) based on the service requirements. For example, when an object is
created, it is created on one node and subsequently copied to one or more
additional nodes, depending on the policies in place. The nodes can be within
the same data center or geographically dispersed.
• Global namespace: Another significant value of object storage is that it
presents a single global namespace to the clients. A global namespace
abstracts storage from the application and provides a common view that is
independent of location, making scaling seamless. This unburdens client
applications from the need to keep track of where data is stored. The global
namespace provides the ability to transparently spread data across storage
systems for greater performance, load balancing, and non-disruptive operation.
The global namespace is especially important when the infrastructure spans
multiple sites and geographies.
• Flexible data access method: OSD supports REST/SOAP APIs for
web/mobile access, and file sharing protocols (CIFS and NFS) for file service
access. Some OSD storage systems support HDFS interface for big data
analytics.
• Automated system management: OSD provides self-configuring and
auto-healing capabilities to reduce administrative complexity and downtime. With
respect to services or processes running in the OSD, there is no single point of
failure. If a service goes down, or a node or site becomes unavailable,
redundant components and services maintain normal operations.
• Data protection: The objects stored in an OSD are protected using two
methods: replication and erasure coding. The replication provides data
redundancy by creating an exact copy of an object. The replica requires the
same storage space as the source object. Based on the policy configured for
the object, one or more replicas are created and distributed across different
locations.
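
The capacity cost of the two protection methods can be compared with a little arithmetic. This sketch rests on assumptions: the function names are hypothetical, erasure coding is modelled in the usual way as data fragments plus coding fragments, and the 12 + 4 scheme and object size are example values, not figures from this guide.

```python
def replication_raw_bytes(object_bytes: int, replicas: int) -> int:
    """Raw capacity for the source object plus `replicas` full copies."""
    return object_bytes * (1 + replicas)


def erasure_coding_raw_bytes(object_bytes: int, data_fragments: int,
                             coding_fragments: int) -> float:
    """Raw capacity when an object is split into data plus coding fragments."""
    return object_bytes * (data_fragments + coding_fragments) / data_fragments


object_size = 12_000_000                      # hypothetical 12 MB object
two_replicas = replication_raw_bytes(object_size, replicas=2)
twelve_plus_four = erasure_coding_raw_bytes(object_size, 12, 4)
```

Under these assumptions, two replicas consume three times the object size, while the example erasure coding scheme consumes about 1.33 times, which illustrates why the choice between the methods is typically left to policy.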

Appendix: Virtual Appliance

A virtual appliance is a pre-configured virtual machine that is ready to run on a
hypervisor such as VMware ESXi. The virtual appliance includes the complete
operating environment and software to perform all of its functions.
