IT436-week 3 - Physical Layer-Storage

The document provides an overview of cloud computing storage technologies, focusing on magnetic disks, RAID configurations, SSD advantages and challenges, and various data access methods. It details RAID techniques such as striping, mirroring, and parity, along with their performance implications and levels. Additionally, it discusses storage virtualization methods and provisioning processes for managing storage resources effectively.

IT436-Cloud Computing

Networking

Introduction
Acknowledgment: The presentation contains some figures and text from the following sources:
Cloud Computing Theory and Practice book by Dan C. Marinescu
Information Storage and Management, John Wiley & Sons, Inc.
Cloud Networking Understanding Cloud-based Data Center Networks by Gary Lee
Shaimaa M. Mohamed
Storage - Magnetic Disks
▪ Circular platter constructed of nonmagnetic material, coated
with a magnetizable material.
▪ Data are recorded on and later retrieved from the disk via a
conducting coil named the head.
▪ The head is stationary while the platter rotates beneath it.
Storage - RAID
▪ Performance depends on the disk I/O operation.
▪ I/O operations depend on the computer system, the operating system,
and the nature of the I/O channel and disk controller hardware.
▪ The rate of improvement in secondary storage performance has been
considerably less than the rate for processors and main memory.
▪ This led to the development of arrays of disks that operate
independently and in parallel.
▪ With multiple disks, separate I/O requests can be handled in parallel.
▪ Furthermore, data can be organized so that redundancy can be
added to improve reliability.
Storage - RAID
▪ Improves storage performance by serving multiple write operations in
parallel.
▪ Can provide protection against disk failure.
▪ Depends mainly on three techniques:
1. Striping
2. Mirroring
3. Parity
Storage - RAID
Striping
▪ A technique to spread data across
multiple drives in order to use the
drives in parallel.
▪ Increases performance.
▪ Each drive in a RAID group has a
predefined number of contiguously
addressable blocks called a “strip”.
▪ A set of aligned strips that span across
all the drives within the RAID group
is called a “stripe”.
▪ No data protection in case of disk
failure.
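To make the strip/stripe layout concrete, here is a minimal sketch (not from the slides) of how striping maps a logical block to a drive and a block offset on that drive; the strip size and drive count are assumed example values.

```python
# Minimal sketch of RAID-0-style striping: map a logical block number to
# (drive index, block offset on that drive). Strip size and drive count
# are assumed example values, not taken from the slides.

STRIP_SIZE_BLOCKS = 4   # contiguous blocks per strip on one drive
NUM_DRIVES = 4          # drives in the RAID group

def locate_block(logical_block: int) -> tuple[int, int]:
    stripe_index, pos_in_stripe = divmod(logical_block, STRIP_SIZE_BLOCKS * NUM_DRIVES)
    strip_index, offset_in_strip = divmod(pos_in_stripe, STRIP_SIZE_BLOCKS)
    drive = strip_index                                   # which drive holds this strip
    block_on_drive = stripe_index * STRIP_SIZE_BLOCKS + offset_in_strip
    return drive, block_on_drive

# Consecutive strips land on different drives, so they can be read in parallel.
for lb in range(0, 20, 4):
    print(lb, locate_block(lb))
```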
Storage - RAID
Mirroring
▪ A technique in which the same data is
stored simultaneously on two different
drives.
▪ Data is protected against disk failure.
▪ Twice the amount of data being stored
is needed.
▪ Improves read performance because
read requests can be serviced by both
disks.
Storage - RAID
Parity
▪ Parity is a value derived by
performing an XOR on individual
strips of data and stored on a portion
of a RAID group.
▪ It enables the recreation of missing
data in case of a drive failure.
▪ Compared to mirroring, parity
implementation considerably reduces
the cost associated with data
protection.
▪ Parity is recalculated every time there
is a change in data, which may affect
the performance of the RAID array.
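As a quick illustration of the XOR parity idea (a sketch, not the array's actual implementation), the snippet below derives a parity strip from three data strips and uses it to rebuild one missing strip.

```python
# Sketch of XOR parity: parity = d0 XOR d1 XOR d2, and any one missing
# strip can be rebuilt by XOR-ing the parity with the surviving strips.

def xor_strips(*strips: bytes) -> bytes:
    result = bytearray(strips[0])
    for s in strips[1:]:
        for i, b in enumerate(s):
            result[i] ^= b
    return bytes(result)

d0, d1, d2 = b"AAAA", b"BBBB", b"CCCC"   # data strips of equal length
parity = xor_strips(d0, d1, d2)          # stored on the parity strip

# The drive holding d1 fails: rebuild it from the surviving strips + parity.
rebuilt_d1 = xor_strips(d0, d2, parity)
assert rebuilt_d1 == d1
```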
Storage - RAID Levels
▪ RAID levels are implementations using the three techniques explained.
▪ RAID 0 – uses striping.
▪ RAID 1 – uses mirroring.
▪ RAID 1+0 – a combination of RAID 1 and RAID 0.
▪ RAID 3, 5, and 6 – use a combination of striping and parity.
Storage - RAID Levels
RAID 0
Storage - RAID Levels
RAID 1
Storage - RAID Levels
RAID 1+0
Storage - RAID Levels
RAID 3

▪ Only a single redundant disk is required, no matter how large the disk array.
▪ Any I/O request involves the parallel transfer of data from all of
the data disks.
Storage - RAID Levels
RAID 5

▪ The distribution of parity strips across all drives avoids the potential
I/O bottleneck of a single dedicated parity disk.
Storage - RAID Levels
RAID 6

▪ In the RAID 6 scheme, two different parity calculations are carried
out and stored in separate blocks on different disks.
▪ Data can be regenerated even if two disks containing user data fail.
▪ Three disks would have to fail within the MTTR (mean time to repair)
interval to cause data to be lost.
▪ On the other hand, RAID 6 incurs a substantial write penalty, because
each write affects two parity blocks.
Storage - RAID Levels
Performance
▪ In RAID 5, every write (update) to a disk requires four I/O
operations: two disk reads (the old data and the old parity) and two
disk writes (the new parity and the updated data).
▪ In RAID 6, every write (update) to a disk requires six I/O
operations: three disk reads and three disk writes.
▪ In RAID 1, every write requires two disk writes.
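A sketch of the RAID 5 small-write update behind those four I/O operations (the byte values are made up): read the old data and old parity, compute the new parity as old parity XOR old data XOR new data, then write the new data and new parity.

```python
# Sketch of the RAID 5 "small write" penalty: updating one strip costs
# two reads (old data, old parity) and two writes (new data, new parity).
# The new parity is derived without touching the other data drives:
#   new_parity = old_parity XOR old_data XOR new_data

def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

old_data   = b"\x11\x22"    # read #1: old data strip
old_parity = b"\x55\x66"    # read #2: old parity strip
new_data   = b"\xaa\xbb"

new_parity = xor_bytes(xor_bytes(old_parity, old_data), new_data)
# write #1: new_data to the data drive; write #2: new_parity to the parity drive
```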
Storage - RAID Levels
Data Mapping for a RAID Level 0 Array
Storage - SSD
SSDs have the following advantages over HDDs:
▪ High-performance input/output operations per second (IOPS):
Significantly increases the performance of I/O subsystems.
▪ Durability: Less susceptible to physical shock and vibration.
▪ Longer lifespan: SSDs are not susceptible to mechanical wear.
▪ Lower power consumption: SSDs use considerably less power
than comparable-size HDDs.
▪ Quieter and cooler running capabilities: Less space required,
lower energy costs, and a greener enterprise.
▪ Lower access times and latency rates: Over 10 times faster than
the spinning disks in an HDD
Storage - SSD
SSDs have the following practical issues:
▪ SSD performance has a tendency to slow down as the device is
used.
▪ A flash memory cell becomes unusable after a certain number of
writes. A typical limit is 100,000 writes.
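A back-of-the-envelope endurance estimate based on that write limit (a sketch; the capacity and workload figures are assumed, and real drives also suffer write amplification):

```python
# Rough SSD endurance estimate under the stated 100,000-write limit per cell.
# Capacity and daily write volume are assumed example values; ideal wear
# leveling and no write amplification are assumed.

capacity_gb = 512
pe_cycle_limit = 100_000          # writes per cell before wear-out (from the slide)
daily_writes_gb = 200             # assumed workload

total_writable_gb = capacity_gb * pe_cycle_limit
lifespan_years = total_writable_gb / daily_writes_gb / 365
print(f"~{lifespan_years:,.0f} years before wear-out (ideal case)")
```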
Storage
Data Access Methods from Storage Systems

▪ Data is stored and accessed by applications using the underlying
storage infrastructure.
▪ The key components of this infrastructure are the OS (or file
system), connectivity, and storage.
▪ The server controller card accesses the storage devices using
predefined protocols, such as IDE/ATA, SCSI, or Fibre Channel
(FC).
▪ External storage devices can be connected to the servers directly
or through the storage network.
▪ By using these SAN features and protocols, data stored in the
storage systems can be accessed by various methods.
Storage
Data Access Methods from Storage Systems

▪ The application requests data from the file system or operating
system by specifying the filename and location.
▪ The file system has two components:
➢ User component
➢ Storage component - maps to the physical location.
▪ The file system maps the file attributes to the logical block
address of the data and sends the request to the storage device.
The storage device converts the logical block address (LBA) to
cylinder-head-sector (CHS) address and fetches the data.
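A sketch of the LBA-to-CHS conversion described above; the drive geometry used here is an assumed example, not a real device.

```python
# Sketch: convert a logical block address (LBA) to a cylinder-head-sector
# (CHS) address. The geometry below is an assumed example geometry.

HEADS_PER_CYLINDER = 16
SECTORS_PER_TRACK = 63

def lba_to_chs(lba: int) -> tuple[int, int, int]:
    cylinder = lba // (HEADS_PER_CYLINDER * SECTORS_PER_TRACK)
    head = (lba // SECTORS_PER_TRACK) % HEADS_PER_CYLINDER
    sector = (lba % SECTORS_PER_TRACK) + 1      # sectors are numbered from 1
    return cylinder, head, sector

print(lba_to_chs(0))       # (0, 0, 1)
print(lba_to_chs(1008))    # one full cylinder in: (1, 0, 1)
```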
Storage
Accessing Data from the Intelligent Storage Systems

Depending on the type of data access method used for a storage
system, the controller can be classified as:
▪ Block Level
▪ File Level
▪ Object Level
Storage
Block Level Access

▪ A storage volume (a logical unit of storage composed of
multiple blocks, typically created from a RAID set) is created
and assigned to the compute system to house the file systems
created on it.
▪ An application data request is sent to the file system and
converted into logical block address request.
▪ This block level request is sent over the network to the storage
system. The storage system then converts the logical block
address to a CHS address and fetches the data in block-sized
units.
Components of a Controller
Front End

[Figure: compute systems connect through the storage network to the controller's front-end ports and front-end controllers; I/O then passes through the cache to the back end and on to storage.]
Components of a Controller
Cache

[Figure: the cache sits inside the controller between the front end and the back end.]


Read Operation with Cache

• Read hit – data found in cache:
1. Compute system sends a read request.
2. Data is sent from cache to the compute system.

• Read miss – data not found in cache:
1. Compute system sends a read request.
2. The request is forwarded to the storage drives.
3. Data is copied from the drives into cache.
4. Data is sent to the compute system.
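A minimal sketch of the read-hit/read-miss flow; the dictionaries standing in for the cache and the drives are illustrative only, not any real controller's structures.

```python
# Sketch of the read path: a hit is served from cache; a miss is read from
# the drives, copied into cache, and then returned to the compute system.

cache: dict[int, bytes] = {}
drives = {block: f"data-{block}".encode() for block in range(100)}  # stand-in back end

def read_block(block: int) -> bytes:
    if block in cache:            # read hit
        return cache[block]
    data = drives[block]          # read miss: request forwarded to the drives
    cache[block] = data           # data copied into cache
    return data                   # then sent to the compute system

read_block(7)   # miss: fetched from the drives and cached
read_block(7)   # hit: served from cache
```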


Write Operation with Cache

• Write-through cache:
1. Compute system writes data to cache.
2. Data is written from cache to the storage drives.
3. The drives acknowledge the write to cache.
4. The write is acknowledged to the compute system.

• Write-back cache:
1. Compute system writes data to cache.
2. The write is acknowledged to the compute system immediately.
3. Data is later written (de-staged) from cache to the storage drives.
4. The drives acknowledge the write to cache.
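The sketch below contrasts the two write policies; the dictionaries and the flush() helper are illustrative stand-ins for the back end, not a real controller API.

```python
# Sketch of the two write policies. With write-through, the write is
# acknowledged only after it reaches the drives; with write-back, it is
# acknowledged once it lands in cache and is de-staged later.

cache: dict[int, bytes] = {}
drives: dict[int, bytes] = {}
dirty: set[int] = set()

def write_through(block: int, data: bytes) -> str:
    cache[block] = data
    drives[block] = data          # committed to the drives before the ack
    return "ack"

def write_back(block: int, data: bytes) -> str:
    cache[block] = data
    dirty.add(block)              # will be de-staged (flushed) later
    return "ack"                  # acknowledged immediately

def flush() -> None:
    for block in list(dirty):
        drives[block] = cache[block]
        dirty.discard(block)
```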


Cache Management: Algorithms

• Least recently used (LRU)
– Discards data that has not been accessed for a long time

• Most recently used (MRU)
– Discards data that has been most recently accessed

[Figure: new data enters the cache while LRU/MRU data is discarded.]
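A minimal LRU eviction sketch using an OrderedDict; the cache size is an assumed example value, and MRU would simply evict from the opposite end.

```python
# Sketch of LRU cache management: the least recently used block is evicted
# when the cache is full. For MRU, the most recently used block would be
# evicted instead (popitem(last=True)).

from collections import OrderedDict

CACHE_SIZE = 3
cache: "OrderedDict[int, bytes]" = OrderedDict()

def access(block: int, data: bytes) -> None:
    if block in cache:
        cache.move_to_end(block)                 # mark as most recently used
    cache[block] = data
    if len(cache) > CACHE_SIZE:
        evicted, _ = cache.popitem(last=False)   # LRU eviction
        print(f"evicted block {evicted}")

for b in [1, 2, 3, 1, 4]:   # block 2 becomes the LRU entry and gets evicted
    access(b, b"x")
```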


Cache Management: Watermarking

• Manages I/O bursts through the flushing process
– Flushing is the process of committing data from cache to the storage drives

• Three modes of flushing to manage cache utilization are:
– Idle flushing
– High watermark flushing
– Forced flushing

[Figure: cache utilization scale from the low watermark (LWM) through the high watermark (HWM) to 100%, marking the regions for idle, high watermark, and forced flushing.]
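A sketch of how a controller might pick a flushing mode from cache utilization relative to the two watermarks; the threshold values are assumed examples.

```python
# Sketch of watermark-based flushing: the flushing mode is chosen from the
# current cache utilization relative to the low/high watermarks.
# Threshold values are assumed examples.

LWM = 0.40   # low watermark
HWM = 0.80   # high watermark

def flushing_mode(cache_utilization: float) -> str:
    if cache_utilization >= 1.0:
        return "forced flushing"          # cache full: flush aggressively
    if cache_utilization >= HWM:
        return "high watermark flushing"  # flush faster to fall back toward LWM
    return "idle flushing"                # modest background flushing

print(flushing_mode(0.30), flushing_mode(0.90), flushing_mode(1.0))
```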


Cache Data Protection

• Protects data in the cache against power or cache failures:
– Cache mirroring
• Provides protection to data against cache failure
• Each write to the cache is held in two different memory locations on
two independent memory cards
– Cache vaulting
• Provides protection to data against power failure
• In the event of power failure, uncommitted data is dumped to a
dedicated set of drives called vault drives


Components of a Controller
Back End

[Figure: the back-end ports and back-end controllers connect the cache to the storage drives.]


Storage

[Figure: the storage drives sit behind the controller's back end.]


Storage
File Level Access

▪ The file system is created on a separate file server, which is
connected to storage.
▪ A file-level request from the application is sent over the network
to the file server hosting the file system.
▪ The file system then converts the file-level request into block-
level addressing and sends the request to the storage to access
the data
File-based Storage System

• A dedicated, high-performance
file server with
storage (also known as
Network-attached Storage)
• Enables clients to share files
over an IP network
– Supports data sharing for
UNIX and Windows users
• Uses a specialized OS that is
optimized for file I/O
NAS Deployment Options

• The two common NAS deployment options are:
– Traditional NAS (scale-up NAS)
– Scale-out NAS
• Traditional NAS
– Capacity and performance of a single system is scaled by
upgrading or adding NAS components
• Scale-out NAS
– Multiple processing and storage nodes are pooled in a cluster that
works as a single NAS device
– Addition of nodes scales cluster capacity and performance without
disruption
Storage
Object Level Access

▪ Data is accessed over the network in terms of self-contained
objects, each having a unique object identifier.
▪ The application request is sent to the file system.
▪ The file system communicates to the object-based storage device
(OSD) interface, which in turn sends the object-level request by
using the unique object ID over the network to the storage
system.
▪ The storage system has an OSD storage component that is
responsible for managing the access to the object on the storage
system.
▪ The OSD storage component converts the object level request
into block-level addressing and sends it to the storage to access
the data.
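A toy sketch of object-level access: data is stored and retrieved by a unique object ID rather than by file path or block address. The in-memory dictionary here merely stands in for the OSD storage component.

```python
# Toy sketch of object-level access: data is stored and retrieved as
# self-contained objects addressed by a unique object identifier.

import uuid

object_store: dict[str, bytes] = {}

def put_object(data: bytes) -> str:
    object_id = str(uuid.uuid4())       # unique object identifier
    object_store[object_id] = data
    return object_id

def get_object(object_id: str) -> bytes:
    return object_store[object_id]      # access by ID, not by path or LBA

oid = put_object(b"report contents")
assert get_object(oid) == b"report contents"
```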
Object-based Storage System (Cont'd)
Storage - Virtualization
▪ is the pooling of multiple physical storage arrays from SANs,
making them appear as a single virtual storage device.
▪ The pool can integrate unlike storage hardware from different
networks, vendors, or data centers into one logical view that can
be managed from a single pane of glass.
Storage - Virtualization
Storage virtualization can easily be used to address the following
top 7 challenges:
▪ Vendor lock-in
▪ Data migration across arrays
▪ Scalability
▪ Redundancy
▪ Performance
▪ High costs
▪ Management
Storage - Virtualization
▪ One major reason why companies are switching to a storage
virtualization model is the need to consolidate and manage all
existing storage under a single console, while also leveraging a
set of diverse features and functionalities.
▪ A storage virtualization node is essentially a virtual controller
that virtualizes and manages your physical storage.
▪ All your disk arrays are placed inside a “virtual pool” and thin-
provisioned for maximum capacity.
▪ Once the virtual pool is created, “virtual disks” are also created
in the pool and then presented to your host servers as raw LUNs
to store data.
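A sketch of that pooling idea: capacity from dissimilar arrays is combined into one virtual pool, and virtual disks are carved from it and presented to hosts. The array names and sizes below are made up for illustration.

```python
# Sketch of a storage virtualization pool: capacity from dissimilar arrays
# is combined into one pool, and virtual disks (presented to hosts as LUNs)
# are carved out of it. Array names and sizes are illustrative only.

physical_arrays_tb = {"array-A": 40, "array-B": 25, "array-C": 60}

class VirtualPool:
    def __init__(self, arrays: dict[str, int]):
        self.capacity_tb = sum(arrays.values())     # single logical view of all arrays
        self.virtual_disks: dict[str, int] = {}

    def create_virtual_disk(self, name: str, size_tb: int) -> None:
        # Thin provisioning (as on the slide) would defer real allocation;
        # here we only track the advertised size.
        self.virtual_disks[name] = size_tb

pool = VirtualPool(physical_arrays_tb)
pool.create_virtual_disk("vdisk-01", 10)            # presented to a host as a raw LUN
print(pool.capacity_tb, pool.virtual_disks)
```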
Storage - Virtualization
▪ This means you can start mixing arrays from your HPE 3Par,
Dell Compellent, Pure Storage, or any SAN inside your data
center and consolidate them into a virtual pool.
▪ In addition, application data can be striped across all the
arrays, regardless of how fast or slow the disks may be.
Storage - Virtualization
Methods
Host-based Storage Virtualization
▪ is software-based and most often seen in HCI systems and cloud
storage.
▪ The host presents virtual drives of varying capacity to the guest
machines, whether they are VMs in an enterprise environment,
physical servers or PCs accessing file shares or cloud storage.
▪ All of the virtualization and management are done at the host
level via software, and the physical storage can be almost any
device or array.
▪ Some server OSes have storage virtualization capabilities built in,
such as Windows Server Storage Spaces.
Storage - Virtualization
Methods
Array-based Storage Virtualization
▪ most commonly refers to the method in which a storage array
acts as the primary storage controller and runs virtualization
software, enabling it to pool the storage resources of other arrays
and to present different types of physical storage for use as
storage tiers.
▪ A storage tier may comprise solid-state drives (SSDs) or HDDs
on the various virtualized storage arrays.
▪ The physical location and specific array are hidden from the
servers or users accessing the storage.
Storage - Virtualization
Methods
Network-based Storage Virtualization
▪ is the most common form used in enterprises today.
▪ A network device, such as a smart switch or purpose-built
server, connects to all storage devices in an FC or iSCSI SAN
and presents the storage in the storage network as a single,
virtual pool.
Storage - Virtualization
Storage Pool
Storage - Virtualization

▪ A LUN (logical unit number) is created by abstracting the identity
and internal function of storage system(s) and appears as
physical storage to the compute system.
▪ The mapping of virtual to physical storage is performed by the
virtualization layer.
▪ LUNs are assigned to the compute system to create a file system
for storing and managing files.
Storage - Virtualization

▪ LUN masking is a process that provides data access control by
defining which LUNs a compute system can access.
▪ In a cloud environment, the LUNs are created and assigned to
different services based on the requirements.
Overview of Storage Provisioning
Storage Provisioning
The process of assigning storage resources to compute systems based on
capacity, availability, and performance requirements.

• Can be performed in two ways:
– Traditional storage provisioning
– Virtual storage provisioning
Traditional Provisioning

[Figure: Compute 1 and Compute 2 access LUN 0 and LUN 1 through the storage network and the controller's front end, cache, and back end; each LUN is carved from a RAID set on the storage system.]
LUN Expansion
MetaLUN
A method to expand LUNs that require additional
capacity or performance.

• Created by combining two or more LUNs

• MetaLUNs can either be concatenated or striped

• Concatenated metaLUN
– Provides only additional capacity but no performance benefit
– Expansion is quick as data is not restriped

• Striped metaLUN
– Provides both additional capacity and improved performance
– Expansion is slow as data is restriped
Virtual Provisioning

[Figure: Compute 1 sees Thin LUN 0 with a reported capacity of 10 TB but only 3 TB allocated, and Compute 2 sees Thin LUN 1 with a reported capacity of 10 TB but only 4 TB allocated; both thin LUNs draw their allocated capacity from a shared storage pool behind the controller.]
Expanding Thin LUNs

[Figure: two expansion options – Storage Pool Expansion, where storage drives are added to the pool and thin pool rebalancing redistributes data, and Thin LUN Expansion, where the user capacity of the thin LUN is increased beyond its current in-use capacity.]
Traditional Provisioning Vs. Virtual Provisioning

[Figure: on a 2 TB storage system, traditional provisioning allocates LUN 1 (500 GB), LUN 2 (550 GB), and LUN 3 (800 GB) up front even though they hold only 100 GB, 200 GB, and 50 GB of actual data, leaving about 1.5 TB allocated but unused and only 150 GB available. With virtual provisioning, the thin LUNs draw only the 350 GB of actual data from the pool, leaving about 1.65 TB available.]
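A sketch of why virtual (thin) provisioning leaves more capacity available: each thin LUN reports its full size to the host but consumes pool capacity only for data actually written. The numbers follow the comparison figure above.

```python
# Sketch of thin provisioning: each thin LUN reports its full user capacity
# to the compute system but consumes pool capacity only for written data.
# Numbers follow the traditional-vs-virtual comparison figure.

pool_capacity_gb = 2000
thin_luns = {"LUN 1": 500, "LUN 2": 550, "LUN 3": 800}    # reported (user) capacity
written_gb = {"LUN 1": 100, "LUN 2": 200, "LUN 3": 50}    # data actually stored

consumed = sum(written_gb.values())                       # 350 GB drawn from the pool
available = pool_capacity_gb - consumed                   # ~1650 GB still available

# Under traditional provisioning the full LUN sizes are allocated up front:
traditional_available = pool_capacity_gb - sum(thin_luns.values())   # only 150 GB left
print(available, traditional_available)
```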


LUN Masking
A process that provides data access control by defining which LUNs a
compute system can access.

• Implemented on the storage system
• Prevents unauthorized or accidental use of LUNs in a shared
environment
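A sketch of LUN masking as an access-control table kept on the storage system: each compute system is allowed to see only its assigned LUNs. The host and LUN identifiers are illustrative.

```python
# Sketch of LUN masking: the storage system keeps a table of which LUNs
# each compute system may access and rejects everything else.
# Host and LUN identifiers are illustrative.

masking_table: dict[str, set[str]] = {
    "compute-1": {"LUN 0"},
    "compute-2": {"LUN 1", "LUN 2"},
}

def can_access(host: str, lun: str) -> bool:
    return lun in masking_table.get(host, set())

assert can_access("compute-1", "LUN 0")
assert not can_access("compute-1", "LUN 1")   # masked: prevents accidental use
```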
Questions?

Shaimaa M. Mohamed