Paper 1 SAN


RAID Techniques

> There are three RAID techniques


1. striping
2. mirroring
3. parity
Striping

> Striping is a technique to spread data across multiple drives (more than one) to use the drives
in parallel.
> All the read-write heads work simultaneously, allowing more data to be processed in a shorter
time and increasing performance, compared to reading and writing from a single disk.
> Within each disk in a RAID set, a predefined number of contiguously addressable disk
blocks are defined as a strip.
> The set of aligned strips that spans across all the disks within the RAID set is called a stripe.
> The below figure shows physical and logical representations of a striped RAID set.
> Strip size (also called stripe depth) describes the number of blocks in a strip and is the
maximum amount of data that can be written to or read from a single disk in the set.
> All strips in a stripe have the same number of blocks.
> Having a smaller strip size means that data is broken into smaller pieces while spread
across the disks.

Storage Area Networks Module-2


> Stripe size is the strip size multiplied by the number of data disks in the RAID set.
Eg: In a 5-disk striped RAID set with a strip size of 64 KB, the stripe size is 320 KB
(64 KB x 5).
> Stripe width refers to the number of data strips in a stripe.
> Striped RAID does not provide any data protection unless parity or mirroring is used.
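
The strip and stripe arithmetic above can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and the block-to-disk mapping assumes strips are handed out to the disks round-robin with no parity or mirroring, which the notes do not spell out.

```python
# Illustrative sketch of striping arithmetic; all names here are hypothetical.

def stripe_size_kb(strip_size_kb: int, data_disks: int) -> int:
    """Stripe size = strip size x number of data disks."""
    return strip_size_kb * data_disks


def locate_block(logical_block: int, blocks_per_strip: int, data_disks: int):
    """Map a logical block to (disk index, stripe index, offset within strip).

    Assumes strips are assigned to disks round-robin (an assumption of this sketch).
    """
    strip_number, offset = divmod(logical_block, blocks_per_strip)
    stripe_index, disk_index = divmod(strip_number, data_disks)
    return disk_index, stripe_index, offset


print(stripe_size_kb(64, 5))      # 320, the 5-disk, 64 KB example
print(locate_block(0, 128, 5))    # (0, 0, 0)
print(locate_block(700, 128, 5))  # (0, 1, 60)
```

With 128 blocks per strip and five disks, block 700 falls in strip 5, which wraps around to the first disk in the second stripe.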


Fig: Striped RAID set


Mirroring

> Mirroring is a technique whereby the same data is stored on two different disk drives,
yielding two copies of the data.
> If one disk drive failure occurs, the data is intact on the surviving disk drive (see Fig
below) and the controller continues to service the host's data requests from the surviving disk
of a mirrored pair.

> When the failed disk is replaced with a new disk, the controller copies the data from the
surviving disk of the mirrored pair.


> This activity is transparent to the host.

Fig: Mirrored disks in an array

Advantages:
Complete data redundancy
Fast recovery from disk failure
Data protection

> Mirroring is not a substitute for data backup. Mirroring constantly captures changes in the data,
whereas a backup captures point-in-time images of the data.

Disadvantages:
Mirroring involves duplication of data, so the amount of storage capacity needed is
twice the amount of data being stored.
Expensive
Parity

> Parity is a method to protect striped data from disk drive failure without the cost of
mirroring.
> An additional disk drive is added to hold parity, a mathematical construct that allows
re-creation of the missing data.
> Parity is a redundancy technique that ensures protection of data without maintaining a full
set of duplicate data.
> Calculation of parity is a function of the RAID controller.
> Parity information can be stored on separate, dedicated disk drives or distributed across all the
drives in a RAID set.

> Fig shows a parity RAID set.

Fig: Parity RAID (data disks and a parity disk)
> The first four disks, labeled "Data Disks," contain the data. The fifth disk, labeled "Parity
Disk," stores the parity information, which, in this case, is the sum of the elements in each
row.

> Now, if one of the data disks fails, the missing value can be calculated by subtracting the sum
of the rest of the elements from the parity value.
> Here, computation of parity is represented as an arithmetic sum of the data. However, parity
calculation is a bitwise XOR operation.
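
A minimal Python sketch of XOR parity follows, assuming four equally sized data strips and one parity strip. The function names and sample byte values are made up for illustration, and a real RAID controller does this in hardware or firmware rather than in code like this.

```python
from functools import reduce


def xor_parity(strips):
    """Return the bytewise XOR of equally sized strips."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*strips))


def rebuild_missing(surviving_strips, parity):
    """Recover the single missing strip by XOR-ing the survivors with the parity."""
    return xor_parity(list(surviving_strips) + [parity])


# Four data strips (sample values are arbitrary) and their parity strip.
data = [bytes([4, 9]), bytes([6, 1]), bytes([1, 7]), bytes([7, 2])]
parity = xor_parity(data)

# Simulate losing the third data disk and rebuilding it from the rest.
lost = data[2]
recovered = rebuild_missing(data[:2] + data[3:], parity)
print(recovered == lost)  # True
```

XOR works here because it is its own inverse: XOR-ing the parity with every surviving strip cancels them out and leaves the missing strip.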
NETWORK ATTACHED STORAGE (NAS)

> NAS is an IP-based, dedicated, high-performance file sharing and storage device.

> It enables NAS clients to share files over an IP network.

> It uses network and file-sharing protocols to provide access to the file data.
> Ex: Common Internet File System (CIFS) and Network File System (NFS).

> It enables both UNIX and Microsoft Windows users to share the same data seamlessly.

> A NAS device uses its own operating system and integrated hardware and software components to
meet specific file-service needs.
> Its operating system is optimized for file I/O, so it performs file I/O better than a
general-purpose server.

> A NAS device can serve more clients than general-purpose servers and provide the benefit of
server consolidation.
Components of NAS
> A NAS device has two key components (as shown in Fig 2.33): NAS head and storage.
> In some NAS implementations, the storage could be external to the NAS device and shared with
other hosts.

> NAS head includes the following components:

CPU and memory

One or more network interface cards (NICs), which provide connectivity to the client
network

An optimized operating system for managing the NAS functionality. It translates file-level
requests into block-storage requests and further converts the data supplied at the
block level to file data

NFS, CIFS, and other protocols for file sharing

Industry-standard storage protocols and ports to connect and manage physical disk
resources

> The NAS environment includes clients accessing a NAS device over an IP network using
file-sharing protocols.

Storage Area Networks Module 3

Fig 2.33: Components of NAS (UNIX client over NFS and Windows client over CIFS; NAS head with network interface, NFS, CIFS, NAS device OS and storage interface; storage array)
Read Operation with Cache
> When a host issues a read request, the storage controller reads the tag RAM to determine
whether the required data is available in cache.
> If the requested data is found in the cache, it is called a read cache hit or read hit and
data is sent directly to the host, without any disk operation (see Fig [a]). This provides a
fast response time to the host (about a millisecond).
> If the requested data is not found in cache, it is called a cache miss and the data must be
read from the disk (see Fig [b]). The back-end controller accesses the appropriate disk and
retrieves the requested data. Data is then placed in cache and is finally sent to the host
through the front-end controller.
> Cache misses increase I/O response time.
> A Pre-fetch, or Read-ahead, algorithm is used when read requests are sequential. In a
sequential read request, a contiguous set of associated blocks is retrieved. Several other
blocks that have not yet been requested by the host can be read from the disk and placed
into cache in advance. When the host subsequently requests these blocks, the read
operations will be read hits.

> This process significantly improves the response time experienced by the host.
> The intelligent storage system offers fixed and variable pre-fetch sizes.
> In fixed pre-fetch, the intelligent storage system pre-fetches a fixed amount of data. It is
most suitable when I/O sizes are uniform.

> In variable pre-fetch, the storage system pre-fetches an amount of data in multiples of the size
of the host request.
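
The read hit, read miss, and fixed pre-fetch behaviour described above can be sketched with a toy model. Everything here is an assumption made for illustration: the `ReadCache` class is hypothetical, a plain dictionary stands in for the physical disks, and there is no tag RAM, cache size limit, or eviction.

```python
class ReadCache:
    """Toy read cache with fixed pre-fetch; not a model of any real storage array."""

    def __init__(self, disk, prefetch_blocks=0):
        self.disk = disk                  # dict: block number -> data (stands in for the disks)
        self.cache = {}                   # blocks currently held in cache
        self.prefetch_blocks = prefetch_blocks
        self.hits = 0
        self.misses = 0

    def read(self, block):
        if block in self.cache:           # read hit: served without any disk operation
            self.hits += 1
            return self.cache[block]
        self.misses += 1                  # read miss: go to disk, then fill the cache
        last = block + self.prefetch_blocks
        for b in range(block, last + 1):  # requested block plus a fixed read-ahead
            if b in self.disk:
                self.cache[b] = self.disk[b]
        return self.cache[block]


disk = {n: f"data-{n}" for n in range(16)}
cache = ReadCache(disk, prefetch_blocks=3)
for n in range(8):                        # sequential read of blocks 0..7
    cache.read(n)
print(cache.hits, cache.misses)           # 6 2
```

With a read-ahead of three blocks, only blocks 0 and 4 miss in this sequential run. With `prefetch_blocks=0`, every one of the eight reads would be a miss.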

Fig: Read hit and read miss: (a) data found in cache = read hit, (b) data not found in cache = read miss

Write Operation with Cache


> Write operations with cache provide performance advantages over writing directly to
disks.

> When an I/O is written to cache and acknowledged, it is completed in far less time (from
the host's perspective) than it would take to write directly to disk.
> Sequential writes also offer opportunities for optimization because many smaller writes
can be coalesced for larger transfers to disk drives with the use of cache.
> A write operation with cache is implemented in the following ways:
> Write-back cache: Data is placed in cache and an acknowledgment is sent to the host
immediately. Later, data from several writes are committed to the disk. Write response
times are much faster, as the write operations are isolated from the mechanical delays of
the disk. However, uncommitted data is at risk of loss in the event of cache failures.
> Write-through cache: Data is placed in the cache and immediately written to the disk, and an
acknowledgment is sent to the host.
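
The difference between the two write policies can be illustrated with a small Python sketch. The `WriteCache` class, its `flush` method, and the dictionary standing in for the disks are all hypothetical, and the model ignores cache mirroring, battery backup, and other protections a real array would use.

```python
class WriteCache:
    """Toy write cache; `disk` is a dict standing in for the physical disks."""

    def __init__(self, disk, write_back=True):
        self.disk = disk
        self.write_back = write_back
        self.dirty = {}                   # data acknowledged but not yet on disk

    def write(self, block, data):
        if self.write_back:
            self.dirty[block] = data      # write-back: held in cache, committed later
        else:
            self.disk[block] = data       # write-through: written to disk immediately
        return "ack"                      # acknowledgment returned to the host

    def flush(self):
        """Commit all pending writes to disk in one pass."""
        self.disk.update(self.dirty)
        self.dirty.clear()


disk = {}
wb = WriteCache(disk, write_back=True)
wb.write(1, "a")
wb.write(2, "b")
print(disk)   # {} : nothing committed yet, so a cache failure now would lose both writes
wb.flush()
print(disk)   # {1: 'a', 2: 'b'}
```

In write-back mode the two writes reach the disk together in a single flush, which is the coalescing opportunity noted above. In write-through mode each write lands on disk before the call returns.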
NAS File Sharing Protocols
> NAS devices support multiple file-service protocols to handle file I/O requests.

> Two common NAS file sharing protocols are:

Common Internet File System (CIFS)
Network File System (NFS)

> NAS devices enable users to share file data across different operating environments

> They provide a means for users to migrate transparently from one operating system to another.
Network File System (NFS)
> NFS is a client-server protocol for file sharing that is commonly used on UNIX systems.

> NFS was originally based on the connectionless User Datagram Protocol (UDP).

> It uses Remote Procedure Call (RPC) as a method of inter-process communication between two
computers.

> The NFS protocol provides a set of RPCs to access a remote file system for the following
operations:

Searching files and directories
Opening, reading, writing to, and closing a file
Changing file attributes
Modifying file links and directories
> NFS creates a connection between the client and the remote system to transfer data.

> NFSv3 and earlier is a stateless protocol.

> It does not maintain any kind of table to store information about open files and associated
pointers. Each call provides a full set of arguments - a file handle, a particular position to read or
write, and the versions of NFS - to access files on the server.
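
The idea of a stateless call can be sketched in Python as follows. This is not the NFS wire format or its RPC interface. `ReadCall` and `StatelessServer` are hypothetical names, and the point is only that every request carries its own file handle and position, so the server needs no table of open files.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReadCall:
    """A self-contained read request (hypothetical, not the real NFS message layout)."""
    file_handle: str   # identifies the file on the server
    offset: int        # position to read from
    count: int         # number of bytes requested


class StatelessServer:
    """Toy server that keeps no table of open files or file pointers."""

    def __init__(self, files):
        self.files = files  # dict: file handle -> file contents as bytes

    def read(self, call: ReadCall) -> bytes:
        # Everything needed to serve the request arrives with the call itself.
        data = self.files[call.file_handle]
        return data[call.offset:call.offset + call.count]


server = StatelessServer({"fh-1": b"hello, storage"})
print(server.read(ReadCall("fh-1", 7, 7)))   # b'storage'
print(server.read(ReadCall("fh-1", 0, 5)))   # b'hello'
```

Because no per-client state is kept, the two calls can arrive in any order, or be repeated, and return the same result.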

> Currently, three versions of NFS are in use:

1. NFS version 2 (NFSv2): Uses UDP to provide a stateless network connection between a
client and a server. Features, such as locking, are handled outside the protocol.
2. NFS version 3 (NFSv3): Uses UDP or TCP, and is based on the stateless protocol
design. It includes some new features, such as a 64-bit file size, asynchronous writes, and
additional file attributes to reduce refetching.
3. NFS version 4 (NFSv4): Uses TCP and is based on a stateful protocol design. It offers
enhanced security. NFS version 4.1 is an enhancement of NFSv4 and includes
some new features, such as session model, parallel NFS (pNFS), and data retention.

Common Internet File System (CIFS)


> CIFS is a client-server application protocol.

> It enables clients to access files and services on remote computers over TCP/IP.
> It is a public, or open, variation of Server Message Block (SMB) protocol.
> It provides the following features to ensure data integrity:
It uses file and record locking to prevent users from overwriting the work of another user
on a file or a record.

It supports fault tolerance and can automatically restore connections and reopen files that
were open prior to an interruption. This feature depends on whether an application is
written to take advantage of this.
> CIFS is a stateful protocol because the CIFS server maintains connection information
regarding every connected client. If a network failure or CIFS server failure occurs, the
client receives a disconnection notification. User disruption is minimized if the
application has the embedded intelligence to restore the connection. However, if the
embedded intelligence is missing, the user must take steps to reestablish the CIFS
connection.
RAID Levels

> RAID level selection is determined by the following factors:

Application performance
Data availability requirements
Cost

> RAID levels are defined on the basis of:

Striping
Mirroring
Parity techniques
> Some RAID levels use a single technique whereas others use a combination of techniques.
> Table shows the commonly used RAID levels
Table: RAID Levels

LEVELS    BRIEF DESCRIPTION
RAID 0    Striped set with no fault tolerance
RAID 1    Disk mirroring
Nested    Combinations of RAID levels. Example: RAID 1 + RAID 0
RAID 3    Striped set with parallel access and a dedicated parity disk
RAID 4    Striped set with independent disk access and a dedicated parity disk
RAID 5    Striped set with independent disk access and distributed parity
RAID 6    Striped set with independent disk access and dual distributed parity

RAID 0

> RAID 0 configuration uses data striping techniques, where data is striped across all the disks
within a RAID set. Therefore it utilizes the full storage capacity of a RAID set.
> To read data, all the strips are put back together by the controller.
> Fig shows RAID 0 in an array in which data is striped across five disks.

Fig: RAID 0 (data from host, RAID controller, disks)
> When the number of drives in the RAID set increases, performance improves because more
data can be read or written simultaneously.
> RAID 0 is a good option for applications that need high I/O throughput.
> However, if these applications require high availability during drive failures, RAID 0 does not
provide data protection and availability.
RAID 1

> RAID 1 is based on the mirroring technique.

> In this RAID configuration, data is mirrored to provide fault tolerance (see Fig).
> A RAID 1 set consists of two disk drives and every write is written to both disks.
> The mirroring is transparent to the host.
> During disk failure, the impact on data recovery in RAID 1 is the least among all RAID
implementations. This is because the RAID controller uses the mirror drive for data recovery.
> RAID 1 is suitable for applications that require high availability and for which cost is no constraint.

Fig: RAID 1 (data from host, RAID controller, mirror sets, disks)

Nested RAID

> Most data centers require data redundancy and performance from their RAID arrays.
> RAID 1+0 and RAID 0+1 combine the performance benefits of RAID 0 with the redundancy
benefits of RAID 1.

> They use striping and mirroring techniques and combine their benefits.
> These types of RAID require an even number of disks, the minimum being four (see
Fig).
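
As a rough illustration of how striping and mirroring combine, the sketch below places strips in a RAID 1+0 style layout. It assumes that disks are grouped into mirrored pairs (0, 1), (2, 3), and so on, and that strips are spread round-robin across the pairs. The function name and disk numbering are hypothetical.

```python
def raid10_disks_for_strip(strip_number: int, total_disks: int):
    """Return the two disks holding a strip in a striped set of mirrored pairs.

    Assumes disks (0, 1), (2, 3), ... form mirrored pairs and strips are
    spread round-robin across the pairs (an assumption of this sketch).
    """
    if total_disks < 4 or total_disks % 2:
        raise ValueError("need an even number of disks, at least four")
    pairs = total_disks // 2
    pair = strip_number % pairs
    return (2 * pair, 2 * pair + 1)


for strip in range(4):
    print(strip, raid10_disks_for_strip(strip, 4))
# 0 (0, 1)
# 1 (2, 3)
# 2 (0, 1)
# 3 (2, 3)
```

Each strip is written to both disks of one pair, which gives the redundancy of mirroring, while consecutive strips go to different pairs, which gives the parallelism of striping.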

Fig: Nested RAID: (a) RAID 1+0, (b) RAID 0+1


Components of FC SAN
> Components of FC SAN infrastructure are:
1) Node ports,
2) Cables,
3) Connectors,
4) Interconnecting devices (such as FC switches or hubs),
5) SAN management software.

Node Ports
> In fibre channel, devices such as hosts, storage and tape libraries are all referred to as
nodes.
> Each node is a source or destination of information for one or more nodes.
> Each node requires one or more ports to provide a physical interface for communicating
with other nodes.
> A port operates in full-duplex data transmission mode with a transmit (Tx) link and receive (Rx)
link (see Fig 2.1).

Fig 2.1: Nodes, ports, links

Cables
> SAN implementations use optical fiber cabling.
> Copper can be used for shorter distances for back-end connectivity.
> Optical fiber cables carry data in the form of light.
> There are two types of optical cables: multi-mode and single-mode.
1) Multi-mode fiber (MMF) cable carries multiple beams of light projected at different angles
simultaneously onto the core of the cable (see Fig 2.2 (a)).
> In an MMF transmission, multiple light beams traveling inside the cable tend to
disperse and collide. This collision weakens the signal strength after it travels a
certain distance, a process known as modal dispersion.
> MMFs are generally used within data centers for shorter distance runs.
2) Single-mode fiber (SMF) carries a single ray of light projected at the center of the core (see
Fig 2.2 (b)).
> In an SMF transmission, a single light beam travels in a straight line through the core
of the fiber.
> The small core and the single light wave limit modal dispersion. Among all types of
fibre cables, single-mode provides minimum signal attenuation over maximum distance
(up to … km).
> A single-mode cable is used for long-distance cable runs, limited only by the power
of the laser at the transmitter and sensitivity of the receiver.
> SMFs are used for longer distances.

Fig 2.2: Multimode fiber and single-mode fiber: (a) multimode fiber, (b) single-mode fiber

Connectors
> They are attached at the end of the cable to enable swift connection and disconnection of the
cable to and from a port.
> A Standard connector (SC) (see Fig 2.3 (a)) and a Lucent connector (LC) (see Fig 2.3 (b))
are two commonly used connectors for fiber optic cables.
> An SC is used for data transmission speeds up to 1 Gb/s, whereas an LC is used for speeds up
to 4 Gb/s.

Figure 2.3 depicts a Lucent connector and a Standard connector.


> A Straight Tip (ST) is a fiber optic connector with a plug and a socket that is locked with a
half-twisted bayonet lock (see Fig 2.3 (c)).

Fig 2.3: SC, LC, and ST connectors: (a) Standard connector, (b) Lucent connector, (c) Straight Tip connector


Interconnect Devices

The commonly used interconnecting devices in SAN are

1) Hubs,
2) Switches,
3) Directors

> Hubs are used as communication devices in FC-AL implementations. Hubs physically
connect nodes in a logical loop or a physical star topology.
> All the nodes must share the bandwidth because data travels through all the connection
points. Because of availability of low cost and high performance switches, hubs are no
longer used in SANs.
> Switches are more intelligent than hubs and directly route data from one physical
port to another. Therefore, nodes do not share the bandwidth. Instead, each node has a
dedicated communication path, resulting in bandwidth aggregation.
> Switches are available with:
Fixed port count
Modular design: port count is increased by installing additional port cards to open slots.

> Directors are larger than switches and are deployed for data center implementations.
> The function of directors is similar to that of FC switches, but directors have higher
port count and fault tolerance capabilities.
> Port card or blade has multiple ports for connecting nodes and other FC switches

SAN Management Software

> SAN management software manages the interfaces between hosts, interconnect devices,
and storage arrays.
> The software provides a view of the SAN environment and enables management of
various resources from one central console.
> It provides key management functions, including mapping of storage devices, switches,
and servers, monitoring and generating alerts for discovered devices, and logical
partitioning of the SAN, called zoning.
