
Storage Area Network

18CS822 Semester VIII

Module 2
Data Protection: RAID
By,
Vijay Anand H M
CSE, Dr. SMCE

Syllabus:
Data Protection - RAID : RAID Implementation Methods, RAID Array
Components, RAID Techniques, RAID Levels, RAID Impact on Disk
Performance, RAID Comparison.
Intelligent Storage Systems: Components of an Intelligent Storage System, Types
of Intelligent Storage Systems.
Fibre Channel Storage Area Networks - Fibre Channel: Overview, The SAN
and Its Evolution, Components of FC SAN.
RAID:
RAID (originally Redundant Array of Inexpensive Disks) is the use of small-capacity, inexpensive disk drives as an alternative to the large-capacity drives common on mainframe computers.
RAID was later redefined as Redundant Array of Independent Disks to reflect advances in storage technology.

RAID Implementation Methods:


The two methods of RAID implementation are:
1. Hardware RAID.
2. Software RAID.

Hardware RAID:
• In hardware RAID implementations, a specialized hardware controller is implemented
either on the host or on the array.
• Controller card RAID is a host-based hardware RAID implementation in which a
specialized RAID controller is installed in the host, and disk drives are connected to
it.
• Manufacturers also integrate RAID controllers on motherboards.
• A host-based RAID controller is not an efficient solution in a data center environment
with a large number of hosts.
• The external RAID controller is an array-based hardware RAID.
• It acts as an interface between the host and disks.
• It presents storage volumes to the host, and the host manages these volumes as
physical drives.
• The key functions of the RAID controllers are as follows:
 Management and control of disk aggregations
 Translation of I/O requests between logical disks and physical disks
 Data regeneration in the event of disk failures

Software RAID:
• Software RAID uses host-based software to provide RAID functions.
• It is implemented at the operating-system level and does not use a dedicated
hardware controller to manage the RAID array.
• Advantages when compared to hardware RAID:
 Lower cost
 Simplicity
Limitations:
 Performance: Software RAID affects overall system performance. This is due to
additional CPU cycles required to perform RAID calculations.
 Supported features: Software RAID does not support all RAID levels.
 Operating system compatibility: Software RAID is tied to the host operating system;
hence, upgrades to software RAID or to the operating system should be validated for
compatibility. This leads to inflexibility in the data-processing environment.

RAID Array Components:


• A RAID array is an enclosure that contains a number of HDDs and the supporting hardware
and software to implement RAID.
• A subset of disks within a RAID array can be grouped to form logical associations called
logical arrays, also known as a RAID set or a RAID group.
• Logical arrays are composed of logical volumes (LVs).

Figure 1: RAID Array Components

RAID Techniques:
There are three RAID techniques
1. striping
2. mirroring
3. parity



Striping:
• Striping is a technique to spread data across multiple drives (more than one) to use the
drives in parallel.
• All the read-write heads work simultaneously, allowing more data to be processed in a
shorter time and increasing performance, compared to reading and writing from a single
disk.
• Within each disk in a RAID set, a predefined number of contiguously addressable
disk blocks are defined as a strip.
• The set of aligned strips that spans across all the disks within the RAID set is called a stripe.
• Following figure shows physical and logical representations of a striped RAID set.

Figure 2: Striped RAID set

• Strip size (also called stripe depth) describes the number of blocks in a strip and is the
maximum amount of data that can be written to or read from a single disk in the set.
• All strips in a stripe have the same number of blocks. Having a smaller strip size means that
data is broken into smaller pieces while spread across the disks.
• Stripe size is the strip size multiplied by the number of data disks in the RAID set. For example, in a five-disk striped RAID set with a strip size of 64 KB, the stripe size is 320 KB (64 KB x 5); see the sketch after this list.
• Stripe width refers to the number of data strips in a stripe.
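To make the strip and stripe arithmetic concrete, the following Python sketch maps a logical block number to the disk and strip that would hold it and computes the stripe size. The 5-disk set and 64 KB strip come from the example above; the 4 KB block size and the round-robin layout are assumptions for illustration only.

# Minimal striping sketch: map a logical block to (disk, strip, offset).
# Assumed values: 5-disk striped set, 64 KB strip, 4 KB blocks (illustrative only).

BLOCK_SIZE_KB = 4
STRIP_SIZE_KB = 64                      # strip size (stripe depth)
DISKS = 5                               # data disks in the RAID set
BLOCKS_PER_STRIP = STRIP_SIZE_KB // BLOCK_SIZE_KB

STRIPE_SIZE_KB = STRIP_SIZE_KB * DISKS  # 64 KB x 5 = 320 KB, as in the example above

def locate(logical_block: int):
    """Return (disk index, strip number on that disk, block offset within the strip)."""
    strip_index = logical_block // BLOCKS_PER_STRIP   # which strip of the logical volume
    offset = logical_block % BLOCKS_PER_STRIP         # block position inside that strip
    disk = strip_index % DISKS                        # strips are laid out round-robin
    strip_on_disk = strip_index // DISKS              # which stripe (row) this strip belongs to
    return disk, strip_on_disk, offset

print(STRIPE_SIZE_KB)        # 320
print(locate(0))             # (0, 0, 0)  -> first strip lands on disk 0
print(locate(16))            # (1, 0, 0)  -> the next strip moves to disk 1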



Mirroring:
• Mirroring is a technique whereby the same data is stored on two different disk drives,
yielding two copies of the data.
• If one disk drive fails, the data remains intact on the surviving disk drive, and the controller continues to service the host's data requests from the surviving disk of the mirrored pair.
• When the failed disk is replaced with a new disk, the controller copies the data from the
surviving disk of the mirrored pair. This activity is transparent to the host.

Fig 3: Mirrored disks in an array
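A minimal behavioural sketch of the mirroring described above, assuming a hypothetical two-disk pair modelled as Python dictionaries; it only shows that writes land on both members, reads are served from a surviving member after a failure, and a replaced disk is rebuilt by copying from the survivor.

# Mirroring sketch: every write lands on both disks; reads fall back to the survivor.
# The two "disks" are plain dictionaries keyed by block number (illustrative only).

class MirroredPair:
    def __init__(self):
        self.disks = [dict(), dict()]   # disk 0 and disk 1 hold identical copies
        self.failed = [False, False]

    def write(self, block, data):
        for i, disk in enumerate(self.disks):
            if not self.failed[i]:
                disk[block] = data      # same data written to both members

    def read(self, block):
        for i, disk in enumerate(self.disks):
            if not self.failed[i]:
                return disk[block]      # controller serves I/O from a surviving disk
        raise IOError("both mirror members have failed")

    def replace(self, i):
        survivor = self.disks[1 - i]
        self.disks[i] = dict(survivor)  # rebuild: copy data from the surviving disk
        self.failed[i] = False

pair = MirroredPair()
pair.write(7, b"payload")
pair.failed[0] = True                   # simulate a disk failure
print(pair.read(7))                     # still served from disk 1
pair.replace(0)                         # rebuild is transparent to the host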

Advantages:
 Complete data redundancy
 Fast recovery from disk failure
 Data protection
Disadvantages:
 Mirroring involves duplication of data; the amount of storage capacity needed is twice the amount of data being stored.
 Expensive.

Parity:

• Parity is a method to protect striped data from disk drive failure without the cost
of mirroring.
• An additional disk drive is added to hold parity, a mathematical construct that allows re-creation of the missing data.
• Parity is a redundancy technique that ensures protection of data without
maintaining a full set of duplicate data.
• Calculation of parity is a function of the RAID controller.
• Parity information can be stored on separate, dedicated disk drives or distributed
across all the drives in a RAID set.

Fig 4: Parity RAID


• In Figure 4, the first four disks, labeled “Data Disks,” contain the data. The fifth disk, labeled “Parity Disk,” stores the parity information, which, in this case, is the sum of the elements in each row.
• Now, if one of the data disks fails, the missing value can be calculated by subtracting the sum
of the rest of the elements from the parity value.
• Here, computation of parity is represented as an arithmetic sum of the data. However, parity
calculation is a bitwise XOR operation.
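The XOR behaviour can be shown with a short Python sketch; the four strip values and the four-data-disk layout are assumed for illustration. The same XOR that produces the parity strip also regenerates the contents of any single failed disk.

from functools import reduce

# Parity sketch: one parity strip protects four data strips (values are illustrative).
data_strips = [0b1010, 0b0111, 0b1100, 0b0110]      # contents of the four data disks

parity = reduce(lambda a, b: a ^ b, data_strips)    # bitwise XOR across the stripe

# Simulate loss of disk 2: XOR of the parity with the surviving strips regenerates it.
surviving = data_strips[:2] + data_strips[3:]
rebuilt = reduce(lambda a, b: a ^ b, surviving, parity)

assert rebuilt == data_strips[2]
print(f"parity={parity:04b}, rebuilt strip={rebuilt:04b}")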
Advantages:

 Compared to mirroring, parity implementation considerably reduces the cost associated with
data protection.

Disadvantages:

 Parity information is generated from data on the data disk. Therefore, parity is
recalculated every time there is a change in data.
 This recalculation is time-consuming and affects the performance of the RAID
array.

RAID Levels:
RAID level selection is determined by the following factors:
 Application performance
 data availability requirements
 cost
RAID Levels are defined on the basis of:
 Striping
 Mirroring
 Parity techniques
Some RAID levels use a single technique whereas others use a combination of techniques.

RAID 0:
• RAID 0 configuration uses data striping techniques, where data is striped across all the
disks within a RAID set. Therefore it utilizes the full storage capacity of a RAID set.
• To read data, all the strips are put back together by the controller.
• When the number of drives in the RAID set increases, performance improves because
more data can be read or written simultaneously.
• RAID 0 is a good option for applications that need high I/O throughput.
• However, if these applications require high availability during drive failures, RAID 0 does
not provide data protection and availability.



Figure 5: RAID 0
RAID 1

• RAID 1 is based on the mirroring technique.


• In this RAID configuration, data is mirrored to provide fault tolerance (see Figure 6).
• A RAID 1 set consists of two disk drives, and every write is written to both disks.
• The mirroring is transparent to the host.
• During disk failure, the impact on data recovery in RAID 1 is the least among all RAID
implementations. This is because the RAID controller uses the mirror drive for data
recovery.
• RAID 1 is suitable for applications that require high availability and for which cost is not a constraint.



Figure 6: RAID 1
Nested RAID:

 Most data centers require data redundancy and performance from their RAID arrays.

 RAID 1+0 and RAID 0+1 combine the performance benefits of RAID 0 with the
redundancy benefits of RAID 1.
 They use striping and mirroring techniques and combine their benefits. These types of
RAID require an even number of disks, the minimum being four.

Figure 7: Nested RAID


RAID 1+0:
• RAID 1+0 is also known as RAID 10 (Ten) or RAID 1/0.



• RAID 1+0 performs well for workloads with small, random, write-intensive I/Os.
• Some applications that benefit from RAID 1+0 include the following:
 High transaction rate Online Transaction Processing (OLTP)
 Large messaging installations
 Database applications with write intensive random access workloads
• RAID 1+0 is also called striped mirror.
• The basic element of RAID 1+0 is a mirrored pair, which means that data is first mirrored
and then both copies of the data are striped across multiple disk drive pairs in a RAID set.
• When replacing a failed drive, only the mirror is rebuilt. The disk array controller uses
the surviving drive in the mirrored pair for data recovery and continuous operation.
Working of RAID 1+0:
• Eg: consider an example of six disks forming a RAID 1+0 (RAID 1 first and then
RAID 0) set.
• These six disks are paired into three sets of two disks, where each set acts as a RAID 1 set
(mirrored pair of disks). Data is then striped across all the three mirrored sets to form
RAID 0.
• Following are the steps performed in RAID 1+0
 Drives 1+2 = RAID 1 (Mirror Set A)
 Drives 3+4 = RAID 1 (Mirror Set B)
 Drives 5+6 = RAID 1 (Mirror Set C)
• Now, RAID 0 striping is performed across sets A through C.
• In this configuration, if drive 5 fails, then the mirror set C alone is affected. It still has
drive 6 and continues to function and the entire RAID 1+0 array also keeps functioning.
• Now, suppose drive 3 fails while drive 5 was being replaced. In this case the array
still continues to function because drive 3 is in a different mirror set.
• So, in this configuration, up to three drives can fail without affecting the array, as long as
they are all in different mirror sets.
• RAID 0+1 is also called a mirrored stripe.
Working of RAID 0+1:
• Eg: Consider the same example of six disks forming a RAID 0+1 (that is, RAID 0 first
and then RAID 1).
• Here, six disks are paired into two sets of three disks each.
• Each of these sets, in turn, acts as a RAID 0 set that contains three disks, and then these two sets are mirrored to form RAID 1.
• Following are the steps performed in RAID 0+1
 Drives 1 + 2 + 3 = RAID 0 (Stripe Set A)
 Drives 4 + 5 + 6 = RAID 0 (Stripe Set B)
• These two stripe sets are mirrored.
• If one of the drives, say drive 3, fails, the entire stripe set A fails.
• A rebuild operation copies the entire stripe, copying the data from each disk in the
healthy stripe to an equivalent disk in the failed stripe.
• This causes increased and unnecessary I/O load on the surviving disks and makes the
RAID set more vulnerable to a second disk failure.
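The difference in fault tolerance between the two nested layouts can be checked with a small Python sketch over the same six drives used in the examples above; the drive groupings are taken from the text, and the sketch simply tests whether a given set of failed drives still leaves the array usable.

# Six drives, numbered 1..6, arranged as in the examples above (illustrative only).

MIRROR_SETS_1_0 = [{1, 2}, {3, 4}, {5, 6}]   # RAID 1+0: three mirrored pairs, then striped
STRIPE_SETS_0_1 = [{1, 2, 3}, {4, 5, 6}]     # RAID 0+1: two stripe sets, then mirrored

def raid10_survives(failed):
    # The array survives as long as no mirrored pair has lost both of its members.
    return all(not pair <= failed for pair in MIRROR_SETS_1_0)

def raid01_survives(failed):
    # The array survives as long as at least one stripe set is completely intact.
    return any(not (stripe & failed) for stripe in STRIPE_SETS_0_1)

print(raid10_survives({5, 3}))   # True  - the failures fall in different mirror sets
print(raid01_survives({3, 5}))   # False - drive 3 kills stripe A, drive 5 kills stripe B
print(raid10_survives({1, 2}))   # False - both members of mirror set A are gone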
RAID 3
• RAID 3 stripes data for high performance and uses parity for improved fault tolerance.
• Parity information is stored on a dedicated drive so that data can be reconstructed if a
drive fails. For example, in a set of five disks, four are used for data and one is used for parity.
• RAID 3 always reads and writes complete stripes of data across all disks, as the drives
operate in parallel. There are no partial writes that update one out of many strips in a stripe.
• RAID 3 provides good bandwidth for the transfer of large volumes of data. RAID 3 is used
in applications that involve large sequential data access, such as video streaming.

Figure 8: RAID 3



RAID 4
 RAID 4 stripes data for high performance and uses parity for improved fault
tolerance. Data is striped across all disks except the parity disk in the array.
 Parity information is stored on a dedicated disk so that the data can be rebuilt if a
drive fails. Striping is done at the block level.
 Unlike RAID 3, data disks in RAID 4 can be accessed independently so that specific
data elements can be read or written on a single disk without reading or writing an
entire stripe. RAID 4 provides good read throughput and reasonable write throughput.

Figure 9: RAID 4
RAID 5
• RAID 5 is a versatile RAID implementation.
• It is similar to RAID 4 because it uses striping. The drives (strips) are also independently
accessible.
• The difference between RAID 4 and RAID 5 is the parity location. In RAID 4, parity is written to a dedicated drive, creating a write bottleneck for the parity disk.
• In RAID 5, parity is distributed across all disks. The distribution of parity in RAID 5 overcomes the write bottleneck. Figure 10 illustrates the RAID 5 implementation.
• RAID 5 is good for random, read-intensive I/O applications and preferred for messaging,
data mining, medium-performance media serving, and relational database management
system (RDBMS) implementations, in which database administrators (DBAs) optimize
data access.

Figure 10: RAID 5

RAID 6
 RAID 6 includes a second parity element to enable survival in the event of the failure
of two disks in a RAID group. Therefore, a RAID 6 implementation requires at least
four disks.
 RAID 6 distributes the parity across all the disks. The write penalty in RAID 6 is more
than that in RAID 5; therefore, RAID 5 writes perform better than RAID 6. The
rebuild operation in RAID 6 may take longer than that in RAID 5 due to the presence
of two parity sets.



Figure 11: RAID 6

RAID Impact on Disk Performance


• When choosing a RAID type, it is imperative to consider its impact on disk performance
and application IOPS.
• In both mirrored (RAID 1) and parity RAID (RAID 5) configurations, every write
operation translates into more I/O overhead for the disks which is referred to as write
penalty.
• In a RAID 1 implementation, every write operation must be performed on two disks
configured as a mirrored pair. The write penalty is 2.
• In a RAID 5 implementation, a write operation may manifest as four I/O operations. When
performing small I/Os to a disk configured with RAID 5, the controller has to read,
calculate, and write a parity segment for every data write operation.
• Consider a five-disk RAID 5 set: for a given stripe, four of the disks hold data and one holds the parity.
• The parity (Ep) at the controller is calculated as follows:
Ep = E1 + E2 + E3 + E4
(XOR operations)

• Whenever the controller performs a write I/O, parity must be computed by reading the
old parity (Ep old) and the old data (E4 old) from the disk, which means two read I/Os.
• The new parity (Ep new) is computed as follows:
Ep new = Ep old – E4 old + E4 new (XOR operations)



• After computing the new parity, the controller completes the write I/O by doing two write I/Os for the new data and the new parity onto the disks.

Figure 12: Write penalty in RAID 5
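The read-modify-write sequence shown in Figure 12 can be expressed as a short Python sketch. The XOR-based parity update and the four back-end I/Os (two reads, two writes) come from the discussion above; the sizing helper and its workload numbers are hypothetical and only show how host IOPS translate into disk IOPS using the write penalties of RAID 5 and RAID 1.

def raid5_small_write(old_data, old_parity, new_data):
    """Return the new parity and the back-end I/O count for one small RAID 5 write.

    Read old data + old parity (2 reads), XOR out the old data and XOR in the
    new data, then write new data + new parity (2 writes): a write penalty of 4.
    """
    new_parity = old_parity ^ old_data ^ new_data
    io_count = 4                       # 2 reads + 2 writes
    return new_parity, io_count

# Hypothetical sizing helper: total back-end disk IOPS generated by a host workload.
def disk_iops(host_iops, read_fraction, write_penalty):
    reads = host_iops * read_fraction
    writes = host_iops * (1 - read_fraction)
    return reads + writes * write_penalty

print(raid5_small_write(0b0001, 0b1010, 0b0111))   # (0b1100, 4): new parity and 4 I/Os
print(disk_iops(1000, 0.7, write_penalty=4))       # RAID 5: 700 + 300*4 = 1900
print(disk_iops(1000, 0.7, write_penalty=2))       # RAID 1: 700 + 300*2 = 1300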


RAID Level Comparison:



Intelligent Storage System:
• RAID technology made an important contribution to enhancing storage performance and
reliability, but disk drives, even with a RAID implementation, could not meet the
performance requirements of today’s applications.
• With advancements in technology, a new breed of storage solutions, known as intelligent
storage systems, has evolved.
• These intelligent storage systems are feature-rich RAID arrays that provide highly
optimized I/O processing capabilities.
Components of an Intelligent Storage System:
An intelligent storage system consists of four key components:
 front end
 Cache
 back end
 physical disks

Figure 13: Components of an Intelligent Storage System

Front End
• The front end provides the interface between the storage system and the host.
• It consists of two components:

i. Front-End Ports

ii. Front-End Controllers.


• A front end has redundant controllers for high availability, and each controller contains multiple front-end ports that enable large numbers of hosts to connect to the intelligent storage system.
• Front-end controllers route data to and from cache via the internal data bus.

When the cache receives the write data, the controller sends an acknowledgment message back
to the host.
Cache:
• Cache is semiconductor memory where data is placed temporarily to reduce the time
required to service I/O requests from the host.
• Cache improves storage system performance by isolating hosts from the mechanical
delays associated with rotating disks or hard disk drives (HDD).
• Rotating disks are the slowest component of an intelligent storage system. Data access on rotating disks usually takes several milliseconds because of seek time and rotational latency.
• Accessing data from cache is fast and typically takes less than a millisecond.

Structure of Cache

• Cache is organized into pages; a page is the smallest unit of cache allocation. The size of a cache page is configured according to the application I/O size.
• Cache consists of the data store and tag RAM.

• The data store holds the data whereas the tag RAM tracks the location of the data in the
data store and in the disk.
• Entries in tag RAM indicate where data is found in cache and where the data belongs on
the disk.
• Tag RAM includes a dirty bit flag, which indicates whether the data in cache has been
committed to the disk.
• It also contains time-based information, such as the time of last access, which is used to
identify cached information that has not been accessed for a long period and may be freed
up.



Figure 14: Structure of Cache
Read Operation with Cache:

• When a host issues a read request, the storage controller reads the tag RAM to determine
whether the required data is available in cache.
• If the requested data is found in the cache, it is called a read cache hit or read hit and
data is sent directly to the host, without any disk operation. This provides a fast response
time to the host (about a millisecond).
• If the requested data is not found in cache, it is called a cache miss and the data must be
read from the disk. The back-end controller accesses the appropriate disk and retrieves the
requested data. Data is then placed in cache and is finally sent to the host through the
front- end controller.
• Cache misses increase I/O response time.

• A Pre-fetch, or Read-ahead, algorithm is used when read requests are sequential. In a sequential read request, a contiguous set of associated blocks is retrieved. Several other
blocks that have not yet been requested by the host can be read from the disk and placed
into cache in advance. When the host subsequently requests these blocks, the read
operations will be read hits.
• This process significantly improves the response time experienced by the host.

• The intelligent storage system offers fixed and variable prefetch sizes.

• In fixed pre-fetch, the intelligent storage system pre-fetches a fixed amount of data. It is
most suitable when I/O sizes are uniform.

• In variable pre-fetch, the storage system pre-fetches an amount of data in multiples of the
size of the host request.

Figure 15: Read hit and read miss
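The read hit, read miss, and fixed pre-fetch behaviour shown in Figure 15 can be sketched in Python. The dictionary-based tag lookup, the in-memory "disk", and the pre-fetch depth are assumptions for illustration; a real array's tag RAM, controllers, and pre-fetch policies are far richer.

# Read-with-cache sketch: tag lookup, read hit vs. read miss, and fixed pre-fetch.
# `disk` stands in for the back end; sizes and values are illustrative only.

cache = {}                      # block number -> data (plays the role of data store + tag RAM)
disk = {n: f"block-{n}" for n in range(1000)}
PREFETCH = 4                    # fixed pre-fetch: stage this many extra blocks ahead

def read(block):
    if block in cache:                          # read hit: served without any disk operation
        return cache[block], "hit"
    # Read miss: fetch from disk, stage into cache, and pre-fetch the following blocks.
    for n in range(block, block + 1 + PREFETCH):
        if n in disk:
            cache[n] = disk[n]
    return cache[block], "miss"

print(read(10))    # ('block-10', 'miss') - first access goes to the disk
print(read(12))    # ('block-12', 'hit')  - already staged by the pre-fetch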


Write Operation with Cache:
• Write operations with cache provide performance advantages over writing directly
to disks.
• When an I/O is written to cache and acknowledged, it is completed in far less time
(from the host’s perspective) than it would take to write directly to disk.
• Sequential writes also offer opportunities for optimization because many smaller
writes can be coalesced for larger transfers to disk drives with the use of cache.
• A write operation with cache is implemented in the following ways:
• Write-back cache: Data is placed in cache and an acknowledgment is sent to the host
immediately. Later, data from several writes are committed to the disk. Write response
times are much faster, as the write operations are isolated from the mechanical delays
of the disk. However, uncommitted data is at risk of loss in the event of cache failures.



• Write-through cache: Data is placed in the cache and immediately written to the disk,
and an acknowledgment is sent to the host. Because data is committed to disk as it
arrives, the risks of data loss are low but write response time is longer because of the
disk operations.
• Cache can be bypassed under certain conditions, such as large size write I/O.
• In this implementation, if the size of an I/O request exceeds the predefined size, called
write aside size, writes are sent to the disk directly to reduce the impact of large writes
consuming a large cache space.
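A sketch contrasting the three behaviours just described, with a hypothetical write-aside threshold; the lists standing in for cache and disk, and the returned strings, only indicate when the host would see the acknowledgment relative to the disk write.

# Write-path sketch: write-back, write-through, and write-aside (values illustrative).

WRITE_ASIDE_SIZE = 256 * 1024          # I/Os larger than this bypass the cache

def handle_write(size, data, mode, cache, disk):
    if size > WRITE_ASIDE_SIZE:        # large write: sent straight to disk
        disk.append(data)
        return "ack after disk write (cache bypassed)"
    if mode == "write-back":
        cache.append(data)             # data only in cache for now; flushed to disk later
        return "ack immediately (data committed to disk later)"
    if mode == "write-through":
        cache.append(data)
        disk.append(data)              # committed to disk before acknowledging
        return "ack after disk write"
    raise ValueError(mode)

cache, disk = [], []
print(handle_write(8 * 1024, "small io", "write-back", cache, disk))
print(handle_write(8 * 1024, "small io", "write-through", cache, disk))
print(handle_write(1024 * 1024, "large io", "write-back", cache, disk))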
Cache Implementation:

• Cache can be implemented as either dedicated cache or global cache.


• With dedicated cache, separate sets of memory locations are reserved for reads and writes.
• In global cache, both reads and writes can use any of the available memory addresses.
• Cache management is more efficient in a global cache implementation because only one
global set of addresses has to be managed.
• For cache management, global cache allows users to specify the percentages of cache available for reads and writes.

Cache Management:

• Cache is a finite and expensive resource that needs proper management.


• Even though modern intelligent storage systems come with a large amount of cache, when
all cache pages are filled, some pages have to be freed up to accommodate new data and
avoid performance degradation.
• Various cache management algorithms are implemented in intelligent storage systems to
proactively maintain a set of free pages and a list of pages that can be potentially freed up
whenever required.
• The most commonly used algorithms are listed below:

Least Recently Used (LRU): An algorithm that continuously monitors data access in cache
and identifies the cache pages that have not been accessed for a long time. LRU either frees up
these pages or marks them for reuse. This algorithm is based on the assumption that data which
hasn’t been accessed for a while will not be requested by the host.
Most Recently Used (MRU): In MRU, the pages that have been accessed most recently are
freed up or marked for reuse. This algorithm is based on the assumption that recently accessed
data may not be required for a while.
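A minimal LRU sketch in Python using the standard library's OrderedDict; a real array tracks last-access times in tag RAM rather than an ordered map, so this is only a behavioural illustration of which page gets freed when the cache is full.

from collections import OrderedDict

# LRU sketch: when the cache is full, free the page that was accessed longest ago.
class LRUCache:
    def __init__(self, capacity):
        self.capacity = capacity
        self.pages = OrderedDict()          # ordered oldest -> most recently used

    def access(self, page, data=None):
        if page in self.pages:
            self.pages.move_to_end(page)    # mark as most recently used
            if data is not None:
                self.pages[page] = data
            return self.pages[page]
        if len(self.pages) >= self.capacity:
            self.pages.popitem(last=False)  # evict the least recently used page
        self.pages[page] = data
        return data

c = LRUCache(capacity=2)
c.access("A", 1); c.access("B", 2)
c.access("A")                               # touch A, so B is now least recently used
c.access("C", 3)                            # cache full: B is evicted, not A
print(list(c.pages))                        # ['A', 'C']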



• As cache fills, the storage system must take action to flush dirty pages (data written
into the cache but not yet written to the disk) to manage space availability.
• Flushing is the process that commits data from cache to the disk.

• On the basis of the I/O access rate and pattern, high and low levels called watermarks
are set in cache to manage the flushing process.
• High watermark (HWM) is the cache utilization level at which the storage system
starts high-speed flushing of cache data.
• Low watermark (LWM) is the point at which the storage system stops flushing data
to the disks.
• The cache utilization level, as shown in Figure 16, drives the mode of flushing to be used:

 Idle flushing: Occurs continuously, at a modest rate, when the cache utilization
level is between the high and low watermark.
 High watermark flushing: Activated when cache utilization hits the high
watermark. The storage system dedicates some additional resources for flushing.
This type of flushing has some impact on I/O processing.
 Forced flushing: Occurs in the event of a large I/O burst when cache reaches 100
percent of its capacity, which significantly affects the I/O response time. In forced
flushing, the system flushes the cache on priority by allocating more resources.

Figure 16: Types of Flushing
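The watermark logic of Figure 16 can be summarized in a small Python sketch; the watermark values are hypothetical, not vendor defaults, and the function only selects the flushing mode for a given cache utilization.

# Flushing-mode sketch: pick the flushing behaviour from the cache utilization level.
# Watermark values are illustrative only.

LWM = 0.40   # low watermark: flushing stops once utilization falls below this level
HWM = 0.80   # high watermark: high-speed flushing starts at this level

def flushing_mode(utilization):
    if utilization >= 1.0:
        return "forced flushing"          # cache is full; flush on priority
    if utilization >= HWM:
        return "high watermark flushing"  # dedicate additional resources to flushing
    if utilization > LWM:
        return "idle flushing"            # continuous flushing at a modest rate
    return "no flushing"                  # below the low watermark

for u in (0.30, 0.60, 0.90, 1.0):
    print(u, "->", flushing_mode(u))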



Cache Data Protection:
• Cache is volatile memory, so a power failure or any kind of cache failure will cause loss
of the data that is not yet committed to the disk.
• This risk of losing uncommitted data held in cache can be mitigated using
i. cache mirroring
ii. cache vaulting
Cache mirroring
• Each write to cache is held in two different memory locations on two independent
memory cards. In the event of a cache failure, the write data will still be safe in the
mirrored location and can be committed to the disk.
• Reads are staged from the disk to the cache; therefore, in the event of a cache failure, the data can still be accessed from the disk.
• In cache mirroring approaches, the problem of maintaining cache coherency is introduced.
• Cache coherency means that data in two different cache locations must be identical at all
times. It is the responsibility of the array operating environment to ensure coherency.
Cache vaulting

The risk of data loss due to power failure can be addressed in various ways:

• Powering the memory with a battery until AC power is restored, or using battery power to write the cache content to the disk.
• If an extended power failure occurs, using batteries is not a viable option.
• This is because in intelligent storage systems, large amounts of data might need to
be committed to numerous disks, and batteries might not provide power for
sufficient time to write each piece of data to its intended disk.
• Storage vendors use a set of physical disks to dump the contents of cache during
power failure. This is called cache vaulting and the disks are called vault drives.
• When power is restored, data from these disks is written back to write cache and
then written to the intended disks.
Back End:
• The back end provides an interface between cache and the physical disks.
• It consists of two components:
i. Back-end ports



ii. Back-end controllers.
• The back end controls data transfers between cache and the physical disks.
• From cache, data is sent to the back end and then routed to the destination disk.
• Physical disks are connected to ports on the back end.
• The back end controller communicates with the disks when performing reads and
writes and also provides additional, but limited, temporary data storage.
• The algorithms implemented on back-end controllers provide error detection
and correction, and also RAID functionality.
• For high data protection and high availability, storage systems are configured with
dual controllers with multiple ports.

Physical Disk:
• A physical disk stores data persistently.

• Physical disks are connected to the back-end storage controller and provide persistent
data storage.
• Modern intelligent storage systems provide support to a variety of disk drives
with different speeds and types, such as FC, SATA, SAS, and flash drives.
• They also support the use of a mix of flash, FC, or SATA within the same array.

Types of Intelligent Storage Systems


An intelligent storage system is divided into the following two categories:
1. High-end storage systems
2. Midrange storage systems
• High-end storage systems have been implemented with active-active configuration,
whereas midrange storage systems have been implemented with active-passive
configuration.
• The distinctions between these two implementations are becoming increasingly
insignificant.

High-end Storage Systems:


 High-end storage systems, referred to as active-active arrays, are generally aimed at
large enterprises for centralizing corporate data. These arrays are designed with a
large number of controllers and cache memory.



 An active-active array implies that the host can perform I/Os to its LUNs across any
of the available paths.

Figure 17: Active-Active Configuration


Advantages of High-end storage:

• Large storage capacity

• Large amounts of cache to service host I/Os optimally

• Fault tolerance architecture to improve data availability

• Connectivity to mainframe computers and open systems hosts
• Availability of multiple front-end ports and interface protocols to serve a large number of hosts
• Availability of multiple back-end Fibre Channel or SCSI RAID controllers to manage disk processing
• Scalability to support increased connectivity, performance, and storage capacity requirements
• Ability to handle large amounts of concurrent I/Os from a number of servers and applications
• Support for array-based local and remote replication



Midrange Storage System
• Midrange storage systems are also referred to as Active-Passive Arrays and they are
best suited for small- and medium-sized enterprises.
• They also provide optimal storage solutions at a lower cost.

• In an active-passive array, a host can perform I/Os to a LUN only through the paths to the owning controller of that LUN. These paths are called active paths. The other paths are passive with respect to this LUN (a minimal path-selection sketch follows this list).

Figure 18: Active-Passive Configuration

• As shown in Figure 18, the path to controller B remains passive, and no I/O activity is performed through this path.
• Midrange storage systems are typically designed with two controllers, each of which
contains host interfaces, cache, RAID controllers, and disk drive interfaces.
• Midrange arrays are designed to meet the requirements of small and medium enterprise applications; therefore, they host less storage capacity and cache than high-end storage arrays.
• There are also fewer front-end ports for connection to hosts.
• But they ensure high redundancy and high performance for applications with
predictable workloads.
• They also support array-based local and remote replication.
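As referenced above, here is a minimal Python sketch of active-passive path selection. The controller names, LUN names, and path strings are hypothetical; the sketch simply routes each I/O to a path on the LUN's owning controller and, as an assumed failover behaviour, moves ownership to the other controller if the owner is unavailable.

# Active-passive path selection sketch (controller, LUN, and path names are hypothetical).

lun_owner = {"LUN0": "A", "LUN1": "B"}                 # owning controller per LUN
paths = {"A": ["host->A-port0", "host->A-port1"],      # active paths live on the owner
         "B": ["host->B-port0", "host->B-port1"]}
controller_up = {"A": True, "B": True}

def path_for_io(lun):
    owner = lun_owner[lun]
    if not controller_up[owner]:                       # owner failed: move LUN ownership
        owner = "B" if owner == "A" else "A"
        lun_owner[lun] = owner
    return paths[owner][0]                             # any path to the owning controller

print(path_for_io("LUN0"))        # host->A-port0: controller A owns LUN0
controller_up["A"] = False
print(path_for_io("LUN0"))        # host->B-port0: LUN0 has failed over to controller B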



Fibre Channel Storage Area Networks
Fibre Channel: Overview:
• Fibre Channel is a high-speed network technology that runs on high-speed optical fibre
cables (preferred for front-end SAN connectivity) and serial copper cables (preferred
for back-end disk connectivity).
• The FC technology was created to meet the demand for increased speeds of data
transfer among computers, servers, and mass storage subsystems.
The SAN and Its Evolution
• A storage area network (SAN) carries data between servers (also known as hosts) and
storage devices through fibre channel switches
• A SAN enables storage consolidation and allows storage to be shared across multiple
servers.
• A SAN provides the physical communication infrastructure and enables secure and
robust communication between host and storage devices.

Figure 19: FC SAN implementation



Figure 20: FC SAN Evolution
Components of SAN:
• SAN consists of three basic components: servers, network infrastructure, and storage.
• These components can be further broken down into the following key elements: node
ports, cabling, interconnecting devices (such as FC switches or hubs), storage arrays,
and SAN management software.

Node Ports
• In fibre channel, devices such as hosts, storage and tape libraries are all referred to as
nodes.
• Each node is a source or destination of information for one or more nodes.
• Each node requires one or more ports to provide a physical interface for
communicating with other nodes.
• A port operates in full-duplex data transmission mode with a transmit (Tx) link and a
receive (Rx) link



Figure 21: Nodes, ports, and links

Cables and Connectors:


Cabling

• SAN implementations use optical fibre cabling.


• Copper can be used for shorter distances for back-end connectivity, as it provides a
better signal-to-noise ratio for distances up to 30 meters.
• Optical fiber cables carry data in the form of light. There are two types of optical
cables, multi-mode and single-mode.
• Multi-mode fiber (MMF) cable carries multiple beams of light projected at different
angles simultaneously onto the core of the cable

Figure 22: Multimode fiber and single-mode fiber



• Based on the bandwidth, multi-mode fibers are classified as OM1 (62.5μm), OM2
(50μm) and laser optimized OM3 (50μm). In an MMF transmission, multiple light
beams traveling inside the cable tend to disperse and collide.
• This collision weakens the signal strength after it travels a certain distance — a
process known as modal dispersion.
• An MMF cable is usually used for distances of up to 500 meters.
• Single-mode fiber (SMF) carries a single ray of light projected at the center of the
core.
• In an SMF transmission, a single light beam travels in a straight line through the core
of the fiber.
Connectors:
• A Standard connector (SC) and a Lucent connector (LC) are two commonly used
connectors for fiber optic cables.
• An SC is used for data transmission speeds up to 1 Gb/s, whereas an LC is used for
speeds up to 4 Gb/s.
• A Straight Tip (ST) is a fiber optic connector with a plug and a socket that is locked
with a half-twisted bayonet lock

Figure 23: SC, LC, and ST connectors

Interconnect Devices
• Hubs, switches, and directors are the interconnect devices commonly used in SAN.
• Hubs are used as communication devices in FC-AL implementations. Hubs physically
connect nodes in a logical loop or a physical star topology.
• Switches are more intelligent than hubs and directly route data from one physical port to
another. Therefore, nodes do not share the bandwidth.
• Directors are larger than switches and are deployed for data center implementations.
• The function of directors is similar to that of FC switches, but directors have higher port
count and fault tolerance capabilities.
Storage Arrays
• The fundamental purpose of a SAN is to provide host access to storage resources.
• The large storage capacities offered by modern storage arrays have been exploited in
SAN environments for storage consolidation and centralization.
SAN Management Software
• SAN management software manages the interfaces between hosts, interconnect devices,
and storage arrays.
• The software provides a view of the SAN environment and enables management of
various resources from one central console.

