Module 2
Data Protection: RAID
By,
Vijay Anand H M
CSE, Dr. SMCE
Syllabus:
Data Protection - RAID : RAID Implementation Methods, RAID Array
Components, RAID Techniques, RAID Levels, RAID Impact on Disk
Performance, RAID Comparison.
Intelligent Storage Systems: Components of an Intelligent Storage System, Types
of Intelligent Storage Systems.
Fibre Channel Storage Area Networks - Fibre Channel: Overview, The SAN
and Its Evolution, Components of FC SAN.
RAID:
RAID was originally defined as the use of small-capacity, inexpensive disk drives as an alternative to the large-capacity drives common on mainframe computers.
Later, RAID was redefined to refer to independent disks, to reflect advances in storage technology.
Hardware RAID:
• In hardware RAID implementations, a specialized hardware controller is implemented
either on the host or on the array.
• Controller card RAID is a host-based hardware RAID implementation in which a
specialized RAID controller is installed in the host, and disk drives are connected to
it.
• Manufacturers also integrate RAID controllers on motherboards.
• A host-based RAID controller is not an efficient solution in a data center environment
with a large number of hosts.
• The external RAID controller is an array-based hardware RAID.
• It acts as an interface between the host and disks.
• It presents storage volumes to the host, and the host manages these volumes as
physical drives.
• The key functions of the RAID controllers are as follows:
Management and control of disk aggregations
Translation of I/O requests between logical disks and physical disks
Data regeneration in the event of disk failures
Software RAID:
• Software RAID uses host-based software to provide RAID functions.
• It is implemented at the operating-system level and does not use a dedicated
hardware controller to manage the RAID array.
• Advantages when compared to hardware RAID:
Low cost
Simplicity
Limitations:
Performance: Software RAID affects overall system performance. This is due to
additional CPU cycles required to perform RAID calculations.
Supported features: Software RAID does not support all RAID levels.
Operating system compatibility: Software RAID is tied to the host operating system;
hence, upgrades to software RAID or to the operating system should be validated for
compatibility. This leads to inflexibility in the data-processing environment.
RAID Techniques:
There are three RAID techniques
1. striping
2. mirroring
3. parity
Striping:
• Striping is a technique of spreading data across multiple drives in a RAID set. The data is divided into strips, which are written to the drives in a round-robin fashion.
• Strip size (also called stripe depth) describes the number of blocks in a strip and is the maximum amount of data that can be written to or read from a single disk in the set.
• All strips in a stripe have the same number of blocks. A smaller strip size means that data is broken into smaller pieces when spread across the disks.
• Stripe size is the strip size multiplied by the number of data disks in the RAID set. Eg: In a 5-disk striped RAID set with a strip size of 64 KB, the stripe size is 320 KB (64 KB x 5); this mapping is illustrated in the sketch after this list.
• Stripe width refers to the number of data strips in a stripe.
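The mapping of data to disks under striping can be illustrated with a short Python sketch. It is only an illustration: the 512-byte block size is an assumption, while the 64 KB strip size and five data disks are taken from the example above.

# Hypothetical striping layout: maps a logical block number to the disk,
# strip, and offset that hold it. Assumes 512-byte blocks, a 64 KB strip,
# and five data disks (as in the 320 KB stripe example above).

BLOCK_SIZE = 512                     # bytes per block (assumed)
STRIP_SIZE_KB = 64                   # strip size (stripe depth)
DATA_DISKS = 5                       # data disks in the RAID set

BLOCKS_PER_STRIP = (STRIP_SIZE_KB * 1024) // BLOCK_SIZE
STRIPE_SIZE_KB = STRIP_SIZE_KB * DATA_DISKS     # 64 KB x 5 = 320 KB

def locate_block(logical_block: int):
    """Return (disk index, strip number on that disk, block offset in strip)."""
    strip_index = logical_block // BLOCKS_PER_STRIP    # which strip overall
    disk = strip_index % DATA_DISKS                    # strips rotate across disks
    strip_on_disk = strip_index // DATA_DISKS          # stripe (row) number
    offset = logical_block % BLOCKS_PER_STRIP
    return disk, strip_on_disk, offset

print("Stripe size:", STRIPE_SIZE_KB, "KB")
for lb in (0, 128, 700):
    print("logical block", lb, "->", locate_block(lb))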
Mirroring:
• Mirroring is a technique in which the same data is stored on two different disk drives, yielding two copies of the data.
Advantages:
Complete data redundancy
Fast recovery from disk failure
Data protection
Disadvantages:
Mirroring involves duplication of data; the amount of storage capacity needed is twice the amount of data being stored.
Expensive.
Parity:
• Parity is a method to protect striped data from disk drive failure without the cost
of mirroring.
• An additional disk drive is added to hold parity, a mathematical construct that allows re-creation of the missing data.
• Parity is a redundancy technique that ensures protection of data without
maintaining a full set of duplicate data.
• Calculation of parity is a function of the RAID controller.
• Parity information can be stored on separate, dedicated disk drives or distributed
across all the drives in a RAID set.
Compared to mirroring, parity implementation considerably reduces the cost associated with
data protection.
Disadvantages:
Parity information is generated from data on the data disk. Therefore, parity is
recalculated every time there is a change in data.
This recalculation is time-consuming and affects the performance of the RAID
array.
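The following Python sketch illustrates the parity idea: the parity strip is the byte-wise XOR of the data strips, and a missing strip can be regenerated from the surviving strips and the parity. The strip contents are toy values chosen only for illustration; in practice the calculation is a function of the RAID controller.

from functools import reduce

def xor_parity(strips):
    """Compute a parity strip as the byte-wise XOR of all given strips."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*strips))

def regenerate(surviving_strips, parity):
    """Rebuild a missing strip by XOR-ing the surviving strips with the parity."""
    return xor_parity(surviving_strips + [parity])

# Three toy 4-byte data strips
d0 = b"\x01\x02\x03\x04"
d1 = b"\x10\x20\x30\x40"
d2 = b"\xaa\xbb\xcc\xdd"
p = xor_parity([d0, d1, d2])        # parity written to the parity drive

# Simulate the failure of the drive holding d1 and rebuild it
assert regenerate([d0, d2], p) == d1
print("missing strip regenerated from survivors and parity")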
RAID Levels:
RAID level selection is determined by the following factors:
Application performance
data availability requirements
cost
RAID Levels are defined on the basis of:
Striping
Mirroring
Parity techniques
Some RAID levels use a single technique whereas others use a combination of techniques.
RAID 0:
• RAID 0 configuration uses data striping techniques, where data is striped across all the
disks within a RAID set. Therefore it utilizes the full storage capacity of a RAID set.
• To read data, all the strips are put back together by the controller.
• When the number of drives in the RAID set increases, performance improves because
more data can be read or written simultaneously.
• RAID 0 is a good option for applications that need high I/O throughput.
• However, if these applications require high availability during drive failures, RAID 0 does
not provide data protection and availability.
Nested RAID (RAID 0+1 and RAID 1+0):
• Most data centers require both data redundancy and performance from their RAID arrays.
• RAID 1+0 and RAID 0+1 combine the performance benefits of RAID 0 with the redundancy benefits of RAID 1.
• They use both striping and mirroring techniques and combine their benefits. These types of RAID require an even number of disks, the minimum being four.
Figure 8: RAID 3
Figure 9: RAID 4
RAID 5
• RAID 5 is a versatile RAID implementation.
• It is similar to RAID 4 because it uses striping. The drives (strips) are also independently
accessible.
• The difference between RAID 4 and RAID 5 is the parity location. In RAID 4, parity is written to a dedicated drive, creating a write bottleneck for the parity disk.
• In RAID 5, parity is distributed across all disks, which overcomes the write bottleneck of a dedicated parity disk. Fig 1.18 illustrates the RAID 5 implementation.
• RAID 5 is good for random, read-intensive I/O applications and preferred for messaging,
data mining, medium-performance media serving, and relational database management
system (RDBMS) implementations, in which database administrators (DBAs) optimize
data access.
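To contrast the dedicated parity disk of RAID 4 with the distributed parity of RAID 5, the sketch below prints a hypothetical RAID 5 layout in which the parity strip rotates by one disk per stripe. The rotation pattern is an assumption for illustration; real controllers may use other rotation schemes.

def raid5_layout(num_disks, num_stripes):
    """Print which disk holds parity (P) and which hold data (D) per stripe,
    assuming the parity strip rotates by one disk for each new stripe."""
    for stripe in range(num_stripes):
        parity_disk = (num_disks - 1 - stripe) % num_disks   # rotate parity
        row = ["P" if d == parity_disk else "D" for d in range(num_disks)]
        print("stripe", stripe, ":", " ".join(row))

raid5_layout(num_disks=5, num_stripes=5)
# In RAID 4 the "P" column would stay on one dedicated disk for every stripe.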
RAID 6
RAID 6 includes a second parity element to enable survival in the event of the failure
of two disks in a RAID group. Therefore, a RAID 6 implementation requires at least
four disks.
RAID 6 distributes the parity across all the disks. The write penalty in RAID 6 is more
than that in RAID 5; therefore, RAID 5 writes perform better than RAID 6. The
rebuild operation in RAID 6 may take longer than that in RAID 5 due to the presence
of two parity sets.
• Whenever the controller performs a write I/O, parity must be computed by reading the
old parity (Ep old) and the old data (E4 old) from the disk, which means two read I/Os.
• The new parity (Ep new) is computed as follows:
Ep new = Ep old – E4 old + E4 new (XOR operations)
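The parity update above can be sketched in Python using XOR (with XOR, the subtraction and addition in the formula are the same operation). The strip values are made up for illustration; the sketch shows a single-parity update, and RAID 6 performs a similar update for its second parity as well.

def update_parity(old_data, new_data, old_parity):
    """Recompute one parity strip after a small write:
    new parity = old parity XOR old data XOR new data.
    This follows the two reads (old data, old parity); the controller then
    performs two writes (new data, new parity). RAID 6 repeats the update
    for its second parity, which is why its write penalty is higher."""
    return bytes(p ^ o ^ n for p, o, n in zip(old_parity, old_data, new_data))

old_e4 = b"\x12\x34"     # old data strip (toy values)
new_e4 = b"\x56\x78"     # new data strip
old_ep = b"\xff\x00"     # old parity strip
print("new parity:", update_parity(old_e4, new_e4, old_ep).hex())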
Front End
• The front end provides the interface between the storage system and the host.
• It consists of two components:
i. Front-End Ports
ii. Front-End Controllers
• When the cache receives the write data, the front-end controller sends an acknowledgment message back to the host.
Cache:
• Cache is semiconductor memory where data is placed temporarily to reduce the time
required to service I/O requests from the host.
• Cache improves storage system performance by isolating hosts from the mechanical
delays associated with rotating disks or hard disk drives (HDD).
• Rotating disks are the slowest component of an intelligent storage system. Data access on rotating disks usually takes several milliseconds because of seek time and rotational latency.
• Accessing data from cache is fast and typically takes less than a millisecond.
Structure of Cache
• Cache is organized into pages; a page is the smallest unit of cache allocation. The size of a cache page is configured according to the application I/O size.
• Cache consists of the data store and tag RAM.
• The data store holds the data whereas the tag RAM tracks the location of the data in the
data store and in the disk.
• Entries in tag RAM indicate where data is found in cache and where the data belongs on
the disk.
• Tag RAM includes a dirty bit flag, which indicates whether the data in cache has been
committed to the disk.
• It also contains time-based information, such as the time of last access, which is used to
identify cached information that has not been accessed for a long period and may be freed
up.
• When a host issues a read request, the storage controller reads the tag RAM to determine
whether the required data is available in cache.
• If the requested data is found in the cache, it is called a read cache hit or read hit and
data is sent directly to the host, without any disk operation. This provides a fast response
time to the host (about a millisecond).
• If the requested data is not found in cache, it is called a cache miss and the data must be
read from the disk. The back-end controller accesses the appropriate disk and retrieves the
requested data. Data is then placed in cache and is finally sent to the host through the
front-end controller.
• Cache misses increase I/O response time.
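The read hit/miss behaviour described above can be summarized in a small Python sketch. The dictionary-based "tag RAM", the page fields, and the backend_read function are simplifications introduced only for illustration.

import time

class SimpleReadCache:
    """Toy model of the read hit / read miss path. The 'tag RAM' is a dict
    keyed by disk block number; each entry carries the cached data, a dirty
    bit, and the time of last access."""

    def __init__(self, backend_read):
        self.backend_read = backend_read    # function that reads from disk
        self.pages = {}                     # block -> {"data", "dirty", "last"}

    def read(self, block):
        entry = self.pages.get(block)
        if entry is not None:               # read hit: no disk operation
            entry["last"] = time.time()
            return entry["data"]
        data = self.backend_read(block)     # read miss: back end reads the disk
        self.pages[block] = {"data": data, "dirty": False, "last": time.time()}
        return data                         # data placed in cache, then returned

disk = {7: b"hello"}                        # stand-in for a physical disk
cache = SimpleReadCache(lambda blk: disk.get(blk, b""))
print(cache.read(7))                        # miss: fetched from "disk"
print(cache.read(7))                        # hit: served from cache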
• When a host issues sequential reads, the storage system can pre-fetch (read ahead) data into cache before it is requested. The intelligent storage system offers fixed and variable pre-fetch sizes.
• In fixed pre-fetch, the intelligent storage system pre-fetches a fixed amount of data. It is
most suitable when I/O sizes are uniform.
• In variable pre-fetch, the storage system pre-fetches an amount of data in multiples of the
size of the host request.
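The two pre-fetch policies can be contrasted in a short sketch; the 256 KB fixed size and the multiple of 4 are assumed values, not vendor defaults.

def prefetch_size(policy, host_io_size_kb, fixed_kb=256, multiple=4):
    """Return how much data to read ahead, in KB.
    'fixed'    -> always the same amount (suits uniform I/O sizes)
    'variable' -> a multiple of the host request size"""
    if policy == "fixed":
        return fixed_kb
    if policy == "variable":
        return multiple * host_io_size_kb
    raise ValueError("unknown pre-fetch policy")

print(prefetch_size("fixed", 8))       # 256 KB regardless of the request size
print(prefetch_size("variable", 8))    # 32 KB for an 8 KB host request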
Cache Management:
Least Recently Used (LRU): An algorithm that continuously monitors data access in cache
and identifies the cache pages that have not been accessed for a long time. LRU either frees up
these pages or marks them for reuse. This algorithm is based on the assumption that data which
hasn’t been accessed for a while will not be requested by the host.
Most Recently Used (MRU): In MRU, the pages that have been accessed most recently are freed up or marked for reuse. This algorithm is based on the assumption that recently accessed data may not be required for a while.
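A minimal LRU sketch (using Python's OrderedDict) shows how the least recently used page is the one freed when the cache is full; the page-count capacity and page identifiers are assumptions for illustration.

from collections import OrderedDict

class LRUCachePages:
    """Frees the least recently used page when the cache is full."""

    def __init__(self, capacity_pages):
        self.capacity = capacity_pages
        self.pages = OrderedDict()           # ordered from least to most recent

    def access(self, page_id, data=None):
        if page_id in self.pages:
            self.pages.move_to_end(page_id)  # mark as most recently used
            return self.pages[page_id]
        if len(self.pages) >= self.capacity:
            self.pages.popitem(last=False)   # free the least recently used page
        self.pages[page_id] = data
        return data

cache = LRUCachePages(capacity_pages=2)
cache.access("A", b"a")
cache.access("B", b"b")
cache.access("A")                # "A" becomes the most recently used page
cache.access("C", b"c")          # frees "B", the least recently used page
print(list(cache.pages))         # ['A', 'C']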
• On the basis of the I/O access rate and pattern, high and low levels called watermarks
are set in cache to manage the flushing process.
• High watermark (HWM) is the cache utilization level at which the storage system
starts high-speed flushing of cache data.
• Low watermark (LWM) is the point at which the storage system stops flushing data
to the disks.
• The cache utilization level, as shown in Fig 1.24, drives the mode of flushing to be used:
Idle flushing: Occurs continuously, at a modest rate, when the cache utilization
level is between the high and low watermark.
High watermark flushing: Activated when cache utilization hits the high
watermark. The storage system dedicates some additional resources for flushing.
This type of flushing has some impact on I/O processing.
Forced flushing: Occurs in the event of a large I/O burst when cache reaches 100 percent of its capacity, which significantly affects the I/O response time. In forced flushing, the system flushes the cache on priority by allocating more resources.
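The choice of flushing mode from the cache utilization level can be sketched as follows; the 40 percent and 80 percent watermark values are assumptions, since actual watermarks are set per system.

def flushing_mode(utilization, lwm=0.4, hwm=0.8):
    """Pick the flushing mode from the cache utilization level (0.0 - 1.0),
    using assumed low/high watermarks of 40% and 80%."""
    if utilization >= 1.0:
        return "forced flushing"          # cache full: flush on priority
    if utilization >= hwm:
        return "high watermark flushing"  # dedicate extra resources to flushing
    if utilization > lwm:
        return "idle flushing"            # continuous flushing at a modest rate
    return "no flushing"                  # below the low watermark

for u in (0.3, 0.6, 0.9, 1.0):
    print(u, "->", flushing_mode(u))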
The risk of data loss due to power failure can be addressed in various ways:
• Powering the memory with a battery until the AC power is restored.
• Using battery power to write the cache content to the disks.
• If an extended power failure occurs, using batteries is not a viable option.
• This is because in intelligent storage systems, large amounts of data might need to
be committed to numerous disks, and batteries might not provide power for
sufficient time to write each piece of data to its intended disk.
• Storage vendors use a set of physical disks to dump the contents of cache during
power failure. This is called cache vaulting and the disks are called vault drives.
• When power is restored, data from these disks is written back to write cache and
then written to the intended disks.
Back End:
• The back end provides an interface between cache and the physical disks.
• It consists of two components:
i. Back-end ports
ii. Back-end controllers
Physical Disk:
• A physical disk stores data persistently.
• Physical disks are connected to the back-end storage controller and provide persistent
data storage.
• Modern intelligent storage systems provide support to a variety of disk drives
with different speeds and types, such as FC, SATA, SAS, and flash drives.
• They also support the use of a mix of flash, FC, or SATA within the same array.
• In an active-passive array, a host can perform I/Os to a LUN only through the paths to the owning controller of that LUN. These paths are called active paths. The other paths are passive with respect to this LUN.
• For example, if controller A owns a LUN, the path to controller B remains passive for that LUN and no I/O activity is performed through this path.
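Path selection in an active-passive array can be sketched as below; the controller names, LUN ownership map, and path labels are hypothetical.

# Hypothetical LUN-ownership map: I/O to a LUN must use a path to its
# owning controller; paths to the other controller stay passive.
LUN_OWNER = {"LUN0": "Controller_A", "LUN1": "Controller_B"}
PATHS = {"Controller_A": ["path_A1", "path_A2"],
         "Controller_B": ["path_B1", "path_B2"]}

def active_paths(lun):
    """Return the active paths for a LUN (those to its owning controller)."""
    return PATHS[LUN_OWNER[lun]]

print(active_paths("LUN0"))   # paths through Controller A; Controller B stays passive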
• Midrange storage systems are typically designed with two controllers, each of which
contains host interfaces, cache, RAID controllers, and disk drive interfaces.
• Midrange arrays are designed to meet the requirements of small and medium enterprise applications; therefore, they host less storage capacity and cache than high-end storage arrays.
• There are also fewer front-end ports for connection to hosts.
• But they ensure high redundancy and high performance for applications with
predictable workloads.
• They also support array-based local and remote replication.
Node Ports
• In Fibre Channel, devices such as hosts, storage arrays, and tape libraries are all referred to as nodes.
• Each node is a source or destination of information for one or more nodes.
• Each node requires one or more ports to provide a physical interface for
communicating with other nodes.
• A port operates in full-duplex data transmission mode with a transmit (Tx) link and a receive (Rx) link.
Interconnect Devices
• Hubs, switches, and directors are the interconnect devices commonly used in SAN.
• Hubs are used as communication devices in FC-AL implementations. Hubs physically
connect nodes in a logical loop or a physical star topology.
• Switches are more intelligent than hubs and directly route data from one physical port to
another. Therefore, nodes do not share the bandwidth.
• Directors are larger than switches and are deployed for data center implementations.
• The function of directors is similar to that of FC switches, but directors have higher port
count and fault tolerance capabilities.
Storage Arrays
• The fundamental purpose of a SAN is to provide host access to storage resources.
• The large storage capacities offered by modern storage arrays have been exploited in
SAN environments for storage consolidation and centralization.
SAN Management Software
• SAN management software manages the interfaces between hosts, interconnect devices,
and storage arrays.
• The software provides a view of the SAN environment and enables management of
various resources from one central console.