
BITS Pilani, Pilani Campus
WILP presentation
Sourish Banerjee

Data Storage Technology and Networks (Merged - CSIZC446/ESZC446/ISZC446/SSZC446)

CS 01 & 02
Books

T1: Storage Networking: Real World Skills for the CompTIA Storage+ Certification and Beyond, Nigel Poulton. SYBEX, a Wiley brand, 2015.

T2: Storage Networks Explained, Ulf Troppens, Wolfgang Muller-Freidt, Rainer Wolafka, IBM Storage Software Development, Germany. Wiley.

R1: Storage Networks: The Complete Reference, Robert Spalding. TMH.

R2: Web resource: http://www.snia.org

BITS Pilani, Pilani Campus


Slide references

“Storage Networks Explained”, Ulf Troppens, Wolfgang Muller-Freidt, Rainer Wolafka, IBM Storage Software Development, Germany. Wiley.



Data or Information vs File
From the Wikipedia
• Data is a set of values of subjects with respect to qualitative or quantitative
variables.
• Data and information or knowledge are often used interchangeably; however, data becomes information when it is viewed in context or in post-analysis.
• Data is measured, collected and reported, and analyzed, whereupon it can
be visualized using graphs, images or other analysis tools.
• Raw data ("unprocessed data") is a collection of numbers or characters
before it has been "cleaned" and corrected by researchers.
• A computer file is a computer resource for recording data discretely in a
computer storage device.

• Question: Can we safely use the words “data” and “file” interchangeably, in the context of computer, OS & specifically storage?



Data to the file.
File is everywhere.
• Going by the context file is either the container (of your
data), or the data itself.
• File is in your disk, and in the memory as well.

• Little food for thought.


• Think of Pi (π) 3.14159265359
• Once stored in the file what is Pi ?? π OR 3.14159265359
• Once read and loaded into memory (RAM) is it 3.14159265359 OR
11.001001000011111101101010100010001000010110100011…
• There is something about data …
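The point about representation can be made concrete in a few lines of Python (a sketch; the variable names are illustrative):

```python
import struct

pi = 3.14159265359
# As text in a file, pi is the 13 characters "3.14159265359" (13 bytes);
# loaded into memory as a float, it is a 64-bit IEEE-754 bit pattern.
raw = struct.pack(">d", pi)                    # big-endian double, 8 bytes
bits = "".join(f"{b:08b}" for b in raw)
print(len(str(pi)), "bytes as text,", len(raw), "bytes as a double")
print(bits[:12])   # sign bit + biased exponent: 010000000000
```

The same value is "3.14159265359" on disk, a 64-bit pattern in RAM: there really is something about data.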
File
• What is a file ?
• Data organized by name
• What can you do with a file?
• Think actions … most primitive actions
• Read Write
• Create Delete Rename
• Move ????
• Truncate
• Is copy/paste a file operation?
• Who provides all you can do with a file?
• File system
• Who provides the File system?
• Operating System
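The primitive actions above map directly onto OS-provided file-system calls; a minimal Python sketch (file names are illustrative):

```python
import os
import tempfile

d = tempfile.mkdtemp()
path = os.path.join(d, "demo.txt")

with open(path, "w") as f:        # create + write
    f.write("hello storage")
with open(path) as f:             # read
    data = f.read()
os.truncate(path, 5)              # truncate: file now holds just "hello"
new_path = os.path.join(d, "renamed.txt")
os.rename(path, new_path)         # rename (a "move" within one file system)
os.remove(new_path)               # delete
os.rmdir(d)
```

Note that copy/paste is not a primitive here: applications build it from a read of one file plus a create/write of another.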



File Systems

What makes a file system ?


• File

What does the file system provide ?


• Storage
• Availability
• Access
• Sharing



Problems, if at all

• Tightly coupled with the hardware


• Not scalable
• Performance
• Data unavailability
• Data Loss
• Security

Solution ??
✓ Think big, large … very large “Distributed File Systems”
that involve both STORAGE & NETWORK.



But First

Differentiate between
• Data Storage and
• Data Access

Let's talk about both.



Files again

Types: differentiated based on the data.


• Structured : Structured sequence of data
• Non indexed records
• Indexed records
• Unstructured : Unstructured sequence of data

Should the Operating system really care ?


➢ Most modern OSs see files as unstructured data

Why ?
➢ OSs are not about data, applications are.



Data in a file

Think of the XML file.


Observe the difference

Depends on who is consuming the data


What else .. Other than data
Attributes
• Read-only - Allows a file to be read, but nothing can be written to the file or
changed.
• Archive - Tells Windows Backup to backup the file.
• System - System file.
• Hidden - File will not be shown when doing a regular dir from DOS.
• Read - Designated as an "r"; allows a file to be read, but nothing can be
written to or changed in the file.
• Write - Designated as a "w"; allows a file to be written to and changed.
• Execute - Designated as an "x"; allows a file to be executed by users or the
operating system.
Question :
Who does the attribute actually belong to ?
Who manages these attributes ?
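On POSIX systems the read/write/execute attributes above live in the file's metadata (the inode), not in the file's data, and are managed by the file system in the OS; a small Python sketch:

```python
import os
import stat
import tempfile

fd, path = tempfile.mkstemp()
os.close(fd)

os.chmod(path, 0o644)                    # rw-r--r--
st = os.stat(path)                       # attributes come from the file system, not the file data
print(stat.filemode(st.st_mode))         # -rw-r--r--
assert st.st_mode & stat.S_IRUSR         # owner may read
assert not (st.st_mode & stat.S_IXUSR)   # not executable
os.remove(path)
```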



Mutable & Immutable file

Can you modify a file? ….. YES you may!

Once modified, the file loses its original flavor.

But why immutable?
• Ease of sharing
• Ease of caching
• Ease of replication
Can we have both ? YES .. But costs
• MVFS



File access models

Accessibility of the data in a DFS is of paramount


importance.
• Remote service model
• Data caching Model
• Better performance but
• Cache consistency problem

➢ Typical implementations do a hybrid of Remote & Data


caching models.
➢ NFS uses the Remote service model, but uses caching for
better performance.



Unit of data transfer
▪ What is the smallest amount of Data ? 1 bit
▪ What is the smallest unit of storage of data?
➢ You write 1 byte, but store 1 block !!!!!

▪ So how should we transfer data?


➢ File level transfer
➢ Move the whole file
➢ Saves network trips
➢ Scalable, because of fewer trips to the server
➢ Optimized disk access
➢ Typically immune to network issue
➢ Problems? Consistency; not scalable for large files
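The "write 1 byte, store 1 block" point can be observed directly with stat (the allocation unit varies by file system; 4096 bytes is common):

```python
import os
import tempfile

fd, path = tempfile.mkstemp()
os.write(fd, b"x")                # write a single byte
os.close(fd)

st = os.stat(path)
print("logical size:", st.st_size, "byte")          # 1
print("allocated:", st.st_blocks * 512, "bytes")    # st_blocks counts 512-byte units
os.remove(path)
```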



Unit of data transfer cont.

➢ Block level transfer


– File is transferred as blocks
– A block is a contiguous portion of the file (not necessarily the FS block)
– Works well when only some part of the file is needed
– Poor when the whole file has to be shipped to the client side
– Other standard problems with network latency
➢ Byte level transfer
–Transfer happens as units of sequence of bytes.
–Flexible, but
–Non scalable
–Does not do well with caching, resulting in poor performance.



Unit of data transfer cont.

➢ Record level transfer


• Suits only structured files
• Does not work for unstructured files



DAS, NAS & SAN
DAS Direct Attached Storage (DAS) is storage that is directly
connected to a server without a storage network, for example over
SCSI (Small computer system interface) or SSA (Serial Storage
Architecture)

NAS Network Attached Storage (NAS) refers to the product category of


preconfigured file servers. NAS servers consist of one or more
internal servers, preconfigured disk capacity and usually a
stripped-down or special operating system.

SAN is the abbreviation for ‘Storage Area Network’. Very often ‘storage
area networks’ or ‘SANs’ are equated with Fibre Channel
technology. The advantages of storage area networks can, however,
also be achieved with alternative technologies such as for example
iSCSI. Therefore always state the transmission technology with
which a storage area network is realized, for example Fibre Channel
SAN or iSCSI SAN.
DAS, NAS & SAN cont.

* Figure captured from the slides of the recorded lecture.



DAS, NAS & SAN cont.

                           DAS                  NAS                  SAN
Storage Type               sectors              shared files         blocks
Data Transmission          IDE/SCSI             TCP/IP, Ethernet     Fibre Channel
Access Mode                clients or servers   clients or servers   servers
Capacity (bytes)           10^9                 10^9 – 10^12         > 10^12
Complexity                 Easy                 Moderate             Difficult
Management Cost (per GB)   High                 Moderate             Low

* Table captured from the slides of the recorded lecture.



SERVER-CENTRIC IT ARCHITECTURE

Think DAS
Desktops/Laptops with attached hard disk(s).
– Pros:
• Direct access, no intermediary.
• Least number of points of failure.
• Reasonable performance.
• Can do data consolidation !! But is it really a Pro?
– Cons:
• Zero redundancy.
• Single failure can kill the whole system.
• Almost total absence of failover mechanisms.
• Cannot be everywhere



SERVER-CENTRIC IT ARCHITECTURE

• Think Network
• NAS (DNAS)
• SAN

Why can't DAS scale up? Devices can now store TBs of data, after all.

Answer: Think of an enterprise.



SERVER-CENTRIC IT ARCHITECTURE
LIMITATIONS

• Data stored over some networked location.
• Storage device(s) are connected to one or more server(s).
• Storage device(s) exist only in relation to the server(s) to which they are connected.
• Other server(s) cannot directly access the device(s);
  • they always have to go through the server that is connected to the storage device.



SERVER-CENTRIC IT ARCHITECTURE
LIMITATIONS

• Failure at any point of access/connect will cripple the entire data access.
• Typical case of data safely stored, but unavailable.
• Impact ?

• Network or SCSI (Small Computer System Interface)?


• Performance Vs Distance
• Conventional technologies are therefore no longer sufficient to satisfy
the growing demand for storage capacity



SERVER-CENTRIC IT ARCHITECTURE
LIMITATIONS

• Lack/absence of uniform utilization.
• Solution: more software/hardware.
  • Increase in attack surface.
  • Lack of uniform physical safeguards.



STORAGE-CENTRIC IT ARCHITECTURE
AND ITS ADVANTAGES

• Solves limitations imposed by DAS.
• Storage networks open up new possibilities for data management.
• The SCSI cable is replaced by a network.
• Think Mode: iSCSI



STORAGE-CENTRIC IT ARCHITECTURE
AND ITS ADVANTAGES

Storage Networks:
• In storage networks storage devices exist completely
independently of any computer.
• Several servers can access the same storage device
directly over the storage network without another server
having to be involved.
• Storage devices are also consolidated, which involves
replacing the many small hard disks attached to the
computers with a large disk subsystem.
• Recall enterprise.



CS 2



Intelligent Disk Subsystems

Disk (Device) considerations


– Sizes (capacity)
– Cost
– Performance
– Durability
– Management
– Maintenance



ARCHITECTURE OF INTELLIGENT DISK
SUBSYSTEMS

• Think File Server … not an intelligent storage!
• Intelligent + hard disk server/sub-system.
• Hard disk server .. like a repo of hard disks storing data.
• Connection made over SCSI or FC, or over the network (iSCSI).

Figure 2.1: Servers are connected to a disk subsystem using standard I/O techniques. The figure shows a server that is connected by SCSI. Two others are connected by Fibre Channel SAN.



ARCHITECTURE OF INTELLIGENT DISK
SUBSYSTEMS

The internal structure of the disk subsystem is completely hidden from the server.

The server only sees the hard disks that the disk subsystem provides.

The connection ports are extended to the hard disks of the disk subsystem by means of internal I/O channels (Figure 2.2).

Figure 2.2: Servers are connected to the disk subsystems via the ports. Internally, the disk subsystem consists of hard disks, a controller, a cache and internal I/O channels.



ARCHITECTURE OF INTELLIGENT DISK
SUBSYSTEMS

Controller
• In most disk subsystems there is a controller between the connection ports and the hard disks.
• The controller can significantly increase the data availability and data access performance with the aid of a so-called RAID procedure.
• It realizes the copying services instant copy and remote mirroring, and further additional services.
• It can act as a cache in an attempt to accelerate read and write accesses for the server.



ARCHITECTURE OF INTELLIGENT
DISK SUBSYSTEMS

• Disk subsystems are available in all


sizes.
• Small disk subsystems have one to two connection
ports for servers or storage networks, six to eight hard
disks and, depending on the disk capacity, storage
capacity of a few terabytes.
• Large disk subsystems have several tens of connection ports for servers and storage networks, redundant controllers and multiple I/O channels.



ARCHITECTURE OF INTELLIGENT
DISK SUBSYSTEMS

• Most disk subsystems have the advantage that free disk space can be flexibly assigned to each server connected to the disk subsystem (storage pooling).

• Free storage capacity should be understood to mean both hard disks that have already been installed and have not yet been used, and also free slots for hard disks that have yet to be installed.



HARD DISKS AND INTERNAL I/O CHANNELS

• Overall capacity is a function of the number of disks and the capacities of the disks.
• The controller of the disk subsystem must ultimately store all data on physical hard disks.
• Performance is a function of parallel access and exclusivity (lack of contention).
• Maximum performance vs maximum capacity:
  • With regard to performance it is often beneficial to use smaller hard disks at the expense of the maximum capacity.
  • The best choice is left to the consumer.





HARD DISKS AND INTERNAL I/O CHANNELS

• Internal I/O channels


• between connection ports and controller
• between controller and internal hard disks
• SCSI,
• Fibre Channel,
• Increasingly Serial ATA (SATA)
• Serial Attached SCSI (SAS)
• Serial Storage Architecture (SSA) (to a limited extent)
• proprietary – i.e., manufacturer-specific – I/O techniques

• Fault-tolerance provided by built-in redundancy



HARD DISKS AND INTERNAL I/O CHANNELS

In active cabling the individual physical hard disks are only connected via one I/O channel. If this access path fails, then it is no longer possible to access the data.

In active/passive cabling the individual hard disks are connected via two I/O channels. In normal operation the controller communicates with the hard disks via the first I/O channel and the second I/O channel is not used. In the event of the failure of the first I/O channel, the disk subsystem switches from the first to the second I/O channel.



HARD DISKS AND INTERNAL I/O CHANNELS

In active/active (no load sharing) cabling the controller uses both I/O channels in normal operation. The hard disks are divided into two groups: in normal operation the first group is addressed via the first I/O channel and the second via the second I/O channel. If one I/O channel fails, both groups are addressed via the other I/O channel.

In the active/active (load sharing) approach all hard disks are addressed via both I/O channels in normal operation. The controller divides the load dynamically between the two I/O channels so that the available hardware can be optimally utilized. If one I/O channel fails, then the communication goes through the other channel only.
Disk subsystem types

Disk subsystems are segregated based on the type (including presence or absence) of the controller:
• No controller
• RAID controller
• Intelligent controller with additional services such as
instant copy and remote mirroring



JBOD: JUST A BUNCH OF DISKS

• If the disk subsystem has no internal controller, it is only an enclosure full of disks (JBOD).
• the hard disks are permanently fitted into the enclosure
• the connections for I/O channels and power supply are taken
outwards at a single point.

• JBOD is simpler to manage than a few loose hard disks.

• Typical JBOD disk subsystems have space for 8 or 16


hard disks.
• A connected server recognizes all these hard disks as
independent disks.
• Therefore, 16 device addresses are required for a JBOD disk
subsystem incorporating 16 hard disks.
JBODS: Limitations

In some I/O techniques such as SCSI and Fibre Channel


arbitrated loop, this can lead to a bottleneck at device
addresses.
In contrast to intelligent disk subsystems, a JBOD disk
subsystem in particular is not capable of supporting
RAID or other forms of virtualization.
– If required, however, these can be realized outside the JBOD
disk subsystem, for example, as software in the server or as an
independent virtualization entity in the storage.



RAID

• A disk subsystem with a RAID controller offers greater


functional scope than a JBOD disk subsystem.
• RAID was originally developed at a time when hard disks
were still very expensive and less reliable than they are
today.
• RAID was originally called ‘Redundant Array of
Inexpensive Disks’, today RAID stands for ‘Redundant
Array of Independent Disks’.
• RAID has two main goals:
• to increase performance by striping and
• to increase fault-tolerance by redundancy.



RAID

Striping distributes the data over several hard disks and thus distributes the load over more hardware.

Redundancy means that additional information is stored so that the operation of the application itself can continue in the event of the failure of a hard disk.

Individual physical hard disks are slow and have a limited life-cycle. However, through a suitable combination of physical hard disks it is possible to significantly increase the fault-tolerance and performance of the system as a whole.



Resume here in CS3



STORAGE VIRTUALISATION USING RAID

The bundle of physical hard disks brought together by the RAID controller is also known as a virtual hard disk.

A server that is connected to a RAID system sees only the virtual hard disk; the fact that the RAID controller actually distributes the data over several physical hard disks is completely hidden from the server.



STORAGE VIRTUALISATION USING RAID

• A RAID controller can distribute the data that a server writes to the
virtual hard disk amongst the individual physical hard disks in
various manners (RAID Levels).
• One factor common to almost all RAID levels is that they store
redundant information.
• If a physical hard disk fails, its data can be reconstructed from the
hard disks that remain intact.
• The defective hard disk can even be replaced by a new one during
operation if a disk subsystem has the appropriate hardware.
• Then the RAID controller reconstructs the data of the exchanged
hard disk.
• This process remains hidden from the server, apart from a possible reduction in performance: the server can continue to work uninterrupted on the virtual hard disk.



STORAGE VIRTUALISATION USING RAID

• Modern RAID controllers initiate the replacement of failed disks automatically, by using a hot spare.
• The hot spare disks are not used in normal operation.
• If a disk fails, the RAID controller immediately begins to
copy the data of the remaining intact disk onto a hot
spare disk.
• After the replacement of the defective disk, this is
included in the pool of hot spare disks.
• Modern RAID controllers can manage a common pool of
hot spare disks for several virtual RAID disks.
• Hot spare disks can be defined for all RAID levels that
offer redundancy.
STORAGE VIRTUALISATION USING RAID

Hot spare disk:


1. The disk subsystem provides the server with two virtual disks for which a common hot
spare disk is available
2. Due to the redundant data storage the server can continue to process data even though a
physical disk has failed, at the expense of a reduction in performance
3. The RAID controller recreates the data from the defective disk on the hot spare disk
4. After the defective disk has been replaced a hot spare disk is once again available



Raid Levels

RAID 0: Block-by-block striping
RAID 1: Block-by-block mirroring
RAID 0+1 / RAID 10: Striping and mirroring combined
RAID 4 & RAID 5
RAID 6: Double parity
RAID 2
RAID 3



RAID 0: block-by-block striping

• RAID 0 distributes the data that


the server writes to the virtual
hard disk onto the physical hard
disks one after another block-by-
block (block-by-block striping).

• the server writes the blocks A, B,


C, D, E, etc. onto the virtual hard
disk one after the other

RAID 0 (striping): As in all RAID levels, the server sees only the virtual hard disk. The RAID
controller distributes the write operations of the server amongst several physical hard disks.
Parallel writing means that the performance of the virtual hard disk is higher than that of the
individual physical hard disks.
RAID 0: block-by-block striping cont.

• RAID 0 increases the performance of the virtual hard disk because


the individual hard disks can exchange data with the RAID controller
via the I/O channel significantly more quickly than they can write to
or read from the rotating disk.

• Whilst the first disk is writing the first block to the physical hard disk,
the RAID controller is already sending the second block, block B, to
the second hard disk and block C to the third hard disk.

• RAID 0 increases the performance of the virtual hard disk, but not its
fault-tolerance. If a physical hard disk is lost, all the data on the
virtual hard disk is lost.

• The ‘R’ for ‘Redundant’ in RAID is incorrect in the case of RAID 0,


with ‘RAID 0’ standing instead for ‘zero redundancy’.
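The block-to-disk mapping of block-by-block striping can be sketched in a few lines (a simplification; real controllers stripe in larger chunk sizes):

```python
def raid0_map(virtual_block: int, n_disks: int) -> tuple[int, int]:
    """Map a virtual-disk block number to (physical disk, block on that disk)."""
    return virtual_block % n_disks, virtual_block // n_disks

# Blocks A=0, B=1, C=2, D=3, E=4 striped over three disks:
print([raid0_map(b, 3) for b in range(5)])
# [(0, 0), (1, 0), (2, 0), (0, 1), (1, 1)]
```

Note the absence of any redundant copy in the mapping: losing one physical disk loses every third block of the virtual disk.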
RAID 1: block-by-block mirroring

• In RAID 1 fault-tolerance is of
primary importance.

• RAID 1 brings together two (or


more) physical hard disks to form
a virtual hard disk by mirroring the
data on the physical hard disks.

• If the server writes a block to the


virtual hard disk, the RAID
controller writes this block to all
physical hard disks.
RAID 1 (mirroring): The RAID controller duplicates each of the server’s write
operations onto two physical hard disks. After the failure of one physical hard disk
the data can still be read from the other disk.
RAID 0+1/RAID 10:
Striping and Mirroring combined
• Problem: RAID 0 increases performance, while RAID 1 increases fault-tolerance. Neither does both.

Solution is to combine the ideas of RAID 0 and RAID 1.

• RAID 0+1 and RAID 1+0 (10) each represent a two-


stage virtualization hierarchy.

• In both RAID 0+1 and RAID 10 the server sees only a single hard
disk, which is larger, faster and more fault-tolerant than a physical
hard disk. Question is which of the two RAID levels, RAID 0+1 or
RAID 10, is preferable?



RAID 0 + 1 (mirrored stripes)

• Using eight physical hard disks, in


the first level the RAID controller
initially brings together each four
physical hard disks to form a total
of two virtual hard disks that are
only visible within the RAID
controller by means of RAID 0
(striping).

• In the second level, it consolidates


these two virtual hard disks into a
single virtual hard disk by means
of RAID 1 (mirroring); only this
virtual hard disk is visible to the
server.



RAID 10 (striped mirrors)

In the first stage the RAID controller


initially brings together the physical
hard disks in pairs by means of
RAID 1 (mirroring) to form a total of
four virtual hard disks that are only
visible within the RAID controller.

In the second stage, the RAID


controller consolidates these four
virtual hard disks into a virtual hard
disk by means of RAID 0 (striping).



RAID 0+1 & RAID 10 cont.
In both RAID 0+1 and RAID 10 the server sees only a single hard disk, which is larger,
faster and more fault-tolerant than a physical hard disk. Question is which of the two
RAID levels, RAID 0+1 or RAID 10, is preferable?

• The consequences of the failure of a physical hard disk


in RAID 0+1 (mirrored stripes) are relatively high in
comparison to RAID 10 (striped mirrors).
• The failure of a physical hard disk brings about the failure of the corresponding internal RAID 0 disk, so that in effect half of the physical hard disks have failed.
• The recovery of the data from the failed disk is expensive.

✓ In RAID 10 (striped mirrors) the consequences of the failure


of a physical hard disk are not as serious as in RAID 0+1
(mirrored stripes).
✓ All virtual hard disks remain intact.
✓ The recovery of the data from the failed hard disk is simple.
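The difference can be checked by brute force over all two-disk failures of an eight-disk array (a sketch; the halves and mirrored pairs are assumed laid out as in the figures):

```python
from itertools import combinations

DISKS = range(8)

def raid01_survives(failed: set) -> bool:
    # RAID 0+1: two internal RAID 0 halves (disks 0-3 and 4-7), mirrored.
    # Data survives as long as at least one half is completely intact.
    half_a, half_b = set(range(0, 4)), set(range(4, 8))
    return not (failed & half_a) or not (failed & half_b)

def raid10_survives(failed: set) -> bool:
    # RAID 10: four mirrored pairs, striped. Data survives as long as
    # every pair keeps at least one disk.
    pairs = [(0, 1), (2, 3), (4, 5), (6, 7)]
    return all(not (a in failed and b in failed) for a, b in pairs)

double = [set(c) for c in combinations(DISKS, 2)]
print(sum(raid01_survives(f) for f in double), "of", len(double))  # RAID 0+1: 12 of 28
print(sum(raid10_survives(f) for f in double), "of", len(double))  # RAID 10: 24 of 28
```

RAID 10 survives 24 of the 28 possible double failures, RAID 0+1 only 12, which quantifies why striped mirrors are preferable.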
RAID 4 and RAID 5:
Parity instead of mirroring
The idea of RAID 4 and RAID 5 is to replace all mirror disks of RAID 10 with
a single parity hard disk.
• Server writes the blocks A, B, C, D, E,
etc. to the virtual hard disk sequentially.

• RAID controller stripes the data blocks


over the first four physical hard disks.

• RAID controller calculates a parity block


for every four blocks (PABCD) and writes
this onto the fifth physical hard disk.

• If one of the four data disks fails, the RAID controller can reconstruct the data of the defective disk using the three other data disks and the parity disk.
• The parity block is calculated with the aid of the logical XOR operator (Exclusive OR).
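A minimal sketch of the XOR parity and reconstruction (blocks shown as short byte strings):

```python
from functools import reduce

def xor_blocks(*blocks: bytes) -> bytes:
    """Byte-wise XOR of equally sized blocks."""
    return bytes(reduce(lambda x, y: x ^ y, t) for t in zip(*blocks))

A, B, C, D = b"AAAA", b"BBBB", b"CCCC", b"DDDD"
P_ABCD = xor_blocks(A, B, C, D)          # parity block written to the fifth disk

# Disk holding C fails: rebuild it from the three other data disks and parity.
assert xor_blocks(A, B, D, P_ABCD) == C
```

The reconstruction works because XOR-ing a block with itself cancels it out, leaving only the missing block.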
RAID 4 and RAID 5: Costs

• Changing a data block changes the value of the


associated parity block.
• Each write operation to the virtual hard disk requires
• the physical writing of the data block
• the recalculation of the parity block
• the physical writing of the newly calculated parity block
• This extra cost for write operations in RAID 4 and RAID
5 is called the write penalty of RAID 4 or 5 as
applicable.
• The recalculation cost of the parity block is relatively
low due to the mathematical properties of the XOR
operator.



RAID 4 and RAID 5: Costs cont.

• If the block A is overwritten by block A1 and D is the


difference between the old and new data block, then D =
A XOR A1.
• The new parity block P1 can now simply be calculated
from the old parity block P and D, i.e. P1 = P XOR D.
• Let PABCD be the parity block for the data blocks A, B, C and D; the new parity block PA1BCD can be calculated without reading the remaining blocks B, C & D.
• The old data block A and the old parity block P must be read into the controller, so that they can be used to calculate D.



RAID 4 and RAID 5: Costs cont.

Write penalty of RAID 4 and RAID 5:
The server writes a changed data block (observe the cache).
• The RAID controller reads in the old data block and the associated old parity block.
• It calculates the new parity block.
• It writes the new data block and the new parity block onto the physical hard disks.



RAID 4 and RAID 5: Performance
• Advanced RAID 4 and RAID 5 implementations are capable of reducing the write penalty even further.

• If large data quantities are written sequentially, then the RAID controller
can calculate the parity blocks from the data flow without reading the old
parity block from the disk.
• For example, say the blocks E, F, G and H are written in one go.
• The controller calculates the parity block PEFGH from them and overwrites it without having previously read in the old value.

• Furthermore, a RAID controller with a suitably large cache can hold frequently changed parity blocks in the cache after writing to the disk, so that the next time one of the data blocks in question is changed there is no need to read in the parity block.



RAID 4 & 5: Difference

• RAID 4 saves all parity blocks onto a single physical


hard disk.
• The problem with RAID 4 is that, while the write operations for the data blocks are distributed over four physical hard disks, the parity disk has to handle the same number of write operations all on its own, thereby becoming a performance bottleneck.

• RAID 5 distributes the parity blocks over all hard disks,


to get around the performance bottleneck of RAID 4 .
• E.g., PABCD is written on the 5th physical hard disk,
• PEFGH on the 4th physical hard disk,
• PIJKL on the 3rd hard disk.
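The rotating parity placement of RAID 5 can be sketched as follows (one common left-rotating scheme; actual layouts vary by implementation):

```python
def raid5_parity_disk(stripe: int, n_disks: int) -> int:
    """Disk (0-based) holding the parity block of a given stripe."""
    return (n_disks - 1 - stripe) % n_disks

# Five disks: P_ABCD on disk 4 (the 5th), P_EFGH on disk 3, P_IJKL on disk 2, ...
print([raid5_parity_disk(s, 5) for s in range(5)])   # [4, 3, 2, 1, 0]
```

Because the parity disk changes from stripe to stripe, parity writes are spread evenly over all disks instead of hammering one.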



RAID 4 & 5: Failure & Recovery

• Can withstand the failure of a physical hard disk up to a


certain extent.
• RAID controller has to read the data from all disks, use this
to recalculate the lost data blocks & parity blocks, and then
write these blocks to the replacement disk.
• If a parity block has to be restored, the RAID controller must first
read the blocks A, B, C and D from the physical hard disks,
recalculate the parity block PABCD and then write to the exchanged
physical hard disk.
• If a data block (say C) has to be restored, the controller would first
have to read in the blocks A, B, D and PABCD, use these to
reconstruct block C and write this to the replaced disk.



RAID 6: Double parity

• RAID 5 arrays cannot correct double failures.

• There is a rise in the probability of data loss in RAID 5 arrays due to the increasing capacity of hard disks.

• RAID 6 offers a compromise between RAID 5 and RAID


10 by adding a second parity hard disk to extend RAID 5.

• RAID 6 has a poor write performance because the write


penalty for RAID 5 strikes twice



RAID 2 and RAID 3
• In RAID 2 the Hamming code is used, so that redundant
information is stored in addition to the actual data.
• Earlier, bit errors occurred quite easily, which could lead to a written ‘1’ being read as ‘0’ or a written ‘0’ being read as ‘1’.
• This additional data permits the recognition of read errors and to some
degree also makes it possible to correct them.

• RAID 2 no longer has any practical significance, because comparable functions are performed by the controller of each individual hard disk.

• RAID 3 was for a long time the recommended RAID level for sequential write and read profiles such as data mining and video processing.
RAID 2 and RAID 3 cont.

• RAID 3 distributes the data of a block amongst all the hard disks, such that all disks are involved in every read or write access.
• RAID 3 only permits the reading and writing of whole blocks,
thus dispensing with the write penalty that occurs in RAID 4
and RAID 5.
• But this required the rotation of the individual hard disks to be synchronized in RAID 3, so that the data of a block can truly be written simultaneously.
• Not so commonly used any more
• hard disks come with a large cache of their own
• significantly higher rotation speeds
• other RAID levels are now suitable for sequential load profiles



Comparison of the RAID levels

Question: Which RAID level should be used when ?


Answer: There is no absolute answer.



Comparison of the RAID levels cont.

• Manufacturers of disk subsystems have design


options in
• selection of the internal physical hard disks;
• I/O technique used for the communication within the disk
subsystem;
• use of several I/O channels;
• realization of the RAID controller;
• size of the cache;
• cache algorithms themselves;
• behavior during rebuild; and
• provision of advanced functions such as data scrubbing
and preventive rebuild
Tools

• Notepad++
• WinSCP
• MobaXterm
• Linux VM (HyperV, Virtual Box, VMWare workstation
player)



LAB
• First how to use “man” .. Try “man man”
• man lsblk
• man fdisk
• man df
• man hwinfo
• man parted
• man cfdisk
• man sfdisk
• man smartctl
• man e4defrag

• Good references
• https://www.howtoforge.com/tutorial/linux-filesystem-defrag/
• https://www.thomas-krenn.com/en/wiki/Analyzing_a_Faulty_Hard_Disk_using_Smartctl



THANK YOU
