
BITS Pilani, Pilani Campus
WILP presentation
Sourish Banerjee

Data Storage Technology and Networks (Merged - CSIZC446/ESZC446/ISZC446/SSZC446)

CS 01 & 02
Books

T1: Storage Networking: Real World Skills for the CompTIA Storage+ Certification and Beyond, Nigel Poulton. SYBEX, a Wiley brand, 2015.

T2: Storage Networks Explained, Ulf Troppens, Wolfgang Muller-Freidt, Rainer Wolafka, IBM Storage Software Development, Germany. Wiley.

R1: Storage Networks: The Complete Reference, Robert Spalding. TMH.

R2: Web resource: http://www.snia.org

BITS Pilani, Pilani Campus


Slide references

“Storage Networks Explained”, Ulf Troppens, Wolfgang Muller-Freidt, Rainer Wolafka, IBM Storage Software Development, Germany. Wiley.



Data or Information vs File
From the Wikipedia
• Data is a set of values of subjects with respect to qualitative or quantitative
variables.
• Data and information or knowledge are often used interchangeably; however, data becomes information when it is viewed in context or in post-analysis.
• Data is measured, collected and reported, and analyzed, whereupon it can
be visualized using graphs, images or other analysis tools.
• Raw data ("unprocessed data") is a collection of numbers or characters
before it has been "cleaned" and corrected by researchers.
• A computer file is a computer resource for recording data discretely in a
computer storage device.

• Question: Can we safely use the words “data” and “file” interchangeably, in the context of computer, OS & specifically storage?



Data to the file.
File is everywhere.
• Going by the context file is either the container (of your
data), or the data itself.
• File is in your disk, and in the memory as well.

• Little food for thought.


• Think of Pi (π) 3.14159265359
• Once stored in the file what is Pi ?? π OR 3.14159265359
• Once read and loaded into memory (RAM) is it 3.14159265359 OR
11.001001000011111101101010100010001000010110100011…
• There is something about data …
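The point about representation can be made concrete in a few lines of Python (a sketch; the variable names are illustrative):

```python
import struct

pi = 3.14159265359
# As text in a file, pi is the 13 characters "3.14159265359" (13 bytes);
# loaded into memory as a float, it is a 64-bit IEEE-754 bit pattern.
raw = struct.pack(">d", pi)                    # big-endian double, 8 bytes
bits = "".join(f"{b:08b}" for b in raw)
print(len(str(pi)), "bytes as text,", len(raw), "bytes as a double")
print(bits[:12])   # sign bit + biased exponent: 010000000000
```

The same value is "3.14159265359" on disk, a 64-bit pattern in RAM: there really is something about data.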
File
• What is a file ?
• Data organized by name
• What can you do with a file?
• Think actions … most primitive actions
• Read Write
• Create Delete Rename
• Move ????
• Truncate
• Is copy/paste a file operation?
• Who provides all you can do with a file?
• File system
• Who provides the File system?
• Operating System
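The primitive actions above map directly onto OS-provided file-system calls; a minimal Python sketch (file names are illustrative):

```python
import os
import tempfile

d = tempfile.mkdtemp()
path = os.path.join(d, "demo.txt")

with open(path, "w") as f:        # create + write
    f.write("hello storage")
with open(path) as f:             # read
    data = f.read()
os.truncate(path, 5)              # truncate: file now holds just "hello"
new_path = os.path.join(d, "renamed.txt")
os.rename(path, new_path)         # rename (a "move" within one file system)
os.remove(new_path)               # delete
os.rmdir(d)
```

Note that copy/paste is not a primitive here: applications build it from a read of one file plus a create/write of another.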



File Systems

What makes a file system ?


• File

What does the file system provide ?


• Storage
• Availability
• Access
• Sharing



Problems, if at all

• Tightly coupled with the hardware


• Not scalable
• Performance
• Data unavailability
• Data Loss
• Security

Solution ??
✓ Think big, large … very large “Distributed File Systems”
that involve both STORAGE & NETWORK.



But First

Differentiate between
• Data Storage and
• Data Access

Let's talk about both.



Files again

Types: differentiated based on the data.


• Structured : Structured sequence of data
• Non indexed records
• Indexed records
• Unstructured : Unstructured sequence of data

Should the Operating system really care ?


➢ Most modern OSs see files as unstructured data

Why ?
➢ OSs are not about data, applications are.



Data in a file

Think of the XML file.


Observe the difference

Depends on who is consuming the data


What else .. Other than data
Attributes
• Read-only - Allows a file to be read, but nothing can be written to the file or
changed.
• Archive - Tells Windows Backup to backup the file.
• System - System file.
• Hidden - File will not be shown when doing a regular dir from DOS.
• Read - Designated as an "r"; allows a file to be read, but nothing can be
written to or changed in the file.
• Write - Designated as a "w"; allows a file to be written to and changed.
• Execute - Designated as an "x"; allows a file to be executed by users or the
operating system.
Question :
Who does the attribute actually belong to ?
Who manages these attributes ?
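On POSIX systems the read/write/execute attributes above live in the file's metadata (the inode), not in the file's data, and are managed by the file system in the OS; a small Python sketch:

```python
import os
import stat
import tempfile

fd, path = tempfile.mkstemp()
os.close(fd)

os.chmod(path, 0o644)                    # rw-r--r--
st = os.stat(path)                       # attributes come from the file system, not the file data
print(stat.filemode(st.st_mode))         # -rw-r--r--
assert st.st_mode & stat.S_IRUSR         # owner may read
assert not (st.st_mode & stat.S_IXUSR)   # not executable
os.remove(path)
```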



Mutable & Immutable file

Can you modify a file? ….. YES you may!

Once modified, the file loses its original flavor.

But why immutable?
• Ease of sharing
• Ease of caching
• Ease of replication
Can we have both ? YES .. But costs
• MVFS



File access models

Accessibility of the data in a DFS is of paramount


importance.
• Remote service model
• Data caching Model
• Better performance but
• Cache consistency problem

➢ Typical implementations do a hybrid of Remote & Data


caching models.
➢ NFS uses the Remote service model, but uses caching for
better performance.



Unit of data transfer
▪ What is the smallest amount of Data ? 1 bit
▪ What is the smallest unit of storage of data?
➢ You write 1 byte, but store 1 block !!!!!

▪ So how should we transfer data?


➢ File level transfer
➢ Move the whole file
➢ Saves network trips
➢ Scalable, because of fewer trips to the server
➢ Optimized disk access
➢ Typically immune to network issue
➢ Problems? Consistency; not scalable for large files
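The "write 1 byte, store 1 block" point can be observed directly with stat (the allocation unit varies by file system; 4096 bytes is common):

```python
import os
import tempfile

fd, path = tempfile.mkstemp()
os.write(fd, b"x")                # write a single byte
os.close(fd)

st = os.stat(path)
print("logical size:", st.st_size, "byte")          # 1
print("allocated:", st.st_blocks * 512, "bytes")    # st_blocks counts 512-byte units
os.remove(path)
```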



Unit of data transfer cont.

➢ Block level transfer


– File is transferred as blocks
– A block is a contiguous portion of the file (not necessarily the FS block)
– Works well when only some part of the file is needed
– Poor when the whole file has to be shipped to the client side
– Other standard problems with network latency
➢ Byte level transfer
–Transfer happens as units of sequence of bytes.
–Flexible, but
–Non scalable
–Does not do well with caching, resulting in poor performance.



Unit of data transfer cont.

➢ Record level transfer


• Suits only structured files
• Does not work for unstructured files



DAS, NAS & SAN
DAS Direct Attached Storage (DAS) is storage that is directly
connected to a server without a storage network, for example over
SCSI (Small computer system interface) or SSA (Serial Storage
Architecture)

NAS Network Attached Storage (NAS) refers to the product category of


preconfigured file servers. NAS servers consist of one or more
internal servers, preconfigured disk capacity and usually a
stripped-down or special operating system.

SAN is the abbreviation for ‘Storage Area Network’. Very often ‘storage
area networks’ or ‘SANs’ are equated with Fibre Channel
technology. The advantages of storage area networks can, however,
also be achieved with alternative technologies such as for example
iSCSI. Therefore always state the transmission technology with
which a storage area network is realized, for example Fibre Channel
SAN or iSCSI SAN.
DAS, NAS & SAN cont.

* Figure captured from the slides of the recorded lecture.



DAS, NAS & SAN cont.

                           DAS                  NAS                  SAN
Storage Type               sectors              shared files         blocks
Data Transmission          IDE/SCSI             TCP/IP, Ethernet     Fibre Channel
Access Mode                clients or servers   clients or servers   servers
Capacity (bytes)           10^9                 10^9 – 10^12         > 10^12
Complexity                 Easy                 Moderate             Difficult
Management Cost (per GB)   High                 Moderate             Low

* Table captured from the slides of the recorded lecture.



SERVER-CENTRIC IT ARCHITECTURE

Think DAS
Desktops/Laptops with attached hard disk(s).
– Pros:
• Direct access, no intermediary.
• Least number of points of failure.
• Reasonable performance.
• Can do data consolidation !! But is it really a Pro?
– Cons:
• Zero redundancy.
• Single failure can kill the whole system.
• Almost total absence of failover mechanisms.
• Cannot be everywhere



SERVER-CENTRIC IT ARCHITECTURE

• Think Network
• NAS (DNAS)
• SAN

Why can't DAS scale up? Devices can now store TBs of data, after all.

Answer: Think of an enterprise.



SERVER-CENTRIC IT ARCHITECTURE
LIMITATIONS

• Data stored over some networked location.
• Storage device(s) are connected to one or more server(s).
• Storage device(s) exist only in relation to the server(s) to which they are connected.
• Other server(s) cannot directly access the device(s);
  • they always have to go through the server that is connected to the storage device.



SERVER-CENTRIC IT ARCHITECTURE
LIMITATIONS

• Failure at any point of access/connect will cripple the entire data access.
• Typical case of data safely stored, but unavailable.
• Impact ?

• Network or SCSI (Small Computer System Interface)?


• Performance Vs Distance
• Conventional technologies are therefore no longer sufficient to satisfy
the growing demand for storage capacity



SERVER-CENTRIC IT ARCHITECTURE
LIMITATIONS

• Lack/absence of uniform utilization.
• Solution: more software/hardware.
  • Increase in attack surface.
  • Lack of uniform physical safeguards.



STORAGE-CENTRIC IT ARCHITECTURE
AND ITS ADVANTAGES

• Solves limitations imposed by DAS.
• Storage networks open up new possibilities for data management.
• The SCSI cable is replaced by a network.
• Think Mode: iSCSI



STORAGE-CENTRIC IT ARCHITECTURE
AND ITS ADVANTAGES

Storage Networks:
• In storage networks storage devices exist completely
independently of any computer.
• Several servers can access the same storage device
directly over the storage network without another server
having to be involved.
• Storage devices are also consolidated, which involves
replacing the many small hard disks attached to the
computers with a large disk subsystem.
• Recall enterprise.



CS 2



Intelligent Disk Subsystems

Disk (Device) considerations


– Sizes (capacity)
– Cost
– Performance
– Durability
– Management
– Maintenance



ARCHITECTURE OF INTELLIGENT DISK
SUBSYSTEMS

• Think File Server … not an intelligent storage!
• Intelligent + hard disk server/sub-system.
• Hard disk server .. like a repo of hard disks storing data.
• Connection made over SCSI or FC, or over the network (iSCSI).

Figure 2.1: Servers are connected to a disk subsystem using standard I/O techniques. The figure shows a server that is connected by SCSI. Two others are connected by Fibre Channel SAN.



ARCHITECTURE OF INTELLIGENT DISK
SUBSYSTEMS

The internal structure of the disk subsystem is completely hidden from the server.

The server only sees the hard disks that the disk subsystem provides.

The connection ports are extended to the hard disks of the disk subsystem by means of internal I/O channels (Figure 2.2).

Figure 2.2: Servers are connected to the disk subsystems via the ports. Internally, the disk subsystem consists of hard disks, a controller, a cache and internal I/O channels.



ARCHITECTURE OF INTELLIGENT DISK
SUBSYSTEMS

Controller
• In most disk subsystems there is a controller between the connection ports and the hard disks.
• The controller can significantly increase the data availability and data access performance with the aid of a so-called RAID procedure.
• It realizes the copying services instant copy and remote mirroring, and further additional services.
• It can act as a cache in an attempt to accelerate read and write accesses for the server.



ARCHITECTURE OF INTELLIGENT
DISK SUBSYSTEMS

• Disk subsystems are available in all


sizes.
• Small disk subsystems have one to two connection
ports for servers or storage networks, six to eight hard
disks and, depending on the disk capacity, storage
capacity of a few terabytes.
• Large disk subsystems have several tens of connection ports for servers and storage networks, redundant controllers and multiple I/O channels.



ARCHITECTURE OF INTELLIGENT
DISK SUBSYSTEMS

• Most disk subsystems have the advantage that free disk space can be flexibly assigned to each server connected to the disk subsystem (storage pooling).

• Free storage capacity should be understood to mean both hard disks that have already been installed and have not yet been used, and also free slots for hard disks that have yet to be installed.



HARD DISKS AND INTERNAL I/O CHANNELS

• Overall capacity is a function of the number of disks and the capacities of the disks.
• The controller of the disk subsystem must ultimately store all data on physical hard disks.
• Performance is a function of parallel access and exclusivity (lack of contention).
• Maximum performance vs maximum capacity:
  • With regard to performance it is often beneficial to use smaller hard disks at the expense of the maximum capacity.
  • The best choice is left to the consumer.





HARD DISKS AND INTERNAL I/O CHANNELS

• Internal I/O channels


• between connection ports and controller
• between controller and internal hard disks
• SCSI,
• Fibre Channel,
• Increasingly Serial ATA (SATA)
• Serial Attached SCSI (SAS)
• Serial Storage Architecture (SSA) (to a limited extent)
• proprietary – i.e., manufacturer-specific – I/O techniques

• Fault-tolerance provided by built-in redundancy



HARD DISKS AND INTERNAL I/O CHANNELS

In active cabling the individual physical hard disks are only connected via one I/O channel. If this access path fails, then it is no longer possible to access the data.

In active/passive cabling the individual hard disks are connected via two I/O channels. In normal operation the controller communicates with the hard disks via the first I/O channel and the second I/O channel is not used. In the event of the failure of the first I/O channel, the disk subsystem switches from the first to the second I/O channel.



HARD DISKS AND INTERNAL I/O CHANNELS

In active/active (no load sharing) cabling the controller uses both I/O channels in normal operation. The hard disks are divided into two groups: in normal operation the first group is addressed via the first I/O channel and the second via the second I/O channel. If one I/O channel fails, both groups are addressed via the other I/O channel.

In the active/active (load sharing) approach all hard disks are addressed via both I/O channels in normal operation. The controller divides the load dynamically between the two I/O channels so that the available hardware can be optimally utilized. If one I/O channel fails, then the communication goes through the other channel only.
Disk subsystem types

Disk subsystems are segregated based on the type (including presence or absence) of the controller:
• No controller
• RAID controller
• Intelligent controller with additional services such as
instant copy and remote mirroring



JBOD: JUST A BUNCH OF DISKS

• If the disk subsystem has no internal controller, it is only an enclosure full of disks (JBOD).
• the hard disks are permanently fitted into the enclosure
• the connections for I/O channels and power supply are taken
outwards at a single point.

• JBOD is simpler to manage than a few loose hard disks.

• Typical JBOD disk subsystems have space for 8 or 16


hard disks.
• A connected server recognizes all these hard disks as
independent disks.
• Therefore, 16 device addresses are required for a JBOD disk
subsystem incorporating 16 hard disks.
JBODS: Limitations

In some I/O techniques such as SCSI and Fibre Channel


arbitrated loop, this can lead to a bottleneck at device
addresses.
In contrast to intelligent disk subsystems, a JBOD disk
subsystem in particular is not capable of supporting
RAID or other forms of virtualization.
– If required, however, these can be realized outside the JBOD
disk subsystem, for example, as software in the server or as an
independent virtualization entity in the storage.



RAID

• A disk subsystem with a RAID controller offers greater


functional scope than a JBOD disk subsystem.
• RAID was originally developed at a time when hard disks
were still very expensive and less reliable than they are
today.
• RAID was originally called ‘Redundant Array of
Inexpensive Disks’, today RAID stands for ‘Redundant
Array of Independent Disks’.
• RAID has two main goals:
• to increase performance by striping and
• to increase fault-tolerance by redundancy.



RAID

Striping distributes the data over several hard disks and thus distributes the load over more hardware.

Redundancy means that additional information is stored so that the operation of the application itself can continue in the event of the failure of a hard disk.

Individual physical hard disks are slow and have a limited life-cycle. However, through a suitable combination of physical hard disks it is possible to significantly increase the fault-tolerance and performance of the system as a whole.



Resume here in CS3



STORAGE VIRTUALISATION USING RAID

The bundle of physical hard disks brought together by the RAID controller is also known as a virtual hard disk.

A server that is connected to a RAID system sees only the virtual hard disk; the fact that the RAID controller actually distributes the data over several physical hard disks is completely hidden from the server.



STORAGE VIRTUALISATION USING RAID

• A RAID controller can distribute the data that a server writes to the
virtual hard disk amongst the individual physical hard disks in
various manners (RAID Levels).
• One factor common to almost all RAID levels is that they store
redundant information.
• If a physical hard disk fails, its data can be reconstructed from the
hard disks that remain intact.
• The defective hard disk can even be replaced by a new one during
operation if a disk subsystem has the appropriate hardware.
• Then the RAID controller reconstructs the data of the exchanged
hard disk.
• This process remains hidden from the server, apart from a possible reduction in performance: the server can continue to work uninterrupted on the virtual hard disk.



STORAGE VIRTUALISATION USING RAID

• Modern RAID controllers initiate the replacement of failed disks automatically, by using a hot spare.
• The hot spare disks are not used in normal operation.
• If a disk fails, the RAID controller immediately begins to
copy the data of the remaining intact disk onto a hot
spare disk.
• After the replacement of the defective disk, this is
included in the pool of hot spare disks.
• Modern RAID controllers can manage a common pool of
hot spare disks for several virtual RAID disks.
• Hot spare disks can be defined for all RAID levels that
offer redundancy.
STORAGE VIRTUALISATION USING RAID

Hot spare disk:


1. The disk subsystem provides the server with two virtual disks for which a common hot
spare disk is available
2. Due to the redundant data storage the server can continue to process data even though a
physical disk has failed, at the expense of a reduction in performance
3. The RAID controller recreates the data from the defective disk on the hot spare disk
4. After the defective disk has been replaced a hot spare disk is once again available



Raid Levels

RAID 0: Block-by-block striping
RAID 1: Block-by-block mirroring
RAID 0+1 / RAID 10: Striping and mirroring combined
RAID 4 & RAID 5
RAID 6: Double parity
RAID 2
RAID 3



RAID 0: block-by-block striping

• RAID 0 distributes the data that


the server writes to the virtual
hard disk onto the physical hard
disks one after another block-by-
block (block-by-block striping).

• the server writes the blocks A, B,


C, D, E, etc. onto the virtual hard
disk one after the other

RAID 0 (striping): As in all RAID levels, the server sees only the virtual hard disk. The RAID
controller distributes the write operations of the server amongst several physical hard disks.
Parallel writing means that the performance of the virtual hard disk is higher than that of the
individual physical hard disks.
RAID 0: block-by-block striping cont.

• RAID 0 increases the performance of the virtual hard disk because


the individual hard disks can exchange data with the RAID controller
via the I/O channel significantly more quickly than they can write to
or read from the rotating disk.

• Whilst the first disk is writing the first block to the physical hard disk,
the RAID controller is already sending the second block, block B, to
the second hard disk and block C to the third hard disk.

• RAID 0 increases the performance of the virtual hard disk, but not its
fault-tolerance. If a physical hard disk is lost, all the data on the
virtual hard disk is lost.

• The ‘R’ for ‘Redundant’ in RAID is incorrect in the case of RAID 0,


with ‘RAID 0’ standing instead for ‘zero redundancy’.
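The block-to-disk mapping of block-by-block striping can be sketched in a few lines (a simplification; real controllers stripe in larger chunk sizes):

```python
def raid0_map(virtual_block: int, n_disks: int) -> tuple[int, int]:
    """Map a virtual-disk block number to (physical disk, block on that disk)."""
    return virtual_block % n_disks, virtual_block // n_disks

# Blocks A=0, B=1, C=2, D=3, E=4 striped over three disks:
print([raid0_map(b, 3) for b in range(5)])
# [(0, 0), (1, 0), (2, 0), (0, 1), (1, 1)]
```

Note the absence of any redundant copy in the mapping: losing one physical disk loses every third block of the virtual disk.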
RAID 1: block-by-block mirroring

• In RAID 1 fault-tolerance is of
primary importance.

• RAID 1 brings together two (or


more) physical hard disks to form
a virtual hard disk by mirroring the
data on the physical hard disks.

• If the server writes a block to the


virtual hard disk, the RAID
controller writes this block to all
physical hard disks.
RAID 1 (mirroring): The RAID controller duplicates each of the server’s write
operations onto two physical hard disks. After the failure of one physical hard disk
the data can still be read from the other disk.
RAID 0+1/RAID 10:
Striping and Mirroring combined
• Problem: RAID 0 increases performance, while RAID 1 increases fault-tolerance. Neither does both.

Solution is to combine the ideas of RAID 0 and RAID 1.

• RAID 0+1 and RAID 1+0 (10) each represent a two-


stage virtualization hierarchy.

• In both RAID 0+1 and RAID 10 the server sees only a single hard
disk, which is larger, faster and more fault-tolerant than a physical
hard disk. Question is which of the two RAID levels, RAID 0+1 or
RAID 10, is preferable?



RAID 0 + 1 (mirrored stripes)

• Using eight physical hard disks, in


the first level the RAID controller
initially brings together each four
physical hard disks to form a total
of two virtual hard disks that are
only visible within the RAID
controller by means of RAID 0
(striping).

• In the second level, it consolidates


these two virtual hard disks into a
single virtual hard disk by means
of RAID 1 (mirroring); only this
virtual hard disk is visible to the
server.



RAID 10 (striped mirrors)

In the first stage the RAID controller


initially brings together the physical
hard disks in pairs by means of
RAID 1 (mirroring) to form a total of
four virtual hard disks that are only
visible within the RAID controller.

In the second stage, the RAID


controller consolidates these four
virtual hard disks into a virtual hard
disk by means of RAID 0 (striping).



RAID 0+1 & RAID 10 cont.
In both RAID 0+1 and RAID 10 the server sees only a single hard disk, which is larger,
faster and more fault-tolerant than a physical hard disk. Question is which of the two
RAID levels, RAID 0+1 or RAID 10, is preferable?

• The consequences of the failure of a physical hard disk


in RAID 0+1 (mirrored stripes) are relatively high in
comparison to RAID 10 (striped mirrors).
• The failure of a physical hard disk brings about the failure of the corresponding internal RAID 0 disk, so that in effect half of the physical hard disks have failed.
• The recovery of the data from the failed disk is expensive.

✓ In RAID 10 (striped mirrors) the consequences of the failure


of a physical hard disk are not as serious as in RAID 0+1
(mirrored stripes).
✓ All virtual hard disks remain intact.
✓ The recovery of the data from the failed hard disk is simple.
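The difference can be checked by brute force over all two-disk failures of an eight-disk array (a sketch; the halves and mirrored pairs are assumed laid out as in the figures):

```python
from itertools import combinations

DISKS = range(8)

def raid01_survives(failed: set) -> bool:
    # RAID 0+1: two internal RAID 0 halves (disks 0-3 and 4-7), mirrored.
    # Data survives as long as at least one half is completely intact.
    half_a, half_b = set(range(0, 4)), set(range(4, 8))
    return not (failed & half_a) or not (failed & half_b)

def raid10_survives(failed: set) -> bool:
    # RAID 10: four mirrored pairs, striped. Data survives as long as
    # every pair keeps at least one disk.
    pairs = [(0, 1), (2, 3), (4, 5), (6, 7)]
    return all(not (a in failed and b in failed) for a, b in pairs)

double = [set(c) for c in combinations(DISKS, 2)]
print(sum(raid01_survives(f) for f in double), "of", len(double))  # RAID 0+1: 12 of 28
print(sum(raid10_survives(f) for f in double), "of", len(double))  # RAID 10: 24 of 28
```

RAID 10 survives 24 of the 28 possible double failures, RAID 0+1 only 12, which quantifies why striped mirrors are preferable.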
RAID 4 and RAID 5:
Parity instead of mirroring
The idea of RAID 4 and RAID 5 is to replace all mirror disks of RAID 10 with
a single parity hard disk.
• Server writes the blocks A, B, C, D, E,
etc. to the virtual hard disk sequentially.

• RAID controller stripes the data blocks


over the first four physical hard disks.

• RAID controller calculates a parity block


for every four blocks (PABCD) and writes
this onto the fifth physical hard disk.

• If one of the four data disks fails, the RAID controller can reconstruct the data of the defective disk using the three other data disks and the parity disk.
• The parity block is calculated with the aid of the logical XOR operator (Exclusive OR).
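A minimal sketch of the XOR parity and reconstruction (blocks shown as short byte strings):

```python
from functools import reduce

def xor_blocks(*blocks: bytes) -> bytes:
    """Byte-wise XOR of equally sized blocks."""
    return bytes(reduce(lambda x, y: x ^ y, t) for t in zip(*blocks))

A, B, C, D = b"AAAA", b"BBBB", b"CCCC", b"DDDD"
P_ABCD = xor_blocks(A, B, C, D)          # parity block written to the fifth disk

# Disk holding C fails: rebuild it from the three other data disks and parity.
assert xor_blocks(A, B, D, P_ABCD) == C
```

The reconstruction works because XOR-ing a block with itself cancels it out, leaving only the missing block.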
RAID 4 and RAID 5: Costs

• Changing a data block changes the value of the


associated parity block.
• Each write operation to the virtual hard disk requires
• the physical writing of the data block
• the recalculation of the parity block
• the physical writing of the newly calculated parity block
• This extra cost for write operations in RAID 4 and RAID
5 is called the write penalty of RAID 4 or 5 as
applicable.
• The recalculation cost of the parity block is relatively
low due to the mathematical properties of the XOR
operator.



RAID 4 and RAID 5: Costs cont.

• If the block A is overwritten by block A1 and D is the


difference between the old and new data block, then D =
A XOR A1.
• The new parity block P1 can now simply be calculated
from the old parity block P and D, i.e. P1 = P XOR D.
• Let PABCD be the parity block for the data blocks A, B, C and D; the new parity block PA1BCD can be calculated without reading the remaining blocks B, C & D.
• The old data block A and the old parity block P must be read into the controller, so that they can be used to calculate D.



RAID 4 and RAID 5: Costs cont.

Write penalty of RAID 4 and RAID 5:
The server writes a changed data block (observe the cache).
• The RAID controller reads in the old data block and the associated old parity block.
• It calculates the new parity block.
• It writes the new data block and the new parity block onto the physical hard disks.



RAID 4 and RAID 5: Performance
• Advanced RAID 4 and RAID 5 implementations are capable of reducing the write penalty even further.

• If large data quantities are written sequentially, then the RAID controller
can calculate the parity blocks from the data flow without reading the old
parity block from the disk.
• For example, say the blocks E, F, G and H are written in one go.
• The controller calculates the parity block PEFGH from them and overwrites it without having previously read in the old value.

• Furthermore, a RAID controller with a suitably large cache can hold frequently changed parity blocks in the cache after writing to the disk, so that the next time one of the data blocks in question is changed there is no need to read in the parity block.



RAID 4 & 5: Difference

• RAID 4 saves all parity blocks onto a single physical


hard disk.
• The problem with RAID 4 is that, while the write operations for the data blocks are distributed over four physical hard disks, the parity disk has to handle the same number of write operations all on its own, thereby becoming a performance bottleneck.

• RAID 5 distributes the parity blocks over all hard disks,


to get around the performance bottleneck of RAID 4 .
• E.g., PABCD is written on the 5th physical hard disk,
• PEFGH on the 4th physical hard disk,
• PIJKL on the 3rd hard disk.
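The rotating parity placement of RAID 5 can be sketched as follows (one common left-rotating scheme; actual layouts vary by implementation):

```python
def raid5_parity_disk(stripe: int, n_disks: int) -> int:
    """Disk (0-based) holding the parity block of a given stripe."""
    return (n_disks - 1 - stripe) % n_disks

# Five disks: P_ABCD on disk 4 (the 5th), P_EFGH on disk 3, P_IJKL on disk 2, ...
print([raid5_parity_disk(s, 5) for s in range(5)])   # [4, 3, 2, 1, 0]
```

Because the parity disk changes from stripe to stripe, parity writes are spread evenly over all disks instead of hammering one.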



RAID 4 & 5: Failure & Recovery

• Can withstand the failure of a physical hard disk up to a


certain extent.
• RAID controller has to read the data from all disks, use this
to recalculate the lost data blocks & parity blocks, and then
write these blocks to the replacement disk.
• If a parity block has to be restored, the RAID controller must first
read the blocks A, B, C and D from the physical hard disks,
recalculate the parity block PABCD and then write to the exchanged
physical hard disk.
• If a data block (say C) has to be restored, the controller would first
have to read in the blocks A, B, D and PABCD, use these to
reconstruct block C and write this to the replaced disk.



RAID 6: Double parity

• RAID 5 arrays cannot correct double failures.

• There is a rise in the probability of data loss in RAID 5 arrays due to the increasing capacity of hard disks.

• RAID 6 offers a compromise between RAID 5 and RAID


10 by adding a second parity hard disk to extend RAID 5.

• RAID 6 has a poor write performance because the write


penalty for RAID 5 strikes twice



RAID 2 and RAID 3
• In RAID 2 the Hamming code is used, so that redundant
information is stored in addition to the actual data.
• Earlier, bit errors occurred quite easily, which could lead to a written ‘1’ being read as ‘0’ or a written ‘0’ being read as ‘1’.
• This additional data permits the recognition of read errors and to some
degree also makes it possible to correct them.

• RAID 2 no longer has any practical significance, because comparable functions are performed by the controller of each individual hard disk.

• RAID 3 was for a long time the recommended RAID level for sequential write and read profiles such as data mining and video processing.
RAID 2 and RAID 3 cont.

• RAID 3 distributes the data of a block amongst all the hard disks, such that all disks are involved in every read or write access.
• RAID 3 only permits the reading and writing of whole blocks,
thus dispensing with the write penalty that occurs in RAID 4
and RAID 5.
• But this required the rotation of the individual hard disks to be synchronized in RAID 3, so that the data of a block can truly be written simultaneously.
• Not so commonly used any more
• hard disks come with a large cache of their own
• significantly higher rotation speeds
• other RAID levels are now suitable for sequential load profiles



Comparison of the RAID levels

Question: Which RAID level should be used when ?


Answer: There is no absolute answer.



Comparison of the RAID levels cont.

• Manufacturers of disk subsystems have design


options in
• selection of the internal physical hard disks;
• I/O technique used for the communication within the disk
subsystem;
• use of several I/O channels;
• realization of the RAID controller;
• size of the cache;
• cache algorithms themselves;
• behavior during rebuild; and
• provision of advanced functions such as data scrubbing
and preventive rebuild
Tools

• Notepad++
• WinSCP
• MobaXterm
• Linux VM (HyperV, Virtual Box, VMWare workstation
player)



LAB
• First how to use “man” .. Try “man man”
• man lsblk
• man fdisk
• man df
• man hwinfo
• man parted
• man cfdisk
• man sfdisk
• man smartctl
• man e4defrag

• Good references
• https://www.howtoforge.com/tutorial/linux-filesystem-defrag/
• https://www.thomas-krenn.com/en/wiki/Analyzing_a_Faulty_Hard_Disk_using_Smartctl



THANK YOU
