Module1 SAN

Storage area network Module-1

Uploaded by

GIRISH KUMAR B C
Storage Area Networking

Module-1
Introduction to Information
Storage
Chapter Objectives
• Describe who is creating data and the amount of
data being created
• Describe the value of data to business
• List the solutions available for data storage
• List and explain the core elements of data center
• Describe the ILM strategy
• Describe storage evolution
Lesson: Information Storage
• Describe the importance of information to individuals
and to businesses
• Define data and information
• Discuss the categories of data
• Describe the storage architectures and their evolution
Why Information Storage
• “Digital universe – The Information Explosion”
• The 21st century is the information era
• Information is being created at an ever-increasing rate
• Information has become critical for success
• We live in an on-command, on-demand world
• Example: social networking sites, e-mail, video and photo
sharing websites, online shopping, search engines, etc.
Nearly a quarter of the world's population –
roughly 1.4 billion people – will use the Internet
on a regular basis in 2009.
50 billion photos taken every year
Online Video
4,700,000,000 video streams monthly
England has approximately 4 million surveillance cameras:
1 for approximately every 14 Britons
Storage requirements: Facebook
• 10,000,000,000 photos
• 2-3 Terabytes of photos are
being uploaded to the site
every day
• One petabyte of photo
storage
• Serve over 15 billion photo
images per day
• Photo traffic now peaks at
over 300,000 images served
per second
Information management is a big challenge
Store, Protect, Optimize, Leverage
A New Vocabulary for Measuring Information
If a Grain of Sand were One Byte of Information . . .
• 1 Megabyte = 1 million bytes: a tablespoon of sand
• 1 Gigabyte = 1 billion bytes: a patch of sand, 9" square, 1' deep
• 1 Terabyte = 1 trillion bytes: a sandbox, 24' square, 1' deep
• 1 Petabyte = 1,000 terabytes: a mile-long beach, 100' wide, 1' deep
• 1 Exabyte = 1,000 petabytes: the same beach, from Maine to North Carolina
• 1 Zettabyte = 1,000 exabytes: the same beach, along the entire US coast
• 1 Yottabyte = 1,000 zettabytes: enough info to bury the entire US under 296 feet of sand
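The unit ladder above can be checked with a few lines of Python. This is only a minimal sketch: the `UNITS` table and the `to_bytes` helper are illustrative names that do not come from the slides, and it assumes decimal prefixes (each unit is 1,000 times the previous one), as the slides do.

```python
# Decimal storage units, each 1,000 times the previous one.
# The list and function names are illustrative, not from the slides.
UNITS = ["byte", "kilobyte", "megabyte", "gigabyte", "terabyte",
         "petabyte", "exabyte", "zettabyte", "yottabyte"]


def to_bytes(value, unit):
    """Convert a value in the given decimal unit to bytes."""
    exponent = UNITS.index(unit)  # position in the list = power of 1,000
    return value * 1000 ** exponent


if __name__ == "__main__":
    for unit in UNITS[2:]:
        print(f"1 {unit} = {to_bytes(1, unit):,} bytes")
```

For example, `to_bytes(1, "petabyte")` and `to_bytes(1000, "terabyte")` give the same number of bytes.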


What do you Think ?
• What is your contribution to the digital universe (how
much data have you generated to date)?

a) <100 GB

b) 100 GB - 500 GB

c) 500 GB – 1 TB

d) > 1 TB
What is Data
“Collection of raw facts from which conclusions may be drawn”
• Data is converted into a more convenient form, i.e., digital data
(Figure: video, photo, book, and letter converted into digital data)
• Increase in data processing capabilities
• Lower cost of digital storage
• Affordable and faster communication technology
• Who creates data?
• Individuals
• Businesses
Categories of Data
• Data can be categorized as either structured or unstructured data
• Over 80% of enterprise information is unstructured
• Unstructured (80%): e-mail attachments, PDFs, X-rays, checks, manuals, instant messages, images, documents, forms, web pages, contracts, rich media, invoices, audio, video
• Structured (20%)
Define Information
• What do individuals/businesses do with the data they collect?
• They turn it into “information”
• “Information is the intelligence and knowledge derived from data”
• Businesses analyze raw data in order to identify meaningful trends
• For example:
• Buying habits and patterns of customers
• Health history of patients
(Figure: virtuous cycle of information. Creators of information upload information over wired and wireless networks to centralized information storage and processing; users of information access it over wired and wireless networks; demand for more information.)


Value of Information to a
Business
• Identifying new business opportunities
• Buying/spending patterns
• Internet stores, retail stores, supermarkets
• Customer satisfaction/service
• Tracking shipments, and deliveries
• Identifying patterns that lead to changes in existing business
• Reduced cost
• Just-in-time inventory, eliminating over-stocking of products, optimizing shipment
and delivery
• New services
• Security alerts for “stolen” credit card purchases
• Targeted marketing campaigns
• Communicate to bank customers with high account balances about a special
savings plan
• Creating a competitive advantage
Storage
• Data created by individuals/businesses must be stored
for further processing
• Type of storage used is based on the type of data and the
rate at which it is created and used
• Examples:
• Individuals: digital camera, cell phone, DVDs, hard disk
• Businesses: hard disk, external disk arrays, tape library
• Storage model: an evolution
• Centralized: mainframe computers
• Decentralized: client-server model
• Centralized: Storage Networking
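Storage Technology and Architecture Evolution
(Figure: storage evolution over time, from internal DAS to JBOD, RAID array, SAN / NAS, and IP SAN; the diagram shows a LAN, an FC SAN, and an IP SAN connected through a multiprotocol router.)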
Data Center Infrastructure
• Data center components:
• Applications
• Database
• Operating System/Server
• Network
• Storage Device
Example of an Order Processing System
(Figure: a client with a user interface connects over a LAN to a server/OS running the application and DBMS, which connects over a storage network to a storage array.)
Key Requirements for Data Center Elements
• Availability
• Data Integrity
• Security
• Manageability
• Performance
• Capacity
• Scalability
Managing Storage Information
• Monitoring
• Security, Performance, Accessibility, Capacity
• Reporting
• Resource performance, Capacity, Utilization
• Provisioning
• Providing necessary h/w, s/w and other resources needed to
run the data center
• Capacity, resource planning
Challenges in Managing Information
• Exploding digital universe
• Multifold increase of information growth
• Increasing dependency on information
• The strategic use of information plays an important role in business success
• Changing value of information
• Information that is valuable today may become less
important tomorrow.
Information Lifecycle Management
“CHANGE IN THE VALUE OF INFORMATION OVER TIME”
(Figure: value of order information plotted against time. Events: new order, process order, deliver order, warranty claim, fulfilled order, aged data, warranty voided. Lifecycle stages: create, access, migrate, archive, dispose. Label: protect.)
A proactive strategy that enables an IT organization
to effectively manage the data throughout its lifecycle
ILM strategy – characteristics
• Business-centric
• Centrally managed
• Policy-based
• Heterogeneous
• Optimized
Information Lifecycle Management Process
Policy-based Alignment of Storage Infrastructure with Data Value
• Classify data / applications based on business rules
• Implement policies with information management tools
• Integrated management of storage environment
• Organize storage resources to align with data classes
(Figure labels: AUTOMATED, FLEXIBLE)
IMPLEMENTATION OF ILM
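Policy-based alignment of storage with data value can be pictured as a small rule table. The following is a hedged sketch, not an ILM product's behavior: the tier names, the age thresholds in `POLICY`, and the `classify` function are all assumptions made for illustration.

```python
from datetime import date

# Hypothetical policy: (maximum age in days, storage tier). Not from the slides.
POLICY = [
    (30, "tier 1: high-performance disk"),
    (365, "tier 2: lower-cost disk"),
    (7 * 365, "tier 3: archive"),
]


def classify(last_access, today):
    """Return the storage tier for data last accessed on `last_access`."""
    age_days = (today - last_access).days
    for max_age, tier in POLICY:
        if age_days <= max_age:
            return tier
    return "dispose"  # older than every rule: candidate for disposal


if __name__ == "__main__":
    today = date(2024, 1, 1)
    for accessed in (date(2023, 12, 20), date(2023, 3, 1), date(2010, 1, 1)):
        print(accessed, "->", classify(accessed, today))
```

Keeping the rules in one table reflects the "centrally managed, policy-based" characteristics listed earlier: changing a threshold changes placement for all data without touching the classification code.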
Benefits of Implementing ILM
• Improved utilization
• Tiered storage platforms
• Simplified management
• Processes, tools and automation
• Simplified backup and recovery
• A wider range of options to balance the need for business continuity
• Maintaining compliance
• Knowledge of what data needs to be protected for what length of time
• Lower Total Cost of Ownership
• By aligning the infrastructure and management costs with information value
Lesson Summary
Key points covered in this lesson:
• The five core elements of a Data Center infrastructure
• Key requirements of storage systems to support
business activities, as well as some of the constraints
• ILM strategy
• Importance
• Characteristics
• Activities in developing ILM strategy
• ILM implementation
• Benefits of ILM
Chapter Summary
Key points covered in this chapter:
• Importance of data, information, and storage
infrastructure
• Types of data, its value, and key management
requirements of a storage system
• Evolution of storage architectures
• Core elements of a data center
• Importance of the ILM strategy
With the advancement of computer and
communication technologies, the rate of data
generation and sharing has increased exponentially.
The following is a list of some of the factors that have
contributed to the growth of digital data:
1. Increase in data-processing capabilities
2. Lower cost of digital storage
3. Affordable and faster communication
technology
Types of Data
Data can be classified as structured or unstructured based on
how it is stored and managed.
Structured data is organized in rows and columns in a rigidly
defined format so that applications can retrieve and process it
efficiently.

Structured data is typically stored using a database
management system (DBMS).
Data is unstructured if its elements cannot be stored in
rows and columns, which makes it difficult to query
and retrieve by applications.
A vast majority of new data being created today is unstructured.
The industry is challenged with new architectures, technologies,
techniques, and skills to store, manage, analyze, and derive value
from unstructured data from numerous sources.
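The difference between the two types can be sketched in Python. This is a minimal illustration under stated assumptions: the `orders` table, its columns, and the sample e-mail text are made up for the example, and the standard-library `sqlite3` module stands in for a DBMS.

```python
import sqlite3

# Structured data: rows and columns with a rigid schema, stored in a DBMS.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, customer TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, "alice", 120.0), (2, "bob", 35.5), (3, "alice", 80.0)],
)
# A precise question becomes a single query against named columns.
total_alice = conn.execute(
    "SELECT SUM(amount) FROM orders WHERE customer = ?", ("alice",)
).fetchone()[0]

# Unstructured data: free text with no schema; an application can only scan it.
email_body = "Hi, this is Alice. I ordered twice last month, please resend the invoices."
mentions_invoice = "invoice" in email_body.lower()

print(total_alice)
print(mentions_invoice)
```

The query answers "how much did Alice spend" directly, while the e-mail can only be searched for keywords unless extra processing extracts structure from it.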
Big Data

Big data is a new and evolving concept, which refers to data sets
whose sizes are beyond the capability of commonly used
software tools to capture, store, manage, and process within
acceptable time limits. It includes both structured and
unstructured data generated by a variety of sources, including
business application transactions, web pages, videos, images,
e-mails, social media, and so on. These data sets typically require
real-time capture or updates for analysis, predictive modeling,
and decision making.
Information
Data, whether structured or unstructured, does not fulfill any
purpose for individuals or businesses unless it is presented in a
meaningful form. Information is the intelligence and knowledge
derived from data.
Storage
Data created by individuals or businesses must be stored so that it
is easily accessible for further processing. In a computing
environment, devices designed for storing data are termed storage
devices or simply storage.
Evolution of Storage Architecture
There are two types of storage architecture:
1. Server-Centric Storage Architecture
2. Information-Centric Storage Architecture
Data Center Infrastructure

Organizations maintain data centers to provide centralized
data-processing capabilities across the enterprise. Data centers house
and manage large amounts of data.
The data center infrastructure includes hardware components,
such as computers, storage systems, network devices, and power
backups; and software components, such as applications,
operating systems, and management software. It also includes
environmental controls, such as air conditioning, fire
suppression, and ventilation.
Core Elements of a Data Center

Five core elements are essential for the functionality of a data
center:
1. Application: A computer program that provides the logic
for computing operations
2. Database management system (DBMS): Provides a
structured way to store data in logically organized tables that
are interrelated
3. Host or compute: A computing platform (hardware,
firmware, and software) that runs applications and databases
4. Network: A data path that facilitates communication among
various networked devices
5. Storage: A device that stores data persistently for
subsequent use.
• These core elements are typically viewed and managed as
separate entities, but all the elements must work together to
address data-processing requirements.
• The figure below is an example of an online order transaction system
that involves the five core elements of a data center and illustrates
their functionality in a business process.
Key Characteristics of a Data Center
Uninterrupted operation of data centers is critical to the
survival and success of a business. Organizations must have a
reliable infrastructure that ensures that data is accessible at all
times. Key Characteristics of a Data Center are:
1. Availability
2. Security
3. Scalability
4. Performance
5. Data integrity
6. Capacity
7. Manageability
Managing a Data Center
Managing a data center involves many tasks. The key
management activities include the following:
1. Monitoring: It is a continuous process of gathering
information on various elements and services running in a
data center. The aspects of a data center that are
monitored include security, performance, availability, and
capacity.
2. Reporting: It is done periodically on resource
performance, capacity, and utilization. Reporting tasks
help to establish business justifications and chargeback of
costs associated with data center operations.
3. Provisioning: It is a process of providing the hardware,
software, and other resources required to run a data center.
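As a rough illustration of monitoring and reporting on capacity, the sketch below turns a set of readings into a utilization report with alerts. The resource names, the readings, and the 80% alert threshold are hypothetical values chosen for the example, not figures from the slides.

```python
# Hypothetical capacity readings in terabytes: (resource, used, total).
READINGS = [
    ("array-01", 42.0, 50.0),
    ("array-02", 10.0, 40.0),
]
THRESHOLD = 0.80  # assumed alert level: 80% utilization


def capacity_report(readings, threshold=THRESHOLD):
    """Return (name, utilization, alert) for each monitored resource."""
    report = []
    for name, used, total in readings:
        utilization = used / total
        report.append((name, utilization, utilization >= threshold))
    return report


if __name__ == "__main__":
    for name, utilization, alert in capacity_report(READINGS):
        status = "ALERT" if alert else "ok"
        print(f"{name}: {utilization:.0%} used [{status}]")
```

In this sketch, gathering the readings corresponds to monitoring, the generated list corresponds to reporting, and an alert is the kind of signal that would trigger provisioning of more capacity.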
Virtualization and cloud computing have dramatically changed
the way data center infrastructure resources are provisioned and
managed. Organizations are rapidly deploying virtualization on
various elements of data centers to optimize their utilization.
Further, continuous cost pressure on IT and on-demand
data-processing requirements have resulted in the adoption of cloud
computing.
Virtualization and Cloud Computing
Virtualization is a technique of abstracting physical resources,
such as compute, storage, and network, and making them appear as
logical resources. Virtualization has existed in the IT industry for
several years and in different forms. Common examples of
virtualization are virtual memory used on compute systems and
partitioning of raw disks.

Virtualization enables pooling of physical resources and providing
an aggregated view of the physical resource capabilities. For
example, storage virtualization enables multiple pooled storage
devices to appear as a single large storage entity.
Cloud computing enables individuals or businesses to use IT
resources as a service over the network. It provides highly scalable
and flexible computing that enables provisioning of resources on
demand.
Users can scale up or scale down the demand of computing
resources, including storage capacity, with minimal management
effort or service provider interaction. Cloud computing empowers
self-service requesting through a fully automated request-
fulfillment process.
Cloud computing enables consumption-based metering;
therefore, consumers pay only for the resources they use,
such as CPU hours used, amount of data transferred, and
gigabytes of data stored.
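Consumption-based metering amounts to multiplying each metered quantity by its unit price and summing the results. The sketch below assumes made-up rates and usage figures; real providers publish their own price lists and metering rules.

```python
# Hypothetical unit prices; real providers publish their own rates.
RATES = {
    "cpu_hours": 0.05,       # per CPU hour
    "gb_transferred": 0.10,  # per gigabyte transferred
    "gb_stored": 0.02,       # per gigabyte stored per month
}


def monthly_charge(usage, rates=RATES):
    """Sum quantity * rate over every metered resource."""
    return sum(quantity * rates[resource] for resource, quantity in usage.items())


if __name__ == "__main__":
    usage = {"cpu_hours": 200, "gb_transferred": 50, "gb_stored": 1000}
    print(f"{monthly_charge(usage):.2f}")
```

A consumer who uses nothing in a given month is charged nothing, which is the point of paying only for the resources used.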

Cloud infrastructure is usually built upon virtualized
data centers, which provide resource pooling and rapid
provisioning of resources.
Data Center Environment
Application
An application is a computer program that provides the
logic for computing operations. The application sends
requests to the underlying operating system to perform
read/write (R/W) operations on the storage devices.
Applications can be layered on the database, which in
turn uses the OS services to perform
R/W operations on the storage devices.

The characteristics of the I/Os (input/output) generated by
the application influence the overall performance of
the storage system and storage solution designs.
Host (Compute)

Users store and retrieve data through applications. The computers
on which these applications run are referred to as hosts or
compute systems. Hosts can be physical or virtual machines.
Compute virtualization software enables creating virtual machines
on top of a physical compute infrastructure.
Software runs on a host and enables processing of input and
output (I/O) data.
The following software components are essential parts
of a host system:

1. Operating System
In a traditional computing environment, an operating system
controls all aspects of computing. It works between the application
and the physical components of a compute system. One of the
services it provides to the application is data access. The operating
system also monitors and responds to user actions and
the environment.
2. Device Driver
A device driver is special software that permits the
operating system to interact with a specific device,
such as a printer, a mouse, or a disk drive. A device
driver enables the operating system to recognize the
device and to access and control devices. Device
drivers are hardware-dependent and operating-
system-specific.
3. Volume Manager
• Logical volume managers (LVMs) enable dynamic extension of file system
capacity and efficient storage management.

• The LVM is software that runs on the compute system and manages logical and
physical storage. The LVM is an intermediate layer between the file system and the
physical disk.

• It can partition a larger-capacity disk into virtual, smaller-capacity volumes (a
process called partitioning) or aggregate several smaller disks to form a larger
virtual volume (a process called concatenation). These volumes are then
presented to applications.
Disk partitioning was introduced to improve the flexibility and
utilization of disk drives. In partitioning, a disk drive is divided
into logical containers called logical volumes (LVs).
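Partitioning and concatenation can both be viewed as simple address arithmetic. The sketch below is a simplified model under stated assumptions: the function names `partition` and `concatenate` are illustrative, sizes are counted in blocks, and real volume managers work with larger allocation units and keep metadata on disk.

```python
def partition(disk_size, volume_sizes):
    """Split one disk into logical volumes; return (start, end) block ranges."""
    if sum(volume_sizes) > disk_size:
        raise ValueError("volumes exceed disk capacity")
    ranges, start = [], 0
    for size in volume_sizes:
        ranges.append((start, start + size - 1))
        start += size
    return ranges


def concatenate(disk_sizes):
    """Join several disks end to end into one larger virtual volume.

    Returns a function mapping a logical block number on the virtual
    volume to (disk index, block number on that disk).
    """
    def locate(logical_block):
        if logical_block < 0:
            raise IndexError("logical block out of range")
        offset = logical_block
        for disk, size in enumerate(disk_sizes):
            if offset < size:
                return disk, offset
            offset -= size  # skip past this disk
        raise IndexError("logical block out of range")
    return locate


if __name__ == "__main__":
    print(partition(1000, [400, 600]))
    locate = concatenate([100, 200, 50])
    print(locate(0), locate(150), locate(349))
```

In both cases the application sees only volume-relative block numbers; the mapping to physical disks stays inside the volume manager layer.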
4. File System

A file is a collection of related records or data stored as a unit with
a name. A file system is a hierarchical structure of files.
A file system organizes data in a structured hierarchical manner via
the use of directories, which are containers for storing pointers to
multiple files. All file systems maintain a pointer map to the
directories, subdirectories, and files that are part of the file system.
Examples of common file systems are:
1. FAT32 (File Allocation Table) for Microsoft Windows
2. NT File System (NTFS) for Microsoft Windows
3. UNIX File System (UFS) for UNIX
4. Extended File System (EXT2/3) for Linux
5. Compute Virtualization
Compute virtualization is a technique for masking or
abstracting the physical hardware from the operating system.
It enables multiple operating systems to run concurrently on
single or clustered physical machines. This technique enables
creating portable virtual compute systems called virtual
machines (VMs). Each VM runs an operating system and
application instance in an isolated manner.
Compute virtualization is achieved by a virtualization layer that resides between
the hardware and virtual machines. This layer is also called the hypervisor. The
hypervisor provides hardware resources, such as CPU, memory, and network to
all the virtual machines.
Connectivity

Connectivity refers to the interconnection between hosts or
between a host and peripheral devices, such as printers or
storage devices. Connectivity and communication between host
and storage are enabled using physical components and
interface protocols.

A host interface device or host adapter connects a host to other
hosts and storage devices. Examples of host interface devices
are the host bus adapter (HBA) and the network interface card (NIC).
A host bus adapter is an application-specific integrated circuit
(ASIC) board that performs I/O interface functions between the
host and storage, relieving the CPU from additional I/O
processing workload. A host typically contains multiple HBAs.
A port is a specialized outlet that enables connectivity between the
host and external devices. An HBA may contain one or more ports
to connect the host to the storage device. Cables connect hosts to
internal or external devices using copper or fiber optic media.
Interface Protocols
A protocol enables communication between the host and
storage. Protocols are implemented using interface devices (or
controllers) at both source and destination. The popular interface
protocols used for host to storage communications are Integrated
Device Electronics/Advanced Technology Attachment
(IDE/ATA), Small Computer System Interface (SCSI), Fibre
Channel (FC) and Internet Protocol (IP).
IDE/ATA is a popular interface protocol standard used for
connecting storage devices, such as disk drives and CD-ROM
drives.

SCSI has emerged as a preferred connectivity protocol in high-end
computers. This protocol supports parallel transmission and
offers improved performance, scalability, and compatibility
compared to ATA. Serial attached SCSI (SAS) is a point-to-point
serial protocol that provides an alternative to parallel SCSI.
Fibre Channel is a widely used protocol for high-speed
communication to the storage device. The Fibre Channel
interface provides gigabit network speed. It provides a serial
data transmission that operates over copper wire and optical
fiber.

IP is a network protocol that has been traditionally used for
host-to-host traffic. With the emergence of new technologies,
an IP network has become a viable option for host-to-storage
communication.
Storage

Storage is a core component in a data center. A storage device
uses magnetic, optical, or solid-state media. Disks, tapes, and
diskettes use magnetic media, whereas CDs and DVDs use optical
media for storage. Removable flash memory and flash drives are
examples of solid-state media.
• Data is stored on the tape linearly along the length of the tape.

• Search and retrieval of data are done sequentially, and it invariably takes
several seconds to access the data. As a result, random data access is slow
and time-consuming. This limits tapes as a viable option for applications
that require real-time, rapid access to data.
• In a shared computing environment, data stored on tape cannot be
accessed by multiple applications simultaneously, restricting its use to one
application at a time.
• On a tape drive, the read/write head touches the tape surface, so the tape
degrades or wears out after repeated use.
• The storage and retrieval requirements of data from the tape and the
overhead associated with managing the tape media are significant.
Disk Drive Components
Platter and Spindle
Actuator Arm Assembly
Physical disk structures: sectors, tracks, and cylinders
Zoned bit recording
Logical block addressing
Disk Drive Performance
• Disk Service Time
⮚ Seek Time
⮚ Rotational Latency
⮚ Data Transfer Rate
Disk I/O Controller Utilization
I/O Queuing
Utilization v/s Response Time
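The slides name these performance terms without giving formulas. The sketch below assumes the commonly used relations: average rotational latency is the time for half a revolution, disk service time is seek time plus rotational latency plus transfer time, and response time is service time divided by (1 - utilization). The function names and the sample figures (5 ms seek, 15,000 rpm, 32 KB block, 40 MB/s, decimal units) are illustrative assumptions.

```python
def rotational_latency_ms(rpm):
    """Average rotational latency: time for half a revolution, in milliseconds."""
    return 0.5 * 60_000 / rpm


def service_time_ms(seek_ms, rpm, block_kb, transfer_mb_per_s):
    """Seek time + average rotational latency + time to transfer one block."""
    transfer_ms = block_kb / 1000 / transfer_mb_per_s * 1000  # KB -> MB -> s -> ms
    return seek_ms + rotational_latency_ms(rpm) + transfer_ms


def response_time_ms(service_ms, utilization):
    """Queueing estimate: response time grows as utilization approaches 1."""
    if not 0 <= utilization < 1:
        raise ValueError("utilization must be in [0, 1)")
    return service_ms / (1 - utilization)


if __name__ == "__main__":
    service = service_time_ms(seek_ms=5, rpm=15_000, block_kb=32, transfer_mb_per_s=40)
    for utilization in (0.1, 0.5, 0.7, 0.9):
        print(f"utilization {utilization:.0%}: "
              f"{response_time_ms(service, utilization):.1f} ms")
```

With these sample figures the service time is about 7.8 ms, and the loop shows why utilization matters: the estimated response time doubles at 50% utilization and rises steeply beyond that.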
Host Access to Data
Direct-Attached Storage
DAS Benefits
• Requires low initial investment
• Simple and can be easily deployed
• The setup is managed using host-based tools, such
as the host OS, which makes the storage
management task easy for small environments.
• Requires fewer management tasks
• Fewer hardware and software elements to set up and
operate
DAS Limitations
• DAS does not scale well.
• A storage array has a limited number of ports, which restricts
the number of hosts that can directly connect to the storage.
• DAS does not make optimal use of resources due to its
limited capability to share front-end ports. In DAS
environments, unused resources cannot be easily reallocated,
resulting in islands of over-utilized and under-utilized
storage pools.
Storage Design based on Application
Requirement and disk performance
