Information Storage Management
Information Storage Management (ISM) refers to the practices, technologies, and processes
that organizations use to manage, store, preserve, and protect data. ISM covers a wide range of
activities, including the design of storage architecture, implementation of storage systems, data
backup and recovery, and ensuring data security and compliance with regulatory requirements.
It involves understanding the lifecycle of data, from creation to disposal, and optimizing the
storage infrastructure for efficiency and cost-effectiveness. ISM is crucial for ensuring data
availability, integrity, and accessibility, and is a key component of overall IT strategy in an era
where data volume and importance are continuously growing.
Pre-requisite Knowledge/Skills
To understand the content and successfully complete this course, a participant must have a
basic understanding of Computer Architecture, Operating Systems, Networking, and databases.
Participants with experience in specific segments of Storage Infrastructure would also be able
to fully assimilate the course material.
Information Storage: The Essential Tool for Organizing, Securing, and Accessing Data
Information storage is a crucial aspect of managing data in today's digital landscape. It refers
to the process of capturing, organizing, storing, and retrieving information in a manner that
ensures its availability, integrity, and security. In an era where businesses rely heavily on data-
driven decision making, information storage plays a pivotal role in supporting operations,
improving efficiency, and enabling innovation.
What is Information Storage?
At its core, information storage involves the use of various technologies, including databases,
file systems, and cloud storage, to store and manage vast amounts of data. It encompasses both
the physical infrastructure, such as computer hardware and storage devices, and the software
systems that enable efficient data organization and retrieval.
Key Components of Information Storage:
1. Capture and Encoding: Information can take various forms, such as text, images,
videos, or audio. The process of capturing and encoding data involves converting these
different formats into a digital representation that can be stored and managed
electronically.
2. Organization and Structure: To facilitate effective data management, information
storage systems employ organizational structures. These structures, such as databases
or file systems, provide a framework for categorizing and arranging data, ensuring easy
retrieval and efficient usage.
3. Storage Mediums: Information storage relies on a variety of physical and virtual
mediums to store data. Traditional options include hard disk drives (HDDs) and solid-
state drives (SSDs), while cloud-based storage solutions have gained immense
popularity in recent years for their scalability, accessibility, and cost-effectiveness.
4. Access and Retrieval: The ability to access and retrieve stored information quickly
and accurately is a critical aspect of information storage. Advanced indexing and search
mechanisms allow users to locate and retrieve specific data points or files, streamlining
workflows and enhancing productivity.
5. Data Security and Integrity: With the growing importance of data protection,
information storage systems prioritize securing data against unauthorized access, loss,
or corruption. Robust security measures, such as encryption, authentication protocols,
and backups, safeguard sensitive information and ensure data integrity.
6. Scalability and Performance: As data volumes continue to explode, scalability and
performance are vital features of any information storage solution. Scalable storage
architectures and technologies ensure that storage systems can accommodate the ever-
increasing demands of expanding data sets without compromising performance.
7. Data Lifecycles: Information storage systems typically incorporate data lifecycle
management strategies. From data creation to deletion, these strategies define how data
is handled, archived, and eventually disposed of, aligning storage practices with
relevant regulatory requirements and business needs (a minimal sketch follows this list).
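As a concrete illustration of the lifecycle idea in item 7, the short Python sketch below applies a purely hypothetical retention policy to a few sample files, deciding whether each should stay active, be archived, or be disposed of based on its age. The thresholds and file names are illustrative assumptions, not a prescribed policy.

    from datetime import datetime, timedelta

    # Hypothetical retention policy (values are illustrative assumptions):
    # data younger than 90 days stays active, data older than 3 years is disposed of,
    # everything in between is moved to an archive tier.
    ACTIVE_WINDOW = timedelta(days=90)
    RETENTION_LIMIT = timedelta(days=3 * 365)

    def lifecycle_action(created: datetime, now: datetime) -> str:
        """Return the lifecycle stage a data object should be in."""
        age = now - created
        if age <= ACTIVE_WINDOW:
            return "keep-active"
        if age <= RETENTION_LIMIT:
            return "archive"
        return "dispose"

    if __name__ == "__main__":
        now = datetime(2024, 1, 1)
        samples = {
            "q4_sales_report.xlsx": datetime(2023, 12, 10),
            "2021_audit_logs.tar": datetime(2021, 3, 5),
            "2019_campaign_assets.zip": datetime(2019, 6, 20),
        }
        for name, created in samples.items():
            print(f"{name}: {lifecycle_action(created, now)}")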
Why Assess a Candidate's Information Storage Skill Level?
Assessing a candidate's information storage skill level is crucial for organizations seeking to
hire individuals proficient in managing data effectively. Here are some key reasons why
evaluating a candidate's information storage skills is essential:
1. Efficient Data Management: Proficiency in information storage ensures that candidates
can efficiently organize and structure data, allowing for easy retrieval, analysis, and decision-
making. Assessing this skill helps identify candidates who can handle large volumes of data,
ensuring smooth operations and streamlined workflows.
2. Data Security and Integrity: Information storage assessment enables organizations to
evaluate a candidate's understanding of data security measures and their ability to maintain data
integrity. Skilled candidates will possess knowledge of encryption techniques, access controls,
and backup strategies, safeguarding sensitive information from potential breaches and ensuring
compliance with data protection regulations.
3. Problem-Solving and Troubleshooting: Assessing candidates' information storage skills
allows organizations to gauge their problem-solving abilities when it comes to data-related
issues. Candidates who excel in information storage will showcase the capability to identify
and address storage inefficiencies, minimize data loss risks, and troubleshoot any arising
challenges promptly.
4. Adaptability to Technology: As technology advancements continue to shape information
storage practices, assessing a candidate's skill level becomes crucial for determining their
adaptability to new and emerging storage technologies. Skilled candidates will exhibit a grasp
of cloud storage, virtualization, and database management systems, enabling organizations to
stay ahead in the ever-evolving digital landscape.
5. Enhanced Decision-Making: Effective information storage influences data-driven
decision-making across various departments within an organization. By assessing a candidate's
information storage skills, businesses can identify individuals who possess the capability to
extract valuable insights from stored data, aiding in strategic planning, forecasting, and overall
business growth.
6. Streamlined Workflows and Efficiency: Candidates proficient in information storage
possess the ability to optimize data retrieval processes, resulting in streamlined workflows and
improved efficiency. Through assessment, organizations can identify candidates who can
implement efficient storage strategies, reducing data search and retrieval time, and enhancing
operational productivity.
7. Competitive Advantage: As organizations increasingly rely on data to gain a competitive
edge, assessing a candidate's information storage skills becomes imperative. By hiring
individuals proficient in information storage, organizations can enhance their data management
practices, unlock valuable insights, and make informed business decisions, ultimately staying
ahead of their competition.
To ensure your organization can harness the power of data, assessing a candidate's information
storage skills is a critical step in identifying the right talent and building a capable team that
can effectively manage and leverage data assets.
Backup practices and the way we store data have also evolved in response to a growing number of digital
threats. Companies and individuals must protect their data more intelligently to withstand attackers.
Hyperscale cloud storage services therefore often come with air-gapping, segregation of duties,
immutability, the 3-2-1-0 backup rule, and other protective features.
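The 3-2-1 part of the rule mentioned above is commonly interpreted as: keep at least three copies of the data, on at least two different media types, with at least one copy off-site (the trailing "0" is usually read as zero errors on backup verification). The sketch below expresses that check in Python; the copy inventory is an invented example for illustration only.

    # Minimal sketch of a 3-2-1 backup-rule check.
    # Each copy records the media type it lives on and whether it is off-site.
    backup_copies = [
        {"name": "production volume", "media": "ssd",            "offsite": False},
        {"name": "local backup",      "media": "tape",           "offsite": False},
        {"name": "cloud replica",     "media": "object-storage", "offsite": True},
    ]

    def satisfies_3_2_1(copies) -> bool:
        enough_copies   = len(copies) >= 3                        # "3": at least three copies
        enough_media    = len({c["media"] for c in copies}) >= 2  # "2": two different media types
        offsite_present = any(c["offsite"] for c in copies)       # "1": one copy off-site
        return enough_copies and enough_media and offsite_present

    print("3-2-1 rule satisfied:", satisfies_3_2_1(backup_copies))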
The evolution of data storage has come a long way from punch cards to hyperscale clouds. Cloud-based
storage has become the new norm and continues to grow. Even though the underlying technology is
complex, storing data has never been simpler: data can be stored and retrieved with a single click.
Vendors such as Anycloud build their product portfolios around simple and scalable data management,
offering backup, replication, and data storage solutions in a partner-ready set-up.
Architecture:
Over the years, storage technology and architecture have evolved to meet the challenge of storing
ever-increasing volumes of data and information. Storage technology and architecture is the
combination of hardware and software components required to provide storage for a system.
The needs of ordinary consumers have changed alongside the needs of companies and businesses.
Storage capacity demands have grown by leaps and bounds with applications such as the internet,
e-mail, e-commerce, and data warehousing. The necessity to store these huge amounts of data and
access them from anywhere in the world has driven dramatic changes in storage technologies.
This evolution is evaluated against several parameters:
1. Accessibility: the availability of storage components to perform the desired operations
whenever needed.
2. Capacity: the amount of storage resources available. Capacity monitoring includes, for
example, examining the free space remaining on a disk (see the sketch after this list); adequate
capacity supports uninterrupted data availability and scalability.
3. Performance: monitoring performance measures and analyses the system's ability to operate
at a predefined level or response time, in order to evaluate the efficiency of the storage
architecture.
4. Security: the security of the data and information to be stored is of utmost importance in
any storage architecture.
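To make the capacity parameter concrete, the sketch below uses Python's standard shutil.disk_usage to report free space on a mount point and flag it when utilization crosses a threshold. The path and the 85% threshold are illustrative assumptions.

    import shutil

    def check_capacity(path: str, threshold: float = 0.85) -> None:
        """Report utilization of the filesystem containing 'path' and warn above the threshold."""
        usage = shutil.disk_usage(path)            # returns total, used, and free space in bytes
        utilization = usage.used / usage.total
        print(f"{path}: {usage.free / 2**30:.1f} GiB free, {utilization:.0%} used")
        if utilization > threshold:
            print(f"WARNING: {path} is above {threshold:.0%} utilization; plan for more capacity")

    # Illustrative usage; on Windows a drive letter such as 'C:\\' would be used instead.
    check_capacity("/")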
Figure: IBM's 5 MB hard-disk drive from 1956.
RAID:
The second storage architecture we will examine is RAID, or Redundant Array of Independent
Disks. RAID coordinates multiple HDDs to provide higher levels of reliability and performance
than a single drive could provide. Implementing RAID offers two benefits: data redundancy and
enhanced performance. Redundancy is achieved by keeping copies of the data on two or more disks
(mirroring) or by storing parity information, and performance is improved by spreading I/O across
multiple drives.
RAID technology was delivered in low-cost hardware and, by the mid-1990s, had become standard
on servers.
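As a toy illustration of the mirroring idea (not how a real RAID controller works), the Python sketch below writes every block to two independent backing files and falls back to the surviving copy when one is lost. File names and block size are invented for the example.

    import os

    class ToyMirror:
        """Illustrative RAID 1-style mirror: every write goes to both backing files."""

        def __init__(self, path_a: str, path_b: str, block_size: int = 512):
            self.paths = [path_a, path_b]
            self.block_size = block_size

        def write_block(self, block_no: int, data: bytes) -> None:
            data = data.ljust(self.block_size, b"\x00")[: self.block_size]
            for path in self.paths:                      # clone the block to every member
                with open(path, "r+b" if os.path.exists(path) else "w+b") as f:
                    f.seek(block_no * self.block_size)
                    f.write(data)

        def read_block(self, block_no: int) -> bytes:
            for path in self.paths:                      # fall back to the mirror if one copy is gone
                try:
                    with open(path, "rb") as f:
                        f.seek(block_no * self.block_size)
                        return f.read(self.block_size)
                except OSError:
                    continue
            raise IOError("both mirror members have failed")

    mirror = ToyMirror("disk_a.img", "disk_b.img")
    mirror.write_block(0, b"hello, mirrored world")
    os.remove("disk_a.img")                              # simulate losing one drive
    print(mirror.read_block(0).rstrip(b"\x00"))          # data survives on the second copy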
SAN (storage area network) devices provide fast and continuous access to large amounts of data.
A SAN offers higher performance, increased availability, scalability, and cost benefits compared
to DAS (direct-attached storage), which makes it more feasible for businesses to keep their backup
data in remote locations.
NAS (network-attached storage) offers higher availability, scalability, cost benefits, and better
performance than general-purpose file servers.
1.3 Data Centre:
Modern data centers are very different than they were just a short time ago. Infrastructure has
shifted from traditional on-premises physical servers to virtual networks that support
applications and workloads across pools of physical infrastructure and into a multicloud
environment.
In this era, data exists and is connected across multiple data centers, the edge, and public and
private clouds. The data center must be able to communicate across these multiple sites, both
on-premises and in the cloud. Even the public cloud is a collection of data centers. When
applications are hosted in the cloud, they are using data center resources from the cloud
provider.
Why are data centers important to business?
In the world of enterprise IT, data centers are designed to support business applications and
activities that include:
• Email and file sharing
• Productivity applications
• Customer relationship management (CRM)
• Enterprise resource planning (ERP) and databases
• Big data, artificial intelligence, and machine learning
• Virtual desktops, communications and collaboration services
What are the core components of a data center?
Data center design includes routers, switches, firewalls, storage systems, servers, and
application delivery controllers. Because these components store and manage business-critical
data and applications, data center security is critical in data center design. Together, they
provide:
Network infrastructure. This connects servers (physical and virtualized), data center services,
storage, and external connectivity to end-user locations.
Storage infrastructure. Data is the fuel of the modern data center. Storage systems are used to
hold this valuable commodity.
Computing resources. Applications are the engines of a data center. These servers provide the
processing, memory, local storage, and network connectivity that drive applications.
How do data centers operate?
Data center services are typically deployed to protect the performance and integrity of the core
data center components.
Network security appliances. These include firewall and intrusion protection to safeguard the
data center.
Application delivery assurance. To maintain application performance, these mechanisms
provide application resiliency and availability via automatic failover and load balancing.
What is in a data center facility?
Data center components require significant infrastructure to support the center's hardware and
software. These include power subsystems, uninterruptible power supplies (UPS), ventilation,
cooling systems, fire suppression, backup generators, and connections to external networks.
What are the standards for data center infrastructure?
The most widely adopted standard for data center design and data center infrastructure is
ANSI/TIA-942. It includes standards for ANSI/TIA-942-ready certification, which ensures
compliance with one of four categories of data center tiers rated for levels of redundancy and
fault tolerance.
Tier 1: Basic site infrastructure. A Tier 1 data center offers limited protection against physical
events. It has single-capacity components and a single, nonredundant distribution path.
Tier 2: Redundant-capacity component site infrastructure. This data center offers improved
protection against physical events. It has redundant-capacity components and a single,
nonredundant distribution path.
Tier 3: Concurrently maintainable site infrastructure. This data center protects against virtually
all physical events, providing redundant-capacity components and multiple independent
distribution paths. Each component can be removed or replaced without disrupting services to
end users.
Tier 4: Fault-tolerant site infrastructure. This data center provides the highest levels of fault
tolerance and redundancy. Redundant-capacity components and multiple independent
distribution paths enable concurrent maintainability and one fault anywhere in the installation
without causing downtime.
Types of data centers:
Enterprise data centers
These are built, owned, and operated by companies and are optimized for their end users. Most
often they are housed on the corporate campus.
Managed services data centers
These data centers are managed by a third party (or a managed services provider) on behalf of
a company. The company leases the equipment and infrastructure instead of buying it.
Colocation data centers
In colocation ("colo") data centers, a company rents space within a data center owned by others
and located off company premises. The colocation data center hosts the infrastructure: building,
cooling, bandwidth, security, etc., while the company provides and manages the components,
including servers, storage, and firewalls.
Cloud data centers
In this off-premises form of data center, data and applications are hosted by a cloud services
provider such as Amazon Web Services (AWS), Microsoft (Azure), or IBM Cloud or other
public cloud provider.
Source: https://www.bmc.com/blogs/dcim-data-center-infrastructure-management/
Components of DCIM
DCIM solutions are made of several components. These support a variety of enterprise IT
functions at the infrastructure layer.
Physical architecture
The floor space of a data center is planned according to:
• The dimensions of the equipment
• Airflow and cooling
• Human access
• Other geometric and physical factors
Here, DCIM technology helps you visualize and simulate the representation of server racks
deployed in the data center, so you can determine if the physical space is satisfactory.
Rack design
Typically, you’ll use standardized cabinets to install server and networking technologies in your
data center. Understanding of the specifics associated with rack design can help data center
organizations to plan for capacity, space, cooling and access for maintenance and
troubleshooting.
DCIM can help optimize the selection and placement of server racks based on these factors.
Materials catalog
DCIM technologies contain vast libraries of equipment material. The information ranges from
basic parameter specifications to high-resolution renders. With new technologies introduced
rapidly in the industry, these libraries are updated and maintained regularly in coordination
with the vendors.
Change management
Data center hardware must be replaced periodically, due to a few reasons:
• The inherently limited lifecycle of hardware
• A malfunction
• The need to upgrade to a better product
This change, however, can affect the performance of other integrated infrastructure
technologies. DCIM allows a structured approach to manage such hardware changes, allowing
IT to change or replace hardware by:
• Following predefined process workflows
• Reducing the risks associated with the change
Capacity planning
The data center should be designed to scale in response to changing business needs. That means
your capacity planning must account for:
• Space limitations
• Weight of equipment and racks
• Power supply
• Cooling performance
• A range of other physical limitations of the data center
The DCIM tool can model a variety of future/potential scenarios, planning future capacity
based on these limitations.
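In a very simplified form, a DCIM capacity model amounts to checking planned equipment against the facility's physical budgets. The figures below (rack units, power, weight) are purely illustrative assumptions, not real equipment specifications.

    # Very simplified capacity-planning check for one rack.
    # All limits and equipment figures are illustrative assumptions.
    RACK_LIMITS = {"rack_units": 42, "power_kw": 8.0, "weight_kg": 900}

    planned_equipment = [
        {"name": "storage array",  "rack_units": 4, "power_kw": 1.8, "weight_kg": 95},
        {"name": "server chassis", "rack_units": 2, "power_kw": 0.9, "weight_kg": 40},
        {"name": "server chassis", "rack_units": 2, "power_kw": 0.9, "weight_kg": 40},
        {"name": "network switch", "rack_units": 1, "power_kw": 0.3, "weight_kg": 12},
    ]

    def capacity_report(equipment, limits):
        for metric, limit in limits.items():
            used = sum(item[metric] for item in equipment)
            status = "OK" if used <= limit else "OVER BUDGET"
            print(f"{metric:>10}: {used:g} of {limit:g} ({status})")

    capacity_report(planned_equipment, RACK_LIMITS)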
Software integration
DCIM solutions integrate with existing management solutions that are designed to track and
coordinate data center assets and workflows. Integrations can include:
• Protocols such as SNMP and Modbus
• Complex web integrations
• CMDBs
Data analysis
Real-time data collection and analysis is a critical feature of DCIM technologies. With a DCIM
tool, you can:
• Track a variety of asset metrics
• Transfer data between DCIM solutions using web-based APIs
• Analyze data using advanced AI solutions
Monitoring these metrics in real time can help you mitigate incidents such as power failures,
security breaches, and network outages before they escalate.
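A minimal sketch of the real-time analysis idea: poll a facility metric (here an imaginary rack inlet-temperature reading) and raise an alert when it breaches a threshold. The readings and threshold are invented for illustration; a real DCIM tool would collect them over SNMP, Modbus, or an API.

    # Illustrative threshold alerting over a stream of metric samples.
    # In a real deployment the samples would come from SNMP/Modbus/API polling.
    TEMP_THRESHOLD_C = 27.0   # assumed inlet-temperature alert threshold

    def evaluate_samples(samples):
        alerts = []
        for timestamp, value in samples:
            if value > TEMP_THRESHOLD_C:
                alerts.append(f"{timestamp}: inlet temperature {value:.1f} C exceeds {TEMP_THRESHOLD_C} C")
        return alerts

    readings = [("10:00", 24.5), ("10:05", 26.8), ("10:10", 28.1), ("10:15", 27.9)]
    for alert in evaluate_samples(readings):
        print("ALERT:", alert)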
Reporting & dashboard
A good DCIM tool transforms vast volumes of metrics log data into intuitive dashboards and
comprehensive reports. Automated actions can be triggered using the reporting information
and studied for further analysis.
1.7.1 Host
Users store and retrieve data through applications. The computers on which these applications
run are referred to as hosts. Hosts can range from simple laptops to complex clusters of servers.
A host consists of physical components (hardware devices) that communicate with one another
using logical components (software and protocols). Access to data and the overall performance
of the storage system environment depend on both the physical and logical components of a
host. The logical components of the host are detailed in Section 2.5 of this chapter.
Physical Components
A host has three key physical components:
➢ Central processing unit (CPU)
➢ Storage, such as internal memory and disk devices
➢ Input/Output (I/O) devices
The physical components communicate with one another by using a communication pathway
called a bus. A bus connects the CPU to other components, such as storage and I/O devices.
CPU
The CPU consists of four main components:
Arithmetic Logic Unit (ALU):
This is the fundamental building block of the CPU. It performs arithmetical and logical
operations such as addition, subtraction, and Boolean functions (AND, OR, and NOT).
Control Unit:
A digital circuit that controls CPU operations and coordinates the functionality of the CPU.
Register:
A collection of high-speed storage locations. The registers store intermediate data that is
required by the CPU to execute an instruction and provide fast access because of their
proximity to the ALU. CPUs typically have a small number of registers.
Level 1 (L1) cache:
Found on modern day CPUs, it holds data and program instructions that are likely to be needed
by the CPU in the near future. The L1 cache is slower than registers, but provides more storage
space.
Storage
Memory and storage media are used to store data, either persistently or temporarily. Memory
modules are implemented using semiconductor chips, whereas storage devices use either
magnetic or optical media. Memory modules enable data access at a higher speed than the
storage media. Generally, there are two types of memory on a host:
Random Access Memory (RAM):
This allows direct access to any memory location and can have data written into it or read from
it. RAM is volatile; this type of memory requires a constant supply of power to maintain
memory cell content. Data is erased when the system’s power is turned off or interrupted.
Read-Only Memory (ROM):
Non-volatile and only allows data to be read from it. ROM holds data for execution of internal
routines, such as system startup. Storage devices are less expensive than semiconductor
memory. Examples of storage devices are as follows:
➢ Hard disk (magnetic)
➢ CD-ROM or DVD-ROM (optical)
➢ Floppy disk (magnetic)
➢ Tape drive (magnetic)
I/O Devices
I/O devices enable sending and receiving data to and from a host. This communication may be
one of the following types:
User to host communications: Handled by basic I/O devices, such as the keyboard, mouse,
and monitor. These devices enable users to enter data and view the results of operations.
Host to host communications:
Enabled using devices such as a Network Interface Card (NIC) or modem.
Host to storage device communications:
Handled by a Host Bus Adaptor (HBA). HBA is an application-specific integrated circuit
(ASIC) board that performs I/O interface functions between the host and the storage,
relieving the CPU from additional I/O processing workload.
HBAs also provide connectivity outlets known as ports to connect the host to the storage
device. A host may have multiple HBAs.
1.7.2 Connectivity
Connectivity refers to the interconnection between hosts or between a host and any other
peripheral devices, such as printers or storage devices. The discussion here focuses on the
connectivity between the host and the storage device. The components of connectivity in a
storage system environment can be classified as physical and logical. The physical components
are the hardware elements that connect the host to storage and the logical components of
connectivity are the protocols used for communication between the host and storage. The
communication protocols are covered in Chapter 5.
Physical components communicate across a bus by sending bits (control, data, and address) of
data between devices. These bits are transmitted through the bus in either of the following
ways:
Serially:
Bits are transmitted sequentially along a single path. This transmission can be unidirectional or
bidirectional.
In parallel:
Bits are transmitted along multiple paths simultaneously. Parallel can also be bidirectional.
The size of a bus, known as its width, determines the amount of data that can be transmitted
through the bus at one time. The width of a bus can be compared to the number of lanes on a
highway. For example, a 32-bit bus can transmit 32 bits of data and a 64-bit bus can transmit
64 bits of data simultaneously. Every bus has a clock speed, measured in MHz (megahertz), which
represents the data transfer rate between the end points of the bus. A fast bus allows faster
transfer of data, which enables applications to run faster (a rough throughput calculation appears
after the bus types below). Buses, as conduits of data transfer on the computer system, can be
classified as follows:
System bus:
The bus that carries data from the processor to memory.
Local or I/O bus:
A high-speed pathway that connects directly to the processor and carries data between the
peripheral devices, such as storage devices and the processor.
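The relationship between bus width, clock speed, and throughput described above can be seen with a quick back-of-the-envelope calculation. The sketch below assumes one transfer per clock cycle and ignores protocol overhead, so the numbers are theoretical upper bounds rather than real bus specifications.

    # Rough theoretical bus throughput = width (bits) x clock rate (Hz),
    # assuming one transfer per clock cycle and ignoring overhead.
    def bus_throughput_mb_per_s(width_bits: int, clock_mhz: float) -> float:
        bits_per_second = width_bits * clock_mhz * 1_000_000
        return bits_per_second / 8 / 1_000_000   # convert to megabytes per second

    for width, clock in [(32, 100), (64, 100), (64, 133)]:
        print(f"{width}-bit bus @ {clock} MHz ~= {bus_throughput_mb_per_s(width, clock):.0f} MB/s")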
1.7.3 Storage
The storage device is the most important component in the storage system environment. A
storage device uses magnetic, optical, or solid-state media. Disks, tapes, and diskettes use
magnetic media; CD-ROM is an example of a storage device that uses optical media; and a
removable flash memory card is an example of solid-state media.
1.8 Logical Components of the Host:
The logical components of a host consist of the software applications and protocols that
enable data communication with the user as well as the physical components. Following
are the logical components of a host:
1. Operating system
2. Device drivers
3. Volume manager
4. File system
5. Application
1.Operating System
• An operating system controls all aspects of the computing environment. It works
between the application and physical components of the computer system. One of the
services it provides to the application is data access. The operating system also monitors
and responds to user actions and the environment. It organizes and controls hardware
components and manages the allocation of hardware resources. It provides basic
security for the access and usage of all managed resources. An operating system also
performs basic storage management tasks while managing other underlying
components, such as the file system, volume manager, and device drivers.
2.Device Driver
• A device driver is special software that permits the operating system to interact with a
specific device, such as a printer, a mouse, or a hard drive. A device driver enables the
operating system to recognize the device and to use a standard interface (provided as
an application programming interface, or API) to access and control devices.
3.Volume Manager
• Disk partitioning was introduced to improve the flexibility and utilization of HDDs. In
partitioning, an HDD is divided into logical containers called logical volumes (LVs).
For example, a large physical drive can be partitioned into multiple LVs to maintain
data according to the file system's and applications' requirements. The partitions are
created from groups of contiguous cylinders when the hard disk is initially set up on
the host. The host's file system accesses the partitions without any knowledge of
partitioning or the physical structure of the disk. Concatenation is the process of
grouping several smaller physical drives and presenting them to the host as one logical
drive.
• The evolution of Logical Volume Managers (LVMs) enabled the dynamic extension of
file system capacity and efficient storage management. LVM is software that runs on
the host computer and manages the logical and physical storage. LVM is an optional,
intermediate layer between the file system and the physical disk. The LVM provides
optimized storage access and simplifies storage resource management. It hides details
about the physical disk and the location of data on the disk, and it enables administrators
to change the storage allocation without changing the hardware, even when the
application is running. The basic LVM components are physical volumes, volume
groups, and logical volumes. In LVM terminology, each physical disk connected to the
host system is a physical volume (PV). A simplified sketch follows.
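To make the PV/VG/LV terminology concrete, the sketch below models, in a deliberately simplified way, how physical volumes are pooled into a volume group and carved into logical volumes that can later be extended. It is a conceptual illustration, not the behaviour of any particular LVM implementation; all names and sizes are assumptions.

    # Conceptual model of LVM terminology: physical volumes are pooled into a
    # volume group, and logical volumes are allocated out of that pool.
    class VolumeGroup:
        def __init__(self, name, physical_volume_sizes_gb):
            self.name = name
            self.capacity_gb = sum(physical_volume_sizes_gb)   # pool all PVs together
            self.logical_volumes = {}

        def free_gb(self):
            return self.capacity_gb - sum(self.logical_volumes.values())

        def create_lv(self, lv_name, size_gb):
            if size_gb > self.free_gb():
                raise ValueError(f"not enough free space in {self.name}")
            self.logical_volumes[lv_name] = size_gb

        def extend_lv(self, lv_name, extra_gb):
            if extra_gb > self.free_gb():
                raise ValueError(f"not enough free space in {self.name}")
            self.logical_volumes[lv_name] += extra_gb          # grow without touching hardware

    vg = VolumeGroup("vg_data", physical_volume_sizes_gb=[500, 500])   # two 500 GB PVs
    vg.create_lv("lv_app", 300)
    vg.create_lv("lv_logs", 200)
    vg.extend_lv("lv_app", 100)                                # dynamic extension of capacity
    print(vg.logical_volumes, "free:", vg.free_gb(), "GB")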
4.File System
o A file is a collection of related records or data stored as a unit with a name. A
file system is a hierarchical structure of files. File systems enable easy access to
data files residing within a disk drive, a disk partition, or a logical volume. A
file system needs host-based logical structures and software routines that control
access to files. It provides users with the functionality to create, modify, delete,
and access files. Access to the files on the disks is controlled by the permissions
given to the file by the owner, which are also maintained by the file system.
o A file system organizes data in a structured hierarchical manner via the use of
directories, which are containers for storing pointers to multiple files. All file
systems maintain a pointer map to the directories, subdirectories, and files that
are part of the file system. Some of the common file systems are as follows:
i. FAT32 (File Allocation Table) for Microsoft Windows
ii. NT File System (NTFS) for Microsoft Windows
iii. UNIX File System (UFS) for UNIX
iv. Extended File System (EXT2/3) for Linux
5.Application
o An application is a computer program that provides the logic for computing
operations. It provides an interface between the user and the host and among
multiple hosts. Conventional business applications using databases have a three-tiered
architecture: the application user interface forms the front-end tier; the computing logic,
or the application itself, forms the middle tier; and the underlying databases that
organize the data form the back-end tier. The
application sends requests to the underlying operating system to perform read/
write operations on the storage devices. Applications can be layered on the
database, which in turn uses the OS services to perform R/W operations to
storage devices. These R/W operations enable transactions between the front-
end and back-end tiers.
1.8.1 RAID IMPLEMENTATION:
Software or hardware RAID implementation
All the computations involved in making RAID work require a lot of processing power. The
more complex the RAID configuration, the more CPU resources it requires. From a
computational point of view there is little difference between a software RAID implementation
and a hardware RAID implementation. Ultimately, the difference is in where RAID processing
is performed. It can either be performed by the server processor where the RAID system is
installed (that is the software implementation) or by an external processor (that is the hardware
implementation).
Hardware RAID implementation
In a hardware RAID implementation the drives are connected to a RAID controller card that
plugs into a PCI-Express (PCI-e) slot on the motherboard. This is done in the same way for
both large servers and desktop RAID installations. Most external devices have a RAID
controller card built into the device itself.
Benefits
• Better performance, especially for complex RAID configurations, because processing is
performed by a dedicated RAID processor rather than the computer's main processor. This
reduces the load on the system when writing data backups and reduces data recovery time.
• More RAID configuration options, including hybrid configurations that may not be available
under certain operating systems.
• Compatibility with various operating systems. This factor is critical if you plan to access
your RAID system from Mac and Windows computers simultaneously; a hardware RAID implementation
will be recognized by any system.
Disadvantages
• Since the system contains more hardware, initial deployment costs will be higher.
• Performance degradation in certain hardware RAID implementations when using solid
state disks (SSDs). Older RAID controllers do not offer the fast native SSD caching
needed to efficiently program and erase the drive.
• Some hardware RAID management software is designed to work exclusively with large systems
(general-purpose machines, Solaris RISC systems, Itanium, SAN) used in industrial
infrastructure.
Software RAID implementation
When the disks storing information are connected directly to a computer or server without a
RAID controller, the chosen RAID configuration is handled by a utility included in the operating
system. This arrangement is called a software RAID implementation.
Many operating systems support software RAID, including Apple macOS, Microsoft Windows, various
Linux distributions, and Unix-like systems such as OpenBSD, FreeBSD, NetBSD, and Solaris.
Benefits
• Low cost RAID deployment. All you need to do is to connect the drives and then
configure their use with the operating system.
• Today's computers are so powerful that their processors can easily handle RAID Level
0 and 1 without any noticeable performance degradation.
Disadvantages
• RAID software is often specific to the operating system you are using and therefore
cannot be used for disk arrays shared between different operating systems.
• You are limited to the RAID levels that your operating system can support.
• With more complex RAID configurations, computer performance suffers.
Software or hardware RAID implementation?
Which implementation wins really depends on how you use your system. If your intention is to
save money, you will use a single operating system to access the RAID array and use RAID level
0 or 1; in that case a software RAID implementation gives you much the same protection and
experience as a more expensive hardware implementation.
If you are able to make the initial investment, then a hardware RAID implementation is definitely
preferable. It frees you from the limitations of a software RAID implementation and gives you
more flexibility in using and configuring RAID.
RAID 1
RAID 1 mirrors data and requires a minimum of two disks to work. It consumes more raw capacity
per usable byte, but is an economical failover option for application servers.
Advantages
By copying one disk to another, RAID 1 decreases the chance of total data loss from a disk
failure.
Disadvantages
Because two disks store the same data, RAID 1 can only use half of the array’s total storage.
RAID 5
RAID 5 distributes striping and parity at a block level. Parity is raw binary data; the RAID
system calculates its values to create a parity block, which the system uses to recover striped
data from a failed drive. Most RAID systems with parity functions distribute the parity blocks
across the disks in the array. Some RAID systems instead dedicate a whole disk to parity, but
these are rare.
RAID 5 stores parity blocks on striped disks. Each stripe has its own dedicated parity block.
RAID 5 can withstand the loss of one disk in the array.
RAID 5 combines the performance of RAID 0 with the redundancy of RAID 1, at the cost of some
capacity: one disk's worth of space in the array is consumed by parity (about one third in a
three-disk array). This level increases write performance since all drives in the array can serve
write requests simultaneously. However, overall disk performance can suffer from write
amplification, since even minor changes to a stripe require parity to be recalculated and
rewritten.
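The parity calculation described above is, in essence, a bitwise XOR across the data blocks in a stripe: XOR-ing the surviving blocks with the parity block reconstructs the missing one. The sketch below demonstrates this on toy data; real RAID 5 implementations operate on fixed-size disk blocks rather than short byte strings.

    from functools import reduce

    def xor_blocks(blocks):
        """Bitwise XOR of equally sized byte blocks (the essence of RAID 5 parity)."""
        return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

    # A toy stripe of three equally sized data blocks.
    stripe = [b"AAAA", b"BBBB", b"CCCC"]
    parity = xor_blocks(stripe)

    # Simulate losing the second drive, then rebuild its block from the survivors plus parity.
    surviving = [stripe[0], stripe[2], parity]
    rebuilt = xor_blocks(surviving)
    print(rebuilt == stripe[1])   # True: the lost block is recovered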
Advantages
• RAID 5’s striping increases read performance.
• Parity improves data accuracy.
• RAID 5 can be used for SSDs as well as hard drives, but be cautious about using SSDs of
exactly the same age, as they may wear out and fail at around the same time.
Disadvantages
RAID 5 only has fault tolerance for one disk failure.
RAID 6
This RAID level operates like RAID 5, with distributed parity and striping. The main
operational difference in RAID 6 is that there is a minimum of four disks in a RAID 6 array,
and the system stores an additional parity block on each disk. This enables a configuration in
which two disks may fail before the array becomes unavailable. Its primary uses are application
servers and large storage arrays.
RAID 6 offers higher redundancy and increased read performance over RAID 5. It can suffer
from the same server performance overhead with intensive write operations. This performance
hit depends on the RAID system architecture: hardware or software, if it’s located in firmware,
and if the system includes processing software for high-performance parity calculations.
Advantages
• RAID 6 arrays can withstand two drive failures because they have two instances of
parity rather than a single one.
• RAID 6 has better read performance than RAID 5.
Disadvantages
• RAID 6 is more expensive than some other forms of RAID.
• Rebuilding data on larger RAID 6 arrays can be a slow process.
Front End
This is the interface between the hosts in your network and the intelligent storage system
itself. Most front ends have redundant front-end controllers, as well as redundant ports for
connectivity. Supported protocols include Fibre Channel, iSCSI, FICON, and FCoE.
Cache
Cache is very fast memory that exists to speed up I/O processing. If the cache is doing its job
properly, the number of mechanical disk operations is dramatically reduced.
The cache is organized in pages and consists of the data store and tag RAM. The data store
holds the data that is being read or written, and the tag RAM is responsible for tracking the
location of the data in the actual data storage (the physical disks).
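A highly simplified view of the data store / tag RAM relationship: the tag structure maps a disk block address to a cache page, so a read can be served from memory when the lookup hits and falls through to the (slow) disk on a miss. The sketch below is conceptual only; it does not model page replacement or write caching.

    # Conceptual cache-page lookup: the "tag RAM" maps disk block addresses to
    # cache pages in the data store; a miss falls through to the slow disk read.
    class ToyReadCache:
        def __init__(self):
            self.data_store = {}   # page contents, keyed by block address (the tag)

        def read(self, block_address, read_from_disk):
            if block_address in self.data_store:          # tag lookup hit
                print(f"cache hit for block {block_address}")
                return self.data_store[block_address]
            print(f"cache miss for block {block_address}; reading from disk")
            data = read_from_disk(block_address)          # mechanical disk operation
            self.data_store[block_address] = data         # populate the cache page
            return data

    def slow_disk_read(block_address):
        return f"<data for block {block_address}>"

    cache = ToyReadCache()
    cache.read(42, slow_disk_read)   # miss, goes to disk
    cache.read(42, slow_disk_read)   # hit, served from cache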
Back End
The job of the back end is to provide an interface between the cache and the physical disks. It
consists of back-end controllers and back-end ports. The back end is often responsible for error
detection and correction, as well as the RAID functionality of the system. Like the front end,
most systems provide multiple controllers and multiple ports for the most redundancy possible.
Physical Disks
Another great feature of the Intelligent Storage System is the variety that is possible with the
physical disks. Most systems support a variety of disks and speeds, including FC, SATA, SAS,
and flash drives. Most even support a mix of disks in the same array!
1.9.3 Intelligent Storage Array
Intelligent storage systems generally fall into one of the following two categories:
➢ High-end storage systems
➢ Midrange storage systems
Traditionally, high-end storage systems have been implemented with active-active arrays,
whereas midrange storage systems used typically in small- and medium sized enterprises have
been implemented with active-passive arrays. Active-passive arrays provide optimal storage
solutions at lower costs. Enterprises make use of this cost advantage and implement active-
passive arrays to meet specific application requirements such as performance, availability, and
scalability. The distinctions between these two implementations are becoming increasingly
insignificant.
1.9.3.1 High-end Storage Systems
High-end storage systems, referred to as active-active arrays, are generally aimed at large
enterprises for centralizing corporate data. These arrays are designed with a large number of
controllers and cache memory. An active-active array implies that the host can perform I/Os to
its LUNs across any of the available paths (see Figure 4-7).
To address the enterprise storage needs, these arrays provide the following capabilities:
➢ Large storage capacity
➢ Large amounts of cache to service host I/Os optimally
➢ Fault tolerance architecture to improve data availability
➢ Connectivity to mainframe computers and open systems hosts
➢ Availability of multiple front-end ports and interface protocols to serve a large
number of hosts
➢ Availability of multiple back-end Fibre Channel or SCSI RAID controllers to
manage disk processing
➢ Scalability to support increased connectivity, performance, and storage capacity
requirements
➢ Ability to handle large amounts of concurrent I/Os from a number of servers and
applications
➢ Support for array-based local and remote replication
In addition to these features, high-end arrays possess some unique features and functionality
that are required for mission-critical applications in large enterprises.
1.9.3.2 Midrange Storage System
Midrange storage systems are also referred to as active-passive arrays and they are best suited
for small- and medium-sized enterprises.
In an active-passive array, a host can perform I/Os to a LUN only through the paths to the
owning controller of that LUN. These paths are called active paths.
The other paths are passive with respect to this LUN. As shown in Figure 4-8, the host can
perform reads or writes to the LUN only through the path to controller A, as controller A is the
owner of that LUN.
The path to controller B remains passive and no I/O activity is performed through this path.
Midrange storage systems are typically designed with two controllers, each of which contains
host interfaces, cache, RAID controllers, and disk drive interfaces.
Midrange arrays are designed to meet the requirements of small and medium enterprises;
therefore, they host less storage capacity and global cache than active-active arrays.
There are also fewer front-end ports for connection to servers. However, they ensure high
redundancy and high performance for applications with predictable workloads.
They also support array-based local and remote replication.
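The active/passive path behaviour described above can be sketched as a simple routing rule: I/O for a LUN is sent only down paths to its owning controller, and a passive path is used only after a failover changes ownership. The controllers, paths, and LUN ownership below are invented for illustration.

    # Simplified active-passive path selection: I/O for a LUN goes only to the
    # controller that currently owns it; other paths stay passive until failover.
    lun_ownership = {"LUN0": "controller_A", "LUN1": "controller_B"}   # assumed ownership map
    host_paths = {"controller_A": "path-1", "controller_B": "path-2"}

    def path_for_io(lun):
        owner = lun_ownership[lun]
        return host_paths[owner]                  # active path; the other remains passive

    def fail_over(lun, new_owner):
        print(f"{lun}: ownership moves to {new_owner}; the passive path becomes active")
        lun_ownership[lun] = new_owner

    print("I/O for LUN0 uses", path_for_io("LUN0"))
    fail_over("LUN0", "controller_B")             # e.g., controller_A fails
    print("I/O for LUN0 now uses", path_for_io("LUN0"))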
-------------------------------------------------------****************---------------------------------