
Information Storage Management

Subject Code: BCS18E24

Information Storage Management (ISM) refers to the practices, technologies, and processes
that organizations use to manage, store, preserve, and protect data. ISM covers a wide range of
activities, including the design of storage architecture, implementation of storage systems, data
backup and recovery, and ensuring data security and compliance with regulatory requirements.
It involves understanding the lifecycle of data, from creation to disposal, and optimizing the
storage infrastructure for efficiency and cost-effectiveness. ISM is crucial for ensuring data
availability, integrity, and accessibility, and is a key component of overall IT strategy in an era
where data volume and importance are continuously growing.

Pre-requisite Knowledge/Skills
To understand the content and successfully complete this course, a participant must have a
basic understanding of Computer Architecture, Operating Systems, Networking, and databases.
Participants with experience in specific segments of Storage Infrastructure would also be able
to fully assimilate the course material.
Information Storage: The Essential Tool for Organizing, Securing, and Accessing Data
Information storage is a crucial aspect of managing data in today's digital landscape. It refers
to the process of capturing, organizing, storing, and retrieving information in a manner that
ensures its availability, integrity, and security. In an era where businesses rely heavily on data-
driven decision making, information storage plays a pivotal role in supporting operations,
improving efficiency, and enabling innovation.
What is Information Storage?
At its core, information storage involves the use of various technologies, including databases,
file systems, and cloud storage, to store and manage vast amounts of data. It encompasses both
the physical infrastructure, such as computer hardware and storage devices, and the software
systems that enable efficient data organization and retrieval.
Key Components of Information Storage:
1. Capture and Encoding: Information can take various forms, such as text, images,
videos, or audio. The process of capturing and encoding data involves converting these
different formats into a digital representation that can be stored and managed
electronically.
2. Organization and Structure: To facilitate effective data management, information
storage systems employ organizational structures. These structures, such as databases
or file systems, provide a framework for categorizing and arranging data, ensuring easy
retrieval and efficient usage.
3. Storage Mediums: Information storage relies on a variety of physical and virtual
mediums to store data. Traditional options include hard disk drives (HDDs) and solid-
state drives (SSDs), while cloud-based storage solutions have gained immense
popularity in recent years for their scalability, accessibility, and cost-effectiveness.
4. Access and Retrieval: The ability to access and retrieve stored information quickly
and accurately is a critical aspect of information storage. Advanced indexing and search
mechanisms allow users to locate and retrieve specific data points or files, streamlining
workflows and enhancing productivity.
5. Data Security and Integrity: With the growing importance of data protection,
information storage systems prioritize securing data against unauthorized access, loss,
or corruption. Robust security measures, such as encryption, authentication protocols,
and backups, safeguard sensitive information and ensure data integrity.
6. Scalability and Performance: As data volumes continue to explode, scalability and
performance are vital features of any information storage solution. Scalable storage
architectures and technologies ensure that storage systems can accommodate the ever-
increasing demands of expanding data sets without compromising performance.
7. Data Lifecycles: Information storage systems typically incorporate data lifecycle
management strategies. From data creation to deletion, these strategies define how data
is handled, archived, and eventually disposed of, aligning storage practices with
relevant regulatory requirements and business needs.
Why Assess a Candidate’s Information Storage Skill Level?

Assessing a candidate's information storage skill level is crucial for organizations seeking to
hire individuals proficient in managing data effectively. Here are some key reasons why
evaluating a candidate's information storage skills is essential:
1. Efficient Data Management: Proficiency in information storage ensures that candidates
can efficiently organize and structure data, allowing for easy retrieval, analysis, and decision-
making. Assessing this skill helps identify candidates who can handle large volumes of data,
ensuring smooth operations and streamlined workflows.
2. Data Security and Integrity: Information storage assessment enables organizations to
evaluate a candidate's understanding of data security measures and their ability to maintain data
integrity. Skilled candidates will possess knowledge of encryption techniques, access controls,
and backup strategies, safeguarding sensitive information from potential breaches and ensuring
compliance with data protection regulations.
3. Problem-Solving and Troubleshooting: Assessing candidates' information storage skills
allows organizations to gauge their problem-solving abilities when it comes to data-related
issues. Candidates who excel in information storage will showcase the capability to identify
and address storage inefficiencies, minimize data loss risks, and troubleshoot any arising
challenges promptly.
4. Adaptability to Technology: As technology advancements continue to shape information
storage practices, assessing a candidate's skill level becomes crucial for determining their
adaptability to new and emerging storage technologies. Skilled candidates will exhibit a grasp
of cloud storage, virtualization, and database management systems, enabling organizations to
stay ahead in the ever-evolving digital landscape.
5. Enhanced Decision-Making: Effective information storage influences data-driven
decision-making across various departments within an organization. By assessing a candidate's
information storage skills, businesses can identify individuals who possess the capability to
extract valuable insights from stored data, aiding in strategic planning, forecasting, and overall
business growth.
6. Streamlined Workflows and Efficiency: Candidates proficient in information storage
possess the ability to optimize data retrieval processes, resulting in streamlined workflows and
improved efficiency. Through assessment, organizations can identify candidates who can
implement efficient storage strategies, reducing data search and retrieval time, and enhancing
operational productivity.
7. Competitive Advantage: As organizations increasingly rely on data to gain a competitive
edge, assessing a candidate's information storage skills becomes imperative. By hiring
individuals proficient in information storage, organizations can enhance their data management
practices, unlock valuable insights, and make informed business decisions, ultimately staying
ahead of their competition.
To ensure your organization can harness the power of data, assessing a candidate's information
storage skills is a critical step in identifying the right talent and building a capable team that
can effectively manage and leverage data assets. With Alooba's comprehensive information
storage assessment, unlock the potential of your hiring process and make informed talent
decisions.

Topics Covered in the Information Storage Skill


To assess candidates' information storage skill level comprehensively, it is important to delve
into various subtopics that encompass this essential skill. Here are some of the key topics
covered when evaluating candidates' information storage proficiency:
1. Database Management Systems (DBMS): Assessing candidates' knowledge of DBMS is
crucial, as it forms the foundation of information storage. Evaluate their understanding of
relational databases, data modeling, normalization techniques, query optimization, and the
ability to implement and manage databases effectively.
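The relational concepts above can be sketched with Python's built-in SQLite module. This is a minimal illustration, not a production setup; the table, index, and data are hypothetical.

```python
import sqlite3

# Minimal DBMS sketch: a relational table, an index, and a filtered query.
conn = sqlite3.connect(":memory:")    # in-memory database for the demo
conn.execute("""
    CREATE TABLE employee (
        id   INTEGER PRIMARY KEY,
        name TEXT NOT NULL,
        dept TEXT NOT NULL
    )
""")
# An index on dept speeds up queries that filter on that column.
conn.execute("CREATE INDEX idx_dept ON employee(dept)")
conn.executemany(
    "INSERT INTO employee (name, dept) VALUES (?, ?)",
    [("Asha", "Storage"), ("Ravi", "Networks"), ("Mei", "Storage")],
)

rows = conn.execute(
    "SELECT name FROM employee WHERE dept = ? ORDER BY name", ("Storage",)
).fetchall()
print(rows)  # [('Asha',), ('Mei',)]
```

Even this small example exercises the ideas a DBMS assessment probes: schema design, constraints, indexing, and parameterized queries.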
2. File Systems and File Organization: Candidates should demonstrate knowledge of
different file systems and their characteristics, such as hierarchical, network, and distributed
file systems. Assess their understanding of file organization methods, including sequential,
direct, indexed, and hashed file organization, to ensure efficient data retrieval and storage.
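As a rough sketch of one of these methods, hashed file organization maps a record key to a bucket so a record can be found without scanning the whole file sequentially. The hash scheme, keys, and records below are illustrative, not tied to any real file system.

```python
# Hashed file organization sketch: records land in fixed buckets based on
# a hash of their key, giving O(1) average lookup instead of a full scan.
NUM_BUCKETS = 8

def bucket_for(key: str) -> int:
    """Map a record key to a bucket number (illustrative hash scheme)."""
    return sum(ord(c) for c in key) % NUM_BUCKETS

buckets = [[] for _ in range(NUM_BUCKETS)]

def insert(key: str, record: dict) -> None:
    buckets[bucket_for(key)].append((key, record))

def lookup(key: str):
    # Only the one bucket the key hashes to is searched.
    for k, rec in buckets[bucket_for(key)]:
        if k == key:
            return rec
    return None

insert("emp-1001", {"name": "Asha", "dept": "Storage"})
insert("emp-2002", {"name": "Ravi", "dept": "Networks"})
print(lookup("emp-1001"))  # {'name': 'Asha', 'dept': 'Storage'}
```

Sequential organization would instead append records in arrival order and scan linearly; indexed organization keeps a separate sorted structure pointing at record locations.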
3. Data Warehousing and Data Mining: Evaluate candidates' familiarity with data
warehousing concepts, including design, extraction, transformation, and loading (ETL)
processes. Assess their knowledge of data mining techniques, such as classification, clustering,
association, and anomaly detection, to determine their ability to derive meaningful insights
from stored data.
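One of the mining techniques listed, anomaly detection, can be sketched in a few lines: flag values that sit far from the mean of a dataset. The readings and the two-standard-deviation threshold are illustrative assumptions.

```python
import statistics

# Simple anomaly-detection sketch: flag readings more than two sample
# standard deviations from the mean. Data and threshold are illustrative.
readings = [10.1, 9.8, 10.3, 10.0, 9.9, 25.0, 10.2]
mean = statistics.mean(readings)
stdev = statistics.stdev(readings)

anomalies = [x for x in readings if abs(x - mean) > 2 * stdev]
print(anomalies)  # [25.0]
```

Real data mining systems use far more robust techniques (clustering-based outlier detection, isolation forests), but the principle of separating "normal" from "unusual" stored data is the same.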
4. Backup and Recovery Strategies: Assess candidates' understanding of backup and
recovery strategies to protect data from loss or corruption. This includes knowledge of various
backup methods, such as full, incremental, and differential backups, as well as disaster recovery
planning, ensuring data availability and minimizing downtime.
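The difference between the three backup methods comes down to which reference point decides whether a file is copied. The sketch below models files by hypothetical modification timestamps to make the distinction concrete.

```python
# How full, incremental, and differential backups choose files, using
# last-modified times. File names and timestamps are hypothetical.
files = {"a.txt": 100, "b.txt": 250, "c.txt": 400}  # name -> mtime
last_full = 200          # time of the last full backup
last_backup = 300        # time of the most recent backup of ANY kind

def full_backup(files):
    """Copy everything, regardless of when it changed."""
    return sorted(files)

def differential_backup(files, since_full):
    """Copy files changed since the last FULL backup."""
    return sorted(f for f, mtime in files.items() if mtime > since_full)

def incremental_backup(files, since_last):
    """Copy files changed since the last backup of any kind."""
    return sorted(f for f, mtime in files.items() if mtime > since_last)

print(full_backup(files))                         # ['a.txt', 'b.txt', 'c.txt']
print(differential_backup(files, last_full))      # ['b.txt', 'c.txt']
print(incremental_backup(files, last_backup))     # ['c.txt']
```

The trade-off: incrementals are smallest but need the whole chain to restore, while a differential restore needs only the last full plus the latest differential.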
5. Data Security and Access Controls: Evaluate candidates' grasp of data security concepts,
such as authentication, authorization, and encryption techniques. Assess their understanding of
access controls and their ability to implement security measures to safeguard stored
information from unauthorized access or breaches.
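One concrete building block of the authentication measures mentioned above is storing a salted password hash rather than the password itself. This sketch uses Python's standard library; the iteration count and salt size are illustrative choices, not a security recommendation.

```python
import hashlib
import hmac
import os

# Sketch: salted password hashing for authentication. The plaintext
# password is never stored; only the salt and derived hash are.
def hash_password(password: str, salt: bytes) -> bytes:
    # PBKDF2-HMAC-SHA256; the iteration count here is illustrative.
    return hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 100_000)

def verify(password: str, salt: bytes, stored: bytes) -> bool:
    # Constant-time comparison avoids leaking information via timing.
    return hmac.compare_digest(hash_password(password, salt), stored)

salt = os.urandom(16)
stored = hash_password("s3cret", salt)
print(verify("s3cret", salt, stored))  # True
print(verify("wrong", salt, stored))   # False
```

Encryption of data at rest and access-control lists address the complementary problems of confidentiality and authorization.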
6. Cloud Storage and Virtualization: As cloud computing becomes increasingly important,
candidates should demonstrate knowledge of cloud storage concepts and technologies. Assess
their familiarity with cloud storage models, data partitioning, virtualization techniques, and the
ability to leverage cloud infrastructure efficiently.
7. Data Compression Techniques: Evaluate candidates' understanding of data compression
techniques and their ability to apply appropriate compression algorithms, such as lossless and
lossy compression methods, to optimize storage space and network bandwidth usage.
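The simplest lossless technique, run-length encoding, makes the idea tangible: redundancy is removed in a way that can be reversed exactly. Real storage systems use far stronger algorithms (DEFLATE, LZ4, Zstandard); this is only a sketch of the principle.

```python
# Run-length encoding (RLE): a minimal lossless compression sketch.
def rle_encode(data: str) -> list:
    """Collapse each run of repeated characters to (char, count)."""
    runs, i = [], 0
    while i < len(data):
        j = i
        while j < len(data) and data[j] == data[i]:
            j += 1
        runs.append((data[i], j - i))
        i = j
    return runs

def rle_decode(runs: list) -> str:
    """Expand (char, count) pairs back to the original string."""
    return "".join(ch * count for ch, count in runs)

original = "aaaabbbcca"
encoded = rle_encode(original)
print(encoded)  # [('a', 4), ('b', 3), ('c', 2), ('a', 1)]
assert rle_decode(encoded) == original  # lossless round trip
```

Lossy compression (as used for images and audio) instead discards detail that is deemed perceptually unimportant, so the round trip is deliberately inexact.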
8. Data Lifecycle Management: Assess candidates' knowledge of data lifecycle management,
including stages such as data creation, storage, archival, and disposal. Evaluate their
understanding of compliance requirements and regulations related to data management,
ensuring adherence to legal and industry standards.
9. Documentation and Metadata Management: Candidates should showcase their ability to
document and manage metadata effectively. Evaluate their understanding of metadata
standards, such as Dublin Core, and their proficiency in capturing and maintaining essential
information about stored data, enhancing searchability and data organization.
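A rough sketch of how descriptive metadata improves searchability: each stored object carries a small record of Dublin Core style fields (identifier, title, creator, date, format), and queries run against the metadata rather than the object contents. The catalog entries below are hypothetical.

```python
# Dublin Core style metadata records attached to stored objects.
# All identifiers, titles, and creators here are illustrative.
catalog = [
    {"identifier": "doc-001", "title": "Q3 Sales Report",
     "creator": "Finance", "date": "2023-10-05", "format": "application/pdf"},
    {"identifier": "img-042", "title": "Warehouse Layout",
     "creator": "Operations", "date": "2023-08-12", "format": "image/png"},
]

def search(field: str, value: str) -> list:
    """Return identifiers of objects whose metadata field matches value."""
    return [m["identifier"] for m in catalog if m.get(field) == value]

print(search("creator", "Finance"))          # ['doc-001']
print(search("format", "image/png"))         # ['img-042']
```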
By assessing candidates' proficiency in these subtopics of information storage, you can
confidently identify individuals who possess a comprehensive understanding of this critical
skill. Alooba's advanced assessment platform enables you to evaluate candidates' knowledge
and practical application of information storage concepts, ensuring that you hire professionals
who excel in managing and securing valuable data assets.

Practical Applications of Information Storage


Information storage is an essential tool utilized across various industries and domains to
facilitate efficient data management and enable informed decision-making. Here are some of
the practical applications where information storage plays a crucial role:
1. Business Intelligence and Analytics: Information storage enables businesses to store,
organize, and analyze vast amounts of data to extract valuable insights. By utilizing information
storage systems, companies can uncover patterns, trends, and correlations in data, making data-
driven decisions and enhancing business intelligence.
2. E-commerce and Customer Relationship Management (CRM): Information storage is
vital for managing customer data in e-commerce and CRM systems. It allows businesses to
store and retrieve customer information, purchase history, and preferences, facilitating
personalized marketing campaigns, improving customer satisfaction, and driving sales.
3. Healthcare Data Management: Information storage is crucial in healthcare for managing
patient records, laboratory results, and medical history. It ensures secure and accessible storage
of critical health data, facilitating efficient diagnosis, treatment planning, and collaborative
healthcare efforts.
4. Financial Data Management: Financial institutions rely on information storage to manage
vast amounts of transactional data, account details, and financial records securely. These
systems enable accurate and timely reporting, fraud detection, risk assessment, and regulatory
compliance.
5. Research and Scientific Data Management: Researchers and scientists heavily rely on
information storage for managing research data, experimental results, and scientific literature.
It helps ensure data integrity, enables collaboration, and facilitates future referencing and
reproducibility of research findings.
6. Supply Chain and Inventory Management: Information storage supports supply chain
and inventory management by enabling efficient storage and retrieval of product information,
stock levels, and supply chain data. It optimizes inventory control, demand forecasting, order
fulfillment, and enhances supply chain visibility.
7. Digital Media and Content Management: In the media and entertainment industry,
information storage is crucial for managing digital media assets such as images, videos, and
audio files. It enables effective content organization, archiving, and distribution for media
production, broadcasting, and digital publishing.
8. Government and Public Sector Data Management: Government agencies and public
sector entities rely on information storage to manage critical data, including citizen records,
administrative documents, and public datasets. It facilitates efficient data retrieval, analysis,
and supports evidence-based policy-making and governance.
9. Research and Development: Research and development departments utilize information
storage to organize and store research data, project documentation, and intellectual property-
related information. It supports collaboration, knowledge sharing, and safeguards valuable
research data and findings.
By leveraging information storage systems, organizations in various industries can maximize
the value of their data, streamline operations, improve decision-making, and gain a competitive
edge. With Alooba's comprehensive assessment platform, identify candidates with exceptional
information storage skills, enabling your organization to harness the full potential of data-
driven opportunities.
Roles Requiring Good Information Storage Skills
Proficiency in information storage is essential for several roles that involve managing and
organizing data effectively. The following roles heavily rely on good information storage skills
to excel in their responsibilities:
1. Data Engineer: As a Data Engineer, strong information storage skills are vital for
designing, implementing, and maintaining robust data storage systems. They ensure
data is organized, stored efficiently, and accessible for analysis and processing.
2. Data Architect: Data Architects are responsible for designing data management and
storage solutions. Their expertise in information storage enables them to architect
databases and data warehouses that optimize efficiency, scalability, and data integrity.
3. Data Migration Engineer: Data Migration Engineers specialize in moving data from
one system to another. To carry out successful data migrations, they need a deep
understanding of information storage to ensure data integrity throughout the migration
process.
4. Data Pipeline Engineer: Data Pipeline Engineers build and manage data pipelines,
which involve the movement, transformation, and storage of data. Their expertise in
information storage ensures seamless data flow, efficient storage, and reliable data
processing.
5. Data Warehouse Engineer: Data Warehouse Engineers are responsible for designing
and maintaining data warehouses that store large volumes of structured and
unstructured data. They need strong information storage skills to organize data
effectively for querying and analysis.
6. DevOps Engineer: DevOps Engineers play a crucial role in managing infrastructure and
deployment processes. Good information storage skills enable them to design and
optimize storage architectures and implement secure and scalable storage solutions.
7. ELT Developer and ETL Developer: ELT/ETL Developers focus on the extraction,
transformation, and loading of data into target systems. Information storage skills are
vital for designing efficient data pipelines and ensuring data quality during the
extraction and loading processes.
8. GIS Data Analyst: GIS Data Analysts work with geospatial data, which requires
effective information storage to manage and analyze geographic datasets. Their
knowledge of information storage ensures accurate storage and retrieval of location-
based data.
9. Supply Analyst: Supply Analysts rely on information storage skills to effectively
manage and analyze supply chain data, inventory levels, and demand forecasting. Their
expertise ensures efficient storage and retrieval of critical supply chain information.
10. Web Analyst: Web Analysts rely on information storage skills to manage and analyze
website data, user behavior, and online marketing campaigns. Their ability to
effectively store and organize data enables comprehensive web analytics and insights.
11. Decision Scientist: Decision Scientists leverage information storage skills to manage
and analyze complex datasets for strategic decision-making. Their expertise ensures
data is stored and organized in a way that supports insightful analysis and modeling.
Having strong information storage skills is crucial for professionals in these roles to effectively
manage and leverage data. Alooba's assessments can help evaluate candidates for these roles,
ensuring they possess the necessary expertise in information storage to excel in their
responsibilities.
Architecture:

1.2 Evolution of Storage Technology and Architecture:


1881: punch card
The punch card was introduced for storing and processing data in early tabulating and
computing systems. A punch card is a stiff piece of paper containing a series of holes punched
in specific patterns to represent information; each hole position corresponds to a particular
data value. By inserting a punch card into a card reader, machines could read and interpret the
encoded data, enabling data storage and retrieval in early computing systems.
1950: magnetic tape
Magnetic tape consists of a thin strip of plastic coated with magnetic material. To store data on
magnetic tape, the tape is passed through a tape drive with read/write heads, which
magnetically encode and retrieve information from the tape. The data is recorded in a sequential
manner, with tracks running along the length of the tape. To access specific data, the tape must
be fast-forwarded or rewound to the appropriate position.
1963: removable hard drive
The removable hard drive is storage that can be removed from a computer while the system is
running. This made it easy for users to back up data and transfer it from one computer to another.
1971: floppy disk
The floppy disk, introduced by IBM, is a portable and flexible storage medium for computer
systems. The first commercial floppy disk drive shipped alongside the IBM System/370
mainframe computer. The name ‘floppy disk’ came from the flexibility of the diskette inside
its protective plastic casing. At the time, the floppy disk revolutionized data storage and
became the most popular method of storing and transferring data.
1983: CD-ROM
The Compact Disc Read-Only Memory (CD-ROM) can only be read, not written to. The disc
can hold large amounts of data, including text, images, and audio. To read it, the disc is
inserted into a CD drive, where a laser beam scans the disc’s surface to retrieve the data
stored in a spiral track. This data is transformed into a digital signal that the computer
processes.
1999: SD Card
A Secure Digital (SD) card is a small removable flash memory card that can contain high-
capacity memory and is commonly used in digital devices such as cameras, smartphones,
tablets, and gaming consoles. SD cards are characterized by their compact size, durability, and
high storage capacity.
2000: USB flash drive
The USB flash drive is a portable storage device that utilizes flash memory technology to store
and transfer data. The flash drive consists of a compact circuit board enclosed in a plastic or
metal casing with a USB connector at one end. The connector allows the USB to easily be
plugged into a USB port, enabling data transfer. The drive is commonly used for data backup,
file sharing, and software installation.
2013: The hyper cloud
Hyper cloud storage introduced a whole new way of storing data – and a lot of it. Hyper cloud
storage refers to the massive storage capacity and infrastructure utilized by major cloud
providers such as AWS, Microsoft Azure, Google Cloud Platform, and IBM Cloud. These
providers operate vast data centers with an enormous number of servers to offer scalable and
available storage solutions.

Modern backup practices and the way we store data have also evolved in response to a
growing number of digital threats. Companies and individuals must protect their data in
smarter ways to keep attackers out. Hyper cloud storage therefore often comes with air-gapping,
segregation of duties, immutability, the 3-2-1-0 rule, and other protective features.
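The 3-2-1-0 rule mentioned above is simple enough to check programmatically: at least 3 copies of the data, on 2 different media types, with 1 copy off-site, and 0 errors on restore verification. The backup inventory below is hypothetical.

```python
# Sketch: validating a backup inventory against the 3-2-1-0 rule.
# The media types and flags here are illustrative example data.
backups = [
    {"media": "disk",  "offsite": False, "verified": True},
    {"media": "tape",  "offsite": False, "verified": True},
    {"media": "cloud", "offsite": True,  "verified": True},
]

def satisfies_3_2_1_0(copies: list) -> bool:
    enough_copies = len(copies) >= 3                        # "3" copies
    enough_media = len({c["media"] for c in copies}) >= 2   # "2" media types
    one_offsite = any(c["offsite"] for c in copies)         # "1" off-site
    zero_errors = all(c["verified"] for c in copies)        # "0" restore errors
    return enough_copies and enough_media and one_offsite and zero_errors

print(satisfies_3_2_1_0(backups))       # True
print(satisfies_3_2_1_0(backups[:2]))   # False: only two copies, none off-site
```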
The evolution of data storage has come a long way from punch cards to hyper clouds. Cloud-
based storage has become the new norm and it continues to grow. Even if the way we store
data may sound complex, storage has never been this simple, as it is possible to store and
retrieve data with a single click of your computer mouse. Simple and scalable data management
solutions are at the heart of Anycloud, with a product portfolio consisting of backup,
replication, and data storage solutions – all in a partner-ready set-up.

Architecture:
Over the years, storage technology and architecture have evolved to meet the ever-increasing
demand to store more data and information. Storage technology and architecture is the
combination of hardware and software components required to provide storage for a system.
The needs of average consumers have changed alongside the needs of companies and
businesses. Storage capacity demands have grown by leaps and bounds with the emergence of
the internet, e-mail, e-commerce, data warehousing, and other real-time applications. The
necessity of storing this huge amount of data, and then accessing it from across the world, has
driven dramatic changes in storage technologies.
These evolutions are measured against several parameters:
1. Accessibility- The availability of storage components to perform the desired
operations whenever needed.
2. Capacity- The amount of storage resources available. Capacity monitoring includes,
for example, examining the free space available on a disk; this supports uninterrupted
data availability and scalability.
3. Performance- Monitoring performance measures and analyses behaviour against a
predefined level or response time, in order to evaluate the efficiency of the storage
architecture.
4. Security- The security of the data and information to be stored is of utmost importance
in any storage architecture.
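Capacity monitoring of the kind described above can be sketched with Python's standard library. The warning threshold and mount point are illustrative assumptions.

```python
import shutil

# Capacity-monitoring sketch: report free space on a volume and raise a
# warning flag when usage crosses a threshold. Path and threshold are
# illustrative; a real monitor would check every managed volume.
def capacity_report(path: str = "/", warn_at: float = 0.90) -> dict:
    usage = shutil.disk_usage(path)          # total, used, free (bytes)
    fraction_used = usage.used / usage.total
    return {
        "total_gb": round(usage.total / 1e9, 1),
        "free_gb": round(usage.free / 1e9, 1),
        "fraction_used": round(fraction_used, 3),
        "warning": fraction_used >= warn_at,  # scalability headroom check
    }

print(capacity_report("/"))
```

A monitoring system would run such a check periodically and alert operators before free space is exhausted, which is exactly the "uninterrupted data and scalability" goal above.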
(For perspective, IBM’s first hard-disk drive, released in 1956, stored just 5 MB.)
RAID:
The second storage architecture we will look at is RAID, or Redundant Array of Independent
Disks. RAID coordinates multiple HDD devices to provide higher levels of reliability and
performance than a single drive could offer. Implementing RAID brings two benefits: data
redundancy and enhanced performance. Redundancy is achieved by mirroring (cloning) data
onto two or more disks, or by storing parity information from which lost data can be rebuilt.

RAID technology was delivered in low cost hardware and by the mid 1990s became standard
on servers.
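The two redundancy ideas behind RAID can be sketched on plain byte strings: mirroring (RAID 1) keeps identical copies on every disk, while parity (RAID 5 style) lets one lost disk be rebuilt by XOR-ing the survivors. The block contents are illustrative.

```python
# Sketch of RAID redundancy on byte strings (block contents are made up).
def xor_blocks(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length blocks byte by byte."""
    return bytes(x ^ y for x, y in zip(a, b))

# RAID 1 (mirroring): identical copies on each disk; either serves reads.
data = b"BLOCK-01"
mirror = [data, data]               # two-disk mirror

# RAID 5 style parity: parity of the data blocks is stored on a third disk.
d0, d1 = b"BLOCK-01", b"BLOCK-02"
parity = xor_blocks(d0, d1)

# Suppose the disk holding d1 fails: rebuild it from d0 and the parity.
rebuilt = xor_blocks(d0, parity)
print(rebuilt)                      # b'BLOCK-02'
```

This is why a single-drive failure in RAID 1 or RAID 5 causes no data loss: the missing content is always recoverable from the remaining disks.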
SAN (Storage Area Network) devices provide fast and continuous access to large amounts of
data. A SAN offers high performance, increased availability, scalability, and cost benefits
compared to DAS (Direct-Attached Storage), which makes it feasible for businesses to keep
their backup data in remote locations.

SAN proves to be helpful in disaster recovery and backup settings.


NAS, or Network Attached Storage. A NAS device is a storage device connected to a network
that allows storage and retrieval of data from a centralised location. NAS is dedicated mainly
to file sharing; it does not provide general-purpose services such as e-mail or authentication.
Unlike a SAN, this infrastructure connects to the LAN rather than attaching directly to the host.

NAS offers higher availability, scalability, cost benefits, and performance compared to
general-purpose file servers.
1.3 Data Centre:
Modern data centers are very different than they were just a short time ago. Infrastructure has
shifted from traditional on-premises physical servers to virtual networks that support
applications and workloads across pools of physical infrastructure and into a multicloud
environment.
In this era, data exists and is connected across multiple data centers, the edge, and public and
private clouds. The data center must be able to communicate across these multiple sites, both
on-premises and in the cloud. Even the public cloud is a collection of data centers. When
applications are hosted in the cloud, they are using data center resources from the cloud
provider.
Why are data centers important to business?
In the world of enterprise IT, data centers are designed to support business applications and
activities that include:
• Email and file sharing
• Productivity applications
• Customer relationship management (CRM)
• Enterprise resource planning (ERP) and databases
• Big data, artificial intelligence, and machine learning
• Virtual desktops, communications and collaboration services
What are the core components of a data center?
Data center design includes routers, switches, firewalls, storage systems, servers, and
application delivery controllers. Because these components store and manage business-critical
data and applications, data center security is critical in data center design. Together, they
provide:
Network infrastructure. This connects servers (physical and virtualized), data center services,
storage, and external connectivity to end-user locations.
Storage infrastructure. Data is the fuel of the modern data center. Storage systems are used to
hold this valuable commodity.
Computing resources. Applications are the engines of a data center. These servers provide the
processing, memory, local storage, and network connectivity that drive applications.
How do data centers operate?
Data center services are typically deployed to protect the performance and integrity of the core
data center components.
Network security appliances. These include firewall and intrusion protection to safeguard the
data center.
Application delivery assurance. To maintain application performance, these mechanisms
provide application resiliency and availability via automatic failover and load balancing.
What is in a data center facility?
Data center components require significant infrastructure to support the center's hardware and
software. These include power subsystems, uninterruptible power supplies (UPS), ventilation,
cooling systems, fire suppression, backup generators, and connections to external networks.
What are the standards for data center infrastructure?
The most widely adopted standard for data center design and data center infrastructure is
ANSI/TIA-942. It includes standards for ANSI/TIA-942-ready certification, which ensures
compliance with one of four categories of data center tiers rated for levels of redundancy and
fault tolerance.
Tier 1: Basic site infrastructure. A Tier 1 data center offers limited protection against physical
events. It has single-capacity components and a single, nonredundant distribution path.
Tier 2: Redundant-capacity component site infrastructure. This data center offers improved
protection against physical events. It has redundant-capacity components and a single,
nonredundant distribution path.
Tier 3: Concurrently maintainable site infrastructure. This data center protects against virtually
all physical events, providing redundant-capacity components and multiple independent
distribution paths. Each component can be removed or replaced without disrupting services to
end users.
Tier 4: Fault-tolerant site infrastructure. This data center provides the highest levels of fault
tolerance and redundancy. Redundant-capacity components and multiple independent
distribution paths enable concurrent maintainability, and a single fault anywhere in the
installation will not cause downtime.
Types of data centers:
Enterprise data centers
These are built, owned, and operated by companies and are optimized for their end users. Most
often they are housed on the corporate campus.
Managed services data centers
These data centers are managed by a third party (or a managed services provider) on behalf of
a company. The company leases the equipment and infrastructure instead of buying it.
Colocation data centers
In colocation ("colo") data centers, a company rents space within a data center owned by others
and located off company premises. The colocation data center hosts the infrastructure: building,
cooling, bandwidth, security, etc., while the company provides and manages the components,
including servers, storage, and firewalls.
Cloud data centers
In this off-premises form of data center, data and applications are hosted by a cloud services
provider such as Amazon Web Services (AWS), Microsoft (Azure), or IBM Cloud or other
public cloud provider.

1.5 Data center infrastructure components


Servers
Servers are powerful computers that deliver applications, services and data to end-user devices.
Data center servers come in several form factors:
• Rack-mount servers are wide, flat, stand-alone servers the size of a small pizza box.
They are stacked on top of each other in a rack to save space (versus a tower or desktop
server). Each rack-mount server has its own power supply, cooling fans, network
switches and ports, along with the usual processor, memory and storage.
• Blade servers are designed to save even more space. Each blade contains processors,
network controllers, memory and sometimes storage. They're made to fit into a chassis
that holds multiple blades and includes the power supply, network management and
other resources for all the blades in the chassis.
• Mainframes are high-performance computers with multiple processors that can do the
work of an entire room of rack-mount or blade servers. The first virtualizable
computers, mainframes can process billions of calculations and transactions in real
time.

Source: https://www.bmc.com/blogs/dcim-data-center-infrastructure-management/

Components of DCIM
DCIM solutions are made of several components. These support a variety of enterprise IT
functions at the infrastructure layer.
Physical architecture
The floor space of a data center is planned according to:
• The dimensions of the equipment
• Airflow and cooling
• Human access
• Other geometric and physical factors
Here, DCIM technology helps you visualize and simulate the representation of server racks
deployed in the data center, so you can determine if the physical space is satisfactory.
Rack design
Typically, you’ll use standardized cabinets to install server and networking technologies in your
data center. Understanding of the specifics associated with rack design can help data center
organizations to plan for capacity, space, cooling and access for maintenance and
troubleshooting.
DCIM can help optimize the selection and placement of server racks based on these factors.
Materials catalog
DCIM technologies contain vast libraries of equipment material. The information ranges from
basic parameter specifications to high-resolution renders. With new technologies introduced
rapidly in the industry, these libraries are updated and maintained regularly in coordination
with the vendors.
Change management
Data center hardware must be replaced periodically, due to a few reasons:
• The inherently limited lifecycle of hardware
• A malfunction
• The need to upgrade to a better product
This change, however, can affect the performance of other integrated infrastructure
technologies. DCIM allows a structured approach to manage such hardware changes, allowing
IT to change or replace hardware by:
• Following predefined process workflows
• Reducing the risks associated with the change
Capacity planning
The data center should be designed to scale in response to changing business needs. That means
your capacity planning must account for:
• Space limitations
• Weight of equipment and racks
• Power supply
• Cooling performance
• A range of other physical limitations of the data center
The DCIM tool can model a variety of future/potential scenarios, planning future capacity
based on these limitations.
Software integration
DCIM solutions integrate with existing management solutions that are designed to track and
coordinate data center assets and workflows. Integrations can include:
• Protocols such as SNMP and Modbus
• Complex web integrations
• CMDBs
Data analysis
Real-time data collection and analysis is a critical feature of DCIM technologies. With a DCIM
tool, you can:
• Track a variety of asset metrics
• Transfer data between DCIM solutions using web-based APIs
• Analyze data using advanced AI solutions
Looking at the real-time performance of these metrics can help you mitigate incidents such as
power failures, security infringements, and network outages before they occur.
Reporting & dashboard
A good DCIM tool transforms vast volumes of metrics log data into intuitive dashboards and
comprehensive reports. Automated actions can be triggered from this reporting information,
and the results studied for further analysis.

DCIM capabilities & vendors


These capabilities are delivered through multiple software modules and solutions, potentially
from different vendors, which can be integrated into a comprehensive DCIM suite.
Some of the popular DCIM vendors include:
• Nlyte Software
• Sunbird Software
• Vertiv
• Schneider Electric
• openDCIM.

1.6 ILM: Information Lifecycle Management


What is Information Lifecycle Management (ILM)?
Information Lifecycle Management (ILM) is a comprehensive approach to managing an
organisation's data throughout its lifecycle, from creation to deletion.
It encompasses a set of strategies, processes, and tools designed to store data, protect it, and
ensure its availability while optimising storage resources and maintaining compliance with
regulatory requirements. Unlike traditional data management approaches, ILM takes into
account the changing value of information over time and applies appropriate management
policies based on the data’s business value and legal or regulatory obligations.
How does ILM differ from traditional data management?
Traditional data management often treats all data equally, storing it in a single repository
without considering its varying importance or relevance over time. In contrast, ILM recognises
that the value of information changes throughout its lifecycle.
It implements intelligent data management practices that align with these changes, ensuring
that data is stored, accessed, and protected according to its current value and relevance to the
organisation. This dynamic approach allows for more efficient use of storage resources and
better alignment with business needs.
Why is ILM important for businesses and organisations?
In an era where data is often referred to as the new oil, effective management of information
assets has become crucial for business success. ILM helps organisations handle the exponential
growth of data, both structured and unstructured, by providing a framework to manage
information based on its value and relevance.
It enables businesses to comply with increasingly complex regulatory requirements, reduce
storage costs, improve data accessibility, and enhance overall operational efficiency. Moreover,
ILM plays a vital role in data protection and security, helping businesses safeguard sensitive
information and mitigate the risks of data loss or breaches.
What are the key components of an ILM strategy?
An effective ILM strategy comprises several key components that work together to manage
data throughout its lifecycle. These include data classification, which involves categorising
data based on its importance and sensitivity; storage tiering, which allocates data to appropriate
storage mediums based on its value and access requirements; data retention policies, which
define how long data should be kept and when it should be archived or deleted; and data
protection measures, which ensure the security and integrity of information assets.
Additionally, data governance plays a crucial role in ILM, providing the overarching
framework for how data should be managed, accessed, and used within the organisation.
What are the phases of Information Lifecycle Management?
The Information Lifecycle Management process typically consists of several distinct phases,
each addressing different aspects of data management throughout its lifecycle. Understanding
these phases is crucial for implementing an effective ILM strategy that aligns with an
organisation’s data management needs and business objectives.
The phases of Information Lifecycle Management typically include:
• Creation/Capture
• Storage
• Usage
• Archiving
• Retention
• Disposal/Destruction
• Compliance and Governance.
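The lifecycle phases above can be sketched as a simple age-based policy that decides which phase a record is in. This is a minimal illustration only; the phase names and day thresholds are assumptions chosen for the example, not taken from any specific ILM product.

```python
from datetime import date, timedelta

# Hypothetical retention rules: thresholds are illustrative assumptions.
POLICY = [
    (timedelta(days=90),   "primary"),    # hot data stays on fast storage
    (timedelta(days=365),  "archive"),    # cooler data moves to cheaper tiers
    (timedelta(days=2555), "retention"),  # roughly 7 years for compliance holds
]

def lifecycle_phase(created: date, today: date) -> str:
    """Map a record's age onto an ILM phase using the policy table."""
    age = today - created
    for threshold, phase in POLICY:
        if age <= threshold:
            return phase
    return "dispose"  # past all retention windows

today = date(2024, 1, 1)
print(lifecycle_phase(date(2023, 12, 1), today))  # primary
print(lifecycle_phase(date(2022, 6, 1), today))   # retention
print(lifecycle_phase(date(2010, 1, 1), today))   # dispose
```

In a real deployment the policy table would be driven by data classification and regulatory requirements rather than age alone.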
1.7 Components of a Storage System Environment:
The three main components in a storage system environment — the host, connectivity, and
storage — are described in this section.

1.7.1 Host
Users store and retrieve data through applications. The computers on which these applications
run are referred to as hosts. Hosts can range from simple laptops to complex clusters of servers.
A host consists of physical components (hardware devices) that communicate with one another
using logical components (software and protocols). Access to data and the overall performance
of the storage system environment depend on both the physical and logical components of a
host. The logical components of the host are detailed in Section 2.5 of this chapter.
Physical Components
A host has three key physical components:
➢ Central processing unit (CPU)
➢ Storage, such as internal memory and disk devices
➢ Input/Output (I/O) devices

The physical components communicate with one another by using a communication pathway
called a bus. A bus connects the CPU to other components, such as storage and I/O devices.
CPU
The CPU consists of four main components:
Arithmetic Logic Unit (ALU):
This is the fundamental building block of the CPU. It performs arithmetical and logical
operations such as addition, subtraction, and Boolean functions (AND, OR, and NOT).
Control Unit:
A digital circuit that controls CPU operations and coordinates the functionality of the CPU.
Register:
A collection of high-speed storage locations. The registers store intermediate data that is
required by the CPU to execute an instruction and provide fast access because of their
proximity to the ALU. CPUs typically have a small number of registers.
Level 1 (L1) cache:
Found on modern day CPUs, it holds data and program instructions that are likely to be needed
by the CPU in the near future. The L1 cache is slower than registers, but provides more storage
space.
Storage
Memory and storage media are used to store data, either persistently or temporarily. Memory
modules are implemented using semiconductor chips, whereas storage devices use either
magnetic or optical media. Memory modules enable data access at a higher speed than the
storage media. Generally, there are two types of memory on a host:
Random Access Memory (RAM):
This allows direct access to any memory location and can have data written into it or read from
it. RAM is volatile; this type of memory requires a constant supply of power to maintain
memory cell content. Data is erased when the system’s power is turned off or interrupted.
Read-Only Memory (ROM):
Non-volatile and only allows data to be read from it. ROM holds data for execution of internal
routines, such as system startup. Storage devices are less expensive than semiconductor
memory. Examples of storage devices are as follows:
➢ Hard disk (magnetic)
➢ CD-ROM or DVD-ROM (optical)
➢ Floppy disk (magnetic)
➢ Tape drive (magnetic)

I/O Devices

I/O devices enable sending and receiving data to and from a host. This communication may be
one of the following types:

User to host communications: Handled by basic I/O devices, such as the keyboard, mouse,
and monitor. These devices enable users to enter data and view the results of operations.
Host to host communications:
Enabled using devices such as a Network Interface Card (NIC) or modem.
Host to storage device communications:
Handled by a Host Bus Adaptor (HBA). An HBA is an application-specific integrated circuit
(ASIC) board that performs I/O interface functions between the host and the storage,
relieving the CPU from additional I/O processing workload.

HBAs also provide connectivity outlets known as ports to connect the host to the storage
device. A host may have multiple HBAs.

1.7.2 Connectivity

Connectivity refers to the interconnection between hosts or between a host and any other
peripheral devices, such as printers or storage devices. The discussion here focuses on the
connectivity between the host and the storage device. The components of connectivity in a
storage system environment can be classified as physical and logical. The physical components
are the hardware elements that connect the host to storage and the logical components of
connectivity are the protocols used for communication between the host and storage. The
communication protocols are covered in Chapter 5.

Physical Components of Connectivity
The three physical components of connectivity between the host and storage are the bus, port,
and cable (Figure 2-1).
The bus is the collection of paths that facilitates data transmission from one part of a computer
to another, such as from the CPU to the memory. The port is a specialized outlet that enables
connectivity between the host and external devices. Cables connect hosts to internal or external
devices using copper or fiber optic media.

Physical components communicate across a bus by sending bits (control, data, and address) of
data between devices. These bits are transmitted through the bus in either of the following
ways:

Serially:

Bits are transmitted sequentially along a single path. This transmission can be unidirectional or
bidirectional.
In parallel:
Bits are transmitted along multiple paths simultaneously. Parallel transmission can also be
bidirectional.
The size of a bus, known as its width, determines the amount of data that can be transmitted
through the bus at one time. The width of a bus can be compared to the number of lanes on a
highway. For example, a 32-bit bus can transmit 32 bits of data and a 64-bit bus can transmit
64 bits of data simultaneously. Every bus has a clock speed measured in MHz (megahertz).

This clock speed, together with the bus width, determines the data transfer rate between the
endpoints of the bus. A fast bus allows faster transfer of data, which enables applications to
run faster.
Buses, as conduits of data transfer on the computer system, can be classified as follows:

System bus:
The bus that carries data from the processor to memory.
Local or I/O bus:
A high-speed pathway that connects directly to the processor and carries data between
peripheral devices, such as storage devices, and the processor.
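The relationship between bus width, clock speed, and transfer rate described above is simple arithmetic: peak bandwidth is the width in bytes multiplied by the clock rate. The following sketch illustrates the calculation; it deliberately ignores protocol overhead, which real buses incur.

```python
def peak_bus_bandwidth(width_bits: int, clock_mhz: float) -> float:
    """Peak transfer rate in MB/s: (width in bytes) x (clock in MHz).
    A simplification: real buses lose some cycles to arbitration and overhead."""
    return (width_bits / 8) * clock_mhz

print(peak_bus_bandwidth(32, 33))   # a 32-bit bus at 33 MHz: 132.0 MB/s
print(peak_bus_bandwidth(64, 100))  # a 64-bit bus at 100 MHz: 800.0 MB/s
```

Doubling the width at the same clock speed doubles the peak rate, which is why the bus width is compared to the number of lanes on a highway.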

1.7.3 Storage
The storage device is the most important component in the storage system environment. A
storage device uses magnetic, optical, or solid-state media. Disks, tapes, and diskettes use
magnetic media; CD-ROMs and DVD-ROMs use optical media; and removable flash memory
cards use solid-state media.
1.8 Logical Components of Host and RAID:
The logical components of a host consist of the software applications and protocols that
enable data communication with the user as well as the physical components. Following
are the logical components of a host:
1. Operating system
2. Device drivers
3. Volume manager
4. File system
5. Application
1.Operating System
• An operating system controls all aspects of the computing environment. It works
between the application and physical components of the computer system. One of the
services it provides to the application is data access. The operating system also monitors
and responds to user actions and the environment. It organizes and controls hardware
components and manages the allocation of hardware resources. It provides basic
security for the access and usage of all managed resources. An operating system also
performs basic storage management tasks while managing other underlying
components, such as the file system, volume manager, and device drivers.
2.Device Driver
• A device driver is special software that permits the operating system to interact with a
specific device, such as a printer, a mouse, or a hard drive. A device driver enables the
operating system to recognize the device and to use a standard interface (provided as
an application programming interface, or API) to access and control devices.
3.Volume Manager
• Disk partitioning was introduced to improve the flexibility and utilization of HDDs. In
partitioning, an HDD is divided into logical containers called logical volumes (LVs).
For example, a large physical drive can be partitioned into multiple LVs to maintain
data according to the file system’s and applications’ requirements. The partitions are
created from groups of contiguous cylinders when the hard disk is initially set up on
the host. The host’s file system accesses the partitions without any knowledge of
partitioning and the physical structure of the disk. Concatenation is the process of
grouping several smaller physical drives and presenting them to the host as one logical
drive.
• The evolution of Logical Volume Managers (LVMs) enabled the dynamic extension of
file system capacity and efficient storage management. LVM is software that runs on
the host computer and manages the logical and physical storage. LVM is an optional,
intermediate layer between the file system and the physical disk. The LVM provides
optimized storage access and simplifies storage resource management. It hides details
about the physical disk and the location of data on the disk, and it enables administrators
to change the storage allocation without changing the hardware, even when the
application is running. The basic LVM components are physical volumes, volume
groups, and logical volumes. In LVM terminology, each physical disk connected to the
host system is a physical volume (PV).
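The PV, volume group, and LV relationship above can be sketched as a toy model: physical volumes pool their capacity into a volume group, and logical volumes are carved from that pool without regard to individual disk boundaries. The class name and sizes are illustrative assumptions, not a real LVM interface.

```python
# Toy model of LVM concepts. Sizes are in GB.
class VolumeGroup:
    def __init__(self, physical_volumes):
        # The VG's capacity is the pooled capacity of all its PVs.
        self.capacity = sum(physical_volumes)
        self.allocated = 0

    def create_lv(self, size):
        """Carve a logical volume out of the VG's free space."""
        if self.allocated + size > self.capacity:
            raise ValueError("volume group has insufficient free space")
        self.allocated += size
        return size

vg = VolumeGroup([500, 500, 250])   # three PVs pooled into one VG
vg.create_lv(600)                   # an LV larger than any single disk
print(vg.capacity - vg.allocated)   # 650 GB still free
```

The 600 GB logical volume spans disk boundaries transparently, which is exactly the abstraction that lets administrators change storage allocation without touching hardware.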
4.File System
o A file is a collection of related records or data stored as a unit with a name. A
file system is a hierarchical structure of files. File systems enable easy access to
data files residing within a disk drive, a disk partition, or a logical volume. A
file system needs host-based logical structures and software routines that control
access to files. It provides users with the functionality to create, modify, delete,
and access files. Access to the files on the disks is controlled by the permissions
given to the file by the owner, which are also maintained by the file system.
o A file system organizes data in a structured hierarchical manner via the use of
directories, which are containers for storing pointers to multiple files. All file
systems maintain a pointer map to the directories, subdirectories, and files that
are part of the file system. Some of the common file systems are as follows:
i. FAT 32 (File Allocation Table) for Microsoft Windows
ii. NT File System (NTFS) for Microsoft Windows
iii. UNIX File System (UFS) for UNIX
iv. Extended File System (EXT2/3) for Linux
5.Application
o An application is a computer program that provides the logic for computing
operations. It provides an interface between the user and the host and among
multiple hosts. Conventional business applications using databases have a three-tiered
architecture: the application user interface forms the front-end tier; the computing
logic, or the application itself, forms the middle tier; and the underlying databases
that organize the data form the back-end tier. The
application sends requests to the underlying operating system to perform read/
write operations on the storage devices. Applications can be layered on the
database, which in turn uses the OS services to perform R/W operations to
storage devices. These R/W operations enable transactions between the front-
end and back-end tiers.
1.8.1 RAID IMPLEMENTATION:
Software or hardware RAID implementation
All the computations involved in making RAID work require a lot of processing power. The
more complex the RAID configuration, the more CPU resources it requires. From a
computational point of view there is little difference between a software RAID implementation
and a hardware RAID implementation. Ultimately, the difference is in where RAID processing
is performed. It can either be performed by the server processor where the RAID system is
installed (that is the software implementation) or by an external processor (that is the hardware
implementation).
Hardware RAID implementation
In a hardware RAID implementation the drives are connected to a RAID controller card that
plugs into a PCI-Express (PCI-e) slot on the motherboard. This is done in the same way for
both large servers and desktop RAID installations. Most external devices have a RAID
controller card built into the device itself.

Benefits
• Better performance, especially for complex RAID configurations, because processing is
performed by a dedicated RAID processor rather than the computer's main processor.
This reduces the load on the system when writing data backups and reduces data recovery
time.
• More RAID configuration options, including hybrid configurations that may not be
available under certain operating system settings.
• Compatibility with various operating systems. This factor is critical if you plan to access
your RAID system from Mac and Windows computers simultaneously; a hardware RAID
implementation will be recognized by either system.
Disadvantages
• Since the system contains more hardware, initial deployment costs will be higher.
• Performance degradation in certain hardware RAID implementations when using solid
state disks (SSDs). Older RAID controllers do not offer the fast native SSD caching
needed to efficiently program and erase the drive.
• Some hardware RAID software is designed to work exclusively with the large systems
(general-purpose machines, Solaris RISC systems, Itanium, SAN) used in industrial
infrastructure.
Raid Software Implementation
When the disks storing information are connected directly to a computer or server without a
RAID controller, the chosen RAID configuration is handled by a utility included in the
operating system. This arrangement is called a software RAID implementation.
Many operating systems support RAID configuration, including Apple's macOS and Microsoft
Windows, various Linux distributions, the BSD variants (OpenBSD, FreeBSD, and NetBSD),
and Solaris Unix systems.
Benefits
• Low cost RAID deployment. All you need to do is to connect the drives and then
configure their use with the operating system.
• Today's computers are so powerful that their processors can easily handle RAID Level
0 and 1 without any noticeable performance degradation.
Disadvantages
• RAID software is often specific to the operating system you are using and therefore
cannot be used for disk arrays shared between different operating systems.
• You are limited to the RAID levels that your operating system can support.
• With more complex RAID configurations, computer performance suffers.
Software or hardware RAID implementations?
The winner of the RAID implementation comparison really depends on how you use your
system. If your intention is to save money (and who doesn't?), you will use a single operating
system to access a RAID 0 or RAID 1 array; a software RAID implementation then gives you
the same protection and experience as a more expensive hardware implementation.

If you are able to provide the initial investment then hardware RAID implementations are
definitely preferable. It will free you from the limitations of a software RAID implementation
and give you more flexibility in using and configuring RAID.

1.8.2 RAID LEVELS:


RAID is a technology that is used to increase the performance and/or reliability of data storage.
The abbreviation stands for either Redundant Array of Independent Drives or Redundant Array
of Inexpensive Disks, which is older and less used. A RAID system consists of two or more
drives working in parallel. These can be hard disks, but there is a trend to also use the
technology for SSDs (solid-state drives). There are different RAID levels, each optimized for
a specific situation. These are not standardized by an industry group or standardization
committee. This explains why companies sometimes come up with their own unique numbers
and implementations. This article covers the following RAID levels:
• RAID 0 – striping
• RAID 1 – mirroring
• RAID 5 – striping with parity
• RAID 6 – striping with double parity
• RAID 10 – combining mirroring and striping
The software to perform the RAID-functionality and control the drives can either be located on
a separate controller card (a hardware RAID controller) or it can simply be a driver. Some
versions of Windows, such as Windows Server 2012 as well as Mac OS X, include software
RAID functionality. Hardware RAID controllers cost more than pure software, but they also
offer better performance, especially with RAID 5 and 6.
RAID systems can be used with a number of interfaces, including SATA, SCSI, IDE, or FC
(Fibre Channel). There are systems that use SATA disks internally but have a FireWire or
SCSI interface for the host system.
Sometimes disks in a storage system are defined as JBOD, which stands for Just a Bunch Of
Disks. This means that those disks do not use a specific RAID level and act as stand-alone
disks. This is often done for drives that contain swap files or spooling data.
RAID (Redundant Arrays of Independent Disks)
RAID (Redundant Arrays of Independent Disks) is a technique that makes use of a combination
of multiple disks for storing the data instead of using a single disk for increased performance,
data redundancy, or to protect data in the case of a drive failure. The term was defined by David
Patterson, Garth A. Gibson, and Randy Katz at the University of California, Berkeley in 1987.
What is RAID?
RAID (Redundant Array of Independent Disks) is like having backup copies of your important
files stored in different places on several hard drives or solid-state drives (SSDs). If one drive
stops working, your data is still safe because you have other copies stored on the other drives.
It’s like having a safety net to protect your files from being lost if one of your drives breaks
down.
RAID combines multiple physical disk drives into a single logical unit for data storage. The
main purpose of RAID is to improve data reliability, availability, and performance. There are
different levels of RAID, each offering a different balance of these benefits.
How RAID Works?
Let us understand How RAID works with an example- Imagine you have a bunch of friends,
and you want to keep your favorite book safe. Instead of giving the whole book to just one
friend, you make copies of each chapter and hand them out so that every chapter is held by
more than one friend. Now, if one friend loses their chapters, you can still put the book
together from the copies. That's similar to how RAID works with hard
drives. It splits your data across multiple drives, so if one drive fails, your data is still safe on
the others. RAID helps keep your information secure, just like spreading your favorite book
among friends keeps it safe.
What is a RAID Controller?
A RAID controller is like a boss for your hard drives in a big storage system. It works between
your computer’s operating system and the actual hard drives, organizing them into groups to
make them easier to manage. This helps speed up how fast your computer can read and write
data, and it also adds a layer of protection in case one of your hard drives breaks down. So, it’s
like having a smart helper that makes your hard drives work better and keeps your important
data safer.
Types of RAID Controller
There are three types of RAID controller:
Hardware Based: In hardware-based RAID, there’s a physical controller that manages the
whole array. This controller can handle the whole group of hard drives together. It’s designed
to work with different types of hard drives, like SATA (Serial Advanced Technology
Attachment) or SCSI (Small Computer System Interface). Sometimes, this controller is built
right into the computer’s main board, making it easier to set up and manage your RAID system.
It’s like having a captain for your team of hard drives, making sure they work together
smoothly.
Software Based: In software-based RAID, the controller doesn't have its own special
hardware, so it uses the computer's main processor and memory to do its job. It performs the
same functions as a hardware-based RAID controller, like managing the hard drives and
keeping your data safe. But because it is sharing resources with other programs on your
computer, it might not make things run as fast. So, while it is still helpful, it might not give
you as big of a speed boost as a hardware-based RAID system.
Firmware Based: Firmware-based RAID controllers are like helpers built into the computer’s
main board. They work with the main processor, just like software-based RAID, but they only
operate when the computer starts up. Once the operating system is running, a special driver
takes over the RAID job. These controllers aren’t as expensive as hardware ones, but they make
the computer’s main processor work harder. People also call them hardware-assisted software
RAID, hybrid model RAID, or fake RAID.
Why Data Redundancy?
Data redundancy, although it takes up extra space, adds to disk reliability. In case of a disk
failure, if the same data is also stored on another disk, we can retrieve the data and continue
operations. On the other hand, if data is spread across multiple disks without redundancy, the
loss of a single disk can affect the entire data set.
Key Evaluation Points for a RAID System
• Reliability: How many disk faults can the system tolerate?
• Availability: What fraction of the total session time is a system in uptime mode, i.e.
how available is the system for actual use?
• Performance: How good is the response time? How high is the throughput (rate of
processing work)? Note that performance encompasses many parameters, not just these two.
• Capacity: Given a set of N disks each with B blocks, how much useful capacity is
available to the user?
RAID is transparent to the host system. A RAID array appears to the host as a single big disk
presenting itself as a linear array of blocks. This allows older storage technologies to be
replaced by RAID without making too many changes to existing code.
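The capacity evaluation point above works out differently for each RAID level. The sketch below computes usable capacity for the levels covered in this section, assuming N identical disks and ignoring metadata overhead; the function name is illustrative.

```python
def usable_capacity(level: int, n_disks: int, disk_gb: int) -> int:
    """Usable capacity in GB for common RAID levels (simplified)."""
    if level == 0:
        return n_disks * disk_gb         # striping: no redundancy overhead
    if level == 1:
        return (n_disks // 2) * disk_gb  # mirroring: half the raw space
    if level == 5:
        return (n_disks - 1) * disk_gb   # one disk's worth goes to parity
    if level == 6:
        return (n_disks - 2) * disk_gb   # two disks' worth goes to parity
    if level == 10:
        return (n_disks // 2) * disk_gb  # mirrored stripes: half the raw space
    raise ValueError("unsupported RAID level")

# Four 1 TB disks under each level:
for level in (0, 1, 5, 6, 10):
    print(level, usable_capacity(level, 4, 1000))
```

With four 1 TB disks, RAID 0 yields 4 TB usable, RAID 5 yields 3 TB, and RAID 1, 6, and 10 each yield 2 TB, which illustrates the trade-off between capacity and fault tolerance discussed below.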
RAID 0
Requiring a minimum of two disks, RAID 0 splits files and stripes the data across two disks or
more, treating the striped disks as a single partition. Because striping technology distributes
data across multiple disks in an array, reading one file requires reading multiple disks.
RAID 0 does not provide redundancy or fault tolerance. Since it treats multiple disks as a single
partition, if even one drive fails, the striped file is unreadable. This is not an insurmountable
problem in video streaming or computer gaming environments where performance matters the
most, and the source file will still exist even if the stream fails. However, it is a problem in
high-availability environments.
Advantages
RAID 0 is beneficial because of its speed. Because multiple hard drives are reading and writing
parts of the same file at the same time, throughput is generally faster.
Disadvantages
RAID 0’s lack of fault tolerance makes it unreliable for supporting important applications and
untenable for backing up any environments.
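Striping as described above can be sketched in a few lines: the data is cut into stripe units and dealt round-robin across the disks. The 4-byte stripe unit here is an arbitrary choice for illustration; real arrays use much larger units, typically tens of kilobytes.

```python
def stripe(data: bytes, n_disks: int, stripe_unit: int = 4):
    """Split data into stripe units and deal them round-robin across disks."""
    disks = [bytearray() for _ in range(n_disks)]
    for i in range(0, len(data), stripe_unit):
        disks[(i // stripe_unit) % n_disks] += data[i:i + stripe_unit]
    return disks

disks = stripe(b"ABCDEFGHIJKLMNOP", 2)
print(disks[0])  # bytearray(b'ABCDIJKL')
print(disks[1])  # bytearray(b'EFGHMNOP')
```

Reading the file back requires both disks, which is why losing either one makes the striped file unreadable: no single disk holds a complete copy.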
RAID 1
RAID 1 uses disk mirroring to provide data redundancy and failover. It reads and writes the
exact same data to each disk. Should a mirrored disk fail, the file exists in its entirety on the
functioning disk. Once IT replaces the failed disk, the RAID system will automatically mirror
back to the replacement drive. RAID 1 also increases read performance.

RAID 1 requires a minimum of two disks to work. It consumes more raw capacity on drives
(only half is usable), but it is an economical failover option on application servers.
Advantages
By copying one disk to another, RAID 1 decreases the chance of total data loss from a disk
failure.
Disadvantages
Because two disks store the same data, RAID 1 can only use half of the array’s total storage.
RAID 5
RAID 5 distributes striping and parity at a block level. Parity is redundant binary data
computed from the data blocks; the RAID system calculates its values to create a parity block,
which the system uses to recover striped data from a failed drive. Most RAID systems with
parity functions distribute parity blocks across the disks in the array. Some RAID systems
instead dedicate a disk to parity calculations, but these are rare.
RAID 5 stores parity blocks on striped disks. Each stripe has its own dedicated parity block.
RAID 5 can withstand the loss of one disk in the array.

RAID 5 combines the performance of RAID 0 with redundancy, but it sacrifices capacity to do
it: parity consumes the equivalent of one disk, about one third of usable capacity in a minimum
three-disk array. Read performance benefits because all drives in the array can serve requests
simultaneously. Overall disk performance, however, can suffer from write amplification, since
even minor changes to a stripe require multiple steps and parity recalculations.
Advantages
• RAID 5’s striping increases read performance.
• Parity improves data accuracy.
• RAID 5 can be used for SSDs as well as hard drives. But be careful with SSDs that are
the exact same age, since they may wear out and fail at around the same time.
Disadvantages
RAID 5 only has fault tolerance for one disk failure.

RAID 6
This RAID level operates like RAID 5 with distributed parity and striping. The main
operational difference in RAID 6 is that it requires a minimum of four disks, and the system
stores a second parity block for each stripe, distributed across the disks. This enables a
configuration in which two disks can fail before the array becomes unavailable. Its primary
uses are application servers and large storage arrays.
RAID 6 offers higher redundancy and increased read performance over RAID 5. It can suffer
from the same server performance overhead with intensive write operations. This performance
hit depends on the RAID system architecture: hardware or software, if it’s located in firmware,
and if the system includes processing software for high-performance parity calculations.
Advantages
• RAID 6 arrays can withstand two drive failures because they have two instances of
parity rather than a single one.
• RAID 6 has better read performance than RAID 5.
Disadvantages
• RAID 6 is more expensive than some other forms of RAID.
• Rebuilding data on larger RAID 6 arrays can be a slow process.

RAID 10: Striping and Mirroring


RAID 10 requires a minimum of four disks in the array. It stripes across disks for higher
performance, and mirrors for redundancy. In a four-drive array, the system stripes data to two
of the disks. The remaining two disks mirror the striped disks, each one storing half of the data.
RAID 10 combines the benefits of RAID 0 and RAID 1: faster read times and some redundancy
through mirroring.
This RAID level serves environments that require both high data security and high
performance, such as high transactional databases that store sensitive information. It’s the most
expensive of the RAID levels, with lower usable capacity and high system costs.
Advantages
• RAID 10 rebuilds data more quickly than other RAID implementations.
• RAID 10 has fast overall read operations.
Disadvantages
• RAID 10 is the most expensive variation of RAID.
• RAID 10 guarantees fault tolerance for only one disk failure, unlike RAID 6; a second failure is survivable only if it strikes a different mirrored pair.
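The capacity trade-offs of the levels covered above can be summarized in a short sketch. The formulas below reflect the rules of thumb described in this section (mirroring halves raw capacity, RAID 5 loses one disk's worth to parity, RAID 6 loses two) and assume equal-sized disks with valid minimum disk counts per level.

```python
def usable_capacity(level, disks, disk_tb):
    """Approximate usable capacity (in TB) for common RAID levels,
    assuming equal-sized disks and valid minimum disk counts."""
    if level == 0:
        return disks * disk_tb          # pure striping, no redundancy
    if level == 1:
        return disks * disk_tb // 2     # mirroring: half the raw capacity
    if level == 5:
        return (disks - 1) * disk_tb    # one disk's worth of parity
    if level == 6:
        return (disks - 2) * disk_tb    # two parity blocks per stripe
    if level == 10:
        return disks * disk_tb // 2     # mirrored stripes: half the array
    raise ValueError(f"unsupported RAID level: {level}")

# A four-drive array of 2 TB disks under each level:
for lvl in (0, 1, 5, 6, 10):
    print(f"RAID {lvl}: {usable_capacity(lvl, 4, 2)} TB usable")
```

For the same four 2 TB disks, RAID 0 exposes all 8 TB, RAID 5 exposes 6 TB, and RAID 1, 6, and 10 each expose 4 TB, which illustrates why RAID 10's combination of striping and mirroring is described above as the most expensive option per usable terabyte.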
1.9 Intelligent Storage System:
Intelligent storage is a storage system or service that uses AI to continuously learn and adapt
to its hybrid cloud environment so that it can better manage and serve data. It can be deployed
as on-premises hardware, as a virtual appliance, or as a cloud service.
1.1 Front End: the front-end ports and front-end controllers that connect the hosts to the storage system.
1.2 Cache: data access on rotating disks usually takes several milliseconds because of seek
time and rotational latency, whereas accessing data from cache is fast and typically takes less
than a millisecond. Cache is organized into pages, and the size of a cache page is configured
according to the application I/O size. The data store holds the data, whereas the tag RAM
tracks the location of the data in the data store (see Figure 4-2) and on the disk.

1.3 Back End: the interface between cache and the physical disks. It consists of two
components: back-end ports and back-end controllers.
For high data protection and high availability, storage systems are configured with dual
controllers that have multiple ports. Protection and availability are further enhanced if the
disks are also dual-ported; in that case, each disk port can connect to a separate controller.
Multiple controllers also facilitate load balancing.
2. Storage Provisioning
Storage provisioning is the process of assigning storage resources to hosts based on the
capacity, availability, and performance requirements of the applications running on those hosts.
2.1 Traditional Storage Provisioning
In traditional provisioning, physical disks are grouped into RAID sets. The number of drives
in the RAID set and the RAID level determine the availability, capacity, and performance of
the RAID set.
Logical units are created from a RAID set by partitioning its available capacity into smaller
units, which can be seen as slices of the RAID set.
Each logical unit created from the RAID set is assigned a unique ID, called a logical unit
number (LUN).
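As a rough illustration of traditional provisioning, the following sketch partitions a hypothetical RAID set's usable capacity into fixed-size logical units; the capacity figures and sequential ID scheme are assumptions for the example.

```python
def provision_luns(raid_set_gb, lun_size_gb, first_lun=0):
    """Slice a RAID set's usable capacity into equal-sized logical units,
    assigning each slice a sequential logical unit number (LUN)."""
    luns = []
    remaining = raid_set_gb
    lun_id = first_lun
    while remaining >= lun_size_gb:
        luns.append({"lun": lun_id, "size_gb": lun_size_gb})
        lun_id += 1
        remaining -= lun_size_gb
    return luns, remaining  # any leftover capacity stays unprovisioned

# Carve three 300 GB LUNs out of a 1,000 GB RAID set, leaving 100 GB.
luns, leftover = provision_luns(1000, 300)
```

Each LUN is then presented to a host as if it were an independent disk, even though it is only a slice of the underlying RAID set.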
Why Is Intelligent Storage an Improvement on Previous Storage Solutions?
In the past, data storage was often a monolithic, device-centric service offering only one kind
of access. Today’s storage solutions must support more inputs, more types of data, dispersed
instances, and of course, more users.
Intelligent storage systems remove the barriers created by traditional storage, offering
operational agility, efficiency gains, and the ability to adapt to usage patterns on the fly. For
example, intelligent storage can shift infrequently accessed data to cheaper data storage and
even identify security threats by tracking anomalous data.
Intelligent storage can help digital enterprises extract more value from data, which is becoming
more critical to operations and competitive posture and more complex to collect, store, and
access.
How Does Intelligent Storage Work?
Intelligent storage is storage hardware enhanced with compute resources for software and
processing. Intelligent storage can also be deployed as a virtual machine or a cloud-based
service. Hardware vendors are adding this intelligent storage capability directly to flash
modules and building it into data centre storage arrays. Other approaches use emerging
technologies in intelligent storage, such as software-defined storage.
AI creates the “intelligence” in intelligent storage. AI can be built into storage systems in
several ways: at device levels, tiered storage and data lifecycle management, and data
accessibility support. For each category, AI can analyse the access patterns and performance
associated with data and automatically perform actions to keep speed and access at their peak.
However intelligent storage is built, the goal is to offer storage that optimises collection,
organisation, caching, delivery, and even costs and power consumption. This might entail
unifying data types, extraction from unstructured data, replication and distribution,
data deduplication, strategic caching, backups, and security.
Intelligent storage can work via user-defined algorithms, or, as is increasingly the case, the use
of artificial intelligence and machine learning to learn behaviors, assess patterns, and optimise
storage to meet business goals. Intelligent storage can also flag potential problems, even those
outside storage, before they impact other business processes.
Intelligent Storage Technologies
Technologies enabling intelligent storage include the following:
• AI analytics: Using processes such as analytical queries and predictive modeling, AI
can optimise data storage by looking for patterns and anomalies that indicate problems
or forecast outcomes.
• Internet of things (IoT): IoT sensors and other devices monitor intelligent storage
solutions to generate data for AI analytics on functions such as performance.
• Tiered storage: Segregating storage based on need and priority helps control
bandwidth usage so that essential low-latency applications won’t be adversely affected.
• Smart network interface card (smartNIC): This specialized module makes data
centre networking, security, and storage more efficient and flexible. SmartNICs also
relieve server CPUs of management tasks for modern distributed applications.
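As an illustration of the tiered-storage idea above, the sketch below demotes data that has not been accessed recently to a cheaper tier and keeps recently used data on fast storage; the tier names and the 30-day threshold are assumptions for the example, and a real intelligent storage system would drive such decisions from learned access patterns.

```python
import time

HOT_TIER, COLD_TIER = "flash", "nearline"

def retier(objects, now, cold_after_days=30):
    """Toy tiering pass: place objects idle longer than the window on
    cheaper storage, and keep recently accessed objects on the fast tier."""
    for obj in objects:
        idle_days = (now - obj["last_access"]) / 86400
        obj["tier"] = COLD_TIER if idle_days > cold_after_days else HOT_TIER
    return objects

now = time.time()
objs = [
    {"name": "invoices.db", "last_access": now - 3600, "tier": HOT_TIER},
    {"name": "2019-archive", "last_access": now - 90 * 86400, "tier": HOT_TIER},
]
retier(objs, now)  # the hour-old object stays hot; the 90-day-old one goes cold
```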
Benefits of Intelligent Storage
Because intelligent storage automatically manages tasks like performance and resource
maximization, you gain time back for other tasks. Intelligent storage also offers:
• Compatibility: Intelligent storage supports new workflows such as those for IoT
devices, DevOps, containerization, and the cloud operating model. Since it’s endlessly
expandable and customizable, intelligent storage can adapt to future technology
evolution and innovation.
• Performance: Intelligent storage supports faster performance and lower latency by
optimizing the placement and flow of data for critical applications.
• Efficiency: Intelligent storage can optimise for performance and specific workflows,
but it can also optimise for cost and maximising of resources—for example, through
advanced deduplication and reduction of redundancies and even management of cloud
bursting. Over time, these efficiencies can reduce CAPEX and storage footprints.
• Compliance and security: By offering a way to customize data flows and access
policies, intelligent storage presents new possibilities for strengthening security. Its
algorithms can also factor in data governance rules, providing a basis for automated
compliance.
1.9.1 Intelligent storage system Components:
An intelligent storage system consists of four key components: front end, cache, back end, and
physical disks. Figure 4-1 illustrates these components and their interconnections.

Front End
Notice this is the interface between the hosts in your network and the Intelligent Storage System
itself. Most front ends will have redundant front-end controllers, as well as redundant ports for
connectivity. Supported protocols include Fibre Channel, iSCSI, FICON, and FCoE.
Cache
Cache is very fast memory that exists to speed up the IO processes. If cache is doing its job
properly, the number of mechanical disk operations is dramatically reduced.
The cache is organized in pages and consists of the data store and tag RAM. The data store
holds the data that is being read or written, and the tag RAM is responsible for tracking the
location of the data in the actual data storage (the physical disks).
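A toy model of the data store and tag RAM interaction might look like the following sketch; the page count, the dictionary-based tag lookup, and the disk-read callback are illustrative assumptions, not any vendor's design.

```python
class SimpleCache:
    """Minimal sketch of a read cache: the data store holds page contents,
    and a tag table (standing in for tag RAM) maps disk addresses to pages."""

    def __init__(self, num_pages):
        self.data_store = [None] * num_pages   # the cache pages themselves
        self.tags = {}                         # disk address -> page index
        self.free = list(range(num_pages))

    def read(self, disk_addr, read_from_disk):
        if disk_addr in self.tags:             # cache hit: sub-millisecond
            return self.data_store[self.tags[disk_addr]]
        data = read_from_disk(disk_addr)       # miss: pay seek + rotation
        if self.free:                          # populate a free page
            page = self.free.pop()
            self.data_store[page] = data
            self.tags[disk_addr] = page
        return data

cache = SimpleCache(num_pages=4)
first = cache.read(42, lambda addr: b"payload")   # miss, fetched from "disk"
second = cache.read(42, lambda addr: b"other")    # hit, served from cache
```

On the second read the tag table already maps address 42 to a page, so the disk is never touched, which is exactly the mechanical-operation savings described above.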
Back End
The job of the back end is to provide an interface between the cache and the physical disks. It
consists of back-end controllers and back-end ports. The back end is often responsible for error
detection and correction, as well as the RAID functionality of the system. Like the front end,
most systems provide multiple controllers and multiple ports for the most redundancy possible.
Physical Disks
Another great feature of the Intelligent Storage System is the variety that is possible with the
physical disks. Most systems support a variety of disks and speeds, including FC, SATA, SAS,
and flash drives. Most even support a mix of disks in the same array!
1.9.3 Intelligent Storage Array
Intelligent storage systems generally fall into one of the following two categories:
➢ High-end storage systems
➢ Midrange storage systems

Traditionally, high-end storage systems have been implemented with active-active arrays,
whereas midrange storage systems used typically in small- and medium sized enterprises have
been implemented with active-passive arrays. Active-passive arrays provide optimal storage
solutions at lower costs. Enterprises make use of this cost advantage and implement active-
passive arrays to meet specific application requirements such as performance, availability, and
scalability. The distinctions between these two implementations are becoming increasingly
insignificant.
1.9.3.1 High-end Storage Systems
High-end storage systems, referred to as active-active arrays, are generally aimed at large
enterprises for centralizing corporate data. These arrays are designed with a large number of
controllers and cache memory. An active-active array implies that the host can perform I/Os to
its LUNs across any of the available paths (see Figure 4-7).
To address the enterprise storage needs, these arrays provide the following capabilities:
➢ Large storage capacity
➢ Large amounts of cache to service host I/Os optimally
➢ Fault tolerance architecture to improve data availability
➢ Connectivity to mainframe computers and open systems hosts
➢ Availability of multiple front-end ports and interface protocols to serve a large
number of hosts
➢ Availability of multiple back-end Fibre Channel or SCSI RAID controllers to
manage disk processing
➢ Scalability to support increased connectivity, performance, and storage capacity
requirements
➢ Ability to handle large amounts of concurrent I/Os from a number of servers and
applications
➢ Support for array-based local and remote replication
In addition to these features, high-end arrays possess some unique features and functionalities
that are required for mission-critical applications in large enterprises.
1.9.3.2 Midrange Storage System
Midrange storage systems are also referred to as active-passive arrays and they are best suited
for small- and medium-sized enterprises.
In an active-passive array, a host can perform I/Os to a LUN only through the paths to the
owning controller of that LUN. These paths are called active paths.
The other paths are passive with respect to this LUN. As shown in Figure 4-8, the host can
perform reads or writes to the LUN only through the path to controller A, as controller A is the
owner of that LUN.
The path to controller B remains passive and no I/O activity is performed through this path.
Midrange storage systems are typically designed with two controllers, each of which contains
host interfaces, cache, RAID controllers, and disk drive interfaces.
Midrange arrays are designed to meet the requirements of small and medium enterprises;
therefore, they host less storage capacity and global cache than active-active arrays.
There are also fewer front-end ports for connection to servers. However, they ensure high
redundancy and high performance for applications with predictable workloads.
They also support array-based local and remote replication.
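The active-path rule described above can be sketched as a simple path-selection routine; the path and controller names are hypothetical.

```python
def select_path(lun_owner, paths):
    """In an active-passive array, I/O to a LUN is allowed only through
    paths to the controller that owns the LUN; other paths stay passive."""
    active = [p for p in paths if p["controller"] == lun_owner]
    if not active:
        raise RuntimeError("no active path: owning controller unreachable")
    return active[0]

paths = [
    {"name": "path-1", "controller": "A"},
    {"name": "path-2", "controller": "B"},
]

# Controller A owns the LUN, so I/O is routed over path-1;
# path-2 remains passive until ownership fails over to controller B.
chosen = select_path("A", paths)
```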

-------------------------------------------------------****************---------------------------------
