Computer Storage Fundamentals: Storage system, storage networking and host connectivity
About this ebook
This book explains the basic concepts of computer storage and its fundamental features and functions. It also covers how application servers access storage systems over a network. Storage vendors use different names for the physical and logical components of a storage system, so this book focuses on the concepts and uses simple, commonly understood terminology. Almost all modern storage systems implement virtualization to improve performance and fault tolerance, and this book explains those implementation aspects in simple terms.
Computer Storage Fundamentals - Susanta Dutta
Chapter 1
Storage Systems and Solutions
Learning objectives
This chapter discusses the different types of storage systems, the technologies used to store data, and the solutions deployed in small, medium, and large organizations. You will also learn the pros and cons of each storage system with respect to its use in those solutions.
Upon successful completion of this chapter, you will have learned about the following areas:
Types of storage systems: block, file, and object storage systems
Types of storage solutions:
DAS, NAS, SAN and Cloud storage
Entry level, mid-range and enterprise storage solutions
Storage solution components
Hyper-convergence Infrastructure
Introduction
Computer data storage has been evolving since the 1950s. The earliest storage systems were simple and elementary. In those days, a host server was connected directly to its storage devices, and the storage device was an integral part of the computer system itself. There was no concept of a separate system for storing data remotely. Over time, to store data faster and more efficiently, the storage system became a separate entity that multiple computers can share to store their data.
Types of storage systems
Three technologies are most commonly used to store data on a storage system. Storage systems are categorized by how the data is stored and how host servers access it.
Block storage: Block storage gives the host server access to raw block devices (volumes). These volumes are controlled by the server's operating system, and each volume can be formatted individually with the required file system, such as NTFS or VMFS. Applications can then read and write data on the file system. Some applications can also access the raw volume directly, without any file system.
Figure 1.1: Block Storage
The storage system has no knowledge of the data stored on it; only the operating system and the application understand the data.
Block storage is primarily used for structured data, such as relational databases. The host server runs the database and performs random read/write operations on the storage, while many client systems access the application running on the host server over the network.
Block-level storage can also serve as a boot device for the systems connected to it.
Block-level storage can be used to store files and can serve as storage for special applications such as databases and virtual machine file systems.
Data transport in block-level storage is efficient and reliable.
Each storage volume can be treated as an independent disk drive and can be controlled by the operating system of an external server. Block-level storage uses the Fibre Channel, iSCSI, and FCoE protocols for data transfer, with SCSI commands acting as the communication interface between the initiator (the host) and the target (the storage system).
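The idea of a raw volume addressed purely by block number can be sketched in a few lines of Python. This is a minimal in-memory illustration, not how any real array or SCSI stack is implemented; the class name `BlockDevice`, the 512-byte block size, and the sample payload are assumptions made for the example.

```python
class BlockDevice:
    """Toy raw volume: a fixed number of fixed-size blocks addressed by number."""

    def __init__(self, num_blocks, block_size=512):
        self.num_blocks = num_blocks
        self.block_size = block_size
        # An unformatted volume: the device holds bytes and nothing else.
        self._data = bytearray(num_blocks * block_size)

    def _check(self, lba):
        if not 0 <= lba < self.num_blocks:
            raise IndexError("block address out of range")

    def write_block(self, lba, payload):
        """Overwrite one whole block at logical block address `lba`."""
        self._check(lba)
        if len(payload) > self.block_size:
            raise ValueError("payload larger than one block")
        start = lba * self.block_size
        # Pad short payloads with zeros so every write is exactly one block.
        padded = payload.ljust(self.block_size, b"\x00")
        self._data[start:start + self.block_size] = padded

    def read_block(self, lba):
        """Return the full contents of one block."""
        self._check(lba)
        start = lba * self.block_size
        return bytes(self._data[start:start + self.block_size])


volume = BlockDevice(num_blocks=8)
volume.write_block(3, b"row-42")         # random write to block 3
first_bytes = volume.read_block(3)[:6]   # random read of the same block
untouched = volume.read_block(0)         # a block that was never written
```

The device never interprets the payload: it is the host side (operating system, file system, or database) that knows block 3 holds a database row, which mirrors the statement that the storage has no knowledge of the data stored on it.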
File storage: In file storage, data is stored and accessed by filename and directory location over a Local Area Network (LAN) or Wide Area Network (WAN). In file-level storage, the storage space is configured so that files are accessed with a protocol such as Network File System (NFS) or Server Message Block (SMB), an early dialect of which is known as Common Internet File System (CIFS).
A file storage system uses block storage internally, with a local file system, to store these files; the host server only reads and writes the files. File storage is deployed mainly where applications need to access data inside files, or where groups of people need to store files such as documents, spreadsheets, presentations, audio, and images on a shared storage system.
Figure 1.2: File Storage
Computer systems that access the files and folders on file storage are called NAS clients. They can run an application that reads and writes data in those files, while other clients access the application over the network. For example, a hypervisor-enabled operating system can host virtual machines (VMs) on NAS shares, and clients can access those virtual machines.
Object storage: Object storage is a relatively new type of storage system that stores data as objects, each with its metadata and a unique identifier. It has evolved in recent years primarily to store massive amounts of unstructured data for cloud, Big Data, and mobile applications. Data often stored as objects includes songs, images, and video clips. Examples of object storage in the cloud are Amazon S3 (Simple Storage Service), Azure Blob Storage, and Google Cloud Storage.
Figure 1.3: Object Storage
Objects are transferred to and from a client system through a Representational State Transfer (REST) API over HTTP.
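The object model (data plus metadata plus a unique identifier, in a flat namespace) can be illustrated with a small in-memory sketch. The class and method names below are hypothetical and do not correspond to the API of Amazon S3, Azure Blob Storage, or any other product; `put`, `get`, and `delete` are only loosely modeled on the HTTP verbs a REST interface would expose.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class StoredObject:
    data: bytes
    metadata: dict = field(default_factory=dict)


class ObjectStore:
    """Toy flat object store: no directories, only object IDs."""

    def __init__(self):
        self._objects = {}  # object ID -> StoredObject

    def put(self, data, **metadata):
        """Store the data with its metadata and return a new unique ID."""
        object_id = uuid.uuid4().hex
        self._objects[object_id] = StoredObject(data, dict(metadata))
        return object_id

    def get(self, object_id):
        """Return the object (data and metadata) for a given ID."""
        return self._objects[object_id]

    def delete(self, object_id):
        del self._objects[object_id]


store = ObjectStore()
photo_id = store.put(b"\xff\xd8 jpeg bytes", content_type="image/jpeg", owner="alice")
photo = store.get(photo_id)
```

Unlike file storage, the client never supplies a directory path: it keeps the returned identifier and uses it for every later request, and the metadata travels with the object rather than living in a separate file system structure.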
Storage solutions
The hardware and software components that together store and protect digital data are called a storage solution. A storage solution has the following two primary components:
Storage system: A storage system is a hardware device connected to computers through a network for storing data. A storage disk array is an example of a storage system.
Host server: A computer server that runs applications, which read and write the data stored on the storage systems. It is also called an application server, a host computer, or simply a host or a server.
A storage solution is formed by connecting one or more host servers to storage systems.
Types of storage solutions
Storage solutions are commonly categorized in two ways: first, by how the storage systems are connected to the host servers, and second, by the configuration and complexity of the deployed solution.
Storage solution based on host connectivity:
The type of storage solution an organization chooses is driven primarily by its requirements. The choice of storage system and of its connectivity to the host servers follows from those requirements.
In the previous section, we learned that the three most popular storage system technologies are block storage, file storage, and object storage. Storage systems are primarily identified by these technologies.
Storage Area Network
Block storage systems are deployed in a Storage Area Network (SAN). SAN storage solutions are typically more complex and expensive than other storage solutions. A SAN is generally deployed in-house to host business-critical applications that require high performance and data availability. Because block storage systems are generally connected in a SAN, they are often referred to as SAN storage. In a SAN storage solution, the file system is created on the host server. Therefore, host servers with different operating systems and file systems can share the same storage system.
Network Attached Storage
File-level storage is deployed in an Ethernet network environment. This solution is popularly called a Network Attached Storage (NAS) solution. Because files are stored on the storage system, the file system is created within the storage system itself.
Figure 1.4: NAS storage Solution and SAN Storage Solution
Direct Attached Storage
In Direct Attached Storage (DAS), disk storage is connected directly to the host server. DAS supports block access to data, just like SAN. This solution is simple to manage and less expensive. Because of several limitations, traditional DAS is not widely used today. The main limitations are inefficient use of disk space, because capacity cannot be shared across multiple servers, and the lack of storage virtualization.
Although DAS is a traditional storage solution, it is becoming popular again with the adoption of software-defined storage, which is discussed in later chapters.
Figure 1.5: DAS storage solution
There are several models of DAS storage solution: a server with a built-in internal disk enclosure, a server with an external disk enclosure, and a server with both internal and external enclosures.
DAS storage with a built-in internal disk enclosure saves rack space and power; however, external disk enclosures must be attached to add more disk space.
In a DAS storage solution, the disk storage, file system, and application layers all belong to the host server. For this reason, all resources are dedicated to a single server and cannot be shared with other host servers.
Figure 1.6: Direct attached storage stacks
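The utilization limitation of DAS can be made concrete with a small worked example. The server names and capacity figures below are hypothetical, chosen only to show how free space gets stranded when each server owns its disks and how the same disks behave when treated as one shared pool.

```python
# Hypothetical figures: (installed TB, used TB) for each server's own disks.
servers = {
    "app01": (10, 9),
    "db01": (10, 3),
    "web01": (10, 2),
}

installed_tb = sum(capacity for capacity, _ in servers.values())
used_tb = sum(used for _, used in servers.values())
utilization = used_tb / installed_tb

# With DAS, free space is only usable by the server that owns the disks.
free_per_server = {name: cap - used for name, (cap, used) in servers.items()}


def can_grow_das(name, extra_tb):
    """A server can only grow into its own directly attached disks."""
    return free_per_server[name] >= extra_tb


def can_grow_shared(extra_tb):
    """With a shared pool, any server can draw on the total free space."""
    return installed_tb - used_tb >= extra_tb


app01_fits_on_das = can_grow_das("app01", 3)
app01_fits_in_pool = can_grow_shared(3)
```

In this example less than half of the installed capacity is in use, yet `app01` cannot grow by 3 TB on its own disks, while the same request fits easily once the capacity is pooled.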
All data is finally stored as blocks
Host servers and client systems access storage systems through different protocols and store data differently. For example, a client system accesses file storage to store files, because file storage is designed to communicate through the CIFS or NFS protocol over an IP network interface. From the client system's perspective, the data is stored as a file, but within the file storage the files reside on a file system. That file system presents a set of data blocks on disk to the client system as a file; beneath the file system are storage blocks.
Although the type of storage is determined by the interfaces and communication protocols it offers to clients or host servers, fundamentally every storage system is block storage.
Figure 1.7: Data storing technologies
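How a file system presents a set of blocks as a named file can be sketched as follows. This is a deliberately tiny model under stated assumptions: the 4-byte block size, the `TinyFileSystem` class, and its allocation table are invented for illustration and bear no relation to the on-disk layout of NTFS, VMFS, or any real file system.

```python
BLOCK_SIZE = 4  # unrealistically small so the block boundaries are easy to see


class TinyFileSystem:
    """Toy file system that maps file names onto fixed-size storage blocks."""

    def __init__(self, num_blocks):
        self.blocks = [bytes(BLOCK_SIZE)] * num_blocks  # the underlying block storage
        self.free = list(range(num_blocks))             # block numbers not yet allocated
        self.table = {}                                 # file name -> (block numbers, length)

    def write_file(self, name, data):
        if name in self.table:
            raise FileExistsError(name)
        needed = (len(data) + BLOCK_SIZE - 1) // BLOCK_SIZE  # ceiling division
        if needed > len(self.free):
            raise OSError("no space left on device")
        allocated = [self.free.pop(0) for _ in range(needed)]
        for i, block_no in enumerate(allocated):
            chunk = data[i * BLOCK_SIZE:(i + 1) * BLOCK_SIZE]
            # The last chunk is padded so every block is full size.
            self.blocks[block_no] = chunk.ljust(BLOCK_SIZE, b"\x00")
        self.table[name] = (allocated, len(data))

    def read_file(self, name):
        block_numbers, length = self.table[name]
        raw = b"".join(self.blocks[n] for n in block_numbers)
        return raw[:length]  # drop the padding added on write


fs = TinyFileSystem(num_blocks=8)
fs.write_file("report.txt", b"hello world")
contents = fs.read_file("report.txt")
block_numbers, length = fs.table["report.txt"]
```

The client only ever asks for `report.txt`; the table that turns that name into block numbers lives inside the file system, which is the layer a file storage system keeps internally and a block storage system leaves to the host.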
Cloud Storage
Typically, large enterprise object storage systems are deployed in a datacenter, and client computers and mobile devices are given access to store their data on them through the internet. This service model is called cloud storage.
Its great advantage is that a user can access the data from anywhere in the world. In general, personal files such as pictures and videos, as well as archival data, are stored in cloud storage, but not business-critical application data, which requires faster response times.
Figure 1.8: Cloud Storage
The cloud service provider ensures data availability for users by implementing various data protection solutions in the storage systems.
Generally, the service provider charges the user for this service.
Storage solutions based on deployment:
Storage solutions can also be classified based on configuration and level of complexity of the deployed solution:
Entry level storage solution:
This storage solution is deployed when an organization needs to store a small amount of data. It is typically implemented with DAS or with basic SAN or NAS storage. In a DAS solution, the storage devices can be internal or external to the host server. The solution is simple to manage and low in cost. It has several limitations: the failure of any component can cause data loss, performance is low, and storage capacity scales only up to a few terabytes (TB).
Figure 1.9: Entry level storage solutions
Mid-range storage solution:
A mid-range storage solution is deployed when the organization needs to store a few hundred terabytes. It is typically implemented using SAN or NAS storage, with the host servers clustered for high availability.
When host servers are clustered for high availability, each storage volume is made accessible to all host servers in the cluster, so that if one host server fails, another server can continue to access the data on the volume. Because the volume is shared across multiple servers, this access configuration is called a shared volume.
Figure 1.10: Mid-range storage solution
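The shared-volume behavior in a host cluster can be sketched as a toy model. The `Cluster` class, the host names, and the "first healthy host" selection rule are assumptions for illustration; real cluster software also handles locking, fencing, and heartbeats, none of which is modeled here.

```python
class Cluster:
    """Toy host cluster in which every member can reach the same shared volume."""

    def __init__(self, hosts):
        self.healthy = {host: True for host in hosts}
        self.shared_volume = {}  # stands in for one volume presented to all hosts

    def fail(self, host):
        """Mark a host as failed."""
        self.healthy[host] = False

    def active_host(self):
        """Pick the first healthy host, in the order the hosts were added."""
        for host, is_healthy in self.healthy.items():
            if is_healthy:
                return host
        raise RuntimeError("no healthy host left in the cluster")

    def write(self, key, value):
        host = self.active_host()
        self.shared_volume[key] = value
        return host  # which host served the request

    def read(self, key):
        return self.active_host(), self.shared_volume[key]


cluster = Cluster(["host-a", "host-b"])
writer = cluster.write("order-1001", "paid")   # served by host-a
cluster.fail("host-a")                         # host-a goes down
reader, value = cluster.read("order-1001")     # host-b takes over
```

The data survives the host failure because it lives on the volume rather than on the failed server, and the surviving host was already granted access to that volume.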
Some storage vendors assemble all the required hardware, preconfigure the software, and fine-tune the system in the factory, consolidating the whole solution for ready-made use and selling it as a product under a single stock-keeping unit (SKU). This is called converged infrastructure or a converged system.
Enterprise storage solution:
A datacenter is a facility with adequate power, space, and cooling to house the computer servers, storage systems, and network switches that store and process massive amounts of data.
Large enterprises run their business-critical applications in a datacenter to store and process hundreds of petabytes of data. For high availability and disaster recovery, they also deploy a similar set of hardware in another datacenter at a different location, called the remote or secondary site.
Figure 1.11: Enterprise storage solution
Although every storage solution requires a backup solution so that data can be recovered after accidental loss, an enterprise storage solution implements backup that makes multiple copies of critical data at different time intervals on different systems.
Hyper-Converged
The Information Technology (IT) industry has been evolving since the mid-twentieth century: XT/AT PCs were in use during the 1980s and 1990s, computer networking came into the picture during the 1990s and 2000s, SAN became popular during the 2000s and 2010s, and finally virtual machines (VMs) came into use during the 2010s. In the course of this evolution, datacenters accumulated multiple servers, storage systems, clusters, switches, and cables, which made management and deployment inefficient and increased space and power consumption.
Hyper-convergence Infrastructure (HCI) is a type of infrastructure system with a software-centric architecture that tightly integrates compute, storage, networking, and virtualization resources. It evolved from the concept of the converged system discussed in the previous section. Unlike in a converged system, the technologies used in HCI cannot be separated into components. For example, in a converged system, storage and servers are still separate pieces of hardware, whereas in HCI these functions are implemented in software.
A hyper-converged system is typically sold either as software or as software preinstalled on a hardware box. In the first case, the software is installed on existing hardware or on any commodity hardware box.
All resources are consolidated within a single system to improve the efficiency of management and deployment and to reduce space and power consumption. For ease of management, HCI vendors provide a single management application for the built-in technologies of a hyper-converged system, such as compute, storage, networking, and virtualization.
Figure 1.12: Hyper-convergence Infrastructure
Hyper-convergence is designed to run VMs. Applications are installed inside the VMs.
For high availability and to scale out resources, multiple hyper-converged systems can be grouped together into a cluster of hyper-converged systems. Each system is referred to as a node. Once the number of VMs reaches the cluster's capacity, scaling out is easy: add more nodes, each of which brings more compute, storage, and networking resources.
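Scale-out by adding nodes can be sketched as simple capacity arithmetic. The node sizes and the notion of a fixed number of "VM slots" per node are hypothetical simplifications; real HCI products size VMs by CPU, memory, and storage demand rather than by a single slot count.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    cpu_cores: int
    storage_tb: int
    vm_slots: int  # hypothetical: how many VMs one node can host


class HciCluster:
    """Toy cluster whose total resources are the sum of its nodes."""

    def __init__(self):
        self.nodes = []

    def add_node(self, node):
        self.nodes.append(node)

    @property
    def vm_capacity(self):
        return sum(node.vm_slots for node in self.nodes)

    @property
    def storage_tb(self):
        return sum(node.storage_tb for node in self.nodes)

    def nodes_needed_for(self, vm_count, node_template):
        """How many more nodes of this size must be added to host vm_count VMs."""
        shortfall = vm_count - self.vm_capacity
        if shortfall <= 0:
            return 0
        return (shortfall + node_template.vm_slots - 1) // node_template.vm_slots


node_type = Node(cpu_cores=32, storage_tb=20, vm_slots=20)
hci = HciCluster()
hci.add_node(node_type)
hci.add_node(node_type)

extra_nodes = hci.nodes_needed_for(45, node_type)  # 45 VMs exceed the two-node capacity
for _ in range(extra_nodes):
    hci.add_node(node_type)
```

Because every node brings compute and storage together, adding one node raises both the VM capacity and the storage capacity in a single step, which is the property that makes scaling out straightforward.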
Virtual desktop infrastructure (VDI) is an infrastructure deployment technology that hosts desktop operating systems on centralized servers and storage in a datacenter. Many storage vendors design, build, and sell hyper-convergence products to deploy and support VDI requirements.
Hyper-convergence products are not suitable for business-critical applications that demand high performance and support thousands of users.
Summary
Storage systems and storage solutions have evolved over many decades to meet different business objectives and use cases. The primary components of a storage solution are the storage system, host server, switch, Host Bus Adapter (HBA), and management software. All the hardware components are connected and configured through the management software to deploy a storage solution.
Storage systems are classified by the technology used to store and access data: block storage, file storage, and object storage.
The following table compares these types of storage systems:
Table 1.1: Block Storage, File storage and Object Storage
Typically, host servers access block storage within the datacenter via the Fibre Channel or iSCSI protocol. Client systems access file storage over a local or wide area network using the SMB/CIFS or NFS protocol. Mobile devices access object storage over the internet using a RESTful API.
Figure 1.13: Block, File and Object storage system
Based on an organization's requirements, storage solutions are built using the required types of storage systems; those solutions are SAN, NAS, DAS, and cloud storage. Each solution has advantages and limitations compared with the others. For example, a SAN solution typically provides high performance but comes with high cost and complex management; on the other hand