Data Storage
Storage systems are indispensable in modern computing. All known computing
platforms, from handheld devices to large supercomputers, use storage
systems to hold data temporarily or permanently. Starting from the punch card,
which stored a few bytes of data, storage systems have grown to multi-terabyte
capacities while taking comparatively little space and power.
Reference:
http://www.logix4u.net/component/content/article
A few definitions:
A device capable of storing data. The term usually refers to mass storage
devices, such as disk and tape drives. (courtesy: webopedia.com)
Data storage can be studied separately for single devices, networked devices
and large scale storage.
A. Unit Storage
B. Networked Storage
•Magnetic storage
•Optical storage
•Semiconductor storage
1. Primary, Secondary and Tertiary Storage
In simple words, primary storage is storage that is directly connected to
the CPU and holds data temporarily during execution, i.e. the CPU can directly
access primary storage, which holds the instructions and data being executed
or processed. The most popular example of this kind of memory is the RAM
(Random Access Memory) used in modern day computers. CPU registers, caches
and other memories connected to the CPU are also primary storage.
Primary storage devices are faster than all other kinds of storage. Primary
storage is usually considered to be directly connected to the processor. In
reality, modern computers place components such as the virtual memory manager
and DRAM controllers between the processor and the memory, but the notion of
'direct connection' is still valid since these components are transparent to
the processor. Volatile memories are usually used as primary storage.
In contrast, secondary storage is generally not directly accessible by the
processor and is usually used for more permanent storage of data. This requires
secondary storage devices to be non-volatile. Secondary storage devices are
connected to storage controllers, and the CPU has to talk to the controllers in
order to access information on secondary devices. The most popular example of a
secondary device is the hard disk. CD-ROM, DVD-ROM, USB mass storage
devices, floppy disks etc. also fall in this category.
Secondary storage devices are also called Mass Storage Devices since their
capacity is comparatively large.
In contrast to primary and secondary storage, tertiary storage may not be
directly connected to the CPU or even to the computer itself. Tertiary storage
mechanisms are usually used for storing large volumes of data, such as backups.
2. Volatile and Non-volatile Storage
As the name implies, volatile memory loses its contents when the power supply
is withdrawn, so volatile memories are usually used for temporary storage of
data. In some exceptional cases, volatile memory devices are combined with
long life batteries to make semi-permanent storage devices.
Compared with non-volatile storage, volatile storage devices are faster at
both reading and writing data. This makes them very suitable for use as the
main memory of a computer. In fact, the memories we use in computers (RAM)
are volatile devices.
Non-volatile storage devices retain their contents even in the absence of an
active power source. This makes non-volatile devices suitable for long term
permanent data storage. Non-volatile devices are usually available in large
capacities. Hard disks, CD-ROM, floppy disks, Flash, ROM etc. are examples of
non-volatile memory devices. Non-volatile storage devices are slower than
volatile storage devices, but some non-volatile devices are fast at reading
and slower at writing. Semiconductor non-volatile memory devices fall in this
category.
3. Read only and Writable storage
Read only storage devices allow contents to be read but do not allow them to
be modified. Writable storage devices, in contrast, allow both content
retrieval and content modification. Read only devices are usually used for
long term permanent storage where modification of data is not necessary.
CD-ROM, DVD-ROM etc. are examples of read only storage devices. Some read only
storage devices come with factory programmed data which you can only read but
not modify.
There is another class of devices called Write Once Read Many (WORM) devices,
which allow data to be written one time only and then read any number of
times. CD-R and DVD-R technically come under this category.
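The write once, read many behaviour can be sketched in a few lines of Python. This is only an illustration: the WormStore class, its method names and the sample data are made up for this example and do not model any real drive.

```python
class WormStore:
    """Toy write-once, read-many store (hypothetical, for illustration only)."""

    def __init__(self):
        self._blocks = {}  # block number -> stored bytes

    def write(self, block, data):
        # A block may be written exactly once; later writes are refused.
        if block in self._blocks:
            raise PermissionError(f"block {block} is already written")
        self._blocks[block] = bytes(data)

    def read(self, block):
        # Reads are unlimited and never change the stored data.
        return self._blocks[block]


store = WormStore()
store.write(0, b"archive")
first_read = store.read(0)
second_read = store.read(0)

try:
    store.write(0, b"overwrite")
    rewrite_allowed = True
except PermissionError:
    rewrite_allowed = False
```

Here the second write to block 0 is rejected, while both reads return the data that was written first.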
4. Random Access and Sequential Access Storage
Random access storage devices allow retrieval of content from any location in
the same amount of time, i.e. latency (the time taken to access a particular
location in storage) is independent of the content's location. The RAM used in
computers is an example of random access storage.
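The difference in latency can be made concrete with a small model. This is a sketch under stated assumptions: the function names and time units are invented for illustration, and the sequential device is assumed to behave like a tape whose access time grows with the distance to the target location.

```python
def random_access_latency(address, access_time=1):
    """Latency of a random access device: the same for every address."""
    return access_time


def sequential_access_latency(current_position, address, step_time=1):
    """Latency of a sequential device (e.g. tape): grows with the distance
    the medium must travel from the current position to the target."""
    return abs(address - current_position) * step_time


addresses = [0, 10, 1000]
random_latencies = [random_access_latency(a) for a in addresses]
sequential_latencies = [sequential_access_latency(0, a) for a in addresses]
```

In this model random_latencies is the same value for all three addresses, while sequential_latencies grows with the address.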
6. Optical Storage
Optical storage devices store data on reflective polycarbonate discs in the
form of pits and bumps. Data is recorded by pointing a modulated laser beam at
the rotating disc. This makes a series of tiny pits, which do not reflect
light, and bumps, which do. To read the data, a low power laser beam is
focused on the track and the reflected beam is directed to a photodiode.
The photodiode detects the presence of pits and bumps from the reflected beam,
and this signal is converted into bits and bytes of information.
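The read path described above can be sketched as two steps: turn photodiode readings into surface states, then turn those states into bits. This is a simplified illustration, not how a real drive is implemented. The sample values, the threshold and the function names are assumptions, and the sketch emits a 1 at every change of surface state (real CDs and DVDs also signal bits through such transitions, but apply additional channel coding that is left out here).

```python
def detect_surface(samples, threshold=0.5):
    """Turn photodiode readings into surface states:
    True = strong reflection, False = weak reflection."""
    return [s >= threshold for s in samples]


def decode_transitions(states):
    """Emit 1 where the surface state changes between neighbouring
    positions and 0 where it stays the same."""
    return [int(a != b) for a, b in zip(states, states[1:])]


samples = [0.9, 0.8, 0.1, 0.2, 0.1, 0.9, 0.9]  # made-up intensity readings
states = detect_surface(samples)
bits = decode_transitions(states)
```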
7. Semiconductor Storage
Semiconductor storage devices store data in tiny memory cells built from very
small transistors and capacitors made of semiconductor materials such as
silicon. In the simplest case each cell holds one bit of information, and an
array of cells stores a large chunk of information. Semiconductor storage
devices can be volatile or non-volatile.
RAM is an example of a volatile semiconductor storage device. EEPROM and
FLASH are examples of non-volatile semiconductor storage devices. FLASH
devices are gaining popularity over conventional secondary storage devices
like hard disks. There are a large number of products on the market now which
use FLASH devices exclusively as secondary storage (e.g. MP3 players, mobile
phones etc.).
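To see how an array of one-bit cells adds up to bytes of data, consider the following sketch. It assumes one bit per cell and stores the most significant bit first; both choices, like the function names, are illustrative assumptions rather than a description of a particular memory chip.

```python
def store_bytes(data):
    """Spread each byte over eight one-bit cells, most significant bit first."""
    cells = []
    for byte in data:
        for shift in range(7, -1, -1):
            cells.append((byte >> shift) & 1)
    return cells


def load_bytes(cells):
    """Reassemble bytes from groups of eight one-bit cells."""
    out = bytearray()
    for i in range(0, len(cells), 8):
        value = 0
        for bit in cells[i:i + 8]:
            value = (value << 1) | bit
        out.append(value)
    return bytes(out)


cells = store_bytes(b"Hi")   # two bytes occupy sixteen cells
restored = load_bytes(cells)
```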
B. Networked Storage
BUS
•In computer technology, a BUS is an interconnect that transfers data
between the CPU and different peripherals, or between computers. A BUS
provides organized and usually co-operative access to shared resources.
•In contrast to point to point connections, a BUS can logically accommodate
several peripherals irrespective of their type and functionality, as long as
each one conforms to the rules laid down by the BUS specification.
•Different types of BUS technologies and standards come into the picture
when learning about the storage domain.
•Bus Topology is a network topology in which all nodes, i.e. stations, are
connected together by a single bus.
Let's take a closer look at the different BUS standards used in storage
technologies.
Common BUS Topologies
The BUS topology describes how peripherals are physically connected to the BUS.
Usually, a topology has one or more BUS Masters (usually the CPU) and at least
one Slave device (usually a peripheral).
In some interconnect mechanisms, peripherals can also be Masters on the BUS.
This allows a peripheral to initiate transactions at will, in contrast to
mechanisms where the CPU is the only Master and all transactions have to be
initiated by the CPU.
The physical structure of the BUS (all devices directly connected to a single
data path) means that all the connected devices must be highly co-operative.
Any malfunctioning device may put the functionality of the BUS at risk unless
the situation is handled properly.
• Multi Drop
• Daisy Chain
• Switched Hub
1. Multi Drop
In Multi Drop topology, the devices are connected in parallel on the BUS. The
data transmitted by any device is presented to all the other devices, and it is
up to each device whether to accept or reject the data.
A multidrop bus (MDB) is a computer bus in which all components are connected
to the same set of electrical wires. A process of arbitration determines which
device gets the right to be the sender of information at any point in time. The
other devices must listen for the data that is intended for them.
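The accept-or-reject behaviour can be sketched as follows. This is a minimal model, assuming that each device has a numeric address and that only one device transmits at a time (arbitration is left out). The class names, addresses and payload are hypothetical.

```python
class Device:
    """A peripheral on a multi drop bus (names are hypothetical)."""

    def __init__(self, address):
        self.address = address
        self.received = []

    def on_bus_data(self, target, payload):
        # Every device sees every frame; it keeps only frames addressed to it.
        if target == self.address:
            self.received.append(payload)


class MultiDropBus:
    def __init__(self):
        self.devices = []

    def attach(self, device):
        self.devices.append(device)

    def transmit(self, sender, target, payload):
        # The frame is presented to all devices except the sender.
        for device in self.devices:
            if device is not sender:
                device.on_bus_data(target, payload)


bus = MultiDropBus()
disk, tape, printer = Device(1), Device(2), Device(3)
for d in (disk, tape, printer):
    bus.attach(d)

bus.transmit(sender=disk, target=2, payload="block 42")
```

Only the device with address 2 keeps the payload; the printer sees the same frame but ignores it.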
•If the data on the BUS matches the criteria a device is looking for, the
device reads in the data for further processing. Otherwise the device
stays inactive, as if no data were available on the BUS. A bus collision
can happen if more than one device tries to transmit on the BUS at the
same time.
•To handle this, Multi Drop buses usually incorporate some kind of collision
detection and recovery mechanism as part of the bus implementation. A
very popular example is CSMA/CD (Carrier Sense Multiple Access with
Collision Detection), implemented in Ethernet.
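One part of CSMA/CD that is easy to show in code is the random wait after a collision. The sketch below uses the truncated binary exponential backoff rule of classic Ethernet. The two-station scenario, the function names and the fixed random seed are simplifying assumptions for illustration, and carrier sensing itself is not modelled.

```python
import random


def backoff_slots(collision_count, rng=random):
    """Truncated binary exponential backoff: after the n-th collision in a
    row, wait a random number of slot times in [0, 2**min(n, 10) - 1]."""
    exponent = min(collision_count, 10)
    return rng.randrange(2 ** exponent)


def resolve_collision(rng, max_attempts=16):
    """Two stations collide and retry until their waits differ.
    Returns the number of collisions it took, or None if they give up."""
    for collisions in range(1, max_attempts + 1):
        a = backoff_slots(collisions, rng)
        b = backoff_slots(collisions, rng)
        if a != b:
            return collisions  # the station with the shorter wait sends first
    return None


rng = random.Random(0)  # fixed seed so the run is repeatable
collisions_needed = resolve_collision(rng)
```

Because the range of possible waits doubles after every collision, two stations become less and less likely to pick the same slot again.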
2. Daisy Chain
•In a Daisy Chain, devices are connected one after another, so a single failed
device can break the chain. As a solution for this, if the two devices at the
far ends of the chain are connected together, we get a Ring topology. In a Ring
topology, the failure of one device won't break the entire network, since two
paths always exist between any two devices. But if two devices fail, the
network can be split into separate parts.
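The fault tolerance of a ring can be checked with a small connectivity test. This sketch assumes that a failed device cannot forward traffic and that device i is wired to devices i - 1 and i + 1 around the ring. The function name and the six-device ring are chosen only for illustration.

```python
def ring_is_connected(node_count, failed):
    """Return True if all working nodes of a ring can still reach each other.
    Node i is linked to node (i + 1) % node_count."""
    alive = [n for n in range(node_count) if n not in failed]
    if not alive:
        return True
    seen = {alive[0]}
    stack = [alive[0]]
    while stack:
        node = stack.pop()
        for neighbour in ((node - 1) % node_count, (node + 1) % node_count):
            if neighbour not in failed and neighbour not in seen:
                seen.add(neighbour)
                stack.append(neighbour)
    return len(seen) == len(alive)


no_failure = ring_is_connected(6, failed=set())
one_failure = ring_is_connected(6, failed={2})
two_failures = ring_is_connected(6, failed={1, 4})
```

With devices 1 and 4 down, devices 0 and 5 can no longer reach devices 2 and 3. If the two failed devices happen to be neighbours, the remaining devices still form one chain.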
3. Switched Hub topology
Switched Hub topology uses a Hub as a mediator for communication
between devices. All access requests are routed through the Hub only. The
Hub must be intelligent enough to route each request to the appropriate
connected device.
The major advantage of this topology is that the chance of collision is
virtually zero, since all requests are routed through the Hub. The
disadvantages are increased cost due to the additional hardware (the Hub) and
that the Hub becomes a bottleneck in the network as the number of devices
increases. The most popular examples of networks that use Hub topology are
switched Ethernet and USB.
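The routing role of the Hub can be sketched like this. It is a toy model: the SwitchedHub class, the device names and the messages are hypothetical, and a counter stands in for the load that makes the Hub a potential bottleneck.

```python
class SwitchedHub:
    """Toy hub that forwards each request only to its destination port
    (class and method names are hypothetical)."""

    def __init__(self):
        self.ports = {}      # device name -> inbox (list of messages)
        self.forwarded = 0   # every request passes through the hub

    def connect(self, name):
        self.ports[name] = []

    def send(self, source, destination, payload):
        if destination not in self.ports:
            raise KeyError(f"no device named {destination!r} on this hub")
        self.ports[destination].append((source, payload))
        self.forwarded += 1


hub = SwitchedHub()
for name in ("host", "disk", "camera"):
    hub.connect(name)

hub.send("host", "disk", "read sector 7")
hub.send("camera", "host", "frame ready")
```

Unlike on a multi drop bus, the camera never sees the request that the host sent to the disk.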
C. Large Scale Storage
In larger companies, the storage architecture is often composed of several
linked types of storage hardware. These are typically classified as Direct
Attached Storage (DAS), Network Attached Storage (NAS), or Storage Area
Networks (SANs).
1. DAS
These more basic secondary storage devices are directly connected to a host
computer or server. For instance, disk drives for disk backups, RAID arrays,
and tape libraries for tape backups are DAS systems, usually connected through
standard interfaces such as the Small Computer System Interface (SCSI). The
numerous variations of SCSI developed by vendors have created numerous
component-driven storage standards. Data retrieval is at the block level. DAS
systems are used for local file sharing.
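Block level retrieval means that the host asks for data by block number, not by file name. The sketch below illustrates the idea with an in-memory stand-in for a disk. The 512-byte block size, the read_block function and the fake disk contents are assumptions made for this example.

```python
import io

BLOCK_SIZE = 512  # bytes per block; an assumed, commonly used size


def read_block(device, block_number, block_size=BLOCK_SIZE):
    """Read one block by its number, the way a host addresses a
    directly attached disk: offset = block number * block size."""
    device.seek(block_number * block_size)
    return device.read(block_size)


# An in-memory stand-in for a 4-block disk; block n is filled with byte n.
disk = io.BytesIO(b"".join(bytes([n]) * BLOCK_SIZE for n in range(4)))

block2 = read_block(disk, 2)
```

Any notion of files and directories has to be built on top of such block reads by the file system on the host.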
2. NAS
NAS is composed of both hard disks and management software, and is
completely dedicated to serving files over a company network, typically
Gigabit Ethernet. It is based on standard network protocols such as TCP/IP,
NFS, and CIFS. NAS systems typically consist of RAID systems and software for
configuring and mapping file locations to a network-attached device. Storage
is shared across multiple servers.
3. SAN
A storage area network, or SAN, is a highly scalable, dedicated, high-speed storage
network of devices for transferring large blocks of data securely among servers,
networking components, and storage devices. It is separate from the corporate local
area network.
In a SAN infrastructure, storage devices such as NAS, DAS, RAID arrays, or
others are connected to servers using a highly reliable interconnect
technology called Fibre Channel.
Serial ATA and Serial Attached SCSI interfaces are also making headway with
SANs.
DAS, NAS, and SAN all offer benefits, but each is best suited for a particular
environment.