Cloud Storage Seminar
A SEMINAR REPORT
BY
SUBMITTED TO
Mr Fashoro
December, 2023
ABSTRACT
As an emerging technology and business paradigm, Cloud Computing has taken commercial
computing by storm. Cloud computing platforms provide easy access to a company’s high-
performance computing and storage infrastructure through web services. Cloud computing
platforms provide massive scalability, 99.999% reliability, high performance, and specifiable
configurability. These capabilities are provided at relatively low costs compared to dedicated
infrastructures. This report covers the key technologies in cloud computing and cloud storage, after introducing the cloud storage reference model.
The report presents the concept of cloud storage, a proposed hierarchical architecture, and the key technologies involved, including data organization, storage virtualization, data deduplication, and security. With the development of cloud computing and the growth of global data, cloud storage will attract increasing attention and continue to develop.
The report also concentrates on analyzing and discussing storage management. Optimized storage management control is an effective way to reduce the effort required to manage large-scale data storage. Combining storage devices with control and management software provides system-wide data sharing and high availability. With a cloud storage management control mechanism in place, business enterprises stand to benefit.
1.0 Introduction
Computer data storage, often called storage or memory, refers to computer components and
recording media that retain digital data used for computing for some interval of time.
Computer data storage provides one of the core functions of the modern computer, that of
information retention. It is one of the fundamental components of all modern computers, and
coupled with a central processing unit (CPU, a processor), implements the basic computer
model used since the 1940s.
With the introduction of cloud storage and cloud servers, it has become easier than ever to back up all our important computer files online. We now have the flexibility of accessing all our files from anywhere in the world, with the assurance that all our important pictures, videos, music, files, documents, as well as other programs and data, are securely stored and available to us 24 hours a day, 7 days a week. Assuming that we are online, of course.
Cloud computing portends a major change in how to store information and run applications.
Instead of running programs and data on an individual desktop computer, everything is
hosted in the "cloud", a nebulous assemblage of computers and servers accessed via the
Internet. Cloud computing lets you access all your applications and documents from
anywhere in the world, freeing you from the confines of the desktop and making it easier for
group members in different locations to collaborate.
Providers such as Amazon, Google, Salesforce, IBM, Microsoft, and Sun Microsystems have
begun to establish new data centres for hosting Cloud computing applications in various
locations around the world to provide redundancy and ensure reliability in case of site
failures. Since user requirements for cloud services are varied, service providers have to
ensure that they can be flexible in their service delivery while keeping the users isolated from
the underlying infrastructure. Recent advances in microprocessor technology and software
have led to the increasing ability of commodity hardware to run applications within Virtual
Machines (VMs) efficiently. VMs allow both the isolation of applications from the
underlying hardware and other VMs, and the customization of the platform to suit the needs
of the end user. Providers can expose applications running within VMs, or provide access to
VMs themselves as a service (e.g. Amazon Elastic Compute Cloud) thereby allowing
consumers to install their own applications. One of the primary uses of cloud computing is
for data storage.
With cloud storage, data is stored on multiple third-party servers, rather than on the dedicated
servers used in traditional networked data storage. When storing data, the user sees a virtual
3
server that is, it appears as if the data is stored in a particular place with a specific name. The
user’s data could be stored on any one or more of the computers used to create the cloud. The
actual storage location may even differ from day to day or even minute to minute, as the
cloud dynamically manages available storage space. But even though the location is virtual,
the user sees a “static” location for his data and can actually manage his storage space as if it
were connected to his own PC. Cloud storage has both financial and security-related advantages. On the security side, data stored in the cloud is protected against accidental erasure or hardware crashes because it is duplicated across multiple physical machines; since multiple copies of the data are kept at all times, the cloud continues to function normally even if one or more machines go offline.
Modern Computer Data Storage Timeline
In the 1970s, the two most popular (in fact, practically the only) ways to store computer data were the 8" floppy disk and the 5.25" floppy disk. The maximum capacity of these disks was 1.2MB, and at their peak they were produced at a rate of 4,000 units a day.
The 1980s continued the evolution of floppy disks, and the 3.5" floppy disk was introduced in 1982. That same year Sony put the Compact Disc on the market, but it was only a couple of years later that it became commonplace in the IT world.
By the mid-1990s, the 1.44MB floppy had become ubiquitous with new PCs, but its low capacity made it ill-suited for larger backups. In 1994 Iomega introduced its Zip Drive, which was superior to 1.44MB floppies in nearly every way.
The DVD emerged in 1995 as a successor to compact disks, and this time around, the optical
media targeted PC users just as much as it did movie buffs. As a result, the transition from
CD to DVD as the default storage medium went much faster than the transition from floppy
disks to CDs.
Secure Digital cards were conceived as a competing format against Sony's Memory Stick and
appeared on the storage scene in early 2000. Early SD cards were limited to just 32MB and
64MB capacities, but have since scaled to 32GB in high capacity SDHC cards.
Arguably the most significant storage innovation since the 1.44MB floppy disk, the advent of
USB flash drives in 2000 signaled the eventual end of the road for floppies.
Figure 2: A typical Cloud Storage system architecture
A. Data Organization
Based on the unit in which data is stored, cloud storage can be divided into two categories: block storage and file storage (a simple sketch contrasting the two access models follows the list below).
i. Block Storage: Block storage writes data across different individual hard disks in order to obtain greater single-stream read and write bandwidth. Its advantage is fast reads and writes of a single stream; its disadvantages are high cost and the inability to handle truly massive file storage.
ii. File Storage: File storage works at the file level: a file is placed on one hard disk, and even a very large file is not split across disks but kept on the same disk. The disadvantage is that the read and write performance of a single file is limited by a single hard drive. The advantages are that, for multi-file, multi-user workloads, the total bandwidth grows as storage nodes are added, the structure can be expanded almost without limit, and the cost is low. File storage suits the following scenarios: a. large files with read-bandwidth-intensive workloads, such as web sites and IPTV; b. writing many files simultaneously, for example video surveillance; c. long-term storage of files, such as file backup, archiving, or search.
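To make the distinction concrete, the sketch below contrasts block-level access, where data is addressed by fixed-size block numbers on a device, with file-level access, where whole named files are stored and retrieved. It is a minimal illustration only; the class and method names are invented for this example and do not correspond to any real cloud storage API.

```python
# Illustrative sketch only: contrasts block-level and file-level access models.
# All names here are hypothetical; real systems expose far richer interfaces.

BLOCK_SIZE = 4096  # bytes per block, a common fixed block size


class BlockDevice:
    """Block storage model: data is addressed by block number, not by name."""

    def __init__(self, num_blocks):
        self.blocks = [bytes(BLOCK_SIZE) for _ in range(num_blocks)]

    def write_block(self, block_no, data):
        # Each write targets exactly one fixed-size block on the device.
        assert len(data) <= BLOCK_SIZE
        self.blocks[block_no] = data.ljust(BLOCK_SIZE, b"\x00")

    def read_block(self, block_no):
        return self.blocks[block_no]


class FileStore:
    """File storage model: whole files are stored and retrieved by name."""

    def __init__(self):
        self.files = {}

    def put(self, name, data):
        # The entire file lives under one name (on one disk, in the text's terms).
        self.files[name] = data

    def get(self, name):
        return self.files[name]


if __name__ == "__main__":
    dev = BlockDevice(num_blocks=8)
    dev.write_block(0, b"raw bytes at a fixed offset")
    print(dev.read_block(0)[:27])

    fs = FileStore()
    fs.put("movie.mp4", b"an entire file addressed by name")
    print(fs.get("movie.mp4"))
```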
B. Storage Virtualization
Cloud storage involves a large number of storage devices distributed across many different locations. Managing logical volumes, storage, and redundant multi-path links across devices from different manufacturers, of different models, and even of different types (such as FC storage and IP storage) is a huge problem. Virtualization technology addresses this by abstracting computing resources: it isolates the different layers of the system (hardware, software, data, networking, storage, and so on) and breaks down the barriers that physical devices impose between data centers, servers, storage, networks, data, and applications. This makes a dynamic infrastructure possible, allowing physical and virtual resources to be managed centrally and used dynamically, which improves flexibility and service levels and helps manage risk. Storage virtualization makes multiple storage devices look like a single storage device, enabling unified management, deployment, and monitoring.
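As a rough illustration, the following sketch presents several heterogeneous backend devices as one logical volume: the virtualization layer translates a single logical block address space onto the underlying devices, so the application sees one device regardless of vendor or type. All class and method names are hypothetical, and the simple concatenation mapping is an assumption made for this example.

```python
# Minimal sketch of storage virtualization: several backend devices are
# presented as a single logical volume with one contiguous block address space.
# All class and method names are hypothetical.


class BackendDevice:
    def __init__(self, name, num_blocks):
        self.name = name
        self.num_blocks = num_blocks
        self.blocks = {}          # sparse map: block number -> data

    def write(self, block_no, data):
        self.blocks[block_no] = data

    def read(self, block_no):
        return self.blocks.get(block_no, b"")


class VirtualVolume:
    """Concatenates heterogeneous devices into one logical address space."""

    def __init__(self, devices):
        self.devices = devices

    def _locate(self, logical_block):
        # Translate a logical block number into (device, physical block number)
        # by walking the devices in order.
        offset = logical_block
        for dev in self.devices:
            if offset < dev.num_blocks:
                return dev, offset
            offset -= dev.num_blocks
        raise IndexError("logical block out of range")

    def write(self, logical_block, data):
        dev, phys = self._locate(logical_block)
        dev.write(phys, data)

    def read(self, logical_block):
        dev, phys = self._locate(logical_block)
        return dev.read(phys)


if __name__ == "__main__":
    # Two devices from "different vendors" appear as one 300-block volume.
    vol = VirtualVolume([BackendDevice("fc-array", 100),
                         BackendDevice("ip-san", 200)])
    vol.write(150, b"lands on the second device, invisibly to the user")
    print(vol.read(150))
```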
C. Thin Provisioning
The goal of thin provisioning is to allocate storage resources "on demand." The system presents a virtual storage space to the application; only when data actually needs to be written does the system allocate physical space and map the virtual space onto it, and this is transparent to the application. With thin provisioning, the physical space actually allocated is small, yet the application sees a virtual space that is larger than the physical space allocated so far. As the application writes more and more data, the physical space must be expanded automatically and in time, to avoid application downtime caused by a lack of adequate physical space.
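The following minimal sketch shows the idea: the volume advertises a large virtual size, but physical blocks are allocated and mapped only when a virtual block is first written, and the mapping stays transparent to the caller. The pool sizes and names are illustrative assumptions, not any vendor's implementation.

```python
# Minimal thin-provisioning sketch: a large virtual volume backed by a small
# physical pool; physical space is allocated only on first write.
# Names and sizes are hypothetical, for illustration only.


class ThinVolume:
    def __init__(self, virtual_blocks, physical_pool_blocks):
        self.virtual_blocks = virtual_blocks            # advertised size
        self.free_physical = list(range(physical_pool_blocks))
        self.mapping = {}                               # virtual -> physical
        self.physical = {}                              # physical block -> data

    def write(self, vblock, data):
        if vblock >= self.virtual_blocks:
            raise IndexError("beyond advertised virtual size")
        if vblock not in self.mapping:
            if not self.free_physical:
                # In a real system the pool would be expanded automatically
                # before this point, to avoid application downtime.
                raise RuntimeError("physical pool exhausted - expand the pool")
            self.mapping[vblock] = self.free_physical.pop(0)
        self.physical[self.mapping[vblock]] = data

    def read(self, vblock):
        pblock = self.mapping.get(vblock)
        return self.physical.get(pblock, b"") if pblock is not None else b""

    def allocated(self):
        return len(self.mapping)


if __name__ == "__main__":
    # Advertise 1,000,000 virtual blocks, but back them with only 1,000.
    vol = ThinVolume(virtual_blocks=1_000_000, physical_pool_blocks=1_000)
    vol.write(999_999, b"allocated lazily on first write")
    print(vol.read(999_999), "physical blocks in use:", vol.allocated())
```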
D. Storage Security
Compared with traditional data services, the distributed nature of cloud storage makes the model far more dependent on remote server clusters. The cloud platform's server clusters run in a networked environment and may hold data belonging to many users, and that data may be scattered across several virtual data centers which are not necessarily in the same physical location. If access to this data is controlled carelessly, serious security and user privacy problems arise, and these problems are compounded when the servers involved sit in different physical locations. Cloud storage service providers must therefore enforce strict access control on the servers that hold user data.
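As a very simple illustration of strict server-side access control, the sketch below checks every read against the requesting user before data leaves the store. The policy model (an owner plus an explicit grant list) is an assumption made for this example and does not describe any real provider's mechanism.

```python
# Minimal sketch of strict server-side access control for stored user data.
# The owner-plus-grant-list policy is an illustrative assumption.


class SecureObjectStore:
    def __init__(self):
        self.objects = {}   # key -> (owner, data)
        self.grants = {}    # key -> set of additional users allowed to read

    def put(self, user, key, data):
        self.objects[key] = (user, data)
        self.grants.setdefault(key, set())

    def share(self, owner, key, other_user):
        if self.objects[key][0] != owner:
            raise PermissionError("only the owner may grant access")
        self.grants[key].add(other_user)

    def get(self, user, key):
        owner, data = self.objects[key]
        # Access control is enforced on every request, on the server side.
        if user != owner and user not in self.grants[key]:
            raise PermissionError(f"{user} may not read {key}")
        return data


if __name__ == "__main__":
    store = SecureObjectStore()
    store.put("alice", "report.pdf", b"private data")
    store.share("alice", "report.pdf", "bob")
    print(store.get("bob", "report.pdf"))        # allowed
    try:
        store.get("eve", "report.pdf")           # denied
    except PermissionError as err:
        print("denied:", err)
```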
E. Data Deduplication
Data deduplication is the process of detecting duplicate data and removing redundant files or data blocks, so that only unique data is stored in the system. By effectively reducing the amount of redundant data held in the storage system, deduplication improves storage space efficiency.
In practice the technique works as follows: the data set (in a backup environment, usually the backup data stream) is divided into blocks, and the blocks are written to the target disk region. As the data stream is transmitted, the deduplication engine identifies each block and creates a digital signature for it (like a fingerprint), and builds an index of these signatures in a given repository. The index can be used to reconstruct the data set, and it provides a reference list for deciding whether a block already exists in the repository. During a copy operation, the index determines which blocks need to be stored and which merely need to be referenced. When the deduplication software finds a block that has already been processed, it inserts into the data set's metadata a pointer to the original block instead of storing the block again. If the same block appears more than once, more than one pointer to it is generated. Using variable-length deduplication, multiple discrete metadata images can be stored; each image represents a different data set, but all the images reference data blocks held in a shared storage pool.
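The flow described above can be sketched in a few lines of Python. Fixed-size chunking and SHA-256 fingerprints are assumptions made for this illustration (production systems typically use variable-length chunking and their own signature schemes), but the steps are the same: split the stream into blocks, look up each block's fingerprint in an index, store only unique blocks, and keep a recipe of pointers from which the data set can be rebuilt.

```python
import hashlib

# Minimal deduplication sketch: fixed-size chunking with SHA-256 fingerprints.
# Real products usually use variable-length chunking, but the flow is the same.

CHUNK_SIZE = 4096

repository = {}   # fingerprint -> unique data block (the shared storage pool)


def dedupe_store(data: bytes):
    """Split data into blocks and return a 'recipe' of fingerprints.

    The recipe plays the role of the metadata image: a list of pointers
    into the shared repository from which the data set can be rebuilt.
    """
    recipe = []
    for i in range(0, len(data), CHUNK_SIZE):
        block = data[i:i + CHUNK_SIZE]
        fp = hashlib.sha256(block).hexdigest()   # the block's "digital signature"
        if fp not in repository:                 # only unique blocks are stored
            repository[fp] = block
        recipe.append(fp)                        # duplicate blocks become pointers
    return recipe


def dedupe_restore(recipe):
    """Reconstruct the original data set from its recipe of pointers."""
    return b"".join(repository[fp] for fp in recipe)


if __name__ == "__main__":
    backup = b"A" * 10000 + b"B" * 10000 + b"A" * 10000   # highly repetitive stream
    recipe = dedupe_store(backup)
    stored = sum(len(b) for b in repository.values())
    print(f"logical size: {len(backup)} bytes, physically stored: {stored} bytes")
    assert dedupe_restore(recipe) == backup
```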
F. Load Balance and Data Migration
Load balancing keeps storage space available for later use across the different storage devices in a cloud storage system. Data migration in cloud storage means moving data from one storage system to another in a different place; it aims at cooperation between storage units and at keeping the load balanced across the system, and it is one effective mechanism for load balance. When the used storage capacity exceeds some threshold proportion, the data should be migrated to other cloud storage units, with pointers kept at the old storage positions or with the metadata modified and updated at the same time.
However, migration adds overhead to network bandwidth and I/O processing, and it does not by itself relieve access bottlenecks caused by concurrent clients.
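A minimal sketch of this threshold-driven migration follows. The 80% threshold, the node structure, and the metadata index are assumptions made for illustration: when a node's utilization crosses the threshold, objects are moved to a less loaded node and the metadata is updated so that clients find the data at its new location.

```python
# Minimal sketch of threshold-driven data migration for load balance.
# The 80% threshold and the data structures here are illustrative assumptions.

THRESHOLD = 0.8   # migrate once a node is more than 80% full


class StorageNode:
    def __init__(self, name, capacity):
        self.name = name
        self.capacity = capacity
        self.objects = {}          # key -> size

    def utilization(self):
        return sum(self.objects.values()) / self.capacity


def rebalance(nodes, metadata):
    """Move objects off over-full nodes and update the metadata index.

    'metadata' maps each object key to the node that currently holds it;
    updating it plays the role of the pointer / metadata update in the text.
    """
    for node in nodes:
        while node.utilization() > THRESHOLD and node.objects:
            key, size = next(iter(node.objects.items()))
            target = min(nodes, key=lambda n: n.utilization())
            if target is node:
                break                        # nowhere better to put the data
            del node.objects[key]
            target.objects[key] = size
            metadata[key] = target.name      # clients now look up the new node


if __name__ == "__main__":
    a, b = StorageNode("node-a", 100), StorageNode("node-b", 100)
    metadata = {}
    for i in range(9):                       # fill node-a to 90%
        a.objects[f"obj{i}"] = 10
        metadata[f"obj{i}"] = "node-a"
    rebalance([a, b], metadata)
    print(a.utilization(), b.utilization(), metadata["obj0"])
```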
G. Hierarchical Storage
Most cloud storage systems are "loose clusters," which means the performance of a single node can become a bottleneck, because the data is not tightly distributed across matching nodes of the cluster. As a result, a frequently accessed file can still only be read from one node at a time. One solution is to copy the file to multiple cluster nodes and then change the application so that it knows where else the file can be found. In addition, if the file's access frequency drops, the copies must be found and the redundant ones deleted. In practice this final step is rarely done, which leads to a lot of wasted space and requires storage managers to spend additional management time. A simpler and more effective solution is hierarchical storage management. Automatic tiering moves frequently accessed files (or file fragments) to a RAM-based or solid-state-disk cache area; then, when those files are accessed, the system serves them from the high-speed store. This approach requires no changes to the environment (or only limited changes): frequently accessed files are identified and migrated to high-speed storage automatically, and as their access frequency decreases they are automatically migrated out of the cache. The storage thereby becomes self-managing and self-regulating.
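The sketch below shows the core of such automatic tiering under assumed thresholds (the promotion and demotion values are illustrative, not from any real product): access counts are tracked per file, frequently accessed files are promoted to a fast RAM/SSD tier, and files whose access frequency decays are demoted back to the bulk tier.

```python
# Minimal sketch of automatic hierarchical (tiered) storage management.
# The promotion and demotion thresholds are illustrative assumptions.

PROMOTE_AFTER = 3     # promote to the fast tier after this many recent reads
DEMOTE_BELOW = 1      # demote when recent reads fall to or below this value


class TieredStore:
    def __init__(self):
        self.bulk = {}        # slow, cheap tier: name -> data
        self.fast = {}        # RAM/SSD cache tier: name -> data
        self.hits = {}        # recent access count per file

    def put(self, name, data):
        self.bulk[name] = data
        self.hits[name] = 0

    def read(self, name):
        self.hits[name] += 1
        if name in self.fast:                       # served from the fast tier
            return self.fast[name]
        if self.hits[name] >= PROMOTE_AFTER:        # hot file: promote a copy
            self.fast[name] = self.bulk[name]
        return self.bulk[name]

    def age(self):
        """Periodic housekeeping: decay counters and demote cooled-off files."""
        for name in list(self.hits):
            self.hits[name] = max(0, self.hits[name] - 1)
            if name in self.fast and self.hits[name] <= DEMOTE_BELOW:
                del self.fast[name]                 # self-managing cache


if __name__ == "__main__":
    store = TieredStore()
    store.put("hot.mp4", b"frequently watched video")
    for _ in range(5):
        store.read("hot.mp4")
    print("promoted:", "hot.mp4" in store.fast)
    for _ in range(5):
        store.age()
    print("still cached after cooling off:", "hot.mp4" in store.fast)
```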
3.1 Choosing a Cloud Storage Provider
Using a cloud storage solution has both advantages and disadvantages, and choosing the best cloud service provider requires expertise and professionalism. The following points should be considered when choosing a cloud storage provider.
Cost: Cloud service providers charge fees for their services beyond their free plans. The free plans usually restrict the amount of data you can store in the cloud. Before choosing a subscription, it is best to be well aware of your storage needs, and it may be better not to commit to long-term contracts, as needs can change as the business grows.
Usability: The cloud storage system should be such that we feel no difference between working on our files over the internet and working on our local hard drives. We should check the user interface and the responsiveness of the system, and look out for any limitations that might waste our time and sap our productivity.
Reliability: Even though a free cloud storage service is appealing, it may carry hidden costs in the form of frequent downtime, data corruption, or security incidents. Before choosing a cloud storage solution, we should do thorough research on the cloud service provider and check their reputation and brand quality.
Value for money: Selecting the least expensive cloud data storage company does not mean the customer is getting the best value for their money. What we should look for is the combination of everything the cloud storage company offers, weighed against how much they charge.
Reliability and Uptime: This category is based on statistical data about the uptime of the service. Cloud storage companies are expected to offer 99.9% uptime; however, some may fall below that due to unforeseen circumstances. Every consumer should be made aware of this so that they can make the most informed decision when selecting an online storage service provider.
Features: This category rates the number of features offered as well as their usefulness to the consumer. Some cloud storage companies advertise a large number of features; however, those features are not always of benefit to the consumer.
Storage Space: This category is a comparison of the amount of storage space offered by
the company. Storage space also factors into the ‘value for money’ rating.
Ease of Use: When testing cloud storage services, we should consider how easy each service is to use. A good rating represents a balance between ease of use for general users and for technical users.
4.0 Conclusions
Cloud storage holds a great deal of promise, but it is not designed to be a high-performing file system; rather, it is an extremely scalable, easy-to-manage storage system. It takes a different approach to data resiliency: a redundant array of inexpensive nodes, coupled with object-based or object-like file systems and data replication (multiple copies of the data).
The seminar has discussed the architecture of cloud storage and the related key technologies. Cloud storage is a new concept, and its products and research are still at an early stage. With the rapid growth of data, storing data in the cloud over the network will become increasingly important, and market demand will grow stronger. Emphasis should be placed on performance, reliability, fault tolerance, ease of use, scalability, and self-management capabilities, as well as on cloud storage and cloud computing as foundations for the next generation of operating system development.