Article
Hybrid Approach for Improving the Performance of Data Reliability in Cloud Storage Management
Ali Alzahrani 1 , Tahir Alyas 2 , Khalid Alissa 3 , Qaiser Abbas 1 , Yazed Alsaawy 1 and Nadia Tabassum 4, *
1 Faculty of Computer and Information Systems, Islamic University Madinah, Madinah 42351, Saudi Arabia
2 Department of Computer Science, Lahore Garrison University, Lahore 54000, Pakistan
3 Networks and Communications Department, College of Computer Science and Information
Technology (CCSIT), Imam Abdulrahman Bin Faisal University (IAU),
P.O. Box 1982, Dammam 31441, Saudi Arabia
4 Department of Computer Science and Information Technology, Virtual University of Pakistan,
Lahore 54000, Pakistan
* Correspondence: [email protected]
Abstract: The digital transformation disrupts the various professional domains in different ways, though one aspect is common: the unified platform known as cloud computing. Corporate solutions, IoT systems, analytics, business intelligence, and numerous tools, solutions and systems use cloud computing as a global platform. The migrations to the cloud are increasing, causing it to face new challenges and complexities. One of the essential segments is related to data storage. Data storage on the cloud is neither simplistic nor conventional; rather, it is becoming more and more complex due to the versatility and volume of data. The inspiration of this research is based on the development of a framework that can provide a comprehensive solution for cloud computing storage in terms of replication, and instead of using formal recovery channels, erasure coding has been proposed for this framework, which in the past proved itself as a trustworthy mechanism for the job. The proposed framework provides a hybrid approach to combine the benefits of replication and erasure coding to attain the optimal solution for storage, specifically focused on reliability and recovery. Learning and training mechanisms were developed to provide dynamic structure building in the future and to test the data model. RAID architecture is used to formulate different configurations for the experiments. RAID-1 to RAID-6 are divided into two groups, with RAID-1 to 4 in the first group while RAID-5 and 6 are in the second group, further categorized based on FTT (failures to tolerate), parity, failure range and capacity. Reliability and recovery are evaluated on data at rest on the server side and on data in transit at the virtual level. The overall results show the significant impact of the proposed hybrid framework on cloud storage performance. RAID-6c at the server side came out as the best configuration for optimal performance. The mirroring for replication using RAID-6 and erasure coding for recovery work in complete coherence and provide good results for the current framework, while highlighting interesting and challenging paths for future research.

Keywords: cloud computing; cloud storage; reliability; performance; secure data management; modeling
1. Introduction
Cloud computing is the delivery of different services through the Internet. These resources include tools and applications like data storage, servers, databases, networking, and software. These features address the requirements and concerns raised by a larger audience on the Internet. The World Wide Web has carved new paths and approaches to utilize the concept of the digital world at an optimum level. Cloud computing has emerged with this conception and has fulfilled many expectations. It has also brought new questions and challenges for this new ecosystem. Cloud computing serves corporate needs and caters to individual users, making its adoption swifter than expected [1].
Figure 1. Data failure in cloud computing.
Erasure coding is mainly used for protecting the data from failures in large-scale storage. It is also used to detect and correct errors in cloud computing. In erasure coding, a file is divided into equal chunks, and parity chunks are added so that the original file can be restored from the surviving chunks. Erasure codes can be divided into two categories: Maximum Distance Separable (MDS) and non-MDS [13].
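To make the chunk-and-parity idea concrete, the following is a minimal sketch of the simplest member of this family: a single XOR parity chunk over k data chunks (an MDS code tolerating one loss, as in RAID-5). The chunk count k and the function names are illustrative assumptions, not the paper's implementation.

```python
from functools import reduce

def encode(data: bytes, k: int = 4) -> list[bytes]:
    """Split data into k equal-sized chunks and append one XOR parity chunk."""
    size = -(-len(data) // k)  # ceiling division; the tail is zero-padded
    chunks = [data[i * size:(i + 1) * size].ljust(size, b"\0") for i in range(k)]
    parity = reduce(lambda a, b: bytes(x ^ y for x, y in zip(a, b)), chunks)
    return chunks + [parity]

def recover(chunks: list) -> list:
    """Rebuild the single missing chunk (marked None) by XOR-ing the survivors."""
    missing = chunks.index(None)
    survivors = [c for c in chunks if c is not None]
    chunks[missing] = reduce(lambda a, b: bytes(x ^ y for x, y in zip(a, b)), survivors)
    return chunks

# Any single lost chunk, data or parity, is recoverable from the other k.
stored = encode(b"cloud storage reliability demo", k=4)
stored[2] = None  # simulate one failed drive/node
assert recover(stored) == encode(b"cloud storage reliability demo", k=4)
```

Production codes such as Reed-Solomon generalize this to m parity chunks and m simultaneous losses, at the cost of heavier arithmetic.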
The replication technique focuses more on the cloud computing reliability process to maximize data availability and reliability. Low latency can be reached by consuming bandwidth overcapacity on the network. The lost data need to be restored in an alternative storage medium while retaining the emphasis on reliability. Furthermore, restoration is divided into two groups, reactive and proactive, and for replication two corresponding techniques are used: with the reactive method, the replica is generated after the failure, whereas with the proactive method, the replica is generated before a failure occurs. In static replication, the total number and location of replicas are fixed. Random replication is used in HDFS, GFS, RAMCloud and Windows Azure [14]. In dynamic replication, replicas are generated and removed dynamically. The management, positioning and deletion of replicas are autonomous processes that rely on user requirements to improve usability, durability, cost, bandwidth, latency, energy, storage efficiency and execution time.
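The reactive/proactive split described above can be sketched as a toy replica manager; the class, its threshold and the method names are assumptions for illustration only.

```python
class ReplicaManager:
    """Toy replica manager contrasting reactive and proactive replication."""

    def __init__(self, replicas: int = 3):
        self.replicas = replicas

    def on_failure(self) -> None:
        # Reactive: a replica is regenerated only after a failure is observed.
        self.replicas -= 1
        print(f"failure observed, {self.replicas} replicas left; re-replicating")
        self.replicas += 1

    def on_health_report(self, failure_probability: float) -> None:
        # Proactive: a replica is created before the predicted failure occurs.
        if failure_probability > 0.8:  # assumed prediction threshold
            self.replicas += 1
            print(f"failure predicted, pre-creating replica ({self.replicas} total)")

mgr = ReplicaManager()
mgr.on_health_report(failure_probability=0.9)  # proactive path
mgr.on_failure()                               # reactive path
```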
The storage segment of cloud computing is based on the data centers in terms of
physical infrastructure. By a definition from Google, a data center is a cluster of buildings with various functions, where numerous server machines and communication equipment are linked together to develop a common environment with common maintenance and security
needs. In terms of components, the physical structure can be divided into three main
categories, i.e., server machines, storage and reliability. The server machines are designed
for heavy processing, communication and storage facilities [15]. The multi-function, multi-
user and multi-tiered physical structure is the most vital and common segment to be looked
at for reliability and storage optimization in a cloud environment.
2. Problem Statement
Data storage in cloud environments is a big challenge for data reliability and trust
issues. Both key attributes play a critical role in the survival and growth of cloud services
and in attaining the trust level of cloud users. The tendency of engaging cloud storage
services is expanding continuously, and the data management in cloud environments is
becoming a big challenge in terms of data reliability, security and accessibility. This research
aims to formalize a hybrid approach for data reliability in cloud storage management and
also introduces a conceptual framework to improve reliability, storage efficiency and latency
in accessing data over cloud computing.
3. Research Motivation
This research has formulated the following research motivations:
a. Reliability of data requires optimizing durability and data availability.
b. Durability mitigates permanent failures, while availability mitigates temporary failures.
c. In cloud data centers, different methods are used to increase the fault tolerance of the storage system.
5. Research Objectives
The following objectives are defined for this research:
a. Designing cloud storage reliability assurance to evaluate storage properties.
b. Producing an autonomous storage management model for improving storage efficiency.
c. Formulating a model for data reliability in cloud storage management.
6. Literature Review
Cloud storage systems are composed of large numbers of hardware and software
components. Failures are the norm rather than the exception in cloud storage systems.
Any failures such as hardware failures, power outages, software glitches, maintenance
shutdowns or network failures in the cloud storage system will raise temporary data unavailability events and sometimes lead to permanent data loss. In spite of these failures, to
provide reliable service to the customers, various fault-tolerant mechanisms are employed.
To meet the large-scale storage needs of clients, cloud defines virtual storage using Network
Attached Storage (NAS) and Storage Area Network (SAN) [16].
The networked storage NAS and SAN are easily scalable in terms of both performance
and capacity, and hence are highly influential in cloud storage systems. They use a distributed file system to organize data for storage and provide controlled data access to
clients. Distributed File System (DFS) spreads data in a storage cluster which is composed
of thousands of nodes. DFS applies data redundancy to improve the fault tolerance of
cloud storage systems, and it spreads redundant data into nodes from different failure
zones [17]. DFS is also designed to ensure the durability, availability and I/O performance
of the storage according to the client’s Service Level Agreement (SLA).
Any failures in cloud storage systems mentioned above may lead to unavailability
events from time to time. Whenever an unavailability event occurs, it activates data re-
covery to maintain durability and data availability. The data redundancy mechanisms
employed in cloud storage systems are replication and erasure coding. Replication main-
tains multiple copies of data on distinct nodes from different failure domains. Replication
is a straightforward fault-tolerant method. However, it is not an efficient solution for big
data due to the volume of data. Erasure coding is a storage-efficient alternative reliability
method [18].
Strauss has mentioned the importance of reliability based on the availability of data,
which is becoming a critical factor when considering or measuring the performance of any
cloud service; therefore, it is becoming essential to consider the reliability methodologies
while developing the cloud’s structure [19]. Zhang has mentioned the importance of virtual
machines, i.e., the configuration of virtual machines in a manner that supplements the
reliability and availability of the data. It also has an impact on the performance of the
cloud services, and therefore, the configuration of virtual machines is becoming more
and more critical [20]. The role of infrastructure as a service in terms of the storage and
performance of the cloud services is discussed by Vishwanath and Nagappan. They have
presented reliability metrics for the performance evaluation of the cloud services focused
on storage and data [20]. In continuation, Bauer has highlighted the critical parameters for
the quality of service. The reliability metrics presented by various scholars are applied on
certain quality parameters to evaluate the impact of various variables on the reliability
and availability of the data [21].
The reliability and recovery of the data have not been evaluated at the infrastructure level only; various scholars have developed and tested different algorithms to address various critical challenges. Cheng proposed another framework for the reduction in failures and dependencies by improving the system's reliability. The basis of
this framework is also rooted in the infrastructural services in cloud computing [22]. This
framework has been used to evaluate the performance of cloud services in an independent
mode and develop rankings for the same.
The research related to reliability and availability has extended into the prevention of
failures, identification of failures, performance prediction and defensive methodologies.
Sharma has highlighted the importance of such indirect factors on the overall performance
and reliability of the data. The preventive measures and predictive maintenance of the
computing system have a long-term impact on the strategies for cloud computing storage,
replication and recovery methodologies [23].
The topic of cloud computing storage has become a more comprehensive area for
researchers as it incorporates hardware or infrastructure as a service (IaaS), virtualization
and computation capabilities, and more importantly, the configuration of virtual machines.
Nachiappan has discussed virtual machine scenarios with multiple configurations and
the relevant impact on the performance of the cloud services. The importance of this work
is related to the challenges of big data in cloud computing, and specifically related to the
storage reliability and availability in case of virtual machine failures or weak configurations.
They also discussed the role of security, preventive measures and storage methodolo-
gies such as static replication, dynamic replication, mirroring and erasure coding, to be
specific [24].
The popular Hadoop Distributed File System (HDFS) uses three replicas. Hence, it can tolerate any two simultaneous failures with a storage overhead of 3x. The most popular Reed–Solomon code can manage any four simultaneous failures with 1.4x storage overhead. Even though the storage efficiency of erasure coding sounds appealing, data recovery/repair in erasure coding involves enormous resource consumption. For example, data recovery in Reed–Solomon code increases disk I/O and network bandwidth by 10x compared to replication (increased resource consumption due to data recovery also impacts read performance) [25]. Data recovery in replication has a limited impact on resource consumption and read performance. The data recovery issues of erasure coding prevent it from being more pervasive in cloud storage systems. For example, in a 3000-node production cluster of Facebook, erasure coding can replace replication for only 8% of data. In the case that 50% of the data are replaced with erasure code, the repair network traffic will saturate cluster network links [26].
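The overhead figures quoted here follow from a short calculation. The sketch below compares 3-way replication with a Reed–Solomon RS(k, m) layout; choosing k = 10 and m = 4 (an assumption for illustration) reproduces the 1.4x overhead and four-failure tolerance mentioned in the text.

```python
def replication_overhead(copies: int) -> float:
    """Storage overhead of n-way replication: n bytes stored per byte of data."""
    return float(copies)

def rs_overhead(k: int, m: int) -> float:
    """Storage overhead of Reed-Solomon RS(k, m): (k + m)/k bytes per byte."""
    return (k + m) / k

# 3-way replication: tolerates 2 failures at 3.0x storage.
print(replication_overhead(3), "x, tolerates 2 failures")
# RS(10, 4): tolerates 4 failures at 1.4x storage, but repairing a single
# chunk must read k = 10 surviving chunks instead of 1 replica (~10x repair I/O).
print(rs_overhead(10, 4), "x, tolerates 4 failures")
```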
When there is a failure in cloud storage systems, the objects that reside in the failed zone enter a degraded mode. A delay is applied before recovering any degraded objects to avoid unnecessary repair. Degraded objects remain in degraded mode from the time of failure until complete recovery. Any data read request to a degraded object in replication is handled by redirecting the request to the next available replica. On the other hand, in erasure coding, a degraded object is reconstructed on the fly. In replication, the object is recovered by copying it from the next available replica, whereas in erasure coding, the object is recovered using the data reconstruction of any other k available chunks [27].
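A sketch of this degraded-read distinction under assumed names: a replicated object redirects the read to a surviving replica, while an erasure-coded object is rebuilt on the fly from any k surviving chunks.

```python
from typing import Callable, Optional

def degraded_read(scheme: str,
                  replicas: Optional[list] = None,
                  chunks: Optional[list] = None,
                  k: int = 0,
                  decode: Optional[Callable] = None) -> bytes:
    """Serve a read while an object's primary location is failed (sketch)."""
    if scheme == "replication":
        # Redirect the request to the next available replica.
        return next(r for r in replicas if r is not None)
    # Erasure coding: rebuild the object on the fly from any k surviving chunks.
    survivors = [c for c in chunks if c is not None]
    if len(survivors) < k:
        raise IOError("unrecoverable: fewer than k chunks survive")
    return decode(survivors[:k])

# Replication costs one extra hop; erasure coding would call a decoder such
# as the XOR recover() sketched earlier (for the single-parity case).
print(degraded_read("replication", replicas=[None, b"copy-2", b"copy-3"]))
```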
In other words, the default configuration of all virtual machines resides with the configuration manager, while the dynamic status of the virtual machines is captured through the CSI and CNI interfaces and transformed into the health status of the virtual machines. For example, if erasure coding is activated on virtual machine number 3 due to a recovery instance, the dynamic value of its storage will increase. The CSI and CNI interfaces pass the value to the VM health controller, and if room can be made available for dynamic replication or erasure coding by creating an instance on another virtual machine, it will start load balancing accordingly to create space for the current instance.
In the case that all VMs are busy and unable to accommodate the current instance, then instead of crashing the system, the current instance is placed in a cache queue. As soon as a virtual machine becomes available in processing terms, i.e., with suitable compute, storage and network properties, it starts the recovery or replication procedure as planned.
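The cache-queue behavior can be sketched as follows; the VMHealthController class, its capacity bookkeeping and the method names are assumptions standing in for the CSI/CNI-fed health status described above.

```python
from collections import deque

class VMHealthController:
    """Toy controller: place an instance on a VM with spare capacity,
    otherwise hold it in a cache queue instead of failing the request."""

    def __init__(self, vm_free_storage: dict):
        self.vm_free_storage = vm_free_storage  # VM name -> free GB (illustrative)
        self.pending = deque()                  # cache queue for waiting instances

    def place(self, instance: str, needed_gb: int) -> str:
        for vm, free in self.vm_free_storage.items():
            if free >= needed_gb:
                self.vm_free_storage[vm] -= needed_gb
                return f"{instance} scheduled on {vm}"
        self.pending.append((instance, needed_gb))  # all VMs busy: queue it
        return f"{instance} queued until a VM frees up"

    def on_capacity_freed(self, vm: str, freed_gb: int) -> None:
        self.vm_free_storage[vm] += freed_gb
        if self.pending:                            # retry the oldest instance
            instance, needed = self.pending.popleft()
            print(self.place(instance, needed))

ctrl = VMHealthController({"vm1": 10, "vm2": 5})
print(ctrl.place("erasure-recovery-1", 20))  # queued: no VM has 20 GB free
ctrl.on_capacity_freed("vm1", 30)            # dequeued and scheduled on vm1
```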
The physical resource layer consists of hardware variants in terms of storage devices,
i.e., racks, nodes, disks, etc. As we are considering the cloud platform, therefore, this
layer physically resides with the cloud service provider in the case of a public, community
or hybrid cloud. In the case of a private cloud or a corporate data center, this layer shall be categorized as on-premises hardware services. It was mentioned before that erasure
coding works on RAID 5/6 and replication uses RAID-1, and the physical resource layer
manages both. The accessibility to this layer is provided through virtualization in terms of
different virtual machines. The cloud platform is flexible enough to provide a customized
configuration of each virtual machine, though the overall capacity of all virtual machines
is dependent on the physical resource available. To select between two storage services,
i.e., erasure coding and replication, this layer is formulated to provide data cluster health
and device failure possibilities using a fabric agent for host management. It is notable
here that device failure predictions are based on the possible failure of the erasure coding
or replication method. To elaborate further, if replication uses HDFS for mirroring and
faces a hardware, bandwidth or logical fault, the fabric agent will provide a health alert
for the current data cluster and evaluate the possibility of failure to predict the event.
The same process is applicable for erasure coding, e.g., in the case that erasure coding is
using hyper-convergent storage for video streaming and facing high latency, hardware
failure or a split modulation problem, the fabric agent will perform the same function as
mentioned before. The objective of this module is to take necessary action, i.e., switching to
the other method within a minimal timeframe without disturbing the data operations and
management, as shown in Figure 2.
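A minimal sketch of the switch-over decision attributed to the fabric agent; the function name, the event fields and the 0.8 threshold are assumptions for illustration.

```python
def fabric_agent_decide(method: str,
                        health_alert: bool,
                        failure_probability: float) -> str:
    """Choose the storage method for the next data task (illustrative sketch).

    If the cluster serving the current method (replication or erasure coding)
    raises a health alert, or a device failure is predicted as likely, switch
    to the other method so data operations continue undisturbed."""
    other = "erasure_coding" if method == "replication" else "replication"
    if health_alert or failure_probability > 0.8:  # assumed threshold
        return other
    return method

# Replication's HDFS mirroring cluster reports a bandwidth fault:
print(fabric_agent_decide("replication", health_alert=True, failure_probability=0.2))
# prints: erasure_coding
```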
This layer is linked with the prediction layer and operates in accordance with the fabric
agent's instructions. This framework takes the erasure coding and replication management modules as hosts, while the fabric agent is the host controller that assigns tasks and operations received from the virtual machines or cloud users. As depicted in the diagram, the hybrid layer is split into two main modules, i.e., erasure coding management and replication management, with two connecting functions: a cache configuration system and selection for the
prediction layer. This layer also contains the respective algorithms for erasure coding as
well as for replication. New algorithms can be added into the same modules, and therefore,
the optimum solution for storage management is possible with the most recent and best
algorithms.
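Since the text notes that new algorithms can be added into the same modules, one natural reading is a plug-in registry of encoder/decoder pairs per management module; the sketch below assumes names such as register_algorithm and uses stand-in callables.

```python
from typing import Callable

# module -> algorithm name -> (encoder, decoder); structure is illustrative
REGISTRY: dict = {"erasure_coding": {}, "replication": {}}

def register_algorithm(module: str, name: str,
                       encoder: Callable, decoder: Callable) -> None:
    """Add a new algorithm to a management module without changing the framework."""
    REGISTRY[module][name] = (encoder, decoder)

# Examples named in the text, registered with stand-in callables:
register_algorithm("erasure_coding", "reed_solomon", lambda d: d, lambda c: c)
register_algorithm("erasure_coding", "hdfs_erasure", lambda d: d, lambda c: c)
register_algorithm("replication", "hdfs_mirroring", lambda d: d, lambda c: c)
```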
The management sub-module contains the algorithms with the encoder and decoder
sub-modules. This framework works on this assumption that the available algorithms are
proven and tested; therefore, the encoder and decoder are able to perform their respective
functions. Virtual erasure coding, HDFS erasure, hyper-convergent erasure and Reed
Solomon erasure are a few examples. From the prediction layer, the fabric agent will
provide the task to the host, i.e., erasure coding management with the data cluster health
status, and by default, both methods will be used based on the capacity and availability
of physical resources. In the case of replication management failure, the erasure coding
management module will resume the task and start recovering data from the replication
module. For this purpose, the replication module has one dedicated node in the erasure coding module. In the case of a bad health flag on the data cluster or the predictive failure of a device linked with the replication management module, the erasure coding module will start recovering and assuming data tasks through the dedicated node.
Figure 2. Conceptual description of the proposed solution.
The framework engages closely with the virtual machines to provide an instance for the recovery process and streamline the system again.
RAID Type | Description                | Min. Drives | Range for Fault Tolerance | Parity
RAID-0    | Striping without mirroring | 2           | None                      | No parity
RAID-1    | Mirroring without striping | 2           | 1 drive failure           | No parity
RAID-2    | Striping with ECC          | 3           | 1 drive failure           | Shared parity
RAID-3    | Striping (byte)            | 3           | 1 drive failure           | Dedicated parity
RAID-4    | Striping (block)           | 3           | 1 drive failure           | Dedicated parity
RAID-5    | Striping (block)           | 3           | 1 drive failure           | Distributed parity
RAID-6    | Striping (block)           | 4           | 2 drive failures          | Double distributed parity
RAID-5 (four drives, rotating parity P):
A1 B1 C1 P1
B2 C2 P2 A2
C3 P3 A3 B3
P4 A4 B4 C4

RAID-6 (six drives, rotating double parity P and Q):
A1 B1 C1 D1 P1 Q1
B2 C2 D2 P2 Q2 A2
C3 D3 P3 Q3 A3 B3
D4 P4 Q4 A4 B4 C4
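The rotation in these layouts is mechanical; the sketch below regenerates both diagrams by rotating each stripe one drive to the left (a simplified model, since real controllers differ in rotation details).

```python
def raid_layout(n_drives: int, n_parity: int, n_stripes: int) -> list:
    """Generate a rotating-parity layout like the RAID-5/6 diagrams above.

    Each stripe holds (n_drives - n_parity) data blocks followed by parity
    blocks P, Q, ..., rotated one drive to the left per stripe so parity
    spreads across all drives."""
    names = "ABCDEFGH"
    layout = []
    for s in range(n_stripes):
        blocks = [f"{names[i]}{s + 1}" for i in range(n_drives - n_parity)]
        blocks += [f"{'PQ'[p]}{s + 1}" for p in range(n_parity)]
        rot = s % n_drives
        layout.append(blocks[rot:] + blocks[:rot])
    return layout

for row in raid_layout(4, 1, 4):   # reproduces the RAID-5 diagram above
    print(" ".join(row))
for row in raid_layout(6, 2, 4):   # reproduces the RAID-6 diagram above
    print(" ".join(row))
```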
RAID Configuration | Type         | Fault Domains | Failures Tolerated | Data Volume | Required Capacity
RAID-1             | Mirror       | -             | 1                  | 100 GB      | 200 GB
RAID-5/6           | Erasure code | 4             | 1                  | 100 GB      | 133 GB
RAID-1             | Mirror       | -             | 2                  | 100 GB      | 300 GB
RAID-5/6           | Erasure code | 6             | 2                  | 100 GB      | 150 GB
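The capacities in this table follow from the two redundancy formulas, as the short check below shows; the fault-domain counts (4 for FTT = 1, 6 for FTT = 2) are taken from the table itself.

```python
def required_capacity(data_gb: float, scheme: str, ftt: int) -> float:
    """Raw capacity needed for one protected volume (matches the table above).

    Mirroring keeps ftt + 1 full copies; erasure coding keeps ftt parity
    blocks per stripe across `domains` fault domains, so it stores
    data * domains / (domains - ftt)."""
    if scheme == "mirror":
        return data_gb * (ftt + 1)
    domains = {1: 4, 2: 6}[ftt]  # fault-domain counts from the table
    return data_gb * domains / (domains - ftt)

print(required_capacity(100, "mirror", 1))   # 200 GB
print(required_capacity(100, "erasure", 1))  # ~133 GB
print(required_capacity(100, "mirror", 2))   # 300 GB
print(required_capacity(100, "erasure", 2))  # 150 GB
```

The comparison makes the trade-off in the table explicit: for the same failures-to-tolerate level, erasure coding needs roughly half the raw capacity of mirroring.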
As is visible, RAID-5 and RAID-6 replace RAID-1's mirroring with parity-based redundancy, while RAID-3 and RAID-4 protect data through dedicated parity over byte- and block-level striping. If RAID-1 (mirroring without striping) impacts erasure coding, then upgrading the other RAID formats becomes easy and more logical. Therefore, from a data spread point of view, all RAIDs are engaged, while the configuration above is used from a reliability and recovery perspective. The failure limit is also defined for each RAID classification: with the exception of RAID-0 and RAID-6, all RAIDs have a failure threshold of one drive, while RAID-6 is configured with a failure threshold of two drives. There is parity diversification for each RAID as well, ranging across no parity, shared parity, dedicated parity, distributed parity and double distributed parity.
The parity range also significantly impacts the replication and performance of the RAID; therefore, it is highly important to evaluate various configurations to validate the proposed framework. Double distributed parity is assigned to RAID-6, while RAID-5 uses distributed parity; both are complex parity cases. Therefore, with the other configurations mentioned in the previous section, the scenarios developing at RAID-5 and RAID-6 are challenging and depict a real-life complexity level and robustness requirements at an optimum level.
In cloud computing, there are other storage methods, but as mentioned earlier, in the
literature, we have identified the best practices for storage, and RAID is one of the most
usable methodologies in various cloud configurations.
Initially, data parameters are used for training purposes to engage intelligence, which is required to develop autonomy as the system attains maturity. The switching between replicated mirroring and erasure coding is essentially based on the intelligent flagging and understanding of the anomaly patterns by the machine itself.
Therefore, the parameters are trained to learn the anomaly flags, and these learning datasets are generated through a number of iterations. To make the learning more independent, the column names are also dynamic: the machine learns the columns and transforms the training structure every time instead of fixing on specific columns, meaning that the same learning is applicable to multiple cloud and RAID configurations without any modifications, as it picks the columns dynamically to formulate the learning parameters.
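One way to realize this dynamic column handling is sketched below with pandas; the helper name and the fixed label column are assumptions.

```python
import pandas as pd

def build_training_frame(df: pd.DataFrame, label_col: str = "label"):
    """Discover feature columns at run time instead of hard-coding them,
    so the same training code works across cloud/RAID configurations
    whose logs expose different column sets (illustrative sketch)."""
    feature_cols = [c for c in df.columns if c != label_col]
    numeric = df[feature_cols].select_dtypes(include="number")
    categorical = df[feature_cols].select_dtypes(exclude="number")
    X = pd.concat([numeric, pd.get_dummies(categorical)], axis=1)
    return X, df[label_col]

# Works unchanged whether the log carries NSL-KDD-style columns or RAID rates:
df = pd.DataFrame({"protocol_type": ["tcp", "udp"], "src_bytes": [491, 146],
                   "label": ["normal", "normal"]})
X, y = build_training_frame(df)
```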
The training parameters include the following columns: duration, protocol_type, service, flag, src_bytes, dst_bytes, wrong_fragment, urgent, num_failed_logins, dst_host_same_src_port_rate, dst_host_srv_diff_host_rate, dst_host_srv_serror_rate, dst_host_srv_rerror_rate, dst_host_same_srv_rate, dst_host_diff_srv_rate, dst_host_serror_rate, dst_host_rerror_rate, difficulty_level, drives, label and iterations.
Table 6. Reference adjustment results.
The table uses the same feature columns as the training parameters listed above.
Figure 3. Training results.
Data are linked with various protocols, assuming that various communication mechanisms are attached to the cloud; therefore, the nature and source of the data need to be rationalized before proceeding towards the learning and testing of the system for mirroring and recovery. The following are the data framing results over multiple iterative engagements to evaluate various possibilities. The results show the normalization of the data to be processed for the pre-training adjustments: the tcp and udp sources using the ftp and http protocols through the private channel do not reach the normal value.

Further data processing and learning show that the private data channel generates a rejection of tcp instead of proceeding towards restoring. The other examples show the ftp data again reaching a normal value but not proceeding towards mirroring or restoring. Similarly, another case has generated an association pattern for mirroring and shows the status of the restoration of tcp.

The training dataset has generated the categories and columns shown in the aforementioned table. The dataset is segregated into the host mirror's server rate and the transition server rate, while the recovery is also split into two types, i.e., the host recovery rate and the host server recovery rate. The purpose is to manage the server-side storage as well as the VM-level storage. Another aspect covered in this activity is mirroring, which is required not only for data at rest on servers but also for data in transit. The results provide the values for the proceedings, and the following graph summarizes them. The host mirror's server rate for the mirror-5 configuration shows an overall promising result, but on the other hand, the recovery using the erasure-6 configuration is also prominent.
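The pre-training normalization described above can be sketched as a min-max rescaling of the numeric columns; the paper does not spell out the exact transform, so the scaling choice here is an assumption.

```python
import pandas as pd

def pretraining_adjust(df: pd.DataFrame) -> pd.DataFrame:
    """Min-max normalize numeric columns to [0, 1] before training (sketch)."""
    out = df.copy()
    for col in out.select_dtypes(include="number").columns:
        lo, hi = out[col].min(), out[col].max()
        out[col] = 0.0 if lo == hi else (out[col] - lo) / (hi - lo)
    return out

frame = pd.DataFrame({"src_bytes": [491, 146, 0],
                      "protocol_type": ["tcp", "udp", "tcp"]})
print(pretraining_adjust(frame))  # src_bytes rescaled, protocol_type untouched
```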
8.2. Mirror and Erasure Coding Results
The further processing of the data focuses on mirroring and erasure coding with reference to the mirror, transit and restoration, as shown in Table 7. These parameters are evaluated with the same modes, i.e., host mirroring, transit mirroring, host recovery and host recovery at the server end. The results in Figure 4 show lower rejection in the case of mirroring, while rejection rates are higher in the b recovery configuration at the server and virtual machine level. Similarly, erasure coding and mirroring show significant trends in host mirroring at the server end, while restoration is higher in the recovery modes. Erasure and mirroring show significant results at the server end, while these results are not as good when considering data in transit.
Table 7. Training set results. Columns: host_srv_recover_rate_b, trans_mirror_srv_rate_b, host_srv_recover_rate_a, trans_mirror_srv_rate_a, host_mirror_srv_rate_a, host_mirror_srv_rate_b, host_recover_rate_a, host_recover_rate_b, class, complexity_level.
0.17 0.03 0.17 0 0 0 0.05 0 normal 20
0 0.6 0.88 0 0 0 0 0 normal 15
0.1 0.05 0 0 1 1 0 0 neptune 19
1 0 0.03 0.04 0.03 0.01 0 0.01 normal 21
1 0 0 0 0 0 0 0 normal 21

Figure 4. Mirroring and erasure coding graph on RAID.
Table 8. Mirroring and erasure coding. Columns as in Table 7.
0.04 0.06 0 0 0 0 1 1 neptune 21
0 0.06 0 0 0 0 1 1 neptune 21
0.61 0.04 0.61 0.02 0 0 0 0 normal 21
1 0 1 0.28 0 0 0 0 ER1 15
0.31 0.17 0.03 0.02 0 0 0.83 0.71 MIR 11
Figure 5. Mirroring and erasure coding graph.
8.3. Training Results
Now, for the scenario built on different configurations of RAID, the results show the analysis of RAID-1 to RAID-6 with respect to erasure coding, erasure coding with parity and mirroring combinations with all parameters, as shown in Figure 6. RAID-1 to RAID-4 are used in the default configurations, while RAID-5 and RAID-6 are used in different combinations/configurations, i.e., RAID-5a and RAID-5b, while RAID-6 is configured in three different modes labelled as a, b and c. The mirroring instance engages all the configurations. As is visible in the graph, RAID-6c and RAID-6a are on top for mirroring and replication; both RAID modes are hybrid in nature as configured previously. The first erasure coding instance, ER1, engages RAID-6c, RAID-6b and RAID-6a as the most significant configurations, while RAID-4 also has a prominent value, showing that the best performance of erasure coding is in hybrid mode along with the mirroring instance; in the case of erasure coding only, the RAID-4 value shows the impactful behavior of erasure coding for recovery and reliability. The second instance, ER2, takes RAID-6, RAID-6b and RAID-5b as the parameters, working in a hybrid mode only. This proves that the proposed framework provides significant results regarding reliability and swift recovery performance in cloud storage.

The results show the best performance and coherence in a hybrid manner, i.e., mirroring and then handing over to erasure coding for recovery. The Erasure-6 and Mirror-5 configurations are hybrid modes. Data at rest and data in transit are both addressed successfully.
Table 9. Erasure coding results. Flag columns: flag_RSTOS0, flag_RSTO, flag_RSTR, flag_MIR, flag_ER1, flag_ER2, flag_REJ, flag_SH, flag_S1, flag_S2, flag_S3 (intermediate columns elided, as in the original).
0 ... 0 0 0 0 0 0 0 0 1 0
0 ... 0 0 0 0 0 0 0 0 1 0
0 ... 0 0 0 0 1 0 0 0 0 0
1 ... 0 0 0 0 0 0 0 0 1 0
1 ... 0 0 0 0 0 0 0 0 1 0
... ... ... ... ... ... ... ... ... ... ... ...
0 ... 0 0 0 0 1 0 0 0 0 0
0 ... 0 0 0 0 0 0 0 0 1 0
1 ... 0 0 0 0 0 0 0 0 1 0
0 ... 0 0 0 0 1 0 0 0 0 0
1 ... 0 0 0 0 0 0 0 0 1 0
Table 10. Structure of the dataset after training and testing. Columns (as extracted): duration, src_bytes, dst_bytes, land, urgent, hot, num_failed_logins, logged_in, flag_S0, flag_S1, flag_S2, flag_S3, flag_SH, flag_RSTO, flag_RSTR, flag_RSTOS0, flag_Erasure, class.
0: 0 491 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0
1: 0 146 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0
2: 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 1
3: 0 232 8153 0 0 0 0 0 1 0 0 0 0 0 0 0 1 0 0
4: 0 199 420 0 0 0 0 0 1 0 0 0 0 0 0 0 1 0 0
Figure 6. Erasure coding results.
8.4. Erasure Coding Results
Table 11. Erasure and replication results. Flag columns: flag_MIR_RSTO, flag_Erasure, flag_RSTO, flag_RSTR, flag_MIR1, flag_MIR2, flag_ER1, flag_ER2, flag_SH, Class.
The structure of the dataset after the training and testing shows a minimal error rate, with erasure coding having value ranges between zero and one. Table 10 shows the other parameters, with the erasure flag having a value of one for four out of five values. The one in which erasure is not showing a result also has a source volume of zero; therefore, erasure does not perform any operations with zero volume. In the other four cases, the data volume is tangible, which also proves the application of erasure coding with volume; the same is applicable for mirroring, as the real power and purpose of mirroring is exposed with volumes.
The focused results for the erasure coding are extracted with RAID configurations, as is visible in Table 11 and more so in the graph below. The ER instances are visible on all levels in the case of RAID-6c, while all other RAIDs do not show a significant performance on any parameter. Therefore, it is right to say that the most successful hybrid configuration for replication and erasure coding is RAID-6c; the second best is RAID-6b, followed by RAID-6a and RAID-5a. RAID-6c not only has a significant value but is also more comprehensive, as it responds to all other parameters; even where those parameters have lower or insignificant values, it depicts the rationale between the different parameters. The base class developed in this scenario also reaches the value of one, which reciprocates the significant value of flag_erasure. The overall results prove the hypotheses of this research regarding the reliability and recovery impact of a hybrid modulation between replication and erasure coding.
9. Conclusions
Cloud computing provides different solutions for data safety, backups and replication.
All these methods are independent and work separately, although these methods and tools
have common parameters and can work together. The inspiration of this research is based
on the development of a framework that can provide a comprehensive solution for cloud
computing storage in terms of replication, and instead of using formal recovery channels,
erasure coding was proposed for this framework, which in the past proved itself as a
trustworthy mechanism for the job. The proposed framework provides a hybrid approach
to combine the benefits of replication and erasure coding to attain the optimal solution for
storage, specifically focused on reliability and recovery.
The overall results show the significant impact of the proposed hybrid framework on
cloud storage performance. RAID-6c at the server came out as the best configuration for
optimal performance. The mirroring for replication using RAID-6 and erasure coding for
recovery work in complete coherence and provide good results for the current framework,
while highlighting the interesting and challenging paths for future research.
Author Contributions: Conceptualization, K.A.; Data curation, T.A. and Y.A.; Formal analysis, K.A.,
Q.A. and N.T.; Investigation, A.A.; Methodology, T.A. and A.A.; Supervision, Y.A.; Validation, Y.A.;
Visualization, K.A. and Q.A.; Writing—original draft, Q.A. and N.T.; Writing—review & editing, A.A.
All authors have read and agreed to the published version of the manuscript.
Funding: This research is supported by Deanship of Scientific Research, Islamic University of
Madinah, KSA.
Institutional Review Board Statement: Not applicable.
Informed Consent Statement: Not applicable.
Data Availability Statement: The data used in this paper can be requested from the corresponding
author upon request.
Acknowledgments: The authors thank the participants involved in the collection of the datasets
and results.
Conflicts of Interest: The authors declare that they have no conflict of interest regarding the publica-
tion of this work.
References
1. Lattuada, M.; Barbierato, E.; Gianniti, E.; Ardagna, D. Optimal Resource Allocation of Cloud-Based Spark Applications. IEEE
Trans. Cloud Comput. 2020, 7161, 1301–1316. [CrossRef]
2. Singh, A.; Kumar, R. Performance evaluation of load balancing algorithms using cloud analyst. In Proceedings of the 2020 10th
International Conference on Cloud Computing, Data Science & Engineering (Confluence), Noida, India, 29–31 January 2020; pp.
156–162.
3. Dab, B.; Fajjari, I.; Rohon, M.; Auboin, C.; Diquelou, A. An Efficient Traffic Steering for Cloud-Native Service Function Chaining.
In Proceedings of the 2020 23rd Conference on Innovation in Clouds, Internet and Networks and Workshops (ICIN), Paris, France,
24–27 February 2020; pp. 71–78.
4. Tabassum, N.; Alyas, T.; Hamid, M.; Saleem, M.; Malik, S. Hyper-Convergence Storage Framework for EcoCloud Correlates.
Comput. Mater. Contin. 2022, 70, 1573–1584. [CrossRef]
5. van der Boor, M.; Borst, S.; van Leeuwaarden, J. Load balancing in large-scale systems with multiple dispatchers. In Proceedings
of the IEEE INFOCOM 2017—IEEE Conference on Computer Communications, Atlanta, GA, USA, 1–4 May 2017; pp. 1–13.
6. Paris, J.F.; Schwarz, T. Three-dimensional RAID Arrays with Fast Repairs. In Proceedings of the 2021 International Conference on
Electrical, Computer and Energy Technologies (ICECET), Cape Town, South Africa, 9–10 December 2021; pp. 9–10.
7. Telenyk, S.; Bidyuk, P.; Zharikov, E.; Yasochka, M. Assessment of Cloud Service Provider Quality Metrics. In Proceedings of the
2017 International Conference on Information and Telecommunication Technologies and Radio Electronics (UkrMiCo), Odessa,
Ukraine, 11–15 September 2017.
8. Li, R.; Zheng, Q.; Li, X.; Yan, Z. Multi-objective optimization for rebalancing virtual machine placement. Futur. Gener. Comput.
Syst. 2020, 105, 824–842. [CrossRef]
9. Yang, Z.; Awasthi, M.; Ghosh, M.; Bhimani, J.; Mi, N. I/O Workload Management for All-Flash Datacenter Storage Systems Based
on Total Cost of Ownership. IEEE Trans. Big Data 2018, 8, 332–345. [CrossRef]
10. Mergenci, C.; Korpeoglu, I. Generic resource allocation metrics and methods for heterogeneous cloud infrastructures. J. Netw.
Comput. Appl. 2019, 146, 102413. [CrossRef]
11. Wu, R.; Wu, Y.; Wang, M.; Wang, L. An Efficient RAID6 System Based on XOR Accelerator. In Proceedings of the 2021 3rd
International Conference on Computer Communication and the Internet (ICCCI), Nagoya, Japan, 25–27 June 2021; pp. 171–175.
12. Nayyer, M.Z.; Raza, I.; Hussain, S.A. Revisiting VM Performance and Optimization Challenges for Big Data, 1st ed.; Elsevier Inc.:
Amsterdam, The Netherlands, 2019; Volume 114.
13. Hou, Z.; Gu, J.; Wang, Y.; Zhao, T. An autonomic monitoring framework of web service-enabled application software for the
hybrid distributed HPC infrastructure. In Proceedings of the 2015 4th International Conference on Computer Science and
Network Technology (ICCSNT), Harbin, China, 19–20 December 2015; pp. 85–90.
14. Zhou, B.; Jiang, H.; Cao, Q.; Wan, S.; Xie, C. A-Cache: Asymmetric Buffer Cache for RAID-10 Systems under a Single-Disk Failure
to Significantly Boost Availability. IEEE Trans. Comput. Des. Integr. Circuits Syst. 2022, 41, 723–736. [CrossRef]
15. Wu, R.; Wu, Y.; Wang, L. A single failure correction accelerated RAID-6 code. In Proceedings of the 2021 IEEE International
Conference on Emergency Science and Information Technology (ICESIT), Chongqing, China, 22–24 November 2021; pp. 120–123.
16. Colombo, M.; Asal, R.; Hieu, Q.H.; El-Moussa, F.A.; Sajjad, A.; Dimitrakos, T. Data protection as a service in the multi-cloud
environment. In Proceedings of the 2019 IEEE 12th International Conference on Cloud Computing (CLOUD), Milan, Italy, 8–13
July 2019; pp. 81–85.
17. Paiola, M.; Schiavone, F.; Grandinetti, R.; Chen, J. Digital servitization and sustainability through networking: Some evidences
from IoT-based business models. J. Bus. Res. 2021, 132, 507–516. [CrossRef]
18. Su, W.T.; Dai, C.Y. QoS-aware distributed cloud storage service based on erasure code in multi-cloud environment. In Proceedings
of the 2017 14th IEEE Annual Consumer Communications & Networking Conference (CCNC), Las Vegas, NV, USA, 8–11 January
2017; pp. 365–368.
19. Yala, L.; Frangoudis, P.A.; Ksentini, A. QoE-aware computing resource allocation for CDN-as-a-service provision. In Proceedings
of the 2016 IEEE Global Communications Conference (GLOBECOM), Washington, DC, USA, 4–8 December 2016.
20. Liu, B.; Chang, X.; Han, Z.; Trivedi, K.; Rodríguez, R.J. Model-based sensitivity analysis of IaaS cloud availability. Futur. Gener.
Comput. Syst. 2018, 83, 1–13. [CrossRef]
21. Taylor, S.J.; Kiss, T.; Anagnostou, A.; Terstyanszky, G.; Kacsuk, P.; Costes, J.; Fantini, N. The CloudSME simulation platform and
its applications: A generic multi-cloud platform for developing and executing commercial cloud-based simulations. Futur. Gener.
Comput. Syst. 2018, 88, 524–539. [CrossRef]
22. Toffetti, G.; Brunner, S.; Blöchlinger, M.; Spillner, J.; Bohnert, T.M. Self-managing cloud-native applications: Design, implementa-
tion, and experience. Futur. Gener. Comput. Syst. 2017, 72, 165–179. [CrossRef]
23. Leite, R.; Solis, P. Performance analysis of data storage in a hyperconverged infrastructure using docker and glusterfs. In
Proceedings of the 2019 XLV Latin American Computing Conference (CLEI), Panama City, Panama, 30 September–4 October 2019.
24. Lovas, R.; Nagy, E.; Kovács, J. Cloud agnostic Big Data platform focusing on scalability and cost-efficiency. Adv. Eng. Softw. 2018,
125, 167–177. [CrossRef]
25. Nasir, A.; Alyas, T.; Asif, M.; Akhtar, M.N. Reliability Management Framework and Recommender System for Hyper-converged
Infrastructured Data Centers. In Proceedings of the 2020 3rd International Conference on Computing, Mathematics and
Engineering Technologies (iCoMET), Sukkur, Pakistan, 29–30 January 2020.
26. Tabassum, N.; Ditta, A.; Alyas, T.; Abbas, S.; Alquhayz, H.; Mian, N.A.; Khan, M.A. Prediction of Cloud Ranking in a Hypercon-
verged Cloud Ecosystem Using Machine Learning. Comput. Mater. Contin. 2021, 67, 3129–3141. [CrossRef]
27. Chen, F.; Meng, F.; Xiang, T.; Dai, H.; Li, J.; Qin, J. Towards Usable Cloud Storage Auditing. IEEE Trans. Parallel Distrib. Syst. 2020,
31, 2605–2617. [CrossRef]
28. Sarwar, M.I.; Iqbal, M.W.; Alyas, T.; Namoun, A.; Alrehaili, A.; Tufail, A.; Tabassum, N. Data Vaults for Blockchain-Empowered
Accounting Information Systems. IEEE Access 2021, 9, 117306–117324. [CrossRef]
29. Alyas, T.; Javed, I.; Namoun, A.; Tufail, A.; Alshmrany, S.; Tabassum, N. Live migration of virtual machines using a mamdani
fuzzy inference system. Comput. Mater. Contin. 2022, 71, 3019–3033. [CrossRef]