Zeba 1
The growth of cloud computing across sectors calls for powerful risk detection
techniques to mitigate potential threats. This study investigates the use of ML methods in
cloud computing risk assessment, focusing on improving the accuracy and precision of
predictive models. Using ML algorithms such as the Random Forest classifier and the Multilayer Perceptron,
the study aims to develop a robust risk assessment framework. To further enhance predictive
performance, ensemble learning techniques are utilized to harness the classifiers' collective
intelligence. A comprehensive literature review identifies key cloud environment-specific risk
factors that inform the creation of targeted mitigation strategies. Cross-validation methods and
established metrics like accuracy, precision, and F1-score are used to assess model performance
for generalizability. In addition, the study's significance in advancing cloud security practices is
highlighted by discussing its practical implications and future research directions. Organizations
can proactively identify and reduce risks by integrating ML-based risk assessment models,
improving the resilience and security of their cloud computing infrastructure.
Table of Contents
Abstract
1. Introduction
1.1 Background of Cloud Computing and Its Rapid Adoption across Sectors
2. Literature review
4.2.1 Bagging
4.2.2 Boosting
References
1. Introduction
1.1 Background of Cloud Computing and Its Rapid Adoption across Sectors
Despite the many benefits offered by cloud computing, there are various threats
that require careful assessment and management. Risk assessment is an essential process that
involves identifying, analyzing, and mitigating potential threats to ensure the security and
reliability of cloud-based systems. Data breaches and unauthorized access, which can
lead to the misuse of sensitive information, are two of the primary threats associated with
cloud computing. The obligation of organizations to comply with various security
and data protection laws and regulations gives rise to compliance and
regulatory issues. Data theft and service outages pose a critical risk to operational
efficiency and business reputation (Aljawarneh et al., 2018). When an
organization becomes overly dependent on a single cloud service provider, vendor lock-in can
limit flexibility and inflate costs. Comprehensive risk assessment is therefore
essential for protecting sensitive information, maintaining legal compliance, and ensuring
the continuity of business processes. Financial losses and legal sanctions are
potential consequences of inadequate risk assessment. Accordingly, organizations should adopt
effective risk assessment frameworks that are tailored to the specific challenges of cloud
environments.
The flow chart below shows the comprehensive risk management process adopted in this study, which
focuses on four fundamental stages: Risk Identification, Risk Assessment, Risk Mitigation, and
Evaluation. First, Risk Identification involves recognizing potential threats and classifying them into
internal threats, external threats, and shared threats.
Figure 1 - Flow chart showing risk identification and mitigation in cloud computing
The main goal of this study is to examine how machine learning (ML) techniques can be used in
cloud computing risk assessment. By using the capabilities of ML algorithms, the study
seeks to develop predictive models that can improve the accuracy and precision of risk
assessment processes in cloud environments. The research objectives of the study are as
follows:
▪ To assess the performance of the proposed model using predefined metrics such as
accuracy, precision, and F1-score. The study will also compare the ML-based model
with conventional risk assessment techniques to highlight improvements and identify areas for
further improvement.
1. What are the current risk assessment frameworks and methods for cloud computing?
2. How are the most important risk factors associated with cloud computing categorized?
3. How can cloud computing risk assessment be made more accurate and efficient using
machine learning algorithms?
4. Which machine learning algorithms are best able to anticipate and mitigate cloud
computing risks?
5. Using machine learning methods, how can a cloud computing-specific comprehensive
risk evaluation model be developed?
ML Machine Learning
AI Artificial Intelligence
By providing scalable and adaptable computing resources as utility services that users can rent
on demand, cloud computing represents a significant shift in the IT landscape. According to Mell and
Grance (2011), cloud computing is characterized by essential attributes such as on-demand
self-service, broad network access, resource pooling, rapid elasticity, and
measured service. Numerous organizations have been able to improve their operational
effectiveness and cut costs thanks to this paradigm shift. However, the widespread use of cloud
computing also brings with it a number of security issues that call for extensive frameworks for
risk assessment (Cayirci et al., 2016). The rapid growth of cloud computing brings with it a
large number of security challenges that must be addressed to safeguard sensitive data and critical
infrastructure. In cloud environments, Drissi et al. (2016a) highlight key security issues like data
breaches, data loss, account hijacking, insecure interfaces and APIs, denial of service attacks,
and shared technology vulnerabilities. These issues emphasize the need for robust security
measures to ensure the integrity, confidentiality, and availability of cloud-based data.
Various frameworks have been developed to address the security risks of cloud computing.
According to CSA (2016), the Cloud Controls Matrix (CCM) of the Cloud Security Alliance
(CSA) is a comprehensive framework that offers a set of controls for governance,
risk management, and compliance. It is intended to help cloud service providers and
customers evaluate the security of the cloud environment and implement suitable controls to reduce
risks. A systematic approach to managing organizational information security risks is
provided by the ISO/IEC 27001 standard. It includes risk assessment, risk management, and
continuous monitoring to ensure that organizations maintain effective information security
management systems (ISO, 2013). Although these frameworks give crucial guidance for securing cloud
environments, they frequently lack the granularity and adaptability
needed to deal with the cloud's dynamic and heterogeneous nature. Therefore, the adaptability
and efficiency of these frameworks could be improved by incorporating machine learning
techniques into risk assessment procedures.
ML techniques have significantly improved the effectiveness of risk assessment models for cloud
computing. Aljawarneh et al. (2018) propose a hybrid risk assessment model that combines
fuzzy logic and genetic algorithms to analyze security risks in cloud-based healthcare
systems. By drawing on the strengths of both techniques, this approach can assess cloud security
risks with greater precision and adaptability, because the hybrid model is able to account for
the inherent uncertainties and complexity of cloud environments.
Ensemble learning methods have emerged as an effective way to improve the accuracy and precision
of risk assessment models. Dietterich (2000) provides a theoretical framework for ensemble
learning, which combines multiple base learners to construct a stronger overall
model. By combining the predictions of several distinct models, ensemble techniques can improve
the generalizability and accuracy of risk assessment models and reduce their variance. As a
practical technique for reducing the inherent uncertainties and fluctuations in cloud-based
datasets, ensemble learning improves the reliability and robustness of risk assessment models in
cloud computing.
Feature selection plays a crucial role in improving the efficiency and interpretability of risk
assessment models by identifying the most important indicators and eliminating irrelevant or
redundant ones. In Liu et al.'s (2019) hybrid feature selection strategy, particle swarm
optimization and rough set theory are combined to select the best feature subset for cloud
security risk assessment. This strategy achieves superior feature selection performance by
exploiting the complementary strengths of particle swarm optimization and rough set theory,
which improves the efficiency and interpretability of risk assessment models. When dealing with
the complexity of cloud security data and ensuring the accuracy of risk predictions, effective
feature selection is essential.
Meta-learning methods offer a promising approach for combining different risk assessment
models into robust ensembles by using the combined knowledge of the individual classifiers. Zeng et
al. (2020) propose a meta-learning approach for cloud security risk assessment in which
multiple base classifiers are combined with meta-features extracted from the underlying dataset.
This technique achieves superior predictive performance and robustness by using meta-features to
capture the inherent variability and unpredictability of cloud-based datasets. By drawing on the
strengths of various models to give risk assessments that are more accurate and reliable, meta-learning
improves organizations' ability to proactively mitigate security risks in cloud environments.
Aljawarneh et al. (2018) | Hybrid risk assessment model combining fuzzy logic and genetic algorithms for cloud-based healthcare systems | Develop a comprehensive risk assessment framework for cloud-based healthcare systems | Evaluation of risk assessment accuracy, computational efficiency, scalability, and robustness | Healthcare datasets | Cloud-based environment | Matlab | Fuzzy logic Controller Simulator
Liu et al. (2019) | Hybrid feature selection using particle swarm optimization and rough set theory for cloud security risk assessment | Enhance the effectiveness of feature selection in cloud security risk assessment | Evaluation of feature selection accuracy, computational efficiency, scalability, and real-world applicability | Security datasets | Cloud-based environment | AWS | Simulation
Zeng et al. (2020) | Meta-learning approach for combining multiple risk assessment models in cloud security | Improve predictive performance and robustness in cloud security risk assessment | Comparative analysis of predictive accuracy, generalization ability, computational complexity, and scalability across various risk assessment models and datasets | Security datasets | Cloud-based environment | AWS | Simulation
The NSL-KDD dataset, a refined version of the KDD Cup 1999 dataset that is frequently used
for the evaluation of network intrusion detection systems, serves as the primary source of data
for this study. The NSL-KDD dataset addresses the redundancy and inconsistency
of the original dataset, making it a better benchmark for evaluating ML algorithms in network
security (Dietterich, 2000). The dataset includes basic features like duration and protocol type,
content features like the number of failed login attempts, and traffic features like the number of
connections to the same host in the last two seconds.
3.2 Data Pre-processing
Several pre-processing steps are necessary to ensure that the NSL-KDD dataset is clean,
consistent, and ready for modeling:
3. Data Partitioning- In order to evaluate the effectiveness of the machine learning models,
the dataset is divided into training and testing subsets. To ensure that the models are
trained on a diverse set of data and evaluated on unseen examples, the data are typically
divided into 70% for training and 30% for testing.
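The partitioning step can be illustrated with a minimal sketch. It assumes the records are already loaded into a Python list. The function name `split_70_30`, the toy records, and the fixed seed are illustrative assumptions, not details taken from the study.

```python
import random

def split_70_30(records, train_fraction=0.7, seed=42):
    """Shuffle a copy of the records and split it into training and testing subsets."""
    rng = random.Random(seed)          # fixed seed so the split is reproducible
    shuffled = list(records)           # copy, so the caller's list is left untouched
    rng.shuffle(shuffled)
    cut = int(round(len(shuffled) * train_fraction))
    return shuffled[:cut], shuffled[cut:]

# Hypothetical toy records standing in for NSL-KDD rows: (record_id, label).
records = [(i, "attack" if i % 4 == 0 else "normal") for i in range(100)]
train, test = split_70_30(records)
```

Shuffling before the cut keeps both subsets representative when the source file is ordered by class or by time. A stratified split, which preserves the class ratio in each subset, may be preferable for imbalanced intrusion data.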
This study uses a range of ML algorithms to assess their effectiveness in identifying and
mitigating risks in cloud environments. The chosen algorithms include:
2. Multilayer Perceptron (MLP): A Multilayer Perceptron is a type of artificial neural network
with multiple layers of neurons. MLPs are well suited to capturing complex
patterns and interactions within the data, making them suitable for
tasks requiring high predictive accuracy.
3.4 Model Evaluation Metrics
To evaluate the performance of the selected machine learning models, several metrics are used:
Precision | The proportion of true positive predictions over the total number of positive predictions. | Critical in scenarios where false positives need to be minimized.
F1-score | The harmonic mean of precision and recall. | Provides a balanced measure that accounts for both false positives and false negatives, suitable for imbalanced datasets.
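As a worked illustration of these definitions, the sketch below computes precision, recall, and the F1-score directly from true and predicted labels. The function name and the ten toy labels are assumptions made for the example. Recall is included because the F1-score is defined in terms of it.

```python
def precision_recall_f1(y_true, y_pred, positive=1):
    """Compute precision, recall, and F1-score for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)  # true positives
    fp = sum(1 for t, p in pairs if t != positive and p == positive)  # false positives
    fn = sum(1 for t, p in pairs if t == positive and p != positive)  # false negatives
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)  # harmonic mean
    return precision, recall, f1

# Hypothetical labels: 1 = attack, 0 = normal traffic.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
precision, recall, f1 = precision_recall_f1(y_true, y_pred)
```

Here three attacks are detected, one is missed, and one normal record is flagged, so precision, recall, and F1 all equal 0.75.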
In sensitivity analysis, key model parameters are changed to see how they affect performance.
This helps in determining which parameters have the greatest impact and how sensitive the
models are to changes in parameters. In a Multilayer Perceptron, for instance, parameters like the
learning rate, activation functions, number of hidden layers, and number of neurons per layer are
experimented with to see how they affect model performance.
Parameter optimization aims to identify the set of hyperparameters that maximizes
model performance. For this purpose, grid search and random search are used. Grid search
systematically explores a predefined subset of the hyperparameter space, while random
search samples hyperparameters at random. The best parameters for achieving the highest
cross-validation performance, accuracy, and precision can be found using either approach.
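The difference between the two search strategies can be sketched as follows. The scoring function `validation_score` is a hypothetical stand-in for a cross-validated model score, and the parameter names, value ranges, and trial count are assumptions chosen only for illustration.

```python
import itertools
import math
import random

def validation_score(learning_rate, hidden_units):
    """Hypothetical stand-in for a cross-validated score; it peaks at (0.01, 64)."""
    lr_penalty = 0.1 * abs(math.log10(learning_rate) + 2.0)
    size_penalty = abs(hidden_units - 64) / 640.0
    return 1.0 - lr_penalty - size_penalty

def grid_search(grid):
    """Score every combination of the listed values and keep the best one."""
    names = list(grid)
    best_params, best_score = None, float("-inf")
    for values in itertools.product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        score = validation_score(**params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score

def random_search(n_trials, seed=0):
    """Score randomly drawn settings: log-uniform learning rate, uniform layer width."""
    rng = random.Random(seed)
    best_params, best_score = None, float("-inf")
    for _ in range(n_trials):
        params = {
            "learning_rate": 10 ** rng.uniform(-4, -1),
            "hidden_units": rng.randint(16, 128),
        }
        score = validation_score(**params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score

grid = {"learning_rate": [0.001, 0.01, 0.1], "hidden_units": [32, 64, 128]}
grid_best, grid_score = grid_search(grid)
random_best, random_score = random_search(n_trials=20)
```

Grid search evaluates all nine combinations, whereas random search spends a fixed budget of trials and can reach values that lie between the grid points.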
3.6 Risk Assessments and Mitigation Strategy
Risk Category | Risk Description | Mitigation Strategy
Data Privacy | Unauthorized access to sensitive data | Implement robust access controls and encryption mechanisms. Conduct regular audits and enforce least privilege access policies.
Data Loss | Loss of critical data due to corruption or theft | Implement robust data backup and recovery mechanisms. Encrypt sensitive data at rest and in transit. Conduct regular data integrity checks.
Cyber Attacks | Malicious attacks targeting cloud infrastructure | Deploy robust intrusion detection and prevention systems. Implement multi-factor authentication and security incident response protocols.
The performance of the selected machine learning algorithms, Random Forest Classifier and Multilayer
Perceptron, was evaluated using the following metrics:
Ensemble learning combines the predictions of different models to improve overall performance,
leveraging the strengths of each individual model. In order to improve the accuracy and dependability of
risk assessment in cloud computing environments, ensemble methods like Bagging, Boosting,
and Stacking (Meta-Learning) were used in this study.
4.2.1 Bagging
Bagging, or Bootstrap Aggregating, aims to reduce variance by training multiple models on
different subsets of the data and averaging their predictions. This strategy improved the
performance metrics significantly. The accuracy of the ensemble model using Bagging was
92.1%, higher than any individual model. The precision was 91.0%, indicating fewer false
positives. The F1-score, at 91.5%, showed a balanced trade-off between precision and recall.
Cross-validation results were consistently high, averaging 91.3%.
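The mechanics of Bagging can be sketched in a few lines. This is a minimal illustration, not the implementation used in the study. It assumes one numeric feature, labels of +1 and -1, decision stumps as base models, and majority voting in place of averaging. All function names and the toy data are hypothetical.

```python
import random
from collections import Counter

def fit_stump(sample):
    """Fit a one-feature threshold rule (decision stump) to (x, label) pairs."""
    best = None  # (errors, threshold, polarity)
    for threshold in sorted({x for x, _ in sample}):
        for polarity in (1, -1):
            errors = sum(
                1 for x, y in sample
                if (polarity if x >= threshold else -polarity) != y
            )
            if best is None or errors < best[0]:
                best = (errors, threshold, polarity)
    return best[1], best[2]

def stump_predict(stump, x):
    threshold, polarity = stump
    return polarity if x >= threshold else -polarity

def bagging_fit(data, n_models=25, seed=0):
    """Train each stump on a bootstrap sample drawn with replacement."""
    rng = random.Random(seed)
    models = []
    for _ in range(n_models):
        bootstrap = [rng.choice(data) for _ in data]
        models.append(fit_stump(bootstrap))
    return models

def bagging_predict(models, x):
    """Combine the stumps by majority vote (n_models is odd, so no ties)."""
    votes = Counter(stump_predict(model, x) for model in models)
    return votes.most_common(1)[0][0]

# Hypothetical toy data: label +1 ("risky") when the feature is at least 10.
data = [(x, 1 if x >= 10 else -1) for x in range(20)]
models = bagging_fit(data)
accuracy = sum(bagging_predict(models, x) == y for x, y in data) / len(data)
```

Because each stump sees a slightly different bootstrap sample, individual thresholds vary, and the vote smooths out that variation.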
4.2.2 Boosting
Boosting aims to reduce bias by training models sequentially, with each model trying to correct the
errors made by previous models. The Boosting ensemble achieved an accuracy of 93.5%, showing superior
performance in classifying instances correctly. Precision was 92.4%, reflecting a reduction in
false positives. The F1-score was 92.9%, showing balanced performance in handling
both false positives and false negatives. Its robustness was demonstrated by cross-validation,
which yielded an average performance of 92.7%. Boosting's iterative approach
to refining predictions proved highly effective for risk assessment tasks.
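The sequential reweighting idea behind Boosting can be illustrated with a minimal AdaBoost sketch. AdaBoost is used here as one common boosting algorithm, on the assumption that it is representative. The source does not state which variant was applied. Labels of +1 and -1, decision stumps as weak learners, and the ten-point toy dataset are also assumptions.

```python
import math

def train_stump(xs, ys, weights):
    """Return (threshold, polarity, weighted_error) of the best decision stump."""
    best = None
    for threshold in sorted(set(xs)):
        for polarity in (1, -1):
            error = sum(
                w for x, y, w in zip(xs, ys, weights)
                if (polarity if x >= threshold else -polarity) != y
            )
            if best is None or error < best[2]:
                best = (threshold, polarity, error)
    return best

def adaboost_fit(xs, ys, rounds=3):
    """Fit stumps one after another, upweighting the examples misclassified so far."""
    n = len(xs)
    weights = [1.0 / n] * n
    ensemble = []  # list of (alpha, threshold, polarity)
    for _ in range(rounds):
        threshold, polarity, error = train_stump(xs, ys, weights)
        error = min(max(error, 1e-10), 1 - 1e-10)      # guard against log(0)
        alpha = 0.5 * math.log((1 - error) / error)    # vote weight of this stump
        ensemble.append((alpha, threshold, polarity))
        preds = [polarity if x >= threshold else -polarity for x in xs]
        weights = [w * math.exp(-alpha * y * p) for w, y, p in zip(weights, ys, preds)]
        total = sum(weights)
        weights = [w / total for w in weights]         # renormalize to sum to 1
    return ensemble

def adaboost_predict(ensemble, x):
    score = sum(a * (pol if x >= thr else -pol) for a, thr, pol in ensemble)
    return 1 if score >= 0 else -1

# Toy data that no single stump can classify perfectly.
xs = list(range(10))
ys = [1, 1, 1, -1, -1, -1, 1, 1, 1, -1]
ensemble = adaboost_fit(xs, ys, rounds=3)
predictions = [adaboost_predict(ensemble, x) for x in xs]
```

On this dataset the best single stump misclassifies three of the ten points, while the weighted vote of three stumps classifies all ten correctly, which shows how later models correct the errors of earlier ones.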
The comparative analysis of ensemble techniques reveals that they significantly improve model
performance over individual algorithms. Bagging effectively reduced variance, Boosting reduced
bias, and Stacking combined the strengths of multiple models to deliver the best
performance. In cloud computing environments, where precise predictions are necessary for
proactive risk management, these enhancements are crucial for robust risk assessment.
The findings on model performance and ensemble techniques indicate that ML can
significantly enhance risk assessment in cloud computing. The high accuracy and precision of the
models suggest they are well suited to identifying and mitigating various threats. Advanced
machine learning algorithms allow for quicker and more precise risk detection in cloud
environments. The lessons learned from these models can inform the development of risk
mitigation strategies that are better suited to the specific needs of different sectors. Machine
learning models are suitable for dynamic cloud environments, where risks can change rapidly,
due to their adaptability. Compliance and data security are crucial in the healthcare
sector. Unauthorized access and regulatory non-compliance, for example, were
successfully identified by the ensemble models (Iyer, 2014). By implementing strong access
controls and continuous monitoring, healthcare organizations can mitigate these risks. Data
breaches and infrastructure failures are two likely threats to educational institutions.
The models emphasized the significance of regular data backups and redundant systems. These
measures can ensure the integrity and availability of information. Because they handle
sensitive information, government agencies should adhere to strict guidelines. The need
for comprehensive compliance frameworks and intrusion prevention systems was emphasized by the risk
assessment models. Security could be improved and regulatory compliance could be ensured by
these measures.
The use of machine learning (ML) methods for risk assessment in cloud computing environments was the
subject of this study. The study's main findings highlight ML's potential to improve cloud
security across different sectors. The Multilayer Perceptron (MLP) demonstrated superior accuracy
(98.22 percent), precision (99.30 percent), and F1-score (98.07 percent), demonstrating its
robustness in accurately classifying events and handling imbalanced datasets. Random
Forest also performed well, achieving a reasonable balance between precision and
recall. The study showed that Bagging, Boosting, and Stacking can significantly improve
model performance. Notably, Stacking exhibited the highest F1-score and accuracy,
underlining the advantages of combining multiple models to make accurate predictions. Furthermore, the
research identified critical risk categories in cloud computing, such as data
privacy, compliance, system failure, data loss, cyber attacks,
vendor lock-in, and cost overruns. Corresponding mitigation strategies were proposed, emphasizing the
importance of robust access controls, comprehensive compliance frameworks, redundancy
mechanisms, data encryption, multi-factor authentication, multi-cloud strategies, and cost
optimization measures.
5.2 Contributions to the Field
Several significant contributions to cloud computing security and risk assessment are made in
this study. First and foremost, it gives a thorough assessment of various ML algorithms and
ensemble techniques, offering a detailed comparative analysis that aids in the selection of appropriate
models for risk assessment. Second, the application of ensemble methods led to significant
performance enhancements, advancing the creation of cloud-based risk assessment models that
are more accurate and dependable. By demonstrating how combining models can produce
superior results, this contributes to the broader field of machine learning. Lastly, practitioners in
the industry can benefit from the identification of risk categories and the mitigation strategies
that correspond to them. Organizations will be able to better manage and reduce risks associated
with cloud computing as a result of these strategies, which help improve the resilience and
security of cloud infrastructures.
The study's contributions aside, it must be acknowledged that it has a few drawbacks. The
reliance on the NSL-KDD dataset is one of the main drawbacks. Even though this dataset is
extensive, it may not cover all cloud environment variations and scenarios. To ensure a wider
range of applicability and validation of the findings, future studies should incorporate a wider
variety of datasets. The models' generalizability is another limitation. Although cross-validation was
used to evaluate model performance, the applicability of these models to various cloud
environments and configurations requires further validation (Latif et al., 2014). In addition, the study
focused on specific risk categories, possibly overlooking additional threats unique to particular cloud
applications or industries. The proposed models and strategies will become more robust and
adaptable if these limitations are addressed in subsequent research.
Several directions for future research are suggested based on the findings and limitations. Future
studies should incorporate more diverse and up-to-date datasets from different
cloud service providers and industries to improve the generalizability of the models. In addition,
developing risk assessment models that are even more precise and adaptable could be made
possible by examining the application of more advanced ML methods like reinforcement
learning and deep learning. In order to evaluate the practical applicability and efficacy of the
proposed models and strategies in live cloud environments, case studies and real-world
validation are also essential. Last but not least, the development of models that are capable of
performing dynamic risk assessment in real time and adapting to shifting threat landscapes and
cloud configurations would significantly advance the field and offer more efficient solutions for
risk management.
This study's findings have a number of practical implications for industry professionals.
Practitioners can enhance the accuracy and dependability of risk assessments by implementing
the proposed ML models and ensemble techniques, resulting in more efficient strategies for risk
management and mitigation. This study's findings can help businesses prioritize resources and
implement robust security measures by guiding strategic decisions about cloud security
investments. In cloud environments, organizations can also meet regulatory requirements and
improve governance overall by implementing the recommended compliance frameworks and
mitigation strategies. The proposed cost optimization strategies can help organizations
manage and control cloud expenditures, ensuring that financial requirements are
met without compromising security. Overall, the study provides actionable recommendations that can
significantly improve the security and efficiency of cloud computing practices in various
industries.
References
Abdelaziz, A., Elhoseny, M., Salama, A. S., & Riad, A. M. (2018). A machine learning model
for improving healthcare services on cloud computing environment. Measurement, 119, 117-128.
Ahmed, N. and Abraham, A. (2015a). Modeling cloud computing risk assessment using
ensemble methods, pages 261–274. Springer International Publishing.
Ahmed, N. and Abraham, A. (2015b). Modeling cloud computing risk assessment using machine
learning.
Aljawarneh, S., Aldwairi, M., and Yassein, M. B. (2018). Cloud computing security: A survey.
Journal of Data Security and Applications, 38:1–16.
Alosaimi, R. and Alnuem, M. (2016). Risk management frameworks for cloud computing: a
critical review. AIRCC’s International Journal of Computer Science and Data Technology, pages
1–11.
Cayirci, E., Garaga, A., Santana de Oliveira, A., and Roudier, Y. (2016). A risk assessment
model for selecting cloud service providers. Journal of Cloud Computing, 5(1):14.
Drissi, S., Benhadou, S., and Medromi, H. (2016a). Evaluation of risk assessment methods
regarding cloud computing. In Proceedings of the 5th Conference on Multidisciplinary Design
Optimization and Application.
Drissi, S., Benhadou, S., and Medromi, H. (2016b). A new shared and comprehensive tool of
cloud computing security risk assessment. In Proceedings of the UNet’15, volume 1, pages 155–
167. Springer Singapore.
Duc, T. L., Leiva, R. G., Casari, P., & Östberg, P. O. (2019). Machine learning methods for
reliable resource provisioning in edge-cloud computing: A survey. ACM Computing Surveys
(CSUR), 52(5), 1-39.
Iyer, E. K. (2014). Segmentation of risk factors associated with cloud computing adoption. In
Proceedings of The International Conference on Cloud Security Management ICCSM-2014,
pages 82–89.
Latif, R., Abbas, H., Assar, S., and Ali, Q. (2014). Cloud computing risk assessment: a
systematic literature review. In FutureTech 2013, pages 285–295.
Mell, P. and Grance, T. (2011). The NIST definition of cloud computing (Special Publication
800-145). Technical report, National Institute of Standards and Technology.
Nassif, A. B., Talib, M. A., Nasir, Q., Albadani, H., & Dakalbab, F. M. (2021). Machine learning
for cloud security: a systematic review. IEEE Access, 9, 20717-20735.
Nguyen, K. K., Hoang, D. T., Niyato, D., Wang, P., Nguyen, D., & Dutkiewicz, E. (2018, April).
Cyberattack detection in mobile cloud computing: A deep learning approach. In 2018 IEEE
wireless communications and networking conference (WCNC) (pp. 1-6). IEEE.
Pavithra, B., Mishra, N., and Naveen, G. (2023). Cloud security analysis using machine learning
algorithms.
Sharma, A. and Singh, U. K. (2022). Modelling of smart risk assessment approach for cloud
computing environment using ai and supervised machine learning algorithms. Global Transitions
Proceedings, 3(1):243–250.
Systems ICAISS 2023, pages 704–708. IEEE.
Zekri, M., El Kafhali, S., Aboutabit, N., & Saadi, Y. (2017, October). DDoS attack detection
using machine learning techniques in cloud computing environments. In 2017 3rd international
conference of cloud computing technologies and applications (CloudTech) (pp. 1-7). IEEE.