
Cloud Computing with Machine Learning:

Adarsh Kumar
Department of Computer Science, Lovely Professional University, Phagwara, Punjab, India.

Abstract

Cloud computing is a computing paradigm that provides on-demand, scalable and measured services to end users. Today almost every business depends heavily on this technology for cost saving, infrastructure, development platforms, data processing, data analytics, etc. The services offered by cloud service providers (CSPs) can be consumed by end users anytime and anywhere through web applications over the internet. The security of the cloud infrastructure is of utmost importance, and several research works involving various technologies aim to provide better and more accurate defence mechanisms against cloud attacks. Machine learning is a technology that has recently proved to produce better results in securing the cloud environment. Machine learning algorithms are trained on various authentic datasets to build models that automate the detection of cloud attacks with higher accuracy than other technologies. This paper reviews some of the latest research papers that have employed machine learning as a security mechanism against cloud attacks.

Keywords - Cloud Computing, Machine Learning, Intrusion Detection System, Datasets, Supervised Machine Learning, Unsupervised Machine Learning, Reinforcement Learning, Deep Neural Network.

1 INTRODUCTION

The security of the cloud environment is one of the biggest concerns in recent times. Even large cloud service providers with extensive security measures, such as Amazon and Google, suffer from cloud attacks that are reported on a regular basis. Cloud security can be broadly divided into five categories, namely information security, identity security, network security, infrastructure security and software security. Machine Learning as a Service (MLaaS) is the service model utilized by cloud computing to enhance the defence strategy against several cloud attacks. Several intrusion detection systems have been developed with the help of machine learning algorithms; they have improved the accuracy of attack detection and allow business operations to carry on smoothly.

2 THEORETICAL BACKGROUND

This section gives brief background on cloud computing and machine learning.

2.1 Cloud Computing

Cloud computing is a computing paradigm that provides on-demand, scalable, measured and secure services to end users over the internet. Because of these benefits, the cloud computing paradigm has a very large set of use cases. Many cloud service providers in the market today offer a variety of cloud services to their customers. Some of them are Amazon Web Services (AWS), Microsoft Azure, IBM Cloud, Google Cloud, Oracle Cloud, Alibaba Cloud, etc.

2.1.1 Characteristics of Cloud Computing

According to NIST there are five essential characteristics of cloud computing. [1, 2]

On-demand self-service- Cloud services are made available to end users on demand and without the intervention of the cloud service provider.

Rapid elasticity- Cloud based applications can handle a rapid increase or decrease in the demand for services without resource shortage or business downtime.

Measured Service- The services provided by the cloud service provider are billed on a measured basis, i.e. consumers are charged only for the services they actually consume and are free to stop consuming the services at any time.

Broad network access- Cloud services are available and can be accessed from a wide range of thin or thick clients such as mobile devices, laptops, desktops, PDAs etc.

Resource Pooling- Cloud computing resources such as memory, storage, processing units, network bandwidth etc. are pooled by the cloud service provider in order to serve the requests of multiple customers.
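As a small illustration of the measured-service characteristic, the sketch below charges a consumer only for recorded usage. The resource names and unit prices are hypothetical and chosen only for the example; real providers use their own metering and pricing schemes.

```python
from dataclasses import dataclass


@dataclass
class UsageRecord:
    resource: str    # hypothetical resource name, e.g. "vm_hours"
    quantity: float  # amount actually consumed


# Hypothetical unit prices, used only for illustration
UNIT_PRICE = {"vm_hours": 0.25, "storage_gb_month": 0.5}


def metered_bill(records):
    """Pay-per-use billing: sum of quantity * unit price over all usage records."""
    return sum(r.quantity * UNIT_PRICE[r.resource] for r in records)


usage = [UsageRecord("vm_hours", 8), UsageRecord("storage_gb_month", 4)]
print(metered_bill(usage))
```

A consumer with no usage records is billed nothing, which mirrors the point that consumers are free to stop consuming the services at any time.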
Fig. 1: Working Model of Cloud Computing
[Figure: cloud characteristics (On-Demand Self-Service, Rapid Elasticity, Measured Service, Broad Network Access, Resource Pooling); cloud service models (Infrastructure as a Service (IaaS), Platform as a Service (PaaS), Software as a Service (SaaS)); cloud deployment models (Public Cloud, Private Cloud, Hybrid Cloud, Community Cloud).]

2.1.2 Service Models

IaaS- The fundamental hardware required to run a cloud application is provisioned by this service model. These resources include storage, network, processing units, virtual machines etc. The cost of setting up and maintaining these resources is very high; this service layer saves that cost because setup and maintenance are taken care of by the service provider. [3, 4]

PaaS- This layer provides the platform on which developers build applications using the underlying cloud infrastructure. PaaS provides the tools, technologies, programming languages etc. required to develop a cloud application. The end user does not control the underlying cloud infrastructure but has complete command over the application.

SaaS- This model allows the end user to use cloud-deployed applications over the internet through a wide range of available clients. End users have no command over the cloud infrastructure or the application itself beyond consuming the application.

2.1.3 Deployment Model

Public Cloud- A public cloud is made available to all end users who simply need to use the applications deployed on it. Examples include Amazon EC2, Google App Engine, Microsoft Azure etc. [3, 5]

Private Cloud- A private cloud is dedicated to an individual private organization for carrying out its business with a high level of privacy and security and without the intervention of outsiders. Examples include Microsoft ECI data center, Amazon Virtual Private Cloud, Ubuntu Enterprise Cloud, Eucalyptus etc.

Community Cloud- This deployment model is used when several organizations share the cloud infrastructure. Depending on the requirement, a community cloud can be outsourced or hosted on-site. Examples include Microsoft Government Community Cloud and Google Apps for Government.

Hybrid Cloud- A cloud structure based on a combination of the other deployment models is termed a hybrid cloud. Examples include VMware vCloud etc.

2.1.4 Cloud Attacks

The cloud computing paradigm suffers from several kinds of attacks. Depending on their type, these attacks may occur at different cloud service models, i.e. at IaaS, PaaS or SaaS [6]. Figure 2 displays some of the popular cloud attacks at the respective service models.

Fig. 2: Classification for Cloud Attacks
[Figure: cloud attacks arranged under IaaS, PaaS and SaaS. Attacks named in the figure: Stepping Stone, Malicious Attacker, Cross VM Attack, VM Rollback, Programming, Phishing, Man-in-the-Middle, Cloud Malware Injection, SQL Injection, Cross-Site Scripting, DDoS, Authentication Attacks, Password Reset Attack.]
2.2 Machine Learning

According to Arthur Samuel, machine learning is learning from past experience rather than through explicit programming. Machine learning uses various types of algorithms to create models which, when trained on a large volume of data, can predict future outcomes from what was learned from past data. The algorithms used to train the models are the backbone of machine learning. The choice of machine learning algorithm depends on the type of problem to be solved. Applying machine learning to a given problem starts with data collection, followed by data preparation, data analysis, training the model, testing the model and finally deploying the model for actual use. [7, 8]

2.2.1 Types of Machine Learning

Supervised Machine Learning- Supervised ML algorithms are used to predict future outcomes; they are trained on datasets that are labeled, i.e. mapped to corresponding output target values. The major task of a supervised ML algorithm is to observe the given input data and assign an appropriate class to it. This assignment is only possible after training on a large, properly labeled dataset with clearly defined classes.

Supervised ML algorithms can solve two categories of ML problem, namely Classification and Regression. Problems with a categorical (e.g. yes/no) target variable are solved using classification algorithms, whereas problems whose target variable is continuous rather than categorical are solved using regression algorithms. [9, 10]

Unsupervised Machine Learning- Unsupervised ML algorithms train the ML model on datasets that are neither labeled nor categorized. By examining a large dataset, an unsupervised ML algorithm discovers data insights such as patterns, classes and categories on its own. Clustering and Association are the two categories of unsupervised ML. Clustering based algorithms form groups of data with similar characteristics, whereas Association based algorithms find relations between data items that can be grouped together. [11, 2]

Semi-Supervised Machine Learning- The shortcomings of supervised and unsupervised ML algorithms are addressed by semi-supervised ML algorithms. Both labeled and unlabeled datasets are used to train the ML model in semi-supervised learning. [9]
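The difference between the two supervised problem types can be illustrated with a minimal sketch. It uses a 1-nearest-neighbour rule for classification and an ordinary least-squares line for regression; both are chosen here only for brevity, and the toy data (packets per second with a label, and simple (x, y) pairs) is made up for the example.

```python
def nearest_neighbor_classify(train, x):
    """Classification: return the label of the closest labeled training value (1-NN)."""
    return min(train, key=lambda pair: abs(pair[0] - x))[1]


def fit_line(points):
    """Regression: ordinary least-squares fit of y = a * x + b."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    a = sxy / sxx
    return a, mean_y - a * mean_x


# Categorical target: each training value carries a class label
labeled = [(10, "normal"), (12, "normal"), (900, "attack"), (950, "attack")]
print(nearest_neighbor_classify(labeled, 870))

# Continuous target: the model predicts a number, not a class
a, b = fit_line([(1, 3), (2, 5), (3, 7)])
print(a, b)
```

The classifier returns one of the labels seen during training, while the regression model returns coefficients that map any input to a continuous value.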

Fig. 3: Classification of Machine Learning Algorithms
[Figure: Machine Learning divided into Supervised and Unsupervised. Supervised, Classification: Random Forest, Decision Tree, Logistic Regression, Support Vector Machine, K-Nearest Neighbor, Naïve Bayes. Supervised, Regression: Simple Linear Regression, Multivariate Regression, Random Forest Regression, Decision Tree, Lasso Regression. Unsupervised, Clustering: K-Means Clustering, DBSCAN Algorithm, Principal Component Analysis, Independent Component Analysis, Mean-shift algorithm.]
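As an example of a clustering algorithm, the following is a minimal sketch of K-Means (Lloyd's algorithm) on 2-D points. Seeding the centroids with the first k points and running a fixed number of iterations are simplifying assumptions made for this illustration; practical implementations use better initialization and a convergence test.

```python
def kmeans(points, k, iterations=10):
    """Plain K-Means on 2-D points; the first k points seed the centroids."""
    centroids = [points[i] for i in range(k)]
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            # Assign each point to the nearest centroid (squared Euclidean distance)
            idx = min(
                range(k),
                key=lambda i: (p[0] - centroids[i][0]) ** 2 + (p[1] - centroids[i][1]) ** 2,
            )
            clusters[idx].append(p)
        # Move each centroid to the mean of its cluster; keep it if the cluster is empty
        centroids = [
            (sum(x for x, _ in c) / len(c), sum(y for _, y in c) / len(c)) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
    return centroids, clusters


points = [(1.0, 1.0), (9.0, 9.0), (1.0, 2.0), (2.0, 1.0), (8.0, 9.0), (9.0, 8.0)]
centroids, clusters = kmeans(points, k=2)
print(centroids)
```

No labels are supplied: the two groups emerge only from the distances between the points, which is the defining property of unsupervised learning.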


Reinforcement Machine Learning- A feedback based learning methodology is used in Reinforcement ML. The agent learns from its own experience without training on any supervised dataset and is rewarded or penalized for making correct or incorrect decisions. Figure 3 shows the classification of machine learning algorithms.

3 LITERATURE REVIEW

Chkirbene et al. [13] proposed a machine learning based intrusion detection system for cloud computing. The classifier is the most important component of an intrusion detection system, and it fails to give high classification accuracy because of the imbalanced nature of the available datasets. To address this problem, a weighted supervised decision tree algorithm is employed as the classification algorithm. High classifier accuracy is achieved because the proposed approach produces low scores for negative classification and high scores for positive classification.

Another security framework is proposed by Bagaa et al. [14], based on the combination of the SVM machine learning algorithm, Network Function Virtualization (NFV) and Software Defined Networking (SDN). The approach is notable because security against different attacks is achieved for both NFV and SDN. The proposed framework is divided into two levels. The first is the “security enforcement plane”, which provides security against both internal and external attacks in IoT and is further sub-divided into three components, namely MA (Monitoring Agent), IB (Infrastructure Block) and CMB (Control and Management Block). The second is the “security orchestration plane”, which configures the security policies at run time. Better results are achieved in terms of accuracy, false positive rate (FPR), detection rate and training time compared with other existing approaches.

Dey et al. [15] highlight the importance of data security in mobile cloud computing, where heterogeneous networks are involved, and propose an intrusion detection system that can handle such complex security constraints. The K-Means and DBSCAN machine learning algorithms lay the foundation for this IDS, which can defend against heterogeneous attacks such as MITM and DDoS. The approach trains the system on a cluster basis and classifies traffic on the basis of distance calculation. Better accuracy is achieved for the proposed IDS because complexity is reduced: the rules do not need to be updated regularly.

A concept for secure offloading using machine learning in a multi-environment setting (Fog-Cloud-IoT) is given by Alli et al. [16]. PSO (Particle Swarm Optimization) selects the optimal fog node, which can be used as IoT data storage, and the data is then transferred to a cloud selected via reinforcement learning. A private cloud is used for storing sensitive data, whereas non-sensitive data is not uploaded to the private cloud.

Another machine learning based scheme, which monitors user behavior in the cloud on behalf of the CSP (cloud service provider), is given by Rabbani et al. [17]. To identify unauthenticated users in the cloud, the approach uses a hybrid of PSO and PNN (particle swarm optimization and probabilistic neural network). Results showed the effectiveness of the proposed hybrid scheme, which achieved high accuracy in terms of true positive rate, false negative rate, f-measure and precision.

Hesamifard et al. [18] use machine learning capability for preserving privacy. Data encrypted with homomorphic encryption is used to train the neural network. The traditional sigmoid and ReLU (Rectified Linear Unit) activation functions of the neural network are replaced with accurate polynomial approximations. The proposed approach is reported to be more accurate in providing privacy than SMC (secure multiparty computation) and HE (homomorphic encryption).

Secure machine learning based sharing of data over the cloud is achieved by Singh et al. [19] via a mutual authentication protocol. The proposed protocol defends against several types of cloud attacks such as DoS, DDoS, MITM, replay etc. ECC (elliptic curve cryptography) and Schnorr’s signature are used in combination to encrypt the data with the benefit of small key sizes, and the classification of threats or attacks is performed by a voting classifier. The high accuracy of the proposed methodology is supported by results from the ProVerif tool.

Salman et al. [20] suggest the use of machine learning to mitigate different cloud attacks in a multi-cloud environment via an intrusion detection system. Linear regression and random forest supervised machine learning algorithms are employed by the intrusion detection system in this approach. Apart from detecting cloud threats, the main advantage of the approach is that it also categorizes the threats via a novel step-wise algorithm. Accuracies of 99.0% and 93.6% are achieved for categorization and detection of the threats respectively.

By hybridizing genetic and simulated annealing algorithms, Chiba et al. [21] proposed an intrusion detection system based on a deep neural network. The improved genetic algorithm used in this approach reduces both convergence time and execution time, while the simulated annealing algorithm (SAA) optimizes the search process of the genetic algorithm. Together these algorithms improve factors of the DNN such as feature selection and the activation function, and thus enhance the overall performance of the deep neural network.

Machine learning based authorization that allows only authenticated users to access cloud services is proposed by Khilar et al. [22]. Because the approach improves the authorization mechanism for cloud users and restricts unauthorized access to cloud resources, trust between service providers and end users improves and overall data security is strengthened. The proposed approach gave better results in terms of MAE, time, recall, precision and F1-score when compared with traditional mechanisms for user access to cloud resources.

A machine learning based IDS with improved accuracy is proposed in the work of Aljamal et al. [23]. SVM for classification and the K-Means clustering algorithm are used in hybrid mode at the cloud hypervisor in order to detect anomalies in the network. The proposed hybrid model performs network traffic investigation, removal of unwanted features from the dataset, clustering of the data with the K-Means algorithm and classification of requests as normal or malicious via SVM.
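The works reviewed above report their results with measures such as accuracy, detection rate (true positive rate), false positive rate, false negative rate, precision and f-measure. As a reminder of how these standard measures relate, the sketch below computes them from the four counts of a binary confusion matrix. The counts in the usage example are made up for illustration and are not taken from any of the cited papers.

```python
def detection_metrics(tp, fp, tn, fn):
    """Standard binary-classification metrics from confusion-matrix counts.

    Assumes every denominator is non-zero (at least one actual positive,
    one actual negative and one predicted positive).
    """
    tpr = tp / (tp + fn)        # true positive rate (recall, detection rate)
    fpr = fp / (fp + tn)        # false positive rate
    fnr = fn / (fn + tp)        # false negative rate
    precision = tp / (tp + fp)
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    f1 = 2 * precision * tpr / (precision + tpr)  # f-measure (F1-score)
    return {
        "tpr": tpr,
        "fpr": fpr,
        "fnr": fnr,
        "precision": precision,
        "accuracy": accuracy,
        "f1": f1,
    }


# Hypothetical counts: 100 attack requests and 900 normal requests
m = detection_metrics(tp=80, fp=20, tn=880, fn=20)
print(m)
```

On imbalanced traffic such as this, accuracy alone can look high even when a noticeable share of attacks is missed, which is why the surveyed papers also report rate-based measures.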

Table 1. Summary of related work

| Ref | ML Algorithm Used | Proposed Approach | Dataset |
|------|-------------------|-------------------|---------|
| [13] | Decision Tree | Intrusion Detecting System based on weight optimization. | UNSW |
| [14] | Support Vector Machine | AI Framework based on the combination of ML, NFV, SDN. | NSL-KDD |
| [15] | K-Means and DBSCAN | Traffic filtration via distance calculation and training system via cluster basis. | Multiple datasets |
| [16] | Reinforcement Learning | Neuro-Fuzzy system for secure data offloading with PSO to select secure fog node in Fog-Cloud-IoT environment. | Multiple datasets |
| [17] | Multilayer Neural Network | Identification of unwanted user in the cloud with PSO and PNN. | UNSW-NB15 |
| [18] | Deep Neural Network | Training NN with encrypted data and using accurate activation function for the NN. | Crab, Fertility and Climate Dataset |
| [19] | LR (Linear Regression) and KNN (K-Nearest Neighbor) | ECC along with voting classifier for mutual authentication over multi cloud environment. | CICD |
| [20] | Linear Regression (LR) and Random Forest (RF) | Machine Learning based Intrusion detection system for detection of attacks in multi-cloud environment. | UNSW |
| [23] | K-Means clustering and SVM classification | The hybrid model is responsible for performing network traffic investigation, unwanted feature reduction from dataset, clustering the data with K-Means algorithm and classification between normal as well as malicious requests via SVM. | UNSW-NB15 |
| [24] | Random Forest, Quadratic Discriminant Analysis, K-Nearest Neighbours, Gaussian Naive Bayes (GNB) and AdaBoost | Deep Reinforcement Learning model which has the host, agent and administrator network that predicts the affected virtual machines and also blocks them. | UNSW-NB15 |
| [21] | Deep Neural Network | IGASAA, i.e. hybridization of genetic algorithm and simulated annealing algorithm for machine learning based network IDS. | CICIDS2017, NSL-KDD version 2015 and CIDDS-001 |
| [25] | Linear Regression (LR) | EIDS for traffic analysis in which the past as well as current decisions are compared with each other using machine learning. | UNSW-NB-15 |
| [26] | Decision Tree, Random Forest | Machine learning based classification TIDCS and detection TIDCS-A models for IDS. | NSL-KDD, UNSW |
| [22] | K-Nearest Neighbor, Decision Tree, Logistic Regression, Naive Bayes | Authorization of the user is increased to provide better security to the cloud resources using machine learning approach. | User Dataset |
| [27] | DML | DML-DIV for retaining the integrity of the data in distributed cloud environment. | Advertisement Click Prediction |

Security for the cloud is enhanced by an IDS based on reinforcement learning in the research paper of Sethi et al. [24]. The major drawback of traditional IDSs for cloud security, inaccurate classification, is addressed by this approach, which achieves a low FPR (false positive rate). The proposed model involves three modules: the host network, responsible for mitigating virtual machine based attacks; the agent network, responsible for distinguishing normal from malicious requests; and the administration network, which allows administrators to block the affected virtual machines.

Chkirbene et al. [25] suggested an “EIDS” scheme for traffic analysis in which past and current decisions are compared with each other using machine learning based intrusion detection in order to provide cloud security. The comparison of current and past decisions for the classification of attacks is performed in order to enhance the performance of the intrusion detection system. The security of the IDS is increased, with an overall 24% increase (to about 90%) in the detection rate of the supervised learning classifier.

Chkirbene et al. [26] gave two models for a trust based IDS: a classification model (TIDCS) and an accelerated model (TIDCS-A). The former is responsible for dimensionality reduction, allowing only the required features from the UNSW dataset to be processed by the machine learning algorithm, while the latter takes care of anomaly detection. Simulation results show that both proposed models (TIDCS and TIDCS-A), with the help of machine learning algorithms, are capable of attack classification as well as detection with better accuracy.

When machine learning is applied in a distributed cloud environment, the problem of data tampering increases. To solve this problem, Zhao et al. [27] proposed a verification methodology named DML-DIV (distributed machine learning data integrity verification) for data in a distributed cloud environment, so that the integrity of the data can be retained. The simulation results show that the proposed DML-DIV approach is better than the existing approaches it was compared with in terms of privacy protection and resistance to forgery and tampering attacks.

4 CONCLUSIONS AND FUTURE SCOPE

Client data in the cloud is crucial and its security cannot be compromised by any means. Researchers apply several new technologies that make use of various security algorithms in order to enhance the security of the cloud ecosystem. Machine learning has considerable scope to provide a more accurate and automated defence against known and unknown cloud attacks. The main takeaway from this survey paper is an up-to-date glimpse of the research work in the field of cloud security using machine learning.

In future, we propose an intrusion detection system that will make use of an enhanced and optimized machine learning algorithm in order to provide more accurate cloud data security.

REFERENCES

[1] Alouffi, B., Hasnain, M., Alharbi, A., Alosaimi, W.: A Systematic Literature Review on Cloud Computing Security: Threats and Mitigation Strategies. IEEE Access, Vol. 9, pp. 57792-57807, 2021.
[2] Abdulsalam, Y.S., Hedabou, M.: Security and Privacy in Cloud Computing: Technical Review. Future Internet 2022, 14, 11.
[3] George, S.S., Pramila, R.S.: A review of different techniques in cloud computing. Materials Today: Proceedings, Vol. 46, pp. 8002-8008, 2021.
[4] Attaran, M., Woods, J.: Cloud computing technology: improving small business performance using the Internet. Journal of Small Business & Entrepreneurship, 13, pp. 94-106, 2018.
[5] Basu, S., Bardhan, A., Gupta, K., Saha, P., Pal, M., Bose, M., Basu, K., Chaudhury, S., Sarkar, P.: Cloud computing security challenges & solutions-A survey. Annual Computing and Communication Workshop and Conference (CCWC), 2018.
[6] Dwivedi, R.K., Saran, M., Kumar, R.: A Survey on Security over Sensor-Cloud. In: 2019 9th International Conference on Cloud Computing, Data Science & Engineering (Confluence), Noida, India, pp. 31-37, 2019.
[7] Butt, U.A., Mehmood, M., Shah, S.B.H., Amin, R., Shaukat, M.W., Raza, S.M., Suh, D.Y., Piran, M.J.: A Review of Machine Learning Algorithms for Cloud Computing Security. Electronics 2020, 9, 1379.
[8] Sarker, I.H.: Machine Learning: Algorithms, Real-World Applications and Research Directions. SN Computer Science, Vol. 2, 3 (2021): 160.
[9] Alzubi, J., Nayyar, A., Kumar, A.: Machine Learning from Theory to Algorithms: An Overview. Journal of Physics: Conference Series, Volume 1142, Second National Conference on Computational Intelligence 2018, Bangalore, India.
[10] Baraneetharan, E.: Role of Machine Learning Algorithms Intrusion Detection in WSNs: A Survey. Journal of Information Technology and Digital World, Vol. 02, pp. 161-173, 2020.
[11] Saranyaa, T., Sridevi, S., Deisy, C., Chung, T.D., Khan, M.K.A.: Performance Analysis of Machine Learning Algorithms in Intrusion Detection System: A Review. Procedia Computer Science, Vol. 171, pp. 1251-1260, 2020.
[12] Sen, P.C., Hajra, M., Ghosh, M.: Supervised Classification Algorithms in Machine Learning: A Survey and Review. Emerging Technology in Modelling and Graphics. Advances in Intelligent Systems and Computing, Vol. 937, pp. 99-111, 2019.
[13] Chkirbene, Z., Erbad, A., Hamila, R., Gouissem, A., Mohamed, A., Hamdi, M.: Machine Learning Based Cloud Computing Anomalies Detection. IEEE Network, Vol. 34, pp. 178-183, 2020.
[14] Bagaa, M., Taleb, T., Bernabe, J.B., Skarmeta, A.: A Machine Learning Security Framework for Iot Systems. IEEE Access, Vol. 8, pp. 114066-114077, 2020.
[15] Dey, S., Ye, Q., Sampalli, S.: A machine learning based intrusion detection scheme for data fusion in mobile clouds involving heterogeneous client networks. Information Fusion, Vol. 49, pp. 205-215, 2019.
[16] Alli, A.A., Alam, M.M.: SecOFF-FCIoT: Machine learning based secure offloading in Fog-Cloud of things for smart city applications. Internet of Things, Vol. 7, 2019.
[17] Rabbani, M., Wang, Y.L., Khoshkangini, R., Jelodar, H., Zhao, R., Hu, P.: A hybrid machine learning approach for malicious behaviour detection and recognition in cloud computing. Journal of Network and Computer Applications, Vol. 151, 2020.
[18] Hesamifard, E., Takabi, H., Ghasemi, M., Jones, C.: Privacy-preserving Machine Learning in Cloud. Cloud Computing Security Workshop, pp. 39-43, 2017.
[19] Singh, A.K., Saxena, D.: A Cryptography and Machine Learning Based Authentication for Secure Data-Sharing in Federated Cloud Services Environment. Journal of Applied Security Research, 2021.
[20] Salman, T., Bhamare, D., Erbad, A., Jain, R., Samaka, M.: Machine Learning for Anomaly Detection and Categorization in Multi-Cloud Environments. IEEE 4th International Conference on Cyber Security and Cloud Computing, 2017.
[21] Chiba, Z., Abghour, N., Moussaid, K., Elomri, A., Rida, M.: Intelligent approach to build a Deep Neural Network based IDS for cloud environment using combination of machine learning algorithms. Computers & Security, Vol. 86, pp. 291-317, 2019.
[22] Khilar, P.M., Chaudhari, V., Swain, R.R.: Trust-Based Access Control in Cloud Computing Using Machine Learning. Cloud Computing for Geospatial Big Data Analytics, pp. 55-79, 2018.
[23] Aljamal, I., Tekeoğlu, A., Bekiroglu, K., Sengupta, S.: Hybrid Intrusion Detection System Using Machine Learning Techniques in Cloud Computing Environments. IEEE 17th International Conference on Software Engineering Research, Management and Applications (SERA), 2019.
[24] Sethi, K., Kumar, R., Prajapati, N., Bera, P.: Deep Reinforcement Learning based Intrusion Detection System for Cloud Infrastructure. International Conference on Communication Systems & Networks (COMSNETS), 2020.
[25] Chkirbene, Z., Erbad, A., Hamila, R.: A Combined Decision for Secure Cloud Computing Based on Machine Learning and Past Information. IEEE Wireless Communications and Networking Conference (WCNC), 2019.
[26] Chkirbene, Z., Erbad, A., Hamila, R., Mohamed, A., Guizani, M., Hamdi, M.: TIDCS: A Dynamic Intrusion Detection and Classification System Based Feature Selection. IEEE Access, Vol. 8, pp. 95864-95877, 2020.
[27] Zhao, X., Jiang, R.: Distributed Machine Learning Oriented Data Integrity Verification Scheme in Cloud Computing Environment. IEEE Access, Vol. 8, pp. 26372-26384, 2020.
