Auto-adaptive learning-based workload forecasting in dynamic cloud environment
To cite this article: Deepika Saxena & Ashutosh Kumar Singh (2020): Auto-adaptive learning-
based workload forecasting in dynamic cloud environment, International Journal of Computers and
Applications, DOI: 10.1080/1206212X.2020.1830245
The contributions are summarized as follows:

• An auto-adaptive neural network model is presented for workload forecasting in a dynamic cloud environment.
• An improved adaptive variant of the differential evolution learning algorithm is developed and applied to train the neural network.
• Adaptation is applied in three dimensions (mutation, crossover, and control-parameter tuning) to carry out global optimization of the prediction neural network model.
• An in-depth experimental analysis of the proposed workload prediction approach is carried out using two benchmark datasets, the NASA HTTP and Saskatchewan HTTP internet traffic traces, showing remarkable improvement in predicting future workload demand at a cloud data center.

The rest of the paper is organized as follows: Section 2 discusses the related work, Section 3 describes the proposed model, Section 4 presents the experiments and results, and Section 5 offers concluding remarks and the future scope of this work.

2. Related work

In this section, recent key contributions in the field of workload prediction are discussed. They are divided into two categories: (1) neural network models and (2) hybrid models.

2.1. Neural network models

An empirical prediction model for adaptive workload provisioning is proposed in [15], which allows adaptive resource provisioning for dynamic and proactive resource management in the cloud using an error correction neural network (ECNN) and linear regression with a sliding window and cross-validation. The advantage of this technique is that it is an adaptive and proactive method of workload prediction; however, its fixed-size sliding window is incapable of workload forecasting in a dynamic cloud environment. The Rand Variable Learning Rate Backpropagation Neural Network (RVLRBPNN) [16] is a future workload prediction technique, based on a backpropagation neural network, that outperforms a Hidden Markov Model (HMM) based prediction technique. Its advantage is that it allows gradient descent-based learning and does not converge prematurely, unlike HMM. Commonly, workload prediction methods are derived using a single machine learning model; the disadvantage is that variations in network traffic make it difficult to predict future workload trends using a single model, and therefore an adaptive approach is proposed in [17].

Duy et al. [18] proposed a host load prediction method using a neural network trained with backpropagation and achieved accuracy of up to 79%. A backpropagation-trained neural network is also utilized for cloud workload prediction in [19]. Later, Kumar et al. presented a dynamic resource scaling and workload prediction approach based on a neural network with a blackhole learning algorithm in [20]. They also proposed a self-adaptive differential evolution (SaDE) based artificial neural network model [21] for workload prediction, which produces more accurate results than the RVLRBPNN gradient descent-based, backpropagation-trained model of [16]. The advantage is that these evolutionary learning-based approaches (blackhole and SaDE) allow faster convergence, better exploration, learning capability, and higher accuracy than the backpropagation algorithm. The downside is that they are time consuming, since they work with multiple solutions simultaneously rather than the single-solution approach of backpropagation.

Recently, a biphase adaptive algorithm [1] was developed and applied to train a feed-forward neural network for workload prediction in a cloud environment. Its advantage is that it allows adaptation at the mutation and crossover level, and it has beaten the prediction accuracy of the SaDE algorithm because of its dual adaptive nature in forecasting dynamic workload demands. An approach using two LSTM neural networks is proposed in [22], where one network forecasts the arrival interval of user requests and the other predicts the aggregated resource requests for a pre-defined time interval. A deep learning based cloud workload prediction algorithm is presented in [23], in which a top-sparse auto-encoder (TSA) is designed to effectively extract the essential representations of workloads from the original high-dimensional workload data, combined with a recurrent neural network. These methods promise high accuracy given large training data for dynamic workload prediction; however, their limitation is that they are complex and time consuming for small training datasets.

2.2. Hybrid models

Control theory based prediction models apply a feedback mechanism to deal with unpredictable changes [24]. Wu et al. [25] proposed a feedback control algorithm in which different combinations of cost and resource benefits are computed and the best combination is chosen for the highest profit and least cost. There are also fuzzy controllers driven by rule-based approaches [26]. However, the limitation is that control theory prediction models lack learning and adaptation capability.

The queuing network prediction model maps the relation between the workload and performance criteria, where the server acts like a queuing system [27]. Zhang et al. [28] presented a regression method to predict CPU demands, whose outcome is used to parameterize a queuing network. The benefit of queuing models is that they do not need a training phase, but they are sensitive to parameterization. Later, [5] proposed a sequential pattern mining approach that extracts episode-based behavioral patterns of workload; its limitation is that there is no way to measure the fitness or accuracy of the predicted outcome. Herbst et al. [29] performed host load prediction using an artificial neural network, decision tree, support vector machine, ARIMA model, Bayesian network, and cubic smoothing splines for proactive workload prediction. The advantage is that this approach raises prediction accuracy; however, it is complex and more time consuming.

Cao et al. [30] applied historical data monitoring to predict future server load using a machine learning method, Random Forest. This scheme outperforms time series analysis methods in terms of workload prediction accuracy, because machine learning models do not learn directly from lag observations as time series models do; instead, they learn the pattern of a large input sequence that is most relevant to the prediction problem.

3. Proposed adaptive workload forecasting model

Resource provisioning at a cloud data center is driven by the demands of its users. The proposed model is shown in Figure 1, which illustrates that prior prediction of demands at the cloud data center governs the resource management process. The historical workload from the data center provides training and testing data to the neural network-based prediction system. This data is pre-processed to generate training input and output data. The training input data initializes the neurons at the input layer of the neural network. Then, the Auto Adaptive Differential Evolution (AADE) learning algorithm trains the neural network and optimizes the network weight connections. This enables learning of the dynamic patterns extracted from the training input data over several generations (or epochs), finally generating the predicted workload as output. The training output data serves as the actual workload, which is compared with the predicted workload to evaluate the accuracy of the model under training. If the error score is reduced to the desired level, the model is deployed as the trained workload prediction model; otherwise, the training process continues. The test data (or unseen data) is further used to analyze the accuracy of the trained prediction model. After completion of the training and testing phases, the trained model is ready to be deployed at the cloud data center to forecast the future trend of user demands, to the benefit of efficient resource management. Additionally, live data is fed to the trained prediction model so that it adapts dynamically and predicts accurate workload information over consecutive periods.

The proposed approach comprises a workload pre-processing stage and training of the neural network using the AADE learning algorithm, improved with three-dimensional adaptation, as described in the subsequent subsections.

3.1. Pre-processing

The historical data received from the cloud data center contains noise and redundant entries, which are filtered at the pre-processing stage. The current forecasting approach focuses on filtering only the useful attributes from the log of previous data, leaving out irrelevant information. For the reported experiments, we first extracted the "timestamp" attribute, at which each user request arrived, from the training data (the benchmark datasets mentioned in Section 4); secondly, we clustered and counted the timestamp values according to a particular prediction interval (e.g. 1 min, 2 min, ..., 60 min). The counted values are then arranged such that a training input, in the form of an aggregate of i values, produces the (i + 1)th value as the predicted output during the training stage. For instance, seven historical input values predict the eighth value as output. The final forecast is the workload in the form of the frequency of requests arriving at the data center during a specific time interval (the prediction interval). Algorithm 1 illustrates the procedure for this pre-processing of historical data.

3.2. Auto adaptive pattern learning by neurons

The proposed adaptive neural network model maps multiple inputs to a single output as p−q−r, where p, q, and r stand for the number of neurons in the input, hidden, and output layers respectively, as illustrated in Figure 2. Training begins by randomly generating L different sets of synaptic weight connections, denoted α, of size (p + 1) × q + (q × r) = q(p + r + 1) = q(p + 2), since r = 1 in the proposed work. One extra input is added to the input neurons to take up the bias value.

The neural network predicts useful information by correlating and learning useful patterns from the training input data. This training process requires optimization of the network weight connections. Workload arrival at the data center is dynamic: there may be spikes in one second and zero requests in the next. Therefore, an AADE learning algorithm, a variant of the learning algorithm presented in [31], is developed and utilized to train the neural network-based prediction model. Generally, the DE algorithm begins with the initialization of a solution search space containing L population vectors P ∈ {P1, P2, ..., PL}, where each Pi ∈ {δ1, δ2, ..., δl}. The solution space is then optimized by the mutation and crossover operators.
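As a concrete illustration of the pre-processing described in Section 3.1, the sketch below bins request timestamps by prediction interval and builds i-input/(i + 1)th-output training pairs. It is a minimal approximation, not a reproduction of the paper's Algorithm 1; the function name, toy trace, and window size are illustrative assumptions.

```python
from collections import Counter

def preprocess(timestamps, interval_s, window=7):
    """Aggregate raw request timestamps (in seconds) into per-interval
    request counts, then build sliding-window training pairs in which
    `window` consecutive counts predict the next count."""
    # Cluster timestamps into prediction intervals and count requests.
    counts_by_bin = Counter(t // interval_s for t in timestamps)
    n_bins = max(counts_by_bin) + 1
    series = [counts_by_bin.get(b, 0) for b in range(n_bins)]
    # i historical values -> (i + 1)th value as the training target.
    inputs = [series[i:i + window] for i in range(len(series) - window)]
    targets = [series[i + window] for i in range(len(series) - window)]
    return series, inputs, targets

# Toy trace: request arrival times in seconds, 60 s prediction interval.
ts = [0, 5, 10, 65, 70, 130, 135, 140, 200, 210, 260, 305, 330, 370]
series, X, y = preprocess(ts, interval_s=60, window=3)
```

With the default `window=7`, this matches the paper's example of seven historical values predicting the eighth.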
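The weight-vector size C = q(p + 2) for r = 1 can be checked numerically. The sketch below encodes one candidate p−q−1 network as a flat vector and runs a forward pass; the flat weight layout and the sigmoid activation are our assumptions, since the paper does not specify them.

```python
import math
import random

def weight_count(p, q, r=1):
    # (p + 1) inputs (one extra for the bias) to q hidden, q hidden to r output.
    return (p + 1) * q + q * r

def forward(weights, x, p, q):
    """Forward pass of a p-q-1 network encoded as a flat weight vector.
    Layout assumption: the first (p + 1) * q entries are input-to-hidden
    weights, the remaining q entries are hidden-to-output weights."""
    xb = x + [1.0]                                   # append bias input
    hidden = []
    for h in range(q):
        w = weights[h * (p + 1):(h + 1) * (p + 1)]
        s = sum(wi * xi for wi, xi in zip(w, xb))
        hidden.append(1.0 / (1.0 + math.exp(-s)))    # sigmoid (assumed)
    out_w = weights[(p + 1) * q:]
    return sum(w * h for w, h in zip(out_w, hidden))

p, q = 10, 12                          # best setting reported in Table 2
C = weight_count(p, q)                 # q(p + 2) = 12 * 12 = 144
random.seed(0)
net = [random.uniform(-1, 1) for _ in range(C)]
y = forward(net, [0.1] * p, p, q)
```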
adaptive manner. We apply two distinct kinds of crossover strategies, as follows:

• Heuristic crossover. Heuristic crossover compares the fitness values of both parent chromosomes and uses the parent with the better fitness value to produce a new offspring, as depicted in Equation (6). It can bring significant diversity into the search space, adding promising genetic material by generating new offspring closer to the parent with better fitness [36]. This phenomenon is illustrated in Figure 3; this crossover operator therefore improves the exploitation capability of the DE learning algorithm. Let δ1 and δ2 be two parent chromosomes, defined as δ1 ∈ {δ11, δ12, ..., δ1L} and δ2 ∈ {δ21, δ22, ..., δ2L}. Both parents are evaluated using the fitness function (we use the root mean square method); let δ1 be the parent chromosome with the better fitness value. Then the offspring δOf, denoted δOf ∈ {δOf1, δOf2, ..., δOfL}, for the next generation is generated as follows:

  δOfi = Cri × (δ1i − δ2i) + δ1i    (6)

where Cri is the crossover rate of the ith gene, generated randomly in the range [0, 1] with i = {1, 2, ..., L}; δOfi is the ith gene of the new offspring; and δ1i and δ2i are the ith genes of parents δ1 and δ2 respectively.

• Uniform crossover. In this approach, crossover occurs at the gene level instead of the segment level: a random number z in the range [0, 1] is generated for each gene of the parent chromosome. If the crossover rate Cri of the ith solution (parent) vector is greater than the random value generated for the gene, the gene values of the two parents are exchanged; otherwise the same gene continues into the new offspring, as shown in Figure 4. This crossover technique allows exploration of both parent chromosomes at a fine-grained level and produces two new offspring. Since two child chromosomes are produced in each generation, we evaluate the fitness of both children and select the child with the maximum fitness (least error score) to proceed to the successive generation [37]. Equation (7) shows the selection procedure for the new offspring:

  ν = δi^j, if z ∈ (0, 1) ≤ Cri^j; αi^j, otherwise    (7)

Let σ1 and σ2 be the probabilities of selecting the uniform and heuristic crossover strategies respectively. Similar to the mutation scheme selection, roulette wheel selection, given in Equation (8), is applied to select the appropriate crossover strategy, where ηc represents the crossover selection and css is the crossover selection scheme:

  ηc = Uniform crossover, if 0 < cssi ≤ σ1; Heuristic crossover, otherwise    (8)

3.2.3. Control parameters adaptation

The adaptation of the control parameters, namely the crossover rate (Cr) and the mutation rate (M), guides the convergence speed of the proposed AADE algorithm. Control parameter values are adjusted in accordance with the success status of each candidate of the population during the evolution process. Before the next generation begins, we count the number of candidates updated in the previous generation, denoted g. The control parameters Cr and M are updated according to Equations (9) and (10), where random1 and random2 are uniform random numbers in the range [0, 1]; Mi^(j+1) is the mutation rate of the ith population vector in the next generation, with Ml = 0.1 and Mu = 0.8 the lower and upper bounds for mutation; similarly, Cri^(j+1) is the next-generation crossover rate of the ith population vector, with Crl = 0.1 and Cru = 0.5 the lower and upper limits for the crossover operator [38]. If the value of g is less than Z, where Z is set to 2/5 of the original population size (that is, if at least two-fifths of the total population was not updated with the last values of M and Cr), then the mutation and crossover rates are regenerated. This prevents premature convergence. The number of generations elapsed in upgrading the values of M and Cr is known as the 'learning period' of the control parameters.

  Mi^(j+1) = Ml + random1 × (Mu − Ml), if g ≤ Z; Mi^j, otherwise    (9)

  Cri^(j+1) = Crl + random2 × (Cru − Crl), if g ≤ Z; Cri^j, otherwise    (10)

During each generation (or epoch), following mutation and crossover, we keep track of the number of candidates successfully reaching the next generation, denoted sm1, sm2, sm3, and sm4 for the four mutation strategies; similarly, fm1, fm2, fm3, and fm4 record the number of candidates that failed to reach the next generation. The probabilities of successful offspring generated by the DE/random/2, DE/best/2, DE/current-to-best/1, and DE/current-to-random/1 mutation strategies are computed as ρ1, ρ2, ρ3, and ρ4 in Equation (11):

  b = 2(sm1·sm2 + sm1·sm3 + sm1·sm4 + sm2·sm3 + sm2·sm4 + sm3·sm4)
      + fm1(sm2 + sm3 + sm4) + fm2(sm1 + sm3 + sm4)
      + fm3(sm1 + sm2 + sm4) + fm4(sm1 + sm2 + sm3)

  ρ1 = sm1(sm2 + fm2 + sm3 + fm3 + sm4 + fm4) / b
  ρ2 = sm2(sm1 + fm1 + sm3 + fm3 + sm4 + fm4) / b
  ρ3 = sm3(sm1 + fm1 + sm2 + fm2 + sm4 + fm4) / b
  ρ4 = 1 − (ρ1 + ρ2 + ρ3)    (11)

The probabilities of successful offspring generated by the uniform and heuristic crossover strategies are computed as σ1 and σ2 in Equation (12). Similar to mutation, for crossover too we keep track of the numbers of successful candidates, cs1 and cs2, and failed candidates, cf1 and cf2, reaching the next generation, which are used to compute σ1 and σ2:

  c = 2(cs1·cs2) + cf1·cs2 + cf2·cs1
  σ1 = cs1(cs2 + cf2) / c
  σ2 = 1 − σ1    (12)

3.2.3.1. Selection. The fitness function selected for the evaluation of prediction accuracy is the root mean square error (RMSE) score depicted in Equation (13), where m is the number of data samples and zactual and zprediction are the actual and predicted outputs respectively:

  RMSE = sqrt((1/m) Σ_{i=1..m} (zactual − zprediction)²)    (13)

Since accuracy is inversely proportional to RMSE, the purpose is to minimize the fitness function. The population for the next generation is selected with a greedy, survival-of-the-fittest approach using Equation (14), where δi^(j+1) is the selected candidate for the next generation, νi^j is the solution generated after crossover, and δi^j is the currently existing candidate solution:

  δi^(j+1) = νi^j, if fitness(νi^j) ≤ fitness(δi^j); δi^j, otherwise    (14)

Algorithms 2 and 3 present the process of training (pattern learning) of the neural network using the auto adaptive DE algorithm: Algorithm 2 gives the operational summary of the AADE algorithm, and Algorithm 3 gives the workload prediction main module.

Algorithm 2 Auto adaptive differential evolution algorithm
1: function AADE(Listg, Cr, M, Crl, Cru, Ml, Mu, ρ1, ρ2, ρ3, ρ4, σ1, σ2)
2:   for j = {1, 2, ..., L} do
3:     Generate mutually distinct β1, β2, β3, β4, β5 ≠ j ∈ [1, L] and Krand ∈ [1, C]
4:     if 0 < mssj ≤ ρ1 then
5:       Apply DE/random/2 mutation strategy.
6:     else if ρ1 < mssj ≤ (ρ1 + ρ2) then
7:       Apply DE/best/2 mutation strategy.
8:     else if (ρ1 + ρ2) < mssj ≤ (ρ1 + ρ2 + ρ3) then
9:       Apply DE/current-to-best/1 mutation strategy.
10:    else
11:      Apply DE/current-to-random/1 mutation strategy.
12:    end if
13:    if 0 < cssj ≤ σ1 then
14:      Apply uniform crossover strategy.
15:    else
16:      Apply heuristic crossover strategy.
17:    end if
18:  end for
19:  Evaluate the newly generated solutions using the fitness function.
20:  Select participants for the next generation from the offspring vector and the current population.
21:  return the list of updated networks for the successive generation.
22: end function

Algorithm 3 Proposed workload forecasting algorithm
1: Input: HistoricalWorkload, trainSize
2: Output: Predicted output
3: [trainInput, trainOutput, testInput, testOutput] ⇐ FP(HistoricalWorkload, trainSize)
4: Initialize L networks, each of size C = ((p + 1) × q) + (q × r) = q(p + r + 1)
5: Initialize Cr = M = 0.5, Crl = 0.1, Cru = 0.5, Ml = 0.1, Mu = 0.8, ρ1 = ρ2 = ρ3 = ρ4 = 0.25, σ1 = σ2 = 0.5
6: Evaluate each network on the training data using the objective function.
7: for each generation g = {1, 2, ..., Gmax} do
8:   Generate vectors mssg and cssg of L random numbers ∈ [0, 1].
9:   Listg+1 ⇐ AADE(Listg, Cr, M, Crl, Cru, Ml, Mu, ρ1, ρ2, ρ3, ρ4, σ1, σ2)
10:  Update ρ1, ρ2, ρ3, ρ4, σ1 and σ2.
11:  Update Cr and M.
12: end for
13: Select the network having the minimum error score:
14: W ⇐ MAXIMUM_FITNESS(Listg+1)
15: trainedModel ⇐ W
16: Analyze the trained model over the test data.

4. Performance evaluation

4.1. Experimental set-up

The simulation experiments are executed on a server machine assembled with two Intel® Xeon® Silver 4114 CPUs with 40 cores and a 2.20 GHz clock speed. The machine is deployed with 64-bit Ubuntu 16.04 LTS and has 128 GB of main memory. We have implemented the proposed work in Python 3. Table 1 lists the different parameters and their values used in the experimental set-up.

Table 1. Experimental set-up parameters and their values.

Parameter                  Value
Input neural nodes (p)     10
Hidden layer nodes (q)     7–20
Output layer nodes (r)     1
Maximum epochs (Gmax)      250
Size of training data      75%
Population size (L)        15
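Since the implementation language reported in Section 4.1 is Python 3, the crossover operators of Equations (6) and (7) and the rate adaptation of Equations (9) and (10) can be sketched as follows. The function names and condensed structure are ours, and the full AADE bookkeeping (strategy probabilities, roulette-wheel selection, survivor selection) is omitted.

```python
import random

def heuristic_crossover(better, worse, cr):
    # Equation (6): each offspring gene is pulled toward the fitter parent.
    return [cr_i * (b - w) + b for b, w, cr_i in zip(better, worse, cr)]

def uniform_crossover(parent1, parent2, cr_rate):
    # Equation (7): gene-level exchange whenever the crossover rate
    # exceeds a fresh random number z drawn for that gene.
    child1, child2 = [], []
    for g1, g2 in zip(parent1, parent2):
        if random.random() <= cr_rate:      # exchange genes
            child1.append(g2); child2.append(g1)
        else:                               # keep genes
            child1.append(g1); child2.append(g2)
    return child1, child2

def adapt_rate(rate, g, Z, lo, hi):
    # Equations (9)/(10): regenerate the rate when too few candidates
    # (g <= Z) were updated in the previous generation; otherwise keep it.
    return lo + random.random() * (hi - lo) if g <= Z else rate

L = 15                 # population size from Table 1
Z = 2 * L / 5          # two-fifths of the population
random.seed(1)
M = adapt_rate(0.5, g=4, Z=Z, lo=0.1, hi=0.8)   # 4 <= 6: regenerated
Cr = adapt_rate(0.5, g=9, Z=Z, lo=0.1, hi=0.5)  # 9 > 6: kept
```

Selecting the fitter of the two uniform-crossover children, and the survivor selection of Equation (14), would sit on top of these primitives.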
4.2. Data sets

The benchmark data sets opted for the experiments are the NASA and Saskatchewan internet request traces taken from [39]. These internet traffic traces are available as ASCII files with one HTTP request per row. The NASA traces contain two months of HTTP requests recorded at the busy NASA Kennedy Space Center WWW server in Florida. The first log was collected from 00:00:00 July 1, 1995 through 23:59:59 July 31, 1995, a total of 31 days; the NASA trace contains 1,891,715 HTTP requests. The Saskatchewan data set is an HTTP request log of seven months recorded at a University WWW server. The attributes of the data set are Host, Timestamp, HTTP request, HTTP reply, and Bytes sent in the reply. This log was collected from 00:00:00 June 1, 1995 through 23:59:59 December 31, 1995, a total of 214 days, during which there were 2,408,625 requests. Furthermore, it must be noted that the timestamps have a 1 second resolution.

4.3. Results and analysis

To analyze the performance of the proposed work, numerous experiments were conducted with different variations. The authors executed experiments for different time intervals (1 min, 10 min, and so on). In the observed simulation, the sample input data is partitioned into training data (75%), and the rest of the data (25%) is used to test the prediction model. Each experiment was executed 20 times, and the mean of the obtained results is reported in the article. In addition, the performance of the proposed work is analyzed for different numbers of hidden layer neurons; the most efficient outcome was obtained with 12 hidden neurons, as reported in Table 2. The numbers of input and hidden neurons are chosen in the ratio 5:6.

Table 2. RMSE, convergence rate and execution time elapsed for different combinations of input nodes and hidden nodes for the 10 min prediction interval.

Input nodes   Hidden nodes   RMSE     Time elapsed (sec)   Convergence rate
10            7              0.0262   172.83               14
10            10             0.0174   179.03               16
10            12             0.0138   161.72               21
10            15             0.0197   175.55               18
10            20             0.0718   203.65               21

We noticed during the experimental analysis that, as the number of neurons in the hidden layer increases, the RMSE value decreases from 7 to 12 hidden nodes and then increases. The reason is that as we add nodes to the hidden layer, the network grows; up to 12 hidden nodes, learning efficiency rises and reaches its optimum (where the error score is at its minimum). Beyond that, the larger network memorizes the training set rather than correlating or learning patterns, and therefore produces over-fit or under-fit outcomes for the same data samples, which consequently increases the RMSE (error) value (Figure 4).

Figure 5 shows the actual and predicted workload during training and testing for the NASA traces for prediction intervals of 1 min, 5 min, 10 min, up to 60 min. The reported graphs have prediction error scores of 0.010, 0.012, 0.011, 0.009, 0.083, and 0.045 for the 1 min, 5 min, 10 min, 20 min, 30 min, and 60 min prediction intervals during the training phase. Figure 6 depicts the corresponding graphs for the Saskatchewan traces, with RMSE scores of 0.0020, 0.025, 0.047, 0.0172, 0.0167, and 0.057 for prediction intervals from 1 min, 5 min, and so on, up to 60 min. The outcome of the proposed prediction model clearly depicts the close proximity between the actual and predicted workload at the various time intervals. The predicted workload is the same as, or near, the actual workload in most of the experiments. The experimental observations revealed that the RMSE value increases (accuracy decreases) as the time interval grows from 1 min to 60 min; the reason is that with a bigger prediction interval, the model under training gets fewer data samples for pattern recognition or learning than with a smaller prediction interval on the same benchmark dataset.

Figure 5. Predicted workload vs actual workload during the training and testing phases of the NASA traces. (a) Prediction interval = 1 min. (b) Prediction interval = 5 min. (c) Prediction interval = 10 min. (d) Prediction interval = 20 min. (e) Prediction interval = 30 min. (f) Prediction interval = 60 min.

Figure 6. Predicted workload vs actual workload during the training and testing phases of the Saskatchewan trace. (a) Prediction interval = 1 min. (b) Prediction interval = 5 min. (c) Prediction interval = 10 min. (d) Prediction interval = 20 min. (e) Prediction interval = 30 min. (f) Prediction interval = 60 min.

4.4. Comparison

The proposed prediction model is compared with the self-adaptive differential evolution (SaDE) based workload prediction scheme of [21] and with the average and backpropagation-based prediction models of [16]. Table 3 illustrates the superiority of the proposed work in terms of accuracy improvement, concluded from the reduced root mean square error (RMSE) of the proposed prediction model compared with the other existing models.

Table 3. Comparison of root mean square error of the existing and proposed approaches.
Root Mean Square Error (RMSE)

We obtained accuracy improvements of up to 94.8% over SaDE, 97.4% over backpropagation, and 97.6% over the average learning scheme for the NASA traces, and of 92.2% over SaDE, 98.1% over BP, and 98.9% over the average method for the Saskatchewan traces, as discussed in Table 4. The accuracy improvement percentage is evaluated by applying Equation (15):

  Accuracy improvement = ((RMSE_ES − RMSE_PS) / RMSE_ES) × 100    (15)

where ES and PS represent the existing scheme and the proposed scheme respectively. The RMSE value is greatly reduced for every time interval except the 1 minute prediction interval. This is due to the adaptive learning algorithm, which is able to draw counter-intuitive patterns, develop correlations between various patterns, and perform extensive exploration throughout the search space.

Table 4. Percentage-wise accuracy improvement of the proposed method over existing workload prediction approaches.
Accuracy Improvement Percentage
BP   SaDE   Proposed

Table 5 shows the time consumed in training the proposed model, SaDE, and backpropagation for both data sets. The training time with SaDE and AADE is greater than that required by backpropagation, because evolutionary algorithms work with a population of solution vectors rather than the single-solution approach of backpropagation.

In addition, we compared the convergence speed, i.e. the number of iterations spent in training the prediction model, for various prediction intervals, as illustrated in Table 6. The number of epochs needed to train the neural network using the AADE algorithm is always lower than for the other existing algorithms, for both datasets and every time interval. Therefore, the proposed work converges faster but does not converge prematurely, since its prediction accuracy improves on the other state-of-the-art prediction models, promising more worthy and pragmatic results for prior estimation of upcoming workload at cloud data centers.

Table 6. Generations elapsed in training: backpropagation and SaDE vs AADE.

Generations (or epochs) consumed in training
                        BP [16]      SaDE [21]    Proposed
Prediction Time (min)   NS    SK     NS    SK     NS    SK
1                       250   250    47    61     21    20
5                       250   250    39    77     20    21
10                      250   250    51    51     19    20
20                      250   250    47    51     20    18
30                      250   250    26    42     18    19
60                      250   250    26    21     21    19

5. Conclusion

Due to variations in customer demand patterns (sudden rises and falls in resource requests), the workload arriving at cloud data centers is highly dynamic, noisy, and redundant. To effectively handle the dynamic resource provisioning task, an accurate workload forecast made in advance can help the resource manager achieve proper resource utilization and reduce extravagant power consumption at data centers. However, it is challenging to correctly predict the dynamic nature of the workload. Therefore, to resolve this issue, this article presents an auto-adaptive neural network to forecast the workload with
precision by adapting in advance to behavioral changes in user demand patterns. The AADE learning algorithm, an improved version of the original DE algorithm extended with mutation, crossover, and parameter-tuning adaptation, is applied to train the neural network. This learning approach allows the neural network to develop correlations based on operational changes in the extracted workload patterns and to learn the dynamic behavior of user demands effectively. The experimental analysis of the proposed work uses two benchmark data sets, the NASA and Saskatchewan HTTP internet traffic traces. The results show a significant improvement in prediction accuracy, up to 99%, and the proposed prediction model surpasses the other state-of-the-art neural network workload prediction approaches in terms of error reduction and convergence speed. The future aim of this work is to further raise the quality of prediction by determining the dynamic peaks in the upcoming workload, and to train the neural network model with other learning algorithms for the prediction of bandwidth requirements and power consumption at cloud data centers.

Acknowledgments

This work is financially supported by National Institute of Technology Kurukshetra, Haryana, India.

Disclosure statement

No potential conflict of interest was reported by the authors.

Funding

This work is financially supported by National Institute of Technology Kurukshetra, Haryana, India, granted by the Ministry of Education, Government of India (MHRD).

Notes on contributors

Deepika Saxena received her M.Tech degree in Computer Science and Engineering from Kurukshetra University, Kurukshetra, Haryana, India in 2014. Currently, she is pursuing her Ph.D at the Department of Computer Applications, National Institute of Technology (NIT), Kurukshetra, India. Her major research interests are predictive analytics, evolutionary algorithms, scheduling, and security in cloud computing.

Ashutosh Kumar Singh is working as a Professor and Head of the Department of Computer Applications, National Institute of Technology Kurukshetra, India. He has more than 18 years of research and teaching experience at various universities in India, the UK, and Malaysia. He received his PhD in Electronics Engineering

References

[2] Saxena D, Vaisla KS, Rauthan MS. Abstract model of trusted and secure middleware framework for multi-cloud environment. In: International Conference on Advanced Informatics for Computing Research. Springer; 2018. p. 469–479.
[3] Saxena D, Chauhan R, Kait R. Dynamic fair priority optimization task scheduling algorithm in cloud computing: concepts and implementations. Inter J Comput Netw Inform Secur. 2016;8(2):41.
[4] Saxena D, Chauhan S. A review on dynamic fair priority task scheduling algorithm in cloud computing. Inter J Sci Envir Technol. 2014;3(3):997–1003.
[5] Amiri M, Mohammad-Khanli L, Mirandola R. An online learning model based on episode mining for workload prediction in cloud. Future Gener Comp Sy. 2018;87:83–101.
[6] Zhang Q, Yang LT, Yan Z, et al. An efficient deep learning model to predict cloud workload for industry informatics. IEEE Trans Industr Inform. 2018;14(7):3170–3178.
[7] Khan A, Yan X, Tao S, et al. Workload characterization and prediction in the cloud: a multiple time series approach. In: 2012 IEEE Network Operations and Management Symposium. IEEE; 2012. p. 1287–1294.
[8] Amekraz Z, Hadi MY. Higher order statistics based method for workload prediction in the cloud using ARMA model. In: 2018 International Conference on Intelligent Systems and Computer Vision (ISCV). IEEE; 2018. p. 1–5.
[9] Calheiros RN, Masoumi E, Ranjan R, et al. Workload prediction using ARIMA model and its impact on cloud applications' QoS. IEEE Trans Cloud Comput. 2015;3(4):449–458.
[10] Nielsen H, Brunak S, von Heijne G. Machine learning approaches for the prediction of signal peptides and other protein sorting signals. Protein Eng. 1999;12(1):3–9.
[11] Shirzad E, Saadatfar H. Job failure prediction in Hadoop based on log file analysis. Inter J Comput Appl. 2020;1–10.
[12] Saxena D, Singh AK. Energy aware resource efficient (EARE) server consolidation framework for cloud datacenter. In: Advances in Communication and Computational Technology. Springer; 2020. p. 1455–1464.
[13] Saxena D, Singh A. Security embedded dynamic resource allocation model for cloud data centre. Electron Lett. 2020;56(20):1062–1065.
[14] Cetinski K, Juric MB. AME-WPC: advanced model for efficient workload prediction in the cloud. J Netw Comput Appl. 2015;55:191–201.
[15] Islam S, Keung J, Lee K, et al. Empirical prediction models for adaptive resource provisioning in the cloud. Future Gener Comp Sy. 2012;28(1):155–162.
[16] Lu Y, Panneerselvam J, Liu L, et al. RVLBPNN: a workload forecasting model for smart cloud computing. Sci Program. 2016;2016.
[17] Liu C, Liu C, Shang Y, et al. An adaptive prediction approach based on workload pattern discrimination in the cloud. J Netw Comput Appl. 2017;80:35–44.
[18] Duy TVT, Sato Y, Inoguchi Y. Improving accuracy of host load predictions on computational grids by artificial neural networks. Int J Parallel Emergent Distrib Syst. 2011;26(4):275–290.
[19] Prevost JJ, Nagothu KM, Kelley B, et al. Prediction of cloud data center networks loads using stochastic and neural models. In: 2011 6th International Conference on System of Systems Engineering. IEEE; 2011. p. 276–281.
from Indian Institute of Technology, BHU, India and Post Doc from Depart- [20] Kumar J, Singh AK. Dynamic resource scaling in cloud using neural network
ment of Computer Science, University of Bristol, UK. He is also Charted Engineer and black hole algorithm. In: 2016 Fifth International Conference on Eco-
from UK. His research area includes Verification, Synthesis, Design and Testing friendly Computing and Communication Systems (ICECCS); IEEE; 2016. p.
of Digital Circuits, Data Science, Cloud Computing, Machine Learning, Security, 63–67.
Big Data. He has published more than 160 research papers in different jour- [21] Kumar J, Singh AK. Workload prediction in cloud using artificial neu-
nals, conferences and news magazines. He is the co-author of six books which ral network and adaptive differential evolution. Future Gener Comp Sy.
includes ‘Web Spam Detection Application using Neural Network’, ‘Digital Sys- 2018;81:41–52.
tems Fundamentals’ and ‘Computer System Organization & Ar- chitecture’. He [22] Zhu Z, Fan P. Machine learning based prediction and classification of com-
has worked as an Editorial Board Member of International Journal of Networks putational jobs in cloud computing centers. arXiv preprint arXiv:190303759.
and Mobile Technologies, International journal of Digital Content Technology 2019.
and its Applica- tions. Also he has shared his experience as a Guest Editor for [23] Chen Z, Hu J, Min G, et al. Towards accurate prediction for high-
Pertanika Journal of Science and Technology. He is involved in reviewing process dimensional and highly-variable cloud workloads with deep learning. IEEE
of different journals and conferences such as; IEEE transaction of computer, IET, Trans Parallel Distrib Syst. 2019;31(4):923–934.
IEEE conference on ITC, ADCOM etc. [24] Zhu X, Uysal M, Wang Z, et al. What does control theory bring to systems
research? ACM SIGOPS Oper Syst Rev. 2009;43(1):62–69.
[25] Wu H, Zhang W, Zhang J, et al. A benefit-aware on-demand provisioning
ORCID approach for multi-tier applications in cloud computing. Front Comput Sci.
Deepika Saxena http://orcid.org/0000-0002-9689-6387 2013;7(4):459–474.
[26] Amiri M, Mohammad-Khanli L. Survey on prediction models of appli-
Ashutosh Kumar Singh http://orcid.org/0000-0002-8053-5050 cations for resources provisioning in cloud. J Netw Comput Appl.
2017;82:93–113.
[27] Urgaonkar B, Shenoy P, Chandra A, et al. Dynamic provisioning of multi-tier
References internet applications. In: Second International Conference on Autonomic
[1] Kumar J, Saxena D, Singh AK, et al. Biphase adaptive learning-based neu- Computing (ICAC’05); Citeseer; 2005. p. 217–228.
ral network model for cloud datacenter workload forecasting. Soft Comput. [28] Zhang Q, Cherkasova L, Smirni E. A regression-based analytic model
2020;1–18. for dynamic resource provisioning of multi-tier applications. In: Fourth
International Conference on Autonomic Computing (ICAC’07); IEEE; 2007. p. 27–27.
[29] Herbst N, Amin A, Andrzejak A, et al. Online workload forecasting. In: Self-aware computing systems. Springer; 2017. p. 529–553.
[30] Cao R, Yu Z, Marbach T, et al. Load prediction for data centers based on database service. In: 2018 IEEE 42nd Annual Computer Software and Applications Conference (COMPSAC); Vol. 1; IEEE; 2018. p. 728–737.
[31] Price KV. Differential evolution: a fast and simple numerical optimizer. In: 1996 Biennial Conference of the North American Fuzzy Information Processing Society (NAFIPS); IEEE; 1996. p. 524–527.
[32] Wang S, Li Y, Yang H, et al. Self-adaptive differential evolution algorithm with improved mutation strategy. Soft Comput. 2018;22(10):3433–3447.
[33] Dawar D, Ludwig SA. Effect of strategy adaptation on differential evolution in presence and absence of parameter adaptation: an investigation. J Artif Intell Soft Comput Res. 2018;8(3):211–235.
[34] Iorio AW, Li X. Solving rotated multi-objective optimization problems using differential evolution. In: Australasian Joint Conference on Artificial Intelligence; Springer; 2004. p. 861–872.
[35] Zhang L, Chang H, Xu R. Equal-width partitioning roulette wheel selection in genetic algorithm. In: 2012 Conference on Technologies and Applications of Artificial Intelligence; IEEE; 2012. p. 62–67.
[36] Wright AH. Genetic algorithms for real parameter optimization. In: Foundations of genetic algorithms. Vol. 1. Elsevier; 1991. p. 205–218.
[37] Pavai G, Geetha T. A survey on crossover operators. ACM Comput Surv (CSUR). 2017;49(4):72.
[38] Wang YN, Wu LH, Yuan XF. Multi-objective self-adaptive differential evolution with elitist archive and crowding entropy-based diversity measure. Soft Comput. 2010;14(3):193.
[39] Arlitt MF, Williamson CL. Web server workload characterization: the search for invariants. SIGMETRICS Perform Eval Rev. 1996 May;24(1):126–137. Traces available at ftp://ita.ee.lbl.gov/html/.