
Prediction of Network Traffic in Wireless Mesh Net

The article discusses the prediction of network traffic in wireless mesh networks using a hybrid deep learning model, specifically a Convolution Neural Network and Long Short-Term Memory (Convo-LSTM) architecture. It presents a case study on predicting the performance of High-Speed Diesel pumps by analyzing sensor data, comparing various statistical and machine learning algorithms for accuracy. The proposed model demonstrates improved performance in traffic prediction, emphasizing the importance of historical data in enhancing network monitoring tasks.


This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI 10.1109/ACCESS.2022.3140646, IEEE Access

Digital Object Identifier xx.xx/ACCESS.2020.DOI

Prediction of Network Traffic in Wireless Mesh Networks using Hybrid Deep Learning Model
SMITA MAHAJAN1, HARIKRISHNAN R2, AND KETAN KOTECHA3
1 Dept. of Comp Science Engineering, Symbiosis Institute of Technology, Symbiosis International (Deemed) University, Pune, Maharashtra, India (e-mail: [email protected])
2 Dept. of Electronics and Communication Engineering, Symbiosis Institute of Technology, Symbiosis International (Deemed) University, Pune, Maharashtra, India (e-mail: [email protected])
3 Symbiosis Institute of Technology, Symbiosis International (Deemed) University, Pune, Maharashtra, India (e-mail: [email protected])
Corresponding author: Harikrishnan R (e-mail: [email protected]).

ABSTRACT Wireless mesh networks are being increasingly adopted in network communication. Their main benefits include adaptability, configurability, and flexibility, with added efficiency in cost and transmission time. Traffic prediction refers to forecasting the traffic volumes in a network. The traffic volume includes incoming requests and outgoing data transmitted by the network nodes. Previous traffic logs of the network are used to extract patterns that help make accurate predictions. In this paper, various existing traffic prediction methods are analyzed. Specifically, the analysis uses a case study in which the performance of a High-Speed Diesel (HSD) pump is predicted by observing its output. A network of sensors forms a wireless mesh network; the sensors act as nodes and read the parameters, namely three-phase current, voltage, temperature, and vibration. In this case study, the HSD pump's performance is predicted by taking the vibration parameter as the output parameter. The other parameters that affect the pump's performance by changing the vibration value are identified. Various algorithms, including the statistical Auto-Regressive Integrated Moving Average (ARIMA) and Poisson regression, and several machine learning and deep learning algorithms such as Decision Tree Regressor, Multi-Layer Perceptron, Linear Regression, and Long Short-Term Memory (LSTM), are implemented and evaluated for this purpose. Along with the comparison, a novel architecture using a Convolution Neural Network and Long Short-Term Memory is described in this paper. The results and their comparison show that the suggested novel Convo-LSTM model gives better performance and helps to predict the performance of the HSD pump. The proposed system makes a strong case for network traffic prediction using historical data collected over the wireless mesh network. By analogy, this model could be applied further to network monitoring tasks.

INDEX TERMS Deep Learning, Machine learning, Multivariate time series analysis, Prediction, Wireless
mesh networks

I. INTRODUCTION
Networks are playing an important role in this age of digital expansion. For a given network, the most critical issues are its security, load balancing ability, maintainability, and speed. Various network topologies exist, including bus, ring, star, mesh, and hybrid. Of these, mesh networks have been one of the most popular choices owing to their stronger connectivity and fewer disadvantages in terms of lag and rigidity [1]. Wireless mesh networks are wireless networks based on the mesh topology, as shown in Figure 1. The main advantages of wireless mesh networks are their easy adaptability and configurability. Any future changes can be easily accommodated, thus leading to lower costs and maintenance. The main concepts related to a wireless mesh network are traffic prediction, traffic routing, and traffic control. Of these, traffic prediction is crucial because it is the fundamental block on which the performance of routing and congestion control algorithms depends. Traffic prediction refers to accurately predicting the possible traffic in a network at a given instance based on


This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/

previous network data. An accurate estimate of the network traffic can help the network administrator improve the availability and transmission speeds of the network [2], [3]. Previous approaches for network traffic prediction have primarily focused on the host server logs along with the network parameter configuration [4]–[6].

FIGURE 1. A sample wireless mesh network

This paper compares the performance of six algorithms, namely Decision Tree Regressor, Linear Regression, Multi-Layer Perceptron, Poisson regression, Auto-Regressive Integrated Moving Average (ARIMA), and Long Short-Term Memory (LSTM). Of these, ARIMA is a well-known model [7] and Poisson regression is a probabilistic model [8]; both have traditionally been used in various applications. Decision Tree Regressor [9] and Linear Regression [10] are machine learning algorithms, whereas the Multi-Layer Perceptron [11] is a type of deep neural network and LSTM is an artificial recurrent neural network (RNN) architecture [12]. In this way, the paper compares the performance of different algorithms implemented on the same real-world data.

This paper proposes a new hybrid technique for network traffic prediction in wireless mesh networks by focusing on the historical data collected over the network. Specifically, the case study of High-Speed Diesel (HSD) pumps is considered, where different sensors collect the data values and the sensor readings are used to evaluate the performance. The installation of sensors (for reading various parameters) forms a mesh network, which is a good indicator of a typical mesh network traffic scenario. The sensors' mesh network is used to collect the data, which includes readings of input and output parameters. In wireless mesh networks, the mesh nodes, like MANET nodes, can form spontaneous connections with other nodes owing to their intrinsic features and can traverse the network, collecting data from sensors, RFID-enabled nodes, and other fixed wireless nodes [13]. Wireless Sensor Networks (WSNs) have been identified as a significant enabler of the IoT models since their inception. In IoT, all sensor nodes can get connected to the Internet to share and receive data [14].

A set of statistical and machine learning algorithms is used for the multivariate time series analysis of the collected data and subsequent output prediction. The traditional, fundamental, technical analysis approach is used by both statistical and non-statistical methods, with a significant focus on lag, first-order difference, and second-order difference. The stationarity and nature of the time series are more important in the fundamental analysis method [15]. A Convolution Neural Network (CNN) is commonly employed in feature engineering because it focuses on the most salient features of its input. Long Short-Term Memory (LSTM) is extensively used for time series because it retains and updates information along the time sequence [3].

In this paper, a novel Convo-LSTM architecture is proposed for the wireless mesh network's traffic prediction, citing a case study of the HSD pump. A vibration forecasting model based on a one-dimensional (1-D) CNN-LSTM is built considering the properties of CNN and LSTM. The data tuples are real-world data collected over a period of one year with constant monitoring of the network. The fundamental structure of the model is a hybrid of a one-dimensional (1-D) Convolution Neural Network and Long Short-Term Memory (LSTM). Its architecture has an input layer, a one-dimensional convolution layer, a pooling layer, an LSTM hidden layer, and a fully connected layer. The proposed model is evaluated in terms of Mean Square Error, Mean Absolute Error, and Root Mean Square Error to check its performance.

The main contributions of this paper are as follows:
1) A hybrid model for network traffic prediction is proposed for the wireless mesh networks formed by various sensors, with the HSD pump as a case study.
2) A set of statistical, non-statistical, deep learning, and machine learning algorithms is implemented, and the results are compared for the collected multivariate time series data.
3) After applying these time-series based algorithms, an unbiased analysis of their performance is done, which can be helpful for researchers in the domain of wireless mesh networks.
4) An unbiased analysis is given of the results that a researcher can expect when applying these multivariate time-series algorithms in the domain of wireless mesh networks.

The paper outline is as follows: Section II provides a review of the previous work done in this domain. Section III explains the data collection process and data description. Section IV elucidates the various analysis methods that are applied for traffic prediction. The obtained results are presented and analyzed in Section V, while the conclusion of our findings is given in Section VI.

II. BACKGROUND WORK
Network traffic prediction and routing have been a topic of interest and research in the last few decades. Approaches in this domain have used various time series models, namely machine learning algorithms, deep learning (neural networks), and various traditional statistical methods. Time series models like ARIMA have been preferred for forecasting, even in the case of regular networks. Zhou et al. [5] used a combination of ARIMA and GARCH models

for prediction and modeling of the network traffic. Wavelet-based transformation models have also been used for this purpose. Unlike in image or text data problems, neural networks made inroads into network analysis and prediction almost a decade earlier. Khotanzad et al. [16] were among the first to use neural networks for high-speed network traffic prediction. Alarcon et al. [17] applied a multi-resolution neural network to the task. Chen et al. [18] deployed a flexible neural tree for the resolution of traffic in small-scale networks. There have also been extensive studies comparing the performance of traditional methods with deep learning networks [19]. Vinayakumar et al. [20] applied various sequence models, including Long Short-Term Memory (LSTM) networks, recurrent neural networks (RNN), and identity recurrent neural networks (IRNN), to the prediction of network traffic.

Similar traditional and advanced approaches have also been applied to traffic prediction in the specific case of wireless networks. Among the earlier approaches, Gowrishankar and Satyanarayana [21] presented a neural network architecture for wireless network traffic prediction. Xiang et al. [22] proposed a hybrid ANN-based approach for this task. Nikravesh et al. [23] analyzed the use of multiple techniques, including SVM, MLP, and MLPWD. Ke Wang et al. [10] suggested a hybrid model in which the abilities of CNN and LSTM are combined for traffic flow prediction. Stefany Coxe et al. [24] worked on Poisson regression and its alternatives, stating that count data represent the number of times an activity occurred over a specific period. For example, the number of hyper-aggressive actions expressed by children while playing in a playground on particular occasions is count data. According to them, almost all Poisson regression models provide an easy way to analyze count data. Farhan Mohammad Khan and Rajiv Gupta used an Auto-Regressive Integrated Moving Average (ARIMA) model and compared the accuracy of its predictions [?] with that of a nonlinear autoregressive (NAR) neural network; the model was developed and implemented for the daily prediction of COVID-19 cases for the next 50 days.

Nie et al. [25], [26] have made multiple contributions in this domain in recent years. One of their initial approaches involved the use of deep belief networks for wireless mesh backbone networks. Recently, they further enhanced their work by applying reinforcement learning in an IoT setup [26]. Qiu et al. [27] deployed RNNs in a spatio-temporal sense to improve traffic prediction performance. Xu et al. [28] applied a multi-layer Gaussian framework to this task. Recently, researchers have also seen the rise of the attention mechanism [29] and of deep learning used in interdisciplinary ways [30]. This trend is slowly being reflected in traffic prediction. He et al. [31] applied a meta-learning scheme for faster traffic prediction in smaller networks. Li et al. [32] combined wavelet analysis with backpropagation neural networks for traffic flow analysis in wireless networks. Zhang et al. [33] considered a spatiotemporal network with a modified sequence model. Kim came up with an INGARCH model, an enhancement over the previous GARCH models [34]. Researchers have also used some statistical and evolutionary algorithms. Most notably, Li et al. [32] recently applied a Gaussian regressor with a Prophet model for user traffic prediction in networks. A quantum-PSO approach has also been tried in this domain [32].

Costa et al. give a thorough overview of predictive maintenance projects in Industry 4.0, identifying and classifying techniques, standards, and applications. Their survey's key contributions include a discussion of the existing issues and limitations in predictive maintenance, as well as a proposal for a new taxonomy to define this research topic in light of Industry 4.0 requirements [35]. According to Costa et al., the industry has entered a new era owing to the necessity to adapt to and adopt new technologies. The Internet of Things (IoT) is a recent paradigm for communication in which all kinds of objects in our daily lives, such as smartphones, sensors, or devices, are linked to network-enabled objects (such as RFID) to communicate with each other and become part of the Internet. Industry 4.0 is characterized by connectivity, data volume, tech gadgets, inventory reduction, customization, and controlled production [35]. Tan et al. have analysed recent advancements in smart monitoring and data analytics that have enabled infrastructure predictive maintenance (PdM). According to them, the industry is currently hesitant to adopt new smart monitoring sensors, information technologies, and data analytics to achieve PdM. PdM is data-driven, relying on smart monitoring and data analytics insights to prevent downtime through maintenance, protection, and repairs. PdM is a relatively new trend that has taken a leap in the industrial world since the 1990s; yet their recent analysis of the industry revealed that its applicability in infrastructure maintenance is quite limited [36]. Chuang et al. [37], while emphasizing the importance of predictive maintenance, state that business in all industries can be redefined with the emergence of AI and IoT. The information gathered is used not only to draw inferences from the past but also to forecast the future. Artificial neural networks and evolutionary algorithms are two of the most common AI techniques for machine diagnosis. According to them, predictive maintenance cuts down accidental device downtime, lowers maintenance costs, and extends the equipment life cycle, among other benefits.

The fundamental infrastructure of an IoT framework consists of sensors, actuators, computation servers, and the communication network [38]. The TCP/IP (Transmission Control Protocol/Internet Protocol) communication protocol transmits sensor data. The environmental sensing sensors are programmed into the programmable interface controller, and the data is stored historically [35], [37]. Pallavi et al. mention that because IoT devices are typically located in geographically separated places, they communicate primarily over wireless media. They also state that wireless channels are known for being unstable and having significant distortion levels. Communication techniques are essential for the

analysis of IoT devices. In this scenario, reliably transferring data without too many retransmissions is a significant concern. The fundamental infrastructure of an IoT framework consists of sensors, actuators, computation servers, and the communication network [38]. MANET is a vital element of the IoT network, serving as its backbone. MANET nodes with mesh architecture can form spontaneous connections with other nodes owing to their intrinsic features, requiring minimal infrastructure. MANET nodes can traverse the network, collecting data from sensors, RFID-enabled nodes, and other fixed wireless nodes. The MANET nodes take the most effective available route to connect with the Internet gateways. MANET nodes, like sensor nodes, can be employed as an essential technology in a variety of IoT applications. MANET nodes and sensor nodes (including RFID-enabled devices), forming a mesh network, can be deployed in huge numbers owing to their self-configuring nature. For the past decade, researchers have been working on the Internet of Things (IoT) using a variety of mature technologies such as Radio Frequency Identifiers (RFID), Wireless Sensor Network (WSN), Mobile Adhoc Network (MANET), and so on [39].

Nagarajan et al. have proposed remote health monitoring and data analysis by combining IoT and deep learning techniques. They have suggested a new IoT-based Fog-assisted cloud network architecture that collects real-time health care data from patients via numerous medical IoT sensor networks. The data is analyzed using a deep learning algorithm deployed on a Fog-based Healthcare Platform. Furthermore, they have proposed a methodology to analyze the process in real time for smart cities. According to them, for timely, accurate, and secure data analysis, the new IoT-based Fog-assisted cloud network architecture can be applied to various domains such as traffic analysis and management, agriculture and smart farming, and weather forecasting [14]. In the view of Manuel et al., WSNs have been identified as a significant enabler of the IoT models since their inception. In IoT, all sensor nodes can get connected to the Internet to share and receive data; however, in WSNs, the nodes do not have a direct Internet connection. To connect to the Internet, all nodes in the WSN need a mediator [40]. Krishnasamy et al. mention that a wireless sensor network (WSN) comprises a large number of sensor nodes that can both sense and communicate. The sensor nodes work together to gather and send the data to the sink node, also known as the coordinator node. The primary goal of sensor nodes is to observe the environment before processing and transferring data to an analysis centre. Sensors are frequently installed in locations that are uneven in design. Sensors are also installed randomly in specific sites that are irregular in shape, relying on the transmission range. As a result, an algorithm that can adapt to each geographic region with different deployment structures is required [41].

Deverajan Ganesh Gopal et al. have studied the dynamic ad hoc network (DANET), a network of multiple dynamic nodes that does not require any infrastructure. As and when needed, the movable nodes build a temporary network. This form of network has no centralized entity, so nodes need to rely on other nodes to send packets. Multiple copies of data are created to enhance data availability. The access frequency and node level are taken into account while allocating copies [42]. In the research by Devarajan, Ganesh Gopal et al., the Internet of Things (IoT) is a computational concept that envisions widespread Internet connectivity, transforming everyday objects into connected devices. The fundamental methodology in an IoT-based model is the transmission of billions or perhaps trillions of sensitive data capable of detecting the surrounding situation, communicating and transferring precise information, and then providing feedback to nature. Remote connections are frequently used to meet the adaptability and versatility required by IoT interchanges. While cellular innovations such as 3G/4G/5G provide interface separations of large devices, they necessitate infrastructure support and a legally allowed band. They have explained the concepts of IoT and IIoT, as well as the current trend of robotization and data exchange in manufacturing known as Industry 4.0 [43]. According to Farrukh et al., with the creation of highly accurate algorithms, investigation aims to focus on building more rigorous and practical methodologies [44].

It should be noted here that work on traffic prediction has rarely focused on historical data, which can eventually be used for traffic prediction. As a result, having models that are both resilient and appropriate is critical. Further, a standard paper highlighting all the contemporary machine learning and deep learning methods together could benefit young researchers in this domain. Also, there is hardly any notable work in which real-life data is collected over a given period, analyzed to identify its patterns, and eventually used to predict the performance of the network. All these points highlight the scope for improvement and the need for our research work.

III. DATA COLLECTION AND PREPROCESSING
Before describing the algorithms used for modeling the data, the data collection and preparation process is described below.

A. DATA COLLECTION
The data was collected for over a year using a set of HSD pumps. A total of eight sensors were placed for collecting data, forming a wireless mesh network. The collected data consists of seven input variables and one output variable. Each of these parameters is described in Table 1. The data consists of 8960 tuples, each containing one identifier (Date / Time), seven input variables (sensor readings for three-phase current, three-phase voltage, and temperature), and one output variable, the reading from the vibration sensor. The data is divided into training and test sets in the ratio of 4:1, i.e., an 80%-20% split. The data is preprocessed before being converted into a more algorithm-friendly format.
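The 4:1 division of the 8960 tuples described in the data collection subsection can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `chronological_split` is hypothetical, and the assumption that the split preserves time order (earliest tuples for training, latest for testing) is ours, since the paper states only the ratio.

```python
def chronological_split(rows, train_parts=4, test_parts=1):
    """Split time-ordered rows into (train, test) in a train_parts:test_parts ratio.

    Assumption (not stated in the paper): rows are sorted by timestamp and the
    split keeps that order, so the test set is the most recent segment.
    """
    if train_parts <= 0 or test_parts <= 0:
        raise ValueError("both parts of the ratio must be positive")
    # Integer arithmetic avoids floating-point rounding at the cut point.
    cut = len(rows) * train_parts // (train_parts + test_parts)
    return rows[:cut], rows[cut:]


if __name__ == "__main__":
    # Stand-in for the 8960 collected tuples; real rows would hold sensor readings.
    tuples = list(range(8960))
    train, test = chronological_split(tuples)
    print(len(train), len(test))  # 7168 1792
```

With 8960 tuples, a 4:1 ratio yields 7168 training and 1792 test tuples. Keeping the order intact avoids training on readings that come later in time than the ones being predicted.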

TABLE 1. Description of the collected data parameters

Parameter                          Description
Date / Timestamp                   The unique identifier indicating the timeframe when the data points were collected
Three Phase Current: IR, IY, IB    The three parameters corresponding to the three phase current in the system
Three Phase Voltage: VR, VY, VB    The three parameters corresponding to the three phase voltage in the system
Temperature                        Temperature of the sensor of the HSD pump
Vibration                          The output parameter corresponding to the traffic in the network
FIGURE 2. ADfuller Test Results

B. DATA PREPROCESSING
The following checks were performed on the data as a part of preprocessing:
1) Check the time series stationarity, that is, whether the time series is stationary or not.
2) Check whether a consistent mean and standard deviation exist in the collected data range. This is verified by plotting the mean and standard deviation values on a rolling window across the entire data range.
3) The ADFuller test for stationarity: The Augmented Dickey-Fuller (ADFuller) test is used to check how strongly a trend persists over the time series. This is achieved by setting up a null hypothesis and an alternative hypothesis. Results of the test are shown in Figure 2. Our null hypothesis is that the time series has a unit root and is non-stationary. The alternative hypothesis is that the series is stationary. The p-value for the null hypothesis is calculated, and the threshold for the p-value is set to 0.05. As the observed value is less than the threshold, the null hypothesis is rejected and it can be concluded that the series is indeed stationary.
4) Application of seasonal decompose: The seasonal decompose method is applied to get the three components used for setting up the stationary time series that is used for forecasting. Residuals, Seasonality, and Trend are the three components produced by this method (Figure 3).

FIGURE 3. The seasonal decompose

IV. ALGORITHMS USED FOR TRAFFIC PREDICTION
In general, there are two types of traffic modelling for short-term traffic prediction: parametric and non-parametric methods. To employ parametric methods, a well-structured but flexible family of models is first created, after which the model parameters must be estimated using training data. Forecasts can then be made using the model. This is a well-known method in traffic modelling, as seen in Figure 4. The Auto-Regressive Integrated Moving Average (ARIMA) is a prominent modelling tool for traffic forecasting [45]. Artificial intelligence-based regression models can provide the necessary capabilities. Regression, which is widely used in many disciplines, is a way of creating models capable of predicting the value of an output variable from a set of input variables. AI-based algorithms are frequently used for complex regression models. This type of regression approach automatically detects complex correlations between input variables and interactions between input variables and output variables [46].

Forecasting Internet traffic is thus critical for network planning, resource allocation, and the detection of network anomalies caused by attacks. This is because enhanced TCP/IP (Transmission Control Protocol/Internet Protocol) traffic forecasting can assist network providers in optimizing their resources. Better traffic predictions can help avoid congestion and resource waste in bandwidth allocation schemes. Short-term prediction and long-term prediction are the two categories of network traffic prediction. Short-term prediction estimates traffic conditions in the near future based on historical and present traffic data; its forecast horizon is only a few minutes long. Long-term prediction, on the other hand, provides traffic estimates for longer time periods, such as years. A traditional forecasting model such as the Poisson regression model (PRM) is used to model a count variable, and its parameters are usually estimated by the maximum likelihood estimation (MLE) method [47]. Auto-Regressive (AR) and Auto-Regressive Integrated Moving Average (ARIMA) models can capture the linear and Short Range Dependencies (SRD) between terms, but not the Long Range Dependencies (LRD), resulting in poor performance when used for Internet traffic forecasting. Nonetheless, they are widely used [48].

In this paper, both statistical and non-statistical algorithms are implemented for the task of output prediction. Each of these is described below.
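Before the individual algorithms, the parametric workflow outlined in this section (choose a model family, estimate its parameters from training data, then forecast) can be illustrated with the simplest autoregressive case. The sketch below fits a first-order model x[t] = c + phi * x[t-1] by ordinary least squares. It is an illustrative assumption, not the ARIMA configuration used in the paper, and the names `fit_ar1` and `forecast_ar1` are hypothetical.

```python
def fit_ar1(series):
    """Estimate (c, phi) in x[t] = c + phi * x[t-1] by ordinary least squares."""
    prev, curr = series[:-1], series[1:]
    n = len(prev)
    if n < 2:
        raise ValueError("need at least three observations")
    mean_prev = sum(prev) / n
    mean_curr = sum(curr) / n
    var_prev = sum((p - mean_prev) ** 2 for p in prev)
    if var_prev == 0:
        raise ValueError("series is constant; phi is not identifiable")
    cov = sum((p - mean_prev) * (q - mean_curr) for p, q in zip(prev, curr))
    phi = cov / var_prev              # slope of curr regressed on prev
    c = mean_curr - phi * mean_prev   # intercept
    return c, phi


def forecast_ar1(last_value, c, phi, steps=1):
    """Iterate the fitted recursion to produce multi-step forecasts."""
    forecasts, x = [], last_value
    for _ in range(steps):
        x = c + phi * x
        forecasts.append(x)
    return forecasts


if __name__ == "__main__":
    # Synthetic training data generated by x[t] = 1 + 0.5 * x[t-1].
    history = [0.0]
    for _ in range(20):
        history.append(1.0 + 0.5 * history[-1])
    c, phi = fit_ar1(history)
    print(c, phi, forecast_ar1(history[-1], c, phi, steps=3))
```

A model of this kind captures only short-range linear dependence, which is the limitation of AR-type models noted above.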

[Figure 4: sensor parameters (IR, IY, IB, VR, VY, VB, Temperature, Vibration); stages: Data Collection (wireless mesh network data) -> Data Preprocessing -> Apply Algorithm -> Model Evaluation -> Output Prediction]

FIGURE 4. Block diagram for Prediction model
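The model evaluation stage in Figure 4 relies on the three error measures named in the Introduction: Mean Square Error, Mean Absolute Error, and Root Mean Square Error. A minimal sketch of how they are computed from actual and predicted vibration values is given below. The function name `regression_errors` and the sample numbers are illustrative assumptions, not values from the paper.

```python
import math


def regression_errors(actual, predicted):
    """Return MSE, MAE and RMSE for two equal-length sequences of numbers."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    errors = [a - p for a, p in zip(actual, predicted)]
    mse = sum(e * e for e in errors) / len(errors)   # mean of squared errors
    mae = sum(abs(e) for e in errors) / len(errors)  # mean of absolute errors
    rmse = math.sqrt(mse)                            # same units as the data
    return {"mse": mse, "mae": mae, "rmse": rmse}


if __name__ == "__main__":
    # Made-up readings, used only to exercise the function.
    print(regression_errors([1.0, 2.0, 3.0], [1.0, 2.0, 5.0]))
```

MSE and RMSE penalize large deviations more heavily than MAE, so reporting all three gives a fuller picture of how a model's errors are distributed.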

A. DECISION TREE REGRESSOR
The Decision Tree Regressor (DTR) takes all the input features together and iteratively generates multiple trees, trying out possible combinations of the root, internal, and leaf nodes amongst all the features [9]. The tree with the closest predicted output to the actual traffic is considered the best tree for subsequent inference.
For each tree, for each node, two metrics can be calculated: Gini index and entropy. The Gini index is a probabilistic measure that indicates the probability that the particular feature being at the given node would lead to the prediction error crossing a set threshold. At a particular node index $n$, the Gini index for a feature $f_i$ is calculated as follows:
\[ \text{Gini Index}_n = 1 - \sum_{i=1}^{n} p^2(f_i) \tag{1} \]
where $p(f_i)$ indicates the probability of $f_i$ being present at node $n$. Ideally, the feature that produces the lowest Gini index for the particular node is assigned to that node. Overall, the aim in creating the decision tree is to reduce the entropy, i.e. the degree of randomness of possible traffic at each position. Entropy is defined as follows:
\[ E(S) = \sum_{i=1}^{n} -p_i \log_2 p_i \tag{2} \]
The lower the entropy, the more confident and accurate the predictions would be. The Decision Tree Regressor figures out the best tree setup by iteratively trying out multiple combinations of positions of the seed and internal nodes in the tree.

B. LINEAR REGRESSION
The Linear Regression model uses a linear mapping of features to get a continuous prediction output. It is one of the most primitive algorithms, represented mathematically as:
\[ H(x) = \sum_{i=1}^{n} w_i X_i + \epsilon \tag{3} \]
where $\epsilon$ is the error factor put in to accommodate normalization [10]. To approximate the given data, the regression and log-linear models can be employed. The data is modelled to match a straight line in (simple) linear regression. A dependent variable $y$, i.e. the response variable, can, for example, be described as a linear function of another random variable $x$, i.e. the predictor variable [10].
\[ y = wx + b \tag{4} \]

C. MULTI-LAYER PERCEPTRON
The multi-layer perceptron model is a type of deep learning architecture where there is a combination of hidden layers through which the input features are passed to derive the output value. Each node in the hidden layer assigns an importance weight to each input that it receives from its preceding layer, along with a bias value for error normalization. Generally, a fully connected setup is used wherein each node in a layer is connected to all nodes in its next layer [11]. Mathematically, the value at a particular node $i$ in a given layer $l$ is derived as follows:
\[ H(x)_i^{l} = \sum_{j=1}^{n} w_j H(x)_j^{l-1} + b \tag{5} \]
A rectified linear unit activation function is used at the output layer to get a continuous prediction value. After comparing the predicted value with the actual value, the network back-propagates, i.e. updates the initially randomized weights so that the predicted output matches the actual value as closely as possible.
\[ w = w - \alpha \, dw \tag{6} \]

D. POISSON REGRESSION
The Poisson regression is a probabilistic model [8]. It deploys a probability mass function (PMF) to check the probability of observing a particular output $y$ for a given input containing a set of predictor variables $X \in [X_1, X_2, .., X_n]$. This function is defined as follows:
\[ PMF(y_i | x_i) = \frac{e^{-\lambda_i} \lambda_i^{y_i}}{y_i!} \tag{7} \]
where $\lambda_i$ is the mean rate, also meant to be the predicted value. The predicted regression output for a given input $x$ is defined as follows:
\[ \lambda_i = e^{x_i \beta} \tag{8} \]
Here, $\beta$ is the regression coefficient or feature importance that is given to each predictor variable. The training objective of a Poisson regressor model is to find this $\beta$ value. The best $\beta$ value would be the one that produces the maximum value for PMF. The maximum value would be when the slope of the PMF curve is zero, which is better derived

by taking derivatives of the logarithm of PMF. The logarithm of the PMF is as follows:
\[ \ln(PMF) = \sum_{i=1}^{n} \left( y_i x_i \beta - e^{x_i \beta} - \ln y_i! \right) \tag{9} \]
Its derivative with respect to $\beta$ is equated to zero to find the best $\beta$ value. This value is then used to derive the prediction for a new test input $x_p$:
\[ y_p = \lambda_p = e^{x_p \beta} \tag{10} \]
The training of Poisson regression is done with the objective of finding the values of the regression coefficient $\beta$ that would make the vector of observed counts $y$ most likely. The steps to be taken are as follows:
1) Convert the data set into only numeric values.
2) The data set should contain only non-negative integer values that represent the frequency of an event during a set interval. For our problem statement, it would be the network traffic crossing a host in a particular interval.
3) Then find the regression variables that will influence the observed counts to derive the maximum PMF value.

E. AUTO REGRESSIVE INTEGRATED MOVING AVERAGE
ARIMA, which stands for Auto Regressive Integrated Moving Average, deploys a combination of auto-regression and moving average algorithms to get future predictions from past time series values [7]. Mathematically, it is represented as follows:
\[ h_t = \alpha + \beta_1 Y_{t-1} + \beta_2 Y_{t-2} + \dots + \beta_p Y_{t-p} + \epsilon_t \tag{11} \]
\[ y_t = h_t + \phi_1 \epsilon_{t-1} + \phi_2 \epsilon_{t-2} + \dots + \phi_q \epsilon_{t-q} \tag{12} \]
There are some terms based on auto-regression and some terms based on moving average. If the time series is under-differenced, add more AR terms, and if it is over-differenced, add more MA terms. The ARIMA(p, d, q) method applies differencing (lag) at the 1st or 2nd level if a non-stationarity problem exists in the data; otherwise, if the series is stationary without differencing, ARMA(p, q) is an alternative method. Consistent with equations (11) and (12), p is the autoregressive (AR) order and q is the moving average (MA) order, that is, the number of lagged error terms in the ARIMA model forecast. The most common method used for making a sequence stationary is to subtract the previous value from the current value. Depending on the type of time series, i.e. univariate or multivariate, one or more lags are anticipated. Subsequently, the value of d signifies the smallest number of differencing operations needed to keep the series stationary, so if the data series is already stationary without differencing, then d = 0. The identification method begins by measuring the presence of autocorrelation (ACF) and partial autocorrelation (PACF) by plotting the correlogram [49]. Then, depending on the ACF and PACF of the series, relevant models are estimated, setting the level of auto-regressive and moving average terms. The (p, q) values are identified and the best model is selected based on the spikes and curve in the graph of ACF and PACF. Once the best model is selected, forecasting is done using the parameters (p, d, q) given by the model. Diagnostic forecasting evaluation involves evaluating the efficacy of the currently built model using statistically relevant measures such as the Akaike information criterion (AIC), Bayesian information criterion (BIC), and mean square error measurement [49].

F. LONG SHORT TERM MEMORY
Long Short-Term Memory is a recurrent network-based architecture that keeps track of a cell state to remember certain trends in the series, as shown in Figure 5. For every point in the state, the model decides whether to let go of some information, update some pattern information, or output any new information. LSTMs are specifically developed to prevent the problem of long-term dependency. All recurrent neural networks are made up of a series of repeated neural network modules. This recurring module in standard Recurrent Neural Networks (RNNs), shown in Figure 5, has a simple structure, such as a single tanh layer [12].

FIGURE 5. Block diagram of LSTM architecture

There are three gates (forget, update, and output) that operate on the time series input $x_t$ and the intermediate output $h_t$.
\[ f_t = \sigma(\alpha_f x_t + \beta_f h_{t-1}) \tag{13} \]
\[ o_t = \sigma(\alpha_o x_t + \beta_o h_{t-1}) \tag{14} \]

G. CONVO-LSTM
This paper proposes a novel combination of CNN with LSTM such that the feature extraction ability of CNN can benefit the sequence recurrence mapping ability of Recurrent Neural Networks (RNN). Figure 4 depicts the model structural diagram, whereas Figure 6 depicts the proposed model's architectural diagram. An input layer, one-dimensional convolution layer, pooling layer, LSTM hidden layer, and full connection layer are the main constituents that build the main structure of Convo-LSTM. Lecun et al. proposed the CNN network model in 1998. The convolution operation extracts the attributes from the input layer vectors [50]. In this case, only one-dimensional (1D) operations are performed owing to the data structure. Pooling layers are deployed to reduce storage requirements and avoid

the huge training costs in the system. The pooling layer subdivides the convolutional layer's output into small rectangular chunks to generate a single output from each block. Pooling can be done in various ways, such as by calculating the average or the maximum. Average pooling takes the average value of the block it is pooling, whereas max-pooling takes the maximum of the block it is pooling [51]. Firstly, the CNN layer extracts the features from the data, which are the Current, Voltage, Temperature and Vibration sensor readings collected over the previous year. The LSTM is then used to forecast the output, Vibration, based on the retrieved feature data. As per the experiment's findings, the CNN-LSTM, that is Convo-LSTM, can provide credible forecasting of the output parameter (Vibration) with the maximum prediction accuracy.
Time Complexity of Convo-LSTM: To determine the time complexity of both the forward propagation and back propagation processes, the total number of operations at each 1D CNN layer must first be determined, and then the entire number of operations must be aggregated to determine the overall time complexity [52]. During forward propagation (FP), the number of connections to the preceding layer at a CNN layer $l$ is $N^{l-1} N^{l}$, and for each connection an individual linear convolution, which is a linear weighted sum, is evaluated. Let $S^{l-1}$ and $W^{l-1}$ represent the vector sizes of the preceding layer output, $s_k^{l-1}$, and the kernel (weight), respectively. A linear convolution consists of $S^{l-1} (W^{l-1})^2$ multiplications and $S^{l-1}$ additions from a single connection, ignoring the boundary conditions. If the bias is ignored, the aggregate number of multiplications and additions in layer $l$ will be:
\[ N(\mathrm{mul})^{l} = N^{l-1} N^{l} \, S^{l-1} (W^{l-1})^{2} \tag{15} \]
\[ N(\mathrm{add})^{l} = N^{l-1} N^{l} \, S^{l-1} \tag{16} \]
A low computational complexity is attained in all of the 1D CNN layers. Thus, in forward propagation, the total number of multiplications T(mul) and the total number of additions T(add) over the CNN layers will be:
\[ T_{FP}(\mathrm{mul}) = \sum_{l=0}^{L} N^{l-1} N^{l} \, S^{l-1} (W^{l-1})^{2} \tag{17} \]
\[ T_{FP}(\mathrm{add}) = \sum_{l=0}^{L} N^{l-1} N^{l} \, S^{l-1} \tag{18} \]
Similarly, at a back propagation (BP) iteration, the total number of multiplications and additions due to the first convolution will be:
\[ T_{BP}(\mathrm{mul}) = \sum_{l=0}^{L} N^{l+1} N^{l} \, S^{l+1} (W^{l+1})^{2} \tag{19} \]
\[ T_{BP}(\mathrm{add}) = \sum_{l=0}^{L} N^{l+1} N^{l} \, S^{l+1} \tag{20} \]
So at each BP iteration, the total number of multiplications and additions will be:
\[ T_{FP}(\mathrm{mul}) + T_{FP}(\mathrm{add}) + T_{BP}(\mathrm{mul}) + T_{BP}(\mathrm{add}) \tag{21} \]
Statistical Analysis: CNN stands for convolutional neural network and is a type of feed-forward neural network. It can be used to forecast time series. Two inherent features of CNN, namely local perception and weight sharing, can significantly lower the number of parameters, which in turn helps enhance model learning efficiency. The convolution layer and the pooling layer are the two fundamental constituents of CNN. Each convolution layer has a number of convolution kernels. The formula below is used for calculating their output:
\[ I_t = \tanh(x_t * k_t + b_t) \tag{22} \]
where $I_t$ is the output value resulting from the convolution, $x_t$ is the input vector, tanh is the activation function, $b_t$ is the bias, and $k_t$ is the convolution kernel's weight. The data features are obtained once the convolution layer completes the convolution operation. However, as the extracted feature dimensions are quite large, a pooling layer is added after the convolution layer to lower the feature dimension and the cost of training the network. The forget gate receives the output value of the previous moment and the input value of the current time, from which the forget gate's output value is calculated [50], as indicated in the following formula:
\[ f_t = \sigma(W_f \cdot [h_{t-1}, X_t] + b_f) \tag{23} \]
The last time's output value and the current time's input value are both fed into the input gate, and the output value and candidate cell state of the input gate are calculated, as shown in the formula below:
\[ i_t = \sigma(W_i \cdot [h_{t-1}, X_t] + b_i) \tag{24} \]
The final output $o_t$ is calculated as follows:
\[ o_t = \sigma(\alpha_o x_t + \beta_o h_{t-1}) \tag{25} \]
\[ h_t = o_t * \tanh(C_t) \tag{26} \]
where $f_t$ has the value range (0,1), and $W_f$ and $b_f$ are the weight and bias of the forget gate. Similarly, $W_i$ and $b_i$ are the weight and bias of the input gate, which has the value range (0,1), and $C_t$ is the output of the current cell, with value range (0,1) [50].
The following is a summary of the proposed architecture: The convolution layer and the pooling layer are the two fundamental components of CNN. Each convolution layer has several convolution kernels. The data features are extracted after the convolution operation of the convolution layer. As the extracted feature dimensions are very large, a pooling layer is added after the convolution layer to reduce the feature dimension and the cost of training the network. The convolution operation extracts the features from the input layer vectors. In this case, only 1D operations are performed owing to the structure of the data. To avoid the introduction of huge training costs in

the system, pooling layers are also deployed to reduce storage requirements. The summary of the proposed architecture is shown in Table 2.

FIGURE 6. Proposed Convo-LSTM architecture

TABLE 2. Summary of proposed Convo-LSTM architecture

Layer (type)                     Output Shape      Param #
conv1d (Conv1D)                  (None, 15, 64)    2304
max_pooling1d (MaxPooling1D)     (None, 14, 64)    0
conv1d_1 (Conv1D)                (None, 14, 32)    6176
max_pooling1d_1 (MaxPooling1D)   (None, 13, 32)    0
LSTM (LSTM)                      (None, 13, 128)   82432
LSTM_1 (LSTM)                    (None, 13, 128)   131584
flatten (Flatten)                (None, 1664)      0
dense (Dense)                    (None, 64)        106560
dropout (Dropout)                (None, 64)        0
dense_1 (Dense)                  (None, 16)        1040
dropout_1 (Dropout)              (None, 16)        0
dense_2 (Dense)                  (None, 1)         17

To demonstrate the model's usefulness, the performance of ARIMA, Decision Tree Regressor, Linear Regressor, Multi-Layer Perceptron, and Long Short Term Memory is compared in this work, using the same training and test sets (sample shown in Figure 7) under the same operating environment. All the model development and performance evaluations are performed on an Intel(R) Core(TM) i5-7300HQ CPU @ 2.50 GHz with 8.00 GB (7.87 GB usable) RAM and Windows 10. The sensors used for current, temperature and vibration are as follows. The LM35 (Texas Instruments, Dallas, TX, USA), a precision IC temperature sensor with a proportional output (in °C), is used for temperature. A Hall Effect-based DC current sensor, Allegro MicroSystems LLC's ACS712 current transducer (Worcester, MA, USA), is used in this circuit for current. High frequency accelerometers with a flat frequency response up to 28 kHz, intended for monitoring multi-stage compressors and boiler feed pumps and for bearing wear detection, are used for vibration.
Sample Data set Description: The proposed system makes a strong case for network traffic prediction using historical (time series) data collected over the wireless mesh network. The sensor nodes form connections in a mesh architecture. The data, collected over one year from 1st June 2019 to 8th June 2020, are obtained from the sensors measuring the Temperature, Three-phase Voltage, Three-phase Current and Vibration values of the HSD Pump. A snapshot of the data is shown in Figure 7. To implement the various algorithms, the data must be divided into a training set, a validation set, and a test set. The first 7178 readings of the data are taken as the training set, the next 1345 readings as the validation set, and the last 450 readings as the test set. From the influence factors, namely the Temperature, Three-phase Current (IR, IY, IB), and Three-phase Voltage (VR, VY, VB), the HSD pump's Vibration is predicted on an hourly, daily, weekly, monthly, and yearly basis. As per standard industry practice, the LM35 temperature sensor (Texas Instruments, Dallas, TX, USA), a precision IC temperature sensor with a proportional output, is used for temperature. A Hall Effect-based DC current sensor is the ideal method of measurement for monitoring the current of the motor; Allegro MicroSystems LLC's ACS712 current transducer (Worcester, MA, USA) is used in this circuit [39]. High frequency accelerometers with a flat frequency response up to 28 kHz, intended for monitoring multi-stage compressors and boiler feed pumps and for bearing wear detection, are used for vibration.

FIGURE 7. Data set-snapshot

Model Description: The parameters of the Convo-LSTM are included in Table 2, which shows the CNN-LSTM parameter settings used in this experiment. The specific model is built as follows, based on the parameter settings of the Convo-LSTM network: A three-dimensional data vector (None, 10, 7) is used as the input training data, where 10 is the time step size and 7 is the number of attributes of the input data. The data is first passed to a one-dimensional convolution layer, which extracts additional features and produces a three-dimensional output vector (None, 15, 64), where 64 is the number of convolution layer filters. After passing through the pooling layer, the vector is transformed into a three-dimensional output vector (None, 13, 32). The output vector is then trained using the LSTM layer and two dense layers, and the output data (None, 64) goes through another full connection layer after training to retrieve the output value; 64 is the number of hidden units in the LSTM layer. This CNN-LSTM model structure is shown in Figure 6.

V. RESULTS
The models, namely Decision Tree Regressor (DTR), Linear Regression (LR), Multi-layer Perceptron (MLP), Poisson Regression, Auto-Regressive Integrated Moving Average (ARIMA), Long Short-Term Memory (LSTM), and Convo-LSTM, are trained using the data from the processed training set. The fully trained models are then used to forecast the output from the test set data. The predictions of each model are compared with the actual value of the output from the data set. Among the seven forecasting methods, Decision Tree Regressor (DTR), Linear Regression (LR), Multi-layer Perceptron (MLP), Poisson Regression, Auto-Regressive Integrated Moving Average (ARIMA), Long Short-Term Memory (LSTM), and Convo-LSTM, the highest degree of broken line fitting is shown by Convo-LSTM, whose predicted and actual curves practically coincide, and the MLP model shows the lowest degree of broken line fitting.
The aforementioned models are evaluated across multiple time intervals:
• Hourly: The sensor readings are logged at an interval of every hour.
• Daily: The sensor readings are logged as the cumulative value for the entire day.
• Weekly: The sensor readings are logged at the weekly level, thereby negating any daily variations observed between any specific days.
• Monthly: The sensor readings are logged on a month over month basis.
• Yearly: The sensor readings are logged on a yearly basis.
The results are evaluated across the following three metrics:
• Mean Absolute Error (MAE): Mean of the absolute difference between actual and predicted output value. From the observations it is found that the MAE for the Convo-LSTM model is 0.1, which indicates the model's performance relative to the other models.
\[ \frac{1}{D} \sum_{i=1}^{D} |x_i - y_i| \tag{27} \]
• Mean Square Error (MSE): Mean of the square of the difference between actual and predicted output.
\[ \frac{1}{D} \sum_{i=1}^{D} (x_i - y_i)^2 \tag{28} \]
• Root Mean Squared Error (RMSE): Square root of the MSE value. The result shows the comparison of the RMSE value of Convo-LSTM with the other models.
\[ \sqrt{\frac{1}{n} \sum_{i=1}^{n} \left( \frac{d_i - f_i}{\sigma_i} \right)^2} \tag{29} \]
The hourly, daily, weekly, monthly and yearly predictions of the output are presented in Table 3 to Table 7.

TABLE 3. Results for hourly prediction model

Approach                 MAE      MSE      RMSE
DTR                      0.12     0.47     0.21
Linear Regression        0.13     0.42     0.17
Multi-layer Perceptron   0.16     0.70     0.49
Poisson Regression       0.06     0.50     0.68
ARIMA                    0.19     0.46     0.67
LSTM                     0.46     0.24     0.49
Convo-LSTM               0.1      0.025    0.16

TABLE 4. Results for daily prediction model

Approach                 MAE      MSE      RMSE
DTR                      0.1199   0.4612   0.2127
Linear Regression        0.1253   0.4161   0.1732
Multi-layer Perceptron   0.1608   0.6941   0.4818
Poisson Regression       0.055    0.464    0.681
ARIMA                    0.15     0.46     0.68
LSTM                     0.46     0.23     0.48
Convo-LSTM               0.1      0.025    0.159

TABLE 5. Results for weekly prediction model

Approach                 MAE      MSE      RMSE
DTR                      0.13     0.47     0.22
Linear Regression        0.14     0.48     0.18
Multi-layer Perceptron   0.16     0.69     0.48
Poisson Regression       0.05     0.46     0.68
ARIMA                    0.15     0.49     0.677
LSTM                     0.45     0.23     0.49
Convo-LSTM               0.1      0.025    0.159

TABLE 6. Results for monthly prediction model

Approach                 MAE      MSE      RMSE
DTR                      0.11     0.46     0.21
Linear Regression        0.12     0.41     0.17
Multi-layer Perceptron   0.16     0.69     0.48
Poisson Regression       0.055    0.464    0.681
ARIMA                    0.14     0.45     0.67
LSTM                     0.463    0.239    0.489
Convo-LSTM               0.1      0.025    0.159

TABLE 7. Results for yearly prediction model

Approach                 MAE      MSE      RMSE
DTR                      0.1199   0.4612   0.2127
Linear Regression        0.1253   0.4161   0.1732
Multi-layer Perceptron   0.1608   0.6941   0.4818
Poisson Regression       0.055    0.464    0.681
ARIMA                    0.148    0.458    0.677
LSTM                     0.463    0.239    0.489
Convo-LSTM               0.1      0.025    0.159
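
The three evaluation metrics used in this section can be computed as in the minimal sketch below. It uses the standard definitions, averaging over the samples and taking RMSE as the square root of MSE, which corresponds to setting every $\sigma_i$ to 1 in the RMSE expression; that simplification is an assumption. The function names and the four sample readings are hypothetical and are not taken from the paper's data set.

```python
import numpy as np

def mae(actual, predicted):
    """Mean absolute error between actual and predicted values."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(np.mean(np.abs(actual - predicted)))

def mse(actual, predicted):
    """Mean squared error between actual and predicted values."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(np.mean((actual - predicted) ** 2))

def rmse(actual, predicted):
    """Root mean squared error: square root of the MSE."""
    return float(np.sqrt(mse(actual, predicted)))

# Made-up vibration readings, for illustration only.
actual = [2.0, 3.0, 4.0, 5.0]
predicted = [2.5, 3.0, 3.5, 6.0]

print("MAE :", mae(actual, predicted))   # 0.5
print("MSE :", mse(actual, predicted))   # 0.375
print("RMSE:", rmse(actual, predicted))
```

Because RMSE is defined here as the square root of MSE, the two columns of a results table can be cross-checked against each other for any model.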


The results compare the MAE, MSE and RMSE values of Convo-LSTM with those of the other models for the hourly, daily, weekly, monthly and yearly models. From the observations it is found that the MAE for the Convo-LSTM model is 0.1, the MSE is 0.025 and the RMSE is 0.16, which indicates the model's performance relative to the other models. For interpretability of the results, graphs comparing the predicted and actual output for the implemented algorithms are shown: Figure 8 Poisson Regression, Figure 9 ARIMA, Figure 10 DTR, Figure 11 Linear Regression, Figure 12 Multi-Layer Perceptron, Figure 13 LSTM, and Figure 14 Convo-LSTM.

FIGURE 8. Actual vs Predicted vibrations using Poisson's algorithm

FIGURE 9. Actual vs Predicted vibrations using ARIMA algorithm

FIGURE 10. Actual vs Predicted vibrations using DTR algorithm

[Figure 11 plot: y-axis "Vibration Predicted" (0 to 6), x-axis "Interval" (0 to 6500)]

FIGURE 11. Actual vs Predicted vibrations using Linear Regression algorithm

FIGURE 12. Actual vs Predicted vibrations using MLP algorithm

FIGURE 13. Actual vs Predicted vibrations using LSTM algorithm

FIGURE 14. Actual vs Predicted vibrations using Convo-LSTM algorithm
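
As a reading aid for the result tables, the sketch below picks the model with the lowest value for each metric. The numbers are copied from Table 3 (hourly results); the dictionary layout and the function name `best_per_metric` are hypothetical. On these hourly values, Convo-LSTM has the lowest MSE and RMSE, while Poisson Regression has the lowest MAE.

```python
# Hourly results copied from Table 3: model -> (MAE, MSE, RMSE)
TABLE_3 = {
    "DTR": (0.12, 0.47, 0.21),
    "Linear Regression": (0.13, 0.42, 0.17),
    "Multi-layer Perceptron": (0.16, 0.70, 0.49),
    "Poisson Regression": (0.06, 0.50, 0.68),
    "ARIMA": (0.19, 0.46, 0.67),
    "LSTM": (0.46, 0.24, 0.49),
    "Convo-LSTM": (0.1, 0.025, 0.16),
}
METRICS = ("MAE", "MSE", "RMSE")

def best_per_metric(table):
    """Return {metric: (model, value)} with the lowest value for each metric."""
    best = {}
    for idx, metric in enumerate(METRICS):
        # Lower is better for all three error metrics.
        model = min(table, key=lambda name: table[name][idx])
        best[metric] = (model, table[model][idx])
    return best

best = best_per_metric(TABLE_3)
for metric in METRICS:
    model, value = best[metric]
    print(f"{metric}: {model} ({value})")
```

The same function can be applied to the daily, weekly, monthly, and yearly tables by swapping in their rows.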


VI. CONCLUSION
This paper proposes a hybrid model for traffic prediction in wireless mesh networks by application of regression methods on system configuration parameters. Specifically, six different algorithms are applied: decision tree regressor, linear regression, multi-layer perceptron, Poisson regression, ARIMA, and LSTM, on the three main feature types (three-phase current, three-phase voltage, and temperature) to predict the output, the Vibration of the HSD Pump. This paper also proposes a new Convo-LSTM setup for this task and achieves good results from the same. The system was evaluated on five different intervals: hourly, daily, weekly, monthly, and yearly, and the Convo-LSTM algorithm is found to be the best performing one. A time-series multivariate data set is similar to that of a wireless mesh network data set. This gives a direction to researchers to implement the proposed algorithm to predict the volume of network traffic. Future work in the domain includes the application of some contemporary methods like the attention mechanism and transformers for traffic prediction. Multimodal networks that combine the physical configuration values and the network system log values can also be proposed. As network systems become larger and more distributed in nature, smart algorithms that can automatically predict the incoming traffic and accordingly allocate the resources would become the need of the hour.

REFERENCES
[1] Jamal N. Al-Karaki and Ahmed E. Kamal. Routing techniques in wireless sensor networks: A survey. IEEE Wireless Communications, 2004.
[2] Antonio Cilfone, Luca Davoli, Laura Belli, and Gianluigi Ferrari. Wireless mesh networking: An IoT-oriented perspective survey on relevant technologies. Future Internet, 11(4), 2019.
[3] Chaoyun Zhang, Paul Patras, and Hamed Haddadi. Deep learning in mobile and wireless networking: A survey. IEEE Communications surveys & tutorials, 21(3):2224–2287, 2019.
[4] H Zare Moayedi and MA Masnadi-Shirazi. Arima model for network traffic prediction and anomaly detection. In 2008 international symposium on information technology, volume 4, pages 1–6. IEEE, 2008.
[5] Bo Zhou, Dan He, Zhili Sun, and Wee Hock Ng. Network traffic modeling and prediction with arima/garch. In Proc. of HET-NETs Conference, pages 1–10, 2005.
[6] Sun Han-Lin, Jin Yue-Hui, Cui Yi-Dong, and Cheng Shi-Duan. Network traffic prediction by a wavelet-based combined model. Chinese Physics B, 18(11):4760, 2009.
[7] Farhan Mohammad Khan and Rajiv Gupta. Arima and nar based prediction model for time series analysis of covid-19 cases in india. Journal of Safety Science and Resilience, 1(1):12–18, 2020.
[8] Kimberly F Sellers and Bailey Premeaux. Conway–maxwell–poisson regression models for dispersed count data. Wiley Interdisciplinary Reviews: Computational Statistics, page e1533, 2020.
[9] Shalika Walker, Waqas Khan, Katarina Katic, Wim Maassen, and Wim Zeiler. Accuracy of different machine learning algorithms and added-value of predicting aggregated-level energy performance of commercial buildings. Energy and Buildings, 209:109705, 2020.
[10] Ke Wang, Changxi Ma, Yihuan Qiao, Xijin Lu, Weining Hao, and Sheng Dong. A hybrid deep learning model with 1dcnn-lstm-attention networks for short-term traffic flow prediction. Physica A: Statistical Mechanics and its Applications, 583:126293, 2021.
[11] Yidong Liu, Siting Liu, Yanzhi Wang, Fabrizio Lombardi, and Jie Han. A stochastic computational multi-layer perceptron with backward propagation. IEEE Transactions on Computers, 67(9):1273–1286, 2018.
[13] Laisen Nie, Xiaojie Wang, Liangtian Wan, Shui Yu, Houbing Song, and Dingde Jiang. Network traffic prediction based on deep belief network and spatiotemporal compressive sensing in wireless mesh backbone networks. Wireless Communications and Mobile Computing, 2018, 2018.
[14] Senthil Murugan Nagarajan, Ganesh Gopal Deverajan, Puspita Chatterjee, Waleed Alnumay, and Uttam Ghosh. Effective task scheduling algorithm with deep learning for internet of health things (ioht) in sustainable smart cities. Sustainable Cities and Society, 71:102945, 2021.
[15] Benjamin Lindemann, Timo Müller, Hannes Vietz, Nasser Jazdi, and Michael Weyrich. A survey on long short-term memory networks for time series prediction. Procedia CIRP, 99:650–655, 2021.
[16] Alireza Khotanzad and Nayyara Sadek. Multi-scale high-speed network traffic prediction using combination of neural networks. In Proceedings of the International Joint Conference on Neural Networks, 2003, volume 2, pages 1071–1075. IEEE, 2003.
[17] Vicente Alarcon-Aquino and Javier A Barria. Multiresolution fir neural-network-based learning algorithm applied to network traffic prediction. IEEE Transactions on Systems, Man, and Cybernetics, Part C (Applications and Reviews), 36(2):208–220, 2006.
[18] Yuehui Chen, Bin Yang, and Qingfang Meng. Small-time scale network traffic prediction based on flexible neural tree. Applied Soft Computing, 12(1):274–279, 2012.
[19] Tiago Prado Oliveira, Jamil Salem Barbar, and Alexsandro Santos Soares. Computer network traffic prediction: a comparison between traditional and deep learning neural networks. International Journal of Big Data Intelligence, 3(1):28–37, 2016.
[20] R Vinayakumar, KP Soman, and Prabaharan Poornachandran. Applying deep learning approaches for network traffic prediction. In 2017 International Conference on Advances in Computing, Communications and Informatics (ICACCI), pages 2353–2358. IEEE, 2017.
[21] Gowrishankar and PS Satyanarayana. Neural network based traffic prediction for wireless data networks. International Journal of Computational Intelligence Systems, 1(4):379–389, 2008.
[22] Lin Xiang, Xiao-Hu Ge, Chuang Liu, Lei Shu, and Cheng-Xiang Wang. A new hybrid network traffic prediction method. In 2010 IEEE Global Telecommunications Conference GLOBECOM 2010, pages 1–5. IEEE, 2010.
[23] Ali Yadavar Nikravesh, Samuel A Ajila, Chung-Horng Lung, and Wayne Ding. Mobile network traffic prediction using mlp, mlpwd, and svm. In 2016 IEEE International Congress on Big Data (BigData Congress), pages 402–409. IEEE, 2016.
[24] Stefany Coxe, Stephen G West, and Leona S Aiken. The analysis of count data: A gentle introduction to poisson regression and its alternatives. Journal of Personality Assessment, 91(2):121–136, 2009. PMID: 19205933.
[25] Laisen Nie, Dingde Jiang, Shui Yu, and Houbing Song. Network traffic prediction based on deep belief network in wireless mesh backbone networks. In 2017 IEEE Wireless Communications and Networking Conference (WCNC), pages 1–5. IEEE, 2017.
[26] Laisen Nie, Zhaolong Ning, Mohammad S Obaidat, Balqies Sadoun, Huizhi Wang, Shengtao Li, Lei Guo, and Guoyin Wang. A reinforcement learning-based network traffic prediction mechanism in intelligent internet of things. IEEE Transactions on Industrial Informatics, 17(3):2169–2180, 2020.
[27] Chen Qiu, Yanyan Zhang, Zhiyong Feng, Ping Zhang, and Shuguang Cui. Spatio-temporal wireless traffic prediction with recurrent neural network. IEEE Wireless Communications Letters, 7(4):554–557, 2018.
[28] Yue Xu, Feng Yin, Wenjun Xu, Jiaru Lin, and Shuguang Cui. Wireless traffic prediction with scalable gaussian process: Framework, algorithms, and verification. IEEE Journal on Selected Areas in Communications, 37(6):1291–1306, 2019.
[29] Pratik Ratadiya and Deepak Mishra. An attention ensemble based approach for multilabel profanity detection. In 2019 International Conference on Data Mining Workshops (ICDMW), pages 544–550. IEEE, 2019.
[30] Pratik Ratadiya, Khushi Asawa, and Omkar Nikhal. A decentralized aggregation mechanism for training deep learning models using smart contract system for bank loan prediction. arXiv preprint arXiv:2011.10981, 2020.
[31] Qing He, Arash Moayyedi, György Dán, Georgios P Koudouridis, and Per Tengkvist. A meta-learning scheme for adaptive short-term network traffic prediction. IEEE Journal on Selected Areas in Communications, 38(10):2271–2283, 2020.
[12] Yong Yu, Xiaosheng Si, Changhua Hu, and Jianxun Zhang. A Review [32] Ming Li, Yuewen Wang, Zhaowen Wang, and Huiying Zheng. A deep
of Recurrent Neural Networks: LSTM Cells and Network Architectures. learning method based on an attention mechanism for wireless network
Neural Computation, 31(7):1235–1270, 07 2019. traffic prediction. Ad Hoc Networks, 107:102258, 2020.


[33] Dehai Zhang, Linan Liu, Cheng Xie, Bing Yang, and Qing Liu. Citywide cellular traffic prediction based on a hybrid spatiotemporal network. Algorithms, 13(1):20, 2020.
[34] Meejoung Kim. Network traffic prediction based on ingarch model. Wireless Networks, 26(8):6189–6202, 2020.
[35] T. Zonta, C.A. Costa, R.D. Righi, M.J. Lima, E.S. Trindade, and G. Li. Predictive maintenance in the industry 4.0: A systematic literature review. Comput. Ind. Eng., 150:106889, 2020.
[36] M Tan, Indivarie Ubhayaratne, Ying Huo, F Bob Varela, and Yong Xiang. Predictive maintenance based on smart monitoring and data analytics. In Proceedings of the Australasian Corrosion Association 2019 Corrosion and Prevention Conference, pages 1–10. Australasian Corrosion Association, 2019.
[37] Shang-Yi Chuang, Nilima Sahoo, Hung-Wei Lin, and Yeong-Hwa Chang. Predictive maintenance with sensor data analytics on a raspberry pi-based experimental platform. Sensors, 19(18):3884, Sep 2019.
[38] Pallavi Sethi and Smruti R Sarangi. Internet of things: architectures, protocols, and applications. Journal of Electrical and Computer Engineering, 2017, 2017.
[39] Sankar Mukherjee and G.P. Biswas. Networking for iot and applications using existing communication technology. Egyptian Informatics Journal, 19(2):107–127, 2018.
[40] Asha Jerlin Manuel, Ganesh Gopal Deverajan, Rizwan Patan, and Amir H. Gandomi. Optimization of routing-based clustering approaches in wireless sensor network: Review and open research issues. Electronics, 9(10):1630, Oct 2020.
[41] Lalitha Krishnasamy, Rajesh Dhanaraj, D. Ganesh Gopal, Thippa Reddy Gadekallu, Mohamed Aboudaif, and Emad Abouel Nasr. A heuristic angular clustering framework for secured statistical data aggregation in sensor networks. Sensors, 20(17):4937, Aug 2020.
[42] Deverajan Ganesh Gopal and R Saravanan. Selfish node detection based on evidence by trust authority and selfish replica allocation in danet. International Journal of Information and Communication Technology, 9(4):473–491, 2016.
[43] Ganesh Gopal Deverajan, V Muthukumaran, Ching-Hsien Hsu, Marimuthu Karuppiah, Yeh-Ching Chung, and Ying-Huei Chen. Public key encryption with equality test for industrial internet of things system in cloud computing. Transactions on Emerging Telecommunications Technologies, page e4202, 2021.
[44] Yasir Ali Farrukh, Irfan Khan, Zeeshan Ahmad, and Rajvikram Madurai Elavarasan. A sequential supervised machine learning approach for cyber attack detection in a smart grid system. arXiv preprint arXiv:2108.00476, 2021.
[45] Walaa Alajali, Wei Zhou, Sheng Wen, and Yu Wang. Intersection traffic prediction using decision tree models. Symmetry, 10(9):386, Sep 2018.
[46] Julian PT Higgins and Simon G Thompson. Controlling the risk of spurious findings from meta-regression. Statistics in Medicine, 23(11):1663–1682, 2004.
[47] Muhammad Amin, Muhammad Nauman Akram, and Muhammad Amanullah. On the james-stein estimator for the poisson regression model. Communications in Statistics-Simulation and Computation, pages 1–13, 2020.
[48] Christos Katris and Sophia Daskalaki. Comparing forecasting approaches for internet traffic. Expert Systems with Applications, 42(21):8172–8183, 2015.
[49] E Elakkiya, M Radha, and R Sathy. Application of arima model for predicting cashew nut production in india–an analysis. International Journal of Research in Business Management, 5:45–52, 2017.
[50] Wenjie Lu, Jiazheng Li, Yifan Li, Aijun Sun, and Jingyang Wang. A cnn-lstm-based model to forecast stock prices. Complexity, 2020, 2020.
[51] Mohammed Alawad and Mingjie Lin. Stochastic-based deep convolutional networks with reconfigurable logic fabric. IEEE Transactions on Multi-Scale Computing Systems, 2:242–256, 2016.
[52] Serkan Kiranyaz, Onur Avcı, Osama Abdeljaber, Turker Ince, Moncef Gabbouj, and Daniel J. Inman. 1d convolutional neural networks and applications: A survey. ArXiv, abs/1905.03554, 2019.

SMITA MAHAJAN is currently working as an Assistant Professor in the Computer Science Engineering Department, Symbiosis Institute of Technology, Symbiosis International Deemed University, Pune, India. Her main research interests include Computer Networking, Wireless and Broadband Networks, Deep Learning, and Machine Learning. She received her Bachelor's degree from Shivaji University, Maharashtra, and a Master's degree in Information Technology from Mumbai University. She is pursuing her PhD degree in Computer Science Engineering at Symbiosis International University, Pune, Maharashtra, India. Contact her at [email protected]

HARIKRISHNAN R is currently working as an Associate Professor in the Electronics and Telecommunication Engineering Department, Symbiosis Institute of Technology, Symbiosis International Deemed University, Pune, India. His main research interests include Smart Grid, Internet of Things, Artificial Intelligence, and Wireless Sensor Networks. He received his Bachelor's degree in Electrical and Electronics Engineering from the University of Madras, a Master's degree in Energy System Engineering from VIT University, Vellore, and a Master's degree in Embedded System Technologies from Anna University, Chennai. He received his Ph.D. degree in Electrical Engineering from Sathyabama University, Chennai, India. He has 21 years of teaching, research, and industrial experience. Contact him at [email protected].

KETAN KOTECHA holds a Ph.D. and an M.Tech. from IIT Bombay and currently serves as Head, Symbiosis Centre for Applied AI (SCAAI). Dr. Kotecha has more than 25 years of expertise and experience in cutting-edge research and projects in AI and Deep Learning. He has published widely, with 100+ publications in a number of excellent peer-reviewed journals on topics ranging from cutting-edge AI and education policies to teaching-learning practices and AI for all. He is a recipient of two SPARC projects in AI worth INR 166 lakhs from the MHRD, Government of India, in collaboration with Arizona State University, USA, and the University of Queensland, Australia. He is also the recipient of numerous prestigious awards, including an Erasmus+ faculty mobility grant to Poland, a DUO-India professors fellowship for research in Responsible AI in collaboration with Brunel University, UK, a LEAP grant at Cambridge University, UK, a UKIERI grant with Aston University, UK, and a grant from the Royal Academy of Engineering, UK, under the Newton Bhabha Fund. Dr. Kotecha has published 3 patents and delivered keynote speeches at various national and international forums, including the Machine Intelligence Lab, USA, IIT Bombay under a World Bank project, the International Indian Science Festival organized by the Department of Science and Technology, Government of India, and many more. Currently, he is also an Associate Editor of the IEEE Access journal.
