IEEE COMMUNICATIONS SURVEYS & TUTORIALS, VOL. 21, NO. 4, FOURTH QUARTER 2019 3039

Artificial Neural Networks-Based Machine Learning for Wireless Networks: A Tutorial

Mingzhe Chen, Ursula Challita, Walid Saad, Fellow, IEEE, Changchuan Yin, Senior Member, IEEE, and Mérouane Debbah, Fellow, IEEE

Abstract—In order to effectively provide ultra reliable low latency communications and pervasive connectivity for Internet of Things (IoT) devices, next-generation wireless networks can leverage intelligent, data-driven functions enabled by the integration of machine learning (ML) notions across the wireless core and edge infrastructure. In this context, this paper provides a comprehensive tutorial that overviews how artificial neural networks (ANNs)-based ML algorithms can be employed for solving various wireless networking problems. For this purpose, we first present a detailed overview of a number of key types of ANNs, including recurrent, spiking, and deep neural networks, that are pertinent to wireless networking applications. For each type of ANN, we present the basic architecture as well as specific examples that are particularly important and relevant to wireless network design. Such ANN examples include echo state networks, liquid state machines, and long short term memory. Then, we provide an in-depth overview of the variety of wireless communication problems that can be addressed using ANNs, ranging from communication using unmanned aerial vehicles to virtual reality applications over wireless networks as well as edge computing and caching. For each individual application, we present the main motivation for using ANNs along with the associated challenges, provide a detailed example for a use case scenario, and outline future works that can be addressed using ANNs. In a nutshell, this paper constitutes the first holistic tutorial on the development of ANN-based ML techniques tailored to the needs of future wireless networks.

Index Terms—Machine learning, neural networks, artificial intelligence, wireless networks, reinforcement learning, virtual reality, communications.

Manuscript received January 8, 2019; revised May 5, 2019; accepted June 22, 2019. Date of publication July 3, 2019; date of current version November 25, 2019. This work was supported in part by the National Natural Science Foundation of China under Grant 61629101, Grant 61871041, and Grant 61671086, in part by the Beijing Natural Science Foundation and Municipal Education Committee Joint Funding Project under Grant KZ201911232046, in part by the 111 Project under Grant B17007, in part by Grant ZDSYS201707251409055, Grant 2017ZT07X152, Grant 2018B030338001, and Grant 2018YFB1800800, and in part by the U.S. National Science Foundation under Grant CNS-1460316, Grant CNS-1836802, and Grant IIS-1633363. (Corresponding author: Mingzhe Chen.)
M. Chen is with the Beijing Laboratory of Advanced Information Network, Beijing University of Posts and Telecommunications, Beijing 100876, China, also with the Future Network of Intelligence Institute, Chinese University of Hong Kong, Shenzhen 518172, China, and also with the Department of Electrical Engineering, Princeton University, Princeton, NJ 08544 USA (e-mail: [email protected]).
U. Challita is with the School of Informatics, University of Edinburgh, Edinburgh EH8 9AB, U.K. (e-mail: [email protected]).
W. Saad is with the Wireless@VT, Bradley Department of Electrical and Computer Engineering, Virginia Tech, Blacksburg, VA 24060 USA (e-mail: [email protected]).
C. Yin is with the Beijing Laboratory of Advanced Information Network, Beijing University of Posts and Telecommunications, Beijing 100876, China (e-mail: [email protected]).
M. Debbah is with the Mathematical and Algorithmic Sciences Laboratory, Huawei France R&D, 92100 Paris, France (e-mail: [email protected]).
Digital Object Identifier 10.1109/COMST.2019.2926625
1553-877X © 2019 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.

I. INTRODUCTION

THE WIRELESS networking landscape is undergoing a major revolution. The smartphone-centric networks of yesteryear are gradually morphing into an Internet of Things (IoT) system [1]–[3] that integrates a heterogeneous mix of wireless-enabled devices ranging from smartphones to drones, connected vehicles, wearables, sensors, and virtual reality devices. This unprecedented transformation will not only drive an exponential growth in wireless traffic in the foreseeable future, but it will also lead to the emergence of new and untested wireless service use cases that substantially differ from conventional multimedia or voice-based services [4]. For instance, beyond the need for high data rates – which has been the main driver of the wireless network evolution in the past decade – next-generation wireless networks will also have to deliver ultra-reliable, low-latency communication [4], [5] that adapts in real time to the dynamics of the IoT users and the IoT’s physical environment. For example, drones and connected vehicles [6] will place autonomy at the heart of the IoT. This, in turn, will necessitate the deployment of ultra-reliable wireless links that can provide real-time, low-latency control for such autonomous systems [7]–[9]. Meanwhile, in tomorrow’s wireless networks, large volumes of data will be collected, periodically and in real time, across a massive number of sensing and wearable devices that monitor physical environments. Such massive short-packet transmissions will lead to substantial traffic over the wireless uplink, which has traditionally been much less congested than the downlink [10]. This same wireless network must also support cloud-based gaming [11], immersive virtual reality services [12], real-time HD streaming, and conventional multimedia services. This ultimately creates a radically different networking environment whose novel applications and their diverse quality-of-service (QoS) and reliability requirements mandate a fundamental change in the way in which wireless networks are modeled, analyzed, designed, and optimized.

The need to cope with this ongoing and rapid evolution of wireless services has led to a considerable body of research that investigates what the optimal cellular network architecture will be within the context of the emerging fifth generation (5G) wireless networks (e.g., see [13] and the references therein).

Authorized licensed use limited to: UNIVERSIDADE FEDERAL DO PARANA. Downloaded on February 09,2023 at 00:02:55 UTC from IEEE Xplore. Restrictions apply.

While the main ingredients for 5G – such as dense small cell deployments, millimeter wave (mmWave) communications, and device-to-device (D2D) communications – have been identified, integrating them into a truly harmonious wireless system that can meet the IoT challenges requires instilling intelligent functions across both the edge and the core of the network. These intelligent functions must be able to adaptively exploit the wireless system resources and the generated data, in order to optimize the network operations and guarantee, in real time, the QoS needs of emerging wireless and IoT services. Such mobile edge and core intelligence can potentially be realized by integrating fundamental notions of machine learning (ML) [14], in particular, artificial neural network (ANN)-based ML approaches, across the wireless infrastructure and the end-user devices. ANNs [15] are a computational nonlinear machine learning framework that can be used for supervised learning, unsupervised learning [16], semi-supervised learning [17], and reinforcement learning [18], in various wireless networking scenarios.¹

¹ Hereinafter, ML is used to refer to ANN-based ML.

A. Role of ANNs in Wireless Networks

ML tools are undoubtedly one of the most important tools for endowing wireless networks with intelligent functions, as evidenced by the wide adoption of ML in a myriad of application domains [19]–[24]. In the context of wireless networks, ML will enable any wireless device to actively and intelligently monitor its environment by learning and predicting the evolution of the various environmental features (e.g., wireless channel dynamics, traffic patterns, network composition, content requests, user context, etc.) and proactively taking actions that maximize the chances of success for some predefined goal, which, in a wireless system, pertains to some sought-after quality-of-service. ML enables the network infrastructure to learn from the wireless networking environment and take adaptive network optimization actions. In consequence, ML is expected to play several roles in the next generation of wireless networks [25]–[29].

First, the most natural application of ML in a wireless system is to exploit intelligent and predictive data analytics to enhance situational awareness and overall network operations [25]. In this context, ML will provide the wireless network with the ability to parse through massive amounts of data, generated from multiple sources that range from wireless channel measurements and sensor readings to drones and surveillance images, in order to create a comprehensive operational map of the massive number of devices within the network [30]. This map can, in turn, be exploited to optimize various functions, such as fault monitoring and user tracking, across the wireless network.

Second, beyond its powerful intelligent and predictive data analytics functions, ML will be a major driver of intelligent and data-driven wireless network optimization [30]. For instance, ML tools will enable the introduction of intelligent resource management tools that can be used to address a variety of problems ranging from cell association and radio access technology selection to frequency allocation, spectrum management, power control, and intelligent beamforming. In contrast to conventional distributed optimization techniques, which are often run iteratively in an offline or semi-offline manner [31], ML-guided resource management mechanisms will be able to operate in a fully online manner by learning, in real time, the states of the wireless environment and the network’s users. Such mechanisms will therefore be able to continuously improve their own performance over time which, in turn, will enable more intelligent and dynamic network decision making. Such ML-driven decision making is essential for much of the envisioned IoT and 5G services, particularly those that require real-time, low latency operation, such as autonomous driving, drone guidance, and industrial control. In fact, if properly designed, ML optimization algorithms will provide inherently self-organizing, self-healing, and self-optimizing solutions for a broad range of problems within the context of network optimization and resource management. Such ML-driven self-organizing solutions are particularly apropos for ultra dense wireless networks in which classical centralized and distributed optimization approaches can no longer cope with the scale and the heterogeneity of the network.

Third, beyond its system-level functions, ML can play a key role at the physical layer of a wireless network [32]. As shown in [32]–[37], ML tools can be used to redefine the way in which physical layer functions, such as coding and modulation, are designed, at both transmitter and receiver levels, within a generic communication system. Such an ML-driven approach has been shown [32]–[37] to have a lot of promise in delivering lower bit error rates and better robustness to the wireless channel impediments.

Last, but not least, the rapid deployment of highly user-centric wireless services, such as virtual reality [38], in which the gap between the end-user and the network functions is almost minimal, strongly motivates the need for wireless networks that can track and adapt to the human user behavior. In this regard, ML is perhaps the only tool that is capable of learning and mimicking human behavior, which will help the wireless network adapt its functions to its human users, thus creating a truly immersive environment and maximizing the overall quality-of-experience (QoE) of the users.

From the above discussion, we can further narrow down the introduction of ML in wireless networks to imply two key functions: 1) intelligent and predictive data analytics, i.e., the ability of the wireless network to intelligently process large volumes of data, gathered from its devices, in order to analyze and predict the context of the wireless users and the wireless network’s environmental states, thus enabling data-driven network-wide operational decisions, and 2) intelligent/self-organizing network control and optimization, i.e., the ability of the wireless network to dynamically learn the wireless environment and intelligently control the wireless network and optimize its resources according to information smartly learned about the wireless environment and users’ states.

Clearly, ML-based system operation is no longer a privilege, but rather a necessity for future wireless networks. ML-driven wireless network designs will pave the way towards an unimaginably rich set of new network functions and


TABLE I
COMPARISON OF THIS TUTORIAL WITH EXISTING SURVEY AND TUTORIAL PAPERS. HERE, “CC”, “CR”, “DT”, “PL”, AND “DA” REFER TO CACHING AND COMPUTING, COGNITIVE RADIO NETWORK, DATA TRAFFIC DOMAIN, PHYSICAL LAYER DOMAIN, AND DATA ANALYTICS

wireless services. For instance, even though 5G networks may not be fully ML capable, we envision that the subsequent, sixth generation (6G) [39] of wireless cellular networks will surely integrate important tools from ML, as evidenced by the recent development of intelligent mobile networks proposed by Huawei [40] and the “big innovation house” held by Qualcomm [41]. As such, the question is no longer if ML tools are going to be integrated into wireless networks but rather when such an integration will happen. In fact, the importance of an ML-enabled wireless network has already been motivated by a number of recent wireless networking paradigms, such as mobile edge caching, context-aware networking, and mobile edge computing [42]–[49], the majority of which use ML techniques for various tasks such as user behavior analysis and predictions so as to determine which contents to cache and how to proactively allocate computing resources. However, despite their importance, these works have a narrow focus and do not provide any broad, tutorial-like material that can shed light on the challenges and opportunities associated with the use of ML for designing intelligent wireless networks.

B. Previous Works

A number of surveys and tutorials on ML applications in wireless networking have been published, such as, for example, in [3], [32], and [50]–[62]. Nevertheless, these works are limited in a number of ways. First, a majority of the existing works focus on a single ML technique (often the basics of deep learning [32], [50], and [56]–[58] or reinforcement learning [61]) and, as such, they do not capture the rich spectrum of available ML frameworks. Second, they mostly restrict their scope to a single wireless application such as sensor networks [53], cognitive radio networks [52], machine-to-machine (M2M) communication [3], physical layer design [32], software defined networking [55], Internet of Things [57], or self-organizing networks (SONs) [59], and, hence, they do not comprehensively cover the broad range of applications that can adopt ML in future networks. Third, a large number of the existing surveys and tutorials, such as [3], [51]–[53], [60], and [62],² are highly qualitative and do not provide an in-depth technical and quantitative description of the variety of existing ML tools that are suitable for wireless communications. Last, but not least, some surveys discuss the basics of neural networks with applications outside of wireless communications. However, these surveys are largely inaccessible to the wireless community, due to their reliance on examples from rather orthogonal disciplines such as computer vision. Moreover, most of the existing tutorials or surveys do not provide concrete guidelines on how, when, and where to use different ANN tools in the context of wireless networks. Finally, the introductory literature on ML for wireless networks, such as [3], [32], and [50]–[62], is largely sparse and fragmented and provides very scarce details on the role of ANNs, hence making it difficult to understand the intrinsic details of this broad and far-reaching area. Table I summarizes the difference between this tutorial and existing tutorials and surveys. From Table I, we can see that, compared to prior works such as [3], [32], and [50]–[62], our tutorial provides a more detailed exposition of several types of ANNs that are particularly useful for wireless applications and explains, pedagogically and in detail, how to develop ANN-based ML solutions that endow wireless networks with intelligence and realize the full potential of 5G systems and beyond.

² The main difference between our tutorial and [62] is that the authors in [62] do not investigate how a broad range of ANNs can be used for solving wireless communication problems related to drone-based communications, spectrum management with multiple radio access technologies, wireless virtual reality, mobile edge caching and computing, and the IoT.

C. Contributions

The main contribution of this paper is, thus, to provide a comprehensive tutorial on the topic of ANN-based ML for wireless network design. The overarching goal is to give a tutorial on the emerging research contributions, from ANNs and wireless communications, that address the major opportunities and challenges in developing ANN-based ML frameworks for understanding and designing intelligent wireless systems. To


Fig. 1. Organization of the tutorial.

the best of our knowledge, this is the first tutorial that gathers the state-of-the-art and emerging research contributions related to the use of ANNs for addressing a set of communication problems in beyond 5G wireless networks. Our main contributions include:
• We provide a comprehensive treatment of artificial neural networks, with an emphasis on how such tools can be used to create a new breed of ML-enabled wireless networks.
• After providing a brief introduction to the basics of ML, we provide a more detailed exposition of ANNs that are particularly useful for wireless applications, such as recurrent, spiking, and deep neural networks. For each type, we provide an introduction to its basic architecture and a specific use-case example. Other ANNs that can be used for wireless applications are also briefly mentioned where appropriate.
• Then, we discuss a broad range of wireless applications that can make use of ANNs. These applications include drone-based communications, spectrum management with multiple radio access technologies, wireless virtual reality, mobile edge caching and computing, and the IoT system, among others. For each application, we first outline the main rationale for applying ANNs while pinpointing illustrative scenarios. Then, we expose the challenges and opportunities brought forward by the use of ANNs in each specific wireless application. We complement this discussion with a detailed example drawn from the state-of-the-art and, then, we conclude by shedding light on the potential future works within each specific area.

The rest of this tutorial is organized as follows (see Fig. 1). In Section II, we introduce the basics of ANNs. Section III presents several key types of ANNs such as recurrent neural networks (RNNs), spiking neural networks (SNNs), and deep neural networks (DNNs). In Section IV, we discuss the use of ANNs for wireless communication along with the corresponding challenges and opportunities. Finally, conclusions are drawn in Section V.

II. ARTIFICIAL NEURAL NETWORKS: PRELIMINARIES

ML was born from pattern recognition and it is essentially based on the premise that intelligent machines should be able to learn from and adapt to their environment through experience [19]–[24]. Due to the ever growing volumes of generated data – across critical infrastructures, communication networks, and smart cities – and the need for intelligent data analytics, the use of ML algorithms has become ubiquitous [64] across many sectors, such as financial services, health care, technology, and entertainment. Using ML algorithms to build models that uncover connections and predict dynamic system features or human behavior, system operators can make intelligent decisions without any human intervention. For example, in a wireless system such as the IoT, ML tools can be used for intelligent data analytics and edge intelligence. ML tasks often depend on the nature of their training data. In ML, training is the process that teaches the machine learning framework how to achieve a specific goal, such as speech recognition. In other words, training enables the ML framework to discover potential relationships between the input data and the output data of this machine learning framework. There exist, in general, four key classes of learning approaches [65]: a) supervised learning, b) unsupervised learning, c) semi-supervised learning, and d) reinforcement learning.

Supervised learning algorithms are trained using labeled data [65]. When dealing with labeled data, both the input data and its desired output data are known to the system. Supervised learning is commonly used in applications that

Fig. 2. Summary of artificial neural networks.

have enough historical data. In contrast, the training of unsupervised learning tasks is done without labeled data [65]. The goal of unsupervised learning is to explore the data and infer some structure directly from the unlabeled data. Semi-supervised learning is used for the same applications as supervised learning but it uses both labeled and unlabeled data for training [65]. This type of learning can be used with methods such as classification, regression, and prediction. Semi-supervised learning is useful when the cost of a fully-labeled training process is relatively high. In contrast to the previously discussed learning methods that need to be trained with historical data, reinforcement learning (RL) is trained using data collected during the implementation of the RL algorithm itself [65]. The goal of RL is to learn an environment and find the best strategies for a given agent, in different environments. RL algorithms are particularly interesting in the context of wireless network optimization [66]. To perform supervised, unsupervised, semi-supervised, or RL tasks, several frameworks have been developed. Among those frameworks, ANNs [54] are arguably the most important, as they are able to mimic human intelligence.

ANNs are inspired by the structure and functional aspects of biological neural networks, which can learn from complicated or imprecise data [54]. Within the context of wireless communications, as will become clearer in later sections, ANNs can be used to investigate and predict network and user behavior so as to provide user information for solving diverse wireless networking problems such as cell association, spectrum management, computational resource allocation, and cached content replacement. Moreover, recent developments of smart devices and mobile applications have significantly increased the level at which human users interact with mobile systems. A trained ANN can be thought of as an “expert” in dealing with human-related data. Therefore, using ANNs to extract information from the user environment can provide a wireless network with the ability to predict the users’ future behaviors and, hence, to design an optimal strategy to improve the resulting QoS and reliability.

As seen in Fig. 2, there are various types of ANNs:
• Modular Neural Networks: A modular neural network (MNN) is composed of several independent ANNs and an intermediary. In an MNN, each ANN is used to complete one subtask of the entire task that the MNN is meant to perform. The intermediary is used to process the output of each independent ANN and generate the output of the MNN.
• Recurrent Neural Networks: RNNs are ANN architectures that allow neuron connections from a neuron in one layer to neurons in previous layers. Depending on the activation functions and connection methods used for the neurons, RNNs can be used to define several different architectures: a) stochastic neural networks, b) bidirectional neural networks (BNNs), c) fully recurrent neural networks (FRNNs), d) neural Turing machines (NTMs), e) long short-term memories (LSTMs), f) echo state networks, g) simple recurrent neural networks (SRNNs), and h) gated recurrent units (GRUs).
• Generative Adversarial Networks: Generative adversarial networks (GANs) consist of two neural networks. One neural network is used to learn a map from a latent space to a particular data distribution, while another neural network is used to discriminate between the true data distribution and the distribution mapped by the first neural network.


• Deep Neural Networks: All the ANNs that have multiple hidden layers are known as DNNs.
• Spiking Neural Networks: Spiking neural networks consist of spiking neurons that accurately mimic the biological neural networks.
• Feedforward Neural Networks: In a feedforward neural network (FNN), each neuron has incoming connections only from the previous layer and outgoing connections only to the next layer. FNNs can be used to define more advanced architectures such as: a) extreme learning machines (ELMs), b) convolutional neural networks (CNNs), c) time delay neural networks (TDNNs), d) autoencoders, e) probabilistic neural networks (PNUs), and f) radial basis functions (RBFs).
• Physical Neural Networks: In a physical neural network (PNN), an electrically adjustable resistance material is used to emulate the function of a neural activation.

Each type of ANN is suitable for a particular learning task. For instance, RNNs are effective in dealing with time-dependent data while SNNs are effective in dealing with continuous data. It should be noted that most of the data collected by wireless networks is time-dependent and continuous. In particular, in wireless networks, the user context and behavior, the wireless signals, and the wireless channel conditions are all time-dependent and continuous. RNNs and SNNs are effective in dealing with such collected data. They can exploit this data for various purposes, such as network control and user behavior predictions. However, since RNNs or SNNs can record only a limited amount of historical data, they may not be able to solve all of the wireless communication problems. To solve complex wireless problems that cannot be solved by shallow RNNs and SNNs, one can use DNNs, which have a high memory capacity for data analytics and can separate any complex problem that needs to be learned into a composition of several simpler problems, thus making the learning process effective. In consequence, in Section III, we specifically introduce RNNs, SNNs, and DNNs, which are most suited for wireless network use cases.

III. TYPES OF ARTIFICIAL NEURAL NETWORKS

In this section, we specifically discuss three types of ANNs: RNNs, SNNs, and DNNs, that have a promising potential for wireless network design, as will become clear in Section IV. For each kind of ANN, we briefly introduce its architecture, advantages, and properties. Then, we present specific example architectures.

A. Recurrent Neural Networks

1) Architecture of Recurrent Neural Networks: In a traditional ANN, it is assumed that all the inputs or all the outputs are independent from each other. However, for many tasks, the inputs (outputs) are related. For example, for predicting the mobility patterns of wireless devices, the input data, that is the users’ locations, are certainly related. To this end, recurrent neural networks [67], which are ANN architectures that allow neuron connections from a neuron in one layer to neurons in previous layers [67], as shown in Fig. 3, have been introduced. This seemingly simple change enables the output of a neural network to depend, not only on the current input, but also on the historical input, as shown in Fig. 4. This allows RNNs to make use of sequential information and exploit dynamic temporal behaviors such as those faced in mobility prediction or speech recognition. For example, an RNN can be used to predict the mobility patterns of mobile devices and wireless users. These patterns are related to the historical locations that the wireless users have visited. This task cannot be done in one step without combining historical locations from previous steps. Therefore, ANNs whose output depends only on the current input, such as FNNs, cannot perform such highly time-dependent tasks. A summary of the key advantages and disadvantages of RNNs for wireless applications is presented in Table II. Note that, in theory, RNNs can make use of historical information in arbitrarily long sequences, but in practice they are limited to only a subset of the historical information [68]. For training RNNs, the most commonly used algorithms include the backpropagation through time (BPTT) algorithm [69]. However, RNNs require more time to train compared to traditional ANNs (e.g., FNNs) since each value of the activation function depends on the series data recorded in RNNs. To reduce the training complexity of RNNs, one promising solution is to develop an RNN that needs to train only the output weight matrix. Next, we specifically introduce this type of RNN, named echo state networks (ESNs) [70].

Fig. 3. Recurrent neural network architecture.

2) Example RNN – Echo State Networks: ESNs are known to be a highly practical type of RNN due to their effective approach for training [71]. In fact, ESNs reinvigorated interest in RNNs [72] by making them accessible to wider audiences due to their apparent simplicity. In an ESN, the input weight matrix and the hidden weight matrix are randomly generated without any specific training. Therefore, an ESN needs to train only the output weight matrix. ESNs can, in theory, approximate any arbitrary nonlinear dynamical system with any arbitrary precision, they have an inherent temporal processing capability, and are therefore a very powerful enhancement of the linear blackbox modeling techniques in the nonlinear domain. Due to the ESN’s appealing properties such as training simplicity and ability to record historical information, it has been widely applied for supervised learning tasks, RL tasks, classification, and regression. In wireless networks, ESNs have been applied for various natural applications, such as content

Fig. 4. Architecture of an unfolded recurrent neural network.

TABLE II
SUMMARY OF THE ADVANTAGES AND DISADVANTAGES OF ANNS FOR WIRELESS APPLICATIONS

prediction, resource management, and mobility pattern estimation, as will become clear in Section IV. Next, the specific architecture and training methods for ESNs are introduced.

• Architecture of an Echo State Network: ESNs use an RNN architecture with only one hidden layer (deep generalizations of ESNs also exist [73]). We define the input vector of an ESN as $\boldsymbol{x}_t = [x_{t,1}, \ldots, x_{t,N_{\mathrm{in}}}]^{\mathrm{T}}$ and the output vector of an ESN as $\boldsymbol{y}_t = [y_{t,1}, \ldots, y_{t,N_{\mathrm{out}}}]^{\mathrm{T}}$. An ESN model consists of the input weight matrix $\boldsymbol{W}^{\mathrm{in}} \in \mathbb{R}^{N \times N_{\mathrm{in}}}$, the recurrent weight matrix $\boldsymbol{W} \in \mathbb{R}^{N \times (N+1)}$, the leaking rate $\alpha$, and the output weight matrix $\boldsymbol{W}^{\mathrm{out}} \in \mathbb{R}^{N_{\mathrm{out}} \times (1+N+N_{\mathrm{in}})}$, where $N$ is the number of neurons in the hidden layer. The leaking rate $\alpha$ must be chosen to match the speed of the dynamics of the hidden states $\boldsymbol{s}_t = [s_{t,1}, \ldots, s_{t,N}]^{\mathrm{T}}$, where $s_{t,i}$ represents the state of neuron $i$ at time $t$, and of the output $\boldsymbol{y}_t$. To allow ESNs to store historical information, the hidden state $\boldsymbol{s}_t$ should satisfy the so-called echo state property, which means that the hidden state $\boldsymbol{s}_t$ should be uniquely defined by the fading history of the input $\boldsymbol{x}_0, \boldsymbol{x}_1, \ldots, \boldsymbol{x}_t$. In contrast to traditional ANNs, such as FNNs, that need to adjust the weight values of the neurons in the hidden layers, ESNs only need to guarantee the echo state property. Typically, in order to guarantee the echo state property of an ESN, the spectral radius of $\boldsymbol{W}$ should be smaller than 1. The setting of other ESN components that is needed to guarantee the echo state property and to optimize ESN performance can be found in [70].

Having described the main components of ESNs, we now describe the activation value of each neuron. Even though the input and the hidden weight matrices are fixed (randomly generated), all the neurons of an ESN will have their own activation values (hidden state). As opposed to the classical RNNs in which the hidden state depends only on the current input, in ESNs, the hidden state will be given by:

$$\tilde{\boldsymbol{s}}_t = f\left(\boldsymbol{W}[1; \boldsymbol{s}_{t-1}] + \boldsymbol{W}^{\mathrm{in}} \boldsymbol{x}_t\right), \tag{1}$$
$$\boldsymbol{s}_t = (1 - \alpha)\,\boldsymbol{s}_{t-1} + \alpha\,\tilde{\boldsymbol{s}}_t, \tag{2}$$

where $f(x) = \frac{e^{x} - e^{-x}}{e^{x} + e^{-x}}$ and $[\cdot\,; \cdot]$ represents a vertical vector (or matrix) concatenation. The model is also sometimes used without the leaky integration, which is a special case for $\alpha = 1$ yielding $\tilde{\boldsymbol{s}}_t = \boldsymbol{s}_t$. From (1), we can see that the scaling of $\boldsymbol{W}^{\mathrm{in}}$ and $\boldsymbol{W}$ determines how much the current state $\boldsymbol{s}_t$ depends on the current input $\boldsymbol{x}_t$ and how much on the previous state $\boldsymbol{s}_{t-1}$. Here, a feedback connection from $\boldsymbol{y}_{t-1}$ to $\boldsymbol{s}_t$ can be applied to the ESNs, defined as a weight matrix $\boldsymbol{W}^{\mathrm{fb}} \in \mathbb{R}^{N \times N_{\mathrm{out}}}$. Hence, (1) can be rewritten as $\tilde{\boldsymbol{s}}_t = f\left(\boldsymbol{W}[1; \boldsymbol{s}_{t-1}] + \boldsymbol{W}^{\mathrm{in}} \boldsymbol{x}_t + \boldsymbol{W}^{\mathrm{fb}} \boldsymbol{y}_{t-1}\right)$.
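As an illustration, the following is a minimal NumPy sketch of the state update in (1) and (2), together with the linear readout and ridge-regression fit that the training discussion below arrives at in (3) and (4). It is a sketch under stated assumptions rather than the authors' implementation: the function names, the uniform weight range, the target spectral radius of 0.9, the leaking rate of 0.3, the washout length, and the sine-wave toy task are all hypothetical choices. Since W is N x (N+1), the sketch rescales only its square recurrent part (the columns that multiply the previous state), which is one common way to apply the spectral-radius heuristic.

```python
import numpy as np

def make_reservoir(n_in, n_res, spectral_radius=0.9, seed=0):
    """Randomly generate the fixed (untrained) input and recurrent weights."""
    rng = np.random.default_rng(seed)
    w_in = rng.uniform(-0.5, 0.5, size=(n_res, n_in))     # N x N_in
    w = rng.uniform(-0.5, 0.5, size=(n_res, n_res + 1))   # N x (N+1); column 0 is the bias
    # Assumption: rescale only the square recurrent part so that its spectral
    # radius is below 1, a common heuristic for the echo state property.
    rho = np.max(np.abs(np.linalg.eigvals(w[:, 1:])))
    w[:, 1:] *= spectral_radius / rho
    return w_in, w

def collect_features(w_in, w, inputs, alpha=0.3):
    """Run (1)-(2) over an input sequence and return the rows [1; s_t; x_t]."""
    s = np.zeros(w.shape[0])
    rows = []
    for x in inputs:
        s_tilde = np.tanh(w @ np.concatenate(([1.0], s)) + w_in @ x)  # (1)
        s = (1.0 - alpha) * s + alpha * s_tilde                        # (2)
        rows.append(np.concatenate(([1.0], s, x)))
    return np.array(rows)                                   # T x (1+N+N_in)

def train_readout(features, targets, theta=1e-6):
    """Ridge regression (4): W_out = Y Z^T (Z Z^T + theta I)^-1, with Z = features^T."""
    z, y = features.T, targets.T
    gram = z @ z.T + theta * np.eye(z.shape[0])
    # gram is symmetric, so solving gram @ W_out^T = Z Y^T avoids an explicit inverse.
    return np.linalg.solve(gram, z @ y.T).T                 # N_out x (1+N+N_in)

# Toy usage: one-step-ahead prediction of a sine wave.
t = np.arange(600)
x_seq = np.sin(0.2 * t)[:, None]
y_seq = np.sin(0.2 * (t + 1))[:, None]
w_in, w = make_reservoir(n_in=1, n_res=50)
feats = collect_features(w_in, w, x_seq)
washout = 50                                   # discard the initial transient
w_out = train_readout(feats[washout:], y_seq[washout:])
pred = feats[washout:] @ w_out.T               # (3) applied to every time step
mse = float(np.mean((pred - y_seq[washout:]) ** 2))
print(f"training MSE: {mse:.2e}")
```

Only `w_out` is fitted here; `w_in` and `w` stay as generated, which is the property that makes ESN training a single linear solve instead of gradient-based learning through time.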


Based on the hidden state $\boldsymbol{s}_t$, the output signal of the ESN can be given by:

$$\boldsymbol{y}_t = \boldsymbol{W}^{\mathrm{out}} [1; \boldsymbol{s}_t; \boldsymbol{x}_t]. \tag{3}$$

Here, an additional nonlinearity can be applied to (3), i.e., $\boldsymbol{y}_t = \tanh\left(\boldsymbol{W}^{\mathrm{out}} [1; \boldsymbol{s}_t; \boldsymbol{x}_t]\right)$.

• Training in Echo State Networks: The training process in ESNs seeks to minimize the mean square error (MSE) between the targeted output and the actual output. When this MSE is minimized, the actual output will be the target output, which can be given by $\boldsymbol{y}_t^{\mathrm{D}} = \boldsymbol{W}^{\mathrm{out}} [1; \boldsymbol{s}_t; \boldsymbol{x}_t]$ where $\boldsymbol{y}_t^{\mathrm{D}}$ is the targeted output. Therefore, the training goal is to find an optimal $\boldsymbol{W}^{\mathrm{out}}$ such that $\boldsymbol{W}^{\mathrm{out}} [1; \boldsymbol{s}_t; \boldsymbol{x}_t] = \boldsymbol{y}_t^{\mathrm{D}}$. In contrast to conventional RNNs that require gradient-based learning algorithms to adjust the input, the hidden, and the output weight matrices, ESNs only need to train the output weight matrix with simple training methods. The most universal and stable solution to this problem is via the so-called ridge regression approach, also known as regression with Tikhonov regularization [74], which is given by:

$$\boldsymbol{W}^{\mathrm{out}} = \boldsymbol{y}_t^{\mathrm{D}} [1; \boldsymbol{s}_t; \boldsymbol{x}_t]^{\mathrm{T}} \left( [1; \boldsymbol{s}_t; \boldsymbol{x}_t][1; \boldsymbol{s}_t; \boldsymbol{x}_t]^{\mathrm{T}} + \theta \boldsymbol{I} \right)^{-1}, \tag{4}$$

where $\boldsymbol{I}$ is an identity matrix and $\theta$ is a regularization coefficient which should be selected individually for a concrete reservoir based on validation data. When $\theta = 0$, ridge regression reduces to regular linear regression. However, ridge regression is an offline training method for ESNs. In fact, ESNs can also be trained by using online methods such as the least mean squares (LMS) algorithm [75], or the recursive least squares (RLS) algorithm [76].

B. Spiking Neural Networks
Another important type of ANNs is the so-called spiking neural networks. In contrast to FNNs and RNNs that simply use a single value to capture the activations of neurons, SNNs use a more accurate model of biological neural networks to denote the activations of neurons. In the following, we first briefly introduce the architecture of SNNs. Then, we give an example for SNNs, the so-called liquid state machine.

1) Architecture of a Spiking Neural Network: The architecture of SNNs is similar to the neurons in biological neural networks. Therefore, we first discuss how neurons operate in a real-world biological neural network. Then, we discuss the model of neurons in SNNs.

In biological neural networks, neurons use spikes to communicate with each other. Incoming signals alter the voltage of a neuron and when the voltage exceeds a threshold value, the neuron sends out an action potential, which is a short (1 ms) and sudden increase in voltage that is created in the cell body or soma. Due to the form and nature of this process, we refer to it as a spike or a pulse. For SNNs, the use of such spikes can significantly improve the dynamics of the network. Therefore, SNNs can model a central nervous system and study the operation of biological neural circuits. Since the neurons in SNNs are modeled based on the spike, SNNs have two major advantages over traditional ANNs: fast real-time decoding of signals and high information carriage capacity by adding a temporal dimension. Therefore, an SNN can use fewer neurons to accomplish the same task compared to traditional ANNs and it can also be used for real-time computations on continuous streams of data, which means that both the inputs and outputs of an SNN are streams of data in continuous time. However, the training of SNNs is more challenging (and potentially more time-consuming) than that of traditional ANNs due to their complex spiking neural models. A summary of the key advantages and disadvantages of SNNs for wireless applications is presented in Table II. To reduce the training complexity of SNNs and keep the dynamics of spiking neurons, one promising solution is to develop a spiking neural network that needs to train only the output weight matrix, like ESNs in RNNs. Next, we present this type of SNNs, named liquid state machine.

Fig. 5. Architecture of an LSM.

2) Example SNN - Liquid State Machine (LSM): The architecture of an LSM consists of only two components: the liquid and the readout function, as shown in Fig. 5. Here, the liquid represents a spiking neural network with leaky-integrate-and-fire (LIF) model neurons and the readout function is a number of FNNs. For an LSM, the connections between the neurons in the liquid are randomly generated, allowing the LSM to possess a recurrent nature that turns the time-varying input into a spatio-temporal pattern. In contrast to the general SNNs that need to adjust the weight values of all neurons, LSMs need to train only the comparatively simple FNN of the readout function. In particular, simple training methods for FNNs such as the backpropagation algorithm can be used for training LSMs to minimize the errors between the desired output signal and the actual output signal, which enables LSMs to be widely applied for practical applications such as [77] and [78]. Due to the LSM's spiking neurons, it can perform ML tasks on continuous data like general SNNs, but it is also possible to train it using effective and simple algorithms. In wireless networks, this can be suitable for signal detection and nonlinear audio prediction. Next, we introduce the LSM architecture.

• Liquid Model: In an LSM, the liquid is composed of a large number of spiking LIF model neurons, located in a virtual three-dimensional column. The liquid has two important functions in the classification of time-series data. First, its fading memory is responsible for collecting and integrating the input signal over time. Each one of the neurons in the liquid keeps its own state, which gives the liquid a strong fading memory. The activity in the network and the actual firing of the neurons

can also last for a while after the signal has ended, which can be viewed as another form of memory. Second, in the liquid of an LSM, the different input signals are separated, allowing for the readout to classify them. This separation is hypothesized to happen by increasing the dimensionality of the signal. For example, if the input signal has 20 input channels, this is transformed into 135 (3 × 3 × 15) signals and states of neurons in the liquid. For every pair of input signal and liquid neuron, there is a certain chance of being connected, e.g., 30% in [79]. The connections between the neurons are allocated in a stochastic manner (e.g., see [79, Appendix B]). All neurons in a liquid will connect to the readout functions.

• Readout Model: The readout of an LSM consists of one or more FNNs that use the activity state of the liquid to approximate a specific function. The purpose of the readout is to build the relationship between the dynamics of the spiking neurons and the desired output signals. The inputs of the readout networks are called readout-moments. These are snapshots of the liquid activity taken at a regular interval. Whatever measure is used, the readout represents the state of the liquid at some point in time. In general, in an LSM, FNNs are used as the readout function. FNNs will use the liquid dynamics (i.e., spikes) as their input and the desired output signals as their output. Then, the readout function can be trained using traditional training methods used for FNNs, mainly backpropagation. Once the readout function has been trained, the LSM can be used to perform the corresponding tasks.

Fig. 6. Architecture of a DNN.

C. Deep Neural Networks
Thus far, all of the discussed ANNs, including ESNs and LSMs, have assumed a single hidden layer. Such an architecture is typically referred to as a shallow ANN. In contrast, a deep neural network is an ANN with multiple hidden layers between the input and the output layers [80], as shown in Fig. 6. Therefore, a DNN models high-level abstractions in data through multiple nonlinear transformations to learn multiple levels of representation and abstraction [80]. Several types of DNNs exist such as deep CNNs, deep ESNs, deep LSMs, and LSTM [80]. The main reasons that have enabled a paradigm shift from conventional, shallow ANNs towards DNNs include recent advances in computing capacity due to the emergence of capable processing units, the wide availability of data for DNN training, and the emergence of effective DNN training algorithms [81]. As opposed to shallow ANNs that have only one hidden layer, a DNN having multiple layers is more beneficial due to the following reasons:
• Number of neurons: Generally, a shallow ANN would require a lot more neurons than a DNN for the same level of performance. In fact, the number of units in a shallow ANN grows exponentially with the complexity of the task.
• Task learning: Although shallow ANNs can be effective to solve small-scale problems, they can fall short when dealing with more complex problems, such as wireless environment mapping. In fact, the main issue is that shallow ANNs are very good at memorization, but not so good at generalization. As such, DNNs are more suitable for many real-world tasks which often involve complex problems that are solved by decomposing the function that needs to be learned into several simpler functions so as to improve the efficiency of the learning process.
It is worth noting that, although DNNs have a large capacity to model a high degree of nonlinearity in the input data, a major challenge is that of overfitting. In DNNs, overfitting becomes particularly acute due to the presence of a very large number of parameters. To overcome this issue, several advanced regularization approaches, such as dataset augmentation and weight decay [82], have been proposed. These methods modify the learning algorithm so that the test error is reduced, possibly at the expense of increased training error. A summary of key advantages and disadvantages of DNNs for wireless applications is presented in Table II.

Next, we elaborate more on LSTM, a special kind of DNN that is capable of storing information for long periods of time by using an identity activation function for the memory cell. This, in turn, makes LSTM suitable for various wireless communication problems such as channel selection.

1) Example DNN - Long Short Term Memory: LSTMs, which typically consist of three hidden layers, are a special kind of "deep learning" RNNs that are capable of storing information for either long or short periods of time. In particular, the activations of an LSTM network correspond to short-term memory, while the weights correspond to long-term memory. Therefore, if the activations can preserve information over long periods of time, then this makes them long short-term memory. Although both ESNs and LSTMs are good at modeling time series data, LSTM cells have the capability of dealing with long term dependencies. An LSTM contains LSTM units, each of which has a cell with a state $\boldsymbol{c}_t$ at time $t$. Access to this memory unit, as shown in Fig. 7, for reading or modifying information is controlled via three gates:
• Input gate ($\boldsymbol{i}_t$): controls whether the input is passed on to the memory cell or ignored.
• Output gate ($\boldsymbol{o}_t$): controls whether the current activation vector of the memory cell is passed on to the output layer or not.
• Forget gate ($\boldsymbol{g}_t$): controls whether the activation vector of the memory cell is reset to zero or maintained.
Therefore, an LSTM cell makes decisions about what to store, and when to allow reads, writes, and erasures, via gates that open and close. At each time step $t$, an LSTM receives


inputs from two external sources, the current frame $\boldsymbol{x}_t$ and the previous hidden states of all LSTM units in the same layer $\boldsymbol{s}_{t-1}$, at each of the four terminals (the three gates and the input). These inputs get summed up, along with the bias factors $\boldsymbol{b}_f$, $\boldsymbol{b}_i$, $\boldsymbol{b}_o$, and $\boldsymbol{b}_c$. The gates are activated by passing their total input through the logistic functions. Table III summarizes the various behaviors that an LSTM cell can achieve depending on the values of the input and the forget gates. Moreover, the update steps of a layer of LSTM units are summarized in the following equations:

$$\boldsymbol{g}_t = f_g\left(\boldsymbol{W}_f \boldsymbol{x}_t + \boldsymbol{U}_f \boldsymbol{s}_{t-1} + \boldsymbol{b}_f\right), \tag{5}$$
$$\boldsymbol{i}_t = f_g\left(\boldsymbol{W}_i \boldsymbol{x}_t + \boldsymbol{U}_i \boldsymbol{s}_{t-1} + \boldsymbol{b}_i\right), \tag{6}$$
$$\boldsymbol{o}_t = f_g\left(\boldsymbol{W}_o \boldsymbol{x}_t + \boldsymbol{U}_o \boldsymbol{s}_{t-1} + \boldsymbol{b}_o\right), \tag{7}$$
$$\boldsymbol{c}_t = \boldsymbol{g}_t \odot \boldsymbol{c}_{t-1} + \boldsymbol{i}_t \odot f_c\left(\boldsymbol{W}_c \boldsymbol{x}_t + \boldsymbol{U}_c \boldsymbol{s}_{t-1} + \boldsymbol{b}_c\right), \tag{8}$$
$$\boldsymbol{s}_t = \boldsymbol{o}_t \odot f_h(\boldsymbol{c}_t), \tag{9}$$

where $\boldsymbol{g}_t$, $\boldsymbol{i}_t$, and $\boldsymbol{o}_t$ are the forget, the input, and the output gate vectors at time $t$, respectively. $\boldsymbol{x}_t$ is the input vector, $\boldsymbol{s}_t$ is the hidden/output vector, and $\boldsymbol{c}_t$ is the cell state vector (i.e., internal memory) at time $t$. $\boldsymbol{W}_f$ and $\boldsymbol{U}_f$ are the weight and transition matrices of the forget gate, respectively. $\boldsymbol{W}_i$ and $\boldsymbol{U}_i$ are the weight and transition matrices of the input gate, respectively. $\boldsymbol{W}_o$ and $\boldsymbol{U}_o$ are the weight and transition matrices of the output gate, respectively. $\boldsymbol{W}_c$ and $\boldsymbol{U}_c$ are the weight and transition matrices of the cell state, respectively. $f_g$, $f_c$, and $f_h$ are the activation functions: $f_g$ is the sigmoid function, while $f_c$ and $f_h$ are tanh functions. $\odot$ denotes the Hadamard product. Compared to a standard RNN, LSTM uses additive memory updates and separates the memory $\boldsymbol{c}$ from the hidden state $\boldsymbol{s}$, which interacts with the environment when making predictions. To train an LSTM network, the stochastic gradient descent algorithm can be used.

Fig. 7. Architecture of an LSTM as shown in [83].

TABLE III
VARIOUS BEHAVIORS OF AN LSTM CELL

Finally, another important type of DNNs is the so-called convolutional neural network that was recently proposed for analyzing visual imagery [84]. CNNs are essentially a class of deep FNNs. In CNNs, the hidden layers have neurons arranged in three dimensions: width, height, and depth. These hidden layers are either convolutional, pooling, or fully connected, and, hence, if one hidden layer is convolutional (pooling/fully connected), then it is called a convolutional (pooling/fully connected) layer. The convolutional layers apply a convolution operation to the input, passing the result to the next layer. The pooling layers are mainly used to simplify the information from the convolutional layer, while fully connected layers connect every neuron in one layer to every neuron in another layer. As opposed to LSTMs, which are good at temporal modeling, CNNs are good at reducing frequency variations, which makes them suitable for applications that deal with spatial data such as interference identification in wireless networks [85]. Moreover, CNNs can be combined with LSTM, resulting in a CNN LSTM architecture that can be used for sequence prediction problems with spatial inputs, like images or videos [86].

In summary, different types of ANNs will have different architectures, activation functions, connection methods, and data storage capacities. Each specific type of ANNs is suitable for dealing with a particular type of data. For example, RNNs are good at dealing with time-related data while SNNs are good at dealing with continuous data. Moreover, each type of ANNs has its own advantages and disadvantages in terms of learning tasks, specific tasks such as time-related tasks or space-related tasks, training data size, training time, and data storage space. Given all of their advantages, ANNs are ripe to be exploited in a diverse spectrum of applications in wireless networking, as discussed in the following section.

IV. APPLICATIONS OF NEURAL NETWORKS IN WIRELESS COMMUNICATIONS

In this section, we first overview the motivation behind developing ANN solutions for wireless communications and networking problems. Then, we introduce the use of ANNs

for various wireless applications. In particular, we discuss

how to use ANNs for unmanned aerial vehicles (UAVs), wire-
less virtual reality (VR), mobile edge caching and computing,
multiple radio access technologies, and the IoT.
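Before turning to these applications, the LSTM update steps (5)-(9) from Section III-C can be made concrete with a minimal NumPy sketch of a single time step for one layer of LSTM units. This is an illustrative sketch under stated assumptions, not a reference implementation: the function names, the parameter dictionary layout, and the Gaussian initialization are hypothetical, and the gate activation is assumed to be the sigmoid with tanh used for the cell input and the cell output.

```python
import numpy as np

def sigmoid(z):
    """Logistic function used for the three gates."""
    return 1.0 / (1.0 + np.exp(-z))

def init_params(n_in, n_units, seed=0):
    """Hypothetical initialization: small random weights, zero biases."""
    rng = np.random.default_rng(seed)
    params = {}
    for gate in "fioc":  # forget, input, output, cell
        params[f"W_{gate}"] = rng.normal(0.0, 0.1, (n_units, n_in))
        params[f"U_{gate}"] = rng.normal(0.0, 0.1, (n_units, n_units))
        params[f"b_{gate}"] = np.zeros(n_units)
    return params

def lstm_step(x_t, s_prev, c_prev, p):
    """One update of a layer of LSTM units, following (5)-(9)."""
    g_t = sigmoid(p["W_f"] @ x_t + p["U_f"] @ s_prev + p["b_f"])   # forget gate (5)
    i_t = sigmoid(p["W_i"] @ x_t + p["U_i"] @ s_prev + p["b_i"])   # input gate (6)
    o_t = sigmoid(p["W_o"] @ x_t + p["U_o"] @ s_prev + p["b_o"])   # output gate (7)
    cand = np.tanh(p["W_c"] @ x_t + p["U_c"] @ s_prev + p["b_c"])  # candidate cell input
    c_t = g_t * c_prev + i_t * cand                                # additive memory update (8)
    s_t = o_t * np.tanh(c_t)                                       # hidden/output state (9)
    return s_t, c_t

# Usage: run a short random input sequence through a layer of 4 units.
params = init_params(n_in=3, n_units=4)
s, c = np.zeros(4), np.zeros(4)
for x in np.random.default_rng(1).normal(size=(5, 3)):
    s, c = lstm_step(x, s, c, params)
print(s.shape, c.shape)
```

The elementwise products `*` play the role of the Hadamard product in (8) and (9). Driving the forget gate towards 1 and the input gate towards 0 (for example, through large positive and negative biases) leaves the cell state essentially unchanged, which is how the cell can hold information over many time steps.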

A. Artificially Intelligent Wireless Networks Using ANNs: An Overview
Recently, ANNs have started to attract significant attention in the context of wireless communications and networking [4], [25], [32], since the development of smart devices and mobile applications has significantly increased the autonomy of a wireless network, as well as the level at which human users interact with the wireless system. Moreover, the development of mobile edge computing and caching technologies makes it possible for base stations (BSs) to store and analyze the behavior of the users of a wireless network. In addition, the emergence of the Internet of Things further motivates the use of ANNs to improve the way in which wireless data is processed, collected, and used for various sensing and autonomy purposes.

In essence, within wireless communication domains, ANNs admit two major applications. First, they can be used for prediction, inference, and intelligent and predictive data analytics purposes. Within this application domain, ANN-based ML algorithms enable the wireless network to learn from the datasets generated by its users, environment, and network devices. For instance, ANNs can be used to analyze and predict the wireless users' mobility patterns and content requests, therefore allowing the BSs to optimize the use of their resources, such as frequency, time, or the files that will be cached across the network. Moreover, predictions and inference will be a primary enabler of the emerging IoT and smart cities paradigms. Within an IoT or within a smart city ecosystem, sensors will generate massive volumes of data that can be used by the wireless network to optimize its resource usage, understand its network operation, monitor failures, or simply deliver smart services, such as intelligent transportation. In this regard, the use of ANNs for optimized predictions is imperative. In fact, ANNs will equip the network with the capability to process massive volumes of data and to parse useful information out of this data, as a precursor to delivering smart city services. For example, road traffic data gathered from IoT sensors can be processed using ANN tools to predict the road traffic status at various locations in the city. This can then be used by the wireless network that connects road traffic signals, apparatus, and autonomous/connected vehicles to inform the vehicles of the traffic state and to potentially re-route some traffic to respond to the current state of the system. Furthermore, ANNs can be beneficial for integrating different data from multiple sensors, thus facilitating more interesting and complex wireless communication applications. In particular, ANNs can identify nonintuitive features, largely from cross-sensor correlations, which can result in a more accurate estimation of a wireless network's conditions and an efficient allocation of the available resources. Finally, a wireless network can use ANNs to learn about faults, infrastructure failure, and other disruptive events, so as to improve its resilience to such events.

Second, a key application of ANNs in wireless networks is for enabling self-organizing network operation by instilling ANN-based ML at the edge of the network, as well as across its various components (e.g., base stations and end-user devices). Such edge intelligence is a key enabler of self-organizing solutions for resource management, user association, and data offloading. In this context, ANNs can serve as RL tools [87] that can be used by a wireless network's devices to learn the wireless environment and to make intelligent decisions. An ANN-based RL algorithm can also be used to learn the users' information, such as their locations and data rate, and determine the UAV's path based on the learned information. Traditional learning algorithms, such as Q-learning, that use tables or matrices to record historical data, do not scale well for dense wireless networks. On the other hand, ANNs use a nonlinear function approximation method to find the relationship using historical information. Therefore, ANN-based RL algorithms can learn complex relationships between wireless users and their networking environments to provide solutions for the notoriously challenging problems of network performance optimization and resource management.

ANNs can be simultaneously employed for both prediction and intelligent/self-organizing operation, for scenarios in which these two functions are largely interdependent. For instance, data can help in decision making, while decision making can generate new data. For example, when considering virtual reality applications over wireless networks, one can use ANNs to predict the behavior of users, such as head movement and content requests. These predictions can help an ANN-based RL algorithm to allocate computational and spectral resources to the users, hence improving their QoS. Next, we discuss specific applications that use ANNs for wireless communications.

Fig. 8. UAV-enabled wireless networks. In this figure, UAVs can be used as BSs to serve users in hotspot areas due to special events such as a sports game or a disaster scenario.

B. Wireless Communications and Networking With Unmanned Aerial Vehicles
1) UAVs for Wireless Communications: Providing connectivity from the sky to ground wireless users is an emerging trend in wireless networking [6] (see Fig. 8). Compared to terrestrial communications, a wireless system with low-altitude UAVs is faster to deploy, more flexibly reconfigured, and likely


to experience better communication channels due to the presence of line-of-sight (LoS) links. The use of highly mobile and energy-constrained UAVs for wireless communications also introduces many new challenges [6], such as the need for network modeling, backhaul (fronthaul) limitations for UAV-to-UAV communication when the UAVs act as flying BSs, optimal deployment, air-to-ground channel modeling, energy efficiency, path planning, and security. In particular, compared to the deployment of terrestrial BSs that are static, mostly long-term, and two-dimensional, the deployment of UAVs is flexible, short-term, and three-dimensional. Therefore, there is a need to investigate the optimal deployment of UAVs for coverage extension and capacity improvement. Moreover, UAVs can be used for data collection, delivery, and transmitting telematics. Hence, there is a need to develop intelligent self-organizing control algorithms to optimize the flying path of UAVs. In addition, the scarcity of the wireless spectrum, which is already heavily used for terrestrial networks, is also a big challenge for UAV-based wireless communication. Due to the UAVs' channel characteristics (less blockage and high probability for LoS link), the use of mmWave spectrum bands and visible light [88] will be a promising solution for UAV-based communication. Therefore, one can consider resource management problems in the context of mmWave-equipped UAVs, given their potential benefits for air-to-ground communications. Finally, one must also consider the problems of resource allocation, interference management, and routing when the UAVs act as users of the ground wireless network.

2) Neural Networks for UAV-Based Wireless Communication: Due to the flying nature of UAVs, they can track the users' behavior and collect information related to the users and the UAVs within any distance, at any time or any place, which provides an ideal setting for implementing ANN techniques. ANNs have two major use cases for UAV-based wireless communication. First, using ANN-centric RL algorithms, UAVs can be operated in a self-organizing manner. For instance, using ANN-based RL, UAVs can dynamically adjust their locations, flying directions, resource allocation decisions, and path planning to serve their ground users and adapt to the users' dynamic environment. Second, UAVs can be used to map the ground environment as well as the wireless environment itself to collect data and take advantage of ANN algorithms to exploit the collected data and perform data analytics to predict the ground users' behavior. For example, ANNs can exploit the collected mobility data to predict the users' mobility patterns. Based on the behavioral patterns of the users, battery-limited UAVs can determine their optimal locations and design an optimal flying path to service ground users. Meanwhile, using ANNs enables more advanced UAV applications such as environment identification. Clearly, within a wireless environment, most of the data of interest, such as that pertaining to human behavior, UAV movement, and data collected from wireless devices, will be time related. For instance, certain users will often go to the same office for work at the same time during weekdays. ANNs can effectively deal with time-dependent data, which makes them a natural choice for the applications of UAV-based wireless communication.

Using ANNs for UAVs faces many challenges, such as the limited flight time to collect data, the limited power and computational resources for training ANNs, as well as possible data errors due to the air-to-ground channel. First, the limited battery life and the limited computational power of UAVs can significantly constrain the use of ANNs. This stems from the fact that ANNs require a non-negligible amount of time and computational resources for training. For instance, UAVs must consider a tradeoff between the energy used for training ANNs and that used for other applications such as servicing users. Moreover, due to their flight time constraints [89], UAVs can only collect data within a limited time period. In consequence, UAVs may not have enough collected data for training ANNs. In addition, the air-to-ground channels of UAVs will be significantly affected by the weather, the environment, and their movement. Therefore, the collected data can include errors that may affect the accuracy of the outcomes of the ANNs.

The existing literature has studied a number of problems related to using ANNs for UAVs [90]–[97]. In [90], the authors used a deep RL algorithm to efficiently control the coverage and connectivity of UAVs. The authors in [91] studied the use of ANNs for UAV assignment to meet the high traffic demands of ground users. The work in [92] investigated the use of ANNs for UAV detection. In [93], the authors studied the use of ANNs for trajectory tracking of UAVs. The work in [94] proposed a multilayer perceptron based learning algorithm that uses aerial images and aerial geo-referenced images to estimate the positions of UAVs. In [95], an ESN-based RL algorithm is proposed for resource allocation in UAV-based networks. In [97], we proposed an RL algorithm that uses LSM for resource allocation in a UAV-based LTE over the unlicensed band (LTE-U) network. For UAV-based wireless communications, ANNs can also be used for many applications such as path planning [98], as mentioned previously. Next, we explain a specific ANN application for UAV-based wireless communication.

3) Example: An elegant and interesting use of ANNs for UAV-based communication systems is presented in [96] for the study of the proactive deployment of cache-enabled UAVs. The model in [96] considers the downlink of a wireless cloud radio access network (CRAN) servicing a set of mobile users via terrestrial remote radio heads (RRHs) and flying cache-enabled UAVs. The terrestrial RRHs transmit over the cellular band and are connected to the cloud's pool of baseband units (BBUs) via capacity-constrained fronthaul links. Since each user has its own QoE requirement, the capacity-constrained fronthaul links will directly limit the data rate of the users that request content from the cloud. Therefore, the cache-enabled UAVs are introduced to service the mobile users along with terrestrial RRHs. Each cache-enabled UAV can store a limited number of popular contents that the users request. By caching the predicted content, the transmission delay from the content server to the UAVs can be significantly reduced as each UAV can directly transmit its stored content to the users.

A realistic model for periodic, daily, and pedestrian mobility patterns is considered according to which each user will regularly visit a certain location of interest. The QoE of each

Fig. 9. Mobility pattern predictions of the conceptor ESN algorithm [96]. In this figure, the green curve represents the conceptor ESN prediction, the black curve is the real positions, the top rectangle j is the index of the mobility pattern learned by the ESN, the legend on the bottom left shows the total reservoir memory used by the ESN, and the legend on the bottom right shows the normalized root mean square error of each mobility pattern prediction.

user is formally defined as a function of each user’s data rate, delay, and device type. The impact of the device type on the QoE is captured by the screen size of each device. The screen size will also affect the QoE perception of the user, especially for video-oriented applications. The goal of [96] is to find an effective deployment of cache-enabled UAVs to satisfy the QoE requirements of each user while minimizing the transmit powers of the UAVs. This problem involves predicting, for each user, the content request distribution and the periodic locations, finding the optimal contents to cache at the UAVs, determining the users’ associations, as well as adjusting the locations and transmit powers of the UAVs. ANNs can be used for the prediction tasks due to their effectiveness in dealing with time-varying data (e.g., mobility data). Moreover, ANNs can extract the relationships between the user locations and the users’ context information such as gender, occupation, and age. In addition, ANN-based RL algorithms can find the relationship between the UAVs’ location and the data rate of each user, enabling UAVs to find the locations that maximize the users’ data rates.

A prediction algorithm using the framework of ESN with conceptors is developed to find the users’ content request distributions and their mobility patterns. The predictions of the users’ content request distribution and their mobility patterns are then used to find the user-UAV association, optimal locations of the UAVs and content caching at the UAVs. Since the data of the users’ behaviors such as mobility and content request are time-related, an ESN-based approach, as previously discussed in Section III-A2, can quickly learn the mobility pattern and content request distributions without requiring significant training data. Conceptors, defined in [99], enable an ESN to perform a large number of predictions of mobility and content request patterns. Moreover, new patterns can be added to the reservoir of the ESN without interfering with the previously acquired ones. The architecture of the conceptor ESN-based prediction approach is based on the ESN model specified in Section III-A2. For the content request distribution prediction, the cloud’s BBUs must implement one conceptor ESN algorithm for each user. The input is defined as each user’s context that includes gender, occupation, age, and device type. The output is the prediction of a user’s content request distribution. The generation of the reservoir is done as explained in Section III-A2. The conceptor is defined as a matrix that is used to control the learning of an ESN. For predicting mobility patterns, the input of the ESN-based algorithm is defined as the user’s context and current location. The output is the prediction of a user’s location in the next time slots. Ridge regression is used to train the ESNs. The conceptor is also defined as a matrix used to control the learning of an ESN. During the learning stage, the conceptor will record the learned mobility patterns and content request distribution patterns. When the conceptor ESN-based algorithm encounters a new input pattern, it will first determine whether this pattern has been learned. If this new pattern has been previously learned, the conceptor will instruct the ESN process to directly ignore it. This can allow the ESN algorithm to save some of its memory only for the unlearned patterns.

Based on the users’ mobility pattern prediction, the BBUs can determine the user association using a K-means clustering approach. By implementing a K-means clustering approach, the users that are close to each other are grouped into one cluster. In consequence, each UAV services one cluster and the user-UAV association is determined. Then, based on the UAV association and each user’s content request distribution, the optimal contents to cache at each UAV and the optimal UAVs’ locations can be found. When the altitude of a UAV is much higher (lower) than the size of its corresponding coverage, the optimal location of the UAV can be found [96, Ths. 2 and 3]. For more generic cases, it can be found by the ESN-based RL algorithm [100].

In Fig. 9, based on [96], we show how the memory of the conceptor ESN reservoir changes as the number of mobility patterns that were learned varies. The used mobility data is gathered from Beijing University of Posts and


Telecommunications by recording the students’ locations during each day. In Fig. 9, one mobility pattern represents the users’ trajectory in one day and the colored region is the memory used by the ESN. Fig. 9 shows that the usage of the memory increases as the number of the learned mobility patterns increases. Fig. 9 also shows that the conceptor ESN uses less memory for learning mobility pattern 2 compared to pattern 6. In fact, compared to pattern 6, mobility pattern 2 has more similarities to mobility pattern 1, and, hence, the conceptor ESN requires less memory to learn pattern 2. This is because the proposed approach can be used to only learn the difference between the learned mobility patterns and the new ones rather than to learn the entirety of every new pattern.

Fig. 10. Simulation result showing the transmit power as the number of users varies [96].

Fig. 10 shows how the total transmit power of the UAVs changes as the number of users varies. From Fig. 10, we can observe that the total UAV transmit power resulting from all of the presented algorithms increases with the number of users. This is due to the fact that the number of users associated with the RRHs and the capacity of the wireless fronthaul links are limited. Therefore, the UAVs must increase their transmit power to satisfy the QoE requirement of each user. From Fig. 10, we can also see that the conceptor-based ESN approach can reduce the total transmit power of the UAVs by about 16.7% compared to the ESN algorithm used to predict the content request and the mobility for a network with 70 users. This is because the conceptor ESN, which separates the users’ behavior into multiple patterns and uses the conceptor to learn these patterns, can predict the users’ behavior more accurately compared to the ESN algorithm.

Resource allocation problems in UAV-based wireless networks can also be addressed using LSMs, as explained in [97]. In particular, in [97], an LSM-based RL algorithm is used for resource and cache management in LTE over unlicensed (LTE-U) UAV networks. The LSM-based RL algorithm in [97] can find the appropriate policies for user association and resource allocation as well as the contents to cache at UAVs, as the users’ content requests change dynamically. This is due to the fact that an LSM can record the dynamic user content requests as well as the policies of the user association, resource allocation, and content caching due to its large memory (compared to ESN). Based on the recorded information, the LSM algorithm can build a relationship between content requests, user association, resource allocation and caching content.

4) Lessons Learned: From this example, we have demonstrated that the conceptor ESN can be used for effective data analytics in wireless networks that integrate UAV base stations, particularly, for mobility pattern and content request distribution predictions (and UAV-level caching). The ML angle in this application stems from the fact that predictions are used for intelligently determining the user association, optimal caching, and optimal UAV locations. The key lessons learned here include:
• The advantage of the conceptor ESN for UAV-based networks is that it provided the network with an ability to proactively determine the deployment of UAVs and the optimal content stored at UAVs. Since UAVs are flexible in their deployment (unlike terrestrial base stations), such a proactive approach is desirable. The analysis in [96] also revealed that the use of a conceptor in the ESN scheme allows it to separate a user’s weekly mobility into several patterns and use various non-linear systems for predictions thus improving accuracy. Moreover, the use of a conceptor ESN enables the cloud to add new patterns to the ESN without interfering with previously acquired ones, which can then improve the usage of an ESN’s memory (i.e., its capacity to store past data).
• The conceptor ESN algorithm that we presented in this section is able to perform its predictions over a long period of time. In this case, the conceptor ESN can be trained in a completely offline manner and its training process can be implemented at the cloud, thus leveraging its computational power. Once trained at the cloud, the UAVs can then directly use the cloud-trained conceptor ESNs for predictions and deployment. Thus, this results in energy savings, which is particularly important for resource-limited UAVs. Another reason to train conceptor ESNs at the cloud is that the cloud is better positioned in the network to collect mobility information. Due to this implementation, one can neglect the overhead for the training of the conceptor ESNs.
• From this example, we have observed that, for mobility prediction, a shallow conceptor ESN learning algorithm can achieve the same prediction accuracy compared to a deep learning algorithm (e.g., similar to the one that will be introduced in the multi-RAT application of Section IV-E). This is mainly due to the fact that the future locations of each user depend only on a small number of the locations that the user has previously visited. In consequence, a shallow conceptor ESN is sufficient to record these visited locations and perform reasonable predictions.
• One disadvantage of using a conceptor ESN learning algorithm for intelligent and predictive data analytics is that the use of a conceptor will increase the ESN training complexity. This is due to the fact that, during the training process, the conceptor needs to identify the input data of a given ESN and also needs to find appropriate memory space of the ESN for data recording. This further motivates the need to train the conceptor ESNs at the cloud so as to save the UAV energy.
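As a rough illustration of the ESN-based prediction step in this example, the following Python sketch trains only the linear readout of a small ESN by ridge regression and then computes a conceptor matrix from the collected reservoir states. It is not the implementation of [96]: the reservoir size, spectral radius, ridge coefficient, aperture, function names, and the toy periodic trajectory are all assumptions made for illustration, and the conceptor formula C = R(R + aperture^-2 I)^-1 follows the common definition from the conceptor literature, which may differ in detail from the variant used in [96].

```python
import numpy as np

def make_reservoir(n_in, n_res, spectral_radius=0.9, seed=0):
    """Random, fixed input and recurrent weights (these are never trained)."""
    rng = np.random.default_rng(seed)
    w_in = rng.uniform(-0.5, 0.5, size=(n_res, n_in))
    w = rng.uniform(-0.5, 0.5, size=(n_res, n_res))
    # Rescale so the largest eigenvalue magnitude equals spectral_radius.
    w *= spectral_radius / np.max(np.abs(np.linalg.eigvals(w)))
    return w_in, w

def run_reservoir(w_in, w, inputs):
    """Drive the reservoir with the input sequence and collect its states."""
    x = np.zeros(w.shape[0])
    states = np.empty((len(inputs), w.shape[0]))
    for t, u in enumerate(inputs):
        x = np.tanh(w_in @ u + w @ x)
        states[t] = x
    return states

def train_readout(states, targets, ridge=1e-6):
    """Ridge regression: W_out = Y^T X (X^T X + ridge * I)^-1."""
    gram = states.T @ states + ridge * np.eye(states.shape[1])
    return np.linalg.solve(gram, states.T @ targets).T

def conceptor(states, aperture=10.0):
    """C = R (R + aperture^-2 I)^-1, with R the state correlation matrix."""
    r = states.T @ states / len(states)
    return r @ np.linalg.inv(r + aperture ** -2 * np.eye(r.shape[0]))

# Toy "daily" trajectory (assumed): a user moving on a circle with period 24.
steps = np.arange(400)
trajectory = np.stack([np.sin(2 * np.pi * steps / 24),
                       np.cos(2 * np.pi * steps / 24)], axis=1)

w_in, w = make_reservoir(n_in=2, n_res=100)
states = run_reservoir(w_in, w, trajectory[:-1])   # input: current location
targets = trajectory[1:]                           # output: next location

washout, split = 50, 300                           # drop transient, hold out the tail
w_out = train_readout(states[washout:split], targets[washout:split])
predicted = states[split:] @ w_out.T
nrmse = np.sqrt(np.mean((predicted - targets[split:]) ** 2)) / np.std(targets[split:])
c = conceptor(states[washout:split])
print(f"NRMSE on held-out steps: {nrmse:.4f}")
```

Only `w_out` is learned here, which is why ESN training reduces to a single linear solve; the conceptor `c` summarizes which directions of the reservoir state space the learned pattern occupies.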

TABLE IV
SUMMARY OF THE USE OF ANNS FOR SPECIFIC APPLICATION
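The clustering-based user-UAV association described in the UAV example can be sketched as follows. This is a minimal K-means (Lloyd's algorithm) in Python under assumed inputs: the predicted user locations, the number of UAVs, and the function names are hypothetical and are not taken from [96]. Each resulting cluster would be served by one UAV.

```python
import numpy as np

def nearest(points, centroids):
    """Index of the closest centroid for every point."""
    dists = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
    return dists.argmin(axis=1)

def kmeans(points, k, n_iter=50, seed=0):
    """Plain Lloyd iterations; returns (labels, centroids)."""
    rng = np.random.default_rng(seed)
    centroids = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(n_iter):
        labels = nearest(points, centroids)
        # Move each centroid to the mean of its members; keep it if the cluster is empty.
        new_centroids = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    return nearest(points, centroids), centroids

# Assumed predicted user locations: 30 users around each of three hotspots.
rng = np.random.default_rng(1)
hotspots = np.array([[0.0, 0.0], [10.0, 0.0], [5.0, 8.0]])
users = np.concatenate([h + rng.normal(scale=0.5, size=(30, 2)) for h in hotspots])

labels, uav_xy = kmeans(users, k=3)   # labels[i] is the UAV serving user i
print(np.bincount(labels, minlength=3))
```

The centroids only give the horizontal center of each cluster; choosing the actual UAV position and altitude is a separate optimization step in the text above.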

Note that the observations in the third and fourth bullets above can be generalized to other shallow RNNs.

5) Future Works: Clearly, ANNs are an important tool for addressing key challenges in UAV-based communication networks. In fact, different types of ANNs can be suitable for various UAV applications. For instance, given their effectiveness in dealing with time-dependent data, RNNs can be used for predicting user locations and traffic demands. This allows UAVs to optimize their location based on the dynamics of the network. DNN-based RL algorithms can be used to determine the time duration that the UAVs need to service the ground users and how to service the ground users (e.g., stop or fly to service the users). Since DNNs have the ability to store large amounts of data, DNN-based RL algorithms can also be used to store the data related to the users’ historical context and, then, predict each ground user’s locations, content requests, and latency requirement. Based on these predictions, the UAVs can find their optimal trajectory and, as a result, determine which area to serve at any given time. In addition, SNNs can be used for modeling the air-to-ground channel, in general, and over mmWave frequencies, in particular. This is because SNNs are good at dealing with continuous data and the wireless channel is time-varying and continuous [101]. For instance, UAVs can use SNNs to analyze the data that they can collect from the radio environment, such as the received signal strength, UAVs’ positions, and users’ positions, and then generate an air-to-ground channel model to fit the collected data. Finally, SNNs are a good choice for the prediction of the trajectories of UAVs that act as users. In this scenario, the wireless network can properly select which BSs can serve the flying UAV users. A summary of key problems that can be solved by using ANNs for UAV-based communications is presented in Table VI along with the challenges and future works.

C. Wireless Virtual Reality

1) Virtual Reality Over Wireless Networks: Recently, wireless industry players such as Qualcomm [129] and Nokia [130] have rated VR as one of the most important applications in 5G and beyond networks. Moreover, 3GPP is creating a standard for wireless VR, called extended reality (XR) [12]. In addition, several industrial players such as HTC Vive [131], Oculus [132], and Intel [133] are all developing wireless VR devices that can operate over wireless cellular networks. These recent developments motivate us to analyze wireless VR as a key use case of ANNs in future wireless networks.

When a VR device is operated over a wireless link, the users must send the tracking information that includes the users’ locations and orientations to the BSs and, then, the BSs will use the tracking information to construct 360° images and send these images to the users. Therefore, for wireless VR applications, the uplink and downlink transmissions must be jointly considered. Moreover, in contrast to traditional video that consists of 120° images, a VR video consists of high-resolution 360° vision with three-dimensional surround stereo. This new type of VR video requires a much higher data rate


than that of traditional mobile video. In addition, as the VR images are constructed according to the users’ movement such as their head or eye movement, the tracking accuracy of the VR system will directly affect the user experience. In summary, the challenges of operating VR devices over wireless networks [38] include tracking accuracy, low delay, high data rate, user experience modeling, effective image compression as well as VR content and tracking information transmission over wireless links.

Fig. 11. Wireless VR networks. In this figure, BSs that act as VR controllers generate and transmit VR videos to VR users according to the tracking information collected from VR users.

2) Neural Networks for Wireless Virtual Reality: The use of ANNs is a promising solution for a number of problems related to wireless VR. This is due to the fact that, compared to other applications such as UAV or caching, VR applications depend more on the users’ environment and their behavior vis-a-vis the VR environment. In a wireless VR network, the head and eye movements will significantly affect resource management and network control. This is a very new challenge for wireless networks. For instance, ANNs are effective at identifying and predicting the users’ movements and their actions. Based on the predictions of the users’ environment, actions, and movements, the BSs can improve the generation of the VR images and optimize the resource management for wireless VR users. ANNs have two major applications for wireless VR. First, ANNs can be used to predict the users’ movement as well as their future interactions with the VR environment. For example, a user displays only the visible portion of a 360° video and, hence, transmitting the entire 360° video frame can waste the capacity-limited bandwidth. Since all of the images are constructed based on the users’ movements, using ANNs, one can predict the users’ movement and, hence, enable the wireless BSs to generate only the portion of the VR image that a user wants to display. Moreover, the predictions of the users’ movement can also improve the tracking accuracy of the VR sensors. In particular, the BSs will jointly consider the users’ movement predicted by ANNs and the users’ movements collected by VR sensors to determine the users’ movements.

Second, ANNs can be used to develop self-organizing algorithms to dynamically control and manage the wireless VR network thus addressing problems such as dynamic resource management. In particular, ANNs can be used for adaptively optimizing resource allocation and adjusting the quality and format of VR images according to the cellular network environment.

Using ANNs for VR faces many challenges. First, in wireless VR networks, the data collected from the users may contain errors that are unknown to the BSs. In consequence, the BSs may need to use erroneous data to train the ANNs and, hence, the prediction accuracy of the ANN will be significantly affected. Second, due to the large data size of each 360° VR image, the BSs must spend a large amount of computational resources to process VR images. Meanwhile, the training of ANNs will also require a large amount of computational resources. Thus, how to effectively allocate the computational resources for processing VR images and training ANNs is an important challenge. In addition, VR applications require ultra-low latency while the training of ANNs can be time-consuming. Hence, how to effectively train ANNs in a limited time is an important question for wireless VR. In this regard, training ANNs in an offline manner or using ANNs that converge quickly can be two promising solutions for speeding up the training process of ANNs.

The existing literature has studied a number of problems related to using ANNs for VR such as in [102]–[106]. The work in [104] proposed an ESN-based distributed learning algorithm to predict the users’ head movement in VR applications. In [105], a decision forest learning algorithm is proposed for gaze prediction. The work in [102] developed a neural network-based transfer learning algorithm for data correlation-aware resource allocation. 360° content caching and transmission is optimized in [106] using an ESN and SNN-based deep RL algorithm. Table V summarizes the type of ANNs and learning algorithms used for each existing work in virtual reality networks. In essence, the existing VR literature such as [102]–[106] has used ANNs to solve a number of VR problems such as hand gesture recognition, interactive shape changes, video conversion, head movement prediction, and resource allocation. However, with the exception of our works in [103] and [104], all of the other works that use ANNs for VR applications are focused on wired VR. Therefore, they do not consider the challenges of wireless VR such as scarce spectrum resources, limited data rates, and how to transmit the tracking data accurately and reliably. In fact, ANNs can be used for wireless VR to solve problems such as user movement prediction, spectrum management, and VR image generation. Next, a specific application of ANNs for VR over wireless networks is introduced.

3) Example: One key application of using ANNs for wireless VR systems is presented in [103] for the study of resource allocation in cellular networks that support VR users. In this model, the BSs act as the VR control centers that collect the tracking information from the VR users over the cellular uplink and then send the generated images (based on the tracking information) and accompanying surround stereo audio to the VR users over the downlink. Therefore, this resource allocation problem in wireless VR must jointly consider both the uplink and the downlink transmissions. To capture the VR users’ QoS in a cellular network, the model in [103] jointly accounts for VR tracking accuracy, processing delay, and transmission delay. The tracking accuracy is defined as the difference between the tracking vector transmitted wirelessly from the VR headset to the BS and the accurate tracking vector obtained from the users’ force feedback. The tracking vector represents the users’ positions and

TABLE V
SUMMARY OF THE USE OF ANN-BASED LEARNING ALGORITHMS FOR EXISTING WORKS IN SPECIFIC APPLICATION

orientations. The transmission delay consists of the uplink transmission delay and the downlink transmission delay. The uplink transmission delay represents the time that a BS uses to receive the tracking information while the downlink transmission delay is the time that a BS uses to transmit the VR contents. The processing delay is defined as the time that a BS spends to correct the VR image from the image constructed based on the inaccurate tracking vector to the image constructed according to the accurate tracking vector. In [103], the relationship between delay and tracking is not necessarily linear or independent and, thus, multi-attribute utility theory [134] is used to construct a utility function that assigns a unique value to each tracking and delay component of the VR QoS.

The goal of [103] is to develop an effective resource block allocation scheme to maximize the users’ utility function that captures the VR QoS. This maximization jointly considers the coupled problems of user association, uplink resource allocation, and downlink resource allocation. Moreover, the VR QoS of each BS depends not only on its resource allocation scheme but also on the resource allocation decisions of other BSs. Consequently, the use of centralized optimization for such a complex problem is largely intractable and yields significant overhead. In addition, for VR resource allocation problems, we must jointly consider both uplink and downlink resource allocation, and, thus, the number of actions will be much larger than in conventional scenarios that consider only uplink or downlink resource allocation. Thus, as the number of actions significantly increases, each BS may not be able to collect all information needed to calculate the utility function.

To overcome these challenges, an ANN-based RL algorithm can be used for self-organizing VR resource allocation. In particular, an ANN-based RL algorithm can find the relationship between the user association, resource allocation, and the user data rates, and, then, it can directly select the optimal resource allocation scheme after the training process. For the downlink and uplink resource allocation problem in [103], an ANN-based RL algorithm can use less exploration time to build the relationship between the actions and their corresponding utilities and then optimize resource allocation. To simplify the generation and training process of an ANN-based RL algorithm, an ESN-based RL algorithm is selected for VR resource allocation. The ESN-based learning algorithm enables each BS to predict the value of VR QoS resulting from each resource allocation scheme without having to traverse all the resource allocation schemes. The architecture of the ESN-based self-organizing approach is based on the ESN model specified in Section III-A2. To use ESNs for RL, each row of the ESN’s output weight matrix can be defined as one action. Here, one action represents one type of resource allocation. The input of each ESN is the current action selection strategies of all BSs. The generation of the ESN model follows Section III-A2. The output is the estimated utility value. In the learning process, at each time slot, each BS will implement one action according to the current action selection strategy. After the BSs perform their selected actions, they can get the actual utility values. Based on the actual utility values and


the utility values estimated by ESN, each BS can adjust the values of the output weight matrix of an ESN according to (4). As time elapses, the ESN can accurately estimate the utility values for each BS and can find the relationship between the resource allocation schemes and the utility values. Based on this relationship, each BS can find the optimal action selection strategy that maximizes the average VR QoS for its users.

Fig. 12. Delay for each served user vs. the number of BSs [103].

Fig. 12 shows how the average delay of each user varies as the number of BSs changes. From Fig. 12, we can see that, as the number of BSs increases, the transmission delay for each served user decreases. This is due to the fact that, as the number of BSs increases, the number of users located in each BS’s coverage decreases and, hence, the average delay decreases. However, as the number of BSs increases, the delay decrease becomes slower due to the additional interference. This stems from the fact that, as the number of BSs continues to increase, the number of the users associated with each BS decreases and more spectrum will be allocated to each user. Hence, the delay of each user will continue to decrease. However, as the number of the BSs increases, the increasing interference will limit the reduction in the delay. Fig. 12 also shows that the ESN-based algorithm achieves up to a 19.6% gain in terms of average delay compared to the Q-learning algorithm for the case with 6 BSs. Fig. 12 also shows that the ESN-based approach allows the wireless VR transmission to meet the VR delay requirement that includes both the transmission and processing delay (typically 20 ms [135]). These gains stem from the adaptive nature of ESNs.

From this example, we have illustrated the use of ESN as an RL algorithm for self-organizing resource allocation in wireless VR. An ESN-based RL algorithm enables each BS to allocate downlink and uplink spectrum resource in a self-organizing manner that adjusts the resource allocation according to the dynamical environment. Moreover, an ESN-based RL algorithm can use an approximation method to find the relationship between each BS’s actions and its corresponding utility values, and, hence, an ESN-based RL algorithm can speed up the training process. Simulation results show that an ESN-based RL algorithm enables each BS to achieve the delay requirement of VR transmission.

4) Lessons Learned: Clearly, we have demonstrated that ESNs can be an effective tool for resource management in a wireless VR network that needs to jointly consider uplink and downlink resource block allocation. Some key outcomes learned from this application include the following:
• In non-wireless applications such as speech recognition, ESNs are used for data analytics. In this VR application, ESNs are used as a reinforcement learning algorithm for downlink and uplink resource block management. The advantage of the ESN-based RL algorithm is that it provided the network with an ability to predict the value of the VR QoS that results from each action (instead of relying on a Q-table to record the observed utility values as done in Q-learning) and, hence, it can find the optimal action selection strategy that can maximize the individual (per SBS) VR QoS utilities without having to traverse all actions. As a result, ESN-based RL is suitable for wireless VR resource management problems in which both uplink and downlink resources must be managed jointly, thus increasing the search space for the wireless VR QoS optimization problem, compared to standard wireless resource management problems. This was a novel use case of ESNs that is motivated by the underlying wireless system, rather than by the need to process some data as done in computer vision.
• Compared to most of the existing DNN-based RL algorithms that cannot analytically guarantee convergence to a final equilibrium or optimization solution, in this application, we have proved that ESN-based RL algorithms can finally converge to the expected VR QoS utilities if the learning parameters are appropriately set.
• Due to the limited memory capacity of each ESN, the application of an ESN-based RL algorithm depends on the complexity of the underlying wireless problems. ESN-based RL algorithms can be used to solve the optimization problem with a moderate number of optimized variables while DNN-based algorithms can be used to solve more complex optimization problems. In this work, the ESN-based RL algorithms can achieve the same performance for resource block allocation as DNN-based RL algorithms. However, the time needed for training DNNs such as LSTMs will be much higher than the time needed for training ESNs. In consequence, one must choose an appropriate ANN architecture for RL depending on the complexity of the wireless optimization problems. In a wireless VR application, it could be more suitable to use a shallow ANN in the RL algorithm for problems such as channel selection and user association, while DNN-based RL algorithms are more suitable for power allocation. This is due to the fact that, in power allocation problems, the optimized variables are continuous and, thus, the number of actions needed for RL will be much larger than those used in other problems (e.g., user association).
Here, we note that the aforementioned lessons learned can be generalized to other shallow ANNs.

5) Future Works: Clearly, ANNs are a promising tool to address challenges in wireless VR applications. In fact, the above application of ANNs for spectrum resource allocation can be easily extended to manage other types of resources such as computational resources and video formats. Moreover,

TABLE VI
SUMMARY OF THE USE OF ANNS FOR SPECIFIC WIRELESS PROBLEMS

SNNs can be used for the prediction of the viewing VR video which is the VR video displayed at the headset of one user. Then, the network can reduce the data size of each transmitted VR video and pre-transmit each viewing VR video to the users. This is because SNNs are good at processing the rapidly changing, dynamic VR videos. Furthermore, RNNs can be used to predict and detect the VR users’ movement such as eye movement and head movement and their interactions with the environment. Then, the network can pre-construct VR images based on these predictions, which can reduce the time spent to construct the VR images. The user-VR system interactions are all time-dependent and, hence, RNNs are a good choice for performing such tasks. Note that the prediction of the users’ movement will directly affect the VR images that are sent to


the users at each time slot and, hence, the learning algorithm must complete the training process during a short time period. In consequence, we should use RNNs that are easy to train for the prediction of the users’ movement. Finally, CNNs can be used for VR video compression and recovery so as to reduce the data size of each transmitted VR video and improve the QoS for each VR user. This is because CNNs are good at storing large amounts of data in the spatial domain and learning the features of VR images. A summary of key problems that can be solved by using ANNs in wireless VR systems is presented in Table VI along with the challenges and future works.

D. Mobile Edge Caching and Computing
1) Mobile Edge Caching and Computing: Caching at the edge of wireless networks, as shown in Fig. 13, can enable the network devices (BSs and end-user devices) to store the most popular content to reduce the data traffic (content transmission), delay, and bandwidth usage, as well as to improve energy efficiency and the utilization of the users’ context and social information [136]. Recently, it has become possible to jointly consider cache placement and content delivery, using coded caching [137]. Coded caching enables network devices to create multicasting opportunities for specific content, via coded multicast transmissions, thus significantly improving the bandwidth efficiency [138]. However, designing effective caching strategies for wireless networks faces many challenges [136] such as solving optimized cache placement, cache update, and content popularity analytics problems.

Fig. 13. Mobile edge caching and computing wireless networks.

In addition to caching, the wireless network’s edge devices can be used for performing effective and low-latency computations using the emerging paradigm of mobile edge computing [139]. The basic premise of mobile edge computing is to exploit local resources for computational purposes (e.g., for VR image generation or for sensor data processing), in order to avoid high-latency transmission to remote cloud servers. Mobile edge computing, which includes related concepts such as fog computing [140], can decrease the overall computational latency by reducing the reliance on the remote cloud while effectively offloading computational resources across multiple local and remote devices. The key challenge in mobile edge computing is to optimally allocate computational tasks across both the edge devices (e.g., fog nodes) and the remote data servers, in a way to optimize latency. Finally, it is worth noting that some recent works [141] have jointly combined caching and computing. In this case, caching is used to store the most popular and basic computational tasks. Based on the caching results, the network will have to determine the optimal computational resource allocation to globally minimize latency. However, optimizing mobile edge computing faces many challenges such as computing placement, computational resource allocation, computing tasks assignment, end-to-end latency minimization, and minimization of the energy consumption for the devices.
2) Neural Networks for Mobile Edge Caching and Computing: ANNs can play a central role in the design of new mobile edge caching and computing mechanisms. For instance, the problems of optimal cache placement and cache update are all dependent on the predictions of the users’ behaviors such as the users’ content request problems. For example, cache placement depends on the users’ locations while cache update depends on the frequency with which a user requests a certain content. Since human behavior can be predicted by ANNs, ANNs are a promising solution for effective mobile edge caching and computing.
In essence, ANNs can play a vital role in three major applications for mobile edge caching and computing. First, ANNs can be used for prediction and inference purposes. For example, ANNs can be used to predict the users’ content request distributions and content request frequency. The content request distribution and content request frequency can be used to determine which content to store at the end-user devices or BSs. Furthermore, ANNs can also be used to find social information from the collected data. In particular, ANNs can learn the users’ interests, activities, and interactions. By exploiting the correlation between the users’ data, their social interests, and their common interests, the accuracy of predicting future events such as the users’ geographic locations, the next visited cells, and the requested contents can be dramatically improved [136]. For example, ANNs can be used to predict the users’ interests. The users that have the same interests are highly likely to request the same content. Therefore, the system operator can cluster the users that have the same interests and store the popular contents they may request. Similarly, ANNs can be used to predict the computational requirements of tasks, which in turn enables the network devices to schedule the computational resources in advance, thus minimizing latency.
Second, ANNs can be used as an effective clustering algorithm to classify the users based on their activities such as content request, which enables the system operator to determine which contents to store at a storage unit and, thus, improve the usage of cached contents. For instance, the content requests of users can change over time while the cached content is updated only over long intervals (e.g., one day) and, hence, the system operator must determine which content to cache by reviewing all the collected content requests. ANNs, such as CNNs, can be used to store the content request information and classify the large amount of content requests for cache

update. In fact, predictions and clustering are interrelated and, therefore, ANNs can be used for both applications simultaneously. For example, ANNs can first be used to predict the users’ content request distributions, and, then, ANNs can be used to classify the users that have similar content request distributions. Meanwhile, ANN-based clustering algorithms can be used to classify the computing tasks. Then, the computing tasks that are clustered into a group can be assigned to a certain computing center. In this case, each computing center will process one type of computing tasks, thus reducing the computational time. Finally, ANNs can also be used for intelligently scheduling the computing tasks to different computing centers. In particular, ANNs can be used as an RL algorithm to learn each computing center’s state such as its computational load, and then, allocate computing tasks based on the learned information to reduce the computational time.
Using ANNs for mobile edge caching and computing faces many challenges. Data cleaning is an essential part of the data analysis process for mobile edge processing. For example, to predict the users’ content requests, the data processing system should be capable of reading and extracting useful data from huge and disparate data sources. For example, one user’s content request depends on this user’s age, job, and locations. In fact, the data cleaning process usually takes more time than the learning process. For instance, the type and volume of content that users may request can be in the order of millions and, hence, the data processing system should select appropriate content to analyze and predict the users’ content request behaviors. For caching, the most important use of ANNs is to predict the users’ content requests, which directly determines the caching update. However, each user may request a large volume of content types such as video, music, and news, each of which has different formats and resolutions. Hence, for each user, the total number of the requested content items will be significantly large. However, the memory of an ANN is limited and, hence, each ANN can record only a limited number of requested contents. In consequence, an ANN must be able to select the most important content for content request prediction so as to help the network operator determine which content to store at the mobile edge cache. Similarly, for computing tasks predictions, the limited-memory ANNs can only store a finite number of the computing tasks and, hence, they must select suitable computing tasks to store and predict. Moreover, as opposed to mobile edge caching that requires a long period of time to update the cached contents, mobile edge computing needs to process the tasks as soon as possible. Therefore, the ANNs used for mobile edge computing must complete their training process in a short period of time.
The existing literature has studied a number of problems related to the use of ANNs for caching [96], [107], [108], [113], and [109]–[112]. The authors in [107] proposed a big data-enabled architecture to investigate proactive content caching in 5G wireless networks. In [108]–[110], ANNs are used to determine the cache replacement and content delivery. The authors in [111] developed a data extraction method using the Hadoop platform to predict content popularity. In [112], an extreme-learning machine neural network is used to predict content popularity. The works in [96] and [113] developed an ESN-based learning algorithm to predict the users’ mobility patterns and content request distributions. In general, existing works such as in [96], [107], [108], [113], and [109]–[112] have used ANNs to solve the caching problems such as cache replacement, content popularity prediction, and content request distribution prediction. For mobile edge computing, in general, there is no existing work that uses ANNs to solve these relevant problems. Next, we explain a specific application of ANNs for mobile edge caching.
3) Example: One illustrative application for the use of ANNs for mobile edge caching is presented in [113], which studies the problem of proactive caching in CRANs. In this model, the users are served by the RRHs which are connected to the cloud pool of the BBUs via capacity-constrained wired fronthaul links. The RRHs and the users are all equipped with storage units that can be used to store the most popular content that the users request. The RRHs which have the same content request distributions are grouped into a virtual cluster and serve their users using the zero-forcing method. The content request distribution for a particular user represents the probabilities with which the user requests different content. Virtual clusters are connected to the content servers via capacity-constrained wired backhaul links. Since the backhaul (fronthaul) links are wired, we assume that the total transmission rate of the backhaul (fronthaul) links is equally allocated to the content that must be transmitted over the backhaul (fronthaul) links. Each user has a periodic mobility pattern and regularly visits a certain location. Since cache-enabled RRHs and BBUs can store the requested content, this content can be transmitted over four possible links: a) content server-BBUs-RRH-user, b) cloud cache-BBUs-RRH-user, c) RRH cache-RRH-user, and d) remote RRH cache-remote RRH-BBUs-RRH-user. The notion of effective capacity⁴ [142] was used to capture the maximum content transmission rate of a channel under a certain QoS requirement. The effective capacity of each content transmission depends on the link that is used to transmit the content and the actual link capacity between the user and the associated RRHs.

⁴ The effective capacity is a link-layer channel model that can be used to measure a content transmission over multiple hops. In particular, the effective capacity can be used to measure a content transmission from the BBUs to the RRHs, then from RRHs to the users.

The goal of [113] is to develop an effective framework for content caching and RRH clustering in an effort to reduce the network’s interference and to offload the traffic of the backhaul and of the fronthaul based on the predictions of the users’ content request distributions and mobility patterns. To achieve this goal, a QoS and delay optimization problem is formulated, whose objective is to maximize the long-term sum effective capacity of all users. This optimization problem involves the prediction of the content request distribution and of the periodic location for each user, and the finding of the optimal content to cache at the BBUs and at the RRHs. To predict the content request distribution and mobility patterns for each user, an ESN-based learning algorithm is used, similarly to the one described in Section III-A2. For each user, the BBUs


must implement one ESN algorithm for content request distribution prediction and another ESN algorithm for mobility pattern prediction.
For the content request distribution prediction, the input of the developed ESN is a user’s context which includes content request time, week, gender, occupation, age, and device type. The output is the predicted content request distribution. The ESN model consists of the input weight matrix, the output weight matrix, and the recurrent weight matrix (see Section III-A2). A linear gradient descent approach is used to train the output weight matrix. For mobility pattern prediction, the input of the developed ESN is the current location of each user and the output is the vector of locations that a user is predicted to visit for the next steps. In contrast to the conventional recurrent matrix, which is sparse and randomly generated, the recurrent matrix of the ESN used for mobility prediction contains only W non-zero elements, where W is the dimension of the recurrent matrix. This simplified recurrent matrix can speed up the training process of the ESNs. An offline manner using ridge regression is used to train the output weight matrix.
Based on the users’ content request distribution and locations, the cloud can estimate the users’ RRH association, determine each RRH’s content request distribution, and, then, cluster the RRHs into several groups. Finally, the content that must be cached at the cloud and at the RRHs can be determined. The analysis result proved that the ESN-based algorithm will reach an optimal solution to the content caching problem.
Fig. 14 shows how the sum of the effective capacities of all the users in a period changes with the number of RRHs. As the number of the RRHs increases, the effective capacities of all the algorithms increase as the users become closer to their RRHs. The ESN approach can yield up to 21.6% and 24.4% of improvements in the effective capacity compared to random caching with clustering and random caching without clustering, respectively, for a network with 512 RRHs. This stems from the fact that the ESN-based algorithm can effectively use the predictions of the ESNs to determine which content to cache.

Fig. 14. Sum effective capacity as a function of the number of RRHs [113].

4) Lessons Learned: The presented example of the mobile edge caching and computing application demonstrated that ESNs are effective for the prediction of the users’ mobility patterns and content request distribution, based on which the cloud can determine the content stored at the cloud and at the RRHs. Some key outcomes learned from this application include:
• Even though analyzing the memory capacity of an ESN is generally challenging, in this application, we were able to derive the memory capacity for an ESN that uses a linear activation function. Based on this analysis, we can accurately set the size of the matrices and the memory capacity of each ESN that can precisely predict the users’ mobility and content request distributions. Here, we need to note that, as the memory capacity increases, the training complexity of an ESN will significantly increase. In this context, for mobility prediction in this application, we build an ESN model with minimum memory capacity that can accurately predict the users’ mobility patterns and quickly converge. In fact, for different prediction tasks, one can adjust the memory capacity of each ESN using the obtained results to enable the ESNs to record all of the information needed for the predictions.
• This example also showed that ESN-based learning algorithms can be trained to predict only one mobility pattern for each user. For example, to predict the weekly mobility pattern of each user, an ESN-based learning algorithm cannot separate the mobility pattern in a week into several days and use a specific non-linear system to predict the users’ mobility in each day. In fact, as we discussed in the UAV application in Section IV-B, using a unique non-linear system to predict the mobility of each user each day can significantly improve the accuracy of weekly mobility pattern prediction. Learning using ESNs is more appropriate for predicting a single task, rather than for multiple prediction tasks. To overcome this challenge, one can use the conceptor notion that was discussed in Section IV-B. Note that this observation can be generalized to other shallow RNNs and SNNs.
• Compared to conceptor ESNs, ESN-based learning algorithms have a lower training complexity and faster convergence speed. However, as already mentioned, ESNs cannot separate the users’ contexts for multiple mobility pattern predictions, which will affect the prediction accuracy. In consequence, one must choose between a standard ESN and a conceptor ESN depending on the number of prediction tasks needed and their complexity. In Section IV-D, the predictions are used to determine the cached content, whose prediction is somewhat less challenging compared to other metrics that require more precise predictions such as the UAV locations in Section IV-B. Therefore, we choose the ESN-based prediction algorithms for mobility and content request distribution predictions.
5) Future Works: Clearly, ANNs will be an important tool for solving challenges in mobile edge caching and computing applications, especially for content request prediction and computing tasks prediction. In fact, CNNs, that are good at storing voluminous data in spatial domains, can be used to investigate the content correlation in the spatial domains.
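As a brief aside on the ESN training summarized in the example above, the following is a minimal sketch of an ESN whose recurrent matrix has exactly W non-zero elements and whose output weights are fitted offline by ridge regression. It is not the implementation of [113]: the cycle-shaped reservoir, the tanh activation, the scalar periodic toy signal standing in for a periodic mobility pattern, and all names and constants (`cycle_reservoir`, `ridge_fit`, the reservoir size 20, the regularizer 1e-4) are assumptions made only for illustration.

```python
import math
import random

def cycle_reservoir(n, weight):
    # Assumed structure: exactly n non-zero entries, unit i feeds unit i+1.
    w = [[0.0] * n for _ in range(n)]
    for i in range(n):
        w[(i + 1) % n][i] = weight
    return w

def run_reservoir(w_in, w_rec, inputs):
    # Collect reservoir states x_t = tanh(w_in * u_t + w_rec @ x_{t-1}).
    n = len(w_in)
    x = [0.0] * n
    states = []
    for u in inputs:
        x = [math.tanh(w_in[i] * u + sum(w_rec[i][j] * x[j] for j in range(n)))
             for i in range(n)]
        states.append(x)
    return states

def solve(a, b):
    # Gaussian elimination with partial pivoting for the small system a @ z = b.
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    z = [0.0] * n
    for i in reversed(range(n)):
        z[i] = (m[i][n] - sum(m[i][j] * z[j] for j in range(i + 1, n))) / m[i][i]
    return z

def ridge_fit(states, targets, lam):
    # Ridge regression for the output weights: (X^T X + lam * I) w = X^T y.
    n = len(states[0])
    gram = [[sum(s[i] * s[j] for s in states) + (lam if i == j else 0.0)
             for j in range(n)] for i in range(n)]
    rhs = [sum(s[i] * y for s, y in zip(states, targets)) for i in range(n)]
    return solve(gram, rhs)

def predict(w_out, state):
    # Linear readout of the reservoir state.
    return sum(w * x for w, x in zip(w_out, state))

random.seed(0)
N = 20                                    # reservoir size (hypothetical)
w_in = [random.uniform(-0.2, 0.2) for _ in range(N)]
w_rec = cycle_reservoir(N, 0.8)           # only N non-zero recurrent weights
signal = [math.sin(2 * math.pi * t / 10) for t in range(400)]  # toy periodic input
states = run_reservoir(w_in, w_rec, signal[:-1])
targets = signal[1:]                      # one-step-ahead prediction target
washout, split = 50, 300                  # discard transient, hold out the tail
w_out = ridge_fit(states[washout:split], targets[washout:split], 1e-4)
held_out = list(zip(states[split:], targets[split:]))
test_mse = sum((predict(w_out, s) - y) ** 2 for s, y in held_out) / len(held_out)
print("held-out mean squared error:", test_mse)
```

Only the readout `w_out` is learned here; the input and recurrent weights stay fixed, which is what keeps the training step a single linear solve.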

Based on the content correlation, each BS can store the contents that are the most related to other contents to improve the caching efficiency and hit ratio. Moreover, RNNs can be used as self-organizing RL algorithms to allocate computational resources. RNNs are suitable here because they can record the utility values resulting from different computational resource allocation schemes as time elapses. Then, the RNN-based RL algorithms can find the optimal computational resource allocation after several implementations. Meanwhile, in contrast to the user association in cellular networks, where each user can only associate with one BS, one computing task can be assigned to several computing centers and one computing center can process different computing tasks. Therefore, the problem of computing task assignment is a many-to-many matching problem [143]. RNN-based RL algorithms can also be used to solve the computing task assignment problem due to their advantages in analyzing historical data pertaining to past assignments of computing tasks. In addition, DNN-based RL algorithms can be used to jointly optimize the cache replacement and the content delivery. To achieve this purpose, each action of the DNN-based RL algorithm must contain one content delivery method as well as one cache update scheme. This is because DNNs are good at storing large amounts of utility values resulting from different content delivery and cache update schemes. Last but not least, SNNs can be used to predict the dynamic computational resource demands for each user due to their advantages in dealing with highly dynamic data. A summary of the key problems of using ANNs for mobile edge caching and computing is presented in Table VI along with the challenges and future works.

E. Co-Existence of Multiple Radio Access Technologies
1) Co-Existence of Multiple Radio Access Technologies: To cope with the unprecedented increase in mobile data traffic and realize the envisioned 5G services, a significant enhancement of per-user throughput and overall system capacity is required [144]. Such an enhancement can be achieved through advanced PHY/MAC/network technologies and efficient methods of spectrum management. In fact, one of the main advancements in the network design for 5G networks relies on the integration of multiple different radio access technologies (RATs) [145]. Multi-RAT based networks encompass several technologies in which spectrum sharing is important. These include cognitive radio networks, LTE-U networks, as well as heterogeneous networks that include both mmWave and sub-6 GHz frequencies. With the multi-RAT integration, a mobile device can potentially transmit data over multiple radio interfaces such as LTE and WiFi [146], at the same time, thus improving its performance [147]. Moreover, a multi-RAT network allows fast handover between different RATs and, thus, it provides a seamless mobility experience for users. Therefore, the integration of different RATs results in an improvement in the utilization of the available radio resources and, thus, in an increase in the system’s capacity. It also guarantees a consistent service experience for different users irrespective of the served RAT and it facilitates network management.
Spectrum management is also regarded as another key component of multi-RAT networks [148]. Unlike early generations of cellular networks that operated exclusively on the sub-6 GHz (microwave) licensed band, multi-RAT networks are expected to transmit over the conventional sub-6 GHz band, the unlicensed spectrum, and high-frequency mmWave bands [149], [150]. In this respect, although the sub-6 GHz licensed LTE band is reliable, its bandwidth is limited and, hence, it is a scarce resource. Meanwhile, the unlicensed bands can be used to serve best effort traffic only since the operation over this spectrum should account for the presence of other coexisting technologies. Therefore, a multi-mode BS operating over the licensed, unlicensed, and mmWave frequency bands can exploit the different characteristics and availability of the frequency bands, thus providing robust and reliable communication links for the end users [150]. However, to reap the benefits of multi-mode BSs, effective spectrum sharing is crucial.
2) Neural Networks for Spectrum Management and Multi-RAT: ANNs are an attractive solution approach for tackling various challenges that arise in multi-RAT scenarios. To leverage the advantages of such multi-RAT networks, ANNs can allow the smart use of different RATs wherein a BS can learn when to transmit on each type of frequency band based on the underlying network conditions. For instance, ANNs may allow multi-mode BSs to steer their traffic flows between the mmWave, the microwave, and the unlicensed band based on the availability of a LoS link, the congestion on the licensed band and the availability of the unlicensed band. Moreover, in LTE-WiFi link aggregation (LWA) scenarios, ANNs allow cellular devices to learn when to operate on each band or utilize both links simultaneously.
Moreover, ANNs can provide multi-mode BSs with the ability to learn the appropriate resource management procedure over different RATs or spectrum bands in an online manner and, thus, to offer an autonomous and self-organizing operation with no explicit communication among different BSs, once deployed. For instance, ANNs can be trained over large datasets which take into account the variations of the traffic load over several days for scenarios in which the traffic load of WiFi access points (WAPs) can be characterized based on a particular traffic model [151]. It should be noted that cellular data traffic networks exhibit statistically fluctuating and periodic demand patterns, especially for applications such as file transfer, video streaming, and browsing [151]. ANNs can also accommodate the users’ mobility patterns to predict the availability of a LoS link, thus allowing the transmission over the mmWave band. In particular, they can be trained to learn the antenna tilting angle based on the environment changes in order to guarantee a LoS communication link with the users and, thus, to enable an efficient communication over the mmWave spectrum. Moreover, ANNs may enable multiple BSs to learn how to form multi-hop, mmWave links over backhaul infrastructure, while properly allocating resources across those links in an autonomous manner [152], [153]. To cope with the changes in the traffic model and/or the users’ mobility pattern, ANNs can be combined with online ML [154] by properly re-training the weights of the developed learning


mechanisms. Multi-mode BSs can, thus, learn the traffic patterns over time and, thus, predict the future channel availability status. With proper network design, ANNs can allow operators to improve their network’s performance by reducing the probability of congestion occurrence while ensuring a degree of fairness to the other corresponding technologies in the network.
A proactive resource management of the radio spectrum for multi-mode BSs can also be achieved using ANNs. In a proactive approach, rather than reactively responding to incoming demands and serving them when requested, multi-mode BSs can predict traffic patterns and determine future off-peak times on different spectrum bands so that the incoming traffic demand can be properly allocated over a given time window. In an LTE-U system, for instance, a proactive coexistence mechanism may enable future delay-intolerant data demands to be served within a given prediction window ahead of their actual arrival time, thus avoiding the underutilization of the unlicensed spectrum during off-peak hours [155]. This will also lead to an increase in the LTE-U transmission opportunity as well as to a decrease in the collision probability with WAPs and other BSs in the network.
Several existing works have adopted various learning techniques in order to tackle a variety of challenges that arise in multi-RAT networks [62], [100], [114]–[119]. The problem of resource allocation with uplink-downlink decoupling in an LTE-U system has been investigated in [100] in which the authors propose a decentralized scheme based on ESNs. The authors in [114] propose a fuzzy-neural system for resource management among different access networks. The work in [115] used an ANN-based learning algorithm for channel estimation and channel selection. The authors in [116] propose a supervised ANN approach, based on FNNs, for the classification of the users’ transmission technology in a multi-RAT system. In [117], the authors propose a Hopfield neural network scheme for multi-radio packet scheduling. In [118], the authors propose a cross-system learning framework in order to optimize the long-term performance of multi-mode BSs, by steering delay-tolerant traffic towards WiFi. The work in [119] used a deep RL algorithm for mode selection and resource management in a fog radio access network. Other important problems in this domain include root cause analysis issues such as the ones studied in [62]. Nevertheless, these prior works [62], [100], [114]–[119] consider a reactive approach in which the data requests are first initiated and, then, the resources are allocated based on their corresponding delay tolerance value. In particular, existing works do not consider the predictable behavior of the traffic and, thus, they do not account for future off-peak times during which data traffic could be distributed among different RATs.
Here, note that ANNs are suitable for learning the data traffic variations over time and, thus, they can predict the future traffic load. In particular, since LSTM cells are capable of storing information for long periods of time, they can learn the long-term dependency within a given sequence. Predictions at a given time step are influenced by the network activations at previous time steps, thus making LSTMs an attractive solution for proactively allocating the available resources in multi-RAT systems. In what follows, we summarize our work in [156], in which we developed a deep RL scheme, based on LSTM memory cells, for allocating the resources in an LTE-U network over a fixed time window T.
3) Example: An interesting application of DNNs in the context of LTE-U and WiFi coexistence is presented in [156]. The work in [156] considers a network composed of several LTE-U BSs belonging to different LTE operators, several WAPs and a set of unlicensed channels on which LTE-U BSs and WAPs can operate. The LTE carrier aggregation feature, using which the BSs can aggregate up to five component carriers belonging to the same or different operating frequency bands, is adopted. We consider a time domain divided into multiple time windows of duration T, each of which consists of multiple time epochs t. Our objective is to proactively determine the spectrum allocation vector for each BS at t = 0 over T while guaranteeing long-term equal weighted airtime share with WLAN. In particular, each BS learns its channel selection, carrier aggregation, and fractional spectrum access over T while ensuring long-term airtime fairness with the WLAN and the other LTE-U operators. A contention-based protocol is used for channel access over the unlicensed band. The exponential backoff scheme is adopted for WiFi while the BSs adjust their contention window size (and, thus, the channel access probability) on each of the selected channels based on the network traffic conditions while also guaranteeing a long-term equal weighted fairness with WLAN and other BSs.
The proactive resource allocation scheme in [156] is formulated as a noncooperative game in which the players are the BSs. Each BS must choose which channels to transmit on along with the corresponding channel access probabilities at t = 0 for each t of the next time window T. This, in turn, allows the BSs to determine future off-peak hours of the WLAN on each of the unlicensed channels, thus transmitting on the less congested channels. Each BS can therefore maximize its total throughput over the set of selected channels over T while guaranteeing long-term equal weighted fairness with the WLAN and the other BSs. To solve the formulated game (and find the so-called Nash equilibrium solution), a DNN framework based on LSTM cells was used. To allow a sequence-to-sequence mapping, we considered an encoder-decoder model as described in Section III-C. In this model, the encoder network maps an input sequence to a vector of a fixed dimensionality and then the decoder network decodes the target sequence from the vector. In this scheme, the input of the encoder is a time series representation of the historical traffic load of the BSs and WAPs on all the unlicensed channels. The learned vector representation is then fed into a multi-layer perceptron (MLP) that summarizes the input vectors into one vector, thus accounting for the dependency among all the input time series vectors. The output of the MLP is then fed into different separate decoders, allowing each BS to reconstruct its predicted action sequence.
To train the proposed network, the REINFORCE algorithm [157] is used to compute the gradient of the expected reward with respect to the policy parameters, and the standard gradient descent optimization algorithm [158] is adopted

to allow the model to generate optimal action sequences for input history traffic values. In particular, we considered the RMSprop gradient descent optimization algorithm [159], an adaptive learning rate approach, wherein the learning rate of a particular weight is divided by a running average of the magnitudes of the recent gradients for that weight.
Fig. 15. The average throughput gain for LTE-U upon applying a proactive approach (with varying T) as compared to a reactive approach [156].
The proposed proactive resource allocation scheme was compared with a reactive approach for three different network scenarios. Fig. 15 shows that for very small values of T, the proposed scheme does not yield any significant gains. However, as T increases, the BSs have additional opportunities for shifting part of the traffic into the future and, thus, the gains start to become more pronounced. For example, we can see that, for 4 BSs and 4 channels, the proposed proactive scheme achieves an increase of 17% and 20% in terms of the average airtime allocation for LTE-U as compared to the reactive approach. Here, note that the gain of the proposed scheme, with respect to the reactive approach, keeps on increasing until it reaches a maximum achievable value, after which it remains almost constant.
4) Lessons Learned: In the aforementioned application, we have demonstrated that LSTM can be an effective tool for resource management in an LTE-U system that needs to maintain a fair co-existence between WiFi and LTE. The key benefit brought forward by LSTM in this application is that it enabled the cellular system to accurately predict future off-peak hours of WiFi, so as to seize the channels on which to transmit. This, in turn, led to a better co-existence between the two systems, owing to the predictive ability of LSTM that provided the system with the ability to use historical WiFi traffic data to determine future traffic and, thus, make anticipatory resource management decisions. The main lessons learned here include:
• LSTM has mostly been used for data analytics. In the aforementioned application, the network needed LSTM as a part of an RL algorithm that can determine the solution of a game-theoretic setting, which can be thought of as the solution of a series of optimization problems that are solved at the level of each BS. In this context, LSTM enabled the RL algorithm to estimate future utilities (rather than just observe them from the environment as done in Q-learning) and, hence, be able to seek better optimization problem solutions (equivalent to game-theoretic equilibria). This was a novel use case of LSTM that is motivated by the underlying wireless system, rather than by the need to process some data.
• Even though proving the optimality properties of the LSTM output itself is difficult, in this application, we have shown that by combining LSTM with a game-theoretic framework, we can ensure that, whenever the RL algorithm converges, it is guaranteed to be at a Nash equilibrium (i.e., a point at which none of the RL algorithms can find a better outcome). However, guaranteeing convergence analytically is much more challenging than, for example, the ESN-based approaches we used in the VR and the UAV problems, due to the deep nature of LSTM. We do note that our thorough simulations (for many simulation parameters and settings) showed that the algorithm will actually always converge, even though that is not ascertained analytically. An interesting future research direction in this context is to analyze the convergence for LSTM-based RL in an LTE-U context, or more generally, in a multi-RAT resource management context. This difficulty in analyzing the convergence of LSTM can also be encountered when dealing with other types of ANN-based RL schemes.
• In this LTE-U scenario, the network operator can train the LSTM neural network in a completely offline manner since all that is needed for this training is to use past observations of WiFi traffic and it is generally known that, within a geographic area, over long periods of time, the wireless data traffic parameters are more or less consistent. This is a key motivation for using a deep architecture here.
• This example has demonstrated that, even though deep learning based on LSTM can provide significant improvements in the predictions of time-stamped sequences of data (here being the time-varying WiFi traffic), in a practical wireless application, one does not need to use many layers. In fact, through our simulations, we observed that increasing the number of hidden layers has a very small impact on the achieved performance. This is mainly due to the fact that the WiFi traffic that is used as input to LSTM in this work is much less time-varying than the datasets that are used in other, non-wireless fields, such as natural language processing, where multiple layers provide more gains. However, we do note that, in this work, we wanted to predict a future sequence of WiFi traffic data based on a significant history of data and, therefore, using shallow ANNs like ESN (e.g., as done in the UAV and VR applications) would not have been as effective as using LSTM that has both short and long term memory (as explained in Section III-C) and can more effectively handle predictions of future sequences that require significant historical data, as is the case for WiFi traffic. That said, in our simulations, we only needed three hidden layers to reap the benefits of LSTM.
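As a brief aside on the training step of the LTE-U example above, the RMSprop rule (each weight's step is scaled by a running average of the magnitude of its recent gradients) can be sketched as follows. This is only a minimal illustration and not the implementation of [156] or [159]: the function names `rmsprop_step` and `minimize_quadratic`, the hyperparameters `lr`, `decay`, and `eps`, and the toy quadratic objective are all assumptions made for the sketch. It uses the common formulation in which the running average is taken over squared gradients and its square root serves as the magnitude estimate.

```python
import math


def rmsprop_step(weights, grads, avg_sq, lr=0.01, decay=0.9, eps=1e-8):
    """Apply one RMSprop update in place and return the weights.

    avg_sq holds, per weight, a running average of squared gradients;
    its square root estimates the magnitude of the recent gradients.
    All hyperparameter defaults are illustrative assumptions.
    """
    for i, g in enumerate(grads):
        # Exponentially decayed average of the squared gradient.
        avg_sq[i] = decay * avg_sq[i] + (1.0 - decay) * g * g
        # Divide the learning rate by the recent gradient magnitude.
        weights[i] -= lr * g / (math.sqrt(avg_sq[i]) + eps)
    return weights


def minimize_quadratic(steps=1000):
    """Toy usage: minimize f(w) = sum_i c_i * w_i**2 with badly scaled c_i."""
    scales = [1.0, 100.0]   # hypothetical curvatures, differing by 100x
    weights = [1.0, 1.0]
    avg_sq = [0.0, 0.0]
    for _ in range(steps):
        grads = [2.0 * c * w for c, w in zip(scales, weights)]
        rmsprop_step(weights, grads, avg_sq)
    return weights


if __name__ == "__main__":
    print(minimize_quadratic())
```

Because each gradient is normalized by its own running magnitude, the two coordinates of the toy problem move toward zero at nearly the same pace even though their curvatures differ by a factor of 100, which is the practical appeal of a per-weight adaptive learning rate.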

• As is evident from the previous point, whether or not one adopts a deep architecture or a very advanced type of ANN depends on the type of application that is being addressed. For WiFi traffic prediction, a deep architecture was appropriate. Meanwhile, for prediction of mobility data and user-based content in the VR and UAV applications that were previously discussed, the use of a shallow RNN by itself provided significant gains, even without using a deep architecture. That said, as we will see later in Section IV, in some applications like IoT, one can solve meaningful wireless problems by resorting to very simple ANNs, such as FNNs, without the need for deep architectures or more advanced structures. This is a major contrast to other ML application domains such as computer vision, where oftentimes a complex, deep ANN is needed to obtain meaningful results.
• One disadvantage of using an ANN within an RL algorithm is that the prediction errors may affect the performance of the outcome. In some sense, within the aforementioned game-theoretic context, the efficiency of the reached equilibrium can be impacted by the prediction errors. While this is true for all of the applications in which we used ANNs as part of an RL algorithm, the effect of the prediction errors may be more pronounced for the LTE-U application because it may lead to the LTE seizing more or less WiFi slots than needed, which can directly impact the operation of the WiFi user. Naturally, this is a more serious drawback than in scenarios where the network is simply using ANNs to cache data (e.g., as in the previously discussed UAV application) or perform cell association (in which case, if a prediction error occurs, the network can simply fall back on known cell association algorithms).
5) Future Works: The above application of ANNs to LTE-U systems can be easily extended to a multi-mode network in which the BSs transmit on the licensed, the unlicensed, and the mmWave spectrum. In fact, given their capability of dealing with time series data, RNNs can enhance mobility and handover in highly mobile wireless environments by learning the mobility patterns of users, thus decreasing the ping-pong effect among different RATs. For instance, a predictive mobility management framework can address critical handover issues, including frequent handovers, handover failures, and excessive energy consumption for seamless handovers in emerging dense multi-RAT wireless cellular networks. ANNs can also predict the QoS requirements, in terms of delay and rate, for the future offered traffic. Moreover, they can predict the transmission links’ conditions and, thus, schedule users based on the links’ conditions and QoS requirements. Therefore, given the mobility patterns, the conditions of the transmission links, and the QoS requirements for each user, BSs can learn how to allocate different users on different bands such that the total network performance, in terms of delay and throughput, is optimized.
An interesting direction for future work on the use of DNNs for mmWave communication is antenna tilting. In particular, DNNs are capable of learning several features of the network environment and thus predicting the optimal tilt angle based on the availability of a LoS link and data rate requirements. This, in turn, improves the users’ throughput, thus achieving high data rates. Moreover, LSTMs are capable of learning long time series and, thus, they can allow BSs to predict the link formation for a mmWave backhaul network. In fact, the formation of this backhaul network is highly dependent on the network topology and the traffic conditions. Therefore, given the dynamics of the network, LSTMs enable BSs to dynamically update the formation of the links among each other based on the changes in the network. Moreover, SNNs can be used for mmWave channel modeling since they can process and predict continuous-time data effectively. A summary of key problems that can be solved by using ANNs in a multi-RAT system is presented in Table VI along with the challenges and future works.
F. Internet of Things
1) The Internet of Things: In the foreseeable future, it is envisioned that trillions of machine-type devices such as wearables, sensors, connected vehicles, or mundane objects will be connected to the Internet, forming a massive IoT ecosystem [160]. The IoT will enable machine-type devices to connect with each other over wireless links and operate in a self-organizing manner [161]. Therefore, IoT devices will be able to collect and exchange real-time information to provide smart services. In this respect, the IoT will allow delivering innovative services and solutions in the realms of smart cities, smart grids, smart homes, and connected vehicles that could provide a significant improvement in people’s lives. However, the practical deployment of an IoT system still faces many challenges [161] such as data analytics, computation, transmission capabilities, connectivity, end-to-end latency, security [162], and privacy. In particular, how to provide massive device connectivity with stringent latency requirements will be one of the most important challenges. The current centralized communication models and the corresponding technologies may not be able to provide such massive connectivity. Therefore, there is a need for a new communication architecture, such as fog computing models for IoT device connectivity. Moreover, for each IoT device, the energy and computational resources are limited. Hence, how to allocate computational resources and power for all the IoT devices to achieve the data rate and latency requirements is another challenge.
2) Neural Networks for the Internet of Things: ANNs can be used to address some of the key challenges within the context of the IoT. So far, ANNs have been used in four major applications for the IoT. First, ANNs enable the IoT system to leverage intelligent data analytics to extract important patterns and relationships from the data sent by the IoT devices. For example, ANNs can be used to discover important correlations among data to improve the data compression and data recovery. Second, using ANN-based RL algorithms, IoT devices can operate in a self-organizing manner and adapt their strategies (i.e., channel selection) based on the wireless and user environments. For instance, an IoT device that uses an ANN-based RL algorithm can dynamically select the most suitable
frequency band for communication according to the network state. Third, the IoT devices that use ANN-based algorithms can identify and classify the data collected from the IoT sensors. Finally, one of the main goals of the IoT is to improve the quality of life of humans and reduce the interaction between humans and IoT devices. Thus, ANNs can be used to predict the users’ behavior to provide advanced information for the IoT devices. For example, ANNs can be used to predict the time that an individual will come home and, hence, adjust the control strategy for the IoT devices at home.
Using ANNs for IoT problems faces many challenges. First, in an IoT, both energy and computational resources are limited. Therefore, one should consider the tradeoff between the energy and computational needs of training ANNs and the accuracy requirement of a given ANN-based learning algorithm. In particular, the higher the required accuracy, the higher the computational and energy requirements. Second, within an IoT ecosystem, the collected data may have different structures and even contain several errors. Therefore, when data are used to train ANNs, one should consider how to classify the data and deal with the flaws in the data. In other words, the ANNs in the IoT must tolerate erroneous data. Third, in the IoT system, ANNs can exploit thousands of types of data for prediction and self-organizing control. For a given task, the data collected from the IoT devices may not all be related to the task being executed. Hence, ANNs must select suitable data for the sought task.
The existing literature [120]–[128] has studied a number of problems related to using ANNs for the IoT. In [120], the authors use a framework to treat an IoT network as an ANN to reduce delivery latency. The authors in [121] and [122] used a backpropagation neural network for sensor failure detection in an IoT network. In [123], eight ML algorithms, including DNNs and FNNs, are tested for human activities classification and robot navigation as well as body postures and movements. In [124], the authors used the Laguerre neural network-based approximate dynamic programming scheme to improve the tracking efficiency in an IoT network. The authors in [125] developed a streaming hardware accelerator for CNNs to improve the accuracy of image detection in an IoT network. The work in [126] used a denoising autoencoder neural network for data sampling in an IoT network. In [127], a deep belief network is used for entity state prediction. The authors in [128] used ANNs for target surveillance. In summary, the prior works used ANNs to solve a number of IoT problems such as IoT network modeling, failure detection, human activities classification, and tracking accuracy improvement. However, ANNs can also be used to analyze the data correlation for data compression and data recovery, to identify humans, to predict human activities, and to manage the resources of devices. Next, we explain a specific application of ANNs in the IoT domain.
3) Example: One illustrative application for the use of ANNs within the context of the IoT is presented in [120] which studies how to improve the communication quality by mapping IoT networks to ANNs. The considered IoT network is primarily a wireless sensor network. Two objective functions are considered: a) minimizing the overall cost of communication between the devices mapped to the neurons in the input layer and the devices mapped to the neurons in the output layers. Here, the overall cost represents the total transmit power of all devices used to transmit the information signals, and b) minimizing the expected transmission time to deliver the information signals.
To minimize the total transmit power and the expected transmit time for the IoT, the basic idea of [120] is to train an ANN so as to approximate the objective functions discussed above and, then, map the IoT network to the ANN. FNNs are used for this mapping since they transmit the information in only one direction, forward, from the input nodes, through the hidden nodes, and to the output nodes. First, one must identify the devices that want to send signals as well as the devices that will receive signals. The IoT devices that want to send signals are mapped to the neurons in the input layers. The IoT devices that want to receive signals are mapped to the neurons in the output layers. The other IoT devices are mapped to the neurons in the hidden layers. Some of the devices that are mapped to the hidden layers will be used to forward the signals. Then, the FNN is trained in an offline manner to approximate the objective functions. The IoT network devices are mapped into neurons and wireless links into connections between neurons, and, hence, a method is needed to map the trained FNN to the IoT network. Since the computational resources of each IoT device are limited, IoT devices with different computational resources will map to a different number of neurons. For example, an IoT device that has more computational resources can map to a larger number of neurons. Moreover, to ensure the integrity of the mapping model, each neuron can only map to one of the IoT devices. Given that there are several ways to map the IoT network to the trained FNN, the optimal mapping is formulated as an integer linear program which is then solved using CPLEX. When the optimal mapping between the IoT network and the trained FNN is found, the optimal connections between the IoT devices are built. Hence, if the IoT network can find the optimal connections for all devices based on the objective functions, the transmit power and expected transmit time can be reduced. Simulation results show that the mapping algorithm can achieve significant gains in terms of total transmit power and expected transmit time compared to a centralized algorithm. This is because the IoT network uses FNNs to approximate the objective functions and find the optimal device connections.
4) Lessons Learned: This IoT application has shown that FNNs are an effective tool for network mapping in an IoT. This mapping can then be used to find the optimal transmission links from the transmitters to the receivers through a set of relays. We can summarize the main lessons learned here as follows:
• The advantage of FNNs for the studied IoT application is that it enabled the IoT devices to optimally build the transmission links between the receivers and the transmitters so as to reduce the transmission delay without any communications among the IoT devices. In this application, the wireless network only consists of the receivers, the transmitters, and the relays, and the data in this wireless network will only be transmitted from the transmitters to the relays, then from the relays to the receivers. The use of FNNs to map this network is appropriate as it
allows one to find the optimal transmission links between the transmitters and the receivers, through the relays. This was a novel use case of FNNs that is motivated by the underlying wireless system.
• FNNs are very simple neural networks with little training overhead, which makes them suitable for implementation in IoT systems in which the devices are resource-constrained.
• One disadvantage of using FNNs for mapping wireless networks is that they can only be used for a network with a small number of transmitters and receivers. This is due to the fact that, as the number of transmitters and receivers increases, the number of neurons in the input, output, and hidden layers increases. Since FNNs need to calculate the gradients of all of the neurons (in contrast to ESNs that only need to update the output weight matrix), the training complexity will significantly increase.
• The presented IoT application is restricted to a very simple mapping of IoT devices via an FNN. However, the IoT domain is much richer than this application and one can envision a plethora of resource management, physical layer enhancement, and network optimization problems that can be addressed using more elaborate ANNs such as those presented in Section III (and in the previous applications).
Note that the observations in the first, second, and third bullets above can be generalized to other works that rely on FNNs for solving wireless communication problems.
5) Future Works: ANNs are undoubtedly an important tool for solving a variety of problems in the IoT, particularly in terms of intelligent data analytics and smart operation. In fact, beyond using FNNs to map the IoT devices, hence optimizing the connections between the IoT devices as discussed above, FNNs can also be used to map other systems. For example, one can map the input layer of an FNN to the IoT devices and the output layer to the computing centers. Then, one can find an optimal allocation of computational tasks via FNN mapping. Moreover, ANNs can be used for data compression and recovery so as to reduce both the size of the transmitted data and the end-to-end device latency. To compress the data, an ANN needs to extract the most important features from the data and, then, these features can be used to represent the compressed data. In particular, CNNs can be used for data compression and recovery in the spatial domain while RNNs can be used for data compression and recovery in the time domain. This is because CNNs are effective at extracting patterns and features from large amounts of data while RNNs are suitable for extracting the relationships from time-dependent series data. In addition, DNNs can be used for human identification. An IoT system that can identify different individuals can pre-allocate spectral or computational resources to the IoT devices that a certain individual often uses. DNNs are suitable here because they have multiple hidden layers to store more information related to a user compared to other ANNs and, hence, DNNs can use one user’s information such as hairstyle, clothes, and oral patterns to identify that individual so as to provide services tailored to this user. A summary of key problems that can be solved by using ANNs in an IoT system is shown in Table VI along with the challenges and future works.
G. Summary
In summary, for wireless communications, ANNs have two important use cases: 1) ANN-based RL algorithms for network control, resource management, user association, and interference alignment, and 2) intelligent data analytics for signal detection, spectrum sensing, channel state detection, energy prediction, as well as user behavior predictions and classifications. In this subsection, we first summarize the advantages, challenges, and limitations of ANN-based RL algorithms for wireless communication applications. Then, we introduce the advantages, challenges, and limitations of using ANNs for data analytics in wireless networks.
1) Advantages of ANN-Based RL Algorithms: In general, RL algorithms based on ANNs can be used for wireless network control and resource management particularly when the wireless network states and conditions are unknown, as shown in the example of co-existence of multiple radio access technologies. Moreover, RL algorithms can be used to solve non-convex optimization problems or problems in which the optimization variables are coupled, as shown in the example of wireless virtual reality.
2) Challenges and Limitations of ANN-Based RL Algorithms: Implementing ANN-based RL algorithms in wireless networks also faces many challenges. First, for RL algorithms, the training complexity increases quickly as the number of BSs or users that implement the RL algorithms increases. In consequence, one needs to find a smart training method to decrease the training complexity. Moreover, the complexity and convergence of RL algorithms that rely on ANNs can be challenging to characterize analytically. Recently, most of the existing works use models based on Markov decision processes (MDPs) and game theory to analyze the convergence of RL algorithms. In fact, RL algorithms can also be used for studying problems that cannot be modeled by MDP or game theory models. However, the convergence of RL in such problems is often challenging to ascertain analytically and, thus, one has to rely on simulations. In addition, one must reduce the computational resources and power needed by ANN-based RL algorithms implemented at wireless devices. In fact, for ANN-based RL algorithms, the number of actions and states must be finite. In this case, ANN-based RL algorithms need to be carefully designed if they are to be used to solve problems that have continuous states and actions.
3) Advantages of ANN-Based Data Analytics Algorithms: The second important use case of ANNs in wireless networks is data analytics. In wireless networks, most of the collected data will be time-dependent. For example, mobile user behaviors, wireless signals, and energy consumption are all time-dependent. In consequence, wireless operators can use RNNs for user behavior prediction, signal detection, channel modeling, and energy prediction. In particular, due to the unique neuron connection method (each neuron in one layer can connect to the neurons in previous layers) of RNNs, they
are effective in dealing with time-dependent data. Moreover, one can use CNNs, a type of DNN, for modulation classification, as done in [32]. CNNs can also be used to analyze the images captured by the mobile devices such as VR devices and UAVs so as to extract the features of captured images. The features extracted by CNNs can be used for the users’ movement identification, environment identification, and data compression and recovery, which can be used for wireless network control and data traffic offloading. For example, one can use CNNs for data compression at the transmitters and data recovery at the receivers so as to reduce the traffic load over the transmission links between transmitters and receivers. Meanwhile, since SNNs consist of spiking neurons, they are effective in dealing with continuous data. In consequence, one can use SNNs for signal detection, channel modeling, channel state detection, and wireless device (aerial or ground) identification. For example, one can use both continuous flying trajectory and radio frequency signals as the input of SNNs to identify UAVs and then tweak their transmission parameters.
4) Challenges and Limitations of ANN-Based Data Analytics Algorithms: Implementing ANNs for data analytics in wireless networks also faces many challenges. First, the data related to the behavior of mobile users is not easy to collect due to privacy concerns. For instance, a network operator such as Verizon can collect only partial datasets related to the mobile users. Due to this partial availability of datasets, the prediction accuracy of ANNs can be compromised. Second, for data analytics, existing ANN-based learning algorithms cannot be readily implemented at the level of resource-limited mobile devices such as smartphones due to high training complexity and energy consumption. In fact, small IoT or wearable devices such as watches and IoT sensors, or even smartphones, can record more data related to the users’ environment compared to BSs that are located far away from the users. In consequence, if an ANN learning algorithm can be implemented at wearable and portable devices, it can more efficiently use the collected data related to the users’ behaviors for training and, hence, the prediction accuracy can be improved, while also alleviating privacy concerns. One possibility to overcome this challenge is to train at a BS or in the cloud and then implement the trained ANNs at the users’ devices. Third, distributed ANN learning algorithms are needed for wireless networks. In particular, mobile users will connect to different BSs as they move from one cell to another. In this case, the data related to such a mobile user may be located at different BSs and the BSs may not be able to exchange the collected data due to limited capacity of backhaul links. In consequence, a distributed ANN learning algorithm is needed for data analytics as the users’ data is located at different BSs. One possibility to overcome this challenge is to leverage the emerging idea of federated learning [163] that enables distributed learning. Moreover, the training complexity of ANN-based data analytics algorithms can be higher than that of other ML tools such as ridge regression. In consequence, one must balance the tradeoff between prediction accuracy and training complexity. Finally, training ANNs may require a large amount of training data (depending on the application) and such data may not be always readily available in a wireless network.
Table V summarizes the type of ANNs and learning algorithms used for each existing work in each application. Based on this table, one can identify the advantages, disadvantages, and limitations of each learning algorithm for all types of problems encountered in the literature. Table VI provides a summary of the key wireless networking problems that can be solved by using ANNs along with the challenges and relevant applications.
V. CONCLUSION
In this paper, we have provided one of the first comprehensive tutorials on the use of artificial neural networks-based machine learning for enabling a variety of applications in tomorrow’s wireless networks. In particular, we have presented an overview of a number of key types of neural networks such as recurrent, spiking, and deep neural networks. For each type, we have overviewed the basic architecture as well as the associated challenges and opportunities. Then, we have provided a panoramic overview of the variety of wireless communication problems that can be addressed using ANNs. In particular, we have investigated many emerging applications including unmanned aerial vehicles, wireless virtual reality, mobile edge caching and computing, Internet of Things, and multi-RAT wireless networks. For each application, we have provided the main motivation for using ANNs along with their associated challenges while also providing a detailed example for a use case scenario. Last, but not least, for each application, we have provided a broad overview of future works that can be addressed using ANNs. Clearly, the future of wireless networks will inevitably rely on artificial intelligence and, thus, this paper provides a stepping stone towards understanding the analytical machinery needed to develop such a new breed of wireless networks.
REFERENCES
[1] N. C. Luong, D. T. Hoang, P. Wang, D. Niyato, D. I. Kim, and Z. Han, “Data collection and wireless communication in Internet of Things (IoT) using economic analysis and pricing models: A survey,” IEEE Commun. Surveys Tuts., vol. 18, no. 4, pp. 2546–2590, 4th Quart., 2016.
[2] Z. Dawy, W. Saad, A. Ghosh, J. G. Andrews, and E. Yaacoub, “Toward massive machine type cellular communications,” IEEE Wireless Commun., vol. 24, no. 1, pp. 120–128, Nov. 2017.
[3] T. Park, N. Abuzainab, and W. Saad, “Learning how to communicate in the Internet of Things: Finite resources and heterogeneity,” IEEE Access, vol. 4, pp. 7063–7073, 2016.
[4] F. Boccardi, R. W. Heath, A. Lozano, T. L. Marzetta, and P. Popovski, “Five disruptive technology directions for 5G,” IEEE Commun. Mag., vol. 52, no. 2, pp. 74–80, Feb. 2014.
[5] “Study on latency reduction techniques for LTE,” 3rd Gener. Partnership Project, Sophia Antipolis, France, Rep. TR 36.881, 2016.
[6] Y. Zeng, R. Zhang, and T. J. Lim, “Wireless communications with unmanned aerial vehicles: Opportunities and challenges,” IEEE Commun. Mag., vol. 54, no. 5, pp. 36–42, May 2016.
[7] T. Zeng, O. Semiari, W. Saad, and M. Bennis, “Joint communication and control for wireless autonomous vehicular platoon systems,” IEEE Trans. Commun., to be published.
[8] M. Mozaffari, W. Saad, M. Bennis, and M. Debbah, “Communications and control for wireless drone-based antenna array,” IEEE Trans. Commun., vol. 67, no. 1, pp. 820–834, Jan. 2019.
[9] F. Javed, M. K. Afzal, M. Sharif, and B. Kim, “Internet of Things (IoT) operating systems support, networking technologies, applications, and challenges: A comparative review,” IEEE Commun. Surveys Tuts., vol. 20, no. 3, pp. 2062–2100, 3rd Quart., 2018.

[10] G. Durisi, T. Koch, and P. Popovski, “Toward massive, ultrareliable, and low-latency wireless communication with short packets,” Proc. IEEE, vol. 104, no. 9, pp. 1711–1726, Sep. 2016.
[11] D. G. Gopal and S. Kaushik, Emerging Technologies and Applications for Cloud-Based Gaming: Review on Cloud Gaming, vol. 41. Hershey, PA, USA: Inf. Sci., Mar. 2016, pp. 79–89.
[12] “Extended reality (XR) in 5G, version 14.0.0,” 3rd Gener. Partnership Project, Sophia Antipolis, France, Rep. TR 26.928, Mar. 2018.
[13] J. G. Andrews et al., “What will 5G be?” IEEE J. Sel. Areas Commun., vol. 32, no. 6, pp. 1065–1082, Jun. 2014.
[14] T. Segaran, Programming Collective Intelligence: Building Smart Web 2.0 Applications. Sebastopol, CA, USA: O’Reilly Media, 2007.
[15] B. Yegnanarayana, Artificial Neural Networks. New York, NY, USA: PHI Learn., 2009.
[16] G. Chakraborty and B. Chakraborty, “A novel normalization technique for unsupervised learning in ANN,” IEEE Trans. Neural Netw., vol. 11, no. 1, pp. 253–257, Jan. 2000.
[17] K. P. Bennett and A. Demiriz, “Semi-supervised support vector machines,” in Proc. Adv. Neural Inf. Process. Syst., 1999, pp. 368–374.
[18] V. Mnih et al., “Human-level control through deep reinforcement learning,” Nature, vol. 518, no. 7540, p. 529, 2015.
[19] C. Andrieu, N. De Freitas, A. Doucet, and M. I. Jordan, “An introduction to MCMC for machine learning,” Mach. Learn., vol. 50, nos. 1–2, pp. 5–43, Jan. 2003.
[20] W. T. Freeman, E. C. Pasztor, and O. T. Carmichael, “Learning low-level vision,” Int. J. Comput. Vis., vol. 40, no. 1, pp. 25–47, Oct. 2000.
[21] F. Sebastiani, “Machine learning in automated text categorization,” ACM Comput. Surveys, vol. 34, no. 1, pp. 1–47, Mar. 2002.
[22] R. Collobert and J. Weston, “A unified architecture for natural language processing: Deep neural networks with multitask learning,” in Proc. Int. Conf. Mach. Learn., Helsinki, Finland, Jul. 2008, pp. 160–167.
[23] B. Pang, L. Lee, and S. Vaithyanathan, “Thumbs up: Sentiment classification using machine learning techniques,” in Proc. Conf. Empirical Methods Nat. Lang. Process., Stroudsburg, PA, USA, Jul. 2002, pp. 79–86.
[24] C. M. Bishop, Pattern Recognition and Machine Learning. Heidelberg, Germany: Springer, 2006.
[25] S. Bi, R. Zhang, Z. Ding, and S. Cui, “Wireless communications in the era of big data,” IEEE Commun. Mag., vol. 53, no. 10, pp. 190–199, Oct. 2015.
[26] (2018). The Amazing Ways Verizon Uses AI and Machine Learning To Improve Performance. [Online]. Available: https://www.forbes.com/sites/bernardmarr/2018/06/22/the-amazing-ways-verizon-uses-ai-and-machine-learning-to-improve-performance/#3695e5e17638
[27] (2018). Making Waves With AI. [Online]. Available: https://www.ericsson.com/en/mobility-report/reports/june-2018/applying-machine-intelligence-to-network-management
[28] (2018). Qualcomm AI Research. [Online]. Available: https://www.qualcomm.com/invention/artificial-intelligence/ai-research
[29] (2018). Focus Group on Machine Learning for Future Networks Including 5G. [Online]. Available: https://www.itu.int/en/ITU-T/focusgroups/ml5g/Pages/default.aspx
[30] J. Ferber, Multi-Agent Systems: An Introduction to Distributed Artificial Intelligence, vol. 1. Harlow, U.K.: Addison-Wesley, 1999.
[31] S. Bubeck, Convex Optimization: Algorithms and Complexity. Hanover, Germany: Now Found. Trends, 2015.
[32] T. O’Shea and J. Hoydis, “An introduction to deep learning for the physical layer,” IEEE Trans. Cogn. Commun. Netw., vol. 3, no. 4, pp. 563–575, Dec. 2017.
[33] T. O’Shea, K. Karra, and T. C. Clancy, “Learning approximate neural estimators for wireless channel state information,” in Proc. IEEE Int. Workshop Mach. Learn. Signal Process. (MLSP), Tokyo, Japan, Sep. 2017, pp. 1–7.
[34] T. J. O’Shea, T. Erpek, and T. C. Clancy, “Deep learning based MIMO communications,” arXiv:1707.07980, Jul. 2017.
[35] F. Liang, C. Shen, and F. Wu, “An iterative BP-CNN architecture for channel decoding,” IEEE J. Sel. Topics Signal Process., vol. 12, no. 1, pp. 144–159, Feb. 2018.
[36] E. Nachmani, E. Marciano, L. Lugosch, W. J. Gross, D. Burshtein, and Y. Be’ery, “Deep learning methods for improved decoding of linear codes,” IEEE J. Sel. Topics Signal Process., vol. 12, no. 1, pp. 119–131, Feb. 2018.
[37] N. Samuel, T. Diskin, and A. Wiesel, “Deep MIMO detection,” in Proc. IEEE Int. Workshop Signal Process. Adv. Wireless Commun. (SPAWC), Sapporo, Japan, Jul. 2017, pp. 1–5.
[38] E. Baştuğ, M. Bennis, M. Médard, and M. Debbah, “Towards interconnected virtual reality: Opportunities, challenges and enablers,” IEEE Commun. Mag., vol. 55, no. 6, pp. 110–117, Jun. 2017.
[39] W. Saad, M. Bennis, and M. Chen, “A vision of 6G wireless systems: Applications, trends, technologies, and open research problems,” arXiv preprint arXiv:1902.10265, 2019.
[40] R. Yu. (2017). Huawei Reveals the Future of Mobile AI at IFA 2017. [Online]. Available: http://www.businesswire.com/news/home/20170902005020/en/Huawei-Reveals-Future-Mobile-AI-IFA-2017
[41] S. Kovach. (2017). Qualcomm CEO Steve Mollenkopf: What the Big Innovation House That Powered the Mobile Boom Is Betting on Next. [Online]. Available: http://www.businessinsider.com/qualcomm-ceo-steve-mollenkopf-interview-2017-7
[42] Y. C. Hu, M. Patel, D. Sabella, N. Sprecher, and V. Young, “Mobile edge computing—A key technology towards 5G,” vol. 11, Sophia Antipolis, France, ETSI, White Paper, pp. 1–16, Sep. 2015.
[43] A. Ahmed and E. Ahmed, “A survey on mobile edge computing,” in Proc. Int. Conf. Intell. Syst. Control, Coimbatore, India, Jan. 2016, pp. 2322–2358.
[44] S. Sardellitti, G. Scutari, and S. Barbarossa, “Joint optimization of radio and computational resources for multicell mobile-edge computing,” IEEE Trans. Signal Inf. Process. Netw., vol. 1, no. 2, pp. 89–103, Jun. 2015.
[45] S. Nunna et al., “Enabling real-time context-aware collaboration through 5G and mobile edge computing,” in Proc. Int. Conf. Inf. Technol. New Gener., Las Vegas, NV, USA, Jun. 2015, pp. 601–605.
[46] G. Lee, W. Saad, and M. Bennis, “Decentralized cross-tier interference mitigation in cognitive femtocell networks,” in Proc. IEEE Int. Conf. Commun. (ICC), Paris, France, May 2017, pp. 1–5.
[47] Y. Mao, J. Zhang, and K. B. Letaief, “Dynamic computation offloading for mobile-edge computing with energy harvesting devices,” IEEE J. Sel. Areas Commun., vol. 34, no. 12, pp. 3590–3605, Dec. 2016.
[48] X. Chen, L. Jiao, W. Li, and X. Fu, “Efficient multi-user computation offloading for mobile-edge cloud computing,” IEEE/ACM Trans. Netw., vol. 24, no. 5, pp. 2795–2808, Oct. 2016.
[49] O. Semiari, W. Saad, S. Valentin, M. Bennis, and H. V. Poor, “Context-aware small cell networks: How social metrics improve wireless resource allocation,” IEEE Trans. Wireless Commun., vol. 14, no. 11, pp. 5927–5940, Jul. 2015.
[50] H. Sun, X. Chen, Q. Shi, M. Hong, X. Fu, and N. D. Sidiropoulos, “Learning to optimize: Training deep neural networks for wireless resource management,” arXiv:1705.09412, May 2017.
[51] C. Jiang, H. Zhang, Y. Ren, Z. Han, K.-C. Chen, and L. Hanzo, “Machine learning paradigms for next-generation wireless networks,” IEEE Wireless Commun., vol. 24, no. 2, pp. 98–105, Apr. 2017.
[52] M. Bkassiny, Y. Li, and S. K. Jayaweera, “A survey on machine-learning techniques in cognitive radios,” IEEE Commun. Surveys Tuts., vol. 15, no. 3, pp. 1136–1159, 3rd Quart., 2013.
[53] M. A. Alsheikh, S. Lin, D. Niyato, and H.-P. Tan, “Machine learning in wireless sensor networks: Algorithms, strategies, and applications,” IEEE Commun. Surveys Tuts., vol. 16, no. 4, pp. 1996–2018, 4th Quart., 2014.
[54] H. B. Demuth, M. H. Beale, O. De Jess, and M. T. Hagan, Neural Network Design. Cambridge, MA, USA: Martin Hagan, 2014.
[55] J. Xie et al., “A survey of machine learning techniques applied to software defined networking (SDN): Research issues and challenges,” IEEE Commun. Surveys Tuts., vol. 21, no. 1, pp. 393–430, 1st Quart., 2019.
[56] Q. Mao, F. Hu, and Q. Hao, “Deep learning for intelligent wireless networks: A comprehensive survey,” IEEE Commun. Surveys Tuts., vol. 20, no. 4, pp. 2595–2621, 4th Quart., 2018.
[57] M. Mohammadi, A. Al-Fuqaha, S. Sorour, and M. Guizani, “Deep learning for IoT big data and streaming analytics: A survey,” IEEE Commun. Surveys Tuts., vol. 20, no. 4, pp. 2923–2960, 4th Quart., 2018.
[58] Z. M. Fadlullah et al., “State-of-the-art deep learning: Evolving machine intelligence toward tomorrow’s intelligent network traffic control systems,” IEEE Commun. Surveys Tuts., vol. 19, no. 4, pp. 2432–2455, 4th Quart., 2017.
[59] P. V. Klaine, M. A. Imran, O. Onireti, and R. D. Souza, “A survey of machine learning techniques applied to self-organizing cellular networks,” IEEE Commun. Surveys Tuts., vol. 19, no. 4, pp. 2392–2431, 4th Quart., 2017.
[60] Y. Sun, M. Peng, Y. Zhou, Y. Huang, and S. Mao, “Application of machine learning in wireless networks: Key techniques and open issues,” IEEE Commun. Surveys Tuts., to be published.

[61] N. C. Luong et al., “Applications of deep reinforcement learning in communications and networking: A survey,” IEEE Commun. Surveys Tuts., to be published.
[62] X. You, C. Zhang, X. Tan, S. Jin, and H. Wu, “AI for 5G: Research directions and paradigms,” Sci. China Inf. Sci., vol. 62, no. 2, pp. 1589–1602, Nov. 2019.
[63] J. Schmidhuber, “Deep learning in neural networks: An overview,” Neural Netw., vol. 61, pp. 85–117, Jan. 2015.
[64] Machine Learning: What it Is and Why it Matters. Accessed: Sep. 2017. [Online]. Available: https://www.sas.com/en_us/insights/analytics/machine-learning.html
[65] E. Alpaydin, Introduction to Machine Learning. Cambridge, MA, USA: MIT Press, 2014.
[66] L. Rose, S. Lasaulce, S. M. Perlaza, and M. Debbah, “Learning equilibria with partial information in decentralized wireless networks,” IEEE Commun. Mag., vol. 49, no. 8, pp. 136–142, Aug. 2011.
[67] D. P. Mandic and J. A. Chambers, Recurrent Neural Networks for Prediction: Learning Algorithms, Architectures and Stability. New York, NY, USA: Wiley, 2001.
[68] M. Lukoševičius and H. Jaeger, “Reservoir computing approaches to recurrent neural network training,” Comput. Sci. Rev., vol. 3, no. 3, pp. 127–149, Aug. 2009.
[69] P. J. Werbos, “Backpropagation through time: What it does and how to do it,” Proc. IEEE, vol. 78, no. 10, pp. 1550–1560, Oct. 1990.
[70] M. Lukoševičius, A Practical Guide to Applying Echo State Networks. Heidelberg, Germany: Springer, 2012.
[71] H. Jaeger, “Short term memory in echo state networks,” German Nat. Res. Centre Inf. Technol., Sankt Augustin, Germany, GMD Rep. 152, 2001.
[72] R. Ali and T. Peter, “Minimum complexity echo state network,” IEEE Trans. Neural Netw., vol. 22, no. 1, pp. 131–144, Nov. 2011.
[73] C. Gallicchio and A. Micheli, “Deep reservoir computing: A critical analysis,” in Proc. Eur. Symp. Artif. Neural Netw. Comput. Intell. Mach. Learn., Bruges, Belgium, Apr. 2016, pp. 1–6.
[74] C. M. Bishop, “Training with noise is equivalent to Tikhonov regularization,” Neural Comput., vol. 7, no. 1, pp. 108–116, 2008.
[75] B. Farhang-Boroujeny, Adaptive Filters: Theory and Applications. Chichester, U.K.: Wiley, 2013.
[76] H. Jaeger and H. Haas, “Harnessing nonlinearity: Predicting chaotic systems and saving energy in wireless communication,” Science, vol. 304, no. 5667, pp. 78–80, Apr. 2004.
[77] D. Verstraeten, B. Schrauwen, D. Stroobandt, and J. Van Campenhout, “Isolated word recognition with the liquid state machine: A case study,” Inf. Process. Lett., vol. 95, no. 6, pp. 521–528, Sep. 2005.
[78] W. Maass, “Liquid state machines: Motivation, theory, and applications,” in Computability in Context: Computation and Logic in the Real World. London, U.K.: Imperial College Press, 2010, pp. 275–296.
[79] W. Maass, T. Natschläger, and H. Markram, “Real-time computing without stable states: A new framework for neural computation based on perturbations,” Neural Comput., vol. 14, no. 11, pp. 2531–2560, Nov. 2002.
[80] A. Courville, I. Goodfellow, and Y. Bengio, Deep Learning. Cambridge, MA, USA: MIT Press, 2016.
[81] X. Glorot, A. Bordes, and Y. Bengio, “Deep sparse rectifier networks,” in Proc. Artif. Intell. Stat. (AISTATS), Fort Lauderdale, FL, USA, Jun. 2011, pp. 315–323.
[82] I. Goodfellow, Y. Bengio, and A. Courville, “Regularization for deep learning,” in Deep Learning. Cambridge, MA, USA: MIT Press, 2016, ch. 7. [Online]. Available: http://www.deeplearningbook.org
[83] A Beginner Guide to Recurrent Networks and LSTMs. Accessed: Sep. 2017. [Online]. Available: https://deeplearning4j.org/lstm.html
[84] A. Krizhevsky, I. Sutskever, and G. E. Hinton, “ImageNet classification with deep convolutional neural networks,” in Proc. Adv. Neural Inf. Process. Syst., 2012, pp. 1097–1105.
[85] M. Schmidt, D. Block, and U. Meier, “Wireless interference identification with convolutional neural networks,” in Proc. IEEE Int. Conf. Ind. Inf. (INDIN), Emden, Germany, Jul. 2017, pp. 180–185.
[86] M. Soh. (2016). Learning CNN-LSTM Architectures for Image Caption Generation. [Online]. Available: http://cs224d.stanford.edu/reports/msoh.pdf
[87] L. J. Lin, “Reinforcement learning for robots using neural networks,” School Comput. Sci., Carnegie-Mellon Univ., Pittsburgh, PA, USA, Rep. CMU-CS-93-103, 1993.
[88] Y. Yang, M. Chen, C. Guo, C. Feng, and W. Saad, “Power efficient visible light communication (VLC) with unmanned aerial vehicles (UAVs),” IEEE Commun. Lett., to be published.
[89] M. Mozaffari, W. Saad, M. Bennis, and M. Debbah, “Wireless communication using unmanned aerial vehicles (UAVs): Optimal transport theory for hover time optimization,” IEEE Trans. Wireless Commun., vol. 16, no. 12, pp. 8052–8066, Dec. 2017.
[90] C. H. Liu, Z. Chen, J. Tang, J. Xu, and C. Piao, “Energy-efficient UAV control for effective and fair communication coverage: A deep reinforcement learning approach,” IEEE J. Sel. Areas Commun., vol. 36, no. 9, pp. 2059–2070, Sep. 2018.
[91] V. Sharma, M. Bennis, and R. Kumar, “UAV-assisted heterogeneous networks for capacity enhancement,” IEEE Commun. Lett., vol. 20, no. 6, pp. 1207–1210, Jun. 2016.
[92] H. Zhang, C. Cao, L. Xu, and T. A. Gulliver, “A UAV detection algorithm based on an artificial neural network,” IEEE Access, vol. 6, pp. 24720–24728, 2018.
[93] D. Nodland, H. Zargarzadeh, and S. Jagannathan, “Neural network-based optimal adaptive output feedback control of a helicopter UAV,” IEEE Trans. Neural Netw. Learn. Syst., vol. 24, no. 7, pp. 1061–1073, Jul. 2013.
[94] J. R. G. Braga, H. F. C. Velho, G. Conte, P. Doherty, and É. H. Shiguemori, “An image matching system for autonomous UAV navigation based on neural network,” in Proc. Int. Conf. Control Autom. Robot. Vis. (ICARCV), Phuket, Thailand, Nov. 2016, pp. 1–6.
[95] J. Cui, Y. Liu, and A. Nallanathan, “Multi-agent reinforcement learning based resource allocation for UAV networks,” arXiv preprint arXiv:1810.10408, 2018.
[96] M. Chen, M. Mozaffari, W. Saad, C. Yin, M. Debbah, and C. S. Hong, “Caching in the sky: Proactive deployment of cache-enabled unmanned aerial vehicles for optimized quality-of-experience,” IEEE J. Sel. Areas Commun., vol. 35, no. 5, pp. 1046–1061, May 2017.
[97] M. Chen, W. Saad, and C. Yin, “Liquid state machine learning for resource and cache management in LTE-U unmanned aerial vehicle (UAV) networks,” IEEE Trans. Wireless Commun., vol. 18, no. 3, pp. 1504–1517, Mar. 2019.
[98] X. Liu, M. Chen, and C. Yin, “Optimized trajectory design in UAV based cellular networks for 3D users: A double Q-learning approach,” J. Commun. Inf. Netw., vol. 4, no. 1, pp. 24–31, Apr. 2019.
[99] H. Jaeger. (2014). Controlling Recurrent Neural Networks by Conceptors. [Online]. Available: https://arxiv.org/abs/1403.3369
[100] M. Chen, W. Saad, and C. Yin, “Echo state networks for self-organizing resource allocation in LTE-U with uplink–downlink decoupling,” IEEE Trans. Wireless Commun., vol. 16, no. 1, pp. 3–16, Jan. 2017.
[101] U. Challita, Z. Dawy, G. Turkiyyah, and J. Naoum-Sawaya, “A chance constrained approach for LTE cellular network planning under uncertainty,” Comput. Commun., vol. 73, pp. 34–45, Jan. 2016.
[102] M. Chen, W. Saad, C. Yin, and M. Debbah, “Data correlation-aware resource management in wireless virtual reality (VR): An echo state transfer learning approach,” IEEE Trans. Commun., vol. 67, no. 6, pp. 4267–4280, Jun. 2019.
[103] M. Chen, W. Saad, and C. Yin, “Virtual reality over wireless networks: Quality-of-service model and learning-based resource management,” IEEE Trans. Commun., vol. 66, no. 11, pp. 5621–5635, Nov. 2018.
[104] M. Chen, O. Semiari, W. Saad, X. Liu, and C. Yin, “Federated echo state learning for minimizing breaks in presence in wireless virtual reality networks,” arXiv preprint arXiv:1812.01202, 2018.
[105] G. A. Koulieris, G. Drettakis, D. Cunningham, and K. Mania, “Gaze prediction using machine learning for dynamic stereo manipulation in games,” in Proc. IEEE Virtual Reality (VR), Greenville, SC, USA, Mar. 2016, pp. 113–120.
[106] M. Chen, W. Saad, and C. Yin, “Echo-liquid state deep learning for 360° content transmission and caching in wireless VR networks with cellular-connected UAVs,” IEEE Trans. Commun., to be published.
[107] E. Zeydan et al., “Big data caching for networking: Moving from cloud to edge,” IEEE Commun. Mag., vol. 54, no. 9, pp. 36–42, Sep. 2016.
[108] J. Cobb and H. ElAarag, “Web proxy cache replacement scheme based on back-propagation neural network,” J. Syst. Softw., vol. 81, no. 9, pp. 1539–1558, Sep. 2008.
[109] Z. Zhang, M. Hua, C. Li, Y. Huang, and L. Yang, “Placement delivery array design via attention-based sequence-to-sequence model with deep neural network,” IEEE Wireless Commun. Lett., vol. 8, no. 2, pp. 372–375, Apr. 2019.
[110] Y. Wei, F. R. Yu, M. Song, and Z. Han, “Joint optimization of caching, computing, and radio resources for fog-enabled IoT using natural actor–critic deep reinforcement learning,” IEEE Internet Things J., vol. 6, no. 2, pp. 2061–2073, Apr. 2019.
[111] E. Baştuğ et al., “Big data meets telcos: A proactive caching perspective,” J. Commun. Netw., vol. 17, no. 6, pp. 549–557, Dec. 2015.


[112] S. M. S. Tanzil, W. Hoiles, and V. Krishnamurthy, “Adaptive scheme for caching YouTube content in a cellular network: Machine learning approach,” IEEE Access, vol. 5, pp. 5870–5881, 2017.
[113] M. Chen, W. Saad, C. Yin, and M. Debbah, “Echo state networks for proactive caching in cloud-based radio access networks with mobile users,” IEEE Trans. Wireless Commun., vol. 16, no. 6, pp. 3520–3535, Jun. 2017.
[114] L. Giupponi, R. Agusti, J. Perez-Romero, and O. Sallent, “Joint radio resource management algorithm for multi-RAT networks,” in Proc. IEEE Glob. Telecommun. Conf. (GLOBECOM), St. Louis, MO, USA, Nov. 2005, p. 5.
[115] H. He, C. Wen, S. Jin, and G. Y. Li, “Deep learning-based channel estimation for beamspace mmwave massive MIMO systems,” IEEE Wireless Commun. Lett., vol. 7, no. 5, pp. 852–855, Oct. 2018.
[116] S. Baban, D. Denkoviski, O. Holland, L. Gavrilovska, and H. Aghvami, “Radio access technology classification for cognitive radio networks,” in Proc. IEEE Int. Symp. Pers. Indoor Mobile Radio Commun. (PIMRC), London, U.K., Sep. 2013, pp. 2718–2722.
[117] Y. Cui, Y. Xu, R. Xu, and X. Sha, “A multi-radio packet scheduling algorithm for real-time traffic in a heterogeneous wireless network environment,” Inf. Technol. J., vol. 10, pp. 182–188, Oct. 2010.
[118] M. Bennis, M. Simsek, A. Czylwik, W. Saad, S. Valentin, and M. Debbah, “When cellular meets WiFi in wireless small cell networks,” IEEE Commun. Mag., vol. 51, no. 6, pp. 44–50, Jun. 2013.
[119] Y. Sun, M. Peng, and S. Mao, “Deep reinforcement learning-based mode selection and resource management for green fog radio access networks,” IEEE Internet Things J., vol. 6, no. 2, pp. 1960–1971, Apr. 2019.
[120] N. Kaminski et al., “A neural-network-based realization of in-network computation for the Internet of Things,” in Proc. IEEE Int. Conf. Commun., Paris, France, May 2017, pp. 1–6.
[121] S. R. Naidu, E. Zafiriou, and T. J. McAvoy, “Use of neural networks for sensor failure detection in a control system,” IEEE Control Syst. Mag., vol. 10, no. 3, pp. 49–55, Apr. 1990.
[122] H. Ning and Z. Wang, “Future Internet of Things architecture: Like mankind neural system or social organization framework?” IEEE Commun. Lett., vol. 15, no. 4, pp. 461–463, Mar. 2011.
[123] F. Alam, R. Mehmood, I. Katib, and A. Albeshri, “Analysis of eight data mining algorithms for smarter Internet of Things (IoT),” Procedia Comput. Sci., vol. 98, pp. 437–442, Dec. 2016.
[124] X. Luo, Y. Lv, M. Zhou, W. Wang, and W. Zhao, “A Laguerre neural network-based ADP learning scheme with its application to tracking control in the Internet of Things,” Pers. Ubiquitous Comput., vol. 20, no. 3, pp. 361–372, Jun. 2016.
[125] L. Du et al., “A reconfigurable streaming deep convolutional neural network accelerator for Internet of Things,” IEEE Trans. Circuits Syst. I, Reg. Papers, vol. 65, no. 1, pp. 198–208, Jan. 2018.
[126] T. Yu, X. Wang, and A. Shami, “UAV-enabled spatial data sampling in large-scale IoT systems using denoising autoencoder neural network,” IEEE Internet Things J., vol. 6, no. 2, pp. 1856–1865, Apr. 2019.
[127] P. Zhang, X. Kang, D. Wu, and R. Wang, “High-accuracy entity state prediction method based on deep belief network toward IoT search,” IEEE Wireless Commun. Lett., vol. 8, no. 2, pp. 492–495, Apr. 2019.
[128] J. Liang, X. Yu, and H. Li, “Collaborative energy-efficient moving in Internet of Things: Genetic fuzzy tree vs. neural networks,” IEEE Internet Things J., to be published.
[129] (2018). Qualcomm Announces Support for Next-Generation VR Experiences With New Snapdragon 845 Virtual Reality Development Kit. [Online]. Available: https://www.qualcomm.com/news/releases/2018/03/21/qualcomm-announces-support-next-generation-vr-experiences-new-snapdragon
[130] M. Bennis, M. S. Elbamby, C. Perfecto, and K. Doppler. (Jan. 2018). Towards Low-Latency and Ultra-Reliable Virtual Reality. [Online]. Available: https://arxiv.org/abs/1801.07587
[131] (2018). Vive Wireless Adapter. [Online]. Available: https://www.vive.com/us/wireless-adapter/
[132] (2018). TPCAST Wireless Adapter for Oculus RIFT. [Online]. Available: https://www.tpcastvr.com/product-rift
[133] (2018). Because Your Senses Do Not Have Wires. [Online]. Available: https://www.intel.com/content/www/us/en/wireless-products/wigig-overview.html
[134] A. E. Abbas, “Constructing multiattribute utility functions for decision analysis,” in Proc. INFORMS Tuts. Oper. Res., Oct. 2010, pp. 62–98.
[135] M. Abrash. What VR Could, Should, and Almost Certainly Will Be Within Two Years. Accessed: Sep. 2017. [Online]. Available: https://media.steampowered.com/apps/abrashblog/Abrash
[136] E. Bastug, M. Bennis, and M. Debbah, “Living on the edge: The role of proactive caching in 5G wireless networks,” IEEE Commun. Mag., vol. 52, no. 8, pp. 82–89, Aug. 2014.
[137] M. A. Maddah-Ali and U. Niesen, “Coding for caching: Fundamental limits and practical challenges,” IEEE Commun. Mag., vol. 54, no. 8, pp. 23–29, Aug. 2016.
[138] Y. Fadlallah, A. M. Tulino, D. Barone, G. Vettigli, J. Llorca, and J.-M. Gorce, “Coding for caching in 5G networks,” IEEE Commun. Mag., vol. 55, no. 2, pp. 106–113, Feb. 2017.
[139] Y. Mao, C. You, J. Zhang, K. Huang, and K. B. Letaief, “A survey on mobile edge computing: The communication perspective,” IEEE Commun. Surveys Tuts., vol. 19, no. 4, pp. 2322–2358, 4th Quart., 2017.
[140] M. Peng, Y. Sun, X. Li, Z. Mao, and C. Wang, “Recent advances in cloud radio access networks: System architectures, key techniques, and open issues,” IEEE Commun. Surveys Tuts., vol. 18, no. 3, pp. 2282–2308, 3rd Quart., 2016.
[141] M. S. Elbamby, M. Bennis, and W. Saad, “Proactive edge computing in latency-constrained fog networks,” in Proc. Eur. Conf. Netw. Commun. (EuCNC), Oulu, Finland, Jun. 2017, pp. 1–6.
[142] D. Wu and R. Negi, “Effective capacity-based quality of service measures for wireless networks,” Mobile Netw. Appl., vol. 11, no. 1, pp. 91–99, Feb. 2006.
[143] Y. Gu, W. Saad, M. Bennis, M. Debbah, and Z. Han, “Matching theory for future wireless networks: Fundamentals and applications,” IEEE Commun. Mag., vol. 53, no. 5, pp. 52–59, May 2015.
[144] The 5G Infrastructure Public Private Partnership: The Next Generation of Communication Networks and Services, 5GPPP, Valencia, Spain, Feb. 2015.
[145] C. Sexton, N. J. Kaminski, J. M. Marquez-Barja, N. Marchetti, and L. A. DaSilva, “5G: Adaptable networks enabled by versatile radio access technologies,” IEEE Commun. Surveys Tuts., vol. 19, no. 2, pp. 688–720, 2nd Quart., 2017.
[146] Y. Hu, R. MacKenzie, and M. Hao, “Expected Q-learning for self-organizing resource allocation in LTE-U with downlink-uplink decoupling,” in Proc. Eur. Wireless Conf., Dresden, Germany, May 2017, pp. 1–6.
[147] J. G. Andrews, “Seven ways that HetNets are a cellular paradigm shift,” IEEE Commun. Mag., vol. 51, no. 3, pp. 136–144, Mar. 2013.
[148] G. Salami, O. Durowoju, A. Attar, O. Holland, R. Tafazolli, and H. Aghvami, “A comparison between the centralized and distributed approaches for spectrum management,” IEEE Commun. Surveys Tuts., vol. 13, no. 2, pp. 274–290, 2nd Quart., 2011.
[149] Q. C. Li, H. Niu, A. T. Papathanassiou, and G. Wu, “5G network capacity: Key elements and technologies,” IEEE Veh. Technol. Mag., vol. 9, no. 1, pp. 71–78, Mar. 2014.
[150] O. Semiari, W. Saad, and M. Bennis, “Joint millimeter wave and microwave resources allocation in cellular networks with dual-mode base stations,” IEEE Trans. Wireless Commun., vol. 16, no. 7, pp. 4802–4816, Jul. 2017.
[151] S. Ha, S. Sen, C. Joe-Wong, Y. Im, and M. Chiang, “TUBE: Time-dependent pricing for mobile data,” in Proc. Special Interest Group Data Commun. (ACM SIGCOMM), Helsinki, Finland, Aug. 2012, pp. 247–258.
[152] U. Challita and W. Saad, “Network formation in the sky: Unmanned aerial vehicles for multi-hop wireless backhauling,” in Proc. IEEE Glob. Commun. Conf. (GLOBECOM), Singapore, Dec. 2017, pp. 1–6.
[153] O. Semiari, W. Saad, M. Bennis, and Z. Dawy, “Inter-operator resource management for millimeter wave multi-hop backhaul networks,” IEEE Trans. Wireless Commun., vol. 16, no. 8, pp. 5258–5272, Aug. 2017.
[154] N. Burlutskiy, M. Petridis, A. Fish, A. Chernov, and N. Ali, “An investigation on online versus batch learning in predicting user behaviour,” in Research and Development in Intelligent Systems XXXIII. Cham, Switzerland: Springer, 2016.
[155] U. Challita, L. Dong, and W. Saad, “Deep learning for proactive resource allocation in LTE-U networks,” in Proc. Eur. Wireless Conf., Dresden, Germany, May 2017, pp. 1–6.
[156] U. Challita, L. Dong, and W. Saad, “Proactive resource management for LTE in unlicensed spectrum: A deep learning perspective,” IEEE Trans. Wireless Commun., vol. 17, no. 7, pp. 4674–4689, Jul. 2018.
[157] R. J. Williams, “Simple statistical gradient-following algorithms for connectionist reinforcement learning,” Mach. Learn., vol. 8, nos. 3–4, pp. 229–256, May 1992.
[158] R. S. Sutton, D. McAllester, S. Singh, and Y. Mansour, “Policy gradient methods for reinforcement learning with function approximation,” in Proc. Adv. Neural Inf. Process. Syst., vol. 12, 2000, pp. 1057–1063.

[159] T. Tieleman and G. Hinton, “Lecture 6.5—RmsProp: Divide the gradient by a running average of its recent magnitude,” Rep., 2012.
[160] M. Agiwal, A. Roy, and N. Saxena, “Next generation 5G wireless networks: A comprehensive survey,” IEEE Commun. Surveys Tuts., vol. 18, no. 3, pp. 1617–1655, 3rd Quart., 2016.
[161] A. Al-Fuqaha, M. Guizani, M. Mohammadi, M. Aledhari, and M. Ayyash, “Internet of Things: A survey on enabling technologies, protocols, and applications,” IEEE Commun. Surveys Tuts., vol. 17, no. 4, pp. 2347–2376, 4th Quart., 2015.
[162] Y. Hu, A. Sanjab, and W. Saad, “Dynamic psychological game theory for secure Internet of Battlefield Things (IoBT) systems,” IEEE Internet Things J., vol. 6, no. 2, pp. 3712–3726, Apr. 2019.
[163] V. Smith, C. K. Chiang, M. Sanjabi, and A. Talwalkar. (May 2017). Federated Multi-Task Learning. [Online]. Available: https://arxiv.org/abs/1705.10467

Changchuan Yin (M’98–SM’15) received the Ph.D. degree in telecommunication engineering from the Beijing University of Posts and Telecommunications, Beijing, China, in 1998. In 2004, he held a visiting position with the Faculty of Science, University of Sydney, Sydney, Australia. From 2007 to 2008, he held a visiting position with the Department of Electrical and Computer Engineering, Texas A&M University, College Station, USA. He is currently a Professor with the School of Information and Communication Engineering, Beijing University of Posts and Telecommunications. His research interests include wireless networks and statistical signal processing. He was a co-recipient of the IEEE International Conference on Wireless Communications and Signal Processing Best Paper Award in 2009. He has served as a technical program committee member for various IEEE conferences.

Mingzhe Chen received the Ph.D. degree

from the Beijing University of Posts and
Telecommunications, Beijing, China, in 2019.
From 2016 to 2019, he was a Visiting Researcher
with the Department of Electrical and Computer
Engineering, Virginia Tech. He is currently a
Post-Doctoral Fellow with the Chinese University
of Hong Kong, Shenzhen, China, and the Electrical
Engineering Department, Princeton University.
His research interests include machine learning,
virtual reality, unmanned aerial vehicles, game
theory, wireless networks, and caching. He was an exemplary reviewer
for IEEE TRANSACTIONS ON WIRELESS COMMUNICATIONS and IEEE
TRANSACTIONS ON COMMUNICATIONS in 2018.

Ursula Challita received the Ph.D. degree from the University of Edinburgh, U.K., in 2018. From 2016 to 2018, she was a Visiting Research Scholar with Virginia Tech, USA. She is currently an Experienced Researcher with Ericsson Research, Stockholm, Sweden. Her research interests include machine learning, optimization theory, wireless networks, unmanned aerial vehicles, and spectrum management. She was a recipient of the Edinburgh Global Research Scholarship and Principal’s Career Development Scholarship from 2014 to 2017.

Mérouane Debbah (S’01–M’04–SM’08–F’15) received the M.Sc. and Ph.D. degrees from the Ecole Normale Supérieure Paris-Saclay, France. He was with Motorola Labs, Saclay, France, from 1999 to 2002, and also with the Vienna Research Center for Telecommunications, Vienna, Austria, until 2003. From 2003 to 2007, he was an Assistant Professor with the Mobile Communications Department, Institut Eurecom, Sophia Antipolis, France. From 2007 to 2014, he was the Director of
Walid Saad (S’07–M’10–SM’15–F’19) received the the Alcatel-Lucent Chair on Flexible Radio. Since
Ph.D. degree from the University of Oslo in 2010. 2007, he has been a Full Professor with CentraleSupelec, Gif-sur-Yvette,
He is currently a Professor with the Department of France. Since 2014, he has been the Vice-President of the Huawei France
Electrical and Computer Engineering, Virginia Tech, Research Center and the Director of the Mathematical and Algorithmic
where he leads the Network Science, Wireless, and Sciences Laboratory. He has managed 8 EU projects and over 24 national and
Security Laboratory. His research interests include international projects. His research interests lie in fundamental mathematics,
wireless networks, machine learning, game theory, algorithms, statistics, information, and communication sciences research. He
security, unmanned aerial vehicles, cyber-physical was a recipient of the ERC Grant MORE (Advanced Mathematical Tools
systems, and network science. He was a recip- for Complex Network Engineering) from 2012 to 2017, the Mario Boella
ient of the NSF CAREER Award in 2013, the Award in 2005, the IEEE Glavieux Prize Award in 2011, and the Qualcomm
AFOSR Summer Faculty Fellowship in 2014, the Innovation Prize Award in 2012; 20 best paper awards, among which the
Young Investigator Award from the Office of Naval Research in 2015, 2007 IEEE GLOBECOM Best Paper Award, the Wi-Opt 2009 Best Paper
the 2015 Fred W. Ellersick Prize from the IEEE Communications Society, the Award, the 2010 Newcom++ Best Paper Award, the WUN CogCom Best
2017 IEEE ComSoc Best Young Professional in Academia Award, and the Paper 2012 and 2013 Award, the 2014 WCNC Best Paper Award, the
2018 IEEE ComSoc Radio Communications Committee Early Achievement 2015 ICC Best Paper Award, the 2015 IEEE Communications Society
Award. He was the author/coauthor of eight conference best paper awards Leonard G. Abraham Prize, the 2015 IEEE Communications Society Fred
at WiOpt in 2009, ICIMP in 2010, IEEE WCNC in 2012, IEEE PIMRC in W. Ellersick Prize, the 2016 IEEE Communications Society Best Tutorial
2015, IEEE SmartGridComm in 2015, EuCNC in 2017, IEEE GLOBECOM Paper Award, the 2016 European Wireless Best Paper Award, the 2017
in 2018, and IFIP NTMS 2019. From 2015 to 2017, he was named the Eurasip Best Paper Award, the 2018 IEEE Marconi Prize Paper Award, the
Stephen O. Lane Junior Faculty Fellow at Virginia Tech and, in 2017, he was 2019 IEEE Communications Society Young Author Best Paper Award, and
named the College of Engineering Faculty Fellow. He currently serves as an the Valuetools 2007, Valuetools 2008, CrownCom 2009, Valuetools 2012,
Editor for the IEEE T RANSACTIONS ON W IRELESS C OMMUNICATIONS, the SAM 2014, and 2017 IEEE Sweden VT-COM-IT Joint Chapter best student
IEEE T RANSACTIONS ON M OBILE C OMPUTING, the IEEE T RANSACTIONS paper awards. He is an Associate Editor-in-Chief of Random Matrix: Theory
ON C OGNITIVE C OMMUNICATIONS AND N ETWORKING , and the IEEE and Applications. He was an Associate Area Editor and a Senior Area Editor
T RANSACTIONS ON I NFORMATION F ORENSICS AND S ECURITY. He is the of the IEEE T RANSACTIONS ON S IGNAL P ROCESSING from 2011 to 2013
Editor-at-Large of the IEEE T RANSACTIONS ON C OMMUNICATIONS. He is and from 2013 to 2014, respectively. He is a WWRF Fellow and a Membre
an IEEE Distinguished Lecturer. émérite SEE.

