Towards Intelligent Vehicular Networks: A Machine Learning Framework
and storage devices onboard, these sensing technologies are transforming vehicles from a simple transportation facility to a powerful computing and networking hub with intelligent processing capabilities. They keep collecting, generating, storing, and communicating large volumes of data, subject to further processing and commonly referred to as mobile big data [20]–[22]. Such data provide rich context information regarding the vehicle kinetics (such as speed, acceleration, and direction), road conditions, traffic flow, wireless environments, etc., that can be exploited to improve network performance through adaptive data-driven decision making. However, traditional communications strategies are not designed to handle and exploit such information.

As a prevailing approach to AI, machine learning, in particular deep learning, has drawn considerable attention in recent years due to its astonishing progress in such areas as image classification [23], video game playing [24], and Go [25]. It helps build intelligent systems to operate in complicated environments and has found many successful applications in computer vision, natural language processing, and robotics [26], [27]. Machine learning develops efficient methods to model and analyze large volumes of data by finding patterns and underlying structures and represents an effective data-driven approach to problems encountered in various scientific fields where heterogeneous types of data are available for exploitation. As a result, machine learning provides a rich set of tools that can be leveraged to exploit the data generated and stored in vehicular networks [28], [29] and help the network make more informed and data-driven decisions. However, how to adapt and exploit such tools to account for the distinctive characteristics of high mobility vehicular networks and serve the purpose of reliable vehicular communications remains challenging and represents a promising research direction.

In this paper, we identify and discuss major challenges in supporting vehicular networks with high mobility, such as fast-varying wireless channels, volatile network topologies, ever-changing vehicle densities, and heterogeneous QoS requirements for diverse vehicular links. To address these challenges, we deviate from the traditional network design methodology and motivate the use of various machine learning tools, including a range of supervised, unsupervised, deep, and reinforcement learning methods, to exploit the rich sources of data in vehicular networks for the benefits of communications performance enhancement. In particular, we discuss in greater detail recent advances of leveraging machine learning to acquire and track the dynamics of vehicular environments, automatically make decisions regarding vehicular network traffic control, transmission scheduling and routing, and network security, and perform intelligent network resource management based on reinforcement learning techniques. Since research in this area is still in its infancy, a wide spectrum of interesting research problems are yet to be defined and fully explored. We list a few of them in this paper and hope to bring more attention to this emerging field.

The rest of this paper is organized as follows. In Section II, we introduce the unique characteristics and challenges of high mobility vehicular networks and motivate the use of machine learning to address the challenges. In Section III, we discuss the basic concepts and major categories of machine learning, and then investigate how to apply machine learning to learn the dynamics of high mobility networks in Section IV. In Section V, we present some preliminary examples of applying machine learning for data-driven decision making and wireless resource management problems in vehicular networks. In Section VI, we recognize and highlight several open issues that warrant further research and concluding remarks are finally made in Section VII.

II. CHALLENGES OF HIGH MOBILITY VEHICULAR NETWORKS

High mobility vehicular networks exhibit distinctive characteristics, which have posed significant challenges to wireless network design. In this section, we identify such challenges and then discuss the potential of leveraging machine learning to address them.

A. Strong Dynamics

High mobility of vehicles leads to strong dynamics and affects system design in multiple aspects of the communications network. Special channel propagation characteristics are among the most fundamental differentiating factors of high mobility networks compared with low mobility counterparts. For example, vehicular channels exhibit rapid temporal variation and also suffer from inherent non-stationarity of channel statistics due to their unique physical environment dynamics [30], [31]. Such rapid variations induce short channel coherence time and bring significant challenges in acquiring accurate channel estimates at the receiver in real time. This is further hindered by the non-stationarity of channel statistics, which are usually leveraged to improve estimation accuracy [32]–[34]. Meanwhile, due to the high Doppler spread caused by vehicle mobility, the multicarrier modulation scheme is more susceptible to intercarrier interference (ICI) in vehicular networks [35], [36] and hence brings difficulty to signal detection. Constant mobility of vehicles also causes frequent changes of the communications network topology, affecting channel allocation and routing protocol designs. For example, in cluster-based vehicular networks [37], moving vehicles may join and leave the cluster frequently, making it hard to maintain long-lasting connections within the formed cluster and thus warranting further analysis on cluster stability. Another source of dynamics in high mobility networks comes from the changing vehicle density, which varies dramatically depending on the locations (remote suburban or dense urban areas) and time (peak or off hours of the day). Flexible and robust resource management schemes that make efficient use of available resources while adapting to the vehicle density variation are thus needed.

Traditionally developed rigorous mathematical theories and methods for wireless networks are mostly based on static or low-mobility environment assumptions and usually not designed to treat the varying environment conditions in an effective way. Therefore, it is important to explore new methodologies that can interact with fast changing environments and obtain optimal policies for high mobility vehicular networks
in terms of both physical layer problems, such as channel estimation and signal detection and decoding, and upper layer designs, such as resource allocation, link scheduling and routing.

B. Heterogeneous and Stringent QoS Requirements

In high mobility vehicular networks, there exist different types of connections, which we broadly categorize into V2I and V2V links. The V2I links enable vehicles to communicate with the base station to support various traffic efficiency and information and entertainment (infotainment) services. They generally require frequent access to the Internet or remote servers for media streaming, high-definition (HD) map downloading, and social networking, which involve a considerable amount of data transfer and thus are more bandwidth intensive [3]. On the other hand, the V2V links are mainly considered for sharing safety-critical information, such as basic safety messages (BSM) in DSRC [7], among vehicles in close proximity in either a periodic or event triggered manner. Such safety related messages are strictly delay sensitive and require very high reliability. For example, the European METIS project requires the end-to-end latency to be less than 5 milliseconds and the transmission reliability to be higher than 99.999% for a safety packet of 1600 bytes [38]. Moreover, high bandwidth sensor data sharing among vehicles is currently being considered in 3GPP for V2X enhancement in future 5G cellular networks for advanced safety applications [12], whose quality degrades gracefully with increase in packet delay and loss. As a result, stringent QoS requirements of low latency and high reliability are in turn imposed on the V2V links. It is hard for traditional wireless design approaches to simultaneously meet such diverse and stringent QoS requirements of vehicular applications, and the task is further complicated by the strong dynamics in high mobility vehicular networks as discussed in Section II-A.

C. The Potential of Machine Learning

Machine learning emphasizes the ability to learn and adapt to the environment with changes and uncertainties. Different from the traditional schemes that rely on explicit system parameters, such as the received signal power or signal-to-interference-plus-noise ratio (SINR), for decision making in vehicular networks, machine learning can exploit multiple sources of data generated and stored in the network (e.g., power profiles, network topologies, vehicle behavior patterns, the vehicle locations/kinetics, etc.) to learn the dynamics in the environment and then extract appropriate features to use for the benefit of many tasks for communications purposes, such as signal detection, resource management, and routing. However, it is a non-trivial task to extract context or semantic information from a huge amount of accessible data, which might have been contaminated by noise, multi-modality, or redundancy, and thus information extraction and distillation need to be performed.

In particular, reinforcement learning [39], one of the machine learning tools, can interact with the dynamic environment and develop satisfactory policies to meet diverse QoS requirements of vehicular networks while adapting to the varying wireless environment. For example, in resource allocation problems, the optimal policies are first learned and then the vehicle agents accordingly take actions to adjust powers and allocate channels adaptive to the changing environments characterized by, e.g., link conditions, locally perceived interference, and vehicle kinetics, while traditional static mathematical models are not good at capturing and tracking such dynamic changes.

III. MACHINE LEARNING

Machine learning allows computers to find hidden insights through iteratively learning from data, without being explicitly programmed. It has revolutionized the world of computer science by allowing learning with large datasets, which enables machines to change, re-structure and optimize algorithms by themselves. Existing machine learning methods can be divided into three categories, namely, supervised learning, unsupervised learning, and reinforcement learning. Other learning schemes, such as semi-supervised learning, online learning, and transfer learning, can be viewed as variants of these three basic types. In general, machine learning involves two stages, i.e., training and testing. In the training stage, a model is learned based on the training data while in the testing stage, the trained model is applied to produce the prediction. In this section, we briefly introduce the basics of machine learning in the hope that the readers can appreciate their potential in solving traditionally challenging problems.

A. Supervised Learning

The majority of practical machine learning algorithms use supervised learning with a labeled dataset, where each training sample comes with a label. The ultimate goal of supervised learning is to find the mapping from the input feature space to the label so that reliable prediction can be made when new input data is given. Supervised learning problems can be further categorized into classification and regression, where the difference between the two tasks is that the labels are categorical for classification and numerical for regression. Classification algorithms learn to predict a category output for each incoming sample based on the training data. Some classic algorithms in this category include Bayesian classifiers [40], k-nearest neighbors (KNN) [41], decision trees [42], support vector machine (SVM) [43], and neural networks [44]. Instead of discrete outputs, regression algorithms predict a continuous value corresponding to each sample, such as estimating the house price given its associated feature inputs. Classic regression algorithms include logistic regression [45], support vector regression (SVR) [46], and Gaussian process for regression [40].

B. Unsupervised Learning

The label data serves as the teacher in supervised learning so that there is a clear measure of success that can be used to judge the goodness of the learned model in various situations. Nevertheless, a large amount of labeled data is often hard to
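To make the supervised classification setting of Section III-A concrete, the following is a minimal sketch of a k-nearest neighbors (KNN) classifier in plain Python. The feature choice (vehicle speed and distance to the base station), the labels, and the data values are hypothetical and serve only as an illustration; they are not taken from the surveyed works. Feature normalization is omitted for brevity.

```python
import math
from collections import Counter


def knn_predict(train_x, train_y, query, k=3):
    """Return the majority label among the k training samples nearest to query."""
    # Rank training samples by Euclidean distance to the query point.
    order = sorted(range(len(train_x)), key=lambda i: math.dist(train_x[i], query))
    # Majority vote over the labels of the k closest samples.
    votes = Counter(train_y[i] for i in order[:k])
    return votes.most_common(1)[0][0]


# Hypothetical labeled data: (speed in m/s, distance to base station in km) -> link quality.
train_x = [(5.0, 0.2), (8.0, 0.3), (6.0, 0.5), (30.0, 1.5), (28.0, 1.8), (33.0, 1.2)]
train_y = ["good", "good", "good", "poor", "poor", "poor"]

slow_vehicle = knn_predict(train_x, train_y, (7.0, 0.4), k=3)
fast_vehicle = knn_predict(train_x, train_y, (29.0, 1.6), k=3)
```

The training stage here amounts to storing the labeled samples, while the testing stage is the call to `knn_predict` on a new input, which mirrors the two-stage view described in Section III.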
networks may easily explode or vanish [26]. By virtue of developments in faster computation resources, new training methods (new activation functions [61], pre-training [62]), and new structures (batch norm [63], residual networks [64]), training a much deeper neural network becomes viable. Recently, deep learning has been widely used in computer vision [23], speech recognition [65], natural language processing [66], etc., and has greatly improved state-of-the-art performance in each area. In addition, different structures can be added to the deep neural networks for different applications. For example, convolutional networks share weights among spatial dimensions while recurrent neural networks (RNN) and long short term memory (LSTM) networks share weights among the temporal dimensions [26].

IV. LEARNING DYNAMICS

High mobility networks exhibit strong dynamics in many facets, e.g., wireless channels, network topologies, traffic dynamics, etc., that heavily influence the network performance. In this section, we discuss how to exploit machine learning to efficiently learn and robustly predict such dynamics based on data from a variety of sources. Table I summarizes these tasks along with the leveraged machine learning methods.

TABLE I
SUMMARY OF DYNAMICS LEARNING IN VEHICULAR NETWORKS.

A. Learning-Enabled Channel Estimation

Accurate and efficient channel estimation is a critical component in modern wireless communications systems. It has strong impacts on receiver design (e.g., channel equalization, demodulation, decoding, etc.) as well as radio resource allocation at the transmitter for interference mitigation and performance optimization. Channel estimation is more of an issue in vehicular networks with high Doppler shifts and short channel coherence periods.

Statistical information of wireless channels, such as time and frequency domain correlation, mainly depends on vehicle locations/speeds, multipath delay spread, and the surrounding wireless environment. In cellular based vehicular networks, the base station can easily access accurate location information (speed can also be inferred) of all vehicles traveling under its coverage from various global navigation satellite systems (GNSS) on vehicles. It can maintain a dynamic database to store the historical estimates of communications channels for all vehicular links along with relevant context information, such as locations of the transmitters and/or receivers and traffic patterns. Various machine learning tools, such as Bayesian learning and deep learning, can then be leveraged to exploit such historical data to predict the channel statistics and enhance instantaneous channel estimation for current vehicular links.

Compared with traditional channel estimation schemes relying on precise mathematical models, the learning-based method provides yet another data-driven approach that can easily incorporate various sources of relevant context information to enhance estimation accuracy. It can potentially deal with a number of non-ideal effects that are difficult to handle under the traditional estimation framework, such as nonlinearity of power amplifiers, phase noise, and time/frequency offsets. The channel estimator can be trained offline across different channel models for varying propagation environments and calibrated using real-world data collected from field measurements. During online deployment, the estimator module produces channel estimates on the fly with low computational complexity given necessary inputs, which include received pilot data and other relevant context information.

For example, a Bayesian learning approach has been adopted to estimate the sparse massive multiple-input multiple-output (MIMO) channel in [56], where the channel is modeled using a Gaussian mixture distribution and an efficient estimator has been derived based on approximate message passing (AMP) and expectation-maximization (EM) algorithms. Deep learning has been exploited in [57] to implicitly estimate wireless channels in orthogonal frequency division multiplexing (OFDM) systems and shown to be robust to nonlinear distortions and other impairments, such as pilot reduction and cyclic prefix (CP) removal. In addition, the temporal relationship in data is traditionally characterized by Bayesian models, such as the HMMs, which can be used to track time-varying vehicular channels. It is interesting to see if recently developed sophisticated models powered by deep neural networks, such as RNN and LSTM, can improve channel estimation accuracy by exploiting the long-range dependency.

B. Traffic Flow Prediction

Traffic flow prediction aims to infer traffic information from historical and real-time traffic data collected by various onboard and roadway sensors. It can be used in a variety of ITS applications, such as traffic congestion alleviation, fuel efficiency improvement, and carbon emission reduction. Given the rich amount of traffic data, machine learning can be leveraged to enhance the flow prediction performance and achieve unparalleled accuracy. In [58], a deep learning based method has been proposed to predict traffic flow, where a stacked autoencoder is exploited to learn generic features for traffic
flow and trained in a greedy layerwise fashion. It implicitly takes into consideration the spatial and temporal correlations in the modeling and achieves superior performance. A probabilistic graphical model, namely the Poisson dependency network (PDN), has been learned in [59] to describe an empirical vehicular traffic dataset and then used for traffic flow prediction. The strong correlations between cellular connectivity and vehicular traffic flow have been further leveraged to enhance prediction for both of them by means of Poisson regression trees.

C. Vehicle Trajectory Prediction

Vehicle trajectory prediction is of significant interest for advanced driver assistance systems (ADAS) in many tasks, such as collision avoidance and road hazard warning. It also plays an important role in networking protocol designs, such as handoff control, link scheduling, and routing, since network topology variations can be inferred from the predicted vehicle trajectories and exploited for communications performance enhancement. Probabilistic trajectory prediction based on Gaussian mixture models (GMM) and variational GMM has been studied in [60] to predict the vehicle’s trajectory using previously observed motion patterns. A motion model is learned based on previously observed trajectories, which is then used to build a functional mapping from the observed historical trajectories to the most likely future trajectory. The latent factors that affect the trajectories, such as drivers’ intention, traffic patterns, and road structures, may also be implicitly learned from the historical data using deep neural networks. More sophisticated models, such as RNN and LSTM, can potentially lead to better results for modeling the dynamics of vehicle trajectories and are worth further investigation.

V. LEARNING BASED DECISION MAKING IN VEHICULAR NETWORKS

The rich sources of data generated and stored in vehicular networks motivate a data-driven approach for decision making that is adaptive to network dynamics and robust to various impairments. Machine learning represents an effective tool to serve such purposes with proven good performance in a wide variety of applications, as demonstrated by some preliminary examples discussed in this section and summarized in Table II.

A. Location Prediction Based Scheduling and Routing

We have shown in Section IV that machine learning can be leveraged to learn the dynamics in high mobility vehicular networks, including vehicle trajectory prediction. In fact, the predicted dynamics can be further used towards networking protocol designs for system performance improvement. For example, the hidden Markov model (HMM) has been applied in [67] to predict vehicles’ future locations based on past mobility traces and movement patterns in a hybrid VANET with both V2I and V2V links. Based on the predicted vehicle trajectories, an effective routing scheme has been proposed to efficiently select relay nodes for message forwarding and enable seamless handoff between V2V and V2I communications. A variable-order Markov model has been adopted in [68] to extract vehicular mobility patterns from real trace data in an urban vehicular network environment, which is used to predict the possible trajectories of moving vehicles and develop efficient prediction-based soft routing protocols. In [69], a recursive least squares algorithm has been used for large-scale channel prediction based on location information of vehicles and to facilitate the development of a novel scheduling strategy for cooperative data dissemination in VANETs.

B. Network Congestion Control

Data traffic congestion is an important issue in vehicular networks, especially when the network conditions are highly dense in, e.g., busy intersections and crowded urban environments. In such cases, a large number of vehicles are vying for the available communication channels simultaneously and hence cause severe data collisions with increased packet loss and delay. To guarantee a reliable and timely delivery of various delay-sensitive safety-critical messages, such as BSMs, the vehicular networks need to have carefully designed congestion control strategies. Traditionally, there are five major categories of congestion control methods, namely rate-based, power-based, carrier-sense multiple access/collision avoidance based, prioritizing and scheduling-based, and hybrid strategies [70], which adjust communications parameters, such as transmission power, transmission rates, and contention window sizes, to meet the congestion control purposes.

Different from the traditional approaches, an effective machine learning based data congestion control strategy utilizing k-means clustering has been developed in [70] for congestion prone intersections. The proposed strategy relies on local road side units (RSUs) installed at each intersection for congestion detection, data processing, and congestion control to provide a centralized congestion management for all vehicles that are passing through or stopped at the intersection. After detection of congestion, each RSU collects all data transferred among vehicles in its coverage, removes their redundancy, exploits k-means algorithms to cluster the messages according to their features, such as sizes, validity, and types, and finally adjusts communications parameters for each cluster.

C. Load Balancing and Vertical Handoff Control

Due to periodicity of everyday traffic, potential patterns and regularities lie in the traffic flow and can be further exploited with learning based methods for load balancing and vertical handoff control in vehicular networks. An online reinforcement learning approach has been developed in [71] to address the user association problem with load-balancing in the dynamic environment. The initial association is achieved based on the current context information using reinforcement learning. After a period of learning, with the association information being collected at the base station, the new association results will be obtained directly and adaptively using historical association patterns. Besides user association, the reinforcement learning based approach has also been applied in [72] to the vertical handoff design for heterogeneous vehicular networks. The network connectivity can be determined by a fuzzy Q-learning approach with four types of information, namely, received
signal strength value, vehicle speed, data quantity, and the number of users associated with the targeted network. With the learning based strategy, users can be connected to the best network without prior knowledge on handoff behavior.

TABLE II
SUMMARY OF LEARNING BASED DECISION MAKING IN VEHICULAR NETWORKS.

D. Network Security

As intelligent vehicles become more connected and bring huge benefits to society, the improved connectivity can make vehicles more vulnerable to cyber-physical attacks. As a result, security of information sharing in vehicles is crucial since any faulty sensor measurements may cause accidents and injuries. In [73], an intrusion detection system has been proposed for vehicular networks based on deep neural networks, where the unsupervised deep belief networks are used to initialize the parameters as a preprocessing stage. Then, the deep neural networks are trained by high-dimensional packet data to figure out the underlying statistical properties of normal and hacking packets and extract the corresponding features. In addition, LSTM is used in [74] to detect attacks on connected vehicles. The LSTM based detector is able to recognize the synthesized anomalies with high accuracy by learning to predict the next word originating from each vehicle.

E. Intelligent Wireless Resource Management

The current mainstream approach to wireless resource management is to formulate the design objective and constraints as an optimization problem and then solve for a solution with certain optimality claims. However, in high mobility vehicular networks, such an approach is insufficient. The first challenge arises due to the strong dynamics in vehicular networks that lead to a brief valid period of the optimization results in addition to the incurred heavy signaling overhead. The second issue comes with the difficulty to formulate a satisfactory objective to simultaneously consider the vastly different goals of the heterogeneous vehicular links, which is further complicated by the fact that some of the QoS formulations are mathematically difficult if not intractable. Fortunately, reinforcement learning provides a promising solution to these challenges through interacting with the dynamic environment to maximize a numeric reward, which is discussed in detail in this part.

1) Virtual Resource Allocation: Employing recent advances in software-defined networking (SDN) and network function virtualization (NFV), the traditional vehicular network can be transformed into a virtualized network offering improved efficiency and greater flexibility in network management. Future intelligent vehicles and RSUs will be equipped with advanced sensing, computing, storage, and communication facilities, which can be further integrated into the virtualized vehicular network to provide a pool of resources for a variety of ITS applications. In such a complicated system, how to dynamically allocate the available resources to end users for QoS maximization with minimal overhead is a nontrivial task. A delay-optimal virtualized radio resource management problem in software-defined vehicular networks has been considered in [75], which is formulated as an infinite-horizon partially observed MDP. An online distributed learning algorithm has been proposed to address the problem based on an equivalent Bellman equation and stochastic approximation. The proposed scheme is divided into two stages, which adapt to large timescale factors, such as the traffic density, and small timescale factors, such as channel and queue states, respectively. In [76], the resource allocation problem in vehicular clouds has been modeled as an MDP and reinforcement learning is leveraged to solve the problem such that the resources are dynamically provisioned to maximize long-term benefits for the network and avoid myopic decision making. Joint management of networking, caching, and computing resources in virtualized vehicular networks has been further considered in [77], where a novel deep reinforcement learning approach has been proposed to deal with the highly complex joint resource optimization problem and shown to achieve good performance in terms of total revenues for the virtual network operators.

2) Energy-Efficient Resource Management: Energy consumption should be taken into consideration, especially when RSUs in vehicular networks lack permanent grid-power connection. In [78], an MDP problem is formulated and solved using reinforcement learning techniques to optimize the RSUs’ downlink scheduling performance during a discharge period. The RSUs learn to select a vehicle to serve at the beginning of each time slot based on the collected information about traffic characteristics, infrastructure power budget, and the total
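The reinforcement learning formulations surveyed in Section V-E share a common core: an agent interacts with an MDP and updates its action values from the observed rewards. The following is a minimal sketch of tabular Q-learning on a toy problem loosely inspired by an energy-limited RSU. The environment is entirely hypothetical and is not the model of [75]–[78]: the state is (remaining energy units, current channel quality), the actions are to wait or to transmit, and the rewards, the channel probability, and the hyperparameters are arbitrary illustrative values.

```python
import random

GOOD, BAD = "good", "bad"
WAIT, TRANSMIT = 0, 1


def step(energy, channel, action, rng):
    """Hypothetical environment: return (reward, next_energy, next_channel)."""
    reward = 0.0
    if action == TRANSMIT:
        # Transmitting spends one energy unit; the payoff depends on the channel.
        reward = 1.0 if channel == GOOD else 0.2
        energy -= 1
    next_channel = GOOD if rng.random() < 0.5 else BAD
    return reward, energy, next_channel


def q_learning(episodes=5000, max_energy=3, alpha=0.1, gamma=0.9, epsilon=0.2, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration."""
    rng = random.Random(seed)
    q = {(e, c): [0.0, 0.0] for e in range(1, max_energy + 1) for c in (GOOD, BAD)}
    for _ in range(episodes):
        energy, channel = max_energy, rng.choice((GOOD, BAD))
        for _ in range(100):  # cap on episode length
            if energy == 0:  # battery depleted: episode ends
                break
            state = (energy, channel)
            if rng.random() < epsilon:
                action = rng.choice((WAIT, TRANSMIT))
            else:
                action = max((WAIT, TRANSMIT), key=lambda a: q[state][a])
            reward, energy, channel = step(energy, channel, action, rng)
            # Bootstrapped target: immediate reward plus discounted best next value.
            future = max(q[(energy, channel)]) if energy > 0 else 0.0
            q[state][action] += alpha * (reward + gamma * future - q[state][action])
    return q


q_table = q_learning()
policy = {s: ("transmit" if v[TRANSMIT] > v[WAIT] else "wait") for s, v in q_table.items()}
```

With these toy numbers the learned policy tends to transmit in good slots and to wait in bad ones, which illustrates how maximizing a long-term discounted reward avoids myopic decisions. Practical problems of the kind discussed above have far larger state spaces, which is where function approximation such as deep reinforcement learning comes in.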
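The k-means based congestion control strategy described in Section V-B (collect messages, remove redundancy, cluster by features, adjust parameters per cluster) can be sketched as follows. This is a minimal illustration under stated assumptions, not the algorithm of [70]: the two message features (size in kB, remaining validity in s), the deterministic centroid initialization, and the contention window rule are all hypothetical.

```python
import math


def kmeans(points, k, iterations=20):
    """Plain Lloyd's algorithm; centroids start at the first k points (deterministic)."""
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [min(range(k), key=lambda c: math.dist(p, centroids[c])) for p in points]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(dim) / len(members) for dim in zip(*members)]
    return labels, centroids


# Hypothetical messages collected by an RSU: (size in kB, remaining validity in s).
messages = [(0.2, 0.1), (1.5, 1.0), (0.3, 0.1), (1.4, 0.9),
            (0.2, 0.1), (0.25, 0.15), (1.6, 1.1)]

# Step 1: remove redundant (duplicate) messages while keeping arrival order.
unique = list(dict.fromkeys(messages))

# Step 2: cluster the remaining messages by their features.
k = 2
labels, centroids = kmeans(unique, k)

# Step 3 (hypothetical rule): clusters whose messages expire sooner
# receive a smaller contention window, i.e., higher channel access priority.
order = sorted(range(k), key=lambda c: centroids[c][1])
contention_window = {c: 15 * (2 ** rank) for rank, c in enumerate(order)}
```

In this toy run the short-lived small messages and the long-lived large messages end up in separate clusters, and each cluster receives its own contention window.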
A. Method Complexity

Unlike traditional machine learning techniques that require much effort on feature design, deep neural networks provide better performance by learning the features directly from raw data. Hence, information can be distilled more efficiently in deep neural networks than in the traditional methods. It has been shown by experimental results that the deep hierarchical structure is necessary. Recently, in order to enhance the representation ability of the model, more advanced structures and technologies have been devised, such as the LSTM as briefly discussed in Section III-D. Moreover, with high-performance computing facilities, such as graphics processing units (GPUs), deep networks can be efficiently trained with massive amounts of data through advanced training techniques, such as batch norm [63] and residual networks [64]. However, computation resources aboard vehicles are rather limited and because of the stringent end-to-end latency constraints in vehicular networks, the use of powerful servers housed remotely for computation would also be confined. As a result, special treatments, such as model reduction or compression, should be carefully developed to alleviate the computation resource limitation without incurring much performance degradation.

B. Distributed Learning and Multi-Agent Cooperation

Different from most existing machine learning applications that assume easy availability of data, in vehicular networks the data is generated and stored across different units in the network, e.g., vehicles, RSUs, remote clouds, etc. As a consequence, distributed learning algorithms are desired such that they can act on partially observed data and meanwhile have the ability to exploit information obtained from other entities in the network. Such scenarios can be technically modeled as a multi-agent system, where cooperation and coordination among participating agents play important roles

C. Security Issues

Machine learning has been shown to be helpful in confronting cyber-physical attacks, which threaten the safety of vehicular networks, as discussed in Section V-D. Ironically, it also raises tremendous potential challenges and risks by itself since the machine learning based system can produce harmful or unexpected results [85]. For instance, the convolutional neural networks can be easily fooled by maliciously designed noisy images [86] while the agents in reinforcement learning may find undesirable ways to enhance the reward delivered by their interacting environment [87]. As a result, even though machine learning has achieved remarkable improvement in many areas, significant efforts shall be made to improve the robustness and security of machine learning methods before they come to the safety-sensitive areas, such as vehicular networks, where minor errors may lead to disastrous consequences.

D. Learning for Millimeter Wave Vehicular Networks

The millimeter wave (mmWave) band is an attractive option to support high data rate communications for advanced safety and infotainment services in future vehicular networks with the availability of orders-of-magnitude larger bandwidth [88], [89]. The small wavelength of mmWave bands makes it possible to pack massive antenna elements in a small form factor to direct sharp beams to compensate for the significantly higher power attenuation of mmWave propagation. Over the past years, significant research efforts have been dedicated to addressing a wide range of problems in mmWave communications, including mmWave channel modeling, hybrid analog and digital precoding/combining, channel estimation, beam training, and codebook designs [90].

A distinctive challenge of mmWave vehicular communications is the large overhead to train and point narrow beams to the right direction due to the constant motion of vehicles. Besides, the mmWave transmission is susceptible to
blockage and therefore fast and efficient beam tracking and switching schemes
are critical in establishing and maintaining reliable mmWave links [91].
Machine learning tools can be effective in addressing such challenges, through
exploiting historical beam training results [92], situational awareness [93],
and other context information of the communications environment. Mapping
functions from the context information (features), such as environment
geometry, network status, and user locations, to the beam training results can
be learned using deep neural networks or other regression algorithms. It
remains to study the proper resolution levels for encoding/representing the
context information to strike a balance between performance and computational
complexity. Moreover, it would be particularly interesting to see if more
sophisticated machine learning models, such as RNNs and LSTMs that exploit
temporal correlations, can achieve better performance in predicting mmWave
beamforming directions in rapidly changing vehicular environments.

VII. CONCLUSION
In this article, we have investigated the possibility of applying machine
learning to address problems in high mobility vehicular networks. Strong
dynamics exhibited by such types of networks and the demanding QoS
requirements challenge the state-of-the-art communications technologies.
Machine learning is believed to be a promising solution to this challenge due
to its remarkable performance in various AI related areas. We have briefly
introduced the basics of machine learning and then provided some examples of
using such tools to learn the dynamics and perform intelligent decision making
in vehicular networks. We have further highlighted some open issues and
pointed out areas that require more attention.

REFERENCES
[1] L. Liang, H. Peng, G. Y. Li, and X. Shen, “Vehicular communications: A physical layer perspective,” IEEE Trans. Veh. Technol., vol. 66, no. 12, pp. 10647–10659, Dec. 2017.
[2] H. Peng, L. Liang, X. Shen, and G. Y. Li, “Vehicular communications: A network layer perspective,” submitted to IEEE Trans. Veh. Technol., 2017. [Online]. Available: https://arxiv.org/abs/1707.09972.
[3] G. Araniti, C. Campolo, M. Condoluci, A. Iera, and A. Molinaro, “LTE for vehicular networking: A survey,” IEEE Commun. Mag., vol. 51, no. 5, pp. 148–157, May 2013.
[4] X. Cheng, C. Chen, W. Zhang, and Y. Yang, “5G-enabled cooperative intelligent vehicular (5GenCIV) framework: When Benz meets Marconi,” IEEE Intell. Syst., vol. 32, no. 3, pp. 53–59, May 2017.
[5] X. Cheng, L. Yang, and X. Shen, “D2D for intelligent transportation systems: A feasibility study,” IEEE Trans. Intell. Transp. Syst., vol. 16, no. 4, pp. 1784–1793, Aug. 2015.
[6] R. Zhang, X. Cheng, L. Yang, X. Shen, and B. Jiao, “A novel centralized TDMA-based scheduling protocol for vehicular networks,” IEEE Trans. Intell. Transp. Syst., vol. 16, no. 1, pp. 411–416, Feb. 2015.
[7] J. B. Kenney, “Dedicated short-range communications (DSRC) standards in the United States,” Proc. IEEE, vol. 99, no. 7, pp. 1162–1182, Jul. 2011.
[8] Intelligent Transport Systems (ITS); Cooperative ITS (C-ITS); Release 1, ETSI TR 101 607 V1.1.1, May 2013. [Online]. Available: http://www.etsi.org/deliver/etsi_tr/101600_101699/101607/01.01.01_60/tr_101607v010101p.pdf.
[9] IEEE Standard for Information Technology–Telecommunications and information exchange between systems–Local and metropolitan area networks–Specific requirements–Part 11: Wireless LAN Medium Access Control (MAC) and Physical Layer (PHY) specifications Amendment 6: Wireless Access in Vehicular Environments, IEEE Std. 802.11p-2010, Jul. 2010.
[10] M. I. Hassan, H. L. Vu, and T. Sakurai, “Performance analysis of the IEEE 802.11 MAC protocol for DSRC safety applications,” IEEE Trans. Veh. Technol., vol. 60, no. 8, pp. 3882–3896, Oct. 2011.
[11] 3rd Generation Partnership Project; Technical Specification Group Services and System Aspects; Study on LTE support for Vehicle to Everything (V2X) services (Release 14), 3GPP TR 22.885 V14.0.0, Dec. 2015.
[12] 3rd Generation Partnership Project; Technical Specification Group Services and System Aspects; Study on enhancement of 3GPP Support for 5G V2X Services (Release 15), 3GPP TR 22.886 V15.1.0, Mar. 2017.
[13] L. Liang, G. Y. Li, and W. Xu, “Resource allocation for D2D-enabled vehicular communications,” IEEE Trans. Commun., vol. 65, no. 7, pp. 3186–3197, Jul. 2017.
[14] L. Liang, J. Kim, S. C. Jha, K. Sivanesan, and G. Y. Li, “Spectrum and power allocation for vehicular communications with delayed CSI feedback,” IEEE Wireless Commun. Lett., vol. 6, no. 4, pp. 458–461, Aug. 2017.
[15] W. Sun, E. G. Ström, F. Brännström, K. Sou, and Y. Sui, “Radio resource management for D2D-based V2V communication,” IEEE Trans. Veh. Technol., vol. 65, no. 8, pp. 6636–6650, Aug. 2016.
[16] W. Sun, D. Yuan, E. G. Ström, and F. Brännström, “Cluster-based radio resource management for D2D-supported safety-critical V2X communications,” IEEE Trans. Wireless Commun., vol. 15, no. 4, pp. 2756–2769, Apr. 2016.
[17] M. Botsov, M. Klügel, W. Kellerer, and P. Fertl, “Location dependent resource allocation for mobile device-to-device communications,” in Proc. IEEE WCNC, Apr. 2014, pp. 1679–1684.
[18] R. Zhang, X. Cheng, Q. Yao, C.-X. Wang, Y. Yang, and B. Jiao, “Interference graph based resource sharing schemes for vehicular networks,” IEEE Trans. Veh. Technol., vol. 62, no. 8, pp. 4028–4039, Oct. 2013.
[19] L. Liang, S. Xie, G. Y. Li, Z. Ding, and X. Yu, “Graph-based resource sharing in vehicular communication,” IEEE Trans. Wireless Commun., vol. 17, no. 7, pp. 4579–4592, Jul. 2018.
[20] X. Cheng, L. Fang, X. Hong, and L. Yang, “Exploiting mobile big data: Sources, features, and applications,” IEEE Netw., vol. 31, no. 1, pp. 72–79, Jan. 2017.
[21] X. Cheng, L. Fang, L. Yang, and S. Cui, “Mobile big data: The fuel for data-driven wireless,” IEEE Internet Things J., vol. 4, no. 5, pp. 1489–1516, Oct. 2017.
[22] W. Xu, H. Zhou, N. Cheng, F. Lyu, W. Shi, J. Chen, and X. Shen, “Internet of vehicles in big data era,” IEEE/CAA J. Autom. Sinica, vol. 5, no. 1, pp. 19–35, Jan. 2018.
[23] A. Krizhevsky, I. Sutskever, and G. E. Hinton, “Imagenet classification with deep convolutional neural networks,” in Proc. NIPS, Dec. 2012, pp. 1097–1105.
[24] V. Mnih, K. Kavukcuoglu, D. Silver, A. A. Rusu, J. Veness, M. G. Bellemare, A. Graves, M. Riedmiller, A. K. Fidjeland, G. Ostrovski et al., “Human-level control through deep reinforcement learning,” Nature, vol. 518, no. 7540, pp. 529–533, Feb. 2015.
[25] D. Silver, A. Huang, C. J. Maddison, A. Guez, L. Sifre, G. Van Den Driessche, J. Schrittwieser, I. Antonoglou, V. Panneershelvam, M. Lanctot et al., “Mastering the game of go with deep neural networks and tree search,” Nature, vol. 529, no. 7587, pp. 484–489, Jan. 2016.
[26] J. Schmidhuber, “Deep learning in neural networks: An overview,” Neural Netw., vol. 61, pp. 85–117, Jan. 2015.
[27] E. Alpaydin, Introduction to Machine Learning. MIT press, 2014.
[28] C. Jiang, H. Zhang, Y. Ren, Z. Han, K.-C. Chen, and L. Hanzo, “Machine learning paradigms for next-generation wireless networks,” IEEE Wireless Commun., vol. 24, no. 2, pp. 98–105, Apr. 2017.
[29] H. Ye, L. Liang, G. Y. Li, J. Kim, L. Lu, and M. Wu, “Machine learning for vehicular networks,” IEEE Veh. Technol. Mag., vol. 13, no. 2, Jun. 2018.
[30] W. Viriyasitavat, M. Boban, H. M. Tsai, and A. Vasilakos, “Vehicular communications: Survey and challenges of channel and propagation models,” IEEE Veh. Technol. Mag., vol. 10, no. 2, pp. 55–66, Jun. 2015.
[31] L. Bernadó, T. Zemen, F. Tufvesson, A. F. Molisch, and C. F. Mecklenbräuker, “Delay and Doppler spreads of nonstationary vehicular channels for safety-relevant scenarios,” IEEE Trans. Veh. Technol., vol. 63, no. 1, pp. 82–93, Jan. 2014.
[32] Y. Li, L. J. Cimini, and N. R. Sollenberger, “Robust channel estimation for OFDM systems with rapid dispersive fading channels,” IEEE Trans. Commun., vol. 46, no. 7, pp. 902–915, Jul. 1998.
[33] Y. Li, “Pilot-symbol-aided channel estimation for OFDM in wireless systems,” IEEE Trans. Veh. Technol., vol. 49, no. 4, pp. 1207–1215, Jul. 2000.
[34] W. Ding, F. Yang, W. Dai, and J. Song, “Time-frequency joint sparse channel estimation for MIMO-OFDM systems,” IEEE Commun. Lett., vol. 19, no. 1, pp. 58–61, Jan. 2015.
[35] M. Russell and G. L. Stüber, “Interchannel interference analysis of OFDM in a mobile environment,” in Proc. IEEE VTC, Jul. 1995, pp. 820–824.
[36] Y. Li and L. J. Cimini, “Bounds on the interchannel interference of OFDM in time-varying impairments,” IEEE Trans. Commun., vol. 49, no. 3, pp. 401–404, Mar. 2001.
[37] K. Abboud and W. Zhuang, “Stochastic analysis of a single-hop communication link in vehicular Ad Hoc networks,” IEEE Trans. Intell. Transp. Syst., vol. 15, no. 5, pp. 2297–2307, Oct. 2014.
[38] Scenarios, requirements and KPIs for 5G mobile and wireless system, METIS ICT-317669-METIS/D1.1, METIS deliverable D1.1, Apr. 2013. [Online]. Available: https://www.metis2020.com/documents/deliverables/.
[39] R. S. Sutton and A. G. Barto, Introduction to reinforcement learning. MIT press Cambridge, 1998, vol. 135.
[40] G. E. Box and G. C. Tiao, Bayesian inference in statistical analysis. John Wiley & Sons, 2011, vol. 40.
[41] K. Beyer, J. Goldstein, R. Ramakrishnan, and U. Shaft, “When is nearest neighbor meaningful?” in Proc. Int. Conf. Database Theory, Jan. 1999, pp. 217–235.
[42] S. R. Safavian and D. Landgrebe, “A survey of decision tree classifier methodology,” IEEE Trans. Syst., Man, Cybern., Syst., vol. 21, no. 3, pp. 660–674, May/Jun. 1991.
[43] C. Cortes and V. Vapnik, “Support-vector networks,” Machine learning, vol. 20, no. 3, pp. 273–297, Sep. 1995.
[44] Y. LeCun, Y. Bengio, and G. Hinton, “Deep learning,” Nature, vol. 521, no. 7553, pp. 436–444, May 2015.
[45] S. H. Walker and D. B. Duncan, “Estimation of the probability of an event as a function of several independent variables,” Biometrika, vol. 54, no. 1-2, pp. 167–179, Jun. 1967.
[46] D. Basak, S. Pal, and D. C. Patranabis, “Support vector regression,” Neural Inf. Process. Lett. Rev., vol. 11, no. 10, pp. 203–224, Oct. 2007.
[47] T. Kanungo, D. M. Mount, N. S. Netanyahu, C. D. Piatko, R. Silverman, and A. Y. Wu, “An efficient k-means clustering algorithm: Analysis and implementation,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 24, no. 7, pp. 881–892, Jul. 2002.
[48] G. Gan, C. Ma, and J. Wu, Data Clustering: Theory, Algorithms, and Applications. SIAM, 2007, vol. 20.
[49] A. Y. Ng, M. I. Jordan, and Y. Weiss, “On spectral clustering: Analysis and an algorithm,” in Proc. NIPS, 2002, pp. 849–856.
[50] Y. W. Teh, “Dirichlet process,” in Encyclopedia of Machine Learning. Springer, 2011, pp. 280–287.
[51] J. H. Friedman, “On bias, variance, 0/1–loss, and the curse-of-dimensionality,” Data Mining Knowl. Disc., vol. 1, no. 1, pp. 55–77, Mar. 1997.
[52] I. T. Jolliffe, “Principal component analysis and factor analysis,” in Principal Component Analysis. Springer, 1986, pp. 115–128.
[53] S. T. Roweis and L. K. Saul, “Nonlinear dimensionality reduction by locally linear embedding,” Science, vol. 290, no. 5500, pp. 2323–2326, Dec. 2000.
[54] J. B. Tenenbaum, V. De Silva, and J. C. Langford, “A global geometric framework for nonlinear dimensionality reduction,” Science, vol. 290, no. 5500, pp. 2319–2323, Dec. 2000.
[55] C. J. C. H. Watkins, “Learning from delayed rewards,” Ph.D. dissertation, King’s College, Cambridge, 1989.
[56] C.-K. Wen, S. Jin, K.-K. Wong, J.-C. Chen, and P. Ting, “Channel estimation for massive MIMO using Gaussian-mixture Bayesian learning,” IEEE Trans. Wireless Commun., vol. 14, no. 3, pp. 1356–1368, Mar. 2015.
[57] H. Ye, G. Y. Li, and B.-H. Juang, “Power of deep learning for channel estimation and signal detection in OFDM systems,” IEEE Wireless Commun. Lett., vol. 7, no. 1, pp. 114–117, Feb. 2018.
[58] Y. Lv, Y. Duan, W. Kang, Z. Li, and F.-Y. Wang, “Traffic flow prediction with big data: A deep learning approach,” IEEE Trans. Intell. Transp. Syst., vol. 16, no. 2, pp. 865–873, Apr. 2015.
[59] C. Ide, F. Hadiji, L. Habel, A. Molina, T. Zaksek, M. Schreckenberg, K. Kersting, and C. Wietfeld, “LTE connectivity and vehicular traffic prediction based on machine learning approaches,” in Proc. IEEE VTC Fall, Sep. 2015, pp. 1–5.
[60] J. Wiest, M. Höffken, U. Kreßel, and K. Dietmayer, “Probabilistic trajectory prediction with Gaussian mixture models,” in Proc. Intell. Veh. Symp., Jun. 2012, pp. 141–146.
[61] X. Glorot, A. Bordes, and Y. Bengio, “Deep sparse rectifier neural networks,” in Proc. Int. Conf. Artif. Intell. Statist., Jun. 2011, pp. 315–323.
[62] D. Erhan, Y. Bengio, A. Courville, P.-A. Manzagol, P. Vincent, and S. Bengio, “Why does unsupervised pre-training help deep learning?” J. Machine Learning Research, vol. 11, pp. 625–660, Feb. 2010.
[63] S. Ioffe and C. Szegedy, “Batch normalization: Accelerating deep network training by reducing internal covariate shift,” arXiv preprint arXiv:1502.03167, 2015.
[64] K. He, X. Zhang, S. Ren, and J. Sun, “Deep residual learning for image recognition,” in Proc. IEEE CVPR, Jun. 2016, pp. 770–778.
[65] C. Weng, D. Yu, S. Watanabe, and B.-H. F. Juang, “Recurrent deep neural networks for robust speech recognition,” in Proc. IEEE ICASSP, May 2014, pp. 5532–5536.
[66] K. Cho, B. Van Merriënboer, C. Gulcehre, D. Bahdanau, F. Bougares, H. Schwenk, and Y. Bengio, “Learning phrase representations using rnn encoder-decoder for statistical machine translation,” arXiv preprint arXiv:1406.1078, 2014.
[67] L. Yao, J. Wang, X. Wang, A. Chen, and Y. Wang, “V2X routing in a VANET Based on the hidden Markov model,” IEEE Trans. Intell. Transp. Syst., vol. 19, no. 3, pp. 889–899, Mar. 2017.
[68] G. Xue, Y. Luo, J. Yu, and M. Li, “A novel vehicular location prediction based on mobility patterns for routing in urban VANET,” EURASIP J. Wireless Commun. Netw., vol. 2012, no. 1, pp. 222–235, Jul. 2012.
[69] F. Zeng, R. Zhang, X. Cheng, and L. Yang, “Channel prediction based scheduling for data dissemination in VANETs,” IEEE Commun. Lett., vol. 21, no. 6, pp. 1409–1412, Jun. 2017.
[70] N. Taherkhani and S. Pierre, “Centralized and localized data congestion control strategy for vehicular ad hoc networks using a machine learning clustering algorithm,” IEEE Trans. Intell. Transp. Syst., vol. 17, no. 11, pp. 3275–3285, Nov. 2016.
[71] Z. Li, C. Wang, and C.-J. Jiang, “User association for load balancing in vehicular networks: An online reinforcement learning approach,” IEEE Trans. Intell. Transp. Syst., vol. 18, no. 8, pp. 2217–2228, Aug. 2017.
[72] Y. Xu, L. Li, B.-H. Soong, and C. Li, “Fuzzy Q-learning based vertical handoff control for vehicular heterogeneous wireless network,” in Proc. IEEE ICC, Jun. 2014, pp. 5653–5658.
[73] M.-J. Kang and J.-W. Kang, “A novel intrusion detection method using deep neural network for in-vehicle network security,” in Proc. IEEE VTC Fall, May 2016, pp. 1–5.
[74] A. Taylor, S. Leblanc, and N. Japkowicz, “Anomaly detection in automobile control network data with long short-term memory networks,” in Proc. IEEE Int. Conf. Data Sci. Adv. Anal. (DSAA), Oct. 2016, pp. 130–139.
[75] Q. Zheng, K. Zheng, H. Zhang, and V. C. Leung, “Delay-optimal virtualized radio resource scheduling in software-defined vehicular networks via stochastic learning,” IEEE Trans. Veh. Technol., vol. 65, no. 10, pp. 7857–7867, Oct. 2016.
[76] M. A. Salahuddin, A. Al-Fuqaha, and M. Guizani, “Reinforcement learning for resource provisioning in the vehicular cloud,” IEEE Wireless Commun., vol. 23, no. 4, pp. 128–135, Jun. 2016.
[77] Y. He, N. Zhao, and H. Yin, “Integrated networking, caching and computing for connected vehicles: A deep reinforcement learning approach,” IEEE Trans. Veh. Technol., vol. 67, no. 1, pp. 44–55, Jan. 2018.
[78] R. Atallah, C. Assi, and J. Y. Yu, “A reinforcement learning technique for optimizing downlink scheduling in an energy-limited vehicular network,” IEEE Trans. Veh. Technol., vol. 66, no. 6, pp. 4592–4601, Jun. 2017.
[79] R. Atallah, C. Assi, and M. Khabbaz, “Deep reinforcement learning-based scheduling for roadside communication networks,” in Proc. IEEE WiOpt, May 2017, pp. 1–8.
[80] H. Ye, G. Y. Li, and B.-H. Juang, “Deep reinforcement learning for resource allocation in V2V communications,” in Proc. IEEE ICC, May 2018, pp. 1–5.
[81] M. I. Ashraf, M. Bennis, C. Perfecto, and W. Saad, “Dynamic proximity-aware resource allocation in vehicle-to-vehicle (V2V) communications,” in Proc. IEEE GLOBECOM Workshops, Dec. 2016, pp. 1–6.
[82] J. Foerster, I. A. Assael, N. de Freitas, and S. Whiteson, “Learning to communicate with deep multi-agent reinforcement learning,” in Advances in Neural Information Processing Systems (NIPS), 2016, pp. 2137–2145.
[83] J. Foerster, N. Nardelli, G. Farquhar, T. Afouras, P. H. S. Torr, P. Kohli, and S. Whiteson, “Stabilising experience replay for deep multi-agent reinforcement learning,” in Proc. Int. Conf. Mach. Learning (ICML), 2017, pp. 1146–1155.