Intelligent Transportation and Control Systems Using Data Mining and Machine Learning Techniques A Comprehensive Study

Received March 4, 2019, accepted March 23, 2019, date of publication April 3, 2019, date of current version April 25, 2019.
Digital Object Identifier 10.1109/ACCESS.2019.2909114

Intelligent Transportation and Control Systems Using Data Mining and Machine Learning Techniques: A Comprehensive Study
NAWAF O. ALSREHIN, AHMAD F. KLAIB, AND AWS MAGABLEH
Computer Information Systems Department, Faculty of Information Technology and Computer Science, Yarmouk University, Irbid 21136, Jordan
Corresponding author: Nawaf O. Alsrehin ([email protected])
This work was supported by the Faculty of Scientific Research and Graduate Studies, Yarmouk University, Jordan, under Grant 10/2018.

ABSTRACT Traffic congestion is becoming a global issue. This study aims to explore and review the data mining and machine learning technologies adopted in research and industry to overcome the direct and indirect effects of traffic on humanity and societies. The study's methodology is to comprehensively review around 165 studies, critique them, and categorize them into chronological and understandable categories. The study focuses only on traffic management approaches that depend on data mining and machine learning technologies to detect and predict traffic. The study found that there is no standard traffic management approach that the traffic management community has agreed on. This study is important to traffic research communities, traffic software companies, and government traffic officials. It has a direct impact on drawing a clear path for new traffic management propositions. With respect to the number of reviewed articles focused on data mining and machine learning, this study is one of the largest. Additionally, this study will draw general attention to a new traffic management proposition approach.

INDEX TERMS Artificial intelligence, data mining, intelligent transportation, machine learning.

The associate editor coordinating the review of this manuscript and approving it for publication was Mehul S. Raval.

2169-3536 © 2019 IEEE. Translations and content mining are permitted for academic research only. Personal use is also permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information. VOLUME 7, 2019

I. INTRODUCTION
The capacity of roads and transportation systems has not evolved in a way that copes efficiently with the increasing number of vehicles and the growth of population. As a result, traffic jams and road congestion have increased. TomTom reported that commuters in 2014 spent on average 66 more hours stuck in traffic than they did in 2013, and that a trip that might take 60 minutes in non-congested traffic will take 57 minutes longer during rush hour [1]. Since expansion of the existing road network is limited, it is essential to develop technologies that make road infrastructure well organized and allow traffic to run smoothly. Traffic congestion also has indirect, often overlooked effects such as noise, pollution, and increased travel time. INRIX reported that the economic loss in the U.S. because of traffic congestion was estimated at $121 billion in 2011 and is expected to increase to $199 billion in 2020 [2]. With all these concerns in mind, it is essential to find solutions that overcome them and manage traffic. Predicting traffic and dealing with it has received great attention and has become a vital issue in big and smart cities.

City municipalities, governments, companies, and researchers have proposed many solutions to the traffic jam problem. These include adaptive traffic signals, vehicle-to-infrastructure smart corridors, autonomous vehicle technology, real-time traffic feedback, pedestrian traffic tracking, car sharing, and multi-modal solutions. Most of these solutions are based on the concepts of the Internet of Things (IoT), Wireless Sensor Networks (WSN), and Data Analytics (DA). Other partial solutions have been offered, including: 1) construction of new roads, bridges, tunnels, flyovers, and bypass roads, and 2) creating rings and performing road rehabilitation.

Traffic congestion refers to an excess of vehicles on a portion of roadway at a specific time, resulting in slower speeds and longer trip times, and it is a major challenge in traffic management and transportation planning [3]. It cannot be solved completely, but it can be mitigated to some extent. Informing road users in advance about the road status helps reduce the chance of traffic congestion occurring and allows road users to make better decisions
during their journey. This information includes quantifiable measures of traffic congestion, which can be obtained by estimating traffic parameters such as travel time and traffic density. Measuring these parameters in the field is very difficult [3]. To the traveller, traffic congestion means lost time, missed opportunities, and frustration; to the employer, it means lost worker productivity, lost trade opportunities, delivery delays, and increased cost. Reducing traffic congestion provides safer transit for people, reduces the number of accidents, reduces fuel consumption, helps control air pollution, reduces waiting time, smooths the motion of cars along transportation routes, and helps provide the data required for future road planning and analysis. Traffic congestion has different causes, such as insufficient capacity, unrestrained demand, long red-light delays, and obstacles in the road such as accidents, random vehicle stops, double parking, road work, and road narrowing.

Traffic generates huge amounts of data that are collected from different types of devices, such as intelligent cameras and sensors. Collecting these data is therefore not the issue; the challenge is how to store, handle, process, analyze, and manage the growing amounts of traffic data in order to make good use of them. The above-mentioned approaches mainly focus on analyzing huge amounts of traffic data to extract certain aspects of traffic, including but not limited to traffic speed, traffic volume, vehicle arrival rate, and average waiting time.

Data mining is the process of analyzing, predicting, and discovering interesting knowledge and hidden patterns from large amounts of data stored in repositories such as databases and data warehouses [4]. This process includes statistical models, mathematical algorithms, and machine learning methods [4]. Using data mining technology in traffic management provides powerful analysis and processing of massive traffic data and directs drivers and systems to make better decisions. Knowledge mining and discovery is an emerging area in traffic management systems that focuses on analyzing large amounts of traffic data for use in traffic control, route guidance, or route programming.

Despite the progress in various aspects of intelligent transportation and traffic management, only a limited number of surveys review the growing body of literature on the data mining and artificial intelligence algorithms and techniques adopted in these areas. Among those, data mining methods and clustering techniques have been reviewed in [5], while Zhang et al. [6] reviewed the data-driven approaches used in Intelligent Transportation Systems (ITS). In this paper, we investigate the usage of data mining and artificial intelligence techniques in managing traffic systems and put forward a hierarchical architecture that summarizes and classifies these mining techniques. In this survey paper, we do not follow the typical one-by-one review; instead, we provide an issue-based structure that reviews the state-of-the-art research for improving intelligent transportation, traffic management, and control systems. More than 150 research articles published between 2010 and 2018 (32 from 2018 only), shown in Figure 1, were collected and reviewed from the leading journals and conferences in these areas. Some of these journals are: IEEE Transactions on Intelligent Transportation Systems, IEEE Intelligent Transportation Systems Magazine, International Journal of Intelligent Transportation Systems Research, International Journal of Transportation, Journal of Transportation Engineering, Part A: Systems, Journal of Transportation Systems Engineering and Information Technology, Transportation Engineering – Periodica Polytechnica, and Journal of Traffic and Transportation Engineering. Some of the leading conferences are: IEEE Intelligent Transportation Systems Conference, IEEE Vehicular Networking Conference (VNC), International Conference on Connected Vehicles & Expo (ICCVE), and IEEE Intelligent Conference on Intelligence and Security Informatics (ISI).

FIGURE 1. Distribution of the collected and reviewed papers from the leading journals and conferences based on the publication year.

The remainder of the paper is organized as follows. Section II discusses the general steps for developing smart traffic management systems. In this study we focus specifically on the following issues. Section III reviews the estimation and prediction of traffic parameters. Section IV reviews methods for detecting, recognizing, and tracking traffic-related objects. Section V focuses on methods for trip routing and planning. Section VI reviews the state-of-the-art methods that identify traffic patterns, traffic and driver behaviors, and vehicle/pedestrian behaviors. Methods for controlling signals and traffic lights are reviewed in Section VII. After that, in Section VIII, we provide a chronological review of recent survey papers in the area of intelligent transportation and traffic management systems. Moreover, we have embedded a discussion part after each section, where we present our viewpoint, as well as our own experience, and point out current weaknesses and possible future directions.

II. GENERAL STEPS FOR DEVELOPING INTELLIGENT TRANSPORTATION AND CONTROL SYSTEMS
Figure 2 shows the general phases for developing intelligent transportation and control systems. These phases are [7], [8]:
1. Collection: Traffic data are collected using different methods, such as:
   A. Image- or video-based methods. Surveillance cameras are used to visually observe road traffic
      in a specific area and record or stream the captured images/videos to control rooms. These methods are widely used in road traffic management due to their efficiency and ease of maintenance. However, video and image content requires a lot of storage, network bandwidth, and computation.
   B. Sensor-based methods, such as ultrasonic sensors, RFIDs, photoelectric sensors, lasers, radar, and vehicle probe data.
   C. Vehicle-to-Vehicle (V2V) and Vehicle-to-Infrastructure (V2I) communications using WiFi, GPRS, WiMax, and Bluetooth.
   D. Hybrid methods that combine two or more of the above methods.
2. Preprocessing: Raw data collected by any of the above methods are subject to noise, missing values, and inconsistencies due to sensor failures, measurement errors, data link errors, or huge size [7]. Data manipulation is therefore required. Some of these approaches are:
   A. Data cleaning, which includes noise removal, malfunction detection, and recovery of missing data.
   B. Dimensionality reduction, in which the dimensionality of the data is reduced using manifold learning, non-negative matrix factorization, or kernel dimension reduction. This improves the performance of learning-driven tasks in the reduced-dimensional space.
   C. Sparsity analysis, which removes some redundant features from the original feature space using compressive sensing or heterogeneous learning.
   D. Data fusion, which requires processing many sources of data. More details about data collection and preprocessing can be found in [7].
3. Analysis: Data analysis uses different analysis tools to provide useful information, such as an estimate of the total number of vehicles using a specific segment of roadway on any given day of the year. Meaningful information may lead to the resolution of a problem or the improvement of an existing situation. Identifying erroneous data elements and measuring the impact of various data-driven processes might also be done to ensure the quality of the analyzed data. Cloud computing and advanced data processing techniques and tools could be used to analyze big traffic data to support more effective real-time traffic decisions. In addition, learning tools are used to teach systems how to control traffic lights, lane signals, visual message systems (VMS), and traffic information. These approaches are generally based on machine learning, data mining, and artificial intelligence algorithms.
4. Storage: The rapid growth in the volume of traffic data creates great demand for cost-effective storage technologies. Cloud storage could be used to store and secure big traffic data to support more effective real-time traffic decisions. When data is secure and appropriately structured, there is greater trust and confidence in its use [8].
5. Communication: Data communication includes using and sharing traffic data. Traffic data is used to study, plan, design, construct, operate, and monitor traffic systems. Traffic data communication helps researchers, policy makers, governments, planners, departments of transportation, and many others to understand traveler behavior and patterns and to identify ways to make their systems more efficient and cost-effective. The usage of this data depends on the goal to be achieved and on how it was originally collected, processed, analyzed, and stored. Sharing traffic data obtained from a wide variety of sources, both internal and external, can help agencies and researchers obtain a more comprehensive picture that improves the clarity and quality of their decisions. However, sharing and communicating public traffic data raises several concerns, such as transparency, privacy, security, liability, coordination with different agencies and partners, and the maintenance cost of shared data.
6. Maintenance and Archiving: Data maintenance is the process of continual improvement and systematic checks that includes ongoing correction and verification. Higher levels of maintenance ensure the good functioning of all the required systems. Data archiving moves less commonly used data out of active systems and databases into specialized archival systems to optimize performance, achieve a cost-effective strategy, and allow for future retrieval.

FIGURE 2. General steps for the development of intelligent transportation and control systems.

III. APPROACHES FOR PREDICTION OF TRAFFIC PARAMETERS
A good number of approaches have been proposed for predicting traffic parameters. We categorize them into four main categories: first, approaches that estimate and predict real-time traffic flow; second, approaches that predict short-term traffic flow in heterogeneous conditions; third, approaches that estimate and predict travel time in real time; and finally, approaches that estimate and predict real-time traffic density.

A. ESTIMATION AND PREDICTION OF REAL-TIME TRAFFIC FLOW
Developing a mechanism that uses data mining algorithms to predict real-time traffic flow in urban regions and reduce trip time will increase the accuracy, scalability, and
adaptability of smart traffic applications. This mechanism combines several scalable data mining techniques, such as decision trees, association rules, and neural networks. These approaches use traffic parameters and historical data as input. In [10], past traffic data were used to predict short-term traffic flow using an Artificial Neural Network (ANN). The model uses traffic volume, speed, density, time, and day of week, along with the speed of each category, as input parameters. The experimental results in [10] showed that the proposed approach produced good results and consistent performance even when the time interval for traffic flow prediction was increased.

Another attempt was made by Diao et al. [9], who developed a model to predict short-term traffic volume in massive transportation systems. The authors presented a novel hybrid model to accurately forecast the volume of passenger flows multiple steps ahead. Comprehensive factors were considered, such as temporal, origin-destination spatial, frequency and self-similarity, and historical probabilistic distribution perspectives. Real-time passenger flow data from Chongqing, China were used to evaluate the forecasting performance. The results showed that the hybrid model can achieve on average a 20%–50% accuracy improvement compared to other models, especially during rush hours.

Meenakshi et al. [4] developed a hierarchical clustering technique for traffic signal decision support that automatically identifies the time-of-day intervals in which traffic congestion might occur. Applying this cluster analysis to high-resolution systems takes full advantage of sensor-based traffic signal data to cluster and validate the presented hypothesis, which benefits the system engineering field.

An unmanned aerial vehicle (UAV) is an aircraft that carries no human pilot or passengers and is guided autonomously or using remote control. UAVs were first used in military applications and have recently been used to enhance transportation systems and in many prospective applications. In this trend, fast and accurate detection of vehicles and extraction of traffic parameters from UAV video become crucial. Ruimin Ke et al. [11] proposed a new and complete analysis framework, which contains four stages that classify and estimate the traffic flow parameters (i.e., speed, density, and volume) from UAV video. The proposed framework addresses issues such as irregular ego-motion, low estimation accuracy in dense traffic situations, and high computational complexity. In addition, the authors publicly provided a dataset that contains 20,000 training and testing image samples as a benchmark for researchers working on UAVs. Experimental results showed that the proposed framework achieves very good accuracy with real-time processing speed in both free-flow and congested traffic scenarios.

Another attempt to estimate the average speed of traffic streams and the count of vehicles from UAV-based traffic videos was made by Ruimin Ke et al. [12]. The authors proposed a four-step framework that identifies the directions of traffic streams and, for each traffic stream, extracts the traffic flow parameters. The framework uses the Kanade–Lucas–Tomasi (KLT) tracker, k-means clustering, connected graphs, and traffic flow theory to analyze motion based on interest points, determine the connectivity of interest points belonging to one traffic stream cluster, and then estimate the traffic parameters. The experimental results showed that the proposed method achieves 96% accuracy in estimating average traffic stream speed and 87% accuracy in estimating vehicle count. In addition, it achieves high accuracy and processing speed in both daytime and nighttime settings and is not sensitive to effects associated with aerial videos, such as object movements, vibration, drifting, changes in speed, and hovering.

Real-time prediction of the number of on-board passengers (i.e., passenger flow) of running buses helps improve the quality of bus service. Zhang et al. [13] used smart card fare collection systems and GPS tracing systems in public transportation to analyze and predict passenger flow in real time. The evaluation results showed that the proposed model outperforms existing prediction models in prediction accuracy at most times and stations, because it uses both historical data and recent values to predict future passenger flow. Enhancing public transportation, route guidance systems, traffic light improvements, and incident management can greatly reduce traffic congestion. A limitation of the proposed system is that it does not consider the 24-hour traffic system; it focuses on peak hours only. In addition, night-time traffic flow characteristics were not considered.

Generally, there are two main approaches to predicting road traffic parameters: model-driven and data-driven [14]. Model-driven approaches use simulation to reproduce the road network behavior. Accurate predictions require detailed knowledge of the network topology. The limitation of model-driven approaches is that the number of parameters and the model structure are fixed, so they cannot reflect the continuous changes of the road network infrastructure needed to keep results accurate. On the other hand, the data-driven approach aims to examine and organize road traffic data to analyze and interpret the road traffic situation, neglecting the underlying data generation process and disregarding the network topology. Table 1 summarizes the state-of-the-art research on predicting traffic parameters, in which we describe the set of features, input and testing environment, evaluation metrics, and intelligent algorithms used. Figure 3 shows a general structure of the methods used to estimate and predict traffic parameters, which are classified into predictive mining and pattern mining. Predictive mining is a process that uses data mining and statistical models to forecast outcomes. Each model is made up of a number of parameters, which are variables that are likely to influence future results. Pattern mining is a process that uses statistical models to find relevant patterns between data examples.

Future studies should focus on: 1) using other parameters such as weather conditions, seasonal variation in traffic
flow, extreme conditions, and variability in the traffic flow; 2) collecting real-time traffic information over longer periods of time to cover all realistic conditions associated with traffic flow; and 3) developing a more general model that considers all the above traffic parameters and uses real-time traffic data to estimate traffic flow.

TABLE 1. Features, evaluation metrics, and intelligent algorithms used to predict traffic parameters.

B. PREDICTING SHORT-TERM TRAFFIC FLOW IN HETEROGENEOUS CONDITIONS
Homogeneous traffic is composed of identical vehicles that follow a lane path, while heterogeneous traffic is composed of motorized and non-motorized vehicles, such as two- and three-wheelers, along with several other vehicles and trucks, with no lane path. Heterogeneous traffic without lane discipline results in complex traffic behavior and makes the prediction of traffic flow more challenging than in homogeneous traffic [16]. Capturing the effect of different vehicle sizes and the lack of lane discipline are the main challenges in modeling heterogeneity in traffic [17]. Short-term traffic prediction is the process of predicting traffic conditions at a future time, given continuous short-term feedback of traffic information, with the response returned immediately.

Despite the extensive studies published on predicting short-term traffic conditions, only a few proposals handle the seasonal heterogeneity in traffic condition series. Huang et al. [15] proposed a real-time model that uses a seasonal adjustment factor plus an adaptive Kalman filter to predict online the seasonal heterogeneity in traffic flow series. Experimental results showed acceptable results and performance comparable to offline models. Moreover, when the traffic is highly volatile, the online model outperforms the offline model. Longer time intervals can be explored, and more data sets can be applied to evaluate the algorithm. In addition, there is an urgent need for uniform performance measurements that evaluate the overall performance of long-term prediction.

A deep learning model was used by Polson and Sokolov [18] to predict traffic flows. The authors developed an innovative architecture that combines a linear model and a sequence of tanh layers, which are used to identify spatio-temporal relations among predictors and to model nonlinear relations. For evaluation, the authors used sensor data from Interstate I-55 to predict traffic flows during two special events: a Chicago Bears football game and an extreme snowstorm. The experimental results showed that the proposed deep learning model provides precise short-term traffic flow predictions. In addition, the authors empirically observed that prediction based on recent traffic
data within the last 40 minutes generates stronger results than prediction based on historical values within the last 24 hours. This indicates that a powerful model could be based on features generated from recent observations (i.e., within the last few minutes) rather than previous days.

FIGURE 3. A general structure of the methods used to estimate and predict traffic parameters.

Another approach was proposed by Qian et al. [19], in which the authors developed a macroscopic model that approximates heterogeneous traffic flow using the interplay of multiple vehicle classes; each class represents homogeneous traffic and is represented by a deterministic fundamental diagram. All these classes encounter an identical traffic state, but each class perceives the impact of the other classes differently. Extensive numerical experiments with the NGSIM I-80 data set showed that the proposed multi-class model can produce realistic congestion propagation for multiple classes in various scenarios. It also computes realistic time-varying travel times for each class, which cannot be obtained from conventional single-class models.
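
As a rough illustration of the multi-class idea described above, the following Python sketch gives each vehicle class its own deterministic fundamental diagram while all classes share one traffic state. It is not the model of [19]: the Greenshields-type linear speed-density relation, the per-class perception weights, and all numeric values are assumptions chosen only to make the mechanism concrete.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class VehicleClass:
    """One vehicle class with its own (assumed) Greenshields fundamental diagram."""
    name: str
    free_flow_speed: float      # km/h, hypothetical value
    jam_density: float          # perceived density (veh/km) at which speed drops to zero
    weights: Dict[str, float]   # hypothetical weight this class gives to each class's density


def perceived_density(cls: VehicleClass, densities: Dict[str, float]) -> float:
    """All classes see the same densities, but each weighs them differently."""
    return sum(cls.weights[other] * k for other, k in densities.items())


def class_speed(cls: VehicleClass, densities: Dict[str, float]) -> float:
    """Greenshields relation v = v_f * (1 - k / k_jam), clipped at zero."""
    k = perceived_density(cls, densities)
    return max(0.0, cls.free_flow_speed * (1.0 - k / cls.jam_density))


def class_flow(cls: VehicleClass, densities: Dict[str, float]) -> float:
    """Flow of a class = its own density times its class-specific speed."""
    return densities[cls.name] * class_speed(cls, densities)


# Illustrative classes: a truck counts more heavily than a car in both perceptions.
car = VehicleClass("car", 100.0, 120.0, {"car": 1.0, "truck": 2.5})
truck = VehicleClass("truck", 80.0, 120.0, {"car": 1.0, "truck": 2.0})

# One shared traffic state (veh/km per class) seen by both classes.
densities = {"car": 30.0, "truck": 10.0}

for vehicle_class in (car, truck):
    speed = class_speed(vehicle_class, densities)
    flow = class_flow(vehicle_class, densities)
    print(f"{vehicle_class.name}: speed={speed:.1f} km/h, flow={flow:.0f} veh/h")
```

Because each class has its own speed for the same traffic state, dividing a link length by `class_speed` yields a class-specific travel time, which is the kind of quantity a single-class model cannot provide.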

Mohan and Ramadurai [17] proposed a parsimonious model of heterogeneous traffic that can capture gap filling. Gap filling is an event that occurs when a driver tries to position the vehicle in the available gaps in the road section ahead. The proposed model uses area occupancy, relaxation time, and the relative speed of vehicles. Field data from an arterial road in Chennai, India were used to validate and evaluate the proposed model. The evaluation results showed that the proposed model generates results comparable with those from a few existing generalized multi-class models.

Estimating traffic flow based on traffic video analysis is another method, proposed by Hung et al. [20]. The authors used videos captured by a surveillance camera mounted at a crossroad to detect moving vehicles and control the traffic flow. In addition, they used the traffic flow as input for automatic timing of the traffic light. The authors used background subtraction and a Gaussian model to detect and trace vehicles, synchronized over a given time period to eliminate shadows from sunlight or streetlights. The authors used Vietnam as a context, which includes a mixed flow of motorbikes with other means of transport, to evaluate the proposed system.

C. ESTIMATION AND PREDICTION OF REAL-TIME TRAVEL TIME
Travel time is the time required for road users to travel from a source location to a destination point. Predicting travel time in a timely manner avoids congestion and increases the utilization of the entire highway network [21]. Available technologies and sensors in transportation systems generate huge volumes of traffic data in real time, and various prediction methods have been proposed to process these data rapidly. Factors that influence the prediction of real-time travel time are: 1) the time of the prediction, i.e., whether it is during the day, on weekdays, weekends, in summer, winter, or holidays, which affects the variation of car flow over time and thereby the accuracy of the prediction; 2) hard-coded delays, where the transition time slots are fixed and do not depend on real-time traffic flow; 3) the adjacency of traffic lights, where the traffic light at one intersection influences the traffic at adjacent intersections; 4) emergency cases such as accidents, roadwork, broken-down cars, ambulances, rescue vehicles, police, and fire brigades; and 5) pedestrians crossing the roads.

Some of these methods are time series methods [22]–[24], regression models [25], [26], and machine learning methods [27], [28]; more details can be found in [21]. Random forests, as a machine learning method, and Apache Hadoop were used by Shu-Kai et al. [21] to construct a big data analytics platform that predicts highway travel time. The authors developed two models, namely OTTP and ATTP, and proposed a platform that uses data collected from highway electronic toll collection in Taiwan. Experimental results showed that the OTTP model helps highway drivers avoid traffic congestion and minimize travel time by selecting optimal departure times. In addition, the ATTP module analyzes the current traffic conditions to provide accurate predictions of arrival time, which allows drivers to select alternate routes to further reduce travel times. Further experiments showed that combining the two models provides highly accurate travel time prediction for freeway drivers.

A classification algorithm, k-NN, along with historical data was used by Kumar et al. [29] to predict the next bus travel time. This real-time prediction uses a model-based recursive estimation scheme based on Kalman filtering. The predicted travel time can be expressed as the remaining time to reach the destination and displayed at bus stops, within buses, or through web portals. The evaluation results showed that the proposed method improved prediction accuracy compared with other methods based on static inputs. Other methods and algorithms were reviewed in [30] and [31].

The accuracy of the travel time information given to passengers is important in developing any Advanced Public Transportation Systems (APTS) application and helps passengers determine their departure/arrival times. To improve the accuracy of such applications, Bin Yu et al. [28] proposed a random forests based on nearest neighbor (RFNN) method to predict bus travel time. Experiments were conducted using real GPS data collected from two bus routes in Shenyang. These experiments showed that the RFNN method achieved higher accuracy than four other methods. However, the RFNN method has a longer computation time, which might be reduced using parallel computing.

Another attempt was made by B. Anil Kumar et al. [32]. The authors used both temporal and spatial variations, based on the basic traffic conservation equation, to predict travel time. Experimental results showed that the proposed method predicted better than historical average, regression, and ANN methods and than methods that use either temporal or spatial variations only. In addition, these results showed that using vehicle tracking data without location-based data is good enough to generate accurate prediction results.

Dawn Woodard et al. [33] proposed a method called TRIP, which uses GPS data from mobile phones to predict the probability distribution of travel time for an arbitrary route in a road network at any given time. TRIP also provides information about the reliability of the travel time prediction. Evaluation results using data from mobile phones collected in the Seattle metropolitan region showed that TRIP provides predictions that are as accurate as Bing Maps predictions. In addition, it is computationally feasible even for very large-scale road networks.

Chaiyaphum et al. [34] proposed an effective travel time prediction technique based on the concept of Deep Belief Networks (DBN). The proposed technique uses Restricted Boltzmann Machines (RBM) to automatically learn generic traffic features following an unsupervised learning architecture. And then following a supervised learning architecture

then the sigmoid regression is used to predict the travel time. The prediction model was tested using both the PeMS data set and real traffic data. Evaluation results achieved high prediction accuracy.

D. ESTIMATION AND PREDICTION OF REAL-TIME TRAFFIC DENSITY
Throughput, travel time, safety, fuel consumption, emission, reliability, and traffic density are considered the primary measures for quantifying traffic congestion on roadways other than signalized intersections [3]. The prediction of real-time traffic density can be based on: 1) aerial photography using loop detectors, 2) data-driven approaches using linear models, linear regression, ANN, k-NN, pattern matching, PCA, the nearest neighbor approach, Kalman filtering, clustered neural networks, wavelet neural networks, and k-NN and Linearly Sewing Principle Components, and 3) image processing techniques. Jithin et al. in [3] used k-Nearest Neighbor (kNN) and Artificial Neural Network (ANN) machine learning techniques to estimate travel time and traffic density. The authors used data collected every five minutes from automatic sensors as input to estimate the target travel time and density. Evaluated individually under Indian traffic conditions, each technique showed promising results in terms of Mean Absolute Percentage Error (MAPE). However, combining the two techniques did not show any significant improvement in performance. In addition, the authors suggested that the ANN needs a very large amount of training data to achieve better performance.

A Deep Convolutional Neural Network (D-CNN) method based on video images was developed by Chung and Keemin Sohn in [35] to count the number of vehicles on a road segment. The experimental results showed that their method outperforms existing approaches. The limitations of their method are: (1) it counts vehicles regardless of the vehicle's size, make, and model, and (2) the count is based on images extracted from video without considering consecutive images or vehicle status (i.e., moving or stopped).

Zhang et al. in [36] analyzed and compared the Fully Convolutional Networks (FCN) method with the regression-based method and concluded that FCN generates more accurate results on the TRANCOS public dataset.

A novel Periodic-Convolutional Recurrent Network (P-CRN) method was proposed by Ali Zonoozi et al. in [37] to predict crowd density. P-CRN adapts CRN to accurately capture spatial and temporal correlations, learns and incorporates explicit periodic representations, and can be optimized with multi-step-ahead prediction.

IV. DETECTION, RECOGNITION, AND TRACKING OF TRAFFIC RELATED OBJECTS
A considerable number of approaches have been proposed to track objects in the traffic management space. We categorize them into three general categories: first, approaches that detect and count real-time pedestrians; second, approaches that predict the occurrence of vehicle accidents; and third, approaches that recognize license plates.

A. DETECTING AND COUNTING OF REAL-TIME PEDESTRIANS
Effective detection of traffic related objects often improves the accuracy of the recognition and tracking steps, which play a crucial role in ITS. Traffic signs, cars, pedestrians, and cyclists are important classes of traffic related objects. Jifeng Shen et al. in [38] proposed a pedestrian extraction and refinement framework for pedestrian detection. The proposed system is based on the Pixel Differential Feature (PDF) and the Aggregated Region Feature (ARF). PDF is a lightweight feature with a high recall rate, and the authors used the multi-scale ARF to extract the co-existing dominant pixel differential patterns in a local region, fusing information from different resolutions and scales to improve performance and reject hard false positives. Experiments on the INRIA, Caltech, ETH, TUD-Brussel, and KITTI datasets demonstrate the effectiveness of this method at real-time speed with low computational complexity, which makes it promising for embedded or mobile platforms. David et al. in [39] reviewed the state-of-the-art research on pedestrian detection for advanced driver assistance systems. Table 2 summarizes the typical machine learning algorithms used to detect, recognize, and track traffic related objects, especially real-time pedestrians.

There has been an enormous research effort in the automatic detection and classification of pedestrians. However, there is still a need to develop more efficient algorithms and better systems, so further work in this area is fully justified. We would like to provide a few more suggestions:
1) Explore more features and machine learning algorithms to generate more discriminative feature representations and learning strategies.
2) Increase the robustness of existing algorithms by increasing their discrimination power and lowering their computation time, which could be done by integrating parallel processing. This might lower the detection latency in some critical cases. These algorithms might also be extended to detect other important objects in the scene, such as bikes, cars, traffic signs, and lights.
3) Focus on different poses, 3D views, and partially occluded pedestrians.
4) Use pedestrian tracks to predict and be aware of pedestrian intentions in advance for collision avoidance.
5) Add more dynamic situations to complement the current data sets.
6) Use 3D LIDAR technology, which provides more accurate data than machine vision and cameras. 3D LIDAR technology can be successfully used to detect pedestrians in different lighting conditions.
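
The kNN estimation and MAPE evaluation summarized in Section III-D can be illustrated with a short example. This is only a minimal sketch under stated assumptions, not the implementation used in [3]: the choice of features (flow and mean speed per five-minute interval), the sample values, and the helper names `knn_predict` and `mape` are all hypothetical, and the features are left unscaled for brevity.

```python
import math


def knn_predict(train_x, train_y, query, k=3):
    """Average the targets of the k training rows closest to `query`.

    Distances are plain Euclidean distances on unscaled features,
    which is a simplification made for this illustration.
    """
    ranked = sorted(
        zip(train_x, train_y),
        key=lambda pair: math.dist(pair[0], query),
    )
    nearest = ranked[:k]
    return sum(target for _, target in nearest) / len(nearest)


def mape(actual, predicted):
    """Mean Absolute Percentage Error, expressed in percent."""
    ratios = [abs(a - p) / abs(a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(ratios) / len(ratios)


# Hypothetical five-minute sensor records (illustrative values only):
# (flow in vehicles per 5 min, mean speed in km/h) -> density in vehicles/km
train_x = [(40, 60.0), (55, 52.0), (70, 41.0), (85, 30.0), (95, 22.0), (100, 15.0)]
train_y = [8.0, 12.5, 20.5, 34.0, 52.0, 80.0]

# Held-out intervals used to score the estimator.
test_x = [(50, 55.0), (90, 25.0)]
test_y = [11.0, 45.0]

predictions = [knn_predict(train_x, train_y, q, k=2) for q in test_x]
error = mape(test_y, predictions)
print(predictions, round(error, 2))
```

In practice the features would normally be standardized before computing distances, and `k` would be chosen by validation rather than fixed by hand.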

TABLE 2. Machine learning algorithms used to detect, recognize, and track traffic related objects.

B. PREDICTION OF THE OCCURRENCE OF VEHICLE ACCIDENTS
Accident prediction is an estimation of the occurrence of accidents based on the nature of the relationships between different roadway entities, such as driver behavior, road structure and type, and the surrounding environment. It is important to understand the mechanisms involved in accidents on the one hand and to better predict their occurrence on the other hand [61].

Gianfranco et al. in [62] developed a model that aims to predict and estimate the number of accidents for three situations in an urban road network: a roundabout, a three- or four-way junction, and a straight stretch of road. The model divides the accident data into homogeneous clusters, and the models constructed on them are based on Poisson and Negative Binomial (NB) algorithms. The analysis is based on the analogous, albeit contrasting, concepts of similarity and distance: the shorter the distance, the greater the similarity. Therefore, they used the squared Euclidean distance, the most common method for measuring the distance between cases. Prior to the analysis, the variables are standardized (divided by the standard deviation) so that the unit of measure does not affect their distance. Essentially, this involves working with standard deviations. Once the cases have been classified into groups, the absolute value of the correlation coefficient is widely used to measure the degree of similarity. They adopted agglomerative hierarchical clustering based on the inertia criterion and on Ward's method [63], which combines clusters in such a way that at each agglomeration the two clusters merged are those with the smallest increment in the sum of squared distances (within-cluster standard deviation). By analyzing and simplifying the available road traffic accident data in Italy, collected using the report forms created by the Italian Institute of Statistics (ISTAT), they developed three different models for predicting accident frequency at roundabouts, junctions, and straight stretches of road. Each model shows that accident rates vary with risk factors. Thus, it is possible to identify appropriate countermeasures for reducing the risk of road accidents.

Meng et al. in [64] proposed a connected vehicle (CV)-based dynamic all-red extension (DARE) framework for adaptive signalized intersections to avoid potential crashes caused by red-light running (RLR) behavior. The conceptual framework consists of three components: CVs, roadside equipment (RSE), and traffic control devices. RLR prediction at signalized intersections is a crucial component of DARE, which avoids potential collisions caused by RLR behavior. They formulated RLR prediction as a binary classification problem based on continuous trajectories measured by radar sensors. In the CV environment,
vehicle trajectories and real-time signal timing could be obtained via vehicle-to-infrastructure and vehicle-to-vehicle communications. Using continuous trajectories, individual speed, acceleration, and distance to the stop line at the red-light onset time are selected as classification attributes. Non-weighted and weighted least squares support vector machines (LS-SVM) are adopted to solve the RLR prediction problem. Parameter tuning is conducted by cross-validation in the discrete parameter space and by Bayesian inference in the continuous parameter space, respectively. As a comparison, the existing DARE approach at adaptive signalized intersections based on inductive loop detectors is discussed.

Alkheder et al. in [65] used an artificial neural network (ANN) to predict the injury severity of traffic accidents based on 5973 traffic accident records that occurred in Abu Dhabi over a 6-year period (from 2008 to 2013). For each accident record, 48 different attributes had been collected at the time of the accident. After data preprocessing, the data were reduced to 16 attributes and four injury severity classes. The WEKA (Waikato Environment for Knowledge Analysis) data-mining software was used to build the ANN classifier. The traffic accident data were used to build two classifiers in two different ways. The whole data set was used for training and validating the first classifier (training set), while 90% of the data were used for training the second classifier and the remaining 10% were used for testing it (testing set).

The experimental results revealed that the developed ANN classifiers can predict accident severity with reasonable accuracy. The overall model prediction performance for the training and testing data were 81.6% and 74.6%, respectively.

To improve the prediction accuracy of the ANN classifier, the traffic accident data were split into three clusters using a k-means algorithm. The results after clustering revealed a significant improvement in the prediction accuracy of the ANN classifier, especially for the training dataset. Furthermore, in order to validate the performance of the ANN model, an ordered probit model was also used as a comparative benchmark. The dependent variable (i.e., degree of injury) was transformed from ordinal to numerical (1, 2, 3, 4) for (minor, moderate, severe, death), respectively. The R tool was used to fit the ordered probit model. For each accident, the ordered probit model showed how likely the accident was to result in each class (minor, moderate, severe, death). The accuracy of 59.5% obtained from the ordered probit model was clearly lower than the ANN accuracy of 74.6%. Table 3 summarizes the typical machine learning algorithms used for predicting the occurrence of vehicle accidents.

TABLE 3. Features, factors, and strengths of the methods used for predicting the occurrence of vehicle accidents.

We would like to provide a few more suggestions as future directions:
1) Using more data points, which makes the prediction stronger, yields better quality rules, and helps reduce accidents.
2) Adding more factors, such as weather, time, speed limit, road curvature, average traffic flows and volumes, proximity to intersections, road direction and alignment (north, south, east, and west), road width, road surface type, and human factors. These factors
allow the prediction model to produce more accurate and unbiased results.

C. RECOGNITION OF LICENSE PLATE
Automatic license plate recognition systems have been widely used in different applications such as traffic control, traffic surveillance, automated car parking, vehicle tracking, electronic toll collection, vehicle localization, and long-distance vehicle localization or tracking. The detection of small and vague license plates and characters in real applications is difficult and still an open problem. Chunsheng and Faliang Chang in [78] proposed a novel hybrid cascade framework for quickly detecting small and vague license plates in large and complex visual surveillance scenes. The experiments showed that the proposed framework is able to rapidly detect license plates with different resolutions and different sizes in such scenes. The proposed framework performs well on different evaluation data sets containing many small and vague plates.

V. IDENTIFYING TRIP ROUTING AND PLANNING
The development of real-time trip routing and planning systems will help users to route and plan their trips according to their preferences. Users can access these services from their mobile devices or by visiting websites over the internet. Cui et al. [81] planned an optimal travel route between two geographical locations, based on the road networks and users' travel preferences. They defined users' travel behaviors from their historical Global Positioning System (GPS) trajectories and proposed two personalized travel route recommendation methods: collaborative travel route recommendation (CTRR) and an extended version of CTRR (CTRR+). First, they estimated users' travel behavior frequencies using a collaborative filtering technique. A route with the maximum probability of a user's travel behavior is then generated based on the Naïve Bayes model. The CTRR+ method improves the performance of CTRR by taking into account cold-start users and integrating distance with the user travel behavior probability. Table 4 summarizes the features, factors, and strengths of the typical machine learning algorithms used for trip planning and routing.

TABLE 4. Features, factors, and strengths of the methods used for identifying trip routing and planning.

The authors in [81] conducted case studies based on a real GPS trajectory data set from Beijing, China. The experimental results show that the proposed CTRR and CTRR+ methods achieve better results for travel route recommendations compared with the shortest distance path method. As future work, they recommended continuing to study the segmentation method for GPS trajectories with context-adaptive sampling rates, and they suggested that both algorithms can be improved from the following perspectives. First, CTRR and CTRR+ consider the travel behavior by integrating spatial information (i.e., the road segment) and temporal information (i.e., the time interval), but they do not consider the sequence of travel behaviors, which may lead to unreasonable routes in some cases. One possible solution is to study the user's transition behavior, which describes the transition between two travel behaviors on adjacent road segments in one time interval. The frequencies and probabilities of the user's transition behavior can be estimated with MF and a smoothing algorithm, similar to the study of the user's travel behaviors. Hence, a personalized travel route could be recommended by searching the route
with the maximum probability of the travel behavior and the transition behaviors, which would address the limitation of ignoring the sequence of travel behaviors. Besides, the transition behavior probability also indicates some restrictions in the road network. For instance, the transition behavior probability must be smaller at turns with restrictions than at those without restrictions, which could correct routes with unreasonable turns. Second, CTRR and CTRR+ assume that traveling happens within the same time interval in the route recommendation. However, in the real world, a trip may cross multiple time intervals. There are two key problems to be considered. The first is that the traveling time on each road segment should be estimated so that the elapsed time of traveling can be tracked.

The second issue is that the transition between travel behaviors on one road segment across adjacent time intervals should be considered. As travel behavior is assumed to be composed of a set of latent factors, one possible solution is to consider the latent factors as time-dependent and to study the transition between the latent factors over adjacent time intervals. With time-dependent latent factors that capture the transition relationship over time, the frequencies and probabilities of travel behaviors can be better estimated. In CTRR+, the parameter α balances the weights between the travel behavior probability and distance. The optimal setting of α depends on the dataset and should be learned from it. A potential method for learning α is the gradient ascent method, which is an iterative process. The precision of CTRR+ varies with the value of α, and the gradient of the precision at α is the direction in which the precision increases fastest. With gradient ascent, α takes steps proportional to the gradient of the precision at α until the process converges. The deficiency of the method is that the precision may converge to a local maximum.

Slavin et al. in [85] introduced improvements to traffic microsimulation methods by taking account of more realistic lane-level trajectory selections made by drivers. These data can be used to analyze the likely travel times and other characteristics of potential vehicle trajectories at the lane level from an origin to a destination. Within the context of a single simulation run, a look-ahead mechanism is used to identify better lane-level guidance and ensure that the guidance is feasible considering other traffic at a later time. These improved processes are based on the ability to store, manage, and utilize time-dependent, lane-level information on traffic and geometric conditions on highways and between and within intersections on streets. Similarly, these data items can be gathered for the components of trajectories that take place inside intersections, where delays are often experienced due to conflicting movements of vehicles and even pedestrians. A lane-level vehicle routing and navigation apparatus according to embodiments of the invention includes a simulation module that performs microsimulation of individual vehicles in a traffic stream, and a lane-level route optimizer that evaluates predicted conditions along candidate paths from an origin to a destination as determined by the simulation module, and determines recommended lanes to use and the associated lane-level maneuvers along the candidate paths. A link-level optimizer may be used to determine the candidate paths based on link travel times determined by the simulation module, which may then be further refined with the lane-level optimizer. They recommended running the simulation in a multi-threaded and/or distributed manner for faster computation, in order to provide more timely navigation guidance or to evaluate multiple alternatives simultaneously.

Wang et al. in [86] proposed a model for the school transportation routing and scheduling problem, which aims to reduce cost and time using the minimum number of buses. They developed a Mixed Integer Linear Programming (MILP) model for the integrated school bus routing and scheduling problem. The model is solved to optimality on small problems to test its correctness. An advanced decomposition algorithm, namely the School Compatibility Decomposition Algorithm (SCDA), is proposed to solve the model for larger problems. SCDA is superior to traditional decomposition methods because it considers valuable scheduling information (the compatibility) when solving the routing problem. They claimed that the biggest contribution of the proposed model and algorithm is that the interrelation between routing and scheduling is kept even in the decomposed problems. The validity of the model and the efficiency of the SCDA algorithm are tested on randomly generated problems and a set of test problems. The first experiments show that SCDA can find solutions as good as the integrated model (in terms of the number of buses) in much shorter time (as little as 0.6%) and that it also outperforms the traditional decomposition algorithms. The second experiments show that SCDA can find better solutions with fewer buses (up to 26%) and shorter mean and maximum travel time per trip (up to 7%). A few directions for future work can be identified. One is a more efficient algorithm for solving each single-school routing problem, so that it can handle more complicated problems with more stops per school. Another is a more flexible way to handle bus service start time, especially for morning trips. An appropriate time window might be more financially beneficial than a fixed service start time.

Liu et al. in [87] focused on the identification and optimization of flawed region pairs with problematic bus routing, to improve the utilization efficiency of public transportation services according to people's real demand for public transportation. First, they provided an integrated mobility pattern analysis between the location traces of taxicabs and the mobility records in bus transactions. Based on the mobility patterns, they proposed a localized transportation mode choice model, with which they can dynamically predict the bus travel demand for different bus routes by taking into account both bus and taxi travel demands. This model is then used for bus routing optimization, which aims to convert as many people as possible from private transportation to public transportation given budget constraints on the bus route modification. They also leveraged the model to identify region pairs
with flawed bus routes, which are effectively optimized using their approach. To validate the effectiveness of the proposed methods, extensive studies were performed on real-world data collected in Beijing, which contain 19 million taxi trips and 10 million bus trips.

The work reported in [87] showed how to optimize bus routing to attract more bus riders from taxis. As a future direction, improvements can be made in several ways. First, bus stop location selection can be taken into account, so that the approach in [87] optimizes bus routing and bus stop locations simultaneously to meet people's travel demands. Second, more transportation modes can be considered; for example, bus network optimization can be conducted together with the subway system and the city bike system. This can help to model the travel demand of the city as a whole and better serve the goal of making public transportation more attractive to riders. Table 4 summarizes the features, factors, and strengths of the typical machine learning algorithms used for identifying trip routing and planning.

Future research might focus more on data-driven models and on collecting more detailed data, which might yield better and unexpected results that are worth exploring.

VI. IDENTIFYING TRAFFIC PATTERNS AND BEHAVIOUR
Identifying vehicle movements and understanding traffic patterns, behaviour, and how traffic congestion appears and grows in time and space can benefit the prediction of short- and long-term traffic situations and can also reduce congestion. There are several attempts that focus on analyzing and identifying traffic patterns and behaviour. Sekar and Shondelmyer [88] focused on detecting and analyzing traffic infractions based on traffic behavior. The authors proposed an approach in which an information handling system detects a traffic infraction of a driver driving a vehicle. In turn, the information handling system forms an infraction detection zone that includes a set of traffic control devices, and sends a set of configuration parameters to the set of traffic control devices. The information handling system then uses vehicle identification data in the set of configuration parameters to identify driving behaviors of the driver through the infraction detection zone and issues a citation based upon the identified driving behaviors.

Wang et al. [89] introduced a novel concept called "shadow traffic" for modeling traffic anomalies in a unified way in traffic simulations. They transformed the properties of anomalies into the properties of shadow vehicles and then described how these shadow vehicles participate in traffic simulations (i.e., a variety of traffic anomalies can be depicted in a unified way that also describes how the anomaly itself evolves). They claimed that their model could be incorporated into most existing traffic simulators with little computational overhead. Moreover, experimental results demonstrate that the model is capable of simulating a variety of abnormal traffic behaviors realistically and efficiently. As future work, they recommended building a real-time editing traffic anomaly simulation system in which users can introduce and edit anomalies whenever and wherever they need in traffic simulations, and exploring the modeling of complex vehicle-crowd interactions.

Li et al. [90] proposed a system and method for traffic engineering in networks and, in particular embodiments, for distributed traffic engineering in software defined networks. In accordance with an embodiment, a network component for dynamic zoning for traffic engineering (TE) in software defined networking (SDN) includes a processor and a computer readable storage medium storing programming for execution by the processor. The programming includes instructions to: receive network information from at least one SDN controller from a plurality of SDN controllers in a network; determine a plurality of TE zones for the network, select a local zone TE controller for each of the plurality of TE zones, and select a master TE controller according to the network information and a zoning scheme, wherein the local zone TE controller is selected from one of the SDN controllers, and wherein the master TE controller is selected from one of the SDN controllers; and transmit an indication of the zone composition, the local zone TE controllers, and the master controllers to at least some of the SDN controllers.

The master selection strategy provides the network zoning processor with a methodology for selecting the master controller from the controller candidates. Possible strategies include random selection, fixed selection, joint optimization with zoning, etc. Selection preferences and other necessary data may be specified along with the selection strategy. The zoning strategy indicates whether to perform node grouping or flow grouping. The former creates zones by associating nodes (routers, switches, base stations, UEs, etc.) with controllers; the latter does so by associating flows (represented by their candidate routing paths) with controllers. In an embodiment, which strategy to use depends on network and traffic status. For instance, in stable flow network segments, flow grouping is preferable over node grouping, whereas in unstable flow segments the opposite is desired. Other parameters, for example the zone border type, can be included in the zoning configuration. There are three border types: link sharing only, node sharing only, and link and node sharing.

VII. DETECTION AND EVALUATION OF DRIVER BEHAVIOUR
A considerable number of approaches have been proposed to detect and evaluate driver behavior. We focused on the following areas: detecting and evaluating driver distraction, capturing driver behavior using a speech model, and analyzing and predicting driver behavior.

A. DETECTION AND EVALUATION OF DRIVER DISTRACTION
Nowadays, many drivers are affected by visual and cognitive distractions. In addition, the features that are equipped
with the vehicles, such as entertainment systems, driver fatigue, or portable devices that are
brought into the vehicles, such as smartphones, also cause distraction. Driver distraction,
especially among young drivers, is considered one of the most common causes of traffic accidents
and congestion. Andrei Aksjonov et al. in [91] and [92] proposed a novel method for evaluating
driver distraction and situation awareness while performing a secondary task, using machine
learning and fuzzy set theory. The evaluation results showed that the proposed method makes it
possible to recognize, detect, and calculate the level of driver distraction as a percentage based
on safe vehicle dynamic performance. The secondary task examined is chatting on a cellular
telephone. The driver-distraction experiments were conducted using simulation in a laboratory and
were more accurate than the earlier method because they involve more input measures.

For monitoring the driver attention, Nanxiang et al. in [93] built statistical models in the form
of Gaussian Mixture Models (GMMs) to quantify and analyze the actual deviations in driver
behaviors from the expected normal driving patterns. The authors defined secondary tasks as
operating the radio, the phone, and a navigation system. They collected data from real-world
scenarios using different noninvasive sensors, including the controller area network bus
(CAN-Bus), video cameras, and microphone arrays. Their model achieves 77.2% accuracy and shows
that certain tasks are more distracting than others. Building upon these results, the authors
proposed a regression model to generate a metric that identifies and describes the driver’s
attention level in order to signal alarms, prevent collisions, and improve the overall driving
experience [93].

All the previously proposed methods provide a distracted or non-distracted decision and require
some additional devices, such as cameras. Consequently, the suggested method could be developed
and used as a practical tool for different evaluations and comparative analyses of the influence
of secondary tasks on vehicle safety. Examples of the applications of detecting and predicting the
driver behavior are: lane changing, intersection decision-making, driver profiling, and route
choice modeling.

B. CAPTURE THE DRIVER BEHAVIOR USING SPEECH MODEL
Another direction is to study the driver motion behavior, which includes the driver’s body
movements, to obtain the motion parameters that have an impact on vehicle control and traffic
safety while driving on the road. This can be used to analyze the impact of human fatigue and
alcohol on driving performance under different traffic conditions.

Modeling the driver behavior essentially emerged to predict the driver intent, the vehicle and
driver state, and the environmental factors that are used to improve transportation safety, reduce
traffic congestion, and enhance the driving experience as a whole [94].

C. ANALYZE AND PREDICT DRIVER BEHAVIOR
Researchers from different backgrounds, such as the car industry, transport engineering, and
psychology, have been trying to study and analyze human driving behavior in realistic situations
to: (1) better understand the relations between the driver actions, the vehicle performance, and
the driving environment, (2) prevent road crashes and reinforce traffic safety, and (3) enhance
drivers’ comfort. Afaf Bouhoute et al. in [95] developed a methodology to process and analyze
car-generated data. The proposed method used probabilistic graphical models combined with a
machine-learning algorithm for building a formal model of the driver behavior. It also focused on
two analysis goals: 1) automatic conformity of the drivers’ behavior to traffic rules; and
2) visualization and comparison of drivers’ behaviors. Early experimental results showed that the
design of the numerical domains considered strongly influences the analysis results.

Najah and Hatem in [96] provided an overview of advances in in-vehicle and smartphone sensing
capabilities and communication, and of recent applications and services of driver behavior
modeling (DBM), such as cloud-based services. The components and stages involved in driver
behavior modeling, the various forms of input, and the primary modeling approaches were
introduced, and the use of DBM, with emphasis on Advanced Driver Assistance Systems (ADAS) and the
emerging autonomous vehicles, was described. Also, the authors provided different techniques for
simulation-based and data-driven evaluation mechanisms, along with datasets for specific DBM
objectives and applications. Finally, the authors highlighted several research challenges, key
future directions, and open research issues that enable researchers to develop more sophisticated
DBM.

Fugiglando et al. in [97] proposed a near-real-time method to analyze and classify driver behavior
into different groups using a selected subset of controller area network (CAN) bus signals. The
proposed method uses an unsupervised learning technique and information collected from different
sources, such as gas pedal position, brake pedal pressure, steering wheel angle, steering wheel
momentum, velocity, RPM, and longitudinal and lateral acceleration. The authors offer a validation
method to test the robustness of clustering in a wide range of experimental settings and to
compute the minimal amount of data needed to preserve robust driver clusters.

Results showed that the optimal number of clusters can be identified, and that specific
combinations of signals and features provide very high performance in terms of robustness. In
addition, the authors showed that it is possible to reduce the size of the database by as much as
99% without affecting the clustering performance.

Bashar et al. in [98] provided an overview of the latest approaches used for driver and passenger
recognition based on data collected from smartphones. The authors also proposed a probabilistic
method that utilizes features based on smartphone inertial measurements and doors signal, such as


user motion during entry, and captures salient ingress to identify and analyze the user behavior.
Experimental results showed the usefulness, effectiveness, and simplicity of the identification
method.

Analyzing the impacts of drivers’ characteristics on their driving performance under varying
traffic conditions is helpful and significant to the research on traffic safety and in formulating
traffic control measures. It is also important to many professionals, such as those working in
traffic behavioral studies. Many factors affecting driver behaviors have been investigated
extensively, such as driver distraction, experience, in-vehicle information, fatigue, and alcohol.
Capturing the driving motion is a process that records and tracks drivers’ body movements and
obtains their motion parameters in a three-dimensional space [99]. Drivers’ motional behavior
greatly impacts the moving operation of an individual vehicle. Jianjun et al. in [99] presented a
study to better understand the driver behaviors based on information captured from a driving
motion capture system (MCS), which is crucial to highway traffic safety. The authors used the MCS
to capture driving motion and MotionBuilder to reproduce the driving motion. Then the quaternion
method is utilized to understand and calculate the efficiency of driver behaviors. The proposed
method explores the relation between driving behaviors and traffic conditions. It could also be
used in other fields, such as ergonomics, behavior, and athletic sports [99].

Generally, it is difficult to measure and quantify the cognitive workload of a driver because a
set of factors should be considered, such as the driver characteristics, the driving conditions,
the traffic conditions, and the weather and local environment. However, Hyun et al. in [100]
developed a model to predict the driver’s electroencephalograph (EEG) level utilizing basic
information obtained while the vehicle is being driven. The model extracts useful features from
the vehicle driving information, such as engine RPM, vehicle speed, lane changes, and turns, which
are grouped into three groups: driving information, driving conditions, and driving behavior
information. These features are used to divide the EEG values into two classes, “normal” and
“overload”. The classification model uses the support vector machine (SVM) algorithm to predict
normal and overload states during actual driving. The model uses actual driving data collected
from drivers on real roads. The authors evaluated the prediction performance after building an SVM
model and found that better prediction performance was obtained for drivers who felt more
overloaded during complex driving than during simple straight driving. In addition, this
prediction model can be utilized in some existing human vehicle interface-based driver workload
management systems (HVI-DWMS), such as in [101].

Several challenges remain unaddressed in identifying and analyzing traffic patterns and behavior;
some of these are: detecting traffic bottlenecks, automatically distinguishing between different
types of transportation infrastructure in order to understand the relationship between them, and
understanding how vehicles and drivers deal with the traffic sign, especially when there is more
than one lane option available for merging. In addition, developing a dataset that contains
complete driver behaviors is a necessity. This dataset should be labeled and annotated correctly
and cover different driver styles, urban environments, and cultures.

VIII. ANALYSIS OF VEHICLE/PEDESTRIANS BEHAVIOUR
Dangerous and abnormal pedestrian behaviors are the main cause of many traffic accidents. On
average, a pedestrian was killed every 1.6 hours and injured every 7.5 minutes in traffic crashes
in the US [102]. More than 30% of the total number of people killed in traffic accidents were
pedestrians [103]. Significant efforts are being made in several countries to understand
pedestrians’ behavior in order to develop more accident-free traffic environments. Qianyin et al.
in [104] studied five kinds of dangerous abnormal pedestrian behaviors, which are: crossing the
road border, illegal stay, crossing the road, moving along the curb, and entering the road area.
In addition, they built a behavior model between the pedestrian trajectory and the road to
describe the above behaviors. The authors first extracted the background from the video frames to
detect the pedestrian, and then eliminated the shadow. Third, the aspect ratio characteristic and
a traditional feature-based tracking method are used to recognize and track the pedestrians.
Finally, the authors developed a mathematical model to detect the abnormal behavior of the
pedestrians. Experimental results showed that the proposed model, using surveillance videos
collected from a real traffic monitoring system in Guangzhou, China, achieved more than 85%
detection accuracy for abnormal behaviors. Some problems need to be addressed in future research,
such as tracking pedestrians occluded by vehicles and detecting abnormal pedestrian behavior at
night.

Zaki et al. in [105] used an automated computer vision tracking approach to localize pedestrians
in small groups via the MMTrack Algorithm. The authors used the walking behavior to identify
possible commonality between nearby pedestrians. They used spatio-temporal criteria and the
introduced movement similarity measure to classify and count pedestrians in groups. To show the
feasibility and accuracy of the proposed method, the authors used video data collected at a
moderately dense pedestrian crosswalk in Vancouver, British Columbia. Evaluation results showed
that the proposed method achieves 77% accuracy. Studying the group behavior could help in avoiding
collisions.

Unsafe pedestrian crossing at signalized intersections is considered one of the most common
sources of pedestrian fatalities. Iryo-Asano et al. in [106] analyzed and modeled the
probabilistic crossing behavior of pedestrians after the onset of the pedestrian flashing green
(PFG) until the completion of crossing. The authors used stop–go decision and speed distribution
models to represent the sidewalk and the crosswalk behaviors. Sensitivity analysis of the proposed
model showed that the crosswalk length and the distance to the crosswalk from the pedestrian
position at


the onset of the PFG cause pedestrians to speed up while crossing or give up crossing. In
addition, it successfully represented the percentage of illegal crossings and the speeds of
pedestrians at different crosswalks.

Distracted drivers are responsible for more accidents than impaired drivers, but with the ubiquity
of smartphones, pedestrian distraction is also on the rise and leads to traffic accidents,
especially while crossing the street. Dong et al. in [107] proposed a new algorithm to detect
pedestrians who are unconsciously using their phones while crossing the street. Experimental
results show the efficiency of the proposed algorithm in pedestrian detection using a PWUM
dataset.

Nan and Hideki in [108] analyzed and examined the characteristics of heavy vehicle behavior. The
authors empirically analyzed data observed at a single-lane roundabout in Japan and examined the
headway characteristics of heavy vehicles using three headway parameters. The large size of heavy
vehicles forces them to behave differently, and their headways are commonly greater than those of
passenger cars. These results could be used to evaluate the performance of roundabouts, such as in
roundabout entry capacity estimation.

Kaparias et al. in [109] proposed a new behavioral analysis, which uses video sources to
qualitatively present the interactions between vehicles and pedestrians and to classify these
behaviors and reactions as a function of traffic parameters (e.g., speed, density, and frequency
of pedestrian crossings). The method used video data collected from a number of critical locations
in London to show the factors that affect the confidence and tolerance of road users. Results
showed that pedestrians have increased their confidence when they interact with vehicles, but
drivers have not changed their behavior. To generalize the proposed method, other sites have to be
studied, more road users (e.g., cyclists) should be considered, and additional characteristics of
road users (e.g., demographics and perceptions) should be covered. In addition, it would be
attractive to explore other aspects of vehicle-pedestrian interactions, such as the behavior of
disabled road users, the effect of weather conditions, and the impact on the surrounding areas.

Koji and Hiroki in [110] conducted observation surveys and developed linear regression models to
analyze and clarify the risky behavior of both pedestrians and vehicles making left turns at five
major intersections in Japan. In addition, the authors discussed the issues that prevent and
reduce pedestrian-vehicle conflicts at intersections based on the sensitivity analysis. A number
of large-size intersections and other conflict patterns should be examined in future work, which
might improve the accuracy of the proposed models.

Qualified human drivers are able to analyze the road traffic and choose a driving strategy that
avoids accidents. However, automated vehicles need to be trained to perform the same task smartly.
Chen et al. in [111] proposed a new method for evaluating the safety and feasibility of the
driving and passing strategies for automated vehicles at un-signalized crossings. Simulation tests
were conducted to demonstrate the effectiveness of the proposed method. As an example, one
Soft-Yield driver model was evaluated, and the simulation results showed that it was more
efficient than a natural human driver using data collected from the city of Ann Arbor, Michigan.
Future work might include developing a more detailed pedestrian model and considering more
behaviors and features.

Another attempt was made by Camara et al. in [112], in which the authors considered the
interactions between a pedestrian and an autonomous vehicle (AV) as a game-theoretic interaction
in which a pedestrian wishes to cross the road in front of the AV at an unmarked crossing. In this
situation, one agent must yield to the other. The authors used and analyzed data collected from
real-world pedestrians interacting with manually driven vehicles when crossing roads. The authors
studied the time order of events, and the results suggested that AVs should not act right away on
detecting a road crossing interaction, but rather wait for the first few informative features
before acting (speeding up or slowing down). This implies that the first event features deserve
more attention when making a decision than later features, since waiting much longer puts the
decision at risk. In addition, the authors suggested that studying Optimal Stopping for AV
controllers for pedestrian interactions would be a high-impact research area.

There are different factors that potentially impact the way pedestrians behave; some of them are
age, gender, group size, culture, education level, and economic status. We believe that studying
the impact of these factors and the relationships between them represents the next step. In
addition, how pedestrians interact and communicate with autonomous vehicles should be addressed
[108]. Moreover, the understanding of pedestrians’ intention is limited.

IX. SIGNAL CONTROL AND TRAFFIC LIGHT
The traffic light is the main element for controlling the movement of vehicles by specifying the
waiting and going times; fixing the time for traffic lights is an inefficient way to control
vehicle movements and leads to an imbalanced system because the number of vehicles on each side is
inconsistent. Abu Zaid et al. [113] proposed an algorithm to control the traffic light based on
the number of vehicles at each traffic light. The algorithm uses image data extracted from video
captured by a camera installed in the field and applies an artificial neural network and fuzzy
logic to adapt the time length for each light. The algorithm is validated by comparing its results
with manual results. The generated results will regulate traffic flow and reduce the waiting time
wasted on the roads.

Video monitoring and surveillance systems have been widely used in traffic management and traffic
light control systems. Several attempts have been made to develop smart traffic lights. Anurag
et al. [114] developed a method that uses images extracted from live video feeds from cameras
installed at traffic junctions to calculate a real-time traffic density. This method switches the
light according to the traffic density, with the aim of reducing that density.


In general, techniques that are based on videos and images require (1) good image quality, which
is weather dependent, especially in case of rain and fog, (2) a high rate of transmitted and
received data, and (3) sophisticated algorithms, based on fuzzy logic and genetic algorithms, to
model the various states of traffic. To address these issues, many works have used the
Programmable Intelligent Computer (PIC) microcontroller with transmitter and receiver IR sensors
to evaluate the traffic density and accomplish dynamic timing slots. PIC and IR sensors require
only a small amount of information to be transmitted and received and have a low installation
cost.

Bilal Ghazal et al. [115] proposed a smart traffic light system that controls and manages the
traffic light of a “+” junction of mono-directional roads. The system uses IR sensors posted on
either side of the roads to estimate the traffic density. Based on the density value, the green
light is either extended in case of a traffic jam to allow a large flow of vehicles, or reduced
when no cars are present to prevent unnecessary waiting time. In addition, the system allows
emergency vehicles to pass using a portable controller.

Several attempts have been introduced to handle emergency vehicles approaching the junction. Some
of them, such as [116], [117], use RF emitters to send warning signals to the RF transceivers
placed at every traffic light intersection and provide a special route accordingly. Other attempts
use the Global Positioning System (GPS) to provide preemption signals to both traffic lights and
hospitals [118], [119], [120].

To reduce the latency of emergency services for vehicles, the authors of [121] explored the effect
of traffic congestion on the emergency services and proposed a framework that dynamically adjusts
the traffic lights, changes related driving policies and drivers’ behavior, and applies essential
security controls based on the announced emergency level. However, the effectiveness of the
proposed framework has not been evaluated.

When drivers wait too long in the traffic light queue and the traffic light changes from green to
yellow, most of the time the drivers cross the road during the transition from yellow to red;
consequently, the possibility of accidents increases. This is known as the Red Light Running (RLR)
phenomenon, which often occurs as a consequence of the fact that the traffic light is not well
balanced. The authors in [122] proposed a technique that uses the information gathered through a
wireless sensor network to dynamically optimize the waiting time in the road queue and to reduce
the occurrence of the RLR phenomenon in an isolated intersection. This is done by assigning a
longer green light time to the road that has the longest queue.

Automatic driving and parking, automatic traffic sign recognition, and automatic collision
detection are some of the tasks provided by Advanced Driver Assistance Systems (ADAS). Current
developments aim to automate some of the drivers’ tasks using computer vision and other
technologies such as machine learning and robotic navigation. ADAS already have a huge impact on
the industry and society, increase the safety of the drivers, and help in maintaining the vehicles
as well. A survey that shows the current progress and future directions in the field of
vision-based embedded ADAS, and that bridges the gap between theory and practice, is presented
in [123]. In addition, the authors in [123] reviewed different hardware and software options used
in ADAS and discussed the design, development, and testing considerations. Moreover, some
outstanding challenges are also identified. The authors in [124] listed a set of vision-based ADAS
with a consistent terminology and taxonomy. They also proposed an abstract model to formalize a
top-down view of application development to scale towards autonomous driving systems.

Several attempts have been made toward providing an energy-efficient intersection service in order
to optimize the way vehicles cross an intersection. To reduce energy consumption and emissions,
such a service is developed to optimize the way vehicles cross an intersection and to avoid any
unnecessary acceleration or braking by the drivers. LED traffic lights reduced energy use in
Chicago by 85% [125].

To investigate the potential of V2X communication technology in reducing road traffic congestion
using a smart traffic light controller, Cullen Rhodes and Soufiene Djahel [126] developed an
efficient mechanism, called TRaffic Light Phases Aware Driving for REduced tRaffic Congestion
(TRADER), to reduce the overall travel time of vehicles in smart cities. TRADER has been
implemented and extensively evaluated using SUMO and TraCI. The evaluation results show that the
performance varied based on network topology and traffic density.

Soufiene et al. in [127] investigated the opportunities for improving the commuter’s journey
duration using the technology offered by V2I and proposed a Belief-Desire-Intention architecture
that uses local knowledge and information collected from the surrounding infrastructure (vehicles
and traffic light controllers (TLCs)) to model how the vehicles behave on the road. This
architecture exchanges beacons among vehicles to determine their optimal speed and position in the
road segment in order to cross the intersection with minimum delay while avoiding stoppages
whenever possible. The initial simulation results show a significant reduction of average travel
time. However, the proposed architecture does not handle selfishly acting vehicles or the reactive
and proactive solutions in an efficient way.

Another attempt was made by Jagadeesh et al. [128] to enable the traffic light to switch from red
to green based on traffic density. The authors combined existing technology with artificial
intelligence to develop and implement a low-cost, real-time, sensor-based dynamic traffic light
control system to reduce the Average Trip Waiting Time (ATWT). Their proposed system uses dynamic
control, IR sensors, low-power embedded controllers, comparators, and a storage device.

Using vehicle-to-vehicle and vehicle-to-road (infrastructure) communications, Khekare and Sakahra
[129] proposed a framework based on VANET that plays an important role


in smart cities by transmitting information about the traffic condition and aiding drivers in
making smart decisions to avoid congestion. Another attempt that uses VANET was made by Bani
Younes and Boukerche in [130] to estimate the vehicle stop time for each traffic light. However,
their proposed method is not reliable because there is no synchronization between the simulation
scenario and traffic flow; it also does not address the case when the vehicles accelerate or
decelerate at the moment the signal light changes [20].

Automatically moving the traffic to less crowded roads and adapting it to handle emergency
scenarios are the ultimate goals of most traffic management systems. Researchers have developed
several intelligent traffic systems that capture surveillance features via cameras present at the
junction using image analysis techniques [131] and control traffic lights with the help of
photoelectric sensors [132]. These systems use the relative weight of each road to open traffic
for roads that are more crowded and give them a longer time compared to less crowded ones.
Limitations of VANET are: 1) appropriate hardware should be installed in every vehicle, and
2) decisions should be made by users. Systems such as [114] are implemented at four-way junctions,
do not depend on individual vehicles, and limit the amount of hardware that must be installed in
every vehicle.

Inefficient traffic light systems trouble the transportation system, which badly affects the
economic, health, financial, and environmental domains of citizens and governments [115]. The
major problem of the existing traffic light systems is that the transition timing slots are fixed,
which is known as a Fixed Traffic Light Control (FTLC) system. FTLC systems are unable to handle
situations where traffic congestion is only observed from one direction [115]. Another situation
is when traffic flow increases at intersections during rush hours and decreases at night. In these
situations, the traffic light should be adapted to extend or reduce the green light activation
based on traffic status and density, which is known as a Dynamic Traffic Light Control (DTLC)
system. A DTLC system reduces the average waiting time of vehicles at junctions and is the most
suitable for the complexity of current traffic conditions [133].

Future effort should be made to address the influence of adjacent intersections on one junction to
achieve complete modeling, monitoring, and control for multiple synchronized junctions. Adaptive
traffic light systems that synchronize multiple traffic lights at different junctions to reduce
traffic congestion are required. To improve the safety, traffic efficiency, and fairness among
vehicles at intersections, Wang et al. [134] proposed a 3-level buffer-based virtual traffic
scheme for intelligent collaborative intersections. The authors divided the intersections into
three adaptive areas according to the traffic flow of each lane and used V2V, V2R, and V2I to
improve the safety and fairness without involving traffic lights. In addition, the authors
proposed a Collaborative Collision Avoidance Predictive (CCAP) control algorithm to assist
vehicles to cross the next intersection without stopping by predicting the time conflict and
generating an efficient traffic schedule for the entire road network. Their simulated results
improve the fairness by 331%, decrease the average delay by 88%, and improve the ability to solve
congestion by 12% compared with traditional algorithms.

Another attempt was made by Khamis and Gomaa in [135] to control and monitor the traffic lights
operating in adjacent intersections to reduce jams and potential accidents, especially in rush
hours.

Some ITS use a Local Positioning System (LPS) instead of the Global Positioning System (GPS) for
locating a vehicle with the help of localized workstations or sensors situated at optimal points.
The evolution of the Internet of Things, which requires installing as many sensors as possible to
capture the detail from all different angles, allows smart cities to become a reality. Benefits
of sensor-based data as input to traffic management systems are: 1) there is no special database
to store the huge volume of traffic data and to make proper queries and decisions, and 2) there is
no special automated system that uses data warehouse and mining techniques for summarizing and
clustering traffic data to make proper decisions for a particular time. However, more sensors mean
that huge and redundant data will be generated and that storage and maintenance costs will
increase. Interaction between system suppliers (traffic engineers) and system users (travelers) is
less considered. Visual analysis gives suppliers the chance to make better decisions based on
visual information.

In vehicle type classification, instead of labeling the samples as positive and negative,
multi-group classification (e.g., car, truck, bus, motorcycle, large/heavy vehicles) would further
benefit the machine’s understanding of the traffic scene. This leads to the development of more
general classification frameworks that improve the accuracy and the overall performance and that
will be further evaluated over a longer monitoring time and in more complicated scenarios, such as
low video quality (e.g., low resolution, motion blur), different lighting conditions, and more
operational challenges. In addition, evaluating the proposed framework under heavily congested
traffic conditions and on different curved road segments might make the generated results more
realistic. Directions for future research can be as follows:
• A more general classification approach based on data collected from more sites is required. The
classification approach can categorize based on vehicle size (e.g., grand saloons, small vans,
minivans, SUVs, buses, trucks, and others), driver behavior (normal, dangerous, aggressive, or
conservative drivers), vehicle maximum speed (luxury cars versus regular cars), driving mode
(automated vehicles or human-driven vehicles), or vehicle model and make. Other vehicle
classification methods can be examined to better approximate the real traffic flow [19].
• Uniform performance measurements are urgently needed for evaluating the overall performance and
accuracy of the prediction algorithms, especially when


the two selected measures are showing different performances.
• Longer time intervals can be used to evaluate different online algorithms so as to extend their applicability to meet the requirements of more ITS applications.
• Using natural language processing in predicting traffic flow is an interesting and challenging problem that needs to be studied further. This could be done by representing the main properties of traffic flow as input vectors and finding the closest vector using machine learning and artificial intelligence techniques [18], [136].
• Other data sets can be applied to test the online algorithms. These data sets cover more sites and different time slots.
• Modeling flow in different situations, such as stretches of highway and diverge junctions.

Real-time information about traffic conditions can be collected through active information gathering methods, using data from sensors installed on highways or average GPS and LPS speeds measured from mobile phones during the current time period. These methods might be used to guide both selective real-time sensing of different portions of a road network and the bulk collection of data to reduce uncertainty about the flows over segments and routes.

X. CHRONOLOGICAL REVIEW OF INTELLIGENT TRANSPORTATION AND TRAFFIC MANAGEMENT SYSTEMS
Here we provide a chronological review of recent survey papers in the areas of intelligent transportation and traffic management systems. In 2011, Baskar et al. [137] focused on reviewing the traffic management and control frameworks for Intelligent Vehicle Highway Systems (IVHS) to improve traffic performance. IVHS represent the most promising solutions to traffic congestion problems. They focus on applying intelligent techniques that allow vehicles to communicate with roadside infrastructure in order to shift the driving tasks from the driver to the vehicle and make better driving decisions. Some of these tasks include activities such as braking, steering, and making control decisions about speed and headways. The authors also described ITS and IVHS, the difference between them, and their relations with Automated Highway Systems (AHS) and Intelligent Vehicles (IV). The IV represents the main component of the IVHS and AHS, and aims to achieve more efficient vehicle operation by helping the driver or by taking complete control of the vehicle. In addition, the authors investigated the control design methods and their applications used for traffic control tasks, artificial intelligence techniques, IV, and traffic control frameworks and architectures for freeway systems. Various traffic management architectures such as PATH, Dolphin, CVIS, SafeSpot, PReVENT, and Auto21 CDS were discussed and quantitatively compared to show how the current traffic control methodologies could fit in an IVHS-based traffic control set-up.
In 2012, Partishtha et al. [138] presented various approaches for intelligent traffic systems and proposed a model for managing a real-time traffic system using CCTV cameras and WAN. The paper started by exploring the challenges in ITS, such as real-time signal control, traffic load prediction and computation, vehicle tracking, vehicle routing, and route optimization. Then it surveyed a variety of approaches that are used to explore the field of Intelligent Traffic Management, such as Geographical Information Systems, Artificial Intelligence, Graph Theory, and Real-time Systems. In addition, it explored the main technologies that can be used for ITS, such as Wireless Sensor Networks, CCTV, RFID, and GPS. The proposed model for real-time traffic management relies on installing CCTV cameras in the desired locations to capture images of real-time traffic, then analyzing and processing these images, first to obtain vehicles’ plate numbers to track suspected vehicles and second to compute the traffic load, which can be used to control the green light signal timing. In addition, it suggested the use of other technologies, such as WAN and mobile services, that can help users find emergency services and get traffic information dynamically in real time.
In 2013, Kashif and Abdul Hanan [139] presented the variety of Intelligent Transport System (ITS) areas, applications, and technologies. ITS integrates virtual technologies with transportation in order to reduce risks, accident rates, traffic congestion, carbon emissions, and air pollution, and to increase safety, reliability, and travel speeds. The authors first reviewed the generation and the areas of ITS that provide solutions for cooperation and a reliable platform, such as: Arterial and Freeway Management Systems, Freight Management Systems, Transit Management Systems, Incident Management Systems, Emergency Management Systems, and Regional Multimodal and Traveler Information Systems/Information Management (IM). In addition, they reviewed the applications of ITS that are used for transportation safety, efficiency, and user services, such as Electronic Toll Collection (ETC), Highway Data Collection (HDC), Traffic Management Systems (TMS), Vehicle Data Collection (VDC), Transit Signal Priority (TSP), and Emergency Vehicle Preemption (EVP). Finally, the authors reviewed the different ITS technologies that are used to improve transportation conditions, safety, and services, such as: wireless communications, computational technologies, floating car data/floating cellular data, sensing technologies, inductive loop detection, video vehicle detection, and Bluetooth detection. The conclusion of the survey paper showed that ITS covers many technologies that can help in improving the safety, efficiency, mobility, accessibility, and intermodal connections of transport systems.
In 2015, Chepuru and Rao [140] reviewed the current research challenges and opportunities in the development of secure and safe IoT-based ITS applications. The authors also reviewed the current ITS architectures, requirements,


and standards. They also classified and analyzed existing ITS threats and attacks. In addition, they offer a broad view on how recent and ongoing advances in sensors, devices, internet applications, and other technologies have motivated affordable healthcare gadgets and connected health care services to limitlessly expand the potential IoT-based ITS for further development.
In 2016, Merrad et al. [141] provided a survey on how researchers used WSN, simple sensors, and/or analytical approaches to regulate the traffic around an intersection. Experiments show that a network of wireless magnetic sensors provides more flexibility, consumes less energy, reduces installation and maintenance costs, is easy to deploy, and is smaller than video, radar detection, and inductive loop systems [142]. The authors also presented the basic notations and the most important parameters that affect traffic control. These parameters include signal cycle, stage, split, and offset. The signal cycle is the repetition of the signal combination and has different stages; during one stage, a set of streams can move safely. The split parameter represents the green duration of each stage. The offset is the phase difference between the signal cycles of successive intersections, which is used to optimize the green wave along an arterial [141], [143], [144]. Some other parameters are travel time, travel delay, turning probabilities, queue length, and delay. These parameters are of three types [141]: dynamic data, model parameters, and statistic data. Dynamic data are data that change over a short period of time (i.e., second by second). Model parameters refer to parameters that are either constant or change slowly over time. Statistic data represent data that are constant, such as road_id and the number of lanes in each road. In addition, the authors in [141] compared RHODES [145], a real-time traffic adaptive signal control system, with the PREDICT [146] model, which is used to predict the arrival time. The comparison results showed that RHODES achieves slightly better throughput and a significant decrease in delay.
In 2016, Tendulkar et al. [147] proposed a traffic management system using IoT to manage the traffic signals by monitoring the traffic density¹ in order to avoid traffic congestion on roads using network communication. The architecture of the proposed system consists of wireless network sensors, RFID (Radio Frequency Identification), and GSM-GPS (Global Positioning System). The wireless network sensors (i.e., sensor nodes) use sensors to communicate with each other, send information through the network, and record environmental physical conditions such as temperature, pressure, and pollution. The RFID technology is used to identify, trace, and count objects based on three parameters: vehicle speed, average waiting time, and queue size. The GSM-GPS is used to track the current location of the vehicle, the distance between source and destination, information about stolen vehicles, and any position such as land, sea, battlefield, and underground. The authors defined the problem with the proposed solution, and finally an initial test-bed prototype was developed.

¹ Traffic density is defined as a number of vehicles in a specified length of a road in a given time period. Traffic density is the most commonly used parameter to indicate the level of congestion on a roadway.

In 2016, Anaswara and Lakshmi [148] reviewed different traffic management schemes that use IoT in smart cities. Some of these schemes are: (1) reservation-based systems at intersections, which perform better than normal traffic lights by reserving a piece of road to cross the junction and are used for both fully automated and human-driven cars to create a reservation algorithm [149], [150]; (2) automated intersection management systems that build a detailed communication protocol; through simulation evaluation, the system proposed in [151] outperforms current intersection technology that uses traffic lights and stop signs; (3) different policies, such as the FCFS, FCFS-light, and FCFS-Emerg policies, that let different kinds of vehicles, such as automated and emergency vehicles, communicate with traffic light infrastructure and give some kinds of vehicles priority over other vehicles without increasing their delay. Switching between these policies is done by learning from the reservation history which policy is best for a particular traffic state; and (4) intelligent intersections and autonomous passing-through intersections, especially with the emergence of driverless vehicles, in which all the traffic components (such as lane, path, critical section, and vehicles) are modeled and evaluated. Other reviewed schemes are: online coordination of a continuous flow of connected and automated vehicles [152], traffic light control for multiple intersections [153], and a smart parking system for an urban environment [154].
In 2017, Anand et al. [5] presented an extensive review of various data mining and clustering methods that model, predict, and plan transportation systems to facilitate Intelligent Transportation Systems (ITS). Around 50 research articles from the last decade were collected from the leading journals and reviewed in three stages. First, the contributions that pertain to ITS were reviewed chronologically. Second, the data mining approaches, involving predictive and descriptive concepts for managing transportation, were reviewed based on the adopted methodologies. In the third stage, the clustering models were categorized as supervised or unsupervised, and reviewed based on their usefulness for transportation data analysis. Finally, the paper discussed the findings from the review and summarized the research gaps as follows: descriptive mining methods have aided in generating a real-time information system, identifying traffic patterns, developing a travel speed calculation model, and investigating parking decisions. Predictive data mining techniques have helped in inferring the network topology, finding traffic bottlenecks, solving the multi-objective location inventory problem, constructing two data reduction algorithms, and predicting short-term traffic flow in heterogeneous conditions. Among the descriptive data mining methods, clustering models have extensive use in forming the single transportation system, calibrating the speed-density parameters, estimating the travel speed, and clustering the taxi-cab trips and route planning system. Especially, the unsupervised meth-


ods have many applications in ITS. Traditional unsupervised models have drawbacks; however, these are rectified using the evolutionary approach. The presented evolutionary approach also has drawbacks. Thus, increasing the potential of the evolutionary model is rather challenging. Finally, they recommended an effective evolutionary approach without drawbacks as a direction for future research that will certainly help in developing enhanced ITS.
An urban traffic system is a large system that consists of many objects, such as traffic lights, buildings, pedestrians, cyclists, cars, buses, and other public transport vehicles. To estimate the traffic flow, different methods are used to count the number of different objects during a specific time. The detection of anomalies (outliers), which represent observations of inconsistent sets of data in the traffic flow, is an important method that can be used to analyze urban traffic data using traffic prediction or application scenarios such as spatial data [62], [158], [159].
For traffic prediction, in 2019 Yao et al. proposed a novel approach, called the Spatial-Temporal Dynamic Network (STDN), that tackles spatial dependencies and temporal dynamics in a unified framework. The STDN mechanism is introduced to learn the dynamic similarity between locations, and a periodically shifted attention mechanism is designed to handle long-term periodic temporal shifting. Their experimental results on two datasets of real-world traffic verify the effectiveness of the STDN method [160].
Urban anomalies represent disturbances in urban city environments. In 2017, Wu et al. focused on studying the future anomaly prediction problem in urban environments rather than studying the anomalies in existing urban data. They developed the Urban Anomaly PreDiction (UAPD) framework, which addresses a number of challenges, including the dynamic, spatial varieties of different categories of anomalies. Using up-to-date urban anomaly data, the UAPD first detects the change point of each type of anomaly in the temporal dimension and then uses a tensor decomposition model to decouple the interrelations between the spatial and categorical dimensions. Finally, the UAPD applies an autoregression method to predict which categories of anomalies will happen in each region in the future. The experimental results on two urban environments demonstrate that UAPD outperforms alternative baselines across various settings, including different region and time-frame scales, as well as diverse categories of anomalies [161].
In 2018, Djenouri et al. studied the impact of different conditions on traffic flow that lead to unusual patterns. They focused on studying the historical data of these conditions, such as festivals and events related to unusual patterns in the traffic flow, in order to improve the organization of both the layout of traffic and events. To handle the flow distributions, an established outlier detection method, the local outlier factor (LOF), is applied to flow distributions instead of individual observations. They applied the LOF to extend the database with new flow distributions, and the results of their method on real urban traffic data as a case study find meaningful outliers in the traffic flow data [162]. Similarly, Wang et al. in 2018 used the Local Outlier Factor (LOF) algorithm as a basis to combine the extracted features of different types of traffic data and developed a grid-based LOF algorithm to detect abnormal areas in Beijing. Their extensive experiments on taxi and bus trips show the effectiveness of the proposed approach [163].
Sun, in his doctoral dissertation, developed an application called transit-hub. He used data mining and machine learning techniques for context-sensitive prediction of long-term, short-term, and real-time delays in sparse public transit networks. He integrated neural network models, a heuristic search algorithm, deep learning techniques, and sensitivity analyses of the hyper-parameters of the algorithms to analyze the performance of public transit networks, optimize the on-time performance under uncertainty of traffic and weather conditions, detect the operations of transit networks over a large metropolitan area, and identify non-recurring traffic congestion and explain its causes. It efficiently converts traffic data in Traffic Message Channel (TMC) format to images and includes a data augmentation mechanism using crossover operators for class balancing. In addition, the application provides a set of experiments to understand how advanced decision support tools improve the utilization of the transportation infrastructure [164].
Bhowmick and Narvekar studied trajectory outliers in urban traffic data and classified them into distance-based, density-based, and motifs-based outliers according to the method used in the processing steps [165]. Djenouri and Zimek presented a tutorial on outlier detection in urban traffic data and classified the methods into statistical techniques that employ statistical models to identify anomalies in traffic data, similarity-based techniques that use distance measures and neighborhoods to derive local density estimates, and techniques based on pattern analysis that explore the correlation between traffic flow values by using concepts from pattern analysis [162]. Djenouri et al. in 2019 reviewed the use of outlier detection approaches in urban traffic analysis and divided them into two main categories: flow outlier detection, which detects flow outliers and includes statistical, similarity, and pattern mining approaches, and trajectory outlier detection, which includes offline processing for trajectory outliers and online processing for sub-trajectory outliers [166].

XI. INTELLIGENT TRAFFIC PUBLIC DATA SETS AND TOOLS
Table 5 shows a list of open source tools that are used by researchers in the area of traffic and transportation management. Here we also provide a set of public datasets used by researchers in the area of traffic management systems.

A. MULTI-MODAL INTELLIGENT TRAFFIC SIGNAL SYSTEMS GPS - DEPARTMENT OF TRANSPORTATION (USDOT)
Data were collected during the Multi-Modal Intelligent Transportation Signal Systems (MMITSS) study. MMITSS is a next-generation traffic signal system that seeks to provide


a comprehensive traffic information framework to service all modes of transportation. The GPS data set catalogs the vehicle operation data of the test vehicles that were used for the MMITSS field testing. The data contain the performance and operation details of vehicles. This file contains a number of fields detailing elements such as vehicle position and speed, fidelity measures of GPS-based data elements, and vehicle operation data.

TABLE 5. Open source tools that are used by researchers in the area of traffic and transportation management.

B. INTELLIGENT TRANSPORTATION SYSTEMS RESEARCH DATA EXCHANGE - PORTLAND
The Portland data environment provides the following data: (a) Freeway data consisting of two months of data from dual-loop detectors deployed in the main line and on-ramps of a Portland-area freeway (I-205), (b) Incident data from the Oregon Department of Transportation Advanced Traffic Management System database and planned event data from the ODOT Trip-Check Traveler Information Portal information web site, (c) Weather data from two sources: NOAA data and Remote Weather Information System (RWIS) station data, (d) Three types of arterial data: (1) Volume and occupancy data from four single loop detectors on 82nd Ave., (2) Signal phase and timing data for 32 signals along the 82nd Avenue corridor, (3) Travel times on 82nd Ave., computed from data collected by two Bluetooth readers, and (e) Transit data provided by TriMet, the Portland-metro area transit agency, including schedule, stop event, and passenger count data for both bus and light rail.

C. LIVE TRAFFIC DATA SENSOR FEEDS (REACTOR) - STATE OF IOWA: DEPARTMENT OF TRANSPORTATION
Iowa Department of Transportation’s Intelligent Transportation System (ITS) Detector Sensors. Sensor Feed: Includes location of sensors, current travel speed, traffic counts, occupancy counts, and more. Work Zone Alert Feed: Includes work zones that have dropped below the normal speed and are determined to have a critical traffic speed abnormality.

D. ACTIVE TRANSPORTATION DEMAND MANAGEMENT (ATDM) TRAJECTORY LEVEL VALIDATION
The ATDM Trajectory Validation project developed a validation framework and a trajectory computational engine to compare and validate simulated and observed vehicle trajectories and dynamics. The field data were used to demonstrate how on-site instrumented vehicle data can be used to validate simulated vehicle dynamics using the validation framework. The vehicle trajectory data were collected in a separate task of the Active Transportation Demand Management (ATDM) Trajectory Level Validation project. The primary project objective was to develop a methodology to validate simulated vehicle dynamics at the trajectory level. Microscopic and macroscopic performance measures were calculated from the trajectory data and used in a number of validation tests related to safety, vehicle limits, driver comfort levels, and traffic flow.

E. MULTI-MODAL INTELLIGENT TRAFFIC SIGNAL SYSTEMS VEHICLE TRAJECTORIES FOR ROADSIDE EQUIPMENT
Data were collected during the Multi-Modal Intelligent Transportation Signal Systems (MMITSS) study. MMITSS is a next-generation traffic signal system that seeks to provide a comprehensive traffic information framework to service all modes of transportation. The Vehicle Trajectories file is populated with basic safety messages received from equipped vehicles within the communication range of a Roadside Equipment (RSE) unit.
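As an illustration of how the trajectory-style data sets described in this section might be used, the following minimal Python sketch computes a few simple performance measures (mean speed, strongest deceleration, and a count of hard-braking samples) from timestamped speed samples. The record schema `TrajectoryPoint`, the function names, the 10 Hz sample spacing in the example, and the hard-braking threshold of -3 m/s² are assumptions made for illustration only; they are not taken from the MMITSS or ATDM data dictionaries.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class TrajectoryPoint:
    """One sample of a vehicle trajectory (hypothetical schema)."""
    t: float      # timestamp in seconds
    speed: float  # speed in m/s


def accelerations(points: List[TrajectoryPoint]) -> List[float]:
    """Finite-difference acceleration (m/s^2) between consecutive samples."""
    return [
        (b.speed - a.speed) / (b.t - a.t)
        for a, b in zip(points, points[1:])
        if b.t > a.t  # skip duplicated or out-of-order timestamps
    ]


def summarize(points: List[TrajectoryPoint],
              hard_brake: float = -3.0) -> Dict[str, float]:
    """Mean speed, strongest deceleration, and number of hard-braking samples."""
    if not points:
        raise ValueError("trajectory is empty")
    acc = accelerations(points)
    return {
        "mean_speed": sum(p.speed for p in points) / len(points),
        "min_acceleration": min(acc) if acc else 0.0,
        # hard_brake is an assumed threshold, not a value from the data sets
        "hard_brake_events": sum(1 for a in acc if a <= hard_brake),
    }


# Four samples spaced 0.1 s apart: steady speed, then one sharp drop.
trace = [
    TrajectoryPoint(0.0, 10.0),
    TrajectoryPoint(0.1, 10.0),
    TrajectoryPoint(0.2, 9.6),
    TrajectoryPoint(0.3, 9.6),
]
summary = summarize(trace)
```

Measures of this kind could feed validation tests on safety or driver comfort, but the actual field names and thresholds would have to come from the documentation of the specific data set.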


The data also contain elements that communicate additional details about the vehicle that are used for vehicle safety applications, and elements that communicate specific items of a vehicle’s status that are used in data event snapshots, which are gathered and periodically reported to an RSE. These data are transmitted at a rate of 10 Hz.

F. INTELLIGENT NETWORK FLOW OPTIMIZATION PROTOTYPE BASIC SAFETY MESSAGES
Data are from the small-scale demonstration of the Intelligent Network Flow Optimization (INFLO) Prototype System and applications in Seattle, Washington. Connected vehicle systems were deployed in 21 vehicles in a scripted driving scenario circuiting the I-5 corridor northbound and southbound during morning rush hour. Basic Safety Messages (BSM) were sent by connected vehicles (CVs) through either the cellular network or Dedicated Short Range Communication (DSRC) when the vehicle was in the range of Roadside Units (RSU). These messages were received by the traffic management center (TMC).

G. VEHICLE AWARENESS DEVICE DATA FROM LEESBURG, VIRGINIA
The files in this data environment were produced using the Vehicle Awareness Device (VAD) installed on one test vehicle over a two-month period. The VAD installed in the test car is identical to the VADs installed in over 2800 vehicles participating in the Safety Pilot Model Demonstration conducted from August 2012 through August 2013 by the National Highway Traffic Safety Administration (NHTSA) in Ann Arbor, Michigan.
This legacy dataset was created before data.transportation.gov and is only currently available via the attached file(s). Please contact the dataset owner if there is a need for users to work with this data using the data.transportation.gov analysis features (online viewing, API, graphing, etc.) and the USDOT will consider modifying the dataset to fully integrate in data.transportation.gov.

H. INTELLIGENT NETWORK FLOW OPTIMIZATION PROTOTYPE TRAFFIC MANAGEMENT ENTITY-BASED QUEUE WARNING
Data are from the small-scale demonstration of the Intelligent Network Flow Optimization (INFLO) Prototype System and applications in Seattle, Washington. Connected vehicle systems were deployed in 21 vehicles in a scripted driving scenario circuiting the I-5 corridor northbound and southbound during morning rush hour. This data set contains queue warning messages that were recommended by the INFLO Q-WARN algorithm and sent by the traffic management center to vehicles to warn drivers upstream of the queue. The objective of queue warning is to provide a vehicle operator sufficient warning of impending queue backup in order to brake safely, change lanes, or modify route such that secondary collisions can be minimized or even eliminated.

I. NHTSA PRODUCT INFORMATION CATALOG AND VEHICLE LISTING (VPIC) - MODIFIER DB
The NHTSA Product Information Catalog and Vehicle Listing (vPIC) is a consolidated platform that presents data collected within the manufacturer reported data from CFR 49 Parts 551 - 574 for use in a variety of modern tools. NHTSA’s vPIC platform is intended to serve as a centralized source for basic Vehicle Identification Number (VIN) decoding, Manufacturer Information Database (MID), Manufacturer Equipment Plant Identification, and associated data. vPIC is intended to support the Open Data and Transparency initiatives of the agency by allowing the data to be freely used by the public without the burden of manual retrieval from a library of electronic documents (PDFs). While these documents will still be available online for viewing within the Manufacturer Information Database (MID) module of vPIC, one can view and use the actual data through the VIN Decoder and Application Programming Interface (API) modules.

J. VEHICLE TRAVEL INFORMATION SYSTEM (VTRIS) - DATA DOWNLOAD TOOL
The VTRIS W-Tables are designed to provide a standard format for presenting the outcome of the Vehicle Weighing and Classification efforts at truck weigh sites. The data that appear in the W-Tables come from the Summary files that are generated by the Summary subsystem.

XII. PROPOSED MODEL
Based on the above-mentioned approaches and techniques, we found that there is limited research that:
A. Investigates the usage of cloud-based frameworks to enhance the performance of traffic management systems.
B. Develops a real-time application that captures the relationship between different traffic components, such as traffic lights, road signs, and road intersections.
Therefore, as a future direction, we encourage researchers to propose a general framework that addresses the above limitations to enhance the performance of traffic management systems. Figure 4 shows the conceptual proposed model of the future work, motivated by the limitations of some existing work. In the model, we divided the city into regions, and each region has some road intersections; each intersection has a traffic light. In the proposed model, there is a relationship between these regions and the road intersections. In addition, the model specifies the relationships between road intersections, in which traffic lights that are close to each other contribute to each other. For example, in region 1 there are two intersections, A and B, while in region 2 there is one intersection, C. It is clear that the relationship between intersections A and B is stronger than the relationship between A and C, meaning that vehicles at intersection B may contribute more to, and have more effect on, the traffic status at intersection A than vehicles at intersection C. Assume that there are x vehicles at intersection A, y vehicles at intersection B, and z vehicles at intersection C. After a specified amount of time,


approaches that focused on the traffic light signals. Our future


work will focus on proposing a new traffic management
approach.

REFERENCES
[1] (Mar. 2015). TomTom. Accessed: Oct. 11, 2018. [Online]. Available:
https://corporate.tomtom.com/news-releases
[2] D. Schrank, B. Eisele, and T. Lomax, ‘‘TTI’s 2012 urban mobility report
powered by INRIX traffic data,’’ Texas A&M Transp. Inst. and Texas
A&M Univ. Syst., Texas, TX, USA, Tech. Rep. 1, 2012.
[3] J. Raj, H. Bahuleyan, and L. D. Vanajakshi, ‘‘Application of data mining
techniques for traffic density estimation and prediction,’’ Transp. Res.
Procedia, vol. 17, pp. 321–330, Dec. 2016.
[4] S. Sundaram, S. S. Kumar, and M. D. Shree, ‘‘Hierarchical clustering
technique for traffic signal decision support,’’ Int. J. Innov. Sci., Eng.
Technol., vol. 2, no. 6, pp. 72–82, Jun. 2015.
[5] S. Anand, P. Padmanabham, A. Govardhan, and R. H. Kulkarni, ‘‘An
extensive review on data mining methods and clustering models for intel-
ligent transportation system,’’ J. Intell. Syst., vol. 27, no. 2, pp. 263–273,
FIGURE 4. The proposed traffic management conceptual model. 2018.
[6] J. Zhang, F.-Y. Wang, K. Wang, W.-H. Lin, X. Xu, and C. Chen, ‘‘Data-
driven intelligent transportation systems: A survey,’’ IEEE Trans. Intell.
Transp. Syst., vol. 12, no. 4, pp. 1624–1639, Dec. 2011.
some vehicles a where a <= y that were originally located at [7] J. Lopes, J. Bento, E. Huang, C. Antoniou, and M. Ben-Akiva, ‘‘Traffic
intersection B moved to intersection A and after a specified and mobility data collection for real-time applications,’’ in Proc. 13th
Int. IEEE Annu. Conf. Intell. Transp. Syst., Madeira, Portugal, Sep. 2010,
amount of time, some vehicles b where b <= z that were pp. 216–223.
originally located at intersection C moved to intersection A. [8] K. Miller, M. Miller, M. Moran, and B. Dai, ‘‘Data management life
In other words, a/y and b/z will be the contribution from cycle,’’ Texas A&M Transp. Inst., College Station, TX, USA, Tech.
Rep. 1, Mar. 2018.
intersection B and C to intersection A, respectively. [9] Z. Diao et al., ‘‘A hybrid model for short-term traffic volume prediction
Imagine that traffic congestion occurs at intersection A and in massive transportation systems,’’ IEEE Trans. Intell. Transp. Syst.,
considering the above relationships, building these relation- vol. 20, no. 3, pp. 935–946, Mar. 2019.
[10] K. Kumara, M. Paridab, and V. Katiyar, ‘‘Short term traffic flow predic-
ships help the traffic lights to communicate to each other tion for a non urban highway using artificial neural network,’’ in Proc.
in order to increase/decrease the green interval so the traffic 2nd Conf. Transp. Res. Group India, Agra, India, 2013, pp. 755–764.
congestion might be minimal. In addition to the relationship [11] R. Ke, Z. Li, J. Tang, Z. Pan, and Y. Wang, ‘‘Real-time traffic flow
parameter estimation from uav video based on ensemble classifier and
between traffic intersections (lights); the proposed model optical flow,’’ IEEE Trans. Intell. Transp. Syst., vol. 20, no. 1, pp. 54–64,
includes the following sub-systems: Jan. 2019.
1. Traffic lights control sub-system. This sub-system is located on a cloud-based server and controls the traffic lights in a specific region by determining the green and red intervals for these lights based on the current traffic status.
2. Traffic communication sub-system, which receives online traffic data, updates the traffic status, and sends the updated traffic data to the main cloud-based server.
3. Intelligent sub-system, which uses artificial intelligence and data mining techniques to process online traffic data, extract useful decisions, and trigger the traffic lights control sub-system with the best action.
4. Storage sub-system, which stores historical traffic data for future use.
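The interaction between the four sub-systems listed above can be sketched as follows. This is only an illustration under stated assumptions: the class `TrafficModel`, its method names, the fixed 60-second cycle, the 10-second minimum green time, and the proportional decision rule are all hypothetical placeholders. The proposed model does not specify which artificial intelligence or data mining technique the intelligent sub-system uses.

```python
from dataclasses import dataclass, field


@dataclass
class TrafficModel:
    cycle_s: int = 60        # assumed fixed cycle length in seconds
    min_green_s: int = 10    # assumed lower bound on green time
    status: dict = field(default_factory=dict)   # latest queue length per light
    history: list = field(default_factory=list)  # storage sub-system
    plan: dict = field(default_factory=dict)     # green/red interval per light

    def receive(self, light_id: str, queue_len: int) -> None:
        """Communication sub-system: update the status and archive the reading."""
        self.status[light_id] = queue_len
        self.history.append((light_id, queue_len))

    def decide(self) -> dict:
        """Intelligent sub-system (placeholder rule): share the spare green
        time among the lights in proportion to their queue lengths."""
        total = sum(self.status.values())
        n = len(self.status)
        spare = self.cycle_s - n * self.min_green_s
        greens = {}
        for light_id, queue_len in self.status.items():
            share = queue_len / total if total else 1 / n
            greens[light_id] = self.min_green_s + spare * share
        return greens

    def apply(self, greens: dict) -> None:
        """Control sub-system: set the green and red intervals of each light."""
        for light_id, green in greens.items():
            self.plan[light_id] = {"green": green, "red": self.cycle_s - green}


model = TrafficModel()
model.receive("A_north", 30)   # hypothetical light identifiers and queue lengths
model.receive("A_east", 10)
model.apply(model.decide())
print(model.plan)
```

In a real deployment the placeholder rule in `decide` would be replaced by a trained model, and `history` would be backed by persistent storage on the cloud-based server rather than an in-memory list.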
XIII. CONCLUSION AND FUTURE DIRECTIONS
This research aimed at improving the understanding of the state of the art of traffic management technologies, especially those using data mining and machine learning. Our review has categorized the existing studies into approaches that depend on real-time measurement of traffic parameters, approaches that detect moving objects, approaches that identify routing, approaches that identify the patterns and behaviors of drivers and pedestrians, and finally


[22] J.-S. Yang, ‘‘A study of travel time modeling via time series analysis,’’ [44] P. J. Navarro, C. Fernândez, R. Borraz, and D. Alonso, ‘‘A machine
in Proc. IEEE Conf. Control Appl., Toronto, ON, Canada, Aug. 2005, learning approach to pedestrian detection for autonomous vehicles using
pp. 855–860. high-definition 3D range data,’’ Sensors, vol. 17, no. 1, p. 18, 2017.
[23] T.-Y. Hu and W.-M. Ho, ‘‘Travel time prediction for urban networks: The [45] J. Kim et al., ‘‘Pedestrian detection in front of the ego vehicle using
comparisons of simulation-based and time-series models,’’ in Proc. 17th (stereo) camera in the urban scene: Deep versus Shallow learning
ITS World Congr.-Autom. Vehicles Symp., Busan, South Korea, Oct. 2010, approaches,’’ M.S. thesis, Dept. Inf. Technol., Chemnitz Univ. Technol.,
pp. 1–11. Chemnitz, Germany, 2016.
[24] A. Ladino, A. Y. Kibangou, C. C. de Wit, and H. Fourati, ‘‘A real time [46] M. Errami and M. Rziza, ‘‘Improving pedestrian detection using support
forecasting tool for dynamic travel time from clustered time series,’’ vector regression,’’ in Proc. 13th Int. Conf. Comput. Graph., Imag. Vis.,
Transp. Res. C, Emerg. Technol., vol. 80, pp. 216–238, Jul. 2017. Beni Mellal, Morocco, Mar./Apr. 2016, pp. 156–160.
[25] A. Gal, A. Mandelbaum, F. Schnitzler, A. Senderovich, and M. Weidlich, [47] M. T.-T. Nguyen, V. D. Nguyen, and J. W. Jeon, ‘‘Real-time pedestrian
‘‘Traveling time prediction in scheduled transportation with journey seg- detection using a support vector machine and stixel information,’’ in
ments,’’ Inf. Syst., vol. 64, pp. 266–280, Mar. 2017. Proc. 17th Int. Conf. Control, Automat. Syst. (ICCAS), Jeju, South Korea,
[26] X. Ma, Z. Tao, Y. Wang, H. Yu, and Y. Wang, ‘‘Long short-term mem- Oct. 2017, pp. 1350–1355.
ory neural network for traffic speed prediction using remote microwave [48] Y. Xu, L. Xu, D. Li, and Y. Wu, ‘‘Pedestrian detection using back-
sensor data,’’ Transp. Res. C, Emerg. Technol., vol. 54, pp. 187–197, ground subtraction assisted support vector machine,’’ in Proc. 11th Int.
May 2015. Conf. Intell. Syst. Design Appl. (ISDA), Cordoba, Spain, Nov. 2011,
[27] B. D. Martin, V. Addona, J. Wolfson, G. Adomavicius, and Y. Fan, ‘‘Meth- pp. 837–842.
ods for real-time prediction of the mode of travel using smartphone-based [49] M. Jeong, B. C. Ko, and J.-Y. Nam, ‘‘Early detection of sudden pedestrian
GPS and accelerometer data,’’ Sensors, vol. 17, no. 9, p. 2058, 2017. crossing for safe driving during summer nights,’’ IEEE Trans. Circuits
[28] B. Yu, H. Wang, W. Shan, and B. Yao, ‘‘Prediction of bus travel time Syst. Video Technol., vol. 27, no. 6, pp. 1368–1380, Jun. 2017.
using random forests based on near neighbors,’’ Comput.-Aided Civil [50] T. Xiang, T. Li, M. Ye, and Z. Liu, ‘‘Random forest with adaptive local
Infrastruct. Eng., vol. 33, no. 4, pp. 333–350, Nov. 2017. template for pedestrian detection,’’ Math. Problems Eng., vol. 2015,
[29] B. A. Kumar, R. Jairam, S. S. Arkatkar, and L. Vanajakshi, ‘‘Real time Oct. 2015, Art. no. 767423.
bus travel time prediction using k-NN classifier,’’ Transp. Lett., to be [51] J. Marín, D. Vázquez, A. M. López, J. Amores, and B. Leibe, ‘‘Random
published. forests of local experts for pedestrian detection,’’ in Proc. IEEE Int. Conf.
Comput. Vis., Sydney, NSW, Australia, Dec. 2013, pp. 2592–2599.
[30] U. Mori, A. Mendiburu, M. Álvarez, and J. A. Lozano, ‘‘A review of
travel time estimation and forecasting for advanced traveller information [52] E. Gabriel, H. Schramm, and C. Meyer, ‘‘Analysis of the discriminative
systems,’’ Transportmetrica A, Transp. Sci., vol. 11, no. 2, pp. 119–157, generalized hough transform for pedestrian detection,’’ in Proc. 19th Int.
2015. Conf. Image Anal. Process., Catania, Italy, 2017, pp. 104–115.
[53] J. Brownlee, Boosting and AdaBoost for Machine Learning. Vermont,
[31] S. M. Kothuri, K. A. Tufte, H. Hagedorn, R. L. Bertini, and D. Deeter,
VIC, Australia: Machine Learning Mastery, 2016.
‘‘Survey of best practices in real time travel time estimation and predic-
tion,’’ in Proc. Compendium Tech. Papers, Inst. Transp. Eng., District 6th [54] S.-S. Huang, S.-C. Chien, F.-C. Chang, C.-H. Hsiao, and Y.-S. Hsiao,
Annu. Meeting, 2007, pp. 15–18. ‘‘All-weather thermal-image pedestrian detection method,’’ U.S. Patent
2018 0 165 552, Jun. 14, 2018.
[32] B. A. Kumar, L. Vanajakshi, and S. C. Subramanian, ‘‘Bus travel time
prediction using a time-space discretization approach,’’ Transp. Res. C, [55] W. G. Aguilar, M. A. Luna, J. F. Moya, V. Abad, H. Parra, and H. Ruiz,
Emerg. Technol., vol. 79, pp. 308–332, Jun. 2017. ‘‘Pedestrian detection for UAVs using cascade classifiers with Mean-
shift,’’ in Proc. IEEE 11th Int. Conf. Semantic Comput., San Diego, CA,
[33] D. Woodard, G. Nogin, P. Koch, D. Racz, M. Goldszmidt, and E. Horvitz,
USA, Jan./Feb. 2017, pp. 509–514.
‘‘Predicting travel time reliability using mobile phone GPS data,’’ Transp.
[56] W. G. Aguilar et al., ‘‘Cascade classifiers and saliency maps based people
Res. C, Emerg. Technol., vol. 75, pp. 30–44, Feb. 2017.
detection,’’ in Proc. Int. Conf. Augmented Reality, Virtual Reality Comput.
[34] C. Siripanpornchana, S. Panichpapiboon, and P. Chaovalit, ‘‘Travel-time
Graph., Ugento, Italy, 2017, pp. 501–510.
prediction with deep learning,’’ in Proc. IEEE Region 10 Conf. (TEN-
[57] X. Du, M. El-Khamy, J. Lee, and L. Davis, ‘‘Fused DNN: A deep neural
CON), Singapore, Nov. 2016, pp. 1859–1862.
network fusion approach to fast and robust pedestrian detection,’’ in Proc.
[35] J. Chung and K. Sohn, ‘‘Image-based learning to measure traffic density IEEE Winter Conf. Appl. Comput. Vis. (WACV), Santa Rosa, CA, USA,
using a deep convolutional neural network,’’ IEEE Trans. Intell. Transp. Mar. 2017, pp. 953–961.
Syst., vol. 19, no. 5, pp. 1670–1675, May 2018.
[58] V. V. Molchanov, B. V. Vishnyakov, Y. V. Vizilter, O. V. Vishnyakova,
[36] S. Zhang, G. Wu, J. P. Costeira, and J. M. F. Moura, ‘‘Understanding and V. A. Knyaz, ‘‘Pedestrian detection in video surveillance using fully
traffic density from large-scale Web camera data,’’ in Proc. IEEE Conf. convolutional YOLO neural network,’’ Proc. SPIE, vol. 10334, Jun. 2017,
Comput. Vis. Pattern Recognit. (CVPR), Honolulu, HI, USA, Jul. 2017, Art. no. 103340Q.
pp. 4264–4273. [59] J. Zhu, S. Liao, Z. Lei, and S. Z. Li, ‘‘Multi-label convolutional neural
[37] A. Zonoozi, J. Kim, X. L. Li, and G. Cong, ‘‘Periodic-CRN: A con- network based pedestrian attribute classification,’’ Image Vis. Comput.,
volutional recurrent model for crowd density prediction with recurring vol. 58, pp. 224–229, Feb. 2017.
periodic patterns,’’ in Proc. 27th Int. Joint Conf. Artif. Intell., Stockholm, [60] D. Matti, H. K. Ekenel, and J.-P. Thiran, ‘‘Combining LiDAR space
Sweden, Jul. 2018, pp. 3732–3738. clustering and convolutional neural networks for pedestrian detection,’’
[38] J. Shen, X. Zuo, L. Zhu, J. Li, W. Yang, and H. Ling, ‘‘Pedestrian proposal in Proc. 14th IEEE Int. Conf. Adv. Video Signal Based Surveill. (AVSS),
and refining based on the shared pixel differential feature,’’ IEEE Trans. Lecce, Italy, Aug./Sep. 2017, pp. 1–6.
Intell. Transp. Syst., to be published. [61] B. Nambuusi, T. Brijs, and E. Hermans, ‘‘A review of accident predic-
[39] D. Geronimo, A. M. Lopez, A. D. Sappa, and T. Graf, ‘‘Survey of pedes- tion models for road intersections,’’ Policy Res. Centre Mobility Public
trian detection for advanced driver assistance systems,’’ IEEE Trans. Works, Ghent, Belgium, Tech. Rep. RA-MOW-2008-004, 2008.
Pattern Anal. Mach. Intell., vol. 32, no. 7, pp. 1239–1258, Jul. 2010. [62] F. Gianfranco, S. Soddu, and P. Fadda, ‘‘An accident prediction model for
[40] C. Zhou and J. Yuan, ‘‘Bi-box regression for pedestrian detection and urban road networks,’’ J. Transp. Saf. Secur., vol. 10, no. 4, pp. 387–405,
occlusion estimation,’’ in Proc. Eur. Conf. Comput. Vis. (ECCV), Munich, 2018.
Germany, 2018, pp. 138–154. [63] W. E. Bunney, Jr., and D. A. Hamburg, ‘‘Development of a method for
[41] J. Kim et al., ‘‘Optimal feature selection for pedestrian detection based on systematic observation of emotional behavior on psychiatric wards,’’ Arch
logistic regression analysis,’’ in Proc. IEEE Int. Conf. Syst., Man, Cybern., Gen Psychiatry, vol. 9, no. 3, 1963.
Manchester, U.K., Oct. 2013, pp. 239–242. [64] M. Li, X. Chen, X. Lin, D. Xu, and Y. Wang, ‘‘Connected vehicle-
[42] T. Yamashita, H. Fukui, Y. Yamauchi, and H. Fujiyoshi, ‘‘Pedestrian based red-light running prediction for adaptive signalized intersec-
and part position detection using a regression-based multiple task deep tions,’’ J. Intell. Transp. Syst.-Technol., Planning, Oper., vol. 22, no. 3,
convolutional neural network,’’ in Proc. 23rd Int. Conf. Pattern Recog- pp. 229–243, 2018.
nit. (ICPR), Cancun, Mexico, Dec. 2016, pp. 3500–3505. [65] S. Alkheder, M. Taamneh, and S. Taamneh, ‘‘Severity prediction of traffic
[43] R. Irina, ‘‘An empirical study of the naive Bayes classifier,’’ in Proc. accident using an artificial neural network,’’ J. Forecasting, vol. 36, no. 1,
Workshop Empirical Methods Artif. Intell., 2001, pp. 41–46. pp. 100–108, 2016.


[66] T. Lu, Y. Lixin, Z. Dunyao, and Z. Pan, ‘‘The traffic accident hotspot [89] H. Wang, M. Xu, F. Zhu, Z. Deng, Y. Li, and B. Zhou, ‘‘Shadow traffic:
prediction: Based on the logistic regression method,’’ in Proc. Int. Conf. A unified model for abnormal traffic behavior simulation,’’ Comput.
Transp. Inf. Saf., Wuhan, China, Jun. 2015. Graph., vol. 70, pp. 235–241, Feb. 2018.
[67] W. Ma and Z. Yuan, ‘‘Analysis and comparison of traffic accident regres- [90] X. Li, P. Djukic, and H. Zhang, ‘‘Traffic behavior driven dynamic zoning
sion prediction model,’’ in Proc. 3rd Int. Conf. Electromech. Control for distributed traffic engineering in SDN,’’ U.S. Patent 9 432 257 B2,
Technol. Transp. (ICECTT), Chongqing, China, 2018, pp. 1–6. Aug. 30, 2013.
[68] A. Theofilatos, G. Yannis, P. Kopelias, and F. Papadimitriou, ‘‘Predicting [91] A. Aksjonov, P. Nedoma, V. Vodovozov, E. Petlenkov, and M. Herrmann,
road accidents: A rare-events modeling approach,’’ Transp. Res. Proce- ‘‘A method of driver distraction evaluation using fuzzy logic,’’ in Proc.
dia, vol. 14, pp. 3399–3405, Apr. 2016. Int. Conf. Inf., Commun. Automat. Technol., Sarajevo, Bosnia and Herze-
[69] C. J. O’donnell and D. H. Connor, ‘‘Predicting the severity of motor govina, 2017, pp. 1–7.
vehicle accident injuries using models of ordered multiple choice,’’ Int. [92] A. Aksjonov, P. Nedoma, V. Vodovozov, E. Petlenkov, and M.
J. Intell. Syst. Appl. Eng., vol. 6, no. 1, pp. 72–79, 2018. Herrmann, ‘‘Detection and evaluation of driver distraction using
[70] M. Taamneh, S. Alkheder, and S. Taamneh, ‘‘Data-mining techniques for machine learning and fuzzy logic,’’ IEEE Trans. Intell. Transp. Syst.,
traffic accident modeling and prediction in the United Arab Emirates,’’ to be published.
J. Transp. Saf. Secur., vol. 9, no. 2, pp. 146–166, 2017. [93] N. Li, J. J. Jain, and C. Busso, ‘‘Modeling of driver behavior in real world
[71] X. Gu, T. Li, Y. Wang, L. Zhang, Y. Wang, and J. Yao, ‘‘Traffic fatalities scenarios using multiple noninvasive sensors,’’ IEEE Trans. Multimedia,
prediction using support vector machine with hybrid particle swarm vol. 15, no. 5, pp. 1213–1225, Aug. 2013.
optimization,’’ J. Algorithms Comput. Technol., vol. 12, no. 1, pp. 20–29, [94] N. AbuAli and H. Abou-zeid, ‘‘Driver behavior modeling: Develop-
2018. ments and future directions,’’ Int. J. Veh. Technol., vol. 2016, Nov. 2016,
[72] B. Sharma, V. K. Katiyar, and K. Kumar, ‘‘Traffic accident prediction Art. no. 6952791.
model using support vector machines with Gaussian kernel,’’ in Proc. [95] A. Bouhoute, R. Oucheikh, K. Boubouh, and I. Berrada, ‘‘Advanced
5th Int. Conf. Soft Comput. Problem Solving, Uttar Pradesh, India, 2016, driving behavior analytics for an improved safety assessment and driver
pp. 1–10. fingerprinting,’’ IEEE Trans. Intell. Transp. Syst., to be published.
[73] J. You, J. Wang, and J. Guo, ‘‘Real-time crash prediction on freeways [96] N. AbuAli and H. Abou-Zeid, ‘‘Driver behavior modeling: Develop-
using data mining and emerging techniques,’’ J. Modern Transp., vol. 25, ments and future directions,’’ Int. J. Veh. Technol., vol. 2016, Nov. 2016,
no. 2, pp. 116–123, 2017. Art. no. 6952791.
[74] S. Sarkar, A. Patel, S. Madaan, and J. Maiti, ‘‘Prediction of occupational [97] U. Fugiglando et al., ‘‘Driving behavior analysis through CAN bus data in
accidents using decision tree approach,’’ in Proc. 13th Int. IEEE India an uncontrolled environment,’’ IEEE Trans. Intell. Transp. Syst., vol. 20,
Conf. (INDICON), Bengaluru, India, 2016, pp. 1–6. no. 2, pp. 737–748, Feb. 2019.
[75] K. S. Jadaan, M. Al-Fayyad, and H. F. Gammoh, ‘‘Prediction of road traf-
[98] B. I. Ahmad, P. M. Langdon, J. Liang, S. J. Godsill, M. Delgado,
fic accidents in jordan using artificial neural network (ANN),’’ J. Traffic
and T. Popham, ‘‘Driver and passenger identification from smartphone
Logistics Eng., vol. 2, no. 2, pp. 92–94, 2014.
data,’’ IEEE Trans. Intell. Transp. Syst., vol. 20, no. 4, pp. 1278–1288,
[76] E. Contreras, L. Torres-Treviño, and F. Torres, ‘‘Prediction of car acci- Apr. 2019.
dents using a maximum sensitivity neural network,’’ Smart Technology
[99] J. Shi, H. Wei, and S. Shi, ‘‘Driving motion capture based driver behavior
(Lecture Notes of the Institute for Computer Sciences, Social Informatics
analysis,’’ in Proc. 15th Int. IEEE Conf. Intell. Transp. Syst., Anchorage,
and Telecommunications Engineering), vol. 213. New York, NY, USA:
AK, USA, Sep. 2012, pp. 1166–1171.
Springer, 2018, pp. 86–95.
[100] H. S. Kim, D. Yoon, H. S. Shin, and C. H. Park, ‘‘Predicting the EEG level
[77] F. N. Ogwueleka, S. Misra, T. C. Ogwueleka, and L. Fernandez-Sanz,
of a driver based on driving information,’’ IEEE Trans. Intell. Transp.
‘‘An artificial neural network model for road accident prediction: A case
Syst., vol. 20, no. 4, pp. 1215–1225, Apr. 2019.
study of a developing country,’’ Acta Polytechnica Hungarica, vol. 11,
no. 5, pp. 177–197, 2014. [101] H. S. Kim, Y. S. Hwang, D. S. Yoon, W. G. Choi, and C. H. Park,
[78] F. Chang and C. Liu, ‘‘Hybrid cascade structure for license plate detection ‘‘Driver workload characteristics analysis using EEG data from an urban
in large visual surveillance scenes,’’ IEEE Trans. Intell. Transp. Syst., to road,’’ IEEE Trans. Intell. Transp. Syst., vol. 15, no. 4, pp. 1844–1849,
be published. Aug. 2014.
[79] Z. Ma, S. Zhu, H. N. Koutsopoulos, and L. Ferreira, ‘‘Quantile regression [102] Traffic Safety Facts, document 20590, Utah Dept. Transp., Nat. Highway
analysis of transit travel time reliability with automatic vehicle location Traffic Safety Admin., Washington, DC, USA, 2015.
and farecard data,’’ Transp. Res. Rec., J. Transp. Res. Board, vol. 2652, [103] Police of Japan. (2017). National Police Agency. Accessed: Oct. 24, 2018.
pp. 19–29, Aug. 2017. [Online]. Available: https://www.npa.go.jp/english/kokusai/pdf/Police_
[80] N. Zenina and A. Borisov, ‘‘Regression analysis for transport trip gener- of_Japan_2017_full_text.pdf
ation evaluation,’’ Inf. Technol. Manage. Sci., vol. 16, no. 1, pp. 89–94, [104] J. Qianyin, L. Guoming, Y. Jinwei, and L. Xiying, ‘‘A model based
2013. method of pedestrian abnormal behavior detection in traffic scene,’’ in
[81] G. Cui, J. Luo, and X. Wang, ‘‘Personalized travel route recommendation Proc. IEEE 1st Int. Smart Cities Conf. (ISC2), Guadalajara, Mexico,
using collaborative filtering based on GPS trajectories,’’ Int. J. Digit. Oct. 2015, pp. 1–6.
Earth, vol. 11, no. 3, pp. 284–307, May 2017. [105] M. H. Zaki and T. Sayed, ‘‘Automated analysis of pedestrian group
[82] B. Sun and B. B. Park, ‘‘Route choice modeling with Support Vector behavior in urban settings,’’ IEEE Trans. Intell. Transp. Syst., vol. 19,
Machine,’’ Transp. Res. Procedia, vol. 25, pp. 1806–1814, 2017. no. 6, pp. 1880–1889, Jun. 2018.
[83] C. P. Tribby, H. J. Miller, B. B. Brown, C. M. Werner, and K. R. Smith, [106] M. Iryo-Asano, W. K. M. Alhajyaseen, and H. Nakamura, ‘‘Analysis and
‘‘Analyzing walking route choice through built environments using ran- modeling of pedestrian crossing behavior during the pedestrian flash-
dom forests and discrete choice techniques,’’ Environ. Planning B, Urban ing green interval,’’ IEEE Trans. Intell. Transp. Syst., vol. 16, no. 2,
Anal. City Sci., vol. 44, no. 6, pp. 1145–1167, Jul. 2016. pp. 958–969, Apr. 2015.
[84] I. Lamouik, A. Yahyaouy, and M. A. Sabri, ‘‘Deep neural network [107] Y. Dong, Y. Li, W. Liu, and J. Wu, ‘‘Unconscious behavior detection
dynamic traffic routing system for vehicles,’’ in Proc. Int. Conf. Intell. for pedestrian safety based on gesture features,’’ in Proc. 18th Int.
Syst. Comput. Vis. (ISCV), Fez, Morocco, Apr. 2018, pp. 2–4. Conf. Parallel Distrib. Comput., Appl. Technol. (PDCAT), Taipei, Taiwan,
[85] H. Slavin, Q. Yang, D. Morgan, A. Rabinowicz, J. Brandon, and Dec. 2017, pp. 39–43.
R. Balakrishna, ‘‘Lane-level vehicle navigation for vehicle routing and [108] N. Kang, and H. Nakamura, ‘‘An analysis of characteristics of heavy
traffic management,’’ U.S. Patent 9 964 414 B2, May 8, 2018. vehicle behavior at roundabouts in Japan,’’ Transp. Res. Procedia, vol. 25,
[86] Z. Wang, A. Shafahi, and A. Haghani, ‘‘SCDA: School compatibil- pp. 1485–1493, Jun. 2017.
ity decomposition algorithm for solving the multi-school bus routing [109] I. Kaparias, M. G. H. Bell, T. Biagioli, L. Bellezzaa, and B. Mountc,
and scheduling problem,’’ Univ. Maryland, College Park, MD, USA, ‘‘Behavioural analysis of interactions between pedestrians and vehicles
Tech. Rep., 2017. in street designs with elements of shared space,’’ Transp. Res. F, Traffic
[87] Y. Liu et al., ‘‘Intelligent bus routing with heterogeneous human mobility Psychol. Behav., vol. 30, pp. 115–127, Apr. 2015.
patterns,’’ Knowl. Inf. Syst., vol. 50, no. 2, pp. 383–415, 2017. [110] K. Suzuki and H. Ito, ‘‘Empirical analysis on risky behaviors and
[88] D. Sekar and W. J. Shondelmyer, ‘‘Behavioral based traffic infraction pedestrian-vehicle conflicts at large-size signalized intersections,’’
detection and analysis system,’’ U.S. Patent 10 037 691 B1, Jul. 31, 2018. Transp. Res. Procedia, vol. 25, pp. 2139–2152, Jul. 2017.


[111] B. Chen, D. Zhao, and H. Peng, ‘‘Evaluation of automated vehicles [133] D. H. Stolfi and E. Alba, ‘‘Red Swarm: Reducing travel times in smart
encountering pedestrians at unsignalized crossings,’’ in Proc. IEEE cities by using bio-inspired algorithms,’’ Appl. Soft Comput., vol. 24,
Intell. Vehicles Symp. (IV), Redondo Beach, CA, USA, Jun. 2017, pp. 181–195, Nov. 2014.
pp. 1679–1685. [134] G. Wang, Y. Hou, Y. Zhang, Y. Zhou, N. Lu, and N. Cheng, ‘‘TLB-VTL:
[112] F. Camara et al., ‘‘Filtration analysis of pedestrian-vehicle interactions for 3-level buffer based virtual traffic light scheme for intelligent collabora-
autonomous vehicle control,’’ in Proc. 15th Int. Conf. Auton. Syst. (IAS), tive intersections,’’ in Proc. IEEE 86th Veh. Technol. Conf. (VTC-Fall),
Baden-Baden, Germany, Jun. 2018, pp. 1–13. Toronto, Candad, Sep. 2017, pp. 1–5.
[113] A. A. Zaid, Y. Suhweil, and M. A. Yaman, ‘‘Smart controlling for traffic [135] M. Khamis and W. Gomaa, ‘‘Enhanced multiagent multi-objective rein-
light time,’’ in Proc. IEEE Jordan Conf. Appl. Elect. Eng. Comput. forcement learning for urban traffic light control,’’ in Proc. 11th IEEE Int.
Technol. (AEECT), Aqaba, Jordan, Oct. 2017, pp. 1–5. Conf. Mach. Learn. Appl. (ICMLA), Boca Raton, FL, USA, Dec. 2012,
[114] A. Kanungo, A. Sharma, and C. Singla, ‘‘Smart traffic lights switching pp. 586–591.
and traffic density calculation using video processing,’’ in Proc. Recent [136] J. Turian, L. Ratinov, and Y. Bengio, ‘‘Word representations: A sim-
Adv. Eng. Comput. Sci. (RAECS), Chandigarh, India, Mar. 2014, pp. 1–6. ple and general method for semi-supervised learning,’’ in Proc. 48th
[115] B. Ghazal, K. ElKhatib, K. Chahine, and M. Kherfan, ‘‘Smart traffic light Annu. Meeting Assoc. Comput. Linguistics, Uppsala, Sweden, Jul. 2010,
control system,’’ in Proc. 3rd Int. Conf. Elect., Electron., Comput. Eng. pp. 384–394.
Their Appl. (EECEA), Beirut, Lebanon, Apr. 2016, pp. 140–145. [137] L. Baskar, B. De Schutter, J. Hellendoorn, and Z. Papp, ‘‘Traffic control
[116] N. M. Z. Hashim, A. S. Jaafar, N. A. Ali, L. Salahuddin, N. R. Mohamad, and intelligent vehicle highway systems: A survey,’’ IET Intell. Transp.
and M. A. Ibrahim, ‘‘Traffic light control system for emergency vehicles Syst., vol. 5, no. 1, pp. 38–52, Mar. 2011.
using radio frequency,’’ Int. Org. Sci. Res. J. Eng., vol. 3, pp. 43–52, [138] P. Gupta, G. N. Purohit, and A. Dadhich, ‘‘Approaches for intelligent
Jul. 2013. traffic system: A survey,’’ in Proc. Int. J. Comput. Sci. Eng. (IJCSE),
[117] S. Maqboo, U. Sabeel, N. Chandra, and R.-U.-A. Bhat, ‘‘Smart traffic vol. 9, no. 4, pp. 1570–1578, Sep. 2012.
light control and congestion avoidance system during emergencies using [139] K. N. Qureshi and A. H. Abdullah, ‘‘A survey on intelligent trans-
arduino and zigbee 802.15. 4,’’ Int. J. Adv. Res. Comput. Sci. Softw. Eng., portation systems,’’ Middle-East J. Sci. Res., vol. 5, no. 5, pp. 629–642,
vol. 3, pp. 1801–1808, Jun. 2013. 2013.
[118] S. Jaiswal, T. Agarwal, A. Singh, and Lakshita, ‘‘Intelligent traffic control [140] A. Chepuru and D. Rao, ‘‘A survey on IoT applications for intelli-
unit,’’ Int. J. Elect., Electron. Comput. Eng., vol. 2, no. 2, pp. 66–72, gent transport systems,’’ Int. J. Current Eng. Sci. Res., vol. 2, no. 8,
Aug. 2013. pp. 116–127, 2015.
[119] N. Mascarenhas, G. Pradeep, M. Agrawal, P. Subash, and A. Ajina, [141] W. Merrad, A. Rachedi, K. Busawon, and R. Binns, ‘‘A survey on
‘‘A proposed model for traffic signal preemption using global positioning smart traffic network control and optimization,’’ in Proc. International
system (GPS),’’ Comput. Sci. Inf. Technol., pp. 219–226, Jul. 2013. Conf. Multidisciplinary Eng. Design Optim. (MEDO), Belgrade, Serbia,
[120] P. Parida, S. Dhurua, and S. Priya, ‘‘An intelligent ambulance with some Sep. 2016, pp. 1–6.
advance features of telecommunication,’’ Int. J. Emerg. Technol. Adv. [142] S. Y. Cheung, S. C. Ergen, and P. Varaiya, ‘‘Traffic surveillance with
Eng., vol. 4, pp. 398–405, Oct. 2014. wireless magnetic sensors,’’ in Proc. 12th World Congr. Intell. Transp.
[121] S. Djahel, M. Salehie, I. Tal, and P. Jamshidi, ‘‘Adaptive traffic manage- Syst., San Francisco, CA, USA, 2005, pp. 1–13.
ment for secure and efficient emergency services in smart cities,’’ in Proc. [143] M. AmineKafi, Y. Challal, D. Djenouri, A. Bouabdallah, L. Khelladia,
IEEE Int. Conf. Pervasive Comput. Commun. Workshops (PERCOM and N. Badachea, ‘‘A study of wireless sensor network architectures and
Workshops), San Diego, CA, USA, Mar. 2013, pp. 340–343. projects for traffic light monitoring,’’ Procedia Comput. Sci., vol. 10,
[122] M. Collotta, G. Pau, G. Scatà, and T. Campisi, ‘‘A dynamic traffic light pp. 543–552, Aug. 2012.
management system based on wireless sensor networks for the reduction [144] M. Papageorgiou, C. Diakaki, V. Dinopoulou, A. Kotsialos, and Y. Wang,
of the red-light running phenomenon,’’ Transp. Telecommun., vol. 15, ‘‘Review of road traffic control strategies,’’ Proc. IEEE, vol. 91, no. 12,
no. 1, pp. 1–11, 2014. pp. 2043–2067, Dec. 2003.
[123] G. Velez and O. Otaegui, ‘‘Embedding vision-based advanced driver [145] P. Mirchandani and L. Head, ‘‘A real-time traffic signal control system:
assistance systems: A survey,’’ IET Intell. Transp. Syst., vol. 11, no. 3, Architecture, algorithms, and analysis,’’ Transp. Res. C, Emerg. Technol.,
pp. 103–112, Apr. 2017. vol. 9, pp. 415–432, Dec. 2011.
[124] J. Horgan, C. Hughes, J. McDonald, and S. Yogamani, ‘‘Vision-based [146] K. L. Head, ‘‘Event-based short-term traffic flow prediction model,’’
driver assistance systems: Survey, taxonomy and advances,’’ in Proc. 18th Transp. Res. Rec., pp. 45–52, Jan. 1995.
IEEE Int. Conf. Intell. Transp. Syst. (ITSC), Las Palmas, Spain, Sep. 2015, [147] N. Tendulkar, K. Sonawane, D. Vakte, D. Pujari, and G. Dhomase,
pp. 2032–2039. ‘‘A review of traffic management system using IoT,’’ Int. J. Modern
[125] ‘‘LED traffic lights reduce energy use in chicago by 85%,’’ C40 Cities, Trends Eng. Res., pp. 247–249, Apr. 2016.
New York, NY, USA, Tech. Rep., 2011. [148] R. Anaswara and S. Lakshmi, ‘‘A survey on traffic management in
[126] C. Rhodes and S. Djahel, ‘‘TRADER: Traffic light phases aware driving smart cities,’’ Int. J. Eng. Comput. Sci., vol. 11, pp. 18983–18986,
for reduced traffic congestion in smart cities,’’ in Proc. Int. Smart Cities Nov. 2016.
Conf. (ISC2), Wuxi, China, Sep. 2017, pp. 1–8. [149] K. Dresner and P. Stone, ‘‘Multiagent traffic management: A reservation-
[127] S. Djahel, N. Jabeur, R. Barrett, and J. Murphy, ‘‘Toward V2I commu- based intersection control mechanism,’’ Univ. Texas Austin, Austin, TX,
nication technology-based solution for reducing road traffic congestion USA, Tech. Rep., 2004.
in smart cities,’’ in Proc. Int. Symp. Netw., Comput. Commun. (ISNCC), [150] A. de La Fortelle, ‘‘Analysis of reservation algorithms for cooperative
Hammamet, Tunisia, May 2015, pp. 1–6. planning at intersections,’’ in Proc. 13th Int. IEEE Conf. Intell. Transp.
[128] Y. M. Jagadeesh, G. M. Suba, S. Karthik, and K. Yokesh, ‘‘Smart Syst., Funchal, Portugal, Sep. 2010, pp. 445–449.
autonomous traffic light switching by traffic density measurement [151] K. Dresner and P. Stone, ‘‘A Multiagent Approach to Autonomous
through sensors,’’ in Proc. Int. Conf. Comput., Commun., Syst. (ICCCS), Intersection Management,’’ J. Artif. Intell. Res., vol. 31, pp. 591–656,
Kanyakumari, India, Nov. 2015, pp. 123–126. Mar. 2008.
[129] G. S. Khekare and A. V. Sakhare, ‘‘A smart city framework for intelligent [152] Y. J. Zhang, A. A. Malikopoulos, and C. G. Cassandras, ‘‘Optimal control
traffic system using VANET,’’ in Proc. Int. Mutli-Conf. Automat., Com- and coordination of connected and automated vehicles at urban traffic
put., Commun., Control Compressed Sens. (iMac4s), Kottayam, India, intersections,’’ in Proc. Amer. Control Conf. (ACC), Boston, MA, USA,
Mar. 2013, pp. 302–305. Jul. 2016, pp. 6227–6232.
[130] M. B. Younes and A. Boukerche, ‘‘Intelligent traffic light controlling [153] Y. Geng and C. G. Cassandras, ‘‘Multi-intersection Traffic Light Control
algorithms using vehicular networks,’’ IEEE Trans. Veh. Technol., vol. 8, with blocking,’’ Discrete Event Dynamic Syst., vols. 1–2, no. 25, pp. 7–30,
no. 65, pp. 5887–5899, Aug. 2016. Jun. 2015.
[131] S. Badura and A. Lieskovsky, ‘‘Intelligent traffic system: Cooperation [154] Y. Geng and C. G. Cassandras, ‘‘New ‘smart parking’ system based on
of MANET and image processing,’’ in Proc. 1st Int. Conf. Integr. Intell. resource allocation and reservations,’’ IEEE Trans. Intell. Transp. Syst.,
Comput. (ICIIC), Bangalore, India, Aug. 2010, pp. 119–123. vol. 14, no. 3, pp. 1129–1139, Sep. 2013.
[132] A. S. Salama, B. K. Saleh, and M. M. Eassa, ‘‘Intelligent cross road [155] D. Prangchumpol, ‘‘A network traffic prediction algorithm based on data
traffic management system (ICRTMS),’’ in Proc. 2nd Int. Conf. Comput. mining technique,’’ Int. J. Comput. Inf. Eng., vol. 7, no. 7, pp. 999–1002,
Technol. Develop. (ICCTD), Cairo, Egypt, Nov. 2010, pp. 27–31. 2013.


AHMAD F. KLAIB received the B.Sc. degree in computer information systems from Al al-Bayt University, Jordan, in 2005, the master’s degree in computer science from the University of Science, Malaysia, in 2007, and the Ph.D. degree in computer science from the University of Huddersfield, U.K., in 2015. He is currently an Assistant Professor with the Computer Information Systems Department, Faculty of Information Technology and Computer Science, Yarmouk University, Jordan. He has two funded projects in the areas of smart homes and smart transportation systems. His research interests include string matching algorithms, text processing, video and image processing, optimization, health care, and the Internet-of-Things technology.

NAWAF O. ALSREHIN received the B.Sc. degree AWS MAGABLEH received the B.Sc. degree in
in computer science and the master’s degree in software engineering from Hashemite University,
computer information system from Yarmouk Uni- Jordan, in 2006, the master’s degree in software
versity, Irbid, Jordan, in 2003 and 2006, respec- engineering from the University Malaya (ML),
tively, and the Ph.D. degree in computer science Malaysia, in 2008, and the Ph.D. degree in soft-
from Utah State University, USA, in 2016. He is ware engineering from the National University of
currently an Assistant Professor and the Head of Malaysia, Malaysia, in 2015. He is currently an
the Computer Information Systems Department, Assistant Professor with the Computer Informa-
Faculty of Information Technology and Computer tion Systems Department, Faculty of Information
Science, Yarmouk University. His research inter- Technology and Computer Science, Yarmouk Uni-
ests include multimedia, video and image processing, video transcoding, versity, Jordan. His research interests include software engineering, unified
multimedia services, video quality assessments, distributed multimedia sys- modeling language, and aspect orientation.
tems, and multimedia applications in the cloud.
