Final Proposal PDF
Some locations contribute more to road accidents than others. Addis Ababa takes the lion’s share
of the risk because it has a higher number of vehicles and heavier traffic, and the cost of these
fatalities and injuries has a huge impact on the socio-economic development of the society [2].
Every year, around 300 people are killed on Addis Ababa's roads and 1,500 suffer light or serious
injuries. The government has started several campaigns, such as “Think!” and the Road Safety
Campaign (RSC), to raise awareness of road safety issues and to reduce road accidents [3].
Accidents have different causes, such as disregard for traffic rules, but road conditions and
traffic are considered among the prime causes of fatalities and casualties across the globe. The
dynamic growth of the automobile industry also contributes to these accidents. A traffic crash
occurs when a vehicle collides with another vehicle, a pedestrian, an animal, or a natural obstacle
on the road. It can result in injury, property damage, and death. Traffic accident analysis
requires studying the various factors behind these crashes.
A road traffic accident is defined as a collision or incident involving at least one road vehicle
in motion on a public road, or on a private road to which the public has right of access. Thus, a
road traffic accident can be a collision among vehicles, between vehicles and pedestrians, between
vehicles and animals, or between vehicles and geographical or architectural obstacles [1].
Single-vehicle accidents, in which one vehicle alone (and no other road user) was involved, are
included.
Various sectors deal with huge amounts of data available in different formats from disparate
sources. This data is becoming easily available and accessible due to the progressive use of
technology. Governments and companies realize that valuable insights can be obtained from big
data, but they lack the resources and time required to sift through its wealth of information. As
such, different industries are employing artificial intelligence to gather, process, communicate,
and share useful information from data sets. One artificial intelligence method that is
increasingly used for big data processing is machine learning [4].
To evaluate and analyze data stored in large databases, machine learning techniques are needed to
search large quantities of data and to discover new patterns and relationships hidden in the data.
Machine learning allows the analysis of massive quantities of data. While it generally delivers
faster, more accurate outcomes for identifying profitable opportunities or dangerous risks, it may
also require additional time and resources to train models properly. Integrating machine learning
with artificial intelligence and cognitive technologies can make it even more effective in
processing large volumes of information.
The traffic control system is an area in which important data about society is recorded and kept.
Using this data, we can identify the risk factors and causes of road traffic accidents, injuries,
and fatalities, and take preventive measures to save people's lives. Road traffic accident
analysis, a part of criminology, is a law enforcement function that involves systematically
identifying and analyzing patterns and trends in accidents.
Machine learning holds the promise of making it easy, convenient, and practical for organizations
and users to explore very large databases [6]. In practice, road traffic accident analysis
includes exploring and detecting accidents and their relationships with those who cause them.
Machine learning techniques use the high volume of accident data and its many variables to
identify the major causes of accidents [5].
Research on road traffic accidents has been conducted for several years, mainly in developed
countries, with a few studies carried out locally. Tibebe [7] studied historical road traffic
accident data, a dataset of 4,658 accident records from the Addis Ababa Traffic Office, to
investigate the application of data mining technology to the analysis of accident severity.
Following Tibebe, Zelalem [8] conducted research to classify drivers’ responsibility for a given
accident in Addis Ababa. In addition, Tibebe and Hill [9] studied the effect of road-related
factors on accident severity.
Previous research has focused on single attributes that help to predict traffic accidents in
Addis Ababa. This leaves a gap for further research that combines drivers’ information, road
characteristics, and other related attributes to predict the causes of accidents. In addition,
traffic rules and regulations in the capital city have changed since these studies were conducted,
and these changes make their own contribution to road safety.
Moreover, although different studies show a large number of road accidents, and the Addis Ababa
traffic control and investigation department gathers road traffic accident data periodically,
this accumulated historical data has not been used for analysis because of a lack of appropriate
data analysis tools.
The recorded data is a major resource for analyzing the factors that contribute to a problem
causing great loss of life. One way to help prevent road traffic accidents is to research their
main causes and to attack the problem at its root.
In this research, the researcher will construct a model that predicts the major causes of road
traffic accidents based on drivers’ information, road characteristics, and other related
attributes, using traffic accident data from the sub-city police departments of Addis Ababa.
Purpose and Research Question
In this thesis, machine learning techniques will be used in a knowledge discovery process to
identify and predict the major causes of road traffic accidents. This research will address the
following three main research questions:
What are the main determinant factors (attributes) that can cause traffic accidents?
Which machine learning techniques perform well in identifying the main causes of road
traffic accidents?
What are the most interesting patterns or rules generated from the causal factors of road
traffic accidents that can be used to inform traffic rules and policies?
In [10], different supervised machine learning methods, namely Logistic Regression, K-Nearest
Neighbor, Naive Bayes, Decision Tree, and Random Forest, are applied to an accident dataset to
discover how each component affects the accident variables, which yields safe-driving
recommendations to limit accidents. The study used Anaconda, a free and open-source distribution
of the R and Python programming languages for large-scale data processing, prediction, and
analysis, which includes Jupyter Notebook. Its findings indicate that the Decision Tree can be
the best model for predicting the cause of accidents: it showed better performance on all the
components, namely Weather Condition, Causes, Road Features, Road Condition, and Type of
Accident, with 99.4%.
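To illustrate how a decision tree can rank accident attributes, the following minimal sketch computes information gain, the split criterion used by tree learners such as ID3. It is an assumption that this criterion applies here, since the source does not state which criterion the tree in [10] used. The records, attribute names, and values are hypothetical.

```python
import math
from collections import Counter, defaultdict


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def information_gain(rows, attribute, target):
    """Reduction in target entropy obtained by splitting rows on one attribute."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[attribute]].append(row[target])
    remainder = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return entropy([row[target] for row in rows]) - remainder


# Hypothetical toy records; attribute names and values are illustrative only.
records = [
    {"weather": "rain", "road": "wet", "cause": "speeding"},
    {"weather": "rain", "road": "wet", "cause": "speeding"},
    {"weather": "clear", "road": "dry", "cause": "distraction"},
    {"weather": "clear", "road": "wet", "cause": "distraction"},
]

gains = {a: information_gain(records, a, "cause") for a in ("weather", "road")}
best_attribute = max(gains, key=gains.get)
print(gains, best_attribute)
```

In this toy data the attribute with the larger gain separates the causes more cleanly, so a tree learner would split on it first.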
In [11], three classification algorithms, Decision Tree, ANN, and SVM, were implemented to detect
the influential environmental features of road traffic accidents (RTAs) that can be used to build
prediction classification rules. The classifiers were trained and tested with the WEKA tool on a
dataset obtained from the United Kingdom Department for Transport. The R tool was also used to
apply sampling techniques to handle the imbalanced-data problem in the dataset. The experimental
results show that the Decision Tree achieved the highest Accuracy, Precision, Recall, and
F-Measure values, at 80.650%, 0.814, 0.806, and 0.801 respectively. The PART algorithm was used
to present the knowledge in the form of rules. PART reached an accuracy of 76.570% on the traffic
accident dataset under 10-fold cross-validation. Moreover, the Java language was used to build
the PART rule list for the prediction model. Rules were generated based on the Urban or Rural
Area, Speed Limit, Light Conditions, and Number of Vehicles attributes.
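The evaluation measures and the 10-fold cross-validation mentioned above can be sketched as follows. This is a simplified illustration and not the WEKA procedure used in [11]: the metrics are computed for a single positive class rather than averaged over classes, the folds are contiguous rather than shuffled or stratified, and the labels are hypothetical.

```python
def classification_metrics(y_true, y_pred, positive):
    """Accuracy plus precision, recall, and F-measure for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f_measure = (
        2 * precision * recall / (precision + recall) if precision + recall else 0.0
    )
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f_measure": f_measure,
    }


def k_fold_indices(n_samples, k=10):
    """Yield (train_indices, test_indices) for k contiguous, near-equal folds."""
    start = 0
    for i in range(k):
        # The first (n_samples % k) folds receive one extra sample.
        size = n_samples // k + (1 if i < n_samples % k else 0)
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, test
        start += size


# Hypothetical severity labels, for illustration only.
y_true = ["severe", "severe", "slight", "slight", "slight"]
y_pred = ["severe", "slight", "slight", "slight", "severe"]
metrics = classification_metrics(y_true, y_pred, positive="severe")
folds = list(k_fold_indices(25, k=10))
print(metrics, len(folds))
```

Each record appears in exactly one test fold, so every record is used for testing once and for training in the remaining folds.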
In [12], the authors applied different machine learning classification algorithms and discussed
the six with the highest accuracy and best classification performance, namely Fuzzy-FARCHD,
Random Forest, Hierarchical LVQ, RBF Network (Radial Basis Function Network), Multilayer
Perceptron, and Naïve Bayes, on a road traffic accident dataset from the United Kingdom for the
year 2016. The results show that the Fuzzy-FARCHD algorithm was effective in classifying the
dataset, achieving an accuracy of 85.94%. In this work, Lighting Conditions, 1st Road Class & No.,
and Number of Vehicles are the key features in attribute selection.
In [13], four machine learning techniques, Naïve Bayes, k-Nearest Neighbors, Decision Trees, and
Support Vector Machines, were used to evaluate road accidents in Punjab. The challenge in this
work was to perform a parametric evaluation that extracts the most important parameters
specifically for Punjab. The study yields the 12 most suitable parameters and the highest
performance, 86.25%, for the Decision Tree classifier. The three factors contributing most to
road accidents in Punjab are the mental state of the driver, alcohol consumption, and the speed
of the vehicle.
In [14], the authors demonstrated models to select a set of influential factors and to classify
the severity of injuries. These models are built with various machine learning techniques.
Supervised machine learning algorithms, namely AdaBoost, Logistic Regression, Naive Bayes, and
Random Forest, are applied to traffic accident data. The SMOTE algorithm was used to handle data
imbalance. The outcome of the study shows that the Random Forest model can be the best tool for
predicting the injury severity of traffic accidents. The RF algorithm performed better, with
75.5% accuracy, than LR with 74.5%, NB with 73.1%, and AdaBoost with 74.5%.
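The idea behind SMOTE can be sketched as follows: a synthetic minority sample is placed at a random position on the line segment between a minority sample and one of its nearest minority neighbours. This is a simplified variant under stated assumptions, not the implementation used in [14]: it picks base samples at random, handles numeric features only, and uses hypothetical data points and parameter names.

```python
import random


def smote(minority, n_synthetic, k=2, seed=0):
    """Create synthetic minority samples by interpolating toward near neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_synthetic):
        base = rng.choice(minority)
        # The k nearest minority neighbours of the base point, excluding itself.
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)),
        )[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment from base to neighbour
        synthetic.append(tuple(a + gap * (b - a) for a, b in zip(base, neighbour)))
    return synthetic


# Hypothetical numeric feature vectors for the under-represented class.
minority = [(1.0, 2.0), (1.5, 2.5), (2.0, 2.0), (1.2, 1.8)]
new_points = smote(minority, n_synthetic=6)
print(new_points)
```

Because each new point is an interpolation between two existing minority samples, the synthetic data stays inside the region already occupied by the minority class.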
In [15], machine learning algorithms such as Decision Tree and Naïve Bayes are used to determine
the severity of accidents using the WEKA tool. The result analysis shows that the J48 classifier
gives better accuracy than the other algorithms in determining the severity of an accident.
Title: Evaluation and Classification of Road Accidents Using Machine Learning Techniques
Approach: Used machine learning algorithms for evaluation and classification of road accidents
Dataset: Taken from the Punjab government’s authentic organization named Punjab Road Safety Organization
Key factors: Mental state of driver, alcohol consumption, and speed of vehicle
Algorithms: Naïve Bayes, k-Nearest Neighbors, Decision trees, and Support Vector Machines
Outcome: The study yields 12 most suitable parameters and maximum performance of 86.25% for Decision Tree classifier
Specific Objectives
To accomplish the above stated general objective, the following specific objectives will be carried out:
Conduct a thorough review of literature on the existing machine learning techniques and
methods and their application in road traffic accidents.
Identify appropriate machine learning algorithms, assess the machine learning software
packages that suit the problem domain, and select the best one.
Select and extract the data set required for analysis from the databases of the Addis Ababa
sub-city police departments.
Prepare the data for analysis, which includes adjusting inconsistent data encoding,
accounting for missing values, and deriving other fields from existing ones.
Conduct training and testing of the predictive models using the newly prepared dataset.
Compare and suggest the best model for prediction.
Interpret and analyze the results of the selected model and forward recommendation.
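The data preparation objective above (adjusting inconsistent encoding, accounting for missing values, and deriving fields) can be sketched as follows. This is only an illustration under stated assumptions: the field names, the code mapping, the mode-based imputation, and the time-of-day bands are all hypothetical and are not taken from the police records.

```python
from collections import Counter

# Hypothetical mapping of inconsistent spellings to one canonical code.
WEATHER_CODES = {
    "rain": "rainy",
    "rainy": "rainy",
    "raining": "rainy",
    "clear": "clear",
    "sunny": "clear",
}


def normalise(value, codes):
    """Map a raw entry to its canonical code; blank or unknown entries become None."""
    if value is None:
        return None
    return codes.get(value.strip().lower())


def fill_missing_with_mode(rows, field):
    """Replace None in one field with that field's most frequent observed value."""
    observed = [row[field] for row in rows if row[field] is not None]
    mode = Counter(observed).most_common(1)[0][0]
    for row in rows:
        if row[field] is None:
            row[field] = mode


def part_of_day(hour):
    """Derive a coarse time-of-day field from the hour of the accident (0-23)."""
    if 6 <= hour < 12:
        return "morning"
    if 12 <= hour < 18:
        return "afternoon"
    if 18 <= hour < 22:
        return "evening"
    return "night"


# Hypothetical encoded records, for illustration only.
records = [
    {"weather": "Rain ", "hour": 8},
    {"weather": "RAINY", "hour": 14},
    {"weather": "", "hour": 22},
    {"weather": "sunny", "hour": 3},
]

for record in records:
    record["weather"] = normalise(record["weather"], WEATHER_CODES)
    record["part_of_day"] = part_of_day(record["hour"])
fill_missing_with_mode(records, "weather")
print(records)
```

Filling gaps with the most frequent value is only one option; records with missing values could instead be dropped or flagged, depending on how much data is affected.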
Scope and Limitations
The scope of this research is limited to identifying and predicting the main causes of road
traffic accidents in Addis Ababa city.
This study has the following data-related limitation:
Accident records are kept in hard-copy, handwritten format. Therefore, additional time and
effort are needed to encode and process them.
Significance of the Research
The Ethiopian government is implementing a number of new traffic rules. These measures aim to
reduce the growing number of traffic accidents, which result in thousands of deaths and hundreds
of millions of dollars in property damage every year. This study will support the government by
adding to the understanding of the risk factors that contribute to road traffic accidents and
related injuries in Addis Ababa. The results of this study can be used by road safety authorities
for planning and evaluating road safety measures. It will also pave the way for developing better
parameters in all aspects of the traffic control system. Specifically, it will support the
Traffic Control Division of Addis Ababa in taking proper action against road traffic accidents,
such as revising the existing traffic rules. Citizens, NGOs, and the media can also take the
necessary action with the help of local government. If considered, the recommendations will
benefit the public at large by helping to prevent road accidents and improve safety performance.
Research Methodology
The methodologies to be used in conducting this research are described as follows.
Budget
The budget is allocated according to the plan, considering the scope of the project from
beginning to completion. The plan includes all expenses from the start of the proposal to the
completion of the project work. The estimated cost of the research is expected to exceed
10,000 ETB in order to obtain adequate and appropriate data and information.
Table 2: Budget Plan