1822 B.E Cse Batchno 157 - Removed
Machine Learning (ML) is coming into its own, with a growing recognition that ML
can play a key role in a wide range of critical applications, such as data mining,
natural language processing, image recognition, and expert systems.
"A computer program is said to learn from experience E with respect to some task T and
some performance measure P, if its performance on T, as measured by P, improves with
experience E."
Machine learning workflow refers to the series of stages or steps involved in the process
of building a successful machine learning system.
The various stages involved in the machine learning workflow are:
Figure 1.2 Machine Learning Process
Data Collection- In this stage,
Data is collected from different sources. The type of data collected depends upon the type
of desired project. Data may be collected from various sources such as files, databases,
etc. The quality and quantity of the gathered data directly affect the accuracy of the desired
system.
Data Preparation- In this stage,
Data preparation is done to clean the raw data. Data collected from the real world is
transformed into a clean dataset.
Raw data may contain missing values, inconsistent values, duplicate instances, etc. So, raw
data cannot be used directly for building a model.
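As a rough illustration of this stage, the sketch below drops rows with missing values and exact duplicates using only the standard library. The function name `clean_records`, the field names, and the sample rows are assumptions made for the example; they are not taken from the project's dataset.

```python
def clean_records(records, required_fields):
    """Drop rows with missing required values and rows that repeat an earlier one."""
    seen = set()
    cleaned = []
    for row in records:
        # A value counts as missing if the key is absent, None, or an empty string.
        if any(row.get(field) in (None, "") for field in required_fields):
            continue
        # Two rows are duplicates if they agree on every required field.
        key = tuple(row.get(field) for field in required_fields)
        if key in seen:
            continue
        seen.add(key)
        cleaned.append(row)
    return cleaned


# Hypothetical raw accident records.
raw = [
    {"location": "A", "weather": "rain", "severity": 2},
    {"location": "A", "weather": "rain", "severity": 2},   # duplicate instance
    {"location": "B", "weather": None, "severity": 1},     # missing value
    {"location": "C", "weather": "clear", "severity": 3},
]
cleaned = clean_records(raw, ["location", "weather", "severity"])
print(len(cleaned))
```

Inconsistent values (for example, different spellings of the same category) would need an additional normalisation step that is not shown here.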
Choosing Learning Algorithm- In this stage,
The best performing learning algorithm is researched. It depends upon the type of
problem that needs to be solved and the type of data we have. If the problem is to classify
and the data is labeled, classification algorithms are used.
If the problem is to perform a regression task and the data is labeled, regression
algorithms are used.
If the problem is to create clusters and the data is unlabeled, clustering algorithms are
used.
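The three rules above can be written down as a small lookup. This is only a sketch of the decision logic stated in the text; the function name `choose_algorithm_family` and the task strings are assumptions introduced for illustration.

```python
def choose_algorithm_family(task, labeled):
    """Map a problem type and label availability to an algorithm family."""
    if task == "classify" and labeled:
        return "classification"
    if task == "regress" and labeled:
        return "regression"
    if task == "cluster" and not labeled:
        return "clustering"
    # Combinations not covered by the three rules are rejected explicitly.
    raise ValueError(f"no rule for task={task!r}, labeled={labeled}")


print(choose_algorithm_family("classify", labeled=True))
```

Accident versus non-accident detection on labeled frames falls under the first rule.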
Training Model- In this stage,
The model is trained to improve its ability. The dataset is divided into a training dataset and
a testing dataset.
The training/testing split is typically on the order of 80/20 or 70/30. It also depends upon the size of
the dataset. The training dataset is used for training, and the testing dataset is used for
testing.
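A minimal sketch of such a split, assuming the samples fit in memory and that a random shuffle is acceptable. The function name `train_test_split` and the fixed seed are choices made for this example only.

```python
import random


def train_test_split(samples, test_ratio=0.2, seed=0):
    """Shuffle the samples and split them into (train, test) lists."""
    rng = random.Random(seed)          # fixed seed so the split is reproducible
    shuffled = list(samples)
    rng.shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_ratio))
    return shuffled[n_test:], shuffled[:n_test]


# An 80/20 split of 100 hypothetical sample indices.
train, test = train_test_split(range(100), test_ratio=0.2)
print(len(train), len(test))
```

Passing `test_ratio=0.3` gives the 70/30 variant mentioned above.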
Types of Machine Learning
Regression: Regression algorithms are used to solve regression problems, in which the
relationship between input variables and a continuous output variable is modeled; linear
regression assumes that this relationship is linear. They are used to predict
continuous output variables, such as market trends, weather prediction, etc.
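To make the idea concrete, here is a small sketch of simple linear regression with one input variable, fitted by ordinary least squares. The function name `fit_line` and the sample points are assumptions for illustration.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept for one input variable."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Spread of x, and how x and y vary together around their means.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical points lying exactly on y = 2x + 1.
xs = [1, 2, 3, 4]
ys = [3, 5, 7, 9]
slope, intercept = fit_line(xs, ys)
print(slope, intercept)
```

The fitted line can then predict a continuous output for an unseen input as `slope * x + intercept`.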
taking action, learning from experiences, and improving its performance. Reinforcement
learning is categorized mainly into two types of methods/algorithms:
LITERATURE SURVEY
Proposed work
Incidents are defined as random and nonrecurring events such as vehicular crashes,
disabled vehicles, spilled loads, temporary maintenance and construction activities, and
other unusual events that disrupt the normal flow of traffic. Incident-related congestion
is a common occurrence on busy roadways. The number of incidents per million-vehicle-
miles has been reported to be between 20 and 200 and lane-blocking incidents lasting
more than 45 minutes per 100 million-vehicle-miles have been reported to be 1.09. It has
been estimated that 52 to 58 percent of the traffic congestion in urban areas is due to
incidents, amounting to 2 billion vehicle-hours of delay and a cost of $40 billion in terms
of hours of delay and excess fuel consumption in 2001, as reported in the 2003 Annual
emergency services, lost time and reduction in productivity, increased cost of goods and
services, reduced air quality, increased vehicle maintenance costs and reduced quality
of life.
Kukshya, V., Krishnan, H., Kellum, C., n.d. Design of a system solution for relative
positioning of vehicles using vehicle-to-vehicle radio communications during GPS
outages.
Proposed work
Active safety applications for vehicles have been at the forefront of the automotive safety
community for a number of years. Cooperative collision warning based on vehicle-to-
vehicle radio communications and GPS systems is one such promising active safety
application that has evoked considerable interest among automobile manufacturers and
research communities worldwide. In this paper, we address one of the key functional
components of the cooperative collision warning application, which is, accurate
estimation of relative positions of all the neighboring vehicles based on real-time
exchange of their individual GPS position coordinates, and then propose a novel system
solution for achieving the same (relative positioning functionality) during persistent GPS
outages. Based on survey results, we also qualitatively assess various radio based ranging
and relative positioning techniques, experimentally evaluate the received signal strength
based ranging technique, and comment on their suitability for our proposed solution.
Xu, Q., Mak, T., Ko, J., Sengupta, R., 2004. Vehicle-to-vehicle safety messaging in DSRC.
AC
The design of layer-2 protocols for a vehicle to send safety messages to other
vehicles. The target is to send vehicle safety messages with high reliability and
low delay. The communication is one-to-many, local, and geo-significant. The
vehicular communication network is ad-hoc, highly mobile, and with large
numbers of contending nodes. We design several random access protocols for
medium access control. The protocols fit in the DSRC multi-channel
architecture. Analytical bounds on performance of the addressed protocols are
derived. Simulations are conducted to assess the reception reliability and channel
usage of the protocols. The sensitivity of the protocol performance is evaluated
under various offered traffic and vehicular traffic flows. The results show our
approach is feasible for vehicle safety messages in DSRC.
Krakiwsky, E.J., Harris, C.B., Wong, R.V., 1988. A Kalman filter for integrating dead
reckoning, map matching and GPS positioning, in: IEEE Position Location and
Navigation Symposium, 1988. Record. Navigation into the 21st Century. IEEE PLANS
'88. IEEE, pp. 39-46.
To make driving easier and safer, modern vehicles are equipped with driver support
systems. Some of these systems, for example navigation or
curvature warning systems, need the global position of the vehicle. To determine this
position, the Global Positioning System (GPS) or a Dead Reckoning (DR) system can be
used. However, these systems often have certain drawbacks. For example, DR systems
suffer from error growth with time and GPS signal masking can occur. By integrating the
DR position and the GPS position, the complementary characteristics of these two
systems can be used advantageously. In this thesis, low cost in-vehicle sensors (gyroscope
and speedometer) are used to perform DR and the GPS receiver used has a low update
frequency. The two systems are integrated with an extended Kalman filter in order to
estimate a position. The evaluation of the implemented positioning algorithm shows that
the system is able to give an estimated position in the horizontal plane with a relatively
high update frequency and with the accuracy of the GPS receiver used. Furthermore, it is
shown that the system can handle GPS signal masking for a period of time. In order to
increase the performance of a positioning system, map matching can be added. The idea
with map matching is to compare the estimated trajectory of a vehicle with roads stored
in a map data base, and the best match is chosen as the position of the vehicle. In this
thesis, a simple off-line map matching algorithm is implemented and added to the
positioning system. The evaluation shows that the algorithm is able to distinguish roads
with different direction of travel from each other and handle off-road driving.
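The DR/GPS integration described above can be illustrated with a deliberately simplified filter. The sketch below is a one-dimensional linear Kalman filter, not the extended Kalman filter of the cited work: position is a single number, dead reckoning supplies the prediction from speed, and a GPS fix (when available) corrects it. The function name `kalman_step` and all noise values are assumptions.

```python
def kalman_step(x, p, velocity, dt, q, gps=None, r=None):
    """One predict/update cycle for a scalar position estimate.

    x, p      : current position estimate and its variance
    velocity  : speed from the in-vehicle sensor (dead reckoning input)
    q         : process noise variance added per step (DR error growth)
    gps, r    : GPS position and its variance; gps=None models signal masking
    """
    # Predict with dead reckoning: uncertainty grows at every step.
    x_pred = x + velocity * dt
    p_pred = p + q
    if gps is None:
        return x_pred, p_pred
    # Update with the GPS fix: the gain weighs prediction against measurement.
    gain = p_pred / (p_pred + r)
    x_new = x_pred + gain * (gps - x_pred)
    p_new = (1 - gain) * p_pred
    return x_new, p_new


# GPS available: the estimate moves between the DR prediction and the fix.
x1, p1 = kalman_step(x=0.0, p=1.0, velocity=10.0, dt=1.0, q=0.5, gps=12.0, r=1.5)
# GPS masked: only dead reckoning is used and the variance keeps growing.
x2, p2 = kalman_step(x1, p1, velocity=10.0, dt=1.0, q=0.5)
print(x1, p1, x2, p2)
```

The growing variance during masking reflects the DR error growth mentioned in the text, and the shrinking variance after a fix reflects the complementary role of GPS.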
Boukerche, A., Oliveira, H., Nakamura, E., Loureiro, A., 2008. Vehicular Ad Hoc
Networks: A New Challenge for Localization-Based Systems. Computer
Communications 31, 2838–2849.
A new kind of ad hoc network is hitting the streets: Vehicular Ad Hoc Networks
(VANets). In these networks, vehicles communicate with each other and possibly with a
roadside infrastructure to provide a long list of applications varying from transit safety to
driver assistance and Internet access. In these networks, knowledge of the real-time
position of nodes is an assumption made by most protocols, algorithms, and applications.
This is a very reasonable assumption, since GPS receivers can be installed easily in
vehicles, a number of which already come with this technology. But as VANets advance
into critical areas and become more dependent on localization systems, GPS is starting to
show some undesired problems such as not always being available or not being robust
enough for some applications. For this reason, a number of other localization
techniques such as Dead Reckoning, Cellular Localization, and Image/Video Localization
have been used in VANets to overcome GPS limitations. A common procedure in all
these cases is to use Data Fusion techniques to compute the accurate position of vehicles,
creating a new paradigm for localization in which several known localization techniques
are combined into a single solution that is more robust and precise than the individual
approaches. In this paper, we further discuss this subject by studying and analyzing the
localization requirements of the main VANet applications.
CHAPTER - 3 PROJECT DESCRIPTION
EXISTING SYSTEM:
In the existing method, data mining techniques are used to identify the locations where
accidents occur with high frequency and to analyze them to identify the factors that
affect road accidents at those locations. The first task is to divide the accident locations
into k groups using the k-means clustering algorithm based on road accident frequency
counts. Then, an association rule mining algorithm is applied to find the
relationships between distinct attributes in the accident data set and, from those
relationships, to characterize the locations.
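The grouping step of the existing method can be sketched as k-means on one-dimensional accident frequency counts. This is an illustrative standard-library version, not the implementation used in the existing system; the function name `kmeans_1d`, the deterministic initialisation, and the sample counts are assumptions.

```python
def kmeans_1d(values, k, iterations=50):
    """Cluster scalar values into k groups; returns (centers, labels)."""
    ordered = sorted(values)
    # Deterministic initialisation: k evenly spaced values from the sorted data.
    centers = [ordered[(2 * i + 1) * len(ordered) // (2 * k)] for i in range(k)]
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda j: abs(v - centers[j]))
            groups[nearest].append(v)
        # Move each center to the mean of its group; keep it if the group is empty.
        new_centers = [sum(g) / len(g) if g else centers[j]
                       for j, g in enumerate(groups)]
        if new_centers == centers:
            break
        centers = new_centers
    labels = [min(range(k), key=lambda j: abs(v - centers[j])) for v in values]
    return centers, labels


# Hypothetical accident counts per location: low, medium and high frequency.
counts = [2, 3, 1, 20, 22, 19, 50, 55]
centers, labels = kmeans_1d(counts, k=3)
print(centers, labels)
```

Each label identifies the frequency group of a location, which is the input the association rule mining step would then analyze.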
PROPOSED SYSTEM:
In the proposed method, classification techniques are used to identify accident-prone
areas. The accident data records help to understand the characteristics
of many features, such as driver behavior, roadway conditions, light conditions, weather
conditions and so on. This can help users devise safety measures that are
useful for avoiding accidents. The data set is analyzed using the YOLO (You Only Look
Once) algorithm, with the aim of producing accurate results. The models are used to identify
statistically significant factors that can predict the probabilities of crashes
and injury, which can in turn be used to estimate a risk factor.
WORKING
In this project, we develop a system that constantly monitors vehicles
to check whether an accident has occurred. First, we need to collect the data;
in this project, we collected the data from Kaggle. Next, we need to preprocess
the data, because data collected from a website may contain missing values
or other defects. Handling these issues is called
data preprocessing. Then we apply machine learning algorithms to predict the output.
After prediction, we save the data. This helps to effectively monitor accidents
among the general public.
MODULE DESCRIPTION
Data imbalance. Some classes or categories in the data may have a
disproportionately high or low number of corresponding samples. As a result,
they risk being under-represented in the model.
Data bias. Depending on how the data, subjects and labels themselves are
chosen, the model could propagate inherent biases on gender, politics, age or
region, for example. Data bias is difficult to detect and remove.
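For the data imbalance issue described above, one common mitigation is to weight each class inversely to its frequency during training. The sketch below computes such weights; the function name `inverse_frequency_weights` and the 90/10 label mix are assumptions, and this is not stated to be the approach used in the project.

```python
from collections import Counter


def inverse_frequency_weights(labels):
    """Weight each class by total / (number_of_classes * class_count)."""
    counts = Counter(labels)
    total = len(labels)
    return {cls: total / (len(counts) * n) for cls, n in counts.items()}


# Hypothetical imbalanced label set: accidents are the rare class.
labels = ["Non-Accident"] * 90 + ["Accident"] * 10
weights = inverse_frequency_weights(labels)
print(weights)
```

With these weights, an error on the rare class costs more than an error on the common class, which counteracts under-representation. Data bias, by contrast, cannot be corrected by reweighting alone.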
Data Preprocessing : Real-world raw data and images are often incomplete,
inconsistent and lacking in certain behaviors or trends. They are also likely to
contain many errors. So, once collected, they are pre-processed into a format
the machine learning algorithm can use for the model.
Pre-processing includes a number of techniques and actions:
Data cleaning. These techniques, manual and automated, remove data
incorrectly added or classified.
Data imputations. Most ML frameworks include methods and APIs for
balancing or filling in missing data. Techniques generally include imputing
missing values with standard deviation, mean, median and k-nearest neighbors
(k-NN) of the data in the given field.
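As a small illustration of imputation, the sketch below fills missing entries of one numeric field with the mean or the median of the observed values. The function name `impute`, the use of `None` as the missing-value marker, and the sample values are assumptions; k-NN imputation is not shown.

```python
from statistics import mean, median


def impute(values, strategy="mean"):
    """Replace None entries with the mean or median of the observed values."""
    observed = [v for v in values if v is not None]
    if strategy == "mean":
        fill = mean(observed)
    elif strategy == "median":
        fill = median(observed)
    else:
        raise ValueError(f"unknown strategy: {strategy!r}")
    return [fill if v is None else v for v in values]


# Hypothetical field with two missing readings.
speeds = [40, None, 60, 50, None, 90]
print(impute(speeds, "mean"))
print(impute(speeds, "median"))
```

The median is less sensitive to outliers such as the value 90 here, which is one reason to prefer it for skewed fields.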
Data cleaning :
Data cleaning is one of the important parts of machine learning. It plays a significant part
in building a model. It surely isn’t the fanciest part of machine learning and at the same
time, there aren’t any hidden tricks or secrets to uncover. However, the success or failure
of a project relies on proper data cleaning. Professional data scientists usually invest a
very large portion of their time in this step because of the belief that “Better data beats
fancier algorithms”.
Data visualization :
Data visualization is the graphical representation of information and data in a pictorial or
graphical format (for example, charts, graphs, and maps). Data visualization tools provide an
accessible way to see and understand trends, patterns, and outliers in data. Data
visualization tools and technologies are essential for analyzing massive amounts of
information and making data-driven decisions. The concept of using pictures to
understand data has been used for centuries. General types of data visualization are
charts, tables, graphs, and maps.
METHODOLOGY FLOW
This chapter describes the overall design. It also describes each module that is to be
implemented.
SYSTEM ARCHITECTURE
Figure: System Architecture
Input images:
Accident dataset creation :
To create this dataset, a custom computer vision Python script is written to add
accidents to the input images, thereby creating an artificial (but still real-world applicable) dataset. This
method is actually a lot easier than it sounds once you apply landmarks to the problem.
Working :
First, we collect the data from CCTV cameras. Then we apply data preprocessing
techniques. After preprocessing the data, we train the system using a data set of accident
and non-accident images. After training is complete, we apply the YOLO algorithm to detect
objects. The video is divided into frames, and each frame is checked to determine
whether an accident has happened or not.
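The frame-by-frame check described above can be sketched independently of any particular detector. In the example below, `classify` is a hypothetical stand-in for whatever model scores a frame (it is not the project's YOLO or CNN code), and the frames are plain numbers so that the sketch runs without a video file.

```python
def detect_accident_frames(frames, classify, threshold=0.5):
    """Return the indices of frames whose accident score reaches the threshold."""
    return [i for i, frame in enumerate(frames) if classify(frame) >= threshold]


# Stand-in data: each "frame" is already an accident score between 0 and 1.
frames = [0.1, 0.2, 0.9, 0.8, 0.3]
flagged = detect_accident_frames(frames, classify=lambda frame: frame)
for index in flagged:
    print(f"frame {index}: accident")
```

In a real pipeline, `frames` would come from the video decoder and `classify` would wrap the trained model's prediction for one frame.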
ADVANTAGES OF PROPOSED SYSTEM:
APPLICATIONS:
Used in traffic.
Used on highways.
We can also use this at driving schools and vehicle exhibitions.
TOOLS AND TECHNIQUES
SOFTWARE REQUIREMENTS
Python
Pycharm
Python
In this project, Python is used as the programming language. In technical terms, Python
is an object-oriented, high-level programming language with integrated dynamic
semantics primarily for web and app development. It is extremely attractive in the field of
Rapid Application Development because it offers the dynamic typing and binding
options. Python is relatively simple and easy to learn, since it uses a syntax
that focuses on readability. Developers can read and translate Python code much more easily
than code in other languages. In turn, this reduces the cost of program maintenance and
development because it allows teams to work collaboratively without significant language
and experience barriers.
Python supports the use of modules and packages, which means that programs can be
designed in a modular style and code can be reused across a variety of projects. Once
you've developed a module or package you need, it can be scaled for use in other
projects, and it's easy to import or export these modules. One of the most promising
benefits of Python is that both the standard library and the interpreter are available free
of charge, in both binary and source form. There is
no exclusivity either, as Python and all the necessary tools are available on all major
platforms. Therefore, it is an enticing option for developers who don't want to worry
about paying high development costs.
If this description of Python is over your head, don't worry. You will understand it soon
enough. What you need to take away from this section is that Python is a programming
language used to develop software on the web and in app form, including mobile. It's
relatively easy to learn, and the necessary tools are available to all free of cost.
PYCHARM
PyCharm is the Python IDE by JetBrains, designed for professional Python developers.
Industry-leading code completion, code navigation, safe refactoring, and smart debugging
are just a few important features that contribute to make professional software
development a more productive and enjoyable experience. PyCharm Professional Edition
comes with wide support for Python web frameworks, modern JavaScript development,
as well as with advanced database tools and scientific tools integrations.
Pros:
"I have been using PyCharm since over 4 years now and It is one of the best IDEs out
there for python developers. Great product with auto-complete features."
"Also PyCharm have some tools to find errors and helps you to correct them. I also like
the style and colors very nice for me."
"The best all in one IDE out there, the python supporting features are great and it has a
many templates for different projects for ease of architecture."
"PyCharm is probably the best IDE for Python projects as it has so many Python
orientated features. Personally, I only use PyCharm for the occasional Python project
and I am very satisfied with that."
import cv2
import tensorflow as tf
import os
from skimage.transform import resize
import numpy as np
from PIL import Image

# Load the trained model and define the class names
model = tf.keras.models.load_model('model_train')
categories = ["Accident", "Non-Accident"]

main_dir = "test_frames"
for dirs in os.listdir(main_dir):
    test_dir = os.path.join(main_dir, dirs)
    print(test_dir)
    a = 0
    for img_t in os.listdir(test_dir):
return hasFrames
import cv2
from skimage.transform import resize
import cv2
from skimage.transform import resize
#adding first layer i.e convolutional layer, passing input_shape only in first layer
model.add(Conv2D(32, (5 , 5), activation="relu", input_shape=(128, 128, 1)))
model.add(MaxPooling2D(pool_size = (2 , 2)))
#training the model
train = model.fit(X, Y, batch_size = 256, epochs = 10, validation_split=0.2,
shuffle=True)
model.save('model_train')
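To check the layer dimensions implied by the `Conv2D` and `MaxPooling2D` calls shown above, the arithmetic can be worked through directly. The sketch assumes the Keras defaults are not overridden in the parts of the script that are not shown: no padding and stride 1 for the convolution, and a pooling stride equal to the pool size. The helper name `conv_output_size` is introduced for this example.

```python
def conv_output_size(size, kernel, stride=1, padding=0):
    """Spatial output size of a convolution or pooling window along one axis."""
    return (size + 2 * padding - kernel) // stride + 1


# 128x128 grayscale input, 5x5 convolution with 32 filters, then 2x2 max pooling.
after_conv = conv_output_size(128, kernel=5)
after_pool = conv_output_size(after_conv, kernel=2, stride=2)

# Trainable parameters of the first layer: one 5x5x1 kernel plus a bias per filter.
first_layer_params = (5 * 5 * 1) * 32 + 32

print(after_conv, after_pool, first_layer_params)
```

Under these assumptions the first convolution yields 124x124 feature maps and the pooling layer halves them to 62x62.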
Train_DIR = r"/home/sparsh/Desktop/Accident_Detection/Accident-Dataset/train"
Test_DIR = r"/home/sparsh/Desktop/Accident_Detection/Accident-Dataset/test"
categories = ["Accident", "Non-Accident"]
test_data = []
train_data = []
def getTrainData():
    i = 0
    for index, category in enumerate(categories):
        path = os.path.join(Train_DIR, category)
        for img in os.listdir(path):
            i += 1
            print(i)
            try:
                # Read as grayscale and resize to the 128x128 model input size
                image = cv2.imread(os.path.join(path, img), cv2.IMREAD_GRAYSCALE)
                image = resize(image, (128, 128))
                train_data.append([image, index])
            except Exception as e:
                pass
def getTestData():
    i = 0
    for index, category in enumerate(categories):
        path = os.path.join(Test_DIR, category)
        for img in os.listdir(path):
            i += 1
            print(i)
            try:
                # Read as grayscale and resize to the 128x128 model input size
                image = cv2.imread(os.path.join(path, img), cv2.IMREAD_GRAYSCALE)
                image = resize(image, (128, 128))
                test_data.append([image, index])
            except Exception as e:
                pass
print("Getting Train Data")
getTrainData()
random.shuffle(train_data)
random.shuffle(test_data)
print(type(train_features))
RESULTS
We take the input video from a source (a CCTV camera) and divide the video into
several frames. Each frame is checked to determine whether an accident has occurred.
If an accident has occurred, the system prints the image number with an "accident" message.
When we ran the system, we obtained this screenshot because an accident had occurred. The system
takes the video from CCTV cameras and cuts it into frames; each frame
is then checked using the YOLO algorithm. In this frame, two vehicles have
collided with each other.
When we ran the system, we obtained this screenshot because an accident had occurred. The system
takes the video from CCTV cameras and cuts it into frames; each frame
is then checked using the YOLO algorithm. In this frame, a vehicle is travelling in the wrong
direction, so an accident is about to happen; the system detects this as well.
Comment [1]: Accident occurred
When we ran the system, we obtained this screenshot because an accident had occurred. The system
takes the video from CCTV cameras and cuts it into frames; each frame
is then checked using the YOLO algorithm. In this frame, two vehicles have collided with each
other.
No of Accident occurred
In this picture, 4 images are displayed with the name "accident", because 4
accidents happened in that video.
CONCLUSION
FUTURE WORK
The proposed system deals with the detection of accidents. It can be extended
by providing medical aid to the victims at the accident spot. With further advances in technology,
accidents could also be avoided by providing alert systems that can stop the vehicle
before an accident occurs.