
A Project Review-2 Report

on
DETECTION OF ROAD LANE LINES

Submitted in partial fulfillment of the


requirement for the award of the degree of

Master of Computer Application with Data Science

Under The Supervision of


Name of Supervisor : Dr. Kavita
Assistant Professor

Submitted By

Mrinal Dev – 21132030292


Abhishek Kumar – 21132030394
Gautam Kumar - 21132030346

SCHOOL OF COMPUTING SCIENCE AND ENGINEERING
DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING
GALGOTIAS UNIVERSITY, GREATER NOIDA
INDIA
December, 2022

Abstract

Today there is a great deal of research on ADAS, covering everything from Lane Departure Warning (LDW) to full autonomous driving. However, there is still a need for research on integrating safety-critical and non-safety-critical applications on a mixed-criticality platform, where the two kinds of applications are isolated from each other using virtualization. For example, AUTOSAR, a software-development partnership founded by major players in the automotive industry, addresses mixed-criticality systems in the sense that it recognizes that the relevant standards must be supported on its platforms.
This thesis will investigate different techniques for road and lane detection and how they can be implemented on the real-time operating system (RTOS) of a mixed-criticality system.

In our proposed system we use Canny Edge Detection in place of the Simulink Edge Detection, in a recent and efficient implementation written in Python instead of MATLAB. Because Python is a scripting language well suited to statistical modelling, it supports fast execution of the mathematical functions used by the Canny Edge Detection technique and the YOLO algorithm. Secondly, we use the Hough transform space for 3-dimensional object detection, which can be faster and more accurate than single-dimension object detection.

Our proposed work plan consists of two modules carried out in three phases. For our proposed system, we use the Canny Edge Detection (with Gaussian blur) algorithm and the Hough transformation technique for the road lane detection project.
These are the software configurations that are required:
• Operating System: Windows 10 (incl. 64-bit)
• Language: Python 3
• IDE: Jupyter Notebook / Spyder 3
• Library: OpenCV

After the testing dataset is passed through the model, the output is evaluated using a confusion matrix and accuracy metrics. The model's output is compared against every image of the testing dataset and the true positives, false positives, true negatives and false negatives are recorded. From this confusion matrix the accuracy and precision scores are computed; this constitutes the evaluation, and the accuracy and precision of the model are recorded.

When we drive, our eyes guide us, and the lane lines on the road guide our steering. Using an algorithm to recognise lane lines is therefore one of the first steps in building a self-driving vehicle. The region of interest used for road detection must be adaptable: driving up or down a steep hill changes the horizon and the proportions of the frame. This study focuses on image processing and road recognition for self-driving cars, an area with considerable potential. We completed the implementation using road detection methods. Even if public opinion about the safety of self-driving cars has not changed, these cars are already safe and are becoming safer; only those who trust the technology will enjoy the luxury of computerised driving.

List of Figures
Figure No.    Figure Name            Page Number
1.            UML Diagram            7
2.            Data Flow Diagram      6

Table of Contents

Title                                              Page No.

Abstract                                           I
List of Tables                                     II
List of Figures                                    III
Chapter 1  Introduction                            1
    1.1  Digital Image Processing                  2
    1.2  Formulation of Problem                    3
    1.2.1  Tool and Technology Used
Chapter 2  Literature Survey/Project Design        5
    2.1  EDGE DETECTION
    2.2  FILTER OUT NOISE
    2.2.1  Convolution
    2.2  EXISTING SYSTEM

CHAPTER-1
Introduction

1.1 DIGITAL IMAGE PROCESSING


Image processing is a method to perform some operations on an image, in order to get an
enhanced image or to extract some useful information from it. It is a type of signal processing in
which input is an image and output may be image or characteristics/features associated with that
image. Nowadays, image processing is among rapidly growing technologies. It forms core
research area within engineering and computer science disciplines too.


If we talk about the basic definition of image processing then “Image processing is the analysis
and manipulation of a digitized image, especially in order to improve its quality”.

Digital Image: An image may be defined as a two-dimensional function f(x, y), where x and y are spatial (plane) coordinates, and the amplitude of f at any pair of coordinates (x, y) is called the intensity or grey level of the image at that point.

In other words, an image is nothing more than a two-dimensional matrix (3-D in the case of coloured images) defined by the mathematical function f(x, y), which gives the pixel value at any point of the image; the pixel value describes how bright that pixel is and what colour it should be.

Image processing is basically signal processing in which input is an image and output is image or
characteristics according to requirement associated with that image.
1.1.1 Steps in Image Processing:

Image processing basically includes the following three steps:

• Importing the image via image acquisition tools;


• Analysing and manipulating the image;
• Output in which result can be altered image or report that is based on image analysis.
There are two types of methods used for image processing namely, analogue and digital image
processing. Analogue image processing can be used for the hard copies like printouts and
photographs. Image analysts use various fundamentals of interpretation while using these visual
techniques. Digital image processing techniques help in the manipulation of digital images by using computers. The three general phases that all types of data have to undergo when using the digital technique are pre-processing, enhancement and display, and information extraction.
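As an illustration only, the three steps above can be sketched in Python with OpenCV; the file name road.jpg is a placeholder and not part of the actual project code:

import cv2

# Step 1: importing the image via an acquisition tool (here, reading a file).
# "road.jpg" is a placeholder file name.
image = cv2.imread("road.jpg")

# Step 2: analysing and manipulating the image (a grayscale conversion as an example).
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)

# Step 3: output, either an altered image or a report based on the analysis.
cv2.imwrite("road_gray.jpg", gray)
print("Mean brightness:", gray.mean())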
1.1.2 What is an Image?

An image is defined as a two-dimensional function, F (x, y), where x and y are spatial
coordinates, and the amplitude of F at any pair of coordinates (x,y) is called the intensity of that

image at that point. When x,y, and amplitude values of F are finite, we call it a digital image. In
other words, an image can be defined by a two-dimensional array specifically arranged in rows
and columns. A digital image is composed of a finite number of elements, each of which has a particular value at a particular location. These elements are referred to as picture elements, image elements, and pixels. "Pixel" is the term most widely used to denote the elements of a digital image.
1.1.3 Image in Matrix Representation

As we know, images are represented in rows and columns; a digital image can therefore be written as the matrix

    f(x, y) = [ f(0,0)      f(0,1)      ...  f(0,N-1)
                f(1,0)      f(1,1)      ...  f(1,N-1)
                ...
                f(M-1,0)    f(M-1,1)    ...  f(M-1,N-1) ]

The right-hand side of this equation is the digital image by definition. Every element of this matrix is called an image element, picture element, or pixel.
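A minimal NumPy sketch (illustrative values only) shows how such a matrix is stored and how an individual pixel is addressed:

import numpy as np

# A tiny 3x3 grayscale "image": every matrix element is one pixel value.
f = np.array([[0,   128, 255],
              [64,  192,  32],
              [255,   0, 100]], dtype=np.uint8)

print(f.shape)   # (rows, columns) -> (3, 3)
print(f[1, 2])   # pixel value at row 1, column 2 -> 32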
1.1.4 Types of an Image

Binary Image – A binary image, as its name suggests, contains only two pixel values, 0 and 1, where 0 refers to black and 1 refers to white. This image is also known as monochrome.
Black and White Image – An image which consists of only black and white colour is called a black and white image.
8-Bit Colour Format – This is the best-known image format. It has 256 different shades of colour and is commonly known as a grayscale image. In this format, 0 stands for black, 255 stands for white, and 127 stands for grey.
16-Bit Colour Format – This is a colour image format. It has 65,536 different colours and is also known as high colour format. In this format the distribution of colour is not the same as in a grayscale image. A 16-bit format is actually divided into three further components, red, green and blue: the familiar RGB format.
1.2 SELF DRIVING CARS

A self-driving car (sometimes called an autonomous car or driverless car) is a vehicle that
uses a combination of sensors, cameras, radar and artificial intelligence (AI) to travel between
destinations without a human operator. To qualify as fully autonomous, a vehicle must be able to
navigate without human intervention to a predetermined destination over roads that have not
been adapted for its use.
Companies developing and/or testing autonomous cars include Audi, BMW, Ford, Google, General Motors, Tesla, Volkswagen and Volvo. Google's test involved a fleet of self-driving cars -- including Toyota Prii and an Audi TT -- navigating over 140,000 miles of California streets and highways.

Levels of autonomy in self-driving cars


The U.S. National Highway Traffic Safety Administration (NHTSA) lays out six levels of
automation, beginning with zero, where humans do the driving, through driver assistance
technologies up to fully autonomous cars. Here are the five levels that follow zero automation:

Level 1: An advanced driver assistance system (ADAS) aids the human driver with steering, braking or accelerating, though not simultaneously. ADAS includes rear-view cameras and features like a vibrating seat warning to alert drivers when they drift out of the travelling lane.

Level 2: An ADAS that can steer and either brake or accelerate simultaneously while the
driver remains fully aware behind the wheel and continues to act as the driver.

Level 3: An automated driving system (ADS) can perform all driving tasks under certain
circumstances, such as parking the car. In these circumstances, the human driver must be ready
to re-take control and is still required to be the main driver of the vehicle.

Level 4: An ADS is able to perform all driving tasks and monitor the driving environment
in certain circumstances. In those circumstances, the ADS is reliable enough that the human
driver needn't pay attention.

Level 5: The vehicle's ADS acts as a virtual chauffeur and does all the driving in all
circumstances. The human occupants are passengers and are never expected to drive the vehicle.

1.3 WORKING OF SELF DRIVING CARS

As the car drives, lane lines are drawn and the radius of curvature is calculated to help the car steer. It is cheap to equip cars with a front-facing camera, much cheaper than RADAR or LIDAR. Once we get a camera image from the front-facing camera of the self-driving car, we make several modifications to it. The steps I followed are detailed below:

1.3.1 Distortion correction

Image distortion occurs when a camera looks at 3D objects in the real world and
transforms them into a 2D image. This transformation isn’t always perfect and distortion can
result in a change in apparent size, shape or position of an object. So we need to correct this
distortion to give the camera an accurate view of the image. This is done by computing a
camera calibration matrix from several pictures of a chessboard taken with the camera.

See example below of a distortion corrected image. Please note that the correction is very
small in normal lenses and the difference isn’t visible much to the naked eye.

Figure 1.a: Describing the original image and undistorted image
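A hedged sketch of this calibration step using OpenCV is given below; the 9x6 chessboard size, the camera_cal folder and the file names are assumptions for illustration, not the actual project configuration:

import cv2
import numpy as np
import glob

# Hypothetical 9x6 inner-corner chessboard; adjust to the actual calibration target.
nx, ny = 9, 6
objp = np.zeros((nx * ny, 3), np.float32)
objp[:, :2] = np.mgrid[0:nx, 0:ny].T.reshape(-1, 2)

objpoints, imgpoints = [], []   # 3D points in the world, 2D points in the image
for fname in glob.glob("camera_cal/*.jpg"):   # assumed folder of chessboard photos
    gray = cv2.cvtColor(cv2.imread(fname), cv2.COLOR_BGR2GRAY)
    found, corners = cv2.findChessboardCorners(gray, (nx, ny), None)
    if found:
        objpoints.append(objp)
        imgpoints.append(corners)

# Camera calibration matrix and distortion coefficients from all chessboard views.
ret, mtx, dist, rvecs, tvecs = cv2.calibrateCamera(
    objpoints, imgpoints, gray.shape[::-1], None, None)

# Undistort any road image taken with the same camera.
undistorted = cv2.undistort(cv2.imread("road.jpg"), mtx, dist, None, mtx)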

1.3.2 Create a binary image

Now that we have the undistorted image, we can start our analysis. We need to explore
different schemes so that we can clearly detect the object of interest on the road, in this case lane
lines while ignoring the rest. I did this in two ways:

• Using Sobel operator to compute x-gradient


The gradient of an image can be used to identify sharp changes in colour in a black and white
image. It is a very useful technique to detect edges in an image. For
the image of a road, we usually have a lane line in either yellow or white on a black road and so
x-gradient can be very useful.
• Explore other colour channels
The HSV (hue, saturation and value) colour space can be very useful in isolating the yellow and white lane lines, because it separates colour (hue), amount of colour (saturation) and brightness (value). We can use the S colour channel of the image.
• Birds Eye View Image
After the thresholding operation, we perform a perspective transform to change the image to
bird’s eye view. This is done because from this top view we can identify the curvature of the lane
and decide how to steer the car. To perform the perspective transform, I identified 4 source
points that form a trapezoid on the image and 4 destination points such that lane lines are parallel
to each other after the transformation. The destination points were chosen by trial and error, but once chosen they work well for all images and the video, since the camera is mounted in a fixed
position. OpenCV can be used to perform this. See how clearly the curvature of the lane lines is
visible in this view.
• Fit curve lines to the bird's-eye view image
In order to better estimate where the lane is, we use a histogram of the bottom half of the image to identify potential left and right lane markings. Once the initial left and right lane bottom points are identified, this step is modified to narrow down the area in which the left and right lanes can exist, so that highway lane separators or any other noise does not get identified as a lane.
• Plot the result identified by the system clearly. This plotting can be done by filling the lane area with a transparent colour using OpenCV.
This is how a self-driving car works, and road detection of this kind can be used to detect the road in an image captured from the car; a short OpenCV sketch of the thresholding and perspective-transform steps described above follows.
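The following is a minimal sketch, using OpenCV and NumPy, of the Sobel x-gradient threshold, the saturation-channel threshold and the perspective transform described above; all threshold values and source/destination points are illustrative assumptions, as is the input file name:

import cv2
import numpy as np

img = cv2.imread("road_undistorted.jpg")           # assumed distortion-corrected frame
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

# x-gradient via the Sobel operator, scaled to 0-255 and thresholded.
sobelx = np.absolute(cv2.Sobel(gray, cv2.CV_64F, 1, 0))
sobelx = np.uint8(255 * sobelx / np.max(sobelx))
grad_binary = (sobelx >= 20) & (sobelx <= 100)      # illustrative thresholds

# S channel of the HSV colour space, as described above, for the lane paint.
s_channel = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)[:, :, 1]
color_binary = (s_channel >= 170) & (s_channel <= 255)

combined = np.zeros_like(gray)
combined[grad_binary | color_binary] = 255

# Perspective transform to a bird's-eye view; the source/destination points
# are placeholders chosen by trial and error for one fixed camera position.
h, w = gray.shape
src = np.float32([[w * 0.45, h * 0.63], [w * 0.55, h * 0.63], [w * 0.9, h], [w * 0.1, h]])
dst = np.float32([[w * 0.2, 0], [w * 0.8, 0], [w * 0.8, h], [w * 0.2, h]])
M = cv2.getPerspectiveTransform(src, dst)
birds_eye = cv2.warpPerspective(combined, M, (w, h))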

1.4 MOTIVATION FOR THE WORK

In the past five years, autonomous driving has gone from “maybe possible” to “definitely
possible” to “inevitable” to “how did anyone ever think this wasn’t inevitable?” to "now
commercially available." In December 2018, Waymo, the company that emerged from Google’s
self-driving-car project, officially started its commercial self-driving-car service in the suburbs of
Phoenix. The details of the program—it's available only to a few hundred vetted riders, and
human safety operators will remain behind the wheel—may be underwhelming but don't erase its
significance. People are now paying for robot rides. And it's just a start. Waymo will expand the
service's capability and availability over time. Meanwhile, its onetime monopoly has evaporated.
Smaller start-ups like May Mobility and Drive.ai are running small-scale but revenue-generating
shuttle services. Every significant automaker is pursuing the tech, eager to rebrand and rebuild
itself as a “mobility provider” before the idea of car ownership goes kaput. Ride-hailing
companies like Lyft and Uber are hustling to dismiss the profit-gobbling human drivers who now
shuttle their users about. Tech giants like Apple, IBM, and Intel are looking to carve off their
slice of the pie. Countless hungry start-ups have materialized to fill niches in a burgeoning
ecosystem, focusing on laser sensors, compressing mapping data, setting up service centres, and
more. This 21st-century gold rush is motivated by the intertwined forces of opportunity and
survival instinct. By one account, driverless tech will add $7 trillion to the global economy and
save hundreds of thousands of lives in the next few decades.
Simultaneously, it could devastate the auto industry and its associated gas stations, drive-thrus, taxi drivers, and truckers. Some people will prosper. Most will benefit. Many will be left behind.
It’s worth remembering that when automobiles first started rumbling down manure-clogged
streets, people called them horseless carriages. The moniker made sense: Here were vehicles that
did what carriages did, minus the hooves. By the time “car” caught on as a term, the invention
had become something entirely new. Over a century, it reshaped how humanity moves and thus
how (and where and with whom) humanity lives. This cycle has restarted, and the term
“driverless car” will soon seem as anachronistic as “horseless carriage.” We don’t know how
cars that don’t need human chauffeurs will mold society, but we can be sure a similar gear shift
is on the way. Just over a decade ago, the idea of being chauffeured around by a string of zeros
and ones was ludicrous to pretty much everybody who wasn’t at an abandoned Air Force base
outside Los Angeles, watching a dozen driverless cars glide through real traffic. That event was
the Urban Challenge, the third and final competition for autonomous vehicles put on by Darpa,
the Pentagon’s skunkworks arm. At the time, America’s military-industrial complex had already
thrown vast sums and years of research trying to make unmanned trucks. It had laid a foundation
for this technology, but stalled when it came to making a vehicle that could drive at practical
speeds, through all the hazards of the real world. So, Darpa figured, maybe someone else—
someone outside the DOD’s standard roster of contractors, someone not tied to a list of detailed

requirements but striving for a slightly crazy goal—could put it all together. It invited the whole
world to build a vehicle that could drive across California’s Mojave Desert, and whoever’s robot
did it the fastest would get a million-dollar prize. The most successful vehicle went just seven
miles. Most crashed, flipped, or rolled over within sight of the starting gate. But the race created
a community of people—geeks, dreamers, and lots of students not yet jaded by commercial
enterprise—who believed the robot drivers people had been craving for nearly forever were
possible, and who were suddenly driven to make them real.

They came back for a follow-up race in 2005 and proved that making a car drive itself was
indeed possible: Five vehicles finished the course. By the 2007 Urban Challenge, the vehicles
were not just avoiding obstacles and sticking to trails but following traffic laws, merging,
parking, even making safe, legal U-turns.

When Google launched its self-driving car project in 2009, it started by hiring a team of
Darpa Challenge veterans. Within 18 months, they had built a system that could handle some of
California’s toughest roads (including the famously winding block of San Francisco’s Lombard
Street) with minimal human involvement. A few years later, Elon Musk announced Tesla would
build a self-driving system into its cars. And the proliferation of ride-hailing services like Uber
and Lyft weakened the link between being in a car and owning that car, helping set the stage for
a day when actually driving that car falls away too. In 2015, Uber poached dozens of scientists
from Carnegie Mellon University—a robotics and artificial intelligence powerhouse—to get its
effort going.

1.5 PROBLEM STATEMENT

Given an image captured from a camera attached to a vehicle moving on a road, where the captured road may or may not be well levelled, may lack clearly delineated edges, or may have no previously known patterns on it, road detection from a single image can be applied to find the road in that image. The result can then be used as part of an automated driving system to keep the vehicle on the correct road. In this process of finding the road in the image captured by the vehicle, we use algorithms for vanishing point detection using the Hough transform space, finding the region of interest, edge detection using the Canny edge detection algorithm, and then road detection. We use thousands of images of different roads to train our model, so that the model can detect the road present in a new image captured by the vehicle.

CHAPTER-2
Literature Survey
2.1 EDGE DETECTION
Edges characterize boundaries and are therefore a problem of fundamental
importance in image processing. Edges in images are areas with strong intensity contrasts –
a jump in intensity from one pixel to the next. Detecting the edges in an image significantly reduces the amount of data and filters out useless information, while preserving the important structural properties of the image.

The Canny edge detection algorithm is also known as the optimal edge detector. Canny's intention was to improve upon the many edge detectors already in use.

• The first criterion is a low error rate: unwanted information is filtered out while the useful information is preserved.
• The second criterion is to keep the variation between the original image and the processed image as low as possible.
• The third criterion is to remove multiple responses to a single edge.

Based on these criteria, the canny edge detector first smoothens the image to eliminate
noise. It then finds the image gradient to highlight regions with high spatial derivatives. The
algorithm then tracks along these regions and suppresses any pixel that is not at the maximum
using non-maximum suppression. The gradient array is then further reduced by hysteresis to remove streaking and to thin the edges.
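As a hedged illustration, the whole sequence can be run with OpenCV, which bundles the gradient, non-maximum suppression and hysteresis stages into a single call; the 50/150 thresholds and the file names are assumptions that depend on the image:

import cv2

gray = cv2.imread("road.jpg", cv2.IMREAD_GRAYSCALE)

# Smooth first to suppress noise, then let cv2.Canny perform the gradient,
# non-maximum suppression and hysteresis-thresholding stages described above.
blurred = cv2.GaussianBlur(gray, (5, 5), 0)
edges = cv2.Canny(blurred, 50, 150)
cv2.imwrite("edges.jpg", edges)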

2.2 FILTER OUT NOISE

2.2.1 Convolution
The first step in Canny edge detection requires some method of filtering out noise while still preserving the useful image content. Convolution is a simple mathematical operation that underlies many common image-processing operators.

Figure 2.a: An example small image (left), kernel (right)
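A small illustrative sketch of this convolution step with a Gaussian kernel in OpenCV; the kernel size and sigma are assumptions:

import cv2
import numpy as np

image = cv2.imread("road.jpg", cv2.IMREAD_GRAYSCALE)

# Build a 5x5 Gaussian kernel and convolve it with the image: the weighted
# neighbourhood average suppresses noise while keeping larger structures.
g = cv2.getGaussianKernel(5, 1.0)     # 5x1 column of Gaussian weights
kernel = g @ g.T                      # outer product -> 5x5 kernel
smoothed = cv2.filter2D(image, -1, kernel)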

2.2 EXISTING SYSTEM
The current existing system is only suitable for use in ideal road conditions, such as a runway. It cannot be used on general roads because the edge detection used until now has been Simulink Edge Detection, which is implemented in MATLAB. Secondly, in the current system the Hough transform space is used only for angle rotation, and the system has a very limited road dataset for detecting objects in a single dimension of an image.

2.3 LIMITATIONS OF EXISTING SYSTEM


The Hough transform is only efficient if a high number of votes fall in the right bin, so
that the bin can be easily detected amid the background noise. This means that the bin must not
be too small, or else some votes will fall in the neighboring bins, thus reducing the visibility of
the main bin.
Also, when the number of parameters is large (that is, when we are using the Hough transform
with typically more than three parameters), the average number of votes cast in a single bin is
very low, and those bins corresponding to a real figure in the image do not necessarily appear to
have a much higher number of votes than their neighbors. The complexity increases at a rate of
O(A^(m-2)) with each additional parameter, where A is the size of the image space and m is the
number of parameters. (Shapiro and Stockman, 310) Thus, the Hough transform must be used
with great care to detect anything other than lines or circles.
Finally, much of the efficiency of the Hough transform is dependent on the quality of the input
data: the edges must be detected well for the Hough transform to be efficient. Use of the Hough
transform on noisy images is a very delicate matter and generally, a denoising stage must be
used before. In the case where the image is corrupted by speckle, as is the case in radar images,
the Radon transform is sometimes preferred to detect lines, because it attenuates the noise
through summation.

2.4 HOUGH TRANSFORM SPACE


The Hough transform is a feature extraction technique used in image analysis,
computer vision, and digital image processing. The purpose of the technique is to find
imperfect instances of objects within a certain class of shapes by a voting procedure. This
voting procedure is carried out in a parameter space, from which object candidates are
obtained as local maxima in a so-called accumulator space that is explicitly constructed by
the algorithm for computing the Hough transform.
The classical Hough transform was concerned with the identification of lines in the image,
but later the Hough transform has been extended to identifying positions of arbitrary shapes,
most commonly circles or ellipses. The Hough transform as it is universally used today was
invented by Richard Duda and Peter Hart in 1972, who called it a "generalized Hough

transform" after the related 1962 patent of Paul Hough. The transform was popularized in
the computer vision community by Dana H. Ballard through a 1981 journal article titled "Generalizing the Hough transform to detect arbitrary shapes".
2.4.1 Theory of Hough Transform Space
In the automated analysis of digital images, a sub-problem that often arises is detecting simple shapes, such as straight lines, circles or ellipses. In many cases an edge detector can be used as a pre-processing stage to obtain image points or image pixels that lie on the desired curve in the image space. Due to imperfections in either the image data or the edge detector, however, there may be missing points or pixels on the desired curves, as well as spatial deviations between the ideal line/circle/ellipse and the noisy edge points obtained from the edge detector. For these reasons, it is often non-trivial to group the extracted edge features into an appropriate set of lines, circles or ellipses. The purpose of the Hough transform is to address this problem by making it possible to group edge points into object candidates through an explicit voting procedure over a set of parameterized image objects (Shapiro and Stockman, 304). The simplest case of the Hough transform is detecting straight lines. In general, the straight line y = mx + b can be represented as a point (b, m) in the parameter space. However, vertical lines pose a problem: they would give rise to unbounded values of the slope parameter m. Thus, for computational reasons, Duda and Hart proposed the use of the Hesse normal form, r = x cos θ + y sin θ, where r is the distance from the origin to the closest point on the line and θ is the angle between the x-axis and the line connecting the origin with that closest point.

Figure 2.i: Hesse Normal Form Graph

It is therefore possible to associate with each line of the image a pair (r, θ). The (r, θ) plane is sometimes referred to as Hough space for the set of straight lines in two dimensions. This representation makes the Hough transform conceptually very close to the two-dimensional Radon transform (they can be seen as different ways of looking at the same transform).

Given a single point in the plane, the set of all straight lines going through that point corresponds to a sinusoidal curve in the (r, θ) plane, which is unique to that point. A set of two or more points that form a straight line will produce sinusoids which cross at the (r, θ) for that line. Thus, the problem of detecting collinear points can be converted to the problem of finding concurrent curves.
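To make the voting idea concrete, the following NumPy sketch builds a tiny (r, θ) accumulator by hand; it is purely illustrative and not the implementation used in the project:

import numpy as np

def hough_accumulator(edge_points, height, width, n_theta=180):
    # Quantize theta into n_theta bins and r into 2*diag bins (r can be negative).
    thetas = np.deg2rad(np.arange(n_theta) * 180.0 / n_theta)
    diag = int(np.ceil(np.hypot(height, width)))
    accumulator = np.zeros((2 * diag, n_theta), dtype=np.int32)
    for x, y in edge_points:
        for t_idx, theta in enumerate(thetas):
            r = int(round(x * np.cos(theta) + y * np.sin(theta)))
            accumulator[r + diag, t_idx] += 1      # one vote per (r, theta) cell
    return accumulator, diag

# Three collinear points on the line y = x all vote for the same cell,
# (r = 0, theta = 135 degrees), so that cell accumulates 3 votes.
acc, diag = hough_accumulator([(1, 1), (2, 2), (3, 3)], height=5, width=5)
print(acc[diag + 0, 135])   # -> 3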

2.5 YOLO Algorithm


Compared to the approach taken by object detection algorithms before YOLO, which repurpose
classifiers to perform detection, YOLO proposes the use of an end-to-end neural network that makes
predictions of bounding boxes and class probabilities all at once.

2.5.1 YOLO limitations


• Although YOLO does seem to be the best algorithm to use if you have an object detection problem to solve, it comes with several limitations.
• YOLO struggles to detect and separate small objects that appear in groups in an image, as each grid cell is constrained to detect only a single object. Small objects that naturally come in groups, such as a line of ants, are therefore hard for YOLO to detect and localize.
• YOLO is also characterized by lower accuracy when compared to much slower object detection algorithms like Fast R-CNN.
2.15 PROPOSED SYSTEM
In our proposed system we use Canny Edge Detection in place of the Simulink Edge Detection, in a recent and efficient implementation written in Python instead of MATLAB. Because Python is a scripting language well suited to statistical modelling, it supports fast execution of the mathematical functions used by the Canny Edge Detection technique. Secondly, we use the Hough transform space for 3-dimensional object detection, which can be faster and more accurate than single-dimension object detection.

CHAPTER-3
Methodology /Implementation

3.1 IMAGE PROCESSING METHODOLOGY

Digital image processing consists of the manipulation of images using digital computers.
Its use has been increasing exponentially in the last decades. Its applications range from

medicine to entertainment, including geological processing and remote sensing. Multimedia
systems, one of the pillars of the modern information society, rely heavily on digital image
processing.

The discipline of digital image processing is a vast one, encompassing digital signal
processing techniques as well as techniques that are specific to images. An image can be
regarded as a function f (x, y) of two continuous variables x and y. To be processed digitally, it
has to be sampled and transformed into a matrix of numbers. Since a computer represents the
numbers using finite precision, these numbers have to be quantized to be represented digitally.
Digital image processing consists of the manipulation of those finite precision numbers. The
processing of digital images can be divided into several classes: image enhancement, image
restoration, image analysis, and image compression. In image enhancement, an image is
manipulated, mostly by heuristic techniques, so that a human viewer can extract useful
information from it. Image restoration techniques aim at processing corrupted images from
which there is a statistical or mathematical description of the degradation so that it can be
reverted. Image analysis techniques permit that an image be processed so that information can be
automatically extracted from it. Examples of image analysis are image segmentation, edge
extraction, and texture and motion analysis. An important characteristic of images is the huge
amount of information required to represent them. Even a grey-scale image of moderate
resolution, say 512 × 512, needs 512 × 512 × 8 ≈ 2 × 10^6 bits for its representation. Therefore, to
be practical to store and transmit digital images, one needs to perform some sort of image
compression, whereby the redundancy of the images is exploited for reducing the number of bits
needed in their representation.

Digital image processing is the processing of images by a computer. Digital image processing can
be defined as subjecting a numerical representation of an object to a series of operations in order
to obtain a desired result. Digital image processing consists of the conversion of a physical
image into a corresponding digital image and the extraction of significant information from the
digital image by applying various algorithms. Digital image processing mainly includes image
collection, image processing, and image analysis. At its most basic level, a digital image
processing system is comprised of three components, i.e., a computer system on which to
process images, an image digitizer, and an image display device. Physical images are divided

into small areas called pixels. The division scheme most often used is the rectangular sampling grid method, in which an image is segmented into many horizontal lines composed of adjacent pixels, and the value of each pixel position reflects the brightness of the corresponding point in the physical image. Physical images cannot be directly analysed by a computer, because the computer can only process digits rather than images, so an image must be converted into digital form before being processed by a computer. This conversion process is called digitization.

At each pixel position, the brightness is sampled and quantized to obtain an integer value
indicating the brightness of the corresponding position in the image. After the conversion of all
pixels of an image is completed, the image can be represented by a matrix of integers. Each pixel
has two attributes: position and grey level. The position is determined by the two coordinates of the sampling point in the scanning line, namely row and column. The integer indicating the brightness of the pixel position is called the grey level. Images displayed as a digital matrix are called digital images, and all digital image processing is based on this digital matrix. The digital matrix is the object of the processing, where

f(i, j) = the grey level of pixel (i, j).

On the basis of image processing, it is necessary to separate objects from images by pattern recognition technology, and then to identify and classify these objects through technologies provided by statistical decision theory. When an image includes several objects, the pattern recognition process consists of three phases.

The first phase comprises image segmentation and object separation. In this phase, the different objects are detected and separated from the background. The second phase is feature extraction, in which the objects are measured. Measuring features means quantitatively estimating some important characteristics of the objects, and a group of these features is combined into a feature vector during feature extraction. The third phase is classification; its output is a decision determining the category to which every object belongs. For pattern recognition, therefore, the inputs are images and the outputs are object types and a structural analysis of the images. The structural analysis is a description of the images that allows the important information in them to be correctly understood and judged.

Image processing is the application of a set of techniques and algorithms to a digital
image to analyse, enhance, or optimize image characteristics such as sharpness and contrast.
Most image processing techniques involve treating the image as either a signal or a matrix and
applying standard signal-processing or matrix manipulation techniques, respectively, to it. A
pixel or “picture element” is the smallest sample of a two-dimensional image that can be
programmatically controlled. The number of pixels in an image controls the resolution of the
image. The pixel value typically represents its intensity in terms of shades of grey (value 0– 255)
in a grayscale image or RGB (red, green, blue, each 0–255) values in a colour image. A voxel or
“volumetric pixel” is the three-dimensional counterpart of the 2D pixel. It represents a single
sample on a three-dimensional image grid. Similar to pixels, the number of voxels in a 3D
representation of an image controls its resolution. The spacing between voxels depends on the
type of data and its intended use. In a 3D rendering of medical images such as CT scans and
MRI scans, the size of a voxel is defined by the pixel size in each image slice and the slice
thickness. The value stored in a voxel may represent multiple values. In CT scans, it is often the
Hounsfield unit which can then be used to identify the type of tissue represented. In MRI
volumes, this may be the weighting factor (T1, T2, T2*, etc.). Image arithmetic is usually
performed at pixel level and includes arithmetic as well as logical operations applied to
corresponding points on two or more images of equal size. Geometric transformations can be
applied to digital images for translation, rotation, scaling, and shearing, as required. Matrix
transformation algorithms are typically employed in this case. For binary and grayscale images,
various morphological operations such as image opening and closing, skeletonization, dilation,
erosion, and so on, may also be employed for pattern matching or feature extraction.

An image histogram represents the distribution of image intensity values for an input
digital image. Histogram manipulation is often used to modify image contrast or for image
segmentation when the range of values for the desired feature is clearly definable. Some
common image processing applications are introduced as follows. Feature extraction is an area
of image processing where specific characteristics within an input image are isolated using a set
of algorithms. Some commonly used methods for this include contour tracing, thresholding, and
template matching. Image segmentation is a common application of feature extraction which is
often used with medical imaging to identify anatomical structures. Pattern and template matching
is useful in applications ranging from feature extraction to image substitution. It is also used with

face and character recognition and is one of the most commonly used image processing
applications. There are several image processing software packages available, from freely
distributed ones such as ImageJ to expensive suites such as MATLAB and Avizo which range in
functionality and targeted applications. We’ll discuss only a few of the commonly used ones
within medical physics/clinical engineering here. The image format most commonly used in
medical applications is DICOM, providing a standardized structure for medical image
management and exchange between different medical applications (see Chapter 8, “DICOM”).
ImageJ is an open source, Java-based image processing program developed at the National
Institute of Health. It provides various built-in image acquisition, analysis, and processing
plugins as well as the ability to build your own using ImageJ’s builtin editor and a Java compiler.
User-written plugins make it possible to solve many bespoke image processing and analysis
problems. ImageJ can display, edit, analyse, process, save, and print 8-bit colour and grayscale,
16-bit integer and 32-bit floating point images. It can read many standard image formats as
well as raw formats. It is multithreaded, so time-consuming operations can be performed in
parallel on multi-CPU hardware. It has built-in routines for most common image manipulation
operations in the medical field including processing of DICOM images and image stacks such as
those from CT and MRI.

Mimics is an image processing software for 3D design and modelling, developed by


Materialise NV. It is used to create 3D surface models from stacks of 2D image data. These 3D
models can then be used for a variety of engineering applications. Mimics calculates surface 3D
models from stacked image data such as CT, micro-CT, CBCT, MRI, confocal microscopy, and
ultrasound, through image segmentation. The region of interest (ROI) selected in the
segmentation process is converted to a 3D surface model using an adapted marching cubes
algorithm that takes the partial volume effect into account, leading to very accurate 3D models.
The 3D files are represented in the STL format.

The most common input format is DICOM, but other image formats such as TIFF, JPEG,
BMP, and raw are also supported. Output file formats differ, depending on the subsequent
application, but common 3D output formats include STL, VRML, PLY, and DXF. Mimics
provides a platform to bridge stacked image data to a variety of different medical engineering

applications such as finite element analysis (FEA; see the next section), computer aided design
(CAD), rapid prototyping, and so on.

MATLAB is a programming environment for algorithm development, data analysis,


visualization, and numerical computation. It has a wide range of applications, including signal
and image processing, communications, control design, test and measurement, financial
modelling and analysis, and computational biology. The MATLAB Image Processing Toolbox™
provides a comprehensive set of reference-standard algorithms and graphical tools for image
processing, analysis, visualization, and algorithm development. It also has built-in support for
DICOM images and provides various functions to manipulate DICOM data sets. This makes it a
widely used tool in various medical physics/clinical engineering research groups and related
academia.

IDL is a cross-platform vectorised programming language used for interactive processing


of large amounts of data including image processing. IDL also includes support for medical
imaging via the IDL DICOM Toolkit add-on module. The image processing software packages
mentioned here are but a few of the commonly used ones within a medical physics/clinical
engineering environment partly due to their extensive libraries for medical image processing and
partly for historical reasons. There are many more free-for-use as well as commercial software
packages available providing varying degrees of functionality for different applications and, if
there is a choice available, it is advisable to explore the options available for a particular task.

3.2 ARCHITECTURE
The system architecture for road detection from a single image using computer vision consists mainly of the image that is sent to the model and the output, which consists of the marked road detections. The process starts by selecting a suitable image captured by the driving camera of a self-driving car. This image should contain all of the details of the scene, including the road that the computer is to detect, and it is passed to the model. The model consists mainly of an edge detection model and a road line detection model. The edge detection model is obtained by implementing the Canny edge detection algorithm, and the road line detection model is obtained by training the Hough transform space algorithm. Canny edge detection consists of modules such as the Gaussian blur algorithm for noise reduction, smoothing of the image, gradient calculation, non-maximum suppression, double thresholding and edge tracking by hysteresis. All of these steps are combined to form an edge detection model following Canny's process. The selected image is first sent to the Canny model, and edges are found in the image. The edge-detected image is then sent to the road line detection model, which is built using the Hough transform space algorithm. The Hough transform space algorithm normalizes the input image and then varies the value of θ in the normalized trigonometric line equation, thereby detecting the required road lines in the image. Only the image containing edges is sent into the Hough transform space algorithm, because an image with more noise takes far longer to process than an image consisting only of edges. The Hough transform space uses the road line dataset to train itself to recognize a calculated line as a road line. This system architecture is thus used to detect the road lines in an image. Once the required output is obtained for a single selected image, the testing dataset is sent into the model to obtain the actual output for all of the images. This architecture is thus used to build a model that can detect the road in an image. When the model has been evaluated manually, the process either stops or continues by improving the dataset used by the Hough transform space algorithm, and this is repeated until the required output is observed from the model.

To build the architecture required by the project, we use the incremental process model, in which we test each prototype and then combine it with the actual model once a correct output is observed. Each prototype is built alongside a model component and then merged with the overall model. In this way the project is built using the incremental model.

20 | P a g e
Figure 3.a: Image showing the architecture of the project: Road Detection from an Image using Computer Vision
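A compact sketch of this architecture in Python is shown below; the two wrapper functions and all parameter values are hypothetical illustrations of the flow, while the concrete calls are sketched in the module sections later in this chapter:

import cv2
import numpy as np

# Hypothetical wrappers for the two models described above.
def canny_edge_model(frame):
    # grayscale -> Gaussian blur -> Canny edge map
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    return cv2.Canny(cv2.GaussianBlur(gray, (5, 5), 0), 50, 150)

def hough_line_model(edge_map):
    # edge map -> candidate road lane segments (parameters are illustrative)
    return cv2.HoughLinesP(edge_map, 2, np.pi / 180, 100,
                           minLineLength=40, maxLineGap=5)

frame = cv2.imread("test_image.jpg")            # image from the driving camera
lane_lines = hough_line_model(canny_edge_model(frame))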

3.3 CANNY EDGE DETECTION ARCHITECTURE:

This process consists mainly of four functions. The image can be of any type that contains changes in intensity; the changes in pixel intensity in an image define its edges, and Canny edge detection focuses on these intensity changes. A transition in pixel intensity from high to low is treated as an edge. First, the colour image is converted into a black-and-white image and passed to a smoothing step. We use Gaussian blur as the smoothing technique, followed by gradient calculation, non-maximum suppression and double thresholding. Edge detection mainly uses the derivatives of the pixel intensities in an image and then reduces the complexity of the image. An edge is detected where the intensity changes from high to low, that is, from white shades to black shades in a grayscale image. A grayscale image is used because it is easier to process than a coloured image. The gradient calculation step is used to compute θ through Sobel filters. Non-maximum suppression is the process of thinning the edges that should appear in the required output image. Finally, double thresholding is applied to intensify the strong pixels of the output image and to suppress the intensity of the weaker pixels.

Thus, Canny edge detection is used and its architecture is built.

Figure 3.b: Image showing Architecture of Canny Edge Detection

The resultant image obtained through Canny edge detection is sent to the Hough transform space algorithmic model as an input, which then produces the required output.

3.4 PROJECT MODULES


Road detection from a single image using computer vision consists of image insertion, model building and then testing. The model evaluation is done manually by the developer. We divided the complete project into five main modules. They are as follows:

Module 1: Selecting the appropriate testing image

Module 2: Preprocessing the selected image

Module 3: Edge Detection Implementation

Module 4: Hough Transformation

Module 5: Evaluating the output

Module 1: Selecting the appropriate testing image

This is the most important process in the project. A single image from the testing dataset is selected in such a way that it can pass through our implementation of the model. Each model we implement takes the resulting image as an input and processes it further to produce an output. This selection of the image is important because the implementation of each model requires an image input for processing, and only if the processing succeeds is an output produced. If the output produced for the testing image matches what is required, the resulting image is passed on to the next stage of development. In order to observe a clear output, the most suitable image should be selected, such that the testing image is able to produce the clear required output at the end of all the processing steps.

Module 2: Pre-processing the selected image

Pre-processing plays a major role in producing the required output within a reasonable amount of time. Pre-processing of the selected image mainly involves grayscale conversion and smoothing, which together form the first stage of Canny's process. The selected image is converted to grayscale using the open-source computer vision package (OpenCV), and smoothing is then applied by running the Gaussian blur algorithm on the grayscale image. A grayscale image consists of shades varying from white to black that represent the combined red, green and blue intensities. Normalization is the main step of the Gaussian blur conversion; it is performed by multiplying each pixel intensity by the corresponding value of the normalized kernel matrix. Thus, pre-processing is performed on the selected image. Converting to grayscale and reducing the noise in the image help to reduce the processing time of the larger subsequent steps. In machine learning projects in general, you usually go through a data preprocessing or cleaning step. As a machine learning engineer, you will spend a good amount of your time cleaning up and preparing the data before you build your learning model. The goal of this step is to make your data ready for the ML model and easier to analyse and process computationally, as is the case with images. Based on the problem you are solving and the dataset in hand, some data massaging is required before you feed your images to the ML model.

Image processing could be simple tasks like image resizing. In order to feed a dataset of images
to a convolutional network, they must all be the same size. Other processing tasks can take place
like geometric and color transformation or converting color to grayscale and many more.

The acquired data are usually messy and come from different sources. To feed them to the ML
model (or neural network), they need to be standardized and cleaned up. More often than not,
preprocessing is used to conduct steps that reduce the complexity and increase the accuracy of
the applied algorithm. We cannot write a unique algorithm for each of the conditions in which an image may be taken; thus, when we acquire an image, we tend to convert it into a form that allows a general algorithm to solve it.

Data preprocessing techniques might include:

• Grayscale image conversion:

Convert colour images to grayscale to reduce computational complexity: in certain problems you will find it useful to discard unnecessary information from your images to reduce space or computational complexity. For example, converting coloured images to grayscale images. In many problems, colour is not necessary to recognise and interpret an image, and grayscale can be good enough for recognising certain objects. Because colour images contain more information than black-and-white images, they add unnecessary complexity and take up more space in memory (remember that colour images are represented in three channels, which means that converting them to grayscale reduces the number of values that need to be processed). A short sketch of this grayscale-plus-blur preprocessing is given below.
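A minimal sketch of this preprocessing in OpenCV, assuming a placeholder file name test_image.jpg:

import cv2

image = cv2.imread("test_image.jpg")                  # placeholder file name

# Grayscale conversion removes colour information the lane detector does not need.
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)

# Gaussian blur with a 5x5 kernel smooths out sensor noise before edge detection.
blurred = cv2.GaussianBlur(gray, (5, 5), 0)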

Module 3: Edge Detection Implementation

The next step in the process is edge detection, which is the main part of the program and is required to detect the edges in the image irrespective of the other details present in it. We use the Canny edge detection algorithm to implement this step, because other methods used to find edges in an image retain more detail than the Canny edge detection technique. Canny edge detection consists mainly of four processes: the Gaussian blur, which we have already performed for smoothing the image as a preprocessing step, the gradient calculation, which is used to compute θ for boundary selection in the image, followed by non-maximum suppression and double thresholding, which strengthen the lines in the edge image produced by the previous functions. Thus we obtain the image with edges, which is applied as the input to the Hough transformation technique; a short OpenCV sketch of this step, including the region-of-interest mask, follows after the criteria list below.

The Canny edge detector is an edge detection operator that uses a multi-stage algorithm to detect a
wide range of edges in images. It was developed by John F. Canny in 1986. Canny also produced
a computational theory of edge detection explaining why the technique works.

The Canny filter is a multi-stage edge detector. It uses a filter based on the derivative of a Gaussian in order to compute the intensity of the gradients. The Gaussian reduces the effect of noise present in the image. Then, potential edges are thinned down to 1-pixel-wide curves by removing non-maximum pixels of the gradient magnitude. Finally, edge pixels are kept or removed using hysteresis thresholding on the gradient magnitude. The Canny filter has three adjustable parameters: the width of the Gaussian (the noisier the image, the greater the width), and the low and high thresholds for the hysteresis thresholding.

The general criteria for edge detection include:

• Detection of edge with low error rate, which means that the detection should accurately
catch as many edges shown in the image as possible.
• The edge point detected from the operator should accurately localize on the center of the
edge.
• A given edge in the image should only be marked once, and where possible, image noise
should not create false edges.
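A hedged OpenCV sketch of this module, including the region-of-interest mask shown in Figure 4.e; the thresholds and the triangular vertices are illustrative assumptions for a fixed camera position:

import cv2
import numpy as np

gray = cv2.imread("test_image.jpg", cv2.IMREAD_GRAYSCALE)
blurred = cv2.GaussianBlur(gray, (5, 5), 0)

# Canny edge detection on the preprocessed image; thresholds are illustrative.
edges = cv2.Canny(blurred, 50, 150)

# Region-of-interest mask (compare Figure 4.e): keep a triangular area in front
# of the car. The vertex coordinates are placeholders for a fixed camera position.
height, width = edges.shape
roi = np.array([[(0, height), (width // 2, int(height * 0.6)), (width, height)]],
               dtype=np.int32)
mask = np.zeros_like(edges)
cv2.fillPoly(mask, roi, 255)
masked_edges = cv2.bitwise_and(edges, mask)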

Module 4: Hough Transformations

Hough transformations require a Hough transformation space, which is used to rotate the angle of a trigonometric line equation and thereby identify the lines present in an edge-detected image. If the rotated trigonometric line meets the edges in the image, it may be considered by the model trained for detecting roads in the image. The training is done in the Hough transformation space, which is used to detect the actual road lines in an image. When the Hough transformations and training are complete, the road lines are drawn on the selected image. The training of the model is revised until the correct output is observed for the selected image.

The Hough transform is a technique which can be used to isolate features of a particular shape
within an image. Because it requires that the desired features be specified in some parametric
form, the classical Hough transform is most commonly used for the detection of regular curves
such as lines, circles, ellipses, etc. A generalized Hough transform can be employed in
applications where a simple analytic description of a feature(s) is not possible. Due to the
computational complexity of the generalized Hough algorithm, we restrict the main focus of this
discussion to the classical Hough transform. Despite its domain restrictions, the classical Hough
transform (hereafter referred to without the classical prefix) retains many applications, as most
manufactured parts (and many anatomical parts investigated in medical imagery) contain feature
boundaries which can be described by regular curves. The main advantage of the Hough
transform technique is that it is tolerant of gaps in feature boundary descriptions and is relatively
unaffected by image noise.

The Hough technique is particularly useful for computing a global description of a feature(s)
(where the number of solution classes need not be known a priori), given (possibly noisy) local
measurements. The motivating idea behind the Hough technique for line detection is that each
input measurement (e.g. coordinate point) indicates its contribution to a globally consistent
solution (e.g. the physical line which gave rise to that image point).

As a simple example, consider the common problem of fitting a set of line segments to a set of discrete image points (e.g. pixel locations output from an edge detector). Here the lack of a priori knowledge about the number of desired line segments (and the ambiguity about what constitutes a line segment) renders the problem under-constrained.

We can analytically describe a line segment in a number of forms. However, a convenient equation for describing a set of lines uses the parametric or normal notation:

X cos t + Y sin t = r

where r is the length of a normal from the origin to this line and t is the orientation of r with respect to the X-axis. For any point (X, Y) on this line, r and t are constant.
In an image analysis context, the coordinates of the point(s) of edge segments (i.e. (X, Y) ) in the
image are known and therefore serve as constants in the parametric line equation, while r and t
are the unknown variables we seek. If we plot the possible (r, t) values defined by each (X, Y),

points in Cartesian image space map to curves (i.e. sinusoids) in the polar Hough parameter
space. This point-to-curve transformation is the Hough transformation for straight lines. When
viewed in Hough parameter space, points which are collinear in the Cartesian image space
become readily apparent as they yield curves which intersect at a common (r, t) point.

The transform is implemented by quantizing the Hough parameter space into finite intervals or
accumulator cells. As the algorithm runs, each (X, Y) is transformed into a discretized (r, t) curve
and the accumulator cells which lie along this curve are incremented. Resulting peaks in the
accumulator array represent strong evidence that a corresponding straight line exists in the
image.

We can use this same procedure to detect other features with analytical descriptions.

In such cases, for example circles, the computational complexity of the algorithm begins to increase, as we now have three coordinates in the parameter space and a 3-D accumulator. (In general, the computation and the size of the accumulator array increase polynomially with the number of parameters. Thus, the basic Hough technique described here is only practical for simple curves.) A short OpenCV sketch of the probabilistic Hough line detection used in this module is given below.
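A minimal OpenCV sketch of this module using the probabilistic Hough transform; all parameter values (rho, theta, threshold, minLineLength, maxLineGap) are illustrative assumptions:

import cv2
import numpy as np

# Recreate the masked edge image from the Module 3 sketch.
gray = cv2.imread("test_image.jpg", cv2.IMREAD_GRAYSCALE)
edges = cv2.Canny(cv2.GaussianBlur(gray, (5, 5), 0), 50, 150)
h, w = edges.shape
mask = np.zeros_like(edges)
cv2.fillPoly(mask, np.array([[(0, h), (w // 2, int(h * 0.6)), (w, h)]], np.int32), 255)
masked_edges = cv2.bitwise_and(edges, mask)

# Probabilistic Hough transform: rho and theta set the accumulator resolution,
# threshold is the minimum number of votes needed to accept a line.
lines = cv2.HoughLinesP(masked_edges, rho=2, theta=np.pi / 180, threshold=100,
                        minLineLength=40, maxLineGap=5)

# Draw the detected segments and overlay them on the original frame (compare Figure 4.f).
line_image = np.zeros((h, w, 3), dtype=np.uint8)
if lines is not None:
    for x1, y1, x2, y2 in lines[:, 0]:
        cv2.line(line_image, (x1, y1), (x2, y2), (0, 0, 255), 10)
result = cv2.addWeighted(cv2.imread("test_image.jpg"), 0.8, line_image, 1.0, 0.0)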

Module 5: Evaluating the output

Evaluation of the output is done using a confusion matrix and accuracy metrics once the testing dataset has been passed through the model. When all of the images in the testing dataset have been processed, the output is inspected for every image and the true positives, false positives, true negatives and false negatives are counted. These counts form the confusion matrix, from which the accuracy and precision scores can be computed. This process is known as evaluation.
Thus, the accuracy and precision of the model are recorded; a minimal sketch of this computation is given below.
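A minimal sketch of this evaluation with hypothetical per-image results (the numbers below are illustrative and not measured results of the project):

import numpy as np

# Hypothetical per-image results on the testing dataset:
# 1 = lane present / detected, 0 = lane absent / not detected.
y_true = np.array([1, 1, 0, 1, 0, 1, 0, 0, 1, 1])
y_pred = np.array([1, 0, 0, 1, 1, 1, 0, 0, 1, 1])

tp = int(np.sum((y_true == 1) & (y_pred == 1)))
fp = int(np.sum((y_true == 0) & (y_pred == 1)))
tn = int(np.sum((y_true == 0) & (y_pred == 0)))
fn = int(np.sum((y_true == 1) & (y_pred == 0)))

confusion_matrix = np.array([[tp, fp],
                             [fn, tn]])
accuracy = (tp + tn) / len(y_true)
precision = tp / (tp + fp)
print(confusion_matrix, accuracy, precision)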

CHAPTER- 4
Result analysis and conclusion

4.1 SYSTEM CONFIGURATION

4.1.1 Software Configuration

These are the Software Configurations that are required.

• Operating System: Windows 10/8/7 (incl. 64-bit), Mac OS, Linux


• Language: Python 3
• IDE: Jupyter Notebook
• Framework: Tkinter

4.1.2 Hardware Configuration


These are the minimum hardware configurations that are required.
• Processor: Intel Core 2 Duo or higher
• RAM: 1 GB or higher
• HDD: 256 GB or higher
• Monitor: 1024 x 768 minimum screen resolution
• Keyboard: Standard US English keyboard

4.2 SCREENSHOTS AND OUTPUT

First, we will have a look at our project's screenshots.

Figure 4.a: Selected testing image so that it undergoes into process

Figure 4.b: Grey scaled image output of a selected image

Figure 4.c: Gaussian Blur applied on grey scaled image

Figure 4.d: Canny Edge Detection output when applied on Gaussian Blurred image

Figure 4.e: Masking applied to edge detected image to get a required part in an image.


Figure 4.f: Final image with detected lines of Hough Transform Space

Figure 4.g: Lines detected through Hough transformations by changing the angle theta

CONCLUSION

When we drive, we use our eyes to decide where to go. The lines on the road that show us where the lanes are act as our constant reference for where to steer the vehicle. Naturally, one of the first things we would like to do in developing a self-driving vehicle is to automatically detect lane lines using an algorithm. The road detection region of interest (ROI) must be flexible: when driving up or down a steep incline, the horizon will change and will no longer be a simple function of the proportions of the frame. This is also something to consider for tight turns and bumper-to-bumper traffic. This project is entirely based on image processing and road detection for self-driving vehicles, which has great scope in the future. We have completed the entire implementation using specific algorithms to detect the road clearly. Even if people's opinion about the safety of self-driving cars has not changed, these cars are already safe and are becoming safer. Only if they trust the technology and give it a try will they get to enjoy the luxury of computerised driving.

Driverless cars appear to be an important next step in transportation technology. They turn the car into a new all-media capsule: text to your heart's desire, and it is safe. Development of autonomous cars is continuing and the software in the car continues to be updated. Although it all started from the idea of driverless operation through radio frequency, cameras and sensors, more semi-autonomous features will appear, thus reducing congestion and increasing safety with faster reactions and fewer errors.

CHAPTER 5

REFERENCES

1. H. Kong, J.-Y. Audibert, and J. Ponce, "General road detection from a single image," IEEE Transactions on Image Processing, TIP-05166-2009 (accepted). Willow Team, Ecole Normale Superieure / INRIA / CNRS, Paris, France; Imagine team, Ecole des Ponts ParisTech, Paris, France.

2. J. C. McCall and M. M. Trivedi, "Video based lane estimation and tracking for driver assistance: Survey, system, and evaluation," IEEE Trans. on Intelligent Transportation Systems, pp. 20–37, 2006.

3. K.-Y. Chiu and S.-F. Lin, "Lane detection using color-based segmentation," IEEE Intelligent Vehicles Symposium, 2005.

4. H. Kong, J.-Y. Audibert, and J. Ponce, "Vanishing point detection for road detection," CVPR, 2009.

5. Y. Wang, E. K. Teoh, and D. Shen, "Lane detection and tracking using B-snake," Image and Vision Computing, pp. 269–280, 2004.

6. A. Lookingbill, J. Rogers, D. Lieb, J. Curry, and S. Thrun, "Reverse optical flow for self-supervised adaptive autonomous robot navigation," IJCV, vol. 74, no. 3, pp. 287–302, 2007.

7. A. Broggi, C. Caraffi, R. I. Fedriga, and P. Grisleri, "Obstacle detection with stereo vision for off-road vehicle navigation," IEEE International Workshop on Machine Vision for Intelligent Vehicles, 2005.

8. J. Sparbert, K. Dietmayer, and D. Streller, "Lane detection and street type classification using laser range images," IEEE Proceedings in Intelligent Transportation Systems, pp. 456–464, 2001.

9. J. B. Southhall and C. Taylor, "Stochastic road shape estimation," ICCV, pp. 205–212, 2001.
