Paper Analysis
This image illustrates the process of license plate detection and recognition from a vehicle
image. Let's break it down step by step:
1. Input Image: The process starts with an input image of a blue BMW car with a visible
license plate "KAH-9329".
2. Pre-processing:
o Image Downsampling: The original image is reduced in size to improve
processing speed.
o Gray Image Conversion: The downsampled image is converted to grayscale to
simplify further processing.
3. Candidate Extraction:
o Binary Segmentation: The grayscale image is converted to a binary (black and
white) image, emphasizing the license plate area.
o Kernel Density Function: This step further refines the binary image to isolate
potential license plate regions.
4. Output of Detected Image: The license plate area is isolated and extracted from the
original image.
5. Segmentation: The extracted license plate is further processed to isolate individual
characters.
The final result shows the successfully extracted and segmented license plate "KAH-9329".
This process demonstrates a typical workflow in automatic license plate recognition systems,
involving image preprocessing, region of interest extraction, and character segmentation. These
steps prepare the image for the final stage of character recognition, which would typically
involve machine learning techniques to identify each character and reconstruct the full license
plate number.
Haar cascade classifiers perform object detection using the Haar-like features algorithm. Training
a Haar cascade requires two sets of images:
Positive images – images that contain the object we want to detect (here, license plates).
Negative images – random images that do not contain the object we want to detect; they
serve as background examples during training.
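The core of the Haar-like features algorithm is evaluating rectangle sums in constant time via an integral image (summed-area table). The sketch below is a minimal numpy-only illustration of that idea; in practice OpenCV's trained `cv2.CascadeClassifier` handles this internally, and the function names here are illustrative, not from any library:

```python
import numpy as np

def integral_image(img):
    # summed-area table: ii[y, x] = sum of img[:y+1, :x+1]
    return img.cumsum(axis=0).cumsum(axis=1)

def rect_sum(ii, y, x, h, w):
    # sum of img[y:y+h, x:x+w] recovered from the integral image in O(1)
    total = ii[y + h - 1, x + w - 1]
    if y > 0:
        total -= ii[y - 1, x + w - 1]
    if x > 0:
        total -= ii[y + h - 1, x - 1]
    if y > 0 and x > 0:
        total += ii[y - 1, x - 1]
    return total

def haar_edge_feature(ii, y, x, h, w):
    # two-rectangle Haar feature: left-half sum minus right-half sum,
    # responding to vertical edges such as character strokes
    half = w // 2
    return rect_sum(ii, y, x, h, half) - rect_sum(ii, y, x + half, h, half)
```

A cascade applies thousands of such features in stages, rejecting non-plate windows early, which is what makes Haar detection fast.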
1. Grayscale Conversion:
o Purpose: Simplifies the image by converting it from color to shades of gray.
o How it Works: Each pixel's color value is transformed into a single intensity
value, representing the brightness of the pixel. This reduces the computational
complexity.
2. Thresholding:
o Purpose: Converts a grayscale image into a binary image (black and white).
o How it Works: Sets a threshold value. Pixels with intensity values above the
threshold are turned white, and those below are turned black. This helps in
distinguishing the license plate characters from the background.
3. Morphological Operations:
o Purpose: Enhances the structure of objects within the image.
o Common Operations:
 Dilation: Expands the boundaries of white regions, useful for connecting
disjointed parts of the license plate characters.
 Erosion: Shrinks the boundaries of white regions, helping to remove
small noise points.
 Opening: Erosion followed by dilation, used to remove small objects
from the foreground.
 Closing: Dilation followed by erosion, useful for closing small holes in
the foreground objects.
These pre-processing techniques prepare the image for the subsequent steps of license plate
localization and character recognition by enhancing the features and reducing noise.
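The pre-processing steps above can be sketched as follows. This is a numpy-only illustration of what each operation does; a real pipeline would use OpenCV's `cv2.cvtColor`, `cv2.threshold`, `cv2.dilate`, `cv2.erode`, and `cv2.morphologyEx`, which implement the same operations far more efficiently:

```python
import numpy as np

def to_gray(rgb):
    # grayscale conversion: luminosity weighting of the R, G, B channels
    return (0.299 * rgb[..., 0] + 0.587 * rgb[..., 1] + 0.114 * rgb[..., 2]).astype(np.uint8)

def threshold(gray, t=128):
    # thresholding: pixels above t become white (255), the rest black (0)
    return np.where(gray > t, 255, 0).astype(np.uint8)

def _morph(binary, k, reducer, pad_value):
    # apply a k x k sliding-window reduction over the binary image
    pad = k // 2
    padded = np.pad(binary, pad, constant_values=pad_value)
    h, w = binary.shape
    windows = [padded[dy:dy + h, dx:dx + w] for dy in range(k) for dx in range(k)]
    return reducer(np.stack(windows), axis=0)

def dilate(binary, k=3):
    # a pixel turns white if any neighbour in its k x k window is white
    return _morph(binary, k, np.max, 0)

def erode(binary, k=3):
    # a pixel stays white only if its whole k x k window is white
    return _morph(binary, k, np.min, 255)

def opening(binary, k=3):   # erosion then dilation: removes small specks
    return dilate(erode(binary, k), k)

def closing(binary, k=3):   # dilation then erosion: fills small holes
    return erode(dilate(binary, k), k)
```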
1. Filtering: The code checks if the width and height of each potential character are within
specified lower and upper bounds. This helps to filter out noise and non-character
elements.
2. Contour Extraction: The x-coordinates of valid character contours are stored in
x_cntr_list for later use in indexing.
3. Character Extraction: Each character is extracted from the original image using the
bounding rectangle coordinates.
4. Resizing: The extracted character is resized to a standard size (20x40 pixels).
5. Visualization: A rectangle is drawn around each character for visualization purposes.
6. Color Inversion: The character image is inverted (255 - char) to prepare it for
classification.
7. Standardization: The character is placed within a larger 44x24 pixel image with a black
border, creating a consistent format for all characters.
8. Storage: The processed character image is appended to img_res, which stores all
segmented characters.
This process aims to isolate individual characters from the license plate image, standardize their
size and format, and prepare them for subsequent character recognition. The method uses
geometric properties (width and height ratios) to identify and extract valid characters while
filtering out non-character elements.
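Steps 1, 4, 6, and 7 of the segmentation process can be sketched as below. This is not the paper's code: the size bounds are placeholders, and the nearest-neighbour resize stands in for `cv2.resize`; contour extraction (steps 2–3) would come from `cv2.findContours` and `cv2.boundingRect` in practice:

```python
import numpy as np

def resize_nn(img, w, h):
    # nearest-neighbour resize (cv2.resize would be used in practice)
    ys = (np.arange(h) * img.shape[0] / h).astype(int)
    xs = (np.arange(w) * img.shape[1] / w).astype(int)
    return img[ys][:, xs]

def standardize_char(char_img, lower=5, upper=100):
    """Filter by size, resize to 20x40, invert, and centre on a 24x44 canvas."""
    h, w = char_img.shape
    if not (lower < w < upper and lower < h < upper):  # step 1: filter out noise
        return None
    char = resize_nn(char_img, 20, 40)                 # step 4: standard size
    char = 255 - char                                  # step 6: colour inversion
    canvas = np.zeros((44, 24), dtype=np.uint8)        # step 7: black border
    canvas[2:42, 2:22] = char
    return canvas
```

Each valid character would then be appended to `img_res`, matching step 8.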
Character Recognition: Implemented and compared four deep learning models (CNN,
MobileNet, Inception V3, ResNet50)
1 Input layer
4 Convolutional (Conv2D) layers
1 Max Pooling layer
1 Dropout layer
1 Flatten layer
2 Dense layers
2. Layer Details:
3. Purpose of this Model: This CNN architecture is designed for character recognition in
license plates, mapping each segmented character image to one of the 36 classes (A–Z and 0–9).
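The layer list above can be sketched in Keras as follows. The filter counts, input size, and dropout rate are assumptions for illustration, since the notes list the layer types but not their hyperparameters:

```python
from tensorflow.keras import layers, models

def build_cnn(num_classes=36):
    # hypothetical hyperparameters; only the layer types come from the notes
    return models.Sequential([
        layers.Input(shape=(28, 28, 1)),                          # 1 input layer
        layers.Conv2D(16, 3, activation="relu", padding="same"),
        layers.Conv2D(32, 3, activation="relu", padding="same"),
        layers.Conv2D(64, 3, activation="relu", padding="same"),
        layers.Conv2D(64, 3, activation="relu", padding="same"),  # 4 Conv2D layers
        layers.MaxPooling2D(pool_size=(4, 4)),                    # 1 max pooling layer
        layers.Dropout(0.4),                                      # 1 dropout layer
        layers.Flatten(),                                         # 1 flatten layer
        layers.Dense(128, activation="relu"),                     # dense layer 1
        layers.Dense(num_classes, activation="softmax"),          # dense layer 2
    ])
```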
MobileNet
This image illustrates two key building blocks of the MobileNet architecture: the stride=1 block
and the stride=2 block. Let me explain each:
These building blocks are repeated and stacked to form the complete MobileNet architecture,
with variations in the number of filters and layers depending on the specific version of
MobileNet being used.
This approach allows for efficient transfer learning on mobile devices while maintaining good
performance for tasks like image classification.
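The efficiency of these blocks comes from replacing a standard convolution with a depthwise 3×3 convolution followed by a pointwise 1×1 convolution. A quick multiply-accumulate count (illustrative function names, standard MobileNet cost formulas) shows why:

```python
def conv_cost(h, w, c_in, c_out, k=3):
    # multiply-accumulates for a standard k x k convolution
    return h * w * c_in * c_out * k * k

def separable_cost(h, w, c_in, c_out, k=3):
    # depthwise k x k (one filter per input channel) + pointwise 1x1
    return h * w * c_in * k * k + h * w * c_in * c_out
```

For a 32×32 feature map with 64 input and 128 output channels, the separable block needs roughly an eighth of the operations of the standard convolution (the ratio is 1/c_out + 1/k²), which is the source of MobileNet's efficiency on mobile devices.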
Inception V3
Overview:
Implementation Steps:
ResNet50
Overview:
Architecture:
Five Stages: Each stage contains a convolution block and an identity block.
Blocks: Each block has three layers.
Implementation Steps:
Basic Concepts
Transfer Learning:
Purpose: Utilizes pre-trained models to leverage existing knowledge and apply it to new
tasks.
Advantage: Reduces training time and computational resources.
Both Inception V3 and ResNet50 use transfer learning to adapt pre-trained models for
recognizing 36 characters, with modifications to the final layers to suit the specific task.
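A minimal sketch of this transfer-learning setup, assuming TensorFlow/Keras (shown here with ResNet50; Inception V3 is swapped in the same way). The pooling and dense-layer sizes are illustrative assumptions, not taken from the notes:

```python
from tensorflow.keras import layers, models
from tensorflow.keras.applications import ResNet50

def build_transfer_model(num_classes=36, weights="imagenet"):
    # pre-trained backbone without its original classification head
    base = ResNet50(weights=weights, include_top=False, input_shape=(224, 224, 3))
    base.trainable = False  # freeze the pre-trained layers

    # new final layers adapted to the 36-character task
    x = layers.GlobalAveragePooling2D()(base.output)
    x = layers.Dense(128, activation="relu")(x)
    out = layers.Dense(num_classes, activation="softmax")(x)
    return models.Model(base.input, out)
```

Only the new head is trained at first, which is what cuts training time and compute compared with training the full network from scratch.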
Paper 1
The paper presents a methodology for detecting and recognizing Bangla license plate numbers
using a deep learning approach. Specifically, the authors have used a 53-layer convolutional
neural network model to perform the tasks of license plate detection and character recognition.
They have captured images using 12-megapixel cameras and prepared a
dataset comprising 1050 training images and 200 testing images of private vehicles.
The images have been manually annotated and augmented to enhance model robustness (Paper 1).
Proposed Method:
Definition: YOLOv3 is a specific object detection algorithm that uses a single CNN to
predict multiple bounding boxes and class probabilities for objects in images.
Structure: YOLOv3 is built upon CNN architecture. It uses a feature extraction network
(Darknet-53) which is a 53-layer convolutional network. The network is followed by
several layers that predict bounding boxes and class probabilities.
Usage: YOLOv3 is designed specifically for real-time object detection. It divides the
input image into a grid and predicts bounding boxes and probabilities for each grid cell.
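The grid assignment can be illustrated with a small sketch (the function name is hypothetical; the 13×13 grid is one of YOLOv3's three detection scales): each object is assigned to the grid cell that contains the centre of its bounding box, and that cell is responsible for predicting it.

```python
def grid_cell(box_cx, box_cy, img_w, img_h, grid=13):
    # map a bounding-box centre (pixels) to its responsible grid cell
    col = int(box_cx / img_w * grid)
    row = int(box_cy / img_h * grid)
    return row, col
```

For a 416×416 input, a plate centred at (208, 104) falls in row 3, column 6 of the 13×13 grid; that cell's predicted boxes and class probabilities are the ones matched to the plate during training.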
Relationship