
KERALA GOVERNMENT POLYTECHNIC COLLEGE

KOZHIKODE-5

DEPARTMENT OF COMPUTER ENGINEERING


2023-24

SEMINAR REPORT
ON

AN INTELLIGENT QR CODE SCANNING SYSTEM FOR


VISUALLY-IMPAIRED USERS

Submitted by

ASWATHI P

Guided by

Mrs. Rojna N
KERALA GOVERNMENT POLYTECHNIC
COLLEGE KOZHIKODE-5

Department of Computer Engineering

CERTIFICATE

This is to certify that the seminar entitled AN INTELLIGENT QR CODE SCANNING
SYSTEM FOR VISUALLY-IMPAIRED USERS, submitted by ASWATHI P
[Reg. No: 2201131485] to the Department of Computer Science and Engineering,
Kerala Government Polytechnic College Kozhikode-5, in partial fulfilment of
the requirement for the award of Diploma under the Directorate of Technical
Education, Government of Kerala, is a bonafide record of the work carried out by her.

Dr. Jawharali B S          Mrs. Rojna N

Head Of Department Guide

Internal Examiner External Examiner

ACKNOWLEDGEMENT

It is with great enthusiasm and a learning spirit that I bring out this
seminar report. Here I would like to mark my token of gratitude to all those
who influenced me during the period of my work. I would like to express my
sincere thanks to Dr. Jawharali B S, Principal, Kerala Government Polytechnic
College Kozhikode, for the facilities provided here.

With immense pleasure I express sincere thanks to Mrs. Rojna N, Head,


Department of Computer Engineering, for the entire motivation. I also extend my
heartfelt gratitude to my guides Mrs. Rojna N and Dr. Shabeera T P,
Lecturers, Department of Computer Engineering, for their committed guidance and
valuable suggestions. I also extend my gratitude to all teachers in the Department
of Computer Science and Engineering, Kerala Government Polytechnic College
Kozhikode, for their support and inspiration.

And above all I praise and thank the Almighty God, who showered His
abundant grace on me to make this work a success. I also express my special thanks
and gratitude to my family and all my friends for their support and encouragement.
ABSTRACT
While QR code reading finds applications in many diverse fields such as retail
environments, industry, product identification, marketing, education, and even warfare, this
work aims to streamline the performance of the algorithm according to the requirements of a
visually-impaired user, assisting in product identification and/or way-finding in both indoor
and outdoor environments. Vision is a gift, and compensating for its loss would be a great
cause to serve. Given the laborious hassle involved in finding a route and the alarming
consequences of taking the stairs instead of an elevator, this work aims to contribute a robust
and effective QR code scanning system that addresses the problems and challenges associated
with reading a QR code in such environments using a deep learning based methodology.
Especially in an emergency, for instance when identifying a fire exit, this work contributes
towards robust recognition of a QR code even when it is in a condition not interpretable by
standard QR code readers. It was concluded that while the same computer vision algorithms
can be tweaked to execute targeted drone attacks, the same technology can also assist
humanity, helping visually-impaired users see what they would otherwise miss and fulfilling
the purpose for which technology was originally created, i.e. to serve humanity rather than
the opposite.

TABLE OF CONTENTS

TITLE PAGE i

BONAFIDE CERTIFICATE i

ABSTRACT iii

LIST OF FIGURES vi

1 INTRODUCTION 1
1.1 Background and Evolution of QR Codes 1
1.2 QR Code Capabilities 1
1.3 QR Code Structure 2
1.4 QR Codes and Visually-Impaired Users 3
1.5 Increasing Relevance of QR Codes 3
1.6 Existing Work and Limitations 3
1.7 Challenges and Requirements 3

2 RELATED WORK 4
2.1 Challenges with Existing QR Code Scanning Techniques 4
2.2 Image Morphological Operations and Probabilistic Methods 5
2.3 Deep Learning Approaches 5
2.4 Assisting Visually-Impaired Users 5
2.5 State-of-the-Art Research Comparisons 5

3 ALGORITHM 7
3.1 Overview 7
3.2 Performance Criteria 7
3.3 Methodology 7

3.4 Detection and Localization 8
3.5 Bounding Box Prediction 9
3.6 Confidence Score and Classification Loss 9
3.7 Confidence Score and Loss 9

4 DATA SET 10
4.1 Training Data 10
4.2 Test Data 10
4.3 Synthetic Data Generation 11
4.4 Summary of Datasets 12

5 RESULTS 13
5.1 Performance Evaluation 13
5.2 Comparative Analysis 13
5.3 Dataset Evaluation 14
5.4 Summary of Results 15

6 CONCLUSION AND FUTURE DIRECTIONS 16

REFERENCES 17

LIST OF FIGURES

1.1 Various QR codes 1


1.2 Sample QR code structure 2

2.1 Challenges in QR code detection (misalignment, damage, different angles) 4

3.1 YOLOv3 network architecture for QR code detection 8

4.1 Dubska dataset samples 11


4.2 Soros dataset samples 11
4.3 Kaggle dataset samples 11

5.1 QR code detected and decoded by the algorithm 14

CHAPTER 1

INTRODUCTION

1.1 Background and Evolution of QR Codes


Quick Response (QR) codes are 2-dimensional matrix codes that were first developed
in 1994 by Denso Wave in Japan for tracking the flow of automobile parts during manufacturing.
Since then, significant advancements have been made in this domain. For instance, in 1999, a
paper was published regarding an application that processes Spanish medicine prescriptions
automatically. The introduction of smartphones further expanded the scope and facilitated the
processes in this domain. Using corner detection and spiral search, a mobile application was
developed to detect QR codes in 2004.

Figure 1.1: Various QR codes

1.2 QR Code Capabilities


QR codes can hold up to 4,296 alphanumeric characters in a small space, whereas a
1D barcode can store only about thirty digits. They consist of edges and corners and can store
numeric, alphanumeric, Kanji, and binary data. Due to these capabilities, QR codes have
become preferred over barcodes, which are gradually becoming obsolete.

1.3 QR Code Structure
The architecture of a typical QR code consists of several components:

Figure 1.2: Sample QR code structure

• Finder Patterns (FP): Squares on three corners, called 'Finder Patterns' or 'Position
Detection Patterns', determine the location, orientation, and dimensions of the code.

• Modules: Many small black and white squares each represent a single bit, organizing
themselves into rows and columns to form a matrix.

• Timing Patterns: A combination of light and dark modules that interlink the FPs, pro-
viding key information about the size, matrix dimensions, and distortion of the code.

• Alignment Patterns: Used to determine the perspective distortion of the QR code’s im-
age.

• Format Information: Carries additional details for decoding.

• Quiet Zone: A light area of minimum width equivalent to four modules forming a bound-
ary around the QR code.

• Encoded Data: Protected by a Reed-Solomon error correction algorithm so that data
can be restored in case the QR code is damaged. Error correction levels range from
Low (7%), Medium (15%), and Quartile (25%), to High (30%).

These properties, when varied, determine the version of the QR code as one of 40 different
versions.
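The geometry behind these versions can be made concrete. Per the QR specification, a version-v symbol measures (17 + 4v) modules per side, and the quiet zone adds a 4-module border on every side; the helper names below are illustrative, not part of any library:

```python
def qr_symbol_size(version: int) -> int:
    """Modules per side for a QR code of the given version (1-40)."""
    if not 1 <= version <= 40:
        raise ValueError("QR versions range from 1 to 40")
    return 17 + 4 * version

def qr_size_with_quiet_zone(version: int) -> int:
    """Total width in modules once the mandatory 4-module quiet zone
    is added on each side of the symbol."""
    return qr_symbol_size(version) + 2 * 4

# Version 1 is the smallest symbol, version 40 the largest.
print(qr_symbol_size(1))            # 21 modules per side
print(qr_symbol_size(40))           # 177 modules per side
print(qr_size_with_quiet_zone(1))   # 29 modules including the quiet zone
```

This is why higher versions can carry more data: the module matrix grows linearly per side, so capacity grows roughly quadratically with version.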

1.4 QR Codes and Visually-Impaired Users
A visually-impaired user may be defined as a partly sighted or blind person, part
of a wider group of persons with reduced mobility. Globally, over 2.2 billion people are affected
by some kind of visual impairment, of whom 79% use a smartphone. Several user requirements
support the use of a camera-based device to assist visually-impaired users in daily life, given
the wide range of difficulties faced in day-to-day tasks.

1.5 Increasing Relevance of QR Codes


The Covid-19 pandemic has further accelerated the use of QR codes and visual tags
as they are contactless. A significant degree of assistance may be provided through the use of
visual tags such as QR codes in handheld or wearable devices to identify medicine, acquire
dietary information regarding food allergens, or for indoor/outdoor navigation.

1.6 Existing Work and Limitations


The first barcode detector based on morphological operators was proposed by Katona
et al. Since then, many similar works have been published, focusing on improving barcode
localization algorithms and increasing the robustness of the algorithm. Methods using machine
learning and text-recognition modules have been proposed, but they have their own limitations,
especially when compared to QR codes.

1.7 Challenges and Requirements


A smartphone application that effectively handles the obstacles posed by real-life
scenarios as an assistive navigation system is a key requirement. Challenges include guiding
a blind user holding a camera towards the QR code, and dealing with misalignment, multiple
QR codes, rotated objects, and varying illumination conditions. The system must detect QR
codes from a significant distance and angle, with high accuracy, to avoid hassle and provide
efficient assistance.

CHAPTER 2

RELATED WORK

2.1 Challenges with Existing QR Code Scanning Techniques


Many open-source libraries rely on the assumption that the camera used to scan the
QR code has inbuilt auto-focus and good resolution. However, not all devices have the same
functionalities, so it is useful to develop image restoration techniques. Often, the QR code is
placed misaligned (not at a right angle on the wall) or the camera captures the QR code at an
undesirable angle. Automatically aligning the QR code with the camera increases the decoding
success rate, meaning the sides of the small squares need to be aligned with the X and Y axes.
Furthermore, smartphones require user assistance, and it would be beneficial to make the pro-
cess independent of user intervention. This ’QR Code localization’ challenge is the direction of
recent work in this domain.

Figure 2.1: Challenges in QR code detection (misalignment, damage, different angles)
2.2 Image Morphological Operations and Probabilistic Methods
QR code localization may be achieved through image morphological operations: the
QR code is split into segments called "tiles," and a probabilistic method is applied to
classify each tile before a post-processing step in which the whole QR code is framed together.
Neural networks have since shown significant promise in addressing this problem.

2.3 Deep Learning Approaches


The first deep learning-based solution for 1D barcode detection was introduced in
2013. It wasn’t until 2017 that a deep learning-based 2D barcode detection method was intro-
duced using YOLO (You Only Look Once) for detection. This method employs the DarkNet19
classification network with the Softmax layer removed and one filter in the convolutional layer.
Using regression analysis in an angle prediction network, they predict the angle at which the
QR code is rotated in the captured image and classify it between 45 and 135 degrees before
re-aligning it for decoding.
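The re-alignment step described above reduces to applying an inverse rotation once the angle is known. A minimal, self-contained sketch (the function name, point format, and pivot are assumptions, not from the cited work) rotates detected corner points back so the code's sides line up with the image axes:

```python
import math

def realign_corners(corners, angle_deg, center=(0.0, 0.0)):
    """Rotate detected QR corner points by -angle_deg about `center`
    so the code's sides line up with the X and Y axes.
    `corners` is a list of (x, y) tuples."""
    theta = math.radians(-angle_deg)  # undo the predicted rotation
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    cx, cy = center
    realigned = []
    for x, y in corners:
        dx, dy = x - cx, y - cy
        # Standard 2D rotation about the chosen pivot.
        realigned.append((cx + dx * cos_t - dy * sin_t,
                          cy + dx * sin_t + dy * cos_t))
    return realigned

# A point rotated 90 degrees comes back onto the axes.
print(realign_corners([(1.0, 0.0)], 90))
```

In a full pipeline the same transform would be applied to the whole image patch (e.g. via a warp), but the corner-point version shows the geometry the angle-prediction network enables.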

2.4 Assisting Visually-Impaired Users


Much work has been done to assist visually-impaired users in daily life by scanning
QR codes. In earlier work, QR codes were used as visual tags, and simple techniques were
applied to localize and decode a barcode for an application. Although these works serve a great
cause, they lack more robust techniques and do not guarantee a hassle-free approach, especially
given the challenges the process is vulnerable to. Other authors suggested an approach similar
to ours, in which distorted QR codes were reconstructed with image morphological operations
and degraded barcodes were restored. However, little work has been done to achieve the same
objectives through neural networks, which could make this process less laborious.

2.5 State-of-the-Art Research Comparisons


For comparison, we examine two state-of-the-art research publications. The first one
discusses localization and subsequent detection of 1D and 2D barcodes. They execute their
algorithm on MATLAB and upgrade to C++ using the OpenCV framework. They use the
Dubska dataset to test their algorithm, but they note that their algorithm returned an oversized

bounding box in images with QR codes surrounded by text. When they used their own dataset,
the bounding box was too small if the QR code was rotated to about 45 degrees. They address
issues of perspective distortion in orientation, scale, image blur, and symbology invariance.
Their methodology involves using areas of high concentration of edge and corner structures to
construct a barcodeness map from the structure matrix and identifying peaks in the map to find
barcode borders. They also incorporated colored QR codes.
In the paper by Dubska, they developed their own open-source dataset of challenging QR
code images. Their methodology involved using the Hough Transform to detect vanishing
points with the Hough space as a 2D signal using parameterization. They then extracted the
grids and matrix codes. They compared their results with ZXing, finding that ZXing fails on
rotated and skewed images compared to their more robust algorithm.

CHAPTER 3

ALGORITHM

3.1 Overview
This work proposes a streamlined approach for QR code detection and interpretation
from live video streams as well as static images. This assists a visually-impaired user
in indoor and outdoor navigation, for example by identifying a fire exit through a QR code.
The performance of state-of-the-art algorithms and frameworks has been reaffirmed. While QR
codes find applications in many fields, the performance requirements of this algorithm were
prioritized based on those most crucial to a visually-impaired user.

3.2 Performance Criteria


Factors like runtime and computation speed are significant indicators for performance eval-
uation in other domains. However, for a visually-impaired user, other factors must be catered
to:

• Perspective Distortion: The image may not be aligned with the camera.

• Accuracy: To prevent the user from assuming the wrong route.

• Motion Blur: Addressed by iteratively capturing live frames from the camera.

• Illumination Conditions: Low illumination can be catered for by automatically capturing
the image with flash.

3.3 Methodology
The entire algorithm is split into three parts.

• Detection

• Localization

• Decoding

3.4 Detection and Localization


For detection and localization, the tiny You Only Look Once (YOLO)
version 3 convolutional neural network architecture was used, stacked with the
Darknet-19 framework. It is a state-of-the-art, fast, and accurate object detection and image
classification network.
Network Architecture: The network consists of convolutional layers and max-pooling layers
followed by two fully connected layers at the end. The final layer uses a linear activation
function, and the previous layers use a leaky ReLU. By changing the size of the network,
runtime and accuracy can be traded off without retraining.
Training Process: In YOLOv3, the first 20 convolutional layers are pre-trained using the
ImageNet 1000-class dataset with an input size of 224x224. The input resolution is then in-
creased to 448x448. The whole network is trained for 135 epochs with a batch size of 64, 0.9
momentum, and 0.0005 decay. The learning rate is gradually raised from 0.001 to 0.01 until
epoch 75, then gradually decreased. Data augmentation is used with random scaling, transla-
tion, and random adjustment of exposure and saturation.
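The learning-rate schedule described above can be sketched as a piecewise function. The warm-up length and the exact decay breakpoints below are illustrative assumptions; the source states only that the rate rises from 0.001 to 0.01 until epoch 75 and then gradually decreases over 135 epochs:

```python
def learning_rate(epoch: int, warmup_epochs: int = 10) -> float:
    """Sketch of the schedule: linear warm-up from 1e-3 to 1e-2,
    plateau until epoch 75, then step decay. Breakpoints after
    epoch 75 are assumptions, not taken from the report."""
    if epoch < warmup_epochs:                 # gradual ramp-up
        return 1e-3 + (1e-2 - 1e-3) * epoch / warmup_epochs
    if epoch < 75:                            # plateau at the peak rate
        return 1e-2
    if epoch < 105:                           # first decay step
        return 1e-3
    return 1e-4                               # final epochs

print(learning_rate(0), learning_rate(74), learning_rate(100), learning_rate(134))
```

Ramping up slowly at the start avoids divergence from large early gradients, while the later decay lets the weights settle near a minimum.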

Figure 3.1: YOLOv3 network architecture for QR code detection

3.5 Bounding Box Prediction
The input image is divided into an S x S grid of cells. Each cell is responsible for
predicting an object whose centre (x, y) falls within that cell. Each grid cell also predicts
B bounding boxes with 5 components each (x, y, w, h, confidence), along with the class
probabilities C, giving a total of S x S x B x 5 outputs related to the bounding box predictions.
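The cell-responsibility rule can be made concrete with a small sketch (the function name and the S = 7 default are illustrative assumptions): a normalized object centre maps to the grid cell that contains it.

```python
def responsible_cell(x: float, y: float, S: int = 7):
    """Map a normalized object centre (x, y) in [0, 1] to the
    (row, col) of the grid cell responsible for predicting it,
    for an S x S grid."""
    col = min(int(x * S), S - 1)  # clamp so x == 1.0 stays in-grid
    row = min(int(y * S), S - 1)
    return row, col

# With S = 7, a centre at (0.5, 0.5) falls in the middle cell.
print(responsible_cell(0.5, 0.5))  # (3, 3)
```

Only this one cell's predictors are penalized for the object's coordinates, which is what the indicator terms in the loss equations below express.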
Bounding Box Loss: Equation (3.1) shows the loss in predicting the bounding box position.
The function computes a sum over each bounding box predictor of every cell.

\[
\lambda_{coord} \sum_{i=0}^{S^2} \sum_{j=0}^{B} \mathbb{1}^{obj}_{ij}
\left[ (x_i - \hat{x}_i)^2 + (y_i - \hat{y}_i)^2 \right] \tag{3.1}
\]
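The centre-position term of equation (3.1) can be sketched in plain Python. The data layout (dicts keyed by cell/box index) and λ_coord = 5.0, the weight used in the original YOLO paper, are illustrative assumptions:

```python
def coord_loss(preds, targets, responsible, lambda_coord=5.0):
    """Centre-coordinate part of the YOLO loss: a weighted sum of
    squared (x, y) errors over the predictors responsible for an
    object. `preds` and `targets` map (cell, box) -> (x, y);
    `responsible` is the set of pairs where the indicator is 1."""
    total = 0.0
    for ij in responsible:
        x, y = preds[ij]
        tx, ty = targets[ij]
        total += (x - tx) ** 2 + (y - ty) ** 2
    return lambda_coord * total

preds = {(0, 0): (0.5, 0.5)}
targets = {(0, 0): (0.4, 0.7)}
print(coord_loss(preds, targets, responsible={(0, 0)}))
```

Predictors not responsible for any object contribute nothing here, exactly as the indicator in the equation dictates.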

3.6 Confidence Score and Classification Loss


The confidence score is part of the bounding box prediction. If no object is present in
the cell, the confidence score should be zero; otherwise it should equal the Intersection Over
Union (IOU) of the predicted box and the ground truth, represented by C.
Classification Loss: The network predicts a set of class probabilities for each cell, making a
total of S x S x C probabilities. The classification loss is mathematically represented as
equation (3.2). When no object is present in a cell, the classification error is not penalized;
the indicator equals 1 only when an object is present in the cell.

\[
\sum_{i=0}^{S^2} \mathbb{1}^{obj}_{i} \sum_{c \in classes}
\left( p_i(c) - \hat{p}_i(c) \right)^2 \tag{3.2}
\]
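The IOU that serves as the confidence target is easy to compute for axis-aligned boxes. The (x_min, y_min, x_max, y_max) box format in this sketch is an assumption for illustration:

```python
def iou(box_a, box_b):
    """Intersection Over Union of two axis-aligned boxes given as
    (x_min, y_min, x_max, y_max)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Overlap rectangle; width/height clamp to 0 for disjoint boxes.
    iw = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    ih = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = iw * ih
    union = ((ax2 - ax1) * (ay2 - ay1)
             + (bx2 - bx1) * (by2 - by1) - inter)
    return inter / union if union > 0 else 0.0

print(iou((0, 0, 2, 2), (1, 1, 3, 3)))  # two 2x2 boxes overlapping in 1x1
```

An IOU of 1 means a perfect prediction; 0 means no overlap, which is why it doubles as the confidence target when an object is present.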

3.7 Confidence Score and Loss


The loss associated with the confidence score C uses \(\lambda\) weighting parameters to
balance different parts of the loss function and increase model stability. See equation (3.3).

\[
\sum_{i=0}^{S^2} \sum_{j=0}^{B} \mathbb{1}^{obj}_{ij} \left( C_i - \hat{C}_i \right)^2
+ \lambda_{noobj} \sum_{i=0}^{S^2} \sum_{j=0}^{B} \mathbb{1}^{noobj}_{ij}
\left( C_i - \hat{C}_i \right)^2 \tag{3.3}
\]

CHAPTER 4

DATA SET

4.1 Training Data


The YOLOv3 neural network is trained on the popular open-source object detection
COCO dataset, which consists of 328,000 images of various objects. For fine-tuning, part of
the model was trained on a dataset consisting of both synthetically generated QR codes and
natural scene images of QR codes.

4.2 Test Data


The model was tested on several datasets:

• Kaggle Dataset: Consisting of 10,000 QR codes with three-channel binary images.

• Dubska Dataset: Consisting of 400 QR codes with perspective distortion, challenging
illumination conditions, motion blur, and surrounding text, all captured with a mobile
phone. This dataset was used specifically to test the algorithm's ability to handle
perspective distortion, as shown in Figure 4.1.

• Soros Dataset: Consisting of real, sharp, defocused, and motion-blurred images. Figure
4.2 shows sample images from this dataset.

• Additional Dataset: Consisting of 130 challenging QR code images of various types.

Figure 4.1: Dubska dataset samples

Figure 4.2: Soros dataset samples

4.3 Synthetic Data Generation


QR code datasets may also be generated through open-source APIs. These
datasets often yield better accuracy and detection results because they contain automatically
generated sharp QR codes with no motion blur, angle variance, or perspective distortion.
Figure 4.3 shows a few samples of such sharp, generated QR codes.

Figure 4.3: Kaggle dataset samples

4.4 Summary of Datasets

• COCO Dataset: 328,000 images of various objects for initial training.

• Synthetic Dataset: Sharp QR codes without distortions for fine-tuning.

• Kaggle Test Set: 10,000 QR codes for initial testing.

• Dubska Dataset: 400 QR codes with various distortions for comprehensive evaluation.

• Soros Dataset: Real and challenging QR codes for robustness testing.

• Additional Dataset: 130 challenging QR code images for further validation.

Each dataset serves a specific purpose, ensuring the algorithm is robust, accurate, and reli-
able under various conditions.

CHAPTER 5

RESULTS

The results of my study on the intelligent QR code scanning system for visually-
impaired users are promising. The algorithm performed well, effectively handling various
challenges while maintaining real-time accuracy. This technology, when integrated into ETAs,
smartphone apps, or smart glasses, can significantly aid visually-impaired users in navigation
and product identification. Although there was a slight increase in decoding time, the improved
success rate justifies this trade-off, especially for critical tasks like identifying medications. En-
hancing the algorithm’s robustness and exploring additional applications could further expand
its potential and benefit users.

5.1 Performance Evaluation


For a fair comparison, each image should ideally be tested with a smartphone using an
automatic flash, ensuring that only the challenge of perspective distortion remains. However,
due to time constraints, all images were used for testing, implying that real-time results would
likely be better than those depicted in Table 5.1. The results of the detection rate on a PC with
a webcam and a comparison with other works are shown in Table 5.1. These results highlight
our algorithm’s capability to handle tough performance requirements.

5.2 Comparative Analysis


A comparison of the detection rate of our algorithm with that of Soros, Dubska, and
Hansen on the Dubska dataset is provided in Table 5.1. Our algorithm achieved a 100% suc-
cess rate in detection compared to Soros with a detection rate of 42.6% and Hansen with a
detection rate of 89%. Decoding rates were not compared as they were not stated in the other
works. Figure 5.1 shows a sample QR code image being detected and decoded by our algorithm.

Algorithm       Detection Rate
Ours            1.000
Gabor Soros     0.426
Hansen          0.890

Table 5.1: Comparison of Detection Rate on the Dubska Dataset

Figure 5.1: QR code detected and decoded by the algorithm

5.3 Dataset Evaluation

• 130 Images Dataset: Out of 130 images, 70 were read successfully; the rest failed
owing to challenges irrelevant to the targeted use case.

• Dubska Dataset: The model achieved a 100% detection rate and a 68% success rate in
decoding on the Dubska dataset consisting of 400 images of QR codes.

• Kaggle Dataset: On a test set from Kaggle with 9,999 QR codes, all were detected
and 9,651 were decoded. This method showed robustness in detecting QR codes with
perspective distortion, achieving a 99.9% success rate for detection and a 96.5% success
rate for decoding.

• Soros Dataset: A detection success of 76.7% and a decoding success of 30.83% were
achieved. The low decoding rate is due to various scales, orientations, and blurriness of
images in this dataset.
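The reported rates follow directly from the raw counts; as a quick check, the helper below (its name is illustrative) reproduces two of the figures above from their success/total counts:

```python
def success_rate(successes: int, total: int) -> float:
    """Fraction of attempts that succeeded, rounded to three
    decimals as reported in this chapter."""
    return round(successes / total, 3)

# Kaggle: 9,651 of 9,999 codes decoded.
print(success_rate(9651, 9999))  # 0.965
# Dubska: a 68% decoding rate over 400 images is 272 successes.
print(success_rate(272, 400))    # 0.68
```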

5.4 Summary of Results
The detection and decoding rates obtained on each of these datasets are listed in
Table 5.2.

Dataset Detection Rate Decoding Rate


Kaggle 0.999 0.965
Dubska 1.000 0.680
130 images 0.914 0.414
Soros 0.767 0.308

Table 5.2: Detection Rate and Decoding Rate on Listed Datasets

These results provide a generalized depiction of the proposed algorithm’s capability, demon-
strating its effectiveness in detecting and decoding QR codes under various conditions and chal-
lenges.

CHAPTER 6

CONCLUSION AND FUTURE DIRECTIONS

This algorithm demonstrated robust performance against angle rotation, automatically


aligning the QR code image with the camera, performing both detection and interpretation in
real-time with state-of-the-art accuracy despite visual challenges. It outperformed the algo-
rithms by Soros and Hansen. Decoding was done successfully, as indicated by the results,
followed by reading the information back to the user. Factors impacting ease of use were also
outlined.
The algorithm can be embedded in an Electronic Travelling Aid (ETA) for visually-impaired
users or used as a navigation utility apart from being a smartphone application. It may also be
embedded in smart glasses for identifying medicines. Rotating the angle of the captured QR
code image to automatically align it facilitated ease of use and increased the success rate with
only a minor compromise on the time taken for decoding.
After analyzing the runtime vs accuracy dilemma, it was concluded that while timely medi-
cation is crucial, there is no danger in a few milliseconds delay, but taking the wrong medication
poses a significant risk. Additionally, the process must be hassle-free as the angle may not al-
ways be aligned for everyone, so technology must bridge this gap.
Different choices of Application Programming Interfaces (APIs) may be combined with the
algorithm to allow for more flexible usage. With its remarkable real-time performance, this
application may be incorporated into a range of wearable devices in the future to further the
cause of assistive technological aids.

REFERENCES

[1] A. Namane and M. Arezki, "Fast Real Time 1D Barcode Detection From Webcam Images
Using the Bars Detection Method", Proceedings of the World Congress on Engineering,
2017.

[2] S. Busaeed, R. Mehmood, I. Katib and J. Corchado, ”LidSonic for Visually Impaired:
Green Machine Learning-Based Assistive Smart Glasses with Smart App and Arduino”,
Electronics, 2022.

[3] L. B. Neto et al., "A Kinect-Based Wearable Face Recognition System to Aid Visually
Impaired Users", IEEE Transactions on Human-Machine Systems, 2017.

[4] M. Elgendy, T. Guzsvinecz and C. Sik-Lanyi, "Identification of Markers in Challenging
Conditions for People with Visual Impairment Using Convolutional Neural Network",
Applied Sciences, 2019.

[5] P. Theodorou and A. Meliones, "Gaining insight for the design, development, deployment
and distribution of assistive navigation systems for blind and visually impaired people
through a detailed user requirements elicitation", Universal Access in the Information
Society, 2022.

