COURSE: Digital Image Processing

PROJECT: Computer Vision Based Text Scanner

GROUP NUMBER: 22

TEAM:

NAME                   ID NUMBER
Arnav Jain             2017B3AA1378H
Vasu Sood              2017B4A31476H
Mihir Vilas Shende     2017B5A31157H
Thambabathula Omana    2018A3PS0553H
I M V S Ganesh         2018AAPS0389H
Toran Maheshwari       2017B3A80948H


INTRODUCTION

We humans have a very robust visual system, which helps us identify people and objects,
play sports, perform operations, drive vehicles, read, and so on.

Although we seem to perform most of these tasks without any special effort, the human
visual system is fairly complex to replicate and implement.

Computer vision, in the simplest terms, is the automation of such a visual system, so that
computers, or machines in general, can obtain a high-level understanding of the environment
from digital images and videos.

In the manufacturing sector, identifying defective products and ensuring quality and
accuracy is of utmost importance.

Object detection deals with detecting instances of objects of a certain class, such as humans,
buildings, or cars, in digital images and videos. Computer vision is vital in implementing
object detection from digital images.

OBJECTIVE

To develop a computer vision based text scanner that scans an image (for example, a
Sudoku puzzle from a newspaper) and extracts the text it contains.

METHODOLOGY

Any computer vision application starts with image acquisition, that is, capturing a digital
representation of the visual characteristics of the physical world. Image sensors are used to
detect and capture the information required to make an image.

The acquired images are then processed in the next stage. In this step, the signals in the
acquired images are filtered to remove noise and any irrelevant frequencies. If needed, the
images are padded and transformed to a different space to make them ready for the actual
analysis.
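
As an illustration of the filtering and padding step, the sketch below pads an image by
repeating its edge pixels and then applies a 3x3 median filter. The report does not state
which filter or padding mode the project used, so both choices, and the helper names
pad_replicate and median_filter, are assumptions made for this example. Images are plain
nested lists so that the snippet runs without extra libraries.

```python
from statistics import median


def pad_replicate(img, r):
    """Pad a 2-D list by repeating its edge pixels r times on every side."""
    rows = [[row[0]] * r + list(row) + [row[-1]] * r for row in img]
    top = [rows[0][:] for _ in range(r)]
    bottom = [rows[-1][:] for _ in range(r)]
    return top + rows + bottom


def median_filter(img, r=1):
    """Replace each pixel by the median of its (2r+1) x (2r+1) neighbourhood."""
    padded = pad_replicate(img, r)
    h, w = len(img), len(img[0])
    out = []
    for y in range(h):
        out_row = []
        for x in range(w):
            # In the padded image the window starts at (y, x).
            window = [padded[y + dy][x + dx]
                      for dy in range(2 * r + 1)
                      for dx in range(2 * r + 1)]
            out_row.append(median(window))
        out.append(out_row)
    return out


noisy = [
    [10, 10, 10, 10],
    [10, 255, 10, 10],  # a single bright noise pixel
    [10, 10, 10, 10],
]
clean = median_filter(noisy)
```

In this toy input the isolated bright pixel is replaced by the value of its neighbours,
while the uniform background is left unchanged.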

The processed images are then analysed to extract useful information. This involves pattern
identification, colour recognition, object recognition, feature extraction, motion tracking,
image segmentation, etc.

Finally, the high-dimensional data obtained from all the above steps is used to produce
meaningful numerical information, which leads to making decisions.

SCRIPTS

main.py

This script combines all the scripts given below.

christopher.py

This script consists of a Convolutional Neural Network trained on a custom dataset.

basic.py

This script takes the original image as input, applies pre-processing, finds the corner
points of the board, warps the image, and separates out the individual smaller grids (tiles)
containing the individual digits or blanks.
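
One common way to pick the four corner points of the board from the points on its outline
is sketched below. The report does not say how basic.py finds the corners, so this
sum-and-difference heuristic, the function name order_corners, and the sample points are
assumptions for illustration only.

```python
def order_corners(points):
    """Return (top_left, top_right, bottom_right, bottom_left) from (x, y) points.

    Image coordinates are assumed: x grows to the right, y grows downward.
    """
    top_left = min(points, key=lambda p: p[0] + p[1])      # smallest x + y
    bottom_right = max(points, key=lambda p: p[0] + p[1])  # largest x + y
    top_right = max(points, key=lambda p: p[0] - p[1])     # largest x - y
    bottom_left = min(points, key=lambda p: p[0] - p[1])   # smallest x - y
    return top_left, top_right, bottom_right, bottom_left


# Hypothetical outline points of a slightly skewed board.
outline = [(52, 40), (300, 35), (310, 290), (45, 300), (180, 37), (305, 160)]
corners = order_corners(outline)
```

The heuristic works when the board is roughly upright in the photograph; a board rotated
by close to 45 degrees would need a different rule.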

sud.py

This script takes the individual tiles, does a bit of pre-processing, and predicts the digit
in each tile. As the grid is 9x9, the number of tiles is 81.
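
The 81 tiles can be cut from a warped, square board image along the lines of the sketch
below. It assumes the board side is an exact multiple of 9 and that tiles are taken row by
row; the function name split_into_tiles and the toy board are hypothetical and not taken
from the project scripts.

```python
def split_into_tiles(board, n=9):
    """Cut a square 2-D list into n * n equal tiles, row by row."""
    side = len(board)
    if side % n != 0 or any(len(row) != side for row in board):
        raise ValueError("board must be square with a side divisible by n")
    step = side // n
    tiles = []
    for r in range(n):
        for c in range(n):
            tile = [row[c * step:(c + 1) * step]
                    for row in board[r * step:(r + 1) * step]]
            tiles.append(tile)
    return tiles


# Toy 18x18 "image" whose pixel value encodes the tile it belongs to.
board = [[(y // 2) * 9 + (x // 2) for x in range(18)] for y in range(18)]
tiles = split_into_tiles(board)
```

With this ordering, the tile at index 9 * row + column corresponds to the cell in that row
and column of the puzzle.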

RESULTS

1) Image Processing

The image is converted to grayscale, and then adaptive thresholding and dilation are applied
to reduce noise and enhance contours. After this, the coordinates of the region with the
maximum area (the Sudoku grid) in the image are found.
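
To make the two operations concrete, here is a minimal pure-Python sketch of inverted
adaptive mean thresholding followed by dilation on a grayscale image stored as nested
lists. The window radius, the offset c, and the function names are assumptions; the report
does not give the parameters used. In practice a library such as OpenCV provides
equivalent functions (cv2.adaptiveThreshold and cv2.dilate).

```python
def adaptive_threshold_inv(gray, radius=1, c=5):
    """Mark a pixel as foreground (255) when it is darker than its local mean minus c."""
    h, w = len(gray), len(gray[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # Neighbourhood clipped at the image border.
            window = [gray[j][i]
                      for j in range(max(0, y - radius), min(h, y + radius + 1))
                      for i in range(max(0, x - radius), min(w, x + radius + 1))]
            if gray[y][x] < sum(window) / len(window) - c:
                out[y][x] = 255
    return out


def dilate(binary, radius=1):
    """Set a pixel to 255 if any pixel in its neighbourhood is 255."""
    h, w = len(binary), len(binary[0])
    return [[255 if any(binary[j][i] == 255
                        for j in range(max(0, y - radius), min(h, y + radius + 1))
                        for i in range(max(0, x - radius), min(w, x + radius + 1)))
             else 0
             for x in range(w)]
            for y in range(h)]


# Bright paper (200) with one dark pixel of ink (20) in the middle.
gray = [[200] * 5 for _ in range(5)]
gray[2][2] = 20
mask = adaptive_threshold_inv(gray)
thick = dilate(mask)
```

Because each pixel is compared with its own neighbourhood rather than with one global
value, uneven lighting across a newspaper page affects the result less. Dilation then
thickens the detected strokes so that broken grid lines are more likely to join up.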

2) Warping

Using the coordinates found, we warp the image and form individual grids on it. These
individual grids help in extracting the smaller tiles, each of which contains a single digit
or a blank.
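
Warping from four corner points amounts to estimating a 3x3 perspective matrix that maps
the detected corners onto the corners of an upright square. The sketch below solves for
that matrix with plain Gauss-Jordan elimination. The corner values and the 450-pixel
output size are made up for illustration, and the function names are hypothetical; OpenCV
offers cv2.getPerspectiveTransform and cv2.warpPerspective for the same purpose.

```python
def solve_linear(a, b):
    """Solve a * x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("corners are degenerate (e.g. three on one line)")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def perspective_matrix(src, dst):
    """3x3 matrix mapping four src (x, y) corners onto four dst (u, v) corners."""
    a, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        # Two linear equations per corner pair; the last matrix entry is fixed to 1.
        a.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        b.append(u)
        a.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        b.append(v)
    h = solve_linear(a, b) + [1.0]
    return [h[0:3], h[3:6], h[6:9]]


def apply_perspective(matrix, point):
    """Map one (x, y) point through the matrix, dividing by the third coordinate."""
    x, y = point
    u, v, w = (row[0] * x + row[1] * y + row[2] for row in matrix)
    return u / w, v / w


# Detected corners (top-left, top-right, bottom-right, bottom-left) and their targets.
src = [(52, 40), (300, 35), (310, 290), (45, 300)]
dst = [(0, 0), (449, 0), (449, 449), (0, 449)]
H = perspective_matrix(src, dst)
```

A full warp would apply the inverse of this mapping to every output pixel and sample the
input image there; choosing an output side divisible by 9 keeps the later tile split exact.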

3) Digit Recognition

The individual tiles are passed into a convolutional neural network (Christopher), which is
pre-trained on a custom dataset. The digits in these tiles are identified and returned in the
form of a list.
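
The returned list can then be turned into readable text. The sketch below assumes the list
holds 81 integers in row-major order and that a blank tile is encoded as 0; neither detail
is stated in the report, and format_grid is a hypothetical helper.

```python
def format_grid(digits, n=9, blank="."):
    """Render a flat list of n * n predictions (0 = blank) as n lines of text."""
    if len(digits) != n * n:
        raise ValueError(f"expected {n * n} predictions, got {len(digits)}")
    rows = [digits[i:i + n] for i in range(0, n * n, n)]
    return "\n".join(" ".join(blank if d == 0 else str(d) for d in row)
                     for row in rows)


# Hypothetical predictions: first row partly filled, all other tiles blank.
predictions = [5, 3, 0, 0, 7, 0, 0, 0, 0] + [0] * 72
text = format_grid(predictions)
```

For the sample input, the first line of text is "5 3 . . 7 . . . ." and the remaining eight
lines contain only dots.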

CONCLUSION

We can scan a Sudoku puzzle from an image using our camera scanner and convert it into text,
using warping and artificial intelligence. In future, we plan to extend this so that any image
can be scanned to obtain the text on it.

Further improvements and adaptations of this technology can help in several ways. It improves
the searching ability of our computers: as our data records continue to get bigger and more
complex, computers with OCR will make record searching much easier. Computers with OCR can
scan a document and store it in a database, making it easier to retrieve quickly in the
future. With AI that can read text from images, banners, etc., in-built software can quickly
translate that text into your preferred language, boosting communication. AI combined with
OCR would also be able to read paper bills and records, analyse complex charts, offer
recommendations and take business decisions.

AI that is capable of recognizing facial expressions can understand how the people around it
are feeling, which offers benefits in the hospitality and healthcare sectors. Assembly
processing robots with computer vision will be able to identify defective products or rotten
produce and separate them from quality products.

