COURSE - Digital Image Processing PDF
GROUP NUMBER: 22
TEAM:
NAME ID NUMBER
I M V S Ganesh 2018AAPS0389H
We humans have a very robust visual system, which helps us identify people and objects,
play sports, perform operations, drive vehicles, read, and so on.
Although it might seem that we put no special effort into most of these tasks, the human
visual system is fairly complex to replicate and implement.
Computer vision, in the simplest terms, is the automation of such a visual system, so that
computers, or machines in general, can obtain a high-level understanding of the environment from
digital images and videos.
In the manufacturing sector identifying defective products and ensuring quality and
Object detection deals with detecting instances of objects of a certain class, such as humans,
buildings, or cars, in digital images and videos. Computer vision is vital in implementing object
detection from digital images.
OBJECTIVE
To develop a computer-vision-based text scanner that scans an image (for example, a
Sudoku puzzle from a newspaper) and obtains the text it contains.
METHODOLOGY
Any computer vision application starts with image acquisition, the digital representation of
the visual characteristics of the physical world. Image sensors are used to detect and capture
the information required to make an image.
The acquired images are then processed in the next stage. In this step, the signals in the acquired
images are filtered to remove noise and any irrelevant frequencies. If needed, the images are
padded and transformed to a different space to make them ready for the actual analysis.
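The filtering and padding step can be illustrated with a minimal sketch. It assumes a grayscale
image stored as a list of rows of integer intensities. The 3x3 median filter and the
edge-replication padding are illustrative choices, not necessarily what this project uses, and
`pad_replicate` and `median_filter_3x3` are hypothetical helper names.

```python
from statistics import median


def pad_replicate(image, pad=1):
    """Pad a 2D list by repeating its border pixels `pad` times on each side."""
    padded_rows = [[row[0]] * pad + list(row) + [row[-1]] * pad for row in image]
    top = [list(padded_rows[0]) for _ in range(pad)]
    bottom = [list(padded_rows[-1]) for _ in range(pad)]
    return top + padded_rows + bottom


def median_filter_3x3(image):
    """Replace each pixel by the median of its 3x3 neighbourhood."""
    padded = pad_replicate(image, pad=1)
    height, width = len(image), len(image[0])
    result = []
    for r in range(height):
        out_row = []
        for c in range(width):
            # In the padded image, the window centred on (r, c) starts at (r, c).
            window = [padded[r + dr][c + dc] for dr in range(3) for dc in range(3)]
            out_row.append(int(median(window)))
        result.append(out_row)
    return result


# A flat gray patch with one "salt" noise pixel in the middle.
noisy = [
    [10, 10, 10],
    [10, 255, 10],
    [10, 10, 10],
]
clean = median_filter_3x3(noisy)
```

Padding first lets the filter treat border pixels the same way as interior ones, which is one
reason images are padded before analysis.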
The processed images are then analysed to extract useful information. This involves pattern
identification, colour recognition, object recognition, feature extraction, motion tracking, image
segmentation, etc.
Finally, the high-dimensional data obtained from all the above steps is used to produce
meaningful numerical information, which leads to making decisions.
SCRIPTS
main.py
christopher.py
basic.py
This script takes the original image as input, applies pre-processing, finds the corner
points of the board, warps the image, and separates out the individual smaller grids (tiles)
containing the individual digits or blanks.
sud.py
This script takes the individual tiles, applies some light pre-processing, and predicts the digit
in each tile. As the grid is 9x9, there are 81 tiles.
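The separation of the warped board into 81 tiles can be sketched as follows. This is a minimal
illustration rather than the code in basic.py: it assumes the warped board is a square 2D list
whose side is divisible by 9, and `split_into_tiles` is a hypothetical helper name.

```python
def split_into_tiles(board, grid_size=9):
    """Split a square 2D image (list of rows) into grid_size x grid_size tiles.

    Tiles are returned row by row, left to right, so tile k sits at
    row k // grid_size and column k % grid_size of the puzzle.
    """
    side = len(board)
    if side % grid_size != 0 or any(len(row) != side for row in board):
        raise ValueError("board must be square with a side divisible by grid_size")
    step = side // grid_size
    tiles = []
    for tile_row in range(grid_size):
        for tile_col in range(grid_size):
            top, left = tile_row * step, tile_col * step
            tile = [row[left:left + step] for row in board[top:top + step]]
            tiles.append(tile)
    return tiles


# A synthetic 18x18 "warped board" whose pixel value encodes its tile index.
board = [[(r // 2) * 9 + (c // 2) for c in range(18)] for r in range(18)]
tiles = split_into_tiles(board)
print(len(tiles))  # 81
```

Keeping the tiles in row-major order means the position of each predicted digit in the puzzle
can be recovered from its index in the list.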
RESULTS
1) Image Processing
The image is converted to grayscale, and adaptive thresholding and dilation are then applied
to reduce noise and enhance contours. After this, the coordinates of the region of
maximum area (the Sudoku grid) in the image are found.
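To show what adaptive thresholding and dilation do, here is a minimal pure-Python sketch on a
tiny grayscale image. In practice a library routine such as OpenCV's `cv2.adaptiveThreshold`
and `cv2.dilate` would normally be used. The mean-based method, the 3x3 block size, the offset
of 15, and the function names are all assumptions made for illustration.

```python
def adaptive_threshold_inv(gray, block_size=3, offset=15):
    """Mean adaptive threshold (inverted): dark strokes become 255, the rest 0.

    Each pixel is compared with the mean of its block_size x block_size
    neighbourhood (clipped at the image border) minus `offset`.
    """
    height, width = len(gray), len(gray[0])
    half = block_size // 2
    binary = []
    for r in range(height):
        out_row = []
        for col in range(width):
            rows = range(max(0, r - half), min(height, r + half + 1))
            cols = range(max(0, col - half), min(width, col + half + 1))
            window = [gray[i][j] for i in rows for j in cols]
            local_mean = sum(window) / len(window)
            out_row.append(255 if gray[r][col] < local_mean - offset else 0)
        binary.append(out_row)
    return binary


def dilate_3x3(binary):
    """Set a pixel to 255 if any pixel in its 3x3 neighbourhood is non-zero."""
    height, width = len(binary), len(binary[0])
    return [
        [
            255 if any(
                binary[i][j]
                for i in range(max(0, r - 1), min(height, r + 2))
                for j in range(max(0, c - 1), min(width, c + 2))
            ) else 0
            for c in range(width)
        ]
        for r in range(height)
    ]


# Paper that is in shadow on the left, with a dark vertical line in column 2.
gray = [
    [60, 70, 20, 200, 210],
    [60, 70, 20, 200, 210],
    [60, 70, 20, 200, 210],
]
binary = adaptive_threshold_inv(gray)
thick = dilate_3x3(binary)
```

Because each pixel is compared with its own neighbourhood, the shadowed paper on the left is not
mistaken for ink, which a single global threshold of 128 would do here. Dilation then thickens the
detected line so that broken grid lines are more likely to join into one closed outline.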
2) Warping
Using the coordinates found, we warp the image and divide it into individual grids. These
individual grids help in extracting the smaller tiles, each of which contains a single digit or a blank.
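Before a perspective warp, the four corner points usually have to be put into a fixed order so
that each one maps to the matching corner of the output square. The sketch below shows one common
heuristic for this. Whether basic.py orders its corners this way is an assumption, and
`order_corners` and `target_corners` are hypothetical names. With OpenCV, the two point lists
would typically then be passed to `cv2.getPerspectiveTransform` and the result to
`cv2.warpPerspective`.

```python
def order_corners(points):
    """Order four (x, y) corners as top-left, top-right, bottom-right, bottom-left.

    In image coordinates (y grows downward) the top-left corner has the
    smallest x + y and the bottom-right the largest; the top-right corner has
    the largest x - y and the bottom-left the smallest.
    """
    if len(points) != 4:
        raise ValueError("expected exactly four corner points")
    top_left = min(points, key=lambda p: p[0] + p[1])
    bottom_right = max(points, key=lambda p: p[0] + p[1])
    top_right = max(points, key=lambda p: p[0] - p[1])
    bottom_left = min(points, key=lambda p: p[0] - p[1])
    return [top_left, top_right, bottom_right, bottom_left]


def target_corners(side):
    """Corners of the upright side x side square the board is warped onto."""
    return [(0, 0), (side - 1, 0), (side - 1, side - 1), (0, side - 1)]


# Corners of a slightly tilted board, in arbitrary order.
found = [(410, 60), (35, 420), (50, 40), (430, 400)]
ordered = order_corners(found)
destination = target_corners(450)
```

This heuristic works for boards that are only moderately rotated. A board photographed at close
to 45 degrees would need a more careful ordering.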
3) Digit Recognition
The individual grids are passed into a convolutional neural network (Christopher), which is
pre-trained on a custom dataset. The digits in these grids are identified and returned in the form of a list.
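The returned list can be turned into readable text with a short helper. This is a minimal sketch
that assumes the list holds 81 integers in row-major order and that a prediction of 0 stands for
a blank tile. Both conventions and the name `predictions_to_text` are assumptions, not details
taken from the scripts.

```python
def predictions_to_text(digits, grid_size=9, blank="."):
    """Format a flat, row-major list of predicted digits as puzzle text.

    A prediction of 0 is assumed to mean "blank tile".
    """
    if len(digits) != grid_size * grid_size:
        raise ValueError("expected one prediction per tile")
    lines = []
    for start in range(0, len(digits), grid_size):
        row = digits[start:start + grid_size]
        lines.append(" ".join(str(d) if d != 0 else blank for d in row))
    return "\n".join(lines)


# Hypothetical predictions: only the first row of the puzzle has digits.
predictions = [5, 3, 0, 0, 7, 0, 0, 0, 0] + [0] * 72
text = predictions_to_text(predictions)
print(text.splitlines()[0])  # 5 3 . . 7 . . . .
```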
CONCLUSION
We can scan a Sudoku puzzle from an image using our camera scanner and convert it into text
using warping and artificial intelligence. In future, we plan to work on scanning any image
and successfully obtaining the text in it.
Further improvements and adaptations of this technology can help in several ways. It can improve
the search ability of our computers: as our data records continue to get bigger and more complex,
computers with optical character recognition (OCR) will make record searching much easier.
Computers with OCR can scan a document and store it in a database, making it easier to retrieve
quickly in the future. AI that can read text from images, banners, etc. can use in-built software
to translate that text quickly into the reader's preferred language, boosting communication.
AI combined with OCR would also be able to read paper bills and records, analyse complex charts,
offer recommendations, and take business decisions. AI that is capable of recognizing facial
expressions can understand how the people around it are feeling, which offers benefits in the
hospitality and healthcare sectors. Assembly processing robots with computer vision will be able
to identify defective products or rotten produce and separate them from quality products.