Project Report
on
CURRENCY RECOGNITION USING PYTHON
Submitted in partial fulfilment of the requirements for
the award of the degree of
BACHELOR OF TECHNOLOGY
In
ELECTRONICS AND COMMUNICATION ENGINEERING
By
SREENIDHI INSTITUTE OF SCIENCE AND TECHNOLOGY
(Affiliated to Jawaharlal Nehru Technological University, Hyderabad)
Yamnampet (V), Ghatkesar (M), Hyderabad – 501 301
CERTIFICATE
This is to certify that the project report entitled “CURRENCY RECOGNITION
USING PYTHON” is being submitted by
in partial fulfilment of the requirements for the award of Bachelor of Technology degree in
Electronics and Communication Engineering to Sreenidhi Institute of Science &
Technology affiliated to Jawaharlal Nehru Technological University, Hyderabad
(Telangana). This record is a bona fide work carried out by them under our guidance and
supervision. The results embodied in the report have not been submitted to any other University
or Institution for the award of any degree or diploma.
DECLARATION
We hereby declare that the work described in this thesis titled “CURRENCY
RECOGNITION USING PYTHON” which is being submitted by us in partial fulfilment
for the award of Bachelor of Technology in the Department of Electronics and
Communication Engineering, Sreenidhi Institute Of Science & Technology is the result of
investigations carried out by us under the guidance of Dr. C N SUJATHA, Associate
Professor, Department of ECE, Sreenidhi Institute of Science & Technology, Hyderabad.
No part of the thesis is copied from books/journals/the internet, and wherever a portion is taken,
it has been duly referenced. The report is based on the project work done entirely by us
and not copied from any other source. The work is original and has not been submitted for any
Degree/Diploma of this or any other university.
Place: Hyderabad
Date: 05-12-2019
ACKNOWLEDGEMENTS
We thank Dr. C N SUJATHA, Professor, Dept of ECE, Sreenidhi Institute of Science &
Technology, Hyderabad for her valuable comments and suggestions that greatly helped in
improving the quality of this thesis.
We would like to express our sincere gratitude to Dr. S.P.V. Subbarao, Professor, Head of the
department, Electronics & Communication Engineering, Sreenidhi Institute of Science &
Technology, Hyderabad for his continued support and valuable guidance and encouragement
extended to us during our research work. We thank him for his painstaking efforts to guide us
throughout our research work.
We are very grateful to Dr. P. NARSIMHA REDDY, Director and Dr. Siva Reddy,
Principal and the Management of Sreenidhi Institute Of Science & Technology for having
provided the opportunity for taking up this project.
We thank all our teachers and professors for their valuable comments after reviewing our
research papers.
We wish to extend our special thanks to all our colleagues and friends who helped directly or
indirectly to complete our research work.
We extend our thanks to our parents and all our family members for their unceasing
encouragement and support, which helped us greatly in completing this work.
TABLE OF CONTENTS
CHAPTER NO. TITLE PAGE NO.
INDEX-------------------------------------------------------------------------------- 5
LIST OF FIGURES------------------------------------------------------------------7
ABSTRACT---------------------------------------------------------------------------8
CHAPTER-1: INTRODUCTION .......................................................................................... 9
1.0 INTRODUCTION:.......................................................................................................... 9
1.1 BRIEF HISTORY: ..... ………………………………………………………………….9
CHAPTER 2: THEORETICAL BACKGROUND............................................................. 12
2.0 SAMPLING AND QUANTIZATION........................................................................... 12
2.1 RESIZING IMAGE........................................................................................................ 13
2.2 ALIASING AND IMAGE ENHANCEMENT .............................................................. 13
2.3 CONTRAST ENHANCEMENT ..................................................................................... 13
2.4 ARITHMETIC AND LOGICAL OPERATIONS……………………………………………………………14
2.5 SPATIAL DOMAIN FILTERING……………………………………………………………………………………14
CHAPTER-3: BUILDING CURRENCY RECOGNITION MODEL USING OPENCV………15
3.1 RESIZING IMAGE WITH FIXED ASPECT RATIO .................................................... 15
3.2 CONVERTING IMAGE TO GRAYSCALE ................................................................. 19
3.3 GAUSSIAN BLUR ........................................................................................................ 20
3.4 THRESHOLDING…………………………………………………………………….21
CHAPTER 4: EDGE OPERATORS ................................................................................... 23
4.0 SOBEL EDGE OPERATOR………………………………………………………….23
4.1 CANNY EDGE OPERATOR……………………………………………………...…24
4.2 LAPLACE EDGE OPERATOR……………………………………………………...26
4.3 HARRIS EDGE OPERATOR……………………………………………………......26
CHAPTER-5: BUILDING ROBUST CURRENCY RECOGNITION SYSTEM ............. 28
5.0 INTRODUCTION .......................................................................................................... 28
5.1 READ IN IMAGE .......................................................................................................... 29
5.2 IMAGE PROCESSING .................................................................................................. 29
5.3 SEGMENTATION......................................................................................................... 31
5.4 COLOUR DETECTION ................................................................................................ 31
CHAPTER-6: DESIGN AND IMPLEMENTATION ........................................................ 32
CHAPTER-7: EXPERIMENTAL RESULTS AND DISCUSSIONS ................................. 38
REFERENCES……………………………………………………………………………40
LIST OF FIGURES
10 5.1.1 Flow-chart 28
ABSTRACT
It is difficult for blind people to recognize currencies. In this project, we propose a system for
automated currency recognition using image processing techniques. The proposed method can
be used for recognizing both the country of origin and the denomination or value of a
given banknote. Only paper currencies have been considered. The method works by first
identifying the country of origin using certain predefined areas of interest, and then extracting
the denomination value using characteristics such as size, color, or text on the note, depending
on how much the notes within the same country differ. Our system is able to identify test notes
accurately and quickly. It can also be used in forex offices and ATMs.
CHAPTER-1
INTRODUCTION
1.0 INTRODUCTION:
In computer science, digital image processing is the use of computer algorithms to perform
image processing on digital images. As a subcategory or field of digital signal processing,
digital image processing has many advantages over analog image processing. It allows a much
wider range of algorithms to be applied to the input data and can avoid problems such as the
build-up of noise and signal distortion during processing. Since images are defined over two
dimensions (perhaps more) digital image processing may be modeled in the form of
multidimensional systems. The generation and development of digital image processing are
mainly affected by three factors: first, the development of computers; second, the development
of mathematics (especially the creation and improvement of discrete mathematics theory);
third, the demand for a wide range of applications in environment, agriculture, military,
industry and medical science has increased.
Many of the techniques of digital image processing, or digital picture processing as it often was
called, were developed in the 1960s, at Bell Laboratories, the Jet Propulsion Laboratory,
Massachusetts Institute of Technology, University of Maryland, and a few other research
facilities, with application to satellite imagery, wire-photo standards conversion, medical
imaging, videophone, character recognition, and photograph enhancement. The purpose of
early image processing was to improve the quality of an image for human viewers, i.e. to
improve its visual effect. In image processing, the input is a low-quality image,
and the output is an image with improved quality. Common image processing tasks include image
enhancement, restoration, encoding, and compression. The first successful application was at the
American Jet Propulsion Laboratory (JPL). They used image processing techniques such as
geometric correction, gradation transformation, noise removal, etc. on the thousands of lunar
photos sent back by the space probe Ranger 7 in 1964, taking into account the position of
the sun and the environment of the moon. The successful mapping of the moon's surface by
computer had a huge impact. Later, more complex image processing
was performed on the nearly 100,000 photos sent back by the spacecraft, so that the topographic
map, color map and panoramic mosaic of the moon were obtained, which achieved
extraordinary results and laid a solid foundation for human landing on the moon.
The cost of processing was fairly high, however, with the computing equipment of that era.
That changed in the 1970s, when digital image processing proliferated as cheaper computers
and dedicated hardware became available. This led to images being processed in real-time, for
some dedicated problems such as television standards conversion. As general-purpose
computers became faster, they started to take over the role of dedicated hardware for all but
the most specialized and computer-intensive operations. With the fast computers and signal
processors available in the 2000s, digital image processing has become the most common form
of image processing, and is generally used because it is not only the most versatile method, but
also the cheapest.
Image sensors
The basis for modern image sensors is metal-oxide-semiconductor (MOS) technology, which
originates from the invention of the MOSFET (MOS field-effect transistor) by Mohamed M.
Atalla and Dawon Kahng at Bell Labs in 1959. This led to the development of
digital semiconductor image sensors, including the charge-coupled device (CCD) and later
the CMOS sensor.
The charge-coupled device was invented by Willard S. Boyle and George E. Smith at Bell Labs
in 1969. While researching MOS technology, they realized that an electric charge was the
analogy of the magnetic bubble and that it could be stored on a tiny MOS capacitor. As it was
fairly straightforward to fabricate a series of MOS capacitors in a row, they connected a suitable
voltage to them so that the charge could be stepped along from one to the next. The CCD is a
semiconductor circuit that was later used in the first digital video cameras for television
broadcasting.
The NMOS active-pixel sensor (APS) was invented by Olympus in Japan during the mid-
1980s. This was enabled by advances in MOS semiconductor device fabrication,
with MOSFET scaling reaching smaller micron and then sub-micron levels. The NMOS APS
was fabricated by Tsutomu Nakamura's team at Olympus in 1985. The CMOS active-pixel
sensor (CMOS sensor) was later developed by Eric Fossum's team at the NASA Jet Propulsion
Laboratory in 1993. By 2007, sales of CMOS sensors had surpassed CCD sensors.
Image compression
An important development in digital image compression technology was the discrete cosine
transform (DCT), a lossy compression technique first proposed by Nasir Ahmed in 1972. DCT
compression became the basis for JPEG, which was introduced by the Joint Photographic
Experts Group in 1992. JPEG compresses images down to much smaller file sizes, and has
become the most widely used image file format on the Internet. Its highly efficient DCT
compression algorithm was largely responsible for the wide proliferation of digital
images and digital photos, with several billion JPEG images produced every day as of 2015.
DCT-based chips are also used for formatting luminance and color differences and for conversion
between different color formats (YIQ, YUV and RGB) for display purposes. DCTs are also
commonly used for high-definition television (HDTV) encoder/decoder chips.
Medical imaging
In 1972, Godfrey Hounsfield, an engineer at the British company EMI, invented the X-ray computed
tomography device for head diagnosis, which is what we usually call CT (computed
tomography). The CT method is based on projections of a section of the human head,
which are processed by computer to reconstruct the cross-sectional image; this is called image
reconstruction. In 1975, EMI successfully developed a CT device for the whole body, which
obtained clear tomographic images of various parts of the human body. In 1979, this diagnostic
technique won the Nobel Prize in Physiology or Medicine. Digital image processing technology for medical applications
was inducted into the Space Foundation Space Technology Hall of Fame in 1994.
CHAPTER 2
THEORETICAL BACKGROUND
2.0 SAMPLING AND QUANTIZATION
The sampling rate determines the spatial resolution of the digitized image, while the
quantization level determines the number of grey levels in the digitized image. A magnitude of
the sampled image is expressed as a digital value in image processing. The transition between
continuous values of the image function and its digital equivalent is called quantization.
The number of quantization levels should be high enough for human perception of fine shading
details in the image. The occurrence of false contours is the main problem in image which has
been quantized with insufficient brightness levels.
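As a small illustration (the file name 'sample.jpg' is only an assumption, not part of our data set), the following Python sketch requantizes an 8-bit grayscale image to a smaller number of grey levels; with very few levels the false contours mentioned above become visible:

import cv2
import numpy as np

img = cv2.imread('sample.jpg', cv2.IMREAD_GRAYSCALE)   # 8-bit image with 256 grey levels
levels = 8                                              # number of quantization levels to keep
step = 256 // levels
quantized = ((img // step) * step + step // 2).astype(np.uint8)   # map each pixel to the centre of its level
cv2.imshow('quantized', quantized)
cv2.waitKey(0)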
2.2 ALIASING AND IMAGE ENHANCEMENT
Digital sampling of any signal, whether sound, digital photographs, or other, can result in
apparent signals at frequencies well below anything present in the original. Aliasing occurs
when a signal is sampled at less than twice the highest frequency present in the signal. Signals
at frequencies above half the sampling rate must be filtered out to avoid the creation of signals
at frequencies not present in the original sound. Thus digital sound recording equipment
contains low-pass filters that remove any signals above half the sampling frequency.
Since a sampler is a linear system, if an input is a sum of sinusoids, the output will be a
sum of sampled sinusoids. This suggests that if the input contains no frequencies above the
Nyquist frequency, then it will be possible to reconstruct each of the sinusoidal components
from the samples. This is an intuitive statement of the Nyquist-Shannon sampling theorem.
Anti-aliasing is a process which attempts to minimize the appearance of aliased diagonal edges.
Anti-aliasing gives the appearance of smoother edges and higher resolution. It works by taking
into account how much an ideal edge overlaps adjacent pixels.
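A minimal sketch of this idea (the file name is an assumption): low-pass filtering an image before decimation suppresses the frequencies that would otherwise alias, while cv2.INTER_AREA performs comparable area averaging internally:

import cv2

img = cv2.imread('sample.jpg')
naive = img[::4, ::4]                         # keep every 4th pixel: prone to aliasing on fine detail
blurred = cv2.GaussianBlur(img, (7, 7), 0)    # low-pass filter first
filtered = blurred[::4, ::4]                  # then decimate: aliasing is greatly reduced
area = cv2.resize(img, None, fx=0.25, fy=0.25, interpolation=cv2.INTER_AREA)   # built-in area averaging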
2.3 CONTRAST ENHANCEMENT
Image enhancement techniques have been widely used in many applications of image
processing where the subjective quality of images is important for human interpretation.
Contrast is an important factor in any subjective evaluation of image quality. Contrast is created
by the difference in luminance reflected from two adjacent surfaces. In other words, contrast is
the difference in visual properties that makes an object distinguishable from other objects and
the background. In visual perception, contrast is determined by the difference in the colour and
brightness of the object with other objects. Our visual system is more sensitive to contrast than
absolute luminance; therefore, we can perceive the world similarly regardless of the
considerable changes in illumination conditions. Many algorithms for accomplishing contrast
enhancement have been developed and applied to problems in image processing.
If the contrast of an image is highly concentrated in a specific range, e.g. when an image is very
dark, information may be lost in those areas which are excessively and uniformly
concentrated. The problem is to optimize the contrast of an image in order to represent all the
information in the input image.
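For illustration, one widely used contrast enhancement method is histogram equalization; a minimal sketch (input file name assumed) is:

import cv2

img = cv2.imread('dark_note.jpg', cv2.IMREAD_GRAYSCALE)
equalized = cv2.equalizeHist(img)   # spread the concentrated grey levels over the full 0-255 range
cv2.imshow('before', img)
cv2.imshow('after', equalized)
cv2.waitKey(0)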
2.4 ARITHMETIC AND LOGICAL OPERATIONS
Image arithmetic applies one of the standard arithmetic operations or a logical operator to two
or more images. The operators are applied in a pixel-by-pixel way, i.e. the value of a pixel in
the output image depends only on the values of the corresponding pixels in the input images.
Hence, the images must be of the same size. Although image arithmetic is the simplest
form of image processing, it has a wide range of applications. A main advantage of arithmetic
operators is that the process is very simple and therefore fast.
Logical operators are often used to combine two (mostly binary) images. In the case of integer
images, the logical operator is normally applied in a bitwise way.
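A short sketch of such pixel-by-pixel operations (the two input files are assumed to be the same size):

import cv2

a = cv2.imread('img_a.png', cv2.IMREAD_GRAYSCALE)
b = cv2.imread('img_b.png', cv2.IMREAD_GRAYSCALE)   # must be the same size as a
added = cv2.add(a, b)            # pixel-wise addition with saturation at 255
diff = cv2.absdiff(a, b)         # pixel-wise absolute difference
masked = cv2.bitwise_and(a, b)   # bitwise AND, often used with binary masks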
2.5 SPATIAL DOMAIN FILTERING
CHAPTER-3
BUILDING CURRENCY RECOGNITION MODEL USING OPENCV
OpenCV (Open Source Computer Vision Library) is an open source computer vision and
machine learning software library. OpenCV was built to provide a common infrastructure for
computer vision applications and to accelerate the use of machine perception in commercial
products. Being a BSD-licensed product, OpenCV makes it easy for businesses to utilize and
modify the code.
The library has more than 2500 optimized algorithms, which includes a comprehensive set of
both classic and state-of-the-art computer vision and machine learning algorithms. These
algorithms can be used to detect and recognize faces, identify objects, classify human actions
in videos, track camera movements, track moving objects, extract 3D models of objects,
produce 3D point clouds from stereo cameras, stitch images together to produce a high
resolution image of an entire scene, find similar images from an image database, remove red
eyes from images taken using flash, follow eye movements, recognize scenery and establish
markers to overlay it with augmented reality, etc.
OpenCV was started at Intel in 1999 by Gary Bradsky and the first release came out in 2000.
Vadim Pisarevsky joined Gary Bradsky to manage Intel’s Russian software OpenCV team. In
2005, OpenCV was used on Stanley, the vehicle that won the 2005 DARPA Grand Challenge.
Later its active development continued under the support of Willow Garage, with Gary Bradsky
and Vadim Pisarevsky leading the project. Right now, OpenCV supports a lot of algorithms
related to Computer Vision and Machine Learning and it is expanding day-by-day.
Scaling
Scaling is just resizing of the image. OpenCV comes with a function cv.resize() for this
purpose. The size of the image can be specified manually, or you can specify the scaling factor.
Different interpolation methods are used. Preferable interpolation methods
are cv.INTER_AREA for shrinking and cv.INTER_CUBIC (slow) & cv.INTER_LINEAR for
zooming. By default, the interpolation method cv.INTER_LINEAR is used for all resizing
purposes. You can resize an input image with either of the following methods:
import numpy as np
import cv2 as cv
img = cv.imread('messi5.jpg')
res = cv.resize(img,None,fx=2, fy=2, interpolation = cv.INTER_CUBIC)
#OR
height, width = img.shape[:2]
res = cv.resize(img,(2*width, 2*height), interpolation = cv.INTER_CUBIC)
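As a further illustration of resizing with a fixed aspect ratio (the helper below is our own sketch, not an OpenCV function), the width can be fixed and the height computed from the original proportions:

import cv2 as cv

def resize_keep_aspect(img, new_width):
    # compute the matching height so that the aspect ratio is preserved
    h, w = img.shape[:2]
    new_height = int(h * new_width / w)
    return cv.resize(img, (new_width, new_height), interpolation=cv.INTER_AREA)

img = cv.imread('messi5.jpg')
small = resize_keep_aspect(img, 512)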
Translation
Translation is the shifting of an object's location. If you know the shift in the (x,y) direction
and let it be (tx,ty), you can create the transformation matrix M as follows:
M = [ 1  0  tx ]
    [ 0  1  ty ]
You can make it into a Numpy array of type np.float32 and pass it into
the cv.warpAffine() function. See the below example for a shift of (100, 50):
import numpy as np
import cv2 as cv
img = cv.imread('messi5.jpg',0)
rows,cols = img.shape
M = np.float32([[1,0,100],[0,1,50]])
dst = cv.warpAffine(img,M,(cols,rows))
cv.imshow('img',dst)
cv.waitKey(0)
cv.destroyAllWindows()
Warning
The third argument of the cv.warpAffine() function is the size of the output image, which
should be in the form of (width, height).
image 3.1.1
Rotation
Rotation of an image for an angle θ is achieved by the transformation matrix of the form
M = [ cosθ  −sinθ ]
    [ sinθ   cosθ ]
But OpenCV provides scaled rotation with adjustable center of rotation so that you can rotate
at any location you prefer. The modified transformation matrix is given by
[  α   β   (1−α)·center.x − β·center.y ]
[ −β   α   β·center.x + (1−α)·center.y ]
where:
α = scale·cosθ,  β = scale·sinθ
To find this transformation matrix, OpenCV provides a function, cv.getRotationMatrix2D.
Check out the below example, which rotates the image by 90 degrees with respect to the center
without any scaling.
img = cv.imread('messi5.jpg',0)
rows,cols = img.shape
M = cv.getRotationMatrix2D(((cols-1)/2.0,(rows-1)/2.0),90,1)
dst = cv.warpAffine(img,M,(cols,rows))
image 3.1.2
Affine Transformation
In affine transformation, all parallel lines in the original image will still be parallel in the output
image. To find the transformation matrix, we need three points from the input image and their
corresponding locations in the output image. Then cv.getAffineTransform will create a 2x3
matrix which is to be passed to cv.warpAffine.
Check the below example, and also look at the points I selected (which are marked in green
color):
img = cv.imread('drawing.png')
rows,cols,ch = img.shape
pts1 = np.float32([[50,50],[200,50],[50,200]])
pts2 = np.float32([[10,100],[200,50],[100,250]])
M = cv.getAffineTransform(pts1,pts2)
dst = cv.warpAffine(img,M,(cols,rows))
plt.subplot(121),plt.imshow(img),plt.title('Input')
plt.subplot(122),plt.imshow(dst),plt.title('Output')
plt.show()
image 3.1.3
Perspective Transformation
For perspective transformation, you need a 3x3 transformation matrix. Straight lines will
remain straight even after the transformation. To find this transformation matrix, you need 4
points on the input image and corresponding points on the output image. Among these 4 points,
3 of them should not be collinear. Then the transformation matrix can be found by the
function cv.getPerspectiveTransform. Then apply cv.warpPerspective with this 3x3
transformation matrix.
img = cv.imread('sudoku.png')
rows,cols,ch = img.shape
pts1 = np.float32([[56,65],[368,52],[28,387],[389,390]])
pts2 = np.float32([[0,0],[300,0],[0,300],[300,300]])
M = cv.getPerspectiveTransform(pts1,pts2)
dst = cv.warpPerspective(img,M,(300,300))
plt.subplot(121),plt.imshow(img),plt.title('Input')
plt.subplot(122),plt.imshow(dst),plt.title('Output')
plt.show()
Result:
image 3.1.4
3.2 CONVERTING IMAGE TO GRAYSCALE
Grayscaling is the process of converting an image from other color spaces, e.g. RGB, CMYK,
HSV, etc., to shades of gray. The gray value varies between complete black and complete white.
Importance of grayscaling –
• Dimension reduction: For example, an RGB image has three color channels and therefore
three dimensions, while a grayscale image has a single channel.
• Reduces model complexity: Consider training a neural network on RGB images of 10x10x3
pixels. The input layer will have 300 input nodes. On the other hand, the same neural
network will need only 100 input nodes for grayscale images.
• For other algorithms to work: Many algorithms are customized to work
only on grayscale images, e.g. the Canny edge detection function pre-implemented in the
OpenCV library works on grayscale images only.
# importing opencv
import cv2
# read the input image ('input.jpg' is an assumed file name) and convert it to grayscale
image = cv2.imread('input.jpg')
gray_image = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
cv2.imshow('Grayscale', gray_image)
cv2.waitKey(0)
Input image:
image 3.2.1
Output:
image 3.2.2
3.3 GAUSSIAN BLUR
In OpenCV, Gaussian blurring is performed with the GaussianBlur() function, whose main parameters are:
src − A Mat object representing the source (input image) for this operation.
dst − A Mat object representing the destination (output image) for this operation.
ksize − A Size object representing the size of the kernel.
sigmaX − A variable of the type double representing the Gaussian kernel standard deviation in
X direction.
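Putting these parameters together, a minimal call could look as follows (the input file name is an assumption):

import cv2

src = cv2.imread('note.jpg')
dst = cv2.GaussianBlur(src, (5, 5), 0)   # ksize = (5, 5); sigmaX = 0 lets OpenCV derive sigma from the kernel size
cv2.imshow('Gaussian blur', dst)
cv2.waitKey(0)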
3.4 THRESHOLDING
Thresholding is a technique in OpenCV, which is the assignment of pixel values in relation to
the threshold value provided. In thresholding, each pixel value is compared with the threshold
value. If the pixel value is smaller than the threshold, it is set to 0, otherwise, it is set to a
maximum value (generally 255). Thresholding is a very popular segmentation technique, used
for separating an object considered as a foreground from its background. A threshold is a value
which has two regions on its either side i.e. below the threshold or above the threshold.
In Computer Vision, this technique of thresholding is done on grayscale images. So initially,
the image has to be converted to the grayscale color space.
If f(x, y) > T
   then f(x, y) = 255
else
   f(x, y) = 0
where
f(x, y) = pixel value at coordinate (x, y)
T = threshold value.
In OpenCV with Python, the function cv2.threshold is used for thresholding.
Syntax: cv2.threshold(source, thresholdValue, maxVal, thresholdingTechnique)
Parameters:
-> source: Input Image array (must be in Grayscale).
-> thresholdValue: Value of Threshold below and above which pixel values will change
accordingly.
-> maxVal: Maximum value that can be assigned to a pixel.
-> thresholdingTechnique: The type of thresholding to be applied.
Simple Thresholding
The basic Thresholding technique is Binary Thresholding. For every pixel, the same threshold
value is applied. If the pixel value is smaller than the threshold, it is set to 0, otherwise, it is set
to a maximum value.
The different Simple Thresholding Techniques are:
• cv2.THRESH_BINARY: If pixel intensity is greater than the set threshold, value set to
255, else set to 0 (black).
• cv2.THRESH_BINARY_INV: Inverted or Opposite case of cv2.THRESH_BINARY.
• cv.THRESH_TRUNC: If pixel intensity value is greater than threshold, it is truncated
to the threshold. The pixel values are set to be the same as the threshold. All other values
remain the same.
• cv.THRESH_TOZERO: Pixel intensity is set to 0 for all pixels with intensity less than
the threshold value.
• cv.THRESH_TOZERO_INV: Inverted or Opposite case of cv2.THRESH_TOZERO.
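A short sketch applying the techniques listed above (the threshold value 127 and the file name are arbitrary choices for illustration):

import cv2

img = cv2.imread('note.jpg', cv2.IMREAD_GRAYSCALE)            # thresholding works on grayscale images
_, binary = cv2.threshold(img, 127, 255, cv2.THRESH_BINARY)
_, binary_inv = cv2.threshold(img, 127, 255, cv2.THRESH_BINARY_INV)
_, trunc = cv2.threshold(img, 127, 255, cv2.THRESH_TRUNC)
_, tozero = cv2.threshold(img, 127, 255, cv2.THRESH_TOZERO)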
CHAPTER 4
EDGE OPERATORS
4.0 SOBEL EDGE OPERATOR
Formulation
1. The image I is convolved with two 3x3 kernels to approximate the horizontal and vertical derivatives:

Gx = [ −1  0  +1 ]
     [ −2  0  +2 ] ∗ I
     [ −1  0  +1 ]

Gy = [ −1  −2  −1 ]
     [  0   0   0 ] ∗ I
     [ +1  +2  +1 ]

2. At each point of the image we calculate an approximation of the gradient at that point
by combining both results above:

G = √(Gx² + Gy²)

which is often approximated as

G = |Gx| + |Gy|
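The same operator is available in OpenCV through cv2.Sobel(); a minimal sketch (file name assumed) is shown below, and the full utility function used in our system appears in Chapter 6:

import cv2

img = cv2.imread('note.jpg', cv2.IMREAD_GRAYSCALE)
grad_x = cv2.Sobel(img, cv2.CV_16S, 1, 0, ksize=3)
grad_y = cv2.Sobel(img, cv2.CV_16S, 0, 1, ksize=3)
# approximate G = |Gx| + |Gy| with a weighted sum of the absolute gradients
grad = cv2.addWeighted(cv2.convertScaleAbs(grad_x), 0.5, cv2.convertScaleAbs(grad_y), 0.5, 0)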
4.1 CANNY EDGE OPERATOR
Canny Edge Detection is a popular edge detection algorithm. It was developed by John F.
Canny in 1986. It is a multi-stage algorithm.
2. Noise Reduction
Since edge detection is susceptible to noise in the image, the first step is to remove the
noise in the image with a 5x5 Gaussian filter. We have already seen this in previous
chapters.
3. Finding Intensity Gradient of the Image
The smoothened image is then filtered with a Sobel kernel in both the horizontal and vertical
directions to get the first derivative in the horizontal direction (Gx) and the vertical direction (Gy).
From these two images, we can find the edge gradient and direction for each pixel as
follows:

Edge_Gradient (G) = √(Gx² + Gy²)
Angle (θ) = tan⁻¹(Gy / Gx)
4. Non-maximum Suppression
After getting gradient magnitude and direction, a full scan of image is done to remove
any unwanted pixels which may not constitute the edge. For this, at every pixel, pixel
is checked if it is a local maximum in its neighborhood in the direction of gradient.
Check the image below:
image 4.1.1
Point A is on the edge ( in vertical direction). Gradient direction is normal to the edge.
Point B and C are in gradient directions. So point A is checked with point B and C to
see if it forms a local maximum. If so, it is considered for next stage, otherwise, it is
suppressed ( put to zero).
In short, the result you get is a binary image with "thin edges".
5. Hysteresis Thresholding
This stage decides which edges are really edges and which are not. For this, we
need two threshold values, minVal and maxVal. Any edges with intensity gradient more
than maxVal are sure to be edges, and those below minVal are sure to be non-edges, so they are
discarded. Those that lie between these two thresholds are classified as edges or non-
edges based on their connectivity. If they are connected to "sure-edge" pixels, they are
considered to be part of edges. Otherwise, they are also discarded. See the image below:
image 4.1.2
This stage also removes small pixel noise on the assumption that edges are long lines.
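A minimal sketch of Canny edge detection in OpenCV (the file name and the two hysteresis thresholds 100 and 200 are illustrative assumptions):

import cv2

img = cv2.imread('note.jpg', cv2.IMREAD_GRAYSCALE)
edges = cv2.Canny(img, 100, 200)   # minVal = 100, maxVal = 200 are the hysteresis thresholds
cv2.imshow('Canny edges', edges)
cv2.waitKey(0)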
4.2 LAPLACE EDGE OPERATOR
1. The second derivative can be used to detect edges. Since images are 2D, we would
need to take the derivative in both dimensions. Here, the Laplacian operator comes in
handy.
2. The Laplacian operator is defined by:
Laplace(f) = ∂²f/∂x² + ∂²f/∂y²
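A minimal sketch of the Laplacian operator in OpenCV (file name and kernel size are illustrative assumptions):

import cv2

img = cv2.imread('note.jpg', cv2.IMREAD_GRAYSCALE)
img = cv2.GaussianBlur(img, (3, 3), 0)             # smooth first: the second derivative is sensitive to noise
laplacian = cv2.convertScaleAbs(cv2.Laplacian(img, cv2.CV_16S, ksize=3))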
4.3 HARRIS EDGE OPERATOR
The Harris detector looks for the change in intensity when a window is shifted in all directions by (u, v):

E(u, v) = Σ w(x, y) [ I(x + u, y + v) − I(x, y) ]²

The window function w(x, y) is either a rectangular window or a Gaussian window which gives weights to the
pixels underneath.
We have to maximize this function for corner detection. That means we have to
maximize the second term. Applying a Taylor expansion to the above equation and using some
mathematical steps (please refer to any standard textbook you like for the full derivation), we get
the final equation as:

E(u, v) ≈ [u v] M [u v]ᵀ

where

M = Σ w(x, y) [ Ix·Ix  Ix·Iy ]
              [ Ix·Iy  Iy·Iy ]

Here, Ix and Iy are the image derivatives in the x and y directions respectively. (They can easily be found
using cv2.Sobel().)
Then comes the main part. After this, they created a score, basically an equation, which will
determine whether a window can contain a corner or not:

R = det(M) − k (trace(M))²

where
• det(M) = λ1 λ2
• trace(M) = λ1 + λ2
• λ1 and λ2 are the eigenvalues of M
So the values of these eigenvalues decide whether a region is a corner, an edge, or flat.
• When |R| is small, which happens when λ1 and λ2 are small, the region is flat.
• When R < 0, which happens when λ1 >> λ2 or vice versa, the region is an edge.
• When R is large, which happens when λ1 and λ2 are large and λ1 ≈ λ2, the region is
a corner.
image 4.1.3
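A minimal sketch of Harris corner detection in OpenCV (the file name and the parameters blockSize = 2, ksize = 3, k = 0.04 are commonly used illustrative values):

import cv2
import numpy as np

img = cv2.imread('note.jpg')
gray = np.float32(cv2.cvtColor(img, cv2.COLOR_BGR2GRAY))
dst = cv2.cornerHarris(gray, 2, 3, 0.04)       # blockSize = 2, Sobel aperture = 3, k = 0.04
img[dst > 0.01 * dst.max()] = [0, 0, 255]      # mark detected corners in red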
CHAPTER-5
BUILDING ROBUST CURRENCY RECOGNITION SYSTEM
5.0 INTRODUCTION
The system asks the user to take an image of the currency when launching. After that, the
system tries to recognize the currency. When the recognition processing starts, the system
internally performs some image processing on the image (pre-processing, segmentation, edge
detection and so on). If the image exhibits information loss such as surface damage, a high noise level,
sharpness issues and so on, the recognition may fail and the user has to do the processing again.
The system does not need any extra device; our algorithm relies on visual features for recognition. It
can recognize the currency and print out the result as text.
The system contains a complete user interface: the user just needs to open the image, and the
result will be shown after processing. The most important part of this system is pattern
matching. Based on an edge detection algorithm, the system performs pattern matching. When
some error occurs, the system will raise an exception, such as "the image is not complete",
"failed recognition", etc.
image 5.0.1
5.1 READ IN IMAGE
The system can read not only the JPEG (JPG) format but also others. Our image was obtained
from a scanner. As mentioned before, the resolution is set to 600 DPI, but this
makes the image large. So after reading in the image, the system resets the
image to a size of 1024 by 768 pixels; this step belongs to image pre-processing.
When a digital camera or a scanner is used to transfer the image, some noise will appear
on the image. Image noise is random variation of brightness in images. Removing the noise
is an important step in image processing, since noise may affect
segmentation and pattern matching. When performing the smoothing process on a pixel, the
neighbourhood of the pixel is used to do some transformation. After that, a new value of the pixel is
created. The neighbourhood of the pixel consists of some other pixels and they build up a
matrix; the size of the matrix is an odd number, and the target pixel is located at the middle of the
matrix.
Convolution is used to perform image smoothing. As the first step,
we centre our filter over the pixel that will be filtered. The filter's coefficients are multiplied by the
pixel values beneath them and the results are added together. The central pixel value is changed to
the newly calculated value.
As the last step, the filter is moved to the next pixel and the convolution process is repeated.
New calculated values are not used in the next pixel filtering. Only old values are involved.
When the filter is centred over a pixel at the border, some part of it will be outside the
edge of the image. There are some techniques to handle these situations:
1. Zero padding: all filter values outside the image are set to 0.
2. Wrapping: all filter values outside the image are set to their "reflection" value.
4. The unfiltered rows and columns are copied to the resulting image.
We use a Gaussian operator to blur the image and suppress the noise; it can be seen as an ideal smoothing
function which is easy to specify. We create the filter elements from the Gaussian.
The standard deviation σ is the square root of the average of the n values' squared deviations from
their mean, or simply the standard deviation of the distribution:

σ = √σ²

where the variance σ² is the average of the squared deviations of the values from their mean:

σ² = (1/n) Σ (xᵢ − μ)²
The filter is produced as illustrated above. However, there are other ways to smooth the
image, such as the median filter. Median filtering is a nonlinear operation that is
often used in image processing to reduce "salt and pepper" noise. Besides, the median filter is
more effective than convolution when the goal is to simultaneously reduce noise and
preserve edges. After processing with the median filter, the noise is removed well and the
details are preserved well in the image. The pattern, which is the most important thing that we
want to find, is also clear.
The median filter replaces each pixel with the median of all the pixels in its neighbourhood.
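A short sketch of the two smoothing operations described above (the file name and kernel sizes are chosen only for illustration):

import cv2

img = cv2.imread('scanned_note.jpg')
gauss = cv2.GaussianBlur(img, (5, 5), 1)   # Gaussian smoothing with a 5x5 kernel and sigma = 1
median = cv2.medianBlur(img, 3)            # median filter, effective against salt-and-pepper noise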
After removing the noise, the next step is to cut off some useless area. Sometimes, for various
reasons, some black lines will appear on the edge of the original image, which would affect the
next operation. To avoid this problem, we cut each side down by 10 pixels. Compared with an
A4 size paper, the currency is quite small. However, when we get the image from the scanner, the
image we get is a picture the size of an A4 page. So after scanning, the image will have a lot of white
area surrounding the currency. This part is useless for recognition. In order to make
the system efficient, the white area is cut away entirely.
Because of the lighting conditions when getting the image from a digital camera, we need to perform
histogram equalization. Histogram equalization is used to adjust the contrast and brightness of
the image, because part of the recognition is based on colour processing. Different lighting
conditions may affect the result, so histogram equalization needs to be performed.
For segmentation, we removed more unwanted content by binarizing the image. We
had to set a threshold to decide which pixels are set to "0" (black) and
which are set to "1" (white). The parts we do not need are set to "1". We
set two values for the threshold; a pixel whose value lies between those two values is set to "0",
and the others are set to "1".
In order to remove the white area, we created our own algorithm. Scan the image in the x direction and
the y direction and inspect each pixel's value: if the value does not equal 1 (that means it is not a
white point), record this point and continue detecting; if the value equals 1 (that means it is a
white point), record this point and break the loop. We set a flag; when the flag equals 0, it
means this row or column has been checked and it contains a black point, so this row or column can
be skipped and the next one checked. When we meet a row or column that consists only of white
points, we record it. When the y direction is finished, we do the same thing in the x
direction. Because the system records the first black point and the first white point that are hit,
we can get the boundaries of top and bottom, left and right.
When we have the boundaries of left, right, top and bottom, we have the coordinates of each corner point, and
the currency is then separated successfully.
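A simplified sketch of this boundary search (our own illustration, using NumPy instead of the explicit loop described above, and assuming a binarized image in which background pixels are white/1 and currency pixels are black/0):

import cv2
import numpy as np

# binarized image: True where the pixel is white (background)
binary = cv2.imread('binarized_note.png', cv2.IMREAD_GRAYSCALE) > 0
# rows and columns that contain at least one black (currency) pixel
rows_with_note = np.where(~binary.all(axis=1))[0]
cols_with_note = np.where(~binary.all(axis=0))[0]
top, bottom = rows_with_note[0], rows_with_note[-1]
left, right = cols_with_note[0], cols_with_note[-1]
cropped = binary[top:bottom + 1, left:right + 1]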
5.3 SEGMENTATION
After observation we knew that each currency has one or more unique patterns, so
these unique patterns can be used to distinguish different types of currencies.
Because the position of the pattern is fixed, we can segment it proportionally
and finally get what we want. We set scales for each side of the image, and after that the
pattern is segmented.
PY1 and PY2 are the boundaries of top and bottom, and PX1 and PX2 are the
boundaries of left and right. After calculation, we get the new boundary of the pattern.
That is the way we segment the pattern.
5.4 COLOUR DETECTION
In this part, we are going to describe how to detect the primary colour of the images. There
are many types of colour model we can use, such as RGB, HSV, and grayscale. We use the RGB
model because we need to calculate the mean of the colour. The image is represented as an x by y
by 3 matrix (here x is the width of the image, y is the height of the image); we iterate over each pixel
and store the values of R, G, and B. After that, the mean of each channel is calculated. We
do not calculate the primary colour of the whole currency. We use half of the currency, because
most currencies are divided into two parts: the left part is mostly white area, while
the right part has some patterns or a portrait. The primary colour of the image is used to check
which currency this is, and this is one of the important characteristics for recognizing the currency.
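A minimal sketch of this primary-colour check (the file name and the right-half split are illustrative assumptions):

import cv2

img = cv2.imread('note.jpg')            # OpenCV loads the image in B, G, R order
h, w = img.shape[:2]
right_half = img[:, w // 2:]            # the half that carries the pattern or portrait
mean_b, mean_g, mean_r = right_half.reshape(-1, 3).mean(axis=0)
print('mean R, G, B:', mean_r, mean_g, mean_b)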
CHAPTER-6
DESIGN AND IMPLEMENTATION
from matplotlib import pyplot as plt
import subprocess
from gtts import gTTS
# utils.py
# contains utility functions
import cv2
import math
import numpy as np
import matplotlib.pyplot as plt
from pprint import pprint
# read image as is
def read_img(file_name):
    img = cv2.imread(file_name)
    return img

# binarize (threshold)
# retval not used currently
def binary_thresh(image, threshold):
    retval, img_thresh = cv2.threshold(image, threshold, 255, cv2.THRESH_BINARY)
    return img_thresh
def adaptive_thresh(image):
    # cv2.adaptiveThreshold(src, maxValue, adaptiveMethod, thresholdType, blockSize, C[, dst]) -> dst
    img_thresh = cv2.adaptiveThreshold(image, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C, cv2.THRESH_BINARY, 11, 8)
    return img_thresh
# sobel edge x + y
def sobel_edge2(image):
    # ksize = size of extended sobel kernel
    grad_x = cv2.Sobel(image, cv2.CV_16S, 1, 0, ksize=3, borderType=cv2.BORDER_DEFAULT)
    grad_y = cv2.Sobel(image, cv2.CV_16S, 0, 1, ksize=3, borderType=cv2.BORDER_DEFAULT)
    abs_grad_x = cv2.convertScaleAbs(grad_x)
    abs_grad_y = cv2.convertScaleAbs(grad_y)
    # the return was not captured in the report; a weighted sum of the absolute gradients is assumed
    return cv2.addWeighted(abs_grad_x, 0.5, abs_grad_y, 0.5, 0)
# laplacian edge
def laplacian_edge(image):
    # good for text
    img = cv2.Laplacian(image, cv2.CV_8U)
    return img

# detect contours
def find_contours(image):
    # note: OpenCV 3.x returns three values here; OpenCV 4.x returns two
    (_, contours, _) = cv2.findContours(image, cv2.RETR_LIST, cv2.CHAIN_APPROX_SIMPLE)
    # keep the five largest contours by area
    contours = sorted(contours, key=cv2.contourArea, reverse=True)[:5]
    return contours
# median blur
def median_blur(image):
    blurred_img = cv2.medianBlur(image, 3)
    return blurred_img

# edge map: Canny followed by dilation and erosion
def close(image):
    img = cv2.Canny(image, 75, 300)
    img = cv2.dilate(img, None)
    img = cv2.erode(img, None)
    return img

# harris corner response
def harris_edge(image):
    img_gray = np.float32(image)
    # the rest of this function was not captured in the report;
    # a typical continuation (assumed here) applies cv2.cornerHarris to img_gray
    dst = cv2.cornerHarris(img_gray, 2, 3, 0.04)
    return dst
# calculate histogram
def histogram(image):
    # cv2.calcHist(images, channels, mask, histSize, ranges[, hist[, accumulate]])
    hist = cv2.calcHist([image], [0], None, [256], [0, 256])
    plt.plot(hist)
    plt.show()
# calculate scale and fit into display
def display(window_name, image):
    screen_res = 1440, 900  # MacBook Air
    # (the window-scaling code using screen_res was not captured in the report)
    # display image
    cv2.imshow(window_name, image)
max_val = 8
max_pt = -1
max_kp = 0

orb = cv2.ORB_create()
# orb is an alternative to SIFT

test_img = read_img('files/test_100_2.jpg')
#test_img = read_img('files/test_50_2.jpg')
#test_img = read_img('pic50.jpg')
#test_img = read_img('files/test_100_3.jpg')
#test_img = read_img('files/test_20_4.jpg')

# keypoints and descriptors of the test image
# (this step was not captured in the report, but kp1 and des1 are used below)
(kp1, des1) = orb.detectAndCompute(test_img, None)

# training images; the file names are inferred from the outputs shown below
training_set = ['files/20.jpg', 'files/50.jpg', 'files/100.jpg', 'files/500.jpg']

for i in range(len(training_set)):
    # keypoints and descriptors of the current training image
    train_img = cv2.imread(training_set[i])
    (kp2, des2) = orb.detectAndCompute(train_img, None)

    # brute-force matcher with knnMatch (the matcher creation was not captured in the report)
    bf = cv2.BFMatcher()
    all_matches = bf.knnMatch(des1, des2, k=2)

    good = []
    # give an arbitrary number -> 0.789
    # if good -> append to list of good matches
    for (m, n) in all_matches:
        if m.distance < 0.789 * n.distance:
            good.append([m])

    if len(good) > max_val:
        max_val = len(good)
        max_pt = i
        max_kp = kp2

    print(i, ' ', training_set[i], ' ', len(good))

if max_val != 8:
    print(training_set[max_pt])
    print('good matches ', max_val)
    train_img = cv2.imread(training_set[max_pt])
    img3 = cv2.drawMatchesKnn(test_img, kp1, train_img, max_kp, good, 4)
    note = str(training_set[max_pt])[6:-4]
    print('\nDetected denomination: Rs. ', note)
    # audio_file = "value.mp3"
    # tts = gTTS(text=speech_out, lang="en")
    # tts.save(audio_file)
    # return_code = subprocess.call(["afplay", audio_file])   # audio_file is undefined while the gTTS lines are commented out
    (plt.imshow(img3), plt.show())
else:
    print('No Matches')
OUTPUTS:
image 6.0.1
0 files/20.jpg 16
1 files/50.jpg 17
2 files/100.jpg 15
3 files/500.jpg 11
files/50.jpg
good matches 17
CHAPTER-7
EXPERIMENTAL RESULTS AND DISCUSSIONS
CHAPTER 8
CONCLUSION:
This project proposes an algorithm for recognizing currency using image
processing. The proposed algorithm uses the primary colour and a part of the currency for
recognition. We differentiate the denomination of the currency using the mean brightness values of
R, G and B. This is the first condition for recognizing the currency. Next, we segment the
pattern from the currency and perform template matching to check the currency. The
experiments performed by a program based on the aforesaid algorithm indicate that our currency
recognition system based on image processing is quite quick and accurate. However, such a
system suffers from some drawbacks: the quality of the currency sample and the damage level of
the paper currency affect the recognition rate. Our system also still has some limitations,
such as sensitivity to lighting conditions. In the future, we are going to modify our system to overcome these
limitations, especially the problems that arise when the image is taken with a digital camera, and to complete our
database for recognizing more currencies.
REFERENCES
[1] Ahmed, M. J., Sarfraz, M., Zidouri, A., and Alkhatib, W. G., License Plate
Recognition System, The Proceedings of The 10th IEEE International Conference
On Electronics, Circuits And Systems (ICECS2003), Sharjah, United Arab
Emirates (UAE), 2003.
[3] Burger, W., Burge, M. J. Digital Image Processing: An Algorithmic Introduction
Using Java. Springer, New York, 2007.
[4] John C. Russ. The Image Processing Handbook, Fifth Edition. Taylor & Francis,
North Carolina, 2006.
[5] Mark S. Nixon and Alberto S. Aguado. Feature Extraction and Image Processing.
Academic Press, 2008
[6] Milan Sonka, Vaclav Hlavac, Roger Boyle. Image Processing, Analysis, and
Machine Vision Third Edition. Thomson, 2008
[7] Rafael C. Gonzalez, Richard E. Woods. Digital Image Processing 2nd edition.
Pearson Education, New York, 2001.
[11] Xifan Shi, Weizhong Zhao, and Yonghang Shen, Automatic License Plate
Recognition System Based on Color Image Processing. Springer Berlin/Heidelberg, 2005.