A - 10 Research Paper
Using ML*
1st Gaurav Arvindrao Deshmukh, Computer Engineering, GHRCEM, Pune, India, [email protected]
2nd Kunal Sunil Salunkhe, Computer Engineering, GHRCEM, Pune, India, [email protected]
3rd Pranjali Ram Bansode, Computer Engineering, GHRCEM, Pune, India, [email protected]
Abstract—Technology has transformed nearly every sector, yet the banking sector still faces several problems. Banks handle large volumes of cheques in the clearing process, which involves several technical verifications, including signature verification. Some of these steps are manual and need human intervention to complete, so the process demands substantial staffing and long turnaround times. Fraudulent practices are among the foremost problems plaguing banks in recent times, and hiring more people to manage cheque verification does not solve this issue. This project offers a solution to this problem. Optical Character Recognition (OCR), workflow systems, and machine learning techniques are the key technologies for building automatic document processing. For the clearance of bank cheques and financial transactions, processing cheques with minimal human intervention saves time, and automating the process through computer vision can help ensure that only authentic documents are handled, so that cheques are not delayed in reaching their destination. We propose an automated system that extracts the relevant details of a bank cheque, such as Payee Name, Amount, Date, and Bank Name, using Optical Character Recognition and Deep Learning, and that verifies the signature on the cheque against the existing signature kept in the database using feature extraction and principal component analysis. Our innovation aims to benefit the banking industry by renewing the otherwise capable cheque-based financial transaction system, which needs machine-driven intervention.

Index Terms—Financial Document Processing, document image processing, Machine Learning, Optical Character Recognition, information retrieval, information extraction, Financial Document Classification.

I. INTRODUCTION
Technology has affected every sector in today's world, yet the banking sector still faces many issues. Banks handle large volumes of cheques in the clearing process, which involves several technical verifications, including signature verification. Some of these steps are manual and need human intervention to complete. The present process therefore requires substantial staffing and long turnaround times. Fraudulent practices are among the key problems plaguing banks in recent times, and hiring more people to manage the cheque verification process does not solve this issue. This project offers a solution to this problem. Optical Character Recognition (OCR), workflow systems, and machine learning techniques are the key technologies for building automatic document processing. For the clearance of bank cheques and financial transactions, processing cheques with minimal human intervention saves time, and automating the process through computer vision can help ensure that only authentic documents are handled, so that cheques are not delayed in reaching their destination. We propose an automated system that uses optical character recognition and deep learning to extract important information from a bank cheque, such as the Payee Name, Amount, Date, and Bank Name, and that uses feature extraction and principal component analysis to compare the signature on the cheque with the signature stored in the database. Our innovation aims to benefit the banking industry by renewing the otherwise capable cheque-based financial transaction system, which needs machine-driven intervention.

II. EXISTING SYSTEM
Over the past ten years, many developers and authors have created and published models and systems for cheque processing. Only a few of these models were successful, and the others failed to meet the requirements to be considered successful. One such model is described below. This digit recogniser eliminates the need for users to manually enter the account number and courtesy amount that are written on cheques. For the automation of cheque processing or auxiliary verification, past research has introduced various methods for reading the information on the cheque, including the courtesy amount (numeric), legal amount
(text), signature, and handwriting in particular languages such as English, French, or Korean. The development of a bank cheque recognition system using a neural network was the subject of research in Malaysia, but the results were unsatisfactory. To the best of our knowledge, no further research on, or deployment of, digit recognition has been carried out in Malaysia, particularly to enhance performance in the banking sector. The goal of the solution is to fully automate the cheque deposit procedure nationwide, which will benefit both bank employees and clients. The cheque deposit machine will integrate the digit recogniser. Customers place their cheques into the machine, which then scans the cheque and reads the courtesy amount and bank account information from the image.

III. METHODOLOGY
Python is a free, open-source, high-level language for general-purpose programming. It has become one of the most popular languages among data scientists thanks to its user-friendliness, simplicity, and quick prototyping. To read and process data reliably and effectively, it offers strong statistical and numerical libraries such as NumPy and pandas, as well as a useful machine learning library called scikit-learn. For building deep learning models, Python provides TensorFlow and Keras. In this work, we use these libraries to create a CNN model. We chose Python as our implementation language for data analytics and machine learning because its large library ecosystem gives it mature support for data science.

A. Image Acquisition
For document processing, acquiring an image of the bank cheque is essential. These images are usually acquired with flatbed scanners. We could not use the scanned images directly for the image processing steps because of their orientation and the irregularities they contain, so some pre-processing was necessary.

B. Image Pre-processing
For our research, we used scanned images of the cheques. Since images from scanners cannot be used directly, we had to pre-process them. This entailed two main operations: rotation and removal of extraneous background data. In this phase, the scanned image was rotated relative to the "Date Box", a standard feature located in the same position on every bank cheque, and the background noise and additional data were then eliminated. Eliminating unnecessary background information significantly increases the efficiency of parameter identification.

C. Rotation
Because the scanned images may differ in orientation, we selected the date box found on all standard cheque leaves as a reference, since its position is comparatively fixed. To rotate the image, we established the point of rotation and the angle of rotation. Contour extraction was the key step that made rotation possible, since it allowed us to establish the position of the date box, which served as our anchor for all length-mapping operations. The contour extraction procedure is described in the segmentation sub-section that follows. We then rotated the image using the centre of the image as the rotation point and the date box as the reference.

D. Grayscale Operation
After removing the background noise from the RGB (three-channel) image, we converted it into a grayscale (single-channel) image using the formula

Gray Image = 0.2126 × R + 0.7151 × G + 0.0721 × B

Here R, G, and B represent the Red, Green, and Blue channels of the image, and Gray Image represents the pixel value of the grayscale image.

E. Signature Verification
To determine whether the signature belongs to the specified individual, features are extracted from the signature and compared with the stored features. The technique is employed to prevent forgery in financial institutions. Features are extracted using Principal Component Analysis (PCA) and are then compared with those in the database. Execution continues if the signature is validated; otherwise, it stops.

F. CNN model for amount identification
After the component parts of the image have been extracted, the CNN model is used to recognise the handwritten numeric digits of the courtesy amount and to turn the legal amount into a string. We utilised the Deep Learning Toolbox to create a CNN with two convolutional layers and between six and twelve filters. The numeric output recognised from the courtesy amount is then matched against the string representing the legal amount.

IV. SYSTEM DESIGN
System design is the process of establishing a system's architecture, parts, modules, interfaces, and data in order to meet predetermined criteria. It can be thought of as the application of systems theory to the creation of products. The most popular methodologies for designing computer systems are increasingly those that use object-oriented analysis and design.

The system receives the cheque, and as part of pre-processing, OCR is first performed on the entire cheque image. We have created templates for the banks whose cheques customers may submit, so that the necessary regions can be cropped out rapidly and effectively using OpenCV. After receiving the OCR results, we use string matching and manipulation techniques to identify the specific bank template from the result.
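The template lookup described in Section IV (System Design) can be illustrated with a minimal sketch. It assumes the OCR step has already produced plain text, and it uses Python's standard `difflib` for fuzzy string matching so that small OCR errors do not prevent a match. The bank names, crop coordinates, the 0.6 cutoff, and the helper names `identify_bank` and `crop` are all hypothetical and are not taken from the system described in this paper.

```python
import difflib

# Hypothetical bank templates: each maps a field name to a crop box
# (left, top, right, bottom) in pixels. Names and numbers are invented.
TEMPLATES = {
    "STATE BANK OF INDIA": {"date": (1500, 40, 1900, 110),
                            "amount": (1450, 300, 1900, 380)},
    "HDFC BANK": {"date": (1480, 50, 1880, 120),
                  "amount": (1400, 310, 1880, 390)},
}


def identify_bank(ocr_text, templates=TEMPLATES, cutoff=0.6):
    """Return the template name closest to any OCR line, or None if no
    line reaches the similarity cutoff (an assumed threshold)."""
    best_name, best_score = None, cutoff
    for line in ocr_text.upper().splitlines():
        line = " ".join(line.split())  # collapse repeated whitespace
        for name in templates:
            score = difflib.SequenceMatcher(None, line, name).ratio()
            if score >= best_score:
                best_name, best_score = name, score
    return best_name


def crop(image, box):
    """Crop a row-major image (a list of pixel rows) to the given box."""
    left, top, right, bottom = box
    return [row[left:right] for row in image[top:bottom]]
```

For example, `identify_bank("STATE BANK 0F INDIA\nPay ... or bearer")` selects the "STATE BANK OF INDIA" template even though OCR misread the letter O as a zero. With an image loaded through OpenCV, which represents images as NumPy arrays, the same crop would normally be written as the slice `image[top:bottom, left:right]`.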
OBSERVATIONS
A cheque amount processing system becomes commercially viable only when its error rate is very low. A cheque reader must therefore be able to refuse to give an answer when the probability of making a mistake is high. A "rejected" cheque can be read afterwards by a human, or handled by other, more advanced automated approaches. A cheque that is "read" incorrectly, however, is very difficult to deal with in terms of the cost and time involved in correcting the mistake.

CONCLUSION
Using OCR, CNN, SIFT, and SVM, we created a model to validate bank cheques. We applied a CNN to recognise the handwritten figures written on the cheque leaf, while we used OCR to detect the machine-printed characters accurately and efficiently. Because failed verification is one of the main causes of cheques bouncing and financial transactions being halted, we devised and implemented a technique to transform numbers into words. This was accomplished using an OCR approach to identify the machine-printed digits, with 97.7 percent exact matching. The network was then trained using a separate database, and after training it to the appropriate degree of accuracy, we tested the trained model by comparing numbers to words on multiple data sets. We improved upon the prior accuracy of 99.05 percent for the CNN employed for digit recognition by achieving an accuracy of 99.14 percent. Similarly, we achieved an accuracy of up to 99.94 percent for character recognition using the CNN. We employed SIFT features with an SVM classifier for signature verification and obtained 98.1 percent accuracy in recognising signatures. By reducing time and labour compared with the traditional manual technique, our invention will assist financial institutions in automating the process of cheque clearance. In this study, we automated the verification of bank cheques using a CNN, based on recognition of the handwritten numeric digits. To clear bank cheques effectively, however, we also need to validate them and to work online with a database that can answer the necessary queries in real time.

FUTURE WORK
In the future, the verification process can be combined with the validation of cheque leaves by creating a system that carries out, in sequence and in real time, all the steps necessary for the clearance of bank cheques. Such a system would use a customer-centric database holding the necessary information about each bank customer and would let the bank track all cheque transactions centrally, making cheque clearance fully automated. Our invention currently applies only to cheques written in English. Similar models could be developed for more languages in the future.

ACKNOWLEDGMENT
I would like to convey my heartfelt gratitude to Mr. Om Khade for his tremendous support and assistance in the completion of my project. I would also like to thank our Project Mentors, Mrs. Sunita Nandgave and Mrs. Gayatri Bedre, for providing me with this wonderful opportunity to work on the project Automated Financial Document Processing System. The completion of the project would not have been possible without their help and insights.
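The Observations and Conclusion sections describe two safeguards: refusing to answer when the recogniser is unsure, and transforming numbers into words so that the courtesy (numeric) amount can be compared with the legal (written) amount. The sketch below combines the two under stated assumptions. The function names, the 0.95 confidence threshold, the filler-word list, and the English wording without lakh or crore are all hypothetical choices made for illustration. They are not the implementation evaluated in this paper.

```python
ONES = ["zero", "one", "two", "three", "four", "five", "six", "seven",
        "eight", "nine", "ten", "eleven", "twelve", "thirteen", "fourteen",
        "fifteen", "sixteen", "seventeen", "eighteen", "nineteen"]
TENS = ["", "", "twenty", "thirty", "forty", "fifty", "sixty", "seventy",
        "eighty", "ninety"]


def number_to_words(n):
    """Spell out an integer 0 <= n < 1,000,000 in plain English words."""
    if not 0 <= n < 1_000_000:
        raise ValueError("supported range is 0 to 999,999")
    if n < 20:
        return ONES[n]
    if n < 100:
        tens, ones = divmod(n, 10)
        return TENS[tens] + ("" if ones == 0 else " " + ONES[ones])
    if n < 1000:
        hundreds, rest = divmod(n, 100)
        tail = "" if rest == 0 else " " + number_to_words(rest)
        return ONES[hundreds] + " hundred" + tail
    thousands, rest = divmod(n, 1000)
    tail = "" if rest == 0 else " " + number_to_words(rest)
    return number_to_words(thousands) + " thousand" + tail


def normalise(text):
    """Lower-case the legal amount and drop hyphens, commas, and filler words."""
    words = text.lower().replace("-", " ").replace(",", " ").split()
    return " ".join(w for w in words if w not in {"and", "only", "rupees"})


def check_amount(courtesy_digits, legal_text, confidence, min_confidence=0.95):
    """Return "accept" only when the recogniser is confident and both amounts
    agree; return "reject" otherwise so the cheque can be reviewed manually."""
    if confidence < min_confidence:
        return "reject"
    if not (courtesy_digits.isascii() and courtesy_digits.isdigit()):
        return "reject"
    value = int(courtesy_digits)
    if value >= 1_000_000:  # outside the range this sketch can spell out
        return "reject"
    expected = number_to_words(value)
    return "accept" if normalise(legal_text) == expected else "reject"
```

A call such as `check_amount("12345", "Twelve Thousand Three Hundred and Forty-Five Only", 0.99)` accepts the cheque, whereas the same call with a confidence of 0.5, or with a courtesy amount that disagrees with the written amount, is rejected rather than read incorrectly.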