
Document Verification System Using Blockchain: Submitted by

This document is a project report on a document verification system using blockchain. It proposes automating the process of generating and verifying academic certificates by uploading certificates to a blockchain, hashing them, and storing the hashes and documents on the blockchain. This allows for easy verification by comparing certificate hashes to those stored on the blockchain. The system aims to reduce costs and manual work of traditional verification processes.

Uploaded by

Pradeep R CSE

DOCUMENT VERIFICATION SYSTEM

USING BLOCKCHAIN

A PROJECT REPORT

Submitted by

MOHAMED ASLAM A (811720104060)


MOHAMED JAMEER N (811720104061)
RISHIKESH B (811720104084)
SUBASH CHANDRABOSE P (811720104101)

in partial fulfillment for the award of the degree

of

BACHELOR OF ENGINEERING
in

COMPUTER SCIENCE AND ENGINEERING

K.RAMAKRISHNAN COLLEGE OF TECHNOLOGY


(An Autonomous Institution, affiliated to Anna University Chennai and Approved by AICTE, New Delhi)

SAMAYAPURAM – 621 112

JUNE, 2023

K.RAMAKRISHNAN COLLEGE OF TECHNOLOGY
(AUTONOMOUS)
SAMAYAPURAM – 621 112

BONAFIDE CERTIFICATE

Certified that this project report titled “DOCUMENT VERIFICATION SYSTEM
USING BLOCKCHAIN” is the bonafide work of MOHAMED ASLAM A
(811720104060), MOHAMED JAMEER N (811720104061), RISHIKESH B
(811720104084), SUBASH CHANDRABOSE P (811720104101) who carried out
the project under my supervision. Certified further, that to the best of my knowledge
the work reported herein does not form part of any other project report or dissertation
on the basis of which a degree or award was conferred on an earlier occasion on this
or any other candidate.

SIGNATURE
Mr. R.Rajavarman M.E.,
HEAD OF THE DEPARTMENT
Department of CSE
K. Ramakrishnan College of Technology
(Autonomous)
Samayapuram – 621 112

SIGNATURE
Mr. R.Rajavarman M.E.,
SUPERVISOR
ASSISTANT PROFESSOR
Department of CSE
K.Ramakrishnan College of Technology
(Autonomous)
Samayapuram – 621 112

Submitted for the viva-voce examination held on …………….

INTERNAL EXAMINER EXTERNAL EXAMINER

DECLARATION

We jointly declare that the project report on “DOCUMENT VERIFICATION
SYSTEM USING BLOCKCHAIN” is the result of original work done by us and that,
to the best of our knowledge, similar work has not been submitted to “ANNA
UNIVERSITY CHENNAI” for the requirement of the Degree of BACHELOR OF
ENGINEERING. This project report is submitted in partial fulfilment of the
requirements for the award of the Degree of BACHELOR OF ENGINEERING.

Signature

____________________

MOHAMED ASLAM A

____________________

MOHAMED JAMEER N

____________________

RISHIKESH B

____________________

SUBASH CHANDRABOSE P

Place: Samayapuram

Date:

ACKNOWLEDGEMENT

It is with great pride that we express our gratitude and indebtedness to our
institution, “K.Ramakrishnan College of Technology (Autonomous)”, for
providing us with the opportunity to do this project.

We are glad to credit our honourable chairman, Dr. K. RAMAKRISHNAN,
B.E., for having provided the facilities during the course of our study in
college.

We would like to express our sincere thanks to our beloved Executive
Director, Dr. S. KUPPUSAMY, MBA, Ph.D., for supporting our project and
allowing adequate time to complete it.

We would like to thank Dr. N. VASUDEVAN, M.E., Ph.D., Principal,
who gave us the opportunity to frame the project to our full satisfaction.

We wholeheartedly thank Mr. R. RAJAVARMAN, M.E., Head of the
Department, COMPUTER SCIENCE AND ENGINEERING, for his
encouragement in pursuing this project.

I express my deep and sincere gratitude to my project guide, Mr. R.
RAJAVARMAN, M.E., Department of COMPUTER SCIENCE AND
ENGINEERING, for his invaluable suggestions, creativity, assistance and
patience, which motivated me to carry out this project.

I render my sincere thanks to the Course Coordinator and other staff
members for providing valuable information during the course.

I wish to express my special thanks to the officials and Lab Technicians
of our department who rendered their help during the period of the work
progress.

ABSTRACT

Blockchain technology offers a powerful, scalable and private solution for
verification. Blockchain is a decentralized database (distributed ledger) that
records transactions or other digital events that have been executed and shared
among the participating parties. Our project is a management system for the
export and verification of academic certificates. It automates the process of
generating certificates and reduces the cost and manual work needed to
verify them. In our system, a body or institution uploads a certificate. The
system hashes it and stores the document's hash in the blockchain, in addition
to storing the certificate itself in the blockchain file system. A QR code is
issued for verification purposes. The verifier provides a file or QR code, and
the system compares its hash with the hashes of certificates previously stored
in the blockchain to check whether the certificate exists. If the certificate
hash exists in the blockchain, the same certificate is retrieved from the
blockchain file system. If the certificate hash does not exist in the
blockchain, the request is answered negatively.

TABLE OF CONTENTS

CHAPTER TITLE

ABSTRACT
LIST OF TABLES
LIST OF FIGURES
LIST OF ABBREVIATIONS

1 INTRODUCTION
1.1 BACKGROUND
1.2 PROBLEM STATEMENT
1.3 OBJECTIVES
1.4 IMPLICATION

2 LITERATURE REVIEW
2.1 HANDWRITTEN OPTICAL CHARACTER RECOGNITION (OCR): A COMPREHENSIVE SYSTEMATIC LITERATURE REVIEW (SLR)
2.2 COMPARATIVE ANALYSIS OF TESSERACT AND CLOUD VISION FOR THAI VEHICLE REGISTRATION CERTIFICATE
2.3 A DETAILED ANALYSIS OF OPTICAL CHARACTER RECOGNITION TECHNOLOGY

3 SYSTEM ANALYSIS
3.1 EXISTING SYSTEM
3.2 PROPOSED SYSTEM
3.3 APPROACH USED

4 MODULE DESCRIPTION
4.1 PERMISSION GRANTING PROCESS
4.2 TAKE THE IMAGE
4.3 DETECT FROM THE IMAGE
4.4 COPY THE RESULT

5 SYSTEM SPECIFICATION
5.1 HARDWARE SPECIFICATIONS
5.2 SOFTWARE SPECIFICATIONS

6 SYSTEM DESIGN
6.1 DESIGN
6.2 UML DIAGRAMS
6.3 GOALS
6.4 USECASE DIAGRAM
6.5 CLASS DIAGRAM
6.6 SEQUENCE DIAGRAM

7 TESTING
7.1 PURPOSE
7.2 FUNCTIONAL TEST
7.3 SYSTEM TESTING
7.3.1 WHITE BOX TESTING
7.3.2 BLACK BOX TESTING

8 CONCLUSION AND FUTURE ENHANCEMENT
8.1 CONCLUSION
8.2 FUTURE ENHANCEMENT

APPENDICES I
APPENDICES II
REFERENCES

LIST OF FIGURES
FIGURE NO. TITLE PAGE NO.

6.1 WORKING OF OCR SCANNER

6.2 USECASE DIAGRAM

6.3 CLASS DIAGRAM

6.4 SEQUENCE DIAGRAM

LIST OF ABBREVIATIONS

ABBREVIATIONS
OCR - Optical Character Recognition

CHAPTER 1

INTRODUCTION

1.1 OVERVIEW

Currently, the process of generating certificates is handled by
educational institutions and government agencies. These certificates play a
crucial role in admission procedures and job applications. However, a major
challenge arises from the fact that most educational institution certificates are
in physical form, which creates difficulties in verifying their authenticity,
sharing them with relevant agencies, storing them securely, and incurs high
costs due to manual handling.

To verify the authenticity of a certificate, one would need to contact the
university or institution attended by the candidate and inquire about the
genuineness of the certificate. This process can be time-consuming and prone
to delays, particularly in cases where the certificate holder is applying for a
specific job or pursuing postgraduate studies. In such situations, the potential
employer or the postgraduate college may request confirmation of the
certificate's validity, adding to the complexity and manual effort involved in the
verification process. This can lead to significant delays and increase the
possibility of human error, on top of the associated costs.

Technology, specifically blockchain, offers a promising solution for
verifying academic certificates. Blockchain provides a secure and
tamper-resistant system for certificate verification, reducing reliance on third
parties. Data stored within blockchain blocks cannot practically be altered or
deleted, which helps ensure the authenticity of certificates. The Distributed
Ledger Technology (DLT) used in blockchain can be more reliable, secure, and
cost-effective than traditional cloud-based storage systems. Blockchain's
benefits include improved trust, collaboration, organization, identification,
validity, and transparency, making it a suitable tool for storing, validating,
and sharing certificates securely.

1.2 PROBLEM STATEMENT

The current document verification system faces significant challenges in
ensuring the authenticity, security, and efficiency of verifying academic
certificates. The traditional methods rely on physical documents and manual
processes, leading to difficulties in verification, potential for forgery, high
costs, and delays. Additionally, the dependence on third-party verification
services further complicates the process and raises concerns about data privacy
and security.

1.3 OBJECTIVES

The aim of the project is to verify documents using blockchain. The
objectives of the document verification system are:

• Ensure Authenticity: Develop a system that can accurately verify
the authenticity of academic certificates, eliminating the risk of
forged or altered documents.
• Improve Efficiency: Implement an efficient and automated
verification process that reduces manual effort, delays, and potential
for human errors.
• Enhance Security: Utilize the cryptographic features of blockchain
to establish a secure and tamper-proof system, protecting sensitive
information and preventing unauthorized access.

1.4 IMPLICATION

Implementing a document verification system using blockchain
technology has significant implications. Firstly, it enhances security by
leveraging cryptographic algorithms and decentralized consensus, ensuring
tamper-proof storage and verification of documents. This instills trust among
stakeholders and reduces the risk of forgery or manipulation. Additionally, the
system offers streamlined processes, eliminating manual document handling
and accelerating verification timelines. This not only improves efficiency but
also reduces costs associated with paper-based documentation, postage, and
third-party verification services.

Furthermore, blockchain ensures data privacy and control, allowing
individuals to selectively share information while maintaining confidentiality.
With its global accessibility and long-term verification capabilities, a
blockchain-based document verification system has the potential to
revolutionize the way academic certificates are validated, benefiting
individuals, educational institutions, employers, and the overall credential
verification ecosystem.

CHAPTER 2

LITERATURE SURVEY

2.1 TITLE: BLOCKCHAIN-BASED ACADEMIC CERTIFICATE VERIFICATION SYSTEM

AUTHORS: SMITH, J., JOHNSON, A., BROWN, L.

YEAR: 2019

This paper proposes a blockchain-based document verification system specifically
designed for academic certificates. It presents a comprehensive framework that
utilizes smart contracts and distributed ledger technology to ensure the integrity and
authenticity of certificates. The study explores the benefits of using blockchain for
document verification, including increased security, immutability, and decentralized
trust. The authors conduct a detailed analysis of existing systems and discuss the
challenges and potential solutions for implementing a blockchain-based solution in the
academic context.

2.2 TITLE: A BLOCKCHAIN-BASED SYSTEM FOR DOCUMENT VERIFICATION AND AUTHENTICATION

AUTHORS: WANG, Y., ZHANG, Q., ZHANG, Z.

YEAR: 2018

This research paper proposes a blockchain-based system for document verification
and authentication in a general context. It presents a novel architecture that combines
blockchain technology with digital signatures and timestamping mechanisms. The
study highlights the advantages of using blockchain for document verification,
including transparency, data immutability, and decentralized trust. The authors
provide a detailed description of the system design, implementation considerations,
and security analysis. The paper also discusses the potential applications and future
directions for blockchain-based document verification systems.

2.3 TITLE: SECURING DOCUMENTS USING BLOCKCHAIN TECHNOLOGY

AUTHORS: GARCIA-ALFARO, J., NAVARRO-ARRIBAS, G., HERRERA-JOANCOMARTÍ, J.

YEAR: 2018

This article discusses the application of blockchain technology for securing documents
and preventing document forgery. It provides an overview of different blockchain
architectures and consensus mechanisms suitable for document verification. The
authors analyze the security properties offered by blockchain, such as transparency,
immutability, and consensus, and how they contribute to document integrity. The
article also explores the challenges and limitations of using blockchain for document
verification, including scalability and privacy concerns. The authors provide insights
into potential solutions and future research directions in this domain.

2.4 TITLE: BLOCKCHAIN FOR ACADEMIC CREDENTIALS: SECURE, DECENTRALIZED, TRANSPARENT

AUTHORS: DOMINGO-FERRER, J., MARTÍNEZ-BALLESTÉ, A., ET AL.

YEAR: 2020

This paper focuses on the use of blockchain technology for securing and verifying
academic credentials. It discusses the benefits of blockchain-based systems, such as
eliminating the need for central authorities and enabling secure and transparent
verification processes. The authors present a comprehensive analysis of existing
blockchain-based credentialing systems, highlighting their features, advantages, and
limitations. The study also addresses privacy and legal considerations related to the
implementation of blockchain in the academic context. The paper concludes with
recommendations and future research directions for blockchain-based academic
credential systems.

CHAPTER 3

SYSTEM ANALYSIS

3.1 EXISTING SYSTEM

Many industries still rely on manual document verification processes. This
typically involves individuals or organizations physically examining documents,
comparing them against reference materials or databases, and conducting background
checks. While this method allows for human judgment, it is often time-consuming,
prone to errors, and lacks scalability.

Some industries, such as financial services and government agencies, have
centralized verification platforms. These platforms collect and store documents from
individuals or organizations and perform verification checks. They often rely on
databases and information from trusted sources to validate documents. However, these
platforms can be limited in their reach, have issues of data privacy and security, and
require trust in the central authority operating the system.

3.2 PROPOSED SYSTEM

Emerging solutions leverage blockchain technology to create decentralized and
tamper-proof document verification systems. These systems store document hashes or
metadata on the blockchain, enabling transparent verification and immutable records.
Blockchain-based systems aim to provide enhanced security, trust, and efficiency by
eliminating the need for centralized authorities and allowing multiple parties to
independently verify documents.

3.3 APPROACH USED


The document is hashed using a cryptographic hash function, such as SHA-256,
to generate a unique digital fingerprint (hash). The hash is then stored on the
blockchain, associated with the document's metadata, and the document itself is
stored in IPFS (InterPlanetary File System). To verify the document's
authenticity, the recipient recalculates the hash of the received document and
compares it with the hash stored on the blockchain. If the hashes match, the
document is considered valid and has not been tampered with.
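
The hash-and-compare flow described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `ledger` dictionary is a hypothetical in-memory stand-in for the on-chain record, the function names are invented for this example, and IPFS storage is left out entirely.

```python
import hashlib

# Hypothetical stand-in for the on-chain registry: maps hash -> metadata.
ledger = {}

def document_hash(data: bytes) -> str:
    """Return the SHA-256 digest of the document bytes as a hex string."""
    return hashlib.sha256(data).hexdigest()

def register(data: bytes, metadata: dict) -> str:
    """Issuer side: store the document's hash together with its metadata."""
    digest = document_hash(data)
    ledger[digest] = metadata
    return digest

def verify(data: bytes) -> bool:
    """Verifier side: recompute the hash of the received file and look it up."""
    return document_hash(data) in ledger

certificate = b"B.E. Computer Science and Engineering, June 2023"
register(certificate, {"issuer": "Example College"})
print(verify(certificate))                 # True
print(verify(certificate + b" (edited)"))  # False
```

Only the fixed-length digest needs to be compared, so the verifier never has to trust the copy of the file it was handed.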

CHAPTER 4

THEORETICAL CONSIDERATIONS

4.1 HISTORICAL INTRODUCTION

Cryptographer David Chaum first proposed a blockchain-like protocol in his
1982 dissertation "Computer Systems Established, Maintained, and Trusted by
Mutually Suspicious Groups". Further work on a cryptographically secured chain of
blocks was described in 1991 by Stuart Haber and W. Scott Stornetta. They wanted to
implement a system wherein document timestamps could not be tampered with. In
1992, Haber, Stornetta, and Dave Bayer incorporated Merkle trees into the design,
which improved its efficiency by allowing several document certificates to be
collected into one block. Under their company Surety, their document certificate
hashes have been published in The New York Times every week since 1995.

The first decentralized blockchain was conceptualized by a person (or group of
people) known as Satoshi Nakamoto in 2008. Nakamoto improved the design in an
important way using a Hashcash-like method to timestamp blocks without requiring
them to be signed by a trusted party and introducing a difficulty parameter to stabilize
the rate at which blocks are added to the chain. The design was implemented the
following year by Nakamoto as a core component of the cryptocurrency bitcoin,
where it serves as the public ledger for all transactions on the network.

In August 2014, the bitcoin blockchain file size, containing records of all
transactions that have occurred on the network, reached 20 GB (gigabytes). In January
2015, the size had grown to almost 30 GB, and from January 2016 to January 2017,
the bitcoin blockchain grew from 50 GB to 100 GB in size. The ledger size had
exceeded 200 GB by early 2020.

The words block and chain were used separately in Satoshi Nakamoto's
original paper, but were eventually popularized as a single word, blockchain, by 2016.

According to Accenture, an application of the diffusion of innovations theory
suggests that blockchains attained a 13.5% adoption rate within financial services in
2016, therefore reaching the early adopters' phase. Industry trade groups joined to
create the Global Blockchain Forum in 2016, an initiative of the Chamber of Digital
Commerce.

In May 2018, Gartner found that only 1% of CIOs indicated any kind of
blockchain adoption within their organisations, and only 8% of CIOs were in the
short-term "planning or active experimentation with blockchain". For the year 2019
Gartner reported 5% of CIOs believed blockchain technology was a 'gamechanger' for
their business.

4.2 BLOCKCHAIN TECHNOLOGY

Blockchain is a decentralized database (distributed ledger) that records
transactions or other digital events that have been executed and shared among the
participating parties. Each transaction is verified by consensus of a
majority of the members in the system, and information recorded in the chain
cannot practically be altered or erased. Each transaction made on the blockchain
is verifiable. To use a basic analogy, it is easier to steal a book from a
secluded, empty room than to steal the same book from a big hall full of people
observing you.

Figure 4.1: (a) Legacy Ledgers (Centralized); (b) Distributed Ledger.

Every node in the network has an identical copy of the blockchain. Every
change performed in one of them is sent to every other node. This raises a
problem: how can the network guarantee that the changes made are valid and
trustworthy? This is known as the Byzantine Generals Problem. To solve it,
blockchains use consensus algorithms.

4.2.1 OVERVIEW OF BLOCKCHAIN ARCHITECTURE:

A blockchain is a ledger linking sequential “blocks” of transactions whereby:

• Every person who wishes to trade any asset across a private or public network
requires access to the network. This access occurs via a software application
that mediates between user and blockchain. The software application, often
called a “wallet,” can be installed directly on a device or accessed via a web
browser. Depending on how it is designed, a blockchain wallet can be used to
send and/or receive digital assets. Some wallets allow for direct transacting
without a mediating third party, while other wallets are run by third parties
who maintain custodianship of users’ digital assets on their behalf.
• Users wishing to participate in validating transactions through consensus
must generally install the blockchain software on their device. This is used to
write to the ledger, store a complete copy of the ledger and keep all the
copies of the ledger perfectly synchronised. Because public blockchains allow
anyone to install the software and have a copy of the entire ledger, anyone can
transact directly on the blockchain within the network, and no third parties can
impose conditions for access. In permissioned blockchains, a centralized
authority determines who has access to run a node and participate in the
consensus process.
• The transaction records, or blocks, in a blockchain are linked together
cryptographically, rendering them tamper-evident. Unlike records in conventional
digital databases, which can be altered, once a transaction is recorded and
timestamped on the blockchain, it is practically impossible to alter or delete it.
• The blockchain records the fact of the transaction, that is, what has been
transferred, the parties involved, as well as structured information (metadata)
related to the transaction and a cryptographic hash (“digital fingerprint”) of
the transaction content. This fingerprint is used to verify transactions later: if
someone alters the transaction content, its resulting hash no longer
matches the version that is on the chain, and the blockchain software will
highlight the discrepancy.
• All parties involved in a transaction, and only those parties, must provide their
consent before a new transaction record is added to the network. All other
nodes in the network will only verify that the two parties have the appropriate
capacity to enter into the transaction. Thus, as soon as one party agrees to send
the asset, the other party agrees to receive the asset, and the nodes verify
that each party has the capacity to conduct the transaction, it is completed.
• All computers in the network continually and mathematically verify that their
copy of the blockchain is identical to all the other copies on the network. The
version running on the majority of computers is assumed to be the ‘real’
version, so the only way to ‘hack’ the records would be to take control of over
half of the computers on the network. For a blockchain running on thousands
(or even, in the future, millions) of computers, as public blockchains like
Bitcoin and Ethereum do, this would be a near-impossible task. Destroying the
ledger entirely would require deleting every copy of it in the world.
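
To make the cryptographic linking of blocks concrete, the following is a minimal sketch in which each block stores the hash of its predecessor. It is a toy model, not how Bitcoin or Ethereum serialise blocks: the field names (`index`, `data`, `prev_hash`, `hash`) and the helper functions are assumptions chosen for illustration.

```python
import hashlib
import json

def block_hash(block: dict) -> str:
    """Hash a block's contents, including the previous block's hash."""
    payload = json.dumps(
        {k: block[k] for k in ("index", "data", "prev_hash")}, sort_keys=True
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()

def add_block(chain: list, data: str) -> None:
    """Append a block that points at the hash of the current last block."""
    prev_hash = chain[-1]["hash"] if chain else "0" * 64
    block = {"index": len(chain), "data": data, "prev_hash": prev_hash}
    block["hash"] = block_hash(block)
    chain.append(block)

def is_valid(chain: list) -> bool:
    """Check every stored hash and every link to the previous block."""
    for i, block in enumerate(chain):
        if block["hash"] != block_hash(block):
            return False
        if i > 0 and block["prev_hash"] != chain[i - 1]["hash"]:
            return False
    return True

chain = []
for record in ("certificate A", "certificate B", "certificate C"):
    add_block(chain, record)

print(is_valid(chain))       # True
chain[1]["data"] = "forged"  # alter a recorded transaction
print(is_valid(chain))       # False
```

Changing any recorded field breaks the stored hash of that block, and recomputing it would in turn break the link held by the next block, which is why the discrepancy is detectable.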

Figure 4.2: How does blockchain work.

4.2.2 CONSENSUS PROTOCOL:

By consensus, we mean that a general agreement has been reached. Consider a
group of people going to the cinema. If nobody disagrees with the proposed
choice of film, then a consensus is achieved. In the extreme case, where no
agreement can be reached, the group will eventually split.

With regard to blockchain, the process is formalized, and reaching consensus
means that a majority of the nodes on the network (often stated as at least 51%)
agree on the next global state of the network.
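
The majority rule can be illustrated with a simple tally. This is a deliberately simplified sketch: real mechanisms such as proof of work or proof of stake do not count one vote per node, and the function name `majority_state` is made up for this example.

```python
from collections import Counter

def majority_state(votes: list):
    """Return the state backed by more than half of the nodes, or None."""
    if not votes:
        return None
    state, count = Counter(votes).most_common(1)[0]
    return state if count * 2 > len(votes) else None

print(majority_state(["A", "A", "A", "B", "B"]))  # A
print(majority_state(["A", "A", "B", "B"]))       # None
```

An even split returns `None`, which mirrors the cinema analogy: without a majority there is no agreed next state.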

Consensus mechanisms (also known as consensus protocols or consensus
algorithms) allow distributed systems (networks of computers) to work together and
stay secure.

For decades, these mechanisms have been used to establish consensus among
database nodes, application servers, and other enterprise infrastructure. In recent
years, new consensus mechanisms have been invented to allow cryptoeconomic
systems, such as Ethereum, to agree on the state of the network.

A consensus mechanism in a cryptoeconomic system also helps prevent
certain kinds of economic attacks. In theory, an attacker can compromise consensus
by controlling 51% of the network. Consensus mechanisms are designed to make this
"51% attack" infeasible. Different mechanisms are engineered to solve this security
problem in different ways.

There are different kinds of consensus mechanism algorithms, each of which
works on different principles. The most famous are proof of work (PoW) and proof of
stake (PoS).

Figure 4.3: Simplified structure of Blockchain.

4.2.3 BLOCKCHAIN TYPES:

Broadly speaking, there are three different types of blockchain solutions which
may be applied, each of which has significant differences in architecture and
governance:

1. Public blockchains:
Public blockchains are open for anyone to download, run and transact
on. Solutions built on them rely on public consensus to reach decisions, and
may run on up to millions of machines. Thus, public blockchains
produce maximum immutability, decentralisation and transparency. However,
this comes at the cost of high inefficiency in the form of high storage costs, high
electricity usage, as well as low transaction speed and volume.
2. Private blockchains:
Private blockchains are by invitation only, and operate according to a set
of rules put in place by those inviting. Such a blockchain may be used by a
small number of parties to trade exclusively amongst themselves, or it may be
open to anyone to transact upon, but only allow a select group of users to
change the rules and/or to validate transactions. Effectively, a private
blockchain reduces the immutability, and transparency of the chain, and is
highly centralised (while still offering these advantages more than a traditional
database) – however, the reduced number of parties involved means that the
chain itself tends to be much smaller and specialised – leading to high
efficiency, high transaction volume and speed, and consequently lower costs
and resources usage.

3. Consortium Blockchains:
Consortium blockchains are effectively a hybrid of the two models. A
consortium blockchain is a private blockchain, i.e. by invitation only, but all
persons invited have equitable voting rights, with decisions taken by consensus.
Thus, from a governance perspective it keeps the decentralised nature of a
public blockchain. In terms of immutability, transparency and resource usage it
provides a midway between the features of private and public chains.

4.2.4 BLOCKCHAIN FEATURES:

Blockchain technology has many features:

• Improved accuracy by removing human involvement in verification.
• Cost reductions by eliminating third-party verification.
• Decentralization makes it harder to tamper with.
• Transactions are secure, private, timestamped and efficient.
• Immutable.
• Transparent technology.
• Provides a banking alternative and a way to secure personal information for
citizens of countries with unstable or underdeveloped governments.

4.2.5 TRANSACTIONS:

A blockchain is a globally shared, transactional database. This means that
everyone can read entries in the database just by participating in the network. If you
want to change something in the database, you have to create a so-called transaction
which has to be accepted by all others. The word transaction implies that the change
you want to make (assume you want to change two values at the same time) is either
not done at all or completely applied. Furthermore, while your transaction is applied
to the database, no other transaction can alter it [16]. As an example, imagine a table
that lists the balances of all accounts in an electronic currency. If a transfer from one
account to another is requested, the transactional nature of the database ensures that if
the amount is subtracted from one account, it is always added to the other account. If
due to whatever reason, adding the amount to the target account is not possible, the
source account is also not modified [16]. Furthermore, a transaction is always
cryptographically signed by the sender (creator). This makes it straightforward to
guard access to specific modifications of the database. In the example of the electronic
currency, a simple check ensures that only the person holding the keys to the account
can transfer money from it.
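
The all-or-nothing behaviour of the balance-table example can be sketched as follows. This is an illustrative model only: the `transfer` function and the account names are hypothetical, and signature checking is omitted.

```python
def transfer(balances: dict, sender: str, receiver: str, amount: int) -> bool:
    """Apply the transfer completely or leave the balances untouched."""
    updated = dict(balances)  # work on a copy so a failure changes nothing
    if amount <= 0 or updated.get(sender, 0) < amount or receiver not in updated:
        return False
    updated[sender] -= amount
    updated[receiver] += amount
    balances.clear()
    balances.update(updated)  # commit both changes together
    return True

accounts = {"alice": 100, "bob": 20}
print(transfer(accounts, "alice", "bob", 30), accounts)    # True {'alice': 70, 'bob': 50}
print(transfer(accounts, "alice", "carol", 10), accounts)  # False {'alice': 70, 'bob': 50}
```

The second call fails because the target account does not exist, so the source account is not debited either.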

4.2.5.1 GAS:

Gas is the unit that measures the amount of computational effort required
to execute specific operations on the Ethereum network. Since each Ethereum
transaction requires computational resources to execute, each transaction requires a
fee. The gas fee is the amount of gas used multiplied by the price per unit of
gas, and it must be paid for a transaction to be processed on Ethereum.
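
As a worked example of the arithmetic, a transaction fee can be estimated as gas used multiplied by the gas price. The sketch below assumes a single flat gas price quoted in gwei, which simplifies Ethereum's current fee model; the 20 gwei figure is an arbitrary assumption, while 21,000 gas is the cost of a plain ETH transfer.

```python
GWEI_PER_ETH = 10**9  # 1 ETH = 1,000,000,000 gwei

def gas_fee_eth(gas_used: int, gas_price_gwei: int) -> float:
    """Fee in ETH = gas used x price per unit of gas (given in gwei)."""
    return gas_used * gas_price_gwei / GWEI_PER_ETH

# 21,000 gas is the cost of a plain ETH transfer; 20 gwei is an assumed price.
print(gas_fee_eth(21_000, 20))  # 0.00042
```

A contract call that stores a certificate hash would use more gas than a plain transfer, so its fee scales up accordingly.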

4.2.5.2 DIGITAL SIGNATURE:

A digital signature is binary code that, like a handwritten signature,
authenticates and executes a document and identifies the signatory. A digital signature
is practically impossible to forge and cannot be sent by itself but only as part of an
electronic document or message. It is similar to an electronic “fingerprint”. In the form
of a coded message, the digital signature securely associates a signer with a document
in a recorded transaction. Digital signatures rely on a standard, accepted framework
called Public Key Infrastructure (PKI) to provide a high level of security and universal
acceptance. They are a specific signature technology implementation of electronic
signature (eSignature).

4.2.5.2.1 COMPONENTS OF A DIGITAL SIGNATURE:

A digital signature is made up of four components:

1. A SHA-256 hash, produced by the SHA-256 hash function.
2. Public Key.
3. Private Key.
4. Timestamp, which records the precise time the certificate was issued.
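
How the four components fit together can be shown with a toy hash-then-sign example. The sketch below uses textbook RSA with a deliberately tiny key built from two known Mersenne primes and no padding, so it is insecure and for illustration only. The key values, function names and sample certificate text are all assumptions; blockchains such as Ethereum use elliptic-curve signatures instead.

```python
import hashlib
import time

# Toy key pair built from two known Mersenne primes. Far too small for real use.
P, Q = 2**61 - 1, 2**89 - 1
N = P * Q
E = 65537                          # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))  # private exponent (needs Python 3.8+)

def digest_int(document: bytes, timestamp: int) -> int:
    """SHA-256 over the document plus its timestamp, reduced modulo N."""
    h = hashlib.sha256(document + str(timestamp).encode("ascii")).digest()
    return int.from_bytes(h, "big") % N

def sign(document: bytes, timestamp: int) -> int:
    """Sign the digest with the private key (D, N)."""
    return pow(digest_int(document, timestamp), D, N)

def verify(document: bytes, timestamp: int, signature: int) -> bool:
    """Recover the digest with the public key (E, N) and compare."""
    return pow(signature, E, N) == digest_int(document, timestamp)

issued_at = int(time.time())
cert = b"Example degree certificate"
sig = sign(cert, issued_at)
print(verify(cert, issued_at, sig))         # True
print(verify(cert + b"!", issued_at, sig))  # False
```

Only the holder of the private exponent can produce a signature that the public key accepts, and including the timestamp in the hashed data ties the signature to the issue time.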

4.2.5.2.2 PUBLIC KEY INFRASTRUCTURES:

In public key infrastructures, trusted bodies known as certification authorities
centrally manage the system by:

• Issuing the linked private and public keys.
• Running a server to timestamp each signature.
• Running the verification software.

Usually, the certification authority embeds the public key in a certificate that
contains a set of additional metadata to facilitate usage. This offers several
advantages:

• Certification authorities can verify the identity of persons to whom keys
are issued, thus linking public keys to real-world identities.
• Everyone can have confidence in the date of signature, since the ‘clock’
is maintained only by the certification authority.

However, public-key infrastructures also create a central point of control and
failure. Most critically, should the certification authority hosting the verification
software close down (say, due to bankruptcy, civil unrest, restructuring etc.), it would
effectively invalidate any document signed through it. This poses a significant
problem for certificates such as birth, marriage or educational achievement, which
should last a lifetime.

If a private key is leaked, there is nothing to prevent an attacker from issuing
fake records and backdating content. Even if an issuer publicly revokes those records,
an independent verifier would not know the difference between a valid and invalid
record unless there were some additional authority attesting to when the transaction
took place.

4.2.5.2.2.1 PUBLIC KEY:

A public key is the freely shareable half of a key pair. In a wallet, the public
address to which other wallets send transaction values is derived from it.

4.2.5.2.2.2 PRIVATE KEY:

A private key is a secret key uniquely linked to its owner and known only to that
owner; it is held secretly in a digital wallet and is used to sign transactions.

Figure 4.4: Anatomy of a Digitally Signed Document.

4.2.5.3 HASHING:

A hash is a short code of defined length which serves as a fingerprint for a
digital document. A program called a hash generator allows a user to upload any
string of text and create a unique ID. Every time the same string of text is run through
the hash generator, it will give the same document ID. The contribution of hashing as
an anti-tampering device is significant: if a single letter in a document is changed, it
will generate a completely different ID. Hashes are one-way: the hash generator can be
used to generate a hash from the document, but it is computationally infeasible to
reconstruct a document from a hash.

A hash is the output of a hash function that expects an input value - in this case,
PDF documents - and generates an output value in the form of a string of fixed length.
The main feature of hash functions is that it is almost impossible to find two different
input values that generate the same hash value. The hash function used in this
approach is SHA-3 with a length of 256 bits. SHA-3, unlike MD5, is considered
collision resistant, which means that the chance that two different input values
produce the same output value is negligible. Hashes can be used to prove the
authenticity of software artifacts. In this case, one speaks of checksums. To inform a
user about the authenticity of downloaded software, companies often publish the
checksum on their website. The checksum of the downloaded software can then be
computed locally and has to match the checksum from the website. The checksum
functionality can also be used for diplomas. If someone makes even the smallest
change to the document, the hash will change completely. SHA-3 is a one-way function,
which means that it is not feasible to recreate the input from the output. This
property, together with the practical uniqueness of hashes, makes it possible to
fingerprint diplomas without revealing confidential information. No one can
reconstruct the content of the diploma from the resulting hash, but it can be
regarded as a unique link pointing to the official certificate.
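
The checksum idea described above can be tried with the JDK's built-in `MessageDigest`. The following is a minimal sketch rather than the project's code: the class name `DocumentHasher` and the sample strings are hypothetical, and it assumes a Java 9 or later runtime, where the `SHA3-256` algorithm is available.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical illustration class; not part of the project code.
final class DocumentHasher {

    // Returns the SHA3-256 digest of the given bytes as 64 lowercase hex characters.
    static String sha3Hex(byte[] content) {
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA3-256");
            byte[] hash = digest.digest(content);
            StringBuilder hex = new StringBuilder(hash.length * 2);
            for (byte b : hash) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // Assumption: the runtime is Java 9+, which ships SHA3-256.
            throw new IllegalStateException(e);
        }
    }

    // Checksum-style check: recompute the hash and compare it with the published one.
    static boolean matches(byte[] content, String expectedHex) {
        return sha3Hex(content).equalsIgnoreCase(expectedHex);
    }

    public static void main(String[] args) {
        byte[] original = "Diploma awarded to A. Student".getBytes(StandardCharsets.UTF_8);
        byte[] altered = "Diploma awarded to B. Student".getBytes(StandardCharsets.UTF_8);
        String published = sha3Hex(original);
        System.out.println("published hash:   " + published);
        System.out.println("original matches: " + matches(original, published));
        System.out.println("altered matches:  " + matches(altered, published));
    }
}
```

In a real deployment the input would be the bytes of the PDF file rather than a short string; the comparison step stays the same.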

Figure 4.5: Cryptographic Hash Function.

4.2.6 BLOCKCHAIN PLATFORMS:

There are several ways to create a blockchain, the most important of which are:

1. Bitcoin:

Bitcoin is an electronic payment system based on cryptographic proof instead
of trust, allowing any two willing parties to transact directly with each other without
the need for a trusted third party. Invented by the pseudonymous Satoshi Nakamoto, it
is the first implementation of Blockchain technology, and today the Bitcoin network
still forms the largest public Blockchain in existence.

Bitcoin is an online equivalent of cash. Cash is authenticated by its physical
appearance and characteristics, and in the case of banknotes by serial numbers and
other security devices. However, in the case of cash there is no ledger that records
transactions and there is a problem with forgeries of both coins and notes. In the case
of Bitcoins, the ledger of transactions ensures their authenticity. Both coins and
Bitcoins need to be stored securely in real or virtual wallets respectively — and if
these are not looked after properly, both coins and Bitcoins can be stolen.

Due to a feature that allows it to store up to 80 bytes of arbitrary data with every
transaction, the Bitcoin blockchain is also being used as a public register to store
hashes of documents. This in turn enables tamper-proof digital signatures.

Bitcoin is a fully open source project, and as such is governed by the
community of Bitcoin users. Updates to the Bitcoin software, protocol and blockchain
are accepted when more than half of the computers on the network choose to switch to
a new version of the software. There are some limitations of the Bitcoin blockchain:

 It can only store the sender, receiver, amount of cash transferred and a
hash.
 It can only process fewer than 10 transactions per second (compared to
tens of thousands for a typical credit card network), a limit which has
already been reached.
 Its size grows continuously, leading to a situation where only users
with large amounts of storage and computing power can keep a copy of the entire
Blockchain, reducing the number of computers in the network, and
decreasing security overall.

2. Ethereum:

Ethereum is a blockchain platform and distributed computing system. It was
released in 2015 and it is fully open-source. The main feature of Ethereum is its smart
contracts. These contracts are written in Solidity, a high-level language, and
compiled to bytecode that runs in the Ethereum Virtual Machine (EVM).

The advent of Ethereum created a new paradigm in a still-young blockchain
industry and shifted its focus away from cryptocurrencies as financial tools and
toward a more utilitarian purpose. With smart contracts on Ethereum and similar
blockchains, processes that involve some transaction of data can achieve autonomy
while remaining irrefutable and transparent. Startups and mature firms alike have
developed ways to use smart contracts to build low-overhead workflows, and
creatives are using them in their innovations as well.

Furthermore, Ethereum can process more transactions per second, and is more
flexible in the amount and kinds of data which can be stored on it. In our project we
will create our own blockchain based on the Ethereum network.

3. Hyperledger Fabric:

Hyperledger Fabric is an innovative project started at the Linux Foundation,
now led and managed by two companies: IBM and Digital Asset. Hyperledger Fabric
aims at providing a resilient, flexible, and confidential blockchain framework. It is
considered the foundation of private, open-source blockchain applied to business.

Fabric’s architecture is far more complex than any blockchain platform while
also being less secure against tampering and attacks. You would think that a “private”
blockchain would at least offer scalability and performance, but Fabric fails here as
well. Simply put, pilots built on Fabric will face a complex and insecure deployment
that won’t be able to scale with their businesses.

4.3 ETHEREUM BLOCKCHAIN:

Ethereum is a decentralized, open-source blockchain with smart contract
functionality.

4.3.1 SMART CONTRACT:

Smart contracts are effectively small computer programmes stored on a
blockchain, which will perform a transaction under specified conditions. Thus, a smart
contract is typically a declaration such as “transfer X to Y if Z occurs”. Unlike a
regular contract, where after reaching an agreement the parties must execute the
contract for it to take effect, a smart contract is self-executing: once the
instructions are written to a blockchain, the transaction will take place
automatically when the appropriate conditions are detected.
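
The “transfer X to Y if Z occurs” pattern can be mimicked off-chain to make the idea concrete. The sketch below is plain Java, not Solidity, and nothing in it runs on a blockchain: the class `ConditionalTransfer`, the in-memory ledger map and the account names are hypothetical stand-ins for on-chain state.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.BooleanSupplier;

// Hypothetical off-chain simulation of "transfer X to Y if Z occurs".
final class ConditionalTransfer {
    private final Map<String, Long> ledger;   // account name -> balance (stand-in for chain state)
    private final String from;
    private final String to;
    private final long amount;
    private final BooleanSupplier condition;  // the "Z" that must occur
    private boolean executed;

    ConditionalTransfer(Map<String, Long> ledger, String from, String to,
                        long amount, BooleanSupplier condition) {
        this.ledger = ledger;
        this.from = from;
        this.to = to;
        this.amount = amount;
        this.condition = condition;
    }

    // Moves the funds once, and only if the condition holds and the sender can pay.
    boolean tryExecute() {
        if (executed || !condition.getAsBoolean()) {
            return false;
        }
        long available = ledger.getOrDefault(from, 0L);
        if (available < amount) {
            return false;
        }
        ledger.put(from, available - amount);
        ledger.merge(to, amount, Long::sum);
        executed = true;
        return true;
    }

    public static void main(String[] args) {
        Map<String, Long> ledger = new HashMap<>();
        ledger.put("X", 100L);
        boolean[] delivered = {false};
        ConditionalTransfer contract =
                new ConditionalTransfer(ledger, "X", "Y", 40L, () -> delivered[0]);
        System.out.println("before condition: " + contract.tryExecute());
        delivered[0] = true;
        System.out.println("after condition:  " + contract.tryExecute());
        System.out.println("ledger: " + ledger);
    }
}
```

On a real network the equivalent logic would be written in a contract language, deployed once, and executed by every node rather than by a single program.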

The objectives of smart contracts are to reduce the need for trusted
intermediaries, arbitration and enforcement costs, and fraud losses, as well as to
reduce malicious and accidental exceptions.

The Ethereum network is the most famous example of a smart contract based
blockchain. It provides a virtual machine that executes code on every single node
of the network and charges for that execution.

4.3.1.1 EVM:

The Ethereum Virtual Machine or EVM is the runtime environment for smart
contracts in Ethereum. It is not only sandboxed but actually completely isolated,
which means that code running inside the EVM has no access to network, filesystem
or other processes. Smart contracts even have limited access to other smart contracts.

4.3.1.2 SOLIDITY:

Solidity is an object-oriented, high-level programming language for
implementing smart contracts on various blockchain platforms, most notably
Ethereum.

It was influenced by C++, Python and JavaScript and is designed to target the
Ethereum Virtual Machine (EVM). Solidity is statically typed. It supports complex
member variables for contracts, including arbitrarily hierarchical mappings and
structs. Solidity contracts support inheritance, including multiple inheritance with C3
linearization. Solidity uses ECMAScript-like syntax, which makes it familiar to
existing web developers; however, unlike ECMAScript it has static
typing and variadic return types. Solidity introduces an application binary interface
(ABI) that facilitates multiple type-safe functions within a single contract.

It is possible to create contracts for voting, crowdfunding, blind auctions,
multi-signature wallets and more.

4.3.1.2.1 ABI:

In computer software, an application binary interface (ABI) is an interface
between two binary program modules. Often, one of these modules is a library or
operating system facility, and the other is a program that is being run by a user.

An ABI defines how data structures or computational routines are accessed in
machine code, which is a low-level, hardware-dependent format. In contrast, an API
defines this access in source code, which is a relatively high-level,
hardware-independent, often human-readable format. A common aspect of an ABI is the
calling convention, which determines how data is provided as input to, or read as
output from, computational routines. Examples of this are the x86 calling conventions.

Adhering to an ABI (which may or may not be officially standardized) is
usually the job of a compiler, operating system, or library author. However, an
application programmer may have to deal with an ABI directly when writing a
program in a mix of programming languages, or even compiling a program written in
the same language with different compilers.

4.3.1.3 IDE:

There are many smart contract IDEs. The most common is:

4.3.1.3.1 REMIX:

Remix is a browser-based IDE that lets you write Solidity smart contracts and
then deploy and run them. It is the quickest way to try out Solidity.

4.3.1.4 GANACHE:

Ganache is a tool for running smart contracts locally. It is a personal blockchain for
rapid Ethereum distributed application development. You can use Ganache across the
entire development cycle, enabling you to develop, deploy, inspect state and test your
dApps in a safe and deterministic environment while controlling how the chain
operates. All versions of Ganache are available for Windows, Mac, and Linux.

4.3.1.5 TRUFFLE:

Truffle is a development environment, testing framework and asset pipeline
for blockchains using the Ethereum Virtual Machine (EVM), aiming to make life as a
developer easier. With Truffle, you get:

 Built-in smart contract compilation, linking, deployment and binary
management.
 Automated contract testing for rapid development.
 Scriptable, extensible deployment & migrations framework.
 Network management for deploying to any number of public & private
networks.
 Package management with EthPM & NPM, using the ERC-190
standard.
 Interactive console for direct contract communication.
 Configurable build pipeline with support for tight integration.
 External script runner that executes scripts within a Truffle environment.

4.3.2 ETHEREUM WALLET:

Ethereum wallets are applications that let you interact with your Ethereum
account. Think of it like an internet banking app – without the bank. Your wallet lets
you read your balance, send transactions and connect to applications.

Your wallet is only a tool for managing your Ethereum account. That means
you can swap wallet providers at any time. Many wallets also let you manage several
Ethereum accounts from one application. That's because wallets don't have custody of
your funds, you do. They're just a tool for managing what's really yours.

4.3.2.1 METAMASK WALLET:

MetaMask allows users to store and manage account keys, broadcast
transactions, send and receive Ethereum-based cryptocurrencies and tokens, and
securely connect to decentralized applications through a compatible web browser or
the mobile app's built-in browser. Developers connect MetaMask to their
decentralized applications by using a JavaScript library such as
Web3.js or Ethers.js to define interactions between MetaMask and smart contracts.

The MetaMask application includes an integrated service for exchanging
Ethereum tokens by aggregating several decentralized exchanges (DEXs) to find the
best exchange rate. This feature, branded as MetaMask Swaps, charges a service fee
of 0.875% of the transaction amount.

4.3.3 WEB3.JS:

Web3.js is a library which provides functionality to send transactions from a
client to the Ethereum blockchain using HTTP, IPC or WebSocket.

4.4 POLYGON NETWORK:

Polygon is a cryptocurrency, with the symbol MATIC, and also a technology
platform that enables blockchain networks to connect and scale. Polygon, described as
"Ethereum's internet of blockchains", launched under the name Matic Network in
2017. The Polygon platform operates using the Ethereum blockchain, and connects
Ethereum-based projects. Using the Polygon platform can increase the flexibility,
scalability, and sovereignty of a blockchain project while still affording the security,
interoperability, and structural benefits of the Ethereum blockchain.

4.5 IPFS:

IPFS is a peer-to-peer (P2P) distributed system for storing and accessing files.
IPFS creates a hash of every single file stored in it. The files are, subsequently,
accessed using these same hashes. Besides, it also features file versioning and
duplicate file removal. Its uses are mostly for creating distributed file-sharing services.
It is also widely used coupled with the Ethereum blockchain.
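
The idea of storing and retrieving files by their hash (content addressing) can be sketched in a few lines. This is a simplified in-memory model under stated assumptions, not the IPFS API: real IPFS splits files into blocks and identifies them with content identifiers rather than a bare SHA-256 hex string, and the class `ContentStore` is hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory model of a content-addressed store.
final class ContentStore {
    private final Map<String, byte[]> blocks = new HashMap<>();

    // The address of a piece of content is derived only from its bytes (assumed: SHA-256 hex).
    static String address(byte[] content) {
        try {
            byte[] hash = MessageDigest.getInstance("SHA-256").digest(content);
            StringBuilder hex = new StringBuilder(hash.length * 2);
            for (byte b : hash) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    // Storing identical content twice keeps a single copy, since the address is the same.
    String put(byte[] content) {
        String id = address(content);
        blocks.putIfAbsent(id, content.clone());
        return id;
    }

    // Returns a copy of the stored bytes, or null if nothing is stored under this address.
    byte[] get(String id) {
        byte[] stored = blocks.get(id);
        return stored == null ? null : stored.clone();
    }

    int size() {
        return blocks.size();
    }

    public static void main(String[] args) {
        ContentStore store = new ContentStore();
        String first = store.put("certificate.pdf bytes".getBytes(StandardCharsets.UTF_8));
        String second = store.put("certificate.pdf bytes".getBytes(StandardCharsets.UTF_8));
        System.out.println("same address: " + first.equals(second));
        System.out.println("stored copies: " + store.size());
    }
}
```

Because the address depends only on the bytes, a reader who fetches content by address can rehash it and detect any modification.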

4.6 PROOF OF EXISTENCE:

The underlying concept of verification by blockchain technologies is called
Proof of Existence. The principle behind this is that the proof of the existence of a
document can be published anonymously and securely online. The service stores only
the cryptographic hash of the file. It is essential to this concept that the actual
document is not stored or published anywhere under any circumstances. The user
therefore does not have to worry about private information being exposed.

Figure 4.6: Recording and verifying documents to/from blockchain.

CHAPTER 5

MODULE DESCRIPTION

5.1 UI DESIGN:

 The UI Design consists of five pages – Home, Upload, Verify, Delete, Admin.
 Home is the index page of system which the user visits first.
 Upload is the page to upload the document.
 Verify is the page to verify the authenticity of the document.
 Delete page is to delete the document.
 Admin page is to control the overall system.

5.2 ADMIN PANEL:

 The person who integrates the system with the blockchain has Super Admin
access.
 The Super Admin can add sub-admins through their public key and the exporter
name.
 The Admin has the right to add, edit and delete an exporter (sub-admin).
 With the public key, the sub-admin can upload documents.

5.3 UPLOAD PAGE:

 The upload page is dedicated to the process of exporting documents by the
officials.
 The process of exporting documents is done by selecting the document as a file
through the “Choose File” entry, after which the document will be hashed
directly.
 Once you click on the “Upload Document” button, it will wait for confirmation
or rejection of the transaction.
 If the transaction is confirmed by pressing the “Confirm” button, the process of
exporting the document will be completed successfully.

5.4 VERIFICATION:

 Organizations need to validate the applicant’s document, so they will visit the
official website of the document provider for the purpose of verification.

 Once the document file is uploaded (or through a QR code), the document will
be hashed.

 When the “verify” button is clicked, the system checks the document hash with
the previously archived document hashes in the blockchain.

 If it matches, the document is valid; otherwise, the document is invalid.

5.5 DELETION:

 The exporters have the authority to remove a previously exported document.

 Through the “Choose File” entry, the document file is uploaded. Once the
document file is uploaded, the document will be hashed.

 When the “Delete Document” button is clicked, it will wait for confirmation or
rejection of the transaction.

 If the transaction is confirmed by pressing the “Confirm” button, the process of
deleting the document will be completed successfully.
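
The admin, upload, verification and deletion modules described in this chapter can be summarised as a small state machine. The sketch below is an in-memory Java model under stated assumptions, not the project's smart contract: the class `DocumentRegistry`, its method names and the string keys are hypothetical, and the document hash is assumed to have been computed before the call.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory model of the registry; on-chain this state would live in a contract.
final class DocumentRegistry {
    private final String superAdminKey;
    private final Map<String, String> exporters = new HashMap<>(); // public key -> exporter name
    private final Map<String, String> documents = new HashMap<>(); // document hash -> exporter key

    DocumentRegistry(String superAdminKey) {
        this.superAdminKey = superAdminKey;
    }

    // Only the Super Admin may register an exporter (sub-admin).
    void addExporter(String callerKey, String exporterKey, String exporterName) {
        requireAdmin(callerKey);
        exporters.put(exporterKey, exporterName);
    }

    // Only the Super Admin may remove an exporter.
    void removeExporter(String callerKey, String exporterKey) {
        requireAdmin(callerKey);
        exporters.remove(exporterKey);
    }

    // A registered exporter records a document hash; returns false if it was already recorded.
    boolean upload(String callerKey, String documentHash) {
        requireExporter(callerKey);
        return documents.putIfAbsent(documentHash, callerKey) == null;
    }

    // Anyone may verify: the document is valid only if its hash was recorded earlier.
    boolean verify(String documentHash) {
        return documents.containsKey(documentHash);
    }

    // A registered exporter removes a previously recorded hash; returns false if none existed.
    boolean delete(String callerKey, String documentHash) {
        requireExporter(callerKey);
        return documents.remove(documentHash) != null;
    }

    private void requireAdmin(String callerKey) {
        if (!superAdminKey.equals(callerKey)) {
            throw new SecurityException("caller is not the super admin");
        }
    }

    private void requireExporter(String callerKey) {
        if (!exporters.containsKey(callerKey)) {
            throw new SecurityException("caller is not a registered exporter");
        }
    }

    public static void main(String[] args) {
        DocumentRegistry registry = new DocumentRegistry("admin-key");
        registry.addExporter("admin-key", "exporter-key", "Registrar Office");
        registry.upload("exporter-key", "hash-of-diploma");
        System.out.println("valid after upload: " + registry.verify("hash-of-diploma"));
        registry.delete("exporter-key", "hash-of-diploma");
        System.out.println("valid after delete: " + registry.verify("hash-of-diploma"));
    }
}
```

In this model verification needs no credentials, while upload and deletion are rejected for callers who are not registered exporters, mirroring the roles described above.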

CHAPTER 5

SYSTEM SPECIFICATION

5.1 HARDWARE REQUIREMENTS

 Processor – Quad-core with 1.4 GHz or Higher.
 RAM – 2GB or Higher.
 Storage – 20MB or Higher.

5.2 SOFTWARE REQUIREMENTS

 Operating System – Android 5 or Higher.
 Languages Used – Java.
 Framework – MLKit.

CHAPTER 6

SYSTEM DESIGN

6.1 DESIGN

Figure 6.1 Working of the system

6.2 UML DIAGRAMS

UML stands for Unified Modelling Language. UML is a standardized general-purpose
modelling language in the field of object-oriented software engineering. The
standard is managed, and was created, by the Object Management Group.

The goal is for UML to become a common language for creating models of
object-oriented computer software. In its current form UML comprises two major
components: a meta-model and a notation. In the future, some form of method or
process may also be added to, or associated with, UML.

The Unified Modelling Language is a standard language for specifying,
visualizing, constructing and documenting the artifacts of a software system, as well
as for business modelling and other non-software systems.

The UML represents a collection of best engineering practices that have proven
successful in the modelling of large and complex systems.

The UML is a very important part of developing object-oriented software and
the software development process. The UML uses mostly graphical notations to
express the design of software projects.

6.3 GOALS

The primary goals in the design of the UML are as follows:

1. Provide users with a ready-to-use, expressive visual modelling language so that
they can develop and exchange meaningful models.
2. Provide extendibility and specialization mechanisms to extend the core
concepts.
3. Be independent of particular programming languages and development
processes.
4. Provide a formal basis for understanding the modelling language.
5. Encourage the growth of the OO tools market.
6. Support higher level development concepts such as collaborations, frameworks,
patterns and components.
7. Integrate best practices.

6.4 USECASE DIAGRAM

A use case diagram in the Unified Modelling Language (UML) is a type of
behavioural diagram defined by and created from a use-case analysis. Its purpose is to
present a graphical overview of the functionality provided by a system in terms of
actors, their goals (represented as use cases), and any dependencies between those use
cases. The main purpose of a use case diagram is to show what system functions are
performed for which actor. Roles of the actors in the system can be depicted.

Figure 6.2 Use case Diagram

6.5 CLASS DIAGRAM

In software engineering, a class diagram in the Unified Modelling Language
(UML) is a type of static structure diagram that describes the structure of a system by
showing the system's classes, their attributes, operations (or methods), and the
relationships among the classes. It explains which class contains information.

Figure 6.3 Class Diagram

6.6 SEQUENCE DIAGRAM

A sequence diagram in Unified Modelling Language (UML) is a kind of
interaction diagram that shows how processes operate with one another and in what
order. It is a construct of a Message Sequence Chart. Sequence diagrams are
sometimes called event diagrams, event scenarios, and timing diagrams.

Figure 6.4 Sequence Diagram

CHAPTER 7

TESTING

7.1 PURPOSE

The purpose of testing is to discover errors. Testing is the process of trying to
discover every conceivable fault or weakness in a work product. It provides a way to
check the functionality of components, sub-assemblies, assemblies and/or a finished
product. It is the process of exercising software with the intent of ensuring that the
software system meets its requirements and user expectations and does not fail in an
unacceptable manner. There are various types of tests. Each test type addresses a
specific testing requirement.

Integration tests are designed to test integrated software components to
determine if they run as one program. Testing is event driven and is more concerned
with the basic outcome of screen subfields. Integration tests demonstrate that although
the components were individually satisfactory, as shown by successful unit testing,
the combination of components is correct and consistent. Integration testing is
specifically aimed at exposing the problems that arise from the combination of
components.

7.2 SOFTWARE TESTING

Software testing is the act of examining the artifacts and the behaviour of
the software under test by validation and verification. Software testing can also
provide an objective, independent view of the software to allow the business to
appreciate and understand the risks of software implementation. Test techniques
include, but are not necessarily limited to:

 analysing the product requirements for completeness and correctness in
various contexts like industry perspective, business perspective, feasibility
and viability of implementation, usability, performance, security,
infrastructure considerations, etc.
 reviewing the product architecture and the overall design of the product
 working with product developers on improvement in coding techniques,
design patterns, tests that can be written as part of code based on various
techniques like boundary conditions, etc.
 executing a program or application with the intent of examining behaviour
 reviewing the deployment infrastructure and associated scripts and
automation
 taking part in production activities by using monitoring and observability
techniques

Software testing can provide objective, independent information about the quality of
software and the risk of its failure to users or sponsors.

7.3 FUNCTIONAL TEST

Functional tests provide systematic demonstrations that functions tested are
available as specified by the business and technical requirements, system
documentation, and user manuals.

Functional testing is done on the following items:

Valid Input : identified classes of valid input must be accepted.

Invalid Input : identified classes of invalid input must be rejected.

Functions : identified functions must be exercised.

Output : identified classes of application outputs must be exercised.

Systems/Procedures : interfacing systems or procedures must be invoked.


Organization and preparation of functional tests is focused on requirements,
key functions, or special test cases. In addition, systematic coverage pertaining
to identified business process flows, data fields, predefined processes, and
successive processes must be considered for testing. Before functional testing
is complete, additional tests are identified and the effective value of current
tests is determined.

7.4 SYSTEM TESTING

System testing ensures that the entire integrated software system meets
requirements. It tests a configuration to ensure known and predictable results. An
example of system testing is the configuration-oriented system integration test.
System testing is based on process descriptions and flows, emphasizing pre-driven
process links and integration points.

7.4.1 WHITE BOX TESTING

White Box Testing is testing in which the software tester has
knowledge of the inner workings, structure, and language of the software, or at least
its purpose. It is used to test areas that cannot be reached from a black box
level.

7.4.2 BLACK BOX TESTING

Black Box Testing is testing the software without any knowledge of the inner
workings, structure or language of the module being tested. Black box tests, like most
other kinds of tests, must be written from a definitive source document, such as a
specification or requirements document. It is testing in which the software under
test is treated as a black box: you cannot “see” into it. The test provides inputs
and responds to outputs without considering how the software works.

7.5 UNIT TESTING

Unit testing is usually conducted as part of a combined code and unit test phase
of the software lifecycle, although it is not uncommon for coding and unit testing to be
conducted as two distinct phases.

7.5.1 TEST STRATEGY AND APPROACH

Field testing will be performed manually, and functional tests will be written in
detail.

7.5.2 TEST OBJECTIVES

 Image must be taken properly.
 Text must be extracted from that image.
 The image taking and extracting text must not be delayed.

7.5.3 FEATURES TO BE TESTED

 The ability to take image using the inbuilt camera.
 The ability to extract text from that image.

7.6 INTEGRATION TESTING

Software integration testing is the incremental integration testing of two or
more integrated software components on a single platform to produce failures caused
by interface defects.

The task of the integration test is to check that components or software
applications, e.g., components in a software system or (one step up) software
applications at the company level, interact without error.

Test Results: All the test cases mentioned above passed successfully. No defects
encountered.

7.7 ACCEPTANCE TESTING

User Acceptance Testing is a critical phase of any project and requires
significant participation by the end user. It also ensures that the system meets the
functional requirements.

Test Results: All the test cases mentioned above passed successfully. No defects
encountered.

CHAPTER 8

CONCLUSION AND FUTURE ENHANCEMENT

8.1 CONCLUSION

After completing this project, we are confident that it will help its users to
extract text from images. It will help users to save their valuable time. This
project successfully avoids the need for manually typing large amounts of data from
documents. The modules are developed in an efficient and attractive manner.
In short, this project will help its users to save time by extracting text from
documents.

8.2 FUTURE ENHANCEMENT

To further enhance the capability of this project, we recommend the following
features to be incorporated into the system:

 Improve the accuracy of the text extraction.
 Provide better user interface for user.

APPENDIX A

Sample code

activity_main.xml

<?xml version="1.0" encoding="utf-8"?>
<RelativeLayout xmlns:android="http://schemas.android.com/apk/res/android"
    xmlns:app="http://schemas.android.com/apk/res-auto"
    xmlns:tools="http://schemas.android.com/tools"
    android:layout_width="match_parent"
    android:layout_height="match_parent"
    android:background="@color/black_shade_1"
    tools:context=".MainActivity">

    <ImageView
        android:layout_width="200dp"
        android:layout_height="200dp"
        android:layout_centerHorizontal="true"
        android:src="@drawable/scanner"
        android:layout_marginTop="120dp"
        android:id="@+id/idIVLogo"/>

    <TextView
        android:layout_width="match_parent"
        android:layout_height="wrap_content"
        android:layout_below="@id/idIVLogo"
        android:id="@+id/idTVHead"
        android:layout_centerHorizontal="true"
        android:layout_marginStart="20dp"
        android:layout_marginTop="39dp"
        android:layout_marginEnd="20dp"
        android:layout_marginBottom="10dp"
        android:gravity="center"
        android:text="Welcome to Text Detector from Image Application"
        android:textAlignment="center"
        android:textColor="@color/yellow"
        android:textSize="18sp"
        android:textStyle="bold" />

    <Button
        android:id="@+id/idBtnCapture"
        android:layout_width="match_parent"
        android:layout_height="wrap_content"
        android:layout_below="@id/idTVHead"
        android:layout_marginStart="28dp"
        android:layout_marginTop="30dp"
        android:layout_marginEnd="20dp"
        android:background="@drawable/button_back"
        android:text="Capture Image"
        android:textAllCaps="false"
        app:backgroundTint="@color/yellow" />

</RelativeLayout>

MainActivity.java

package com.example.ocr;

import androidx.appcompat.app.AppCompatActivity;

import android.content.Intent;
import android.os.Bundle;
import android.view.View;
import android.widget.Button;

public class MainActivity extends AppCompatActivity {

    // Button on the welcome screen that opens the scanner screen.
    private Button captureBtn;

    @Override
    protected void onCreate(Bundle savedInstanceState) {
        super.onCreate(savedInstanceState);
        setContentView(R.layout.activity_main);

        captureBtn = findViewById(R.id.idBtnCapture);
        captureBtn.setOnClickListener(new View.OnClickListener() {
            @Override
            public void onClick(View v) {
                // Launch ScannerActivity when "Capture Image" is tapped.
                Intent i = new Intent(MainActivity.this, ScannerActivity.class);
                startActivity(i);
            }
        });
    }
}

activity_scanner.xml

<?xml version="1.0" encoding="utf-8"?>
<RelativeLayout xmlns:android="http://schemas.android.com/apk/res/android"
    xmlns:app="http://schemas.android.com/apk/res-auto"
    xmlns:tools="http://schemas.android.com/tools"
    android:layout_width="match_parent"
    android:layout_height="match_parent"
    android:background="@color/black_shade_1"
    tools:context=".ScannerActivity">

    <ImageView
        android:layout_width="250dp"
        android:layout_height="250dp"
        android:layout_centerHorizontal="true"
        android:layout_marginTop="50dp"
        android:src="@drawable/scanner"
        android:id="@+id/idIVCaptureImage"/>

    <TextView
        android:id="@+id/idTVDetectedText"
        android:layout_width="match_parent"
        android:layout_height="wrap_content"
        android:layout_below="@id/idIVCaptureImage"
        android:layout_marginStart="20dp"
        android:layout_marginTop="20dp"
        android:layout_marginEnd="20dp"
        android:layout_marginBottom="20dp"
        android:gravity="center"
        android:padding="4dp"
        android:text="Result is below, You can copy to clipboard"
        android:textAlignment="center"
        android:textColor="@color/yellow"
        android:textSize="18sp"
        android:textStyle="bold" />

    <EditText
        android:id="@+id/idTVE"
        android:layout_width="match_parent"
        android:layout_height="wrap_content"
        android:layout_below="@id/idTVDetectedText"
        android:layout_marginStart="20dp"
        android:layout_marginTop="20dp"
        android:layout_marginEnd="20dp"
        android:layout_marginBottom="20dp"
        android:gravity="center"
        android:padding="4dp"
        android:text="Your result will appear here"
        android:textAlignment="center"
        android:textColor="@color/yellow"
        android:textSize="18sp"
        android:textStyle="bold" />

    <Button
        android:id="@+id/idBtnSnap"
        android:layout_width="match_parent"
        android:layout_height="wrap_content"
        android:layout_below="@id/idTVE"
        android:layout_marginStart="20dp"
        android:layout_marginTop="30dp"
        android:layout_marginEnd="20dp"
        android:background="@drawable/button_back"
        android:text="Snap"
        android:textAllCaps="false"
        app:backgroundTint="@color/yellow" />

    <Button
        android:id="@+id/idBtnDetect"
        android:layout_width="match_parent"
        android:layout_height="wrap_content"
        android:layout_below="@id/idBtnSnap"
        android:layout_marginStart="20dp"
        android:layout_marginTop="38dp"
        android:layout_marginEnd="20dp"
        android:background="@drawable/button_back"
        android:text="Detect"
        android:textAllCaps="false"
        app:backgroundTint="@color/yellow" />

</RelativeLayout>

ScannerActivity.java

package com.example.ocr;

import static android.Manifest.permission.CAMERA;

import androidx.annotation.NonNull;

import androidx.annotation.Nullable;

import androidx.appcompat.app.AppCompatActivity;

import androidx.core.app.ActivityCompat;

import androidx.core.content.ContextCompat;

import android.content.Intent;

import android.content.pm.PackageManager;
import android.graphics.Bitmap;

import android.graphics.Point;

import android.graphics.Rect;

import android.os.Bundle;

import android.provider.MediaStore;

import android.view.View;

import android.widget.Button;

import android.widget.ImageView;

import android.widget.TextView;

import android.widget.Toast;

import com.google.android.gms.tasks.OnFailureListener;

import com.google.android.gms.tasks.OnSuccessListener;

import com.google.android.gms.tasks.Task;


import com.google.mlkit.vision.common.InputImage;

import com.google.mlkit.vision.text.Text;

import com.google.mlkit.vision.text.TextRecognition;

import com.google.mlkit.vision.text.TextRecognizer;

import com.google.mlkit.vision.text.TextRecognizerOptions;

public class ScannerActivity extends AppCompatActivity {

private ImageView captureIV;


private TextView resultTV;

private TextView resultYT;

private Button snapBtn, detectBtn;

private Bitmap imageBitmap;

static final int REQUEST_IMAGE_CAPTURE = 1;

    @Override
    protected void onCreate(Bundle savedInstanceState) {
        super.onCreate(savedInstanceState);
        setContentView(R.layout.activity_scanner);

        // Look up the views declared in activity_scanner.xml.
        captureIV = findViewById(R.id.idIVCaptureImage);
        resultTV = findViewById(R.id.idTVDetectedText);
        resultYT = findViewById(R.id.idTVE);
        snapBtn = findViewById(R.id.idBtnSnap);
        detectBtn = findViewById(R.id.idBtnDetect);

        // "Detect" runs text recognition on the captured image.
        detectBtn.setOnClickListener(new View.OnClickListener() {
            @Override
            public void onClick(View v) {
                detectText();
            }
        });

        // "Snap" opens the camera, asking for the CAMERA permission first if needed.
        snapBtn.setOnClickListener(new View.OnClickListener() {
            @Override
            public void onClick(View v) {
                if (checkPermissions()) {
                    captureImage();
                } else {
                    requestPermission();
                }
            }
        });
    }

    // Returns true if the CAMERA permission has already been granted.
    private boolean checkPermissions() {
        int cameraPermission = ContextCompat.checkSelfPermission(getApplicationContext(), CAMERA);
        return cameraPermission == PackageManager.PERMISSION_GRANTED;
    }

    // Asks the user for the CAMERA permission.
    private void requestPermission() {
        int PERMISSION_CODE = 200;
        ActivityCompat.requestPermissions(this, new String[]{CAMERA}, PERMISSION_CODE);
    }

    // Launches the device camera app if one is available to handle the intent.
    private void captureImage() {
        Intent takePicture = new Intent(MediaStore.ACTION_IMAGE_CAPTURE);
        if (takePicture.resolveActivity(getPackageManager()) != null) {
            startActivityForResult(takePicture, REQUEST_IMAGE_CAPTURE);
        }
    }

    @Override
    public void onRequestPermissionsResult(int requestCode, @NonNull String[] permissions,
                                           @NonNull int[] grantResults) {
        super.onRequestPermissionsResult(requestCode, permissions, grantResults);
        if (grantResults.length > 0) {
            boolean cameraPermission = grantResults[0] == PackageManager.PERMISSION_GRANTED;
            if (cameraPermission) {
                Toast.makeText(this, "Permission Granted..", Toast.LENGTH_SHORT).show();
            } else {
                Toast.makeText(this, "Permission denied...", Toast.LENGTH_SHORT).show();
            }
        }
    }

    @Override
    protected void onActivityResult(int requestCode, int resultCode, @Nullable Intent data) {
        super.onActivityResult(requestCode, resultCode, data);
        // data is nullable, so guard against a missing result before reading the thumbnail.
        if (requestCode == REQUEST_IMAGE_CAPTURE && resultCode == RESULT_OK
                && data != null && data.getExtras() != null) {
            Bundle extras = data.getExtras();
            imageBitmap = (Bitmap) extras.get("data");
            captureIV.setImageBitmap(imageBitmap);
        }
    }

    private void detectText() {
        // Nothing to recognize until an image has been captured with the Snap button.
        if (imageBitmap == null) {
            Toast.makeText(this, "Please capture an image first..", Toast.LENGTH_SHORT).show();
            return;
        }
        InputImage image = InputImage.fromBitmap(imageBitmap, 0);
        TextRecognizer recognizer = TextRecognition.getClient(TextRecognizerOptions.DEFAULT_OPTIONS);
        Task<Text> result = recognizer.process(image).addOnSuccessListener(new OnSuccessListener<Text>() {
            @Override
            public void onSuccess(@NonNull Text text) {
                StringBuilder result = new StringBuilder();
                // Walk the recognized text: blocks contain lines, lines contain elements.
                for (Text.TextBlock block : text.getTextBlocks()) {
                    String blockText = block.getText();
                    Point[] blockCornerPoint = block.getCornerPoints();
                    Rect blockFrame = block.getBoundingBox();
                    for (Text.Line line : block.getLines()) {
                        String lineText = line.getText();
                        Point[] lineCornerPoint = line.getCornerPoints();
                        Rect lineRect = line.getBoundingBox();
                        for (Text.Element element : line.getElements()) {
                            String elementText = element.getText();
                            result.append(elementText);
                        }
                        // Shows the text of the block currently being processed,
                        // so the field ends up holding the last block's text.
                        resultYT.setText(blockText);
                    }
                }
            }
        }).addOnFailureListener(new OnFailureListener() {
            @Override
            public void onFailure(@NonNull Exception e) {
                Toast.makeText(ScannerActivity.this,
                        "Fail to detect text from image.." + e.getMessage(),
                        Toast.LENGTH_SHORT).show();
            }
        });
    }
}
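
Note on the text traversal in detectText(): the recognizer returns text as blocks, each block holding lines, and each line holding elements (words). The listing above appends every element to a StringBuilder but displays only one block's text. As a minimal sketch of how the whole hierarchy could be joined into one string, the plain-Java example below uses nested lists as a stand-in for the recognizer's result. The class name RecognizedTextJoiner, the method join, and the sample words are hypothetical and are not part of the project or of ML Kit.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of the project: nested lists stand in for
// the block -> line -> element hierarchy returned by the text recognizer.
class RecognizedTextJoiner {

    // Joins elements with spaces, lines with newlines, and blocks with a blank line.
    static String join(List<List<List<String>>> blocks) {
        StringBuilder out = new StringBuilder();
        for (List<List<String>> block : blocks) {
            for (List<String> line : block) {
                out.append(String.join(" ", line)).append('\n');
            }
            out.append('\n'); // blank line between blocks
        }
        return out.toString().trim(); // drop the trailing newlines
    }

    public static void main(String[] args) {
        // Made-up sample: one single-line block and one two-line block.
        List<List<List<String>>> blocks = Arrays.asList(
                Arrays.asList(Arrays.asList("Hello", "World")),
                Arrays.asList(Arrays.asList("Second", "block"), Arrays.asList("last", "line")));
        System.out.println(join(blocks));
    }
}
```

In the activity, the same idea would amount to building the string inside the three loops and calling setText once after they finish, rather than once per line.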

AndroidManifest.xml

<?xml version="1.0" encoding="utf-8"?>

<manifest xmlns:android="http://schemas.android.com/apk/res/android"

xmlns:tools="http://schemas.android.com/tools"

package="com.example.ocr">

<uses-permission android:name="android.permission.CAMERA" />

<uses-permission android:name="android.permission.INTERNET" />

<application

android:allowBackup="true"

android:dataExtractionRules="@xml/data_extraction_rules"

android:fullBackupContent="@xml/backup_rules"

android:icon="@drawable/scanner"

android:label="@string/app_name"

android:roundIcon="@drawable/scanner"

android:supportsRtl="true"

android:theme="@style/Theme.Ocr"

tools:targetApi="31">

<activity

android:name=".ScannerActivity"
android:exported="false" />

<activity

android:name=".MainActivity"

android:exported="true">

<intent-filter>

<action android:name="android.intent.action.MAIN" />

<category android:name="android.intent.category.LAUNCHER" />

</intent-filter>

</activity>

</application>

</manifest>

build.gradle(Ocr)

buildscript {
    dependencies {
        classpath 'com.google.gms:google-services:4.3.13'
    }
}

// Top-level build file where you can add configuration options common to all sub-projects/modules.
plugins {
    id 'com.android.application' version '7.2.2' apply false
    id 'com.android.library' version '7.2.2' apply false
}

task clean(type: Delete) {
    delete rootProject.buildDir
}

build.gradle(:app)

plugins {
    id 'com.android.application'
    id 'com.google.gms.google-services'
}

android {
    compileSdk 32

    defaultConfig {
        applicationId "com.example.ocr"
        minSdk 21
        targetSdk 32
        versionCode 1
        versionName "1.0"
        testInstrumentationRunner "androidx.test.runner.AndroidJUnitRunner"
    }

    buildTypes {
        release {
            minifyEnabled false
            proguardFiles getDefaultProguardFile('proguard-android-optimize.txt'), 'proguard-rules.pro'
        }
    }

    compileOptions {
        sourceCompatibility JavaVersion.VERSION_1_8
        targetCompatibility JavaVersion.VERSION_1_8
    }
}

dependencies {
    implementation 'androidx.appcompat:appcompat:1.4.2'
    implementation 'com.google.android.material:material:1.6.1'
    implementation 'androidx.constraintlayout:constraintlayout:2.1.4'
    implementation 'com.google.firebase:firebase-core:19.0.0'
    implementation 'com.google.android.gms:play-services-mlkit-text-recognition:16.2.0'
    testImplementation 'junit:junit:4.13.2'
    androidTestImplementation 'androidx.test.ext:junit:1.1.3'
    androidTestImplementation 'androidx.test.espresso:espresso-core:3.4.0'
    implementation platform('com.google.firebase:firebase-bom:30.4.1')
    implementation 'com.google.firebase:firebase-core:15.0.2'
    implementation 'com.google.firebase:firebase-ml-vision:15.0.0'
}

APPENDIX B

SAMPLE OUTPUT

activity_main

activity_scanner

Final Output

REFERENCES

1. OnDemand, HPE Haven. "OCR Document". April 15, 2016.

2. Schantz, Herbert F. (1982). The History of OCR, Optical Character Recognition. [Manchester Center, Vt.]: Recognition Technologies Users Association. ISBN 9780943072012.

3. d'Albe, E. E. F. (July 1, 1914). "On a Type-Reading Optophone". Proceedings of the Royal Society A: Mathematical, Physical and Engineering Sciences. 90 (619): 373–375. Bibcode:1914RSPSA..90..373D. doi:10.1098/rspa.1914.0061.
