Synopsis

The document presents a project synopsis for a Document Scanner application developed for a Bachelor of Technology in Computer Science Engineering. The application utilizes Python libraries and image processing techniques to scan, store, and convert documents into various formats, addressing the limitations of traditional hardware scanners. The project aims to provide a user-friendly, cost-effective solution for document digitization and includes methodologies, experimental setups, and future work directions for enhancing OCR performance and mobile deployment.

Uploaded by

Suvit Sharma

Project Name

A Project Work Synopsis

Submitted in the partial fulfillment for the award of the degree of

BACHELOR OF TECHNOLOGY

IN

COMPUTER SCIENCE ENGINEERING

Submitted by:

133499 Hitesh

Under the supervision of:

Merry Paulose

NIMS UNIVERSITY, RAJASTHAN, JAIPUR – 303121

OCTOBER, 2024
Abstract
The Document Scanner application facilitates easy scanning and online storage of photos and
important documents. The application uses Python libraries to convert text and images into
scanned documents. The idea is to select a targeted portion of the document to be scanned and
create a clear copy of it in image or PDF format. The application can therefore output the
document in the various available formats.

Keywords:

• Document Scanner
• Python
• OpenCV
• Edge Detection
• Image Processing
• Perspective Transformation
• Text recognition
• Image to character
Table of Contents

Title Page

Abstract

1. Introduction
a. Problem Definition
b. Project Overview
c. Hardware Specification
d. Software Specification
2. Literature Survey
a. Existing System
b. Proposed System
c. Literature Review Summary
3. Problem Formulation
4. Research Objective
5. Methodologies
6. Experimental Setup
7. Conclusion
8. Tentative Chapter Plan for the Proposed Work
9. Reference
1. INTRODUCTION
Doc scanner systems facilitate the initial input, storage, retrieval, and display of digital images.
Specialized image processing systems additionally provide for image enhancement, image
restoration, image analysis, image compression, and image synthesis. Image enhancement
activities include, for example, sharpening edges and adjusting contrast. Restoration activities,
like photometric correction, adjust images to compensate for conversion errors. Image analysis
may extract features or classify objects within an image, while image compression concerns
itself with decreasing the overall size of a digital image file. Finally, image synthesis may
incorporate activities like visualization and image mergers. Scanning software may incorporate
features of an image processing system for user convenience and effectiveness.

1.1 Problem Definition


The conventional way of scanning documents requires bulky hardware scanners, which can be
expensive and difficult to use for individuals or small businesses. With advancements in mobile
computing and image processing technologies, it is now possible to create an affordable and
portable solution using smartphones and laptops. This project aims to provide a solution for
scanning documents using computer vision techniques to detect and process document images
efficiently.

1.2 Project Overview


The main purpose of the Doc Scanner app is to facilitate easy scanning and online storage of
photos and important documents. It uses Python libraries to convert text and images into a
scanned document, selecting a targeted portion of the document and creating a clear copy in
image or PDF format. The application can output the document in the various available formats.

As the demand for documents of all kinds, their transfer from one place to another, and the
number of file formats have kept increasing, so have the overheads and drawbacks that can lead
to confusion and inaccurate conversion. The intended purpose is therefore to implement, through
a Python interface, a document scanner app that keeps in mind the quality, backup, conversion
convenience and overall safety of the document and its contents.

1.3 Hardware Specification

• Processor: AMD Ryzen 7 4800HS.
• RAM: 16 GB for smooth multitasking.
• Storage: 256 GB SSD for quick access to files and fast development processes.
• Additional Tools: Git for version control and VS Code for editing.
• Camera: Smartphone or any device with a 5 MP+ camera for capturing documents.

1.4 Software Specifications

• Operating System: Windows 11

• Programming Language: Python 3.13

• Libraries: OpenCV, NumPy, Scikit-Image


2. LITERATURE SURVEY
This section provides a detailed overview of the existing systems, the proposed system, and a
summary of relevant research articles to highlight the significance and innovation of the
document scanner project using Python and OpenCV. It includes a critical analysis of related
work from various authors, tools and techniques used, as well as the evaluation parameters
applied in those studies.

2.1 Existing System


Existing systems in document scanning involve both hardware and software solutions. The most
common are commercial scanning apps such as Adobe Scan, CamScanner, and Google Drive
Scan. These apps use smartphones for capturing documents and apply image processing
techniques to enhance the document quality. They often employ OCR (Optical Character
Recognition) for text extraction.
Key features of existing systems:
• Image Processing: Automatic edge detection and perspective correction.
• OCR Integration: Extracting text from scanned images for easy digitization.
• Cloud Integration: Storing scanned documents in the cloud for remote access.
Limitations:
• Dependency on proprietary software and services.
• Limited customizability for specific use cases.
• Handwritten text recognition is often inaccurate.
• Expensive subscription models for advanced features.

2.2 Proposed System


The proposed system addresses some of the limitations of existing commercial applications by
providing an open-source solution using Python and OpenCV. The proposed system emphasizes:
• Customization: The ability to modify the document processing pipeline for specific
needs.
• Cost Efficiency: Open-source libraries like OpenCV and Tesseract OCR make the
system cost-effective.
• Flexibility: It can be deployed on various platforms (PC, mobile, or cloud-based) without
requiring proprietary software.
Key components of the proposed system include:
• Image Processing: Preprocessing techniques such as grayscale conversion, noise
removal, edge detection, and perspective transformation.
• OCR using Tesseract: For extracting text from images, with support for multiple
languages.
• Saving Options: Scanned documents can be saved as images, PDFs, or text files.
3. Problem Formulation

The problem involves developing a system that can accurately detect and extract the boundaries
of a document from an image and perform various image processing techniques to ensure the
document is readable, properly oriented, and visually clean for storage or sharing. The system
needs to handle various lighting conditions and document quality.

4. Objectives

The objective of this project is to create an efficient and user-friendly document scanning tool
that works with images taken from regular cameras.

The main objectives of this application are:

1. To develop a scanner application that scans the image and text documents using the web
camera and its features.

2. To deliver better scanned images and documents.

3. To minimize additional amounts of storage and focus on the exact content mentioned by the
user.

4. To provide authenticity and reliability.

5. To create backups and avoid interruption.


5. METHODOLOGY

The document scanner scans image and text documents using the web camera and delivers better
scanned images and documents, which can be further used as needed. The Document Scanner app
is developed using Python and its libraries. The doc scanner works in the following steps:

a). Select Image.

b). Scan the Image

c). Save the Scanned Image

d). Recognize Text

e). Save the Output Text

a). Select Image: We use a GUI based image selection option. It will direct the user to a dialog
box which will show the file explorer of the system from where the image can be selected
manually. This option is very handy to use.
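
The selection step could be sketched as follows. This is a minimal illustration that assumes the GUI is built with Python's standard tkinter module, which the synopsis does not name; the function names and the list of accepted extensions are likewise hypothetical.

```python
from pathlib import Path

# Extensions accepted by this sketch; the list is an assumption, not from the synopsis.
SUPPORTED_EXTENSIONS = {".png", ".jpg", ".jpeg", ".bmp", ".tif", ".tiff"}


def is_supported_image(path):
    """Return True when the file name ends in a supported image extension."""
    return Path(path).suffix.lower() in SUPPORTED_EXTENSIONS


def pick_image_path():
    """Open the system file dialog and return the chosen path, or None if cancelled."""
    # Imported lazily so the helper above also works on machines without a display.
    import tkinter as tk
    from tkinter import filedialog

    root = tk.Tk()
    root.withdraw()  # hide the empty main window; only the dialog is shown
    patterns = " ".join("*" + ext for ext in sorted(SUPPORTED_EXTENSIONS))
    path = filedialog.askopenfilename(
        title="Select a document image",
        filetypes=[("Image files", patterns), ("All files", "*.*")],
    )
    root.destroy()
    if not path:
        return None  # the user closed the dialog without choosing a file
    return path if is_supported_image(path) else None
```

Calling pick_image_path() opens the file explorer dialog; the returned path can then be passed to the scanning step.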

b). Scan the Image: This step involves the scanning of image which further involves three sub
steps. They are as follows:

• Edge Detection

• Finding Contours

• Perspective Transform

Edge Detection: To detect the edges in the image, the Canny edge detector algorithm is used.
The Canny edge detector was developed in 1986 by John F. Canny. The algorithm is composed of
5 steps:

1). Noise reduction;


2). Gradient calculation;

3). Non-maximum suppression;

4). Double threshold;

5). Edge Tracking by Hysteresis

Finding Contours: A contour is a closed curve joining all the continuous points that have the
same color or intensity; contours represent the shapes of objects found in an image. Contour
detection is a useful technique for shape analysis and object detection and recognition.

For better accuracy, this is the whole pipeline that we will follow to detect contours in an
image:

• Convert the image to a binary image. It is common practice for the input image to be a binary
image (the result of thresholding or edge detection).

• Find the contours using the OpenCV findContours() function.

• Draw these contours and show the image.

Perspective Transform: Perspective transformation is the operation used to change the
perspective of an object, for example to turn a page photographed at an angle into a flat,
front-facing view.

How to do perspective transformation?

• First, load the image to be transformed.

• Select 4 points, in order: top-left, top-right, bottom-left, bottom-right.

• Draw a circle at each point to show exactly which points were taken.

• Create a list with these 4 points; this list is used later to apply the transformation.
• Create a new set of 4 points, in the same order. These 4 points define the size of the new
window in which the transformed image is displayed.

• Apply the perspective transform to create the matrix, then warp the original frame with the
matrix just created.

• Finally, show the result on the screen.

c). Save the Scanned Image: This step saves the scanned image so that it can be used in the
following steps. The image is stored in a standard format such as PNG or JPG.

d). Recognize Text: This step is responsible for extracting the text portion from the image that is
being scanned. This also allows us to select a targeted portion of the document to be scanned and
create a clear copy. It returns the scanned copy of text in grey scale as an output.

Once the document has been scanned and saved as an image, you can use Tesseract OCR to
extract the text from the image.

Parameters to Improve OCR Performance:

a). Preprocess the image to improve recognition (binarization, resizing, and noise removal).

b). Specify the language using the lang parameter for multilingual documents.

e). Save the Output Text: This step is concerned with the saving of the output text. This copy
can be used for further tasks such as uploading or sharing of documents.

Once the text has been recognized, you can save it in a .txt file, or you can save it in other
formats like PDF using FPDF or a similar library.
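
Writing the recognized text to a .txt file needs only the standard library, as in the sketch below. The function name and the demo text are hypothetical, and PDF export with FPDF is not shown.

```python
import tempfile
from pathlib import Path


def save_text(text, output_path):
    """Write recognized text to a UTF-8 .txt file and return its path."""
    path = Path(output_path)
    path.parent.mkdir(parents=True, exist_ok=True)
    # UTF-8 keeps non-English characters from multilingual documents intact.
    path.write_text(text, encoding="utf-8")
    return path


# Demo: store a short piece of text in a temporary folder.
demo_dir = Path(tempfile.mkdtemp())
saved_text = save_text("Invoice No. 42\nTotal: 100", demo_dir / "output" / "scan.txt")
```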
6. Experimental Setup
The experimental setup includes testing the system with various types of documents, including
text-heavy pages, handwritten notes, and documents in different lighting conditions.
Performance metrics such as processing time, accuracy of edge detection, and quality of the final
output will be analyzed.
Edge Detection Accuracy: Measure how accurately the system identifies document boundaries
in various conditions (e.g., folded, partially obscured).
Processing Time: Evaluate how quickly the system processes images and produces the final
output. Measure times for each phase (edge detection, perspective transformation, and post-
processing).
Quality of Output: Assess the clarity of the document text, alignment, and overall document
quality after processing.
Error Rate: Analyze cases where the system fails (e.g., failed to detect edges or wrong contour
detection).
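
One possible way to collect the processing time per phase and the error rate is sketched below. The harness, its names and the convention that a phase returns None on failure are assumptions of this example; the stand-in phases would be replaced by the real processing functions.

```python
import time


def evaluate(phases, samples):
    """Run every sample through the phases; report mean time per phase and error rate.

    `phases` is a list of (name, function) pairs applied in order. A phase signals
    failure by returning None (an assumption of this sketch).
    """
    if not samples:
        raise ValueError("at least one sample is required")
    totals = {name: 0.0 for name, _ in phases}
    runs = {name: 0 for name, _ in phases}
    failures = 0
    for sample in samples:
        data = sample
        for name, func in phases:
            start = time.perf_counter()
            data = func(data)
            totals[name] += time.perf_counter() - start
            runs[name] += 1
            if data is None:
                failures += 1
                break  # later phases are skipped for a failed sample
    mean_seconds = {n: (totals[n] / runs[n] if runs[n] else 0.0) for n in totals}
    return {"mean_seconds": mean_seconds, "runs": runs, "error_rate": failures / len(samples)}


# Stand-in phases: the first one "fails" for non-positive inputs.
demo_phases = [
    ("edge_detection", lambda x: x if x > 0 else None),
    ("perspective_transformation", lambda x: x * 2),
]
report = evaluate(demo_phases, [1, 0, 3, -2])
```

In this demo two of the four inputs fail in the first phase, so the reported error rate is 0.5 and the second phase runs only twice.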
Test Scenarios :
• Scenario 1: Perfect Condition: Well-lit, flat document, perfect angle.
• Scenario 2: Poor Lighting: Document in dim light.
• Scenario 3: Skewed Angle: Document photographed at an angle.
• Scenario 4: Crumpled Document: Folded or wrinkled document.
• Scenario 5: Overlapping Objects: Document with objects partially overlapping its
edges.
Once testing is complete, you will analyze the outcomes and identify areas for improvement:
• Which conditions caused the most challenges (e.g., dim lighting, crumpled documents)?
• What enhancements can be made to improve edge detection, accuracy, or processing
time?
• How does the system compare to other solutions?
The experimental setup ensures that the document scanner works efficiently in different
conditions and environments, providing reliable results.
7. Conclusion
The Doc Scanner App implemented using python fully meets the objectives for which it has been
developed. It operates at a high level of accuracy and the user associated with the system
understands its advantage. It was intended to solve the problem as per requirement specification
and does this successfully by saving a lot of time, manual effort and extra expenses. Moreover, it
makes the task of sharing and uploading documents easier and faster.

This project aimed to design a document scanner using Python and OpenCV to streamline
document digitization. By using image processing techniques for edge detection and perspective
correction, followed by text recognition using OCR, the system successfully demonstrated its
utility in creating accurate digital copies of physical documents.

The system effectively detects the boundaries of a document, performs perspective
transformations to correct the angle, and uses OCR to extract text. The output can be saved in
both image and text formats, making the solution versatile and user-friendly.

While the system performed well under most conditions, challenges were encountered with poor
lighting and handwritten documents. Future iterations of the system could incorporate advanced
preprocessing techniques or train a custom OCR model to improve recognition accuracy in
challenging environments.

Future work will focus on improving the OCR performance for handwritten documents and
extending the solution to mobile platforms, making document scanning more accessible and
efficient.

This project provides an efficient, Python-based solution for document digitization, offering a
flexible and cost-effective alternative to commercial scanners. With additional improvements,
this system has the potential to become a valuable tool for personal and professional use.
8. Tentative Chapter Plan for the Proposed Work

Chapter 1: Introduction
This chapter introduces the project, giving an overview of the problem you are addressing and
the objectives of the project.

• Problem Definition: Define the problem of manually scanning documents and the need
for an automated, efficient document scanner.

• Project Motivation: Explain why this project is important, such as its usefulness in
digitizing documents for personal, academic, or professional purposes.

• Objectives: Outline the main objectives of the project (e.g., creating a Python-based
document scanner, extracting text from scanned images using OCR).

Chapter 2: Literature Review


This chapter reviews previous research and existing systems related to document scanning,
image processing, and text recognition.

• Existing Systems: Discuss commercial and open-source document scanners (e.g., Adobe
Scan, CamScanner) and their capabilities/limitations.

• Relevant Research: Summarize key academic papers or technologies related to edge
detection, perspective correction, and OCR.

• Summary: Identify the gaps in existing systems and justify the need for your proposed
solution.

Chapter 3: Problem Formulation and Objectives


This chapter focuses on clearly defining the problem and setting measurable goals.

• Problem Formulation: Formally state the problem your project addresses, such as
improving document digitization using accessible tools.
• Objectives: Set clear and specific objectives (e.g., building a user-friendly scanner,
accurate text extraction, performance in various lighting conditions).

• Scope of the Project: Define the boundaries and limitations of the project (e.g., focusing
on printed text and not handwritten documents).

Chapter 4: Methodology
This chapter explains the design approach and the tools used for the system.

• System Design: Present the overall design architecture of the document scanner, from
image acquisition to processing and output.

• Methodology: Explain the step-by-step process, such as:

1. Image acquisition using a camera.

2. Image preprocessing (grayscale conversion, noise reduction).

3. Edge detection and contour extraction using OpenCV.

4. Perspective transformation to flatten the document.

5. Text recognition using Tesseract OCR.

6. Saving the output image and recognized text.

Chapter 5: Experimental Setup


This chapter describes the setup for testing and evaluating the project.

Experimental Setup: Explain the hardware and software used in testing (e.g., types of documents,
lighting conditions, camera specifications).

Test Cases: Present different test scenarios, such as:

a). Scanning flat documents.

b). Scanning wrinkled or folded documents.

c). Performance in low-light environments.


Chapter 6: Conclusion and Future Scope
This chapter summarizes the project and outlines future directions.

Conclusion: Recap the achievements of the project, the solutions provided, and the results
obtained.

Future Work: Propose improvements or extensions to the project, such as:

a). Adding support for handwritten text recognition.

b). Improving the system’s performance under poor lighting.

c). Deploying the system on mobile platforms.


9. REFERENCE
[1] G. T. Shrivakshan and Dr. C. Chandrasekar, "A Comparison of various Edge Detection
Techniques used in Image Processing", International Journal of Computer Science Issues, Vol. 9,
Issue 5, No 1, September 2012.

[2] Wenshuo Gao et al., “An improved Sobel edge detection”, Computer Science and
Information Technology (ICCSIT), 2010 3rd IEEE International Conference, China, Volume 5,
pp. 67–71, 9-11 July 2010.

[3] Bao, P., Zhang, L., Wu, X.: Canny Edge Detection Enhancement by Scale Multiplication.
IEEE Trans. on PAMI 27(9), 1485–1490 (2005).

[4] Muthukrishnan, R. and M. Radha, “Edge Detection Techniques for Image Segmentation,”
International Journal of Computer Science & Information Technology (IJCSIT), Vol. 3, No. 6,
Dec. 2011.

[5] D. B. Dhar and B. Chanda, “Extraction and recognition of geographical features from paper
maps,” International Journal on Document Analysis and Recognition, vol. 8, no. 4, pp. 232– 245,
2006.

[6] R. Pradhan, S. Kumar, R. Agarwal, M. P. Pradhan, and M. K. Ghose, “Contour line tracing
algorithm for digital topographic maps,” International Journal of Image Processing, vol. 4, no. 2,
pp. 156–163, 2010.

[7] Zhang J. Research on the geometric distortion auto-correction algorithm for image scanned.
Applied Mechanics and Materials, 2014, 3468(644): 30–44.

[8] Namboodiri, A., Jain, A.K. (2007) “Document Structure and Layout Analysis”, Digital
Document Processing: Major Directions and Recent Advances.
