Synopsis
BACHELOR OF TECHNOLOGY
IN
COMPUTER SCIENCE ENGINEERING
Submitted by:
133499 Hitesh
Merry Paulose
OCTOBER, 2024
Abstract
The Document Scanner application facilitates easy scanning and online storage of photos and
important documents. The application uses Python libraries to convert photographed text
and images into scanned documents. The idea is to select a targeted portion of the
document to be scanned and create a clear copy of it in image or PDF format.
The application can therefore output the document in any of the supported formats.
Keywords:
• Document Scanner
• Python
• OpenCV
• Edge Detection
• Image Processing
• Perspective Transformation
• Text recognition
• Image to character
Table of Contents
Title Page
Abstract
1. Introduction
a. Problem Definition
b. Project Overview
c. Hardware Specification
d. Software Specification
2. Literature Survey
a. Existing System
b. Proposed System
c. Literature Review Summary
3. Problem Formulation
4. Research Objective
5. Methodologies
6. Experimental Setup
7. Conclusion
8. Tentative Chapter Plan for the Proposed Work
9. Reference
INTRODUCTION
Document scanner systems facilitate the initial input, storage, retrieval, and display of digital images.
Specialized image processing systems additionally provide for image enhancement, image
restoration, image analysis, image compression, and image synthesis. Image enhancement
activities include, for example, sharpening edges and adjusting contrast. Restoration activities,
like photometric correction, adjust images to compensate for conversion errors. Image analysis
may extract features or classify objects within an image, while image compression concerns
itself with decreasing the overall size of a digital image file. Finally, image synthesis may
incorporate activities like visualization and image mergers. Scanning software may incorporate
features of an image processing system for user convenience and effectiveness.
As the demand for documents of all kinds, the need to move them from one place to another, and
the number of file formats have kept increasing, so have the overheads and drawbacks that can
lead to confusion and inaccuracy in conversion. The intended purpose of this project is
therefore a document scanner app, built with a Python interface, that keeps the quality, backup,
conversion convenience, and overall safety of the document and its contents in mind.
The problem involves developing a system that can accurately detect and extract the boundaries
of a document from an image and perform various image processing techniques to ensure the
document is readable, properly oriented, and visually clean for storage or sharing. The system
needs to handle various lighting conditions and document quality.
4. Objectives
The objective of this project is to create an efficient and user-friendly document scanning tool
that works with images taken from regular cameras.
1. To develop a scanner application that scans the image and text documents using the web
camera and its features.
3. To minimize additional storage by keeping only the exact content specified by the
user.
The document scanner scans image and text documents using the web camera and delivers better
scanned images and documents, which can then be used as needed.
The Document Scanner App is developed using Python and its libraries. The scanner works
through the following steps:
a). Select Image: We use a GUI based image selection option. It will direct the user to a dialog
box which will show the file explorer of the system from where the image can be selected
manually. This option is very handy to use.
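The image selection step can be sketched as follows, assuming the GUI is built with Python's standard tkinter module (the synopsis does not name the GUI toolkit). The names select_image, is_supported_image and SUPPORTED_SUFFIXES are hypothetical and used only for illustration.

```python
from pathlib import Path

# Assumed set of accepted extensions; adjust to the formats the app supports.
SUPPORTED_SUFFIXES = {".png", ".jpg", ".jpeg", ".bmp"}


def is_supported_image(path):
    """Return True when the file extension is one this sketch accepts."""
    return Path(path).suffix.lower() in SUPPORTED_SUFFIXES


def select_image():
    """Open the system file dialog and return the chosen path, or None if cancelled."""
    # Imported here so the helper above stays usable on machines without a display.
    import tkinter as tk
    from tkinter import filedialog

    root = tk.Tk()
    root.withdraw()  # hide the empty root window; only the dialog is shown
    patterns = " ".join("*" + suffix for suffix in sorted(SUPPORTED_SUFFIXES))
    path = filedialog.askopenfilename(title="Select an image",
                                      filetypes=[("Image files", patterns)])
    root.destroy()
    return path if path and is_supported_image(path) else None
```

Calling select_image() opens the file explorer dialog; the returned path can then be passed to the scanning step.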
b). Scan the Image: This step scans the selected image and consists of three sub-steps:
• Edge Detection
• Finding Contours
• Perspective Transform
Edge Detection: The Canny edge detector is used to detect the edges in the image. The
algorithm was developed by John F. Canny in 1986 and is composed of 5 steps:
1) Noise reduction;
2) Gradient calculation;
3) Non-maximum suppression;
4) Double threshold;
5) Edge tracking by hysteresis.
Finding Contours: A contour is a closed curve joining all the continuous points that have the same
color or intensity; contours represent the shapes of objects found in an image. Contour detection
is a useful technique for shape analysis and for object detection and recognition.
For better accuracy, contours in an image are detected with the following pipeline:
• Convert the image to a binary image. It is common practice for the input to contour detection
to be a binary image (the result of thresholding or edge detection).
Perspective Transform: The perspective transformation is the operation used to change the
perspective from which an object is viewed, for example to turn a page photographed at an angle
into a flat, top-down view.
• Create a list with the 4 corner points of the document; this list is used later to apply the
transformation.
• Create a new set of 4 points. These 4 points define the size of the new window in which the
transformed image is displayed.
• Apply the perspective transform to the two sets of points to create the transformation matrix,
then warp the original frame using that matrix.
c). Save the Scanned Image: This step saves the scanned image so that it can be used in later
tasks. The image is stored in a format such as PNG or JPG.
d). Recognize Text: This step extracts the text from the scanned image. It also allows us to
select a targeted portion of the document to be scanned and create a clear copy. It returns the
scanned copy of the text in grayscale as output.
Once the document has been scanned and saved as an image, you can use Tesseract OCR to
extract the text from the image.
a). Preprocess the image to improve recognition (binarization, resizing, and noise removal).
b). Specify the language using the lang parameter for multilingual documents.
e). Save the Output Text: This step saves the output text. The saved copy can be used for
further tasks such as uploading or sharing documents.
Once the text has been recognized, you can save it in a .txt file, or you can save it in other
formats like PDF using FPDF or a similar library.
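Saving the output can be sketched as follows. The .txt path uses only the standard library; the PDF path assumes the FPDF package is installed, and the font and line height are illustrative. The function names save_text and save_pdf are hypothetical.

```python
import tempfile
from pathlib import Path


def save_text(text, output_path):
    """Write recognized text to a UTF-8 .txt file and return its path."""
    path = Path(output_path)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(text, encoding="utf-8")
    return path


def save_pdf(text, output_path):
    """Write recognized text to a simple one-column PDF using FPDF."""
    # Imported lazily: FPDF is an optional third-party dependency.
    from fpdf import FPDF
    pdf = FPDF()
    pdf.add_page()
    # Built-in fonts cover Latin-1 only; other scripts need a Unicode font.
    pdf.set_font("Helvetica", size=12)
    pdf.multi_cell(0, 10, text)
    pdf.output(str(output_path))
    return Path(output_path)


# Round trip through a temporary directory to show the text is preserved.
with tempfile.TemporaryDirectory() as tmp:
    saved = save_text("Sample recognized text", Path(tmp) / "output" / "scan.txt")
    round_trip = saved.read_text(encoding="utf-8")
```

Writing the .txt file with an explicit UTF-8 encoding avoids platform-dependent defaults when the recognized text contains non-ASCII characters.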
6. Experimental Setup
The experimental setup includes testing the system with various types of documents, including
text-heavy pages, handwritten notes, and documents in different lighting conditions.
Performance metrics such as processing time, accuracy of edge detection, and quality of the final
output will be analyzed.
Edge Detection Accuracy: Measure how accurately the system identifies document boundaries
in various conditions (e.g., folded, partially obscured).
Processing Time: Evaluate how quickly the system processes images and produces the final
output. Measure times for each phase (edge detection, perspective transformation, and post-
processing).
Quality of Output: Assess the clarity of the document text, alignment, and overall document
quality after processing.
Error Rate: Analyze cases where the system fails (e.g., failed to detect edges or wrong contour
detection).
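The processing time and error rate measurements can be sketched as follows using only the Python standard library. The PhaseTimer class and error_rate function are hypothetical helpers, and the lambda stands in for a real pipeline phase.

```python
import time
from collections import defaultdict


class PhaseTimer:
    """Accumulate wall-clock seconds per named pipeline phase."""

    def __init__(self):
        self.totals = defaultdict(float)

    def run(self, phase, func, *args, **kwargs):
        """Call func, add its duration to the phase total, return its result."""
        start = time.perf_counter()
        result = func(*args, **kwargs)
        self.totals[phase] += time.perf_counter() - start
        return result


def error_rate(outcomes):
    """Fraction of test images for which processing failed (outcome is False)."""
    outcomes = list(outcomes)
    if not outcomes:
        raise ValueError("no outcomes recorded")
    return sum(1 for ok in outcomes if not ok) / len(outcomes)


# Placeholder phase: doubling a list stands in for edge detection.
timer = PhaseTimer()
doubled = timer.run("edge detection", lambda values: [v * 2 for v in values], [1, 2, 3])
# Three successful detections and one failure across four test images.
rate = error_rate([True, True, False, True])
```

Wrapping each phase (edge detection, perspective transformation, post-processing) in timer.run gives per-phase totals that can be compared across the test scenarios.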
Test Scenarios:
• Scenario 1: Perfect Condition: Well-lit, flat document, perfect angle.
• Scenario 2: Poor Lighting: Document in dim light.
• Scenario 3: Skewed Angle: Document photographed at an angle.
• Scenario 4: Crumpled Document: Folded or wrinkled document.
• Scenario 5: Overlapping Objects: Document with objects partially overlapping its
edges.
Once testing is complete, you will analyze the outcomes and identify areas for improvement:
• Which conditions caused the most challenges (e.g., dim lighting, crumpled documents)?
• What enhancements can be made to improve edge detection, accuracy, or processing
time?
• How does the system compare to other solutions?
The experimental setup ensures that the document scanner works efficiently in different
conditions and environments, providing reliable results.
7. Conclusion
The Doc Scanner App, implemented in Python, meets the objectives for which it was developed. It
operates at a high level of accuracy, and its advantages are clear to the users of the system.
It was intended to solve the problem set out in the requirement specification, and it does so
by saving time, manual effort, and extra expense. Moreover, it makes the task of sharing and
uploading documents easier and faster.
This project aimed to design a document scanner using Python and OpenCV to streamline
document digitization. By using image processing techniques for edge detection and perspective
correction, followed by text recognition using OCR, the system successfully demonstrated its
utility in creating accurate digital copies of physical documents.
While the system performed well under most conditions, challenges were encountered with poor
lighting and handwritten documents. Future iterations of the system could incorporate advanced
preprocessing techniques or train a custom OCR model to improve recognition accuracy in
challenging environments.
Future work will focus on improving the OCR performance for handwritten documents and
extending the solution to mobile platforms, making document scanning more accessible and
efficient.
This project provides an efficient, Python-based solution for document digitization, offering a
flexible and cost-effective alternative to commercial scanners. With additional improvements,
this system has the potential to become a valuable tool for personal and professional use.
8. Tentative Chapter Plan for the Proposed Work
Chapter 1: Introduction
This chapter introduces the project, giving an overview of the problem you are addressing and
the objectives of the project.
• Problem Definition: Define the problem of manually scanning documents and the need
for an automated, efficient document scanner.
• Project Motivation: Explain why this project is important, such as its usefulness in
digitizing documents for personal, academic, or professional purposes.
• Objectives: Outline the main objectives of the project (e.g., creating a Python-based
document scanner, extracting text from scanned images using OCR).
• Existing Systems: Discuss commercial and open-source document scanners (e.g., Adobe
Scan, CamScanner) and their capabilities/limitations.
• Summary: Identify the gaps in existing systems and justify the need for your proposed
solution.
• Problem Formulation: Formally state the problem your project addresses, such as
improving document digitization using accessible tools.
• Objectives: Set clear and specific objectives (e.g., building a user-friendly scanner,
accurate text extraction, performance in various lighting conditions).
• Scope of the Project: Define the boundaries and limitations of the project (e.g., focusing
on printed text and not handwritten documents).
Chapter 4: Methodology
This chapter explains the design approach and the tools used for the system.
• System Design: Present the overall design architecture of the document scanner, from
image acquisition to processing and output.
Experimental Setup: Explain the hardware and software used in testing (e.g., types of documents,
lighting conditions, camera specifications).
Conclusion: Recap the achievements of the project, the solutions provided, and the results
obtained.