OMR: A Bubble Sheet Scanner and OMR Technology Report Using Machine Learning

CHAPTER-1

INTRODUCTION
To build a document scanner, we apply contour sorting and perspective transformation techniques, using the Canny filter for edge detection and a Gaussian filter for blurring the image (the OMR answer sheet). All of this is done in Python using the Sublime Text editor. The student's OMR responses are read and compared with the answer key, and a percentage score is estimated. The evaluated OMR sheet is displayed on the screen after executing the code.

1.1 What is Optical Mark Recognition (OMR)?


Optical Mark Recognition, or OMR for short, is the process
of automatically analyzing human-marked documents and interpreting their
results.
Arguably, the most famous, easily recognizable form of OMR is the bubble sheet multiple-choice test, not unlike the ones you took in elementary school, middle school, or even high school.
If you're unfamiliar with "bubble sheet tests" or the trademark/corporate name of "Scantron tests", they are simply multiple-choice tests that you take as a student. Each question on the exam is multiple choice, and you use a #2 pencil to mark the "bubble" that corresponds to the correct answer.

The most notable bubble sheet test you may have experienced (at least in the United States) was taking the SATs during high school, prior to filling out college admission applications.

I believe that the SATs use the software provided by Scantron to perform OMR
and grade student exams, but I could easily be wrong there. I only make note
of this because Scantron is used in over 98% of all US school districts.
In short, what I’m trying to say is that there is a massive market for Optical
Mark Recognition and the ability to grade and interpret human-marked forms
and exams.

FIG.1.1

1.2 Getting started:


These instructions will get you a copy of the project up and running on your
local machine for development and testing purposes.

1.3 Prerequisites:

 Python 3
 OpenCV 3.4.3 or later
 NumPy
 imutils
 SciPy (Windows only)

1.4 Installing on Windows Subsystem for Linux

To install the libraries, run the following commands:

$ apt install python3
$ apt install python3-opencv
$ apt install python3-pip
$ pip3 install numpy
$ pip3 install scipy
$ pip3 install imutils

1.5 About Python3:


Python is a general-purpose interpreted, interactive, object-oriented, and high-level programming language. It was created by Guido van Rossum during 1985-1990. Python source code is available under an open-source license (the GPL-compatible Python Software Foundation License). Python is named after the TV show 'Monty Python's Flying Circus' and not after the python snake.
Python 3.0 was released in 2008. Although this version was designed to be backward incompatible, many of its important features have since been backported to version 2.7. This section gives enough understanding of the Python 3 programming language.

Python is a high-level, interpreted, interactive and object-oriented scripting language. Python is designed to be highly readable. It uses English keywords frequently whereas other languages use punctuation. It has fewer syntactical constructions than other languages.
 Python is Interpreted − Python is processed at runtime by the
interpreter. You do not need to compile your program before executing
it. This is similar to PERL and PHP.
 Python is Interactive − You can actually sit at a Python prompt and
interact with the interpreter directly to write your programs.
 Python is Object-Oriented − Python supports Object-Oriented style or
technique of programming that encapsulates code within objects.
 Python is a Beginner's Language − Python is a great language for beginner-level programmers and supports the development of a wide range of applications, from simple text processing to web browsers to games.
1.6 History of Python
Python was developed by Guido van Rossum in the late eighties and early
nineties at the National Research Institute for Mathematics and Computer
Science in the Netherlands.

 Python is derived from many other languages, including ABC, Modula-3,
C, C++, Algol-68, SmallTalk, and Unix shell and other scripting languages.
 Python is copyrighted. Like Perl, Python source code is now available under an open-source license (the GPL-compatible Python Software Foundation License).
 Python is now maintained by a core development team at the institute, although Guido van Rossum still holds a vital role in directing its progress.
 Python 1.0 was released in November 1994. In 2000, Python 2.0 was released. At the time of writing, Python 2.7.11 was the latest edition of Python 2.
 Meanwhile, Python 3.0 was released in 2008. Python 3 is not backward compatible with Python 2. The emphasis in Python 3 has been on the removal of duplicate programming constructs and modules so that "There should be one -- and preferably only one -- obvious way to do it." At the time of writing, Python 3.5.1 was the latest version of Python 3.
1.7 Python Features
Python's features include −
 Easy-to-learn − Python has few keywords, simple structure, and a
clearly defined syntax. This allows a student to pick up the language
quickly.
 Easy-to-read − Python code is more clearly defined and visible to the
eyes.
 Easy-to-maintain − Python's source code is fairly easy-to-maintain.
 A broad standard library − The bulk of Python's library is very portable and cross-platform compatible on UNIX, Windows, and Macintosh.
 Interactive Mode − Python has support for an interactive mode which
allows interactive testing and debugging of snippets of code.
 Portable − Python can run on a wide variety of hardware platforms and
has the same interface on all platforms.
 Extendable − You can add low-level modules to the Python interpreter.
These modules enable programmers to add to or customize their tools
to be more efficient.
 Databases − Python provides interfaces to all major commercial
databases.

 GUI Programming − Python supports GUI applications that can be
created and ported to many system calls, libraries and windows
systems, such as Windows MFC, Macintosh, and the X Window system
of Unix.
 Scalable − Python provides a better structure and support for large
programs than shell scripting.
Apart from the above-mentioned features, Python has a big list of good features. A few are listed below −
 It supports functional and structured programming methods as well as
OOP.
 It can be used as a scripting language or can be compiled to byte-code
for building large applications.
 It provides very high-level dynamic data types and supports dynamic
type checking.
 It supports automatic garbage collection.
 It can be easily integrated with C, C++, COM, ActiveX, CORBA, and Java.

1.8 About OpenCV:


OpenCV (Open Source Computer Vision Library: http://opencv.org) is an open-source BSD-licensed library that includes several hundred computer vision algorithms. This document describes the so-called OpenCV 2.x API, which is essentially a C++ API, as opposed to the C-based OpenCV 1.x API. OpenCV was built to provide a common infrastructure for computer vision applications and to accelerate the use of machine perception in commercial products. Being a BSD-licensed product, OpenCV makes it easy for businesses to utilize and modify the code.

OpenCV has a modular structure, which means that the package includes
several shared or static libraries. The following modules are available:

 Core functionality - a compact module defining basic data structures, including the dense multi-dimensional array Mat and basic functions used by all other modules.
 Image processing - an image processing module that includes linear and non-linear image filtering, geometrical image transformations (resize, affine and perspective warping, generic table-based remapping), color space conversion, histograms, and so on.
 video - a video analysis module that includes motion estimation,
background subtraction, and object tracking algorithms.
 calib3d - basic multiple-view geometry algorithms, single and stereo
camera calibration, object pose estimation, stereo correspondence
algorithms, and elements of 3D reconstruction.
 features2d - salient feature detectors, descriptors, and descriptor
matchers.
 objdetect - detection of objects and instances of the predefined classes
(for example, faces, eyes, mugs, people, cars, and so on).
 highgui - an easy-to-use interface to simple UI capabilities.
 Video I/O - an easy-to-use interface to video capturing and video
codecs.
 gpu - GPU-accelerated algorithms from different OpenCV modules.
 ... some other helper modules, such as FLANN and Google test wrappers,
Python bindings, and others.

1.9 Software Used:

1.9.1 About SUBLIME TEXT:

Sublime Text is a proprietary cross-platform source code editor with a Python application programming interface (API). It natively supports many programming languages and markup languages, and functions can be added by users with plugins, typically community-built and maintained under free-software licenses.

FIG. 1.2

1.9.2 Features
The following is a list of features of Sublime Text:

 "Goto Anything," quick navigation to files, symbols, or lines


 "Command palette" uses adaptive matching for quick keyboard invocation
of arbitrary commands
 Simultaneous editing: simultaneously make the same interactive changes to
multiple selected areas
 Python-based plugin API
 Project-specific preferences
 Extensive customizability via JSON settings files, including project-specific
and platform-specific settings
 Cross-platform (Windows, macOS, and Linux), with supporting plugins for cross-platform development
 Compatible with many language grammars from TextMate

FIG 1.3

CHAPTER-2

2.1 Implementing a bubble sheet scanner and grader using OMR, Python,
and OpenCV
We will use this image of a bubble sheet for executing the code:

FIG 2.1

2.2 The 7 steps to build a bubble sheet scanner and grader:


To accomplish this, our implementation will need to satisfy the following
7 steps:

 Step #1: Detect the exam in an image.
 Step #2: Apply a perspective transform to extract the top-down, birds-eye-view of the exam.
 Step #3: Extract the set of bubbles (i.e., the possible answer choices) from the
perspective transformed exam.
 Step #4: Sort the questions/bubbles into rows.
 Step #5: Determine the marked (i.e., “bubbled in”) answer for each row.
 Step #6: Look up the correct answer in our answer key to determine if the user was correct in their choice.
 Step #7: Repeat for all questions in the exam.

2.3 Implementation with Python and OpenCV

To get started, open up a new file and name it test_grader.py.

2.3.1 STARTING WITH THE CODE:

FIG 2.2

2.3.2 Explaining the code:


 Installing imutils

FIG 2.3

 Installing Numpy

 Installing argparse

 Installing OpenCV

Step 1: Install Visual Studio

FIG 2.4

Step 3: Install Anaconda (a python distribution)

FIG 2.5

Step 4: Download and extract opencv-3.3.1 and opencv_contrib-3.3.1

FIG 2.6

Depending upon where you have kept the opencv-3.3.1 folder, this path would be different.

FIG 2.7

FIG 2.8

FIG 2.9

When prompted to select a compiler, select Visual Studio 14 2015 Win64.

FIG 2.10

Click finish and in the next window keep the default parameters checked.

Click finish. Now CMake will look in the system directories and generate the
makefiles.

Step 5.2: Add Python paths for both Python 2 and Python 3

Now click Configure again. After configuring is done, search for opencv_python in the search bar; both BUILD_opencv_python2 and BUILD_opencv_python3 will be automatically checked. Now we are sure that OpenCV binaries for both Python 2 and Python 3 will be generated after compilation.

Step 5.3: Generate build files

If CMake is able to configure without any errors it should say “Configuring
done”.
Click generate.

Step 6: Compile OpenCV

Step 6.1: Compile OpenCV in Release mode

Open Windows Command Prompt (cmd).


Go to the OPENCV_PATH/build directory and run this command:

cmake.exe --build . --config Release --target INSTALL

Step 6.2: Compile OpenCV in Debug mode

Open CMake GUI again as mentioned in Step 5.

1. Search "python" in the search box
2. Uncheck INSTALL_PYTHON_EXAMPLES, BUILD_opencv_python3 and BUILD_opencv_python2
3. Click configure
4. Click generate

Now, in the Windows command prompt, go to the OPENCV_PATH/build directory and run this command:

cmake.exe --build . --config Debug --target INSTALL

Now that we have compiled OpenCV, we will find out how to test an OpenCV project using CMake.

Step 7: Update System Environment Variables

Step 7.1: Update environment variable – PATH

First of all, we will add the path of the OpenCV DLL files to our system PATH. Press the Windows key and search for "environment variables".

Click Environment Variables in the System Properties window

Under System Variables, select Path and click Edit

Click New, give the path to OPENCV_PATH\build\install\x64\vc14\bin and click OK. Depending upon where you have kept the opencv-3.3.1 folder and what version of Visual Studio you used to compile OpenCV, this path would be different. In my case the full path is:
C:\Users\Documents\opencv-3.3.1\build\install\x64\vc14\bin

Now click OK to save.

2.4 Now progressing with the code...

 Lines 10-12 parse our command line arguments. We only need a single switch here, --image, which is the path to the input bubble sheet test image that we are going to grade for correctness.
 Line 17 then defines our ANSWER_KEY.
As the name of the variable suggests, the ANSWER_KEY provides integer mappings of the question numbers to the index of the correct bubble. In this case, a key of 0 indicates the first question, while a value of 1 signifies "B" as the correct answer (since "B" is at index 1 in the string "ABCDE"). As a second example, consider a key of 1 that maps to a value of 4: this would indicate that the answer to the second question is "E".
As a matter of convenience, I have written the entire answer key in plain English here:

 Question #1: B
 Question #2: E
 Question #3: A
 Question #4: D
 Question #5: B
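
Since the code listing itself appears only as a figure (FIG 2.2), the following is a minimal sketch of what this setup could look like; the variable names follow the text, but the exact listing may differ. The snippets in the sections below build on this setup.

# test_grader.py - a sketch of the setup described above
from imutils.perspective import four_point_transform
from imutils import contours
import numpy as np
import argparse
import imutils
import cv2

# parse the single --image command line argument (Lines 10-12)
ap = argparse.ArgumentParser()
ap.add_argument("-i", "--image", required=True,
    help="path to the input bubble sheet image")
args = vars(ap.parse_args())

# ANSWER_KEY maps question index -> index of the correct bubble in
# "ABCDE" (Line 17): question 0 -> B, 1 -> E, 2 -> A, 3 -> D, 4 -> B
ANSWER_KEY = {0: 1, 1: 4, 2: 0, 3: 3, 4: 1}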
Next, let’s preprocess our input image:

On Line 21 we load our image from disk, followed by converting it to grayscale (Line 22) and blurring it to reduce high frequency noise (Line 23). We then apply the Canny edge detector on Line 24 to find the edges/outlines of the exam.
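
A plausible version of this preprocessing step using standard OpenCV calls; the blur kernel size and Canny thresholds below are assumed values, not taken from the figure:

# load the image, convert it to grayscale, blur it to reduce
# high frequency noise, then find edges in the blurred image
image = cv2.imread(args["image"])
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
blurred = cv2.GaussianBlur(gray, (5, 5), 0)
edged = cv2.Canny(blurred, 75, 200)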

Below I have included a screenshot of our exam after applying edge detection:

Notice how the edges of the document are clearly defined, with all four vertices of the exam being present in the image.
Obtaining this silhouette of the document is extremely important in our next step as we will use it as a marker to apply a perspective transform to the exam, obtaining a top-down, birds-eye-view of the document:

 Now that we have the outline of our exam, we apply the cv2.findContours function to find the lines that correspond to the exam itself.
 We do this by sorting our contours by their area (from largest to smallest) on Line 37 (after making sure at least one contour was found on Line 34, of course). This implies that larger contours will be placed at the front of the list, while smaller contours will appear farther back in the list.

 We make the assumption that our exam will be the main focal point of the image, and thus be larger than other objects in the image. This assumption allows us to "filter" our contours, simply by investigating their area and knowing that the contour that corresponds to the exam should be near the front of the list.
 However, contour area and size are not enough; we should also check the number of vertices on the contour.

 To do this, we loop over each of our (sorted) contours on Line 40. For each of them, we approximate the contour, which in essence means we simplify the number of points in the contour, making it a "more basic" geometric shape.

 On Line 47 we make a check to see if our approximated contour has four points, and if it does, we assume that we have found the exam.

 Below I have included an example image that demonstrates the docCnt variable being drawn on the original image:

An example of drawing the contour associated with the exam on our
original image, indicating that we have successfully found the exam.
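
A sketch of this document-detection step, consistent with the description above (the 0.02 approximation factor is a common choice and an assumption here):

# find contours in the edge map and look for a four-point contour,
# which we assume corresponds to the outline of the exam
cnts = cv2.findContours(edged.copy(), cv2.RETR_EXTERNAL,
    cv2.CHAIN_APPROX_SIMPLE)
cnts = imutils.grab_contours(cnts)
docCnt = None

if len(cnts) > 0:
    # sort the contours by area, largest first
    cnts = sorted(cnts, key=cv2.contourArea, reverse=True)
    for c in cnts:
        # approximate the contour to a simpler polygon
        peri = cv2.arcLength(c, True)
        approx = cv2.approxPolyDP(c, 0.02 * peri, True)
        # if it has four points, assume we have found the exam
        if len(approx) == 4:
            docCnt = approx
            break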

 Sure enough, this area corresponds to the outline of the exam.

 Now that we have used contours to find the outline of the exam, we
can apply a perspective transform to obtain a top-down, birds-eye-
view of the document:

 In this case, we'll be using my implementation of the four_point_transform function, which:
1. Orders the (x, y)-coordinates of our contours in a specific, reproducible manner.
2. Applies a perspective transform to the region.
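
The imutils package ships such a four_point_transform helper; applying it to both the original image and the grayscale image, as described below, might look like this:

# apply a four-point perspective transform to both the original
# image and the grayscale image to obtain a top-down view
paper = four_point_transform(image, docCnt.reshape(4, 2))
warped = four_point_transform(gray, docCnt.reshape(4, 2))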

Obtaining a top-down, birds-eye view of both the original image (left) along
with the grayscale version (right).

We found our exam in the original image.

We applied a perspective transform to obtain a 90 degree viewing angle of the document.

 But how do we go about actually grading the document?
 This step starts with binarization, or the process of thresholding/segmenting the foreground from the background of the image:

After applying Otsu’s thresholding method, our exam is now a binary image:

Using Otsu’s thresholding allows us to segment the foreground from the
background of the image.

Notice how the background of the image is black, while the foreground is white.
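
A sketch of this binarization step; with the THRESH_OTSU flag, OpenCV computes the threshold value automatically, so the 0 passed in is ignored:

# apply Otsu's thresholding; THRESH_BINARY_INV makes the marks
# (foreground) white and the paper (background) black
thresh = cv2.threshold(warped, 0, 255,
    cv2.THRESH_BINARY_INV | cv2.THRESH_OTSU)[1]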

 This binarization will allow us to once again apply contour extraction techniques to find each of the bubbles in the exam:

 Lines 64-67 handle finding contours on our thresh binary image, followed by initializing questionCnts, a list of contours that correspond to the questions/bubbles on the exam.
 To determine which regions of the image are bubbles, we first loop over
each of the individual contours (Line 70).
 For each of these contours, we compute the bounding box (Line 73),
which also allows us to compute the aspect ratio, or more simply, the
ratio of the width to the height (Line 74).
 In order for a contour area to be considered a bubble, the region should:

1. Be sufficiently wide and tall (in this case, at least 20 pixels in both dimensions).
2. Have an aspect ratio that is approximately equal to 1.
As long as these checks hold, we can update our questionCnts list and mark
the region as a bubble.
 Below I have included a screenshot that has drawn the output
of questionCnts on our image:

Using contour filtering allows us to find all the question bubbles in our bubble
sheet exam recognition software.

Notice how only the question regions of the exam are highlighted and nothing
else.
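
A sketch of this bubble-filtering loop; the 20-pixel minimum comes from the text, while the 0.9-1.1 aspect-ratio band is an assumed tolerance for "approximately equal to 1":

# find contours in the thresholded image and keep only those that
# look like answer bubbles
cnts = cv2.findContours(thresh.copy(), cv2.RETR_EXTERNAL,
    cv2.CHAIN_APPROX_SIMPLE)
cnts = imutils.grab_contours(cnts)
questionCnts = []

for c in cnts:
    # compute the bounding box and derive the aspect ratio
    (x, y, w, h) = cv2.boundingRect(c)
    ar = w / float(h)
    # a bubble should be sufficiently wide and tall, with an
    # aspect ratio close to 1
    if w >= 20 and h >= 20 and ar >= 0.9 and ar <= 1.1:
        questionCnts.append(c)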

We can now move on to the “grading” portion of our OMR system:

First, we must sort our questionCnts from top-to-bottom. This will ensure that
rows of questions that are closer to the top of the exam will appear first in the
sorted list.
We also initialize a bookkeeper variable to keep track of the number of correct
answers.
 On Line 90 we start looping over our questions. Since each question has 5 possible answers, we'll apply NumPy array slicing and contour sorting to sort the current set of contours from left to right.
 The reason this methodology works is because we have already sorted our contours from top-to-bottom. We know that the 5 bubbles for each question will appear sequentially in our list, but we do not know whether these bubbles will be sorted from left-to-right. The sort contour call on Line 94 takes care of this issue and ensures each row of contours is sorted from left-to-right.

 To visualize this concept, I have included a screenshot below that depicts each row of questions as a separate color:

By sorting our contours from top-to-bottom, followed by left-to-right, we can
extract each row of bubbles. Therefore, each row is equal to the bubbles for
one question.
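
A sketch of this sorting logic using the sort_contours helper from imutils.contours; the per-question loop opened here continues in the snippets that follow:

# sort the bubble contours top-to-bottom and initialize the count
# of correct answers
questionCnts = contours.sort_contours(questionCnts,
    method="top-to-bottom")[0]
correct = 0

# each question has 5 bubbles, so slice them off in batches of 5
# and sort each batch left-to-right (the default sort order)
for (q, i) in enumerate(np.arange(0, len(questionCnts), 5)):
    cnts = contours.sort_contours(questionCnts[i:i + 5])[0]
    bubbled = None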

Given a row of bubbles, the next step is to determine which bubble is filled in.

We can accomplish this by using our thresh image and counting the number of
non-zero pixels (i.e., foreground pixels) in each bubble region:

 Line 98 handles looping over each of the sorted bubbles in the row.
 We then construct a mask for the current bubble on Line 101 and then count the number of non-zero pixels in the masked region (Lines 107 and 108). The more non-zero pixels we count, the more foreground pixels there are, and therefore the bubble with the maximum non-zero count is the index of the bubble that the test taker has bubbled in (Lines 113 and 114).
 Below I have included an example of creating and applying a mask to
each bubble associated with a question:

An example of constructing a mask for each bubble in a row.

 Clearly, the bubble associated with “B” has the most thresholded pixels,
and is therefore the bubble that the user has marked on their exam.
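
Continuing the per-question loop, the mask-and-count step described above could look like this:

    # loop over the sorted bubbles for the current question
    for (j, c) in enumerate(cnts):
        # build a mask that reveals only the current bubble
        mask = np.zeros(thresh.shape, dtype="uint8")
        cv2.drawContours(mask, [c], -1, 255, -1)
        # count the non-zero (foreground) pixels in the bubble
        mask = cv2.bitwise_and(thresh, thresh, mask=mask)
        total = cv2.countNonZero(mask)
        # keep the bubble with the largest count
        if bubbled is None or total > bubbled[0]:
            bubbled = (total, j)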

 This next code block handles looking up the correct answer in the ANSWER_KEY, updating any relevant bookkeeper variables, and finally drawing the marked bubble on our image:

 Whether the test taker was correct or incorrect determines which color is drawn on the exam. If the test taker is correct, we'll highlight their answer in green. However, if the test taker made a mistake and marked an incorrect answer, we'll let them know by highlighting the correct answer in red:

Drawing a “green” circle to mark “correct” or a “red” circle to mark “incorrect”.
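
A sketch of this lookup-and-draw step, still inside the per-question loop:

    # look up the correct answer and pick the color: green if the
    # marked bubble matches the key, red otherwise
    color = (0, 0, 255)
    k = ANSWER_KEY[q]
    if k == bubbled[1]:
        color = (0, 255, 0)
        correct += 1
    # draw the outline of the correct answer on the exam
    cv2.drawContours(paper, [cnts[k]], -1, color, 3)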

Finally, our last code block handles scoring the exam and displaying the results
to our screen:
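
This final step might look like the following sketch, computing the percentage over the 5 questions in the answer key:

# compute the score as a percentage, draw it on the exam, and show
# the original and graded images
score = (correct / 5.0) * 100
print("[INFO] score: {:.2f}%".format(score))
cv2.putText(paper, "{:.2f}%".format(score), (10, 30),
    cv2.FONT_HERSHEY_SIMPLEX, 0.9, (0, 0, 255), 2)
cv2.imshow("Original", image)
cv2.imshow("Exam", paper)
cv2.waitKey(0)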

Below you can see the output of our fully graded example image:

Finishing our OMR system for grading human-taken exams.

CHAPTER 3
3.1 Why not use circle detection?

To start, tuning the parameters to Hough circles on an image-to-image basis can be a real pain. But that's only a minor reason.

The real reason is: user error.
How many times, whether purposely or not, have you filled in outside the lines on your bubble sheet? I'm no expert, but I'd have to guess that at least 1 in every 20 marks a test taker fills in is "slightly" outside the lines.

And guess what?

Hough circles don’t handle deformations in their outlines very well — your
circle detection would totally fail in that case.

Because of this, I instead recommend using contours and contour properties to help you filter the bubbles and answers. The cv2.findContours function doesn't care if the bubble is "round", "perfectly round", or "oh my god, what the hell is that?"

Instead, the cv2.findContours function will return a set of blobs to you, which will be the foreground regions in your image. You can then take these regions, process and filter them to find your questions (as we did in this report), and go about your way.

3.2 Our bubble sheet test scanner and grader results

We've already seen test_01.png as our example earlier in this report, so let's try test_02.png:

We get the following result:

Let’s try another image:

We get the following result:

3.3 Extending the OMR and test scanner

 In the current implementation, we (naively) assume that a reader has filled in one and only one bubble per question row.
 However, since we determine if a particular bubble is "filled in" simply by counting the number of thresholded pixels in a row and then sorting in descending order, this can lead to two problems:

1. What happens if a user does not bubble in an answer for a particular question?
2. What if the user is nefarious and marks multiple bubbles as “correct” in the
same row?
 Luckily, detecting and handling these issues isn't terribly challenging; we just need to insert a bit of logic.

 For issue #1, if a reader chooses not to bubble in an answer for a particular row, then we can place a minimum threshold on Line 108 where we compute cv2.countNonZero:

Detecting if a user has marked zero bubbles on the exam.

 If this value is sufficiently large, then we can mark the bubble as “filled
in”. Conversely, if total is too small, then we can skip that particular
bubble. If at the end of the row there are no bubbles with sufficiently
large threshold counts, we can mark the question as “skipped” by the
test taker.
 A similar set of steps can be applied to issue #2, where a user
marks multiple bubbles as correct for a single question:

Detecting if a user has marked multiple bubbles for a given question.


 Again, all we need to do is apply our thresholding and count step, this time keeping track of whether there are multiple bubbles whose totals exceed some pre-defined value. If so, we can invalidate the question and mark the question as incorrect. A sketch of both checks follows.
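
Here is how both checks could be added inside the per-question loop; MIN_FILLED is a hypothetical tuning constant, not a value taken from this report:

# hypothetical minimum pixel count for a mark to count as "filled in"
MIN_FILLED = 150

# inside the per-question loop: gather the pixel count of every bubble
totals = []
for (j, c) in enumerate(cnts):
    mask = np.zeros(thresh.shape, dtype="uint8")
    cv2.drawContours(mask, [c], -1, 255, -1)
    mask = cv2.bitwise_and(thresh, thresh, mask=mask)
    totals.append((cv2.countNonZero(mask), j))

# keep only the bubbles that clear the minimum threshold
filled = [(t, j) for (t, j) in totals if t >= MIN_FILLED]

if len(filled) == 0:
    # issue #1: no bubble marked, treat the question as skipped
    print("[INFO] question {} skipped".format(q + 1))
elif len(filled) > 1:
    # issue #2: multiple bubbles marked, invalidate the question
    print("[INFO] question {} invalidated (multiple marks)".format(q + 1))
else:
    bubbled = filled[0]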

CHAPTER 4

4.1 Summary
 In this report, we demonstrated how to build a bubble sheet scanner and test grader using computer vision and image processing techniques.
 Specifically, we implemented Optical Mark Recognition (OMR) methods that facilitated our ability to capture human-marked documents and automatically analyze the results.
 Finally, we provided a Python and OpenCV implementation that you can
use for building your own bubble sheet test grading systems.
