
AUTOMATIC DESCRIPTIVE ANSWER EVALUATOR

USING MACHINE LEARNING

A PROJECT REPORT

Submitted by

PRAVEEN KUMAR S (190701147)

RAGHUL P (190701155)

in partial fulfillment for the award of the degree

of

BACHELOR OF ENGINEERING

IN

COMPUTER SCIENCE AND ENGINEERING

RAJALAKSHMI ENGINEERING COLLEGE


DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING
ANNA UNIVERSITY, CHENNAI

APRIL 2023
RAJALAKSHMI ENGINEERING COLLEGE
CHENNAI

BONAFIDE CERTIFICATE

Certified that this project report titled “AUTOMATIC DESCRIPTIVE ANSWER


EVALUATOR USING MACHINE LEARNING” is the bonafide work of “PRAVEEN
KUMAR S (190701147) and RAGHUL P (190701155)”, who carried out the work under my
supervision. Certified further that to the best of my knowledge the work reported herein does not form
part of any other project report or dissertation on the basis of which a degree or award was conferred
on an earlier occasion on this or any other candidate.

SIGNATURE SIGNATURE

Dr. P Kumar, Ph.D., Mrs. Susmita Mishra

HEAD OF DEPARTMENT ASSISTANT PROFESSOR (SG)

Department of Computer Science Department of Computer Science


and Engineering and Engineering

Rajalakshmi Engineering College Rajalakshmi Engineering College


Chennai – 602 105 Chennai – 602 105

Submitted to Project Viva-Voce Examination held on______________

Internal Examiner External Examiner



ABSTRACT

Automating the evaluation of descriptive answers would be beneficial for academic institutions to
efficiently manage the online exam results of their students. Our project involves designing an
algorithm to automatically evaluate descriptive answers consisting of multiple sentences. Our
approach involves representing the student's answer and comparing it with pre-defined answers
created by the staff. To evaluate the answer, we use a pattern-matching algorithm and various
modules to achieve efficient evaluation without manual labor. This approach can be adopted by many
organizations to reduce manpower and save time. Natural Language Processing (NLP) aims to
interpret human language in a meaningful way and typically involves machine learning techniques.
Evaluating the objective function involves assessing candidate solutions against a portion of the
training dataset, usually measured by an error score or loss. While the objective function is easy to
define, evaluating it can be costly.

ACKNOWLEDGEMENT

Initially we thank the Almighty for being with us through every walk of our life and
showering his blessings through the endeavor to put forth this report. Our sincere thanks to
our Chairman Mr. S.MEGANATHAN, B.E, F.I.E., our Vice Chairman Mr. ABHAY
SHANKAR MEGANATHAN, B.E., M.S., and our respected Chairperson Dr. (Mrs.)
THANGAM MEGANATHAN, Ph.D., for providing us with the requisite infrastructure
and sincere endeavoring in educating us in their premier institution.

Our sincere thanks to Dr. S.N. MURUGESAN, M.E., Ph.D., our beloved Principal, for his
kind support and the facilities provided to complete our work in time. We express our sincere
thanks to Dr. P. KUMAR, Ph.D., Professor and Head of the Department of Computer
Science and Engineering, for his guidance and encouragement throughout the project work.
We convey our sincere and deepest gratitude to our internal guide, Mrs. SUSMITA
MISHRA, from the Department of Computer Science and Engineering, Rajalakshmi
Engineering College, for her valuable guidance throughout the course of the project. We are
very glad to thank our Project Coordinator, Dr. N. Srinivasan, Department of Computer
Science and Engineering, for his useful tips during our reviews to help build our project.

PRAVEEN KUMAR S

RAGHUL P

TABLE OF CONTENTS

CHAPTER NO. TITLE PAGE NO.

ABSTRACT iii

LIST OF TABLES vii

LIST OF FIGURES ix

1. INTRODUCTION 1
1.1 DATA SCIENCE 1
1.2 DATA SCIENTIST 2
1.2.1 REQUIRED SKILLS FOR
DATA SCIENTIST 2
1.3 ARTIFICIAL INTELLIGENCE 2
1.4 NATURAL LANGUAGE PROCESSING 2
1.5 MACHINE LEARNING 3

2. LITERATURE SURVEY 4
2.1 EXISTING SYSTEM 14
2.1.1 DRAWBACKS OF EXISTING
SYSTEM 15
2.2 PROPOSED SYSTEM 15

2.2.1 ADVANTAGES OF PROPOSED
SYSTEM 15

3. SYSTEM DESIGN 16
3.1 GENERAL 16
3.2 SYSTEM REQUIREMENTS 16
3.2.1 FUNCTIONAL REQUIREMENTS 17
3.2.2 NON-FUNCTIONAL REQUIREMENTS 17
3.2.3 ENVIRONMENTAL REQUIREMENTS 18
3.3 WORKING PROCESS 18
3.4 DESIGN OF THE ENTIRE SYSTEM 19
3.4.1 SYSTEM FLOW DIAGRAM 19
3.4.2 ARCHITECTURE DIAGRAM 20
3.4.3 USE CASE DIAGRAM 21
3.4.4 ACTIVITY DIAGRAM 22

4. PROJECT DESCRIPTION 23
4.1 METHODOLOGIES 23
4.1.1 MODULES 23
4.2 MODULES DESCRIPTION 23
4.2.1 DATA PRE-PROCESSING 23
4.2.2 DATA VISUALIZATION 24
4.2.3 ALGORITHM IMPLEMENTATION 25
4.2.3.1 DECISION TREE
CLASSIFIER 25
4.2.3.2 RANDOM FOREST
CLASSIFIER 26
4.2.3.3 ADA BOOST CLASSIFIER 27
4.2.4 FLASK 27

5. RESULT AND DISCUSSION 29


5.1 PERFORMANCE METRICS 29
5.1.1 ACCURACY 29
5.1.2 LOSS 30
5.1.3 VALIDATION ACCURACY 30
5.1.4 VALIDATION LOSS 30
5.2 OUTPUT SCREEN SHOTS 32

6. CONCLUSION AND FUTURE WORKS 37


6.1 CONCLUSION 37
6.2 FUTURE ENHANCEMENTS 37

APPENDICES 38

REFERENCES 54

LIST OF TABLES
TABLE NO. TITLE PAGE NO.

5.1 ALGORITHM USED AND THEIR
ACCURACY 31

LIST OF FIGURES

FIGURE NO. TITLE PAGE NO.

1.1 PROCESS OF MACHINE LEARNING 3


3.1 SYSTEM FLOW DIAGRAM 19
3.2 ARCHITECTURE DIAGRAM 20
3.3 USE CASE DIAGRAM 21
3.4 ACTIVITY DIAGRAM 22
4.1 MODULE OF PRE-PROCESSING 24
4.2 MODULE OF DATA VISUALIZATION 24
4.3 DECISION TREE CLASSIFIER 25
4.4 MODULE OF DECISION TREE CLASSIFIER 26
4.5 RANDOM FOREST CLASSIFIER 26
4.6 MODULE OF ADA BOOST CLASSIFIER 27
5.2 OUTPUT SCREEN SHOTS 32

CHAPTER 1

INTRODUCTION

The evaluation of student answers using computer-based methods has become a common practice
in various areas of the education system. The integration of computers in learning has revolutionized
the field of education. The computer-assisted assessment system was initially developed for
evaluating single-word answers, such as those found in multiple-choice questions, but it can also
assess paragraph answers using keyword matching. This system is highly useful in academic
institutions for checking answer sheets and can also be implemented in organizations conducting
competitive exams. The system works by taking a scanned copy of the answer sheet as input and
extracting the text of the answer after preprocessing. Model answer sets, along with keywords and
question-specific criteria, are provided by the evaluator and are used to train the system. The system
evaluates the student's answer based on three parameters: keywords, grammar, and question-specific
criteria.
1.1 DATA SCIENCE

Data science is an interdisciplinary field that involves the use of scientific methods, algorithms, and
systems to extract valuable insights and knowledge from both structured and unstructured data. The
practical applications of data science span a wide range of domains. The term "data science" was
first suggested as a possible alternative to computer science in 1974 by Peter Naur. However, it was
not until 1996 that data science was specifically featured as a topic at the International Federation
of Classification Societies conference. In 2008, D.J. Patil and Jeff Hammerbacher, who were then
leading the data and analytics efforts at LinkedIn and Facebook, respectively, coined the term "data
science." Today, data science is one of the most popular and highly sought-after professions. It
requires a combination of skills, including domain expertise, programming skills, mathematics and
statistics knowledge, and machine learning techniques to extract meaningful insights and patterns
from data that can be used in making critical business decisions.

1.2 DATA SCIENTIST

Data scientists possess the necessary skills to identify relevant questions and locate data sources to
provide answers. In addition to their analytical abilities, they also possess business acumen and can
effectively extract, refine, and present data. Many businesses employ data scientists to manage,
analyze, and organize large quantities of unstructured data.

1.2.1 REQUIRED SKILLS FOR DATA SCIENTIST

• Programming: Python, SQL, Scala, Java, R, MATLAB.


• Machine Learning: Natural Language Processing, Classification, Clustering.
• Data Visualization: Tableau, SAS, D3.js, Python, Java, R libraries.
• Big data platforms: MongoDB, Oracle, Microsoft Azure, Cloudera.

1.3 ARTIFICIAL INTELLIGENCE

Artificial intelligence (AI) is the simulation of human intelligence in machines that are designed to
think and act like humans. It involves creating intelligent machines that can perform tasks that
typically require human intelligence, such as learning, reasoning, problem-solving, perception, and
decision-making. AI is also applied to machines that exhibit human-like traits, such as learning and
problem-solving. AI applications include advanced web search engines, recommendation systems,
speech recognition, self-driving cars, and strategic game systems. As machines become more
advanced, tasks once considered "intelligent" are often removed from the AI definition, which is
known as the AI effect. Optical character recognition is an example of a technology that is often
excluded from AI due to its routine nature.

1.4 NATURAL LANGUAGE PROCESSING

Natural language processing (NLP) is a field of artificial intelligence that focuses on making
machines capable of understanding and interpreting human language. An advanced NLP system
could enable human-like interactions with computers and allow them to learn directly from human-
written sources like news articles. Some practical applications of NLP include text mining,
information retrieval, machine translation, and question answering. Traditional approaches to NLP
involve analyzing word frequency and co-occurrence patterns to build syntactic representations of
text. However, this approach has limitations, as it may miss relevant information or fail to capture
the meaning of words in context. More modern statistical approaches to NLP use a combination of
strategies, such as keyword spotting and lexical affinity, to achieve higher accuracy levels. The
ultimate goal of NLP is to create machines that possess common sense reasoning abilities, and recent
advancements in deep learning have brought us closer to that goal. As of 2019, transformer-based
deep learning architectures are capable of generating coherent text.

1.5 MACHINE LEARNING

Machine learning involves using past data to predict future outcomes. It is a subset of artificial
intelligence that allows computers to learn and improve their performance without being explicitly
programmed. The main goal of machine learning is to develop computer programs that can adapt
and learn from new data. The process of training and prediction involves using specialized
algorithms to feed training data to a model, which can then make predictions on new test data. There
are three main categories of machine learning: supervised learning, unsupervised learning, and
reinforcement learning. In supervised learning, both the input data and corresponding labels are
provided to the model, whereas in unsupervised learning, there are no labels and the model must
figure out the patterns in the data. Reinforcement learning involves the model dynamically
interacting with its environment and receiving feedback to improve its performance over time.
Specialized algorithms are used to implement machine learning, and Python is a popular language
for this purpose.
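
As a minimal sketch of the train-and-predict cycle described above (using scikit-learn and its built-in iris dataset purely for illustration, not the project's own data):

# Minimal sketch of the supervised train/predict cycle on illustrative data.
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

X, y = load_iris(return_X_y=True)                 # labelled training data
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

model = LogisticRegression(max_iter=200)          # learning algorithm
model.fit(X_train, y_train)                       # training phase
predictions = model.predict(X_test)               # prediction on new test data
print("Accuracy:", accuracy_score(y_test, predictions))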

Fig 1.1 Process of Machine Learning



CHAPTER 2

LITERATURE SURVEY

[1] Automatic Evaluation of Descriptive Answer Using Pattern Matching Algorithm

Authors: Pranali Nikam, Mayuri Shinde, Rajashree Mahajan and Shashikala Kadam

The primary goal of education is to impart knowledge and skills to students in a specific subject or
field. However, the ultimate objective is for students to be able to apply this knowledge practically.
To achieve this, it is important to determine the extent to which students have absorbed the material
taught. This can be accomplished by evaluating their degree of learning through written or practical
examinations.

Objective questions are easier to evaluate using automated systems than descriptive answers.
However, assessing descriptive answers is a difficult and labor-intensive task. To address this issue,
an algorithm is proposed to automate the evaluation process of descriptive answers. The motivation
behind this automation is to expedite the evaluation process, reduce the need for manpower,
eliminate subjective biases, simplify record keeping and extraction, and ensure uniform evaluation
regardless of any mood swings or changes in perspective of the human assessor.

[2] Automation of Answer Scripts Evaluation

Authors: Ravikumar M, Sampath Kumar S and Shivakumar G

The examination process is crucial for evaluating the performance of students at various levels of
education, from primary to postgraduate. However, evaluating the answer booklets written by
students can be challenging due to the differences in handwriting styles, fonts, sizes, orientations,
and other factors. At the primary and high school levels, the question paper pattern typically includes
fill-in-the-blanks, matching, true/false, one-word answers, odd-man-out, and pick-out-the-odd-word
questions, which are answered in the booklets. The questions are printed, but the answers are
handwritten.

For technical subjects, manual evaluation by human evaluators is a difficult task, as it involves
assessing answers based on various parameters, including question-specific content and writing
style. Evaluating hundreds of answer scripts with similar answers can also become a tedious task
for evaluators, whose perception may vary from one another. To address these issues and expedite
the evaluation process, automation of answer script evaluation is necessary.

[3] Answer Evaluation Using Machine Learning


Authors: Prince Sinha, Ayush Kaul, Sharad Bharadia, Dr. Sheetal Rathi

Manually evaluating answers is a time-consuming and tedious task that requires a lot of manpower,
and can result in unequal marks being given by the paper checker. Our system aims to automate
answer evaluation by utilizing keywords and saving manpower. The answer paper can be scanned,
and the system will provide marks to the question based on the keywords present in the answer,
using a dataset. This system will also reduce errors in marks given for a particular question.

Our application uses a machine learning algorithm that matches keywords from a dataset to
automatically evaluate answers. This is different from other applications available in the market that
only evaluate multiple-choice questions and not subjective questions. To use this application, the
answer to a particular question needs to be scanned, and the system will split the answer's keywords
using OCR technology. Based on the keywords in the answer and those in the dataset, the
application will provide marks ranging from 1 to 5.

[4] Subjective Answer Evaluation Using Machine Learning


Authors: Piyush Patil, Sachin Patil, Vaibhav Miniyar, Amol Bandal

The conventional method of evaluating subjective exams is problematic. This is because the quality
of the evaluation may differ based on the emotional state of the evaluator. To address this issue, our
proposed system employs machine learning NLP. The algorithm is designed to tokenize words and
sentences, perform part-of-speech tagging, chunking, lemmatizing words, and utilize Wordnet to
evaluate subjective answers. Additionally, our system provides the semantic meaning of the context.
The system comprises two modules. The first module extracts data from scanned images and
organizes it appropriately. The second module applies machine learning and NLP to the retrieved
text and assigns marks based on the analysis. This approach saves time and improves the accuracy
of the evaluation.

[5] Online Subjective answer verifying system Using Artificial Intelligence


Authors: Jagadamba G, Chaya Shree G

Organizations and educational institutes rely heavily on the examination grading system, which
mostly consists of objective questions. While these systems are beneficial in terms of resource-
saving, they fail to evaluate subjective questions. This research aims to evaluate descriptive answers
by comparing them graphically to standard answers. The proposed solution involves a subjective
answer verifier that assigns marks based on the accuracy percentage of the answer provided by
different users, with three different answers given.

To implement the system, a database containing questions, corresponding answers, and their
allocated marks is necessary. The system must verify the user's answers by comparing them with
the template answers and identifying the key elements of the responses using artificial intelligence
to assign marks.

[6] Automatic Answer Script Evaluator


Authors: M.Venkateshwara Rao, I.Sri Harshitha, Y. Sukruthi, T. Sudharshan

In today's age of technological advancements, technology has become an essential need in the daily
lives of people, and the Internet of Things (IoT) is a system that provides objects and people with
unique identities and the ability to transfer data through a network without human interaction. The
IoT has a great potential for enhancing life by utilizing intelligent sensors and smart devices that
collaborate over the internet. Assessing a student's capability is usually done by evaluating the
answers they provide in an exam, which allows us to measure their learning ability. However, this
process is time-consuming, costly, and may not be entirely accurate. Therefore, an automated
system that evaluates the student's answers can provide more precise results. Unlike other answer
script evaluators, this system assesses handwritten scripts, which are more convenient and easily
accessible for the students. To use this system, the institution only needs to upload the answer key,
and the system will generate individual marks for each student by uploading their answer scripts
directly.

[7] Intelligent Short Answer Assessment using Machine Learning


Authors: Rosy Salomi Victoria D, Viola Grace Vinitha P, Sathya R

Numerous algorithms have been proposed for handwriting recognition and conversion, and it is not
possible to achieve full accuracy using a single technique for preprocessing. A multi-layer
perceptron classifier is suggested for identifying Bangla digits by means of a 76-component feature
array. The evaluation of subjective answer checking is not a new concept, and various techniques
have been experimented with, such as Natural Language Processing, Latent Semantic Analysis,
Generalized Latent Semantic Analysis, Bayes Theorem, and K-Nearest Neighbor.

A sentiment classification model that removes stop words was discussed, and a model was
developed that takes short answers as input and constructs RDF sentences. The model considers
both the lexical structure and synonyms while matching with the model answer for one-sentence
answers. There is a limit on the length of the answer sentences. A semi-automated evaluation
technique was used to evaluate subjective papers, where a question base and an answer base were
created with model answers. The student answer is evaluated based on the semantic meaning and
sentence length.

[8] Automated subjective answer evaluation using Semantic Learning


Authors: Era Johri, Nidhi Dedhia, Kunal Bohra, Prem Chandak, Hunain Adhikari

There are multiple programs available for grading subjective answers, but these programs have
some issues that the creators of the "ASSESS" system aimed to address. The systems being
considered include an Automated Grading System, which emphasizes a knowledge-based approach
instead of just a keyword-matching algorithm. The system employs ontology to link domains related
to a given keyword, and LSA and dictionary mapping ensure that pertinent answers receive credit.
Although grammar and syntax are verified, they do not influence the overall score as long as the
concept is thoroughly explained. However, the primary limitation of this system is that it does not
provide any feedback or information to the student regarding their errors.

Another program employs various machine learning techniques such as Latent Semantic Analysis,
Generalized Latent Semantic Analysis, the Maximum Entropy technique, and Bilingual Evaluation
Understudy (BLEU) to capture the latent relationship between words. This program measures the
relationship between words, words and concepts, and uses ontology for answer evaluation. The
techniques described in this paper demonstrate a strong correlation (up to 90%) with human
performance.
[9] Automated Descriptive Answer Evaluation System Using Machine Learning
Authors: Ms. Sharmeen J. Shaikh, Ms. Prerana S. Patil, Ms. Jagruti A.
Pardhe, Ms. Sayali V. Marathe, Ms. Sonal P. Patil

To create a software application that can assess descriptive answers and assign marks based on the
accuracy of the response, a user must first log in to the system for authentication. Once
authenticated, users can access the questions provided by the system. The proposed system evaluates
answers by comparing them with a standard answer stored in the database, which contains the
description, meaning, and keywords. The system then matches the keywords or key concepts, as
well as synonyms, with the answer and checks the grammar and spelling of the words. The answer
is then graded based on its accuracy, with the evaluation process consisting of three main steps:
extracting keywords and synonyms, matching the keywords, and weighting the keywords to
generate a score. The system assigns grades based on the number of keywords matched.

[10] Subjective Answers Evaluation Using Machine Learning and Natural Language
Processing
Authors: Muhammad Farruk Bashir, Hamza Abdul Rehman Javed, Natalia Kryvinska, and Shahab S. Band

They utilized Chinese automatic segmentation techniques and subjective ontologies to produce a k-
dimensional LSI space matrix. The responses were represented as TF-IDF embedding matrices and
then underwent Singular Value Decomposition to construct a semantic space of vectors. LSI was
employed to mitigate issues with synonyms and polysemy. Ultimately, cosine similarity was used
to calculate the similarity between responses. The dataset consisted of 35 categories and 850
instances evaluated by teachers, and the findings indicated a 5% variation in grading between the
teacher evaluations and the proposed system. The system did not employ hyper-parameters and
utilized a relaxed WMD method to relax the constraints of the vector space. The dataset contained
eight real-world collections, such as Twitter sentiment data and BBC sports articles. The Word2vec
model from Google News was used, as well as two custom models that were trained. The testing
data was classified using the KNN approach. As a result, the relaxed WMD approach decreased
error rates and resulted in classification speeds 2 to 5 times faster.

[11] SUBJECTIVE ANSWER-CHECKER


Authors: Nisarg Chakravarty, Dr. Sunil Maggu

Assessing a person's competence and skills is typically done through exams, which can either be
subjective or multiple choice in format. Objective exams are easier to grade automatically, saving
resources and effort. However, many exams are still subjective, and automated grading is currently
only available for objective exams. Finding a similar solution for subjective exams remains a
challenge. Administering exams is a tedious task for educational institutions, involving the
distribution of exam papers, students writing answers, and multiple levels of checking by examiners
and authorities responsible for processing the sheets.

[12] Automated Paper Evaluation System for Subjective Handwritten Answers


Authors: Sarika Singh, Yash Shah, Yug Vajani, Surekha Dholay

The process of converting PDF files to text files through OCR is commonly used. In the education
setting, teachers may provide model answers in a notepad file and calculate the similarity level of
students' answers to the provided model. If the similarity level exceeds a certain threshold, students
receive full marks. However, this system may not always be reliable due to variations in word choice
or missing answer types, and there is no weighting system for keywords or points. As a result, the
accuracy of this system may not accurately reflect real-life evaluation systems.

In contrast, some teachers may evaluate students' answers based on a model answer provided by the
teacher. The weight assigned to factors such as answer length and grammar can vary depending on
the course, and the system of assigning weight is not commonly used by evaluators. Additionally,
this system does not take into account synonyms, which can lead to discrepancies when students
use different words to convey the same meaning.

[13] Automatic Answer Evaluation Using Machine Learning

Authors: MANDADA SAMEMI, TIRUMALA SAI HAREESHA, GUDLURU VENKATA SIVA SAI PAVAN KUMAR, NALLURI PRAMOD

The input for the process was an image dataset, which underwent a pre-processing step including
resizing and conversion to grayscale. Text was then extracted from the pre-processed image using
mean standard deviation and pytesseract, and stored in text format. Natural Language Processing
was applied to clean the extracted text, and the number of words and letters were calculated. Finally,
an Artificial Neural Network (ANN) deep learning algorithm was implemented and experimental
results were obtained, including performance metrics such as accuracy and evaluation based on the
number of words and letters.

[14] Automatic Evaluation of Descriptive Answers Using NLP and Machine Learning

Authors: Sumedha P Raut, Siddhesh D Chaudhari, Varun B Waghole, Pruthviraj U Jadhav, Abhishek B Saste

The authors used a latent semantic categorization method to evaluate subjective queries online,
utilizing Chinese automatic segmentation techniques and subjective ontologies to construct a k-
dimensional LSI area matrix. Answers were represented in TF-IDF embedding matrices, and
Singular Value Decomposition (SVD) was applied to the term-document matrix to create a semantic
space of vectors. LSI helped to address issues with word ambiguity. Cosine similarity was used to
calculate the similarity between answers, with a dataset of 35 classes and 850 instances marked by
teachers. The results showed a 5% difference in grading between the teacher and the proposed
system.
Kusner et al. introduced the novel concept of using Word Mover's Distance (WMD) to measure
dissimilarity between two texts. The system utilized no hyper-parameters and a relaxed approach to
loosen the vector space constraints. The dataset included eight real-world sets, including Twitter
sentiment data and BBC sports articles. The Word2vec model from Google News and two custom
models were used. A K-Nearest Neighbor (KNN) classification approach was employed to classify
the testing data. As a result, relaxed WMD reduced error rates and led to two to five times faster
classification.

[15] Machine Learning based Automatic Answer Checker Imitating Human Way of Answer Checking

Authors: Vishwas Tanwar

In today's world, there are various ways to conduct exams, including online exams, OMR sheet
exams, and MCQ type exams. Examinations are conducted frequently around the world, and an
essential aspect of any exam is the evaluation of students' answer sheets. This task is usually
performed manually by the teacher, which can be very time-consuming, especially if the number of
students is large. In such cases, automating the answer checking process would be very useful.
Automation would not only reduce the workload of the teacher but also make the checking process
more transparent and fair by eliminating any chances of bias. Although there are several online tools
available for checking multiple-choice questions, there are only a few tools to evaluate subjective
answer type exams.

[16] Automatic Answer Sheet Checker


Authors: Ronika Shrestha, Raj Gupta and Priya Kumari

An automated answer checking system grades written answers much as a human being would. The
system requires users to create an account, a facility that is available to administrative staff. The
administrator can add questions and their subjective solutions to the system, which are stored in
notepad files. When a user takes the test, they are provided with questions and a space to type their
answers. After the user submits their answers, the system compares them to the main answer stored
in the database and assigns grades accordingly, even if the responses are not identical. The system
incorporates Artificial Intelligence (AI) tools that evaluate answers and assign grades similarly to a
human grader. The designers used a CNN, image processing, and a ResNet image feature extractor to
create the system. An accuracy-checking algorithm checks character error rates.

The system was designed for use with a device and scanner, along with software applications that
can evaluate MCQ examination tests with questions having four options, and students can only
select one answer per question. The software analyzes the question paper to identify the response to
each question by matching it with an accurate answer stored in the database. The program is user-
friendly and utilizes OpenCV to facilitate image processing.

[17] Automatic Question Answering System


Authors: Matej Pavla

By considering the issues that the Question Answering (QA) field deals with, we can establish
a fundamental structure for a system designed for open-domain question answering.
Typically, such a system comprises three main components. The first module is responsible
for extracting information that pertains to the question being asked. This involves identifying
the type of question, determining the expected answer format, and creating a basic query that
includes a range of relevant keywords, phrases, and entities, as well as information about the
syntactic and semantic relationships between the words in the question. This query can then
be enhanced with additional information, such as synonyms or translations, in the case of
multi-lingual QA systems.

[18] Assessment of Answers: Online Subjective Examination


Authors: Asmita Dhokrat, Gite Hanumant R, C.Namrata Mahender

The functioning of this system involves attempting to identify potential answers by extracting the
relevant content from a predefined template or model answer that is provided within the framework
for question answering. The correctness of an answer is always subject to scrutiny, and as such, we
need to determine its level of confidence by comparing it to the model answer. During the evaluation
process, not every word in the answer carries equal importance, so it is necessary to evaluate the
answer as a single sentence. The system assigns scores based on the level of matching between
packages of words and keywords.

[19] Automated Answering for Subjective Examination


Authors: Asmita Dhokrat, C.Namrata Mahendra, Gite Hanumant

Subjective evaluation refers to the process of assessing responses to Descriptive, Define or Explain
type of questions, which are designed to measure a candidate's understanding of the concepts in a
particular subject. Our system has an efficiency rate of over 80% in evaluating one-word and one-
sentence answers. Paraphrasing is used to evaluate variations in vocabulary use when answering
single-sentence questions. While objective-based answering systems are common in online
education courses and are easy to evaluate, traditional university-level courses require subjective or
descriptive evaluations to test a student's conceptual understanding. Various online examination
systems are available in the market, such as web-based evaluation systems for computer education
and online annotation research and practices. Web-based educational technologies provide insight
into effective learning strategies and how students learn.

[20] Automated Grading for Handwritten Answer Sheets using Convolutional Neural Networks

Authors: Eman Shaikh, Iman Mohiuddin, Ayisha Manzoor, Nazeeruddin Mohammad

Handwritten recognition is a relatively narrow field of study in the domain of pattern recognition
and image processing, with a growing need for optical character recognition of handwritten scripts.
In this section, a thorough examination of existing research on handwritten recognition systems that
rely on various machine learning techniques is presented. While recognizing printed text has
become a well-established practice, recognizing handwritten text remains a challenging task due to
the significant variation in handwriting among individuals, including differences in letter or digit
size, orientation, thickness, format, and dimension. Several machine learning approaches have been
proposed for handwritten text recognition, such as the automated grading of handwritten answers
and the identification of handwritten alphabets and digits in different languages. This section
discusses different machine learning classifiers used in handwritten recognition methods.

2.1 EXISTING SYSTEM

Manual evaluation of subjective papers can be challenging and tedious due to the subjective nature
of the task. Analyzing subjective papers using AI presents its own set of challenges, such as
inadequate understanding and acceptance of data. While previous attempts have used traditional
counting methods and specific words to score student answers, there is a shortage of curated datasets
for this purpose. To address this, a new approach is proposed in this paper that employs various
machine learning and natural language processing techniques, including WordNet, Word2vec, word
mover's distance (WMD), cosine similarity, multinomial naive Bayes (MNB), and term frequency-
inverse document frequency (TF-IDF). This approach uses solution statements and keywords to
evaluate descriptive answers, and a machine learning model is trained to predict grades. The study
found that WMD outperformed cosine similarity, and with sufficient training, the machine learning
model could function as a standalone tool. The accuracy rate achieved was 88% without the MNB
model, and it dropped by a further 1.3% when the MNB model was included.
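
As a rough sketch of how two of the techniques named above, TF-IDF and cosine similarity, can score a student answer against a model answer with scikit-learn (the example sentences and the grading threshold are illustrative assumptions only):

# Sketch: TF-IDF vectors + cosine similarity between a model answer and a student answer.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

model_answer = "Machine learning uses past data to predict future outcomes."
student_answer = "Using historical data, machine learning predicts future results."

vectorizer = TfidfVectorizer(stop_words="english")
tfidf = vectorizer.fit_transform([model_answer, student_answer])

similarity = cosine_similarity(tfidf[0], tfidf[1])[0, 0]
print(f"Cosine similarity: {similarity:.3f}")

# A simple, purely illustrative grading rule based on the similarity score.
grade = "GRADE 1" if similarity >= 0.5 else "GRADE 2"
print("Assigned grade:", grade)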

2.1.1 DRAWBACKS OF EXISTING SYSTEM

1. They failed to adequately prioritize the classification of grades.
2. They did not utilize a significant amount of data.
3. They did not employ any methodology for deploying models.
4. They did not accurately locate the output.

2.2 PROPOSED SYSTEM

Education is undergoing a significant transformation with the use of machine learning (ML), which
is revolutionizing teaching, learning, and research. ML is being employed by educators to identify
struggling students early and intervene to improve their success and retention rates. Researchers are
also utilizing ML to expedite their work and uncover new discoveries and insights. One proposed
approach involves creating an ML model for predicting Mark Evaluation. The project's initial step
is gathering past related data to build a dataset, which is then pre-processed to eliminate irrelevant
data. After the dataset is analyzed, it is prepared for training. Machine learning is most commonly
used in the Mark Evaluation domain to minimize human errors. Various algorithms are utilized to
train the model, and the most effective one is selected. The chosen algorithm is then saved as a
model file, which is utilized for making predictions.

2.2.1 ADVANTAGES OF PROPOSED SYSTEM

• Machine learning algorithms are implemented for classification purposes.
• More than three algorithms are compared to obtain the best accuracy.
• The model is deployed so that results can be obtained directly.
• Predictions are more reliable.
• Mistakes made by human experts are reduced.
• Mark evaluations are predicted accurately.

CHAPTER 3

SYSTEM DESIGN

3.1 GENERAL

A literature review provides a comprehensive overview of the current knowledge or methodological approaches
on a specific topic. It is composed of secondary sources, which means it discusses previously
published information on a particular subject area, and sometimes only focuses on information
within a specific time frame. The main objective of a literature review is to provide the reader
with up-to-date information on the topic and serves as a basis for future research proposals. It
can be a simple summary of sources or can follow an organizational pattern that combines
both summary and synthesis. A summary provides a condensed version of the source's key
information, whereas synthesis involves rearranging the information in a new way, such as
providing a new interpretation or combining old and new interpretations. Additionally, a
literature review may assess and evaluate the sources, guiding the reader towards the most
pertinent or relevant ones based on the specific situation.

3.2 SYSTEM REQUIREMENTS


Requirements are essential limitations that must be considered when creating a system. These
requirements are gathered during the design phase of the system. The subsequent requirements need
to be examined.
1. Functional requirements
2. Non-Functional requirements
3. Environment requirements
A. Hardware requirements
B. Software requirements

3.2.1 FUNCTIONAL REQUIREMENTS


The technical specifications for the software product are documented in the software requirements
specification, which is the initial stage in the requirements analysis process. This document outlines
the specific requirements for the software system. Moreover, it includes instructions that pertain to
particular libraries such as scikit-learn, pandas, NumPy, Matplotlib, and seaborn.

3.2.2 NON-FUNCTIONAL REQUIREMENTS


The automatic descriptive answer evaluator should possess the following non-functional
requirements:
1) Performance: The system should have the ability to evaluate a significant number of
responses quickly and accurately.

2) Reliability: The system should be dependable, providing consistent evaluations for different submissions.

3) Usability: The system should have a user-friendly interface that is simple to navigate and
understand.

4) Accuracy: The system should be dependable and precise, providing accurate evaluations
based on established criteria.

5) Security: The system should be secure and safeguarded from potential threats, with the
necessary measures in place to prevent unauthorized access and protect user data.

6) Scalability: The system should be able to manage an increasing number of users and
submissions without affecting its performance or reliability.

7) Maintainability: The system should be easily maintainable, with transparent documentation
and a codebase that can be updated easily for continuous maintenance and improvements.

3.2.3 ENVIRONMENTAL REQUIREMENTS


1. Software Requirements:
Operating System : Windows 10 or later
Tool : Anaconda with Jupyter Notebook
2. Hardware requirements:
Processor : Intel i3
Hard disk : minimum 80 GB
RAM : minimum 4 GB

3.3 WORKING PROCESS

To begin using Python for machine learning, follow these steps:


1) Download and install Anaconda and choose the most useful machine learning package
for Python.
2) Load a dataset and understand its structure by analyzing statistical summaries and data
visualization.
3) Select the most suitable machine learning models and verify the accuracy to ensure
reliability.
Python is a highly popular and powerful interpreted language that serves as a comprehensive
platform for research and development as well as production systems. With a vast number of
modules and libraries available, it can be overwhelming to determine the best approach.

Completing a project is an ideal way to start using Python for machine learning. By doing so,
you will be prompted to install and launch the Python interpreter, obtain an overview of how
to work through a small project, and gain confidence that you can embark on your own small
projects.

When working on a machine learning project with your own datasets, it may not be a linear
process, but it typically involves a number of well-known stages, such as defining the problem,
preparing the data, evaluating algorithms, improving results, and presenting the final outcome.
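
These stages can be sketched end to end as follows, using scikit-learn's built-in iris data as a stand-in, since the project's own Mark dataset is not reproduced here:

# Sketch of the typical project stages: load, summarise, prepare, evaluate, present.
import pandas as pd
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split, cross_val_score
from sklearn.ensemble import RandomForestClassifier

# 1) Define the problem and load a dataset into a DataFrame (stand-in data).
data = load_iris(as_frame=True).frame

# 2) Understand the data through statistical summaries.
print(data.describe())
print(data.head())

# 3) Prepare the data and evaluate an algorithm with cross-validation.
X = data.drop(columns=["target"])
y = data["target"]
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=1)
model = RandomForestClassifier(random_state=1)
print("Cross-validation accuracy:", cross_val_score(model, X_train, y_train, cv=5).mean())

# 4) Improve and finalise the model, then present the result on held-out data.
model.fit(X_train, y_train)
print("Test accuracy:", model.score(X_test, y_test))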

3.4 DESIGN OF THE ENTIRE SYSTEM

3.4.1 SYSTEM FLOW DIAGRAM

The student provides a written response to the questions on the staff's exam paper, which is then
compared to the answer key using specific modules. The result is then evaluated to determine if the
student has passed or failed, and this information is presented to the evaluator.

Fig 3.1 System Flow Diagram



3.4.2 ARCHITECTURE DIAGRAM

Fig 3.2 Architecture Diagram

The diagram of the system architecture illustrates the process of how data is collected and
transformed into the final output for a given input. It starts with the questions and answers dataset
and eventually generates results indicating whether the input has passed or failed using various
modules, trained datasets, and predictions.

3.4.3 USE CASE DIAGRAM

Fig 3.3 Use Case Diagram

Use case diagrams are utilized to perform high-level requirement analysis of a system. During the
analysis of the system's requirements, its functionalities are identified and documented in use cases,
which can be considered a systematic representation of the system's functionalities. In other words,
use cases serve as an organized presentation of the system's functionalities.

3.4.4 ACTIVITY DIAGRAM

Fig 3.4 Activity Diagram

An activity in a system represents a specific operation or task. Activity diagrams are utilized not
only for visualizing the dynamic nature of a system but also for constructing an executable system
using forward and reverse engineering techniques. The only aspect that activity diagrams do not
depict is the message flow between activities. Therefore, activity diagrams are sometimes referred
to as flow charts, although they are not exactly the same. Activity diagrams depict various types of
flow, including parallel, branched, concurrent, and single, whereas flow charts do not necessarily
capture these types of flow.

CHAPTER 4

PROJECT DESCRIPTION

4.1 METHODOLOGIES
4.1.1 MODULES

➢ Data Pre-processing
➢ Data Visualization
➢ Algorithm Implementation
• Decision tree Classifier
• Random forest Classifier
• Ada Boost Classifier
➢ Deployment

4.2 MODULE DESCRIPTION

4.2.1 Data Pre-processing:

In machine learning, validation techniques are utilized to determine the error rate of the ML model,
which is considered to be close to the true error rate of the dataset. While a large enough dataset can
represent the population, in real-world scenarios, working with small samples may not accurately
represent the dataset's population. Validation techniques are used to detect missing or duplicate
values and to identify the data type of each variable. A sample dataset is used to provide an impartial
evaluation of a model's performance and to fine-tune its hyperparameters. However, as the
validation set's skill is incorporated into the model configuration, the evaluation may become more
biased. Machine learning engineers use this data for frequent evaluation and fine-tuning of the
model's hyperparameters. Data collection, analysis, and addressing the data's content, quality, and
structure can be a time-consuming process. It is essential to understand the data and its properties
during the data identification process, which helps in selecting the algorithm for building the model.
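
A condensed sketch of these pre-processing and validation steps, mirroring the appendix code; it assumes the project's Mark.csv dataset with the MARK, ANSWER and KEYWORD columns is available:

# Sketch of the pre-processing checks used before model building.
import pandas as pd
from sklearn.preprocessing import LabelEncoder
from sklearn.model_selection import train_test_split

df = pd.read_csv("Mark.csv")                  # dataset name as used in the appendix
print(df.isnull().sum())                      # detect missing values per column
print("Duplicates:", df.duplicated().sum())   # detect duplicate rows

df = df.dropna()                              # drop rows with missing values

# Encode the text columns into integers for the classifiers.
le = LabelEncoder()
for col in ["MARK", "ANSWER", "KEYWORD"]:
    df[col] = le.fit_transform(df[col].astype(str))

# Hold out a validation set for unbiased evaluation and hyperparameter tuning.
X = df.drop(columns=["MARK"])
y = df["MARK"]
X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.2, random_state=0)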

Fig 4.1 Module of Pre-processing

4.2.2 Data Visualization:

Data visualization is a crucial skill in applied statistics and machine learning. While statistics deals
with quantitative descriptions and estimations of data, data visualization provides a set of important
tools for understanding data qualitatively. It is useful for exploring and familiarizing oneself with a
dataset and can aid in identifying patterns, outliers, corrupt data, and more. With some domain
knowledge, data visualizations can express and demonstrate key relationships in plots and charts
that are more impactful to stakeholders than measures of association or significance. Data
visualization and exploratory data analysis are entire fields in themselves, and it is recommended to
delve deeper into some of the suggested books.
Visualizing data through charts and plots can help make sense of it, even when the data itself may
not be immediately understandable. The ability to quickly visualize data samples is a valuable skill
in both applied statistics and machine learning. This includes understanding different types of plots
that are useful for visualizing data in Python, such as line plots for time series data and bar charts
for categorical quantities. Additionally, histograms and box plots are useful for summarizing data
distributions.
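
The plot types mentioned above can be produced with Matplotlib and seaborn as in the sketch below; the column names follow the appendix dataset, and the derived answer-length column is an illustrative assumption:

# Sketch of basic visualizations: bar chart of a categorical column and a histogram.
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

df = pd.read_csv("Mark.csv").dropna()         # dataset name as in the appendix

# Bar chart of keyword counts (categorical quantity).
sns.countplot(x="KEYWORD", data=df)
plt.title("Keyword distribution")
plt.show()

# Histogram of answer lengths to summarise a numeric distribution.
df["answer_length"] = df["ANSWER"].astype(str).str.len()
df["answer_length"].plot(kind="hist", bins=20, title="Answer length distribution")
plt.show()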

Fig 4.2 Module of Data Visualization


4.2.3 Algorithm Implementation:

It is crucial to consistently compare the performance of different machine learning algorithms. To
do so, a test harness can be created using scikit-learn in Python, which can be used as a template to
compare multiple algorithms on various machine learning problems. As each model has unique
performance characteristics, resampling techniques such as cross-validation can provide an estimate
of each model's accuracy on unseen data. These estimates can help select the best models among
the suite of models created. Similarly, different visualization methods can be employed to examine
the estimated accuracy of machine learning algorithms and choose one or two to finalize. To ensure
a fair comparison, it is important to evaluate each algorithm in the same way on the same data. This
can be achieved by forcing every algorithm to be evaluated on a consistent test harness. In the next
section, you will learn how to perform this process using scikit-learn in Python.
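
A minimal version of such a test harness, comparing the three classifiers used in this project with 10-fold cross-validation on placeholder data (in the project itself, X and y come from the pre-processed Mark dataset):

# Sketch of a test harness comparing classifiers with 10-fold cross-validation.
from sklearn.model_selection import KFold, cross_val_score
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier, AdaBoostClassifier
from sklearn.datasets import make_classification

# Placeholder data standing in for the encoded answer dataset.
X, y = make_classification(n_samples=300, n_features=10, random_state=7)

models = {
    "Decision Tree": DecisionTreeClassifier(random_state=7),
    "Random Forest": RandomForestClassifier(random_state=7),
    "AdaBoost": AdaBoostClassifier(random_state=7),
}

kfold = KFold(n_splits=10, shuffle=True, random_state=7)
for name, model in models.items():
    scores = cross_val_score(model, X, y, cv=kfold, scoring="accuracy")
    print(f"{name}: mean accuracy = {scores.mean():.3f} (+/- {scores.std():.3f})")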

4.2.3.1 Decision Tree Classifier:

Decision Tree is a technique for Supervised learning that can solve both Regression and
Classification problems, though it is primarily used for the latter. This classifier takes the form of a

tree structure, where the internal nodes of the tree correspond to features in the dataset, branches
represent decision rules, and each leaf node corresponds to an outcome. The Decision Node is a
node that makes decisions and has multiple branches, while the Leaf Node is the result of those
decisions and does not have any further branches. This approach uses the features of the dataset to
perform tests and make decisions. It is a visual representation of all possible solutions to a
problem/decision based on the given conditions. This technique is known as a decision tree because,
like a tree, it starts with a root node that expands into further branches to form a tree-like structure.
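
A minimal usage sketch of scikit-learn's decision tree classifier on placeholder data (the depth limit shown is an illustrative choice, not a project setting):

# Sketch: training and inspecting a decision tree classifier.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=200, n_features=6, random_state=3)  # placeholder data
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=3)

tree = DecisionTreeClassifier(max_depth=5, random_state=3)  # limit depth to curb overfitting
tree.fit(X_train, y_train)
print("Test accuracy:", tree.score(X_test, y_test))
print("Tree depth:", tree.get_depth())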

Fig 4.3 Decision Tree Classifier



Fig 4.4 Module of Decision Tree Classifier

4.2.3.2 Random Forest Classifier:

The Random Forest is a well-known algorithm in machine learning that is categorized under
supervised learning. It can effectively handle both classification and regression tasks in machine
learning. Its foundation lies in ensemble learning, which is a method of combining multiple
classifiers to solve complex problems and improve model performance.

Random Forest is a classifier comprising numerous decision trees, each trained on a different
subset of the given dataset, and it aggregates their predictions to enhance the accuracy of the dataset.
Unlike relying on the output of a single decision tree, Random Forest considers the predictions of
every tree and decides the final output based on the majority vote of predictions. Moreover, having
a higher number of trees in the forest can lead to more accurate predictions and also helps to prevent
overfitting.
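
A minimal usage sketch of scikit-learn's random forest, which aggregates many decision trees by majority vote (placeholder data; the number of trees shown is the library default, not a project setting):

# Sketch: a random forest aggregating many decision trees by majority vote.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=200, n_features=6, random_state=3)  # placeholder data
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=3)

forest = RandomForestClassifier(n_estimators=100, random_state=3)  # 100 trees in the forest
forest.fit(X_train, y_train)
print("Test accuracy:", forest.score(X_test, y_test))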

Fig 4.5 Random Forest Classifier


4.2.3.3 Ada Boost Classifier:

Ada-boost, also known as Adaptive Boosting, is an ensemble boosting classifier created by Yoav
Freund and Robert Schapire in 1996. The primary purpose of AdaBoost is to enhance the accuracy
of classifiers by combining multiple classifiers. It operates as an iterative ensemble method.
AdaBoost builds a strong classifier by merging several poorly performing classifiers, leading to a
highly accurate strong classifier. The fundamental idea behind AdaBoost is to set weights for
classifiers and train data samples iteratively, ensuring accurate predictions of unusual observations.
Any machine learning algorithm that accepts weights on the training set can be used as a base
classifier. AdaBoost must fulfill two conditions to work efficiently. Firstly, the classifier should be
trained iteratively on various weighted training examples. Secondly, in each iteration, it should
attempt to provide an excellent fit for these examples by minimizing the training error.
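
A minimal usage sketch of scikit-learn's AdaBoost classifier on placeholder data; by default it boosts shallow (depth-1) decision trees, and the number of boosting rounds shown is an illustrative choice:

# Sketch: AdaBoost combining many weak learners into a strong classifier.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.ensemble import AdaBoostClassifier

X, y = make_classification(n_samples=200, n_features=6, random_state=3)  # placeholder data
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=3)

# 50 boosting rounds; the default base learner is a depth-1 decision tree (stump).
boost = AdaBoostClassifier(n_estimators=50, random_state=3)
boost.fit(X_train, y_train)
print("Test accuracy:", boost.score(X_test, y_test))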

Fig 4.6 Module of Ada Boost Classifier

4.2.4 Flask (Web Frame Work):

Flask is a micro web framework that is created using the Python programming language. This
framework is categorized as a micro-framework because it does not require specific tools or libraries
to function. Unlike other frameworks, Flask does not come with pre-built features such as database
abstraction layer or form validation. However, Flask allows the use of extensions that can add
features to the application as if they were part of Flask itself. Flask is an excellent framework for
building REST APIs and has access to all of Python's powerful features since it is built on top of it.
Although Flask is primarily used for the backend, it uses a templating language called Jinja2 to
create HTML, XML, or other markup formats that are sent to the user via an HTTP request. Flask

has a modular and lightweight design that makes it easy to transform it into the web framework you
require by adding a few extensions without adding any extra weight. Additionally, the foundation
API is well-structured and coherent.
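
A minimal Flask sketch of how a trained model file could be exposed for prediction; the model file name, route names, and form field are assumptions for illustration and not the project's actual deployment code:

# Sketch: serving a saved model with Flask (hypothetical model file and field name).
import pickle
from flask import Flask, request

app = Flask(__name__)

# Load the previously trained and saved model file (assumed name).
with open("model.pkl", "rb") as f:
    model = pickle.load(f)

@app.route("/")
def index():
    # Minimal form posting an encoded answer feature to the /predict route.
    return (
        '<form action="/predict" method="post">'
        '<input name="feature" placeholder="encoded answer feature">'
        '<button type="submit">Evaluate</button></form>'
    )

@app.route("/predict", methods=["POST"])
def predict():
    # Read the submitted value and return the predicted mark/grade.
    feature = float(request.form["feature"])
    prediction = model.predict([[feature]])[0]
    return f"Predicted result: {prediction}"

if __name__ == "__main__":
    app.run(debug=True)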

CHAPTER 5
RESULT AND DISCUSSION
The development of an automatic descriptive answer evaluator system has enabled the assessment
of descriptive answers provided by students. This system compares a student's answer with the
staff's answer to the same question, using a variety of natural language processing techniques to
determine the degree of similarity between the two answers.

The evaluation process involves the use of NLP techniques such as keyword matching and sentence
similarity to compare the student's answer with the staff's answer. The system then assigns a score
to the answer based on the degree of similarity between the two answers, and can also identify
whether the student's answer has been plagiarized.

The system has been tested on a dataset of descriptive answers provided by students, and the results
show that it is highly accurate in its assessments. The system has been tested on various question
types and topics, with consistent results across all tests.

One limitation of the system is that it relies on pre-defined staff answers, which may restrict its
ability to assess answers that differ significantly from the staff's answer. Additionally, the system
may not be able to evaluate the quality of an answer beyond its similarity to the staff's answer.
In summary, the automatic descriptive answer evaluator system has demonstrated great potential in
accurately assessing descriptive answers provided by students. This system has the potential to save
a significant amount of time and effort for educators who need to assess large volumes of student
answers.

5.1 PERFORMANCE METRICS

5.1.1 ACCURACY

In evaluating the performance of the automatic descriptive answer evaluator system, accuracy is
used as a metric. This metric is particularly useful when all the answers carry the same level of
significance. The calculation of accuracy involves dividing the total number of correct evaluations
by the total number of evaluations made. By using accuracy, the system is able to determine how
effectively it can assess the students' answers. Nevertheless, it is worth mentioning that relying
solely on accuracy may not be enough to provide a comprehensive evaluation of the answers. Other
metrics may also be required for a more comprehensive assessment.
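
In scikit-learn terms, this calculation is a one-liner; the labels below are placeholders:

# Sketch: accuracy = correct evaluations / total evaluations.
from sklearn.metrics import accuracy_score

y_true = ["GRADE 1", "GRADE 2", "GRADE 1", "GRADE 2", "GRADE 1"]  # placeholder labels
y_pred = ["GRADE 1", "GRADE 2", "GRADE 2", "GRADE 2", "GRADE 1"]

print("Accuracy:", accuracy_score(y_true, y_pred))  # 4 correct out of 5 -> 0.8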

5.1.2 LOSS

The system calculates the loss to determine the gradients in relation to the model's parameters, which
are subsequently modified via backpropagation. This process is performed iteratively, with the
system being updated each time until there is no further improvement in the desired evaluation
metric. By reducing the loss, the system can enhance its capacity to correctly evaluate descriptive
answers given by students. Nonetheless, it is important to keep in mind that the selection of the loss
function can considerably affect the system's performance, and selecting an appropriate loss
function is a crucial step in developing an effective automatic descriptive answer evaluator system.
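
As one concrete example of such a loss, the sketch below computes cross-entropy (log) loss on placeholder predicted probabilities; the report does not name a specific loss function, so this choice is an assumption:

# Sketch: cross-entropy (log) loss on placeholder predicted probabilities.
from sklearn.metrics import log_loss

y_true = [1, 0, 1, 1, 0]                       # placeholder ground-truth labels
y_prob = [0.9, 0.2, 0.6, 0.8, 0.3]             # predicted probability of class 1

print("Log loss:", log_loss(y_true, y_prob))   # lower is better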

5.1.3 VALIDATION ACCURACY

The system can assess its performance during the training process by using a validation dataset. The validation
dataset is a subset of data that is not used for training but rather to evaluate the system's accuracy.
The system compares the predicted answers with the actual answers in the validation dataset to
calculate its validation accuracy. This metric is important in estimating the system's ability to
perform on new and unseen data. It helps to monitor the system's performance during training and
avoid overfitting. The use of a validation dataset and validation accuracy is crucial in ensuring that
the automatic descriptive answer evaluator system accurately assesses descriptive answers provided
by students.

5.1.4 VALIDATION LOSS

The validation loss is a performance metric utilized to evaluate machine learning models on the
validation dataset. The validation dataset is a fraction of the entire dataset used to check the model's
performance. The validation loss metric is similar to the training loss and is computed by summing
up the errors for each example in the validation dataset.

Table 5.1 – Algorithm used and their Validation Accuracy

Table 5.1 presents the algorithms used and their respective accuracies. Among these, the Random Forest
algorithm outperformed the others with an accuracy of 93.4647%.

5.2 OUTPUT SCREEN SHOTS


5.2.1 DATA PRE-PROCESSING

Fig 5.1 Before and After removing the Null values



5.2.2 DATA VISUALIZATION

Fig 5.2 Graph

5.2.3 DECISION TREE CLASSIFIER

Fig 5.3 Accuracy



5.2.4 RANDOM FOREST CLASSIFIER

Fig 5.4 Accuracy

5.2.5 ADA BOOST CLASSIFIER

Fig 5.5 Accuracy



5.2.6 FLASK

Fig 5.6 Local Host Address

5.2.7 DATA SET

Fig 5.7 Data Set used



5.2.8 FRONT END

Fig 5.8 Web Page

5.2.9 BACK END

Fig 5.9 Web Page



CHAPTER 6

CONCLUSION AND FUTURE WORKS

6.1 CONCLUSION

The analytical process begins with data cleaning and processing, addressing missing values,
performing exploratory analysis, and eventually building and evaluating models. The aim is to find
the algorithm with the highest accuracy score on a public test set, which will be used in an
application to determine Mark Evaluation. While many educational institutions conduct online
exams, mostly in the form of MCQs, our project aims to evaluate descriptive answers as well. MCQ-based
examination is useful for assessing a student's aptitude, but it cannot fully measure their
theoretical knowledge. The proposed system evaluates subjective answers based on
keywords, comparing them against the model answer, allocating marks to the student
accordingly, and providing the equivalent grade for the answer.

6.2 FUTURE ENHANCEMENT

A potential improvement would be to utilize cloud deployment for increased scalability and
accessibility. By migrating to the cloud, the system would be able to manage larger amounts of data
and multiple users simultaneously, which would expand its potential applications for educators in
diverse environments. Moreover, the system could be optimized to work seamlessly with the
Internet of Things (IoT) system, which would provide additional flexibility and ease of use.
Integration with IoT would allow the system to be accessed from a wider range of devices and
locations, thereby increasing its impact on the education sector. These upgrades would significantly
augment the capabilities of the automatic descriptive answer evaluator system, making it even more
valuable for both educators and students.
APPENDICES
APPENDIX I
SAMPLE CODE:
DATA PRE-PROCESSING:
import warnings
warnings.filterwarnings('ignore')
import pandas as pd
data = pd.read_csv('Mark.csv')
data
# Before removing the null data
data.shape
# After removing the null data
df = data.dropna()
df.shape

df.isnull().sum()
df.info()
df.columns
df.duplicated()
df.duplicated().sum()
df.ANSWER.unique()
df.MARK.unique()
df.KEYWORD.value_counts()
df.columns
# Before LabelEncoder
df.head()
from sklearn.preprocessing import LabelEncoder
var_mod = ['MARK','ANSWER','KEYWORD']
le = LabelEncoder()
for i in var_mod:
    df[i] = le.fit_transform(df[i]).astype(int)

df.head()
df.corr()
df.describe()

DATA VISUALIZATION:
#import library packages
import pandas as p
import matplotlib.pyplot as plt
import seaborn as s
import numpy as n
import warnings
warnings.filterwarnings("ignore")
#Load given dataset
data = p.read_csv('MARK.csv')
df=data.dropna()
df
df.columns
df.groupby('ANSWER').describe()

#plotting graph for distribution


import matplotlib.pyplot as plt
import seaborn as sns
sns.countplot(x = "KEYWORD", data = df)
df.loc[:, 'MARK'].value_counts()
plt.title('MARK EVALUATION')
df['MARK'].unique()
TRAINING MODEL:
#!pip install nltk
import nltk
nltk.download('stopwords')
import nltk

from nltk.corpus import stopwords


from nltk.stem.porter import PorterStemmer
import re
import string
# remove whitespaces
df['ANSWER']=df['ANSWER'].str.strip()
# lowercase the text
df['ANSWER'] = df['ANSWER'].str.lower()
#remove punctuation
punc = string.punctuation
table = str.maketrans('','',punc)
df['ANSWER']=df['ANSWER'].apply(lambda x: x.translate(table))
# tokenizing each message
df['word_tokens']=df.apply(lambda x: x['ANSWER'].split(' '),axis=1)
# removing stopwords
df['cleaned_text'] = df.apply(lambda x: [word for word in x['word_tokens'] if word not in stopwords.words('english')], axis=1)
# stemming
ps = PorterStemmer()
df['stemmed']= df.apply(lambda x: [ps.stem(word) for word in x['cleaned_text']],axis=1)
# remove single letter words
df['final_text'] = df.apply(lambda x: ' '.join([word for word in x['stemmed'] if len(word)>1]), axis=1)
# Now we'll create a vocabulary for the training set with word count
from collections import defaultdict
vocab = defaultdict(int)
for text in df['final_text'].values:
    for elem in text.split(' '):
        vocab[elem] += 1

print(vocab)

# divide the set in training and test


from sklearn.model_selection import train_test_split
X,X_test,y,y_test = train_test_split(df.loc[:,'ANSWER':],df['MARK'],test_size=0.2)
X.info()
print(y)
from wordcloud import WordCloud

positive=' '.join(X.loc[y=='GRADE 1','final_text'].values)


ham_text = WordCloud(background_color='white', max_words=2000, width=800, height=800).generate(positive)

negative=' '.join(X.loc[y=='GRADE 2','final_text'].values)


spam_text = WordCloud(background_color='black', max_words=2000, width=800, height=800).generate(negative)
plt.figure(figsize=[30,50])
plt.subplot(1,3,1)
plt.imshow(ham_text,interpolation='bilinear')
plt.title('')
plt.axis('off')
plt.subplot(1,3,2)
plt.imshow(spam_text, interpolation='bilinear')
plt.axis('off')
plt.title('')

DECISION TREE CLASSIFIER:


#import library packages
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import numpy as np
import warnings

warnings.filterwarnings("ignore")
#Load given dataset
df = pd.read_csv('Mark.csv')
df
df.shape
type(df['KEYWORD'].loc[100])
df.info()
# Data cleaning and preprocessing

import re
import nltk
nltk.download('stopwords')
from nltk.corpus import stopwords
from nltk.stem.porter import PorterStemmer
ps=PorterStemmer()
corpus=[]
for i in range(0, len(df)):
    review0 = re.sub('[^a-zA-Z0-9]', ' ', str(df['KEYWORD'][i]))
    review1 = re.sub('[^a-zA-Z0-9]', ' ', str(df['ANSWER'][i]))
    review = review0 + review1
    review = review.lower()
    review = review.split()
    review = [ps.stem(word) for word in review if not word in stopwords.words('english')]
    review = ' '.join(review)
    corpus.append(review)
corpus
# Creating the TFIDF model
from sklearn.feature_extraction.text import TfidfVectorizer
tv=TfidfVectorizer(max_features=2500,ngram_range=(1,2))

X=tv.fit_transform(corpus).toarray()
X
X.shape
df['MARK'].value_counts()
y = pd.get_dummies(df['MARK'])
y = y.iloc[:,1].values
y = df['MARK']
print(y)
# Train Test Split

from sklearn.model_selection import train_test_split


X_train,X_test,y_train,y_test=train_test_split(X,y,test_size=0.2,random_state=0)
from sklearn.tree import DecisionTreeClassifier
DTC = DecisionTreeClassifier()
DTC.fit(X_train,y_train)
predict = DTC.predict(X_test)
from sklearn.metrics import accuracy_score
print('Accuracy of DecisionTreeClassifier',accuracy_score(y_test,predict)*100)
from sklearn.metrics import confusion_matrix
print('Confusion matrix of DecisionTreeClassifier\n',confusion_matrix(y_test,predict))
from sklearn.metrics import classification_report
print('Classification report of DecisionTreeClassifier\n\n',classification_report(y_test,predict))

RANDOM FOREST CLASSIFIER:


#import library packages
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import numpy as np
import warnings
warnings.filterwarnings("ignore")
#Load given dataset

df = pd.read_csv('Mark.csv')
df
df.shape
type(df['KEYWORD'].loc[100])
df.info()
# Data cleaning and preprocessing

import re
import nltk
nltk.download('stopwords')
from nltk.corpus import stopwords
from nltk.stem.porter import PorterStemmer
ps=PorterStemmer()
corpus=[]
for i in range(0, len(df)):
    review0 = re.sub('[^a-zA-Z0-9]', ' ', str(df['KEYWORD'][i]))
    review1 = re.sub('[^a-zA-Z0-9]', ' ', str(df['ANSWER'][i]))
    review = review0 + review1
    review = review.lower()
    review = review.split()
    review = [ps.stem(word) for word in review if not word in stopwords.words('english')]
    review = ' '.join(review)
    corpus.append(review)
corpus
# Creating the TFIDF model
from sklearn.feature_extraction.text import TfidfVectorizer
tv=TfidfVectorizer(max_features=2500,ngram_range=(1,2))
X=tv.fit_transform(corpus).toarray()
X

X.shape
df['MARK'].value_counts()
y = pd.get_dummies(df['MARK'])
y = y.iloc[:,1].values
y = df['MARK']
print(y)
# Train Test Split

from sklearn.model_selection import train_test_split


X_train,X_test,y_train,y_test=train_test_split(X,y,test_size=0.2,random_state=0)
from sklearn.ensemble import RandomForestClassifier
rf = RandomForestClassifier()
rf.fit(X_train,y_train)
predict = rf.predict(X_test)
from sklearn.metrics import accuracy_score
print('Accuracy of RandomForestClassifier',accuracy_score(y_test,predict)*100)
from sklearn.metrics import confusion_matrix
print('Confusion matrix of RandomForestClassifier\n',confusion_matrix(y_test,predict))
from sklearn.metrics import classification_report
print('Classification report of RandomForestClassifier\n\n',classification_report(y_test,predict))
import joblib
joblib.dump(rf, 'rbc.pkl')
joblib.dump(tv, 'rbc_tv.pkl')

ADA BOOST CLASSIFIER


#import library packages
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import numpy as np
import warnings
warnings.filterwarnings("ignore")

#Load given dataset


df = pd.read_csv('Mark.csv')
df
df.shape
type(df['KEYWORD'].loc[100])
df.info()
# Data cleaning and preprocessing

import re
import nltk
nltk.download('stopwords')
from nltk.corpus import stopwords
from nltk.stem.porter import PorterStemmer
ps=PorterStemmer()
corpus=[]
for i in range(0, len(df)):
    review0 = re.sub('[^a-zA-Z0-9]', ' ', str(df['KEYWORD'][i]))
    review1 = re.sub('[^a-zA-Z0-9]', ' ', str(df['ANSWER'][i]))
    review = review0 + review1
    review = review.lower()
    review = review.split()
    review = [ps.stem(word) for word in review if not word in stopwords.words('english')]
    review = ' '.join(review)
    corpus.append(review)
corpus
# Creating the TFIDF model
from sklearn.feature_extraction.text import TfidfVectorizer
tv=TfidfVectorizer(max_features=2500,ngram_range=(1,2))
X=tv.fit_transform(corpus).toarray()

X
X.shape
df['MARK'].value_counts()
y = pd.get_dummies(df['MARK'])
y = y.iloc[:,1].values
y = df['MARK']
print(y)
# Train Test Split

from sklearn.model_selection import train_test_split


X_train,X_test,y_train,y_test=train_test_split(X,y,test_size=0.2,random_state=0)
from sklearn.ensemble import AdaBoostClassifier
AD = AdaBoostClassifier()
AD.fit(X_train,y_train)
predict = AD.predict(X_test)
from sklearn.metrics import accuracy_score
print('Accuracy of AdaBoostClassifier',accuracy_score(y_test,predict)*100)
from sklearn.metrics import confusion_matrix
print('Confusion matrix of AdaBoostClassifier\n',confusion_matrix(y_test,predict))
from sklearn.metrics import classification_report
print('Classification report of AdaBoostClassifier\n\n',classification_report(y_test,predict))

FLASK:
from flask import Flask,render_template,url_for,request
import pandas as pd
import joblib

# load the model from disk


clf = joblib.load("rbc.pkl")
cv = joblib.load("rbc_tv.pkl")

app = Flask(__name__)

@app.route('/')
def home():
    return render_template('home.html')

@app.route('/predict', methods=['POST'])
def predict():
    if request.method == 'POST':
        message = request.form['message']
        data = [message]
        print(data)
        vect = cv.transform(data).toarray()
        my_prediction = clf.predict(vect)
        print(my_prediction)
        return render_template('result.html', prediction=my_prediction)

if __name__ == '__main__':
    app.run(debug=False, port=7000)
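Once the Flask app above is running on port 7000, the /predict endpoint can be exercised with a
simple client call; the snippet below is only an illustrative test (the form field name 'message'
matches the code above, and the answer text is a made-up example):

# Illustrative client call against the locally running Flask app
import requests

response = requests.post('http://127.0.0.1:7000/predict',
                         data={'message': 'operating system manages hardware resources'})
print(response.status_code)   # 200; the body is the rendered result.html with the predicted grade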
APPENDIX II
PAPER PUBLISHED AND CERTIFICATE

We have submitted our paper to ViTECoN 2023, held on March 23rd at Vellore Institute of Technology
(VIT), Vellore, India.
APPENDIX III
CO-PO-PSO MAPPING

PROGRAMME EDUCATIONAL OBJECTIVES(PEOs)


PEO1: To equip students with an essential background in computer science, basic electronics,
and applied mathematics.
PEO2: To prepare students with fundamental knowledge in programming languages and tools
and enable them to develop applications.
PEO3: To encourage the research abilities and innovative project development in the field of
networking, security, data mining, web technology, mobile communication, and emerging
technologies for the cause of social benefit.
PEO4: To develop professionally ethical individuals enhanced with analytical skills,
communication skills and organizing ability to meet industry requirements.

PROGRAM OUTCOMES (POs)

A graduate of the Computer Science and Engineering Program will demonstrate:


PO1: Engineering knowledge: Apply the knowledge of mathematics, science, engineering
fundamentals, and an engineering specialization to the solution of complex engineering
problems.
PO2: Problem analysis: Identify, formulate, review research literature, and analyze complex
engineering problems reaching substantiated conclusions using first principles of mathematics,
natural sciences, and engineering sciences.
PO3: Design/development of solutions: Design solutions for complex engineering problems
and design system components or processes that meet the specified needs with appropriate
consideration for the public health and safety, and the cultural, societal, and environmental
considerations.

PO4: Conduct investigations of complex problems: Use research-based knowledge


and research methods including design of experiments, analysis and interpretation of
data, and synthesis of the information to provide valid conclusions.

PO5: Modern tool usage: Create, select, and apply appropriate techniques, resources, and
modern engineering and IT tools including prediction and modeling to complex engineering
activities with an understanding of the limitations.
PO6: The engineer and society: Apply reasoning informed by the contextual knowledge to
assess societal, health, safety, legal and cultural issues and the consequent responsibilities
relevant to the professional engineering practice.
PO7: Environment and sustainability: Understand the impact of the professional
engineering solutions in societal and environmental contexts, and demonstrate the knowledge
of, and need for sustainable development.
PO8: Ethics: Apply ethical principles and commit to professional ethics and responsibilities
and norms of the engineering practice.
PO9: Individual and teamwork: Function effectively as an individual, and as a
member or leader in diverse teams, and in multidisciplinary settings.
PO10: Communication: Communicate effectively on complex engineering activities with the
engineering community and with society at large, such as being able to comprehend and write
effective reports and design documentation, make effective presentations, and give and receive
clear instructions.
PO11: Project management and finance: Demonstrate knowledge and understanding of the
engineering and management principles and apply these to one’s own work, as a member and
leader in a team, to manage projects and in multidisciplinary environments.
PO12: Life-long learning: Recognize the need for and have the preparation and ability to
engage in independent and life-long learning in the broadest context of technological change.

PROGRAM SPECIFIC OUTCOMES (PSOs)

A graduate of the Computer Science and Engineering Program will demonstrate:

PSO1: Foundation Skills: Ability to understand, analyze and develop computer programs in
the areas related to algorithms, system software, web design, machine learning, data analytics,
and networking for efficient design of computer-based systems of varying complexity.
Familiarity and practical competence with a broad range of programming languages and open-
source platforms.

PSO2: Problem-Solving Skills: Ability to apply mathematical methodologies to solve


computational tasks, model real world problems using appropriate data structure and suitable
algorithms. To understand the Standard practices and strategies in software project
development using open-ended programming environments to deliver a quality product.

PSO3: Successful Progression: Ability to apply knowledge in various domains to identify


research gaps and to provide solutions to new ideas, inculcate passion towards higher studies,
creating innovative career paths to be an entrepreneur and evolve as an ethically social
responsible computer science professional.

Project Work Course Outcome (CO):


1. On completion, students can execute the proposed plan and become aware of, and
overcome, the bottlenecks at every stage.
2. On completion of the project work, students will be in a position to take on
difficult practical problems and find solutions by formulating the right
methodology.
3. Students will gain hands-on experience in converting a small novel idea or method
into a working model or prototype involving multidisciplinary skills and/or
knowledge, while working in a team.
4. Students will be able to interpret the outcome of their project. Students will take
on the challenges of teamwork, prepare a presentation in a professional manner,
and document all aspects of design work.
5. Students will be able to publish or release the project to society.

Project Title: Automatic Descriptive Answer Evaluator using Machine Learning


Guide name: Mrs. Susmita Mishra

PO/PSO    PO1 PO2 PO3 PO4 PO5 PO6 PO7 PO8 PO9 PO10 PO11 PO12 PSO1 PSO2 PSO3
CO 1
CO 2
CO 3
CO 4
CO 5
Average

