Major Project
A PROJECT REPORT
Submitted by
KARTHIKEYAN B (953620104023)
VINO A (953620104059)
of
BACHELOR OF ENGINEERING
in
COMPUTER SCIENCE AND ENGINEERING

MAY 2024
ANNA UNIVERSITY: CHENNAI 600 025
BONAFIDE CERTIFICATE

SIGNATURE
Assistant Professor
Department of Computer Science and Engineering
Ramco Institute of Technology
North Venganallur Village
Rajapalayam – 626117

SIGNATURE
Professor & Head
Department of Computer Science and Engineering
Ramco Institute of Technology
North Venganallur Village
Rajapalayam – 626117

INTERNAL EXAMINER                    EXTERNAL EXAMINER
ACKNOWLEDGEMENT
ABSTRACT
TABLE OF CONTENTS

01 INTRODUCTION
    1.1 Objective of the Work
    1.4 Machine Learning
02 LITERATURE SURVEY
    2.1 Automatic Multiple-Choice Question Generation from School Textbooks
    2.2 Automated Multiple-Choice Question Generation from Text
    2.3 High-Quality Multiple Choice Questions
    2.4 Learning to Reuse Distractors to Support Multiple-Choice Question Generation in Education
    2.5 Question Generation in Education
    2.6 Multi-Hop Reasoning Question Generation and Its Application
03 BACKGROUND
    3.1 Technology Used
        3.1.1 PyPDF2
        3.1.2 SpaCy
            3.1.2.1 Similarity
        3.1.3 Natural Language Processing
        3.1.4 Torch
        3.1.5 T5 Transformers
        3.1.6 Gensim
            3.1.6.1 GloVe
        3.1.7 Textstat
            3.1.7.1 Automatic Readability Index
04 PROPOSED WORK
    4.1.1.5 Summarization
    4.1.1.9 Difficulty Level for Question
    4.1.1.10 Difficulty Level for Distractor
    4.1.2 Work Flow of Web Application
        4.1.2.1 Faculty
            4.1.2.1.1 Faculty Register
            4.1.2.1.2 Faculty Login
            4.1.2.1.3 Test Posting
            4.1.2.1.4 Faculty Dashboard
        4.1.2.2 Student
            4.1.2.2.1 Student Registration
            4.1.2.2.2 Student Login
            4.1.2.2.3 Assessment Login
            4.1.2.2.4 Assessment
            4.1.2.2.5 Student Dashboard
05 RESULTS AND DISCUSSION
06 CONCLUSION AND FUTURE WORK
07 PUBLICATION DETAILS
APPENDIX I - SOFTWARE SPECIFICATIONS
APPENDIX II - CODING
REFERENCES

LIST OF FIGURES

LIST OF ABBREVIATIONS

ABBREVIATION    EXPANSION
DG              DISTRACTOR GENERATION
CHAPTER 1
INTRODUCTION
In the era of the digital world, the demand for customized learning
experiences is more crucial than ever in the current educational environment.
Adaptive tests, which modify their level of difficulty according to the user's
performance, provide a promising approach to meeting the unique learning
requirements of individuals. Dynamic Question Generation (DQG) is a crucial
aspect of adaptive tests, since it involves creating questions that are
specifically designed to align with the student's degree of skill and progress
in learning.
As the number of students in a school or an organization grows, it becomes
increasingly difficult for teachers to evaluate and track students' academic
performance via offline examinations. Hence, the need for applications that
assess students' performance has risen substantially. Driven by the push to
bring education into the digital age, online tests have become more important in
recent years. Online tests make it possible for students to take them from anywhere
with internet access, removing geographical obstacles and making learning more
open to everyone. In terms of scheduling, this flexibility means that students can
pick test times that work best for them, reducing scheduling conflicts and making
things easier. They are also clearly more cost-effective than traditional ones
because they don't require actual testing centers, printed materials, or manual
grading. With the help of technology, these tests make grading easier, giving
students feedback right away and giving teachers more time to focus on teaching.
Even though there are worries about security, improvements in technology have
made online exams more trustworthy, making sure that tests stay valid and
dependable. Multiple Choice Questions (MCQs) are a great example of how
assessment methods have changed over the years in modern education. They can
be changed to fit different question types and large groups of students, meeting the
needs of both students and schools.
1.1 OBJECTIVE OF THE WORK
The design of Dynamic Question Generation based on user performance in
adaptive exams is guided by fundamental objectives of accuracy, reliability,
user-friendliness, and affordability. Its main objective is to provide a
customized learning experience: each test taker receives a personalized test
experience, and the dynamically generated questions support their academic
growth.
The overall objective of the dynamic question generation system is to create a
personalized and adaptive assessment experience for learners. By dynamically
generating questions based on factors such as the learner's performance,
preferences, and learning objectives, the system aims to enhance engagement,
promote deeper understanding, and optimize learning outcomes. The key
objectives are: personalization, which tailors questions to each learner's
proficiency level, learning style, and prior knowledge, fostering a more
relevant and engaging learning experience that improves comprehension and
retention; adaptability, whereby the system dynamically adjusts the difficulty
and complexity of questions in response to the learner's progress, ensuring
that assessments remain challenging yet achievable and thereby maximizing
learning effectiveness; efficiency, since automating the question generation
process saves educators time and resources, allowing them to focus on more
strategic aspects of teaching, while timely and constructive feedback based on
the learner's responses helps identify areas of strength and weakness, guiding
further study and facilitating continuous improvement; and diversity,
generating questions across a range of topics (e.g., science and social
science) to promote a comprehensive understanding of the subject matter and
encourage critical thinking and problem-solving skills.
The major objectives of the proposed system are:
❖ To provide a customized testing approach to test takers.
❖ To develop an adaptive exam system that dynamically generates questions based
on individual student performance, providing personalized learning experiences.
❖ To dynamically adjust the difficulty level of questions based on the user's
performance.
different levels based on their answering capacity. The interface should be
intuitive and accessible to teachers as well as
students of varying technical backgrounds.
Furthermore, the project encompasses the development of interfaces and
platforms to deliver these dynamically generated questions seamlessly to learners,
educators, and assessment systems. These interfaces should be adaptive and
scalable to accommodate various educational settings, including traditional test
practices, online assessments, and personalized learning environments.
Furthermore, the scalability of the system ensures that it can accommodate
varying needs and preferences. Users have the flexibility to access the contents
displayed in the adaptive testing system.
computer that makes it more similar to humans: the ability to learn. There are
three main types of machine learning: supervised learning, unsupervised
learning, and reinforcement learning.
CHAPTER 2
LITERATURE SURVEY
Overviews of different approaches to dynamic question generation are discussed
in this chapter.
2.1 AUTOMATIC MULTIPLE-CHOICE QUESTION GENERATION FROM SCHOOL TEXTBOOKS

The first phase of this system [1] extracts and cleans the
text. A set of further processing is done. Tokenization divides long strings of text
into basic units or tokens. The input is separated into chapters, sections,
paragraphs, and other relevant tags in the text using structural analysis. The next
phase removes unnecessary content from the text and converts the text into the
required format unifying the character shapes by converting all the text into either
lower or uppercase, removing numbers, punctuation, and stop words, stemming,
and lemmatization. POS tagging assigns a part of speech to each word of a given
text and is used to obtain more granular information about the words. The
authors applied word frequency in distractor generation and TF*IDF (term
frequency-inverse document frequency) in the key selection phase.
The next phase is sentence selection: not all sentences in a text contain
factual information, so this step identifies the sentences that can act as the
basis of question formation and measures each sentence's importance. Sentence
normalization together with semantic and predicate-argument structure (PAS)
analysis uncovers the underlying relationships in a sentence, such as which
word is the main verb, in order to determine its meaning. Distractor
generation plays a very crucial role in MCQ generation, because high-quality
distractors can easily confuse those attempting the MCQs. Distractors are
generated using built-in modules such as GloVe, WordNet, and FastText.
In conclusion, this system automatically generates MCQs from school textbooks,
and manual evaluation shows that the individual modules are efficient.
Furthermore, the number of MCQs generated by the system is quite high. A high
recall is necessary for the system to be utilized in a real application
scenario. Although the system contains some domain-specific features and
resources, it can be ported to other domains (or subjects) with minor effort,
which the authors verified by applying the system to other subjects.
2.2 AUTOMATED MULTIPLE-CHOICE QUESTION GENERATION
FROM TEXT:
Automated multiple-choice question generation from text [2] generates MCQs
from text content and works in six steps. The first step is pre-processing the
text for relevant question generation: it removes irrelevant data such as
titles, chapter headings, and tables of contents, and divides sentences into
individual words, a process known as tokenization. The pre-processing step
also includes syntactic and statistical analysis. The next step is sentence
selection, which considers sentence length, word frequency, and part-of-speech
tags obtained through parsing.
The next step is key selection, in which the relevant key, that is, the
answer, is identified using part-of-speech and parse information. The next
step is question formation: the question is generated by selecting the
relevant WH-word and using the subject-verb-object (SVO) relations and the
knowledge in the sentence. The next step is distractor generation, which uses
models such as WordNet, domain ontologies, and the distributional hypothesis.
The final step is post-processing, which includes question post-editing,
question filtering, and ranking. Using these steps, the system generates MCQs
from text. The primary advantage of this system is that it generates MCQs
directly from text content. Its disadvantage is that it does not generate
questions from larger texts, and it takes more time to generate questions from
text.
2.3 HIGH-QUALITY MULTIPLE CHOICE QUESTIONS

This work [3] tackles distractor generation with an encoder-decoder framework.
In this paper, a multi-selector generation network (MSG-Net) is
proposed that generates distractors with rich semantics based on different sentences
in an article. MSG-Net adopts a multi-selector mechanism to select multiple
different sentences in an article that are useful to generate diverse distractors. Here,
a question-aware and answer-aware mechanism are introduced to assist in selecting
useful key sentences, where each key sentence is logical with the question and not
equivalent to the answer. MSG-Net can generate diverse distractors based on each
selected key sentence with different semantics.
The proposed methodology consists of three layers: (i) a sentence-level
representation layer, (ii) a layer for choosing different key sentences, and
(iii) a generation layer. In the first layer, article sentences are
represented in a way that is both question-aware and answer-aware; this
representation shows how each sentence relates to the question and answer, and
it helps the second layer pick out key sentences that make sense with the
question but are not the same as the answer. The second layer uses a selector
to match a key sentence with a latent variable for each distractor, which
makes the distractors more varied; the multi-selector mechanism can pick out
different key sentences based on the sentence representations and the latent
factors. To generate distractors, the third layer uses a text-to-text transfer
transformer model: the T5 model is given a number of key sentences with
different meanings, so the distractors it produces are likewise diverse.
However, one drawback of this mechanism is that such methods often tend to
generate semantically similar distractors. Multiple generated distractors with
similar semantics are considered equivalent, and because the correct answer is
unique, students can eliminate these distractors even without reading the
article. Moreover, the proposed system generates MCQs only for a single
article or paragraph. Therefore, future work on the system extends to
generating questions for larger files such as PDFs.
2.4 LEARNING TO REUSE DISTRACTORS TO SUPPORT MULTIPLE-
CHOICE QUESTION GENERATION IN EDUCATION
Learning to reuse distractors to support multiple-choice question generation
in education [4] studies how a large existing set of manually created answers
and distractors for questions over a variety of domains, subjects, and
languages can be leveraged to help teachers create new MCQs through the smart
reuse of existing distractors. This article primarily focuses on context-
aware models called transformers which provide rich representations, for many
tasks in NLP, such as question answering, machine translation, and text
summarization. The proposed model was evaluated by conducting user test with
teachers.
A transformer is used to detect how every part of the input sequence relates
to all the others. The proposed system employs Bidirectional Encoder
Representations from Transformers (BERT), a pre-trained masked language model
widely used in downstream tasks such as question answering and generation,
machine reading comprehension, and machine translation, fine-tuning it on a
labeled dataset that provides a supervision signal. This paper introduced and
evaluated multilingual context-aware distractor retrieval models that reuse
distractors to help teachers with the task of making MCQs.
The system generated context-aware multilingual distractor retrieval models to
reuse distractor candidates that can speed up the process of creating multiple-
choice questions. In the evaluation, three of the ten distractors shown to
teachers were rated as high-quality distractors. Future work is to extend the
current work to a multimodal system that considers other sources of information,
e.g., images that accompany MCQs in digital learning tools.
2.5 QUESTION GENERATION IN EDUCATION
Question generation in education [5] focuses on Natural Language Processing
(NLP), aims to implement AI applications for educational purposes, and looks
forward to the benefits that emerging AI technologies can bring to education.
The system is aimed at generating short-answer questions automatically to
reduce the time teachers spend writing exam questions. In addition, the main
reason to implement this for short answers is that many studies have proven
that short-answer exercises can enhance students' long-term memory, thereby
improving their learning performance. Here, an automatic question generation
(AQG) system is proposed that combines syntax-based and semantics-based
Question Generation (QG) processes to improve students' learning performance.
The proposed system implements short-answer AQG technology to support creating
a large number of exercises and questions, which can be used for assignments
or quizzes in programming courses.
The project methodology combines syntax-based and semantic-based methods for
automatic question generation. An interface allows teachers to view and modify
machine-generated questions and allows students to answer them. The main
purpose of semantic analysis is to enable machines to understand the content
of textbooks and extract important keywords from them. Here, the system uses
BERT (Bidirectional Encoder Representations from Transformers) to extract
keywords from textbooks, training the model on a large amount of unlabeled
data through unsupervised learning and transfer learning. This brings added
advantages to the proposed system: (1) there is no need to label training
data, and (2) the language model can understand grammatical structure and
interpret semantics. In addition to semantic analysis, the purpose of syntax
analysis is to extract sentences that contain keywords. The Stanford CoreNLP
tool, developed by Stanford University, is used here to perform syntactic
analysis on the extracted sentences. The result is displayed in a parse tree
to find the complete sentence containing the subject, verb, and target. Since
syntax analysis can only output declarative sentences containing keywords, T5
is used in the system to convert declarative sentences into interrogative
ones. T5 is a text-to-text framework that combines encoder and decoder
transformer architecture; it treats each NLP problem as "text-to-text",
meaning it takes text as input and generates new text as output. Therefore,
the system inputs descriptive sentences containing keywords into the T5 model
and uses it to transform them into interrogative sentences beginning with
WH-words. The proposed model achieves good performance in terms of the quality
of the generated questions.
However, a few limitations exist in the proposed model: it only generates WH
questions. In the future, the work extends to developing assessments with
multiple-choice questions.
2.6 MULTI-HOP REASONING QUESTION GENERATION AND ITS APPLICATION

This system first extracts the reasoning chain, an answer, and supporting
sentences from the given text, then views them as the reasoning contents for a
specific question. Based on the extracted contents, it yields the multi-hop
result.
The system generates questions from multiple sources of information; the MCQs
produced come from the provided text. Existing QG methods are mainly used to
generate simple questions from a single sentence, whereas this system works on
multi-hop questions that require reasoning over multiple sources of
information to answer. The future scope extends to applying these customized
techniques to the remaining challenges of the QG process, including
repetition, consistency, and relevance, which can help generate results with
correct syntax and semantics.
CHAPTER 3
BACKGROUND
In this chapter, the background of the proposed system is explained. The
hardware components and the technologies used in this project are discussed.
3.1.2 SpaCy:
SpaCy is a highly regarded open-source library for natural language processing
(NLP) in Python. SpaCy uses a pipeline-based architecture to process natural
language text efficiently. It offers NLP tasks like tokenization,
part-of-speech tagging, named entity recognition, and dependency parsing,
enabling the extraction of meaningful insights from text data. With
pre-trained models available for multiple languages, SpaCy allows for easy
extension with custom components and pipelines. Users can customize and extend
its functionality to suit their specific needs or integrate it into existing
NLP workflows. Additionally, SpaCy provides pre-trained statistical models for
various languages and domains, allowing users to perform NLP tasks without
needing to train models from scratch.
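As an illustration, a minimal SpaCy pipeline looks as follows (a sketch,
assuming the en_core_web_md model has been downloaded):

import spacy

# load a pre-trained English pipeline (requires: python -m spacy download en_core_web_md)
nlp = spacy.load("en_core_web_md")
doc = nlp("Adaptive exams adjust question difficulty based on student performance.")

# tokenization and part-of-speech tagging
for token in doc:
    print(token.text, token.pos_)

# named entity recognition
for ent in doc.ents:
    print(ent.text, ent.label_)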
3.1.2.1 Similarity
In SpaCy, similarity refers to the measure of similarity between two pieces of
text, typically represented as vectors in a high-dimensional space. SpaCy
provides a built-in method to compute the similarity between documents, spans,
or tokens based on their word embeddings. SpaCy's models are trained with word
embeddings, which represent words as dense vectors in a continuous space.
These vectors capture semantic similarities between words, with similar words
having vectors closer to each other in the embedding space. SpaCy provides the
similarity() method to compute the similarity score between two objects
(documents, spans, or tokens); calling this method on two objects returns
their similarity score.
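For example, the similarity between two documents can be computed as follows
(a sketch; a model with word vectors such as en_core_web_md is assumed):

import spacy

nlp = spacy.load("en_core_web_md")
doc1 = nlp("The student answered the question correctly.")
doc2 = nlp("The learner gave the right answer.")

# similarity() returns a score based on the word vectors of the two texts
print(doc1.similarity(doc2))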
3.1.4 Torch
The torch module, an integral component of the PyTorch library, stands as a
cornerstone in the field of deep learning and scientific computing. PyTorch,
built upon the Torch framework, offers a versatile and intuitive platform for
developing, training, and deploying state-of-the-art machine learning models.
At its core lies the torch.Tensor data structure, facilitating seamless
handling of multi-dimensional arrays and enabling efficient computation on
both CPUs and GPUs. With its dynamic computational graph feature, PyTorch
empowers users to construct complex neural network architectures with ease,
facilitating experimentation and rapid prototyping. PyTorch's seamless
integration with NVIDIA CUDA enables high-performance computing, with
significant speedups for computationally intensive tasks.
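A small sketch of the basic tensor workflow described above:

import torch

# move computation to the GPU when one is available
device = 'cuda' if torch.cuda.is_available() else 'cpu'

a = torch.tensor([[1.0, 2.0], [3.0, 4.0]], device=device)
b = torch.randn(2, 2, device=device)

# element-wise and matrix operations run on CPU or GPU transparently
print(a * b)
print(a @ b)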
3.1.5 T5 Transformers:
The T5 Transformer, short for "Text-To-Text Transfer Transformer," is a state-
of-the-art neural network architecture introduced by researchers at Google. It uses
a special structure called the Transformer, which is great at understanding how
words relate to each other in a sentence. This helps the model process and create
text very effectively. In simple terms, it's like having a smart assistant that
understands language well, making it useful for tasks like summarizing text,
answering questions, and translating languages. It is used in real-time
applications such as question generation and chatbot creation. Its ability to
understand and manipulate
language allows it to analyze the content of a passage and generate contextually
relevant questions. T5 can generate questions across various difficulty levels and
question types, including factual, inferential, and evaluative questions, catering to
different educational and assessment needs.
3.1.6 Gensim
Gensim is an open-source Python library designed for topic modelling, document
indexing, and similarity retrieval with large corpora. It is primarily used
for unsupervised learning on text data. It provides efficient implementations
of various algorithms for natural language processing (NLP) tasks,
particularly focusing on techniques like topic modelling and word embeddings.
3.1.6.1 GloVe
GloVe, short for Global Vectors for Word Representation, is an unsupervised
learning algorithm for obtaining vector representations of words. These
representations, also known as word embeddings, capture semantic relationships
between words based on their co-occurrence statistics in a corpus of text.
GloVe embeddings are commonly used in distractor generation tasks in natural
language processing (NLP). Distractors are incorrect or misleading options
provided alongside a correct answer in multiple-choice questions or other
assessment formats. GloVe embeddings encode semantic relationships between
words, allowing for the identification of words that are similar in meaning.
This enables the generation of distractors that are contextually relevant to
the correct answer but differ in meaning.
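As an illustration, GloVe vectors can be loaded through gensim's downloader
API and queried for distractor candidates (a sketch, assuming the
glove-wiki-gigaword-100 vectors):

import gensim.downloader as api

# downloads the pre-trained GloVe vectors on first use
glove_model = api.load("glove-wiki-gigaword-100")

# words close to the answer in embedding space are distractor candidates
answer = "photosynthesis"
candidates = [word for word, score in glove_model.most_similar(answer, topn=10)]
print(candidates)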
3.1.7 Textstat
Textstat is a Python library used for computing various statistics from text.
It offers a range of functionalities that help analyse and understand text
data more effectively. Textstat provides functions to compute readability
scores, such as the Flesch-Kincaid Grade Level, the Coleman-Liau Index, the
Automated Readability Index (ARI), and others.
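For example, the ARI of a generated question can be computed as follows:

import textstat

question = "What process do plants use to convert sunlight into chemical energy?"

# ARI approximates the grade level required to read the text
print(textstat.automated_readability_index(question))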
CHAPTER 4
PROPOSED WORK
In this chapter, the design of the proposed system is explained, together with
the methodology developed for automated question generation.
Figure 4.1 Workflow for MCQ Generation
Figure 4.1 shows the workflow of the entire question generation process from
start to end.
In the question generation pipeline, text extraction plays a pivotal role in
ensuring the quality and relevance of the generated questions, thereby
enhancing the overall effectiveness of the model in facilitating learning and
assessment.
By delving into these linguistic features, statistical analysis enriches the
understanding of the text's content and context, laying a robust foundation
for question generation and summarization.
4.1.1.5 SUMMARIZATION
Summarization plays a vital role in condensing lengthy paragraphs into
concise versions, addressing the challenge of generating questions from extensive
text. By reducing the paragraph's length, summarization enhances efficiency in
question generation, as processing shorter texts is quicker and more manageable.
This process relies on assigning scores to sentences based on their word frequency
counts, with higher scores indicating greater relevance and importance.
Leveraging the nlargest() function from Python's heapq module, summarization
selects the top N sentences with the highest scores, typically around 30
sentences, to form the summarized version. This approach ensures that the
essential information is retained while unnecessary details are discarded,
thus streamlining the question generation process and improving its
effectiveness.
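A minimal sketch of this frequency-based scoring and selection (the helper
name is illustrative):

from heapq import nlargest
from collections import Counter

def summarize(sentences, top_n=30):
    # score each sentence by the frequency of the words it contains
    word_freq = Counter(word.lower() for s in sentences for word in s.split())
    scores = {s: sum(word_freq[w.lower()] for w in s.split()) for s in sentences}
    # keep the top_n highest-scoring sentences
    return nlargest(top_n, scores, key=scores.get)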
Question generation lies at the heart of our dynamic question generation
system, serving as the primary mechanism for generating relevant questions
from summarized content. This process plays a central role in facilitating
learning and assessment by crafting questions tailored to the context of the
summarized paragraph. Leveraging keywords extracted from the text, the question
generation process is seamlessly executed using the T5 model. This powerful
model utilizes the provided keywords to generate questions, effectively
transforming them into inquiries that delve deeper into the content's essence. By
incorporating the identified keywords, the generated questions maintain relevance
and coherence, ensuring they align closely with the context of the summarized
paragraph. Thus, question generation not only enhances the comprehensiveness of
the learning experience but also contributes significantly to the effectiveness and
adaptability of our dynamic question generation system.
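A sketch of this step using the T5 interface loaded in Appendix II (the prompt
format pairing context and keyword is an assumption about the fine-tuned
checkpoint):

import torch
from transformers import T5Tokenizer, T5ForConditionalGeneration

device = 'cuda' if torch.cuda.is_available() else 'cpu'
tokenizer = T5Tokenizer.from_pretrained('t5-large', legacy=False)
model = T5ForConditionalGeneration.from_pretrained('Parth/result').to(device)

context = "Photosynthesis is the process by which plants convert sunlight into chemical energy."
answer = "photosynthesis"

# pair the extracted keyword (answer) with its context
inputs = tokenizer(f"context: {context} answer: {answer}",
                   return_tensors="pt", truncation=True).to(device)
outputs = model.generate(**inputs, max_length=64, num_beams=4, early_stopping=True)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))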
4.1.1.9 DIFFICULTY LEVEL FOR QUESTION
Setting the difficulty level for questions is a pivotal aspect in optimizing the
effectiveness of adaptive exams within our dynamic testing system. By tailoring
the difficulty level to each individual's proficiency and comprehension abilities,
we ensure a personalized and engaging testing experience. This dynamic approach
allows us to adjust the complexity of questions in real-time, accommodating the
diverse learning needs and skill levels of test-takers. Utilizing the Automated
Readability Index (ARI), we gauge the reading complexity of sentences and
categorize questions into easy, medium, or hard difficulty levels accordingly. This
classification enables us to present questions that challenge and stimulate critical
thinking while remaining accessible and fair to all participants. By dynamically
adjusting the difficulty level, our system promotes deeper engagement, fosters skill
development, and provides a more accurate assessment of learners' knowledge and
capabilities. Thus, integrating difficulty level settings into our adaptive testing
framework enhances its adaptability, efficacy, and overall impact on the learning
process.
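Mirroring the thresholds used in Appendix II, this classification can be
sketched as:

import textstat

def question_difficulty(question):
    # ARI below 6 reads easily; 6 to 9 is moderate; above 9 is demanding
    ari = textstat.automated_readability_index(question)
    if ari < 6:
        return "easy"
    elif 6 <= ari < 9:
        return "medium"
    else:
        return "hard"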
The difficulty level of each distractor is set based on its semantic
similarity to the correct answer: the closer a distractor is to the answer,
the harder the question becomes, thereby enhancing the rigor and reliability
of the assessment. Moreover, aligning
the difficulty level of distractors with the proficiency and comprehension levels of
test-takers ensures a fair and equitable testing experience for all participants.
Overall, integrating difficulty level settings for distractors into our dynamic
question generation system enhances its adaptability, precision, and overall
effectiveness in evaluating learners’ skills.
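Following the mapping used in Appendix II, distractor difficulty is derived
from the spaCy similarity score between each distractor and the answer:

def distractor_difficulty(similarity):
    # higher similarity to the answer means a more confusing distractor
    if similarity == 1.0:
        return 10
    elif 0.7 <= similarity < 1.0:
        return 7
    elif 0.4 <= similarity < 0.7:
        return 4
    else:
        return 0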
4.1.2.1 FACULTY
The faculty plays a crucial role in our system. Faculty members can monitor
the whole student database and generate questions by providing relevant
documents. They can also create tests for students.
The system generates a unique identifier for each set of questions.
4.1.2.2 STUDENT
The student plays another important role in our system. The student is the one
who attends the exams whose questions are posted by the faculty. Each student
has a dashboard showing all their scores and previously attended tests.
4.1.2.2.2 STUDENT LOGIN
For the student login feature, users must first register with valid credentials.
Upon attempting to log in, the system verifies the entered email against the
registered emails database. If the email is not found or is invalid, an error message
prompts the user to provide a valid email address. Subsequently, if the email is
valid, the system compares the entered password with the password associated with
the email in the database. If the password matches the email, the user is
successfully logged in. Otherwise, an error message indicates that the password is
incorrect.
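The verification logic mirrors the backend pattern shown in Appendix II; a
simplified sketch:

def login_student(db, email, password):
    # look the email up in the registered students collection
    student = db['studentReg'].find_one({'student_email': email})
    if student is None:
        return {"error": "Please provide a valid email address"}
    if password != student['student_pass']:
        return {"error": "Password incorrect"}
    return {"message": "Login successful", "studentId": str(student['_id'])}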
4.1.2.2.4 ASSESSMENT
Upon entering the credentials, the test begins with the presentation of the first
question, which is of medium difficulty. Subsequent questions adapt dynamically
based on the user's responses: correct answers lead to more challenging questions,
while incorrect responses result in easier questions. This adaptive process
continues until the maximum number of required questions is reached.
Additionally, a timer is initiated, ensuring that the test duration is
limited. For example, if the timer is set to 30 minutes, users must complete
the test within this time frame. Upon reaching the maximum time limit, the
test automatically submits.
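The adaptive selection can be sketched as follows (simplified; the full logic
lives in the ChooseCrtQues helper of Appendix II):

import random

def next_question(was_correct, current_level, easy, medium, hard):
    # correct answers step the difficulty up; incorrect answers step it down
    order = ["easy", "medium", "hard"]
    idx = order.index(current_level)
    idx = min(idx + 1, 2) if was_correct else max(idx - 1, 0)
    pool = {"easy": easy, "medium": medium, "hard": hard}[order[idx]]
    return random.choice(pool)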
CHAPTER 5
RESULTS AND DISCUSSION
The results of the implemented system are discussed in this chapter. Section
5.1 discusses question generation, Section 5.2 describes initial posting and
PDF documentation, Section 5.3 shows the test attendance phase, and Section
5.4 visualizes the results in the dashboard.
Figure 5.2 Final Set of Question
The finalized set of questions and answers, as depicted in Figure 5.2, will be
integrated into our web application for system testing.
This collection incorporates a range of difficulty levels determined through the
Automated Readability Index (ARI) and similarity technology, as discussed
previously. Each question-answer pair is accompanied by its respective difficulty
value, crucial for implementing a dynamic testing approach. To ensure efficient
storage and retrieval, this dataset will be stored in MongoDB, a robust database
management system. Accurate categorization of difficulty levels facilitates
effective testing, enhancing the overall performance and adaptability of our
system.
checks assess the richness of the PDF content to determine the feasibility of
generating the desired number of questions. Finally, before storing the
questions in the database, the system generates a random 8-character
identifier (capital letters and digits), for example: MST3DJS1.
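Such an identifier can be produced with Python's standard library (one
possible sketch):

import random
import string

def generate_question_id(length=8):
    # capital letters and digits, e.g. "MST3DJS1"
    alphabet = string.ascii_uppercase + string.digits
    return ''.join(random.choices(alphabet, k=length))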
Figure 5.3 shows the test posting page. On this page, the faculty provides the
PDF and the relevant information for generating the questions and answers.
Figure 5.4 Starting test
The first question appears after entering the credentials. The test initiates
with a medium-difficulty question and adapts subsequent questions based on the
user's responses: correct answers advance to harder questions, while incorrect
answers move to easier ones. This dynamic process continues until the required
number of questions is reached. A timer runs until the maximum time is
reached; for example, 00:30 means the test must be completed within 30
minutes, after which it automatically submits. This ensures personalized
learning experiences tailored to users' abilities and needs within the chosen
course.
Figure 5.5 shows the test attending page. From the final set of questions, the
question and answers are retrieved and rendered on this page. Answering
correctly dynamically changes the following question.
Figure 5.5 MCQ
Figure 5.6 Student Dashboard
Figure 5.6 visualizes the student dashboard, through which students can view
their results.
The faculty dashboard features comprehensive data on tests created, including
details such as the starting time, ending time, duration, and a unique question ID
for each test, facilitated with a copy icon for convenient duplication. The
information is structured in a tabular format, enabling easy access and navigation.
Upon selecting a specific test row, the dashboard dynamically populates with data
on student attendance, showcasing student names, emails, departments, courses,
total questions attended, and individual scores. This interface empowers faculty
members to efficiently monitor test participation and assess student performance,
facilitating informed decision-making and targeted intervention where necessary.
Additionally, the faculty dashboard provides comprehensive insights into faculty
data, showcasing details such as the faculty member's name, email, department of
employment, and the subjects they teach. This comprehensive overview allows
faculty members to track their own involvement in test creation and management,
fostering transparency and accountability within the educational institution.
Figure 5.7 Faculty dashboard
Figure 5.7 shows data about the questions posted previously, along with the
details of the faculty member who posted them. This faculty dashboard also
holds data on the students who attended each test, empowering faculty members
to efficiently monitor test participation and assess student performance.
It also presents details on the student's test participation, including the total number of questions
attempted and their respective scores. This comprehensive overview equips faculty
members with the necessary insights to assess student performance and provide
targeted support where needed.
CHAPTER 6
CONCLUSION AND FUTURE WORK
6.1 CONCLUSION
landscape.
6.2 FUTURE WORK
In our project's future endeavors, we aim to expand its capabilities to
generate questions more quickly. This advancement will enable users to extract
questions from more extensive documents, broadening its applications across
educational materials, research papers, and technical manuals. Additionally, we
plan to enhance the system's ability to generate distractors specifically for technical
terms and concepts, thus improving the quality and effectiveness of the questions
generated. Furthermore, our project's scope extends to extracting relevant
information from images and tables, allowing for the generation of questions,
answers, and distractors based on this data. This evolution will diversify the
system's utility, accommodating a wider array of content types and catering to
various learning materials and domains.
PUBLICATION DETAILS
APPENDIX I
SOFTWARE SPECIFICATIONS
APPENDIX II
SAMPLE CODING

IMPORT PACKAGES
from flask import Flask
from heapq import nlargest
import random
import re
import textstat
import torch
import numpy as np
from collections import Counter
from nltk.tokenize import word_tokenize
from transformers import T5Tokenizer, T5ForConditionalGeneration
from ChooseQue import RandomQue, ChooseCrtQues
TEXT EXTRACTION
def Start(text):
    def remove_page(text):
        pattern = r'\bPage\s+N\d{2}\b'
        c_text = re.sub(pattern, '', text)
        c_text = c_text.replace('\n', '')
        return c_text
    c_text = remove_page(text=text)
    text = c_text
SYNTACTIC ANALYSIS
    word_token = word_tokenize(text)
    doc = nlp(text)
    sentence_tokens = [sent for sent in doc.sents]
    para = ""
    count = 0
    para_list = []
    for x in sentence_tokens:
        if count > 30:
            para_list.append(para)
            para = ""
            count = 0
        else:
            para = para + str(x)
            count = count + 1
    if para != "":
        para_list.append(para)
    text_list = [[paragraph] for paragraph in para_list]
    capitalized_paragraphs = []
    for paragraph in text_list:
        paragraph_text = paragraph[0]
        doc = nlp(paragraph_text)
        capitalized_text = '. '.join([sent.capitalize() for sent in paragraph_text.split('.')])
        capitalized_paragraphs.append([capitalized_text])
    text_list = capitalized_paragraphs
STATISTICAL ANALYSIS
    word_freq_list = []
    for text in text_list:
        word_freq = Counter()
        for sentence in text:
            words = sentence.split()
            # count lowercase word occurrences (loop body completed; the source truncates here)
            word_freq.update(word.lower() for word in words)
        word_freq_list.append(word_freq)
SUMMARIZATION
    def tokenize_sentences_regex(text_list):
        pattern = r'(?<!\w\.\w.)(?<![A-Z][a-z]\.)(?<=\.|\?|\!)\s'
        long_sentences = []
        for paragraph in text_list:
            tot_sentence = []
            sentences = re.split(pattern, paragraph[0])
            for sentence in sentences:
                if len(sentence) > 10:
                    tot_sentence.append(sentence)
            long_sentences.append(tot_sentence)
        return long_sentences

    long_sentences = tokenize_sentences_regex(text_list)
    sentence_scores = []
    for sublist, word_freq in zip(long_sentences, word_freq_list):
        nested_dict = {}
        for sentence in sublist:
            score = sum(word_freq.get(word.lower(), 0) for word in sentence.split())
            nested_dict[sentence] = score
        sentence_scores.append(nested_dict)

    def generate_summary(sentence_scores):
        summary = nlargest(3, sentence_scores, key=lambda x: sentence_scores[x])
        summary_sentences = [sentence for sentence in summary]
        return summary_sentences

    summaries = []
    for scores in sentence_scores:
        summary = generate_summary(scores)
        summaries.append(summary)

    text_doc = []
    for i in text_list:
        para = []
        for j in i:
            para.append(nlp(str(j)))
        text_doc.append(para)

    summary_doc = []
    for i in summaries:
        summary = []
        for j in i:
            summary.append(nlp(str(j)))
        summary_doc.append(summary)
KEY GENERATION
    keys = []
    for doc in text_doc:
        k1 = []
        for sentence in doc:
            doc_sentence = nlp(str(sentence))
            for word in doc_sentence.noun_chunks:
                if word.text.lower() not in stop:
                    k1.append(word.text)
        keys.append(k1)

    # keywords are extracted from the summaries the same way
    # (the loop over summary_doc is reconstructed; the source is garbled here)
    keywords = []
    for doc in summary_doc:
        k2 = []
        for sentence in doc:
            doc_sentence = nlp(str(sentence))
            for word in doc_sentence.noun_chunks:
                if word.text.lower() not in stop:
                    k2.append(word.text)
        keywords.append(k2)

    common_keywords = []
    for keys_list, keywords_list in zip(keys, keywords):
        keys_set = set(keys_list)
        keywords_set = set(keywords_list)
        common = keys_set.intersection(keywords_set)
        common_keywords.append(list(common))
QUESTION GENERATION
    question_tokenizer = T5Tokenizer.from_pretrained('t5-large', legacy=False)
    question_model = T5ForConditionalGeneration.from_pretrained('Parth/result')
    device = 'cuda' if torch.cuda.is_available() else 'cpu'
    question_model = question_model.to(device)

    def get_question(context, answer, model, tokenizer):
        # body elided in the source listing
        pass

    def remove_words(input_string, words_to_remove):
        # signature reconstructed from the later call remove_words(answer, RemoveWords)
        input_string = input_string.replace("\n", "")
        words = input_string.split()
        cleaned_words = [word for word in words if word.lower() not in words_to_remove]
        return ' '.join(cleaned_words)
DISTRACTOR GENERATOR
    def generate_distractors(target_word, num_distractors=5, glove_model=None):
        target_words = target_word.split()
        try:
            # similarity of every vocabulary vector against the target word(s)
            # (this call is elided in the source; reconstructed with gensim's API)
            similarity_scores = glove_model.cosine_similarities(
                glove_model.get_mean_vector(target_words), glove_model.vectors
            )
            # descending order so the most similar words come first (assumed intent)
            most_similar_indices = np.argsort(similarity_scores)[::-1]
            distractors = [glove_model.index_to_key[idx] for idx in most_similar_indices
                           if glove_model.index_to_key[idx] not in target_words]
            return distractors[:num_distractors]
        except Exception as e:
            print("Error occurred:", e)
            return []

    current_Answer_list = []
    Proper_QA = []
    RemoveWords = ["its", "a", "the", "this", "page n01", "n01"]
    i = 1
    # the following runs once per generated question/answer pair
    # (the enclosing loop header is elided in the source)
    similarity = []
    difficulty = []
    que_diff = ""
    ans = remove_words(answer, RemoveWords)
    target_word = ans.lower()
    ansnlp = nlp(target_word)
    distractors = generate_distractors(target_word, glove_model=glove_model)
    if distractors == []:
        continue
    else:
        current_Answer_list = random.sample(distractors, min(3, len(distractors)))
    if current_Answer_list != []:
        current_Answer_list.append(target_word)
    random.shuffle(current_Answer_list)
    for similaritys in current_Answer_list:
        doc = nlp(similaritys)
        val = ansnlp.similarity(doc)
        similarity.append(val)
    for value in similarity:
        if value == 1.0:
            difficulty.append(10)
        elif 0.7 <= value < 1.0:
            difficulty.append(7)
        elif 0.4 <= value < 0.7:
            difficulty.append(4)
        else:
            difficulty.append(0)
    var = textstat.automated_readability_index(question)
    if var < 6:
        que_diff = "easy"
    elif 6 <= var < 9:
        que_diff = "medium"
    else:
        que_diff = "hard"
    from Format import Format
    question_dict = Format(question, ans, que_diff, current_Answer_list,
                           similarity, difficulty, QuestionId, FacId, var)
    Proper_QA.append(question_dict)
    i += 1
    return Proper_QA
WEB APPLICATION CODE - FRONTEND

MCQ PAGE
const duration = localStorage.getItem('duration');
const testToken = localStorage.getItem('TestToken');
useEffect(() => {
    const intervalId = setInterval(() => {
        setTime(sec => sec + 1);
        console.log(time)
    }, 1000);
.then(res => {
    if (res.message) {
        toast.success(res.message);
        navigate('/StudentDashboard');
    } else if (res.error) {
        toast.error(res.error);
    } else {
        setQuestionGen(res.questionStructure);
        localStorage.setItem("currentque",
            JSON.stringify(res.questionStructure.Questionobjid));
        setAnswer('');
        setChange(!change, () => {
            setTime(0); // Reset time to 0
        });
    }
}).catch(error => {
    console.error('Error fetching question:', error);
    // Handle error (e.g., display an error message)
});
};
const handleOptionChange = (event) => {
    setAnswer(event.target.value);
};
{QuestionGen.Distractors[1]}</label></div>
<div className="op1">
    <label className="text-black text-3xl">
        <input type="radio" value={QuestionGen.Distractors[2]}
            checked={answer === QuestionGen.Distractors[2]}
            onChange={handleOptionChange}
            className="ra1" name="option" /> {QuestionGen.Distractors[2]}
    </label>
</div>
<div className="op1 cursor-pointer">
</div>)};
export default Mcq;
FACULTY DASHBOARD
import Panel from "../../assets/panel1.png"
import { FiCopy } from 'react-icons/fi'

const DashBoard = () => {
    const Navigate = useNavigate();
    const [Loaded, setLoaded] = useState(false)
    const [FacultyData, setFacultyData] = useState();
    const [Questions, setQuestions] = useState();
    const [copied, setCopied] = useState(false);

    useEffect(() => {
        let FacultyId = localStorage.getItem("userId");
        if (FacultyId == null) {
            return Navigate('/FacultyLogin');
        }
        console.log(FacultyId)
        const Type = "faculty";
        if (FacultyId) {
            fetch('http://localhost:5000/GetUserData/' + Type + '/' + FacultyId)
                .then(res => {
                    return res.json();
                }).then(res => {
                    console.log(res)
                    setFacultyData(res.FacultyData)
                    setQuestions(res.questionsData)
                    setTimeout(() => {
                        console.log("wait")
                        setLoaded(true)
                    }, 2000)
                })
        } else {
            Navigate('/FacultyLogin');
        }
    }, [])

    const studExamData = (id) => {
        console.log(id)
        localStorage.setItem('currentstudid', id);
        Navigate('/Data')
    }

    const copyToClipboard = (questionId) => {
        navigator.clipboard.writeText(questionId);
        setCopied(questionId);
        setTimeout(() => {
            setCopied(false);
        }, 1000); // Reset copied state after 1 second
    };

    return (<>
        {Loaded ? (
            <div className="dashboard p-0 m-0 w-[100vw] h-[100vh]">
                <Navbar1 />
                <div className="dash ml-[70px] mt-[40px] p-[25px] w-[90vw] rounded-xl flex">
                    <h1 className="text-white text-3xl">Publisher DashBoard</h1>
                </div>
                <div className="absolute w-[100%] p-[50px] h-[100%]">
                    <div className="w-[98%] flex justify-between h-[370px] max-h-[500px] m-[20px]">
                        <div className="dash basis-[50%] p-[30px] rounded-3xl duration-[0.5s] text-2xl text-white">
                            <div className="mb-[40px]">
                                <h1>Publisher Details</h1>
                            </div>
                            <div className="text-lg">
                                <div className="s"><label>Faculty Name : {FacultyData.faculty_name}</label></div> <hr></hr>
                                <div className="s"><label>Faculty Email : {FacultyData.faculty_email}</label></div> <hr></hr>
                                <div className="s"><label>Faculty Department : {FacultyData.faculty_dept}</label></div> <hr></hr>
                                <div className="s"><label>Faculty Taught : {FacultyData.faculty_taught}</label></div>
                            </div>
                        </div>
                        <div className="dash basis-[45%] p-[30px] rounded-3xl duration-[0.5s] items-center">
                            <img src={Panel} className="h-[400px] w-[500px] ml-[60px] mt-[-30px]" />
                        </div>
                    </div>
                    <div className="dash h-[380px] my-[70px] rounded-xl p-6 overflow-auto" style={{ maxHeight: "400px" }}>
                        <div><div className="flex"><div>
                            <h1 className="text-2xl text-white mb-4">Posted Questions Information</h1>
                        </div><div>
                            <Link to='/Exam'>
                            {Questions.length > 0 ? (
                                <table className="table-auto w-full relative text-white font-bold text-[15px]">
                                    <thead>
                                        <tr>
                                            <th className="px-4 py-2">Test Posted Date</th>
                                            <th className="px-4 py-2">Test Ended Date</th>
                                            <th className="px-4 py-2">Duration</th>
                                            <th className="px-4 py-2">Que Count</th>
                                            <th className="px-4 py-2">Question ID</th>
                                        </tr>
                                    </thead>
                                    <tbody>
                                        {Questions.map((item, index) => (
                                            <tr key={index} id={item.QuestionId} onClick={() => studExamData(item.QuestionId)}>
                                                <td className="border px-4 py-2 hover cursor-pointer">{item.StartingTime}</td>
                                                <td className="border px-4 py-2 hover cursor-pointer">{item.EndingTime}</td>
                                                <td className="border px-4 py-2 hover cursor-pointer">{item.Duration}</td>
                                                <td className="border px-4 py-2 hover cursor-pointer">{item.quecount}</td>
                                                <td className="border relative px-4 py-2 hover cursor-pointer">{item.QuestionId}
                                                    <button
                                                        className="absolute right-2 top-1/2 transform -translate-y-1/2 bg-transparent border-none focus:outline-none"
                                                        onClick={(e) => {
                                                            e.stopPropagation();
                                                            copyToClipboard(item.QuestionId);
                                                        }}>
                                                        <FiCopy />
                                                    </button>
                                                    {copied === item.QuestionId && <span className="text-green-500 ml-2">Copied!</span>}
                                                </td>
                                            </tr>
                                        ))}
                                    </tbody>
                                </table>
                            ) : (
                                <div className="absolute bottom-[-6%] left-[45%]">
                                    <p className="text-3xl">🥲<span className="text-center text-xl">There is no data</span></p>
                                </div>
                            )}
                        </div></div></div>
                    </div>
                </div>
            </div>
        ) : (
            <div className="flex w-[100%] h-[100vh] justify-center items-center text-white">
                <div className="w-[300px] bg-blue-900 h-[100px] rounded-xl flex items-center justify-center">
                    <img src={Images.Loading} alt="Loading" className="w-[35px]" />
                    <span className="ml-[20px] text-xl">Loading Please Wait</span>
                </div>
            </div>
        )}
    </>)
}
export default DashBoard;
CREATING QUESTION
const Exam = () => {
        .then(response => response.json())
        .then((res) => {
            console.log(res);
            if (res.message) {
                success(res.message)
                setLoaded(false)
                navigate('/PublisherDashboard')
            } else {
                error(res.error)
            }
        })
        .catch(err => {
            // renamed from "error" to avoid shadowing the toast helper
            console.error('Error:', err);
            error(err)
        });
    };
    return (
        <section>
            <div className="main bg-white h-[100vh]">
                <div className="m1 h-[100vh] flex justify-center items-center">
                    <div className="m2 bg-slate-300 h-[70%] w-[30%] justify-center rounded-3xl">
                        <div className="head w-[100%]">
                            <h1 className="he text-[35px] text-neutral-700 font-bold text-center">Test</h1>
                        </div>
                        <div className="m3 text-slate-900">
                            <div>
                                <label className="lab mt-[30px] ml-[40px] text-[20px] font-bold">Input File</label>
                                <input type="file" className="lab mt-[30px] ml-[40px] font-bold" required onChange={(e) => setSelect(e.target.files[0])} />
                            </div> <br></br>
                            <label className="lab mt-[30px] ml-[40px] text-[22px] font-bold">Starting Time</label>
                            <input type="datetime-local" id="startingTime" className="inp h-10 w-half ml-[30px] mt-[30px] font-bold" required /><br></br>
                            <label className="lab mt-[30px] ml-[40px] text-[22px] font-bold">Ending Time</label>
                            <input type="datetime-local" id="endingTime" className="inp h-10 w-half ml-[40px] mt-[30px] font-bold" required /><br></br>
                            <label className="lab mt-[30px] ml-[40px] text-[22px] font-bold">Total Que</label>
                            <input type="number" id="que" className="inp h-10 w-half ml-[60px] mt-[30px] font-bold" required /><br></br>
                            <label className="lab mt-[30px] ml-[40px] text-[20px] font-bold">Duration</label>
                            <input type="time" id="duration" className="inp h-10 w-[210px] ml-[70px] mt-[30px] font-bold" required /><br></br>
                            <button className="sub1 ml-[180px] font-bold" onClick={handleUpload} disabled={loaded}>
                                {loaded ? 'Uploading…' : 'Submit'}
                            </button>
                        </div>
                    </div>
                </div>
            </div>
        </section>
    );
};
export default Exam;
BACKEND CODE
DASHBOARD
from bson.objectid import ObjectId

@app.route('/GetUserData/<Type>/<id>')
def GetData(Type, id):
    print(Type, id)
    if Type == "faculty":
        ids = ObjectId(id)
        FacultyData = db['faculty'].find_one({"_id": ids}, {"_id": 0})
        Questions = list(db['questionstiming'].find({"FacultyId": id}, {"_id": 0}))
        if FacultyData is not None:
            return jsonify({"message": "Retrieved Successful",
                            "FacultyData": FacultyData, "questionsData": Questions})
        else:
GENERATE QUESTIONS
easy = []
medium = []
hard = []

@app.route('/getquestion', methods=['POST'])
def GenerateNQ():
    global Userdata
    data = request.json
    testToken = str(data['testToken'])
    if testToken == "":
        email, password, Id, queid, course = (
            str(data['email']),
            str(data['pass']),
            str(data['Id']),
            str(data['queid']),
            str(data['Dept'])
        )
        e = db['studentReg'].find_one({'student_email': email})
        realUser = db['studentReg'].find_one({"_id": ObjectId(Id)})
        if e is not None:
            if str(e['student_email']) != str(realUser['student_email']):
                return jsonify({"error": "It is not the exact user Email"})
            if password != e['student_pass']:
                return jsonify({"error": "Password incorrect"})
        quesid = db['questionstiming'].find_one({"QuestionId": queid})
        Questions = list(db['questions'].find({"QuestionID": quesid['QuestionId']}, {"Id": 0}))
        for que in Questions:
            print(que['Que_Difficulty'])
            if que['Que_Difficulty'] == "easy":
                easy.append(que)
            elif que['Que_Difficulty'] == "medium":
                medium.append(que)
            elif que['Que_Difficulty'] == "hard":
                hard.append(que)
        # session token generation (reconstructed from the garbled source)
        Token = GenerateToken(email)
        userData = db['studentattended'].find_one({"email": email, "QueId": queid})
        if userData is None:
            datas = {
                "email": email,
                "QueId": queid,
                "course": course,
                "Easy": 0,
                "Medium": 0,
                "Hard": 0,
                "score": 0,
                "percent": 0,
                "Questionsattented": 0
            }
            res = db['studentattended'].insert_one(datas)
        checkQue = db['questionstiming'].find_one({"QuestionId": userData['QueId']}, {"quecount": 1})
        print(int(userData['Questionsattented']) == int(checkQue['quecount']))
        if int(userData['Questionsattented']) == int(checkQue['quecount']):
            return jsonify({"error": "Already Attended the Test"})
        QueGen = RandomQue(medium)
        questionStructure = {
            "Questionobjid": str(QueGen['_id']),
            "Question": QueGen['Question'],
            "Distractors": QueGen['Distractors'],
        }
    else:
        # unpacking of the remaining request fields is elided in the source
        answer = (
            str(data['answer'])
        )
        if answer == "":
            return jsonify({"error": "Enter any answer"})
        checkQue = db['questionstiming'].find_one({"QuestionId": Userdata['QueId']}, {"quecount": 1})
        UpdateduserData = db['studentattended'].find_one_and_update(
            {"_id": Userdata["_id"]},
            {"$inc": {"Questionsattented": 1}},
            return_document=True
        )
        QueGen = ChooseCrtQues(
            user=UpdateduserData,
            db=db,
            easy=easy,
            medium=medium,
            hard=hard,
            duration=duration,
            answer=answer.lower(),
            id=queobjid,
            score=UpdateduserData['score']
        )
        if int(checkQue['quecount']) == (UpdateduserData['Questionsattented']):
            users = db['studentattended'].find_one({"_id": Userdata["_id"]})
            easyattent, mediumattent, hardattent = users['Easy'] * 4, users['Medium'] * 7, users['Hard'] * 10
    return jsonify({"questionStructure": questionStructure})
REFERENCES
[11] S. Hochreiter and J. Schmidhuber, "Long short-term memory," Neural
Computation, vol. 9, no. 8, pp. 1735–1780, 1997.
[12] A. S. Bhatia, M. Kirti, and S. K. Saha, "Automatic generation of multiple
choice questions using Wikipedia," in Proc. Int. Conf. Pattern Recognit. Mach.
Intell., 2013, pp. 733–738.
[13] A. Santhanavijayan, S. Balasundaram, S. H. Narayanan, S. V. Kumar, and
V. V. Prasad, "Automatic generation of multiple choice questions for
e-assessment," Int. J. Signal Imag. Syst. Eng., vol. 10, no. 1/2, pp. 54–62,
2017.