Identification and Addressal of Knowledge Gaps in Students
Harichandana Magapu
Dept. of Computer Science
PES University
Bengaluru, India
[email protected]
Abstract—According to the Annual Status of Education Report (ASER) survey, more than half of fifth-grade children in rural schools cannot read a second-grade textbook or answer simple mathematics problems. This figure hints at some fundamental problems plaguing the rural education system. Whether the cause is the lack of affordable schools, inadequate infrastructure, or merely the teaching methodology, we have studied the underlying causes and created software that can help students learn more clearly and identify knowledge gaps. It can also be used as a learning tool to address the inadequacies in the current rural education system. Our application uses the available educational content and develops questions to test the students' understanding of the fundamental concepts presented. Once students attempt to answer these questions, we assess their weaknesses by analysing their answers. Alongside the questions, we have used gamification to improve engagement with the app. We used the BERT Summarizer to extract and summarise the content. We then generate questions and use the Word2vec model to generate distractors. Multiple models were tried out, but the best results were achieved using the BERT summariser in conjunction with the Word2vec model. Metrics used to compute the results were answerability and BERT scores.
Index Terms—Natural Language Processing, Cosine Similarity, Word2vec

I. INTRODUCTION
India is a fast-changing country where a broad, high-quality education is critical to the country's long-term growth. The nation is currently experiencing a youth bulge. It has the largest youth population in the world, with 600 million individuals under the age of 25. With more than 30 infants born every minute and 28 per cent of the population under the age of 14, population growth rates are projected to stay around 1 per cent for years. By 2022, India is expected to overtake China as the world's most populous country, with a population of 1.5 billion by 2030 (up from 1.34 billion in 2017). By 2028, the United Nations predicts that Delhi will be the world's most populous city, with 37 million people.

India will gain a significant economic edge over rapidly ageing countries like China if it can modernise and expand its education system, raise educational attainment levels, and offer skills to its youth. Some analysts argue that India will eventually close the economic gap with China due to its greater proclivity for entrepreneurial innovation and its young, technically trained, rapidly growing English-speaking workforce, which is expected to be in higher global demand as China's labour costs rise faster than India's.
With over 1.5 million schools and over 260 million students in 2015/16, India has the world's second-largest education system behind China. Recent enrollment gains have been attributed to the country's youth bulge and greater access. According to government statistics, the student population in the school system increased by 5 per cent, or 12.6 million students, between 2010/11 and 2015/16. In India, all children aged six to fourteen are mandated to attend school, which is provided free of cost in public schools. However, despite tremendous progress in recent decades in improving access, participation rates are still low, particularly in rural regions and among lower castes and other disadvantaged groups.
Apart from the alarming dropout rates, India's education system is beset by high student-to-teacher ratios, underqualified instructors, and unsatisfactory learning results. Although much of the available comparable data is outdated, it demonstrates that India's framework has severe shortcomings. For example, according to the Annual Status of Education Report (ASER), more than half of fifth-grade students in rural schools cannot read a second-grade textbook or solve fundamental mathematical problems. This statistic suggests that some underlying issues are causing the rural education system to fail. Whether the cause is the lack of affordable schools, inadequate infrastructure, or merely the teaching methodology, we seek to understand the underlying causes and create software that can help students learn better.
In urban areas, education has advanced due to the introduction of newer teaching techniques; however, teaching techniques in rural India remain primitive and traditional. Rural schools continue to instil rote learning in their students,
Authorized licensed use limited to: Yarmouk University. Downloaded on July 17,2024 at 20:22:14 UTC from IEEE Xplore. Restrictions apply.
pandemic and have had to rely more on parents and siblings to study at home. The alarming findings of the Annual Status of Education Report 2020 could encourage the federal government and state governments to devise remedial measures. Furthermore, since physical classes were suspended with the lockdown in March, there has been a significant increase in students who are not enrolled, either because they dropped out or because admission was not necessary. Governments must be concerned that the digital divide has resurfaced. According to the study, 43.6 percent of students in government schools do not have access to a smartphone, whereas 67.3 percent of those who obtain learning materials in these institutions do so through WhatsApp, underscoring the importance of gadgets and connectivity. Just half of the children received study assistance at home, a third received materials from teachers, and approximately 60 percent used textbooks.
Even if schools choose a hybrid method of partial reopening and online instruction in the future, the ASER poll provides information that might aid the educational system in various ways. Expanding textbook availability to all students, including those who have dropped out or are awaiting formal admission, would assist parents and siblings in aiding learning. Bridging the divide on educational aids, including smartphones, will make it possible to transmit learning materials and hold personal tutorial sessions. Beyond these fundamentals, however, the educational system should make innovative use of current-year resources to expand learning. Students could learn a variety of topics by doing things themselves in the safety of the open countryside, with the help of teachers.
Ann Dowker published a paper [5] concluding that it is vital to understand more about the relationships between reading and arithmetic, in order to increase our understanding of the development of both arithmetic proficiency and reading comprehension, as well as of the factors that may influence the outcomes of reading and arithmetical difficulties. One hundred and four primary school students who received an individualised numeracy intervention (Catch Up Numeracy) were compared to 100 children who received matched-time instruction and 107 children who received business-as-usual instruction.
These children were evaluated both before and after the intervention. The Number Screening Test was used to assess arithmetic skills. To assess both the reading and comprehension components, the Salford Sentence Reading Test was used. Compared to the controls, those who received the intervention improved significantly more in numeracy but not in reading or comprehension. A link was discovered between numeracy, reading, and comprehension scores.

C. Background on Gamification in Education
The authors of this paper [6] have adopted quite a few approaches. They compare how many exercises a student has completed with the number completed by their classmates. Personalised 'ads' are shown to the students in the hope of motivating them. Weekly competitions are set up for students against students from other schools. Lastly, live competitions are set up with other schools. The data used was the results of the National Standardized exam, conducted nationwide for all fourth graders to judge educational outcomes. The data was heavily monitored, and low scores were accounted for with reasons such as the exam hall environment and the students' learning ability.
The findings indicate that the ConectaIdeas initiative had a sizable impact on math achievement in the trial group of students. The estimated effect is 0.22 standard deviations in the first specification, which does not account for baseline math achievement, and 0.27 standard deviations in the specification that does. We can also conclude from these results that language proficiency was not affected by the testing. This experiment shows a significant improvement in mathematics test results compared to other subjects and to the standard benchmarks. The results of the project indicate that adding gamification to the learning process can heighten learning capacity and help students overcome the barriers put forth by independent factors such as the low education level of parents, gender discrimination and lack of academic motivation, which also affect the overall academic standard of a student.

III. PROBLEM STATEMENT AND PROPOSED SOLUTION
Education is the gateway to the rest of the world, and no discussion of rural infrastructure would be complete without a look at how far we have progressed in opening this door for rural India's children. India has the world's second-largest education system, after China. However, the quality of education and access to it remain a problem in various sections of the country. Education is widely recognised for its role in supporting social and economic advancement. Access to education is essential for seizing new possibilities that occur as a result of economic expansion. Keeping this widely recognised reality in mind, education has received a lot of attention since independence, yet maintaining quality education in rural India has always been one of the most challenging tasks for governments.
The right to education is the fundamental right of every Indian citizen, whether they live in a high-profile society or in a remote, underdeveloped, secluded village. Although rural education in India is improving, the conditions in these schools remain deplorable. Rural areas have very few schools, and children must travel long distances to access these services; additionally, the majority of schools in these areas do not provide drinking water. In addition, education is of very poor quality. Teachers are paid very little, so they are frequently absent or do not teach properly.
In urban areas, education has advanced due to the introduction of newer teaching techniques; however, teaching techniques in rural India remain primitive and traditional. Rural schools continue to instil rote learning in their students. This must change. Newer learning methods, such as using learning aids and visual aids and introducing gamification, will serve as a way to motivate students and encourage them to be inquisitive about learning and studying.
There are a variety of learning tools available in India, such as the Byju's learning app, Khan Academy, EdApp,
Duolingo and Vedantu. Most of these tools provide learning material for students, quizzes and games based on some topics, personalized roadmaps, and in-depth analysis of the student's performance in the form of reports. The issue with these learning tools is that most of them are paid and do not provide the facility to upload learning material on which students can subsequently be tested.
In this paper we aim to develop a learning tool which allows teachers and educators to upload learning material so that questions and answers can be generated from this material. Material for either science or math can be presented in a PDF format so that the content can be parsed easily. Using Natural Language Processing, key points in the learning material will be used to generate Multiple Choice Questions (MCQs).
When students use this tool, they will first be presented with the learning material subdivided into easily digestible sections before attempting quizzes. These quizzes will be presented in a gamified format so that students are motivated to learn and take these quizzes. Their performance will be tracked and in-depth reports will be produced stating which topics the student scored well in and which topics the student did not score well in. For topics that the student did not score well in, more questions from that topic will be presented to the student. The teacher can also lay more emphasis on this topic and provide tutorials to the student. This learning tool will be made in the form of a web application so that it can be accessed on any device with an internet connection.
The question generation forms the main backbone of the project, and four main components have been identified for its implementation. They are as follows:
• Question Generation
• Distractor Generation
• Distractor Explanation
• Bag of Words
The proposed methodology for the components mentioned above is as follows.

A. Question Generation
The content from the textbook is summarised using a T5 based summariser. Once we have the summarised text, we can move on to creating questions. In this project, three different types of questions can be generated: single line questions, fill in the blanks and True/False questions.
We start by classifying each word by its type. This type can be a Verb, Adverb, Adjective, Noun or Pronoun. These classifications are used when searching for the definition of the distractor or keyword. In the model, we have used Python's NLTK library, along with the 'punkt', 'averaged_perceptron_tagger' and 'tagsets' resources and the NLTK tokeniser. Keywords in our context are any words vital for the generation of a question, i.e. the keyword acts as the crux of the question. Using this keyword, the candidate sentences are identified.
To generate single line questions, we use a T5 based question generation model that has been trained on the SQuADv1 dataset. We feed the candidate sentences to this model and receive the question and the answer as the outputs. This answer acts as the keyword for use in distractor generation.
Generating fill in the blanks is straightforward. After identifying candidate sentences, we replace the keyword with a blank and present that sentence as a fill in the blank.
The approach to generating True/False sentences is a little different. After getting the list of candidate sentences, we use the Berkeley Constituency parser to split the sentences at ending verbs and ending noun phrases. Once we have a list of candidate sentences and a corresponding list of incomplete sentences, we use the OpenAI GPT2 model to generate multiple alternative sentences from these incomplete sentences. The generated sentences could be very close to the original sentence in meaning and hence have a high probability of being True. To filter out similar sentences from the generated sentences, we use Sentence BERT. We pass the original candidate sentence and the generated sentences to Sentence BERT. Once we get the similarity scores, we discard the sentences having a similarity score greater than a threshold and keep the sentences with lower similarity scores. Finally, we keep only the 3 sentences with the lowest similarity scores as False sentences. After identifying candidate sentences and generating false sentences, we mark the candidate sentences with a "True" tag and the generated sentences with a "False" tag and randomly select some of these sentences to be included in the main quiz.

B. Distractor Generation
To accompany the correct answer, we generate three distractors, each of which is intended to reflect a misconception held by the student about the concept being tested by the question. There are three categories of distractors:
• Antonym - the correct answer is used as a parameter, generating its antonym. This indicates that the student is headed in the opposite direction of the answer.
• Bag of words - a bag of words related to the concept is generated, and one word from it is picked at random. This indicates that the student is in the general vicinity of the answer but not quite there.
• Random word - a word is generated at random. This indicates a complete misunderstanding of the concept.
Wordhoard was used to generate a list of antonyms, and the one with the shortest cosine distance from the original word (the correct answer) was selected. The random-words library was used to generate the third category of distractors. These, along with the corresponding explanations, can be used to analyse the student's understanding and assess which parts of the lesson the student must review or the teacher must explain differently.

C. Distractor Explanation
Once a question is generated, we create an explanation for each distractor. This explanation helps the user understand their shortcomings for a particular topic and helps them identify the flaw in their thinking or approach to a particular problem.
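As a minimal sketch of how the three distractor categories of Section III-B might be paired with the explanations described here, the Python fragment below builds one distractor per category and attaches a short note on the misconception it signals. The names `build_distractors` and `EXPLANATIONS`, the explanation wording, and the toy inputs are assumptions made for illustration only. The plain word lists and the `vectors` dictionary stand in for the thesaurus, word2vec and random-word lookups; no call to those libraries is shown.

```python
import math
import random

# Explanation text per distractor category (wording is illustrative only).
EXPLANATIONS = {
    "antonym": "Opposite of the correct answer: the reasoning is headed in the opposite direction.",
    "bag_of_words": "Related to the concept: in the general vicinity of the answer but not quite there.",
    "random": "Unrelated word: suggests the concept was misunderstood.",
}


def cosine_distance(u, v):
    """Return 1 - cos(angle) between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def build_distractors(answer, antonyms, related, vocabulary, vectors, rng):
    """Build one distractor per category, each paired with its explanation.

    All inputs are hypothetical stand-ins: `antonyms` for a thesaurus lookup,
    `related` for a list of similar words, `vocabulary` for a random-word
    source, and `vectors` for word embeddings of the answer and its antonyms.
    """
    # Antonym: the candidate with the shortest cosine distance to the answer.
    antonym = min(
        antonyms, key=lambda w: cosine_distance(vectors[w], vectors[answer])
    )
    # Bag of words: one related word picked at random.
    near_miss = rng.choice(related)
    # Random word: any vocabulary word not already used in this question.
    used = {answer, antonym, near_miss}
    random_word = rng.choice([w for w in vocabulary if w not in used])

    picks = [
        ("antonym", antonym),
        ("bag_of_words", near_miss),
        ("random", random_word),
    ]
    return [
        {"text": word, "category": cat, "explanation": EXPLANATIONS[cat]}
        for cat, word in picks
    ]
```

When a student selects a distractor, the stored `category` and `explanation` can be copied into the quiz report, so the report records what kind of misunderstanding a wrong answer points to as well as the fact that it was wrong.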
Identifying the keyword also identifies the context in which it is used, which other keywords are close to it, and how a question can be generated using this context. Keyword extraction is done by identifying the sentence in which the keyword exists. This is done by collecting the words before and after the keyword and identifying where a stopper such as a full stop, exclamation mark or question mark is present. A list of contexts for each keyword is thus found. Finally, once we have the list of keywords and their contexts, we must find the explanation for each keyword. This is used when generating a report for the quiz to help the user identify their shortcomings for a certain topic. Finding the meaning of a keyword is done using the PyDictionary and sklearn libraries.
Using the PyDictionary dict.meaning() function, we can get a list of all possible meanings for a word, for any context, i.e. the word as a noun, verb or adjective. By cross-referencing our previous classifying model, we extract only the meanings we need: if a word is a Noun, we extract only the Noun meanings generated. Even after filtering by the type of the word, a definition may not be entirely accurate for the context in which the word is being used. For example, when used as a noun, the word 'point' can be defined as either 'a geometric element that has position but no extension' or 'a brief version of the essential meaning of something'. Both meanings are technically correct, but only one of them fits our context. To choose between them, we use Cosine Similarity to determine the closeness of a definition of the keyword to its context. Cosine similarity finds the cosine of the angle between the two vectors projected in a 2D space, where the two vectors are the keyword definition and the context. One would assume that the similarity can be deduced from the number of words in common between the two texts, i.e. from their Euclidean distance. However, this becomes cumbersome when more extensive paragraphs of text are involved, due to the high number of words present.
Cosine Similarity sets that notion aside and focuses on the angle instead, so that two paragraphs are not judged similar or dissimilar merely by their number of shared words.
To visualise the cosine vector, we have the following space:

Fig. 1. Cosine Vector

Going by the 'number of similar words' basis, we see that the Euclidean distance between the paragraphs is vast and would falsely conclude that they are not similar, when in reality, taking the cosine of the angle between them shows that they are, in fact, very similar.
Hence, we can conclude that the smaller the angle between two paragraphs, the higher the cosine value and the closer the similarity.
In our context, we compare the list of definitions for a given keyword with the context of that keyword. When we compare the two, we may get a matrix as shown below:

Fig. 2. Initial Matrix

Fig. 3. Final Matrix

The diagonal of 1's results from comparing each text with itself. In the first row, where we compare the context with the definitions, we see that the cosine value for each of the two definitions is non-zero. Thus, based on the context, we take the larger value and use the corresponding definition as our true meaning for the given word.

D. Bag of Words
The NLTK word2vec model is used to create a bag of similar words. Using the Brown corpus, we first train the word embeddings. Then we load the pre-trained model and prune the binary model. After the model is trained, we save it to be used later. A pre-trained model is included in NLTK, and it is part of a larger model that was trained on 100 billion words from the Google News Dataset. The model has been trimmed to contain the most common terms (about 44k). Finally, the top 'n' words similar to a target word are generated.

IV. DATA
The question generation model uses the BERT Extractive summariser and the word2vec model to create distractors. To perform extractive summarisations, this tool makes use of the HuggingFace PyTorch transformers library. This is accomplished by first embedding the sentences and then running a clustering algorithm to find the sentences closest to the cluster centroids. No data is needed as such for the training of the model. The learning material in the text is directly given to the
model. This learning material can be excerpts from textbooks or the entire textbook itself in a text file. It can also be typed-out notes regarding a particular topic. The model extracts the sentences containing keywords. Questions are generated from these sentences and, based on the keyword, distractors in the form of an antonym, a synonym and a bag of similar words are used. An explanation for the distractors is also provided to analyse where the student has a misconception.

V. EXPERIMENTAL RESULTS
We have achieved the generation of three types of questions: Multiple Choice fill in the blanks, Multiple Choice single word answer, and True/False questions. We have observed that quality questions are generated for science subjects such as biology and chemistry, and especially for physics. This high accuracy is because keywords in middle school and high school science are available in most word embedding datasets and corpora, especially for physics. Biology tends to have some unique words, such as Mitochondria and Endoplasmic Reticulum, for which the distractors generated are not relevant.
As there was no reference text of multiple-choice questions, BLEU and ROUGE scores could not be used to evaluate the quality of questions; human intervention was required to determine question quality. This issue can be dealt with in the future by working with educators to obtain reference texts for their topics. However, BERT scores, an improved metric for text generation, have been used. BERT scores use BERT pre-trained embeddings and match the source and generated questions by performing pairwise cosine similarity. IDF (inverse document frequency) is then used as weights, and finally the three values of Precision, Recall and F1 are computed. We noticed that BERT scores correlate more with human judgement on a sentence level than simpler n-gram metrics such as BLEU and ROUGE. F1 scores range in value between 0 and 1, with 1 indicating perfect similarity and 0 indicating no similarity. As computed, we attained an F1 score of 0.913316 for the 'single word answer' questions, indicating that very accurate questions are being generated.
Alongside generating different questions and options, we also implemented an engaging gamification system. This system follows the traditional experience-points system of levelling up. Each student's level corresponds to the number of quizzes they have completed. The sense of accomplishment that comes with higher levels encourages students to continue to take quizzes to increase their scores. This system allows us to achieve the desired effect of making learning enjoyable, seen by the student not as a burden but as an achievement.
Another metric, more apt than BERT scores for automatic question generation quality, is answerability, which rates questions based on how answerable they are. This metric has been used for research purposes and is not commercially used. The implementation of the computation of the score has been taken from a cited paper. Essentially, answerability is a weighted sum of BLEU, ROUGE, METEOR and NIST. We attained a value of 0.6244, indicating good enough question quality with some scope for improvement.

VI. CONCLUSION AND FUTURE WORK
As a part of this project, we have created an application that can generate multiple-choice questions with distractors that indicate the concepts misunderstood by the student and a detailed analysis of which parts of the lesson they must review.
As a part of future work, one component of the application that can be improved upon is the PDF to text conversion module. The main reason is that this module does not yield a faithful conversion. The sentences can be misplaced and disordered. This results in sentences that are not always grammatically correct. Furthermore, extra garbage characters are introduced. A trivial form of preprocessing has been done to ensure extra blank lines and garbage characters do not persist in the converted text, but little can be done about the ordering of the text. Hence, the most grammatical questions are generated when we directly upload the learning material in a text file.
Furthermore, even when we get the proper text format for the input textbook and identify the keywords used for making the questions, the context of the keywords is not always captured. This is something which can be improved.
As future work, we could also generate a separate corpus for each domain/subject for better question and distractor generation. We could also extend and build on the current model to generate math questions.
Another area of improvement could be the topic modelling. We are determining the topics based on the format of the PDFs of NCERT textbooks. To allow other types of written material to be used for question generation (for example, notes made by the teacher), we could have a topic modelling module.

REFERENCES
[1] Kettip Kriangchaivech and Artit Wangperawong, "Question Generators by Transformers"
[2] Dhaval Swali et al., "Automatic Question Generation from Paragraph"
[3] Jinnie Shin, Qi Guo, and Mark J. Gierl, "Multiple-Choice Item Distractor Development Using Topic Modeling Approaches"
[4] The Hindu, "Gaps in learning: On rural students and the pandemic," article
[5] Ann Dowker, "Factors That Influence Improvement in Numeracy, Reading, and Comprehension in the Context of a Numeracy Intervention"
[6] "Does gamification in Education Work? An Experimental Evidence Report from Chile"