Advanced NLP Models For Technical University Information Chatbots: Development and Comparative Analysis
ABSTRACT Achieving quality education, one of the defined sustainable goals, requires providing information about the education system according to the stakeholders' requirements. Obtaining information about a university or institute is a critical stage in the academic journey of prospective students, who seek details about the specific courses that make that university or institute unique. The process begins with exploring general information about universities through websites, rankings, and brochures from various sources. Information from different sources is often inconsistent, and these discrepancies influence students' decisions. By addressing inquiries promptly and providing valuable information, universities can guide individuals in making informed choices about their academic future. A chatbot deployed on the university's website is an effective tool for this purpose. A chatbot is an artificially intelligent tool that can interact with humans and mimic a conversation. It can be implemented using advanced Natural Language Processing (NLP) models to provide pre-defined answers to students' queries. A chatbot is very helpful for query resolution during the counseling process of the institute because it provides official, uniform information and can be accessed 24 × 7. Therefore, the aim of this research work was to implement a chatbot using various NLP models and compare them to identify the best one. In this work, five chatbot models were implemented using neural networks, TF-IDF vectorization, sequential modeling, and pattern matching. The results show that the neural network-based models had better accuracy than the TF-IDF and pattern matching models, and that sequential modeling was the most accurate because it prevents over-fitting. Furthermore, adding an optimizer to a chatbot can improve the results, and pattern matching and semantic analysis should be part of a chatbot for real-time scenarios.
INDEX TERMS Conversational AI, natural language processing, artificial intelligence, chatbots, neural
networks, sequential modeling, pattern matching, semantic analysis.
2024 The Authors. This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 License. For more information, see https://creativecommons.org/licenses/by-nc-nd/4.0/
VOLUME 12, 2024
G. Attigeri et al.: Advanced NLP Models for Technical University Information Chatbots
the time, chatbots are accessed through the web browser. A chatbot is mainly used by asking it a question or query regarding a specific topic. Chatbots work on the principles of Artificial Intelligence (AI) and Natural Language Processing (NLP) to answer the user's queries, and a predefined knowledge base helps to develop a response to the question [1].
There are three major types of chatbots, namely Rule-based, Retrieval-based, and Generative-based models. In the Rule-based model, the bot responds to queries using pre-programmed rules. This kind of chatbot can answer a simple and limited set of questions. Retrieval-based models select an appropriate response from a group of pre-defined responses using limiting conditions (heuristics). Such a bot can understand the entire conversation and respond based on the context of the conversation. Lastly, Generative-based models generate responses from previous and current experiences. This highly sophisticated chatbot type requires complex computational models and vast data to train [2]. The captured question-and-answer data can be used to train the model, which is subsequently able to generate accurate responses. Usually, neural network based models such as the Recurrent Neural Network (RNN) and Long Short-Term Memory (LSTM) are more efficient for such a use case.
Queries are asked in a natural language, e.g. English. To generate appropriate responses, the query must be understood and analyzed, which is done by Natural Language Processing (NLP). Based on natural language analysis of the queries, three major types of queries can be defined for experimentation:
• Simple Query: A simple query consists of a single and unconstrained query desire (desired output) and has a single unconstrained query input. For example, in the query ‘‘What is the capital of USA?’’, the desired output of the query (i.e., Capital) is explicit, single, and not bound to any constraint. The input to the query (i.e. USA) is also single and unconstrained.
• Complex Query: A complex query consists of a single query desire, which can be either constrained or unconstrained, and the input to the query is multiple and explicit but can be constrained or unconstrained. For example, in the query ‘‘What was the capital of the USA during World War II?’’, the query has multiple inputs (i.e. USA and World War II) and the desired output is single, unconstrained, and implicit (i.e. Capital).
• Compound Query: A compound query is a query with a conjunction or disjunction operator connecting two simple or complex queries. For example, ‘‘What are the capitals of the USA and Germany?’’ is a compound query because it has ‘‘and’’ in it.
For developing conversational agents, the broader field of AI and natural language processing continues to evolve with advancements in machine learning, deep learning, neural networks, and other technologies. A knowledge base with specific patterns and responses can be generated in a structured way through a dedicated mark-up language. AIML stands for Artificial Intelligence Markup Language, an XML-based markup language meant for creating artificially intelligent applications. In 2021, Md Mabrur Husan Dihyat et al. [6] wrote that AIML uses pattern-matching techniques to formulate query answers. The basic unit of an AIML script is the category tag, which is formed by user input patterns and the chatbot responses to that input. The question is stored in the <pattern> tag inside each category, while the corresponding answer is stored in the <template> tag. The design comprises words, spaces, and wildcard symbols such as ∧ and ∗. Wildcard symbols are used to replace strings in AIML.
Specifically, the focus of the work is to use a conversational AI based intelligent chatbot, which is a good solution for answering students' specific queries regarding the admission process. It will provide 24 × 7 assistance, and the information will be uniform and precise [7]. The contributions of the paper are:
• Preparation of questions related to the counseling process
• Handling various forms of the same query using semantic analysis
• Capability to process all types of questions: simple to complex
• Implementation and analysis of chatbots using various technologies
II. LITERATURE SURVEY
The integration of chatbots into counseling services within engineering institutes has gained traction as a means to provide accessible and timely support to students facing academic, personal, or career-related challenges. Some of the chatbots used in the counseling process are summarized and compared in this section. A study by Davis and Smith [8] emphasized the potential of chatbots in counseling to overcome geographical and time constraints. The implementation of chatbots allowed students to seek guidance beyond traditional office hours, leading to increased accessibility. The proposed chatbot emphasizes the set of questions that are frequently asked during the counseling process. Research by Johnson and Lee [9] explored the role of chatbots in providing emotional support to engineering students. The study found that chatbots equipped with sentiment analysis capabilities effectively identified and responded to students' emotional states, contributing to a supportive environment. The proposed chatbot focuses on all types of questions that generally need to be answered before deciding to join a university for a course. This decision process is quite emotional for students and parents, hence answering the appropriate questions is a crucial task. Career-oriented chatbots were investigated by Patel et al. [10] for offering personalized career guidance to engineering students. Results indicated
that students who engaged with career-focused chatbots demonstrated a clearer understanding of their career paths and increased confidence in their choices. The proposed chatbot mainly focuses on the various preferences students weigh when deciding on a university for admission. The work of Chang and Wang [11] delved into privacy concerns associated with counseling chatbots. The study highlighted the importance of secure communication channels and transparent data handling practices to ensure the confidentiality of sensitive information shared during counseling sessions. User acceptance of counseling chatbots was explored by Yang and Liu [12]. Their research found a positive correlation between the user-friendliness of chatbots and students' willingness to engage. Clear communication of the chatbot's capabilities and limitations also played a crucial role in user acceptance.
Ranoliya et al. [13] have developed a chatbot for university-related FAQs. This chatbot was implemented using Artificial Intelligence Markup Language (AIML) and Latent Semantic Analysis (LSA). The authors need to try other ways of implementation, particularly when the dataset of questions increases.
Sharma et al. [14] discussed that ELIZA was created by a German computer scientist, Joseph Weizenbaum, in 1966. It is considered to be the first chatbot in computer history. ELIZA used ‘‘pattern matching’’ and substitution methodologies to simulate conversations.
In 2020, Adamopoulou and Moussiades [15] wrote that ELIZA responds like a psychotherapist by returning the user's query in an interrogative form. The downside of ELIZA is its limited knowledge, so it can only discuss a limited range of topics. Furthermore, ELIZA cannot maintain long conversations and cannot learn context from the conversation.
Khan and Raza Rabbani [16] implemented a chatbot using AI and NLP models for Islamic finance and banking customers. The authors used a traditional NLP model to implement the chatbot, and its performance is not compared with other methods.
Adamopoulou and Moussiades [15] discussed a chatbot called PARRY, created in 1971. PARRY is considered to be more evolved than ELIZA since it has a ‘‘personality’’ and a more effective control structure. In 2012, Sandeep A Thorat et al. [17] wrote that the ELIZA chatbot system also has language comprehension capabilities and can have variables like mistrust, anger, and fear.
Tiwari et al. [18] have implemented a chatbot using neural networks and NLP for COVID-19-related queries. A dataset of questions and answers was used to train the model and generate the responses. The neural network model requires a considerable dataset, and the queries are vague, so preparing the model and developing specific responses takes longer. Ranavare and Kamath [19] have implemented a chatbot for placement activity using the DialogFlow method. The proposed approach needs structured data handling per the pre-defined dialogs and cannot accommodate semantic queries.
Fryer and Carpenter [20] described Jabberwacky, written in CleverScript, an Artificial Intelligence tool. Adamopoulou and Moussiades [15] wrote that Jabberwacky was created in 1988 and used contextual pattern-matching algorithms to answer queries based on previous discussions.
In 2020, Shivang et al. [21] mentioned that Jabberwacky's main goal was to transition from a text-based system to a fully voice-driven system. Mathew et al. [22] have implemented an NLP-based personal learning assistant for school education. The chatbot proposed in that paper needs to cover the whole subject's contents, which can be achieved by enhancing the ontology and knowledge base.
In 2020, Verma et al. [21] implemented the Artificial Linguistic Internet Computer Entity (ALICE), which is a Natural Language Processing chatbot. This chatbot uses heuristic pattern-matching algorithms to conduct conversations. ALICE was written using Artificial Intelligence Markup Language (AIML), an XML-based schema for writing heuristic conversational rules.
In 2020, Adamopoulou and Moussiades [15] wrote that ALICE relied entirely on pattern-matching algorithms without recognizing the context of the entire conversation. Also, ALICE lacks intelligent traits and cannot generate human-like responses that express emotions and attributes.
Mittal et al. [23] have developed a Web-based chatbot for Frequently Asked Queries (FAQ) in hospitals. The authors used ML algorithms to train on the dataset and NLP methods for text processing. They used the gradient descent algorithm, but no comparison with other algorithms has been provided.
In 2020, Verma et al. [21] described Mitsuku, an intelligent chatbot created by Steve Worswick using AIML. Maher et al. [maher2020chatbots] implemented Mitsuku for a general type of conversation, interacting with the user through rules written in AIML. Mitsuku can also be integrated with social media platforms like Telegram or Twitter. The chatbot is hosted at Pandorabots and employs NLP with heuristic patterns. Mitsuku can retain and utilize large amounts of conversational history in future conversations.
Nguyen et al. [24] performed an empirical study of user interaction with a chatbot vs. a menu interface. The results conclude that chatbots provide lower user satisfaction than the menu interface due to the vague nature of queries and generated answers. The authors suggested that implementing chatbots should focus on perceived autonomy, perceived competence, and cognitive effort. Hence, various methods of implementation need to be compared.
In 2020, Verma et al. [21] wrote about Siri, a virtual assistant developed by Apple and launched in 2010. Siri uses a natural language interface that enables it to make recommendations and perform specific actions in response to voice queries. With use, Siri can adapt to the user's language usage and searches. Siri has many features,
including handling device settings, scheduling events, and searching queries online.
Adamopoulou and Moussiades [15] wrote about Siri's weaknesses. Siri's main disadvantage is that it depends on the internet to function. Siri can understand many languages, but many languages are not supported, while the navigational instructions are only available in English. Furthermore, Siri struggles to understand strong accents and commands in the presence of external noise.
Han and Lee [25] have implemented a FAQ chatbot for Massive Open Online Courses. The authors suggest a conceptual framework for chatbots and explain how it is essential to implement and integrate chatbots for conversation-centric tasks.
In 2019, Qaffas [26] wrote about IBM's Watson. Watson is a question-and-answer unit that can answer questions in natural language. Watson uses NLP and machine learning algorithms to extract insights from previous conversations. The downside of Watson is that it only supports English. Adamopoulou and Moussiades [15] wrote that Watson was created in 2011, and later ‘‘Watson Health’’ helped doctors diagnose diseases.
In 2020, Adamopoulou and Moussiades [15] wrote about Google Assistant, the next generation of Google Now. Google Now was created by Google in 2012 and gave responses based on users' preferences and locations. Google Assistant is a deeper artificial intelligence and has a friendlier user interface. The main disadvantages of Google Assistant are that it has no personality and violates the user's privacy because it is directly linked to their Google accounts.
In 2020, Adamopoulou and Moussiades [15] wrote about Cortana, a digital assistant developed by Microsoft in 2014. It understands voice instructions, identifies time and location, sends emails, creates reminders, and manages lists. Cortana has a significant flaw in that it can run software that installs malware.
In 2016, Abadi et al. [27] wrote that TensorFlow is an interface for expressing machine learning algorithms and an implementation for executing them. With few or no adjustments, TensorFlow computations can be conducted on various heterogeneous systems, ranging from mobile devices like phones and tablets to large-scale distributed systems. The framework is extensible and may be used to define multiple algorithms, such as deep neural networks and inference approaches.
In 2018, Qaiser et al. [28] wrote that Term Frequency and Inverse Document Frequency (TF-IDF) is a numerical statistic that indicates how relevant a keyword is to a specific document.
In 2015, Ahmed et al. [29] wrote that Artificial Intelligence Markup Language (AIML) has a syntax similar to Extensible Markup Language (XML) and is used for pattern-matching algorithms.
In 2018, Rani et al. [30] wrote that in IR systems, stop words are words with little or no semantic importance. Removing such common words can lead to more effective corpus indexing and boost the performance of an IR system.
In 2021, Ofer et al. [31] wrote that splitting text into atomic units of information in a selected language representation (called tokens) is known as tokenization. Although some approaches employ individual letters, most English NLP models use words as tokens. Individual-character tokens provide more versatility, particularly for out-of-vocabulary or misspelled words and languages lacking unambiguous word division.
Based on the literature, the following research gaps have been identified to formulate the research:
• The existing research does not identify the most suitable models for implementing chatbots. Hence, there is a need for implementation and comparative analysis of various models to select the best one.
• An implementation of a chatbot that considers all simple and complex queries related to a university/institution is largely missing.
• Existing chatbots lack domain information related to educational institutions. Hence, there is a need to generate an extensive question-answer repository for such chatbots.
• There is a requirement to implement conversational AI which considers domain knowledge and the semantics of questions while answering.
III. RESEARCH FORMULATION
Technical engineering colleges follow an online admission process along with counseling for providing information regarding various courses. Students and parents have many queries regarding the entrance process, which are clarified on calls or by visiting the university. There might be some miscommunication of information, or officials are often busy on other calls, which prevents other students from getting their queries resolved.
Engineering colleges follow an online admission process which consists of a counseling process and stream selection. During this phase, students and parents have various kinds of questions, such as:
• Queries regarding the college
• Queries regarding the branch they are about to select
• Queries regarding the students' options after college
• Queries regarding the placements in the final year
• Queries related to the various branches and the difference between them, along with questions related to the relatively new branches.
All the queries can be clarified by visiting the college or over a phone call with the officials. Due to the sheer volume
IV. METHODOLOGY
The development of university information chatbots involves defining objectives, understanding user requirements, selecting a suitable platform, integrating with university systems, and deploying across various channels. An overview of the developed solution is explained step-wise as follows:
• Preparation of Questions Related to Counseling Process: The solution begins with a meticulous preparation of questions related to the counseling process. This involves understanding the varied needs and concerns of users, including prospective students and parents. The question preparation phase includes input from counseling experts to ensure the chatbot is equipped to address a wide range of inquiries related to admissions, academic programs, career guidance, and support services.
• Handling Various Forms of the Same Query Using Semantic Analysis: To enhance the effectiveness of the chatbot, semantic analysis is employed to handle various forms of similar queries. Through natural language processing techniques, the chatbot is trained to recognize the semantic meaning behind different expressions of the same question. This enables the chatbot to provide consistent and accurate responses regardless of how users phrase their queries, ensuring a more user-friendly and efficient interaction.
• Capability to Process All Types of Questions: Simple to Complex: The developed solution ensures the chatbot's versatility in processing questions of varying complexity. Whether users have straightforward queries about admission deadlines or complex inquiries regarding academic policies, the chatbot is designed to comprehend and respond appropriately. The system is equipped with an extensive knowledge base and advanced algorithms to tackle a diverse set of questions, providing comprehensive support across the counseling spectrum.
• Implementation and Analysis of Chatbots Using Various Technologies: The implementation of the chatbot solution is characterized by the use of various technologies to optimize performance and user experience. Technologies such as natural language processing (NLP), machine learning algorithms, and possibly deep learning models are integrated to enhance the chatbot's understanding of user intent and context. The solution embraces a comparative analysis of different technologies to select the most suitable ones, considering factors like accuracy, scalability, and ease of integration with existing systems.
FIGURE 1. Methodology for chatbot development.
Figure 1 depicts the methodology for developing a chatbot. The dataset is created by collecting all the questions asked about a particular technical university from various social media portals and from the university's students and faculty. Answers to these questions are obtained from authorized sources at the university. The dataset has around 250 questions formed in different ways. The following methodology is used to design a chatbot from these questions and answers. In the first step, raw data is pre-processed and converted into a format that is easier and more effective for further processing steps. Pre-processing also normalizes the raw data in the dataset and reduces the number of features in the feature set. This decreases the complexity of fitting the data to each classification model.
The pre-processing steps are explained below:
• Converting to Lowercase: The raw text is changed to lowercase to avoid numerous variants of the same word. All the terms, regardless of their casing, are normalized to lowercase so they can be counted together.
• Tokenization: Tokenization is dividing a text stream into meaningful elements called tokens. Tokens can be words, sentences, or any other part of the sentence.
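The two pre-processing steps described here (lowercasing and tokenization) can be illustrated with a minimal Python sketch. It is only an illustration under stated assumptions: the function name `preprocess` and the sample question are hypothetical, and a simple regular-expression word tokenizer stands in for whichever tokenizer a real implementation would use.

```python
import re


def preprocess(text: str) -> list[str]:
    """Lowercase the raw text, then split it into word tokens.

    Assumption: a regex word tokenizer is used as a stand-in for a
    library tokenizer; punctuation is dropped as a side effect.
    """
    lowered = text.lower()  # "Fee" and "fee" are now counted together
    # \w+ matches runs of letters, digits, and underscores.
    return re.findall(r"\w+", lowered)


if __name__ == "__main__":
    # Hypothetical sample query, not taken from the paper's dataset.
    print(preprocess("What is the Fee Structure?"))
```

With this tokenizer, differently cased variants of the same question map to the same token list, which is the property the lowercasing step is meant to provide.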
Algorithm 1 Algorithm for Chatbot Using TensorFlow
data ← load data from JSON file
If the model has already been trained, load the variables from the pickle file
Initialize lists words, labels, docsx, docsy
while intent ∈ data do
    while pattern ∈ intent do
        wrds ← tokenize words in the pattern
        Append wrds to words
        Append wrds to docsx
        Append tag of the intent to docsy
    end while
    if tag ∉ labels then
        Append tag of the intent to labels
    end if
end while
Remove punctuation from ‘‘words’’
Stem and convert to lowercase every entry of ‘‘words’’ and store in ‘‘words’’
Sort ‘‘words’’ and ‘‘labels’’
Initialize lists: training, output
while sentence ∈ docsx do
    Initialize ‘‘bag’’ (bag of words)
    Stem every word in ‘‘sentence’’ and store in ‘‘wrds’’
    while word ∈ words do
        if word ∈ wrds then
            append 1 to bag
        else
            append 0 to bag
        end if
    end while
    outputrow ← list of zeros, one entry per label
    outputrow[labels.index(docsy[x])] = 1
    Append ‘‘bag’’ to ‘‘training’’
    Append outputrow to output
end while
Convert ‘‘training’’ and ‘‘output’’ to arrays and save the variables in the pickle file
Create a Deep Neural Network using tflearn. The size of the input layer is the same as the size of the bag of words. Then add 2 hidden layers (fully connected) of 8 neurons each. The size of the output layer is equal to the number of tags
Set ‘‘number of epochs’’ to 1000
Save the model
Now, all the words in the ‘‘words’’ list are stemmed using the LancasterStemmer().stem() function and all the duplicate words are removed using set(words). ‘‘words’’ and ‘‘labels’’ are then sorted. A bag of words, ‘‘bag’’ (initially an empty list), is created, where the size of ‘‘bag’’ is the number of root words in the database. For every word in the ‘‘words’’ list, if that word exists in the sentence, then 1 is appended to ‘‘bag’’; else 0 is appended to ‘‘bag’’.
A Deep Neural Network is created using TfLearn. The size of the input layer is equal to the size of the ‘‘bag.’’ The input to the neural network is ‘‘bag.’’ Two fully connected hidden layers of eight neurons each are added to the network. Fully connected layers mean that all possible connections are present, wherein every input of the input vector influences every output of the output vector. An output layer of size equal to the number of tags in the dataset is added to the network. The softmax activation function is applied to each neuron in the output layer. ‘‘n_epochs’’ is the number of times the model will see the same training data. In this model, ‘‘n_epochs’’ is set to 1000. The softmax activation function converts the output to a list of probabilities, with each value denoting the probability that the sentence belongs to the corresponding tag, and the sum of all probabilities equals 1. After the model is trained, the variables are stored in the data-pickle file. The training dataset is passed through the model as a bag of words, and the model is trained. When the user query is passed through the neural network, the tag with the highest probability is chosen, and the corresponding response is given to the user [32].
Algorithm 2 Algorithm for Chatbot Using PyTorch
Data = Load JSON Data
Initialize lists tags, xy, allwords
while intent ∈ data do
    Tag of the intent is stored in ‘‘tag’’
    while question ∈ intent[question] do
        xy ← xy + tokenized sentence
    end while
end while
Remove stop words
Apply stemming, remove all duplicates, sort it and add them to ‘‘tags’’
Initialize lists Xtrain, ytrain
while pattern ∈ xy do
    while tag ∈ xy do
        Create bag of words using ‘‘pattern’’ and ‘‘allwords’’ and store in ‘‘bag’’
        append ‘‘bag’’ to Xtrain
        append tag of intent to label
        append ‘‘label’’ to ytrain
    end while
end while
A Feed-Forward Neural Network is created using PyTorch with two hidden layers with ReLU activation functions. The output layer is of size equal to the number of tags in the database. The ReLU activation function is applied to the input and hidden layers. The tag with the highest probability is chosen. Training loss is calculated and printed after every 100 epochs, and the final loss is computed.
query = input from user
X = query is tokenized and converted to bag of words
output = query is passed through the model
if probability > 0.75 then
    if tag ∈ tags then
        Randomly print one of the responses in that tag
    end if
else
    Print that the bot does not understand the query
end if
B. SAM: NEURAL NETWORK USING PYTORCH
The explanation of Algorithm 2 is given in the following paragraphs. The data is stored in the intents.json file, which contains a list of intents. Each intent or class has a tag, a pattern, and a response. The ‘‘tag’’ defines the intent or class. The ‘‘pattern’’ is a list of possible questions for the corresponding class. The ‘‘response’’ is a list of possible answers to the questions of that ‘‘tag.’’ The chatbot will take the message from the user, identify the ‘‘tag’’ of the message, and give the corresponding response.
Pre-processing steps are applied to the data. Every question of every intent is tokenized using nltk.word_tokenize() and is appended to the ‘‘all_words’’ list. Every unique tag is stored in the ‘‘tags’’ list. Now, ‘‘all_words’’ is a list that contains all the tokenized words of the dataset, and ‘‘tags’’ is a list that contains all the tags of the database. All the punctuation tokens are removed; every word in the dataset is converted to lower case using the lower() function, and the words are stemmed using the PorterStemmer().stem() function from nltk. The duplicate words are removed using the set(all_words) function and the ‘‘all_words’’ list is sorted using the sorted(all_words) function. The ‘‘tags’’ list is also sorted. To create a bag of words,
a ‘‘bag’’ list is created. The ‘‘bag’’ list is of length equal to the size of the list ‘‘all_words’’, that is, the number of unique stemmed words in the database. The ‘‘bag’’ list is initialized with 0. For every word in ‘‘sentence,’’ its corresponding ‘‘index’’ in ‘‘all_words’’ is used to set bag[index] to 1.
A Feed-Forward Neural Network is created using torch.nn.Module, the base class for all neural network modules in PyTorch. A feed-forward neural network is an artificial neural network in which the connections between nodes do not form a cycle. One linear input layer of size equal to the ‘‘bag’’ list is created. Two hidden linear layers having eight neurons each are created. One output layer of size equal to the size of the ‘‘tags’’ list is formed. A ReLU activation function is defined. Training data is passed through the input layer; then, the activation function is applied. This data is fed to the first hidden layer, the activation function is applied, and the result is provided to the second hidden layer. The activation function is applied again, and this data is fed to the output layer. The learning rate is a vital hyperparameter that determines how fast the neural network converges to an optimum value. In this model, the value of the learning rate is 0.001. When the user query is passed through the neural network, the ‘‘tag’’ with the highest probability is chosen, and the response is given to the user.
C. BIG MOUTH: USING TF-IDF VECTORIZATION
‘‘Term Frequency’’ (TF) measures how often a term appears in a document [33]. Suppose there are 5000 words in document ‘‘T1,’’ and the word ‘‘alpha’’ appears ten times. The term frequency of ‘‘alpha’’ in document ‘‘T1’’ is then
TF = t/s
where t is the number of occurrences of the term in the document and s is the total number of words in the document:
TF = 10/5000 = 0.002
The inverse document frequency gives less weight to frequently occurring words and more weight to infrequently occurring words. For example, if we have ten documents and the term ‘‘alpha’’ appears in five of them, we may calculate the inverse document frequency as
IDF = log(M/m)
where M is the total number of documents in the corpus and m is the number of documents containing the required term. Using the base-10 logarithm:
IDF = log(10/5) = 0.301
The detailed steps of TF-IDF Vectorization are shown in
Algorithm 3 Algorithm for TF-IDF Vectorization
Data = Query input from the user
Output = Response from chatbot
sentTokens ← Sentence tokenized data from database
Append the query to sentTokens
tfidf ← TF-IDF vectorized data with stop words removed
vals ← Cosine Similarity between the user query (tfidf[−1]) and tfidf
reqTFIDF ← Maximum Cosine Similarity in vals, excluding the query itself
if reqTFIDF > 0 then
    return the corresponding response
else
    return ‘‘I do not understand..’’
end if
Algorithm 4 Algorithm for Greeting in TF-IDF Vector Chatbot
Data = Query from the user
Output = Response from the chatbot if the query is a greeting
greetingInput ← List of greeting input words
greetingOutput ← List of greeting output words
while word ∈ query do
    if lowerCase(word) ∈ greetingInput then
        return a random entry of greetingOutput
    end if
end while
The Term Frequency-Inverse Document Frequency (TF-IDF) vector stores the resulting score assigned to each word. The goal of the TF-IDF vector is to give higher scores to the words in the text that are more interesting (less common). Term Frequency is used to calculate the frequency of each word, whereas Inverse Document Frequency scales down the score of frequently occurring words.
The explanation of Algorithm 5 is given in the following paragraph. The corpus consists of full-stop-separated answers in the form of a text file. The corpus is loaded and tokenized using nltk.word_tokenize and nltk.sent_tokenize. The tokens are lemmatized using nltk.stem.WordNetLemmatizer().lemmatize(). Tokens that appear in string.punctuation (the set of punctuation characters) are removed. Input is taken from the user in the form of a ‘‘query’’. Greeting is implemented using the pattern matching algorithm. If the query contains any words from GREETING_INPUT (list of predefined greeting inputs), then the chatbot will return a random response from GREETING_OUTPUT (list of predefined greeting outputs). The ‘‘query’’ is tokenized and lemmatized and the result is appended to the list ‘‘sent_tokens’’. sklearn is the library used for TF-IDF vectorization and to calculate the cosine similarity. Cosine similarity is calculated between every
Algorithm 3 and Algorithm 4. sentence from the corpus and the user query, the sentence
In 2020, Abhishek Jaglan et al. [34] authors wrote that having the highest cosine similarity is given as the output
textual data can not be employed in the model directly, instead (cosine similarity values are sorted in descending order and
it has to be converted to numerical vectors. This can be the first value is selected) [35].
done by assigning a unique number to each word, and given
data can be encoded with the length of vocabulary of known D. HERCULES: USING SEQUENTIAL MODELING
words. The Bag-of-Words model is a way of representing The data is stored in intents. json file and contains a list
whether the words exists in the ‘‘sentence’’ or not regardless of intents. Each intent or class has a tag, a pattern, and a
of their sequence of appearance. response. The ‘‘tag’’ defines the intent or class. The ‘‘pattern’’
Algorithm 5 Algorithm for Chatbot Using IF-IDF Vectoriza- Algorithm 6 Algorithm for Chatbot Using Sequential
tion Modeling
data = load data.txt data = load data from JSON file
senttokens = sentence tokenization of data Implemet Lemmatization
wordtokens = word tokenization of the text Initialize lists words, classes, docx , docy
Remove punctuations from the text while intent ∈ data do
Initialize lists GREETINGINPUT , GREETINGOUTPUT while pattern ∈ intent do
with sample greeting wrds = tokenize words in the pattern
inputs and outputs Append "wrds" to words
while word ∈ sentence.split do Append "wrds" to docx
if word ∈ GREETINGSINPUT then Append "tag" to docy
Print out a random GREETINGSOUTPUT end while
end if if tag ∈/ labels then
end while Append "tag" to labels
userInput = Input from the user end if
userresponse = tokenized user query end while
Append userresponse to senttokens remove punctuations from ‘‘words’’
tfidfVec = Create tfidf Vector of senttokens words = stemed "words" converted to lowercase
vals = Cosine Similarity between "senttokens " and "userresponse " sort ‘‘words’’ and ‘‘labels’’
idx = Id of the sentence with the highest Cosine Similarity Initialize lists: training, output
if reqtfidf = 0 then while sentence ∈ docsx do
Print ‘‘I am sorry, I didnt understand you’’ Initialize "bag"(bag of words)
else wrds = stem every word in "sentence"
Print senttokens [idx] while word ∈ words do
end if if word ∈ wrds then
append 1 to bag
else
append 0 to bag
end if
is a list of possible questions of the corresponding class. The end while
Append "bag" to training
‘‘response’’ is a list of possible answers to the questions of
that ‘‘tag.’’ The chatbot will take the message from the user, outputrow [labels.index(docsy [x])] = 1
Append "bag" to training
identify the ‘‘tag’’ of the message, and give the corresponding Append outputrow to output
response. end while
create a Sequential model with softmax activation function
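The retrieval step of Algorithm 5 can be illustrated with a minimal Python sketch that uses only the standard library. It assumes the plain definitions TF = t/s and IDF = log10(M/m) from Section C rather than sklearn's vectorizer, whose weighting differs in detail, so the scores will not match the authors' implementation. The helper names and the three-sentence corpus are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter

FALLBACK = "I am sorry, I didn't understand you"

def term_frequency(t, s):
    """TF = t / s: occurrences of a term over total words in the document."""
    return t / s

def inverse_document_frequency(M, m):
    """IDF = log10(M / m): M documents in total, m of them contain the term."""
    return math.log10(M / m)

def tokenize(text):
    """Lowercase, split on whitespace, and strip surrounding punctuation."""
    words = (w.strip(".,!?;:()\"'") for w in text.lower().split())
    return [w for w in words if w]

def tfidf_vectors(docs):
    """Return one sparse {term: weight} dictionary per document."""
    tokenized = [tokenize(d) for d in docs]
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append({
            term: term_frequency(count, len(tokens))
            * inverse_document_frequency(len(docs), doc_freq[term])
            for term, count in counts.items()
        })
    return vectors

def cosine_similarity(a, b):
    """Cosine between two sparse vectors; 0.0 if either has zero length."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)

def respond(sent_tokens, query):
    """Return the corpus sentence most similar to the query, or a fallback."""
    vectors = tfidf_vectors(sent_tokens + [query])  # the query is appended last
    query_vec = vectors[-1]
    # Score only the corpus sentences, so the query never matches itself.
    scores = [cosine_similarity(vec, query_vec) for vec in vectors[:-1]]
    best = max(range(len(scores)), key=scores.__getitem__)
    return sent_tokens[best] if scores[best] > 0 else FALLBACK

# Hypothetical corpus of full-stop-separated answers.
corpus = [
    "The hostel fee is paid every semester.",
    "Admissions open in June.",
    "The library is open until midnight.",
]
print(respond(corpus, "When do admissions open?"))
print(respond(corpus, "xyzzy"))
```

Scoring the query only against the corpus sentences avoids the trivial match of the query with itself, which would otherwise always have a cosine similarity of 1.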
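The data-preparation loop of Algorithm 6 can be sketched in standard-library Python as follows. This is an illustration under stated assumptions, not the authors' code: lowercasing stands in for the stemming step (which would need nltk), and the two intents below are hypothetical examples in the shape described for intents.json.

```python
import string

# Hypothetical intents in the shape described for intents.json.
intents = [
    {"tag": "greeting", "patterns": ["Hello there", "Good morning"]},
    {"tag": "fees", "patterns": ["What is the hostel fee?", "Tell me the tuition fee"]},
]

def tokenize(text):
    """Lowercase the text, drop punctuation, and split into words.

    Lowercasing is a stand-in for the stemming step of Algorithm 6.
    """
    cleaned = text.lower().translate(str.maketrans("", "", string.punctuation))
    return cleaned.split()

def build_training_data(intents):
    """Build the vocabulary, labels, bag-of-words rows, and one-hot output rows."""
    words, labels, docs_x, docs_y = [], [], [], []
    for intent in intents:
        for pattern in intent["patterns"]:
            tokens = tokenize(pattern)
            words.extend(tokens)
            docs_x.append(tokens)
            docs_y.append(intent["tag"])
        if intent["tag"] not in labels:
            labels.append(intent["tag"])
    words = sorted(set(words))   # remove duplicates, then sort
    labels = sorted(labels)

    training, output = [], []
    for tokens, tag in zip(docs_x, docs_y):
        # 1 if the vocabulary word occurs in this pattern, else 0.
        bag = [1 if word in tokens else 0 for word in words]
        # One-hot row marking the tag of this pattern.
        output_row = [0] * len(labels)
        output_row[labels.index(tag)] = 1
        training.append(bag)
        output.append(output_row)
    return words, labels, training, output

words, labels, training, output = build_training_data(intents)
print(words)
print(training[0], output[0])
```

Each row of training has one entry per vocabulary word and each row of output has one entry per tag, which are the input and output sizes of the network trained on them.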
The explanation of Algorithm 6 is given in the following paragraphs. Every question of every intent is tokenized using nltk.word_tokenize() and is appended to the ‘‘words’’ list. All the tags are stored in the ‘‘labels’’ list. ‘‘words’’ contains all the words in the database. Every word in ‘‘words’’ is converted to lowercase using the lower() function. Next, all the words in the ‘‘words’’ list are lemmatized using the WordNetLemmatizer().lemmatize() function, and all the duplicate words are removed using set(words). Both ‘‘words’’ and ‘‘labels’’ are sorted. A bag of words is created with the variable name ‘‘bag,’’ where the size of ‘‘bag’’ is the number of root words in the database. For every word in the ‘‘words’’ list, if that word exists in the sentence, then 1 is appended to ‘‘bag’’; else, 0 is appended to ‘‘bag’’.
In 2014, Srivastava et al. [36] wrote that dropout is a technique that prevents overfitting and provides a way of efficiently combining different neural networks. The term ‘‘dropout’’ refers to dropping out units randomly from the hidden or visible layers in the neural network. Dropping a unit temporarily removes it from the network, together with its incoming and outgoing connections, by setting its weights to zero.
A neural network with three layers is created. A dense (fully connected) input layer whose input size equals the ‘‘bag’’ size is created with the ReLU activation function. Dropout is used, which drops 50% of the units. A fully connected hidden layer of 64 neurons is created, and the ReLU activation function is applied. Dropout is used again, which drops 30% of the units. A dense output layer whose size equals the number of tags in the database is created with the Softmax activation function. The Softmax activation function converts the output to a list of probabilities, wherein each value denotes the likelihood of the sentence belonging to the corresponding tag, and the sum of all probabilities equals 1. The model is optimized using ‘‘Adam.’’ The Adam optimizer is an adaptive learning rate method, which computes individual learning rates for different parameters. When the user query is passed through the neural network, the tag with the highest probability is chosen, and the corresponding response is given to the user.

E. ALICE: USING AIML
In AIML, categories are the basic unit of knowledge. Each category has a pattern and a template. The pattern describes the query, and the template describes the chatbot’s responses. The template tag can hold a list of possible responses for the chatbot to choose from, and it will randomly give one response.
There are two types of AIML categories:
• Atomic category: an AIML category whose pattern must match the query exactly. This type of category does not contain any wildcards.
<category>
  <pattern>Good Morning</pattern>
  <template>Good Morning to you too!</template>
</category>
• Default category: wildcard symbols such as ^ and * are used in the pattern. The * wildcard captures one or more words.
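The difference between an atomic pattern and a pattern containing the * wildcard can be illustrated with a small Python matcher. This is only a sketch of the idea under simplifying assumptions: it tries the categories in order and returns the first match, which is not how a full AIML interpreter works. The second category and the respond helper are hypothetical.

```python
import random
import re

# Hypothetical categories: (pattern, list of possible templates).
CATEGORIES = [
    ("GOOD MORNING", ["Good Morning to you too!"]),                      # atomic
    ("WHAT IS THE FEE FOR *", ["Please see the fee page for {star}."]),  # wildcard
]
FALLBACK = "I do not understand."

def normalize(text):
    """Uppercase the text, remove punctuation, and collapse whitespace."""
    return " ".join(re.sub(r"[^\w\s]", "", text).upper().split())

def compile_pattern(pattern):
    """Translate a pattern into a regex in which '*' captures one or more words."""
    parts = [r"(\S+(?: \S+)*)" if token == "*" else re.escape(token)
             for token in pattern.split()]
    return re.compile("^" + " ".join(parts) + "$")

def respond(query):
    """Return a random template of the first category whose pattern matches."""
    text = normalize(query)
    for pattern, templates in CATEGORIES:
        match = compile_pattern(pattern).match(text)
        if match:
            # Atomic patterns have no capture group, so the star stays empty.
            star = match.group(1) if match.groups() else ""
            return random.choice(templates).format(star=star.title())
    return FALLBACK

print(respond("good morning!"))
print(respond("What is the fee for computer science?"))
print(respond("What is the fee for"))
```

The last query falls through to the fallback because the wildcard must capture at least one word.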
V. RESULT ANALYSIS
The results of all the implemented chatbots are presented in terms of accuracy and validation. The first section contains the confusion matrices and accuracy, calculated on a sample test dataset. In the second section, 150 queries were run against all the chatbots, and their responses were observed to check whether they categorized each query correctly. The last section contains screenshots of the conversations with the bots.

A. CONFUSION MATRICES AND ACCURACY BASED ON SAMPLE TEST DATASET
A test dataset of 144 queries about one university is used to test the chatbot models. Confusion matrices and accuracies are calculated using the sklearn metrics library (the confusion_matrix and accuracy_score functions).
In this model, the neural network was created using TensorFlow, and multiple pre-processing steps were applied. The Lancaster stemming algorithm was used in the pre-processing phase, which is more accurate. Furthermore, the softmax activation function is applied to the output layer, increasing the neural network’s performance. Table 1 shows the confusion matrix of Smart Bot.

TABLE 1. Confusion matrix of all the chatbots.

In this model, the neural network was created using PyTorch, and the ReLU activation function was applied to the input and hidden layers. This affected the performance of the model. Table 1 shows the confusion matrix of Sam.
In this model, no neural network is created. Instead, TF-IDF vectorization converts every sentence into a vector, and cosine similarity measures the similarity between every sentence and the query. This model does not need to understand the meaning of the query; it simply finds the most similar sentence. Table 1 shows the confusion matrix of Big Mouth.
In this model, sequential modeling is used to create the neural network, which was designed to prevent overfitting; this improves the model’s performance. Table 1 shows the confusion matrix of Hercules.
In this model, AIML is used to create pattern-matching rules. No computation is done here; the query is matched against the predefined rules. The programmer needs a deep understanding of AIML functionality to get acceptable results. Table 1 shows the confusion matrix of ALICE.

B. QUERY ANALYSIS ON CHATBOTS
One hundred fifty simple queries were created, and 15 of them had spelling mistakes. All these queries were run against all the chatbots, and their responses were observed to check whether they categorized each question correctly. Figure 3 shows the number of queries correctly answered by each chatbot along with the accuracy of each model. Table 2 depicts the various queries and how many questions were correctly answered by each model.

1) SMART BOT
As shown in Figure 4, at the 1000th epoch the training loss of the model is 0.35567 and the accuracy is 0.9738. If the number of epochs is increased to 1500, the training loss reduces to 0.15079 and the accuracy increases to 0.9949. On the other hand, if the number of epochs is reduced to 500, the training loss becomes 0.24558 and the accuracy becomes 0.9817. It can be observed that increasing the number of epochs increases the accuracy of the chatbot and decreases the training loss.

2) SAM
At the 1000th epoch, the training loss of the chatbot is 0.0003, as shown in Figure 6. If the number of epochs is increased to 1500, the training loss decreases to 0.0001. On the other hand, if the number of epochs is reduced to 500, the training loss rises to 0.0017. It can be observed that increasing the number of epochs decreases the training loss.
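For reference, the two metrics used in Section V-A can be written out in a few lines of standard-library Python. The sketch below mirrors what sklearn's confusion_matrix and accuracy_score compute (rows are true tags, columns are predicted tags); the function names, tags, and predictions are hypothetical and are not taken from the 144-query test set.

```python
def build_confusion_matrix(y_true, y_pred, labels):
    """Count (true, predicted) pairs; rows are true tags, columns predicted tags."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for true_tag, predicted_tag in zip(y_true, y_pred):
        matrix[index[true_tag]][index[predicted_tag]] += 1
    return matrix

def accuracy(y_true, y_pred):
    """Fraction of queries whose predicted tag equals the true tag."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)

# Hypothetical tags and predictions for six test queries.
labels = ["admission", "fees", "hostel"]
y_true = ["fees", "fees", "hostel", "admission", "hostel", "admission"]
y_pred = ["fees", "hostel", "hostel", "admission", "hostel", "fees"]

matrix = build_confusion_matrix(y_true, y_pred, labels)
for label, row in zip(labels, matrix):
    print(label, row)
print("accuracy:", round(accuracy(y_true, y_pred), 4))
```

The diagonal of the matrix holds the correctly classified queries, so the accuracy equals the sum of the diagonal divided by the number of queries.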
D. TIME COMPLEXITY OF CHATBOTS
The time complexity of chatbots implemented using neural networks (NN) and natural language processing (NLP) can vary depending on the specific architecture, algorithms, and models employed. The time complexity of the different components is as follows:
• Natural Language Processing (NLP): O(n)
• Neural Networks (NN): O(e * n * h)
• Response Generation: O(1)

VI. CONCLUSION
Engineering colleges follow an online admission process that involves a counseling process for engineering stream selection. During the counseling phase, students and parents have many queries regarding the branches offered by the college and other such matters. These questions can be answered by visiting the college or over a phone call. The volume of queries can be overwhelming, and as a result, information may be miscommunicated, or the officials may often be busy on other calls. Students might have to rely on unofficial sources like Quora to get information. Furthermore, the students have to navigate through the entire website for data, which can be tedious.
In this paper, five chatbot models were created using neural networks, TF-IDF vectorization, and pattern matching. In neural network-related models, pre-processing steps like converting to lowercase, stemming, lemmatization, tokenization, removing stop words, and creating a ‘‘bag of words’’ are applied to the training data before passing it through the neural network. A query is taken from the user; the pre-processing steps are applied to it, and it is passed through the model, which returns the list of probabilities that the query belongs to each intent.
Hercules performs best among the five chatbots discussed in the project because it uses sequential modeling designed to prevent overfitting the training data. Furthermore, it is the only chatbot with an optimizer applied to it, improving its performance. Therefore, it can be concluded that a chatbot similar to Hercules can be implemented in real time for university/institute counseling. This will be very helpful for the students because it can provide official and accurate results. Furthermore, it can give 24 × 7 assistance, and the information provided will be uniform. Lastly, students will be able to rely on reliable information sources to resolve their queries.
Future work for integrating ChatGPT into counseling could focus on enhancing emotional intelligence, enabling dynamic learning and adaptation, exploring multimodal interactions, addressing privacy concerns, ensuring cultural sensitivity, integrating with counseling resources, implementing a continuous user feedback mechanism, prioritizing accessibility features, optimizing scalability for concurrent interactions, and conducting rigorous evaluation studies for efficacy validation. These developments aim to make ChatGPT a more personalized, responsive, and inclusive tool in the counseling process.

REFERENCES
[1] T. Lalwani, S. Bhalotia, A. Pal, V. Rathod, and S. Bisen, ‘‘Implementation of a chatbot system using AI and NLP,’’ Int. J. Innov. Res. Comput. Sci. Technol. (IJIRCST), vol. 6, no. 3, pp. 26–30, 2018.
[2] J. Thukrul, A. Srivastava, and G. Thakkar, ‘‘Doctorbot—An informative and interactive chatbot for COVID-19,’’ Int. Res. J. Eng. Technol. (IRJET), vol. 7, no. 7, pp. 3033–3036, 2020.
[3] S. Maher, ‘‘Chatbots & its techniques using AI: A review,’’ Int. J. Res. Appl. Sci. Eng. Technol., vol. 8, no. 12, pp. 503–508, Dec. 2020.
[4] M. Aleedy, H. Shaiba, and M. Bezbradica, ‘‘Generating and analyzing chatbot responses using natural language processing,’’ Int. J. Adv. Comput. Sci. Appl., vol. 10, no. 9, 2019.
[5] P. Qi, Y. Zhang, Y. Zhang, J. Bolton, and C. D. Manning, ‘‘Stanza: A Python natural language processing toolkit for many human languages,’’ 2020, arXiv:2003.07082.
[6] M. M. H. Dihyat and J. Hough, ‘‘Can rule-based chatbots outperform neural models without pre-training in small data situations? A preliminary comparison of AIML and Seq2Seq,’’ in Proc. 25th Workshop Semantics Pragmatics Dialogue, 2021, pp. 22–26.
[7] A. Chandan, M. Chattopadhyay, and S. Sahoo, ‘‘Implementing chat-bot in educational institutes,’’ IJRAR J., vol. 6, no. 2, pp. 44–47, 2019.
[8] D. Davis and J. Smith, ‘‘The potential of chatbots in counseling,’’ J. Counsel., vol. 5, no. 2, pp. 123–136, Apr. 2019.
[9] R. Johnson and S. Lee, ‘‘Chatbots providing emotional support to engineering students,’’ in Proc. IEEE Eng. Educ. Conf., Austin, TX, USA, 2020, pp. 45–50.
[10] A. Patel, Personalized Career Guidance for Engineering Students. Springer, 2021.
[11] Y. Chang and W. Wang, ‘‘Privacy concerns in counseling chatbots,’’ in Proc. IEEE Int. Conf. Comput. Commun., Paris, France, 2018, pp. 234–239.
[12] H. Yang and Q. Liu, ‘‘User acceptance of counseling chatbots,’’ J. Comput., vol. 8, no. 3, pp. 210–225, May 2019. [Online]. Available: https://www.example-url.com
[13] B. R. Ranoliya, N. Raghuwanshi, and S. Singh, ‘‘Chatbot for university related FAQs,’’ in Proc. Int. Conf. Adv. Comput., Commun. Informat. (ICACCI), Sep. 2017, pp. 1525–1530.
[14] V. Sharma, M. Goyal, and D. Malik, ‘‘An intelligent behaviour shown by chatbot system,’’ Int. J. New Technol. Res., vol. 3, no. 4, 2017, Art. no. 263312.
[15] E. Adamopoulou and L. Moussiades, ‘‘Chatbots: History, technology, and applications,’’ Mach. Learn. Appl., vol. 2, Dec. 2020, Art. no. 100006.
[16] S. Khan and M. R. Rabbani, ‘‘Artificial intelligence and NLP-based chatbot for Islamic banking and finance,’’ Int. J. Inf. Retr. Res., vol. 11, no. 3, pp. 65–77, Jul. 2021.
[17] C. Curry and J. D. O’Shea, ‘‘The implementation of a story telling chatbot,’’ Adv. Smart Syst. Res., vol. 1, no. 1, p. 45, 2012.
[18] V. Tiwari, L. K. Verma, P. Sharma, R. Jain, and P. Nagrath, ‘‘Neural network and NLP based chatbot for answering COVID-19 queries,’’ Int. J. Intell. Eng. Informat., vol. 9, no. 2, pp. 161–175, 2021.
[19] S. S. Ranavare and R. Kamath, ‘‘Artificial intelligence based Chatbot for placement activity at college using DialogFlow,’’ Our Heritage, vol. 68, no. 30, pp. 4806–4814, 2020.
[20] L. Fryer and R. Carpenter, ‘‘Bots as language learning tools,’’ Lang. Learn. Technol., vol. 10, no. 3, pp. 8–14, 2006.
[21] S. Verma, L. Sahni, and M. Sharma, ‘‘Comparative analysis of chatbots,’’ in Proc. Int. Conf. Innov. Comput. Commun. (ICICC), 2020, pp. 67–78.
[22] A. N. Mathew, V. Rohini, and J. Paulose, ‘‘NLP-based personal learning assistant for school education,’’ Int. J. Electr. Comput. Eng. (IJECE), vol. 11, no. 5, pp. 4522–4530, Oct. 2021.
[23] M. Mittal, G. Battineni, D. Singh, T. Nagarwal, and P. Yadav, ‘‘Web-based chatbot for frequently asked queries (FAQ) in hospitals,’’ J. Taibah Univ. Med. Sci., vol. 16, no. 5, pp. 740–746, Oct. 2021.
[24] Q. N. Nguyen, A. Sidorova, and R. Torres, ‘‘User interactions with chatbot interfaces vs. menu-based interfaces: An empirical study,’’ Comput. Hum. Behav., vol. 128, Mar. 2022, Art. no. 107093.
[25] S. Han and M. K. Lee, ‘‘FAQ chatbot and inclusive learning in massive open online courses,’’ Comput. Educ., vol. 179, Apr. 2022, Art. no. 104395.
[26] A. A. Qaffas, ‘‘Improvement of chatbots semantics using Wit.Ai and word sequence kernel: Education chatbot as a case study,’’ Int. J. Mod. Educ. Comput. Sci., vol. 11, no. 3, pp. 16–22, Mar. 2019.
[27] M. Abadi et al., ‘‘TensorFlow: Large-scale machine learning on heterogeneous distributed systems,’’ 2016, arXiv:1603.04467.
[28] S. Qaiser and R. Ali, ‘‘Text mining: Use of TF-IDF to examine the relevance of words to documents,’’ Int. J. Comput. Appl., vol. 181, no. 1, pp. 25–29, Jul. 2018.
[29] I. Ahmed and S. Singh, ‘‘AIML based voice enabled artificial intelligent chatterbot,’’ Int. J. u-e-Service, Sci. Technol., vol. 8, no. 2, pp. 375–384, Feb. 2015.
[30] R. Rani and D. K. Lobiyal, ‘‘Automatic construction of generic stop words list for Hindi text,’’ Proc. Comput. Sci., vol. 132, pp. 362–370, Jan. 2018.
[31] D. Ofer, N. Brandes, and M. Linial, ‘‘The language of proteins: NLP, machine learning & protein sequences,’’ Comput. Structural Biotechnol. J., vol. 19, pp. 1750–1758, 2021.
[32] G. Sperlí, ‘‘A cultural heritage framework using a deep learning based chatbot for supporting tourist journey,’’ Expert Syst. Appl., vol. 183, Nov. 2021, Art. no. 115277.
[33] S. D. Nithyanandam, S. Kasinathan, D. Radhakrishnan, and J. Jebapandian, ‘‘NLP for chatbot application: Tools and techniques used for chatbot application, NLP techniques for chatbot, implementation,’’ in Deep Natural Language Processing and AI Applications for Industry 5.0. Hershey, PA, USA: IGI Global, 2021, pp. 142–168.
[34] A. Jaglan, D. Trehan, M. Megha, and P. Singhal, ‘‘COVID-19 trend analysis using machine learning techniques,’’ Int. J. Sci. Eng. Res., vol. 11, no. 12, pp. 1162–1167, Dec. 2020.
[35] M. A. Al Muid, M. M. Reza, R. B. Kalim, N. Ahmed, M. T. Habib, and M. S. Rahman, ‘‘EduBot: An unsupervised domain-specific chatbot for educational institutions,’’ in Artificial Intelligence and Industrial Applications: Artificial Intelligence Techniques for Cyber-Physical, Digital Twin Systems and Engineering Applications. Cham, Switzerland: Springer, 2021, pp. 166–174.
[36] N. Srivastava, G. Hinton, A. Krizhevsky, I. Sutskever, and R. Salakhutdinov, ‘‘Dropout: A simple way to prevent neural networks from overfitting,’’ J. Mach. Learn. Res., vol. 15, pp. 1929–1958, Sep. 2014. [Online]. Available: http://jmlr.org/papers/v15/srivastava14a.html

GIRIJA ATTIGERI (Member, IEEE) received the B.E. and M.Tech. degrees from Visvesvaraya Technological University, Karnataka, India, and the Ph.D. degree from Manipal Institute of Technology, Manipal Academy of Higher Education, Manipal. She has 18 years of teaching and research experience in reputed institutes of Karnataka. She is currently an Associate Professor with the Department of Information and Communication Technology, Manipal Institute of Technology, Manipal. She has more than 16 publications in reputed international conferences and journals. She has conducted several seminars and workshops on big data and machine learning. She is working on several projects related to data analytics in health care, education, and agriculture. Her research interests include big data analytics, artificial intelligence, machine learning, deep learning, and semantic web.

ANKIT AGRAWAL received the bachelor’s degree in computer and communication engineering from Manipal Institute of Technology, Manipal. He is currently a working professional in the IT industry. With Manipal Institute of Technology, he got the opportunity to write a research article and create a project that will solve a real-world problem for colleges and their applicants. He has always been intrigued by the concept of artificial intelligence and is grateful to the college and his mentors for providing him with the freedom and guidance to implement the project and use his capabilities to the maximum.

SUCHETA V. KOLEKAR (Member, IEEE) received the Ph.D. degree in adaptive e-learning from Manipal Academy of Higher Education, Manipal, Karnataka, India. She is currently an Associate Professor with the Department of Information and Communication Technology, MIT, Manipal Academy of Higher Education. She has around 15 years of experience in the field of teaching and research. She has published more than 20 papers in national and international journals/conference proceedings. Her primary research interests include E-learning, web usage mining, human–computer interaction, serious game development, and cloud computing. She received the E-learning Excellence Award, in 2017, from Academic Conferences International for her research work in adaptive E-learning. She, along with her student team, has designed and developed a novel browser extension to capture the usage data of online courses provided by Coursera. She is one of the inventors of the patent called ‘‘Smart sole-based diabetic foot ulcer prediction system,’’ which was granted by Chennai Patent Office, India. She handles additional responsibility with the institute to promote and enhance innovation and entrepreneurship culture.