Retrieval Augmented Generation (RAG) Based Restaurant Chatbot With AI Testability
6 authors, including:
Jerry Gao
San Jose State University
All content following this page was uploaded by Jerry Gao on 16 June 2024.
Abstract—Post-COVID, the restaurant industry is experiencing a surge in demand, presenting the challenge of efficiently managing increased customer flow while ensuring seamless interactions. Chatbots have emerged as an innovative solution to meet this increased demand. This paper addresses the enhancement of AI chatbots through the integration of Retrieval-Augmented Generation (RAG) with a Large Language Model (LLM). It focuses on the development of a restaurant chatbot that not only engages in natural-language conversations but also addresses context optimization and LLM optimization for restaurant context learning. The approach uses a Neo4j knowledge graph, built from the restaurant data, as an external source of knowledge. The graph is traversed to match the user question with appropriate answer tokens using Term Frequency - Inverse Document Frequency (TF-IDF) embeddings. The relevant tokens, along with the user question, are used to provide additional context to the T5 language model so that it can give nuanced responses to the users. This improvement is quantitatively evidenced by a Bilingual Evaluation Understudy (BLEU) score of 0.60, indicating a high level of precision in language understanding and generation. An extensive evaluation of the chatbot includes assessing AI testability at the level of words, sentences, and information. These evaluations include simulated dialogue assessments and performance analyses, with a focus on the chatbot's ability to retrieve and integrate information. Based on the AI testability evaluation, the models consistently produce more knowledgeable, diverse, and relevant answers as compared with state-of-the-art models, with an average information score in the range of 0.6-0.8.

Keywords— Natural Language Processing (NLP), Retrieval Augmented Generation (RAG), Large Language Model (LLM), Information Retrieval (IR), Knowledge Graph, AI Testability

... increase at a CAGR of 23.3%. According to MarketsandMarkets research [2], the chatbot market size is expected to grow from USD 2.9 billion in 2020 to USD 10.5 billion by 2026, at a Compound Annual Growth Rate (CAGR) of 23.5%. Chatbots have become prevalent in various sectors like e-commerce, healthcare, hospitality, tourism, banking, and customer service. Gartner forecasts that 70% of white-collar workers will interact with conversational platforms daily.

In the post-COVID era, chatbots have gained popularity, particularly in the restaurant industry. With the advancement and increased usage of technology, customers want to converse smoothly and receive quick, personalized replies to their questions. Restaurant chatbots meet this need by handling orders, responding to inquiries, providing recommendations, and giving immediate support to clients, who no longer have to wait in line to talk to a customer representative. Traditional chatbots are less adept at handling intricate consumer requests because of their limited functionality and pre-programmed responses, and they give dull responses like ‘I don’t know’, which results in the termination of the conversation. The motivation is to overcome these limitations and to provide more relevant and contextual responses to the users.

This research aims to improve the standard restaurant chatbot with a RAG approach, i.e., integrating an external source of knowledge (Neo4j knowledge graphs) to refine the fine-tuned LLM for contextual and smoother response generation, and adding AI testability for evaluating responses. Furthermore, it introduces a multimodal aspect, enabling audio input and output for user queries.
multiple cuisines, their ingredients, and their prices. We utilize a custom benchmark dataset in the form of question-answer pairs, stored in CSV format. Table III shows a sample of the restaurant Q&A dataset.

B. Dataset Preprocessing
Data preprocessing is a significant step in Natural Language Processing that makes the data ready for further analysis and modelling. The steps involved in data preprocessing are:
a) Conversion of data to lowercase: The text is converted to lowercase, which helps in standardizing the dataset.
b) Removing punctuation marks: All punctuation marks in the text are removed, as they usually carry little meaning and can introduce noise.
c) Retaining important words: Retaining important words, like restaurant names and prices, is crucial for preserving contextual information and enhancing user query understanding in a restaurant chatbot. This step contributes to improved contextual awareness, facilitating a more effective and personalized interaction between the user and the chatbot.
d) Stop words removal: The most commonly occurring words in a sentence that do not contribute to its meaning, such as “a,” “an,” “the,” and “in”, are removed. It
TABLE II
TECHNOLOGY SURVEY OF EXISTING RESEARCH PAPERS
[15] | Task-oriented chatbot to perform specific tasks such as answering questions or completing a transaction. | Combination of LSTM and RL with attention-based hierarchical LSTM network and Generative Adversarial Networks (GANs) | Emotibot dataset | BLEU, Google-BLEU (GLEU), METEOR
[16] | Combining task-oriented dialogue with open-ended dialogue, strategic dialogue, spatially aware dialogue | Gated Recurrent Unit (GRU) | Persona Chat dataset | Dialogue Reward, F1 Score, Recall
[17] | Survey on various research papers on AI-based, Rule-based and Hybrid chatbots and the evaluation metrics used | Rule-based chatbots: Expert systems, Decision trees, Inference engines. AI-based chatbots: machine learning algorithms like NLP, NLU, and LSTM models. | Text dataset | Precision, Recall, Mean Average, Mean Reciprocal Rank, BLEU, ROUGE, Average dialogue length, Average user utterance, Number of wins, Perplexity
[18] | Task-Oriented chatbot | Sequence-To-Sequence learning, NLP, CNN | Twitter dataset | BLEU, Cosine Similarity
TABLE III
SAMPLE RESTAURANT Q&A DATASET
Intent | User Questions | Bot Responses
Find Restaurants | What restaurants are available? | The restaurants available are dish N dash, CHEF CHU'S, Olive Garden, Rock N Grill, Denny's.
Find Location | Where is the restaurant “Olive Garden” located? | The address of Olive Garden is 1350 Great Mall Dr, Milpitas, CA 95035.
Find Contact | How can I contact Olive Garden restaurant? | You can contact Olive Garden restaurant at +1 (408) 935-8176.
Find Category | ... | ...
Inquire Timings | What time does the restaurant CHEF CHU'S open? | The restaurant CHEF CHU'S opens at 11:30 AM.
Find Cuisine | What cuisine is available at Rock N Grill? | Indian cuisine is available at this Rock N Grill restaurant.
Find Dish | What are the different dishes available under Naan and Roti in Indian cuisine at Rock N Grill restaurant? | The different dishes under Naan and Roti in Indian cuisine at Rock N Grill restaurant available are Paneer Kulcha, Garlic Naan, Cheese Naan, Butter Roti, Bullet Naan, Aloo Paratha.
Check Price | What is the price of the Falafel under Appetizers (Hot) in Mediterranean cuisine at dish N dash restaurant? | The price of the Falafel under Appetizers (Hot) in Mediterranean cuisine at dish N dash restaurant is $8.0.
Get Ingredients | What is the Veg Ball Manchurian (Sauce) under Veg Appetizers in Indian cuisine at Rock N Grill made of? | The Veg Ball Manchurian (Sauce) under Veg Appetizers in Indian cuisine at Rock N Grill is made of Mixed Vegetable Blended and Thickened with Potato, Deep Fried Ball Tossed into Manchurian Sauce.
makes the sentence more manageable and helps in better analysis and better machine language models.
e) Tokenization: The text is converted into individual units called tokens, which are analyzed and processed by algorithms. It provides a structured representation of textual data for computational processing.
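The preprocessing steps a) to e) can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `preprocess`, the four-word stop list (taken from the examples given in step d), the whitespace tokenizer, and the rule that any token containing a digit is kept intact as a price or time are all assumptions made for illustration.

```python
import string

# Assumed stop-word list: only the four examples named in step d).
STOP_WORDS = {"a", "an", "the", "in"}
PUNCT_TABLE = str.maketrans("", "", string.punctuation)


def preprocess(text, protected_terms=()):
    """Return the cleaned token list for one user question or answer."""
    # c) Words that belong to protected terms (e.g. restaurant names)
    #    are never dropped as stop words.
    protected_words = {w for term in protected_terms for w in term.lower().split()}
    tokens = []
    for raw in text.lower().split():          # a) lowercase, e) whitespace tokenization
        if any(ch.isdigit() for ch in raw):
            # c) keep prices and times such as "$8.0" or "11:30" intact,
            #    trimming only sentence punctuation around them
            token = raw.strip("?!,;\"'").rstrip(".")
        else:
            token = raw.translate(PUNCT_TABLE)  # b) remove punctuation
        if not token:
            continue
        if token in STOP_WORDS and token not in protected_words:
            continue                            # d) stop-word removal
        tokens.append(token)
    return tokens
```

For example, `preprocess("The price is $8.0.")` returns `['price', 'is', '$8.0']`: the stop word is removed while the price survives punctuation stripping.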
IV. MACHINE LEARNING MODELS
Various cutting-edge technologies are used to enable the restaurant chatbot to provide enhanced user experiences, including speech recognition, text recognition, Natural Language Understanding (NLU), Natural Language Generation (NLG), as well as RAG support. Using RAG, additional context can be provided to the chatbot in response to the query it receives. The additional context is retrieved from a variety of sources, such as knowledge graphs. Whenever a question is asked, relevant data regarding the question is retrieved from the knowledge base. Next, the question and relevant documents are passed to the NLG, which generates an answer. The chatbot will be able to answer the question correctly if relevant information is provided along with the question. As illustrated in Figure 1, various models were employed to enable the restaurant chatbot’s functionality at each stage. First, a user can either speak to the system, which uses an Audio to Text converter to convert the speech to text, or type text directly. The text data is then processed in the data preprocessing step. Next, the processed text data goes to NLP tools such as NLU-BERT, which interpret the user’s purpose from the text. The Database and Knowledge Graph store information about restaurants. The NLG-T5 generates text responses based on the user’s purpose and the output of the Dialog Manager. These text responses are either sent directly to the user or converted into speech by the Text to Audio converter. Figure 2 and Figure 4 show the architectural flow and pipeline of the system.

The chatbot system supports audio output with the help of Google Text-to-Speech (gTTS). gTTS translates text into high-quality audio by utilizing a complex architecture that combines cutting-edge speech synthesis technology and natural language processing (NLP). Fundamentally, gTTS makes use of deep learning models, including transformer models or recurrent neural networks (RNNs), to comprehend and analyze the incoming text while taking context and linguistic subtleties into account [21]. Because its architecture supports precise pronunciation, intonation, and emotional expression in the generated audio, gTTS provides a realistic and captivating auditory experience for a variety of applications.

B. Natural Language Unit (NLU)
After the user input has passed through the pre-processing stage, the natural language understanding (NLU) component uses the BERT model to interpret the conversation by identifying intents and entities. BERT is a neural network architecture built on the Transformer encoder and is widely used for language understanding tasks such as text classification and token labelling [22].

For the restaurant chatbot, BERT uses a two-step procedure to identify the intent (the reason for the query) and the entities (certain bits of information) within the text [22]. It begins by processing the input sentence and encoding the context and semantic meaning of the written text. It then uses this contextual knowledge to make predictions. For intent recognition, BERT classifies the input into predetermined intent categories based on the encoded context. By utilizing its knowledge of the sentence's structure and semantics, it recognizes and labels particular tokens in the input text that relate to pertinent elements, such as dates, names, or numbers.
... understandable, contextually relevant, grammatically sound natural language output. Because of its ability to produce well-reasoned and contextually relevant answers in response to input prompts, T5 is an effective model for NLG tasks. Training is done using masked language modelling. By transforming NLU and NLG jobs into sequence-to-sequence tasks in the encoder-decoder variation, the T5 model unifies both types of tasks. This means that, in a text classification problem, the text is used as the encoder input and the label must be generated by the decoder as regular text rather than as a class [23].

For restaurants, the T5 model is a great option for powering chatbots that generate natural language. With its text-to-text framework, T5 can understand and generate human-like text responses, making it well-suited for various aspects of restaurant-related conversations. Because it can be tailored to specific restaurant chatbot applications, it can be made to offer precise and tailored responses, improving customer satisfaction and expediting interactions in the restaurant business.

As illustrated in Figure 4, the proposed system processes user inputs through several models. First, pre-processing is applied to the user question, which yields the question tokens. Next, the question tokens are used to find the matching question clusters that share similar tokens, using TF-IDF embeddings. If there is no matching cluster, an empty list is returned. Cosine similarity is then used to find the most similar question among the matching question clusters, and the question cluster with the highest similarity score is retrieved. Subsequently, the answer clusters connected to the most similar question are retrieved. Then, by calculating the relevancy score, the most relevant answer cluster is selected. Finally, the answer tokens connected to the most relevant answer cluster are retrieved as the response from the knowledge graph.
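The question-matching step (TF-IDF embeddings followed by cosine similarity, with an empty list when nothing matches) can be sketched as follows. This is a simplified illustration under stated assumptions, not the paper's Neo4j traversal: the knowledge graph is replaced by an in-memory list of (question tokens, answer tokens) pairs, each question has a single answer so the relevancy-score step is omitted, and the smoothed IDF formula and the names `tfidf_vectors`, `cosine`, and `retrieve` are choices made here.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Build sparse TF-IDF vectors (dicts) for a list of token lists."""
    n = len(docs)
    df = Counter(tok for doc in docs for tok in set(doc))
    # Smoothed inverse document frequency (an assumed variant).
    idf = {tok: math.log((1 + n) / (1 + c)) + 1.0 for tok, c in df.items()}
    vectors = [{tok: tf * idf[tok] for tok, tf in Counter(doc).items()} for doc in docs]
    return vectors, idf


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(tok, 0.0) for tok, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def retrieve(question_tokens, qa_pairs):
    """Return the answer tokens of the stored question most similar to the query.

    qa_pairs is a list of (question_tokens, answer_tokens) tuples that
    stands in for the question and answer clusters of the knowledge graph.
    """
    query = set(question_tokens)
    # Candidate questions must share at least one token with the query.
    candidates = [i for i, (q, _) in enumerate(qa_pairs) if query & set(q)]
    if not candidates:
        return []  # no matching cluster
    vectors, idf = tfidf_vectors([q for q, _ in qa_pairs])
    query_vec = {t: tf * idf[t] for t, tf in Counter(question_tokens).items() if t in idf}
    best = max(candidates, key=lambda i: cosine(query_vec, vectors[i]))
    return qa_pairs[best][1]
```

In a full pipeline, the returned answer tokens would be passed to the generator together with the user question; an empty list signals that no context could be retrieved.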
The performance of the chatbot is evaluated using a set of metrics at various levels. At the word level, metrics such as BLEU, ROUGE, METEOR, and F1 score are employed to assess the accuracy of individual responses. Sentence-level evaluation includes language-based similarity evaluation and keyword-based weighted text similarity evaluation. Additionally, domain-level metrics are applied to assess the overall effectiveness of the chatbot.

As a proactive measure, the system systematically logs user questions and the corresponding chatbot responses in a dedicated database. This integration offers several advantages:
a) Historical Analysis: A comprehensive history of user interactions enables insightful trend analysis and facilitates continual system improvements over time.
b) Performance Metrics: Regularly evaluating the chatbot’s responses aids in identifying strengths and areas for enhancement, contributing to an evolving and adaptable system.
c) User Personalization: The database supports tailoring responses based on historical user interactions, fostering a more personalized and engaging user experience.

The restaurant chatbot architecture, as shown in Figure 6, utilizes a conversation dataset for training and testing. User queries trigger responses generated by the Natural Language Generation (NLG) module, which leverages data augmentation techniques for enhanced robustness.

The quality of these generated responses is evaluated by a Test Script at three levels: word-level accuracy, sentence-level coherence, and information completeness. These are used to conduct both semantic and syntactic evaluation. Methods for evaluating models are important for chatbots because they provide a methodical way to evaluate and improve the chatbot's functionality, accuracy, and capacity to meet user needs [17]. By assessing a chatbot's performance using multiple metrics, its ability to understand user intent and generate responses accurately can be improved. Model assessment techniques can be used to evaluate the performance of the chatbot and optimize it so that it meets the needs of the intended use case and target audience. Table IV shows the different evaluation measures that were employed in this project at each step.
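The logging of user questions and chatbot responses described in this section can be illustrated with a small sketch. The paper does not name the database or its schema, so SQLite, the table name `chat_log`, its columns, and the helper names `open_log`, `log_turn`, and `history` are all assumptions made for illustration.

```python
import sqlite3
from datetime import datetime, timezone


def open_log(path=":memory:"):
    """Open (or create) the conversation log; in-memory by default."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS chat_log (
               id INTEGER PRIMARY KEY AUTOINCREMENT,
               user_id TEXT NOT NULL,
               asked_at TEXT NOT NULL,
               question TEXT NOT NULL,
               response TEXT NOT NULL)"""
    )
    return conn


def log_turn(conn, user_id, question, response):
    """Store one question/response pair with a UTC timestamp."""
    conn.execute(
        "INSERT INTO chat_log (user_id, asked_at, question, response) VALUES (?, ?, ?, ?)",
        (user_id, datetime.now(timezone.utc).isoformat(), question, response),
    )
    conn.commit()


def history(conn, user_id):
    """Return a user's past (question, response) pairs in chronological order."""
    rows = conn.execute(
        "SELECT question, response FROM chat_log WHERE user_id = ? ORDER BY id",
        (user_id,),
    )
    return rows.fetchall()
```

A per-user history of this kind is what historical analysis, periodic re-evaluation of responses, and personalization would each read from.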
BLEU | Geometric mean of all four n-gram precisions | Unigram, bigram, trigram, 4-gram precision scores: $\mathrm{BLEU} = (p_1 \cdot p_2 \cdot p_3 \cdot p_4)^{1/4}$ | Fast computation; easy to calculate | Doesn’t incorporate semantics; doesn’t incorporate sentence structure | 10%
ROUGE | Compares n-gram of generation with n-gram of references | $\text{ROUGE-1 recall} = \frac{\text{count of word matches}}{\text{count of words in reference}}$; $\text{ROUGE-1 precision} = \frac{\text{count of word matches}}{\text{count of words in summary}}$; $\text{ROUGE-1 F1} = 2 \cdot \frac{\text{precision} \cdot \text{recall}}{\text{precision} + \text{recall}}$ | Ability to capture and identify all the relevant instances | 40%
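The BLEU and ROUGE-1 formulas in the table can be worked through in a short sketch. The tokenizer (lowercasing, whitespace splitting, trimming sentence punctuation) and the function names are assumptions; the paper does not specify its tokenization. The BLEU function follows the table's description, a geometric mean of clipped 1- to 4-gram precisions, and omits the brevity penalty and smoothing that standard BLEU implementations usually add.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase, split on whitespace, and trim sentence punctuation."""
    return [t.strip(".,?!") for t in text.lower().split() if t.strip(".,?!")]


def ngrams(tokens, n):
    """Count the n-grams of a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def clipped_matches(cand, ref, n):
    """Return (clipped n-gram matches, total candidate n-grams)."""
    c, r = ngrams(cand, n), ngrams(ref, n)
    return sum(min(count, r[g]) for g, count in c.items()), sum(c.values())


def rouge1(candidate, reference):
    """ROUGE-1 precision, recall, and F1 as defined in the table."""
    cand, ref = tokenize(candidate), tokenize(reference)
    matches, _ = clipped_matches(cand, ref, 1)
    precision = matches / len(cand) if cand else 0.0
    recall = matches / len(ref) if ref else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


def bleu_geometric_mean(candidate, reference, max_n=4):
    """Geometric mean of the clipped 1..max_n-gram precisions (no brevity penalty)."""
    cand, ref = tokenize(candidate), tokenize(reference)
    log_sum = 0.0
    for n in range(1, max_n + 1):
        matches, total = clipped_matches(cand, ref, n)
        if total == 0 or matches == 0:
            return 0.0  # any zero precision makes the geometric mean zero
        log_sum += math.log(matches / total)
    return math.exp(log_sum / max_n)
```

Applied to the first row of Table VI (response "The restaurant Olive Garden is open from 11:00 AM to 10:00 PM." against reference "The restaurant Olive Garden opens at 11:00 AM"), `rouge1` gives precision 6/12, recall 6/8, and F1 0.6, which agrees with the ROUGE1 value reported there. The BLEU value from this sketch need not match the reported one, since smoothing and tokenization may differ.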
These metrics, termed word-level metrics for chatbot evaluation, assess the quality of the chatbot's responses at the word level, i.e., they provide a syntactic evaluation. They gauge the correctness, completeness, and relevance of the chatbot’s replies to user input. Word-level metrics used in evaluating chatbot performance include BLEU and ROUGE (ROUGE-1, ROUGE-2, ROUGE-L), among others.

B. Sentence Level Metrics
Sentence level metrics gauge sentence quality, thereby providing semantic evaluation. Sentence level testing is measured using language-based similarity evaluation and keyword-based weighted text similarity evaluation.

C. Information Level Metrics
Information level metrics gauge the quantity and quality of information a chatbot offers a user. Dialogue length and the confusion indicator are the two measures used at the information level. The length of a conversation can be used to gauge how well a chatbot maintains engagement and provides illuminating responses during the exchange.

VII. RESULT AND DISCUSSION
To ensure that the model is trained on a wide variety of instances and generalizes well to new data, data preparation is crucial. Using a test-train split, the data is split into a training and validation set (80%) and a testing set (20%).

a) Intent Recognition: The Intent Recognition model, powered by the BERT architecture, was trained and validated over three epochs. The progressive increase in accuracy, from 85.32% to 92.55%, indicates the model's capacity to understand user intents, as seen in Figure 7.a. The decreasing trend in loss, from 0.412 to 0.2565, suggests that the model effectively learned to discriminate between various user intents. This performance underscores the significance of BERT in capturing the nuances of user intent and enhancing user-query understanding.

b) Entity Extraction: The Entity Extraction model, also employing the BERT architecture, exhibited noteworthy improvement in token-level accuracy, progressing from 76.32% to 84.55%, as seen in Figure 7.b. The reduction in loss, from 0.512 to 0.2965, highlights the model's effectiveness in precisely identifying entities within user input. These outcomes showcase the model's ability to enhance the chatbot's contextual understanding by accurately extracting relevant information from user queries.

c) Natural Language Generation: The NLG unit, leveraging the T5 model, exhibited remarkable proficiency in generating coherent and informative sentences. The increasing trend in both training and validation accuracy, reaching 94.98%, reflects the model's capability to dynamically generate content based on structured information. The comparable training and validation losses indicate that the model generalizes well without overfitting. The BLEU score of ~0.60 further validates the high quality and alignment of the generated output with reference texts. The iterative improvement in model accuracy across epochs demonstrates the effectiveness of the chosen architectures and training methodologies.

TABLE V
EVALUATION METRICS
Evaluation Metrics for | Epoch | Accuracy | Loss
Intent Recognition | 1 | 0.8532 | 0.412
Intent Recognition | 2 | 0.9043 | 0.356
Intent Recognition | 3 | 0.9255 | 0.2565
Entity Extraction | 1 | 0.7632 | 0.512
Entity Extraction | 2 | 0.8143 | 0.386
Entity Extraction | 3 | 0.8455 | 0.2965
NLG model | 1 | 0.95 | 0.0033
NLG model | 2 | 0.98 | 0.0011
NLG model | 3 | 0.98 | 0.007
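The 80/20 split mentioned at the start of this section can be sketched in a few lines. The paper does not say how the split was performed, so the shuffling, the fixed seed, and the function name `train_test_split` are assumptions made for illustration; the further division of the 80% portion into training and validation data is not shown.

```python
import random


def train_test_split(rows, test_fraction=0.2, seed=42):
    """Shuffle the rows reproducibly and hold out a fraction for testing."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is repeatable
    n_test = round(len(shuffled) * test_fraction)
    # First n_test shuffled rows form the test set; the rest is train + validation.
    return shuffled[n_test:], shuffled[:n_test]
```

Holding the test rows out before any training keeps the reported accuracy and loss figures from being influenced by examples the models have already seen.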
Fig.7. Model Training Results
The implications of these results extend to the enhanced user experience of the restaurant chatbot. The robustness observed in the training and validation metrics suggests that the models generalize well and are poised for effective deployment in real-world scenarios.

d) AI Testability: The RestoBot is tested on various scenarios: General Testing, Domain Testing, and Limit Testing. The testing is done to evaluate the functionality and performance of the chatbot. It was observed that the chatbot could hold general conversations, such as greetings, with the customer. In General Testing the RestoBot is tested on general facts, while in Domain Testing the RestoBot is tested on restaurant-based questions. The testing was performed for various intents such as find restaurant, find dish, find location, find ingredients, and others. Finally, Limit Testing shows how well the RestoBot replies to extreme or boundary conditions. In this case, if there is a non-contextual conversation, the conversation is routed to the customer call center service to speak with a human agent. The performance of the chatbot is measured at the word level, sentence level, and information level, as discussed in Section IV. Table VI provides the backend UI of the chat conversations and the evaluation results for the conversation. It was observed that the average information score was in the range of 0.6 - 0.8 for the session conversations.

TABLE VI
AI TESTABILITY RESULTS OF EVALUATION AFTER SESSION (SESSION CURRENT TIME: 12/11/2023, 4:21:26 PM)
Restaurant | User Question | RestoBot Response | Reference Response | BLEU | ROUGE1 | Similarity Score | Weighted Similarity Score | Average Information Score
Olive Garden | What time does the restaurant Olive Garden open? | The restaurant Olive Garden is open from 11:00 AM to 10:00 PM. | The restaurant Olive Garden opens at 11:00 AM | 0.22 | 0.6 | 0.52 | 0.34 | 0.48
Olive Garden | What is the opening time of Olive Garden? | Olive Garden is open from 11:00 AM to 10:00 PM | The opening time of Olive Garden is from 11:00 AM | 0.11 | 0.53 | 0.43 | 0.27 | 0.4
Olive Garden | What time does the restaurant Olive Garden open and close? | Olive Garden is open from 11:00 AM to 10:00 PM | The restaurant Olive Garden opens at 11:00 AM and closes at 10:00 PM. | 0.06 | 0.55 | 0.45 | 0.67 | 0.5
Olive Garden | What is the closing time of Olive Garden? | Olive Garden is closing from 11:00 AM to 10:00 PM | The closing time of Olive Garden is 10:00 PM | 0.14 | 0.63 | 0.43 | 0.76 | 0.59

VIII. CONCLUSION
In conclusion, the main goal of the restaurant chatbot's development to improve the dining industry has been effectively attained. By utilizing a knowledge graph of questions and answers and utilizing various degrees of AI testability criteria, the chatbot ensures a high standard of performance and accuracy. The use of Neo4j Knowledge Graph as an external source of knowledge has proven instrumental in enhancing information retrieval capabilities. By traversing the graph and utilizing TF-IDF embeddings, the chatbot efficiently matches user questions with relevant answer tokens, ensuring tailored responses. The implementation of robust AI testability ensures chatbot’s performance meets desired standards. This facilitates cross-chatbot comparisons for a domain to find out the most effective one. Furthermore, the strategic Integration of Retrieval-Augmented Generation (RAG) with the Large Language Model (LLM) has significantly elevated the chatbot's conversational precision, enabling contextually relevant responses to user queries in the restaurant setting. This all-encompassing strategy represents a major advancement in restaurant chatbot capabilities for enhanced customer experiences and increased operational effectiveness.
ACKNOWLEDGMENT
We would like to express sincere thanks to our research advisor Dr. Jerry Gao and supervisor Dr. Lee C. Chang from the Department of Applied Data Science, San Jose State University for giving us a wonderful opportunity to work on this project. Their unwavering support and mentorship have been invaluable throughout our research journey, providing us with a remarkable opportunity to contribute to this project. Their guidance has been instrumental in the successful completion of this paper, and we are truly grateful for the enriching experience they have facilitated. Dr. Gao deserves special acknowledgement for generously sharing his wealth of expertise, dedicating time for insightful discussions, and guiding us in the right direction.

We would also like to express our profound gratitude to our committed team members, whose joint efforts were essential to this project’s success in addition to our academic advisors. Their dedication, diligence, and creative ideas have greatly enhanced our research and added to the range and depth of our conclusions. Our team’s cohesion, which was cultivated via open communication and common objectives, was essential to overcoming obstacles and reaching important milestones.

REFERENCES
[1] Chatbot market size, share, Trends & Growth Report, 2030. Chatbot Market Size, Share, Trends & Growth Report, 2030. (n.d.). Retrieved March 23, 2023, from https://www.grandviewresearch.com/industry-analysis/chatbot-market
[2] “Marketresearch.com.” Market Research, MarketsandMarkets, 15 Nov. 2019, https://www.marketresearch.com/MarketsandMarkets-v3719/Chatbot-Component-Solutions-Services-Usage-12771978/
[3] Hsueh, Yu-Ling, and Tai-Liang Chou. “A Task-Oriented Chatbot Based on LSTM and Reinforcement Learning.” ACM Transactions on Asian and Low-Resource Language Information Processing, vol. 22, no. 1, 2022, pp. 1–27, https://doi.org/10.1145/3529649
[4] Hofstätter, Sebastian, et al. “FID-light: Efficient and effective retrieval-augmented text generation.” Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2023, https://doi.org/10.1145/3539618.3591687.
[5] Jeong, Cheonsu. A Study on the Implementation of Generative AI Services Using an Enterprise Data-Based LLM Application Architecture, Sept. 2023, https://doi.org/10.48550/arXiv.2309.01105.
[6] Skandan, Spurthy, et al. “Question answering system using knowledge graphs.” 2023 International Conference on Inventive Computation Technologies (ICICT), 2023, https://doi.org/10.1109/icict57646.2023.10134047.
[7] Jiang, Zhengbao, et al. “Active Retrieval Augmented Generation.” arXiv, 22 Oct. 2023, https://arxiv.org/abs/2305.06983.
[8] P. Lewis et al., “Retrieval-Augmented Generation for Knowledge-Intensive NLP tasks,” arXiv (Cornell University), May 2020, Available: https://arxiv.org/pdf/2005.11401
[9] K. Shuster, S. Poff, M. Chen, D. Kiela, and J. Weston, “Retrieval augmentation reduces hallucination in conversation,” Empirical Methods in Natural Language Processing, pp. 3784–3803, Apr. 2021, Available: https://aclanthology.org/2021.findings-emnlp.320/
[10] Y. Mao et al., “Generation-Augmented Retrieval for Open-domain Question Answering,” Sep. 2020, doi: https://doi.org/10.48550/arxiv.2009.08553.
[11] Y. Ahn, S.-G. Lee, J. Shim, and J. Park, “Retrieval-Augmented Response Generation for Knowledge-Grounded Conversation in the Wild,” IEEE Access, vol. 10, pp. 131374–131385, Jan. 2022, doi: https://doi.org/10.1109/access.2022.3228964.
[12] Lone, M. B., Nazir, N., Kaur, N., Pradeep, D., Ashraf, A. U., Asrar Ul Haq, P., Dar, N. B., Sarwar, A., Rakhra, M., & Dahiya, O. (2022). Self-learning chatbots using reinforcement learning. 2022 3rd International Conference on Intelligent Engineering and Management (ICIEM). https://doi.org/10.1109/iciem54221.2022.9853156
[13] Biswas, D., Nadipalli, S., Sneha, B., Gupta, D., & J, A. (2022). Natural question generation using transformers and reinforcement learning. 2022 OITS International Conference on Information Technology (OCIT). https://doi.org/10.1109/ocit56763.2022.00061
[14] Tran, Q.-D. L., & Le, A.-C. (2021). A deep reinforcement learning model using long contexts for Chatbots. 2021 International Conference on System Science and Engineering (ICSSE). https://doi.org/10.1109/icsse52999.2021.9538427
[15] Hsueh, Yu-Ling, and Tai-Liang Chou. “A Task-Oriented Chatbot Based on LSTM and Reinforcement Learning.” ACM Transactions on Asian and Low-Resource Language Information Processing, vol. 22, no. 1, 2022, pp. 1–27, https://doi.org/10.1145/3529649
[16] Liu, C.-W., Lowe, R., Serban, I., Noseworthy, M., Charlin, L., & Pineau, J. (2016). How not to evaluate your dialogue system: An empirical study of unsupervised evaluation metrics for dialogue response generation. Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing. https://doi.org/10.18653/v1/d16-1230
[17] Maroengsit, W., Piyakulpinyo, T., Phonyiam, K., Pongnumkul, S., Chaovalit, P., & Theeramunkong, T. (2019). A survey on evaluation methods for Chatbots. Proceedings of the 2019 7th International Conference on Information and Education Technology. https://doi.org/10.1145/3323771.3323824.
[18] Aleedy, M., Shaiba, H., & Bezbradica, M. (2019). Generating and analyzing chatbot responses using Natural Language Processing. International Journal of Advanced Computer Science and Applications, 10(9). https://doi.org/10.14569/ijacsa.2019.0100910
[19] Rajamalli Keerthana, R., Fathima, G., & Florence, L. (2021). Evaluating the performance of various deep reinforcement learning algorithms for a conversational chatbot. 2021 2nd International Conference for Emerging Technology (INCET). https://doi.org/10.1109/incet51464.2021.9456321
[20] Introducing whisper. Introducing Whisper. (n.d.). Retrieved April 19, 2023, from https://openai.com/research/whisper.
[21] K, Bharath. “How to Get Started with Google Text-to-Speech Using Python.” Medium, Towards Data Science, 30 Aug. 2020, towardsdatascience.com/how-to-get-started-with-google-text-to-speech-using-python-485e43d1d544.
[22] Silva Barbon, R., & Akabane, A. T. (2022, October 26). Towards transfer learning techniques-bert, Distilbert, Bertimbau, and Distilbertimbau for automatic text classification from different languages: A case study. MDPI. https://www.mdpi.com/1424-8220/22/21/8184
[23] Alexander Mathew. “Data to Text Generation with T5; Building a Simple yet Advanced NLG Model.” Medium, Towards Data Science, 10 Apr. 2021, towardsdatascience.com/data-to-text-generation-with-t5-building-a-simple-yet-advanced-nlg-model-b5cce5a6df45.
[24] Flaticon, the Largest Database of Free Icons. Flaticon, www.flaticon.com/icons.