Leveraging LLM: Implementing An Advanced AI Chatbot For Healthcare
Abstract:- This paper examines the application of Large Language Models (LLMs) in healthcare settings, focusing on addressing general illness inquiries through chatbot interfaces. Leveraging the capabilities of LLMs, we explore their potential to provide accurate and contextually relevant responses to users seeking information about common health concerns. LLMs have the capacity to continuously learn and improve from user interaction. Through benchmarking experiments, this paper evaluates the accuracy (61%) of LLM-based chatbots in understanding and responding to user queries related to general illnesses. The findings demonstrate the performance of LLMs against established benchmarks, shedding light on their efficacy in healthcare applications. By examining the intersection of LLM technology and healthcare, this research contributes to advancing the development of intelligent chatbot systems capable of providing reliable and informative support to individuals seeking medical guidance for general health issues.

Keywords:- LLM, Healthcare, Chatbot, Benchmark, Accuracy.

I. INTRODUCTION

In recent years, the intersection of artificial intelligence (AI) and healthcare has seen remarkable advancements, revolutionizing the way we approach medical assistance and patient care. [1] Among the various applications of AI in healthcare, chatbots have emerged as a promising tool for delivering personalized medical guidance and support. Leveraging the capabilities of LLMs, such as OpenAI's GPT series, these chatbots have the potential to address a wide range of health-related queries, providing timely and accurate information to users.

This research paper aims to examine the integration of LLMs into medical chatbots and evaluate their effectiveness in improving patient engagement, healthcare accessibility, and overall user satisfaction. [2] By harnessing the vast knowledge encoded within these language models, medical chatbots can offer valuable insights, assist in symptom assessment, provide medication advice, offer lifestyle recommendations, and even facilitate mental health support.

The introduction of LLMs into the realm of medical chatbots brings several key benefits. First, these models can comprehend and generate human-like responses, enhancing the conversational experience for users. [3] Additionally, LLMs have the capacity to process vast amounts of medical literature, clinical guidelines, and patient data, enabling chatbots to deliver evidence-based information and personalized recommendations. Moreover, by continuously learning from user interactions, LLM-powered chatbots can adapt and improve over time, refining their responses to better meet the needs of individual users.

The emergence of ChatGPT and other LLMs has sparked debate in academia. These AI-powered chatbots can generate human-quality text and code, potentially boosting research efficiency and educational experiences. However, concerns linger: LLMs might produce inaccurate or biased content, and their use could lead to plagiarism if not implemented responsibly. [4] The key lies in finding a balanced approach that leverages the benefits of LLMs while mitigating risks. This includes developing methods to identify and address bias, ensuring proper citation of LLM-generated content, and fostering critical thinking skills to evaluate the information these models produce.

However, while the potential of LLMs in medical chatbots is promising, several challenges and considerations must be addressed. [5] These include ensuring the accuracy and reliability of the information provided, maintaining user privacy and data security, addressing ethical concerns surrounding AI-driven healthcare, and overcoming potential biases inherent in the training data of language models.

Through an in-depth analysis of existing literature, case studies, and empirical studies, this research paper delves into the current state of LLM-powered medical chatbots, their strengths, limitations, and future directions. [6] By critically examining the opportunities and challenges associated with these innovative technologies, this paper seeks to contribute to the ongoing discourse on the role of AI in transforming healthcare delivery and patient empowerment.
II. ARCHITECTURE

We propose a medical chatbot that aims to diagnose diseases and provide basic details about them before a patient consults a doctor. This not only reduces healthcare costs but also improves accessibility to medical knowledge, making healthcare services more accessible and efficient. It also provides patients and healthcare professionals with invaluable support, offering information on symptoms, diagnoses, treatment options, and medication details.

The sources of the dataset are the approved books from the Indian Council of Medical Research and the All India Institute of Medical Sciences. In figure 1, LangChain Directory Loader is a component of the LangChain platform that allows users to load documents from a directory into a LangChain project. This allows users to easily process and analyze large collections of documents, such as a corpus of news articles or a set of research papers. LangChain Directory Loader supports a variety of document formats, including plain text, HTML, and PDF. It also allows users to specify a custom loader for any document format that is not supported by default, making it a powerful tool for users who need to process and analyze large collections of documents.

Text extraction is the process of automatically locating and extracting pertinent data from unstructured text documents. This can include extracting entities, such as names, dates, and locations, as well as extracting relationships between entities. Text extraction is typically performed using a combination of NLP techniques, such as tokenization, part-of-speech tagging, and named entity recognition. These techniques are used to identify and extract the relevant information from the text.

Converting text to chunks is the process of dividing a text into smaller, more manageable pieces. This can be done for a variety of reasons, such as to improve the performance of a text processing algorithm, to make the text easier to read and understand, or to break the text up into smaller units for storage or transmission. There are several different ways to convert text to chunks. One common approach is to use a regular expression to split the text into smaller pieces based on a specific pattern. This improves the performance of the text processing and makes the text easier to read and understand. The chunks are saved on the server.

In figure 1, similarity search is the process of finding similar items in a database. This can be done for a variety of data types, including text, images, and audio. There are various approaches to similarity search, such as the vector space model and hashing algorithms. Cosine similarity is the approach used for similarity search in this architecture. Cosine similarity is a metric for comparing the similarity of two vectors, calculated as the cosine of the angle between them: the dot product of the two vectors divided by the product of their magnitudes. Cosine similarity can be used to measure the similarity between any two vectors, regardless of their length. The selected chunks, together with the prompt template and the question from the user, are then given to the LLM.

An LLM is a type of AI model that can process and generate text. Large text and code datasets are used to train LLMs, enabling them to discover statistical correlations between words and phrases. This knowledge can then be used to generate text that is like the text in the training dataset. Once the model has processed the prompt, it generates a response by predicting the most likely continuation or answer based on the information it has learned. The response is produced instantly by the model; it is not pre-written or taken verbatim from any source. However, it is important to note that the model may sometimes produce responses that resemble or paraphrase existing information due to the nature of its training on a vast corpus of text. Language models read prompts by analyzing the input text, understanding the context, and generating responses based on their learned knowledge of language. Although the responses are produced instantly and are meant to be helpful, it is crucial to double-check the information and not rely only on it for critical details. This response is given to the user.

In figure 1, every question from the user and the corresponding answer from the proposed model is saved in the database, along with the user's records.
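The regex-based chunking and the cosine-similarity metric described above can be sketched as follows. This is a minimal illustration, not the production code: the split pattern (blank lines) and the sample sentences are assumptions, and real deployments would use learned embeddings rather than hand-made vectors.

```python
import re
import math

def chunk_text(text, pattern=r"\n\s*\n"):
    """Split text into chunks wherever the regex pattern matches (here: blank lines)."""
    return [c.strip() for c in re.split(pattern, text) if c.strip()]

def cosine_similarity(a, b):
    """cos(theta) = (a . b) / (|a| * |b|), the metric used for similarity search."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

chunks = chunk_text("Fever is a symptom.\n\nParacetamol reduces fever.")
print(len(chunks))                                    # 2
print(cosine_similarity([1.0, 0.0], [1.0, 0.0]))      # 1.0 (same direction)
```

Because the metric depends only on the angle between vectors, two chunks of very different lengths can still score as highly similar, which is why it suits text retrieval.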
The proposed LLM model demonstrates its versatility by leveraging two distinct hardware configurations for optimal performance. On CPU, the model is deployed on an Intel i5-1135G7 processor with 8 cores running at 4.200 GHz, ensuring efficient execution of natural language processing tasks. For accelerated inference and enhanced parallel processing capabilities, the model also harnesses the power of GPU acceleration, supported by both NVIDIA GeForce RTX 3050 and AMD Radeon Graphics. Moreover, on CPU, the model benefits from an AMD Ryzen 7 5800H processor, further diversifying the hardware landscape and showcasing the model's adaptability across different computational environments. This multi-hardware approach enables the proposed model to achieve efficient and scalable performance across various hardware configurations, catering to diverse computational requirements and user preferences.

We present a novel approach to transforming textual data from medical journal articles or conference papers into a vectorized database, enabling efficient storage and analysis. [7] Leveraging state-of-the-art natural language processing techniques, we pre-process the raw text, extract key features, and encode them into high-dimensional vectors, preserving semantic relationships and contextual information. This vectorized representation facilitates various machine learning tasks, such as classification, clustering, and information retrieval, while maintaining the integrity and richness of the original medical content. Our methodology not only streamlines data management and accessibility but also empowers researchers and practitioners in the medical field to derive valuable insights and make informed decisions from vast amounts of scholarly literature.

MongoDB, a popular NoSQL database, can be utilized in a wide range of applications, including those involving LLMs like GPT (Generative Pre-trained Transformer) models. [8] Here are some ways MongoDB could be used in conjunction with LLMs:

A. Algorithm:

Input:
Username/email and password entered by the user.

Connect to MongoDB Database:
Establish a connection to the MongoDB database where user credentials are stored.

Query User Credentials:
Search the database for the user with the provided username/email.
Retrieve the corresponding user document.
Check if a user document was retrieved.
If the document exists, compare the password provided by the user with the hashed password stored in the database.
If the passwords match, proceed to the next step.
Otherwise, return an error indicating invalid credentials.

Generate Authentication Token (Optional):
Upon successful validation, optionally generate an authentication token (e.g., JWT) for the user to maintain their authenticated state in subsequent requests.

Output:
If authentication is successful, return a success message or the generated authentication token.
If authentication fails, return an error message indicating invalid credentials.

Using MongoDB, we store the user dataset. The data stored are the username, email, and password of each user. MongoDB also stores the user history of interactions with the model, including both the questions and the answers from the model. MongoDB is used in the login page and signup page, and the user's history is used while interacting with the model.

LangChain is a framework designed to simplify the creation of applications powered by LLMs. [9] It provides tools and functionalities to make LLMs more usable and accessible, and it plays a crucial role in developing chatbots powered by Large Language Models (LLMs).

B. Algorithm:

Input: PDF file path.

Initialize a PDF Reader:
Use a library like PyPDF2 or pdfminer.six in Python to extract text from the PDF.

Extract Text Chunks:
Iterate through each page of the PDF.
Extract text from each page.
Split the text into smaller chunks based on certain criteria (e.g., paragraphs, sentences, or custom-defined boundaries).

Text Pre-Processing:
Normalize the text (e.g., lowercase, remove punctuation).
Remove stop words (if necessary).
Apply stemming or lemmatization (optional).
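The credential-validation steps of Algorithm A can be sketched as below. This is a self-contained sketch under stated assumptions: an in-memory dict stands in for the MongoDB users collection, the field names (`username`, `password_hash`) are illustrative, and plain SHA-256 is used only to keep the example dependency-free; a real deployment would query via pymongo's `find_one` and use a salted scheme such as bcrypt.

```python
import hashlib

# Stand-in for the MongoDB `users` collection (names are assumptions;
# the paper stores username, email, and password per user).
users_collection = {
    "alice@example.com": {
        "username": "alice",
        "password_hash": hashlib.sha256(b"secret123").hexdigest(),
    }
}

def authenticate(email: str, password: str) -> str:
    """Algorithm A: look up the user document and compare password hashes."""
    doc = users_collection.get(email)          # Query user credentials
    if doc is None:                            # No document retrieved
        return "error: invalid credentials"
    supplied = hashlib.sha256(password.encode()).hexdigest()
    if supplied == doc["password_hash"]:       # Passwords match
        return "success"                       # (token generation is optional)
    return "error: invalid credentials"

print(authenticate("alice@example.com", "secret123"))  # success
print(authenticate("alice@example.com", "wrong"))      # error: invalid credentials
```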
Convert Text Chunks to Vectors:
Use a technique like TF-IDF (Term Frequency-Inverse Document Frequency) or word embeddings to convert each text chunk into a numerical vector representation.

Initialize a Vector Database:
Choose a suitable database system to store the vectors.
Set up a schema to accommodate vector storage.

Store Vectors in the Database:
For each text chunk, store its corresponding vector in the database along with any metadata needed for future reference.

Repeat Steps 3 to 7 for all Pages:
Iterate through each page of the PDF, extract text chunks, convert them to vectors, and store them in the database.

Output:
Vector database containing vector representations of text chunks extracted from the PDF.

Optional:
Index the vectors for efficient retrieval and search operations.

This algorithm outlines the basic steps involved in scanning a PDF file, extracting text chunks, converting them into vector representations, and storing them in a database. Depending on your specific requirements and constraints, you may need to customize certain steps or incorporate additional preprocessing or optimization techniques.

Cosine similarity is commonly used in [10] similarity search tasks, especially in natural language processing and information retrieval. [11] Using IDF-PIF, the proposed model achieves higher accuracy.

C. Algorithm

Input:
Word to find similarity with (query word).
Vector database containing vector representations of words.

Retrieve Vector for the Query Word:
Search the database for the vector representation of the query word.

Calculate Cosine Similarity:
Iterate through each vector in the database.
For each vector, calculate the cosine similarity with the vector of the query word.
Keep track of the highest cosine similarity value and the corresponding word vector.

Output:
The word from the database with the highest cosine similarity to the query word.

When a user sends a query to the model, cosine similarity measures the cosine of the angle between two vectors, typically representing text embeddings in a high-dimensional space. Within the LLM pipeline, cosine similarity is employed to compare the representations of different text segments or documents, assessing their similarity based on the direction of their vectors rather than their magnitudes. This enables the LLM to effectively identify semantically related documents or passages, aiding tasks such as document ranking, clustering, and question answering, and allows the model to give precise answers to the user.

For boosting Large Language Models with external knowledge, [13] Retrieval-Augmented Generation (RAG) is a technique that enhances the accuracy and reliability of LLMs like LLaMDA2 by incorporating external knowledge sources.

D. Algorithm:

Input:
Prompt: Initial context or information provided to the system.
Query: User's query or request for additional information.

Retriever Module:
Use the query to search for relevant information from a knowledge base or corpus.
Retrieve relevant passages or documents based on the query using techniques like TF-IDF, BM25, or neural retrievers such as DPR (Dense Passage Retrieval).
Filter and rank retrieved passages/documents to identify the most relevant ones.

Answerer Module:
Extract relevant information from the retrieved passages/documents.
Utilize natural language understanding techniques to comprehend the user's query and the retrieved context.
Generate a structured representation of the retrieved information to facilitate subsequent processing.

Generator Module:
Input the prompt, query, and the enhanced context (from the Answerer module) into a Large Language Model (LLM).
Fine-tune the LLM on the given prompt, query, and enhanced context to generate the text response.
Optionally, employ techniques like conditional text generation or prompting strategies to guide the generation process and ensure coherence with the context.
Generate the text response based on the input prompt, query, and enhanced context.

Output:
Generated Text Response:
The output of the Generator module, providing the response to the user's query while incorporating the relevant information retrieved by the Retriever module.

For easy questions, the model gives the correct answer whenever such questions are posed to it. It also correctly answers questions about the relationship between two topics.

For hard questions, however, the accuracy is low. Whenever the user asks about a specific topic, the model gives a proper answer, but when asked about the relationship between topics, as in the example above, it gives a wrong answer.
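The retrieval half of the RAG pipeline (Algorithms C and D) can be sketched as below: chunks are converted to TF-IDF vectors and the chunk closest to the query by cosine similarity is selected as context for the LLM. This is a minimal sketch, not the actual system: the toy corpus is illustrative (not from the ICMR/AIIMS dataset), and a production retriever would use a proper vector store and learned embeddings or BM25/DPR.

```python
import math
from collections import Counter

# Toy corpus standing in for the chunk vector database (illustrative content).
corpus = [
    "fever is a common symptom of viral infection",
    "paracetamol is used to reduce fever and pain",
    "regular exercise improves cardiovascular health",
]

def tfidf_vectors(docs):
    """Build simple TF-IDF vectors as {term: weight} dicts."""
    n = len(docs)
    tokenized = [d.split() for d in docs]
    df = Counter(t for doc in tokenized for t in set(doc))
    idf = {t: math.log(n / df[t]) + 1.0 for t in df}
    vecs = [{t: c * idf[t] for t, c in Counter(doc).items()} for doc in tokenized]
    return vecs, idf

def cosine(u, v):
    """Cosine similarity between two sparse {term: weight} vectors."""
    dot = sum(u[t] * v.get(t, 0.0) for t in u)
    nu = math.sqrt(sum(x * x for x in u.values()))
    nv = math.sqrt(sum(x * x for x in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

def retrieve(query, docs):
    """Retriever module: return the chunk with the highest cosine similarity."""
    vecs, idf = tfidf_vectors(docs)
    qvec = {t: c * idf.get(t, 1.0) for t, c in Counter(query.split()).items()}
    best = max(range(len(docs)), key=lambda i: cosine(qvec, vecs[i]))
    return docs[best]

context = retrieve("medicine to reduce fever", corpus)
print(context)  # the paracetamol chunk is the closest match
# The retrieved chunk is then placed into the prompt template and passed,
# together with the user's query, to the LLM (Generator module).
```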
When leveraging a GPU for the execution of the chatbot's LLM, users can expect a notable increase in accuracy compared to CPU-based deployments. GPUs excel in parallel processing tasks, enabling faster computations and more efficient handling of the intricate operations involved in language modeling. With the abundant computational resources provided by GPUs, LLMs can be executed more swiftly, allowing for deeper exploration of contextual relationships and more precise generation of responses. As a result, users interacting with the chatbot on GPU-accelerated systems are likely to experience higher accuracy in the generated responses, leading to a more satisfying and seamless conversational experience.

[17] LLMs demonstrated promising results, achieving an overall accuracy exceeding 70%. Bard slightly edged out ChatGPT in terms of accuracy and understandability, providing slightly more comprehensive and easily digestible answers. However, ChatGPT displayed a higher level of caution, generating fewer responses with the potential for misinformation. This trade-off highlights a key challenge in LLM development for the medical field: balancing accuracy with the need to avoid harmful information.

Furthermore, GPU-accelerated deployments of LLMs offer scalability and flexibility, making them well-suited for handling larger models and accommodating increased workloads. [16] By harnessing the parallel processing capabilities of GPUs, the chatbot can efficiently process a larger volume of data and perform complex computations in real time, enhancing its ability to understand user queries and generate contextually relevant responses. Consequently, users benefit from an improved accuracy level, as the chatbot can leverage the computational power of GPUs to deliver more nuanced and accurate responses, ultimately enriching the overall quality of the interaction.

Beyond being "the ultimate AI chatbot," [18] ChatGPT's success hinges on user trust and ethical considerations, according to a recent study. While information quality remains paramount, the research shows users also value feeling connected to the technology, even perceiving it as possessing superhuman abilities. Ethics play a crucial role too: users' personal values directly impact their loyalty to ChatGPT, highlighting the importance of responsible AI development.
These agents can handle complex tasks, but building trust in them is crucial. While traditional factors like reliability are important, LLM-based agents pose new challenges. [20] Researchers suggest focusing on making these agents more transparent and understandable to users, including aspects like how they reach decisions and the data they use. Additionally, incorporating features that make them relatable, like virtual assistants with voices or avatars, can also increase trust. Ultimately, the success of LLM-based automation hinges on addressing these new considerations and developing trustworthy AI agents that function ethically and in line with human values.

IV. CONCLUSION

In conclusion, the integration of large language models (LLMs) into medical chatbots represents a significant step forward in revolutionizing healthcare delivery and patient engagement. [21] Throughout this research paper, we have explored the potential of LLM-powered chatbots to provide personalized medical guidance, support, and information to users, enhancing accessibility and convenience in healthcare services.

The benefits of LLM-powered medical chatbots are manifold. These chatbots leverage advanced natural language processing capabilities to understand and respond to user queries in a conversational manner, mimicking human interactions. By tapping into vast repositories of medical knowledge encoded within LLMs, chatbots can deliver evidence-based information, assist in symptom assessment, offer medication advice, and promote healthy lifestyle choices.

REFERENCES

[1]. P. Rajpurkar, E. Chen, O. Banerjee, and E. J. Topol, "AI in health and medicine," Nature Medicine, vol. 28, no. 1, Nature Research, pp. 31–38, Jan. 01, 2022. doi: 10.1038/s41591-021-01614-0.
[2]. B. Galitsky, "LLM-based Personalized Recommendations in Health," 2024, doi: 10.20944/preprints202402.1709.v1.
[3]. R. Szilágyi and M. Tóth, "Use of LLM for SMEs, opportunities and challenges," Journal of Agricultural Informatics, vol. 14, no. 2, Jan. 2024, doi: 10.17700/jai.2023.14.2.703.
[4]. J. G. Meyer et al., "ChatGPT and large language models in academia: opportunities and challenges," BioData Mining, vol. 16, no. 1, BioMed Central Ltd, Dec. 01, 2023. doi: 10.1186/s13040-023-00339-9.
[5]. Y. Yao, J. Duan, K. Xu, Y. Cai, Z. Sun, and Y. Zhang, "A survey on Large Language Model (LLM) security and privacy: The Good, The Bad, and The Ugly," High-Confidence Computing, p. 100211, Mar. 2024, doi: 10.1016/j.hcc.2024.100211.
[6]. F. Jiang et al., "Artificial intelligence in healthcare: Past, present and future," Stroke and Vascular Neurology, vol. 2, no. 4, BMJ Publishing Group, pp. 230–243, Dec. 01, 2017. doi: 10.1136/svn-2017-000101.
[7]. A. Bhakkad, S. C. Dharmadhikari, M. Emmanuel, and P. Kulkarni, E-VSM: Novel Text Representation Model to Capture Context-Based Closeness between Two Text Documents. 2013.
[8]. A. Chauhan, A Review on Various Aspects of MongoDb Databases. 2019.
[9]. O. Topsakal and T. C. Akinci, "Creating Large Language Model Applications Utilizing LangChain: A Primer on Developing LLM Apps Fast," International Conference on Applied Engineering and Natural Sciences, vol. 1, no. 1, pp. 1050–1056, Jul. 2023, doi: 10.59287/icaens.1127.
[10]. C. Luo, J. Zhan, L. Wang, and Q. Yang, "Cosine Normalization: Using Cosine Similarity Instead of Dot Product in Neural Networks," Feb. 2017, [Online]. Available: http://arxiv.org/abs/1702.05870
[11]. M. Emmanuel, B. D. R. Ramesh, and S. M. Khatri, "A novel scheme for term weighting in text categorization: Positive impact factor," in Proceedings - 2013 IEEE International Conference on Systems, Man, and Cybernetics, SMC 2013, 2013, pp. 2292–2297. doi: 10.1109/SMC.2013.392.
[12]. R. Hasan and J. Ferdous, "Dominance of AI and Machine Learning Techniques in Hybrid Movie Recommendation System Applying Text-to-number Conversion and Cosine Similarity Approaches," 2024, doi: 10.32996/jcsts.
[13]. P. Lewis et al., "Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks," May 2020, [Online]. Available: http://arxiv.org/abs/2005.11401
[14]. A. Pal, L. K. Umapathi, and M. Sankarasubbu, "MedMCQA: A Large-scale Multi-Subject Multi-Choice Dataset for Medical domain Question Answering," 2022.
[15]. M. R. Parvez, W. U. Ahmad, S. Chakraborty, B. Ray, and K.-W. Chang, "Retrieval Augmented Code Generation and Summarization," Aug. 2021, [Online]. Available: http://arxiv.org/abs/2108.11601
[16]. W. Kwon et al., "Efficient Memory Management for Large Language Model Serving with PagedAttention," in SOSP 2023 - Proceedings of the 29th ACM Symposium on Operating Systems Principles, Association for Computing Machinery, Inc, Oct. 2023, pp. 611–626. doi: 10.1145/3600006.3613165.
[17]. Y. Li, Z. Song, and W. Li, "Benchmarking Large Language Models in Adolescent Growth and Development: A Comparative Analysis of Claude2, ChatGPT-3.5, and Google Bard," 2024, doi: 10.21203/rs.3.rs-3858549/v1.
[18]. B. Niu and G. F. N. Mvondo, "I Am ChatGPT, the ultimate AI Chatbot! Investigating the determinants of users' loyalty and ethical usage concerns of ChatGPT," Journal of Retailing and Consumer Services, vol. 76, Jan. 2024, doi: 10.1016/j.jretconser.2023.103562.
[19]. L. Zheng et al., "Judging LLM-as-a-Judge with MT-Bench and Chatbot Arena," Jun. 2023, [Online]. Available: http://arxiv.org/abs/2306.05685