
Methodology/Algorithms

Large language models (LLMs) are advanced natural language processing models that use deep learning to understand and generate human-like text. Models such as OpenAI's GPT (Generative Pre-trained Transformer) series are trained on vast amounts of text data to learn the intricacies of human language. LLMs are capable of a wide range of language tasks, including text generation, translation, summarization, and question answering.

Fig: Chatbot Architecture

Sentence Embedding:

Sentence embeddings are representations of sentences in a high-dimensional vector space that capture the semantic meaning of the text chunks they represent. These embeddings are typically
generated using pre-trained language models, such as BERT (Bidirectional Encoder Representations
from Transformers) or GPT (Generative Pre-trained Transformer). These models are trained on large
amounts of text data using unsupervised learning techniques to learn contextualized representations
of words and sentences. By encoding sentences into dense vectors, sentence embeddings enable
various natural language processing tasks, including semantic similarity measurement, text
classification, and information retrieval. The embeddings are learned in such a way that similar
sentences are represented by vectors that are close to each other in the embedding space,
facilitating tasks like clustering and retrieval.
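As a minimal sketch of this idea, assuming the sentence-transformers package and the all-MiniLM-L6-v2 model used later in this project (the example sentences are illustrative only), two semantically related sentences yield vectors with a higher cosine similarity than an unrelated one:

    from sentence_transformers import SentenceTransformer, util

    model = SentenceTransformer('sentence-transformers/all-MiniLM-L6-v2')

    sentences = [
        "Aspirin is commonly used to relieve mild pain.",
        "Acetylsalicylic acid is a common over-the-counter painkiller.",
        "The hospital cafeteria opens at seven in the morning.",
    ]

    # Encode each sentence into a dense 384-dimensional vector
    embeddings = model.encode(sentences, convert_to_tensor=True)

    # Cosine similarity: related sentences score higher than unrelated ones
    print(util.cos_sim(embeddings[0], embeddings[1]))  # relatively high
    print(util.cos_sim(embeddings[0], embeddings[2]))  # relatively low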

Retrieval-Based QA:

Retrieval-based question answering (QA) is a technique that involves retrieving relevant documents
or passages from a database and selecting answers based on them. In this approach, a question is
first encoded into a vector representation, often using techniques like sentence embedding. Then,
the vector database, containing precomputed embeddings of documents or sentences, is queried to
retrieve the most relevant passages related to the question. Finally, answer selection algorithms are
applied to extract or generate answers from the retrieved passages. This approach is particularly
effective for QA tasks where the answer can be found within the given context, such as factoid-based
questions or information retrieval tasks. By leveraging the semantic similarity between the question
and the documents, retrieval-based QA systems can provide accurate and relevant answers to user
queries.
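An illustrative sketch of this retrieve-then-answer flow, assuming a FAISS index such as the one built later in this project; answer_from_context is a hypothetical placeholder for the answer-selection step, not part of the project code:

    from langchain_community.embeddings import HuggingFaceEmbeddings
    from langchain_community.vectorstores import FAISS

    embeddings = HuggingFaceEmbeddings(model_name='sentence-transformers/all-MiniLM-L6-v2')
    db = FAISS.load_local('vectorstore/db_faiss', embeddings,
                          allow_dangerous_deserialization=True)

    question = "What are the common symptoms of anemia?"

    # 1. The question is embedded and the nearest passages are retrieved
    passages = db.similarity_search(question, k=2)

    # 2. The retrieved passages become the context for answer selection/generation
    context = "\n".join(p.page_content for p in passages)
    # answer = answer_from_context(question, context)  # hypothetical answer-selection step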

Conversational Language Model:

The Llama model (TheBloke/Llama-2-7B-Chat-GGML) is a conversational language model that, in this system, answers questions over the Gale Encyclopedia medical textbook data. The model employs algorithms for natural language understanding and generation in conversational contexts, leveraging pre-trained language representations to comprehend and generate human-like responses in conversations related to medical topics. The chat variant of Llama 2 is fine-tuned on conversational data, which improves its ability to engage in dialogues, understand context, and generate coherent and informative responses. By combining this general conversational capability with domain-specific knowledge retrieved from the medical textbook data, the Llama model supports conversational AI applications such as chatbots for healthcare support, medical question answering, and patient interaction systems.

Data Collection and Preprocessing:

 Text Extraction: Data collection involves ingesting PDF documents containing medical
information. The documents are loaded from a specified directory (DATA_PATH) using a
DirectoryLoader configured with PyPDFLoader, which extracts the text and converts it into a
format suitable for analysis by the chatbot.

Fig: Medical Textbook Used as the Chatbot's Knowledge Base


 Text Chunking: Medical documents can be lengthy and complex. To handle such documents
effectively, a RecursiveCharacterTextSplitter is employed to split the extracted text into
manageable chunks. The text splitter splits the documents into chunks with a specified size
(chunk_size) and overlap (chunk_overlap). These parameters determine the granularity of
the text chunks and the amount of overlap between adjacent chunks. Fine-tuning these
parameters can optimize the balance between chunk size and contextual coherence, as illustrated in the sketch below.
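A minimal sketch of this chunking step; the chunk_size and chunk_overlap values shown are the ones used later in Ingest.py, and the sample string is illustrative only:

    from langchain.text_splitter import RecursiveCharacterTextSplitter

    long_text = ("Anemia is a condition in which the blood lacks enough "
                 "healthy red blood cells to carry adequate oxygen. ") * 20

    splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=50)
    chunks = splitter.split_text(long_text)

    # Each chunk is at most ~500 characters, with ~50 characters shared
    # between adjacent chunks to preserve context across boundaries
    print(len(chunks), len(chunks[0]))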

Embedding Generation:

 After preprocessing, the text chunks undergo embedding generation. This involves
representing each chunk of text as a numerical vector using pre-trained language models.
The HuggingFaceEmbeddings module is employed for this task, utilizing the sentence-
transformers/all-MiniLM-L6-v2 model to generate embeddings (a short sketch follows this list).
 Semantic Representation: The embeddings capture the semantic content of the text,
encoding information about the meaning and context of the medical information contained
in the text chunks.
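A small sketch of this step, assuming the same HuggingFaceEmbeddings wrapper and MiniLM model named above (the example chunks are illustrative):

    from langchain_community.embeddings import HuggingFaceEmbeddings

    embeddings = HuggingFaceEmbeddings(
        model_name='sentence-transformers/all-MiniLM-L6-v2',
        model_kwargs={'device': 'cpu'})

    chunks = ["Iron deficiency is the most common cause of anemia.",
              "Vitamin B12 deficiency can also lead to anemia."]

    # Each chunk becomes a 384-dimensional numerical vector
    vectors = embeddings.embed_documents(chunks)
    print(len(vectors), len(vectors[0]))  # 2 chunks, 384 dimensions each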

Vector Database Creation:

The vector database creation process involves several key components:

 Text Embeddings: The embeddings generated for the text chunks serve as the basis for
constructing the vector database. Each text chunk is represented as a high-dimensional
numerical vector, capturing its semantic content.
 FAISS Library: The FAISS (Facebook AI Similarity Search) library is employed for constructing
and managing the vector database. FAISS provides highly optimized algorithms for similarity
search in large-scale datasets, making it well-suited for the task of retrieving relevant
documents during question answering.
 The embeddings of the text chunks are indexed using FAISS to create the vector database.
This indexing process organizes the embeddings in a structure that enables fast nearest
neighbor search, allowing the chatbot to retrieve the most relevant documents efficiently (a condensed sketch follows this list).
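A condensed sketch of the indexing and nearest-neighbour lookup described above, assuming chunks and embeddings prepared as in the previous steps (the documents and query string here are illustrative only):

    from langchain_community.embeddings import HuggingFaceEmbeddings
    from langchain_community.vectorstores import FAISS
    from langchain_core.documents import Document

    embeddings = HuggingFaceEmbeddings(model_name='sentence-transformers/all-MiniLM-L6-v2')

    chunks = [Document(page_content="Iron deficiency is the most common cause of anemia."),
              Document(page_content="Aspirin is used to relieve mild to moderate pain.")]

    # Index the chunk embeddings; FAISS organises them for fast nearest-neighbour search
    db = FAISS.from_documents(chunks, embeddings)
    db.save_local('vectorstore/db_faiss')

    # Retrieve the chunks closest to a query in embedding space
    hits = db.similarity_search_with_score("What causes anemia?", k=1)
    for doc, score in hits:
        print(score, doc.page_content)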

Model Loading:

In addition to constructing the vector database, the chatbot also loads a pre-trained language model
for question answering tasks. This model serves as the core component for generating responses to
user queries based on the information retrieved from the vector database.

 Selection of Pre-trained Model: A pre-trained language model suitable for conversational
question answering tasks is chosen. In the provided code, the TheBloke/Llama-2-7B-Chat-
GGML model is selected: a quantized chat variant of Llama 2 whose medical knowledge, in this
system, comes from retrieval over the Gale Encyclopedia medical textbook data rather than
from additional training.
 Integration with Chatbot Pipeline: Once initialized, the language model is integrated into the
chatbot pipeline for question answering. It serves as the primary component responsible for
understanding user queries, retrieving relevant information from the vector database, and
generating informative responses.

Q/A Chain Creation:

The pre-trained language model selected for the chatbot is integrated into the question answering
pipeline. In the provided code, the CTransformers module is used to initialize and configure the
language model.

 The vector database constructed earlier serves as the retrieval component of the question
answering pipeline. It enables the chatbot to efficiently search for relevant documents based
on user queries.
 Prompt Template: A custom prompt template is defined to structure the input for the
question answering model. This template provides context and formatting instructions for
generating responses based on user queries.
 Chain Construction: Using the components mentioned above, the question answering chain
(qa_chain) is constructed. This chain incorporates the language model, retrieval component,
and prompt template to facilitate effective question answering.

Bot Initialization and Execution:


The qa_bot function initializes the chatbot by loading the necessary components, including
the language model, vector database, and prompt template. It sets up the question
answering pipeline qa_chain for processing user queries.

The final_result function takes a user query as input and executes the chatbot by passing the
query through the question answering pipeline. The chatbot retrieves relevant information
from the vector database and generates an informative response based on the user query.
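For example, outside of the Chainlit interface the pipeline can be exercised directly; this assumes model.py is importable as a module, and the query string is illustrative only:

    # Standalone usage of the pipeline defined in model.py
    from model import final_result

    response = final_result("What are the symptoms of diabetes?")
    print(response["result"])             # the generated answer
    print(response["source_documents"])   # the retrieved supporting chunks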

Integration with Chainlit:

Chainlit is a library used for building conversational AI agents, and it plays a crucial role in
orchestrating the interaction between the chatbot and users.

 Message Handlers: Chainlit provides message handlers to manage the flow of conversation
between the chatbot and users. Handlers are set up to handle bot initialization, user queries,
and response delivery.
 User Session Management: Chainlit facilitates user session management, allowing the
chatbot to maintain context and state information across interactions with users.
 Asynchronous Processing: Chainlit supports asynchronous processing of user queries and
responses, ensuring smooth and responsive interaction between the chatbot and users.

Implementation:

Ingest.py

from langchain_community.embeddings import HuggingFaceEmbeddings
from langchain_community.vectorstores import FAISS
from langchain_community.document_loaders import PyPDFLoader, DirectoryLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter

DATA_PATH = 'data/'
DB_FAISS_PATH = 'vectorstore/db_faiss'

# Create vector database
def create_vector_db():
    # Load every PDF document from the data directory
    loader = DirectoryLoader(DATA_PATH,
                             glob='*.pdf',
                             loader_cls=PyPDFLoader)
    documents = loader.load()

    # Split the documents into overlapping chunks
    text_splitter = RecursiveCharacterTextSplitter(chunk_size=500,
                                                   chunk_overlap=50)
    texts = text_splitter.split_documents(documents)

    # Embed the chunks and build the FAISS index
    embeddings = HuggingFaceEmbeddings(model_name='sentence-transformers/all-MiniLM-L6-v2',
                                       model_kwargs={'device': 'cpu'})
    db = FAISS.from_documents(texts, embeddings)
    db.save_local(DB_FAISS_PATH)

if __name__ == "__main__":
    create_vector_db()
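Running this script once (for example, python Ingest.py) reads the PDFs in data/, splits them into chunks, embeds each chunk, and saves the resulting FAISS index to vectorstore/db_faiss, where the chatbot later loads it.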

model.py
from langchain_community.document_loaders import PyPDFLoader, DirectoryLoader
from langchain.prompts import PromptTemplate
from langchain_community.embeddings import HuggingFaceEmbeddings
from langchain_community.vectorstores import FAISS
from langchain_community.llms import CTransformers
from langchain.chains import RetrievalQA
import chainlit as cl

DB_FAISS_PATH = 'vectorstore/db_faiss'

custom_prompt_template = """Use the following pieces of information to answer the user's question.
If you don't know the answer, just say that you don't know, don't try to make up an answer.

Context: {context}
Question: {question}

Only return the helpful answer below and nothing else.
Helpful answer:
"""

def set_custom_prompt():
    """
    Prompt template for QA retrieval for each vectorstore
    """
    prompt = PromptTemplate(template=custom_prompt_template,
                            input_variables=['context', 'question'])
    return prompt

# Retrieval QA chain
def retrieval_qa_chain(llm, prompt, db):
    qa_chain = RetrievalQA.from_chain_type(llm=llm,
                                           chain_type='stuff',
                                           retriever=db.as_retriever(search_kwargs={'k': 2}),
                                           return_source_documents=True,
                                           chain_type_kwargs={'prompt': prompt})
    return qa_chain

# Loading the model
def load_llm():
    # Load the locally downloaded model here
    llm = CTransformers(
        model="TheBloke/Llama-2-7B-Chat-GGML",
        model_type="llama",
        max_new_tokens=512,
        temperature=0.5
    )
    return llm

# QA model function
def qa_bot():
    embeddings = HuggingFaceEmbeddings(model_name="sentence-transformers/all-MiniLM-L6-v2",
                                       model_kwargs={'device': 'cpu'})
    db = FAISS.load_local(DB_FAISS_PATH, embeddings,
                          allow_dangerous_deserialization=True)
    llm = load_llm()
    qa_prompt = set_custom_prompt()
    qa = retrieval_qa_chain(llm, qa_prompt, db)
    return qa

# Output function
def final_result(query):
    qa_result = qa_bot()
    response = qa_result({'query': query})
    return response

# Chainlit code
@cl.on_chat_start
async def start():
    chain = qa_bot()
    msg = cl.Message(content="Starting the bot...")
    await msg.send()
    msg.content = "Hi, Welcome to Medical Bot. What is your query?"
    await msg.update()

    cl.user_session.set("chain", chain)

@cl.on_message
async def main(message: cl.Message):
    chain = cl.user_session.get("chain")
    cb = cl.AsyncLangchainCallbackHandler(
        stream_final_answer=True, answer_prefix_tokens=["FINAL", "ANSWER"]
    )
    cb.answer_reached = True
    res = await chain.acall(message.content, callbacks=[cb])
    answer = res["result"]
    # sources = res["source_documents"]
    print(answer)
    # if sources:
    #     answer += f"\nSources:" + str(sources)
    # else:
    #     answer += "\nNo sources found"

    await cl.Message(content=answer).send()
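With the index built by Ingest.py in place, and assuming a standard Chainlit installation, the assistant can be started from the project directory with a command of the form:

    chainlit run model.py

Chainlit then serves the chat interface and routes user messages through the handlers defined above.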

Results and Discussion:

