DL Pro 456
Large Language Models (LLMs) are advanced natural language processing models that use deep learning techniques to understand and generate human-like text. These models, such as OpenAI's GPT (Generative Pre-trained Transformer) series, are trained on vast amounts of text data to learn the intricacies of human language. LLMs are capable of a wide range of language tasks, including text generation, translation, summarization, and question answering.
Sentence Embedding:
Retrieval-based question answering (QA) is a technique that involves retrieving relevant documents
or passages from a database and selecting answers based on them. In this approach, a question is
first encoded into a vector representation, often using techniques like sentence embedding. Then,
the vector database, containing precomputed embeddings of documents or sentences, is queried to
retrieve the most relevant passages related to the question. Finally, answer selection algorithms are
applied to extract or generate answers from the retrieved passages. This approach is particularly
effective for QA tasks where the answer can be found within the given context, such as factoid-based
questions or information retrieval tasks. By leveraging the semantic similarity between the question
and the documents, retrieval-based QA systems can provide accurate and relevant answers to user
queries.
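The sketch below shows this retrieve-by-similarity step in isolation, using the sentence-transformers library directly. The two corpus sentences and the question are illustrative placeholders, not data from the chatbot.
from sentence_transformers import SentenceTransformer, util

corpus = [
    "Aspirin is commonly used to reduce fever and relieve mild pain.",
    "Insulin regulates the level of glucose in the blood.",
]
model = SentenceTransformer('sentence-transformers/all-MiniLM-L6-v2')
corpus_embeddings = model.encode(corpus, convert_to_tensor=True)       # precomputed passage embeddings
question_embedding = model.encode("What does insulin do?", convert_to_tensor=True)

scores = util.cos_sim(question_embedding, corpus_embeddings)[0]        # semantic similarity scores
print(corpus[int(scores.argmax())])                                    # most relevant passage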
Text Extraction: Data collection involves ingesting PDF documents containing medical information. These documents are loaded from a specified directory (DATA_PATH) using a DirectoryLoader configured with PyPDFLoader, which extracts the raw text and converts it into a format suitable for analysis by the chatbot.
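As a quick illustration of what the loader produces (the file name below is a placeholder), a single PDF can be inspected as follows:
from langchain_community.document_loaders import PyPDFLoader

loader = PyPDFLoader('data/sample_medical_reference.pdf')   # placeholder file name
pages = loader.load()                                       # one Document per PDF page
print(len(pages), 'pages extracted')
print(pages[0].page_content[:200])                          # first 200 characters of extracted text
print(pages[0].metadata)                                    # e.g. {'source': ..., 'page': 0}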
Embedding Generation:
After preprocessing, the text chunks undergo embedding generation. This involves
representing each chunk of text as a numerical vector using pre-trained language models.
The HuggingFaceEmbeddings module is employed for this task, utilizing the sentence-transformers/all-MiniLM-L6-v2 model to generate embeddings.
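As a small, self-contained illustration (the sentence is a placeholder), a single chunk can be embedded like this:
from langchain_community.embeddings import HuggingFaceEmbeddings

embedder = HuggingFaceEmbeddings(model_name='sentence-transformers/all-MiniLM-L6-v2',
                                 model_kwargs={'device': 'cpu'})
vector = embedder.embed_query("Paracetamol is used to treat mild pain and fever.")
print(len(vector))   # all-MiniLM-L6-v2 produces a 384-dimensional vector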
Semantic Representation: The embeddings capture the semantic content of the text,
encoding information about the meaning and context of the medical information contained
in the text chunks.
Text Embeddings: The embeddings generated for the text chunks serve as the basis for
constructing the vector database. Each text chunk is represented as a high-dimensional
numerical vector, capturing its semantic content.
FAISS Library: The FAISS (Facebook AI Similarity Search) library is employed for constructing
and managing the vector database. FAISS provides highly optimized algorithms for similarity
search in large-scale datasets, making it well-suited for the task of retrieving relevant
documents during question answering.
The embeddings of the text chunks are indexed using FAISS to create the vector database.
This indexing process organizes the embeddings in a structure that enables fast nearest
neighbor search, allowing the chatbot to retrieve the most relevant documents efficiently.
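As a simplified illustration of the nearest-neighbour lookup FAISS performs (the LangChain wrapper used in the implementation below handles this indexing automatically), consider the following sketch; the random vectors stand in for real chunk and query embeddings.
import numpy as np
import faiss

dim = 384                                                     # all-MiniLM-L6-v2 embedding size
chunk_vectors = np.random.rand(1000, dim).astype('float32')   # placeholder chunk embeddings
index = faiss.IndexFlatL2(dim)                                # exact L2 nearest-neighbour index
index.add(chunk_vectors)

query_vector = np.random.rand(1, dim).astype('float32')       # placeholder query embedding
distances, ids = index.search(query_vector, 2)                # two closest chunks
print(ids[0], distances[0])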
Model Loading:
In addition to constructing the vector database, the chatbot also loads a pre-trained language model
for question answering tasks. This model serves as the core component for generating responses to
user queries based on the information retrieved from the vector database.
The pre-trained language model selected for the chatbot is integrated into the question answering
pipeline. In the provided code, the CTransformers module is used to initialize and configure the
language model.
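The excerpt below does not show the model-loading call itself, so the following is a minimal sketch of how a load_llm helper is typically written with CTransformers; the checkpoint name, model_type, and generation settings are assumptions rather than values taken from this project.
from langchain_community.llms import CTransformers

def load_llm():
    # Assumed quantised GGML checkpoint; substitute the model actually used by the chatbot
    llm = CTransformers(model="TheBloke/Llama-2-7B-Chat-GGML",
                        model_type="llama",
                        config={'max_new_tokens': 512, 'temperature': 0.5})
    return llm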
The vector database constructed earlier serves as the retrieval component of the question
answering pipeline. It enables the chatbot to efficiently search for relevant documents based
on user queries.
Prompt Template: A custom prompt template is defined to structure the input for the
question answering model. This template provides context and formatting instructions for
generating responses based on user queries.
Chain Construction: Using the components mentioned above, the question answering chain
(qa_chain) is constructed. This chain incorporates the language model, retrieval component,
and prompt template to facilitate effective question answering.
The final_result function takes a user query as input and executes the chatbot by passing the
query through the question answering pipeline. The chatbot retrieves relevant information
from the vector database and generates an informative response based on the user query.
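For example, the pipeline can be exercised directly as follows; the query string is an illustrative placeholder:
response = final_result("What are the common side effects of ibuprofen?")
print(response['result'])             # generated answer
print(response['source_documents'])   # chunks retrieved from the FAISS index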
Chainlit is a library used for building conversational AI agents, and it plays a crucial role in
orchestrating the interaction between the chatbot and users.
Message Handlers: Chainlit provides message handlers to manage the flow of conversation
between the chatbot and users. Handlers are set up to handle bot initialization, user queries,
and response delivery.
User Session Management: Chainlit facilitates user session management, allowing the
chatbot to maintain context and state information across interactions with users.
Asynchronous Processing: Chainlit supports asynchronous processing of user queries and
responses, ensuring smooth and responsive interaction between the chatbot and users.
Implementation:
Ingest.py
from langchain_community.document_loaders import PyPDFLoader, DirectoryLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain_community.embeddings import HuggingFaceEmbeddings
from langchain_community.vectorstores import FAISS

DATA_PATH = 'data/'
DB_FAISS_PATH = 'vectorstore/db_faiss'

# Create the FAISS vector store from the PDF corpus
def create_vector_db():
    # Load every PDF in DATA_PATH; the glob pattern is an assumed convention
    loader = DirectoryLoader(DATA_PATH, glob='*.pdf', loader_cls=PyPDFLoader)
    documents = loader.load()
    # Split the extracted text into overlapping 500-character chunks
    text_splitter = RecursiveCharacterTextSplitter(chunk_size=500,
                                                   chunk_overlap=50)
    texts = text_splitter.split_documents(documents)
    # Embed each chunk with all-MiniLM-L6-v2 and index the vectors with FAISS
    embeddings = HuggingFaceEmbeddings(model_name='sentence-transformers/all-MiniLM-L6-v2',
                                       model_kwargs={'device': 'cpu'})
    db = FAISS.from_documents(texts, embeddings)
    db.save_local(DB_FAISS_PATH)

if __name__ == "__main__":
    create_vector_db()
model.py
from langchain_community.document_loaders import PyPDFLoader, DirectoryLoader
from langchain.prompts import PromptTemplate
from langchain_community.embeddings import HuggingFaceEmbeddings
from langchain_community.vectorstores import FAISS
from langchain_community.llms import CTransformers
from langchain.chains import RetrievalQA
import chainlit as cl
DB_FAISS_PATH = 'vectorstore/db_faiss'
# Reconstructed wrapper for the prompt template; only the Context and Question
# fields appear in this excerpt, so the instruction line is an assumption.
custom_prompt_template = """Use the following context to answer the user's question.

Context: {context}
Question: {question}
"""
def set_custom_prompt():
    """
    Prompt template for QA retrieval for each vectorstore
    """
    prompt = PromptTemplate(template=custom_prompt_template,
                            input_variables=['context', 'question'])
    return prompt
#Retrieval QA Chain
def retrieval_qa_chain(llm, prompt, db):
    qa_chain = RetrievalQA.from_chain_type(llm=llm, chain_type='stuff',
                                           retriever=db.as_retriever(search_kwargs={'k': 2}),
                                           return_source_documents=True,
                                           chain_type_kwargs={'prompt': prompt})
    return qa_chain

#QA Model Function
def qa_bot():
    # Reconstructed from the description above: load the saved FAISS index and the
    # custom prompt, then assemble the RetrievalQA chain. load_llm() initialises the
    # CTransformers model (see the Model Loading sketch); its definition is omitted
    # from this excerpt. Newer langchain_community releases may also require
    # allow_dangerous_deserialization=True when calling FAISS.load_local.
    embeddings = HuggingFaceEmbeddings(model_name='sentence-transformers/all-MiniLM-L6-v2',
                                       model_kwargs={'device': 'cpu'})
    db = FAISS.load_local(DB_FAISS_PATH, embeddings)
    llm = load_llm()
    qa_prompt = set_custom_prompt()
    qa = retrieval_qa_chain(llm, qa_prompt, db)
    return qa
#output function
def final_result(query):
    qa_result = qa_bot()
    response = qa_result({'query': query})
    return response
#chainlit code
@cl.on_chat_start
async def start():
    chain = qa_bot()
    msg = cl.Message(content="Starting the bot...")
    await msg.send()
    msg.content = "Hi, Welcome to Medical Bot. What is your query?"
    await msg.update()
    cl.user_session.set("chain", chain)
@cl.on_message
async def main(message: cl.Message):
    chain = cl.user_session.get("chain")
    cb = cl.AsyncLangchainCallbackHandler(
        stream_final_answer=True, answer_prefix_tokens=["FINAL", "ANSWER"]
    )
    cb.answer_reached = True
    res = await chain.acall(message.content, callbacks=[cb])
    answer = res["result"]
    #sources = res["source_documents"]
    print(answer)
    #if sources:
    #    answer += f"\nSources:" + str(sources)
    #else:
    #    answer += "\nNo sources found"
    await cl.Message(content=answer).send()