Disha Chat Bot

INDIAN INSTITUTE OF INFORMATION TECHNOLOGY,

NAGPUR

Department of Computer Science and Engineering

First Project Presentation on

Supervisor: Dr. Mangesh Kose

Enrollment No. and Name of Student:
BT21CSE161 Lakshit Upreti
BT21CSE188 Abhishek Kumar
BT21CSE194 Gyanbardhan
Contents

• Identification of Project/Problem Statement
• Functionalities
• Literature survey
• Work Flow
• Fine Tuning LLM
• Retrieval-Augmented Generation
• What's new in Disha?
• Our Progress so far
Identification of Project/Problem Statement
Introduction:
Information overload! In this digital age, finding the right information can be a
daunting task. We observed that navigating our college website was painful. Students
and visitors alike struggled to find the right information, especially when the
timetables were scattered across badly managed resources, like a puzzle with
missing pieces.
Here comes Disha!!
Disha is our innovative solution. Imagine asking questions in a normal, friendly way and
getting instant answers! With Machine Learning, Natural Language Processing and Large
Language Models at its core, this simple conversational chatbot will improve the user
experience on our college website and deliver quick, convenient responses to every
query about our college, for everyone from current students to prospective visitors.
With Disha, we make things easier and save precious time for everyone who uses it.
"Let's make accessing information a piece of cake!"
Functionalities
Human-Like Interaction:
With Disha, users can ask questions naturally, just like speaking to a person. Whether
it's a simple inquiry or a detailed query about the college, Disha will respond with
ease, providing accurate and helpful answers. The chatbot will be designed to make
conversation feel smooth and intuitive.
Multilingual – Converse in Hindi and English:
Disha breaks language barriers! Users can ask their questions in both Hindi and English
and receive accurate responses in the same language. This makes Disha accessible to
a wider audience, including both local and international users.
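
To reply in the same language as the question, the chatbot first has to decide which language the question is in. The following is a minimal sketch of one possible approach, not the project's actual method: it assumes Hindi queries are written in Devanagari script and simply compares the number of Devanagari letters with the number of Latin letters. The function name `detect_language` is hypothetical.

```python
def detect_language(text: str) -> str:
    """Return 'hi' if the text has more Devanagari than Latin letters, else 'en'.

    Assumption: Hindi is typed in Devanagari (Unicode block U+0900 to U+097F).
    Hindi typed in Latin script would be classified as English by this heuristic.
    """
    devanagari = sum(1 for ch in text if "\u0900" <= ch <= "\u097F")
    latin = sum(1 for ch in text if ch.isascii() and ch.isalpha())
    return "hi" if devanagari > latin else "en"


if __name__ == "__main__":
    for query in ["What is the fee structure?", "फीस कितनी है?"]:
        print(detect_language(query), "<-", query)
```

The detected code could then be passed along with the query so that the answer is generated in the matching language. A production system would more likely rely on a trained language-identification model.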

Voice Command – Ask Your Query Just by Speaking:
No need to type, just tap and speak! With voice query functionality, users can ask
their questions in Hindi or English by simply talking to Disha. Whether you're typing or
speaking, Disha will provide quick, relevant answers, making the interaction more
convenient and user-friendly.
Literature survey
Chatbot Development for Accounting Firm: Fine-Tuning with Novel Datasets:
Objective: Develop a chatbot for accounting firm automation.
Methodology: Fine-tune LLMs using ADE (tax) and ITACA (legal) datasets.
Results: Fine-tuned models outperform existing models, thanks to novel datasets.
Impact: Reduces manual workload, provides fast, accurate legal/fiscal info.
Conclusion: Dataset creation and LLM fine-tuning enhance chatbot capabilities
in specialized domains.
Link: https://webthesis.biblio.polito.it/secure/31058/1/tesi.pdf
Literature survey
Leveraging LLMs for Enhanced RAC Accessibility:
Objective: Simplify RAC using LLMs.
Dataset: 24,478 labeled Q&A pairs.
Methodology: Data extraction, labeling, model training (GEMMA 1.1).
Results: V8 model excelled, average rating 7/10, RAC 3 needs improvement.
Impact: Reduces expert reliance, improves compliance and accessibility.
Conclusion: LLMs can revolutionize RAC understanding.
Link: https://arxiv.org/pdf/2405.08792
Literature survey
Fine-Tuning LLMs in Ophthalmology and GPT-4-Based Evaluation:
Objective: Evaluate clinical alignment of LLM-generated responses.
Fine-tuning: 5 LLMs (GPT-3.5, LLAMA2 series) for ophthalmology.
Methodology: Dataset of 400 Q&A pairs, GPT-4 evaluation.
Results: GPT-3.5 top performer (87.1% accuracy), LLAMA2s followed.
Agreement: High correlation between GPT-4 and human clinician rankings.
Conclusion: GPT-4 is a promising evaluator, LLMs need improvement for clinical use.
Closing Thought: LLMs have potential in ophthalmology, but accuracy and
reliability are crucial.
Link: https://arxiv.org/pdf/2402.10083
Literature survey
Smart Chatbot Architecture for Healthcare:
Introduction: Chatbots for healthcare assistance.
Challenges: Limitations of rule-based systems.
Proposed Architecture: NLP, ML for intelligent conversation.
Technological Components: NLU, NLG, ASR.
Healthcare Application: Reduces burden, offers personalized responses.
Link: https://dl.acm.org/doi/10.1145/3386723.3387897
Literature survey
RAG-Based Chatbot for PDF Queries:
Objective: Answer queries from PDFs using LLMs and RAG.
Technology: Python, PyTorch, LangChain, Hugging Face, Chainlit.
Methodology: Data preprocessing, text segmentation, embeddings, vector store.
Key Components: LLM models (Llama2-7B, GPT-3.5), RAG system, user interface.
Results: Efficient, contextually accurate responses, web application.
Conclusion: Improved response accuracy, scalability for future development.
Link: https://ijsrem.com/download/rag-based-chatbot-using-llms/
Fine Tuning LLM

[Diagram: IIITN Dataset; GPT-2, LLaMA-2, Gemma-2, BERT]

Fine-tuning is the process of adapting a pre-trained model to a specific task or dataset by
adjusting its weights. It is essential here to tailor the chatbot for IIITN-specific queries and
ensure accurate, relevant responses.
• Data from the IIITN website will be fed into the model.
• Models: BERT, GPT-2, LLaMA-2, Gemma-2.
• PEFT techniques (LoRA, QLoRA) will be used, along with Unsloth.
• Customization for accurate, IIITN-specific answers.
• Open-source models ensure flexibility and scalability.
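
To illustrate why LoRA is considered parameter-efficient, here is a small, library-free sketch of the idea. It is not the project's training code. LoRA keeps the pre-trained weight matrix W frozen and learns a low-rank update, so the effective weight is W + (alpha / r) * B A, where A has shape (r, d_in), B has shape (d_out, r) and r is small. The helper names and the toy sizes below are assumptions chosen for illustration.

```python
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def lora_effective_weight(w, a, b, alpha):
    """Return W + (alpha / r) * (B @ A), where r is the LoRA rank (rows of A)."""
    r = len(a)
    scale = alpha / r
    delta = matmul(b, a)  # shape (d_out, d_in), same as W
    return [[w_ij + scale * d_ij for w_ij, d_ij in zip(w_row, d_row)]
            for w_row, d_row in zip(w, delta)]


def trainable_params(d_out, d_in, r):
    """Trainable weights for full fine-tuning versus LoRA on one matrix."""
    return {"full": d_out * d_in, "lora": r * (d_in + d_out)}


if __name__ == "__main__":
    d_out, d_in, r, alpha = 6, 8, 2, 4  # toy sizes, chosen only for the demo
    rng = random.Random(0)
    w = [[rng.gauss(0, 1) for _ in range(d_in)] for _ in range(d_out)]  # frozen
    a = [[rng.gauss(0, 0.01) for _ in range(d_in)] for _ in range(r)]   # trainable
    b = [[0.0] * r for _ in range(d_out)]                               # trainable, starts at zero

    # B starts at zero, so the adapted layer initially equals the pre-trained one.
    print(lora_effective_weight(w, a, b, alpha) == w)
    print(trainable_params(4096, 4096, 8))
```

For a single 4096 x 4096 matrix with rank 8, the LoRA update trains 65,536 weights instead of 16,777,216. QLoRA applies the same idea on top of a quantized base model. In practice the update would be trained with a deep learning framework rather than with plain Python lists.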
Retrieval-Augmented Generation

[Diagram: IIITN Dataset; Pinecone, ChromaDB; LLM (Google Gemini)]

RAG is a hybrid approach where a model retrieves relevant information from an external
database before generating an answer. This enhances the model's response by grounding it
in accurate, up-to-date data.
• Pinecone and ChromaDB will be used for efficient data storage and retrieval.
• Gemini's free API will be used for LLM output.
• LangChain will be used for creating the pipeline between the database and the LLM.
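
The retrieve-then-generate flow described above can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not the project's pipeline: a bag-of-words vector stands in for a real embedding model, an in-memory list stands in for the vector database, and the final LLM call is left out, so the sketch stops at building the prompt. The sample chunks are made-up placeholders, not real IIITN data, and all function names are hypothetical.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy embedding: term counts. A real system would use a learned embedding model."""
    return Counter(re.findall(r"\w+", text.lower()))


def cosine(u, v):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(u[t] * v[t] for t in u if t in v)
    norm = (math.sqrt(sum(c * c for c in u.values()))
            * math.sqrt(sum(c * c for c in v.values())))
    return dot / norm if norm else 0.0


def retrieve(query, chunks, k=2):
    """Return the k chunks most similar to the query."""
    q = embed(query)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]


def build_prompt(query, context_chunks):
    """Ground the question in the retrieved context before it is sent to an LLM."""
    context = "\n".join(f"- {c}" for c in context_chunks)
    return ("Answer using only the context below.\n"
            f"Context:\n{context}\n"
            f"Question: {query}")


if __name__ == "__main__":
    # Placeholder chunks invented for the demo.
    chunks = [
        "The library is open from 9 am to 9 pm on weekdays.",
        "Hostel allotment forms are available at the admin office.",
        "The CSE timetable is published on the academics page.",
    ]
    question = "When is the library open?"
    print(build_prompt(question, retrieve(question, chunks, k=1)))
```

In the planned system, the storage and similarity search would be handled by the vector database, and the prompt would be passed to the LLM to produce the final answer.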
What's new in Disha?
Unified Intelligence: Merging RAG and LLMs for Better Results:
Our project combines Retrieval-Augmented Generation (RAG) with fine-tuned LLMs
trained on our college data, and a summarizing LLM that unifies both outputs into
one cohesive response, delivering clarity and precision.

Key Innovations:
Unified Responses: The summarizing LLM merges the best information from both the RAG
system and the fine-tuned LLMs.
Context Preservation: Critical details from all sources are maintained for accuracy.
Robust and Reliable: Facts are double-checked, i.e., responses fetched by RAG are
compared with the response from the LLM fine-tuned on our college website data,
ensuring high reliability.
Natural Flow: Tailored, user-friendly language enhances the conversation.
With this innovative architecture, Disha offers smarter, clearer, and more intuitive
responses, creating a superior user experience!
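
The double-check and merge step could look roughly like the sketch below. It is one possible design, not the project's implementation: agreement between the two drafts is estimated with a simple word-overlap (Jaccard) score, and the result is written into the prompt for the summarizing LLM. The threshold value, the rule of preferring the retrieved draft on disagreement, and all function names are assumptions.

```python
import re


def token_set(text):
    """Lowercased set of word tokens in the text."""
    return set(re.findall(r"\w+", text.lower()))


def agreement(a, b):
    """Jaccard overlap between two answers, from 0.0 (disjoint) to 1.0 (identical words)."""
    ta, tb = token_set(a), token_set(b)
    union = ta | tb
    return len(ta & tb) / len(union) if union else 1.0


def build_summarizer_prompt(question, rag_answer, ft_answer, threshold=0.3):
    """Combine both drafts into one prompt for the summarizing LLM.

    The threshold and the tie-breaking instruction are illustrative assumptions.
    """
    score = agreement(rag_answer, ft_answer)
    if score >= threshold:
        note = "The two drafts largely agree."
    else:
        note = "The two drafts disagree; prefer the retrieved draft and state any uncertainty."
    return (f"Question: {question}\n"
            f"Draft from RAG: {rag_answer}\n"
            f"Draft from fine-tuned model: {ft_answer}\n"
            f"Note: {note}\n"
            "Write one clear answer that keeps every critical detail.")


if __name__ == "__main__":
    print(build_summarizer_prompt(
        "When is the library open?",        # placeholder question
        "The library is open 9 am to 9 pm.",  # placeholder drafts
        "It is open from 9 am to 9 pm.",
    ))
```

Word overlap is a crude proxy for factual agreement. An embedding-based similarity or an LLM-based comparison would be natural replacements for `agreement`.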
Our Progress so far
Data Collection – Gathering Information:
One of the most crucial steps in our project is collecting the right data. We’ve
already completed most of our data collection, primarily through web scraping, to
gather all the necessary information from our college website. This data will form
the backbone of Disha’s knowledge, ensuring accurate and relevant responses.

Data Preprocessing – Cleaning and Refining the Data:

We are now working on preprocessing the collected data, applying various techniques
to clean, organize, and structure it for optimal use. This step ensures that Disha can
efficiently process and retrieve information, making sure users get fast and accurate
answers every time.
"With clean, well-processed data, Disha is set to deliver a seamless, intelligent
experience!"
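
As an illustration of the kind of cleaning and structuring described above, the sketch below strips markup from a scraped HTML page, normalizes whitespace, and splits the text into overlapping word chunks suitable for later retrieval. It uses only the Python standard library. The chunk size, the overlap, and the function names are assumptions, not the project's actual settings.

```python
import re
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect visible text, skipping the contents of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if not self._skip:
            self.parts.append(data)


def html_to_text(html):
    """Return the page text with tags removed and whitespace collapsed."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return re.sub(r"\s+", " ", " ".join(parser.parts)).strip()


def chunk_words(text, size=50, overlap=10):
    """Split text into chunks of `size` words, each sharing `overlap` words with the previous one."""
    if overlap >= size:
        raise ValueError("overlap must be smaller than size")
    words = text.split()
    if not words:
        return []
    step = size - overlap
    return [" ".join(words[i:i + size])
            for i in range(0, max(len(words) - overlap, 1), step)]


if __name__ == "__main__":
    page = "<html><style>p{}</style><p>Hello   <b>world</b></p><script>x=1</script></html>"
    text = html_to_text(page)
    print(text)
    print(chunk_words("a b c d e", size=3, overlap=1))
```

Overlapping chunks reduce the chance that a fact is cut in half at a chunk boundary, at the cost of some duplicated text in the store.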