0% found this document useful (0 votes)
124 views10 pages

Infosys Internship 4.0 Project Documentation NEW

Uploaded by

Suyash Lade
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
124 views10 pages

Infosys Internship 4.0 Project Documentation NEW

Uploaded by

Suyash Lade
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 10

CHATBOT

Infosys Internship 4.0 Project Documentation

Title: Project Documentation: INFOSYS CHATBOT

Team Members:

1. Suyash Lade (Team Lead)


2. Jinesh Shah
3. Kantamsetti Kanchana
4. Poshita Inaganti
5. Padam Dhakappa
6. Supriya Kumari
7. Padmasri

•Introduction:
The Infosys Chatbot project aims to develop an intelligent and interactive chatbot
that enhances customer engagement and operational efficiency across various
business sectors. Leveraging advanced technologies such as artificial intelligence
(AI), machine learning (ML), and natural language processing (NLP), the chatbot is
designed to provide real-time support, automate routine tasks, and deliver
personalized experiences to users.

Objectives:
• Provide instant responses to customer queries.
• Improve customer satisfaction by offering 24/7 assistance
• To enhance user experience by allowing real-time interactions through a
user-friendly interface.

1
CHATBOT

Significance:
• Customer Satisfaction : By providing quick and accurate responses, the
chatbot significantly enhances the customer experience.
• Scalability: The chatbot can handle a large volume of interactions
simultaneously, making it scalable for businesses of all sizes.
• Innovation: The project showcases Infosys' commitment to leveraging
cutting-edge technology to solve business challenges.

•Project Scope:
Include :
d
• Task 1: Web Scraping - Extracting information from Infosys's website

(about, history, subsidiaries, newsroom ).


• Task 2: Data Storage and Retrieval - Storing scraped data as embeddings in

Qdrant for efficient retrieval.


• Task 3: Integration with OpenAI API - Using OpenAI's RAG model to

enhance chatbot responses.


• Task 4: User Interface for Chatbot - Creating a Streamlit-based web

interface for user interaction.


Not Included:
• Deep domain-specific functionalities beyond Infosys-related queries.

• Non-Infosys Content: The scope of the chatbot is limited to information


related to Infosys. It does not cover general knowledge or topics unrelated
to the company.
• Real-Time Data Updating: The project does not include real-time updating
of the database with new information. Changes on the Infosys website
would require a manual re-run of the scraping and data insertion processes.
• Multi-Language Support: The chatbot is designed to operate primarily in
English. It does not include multi-language support, which could limit
accessibility for non-English speaking users.

2
CHATBOT

•Requirements:
Functional Requirements
Web Scraping : Utilize web scraping tools such as requests and BeautifulSoup to
extract relevant content from designated sections of the Infosys website. This
includes sections like the company overview, history, subsidiaries, and newsroom
pages. The scraped data will be structured and cleaned for further processing.
Data Storage : Store the scraped text data as embeddings in the Qdrant vector
database. This process involves converting the text data into vector
representations (embeddings) that can be efficiently searched and retrieved
when needed.

Integration: Connect the system with the OpenAI API to leverage the Retrieval-
Augmented Generation (RAG) model. This integration allows the chatbot to use
advanced natural language processing capabilities to generate enhanced,
contextually relevant responses to user queries.

User Interface: Develop a responsive and intuitive user interface using the
Streamlit framework. The UI should allow users to interact with the chatbot easily,
input queries, and receive responses in a user-friendly manner.

Non-Functional Requirements

Performance : Ensure the system delivers quick response times for user queries.
The chatbot should process and return answers promptly, maintaining high
performance under varying loads.
Scalabilit : Design the system to handle multiple concurrent user interactions
y

3
CHATBOT

without performance degradation. The architecture should support scaling up to


meet increased demand.

Security: Implement robust security measures to protect user data and ensure
compliance with data privacy regulations (e.g., GDPR). This includes data
encryption, secure API communication, and regular security audits.

•Technical Stack:

Programming Languages
• Python: Used for backend processing, web scraping, data handling, and
integration with external APIs. Its extensive support for machine learning
and AI integration also makes it ideal for this project.

Frameworks/Libraries:

• BeautifulSoup : Employed for web scraping tasks to extract content from


designated sections of the Infosys website.
• Requests: Utilized for making HTTP requests to fetch web pages for
scraping.
• Qdrant-Client: Used for interacting with the Qdrant vector database,
including storing and retrieving data embeddings.
• OpenAI API: Integrated to leverage the Retrieval-Augmented Generation
(RAG) model for enhancing chatbot responses.
• Streamlit: Provides a simple and effective framework for developing the
frontend UI, allowing users to interact with the chatbot seamlessly.

4
CHATBOT

Databases:
• Qdrant : Qdrant is a vector database designed for efficient similarity search
and storage of high-dimensional vector embeddings. It is crucial for
handling the embeddings generated from the scraped text data, enabling
quick and relevant responses to user queries.

Tools/Platforms:

• GitHub: Used for version control and collaboration, allowing for efficient
management of code changes and team collaboration.
• Docker: Used for containerizing the application, ensuring consistency across
different environments and simplifying deployment processes.
• Streamlit: Provides an easy-to-use platform for deploying and sharing
Streamlit applications, enabling seamless user interactions.

•Architecture/Design:
System Architecture
• Web Scraping Module: Retrieves content from specific sections of the
Infosys website (about, history, subsidiaries, newsroom) using Python's
requests and BeautifulSoup libraries.

• Data Processing and Storage: Converts the extracted text data into
embeddings using a language model. Stores these embeddings in the
Qdrant vector database, allowing efficient retrieval based on similarity
searches.

• OpenAI Integration: Utilizes the data stored in Qdrant to enhance user


queries with additional context. Connects to the OpenAI API, leveraging the
Retrieval-Augmented Generation (RAG) model to generate accurate and
context-aware responses.

5
CHATBOT

• Streamlit UI: Provides a user-friendly, web-based interface developed with


Streamlit. Allows users to interact with the chatbot, input queries, and
receive responses seamlessly.
Trade-offs
• Dependency on OpenAI: Integrating OpenAI's advanced RAG model
provides superior response generation capabilities.However, this
dependency introduces a reliance on external API availability and
performance, which could impact the system if the API experiences
downtime or issues.

Python for ML: Python was chosen for its rich ecosystem of machine
learning libraries and tools, making development more straightforward and
efficient. Despite Python's advantages, it can have performance overheads
compared to lower-level languages, potentially impacting real-time
processing speeds.
•Development:

Technologies and Frameworks


• BeautifulSoup:
• Implemented BeautifulSoup for web scraping to extract structured
data from various sections of the Infosys website.
• BeautifulSoup's powerful parsing capabilities helped in navigating
and extracting the required information from complex HTML
structures.
Challenges:

Data Extraction:
• Challenge: Handling dynamic and complex web content during the
scraping
process.
•Solution: Refined scraping algorithms to manage dynamic content loading
and utilized techniques like parsing JavaScript-rendered data.
Integration Complexity:

6
CHATBOT

• Challenge: Integrating multiple APIs and ensuring seamless


communication between different modules.

• Solution: Thoroughly reviewed API documentation, conducted extensive


testing, and used robust error-handling mechanisms.

•Testing: Unit
Tests:
• Focus on validating individual module functionalities to ensure each
part works correctly in isolation.
• Examples include testing the web scraping functions to ensure accurate
data extraction and checking the API endpoints.
Integration Tests:
• Verify that the different modules (web scraping, data storage, AI
integration, and UI) work together seamlessly.
• Tests include ensuring data flows correctly from scraping to storage and
then to AI for response generation.
System Tests:
• Conduct end-to-end testing to verify the complete functionality of the
system, simulating user interaction scenarios.
• Ensure that user queries are processed correctly, and responses are
generated and displayed accurately in the UI.
Results
Identified and Resolved Issues:
• Data parsing inconsistencies were detected and corrected to improve the
accuracy of the extracted information.
• API integration mismatches were addressed to ensure smooth
communication between the components.
Performance Tests:
• Conducted to confirm that the system can handle expected user loads with
good scalability and responsiveness.

7
CHATBOT

• Ensured that the application remains performant under concurrent user


interactions
•User Guide:
Initial Setup and Configuration

Installation:
• Ensure Python is installed.
• Install required packages: beautifulsoup4, requests, streamlit, openai, and
qdrant-client .
Configuration:
• Set up API keys and sensitive information using environment variables or a

.env file.
• Ensure Qdrant is configured and running locally or on a server.

Running the Application:


• Navigate to the directory with Task_4_UI_Chatbot.py .

Run the app with streamlit run Task_4_UI_Chatbot.py .
• Access the chatbot at localhost:8501 in your web browser.
Using the Application
Interacting with the Chatbot:
• Enter your query about Infosys in the text input box.
• Press 'Send' or hit 'Enter' to submit.
• The chatbot will process and respond to your query.

Troubleshooting Common Issues


Chatbot Does Not Start:
• Problem: Errors when running the Streamlit script.
• Solution: Check dependencies, Python environment, and API keys.
API Connectivity Issues:
• Problem: Failures with OpenAI API or Qdrant connectivity.
• Solution: Verify API keys, endpoint configurations, and network
connectivity. Ensure API limits are not exceeded.

8
CHATBOT

•Conclusion :
The Infosys Chatbot project successfully developed an intelligent and responsive
chatbot capable of handling a variety of Infosys-related queries. This achievement
was made possible through the integration of several key components:

Effective Web Scraping: Extracted relevant content from various sections of the
Infosys website using BeautifulSoup and requests, providing a robust dataset for
the chatbot.
Efficient Data Storage and Retrieval: Utilized Qdrant, a vector database, to store
text embeddings, ensuring quick and efficient data retrieval for processing user
queries.

Advanced AI Integration: Leveraged OpenAI's Retrieval-Augmented Generation


(RAG) model to enhance the chatbot's responses, combining retrieved data with
advanced AI capabilities for more accurate and contextually relevant answers.

User-Friendly Interface: Developed a Streamlit-based web interface that is


intuitive and easy to use, allowing users to interact seamlessly with the chatbot.

Documentation and Troubleshooting: Provided clear and detailed


documentation, including user guides and troubleshooting tips, to facilitate easy
setup, deployment, and maintenance of the chatbot.
Achievements
• Functionality: Developed a chatbot that effectively answers Infosys-
related queries by combining real-time data retrieval and AI-generated
responses.
• Scalability: Designed a scalable system architecture that can handle
multiple concurrent user interactions.
• Performance: Achieved quick response times and efficient data
processing, ensuring a smooth user experience.

9
CHATBOT

• Security: Implemented robust security measures to ensure data privacy


compliance.
• Usability: Delivered a user-friendly interface that enhances user
interaction and engagement with the chatbot.

10

You might also like