ConversaiLabs Assignment - 1

The document outlines an assignment to develop a full-stack web scraper integrated with a chatbot that utilizes scraped content to generate responses using Retrieval-Augmented Generation (RAG) and LLM techniques. Key requirements include scraping content from a user-provided URL, enabling user interaction through a CLI or web application, and submitting the project via a GitHub repository with organized code and documentation. Optional features for bonus points include source citations in responses and performance improvements through multi-threading or async techniques.

Uploaded by

ANKIT JHA
Copyright
© All Rights Reserved

Assignment: Full-Stack Web Scraper with Chatbot

Objective

Develop a full-stack application that scrapes content from a user-provided website URL. The application should allow users to interact with a chatbot that uses the scraped content to generate responses, leveraging Retrieval-Augmented Generation (RAG) and LLM techniques.

Requirements

1. Web Scraping
○ The application should scrape content only from the provided URL. There is no need to scrape any other referenced pages.
○ Store the scraped content efficiently for future retrieval.
2. Chatbot Integration
○ Develop a chatbot capable of responding to user queries based on the scraped content.
○ Use Retrieval-Augmented Generation (RAG) to retrieve relevant data, which an LLM then uses to generate meaningful responses.
○ [OPTIONAL] The chatbot should reference the source URLs in its responses to ensure transparency and traceability. (Bonus points for citing sources in the response with clickable links; the citations do NOT need to be accurate.)
Example:
If a user asks, “What services does the company provide?”, the chatbot might respond with:

“The company offers a range of services, including software development, AI consulting, and cloud infrastructure solutions. Source: link/reference to the content”

3. User Interaction
○ Provide either a Command Line Interface (CLI) or a Frontend Web Application (choose one) for users to interact with.
○ Users should be able to input a website URL, trigger the scraping process, and then begin chatting with the bot.
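
The scrape, store, and retrieve steps in requirements 1 and 2 can be sketched roughly as follows. This is a minimal illustration using only the Python standard library, not a prescribed design: the function names (`fetch_html`, `extract_text`, `chunk_text`, `retrieve`) are hypothetical, and the word-count cosine similarity is a stand-in for the embedding model and vector database a real submission would likely use.

```python
from collections import Counter
from html.parser import HTMLParser
from urllib.request import urlopen
import math
import re


class TextExtractor(HTMLParser):
    """Collects visible text, skipping <script> and <style> contents."""

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0 and data.strip():
            self.parts.append(data.strip())


def fetch_html(url, timeout=10):
    """Download only the given URL; linked pages are not followed."""
    with urlopen(url, timeout=timeout) as resp:
        charset = resp.headers.get_content_charset() or "utf-8"
        return resp.read().decode(charset, errors="replace")


def extract_text(html):
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.parts)


def chunk_text(text, max_words=50):
    """Split text into consecutive chunks of at most max_words words."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def _vector(text):
    # Assumption: bag-of-words counts stand in for real embeddings.
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def _cosine(a, b):
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(query, chunks, top_k=2):
    """Return the top_k chunks most similar to the query."""
    q = _vector(query)
    ranked = sorted(chunks, key=lambda c: _cosine(q, _vector(c)), reverse=True)
    return ranked[:top_k]
```

A typical flow would be `chunks = chunk_text(extract_text(fetch_html(url)))`, followed by `retrieve(question, chunks)` for each user query; the retrieved chunks are then passed to the LLM as context.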

Submission

● Submit the project through a GitHub repository link containing the following:
○ Well-organized code
○ A README.md file that includes:
■ Technical details
■ Examples of chatbot responses
■ Edge cases you have handled and those you have not (it's okay if you have not handled every edge case, but we would like to know all the edge cases you can think of)

Bonus Features (Optional but will earn extra points)

● Provide source citations in chatbot responses with clickable links to the original pages. (The citations do NOT need to be accurate.)
● Use multi-threading or async techniques to improve scraping and chatbot performance.
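
One way to approach the multi-threading bonus is sketched below. Because only a single URL is scraped, the likeliest place for concurrency to help is in network-bound per-chunk work such as embedding requests. The `embed_chunk` function here is a hypothetical stand-in that simulates latency with `time.sleep`; a real project would replace it with a call to whichever embedding service it uses.

```python
from concurrent.futures import ThreadPoolExecutor
import time


def embed_chunk(chunk):
    """Hypothetical stand-in for a network-bound embedding request."""
    time.sleep(0.05)  # simulate I/O latency
    # Placeholder "embedding": character count and word count.
    return [float(len(chunk)), float(len(chunk.split()))]


def embed_all(chunks, max_workers=8):
    """Embed chunks concurrently; results keep the input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(embed_chunk, chunks))
```

Threads suit this case because the work is I/O-bound; `asyncio` with an async HTTP client would be an equally reasonable choice.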

For the LLM, you can use the Groq API, as it provides free credits, or any other LLM API provider you like.
Language/Framework: Any
Vector DB: Any
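
Whichever LLM provider is chosen, the retrieved passages have to be assembled into a prompt, and the source link has to be attached to the answer. The sketch below is provider-agnostic and makes no API call; `build_prompt` and `append_sources` are hypothetical helper names, and the prompt wording and Markdown link format are assumptions rather than anything the assignment specifies.

```python
def build_prompt(question, passages):
    """passages: list of (source_url, text) pairs returned by the retriever."""
    context = "\n".join(
        f"[{i}] ({url}) {text}" for i, (url, text) in enumerate(passages, start=1)
    )
    return (
        "Answer the question using only the context below. "
        "Cite the passages you used by their bracketed numbers.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"
    )


def append_sources(answer, passages):
    """Append de-duplicated source URLs as clickable Markdown links."""
    urls = []
    for url, _ in passages:
        if url not in urls:
            urls.append(url)
    links = ", ".join(f"[{u}]({u})" for u in urls)
    return f"{answer}\n\nSource: {links}"
```

The string returned by `build_prompt` would be sent to the chosen LLM API, and `append_sources` would be applied to the model's reply before showing it to the user.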
