ML Assignment
Objective
Design and implement an agentic workflow in Python that trains on a provided document
(e.g., a PDF or text file), answers customer support queries based on that document, and
refines its responses using simulated feedback (e.g., "not helpful," "too vague"). The bot
should demonstrate autonomy, decision-making, and iterative improvement.
Problem Statement
Build a customer support bot that:
1. Reads and processes a provided document (e.g., a company FAQ or product
manual).
2. Uses a pre-trained NLP model to generate responses to customer queries based on
the document.
3. Evaluates its responses using simulated feedback and adjusts its strategy (e.g.,
adding more detail or rephrasing).
4. Logs its actions and decisions for transparency.
5. Gracefully handles queries that the document does not cover.
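The steps above can be read as a single loop: answer, collect feedback, adjust, log, and fall back when the document has no answer. The sketch below is a minimal illustration under stated assumptions, not a reference solution. The names `handle_query`, `adjust`, `simulate_feedback`, `FALLBACK`, the logger name, and the cap of two feedback rounds are all hypothetical, and `answer_fn` stands in for whatever model produces the answer.

```python
import logging
import random

logger = logging.getLogger("support_bot_sketch")  # hypothetical logger name

FALLBACK = "Sorry, I could not find that in the document."  # assumed wording


def simulate_feedback(rng=random):
    """Return one simulated feedback label, chosen at random."""
    return rng.choice(["good", "not helpful", "too vague"])


def adjust(answer, feedback, context):
    """Change strategy according to the feedback label."""
    if feedback == "too vague":
        # Add more detail by appending the supporting passage.
        return f"{answer} (More detail: {context})"
    if feedback == "not helpful":
        # Rephrase by presenting the passage itself.
        return f"Here is what the document says: {context}"
    return answer


def handle_query(query, answer_fn, feedback_fn=simulate_feedback, max_rounds=2):
    """Answer a query, then refine it for at most `max_rounds` feedback rounds.

    `answer_fn(query)` must return (answer, context), or (None, None) when
    the document does not cover the query.
    """
    answer, context = answer_fn(query)
    if answer is None:
        # Step 5: the document does not cover this query.
        logger.info("Out of scope: %r", query)
        return FALLBACK
    for round_no in range(1, max_rounds + 1):
        feedback = feedback_fn()
        logger.info("Round %d, feedback=%r, answer=%r", round_no, feedback, answer)
        if feedback == "good":
            break
        answer = adjust(answer, feedback, context)
    return answer
```

Passing the answer and feedback functions in as arguments keeps the loop testable without loading a model; a real agent would likely hold them as methods instead.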
Requirements
● Language: Python
● Expected Technologies:
○ Document Processing: PyPDF2 (for PDFs) or plain text reading with
Python’s open() function
○ NLP Model: Hugging Face transformers library (e.g.,
distilbert-base-uncased-distilled-squad for question answering or a text
generation model like gpt2)
○ Text Embedding/Search: sentence-transformers (to find relevant sections of
the document) or simple keyword matching
○ Workflow Management: Python classes for agent logic (optional: LangChain
for advanced agentic behavior)
○ Logging: logging module to track decisions and actions
○ Feedback Simulation: Custom rules or random feedback generator (e.g.,
"not helpful")
● Input: A sample document (e.g., a 1-2 page FAQ in PDF or TXT format) provided by
the user or created by you (e.g., a fake company FAQ).
● Output:
○ A Python script or notebook that runs the bot.
○ A log file (support_bot_log.txt) showing the bot’s decisions and iterations.
○ Sample query responses printed to the console.
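As an illustration of the document processing and "simple keyword matching" options listed above, here is a minimal sketch. It assumes PyPDF2 2.x or later (where `PdfReader` is available) and imports it only when a PDF is loaded. The function names, the `STOPWORDS` set, and the word-overlap score are illustrative choices, not part of the assignment.

```python
import re
from pathlib import Path

# Hypothetical stopword list; a real bot would likely use a fuller one.
STOPWORDS = {"how", "do", "i", "to", "the", "what", "s", "is", "a", "my", "of"}


def load_document(path):
    """Return the document text from a .txt or .pdf file."""
    path = Path(path)
    if path.suffix.lower() == ".pdf":
        # Imported lazily so plain-text use does not require PyPDF2.
        from PyPDF2 import PdfReader
        reader = PdfReader(str(path))
        # extract_text() can return None or "" for pages without a text layer.
        return "\n\n".join((page.extract_text() or "") for page in reader.pages)
    with open(path, encoding="utf-8") as handle:
        return handle.read()


def tokenize(text):
    """Lowercase the text and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def keyword_search(query, sections):
    """Return (best_section, overlap_count); best_section is None if nothing overlaps."""
    query_words = tokenize(query) - STOPWORDS
    best, best_score = None, 0
    for section in sections:
        score = len(query_words & tokenize(section))
        if score > best_score:
            best, best_score = section, score
    return best, best_score
```

A result of `(None, 0)` is a natural signal for an out-of-scope query, since no content word of the query appears anywhere in the document.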
Tasks
Deliverables
Submission Guidelines
1. Provide a GitHub repository with:
Here’s a skeleton to help get started (assumes a text file input and uses a
question-answering model):
python
import logging
from transformers import pipeline
from sentence_transformers import SentenceTransformer, util
import random

# Set up logging
logging.basicConfig(filename='support_bot_log.txt', level=logging.INFO)


class SupportBotAgent:
    def __init__(self, document_path):
        # Extractive question-answering model
        self.qa_model = pipeline("question-answering",
                                 model="distilbert-base-uncased-distilled-squad")
        # Sentence embedder used to find the section most relevant to a query
        self.embedder = SentenceTransformer('all-MiniLM-L6-v2')
        self.document_text = self.load_document(document_path)
        self.sections = self.document_text.split('\n\n')  # Split by paragraphs
        self.section_embeddings = self.embedder.encode(self.sections,
                                                       convert_to_tensor=True)
        logging.info(f"Loaded document: {document_path}")

    # load_document() and run() are called in this skeleton but not defined here.


if __name__ == "__main__":
    # Sample document should be provided as 'faq.txt'
    bot = SupportBotAgent("faq.txt")
    sample_queries = [
        "How do I reset my password?",
        "What’s the refund policy?",
        "How do I fly to the moon?"  # Out-of-scope query
    ]
    bot.run(sample_queries)
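The skeleton precomputes section embeddings but shows no method that uses them. One possible retrieval step, sketched here under assumptions, ranks sections by cosine similarity and returns nothing when the best score is too low. The function name `find_relevant_section`, the `embed` parameter, and the `0.3` threshold are hypothetical; the threshold in particular would need tuning on real queries. Cosine similarity is written out in plain Python so the sketch runs without downloading a model.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def find_relevant_section(query, sections, embed, threshold=0.3):
    """Return the section most similar to the query, or None below the threshold.

    `embed` is any callable mapping a list of strings to a list of vectors,
    for example SentenceTransformer('all-MiniLM-L6-v2').encode.
    """
    if not sections:
        return None
    query_vec = embed([query])[0]
    section_vecs = embed(sections)
    scored = [(cosine(query_vec, vec), sec) for vec, sec in zip(section_vecs, sections)]
    best_score, best_section = max(scored, key=lambda pair: pair[0])
    return best_section if best_score >= threshold else None
```

Returning `None` gives the caller a clear place to log an out-of-scope decision and reply with a fallback message instead of passing an unrelated passage to the question-answering model.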
Sample FAQ Document (faq.txt)
text
Resetting Your Password
To reset your password, go to the login page and click "Forgot Password."
Enter your email and follow the link sent to you.

Refund Policy
We offer refunds within 30 days of purchase. Contact support at
[email protected] with your order number to start the process.

Contacting Support
Email us at [email protected] or call 1-800-555-1234 during business
hours (9 AM - 5 PM EST).
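The skeleton splits the document on blank lines, so faq.txt is assumed to separate its sections with one blank line each. The sketch below shows how such a file might be turned into titled sections; `split_sections` is a hypothetical helper, and the inline `faq` string is a shortened copy of the sample with the contact details left out.

```python
def split_sections(text):
    """Split text on blank lines; map each block's first line (title) to the rest (body)."""
    sections = {}
    for block in text.split("\n\n"):
        lines = [line.strip() for line in block.strip().splitlines() if line.strip()]
        if not lines:
            continue
        # Join the remaining lines so wrapped sentences become one string.
        sections[lines[0]] = " ".join(lines[1:])
    return sections


faq = (
    "Resetting Your Password\n"
    'To reset your password, go to the login page and click "Forgot Password."\n'
    "Enter your email and follow the link sent to you.\n"
    "\n"
    "Refund Policy\n"
    "We offer refunds within 30 days of purchase.\n"
)

for title, body in split_sections(faq).items():
    print(f"{title}: {body}")
```

Keeping the title separate from the body makes log entries easier to read, because the bot can record which section it chose by name.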
Evaluation Criteria
● Functionality: Does the bot train on the document and answer queries accurately?
● Adaptability: Does it adjust responses based on feedback?
● Code Quality: Is the code modular, readable, and well-commented?
● Logging: Are key steps and decisions logged appropriately?
● Robustness: Can it handle out-of-scope queries gracefully?
● Documentation: Is the README clear and sufficient?