BreakoutAI Assessment - AI Agent
Internship Details:
Stipend: €200-400 per month, based on experience and performance
Duration: 3 months
Location: Remote
● Deadline: November 18th, 2024
● Submission: Please send your completed project, including the GitHub repository link, to
[email protected]
Welcome to the assessment! This document provides a complete overview of the requirements
and expected results, along with answers to potential questions. This project will help us evaluate
your skills in machine learning, API integration, and prompt engineering, as well as your approach
to creating user-friendly applications.
Project Overview
You will create an AI agent that reads through a dataset (CSV or Google Sheets) and performs a
web search to retrieve specific information for each entity in a chosen column. The AI will leverage
an LLM to parse web results based on the user's query and format the extracted data in a
structured output. The project includes building a simple dashboard where users can upload a file,
define search queries, and view/download the results.
Expected Deliverables
1. GitHub Repository: A well-organized repo containing the project code in relevant directories, with a readme.md file that explains the setup, usage, and key features of the project.
2. Readme File: This should include a project summary, setup instructions, usage guide, and
details on any third-party APIs or tools you used.
3. Loom Video: A 2-minute video walkthrough explaining the project’s purpose,
demonstrating key functionalities, and highlighting any notable code implementations or
decisions.
4. Working Dashboard: A simple, intuitive UI that allows users to:
○ Upload CSV files or connect a Google Sheet.
○ Choose the primary column with the list of entities (e.g., companies).
○ Input a custom prompt for information retrieval.
○ View and download the extracted results.
○ Store each entity's results in a structured format, ready for further processing by the
LLM.
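As one way to structure the upload step, the sketch below (plain Python, standard library only; the helper name `load_entities` is hypothetical) reads an uploaded CSV and returns the unique entities from the user's chosen primary column:

```python
import csv
import io

def load_entities(csv_text: str, column: str) -> list[str]:
    """Return the unique, non-empty values of `column`, in first-seen order."""
    reader = csv.DictReader(io.StringIO(csv_text))
    seen, entities = set(), []
    for row in reader:
        value = (row.get(column) or "").strip()
        if value and value not in seen:
            seen.add(value)
            entities.append(value)
    return entities

sample = "company,city\nAcme Corp,Berlin\nGlobex,Paris\nAcme Corp,Munich\n"
print(load_entities(sample, "company"))  # → ['Acme Corp', 'Globex']
```

In a Streamlit dashboard, the file would come from `st.file_uploader` and the column name from a `st.selectbox` over the parsed header row; the same helper works for rows fetched from a Google Sheet.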
4. Passing Results to an LLM for Parsing and Information Extraction
● Goal: Use an LLM to extract specific information based on the user-defined prompt and
web results.
● Expected Outcome:
○ Send each entity’s search results to the LLM, along with a backend prompt such as “Extract the email address of {company} from the following web results.” This prompt could also be supplied by the user.
○ Ensure the LLM processes the search results and extracts the requested
information (e.g., email, address, etc.) for each entity.
● Technical Details:
○ Implement LLM integration (e.g., Groq or OpenAI’s GPT API) for processing the
data.
○ Handle any errors gracefully, such as retrying failed queries or notifying the user if
data retrieval is unsuccessful.
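The retry-and-notify behavior can be kept independent of whichever LLM provider you pick. A minimal sketch, assuming `call_llm` is your own thin wrapper around the Groq or OpenAI chat endpoint (the wrapper itself is not shown):

```python
import time

def extract_with_retry(call_llm, prompt: str, retries: int = 3, delay: float = 1.0):
    """Call the LLM, retrying on failure with exponential backoff.

    Returns the LLM's text on success, or None so the caller can flag the
    entity as "extraction incomplete" in the dashboard.
    """
    for attempt in range(retries):
        try:
            return call_llm(prompt)
        except Exception:
            if attempt < retries - 1:
                time.sleep(delay * (2 ** attempt))  # back off before retrying
    return None

# Demo with a flaky stand-in that fails twice, then succeeds.
attempts = {"n": 0}
def flaky(prompt):
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise RuntimeError("transient API error")
    return "contact@acme.example"

print(extract_with_retry(flaky, "Extract the email of Acme Corp", delay=0.01))
# → contact@acme.example
```

Keeping the retry logic in one place also makes it easy to log which entities exhausted their retries, which feeds directly into the "data may be missing" message the Q&A below asks for.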
5. Displaying and Storing the Extracted Information
● Goal: Show extracted data in a user-friendly format and provide an option to download the
data.
● Expected Outcome:
○ Display the extracted data in a table format within the dashboard, organized by
entity and extracted information.
○ Offer an option to download the results as a CSV or update a connected Google
Sheet with the extracted information.
● Technical Details:
○ Provide a “Download CSV” button and an option to update the Google Sheet.
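The download path reduces to serializing the per-entity results to CSV text. A sketch using only the standard library (`results_to_csv` is a hypothetical helper name); the resulting string can be handed to Streamlit's `st.download_button` or written to a file:

```python
import csv
import io

def results_to_csv(results: list[dict]) -> str:
    """Serialize a list of {column: value} rows to CSV text."""
    if not results:
        return ""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=list(results[0].keys()))
    writer.writeheader()
    writer.writerows(results)
    return buf.getvalue()

rows = [
    {"entity": "Acme Corp", "email": "contact@acme.example"},
    {"entity": "Globex", "email": "info@globex.example"},
]
print(results_to_csv(rows))
```

For the Google Sheets variant, the same list of dicts maps onto a `values.update` write via the Sheets API client instead of a CSV string.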
Submission Instructions
1. GitHub Repository: Push your project to a public GitHub repository. Make sure to organize your code and include a comprehensive README.md.
2. Readme File: Your readme.md should contain:
○ Project Description: Brief overview of the purpose and capabilities of your project.
○ Setup Instructions: How to install dependencies and run the application.
○ Usage Guide: Instructions for using the dashboard, including connecting Google
Sheets and setting up search queries.
○ API Keys and Environment Variables: Explain where users should enter their API
keys and any other required environment variables.
○ Optional Features: Highlight any extra features you’ve added.
3. Loom Video: Record a 2-minute video walkthrough of your project, explaining:
○ The overall purpose of the project.
○ Key features and how the dashboard works.
○ Any notable code implementations or challenges you encountered.
○ Share this video link in your repository’s README.md.
Q&A
1. Can I use any search or scraping API?
Yes, you may use any search or scraping API of your choice as long as it retrieves web data effectively and can handle automated requests. Some popular options include SerpAPI, ScraperAPI, or even search APIs provided by major search engines if they allow sufficient requests for testing.
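For illustration, the request to such a service is just a templated query plus an API key. The sketch below builds a SerpAPI-style request URL (based on SerpAPI's documented `q`, `api_key`, and `engine` query parameters; fetching and the `organic_results` parsing are left to your HTTP client of choice):

```python
from urllib.parse import urlencode

SERPAPI_ENDPOINT = "https://serpapi.com/search"  # SerpAPI's JSON search endpoint

def build_search_url(entity: str, query_template: str, api_key: str) -> str:
    """Fill the user's query template with the entity and build the request URL.

    The user writes a template like "contact email of {entity}"; fetch the
    returned URL and read the JSON response for the search results.
    """
    params = {
        "q": query_template.format(entity=entity),
        "api_key": api_key,
        "engine": "google",
    }
    return f"{SERPAPI_ENDPOINT}?{urlencode(params)}"

url = build_search_url("Acme Corp", "contact email of {entity}", "YOUR_KEY")
print(url)
```

Keeping the template-filling separate from the HTTP call makes it easy to swap SerpAPI for another provider by changing only the endpoint and parameter names.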
2. What if I can’t find a free or accessible LLM API?
If you’re unable to find a free or accessible LLM API for testing, you can use the Groq API, which offers a free tier specifically for projects and experimentation. We encourage you to use this option if you’re unable to access other LLMs. Include instructions in your README.md for how users can set up a Groq API key if you choose this option.
3. What should happen when a search or extraction fails?
Implement a fallback mechanism, such as retrying the query or notifying the user that the extraction was incomplete. Consider displaying a message in the dashboard to indicate that data may be missing for some entities, and make sure errors are handled gracefully.
4. What should I use to build the dashboard?
Keep the UI simple, clear, and user-friendly. While we suggest using Streamlit for a quick setup, Flask is also acceptable if you prefer more control over UI customization. The focus is on functionality, so prioritize features and usability over aesthetic design.
5. How should I handle API keys and other sensitive information?
Store sensitive information like API keys in environment variables for security purposes. In your README.md, provide instructions on where users should input their API keys and any other required credentials. Avoid hardcoding these details directly in your code.
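In practice this means reading keys at startup and failing loudly when one is absent, rather than crashing later mid-request. A minimal sketch (the helper name `require_key` and the variable name `SERPAPI_KEY` are illustrative; document whichever names your project actually uses in the README.md):

```python
import os

def require_key(name: str) -> str:
    """Read an API key from the environment, raising a clear error if missing."""
    value = os.environ.get(name)
    if not value:
        raise RuntimeError(f"Set the {name} environment variable (see README.md).")
    return value

os.environ.setdefault("SERPAPI_KEY", "demo-key")  # for illustration only
print(require_key("SERPAPI_KEY"))
```

Libraries such as python-dotenv can load these from a `.env` file during local development; just make sure `.env` is listed in `.gitignore`.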
6. What if I run into problems or need more time?
Please reach out to us if you encounter specific technical issues or if you require more time to complete the project. We’re here to support you, so feel free to communicate any blockers.
7. Are there any restrictions on the data I retrieve?
Focus on retrieving publicly available information related to each entity (e.g., email, address, or products). Respect website terms of service, and do not attempt to scrape sensitive or private data. Always use API services responsibly, adhering to their usage guidelines and rate limits.
8. How should I handle large datasets?
If you are processing large datasets, consider limiting the number of queries or implementing a batch-processing approach. This will help avoid timeouts and API rate limits while making your solution scalable.
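A simple way to batch, sketched below: process the entity list in fixed-size chunks and pause between chunks to stay under the rate limit (the function and parameter names are illustrative; tune `batch_size` and `pause` to your API's documented limits):

```python
import time

def process_in_batches(entities, handle, batch_size: int = 5, pause: float = 1.0):
    """Apply `handle` to each entity, pausing between fixed-size batches."""
    results = []
    for start in range(0, len(entities), batch_size):
        batch = entities[start:start + batch_size]
        results.extend(handle(entity) for entity in batch)
        if start + batch_size < len(entities):
            time.sleep(pause)  # simple throttle between batches
    return results

print(process_in_batches(["acme", "globex", "initech"], str.upper,
                         batch_size=2, pause=0.01))
# → ['ACME', 'GLOBEX', 'INITECH']
```

For stricter limits, a token-bucket rate limiter or per-request backoff (as in the retry sketch earlier in this document) can replace the fixed pause.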
9. Can I add extra features beyond the requirements?
Absolutely! We encourage you to be creative with optional features (e.g., handling multiple prompts in a single query, more advanced error handling, or exporting results to Google Sheets). Just make sure to clearly highlight any additional functionality in your README.md.
Stay Connected:
Follow Kapil Mittal on LinkedIn for updates and opportunities: LinkedIn Profile.
Join the Telegram channel for updates on trading and investment: Telegram @breakoutinvesting.
Check out other trading insights and strategies on TradingView: TradingView Profile.
Visit the Breakout AI website for more about what we do: breakoutai.tech.