67379dbfbc59a Assignment
Task Requirements:
1. Data Scraping:
○ Select one platform: Twitter, Reddit, or Telegram.
○ Scrape relevant data from specific handles, subreddits, or Telegram
channels focused on stock market discussions and predictions.
○ Clean and preprocess the data, ensuring it's ready for model input
(e.g., removing noise, handling missing data).
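The cleaning step above can be sketched as follows. This is a minimal illustration using only the standard library, not a prescribed pipeline: the function names `clean_post` and `preprocess` are hypothetical, and the choices of what counts as noise (URLs, @-mentions, punctuation) and how missing data is handled (dropped, along with exact duplicates) are assumptions that a candidate may reasonably change.

```python
import re

# Assumed definition of "noise": URLs, @-mentions, and any character that is
# not a lowercase letter, digit, "$" (kept for tickers like $TSLA), or whitespace.
URL_RE = re.compile(r"https?://\S+")
MENTION_RE = re.compile(r"@\w+")
NON_TEXT_RE = re.compile(r"[^a-z0-9$\s]")


def clean_post(text):
    """Normalize one post; return None if it is missing or empty after cleaning."""
    if text is None:
        return None
    text = text.lower()
    text = URL_RE.sub(" ", text)
    text = MENTION_RE.sub(" ", text)
    text = NON_TEXT_RE.sub(" ", text)
    text = " ".join(text.split())  # collapse runs of whitespace
    return text or None


def preprocess(posts):
    """Clean every post, dropping missing entries and exact duplicates."""
    seen = set()
    result = []
    for post in posts:
        cleaned = clean_post(post)
        if cleaned is not None and cleaned not in seen:
            seen.add(cleaned)
            result.append(cleaned)
    return result


if __name__ == "__main__":
    raw = ["Buy $TSLA now!! https://t.co/abc @someone", None, "buy $tsla now", ""]
    print(preprocess(raw))
```

Dropping missing posts is the simplest policy; if timestamps or authors are also scraped, it may be preferable to keep the row and mark the text as empty so that posting frequency is not distorted.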
2. Data Analysis (Optional):
○ Perform sentiment analysis or topic modeling on the scraped data.
○ Extract key features such as sentiment polarity, frequency of
mentions, or any other indicators relevant to stock movements.
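As a rough illustration of the two features named above (sentiment polarity and frequency of mentions), here is a small sketch. The word lists are a toy lexicon invented for this example, and `polarity` and `ticker_mentions` are hypothetical helper names; a real submission would more likely use an established sentiment model or a finance-specific lexicon.

```python
import re
from collections import Counter

# Toy lexicon, assumed for illustration only.
POSITIVE = {"bullish", "buy", "gain", "rally", "up"}
NEGATIVE = {"bearish", "sell", "loss", "crash", "down"}

# Assumes tickers are written as "$" followed by 1 to 5 letters (e.g. $AAPL).
TICKER_RE = re.compile(r"\$([A-Za-z]{1,5})\b")


def polarity(text):
    """Score in [-1, 1]: (positive - negative) / (positive + negative) word hits."""
    tokens = re.findall(r"[a-z]+", text.lower())
    pos = sum(t in POSITIVE for t in tokens)
    neg = sum(t in NEGATIVE for t in tokens)
    total = pos + neg
    return 0.0 if total == 0 else (pos - neg) / total


def ticker_mentions(posts):
    """Count how often each ticker symbol appears across all posts."""
    counts = Counter()
    for post in posts:
        counts.update(t.upper() for t in TICKER_RE.findall(post))
    return counts


if __name__ == "__main__":
    posts = ["Bullish on $AAPL, buy the dip", "$aapl and $TSLA look bearish"]
    for p in posts:
        print(round(polarity(p), 2), p)
    print(ticker_mentions(posts))
```

Aggregating these per ticker and per day (for example, mean polarity and mention count) gives a feature table that can be aligned with daily price movements.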
3. Prediction Model:
○ Build a machine learning model that takes the processed data as input
and predicts stock movements.
○ Test the model's accuracy on historical data or known stock trends.
○ Provide a detailed evaluation of the model's performance, including
metrics like accuracy, precision, recall, and any improvements that can
be made.
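To make the requested metrics concrete, the sketch below computes accuracy, precision, and recall by hand for a binary up/down prediction task. The label encoding (1 for "up", 0 for "down") and the function name `evaluate` are assumptions made for this example; in practice, scikit-learn's `accuracy_score`, `precision_score`, and `recall_score` in `sklearn.metrics` provide the same quantities.

```python
def evaluate(y_true, y_pred, positive=1):
    """Return accuracy, precision, and recall for binary labels.

    Assumes `positive` marks an upward move. Precision and recall are
    reported as 0.0 when their denominator is zero.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    correct = sum(t == p for t, p in pairs)

    return {
        "accuracy": correct / len(pairs) if pairs else 0.0,
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


if __name__ == "__main__":
    # Hypothetical held-out labels and predictions, ordered by date.
    actual = [1, 0, 1, 1, 0, 0]
    predicted = [1, 0, 0, 1, 1, 0]
    print(evaluate(actual, predicted))
```

Because stock data is a time series, the held-out set should come from dates after the training period; a random shuffle would let information from the future leak into training and inflate these numbers.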
4. Technical Skills Required:
○ Proficiency in Python, with experience in web scraping (using libraries
such as BeautifulSoup, Scrapy, or Selenium).
○ Knowledge of Natural Language Processing (NLP) techniques for
sentiment analysis and text mining.
○ Experience in building and evaluating machine learning models using
libraries such as scikit-learn, TensorFlow, or PyTorch.
Submission Details:
Evaluation Criteria:
How to scrape?
Here are a few (non-exhaustive) ways to scrape data from the platforms listed
above. Candidates can refer to these for quick implementation.