6-Week Project Plan - Advanced NIFTY 50 Stock Prediction System
Week 1-2: Data Collection & Preprocessing
• Setup: Install Python (3.8+) and create a virtual environment or Colab notebook. Install libraries:
yfinance 1 , pandas , numpy , requests , beautifulsoup4 , pdfplumber ,
snscrape (or tweepy ), spacy (or nltk ), etc. Assign roles: e.g. two members on price
data collection, one on reports/news, one on tweets, one on initial data organization.
• Stock Prices: Use yfinance to download 10+ years of daily and weekly prices for each
NIFTY50 ticker. For example:
import os
import yfinance as yf

tickers = ["TCS.NS", "INFY.NS", ...]  # NIFTY50 symbols
for ticker in tickers:
    os.makedirs(ticker, exist_ok=True)  # one folder per company
    df_daily = yf.Ticker(ticker).history(period="10y", interval="1d")
    df_daily.to_csv(f"{ticker}/prices.csv")
    df_weekly = yf.Ticker(ticker).history(period="10y", interval="1wk")
    df_weekly.to_csv(f"{ticker}/prices_weekly.csv")
This saves CSVs in each company’s folder (e.g. TCS/prices.csv ). yfinance “offers a Pythonic
way to fetch financial & market data” from Yahoo Finance 1 . Use small batches or threading to
avoid rate limits.
• Financial Reports (PDF/HTML): For each company, locate investor-relations or stock-exchange
pages. Use requests + BeautifulSoup to find PDF links of quarterly and annual reports.
Example:
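A minimal sketch, assuming a placeholder investor-relations URL and a crude ".pdf" link filter (the real pages and link patterns differ per company):

import os
import requests
from bs4 import BeautifulSoup

ir_url = "https://www.example.com/investor-relations"  # placeholder IR page URL
soup = BeautifulSoup(requests.get(ir_url, timeout=30).text, "html.parser")
os.makedirs(f"{ticker}/financials", exist_ok=True)
for a in soup.find_all("a", href=True):
    href = a["href"]
    if href.lower().endswith(".pdf"):  # crude filter for quarterly/annual report PDFs
        filename = href.split("/")[-1]
        with open(f"{ticker}/financials/{filename}", "wb") as f:
            f.write(requests.get(href, timeout=60).content)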
After downloading PDFs, extract text and tables using PDFPlumber 2 . Example:
import pdfplumber

with pdfplumber.open(f"{ticker}/financials/Q4_2023.pdf") as pdf:
    text = "\n".join(page.extract_text() or "" for page in pdf.pages)
with open(f"{ticker}/financials/Q4_2023.txt", "w") as f:
    f.write(text)
Store raw text (and optionally the raw PDF) in financials/ . Financial tables can also be parsed with
tabula-py if needed.
• News Headlines/Articles: Use a news API (e.g. NewsAPI) or scrape websites (like Moneycontrol,
EconomicTimes). For example, fetch via NewsAPI:
import json
import requests

api_key = "YOUR_NEWSAPI_KEY"
url = (f"https://newsapi.org/v2/everything?q={ticker}"
       f"&language=en&apiKey={api_key}")
articles = requests.get(url).json()["articles"]
with open(f"{ticker}/news.json", "w") as f:
    json.dump(articles, f)
Or scrape RSS/HTML: use BeautifulSoup to collect headlines and dates. Store in JSON or CSV
as a list of {date, title, content} .
• Twitter Data: Use snscrape (no API key required) to collect tweets containing company
name/handles. Example:
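A minimal sketch using snscrape's Python module (the query string is a placeholder, and tweet attribute names vary slightly between snscrape versions):

import json
import snscrape.modules.twitter as sntwitter

query = 'TCS OR "Tata Consultancy Services" since:2013-01-01'  # placeholder search query
tweets = []
for i, tweet in enumerate(sntwitter.TwitterSearchScraper(query).get_items()):
    if i >= 5000:  # cap the number of tweets per company
        break
    tweets.append({"date": str(tweet.date), "content": tweet.content})
with open(f"{ticker}/tweets.json", "w") as f:
    json.dump(tweets, f)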
Save tweets as JSON list of {date, content} . Be mindful of legal/social media terms.
• Text Preprocessing: Ingest all collected text (news, tweets, report text). Use spaCy or NLTK to
clean: lowercase, remove stopwords, punctuation, and lemmatize. Example with spaCy:
import spacy

nlp = spacy.load("en_core_web_sm", disable=["ner", "parser"])

def preprocess(text):
    doc = nlp(text)
    return " ".join(token.lemma_.lower() for token in doc
                    if token.is_alpha and not token.is_stop)

clean_texts = [preprocess(t) for t in raw_texts]
• Data Organization: Keep each company's data in its own folder, e.g.:

TCS/
    prices.csv           # daily prices
    prices_weekly.csv    # weekly prices
    financials/          # downloaded PDFs and extracted text/tables
    news.json            # list of news items (date, title, content)
    tweets.json          # list of tweets (date, text)
    sentiment.csv        # (to be filled later)
Use consistent naming. This structured hierarchy keeps each company’s data self-contained.
• Learning Resources:
• yfinance Tutorial: (e.g. HuggingFace repo 1 or AlgorTrading101 guide)
• Web Scraping: BeautifulSoup tutorial (official docs or tutorial)
• PDF Parsing: PDFPlumber guide 3 or PyMuPDF
• NLP Basics: spaCy/NLTK documentation on tokenization, lemmatization
• Twitter Scraping: snscrape GitHub examples
• Environment: Run data collection on a local machine or Colab. No heavy GPU needed. Ensure
~10–20 GB disk for data. Use Google Drive (mounted) if using Colab to store data.
Build instruction-style Q&A pairs from the collected financial data. You can script this by identifying
key figures in tables and crafting corresponding questions, or manually create a few hundred samples
to start. Format the data as "instruction-input-response" JSONL (Alpaca style). For example:
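A minimal sketch of writing one record (the question, context, and answer below are placeholders, not real figures):

import json

record = {
    "instruction": "What was the company's total revenue in FY2023?",
    "input": "Excerpt from the FY2023 annual report: ...",  # extracted context passage (placeholder)
    "output": "Total revenue for FY2023 was Rs X crore."    # placeholder answer
}
with open("qa_dataset.jsonl", "a") as f:
    f.write(json.dumps(record) + "\n")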
• Model Selection: Use an open-source model (e.g. LLaMA-2 7B or Mistral-7B). These require
substantial GPU memory to fine-tune.
• Environment: Use Google Colab Pro+ with an A100 (40 GB VRAM) or a similar GPU machine.
Install transformers , peft , and bitsandbytes for low-memory fine-tuning.
• Setup QLoRA/LoRA: Load the model in 4-bit to save memory and attach LoRA adapters.
Example code:
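A minimal sketch, assuming Mistral-7B and commonly used LoRA hyperparameters (the target modules should be adjusted to the chosen model):

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

model_name = "mistralai/Mistral-7B-v0.1"
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                      # 4-bit quantization
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name, quantization_config=bnb_config, device_map="auto"
)
model = prepare_model_for_kbit_training(model)
lora_config = LoraConfig(
    r=16, lora_alpha=32, lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],    # adjust per model architecture
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)  # attach LoRA adapters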
This employs 4-bit quantization (via BitsAndBytes) together with LoRA adapters – a strategy
known as QLoRA. QLoRA “quantizes a model to 4-bits and then trains it with LoRA, enabling fine-
tuning of even very large models on a single GPU” 4 .
• Training: Use the Hugging Face Trainer or TRL SFT script. For example:
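A minimal sketch using the Hugging Face Trainer, assuming the model and tokenizer from the previous step and the Alpaca-style qa_dataset.jsonl:

from datasets import load_dataset
from transformers import TrainingArguments, Trainer, DataCollatorForLanguageModeling

tokenizer.pad_token = tokenizer.eos_token  # many causal LMs define no pad token
dataset = load_dataset("json", data_files="qa_dataset.jsonl", split="train")

def format_and_tokenize(example):
    prompt = (f"### Instruction:\n{example['instruction']}\n"
              f"### Input:\n{example['input']}\n### Response:\n{example['output']}")
    return tokenizer(prompt, truncation=True, max_length=1024)

tokenized = dataset.map(format_and_tokenize, remove_columns=dataset.column_names)

args = TrainingArguments(
    output_dir="qlora-finetune",
    per_device_train_batch_size=1,      # small batch, see note below
    gradient_accumulation_steps=16,     # simulate a larger batch
    num_train_epochs=3,
    learning_rate=2e-4,
    bf16=True,
    logging_steps=10,
)
trainer = Trainer(
    model=model,
    args=args,
    train_dataset=tokenized,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()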
Use a small batch size (1) with gradient accumulation to simulate larger batches. Enable fp16/
bf16 and possibly gradient checkpointing to fit into memory. According to HuggingFace
examples, using --load_in_4bit with --use_peft performs QLoRA training 5 . Expect
several hours of training for a few epochs.
• Validation: After training, test the model on held-out questions. For example:
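A minimal sketch of prompting the fine-tuned model with one held-out question (the question is a placeholder):

prompt = ("### Instruction:\nWhat was the company's net profit in Q4 FY2023?\n"
          "### Input:\n\n### Response:\n")
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output_ids = model.generate(**inputs, max_new_tokens=100)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))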
"sentiment-analysis",
model="yiyanghkust/finbert-tone",
tokenizer="yiyanghkust/finbert-tone"
)
texts = ["Revenue is growing steadily", "We have some concerns about
cash flow"]
results = sentiment_pipeline(texts)
# results: [{'label':'LABEL_1','score':0.95}, ...]
This labels each piece as positive/neutral/negative. Apply this to each news headline/article and
tweet.
• Aggregate Scores: For each company, aggregate sentiment by day or week. For example, count
how many positive/negatives occurred on each date. Use pandas:
import pandas as pd

df = pd.read_json(f"{ticker}/news.json")  # columns: date, content
df['sentiment'] = df['content'].apply(lambda t: sentiment_pipeline(t)[0]['label'])
daily_counts = df.groupby('date')['sentiment'].value_counts().unstack(fill_value=0)
daily_counts.to_csv(f"{ticker}/sentiment.csv")
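• Embedding Index: Chunk each company's report and news text, embed the chunks, and build a vector
index for retrieval. A minimal sketch, assuming sentence-transformers with the all-MiniLM-L6-v2
model and FAISS (both listed under Tools), and a report_text string loaded from financials/ :

import numpy as np
import faiss
from sentence_transformers import SentenceTransformer

embedder = SentenceTransformer("all-MiniLM-L6-v2")

# chunks: list of text passages (here: crude fixed-size slices of the report text)
chunks = [report_text[i:i + 2000] for i in range(0, len(report_text), 2000)]
embeddings = embedder.encode(chunks)

index = faiss.IndexFlatL2(embeddings.shape[1])   # exact L2 search
index.add(np.asarray(embeddings, dtype=np.float32))
faiss.write_index(index, f"{ticker}/embeddings.faiss")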
Keep a mapping from vector indices to the original text chunks (e.g. a list or dict).
• Retrieval Pipeline: At query time, embed the user query and find nearest chunks. For example:
query = "What was TCS's revenue in 2022?"
q_emb = embedder.encode([query])
D, I = index.search(q_emb.astype(np.float32), 3)  # distances and indices of the top-3 chunks
context = " ".join(chunks[i] for i in I[0])
Prepend or append these contexts to the query prompt. Then feed to the fine-tuned LLM to
generate an answer grounded in retrieved texts.
• Where to Run: Embedding and FAISS building can be done on Colab (GPU accelerates
embedding) or local machine. FAISS retrieval is fast on CPU. Sentiment pipeline can use GPU (for
speed) or CPU.
• Tools: Transformers, sentence-transformers, FAISS (or ChromaDB), pandas.
• What to Learn:
• FinBERT/model-card usage 7 and sentiment pipelines.
• Sentence-Transformers documentation (see Quickstart 8 ).
• FAISS tutorial for vector search.
• RAG concepts (HuggingFace RAG tutorials or blog posts).
• Sample Code:
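A minimal sketch of a helper that ties retrieval and generation together (rag_llm_answer is reused by the UI later; a single in-memory index and chunk list are assumed here rather than per-ticker files):

def rag_llm_answer(ticker, query, k=3):
    """Retrieve the top-k chunks for the query and answer with the fine-tuned LLM."""
    # in a full implementation, `ticker` would select that company's index and chunks
    q_emb = embedder.encode([query]).astype(np.float32)
    _, idx = index.search(q_emb, k)
    context = " ".join(chunks[i] for i in idx[0])
    prompt = f"### Instruction:\n{query}\n### Input:\n{context}\n### Response:\n"
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output_ids = model.generate(**inputs, max_new_tokens=200)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)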
• Data Structuring: Add sentiment.csv to each company folder. Save any index files or pickled
embeddings (e.g. TCS/embeddings.faiss ) separately.
• Feature Engineering: Compute technical indicators from the saved price data (e.g. with TA-Lib):

import pandas as pd
import talib

df = pd.read_csv(f"{ticker}/prices.csv", parse_dates=['Date'], index_col='Date')
df['RSI_14'] = talib.RSI(df['Close'], timeperiod=14)
df['SMA_50'] = talib.SMA(df['Close'], timeperiod=50)
Merge sentiment data (from sentiment.csv ) on date. For example, include pos_count ,
neg_count as additional features for the same date.
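A minimal sketch of the merge (column names depend on how sentiment.csv was written; the renames below are assumptions):

sent = pd.read_csv(f"{ticker}/sentiment.csv", parse_dates=['date'], index_col='date')
sent = sent.rename(columns={'Positive': 'pos_count', 'Negative': 'neg_count'})
# dates must share the same format/timezone for the join to align
df = df.join(sent[['pos_count', 'neg_count']], how='left').fillna(0)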
• Data Preparation: Create input-output pairs. For example, use the past 30 days of features to
predict the next day’s closing price. Normalize features with sklearn.preprocessing
(MinMax or StandardScaler). Split data chronologically (e.g. train on 2013–2021, test on 2022–
2023).
• Model: Implement a sequential model in PyTorch. An example LSTM regressor:
import torch.nn as nn

class StockLSTM(nn.Module):
    def __init__(self, input_dim, hidden_dim=64):
        super().__init__()
        self.lstm = nn.LSTM(input_dim, hidden_dim, num_layers=2,
                            batch_first=True)
        self.fc = nn.Linear(hidden_dim, 1)

    def forward(self, x):
        out, _ = self.lstm(x)
        return self.fc(out[:, -1, :])
This follows the pattern in PyTorch LSTM tutorials 9 . You can also experiment with GRU or
simple Transformer layers for time series. Include sentiment features as part of input_dim .
• Training: Train with MSE loss to predict next-day price. Example:
model = StockLSTM(input_dim=number_of_features)
criterion = nn.MSELoss()
optimizer = torch.optim.Adam(model.parameters(), lr=0.001)

for epoch in range(epochs):
    for x_batch, y_batch in train_loader:
        optimizer.zero_grad()
        pred = model(x_batch)
        loss = criterion(pred, y_batch)
        loss.backward()
        optimizer.step()
Use appropriate batch sizing (small batches if data limited). Consider training a separate model
per ticker or a single multi-stock model (one-hot encode company as feature).
• Evaluation: Compute RMSE/MAE on test set. Plot predicted vs actual prices. Adjust
hyperparameters as needed.
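A minimal sketch of the evaluation step (assumes a test_loader built the same way as train_loader):

import torch
import numpy as np
from sklearn.metrics import mean_squared_error, mean_absolute_error

model.eval()
preds, actuals = [], []
with torch.no_grad():
    for x_batch, y_batch in test_loader:
        preds.append(model(x_batch).numpy().reshape(-1))
        actuals.append(y_batch.numpy().reshape(-1))
preds, actuals = np.concatenate(preds), np.concatenate(actuals)

rmse = np.sqrt(mean_squared_error(actuals, preds))
mae = mean_absolute_error(actuals, preds)
print(f"RMSE: {rmse:.4f}  MAE: {mae:.4f}")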
• Tools: PyTorch (or TensorFlow/Keras if preferred), sklearn (for scaling/split), talib .
• What to Learn:
• Time-series forecasting with LSTM in PyTorch (e.g. MachineLearningMastery code 9 ).
• Computing technical indicators (TA-Lib docs).
• Handling multivariate regression.
• Sample Code:
import torch
from torch.utils.data import DataLoader, TensorDataset
import numpy as np
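A minimal sketch of turning the feature DataFrame into windowed training batches with these utilities (feature_columns and the scaled df are assumed from the Data Preparation step; the 30-day window matches the setup above):

window = 30
features = df[feature_columns].values   # scaled feature matrix, shape (T, num_features)
targets = df['Close'].values            # next-day close is the prediction target

X, y = [], []
for i in range(len(features) - window):
    X.append(features[i:i + window])    # past 30 days of features
    y.append(targets[i + window])       # the following day's close
X = torch.tensor(np.array(X), dtype=torch.float32)
y = torch.tensor(np.array(y), dtype=torch.float32).unsqueeze(-1)

train_ds = TensorDataset(X, y)
train_loader = DataLoader(train_ds, batch_size=32, shuffle=False)  # preserve time order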
• Hardware: Training LSTM on one company’s data is light (~MBs of data). A CPU can suffice, but a
GPU (even a small one) will speed up training. Ensure memory for data scaling (few dozen
features × a few thousand timesteps).
• User Interface: Build a simple dashboard (e.g. with Streamlit) that lets a user pick a company, ask a
question, and view forecasts:

import streamlit as st

st.title("NIFTY 50 Stock Predictor")
ticker = st.selectbox("Select company", tickers)
query = st.text_input("Ask a question about the company")
if st.button("Get Answer"):
    answer = rag_llm_answer(ticker, query)
    st.write(answer)
st.line_chart(prediction_dataframe[ticker])  # show forecasts
No further code is needed immediately, but consider how to present the retrieved context alongside the
model's answer, and how to display price charts.
• Documentation & Testing: Write README with instructions. Each team member documents
their module (data, LLM, sentiment, prediction). Perform unit tests on functions (e.g. does data
loader handle missing values?).
• What to Learn:
• Streamlit quickstart (https://docs.streamlit.io) for dashboards.
• Flask tutorial (if choosing web app) for forms and API endpoints.
• Plotting libraries (Matplotlib/Plotly) for time-series charts.
• Hardware: The UI is light – can run on any server or local machine. Final integration just uses
already-trained models.
Note: Throughout the project, communicate frequently. Use version control (Git) and share
intermediate results (e.g., checkpoints, sample data). Allocate tasks so multiple people work in parallel:
for instance, while two team members scrape and clean data (Weeks 1–2), others can begin designing
the Q&A format and LLM fine-tune pipeline (Weeks 2–3). The 5-person team can split roughly: Data
Engineering, NLP (LLM), Sentiment, Prediction Modeling, and Integration/UX.
Key References:
- yfinance (financial data) 1 ;
- FinBERT (financial sentiment) 6 7 ;
- QLoRA fine-tuning (PEFT tutorial) 4 5 ;
- PyTorch LSTM example 9 .
2 3 A Step-by-Step Guide to Parsing PDFs using the pdfplumber Library In Python | by Azhar Sayyad | Medium
https://azhar-sayyad.medium.com/a-step-by-step-guide-to-parsing-pdfs-using-the-pdfplumber-library-in-python-c12d94ae9f07
4 Quantization
https://huggingface.co/docs/peft/en/developer_guides/quantization
5 Llama2 fine-tunning with PEFT QLora and testing the model - Transformers - Hugging Face Forums
https://discuss.huggingface.co/t/llama2-fine-tunning-with-peft-qlora-and-testing-the-model/50581