Uday-Final Report
OF
SUBMITTED BY
2023 -2024
SAVITRIBAI PHULE PUNE UNIVERSITY
2024 -2025
CERTIFICATE
Submitted by
is a bonafide student of this institute and the work has been carried out by him/her under the
supervision of Prof. Jaishri Panchal and it is approved for the partial fulfillment of the
requirement of Savitribai Phule Pune University, for the award of the degree of Bachelor of
Engineering (Computer Engineering).
(Dr. N. S. Narwade)
Principal
Place : Pune
Date :
ACKNOWLEDGEMENT
It is my pleasure to express a deep sense of gratitude to all those who helped me in the
completion of my final year project. I am highly indebted to Prof. Jaishri Panchal from
PGMCOE for her guidance, constant supervision, and for providing the necessary information
and support throughout the project.
I would also like to extend my gratitude to Prof. Shrikant Dhamdhere (Head of Computer
Department) for his cooperation and encouragement, which played a crucial role in providing the
required facilities for the project.
Lastly, I wish to thank and appreciate all my teachers and friends for their constructive
comments, suggestions, and guidance, as well as everyone who directly or indirectly helped me
complete this project successfully.
ARYAN ASATI
ADITYA MORE
UMESH CHIMANE
SAHIL BABAR
ABSTRACT
The Stock Market, as we know, is volatile in nature and the prediction of the same is a
cumbersome task. Stock prices depend upon not only economic factors, but they relate to various
physical, psychological, rational and other important parameters. In this research work, the stock
prices are predicted using the Auto Regressive Integrated Moving Average (ARIMA) Model.
Stock price predictive models have been developed and run on published stock data acquired
from Yahoo Finance. The experimental results lead to the conclusion that ARIMA Model can be
used to predict stock prices for a short period of time with reasonable accuracy.
Key Words: Machine Learning; Stock Market; Predictive Analysis; Financial Time Series
Forecasting
TABLE OF CONTENTS
LIST OF ABBREVIATIONS i
LIST OF FIGURES ii
LIST OF TABLES iii
INDEX
3.3 External Interface Requirements (If Any)
3.3.1 User Interfaces
3.3.2 Hardware Interfaces
3.3.3 Software Interfaces
3.3.4 Communication Interfaces
3.4 Non-functional Requirements
3.4.1 Performance Requirements
3.4.2 Safety Requirements
3.4.3 Security Requirements
3.4.4 Software Quality Attributes
3.5 System Requirements
3.5.1 Database Requirements
3.5.2 Software Requirements (Platform Choice)
3.5.3 Hardware Requirements
3.6 Analysis Models: SDLC Model to be applied
3.7 System Implementation Plan
04 System Design
4.1 System Architecture
4.2 Data Flow Diagrams
4.3 Entity Relationship Diagrams
4.4 UML Diagrams
05 Other Specification
5.1 Advantages
5.2 Limitations
5.3 Applications
06 Conclusions & Future Work
Appendix A: Problem statement feasibility assessment using, satisfiability analysis
and NP Hard, NP Complete or P type using modern algebra and relevant mathematical
models.
Appendix B: Details of the papers referred in IEEE format (given earlier) Summary
of the above paper in not more than 3-4 lines. Here you should write the seed idea of the
papers you had referred for preparation of this project report in the following format.
Example: Thomas Nolte, Hans Hansson, Lucia Lo Bello, “Communication Buses
for Automotive Applications” In Proceedings of the 3rd Information Survivability
Workshop (ISW-2007), Boston, Massachusetts, USA, October 2007. IEEE Computer
Society.
References
LIST OF ABBREVIATIONS
Abbreviation Illustration
ARIMA Autoregressive Integrated Moving Average
API Application Programming Interface
LSTM Long Short-Term Memory (Neural Network)
RMSE Root Mean Square Error
ROI Return on Investment
BSE Bombay Stock Exchange
NSE National Stock Exchange
LIST OF FIGURES
LIST OF TABLES
2 Literature Survey 4
CHAPTER 1
INTRODUCTION
The stock market is one of the vital elements of a market economy, mainly because it lays the
foundation for publicly listed companies to raise capital from investors, who buy equity in the
company. As industries develop, the stock market is expanding rapidly. To earn returns (profits),
investors should regularly take into account the fluctuations in the stock market. The stock
market is volatile in nature and predicting it is not an easy task. Stock prices depend upon a
variety of factors, including economic, physical, psychological, rational and other important
aspects. Although the stock trend is difficult to predict, investors keep seeking new techniques
to minimise the risk of investment and increase the probability of profiting from their
investments. The variability of the stock market makes it an interesting field for researchers to
develop new forecasting models. Time-series analysis is an important subset of prediction
algorithms and functions. It is regarded as an apt tool for predicting trends in the stock market
and in logistics. Before making any investment, an investor gathers information on past stock
trends, periodic changes and various other factors that affect the capital of a company. An
ARIMA model is a widely used univariate forecasting method for projecting the future values of a
time series. Since it is essential to identify a model that analyses trends in stock prices with
adequate information for decision making, it is proposed to use the ARIMA model for stock price
prediction.
Stock price prediction using the Autoregressive Integrated Moving Average (ARIMA) model is a
common technique in time series forecasting. ARIMA is a statistical method that models the
relationship between a series of observations and lags of that series.
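For reference, the relationship described above can be written in the standard ARIMA(p, d, q) form. The notation below (backshift operator B, coefficients \phi_i and \theta_j, constant c, white-noise error \varepsilon_t) follows common textbook usage and is an assumption of this sketch, since the report does not define its own symbols.

```latex
\[
\Bigl(1 - \sum_{i=1}^{p} \phi_i B^{i}\Bigr)(1 - B)^{d}\, y_t
  = c + \Bigl(1 + \sum_{j=1}^{q} \theta_j B^{j}\Bigr)\varepsilon_t ,
\qquad B^{k} y_t = y_{t-k}
\]
```

Here p is the number of autoregressive lags, d is the number of times the series is differenced to make it stationary, and q is the number of moving-average lags.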
Prediction will continue to be an interesting area of research, with researchers in the field
always seeking to improve existing predictive models. The reason is that better predictions
empower institutions and individuals to make investment decisions and to plan and develop
effective strategies for their daily and future endeavors. Stock price prediction is regarded as
one of the most difficult tasks in financial forecasting because of the complex nature of the
stock market. Many investors wish to find a forecasting method that could guarantee easy profits
and minimise investment risk in the stock market. This remains a motivating factor for
researchers to evolve and develop new predictive models.
1.1 MOTIVATION
The motivation behind using the Autoregressive Integrated Moving Average (ARIMA)
model for stock price prediction is rooted in the desire to understand and forecast stock price
movements.
Risk Management: Accurate stock price predictions are crucial for managing risk in
investment portfolios.
Trading Strategies: Stock traders use ARIMA predictions to inform their trading strategies.
ARIMA models can provide insights into potential short-term price movements, helping
traders make timely decisions for profit.
Quantitative Analysis: ARIMA models offer a quantitative approach to stock price
forecasting.
Technical analysts often use ARIMA predictions to complement their chart-based analyses.
ARIMA can provide additional data points to either confirm or challenge their technical
indicators and patterns.
employed to improve prediction accuracy by analysing other market indicators or news
sentiment that may influence stock price.
2. LITERATURE SURVEY
(Patel & Kothari, 2018).
CHAPTER 3
INTRODUCTION
Conduct a literature review of traditional time-series methods (like ARIMA) and hybrid
approaches integrating machine learning.
Identify common limitations of ARIMA in handling complex patterns and potential
solutions via machine learning algorithms.
Gather historical stock price data (e.g., daily closing prices) from reliable sources (e.g.,
Yahoo Finance, Alpha Vantage).
Preprocess data, addressing missing values, outliers, and preparing it for time series
analysis.
Implement an ARIMA model to capture linear trends and seasonality in stock price data.
Integrate machine learning algorithms (e.g., LSTM, Random Forest, Gradient Boosting)
to capture non-linear trends and patterns.
Conduct training and optimization on both the ARIMA model and machine learning
models for hybrid forecasting.
Evaluate model accuracy using metrics such as Mean Squared Error (MSE), Root Mean
Squared Error (RMSE), and Mean Absolute Percentage Error (MAPE).
Perform cross-validation to assess the model’s stability and generalization across various
time periods.
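The evaluation steps listed above can be sketched as a walk-forward (rolling-origin) loop that scores one-step-ahead forecasts with MSE, RMSE and MAPE. This is a minimal illustration, not the project's implementation: the function names, the sample prices and the naive "last value" forecaster are assumptions, and the forecaster stands in for a fitted ARIMA or hybrid model.

```python
import math

def mse(actual, predicted):
    """Mean squared error."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)

def rmse(actual, predicted):
    """Root mean squared error, in the same unit as the prices."""
    return math.sqrt(mse(actual, predicted))

def mape(actual, predicted):
    """Mean absolute percentage error in percent; assumes no actual value is zero."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)

def naive_forecast(history):
    """Placeholder model (assumption): the next price equals the last observed price."""
    return history[-1]

def walk_forward(prices, min_train, forecast_fn=naive_forecast):
    """One-step-ahead forecasts, each made using only the data before that step."""
    actual, predicted = [], []
    for t in range(min_train, len(prices)):
        predicted.append(forecast_fn(prices[:t]))
        actual.append(prices[t])
    return actual, predicted

closes = [100.0, 102.0, 101.0, 105.0, 104.0]  # made-up closing prices
actual, predicted = walk_forward(closes, min_train=2)
print(mse(actual, predicted), rmse(actual, predicted), mape(actual, predicted))
```

Because each forecast only sees earlier observations, the scores reflect how the model would have performed across successive time periods, which is what cross-validation for time series needs to measure.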
User Interface
Develop a simple user interface for end-users to input stock symbols and receive
forecasted price predictions.
Provide data visualizations of historical trends, ARIMA predictions, and machine
learning corrections.
- Intermediate tech skills
Traders | Professionals making frequent trades based on market trends, seeking precise, short-term predictions. | High-frequency trading needs; demands real-time predictions; familiar with stock analysis tools.
strategies. summarized
insights
- Minimal technical
focus
3.2 FUNCTIONAL REQUIREMENTS
The system should retrieve historical stock price data via the Yahoo Finance API.
It should support data retrieval for multiple stock symbols as specified by the user.
Data processing functions must handle missing values, data transformations, and
normalization as required by the ARIMA model.
The system should allow manual selection of the ARIMA model parameters (p, d, q) and offer
automated model selection, such as a grid search that picks the parameters with the lowest
Akaike Information Criterion (AIC).
It should support training the ARIMA model on historical data for a selected stock
symbol.
The system should save and load trained models for reuse.
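The automated parameter selection described above can be illustrated with a small AIC grid search. This sketch is a deliberate simplification and not the project's code: it fixes q = 0 and fits only the autoregressive part by least squares on the differenced series, because moving-average terms need iterative likelihood optimisation that a dedicated time-series library would normally provide. The function names and the simulated prices are assumptions.

```python
import numpy as np

def fit_ar(y, p, max_p):
    """Least-squares AR(p) fit; every order uses the same sample so AICs are comparable."""
    target = y[max_p:]
    n = len(target)
    # Column k holds the series lagged by k steps, aligned with the target values.
    cols = [np.ones(n)] + [y[max_p - k: len(y) - k] for k in range(1, p + 1)]
    X = np.column_stack(cols)
    beta, *_ = np.linalg.lstsq(X, target, rcond=None)
    resid = target - X @ beta
    return beta, float(resid @ resid), n

def select_order(prices, d=1, max_p=5):
    """Return (p, aic, coefficients) with the lowest AIC among AR(0)..AR(max_p)."""
    y = np.diff(np.asarray(prices, dtype=float), n=d)  # the "I" step: difference d times
    best = None
    for p in range(max_p + 1):
        beta, rss, n = fit_ar(y, p, max_p)
        # Gaussian AIC up to an additive constant; assumes a non-constant series (rss > 0).
        aic = n * np.log(rss / n) + 2 * (p + 1)  # p lag coefficients plus the intercept
        if best is None or aic < best[1]:
            best = (p, aic, beta)
    return best

rng = np.random.default_rng(0)
prices = 100 + np.cumsum(rng.normal(size=300))  # simulated random-walk prices
p, aic, beta = select_order(prices, d=1, max_p=4)
print("selected AR order:", p)
```

The penalty term 2(p + 1) is what stops the search from always choosing the largest order: an extra lag is accepted only if it reduces the residual sum of squares enough to offset it.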
Forecasting Functionality
Users should be able to select a forecast range (e.g., 1 day, 7 days, 30 days).
The system should generate stock price forecasts based on the ARIMA model for the
specified range.
The system must provide options for viewing predictions as a time-series chart, with
actual and predicted values overlaid.
The system should calculate and display performance metrics for the ARIMA model,
such as Mean Absolute Error (MAE), Mean Squared Error (MSE), and Root Mean
Squared Error (RMSE).
It should allow comparison between predicted and actual values over a defined period.
The system should offer a user-friendly interface where users can input stock symbols,
select parameters, view predictions, and interpret results easily.
The system should provide an option to download predictions and historical data in CSV
format.
The system should allow users to set specific conditions (e.g., price threshold) for
receiving notifications or alerts.
It should provide notifications for major deviations between predicted and actual values,
if relevant.
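The alerting requirements above can be sketched as a single pass over actual and predicted prices. This is an illustrative outline under stated assumptions: the `Alert` record, the `check_alerts` function, the 5% deviation limit and the sample prices are hypothetical and do not come from the project.

```python
from dataclasses import dataclass

@dataclass
class Alert:
    day: int       # index of the observation that triggered the alert
    kind: str      # "threshold" or "deviation"
    message: str

def check_alerts(actual, predicted, price_threshold, max_deviation_pct):
    """Flag days where the price reaches the user's threshold or the forecast misses badly.

    Assumes all actual prices are positive.
    """
    alerts = []
    for day, (a, p) in enumerate(zip(actual, predicted)):
        if a >= price_threshold:
            alerts.append(Alert(day, "threshold",
                                f"price {a:.2f} reached threshold {price_threshold:.2f}"))
        deviation_pct = abs(a - p) / a * 100.0
        if deviation_pct > max_deviation_pct:
            alerts.append(Alert(day, "deviation",
                                f"forecast off by {deviation_pct:.1f}%"))
    return alerts

alerts = check_alerts(actual=[100.0, 110.0, 120.0],
                      predicted=[101.0, 100.0, 119.0],
                      price_threshold=115.0,
                      max_deviation_pct=5.0)
for alert in alerts:
    print(alert.day, alert.kind, alert.message)
```

Keeping the check as a pure function makes it easy to run after every data refresh and to hand the resulting list to whatever notification channel the interface uses.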
3.3 EXTERNAL INTERFACE REQUIREMENTS
Web Interface:
Forms:
o Input forms for project creation, task assignment, and user registration/login.
o Modal dialogs for comments and file uploads.
Server Requirements:
o A cloud server to host the application, capable of handling multiple user requests.
User Devices:
Database Interface:
Web Frameworks:
o Integration with Express.js for server-side logic and API handling.
o Flask for the frontend user interface and state management.
API Communication:
Real-time Collaboration:
o WebSocket or similar technology for real-time updates and notifications among
users.
Latency: The system should respond to user queries within 1-2 seconds. Data retrieval
from Yahoo API should complete within 3-5 seconds, given a stable network.
Throughput: The system should handle multiple concurrent users without significant
performance degradation, supporting at least 100 simultaneous users.
Data Processing Speed: Data processing and model training should be optimized to
complete within a reasonable timeframe (e.g., under 2 minutes for training on 1 year of
data).
Scalability: The system should be scalable, with a design that allows it to handle
additional stock symbols, datasets, and new data sources without compromising
performance.
Fault Tolerance: In case of external API failure (e.g., Yahoo API downtime), the system
should retry the request up to three times before notifying the user of a connection issue.
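The fault-tolerance rule above (retry up to three times, then notify the user) can be sketched as a small wrapper around the data-fetch call. The names `fetch_with_retry` and `DataSourceError`, the one-second delay and the injected `sleep` parameter are assumptions made for illustration; `fetch` stands for whatever function actually calls the Yahoo API.

```python
import time

class DataSourceError(Exception):
    """Raised by the (hypothetical) fetch function when the external API call fails."""

def fetch_with_retry(fetch, symbol, max_retries=3, delay_seconds=1.0, sleep=time.sleep):
    """Call fetch(symbol); on failure retry up to max_retries times, then raise."""
    last_error = None
    for attempt in range(1 + max_retries):  # the first try plus the retries
        try:
            return fetch(symbol)
        except DataSourceError as err:
            last_error = err
            if attempt < max_retries:
                sleep(delay_seconds)  # wait before the next attempt
    # All attempts failed: surface one error the UI can show as a connection issue.
    raise ConnectionError(f"could not retrieve data for {symbol}") from last_error

# Usage with a stand-in fetcher that fails twice and then succeeds.
calls = {"n": 0}

def flaky_fetch(symbol):
    calls["n"] += 1
    if calls["n"] < 3:
        raise DataSourceError("service unavailable")
    return [101.0, 102.5]

print(fetch_with_retry(flaky_fetch, "ABC", sleep=lambda seconds: None))
```

Passing `sleep` as a parameter keeps the wrapper testable without real delays; a production version might also lengthen the delay after each failure.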
3.4.2 Safety Requirements
Data Handling Safety: The system should ensure safe handling of data inputs, especially
from external sources (e.g., Yahoo API), to prevent incorrect data affecting the
predictions.
Data Backup: User preferences, model configurations, and saved models should be
backed up periodically to prevent data loss during updates or crashes.
Error Handling: The system should provide informative error messages to users and
handle unexpected inputs gracefully, preventing crashes and ensuring the program
continues to operate normally.
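The data-handling and error-handling requirements above can be illustrated with a defensive cleaning step applied before any data reaches the model. This is a sketch under assumptions: the report does not say how invalid values are treated, so carrying the last valid price forward is a choice made here for illustration, and `clean_prices` is a hypothetical name.

```python
import math

def clean_prices(raw_prices):
    """Return floats, replacing invalid entries with the last valid price seen."""
    cleaned = []
    last_valid = None
    for value in raw_prices:
        try:
            price = float(value)
        except (TypeError, ValueError):
            price = math.nan  # non-numeric input is treated like a missing value
        if math.isfinite(price) and price > 0:
            last_valid = price
        elif last_valid is None:
            continue  # nothing to carry forward yet, so drop the entry
        cleaned.append(last_valid)
    if not cleaned:
        # An informative error instead of a crash further down the pipeline.
        raise ValueError("no valid prices in input")
    return cleaned

print(clean_prices([None, "101.5", float("nan"), -3, 102, "n/a"]))
```

Rejecting non-positive, non-finite and non-numeric values at the boundary means a faulty API response degrades the forecast slightly rather than corrupting it or stopping the program.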
Reliability: Ensure a stable environment where all functionalities, from data retrieval to
prediction, perform accurately under various conditions, with an uptime target of 99%.
Usability: The user interface should be intuitive, with clear instructions, tooltips, and
easy navigation to enable both novice and expert users to operate the system without
extensive training.
Maintainability: The system codebase should be modular and documented thoroughly to
allow for updates, including changes in API endpoints or the addition of new algorithms,
with minimal effort.
Portability: The system should be compatible with different operating systems and web
browsers, offering flexibility for users to access it from various platforms.
Extensibility: The design should allow for adding other forecasting models (e.g., LSTM,
Prophet) in the future, without requiring substantial rework of the system architecture.
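One way to meet the extensibility requirement above is a model registry, where every forecaster shares one call signature and is looked up by name. The sketch below is an assumption about how this could be organised, not a description of the project's architecture; the `register` decorator, the `MODELS` dictionary and the two simple models are stand-ins for ARIMA, LSTM or Prophet wrappers.

```python
from typing import Callable, Dict, List, Sequence

# Every model takes the price history and a horizon, and returns that many forecasts.
Forecaster = Callable[[Sequence[float], int], List[float]]
MODELS: Dict[str, Forecaster] = {}

def register(name: str):
    """Decorator that adds a forecasting function to the registry under a name."""
    def decorator(fn: Forecaster) -> Forecaster:
        MODELS[name] = fn
        return fn
    return decorator

@register("naive")
def naive(history, horizon):
    """Repeat the last observed price for every step of the horizon."""
    return [history[-1]] * horizon

@register("drift")
def drift(history, horizon):
    """Extend the average historical change per step (needs at least two points)."""
    slope = (history[-1] - history[0]) / (len(history) - 1)
    return [history[-1] + slope * (k + 1) for k in range(horizon)]

def forecast(model_name, history, horizon):
    """Dispatch to the registered model; the caller never imports a model directly."""
    return MODELS[model_name](history, horizon)

print(sorted(MODELS), forecast("drift", [1.0, 2.0, 4.0], 2))
```

With this layout, adding a new algorithm means writing one function and registering it; the interface and evaluation code keep calling `forecast` unchanged.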
Hardware Requirements
For a Stock Price Prediction System using an ARIMA model and Yahoo API, selecting an
appropriate System Development Life Cycle (SDLC) model is essential to accommodate the
complexity and dynamic nature of financial forecasting. Given the evolving requirements and
need for iterative refinement, here are some viable SDLC models:
1. Agile Model
2. Incremental Model
3. Waterfall Model
Overview: The Waterfall model is a linear, sequential SDLC approach that may be
suitable if the project’s requirements are well-defined and less likely to change. While
less flexible, it may work if stakeholders have clear and fixed expectations for the stock
prediction model and output.
Advantages:
o Clear Phases: Each stage is planned out, making it straightforward to track
progress and manage resources.
o Less Rework: Since requirements are well-defined at the start, this model reduces
the likelihood of substantial changes.
o Ease of Documentation: Since each phase is completed before moving to the
next, thorough documentation can be maintained, which is beneficial for projects
requiring regulatory compliance or detailed audits.
Given the complexity and iterative nature of stock price prediction, Agile is the most suitable
model for this project. This approach allows for continuous integration and testing, essential for
tuning the ARIMA model and incorporating real-time data through the Yahoo API. Agile’s
flexibility supports ongoing improvements, essential as model performance and user
requirements evolve with feedback and market data.
3.7 SYSTEM IMPLEMENTATION PLAN
- Define user requirements
3. System Design | Design system architecture | Week 4 | System architecture and data flow diagram
displaying predictions | friendly UI
8. Integration & Testing | Integrate frontend, backend, and model; conduct unit and integration testing | Week 10 | Fully integrated system; system passes required functional tests
9. Project Documentation | Write technical and user documentation | Week 11 | Comprehensive project documentation
10. Final Report & Presentation | Prepare final report & presentation | Week 12 | Project report and presentation for evaluation
CHAPTER 4
SYSTEM DESIGN
4.1 System Architecture
4.2 Data Flow Diagrams
Level-0 DFD
Level-1 DFD
4.3 Entity Relationship Diagrams
4.4 UML Diagrams
4.4.2 Class Diagram
4.4.3 Activity Diagram
4.4.4 Sequence Diagram
CHAPTER 5
OTHER SPECIFICATION
5.1 Advantages
5.2 Limitations
5.3 Applications
CHAPTER 6
CONCLUSIONS & FUTURE WORK
Stock price forecasting remains an active area of study, and time series forecasting
theory is a feasible and effective approach to it. This work uses an ARIMA model and a
BP neural network model to predict the closing price of stocks. The empirical results
show that these two models can predict future stock prices with reasonable accuracy, and
that short-term forecasting of stock prices is feasible and effective. However, this work
also has shortcomings. First, only static, single-step forecasting is used, although both
time series methods and neural networks can achieve multi-step forecasting and stock
prices are dynamic and continuous. Second, the fit of the prediction model is evaluated
only by relative error; the evaluation would be stronger if it also considered the
direction of future stock price changes. Finally, stock price prediction by itself does
not form a complete investment decision; at the least, it also requires effective risk
assessment and corresponding risk control methods.
In short, this work uses the historical closing prices of stocks as time series data to
construct an ARIMA model and a BP neural network model, and makes short-term
forecasts of future stock opening prices. The forecasting results are relatively good. The
two models are feasible and effective for short-term forecasting of stock price data.
References:
1. Ashish Sharma, Dinesh Bhuriya, Upendra Singh. "Survey of Stock Market Prediction
3. Xi Zhang, Siyu Qu, Jieyun Huang, Binxing Fang, Philip Yu, “Stock Market
5. Sachin Sampat Patil, Prof. Kailash Patidar, Asst. Prof. Megha Jain, “A Survey on Stock
6. https://www.cs.princeton.edu/sites/default/files/uploads/Saahil_magde.pdf