Stock Price Prediction Using ARIMA Model by Dereje Workneh Medium
Any kind of prediction is a difficult task in the real world, especially where
the future is very dynamic. The stock market is highly volatile and
unpredictable by nature, so investors always take risks in the hope of making
a profit. Many factors influence stock prices: supply and demand, market
trends, the global economy, corporate results, historical prices, public
sentiment, sensitive financial information, and popularity (such as good or
bad news related to a company name or product). Any of these may increase or
decrease the number of buyers. Even after analyzing many of these factors, it
is still difficult to outperform the market or to predict future prices in
general. Predicting the price of a specific stock one day ahead is, by itself,
a very complicated task.
In this blogpost, next-day stock prices are predicted for each individual day
of a certain year, and each prediction is compared with the actual price to
validate the model. The task is to predict the price of Apple (AAPL) stock
from historical (time series) data, which includes features such as opening
and closing prices, volume, and date.
A time series is a series of data points collected over a period of time and
indexed, listed, or graphed in time order. Time series data are sequential
and follow some patterns; they are also called historical data or past data.
Predicting a future value from historical values is one goal of time series
analysis. The daily closing price of stocks, heights of ocean tides, and
counts of sunspots are some examples of time series data. Time series data
are studied for several purposes, such as forecasting the future based on
knowledge of the past, understanding the phenomenon underlying the measures,
or simply succinctly describing the salient features of the series.
Forecasting future values of an observed time series plays an important role
in nearly all fields of science, engineering, finance, business intelligence,
economics, meteorology, and telecommunications. In this blogpost, a time
series model called AutoRegressive Integrated Moving Average (ARIMA) is the
technique used to analyze and predict future stock prices based on historical
prices.
Data Collection:
In this section, we will collect our data and load it into Python. The data
used for this blogpost are 5 years (2015–2020) of AAPL (Apple) stock prices
from Yahoo Finance, which you can download here. We chose to use the closing
value for our analysis. This is the workflow of the ARIMA model for this
blogpost:
Let’s check whether the data contain null values and look at the shape of the data.
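A minimal sketch of that check is shown here. The rows are made-up placeholder values arranged in the column layout of a Yahoo Finance CSV export, and the file name `AAPL.csv` mentioned in the comment is an assumption, not the name of the actual download.

```python
import io

import pandas as pd

# Made-up placeholder rows in the usual Yahoo Finance column layout.
# With the real download you would call pd.read_csv("AAPL.csv") instead
# (the file name is hypothetical).
SAMPLE_CSV = """Date,Open,High,Low,Close,Adj Close,Volume
2015-01-02,100.0,101.0,99.0,100.5,95.0,1000000
2015-01-05,100.5,102.0,100.0,101.5,96.0,1200000
2015-01-06,101.5,103.0,,102.0,96.5,900000
"""

df = pd.read_csv(io.StringIO(SAMPLE_CSV))

n_rows, n_cols = df.shape        # (number of rows, number of columns)
null_counts = df.isnull().sum()  # missing values per column

print(f"shape: {n_rows} rows x {n_cols} columns")
print(null_counts)
```

The third sample row deliberately leaves `Low` empty so that the null count has something to report.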
We saw earlier that the data type of ‘Date’ is object. So, first of all, we
have to convert it to datetime format; otherwise we cannot extract features
from it.
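The conversion could look roughly like this with `pandas.to_datetime`. The tiny DataFrame and the `Year` and `Month` feature columns are illustrative assumptions, not the original notebook's code.

```python
import pandas as pd

# Placeholder frame: 'Date' starts out as plain strings, as it does after read_csv.
df = pd.DataFrame({
    "Date": ["2015-01-02", "2015-01-05", "2015-01-06"],
    "Close": [100.5, 101.5, 102.0],
})

# Parse the strings into datetime64 values.
df["Date"] = pd.to_datetime(df["Date"])

# Once the column is datetime, date parts are available through the .dt accessor.
df["Year"] = df["Date"].dt.year
df["Month"] = df["Date"].dt.month

print(df.dtypes)
```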
We can see that the closing price increases over the specified time frame.
Bar graph
The correlation
The training dataset is 80% of the total dataset, while the test dataset is the
remaining 20%.
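For a time series, the split is normally chronological rather than random, so that the model is only tested on dates later than anything it trained on. A sketch under that assumption, using a placeholder series in place of the real closing prices:

```python
import numpy as np
import pandas as pd

# Placeholder series standing in for the AAPL closing prices.
close = pd.Series(np.arange(100, dtype=float), name="Close")

# The first 80% of observations (in time order) train the model;
# the last 20% are held out for testing.
split_idx = int(len(close) * 0.8)
train = close.iloc[:split_idx]
test = close.iloc[split_idx:]

print(len(train), len(test))
```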
ARIMA model:
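ARIMA stands for AutoRegressive Integrated Moving Average. It is specified
by three ordered parameters (p, d, q), where:
p is the order of the autoregressive part (number of lagged observations
used)
d is the degree of differencing (number of times the data have had past
values subtracted)
q is the order of the moving-average part (number of lagged forecast errors
used)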
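To make the differencing parameter d concrete, here is a small illustration with made-up numbers. `numpy.diff` subtracts each value from the one that follows it, and applying it twice corresponds to d = 2.

```python
import numpy as np

# Made-up prices whose increments grow by 1 each step.
prices = np.array([100.0, 102.0, 105.0, 109.0, 114.0])

# d = 1: subtract the previous value once; the increments are 2, 3, 4, 5.
first_diff = np.diff(prices, n=1)

# d = 2: difference the differences; every value is now 1 (constant).
second_diff = np.diff(prices, n=2)

print(first_diff)
print(second_diff)
```

Each round of differencing shortens the series by one observation.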
Before fitting the ARIMA model, we have to make our data stationary. A series
is stationary when its mean, variance, and autocovariance do not change over
time.
Stationarity matters because a linear regression assumes that the
observations are independent of each other. In a time series, however, we
know that observations are time dependent. It turns out that a lot of nice
results that hold for independent random variables (the law of large numbers
and the central limit theorem, to name a couple) also hold for stationary
random variables. So by making the data stationary, we can actually apply
regression techniques to this time-dependent variable.
There are two methods to check the stationarity of a time series. The first is
by looking at the data. By visualizing the data it should be easy to identify a
changing mean or variance. For a more accurate assessment there is the
Dickey-Fuller test. I won’t go into the specifics of this test, but if the
‘Test Statistic’ is less than (more negative than) the ‘Critical Value’, the
null hypothesis of a unit root is rejected and the time series is considered
stationary. Below is code that will help you visualize the time series and test
for stationarity.
We can easily see that the time series is not stationary, and our
adf_test_stationarity function confirms what we see.
For this blogpost, we plot the ACF and PACF charts to find the optimal
parameters. The next step is to determine the tuning parameters of the model
by looking at the autocorrelation and partial autocorrelation graphs. There
are many rules and practices for selecting the appropriate AR, MA, seasonal
AR (SAR), and seasonal MA (SMA) terms for the model. The chart below
provides a brief guide on how to read the autocorrelation and partial
autocorrelation graphs to select the proper terms. The big issue, as with all
models, is that you don’t want to overfit your model to the data by using too
many terms.
ACF plot
PACF plot
Build Model:
Below are the steps we follow for implementing auto ARIMA:
Summary:
The general steps to implement an ARIMA model on time series data are:
1. Load the data: The first step for model building is of course to load the
dataset.
5. Create ACF and PACF plots: This is the most important step in ARIMA
implementation. ACF and PACF plots are used to determine the input
parameters for our ARIMA model.
6. Determine the p and q values: Read the values of p and q from the plots
in the previous step.
7. Fit ARIMA model: Using the processed data and parameter values we
calculated from the previous steps, fit the ARIMA model.
Final Words
Thank you for reading and for your time. If you liked this story, please hold
the clap button. I would also be happy to hear your feedback. See you in the
next blogpost!