
Prediction of S&P 500 Index Based On Regression

Zhang Daping, Qin Shihan, Shi Ziyue

Abstract: With the development of modern finance, more and more researchers have attempted to predict the financial market, although its volatility, non-linearity, and poor controllability make it difficult to forecast. In this paper, regression methods, including linear regression and ridge regression, are implemented to predict the S&P 500 Index. First, a data set containing one year of S&P 500 closing prices is collected from Yahoo Finance. Then, regression models based on linear and ridge regression are trained for prediction. Last, RMSE and the F2 score are used as evaluation metrics to compare the models. It can be concluded that the multiple linear regression and ridge regression models performed better than the other models, which can provide a reference for investment decisions and financial market analysis.

Keywords: Linear Regression, Ridge Regression, S&P 500 Index (SPX)

I. Introduction
At the end of the 20th century, with the rapid development of computer technology, some financiers began to use computer algorithms to simulate and predict economic development, design economic models, and construct stock portfolios. With the advent of the big-data era in the 21st century, the data and information available to people began to grow exponentially, and model-based algorithms came to play an indispensable role in economic prediction, helping investors avoid stock market risks as much as possible and obtain greater returns at relatively small cost.
Regression models for stock data analysis and recommendation can draw on different data sources and different kinds of content, and through further analysis act as "multiple eyes and ears" on stock trading information. This overcomes the shortcomings of the traditional trading process, in which little information is acquired, acquisition takes a long time, and trading judgments are one-sided and arbitrary [1]. At the same time, people with basic accounting knowledge can accurately grasp the logic by which the current market prices stocks within an industry sector. By using different regression models, we provide a scientific measurement mechanism for real-time prediction of stock trends, real-time monitoring of stock information, and real-time evaluation of individual stocks. It is worth noting that such models are highly time-sensitive and are not stable across different market trends.

The Python language is widely used for financial and economic problems, advanced mathematics, time series, and related topics, and can be applied to artificial intelligence, big-data statistics and forecasting, website development, and other fields. It is a free, open-source programming language developed by the Dutch programmer Guido van Rossum [2]. Regression analysis mainly explores the correlation between independent and dependent variables, and builds a regression model from this relationship to predict the future trend of the dependent variable. Common regression models include linear regression, ridge regression, logistic regression, and stepwise regression. This paper uses Python to train linear regression, multiple linear regression, and ridge regression models, and compares them to find the models best suited to the current dataset.

Among Python model libraries, scikit-learn (sklearn) is one of the most common, providing methods for regression, dimensionality reduction, classification, and other machine learning tasks, and it is a popular tool for machine learning practice. In this paper, when using sklearn to build a regression model, the main steps are as follows (a short code sketch follows the list):
(1) Determine the independent and dependent variables according to the data set
and prediction target;
(2) Select the appropriate features and explain the selected features in-depth;
(3) Clean the data set, filter the usable data, and standardize or normalize the different data as appropriate;
(4) Estimate model parameters, establish linear regression, multiple linear
regression, ridge regression and other regression models;
(5) Test different regression models;
(6) Use different regression models to make predictions on data and compare the
results.
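
To make these steps concrete, the following is a minimal sketch of the workflow in scikit-learn. The file name and column names (Num, Volume, VIX, Close) are illustrative assumptions based on the dataset described later, not the authors' actual code.

```python
# A minimal sketch of the six steps above using scikit-learn.
# The CSV path and column names are illustrative assumptions.
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression, Ridge
from sklearn.metrics import mean_squared_error

# (1) determine independent variables X and dependent variable y
df = pd.read_csv("spx.csv")                      # hypothetical file
X = df[["Num", "Volume", "VIX"]]                 # (2) chosen features
y = df["Close"]

# (3) basic cleaning: drop rows with missing values
mask = X.notna().all(axis=1) & y.notna()
X, y = X[mask], y[mask]

# (4) estimate parameters for the regression models
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, shuffle=False)
models = {"linear": LinearRegression(), "ridge": Ridge(alpha=1.0)}
for name, model in models.items():
    model.fit(X_train, y_train)                  # (5) test each model
    pred = model.predict(X_test)
    rmse = mean_squared_error(y_test, pred) ** 0.5
    print(name, rmse)                            # (6) compare results
```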

II. Literature review


Research has demonstrated the use of different regression techniques and their applications in multiple areas. They have long been known as evolving and effective prediction tools in economics, engineering, and chemistry. Multiple linear, ridge, gradient boosting, and other regression techniques have been applied in financial forecasting cases such as house price prediction, which require validly processed data sets to identify the right time to purchase specific houses.

Previous works in finance conduct experiments with multiple regression methods, fitting models to the requirements of each situation. Some predictions use several regression models, such as multiple linear, ridge, LASSO, and AdaBoost regression [3], and then compare their final results using error functions, aiming for the best prediction accuracy. Further studies in stock markets also demonstrate the validity of regression techniques in this industry alongside many other machine learning algorithms. A study by Srinath Ravikumar and Prasad Saraf showed that prediction based on historical data can use regression algorithms to predict a company's closing price, and classification algorithms to predict whether the stock price will decrease or increase the next day [4]. The combination of regression and classification has since been proposed as an application.

Beyond the financial industry, regression techniques are also of great significance in recognition problems in engineering and the social sphere. In speech recognition, kernel ridge regression performs well on sensitivity measures and time complexity compared with other machine learning algorithms [5]. Kernel ridge regression is also used in hyperspectral image classification, an engineering application. In that area, spectral-spatial ridge linear regression for hyperspectral image classification sometimes cannot handle linearly inseparable data sets, so the work of Chunhui Zhao, Wu Liu, and others introduced kernel ridge regression and shared subspace learning [6].

During the research process, most works go through a data collection stage; for finance-related papers this often involves mining historical raw data. Next comes data pre-processing, in which null values and different data types are detected and selected according to the requirements. For linear regression, non-linear and linearly inseparable data sets are reprocessed or removed. The data are then separated into training and testing sets; different train-test percentages also yield different results. After the separation, researchers fit the data to the models and record, compare, and analyze the results.

For studies of stock markets, Srinath Ravikumar and Prasad Saraf described the development of deep learning models applied to stock markets. Regression models have long been useful in price prediction, and studies also tend to choose appropriate factors affecting the stock price and take them as variables. Support vector machine algorithms are also applied to stock price prediction, and showed better accuracy than other machine learning algorithms in the study by Pushkar Khanal and Shree Raj Shakya [7].

In this paper, we continue this line of study, using linear regression, multiple linear regression, and ridge regression to predict stock prices and compare the results.

III. Methodology
Simple linear regression:
Linear regression is a machine learning algorithm based on supervised learning. It models the target value from the independent variable, and its main goal is to find the relationship between the variables and the predicted value; different regression models vary according to the relationship between the dependent and independent variables they consider and the number of independent variables used. In simple linear regression [8], the target variable (Y) depends on a single independent variable (X), and the model establishes a linear relationship between these two variables. This technique is commonly used in predictive analytics and time series models, and to discover causal relationships between variables. Curves or lines are fitted to the data points, with the goal of minimizing the distance from the curve to the points. A linear regression model attempts to fit a line between the independent variable (X) and the dependent variable (Y), and the equation of this line can be expressed as:

$Y = a + bX$

where $a$ and $b$ are model parameters called regression coefficients: $a$ is the Y-intercept of the line (the value of Y when X = 0), and $b$ is the slope of Y with respect to X. Finding the best value of $b$ means finding the best-fit line. When we finally use the model to make predictions, it predicts the value of the dependent variable Y for an input value of the independent variable X.

For linear regression algorithms, there are mainly the following steps:
(1) Read in the given dataset as the initial input;
(2) Obtain the objective (loss) function: to better measure the results, we quantify an objective function so that the model can be optimized during the solving process. We define the loss function as:

$L(a, b) = \frac{1}{n}\sum_{i=1}^{n}\left(y_i - (a + b x_i)\right)^2$

This is the average squared distance between the predicted value and the true value, also called the mean squared error (MSE).
(3) To minimize the loss function we can use the least squares method, whose closed-form solution is:

$b = \frac{\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n}(x_i - \bar{x})^2}, \quad a = \bar{y} - b\bar{x}$

(4) Alternatively, perform gradient descent on the loss function and update the parameters iteratively:

$a \leftarrow a - \alpha\frac{\partial L}{\partial a}, \quad b \leftarrow b - \alpha\frac{\partial L}{\partial b}$

where $\alpha$ is the learning rate. A small code sketch of these steps follows.
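
As an illustration of steps (2)-(4), here is a minimal NumPy sketch of minimizing the MSE loss by gradient descent on synthetic data; the learning rate and iteration count are arbitrary choices, not values from the paper.

```python
# Gradient descent for simple linear regression on synthetic data.
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(0, 1, 100)
y = 2.0 + 3.0 * x + rng.normal(0, 0.1, 100)   # true intercept 2, slope 3

a, b, lr = 0.0, 0.0, 0.1                      # parameters and step size
for _ in range(5000):
    err = (a + b * x) - y                     # residuals of current fit
    grad_a = 2 * np.mean(err)                 # dL/da of the MSE loss
    grad_b = 2 * np.mean(err * x)             # dL/db of the MSE loss
    a -= lr * grad_a                          # gradient descent updates
    b -= lr * grad_b

print(a, b)   # should approach the true intercept 2 and slope 3
```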

Simple linear regression has the advantages of simplicity, interpretability, and wide availability, so it is often the first method considered for a problem. But because of that very simplicity, its scope is narrow: in most real-world cases the data do not satisfy the assumptions of the linear regression model at all, or the model struggles to produce useful results.

Multiple linear regression:

Multiple linear regression is a statistical procedure that examines the degree of association between a set of independent variables and a dependent variable. In contrast to simple linear regression, there are many predictors, and the value of the dependent variable Y is calculated from the values of the predictors. When the predicted quantity y is affected by multiple explanatory variables x at the same time, and each explanatory variable is linearly related to y, the multiple linear regression model can be used for analysis and prediction. For example, if the target value depends on n independent variables, the regression equation fits a regression hyperplane in n-dimensional space. Establishing a multiple linear regression model involves three main steps [9]:

(1) The multiple regression model is established according to the formula:

$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_k x_k + \varepsilon$

which expresses a multiple linear regression model for the dependent variable y and the independent variables $x_1, \ldots, x_k$.
(2) Parameter estimation of regression models is usually done by least squares estimation, that is, finding the regression constant and regression coefficients that minimize the sum of squared deviations:

$\min_{\beta_0, \ldots, \beta_k} \sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2$
(3) The goodness-of-fit of the model is evaluated. Since different multiple linear regression models do not necessarily contain the same number of variables, the more common evaluation index is the coefficient of determination adjusted for degrees of freedom, which can be expressed as:

$\bar{R}^2 = 1 - \frac{(1 - R^2)(n - 1)}{n - k - 1}$

where n is the sample size and k is the number of independent variables in the model; (n - 1) and (n - k - 1) are the degrees of freedom of the total sum of squares and the residual sum of squares, respectively [9]. The coefficient obtained from this formula takes a value between 0 and 1: the closer to 1, the greater the explanatory power of the independent variables, and the closer to 0, the weaker it is. A small helper for this statistic is sketched below.
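
A minimal implementation of the adjusted coefficient of determination above might look as follows; it assumes the plain R² is computed with sklearn's r2_score.

```python
# Adjusted R^2 as defined above; assumes sklearn for plain R^2.
from sklearn.metrics import r2_score

def adjusted_r2(y_true, y_pred, k):
    """Adjusted R^2 for n samples and k independent variables."""
    n = len(y_true)
    r2 = r2_score(y_true, y_pred)
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)

# toy usage: 5 samples, a model with 2 independent variables
print(adjusted_r2([1, 2, 3, 4, 5], [1.1, 1.9, 3.2, 3.9, 5.1], k=2))
```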

Ridge Regression
Ridge regression is also known as regularized linear regression, and in statistics as Tikhonov regularization [10]. It estimates the coefficients of multiple-regression models in situations where the independent variables are highly correlated. Here we also need to introduce another concept: multicollinearity.

Multicollinearity occurs when one predictor variable in a multiple regression model can be predicted from the others. In this situation, multicollinearity indicates correlation between the independent variables in the modeled data. Multicollinearity does not affect the overall predictive ability or reliability of the model, but it changes the results for individual predictors; hence it may lead to invalid individual-predictor estimates and non-identifiable parameters.

Collinearity is a linear association between two variables: if $X_1$ and $X_2$ have an exact linear relationship between them, they are perfectly collinear and satisfy

$X_{2i} = \lambda_0 + \lambda_1 X_{1i}$

for all observations i. The same holds for multicollinearity, where the number of variables increases from 2 to k; that is, for all observations i,

$\lambda_0 + \lambda_1 X_{1i} + \lambda_2 X_{2i} + \cdots + \lambda_k X_{ki} = 0$

with the $\lambda_j$ not all zero. A quick rank check for such exact dependence is sketched below.
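
As a quick illustration (with toy data, not the paper's dataset), exact collinearity can be detected by checking the rank of the design matrix:

```python
# Detecting perfect (multi)collinearity via matrix rank.
import numpy as np

x1 = np.array([1.0, 2.0, 3.0, 4.0])
x2 = 5.0 + 2.0 * x1                           # exactly collinear with x1
X = np.column_stack([np.ones(4), x1, x2])     # design matrix [1, x1, x2]

# rank < number of columns means an exact linear dependence exists
print(np.linalg.matrix_rank(X), X.shape[1])   # prints "2 3"
```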

Since multicollinearity leads to invalid predictor results, ridge regression is particularly suited to reducing the problem of multicollinearity in linear regression. In general, the method improves efficiency in parameter estimation problems in exchange for a tolerable amount of bias.

The following equations display and explain the detailed Tikhonov regularization (ridge regression) steps.

First, suppose that for a matrix A and a vector b we wish to find a vector x such that

$Ax = b$

The standard method is ordinary least squares (OLS) linear regression, but sometimes no x satisfies the equation, or more than one solution exists. In these circumstances we have an overdetermined or underdetermined system of equations. The standard approach to solving an overdetermined system of linear equations is linear least squares, which seeks to minimize the residual

$\|Ax - b\|_2^2$

where $\|\cdot\|_2$ is the Euclidean norm.

In order to give preference to a particular solution with desirable properties, a regularization term can be included in the minimization:

$\|Ax - b\|_2^2 + \|\Gamma x\|_2^2$

In some cases the Tikhonov matrix $\Gamma$ is chosen as a scalar multiple of the identity matrix, which is also known as L2 regularization. This regularization improves the conditioning of the problem, enabling a numerical solution. An explicit solution, denoted $\hat{x}$, is given by

$\hat{x} = (A^T A + \Gamma^T \Gamma)^{-1} A^T b$

For $\Gamma = 0$ this reduces to the unregularized least squares solution, provided $(A^T A)^{-1}$ exists.
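
A minimal NumPy sketch of this closed-form solution, assuming the common choice $\Gamma = \sqrt{\alpha}\,I$ so that $\Gamma^T \Gamma = \alpha I$:

```python
# Closed-form Tikhonov (ridge) solution with Gamma = sqrt(alpha) * I.
import numpy as np

def ridge_solve(A, b, alpha=1.0):
    """Return x_hat = (A^T A + alpha * I)^(-1) A^T b."""
    n_features = A.shape[1]
    return np.linalg.solve(A.T @ A + alpha * np.eye(n_features), A.T @ b)

A = np.array([[1.0, 1.0], [1.0, 2.0], [1.0, 3.0]])  # overdetermined system
b = np.array([1.0, 2.0, 2.5])
print(ridge_solve(A, b, alpha=0.1))
```

Using np.linalg.solve instead of explicitly inverting the matrix is the standard, numerically safer way to evaluate this expression.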

IV. Experiment
Dataset
Many factors influence the price of a stock [11], such as the company's financial situation, the macroeconomic environment, and even the personal behavior of its executives, which make it changeable and hard to predict [12]. A stock index, which aggregates the stocks of many companies, is therefore more predictable. The Standard and Poor's 500, also known as the S&P 500, is an index that tracks the stock prices of 500 of the largest companies in the United States. In the experiment, historical data from 12/27/2021 to 12/23/2022 are extracted from Yahoo Finance [13], a total of 253 valid trading days excluding holidays. As shown in Table 1, the original data include the date, closing price, trading volume, and VIX index for each trading day, ranked from the newest day to the oldest.
TABLE 1 Head of Dataset
Preprocessing
Daily fluctuations in the stock price do not affect the long-term price, so only the closing price is used as the prediction target. First, a column Num is generated to represent the index of each daily record. Then null values are checked and the thousands separators are removed from all data. Since the original data are in 'object' format, they are converted to float for convenience. The line graph for the whole target year is shown in Fig. 1. The original dataset is split into a training set and a test set at a ratio of 8:2, i.e., 202 and 51 samples respectively. Finally, a column with the VIX index is added to improve model performance, as sketched below.
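
A sketch of these preprocessing steps in pandas follows; the file name and column labels are assumptions about the Yahoo Finance export, not the authors' exact code.

```python
# Preprocessing sketch: index column, null check, separator removal,
# type conversion, and the 8:2 train/test split described above.
import pandas as pd

df = pd.read_csv("spx_2022.csv")                  # hypothetical file
df["Num"] = range(len(df))                        # index of daily data
df = df.dropna()                                  # remove null values
for col in ["Close", "Volume", "VIX"]:
    # strip thousands separators and cast 'object' columns to float
    df[col] = df[col].astype(str).str.replace(",", "").astype(float)

split = int(len(df) * 0.8)                        # 8:2 train/test split
train, test = df.iloc[:split], df.iloc[split:]
print(len(train), len(test))                      # 202 and 51 for 253 rows
```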

Fig. 1: S&P 500 Index

Evaluation
Two evaluation metrics, root mean square error (RMSE) and the F2 score, are used to evaluate the models. In terms of RMSE, multiple linear regression and ridge regression are clearly much better than simple linear regression, while the two better models perform similarly to one another. The F2 score tells the same story: both of the better models achieve scores above 0.8, indicating a good fit between model and data. A metric-computation sketch follows Table 2.
TABLE 2 Experiment Results

Model    RMSE            F2
Linear   185.11484053    0.62502278
Multi    105.83258720    0.88270912
Ridge    105.84222311    0.88268776
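
For reference, the two metrics can be computed as sketched below. Since the reported "F2" values lie between 0 and 1 and are described as measuring model-data fit, the sketch assumes they correspond to the coefficient of determination (sklearn's r2_score); this interpretation is an assumption, not stated in the paper.

```python
# Metric sketch: RMSE via sklearn, and the "F2" score assumed here
# to be the coefficient of determination (r2_score).
from sklearn.metrics import mean_squared_error, r2_score

def evaluate(y_true, y_pred):
    rmse = mean_squared_error(y_true, y_pred) ** 0.5
    score = r2_score(y_true, y_pred)          # assumed "F2" score
    return rmse, score
```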

Meanwhile, the prediction results are visualized and compared to the actual data, as shown in Fig. 2 and Fig. 3. Consistent with the evaluation results, the graphs of multiple linear regression and ridge regression are closer to the actual market movement. We can also see that all the predictions lag the actual market to some degree. This corresponds to the financial theory that the market cannot be predicted exactly [14], and it is unavoidable that regression methods fall behind the actual data. Nevertheless, the result is still meaningful in suggesting the possible trend of the future market, and the gap between prediction and reality can be narrowed by training on more historical data and adopting more comprehensive models.
Fig. 2: Performance of multiple linear regression
Fig. 3: Performance of ridge regression

V. Conclusion
In this paper, the closing price of the S&P 500 is predicted by several regression-based models: linear regression, multiple linear regression, and ridge regression. From the results we can conclude that the multiple linear regression model performs better than the others under the RMSE and F2 evaluations. The proposed model shows high performance in predicting the S&P 500, which could contribute to the analysis of financial market trends. To focus on the comparison between regression methods, only a few variables and one year of data are used for the dataset. In future work, data over a longer range and more indicators can be used to further improve performance.
Reference:
[1] Zhang Xiaolei, Chen Lei, “Design and Implementation of Popular Stock Analysis
and Recommendation System Based on Linear Regression”, Nov.2022
[2] He Xiaonian, Duan Fenghua, “Case Analysis of Linear Regression Based on
Python”, No.11.2022
[3] House Price Prediction Using Regression Techniques: A Comparative Study
https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=8882834
[4] Prediction of Stock Prices using Machine Learning (Regression,Classification)
Algorithms. https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=9154061
[5] Kernel Ridge Regression method applied to speech recognition problem: a novel
approach
https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=7043378
[6] Hyperspectral Image Classification via Spectral–Spatial Shared Kernel Ridge
Regression
https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=8721560
[7] Pushkar Khanal, Shree Raj Shakya “Analysis and Prediction of Stock Prices of
Nepal using different Machine Learning Algorithms” Department of Mechanical
Engineering, Pulchowk campus, Institute of Engineering, Tribhuvan University,
Nepal
[8] Sudhir Panda, Biswajit Purkayastha, Dolly Das, Manomita Chakraborty, Saroj
Kumar Biswas,” Health Insurance Cost Prediction Using Regression Models “,
(COM-IT-CON), 26-27 May 2022
[9] Wei Zheng, Yunfeng Dong, Can Huang, Di Yang, "The Impact of Employment on
Fiscal Revenue Based on Multiple Linear Regression Model", 03 2022
[10] Ahn, J. J., Byun, H. W., Oh, K. J., & Kim, T. Y. (2012). Using ridge regression
with genetic algorithm to enhance real estate appraisal forecasting. Expert Systems
With Applications, 39(9), 8369–8379. https://doi.org/10.1016/j.eswa.2012.01.183
[11] Phayung Meesad & Risul Islam Rasel, “Predicting Stock Market Price Using
Support Vector Regression”, p. 6
[12] A. Altan and S. Karasu, “THE EFFECT OF KERNEL VALUES IN SUPPORT
VECTOR MACHINE TO FORECASTING PERFORMANCE OF FINANCIAL
TIME SERIES”, The Journal of Cognitive Systems Vol. 4, No. 1, 2019, pp.20
[13] “S&P 500 historical data,” Yahoo! Finance, 27-Dec-2022. [Online]. Available:
https://finance.yahoo.com/quote/%5EGSPC/history/. [Accessed: 27-Dec-2022].
[14] S. Karasu, et al., “Prediction of Solar Radiation based on Machine Learning
Methods”, The Journal of Cognitive Systems, Vol. 2, No. 1, pp. 16-20, 2017.
