
Available online at www.sciencedirect.com

ScienceDirect
Procedia Computer Science 31 (2014) 406 – 412

2nd International Conference on Information Technology and Quantitative Management, ITQM 2014

A SVM Stock Selection Model within PCA


Huanhuan Yu a, Rongda Chen b,*, Guoping Zhang c

a School of Finance, Zhejiang University of Finance & Economics, Hangzhou, 310018, China
b School of Finance, Zhejiang University of Finance & Economics, Hangzhou, 310018, China
c School of Economics and International Trade, Zhejiang University of Finance & Economics, Hangzhou, 310018, China

Abstract

In financial markets, well-performing stocks usually share specific features in their financial figures. This paper introduces the machine learning method of support vector machines (SVM) to construct a stock selection model capable of nonlinear classification of stocks. However, the accuracy of SVM classification is very sensitive to the quality of the training set. To avoid using the complicated, high-dimensional financial ratios directly, we bring principal component analysis (PCA) into the SVM model to extract low-dimensional, efficient feature information, which improves training accuracy and efficiency while preserving the features of the initial data. The empirical results show that the stock selection model based on SVM with PCA after norm-standardization achieves an overall accuracy of 75.4464% on the training set and 61.7925% on the test set. Furthermore, the annual earnings of the stock portfolio selected by the PCA-SVM model significantly outperform those of the A-share index of the Shanghai Stock Exchange.

© 2014 The Authors. Published by Elsevier B.V. Open access under CC BY-NC-ND license.
Selection and peer-review under responsibility of the Organizing Committee of ITQM 2014.

Keywords: machine learning; stock selection; principal components analysis; support vector machine

1. Introduction

Stock has always been one of the most popular investment instruments in financial markets. Investors and researchers have devoted themselves to developing methods that can accurately select stocks with favorable future returns as constituents of an investment portfolio. Guo and Zhang1, Kuo et al.2 and Tsumato et al.3 develop several methods to forecast stock prices or pick qualified stocks from large samples. However, traditional stock selection models usually face challenges when dealing with high-dimensional, nonlinear sample data, because stock selection is a decision problem with multiple objectives and multiple constraints resting on huge volumes of high-dimensional financial data. The machine learning theory of Artificial Neural Networks (ANN) can capture the regular patterns hidden behind complex, high-dimensional data through learning4,5. Although ANN performs better than traditional methods, it also has several drawbacks, such as the difficulty of determining the network structure, the problem of local minima, and over-fitting. Vapnik6 proposed a new machine learning method called the Support Vector Machine (SVM), which handles high-dimensional data better while avoiding the defects of ANN. SVM is widely applied in many fields because of these particular advantages. Many studies, domestic and foreign, use SVM to predict stock prices or reversal points, as in Yeh et al.7 and Huang8, but SVM is seldom used to establish stock selection models, and such work is especially rare for the domestic market.

* Corresponding author. Tel.: +860571-85750010; fax: +860571-85212001. E-mail address: [email protected].
doi:10.1016/j.procs.2014.05.284
This paper applies SVM to the domestic stock market to establish an effective stock selection model. We take the financial ratios of companies listed on the A-share market of the Shanghai Stock Exchange as the original data and preprocess them with principal component analysis (PCA). First, we establish a stock selection model (PCA-SVM) that recognizes high-return stocks by training SVM on the training set. Second, we apply PCA-SVM to the test set to forecast the high-return stocks of the following year and compare the forecasts with actual returns to illustrate the effectiveness of the established stock selection model.

2. Principal components analysis (PCA)

The financial ratios of a listed company cover earnings ability, growth ability, solvency and so on, and each category contains many sub-ratios. If all the ratios were used directly as inputs to the training set, the result would be redundancy and low efficiency, and the quality of the empirical results could even deteriorate. Instead, new variables can be created through transformations of the original variables: the number of variables is smaller while most of the information is retained. These new variables are called principal components.

2.1. Definition of principal components

Principal components can be expressed as follows:

$$
\begin{cases}
Y_1 = \alpha_1^T X = \alpha_{11}X_1 + \alpha_{12}X_2 + \cdots + \alpha_{1n}X_n \\
Y_2 = \alpha_2^T X = \alpha_{21}X_1 + \alpha_{22}X_2 + \cdots + \alpha_{2n}X_n \\
\quad\vdots \\
Y_n = \alpha_n^T X = \alpha_{n1}X_1 + \alpha_{n2}X_2 + \cdots + \alpha_{nn}X_n
\end{cases}
\tag{1}
$$

where $X_i$ is an original variable, $Y_i$ is a principal component and $\alpha_i$ is the corresponding coefficient vector. $\alpha_i$ can be estimated by maximizing $\mathrm{Var}(Y_i)$ under the constraints $\alpha_i^T \alpha_i = 1$ and $\mathrm{Cov}(Y_i, Y_j) = \alpha_i^T \Sigma \alpha_j = 0$, $j = 1, 2, \ldots, i-1$, where $\Sigma = (\sigma_{ij})_{n \times n}$ is the covariance matrix of $X$.

2.2. Selection of principal components

The covariance matrix of $X = (X_1, X_2, \ldots, X_n)^T$, $\Sigma = (\sigma_{ij})_{n \times n}$, is a symmetric non-negative definite matrix. Therefore it has $n$ characteristic roots $\lambda_1, \lambda_2, \ldots, \lambda_n$ and $n$ characteristic vectors. Suppose $\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_n \ge 0$ and the orthogonal unit eigenvectors are $e_1, e_2, \ldots, e_n$. The $i$-th principal component of $X_1, X_2, \ldots, X_n$ can be expressed as follows:

$$
Y_i = e_{i1}X_1 + e_{i2}X_2 + \cdots + e_{in}X_n, \quad i = 1, 2, \ldots, n
\tag{2}
$$

with $\mathrm{Var}(Y_i) = e_i^T \Sigma e_i = \lambda_i$ and $\mathrm{Cov}(Y_i, Y_j) = e_i^T \Sigma e_j = 0$, $i \ne j$. The accumulated contribution rate of the first $p$ principal components is

$$
ACR(p) = \sum_{i=1}^{p} \lambda_i \Big/ \sum_{i=1}^{n} \lambda_i
\tag{3}
$$

which represents the explanatory power of the extracted principal components for the original data. Generally, an ACR of at least 85% is required; otherwise the PCA method is considered unsuitable because too much of the original information is lost.
Since the covariance matrix is sensitive to the order of magnitude of the data, we need to standardize the data first. Two methods of standardization are in common use:
• Norm-standardization: $X_{ij}^{*} = (X_{ij} - \bar{X}_j) / s_j$, where $\bar{X}_j$ is the mean and $s_j$ is the standard deviation of the $j$-th variable.
• Mean-standardization: $X_{ij}^{*} = X_{ij} / \bar{X}_j$, where $\bar{X}_j$ is the mean of the $j$-th variable.
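The sketch below illustrates norm-standardization followed by PCA with the ACR rule of eq. (3). It is a minimal Python/NumPy example on hypothetical data; the paper itself publishes no code, so the function names, the random input and the handling of the 85% threshold are illustrative assumptions.

```python
import numpy as np

def norm_standardize(X):
    # Norm-standardization from Section 2.2: subtract the column mean
    # and divide by the column standard deviation.
    return (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)

def pca_by_acr(X, acr_threshold=0.85):
    """Return principal-component scores, keeping the smallest number of
    components whose accumulated contribution rate (eq. 3) reaches the threshold."""
    Xs = norm_standardize(X)
    cov = np.cov(Xs, rowvar=False)                 # covariance matrix Sigma
    eigvals, eigvecs = np.linalg.eigh(cov)         # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1]              # sort descending: lambda_1 >= ... >= lambda_n
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    acr = np.cumsum(eigvals) / eigvals.sum()       # accumulated contribution rate
    p = int(np.searchsorted(acr, acr_threshold)) + 1
    return Xs @ eigvecs[:, :p], acr[p - 1]

# Hypothetical example: 40 stocks with 18 financial ratios each
ratios = np.random.default_rng(0).normal(size=(40, 18))
scores, reached_acr = pca_by_acr(ratios)
print(scores.shape, round(reached_acr, 4))
```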

3. Support vector machine

3.1. Linear classification of SVM

Linear classification with SVM is realized by solving for the optimal separating hyperplane when the training set is linearly separable. If the two mingled classes ($C_1$, $C_2$) of a sample can be separated correctly by a linear function ($H_0$) in a two-dimensional plane, the sample is treated as linearly separable.
Suppose the training set is $\{(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)\}$, where $x_i$ is the sample information vector ($x_i$ is the coordinate vector in a two-dimensional plane), $y_i \in Y = \{+1, -1\}$, and $+1$ represents class $C_1$ while $-1$ represents class $C_2$. If the linear separating hyperplane $H_0: w^T x + b = 0$ separates the training set correctly, this is equivalent to the situation where $w^T x_i + b \ge 1$ when $y_i = +1$ and $w^T x_i + b \le -1$ when $y_i = -1$. If the distance $D^*$ between the two data clusters of the sample is maximized, this hyperplane is called the optimal separating hyperplane for this classification problem. Define $D^* = d_+ + d_-$, with

$$
d_{\pm} = \min_{i,\, y_i = \pm 1} \frac{\lvert w^T x_i + b \rvert}{\lVert w \rVert}
\tag{4}
$$

By substituting $w^T x + b = \pm 1$ into (4), we obtain $D^* = d_+ + d_- = 2 / \lVert w \rVert$, and the problem becomes finding the $w$ that minimizes $\lVert w \rVert$ ($b$ can be calculated by substituting sample points once $w$ is known).
Additionally, to avoid the situation in which the distance between the two parallel hyperplanes is maximized while effective classification is not achieved, we must impose constraints on this optimization problem as follows:

$$
y_i (w^T x_i + b) \ge 1 - \xi_i, \quad 0 \le \xi_i \le 1
\tag{5}
$$

Here $\xi_i$ is a slack variable that tolerates outliers, and a penalty factor $C$ is introduced into the objective function to reflect the loss incurred by tolerating outliers. Training an SVM model, i.e. solving the optimization problem, leads to the quadratic programming problem shown in (6):

$$
\begin{aligned}
\max_{\lambda} \quad & \sum_{i=1}^{n} \lambda_i - \frac{1}{2} \sum_{i=1}^{n} \sum_{j=1}^{n} \lambda_i \lambda_j y_i y_j \langle x_i, x_j \rangle \\
\text{s.t.} \quad & 0 \le \lambda_i \le C, \quad i = 1, 2, \ldots, n \\
& \sum_{i=1}^{n} \lambda_i y_i = 0
\end{aligned}
\tag{6}
$$

Suppose $\lambda^{*}$ is the solution of (6); then the optimal hyperplane is $w^{*T} x + b^{*} = 0$, where $w^{*} = \sum_i \lambda_i^{*} y_i x_i$ and $b^{*}$ can be calculated from the constraints of (5).
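As an illustration of this soft-margin formulation, here is a minimal Python sketch using scikit-learn's SVC with a linear kernel on toy data. The data, the value of $C$ and the way $w^*$ and $b^*$ are read back are illustrative assumptions, not the paper's setup.

```python
import numpy as np
from sklearn.svm import SVC

# Toy two-class data following the +1 / -1 label convention of Section 3.1.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(loc=+2.0, size=(50, 2)),
               rng.normal(loc=-2.0, size=(50, 2))])
y = np.hstack([np.ones(50), -np.ones(50)])

# Soft-margin linear SVM; C is the penalty factor of eq. (6).
clf = SVC(kernel="linear", C=1.0).fit(X, y)

# For a linear kernel, scikit-learn exposes w* and b* directly.
w_star, b_star = clf.coef_[0], clf.intercept_[0]
print("w* =", w_star, "b* =", b_star)
print("training accuracy:", clf.score(X, y))
```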

3.2. Nonlinear classification of SVM

The linear classification discussed in the prior section can only be applied when the sample is linearly separable. In this section, a nonlinear SVM method is introduced to handle the complicated, high-dimensional financial ratios.
A mapping $\varphi$ is essential here because it maps the original data into a high-dimensional space $H$, i.e. $\varphi: \mathbb{R}^n \to H;\ x \mapsto \varphi(x)$, in which the data become linearly separable. An optimal separating hyperplane as discussed in the prior section can then be obtained to perform the classification.
Suppose the training set is $\{(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)\}$, where $x_i$ is the high-dimensional information vector of the sample and $y_i \in Y = \{+1, -1\}$. A quadratic programming problem similar to (6) is obtained through the mapping $\varphi$:

$$
\begin{aligned}
\max_{\lambda} \quad & \sum_{i=1}^{n} \lambda_i - \frac{1}{2} \sum_{i=1}^{n} \sum_{j=1}^{n} \lambda_i \lambda_j y_i y_j \langle \varphi(x_i), \varphi(x_j) \rangle \\
\text{s.t.} \quad & 0 \le \lambda_i \le C, \quad i = 1, 2, \ldots, n \\
& \sum_{i=1}^{n} \lambda_i y_i = 0
\end{aligned}
\tag{7}
$$

To solve (7), $\varphi: \mathbb{R}^n \to H;\ x \mapsto \varphi(x)$ would need to be known explicitly, so we instead choose the Gaussian radial basis kernel function (RBF) to obtain the inner product value $k(x, y) = \langle \varphi(x), \varphi(y) \rangle$ directly, without searching for the complicated mapping $\varphi$.
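A brief sketch of the nonlinear (RBF-kernel) SVM in scikit-learn follows; the paper's experiments actually use LIBSVM in Matlab, so this is only an assumed Python equivalent. The RBF kernel $k(x, z) = \exp(-\lVert x - z \rVert^2 / (2\sigma^2))$ is parameterized in scikit-learn through gamma $= 1/(2\sigma^2)$; the data and parameter values below are hypothetical.

```python
import numpy as np
from sklearn.svm import SVC

# Hypothetical training set: PCA component scores (7 per stock, as in Table 2)
# and the +1 / -1 return labels from Section 4.
rng = np.random.default_rng(2)
train_scores = rng.normal(size=(120, 7))
train_labels = np.where(rng.random(120) < 0.25, 1, -1)

sigma = 1.0                                   # RBF width; illustrative value only
clf = SVC(kernel="rbf", C=10.0, gamma=1.0 / (2.0 * sigma**2))
clf.fit(train_scores, train_labels)

# In practice the fitted hyperplane is applied to the test-set scores.
test_scores = rng.normal(size=(30, 7))
predicted_labels = clf.predict(test_scores)
print(predicted_labels[:10])
```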

4. Data selection

Table 1. Financial ratios and sample stocks information

Sample stocks: 677 stocks (2009); 679 stocks (2010)
Earnings ability (A): EBIT (a1), ROA (a2), ROE (a3)
Activity ratios (B): Turnover of accounts receivable (b1), Turnover of inventory (b2), Turnover of current assets (b3)
Shareholder return (C): EPS (c1), Price-to-book ratio (c2), Common stock profitability (c3), P/CF (c4)
Cash ratios (D): EBIT-to-cash ratio (d1), Cash-to-assets ratio (d2), Operating ratio (d3)
Growth ratios (E): Growth of total assets (e)
Risk level (F): Financial leverage (f1), Operating leverage (f2)
Solvency ratios (G): Quick ratio (g1), Debt-to-asset ratio (g2), EBIT/Interest ratio (g3), EBIT/Fixed-charge ratio (g4)

This paper selects 7 categories of financial ratios of companies listed on the A-share market of the Shanghai Stock Exchange from their annual reports of 2009 and 2010. The detailed financial indexes chosen are shown in Table 1. Our objective is to separate the high-return stocks from the low-return ones according to the features hidden inside their financial ratios, so it is necessary to label each stock with its return characteristic. Statistical analysis shows that all the companies announced their annual reports before 1 May in both 2009 and 2010. We therefore label a stock as $y_i = +1$ if its return ranks in the top 25% of all the sample stocks, and $y_i = -1$ for the remaining stocks. Labels for part of the sample are presented in Table 2.
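The labeling rule can be written as a small helper; the sketch below is a hedged Python illustration on made-up return figures, where only the top-25% cut-off comes from the text.

```python
import numpy as np

def label_returns(annual_returns):
    """Label a stock +1 if its return ranks in the top 25% of the sample, else -1."""
    cutoff = np.quantile(annual_returns, 0.75)       # 75th percentile threshold
    return np.where(annual_returns >= cutoff, 1, -1)

# Hypothetical annual returns for eight stocks
returns = np.array([0.12, -0.05, 0.30, 0.08, 0.45, -0.12, 0.02, 0.22])
print(label_returns(returns))    # roughly one quarter of the labels are +1
```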

5. Stock selection model and analysis

5.1. Extraction of training set based on PCA method

The financial ratios of 677 stocks in 2009 are the original data. We apply PCA to extract the principal components satisfying the condition $ACR \ge 85\%$. Since our sample is large, applying PCA to the ratios of all 677 stocks at once would wash out local information and also weaken the effect of dimension reduction, so we instead perform one PCA extraction for every 40 sample stocks (a sketch of this block-wise extraction follows Table 2). The resulting training set is shown in Table 2.

Table 2. Training set of SVM nonlinear classification (part of 677 stocks)

PCA with norm-standardization
Stock code | Earnings ability | Activity ratios | Shareholder return | Cash ratios | Growth ratios | Risk levels | Solvency ratios | y
600069 | -1.6114 | -0.9830 | -0.4337 | -1.0664 | -0.4253 | 0.7874 | 0.1431 | +1
600070 | 0.5249 | -0.3005 | -0.8563 | -0.5438 | -0.0903 | -0.1103 | 0.0136 | -1
600071 | 2.1843 | 0.1875 | -1.5191 | 1.1364 | -0.6570 | -1.7170 | 0.7624 | +1

PCA with mean-standardization
Stock code | Earnings ability | Activity ratios | Shareholder return | Cash ratios | Growth ratios | Risk levels | Solvency ratios | y
600069 | 0.8222 | -1.3006 | 0.8049 | 1.0620 | -0.9571 | 0.3681 | 1.8768 | +1
600070 | 4.6133 | 1.0647 | -0.3712 | -1.1497 | 0.8309 | 1.6046 | 1.5020 | -1
600071 | 7.0948 | 1.1286 | -0.7982 | 0.2286 | 0.2485 | -0.2133 | 2.0515 | +1
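The block-wise extraction described in Section 5.1 can be sketched as follows. This is an assumed Python/scikit-learn reading of the procedure (the paper works in Matlab), with hypothetical input shapes; it only covers the blocks-of-40 part, since the text does not detail how the component scores are condensed into the seven category scores shown in Table 2.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

def blockwise_pca(ratios, block_size=40, acr_threshold=0.85):
    """Norm-standardize and run PCA separately on each block of `block_size`
    stocks, keeping enough components to reach the ACR threshold of eq. (3)."""
    per_block_scores = []
    for start in range(0, len(ratios), block_size):
        block = StandardScaler().fit_transform(ratios[start:start + block_size])
        # A float n_components keeps the smallest number of components whose
        # explained variance ratio sums to at least that fraction.
        scores = PCA(n_components=acr_threshold, svd_solver="full").fit_transform(block)
        per_block_scores.append(scores)
    return per_block_scores

# Hypothetical input: 677 stocks x 18 financial ratios
ratios = np.random.default_rng(3).normal(size=(677, 18))
blocks = blockwise_pca(ratios)
print(len(blocks), blocks[0].shape)
```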

5.2. SVM stock selection model and analysis

The scores obtained in the prior section, combined with the return labels of the sample stocks, constitute the complete training set of the SVM. By applying the nonlinear SVM classification introduced in Section 3 to the training set, we obtain the optimal separating hyperplane. Applying this hyperplane to the test set classifies its stocks into a high-return part and a low-return part, which can be seen as a prediction of the stocks' future return characteristics. The accuracy of classification and prediction is presented in Table 3.

Table 3. Accuracy of SVM nonlinear classification

Method used | Mean-standardization PCA-SVM | Norm-standardization PCA-SVM
Training: Whole accuracy | 88.6905% | 75.4464%
Training: Accuracy of +1 | 100% | 58.5366%
Training: Accuracy of -1 | 85.0394% | 80.9055%
Test: Whole accuracy | 69.1943% | 61.7925%
Test: Accuracy of +1 | 10.1266% | 24.5283%
Test: Accuracy of -1 | 88.8421% | 74.2138%

Training and testing of the SVM are carried out with LIBSVM 3.1 in Matlab. To achieve the best generalization ability, the optimal penalty factor $C$ and the coefficient $\sigma$ of the RBF kernel are determined by a grid search.
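A minimal sketch of such a grid search is given below using scikit-learn's GridSearchCV rather than the Matlab/LIBSVM toolchain reported in the paper; the parameter grids, the 5-fold cross-validation and the placeholder data are illustrative assumptions.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

# Illustrative exponential grids over the penalty factor C and the RBF
# parameter gamma = 1 / (2 * sigma^2); not the paper's actual search ranges.
param_grid = {
    "C": 2.0 ** np.arange(-5, 11, 2),
    "gamma": 2.0 ** np.arange(-11, 4, 2),
}

# Placeholder training data standing in for the PCA scores and return labels.
rng = np.random.default_rng(4)
X_train = rng.normal(size=(200, 7))
y_train = np.where(rng.random(200) < 0.25, 1, -1)

search = GridSearchCV(SVC(kernel="rbf"), param_grid, cv=5, scoring="accuracy")
search.fit(X_train, y_train)
print(search.best_params_, round(search.best_score_, 4))
```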

Table 3 shows that the accuracy of the mean-standardization PCA-SVM for label +1 in the training set is 100%, while the accuracy for the same label in the test set is only 10.1266%. This is the over-fitting phenomenon: too many support vectors are used to fit the training set, which yields a good classification effect on the training set but a poor effect on predictions. The accuracy of the norm-standardization PCA-SVM is clearly better.
For further analysis, we construct an equal-weighted portfolio of the stocks selected by PCA-SVM and compare the accumulated return of this portfolio with that of the A-share index of the Shanghai Stock Exchange. The comparison is presented in Fig. 1. It shows that PCA-SVM earns a higher accumulated return than the A-share index, which suggests that the SVM classification method is accurate and efficient when dealing with complex, high-dimensional data.

Fig. 1. Comparison between PCA-SVM and the A-share index of the Shanghai Stock Exchange
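For reference, the comparison in Fig. 1 amounts to compounding per-period returns of the equal-weighted portfolio and of the index. A tiny sketch with invented monthly figures is given below, since the actual return series is not reproduced here.

```python
import numpy as np

def accumulated_return(period_returns):
    """Compound per-period returns into an accumulated return series."""
    return np.cumprod(1.0 + np.asarray(period_returns)) - 1.0

# Invented monthly returns for the PCA-SVM portfolio and the A-share index
portfolio = [0.03, -0.01, 0.04, 0.02, -0.02, 0.05]
index = [0.01, -0.02, 0.02, 0.01, -0.03, 0.02]
print(accumulated_return(portfolio)[-1], accumulated_return(index)[-1])
```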

6. Conclusions

Support vector machines are commonly used to train on time-series data of stocks for price forecasting. In this paper, SVM is instead employed to generate an optimal separating hyperplane in a high-dimensional space based on the training set. To increase the accuracy and efficiency of the SVM classification model, we apply PCA to preprocess the original data. The empirical results suggest that the return of the stocks selected by PCA-SVM is clearly superior to that of the A-share index.
The information contained in companies' financial ratios varies across industries, so we believe the quality of the training set could be improved by applying PCA to each industry separately. Additionally, assigning different weights to stocks according to their risk-return characteristics when constructing the portfolio would be a meaningful way to pursue higher returns.

Acknowledgments

This research was supported by the National Natural Science Foundation of China (Grant No. 71171176).

References

1. Ming Guo, Yuan-Biao Zhang. A Stock Selection Model Based on Analytic Hierarchy Process, Factor Analysis and TOPSIS. In: The International Conference on Computer and Communication Technologies in Agriculture Engineering; 2010. p. 466-469.
2. Kuo R.J., Chen C.H., Hwang Y.C. An Intelligent Stock Trading Decision Support System Through Integration of Genetic Algorithm Based Fuzzy Neural Network and Artificial Neural Network. Fuzzy Sets and Systems. 2001; 118: 21-45.
3. Tsumato S., Slowinski S., Komorowsk J., Grzymala-Busse J.W. Lecture Notes in Artificial Intelligence. The Fourth International Conference on Rough Sets and Current Trends in Computing; 2004.
4. E.L. de Faria, Marcelo P. Albuquerque, J.L. Gonzalez, J.T.P. Cavalcante, Marcio P. Albuquerque. Predicting the Brazilian Stock Market Through Neural Networks and Adaptive Exponential Smoothing Methods. Expert Systems with Applications. 2009; 36: 12506-12509.
5. Yudong Zhang, Lenan Wu. Stock Market Prediction of S&P 500 via Combination of Improved BCO Approach and BP Neural Network. Expert Systems with Applications. 2009; 36: 8849-8854.
6. Vladimir N. Vapnik. Statistical Learning Theory. Publishing House of Electronics Industry; 2004.
7. Chi-Yuan Yeh, Chi-Wei Huang, Shie-Jue Lee. A Multiple-Kernel Support Vector Regression Approach for Stock Market Price Forecasting. Expert Systems with Applications. 2011; 38: 2177-2186.
8. Pengpeng Huang. Prediction of the Turnover Points in Stock Trend Based on Support Vector Machine. College of Software, Fudan University; 2010.
