Intelligent Stock Trading System Based On SVM Algorithm and
Abstract-The stock market is considered a highly complex and dynamic system. Many machine learning and data mining technologies are used for stock analysis, but it remains an open question how to integrate these methods with the plentiful knowledge and techniques accumulated in stock investment, which are critical to successful stock analysis. In this paper, we propose an intelligent stock trading system that combines the support vector machine (SVM) algorithm with the box theory of stock. The box theory holds that a successful stock buy or sell generally occurs when the price effectively breaks out of the original oscillation box into a new box. In the system, the support vector machine algorithm is used to forecast the top and bottom of the oscillation box. A trading strategy based on the box theory is then constructed to make trading decisions. Different stock movement patterns, i.e., bull, bear and fluctuant markets, are used to test the feasibility of the system. Experiments on S&P 500 components show promising performance.

I. INTRODUCTION

accumulated in stock investment, which we believe are critical to designing a workable stock expert system.

In our previous work [7], we developed an expert system that learns a trading strategy with a Markov network probabilistic model from high-level representations of time series, i.e., turning points and technical indicators; it is based on the idea of price trend turning points in technical analysis. In this paper, the objective is to exploit the commonly used box theory of stock and develop an intelligent stock trading system based on oscillation box prediction.

There are two modules in our trading system: an oscillation box prediction module and a trading strategy module. The support vector machine algorithm is used to forecast the top and bottom of the oscillation box. The trading strategy based on the box theory is then constructed to make trading decisions. In the experiments, we investigated the performance of the proposed system on individual stocks with different movement patterns, tested the average rate of profit of nearly all the stocks in the S&P 500 components and
First, it transforms the estimation of the price box into a time series regression problem while maintaining the basic meaning of a price range, so that data mining methods can be employed to solve it. Second, by the definition, it facilitates the construction of the trading strategy. Forecasting $H$, $L$ amounts to making a regression for the function which is mined from the historical information.

where $H_i'$, $L_i'$ represent the forecasts of $H_i$, $L_i$, and $x$, $y$ are the factors related to the change of $H_i$, $L_i$.

The forecast module employs the support vector machine algorithm. Two SVM-based estimators, called SVMmax and SVMmin, are used to estimate $H_i$ and $L_i$ respectively. We train the two forecast modules by the sliding window method.

buyprice
then Sell, sellprice = C,
next trade = buy

where $\sigma$ is the trading rate, which is set according to the accuracy of SVMmax and SVMmin: when the accuracy is high, $\sigma$ can be smaller, and vice versa. The role of $\sigma$ is to regulate the trade frequency. The smaller the value, the fewer the number of trades. $\varphi$, $\phi$ are used to filter out false operations so that trading is more profitable. $\theta$ is the stop-loss rate, which minimizes the loss after a wrong buying operation; it is especially useful in a bearish market. The constraints on the trend of $H_i$, $L_i$ make sure that a true trend reversal has happened. The idea of the trading strategy is illustrated in Fig. 1.

One of the major criticisms of the original box theory is
Y. K. Bao et al. [6] conclude that the support vector machine is a robust technique for stock index regression. In our system, the SVM algorithm is used to build the two estimators SVMmax and SVMmin, which forecast the top and bottom of the box respectively.

1) Overview of SVM algorithm

The SVM algorithm is based on statistical learning theory and the structural risk minimization principle [10]. It was first developed by Vapnik and his co-workers in the early 1990s to solve the classification problem. With the introduction of loss functions such as Vapnik's $\varepsilon$-insensitive loss function and Huber's loss function, SVMs have been extended to solve regression estimation problems and applied successfully in time series forecasting, non-linear modeling and optimal control problems [11]. The basic idea of support vector regression is to map the input space into a high-dimensional space by a non-linear mapping, which is accomplished implicitly by the kernel trick, and then to do a linear regression in the new feature space.

Given time series samples $\{x_i, y_i\}$, $x_i \in R^l$, $y_i \in R$, $i = 1, 2, \ldots, n$, where $x_i$ is the input feature variable and $y_i$ is the target value, the SVM regression algorithm first maps the data to a high-dimensional feature space $\Gamma$ using a mapping $\Phi: R^l \to \Gamma$. Then, in the high-dimensional space, we make a regression to form a linear function

$$f(x, w) = w^T \Phi(x) + b \tag{3}$$

under the condition of minimizing the sum of the empirical risk and the complexity term $\|w\|^2$, which enforces flatness in feature space. Here $w \in \Gamma$ is the weight vector and $b$ is a bias. The linear function in the high-dimensional feature space corresponds to a non-linear function in the original lower-dimensional feature space [10]. In $\varepsilon$-SV regression [11], our goal is to find a function $f(x)$ that has at most $\varepsilon$ deviation from the targets $y_i$ for all the training samples and at the same time is as flat as possible. That is, to find (3) under the optimization problem:

$$\begin{aligned} \text{minimize} \quad & \tfrac{1}{2} w^T w \\ \text{subject to} \quad & y_i - w^T \Phi(x_i) - b \le \varepsilon \\ & w^T \Phi(x_i) + b - y_i \le \varepsilon \end{aligned} \tag{4}$$

Considering the existence of data outside the $\varepsilon$-insensitive tube,

$$|y - (w^T \Phi(x) + b)| \ge \varepsilon, \tag{5}$$

we introduce slack parameters $\delta_i$, $\delta_i^*$ to cope with the otherwise infeasible constraints of the optimization problem (4). $\delta_i$, $\delta_i^*$ are defined by

$$\begin{aligned} \text{minimize} \quad & \tfrac{1}{2} w^T w + C \sum_{i=1}^{n} (\delta_i + \delta_i^*) \\ \text{subject to} \quad & y_i - w^T \Phi(x_i) - b \le \varepsilon + \delta_i \\ & w^T \Phi(x_i) + b - y_i \le \varepsilon + \delta_i^* \end{aligned} \tag{6}$$

where $C > 0$ is a prescribed parameter that determines the trade-off between the flatness of $f(x)$ and the amount up to which deviations larger than $\varepsilon$ are tolerated. To solve this optimization problem, we construct a Lagrange function by introducing a dual set of Lagrange multipliers $\lambda_i, \lambda_i^*$ and $\eta_i, \eta_i^*$:

$$\begin{aligned} L = {} & \tfrac{1}{2} w^T w + C \sum_{i=1}^{n} (\delta_i + \delta_i^*) - \sum_{i=1}^{n} \lambda_i \left(\varepsilon + \delta_i - y_i + w^T \Phi(x_i) + b\right) \\ & - \sum_{i=1}^{n} \lambda_i^* \left(\varepsilon + \delta_i^* + y_i - w^T \Phi(x_i) - b\right) - \sum_{i=1}^{n} \left(\eta_i \delta_i + \eta_i^* \delta_i^*\right) \end{aligned} \tag{7}$$

where $\lambda_i, \lambda_i^*, \eta_i, \eta_i^* \ge 0$. This function has a saddle point which corresponds to the solution of the optimization problem. Solving (4) we get the optimal $w$, $b$ and $f(x)$:

$$w^* = \sum_{i=1}^{n} (\lambda_i - \lambda_i^*) \Phi(x_i), \qquad b^* = \frac{1}{n} \sum_{i=1}^{n} \left(y_i - w^{*T} \Phi(x_i) \mp \varepsilon\right), \quad 0 < \lambda_i^{(*)} < C \tag{8}$$

$$f(x) = \sum_{i=1}^{n} (\lambda_i - \lambda_i^*) \Phi(x_i)^T \Phi(x) + b^* \tag{9}$$

The Lagrange multipliers $\lambda_i, \lambda_i^*$ in (8) are sparse: they are non-zero only when $x_i$ is outside or on the $\varepsilon$-insensitive tube. These points are called support vectors. The training points inside the tube do not contribute to $f(x)$. The dot products in $f(x)$ are computed in the new high-dimensional space, which is usually intractable. This problem is tackled by substituting the dot products with a kernel function which satisfies Mercer's conditions. Any symmetric kernel function satisfying Mercer's condition corresponds to a dot product in some feature space [11], so the computation of dot products in the high-dimensional space reduces to a simpler computation in the original low-dimensional space.

2) Parameter selection

In our forecast module, the radial basis function (RBF) is used as the kernel function of the two SVM estimators. The dynamics of stock price series are nonlinear, so it is intuitively believed that nonlinear kernel functions could achieve better performance. Another advantage of the RBF kernel is that it tends to give good performance under general smoothness assumptions and is especially useful when no additional knowledge of the data is available [5]. It is defined as

$$K(x_i, x_j) = \exp\left(-\frac{\|x_i - x_j\|^2}{2 g^2}\right) \tag{10}$$

In the SVM algorithm, the selection of the user-prescribed free parameters $C$, $g$ plays an important role in its performance. $C$ is the regularization constant that balances the tolerated empirical error against the regularization term that controls the capacity of the estimated function; together they make up the risk function of the SVM. $g$ is the bandwidth of the RBF kernel. Improper selections can cause overfitting or underfitting.

The parameters $C$, $g$ are optimized by cross-validation and grid search. The training set is divided into five folds. One fold is taken as the validation set and the others are taken for training.
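The kernel of equation (10), the $\varepsilon$-insensitive loss behind (5), and the kernelized form of the regression function (9) can be made concrete with a small sketch. This is an illustration in plain Python, not the authors' MATLAB code; the function names and the `coeffs` argument (standing for the differences $\lambda_i - \lambda_i^*$) are assumptions introduced here.

```python
import math


def rbf_kernel(x_i, x_j, g):
    """RBF kernel of equation (10): exp(-||x_i - x_j||^2 / (2 g^2))."""
    if len(x_i) != len(x_j):
        raise ValueError("vectors must have the same length")
    sq_dist = sum((a - b) ** 2 for a, b in zip(x_i, x_j))
    return math.exp(-sq_dist / (2.0 * g ** 2))


def eps_insensitive_loss(y, f_x, eps):
    """Vapnik's epsilon-insensitive loss: zero inside the tube, linear outside."""
    return max(0.0, abs(y - f_x) - eps)


def svr_predict(x, support_vectors, coeffs, b, g):
    """Evaluate f(x) as in equation (9), with the dot product replaced by the kernel.

    coeffs[i] is a hypothetical name for (lambda_i - lambda_i^*).
    """
    return sum(c * rbf_kernel(sv, x, g) for sv, c in zip(support_vectors, coeffs)) + b
```

Only the support vectors enter `svr_predict`, which reflects the sparsity of the multipliers: points strictly inside the tube have zero coefficients and can be dropped from the sum.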
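The five-fold cross-validated grid search over $C$ and $g$ can be sketched as follows. The paper does not say how the folds are formed or which grid is used, so the contiguous folds, the grid values and the `cv_error` callback (which would train an estimator and return its validation error) are all assumptions made for illustration.

```python
from itertools import product


def k_fold_indices(n_samples, k=5):
    """Split range(n_samples) into k contiguous folds of near-equal size."""
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in sizes:
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def grid_search(n_samples, c_grid, g_grid, cv_error, k=5):
    """Return ((C, g), error) for the grid point with the lowest mean validation error.

    cv_error(train_idx, val_idx, C, g) is a hypothetical user-supplied callback
    that fits an estimator on train_idx and returns its error on val_idx.
    """
    folds = k_fold_indices(n_samples, k)
    best_params, best_err = None, float("inf")
    for C, g in product(c_grid, g_grid):
        errors = []
        for i, val_idx in enumerate(folds):
            # Every fold except the i-th one is used for training.
            train_idx = [j for m, fold in enumerate(folds) if m != i for j in fold]
            errors.append(cv_error(train_idx, val_idx, C, g))
        mean_err = sum(errors) / len(errors)
        if mean_err < best_err:
            best_params, best_err = (C, g), mean_err
    return best_params, best_err
```

With a training set of 1000 points, `k_fold_indices(1000)` yields five folds of 200 points each, so every grid point is scored on five train/validation splits.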
The grid point with the best prediction accuracy is used as the value of the two parameters.

3) Input feature

Since the estimated values relate to the next n-day prices, the input data should intuitively contain at least the information of the previous n days. n is set to 30 according to the experimental result in section 3.2. For SVMmax (SVMmin), the input data selected in the system include 8*n features in total:

$C_k, MA_k, RSI_k, ROC_k, FastK_k, SlowK_k, SlowD_k, H_{k-n}\,(L_{k-n}), \quad k = i, i-1, \ldots, i-n$

where $MA_k, RSI_k, ROC_k, FastK_k, SlowK_k, SlowD_k$ are technical indicators computed from the closing prices $C$. The computation of these indicators can be found in [12]. $H_{k-n}$, $L_{k-n}$ are the historical values of the estimated parameters. All of the input data are scaled to [-1, 1].

4) Training

Sliding window validation is a train-and-test technique that is well suited to time series data that are slowly varying or non-stationary, so the sliding window method is employed to train the two estimators. We divide the whole data set into overlapping training-test sets. The window length is 1050: the first 1000 data points are taken as the training set and the following 50 data points are taken as the test set.

C. Performance Evaluation

The performance of the stock trading system is evaluated by the rate of profit, calculated by (11). In the trading process, we assume a $1000 initial fund and trade all stocks at each operation. A 0.5 percent transaction cost is assumed for each transaction.

$$\text{rate of profit} = \frac{\text{final return} - \text{initial fund}}{\text{initial fund}} \times 100\% \tag{11}$$

The mean squared error (MSE) and squared correlation coefficient (SCC) are used to measure the accuracy of SVM regression. MSE reflects the local fitness of the regression and SCC reflects the global fitness. The higher the value of SCC, the better the global fitness. MSE and SCC are defined as follows:

$$MSE = \frac{1}{N} \sum_{i=1}^{N} (y_i - y_i^*)^2$$

$$SCC = \frac{\left(\sum_{i=1}^{N} (y_i - \bar{y}_i)(y_i^* - \bar{y}_i^*)\right)^2}{\sum_{i=1}^{N} (y_i - \bar{y}_i)^2 \, \sum_{i=1}^{N} (y_i^* - \bar{y}_i^*)^2}$$

where $y_i$ is the actual output and $y_i^*$ is its estimate, and $\bar{y}_i$, $\bar{y}_i^*$ are their averages respectively.

investigated. All of the experiments run in the MATLAB environment.

A. Typical stock movement and their trades

Generally, the same trading strategy may perform differently under different stock movements. In this section, we test the trading system on different stock movements to examine its profitability in different environments: fluctuant and bull market, fluctuant and bear market, overall bull market, and overall bear market.

Fig. 2 shows a stock movement in a fluctuant and bull market over a long trading period (almost 4 years). The testing data set is sampled from March 27, 1990 to March 7, 1994 of ALCOA INC. The training set is constructed as in section II.B.4). With the trading rate $\sigma$ set to 0.05, the stop-loss rate $\theta$ set to 14% and $\varphi$, $\phi$ set to 0 and 5 respectively, our trading system achieves a profit of 121.63% while the market gains about 15.3%. The MSE and SCC for SVMmax are 0.00661 and 0.7109, while for SVMmin they are 0.00842 and 0.6677. The parameters $\sigma$, $\theta$, $\varphi$, $\phi$ are set manually through experiments.

Fig. 3 shows a stock movement in a fluctuant and bear market over a short trading period (less than 2 years). Testing data are sampled from Jun 13, 1970 to Jan 11, 1972 of ALCOA INC. With the trading rate $\sigma$ set to 0.12, the stop-loss rate $\theta$ set to 15% and $\varphi$, $\phi$ set to 0 and 2 respectively, the profit is 21.04% while the stock loses about 11.3%. The MSE and SCC for SVMmax are 0.00890 and 0.8370, while for SVMmin they are 0.00842 and 0.6677.

In other words, our system can outperform the buy-and-hold strategy in a fluctuant market.

The movement in Fig. 4 is an overall bull stock movement over a short 400-day trading period. Test data are sampled from Mar 17, 2004 to Oct 17, 2005 of AES CP INC. With the trading rate $\sigma$ set to 0.09, the stop-loss rate $\theta$ set to 15% and $\varphi$, $\phi$ set to 0 and 2 respectively, our trading system achieves an excellent result with a profit of 139.5% while the stock gains about 80.6%. The result benefits from the good regression of the two SVM estimators. The MSE and SCC for SVMmax are 0.00662 and 0.8806, while for SVMmin they are 0.00776 and 0.90364.

The movement in Fig. 5 is an overall bear stock movement over a short 400-day trading period. Testing data are sampled from Mar 17,
2004 to Oct 17, 2005 of TRIBUNE INC. With the trading rate $\sigma$ set to 0.01, the stop-loss rate $\theta$ set to 15% and $\varphi$, $\phi$ set to 0 and 2 respectively, our trading system loses only 7.6% while the stock loses 35.2% in total over the period. The MSE and SCC for SVMmax are 0.00157 and 0.9921, while for SVMmin they are 0.00218 and 0.8336.

Fig. 3. Trading log in a fluctuant and bear market

Fig. 5. Trading log in overall bear market

B. Width of the oscillation box n

The width of the oscillation box n is an important parameter of the proposed trading system that needs to be determined. Intuitively, the smaller n is, the higher the prediction accuracy of the two estimators SVMmax and SVMmin may be, but the trade decision may be less optimal because the forecasts concern fewer days ahead, and vice versa.

In order to determine n, we investigate the average rate of profit by varying n from 3 to 50. The experiment is carried out on 50 stocks (the alphabetically first 50 stocks in the S&P 500 components) over 500 trading days from June 15, 2005 to June 11, 2007. The result is shown in Table I.

TABLE I
AVERAGE PERFORMANCE OF TRADING 50 STOCKS BY VARYING N

n     Average profit (%)   Average number of trades   Average SCC of SVMmax   Average SCC of SVMmin
3     35.5                 9.8                        0.9691                  0.9593
5     39.49                8.7                        0.9452                  0.9252
8     40.79                8.1                        0.9141                  0.8816
10    39.83                7.7                        0.8888                  0.8523
15    34.35                7.3                        0.8431                  0.7862
20    35.86                8.4                        0.7924                  0.7324
30    37.88                8.8                        0.7063                  0.6096
40    37.01                7.9                        0.6235                  0.5232
50    34.89                7.7                        0.5629                  0.4628

several selected stocks or stock index. However, this method is unreliable for validating the effectiveness of the system because it lacks generalization. In the experiments, we select 422 stocks of the S&P 500 components which have over 3000 daily data points and evaluate the average performance of those stocks. The experiment is based on the daily price data from Mar 18, 2004 to Oct 17, 2005, namely 400 trading days. In this period the S&P 500 index rose 6.57%. For every stock, we implement the following steps.

In the oscillation box prediction stage, the testing set of 400 trading days in total is divided into 8 subsets. The time periods of the subsets used in training and testing are listed in Table II.

TABLE II
TIME PERIODS OF THE TRAINING AND TESTING SUBSETS

      Training set               Testing set
1     3/22/2000 - 3/17/2004      3/18/2004 - 5/27/2004
2     6/2/2000 - 5/27/2004       5/28/2004 - 8/10/2004
3     8/14/2000 - 8/10/2004      8/11/2004 - 10/20/2004
4     10/24/2000 - 10/20/2004    10/21/2004 - 12/31/2004
5     1/5/2001 - 12/31/2004      1/3/2005 - 3/15/2005
6     3/20/2001 - 3/15/2005      3/16/2005 - 5/25/2005
7     5/31/2001 - 5/25/2005      5/26/2005 - 8/19/2005
8     8/10/2001 - 8/19/2005      8/22/2005 - 10/17/2005

The cost $C$ is set to 550 and 350 for SVMmax and SVMmin respectively. The parameter $g$ of the RBF kernel function is set to 0.00002125 for both estimators. The stop-loss rate $\theta$ is set to 15% and $\varphi$, $\phi$ are set to 0 and 2 respectively.
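The overlapping training-test windows (window length 1050: 1000 training points followed by 50 test points) can be sketched as below. The assumption that the window advances by the test length at each step is an inference from the adjacent testing periods, not something the paper states, and the function name is hypothetical.

```python
def sliding_windows(n_points, train_len=1000, test_len=50):
    """Yield (train_range, test_range) index pairs for overlapping windows.

    Assumption: the window advances by test_len each step, so consecutive
    test sets are adjacent and consecutive training sets overlap.
    """
    start = 0
    while start + train_len + test_len <= n_points:
        train = range(start, start + train_len)
        test = range(start + train_len, start + train_len + test_len)
        yield train, test
        start += test_len


# 1000 initial training days plus 400 test days gives 8 windows of 50 test days.
windows = list(sliding_windows(1400))
```

Under this assumption the two estimators are retrained once per window, i.e. eight times over the 400-day test period.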
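The evaluation quantities used in the experiments, i.e. the rate of profit of equation (11) and the MSE and SCC of the two estimators, can be sketched as follows. The list-of-(buy, sell)-prices input and the rule that the 0.5 percent cost is charged on the traded value at both the buy and the sell are assumptions for illustration; SCC is computed as the squared Pearson correlation, which is assumed to match the paper's definition.

```python
def rate_of_profit(trades, initial_fund=1000.0, cost_rate=0.005):
    """Rate of profit in percent, equation (11), for (buy_price, sell_price) pairs.

    Assumptions: the whole fund is invested at every buy, and the cost is
    charged on the traded value at both the buy and the sell.
    """
    fund = initial_fund
    for buy_price, sell_price in trades:
        shares = fund * (1.0 - cost_rate) / buy_price   # buy after paying the cost
        fund = shares * sell_price * (1.0 - cost_rate)  # sell and pay the cost again
    return (fund - initial_fund) / initial_fund * 100.0


def mse(actual, predicted):
    """Mean squared error between actual outputs and their estimates."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def scc(actual, predicted):
    """Squared correlation coefficient (squared Pearson correlation)."""
    n = len(actual)
    mean_a = sum(actual) / n
    mean_p = sum(predicted) / n
    cov = sum((a - mean_a) * (p - mean_p) for a, p in zip(actual, predicted))
    var_a = sum((a - mean_a) ** 2 for a in actual)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    if var_a == 0.0 or var_p == 0.0:
        raise ValueError("SCC is undefined for a constant series")
    return cov ** 2 / (var_a * var_p)
```

A round trip at an unchanged price loses roughly 1 percent under these assumptions, which shows how transaction costs penalize frequent trading.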
In the trading stage, we assume a $1000 initial fund and trade all stocks at each operation. A 0.5 percent transaction cost is assumed for each transaction.

2) Result Analysis

The evaluation criteria of the experimental result include the total number of stocks that underperformed the Buy-and-Hold method over the test period, the number of non-profitable stocks and the average profit over the test period.

The statistical result is shown in Table III. The average profit of the proposed system is 27.23% while the profit of the buy-and-hold strategy is 5.77% over the 400 testing days. That means the profit of the trading system is much better than the S&P 500 index and outperforms the buy-and-hold strategy. Note that, for convenience of the experiments, the values of the user-prescribed free parameters in both SVM estimators are the same for all stocks in the training process, which is obviously not optimal for all of them.

TABLE III
AVERAGE PERFORMANCE OF TRADING 422 STOCKS FOR 400 DAYS

Market Pattern                    Bull      Bear       Total
Number of stock                   224       198        422
Less than Buy-and-Hold            39        0          39
Less %                            17.41%    0%         9.24%
Number of loss                    0         14         14
Loss %                            0%        7.07%      3.32%
Average Profit                    39.11%    13.79%     27.23%
Average Profit of Buy-and-Hold    26.47%    -21.04%    5.77%

In the total 422 stocks, only 14 stocks don't profit due to the overall downtrend during the test period in which there is

3) System Comparison

Due to the lack of benchmark tests for stock trading systems, most of the existing stock trading systems experiment on different kinds of financial data with different experimental schemes, and the performances vary considerably. This makes the system comparison difficult.

In this part, we compare the performance of our system with a trading system which uses the turning point confirming and probabilistic reasoning method [7] and has an experimental scheme similar to ours. That trading system first builds the turning points and technical indicators according to technical analysis methods, then learns a trading strategy by probabilistic reasoning from the two high-level representations of financial series. We utilize the following formula to calculate the performance. Natural Gain is the rate of profit of the Buy-and-Hold method.

$$\text{ratio} = \frac{\text{Average rate of profit}}{\text{Natural Gain}} \tag{13}$$

Table V compares the results achieved using various investment strategies. Our system based on SVM and box theory achieved a comparatively better performance than the turning point confirming & probabilistic reasoning method. Hence, in terms of profitability, the system proposed in this study can bring investors relatively better rewards.

TABLE V
THE PERFORMANCE OF DIFFERENT TRADING SYSTEMS

Method         Support vector machine + Box theory    Turning point confirming + Probabilistic reasoning *
Data set       S&P 500 components                     S&P 500 components
Testing set    3/18/2004~10/17/2005                   5/3/2002~11/25/2005
combining other soft computing techniques together. The self-learning and self-adjustment of system parameters such as the trading rate and the stop-loss rate are also necessary future work.
REFERENCES
[1] Grudnitski G., Osburn L., "Forecasting S&P and gold futures prices: an application of neural networks," Journal of Futures Markets, vol. 13, pp. 631-643, 1993.
[2] Tenti P., "Forecasting foreign exchange rates using recurrent neural networks," Applied Artificial Intelligence, vol. 10, pp. 567-581, 1996.
[3] Kuan C.M., Liu T., "Forecasting exchange rates using feedforward and recurrent neural networks," Journal of Applied Econometrics, vol. 10(4), pp. 347-364, 1995.
[4] Chen A.S., Leung M.T., "Regression neural network for error correction in foreign exchange forecasting and trading," Computers and Operations Research, vol. 31, pp. 1049-1068, 2004.
[5] L. J. Cao, Francis E. H. Tay, "Support Vector Machine With Adaptive Parameters in Financial Time Series Forecasting," IEEE Trans. on Neural Networks, vol. 14(6), pp. 1506-1518, Nov. 2003.
[6] Y. Bao, Z. Liu, L. Guo, W. Wang, "Forecasting Stock Composite Index by Fuzzy Support Vector Machines Regression," in Proc. of the Fourth Int. Conf. on Machine Learning and Cybernetics, Guangzhou, 2005, pp. 3535-3540.
[7] D. Bao, Z. Yang, "Intelligent stock trading system by turning point confirming and probabilistic reasoning," Expert Systems with Applications, vol. 34(1), pp. 620-627, 2008.
[8] Baba N., Inoue N., Yan Yanjun, "Utilization of Soft Computing Techniques for Constructing Reliable Decision Support System for Dealing Stocks," in Proc. of the 2002 Int. Joint Conf. on Neural Networks, Hawaii, 2002, pp. 2150-2155.
[9] Jung-Bin Li, An-Pin Chen, "Refined Group Learning Based on XCS and Neural Network in Intelligent Financial Decision Support System," in Proc. of the Sixth Int. Conf. on Intelligent Systems Design and Applications, 2006, vol. 2, pp. 925-930.
[10] Vapnik V., "The nature of statistical learning theory," New York: Springer, 1995.
[11] Muller K. R., Smola A., Ratsch G., Scholkopf B., Kohlmorgen J., Vapnik V., "Predicting time series with support vector machines," in Proc. of the 7th Int. Conf. on Artificial Neural Networks, Berlin, 1997, pp. 999-1004.
[12] Pring Martin, "Technical analysis explained (4th ed.)," McGraw-Hill Company, 2002.
[13] Hassan M. R., Nath B., "Stock market forecasting using hidden Markov model: a new approach," in Proc. of 5th Int. Conf. on Intelligent Systems Design and Applications, 2005, pp. 192-196.
[14] X. Lin, Z. Yang, Y. Song, "The Application of Echo State Network in Stock Data Mining," Lecture Notes in Computer Science, vol. 5012, pp. 932-937, 2008.
[15] J. W. Lee, "Stock price prediction using reinforcement learning," in Proc. of IEEE Int. Symposium on Industrial Electronics, Pusan, South Korea, 2001, vol. 1, pp. 690-695.
[16] Nicolas Darvas, "How I Made Two Million Dollars in the Stock Market," BN Publishing, 2007.