Classifying Initial Returns of Electronic Firms' IPOs Using Entropy-Based
Rough Sets in Taiwan Trading Systems
Ching-Hsue Cheng1 , You-Shyang Chen1, Jr-Shian Chen1,2
1
Department of Information Management, National Yunlin University of Science and Technology,
123, Section 3, University Road, Touliu, Yunlin 640, Taiwan
2
Department of Computer Science and Information Management, HUNGKUANG University,
No.34, Chung-Chie Road, Shalu, Taichung 433, Taiwan
{chcheng, g9523804, g9320818}@yuntech.edu.tw
Abstract

This paper deals with forecasting initial returns in the initial public offerings (IPOs) market in Taiwan's stock trading systems using rough set theory. Correctly predicting initial returns is very important for investors. In this paper, we use a new approach, an entropy-based fuzzy discretization, to enhance the rough set classifier. The enhanced method involves two main procedures: (1) convert the discretized continuous data into unique corresponding linguistic values using the MEPA approach; and (2) use the linguistic values to extract decision rules with the LEM2 algorithm. An actual IPOs dataset is employed in an empirical case study to illustrate the proposed approach. The results show that the proposed approach improves accuracy, generates fewer rules, and outperforms the listed methods.

Keywords: Rough Set Theory, Initial Returns, Minimize Entropy, IPOs

1. Introduction

In stock markets, investors have long used various methods to identify superior investment targets. Generally, there are two kinds of instruments for predicting stock prices: technical analysis and fundamental analysis. According to the literature [1], fewer studies use fundamental analysis than technical analysis to predict stock prices, and no historical price and volume data exist before an IPO (initial public offering). That is why this paper adopts fundamental analysis. Our goal is to extract meaningful rules from fundamental analysis using data mining tools.

Rough set theory is a predictive data mining tool that incorporates vagueness and uncertainty and can be applied in artificial intelligence and knowledge discovery in databases [2]. Entropy-based discretization [3] can use class information to discretize attributes, which is a valuable aspect of data mining, especially in rough set and classification problems. This study therefore focuses on improving the classification of initial returns in IPOs. A new approach is proposed to enhance the rough set classifier for classification problems.

2. Related works

This section reviews related studies of initial returns in IPOs, rough set theory, the LEM2 rule extraction method, and the minimize entropy principle approach.

2.1 Initial returns in IPOs

From the viewpoint of securities investors, a successful offering in IPOs is one with positive initial returns. According to [4], there are three types of initial returns: negative, positive, and zero. Ibbotson et al. [1] reported that the initial return, i.e., the return investors obtain from an IPO allocation at the offer price by selling on the first public trading day, is on average over 10%. According to the literature [5], determinants of offering success, including price, underwriter and auditor reputation, firm age, proceeds size, and the clustering of filings, have different degrees of impact on the offering success.
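The three-way classification above can be made concrete with a one-line initial-return computation; the sketch below is illustrative only (the function names and the example prices are ours, not from the paper):

```python
def initial_return(offer_price: float, first_day_close: float) -> float:
    """Relative gain from the offer price to the first-day closing price."""
    return (first_day_close - offer_price) / offer_price

def classify_initial_return(r: float) -> str:
    """Map an initial return to the three classes of [4]: P, N, or Z."""
    if r > 0:
        return "P"
    if r < 0:
        return "N"
    return "Z"

# Example: offered at 20.0, closing the first trading day at 23.0 (+15%).
r = initial_return(20.0, 23.0)
print(round(r, 4), classify_initial_return(r))  # 0.15 P
```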
0-7695-2882-1/07 $25.00 ©2007 IEEE
2.2 Rough set

Rough set theory, first proposed by Pawlak [6] in 1982, employs mathematical modeling to deal with data classification problems. Let (U, A) be an information system, and let B ⊆ A and X ⊆ U. The set X is approximated using the information contained in B by constructing the lower and upper approximation sets, B̲X = {x | [x]_B ⊆ X} and B̄X = {x | [x]_B ∩ X ≠ ∅}, respectively. The elements in B̲X can be classified with certainty as members of X by the knowledge in B, whereas the elements in B̄X can only be classified as possible members of X. The set BN_B(X) = B̄X − B̲X is called the B-boundary region of X; it consists of those objects that cannot be classified with certainty as members of X using the knowledge in B. The set X is called "rough" (or "roughly definable") with respect to the knowledge in B if the boundary region is non-empty.

2.3 Rule extraction

Rough set rule induction algorithms were implemented for the first time in the LERS (Learning from Examples) system [7]. A local covering is induced by exploring the search space of blocks of attribute-value pairs, which are then converted into a rule set. The LEM2 (Learning from Examples Module, version 2) algorithm [8] for rule induction is based on computing a single local covering for each concept.

2.4 Minimize Entropy Principle Approach (MEPA)

To subdivide the data into membership functions, the thresholds between classes of data must be established. A threshold line can be determined with an entropy minimization screening method. The segmentation process starts with two classes; repeated partitioning with threshold value calculations then allows further partitioning of the dataset into a number of fuzzy sets [9]. For the details of MEPA, refer to [9, 10].

3. The proposed approach and case study

Rough set classifiers usually apply the concept of rough sets to reduce the number of attributes in a decision table [2], and data discretization is used to find cut points for the attributes. By this method, the initial decision table is converted to one with less complex binary attributes without compromising key information. In general, rules extracted with rough set LEM2 are superior to those of traditional methods because they are deduced directly from data with symbolic and numerical attributes; however, LEM2 requires pre-discretized data. According to Chen and Cheng [10], entropy-based discretization can reduce the data size, and the interval boundaries are placed where they may help classification accuracy. Therefore, a new approach is proposed to reinforce the quality of a rough set classification system. Figure 1 illustrates the research procedure, and the proposed approach consists of the following steps:
Step 1: Partition the continuous attributes by MEPA.
Step 2: Build the membership functions.
Step 3: Fuzzify the continuous data into unique corresponding linguistic values.
Step 4: Extract rules by LEM2.
Step 5: Evaluate the accuracy and the number of rules.

3.1 An empirical case study

A practically collected dataset is used in this empirical case study to demonstrate the proposed approach: the dataset of IPOs of electronic firms in the Taiwan stock trading system between 1985 and 2003. The dataset contains 220 instances characterized by the following 11 attributes: (i) public trading year, (ii) auditor, (iii) underwriter, (iv) age, (v) shares volume, (vi) proceeds size, (vii) price, (viii) closing price, (ix) revised number, (x) IPO type, and (xi) class. In the dataset, age, shares volume, proceeds size, price, and closing price are continuous data; the others are symbolic data. According to [4], the dataset is partitioned into three classes: negative initial returns (52, 23.6%), positive initial returns (167, 75.9%), and zero initial returns (1, 0.5%). The computing process can be expressed as follows:
Step 1: Partition the continuous attributes by MEPA. Starting from the candidate cuts [11], the entropy values of each datum are computed using MEPA. Table 1 shows the thresholds of the continuous attributes.
Step 2: Build the membership functions. The cut-off points derived in step 1 are used as the midpoints of the membership functions. Figure 2 illustrates the MEPA membership functions of the age attribute.
Step 3: Fuzzify the continuous data into unique corresponding linguistic values. According to the membership functions built in step 2, the maximal degree of membership of each datum is calculated to determine its linguistic value.
Step 4: Extract rules by LEM2. Using the LEM2 algorithm and the linguistic values derived in step 3, decision rules are produced.
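The lower and upper approximations defined in Section 2.2 can be computed directly from a decision table. The following sketch uses a toy table with hypothetical attribute values, not drawn from the IPOs data:

```python
from collections import defaultdict

def approximations(universe, attrs, target):
    """Lower/upper approximations of `target` under the indiscernibility
    relation induced by the attribute subset `attrs`.
    `universe` maps object id -> dict of attribute values."""
    # Group objects into equivalence classes [x]_B by their B-attribute values.
    blocks = defaultdict(set)
    for obj, values in universe.items():
        key = tuple(values[a] for a in attrs)
        blocks[key].add(obj)
    lower, upper = set(), set()
    for block in blocks.values():
        if block <= target:   # [x]_B wholly inside X -> certain members
            lower |= block
        if block & target:    # [x]_B overlaps X -> possible members
            upper |= block
    return lower, upper

# Toy decision table (hypothetical linguistic values).
table = {
    1: {"price": "L_1", "age": "L_2"},
    2: {"price": "L_1", "age": "L_2"},
    3: {"price": "L_2", "age": "L_1"},
    4: {"price": "L_2", "age": "L_2"},
}
X = {1, 3}  # objects with, say, class = P
low, up = approximations(table, ["price", "age"], X)
print(low, up)  # {3} {1, 2, 3}; the boundary region is {1, 2}
```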
Table 2 shows partial rules.
Step 5: Evaluate the accuracy and the number of rules. For verification, the dataset is split into two subsets: 67% of the data is used as the training set and the other 33% as the testing set. The experiment is repeated ten times with random 67%/33% splits. Table 3 presents the accuracy rates and the numbers of rules with standard deviations, comparing different methods applied to the same data. The accuracy rates and numbers of rules demonstrate that the proposed approach outperforms the listed methods.

4. Conclusions

A reinforced rough set approach based on MEPA for solving classification problems was proposed. The empirical results on the IPOs dataset indicate that the proposed approach outperforms the listed methods in accuracy and in the reduced number of rules. Moreover, the results may be useful for stock investors, system development, and further research. Specifically, the proposed method surpasses the listed methods for three reasons: (1) the MEPA method discretizes attributes based on the outcome class, using the entropy equations to determine thresholds and then subdividing the intervals by minimum entropy values; (2) MEPA can be incorporated into LEM2 to enhance rough set classification; and (3) the accuracy rates demonstrate that the proposed approach outperforms the listed methods, and the number of rules is smaller than with the traditional rough set.

5. References

[1] R.G. Ibbotson, J.L. Sindelar, and J.R. Ritter, "Initial Public Offerings", Journal of Applied Corporate Finance, 1988, pp. 37-45.
[2] Z. Pawlak, Rough Sets: Theoretical Aspects of Reasoning About Data, Kluwer, Dordrecht, The Netherlands, 1991.
[3] R. Christensen, Entropy Minimax Sourcebook, Entropy Ltd., Lincoln, MA, 1980.
[4] S.I. Cho, "A Model for IPO Pricing and Contract Choice Decision", The Quarterly Review of Economics and Finance, 41, 2001, pp. 347-364.
[5] C.G. Dunbar, "The Choice Between Firm Commitment and Best Efforts Offering Methods in IPOs: The Effect of Unsuccessful Offers", Journal of Financial Intermediation, 7, 1998, pp. 60-90.
[6] Z. Pawlak, "Rough Sets", International Journal of Computer and Information Sciences, 11(5), 1982, pp. 341-356.
[7] J.W. Grzymala-Busse, "LERS: A System for Learning from Examples Based on Rough Sets", in R. Slowinski (ed.), Intelligent Decision Support: Handbook of Applications and Advances of the Rough Set Theory, Kluwer Academic Publishers, 1992, pp. 3-18.
[8] J.W. Grzymala-Busse, "A New Version of the Rule Induction System LERS", Fundamenta Informaticae, 31, 1997, pp. 27-39.
[9] T.J. Ross, Fuzzy Logic with Engineering Applications, International Edition, McGraw-Hill, USA, 2000.
[10] J.S. Chen and C.H. Cheng, "Extracting Classification Rule of Software Diagnosis Using Modified MEPA", Expert Systems with Applications, 34(2) (accepted, article in press).
[11] J. Bazan, H.S. Nguyen, S.H. Nguyen, P. Synak, and J. Wróblewski, "Rough Set Algorithms in Classification Problem", in L. Polkowski, S. Tsumoto, and T.Y. Lin (eds.), Rough Set Methods and Applications, Physica-Verlag, Heidelberg, 2000, pp. 49-88.
[12] J. Bazan and M. Szczuka, "RSES and RSESlib: A Collection of Tools for Rough Set", Lecture Notes in Computer Science, Springer-Verlag, Berlin, 2001, pp. 106-113.
[13] J.R. Quinlan, C4.5: Programs for Machine Learning, Morgan Kaufmann, San Mateo, CA, 1993.
[14] G.H. John and P. Langley, "Estimating Continuous Distributions in Bayesian Classifiers", Proceedings of the Eleventh Conference on Uncertainty in Artificial Intelligence, Morgan Kaufmann, San Mateo, 1995, pp. 338-345.
[15] R.P. Lippmann, "An Introduction to Computing with Neural Nets", IEEE ASSP Magazine, April 1987, pp. 4-22.
Figure 1. Research procedure
Figure 2. Membership functions of the age attribute (three labels L_1, L_2, and L_3 over the age axis from 3 to 33, with cut-off points at 4.8, 31.4, and 32.6)
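Assuming triangular membership functions whose peaks sit at the cut-off points of the age attribute (our reading of step 2; the paper does not give the exact shape), the fuzzification of step 3 can be sketched as:

```python
def tri(x, a, b, c):
    """Triangular membership with feet at a and c and peak at b."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

# Age cut-offs from Table 1: Min 3.0, SEC1 4.8, PRI 31.4, SEC2 32.6, Max 33.8.
# Assumed shapes: each cut-off is the peak of one linguistic label.
mfs = {
    "L_1": (3.0, 4.8, 31.4),
    "L_2": (4.8, 31.4, 32.6),
    "L_3": (31.4, 32.6, 33.8),
}

def fuzzify(x):
    """Step 3: pick the linguistic label with the maximal membership degree."""
    return max(mfs, key=lambda lbl: tri(x, *mfs[lbl]))

print(fuzzify(6.0))   # L_1
print(fuzzify(33.0))  # L_3
```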
Table 1. Thresholds of continuous attributes in the IPOs dataset
Attribute Min SEC1 PRI SEC2 Max
Age 3.0 4.8 31.4 32.6 33.8
Shares_Volume 30500 30893 3778037 6715899 9647725
Proceed_Size 1311500000 1628704065 282154139000 664396833800 1003363389600
Price 10.5 19.5 20.5 326.5 375.0
Closing_Price 9.8 19.5 32.5 101.0 401.0
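A boundary search of the kind that produces such cut-offs can be sketched as follows. This is the generic entropy-minimizing cut-point search behind entropy discretization, not the authors' exact MEPA procedure, and the toy data are hypothetical:

```python
import math

def entropy(labels):
    """Shannon entropy of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    h = 0.0
    for c in set(labels):
        p = labels.count(c) / n
        h -= p * math.log2(p)
    return h

def best_cut(values, labels):
    """Return the boundary that minimizes the weighted class entropy
    of the two resulting intervals."""
    pairs = sorted(zip(values, labels))
    n = len(pairs)
    best = (float("inf"), None)
    for i in range(1, n):
        cut = (pairs[i - 1][0] + pairs[i][0]) / 2  # midpoint candidate
        left = [l for v, l in pairs[:i]]
        right = [l for v, l in pairs[i:]]
        h = (len(left) * entropy(left) + len(right) * entropy(right)) / n
        best = min(best, (h, cut))
    return best[1]

# Toy data: young firms mostly class N, old firms mostly class P.
ages = [3, 4, 5, 6, 30, 31, 32, 33]
cls  = ["N", "N", "N", "N", "P", "P", "P", "P"]
print(best_cut(ages, cls))  # 18.0, the midpoint between 6 and 30
```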
Table 2. IPOs dataset rule set example (partial)
Rule => Class (Support)
1. (Proceed_Size=L_1)&(Shares_Volume=L_1)&(Price=L_2)&(IPO_Type=0)&(Age=L_2)=>(Class=P) (21)
2. (Proceed_Size=L_1)&(Shares_Volume=L_1)&(Price=L_2)&(IPO_Type=0)&(Revised_Number=2)&(Closing_Price=L_2)&(Age=L_2)=>(Class=P) (12)
3. (Proceed_Size=L_1)&(Shares_Volume=L_1)&(Price=L_2)&(Revised_Number=1)&(IPO_Type=1)&(Age=L_1)&(Public_Trading_Year=2001)=>(Class=P) (12)
4. (Proceed_Size=L_1)&(Shares_Volume=L_1)&(Price=L_2)&(Age=L_1)&(IPO_Type=0)&(Revised_Number=2)&(Closing_Price=L_3)&(Underwriter=1)=>(Class=P) (8)
5. (Proceed_Size=L_1)&(Shares_Volume=L_1)&(Price=L_2)&(Age=L_1)&(Revised_Number=1)&(IPO_Type=1)&(Closing_Price=L_2)&(Public_Trading_Year=2001)=>(Class=P) (7)
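Applying a rule such as the first row above amounts to checking every attribute-value condition against a fuzzified instance. In the sketch below, the rule is transcribed from the table while the instance values are hypothetical:

```python
def matches(rule_conditions, instance):
    """A rule fires when every attribute-value condition agrees with the instance."""
    return all(instance.get(a) == v for a, v in rule_conditions.items())

# Conditions of the first rule of the table above.
rule = {
    "Proceed_Size": "L_1", "Shares_Volume": "L_1", "Price": "L_2",
    "IPO_Type": "0", "Age": "L_2",
}
# A hypothetical fuzzified IPO instance.
ipo = {
    "Proceed_Size": "L_1", "Shares_Volume": "L_1", "Price": "L_2",
    "IPO_Type": "0", "Age": "L_2", "Underwriter": "1",
}
print("P" if matches(rule, ipo) else "unmatched")  # P
```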
Table 3. IPOs dataset experiment results
Method                                  Training Accuracy   Testing Accuracy   Number of Rules
Decision Tree C4.5 [13]                 -                   76.37%             NA
Naïve Bayes [14]                        -                   77.27%             NA
Multilayer Perceptron [15]              -                   69.55%             NA
Rough Set [12] without discretization   100.0 ± 0.0%        70.0 ± 4.7%        71.5 ± 5.7
Proposed approach                       100.0 ± 0.0%        85.4 ± 4.0%        31.5 ± 3.6
Note: NA denotes that no value is given for this method.
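The evaluation protocol of Table 3 (ten random 67%/33% splits, reporting the mean and standard deviation of testing accuracy) can be sketched as follows; the majority-class classifier and toy data are stand-ins for the actual rule-based classifier:

```python
import random
import statistics

def evaluate(data, train_classifier, trials=10, train_frac=0.67, seed=1):
    """Mean and standard deviation of test accuracy over repeated random splits."""
    rng = random.Random(seed)
    accs = []
    for _ in range(trials):
        shuffled = data[:]
        rng.shuffle(shuffled)
        k = int(len(shuffled) * train_frac)
        train, test = shuffled[:k], shuffled[k:]
        clf = train_classifier(train)
        correct = sum(clf(x) == y for x, y in test)
        accs.append(correct / len(test))
    return statistics.mean(accs), statistics.stdev(accs)

# Toy data mimicking the class imbalance (mostly P, some N).
toy = [(i, "P") for i in range(150)] + [(i, "N") for i in range(50)]

def majority(train):
    """Trivial majority-class 'classifier', just to exercise the loop."""
    labels = [y for _, y in train]
    top = max(set(labels), key=labels.count)
    return lambda x: top

mean_acc, sd = evaluate(toy, majority)
print(round(mean_acc, 3), round(sd, 3))
```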