
Decision Support Systems 73 (2015) 15–27

Contents lists available at ScienceDirect

Decision Support Systems


journal homepage: www.elsevier.com/locate/dss

Analyzing initial public offerings' short-term performance using decision trees and SVMs
Eyup Bastı a, Cemil Kuzey b, Dursun Delen c,⁎
a Department of Banking and Finance, Faculty of Economics and Administrative Sciences, Fatih University, 34500, Buyukcekmece, Istanbul, Turkey
b Department of Management, Faculty of Economics and Administrative Sciences, Fatih University, 34500, Buyukcekmece, Istanbul, Turkey
c Management Science and Information Systems, Spears School of Business, Oklahoma State University, 700 N. Greenwood Ave., Tulsa, OK 74106, USA

Article info

Article history:
Received 19 July 2014
Received in revised form 13 February 2015
Accepted 14 February 2015
Available online 21 February 2015

Keywords:
Initial public offering
Underpricing
Short-term stock performance
Decision tree algorithms
Turkey

Abstract

In this study, we investigated underpricing of Turkish companies in the initial public offerings (IPOs) issued and traded on Borsa Istanbul between 2005 and 2013. The underpricing of stocks in IPOs, or essentially leaving money on the table, is considered an important, challenging and worthy research topic in the literature. Within the proposed framework, the IPO performance in the short run and the factors that affect this short-run performance were analyzed. Popular machine learning methods (several decision tree models and support vector machines) were developed to investigate the major factors affecting the short-term performance of IPOs. A k-fold cross validation methodology was used to assess and contrast the performance of the predictive models. An information fusion-based sensitivity analysis was performed to combine the values of individual variable importance results into a common representation. The results showed that there was underpricing in the initial public offerings of Turkish companies, although it was not as high as the underpricing determined in developed markets. The market sentiment, the annual sales amounts, the total assets turnover rates, IPO stocks sales methods, the underwriting methods, the offer prices, debt ratio, and number of shares sold were among the most influential factors affecting the short-term performance of initial public offerings of Turkish companies.
© 2015 Elsevier B.V. All rights reserved.

⁎ Corresponding author at: Oklahoma State University, 700 N. Greenwood Ave., #NH341, Tulsa, OK 74106, USA. Tel.: +1 918 594 8283.
E-mail addresses: [email protected] (E. Bastı), [email protected] (C. Kuzey), [email protected] (D. Delen).
URL: http://www.spears.okstate.edu/delen (D. Delen).

http://dx.doi.org/10.1016/j.dss.2015.02.011
0167-9236/© 2015 Elsevier B.V. All rights reserved.

1. Introduction

The main purpose of this study was to investigate the short-term price performance of initial public offerings (IPOs) in Turkey over the period 2005–2013, and to identify the factors affecting underpricing. This study aimed at providing several contributions to the extant literature. It demonstrated that contemporary machine learning techniques are viable (and perhaps better) data analysis tools for critical assessment of IPO valuation. Considering the fact that the vast majority of the previous studies in this domain (including [30,31]; Grammenos and Papapostolou [65]) used classical statistical approaches such as regression analysis, the machine learning based approach used in this study broadened and enriched the analytics landscape. Application of these algorithmic models allowed us to obtain information that is more accurate, since they are shown to be capable of providing better predictive performance [9]. Although there were other studies that used machine-learning techniques in analysis of financial statements, they were limited in scope, solely focusing on the detection of financial fraud [67,68,69]. Taking on a more challenging problem, this study used machine-learning techniques to identify major factors affecting the short-term performance of IPOs. Another differentiation of this study is that although there are many studies that took into account the performance of IPOs in developed countries, there are very few articles investigating the IPO performance in developing countries. Therefore, this study contributes to the extant literature concerning IPOs in developing countries. A literature review on IPO performance is given in the second section. The third section summarizes our methodology including the specification of the IPO data, description of analysis methods and presentation of the results. The fourth section discusses our findings and concludes the paper.

Machine learning, a branch of artificial intelligence, is the study of computational systems/algorithms that can learn from historical data. Based on known features learned from the training data, machine learning primarily focuses on prediction of future outcomes rather than on the discovery of unknown features of the data. Decision tree (DT) learning is one of the predictive modeling techniques commonly employed in machine learning and data mining. DT learning utilizes a decision tree as a predictive model. The objective of DT learning is to produce a model that predicts the value of a dependent variable (target) based on various independent (input) variables. Decision tree algorithms are among the most popular machine learning methods because they produce accurate prediction models, have excellent
visualizations, work with both numerical and categorical data types, perform very well with large data sets, and are easy to understand and interpret.

The first sale of shares to the public by corporations who have not previously floated shares is called an initial public offering (IPO). An IPO is a difficult but very important process for the issuing corporation. Therefore, corporations that wish to go public consult with investment banks in order to prepare the necessary documents, apply to the responsible governmental authority, publicize the issue, set the offer price and finally sell their shares to investors. Determining the offer price is extremely important because if the offer price is set very low, the issuing company loses the opportunity to sell the stocks at a higher price, thereby leaving too much money on the table. On the other hand, if the offer price is set too high, not all of the shares will sell, and the issuing company will end up with a loss of both funds and prestige. Because of its importance, IPO performance has been studied extensively in the finance literature.

Almost all of the previous studies reached the conclusion that there was underpricing in IPOs in the short run [3,13,14,37,40,50,59–61]. Many theories/hypotheses have been proposed since the 1980s in order to explain the IPO underpricing phenomenon. By proposing a winners' curse theory, Rock [50] argued that there are two types of investors: informed and uninformed. According to Rock, underpricing is necessary in order to convince uninformed investors to purchase stocks, so issuing corporations intentionally underprice their IPOs. Beatty and Ritter [3] developed a model that assumes investment banks are better informed than the issuing corporations are, so they force issuing corporations to set a low offer price in order to minimize their selling efforts and advertisement expenditures. Signaling hypotheses assume that the issuing firms are better informed about their companies' intrinsic value than outside investors. Based on these models, the issuing companies underprice IPOs in order to signal and differentiate the quality of their companies from unqualified companies. These models assume that a company's quality is revealed exogenously after the IPOs, allowing companies to maximize their total proceeds from IPOs and subsequent seasoned equity offers (SEO) [1,25].

Prospect theory assumes that the issuing corporations do not care about leaving money on the table. Instead, the initial owners of IPO companies pay attention to the change in their wealth through the value of the shares they hold after the IPOs [39]. Market feedback theory proposes that companies sell a small portion of their shares in IPOs; based on feedback from investors regarding their stock prices, they then sell more shares later in SEOs at their true value [28,57]. Cyclical behavior theory argues that IPO underpricing is mostly encountered in hot markets instead of cold markets [49]. After reviewing many theories, Jenkinson and Ljungqvist [29] and Ritter and Welch [48] concluded that it is impossible to explain pervasive underpricing in IPOs by way of a single theory. Therefore, these theories are not mutually exclusive and, for every IPO, some theories may have more explanatory power than others.

2. Literature review

There are many published studies analyzing the performance of IPOs in the short run as well as ones investigating the factors that influence short-term IPO performance. The vast majority of these studies have suggested that there is underpricing in the very short run (stock prices increase after IPOs). Some of the theories or the underlying hypotheses proposed to explain underpricing in IPOs are summarized below.

One of the hypotheses that attempt an explanation of the underpricing phenomenon is asymmetric information, which has several different versions. The first version of the asymmetric information hypothesis, called “the winners' curse hypothesis”, argues that there is informational asymmetry among investors. This asymmetric information among investors is believed to cause underpricing as follows: Some investors are informed about IPOs while others are uninformed. If an IPO is underpriced, the informed investors apply to buy this stock issue, meaning that uninformed investors can purchase only a small amount of these stocks, because of rationing. On the other hand, when an IPO is overpriced, the informed investors do not apply to buy that stock issue, meaning that the uninformed investors are able to purchase as much stock as desired. However, in this case, they stand to lose money because the stock price could decrease when trading begins. Because of this adverse selection potential, uninformed investors do not apply for IPOs. Therefore, in order to attract uninformed investors to IPOs and guarantee the sale of all issued shares, IPOs must be underpriced [30,50].

Another version of asymmetric information is assumed to be between the investment banks and the issuing corporations. Based on this hypothesis, investment banks are better informed about IPOs than the issuing corporations, so they force the issuing corporations to underprice in order to be successful in IPOs, thereby increasing their reputation [3]. The investment banks argue that underpricing is necessary in order to convince investors to purchase stocks while keeping their advertising costs low [2].

Still another type of asymmetric information is between the issuing corporations and their investors. Based upon this model, the owners of the issuing corporations know the prospects and intrinsic value of their company better than outside investors. In order to differentiate their company from low quality companies and to signal the quality of their company to outside investors, the issuing corporations underprice their IPOs. This model of asymmetric information, which is called the “signaling model”, assumes that the quality of the issuing firms is revealed exogenously after the IPO. Based on signaling models, the corporations issue only a small portion of their capital in the IPO and initial owners generally do not sell their shares in the IPO. Instead, these companies make seasoned equity offers and maximize the total proceeds from the IPO and subsequent SEO [1,25,30,52,61].

The market feedback model of asymmetric information also assumes informational asymmetry between investors and owners; however, it is in the opposite direction. Based on market feedback theory, the investors are better informed about an IPO than the initial owners and managers. The initial owners determine the IPO percentage and price in order to maximize information production from informed investors such as financial analysts. The management of the firm learns the actual value of the company after the IPO. Based on feedback taken from the IPO, the corporations revise the expected returns of their projects upwards and make SEOs at market prices to provide funds for these projects. Therefore, the total proceeds from the IPO and subsequent SEO are maximized [28,56,57].

Another model attempting to explain underpricing in IPOs is prospect theory. According to prospect theory, the initial owners of IPO companies do not care about underpricing or leaving money on the table, because they take into account not just the number of shares they sold in the IPO, but also the amount of shares they hold after the IPO. Instead of getting upset because of IPO underpricing, they become happy as a result of the price increase in their unsold shares [24,36,38,39].

The cyclical behavior hypothesis argues that IPOs are much more heavily underpriced during hot markets than cold markets. Ritter [49] proved that IPO underpricing is realized in specific periods and in some sectors. Loughran and Ritter [40] determined that underpricing in IPOs increased to 15% during the 1990–1998 period; the figures then jumped to 65% during the internet bubble years of 1999–2000 and fell to 12% during the 2001–2003 period. Boonchuaymetta and Chuanrommanee [4] investigated the factors affecting IPO underpricing in Thailand. They found that IPO allocation to institutional investors and the length of the lock-up period are the key determinants of IPO underpricing. Based on their results, the issue size, the industry, and the hot issue market also significantly influence initial returns. Kıymaz [31] analyzed the IPO performance of Turkish stocks in various sectors during the
period 1990–1996. His results revealed that the Turkish IPOs are underpriced on the initial trading day, by an average of 13.1%. He argued that the size of the issuer, the rising prices on the stock market between the date of public offering and the first trading day, institutional ownership, and self-issued offerings were significant determinants of underpricing.

Orhan [45] investigated underpricing on the Istanbul Stock Exchange for 18 sectors for the period 1996–2005. His analysis showed that half of the sectors provided a negative first day return. He argued that this contradicting IPO performance result was due to several economic crises Turkey experienced during the analysis period. However, he reached this result by analyzing sectors separately. Had Orhan [45] analyzed the IPOs' average aggregate performance, it would have been possible to detect marginal underpricing, similar to that found in the various studies conducted on Turkish stocks [46,55,63]. Similarly, Yalçıner [63] analyzed the first day performance of IPOs on the Istanbul Stock Exchange for the period 1997–2004. His results revealed that Turkish stocks provided a 7.2% abnormal return on average on the first day, concluding that there was definite underpricing in Turkish stocks. He also tested the influence of some factors on IPO underpricing but he could not find a significant relationship.

Ünlü and Ersoy [55] also investigated the IPO performance of the Istanbul Stock Exchange listed companies for the period of 1995–2008. Their results revealed the existence of underpricing by 6.52%, based upon the first trading day's closing price. They also concluded that underpricing was more substantial in companies that are over 20 years old, as well as in companies that issue stocks at a fixed price. Relatedly, Otlu and Ölmez [46] examined the IPO performance of Turkish firms for the period of 2006–June 2011, determining that IPO stocks provided a 6.99% abnormal return on average, based upon closing prices on the first trading days. Ünlü and Ersoy [55] and Otlu and Ölmez [46] tested the factors that affect short-term cumulative stock performance. While Ünlü and Ersoy [55] determined that the first day's return, the standard deviation of returns in previous days and the IPO method were effective on short-term cumulative performance, Otlu and Ölmez [46] concluded that the factors having the highest effect on stock prices are the first day's return and the standard deviation of returns in previous days. However, these analyses seem questionable since cumulative returns include the first day's return. A relationship between the first day's return and cumulative returns is therefore most likely a consequence of how the analysis was constructed.

Previous studies frequently utilized regression analysis to investigate the underpricing of IPOs [30,31,65]. Machine learning methods in general and decision tree algorithms in particular are new to this application domain. In this study, we developed and compared several machine learning techniques for both predictive and descriptive purposes. Some of these analytical methods were also recently used in other studies to address different problem settings in the field of finance [19,20,35].

According to the recent literature, there are significant advantages of using contemporary machine learning methods in financial studies [19,20,35]. Despite many studies focusing on IPO pricing using classical statistical techniques, the review of the extant literature showed that there have not been many studies investigating IPO pricing using ML methods, in either developed or developing countries. There seems to be a void in the literature where classical statistical methods are compared to machine learning methods in analysis of IPO pricing. An extensive search returned only one previous research study focusing on using data mining tools for investigating IPO pricing. In that study, Chen & Cheng [12] proposed three data mining models for solving the classification problems of IPO returns in the stock market. They used various hybrid machine learning approaches such as decision trees (C4.5), experiential knowledge, and feature selection methods, minimized entropy principle approach, rough set theory, and rule filter. The proposed hybrid models outperformed the traditional methods in accuracy as well as in generating comprehensible rules applied in knowledge-based systems for investors. In order to provide a comparative perspective, in Table 1, we present the recent IPO pricing literature that studied developing countries. Also included in Table 1 is the list of variables used in each study as well as the research methods applied. Based on Table 1, it is evident that the previous studies predominantly employed some variant of simple regression analysis and statistical t-tests. Therefore, this research study distinguishes itself from these studies by employing machine learning techniques that are capable of capturing and representing complex relationships between the input and output variables hidden in large databases without being subject to constraining assumptions concerning linearity, normality and multicollinearity.

3. Data and the research methodology

In this study, the existence of underpricing in the IPOs of Turkish companies, issued and traded on Borsa Istanbul between 2005 and 2013, was analytically investigated. Moreover, the underlying factors affecting underpricing were also identified and critically examined.

Borsa Istanbul (BIST), formerly called the Istanbul Stock Exchange, started operations with 40 listed corporations at the beginning of 1986. BIST has memberships in various international federations and associations such as the World Federation of Exchanges, Federation of Euro-Asian Stock Exchanges, Federation of European Securities Exchanges, and International Capital Market Association [7]. BIST has five markets: the equity market, the emerging companies market, the debt securities market, the futures and options market and the precious metals and diamond market. The equity market also has eight submarkets: the national market, the collective products market, the secondary national market, the watch-list companies market, the primary market, the wholesale market, the rights coupon market and the free trade platform.

BIST has been developing in many aspects since its inception, such as the number of listed corporations, the daily trading volume, the total market capitalization of listed companies, and the number of markets. There were 421 companies listed on BIST by April 2014. The total market capitalization of BIST companies was $237.64 billion by the end of 2013. 62.5% of the publicly traded shares of BIST companies are owned by foreigners (CMB Monthly Statistics Bulletin, December 2013). Its daily trading volume was $1.48 billion by April 25, 2014 [6]. The BIST Equity Market is ranked 33rd in the world, based upon market capitalization by the end of 2013 [62].

3.1. Data

This study covered all IPOs except security investment trusts during the period of 2005–2013. As a result, our sample included the data of all IPO corporations during this time period. Of all of these IPOs, 65% were underpriced and 35% were overpriced. Summary information about those companies is provided in Table 2 below. As shown in Table 2, the total proceeds provided by the IPOs during the period of 2005–2013 were $10.58 billion. The highest proceeds, $3.29 billion, were in 2007, while there were no IPOs during the global crisis year of 2009.

3.2. Methodology

To begin, the first day returns of IPO companies were calculated by utilizing Eq. (1):

R_{i,t} = \frac{P_{i,t}}{P_{i,\mathrm{IPO}}} - 1    (1)

where R_{i,t} represents the return of stock “i” at the end of first trading day t, P_{i,t} is the closing price of stock “i” at the end of first trading day t, and P_{i,IPO} is the offer price of the same stock.
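
As a minimal illustration of Eq. (1), the sketch below computes first-day returns for a handful of IPOs and the share of issues that closed above their offer price. The tickers, prices and the function name `first_day_return` are invented for illustration and are not taken from the study's sample.

```python
def first_day_return(close_price: float, offer_price: float) -> float:
    """Eq. (1): R_{i,t} = P_{i,t} / P_{i,IPO} - 1."""
    if offer_price <= 0:
        raise ValueError("offer price must be positive")
    return close_price / offer_price - 1.0


# Hypothetical IPOs: (ticker, offer price, first-trading-day closing price).
sample = [("AAA", 10.00, 10.80), ("BBB", 5.00, 4.75), ("CCC", 20.00, 21.00)]

# First-day return per ticker.
returns = {ticker: first_day_return(close, offer) for ticker, offer, close in sample}

# Fraction of issues with a positive first-day return (i.e., underpriced).
underpriced_share = sum(r > 0 for r in returns.values()) / len(returns)

for ticker, r in returns.items():
    print(f"{ticker}: {r:+.2%}")
print(f"underpriced share: {underpriced_share:.0%}")
```

A positive value indicates that the stock closed above its offer price on the first trading day, which is the sense in which an issue is called underpriced in this paper.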
Table 1
List of studies about IPO pricing from developing countries.

Authors: Ekkachai Boonchuaymetta and Wiparat Chuanrommanee, 2013 (Thailand)
Variables: Underwriter reputation, ownership concentration, book-building, IPO allocation, length of the lock up period and investor interest
Control variables: Age of the firm, issue size, the industry, hot issue market
Method: OLS and t-statistics
Managerial implications: As institutional investors play very limited roles in Thai IPO activity, issuer firms should select underwriters that have strong retail networks.

Authors: Jirapun Chorruk and Andrew C. Worthington, 2010 (Thailand)
Variables: The study investigated the factors that affected the long-term performance of IPO stocks. (SET Market Analysis and Reporting Tool, official Prospectus Filing Forms, SET Fact book Series)
Method: Multivariate regression
Managerial implications: An important implication for the prospects of the owners of firms is that they are not significantly disadvantaged in wealth term in considering an IPO. Because underpricing level is low in developing countries.

Authors: Anna P.I. Vong and Duarte Trigueiros, 2010 (Hong Kong)
Variables: Offer price, the offering size (lnsize, lnasset), underwriter reputation, subscription rate (level of informed demand), single or multi underwriter, state of the market. (Factbooks of HKEx and the IPO prospectuses)
Control variables: Age, whether the IPO is timing, firms syndicating the IPO, origin of the firm going public
Method: Cross sectional regression model
Managerial implications: The reputation of underwriters helps to reduce excess returns since information gathering and price-setting activities are more efficient.

Authors: Manapol Ekkayokkaya and Tulaya Pengniti, 2012 (Thailand)
Variables: Subscription period, offer price, numbers of listed and offered shares, firm age, issue managers, pricing techniques, uses of proceeds. (Thomson Financial Securities Data Company (SDC) New Issues database, New Securities and Company Profile databases in SETSMART, Data stream)
Method: Multivariate analysis
Managerial implications: Riskier firms specify proceeds use. Therefore, there is positive relation between the specificity of proceeds use disclosure and underpricing during the post reform period.

Authors: K. Keasey and P.B. McGuinness [43] (Hong Kong)
Variables: Percentage of equity retained by pre-listing shareowners, Ratio of total liabilities to total assets, Logarithm of total assets, Market-to-book ratio of target firm, Market-to-book ratio of the Hang Seng Index. (HKEx web site secondary market trading data was drawn from DataStream)
Method: Regression
Managerial implications: There is a positive relationship between percentage of equity retained by pre-listing shareowners and level of underpricing. Earnings forecast disclosure dummy and leverage also positively affect IPO underpricing. Size negatively affects IPO performance.

Authors: Saikat Sovan Deb and Vijaya B. Marisetty, 2010 (India)
Variables: Method of the offer, offer price, listing price, issue amount, total subscription (Total), subscription by qualified institutional investors (QIB), subscription by retail/non-institutional investors (Ret), promoter's holding post IPO issue, total asset (TA), debt to equity ratio (DE), return on net worth (RONW), earning per share (EPS) and current ratio.
Method: Cross-section multiple regression
Managerial implications: Grading decreases IPO underpricing and positively influences demand of retail investors. Grading reduces secondary market risk and improves liquidity. In emerging markets, regulator's role to signal the quality of an IPO contributes towards the market welfare.

Authors: Halil Kiymaz, 2000 (Turkey)
Variables: Firm size, proceeds, age, Market trend, Offer rate, Privatization, Institutional ownership, Method of going public (all data obtained from ISE)
Method: Cross-sectional regression
Managerial implications: The factors influencing the initial performance of Turkish IPOs are size of the issuer, rising stock market between the time of price fixing and first trading day, and self issued offerings

Table 2
Sample of IPOs in Turkey.

Year     Aggregate proceeds (million dollars)    Percent (%)
2005       465      4.39
2006       919      8.68
2007     3,288     31.07
2008     1,873     17.70
2009         0      0.00
2010     2,104     19.88
2011       833      7.87
2012       346      3.27
2013       755      7.14
Total   10,583    100.00

Following this, the BIST 100 Market Index return was calculated on the same day by utilizing Eq. (2):

R_{m,t} = \frac{P_{m,t}}{P_{m,t-1}} - 1    (2)

where R_{m,t} is the return of the BIST 100 Index on day t, P_{m,t} is the closing price of the BIST 100 Index on day t, and P_{m,t-1} is the closing price of the BIST 100 Index on the previous day.

Later, the market index return was subtracted from the IPO company stock return in order to find the excess return of the IPO company in its first trading day by utilizing Eq. (3):

ER_{i,t} = R_{i,t} - R_{m,t}    (3)

Finally, the first day's excess returns of IPO companies were explained using a set of variables. The variables employed in this analysis were separated into three groups: the stock market sentiment-specific, the pre-IPO financial status and operational performance-specific, and the IPO characteristics of the issuing company specific. A history of 21 days' return of the BIST 100 Index was used to calculate the stock market sentiment. The variables expected to reflect financial and operational performance were: the firm age, the total assets, the total equity, the sales, the operating profit, the net income, the cash flows from operations, the return on assets (ROA), the total assets turnover rate and the debt ratio. The variables believed to reflect stock issue characteristics are the offer price, the number of shares sold, the issue proceeds, the percentage of insider retained shares, the IPO underwriting method and the IPO stocks sales method.

A number of studies argued that IPOs take place during hot market episodes. Those studies also show that stock market sentiment during IPOs influenced the IPO stocks' initial return [0,22,34,38,40,42,59,65]. The BIST 100 Index's 21 days return was included as a proxy for the market sentiment prior to an IPO in order to explain IPO underpricing.

The age of the firm was considered a risk factor in IPO performance studies. Generally, older companies are considered safer than younger ones. Therefore, less underpricing is expected to be found in the IPOs of older companies. However, no significant relation was detected between firm age and the level of underpricing in IPOs [22,59]. Firm age was included in this study as an explanatory variable.

Some studies used the logarithm of total assets as the size factor for the issuing company, finding a negative and significant relation between the company's total assets and the level of underpricing [40,59]. The total assets were used to measure the size of a firm. As IPOs of small companies are more speculative than IPOs of big companies, the IPOs of small companies were expected to be underpriced more than those of larger companies. We also used total assets as a size factor.

As well, annual sales were used as a size factor to explain IPO underpricing in previous studies [38,40]. Those studies detected a negative and significant relationship between annual sales amount and IPO underpricing. Authors of those studies argued that annual sales are an indication of size and uncertainty concerning the issuing firms, which suggests that companies with low sales volume underprice more because of general uncertainty about their performance, and those issues provide high positive returns on their first trading day. In other words, they argue that a negative relationship between annual sales and first-day returns can be interpreted as demonstrating the relationship between the risk of an IPO and underpricing. Annual sales were included in the explanatory variables.

ROA was used to explain IPO underpricing by Grammenos and Papapostolou [65]. They applied OLS regression, detecting a significant and negative relationship between ROA and IPO underpricing. They stated that underwriters had a tendency to set the offer price of high ROA firms closer to the intrinsic value of the firm's stocks, and therefore underpricing decreased. ROA is also among the explanatory variables. For the time being, it is impossible to estimate the direction of the relationship between ROA and underpricing. If issuing firms set a high offer price for high ROA firms' stocks, then a negative relationship must be

Table 3
Variable definitions.

One day excess return: (First trading day closing price − offer price) / offer price

Stock market sentiment: 21 days' return on BIST 100 index prior to first trading date
Firm age: First trading date − inception date
Total assets: Taken from the last balance sheet prior to IPO
Total equity: Taken from the last balance sheet prior to IPO
Sales: Taken from the last annual income statement prior to IPO
Operating profit: Taken from the last annual income statement prior to IPO
Net income: Taken from the last annual income statement prior to IPO
Cash flow from operations: Taken from the last annual statement of cash flows prior to IPO
ROA1: Net income / total assets
ROA2: Operating profit / total assets
Total assets turnover rate: Sales / total assets
Debt ratio: (Short term financial liabilities + long term liabilities) / (short term financial liabilities + long term liabilities + equity)
Offer price: Taken from Borsa Istanbul's Website
Number of shares sold: Taken from Borsa Istanbul's Website
Issue proceeds (USD): (Issue price × number of shares sold) / Central bank US dollar buying rate
Insider retention: 1 − IPO percentage
Underwriting method: Taken from Borsa Istanbul's Website
Sales method: Taken from Borsa Istanbul's Website
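
The market-adjusted return of Eqs. (1)–(3) and the ratio variables defined in Table 3 can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' implementation: the `IpoRecord` container, its field names and the example figures are hypothetical, and the debt ratio assumes the denominator in Table 3 is the full sum of liabilities and equity.

```python
from dataclasses import dataclass


@dataclass
class IpoRecord:
    """Hypothetical container; field names are illustrative only."""
    offer_price: float
    first_day_close: float
    index_close_prev: float            # BIST 100 close on the previous day
    index_close: float                 # BIST 100 close on the first trading day
    net_income: float
    operating_profit: float
    sales: float
    total_assets: float
    short_term_fin_liabilities: float
    long_term_liabilities: float
    equity: float
    ipo_fraction: float                # share of capital offered in the IPO (0..1)


def excess_return(r: IpoRecord) -> float:
    stock_return = r.first_day_close / r.offer_price - 1.0       # Eq. (1)
    market_return = r.index_close / r.index_close_prev - 1.0     # Eq. (2)
    return stock_return - market_return                          # Eq. (3)


def derived_variables(r: IpoRecord) -> dict:
    debt = r.short_term_fin_liabilities + r.long_term_liabilities
    return {
        "roa1": r.net_income / r.total_assets,
        "roa2": r.operating_profit / r.total_assets,
        "total_assets_turnover": r.sales / r.total_assets,
        # Assumption: denominator is liabilities plus equity, as one whole term.
        "debt_ratio": debt / (debt + r.equity),
        "insider_retention": 1.0 - r.ipo_fraction,
    }


# Invented figures, for illustration only.
rec = IpoRecord(10.0, 11.0, 100.0, 102.0, 5.0, 8.0, 50.0, 100.0, 20.0, 20.0, 60.0, 0.25)
print(f"excess return: {excess_return(rec):.4f}")
print(derived_variables(rec))
```

Keeping the raw statement items in one record and deriving the ratios in a single function makes it easy to check each derived variable against its definition in Table 3.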
20 E. Bastı et al. / Decision Support Systems 73 (2015) 15–27

expected between these variables, as was detected by Grammenos and Papapostolou [66]. On the other hand, if the offer price is set independently of the ROA profitability measure, investors should show higher demand for the IPOs of these firms, creating a positive relationship between these two variables. The total assets turnover rate is another explanatory variable for IPO performance [65]. The total assets turnover rate is an indication of operational efficiency. Higher operational efficiency is expected to increase investor demand for the IPOs of those companies, and a higher initial return is expected.

Debt ratio was also used as an explanatory variable for IPO performance by Grammenos and Papapostolou [65]. They found a positive relationship between debt ratio and underpricing. They argued that the underwriters of indebted firms considered these companies riskier and therefore underpriced the issue. As debt ratio is associated with higher riskiness, we expect a lower demand for the IPOs of indebted companies, with a lower initial return.

The offer price is the first issue characteristic variable included in the study. Previous studies suggested that the offer price is a proxy for uncertainty about value. Thus, as it increases, the expected level of underpricing should decrease [5,27,59]. A similar relationship between offer price and IPO performance is expected. IPOs with high expected proceeds are considered to be less risky. Higher IPO proceeds with a higher number of shares issued are related to the size of the issuing company. Since big companies are considered to be less risky, less underpricing is expected in IPO companies whose expected IPO proceeds are higher [30,65]. The number of shares issued has features similar to those of IPO proceeds.

The percentage of insider retained shares after the IPO was another variable of IPO characteristics. Signaling and market feedback theories, and some studies which did not mention theories, argued that there was a positive relationship between the insider retention ratio and IPO underpricing [8,30,64]. Therefore, the insider retention ratio, i.e., the unsold portion of capital after the IPO, was used as an explanatory variable for IPO underpricing.

The underwriting method was also included in the analysis as an explanatory variable for IPO underpricing. Three types of underwriting techniques are applied in Turkey: firm commitment, partial firm commitment and the best efforts issue. It was reported in the literature that more underpricing occurs in best efforts issues than in firm commitment issues, since relatively small firms issue their shares via the best efforts issue, and that underwriters do not stand behind the issuing firms using the best efforts issue [5]. Therefore, we also expect more underpricing in best efforts issues than in partial firm commitment and firm commitment issues. A dummy variable was employed and given a value of 0 if the offering was conducted through best efforts, 1 if the issue was conducted through partial firm commitment, and 2 if the issue was conducted through firm commitment.

The IPO stocks sales method is another variable that was expected to influence IPO short-term performance. Küçükkocaoğlu and Alagöz [34] and Otlu and Ölmez [46] found more underpricing in IPOs issued via book building with a price range method than in IPOs issued via book building with a fixed price method for Turkish IPOs. On the other hand, Loughran et al. [41], Ekkayokkaya and Pengniti [22] and Ünlü and Ersoy [55] identified the opposite. There are mainly two IPO sales methods in Turkey: book building and sales on the exchange. The book building method has two submethods: book building with a fixed price and book building with a price range. The sale on the exchange method has three submethods: the continuous auction method, book building and sales with a fixed price method, and book building and sales with a variable price method. Although rarely used in the literature, total equity, operating profit, net income and cash flows from operations were also included in the study as explanatory variables in order to enrich the analysis. The definitions of all the variables used in this study are provided in Table 3.

With respect to the analysis methods, we chose to use the most popular machine learning techniques. Our choice of methods and techniques was based on evidence obtained from previous comparative studies [19–21,35,44] and on our own experimentation. After a consolidation of our observations, we found that decision trees and support vector machines performed significantly better than their machine learning and statistical counterparts, namely naïve Bayes, nearest neighbor, neural networks and logistic regression. Since decision trees have significant advantages over other machine learning techniques in terms of being easy to understand, transparent (as opposed to a black-box), visually appealing and easily deployable, we included the two most popular decision tree methods (C5 and CART) in our comparative analysis. What follows is a brief description of the specific decision tree and support vector machine models that we used in this study.

3.2.1. Decision tree algorithms

Decision tree (DT) algorithms have gained increasing popularity in analytics and in the ML field. Classification tree analysis and regression tree analysis are the two main types of decision tree analysis methods. The most commonly used DT algorithms are CART, C5.0, C4.5, CHAID, and QUEST, although there are many specific DT algorithms. Among the popular ones, arguably the most commonly used are the C5.0 and CART algorithms, which were the ones used and compared in this study. C5.0 was developed by Quinlan [47]. It offers a number of improvements over its previous version C4.5: empirically speaking, C5.0 is significantly faster than C4.5; it is more memory efficient than C4.5; it creates a considerably smaller decision tree while producing similar or often more accurate results; it boosts the trees, improving them and creating better accuracy; it makes it possible to weight different attributes and misclassification types; and finally, it automatically winnows the data to help reduce the impact of noise inherent in the data. As a result, it improves the objectivity and precision of the decision tree classification algorithm.

Classification and regression trees (CART) were first introduced by Breiman et al. [10]. CART is a binary decision tree algorithm capable of processing both continuous and/or categorical predictor and/or target variables. That is, in contrast to C5.0, CART not only develops decision trees for classification type problems (where the dependent variable is a nominal valued variable) but is also capable of developing decision trees for regression type prediction problems (where the dependent variable is a continuous numerical variable). The CART algorithm works recursively: it partitions data into two subsets to make the records in each subset more homogeneous (ideally, all of the records in the subset belong to the same class value) than the previous/alternative subsets; the two subsets are then split again until the homogeneity criterion or some other time-based stopping criteria are satisfied. The same predictor variable may be used several times in the process of growing the decision tree. The ultimate aim of splitting is to determine the right variable associated with the right threshold to maximize the homogeneity of the subgroups/branches.

A generalized depiction of the decision tree methodology employed in this study is shown in Fig. 1. As shown, from left to right, the data is pre-processed, 10-fold cross validation splits are applied, decision tree methods are developed (for each of the two decision tree algorithms – CART and C5 – 10 different prediction models are developed and tested as per the 10-fold cross validation methodology) and finally the accuracy and sensitivity analysis results are aggregated and presented.

3.2.2. Support vector machine (SVM) algorithm

The support vector machine (SVM), which was originally developed by Vapnik [58], is one of the most robust and accurate methods among machine learning algorithms. It combines statistical methods with machine learning methods; therefore, its theoretical foundation is based on statistical learning theory. SVM is a supervised learning technique that learns from observations by generating input–output mapping functions from training data. The structure of SVM includes an input space, an output space and a training set, and the learning type (i.e. binary or multiple classification problems) is decided by the

[Fig. 1 diagram. Input: raw data, pre-processed data, 10-fold cross validation data (ten 10% partitions). Processing: for each of C5, CART, SVM and others (LR, NN, QUEST, ...), training and calibrating the model, testing the model, conducting sensitivity analysis. Output: model testing results (accuracy, sensitivity, specificity, AUC) and integrated (fused) sensitivity analysis results.]

Fig. 1. The research methodology used in this study.
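The 10-fold cross validation loop at the centre of Fig. 1 can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the study's implementation: `train_stump` is a hypothetical one-threshold stand-in for the C5.0, CART and SVM learners, the data are synthetic, and all function names are invented for this example.

```python
import random


def k_fold_indices(n, k, seed=0):
    """Randomly split range(n) into k mutually exclusive folds of near-equal size."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def train_stump(X, y):
    """Hypothetical stand-in learner: the best single-feature threshold rule."""
    best = None
    for j in range(len(X[0])):
        for t in sorted({row[j] for row in X}):
            for above in (0, 1):
                hits = sum((above if row[j] > t else 1 - above) == c
                           for row, c in zip(X, y))
                if best is None or hits > best[0]:
                    best = (hits, j, t, above)
    _, j, t, above = best
    return lambda row: above if row[j] > t else 1 - above


def cross_validation_accuracy(X, y, k=10, seed=0):
    """Train on k-1 folds, test on the held-out fold, and average the k accuracies."""
    folds = k_fold_indices(len(X), k, seed)
    fold_accuracies = []
    for i, test_idx in enumerate(folds):
        train_idx = [m for f, fold in enumerate(folds) if f != i for m in fold]
        model = train_stump([X[m] for m in train_idx], [y[m] for m in train_idx])
        correct = sum(model(X[m]) == y[m] for m in test_idx)
        fold_accuracies.append(correct / len(test_idx))
    return sum(fold_accuracies) / k, fold_accuracies


# Synthetic data: one feature, class 1 when the value is at least 50.
X = [[float(v)] for v in range(100)]
y = [1 if v >= 50 else 0 for v in range(100)]
cva, fold_accuracies = cross_validation_accuracy(X, y, k=10)
print(f"CVA = {cva:.3f}")
```

Each record is used for testing exactly once, so the averaged accuracy reflects performance on data the corresponding model did not see during training.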

output space. It uses mapping functions, which can be either classification or regression functions, to map the data to a high dimensional feature space. SVM is considered a maximal margin classifier. When the input data are not easily separable, it uses one of four kernel functions (linear, polynomial, radial basis and sigmoid) for classification problems, transforming the input data to a high dimensional feature space in order to make the data easily separable.

3.2.3. Cross validation methodology

Cross validation is a popular technique for estimating the unbiased accuracy of a predictive model's performance in practice. It is sometimes called rotation estimation, and its aim is to assess how the result of a predictive analysis technique will generalize to an independent data set. 10-fold cross validation is widely used in data mining research, since empirical studies demonstrated that 10 was an “optimal” number of folds, creating a fine balance between sampling bias (i.e., diversification of training and testing subsamples) and timing (the time it takes to complete the model building and testing activities) [33]. Essentially, in 10-fold cross validation, the data set is randomly split into 10 mutually exclusive subsets of approximately equal size. The models are trained/developed first and then tested, and the process is repeated 10 times. Each time, the model is trained on nine folds (as a combined training data set that includes 90% of the total dataset) and tested on the remaining fold (10% of the total dataset). The cross validation estimate of the overall accuracy of a model is calculated by averaging the 10 individual accuracy measures coming from each fold, as shown in Eq. (4):

$$\mathrm{CVA} = \frac{1}{k} \sum_{i=1}^{k} A_i \qquad (4)$$

where “CVA” stands for cross validation accuracy, k is the number of folds (here k = 10), and A_i is the accuracy measure of fold i.

3.2.4. Performance measurements of prediction models

A confusion matrix, also known as a coincidence matrix, is used in order to determine the performance of models used in predicting binary (two-group) outcomes. It contains valuable information about the

actual and predicted classifications created by the prediction model [32]. According to Caruana et al. [11], evaluation of the performance of learning methods is crucial. Therefore, well-known performance measures such as overall accuracy, AUC (Area under the ROC Curve), Recall and F-measure were employed. Overall accuracy (AC) is defined as the percentage of records that are correctly predicted by the model, i.e., the ratio of correctly predicted cases to the total number of cases. Precision is defined as the ratio of the number of True Positives (correctly predicted positive cases) to the sum of the True Positives and the False Positives. Recall, also known as Sensitivity or the True Positive rate, is defined as the ratio of the True Positives to the sum of the True Positives and the False Negatives. The F-measure takes the harmonic mean of Precision and Recall, and therefore takes both measures into consideration. Specificity, also known as the True Negative rate (TN), is defined as the ratio of the number of True Negatives to the sum of the True Negatives and the False Positives.

3.2.5. Sensitivity analysis (assessing predictors' importance)

The objective of sensitivity analysis is to measure the importance of predictor variables. Davis [15] stated that the “Cause and effect” relationship between the dependent (output) and independent (input) variables of a prediction model is determined by “sensitivity analysis” in machine learning algorithms. It is commonly used to identify and focus on the more important variables and to ignore or drop the least important ones. Sensitivity scores relate to the importance of each variable in making a prediction, not necessarily to whether the prediction itself is accurate. The variance of predictive error is arrived at by dropping one predictor variable at a time and observing the performance of the remainder. A variable is considered more important than another if its removal increases the variance, compared to the complete model containing all the variables [69]. Predictor importance is determined by evaluating the variance reduction of the target attributable to each predictor (see Eq. (5)). Predictors are ranked according to the sensitivity measure defined as [53]:

$$S_i = \frac{V_i}{V(Y)} = \frac{V(E(Y \mid X_i))}{V(Y)} \qquad (5)$$

where Y is the target (dependent variable) and X_j (j = 1, …, k) are the predictors (independent variables). V(Y) is the unconditional output variance. The predictor importance of the ith variable is then computed as the normalized sensitivity (see Eq. (6)).

$$PI_i = \frac{S_i}{\sum_{j=1}^{k} S_j} \qquad (6)$$

It is shown that S_i is the proper measure of sensitivity to rank the predictors in order of importance in the presence of any combination of complex interaction and non-orthogonality among predictors [51].

3.2.6. Information fusion-based sensitivity analysis

Data fusion is a process that combines data and knowledge from different sources with the aim of maximizing the useful information content, for improved reliability or discriminant capability, while minimizing the quantity of data ultimately retained [54]. In this study, the obtained predictions are the data or information, the “prediction models” are the sources, and combining the predictions is the process of fusion. Fuller et al. [23] and Delen [18] showed that combining predictions (fusion) yields more accurate and more robust results. Accordingly, each decision tree model (C5.0 and CART) created variable importance scores for each independent variable. The combination of these prediction models is called information fusion-based sensitivity analysis. Each of the prediction models produced different predictor importance values. An information fusion-based sensitivity analysis was performed to combine these values into a common representation. The relative variable importance score produced by each decision tree model was normalized by using Eq. (7) below. The scores were then aggregated into a single set of importance numbers and represented in tabular form. Finally, the normalized variable importance scores were combined using Eq. (8) to find a single combined (fused) relative importance value for each variable.

$$PI_{\mathrm{new}} = \frac{PI - PI_{\min}}{PI_{\max} - PI_{\min}} \qquad (7)$$

$$PI_{n(\mathrm{fused})} = w_1 PI_{1n} + w_2 PI_{2n} + \dots + w_m PI_{mn} \qquad (8)$$

where

PI represents the relative predictor importance score that was initially produced by the individual model.
w_i represents the normalized weight value for each model. This represents the importance of the models and is proportional to their predictive powers.
m represents the number of prediction models.
n represents the number of variables.

These fused sensitivity scores were presented as bar-charts to visually illustrate the relative importance of the independent variables, from the highest (most important) to the lowest (least important), for predicting (contributing to the prediction of) the dependent variable.

Table 4
Descriptive statistic results.

| Variable | Mean | Std. dev. | Min | Max |
|---|---|---|---|---|
| 1st day excess return (%) | 6.09 | 9.85 | −17.49 | 31.55 |
| 10 days excess returns (%) | 10.36 | 29.09 | −52.76 | 160.55 |
| ROA | 0.06 | 0.10 | −0.18 | 0.49 |
| Total assets turnover rate | 2.71 | 8.89 | 0.00 | 62.47 |
| Net income (Million TL) | 55.5 | 260 | −5.83 | 2550 |
| Debt ratio | 0.36 | 0.26 | 0.00 | 1.00 |
| Number of issued shares (millions) | 29 | 82.9 | 0.29 | 625 |

Table 5
Correlation analysis.

| Variables | V1 | V2 | V3 | V4 | V5 | V6 | V7 |
|---|---|---|---|---|---|---|---|
| V1 1st day excess return | 1 | | | | | | |
| V2 10 days excess returns | .67⁎⁎ | 1 | | | | | |
| V3 ROA | .21⁎ | .06 | 1 | | | | |
| V4 Total assets turnover rate | .19⁎ | .24⁎⁎ | −.11 | 1 | | | |
| V5 Net income | −.03 | −.04 | .19⁎ | −.05 | 1 | | |
| V6 Debt ratio | −.06 | −.12 | −.27⁎⁎ | .01 | .10 | 1 | |
| V7 Number of issued shares | −.04 | −.07 | .09 | −.06 | .74⁎⁎ | .10 | 1 |

⁎⁎ p < 0.01.
⁎ p < 0.05.

4. Results

Data screening was a crucial step in the statistical analysis. Therefore, a missing data analysis was performed first. Firms with many missing values were eliminated from the analysis; in addition, multiple imputation was applied in order to replace the missing values. Following the missing data analysis, influential multivariate outliers were removed by calculating the Mahalanobis d-squared values.
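The performance measures defined in Section 3.2.4 can be computed directly from the four cells of a confusion matrix. The sketch below uses the C&R tree counts from Table 7, read under the assumption that rows are actual classes and columns are predicted classes; the function name is hypothetical. Under that reading, the rounded results agree with the C&R Tree row of Table 6.

```python
def binary_metrics(tp, fn, fp, tn):
    """Measures of Section 3.2.4 from the four confusion-matrix cells."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)  # sensitivity / true positive rate
    return {
        "accuracy": (tp + tn) / (tp + fn + fp + tn),
        "recall": recall,
        "specificity": tn / (tn + fp),  # true negative rate
        "precision": precision,
        "f_measure": 2 * precision * recall / (precision + recall),
    }


# C&R tree counts from Table 7.
# ASSUMPTION: rows are actual classes, columns are predicted classes.
cart = binary_metrics(tp=369, fn=144, fp=54, tn=450)
for name, value in cart.items():
    print(f"{name}: {value:.3f}")
```

Working from the raw counts rather than from the rates makes it easy to verify that the entries of Tables 6 and 7 are mutually consistent.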

Table 6
Prediction results, performances of the models.

| Algorithms | AC | TP | TN | FP | FN | P | F-measure | AUC |
|---|---|---|---|---|---|---|---|---|
| C5.0 | .965 | .965 | .964 | .036 | .035 | .965 | .965 | .976 |
| C&R Tree | .805 | .719 | .893 | .107 | .281 | .872 | .788 | .840 |
| SVM | .717 | .772 | .661 | .339 | .228 | .698 | .733 | .827 |
| QUEST | .558 | .140 | .982 | .018 | .860 | .889 | .242 | .561 |
| CHAID | .575 | .982 | .161 | .839 | .018 | .544 | .700 | .572 |
| Neural network | .478 | .298 | .661 | .339 | .702 | .472 | .366 | .573 |

Nomenclature: AC: accuracy; TP: sensitivity/true positive rate/recall; TN: specificity/true negative rate; FP: false positive rate; FN: false negative rate; P: precision; AUC: Area under curve; and SVM: support vector machines.
Model parameter specifications:
Neural networks: input layer: 29 neurons; hidden layer 1: 3 neurons; output layer: 1 neuron; Persistence: 200; Alpha: 0.9; Initial Eta: 0.3; High Eta: 0.1; Eta decay: 30; Low Eta: 0.01; Number of records: 1017; Analysis accuracy: 47.788%.
Support vector machine (SVM): Number of records: 1017; Analysis accuracy: 71.681%; Stopping criteria: 1.0E-3; Kernel type: RBF; Regularization parameter (C): 10; Regression precision (epsilon): 0.1; RBF gamma: 0.1; Gamma: 1.0; Bias: 0.0; Degree: 3.
C5 decision tree: Tree depth: 6; Pruning severity: 75; Minimum records per child branch: 2; Winnow attributes: false; Use global pruning: true.
C&R decision tree: Tree depth: 5; Levels below root: 5; Maximum surrogates: 5; Minimum change in impurity: 0.0; Impurity measure for categorical targets: Gini; Stopping criteria: Use percentage; Minimum records in parent branch (%): 2; Minimum records in child branch (%): 1; Prune tree: true.
CHAID decision tree: Levels below root: 5; Alpha for splitting: 0.05; Alpha for merging: 0.05; Epsilon for convergence: 0.001; Maximum iterations for convergence: 100; Use Bonferroni adjustment: true; Allow splitting of merged categories: false; Chi-square method: Pearson; Stopping criteria: Use percentage; Minimum records in parent branch (%): 2; Minimum records in child branch (%): 1.
QUEST decision tree: Levels below root: 5; Maximum surrogates: 5; Alpha for splitting: 0.05; Stopping criteria: Use percentage; Minimum records in parent branch (%): 2; Minimum records in child branch (%): 1; Prune tree: true.

Table 8
Raw and normalized variable importance scores.

| Variables | Raw SVM | Raw C5 | Raw CART | N_SVM | N_C5 | N_CART |
|---|---|---|---|---|---|---|
| Cash flow from operations | .040 | .000 | .005 | .115 | .000 | .031 |
| Debt ratio | .137 | .000 | .045 | .611 | .000 | .276 |
| Equity | .034 | .000 | .045 | .084 | .000 | .276 |
| Firm age | .037 | .000 | .045 | .095 | .000 | .276 |
| Insider retention | .024 | .000 | .045 | .029 | .000 | .276 |
| Offer price | .035 | .000 | .149 | .089 | .000 | .915 |
| Issue proceeds (USD) | .022 | .000 | .012 | .019 | .000 | .076 |
| Market sentiment | .047 | .219 | .162 | .151 | .661 | 1.000 |
| Net profit | .039 | .000 | .045 | .108 | .000 | .276 |
| Number of shares sold | .018 | .000 | .113 | .000 | .000 | .694 |
| Operating profit | .036 | .000 | .045 | .090 | .000 | .276 |
| ROA1 | .033 | .000 | .045 | .075 | .000 | .276 |
| ROA2 | .048 | .000 | .045 | .152 | .000 | .276 |
| Sales | .031 | .332 | .045 | .068 | 1.000 | .276 |
| Sales method | .213 | .000 | .045 | 1.000 | .000 | .276 |
| Total assets | .030 | .000 | .045 | .063 | .000 | .276 |
| Total assets turnover rate | .054 | .266 | .066 | .182 | .800 | .408 |
| Underwriting method | .124 | .183 | .000 | .541 | .553 | .000 |

The "Raw" columns are the raw variable importance scores and the "N_" columns are the normalized variable importance scores.

4.1. Sample and descriptive statistics

The descriptive statistics results are presented in Table 4. Borsa Istanbul provided the data for IPO companies on its website. Offer prices, number of shares sold, issue proceeds, IPO percentages, underwriting methods and sales methods were taken directly from Borsa Istanbul's website. The percentage of the unsold equity was calculated by subtracting the IPO percentage from 1. The inception dates of the IPO firms as well as their total assets, total equity, sales, earnings before interest and taxes, net income, and cash flows from operations were taken from the prospectuses of the IPO companies. The age of the firms was calculated by subtracting the inception date of the issuing company from the IPO date. The return on assets (ROA), total assets turnover rate and debt ratio were calculated by utilizing the financial data extracted from the IPO prospectuses. According to the results, the average excess return on the first trading day was 6.09% with a standard deviation of 9.85%, while the excess return for 10 days was 10.36% with a standard deviation of 29.09%. Additionally, the average return on assets was 6%, the total assets turnover rate was 2.71, and the debt ratio was 36%. Finally, the average net income was 55,500,000 Turkish Liras (TL) and the average number of issued shares was 29,000,000.

4.2. Correlation analysis

Table 5 provides the Pearson correlation coefficient results. There was a positive and significant relationship between the first day excess returns and the 10 days' excess returns (r = .67; p < .01). There were also positive and significant correlations between the first day excess return and ROA (r = .21; p < .05) as well as the total assets turnover rate (r = .19; p < .05). It was also clear that there was a significant association between the excess returns for 10 days and the total assets turnover rate (r = .24; p < .01). Additionally, the ROA had a positive and significant relationship with net income (r = .19; p < .05), while it had a negative and significant association with debt ratio (r = −.27; p < .01). Finally, net income and the number of shares issued showed a strong, positive and significant association (r = .74; p < .01).

4.3. Decision tree analysis, SVM and sensitivity analysis results

In order to formulate this study, one output (dependent variable) and eighteen inputs (independent variables) were used. The output variable was the one-day excess return, while the input variables were cash flow from operations, debt ratio, equity, firm age, insider

Table 7
Confusion matrices of the models based on 10-fold cross validation test data.

| Model type | Class | 0 | 1 | Outcome | Count | Overall accuracy | Per-class accuracy |
|---|---|---|---|---|---|---|---|
| C5.0 | 0 | 486 | 18 | Correct | 981 | 96.46% | 96.43% |
| | 1 | 18 | 495 | Wrong | 36 | 3.54% | 96.49% |
| | Sum | 504 | 513 | | 1017 | | |
| C&R tree | 0 | 450 | 54 | Correct | 819 | 80.53% | 75.76% |
| | 1 | 144 | 369 | Wrong | 198 | 19.47% | 87.23% |
| | Sum | 594 | 423 | | 1017 | | |
| SVM RBF | 0 | 333 | 171 | Correct | 729 | 71.68% | 74.00% |
| | 1 | 117 | 396 | Wrong | 288 | 28.32% | 69.84% |
| | Sum | 450 | 567 | | 1017 | | |

Table 9
Aggregated sensitivity analysis results.

| Rank | Variables | Fused score |
|---|---|---|
| 1 | Market sentiment | 1.551 |
| 2 | Sales | 1.236 |
| 3 | Total assets turnover rate | 1.232 |
| 4 | Sales method | .939 |
| 5 | Underwriting method | .922 |
| 6 | Offer price | .800 |
| 7 | Debt ratio | .660 |
| 8 | Number of shares sold | .559 |
| 9 | ROA2 | .331 |
| 10 | Net profit | .299 |
| 11 | Firm age | .290 |
| 12 | Operating profit | .287 |
| 13 | Equity | .282 |
| 14 | ROA1 | .276 |
| 15 | Total assets | .267 |
| 16 | Insider retention | .243 |
| 17 | Cash flows from operations | .108 |
| 18 | Issue proceeds (USD) | .075 |

Fig. 2. Graphical representation of the sensitivity analysis result.
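The normalization and fusion steps of Eqs. (7) and (8) that produce the scores plotted in Fig. 2 can be sketched as follows. The function names are hypothetical, and the weights are assumed to be the overall accuracies reported in Table 6; the text states only that the weights are proportional to predictive power. With that assumption, the printed scores agree with the Table 9 entries for these three variables (1.551, 1.236 and 0.939).

```python
def min_max_normalize(scores):
    """Eq. (7): rescale one model's raw importance scores to [0, 1]."""
    lo, hi = min(scores.values()), max(scores.values())
    return {name: (value - lo) / (hi - lo) for name, value in scores.items()}


def fuse(normalized_by_model, weights):
    """Eq. (8): weighted sum of the normalized scores across models."""
    first_model = next(iter(normalized_by_model))
    return {
        name: sum(weights[m] * scores[name]
                  for m, scores in normalized_by_model.items())
        for name in normalized_by_model[first_model]
    }


# Raw C5 scores for a subset of variables (Table 8), including the column
# minimum (.000) and maximum (.332) so the rescaling matches the full column.
raw_c5 = {"Market sentiment": 0.219, "Sales": 0.332,
          "Total assets turnover rate": 0.266,
          "Underwriting method": 0.183, "Debt ratio": 0.000}
normalized_c5 = min_max_normalize(raw_c5)

# Normalized scores as printed in Table 8 for three variables.
normalized = {
    "SVM": {"Market sentiment": 0.151, "Sales": 0.068, "Sales method": 1.000},
    "C5": {"Market sentiment": 0.661, "Sales": 1.000, "Sales method": 0.000},
    "CART": {"Market sentiment": 1.000, "Sales": 0.276, "Sales method": 0.276},
}
# ASSUMPTION: model weights are the overall accuracies from Table 6.
weights = {"SVM": 0.717, "C5": 0.965, "CART": 0.805}

fused = fuse(normalized, weights)
for name, score in sorted(fused.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {score:.3f}")
```

Because the raw scores in Table 8 are rounded to three decimals, re-normalizing them reproduces the printed normalized scores only approximately (for example, about .660 instead of .661 for market sentiment under C5).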

retention, offer price, issue proceeds (USD), market sentiment, net profit, number of shares sold, operating profit, ROA1, ROA2, sales, sales method, total assets, total assets turnover rate, and the underwriting method. The output variable was entered into the model as a binary variable by employing the central tendency measure (median) value as a split criterion: the class with a one-day excess return score above the median value was rated as 1 (successful) and the class with a one-day excess return score below the median value was rated as 0 (unsuccessful).

4.4. Prediction results

Various classification algorithms were used, including C5, CART, SVM, QUEST, CHAID, Logistic Regression and Neural Networks. Out of these models, the C5, CART and SVM algorithms performed significantly better than the others in terms of prediction accuracy. Therefore, three classification models – two from decision tree algorithms along with support vector machines – were included (C5.0, CART, and SVM) in the final analysis. For each model, the 10-fold cross validation methodology was employed to obtain the most representative prediction results. In this cross validation method, 10 different models were trained and tested, each time using a different, mutually exclusive 10% of the total dataset as the hold-out/test sample. The testing results were combined and used for comparison of the prediction models. The comparison of the prediction models is shown in Table 6. The results indicated that the C5.0 DT model outperformed the CART and SVM models: the overall accuracy of C5.0 was nearly 97% and the overall accuracy of CART was 81%, while that of SVM was 72%. In addition to overall accuracy, the other performance measures such as sensitivity, specificity, precision, F-measure and AUC also showed that the C5.0 decision tree model performed best. In order to compare the performance of these contemporary machine learning techniques to a similar classical approach, logistic regression (LR), we applied the same 10-fold cross validation methodology to train and test logistic regression models. The results showed that decision tree and SVM models outperformed logistic regression.

The confusion matrices and overall accuracy rates, as well as per-class accuracy rates for each model constructed from the test data samples, are shown in Table 7. According to the obtained results, prediction accuracy for the successful class was higher than the prediction rate for the unsuccessful class in both DT models. The prediction rate for successful cases was 96.49% for the C5.0 model, while it was 87.23% for the CART model. Accordingly, the employed DT models predicted a successful one-day excess return for the selected firms with prediction rates of about 96% (C5.0) and 87% (CART). Similarly, C5.0 predicted an unsuccessful one-day excess return for firms with a 96.43% accuracy rate, while CART predictions had a 75.76% accuracy rate.

4.5. Sensitivity analysis results

Model-specific sensitivity analysis and information fusion based multi-model sensitivity analysis were used for determining the relative predictor importance of the inputs, since the individual DT models generated different predictor importance scores. First, the relative predictor importance values created by each DT model were normalized using Eq. (7). The raw variable importance scores as well as the normalized variable importance scores are shown in Table 8.

In the next step, the normalized scores of each model were multiplied by the weight values of each model; these multiplied values were then added together using Eq. (8), in order to find a single fused predictor importance score for each input variable. The rankings of the fused scores of each input are presented in Table 9. To illustrate the fused predictor importance values of the independent variables in order of importance, a bar-chart was created using the aggregated sensitivity values (see Fig. 2). The y-axis shows the input variables while the x-axis shows the predictor importance score for each indicator. According to Fig. 2, market sentiment was the single most important predictor in determining the one-day excess return, while sales was the second most important predictor. Accordingly, the total assets turnover rate, IPO stocks sales method, the underwriting

method, the offer price, debt ratio and the number of shares sold followed as the leading variables in the one-day excess return.

5. Summary, implications and conclusions

In this study, the existence of underpricing in IPOs of Borsa Istanbul listed companies and determinants of initial performance were investigated. All of the IPOs except for securities investment trusts were included in the study for the period of 2005–2013. Results show that, although not as high as the underpricing detected in most of the previous studies, there was underpricing in the Turkish IPO market. The average first day underpricing in IPOs for the analysis period was 6.09%. This result is lower than the results of the previous studies, which detected underpricing in the range of 6.52% to 13.1%.

This research study provides several managerial implications. The underpricing level is relatively low in Turkey, somewhat similar to the results of previous research conducted in developing countries [14,22,59]. The results indicate that market sentiment is the crucial factor that affects IPO performance. This result is in accordance with the evidence reported by Kiymaz [31] and Boonchuaymetta and Chuanrommanee [4]. Thus, firms that are planning to go public should try to time the market and sell their stocks during periods when market sentiment is on a positive trajectory. In addition, results indicated that sales and total assets turnover ratios influence IPO performance. This suggests that firms that do not have problems in increasing their absolute sales and sales as a ratio of total assets may sell their stocks at relatively higher prices without encountering demand problems. This implication shows that investors overvalue the efficiency of the IPO firms. Moreover, the sales method was found to be an important factor in IPO performance. Firms that plan to offer their stocks via IPO should sell their stocks using variable price methods rather than using fixed price methods, which lead to lower underpricing in IPOs. All in all, Table 1 depicted the fact that all results from the developing countries yielded different but important implications depending on the specific characteristics of the country. This might be because of the legal, economic and financial infrastructures of the developing countries. Because financial data is becoming readily available in large quantities, decision makers can use contemporary

or uninformed. Since publicly available data influence IPO performances, the informed and uninformed classification of investors was not true in the Turkish IPO market. In other words, the effect of publicly available data on IPO underpricing proved that there was no informational asymmetry among investors.

The IPO stocks sales method is the first variable (out of all IPO issue characteristic variables) that has a significant effect on IPO performance. There are basically two different IPO stocks sales methods, which are sale by book building and sale on the exchange. Both methods have sale with a fixed price and sale with a price range alternatives. Therefore, effects of fixed price and variable price sales methods on IPO performance are an expected outcome. This finding is in accordance with the results of Küçükkocaoğlu and Alagöz [34], Otlu and Ölmez [46], Loughran et al. [41], Ekkayokkaya and Pengniti [22] and Ünlü and Ersoy [55].

There was also a relationship between the underwriting method and the IPO performance. According to the literature, there was more underpricing in the best efforts issues than in the firm commitment issues, since relatively small firms offered their shares via the best efforts issue and the underwriters did not stand behind the issuing firms for their best efforts issues. Therefore, the relationship between the underwriting method and the IPO short-term performance was the expected outcome.

The offer price was also determined to be effective in short-term IPO performance. The offer price of an IPO was used as an ex-ante risk proxy, since IPOs with higher offer prices lead to lower levels of underpricing. Therefore, this result must stem from the investors' perceptions that low priced IPOs are less risky and thus provide higher demand to those stock issues. This finding was consistent with the evidence of Ibbotson et al. [27], Booth and Chua [5] and Vong and Trigueiros [59], all of whom found a significant relationship between the offer price and the IPO underpricing level.

The number of shares sold was associated with IPO riskiness. Since IPO companies with more shares to be issued were regarded as safer, lower underpricing was expected in the IPOs of those firms. This study also indicated that there was a relationship between the number of shares sold and the level of IPO underpricing. This result may be regarded to be in keeping with the results of Kennedy
machine learning techniques such as DT, SVM, and NN to investigate the et al. [30] and Grammenos and Papapostolou [65], both of which
most important market variables, and hence, make the most optimal detected a negative relationship between IPO proceeds and IPO
decisions for their firms. underpricing.
Contemporary ML methods such as decision tree algorithms and Some studies had included underwriter reputation as an explanatory
SVM were successfully employed in this study. As recommended by variable, which produced mixed results in explaining short-term IPO per-
the previous studies [18,35], in order to combine the scores of more formance [4,59]. They took differing measures as proxies for underwriter
than one algorithm, the data fusion technique was applied to determine reputation, such as the market share of an underwriter being a percentage
the factors affecting underpricing, rather than focusing on single of the number of companies that were declared publicly, the natural
method scores. Information-based sensitivity analysis enabled the logarithm of the capital volume of the IPOs, or the market share of
reliability as well as the discriminant capability to be improved while the lead underwriter. Therefore, determination of the underwriter's
minimizing the amount of obtained data, as recommended by Starr reputation is somehow subjective. In addition, in Turkey, IPOs are
and Desforges [54], in order to obtain useful information by combining performed by underwriting syndicates, that is, many investment
knowledge from the two different DT algorithm results. banks handle the issues jointly. As it is difficult to determine which
Based on the employed DT and SVM methods, the initial perfor- underwriters' reputation is more effective in IPOs performance, it is not
mances of IPO companies were affected by the market sentiment, the included in our study as an explanatory variable. The issue of not includ-
annual sales amount, the total assets turnover rate, the sales method, ing underwriter reputation as an explanatory variable may be viewed as a
the underwriting method, the offer price, debt ratio and the number limitation of our study.
of shares sold. The influence of the sentiment of the market on IPO Detection of the underpricing level is very low in our study, which
underpricing demonstrated that the cyclical behavior theory was true complies with many other studies on developing economies as well as
for Turkish IPOs. This result was in accordance with the findings of Turkey [14,22,46,55,59]. Low levels of underpricing are good for issuing
many previous studies, such as Boonchuaymetta and Chuanrommanee corporations, but they are not beneficial for investors when combined
[4], Loughran and Ritter [40], Kiymaz [31] and Ritter [49]. The cyclical with inadequate investor protection in developing countries. The causes
behavior theory argues that stock issues during bull markets are more and remedies of low underpricing could be a future research subject
in demand, and therefore provide a higher initial return. because low IPO performance may diminish investor interest in IPOs,
The effect of financial measures such as annual sales amount, the causing unsuccessful IPOs in the future. For example, Indian and Chinese
total assets turnover rate, and debt ratio on underpricing reflects the authorities have been taking action to resolve low IPO performance in
fact that the winner's curse theory was not supported by our results. order not to encounter future problems in the IPO market. India intro-
The winners' curse theory classifies investors as being either informed duced a unique certification mechanism for IPOs in 2007, whereby all
IPOs have to undergo mandatory quality grading by independent rating agencies. This positively influenced investor demand for IPOs and decreased underpricing [16]. Since underpricing or overpricing may adversely affect the long-term development of the IPO market, China began requiring the disclosure of factors behind pricing decisions after April 2012, in situations where the price-to-earnings ratio of an IPO is expected to be 25% higher than that of publicly traded companies in the same industry [26].

In this study, after considering a wide variety of machine learning techniques, we settled on using decision trees and SVM as our prediction methods. It is common knowledge that most machine learning methods (including the ones used in this study) have a number of modeling parameters that need to be “optimized.” While there are techniques to improve the predictive power by systematically changing/adjusting these modeling parameters (such as the ones that we used in this study), there is no guarantee of reaching the ultimate best (the optimal) model; hence, they are often called heuristic methods [17]. Future research would include additional prediction models and meta-heuristic optimization techniques to achieve even better prediction accuracy and more robust variable importance measures.

References

[1] F. Allen, G. Faulhaber, Signaling by underpricing in the IPO market, Journal of Financial Economics 23 (1989) 303–323.
[2] D.P. Baron, A model of the demand for investment banking advising and distribution services for new issues, The Journal of Finance 37 (4) (1982) 955–976.
[3] R.P. Beatty, J.R. Ritter, Investment banking, reputation, and the underpricing of initial public offerings, Journal of Financial Economics 15 (1986) 213–232.
[4] E. Boonchuaymetta, W. Chuanrommanee, Management of the IPO performance in Thailand, Journal of Multinational Financial Management 23 (2013) 272–284.
[5] J.R. Booth, L. Chua, Ownership dispersion, costly information, and IPO, Journal of Financial Economics 41 (1996) 291–310.
[6] Borsa Istanbul Daily Bulletin, http://borsaistanbul.com/en/data/data/equity-market-data/bulletin-data April 25, 2014.
[7] Borsa Istanbul Website, http://borsaistanbul.com/en/data/data/ipo-data.
[8] D.J. Bradley, B.D. Jordan, Partial adjustment to public information and IPO underpricing, Journal of Financial and Quantitative Analysis 37 (4) (2002) 595–616.
[9] L. Breiman, Statistical modeling: the two cultures, Statistical Science 16 (3) (2001) 199–231.
[10] L. Breiman, J.H. Friedman, R.A. Olshen, C.J. Stone, Classification and Regression Trees, Chapman & Hall/CRC, New York, 1984.
[11] R. Caruana, N. Karampatziakis, A. Yessenalina, Empirical evaluation of supervised learning in high dimensions, ICML, 2008.
[12] Y.-S. Chen, C.-H. Cheng, A soft-computing based rough sets classifier for classifying IPO returns in the financial markets, Applied Soft Computing 12 (1) (2012) 462–475.
[13] W.Y. Cheng, Y.L. Cheung, K.K. Po, A note on the intraday patterns of initial public offerings: evidence from Hong Kong, Journal of Business Finance & Accounting 31 (2004) 837–860.
[14] J. Chorruk, A.C. Worthington, New evidence on the pricing and performance of initial public offerings in Thailand, 1997–2008, Emerging Markets Review 11 (2010) 285–299.
[15] G. Davis, Sensitivity analysis in neural net solutions, IEEE Transactions on Systems, Man, and Cybernetics 19 (1989) 1078–1082.
[16] S.S. Deb, V.B. Marisetty, Information content of IPO grading, Journal of Banking & Finance 34 (9) (2010) 2294–2305.
[17] D. Delen, Real-World Data Mining: Applied Business Intelligence and Decision Making, Financial Times Press (a Pearson Company), Upper Saddle River, New Jersey, 2015.
[18] D. Delen, A comparative analysis of machine learning techniques for student retention management, Decision Support Systems 49 (2010) 498–506.
[19] D. Delen, H. Zaim, C. Kuzey, S. Zaim, A comparative analysis of machine learning systems for measuring the impact of knowledge management practices, Decision Support Systems 54 (2) (2013) 1150–1160.
[20] D. Delen, C. Kuzey, A. Uyar, Measuring firm performance using financial ratios: a decision tree approach, Expert Systems with Applications 40 (10) (2013) 3970–3983.
[21] D. Delen, N. Emanet, H.R. Oz, N. Bayram, A comparative analysis of machine learning methods for classification type decision problems in healthcare, Decision Analytics 1 (1) (2014) 6–15.
[22] M. Ekkayokkaya, T. Pengniti, Governance reform and IPO underpricing, Journal of Corporate Finance 18 (2012) 238–253.
[23] C.M. Fuller, D.P. Biros, D. Delen, An investigation of data and text mining methods for real world deception detection, Expert Systems with Applications 38 (2011) 8392–8398.
[24] C. Ghosh, M. Petrova, Z. Feng, M. Pattanapanchai, Does IPO pricing reflect public information? New insights from equity carve-outs, Financial Management 41 (1) (2012) 1–33.
[25] M. Grinblatt, C.Y. Hwang, Signalling and the pricing of new issues, Journal of Finance 44 (1989) 393–420.
[26] H. Guo, T. Wang, Y. Li, H.G. Fung, Challenges to China's new stock market for small and medium-size enterprises: trading price falls below the IPO price, Technological and Economic Development of Economy 19 (Suppl. 1) (2013) S409–S424.
[27] R.G. Ibbotson, J. Sindelar, J. Ritter, The market's problem with the pricing of initial public offerings, Journal of Applied Corporate Finance 6 (1994) 66–74.
[28] N. Jegadeesh, M. Weinstein, I. Welch, An empirical investigation of IPO returns and subsequent equity offerings, Journal of Financial Economics 34 (1993) 153–175.
[29] T. Jenkinson, A. Ljungqvist, Going Public: The Theory and Evidence on How Companies Raise Equity Finance, Clarendon Press, Oxford, 2001.
[30] D.B. Kennedy, R. Sivakumar, K.R. Vetzal, The implications of IPO underpricing for the firm and insiders: tests of asymmetric information theories, Journal of Empirical Finance 13 (2006) 49–78.
[31] H. Kiymaz, The initial and aftermarket performance of IPOs in an emerging market: evidence from Istanbul stock exchange, Journal of Multinational Financial Management 10 (2) (2000) 213–227.
[32] R. Kohavi, F. Provost, Glossary of terms. Editorial for the special issue on applications of machine learning and the knowledge discovery process, Machine Learning 30 (2–3) (1998) 271–274.
[33] R. Kohavi, A study of cross-validation and bootstrap for accuracy estimation and model selection, Proceedings of the 14th International Conference on AI (IJCAI), Morgan Kaufmann, San Mateo, CA, 1995, pp. 1137–1145.
[34] G. Küçükkocaoğlu, A. Alagöz, İMKB'de uygulanan halka arz yöntemlerinin karşılaştırmalı analizi, Dokuz Eylül Üniversitesi İktisadi ve İdari Bilimler Fakültesi Dergisi 24 (2) (2009) 65–86.
[35] C. Kuzey, A. Uyar, D. Delen, The impact of multinationality on firm value: a comparative analysis of machine learning techniques, Decision Support Systems 59 (1) (2014) 127–142.
[36] M. Larraza-Kintana, R.M. Wiseman, L.R. Gomez-Mejia, T.M. Welbourne, Disentangling compensation and employment risks using the behavioral agency model, Strategic Management Journal 28 (2007) 1001–1019.
[37] X. Liu, J. Ritter, Local underwriter oligopolies and IPO underpricing, Journal of Financial Economics 102 (3) (2011) 579–601.
[38] T. Loughran, B. McDonald, IPO first-day returns, offer price revisions, volatility, and form S-1 language, Journal of Financial Economics 109 (2013) 307–326.
[39] T. Loughran, J. Ritter, Why don't issuers get upset about leaving money on the table in IPOs? Review of Financial Studies 15 (2002) 413–444.
[40] T. Loughran, J. Ritter, Why has IPO underpricing changed over time? Financial Management 33 (3) (2004) 5–37.
[41] T. Loughran, J.R. Ritter, K. Rydqvist, Initial public offerings: international insights, Pacific-Basin Finance Journal 2 (1994) 165–199.
[42] M. Lowry, G.W. Schwert, Is the IPO pricing process efficient? Journal of Financial Economics 71 (2004) 3–26.
[43] K. Keasey, P.B. McGuinness, Firm value and its relation to equity retention levels, forecast earnings disclosures and underpricing in initial public offerings in Hong Kong, International Business Review 17 (6) (2008) 642–662.
[44] D.L. Olson, D. Delen, Y. Meng, Comparative analysis of data mining models for bankruptcy prediction, Decision Support Systems 52 (2) (2012) 464–473.
[45] M. Orhan, Short and long-run performance of IPOs traded on the Istanbul stock exchange, in: G.N. Gregoriou (Ed.), Initial public offerings: An international perspective, Elsevier, U.S.A., 2006, pp. 45–55.
[46] F. Otlu, S. Ölmez, Halka ilk kez arz edilen hisse senetlerinin kısa dönem fiyat performansları ile fiyat performansını etkileyen faktörlerin incelenmesi, İMKB'de bir uygulama, Akademik Yaklaşımlar Dergisi 2 (2011) 14–43.
[47] J. Quinlan, C4.5: Programs for Machine Learning, Morgan Kaufmann, San Mateo, CA, 1993. (MA: MIT Press).
[48] J.R. Ritter, I. Welch, A review of IPO activity, pricing and allocations, Journal of Finance 57 (2002) 1795–1828.
[49] J.R. Ritter, The “hot issue” market of 1980, Journal of Business 57 (1984) 215–240.
[50] K. Rock, Why new issues are underpriced, Journal of Financial Economics 15 (1986) 187–212.
[51] A. Saltelli, S. Tarantola, F. Campolongo, M. Ratto, Sensitivity Analysis in Practice: A Guide to Assessing Scientific Models, John Wiley, 2004.
[52] D.K. Spiess, R.H. Pettway, The IPO and first seasoned equity sale: issue proceeds, owner/managers' wealth, and the underpricing signal, Journal of Banking and Finance 21 (1997) 967–988.
[53] SPSS, IBM SPSS Modeler User Manual, 2012. (Chicago, IL).
[54] A. Starr, M. Desforges, Strategies in data fusion: sorting through the tool box, Proceedings of European Conference on Data Fusion, 1998.
[55] U. Ünlü, E. Ersoy, İlk halka arzlarda düşük fiyatlama ve kısa dönem performansın belirleyicileri: 1995–2008 İMKB örneği, Dokuz Eylül Üniversitesi İktisadi ve İdari Bilimler Fakültesi Dergisi 23 (2) (2008) 243–258.
[56] J. van Bommel, T. Vermaelen, Post-IPO capital expenditures and market feedback, Journal of Banking and Finance 27 (2003) 275–305.
[57] J. van Bommel, Messages from market to management: the case of IPOs, Journal of Corporate Finance 8 (2002) 123–138.
[58] V. Vapnik, The Nature of Statistical Learning Theory, Springer-Verlag, New York, 1995.
[59] P.I. Vong, D. Trigueiros, The short-run price performance of initial public offerings in Hong Kong: new evidence, Global Finance Journal 21 (2010) 253–261.
[60] P.I. Vong, Rate of subscription and after-market volatility in Hong Kong IPO, Applied Financial Economics 16 (2006) 1217–1224.
[61] I. Welch, Seasoned offerings, imitation costs, and the underpricing of initial public offerings, Journal of Finance 44 (1989) 421–449.
[62] World Federation of Exchanges Website, http://www.world-exchanges.org/statistics/monthly-query-tool.
[63] K. Yalciner, Düşük fiyatlama olgusu ile halka arz şekilleri ve halka arz fiyatı arasındaki ilişkinin analizi: 1997–2004 dönemine ait bir inceleme, Gazi Üniversitesi İktisadi ve İdari Bilimler Fakültesi Dergisi 7 (2) (2006) 145–158.
[64] S.X. Zheng, J.P. Ogden, F.C. Jen, Pursuing value through liquidity in IPOs: underpricing, share retention, lockup, and trading volume relationships, Review of Quantitative Finance and Accounting 25 (2005) 293–312.
[65] C.T. Grammenos, N.C. Papapostolou, US shipping initial public offerings: Do prospectus and market information matter? Transportation Research Part E: Logistics and Transportation Review 48 (1) (2012) 276–295.
[66] E.W.T. Ngai, Yong Hu, Y.H. Wong, Yijun Chen, Xin Sun, The application of data mining techniques in financial fraud detection: a classification framework and an academic review of literature, Decision Support Systems 50 (2011) 559–569.
[67] M.E. Edge, P.R.F. Sampaio, The design of FFML: A rule-based policy modelling language for proactive fraud management in financial data streams, Expert Systems with Applications 39 (11) (2012) 9966–9985.
[68] F. Louzada, A. Ara, Bagging k-dependence probabilistic networks: An alternative powerful fraud detection tool, Expert Systems with Applications 39 (14) (2012) 11583–11592.
[69] SPSS, Clementine12 User Manual, 2007. Chicago, IL.

Dr. Eyup Bastı is an Associate Professor and the head of the Banking and Finance Department of the Faculty of Economics and Administrative Sciences at Fatih University, Turkey. He received his BA degree in Management from Middle East Technical University in 1996 and his MBA from Fatih University in 1998. He obtained his Ph.D. degree in Finance from Istanbul University in 2004 with a thesis titled “2001 Financial Crisis of Turkey in the Light of Financial Crises Theories: The Effects of Financial Crisis on the Efficiency of Turkish Financial Sector”. His Ph.D. thesis was published by the Capital Markets Board of Turkey. He has written numerous articles published in Turkish and foreign scientific journals. His current research interests are financial crises, corporate valuation, financial performance of firms, initial public offerings, capital structure theories and asset pricing.

Dr. Cemil Kuzey is an Assistant Professor at the Department of Management at Fatih University in Istanbul, Turkey, teaching Operations Research and Statistics for Social Sciences. He acquired his Ph.D. degree in Business Administration through the Department of Quantitative Analysis, Istanbul University, Turkey. Among his academic pursuits, he took several graduate courses at the Ontario Institute for Studies in Education, University of Toronto. His research interests are related to Operations Research, Data Mining, and Business Intelligence.

Dr. Dursun Delen is the William S. Spears Endowed Chair in Business Administration, Neal Patterson Chair in Business Analytics, Research Director for the Center for Health Systems Innovation, and Professor of Management Science and Information Systems in the Spears School of Business at Oklahoma State University (OSU). He received his Ph.D. in Industrial Engineering and Management from OSU in 1997. Prior to his appointment as an Assistant Professor at OSU in 2001, he worked for a privately-owned research and consultancy company, Knowledge Based Systems Inc., in College Station, Texas, as a research scientist for five years, during which he led a number of decision support and other information systems related research projects funded by federal agencies, including DoD, NASA, NIST and DOE. His research has appeared in major journals including Decision Support Systems, Decision Sciences, Communications of the ACM, Computers and Operations Research, Computers in Industry, Journal of Production Operations Management, Artificial Intelligence in Medicine, and Expert Systems with Applications, among others. He recently published four books: Advanced Data Mining Techniques with Springer, 2008; Decision Support and Business Intelligence Systems with Prentice Hall, 2010; Business Intelligence: A Managerial Approach, with Prentice Hall, 2010; and Practical Text Mining and Statistical Analysis for Non-structured Text Data Applications, with Elsevier, 2012. He is often invited to national and international conferences for keynote addresses on topics related to Data/Text Mining, Business Intelligence, Decision Support Systems, and Knowledge Management. He served as the general co-chair for the 4th International Conference on Network Computing and Advanced Information Management (September 2–4, 2008 in Seoul, South Korea), and regularly chairs tracks and mini-tracks at various information systems conferences. He is the associate editor-in-chief for International Journal of Experimental Algorithms, associate editor for International Journal of RF Technologies, and is on editorial boards of five other technical journals. His research and teaching interests are in data and text mining, decision support systems, knowledge management, business intelligence and enterprise modeling.
