Applying Multiple Linear Regression and Neural Network To Predict Bank Performance
International Business Research October, 2009
The objectives of this paper are to predict Malaysian bank performance using Multiple Linear Regression and Artificial
Neural Network, and to evaluate which of these methods is more powerful in predicting bank performance.
2. Literature Review
Researchers in banking and finance have indicated that bank performance is related to internal and external factors.
The internal factors relate to banks’ characteristics and external factors are described as the economic and legal
environment (Athanasoglou, Brissimis & Delis, 2008). Multiple linear regression is a very common statistical
technique for finding the determinants of bank performance; see, for example, Athanasoglou, Brissimis & Delis (2008),
Haron (2004) and Sanusi & Mohamed (2007). However, multiple linear regression analyses often produce a low
coefficient of multiple determination (R²), and the presence of outliers is a very common problem
(Midi & Imon, 2006).
The performance measures are represented by return on assets (ROA), return on equity (ROE) and return on deposits
(ROD) from balance sheets (Sanusi & Mohamed, 2007). In a panel data study of the determinants of Islamic
bank profitability, Haron (2004) found that internal factors such as liquidity, total expenditures, funds invested and
profit-sharing ratio have a significant effect on bank profitability. Interest rate, market share and bank size, described as
external factors, were also found to have a significant effect on bank profitability.
In a similar study of the determinants of bank profitability, Sanusi & Mohamed (2007) found that a bank's
characteristics and the financial structure of a country are significant variables affecting bank profitability. They also
compared the results of fixed effects and random effects on the proposed model and observed low adjusted R² values,
indicating that the significant independent variables explain only a low proportion of the variation in profitability.
Athanasoglou, Brissimis & Delis (2008) investigated the effect of bank-specific, industry-specific and
macroeconomic determinants on bank profitability in Greece. Two variables were found to have a significant effect:
labour productivity growth (positive effect) and operating expenses (negative effect). The variables used by Athanasoglou,
Brissimis & Delis (2008) are adapted in this study to perform multiple linear regression on Malaysian banks.
As for the artificial neural network, Vellido (1999) listed a variety of studies that have used this method. In banking and
finance, the artificial neural network has been used to predict bank and firm bankruptcy, predict credit card performance,
evaluate credit and detect insurance fraud. Aiken (1999) used an artificial neural network to forecast inflation and
concluded that a neural network is able to forecast the consumer price index of a country fairly accurately.
The future of the artificial neural network in finance is discussed by Brunell & Folarin (1997). They looked at the
promising performance of the artificial neural network in debt risk assessment and its ability to improve loan assessment.
They found that the artificial neural network has helped bank managers to distinguish good from bad credit risks by
estimating the likelihood that a firm or borrower will require additional capital through borrowing. The high
performance of the artificial neural network in many areas of banking and finance has led to its application
to predicting bank performance in this study.
The performance of the artificial neural network has been compared with many traditional statistical techniques. For
example, it has been compared with multiple linear regression (Nguyen & Cripps, 2001; Arulsudar,
Subramaniam & Murthy, 2005), discriminant analysis and logistic regression (Leshno & Spector, 1996), decision trees
and logistic regression (Delen, Walker & Kadam, 2005), stepwise regression and ridge regression (Chokmani, Quarda,
Hamilton, Hosni & Hugo, 2008), and logistic regression (Zhang, Hu, Patuwo & Indro, 1999). The artificial neural network
outperformed the traditional methods in all of these studies. Specifically, the artificial neural network was found to
perform better than multiple regression analysis when a moderate to large sample size is used (Nguyen &
Cripps, 2001).
Comparisons of the artificial neural network and multiple linear regression have also been made in various fields of study.
The artificial neural network has been applied extensively in predicting bankruptcy. Leshno & Spector (1996)
compared the artificial neural network with multivariate discriminant analysis and logistic regression in their study on
bankruptcy using a limited number of firms. The predictions of the artificial neural network were found to be more
accurate than those of classical discriminant analysis and logistic regression. They also concluded that an ample number of
examples must be provided for a neural network to perform at its optimum. Another study on predicting bankruptcy is that of
Boritz & Kennedy (1995), who examined different types of artificial neural network and compared them against other
bankruptcy prediction techniques such as discriminant analysis, logit and probit. The performance of the
artificial neural network was found to be affected by the choice of variables. Although the artificial neural network
outperformed the traditional methods, the latter have the advantage of being easy to understand and use.
Nguyen & Cripps (2001) examined the performance of various artificial neural network architectures. Standard back
propagation is found to perform better than other neural network architectures. The network performance is also found
to improve with training size.
Vol. 2, No. 4 International Business Research
The applications of neural networks in various fields of study have shown positive and promising results. Multiple
linear regression is a very popular method, but it is non-robust: influential outliers can affect
regression results significantly. Researchers in the field of robust statistics indicate that real data may contain about 1 to
10% outliers (Midi & Imon, 2006). The predictive ability and robustness of the artificial neural network are therefore attractive.
In this study, multiple linear regression and the artificial neural network are used to predict bank performance,
and the results of both methods are then compared. The results can be of importance in predicting bank performance in
Malaysia.
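The sensitivity of least squares to outliers can be illustrated with a minimal sketch. The data below are synthetic and chosen only for illustration (they are not taken from the study); one of ten responses, i.e. 10% of the sample, is replaced by a gross outlier, and the helper name ols_line is hypothetical.

```python
from statistics import mean

def ols_line(x, y):
    """Return (intercept, slope) of the least-squares line y = b0 + b1 * x."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope

# Synthetic data (an assumption, not the paper's data): y = 1 + 2x exactly.
x = [float(i) for i in range(1, 11)]
y_clean = [1.0 + 2.0 * xi for xi in x]

# Contaminate 1 of 10 observations (10%) with a gross outlier at the largest x.
y_outlier = y_clean[:-1] + [-20.0]

b0_clean, b1_clean = ols_line(x, y_clean)
b0_out, b1_out = ols_line(x, y_outlier)

print(f"clean fit:        intercept={b0_clean:.3f}, slope={b1_clean:.3f}")
print(f"with one outlier: intercept={b0_out:.3f}, slope={b1_out:.3f}")
```

In this constructed case a single high-leverage outlier is enough to reverse the sign of the estimated slope, which is the kind of influence the robust statistics literature warns about.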
3. Data Description and Methodology
A sample data set consisting of 13 banks for the period of 2001 – 2006 was randomly selected from a list of Malaysian
banks obtained from Bank Negara Malaysia. Data for all variables, except for GDP and CPI, were collected from the
BANKSCOPE database. Data for chosen variables were selected, calculated and transferred into an Excel spreadsheet.
Data for Gross Domestic Product (GDP) and Consumer Price Index (CPI) were obtained from the Bank Negara
Malaysia official website.
Predictor variables found to be significant in the banking and finance literature were adapted into the study. Return on
assets (ROA) was used as a measure of bank performance and seven predictor variables were chosen to be analyzed.
The chosen variables are listed in Table 1.
3.1 Multiple Linear Regression Model
Multiple linear regression analysis is a technique for modelling the linear relationship between two or more variables. It
is one of the most widely used of all statistical methods. In banking and finance literature, regression analysis is a very
common method used to find the determinants of bank performance.
The general linear regression model with normal error terms, expressed in terms of the X variables, is shown in Equation 1.
$$ Y_i = \beta_0 + \beta_1 X_{i1} + \beta_2 X_{i2} + \dots + \beta_{p-1} X_{i,p-1} + \varepsilon_i \qquad (1) $$
where, in this study, the response Y_i is ROA_i and the X variables are the seven predictors listed below:
ROA_i = Return on average assets, which serves as the performance indicator
LIQ_i = Loan-to-assets ratio as a measure of liquidity
LLOSS_i = Loan loss provision-to-loans ratio as a measure of credit risk
SIZE_i = Size of a bank based on its total assets
COSTINC_i = Cost-income ratio, which measures bank efficiency
CONC_i = A concentration ratio, calculated as the total assets of the three largest banks divided by the total assets of the
banking sector
GDP_i = Gross Domestic Product
CPI_i = Consumer Price Index
\varepsilon_i = error term
The underlying assumptions of linearity, normality, constant variance and independence of error terms must be satisfied
in order to get a more valid model. Diagnostics for the underlying assumptions must be done and remedial measures can
then be taken accordingly.
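As a minimal sketch of how the coefficients in Equation 1 can be estimated, the example below solves the normal equations X'X b = X'y in plain Python. The data are synthetic, only two predictors are used for brevity, and the function names (solve_linear_system, fit_ols) and the predictor values are assumptions made for illustration; this is not the software or data used in the study.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b_i] for row, b_i in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system: predictors may be collinear")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def fit_ols(rows, y):
    """Least-squares estimates [b0, b1, ...] for Y = b0 + b1*X1 + ... + error."""
    design = [[1.0] + list(r) for r in rows]  # prepend the intercept column
    p = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(p)] for i in range(p)]
    xty = [sum(d[i] * yi for d, yi in zip(design, y)) for i in range(p)]
    return solve_linear_system(xtx, xty)

# Synthetic observations (assumed values): two predictors standing in for
# ratios such as LIQ and COSTINC, with a noise-free response for clarity.
rows = [(60.0, 45.0), (55.0, 50.0), (70.0, 40.0),
        (65.0, 55.0), (50.0, 60.0), (75.0, 48.0)]
true_beta = [0.5, 0.02, -0.01]
y = [true_beta[0] + true_beta[1] * x1 + true_beta[2] * x2 for x1, x2 in rows]

beta_hat = fit_ols(rows, y)
print([round(b, 6) for b in beta_hat])
```

Because the synthetic response contains no error term, the estimates recover the generating coefficients; with real data the residuals would then be examined for linearity, normality, constant variance and independence, as described above.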
The artificial neural network gives highly accurate results from the inputs, and its performance improves with a
larger number of examples. An optimal number of neurons also needs to be determined, because the network tends to
memorize when too many neurons are used but can hardly generalize when too few are used. The method does not
require any distributional assumptions, and it is robust to outliers and unexpected data in the inputs. The artificial neural
network outperformed multiple linear regression in predicting bank performance; however, it gives no
explanation of the estimated parameters. Multiple linear regression, by contrast, provides decision makers with
information on the estimated parameters, but its prediction is made only on the mean
performance and thus gives a higher MSPR value.
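For reference, the mean squared prediction error (MSPR) on a validation set of n* cases is the average of the squared differences between the observed and predicted responses. The sketch below computes it for two sets of predictions; the ROA values and both prediction vectors are made-up numbers for illustration only and are not results from this study.

```python
def mspr(actual, predicted):
    """Mean squared prediction error over a validation set."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)

# Hypothetical validation-set ROA values and predictions from two models.
roa_actual = [1.2, 0.8, 1.5, 0.9]
pred_model_a = [1.0, 1.0, 1.3, 1.1]
pred_model_b = [1.1, 0.8, 1.6, 0.9]

mspr_a = mspr(roa_actual, pred_model_a)
mspr_b = mspr(roa_actual, pred_model_b)
print(f"MSPR model A: {mspr_a:.4f}")
print(f"MSPR model B: {mspr_b:.4f}")
```

The model with the smaller MSPR predicts the held-out cases more closely, which is the sense in which the two methods are compared here.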
A similar study can be performed using a larger dataset. As suggested by Kutner, Nachtsheim & Peter (2004), the
validation set should consist of about 30% of the dataset. Furthermore, three different sub-samples are required for
training, testing and validation in the artificial neural network. The effect of different years and different banks should
also be taken into consideration. Other predictor variables, such as bank ownership, bank labour growth, and macro- or
microeconomic factors, could be included, as they may help explain the total variation in bank performance.
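One way to form the three sub-samples is sketched below. It assumes the panel is pooled into 78 bank-year observations (13 banks over 6 years) and that 30% is held out for validation; the 15% testing fraction, the random seed and the function name three_way_split are arbitrary choices made for illustration, not values from the study.

```python
import random

def three_way_split(n_obs, validation_frac=0.30, test_frac=0.15, seed=0):
    """Randomly partition observation indices into train, test and validation."""
    if not 0 < validation_frac + test_frac < 1:
        raise ValueError("validation and test fractions must sum to less than 1")
    indices = list(range(n_obs))
    random.Random(seed).shuffle(indices)  # reproducible shuffle
    n_val = round(n_obs * validation_frac)
    n_test = round(n_obs * test_frac)
    validation = indices[:n_val]
    test = indices[n_val:n_val + n_test]
    train = indices[n_val + n_test:]
    return train, test, validation

# Assumed pooled panel size: 13 banks x 6 years = 78 observations.
train_idx, test_idx, val_idx = three_way_split(78)
print(len(train_idx), len(test_idx), len(val_idx))
```

A purely random split like this ignores the bank and year structure of the panel; splitting by bank or by year would be an alternative if those effects are to be taken into account.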
References
Aiken, M. (1999). Using a Neural Network to Forecast Inflation, Industrial Management & Data Systems, Vol.
99/7: 296-301.
Arulsudar, N., Subramaniam, N., & Murthy, R. S. R. (2005). Comparison of Artificial Neural Network and Multiple
Linear Regression in the Optimization of Formulation Parameters of Leuprolide Acetate Loaded Liposomes, Journal of
Pharmacy and Pharmaceutical Sciences, 8(2): 243-258.
Athanasoglou, P. P., Brissimis, S. N., & Delis, M. D. (2008). Bank-specific, Industry-specific and Macroeconomic
Determinants of Bank Profitability, International Financial Markets, Institutions & Money, Vol. 18: 121-136.
Bank Negara Malaysia at www.bnm.gov.my.
Boritz, J. E., & Kennedy, D. B. (1995). Effectiveness of Neural Network Types for Prediction of Business Failure,
Expert Systems with Applications, Vol. 9, No. 4: 503-512.
Brunell, P. R., & Folarin, B. O. (1997). Impact of Neural Networks in Finance, Neural Computation & Application, Vol.
6: 193-200.
Chokmani, K. T. Quarda, J. V., Hamilton, S., Hosni, G. M., & Hugo, G. (2008). Comparison of Ice-Affected
Streamflow Estimates Computed Using Artificial Neural Networks and Multiple Regression Techniques, Journal of
Hydrology, Vol. 349: 383-396.
Delen, D., Walker, G., & Kadam, A. (2005). Predicting Breast Cancer Survivability: A Comparison of Three Data
Mining Methods, Artificial Intelligence in Medicine, Vol. 34: 113-117.
Haron, S. (2004). Determinants of Islamic Bank Profitability, Global Journal of Finance & Economics, USA, 1(1):
11-33.
Kutner, M., Nachtsheim, C. J., & Peter, J. (2004). Applied Linear Regression Models, McGraw Hill/Irwin Series, 4th
Edition.
Leshno, M., & Spector, Y. (1996). Neural Network Prediction Analysis: The Bankruptcy Case, Neurocomputing, Vol.
10: 125-147.
Midi, H., & Imon, A. H. M. R. (2006). The Use and Abuse of Statistics, Paper presented at the National Statistics
Conference, Putrajaya International Convention Centre, Putrajaya, Malaysia, 4th-5th September 2006.
Nguyen, N., & Cripps, A. (2001). Predicting Housing Value: A Comparison of Multiple Linear Regression Analysis
and Artificial Neural Networks, Journal of Real Estate Research, 22(3): 313-336.
Sanusi, N. A., & Mohammed, N. (2007). Profitability of an Islamic Bank: Panel Evidence from Malaysia, Readings in
Islamic Economics & Finance, Chapter 6: 97-116.
Vellido, A. (1999). Neural Networks in Business: A Survey of Applications (1992-1998), Expert Systems with
Applications, Vol. 17: 51-70.
Zhang, G., Hu, M. Y., Patuwo, B. E., & Indro, D. C. (1999). Artificial Neural Networks in Bankruptcy Prediction:
General Framework and Cross-Validation Analysis, European Journal of Operational Research, 116: 16-32.