Winky K.O. Ho, Bo-Sin Tang & Siu Wai Wong (2021). Predicting property prices with machine learning algorithms. Journal of Property Research, 38(1), 48–70. https://doi.org/10.1080/09599916.2020.1832558
1. Introduction
In the contemporary digital era, useful information about society can be retrieved from a wide variety of sources and stored in structured, unstructured and semi-structured formats. In the analysis of economic phenomena or social observations, advances in technology make it possible to systematically extract the relevant information, transform it into complex data formats and structures, and then perform suitable analyses. Under these new circumstances, traditional data processing and analytical tools may not be able to capture, process and analyse highly complex information in the social and economic worlds, and new techniques have been developed to handle the colossal amount of available data.
Machine learning is one of the cutting-edge techniques that can be used to identify, interpret, and analyse hugely complicated data structures and patterns (Ngiam & Khor, 2019). It allows consequential learning and improves model predictions with a systematic input of more recent data (Harrington, 2012; Hastie et al., 2004). Machine learning, a subset of artificial intelligence (AI), attempts to train computers with new knowledge through the input of data, such as texts, images and numerical values, and to support their interaction with other computer networks. According to Feggella (2019), machine learning is ‘the science of getting computers to learn and act like humans do, and improve their learning over time in autonomous fashion, by feeding them data and information in the form of observations and real-world interactions’.
Machine learning can be broadly categorised into three types, namely supervised learning, unsupervised learning and semi-supervised learning. In essence, supervised machine learning algorithms intend to identify a function that gives accurate out-of-sample forecasts. For example, in property research, if an investigator intends to forecast housing prices $y_i$ from the physical, neighbourhood and accessibility characteristics $x_{ij}$ of a sample of $n$ apartments, one can let $L(\hat{y}_i, y_i)$ be the prediction loss function. A machine learning algorithm will then look for a function $\hat{f}$ that produces the lowest expected prediction loss $E_{(y_i, x_{ij})}\left[L\left(\hat{f}(x_{ij}), y_i\right)\right]$ on test data from the same distribution (Mullainathan & Spiess, 2017). A model trained through supervised learning is said to be successful if it can make predictions within an acceptable level of accuracy. Examples of supervised learning include linear regression and the support vector machine.
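To make this concrete, the following is a minimal sketch of supervised learning in Python with scikit-learn, fitting a candidate function on labelled training data and estimating its expected prediction loss on held-out test data. The data are simulated and the variable names are illustrative, not the housing attributes used later in this paper.

```python
# A minimal sketch of supervised learning: fit f^ on labelled training
# data and estimate the expected prediction loss on held-out test data.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3))  # x_ij: simulated housing characteristics
y = 14 + 0.2 * X[:, 0] - 0.04 * X[:, 1] + rng.normal(scale=0.1, size=500)  # y_i: simulated log prices

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)
f_hat = LinearRegression().fit(X_train, y_train)          # the learned function f^
loss = mean_squared_error(y_test, f_hat.predict(X_test))  # estimate of the expected loss
print(f"Out-of-sample squared loss: {loss:.5f}")
```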
Unsupervised learning is a branch of machine learning that handles unlabelled training data. These data carry no label that reveals their classification. Compared with supervised learning, unsupervised learning methods are applied to discover the relationships between elements in a dataset without labels. Because of this nature, unsupervised learning can discover hidden structures within data that may not be easily observed. Put simply, it is best adopted when data scientists want to uncover patterns that may not have been identified theoretically prior to the investigation. The usefulness of unsupervised learning lies in modelling the underlying structure or distribution of the data with a view to learning more about them. Unsupervised learning problems can be categorised into clustering and association (i.e. discovering rules that describe a large portion of the data) problems. Examples of unsupervised learning algorithms include k-means for clustering problems, neural networks for analysing sequences, and the Apriori algorithm for association rule learning problems.
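As a small illustration, the following Python sketch applies k-means to simulated, unlabelled observations; the two hidden groups are an assumption of the example, and the algorithm must recover them without any labels.

```python
# A sketch of unsupervised learning: k-means groups unlabelled
# observations into clusters without any target variable.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(1)
# Unlabelled data drawn from two hidden groups the algorithm must discover.
X = np.vstack([rng.normal(0, 1, (100, 2)), rng.normal(5, 1, (100, 2))])

kmeans = KMeans(n_clusters=2, n_init=10, random_state=1).fit(X)
print(kmeans.cluster_centers_)  # recovered group centres
print(kmeans.labels_[:10])      # cluster assignment of each observation
```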
Semi-supervised learning refers to the case where there is a large amount of input data (X) but only some of the data are labelled (Y). In this case, data scientists can use unsupervised learning techniques to discover and learn about the data structure. Alternatively, they can adopt supervised learning techniques to make predictions for the unlabelled data, train the model with a supervised learning algorithm, and make predictions on the test data. In the real world, most data analyses fall into this category.
Against this background, the mechanism of machine learning is drastically different from
conventional econometric techniques in social and economic analyses. In econometrics
research, economic models are built up to focus on a number of parameters that represent
the effects of change in demand and/or supply factors. Researchers have to correct for
heterogeneity and attempt to obtain heteroskedasticity-consistent standard errors and
covariances for the estimated parameters (Einav & Levin, 2014). They also have to evaluate
the robustness of the results by estimating several alternative model specifications. In contrast, machine learning approaches mostly emphasise out-of-sample predictions and rely on data-driven model selection to unveil the features with the most explanatory power (Vespignani, 2009). These approaches work with uncertainty associated
with incomplete information and/or measurement errors, and intend to model uncertainty
via the tools and techniques from probability, such as the Bayesian models. Commonly used
machine learning techniques include lasso (Mullainathan & Spiess, 2017), decision tree
(Mullainathan & Spiess, 2017), random forest (Varian, 2014; Wang & Wu, 2018), support
vector machine (Gu et al., 2011; Liu & Liu, 2010; Shinda & Gawande, 2018; Vilius &
Antanas, 2011; Wang & Wu, 2018; Xie & Hu, 2007; Zurada et al., 2011; Zhong et al., 2009),
gradient boosting machine (Ahmed & Abdel-Aty, 2013; Zhang & Haghani, 2015), K nearest
neighbours (Mukhlishin et al., 2017), and artificial neural network (Limsombunchai et al.,
2004; McCluskey et al., 2012; Mohd et al., 2019; Zurada et al., 2011).
Advanced machine learning techniques have already been applied in many domains,
such as face recognition (Vapnik & Lerner, 1963), speech analysis (Jelinek, 1998; Jurafsky
& Martin, 2008; Rabiner & Juang, 1993; Schroeder, 2004), gene identification (Golub
et al., 1999; Krause et al., 2007), protein study (Cai et al., 2003; Rausch et al., 2005),
tumour detection (Furey et al., 2000; Guyon et al., 2002; Ramaswamy et al., 2001) and diabetes (Yu et al., 2010). It is also increasingly popular to apply machine learning techniques in the appraisal and prediction of property values. The merit of this technique
is that it allows more recent property data or transactions to possibly ‘correct’ the
parameters of existing model estimations, which were based on earlier observations
(Baldominos et al., 2018; Hastie et al., 2004; Hausler et al., 2018; Rafiei & Adeli, 2016;
Rogers & Girolami, 2011; Sun et al., 2015).
In this paper, we choose three algorithms as examples to illustrate how machine learning algorithms can be utilised to predict housing prices with the input of conventional housing attributes. We use these techniques for a number of reasons. First, SVM is used because it works well with high-dimensional, semi-structured and unstructured data, and is seen as a powerful tool for making accurate predictions (Hassanien et al., 2018). In the commercial world, SVM is very popular for predicting a company’s sales volume and revenue. Second, random forest (RF) is used because, with many decision trees participating in the process, it can reduce over-fitting of the data. Previous property research has also demonstrated that random forest is a robust algorithm which provides accurate predictions (Mohd et al., 2019; Mullainathan & Spiess, 2017; Pérez-Rave et al., 2019). Third, gradient boosting machine (GBM) is used because it is considered an emerging machine learning algorithm with high flexibility. It also provides better accuracy than many other machine learning algorithms, as suggested by Kaggle data science competition results (Kaggle, 2019).
This paper is organised as follows. Section 2 reviews the use of machine learning
and the application of SVM, RF and GBM algorithms. Section 3 describes these
methodologies and examines the algorithms, optimisation and parameters. Section 4
describes the data, its definitions and sources. Section 5 presents our empirical results
based on these algorithms, and compares their results. The last section concludes the
paper.
2. Literature review
Our literature review comprises three parts. We start with SVM. The Support Vector Machine (SVM) was developed from the Statistical Learning Theory of Vapnik (1995), based upon the principle of structural risk minimisation. As an advanced machine learning algorithm, it has been extensively used to perform data classification, clustering and prediction. SVM is a supervised learning technique that can be considered a new method for training classifiers based on polynomial functions, radial basis functions, neural networks, splines or other functions (Cortes & Vapnik, 1995). SVM uses a linear separating hyperplane to create a classifier. If a simple separating plane fails, SVM can perform a non-linear transformation of the original input space into a high-dimensional feature space. The separating planes should have a maximal margin so as to contain all the points with a very small error and to provide space for newly arriving data (Rychetsky, 2001). Moreover, SVM can also be used to perform regression on continuous and categorical variables, in which case it is termed support vector regression (SVR).
SVM was used by Boser et al. (1992) to recognise handwriting patterns. It has since become very popular in a wide variety of applications, including handwriting recognition, face detection and text classification. In property research, Li et al. (2009)
have used support vector regression (SVR) to forecast property prices in China using
quarterly data from 1998 to 2008. In their study, five features were selected to predict the
property prices as the output variable of the SVR. Based on conventional evaluation
criteria, such as the mean absolute error (MAE), the mean absolute percentage error
(MAPE) and the root mean squared error (RMSE), they concluded that the SVR model is
an excellent technique to predict property prices.
Rafiei and Adeli (2016) have used SVR to determine whether a property developer
should build a new development or stop the construction at the beginning of a project
based on the prediction of future housing prices. Using data from 350 condos built in
Tehran (Iran) for the period between 1993 and 2008, their study trained a model with 26
features, such as ZIP code, gross floor area, lot area, estimated construction cost, duration
of construction, property prices of nearby housing developments, currency exchange rate
and demographic factors. Their results have demonstrated that SVR is an appropriate
method to make predictions of housing prices since the prediction loss (error) is as low as
3.6% of the test data. The estimation results, therefore, provide valuable inputs to the decision-making of property developers.
Second, the random forest approach was first proposed by Breiman (2001), blending classification and regression trees (Breiman et al., 1984) with bootstrap aggregation (Breiman, 1996). It is an ensemble classifier or regressor that employs multiple models of T decision trees to obtain better prediction performance. The method creates many trees, and utilises a bootstrap technique to train each tree on a resample of the original set of training data. It searches a random subset of the features to obtain a split at each node. The bootstrap technique for generating each regression tree, together with the random subset of features considered at each node, reduces the correlations between the generated regression trees. Hence, random forest averages the prediction responses to reduce the variance of the model errors (Pal, 2017).
Wang and Wu (2018) employ 27,649 housing assessment price records from Arlington County, Virginia, USA in 2015, and suggest that random forest outperforms linear regression in terms of accuracy (see also Muralidharan et al., 2018). Based on a set of features, such as the number of bedrooms, floor level, building age and floor area, Mohd et al. (2019) utilise several machine learning algorithms to predict housing prices in Petaling Jaya, Selangor, Malaysia. Their study compares the results of random forest, decision tree, ridge regression, linear regression and LASSO, and concludes that random forest is the most preferred in terms of overall accuracy, as evaluated by the root mean squared error (RMSE).
Using 1,970 housing transaction records, Koktashev et al. (2019) attempt to estimate
the housing values in the city of Krasnoyarsk. Their study considers the number of
rooms, total area, floor, parking, type of repair, number of balconies, type of bathroom,
number of elevators, garbage disposal, year of construction and the accident rate of the
house as features. It applies random forest, ridge regression and linear regression to
predict property prices. Their study concludes that random forest outperforms the other
two algorithms, as evaluated by the mean absolute error (MAE).
Third, the idea of boosting algorithms was developed by Breiman (1997, 1998) as a numerical optimisation technique in function space, and these algorithms can solve many classification and regression problems. Gradient boosted classifiers belong to a group of machine learning algorithms that combine many weak learning models, typically decision trees, to create a strong predictive model. Gradient boosting models have gained popularity because they are effective in classifying complex datasets, and they have recently been used to win many data science competitions, such as those hosted on Kaggle (Kaggle, 2019).
In property research, Park and Bae (2015) develop a housing price prediction model with machine learning algorithms, such as C4.5, RIPPER, Naïve Bayesian and AdaBoost, and compare their performance in terms of classification accuracy. Their study aims to help property sellers and real estate agents make rational decisions in property transactions. The experiments demonstrate that the RIPPER algorithm consistently outperforms the other models in housing price prediction accuracy.
Satish et al. (2019) utilise several machine learning algorithms, namely linear regression, LASSO and gradient boosting, to predict house prices. Their results show that the LASSO regression algorithm outperforms the others in terms of accuracy. Utilising a sample of 89,412 housing transaction records from three counties in Los Angeles, California, Huang (2019) employs both linear and non-linear machine learning methods to predict the log error of home prices. Incorporating features such as the number of bedrooms, number of bathrooms, estate condition, completion year and total tax-assessed value, the study compares the results of linear regression, decision tree, gradient boosting, random forest and support vector machine. Surprisingly, all these methods tend to underestimate the Zestimate prediction error. The study concludes that Zestimate already provides a relatively accurate housing price prediction.
Machine learning algorithms generate many benefits in research. On the one hand, machine learning techniques provide more flexible and sophisticated estimation procedures to manage and analyse tremendously large amounts of data. On the other hand, large datasets sometimes contain extremely complex relationships among variables that linear model estimations in traditional techniques, such as ordinary least squares, may not be able to identify. Many machine learning algorithms, such as decision trees and the support vector machine (SVM), allow investigators to capture and model complex relationships (Varian, 2014). There are many ways to make use of machine learning to perform data analysis, such as analysing demographic data, texts and images, and then using them to make predictions.
There are several advantages of using machine learning methodologies to estimate property prices. First, unlike conventional econometric models, machine learning algorithms do not require the training data to be normally distributed. Many statistical tests rely on the assumption of normality; if the data are not normally distributed, those tests become invalid. Second, we use the iterative process of stochastic gradient descent in the SVM algorithm and grid search in RF and GBM. These processes used to take a long time, but they can now be completed quickly with the high-speed computational power of modern computers, so these techniques are far less costly and time-consuming today. Third, the estimation errors can be driven to a minimum for both the training and test data sets.
3. Methodology
3.1. Support vector machine
SVM is mostly used as a classifier to find a hyperplane in an n-dimensional space (where n is the number of features) that can separate objects of different categories while maximising the distance between the data points of each category and the separating hyperplane (Noble, 2004). This classification can be achieved within the same feature space or by projecting the features into a higher-dimensional space. Noble (2006) suggests that data are more easily separable in higher-dimensional spaces because, for any data set, there exists a kernel function that allows it to be linearly separated. However, the task of identifying the kernel function is normally completed by trial and error.
SVM can also be used as a regression method, namely support vector regression (SVR), in which the input x is mapped onto an m-dimensional feature space via a nonlinear mapping so as to construct a linear model in that feature space. A parameter ε is introduced to ensure that the learned function does not deviate from the output variable by more than ε for each instance. The first step is to apply SVR to train the model as a supervised learning method. Equation (1) is specified as follows (for detailed discussions of SVR, see Basak et al., 2007; Mu et al., 2014):
$$\min_{w} \ \frac{1}{2}\|w\|^2 + C\sum_{i=1}^{n}\left(\xi_i + \xi_i^*\right)$$

$$\text{s.t.} \quad y_i - f(x_i; w) \le \varepsilon + \xi_i^*, \qquad f(x_i; w) - y_i \le \varepsilon + \xi_i, \qquad \xi_i, \xi_i^* \ge 0, \ i = 1, 2, \ldots, n \tag{1}$$
where $\|w\|^2$ is the model complexity term; $\varepsilon$ defines the insensitive loss zone; $\xi_i$ is the slack variable that measures the degree of misclassification of data point $i$; and $C$ is the cost parameter in the objective function. A non-zero $\xi_i$ is multiplied by the cost parameter $C$, so the optimisation becomes a trade-off between the maximal margin and the cost penalty.
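For illustration, the following is a minimal scikit-learn sketch of the ε-insensitive SVR of Equation (1) on simulated data. The RBF kernel choice and the values of C and ε are illustrative assumptions, not the settings used in this study.

```python
# A sketch of epsilon-insensitive support vector regression (Equation (1)).
# C penalises the slack variables; epsilon sets the insensitive zone width.
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

rng = np.random.default_rng(2)
X = rng.uniform(0, 10, size=(200, 1))
y = np.sin(X).ravel() + rng.normal(scale=0.1, size=200)

# A linear model in the feature space induced by the RBF kernel mapping g_j.
svr = make_pipeline(StandardScaler(), SVR(kernel="rbf", C=1.0, epsilon=0.1))
svr.fit(X, y)
print(svr.predict(X[:5]))
```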
The optimisation can be transformed into its dual problem, whose solution is specified as Equation (2):

$$f(x) = \sum_{i=1}^{n_{SV}} \left(\alpha_i - \alpha_i^*\right) K(x_i, x) \quad \text{s.t.} \ 0 \le \alpha_i \le C, \ 0 \le \alpha_i^* \le C \tag{2}$$

where $n_{SV}$ is the number of support vectors and $K(x_i, x)$ is the kernel function, as stated in Equation (3):

$$K(x_i, x) = \sum_{j=1}^{m} g_j(x)\, g_j(x_i) \tag{3}$$
When we estimate the SVM, our data set is divided into k = 5 equal subsets (folds), each of which is used as a test set at some point. We start with the first fold (k = 1) as the test set, while the rest are used to train the model. In the second iteration, the second fold (k = 2) is taken as the test data, while the rest are taken as the training data. The process repeats until every fold has served as the test data. Each iteration gives an $R^2$ score, and we then compute their mean value to determine the overall accuracy of our model.
This resampling procedure is called cross-validation and is designed for evaluating machine learning models on a subsample, or training set in machine learning terminology (say, 80% of the whole sample). The procedure utilises the training data set to estimate how well the model generally predicts, and then makes predictions using the test data set (the remaining 20% of the whole sample). Its goal is to define how many observations should be used to test the model in the training phase in order to minimise issues like overfitting (the model performs well on the training set but poorly on the test set) and underfitting (the model performs poorly on both the training and test sets). As a result, data scientists can obtain some insight into how well the model generalises to an independent data set. By employing k-fold cross-validation, we build k different models so that all our data can be utilised for both training and testing while evaluating our algorithms on unseen data.
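A minimal Python sketch of this fivefold procedure follows; the linear SVR and the simulated data are illustrative assumptions.

```python
# A sketch of 5-fold cross-validation: each fold serves once as the test
# set, and the mean R^2 across folds summarises overall model accuracy.
import numpy as np
from sklearn.model_selection import KFold, cross_val_score
from sklearn.svm import SVR

rng = np.random.default_rng(3)
X = rng.normal(size=(300, 4))
y = X @ np.array([0.2, -0.04, 0.03, -0.08]) + rng.normal(scale=0.1, size=300)

cv = KFold(n_splits=5, shuffle=True, random_state=3)
scores = cross_val_score(SVR(kernel="linear", C=1.0), X, y, cv=cv, scoring="r2")
print(scores)         # one R^2 per fold
print(scores.mean())  # overall accuracy estimate
```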
After computing the predicted values for our training data set by training our models as supervised learners in Python, we obtain $\theta = (X^TX)^{-1}X^Ty$, which minimises the cost value for the training set. We then plug the coefficients (or weights) into our models, using the test data to see whether our estimates remain accurate, as evaluated by the mean squared error (MSE), root mean squared error (RMSE), mean absolute percentage error (MAPE) and the coefficient of determination $R^2$. The first three performance metrics (Equations (4)–(6)) range between 0 and ∞, and for each of them a value of 0 indicates a perfect fit.
$$\text{MSE} = \frac{1}{m}\sum_{i=1}^{m}\left(h\left(x^{(i)}\right) - y^{(i)}\right)^2 \tag{4}$$

$$\text{RMSE} = \sqrt{\frac{1}{m}\sum_{i=1}^{m}\left(h\left(x^{(i)}\right) - y^{(i)}\right)^2} \tag{5}$$

$$\text{MAPE} = \frac{100\%}{m}\sum_{i=1}^{m}\left|\frac{h\left(x^{(i)}\right) - y^{(i)}}{y^{(i)}}\right| \tag{6}$$

where $h(x^{(i)})$ represents the predicted value of property $i$; $y^{(i)}$ represents its actual value; and $m$ represents the number of observations in the test data.
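To make the computation concrete, the following sketch evaluates Equations (4)–(6) on a few hypothetical actual and predicted log-price values.

```python
# A sketch of the performance metrics in Equations (4)-(6), computed on
# hypothetical actual and predicted test-set values (in logs).
import numpy as np

y_actual = np.array([14.2, 14.8, 15.1, 14.5])    # y^(i): actual values
y_pred = np.array([14.25, 14.74, 15.18, 14.46])  # h(x^(i)): predictions

mse = np.mean((y_pred - y_actual) ** 2)                       # Equation (4)
rmse = np.sqrt(mse)                                           # Equation (5)
mape = 100 * np.mean(np.abs((y_pred - y_actual) / y_actual))  # Equation (6)
print(f"MSE={mse:.5f}, RMSE={rmse:.5f}, MAPE={mape:.5f}%")
```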
Estimating the SVM provides us with an $R^2$ for the test set of our model. Following this stage (usually labelled model validation) is the estimation of stochastic gradient descent (SGD) to minimise the errors (prediction loss, in machine learning terminology) when predicting property prices. SGD is particularly attractive when the training data set is very large, since it approximates the true gradient by updating the parameters from a single training example at a time. It is, in fact, a variant of gradient descent (GD).
SGD updates the weights based on the errors accumulated over the samples $x^{(i)}$ through $\Delta w$. Equation (7) implies that we update the weights with a single training sample at a time instead of the full training set, and the algorithm performs this update for each training example. The updates continue until the algorithm converges.

$$\Delta w = \alpha \nabla J = \alpha\left(y^{(i)} - \tau\left(w^{T}x^{(i)}\right)\right)x^{(i)} \tag{7}$$
where $\alpha \nabla J$ is the stochastic gradient descent update. To implement stochastic gradient descent, we need to specify a learning rate $\alpha$. In this study, we choose an adaptive learning rate, which does not have to be a fixed value: it is a small initial value that increases according to the identification procedure during the training process. Equation (8) specifies the adaptive learning rate used in our estimation. It starts with an arbitrary initial value $e(0)$, and the learning rate $e(t)$ will converge to zero within a finite time period.
$$\dot{\alpha} = \gamma(I + 2)|e| - \nu\gamma\alpha, \qquad 0 < \gamma, \nu \tag{8}$$
where $\gamma$ and $\nu$ are slightly larger than zero. Once we have computed the predicted values for our training data set, we can obtain $\theta = (X^TX)^{-1}X^Ty$, which minimises the cost value for the training set. The estimated weights can then be plugged into our models employing the test data to obtain the weight for each feature and the $R^2$ for the test data. We can then see whether our estimates remain accurate for the test data, as evaluated by the MSE, RMSE and MAPE shown in Equations (4)–(6).
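As an illustration, the following scikit-learn sketch fits a linear model by SGD with the library's built-in adaptive learning-rate schedule, which only loosely mirrors Equation (8); the data and parameter values are illustrative assumptions.

```python
# A sketch of SGD-based linear regression: weights are updated from one
# training example at a time, with an adaptive learning-rate schedule.
import numpy as np
from sklearn.linear_model import SGDRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(4)
X = rng.normal(size=(1000, 4))
y = X @ np.array([0.22, -0.036, 0.033, -0.079]) + rng.normal(scale=0.1, size=1000)

# eta0 is the small initial learning rate; 'adaptive' keeps it constant
# while the loss improves, then shrinks it when improvement stalls.
sgd = make_pipeline(
    StandardScaler(),
    SGDRegressor(learning_rate="adaptive", eta0=0.01,
                 max_iter=1000, tol=1e-4, random_state=4),
)
sgd.fit(X, y)
print(sgd.named_steps["sgdregressor"].coef_)  # estimated weights
```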
3.2. Random forest

For every tree, we obtain a sequence of instances randomly sampled with replacement from the training set. Each sequence of instances corresponds to a random vector $\phi_k$ that forms a specific tree. Since the sequences will not be exactly the same, the decision trees constructed from them will also vary slightly. De Aquino Afonso et al. (2020) suggest that the prediction of the k-th tree for an input X can be represented by Equation (9):

$$h_k(X) = h(X; \phi_k), \qquad \forall k \in \{1, 2, \ldots, K\} \tag{9}$$
where K is the number of trees. At each split, a tree randomly selects a subset of features to avoid correlations among the trees. Alpaydin (2009) points out that a node S can be split into two subsets, $S_1$ and $S_2$, by selecting a threshold c that minimises the sum of squared errors:

$$SSE = \sum_{i \in S_1}\left(v_i - \frac{1}{|S_1|}\sum_{j \in S_1} v_j\right)^2 + \sum_{i \in S_2}\left(v_i - \frac{1}{|S_2|}\sum_{j \in S_2} v_j\right)^2 \tag{10}$$
By following the same decision rules, we can compute the prediction at any subtree as the mean or median output of its instances. Finally, we obtain the final prediction as the average of each tree's output, as stated in Equation (11):

$$h(X) = \frac{1}{K}\sum_{k=1}^{K} h_k(X) \tag{11}$$
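The following scikit-learn sketch illustrates Equations (9)–(11) on simulated data: each tree is trained on a bootstrap sample with a random feature subset considered at each split, and the forest prediction is the average of the individual tree outputs. All parameter values are illustrative.

```python
# A sketch of Equations (9)-(11): a forest of K bootstrapped trees whose
# averaged outputs form the final prediction h(X).
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(5)
X = rng.normal(size=(500, 6))
y = 0.5 * X[:, 0] + 0.1 * X[:, 1] ** 2 + rng.normal(scale=0.1, size=500)

rf = RandomForestRegressor(
    n_estimators=100,     # K trees
    max_features="sqrt",  # random feature subset at each split
    bootstrap=True,       # each tree sees a resample of the training data
    random_state=5,
).fit(X, y)

# The forest prediction is the mean of the individual tree outputs h_k(X).
per_tree = np.stack([tree.predict(X[:3]) for tree in rf.estimators_])
print(per_tree.mean(axis=0))  # matches rf.predict(X[:3])
print(rf.predict(X[:3]))
```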
3.3. Gradient boosting machine

Under the empirical risk minimisation principle, we can employ Equation (13) to search for an approximation that minimises the mean value of the loss function over the training set. We start with a model consisting of a constant function, Equation (13), and then expand it incrementally as in Equation (14):

$$F_0(x) = \arg\min_{\theta} \sum_{i=1}^{n} L(y_i, \theta) \tag{13}$$

$$F_m(x) = F_{m-1}(x) + \arg\min_{h_m \in H}\left[\sum_{i=1}^{n} L\big(y_i, F_{m-1}(x_i) + h_m(x_i)\big)\right] \tag{14}$$
where $h_m \in H$ is a base learner function. Swathi and Shravani (2019) suggest that although the best function $h$ at each step for the loss function $L$ cannot be specified exactly, data scientists can apply a steepest descent step to this minimisation problem. If $H$ is assumed to be the set of arbitrary differentiable functions, the model can be updated with the following equations:

$$F_m(x) = F_{m-1}(x) - \theta_m \sum_{i=1}^{n} \nabla_{F_{m-1}} L\big(y_i, F_{m-1}(x_i)\big) \tag{15}$$

$$\theta_m = \arg\min_{\theta} \sum_{i=1}^{n} L\Big(y_i, F_{m-1}(x_i) - \theta\, \nabla_{F_{m-1}} L\big(y_i, F_{m-1}(x_i)\big)\Big) \tag{16}$$

where the derivatives are taken with respect to the functions $F_i$ for $i \in \{1, \ldots, m\}$, and $\theta_m$ is the step length.
However, this approach offers only an approximation to the minimisation problem. Estimating the GBM provides an $R^2$ for the test set of a model, and its performance can be evaluated by the MSE, RMSE and MAPE, respectively. Finally, we can use grid search to improve the accuracy of the model by tuning its hyperparameters.
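For illustration, the following sketch fits a gradient boosting regressor and tunes a few hyperparameters by grid search on simulated data; the grid values are illustrative, not those used in this paper.

```python
# A sketch of gradient boosting regression (Equations (13)-(16)) with a
# small grid search over hyperparameters; grid values are illustrative.
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import GridSearchCV

X, y = make_regression(n_samples=500, n_features=6, noise=0.1, random_state=6)

grid = {
    "n_estimators": [100, 300],    # number of boosting stages m
    "learning_rate": [0.05, 0.1],  # shrinkage applied to each stage
    "max_depth": [2, 3],           # depth of each base learner h_m
}
search = GridSearchCV(GradientBoostingRegressor(random_state=6),
                      grid, cv=5, scoring="r2").fit(X, y)
print(search.best_params_, search.best_score_)
```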
To conclude, we summarise the advantages and disadvantages of support vector
machine, random forest and gradient boosting machine in Table 1.
4. Data sources
Machine learning algorithms are used to predict property prices in a housing district of
Hong Kong. Conventional housing attributes that influence housing prices are included
in the models. The variables in our price estimation models are defined as follows:
LRPi represents the transaction price (total consideration) of residential property i,
which is measured in HK dollars, in natural logarithm.
LGFAi is defined as the floor area of property i, in natural logarithm.
LAGEi represents the age of residential property i in years, which can be measured by
the difference between the date of issue of the occupation permit and the date of
transaction, in natural logarithm.
FLi is the actual floor level of property i.
LCENTRALi is a proxy for accessibility of residential property i. It measures the time
spent on travelling from a specific housing estate to the Central District, which is
measured in minutes, and in natural logarithm.
$E_i, S_i, W_i, N_i, NE_i, SE_i, SW_i$ and $NW_i$ are dummy variables representing the orientation that property i faces. Orientation is divided into eight categories: East, South, West, North, Northeast, Southeast, Southwest and Northwest, respectively.
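A sketch of how these variables might be constructed from a raw transactions table follows; the column names and values are hypothetical and are not the authors' data.

```python
# A sketch of constructing the model variables from a hypothetical raw
# transactions table; all column names and values are illustrative.
import numpy as np
import pandas as pd

raw = pd.DataFrame({
    "price_hkd": [6_500_000, 9_200_000],
    "gfa_sqft": [520, 780],
    "age_years": [12.0, 3.5],
    "floor": [18, 7],
    "mins_to_central": [25, 40],
    "orientation": ["SE", "N"],
})

df = pd.DataFrame({
    "LRP": np.log(raw["price_hkd"]),             # log transaction price
    "LGFA": np.log(raw["gfa_sqft"]),             # log gross floor area
    "LAGE": np.log(raw["age_years"]),            # log building age
    "FL": raw["floor"],                          # actual floor level
    "LCENTRAL": np.log(raw["mins_to_central"]),  # log travel time to Central
})
# One dummy per orientation category (E, S, W, N, NE, SE, SW, NW).
df = df.join(pd.get_dummies(raw["orientation"]))
print(df)
```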
Table 3. Correlation Matrix.
LRP LGFA LAGE FL LCENTRAL E S W N NE SE SW NW
LRP 1.00000 0.84094 −0.39962 0.28535 −0.37285 0.01138 0.10823 0.00623 −0.03440 −0.03490 0.05026 −0.00662 −0.04842
LGFA 0.84094 1.00000 −0.27203 0.14673 −0.08690 0.01028 0.09553 0.00038 −0.00276 −0.06329 0.08272 −0.07403 0.01373
LAGE −0.39962 −0.27203 1.00000 −0.20894 0.14738 0.01629 0.02560 0.01336 0.04687 0.00636 −0.04232 0.00219 −0.01511
FL 0.28535 0.14673 −0.20894 1.00000 −0.11582 −0.00154 −0.00618 0.02043 −0.01241 0.00330 −0.01001 −0.00679 0.01204
LCENTRAL −0.37285 −0.08690 0.14738 −0.11582 1.00000 −0.02565 −0.02770 −0.03984 −0.01120 −0.05598 0.10577 −0.05232 0.05944
E 0.01138 0.01028 0.01629 −0.00154 −0.02565 1.00000 −0.04062 −0.04390 −0.03870 −0.11313 −0.10036 −0.11264 −0.10912
S 0.10823 0.09553 0.02560 −0.00618 −0.02770 −0.04062 1.00000 −0.04044 −0.03565 −0.10421 −0.09244 −0.10375 −0.10051
W 0.00623 0.00038 0.01336 0.02043 −0.03984 −0.04390 −0.04044 1.00000 −0.03853 −0.11264 −0.09992 −0.11215 −0.10864
N −0.03440 −0.00276 0.04686 −0.01241 −0.01120 −0.03870 −0.03565 −0.03853 1.00000 −0.09929 −0.08808 −0.09886 −0.09577
NE −0.03490 −0.06329 0.00636 0.00330 −0.05598 −0.11313 −0.10421 −0.11264 −0.09929 1.00000 −0.25749 −0.28900 −0.27997
SE 0.05026 0.08272 −0.04232 −0.01001 0.10577 −0.10036 −0.09244 −0.09992 −0.08808 −0.25749 1.00000 −0.25637 −0.24836
SW −0.00662 −0.07403 0.00219 −0.00679 −0.05232 −0.11264 −0.10375 −0.11215 −0.09886 −0.28900 −0.25637 1.00000 −0.27875
NW −0.04842 0.01373 −0.01511 0.01204 0.05944 −0.10912 −0.10051 −0.10864 −0.09577 −0.27997 −0.24836 −0.27875 1.00000

5. Empirical results

Table 4 presents the estimations of $R^2$ for the test set under the SVM, RF and GBM algorithms, together with the performance metrics MSE, RMSE and MAPE. Our discussion starts with SVM. The best $R^2$ score for a linear SVR model ($C = 1$) is 0.82702 for the test set, while the intercept and coefficient estimates are as follows:
Intercept: 14.85848; $LGFA_i$: 0.21911; $LAGE_i$: −0.03580; $FL_i$: 0.03334; $LCENTRAL_i$: −0.08108
The results are then evaluated by the MSE, RMSE and MAPE criteria. These values are estimated to be 0.01423, 0.11929 and 0.54571%, respectively, demonstrating that SVM fits our data very well. Following this is the estimation of stochastic gradient descent. Employing 5-fold cross-validation, it takes 37 epochs to converge in 0.05 seconds. The intercept and coefficient estimates in the SGD model are as follows:

Intercept: 14.85929; $LGFA_i$: 0.21966; $LAGE_i$: −0.03618; $FL_i$: 0.03323; $LCENTRAL_i$: −0.07880

Stochastic gradient descent provides an explanatory power ($R^2$) of 0.82715 for the test set. The model is also evaluated by the same MSE, RMSE and MAPE criteria, which are estimated to be 0.01422, 0.11925 and 0.54467%, respectively, implying a reasonably good fit.
It is quite tempting to interpret the estimated weight of each feature as an elasticity. However, this is not valid for stochastic gradient descent: the estimated weights should not be read as the magnitudes of the features' effects on real estate prices. In an iterative algorithm, the 'optimal' weight for each feature is obtained by fine-tuning an arbitrary initial value on a function and descending its slope step by step until it reaches the lowest point of the function. Hence, our focus should be on the explanatory power $R^2$ and the error minimisation of the model.

Based on our estimation results, Figure 4 portrays the scatterplot of property prices in logs ($LRP_i$) against the residuals. It shows that SVM fits the data very well most of the time. The model underperforms, with high bias, for extreme values of $LRP_i$ (note the green and orange dots indicating housing prices larger than 16.0 in logs). Figure 4 further shows the relationship between actual prices $LRP_i$ and their predicted values $\widehat{LRP}_i$. Most of our predicted values follow the red line closely, demonstrating that our model fits the data sufficiently well.
Using random forest, our base model provides an $R^2$ as high as 0.89690 for the test set, while the MSE, RMSE and MAPE are estimated to be 0.00848, 0.09210 and 0.33348%, respectively. In random forest, the estimation of SGD is not feasible because some of the hyperparameters, such as the number of estimators, are not continuous. Instead, we employ grid search to minimise the model errors by tuning the hyperparameters.
For hyperparameter tuning, we perform many iterations of the entire fivefold cross-validation process, each with different model settings, such as the number of estimators, minimum samples per split, minimum samples per leaf, maximum features and maximum depth, as sketched below. We then compare all these models, pick the best one, train it on the training set, and evaluate it on the test set.
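A sketch of this grid search follows, on simulated data; the candidate values are illustrative and are not those tuned in this study.

```python
# A sketch of fivefold grid search over random forest hyperparameters;
# the candidate values below are illustrative only.
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import GridSearchCV

X, y = make_regression(n_samples=600, n_features=8, noise=0.1, random_state=7)

grid = {
    "n_estimators": [200, 500],
    "min_samples_split": [2, 5],
    "min_samples_leaf": [1, 2],
    "max_features": ["sqrt", 1.0],
    "max_depth": [None, 10],
}
search = GridSearchCV(RandomForestRegressor(random_state=7), grid,
                      cv=5, scoring="r2").fit(X, y)
print(search.best_params_)  # best settings, refitted on the full training set
```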
Using grid search with 5-fold CV, we obtain an $R^2$ of 0.90333 for the test set (slightly higher than the base model). The MSE, RMSE and MAPE are estimated to be 0.00795, 0.08918 and 0.32270%, respectively, all lower than those of the base model. Hence, we take the grid search results as the best results associated with random forest.
Figure 4 also shows the scatterplot of property prices in logs ($LRP_i$) against the residuals for RF. Although two dots (with housing prices larger than 16 in logs) lie far from the cluster, the model fits the data very well most of the time; it underperforms, with high bias, only for extreme values of $LRP_i$ (i.e. one yellow dot and one blue dot). Most of our predicted values lie very close to, or even on, the red line, implying a very good fit of the data. Figure 4 shows that the predicted values lie closer to the red line under RF than under SVM, implying that RF fits our data better than SVM.
Lastly, we discuss the results of the gradient boosting machine (GBM), more precisely gradient boosting regression. In the base model, GBM provides an $R^2$ of 0.89431 for the test set, while the MSE, RMSE and MAPE are estimated to be 0.00870, 0.09325 and 0.35700%, respectively. GBM thus fits our data reasonably well. To obtain even better results, we then employ grid search with 5-fold CV to minimise the model errors by tuning the hyperparameters, including the subsample fraction, learning rate, number of estimators, minimum samples per split, minimum samples per leaf, maximum features and maximum depth. We then obtain an $R^2$ as high as 0.90365 for the test set, while the MSE, RMSE and MAPE are estimated to be 0.00793, 0.08903 and 0.32251%, respectively. The grid search results are the best results associated with gradient boosting regression.

Most of the time, GBM also fits the data very well (see Figure 4). However, the model does not perform well, exhibiting high bias, for some extreme values of $LRP_i$: two dots (with housing prices larger than 16 in logs) lie far from the cluster. Again, the model achieves a very good fit, since most of the predicted values lie very close to or around the red line, and its performance is slightly better than that of RF with respect to the MSE, RMSE and MAPE criteria.
Finally, we compare the results of these estimation techniques. In terms of explanatory power, this study shows that RF and GBM perform equally well. In terms of error minimisation, GBM is slightly better than RF with respect to the three performance metrics, and both outperform SVM. Hence, we demonstrate that RF and GBM are very powerful tools for making accurate predictions of property prices, with the two algorithms performing similarly to each other. In comparison, the performance of SVM falls below that of RF and GBM in all aspects. However, this does not imply that SVM is necessarily inferior.
The existing literature reports mixed results about which algorithm makes superior predictions. Some studies suggest that the artificial neural network (ANN) performs better than RF (see Masías et al., 2016; Mimis et al., 2013), but others demonstrate that both models have comparable predictive power and are almost equally applicable (Ahmad et al., 2017). In a study forecasting daily lake water levels, RF exhibited the best results compared with linear regression, SVM and ANN with respect to $R^2$ and RMSE (Li et al., 2016). Utilising RF, stochastic gradient boosting and SVM to predict genomic breeding values, Ogutu et al. (2011) conclude that boosting performed best, followed by SVM and then RF; in their study, SVM performed better than RF (see also Caruana & Niculescu-Mizil, 2006).
Hence, it is difficult to conclude which algorithm is best in all cases because the results tend to be data-specific. What criteria should property valuers consider when choosing an algorithm to make predictions? We suggest that the choice of algorithm depends on a number of factors, including the size of the data set, computing power, time constraints and the researcher's knowledge of machine learning. If the data set is large and keeps growing, SVM should be a good choice because its computation takes less time than the other algorithms, especially when computing speed is a major concern. If prediction accuracy is the priority, we recommend that property valuers utilise ANN, RF or GBM, although their computations often take much longer than SVM. Needless to say, it is always good practice to use more than one algorithm and compare the predictions.
6. Conclusions
Improvements in computing technology have made it possible to examine social information that could not previously be captured, processed and analysed. New analytical techniques of machine learning can be used in property research. This study is an exploratory attempt to use three machine learning algorithms to estimate housing prices and to compare their results.

In this study, our models are trained on 18 years of housing property data using stochastic gradient descent (SGD)-based support vector regression (SVM), random forest (RF) and gradient boosting machine (GBM). We have demonstrated that advanced machine learning algorithms can achieve very accurate predictions of property prices, as evaluated by the performance metrics. Given the dataset used in this paper, our main conclusion is that RF and GBM generate comparably accurate price estimations with lower prediction errors than SVM.
First, our study has shown that advanced machine learning algorithms like SVM, RF and GBM are promising tools for property researchers to use in housing price prediction. However, we must be cautious that these machine learning tools also have their own limitations. There are often many potential features for researchers to choose from and include in the models, so very careful feature selection is essential.
Second, many conventional estimation methods produce reasonably good estimates of the coefficients that unveil the relationship between the output variable and the predictor variables. These methods are intended to explain real-world phenomena and to make predictions; they are used for developing and testing theories through causal explanation, prediction and description (Shmuell, 2010). Based on these estimates, investigators can interpret the results and make policy recommendations. Machine learning algorithms, however, are often not developed for these purposes. Although machine learning can produce model predictions with tremendously low errors, the estimated coefficients (or weights, in machine learning terminology) derived by the models can be hard to interpret.
Third, the computation of machine learning algorithms often takes much longer than conventional methods such as the hedonic pricing model. The choice of algorithm depends on a number of factors, such as the size of the data set, the computing power of the equipment, and how long one can wait for results. We recommend that property valuers and researchers use SVM for making forecasts if speed is a primary concern. When predictive accuracy is the key objective, RF and GBM should be considered instead.
To conclude, the application of machine learning in property research is still at an early stage. We hope this study has moved a small step ahead in providing some methodological and empirical contributions to property appraisal, and in presenting an alternative approach to the valuation of housing prices. Future research may incorporate additional property transaction data from a larger geographical area with more features, or analyse other property types beyond housing developments.
Acknowledgments
The work in this study is supported by the General Research Fund of the Research Grants Council of
the Hong Kong Special Administrative Region Government (Project Number 17202818). The
authors are grateful for the comments and suggestions of the anonymous reviewers on the initial
version of this manuscript. All remaining errors and omissions are the responsibilities of the authors.
Disclosure statement
The authors declare there is no conflict of interests in this research.
Funding
This work was supported by the Research Grants Council, University Grants Committee
[17202818].
Notes on contributors
Dr. Winky K.O. Ho is a Research Associate in the Department of Real Estate and Construction
at the University of Hong Kong. His research interests cover urban economics, urban planning,
real estate and econometrics.
Prof. Bo-sin Tang is a Professor in the Department of Urban Planning and Design at the University of Hong Kong. He was previously Professor and Associate Head of the Department of Building and Real Estate at the Hong Kong Polytechnic University. He earned his PhD in Urban and Regional Planning from the London School of Economics and Political Science, and received real estate education at the Haas School of Business, University of California, Berkeley. His research interests include land use planning, property development and institutional analysis.
Dr. Siu Wai Wong is an Assistant Professor in the Department of Building and Real Estate at the Hong Kong Polytechnic University. She is a Chartered Surveyor and received her PhD in Planning from the University of British Columbia. Her specialisms cover property appraisal, land development and China studies.
References
Ahmad, M. W., Mourshed, M., & Rezgui, Y. (2017). Trees vs neurons: Comparison between random forest and ANN for high-resolution prediction of building energy consumption. Energy and Buildings, 147, 77–89. https://doi.org/10.1016/j.enbuild.2017.04.038
Ahmed, M. M., & Abdel-Aty, M. (2013). Application of stochastic gradient boosting technique to enhance reliability of real-time risk assessment. Transportation Research Record: Journal of the Transportation Research Board, 2386(1), 26–34. https://doi.org/10.3141/2386-04
Alpaydin, E. (2009). Introduction to Machine Learning. MIT Press.
Baldominos, A., Blanco, I., Moreno, A. J., Iturrarte, R., Bernárdez, Ó., & Afonso, C. (2018).
Identifying real estate opportunities using machine learning. Applied Sciences, 8(11), 1–24.
https://doi.org/10.3390/app8112321
Basak, D., Srimanta, P., & Patranabis, D. C. (2007). Support vector regression. Neural Information Processing – Letters and Reviews, 11(10), 203–224.
Boser, B. E., Guyon, I. M., & Vapnik, V. N. (1992). A training algorithm for optimal margin
classifiers. In D. Haussler (Ed.), 5th Annual ACM Workshop on COLT (pp. 144–152). ACM
Press.
Breiman, L. (1996). Bagging predictors. Machine Learning, 24(2), 123–140. https://doi.org/10.
1007/BF00058655
Breiman, L. (1997). Arcing the edge (Technical Report 486). Berkeley: Department of Statistics,
University of California.
Breiman, L. (1998). Predicting games and arcing algorithms (Technical Report 504). Berkeley:
Department of Statistics, University of California.
Breiman, L. (2001). Random forests. Machine Learning, 45(1), 5–32. https://doi.org/10.1023/
A:1010933404324
Breiman, L., Friedman, J., Stone, C. J., & Olshen, R. A. (1984). Classification and regression trees.
New York: Chapman & Hall/CRC Press.
Cai, Y-D., Zhou, G-P., & Chou, K-C. (2003). Support vector machines for predicting membrane
protein types by using functional domain composition. Biophysical Journal, 84(5), 3257–3263.
Caruana, R., & Niculescu-Mizil, A. (2006). An empirical comparison of supervised learning
algorithms. Proceedings of the 23rd international conference on Machine learning. ACM.
Corporate Finance Institute (2020). Random forest. Corporate Finance Institute Education Inc.
https://corporatefinanceinstitute.com/resources/knowledge/other/random-forest/
Cortes, C., & Vapnik, V. (1995). Support vector networks. Machine Learning, 20(3), 273–297.
https://doi.org/10.1007/BF00994018
De Aquino Afonso, B. K., Melo, L. C., de Oliveira, W. D. G., Da Silva Sousa, S. B., & Berton, L.,
(2020). Housing prices prediction with a deep learning and random forest ensemble [Unpublished
manuscript]. Anais do Encontro Nacional de Inteligencia Artificial e Computacion.
Einav, L., & Levin, J. (2014). Economics in the age of big data. Science, 346(6210), 715–721. https://
doi.org/10.1126/science.1243089
Feggella, D. (2019, February 19). What is machine learning? emeRJ. https://emerj.com/ai-glossary-
terms/what-is-machine-learning/
Furey, T. S., Cristianini, N., Duffy, N., Bednarski, D. W., Schummer, M., & Haussler, D. (2000).
Support vector machine classification and validation of cancer tissue samples using microarray
expression data. Bioinformatics, 16(10), 906–914. https://doi.org/10.1093/bioinformatics/16.10.
906
Golub, T. R., Slonim, D. K., Tamayo, P., Huard, C., Gaasenbeek, M., Mesirov, J. P., Coller, H.,
Loh, M. L., Downing, J. R., Caligiuri, M. A., Bloomfield, C. D., & Lander, E. S. (1999). Molecular
classification of cancer: Class discovery and class prediction by gene expression monitoring.
Science, 286(5439), 531–537. https://doi.org/10.1126/science.286.5439.531
Gu, J., Zhu, M., & Jiang, L. (2011). Housing price forecasting based on genetic algorithm and
support vector machine. Expert System with Applications, 38(4), 3383–3386. https://doi.org/10.
1016/j.eswa.2010.08.123
Guyon, I., Weston, J., Barnhill, S., & Vapnik, V. (2002). Gene selection for cancer classification
using support vector machines. Machine Learning, 46(1/3), 389–422. https://doi.org/10.1023/
A:1012487302797
Harrington, P. (2012). Machine learning in action. New York: Manning Publications.
Hassanien, A. E., Tolba, M. F., & Elhoseny, M. (2018). The International Conference on Advanced
Machine Learning Technologies and Applications (AMLTA2018). Springer International
Publising AG.
Hastie, T., Rosset, S., Tibshirani, R., & Zhu, J. (2004). The entire regularization path for the support
vector machine. Journal of Machine Learning Research, 5, 1391–1415.
Hausler, J., Ruscheinsk, J., & Lang, M., (2018). News-based sentiment analysis in real estate: A
machine learning approach. Journal of Property Research, 35(4), 344–371.
Huang, Y. T. (2019). Predicting home value in California, United States via machine learning
modeling. Statistics, Optimization & Information Computing, 7(1), 66–74. https://doi.org/10.
19139/soic.v7i1.435
Jelinek, F. (1998). Statistical methods for speech recognition. MIT Press.
Jurafsky, D., & Martin, J. H. (2008). Speech and language processing: An introduction to natural language processing, computational linguistics and speech recognition (2nd ed.). Prentice Hall.
Kaggle. (2019). Competitions. Kaggle Inc. https://www.kaggle.com/competitions
Koktashev, V., Makee, V., Shchepin, E., Peresunko, P., & Tynchenko, V. V. (2019). Pricing modeling in the housing market with urban infrastructure effect. Journal of Physics: Conference Series, 1353, 012139. https://doi.org/10.1088/1742-6596/1353/1/012139
Krause, J., Lalueza-Fox, C., Orlando, L., Enard, W., Green, R. E., Burbano, H., Hublin, J. J., Hanni,
C., Fortea, J., de la Rasilla, M., Bertranpetit, J., Rosas, A., & Paabo, S. (2007). The derived FOXP2
variation of modern humans was shared with Neandertals. Current Biology, 17, 1–5.
Li, B., Yang, G. S., Wan, R. R., Dai, X., & Zhang, Y. H. (2016). Comparison of random forests and
other statistical methods for the prediction of lake water level: A case study of the Poyang Lake
in China. Hydrology Research, 47(S1), 69–83. https://doi.org/10.2166/nh.2016.264
Li, D. Y., Xu, W., Zhao, H., & Chen, R. Q. (2009). A SVR based forecasting approach for real estate
price prediction. 2009 International Conference on Machine Learning and Cybernetics, Hebei,
12–15 July.
Limsombunchai, V., Gan, C., & Lee, M. (2004). House price prediction: Hedonic price model vs.
artificial neural network. American Journal of Applied Sciences, 1(3), 193–201. https://doi.org/
10.3844/ajassp.2004.193.201
Liu, T., & Liu, Y., (2010). A risk early warning model in real estate market based on support vector
machine. 2010 International Conference on Intelligent Computing and Cognitive Informatics.
50–53. Guilin: IEEE Computer Society.
Masías, V. H., Valle, M. A., Crespo, F., Crespo, R., Vargas, A., & Laengle, S. (2016). Property
valuation using machine learning algorithms: A study in a Metropolitan-Area of Chile. In
Selection at the AMSE Conferences-2016 (p. 97).
McCluskey, W., Davis, P., Haran, M., McCord, M., & McIlhatton, D. (2012). The potential of
artificial neural networks in mass appraisal: The case revisited. Journal of Financial Management
of Property and Construction, 17(3), 274–292. https://doi.org/10.1108/13664381211274371
Mimis, A., Rovolis, A., & Stamou, M. (2013). Property valuation with artificial neural network: The
case of Athens. Journal of Property Research, 30(2), 128–143. https://doi.org/10.1080/09599916.
2012.755558
Mohd, T., Masrom, S., & Johari, N. (2019). Machine learning housing price prediction in Petaling
Jaya, Selangor, Malaysia. International Journal of Recent Technology and Engineering, 8(2S11),
542–546. https://www.ijrte.org/wp-content/uploads/papers/v8i2S11/B10840982S119.pdf
Mu, J. Y., Wu, F., & Zhang, A. H. (2014). Housing value forecasting based on machine learning
methods. Abstract and Applied Analysis, 1–7.
Mukhlishin, M. F., Saputra, R., & Wibowo, A., (2017) Predicting house sale price using fuzzy logic,
Artificial Neural Network and K-Nearest Neighbor. 2017 1st International Conference on
Informatics and Computational Sciences, 171–176.
Mullainathan, S., & Spiess, J. (2017). Machine learning: An applied econometric approach. Journal
of Economic Perspective, 31(2), 87–106. https://doi.org/10.1257/jep.31.2.87
Muralidharan, S., Phiri, K., Sinha, S. K., & Kim, B. (2018). Analysis and prediction of real estate prices: A case of the Boston housing market. Issues in Information Systems, 19(2), 109–118. http://www.iacis.org/iis/2018/2_iis_2018_109-118.pdf
Ngiam, K. Y., & Khor, I. W. (2019). Big data and machine learning algorithms for health-care
delivery. The Lancet Oncology, 20(5), 262–273. https://doi.org/10.1016/S1470-2045(19)30149-4
Noble, W. S. (2004). Support vector machine applications in computational biology. In
B. Schoelkopf, K. Tsuda, & J. P. Vert (Eds.), Kernel Methods in Computational Biology (pp.
71–92). MIT Press.
Noble, W. S. (2006). What is a support vector machine? Nature Biotechnology, 24(12), 1565–1567.
https://doi.org/10.1038/nbt1206-1565
Ogutu, J. O., Piepho, H. P., & Schulz-Streeck, T. (2011). A comparison of random forests, boosting
and support vector machines for genomic selection. BMC Proceedings, 5(Suppl 3), S11. https://
doi.org/10.1186/1753-6561-5-S3-S11
Pal, R. (2017). Overview of predictive modeling based on genomic characterizations. In Predictive
Modeling of Drug Sensitivity (pp. 121–148). Academic Press. https://doi.org/10.1016/B978-0-12-
805274-7.00006-3
Park, B. H., & Bae, J. K. (2015). Using machine learning algorithms for housing price prediction:
The case of Fairfax County, Virginia housing data. Expert Systems with Applications, 42(6),
2928–2934. https://doi.org/10.1016/j.eswa.2014.11.040
Pérez-Rave, J. I., Correa-Morales, J. C., & González-Echavarría, F. (2019). A machine learning
approach to big data regression analysis of real estate prices for inferential and predictive
purposes. Journal of Property Research, 36(1), 59–96. https://doi.org/10.1080/09599916.2019.
1587489
Rabiner, L., & Juang, B.-H. (1993). Fundamentals of speech recognition (1st ed.). Prentice Hall.
Rafiei, M. H., & Adeli, H. (2016). A novel machine learning model for estimation of sale prices of
real estate units. Journal of Construction Engineering and Management, 142(2), 04015066.
https://doi.org/10.1061/(ASCE)CO.1943-7862.0001047
Ramaswamy, S., Tamayo, P., Rifkin, R., Mukherjee, S., Yeang, C., Angelo, M., Ladd, C., Reich, M.,
Latulippe, E., Mesirov, J., Poggio, T., Gerald, W., Loda, M., Lander, E., & Golub, T. (2001).
Multiclass cancer diagnosis using tumor gene expression signature. Proceedings of the National
Academy of Sciences of the United States of America, 98(26), 15149–15154. https://doi.org/10.
1073/pnas.211566398
Rausch, C., Weber, T., Kohlbacher, O., Wohlleben, W., & Huson, D. H. (2005). Specificity
prediction of adenylation domains in nonribosomal peptide synthetases (NRPS) using trans
ductive support vector machines (TSVMs). Nucleic Acids Research, 33(18), 5799–5808.
Rogers, S., & Girolami, M. A. (2011). A first course in machine learning (Machine learning and
pattern recognition) (2nd ed.). Chapman & Hall/CRC.
Rychetsky, M. (2001). Algorithms and architectures for machine learning based on regularized
neural networks and support vector approaches. Shaker Verlag Gmbh.
Satish, G. N., Raghavendran, C. V., Rao, M. D. S., & Srinivasulu, C. (2019). House price prediction
using machine learning. International Journal of Innovative Technology and Exploring
Engineering, 8(9), 717–722.
Schroeder, M. (2004). Computer speech: Recognition, compression, synthesis (2nd ed.). Springer-
Verlag Berlin and Heidelberg GmbH & Co. KG.
Shinda, N., & Gawande, K. (2018, October 3–4). Survey on predicting property price. Paper
presented at 2018 International Conference on Automation and Computational Engineering
(pp. 1–7).
Shmuell, G. (2010). To explain or to predict. Statistical Science, 25(3), 289–310. https://doi.org/10.
1214/10-STS330
Sun, D., Du, Y., Xu, W., Zuo, M., Zhang, C., & Zhou, J. (2015). Combining online news articles and
web search to predict the fluctuation of real estate market in big data context. Pacific Asia
Journal of the Association for Information Systems, 6(4), 19–37.
Swathi, B., & Shravani, V. (2019). House price prediction analysis using machine learning.
International Journal for Research in Applied Science & Engineering Technology, 7(5),
1483–1492.
UC Business Analyst. (2018). Gradient boosting machines. University of Cincinnati. http://uc-r.
github.io/gbm_regression
Vapnik, V. (1995). The Nature of Statistical Learning Theory. Springer.
Vapnik, V., & Lerner, A. (1963). Pattern recognition using generalized portrait method. Automation and Remote Control, 24(6), 774–780.
Varian, H. R. (2014). Big data: New tricks for econometrics. Journal of Economic Perspectives, 28
(2), 3–28. https://doi.org/10.1257/jep.28.2.3