A House Price Valuation Based On The Random Forest Approach: The Mass Appraisal of Residential Property in South Korea
1 School of Management and Economics, Handong Global University, Pohang, Republic of Korea
2 School of Computer Science and Electrical Engineering, Handong Global University, Pohang, Republic of Korea
Abstract. Mass appraisal is the standardized procedure of valuing a large number of properties at the same time and is commonly used to compute real estate tax. While a hedonic pricing model based on the ordinary least squares (OLS) linear regression has been employed as the traditional method in this process, the stability and accuracy of the model remain questionable. This paper investigates the features of a house price predictor based on the Random Forest (RF) method by comparing them with those of a conventional hedonic pricing model. We used apartment transaction data from the period of 2006 to 2017 in the district of Gangnam, one of the most developed areas in South Korea. Using a data set covering 40% of all transactions in the sample area, we demonstrate that the accuracy of a machine learning-based predictor can be surprisingly high. The average of percentage deviations between the predicted and the actual market price was found to be only around 5.5% in the RF predictor, whereas it was almost 20% in the OLS-based predictor. With the RF predictor, the probability of the predicted price being within 5% of its actual market price was 72%, while only about 17.5% of the regression-based predictions fell within the same range. These results show that, in the practice of mass appraisal, the RF method may be a useful complement to the hedonic models, as it more adequately captures the complexity or non-linearity of actual housing markets.
Keywords: housing price forecasting, hedonic pricing model, random forest approach, mass appraisal, apartment, machine learning technique.
Introduction
Mass appraisal, also called automatic valuation of real estate assets, is the introduction of mathematical statistics, computer technology, and geographic information technology to establish a mathematical model that serves as a systematic appraisal of a group of real estate properties and reveals their market value (Zhou, Ji, Chen, & Zhang, 2018). The Basel II Accord, issued by the Basel Committee on Banking Supervision (BCBS) in 2008, states that “the bank is expected to monitor the value of the collateral on a frequent basis and at a minimum once every year. More frequent monitoring is suggested where the market is subject to significant changes in conditions. Statistical methods of evaluation (e.g. reference to house price indices, sampling) may be used to update estimates or to identify collateral that may have declined in value and that may need re-appraisal. A qualified professional must evaluate the property when information indicates that the value of the collateral may have declined materially relative to general market prices or when a credit event, such as default, occurs.” As a result of this accord, the value of the property is appraised more frequently than before, which leads to an increase in costs (time and money) for appraisals. Thus, a stable, accurate, and fast tool for appraisal is needed, and the mass appraisal model may be a viable solution. The mass appraisal model is also a widely accepted tool for the valuation of property for the purposes of taxation or mortgage for a loan. In the US, the Computer Assisted Mass Appraisal (CAMA), the computer system and software used for mass appraisal, is employed nearly universally by assessors nationwide for tax assessment. Owing to its importance, there has been a rich and diverse body of work addressing the appraisal techniques and performance.
Traditionally, the hedonic pricing model, originating from Lancaster’s consumer theory, has been one of
the most extensively employed models to estimate house prices and property values (Lancaster, 1966). The theoretical framework and foundation for hedonic pricing models were developed in a study by Rosen (1974). In hedonic price theory, it is assumed that a good can be regarded as a bundle of individual components or characteristics that provide utilities. Rosen (1974) defines the theory as “a model of product differentiation based on the hedonic hypothesis that goods are valued for their utility-bearing attributes or characteristics.” Rosen defines a set of “hedonic” prices as the amount of characteristics associated with the goods. Thus, a consumer who purchases a good acquires a collection of the characteristics embodied in it, and these attributes can be converted into utility. From this perspective, a house is a heterogeneous good embodying a package of inherent characteristics relevant to location, property attributes, and environmental amenities. The advantage of the hedonic pricing models is that the marginal implicit values of the characteristics can be obtained by differentiating the price function with respect to each attribute (McMillan, Reid, & Gillen, 1980).
Because house prices are influenced by a number of attributes, many studies employ the hedonic model to investigate the relationship between house prices and their characteristics (Chau & Chin, 2003). The most common variables for the model involve the structural attributes, such as type, age of property, number of bedrooms and other rooms, and other amenities available within the property. Numerous studies have found a house’s number of bedrooms and bathrooms and its floor area to be positively related to its price (Fletcher, Gallimore, & Mangan, 2000; Li & Brown, 1980; Garrod & Willis, 1992; Rodriguez & Sirmans, 1994). Kain and Quigley (1970) revealed that the age of the property can impact house prices negatively. In addition, some researchers have analyzed the impact of locational features, such as racial composition, pollution level, and proximity to a central business district (CBD), transportation facilities, or retail stores on house prices (Palmquist, 1992; McMillan, Jarmin, & Thorsnes, 1992; Ridker & Henning, 1967). Dubin and Sung (1990) conducted a non-nested test to determine which set of neighborhood variables most accurately explained the variation in housing prices. To reveal the relationship between accessibility to a CBD and house prices, various measures have been proposed (Adair, McGreal, Smyth, Cooper, & Ryley, 2000; Hanson, 2004; Song & Sohn, 2007; Chen, Ong, Zheng, & Hsu, 2017). In So, Tse, and Ganesan (1997) and Debrezion, Pels, and Rietveld (2007), the effect of the proximity of public transportation infrastructure on house prices was studied. Location on a site with a desirable view, such as a lake or golf course, has been found to have a positive effect on the price in Benson, Hansen, Schwartz, and Smersh (1998), Gillard (1981), and Darling (1973). The neighborhood attributes can be implicitly valued through the hedonic model by comparing properties with differing neighborhood qualities (Goodman, 1989). Chau and Chin (2003) reviewed past studies and classified the attributes into three categories: socioeconomic variables (Garrod & Willis, 1992; Richardson, Vipond, & Furbey, 1974), local government or municipal services (Clauretie & Neill, 2000; Hayes & Taylor, 1996; Jud & Watts, 1981; Downes & Zabel, 2002; Huh & Kwak, 1997), and externalities such as crime rates (Thaler, 1978), noise (Wilhelmsson, 2000; Williams, 1991; Espey & Lopez, 2000), and air pollution (Harrison & Rubinfeld, 1978). Previous studies have suggested some key housing attributes included in most hedonic price models.
While the major advantage of the hedonic models is their simplicity in estimating and interpreting the regression coefficients, the pre-specified form of the models has been criticized for imposing strong assumptions, such as those regarding linearity in parameters. The functional form of the conventional hedonic pricing model is based on the simplification of household’s preferences and strict assumptions about the housing. The model depends on the assumption that the effects from each attribute are separable and constant, which implies a separable preference, perfect competition, market equilibrium, and an integrated market (see Chau & Chin, 2003; Malpezzi, 2002; Sheppard, 1999). Thus, in practice, the accuracy of the OLS (ordinary least squares)-based model would be eroded insofar as the model simplifies the complexity or non-linearity of the real world. For instance, if the housing market is organized into a series of sub-markets by housing size or income group or if there is a non-linearity in household’s preference on an attribute, the predictor obtained from a single regression would fail to capture the complexities. This problem arises because we cannot directly observe the structure of preference and capture all the market characteristics causing the complexity in a market. In the real world, many market characteristics may intermingle, but there is no flexibility in the conventional hedonic pricing model to explore such complexity. These disadvantages are mentioned in Zurada, Levitan, and Guan (2011) as “failures [that] would result in untenable or imprecise coefficients caused by functional form misspecification, interaction among variables, multicollinearity, and non-linearity problems.”
In this case, the proposed data-driven modelling based on machine learning techniques could be a complement to the conventional regression methods. The main advantage of the proposed method is that it constructs the model, while exploring the complexity, without the modeler explicitly describing it. In recent years, the applicability of these methods has been expanding quickly, owing to the developments in data collection. In academic research on real estate, the application of machine learning techniques has grown (Fan et al., 2006; Selim, 2009; Antipov & Pokryshevskaya, 2012; Čeh, Kilibarda, Lisec, & Bajat, 2018). As discussed in Fan, Ong, and Koh (2006), the approach can be applied to investigate the linear or non-linear relationships between the dependent and independent variables and the hierarchical structure of the determinants of house prices. In McCluskey and Anand (1999) and Verikas, Lipnickas, and Malmqvist (2002), artificial neural network models were employed to value properties. Limsom-
International Journal of Strategic Property Management. Article in press 3
bunchai (2004) and Selim (2009) compared the predictive power of the hedonic model based on multiple regression with that of an artificial neural network model. Both studies demonstrate that an artificial neural network can be a more effective alternative to the hedonic model for appraising house prices. In Gu, Zhu, and Jiang (2011) and Mu, Wu, and Zhang (2014), support vector machine techniques were used to value house prices. Park and Bae (2015) developed a housing price appraisal model based on machine learning algorithms, such as C4.5, RIPPER, Naïve Bayesian, and AdaBoost, and analyzed the housing data for Fairfax County, Virginia, USA.
In spite of the wide application of machine learning techniques in house price valuation, there have been few studies using Random Forest (RF) techniques for appraisal. The RF method is a special type of the simple regression trees ensemble, which gives a prediction based on majority voting or by averaging predictions made by each of its trees (Antipov & Pokryshevskaya, 2012). The benefit of RF is that there are few hyperparameters with the potential to strongly influence its performance. It is defined only by the number of trees and the depth of each tree. Antipov and Pokryshevskaya (2012) “believe random forest may become one of the most appropriate techniques for mass appraisal … it is expected to avoid fallacies of many other methods, commonly used for mass appraisal.” They also presented several advantages of RF. First, in many comparative studies, RF performed more strongly than other algorithms. Second, it can successfully manage categorical variables with many levels. In the case of multiple regression or neural networks, a large number of qualitative variables leads to a larger number of estimated parameters, which usually results in overfitting. Third, the method works adequately when there is missing data. Because the method is based on regression trees, the prediction is made from the part of the tree that has already been built, even when some data is missing. Fourth, it allows for nonlinear links and unsteadiness of variable influence across different segments. Fifth, the method does not require a detailed model specification. Thus, the RF method may become one of the most appropriate methods for mass appraisal, and it is for this reason that it was chosen for this paper.
In this paper, we investigate the features of a house price predictor based on the RF method by comparing them with those of a conventional, regression-based hedonic pricing model. We collected a data set covering 40% (16,601 samples) of all apartment transactions (39,564) during 2006–2017 in the district of Gangnam, one of the most developed areas in South Korea. The samples were randomly divided into a training set consisting of 90% of all transactions and a test set consisting of the remaining 10% of transactions. We compare several performance measurements for the predictions of the house prices in the test set. The results show that the machine learning approach can significantly enhance predictive performance. The average percentage deviation between the predicted and the actual market price was only around 5.48% in the machine learning predictor. Further, the probabilities that the RF predictions fell within 3%, 5%, and 10% of the actual market price were 53.5%, 71.9%, and 90.3%, respectively, while those of the OLS-based predictions were 10.4%, 17.4%, and 34.6%, respectively. Furthermore, we found that the RF predictor makes fewer outlier predictions than the conventional hedonic pricing model. The probability of the RF predictions deviating more than 50% from the actual price was only 0.5%, while that for OLS-based predictions was almost 3.8%.
The following can be derived from our results. From a theoretical perspective, the result may serve as evidence of high complexity in the price determination process of the housing market. The superiority of RF in appraisal accuracy indicates that the RF predictor can more successfully track the actual price determination process in the housing market than the OLS predictor. In other words, there are some factors of the value determination process that cannot be fully explained in the simplified assumptions of the conventional hedonic pricing model (e.g., separability and constancy of an attribute’s effect on housing value).
From a practical perspective, our results show that the quality of mass appraisals or house price indices can be significantly improved by using the RF method. Relative to the predictive models in previous studies in Limsombunchai (2004), Selim (2009), Antipov and Pokryshevskaya (2012), and Čeh et al. (2018), the performance measures of RF, namely the R² value (97.6%), mean absolute percentage error (MAPE, 5.482%), coefficient of dispersion (COD, 5.484%), and hitting rate, achieved significantly stronger results. Although it is difficult to compare experiments conducted in different samples, the results in this paper may also indicate that the accuracy of systemic appraisal can be surprisingly high (note that the MAPE of human appraisals is 12% in Cannon and Cole, 2011).
We infer that the high predictive power of the RF-based model derives from a combination of the features of the RF method and the features of the data set we applied. In the RF method, a model is constructed by exploring the hierarchical structure of characteristics, and the effect of each attribute on price is allowed to vary according to circumstances. The important advantage of this method is that it does not require assumptions about market complexity. The RF algorithm constructs the data-driven hierarchical structure of the model without the modeler explicitly describing it. Therefore, if the data set sufficiently covers the characteristics of the property, the RF model is expected to more sensitively replicate the complex structure of the house price determination process.
In addition, we presume that the features of our data also contributed to the high accuracy for two reasons. One is the geographic density of the samples. A large portion of a property’s value comes from its location. If the samples are sparsely located in a large area, it is difficult to accurately measure the effects related to location. We collected
a relatively large sample (16,061 samples trained) in a
4 J. Hong et al. A house price valuation based on the random forest approach: the mass appraisal of residential...
small area (39.55 km²) and expect that this high density of samples may have contributed to the high accuracy of our prediction. The other reason is the type of property that our data covers. We collected all of our apartment data in the same residential area (the district of Gangnam in Seoul), and the structural characteristics of the apartments can be sufficiently represented by a number of common and measurable features. A data set can contain only consolidated features of housing, such as number of rooms and floor level. Housing in different residential areas or in detached dwellings is usually more varied in its amenities, interior decorations, and features and consequently is difficult to codify or consolidate in a data set, which eventually undermines the accuracy of predictors. In this context, we expect that our data on apartments in the same residential area (with a similar income group) would contribute to the accuracy of prediction.
The remainder of this paper is organized as follows. In Section 1, the data set and some basic statistics are described. In Section 2, we introduce the RF method and describe how it predicts house prices. Section 3 provides the quantitative results and interpretation. Concluding remarks are provided in the final section.

1. Data set and basic statistics

Gangnam is one of the 25 local government districts of Seoul, the capital city of South Korea. With a population of 561,052 and an area of 39.5 km², it is Seoul’s third-largest district. The district is composed of 22 administrative divisions called “dongs” (Figure 1). While Seoul is known for its high housing prices (an average apartment cost approximately 5,500 USD per m² in 2011), the average housing price in Gangnam, approximately 10,000 USD per m², is almost twice as high, and 3.5 times the national average. The district is also the place where the largest number of apartment transactions have occurred in the past decade. We collected 16,601 samples for 2006–2017 from the transaction records for apartments in Gangnam, provided by South Korea’s Ministry of Land, Infrastructure, and Transport (MOLIT). The data set covers about 40% of all apartment transactions in Gangnam during the sample period.
Because both models involve the regression of observed apartment prices against apartment attributes and economic variables hypothesized to be determinants of price, the factors assumed to contribute to the price are given in Table 1.
The structural attributes are related to inherent characteristics of the property. In this study, they include elapsed year (transaction year minus construction year), area, floor level, and heating system. Regarding the heating system, the value of the dummy variable is set to 0 if an apartment has a central heating system. Otherwise, the value is set to 1.
For neighborhood attributes, we consider apartment brand, available units in the building, number of buildings in the apartment complex, parking lot, floor area ratio, building coverage ratio, and the top/lowest floor of the building. A dummy variable is employed for the ranking of apartment brands. The ranking is based on a report by the Korea Institute of Corporate Reputation, and the variable has a value of 1 if an apartment is not built by one of the ten highest-ranked apartment brands. The variable “parking lot” represents the average number of parking spaces available per apartment household. Floor area ratio (FAR) and building coverage ratio (BCR) are the ratio of total floor area (gross floor area) to land area and the ratio of the building area to the land (site) area, respectively.
The locational attributes of property, which also affect the price of the property, are considered in this study. To take the value of the geographical position into account, we consider latitude, longitude, and accessibility to nearby facilities. The facilities considered are national park, high school, redevelopment area, university, general hospital, museum, and subway station. While the information on the administrative division of the apartment was found in the data provided by MOLIT, other information (latitude,
Variables | Mean | Median | Standard deviation | Min | Max
Construction year | 1992.946 | 1993 | 10.270 | 1978 | 2014
Area | 71.567 | 59.96 | 35.0495 | 16.78 | 273.83
Floor level | 7.464 | 5 | 5.633 | −1 | 45
Units available in the building | 1724.093 | 900 | 1855.321 | 7 | 5040
Number of buildings in an apartment complex | 30.404 | 8 | 42.861 | 1 | 124
Parking lot | 1.006 | 1 | 0.591 | 0.27 | 4.53
Floor area ratio | 252.022 | 224 | 209.46 | 72 | 2435
Building coverage ratio | 24.687 | 19 | 15.046 | 12 | 204
Latitude | 37.494 | 37.493 | 0.0120 | 37.460 | 37.533
Longitude | 127.060 | 127.058 | 0.0170 | 127.0181 | 127.104
Distance to national park | 1065.147 | 1053.208 | 396.087 | 86.108 | 2142.469
Distance to high school | 536.979 | 522.609 | 236.370 | 31.883 | 1531.516
Distance to redevelopment area | 634.044 | 571.583 | 416.609 | 0 | 3,758.560
Distance to university | 3,382.976 | 3,551.466 | 1272.367 | 24.587 | 7,136.498
Distance to general hospital | 1,062.366 | 975.124 | 524.585 | 41.633 | 3,470.830
Distance to museum | 986.106 | 1,032.572 | 373.803 | 87.490 | 3,323.865
Distance to subway station | 678.640 | 579.455 | 394.342 | 47.487 | 2,559.068
GDP (billion won) | 333,427 | 337,411 | 64,322 | 225,613 | 446,835
Growth rate in real GDP | 3.641 | 3.4 | 1.791 | −1.9 | 7.4
Land price fluctuation rate | 0.0638 | 0.0165 | 0.319 | −2.643 | 0.625
Mortgage interest rate | 6.050 | 5.883 | 0.602 | 5.263 | 7.415
helps to avoid overfitting problems caused by the large number of classes. In the case of multiple regression or neural networks, such categorical variables lead to an increased number of estimated parameters, which results in overfitting. Since there are qualitative variables, such as apartment brand and heating system, in our problem, RF techniques can be advantageous for predicting the price of a property. RF can also deal with nonlinear links and the unsteadiness of variable influence across different segments, since it is based on regression trees. In many previous studies on mass appraisal, the predictive power of models based on nonparametric methods, such as neural network or support vector machine, is greater than that of OLS-based models. It seems that there are significant market complexities that cannot be fully explained using the conventional hedonic pricing model. RF is more appropriate for dealing with this complexity.
Another benefit of DTs and RFs is the interpretability of the trained model: humans can understand how the trained model works. In addition, trees are trained easily and make faster inferences than other machine learning algorithms. To train RF models, there are only two hyperparameters: number of DTs and depth of each tree. With more DTs, the result becomes more stable at the expense of computation cost, and deeper trees find more accurate results by dividing the sample space into smaller parts, which may lead to overfitting. In our experiment, after trying many different combinations, the RF model consisted of 50 trees with a depth of 17, although there is no significant difference in performance with slightly different combinations. In this study, we used the sklearn toolkit from scikit-learn.org.

3. Results

3.1. Feature selection

We investigated the 26 variables in Table 3 to determine which of them have a dominant or significant impact on the price. We fixed a random forest architecture of 50 decision trees with depth 17, after many trials with different configurations on training and validation samples. Once the RF model has been trained with the training samples, the model has importance values which indicate the predictive power of the variables, that is, how much the variable decreases variance (or error) in the split space. In decision trees, every node is a condition of how to split values in a single feature, so that similar values of the dependent variable (price) belong to the same set after the split. The condition is based on impurity, which is Gini impurity in the case of classification problems, while mean squared error (MSE) and its variance are used for regression trees. So when a tree is trained, the importance is how much each feature contributes to decreasing the weighted impurity. In the case of Random Forest, we use the average of the decrease in impurity over trees by a feature as the importance of the feature.
Figure 4 shows the importance of the variables in the trained RF model. Note that “area” is the most important factor for price, followed by “number of buildings in the apartment complex”. “Transaction date” and “construction year” are also significant. Interestingly, distances to places of interest, such as a subway station, seem to have no effect on price.
We selected features based on performance while training the RF model after removing the least important variables one at a time. To measure its performance, we used mean absolute percentage error (MAPE), a straightforward measurement that captures the average percentage deviation of predictions from the actual transaction prices. The formula is expressed as:

\[ \mathrm{MAPE} = \frac{100}{n} \sum_{i=1}^{n} \left| \frac{\hat{p}_i - p_i}{p_i} \right|, \tag{3} \]

where $p_i$ and $\hat{p}_i$ are the actual price and predicted price of apartment $i$, respectively, and $n$ is the number of predictions.
Figure 5 shows the MAPE curve with different numbers of variables, while removing the least important variables one at a time. The horizontal axis indicates the number of variables used in prediction. Notice that the error is minimal when 16 features are used (or 10 features are removed) to train the RF model. By this feature selection, we can avoid potential overfitting problems. The 16 features are listed in Table 4. We use these 16 features for the subsequent experiments.
Figure 3. Example of RF: an ensemble of 2 decision trees (DTs) with a depth of 4 on 5 features
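The ensemble in Figure 3 can be illustrated with a minimal sketch in scikit-learn, the toolkit named in the text. The data below are synthetic and the variable names are hypothetical; only the configuration (2 trees, depth 4, 5 features) follows the figure. The sketch checks that the forest prediction is the average of the predictions of its individual regression trees.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Synthetic stand-in data (an assumption, not the paper's data): 5 features.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = 3.0 * X[:, 0] + np.where(X[:, 1] > 0, 2.0, -2.0) + rng.normal(scale=0.1, size=200)

# An ensemble of 2 decision trees with a depth of 4, as in Figure 3.
forest = RandomForestRegressor(n_estimators=2, max_depth=4, random_state=0)
forest.fit(X, y)

# Predict three samples with each tree separately, then average across trees.
per_tree = np.stack([tree.predict(X[:3]) for tree in forest.estimators_])
averaged = per_tree.mean(axis=0)

# The forest's own prediction coincides with the average of its trees.
forest_prediction = forest.predict(X[:3])
```

The configuration reported for the experiments (50 trees with a depth of 17) would correspond to `n_estimators=50, max_depth=17` in this sketch.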
Figure 4. Importance of features in the trained RF model. “Importance” means the contribution of each variable to the model
Figure 5. MAPE when removing the least important variables one at a time
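The procedure behind Figure 5 (retrain, record the MAPE, drop the least important variable, repeat) can be sketched as follows. This is an illustrative reconstruction under stated assumptions, not the authors' code: the data are synthetic, and the helper names `mape` and `backward_elimination` are hypothetical. The importance used is scikit-learn's impurity-based `feature_importances_`.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

def mape(actual, predicted):
    """Mean absolute percentage error in percent, following equation (3)."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return 100.0 * np.mean(np.abs((predicted - actual) / actual))

def backward_elimination(X_train, y_train, X_val, y_val, n_trees=50, depth=17, seed=0):
    """Remove the least important feature one at a time.

    Returns a list of (kept_column_indices, validation_MAPE), one entry per step.
    """
    kept = list(range(X_train.shape[1]))
    history = []
    while kept:
        model = RandomForestRegressor(n_estimators=n_trees, max_depth=depth, random_state=seed)
        model.fit(X_train[:, kept], y_train)
        history.append((list(kept), mape(y_val, model.predict(X_val[:, kept]))))
        if len(kept) == 1:
            break
        # Importances are indexed by position within the currently kept columns.
        kept.pop(int(np.argmin(model.feature_importances_)))
    return history

# Synthetic example (an assumption): 6 features, strictly positive "prices".
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 1.0, size=(400, 6))
y = 100.0 + 50.0 * X[:, 0] + 20.0 * X[:, 1] * X[:, 2] + rng.normal(scale=1.0, size=400)

history = backward_elimination(X[:300], y[:300], X[300:], y[300:], n_trees=20, depth=8)
best_columns, best_mape = min(history, key=lambda step: step[1])
```

Selecting the step with the smallest validation MAPE mirrors the choice of 16 features described in the text; smaller forests are used here only to keep the sketch fast.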
Figure 6. Absolute values of correlation between variables and prices (left) and between variables (right).
The brighter cells have higher correlations
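The two panels of Figure 6 correspond to slices of a single absolute correlation matrix. The sketch below shows one way to compute them with NumPy; the data are synthetic, and the variables `area`, `parking`, and `noise_feature` are hypothetical stand-ins constructed so that one feature is strongly tied to another.

```python
import numpy as np

# Synthetic stand-in data (an assumption, not the paper's data).
rng = np.random.default_rng(0)
area = rng.uniform(30.0, 200.0, size=500)
parking = 0.01 * area + rng.normal(scale=0.1, size=500)   # tied to area by construction
noise_feature = rng.normal(size=500)                       # unrelated to price
price = 10.0 * area + rng.normal(scale=50.0, size=500)

# Columns are variables; the last column is the target (price).
columns = np.column_stack([area, parking, noise_feature, price])
abs_corr = np.abs(np.corrcoef(columns, rowvar=False))

corr_with_price = abs_corr[:-1, -1]          # left panel: each variable vs. price
corr_between_features = abs_corr[:-1, :-1]   # right panel: variable vs. variable
```

In this toy setting the second feature correlates strongly with both the price and the first feature, which is the kind of redundancy that can leave a variable with low importance in a trained forest.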
On the other hand, we also need to look at the correlations between the variables and prices (target) and those between variables. Even when the importance of a variable is low, the variable may still have strong predictive power on the price; its importance can be low simply because it is correlated with other variables that carry the same information. In Figure 6 (left), “parking lot” (index 21) has a strong correlation with price, while it is not important in Figure 4. This phenomenon can be explained by Figure 6 (right), where “parking lot” has a strong correlation to “area” (index 3), probably because “area” includes much of the same information contained in “parking lot.” Thus, “parking lot” is not important when “area” is one of the factors.

3.2. Comparison between RF and OLS predictor

The predictive performances of OLS and RF regression can be compared using measurements that capture the distance between predicted and observed transaction price. We considered three measurements: MAPE, coefficient of dispersion (COD), and R-squared.
MAPE measures the average percentage error of predictions from the actual transaction prices. Percentage errors from each sample are averaged after taking the absolute value to ignore the sign of the errors. MAPE is frequently used because it is convenient and can be understood intuitively. Its formula is shown in equation (3).
COD measures the dispersion of the sales ratio, the quotient obtained by dividing the predicted price by the actual transaction price, around the median sales ratio. It is used to measure appraisal uniformity. It is obtained as the average percentage deviation of the sales ratio from the median value; thus, a lower COD implies a more uniform prediction. This measurement can be expressed as:

\[ \mathrm{COD} = \frac{100}{SR_m} \cdot \frac{\sum_{i=1}^{n} \left| SR_i - SR_m \right|}{n}, \tag{4} \]

where $SR_i$ is the ratio between the predicted price and actual sale price for apartment $i$, $SR_m$ is the median of the ratios, and $n$ is the sample size for the prediction.
R-squared shows the predictable portion of the observed transaction price. It is measured by the proportion of the variance in the target variable (actual transaction price) that is accounted for by the models. R-squared is calculated as:

\[ R^2 = 1 - \frac{\sum_{i=1}^{n} (p_i - \hat{p}_i)^2}{\sum_{i=1}^{n} (p_i - \bar{p})^2}, \tag{5} \]

where $\bar{p}$ is the sample mean of the actual transaction prices.
We took the average of 10 experiments for each measurement to reduce the possibility that the results were obtained by chance. In each experiment, the measurements were obtained both inside and outside of the sample prediction context. The 16,061 observations were randomly divided into training sets (90% of all transactions) and test sets (10% of all transactions).
Table 5 presents a comparison of the measurements obtained from both predictors. The values of MAPE and COD for RF are only 5.482 and 5.484, respectively, while those for the OLS predictor are 19.605 and 19.571, respectively. The MAPEs indicate that the percent deviation of the RF prediction from the actual contract price is only about 5% on average, while that of the OLS predictor is about 20%. The R-squared of the RF is also noticeably higher than the R-squared of the OLS. The R-squared of the RF model is 0.9762, which implies that about 97.6% of the variability of the dependent variable has been accounted for while the remaining 2.4% of the variability has not.

Table 5. Measurements for accuracy (average of 10 trials)
Measurement | OLS | Random Forest
MAPE | 19.60567 | 5.482407
COD | 19.57161 | 5.484705
R-squared | 0.726056 | 0.976198

The predictive performance can also be considered in terms of the hitting rate. If we define a successful prediction as an event in which the predicted price is within a certain range of the actual price, hitting rate indicates the
proportion of successful prediction. Table 6 compares the hitting rates obtained from both methods when we define the range of successful prediction as 1%, 3%, 5%, 10%, and 15%, respectively. For the RF predictor, when a successful prediction is defined as a forecast price within 15% of the market price, the hitting rate is about 95%, compared with about 51% for the OLS predictor. This means that the RF predictor allows us to make more precise predictions.

Table 6. Hitting rates (average of 10 trials)

              OLS      Random Forest
Within 1%     3.4%     21.0%
Within 3%     10.4%    53.5%
Within 5%     17.4%    72.0%
Within 10%    34.6%    90.3%
Within 15%    50.9%    95.6%

The results can be interpreted as follows.

First, in the comparison between the accuracies of both methods, we can conclude that the RF predictor is significantly more accurate than the OLS predictor in all measurements (MAPE, COD, R-squared, and hitting rates). This finding is notable because the quality and quantity of information used in both methods were the same. The difference lay only in the form of the models. The functional form of the conventional hedonic pricing model represents a form of our intuition about housing value, with the assumption that each attribute is separable and its influence is constant. This means that, in the OLS-based model, the effect of each attribute is extremely simplified, with a single coefficient. In the RF model, since the predictor explores the hierarchical structure of features, it can more sensitively track the possibility that the effect of each attribute on price varies by context. The result implies that there are substantial losses resulting from the simplified nature of the OLS-based model and that at least some of these losses can be recovered using the RF predictor.

Second, the results show that the accuracy of the RF predictor can be surprisingly high. The average MAPE means that the percent deviation of a prediction from the actual contract price is only about 5.5% on average. We hypothesize that this accuracy is not due to the superiority of RF modeling alone and that the features of our data set also contribute to the high accuracy in absolute terms. One reason is the geographic density of our samples. A large portion of a property's value comes from its location. If the samples are sparsely located in a large area, it is difficult to accurately measure the value that derives from location. We collected a relatively large sample (16,061 training samples) from a small area (39.55 km²) and expect that this high density of samples may contribute to the high accuracy of prediction. The other reason is the type of property that our data covers. The coverage of observable characteristics can be an important factor affecting the accuracy of the estimation, as a data set contains only consolidated or measurable features of housing, such as the number of rooms and floor level. We collected all apartment data from the same residential area (Gangnam in Seoul) because the structural characteristics of the apartments can be well-represented by a number of common characteristics. However, other types of dwelling (for example, detached houses) are usually more heterogeneous in their amenities, interior decorations, and other features that are difficult to codify or consolidate in a data set. If a large portion of attributes are unmeasured or unobservable, the predictive power of the model will be undermined by the lack of information rather than any modelling issue.

In addition, we discuss the frequency of outliers, which is potentially related to the complexity of the prediction structure in the data-driven model constructed by the machine learning approach. For an OLS-based predictor, the prediction is made by the linear projection of observed attributes; thus, a large deviation from the actual value occurs only when the values of the attributes for which the coefficients are overestimated or underestimated are extremely large. For the RF predictor, it is difficult to formalize when outliers occur. However, it is important that, in the RF model, the order of variables is constructed by a data-driven process and the effect of an attribute on housing value can vary according to the ordering structure. Therefore, if the ordering structure greatly distorts the actual value determination process in the housing market, the non-linearity can produce predictions that deviate widely from actual values. In the opposite case, if the complicated structure of the actual housing market is captured by the data-driven ordering structure, the occurrence of outliers will be significantly reduced. Rather, the rigidity of the model in the OLS-based technique might lead to more frequent outliers.

The frequency of outliers is displayed in Table 7, in which we define an “outlier” as a case in which the prediction deviates from the actual price by more than a certain percentage (50%, 100%, and 200%). Under these criteria, the occurrence of outliers is markedly reduced with the RF predictor. If we define the outliers as deviations greater than 50% from the actual value, then about 3.8% of the OLS-based predictions are outliers, compared to about 0.5% of the RF predictions. This result implies that the hierarchical structure of features constructed by the RF technique is not distortive and that the predictor is not easily over-fitted to a training set.

Table 7. Proportion of outliers (average of 10 trials)

Criteria                     OLS      Random Forest
|p̂_i − p_i| / p_i > 0.5     3.82%    0.52%
|p̂_i − p_i| / p_i > 1       0.39%    0.26%
|p̂_i − p_i| / p_i > 2       0.18%    0.19%
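The measures reported in Tables 5 to 7 (MAPE, COD as in equation (4), hitting rates, and outlier proportions) can be sketched in a few lines. This is a minimal illustration only, not the code used in the study: the function names are hypothetical, the toy prices are made up, and it assumes that deviations are taken as absolute values relative to the actual price and that "within" means strictly less than the band.

```python
from statistics import mean, median


def mape(predicted, actual):
    """Mean absolute percentage error, in percent."""
    return 100 * mean(abs(p - a) / a for p, a in zip(predicted, actual))


def cod(predicted, actual):
    """Coefficient of dispersion of sales ratios around their median, in percent."""
    ratios = [p / a for p, a in zip(predicted, actual)]
    sr_m = median(ratios)
    return 100 / sr_m * mean(abs(r - sr_m) for r in ratios)


def share_within(predicted, actual, band):
    """Hitting rate: share of predictions deviating by less than `band` (0.05 = 5%)."""
    hits = [abs(p - a) / a < band for p, a in zip(predicted, actual)]
    return sum(hits) / len(hits)


def share_beyond(predicted, actual, threshold):
    """Outlier proportion: share of predictions deviating by more than `threshold`."""
    misses = [abs(p - a) / a > threshold for p, a in zip(predicted, actual)]
    return sum(misses) / len(misses)


# Toy prices, made up for illustration (not taken from the Gangnam data set).
actual = [100.0, 200.0, 400.0, 500.0]
predicted = [104.0, 188.0, 400.0, 800.0]

print("MAPE:", mape(predicted, actual))
print("COD:", cod(predicted, actual))
print("Within 5%:", share_within(predicted, actual, 0.05))
print("Deviating by more than 50%:", share_beyond(predicted, actual, 0.5))
```

In the toy data, a single large miss (800 against 500) dominates both MAPE and COD, which is why the hitting rates and outlier proportions are useful as complementary views of the same errors.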
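The evaluation protocol behind Tables 5 to 7 (a random split into 90% training and 10% test samples, repeated 10 times, with the measures averaged across trials) can be sketched as follows. This is an assumption-laden illustration rather than the authors' implementation: `repeated_holdout`, the `fit` callable, and the placeholder model are hypothetical names, the rows are synthetic, and the OLS and RF estimators themselves are not reproduced here. Any model could be plugged in through `fit`.

```python
import random
from statistics import mean


def mape(predicted, actual):
    """Mean absolute percentage error, in percent."""
    return 100 * mean(abs(p - a) / a for p, a in zip(predicted, actual))


def repeated_holdout(rows, fit, n_trials=10, test_share=0.1, seed=0):
    """Average test MAPE over repeated random train/test splits.

    rows: list of (features, price) pairs.
    fit:  hypothetical callable that takes the training rows and returns
          a function mapping features to a predicted price.
    """
    rng = random.Random(seed)
    scores = []
    for _ in range(n_trials):
        shuffled = rows[:]
        rng.shuffle(shuffled)  # ignores the time order of the samples
        n_test = max(1, round(len(shuffled) * test_share))
        test, train = shuffled[:n_test], shuffled[n_test:]
        predict = fit(train)
        predicted = [predict(x) for x, _ in test]
        actual = [y for _, y in test]
        scores.append(mape(predicted, actual))
    return mean(scores)


def fit_price_per_area(train):
    """Placeholder model: one average price per unit of floor area."""
    rate = mean(price / area for area, price in train)
    return lambda area: rate * area


# Synthetic rows (floor area, price); the price is exactly 10 per unit of area.
rows = [(float(a), 10.0 * a) for a in range(50, 150)]
print(repeated_holdout(rows, fit_price_per_area))
```

Averaging over several random splits reduces the chance that a single favourable or unfavourable split drives the comparison between predictors.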
3.3. Comparison by time period

In the previous section, we shuffled the whole sample and randomly selected the training sets and test sets. In doing so, we ignored the time order of samples. For instance, in that case, samples from 2016 could be used for training and then applied to appraise a property in 2010. However, in the actual practice of mass appraisal, the information we can access is usually constrained to the present and past. Therefore, if we use more recent information to make a less recent appraisal, we may overstate the model’s predictive power in reality. To address this problem, this section presents the performance of the predictors within limited time segments.

As in the previous experiments, we divided the sample in each time period into 90% training sets and 10% test sets and compared the average MAPE from 10 experiments for OLS and RF predictions. Table 8 presents the results for each time segment, as divided into years.

Table 8. MAPE for each time segment (average of 10 trials)

        OLS        Random Forest
2006    23.11313   4.574283
2007    19.07825   8.020659
2008    18.65792   8.058142
2009    23.96337   9.916612
2010    23.46693   10.28114
2011    25.52543   11.8575
2012    15.28604   6.439383
2013    14.02813   4.597863
2014    14.28471   5.189567
2015    12.89255   4.125248
2016    12.96526   5.764724
2017    15.17713   4.313319

We noted several features of the results. First, the overall level of performance measured by MAPE is lower than that in the previous section. This is natural, since the samples for each time segment are smaller than the total sample. In this case, the performance of the predictors inevitably decreases.

Second, the performances of both predictors are unstable over the period. We hypothesize that this instability resulted from the smaller sample size and some economic events causing higher volatility in certain periods. From 2006 to 2011, the housing market in Korea was impacted by the global housing boom-bust cycle and subsequent financial crisis, and the annual rate of change in apartment prices was relatively more volatile than in the other periods. However, our data provides only the year of the contract, not its exact date. Hence, if the annual change in housing price is more severe, the unobservable information (the exact dates of the contracts) becomes more important, and the predictive power of the models decreases. Thus, we can expect that the average performance of both predictors is poorer from 2006 to 2011 (note that the MAPEs of both predictors peak in 2011) than from 2012 to 2017.

Finally, we noted that the RF predictor is still more accurate than the OLS predictor in all individual time segments. Although the performance of the RF predictor is volatile, the results show that it is always stronger than that of the OLS predictor. Roughly, the percent deviation rates of RF are lower than half of those of the OLS predictor, and the gap between the OLS and RF predictors is similar to that in the main result. In conclusion, the advantage of the RF predictor seems to remain even with a smaller data set and on different timelines.¹

¹ It also implies that the superiority of the RF predictor in the previous section did not result from the fact that we were allowed to use information from the “future”, i.e., more recent information for less recent appraisals.

Conclusions

In this paper, we discussed the features of the RF predictor in comparison to the conventional OLS-based predictor. This paper shows that the predictive performance of a machine learning-based predictor can be superior to that of the OLS-based approach. We used apartment transaction data from 2006–2017 in Gangnam, one of the most developed areas in Korea. We collected a data set covering 40% of all transactions in the selected area, and the samples were randomly divided into training sets consisting of 90% of all transactions and test sets consisting of the remaining 10% of transactions. We used the averages of 10 experiments to compare the performance measurements in order to reduce the possibility that the results occurred by chance.

The average percentage deviation between the predicted and actual market price was only around 5.5% for the machine learning predictor and almost 20% for the OLS-based predictor. Moreover, the probabilities that the RF prediction was within 3%, 5%, and 10% of the actual market price were 53.5%, 72%, and 90.3%, respectively, whereas those of the OLS-based prediction were 10.4%, 17.4%, and 34.6%, respectively. Furthermore, we found that the RF predictor made fewer outlier predictions than the conventional hedonic pricing model. The probability of the RF predictions deviating more than 50% from the actual price was found to be only 0.5%, while that of OLS-based predictions doing so was about 3.8%.

The contribution of this paper can be discussed in two ways. From a theoretical perspective, this paper shows that there are significant market complexities in the value determination process that cannot be fully accounted for under the simplified assumptions of the conventional hedonic pricing model (separability and constancy of an attribute’s effect on housing value). From a practical perspective, the
results are a demonstration that the accuracy of a machine learning-based mass appraisal can be surprisingly high in some cases (note that the MAPE of human appraisals is 12% in Cannon and Cole, 2011). We infer that the high predictive power derives from a combination of the features of the RF model and the data set we applied.

It is important to obtain an accurate estimation of the value of a house whose market price is not observed in order to construct a reliable house price index or to conduct a successful mass appraisal. Traditionally, the hedonic pricing model has been adopted as the appraisal machine, but, for several reasons, the accuracy of the OLS-based predictor can be undermined. This paper suggests that the RF predictor could be a complement to this linear regression method. Its results show that there is a significant loss in accuracy resulting from the simplification of reality in the OLS-based model and that some of that loss can be recovered by the RF predictor. This implies that the RF method can more successfully track the complexity of the value determination process that the OLS-based models cannot fully capture.

Author contributions

W. Kim and J. Hong conceived the study and were responsible for the design and development of the data analysis. W. Kim was responsible for data collection, and J. Hong and H. Choi were responsible for data analysis and interpretation.

Disclosure statement

There are no conflicts of interest.

References

Adair, A., McGreal, S., Smyth, A., Cooper, J., & Ryley, T. (2000). House prices and accessibility: the testing of relationships within the Belfast urban area. Housing Studies, 15(5), 699-716. https://doi.org/10.1080/02673030050134565

Antipov, E. A., & Pokryshevskaya, E. B. (2012). Mass appraisal of residential apartments: an application of Random forest for valuation and a CART-based approach for model diagnostics. Expert Systems with Applications, 39(2), 1772-1778. https://doi.org/10.1016/j.eswa.2011.08.077

Benson, E. D., Hansen, J. L., Schwartz, A. L., & Smersh, G. T. (1998). Pricing residential amenities: the value of a view. Journal of Real Estate Finance and Economics, 16(1), 55-73. https://doi.org/10.1023/A:1007785315925

Cannon, S. E., & Cole, R. A. (2011). How accurate are commercial real estate appraisals? Evidence from 25 years of NCREIF sales data. Journal of Portfolio Management, 37(5), 68-88. https://doi.org/10.3905/jpm.2011.37.5.068

Case, K. E., Quigley, J. M., & Shiller, R. J. (2005). Comparing wealth effects: the stock market versus the housing market. Advances in Macroeconomics, 5(1). https://doi.org/10.2202/1534-6013.1235

Čeh, M., Kilibarda, M., Lisec, A., & Bajat, B. (2018). Estimating the performance of random forest versus multiple regression for predicting prices of the apartments. ISPRS International Journal of Geo-Information, 7(5), 168. https://doi.org/10.3390/ijgi7050168

Chau, K. W., & Chin, T. L. (2003). A critical review of literature on the hedonic price model. International Journal for Housing Science and its Applications, 27(2), 145-165.

Chen, J. H., Ong, C. F., Zheng, L., & Hsu, S. C. (2017). Forecasting spatial dynamics of the housing market using Support Vector Machine. International Journal of Strategic Property Management, 21(3), 273-283. https://doi.org/10.3846/1648715X.2016.1259190

Clauretie, T. M., & Neill, H. R. (2000). Year-round school schedules and residential property values. Journal of Real Estate Finance and Economics, 20(3), 311-322. https://doi.org/10.1023/A:1007841326833

Darling, A. H. (1973). Measuring benefits generated by urban water parks. Land Economics, 49(1), 22-34. https://doi.org/10.2307/3145326

Debrezion, G., Pels, E., & Rietveld, P. (2007). The impact of railway stations on residential and commercial property value: a meta-analysis. Journal of Real Estate Finance and Economics, 35(2), 161-180. https://doi.org/10.1007/s11146-007-9032-z

Downes, T. A., & Zabel, J. E. (2002). The impact of school characteristics on house prices: Chicago 1987–1991. Journal of Urban Economics, 52(1), 1-25. https://doi.org/10.1016/S0094-1190(02)00010-4

Dubin, R. A., & Sung, C. H. (1990). Specification of hedonic regressions: non-nested tests on measures of neighborhood quality. Journal of Urban Economics, 27(1), 97-110. https://doi.org/10.1016/0094-1190(90)90027-K

Espey, M., & Lopez, H. (2000). The impact of airport noise and proximity on residential property values. Growth and Change, 31(3), 408-419. https://doi.org/10.1111/0017-4815.00135

Fan, G. Z., Ong, S. E., & Koh, H. C. (2006). Determinants of house price: a decision tree approach. Urban Studies, 43(12), 2301-2315. https://doi.org/10.1080/00420980600990928

Fletcher, M., Gallimore, P., & Mangan, J. (2000). Heteroscedasticity in hedonic house price models. Journal of Property Research, 17(2), 93-108. https://doi.org/10.1080/095999100367930

Garrod, G. D., & Willis, K. G. (1992). Valuing goods’ characteristics: an application of the hedonic price method to environmental attributes. Journal of Environmental Management, 34(1), 59-76. https://doi.org/10.1016/S0301-4797(05)80110-0

Gillard, Q. (1981). The effect of environmental amenities on house values: the example of a view lot. The Professional Geographer, 33(2), 216-220. https://doi.org/10.1111/j.0033-0124.1981.00216.x

Goodman, A. C. (1989). Topics in empirical urban housing research. In R. Muth, & A. Goodman (Eds.), The economics of housing markets (pp. 49-146). Chur, Switzerland: Harwood Academic.

Gu, J., Zhu, M., & Jiang, L. (2011). Housing price forecasting based on genetic algorithm and support vector machine. Expert Systems with Applications, 38(4), 3383-3386. https://doi.org/10.1016/j.eswa.2010.08.123

Hanson, S. (2004). The context of urban travel: concepts and recent trends. In S. Hanson, & G. Giuliano (Eds.), The geography of urban transportation (pp. 3-29). New York: The Guilford Press.

Harrison Jr, D., & Rubinfeld, D. L. (1978). Hedonic housing prices and the demand for clean air. Journal of Environmental Economics and Management, 5(1), 81-102. https://doi.org/10.1016/0095-0696(78)90006-2
Hayes, K. J., & Taylor, L. L. (1996). Neighborhood school characteristics: what signals quality to homebuyers? Economic Review-Federal Reserve Bank of Dallas, 2-9.

Huh, S., & Kwak, S. J. (1997). The choice of functional form and variables in the hedonic price model in Seoul. Urban Studies, 34(7), 989-998. https://doi.org/10.1080/0042098975691

Jud, G. D., & Watts, J. M. (1981). Schools and housing values. Land Economics, 57(3), 459-470. https://doi.org/10.2307/3146025

Kain, J. F., & Quigley, J. M. (1970). Measuring the value of housing quality. Journal of the American Statistical Association, 65(330), 532-548. https://doi.org/10.1080/01621459.1970.10481102

Lancaster, K. J. (1966). A new approach to consumer theory. Journal of Political Economy, 74(2), 132-157. https://doi.org/10.1086/259131

Li, M. M., & Brown, H. J. (1980). Micro-neighborhood externalities and hedonic housing prices. Land Economics, 56(2), 125-141. https://doi.org/10.2307/3145857

Limsombunchai, V. (2004, June). House price prediction: hedonic price model vs. artificial neural network. In New Zealand Agricultural and Resource Economics Society Conference (pp. 25-26), New Zealand.

Malpezzi, S. (2002). Hedonic pricing models: a selective and applied review. Housing Economics and Public Policy, 67-89. https://doi.org/10.1002/9780470690680.ch5

McCluskey, W., & Anand, S. (1999). The application of intelligent hybrid techniques for the mass appraisal of residential properties. Journal of Property Investment & Finance, 17(3), 218-239. https://doi.org/10.1108/14635789910270495

McMillan, D., Jarmin, R., & Thorsnes, P. (1992). Selection bias and land development in the monocentric model. Journal of Urban Economics, 31, 273-284. https://doi.org/10.1016/0094-1190(92)90056-Q

McMillan, M. L., Reid, B. G., & Gillen, D. W. (1980). An extension of the hedonic approach for estimating the value of quiet. Land Economics, 56(3), 315-328. https://doi.org/10.2307/3146034

Miller, N., Peng, L., & Sklarz, M. (2011). House prices and economic growth. Journal of Real Estate Finance and Economics, 42(4), 522-541. https://doi.org/10.1007/s11146-009-9197-8

Mu, J., Wu, F., & Zhang, A. (2014). Housing value forecasting based on machine learning methods. Abstract and Applied Analysis, 2014, Article ID 648047. https://doi.org/10.1155/2014/648047

Palmquist, R. B. (1992). Valuing localized externalities. Journal of Urban Economics, 31, 59-68. https://doi.org/10.1016/0094-1190(92)90032-G

Park, B., & Bae, J. K. (2015). Using machine learning algorithms for housing price prediction: the case of Fairfax County, Virginia housing data. Expert Systems with Applications, 42(6), 2928-2934. https://doi.org/10.1016/j.eswa.2014.11.040

Richardson, H. W., Vipond, J., & Furbey, R. A. (1974). Determinants of urban house prices. Urban Studies, 11(2), 189-199. https://doi.org/10.1080/00420987420080341

Ridker, R. G., & Henning, J. A. (1967). The determinants of residential property values with special reference to air pollution. Review of Economics and Statistics, 49(2), 246-257. https://doi.org/10.2307/1928231

Rodriguez, M., & Sirmans, C. F. (1994). Quantifying the value of a view in single-family housing markets. Appraisal Journal, 62, 600-603.

Rosen, S. (1974). Hedonic prices and implicit markets: product differentiation in pure competition. Journal of Political Economy, 82(1), 34-55. https://doi.org/10.1086/260169

Selim, H. (2009). Determinants of house prices in Turkey: hedonic regression versus artificial neural network. Expert Systems with Applications, 36(2), 2843-2852. https://doi.org/10.1016/j.eswa.2008.01.044

Sheppard, S. (1999). Hedonic analysis of housing markets. Handbook of Regional and Urban Economics, 3, 1595-1635. https://doi.org/10.1016/S1574-0080(99)80010-8

So, H. M., Tse, R. Y., & Ganesan, S. (1997). Estimating the influence of transport on house prices: evidence from Hong Kong. Journal of Property Valuation and Investment, 15(1), 40-47. https://doi.org/10.1108/14635789710163793

Song, Y., & Sohn, J. (2007). Valuing spatial accessibility to retailing: a case study of the single family housing market in Hillsboro, Oregon. Journal of Retailing and Consumer Services, 14(4), 279-288. https://doi.org/10.1016/j.jretconser.2006.07.002

Thaler, R. (1978). A note on the value of crime control: evidence from the property market. Journal of Urban Economics, 5(1), 137-145. https://doi.org/10.1016/0094-1190(78)90042-6

Verikas, A., Lipnickas, A., & Malmqvist, K. (2002). Selecting neural networks for a committee decision. International Journal of Neural Systems, 12(05), 351-361. https://doi.org/10.1142/S0129065702001229

Wilhelmsson, M. (2000). The impact of traffic noise on the values of single-family houses. Journal of Environmental Planning and Management, 43(6), 799-815. https://doi.org/10.1080/09640560020001692

Williams, A. W. (1991). A guide to valuing transport externalities by hedonic means. Transport Reviews, 11(4), 311-324. https://doi.org/10.1080/01441649108716793

Zhou, G., Ji, Y., Chen, X., & Zhang, F. (2018). Artificial neural networks and the mass appraisal of real estate. International Journal of Online Engineering, 14(3), 180-187. https://doi.org/10.3991/ijoe.v14i03.8420

Zurada, J., Levitan, A., & Guan, J. (2011). A comparison of regression and artificial intelligence methods in a mass appraisal context. Journal of Real Estate Research, 33(3), 349-387.