A Map-Based Recommendation System and House Price Prediction Model For Real Estate

International Journal of
Geo-Information
Article
A Map-Based Recommendation System and House Price
Prediction Model for Real Estate
Maryam Mubarak 1 , Ali Tahir 1, * , Fizza Waqar 2 , Ibraheem Haneef 3 , Gavin McArdle 4 , Michela Bertolotto 4
and Muhammad Tariq Saeed 5
1 Institute of Geographical Information Systems, National University of Science & Technology,
Islamabad 44000, Pakistan; [email protected]
2 GIS Plus Total Solutions, Islamabad 44000, Pakistan; [email protected]
3 Department of Mech & Aerospace Engg, Air University, Islamabad 44000, Pakistan;
[email protected]
4 School of Computer Science, University College Dublin, D04 V1W8 Dublin, Ireland;
[email protected] (G.M.); [email protected] (M.B.)
5 Research Centre for Modelling & Simulation, National University of Science & Technology,
Islamabad 44000, Pakistan; [email protected]
* Correspondence: [email protected]
Simple Summary: The accessibility of spatial big data helps real estate investors to make better
judgement calls and earn additional profit. Since location is considered necessary for real estate and
consequent decision-making, digital maps have become a prime resource for real estate purchases,
planning and development. Personalisation can support decision-making by identifying user
requirements and inclinations: when a user interacts with a digital map, it records all the user’s
activities. A personalised real estate portal can use this information to suggest properties, assist
homeowners and provide valuable real estate analytics. By monitoring user interactions through an
online real estate portal, the framework provided in this article can make personalised
recommendations of real estate based on content, collaboration and location. The effectiveness of the
recommendations was tested by the user feedback mechanism through a method of mean absolute
precision, and the results show that 79% precise suggestions were generated. Out of 5
recommendations produced, users were interested in at least 3. A separate house price prediction
model was also developed based on neural networks and classical regression techniques. This model
was implemented to assist users in making an informed decision regarding prospects of real estate
purchase.
Citation: Mubarak, M.; Tahir, A.; Waqar, F.; Haneef, I.; McArdle, G.; Bertolotto, M.; Saeed, M.T. A
Map-Based Recommendation System and House Price Prediction Model for Real Estate. ISPRS Int. J.
Geo-Inf. 2022, 11, 178. https://doi.org/10.3390/ijgi11030178
Abstract: In 2015, global real estate was worth $217 trillion, which is approximately 2.7 times the
global GDP; it also accounts for roughly 60% of all conventional global resources, making it one of
the key factors behind any country’s economic growth and stability. The accessibility of spatial big
data will help real estate investors make better judgement calls and earn additional profit. Since
location is deemed necessary for real estate and consequent decision-making, digital maps have
become a prime resource for real estate purchases, planning and development. Personalisation
can assist in making judgments by identifying user desires and inclinations, which can then be
recorded or captured as a user performs some interactions with a digital map. A personalised
real estate portal can use this information to suggest properties, assist homeowners and provide
valuable real estate analytics. This article presents a novel framework for recommending real estate
to users. By monitoring user interactions through an online real estate portal, the framework can
make personalised recommendations of real estate based on content, collaboration and location. The
effectiveness of the recommendations was tested by the user feedback mechanism through a method
of mean absolute precision, and the results show that 79% precise suggestions were generated,
i.e., out of 5 recommendations produced, users were interested in at least 3. Along with that, a
separate house price prediction model based on neural networks and classical regression techniques
was also implemented to assist users in making an informed decision regarding prospects of real
estate purchase.
Academic Editor: Wolfgang Kainz
Received: 3 January 2022
Accepted: 3 March 2022
Published: 7 March 2022
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and
institutional affiliations.
Copyright: © 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access
article distributed under the terms and conditions of the Creative Commons Attribution (CC BY)
license (https://creativecommons.org/licenses/by/4.0/).
ISPRS Int. J. Geo-Inf. 2022, 11, 178. https://doi.org/10.3390/ijgi11030178 https://www.mdpi.com/journal/ijgi

Keywords: real estate; map personalisation; map recommendation; house price prediction; estatech
maps; real estate analytics
1. Introduction
Driven by advertising technologies and goals to produce targeted ads, the personalisa-
tion and customisation of websites and services have become the new norm in our society.
The need for personalisation has been driven by the increase in data and information
available. Information overload, which makes it challenging to find relevant information,
has been a phenomenon for the past two decades. For example, a study from 2003 found
that unique information creation was estimated to be between 1 and 2 exabytes. This implied
that each human being must be processing 250 megabytes of information. Almost 20 years
later, this demonstrates the mounting need for efficient and accurate user recommendation
systems to help find pertinent data and information. Personalised content delivery to any
set of users may consist of multiple aspects.
A factor that plays a vital role in most personalised web interfaces is the interactivity
and the “user-friendly” nature of the User Interface (UI). Every web user, be it a novice,
or an expert, wants the interface to provide meaningful content delivered without having
much prior expertise about its functionality. This process involves a lot of work from a web
developer’s perspective but should be invisible and seamless to the end-user. Therefore,
various tools and techniques have been developed to implicitly collect data from users.
Implicit data collection, in simpler terms, is just the collection of a user’s data through
“interface interactions” without the user having to provide the data in a specific manner.
The data is then used to determine interests and make recommendations. At the same
time, another aspect growing in popularity is having the location information of a user to
make recommendations.
Such recommender systems are widely deployed in many consumer domains, such as
online shopping, although our research focuses on real estate recommendations. Real estate
recommendation is often about the location of a property item, so we have incorporated
online map interactions as a tool to understand a user’s interests. This paper presents
four principal approaches for effectively identifying property items in our real estate
portal: (1) analysis and implementation of content-based filtering for suggesting real
estate items; (2) a collaborative filtering approach that reduces the computational cost by
suggesting similar items to a similar group of users; (3) a location-based approach for
predicting the area of interest to the user based on geographical location and user
preferences; and (4) a price prediction model to assist users in making an informed
decision. The first two approaches were selected because the features of a real estate
database closely resemble those of a movie database. Both content-based filtering and
collaborative filtering have proven to provide precise recommendations to users [1].
Introducing a location-based approach is essential since property items have an
inherent location aspect.
We have used data from the Estatech Maps portal: https://www.the-estatech.com
(accessed on 15 September 2021) for the recommendation part of the study. We also obtained
data in explicit and implicit formats. In addition, historical data of properties and price
listings were obtained from Zameen.com (accessed on 15 September 2021), a real estate
portal for online property listings. The techniques and methods used for the recommendation
algorithms were the score tree process, TF-IDF and K-nearest neighbours. For house price
prediction, we compared two techniques, namely multiple linear regression and
Keras regression based on neural networks.
The remainder of the article is organised as follows: Section 2 presents the related
literature review. The methodological approach is given in Section 3. Section 4 presents
a discussion and the results, while Section 5 concludes the study and provides future
recommendations.
2. Related Work
Modern recommendation engines have emerged from the domain of information filtering,
a term created by [2], which outlines one solution, called content filters, for the issue of
retrieving the correct information from a massive pool of online data. To ascertain
a user’s choice correctly, multiple visualisation tools have also been developed to accurately
distinguish a user’s interests and inclinations. These tools can also be considered a form
of content filter. This domain has been progressing ever since. The authors of [3] demonstrate
various options for integrating a recommendation engine into a real estate portal’s user journey.
Furthermore, the same work validated how additional real estate details can
provide more accurate recommendation results when integrated into the proposed model
of deep learning and factorisation machines.
Another study by [4] aims to determine if consumer loyalty will help a recommender
system be more accurate. Other techniques, implemented by [5], use intelligent data analysis
methods to create a recommender framework that addresses the problem of recommending
the most appropriate components for each user at any given time. They have further
addressed the problem of converting an original dataset from a real component-based
application to an optimised dataset. After gathering the interaction data and developing a
dataset to produce optimised recommendation results, machine learning algorithms using
feature engineering techniques and feature selection methods were also applied. Users
and developers alike want information processing and its display to be swift. The system
developed by [6] is based on an implicit profiling system for tracking the user’s interests
through mouse movements.
A gap analysis approach by [7] identifies the differences between theory and reality in
presenting information on location choice by developing a seven-factor classification tool
for evaluating property websites. To capture the relations between the latent feature vectors
of real estate items, Ref. [8] utilised the average-based and individual-based geographical
regularisation terms. Both terms are integrated with the weighted regularised matrix
factorisation framework to model users’ implicit feedback behaviours to provide them with
personalised property recommendations.
A probabilistic model for collaborative filtering by [9] calculates the predicted values
for items against active users, given that there is information already available about those
active users. The same research divides collaborative filtering methods into two primary
modules, memory-based collaborative filtering and model-based collaborative filtering.
Additional probabilistic approaches have been presented, some more sophisticated than
others, including the work of [10]. The recommendation procedure is treated as a sequential
decision-making process, and the use of Markov decision chains has been suggested to
create a model. However, they do not state any improved accuracy over Breese’s projected
models. Another recommendation system by [11] applies content-based filtering, a fuzzy
technique for identifying similar and different content and a prediction algorithm for
identifying the right set of movie content for the user. At the same time, Ref. [12] developed
item-to-item centred algorithms. The aim was to provide better outcomes than user-based
algorithms, which was assessed by comparing the approach with K-nearest neighbour.
In the domain of GIS, a complete map personalisation system is developed by [13]
in which the users’ interests are implicitly recorded and given specific rankings based on
certain criteria fulfilment upon user’s mouse clicks or movements. As already mentioned,
map personalisation has become an area of interest since data overload has become a
common scenario in spatial information systems. In the model developed by [14], the
entire focus is to understand map usage patterns of the end-users. The goal is again
focused on developing personalised maps for users on a web interface. Working on
similar lines, RecoMap [13], is a web-based platform through which each user receives
customised spatial recommendations based on their likings. The results are presented in a
map interface highlighting the user’s personalised spatial recommendations. The adaptive
map also shows the user’s preferences and the context in which they are used. A different
approach by [15] is to build a recommendation system and map interface, represented in a
personalised format for the user to acquire quick results. Further inferences are made by
studying the user’s behaviour for system improvement.
Another recommender system designed by [16] is for real estate users who do not
have a user profile for any real estate portal. The session-based interaction of the user is
made more effective by utilising a user’s search context and ranking criteria for any suitable
property item. A portal developed by [8] specifically designed for real estate uses two basic
approaches for user profiling, an ontological structure and case-based reasoning. The pur-
pose is to save the end-user from the stress of massive online searching and deliver results
where the user gets quick recommendations based on their interests. A recommendation
system that is being used by the US-based real estate website “Trulia” utilises a “square
counting method” [17]. The method works well with large-scale datasets and delivers swift
results per the user’s preferences based on love and hate edge configurations.
Things have changed significantly in the real estate industry during the COVID-19
era. In some regions, house prices have shown signs of stagnancy and even, in some cases,
decreasing trends as people lost their livelihoods. These conditions have urged people
to tread more carefully while making investments in this sector. In such a scenario, a
price prediction model can help users make an informed decision. A method by [18] for
predicting house prices utilises a Mallows model averaging estimator, which is robust
in terms of spatial dependence. Another study on ML models for house price prediction
concludes that the random forest regressor model provides the best results amongst
all other compared models, such as linear regression, decision tree and k-means regression [19].
Another similar study carried out by [20] applies regression as a predictive model. They
use MSE, MAE and RMSE as the evaluation metrics for their model’s accuracy. Another
interesting study by [21] used Multiple Regression Analysis (MRA) to estimate property
prices for mass evaluation. The structural qualities and the property’s location were
viewed as two primary micro factors of house pricing. MRA was utilised to determine the
structural characteristics and locational attributes that statistically influence house price
using a sample of 106 house sale transactions from 2011 to 2015. An alternative approach
by [22] focuses on traditional solutions based on widely known methods and procedures
and faith in the infallibility and objectivity of a human analysing the real estate market.
Since modern technologies are also boldly entering the arena, the study’s key focus
is that organisations should stop viewing automated solutions (such as AVM, CAMA, and
AAVM) as functioning in opposition to traditional approaches and instead embrace them
as supplemental tools.
Our previous work in map personalisation discusses the initial concept of personal-
isation using real estate analytics [23]. It also evaluates background research relating to
the building blocks that lead to a recommendation engine for real-time analytics. Exten-
sive research in this field has revealed gaps between real-estate analytics and map-based
personalisation, recommendation and prediction; thus, we have tried to bridge this gap in
our research and initial development work. We also found motivation for our study and
consequent development since map-based personalised real estate portals do not widely
exist in the online real estate market. Having to sift through a plethora of online data is
no longer suitable for most users, and personalisation has become a key concept in every
aspect of data search. In our scenario, real estate test users have been interacting with a real
estate portal, “Estatech Maps”, to search and post property items. Our recommendation
system is based on three techniques. This includes content, collaboration and location-
based filtering. The interaction of users is captured via the map-based interface of the
real estate application, Estatech Maps, and stored in a database. Based on this data and
analysis, a user gets recommendations as per their area of interest. Along with that, we
have incorporated a module based on traditional regression techniques and Keras API for
predicting the future price trends of property items.
The subsequent section gives a detailed account of the research process regarding
data collection, its pre-processing, run-time environment creation, and model conception.
It discusses the following crucial areas of the research process in detail: (1) Data
collection and Technology. (2) Property Recommendation. (3) Price prediction model.
3. Methodology
The main focus of “Estatech Maps” is to provide personalised real estate listings to its users
on a map-based interface by making accurate recommendations and providing insight
about price trends of a user’s area of interest. Recommendation and price prediction were
the key focus areas to deliver map-based personalisation to the users. In the first stage,
a detailed study on the mathematical interpretation of recommendation algorithms was
carried out. The second stage focused on the algorithm designs, and in the third stage,
development based on those algorithms was carried out, and the models were implemented.
The validation and testing of these models were carried out in the final stage of the research.
The sequence of the study is illustrated in Figure 1.
Figure 1. Methodology Sequence.
Regarding price prediction, after researching various prediction techniques, two models
were selected. One is based on a classical regression technique, and the other relies
on neural networks.
3.1. Data Collection and Technology
User interaction data was extracted from the portal over a year (May 2020–March 2021).
Data were extracted in JSON format from a MongoDB database, which was converted to a
CSV format. It consisted of 1600 recorded user interactions with the portal. The data for
house price prediction was acquired from a Pakistani based real estate portal Zameen.com
(accessed on 15 September 2021) for two years between 2019–2020 for Islamabad City.
Both the datasets from Estatech maps and Zameen.com (accessed on 15 September
2021) were converted into test and training datasets. Zameen.com (accessed on 15 September
2021) data, used for a house price prediction model, was further converted into a
validation dataset. The data consisted of multiple files: User login information (User
demographics), Interaction Data (Most viewed properties list) and Item Data (Properties).
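The conversion and splitting steps described above can be illustrated with a minimal sketch that
uses only the Python standard library. The field names (user_id, property_id, event), the sample
records and the 80/20 split ratio are assumptions made for illustration; the article does not
specify the export schema or the split proportions.

```python
import csv
import io
import json
import random


def json_to_csv(json_text, fieldnames):
    """Convert a JSON array of flat records into CSV text."""
    records = json.loads(json_text)
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, extrasaction="ignore")
    writer.writeheader()
    for record in records:
        # Missing keys become empty cells so every row has the same columns.
        writer.writerow({name: record.get(name, "") for name in fieldnames})
    return buffer.getvalue()


def train_test_split(rows, test_fraction=0.2, seed=42):
    """Shuffle rows reproducibly and split them into (train, test)."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


# Hypothetical interaction records standing in for the database export.
sample_json = json.dumps([
    {"user_id": "u1", "property_id": "p1", "event": "click"},
    {"user_id": "u1", "property_id": "p2", "event": "view"},
    {"user_id": "u2", "property_id": "p1", "event": "click"},
    {"user_id": "u2", "property_id": "p3", "event": "view"},
    {"user_id": "u3", "property_id": "p4", "event": "click"},
])

csv_text = json_to_csv(sample_json, ["user_id", "property_id", "event"])
rows = list(csv.DictReader(io.StringIO(csv_text)))
train_rows, test_rows = train_test_split(rows, test_fraction=0.2)
```

A fixed seed keeps the split reproducible, which matters when recommendation results from
different runs are compared.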
TuriCreate was used to build the recommendation engine for content-based and
collaborative filtering, whereas a K-means clustering technique was employed for the
location-based recommendation. TuriCreate is an open-source toolkit for building Core ML
models for tasks like image recognition, object detection, style transfers, and
recommendation generation, among others.
TensorFlow and the Keras API were used as baseline technologies to build the house price
prediction model. Model loss and model accuracy were validated through the evaluation
metrics MSE, MAE and RMSE. TensorFlow is a free and open-source machine learning
software library. It can be used for various activities, but it focuses on deep neural
network training and inference. The Google Brain team created TensorFlow for internal
Google use. In 2015, it was published under the Apache License 2.0. TensorFlow was chosen
because it builds models using data flow graphs and enables programmers to create
large-scale neural networks with multiple layers. Keras is a deep learning API written in
Python that runs on top of the TensorFlow machine learning system. It was built with the
objective of allowing fast experimentation.
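For reference, the three evaluation metrics named above can be computed as in the following
sketch. It is written in plain Python rather than with TensorFlow or Keras so that it stays
self-contained, and the price values are invented purely for illustration.

```python
import math


def mean_squared_error(actual, predicted):
    """MSE: average of squared differences between actual and predicted values."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def mean_absolute_error(actual, predicted):
    """MAE: average of absolute differences between actual and predicted values."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def root_mean_squared_error(actual, predicted):
    """RMSE: square root of MSE, expressed in the same unit as the prices."""
    return math.sqrt(mean_squared_error(actual, predicted))


# Hypothetical prices (e.g., in millions of a currency unit).
actual_prices = [10.0, 12.0, 15.0, 20.0]
predicted_prices = [11.0, 12.0, 13.0, 22.0]

mse = mean_squared_error(actual_prices, predicted_prices)
mae = mean_absolute_error(actual_prices, predicted_prices)
rmse = root_mean_squared_error(actual_prices, predicted_prices)
```

MSE and RMSE penalise large errors more heavily than MAE, so reporting all three gives a
fuller picture of how a price model fails.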
3.2. Property Recommendation

The three areas of focus for the recommendation engine are discussed in detail in each
of the following sections.
3.2.1. Content-Based Filtering

The concept behind recommender systems is data analytics. This can be achieved
either by score-based algorithms or by suggesting to a user the top N items from a ranked
list of items. In our scenario, the recommender system is designed for suggesting property
items listed for sale or rent. If a person has interacted, through the map-based interface,
with a property item, say in area “A” with attribute array “X”, the recommender system can
display similar items for the user in an instant and accurate manner.
In content-based filtering, the angle between the user’s profile and the items the user is
interested in is determined. This cosine angle determines how close in space the vectors lie
to each other and is also termed cosine similarity. The closer they are, the more similar they
are deemed. Let us consider a vector “U” of users {user1, user2, user3 . . . .} and a vector
“P” of property items {p1, p2, p3, p4 . . . . . . }. The similarity between these two vectors can
be calculated as:
$$\mathrm{sim}(U, P) = \cos(\theta) = \frac{U \cdot P}{\|U\| \, \|P\|} \tag{1}$$
In other words:
$$\mathrm{sim}(U, P) = \frac{\text{number of people who viewed both P1 and P2}}{\text{number of people who viewed either P1 or P2}} \tag{2}$$
The cosine value or similarity in Equation (1) can range between −1 and 1. Based on
this value, the articles are organised in descending order, and the top recommendations are
made to the user.
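The two similarity measures can be sketched as follows. The feature layout of the vectors
(sale flag, scaled bedrooms, scaled price), the property identifiers and the viewer sets are
hypothetical; the article does not list the attributes used to build the user and item vectors.

```python
import math


def cosine_similarity(u, p):
    """Cosine of the angle between a user profile vector and an item vector."""
    dot = sum(a * b for a, b in zip(u, p))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_p = math.sqrt(sum(b * b for b in p))
    if norm_u == 0 or norm_p == 0:
        return 0.0  # convention for an empty profile or item
    return dot / (norm_u * norm_p)


def co_view_similarity(viewers_a, viewers_b):
    """Share of users who viewed both items among users who viewed either."""
    union = viewers_a | viewers_b
    return len(viewers_a & viewers_b) / len(union) if union else 0.0


def rank_items(user_profile, items, top_n=3):
    """Sort items by descending cosine similarity and keep the top N."""
    scored = [(item_id, cosine_similarity(user_profile, vector))
              for item_id, vector in items.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_n]


# Hypothetical features: [is_for_sale, bedrooms (scaled), price (scaled)]
user_profile = [1.0, 0.6, 0.4]
properties = {
    "p1": [1.0, 0.6, 0.4],
    "p2": [0.0, 0.2, 0.9],
    "p3": [1.0, 0.5, 0.5],
}
ranking = rank_items(user_profile, properties, top_n=2)
overlap = co_view_similarity({"u1", "u2", "u3"}, {"u2", "u3", "u4"})
```

With non-negative feature values, as in this example, the cosine similarity stays between
0 and 1; negative values only arise when the vectors contain negative components.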
The approach for content-based filtering is further explained in Figure 2, which shows
how a tree-based criterion for item selection works. The concept is based on how much
interactivity a user has with a specific item or category. Interest ratios are calculated
between corresponding categories based on “incrementing the value of frequency”. For
example, buyers’ interactions with rent or purchase categories define the interest ratio
between the two categories. The flow of the function which performs frequency calculation
is elaborated in Figure 3, which details another content-based filtering process, namely
TF-IDF. For example, suppose a user searches for “the rise of analytics” on Google. In that
case, it is inevitable that the word “the” will occur more frequently than “analytics”, but
the relative importance of “analytics” is higher from the search query point of view. In such
cases, TF-IDF weighting negates the effect of high-frequency words in determining the
significance of an item (document).
$$TF(t) = \frac{\text{Frequency of term } t \text{ in document}}{\text{Total number of terms in document}} \tag{3}$$
$$IDF(t) = \log_{10}\left(\frac{\text{Total number of documents}}{\text{Number of documents containing } t}\right) \tag{4}$$
TF(t) is simply the frequency of a word in a document, whereas IDF(t) signifies the
rarity of the word, so the fewer documents a word occurs in, the higher its IDF value.
In Equation (4), the log parameter is used to dampen the effect of high-frequency
words. We have utilised both the score tree process and TF-IDF approaches in formulating
our content-based filtering algorithm. Initially, user-user similarity and item-item
similarity are obtained in an array format. The next step in the process was the creation
of the item-user similarity matrix.
Figure 2. Score Tree process for Property Selection.
Figure 3. Frequency Calculating Function for Content-Based Filtering.
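The TF-IDF weighting can be illustrated with a short sketch. It assumes the standard form in
which the base-10 logarithm is applied to the ratio of the total number of documents to the
number of documents containing the term, and it uses a tiny invented corpus built around the
search query example from the text.

```python
import math


def term_frequency(term, document_tokens):
    """TF: occurrences of the term divided by the number of terms in the document."""
    return document_tokens.count(term) / len(document_tokens)


def inverse_document_frequency(term, corpus):
    """IDF: log10 of (number of documents / number of documents containing the term)."""
    containing = sum(1 for document in corpus if term in document)
    if containing == 0:
        return 0.0  # convention chosen here for terms absent from the corpus
    return math.log10(len(corpus) / containing)


def tf_idf(term, document_tokens, corpus):
    """Weight of a term in one document relative to the whole corpus."""
    return term_frequency(term, document_tokens) * inverse_document_frequency(term, corpus)


# Hypothetical corpus; each document is a list of lower-case tokens.
corpus = [
    "the rise of analytics".split(),
    "the house in the city".split(),
    "the price of the house".split(),
]

weight_the = tf_idf("the", corpus[0], corpus)
weight_analytics = tf_idf("analytics", corpus[0], corpus)
```

Because “the” appears in every document, its IDF is log10(1) = 0 and its weight vanishes,
whereas “analytics” appears in only one document and keeps a positive weight.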
3.2.2. Collaborative Filtering Approach

In our approach towards developing a collaborative filter for the portal, the test users
were divided into segments based on their preferences and items were recommended as
per mutual choices of users belonging to that segment. The more the user interacts with
items on display and rates them, the more precisely the system can suggest appropriate
items. The algorithms designed for collaborative filtering are mostly based on finding
the similarities between users on the grounds of the rank or rating they have given to
previous items. So, to predict the interest of user “u” in any item “i”, the weighted sum of
the interactions of other users with item “i” is computed, with each user weighted by their
similarity to user “u”. The prediction $PD_{u,i}$ is then calculated as:
$$PD_{u,i} = \frac{\sum_{v} \left( i_{v,i} \cdot s_{u,v} \right)}{\sum_{v} s_{u,v}} \tag{5}$$
where $PD_{u,i}$ is the prediction term for user “u” against an item “i”, $i_{v,i}$ is the
interaction by a user “v” with an item “i”, and $s_{u,v}$ is the likeness between the two
users, i.e., user “u” and user “v”.
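Equation (5) can be sketched as a similarity-weighted average. The interaction values, the
similarity scores and the choice to exclude the target user from the sum are assumptions made
for this illustration only.

```python
def predict_interaction(target_user, item, interactions, similarity):
    """Similarity-weighted average of other users' interactions with an item."""
    numerator = 0.0
    denominator = 0.0
    for other_user, user_items in interactions.items():
        if other_user == target_user:
            continue  # assumption: the target user does not vote for themselves
        s = similarity.get((target_user, other_user), 0.0)
        numerator += user_items.get(item, 0.0) * s
        denominator += s
    return numerator / denominator if denominator else 0.0


# Hypothetical data: 1.0 marks an interaction, a missing key means no interaction.
interactions = {
    "u": {"p1": 1.0},
    "v1": {"p1": 1.0, "p3": 1.0},
    "v2": {"p4": 1.0},
}
# Hypothetical likeness scores between the target user and the others.
similarity = {("u", "v1"): 0.8, ("u", "v2"): 0.2}

score_p3 = predict_interaction("u", "p3", interactions, similarity)
score_p4 = predict_interaction("u", "p4", interactions, similarity)
```

The item favoured by the more similar user receives the higher predicted score, which is the
behaviour the weighted sum is meant to capture.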
As per Table 1, the interactions between users and properties are recorded, and
suggestions to a new user “U1” are generated. The symbol “x” represents an
interaction between a user and a property item. It is evident that there is more similarity
between user 1 and user 2 than between user 1 and user 3. Based on this, user 1 and user 2
will be grouped together for future recommendations. Algorithm 1 depicts a generalised
algorithm designed for grouping user 1 and user 2 together so that the same properties get
recommended to them.
Table 1. Property Interaction Matrix.
      P1   P2   P3   P4
U1    x    x    -    -
U2    x    x    x    -
U3    -    -    -    x
Algorithm 1 Collaborative Recommendation Algorithm for New User “U1”

Input: Properties Dataset → all properties
Neighbours used for ranking → K
New User for recommendation → U1
Current recommendations for New User U1 → ∅
Users location history → L
rank = 0
Output: N items to be recommended
For each → property ∈ all properties do
    if (users for P1 == users for P2) then
        rank++
    Group according to the nearest neighbour in similarity (K, property, user, L) = users for P1 && users for P2
    Recommendations [U1] → [P3]
Descending rank.sort (properties)
Return Recommendations []
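The grouping idea behind Algorithm 1 can be sketched with the interaction matrix of Table 1.
This is an illustrative approximation rather than the authors' implementation: it measures
user similarity by the overlap of interacted properties, and it leaves out the location
history L that the algorithm also takes as input.

```python
def jaccard(items_a, items_b):
    """Overlap between two sets of interacted properties."""
    union = items_a | items_b
    return len(items_a & items_b) / len(union) if union else 0.0


def recommend(target_user, interactions, k=1, n=5):
    """Recommend properties seen by the K most similar users but not by the target."""
    target_items = interactions[target_user]
    # Rank all other users by similarity to the target and keep the K nearest.
    neighbours = sorted(
        (user for user in interactions if user != target_user),
        key=lambda user: jaccard(target_items, interactions[user]),
        reverse=True,
    )[:k]
    scores = {}
    for user in neighbours:
        weight = jaccard(target_items, interactions[user])
        for item in interactions[user] - target_items:
            scores[item] = scores.get(item, 0.0) + weight
    # Highest score first; ties broken alphabetically for a stable order.
    ranked = sorted(scores, key=lambda item: (-scores[item], item))
    return ranked[:n]


# Interaction matrix from Table 1 ("x" entries become set members).
table_1 = {
    "U1": {"P1", "P2"},
    "U2": {"P1", "P2", "P3"},
    "U3": {"P4"},
}

recommendations_u1 = recommend("U1", table_1, k=1)
```

User U2 shares two of three properties with U1, while U3 shares none, so U2 is selected as the
nearest neighbour and P3 is the property suggested to U1.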
3.2.3. Location-Based Filtering

The purpose of a location-based recommendation system is to recommend items based
on the geographical location of a user. In this scenario, recommendations can also be
made for a new user (the cold start problem): items are recommended based on users at
nearby locations who may align with the new user on other parameters such as age or
gender. A location-based recommendation can save people considerable time and travel
cost when displayed effectively through an interactive interface.
Equation (6) calculates the probability that a user "u" interacts with an item "i",
based on the distance of that item from all previous interactions of the user, which, in our
case, are other property items. Algorithm 2 specifies the procedure for test user 1.
$$L_{geo}(u, i) = \prod_{k \in I} f\left(\mathrm{distance}(i, k)\right) \qquad (6)$$
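The form of f is not given in the text, so the following is only a sketch of Equation (6) under stated assumptions: f is taken to be an exponential decay with a hypothetical scale parameter `scale_km`, distance is the great-circle (haversine) distance, and the coordinates are made-up points roughly within Islamabad rather than values from the paper's data.

```python
import math

def haversine_km(a, b):
    """Great-circle distance in kilometres between two (lat, lon) points."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))

def decay(distance_km, scale_km=5.0):
    """Assumed form of f: the value falls off exponentially with distance."""
    return math.exp(-distance_km / scale_km)

def geo_likelihood(candidate, history, scale_km=5.0):
    """Equation (6): product of f(distance(i, k)) over the past items k in I."""
    result = 1.0
    for past in history:
        result *= decay(haversine_km(candidate, past), scale_km)
    return result

# Illustrative coordinates only, not taken from the paper's dataset.
history = [(33.70, 73.05), (33.71, 73.06)]
near = (33.705, 73.055)
far = (33.60, 73.20)
print(geo_likelihood(near, history) > geo_likelihood(far, history))  # True
```

Because the terms are multiplied, a candidate that is far from even one previously viewed property is penalised heavily; any monotonically decreasing f would show the same qualitative behaviour.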
In Algorithm 2, a generalised algorithm for calculating location-based recommendations
for users is presented. It considers at least 50 users in a cluster for a similarity
score calculation.
Algorithm 2 Location-based recommendation algorithm for New User “U1”

Input: A user
       Collection of users → U
       Users' location history → L
       Similarity matrix between users → M
       Current recommendations for new user based on location → ∅
       Count = 0
Output: Top N location-based property recommendations based on users' similarities and preferences
M = similarity matrix values
Number of nearby users selected for similarity score calculation ≤ 50
for each user ∈ U do
    LOC = location discovery // level of hierarchy or granularity of location
    Calculate similarity distance score
    Calculate distance from nearby users
    The similarity score of User U1's last x interacted properties == the nearby user's similarity score
Sort properties based on count
Select top N scores
Select top N properties
return N recommendations
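The core of Algorithm 2 can be sketched as follows. This is a simplified illustration, not the authors' implementation: it ranks neighbours by distance only and omits the similarity matrix M, it uses an equirectangular distance approximation, and every user, coordinate and property ID except 689 is hypothetical.

```python
import math
from collections import Counter

def distance_km(a, b):
    """Equirectangular approximation of the distance between (lat, lon) points."""
    mean_lat = math.radians((a[0] + b[0]) / 2)
    dx = math.radians(b[1] - a[1]) * math.cos(mean_lat)
    dy = math.radians(b[0] - a[0])
    return 6371.0 * math.hypot(dx, dy)

def recommend_by_location(target, locations, histories, n=2, max_neighbours=50):
    """Count the properties viewed by the nearest users and return the top n."""
    others = [u for u in locations if u != target]
    # Keep at most `max_neighbours` users, nearest first (the "<= 50" step).
    nearest = sorted(others, key=lambda u: distance_km(locations[target], locations[u]))
    nearest = nearest[:max_neighbours]
    seen = set(histories.get(target, []))
    counts = Counter(
        prop for u in nearest for prop in histories.get(u, []) if prop not in seen
    )
    # Sort by count (descending), then by property ID for a deterministic order.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [prop for prop, _ in ranked[:n]]

# Hypothetical users, coordinates and interaction histories.
locations = {"U1": (33.70, 73.05), "U2": (33.71, 73.06),
             "U3": (33.69, 73.04), "U4": (33.60, 73.20)}
histories = {"U1": [], "U2": [689, 101], "U3": [689], "U4": [555]}
print(recommend_by_location("U1", locations, histories, n=2, max_neighbours=2))
# [689, 101]
```

Because U1 has no history, the result depends entirely on nearby users, which is how a location-based approach can serve a cold-start user.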
3.3. Price Prediction Model

The critical aspect to notice in the price prediction model is that the data used for this
analysis is the “offered set of prices” by the real estate portal Zameen.com (accessed on
15 September 2021). These prices can change as per the market variations or any redun-
dancy in the real estate sector.
For the prediction and analysis aspect, two regression techniques, namely (1) Multi-
ple linear regression and (2) Keras regression, were selected. The cross-comparison and
validation of these techniques were performed. The one that performed better in terms of
variance score was selected as the final model for visualising house prices.
3.3.1. Multiple Linear Regression

This is a type of linear regression in which the supposition is that the dependent
variable y and the independent variables x have a linear or direct relationship. We used the
Sklearn library to import the Linear Regression module. As already mentioned, our dataset
was divided into a test set and a train set.
3.3.2. Keras Regression

We use regression techniques to predict the dependent variable y, which is price.
We have 14 features (property_id, location_id, property_type, price in pkr, price in dollars,
location, city, province, bedrooms, bathrooms, area purpose, date of addition to the portal,
area in Marla, area in sq. ft); therefore, we selected 14 neurons as the baseline, along with
one output and one input layer for the model. There are 4 hidden layers.
The model was trained for 400 epochs, with the training and validation precision
being recorded during each cycle. Finally, the model was run on both the train and test data,
with the loss function being measured at each epoch to keep track of how well the model
is performing.
4. Results and Discussion

4.1. Content-Based and Collaborative Filtering Model Building
We adopted the Sklearn library as it contains a module called pairwise distance, which
identifies any two items which have similar characteristics or any two users who have
similar interests. To apply such a distance, we defined a function that returns the parameters
of interactions, similarity, and the type against which we are obtaining the similarity. The
algorithm generates suggestions based on the user’s profile (collaborative filtering model)
for the first case. For the second case, the suggestions are based on the item’s attributes
(content-based filtering). In the end, we were able to obtain recommendations for both
users and items. In Tables 2 and 3, it can be seen that for all users, “U”, scores “S” are
obtained in descending order, with the highest similarity scores at the top.
Table 2. Manual Model, Collaborative Filtering Scores.
User S1 S2 S3 ... S(N-2) S(N-1) S(N)

U1 2.065 0.734 0.629 ... 0.393 0.393 0.392
U2 1.763 0.384 0.196 ... −0.088 −0.086 −0.086
U3 1.795 0.329 0.158 ... −0.136 −0.134 −0.134
U4 1.591 0.275 0.102 ... −0.167 −0.166 −0.166
U5 1.810 0.404 0.275 ... −0.009 −0.008 −0.008
Table 3. Manual Model, Content-Based Filtering.
User S1 S2 S3 ... S(N-2) S(N-1) S(N)

U1 0.446 0.475 0.505 ... 0.588 0.573 0.566
U2 0.108 0.132 0.125 ... 0.134 0.136 0.137
U3 0.085 0.091 0.087 ... 0.084 0.089 0.090
U4 0.032 0.045 0.042 ... 0.053 0.051 0.052
U5 0.157 0.174 0.189 ... 0.199 0.197 0.200
In Tables 2 and 3, the scores obtained against each user are not easy to interpret:
it is not clear against which property ID the user is getting the suggested items
of interest. To make our results clearer, we utilised the Turicreate library, which made the
results easier to understand. Table 4 represents content-based recommendation
model results. The model was assessed for five users of the portal, and recommendations
were generated for them.
In Table 4, all 5 users are recommended the same 5 property items because those
items are the most interacted with.
Table 5 shows the properties recommended to users based on grouping with other
users having similar interests. Property IDs having a higher score are ranked higher. Each
user is recommended a different set of properties, which clearly shows that personalisation
exists for each user.
Table 4. Content-Based Recommendation using Turicreate.
User ID Property ID Interaction Rank

1 1599 9.0 1
1 1201 7.0 2
1 1189 5.0 3
1 1122 4.0 4
1 814 3.0 5
2 1599 9.0 1
2 1201 7.0 2
2 1189 5.0 3
2 1122 4.0 4
2 814 3.0 5
3 1599 9.0 1
3 1201 7.0 2
3 1189 5.0 3
3 1122 4.0 4
3 814 3.0 5
4 1599 9.0 1
4 1201 7.0 2
4 1189 5.0 3
4 1122 4.0 4
4 814 3.0 5
5 1599 9.0 1
5 1201 7.0 2
5 1189 5.0 3
5 1122 4.0 4
5 814 3.0 5
Table 5. Collaborative Filtering Using Turicreate.
User ID Property ID SCORE Rank

1 327 0.989 1
1 409 0.953 2
1 599 0.814 3
1 487 0.783 4
1 551 0.759 5
2 50 1.116 1
2 171 1.075 2
2 431 0.923 3
2 005 0.832 4
2 137 0.793 5
3 333 0.626 1
3 388 0.603 2
3 375 0.544 3
3 381 0.536 4
3 392 0.522 5
4 055 1.113 1
4 248 1.039 2
4 121 0.930 3
4 342 0.904 4
4 151 0.899 5
5 175 1.033 1
5 287 0.943 2
5 067 0.837 3
5 099 0.834 4
5 057 0.786 5
4.2. Location-Based Recommendation Model Building through K-Means Clustering

K-means clustering ascertains the "k" number of centroids within a dataset. After
that, it assigns every data point to the closest cluster. These data points eventually end
up being in the cluster with the nearest mean. In our approach, the purpose of applying
K-Means clustering is to group similar users based on their respective locations. As the
users get clustered, the top most searched or interacted item among that user group starts
to get recommended to each user. Figure 4 shows auto-generated locations for users from
different places in Islamabad, along with the corresponding cluster IDs. As one hovers
above any cluster, it shows the most searched item in that cluster, which gets
recommended to users of that cluster. For example, in one of the clusters, the count of
searched property ID 689 is the highest. Since it is equal to the highest count for that
cluster, all users falling within that cluster will get property ID 689 as the recommended
property for view.
Figure 4. Location Based Clusters over Islamabad City.
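The clustering step described above can be illustrated with a small, dependency-free version of Lloyd's K-means algorithm followed by a per-cluster count of searched properties. This is a sketch under assumptions: latitude and longitude are treated as planar coordinates, the starting centroids are supplied by hand, and all user locations and property IDs other than 689 are hypothetical.

```python
from collections import Counter

def kmeans(points, initial_centroids, iterations=10):
    """Plain Lloyd's algorithm on 2-D points from given starting centroids."""
    centroids = list(initial_centroids)
    labels = []
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(len(centroids)),
                key=lambda c: (p[0] - centroids[c][0]) ** 2 + (p[1] - centroids[c][1]) ** 2)
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = (sum(m[0] for m in members) / len(members),
                                sum(m[1] for m in members) / len(members))
    return labels

def top_property_per_cluster(labels, searches):
    """Most-searched property ID within each cluster of users."""
    per_cluster = {}
    for label, props in zip(labels, searches):
        per_cluster.setdefault(label, Counter()).update(props)
    return {label: counts.most_common(1)[0][0]
            for label, counts in per_cluster.items() if counts}

# Hypothetical user locations (lat, lon) and the property IDs each user searched.
points = [(33.70, 73.05), (33.71, 73.06), (33.69, 73.04), (33.60, 73.20), (33.61, 73.21)]
searches = [[689, 12], [689], [689, 40], [77], [77, 12]]

labels = kmeans(points, initial_centroids=[points[0], points[3]])
top = top_property_per_cluster(labels, searches)
print(labels)  # [0, 0, 0, 1, 1]
print(top)     # {0: 689, 1: 77}
```

Every user in cluster 0 would then be shown property 689, mirroring the example in the text. A production system would more likely use a library implementation with several random restarts.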
4.3. Recommender System Validation

To validate the recommendations, one can simulate user behaviour and fill in
the possible or missing ratings or, in our case, the interactions a user might have with
any prospective property items. The simulated values can then be evaluated with
error metrics such as the mean squared error (MSE) to determine the deviation of predicted
from observed values. The overall error of these values provides an overview of the
accuracy of our model.
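A minimal sketch of this kind of check is shown below: the mean squared error is computed only over user-item pairs for which an interaction was actually observed. The matrices are hypothetical and the masking convention (`None` for unobserved pairs) is an assumption made for illustration.

```python
def masked_mse(observed, predicted):
    """Mean squared error over the entries where an observation exists."""
    squared = [
        (o - p) ** 2
        for obs_row, pred_row in zip(observed, predicted)
        for o, p in zip(obs_row, pred_row)
        if o is not None
    ]
    return sum(squared) / len(squared)

# Hypothetical interaction counts; None marks a user-item pair never observed.
observed = [[4.0, None, 3.0], [None, 2.0, 5.0]]
predicted = [[3.5, 3.0, 3.0], [4.0, 2.5, 4.0]]
print(masked_mse(observed, predicted))  # 0.375
```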
Table 6 shows the generated matrix of interactions a user would likely have with
property items and the MSE calculation for the overall matrix. The model can also be
validated through recall and precision. Both are very useful, as they show how accurate
the recommendations are. However, neither metric takes into account the order in which
the recommended items are ranked.
Table 6. Items matrix for prospective interactions.
User I1 I2 I3 ... I(N-2) I(N-1) I(N)

U1 3.656 3.504 3.488 ... 3.545 3.534 3.534
U2 3.65 3.507 3.503 ... 3.531 3.568 3.568
U3 3.601 3.450 3.452 ... 3.483 3.504 3.502
U4 3.686 3.518 3.515 ... 3.535 3.518 3.554
U5 3.708 3.549 3.548 ... 3.582 3.595 3.583
Iteration: 100. Total Mean Squared Error = 337.6037.
MAP@k (Mean Average Precision at k) is an evaluation metric that also considers the order
of the recommended items. We recommended 5 items to each of the 5 users, so k = 5.
We set up an experimental environment for our group of five portal users, who were
provided with a list of recommended items in the order generated by our recommender
engine. The users interacted with certain items and provided verbal feedback on whether
the generated recommendations were of interest. For example, the statistical accuracy of
the recommendation engine for User 1 is as follows.
• [1, 1, 1, 0, 0], where 1 stands for a correct recommendation with which the user
interacted and 0 stands for a recommendation with which the user did not interact.
• [1/1, 2/2, 3/3, 2/3, 1/3] is the precision at k.
• (1/5) [1/1 + 2/2 + 3/3 + 2/3 + 1/3] = 0.8 is the average precision at k.
The precision is higher for the first three items, which were interacted with, but falls for
the last two items, with which the user did not interact. Therefore, for user 1, the average
precision is about 80%. For the whole set of users, the mean average precision is obtained
by taking the mean of the users' average precisions.
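For comparison, the sketch below computes precision at k and average precision under their common textbook definitions, where precision at rank k is the number of hits in the top k divided by k, and average precision is the mean of those values over the ranks at which a hit occurs. Under these definitions the relevance vector [1, 1, 1, 0, 0] gives 3/4 and 3/5 at ranks 4 and 5, so the numbers differ from the worked list above, which appears to follow a different convention. The function names are illustrative.

```python
def precision_at_each_rank(hits):
    """Precision at k for k = 1..len(hits); hits[i] is 1 if item i was relevant."""
    precisions, correct = [], 0
    for k, hit in enumerate(hits, start=1):
        correct += hit
        precisions.append(correct / k)
    return precisions

def average_precision(hits):
    """Mean of precision at k over the ranks where a relevant item appears."""
    precisions = precision_at_each_rank(hits)
    relevant = [p for p, hit in zip(precisions, hits) if hit]
    return sum(relevant) / len(relevant) if relevant else 0.0

def mean_average_precision(all_hits):
    """MAP@k: mean of the per-user average precision values."""
    return sum(average_precision(h) for h in all_hits) / len(all_hits)

user1 = [1, 1, 1, 0, 0]
print(precision_at_each_rank(user1))  # [1.0, 1.0, 1.0, 0.75, 0.6]
print(average_precision(user1))       # 1.0
```

Other normalisations are also in use (for example dividing the sum by k rather than by the number of hits), so the convention should be stated alongside any reported MAP@k value.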
4.4. House Price Prediction Model

As previously mentioned, we cross-compared and validated two property price
prediction models: (1) Multiple Linear Regression (MLR) and (2) Keras Regression. MLR is
based on traditional regression techniques, whereas Keras has its basis in neural networks.
We tested both approaches in a runtime Python environment. Figure 5 provides a high-level
description of the procedure adopted for the model. Both models performed well with the
given data and parameters. Still, the model with lower error rates and a better variance
score (coefficient of determination) was chosen as the final model for deployment.
Figure 5. High-Level Description of Price Model.

4.4.1. Multiple Linear Regression
After running the first multiple linear regression model, Figure 6 illustrates the
top-recommended properties based on location recommendations in different localities of
Islamabad city, while Figure 7 represents the price prediction visualisation. The actual
price is the one that was already present in the test data, whereas the predicted price was
obtained after running the model on the train data. Table 7 represents the MAE, MSE and
RMSE errors of these predictions. We can also see the variance score to be 0.70397
approximately.
Figure 6. Top Recommended Properties Based on Location Recommendation in different localities
of Islamabad City.
Figure 7. Actual vs. Predicted Price Visualisation of MLR.
Table 7. Complete Model Results: Multiple Linear Regression (Units in US dollars).
Evaluation Metric Score

MAE 126,028.201
MSE 40,658,017,783
RMSE 201,638
VarScore 0.7039
4.4.2. Keras Regression
The variance score of Keras regression is approximately 0.8028. This is a better
performance than the multiple linear regression approach. Furthermore, numerical errors
for RMSE in the case of Keras have also been reduced, as shown in Tables 8 and 9.
Therefore, our predictions in this case scenario are closer to actual pricing.
Table 8. Complete Model Results: Keras (Units in US dollars).
Evaluation Metric Score

MAE 101,542
MSE 27,102,661,020
RMSE 164,628
VarScore 0.8028
Table 9. MLR and Keras Price Predictions (Units in US dollars). 0.8028
Actual (MLR) Predicted (MLR) Actual (Keras) Predicted (Keras)

349,950.0000 530,708.04458 349,950.0000 508,257.65625
450,000.0000 667,170.68394 450,000.0000 623,626.87500
635,000.0000 553,264.86718 635,000.0000 586,021.62500
355,500.0000 346,623.22842 355,500.0000 321,635.18750
246,950.00000 61,187.19574 246,950.00000 219,407.46875
406,550.00000 481,129.98291 406,550.00000 548,740.31250
350,000.00000 312,696.35790 350,000.00000 397,068.87500
226,500.00000 273,842.64629 226,500.00000 238,060.96875
265,000.00000 280,530.76516 265,000.00000 287,272.28125
656,000.00000 532,925.01517 656,000.00000 471,043.90625
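The metrics reported in Tables 7 and 8 can be computed as sketched below, here applied to the ten rows listed in Table 9. Two caveats apply: ten rows are only a small sample of the test set, so the results will not reproduce the values in Tables 7 and 8, and "VarScore" is assumed to be the explained variance score, 1 - Var(y - ŷ) / Var(y), which the text does not state explicitly.

```python
import math

# The ten rows listed in Table 9 (US dollars).
actual = [349950.0, 450000.0, 635000.0, 355500.0, 246950.0,
          406550.0, 350000.0, 226500.0, 265000.0, 656000.0]
pred_mlr = [530708.04458, 667170.68394, 553264.86718, 346623.22842, 61187.19574,
            481129.98291, 312696.35790, 273842.64629, 280530.76516, 532925.01517]
pred_keras = [508257.65625, 623626.87500, 586021.62500, 321635.18750, 219407.46875,
              548740.31250, 397068.87500, 238060.96875, 287272.28125, 471043.90625]

def regression_metrics(y_true, y_pred):
    """Return MAE, MSE, RMSE and the (assumed) explained variance score."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n
    mse = sum(e * e for e in errors) / n
    mean_true = sum(y_true) / n
    mean_err = sum(errors) / n
    var_true = sum((t - mean_true) ** 2 for t in y_true) / n
    var_err = sum((e - mean_err) ** 2 for e in errors) / n
    return {"MAE": mae, "MSE": mse, "RMSE": math.sqrt(mse),
            "VarScore": 1 - var_err / var_true}

mlr = regression_metrics(actual, pred_mlr)
keras = regression_metrics(actual, pred_keras)
for name, scores in (("MLR", mlr), ("Keras", keras)):
    print(name, {metric: round(value, 4) for metric, value in scores.items()})
```

On these ten rows the Keras predictions have the lower mean absolute error, which points in the same direction as the full-test-set comparison in Tables 7 and 8.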
Figures 8 and 9 represent different ways prices have been changing in the city of
Islamabad. In Figure 8, mean sector (neighbourhood area) prices are highlighted. This
visualisation provides a clear overview of which areas could show drastic price changes and
which areas will remain stagnant. The data have been analysed for the past 2 years, and the
predictions show how prices will change or remain the same in the coming years. The blue
areas in the figure depict neighbourhoods where prices are predicted to increase
significantly in the coming years. In contrast, red areas indicate stagnancy in prices,
providing a user with a clearer picture to assist in decision making.
Figure 8. Price Prediction Visualisation over Islamabad City.

Figure 9. A sector-wise Predicted Price Visualisation in Islamabad City.
5. Conclusions
Three different recommendation algorithms for the real estate portal "Estatech Maps"
were developed along with two different models for house price prediction. First, we
set our goals to analyse and implement content-based filtering for suggesting real estate
items. The collaborative filtering approach was used for reducing the computational cost
by suggesting similar items to a similar group of users. Then, we applied the location-based
approach for predicting the areas of interest to the user based on the user’s geographical
location. All this was achieved with a minimum precision of 79%. Prediction models were
created, and results were visualised by price increase, decrease or stagnancy in multiple
sectors of Islamabad city to better assist people planning future land asset purchases. Our
neural network-based prediction model predicted house prices with a variance score of
approximately 0.80. This work can be utilised in any real-estate sale and purchase domain
and will improve the overall user experience of real estate portals. This demonstrates the
viability of our map-based system in providing data and recommendations to users based on
the popularity of an item, user similarity and geographical location.
While nowadays recommendation and predictive analysis are becoming a common
trend in even the smallest of businesses, in Pakistan, the real estate industry is lacking
when it comes to implementing these techniques not only in terms of a map-based interface
but also in terms of presenting these items to the user in an effective way. Therefore, our
approach for displaying an item of interest to the user on a map-based interface would be
one of the pioneers in real estate portals in Pakistan.
We have used sequential NN models for our recommendation and prediction in this
research. One area of improvement and basis for future work could be exploring and
implementing these as parallel models to improve response time and efficiency. Another
approach could be combining multiple techniques to create a hybrid model. The same
approach was used in the study where the Cobb-Douglas and linear regression models
were combined to form a mathematical model [24]. GIS was an additional tool to organise
the regional data of the area under study. In turn, this can cover a broader spectrum of
users’ behaviours and avoid high computational costs at the server end.
Author Contributions: Conceptualization, Maryam Mubarak, Ali Tahir and Fizza Waqar; method-
ology, Maryam Mubarak and Ali Tahir; software, Maryam Mubarak, Ali Tahir and Fizza Waqar;
validation, Ali Tahir, Ibraheem Haneef, Gavin McArdle and Michela Bertolotto; formal analysis,
Maryam Mubarak, Ali Tahir and Muhammad Tariq Saeed; investigation, Gavin McArdle, Maryam
Mubarak, Ibraheem Haneef, Ali Tahir and Muhammad Tariq Saeed; resources, Ali Tahir and Muham-
mad Tariq Saeed; data curation, Maryam Mubarak, Ali Tahir and Fizza Waqar; writing—original
draft preparation, Maryam Mubarak, Ali Tahir and Fizza Waqar; writing—review and editing, Gavin
McArdle, Michela Bertolotto and Muhammad Tariq Saeed; visualization, Maryam Mubarak, Fizza
Waqar and Ali Tahir; supervision, Ali Tahir and Muhammad Tariq Saeed; project administration, Ali
Tahir, Ibraheem Haneef and Muhammad Tariq Saeed; funding acquisition, Ali Tahir and Ibraheem
Haneef. All authors have read and agreed to the published version of the manuscript.
Funding: This research received the funding from Higher Education Commission (HEC), Pakistan,
under grant no. TDF03-249.
Informed Consent Statement: Not applicable.
Data Availability Statement: Not applicable.
Acknowledgments: This research was supported by the Higher Education Commission (HEC),
Pakistan, under grant no. TDF03-249. The authors gratefully acknowledge their support.
Conflicts of Interest: The authors declare no conflict of interest.
References
1. Geetha, G.; Safa, M.; Fancy, C.; Saranya, D. A hybrid approach using collaborative filtering and content based filtering for
recommender system. J. Phys. Conf. Ser. 2018, 1000, 012101. [CrossRef]
2. Peter, D. Electronic junk. Commun. ACM 1982, 25, 163.
3. Knoll, J.; Groß, R.; Schwanke, A.; Rinn, B.; Schreyer, M. Applying Recommender Approaches to the Real Estate E-Commerce
Market. In International Conference on Innovations for Community Services; Springer: Cham, Switzerland, 2018; pp. 111–126.
4. Bai, Y.; Jia, S.; Wang, S.; Tan, B. Customer Loyalty Improves the Effectiveness of Recommender Systems Based on Complex
Network. Information 2020, 11, 171. [CrossRef]
5. Fernández-García, A.J.; Iribarne, L.; Corral, A.; Criado, J.; Wang, J.Z. A recommender system for component-based applications
using machine learning techniques. Knowl. Based Syst. 2019, 164, 68–84. [CrossRef]
6. Rabiei-Dastjerdi, H.; McArdle, G.; Matthews, S.A.; Keenan, P. Gap analysis in decision support systems for real-estate in the era
of the digital earth. Int. J. Digit. Earth 2021, 14, 121–138. [CrossRef]
7. Mac Aoidh, E.; Bertolotto, M.; Wilson, D.C. Understanding geospatial interests by visualizing map interaction behavior. Inf.
Vis. 2008. [CrossRef]
8. Yu, Y.; Wang, C.; Zhang, L.; Gao, R.; Wang, H. Geographical Proximity Boosted Recommendation Algorithms for Real Estate. In
International Conference on Web Information Systems Engineering; Springer: Cham, Switzerland, 2018; pp. 61–66.
9. Breese, J.S.; Heckerman, D.; Kadie, C. Empirical Analysis of Predictive Algorithms for Collaborative Filtering. In Proceedings of
the Fourteenth Conference on Uncertainty in Artificial Intelligence, Madison, WI, USA, 24–26 July 1998; pp. 43–52.
10. Shani, G.; Heckerman, D.; Brafman, R.I. An MDP-based recommender system. J. Mach. Learn. Res. 2005, 6, 1265–1295.
11. Ayyaz, S.; Qamar, U.; Nawaz, R. HCF-CRS: A Hybrid Content based Fuzzy Conformal Recommender System for providing
recommendations with confidence. PLoS ONE 2018, 13, e0204849. [CrossRef] [PubMed]
12. Sarwar, B.; Karypis, G.; Konstan, J.; Riedl, J. Item-based Collaborative Filtering Recommendation Algorithms. In Proceedings of
the 10th International Conference on World Wide Web, Hong Kong, China, 1–5 May 2001; pp. 285–295.
13. Ballatore, A.; McArdle, G.; Kelly, C.; Bertolotto, M. Recomap: An interactive and adaptive map-based recommender. In
Proceedings of the 2010 ACM Symposium on Applied Computing, Sierre, Switzerland, 22–26 March 2010; pp. 887–891.
14. Wilson, D.C.; Lipford, H.R.; Carroll, E.; Karr, P.; Najjar, N. Charting New Ground: Modeling User Behavior in Interactive
GeoVisualization. In Proceedings of the 16th ACM SIGSPATIAL International Conference on Advances in Geographic Information
Systems, Irvine, CA, USA, 5–7 November 2008; pp. 1–4.
15. Tezuka, T.; Tanaka, K. Presentation of Dynamic Maps by Estimating User Intentions from Operation History. In International
Conference on Multimedia Modeling; Springer: Berlin/Heidelberg, Germany, 2007; pp. 156–165.
16. Rehman, F.; Masood, H.; Ul-Hasan, A.; Nawaz, R.; Shafait, F. An Intelligent Context Aware Recommender System for Real Estate.
In Mediterranean Conference on Pattern Recognition and Artificial Intelligence; Springer: Cham, Switzerland, 2019; pp. 177–191.
17. Kong, J.S.; Teague, K.; Kessler, J. The Love-Hate Square Counting Method for Recommender Systems. Proc. KDD Cup 2011, 18,
249–261.
18. Greenaway-McGrevy, R.; Sorensen, K. A spatial model averaging approach to measuring house prices. J. Spat. Econom. 2021, 2,
1–32. [CrossRef]
19. Rawool, A.G.; Rogye, D.V.; Rane, S.G. House Price Prediction Using Machine Learning. Int. J. Res. Appl. Sci. Eng. Technol. 2021, 9,
686–692. [CrossRef]
20. Chaturvedi, S.; Ahlawat, L.; Patel, T.; Talha, M. Real Estate Price Prediction. EasyChair 2021, 4926. Available online:
https://easychair.org/publications/preprint/HbD8 (accessed on 2 January 2022).
21. Abdullahi, A.; Usman, H.; Ibrahim, I. Determining house price for mass appraisal using multiple regression analysis modeling in
Kaduna North, Nigeria. ATBU J. Environ. Technol. 2018, 11, 26–40.
22. Renigier-Biłozor, M.; Źróbek, S.; Walacik, M.; Borst, R.; Grover, R.; d’Amato, M. International acceptance of automated modern
tools use must-have for sustainable real estate market development. Land Use Policy 2022, 113, 105876. [CrossRef]
23. Mubarak, M.; Khalid, K.; Waqar, F.; Tahir, A.; Haneef, I.; McArdle, G.; Bertolotto, M. Towards Real Estate Analytics using Map
Personalisation. In Proceedings of the 6th International Conference on Geographical Information Systems Theory, Applications
and Management, GISTAM 2020, Prague, Czech Republic, 7–9 May 2020; pp. 184–190.
24. Sisman, S.; Akar, A.U.; Yalpir, S. The novelty hybrid model development proposal for mass appraisal of real estates in sustainable
land management. Surv. Rev. 2021, 1–20. [CrossRef]

ISPRS Int. J. Geo-Inf. 2022, 11, 178. https://doi.org/10.3390/ijgi11030178 https://www.mdpi.com/journal/ijgi
