House Price Prediction Using AI

Daniel Christopher M Vishwas Gowda C S Chethan P


(20232MCA0250) (20232MCA0256) (20232MCA0268)

ABSTRACT: Accurate house price prediction is increasingly important in today’s volatile real estate
market, where prices fluctuate due to economic shifts and supply-demand dynamics. This project aims to
improve prediction accuracy by combining user-specific data—such as name, contact information, and the area
in which the user is searching for a home—with market factors like neighborhood characteristics and access to
public amenities. Machine learning algorithms such as Linear Regression, Decision Tree Regression, and
Random Forest Regression are employed to analyze these variables and provide reliable price predictions. By
using these techniques, the model can identify patterns and trends in the housing market that may not be
immediately apparent, helping buyers assess affordability and sellers set competitive prices. Additionally,
artificial intelligence (AI) algorithms are incorporated to fine-tune predictions based on evolving market
conditions, ensuring greater precision. The integration of AI and ML models reduces the need for traditional
brokers, offering a data-driven solution that is more efficient and personalized. This approach benefits not only
individual buyers and sellers but also real estate investors, developers, and policymakers, enabling them to make
informed decisions and plan strategically. The result is a more accurate, automated system for predicting house
prices that streamlines the decision-making process in a competitive market.

Keywords: Artificial Intelligence (AI), Machine Learning (ML), House Price Prediction, Accuracy, Public
Amenities, Linear Regression.

1 Introduction
The real estate industry is a significant driver of economic development and societal progress, impacting
individuals' aspirations and shaping broader economic trends. As A. H. Maslow’s A Theory of Human
Motivation suggests, physiological needs such as shelter are fundamental to human well-being (Maslow, 1943,
p.374) [1]. Housing, therefore, occupies a central position in people’s lives and is essential for societal stability.
Given the crucial role of housing, understanding the factors that influence house prices is essential for
policymakers, real estate professionals, and prospective homeowners. Predicting house prices has become a
topic of growing interest as it provides valuable insights into the dynamics of the housing market, helping
various stakeholders—including buyers, sellers, investors, and urban planners—make informed decisions.

This study focuses on house price prediction using data from Bengaluru, India, a rapidly growing metropolis
known for its dynamic real estate market. Bengaluru, the capital of Karnataka, is one of India's largest cities and
is often referred to as the "Silicon Valley of India" due to its status as a technology hub. This vibrant city
features a diverse range of neighborhoods, each with distinct housing market trends. As urbanization accelerates
and the demand for housing increases in Bengaluru, predicting house prices has become even more crucial. The
diverse socio-economic profile of Bengaluru’s population and its rapidly evolving infrastructure further
complicate the task of accurately forecasting house prices.

Accurate house price predictions are essential for facilitating market efficiency. For buyers, predictions help
make informed purchasing decisions, while for sellers, they enable the determination of fair and competitive
pricing strategies. Real estate investors can use predictions to identify lucrative investment opportunities, while
urban planners and policymakers benefit from insights that guide zoning decisions and infrastructure
investments. However, accurately predicting house prices remains challenging due to the wide array of factors
that influence them. These include basic property attributes like size, location, age, and condition, as well as
more complex factors such as proximity to amenities, transportation networks, and social sentiments. While
many studies have focused on basic property features, fewer have explored the influence of surrounding
environmental and social factors on house prices.

This paper aims to advance the understanding of house price prediction by applying machine learning
techniques to a dataset from Bengaluru. We employ four supervised learning models, namely Linear Regression,
Random Forest (RF), Decision Tree (DT), and XGBoost, to predict house prices based on a variety of features,
including location, area, number of bedrooms and bathrooms, age of the property, and other relevant attributes.
Our research seeks to identify the most significant factors that influence house prices and provide a reliable
framework for predicting prices in Bengaluru’s diverse housing market.

With the proliferation of data sources such as transaction records, satellite imagery, and online real estate
listings, the potential to improve house price predictions has grown substantially. In this study, we adopt a
multi-source data fusion approach, combining traditional property data with additional information about
surrounding amenities, neighborhood characteristics, and even factors derived from sentiment analysis. This
comprehensive approach enables a deeper understanding of the various drivers behind price fluctuations in the
housing market.

Machine learning algorithms such as Random Forest and XGBoost have gained prominence in real estate price
prediction due to their ability to capture non-linear relationships and interactions between variables that
traditional methods may overlook. By analyzing large datasets, these algorithms can uncover patterns and
insights that enhance the accuracy of predictions. Our research explores the performance of these machine
learning models in predicting house prices in Bengaluru and aims to determine which approach is most effective
in this context.
The key contributions of this research include:

1. Identifying Key Influential Factors: We analyze the impact of various property features and surrounding
amenities, such as hospitals and colleges, on house prices, ranking these influences to provide actionable
insights for buyers, sellers, and investors.

2. Comparing Machine Learning Models: We evaluate and compare the performance of Linear Regression,
Random Forest, Decision Tree, and XGBoost in predicting house prices, determining which model offers the
highest accuracy in the Bengaluru real estate market.

3. Multi-source Data Fusion: We integrate traditional property features with additional data from surrounding
amenities and external factors to improve prediction accuracy and robustness.

By leveraging machine learning techniques and incorporating diverse data sources, this research offers a more
comprehensive framework for predicting house prices in Bengaluru. The findings can aid stakeholders in the
real estate market by offering more precise and reliable predictions, ultimately contributing to better decision-
making in this rapidly evolving market.

RELATED WORK

· Feature Exploration and Data Integration

Zhao et al. (2024) highlighted the significance of multi-source data fusion, where traditional property
features such as size and location were augmented with data from satellite imagery, transaction records,
and social sentiment analysis. Their approach demonstrated that the inclusion of external factors like
proximity to schools, hospitals, and commercial hubs significantly enhances the predictive power of
models. Similarly, Wang et al. (2021) implemented a deep learning model with joint self-attention
mechanisms, which enabled their system to weigh the relative importance of different features
dynamically. Their findings emphasized the importance of neighborhood characteristics and urban
infrastructure in determining house prices.

· Machine Learning Techniques

Several studies have explored the application of various machine learning algorithms to improve
prediction accuracy. Random Forest and XGBoost emerged as highly effective due to their ability to
model non-linear relationships and interactions between variables. Adetunji et al. (2020-2021) used
Random Forest and demonstrated its effectiveness in handling noisy datasets, a common issue in real
estate datasets, while maintaining robustness against overfitting.

Linear Regression remains a popular baseline model for house price prediction. Chaubey et al. (2022)
and Kumar et al. (2022) employed Linear Regression to predict house prices and highlighted its
simplicity and interpretability. However, they also noted its limitations when dealing with complex
datasets with non-linear relationships. These findings prompted researchers to explore more
sophisticated models like Decision Trees and boosting algorithms.

Yusof and Ismail (2012) demonstrated the application of multiple regression models to analyze housing
price variations. Their work emphasized the need for incorporating spatial and socio-economic factors,
which are often overlooked in traditional models.

· Hybrid and Advanced Models

Gradient Boosting and its advanced variant, XGBoost, have shown exceptional performance in
predicting real estate prices. Ragapriya et al. (2023) introduced a modified Extreme Gradient Boosting
model tailored for small, noisy datasets. By incorporating regularization techniques, their model
achieved high accuracy and reduced overfitting.

Deep learning models have also made their mark in this domain. Wang et al. (2021) utilized neural
networks with self-attention mechanisms, enabling their model to capture subtle relationships between
diverse features. These advanced models are particularly useful when large datasets with high-
dimensional features are available.

· Regional Focus and Case Studies

Focusing on region-specific data, Aghav et al. (2023) and Mysore et al. (2022) explored the housing
markets in Indian cities. Their research identified regional factors such as infrastructure development,
cultural preferences, and urbanization trends as significant determinants of house prices. Their findings
underscore the importance of contextualizing prediction models to local market conditions.

Bengaluru’s housing market has been a subject of interest due to its dynamic nature. Studies in this
region often emphasize the role of IT sector growth, urban sprawl, and evolving infrastructure in
shaping property prices. Our research builds on this regional focus by integrating localized datasets and
multi-source information to improve prediction reliability.

· AI and Sentiment Analysis

Incorporating sentiment analysis into house price prediction is a relatively recent development.
Banerjee et al. (2017) demonstrated that sentiment extracted from social media platforms and online
property reviews could serve as a valuable predictor for housing trends. Their findings revealed that
public sentiment often correlates with short-term market fluctuations, offering a supplementary layer of
insight beyond traditional numerical datasets.

· Evaluation Metrics and Model Comparison

Across the reviewed studies, model evaluation metrics such as Mean Squared Error (MSE), Root Mean
Squared Error (RMSE), and R-squared (R²) were commonly used to assess performance. Adetunji et
al. (2020-2021) and Mysore et al. (2022) emphasized the importance of comparing multiple algorithms
to identify the best-performing model for specific datasets. They found that ensemble methods like
Random Forest and XGBoost consistently outperformed simpler models like Linear Regression and
Decision Trees in terms of accuracy and robustness.
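
The three metrics named above can be computed directly from a list of actual prices and a list of predicted prices. The following is a minimal sketch rather than code from any of the reviewed studies; the function names and the sample prices are hypothetical and serve only to illustrate the formulas.

```python
import math

def mse(y_true, y_pred):
    """Mean Squared Error: the average of the squared prediction errors."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def rmse(y_true, y_pred):
    """Root Mean Squared Error: the square root of MSE, in the units of the target."""
    return math.sqrt(mse(y_true, y_pred))

def r_squared(y_true, y_pred):
    """R-squared: 1 minus (residual sum of squares / total sum of squares)."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

# Hypothetical actual and predicted prices (illustrative values only).
actual = [50.0, 80.0, 120.0, 65.0]
predicted = [55.0, 75.0, 110.0, 70.0]

print("MSE :", mse(actual, predicted))
print("RMSE:", rmse(actual, predicted))
print("R2  :", r_squared(actual, predicted))
```

Lower MSE and RMSE indicate smaller errors, while an R² closer to 1 indicates that the model explains more of the variation in prices.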

SYSTEM ARCHITECTURE: The system architecture for house price prediction using AI
consists of five main components: Data Collection, Data Preprocessing, Model Training, Model Testing, and
Prediction. First, data is collected from various sources, such as real estate platforms, containing features like
house size, location, number of rooms, and age of the property. This data is then preprocessed by handling
missing values, encoding categorical variables, and normalizing numerical data. After preprocessing, the data is
divided into training and testing sets. The training set is used to train machine learning models such as Linear
Regression, Random Forest, and XGBoost, which learn to predict house prices based on the features. The
trained models are evaluated using performance metrics like Mean Squared Error (MSE) and R-squared
(R²). Finally, the best-performing model is used to predict house prices for new inputs, providing users with
accurate price estimates based on the trained AI system.
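
The preprocessing and splitting steps described above can be sketched as follows. This is an illustrative sketch under stated assumptions, not the system's actual code: the field names (`location`, `area_sqft`, `bedrooms`, `price`), the sample listings, and the choice of median imputation and min-max scaling are all hypothetical.

```python
import random
from statistics import median

# Hypothetical raw listings; None marks a missing value.
listings = [
    {"location": "Whitefield", "area_sqft": 1200.0, "bedrooms": 2, "price": 75.0},
    {"location": "Indiranagar", "area_sqft": None, "bedrooms": 3, "price": 150.0},
    {"location": "Whitefield", "area_sqft": 1800.0, "bedrooms": 3, "price": 110.0},
    {"location": "Hebbal", "area_sqft": 1000.0, "bedrooms": 2, "price": 60.0},
    {"location": "Hebbal", "area_sqft": 1500.0, "bedrooms": 3, "price": 85.0},
]

def preprocess(rows):
    """Impute missing areas, min-max scale numeric columns, one-hot encode location."""
    # Handle missing values: replace a missing area with the median of known areas.
    area_median = median(r["area_sqft"] for r in rows if r["area_sqft"] is not None)
    filled = [
        dict(r, area_sqft=area_median if r["area_sqft"] is None else r["area_sqft"])
        for r in rows
    ]
    locations = sorted({r["location"] for r in filled})
    numeric = ("area_sqft", "bedrooms")
    # Column ranges used for min-max normalization. For brevity they are taken
    # from all rows; a stricter pipeline would compute them on the training set only.
    ranges = {c: (min(r[c] for r in filled), max(r[c] for r in filled)) for c in numeric}
    features = []
    for r in filled:
        row = []
        for c in numeric:
            lo, hi = ranges[c]
            row.append((r[c] - lo) / (hi - lo) if hi > lo else 0.0)
        # Encode the categorical variable as one 0/1 column per location.
        row.extend(1.0 if r["location"] == loc else 0.0 for loc in locations)
        features.append(row)
    targets = [r["price"] for r in filled]
    return features, targets

def train_test_split(X, y, test_fraction=0.2, seed=0):
    """Shuffle row indices reproducibly and hold out a fraction for testing."""
    indices = list(range(len(X)))
    random.Random(seed).shuffle(indices)
    n_test = max(1, int(len(X) * test_fraction))
    test_idx, train_idx = indices[:n_test], indices[n_test:]
    pick = lambda data, idx: [data[i] for i in idx]
    return pick(X, train_idx), pick(X, test_idx), pick(y, train_idx), pick(y, test_idx)

X, y = preprocess(listings)
X_train, X_test, y_train, y_test = train_test_split(X, y)
print(len(X_train), "training rows,", len(X_test), "test rows")
```
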
PROPOSED METHODS: At present, we have only a limited dataset of Bengaluru for real estate
appraisal. Because the data also contains outliers and is too small for an Artificial Neural Network, we propose
the use of regression analysis for the system. Linear regression is a statistical modeling tool that predicts one
variable from another. It is a supervised machine learning algorithm whose predicted output is continuous and
changes at a constant slope with respect to the input. It is used to predict values in a continuous range rather
than to classify values into categories. This makes it a useful tool for predictive modeling and forecasting,
both for describing present data and for predicting future data. The following four ML algorithms have been
implemented so that they can be compared on the performance metrics.

A. Linear Regression Analysis

Linear Regression is a supervised machine learning model that attempts to fit a linear relationship between the
dependent variable (Y) and the independent variables (X). For every observation evaluated with the model, the
actual value of the target (Y) is compared with its predicted value, and the differences between these values are
called residuals.

Fig 1. Linear Regression
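
To make the idea of a fitted line and its residuals concrete, the sketch below fits a single-feature least-squares line and lists the residual for each observation. It is a minimal illustration, assuming one feature (area) and made-up prices; `fit_line` is a hypothetical helper, not part of the system described here.

```python
def fit_line(x, y):
    """Ordinary least squares for one feature: returns (slope, intercept)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical data: area in square feet (X) and price (Y).
area = [1000.0, 1200.0, 1500.0, 1800.0]
price = [60.0, 75.0, 85.0, 110.0]

slope, intercept = fit_line(area, price)
fitted = [intercept + slope * a for a in area]

# Residual = actual value of Y minus the value predicted by the line.
residuals = [actual - pred for actual, pred in zip(price, fitted)]

for a, r in zip(area, residuals):
    print(f"area={a:.0f}  residual={r:+.2f}")
```

With an intercept in the model, the least-squares residuals sum to zero (up to rounding), which is a quick sanity check on any fitted line.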

B. Random Forest

Random Forest is an ensemble learning technique that builds multiple decision
trees during training and aggregates their outputs to provide a final prediction. This method excels in handling
non-linear relationships between the features and the target variable, as well as managing the noise in the data,
making it ideal for real estate datasets where various factors influence the house prices. Random Forest is also
known for its robustness against overfitting and its ability to capture complex interactions between variables. In
this study, Random Forest will be employed to predict house prices by utilizing multiple decision trees that are
trained on randomly selected subsets of the data, improving model accuracy and generalization. The
performance of the model will be evaluated using metrics like Mean Squared Error (MSE) and R-squared (R²)
to compare its predictive power with other algorithms.
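
The two ingredients described above, training each tree on a randomly drawn subset of the data and averaging the trees' outputs, can be sketched in plain Python. This is a simplified illustration and not the implementation used in this study: the tree depth, the number of trees, the rule of sampling half of the features per split, and the sample data are all assumptions.

```python
import random

def sse(values):
    """Sum of squared deviations from the mean (0.0 for an empty list)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)

def build_tree(X, y, rng, depth=0, max_depth=3):
    """Grow a regression tree; each split looks at a random subset of features."""
    if depth >= max_depth or len(set(y)) == 1:
        return sum(y) / len(y)  # leaf node: mean price of the rows that reach it
    n_features = len(X[0])
    candidates = rng.sample(range(n_features), max(1, n_features // 2))
    best = None
    for f in candidates:
        # Dropping the largest value guarantees a non-empty right branch.
        for threshold in sorted({row[f] for row in X})[:-1]:
            left = [i for i, row in enumerate(X) if row[f] <= threshold]
            right = [i for i, row in enumerate(X) if row[f] > threshold]
            error = sse([y[i] for i in left]) + sse([y[i] for i in right])
            if best is None or error < best[0]:
                best = (error, f, threshold, left, right)
    if best is None:  # the sampled features had no usable split
        return sum(y) / len(y)
    _, f, threshold, left, right = best
    return (
        f,
        threshold,
        build_tree([X[i] for i in left], [y[i] for i in left], rng, depth + 1, max_depth),
        build_tree([X[i] for i in right], [y[i] for i in right], rng, depth + 1, max_depth),
    )

def predict_tree(node, row):
    """Follow decision nodes (tuples) until a leaf value (float) is reached."""
    while isinstance(node, tuple):
        f, threshold, left, right = node
        node = left if row[f] <= threshold else right
    return node

def fit_forest(X, y, n_trees=25, seed=0):
    """Train each tree on a bootstrap sample (rows drawn with replacement)."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        sample = [rng.randrange(len(X)) for _ in range(len(X))]
        forest.append(build_tree([X[i] for i in sample], [y[i] for i in sample], rng))
    return forest

def predict_forest(forest, row):
    """Aggregate the trees by averaging their individual predictions."""
    return sum(predict_tree(tree, row) for tree in forest) / len(forest)

# Hypothetical rows: [area in sq ft, bedrooms] with made-up prices.
X = [[1000.0, 2], [1200.0, 2], [1500.0, 3], [1800.0, 3], [2200.0, 4], [900.0, 1]]
y = [60.0, 75.0, 85.0, 110.0, 140.0, 50.0]

forest = fit_forest(X, y)
estimate = predict_forest(forest, [1400.0, 3])
print("Estimated price:", round(estimate, 2))
```

Because every leaf stores an average of training prices and the forest averages its trees, the estimate always lies between the lowest and highest price seen in training.
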
C. Decision Tree
A Decision Tree can be used for both classification and regression problems, although it is most often applied
to classification; in this study it is used for regression. It is a supervised learning technique that breaks
down a dataset into smaller and smaller subsets while an associated decision tree is incrementally developed.
The final result is a tree with decision nodes and leaf nodes.
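
A single decision node with two leaf nodes illustrates how one such subdivision is chosen for a regression target. The sketch below assumes the split criterion is the total squared error of the two subsets, which is a common choice for regression trees; the data and the helper name `best_split` are hypothetical.

```python
def best_split(values, targets):
    """Find the threshold on one feature that minimizes total squared error.

    Returns (threshold, left_mean, right_mean): one decision node and the
    predictions stored in its two leaf nodes.
    """
    best = None
    # Dropping the largest value guarantees a non-empty right subset.
    for threshold in sorted(set(values))[:-1]:
        left = [t for v, t in zip(values, targets) if v <= threshold]
        right = [t for v, t in zip(values, targets) if v > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        error = sum((t - left_mean) ** 2 for t in left)
        error += sum((t - right_mean) ** 2 for t in right)
        if best is None or error < best[0]:
            best = (error, threshold, left_mean, right_mean)
    return best[1:]

# Hypothetical data: area in square feet and price.
area = [1000.0, 1200.0, 1500.0, 1800.0]
price = [60.0, 75.0, 85.0, 110.0]

threshold, left_leaf, right_leaf = best_split(area, price)
print(f"if area <= {threshold:.0f}: predict {left_leaf:.2f}, else predict {right_leaf:.2f}")
```

A full tree repeats this search inside each subset until a stopping rule, such as a maximum depth, is reached.
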
D. XGBoost (Extreme Gradient Boosting)
In addition to Random Forest, we use XGBoost (Extreme Gradient Boosting) to enhance house price
prediction accuracy in the context of the Bengaluru real estate market. XGBoost is a powerful boosting
algorithm that builds trees sequentially, where each new tree attempts to correct the errors made by the previous
one. This process leads to an increase in predictive performance and robustness, especially when dealing with
small or noisy datasets. XGBoost is well-suited for real estate price prediction due to its ability to handle
complex, non-linear patterns and interactions among features. The model is also highly efficient and capable of
reducing overfitting through regularization, making it a reliable choice for predicting house prices. The method
will be trained using the available dataset, and performance will be assessed through metrics such as Mean
Squared Error (MSE) and R-squared (R²), which will allow a detailed comparison of its effectiveness relative to
other models like Random Forest and Linear Regression.
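
The sequential error-correcting idea can be shown with plain gradient boosting on squared error, where each new one-split tree is fitted to the residuals left by the trees before it. This sketch is deliberately simpler than XGBoost itself: it omits XGBoost's regularized objective and other refinements, and the learning rate, number of rounds, and data are assumptions chosen for illustration.

```python
def fit_stump(x, residuals):
    """Fit a one-split tree to the residuals; returns (threshold, left_mean, right_mean)."""
    best = None
    for threshold in sorted(set(x))[:-1]:
        left = [r for xi, r in zip(x, residuals) if xi <= threshold]
        right = [r for xi, r in zip(x, residuals) if xi > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        error = sum((r - left_mean) ** 2 for r in left)
        error += sum((r - right_mean) ** 2 for r in right)
        if best is None or error < best[0]:
            best = (error, threshold, left_mean, right_mean)
    return best[1:]

def fit_boosting(x, y, n_rounds=50, learning_rate=0.1):
    """Start from the mean price, then add small corrections tree by tree."""
    base = sum(y) / len(y)
    predictions = [base] * len(y)
    stumps = []
    for _ in range(n_rounds):
        # For squared error, the quantity each new tree must fit is the residual.
        residuals = [yi - pi for yi, pi in zip(y, predictions)]
        threshold, left_mean, right_mean = fit_stump(x, residuals)
        stumps.append((threshold, left_mean, right_mean))
        predictions = [
            p + learning_rate * (left_mean if xi <= threshold else right_mean)
            for xi, p in zip(x, predictions)
        ]
    return base, learning_rate, stumps

def predict_boosting(model, xi):
    """Sum the base value and the shrunken correction from every tree."""
    base, learning_rate, stumps = model
    total = base
    for threshold, left_mean, right_mean in stumps:
        total += learning_rate * (left_mean if xi <= threshold else right_mean)
    return total

# Hypothetical data: area in square feet and price.
area = [1000.0, 1200.0, 1500.0, 1800.0]
price = [60.0, 75.0, 85.0, 110.0]

model = fit_boosting(area, price)
for a, p in zip(area, price):
    print(f"area={a:.0f}  actual={p:.1f}  predicted={predict_boosting(model, a):.1f}")
```

The learning rate shrinks each tree's contribution, which slows fitting and is one simple guard against overfitting; XGBoost adds explicit regularization terms on top of this scheme.
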
CONCLUSION: In conclusion, we have successfully developed a machine learning web solution to
predict house prices based on various features. The solution involves collecting and cleaning data, then building
and training a linear regression model. Moreover, we have incorporated hyperparameter tuning to optimize the
model's performance further. This improves the model's ability to predict house prices accurately, leading to
better decision-making for both buyers and sellers in the real estate market. By implementing the model in a
web-based solution, users can input data on a house, and the solution will provide an estimated price based on
the model's predictions. This makes it easier for buyers and sellers to obtain a rough estimate of a property's
value without the need for extensive research. Overall, this machine learning web solution for house price
prediction provides a valuable tool for the real estate industry and can aid in making more informed decisions
regarding property values.

REFERENCES:
[1] Anand G. Rawool, Dattatray V. Rogye, Sainath G. Rane, and Vinayk A. Bharadi, House Price Prediction
Using Machine Learning, IRE Journals, Volume 4, Issue 11, ISSN: 2456-8880, May 2021.

[2] Pei-Ying Wang, Chiao-Ting Chen, Jain-Wun Su, Ting-Yun Wang, and Szu-Hao Huang (Member, IEEE),
Deep Learning Model for House Price Prediction Using Heterogeneous Data Analysis Along With Joint
Self-Attention Mechanism, April 15, 2021.

[3] Vaishnavi Aghav, Dhanashree Avhad, Supriya Nanaware, Rakesh Gudekar, and Mahendra Pawar,
Department of Computer Engineering, House Price Prediction Using Machine Learning, Volume 05,
Issue 03, March 2023.

[4] Yaping Zhao, Jichang Zhao, and Edmund Y. Lam, House Price Prediction: A Multi-Source Data Fusion
Perspective, Big Data Mining and Analytics, Volume 7, Number 3, ISSN 2096-0654, 04/25, pp. 603-620,
September 2024.

[5] Chenxi Li, School of International Education, Guangdong University of Technology, No. 11, Guangzhou,
China, House Price Prediction Using Machine Learning.

[6] Aminah Md Yusof and Syuhaida Ismail, Multiple Regressions in Analysing House Price Variations, IBIMA
Publishing, Communications of the IBIMA, http://www.ibimapublishing.com/journals/CIBIMA/cibima.html,
Vol. 2012 (2012), Article ID 383101, DOI: 10.5171/2012.383101.

[7] J. Kalidass, T. Dharshalini, R. Nivetha, and AP. Subasri, House Price Prediction Using Machine Learning,
International Research Journal of Engineering and Technology (IRJET), Volume 11, Issue 04, April 2024.

[8] Debanjan Banerjee, Department of Management Information Systems, Sarva Siksha Mission, Kolkata, India,
and Suchibrota Dutta, Department of Information Technology and Mathematics, Royal Thimphu College,
Thimphu, Bhutan, IEEE International Conference on Power, Control, Signals and Instrumentation
Engineering (ICPCSI-2017).

[9] Aman Yadav, Predicting The Housing Price using Artificial Intelligence/Machine Learning, International
Journal of Advanced Research in Science, Communication and Technology (IJARSCT), Volume 2, Issue 6,
June 2022.

[10] N. Ragapriya, T. Ananth Kumar, R. Parthiban, P. Divya, S. Jayalakshmi, and D. Raghu Raman,
Machine Learning Based House Price Prediction Using Modified Extreme Boosting, Asian Journal of Applied
Science and Technology (AJAST), Volume 7, Issue 1, Pages 41-54, January-March 2023.

[11] Sumanth Mysore, Abhinay Muthineni, Vaishnavi Nandikandi, and Sudersan Behera, Prediction of House
Prices Using Machine Learning, Volume 10, Issue VI, June 2022, www.ijraset.com.

[12] Ayushi Bhagat, Mayuri Gosavi, Aditi Shahasane, Nandini Mishra, and Amit Nerurkar, House Price Prediction
Using Machine Learning, Department of Computer Engineering, Vidyalankar Institute of Technology, Wadala,
Mumbai 400037.

[13] Amrit Kumar Chaubey, Aadit Shrestha, and Anindita Gogoi, Using Linear Regression Machine
Learning Algorithm for the Prediction of Real Estate, DOI: 10.14293/S2199-1006.1.SOR-.PP6RJWG.v1, 30
May 2022.

[14] Abigail Bola Adetunji, Oluwatobi Noah Akande, Funmilola Alaba Ajala, Ololade Oyewo, Yetunde
Faith Akande, and Gbenle Oluwadara, House Price Prediction using Random Forest Machine Learning
Technique, (ITQM 2020 & 2021).
