
A PROJECT REPORT

ON

“HOUSE PRICE PREDICTION USING PYTHON”

SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS


FOR THE B.TECH IN

COMPUTER SCIENCE & ENGINEERING

SUBMITTED BY

NAME OF THE STUDENT REGISTRATION NO.

1 SUCHISMITA SAHOO 2221304049


2 SURAJ SAHOO 2221304051
3 TUSAR KANTA DHAL 2221304052
4 DIBYA RASHMI BHANJA 2231304002

GITAM, BHUBANESWAR

ACADEMIC YEAR

2024-25

GITAM

CERTIFICATE
This is to certify that the work in this Project Report entitled
"HOUSE PRICE PREDICTION USING PYTHON" by Suchismita
Sahoo (2221304049), Suraj Sahoo (2221304051), Tusar Kanta
Dhal (2221304052), and Dibya Rashmi Bhanja (2231304002) has
been carried out under my supervision in partial fulfillment of the
requirements for the B.Tech in Computer Science & Engineering
during the session 2024-25 in the Department of Computer Science &
Engineering, GITAM, and that this is the original work of the
above students.

Er. Prasanjit Das


(Project Guide)

ACKNOWLEDGMENT
This project was carried out as a semester project, as part of the course titled

"MINOR PROJECT".

We are sincerely thankful to our HOD, Er. Sunanda Kumar
Sahoo, Department of Computer Science & Engineering,
GITAM, BHUBANESWAR, for his invaluable guidance and
assistance, without which the accomplishment of this task would
never have been possible.

We also thank Er. Prasanjit Das for giving us the opportunity to
explore a real-world problem and to appreciate the interrelations
without which a project can never progress. For our present project
we have chosen the topic "HOUSE PRICE PREDICTION
USING PYTHON".

We are also thankful to our parents, friends, and all the staff of
the Department of Computer Science & Engineering for
providing us with relevant information, necessary clarifications,
and great support.

NAME OF THE STUDENT REGISTRATION NO.

1 SUCHISMITA SAHOO 2221304049


2 SURAJ SAHOO 2221304051
3 TUSAR KANTA DHAL 2221304052
4 DIBYA RASHMI BHANJA 2231304002

ABSTRACT
We propose to implement a house price prediction model for Bangalore,
India. Buyers focus on finding a suitable home or flat within their budget
and expect their investment in a house to appreciate over time. Sellers, on
the other hand, aim to sell their homes at the best possible price. Since
house prices are subject to fluctuation, customers often find it difficult to
purchase a house at the right time, before prices change in the near future.
To address this major issue in the real estate market, we are designing a
machine learning model for predicting house prices. Machine learning
techniques play a vital role in this project by providing more precise house
price estimations based on user preferences such as location, number of
rooms, and air quality, among others. Housing prices fluctuate on a daily
basis and are sometimes exaggerated rather than based on actual worth.
The major focus of this project is on predicting home prices using genuine
factors. We intend to base the evaluation on every basic criterion that is
taken into account when establishing a price. A further goal of this project
is to learn Python and gain experience in Data Analytics and Machine Learning.

LIST OF TABLES

Table 1 Application

LIST OF FIGURES

Figure 1 Introduction to Machine Learning


Figure 2 Working of Machine Learning
Figure 3 Application of Machine Learning
Figure 4 Supervised Learning
Figure 5 Types of supervised Machine learning
Figure 6 Linear Regression in Machine Learning
Figure 7 Linear Regression Line
Figure 8 Negative Linear Relationship
Figure 9 Random Forest Algorithm
Figure 10 Decision Tree Classification Algorithm
Figure 11 Decision Tree algorithm Working
Figure 12 Logistic Regression
Figure 13 Working of Unsupervised Learning
Figure 14 Clustering in Machine Learning
Figure 15 Visualization insights
Figure 16 Location Diversity
Figure 17 Evaluation Metric

TABLE OF CONTENTS

Page Number
Certificate ii
Acknowledgement iii
Abstract iv
List of Tables v
List of Figures v
PHASE-I
1. Introduction
1.1 Introduction 1
1.2 Motivation 2
2. Literature Survey
2.1 Literature Survey 3
3. Proposed Work
3.1 Objective of proposed work 9
3.2 Methodology 9
3.2.1 Introduction to machine learning 9
3.2.2 How does Machine Learning work? 11
3.2.3 Need for Machine Learning 12
3.2.4 Applications of Machine learning: - 12
3.2.5 Machine Learning Classifications 15
3.2.5.1 Supervised Learning 15
3.2.5.2 Unsupervised Machine Learning 28
3.2.5.3 Reinforcement Learning: 30
PHASE-II
4. Implementation
4.1 Code 31
5. Result Analysis
5.1 Visualization Insights 33
5.2 Advantages 36
5.3 Disadvantages 37
5.4 Maintenance 39
5.5 Application 41
6. Conclusion and Future Development 42
Reference 43

CHAPTER 1
INTRODUCTION

1.1 INTRODUCTION:

The real estate industry is a significant contributor to the global economy.


Accurately predicting house prices is essential for buyers, sellers, and
investors in this industry. Traditionally, real estate agents and property
appraisers rely on their experience and knowledge to determine the value of
a house. However, with the rapid growth of data and machine learning
algorithms, predicting house prices has become more precise and efficient.
Machine learning algorithms can analyze vast amounts of data and identify
patterns to predict future prices accurately. In this project, we aim to develop
a model that can predict house prices based on various factors such as
location, size, number of bedrooms and bathrooms, age of the property, and
other features. This project's objective is to explore different machine
learning algorithms and identify the most effective approach for predicting
house prices accurately. The results of this study can help real estate agents,
property appraisers, and investors make informed decisions about buying,
selling, and investing in properties. The proposed model makes it possible
for those who need them to obtain precise estimates of house prices. The
linear regression model takes into account internal factors on which house
valuation depends, such as area, number of bedrooms, and locality, as well
as external factors such as air pollution and crime rates, and outputs the
price of the house with greater accuracy. In this project, we use the linear
regression algorithm (a supervised learning approach) to build a predictive
model that estimates house prices for real estate customers. We build the
machine learning model using the Python programming language and
Python libraries such as NumPy, pandas, and matplotlib.
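As an illustrative sketch of the modelling approach just described (the project's actual implementation appears in Chapter 4), a linear regression model can be fitted with scikit-learn. The feature names and numbers below are invented for illustration, not taken from the Bangalore dataset:

```python
# Minimal linear-regression sketch for house price prediction.
# The features and prices below are made-up illustrative values,
# not rows from the actual Bangalore dataset used in this project.
import numpy as np
from sklearn.linear_model import LinearRegression

# Each row: [area in sqft, number of bedrooms]
X = np.array([[1000, 2], [1500, 3], [2000, 3], [2500, 4], [3000, 4]])
y = np.array([50.0, 75.0, 98.0, 130.0, 155.0])  # price in lakhs (illustrative)

model = LinearRegression().fit(X, y)

# Predict the price of an unseen 1800 sqft, 3-bedroom house
predicted = model.predict(np.array([[1800, 3]]))[0]
print(round(predicted, 1))
```

In the real project, the feature matrix would come from the cleaned Bangalore dataset rather than hand-typed rows.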

1.2 Motivation
We are highly interested in anything related to Machine Learning, and this
independent project provided us with the opportunity to study the subject
and reaffirm our passion for it. The capacity to generate guesses and
forecasts, and to give machines the ability to learn on their own, is both
powerful and practically infinite in its application possibilities. Machine
Learning may be applied in finance, medicine, and virtually any other field,
which is why we opted to base our project on it.

CHAPTER 2
LITERATURE SURVEY

2.1 Literature Survey

We are conducting an analysis of various Machine Learning algorithms in


this project to enhance the training of our Machine Learning model. The study
focuses on housing cost trends, which serve as indicators of the current
economic situation and have direct implications for buyers and sellers. The
actual cost of a house depends on numerous factors, including the number of
bedrooms, bathrooms, and location. In rural areas, the cost tends to be lower
compared to cities. Additionally, factors such as proximity to highways,
malls, supermarkets, job opportunities, and good educational facilities greatly
influence house prices.
To address this issue, our report presents a survey on predicting house
prices by analyzing the given features. We employed different Machine
Learning models, including Linear Regression, Decision Tree, and Random
Forest, to construct predictive models and compare their accuracy. Our
approach followed a step-by-step process, from Data Collection and Data
Pre-Processing through Data Analysis to Model Building.
Real estate property is not only a person's primary desire; it also reflects a
person's wealth and prestige in today's society. Real estate investment
typically appears lucrative, since property values do not usually drop
sharply. Changes in the value of real estate affect many home investors,
bankers, policymakers, and others, and real estate investing appears to be a
tempting option for investors. As a result, anticipating real estate prices is
an essential economic exercise. According to the 2011 census, India ranks
second in the world in terms of the number of households, with a total of
24.67 crores. However, previous recessions have demonstrated that real
estate costs cannot be easily foreseen. Real estate property prices are linked
to the state of the economy, and yet we lack accurate, standardized
approaches to measuring real estate property values.

First, we looked at different articles and discussions about machine learning
for housing price prediction. One article, titled "House Price Prediction," is
based on machine learning and neural networks and aims for minimal error
and the highest possible accuracy. Another paper builds hedonic models on
price data from Belfast and infers that submarkets matter for residential
valuation; the model is used to identify submarkets over a larger spatial
scale, with implications for the selection of comparable evidence in the
evaluation process and for the quality of variables that valuations may
require. A further study examines current developments in house prices and
homeownership; its authors describe a feedback mechanism, or social
epidemic, that fosters a perception of property as an essential market
investment.
In this section, first of all, the basic economic structure affecting housing
prices is emphasized. Houses meet the shelter needs of people and are also an
investment tool. The housing market differs from other markets in that
housing is both a consumption and an investment good; moreover, the
housing supply is very costly to produce, and housing is durable,
heterogeneous, fixed in location, gives rise to secondary markets, and is
used as collateral (Iacoviello, 2000). The housing
market is formed through a mechanism of housing supply and demand. In the
housing market, unlike the goods and services market, the housing supply is
inelastic. Supply and demand for housing change and develop over time
depending on the economic, social, cultural, geographical, and demographic
realities of the countries. Meeting the housing demand is associated with
housing policies and economic conditions. Housing demand arises for
different purposes such as consumption, investment, and wealth
accumulation. The supply and demand factors change according to the type
of housing demand. In addition to the input costs of the house as a product,
the determination of the price of a house is affected by many variables, such
as people's income level, marital status, the industrialization of the society,
the agricultural employment rate, interest rates, and population growth and
migration. Since changes in housing prices affect
and all variables also affect the price. Since changes in housing prices affect
both socio-economic conditions and national economic conditions, it is an
important issue that concerns governments and individuals (Kim and Park,
2005). In this part of the literature review, some studies that estimate
housing prices are cited. Predicting house prices from real factors is
important for such studies. Developments in artificial intelligence methods
now allow many everyday problems, such as purchasing a house, to be
addressed. The competitive nature of the housing sector aids the data
mining process in this industry: processing its data and predicting its future
trends. Regression is a machine learning tool that builds predictions from
available measurable information by exploiting the links between the target
parameter and many different independent parameters. The cost of a house
depends on several parameters, and machine learning is one of the most
important tools for predicting those costs with high accuracy. It is one of
the more recent methods used for prediction, and it is used to interpret and
analyze highly complex data structures and patterns (Ngiam and Khor, 2019).
Machine learning, in essence, has computers learn and behave like humans
(Feggella, 2019). It involves providing a valid dataset on which predictions
are based: the machine learns from pre-loaded data how important a
particular event might be for the whole system and predicts outcomes
accordingly. Modern applications of this technique include predicting stock
prices, the probability of an earthquake, and company sales, and the list has
infinite possibilities (Shiller, 2007). Unlike traditional econometric models,
machine learning algorithms do not require the training data to be normally
distributed. Many statistical tests rely on the assumption of normality; if
the data are not normally distributed, these tests fail and become invalid.
These processes used to take a long time, but today they can be completed
quickly with the high-speed computing power of modern computers,
making the technique less costly and less time-consuming. Rafiei and Adeli
(2016) used SVR to determine whether a property developer should build a
new development or stop the construction at the beginning of a project based
on the prediction of future house prices. The study, in which data from 350
apartment houses built in Tehran (Iran) between 1993 and 2008 were used,
had 26 features such as zip code, gross floor area, land area, estimated cost of
construction, construction time, and property prices. Its results revealed that
SVR was a suitable method for making home price predictions since the loss
of prediction (error) was as low as 3.6% of the test data. Therefore, the
prediction results provide valuable input to the property developer’s decision-
making process. Cechin et al. (2000) analyzed the data of buildings for sale
and rental in Porto Alegre, Brazil, using linear regression and artificial neural
network methods. They used parameters such as the size of the house, district,
geographical location, environmental arrangement, number of rooms,
building construction date and total area of use. According to the study, they
reported that the artificial neural network method was more useful compared
to linear regression. Yu and Wu (2016) used classification and regression
algorithms; according to their analysis, the living area in square meters, roof
content, and neighborhood have the greatest statistical significance in predicting the
selling price of a house, and the prediction analysis can be improved by the
Principal Component Analysis (PCA) technique, because the value of a
particular property is closely associated with the infrastructure facilities
surrounding it. Koktashev et al. (2019) attempted to predict
house values in the city of Krasnoyarsk by using 1,970 housing transaction
records. The number of rooms, total area, floor, parking lot, type of repair,
number of balconies, type of bathroom, number of elevators, garbage
disposal, year of construction and accident rate of the house were discussed
as the features in that study. They applied random forest, ridge regression,
and linear regression to predict the property prices. Their study concluded
that the random forest outperformed the other two algorithms, as evaluated
by the Mean Absolute Error (MAE). Park and Bae (2015) developed a house
price prediction model with machine learning algorithms in real estate
research and compared their performance in terms of classification accuracy.
Their study aimed at helping real estate sellers or real estate agents to make
rational decisions in real estate transactions. The tests showed that the
accuracy-based Repeated Incremental Pruning to Produce Error Reduction
(RIPPER) consistently outperformed other models in house price prediction
performance. Bhagat et al. (2016) studied linear regression algorithms for
house price prediction. The aim of the study was to predict the effective price of
the real estate for clients based on their budget and priorities. They indicated
that the linear regression technique of the analysis of past market trends and
price ranges could be used to determine future house prices. In their study,
Mora-Esperanza and Gallego (2004) analyzed house prices in Madrid using
12 parameters. The parameters they used were the distance to the city center,
road, size of the district, construction class, age of the building, renovation
status, housing area, terrace area, location within the district, housing design,
the floor and the presence of outbuildings. The dataset was created assuming
that the sales values of 100 houses for sale in the region were the real values.
The researchers, who used ANN and linear regression analysis techniques,
reported that the ANN technique was more successful and achieved an
average agreement of 95% and an accuracy of 86%. Wang and Wu (2018)
used 27,649 home appraisal price records from Arlington County, Virginia,
USA, from 2015, and suggested that Random Forest outperformed linear
regression in terms of accuracy. In their study of Mumbai, India,
Varma et al. (2018) attempted to predict the price of a house by using
various regression techniques (Linear Regression, Forest regression, boosted
regression) and artificial neural network technique based on the features of
the house (usage area, number of rooms, number of bathrooms, parking lot,
elevator, furniture). In conclusion, they determined that the efficiency of the
algorithm with the use of artificial neural networks was higher compared to
other regression techniques. They also revealed that the system prevented the
risk of investing in the wrong house by providing the right output. Thamarai
and Malarvizhi (2020) attempted to predict the prices of houses from real-
time data after the large fluctuation in house price increases in 2018 at the
Tadepalligudem location of West Godavari District in Andhra Pradesh, India
using the features of the number of bedrooms, age of the house, transportation
facilities, nearby schools, and shopping opportunities. They applied
decision tree regression and multiple linear regression, both machine
learning techniques, and suggested that the
performance of multiple linear regression was better than decision tree
regression in predicting the house prices.
Zhao et al. [1] applied deep learning in combination with extreme
Gradient Boosting (XGBoost) to real estate price prediction by analyzing
historical property sale records. Their dataset was extracted from an online
real estate website and was split into 80% for training and 20% for testing.
According to Satish et al. [2], regression deals with specifying the
relationship between a dependent variable (also called the response or
outcome) and an independent variable (or predictor); their study aimed to
predict future house prices with the help of machine learning algorithms.
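Several of the surveyed studies (e.g. Koktashev et al., 2019; Wang and Wu, 2018) compare linear regression against random forest using error metrics such as the Mean Absolute Error (MAE). A minimal sketch of such a comparison on synthetic data (not any dataset from the surveyed papers):

```python
# Sketch of the linear-regression vs. random-forest comparison by MAE
# reported in several surveyed studies, run on synthetic data only.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_absolute_error

rng = np.random.default_rng(42)
X = rng.uniform(0, 10, size=(300, 3))
# Nonlinear target, where a random forest typically has the edge
y = X[:, 0] ** 2 + 3 * X[:, 1] + rng.normal(0, 1, size=300)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

mae_lin = mean_absolute_error(
    y_te, LinearRegression().fit(X_tr, y_tr).predict(X_te)
)
mae_rf = mean_absolute_error(
    y_te, RandomForestRegressor(random_state=0).fit(X_tr, y_tr).predict(X_te)
)
print(round(mae_lin, 2), round(mae_rf, 2))
```

On this deliberately nonlinear target the random forest yields the lower MAE, mirroring the pattern those studies report; on a genuinely linear target the ordering can reverse.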

CHAPTER 3.
PROPOSED WORK
3.1 Objective of proposed work

As a first project, we intended to make it as instructional as possible by

tackling each stage of the machine learning process and attempting to
understand it well. We picked Bangalore real estate price prediction, a
so-called "toy problem": one that is not of immediate scientific relevance
but is useful for demonstration and practice. The objective was to forecast
the price of a specific apartment based on market pricing while accounting
for various features.

3.2 Methodology

3.2.1 Introduction to machine learning


Machine learning uses various algorithms to build mathematical models
and make predictions using historical data or information. Currently, it is
being used for tasks such as image recognition, speech recognition, email
filtering, Facebook auto-tagging, recommender systems, and many more.
This chapter gives an introduction to machine learning along with its main
families of techniques: supervised, unsupervised, and reinforcement
learning, covering regression and classification models, clustering methods,
hidden Markov models, and various sequential models. We have seen
machine learning in action in a variety of businesses, like Facebook, where
it helps identify us and our friends, and YouTube, where it helps us discover
new content and recommends videos based on our preferences. Machine
learning is commonly divided into two types: unsupervised learning and
supervised learning. A data analyst often employs supervised learning to
address problems like classification and regression, meaning that the data
has a target that we want to predict in the future, such as assessing a
student's performance or the amount of monthly costs.

In the real world, we are surrounded by humans who can learn everything
from their experiences thanks to their learning capability, and we have
computers or machines which work on our instructions. But can a machine
also learn from experiences or past data the way a human does? This is
where machine learning comes in. It is a science that will improve further
in the future, driven by the difficulty of analyzing and processing rapidly
increasing amounts of data. Machine learning is based on the principle of
finding, among the previous data, the best model for new data. Research in
machine learning will therefore continue in parallel with the growth of
data, the methods used in machine learning, its application fields, and the
studies in this field. The aim of this study is to convey knowledge of
machine learning, which has become very popular nowadays, and of its
applications.

There is no error margin in the operations carried out by computers on the
basis of an algorithm, where the operation follows fixed steps. Different
from commands that are written to produce an output from an input, there
are situations in which computers make decisions based on sample data. In
those situations, computers may make mistakes in the decision-making
process, just as people do. That is, machine learning is the process of
equipping computers with the ability to learn by using data and experience
like a human brain. The main aim of machine learning is to create models
which can train themselves to improve, perceive complex patterns, and find
solutions to new problems by using previous data.

Fig. 1 Introduction to Machine Learning

3.2.2 How does Machine Learning work?
A machine learning system learns from historical data, builds prediction
models, and, whenever it receives new data, predicts the output for it. The
accuracy of the predicted output depends on the amount of data: a larger
amount of data helps build a better model, which predicts the output more
accurately. Suppose we have a complex problem that requires predictions.
Instead of writing code for it directly, we simply feed the data to generic
algorithms, and with the help of these algorithms the machine builds the
logic from the data and predicts the output. Machine learning has changed
our way of thinking about such problems. The block diagram below
explains the working of a machine learning algorithm:

Fig. 2 Working of Machine Learning

Features of Machine Learning:

 Machine learning uses data to detect various patterns in a
given dataset.
 It can learn from past data and improve automatically.
 It is a data-driven technology.
 Machine learning is similar to data mining, as it also deals
with huge amounts of data.
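The learn-from-history, predict-on-new-data loop described above can be sketched as follows, with synthetic data standing in for historical records:

```python
# Sketch of the basic machine-learning workflow: train on historical
# data, then evaluate predictions on data the model has not seen.
# The data here is synthetic, generated only for illustration.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
X = rng.uniform(500, 3500, size=(200, 1))        # "historical" feature: area
y = 0.05 * X[:, 0] + rng.normal(0, 5, size=200)  # price with noise

# Hold out part of the history to check the model on unseen data
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
model = LinearRegression().fit(X_train, y_train)

score = model.score(X_test, y_test)  # R^2 on the unseen data
print(round(score, 3))
```

The held-out score is what tells us whether the model has actually learned the pattern rather than memorized the history.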

3.2.3 Need for Machine Learning
The need for machine learning is increasing day by day. The reason behind
the need for machine learning is that it is capable of doing tasks that are too
complex for a person to implement directly. As humans, we have
limitations: we cannot process huge amounts of data manually, so we need
computer systems, and this is where machine learning makes things easy
for us. We can train machine learning algorithms by providing them with
huge amounts of data and letting them explore the data, construct models,
and predict the required output automatically. The
performance of the machine learning algorithm depends on the amount of
data, and it can be determined by the cost function. With the help of machine
learning, we can save both time and money. The importance of machine
learning can be easily understood by its use’s cases, Currently, machine
learning is used in self-driving cars, cyber fraud detection, face recognition,
and friend suggestion by Facebook, etc. Various top companies such as
Netflix and Amazon have built machine learning models that are using a vast
amount of data to analyses the user interest and recommend product
accordingly.

3.2.4 Applications of Machine Learning

Machine learning is a buzzword in today's technology, and it is growing
very rapidly day by day. We use machine learning in our daily lives even
without knowing it, through services such as Google Maps, Google
Assistant, Alexa, and more. Some of the most popular real-world
applications of machine learning are shown in Fig. 3.

Machine learning Life cycle


Machine learning has given computer systems the ability to learn
automatically without being explicitly programmed. But how does a
machine learning system work? It can be described using the machine
learning life cycle: a cyclic process for building an
efficient machine learning project. The main purpose of the life cycle is to
find a solution to the problem or project. The machine learning life cycle involves
seven major steps, which are given below:
 Gathering Data
 Data preparation
 Data Wrangling
 Analyze Data
 Train the model
 Test the model
 Deployment

Fig 3 Application of Machine Learning

Gathering Data:
Data gathering is the first step of the machine learning life cycle. The goal
of this step is to identify the data sources and obtain the data; data can be
collected from various sources such as files, databases, the internet, or
mobile devices. This is one of the most important steps of the life cycle,
because the quantity and quality of the collected data determine the
efficiency of the output: the more data, the more accurate the prediction.
This step includes the tasks below:

 Identify various data sources
 Collect data
 Integrate the data obtained from different source

Data preparation:
After collecting the data, we need to prepare it for the further steps. Data
preparation is the step where we put our data into a suitable place and
prepare it for use in machine learning training. In this step we first put all
the data together and then randomize its ordering. This step can be further
divided into two processes:

Data exploration:
It is used to understand the nature of the data we have to work with: its
characteristics, format, and quality. A better understanding of the data
leads to a more effective outcome. Here we look for correlations, general
trends, and outliers.

Data Wrangling:
Data wrangling is the process of cleaning and converting raw data into a
usable format: cleaning the data, selecting the variables to use, and
transforming the data into a proper format to make it more suitable for
analysis in the next step. It is one of the most important steps of the
complete process, and cleaning is required to address quality issues. Not all
of the collected data is necessarily useful. In real-world applications,
collected data may have various issues, including:
 Missing Values
 Duplicate data
 Invalid data
 Noise
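As a minimal sketch of handling the first two issues, missing values and duplicates, using pandas (the rows are invented for illustration):

```python
# Sketch of basic data wrangling with pandas: dropping duplicate rows
# and filling missing values. The rows are invented for illustration.
import pandas as pd

df = pd.DataFrame({
    "area": [1000, 1500, 1500, 2000, None],
    "bedrooms": [2, 3, 3, None, 4],
    "price": [50, 75, 75, 98, 130],
})

df = df.drop_duplicates()  # remove exact duplicate rows
# Fill missing numeric values with the column median
df["area"] = df["area"].fillna(df["area"].median())
df["bedrooms"] = df["bedrooms"].fillna(df["bedrooms"].median())

print(len(df), df.isna().sum().sum())
```

Median imputation is only one option; depending on the feature, dropping the row or using a group-wise statistic may be more appropriate.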

Data Analysis
Now the cleaned and prepared data is passed on to the analysis step, which
involves:
 Selection of analytical techniques
 Building models
 Reviewing the result

The aim of this step is to build a machine learning model that analyzes the
data using various analytical techniques, and then to review the outcome.
It starts with determining the type of problem, upon which we select
machine learning techniques such as classification, regression, cluster
analysis, or association; we then build the model using the prepared data
and evaluate it.

Deployment
The last step of the machine learning life cycle is deployment, where we
deploy the model in a real-world system. If the prepared model produces
accurate results as per our requirements at acceptable speed, we deploy it in
the real system. Before deploying the project, however, we check whether
its performance keeps improving with the available data. The deployment
phase is similar to preparing the final report for a project.

3.2.5 Machine Learning Classifications


Machine learning can be examined in three parts, as follows:
 Supervised learning
 Unsupervised learning
 Reinforcement learning

3.2.5.1 Supervised Learning


Supervised learning is a type of machine learning in which we provide
sample labelled data to the machine learning system in order to train it,
and on that basis it predicts the output. The system creates a model from
the labelled data to understand the datasets and learn about each data
point; once training and processing are done, we test the model on sample
data to check whether it predicts the correct output. The goal of supervised
learning is to map input data to output data. Supervised learning is based
on supervision, just as a student learns under the supervision of a teacher;
an example of supervised learning is spam filtering. Supervised learning is
the process of providing input data as well as the correct output data to the
machine learning model, and the aim of a supervised learning algorithm is
to find a mapping function from the input variable (x) to the output
variable (y). In the real world, supervised learning can be used for risk
assessment, image classification, fraud detection, spam filtering, and so on.
In supervised learning, models are trained using a labelled dataset, where
the model learns about each type of data. Once the training process is
completed, the model is tested on the basis of test data (data held out from
the labelled dataset), and then it predicts the output. The working of
supervised learning can be easily understood from the example and
diagram below:

Fig. 4 Supervised Learning

Suppose we have a dataset of different types of shapes, including squares,
rectangles, triangles, and polygons. Now the first step is that we need to
train the model for each shape.

 If the given shape has four sides, and all the sides are equal, then it
will be labelled as a square.
 If the given shape has three sides, then it will be labelled as a triangle.
 If the given shape has six equal sides, then it will be labelled as a
hexagon.

Now, after training, we test our model using the test set, and the task of the
model is to identify the shape. The machine is already trained on all types of
shapes, and when it finds a new shape, it classifies the shape on the basis of
the number of sides and predicts the output.
Steps Involved in Supervised Learning

 First, determine the type of training dataset.
 Collect/gather the labelled training data.
 Split the dataset into a training set, a test set, and a validation set.
 Determine the input features of the training dataset, which should
carry enough information for the model to accurately predict the
output.
 Determine a suitable algorithm for the model, such as a support vector
machine, decision tree, etc.
 Execute the algorithm on the training dataset. Sometimes we need
validation sets as control parameters; these are subsets of the training
dataset.
 Evaluate the accuracy of the model by providing the test set. If the
model predicts the correct output, the model is accurate.
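The splitting and evaluation steps above can be sketched with scikit-learn; the toy shape dataset below is illustrative:

```python
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score

# Toy labelled dataset: [number_of_sides, all_sides_equal] -> shape label
X = [[4, 1], [3, 0], [6, 1], [4, 1], [3, 0], [6, 1], [4, 1], [3, 0]]
y = ["square", "triangle", "hexagon", "square",
     "triangle", "hexagon", "square", "triangle"]

# Hold out 25% of the data as a test set
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42)

# Train on the training set, evaluate on the held-out test set
model = DecisionTreeClassifier().fit(X_train, y_train)
accuracy = accuracy_score(y_test, model.predict(X_test))
print(accuracy)
```

In practice the validation split mentioned above would be carved out of the training portion for tuning control parameters.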
Types of supervised Machine learning Algorithms:


Supervised learning can be further divided into two types of problems, as
shown in the figure below:
Fig. 5 Types of supervised Machine learning

Regression

Regression algorithms are used if there is a relationship between the input
variable and the output variable. They are used for the prediction of
continuous variables, such as weather forecasting, market trends, etc. Below
are some popular Regression algorithms which come under supervised
learning:

 Linear Regression
 Regression Trees
 Non-Linear Regression
 Bayesian Linear Regression
 Polynomial Regression
Linear Regression in Machine Learning: -

Linear regression is one of the easiest and most popular Machine Learning
algorithms. It is a statistical method that is used for predictive analysis. Linear
regression makes predictions for continuous/real or numeric variables such
as sales, salary, age, product price, etc. The linear regression algorithm shows
a linear relationship between a dependent variable (y) and one or more
independent variables (x), hence it is called linear regression. Since linear
regression shows a linear relationship, it finds how the value of the dependent
variable changes according to the value of the independent variable. The
linear regression model provides a sloped straight line representing the
relationship between the variables. Consider the below image:

Fig 6 Linear Regression in Machine Learning

Mathematically, we can represent a linear regression as:


y = a0 + a1x + ε
Here,
y = Dependent variable (target variable)
x = Independent variable (predictor variable)
a0 = Intercept of the line (gives an additional degree of freedom)
a1 = Linear regression coefficient (scale factor applied to each input value)
ε = Random error
The values of the x and y variables are the training data for the Linear
Regression model representation.
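Fitting a0 and a1 from (x, y) training pairs can be sketched with NumPy's least-squares routine; the data points below are illustrative and follow an exact line:

```python
import numpy as np

# Toy training data following y = 2x + 1 exactly
x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([3.0, 5.0, 7.0, 9.0])

# Design matrix [1, x] so lstsq returns [a0, a1]
A = np.column_stack([np.ones_like(x), x])
(a0, a1), *_ = np.linalg.lstsq(A, y, rcond=None)

print(round(float(a0), 6), round(float(a1), 6))  # intercept and slope
```

On this exact data the fit recovers a0 = 1 and a1 = 2 up to floating-point error.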
Types of Linear Regression:


Linear regression can be further divided into two types of
algorithm:
 Simple Linear Regression: If a single independent variable is
used to predict the value of a numerical dependent variable, then such
a Linear Regression algorithm is called Simple Linear Regression.
 Multiple Linear Regression: If more than one independent variable
is used to predict the value of a numerical dependent variable, then
such a Linear Regression algorithm is called Multiple Linear
Regression.
Linear Regression Line


 A linear line showing the relationship between the dependent and
independent variables is called a regression line. A regression line can
show two types of relationship:
 Positive Linear Relationship: If the dependent variable increases on
the Y-axis as the independent variable increases on the X-axis, then
such a relationship is termed a positive linear relationship.
Fig 7 Linear Regression Line
Negative Linear Relationship


If the dependent variable decreases on the Y-axis as the independent variable
increases on the X-axis, then such a relationship is called a negative linear
relationship.
Fig 8 Negative Linear Relationship
Finding the best fit line:


When working with linear regression, our main goal is to find the best fit
line, which means the error between the predicted values and the actual
values should be minimized; the best fit line has the least error. Different
values for the weights or coefficients of the line (a0, a1) give different
regression lines, so we need to calculate the best values for a0 and a1. To do
this, we use a cost function.
Cost function

 The different values for weights or coefficient of lines (a0, a1) gives
the different line of regression, and the cost function is used to
estimate the values of the coefficient for the best fit line.
 Cost function optimizes the regression coefficients or weights. It
measures how a linear regression model is performing.
 We can use the cost function to find the accuracy of the mapping
function, which maps the input variable to the output variable. This
mapping function is also known as Hypothesis function.
 For Linear Regression, we use the Mean Squared Error (MSE) cost
function, which is the average of the squared errors between the
predicted values and the actual values. For the above linear equation,
MSE can be calculated as:

MSE = (1/N) Σ (yi − (a1xi + a0))²

Where,
N = Total number of observations
yi = Actual value
(a1xi + a0) = Predicted value
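The MSE above can be computed directly; a small sketch with NumPy, using illustrative values:

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([3.0, 5.0, 7.0, 10.0])   # actual values

a0, a1 = 1.0, 2.0                     # candidate coefficients
y_pred = a1 * x + a0                  # predicted values: 3, 5, 7, 9

# MSE = (1/N) * sum((y_i - (a1*x_i + a0))^2)
mse = np.mean((y - y_pred) ** 2)
print(mse)  # -> 0.25
```

Only the last point is mispredicted (10 vs 9), so the MSE is 1²/4 = 0.25; an optimizer such as gradient descent would adjust a0 and a1 to drive this value down.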
Classification
Classification algorithms are used when the output variable is
categorical, which means there are two or more classes, such as
Yes-No, Male-Female, True-False, etc. Below are some popular
Classification algorithms which come under supervised learning:
 Random Forest
 Decision Trees
 Logistic Regression
 Support Vector Machines
Random Forest Algorithm

Random Forest is a popular machine learning algorithm that belongs to the
supervised learning technique. It can be used for both Classification and
Regression problems in ML. It is based on the concept of ensemble learning,
which is a process of combining multiple classifiers to solve a complex
problem and to improve the performance of the model. As the name suggests,
"Random Forest is a classifier that contains a number of decision trees on
various subsets of the given dataset and takes the average to improve the
predictive accuracy of that dataset." Instead of relying on one decision tree,
the random forest takes the prediction from each tree and, based on the
majority vote of those predictions, produces the final output. A greater
number of trees in the forest leads to higher accuracy and prevents the
problem of overfitting. The below diagram explains the working of the
Random Forest algorithm:

Fig 9 Random Forest Algorithm

Why use Random Forest?


Below are some points that explain why we should use the Random Forest
algorithm:

 It takes less training time as compared to other algorithms.
 It predicts output with high accuracy, and even for large datasets it
runs efficiently.
 It can also maintain accuracy when a large proportion of the data is
missing.
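The majority-vote idea described above can be sketched with scikit-learn's RandomForestClassifier; the toy dataset and the "budget"/"premium" labels are illustrative, not taken from the project's data:

```python
from sklearn.ensemble import RandomForestClassifier

# Toy dataset: [total_sqft, BHK] -> price category (labels are illustrative)
X = [[600, 1], [900, 2], [1200, 2], [1800, 3], [2400, 4], [3000, 4]]
y = ["budget", "budget", "budget", "premium", "premium", "premium"]

# 100 decision trees, each trained on a bootstrap subset of the data;
# the final class is decided by majority vote across the trees
clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(X, y)

print(clf.predict([[700, 1], [2600, 4]]))
```

For a regression target such as the actual price, RandomForestRegressor averages the trees' outputs instead of voting.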
Applications of Random Forest


There are mainly four sectors where Random Forest is mostly used:
 Banking: The banking sector mostly uses this algorithm for the
identification of loan risk.
 Medicine: With the help of this algorithm, disease trends
and risks of the disease can be identified.
 Land Use: We can identify areas of similar land use with
this algorithm.
 Marketing: Marketing trends can be identified using this
algorithm.
Advantages of Random Forest


 Random Forest is capable of performing both Classification
and Regression tasks.
 It is capable of handling large datasets with high
dimensionality.
 It enhances the accuracy of the model and prevents the
overfitting issue.
Advantages of Supervised learning


 With the help of supervised learning, the model can predict
the output on the basis of prior experiences.
 In supervised learning, we can have an exact idea about the
classes of objects.
 Supervised learning models help us to solve various real-world
problems such as fraud detection, spam filtering, etc.
Disadvantages of supervised learning: -


 Supervised learning models are not suitable for handling complex
tasks.
 Supervised learning cannot predict the correct output if the test data
is different from the training dataset.
 Training requires a lot of computation time.
 In supervised learning, we need enough knowledge about the classes
of objects.
Decision Tree Classification Algorithm: -

 Decision Tree is a supervised learning technique that can be used for
both Classification and Regression problems, but mostly it is preferred
for solving Classification problems. It is a tree-structured classifier,
where internal nodes represent the features of a dataset, branches
represent the decision rules, and each leaf node represents the
outcome.
 In a Decision tree, there are two nodes, which are the Decision Node
and Leaf Node. Decision nodes are used to make any decision and
have multiple branches, whereas Leaf nodes are the output of those
decisions and do not contain any further branches.
 The decisions or the test are performed on the basis of features of the
given dataset.
 In order to build a tree, we use the CART algorithm, which stands
for Classification and Regression Tree algorithm.
 A decision tree simply asks a question, and based on the answer
(Yes/No), it further splits the tree into subtrees.
 Below diagram explains the general structure of a decision tree:

Fig 10 Decision Tree Classification Algorithm
Why use Decision Trees?

 There are various algorithms in Machine Learning, so choosing the
best algorithm for the given dataset and problem is the main point to
remember while creating a machine learning model. Below are two
reasons for using the Decision Tree:
 Decision Trees usually mimic human thinking ability while making a
decision, so they are easy to understand.
 The logic behind the decision tree can be easily understood because
it shows a tree-like structure.
How does the Decision Tree algorithm Work?


In a decision tree, to predict the class of a given record, the algorithm starts
from the root node of the tree. The algorithm compares the value of the root
attribute with the record (real dataset) attribute and, based on the
comparison, follows the branch and jumps to the next node. For the next
node, the algorithm again compares the attribute value with the other
sub-nodes and moves further. It continues this process until it reaches a leaf
node of the tree. The complete process can be better understood using the
below algorithm:

 Step-1: Begin the tree with the root node, say S, which contains the
complete dataset.
 Step-2: Find the best attribute in the dataset using an Attribute
Selection Measure (ASM).
 Step-3: Divide S into subsets that contain the possible values for the
best attribute.
 Step-4: Generate the decision tree node, which contains the best
attribute.
 Step-5: Recursively make new decision trees using the subsets of the
dataset created in Step-3. Continue this process until a stage is reached
where the nodes cannot be classified further; the final node is called a
leaf node. For example, a decision node may split into two leaf nodes
(Accepted offer and Declined offer). Consider the below diagram:

Fig 11 Decision Tree algorithm Working
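The stepwise splitting described above can be observed directly by training a small tree and printing its learned rules; a sketch with scikit-learn's CART implementation, where the offer data and feature names are illustrative:

```python
from sklearn.tree import DecisionTreeClassifier, export_text

# Toy data: [salary_in_lakhs, commute_km] -> accept/decline a job offer
X = [[2, 5], [3, 30], [8, 5], [9, 25], [10, 2], [4, 4]]
y = ["decline", "decline", "accept", "accept", "accept", "decline"]

# scikit-learn builds the tree with the CART algorithm
tree = DecisionTreeClassifier(criterion="gini", random_state=0).fit(X, y)

# Print the decision rules (internal nodes and leaf outcomes)
print(export_text(tree, feature_names=["salary", "commute"]))
```

The printed rules show the root split chosen by the attribute selection measure (Gini impurity here) and the leaf nodes holding the final decisions.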

Logistic Regression
 Logistic Regression is a significant machine learning algorithm
because it has the ability to provide probabilities and classify new data
using continuous and discrete datasets.
 Logistic Regression can be used to classify the observations using
different types of data and can easily determine the most effective
variables used for the classification. The below image is showing the
logistic function:

Fig 12 Logistic Regression
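The logistic (sigmoid) function shown in the figure maps any real-valued input into the range (0, 1), which is what lets Logistic Regression output probabilities; a minimal sketch:

```python
import math

def sigmoid(z):
    # Logistic function: 1 / (1 + e^(-z))
    return 1.0 / (1.0 + math.exp(-z))

print(sigmoid(0))   # -> 0.5, the decision boundary
print(sigmoid(4))   # close to 1: confident positive class
print(sigmoid(-4))  # close to 0: confident negative class
```

A classifier then thresholds this probability (commonly at 0.5) to assign the final class label.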
3.2.5.2 Unsupervised Machine Learning:
Unsupervised learning cannot be directly applied to a regression or
classification problem because, unlike supervised learning, we have the input
data but no corresponding output data. The goal of unsupervised learning is
to find the underlying structure of the dataset, group the data according to
similarities, and represent the dataset in a compressed format.
Working of Unsupervised Learning

Working of unsupervised learning can be understood by the below diagram:

Fig 13 Working of Unsupervised Learning

Here, we have taken unlabeled input data, which means it is not categorized
and corresponding outputs are also not given. This unlabeled input data is
fed to the machine learning model in order to train it. First, the model
interprets the raw data to find hidden patterns, and then suitable algorithms
such as k-means clustering or hierarchical clustering are applied.
Unsupervised Learning Algorithms: -

Below is a list of some popular unsupervised learning algorithms:

 K-means clustering
 KNN (k-nearest neighbors)
 Hierarchical clustering
 Anomaly detection
 Neural Networks
 Principal Component Analysis
 Independent Component Analysis
 Apriori algorithm
 Singular value decomposition
Clustering in Machine Learning


Clustering is very important as it determines the intrinsic grouping among
the unlabeled data present. There are no fixed criteria for good clustering;
it depends on the user and on which criteria satisfy their need. For instance,
we could be interested in finding representatives for homogeneous groups
(data reduction), in finding "natural clusters" and describing their unknown
properties ("natural" data types), in finding useful and suitable groupings
("useful" data classes), or in finding unusual data objects (outlier detection).
The algorithm must make some assumptions about what constitutes the
similarity of points, and each assumption makes different and equally valid
clusters. Clustering, or cluster analysis, is a machine learning technique
which groups the unlabeled dataset. It can be defined as "a way of grouping
the data points into different clusters, consisting of similar data points." The
objects with possible similarities remain in a group that has few or no
similarities with another group. The algorithm does this by finding similar
patterns in the unlabeled dataset, such as shape, size, color, behavior, etc.,
and divides the data as per the presence and absence of those patterns. It is
an unsupervised learning method, hence no supervision is provided to the
algorithm, and it deals with the unlabeled dataset. After applying this
clustering technique, each cluster or group is given a cluster-ID, which an
ML system can use to simplify the processing of large and complex
datasets.
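The cluster-ID assignment described above can be sketched with scikit-learn's k-means implementation; the 2-D points below are illustrative:

```python
import numpy as np
from sklearn.cluster import KMeans

# Two visually separated groups of unlabeled 2-D points
X = np.array([[1.0, 1.0], [1.2, 0.8], [0.8, 1.1],
              [8.0, 8.0], [8.2, 7.9], [7.8, 8.3]])

# Ask k-means for 2 clusters; each point receives a cluster-ID (0 or 1)
km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
print(km.labels_)
```

Points within the same group share a cluster-ID, and those IDs can then stand in for the raw points in downstream processing.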
Apart from these general usages, clustering is used by Amazon in its
recommendation system to provide recommendations as per the past
search of products. Netflix also uses this technique to recommend movies
and web series to its users as per their watch history. The below diagram
explains the working of the clustering algorithm; we can see that different
fruits are divided into several groups with similar properties.

Fig 14 Clustering in Machine Learning

3.2.5.3 Reinforcement Learning:


This is a kind of learning in which an agent learns via a reward system.
Although there are start and finish points, the aim of the agent is to reach
the goal by the shortest correct path. When the agent takes correct paths, it
is given positive rewards, while taking wrong paths yields negative
rewards. Learning occurs on the way to the goal.
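The reward-driven learning described above is commonly implemented with the Q-learning update rule Q(s,a) ← Q(s,a) + α[r + γ·max Q(s',·) − Q(s,a)]. Below is a minimal sketch on a tiny 1-D corridor; the states, rewards, and hyperparameters are all illustrative:

```python
import random

# 1-D corridor: states 0..4, goal at state 4; actions: 0 = left, 1 = right
N_STATES, GOAL = 5, 4
alpha, gamma, epsilon = 0.5, 0.9, 0.3
Q = [[0.0, 0.0] for _ in range(N_STATES)]

random.seed(0)
for _ in range(300):                       # training episodes
    s = 0
    while s != GOAL:
        # epsilon-greedy action selection: mostly exploit, sometimes explore
        if random.random() < epsilon:
            a = random.randrange(2)
        else:
            a = 0 if Q[s][0] >= Q[s][1] else 1
        s2 = max(0, s - 1) if a == 0 else min(GOAL, s + 1)
        r = 1.0 if s2 == GOAL else 0.0     # positive reward only at the goal
        # Q-learning update
        Q[s][a] += alpha * (r + gamma * max(Q[s2]) - Q[s][a])
        s = s2

# The learned greedy policy should move right (action 1) toward the goal
print([0 if Q[s][0] >= Q[s][1] else 1 for s in range(GOAL)])
```

After training, the positive reward at the goal has propagated backward through the Q-values, so the greedy policy takes the shortest path to the finish.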
CHAPTER. 4
4. IMPLEMENTATION

4.1 CODE
import numpy as np
import matplotlib.pyplot as plt

def plot_scatter_chart(df, location):
    # Compare 2 BHK and 3 BHK prices for a given location
    bhk2 = df[(df.location == location) & (df.BHK == 2)]
    bhk3 = df[(df.location == location) & (df.BHK == 3)]
    plt.rcParams['figure.figsize'] = (15, 10)
    plt.scatter(bhk2.total_sqft, bhk2.price, color='blue', label='2 BHK', s=50)
    plt.scatter(bhk3.total_sqft, bhk3.price, color='green', marker='+', label='3 BHK', s=50)
    plt.xlabel('Total Square Foot')
    plt.ylabel('Price')
    plt.title(location)
    plt.legend()

plot_scatter_chart(data3, "Rajaji Nagar")

def remove_bhk_outliers(df):
    # Drop properties whose price per square foot is below the mean
    # price per square foot of the next-smaller BHK category in the
    # same location
    exclude_indices = np.array([])
    for location, location_df in df.groupby('location'):
        bhk_stats = {}
        for BHK, BHK_df in location_df.groupby('BHK'):
            bhk_stats[BHK] = {
                'mean': np.mean(BHK_df.price_per_sqft),
                'std': np.std(BHK_df.price_per_sqft),
                'count': BHK_df.shape[0]
            }
        for BHK, BHK_df in location_df.groupby('BHK'):
            stats = bhk_stats.get(BHK - 1)
            if stats and stats['count'] > 5:
                exclude_indices = np.append(
                    exclude_indices,
                    BHK_df[BHK_df.price_per_sqft < stats['mean']].index.values)
    return df.drop(exclude_indices, axis='index')

data4 = remove_bhk_outliers(data3)
data4.shape
CHAPTER 5
5 RESULT ANALYSIS
5.1 VISUALIZATION INSIGHTS:
2BHK Preference:
The observation that most houses sold are 2BHK suggests that buyers may
prefer smaller-sized homes, possibly due to factors such as affordability,
family size, or lifestyle preferences.
Fig. 15 Visualization insights

Location Diversity:
With houses from 255 different locations, 'Whitefield' and 'Sarjapur Road'
emerge as popular areas. This information is valuable for understanding
market demand and can aid in targeted marketing or investment decisions.
Fig 16 Location Diversity
Distribution Plots:
The distribution plots for 'bath', 'bhk', 'price', and 'total_sqft' provide insights
into the spread and variability of these features. Understanding their
distributions can help in identifying outliers, understanding central
tendencies, and assessing data quality.
Train-Test Split and Model Building:

Data Splitting:
The dataset is split into training and testing sets, with 80% of the data used
for training and 20% for testing. This ensures that the model's performance is
evaluated on unseen data, providing a more accurate assessment of its
generalization ability.
Model Selection:
Three regression models - Linear Regression, Lasso Regression, and Ridge
Regression - are chosen for predicting house prices. These models offer
different approaches to regression and can capture different aspects of the
data's underlying relationships.

Preprocessing:
One-hot encoding is used to handle the categorical feature 'location', while
standard scaling ensures that all features are on a similar scale, preventing
any particular feature from dominating the model training process.

Evaluation Metric:
R2 score, also known as the coefficient of determination, is employed as the
evaluation metric. It represents the proportion of the variance in the
dependent variable (house prices) that is predictable from the independent
variables.
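The preprocessing and evaluation described above can be sketched as a scikit-learn pipeline. This is a sketch of the approach, not the project's actual code; the column names mirror the report's dataset, but the rows below are illustrative:

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.linear_model import Ridge
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Tiny illustrative dataset with the report's feature names
df = pd.DataFrame({
    "location": ["Whitefield", "Sarjapur Road", "Whitefield", "Rajaji Nagar",
                 "Sarjapur Road", "Rajaji Nagar", "Whitefield", "Sarjapur Road"],
    "total_sqft": [1000, 1200, 1500, 900, 1100, 1800, 2000, 1300],
    "bath": [2, 2, 3, 2, 2, 3, 4, 2],
    "bhk": [2, 2, 3, 2, 2, 3, 4, 3],
    "price": [55, 60, 85, 70, 58, 120, 130, 68],
})
X, y = df.drop(columns="price"), df["price"]

# One-hot encode 'location'; standard-scale the numeric columns
pre = ColumnTransformer([
    ("loc", OneHotEncoder(handle_unknown="ignore"), ["location"]),
    ("num", StandardScaler(), ["total_sqft", "bath", "bhk"]),
])
model = Pipeline([("pre", pre), ("reg", Ridge(alpha=1.0))])

# 80/20 train-test split as in the report
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0)
model.fit(X_train, y_train)
print(r2_score(y_test, model.predict(X_test)))
```

Swapping `Ridge` for `LinearRegression` or `Lasso` reuses the same pipeline, which is how the three models in the report can be compared on an equal footing.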
Fig. 17 Evaluation Metric
Result Analysis:

Model Performance:
Linear Regression and Ridge Regression exhibit similar performance, with
R2 scores of around 0.82. This indicates that approximately 82% of the
variance in house prices is captured by these models.

Impact of Regularization:
Lasso Regression, which applies L1 regularization, slightly underperforms
compared to the other two models. The negligible difference in performance
between Ridge and Linear Regression suggests that regularization might not
significantly affect model performance in this scenario.
Overall Assessment:
The models demonstrate good performance in predicting house prices based
on the given features. However, there is room for further analysis, feature
engineering, or tuning to potentially improve model performance even
further. In conclusion, the analysis highlights the effectiveness of the chosen
regression models in predicting house prices. The insights gleaned from data
visualization aid in understanding market dynamics, while model evaluation
provides valuable feedback for refining model selection and preprocessing
techniques.
5.2 ADVANTAGES

Predicting house prices using Python offers several advantages:

 Flexibility: Python provides a wide range of libraries and frameworks
for machine learning, such as Scikit-learn, TensorFlow, and PyTorch.
This flexibility allows developers to choose the best tools for their
specific requirements.
 Ease of Use: Python is known for its simplicity and readability,
making it accessible for both beginners and experienced developers.
It offers clear and concise syntax, which facilitates the
implementation of machine learning algorithms.
 Abundant Libraries: Python boasts a rich ecosystem of libraries and
packages for data analysis, preprocessing, visualization, and
modelling. For house price prediction, libraries like Pandas, NumPy,
Matplotlib, and Seaborn are invaluable for tasks such as data
manipulation, exploration, and visualization.
 Scalability: Python is highly scalable, allowing developers to work
with datasets of various sizes. Whether working with small datasets
or big data, Python's libraries and frameworks can handle the task
efficiently.
 Community Support: Python has a vast and active community of
developers, data scientists, and researchers. This community provides
extensive documentation, tutorials, and forums, making it easy to find
help and resources for building predictive models.
 Integration: Python seamlessly integrates with other technologies
and tools commonly used in data science and machine learning
projects. It can be integrated with databases, web frameworks, and
cloud services, allowing for end-to-end development and deployment
of predictive models.
 Machine Learning Ecosystem: Python's machine learning
ecosystem is well-established and constantly evolving. It offers state-
of-the-art algorithms, techniques, and methodologies for solving
predictive modelling problems, including house price prediction.
 Interpretability: Python-based machine learning models are often
highly interpretable, allowing stakeholders to understand the factors
driving predictions. This transparency is crucial, especially in real
estate, where buyers, sellers, and agents seek to understand the
rationale behind house price estimates.
 Open Source: Python is open source and free to use, making it
accessible to everyone. This democratization of technology enables
individuals and organizations of all sizes to leverage machine learning
for various applications, including house price prediction.
5.3 DISADVANTAGES
While Python offers numerous advantages for house price
prediction, there are also some potential disadvantages to consider:

 Performance: Python, being an interpreted language, can be slower
compared to compiled languages like C++ or Java. For large-scale
datasets or complex models, Python's performance may become a
limiting factor, leading to longer training and inference times.
 Memory Usage: Python's memory management can be less efficient
compared to languages like C or C++. This inefficiency may lead to
higher memory usage, especially when working with large datasets or
deep learning models, potentially causing resource constraints on
systems with limited memory.
 GIL Limitation: Python's Global Interpreter Lock (GIL) can hinder
multithreaded performance, particularly in CPU-bound tasks. While
libraries like NumPy and Pandas can offload computation to
optimized C or Fortran code, certain operations may still be affected
by the GIL, impacting parallel processing performance.
 Dependency Management: Python's dependency management
system, particularly with respect to package versions and
compatibility, can sometimes be challenging. Dependency conflicts
or version mismatches between libraries may arise, requiring careful
management and potentially causing issues with model
reproducibility.
 Debugging Complexity: Python's dynamic typing and flexible
syntax, while advantageous for development speed, can sometimes
lead to more challenging debugging processes. Errors may not be
caught until runtime, and troubleshooting issues in complex machine
learning pipelines may require significant effort.
 Limited Deployment Options: While Python excels in model
development and experimentation, deploying Python-based machine
learning models into production environments may present
challenges.
 Interpretability: While Python-based machine learning models can
offer interpretability, certain advanced techniques such as deep
learning may produce less interpretable models. Understanding and
explaining the predictions of complex models may require additional
effort and expertise.
 Security Risks: Python's open-source nature and extensive library
ecosystem can introduce security risks, particularly when using third-
party packages or dependencies. Ensuring the security of machine
learning pipelines and protecting against vulnerabilities requires
careful attention and proactive measures.
 Learning Curve: While Python's syntax is relatively easy to learn,
mastering the full spectrum of machine learning techniques and
libraries can be challenging. Beginners may face a steep learning
curve, requiring time and dedication to gain proficiency in data
preprocessing, model selection, and evaluation.
5.4 MAINTENANCE

Maintaining a house price prediction system developed using Python
involves several key aspects to ensure its reliability, accuracy, and
relevance over time. Here are some maintenance tasks for such a system:
 Data Quality Monitoring: Continuously monitor the quality of input
data to detect any anomalies, missing values, or inconsistencies.
Implement data validation checks and automated alerts to notify when
data quality issues arise.
 Model Performance Monitoring: Regularly assess the performance of
the predictive models deployed in the system. Monitor key metrics
such as accuracy, precision, recall, and F1-score. Retrain models
periodically using updated data to maintain their effectiveness.
 Feature Engineering Updates: Stay informed about changes in the real
estate market and incorporate relevant features or indicators into the
predictive models. Periodically review and update feature engineering
techniques to capture new trends or patterns in housing data.
 Model Interpretability: Ensure that the predictive models remain
interpretable and explainable. Use techniques such as feature
importance analysis and model-agnostic interpretation methods to
understand the factors driving predictions and maintain transparency.
 Version Control: Implement version control for both code and data to
track changes and facilitate reproducibility. Maintain a version history
of the predictive models, along with the associated data preprocessing
and feature engineering pipelines.
 Security Measures: Implement security measures to protect the
integrity and confidentiality of the data used in the prediction system.
Use encryption, access controls, and secure communication protocols
to safeguard sensitive information.
 Scalability: Monitor system performance and scalability as the
volume of data and user traffic grows. Optimize code and
infrastructure to handle increasing workloads efficiently and ensure
timely responses to user queries.
 Documentation: Maintain comprehensive documentation for the
prediction system, including model specifications, data sources,
preprocessing steps, and evaluation metrics. Document any changes
or updates made to the system over time.


 User Feedback Incorporation: Gather feedback from users and
stakeholders to identify areas for improvement and address any
usability issues. Incorporate user feedback into future iterations of the
prediction system to enhance user satisfaction and adoption.
 Continual Improvement: Continuously evaluate and refine the
prediction system based on feedback, performance metrics, and
evolving business requirements. Experiment with new algorithms,
techniques, or features to improve predictive accuracy and relevance.
5.5 APPLICATION
Table 1 Application

CHAPTER 6

6.1 CONCLUSION AND FUTURE DEVELOPMENTS

Using several characteristics, the proposed system predicts property prices
in Bangalore. We experimented with different Machine Learning algorithms
to find the best model. Compared to all other algorithms, the Decision Tree
algorithm achieved the lowest loss and the greatest R-squared. Flask was
used to create the website.
Let's see how our project pans out. Open the HTML web page we generated
and run the app.py file in the backend. Input the property's square footage,
the number of bedrooms, the number of bathrooms, and the location, then
click 'ESTIMATE PRICE.' We forecasted the cost of what may be someone's
ideal home.

The goal of the project "House Price Prediction Using Machine Learning" is
to forecast house prices based on various features in the provided data. Our
best accuracy was around 90% after we trained and tested the model. To make
this model distinct from other prediction systems, we must include more
parameters like tax and air quality. People can purchase houses on a budget
and minimize financial loss. Numerous algorithms are used to determine
house values. The selling price was determined with greater precision and
accuracy. People will benefit greatly from this. Numerous elements that
influence housing prices must be taken into account and handled.