House-Price-Prediction-Using-Regression-Techniques Retouch - Removed

This document discusses building a machine learning model to predict house prices in Bangalore, India. It scrapes data from real estate websites, explores and visualizes the data. Multiple linear regression is identified as the best model based on mean squared error. The model is deployed through a Python web app. The goal is to create an accurate price prediction tool to help buyers and sellers compare prices.

Uploaded by

krushnapalsinhvaghela76

TABLE OF CONTENTS

Chapter Number Contents

1. Abstract
2. INTRODUCTION

2.1 AIM and IMPORTANCE

2.2 Need and Motivation

3. DATASET
Steps in Preparing Data for Model

3.1 Data Exploration

3.2 Data Visualization

3.3 Data Selection

4. LANGUAGE AND MODELS USED

5. Python
Jupyter Notebook
NumPy
Pandas
Seaborn
Matplotlib

6. Models Used
Multiple Linear Regression
7. RESULTS AND DISCUSSIONS
8. Deployment App
9. Conclusion
10. Repository link

Downloaded by krushnapalsinh vaghela ([email protected])

Abstract

House price forecasting is an important topic in real estate. The literature attempts to derive useful knowledge from historical data on property markets. In this work, machine learning techniques are applied to analyse historical property transactions in India to discover useful models for house buyers and sellers. The analysis reveals a high discrepancy between house prices in the most expensive and most affordable suburbs of the city of Bangalore. Moreover, experiments demonstrate that Multiple Linear Regression, evaluated by mean squared error, is a competitive approach.


INTRODUCTION

AIM and IMPORTANCE

Aim
These are the parameters on which we will evaluate ourselves:

• Create an effective price prediction model

• Validate the model’s prediction accuracy

• Identify the important home price attributes which feed the model’s predictive power.


Need and Motivation

Having lived in India for so many years, if there is one thing that I had been taking for granted, it’s that housing and rental prices continue to rise. Since the housing crisis of 2008, housing prices have recovered remarkably well, especially in major housing markets. However, in the 4th quarter of 2016, I was surprised to read that Bombay housing prices had fallen the most in the last 4 years. In fact, median resale prices for condos and coops fell 6.3%, marking the first time there was a decline since Q1 of 2017. The decline has been partly attributed to political uncertainty domestically and abroad and the 2014 election.

This model is intended to maintain transparency for customers and to make price comparison easy. If a customer finds that the price of a house listed on a given website is higher than the price predicted by the model, the customer can reject that house.


DATASET

We scraped the data from the 99acres.com website, which is one of the leading real estate websites operating in India.

Our data contains Bangalore houses only.

The dataset looks as follows:

[Figure: preview of the dataset]

Data Exploration

Data exploration is the first step in data analysis and typically involves summarizing the main characteristics of a data set, including its size, accuracy, initial patterns in the data and other attributes. It is commonly conducted by data analysts using visual analytics tools, but it can also be done in more advanced statistical software or in a programming language such as Python. Before it can conduct analysis on data collected from multiple data sources and stored in data warehouses, an organization must know how many cases are in a data set, what variables are included, how many missing values there are and what general hypotheses the data is likely to support. An initial exploration of the data set can help answer these questions by familiarizing analysts with the data with which they are working.
We divided the data 9:1 into training and testing sets, respectively.
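
The exploration questions above (number of cases, variables, missing values) and the 9:1 split can be sketched as follows. This is a minimal illustration using only the Python standard library. The column names, the sample rows and the helper `split_train_test` are assumptions made for the example and are not taken from the project's data or code. In practice the same steps are commonly done with Pandas and with scikit-learn's `train_test_split(test_size=0.1)`.

```python
import csv
import io
import random

# Hypothetical sample in the shape a scraped listings file might take;
# the column names and values are illustrative, not the project's data.
SAMPLE_CSV = """location,total_sqft,bath,bhk,price
Whitefield,1200,2,2,65.0
Hebbal,1500,3,3,110.0
Yelahanka,,2,2,48.5
Whitefield,1800,3,3,
Hebbal,950,1,2,52.0
Yelahanka,2100,4,4,150.0
Whitefield,1100,2,2,60.0
Hebbal,1350,2,3,95.0
Yelahanka,1000,2,2,45.0
Whitefield,1650,3,3,98.0
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))

# Basic exploration: number of cases, variables, and missing values per column.
n_cases = len(rows)
variables = list(rows[0].keys())
missing = {col: sum(1 for r in rows if r[col] == "") for col in variables}


def split_train_test(records, test_fraction=0.1, seed=42):
    """Shuffle a copy of the records and split them into train and test sets."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


train, test = split_train_test(rows)
print(n_cases, variables, missing)
print(len(train), len(test))
```

Fixing the seed keeps the split reproducible, so the same rows are held out on every run.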


Data Visualization

Data visualization is the graphical representation of information and data. By using visual elements like charts, graphs, and maps, data visualization tools provide an accessible way to see and understand trends, outliers, and patterns in data. In the world of Big Data, data visualization tools and technologies are essential to analyse massive amounts of information and make data-driven decisions.


Data Selection

Data selection is defined as the process of determining the appropriate data type and source, as well as suitable instruments to collect data. Data selection precedes the actual practice of data collection. This definition distinguishes data selection from selective data reporting (selectively excluding data that is not supportive of a research hypothesis) and interactive/active data selection (using collected data for monitoring activities/events, or conducting secondary data analyses). The process of selecting suitable data for a research project can impact data integrity.

The primary objective of data selection is the determination of appropriate data type, source, and instrument(s) that allow investigators to adequately answer research questions. This determination is often discipline-specific and is primarily driven by the nature of the investigation, existing literature, and accessibility to necessary data sources.

Data Transformation


The log transformation can be used to make highly skewed distributions less skewed. This can be valuable both for making patterns in the data more interpretable and for helping to meet the assumptions of inferential statistics.

It is hard to discern a pattern in the upper panel whereas the strong relationship is shown clearly in the lower panel. The comparison of the means of log-transformed data is actually a comparison of geometric means. This occurs because the anti-log of the arithmetic mean of log-transformed values is the geometric mean.
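
The relationship between the log transformation and the geometric mean can be checked numerically. The sketch below uses the Python standard library only. The price values are made up for illustration and are not taken from the project's dataset.

```python
import math
import statistics

# Hypothetical right-skewed prices; the values are illustrative only.
prices = [40.0, 55.0, 60.0, 75.0, 90.0, 400.0]

# Log-transform, average on the log scale, then transform back (anti-log).
log_prices = [math.log(p) for p in prices]
mean_of_logs = statistics.fmean(log_prices)
antilog_of_mean = math.exp(mean_of_logs)

# Compare with the geometric and arithmetic means of the raw prices.
geometric_mean = statistics.geometric_mean(prices)
arithmetic_mean = statistics.fmean(prices)

print(antilog_of_mean, geometric_mean, arithmetic_mean)
```

The first two printed values agree up to floating-point rounding, and both are smaller than the arithmetic mean, which is pulled upward by the single large price.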

[Figures: Price in sq.ft; Bathrooms]


LANGUAGE AND MODELS USED

Python

Python is widely used in scientific and numeric computing:

• SciPy is a collection of packages for mathematics, science, and engineering.
• Pandas is a data analysis and modelling library.
• IPython is a powerful interactive shell that features easy editing and
recording of a work session, and supports visualizations and parallel
computing.
• The Software Carpentry Course teaches basic skills for scientific
computing, running bootcamps and providing open-access teaching
materials.

Libraries used for this project include:

• Pandas
• NumPy
• Matplotlib
• Seaborn
• scikit-learn
• XGBoost

MODELS USED

Regression Model

• Linear Regression is a machine learning algorithm based on supervised learning.

• It performs a regression task. Regression models a target prediction value based on independent variables.

• It is mostly used for finding out the relationship between variables and forecasting.
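
The points above can be illustrated with a small multiple linear regression fitted by ordinary least squares. This is a sketch under stated assumptions, not the project's actual training code. The data are synthetic, the feature names `sqft` and `bath` are chosen for the example, and NumPy's `lstsq` stands in for whichever fitting routine the project used (scikit-learn's `LinearRegression` solves the same least-squares problem).

```python
import numpy as np

# Synthetic data: price = 20 + 0.05 * sqft + 8 * bathrooms, plus noise.
# These coefficients are invented for the illustration.
rng = np.random.default_rng(0)
sqft = rng.uniform(600, 3000, size=200)
bath = rng.integers(1, 5, size=200).astype(float)
price = 20.0 + 0.05 * sqft + 8.0 * bath + rng.normal(0, 2.0, size=200)

# Design matrix with a leading column of ones for the intercept.
X = np.column_stack([np.ones_like(sqft), sqft, bath])

# Ordinary least squares: find beta that minimises ||X @ beta - price||^2.
beta, *_ = np.linalg.lstsq(X, price, rcond=None)

# Mean squared error of the fitted model on the same data.
predictions = X @ beta
mse = float(np.mean((price - predictions) ** 2))
print(beta, mse)
```

The recovered coefficients land close to the values used to generate the data, which is the sense in which the model "finds the relationship between variables".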


RESULTS
Best Suited Model
Our study showed that Linear Regression gave the best performance on this dataset, as measured by mean squared error, and it can be used for deployment.
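
Selecting a model by mean squared error can be sketched as below. The actual prices, the two sets of predictions and the names `model_a` and `model_b` are hypothetical placeholders for illustration. They are not results from this study.

```python
def mean_squared_error(y_true, y_pred):
    """Average of squared differences between actual and predicted values."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical held-out prices and predictions from two placeholder models.
actual = [50.0, 80.0, 120.0, 65.0]
candidate_predictions = {
    "model_a": [52.0, 78.0, 115.0, 66.0],
    "model_b": [60.0, 70.0, 100.0, 75.0],
}

# Score every candidate and keep the one with the lowest error.
scores = {
    name: mean_squared_error(actual, preds)
    for name, preds in candidate_predictions.items()
}
best_model = min(scores, key=scores.get)
print(scores, best_model)
```

A lower mean squared error means the predictions sit closer to the actual prices on average, with large misses penalised more heavily than small ones.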

Deployment App
The model is deployed through a Python web server built with Flask, with an HTML and CSS front end, and Postman is used to call the API.
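
A deployment of this kind might look like the following sketch. It is not the project's actual app. The `/predict` route, the field names `total_sqft` and `bath`, the coefficient values and the helpers `predict_from_payload` and `create_app` are all assumptions made for the example. Flask is imported inside `create_app` so that the prediction logic can be exercised without the web framework installed.

```python
# Hypothetical coefficients standing in for a trained linear model.
INTERCEPT = 20.0
COEFFICIENTS = {"total_sqft": 0.05, "bath": 8.0}


def predict_from_payload(payload):
    """Validate a request payload and return (response_body, http_status)."""
    try:
        features = {name: float(payload[name]) for name in COEFFICIENTS}
    except (KeyError, TypeError, ValueError):
        return {"error": f"expected numeric fields: {sorted(COEFFICIENTS)}"}, 400
    price = INTERCEPT + sum(COEFFICIENTS[n] * v for n, v in features.items())
    return {"predicted_price": round(price, 2)}, 200


def create_app():
    """Build the Flask app that exposes the prediction logic over HTTP."""
    from flask import Flask, jsonify, request

    app = Flask(__name__)

    @app.route("/predict", methods=["POST"])
    def predict():
        # Treat a missing or malformed JSON body as an empty payload.
        body, status = predict_from_payload(request.get_json(silent=True) or {})
        return jsonify(body), status

    return app
```

Calling `create_app().run()` starts Flask's development server. A tool such as Postman can then send a POST request with a JSON body like `{"total_sqft": 1000, "bath": 2}` to `/predict`.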


Conclusion

Our aim is achieved: we have met all the parameters set out in the Aim section. Circle rate was found to be the most effective attribute in predicting the house price, and Linear Regression was the most effective model for our dataset.


GitHub Repository Link:

https://github.com/anujyadav73/Realstate_Price_Prediction.git

THANK YOU
****************
