House-Price-Prediction-Using-Regression-Techniques Retouch - Removed
TABLE OF CONTENTS
3. DATASET
Steps in Preparing Data for Model
5. Python
Jupyter Notebook
NumPy
Pandas
Seaborn
Matplotlib
6. Models Used
Multiple Linear Regression
7. RESULTS AND DISCUSSIONS
8. Deployment App
9. Conclusion
10. Repository link
Abstract
transactions in India to discover useful models for house buyers and sellers.
Revealed is the high discrepancy between house prices in the most expensive
demonstrate that the Multiple Linear Regression that is based on mean squared
INTRODUCTION
Aim
These are the parameters on which we will evaluate ourselves:
• Identify the important home price attributes which feed the model’s
predictive power.
Having lived in India for so many years, the one thing I had been taking
for granted is that housing and rental prices continue to rise. Since the housing
crisis of 2008, housing prices have recovered remarkably well, especially in major
housing markets. However, in the 4th quarter of 2016, I was surprised to read that
Bombay housing prices had fallen the most in the last 4 years. In fact, median
resale prices for condos and coops fell 6.3%, marking the first time there was a
decline since Q1 of 2017. The decline has been partly attributed to political
uncertainty domestically and abroad and the 2014 election. This model is meant to
maintain transparency among customers and to make price comparison easy.
If a customer finds the price of a house on a given website higher than
DATASET
We scraped the data from 99acres.com, which is one of
the leading real estate websites operating in India.
Data Exploration
Data exploration is the first step in data analysis and typically involves summarizing the main
characteristics of a data set, including its size, accuracy, initial patterns in the data and other
attributes. It is commonly conducted by data analysts using visual analytics tools, but it can
also be done in more advanced statistical software, such as Python. Before it can conduct analysis
on data collected by multiple data sources and stored in data warehouses, an organization
must know how many cases are in a data set, what variables are included, how many
missing values there are and what general hypotheses the data is likely to support. An initial
exploration of the data set can help answer these questions by familiarizing analysts with
the data with which they are working.
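The questions above (how many cases, which variables, how many missing values) can be answered with a few pandas calls. The following is a minimal sketch on a small made-up table; the column names and values are assumptions for illustration and are not taken from the scraped dataset.

```python
import numpy as np
import pandas as pd

# Hypothetical listings: column names and values are illustrative only.
df = pd.DataFrame({
    "location": ["Andheri", "Bandra", "Powai", "Thane"],
    "area_sqft": [650.0, 900.0, np.nan, 1200.0],
    "bathrooms": [1, 2, 2, np.nan],
    "price_lakhs": [95.0, 210.0, 150.0, 130.0],
})

# How many cases and variables are in the data set?
n_rows, n_cols = df.shape

# Which variables are included, and what are their types?
column_types = df.dtypes

# How many missing values are there in each column?
missing_per_column = df.isna().sum()

# Summary statistics for the numeric columns.
summary = df.describe()

print(n_rows, n_cols)
print(missing_per_column)
```

Running the same calls on the real scraped table would give a first picture of its size and completeness before any modelling.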
We split the data 9:1 into training and testing sets, respectively.
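A 9:1 split of this kind can be sketched with scikit-learn's `train_test_split`. The arrays below are randomly generated placeholders, and the `random_state` value is an assumption added only to make the sketch reproducible; the report does not say how its split was seeded.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Placeholder features and target: random values standing in for the real data.
rng = np.random.default_rng(0)
X = rng.uniform(500, 2000, size=(100, 2))
y = rng.uniform(50, 300, size=100)

# test_size=0.1 keeps 90% of the rows for training and 10% for testing.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.1, random_state=42
)

print(len(X_train), len(X_test))
```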
Data Visualization
By using visual elements like charts, graphs, and maps, data visualization
tools provide an accessible way to see and understand trends, outliers, and
patterns in data. In the world of Big Data, data visualization tools and
Data Selection
Selecting suitable data for a research project can impact data integrity.
Data Transformation
The log transformation can be used to make highly skewed distributions less
skewed. This can be valuable both for making patterns in the data more
It is hard to discern a pattern in the upper panel whereas the strong relationship is
shown clearly in the lower panel. The comparison of the means of log-transformed
geometric mean.
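The effect of the log transformation can be illustrated with a short sketch. The prices below are invented, deliberately right-skewed values, not figures from the dataset. The sketch shows that the skewness drops after taking logs, and that back-transforming the mean of the logged values yields the geometric mean of the original values.

```python
import numpy as np
import pandas as pd

# Hypothetical right-skewed prices (illustrative values, not from the dataset).
prices = pd.Series([25.0, 30.0, 32.0, 40.0, 45.0, 60.0, 90.0, 400.0])

# Natural-log transformation of the prices.
log_prices = np.log(prices)

# Sample skewness before and after the transformation.
skew_before = prices.skew()
skew_after = log_prices.skew()

# Exponentiating the mean of the logs gives the geometric mean.
geometric_mean = float(np.exp(log_prices.mean()))

# The same quantity computed directly as the n-th root of the product.
direct_geometric_mean = float(prices.prod() ** (1.0 / len(prices)))

print(skew_before, skew_after)
print(geometric_mean, direct_geometric_mean)
```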
[Figures: Price in sq.ft; Bathrooms]
Python
• Pandas
• NumPy
• Matplotlib
• Seaborn
• scikit-learn
• XGBoost
MODELS USED
Regression Model
• It is mostly used for finding out the relationship between variables and
forecasting.
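As a rough illustration of multiple linear regression, the sketch below fits scikit-learn's `LinearRegression` to synthetic data and scores it with mean squared error. The feature names (`area_sqft`, `bathrooms`), the coefficients used to generate the target, and the noise level are all assumptions made for the example; they do not describe the report's actual model or data.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

# Synthetic data: price depends linearly on two hypothetical features plus noise.
rng = np.random.default_rng(0)
n = 200
area_sqft = rng.uniform(500, 2000, size=n)
bathrooms = rng.integers(1, 4, size=n)  # whole numbers 1, 2 or 3
noise = rng.normal(0.0, 1.0, size=n)
price = 0.08 * area_sqft + 12.0 * bathrooms + 5.0 + noise

X = np.column_stack([area_sqft, bathrooms])
X_train, X_test, y_train, y_test = train_test_split(
    X, price, test_size=0.1, random_state=42
)

# Fit the model on the training set only.
model = LinearRegression()
model.fit(X_train, y_train)

# Evaluate on the held-out test set with mean squared error.
predictions = model.predict(X_test)
mse = mean_squared_error(y_test, predictions)

print(model.coef_, model.intercept_, mse)
```

The fitted coefficients indicate how much the predicted price changes per unit of each feature, which is how a regression model exposes the relationship between variables.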
RESULTS
Best Suited Model
Our study showed that Linear Regression displayed the best performance
for this dataset and can be used for deployment.
Deployment App
The model is deployed through a Python web server using Flask in combination
with HTML and CSS, with Postman used for the API.
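A deployment of this kind might look roughly like the Flask sketch below. The `/predict` route, the JSON field names, and the coefficients are hypothetical: a stand-in function replaces the trained model so that the example runs on its own, and none of it reflects the project's actual code.

```python
from flask import Flask, jsonify, request

app = Flask(__name__)

# Hypothetical stand-in for the trained regression model: made-up
# coefficients so the sketch runs without a saved model file.
COEFFICIENTS = {"area_sqft": 0.05, "bathrooms": 2.0}
INTERCEPT = 10.0


def predict_price(features):
    """Return a linear prediction from a dict of feature values."""
    return INTERCEPT + sum(
        COEFFICIENTS[name] * float(features[name]) for name in COEFFICIENTS
    )


@app.route("/predict", methods=["POST"])
def predict():
    # Expect a JSON body such as {"area_sqft": 1000, "bathrooms": 2}.
    payload = request.get_json(silent=True) or {}
    missing = [name for name in COEFFICIENTS if name not in payload]
    if missing:
        return jsonify({"error": f"missing fields: {missing}"}), 400
    return jsonify({"predicted_price": predict_price(payload)})
```

Calling `app.run()` starts Flask's development server, after which a POST request with a JSON body can be sent to `/predict` from Postman or from an HTML form handler.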
Conclusion
Our aim is achieved, as we have met all the parameters listed in the Aim section.
Circle rate is the most effective attribute in predicting the house price, and
Linear Regression is the most effective model for our dataset.
THANK YOU