Python Project Major Project
A PROJECT SYNOPSIS
on
House Price Prediction Using Machine Learning
Submitted By
1. Shreyas Saundade (64)
2. Shivani Singh (67)
3. Aditya Wadnerkar (71)
(Data Science)
Mission
“To educate students to become quality techno-crafts for taking up challenges in all
facets of life.”
CERTIFICATE
This is to certify that the requirements for the synopsis entitled “House Price Prediction Using Machine
Learning” have been successfully completed by the following students:
64 Shreyas Saundade
67 Shivani Singh
71 Aditya Wadnerkar
ABSTRACT
The relationship between house prices and the economy is an important motivation for
predicting house prices. A property’s value is central to real estate transactions. Housing price
trends concern not only buyers and sellers; they also indicate the current economic
situation. It is therefore important to predict housing prices without bias, to help both buyers
and sellers make their decisions. In this project, we create a website where the user enters
property details to predict the house price, a date up to which the price should be forecast,
and a budget range used to recommend the best location. The project uses two datasets: one
contains features for a large number of housing sales in Mumbai, and the other contains the house
price index of Mumbai. We use different feature selection methods and a feature extraction
method with Multiple Linear Regression to predict the current house price, an ARIMA
model to forecast the price a few years ahead in Mumbai, and a content-based
recommendation system to recommend the best location within the user's budget near their area of
interest.
Keywords: House price prediction using machine learning algorithm, Recommendation of house
INDEX
List of Figures
1. Introduction
3. Methodology
5. Conclusion
7. Reference
1. To apply statistical data analysis and other data science techniques to effectively solve real-world problems.
2. To motivate & prepare students for lifelong learning and research to manifest global competitiveness.
3. To equip students with communication, teamwork and leadership skills to accept challenges in all facets of life
ethically.
1. Apply the knowledge of mathematics, science and engineering fundamentals to solve complex data science
problems.
2. Identify, formulate and analyze data analysis problems and derive conclusions using first principles of
mathematics, engineering science and computer science.
3. Investigate complex data science problems to find appropriate solutions leading to valid conclusions.
4. Design a data science model or process to meet specified needs with appropriate attention to health and safety
standards and to environmental and societal considerations.
5. Create, select and apply appropriate techniques, resources and advanced engineering software and tools to
analyze and design for data science problems.
6. Understand the impact of data science solutions on society and the environment for sustainable development.
7. Understand societal, health, safety, cultural and legal issues and the responsibilities relevant to the engineering
profession.
8. Apply professional ethics, accountability and equity in the engineering profession.
9. Work effectively as a member and leader in a multidisciplinary team toward a common goal.
10. Communicate effectively within the profession and with society at large.
11. Appropriately incorporate principles of management and finance in one’s own work.
12. Identify educational needs and engage in lifelong learning in a changing world of technology.
1. Identify, understand, formulate and analyse complex engineering problems in the field of data analysis, big
data, database management, predictive analysis, trend identification and identifying business insights.
2. Acquire, store, retrieve, process and finally convert data into knowledge in the fields of artificial intelligence,
data mining, network management and security, and Internet of Things applications
through efficient use of secure, reliable and cost-effective state-of-the-art analysis tools.
Lab Objectives:
1. To become acquainted with the process of identifying needs and converting them into a problem.
3. To become acquainted with the process of applying basic engineering fundamentals to attempt solutions to the problems.
Lab Outcomes:
4. Draw proper inferences from the available results through theoretical, experimental or simulation work.
5. Analyze the impact of solutions in societal and environmental context for sustainable development.
INTRODUCTION
Investment is a business activity in which most people are interested in this era of globalization.
Several assets are commonly used for investment, for example gold, stocks and property. Property
investment in particular has increased significantly. Housing price trends concern not only buyers
and sellers; they also indicate the current economic situation. Many factors affect house prices,
such as location, BHK (bedrooms, hall and kitchen), floor, etc. A location with good access to highways,
expressways, schools, shopping malls and local employment opportunities also contributes to a rise in house
prices. Predicting house prices manually is difficult, so many systems have been developed for house
price prediction. The aim of this system is to create a website through which users can enter their house
requirements, which are then passed to a linear regression model that predicts the house
price. The website also allows the user to forecast the predicted house price up to a particular date
specified by the user. This is done using another model, ARIMA (Auto Regressive
Integrated Moving Average).
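To illustrate the forecasting step, the sketch below implements a simplified ARIMA(1,1,0) in plain Python: it differences the series once, fits an AR(1) model to the differences by least squares, and integrates the forecasts back to index levels. The model order, the function names and the sample index values are assumptions made for illustration only; the order actually used in the project is not stated.

```python
from typing import List, Tuple


def difference(series: List[float]) -> List[float]:
    """First-order differencing: the 'integrated' (d = 1) part of ARIMA."""
    return [b - a for a, b in zip(series, series[1:])]


def fit_ar1(diffs: List[float]) -> Tuple[float, float]:
    """Fit d_t = c + phi * d_(t-1) by ordinary least squares; return (c, phi)."""
    x, y = diffs[:-1], diffs[1:]
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    var_x = sum((xi - mean_x) ** 2 for xi in x)
    if var_x == 0:
        # Constant differences: no autoregressive signal, only a drift term.
        return mean_y, 0.0
    cov = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    phi = cov / var_x
    return mean_y - phi * mean_x, phi


def forecast(series: List[float], steps: int) -> List[float]:
    """Forecast `steps` future values of the original (undifferenced) series."""
    if len(series) < 3:
        raise ValueError("need at least 3 observations")
    diffs = difference(series)
    c, phi = fit_ar1(diffs)
    level, last_diff = series[-1], diffs[-1]
    predictions = []
    for _ in range(steps):
        last_diff = c + phi * last_diff   # AR(1) step on the differences
        level += last_diff                # undo the differencing
        predictions.append(level)
    return predictions


if __name__ == "__main__":
    # Made-up yearly house price index values, for illustration only.
    hpi = [100.0, 103.0, 105.5, 109.0, 112.0, 116.5, 120.0]
    print(forecast(hpi, steps=3))
```

In practice a library implementation, such as the ARIMA class in statsmodels, would normally replace the hand-written fitting code.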
During the last few decades, with the rise of YouTube, Amazon, Netflix and many other such web
services, recommender systems have taken an ever larger place in our lives. From e-commerce (suggesting
to buyers articles that could interest them) to online advertising (suggesting to users the right content,
matching their preferences), recommender systems are today unavoidable in our daily online journeys.
In a very general way, recommender systems are algorithms aimed at suggesting relevant items to users
(items being movies to watch, text to read, products to buy or anything else, depending on the industry).
This website also provides an option for recommendations, using a content-based
recommendation system. In this project, we use two datasets extracted from
Makaan.com by web scraping. One dataset consists of features such as
location, BHK, floor, furnished status, etc. for different locations in Mumbai. This dataset is used for prediction.
The other dataset consists of the house price index of Mumbai for the last 10 years. This dataset is used
for forecasting.
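A content-based recommendation of this kind can be sketched as follows: listings are first filtered by the user's budget range, then ranked by cosine similarity between each listing's feature vector and the user's stated preferences. The field names, the scaling constants and the sample listings are hypothetical and chosen only for illustration; they are not taken from the project's dataset.

```python
import math
from typing import Dict, List, Tuple

Listing = Dict[str, object]


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine of the angle between two feature vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def to_vector(item: Listing) -> List[float]:
    """Turn a listing (or a preference) into a roughly unit-scaled vector."""
    # The divisors are assumed scales so that no single feature dominates.
    return [
        item["bhk"] / 5.0,
        item["area_sqft"] / 2000.0,
        1.0 if item["furnished"] else 0.0,
    ]


def recommend(listings: List[Listing], preference: Listing,
              budget: Tuple[float, float], top_n: int = 3) -> List[str]:
    """Return up to top_n locations within budget, most similar first."""
    low, high = budget
    in_budget = [l for l in listings if low <= l["price_lakh"] <= high]
    target = to_vector(preference)
    ranked = sorted(
        in_budget,
        key=lambda l: cosine_similarity(to_vector(l), target),
        reverse=True,
    )
    return [l["location"] for l in ranked[:top_n]]


if __name__ == "__main__":
    # Made-up listings; prices are in lakh rupees.
    sample = [
        {"location": "Andheri", "bhk": 2, "area_sqft": 800, "furnished": True, "price_lakh": 150},
        {"location": "Borivali", "bhk": 1, "area_sqft": 450, "furnished": False, "price_lakh": 70},
        {"location": "Thane", "bhk": 2, "area_sqft": 850, "furnished": True, "price_lakh": 95},
    ]
    wanted = {"bhk": 2, "area_sqft": 850, "furnished": True}
    print(recommend(sample, wanted, budget=(60, 160)))
```

Filtering on budget before ranking keeps unaffordable listings out of the result regardless of how similar they are.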
Machine learning has played a major role in recent years in image detection, spam recognition,
speech commands, product recommendation and medical diagnosis. Present-day machine learning algorithms
help enhance security alerts, ensure public safety and support medical advancements. Machine
learning systems also provide better customer service and safer automobile systems. In the present
PROPOSED SYSTEM
1. Data Gathering
2. Dataset Analysis
3. Training the regression model
4. Testing
Dataset Preparation
The dataset was imported from the scikit-learn library in Python. It includes 4 pre-labelled variables
in total: 3 independent variables (i.e. area, room count and building age) and 1 dependent variable (i.e. price), as
shown in Figure 1.0 below.
Fig 1.0: Dataset
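A dataset with this shape can be read into a feature matrix and a target list as sketched below. The column names and the three sample rows are hypothetical placeholders, not values from the project's dataset, and the data is read from an in-memory string only to keep the example self-contained.

```python
import csv
import io
from typing import List, Tuple

# Hypothetical sample in the same shape: three independent variables and a price.
SAMPLE_CSV = """area,room_count,building_age,price
650,2,5,9500000
900,3,12,14000000
420,1,20,6000000
"""


def load_dataset(text: str) -> Tuple[List[List[float]], List[float]]:
    """Parse CSV text into (features, prices), one row of features per house."""
    features, prices = [], []
    for row in csv.DictReader(io.StringIO(text)):
        features.append([
            float(row["area"]),
            float(row["room_count"]),
            float(row["building_age"]),
        ])
        prices.append(float(row["price"]))
    return features, prices


if __name__ == "__main__":
    X, y = load_dataset(SAMPLE_CSV)
    print(len(X), "rows,", len(X[0]), "independent variables per row")
```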
Linear Regression
In this project, we have used the Linear Regression algorithm to predict the current house price.
• The Linear Regression algorithm takes two kinds of variables: independent variables (X) and a dependent
variable (Y).
• The dataset containing different locations with their features and prices is used to train the Linear
Regression model.
• The dataset entries are divided into two parts: 80% for training and 20% for testing.
• The Linear Regression model is trained using the X_train independent variable entries and the Y_train dependent
variable entries.
• The trained model is tested on the 20% test split. After training and testing, the model
is used for prediction.
• Formula: $Y_i = \beta_0 + \beta_1 X_{i1} + \beta_2 X_{i2} + \dots + \beta_p X_{ip} + \epsilon_i$
$Y_i$ = dependent variable
$X_{i1}, \dots, X_{ip}$ = independent variables
$\beta_0$ = intercept, $\beta_1, \dots, \beta_p$ = regression coefficients, $\epsilon_i$ = error term
Independent variables (X)    Dependent variable (Y)
AREA (int)                   PRICE (rupees)
ROOM COUNT (int)
BUILDING AGE (int)
Table 1.0
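The training and testing procedure described above can be sketched in plain Python as follows: an 80/20 split, a least-squares fit of the coefficients through the normal equations, and a score on the held-out 20%. The function names, the synthetic data and the choice of R² as the score are assumptions made for illustration; this is not the project's actual code.

```python
import random
from typing import List, Tuple

Matrix = List[List[float]]


def train_test_split(X: Matrix, y: List[float], test_ratio: float = 0.2,
                     seed: int = 0) -> Tuple[Matrix, List[float], Matrix, List[float]]:
    """Shuffle row indices, then hold out test_ratio of the rows for testing."""
    indices = list(range(len(X)))
    random.Random(seed).shuffle(indices)
    n_test = max(1, round(len(X) * test_ratio))
    test_idx, train_idx = indices[:n_test], indices[n_test:]
    return ([X[i] for i in train_idx], [y[i] for i in train_idx],
            [X[i] for i in test_idx], [y[i] for i in test_idx])


def solve(A: Matrix, b: List[float]) -> List[float]:
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [bi] for row, bi in zip(A, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("singular system: features are collinear")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def fit(X: Matrix, y: List[float]) -> List[float]:
    """Return [beta_0, beta_1, ..., beta_p] solving (X^T X) beta = X^T y."""
    rows = [[1.0] + list(r) for r in X]  # leading 1.0 is the intercept column
    p = len(rows[0])
    XtX = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    Xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(p)]
    return solve(XtX, Xty)


def predict(beta: List[float], X: Matrix) -> List[float]:
    """Apply Y = beta_0 + beta_1 * X_1 + ... + beta_p * X_p to each row."""
    return [beta[0] + sum(b * v for b, v in zip(beta[1:], row)) for row in X]


def r2_score(y_true: List[float], y_pred: List[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


if __name__ == "__main__":
    # Synthetic houses: [area, room count, building age]; price in lakh with noise.
    rng = random.Random(42)
    X = [[rng.uniform(400, 1500), rng.randint(1, 4), rng.randint(0, 30)] for _ in range(50)]
    y = [50 + 0.1 * a + 20 * r - 1.5 * g + rng.gauss(0, 5) for a, r, g in X]
    X_train, y_train, X_test, y_test = train_test_split(X, y)
    beta = fit(X_train, y_train)
    print("coefficients:", beta)
    print("R^2 on test split:", r2_score(y_test, predict(beta, X_test)))
```

With scikit-learn, `train_test_split` and `LinearRegression` cover the same steps in a few lines; the explicit version is shown only to make the computation visible.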
SYSTEM SPECIFICATION
Algorithm
STEP 1: Start
STEP 8: Stop
Result
Fig 0.2: Output 1
Fig 0.3: Output 2
Fig 0.4: Output 3
CONCLUSION
House price predictions are expected to help people who plan to buy a house know the future price
range, so that they can plan their finances well. House price predictions are also
beneficial for property investors who want to know the trend of housing prices in a certain location.
Machine learning is broadly defined as the capability of a machine to imitate intelligent human
behavior.
We use machine learning model predictions because they allow businesses to make accurate estimates
of likely outcomes based on historical data.
The system makes use of a data mining algorithm, i.e. Linear Regression.
The Linear Regression algorithm is used to predict the house price according to the property requirements
given by the customer, with an accuracy of 86.7%.
The main objective of this prediction system is to reduce manual calculation and time, and to
carry out the whole process with ease.
References
[1] Real Estate Price Prediction with Regression and Classification, CS 229, Autumn 2016.
[2] Gongzhu Hu, Jinping Wang and Wenying Feng. Multivariate Regression Modelling for Home Value Estimates
with Evaluation using Maximum Information Coefficient.
[3] Byeonghwa Park, Jae Kwon Bae (2015). Using machine learning algorithms for housing price prediction.
Volume 42, Pages 2928-2934.
[6] https://towardsdatascience.com/how-to-build-from-scratch-a-content-based-movie-recommender-with-natural-language-processing-25ad400eb243
[7] https://www.makaan.com/
[8] https://tradingeconomics.com/