Ay-Sem8-Internship Report

This internship report by Amit Yadav details a project on House Price Prediction using Machine Learning, conducted at Grras Solutions Pvt. Ltd. The project involved data collection, preprocessing, and the implementation of several regression models to predict house prices from features such as location and size. In the reported results, Linear Regression achieved the highest R² score (0.6529), ahead of Random Forest, XGBoost, and Decision Tree.


HOUSE PRICE PREDICTION MODEL

AN INTERNSHIP REPORT

Submitted by
AMIT YADAV

210180107001

In partial fulfillment for the award of the degree of

BACHELOR OF ENGINEERING

Department of Computer Engineering

Government Engineering College, Dahod

Gujarat Technological University, Ahmedabad

April, 2025

Government Engineering College, Dahod
V64R+7QP, Jhalod Road, Dahod, Usarvan Part, Gujarat 389151

CERTIFICATE
This is to certify that the project report submitted along with the project
entitled House Price Prediction has been carried out by Amit Yadav under
my guidance in partial fulfillment for the degree of Bachelor of Engineering
in Computer Engineering, 8th Semester of Gujarat Technological University,
Ahmedabad during the academic year 2024-25.

Prof. Viren Patel                Prof. Viren Patel
Internal Guide                   Head of the Department

Company Certificate


DECLARATION

I hereby declare that the internship report entitled House Price Prediction,
submitted in partial fulfillment for the degree of Bachelor of Engineering in
Computer Engineering to Gujarat Technological University, Ahmedabad, is a
bona fide record of original project work carried out by me at the Department
of Computer Engineering, Government Engineering College, Dahod, under the
supervision of Prof. Viren Patel, and that no part of this report has been
directly copied from any student's report or taken from any other source
without due reference.

Name of the Student Sign of Student

Amit Yadav

Acknowledgement

I would like to express my heartfelt gratitude to Grras Solutions Private Limited,
Ahmedabad, for providing me with the opportunity to undertake my internship in
Machine Learning with Python.

I sincerely thank my HOD and Internal Guide, Prof. Viren Patel, for his valuable
guidance and continuous support throughout this internship.

I extend my gratitude to my Industrial Supervisor, Mr. Rajesh Shah, for his expert
insights and technical guidance, which were instrumental in the successful completion of
my project.

I also thank Ms. Suman Vairagi, my Reporting Manager, for providing me with the
necessary resources, mentorship, and a conducive learning environment during my
internship.

A special thanks to Government Engineering College, Dahod, for facilitating this
internship program and Gujarat Technological University (GTU) for providing the
framework for industry-academia collaboration.

Lastly, I am grateful to my family and friends for their unwavering support and
encouragement throughout my academic journey.

Abstract

Machine learning has revolutionized data analysis and predictive modeling, enabling
the development of highly accurate forecasting systems. This report presents an
internship project on House Price Prediction using Machine Learning, conducted at
Grras Solutions Private Limited, Ahmedabad as part of a 12-week internship
program.

The objective of this project was to design and implement a machine learning model
capable of predicting house prices based on various features such as location, size,
number of rooms, and other factors. The project involved:
• Data Collection & Preprocessing: Cleaning, handling missing values, and feature
selection.
• Exploratory Data Analysis (EDA): Understanding patterns and relationships in
housing data.
• Model Selection & Training: Using regression algorithms (Linear Regression,
Decision Tree, Random Forest, XGBoost).
• Evaluation & Optimization: Hyperparameter tuning and performance analysis.

The report details the step-by-step implementation of the project, the challenges
faced, and the key insights gained. On this dataset, Linear Regression achieved the
highest R² score (0.6529), ahead of the ensemble models Random Forest (0.6119) and
XGBoost (0.5893), which suggests that the ensemble models need further tuning before
they become the preferable choice for this task.

This internship provided hands-on experience in machine learning, data science
methodologies, and model deployment, equipping the student with essential skills for
real-world applications.

List of Figures

Fig 1.1.1 ML in Real Estate
Fig 1.4.1 Technologies Used
Fig 2.1.1 Company Logo
Fig 3.1.1 ML in Real Estate
Fig 3.2.1 Dataset Overview
Fig 3.3.1 Flowchart of the project
Fig 4.2.1 Loading of dataset
Fig 4.3.1 Handling of missing data
Fig 4.4.1 Histogram and Box Plot
Fig 4.4.2 Correlation Matrix
Fig 5.1 Splitting of Dataset into training and test data
Fig 5.1.1 Linear Regression
Fig 5.2.1 Decision Tree Regression
Fig 5.3.1 Random Forest Regression
Fig 5.4.1 XGBoost Regression
Fig 6.1.1 Comparison of Regression Model Performance Metrics
Fig 6.2.1 Saving and Loading the Model
Fig 6.3.1 Creation of Flask API
Fig 6.4.1 User View
Fig 7.3.1.1 Scatter Plot for various regression models
Fig 7.3.2.1 Feature Importance Plot

Abbreviations

AI Artificial Intelligence
ML Machine Learning
CSV Comma-Separated Values
RMSE Root Mean Square Error
MSE Mean Squared Error
MAE Mean Absolute Error
R² Coefficient of Determination
ANN Artificial Neural Network
SVM Support Vector Machine
KNN K-Nearest Neighbors
API Application Programming Interface
GUI Graphical User Interface
CPU Central Processing Unit
GPU Graphics Processing Unit

Table of Contents

Acknowledgement
Abstract
List of Figures
List of Abbreviations
Table of Contents

Chapter 1: Introduction
1.1 Introduction
1.2 Objectives of the Project
1.3 Scope of the Project
1.4 Technologies Used

Chapter 2: Company Profile
2.1 About Company
2.2 Mission and Vision
2.3 Internship Program
2.4 Company Culture

Chapter 3: Project Overview
3.1 Problem Statement
3.2 Dataset Description
3.3 Methodology

Chapter 4: Implementation
4.1 Introduction
4.2 Data Collection
4.3 Data Preprocessing
4.4 Exploratory Data Analysis

Chapter 5: Model Selection and Training
5.1 Linear Regression
5.2 Decision Tree Regressor
5.3 Random Forest Regressor
5.4 XGBoost Regressor

Chapter 6: Model Evaluation and Deployment
6.1 Model Evaluation
6.2 Model Deployment
6.3 Deployment Using Flask
6.4 Frontend Development

Chapter 7: Results and Discussion
7.1 Introduction
7.2 Model Performance Analysis
7.3 Graphical Representation of Model Performance
7.3.1 Actual Price Vs Predicted Price
7.3.2 Feature Importance Plot
7.3.3 Real World Implications

Chapter 8: Conclusion and Future Work
8.1 Conclusion
8.2 Challenges Faced
8.3 Future Scope and Improvements

References
Appendix

CHAPTER 1: INTRODUCTION

1.1 Introduction

The real estate market has always been dynamic and influenced by multiple factors such as
location, size, amenities, economic trends, and market demand. Accurate house price
prediction is crucial for buyers, sellers, and investors to make informed decisions. With
advancements in Machine Learning (ML), predictive models can analyze large datasets
and uncover hidden patterns, leading to better price estimations.
This project, House Price Prediction using Machine Learning, was developed as part of the
internship program at Grras Solutions Pvt. Ltd., Ahmedabad. The project explores various
supervised learning algorithms to build a robust price prediction model based on historical
housing data.

1.2 Objectives of the Project

The primary objectives of this project are:

• To analyze and preprocess real estate data for better model performance.
• To implement and compare different regression models for price prediction.
• To evaluate the accuracy and efficiency of each model using performance metrics.
• To provide insights and recommendations for real estate stakeholders.

1.3 Scope of the Project

• The model considers historical sales data, house features, and market trends.
• It applies Machine Learning techniques to predict prices.
• The project is developed using Python, Scikit-Learn, Pandas, and Matplotlib.
• The final model will be evaluated based on Mean Squared Error (MSE) and R²
Score.

1.4 Technologies Used
• Programming Language: Python
• Libraries & Frameworks: Pandas, NumPy, Matplotlib, Seaborn, Scikit-Learn
• Machine Learning Algorithms: Linear Regression, Decision Tree, Random Forest,
XGBoost
• Tools: Jupyter Notebook, Google Colab

CHAPTER 2: COMPANY PROFILE

2.1 ABOUT COMPANY

Fig. 2.1.1 Company Logo

Grras Solutions Pvt. Ltd. is a leading IT training and development company headquartered
in Ahmedabad, India. The company specializes in:

• Machine Learning & Data Science
• Cloud Computing & DevOps
• Cybersecurity & Ethical Hacking
• Full-Stack Web Development

Grras Solutions provides industry-relevant training and collaborates with students and
professionals to bridge the gap between academic knowledge and corporate requirements.

2.2 MISSION AND VISION

Mission: To equip students with the latest technical skills through hands-on training and
industry-based projects.

Vision: To be recognized as a center of excellence in IT education and skill development.

2.3 INTERNSHIP PROGRAM AT GRRAS SOLUTIONS

The company offers structured internship programs for engineering and IT students in
collaboration with colleges and universities. Key features of the internship include:

• Live project work on real-world datasets.
• Mentorship from industry experts.
• Practical exposure to emerging technologies.

2.4 COMPANY CULTURE

Grras Solutions describes its culture as one in which individual satisfaction and
peace of mind are paramount. The company aims to provide a family-like environment
for every team member, rooted in Indian culture, with the goal of building a positive
core team and maintaining quality across its products and solutions.

CHAPTER 3: PROJECT OVERVIEW

3.1 PROBLEM STATEMENT:

House pricing is influenced by several factors, and manual estimation is often inaccurate.
Traditional methods fail to capture hidden patterns in real estate data, leading to mispriced
properties.

This project aims to develop a Machine Learning model to predict house prices based on
key attributes such as location, size, number of rooms, and amenities.

3.2 DATASET DESCRIPTION:

The dataset used in this project consists of thousands of housing records with features
including:
• Location: City, neighborhood, and ZIP code
• Size: Square footage of the house
• Rooms: Number of bedrooms and bathrooms
• Market Data: Price trends over time

3.3 METHODOLOGY:

The project follows a structured approach:

1. Data Collection: Gathering historical real estate data.
2. Data Preprocessing: Handling missing values, feature selection, and scaling.
3. Exploratory Data Analysis (EDA): Understanding correlations and trends.
4. Model Training: Implementing regression models.
5. Model Evaluation: Comparing models using performance metrics.

CHAPTER 4: IMPLEMENTATION

4.1 INTRODUCTION

This chapter covers the step-by-step implementation of the House Price Prediction
System using Machine Learning.
The implementation includes data collection, preprocessing, model selection, training,
evaluation, and deployment.

4.2 DATA COLLECTION

The dataset used in this project was obtained from Kaggle and contains features
such as:

• Lot Size
• Number of Bedrooms & Bathrooms
• Square Footage
• Location & Zip Code
• Year Built
• House Condition & Grade

The dataset is stored in CSV format and is loaded using Pandas in Python.

Code Snippet: Loading the Dataset
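
The original snippet was a figure that is not available in this text. A minimal sketch of the loading step could look like the following. The file name `Housing.csv`, the column names, and the sample rows are assumptions made for illustration; an in-memory sample is used so the sketch runs without the Kaggle file.

```python
import io

import pandas as pd

# Illustrative stand-in for the Kaggle CSV; columns and values are assumptions.
SAMPLE_CSV = """price,area,bedrooms,bathrooms
9000000,6000,4,2
7500000,5200,3,2
6000000,4000,3,1
4500000,3000,2,1
"""


def load_dataset(source):
    """Read a CSV (file path or file-like object) into a DataFrame."""
    return pd.read_csv(source)


# In the project this would be a path such as "Housing.csv" (hypothetical name).
df = load_dataset(io.StringIO(SAMPLE_CSV))

print(df.head())   # first rows of the table
print(df.shape)    # (number of rows, number of columns)
df.info()          # column dtypes and non-null counts
```

Inspecting `head()`, `shape`, and `info()` right after loading is a quick way to confirm that the file was parsed with the expected columns and types.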

4.3 DATA PREPROCESSING

Data preprocessing ensures the dataset is clean and ready for model training. The steps
include:

1. Handling Missing Values
2. Feature Engineering
3. Encoding Categorical Variables
4. Scaling & Normalization

Code Snippet: Handling Missing Data
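
The original snippet is not available in this text. One common approach, sketched here under assumptions, fills numeric gaps with the column median and categorical gaps with the most frequent value. The column names and values are hypothetical, and the report does not state which imputation strategy was actually used.

```python
import numpy as np
import pandas as pd

# Toy frame with gaps; columns and values are illustrative assumptions.
df = pd.DataFrame({
    "area": [1200.0, np.nan, 1500.0, 1800.0],
    "bedrooms": [2.0, 3.0, np.nan, 4.0],
    "furnishing": ["furnished", None, "unfurnished", "furnished"],
})

print(df.isnull().sum())  # number of missing values per column

# Numeric columns: fill gaps with the column median (robust to outliers).
numeric_cols = df.select_dtypes(include="number").columns
df[numeric_cols] = df[numeric_cols].fillna(df[numeric_cols].median())

# Categorical columns: fill gaps with the most frequent value.
categorical_cols = df.select_dtypes(exclude="number").columns
for col in categorical_cols:
    df[col] = df[col].fillna(df[col].mode()[0])

print(df)
```

Dropping incomplete rows with `df.dropna()` is the simpler alternative, at the cost of losing data.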

4.4 EXPLORATORY DATA ANALYSIS (EDA)

EDA helps visualize trends and relationships between variables using graphs such
as histograms, scatter plots, and heatmaps.

Graphical Representation:

Code Snippet: (Histogram and Boxplot)
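
The original plotting code is not available in this text. A minimal sketch with Matplotlib is given here, using synthetic right-skewed prices as a stand-in for the real target column (an assumption), and writing to a hypothetical output file name.

```python
import matplotlib

matplotlib.use("Agg")  # render off-screen so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

# Synthetic right-skewed prices standing in for the real price column (assumption).
rng = np.random.default_rng(0)
prices = rng.lognormal(mean=15.0, sigma=0.4, size=500)

fig, (ax_hist, ax_box) = plt.subplots(1, 2, figsize=(10, 4))

# Histogram: shows the shape of the price distribution.
ax_hist.hist(prices, bins=30, edgecolor="black")
ax_hist.set_title("Distribution of house prices")
ax_hist.set_xlabel("Price")
ax_hist.set_ylabel("Count")

# Box plot: highlights the median, quartiles, and potential outliers.
ax_box.boxplot(prices)
ax_box.set_title("Box plot of house prices")
ax_box.set_ylabel("Price")

fig.tight_layout()
fig.savefig("price_distribution.png")  # hypothetical output file name
```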

Code Snippet: Correlation Heatmap
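
The original heatmap code is not available in this text. The sketch below computes the Pearson correlation matrix with pandas and draws it with Matplotlib. The data and column names are synthetic assumptions; Seaborn's `heatmap(corr, annot=True)` would be a shorter alternative in the same spirit.

```python
import matplotlib

matplotlib.use("Agg")  # render off-screen so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Synthetic data in which price depends mostly on area (assumption).
rng = np.random.default_rng(1)
area = rng.uniform(500, 4000, size=200)
df = pd.DataFrame({
    "area": area,
    "bedrooms": np.clip((area / 800).round(), 1, 5),
    "price": area * 3000 + rng.normal(0, 500_000, size=200),
})

corr = df.corr()  # pairwise Pearson correlation between numeric columns

fig, ax = plt.subplots(figsize=(5, 4))
image = ax.imshow(corr.to_numpy(), cmap="coolwarm", vmin=-1, vmax=1)
ax.set_xticks(range(len(corr.columns)))
ax.set_xticklabels(corr.columns, rotation=45, ha="right")
ax.set_yticks(range(len(corr.columns)))
ax.set_yticklabels(corr.columns)

# Write each coefficient into its cell.
for i in range(corr.shape[0]):
    for j in range(corr.shape[1]):
        ax.text(j, i, f"{corr.iloc[i, j]:.2f}", ha="center", va="center")

fig.colorbar(image, ax=ax, label="Pearson correlation")
fig.tight_layout()
fig.savefig("correlation_heatmap.png")  # hypothetical output file name
```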

CHAPTER 5: MODEL SELECTION AND TRAINING

Once the data preprocessing step was completed, several machine learning models
were considered for house price prediction:

➢ Linear Regression
➢ Decision Tree Regressor
➢ Random Forest Regressor
➢ XGBoost Regressor

Each model was trained and evaluated to compare their performance in predicting
house prices.

5.1 Linear Regression

Overview:
Linear Regression assumes a linear relationship between input features and the
target variable. It fits a line (a hyperplane when there are several features) to
the data by minimizing the sum of squared differences between actual and
predicted values.

Mathematical Representation:
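
The equation figure is not available in this text. The standard form of multiple linear regression with ordinary least squares is given here as a likely reconstruction; the symbols (p features, n training rows, coefficients β) are notation chosen for this sketch rather than taken from the report.

```latex
\hat{y} = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_p x_p,
\qquad
\min_{\beta_0, \dots, \beta_p} \; \sum_{i=1}^{n} \left( y_i - \hat{y}_i \right)^2
```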

Implementation:
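
The original implementation figure is not available in this text. A minimal sketch with scikit-learn follows, including the train/test split. The synthetic data, the 80/20 split ratio, and the variable names are assumptions, not the report's actual configuration.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Synthetic data standing in for the preprocessed housing features (assumption).
rng = np.random.default_rng(42)
X = rng.uniform(500, 4000, size=(200, 2))  # e.g. living area and lot size
y = 50_000 + 3_000 * X[:, 0] + 500 * X[:, 1] + rng.normal(0, 100_000, size=200)

# Hold out 20% of the rows for testing; the ratio is an assumption.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

model = LinearRegression()
model.fit(X_train, y_train)        # ordinary least squares fit
y_pred = model.predict(X_test)     # predictions for unseen rows

print("Intercept:", model.intercept_)
print("Coefficients:", model.coef_)
print("R² on test data:", model.score(X_test, y_test))
```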

5.2 Decision Tree Regressor:

Overview:
Decision Tree Regressor splits data into branches based on feature values. It
captures non-linear relationships but may overfit.

How it works:
The dataset is recursively split into branches, choosing at each step the feature
and threshold that minimize the mean squared error (MSE) of the resulting subsets.
At each node, a decision rule is applied to split the data.
The tree grows until a stopping condition is met (e.g., minimum samples per
leaf, maximum depth).

Mathematical Formulation:
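
The equation figure is not available in this text. The usual MSE splitting criterion for regression trees is given here as a likely reconstruction; S denotes the samples at a node, and S_L and S_R the left and right subsets produced by a candidate split (notation chosen for this sketch).

```latex
\mathrm{MSE}(S) = \frac{1}{|S|} \sum_{i \in S} \left( y_i - \bar{y}_S \right)^2,
\qquad
\text{split cost} = \frac{|S_L|}{|S|}\,\mathrm{MSE}(S_L) + \frac{|S_R|}{|S|}\,\mathrm{MSE}(S_R)
```

The split with the lowest cost is selected, and each leaf predicts the mean target value of its samples.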

Implementation:
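
The original implementation figure is not available in this text. A minimal sketch with scikit-learn's `DecisionTreeRegressor` follows. The synthetic step-shaped data and the values of `max_depth` and `min_samples_leaf` are assumptions chosen to illustrate the stopping conditions.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor

# Synthetic non-linear data: price jumps once area exceeds a threshold (assumption).
rng = np.random.default_rng(0)
X = rng.uniform(500, 4000, size=(300, 1))
y = np.where(X[:, 0] > 2000, 9_000_000.0, 4_000_000.0) + rng.normal(0, 200_000, 300)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# max_depth and min_samples_leaf act as stopping conditions that limit overfitting.
tree = DecisionTreeRegressor(max_depth=3, min_samples_leaf=5, random_state=0)
tree.fit(X_train, y_train)

print("Tree depth:", tree.get_depth())
print("R² on test data:", tree.score(X_test, y_test))
```

Leaving `max_depth` unset lets the tree grow until its leaves are pure, which is where the overfitting risk mentioned in the overview comes from.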

5.3 Random Forest Regressor:

Overview:
Random Forest is an ensemble learning method that combines multiple Decision
Trees to improve accuracy and reduce overfitting.

How it Works:
Constructs multiple decision trees using random subsets of the training data
(Bootstrap Aggregation or Bagging).
Each tree makes a prediction, and the final prediction is the average of all tree
outputs.

Mathematical Formulation:
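
The equation figure is not available in this text. The standard bagging average is given here as a likely reconstruction, where B is the number of trees and T_b is the tree trained on the b-th bootstrap sample (notation chosen for this sketch).

```latex
\hat{y}(x) = \frac{1}{B} \sum_{b=1}^{B} T_b(x)
```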

Implementation:
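
The original implementation figure is not available in this text. The sketch below trains a `RandomForestRegressor` on synthetic data (an assumption) and checks that the forest's prediction agrees with the average of its individual trees. The value `n_estimators=100` is illustrative, not the report's setting.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

# Synthetic features and a non-linear target (assumption).
rng = np.random.default_rng(0)
X = rng.uniform(0, 1, size=(300, 3))
y = 5_000_000 * X[:, 0] + 2_000_000 * np.sin(3 * X[:, 1]) + rng.normal(0, 100_000, 300)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Each tree is trained on a bootstrap sample of the training rows.
forest = RandomForestRegressor(n_estimators=100, random_state=0)
forest.fit(X_train, y_train)

# Average the individual trees by hand and compare with the forest's output.
per_tree = np.stack([t.predict(X_test) for t in forest.estimators_])
manual_average = per_tree.mean(axis=0)

print("Matches forest.predict:", np.allclose(manual_average, forest.predict(X_test)))
print("R² on test data:", forest.score(X_test, y_test))
```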

5.4 XGBoost Regressor (Extreme Gradient Boosting)

Overview:
XGBoost is a powerful gradient boosting algorithm optimized for speed and
performance. It builds an ensemble of weak learners (decision trees) in a sequential
manner, where each tree corrects the errors of the previous one.

How it works:
Uses boosting, meaning trees are added iteratively.
Each new tree corrects the residual errors of the previous trees.
Uses a regularized objective function to prevent overfitting.

Mathematical Formulation:
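
The equation figure is not available in this text. The regularized objective and additive update described in the XGBoost documentation are given here as a likely reconstruction: l is the loss, f_k the k-th tree, T its number of leaves, w its leaf weights, η the learning rate, and γ, λ the regularization parameters.

```latex
\mathcal{L} = \sum_{i=1}^{n} l\left( y_i, \hat{y}_i \right) + \sum_{k=1}^{K} \Omega(f_k),
\qquad
\Omega(f) = \gamma T + \tfrac{1}{2} \lambda \lVert w \rVert^2,
\qquad
\hat{y}_i^{(t)} = \hat{y}_i^{(t-1)} + \eta \, f_t(x_i)
```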

Implementation:

CHAPTER 6: MODEL EVALUATION AND DEPLOYMENT

6.1 MODEL EVALUATION

After training multiple models, we evaluate them based on Mean Absolute Error
(MAE), Mean Squared Error (MSE), Root Mean Squared Error (RMSE), and R²
Score to determine the best-performing model.

Performance Metrics Used:

Mean Absolute Error (MAE): Measures the average absolute difference between
actual and predicted values.

Mean Squared Error (MSE): Measures the average squared difference between
actual and predicted values.

Root Mean Squared Error (RMSE): The square root of MSE, providing a
measure of error in the same units as the target variable.

R² Score (R-squared): Indicates how well the model explains the variance in the
data. A value close to 1 means a better fit.
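
Written as formulas, the four metrics above take the following standard form, where y_i is the actual price, ŷ_i the predicted price, ȳ the mean of the actual prices, and n the number of test samples (notation chosen for this sketch).

```latex
\mathrm{MAE} = \frac{1}{n} \sum_{i=1}^{n} \left| y_i - \hat{y}_i \right|,
\qquad
\mathrm{MSE} = \frac{1}{n} \sum_{i=1}^{n} \left( y_i - \hat{y}_i \right)^2,
\qquad
\mathrm{RMSE} = \sqrt{\mathrm{MSE}},
\qquad
R^2 = 1 - \frac{\sum_{i=1}^{n} \left( y_i - \hat{y}_i \right)^2}{\sum_{i=1}^{n} \left( y_i - \bar{y} \right)^2}
```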

Observation:
Model               MAE            MSE                     RMSE           R² Score
Linear Regression   970,043.40     1,754,318,687,330.66    1,324,595.81   0.6529
Decision Tree       1,195,266.06   2,642,802,637,614.68    1,625,646.29   0.4771
Random Forest       1,021,546.04   1,961,585,044,320.34    1,400,565.80   0.6119
XGBoost             1,054,208.88   2,075,875,606,528.00    1,441,639.98   0.5893

Note:
RMSE (Root Mean Squared Error) is calculated as RMSE= sqrt(MSE).
Higher R² Score indicates a better fit.
Lower MAE, MSE, and RMSE indicate better performance in error reduction.
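
As a small illustration of how these metrics can be computed, the sketch below uses scikit-learn's metric functions on a toy set of actual and predicted values. The numbers are made up for the example and are unrelated to the results table.

```python
import numpy as np
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score

# Toy values in millions; purely illustrative.
y_true = np.array([3.0, 5.0, 7.0, 9.0])
y_pred = np.array([2.5, 5.0, 8.0, 9.5])

mae = mean_absolute_error(y_true, y_pred)
mse = mean_squared_error(y_true, y_pred)
rmse = np.sqrt(mse)  # RMSE is the square root of MSE
r2 = r2_score(y_true, y_pred)

print(f"MAE:  {mae:.4f}")
print(f"MSE:  {mse:.4f}")
print(f"RMSE: {rmse:.4f}")
print(f"R²:   {r2:.4f}")
```

Because MSE squares each error, its unit is the squared price unit, which is why its values look so much larger than MAE and RMSE.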

Result:
Linear Regression has the best R² score, meaning it explains the most variance in the
data.
Decision Tree has the highest MSE and RMSE, indicating it has the worst prediction
accuracy.
Random Forest and XGBoost have similar performance, but Random Forest slightly
outperforms XGBoost in terms of MSE and RMSE.

6.2 MODEL DEPLOYMENT

Once the best-performing model is selected, we deploy it using Flask or Streamlit
to create a user-friendly interface where users can input property details and get
predicted prices.

Steps in Deployment:

1. Save the trained model using joblib.
2. Develop a Flask-based web application.
3. Create an HTML frontend to take user input.
4. Integrate the model to return price predictions.

Code Snippet: Saving & Loading the Model
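
The original snippet is not available in this text. A minimal sketch of saving and reloading a fitted model with joblib follows. The toy model, the file name `house_price_model.pkl`, and the use of a temporary directory are assumptions made so the sketch is self-contained.

```python
import os
import tempfile

import joblib
import numpy as np
from sklearn.linear_model import LinearRegression

# Toy model standing in for the selected model (assumption): price = 3000 * area.
X = np.array([[1000.0], [1500.0], [2000.0], [2500.0]])
y = np.array([3_000_000.0, 4_500_000.0, 6_000_000.0, 7_500_000.0])
model = LinearRegression().fit(X, y)

# Hypothetical file name; a temporary directory keeps the sketch side-effect free.
model_path = os.path.join(tempfile.gettempdir(), "house_price_model.pkl")

joblib.dump(model, model_path)          # serialize the fitted model to disk
loaded_model = joblib.load(model_path)  # restore it, e.g. inside the web app

print(loaded_model.predict([[1800.0]]))
```

The loaded model should be used with the same library versions that produced the file, since serialized scikit-learn models are not guaranteed to be portable across versions.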

6.3 DEPLOYMENT USING FLASK

This section covers the creation of the Flask-based API that serves the model's predictions.
Code Snippet: Creating a Flask API
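
The original snippet is not available in this text. A minimal sketch of a prediction endpoint follows. The route name `/predict`, the JSON fields `area` and `bedrooms`, and the stand-in model are all assumptions; in the project the model would be loaded from the file saved with joblib.

```python
import numpy as np
from flask import Flask, jsonify, request
from sklearn.linear_model import LinearRegression

# Stand-in model (assumption); the real app would call joblib.load(...) instead.
X = np.array([[1000.0, 2.0], [1500.0, 3.0], [2000.0, 3.0], [2500.0, 4.0]])
y = np.array([3_000_000.0, 4_600_000.0, 6_000_000.0, 7_600_000.0])
model = LinearRegression().fit(X, y)

app = Flask(__name__)


@app.route("/predict", methods=["POST"])
def predict():
    """Return a predicted price for the JSON fields 'area' and 'bedrooms'."""
    payload = request.get_json(silent=True) or {}
    try:
        features = [float(payload["area"]), float(payload["bedrooms"])]
    except (KeyError, TypeError, ValueError):
        return jsonify({"error": "area and bedrooms must be numeric"}), 400
    price = float(model.predict([features])[0])
    return jsonify({"predicted_price": price})


# Exercise the endpoint in-process with Flask's test client (no server needed).
with app.test_client() as client:
    response = client.post("/predict", json={"area": 1800, "bedrooms": 3})
    print(response.status_code, response.get_json())

# To serve it locally instead, call app.run() or use the `flask run` command.
```

Validating the input before calling the model keeps malformed requests from surfacing as server errors.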

6.4 FRONTEND DEVELOPMENT

The user-friendly frontend is developed using HTML, CSS, and JavaScript to take
user input and display predicted house prices.

CHAPTER 7: RESULTS AND DISCUSSION

7.1 INTRODUCTION

This chapter presents the findings of the House Price Prediction Model, analyzing
its performance, accuracy, and potential real-world applications. The evaluation
metrics provide insights into the efficiency of the model and its usability for
predicting property prices.

7.2 MODEL PERFORMANCE ANALYSIS

Comparison of Model Accuracy

The table below summarizes the evaluation metrics of different machine learning
models:

Model               MAE            MSE                     RMSE           R² Score
Linear Regression   970,043.40     1,754,318,687,330.66    1,324,595.81   0.6529
Decision Tree       1,195,266.06   2,642,802,637,614.68    1,625,646.29   0.4771
Random Forest       1,021,546.04   1,961,585,044,320.34    1,400,565.80   0.6119
XGBoost             1,054,208.88   2,075,875,606,528.00    1,441,639.98   0.5893

7.3 GRAPHICAL REPRESENTATION OF MODEL PERFORMANCE

7.3.1 Actual Vs Predicted Price Plot

A scatter plot is used to compare actual vs. predicted house prices.

The closer the points are to the 45-degree line, the better the model performance.

Code Snippet: Scatter Plot of Actual vs. Predicted House Prices
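
The original plot and its code are not available in this text. A minimal sketch of an actual-versus-predicted scatter plot with the 45-degree reference line follows. The synthetic `y_test` and `y_pred` arrays are assumptions standing in for a model's real test-set output.

```python
import matplotlib

matplotlib.use("Agg")  # render off-screen so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

# Synthetic actual and predicted prices (assumption).
rng = np.random.default_rng(0)
y_test = rng.uniform(2_000_000, 12_000_000, size=100)
y_pred = y_test + rng.normal(0, 1_000_000, size=100)

fig, ax = plt.subplots(figsize=(6, 6))
ax.scatter(y_test, y_pred, alpha=0.6, label="Predictions")

# 45-degree line: points on it are predicted exactly.
low = min(y_test.min(), y_pred.min())
high = max(y_test.max(), y_pred.max())
ax.plot([low, high], [low, high], "r--", label="Perfect prediction (y = x)")

ax.set_xlabel("Actual price")
ax.set_ylabel("Predicted price")
ax.set_title("Actual vs. predicted house prices")
ax.legend()
fig.tight_layout()
fig.savefig("actual_vs_predicted.png")  # hypothetical output file name
```

Points above the dashed line are over-predictions and points below it are under-predictions.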

7.3.2 Feature Importance Plot (XGBoost Model)

Feature importance helps in understanding which attributes have the most
significant impact on the house price prediction.

Code Snippet: Feature Importance Plot from the XGBoost Model

7.3.3 Real World Implications:

A model of this kind has several potential real-world applications. It can support
real estate market analysis by helping buyers and sellers make informed decisions.
Banks could use such estimates when assessing mortgage risk for loan approvals.
Investors could use them to shape property investment strategies and identify
profitable locations. Governments could apply them to urban planning, analyzing
housing trends and forecasting development needs.

CHAPTER 8: CONCLUSION AND FUTURE WORK

8.1 CONCLUSION

The performance evaluation of various machine learning models for house price
prediction reveals significant insights. Among the tested models, Linear Regression
achieved the best performance with an R² score of 0.6529, meaning it explained
about 65% of the variance in house prices. It also had the lowest MAE
(970,043.40) and MSE (about 1.75 trillion), making it the most reliable choice in this study.

The Random Forest model followed closely with an R² score of 0.6119, though its
slightly higher error metrics suggest reduced accuracy compared to Linear Regression.
XGBoost, while often excelling in predictive tasks, did not outperform Linear
Regression in this case, achieving an R² of 0.5893. The Decision Tree model had the
lowest R² score (0.4771) and the highest MAE and MSE, making it the least effective
option for this dataset.

Overall, while Linear Regression demonstrated the best performance, further tuning of
ensemble methods like Random Forest and XGBoost may improve accuracy. Future
enhancements could include feature engineering, hyperparameter optimization, and
exploring deep learning models for more robust predictions.

8.2 CHALLENGES FACED

While working on this project, a few challenges were encountered:

• Data Quality Issues – Missing values and inconsistencies in the dataset required
significant preprocessing.
• Overfitting in Decision Tree Models – Complex models tended to memorize the
training data, leading to poor generalization.
• Computational Complexity – Training advanced models like XGBoost required
high processing power and optimization.

Each challenge was addressed through appropriate data cleaning, feature
engineering, and hyperparameter tuning techniques.

8.3 FUTURE SCOPE AND IMPROVEMENTS:

To enhance the accuracy and applicability of this project, the following future
improvements can be considered:

• Deep Learning Integration – Implementing neural networks for better prediction
accuracy.
• Live Market Trends – Incorporating real-time pricing data from online property
listings.
• Geospatial Analysis – Using GPS coordinates and satellite imagery for precise
location-based predictions.
• User-Friendly Web Application – Deploying the model as an interactive web tool
for real estate professionals.

By implementing these enhancements, the system can evolve into a powerful AI-
driven property valuation tool, revolutionizing the real estate industry.

REFERENCES

• Kaggle. Housing Prices Dataset. https://www.kaggle.com/datasets/yasserh/housing-prices-dataset
• GeeksforGeeks. House Price Prediction using Machine Learning in Python. https://www.geeksforgeeks.org/house-price-prediction-using-machine-learning-in-python/
• Scikit-Learn Documentation. https://scikit-learn.org/stable/
• XGBoost Documentation. https://xgboost.readthedocs.io/en/stable/
• Pandas Documentation. https://pandas.pydata.org/
• Matplotlib Documentation. https://matplotlib.org/stable/contents.html

Weekly report scanned copy
