Project 4 - House Price Prediction - Ipynb - Colab

This notebook predicts house prices from the California housing dataset. It loads the data, checks for missing values, splits the data into training and test sets, trains an XGBoost regressor, and computes the R squared score and Mean Absolute Error for predictions on both the training and test data.

Uploaded by

Sargam


01/10/2024, 17:26 Copy of Project 4 : House Price Prediction.ipynb - Colab

Importing the Dependencies

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import sklearn.datasets
from sklearn.model_selection import train_test_split
from xgboost import XGBRegressor
from sklearn import metrics

Importing the California House Price Dataset

house_price_dataset = sklearn.datasets.fetch_california_housing()

print(house_price_dataset)

{'data': array([[ 8.3252 , 41. , 6.98412698, ..., 2.55555556,

37.88 , -122.23 ],
[ 8.3014 , 21. , 6.23813708, ..., 2.10984183,
37.86 , -122.22 ],
[ 7.2574 , 52. , 8.28813559, ..., 2.80225989,
37.85 , -122.24 ],
...,
[ 1.7 , 17. , 5.20554273, ..., 2.3256351 ,
39.43 , -121.22 ],
[ 1.8672 , 18. , 5.32951289, ..., 2.12320917,
39.43 , -121.32 ],
[ 2.3886 , 16. , 5.25471698, ..., 2.61698113,
39.37 , -121.24 ]]), 'target': array([4.526, 3.585, 3.521, ..., 0.923, 0.847, 0.894]), 'frame': None, 'target_na

# Loading the dataset to a Pandas DataFrame

house_price_dataframe = pd.DataFrame(house_price_dataset.data, columns = house_price_dataset.feature_names)

# Print First 5 rows of our DataFrame

house_price_dataframe.head()

MedInc HouseAge AveRooms AveBedrms Population AveOccup Latitude Longitude

0 8.3252 41.0 6.984127 1.023810 322.0 2.555556 37.88 -122.23

1 8.3014 21.0 6.238137 0.971880 2401.0 2.109842 37.86 -122.22

2 7.2574 52.0 8.288136 1.073446 496.0 2.802260 37.85 -122.24

3 5.6431 52.0 5.817352 1.073059 558.0 2.547945 37.85 -122.25

4 3.8462 52.0 6.281853 1.081081 565.0 2.181467 37.85 -122.25


# add the target (price) column to the DataFrame

house_price_dataframe['price'] = house_price_dataset.target

house_price_dataframe.head()

MedInc HouseAge AveRooms AveBedrms Population AveOccup Latitude Longitude price

0 8.3252 41.0 6.984127 1.023810 322.0 2.555556 37.88 -122.23 4.526

1 8.3014 21.0 6.238137 0.971880 2401.0 2.109842 37.86 -122.22 3.585

2 7.2574 52.0 8.288136 1.073446 496.0 2.802260 37.85 -122.24 3.521

3 5.6431 52.0 5.817352 1.073059 558.0 2.547945 37.85 -122.25 3.413

4 3.8462 52.0 6.281853 1.081081 565.0 2.181467 37.85 -122.25 3.422


# checking the number of rows and Columns in the data frame

house_price_dataframe.shape

(20640, 9)

https://colab.research.google.com/drive/1m-p2Nj9HPWN-fEk57uIZ6J2wWjW4pn2R#scrollTo=mv3Vgwq2SHp-&printMode=true
# check for missing values
house_price_dataframe.isnull().sum()

MedInc 0

HouseAge 0

AveRooms 0

AveBedrms 0

Population 0

AveOccup 0

Latitude 0

Longitude 0

price 0
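The check above reports zero missing values in every column, so no cleaning is needed here. For comparison, the following is a minimal sketch of what the same check looks like when gaps do exist. The `toy` frame and its values are hypothetical (not taken from the housing data), and median filling is only one possible remedy, not something this notebook does.

```python
import numpy as np
import pandas as pd

# Hypothetical toy frame with two missing entries (NOT the housing data).
toy = pd.DataFrame({
    "MedInc": [8.3, np.nan, 7.2, 5.6],
    "HouseAge": [41.0, 21.0, np.nan, 52.0],
    "price": [4.5, 3.6, 3.5, 3.4],
})

# Same call as in the notebook: count missing values per column.
missing_per_column = toy.isnull().sum()
print(missing_per_column)

# One common remedy (an assumption, not the author's method):
# fill each gap with the median of its own column.
filled = toy.fillna(toy.median())
print(filled.isnull().sum().sum())
```

A non-zero count in any column would be the signal to impute or drop rows before training.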

# statistical measures of the dataset

house_price_dataframe.describe()

MedInc HouseAge AveRooms AveBedrms Population AveOccup Latitude Longitude price

count 20640.000000 20640.000000 20640.000000 20640.000000 20640.000000 20640.000000 20640.000000 20640.000000 20640.000000

mean 3.870671 28.639486 5.429000 1.096675 1425.476744 3.070655 35.631861 -119.569704 2.068558

std 1.899822 12.585558 2.474173 0.473911 1132.462122 10.386050 2.135952 2.003532 1.153956

min 0.499900 1.000000 0.846154 0.333333 3.000000 0.692308 32.540000 -124.350000 0.149990

25% 2.563400 18.000000 4.440716 1.006079 787.000000 2.429741 33.930000 -121.800000 1.196000

50% 3.534800 29.000000 5.229129 1.048780 1166.000000 2.818116 34.260000 -118.490000 1.797000

75% 4.743250 37.000000 6.052381 1.099526 1725.000000 3.282261 37.710000 -118.010000 2.647250

max 15.000100 52.000000 141.909091 34.066667 35682.000000 1243.333333 41.950000 -114.310000 5.000010

Understanding the correlation between various features in the dataset

1. Positive Correlation

2. Negative Correlation
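To make the two cases concrete, here is a small sketch that computes the Pearson correlation coefficient by hand. Pearson is the default method of pandas `DataFrame.corr()`. The helper `pearson` and the three toy lists are hypothetical illustrations, not values from the dataset.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Co-variation of the two variables around their means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical toy values (not from the housing data).
income = [1.0, 2.0, 3.0, 4.0, 5.0]
price_up = [1.1, 1.9, 3.2, 3.9, 5.1]    # rises with income
price_down = [5.0, 4.1, 2.9, 2.2, 0.8]  # falls as income rises

print(pearson(income, price_up))    # close to +1: positive correlation
print(pearson(income, price_down))  # close to -1: negative correlation
```

A coefficient near +1 means the two features increase together, a coefficient near -1 means one decreases as the other increases, and a value near 0 means there is little linear relationship.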

correlation = house_price_dataframe.corr()

# constructing a heatmap to understand the correlation

plt.figure(figsize=(10,10))
sns.heatmap(correlation, cbar=True, square=True, fmt='.1f', annot=True, annot_kws={'size':8}, cmap='Blues')


<Axes: >

Splitting the data and Target

X = house_price_dataframe.drop(['price'], axis=1)
Y = house_price_dataframe['price']

print(X)
print(Y)

MedInc HouseAge AveRooms AveBedrms Population AveOccup Latitude \

0 8.3252 41.0 6.984127 1.023810 322.0 2.555556 37.88
1 8.3014 21.0 6.238137 0.971880 2401.0 2.109842 37.86
2 7.2574 52.0 8.288136 1.073446 496.0 2.802260 37.85
3 5.6431 52.0 5.817352 1.073059 558.0 2.547945 37.85
4 3.8462 52.0 6.281853 1.081081 565.0 2.181467 37.85
... ... ... ... ... ... ... ...
20635 1.5603 25.0 5.045455 1.133333 845.0 2.560606 39.48
20636 2.5568 18.0 6.114035 1.315789 356.0 3.122807 39.49
20637 1.7000 17.0 5.205543 1.120092 1007.0 2.325635 39.43
20638 1.8672 18.0 5.329513 1.171920 741.0 2.123209 39.43
20639 2.3886 16.0 5.254717 1.162264 1387.0 2.616981 39.37

Longitude
0 -122.23
1 -122.22
2 -122.24
3 -122.25
4 -122.25
... ...
20635 -121.09
20636 -121.21
20637 -121.22
20638 -121.32
20639 -121.24

[20640 rows x 8 columns]

0 4.526
1 3.585
2 3.521

3 3.413
4 3.422
...
20635 0.781
20636 0.771
20637 0.923
20638 0.847
20639 0.894
Name: price, Length: 20640, dtype: float64

Splitting the data into Training data and Test data

X_train, X_test, Y_train, Y_test = train_test_split(X, Y, test_size = 0.2, random_state = 2)

print(X.shape, X_train.shape, X_test.shape)

(20640, 8) (16512, 8) (4128, 8)
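The role of `test_size` and `random_state` can be sketched without scikit-learn. The helper `simple_split` below is hypothetical and is not how `train_test_split` is implemented internally. It only illustrates the idea: shuffle the row indices with a seeded generator, then cut off a fixed fraction as the test set. It assumes the test count is rounded up, which matches the sizes scikit-learn produces for this dataset.

```python
import math
import random

def simple_split(n_samples, test_size, seed):
    """Hypothetical helper: return (train_indices, test_indices)."""
    indices = list(range(n_samples))
    # A seeded generator makes the shuffle reproducible,
    # which is the purpose of random_state in the notebook.
    random.Random(seed).shuffle(indices)
    n_test = math.ceil(test_size * n_samples)
    return indices[n_test:], indices[:n_test]

# 20640 rows is the row count reported by house_price_dataframe.shape.
train_idx, test_idx = simple_split(20640, test_size=0.2, seed=2)
print(len(train_idx), len(test_idx))
```

Calling `simple_split` twice with the same seed returns the same partition, while a different seed shuffles the rows differently. The split sizes depend only on `n_samples` and `test_size`.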

Model Training

XGBoost Regressor

# loading the model

model = XGBRegressor()

# training the model with X_train

model.fit(X_train, Y_train)


XGBRegressor(base_score=None, booster=None, callbacks=None,

colsample_bylevel=None, colsample_bynode=None,
colsample_bytree=None, device=None, early_stopping_rounds=None,
enable_categorical=False, eval_metric=None, feature_types=None,
gamma=None, grow_policy=None, importance_type=None,
interaction_constraints=None, learning_rate=None, max_bin=None,
max_cat_threshold=None, max_cat_to_onehot=None,
max_delta_step=None, max_depth=None, max_leaves=None,
min_child_weight=None, missing=nan, monotone_constraints=None,
multi_strategy=None, n_estimators=None, n_jobs=None,
num_parallel_tree=None, random_state=None, ...)

Evaluation

Prediction on training data

# accuracy for prediction on training data

training_data_prediction = model.predict(X_train)

print(training_data_prediction)

[0.5523039 3.0850039 0.5835302 ... 1.9204227 1.952873 0.6768683]

# R squared score
score_1 = metrics.r2_score(Y_train, training_data_prediction)

# Mean Absolute Error

score_2 = metrics.mean_absolute_error(Y_train, training_data_prediction)

print("R squared score : ", score_1)

print('Mean Absolute Error : ', score_2)

R squared score : 0.943650140819218

Mean Absolute Error : 0.1933648700612105
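To show what the two numbers above measure, here is a minimal sketch that computes both metrics from their textbook definitions. The function names `r2_manual` and `mae_manual` and the four toy values are hypothetical. In the notebook itself the scikit-learn functions do this work.

```python
def r2_manual(y_true, y_pred):
    """R squared: 1 - (residual sum of squares / total sum of squares)."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot

def mae_manual(y_true, y_pred):
    """Mean Absolute Error: average size of the prediction errors."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

# Hypothetical toy values (not model output).
y_true = [3.0, 1.0, 2.0, 4.0]
y_pred = [2.5, 1.0, 2.0, 4.5]

print(r2_manual(y_true, y_pred))   # ss_res = 0.5, ss_tot = 5.0
print(mae_manual(y_true, y_pred))  # (0.5 + 0 + 0 + 0.5) / 4
```

R squared is a score rather than an error: 1.0 means perfect predictions, and 0.0 means the model does no better than always predicting the mean. Mean Absolute Error is in the units of the target. The California housing target is expressed in units of $100,000, so an MAE of about 0.19 corresponds to roughly $19,000.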

Visualizing the actual Prices and predicted prices

plt.scatter(Y_train, training_data_prediction)
plt.xlabel("Actual Prices")
plt.ylabel("Predicted Prices")
plt.title("Actual Price vs Predicted Price")
plt.show()
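A scatter plot like this is easier to read with a reference line marking perfect predictions. The following is a sketch of that variation, not part of the original notebook. The `actual` and `predicted` lists are hypothetical toy values standing in for `Y_train` and `training_data_prediction`.

```python
import matplotlib.pyplot as plt

# Hypothetical toy values standing in for Y_train and the model's predictions.
actual = [0.9, 1.8, 2.6, 3.4, 4.5]
predicted = [1.0, 1.7, 2.9, 3.1, 4.4]

fig, ax = plt.subplots(figsize=(5, 5))
ax.scatter(actual, predicted, alpha=0.6)

# Diagonal y = x: points on this line are predicted exactly.
low = min(actual + predicted)
high = max(actual + predicted)
ax.plot([low, high], [low, high], color="red", linestyle="--",
        label="perfect prediction")

ax.set_xlabel("Actual Prices")
ax.set_ylabel("Predicted Prices")
ax.set_title("Actual Price vs Predicted Price")
ax.legend()
```

In a notebook the figure renders at the end of the cell. In a script, call `plt.show()` afterwards. Points above the dashed line are over-predictions, and points below it are under-predictions.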


Prediction on Test Data

# accuracy for prediction on test data

test_data_prediction = model.predict(X_test)

# R squared score
score_1 = metrics.r2_score(Y_test, test_data_prediction)

# Mean Absolute Error

score_2 = metrics.mean_absolute_error(Y_test, test_data_prediction)

print("R squared score : ", score_1)

print('Mean Absolute Error : ', score_2)

