ML LinearRegression

This document walks through a linear regression on the USA housing dataset (USA_Housing.csv, 5,000 rows) to predict house prices. It loads the data, explores it, splits it 60/40 into training and test sets, fits a scikit-learn LinearRegression model on the training set, predicts on the test set, and evaluates the predictions with mean absolute error, mean squared error, and RMSE. On the test set the model explains about 92% of the variance in price (R² ≈ 0.918).

Uploaded by

Hellem Biersack

ML_LinearRegression

February 9, 2023

[1]: import numpy as np

import pandas as pd

[2]: import matplotlib.pyplot as plt

import seaborn as sns

[3]: %matplotlib inline

[4]: house=pd.read_csv('USA_Housing.csv')

[5]: house.head(3)

[5]: Avg. Area Income Avg. Area House Age Avg. Area Number of Rooms \
0 79545.458574 5.682861 7.009188
1 79248.642455 6.002900 6.730821
2 61287.067179 5.865890 8.512727

Avg. Area Number of Bedrooms Area Population Price \

0 4.09 23086.800503 1.059034e+06
1 3.09 40173.072174 1.505891e+06
2 5.13 36882.159400 1.058988e+06

Address
0 208 Michael Ferry Apt. 674\nLaurabury, NE 3701…
1 188 Johnson Views Suite 079\nLake Kathleen, CA…
2 9127 Elizabeth Stravenue\nDanieltown, WI 06482…

[6]: house.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 5000 entries, 0 to 4999
Data columns (total 7 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 Avg. Area Income 5000 non-null float64
1 Avg. Area House Age 5000 non-null float64
2 Avg. Area Number of Rooms 5000 non-null float64
3 Avg. Area Number of Bedrooms 5000 non-null float64
4 Area Population 5000 non-null float64
5 Price 5000 non-null float64
6 Address 5000 non-null object
dtypes: float64(6), object(1)
memory usage: 273.6+ KB


[7]: house.describe()

[7]: Avg. Area Income Avg. Area House Age Avg. Area Number of Rooms \
count 5000.000000 5000.000000 5000.000000
mean 68583.108984 5.977222 6.987792
std 10657.991214 0.991456 1.005833
min 17796.631190 2.644304 3.236194
25% 61480.562388 5.322283 6.299250
50% 68804.286404 5.970429 7.002902
75% 75783.338666 6.650808 7.665871
max 107701.748378 9.519088 10.759588

Avg. Area Number of Bedrooms Area Population Price

count 5000.000000 5000.000000 5.000000e+03
mean 3.981330 36163.516039 1.232073e+06
std 1.234137 9925.650114 3.531176e+05
min 2.000000 172.610686 1.593866e+04
25% 3.140000 29403.928702 9.975771e+05
50% 4.050000 36199.406689 1.232669e+06
75% 4.490000 42861.290769 1.471210e+06
max 6.500000 69621.713378 2.469066e+06

[8]: sns.pairplot(house)

[8]: <seaborn.axisgrid.PairGrid at 0x7f58bbebf8d0>

[9]: sns.histplot(house['Price'], kde=True)

[9]: <AxesSubplot:xlabel='Price'>

[10]: house.corr(numeric_only=True)

[10]: Avg. Area Income Avg. Area House Age \

Avg. Area Income 1.000000 -0.002007
Avg. Area House Age -0.002007 1.000000
Avg. Area Number of Rooms -0.011032 -0.009428
Avg. Area Number of Bedrooms 0.019788 0.006149
Area Population -0.016234 -0.018743
Price 0.639734 0.452543

Avg. Area Number of Rooms \

Avg. Area Income -0.011032
Avg. Area House Age -0.009428
Avg. Area Number of Rooms 1.000000
Avg. Area Number of Bedrooms 0.462695
Area Population 0.002040
Price 0.335664

Avg. Area Number of Bedrooms Area Population \

Avg. Area Income 0.019788 -0.016234
Avg. Area House Age 0.006149 -0.018743
Avg. Area Number of Rooms 0.462695 0.002040
Avg. Area Number of Bedrooms 1.000000 -0.022168
Area Population -0.022168 1.000000
Price 0.171071 0.408556

Price
Avg. Area Income 0.639734
Avg. Area House Age 0.452543
Avg. Area Number of Rooms 0.335664
Avg. Area Number of Bedrooms 0.171071
Area Population 0.408556
Price 1.000000

[11]: sns.heatmap(house.corr(numeric_only=True), annot=True)

[11]: <AxesSubplot:>
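Each cell of the correlation table and heatmap above is a Pearson correlation coefficient. As a minimal sketch of what that number measures, the plain-Python function below computes it for two columns. The function name `pearson` and the toy lists `rooms` and `price` are illustrative assumptions, not values from USA_Housing.csv.

```python
import math

def pearson(xs, ys):
    """Pearson correlation: covariance scaled by both standard deviations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations (numerator of the covariance)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations for each variable
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Toy data (hypothetical): price rises exactly linearly with rooms
rooms = [5.0, 6.0, 7.0, 8.0]
price = [10.0, 12.0, 14.0, 16.0]
print(pearson(rooms, price))        # 1.0 for a perfect increasing line
print(pearson(rooms, price[::-1]))  # -1.0 for a perfect decreasing line
```

Values near 0, such as the -0.002007 between income and house age in the table, indicate almost no linear relationship between two features.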

[13]: house.columns

[13]: Index(['Avg. Area Income', 'Avg. Area House Age', 'Avg. Area Number of Rooms',
'Avg. Area Number of Bedrooms', 'Area Population', 'Price', 'Address'],
dtype='object')


[14]: X=house[['Avg. Area Income', 'Avg. Area House Age', 'Avg. Area Number of Rooms', 'Avg. Area Number of Bedrooms', 'Area Population']]

[15]: y=house[['Price']]

[16]: from sklearn.model_selection import train_test_split

[17]: X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.4, random_state=101)
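The split above holds out 40% of the 5,000 rows for testing, which is why `y_test` later shows 2000 rows. The sketch below illustrates the idea of a seeded shuffle-and-cut split in plain Python. It is not scikit-learn's implementation, and the helper name `split_indices` is an assumption made for this illustration; the row assignments it produces will differ from those of `train_test_split`.

```python
import math
import random

def split_indices(n_rows, test_size, seed):
    """Shuffle row indices and cut them into (train, test) lists."""
    # A fractional test_size is converted to a whole number of test rows
    n_test = math.ceil(n_rows * test_size)
    indices = list(range(n_rows))
    # A fixed seed makes the shuffle, and therefore the split, repeatable
    random.Random(seed).shuffle(indices)
    return indices[n_test:], indices[:n_test]

train_idx, test_idx = split_indices(n_rows=5000, test_size=0.4, seed=101)
print(len(train_idx), len(test_idx))  # 3000 2000
```

Fixing the seed plays the same role as `random_state=101`: rerunning the cell gives the same train and test rows, so the metrics are reproducible.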

[18]: from sklearn.linear_model import LinearRegression

[19]: lm=LinearRegression()

[20]: lm.fit(X_train,y_train)

[20]: LinearRegression()

[21]: print(lm.intercept_)

[-2640159.79685191]

[22]: lm.coef_

[22]: array([[2.15282755e+01, 1.64883282e+05, 1.22368678e+05, 2.23380186e+03,
1.51504200e+01]])

[23]: lmcoeftransp = np.transpose(lm.coef_)

[24]: lmcoeftransp

[24]: array([[2.15282755e+01],
[1.64883282e+05],
[1.22368678e+05],
[2.23380186e+03],
[1.51504200e+01]])

[25]: X.columns

[25]: Index(['Avg. Area Income', 'Avg. Area House Age', 'Avg. Area Number of Rooms',
'Avg. Area Number of Bedrooms', 'Area Population'],
dtype='object')

[26]: X_train.columns

[26]: Index(['Avg. Area Income', 'Avg. Area House Age', 'Avg. Area Number of Rooms',
'Avg. Area Number of Bedrooms', 'Area Population'],
dtype='object')

[27]: cdf = pd.DataFrame(lmcoeftransp,X.columns,columns=['Coeff'])

[28]: cdf

[28]: Coeff
Avg. Area Income 21.528276
Avg. Area House Age 164883.282027
Avg. Area Number of Rooms 122368.678027
Avg. Area Number of Bedrooms 2233.801864
Area Population 15.150420
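Each coefficient is the change in predicted price for a one-unit increase in that feature, with the other features held fixed. As a worked sketch, the snippet below rebuilds a prediction by hand from the intercept and coefficients printed above, applied to row 0 of `house.head(3)`. The helper name `predict_price` is an assumption for illustration, and the coefficients are the rounded values shown in the table, so the result will differ slightly from `lm.predict`.

```python
# Intercept and coefficients copied from the notebook output above
intercept = -2640159.79685191
coefficients = {
    'Avg. Area Income': 21.528276,
    'Avg. Area House Age': 164883.282027,
    'Avg. Area Number of Rooms': 122368.678027,
    'Avg. Area Number of Bedrooms': 2233.801864,
    'Area Population': 15.150420,
}

# Feature values of row 0 from house.head(3)
first_row = {
    'Avg. Area Income': 79545.458574,
    'Avg. Area House Age': 5.682861,
    'Avg. Area Number of Rooms': 7.009188,
    'Avg. Area Number of Bedrooms': 4.09,
    'Area Population': 23086.800503,
}

def predict_price(row):
    """Linear model: intercept plus the sum of coefficient * feature value."""
    return intercept + sum(coefficients[name] * row[name] for name in coefficients)

print(round(predict_price(first_row)))

# Raising house age by one year shifts the prediction by that coefficient
older = dict(first_row, **{'Avg. Area House Age': first_row['Avg. Area House Age'] + 1})
print(round(predict_price(older) - predict_price(first_row)))
```

The large negative intercept is not a price on its own: it is the extrapolated value at zero income, age, rooms, bedrooms, and population, far outside the range of the data.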

PREDICTIONS
[29]: predictions = lm.predict(X_test)

[30]: predictions

[30]: array([[1260960.70567626],
[ 827588.75560352],
[1742421.24254328],
…,
[ 372191.40626952],
[1365217.15140895],
[1914519.54178824]])

[31]: y_test

[31]: Price
1718 1.251689e+06
2511 8.730483e+05
345 1.696978e+06
2521 1.063964e+06
54 9.487883e+05
… …
1776 1.489520e+06
4269 7.777336e+05
1661 1.515271e+05
2410 1.343824e+06
2302 1.906025e+06

[2000 rows x 1 columns]

[32]: # Now we want to see how far the predictions are from the actual prices

[33]: plt.scatter(y_test,predictions)

[33]: <matplotlib.collections.PathCollection at 0x7f58b58b3d50>

[34]: # Plot a histogram of the residuals (residual = actual price minus predicted price)

[35]: sns.histplot(y_test - predictions, kde=True)

[35]: <AxesSubplot:>
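A residual is the actual value minus the predicted value for one observation. As a small sketch, the snippet below computes residuals for the first three test rows, using the first three values printed in the `y_test` and `predictions` outputs above. The variable names `actual`, `predicted`, and `residuals` are chosen here for illustration.

```python
# First three actual prices from y_test (rows 1718, 2511, 345)
actual = [1.251689e+06, 8.730483e+05, 1.696978e+06]
# First three entries of the predictions array
predicted = [1260960.70567626, 827588.75560352, 1742421.24254328]

# Residual = actual - predicted; negative means the model over-predicted
residuals = [a - p for a, p in zip(actual, predicted)]
for a, p, r in zip(actual, predicted, residuals):
    print(f"actual={a:,.0f}  predicted={p:,.0f}  residual={r:,.0f}")
```

If the histogram of all 2,000 residuals looks roughly bell-shaped and centered on zero, that supports the choice of a linear model for this data.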

[36]: #Evaluating metrics

[37]: from sklearn import metrics

[38]: metrics.mean_absolute_error(y_test,predictions)

[38]: 82288.22251914957

[39]: metrics.mean_squared_error(y_test,predictions)

[39]: 10460958907.209501

[40]: #RMSE

[46]: RresultMSE = np.sqrt(metrics.mean_squared_error(y_test, predictions))

[47]: RresultMSE

[47]: 102278.82922291153

[48]: r2_score = lm.score(X_test,y_test)

print(r2_score*100,'%')

91.76824009649201 %
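The four numbers above (MAE, MSE, RMSE, R²) all derive from the same residuals. As a minimal sketch of the definitions, the function below computes them in plain Python on a tiny toy example. The function name `regression_metrics` and the toy lists are assumptions for illustration and are not taken from the housing data. The last line checks that the RMSE reported above is the square root of the reported MSE.

```python
import math

def regression_metrics(actual, predicted):
    """Return MAE, MSE, RMSE and R^2 for paired actual/predicted values."""
    n = len(actual)
    errors = [a - p for a, p in zip(actual, predicted)]
    mae = sum(abs(e) for e in errors) / n   # mean absolute error
    mse = sum(e * e for e in errors) / n    # mean squared error
    rmse = math.sqrt(mse)                   # same units as the target
    mean_actual = sum(actual) / n
    ss_res = sum(e * e for e in errors)                    # residual sum of squares
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)   # total sum of squares
    r2 = 1 - ss_res / ss_tot                # share of variance explained
    return {'mae': mae, 'mse': mse, 'rmse': rmse, 'r2': r2}

# Toy values (hypothetical), small enough to check by hand
toy = regression_metrics(actual=[3.0, 5.0, 7.0, 9.0], predicted=[2.0, 5.0, 8.0, 9.0])
print(toy)

# Consistency check on the notebook's own numbers: RMSE = sqrt(MSE)
print(math.sqrt(10460958907.209501))
```

RMSE is usually the easiest of these to interpret because it is in the same units as the price, while R² is unitless and compares the model against always predicting the mean.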
