20BECE30146 ML Practical 2

This practical applies linear regression to predict housing prices. It loads the USA_Housing.csv data, separates the features from the Price label, splits the data into training and test sets (40% held out for testing), and fits a scikit-learn LinearRegression model on the training set. It then inspects the fitted intercept and coefficients, tabulates each coefficient against its feature in a dataframe, and prints the model's predictions on the test set.

Uploaded by

T42 BEIT3OO43 Kundalia Harsh H.

20BECE30146

1) LINEAR REGRESSION
import numpy as np
import pandas as pd

df=pd.read_csv('/content/drive/MyDrive/ML Practical/Practical 2/USA_Housing.csv')
df.head()

Avg. Area Income Avg. Area House Age Avg. Area Number of Rooms \
0 79545.458574 5.682861 7.009188
1 79248.642455 6.002900 6.730821
2 61287.067179 5.865890 8.512727
3 63345.240046 7.188236 5.586729
4 59982.197226 5.040555 7.839388

Avg. Area Number of Bedrooms Area Population Price \
0 4.09 23086.800503 1.059034e+06
1 3.09 40173.072174 1.505891e+06
2 5.13 36882.159400 1.058988e+06
3 3.26 34310.242831 1.260617e+06
4 4.23 26354.109472 6.309435e+05

Address
0 208 Michael Ferry Apt. 674\nLaurabury, NE 3701...
1 188 Johnson Views Suite 079\nLake Kathleen, CA...
2 9127 Elizabeth Stravenue\nDanieltown, WI 06482...
3 USS Barnett\nFPO AP 44820
4 USNS Raymond\nFPO AE 09386

df.describe()

Avg. Area Income Avg. Area House Age Avg. Area Number of Rooms \
count 5000.000000 5000.000000 5000.000000
mean 68583.108984 5.977222 6.987792
std 10657.991214 0.991456 1.005833
min 17796.631190 2.644304 3.236194
25% 61480.562388 5.322283 6.299250
50% 68804.286404 5.970429 7.002902
75% 75783.338666 6.650808 7.665871
max 107701.748378 9.519088 10.759588

Avg. Area Number of Bedrooms Area Population Price
count 5000.000000 5000.000000 5.000000e+03
mean 3.981330 36163.516039 1.232073e+06
std 1.234137 9925.650114 3.531176e+05
min 2.000000 172.610686 1.593866e+04
25% 3.140000 29403.928702 9.975771e+05
50% 4.050000 36199.406689 1.232669e+06
75% 4.490000 42861.290769 1.471210e+06
max 6.500000 69621.713378 2.469066e+06

Divide the dataset into features (X) and labels (y)

X=df[['Avg. Area Income', 'Avg. Area House Age', 'Avg. Area Number of Rooms',
'Avg. Area Number of Bedrooms', 'Area Population']]
y=df['Price']

Divide the dataset into training and test sets

from sklearn.model_selection import train_test_split

X_train,X_test,y_train,y_test=train_test_split(X,y,test_size=0.4,random_state=101)
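
As a quick check of what test_size=0.4 does, the sketch below splits a synthetic stand-in with the same shape as the data described above (5000 rows, five feature columns). The random values and the names X_demo, y_demo, X_tr, X_te are assumptions for illustration only; they are not the housing data.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Synthetic stand-in: 5000 rows and 5 feature columns, like the real frame
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(5000, 5))
y_demo = rng.normal(size=5000)

# Same arguments as the notebook call: hold out 40% of the rows for testing
X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.4, random_state=101
)

print(X_tr.shape, X_te.shape)
```

Fixing random_state makes the shuffle reproducible, so the same rows land in the training and test sets on every run.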

Create and Train the Model

from sklearn.linear_model import LinearRegression

lm=LinearRegression()
lm.fit(X_train,y_train)

LinearRegression()

Evaluating the Model

print(lm.intercept_)

-2640159.7968526958

lm.coef_

array([2.15282755e+01, 1.64883282e+05, 1.22368678e+05, 2.23380186e+03,
       1.51504200e+01])

X_train.columns

Index(['Avg. Area Income', 'Avg. Area House Age', 'Avg. Area Number of Rooms',
       'Avg. Area Number of Bedrooms', 'Area Population'],
      dtype='object')

Create a dataframe of the coefficients and their corresponding features

cdf=pd.DataFrame(lm.coef_,X.columns,columns=['Coeff'])
cdf

                                      Coeff
Avg. Area Income                  21.528276
Avg. Area House Age           164883.282027
Avg. Area Number of Rooms     122368.678027
Avg. Area Number of Bedrooms    2233.801864
Area Population                   15.150420
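
Each coefficient is the change in the predicted value for a one-unit increase in that feature with the other features held fixed, and a prediction is the intercept plus the weighted sum of the features. The sketch below checks both statements on synthetic data; X_demo, y_demo, true_coef and lm_demo are hypothetical names, and the data is generated rather than taken from USA_Housing.csv.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Synthetic data with a known linear relationship plus a little noise
rng = np.random.default_rng(42)
X_demo = rng.normal(size=(200, 3))
true_coef = np.array([2.0, -1.0, 0.5])
y_demo = 10.0 + X_demo @ true_coef + rng.normal(scale=0.1, size=200)

lm_demo = LinearRegression().fit(X_demo, y_demo)

# A prediction is intercept_ + X @ coef_
manual = lm_demo.intercept_ + X_demo @ lm_demo.coef_
same = np.allclose(manual, lm_demo.predict(X_demo))
print(same)

# Raising feature 0 by one unit shifts the prediction by coef_[0]
x0 = X_demo[:1].copy()
x1 = x0.copy()
x1[0, 0] += 1.0
delta = lm_demo.predict(x1)[0] - lm_demo.predict(x0)[0]
print(np.isclose(delta, lm_demo.coef_[0]))
```

Read this way, the table above suggests that one extra year of average house age is associated with a predicted price about 164,883 higher, holding the other four features constant.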

Predictions
predictions=lm.predict(X_test)
print(predictions)

[1260960.70567627  827588.75560329 1742421.24254344 ...  372191.40626917
 1365217.15140898 1914519.5417888 ]
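
The predictions above can be compared against the held-out labels to quantify the error. This is a minimal sketch, assuming synthetic data in place of USA_Housing.csv; the names ending in _demo and the metric variables are hypothetical. In the notebook, the same metric calls could be applied to y_test and predictions.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score
from sklearn.model_selection import train_test_split

# Synthetic regression data (an assumption; not the housing dataset)
rng = np.random.default_rng(101)
X_demo = rng.normal(size=(500, 3))
y_demo = 10.0 + X_demo @ np.array([2.0, -1.0, 0.5]) + rng.normal(scale=0.1, size=500)

X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.4, random_state=101
)
lm_demo = LinearRegression().fit(X_tr, y_tr)
pred = lm_demo.predict(X_te)

mae = mean_absolute_error(y_te, pred)   # mean absolute error
mse = mean_squared_error(y_te, pred)    # mean squared error
rmse = np.sqrt(mse)                     # root mean squared error, in units of y
r2 = r2_score(y_te, pred)               # coefficient of determination

print("MAE:", mae)
print("MSE:", mse)
print("RMSE:", rmse)
print("R^2:", r2)
```

MAE and RMSE are in the same units as the target, which makes them easier to read against the price scale than MSE.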
