20BECE30146 ML Practical 2

This document walks through linear regression for predicting housing prices. It loads the housing data, splits it into training and test sets, and fits a linear regression model to the training data. It then inspects the fitted intercept and coefficients, tabulates the coefficients against their feature names in a dataframe, and prints the model's predictions on the test data.

20BECE30146

1) LINEAR REGRESSION
import numpy as np
import pandas as pd

df=pd.read_csv('/content/drive/MyDrive/ML Practical/Practical 2/USA_Housing.csv')
df.head()

   Avg. Area Income  Avg. Area House Age  Avg. Area Number of Rooms  \
0      79545.458574             5.682861                   7.009188
1      79248.642455             6.002900                   6.730821
2      61287.067179             5.865890                   8.512727
3      63345.240046             7.188236                   5.586729
4      59982.197226             5.040555                   7.839388

   Avg. Area Number of Bedrooms  Area Population         Price  \
0                          4.09     23086.800503  1.059034e+06
1                          3.09     40173.072174  1.505891e+06
2                          5.13     36882.159400  1.058988e+06
3                          3.26     34310.242831  1.260617e+06
4                          4.23     26354.109472  6.309435e+05

   Address
0  208 Michael Ferry Apt. 674\nLaurabury, NE 3701...
1  188 Johnson Views Suite 079\nLake Kathleen, CA...
2  9127 Elizabeth Stravenue\nDanieltown, WI 06482...
3  USS Barnett\nFPO AP 44820
4  USNS Raymond\nFPO AE 09386

df.describe()

       Avg. Area Income  Avg. Area House Age  Avg. Area Number of Rooms  \
count       5000.000000          5000.000000                5000.000000
mean       68583.108984             5.977222                   6.987792
std        10657.991214             0.991456                   1.005833
min        17796.631190             2.644304                   3.236194
25%        61480.562388             5.322283                   6.299250
50%        68804.286404             5.970429                   7.002902
75%        75783.338666             6.650808                   7.665871
max       107701.748378             9.519088                  10.759588

       Avg. Area Number of Bedrooms  Area Population         Price
count                   5000.000000      5000.000000  5.000000e+03
mean                       3.981330     36163.516039  1.232073e+06
std                        1.234137      9925.650114  3.531176e+05
min                        2.000000       172.610686  1.593866e+04
25%                        3.140000     29403.928702  9.975771e+05
50%                        4.050000     36199.406689  1.232669e+06
75%                        4.490000     42861.290769  1.471210e+06
max                        6.500000     69621.713378  2.469066e+06

Divide the dataset into features (X) and labels (y); the non-numeric Address column is left out


X=df[['Avg. Area Income', 'Avg. Area House Age', 'Avg. Area Number of Rooms',
'Avg. Area Number of Bedrooms', 'Area Population']]
y=df['Price']

Divide the dataset into train and test dataset


from sklearn.model_selection import train_test_split

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.4, random_state=101)
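
With test_size=0.4, 40% of the rows are held out for testing and the remaining 60% are used for training. The sketch below checks the resulting sizes on a synthetic stand-in with the same number of rows (5000) and feature columns (5) as the housing data; the names X_demo, y_demo, X_tr, X_te, y_tr and y_te are illustrative and not part of the original notebook.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the housing data: 5000 rows, 5 feature columns.
# The values are random; only the shapes matter for this check.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(5000, 5))
y_demo = rng.normal(size=5000)

# Same split settings as in the practical.
X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.4, random_state=101
)

print(X_tr.shape, X_te.shape)  # (3000, 5) (2000, 5)
```

Fixing random_state makes the shuffle reproducible, so repeated runs give the same train and test rows.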

Create and Train the Model


from sklearn.linear_model import LinearRegression

lm=LinearRegression()
lm.fit(X_train,y_train)

LinearRegression()

Evaluating the Model


print(lm.intercept_)

-2640159.7968526958

lm.coef_

array([2.15282755e+01, 1.64883282e+05, 1.22368678e+05, 2.23380186e+03,
       1.51504200e+01])

X_train.columns

Index(['Avg. Area Income', 'Avg. Area House Age', 'Avg. Area Number of Rooms',
       'Avg. Area Number of Bedrooms', 'Area Population'],
      dtype='object')

Create a dataframe of the coefficients and their corresponding features


cdf=pd.DataFrame(lm.coef_,X.columns,columns=['Coeff'])
cdf

                                      Coeff
Avg. Area Income                  21.528276
Avg. Area House Age           164883.282027
Avg. Area Number of Rooms     122368.678027
Avg. Area Number of Bedrooms    2233.801864
Area Population                   15.150420
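
Each coefficient is the change in predicted price for a one-unit increase in that feature with the other features held fixed, and a prediction is the intercept plus the sum of each coefficient times its feature value. As a rough sketch, the code below applies that formula by hand to row 0 of df.head(), using the intercept and coefficients as printed above; the helper predict_price is hypothetical and not part of the original notebook.

```python
# Intercept and coefficients copied from the printed output of the practical.
intercept = -2640159.7968526958
coefficients = {
    "Avg. Area Income": 21.528276,
    "Avg. Area House Age": 164883.282027,
    "Avg. Area Number of Rooms": 122368.678027,
    "Avg. Area Number of Bedrooms": 2233.801864,
    "Area Population": 15.150420,
}

# Feature values of row 0 as shown by df.head().
row0 = {
    "Avg. Area Income": 79545.458574,
    "Avg. Area House Age": 5.682861,
    "Avg. Area Number of Rooms": 7.009188,
    "Avg. Area Number of Bedrooms": 4.09,
    "Area Population": 23086.800503,
}


def predict_price(features):
    """Hypothetical helper: intercept + sum(coefficient * feature value)."""
    return intercept + sum(
        coefficients[name] * value for name, value in features.items()
    )


estimate = predict_price(row0)
print(round(estimate))
```

The estimate comes out near 1.23 million, against a listed Price of 1.059034e+06 for that row; the difference is that row's residual.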

Predictions
predictions=lm.predict(X_test)
print(predictions)

[1260960.70567627  827588.75560329 1742421.24254344 ...  372191.40626917
 1365217.15140898 1914519.5417888 ]
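
The printed predictions are not yet compared against y_test. One common way to do that is with the regression metrics in sklearn.metrics (MAE, RMSE and R²). The sketch below shows the pattern on synthetic data, since the true test prices are not reproduced here; the data, the coefficients in true_coef, and all variable names are assumptions for illustration only. In the practical, the same three calls would take y_test and predictions.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score
from sklearn.model_selection import train_test_split

# Synthetic data: a known linear relationship plus a little noise.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(500, 3))
true_coef = np.array([2.0, -1.0, 0.5])
y_demo = 4.0 + X_demo @ true_coef + rng.normal(scale=0.1, size=500)

X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.4, random_state=101
)

model = LinearRegression().fit(X_tr, y_tr)
pred = model.predict(X_te)

# Compare held-out targets with the model's predictions.
mae = mean_absolute_error(y_te, pred)
rmse = np.sqrt(mean_squared_error(y_te, pred))
r2 = r2_score(y_te, pred)

print(f"MAE={mae:.3f}  RMSE={rmse:.3f}  R2={r2:.3f}")
```

MAE and RMSE are in the units of the target (dollars for the housing data), while R² is the fraction of variance in the test targets that the model explains.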
