
ML - Tutorial1.ipynb - Colaboratory

This notebook fits a linear regression model that predicts Bengaluru housing prices from total square footage. It loads the housing data set, fills missing values with column medians, splits the data 70/30 into training and test sets, fits the model on the training set, plots the predictions for both sets, and reports the R² score on each.

Uploaded by

khushi namdev

 

                                                         Tutorial-1

Linear Regression Implementation

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

df = pd.read_csv('/content/Bengaluru_House_Data.csv')
df.head(2)

   total_sqft  price
0       284.0   8.00
1      1350.0   8.44

df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 13107 entries, 0 to 13106
Data columns (total 2 columns):
 #   Column      Non-Null Count  Dtype
---  ------      --------------  -----
 0   total_sqft  13089 non-null  float64
 1   price       13092 non-null  float64
dtypes: float64(2)
memory usage: 204.9 KB

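# Fill missing values with each column's median. Assigning the result back avoids
# fillna(inplace=True) on a column selection, which newer pandas versions warn about.
df['total_sqft'] = df['total_sqft'].fillna(df['total_sqft'].median())
df['price'] = df['price'].fillna(df['price'].median())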

df.isnull().sum()

total_sqft 0
price 0
dtype: int64

df.corr()

            total_sqft     price
total_sqft    1.000000  0.570307
price         0.570307  1.000000
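The correlation above bounds what a one-feature linear fit can achieve. For ordinary least squares with an intercept, R² measured on the same data used for fitting equals the squared Pearson correlation, so a fit on the full table would score about 0.570307² ≈ 0.325. The following is a minimal sketch of that identity on synthetic data; the arrays, coefficients, and seed are made up for illustration and are not the housing data.

```python
import numpy as np

# Synthetic data for illustration only (not the Bengaluru data set).
rng = np.random.default_rng(0)
sqft = rng.uniform(300, 3000, size=200)
price = 0.05 * sqft + rng.normal(0, 40, size=200)

# Pearson correlation between the feature and the target.
r = np.corrcoef(sqft, price)[0, 1]

# Least-squares line with an intercept, then R^2 on the same data.
slope, intercept = np.polyfit(sqft, price, deg=1)
pred = slope * sqft + intercept
ss_res = np.sum((price - pred) ** 2)
ss_tot = np.sum((price - price.mean()) ** 2)
r2 = 1 - ss_res / ss_tot

# The two values agree up to floating-point error.
print(f"r^2 = {r**2:.6f}, R^2 = {r2:.6f}")
```

The identity only holds for the data the line was fitted on, so a score computed on a training subset or on held-out rows can differ from this value.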

x = df.drop('price', axis=1)
y = df['price']
print('x_shape:', x.shape)
print('y_shape:', y.shape)

x_shape: (13107, 1)
y_shape: (13107,)

from sklearn.model_selection import train_test_split
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.30, shuffle=True, random_state=40)
print('x_train shape:', x_train.shape)
print('x_test shape:', x_test.shape)
print('y_train shape:', y_train.shape)
print('y_test shape:', y_test.shape)

x_train shape: (9174, 1)
x_test shape: (3933, 1)
y_train shape: (9174,)
y_test shape: (3933,)
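The split sizes follow from applying test_size=0.30 to 13,107 rows: 0.30 × 13107 = 3932.1, and scikit-learn rounds the test count up and gives the remaining rows to the training set. A small sketch of that arithmetic, with illustrative variable names:

```python
import math

n_samples = 13107   # rows reported by df.info()
test_size = 0.30    # fraction passed to train_test_split

# The test count is rounded up; the training set gets the remaining rows.
n_test = math.ceil(test_size * n_samples)
n_train = n_samples - n_test

print(n_test, n_train)  # 3933 9174
```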

from sklearn.linear_model import LinearRegression
lr = LinearRegression()
lr.fit(x_train, y_train)
# Plot the training points as markers; a line style ('b-') would connect the unsorted points
plt.plot(x_train, y_train, 'bx')
[<matplotlib.lines.Line2D at 0x7f1efacca4f0>]

y_train_pre = lr.predict(x_train)
plt.plot(x_train, y_train, 'bx')
plt.plot(x_train, y_train_pre, 'r-')

[<matplotlib.lines.Line2D at 0x7f1efac25e20>]

# R² (coefficient of determination) on the training set
lr.score(x_train, y_train)

0.44559432557671586

y_test_pre = lr.predict(x_test)
plt.plot(x_test, y_test, 'bx')
plt.plot(x_test, y_test_pre, 'g-')

[<matplotlib.lines.Line2D at 0x7f1efac0d8e0>]

# R² (coefficient of determination) on the held-out test set
lr.score(x_test, y_test)

-0.17485328595276228
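
The score returned by lr.score is the coefficient of determination, R² = 1 − SS_res/SS_tot. It has no lower bound: it drops below zero whenever the predictions have a larger squared error than simply predicting the mean of the evaluated targets. The sketch below uses small made-up numbers, not the housing data, to show how one badly mispredicted point can push the score negative. Whether a few extreme rows explain the −0.17 above is an assumption that would need to be checked against the test set.

```python
import numpy as np

def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1 - ss_res / ss_tot

# Made-up targets and predictions (hypothetical values, for illustration only).
y_true = [10.0, 12.0, 14.0, 16.0]
good_pred = [10.5, 11.5, 14.5, 15.5]
bad_pred = [10.5, 11.5, 14.5, 60.0]   # one wildly wrong prediction

print(r_squared(y_true, good_pred))   # close to 1
print(r_squared(y_true, bad_pred))    # far below 0
```

A gap like the one between the training score (about 0.45) and the negative test score suggests the fitted line does not generalize to the held-out rows, so inspecting the largest residuals in the test set is a reasonable next step.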