Week 12

Uploaded by

syedabidmaroof30
Copyright
© All Rights Reserved

In [1]: import seaborn as sns

In [2]: import matplotlib.pyplot as plt

In [3]: import pandas as pd


df = pd.read_csv('house.csv')
print(df.head())

Posted On BHK Rent Size Floor Area Type \
0 2022-05-18 2 10000 1100 Ground out of 2 Super Area
1 2022-05-13 2 20000 800 1 out of 3 Super Area
2 2022-05-16 2 17000 1000 1 out of 3 Super Area
3 2022-07-04 2 10000 800 1 out of 2 Super Area
4 2022-05-09 2 7500 850 1 out of 2 Carpet Area

Area Locality City Furnishing Status Tenant Preferred \
0 Bandel Kolkata Unfurnished Bachelors/Family
1 Phool Bagan, Kankurgachi Kolkata Semi-Furnished Bachelors/Family
2 Salt Lake City Sector 2 Kolkata Semi-Furnished Bachelors/Family
3 Dumdum Park Kolkata Unfurnished Bachelors/Family
4 South Dum Dum Kolkata Unfurnished Bachelors

Bathroom Point of Contact
0 2 Contact Owner
1 1 Contact Owner
2 1 Contact Owner
3 1 Contact Owner
4 1 Contact Owner

In [4]: df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 4746 entries, 0 to 4745
Data columns (total 12 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 Posted On 4746 non-null object
1 BHK 4746 non-null int64
2 Rent 4746 non-null int64
3 Size 4746 non-null int64
4 Floor 4746 non-null object
5 Area Type 4746 non-null object
6 Area Locality 4746 non-null object
7 City 4746 non-null object
8 Furnishing Status 4746 non-null object
9 Tenant Preferred 4746 non-null object
10 Bathroom 4746 non-null int64
11 Point of Contact 4746 non-null object
dtypes: int64(4), object(8)
memory usage: 445.1+ KB

In [5]: df.head()

Out[5]: Posted On BHK Rent Size Floor Area Type Area Locality City Furnishing Status Tenant Preferred Bathroom Point of Contact

0 2022-05-18 2 10000 1100 Ground out of 2 Super Area Bandel Kolkata Unfurnished Bachelors/Family 2 Contact Owner

1 2022-05-13 2 20000 800 1 out of 3 Super Area Phool Bagan, Kankurgachi Kolkata Semi-Furnished Bachelors/Family 1 Contact Owner

2 2022-05-16 2 17000 1000 1 out of 3 Super Area Salt Lake City Sector 2 Kolkata Semi-Furnished Bachelors/Family 1 Contact Owner

3 2022-07-04 2 10000 800 1 out of 2 Super Area Dumdum Park Kolkata Unfurnished Bachelors/Family 1 Contact Owner

4 2022-05-09 2 7500 850 1 out of 2 Carpet Area South Dum Dum Kolkata Unfurnished Bachelors 1 Contact Owner

1) Write a program in Python to implement correlation.


In [6]: correlations=df['Bathroom'].corr(df['Rent'])
print(f'The correlation is:{correlations}')

The correlation is:0.4412152289555701

In [7]: correlations=df['Size'].corr(df['Rent'])
print(f'The correlation is:{correlations}')

The correlation is:0.4135507582245195

In [8]: correlations=df['Bathroom'].corr(df['BHK'])
print(f'The correlation is:{correlations}')

The correlation is:0.7948854397283542
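The `Series.corr` calls above return the Pearson correlation coefficient by default. As a minimal sketch of what that number means, the block below computes the coefficient by hand and compares it with pandas. Since `house.csv` is not available here, it uses only the ten rows visible in the printed output (the first five and last five), so its values will differ from the full-dataset results above. The names `sample` and `pearson_r` are introduced for illustration only.

```python
import numpy as np
import pandas as pd

# Ten rows copied from the printed DataFrame output (rows 0-4 and 4741-4745).
# This is NOT the full dataset, so the coefficients differ from the notebook's.
sample = pd.DataFrame({
    "BHK":      [2, 2, 2, 2, 2, 2, 3, 3, 3, 2],
    "Rent":     [10000, 20000, 17000, 10000, 7500, 15000, 29000, 35000, 45000, 15000],
    "Size":     [1100, 800, 1000, 800, 850, 1000, 2000, 1750, 1500, 1000],
    "Bathroom": [2, 1, 1, 1, 1, 2, 3, 3, 2, 2],
})


def pearson_r(x, y):
    """Pearson r: sum of co-deviations over the root of the product of squared deviations."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    dx = x - x.mean()
    dy = y - y.mean()
    return float((dx * dy).sum() / np.sqrt((dx ** 2).sum() * (dy ** 2).sum()))


manual = pearson_r(sample["Bathroom"], sample["Rent"])
builtin = sample["Bathroom"].corr(sample["Rent"])  # Pearson by default
print(f"manual: {manual:.6f}  pandas: {builtin:.6f}")

# All pairwise correlations at once, instead of one pair per cell.
corr_matrix = sample.corr()
print(corr_matrix.round(3))
```

On the real data, `df[["BHK", "Rent", "Size", "Bathroom"]].corr()` would give the three coefficients printed above in a single table.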

2) Write a program in Python to perform simple linear regression.


In [9]: df1=df[["BHK","Rent","Size","Bathroom"]]
print(df1)

BHK Rent Size Bathroom


0 2 10000 1100 2
1 2 20000 800 1
2 2 17000 1000 1
3 2 10000 800 1
4 2 7500 850 1
... ... ... ... ...
4741 2 15000 1000 2
4742 3 29000 2000 3
4743 3 35000 1750 3
4744 3 45000 1500 2
4745 2 15000 1000 2

[4746 rows x 4 columns]

In [10]: df1.describe()

Out[10]: BHK Rent Size Bathroom

count 4746.000000 4.746000e+03 4746.000000 4746.000000

mean 2.083860 3.499345e+04 967.490729 1.965866

std 0.832256 7.810641e+04 634.202328 0.884532

min 1.000000 1.200000e+03 10.000000 1.000000

25% 2.000000 1.000000e+04 550.000000 1.000000

50% 2.000000 1.600000e+04 850.000000 2.000000

75% 3.000000 3.300000e+04 1200.000000 2.000000

max 6.000000 3.500000e+06 8000.000000 10.000000

In [11]: from sklearn.linear_model import LinearRegression


from sklearn.model_selection import train_test_split

In [12]: # Predict BHK (target) from Size (single feature)
X = df1[['Size']]
y = df1['BHK']
# Split into 80% training and 20% test data
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=59)
model = LinearRegression()
model.fit(X_train, y_train)
m = model.coef_[0]    # slope
b = model.intercept_  # intercept
print(f"Slope (m): {m}")
print(f"Intercept (b): {b}")
# Predictions for the held-out test rows
y_pred = model.predict(X_test)

Slope (m): 0.0009161422842001272


Intercept (b): 1.1912197415401806
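For a single feature, the slope and intercept that `LinearRegression` reports have a closed-form least-squares solution: slope = Σ(x − x̄)(y − ȳ) / Σ(x − x̄)², intercept = ȳ − slope · x̄. The sketch below checks this against scikit-learn. It assumes only the ten `Size`/`BHK` rows visible in the printed output and skips the train/test split, so its numbers will not match the slope and intercept above.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Size and BHK for the ten rows visible in the printed output (not the full dataset).
size = np.array([1100, 800, 1000, 800, 850, 1000, 2000, 1750, 1500, 1000], dtype=float)
bhk = np.array([2, 2, 2, 2, 2, 2, 3, 3, 3, 2], dtype=float)

# Closed-form ordinary least squares for one feature.
dx = size - size.mean()
slope = float((dx * (bhk - bhk.mean())).sum() / (dx ** 2).sum())
intercept = float(bhk.mean() - slope * size.mean())

# The same fit through scikit-learn, which expects a 2-D feature array.
model = LinearRegression().fit(size.reshape(-1, 1), bhk)

print(f"closed form : slope={slope:.8f}, intercept={intercept:.6f}")
print(f"scikit-learn: slope={model.coef_[0]:.8f}, intercept={model.intercept_:.6f}")
```

Both lines should print the same values up to floating-point rounding.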

In [13]: # Plot the test data and the fitted regression line
plt.scatter(X_test, y_test, color='orange', label='Original Data')
plt.plot(X_test, y_pred, color='blue', label='Regression Line')
plt.xlabel('Size')
plt.ylabel('BHK')
plt.legend()
plt.show()

4) Write a program in Python to predict house rent using linear regression.


In [14]: # Predict Rent (target) from Bathroom (single feature)
X = df1[['Bathroom']]
y = df1['Rent']
# Split into 80% training and 20% test data
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=59)
# Create and train the model
model = LinearRegression()
model.fit(X_train, y_train)
m = model.coef_[0]    # slope
b = model.intercept_  # intercept
print(f"Slope (m): {m}")
print(f"Intercept (b): {b}")
# Predictions for the held-out test rows
y_pred = model.predict(X_test)

Slope (m): 39908.44676348857


Intercept (b): -43038.48217706297
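With the slope and intercept printed above, a rent prediction is just `rent = m * bathrooms + b`. The sketch below plugs in the two printed values; the constant names and the helper `predict_rent` are introduced for illustration and are not part of the notebook.

```python
# Slope and intercept copied from the printed output of cell In [14].
SLOPE = 39908.44676348857
INTERCEPT = -43038.48217706297


def predict_rent(bathrooms):
    """Evaluate the fitted line rent = slope * bathrooms + intercept."""
    return SLOPE * bathrooms + INTERCEPT


for n in (1, 2, 3):
    print(f"{n} bathroom(s): predicted rent = {predict_rent(n):.2f}")
```

Because the intercept is negative and larger in magnitude than the slope, the line predicts a negative rent for a one-bathroom home. That suggests a single-feature straight line is a rough fit here, which is plausible given the spread in the `describe()` output (median rent 16,000 against a maximum of 3,500,000).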

In [15]: # Plot the test data and the fitted regression line
plt.scatter(X_test, y_test, color='orange', label='Original Data')
plt.plot(X_test, y_pred, color='blue', label='Regression Line')
plt.xlabel('Bathroom')
plt.ylabel('Rent')
plt.legend()
plt.show()
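The plot shows the fit visually; error metrics put a number on it. The sketch below is one possible way to score such a model with `sklearn.metrics`. It again assumes only the ten rows visible in the printed output, and it scores on the same rows it was fitted on, because ten rows are too few for a meaningful 80:20 split. In the notebook itself, the same three functions could be called on `y_test` and `y_pred` from cell In [14].

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score

# Bathroom and Rent for the ten rows visible in the printed output (not the full dataset).
bathroom = np.array([2, 1, 1, 1, 1, 2, 3, 3, 2, 2], dtype=float).reshape(-1, 1)
rent = np.array([10000, 20000, 17000, 10000, 7500,
                 15000, 29000, 35000, 45000, 15000], dtype=float)

model = LinearRegression().fit(bathroom, rent)
pred = model.predict(bathroom)

r2 = r2_score(rent, pred)                       # share of variance explained
mae = mean_absolute_error(rent, pred)           # average absolute error, in rent units
rmse = np.sqrt(mean_squared_error(rent, pred))  # penalises large errors more heavily

print(f"R^2 : {r2:.3f}")
print(f"MAE : {mae:.2f}")
print(f"RMSE: {rmse:.2f}")
```

MAE and RMSE are in the same units as `Rent`, so they can be compared directly with the rent values in the table.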
