
House Rent Prediction EDA

The document discusses exploring and analyzing a dataset containing house rental listings. It loads the dataset and performs some initial exploratory data analysis. This includes checking for missing values, viewing feature data types and counts of unique values, and examining the distributions of features like BHK (bedroom-hall-kitchen) and rent amounts. Charts are generated to visualize the distributions. The analysis finds that most listings are for 2BHK houses and rent amounts vary widely, which could impact machine learning models.

Uploaded by

Mr. Mystery

1/15/23, 7:24 PM House_Rent_EDA (1)

House Rent Prediction EDA


importing libraries

In [1]:

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import warnings
warnings.filterwarnings("ignore")
%matplotlib inline

Loading dataset

In [2]:

df=pd.read_csv("/content/House_Rent_Dataset.csv")
df.shape

Out[2]:

(4746, 12)

In [3]:

df.head(5)

Out[3]:

    Posted On  BHK   Rent  Size            Floor    Area Type             Area Locality     City  Furnishing Status  Tenant Preferred
0  2022-05-18    2  10000  1100  Ground out of 2   Super Area                    Bandel  Kolkata        Unfurnished  Bachelors/Family
1  2022-05-13    2  20000   800       1 out of 3   Super Area  Phool Bagan, Kankurgachi  Kolkata     Semi-Furnished  Bachelors/Family
2  2022-05-16    2  17000  1000       1 out of 3   Super Area   Salt Lake City Sector 2  Kolkata     Semi-Furnished  Bachelors/Family
3  2022-07-04    2  10000   800       1 out of 2   Super Area               Dumdum Park  Kolkata        Unfurnished  Bachelors/Family
4  2022-05-09    2   7500   850       1 out of 2  Carpet Area             South Dum Dum  Kolkata        Unfurnished         Bachelors

https://htmtopdf.herokuapp.com/ipynbviewer/temp/f094e30b096d815513911b1defda642f/House_Rent_EDA (1).html?t=1673790740596 1/35



In [4]:

df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 4746 entries, 0 to 4745
Data columns (total 12 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 Posted On 4746 non-null object
1 BHK 4746 non-null int64
2 Rent 4746 non-null int64
3 Size 4746 non-null int64
4 Floor 4746 non-null object
5 Area Type 4746 non-null object
6 Area Locality 4746 non-null object
7 City 4746 non-null object
8 Furnishing Status 4746 non-null object
9 Tenant Preferred 4746 non-null object
10 Bathroom 4746 non-null int64
11 Point of Contact 4746 non-null object
dtypes: int64(4), object(8)
memory usage: 445.1+ KB

In [5]:

df.describe()

Out[5]:

               BHK          Rent         Size     Bathroom
count  4746.000000  4.746000e+03  4746.000000  4746.000000
mean      2.083860  3.499345e+04   967.490729     1.965866
std       0.832256  7.810641e+04   634.202328     0.884532
min       1.000000  1.200000e+03    10.000000     1.000000
25%       2.000000  1.000000e+04   550.000000     1.000000
50%       2.000000  1.600000e+04   850.000000     2.000000
75%       3.000000  3.300000e+04  1200.000000     2.000000
max       6.000000  3.500000e+06  8000.000000    10.000000

Checking Missing Values

In [6]:

df.columns

Out[6]:

Index(['Posted On', 'BHK', 'Rent', 'Size', 'Floor', 'Area Type',
       'Area Locality', 'City', 'Furnishing Status', 'Tenant Preferred',
       'Bathroom', 'Point of Contact'],
      dtype='object')


In [7]:

df.isnull().sum()

Out[7]:

Posted On 0
BHK 0
Rent 0
Size 0
Floor 0
Area Type 0
Area Locality 0
City 0
Furnishing Status 0
Tenant Preferred 0
Bathroom 0
Point of Contact 0
dtype: int64

In [8]:

len(df.columns)

Out[8]:

12

Unique value counts for each feature


In [9]:

columns = df.columns
# Number of distinct values in each column, in column order
counts = [len(df[col].unique()) for col in columns]

fig, ax = plt.subplots(figsize=(15, 9))
x = np.arange(len(columns))
ax.set_ylabel('Unique Counts')
ax.set_xlabel('Features in Dataset')
ax.set_title('Number of Unique Values for each feature')
ax.set_xticks(x)
ax.set_xticklabels(columns)
width = 0.35

# Bars are centred on the ticks; the four colours cycle across the features
pps = ax.bar(x, counts, width, color=['yellow', 'red', 'blue', 'green'])
# Write each count just above its bar
for p in pps:
    height = p.get_height()
    ax.annotate('{}'.format(height),
                xy=(p.get_x() + p.get_width() / 2, height),
                xytext=(0, 3),  # 3 points vertical offset
                textcoords="offset points",
                ha='center', va='bottom')
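
The per-column unique counts can also be obtained in a single call. The following is a minimal sketch on a small made-up frame (the name `toy` and its values are hypothetical, not rows from the dataset). Note that `nunique()` ignores missing values by default, which makes no difference here because the dataset has none.

```python
import pandas as pd

# Hypothetical stand-in for the rental data; values are illustrative only
toy = pd.DataFrame({
    "BHK": [1, 2, 2, 3],
    "City": ["Kolkata", "Mumbai", "Mumbai", "Delhi"],
    "Rent": [7500, 20000, 20000, 45000],
})

# One distinct-value count per column, returned as a Series indexed by column name
unique_counts = toy.nunique()
print(unique_counts)
```

The resulting Series could be passed to a bar chart directly, since its index already holds the feature names.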

Checking datatypes of each feature


In [10]:

df.dtypes

Out[10]:

Posted On object
BHK int64
Rent int64
Size int64
Floor object
Area Type object
Area Locality object
City object
Furnishing Status object
Tenant Preferred object
Bathroom int64
Point of Contact object
dtype: object

Most popular BHK rooms to rent

In [11]:

plt.rcParams["figure.figsize"] = [15, 10]
plt.rcParams["figure.autolayout"] = True

ax = sns.countplot(x="BHK", data=df)

# Label each bar with its listing count
for p in ax.patches:
    ax.annotate('{:.1f}'.format(p.get_height()), (p.get_x() + 0.25, p.get_height() + 0.01))

plt.show()


Observations:

1. Most of the houses that are available for rent are 2BHK houses.
2. 6BHK has the least availability for renting.
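
The same ranking can be read off numerically instead of from the chart. A minimal sketch, assuming a short made-up list of BHK values (the series `bhk` is hypothetical and only illustrates the calls):

```python
import pandas as pd

# Hypothetical BHK values, for illustration only (not rows from the dataset)
bhk = pd.Series([2, 2, 2, 2, 1, 1, 1, 3, 3, 6], name="BHK")

counts = bhk.value_counts()      # listings per BHK, most frequent first
most_common = counts.idxmax()    # BHK with the most listings
least_common = counts.idxmin()   # BHK with the fewest listings
print(counts)
print(most_common, least_common)
```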

In [12]:

df['Rent'].describe()

Out[12]:

count 4.746000e+03
mean 3.499345e+04
std 7.810641e+04
min 1.200000e+03
25% 1.000000e+04
50% 1.600000e+04
75% 3.300000e+04
max 3.500000e+06
Name: Rent, dtype: float64

Observations:

The Rent feature has a very wide range of values, from a minimum of 1,200 to a maximum of 3,500,000, and it is
strongly right-skewed: the mean (about 34,993) is more than twice the median (16,000). This can cause problems
when building machine learning models, because many algorithms are sensitive to the scale of the values and a
few extreme rents can dominate the fit.
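
One way to put a number on this spread is to compare the mean with the median and to compute the sample skewness. The sketch below is only illustrative: it reuses the five summary figures printed above (min, quartiles, max) as a tiny hypothetical sample, so its results are not the statistics of the full dataset.

```python
import pandas as pd

# Hypothetical five-value sample built from the min, quartiles and max shown above
rent = pd.Series([1200, 10000, 16000, 33000, 3500000], name="Rent")

mean_rent = rent.mean()
median_rent = rent.median()
skewness = rent.skew()                      # positive values indicate a long right tail
max_to_median = rent.max() / median_rent    # how far the largest value sits from the centre

print(mean_rent, median_rent, skewness, max_to_median)
```

A mean far above the median together with positive skewness is the usual sign that a log transform of the target is worth trying.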

In [13]:

df.head(5)

Out[13]:

    Posted On  BHK   Rent  Size            Floor    Area Type             Area Locality     City  Furnishing Status  Tenant Preferred
0  2022-05-18    2  10000  1100  Ground out of 2   Super Area                    Bandel  Kolkata        Unfurnished  Bachelors/Family
1  2022-05-13    2  20000   800       1 out of 3   Super Area  Phool Bagan, Kankurgachi  Kolkata     Semi-Furnished  Bachelors/Family
2  2022-05-16    2  17000  1000       1 out of 3   Super Area   Salt Lake City Sector 2  Kolkata     Semi-Furnished  Bachelors/Family
3  2022-07-04    2  10000   800       1 out of 2   Super Area               Dumdum Park  Kolkata        Unfurnished  Bachelors/Family
4  2022-05-09    2   7500   850       1 out of 2  Carpet Area             South Dum Dum  Kolkata        Unfurnished         Bachelors

Distribution of Target Variable


In [14]:

df.columns

Out[14]:

Index(['Posted On', 'BHK', 'Rent', 'Size', 'Floor', 'Area Type',
       'Area Locality', 'City', 'Furnishing Status', 'Tenant Preferred',
       'Bathroom', 'Point of Contact'],
      dtype='object')

In [19]:

sns.distplot(df['Rent'])

Out[19]:

<matplotlib.axes._subplots.AxesSubplot at 0x7f6210c39400>


In [15]:

sns.catplot(x="BHK", y="Rent",kind="box",data=df)

Out[15]:

<seaborn.axisgrid.FacetGrid at 0x7f6222d9c3a0>

In [16]:

df['Rent_log'] = np.log1p(df['Rent']) # Log transformation of target variable
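
To illustrate what this transformation does, the sketch below applies `np.log1p` to a few rent values and inverts it with `np.expm1`, which is how a prediction made on the log scale would be converted back to a rent. The array values are a hypothetical sample taken from the summary statistics, not the full column.

```python
import numpy as np

# Hypothetical sample of rents (min, quartiles and max from the summary above)
rent = np.array([1200.0, 10000.0, 16000.0, 33000.0, 3500000.0])

rent_log = np.log1p(rent)        # log(1 + rent), compresses the long right tail
recovered = np.expm1(rent_log)   # exact inverse of log1p, back to the original scale

spread_raw = rent.max() / rent.min()          # ratio of largest to smallest rent
spread_log = rent_log.max() / rent_log.min()  # same ratio after the transform

print(rent_log)
print(spread_raw, spread_log)
```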

In [17]:

df['Rent'].describe()

Out[17]:

count 4.746000e+03
mean 3.499345e+04
std 7.810641e+04
min 1.200000e+03
25% 1.000000e+04
50% 1.600000e+04
75% 3.300000e+04
max 3.500000e+06
Name: Rent, dtype: float64


In [20]:

sns.distplot(df['Rent_log'])  # Distribution after log transformation

Out[20]:

<matplotlib.axes._subplots.AxesSubplot at 0x7f621346d6a0>

In [21]:

df.drop(['Rent'],axis=1,inplace=True)


In [22]:

sns.catplot(x="BHK", y="Rent_log",kind="box",data=df)

Out[22]:

<seaborn.axisgrid.FacetGrid at 0x7f6213a8ad00>

Observations:

1. Outliers at both the lower and higher ends are observed for 1 BHK, 2 BHK, and 3 BHK houses.
2. Higher-end outliers are observed for 6 BHK houses too.
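
The points drawn beyond the whiskers of a box plot are conventionally those lying more than 1.5 × IQR outside the quartiles. The sketch below counts such points per BHK group under that assumption. The frame `toy` and the helper `count_outliers` are hypothetical names introduced for illustration, and the values are made up.

```python
import pandas as pd

# Hypothetical log-rents for two BHK groups, each with one extreme value
toy = pd.DataFrame({
    "BHK": [1, 1, 1, 1, 1, 2, 2, 2, 2, 2],
    "Rent_log": [8.9, 9.0, 9.1, 9.2, 12.5, 9.4, 9.5, 9.6, 9.7, 6.0],
})

def count_outliers(values: pd.Series) -> int:
    """Count values lying more than 1.5 * IQR outside the quartiles."""
    q1, q3 = values.quantile(0.25), values.quantile(0.75)
    iqr = q3 - q1
    outside = (values < q1 - 1.5 * iqr) | (values > q3 + 1.5 * iqr)
    return int(outside.sum())

# One outlier count per BHK group
outliers_per_bhk = toy.groupby("BHK")["Rent_log"].apply(count_outliers)
print(outliers_per_bhk)
```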

Finding Rents of Rooms for each City based on BHK


In [16]:

sns.barplot(x='City', y='Rent_log', hue='BHK', data=df)

Out[16]:

<matplotlib.axes._subplots.AxesSubplot at 0x7f21d3fdd340>

Observations:

1. Kolkata, Chennai, and Hyderabad have houses of up to 6 BHK available for rent.
2. Mumbai and Delhi have houses of up to 5 BHK available.
3. Bangalore has houses of up to 4 BHK available.
4. For every house type up to 5 BHK, Mumbai has the highest rent.
5. Kolkata has the lowest rent.
6. 6 BHK houses have the highest rent in Chennai.
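
By default, `sns.barplot` draws the mean of the y variable for each group, so the bar heights can be reproduced as a table. A minimal sketch on made-up values (the frame `toy` is hypothetical and does not reflect the real rents):

```python
import pandas as pd

# Hypothetical listings, for illustration only
toy = pd.DataFrame({
    "City": ["Mumbai", "Mumbai", "Kolkata", "Kolkata", "Mumbai", "Kolkata"],
    "BHK": [1, 2, 1, 2, 1, 2],
    "Rent_log": [10.0, 11.0, 8.5, 9.5, 10.4, 9.7],
})

# Mean log-rent for each (City, BHK) pair: rows are cities, columns are BHK values
mean_rent = toy.pivot_table(index="City", columns="BHK", values="Rent_log", aggfunc="mean")

# For each BHK column, the city with the highest mean log-rent
highest_per_bhk = mean_rent.idxmax()

print(mean_rent)
print(highest_per_bhk)
```

A missing (City, BHK) combination would appear as NaN in such a table, which is a quick way to check which cities have listings for each BHK value.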

Finding Available Room Size for each City based on BHK


In [17]:

sns.barplot(x='City', y='Size', hue='BHK', data=df)

Out[17]:

<matplotlib.axes._subplots.AxesSubplot at 0x7f21d3e81f10>

Observations:

1. Kolkata has the smallest houses available for rent (across all house types).
2. Chennai has the largest 5 BHK houses available.
3. Bangalore has the largest 4 BHK houses available.
4. Hyderabad and Bangalore have nearly the largest 3 BHK houses available.
5. Hyderabad has the largest 2 BHK houses available.
6. Hyderabad also has the largest 1 BHK houses available.

Finding Rents of Rooms for each Area Type based on BHK


In [18]:

sns.barplot(x='Area Type', y='Rent_log', hue='BHK', data=df)

Out[18]:

<matplotlib.axes._subplots.AxesSubplot at 0x7f21d3d3c490>

Observations:

1. Only 1 and 2 BHK rooms fall under Built Area.
2. Super Area and Carpet Area have rooms up to 6 BHK.
3. Rooms listed with Carpet Area have higher rent than the other area types.
4. Built Area rooms have the lowest rent.

Finding Sizes of Rooms for each Area Type based on BHK


In [19]:

sns.barplot(x='Area Type', y='Size', hue='BHK', data=df)

Out[19]:

<matplotlib.axes._subplots.AxesSubplot at 0x7f21d3ecd9d0>

Observations:

1. 6 BHK rooms of Area Type Carpet Area have larger sizes.
2. 5 BHK rooms of Area Type Super Area also have larger sizes.

Finding Rents of Rooms for each Furnishing Status based on BHK


In [20]:

sns.barplot(x='Furnishing Status', y='Rent', hue='BHK', data=df)

Out[20]:

<matplotlib.axes._subplots.AxesSubplot at 0x7f21d3bd5f10>

Observations:

1. For houses up to 4 BHK, Semi-Furnished and Furnished houses have almost the same rent.
2. However, for 5 BHK houses, Furnished houses have higher rent.
3. For 6 BHK houses, Semi-Furnished houses have higher rent.

Finding Sizes of Rooms for each Furnishing Status based on BHK


In [21]:

sns.barplot(x='Furnishing Status', y='Size', hue='BHK', data=df)

Out[21]:

<matplotlib.axes._subplots.AxesSubplot at 0x7f21d3b25250>

Observations:

1. Semi-Furnished houses appear to have larger sizes.

Finding Rents of Rooms for each Number of Bathrooms based on BHK


In [22]:

sns.barplot(x='Bathroom', y='Rent_log', hue='BHK', data=df)

Out[22]:

<matplotlib.axes._subplots.AxesSubplot at 0x7f21d3afb250>

Observations:

1. Maximum rent has been observed if the number of Bathrooms in the house is 5.
2. Houses with 1 bathroom appear to have least rent.

Finding Sizes of Rooms for each Number of Bathrooms based on BHK


In [23]:

sns.barplot(x='Bathroom', y='Size', hue='BHK', data=df)

Out[23]:

<matplotlib.axes._subplots.AxesSubplot at 0x7f21d38c47f0>

Observations:

1. The graph suggests that the number of bathrooms increases as the size increases.
2. However, this does not hold in every case: for houses with 6 bathrooms, the size is not larger even though
the number of bathrooms is higher.

Finding Rents of Rooms for each Point of Contact based on BHK


In [24]:

sns.barplot(x='Point of Contact', y='Rent_log', hue='BHK', data=df)

Out[24]:

<matplotlib.axes._subplots.AxesSubplot at 0x7f21d3ede220>

Observations:

1. If the Point of Contact is an Agent, the rent is higher for every house type.

Rents of Houses Based on Cities


In [28]:

sns.barplot(x='BHK', y='Rent_log', hue='City', data=df)

Out[28]:

<matplotlib.axes._subplots.AxesSubplot at 0x7f21d35b2f70>

Observations

1. Rooms of up to 5 BHK are expensive to rent in Mumbai.
2. 6 BHK houses are available to rent only in Chennai and Hyderabad. Hyderabad has higher rent than Chennai.

Preferred Tenant


In [29]:

tenant=df['Tenant Preferred'].value_counts()
plt.figure(figsize=(10,5))
plt.pie(tenant.values,labels=tenant.index,autopct="%1.2f%%");

Observations:

1. The most preferred tenant category is Bachelors/Family.
2. The least preferred tenant category is Family.

Available houses furnishing status


In [30]:

furn=df['Furnishing Status'].value_counts()
plt.figure(figsize=(10,5))
plt.pie(furn.values,labels=furn.index,autopct="%1.2f%%");

Observations:

1. 47.43% of the houses available for rent are Semi-Furnished.
2. 14.33% of the houses available are Furnished.
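
The percentages that `autopct` prints on the pie chart can also be computed directly, which is handy when exact figures are needed in the text. A minimal sketch, assuming a made-up series of ten furnishing labels (the series `status` is hypothetical, so the shares below are not the dataset's):

```python
import pandas as pd

# Hypothetical furnishing labels, for illustration only
status = pd.Series(
    ["Semi-Furnished"] * 5 + ["Unfurnished"] * 4 + ["Furnished"] * 1,
    name="Furnishing Status",
)

# Share of each category as a percentage, rounded like the pie chart labels
pct = status.value_counts(normalize=True).mul(100).round(2)
print(pct)
```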

Point of Contact for renting houses


In [31]:

poc=df['Point of Contact'].value_counts()
plt.figure(figsize=(10,5))
plt.pie(poc.values,labels=poc.index,autopct="%1.2f%%");

Observations:

1. Almost 68% of the houses have Owners as their point of contact.
2. 0.02% of the houses have Builder as their point of contact.

Area Type of houses available for renting


In [32]:

at=df['Area Type'].value_counts()
plt.figure(figsize=(10,5))
plt.pie(at.values,labels=at.index,autopct="%1.2f%%");

Observations:

1. Almost 52% of the houses are listed with Super Area.
2. 0.04% of the houses are listed with Built Area.

Cities to Rent Houses


In [33]:

ct=df['City'].value_counts()
plt.figure(figsize=(10,5))
plt.pie(ct.values,labels=ct.index,autopct="%1.2f%%");

Observations

1. Mumbai is the most popular city for renting houses, followed by Chennai, Bangalore and Hyderabad.
2. Kolkata is the least popular city for renting houses.

10 Most Popular area localities for renting houses


In [34]:

al=df['Area Locality'].value_counts()
plt.figure(figsize=(10,5))
plt.pie(al.values[:10],labels=al.index[:10],autopct="%1.2f%%");

Observations

The above pie chart shows popular area localities for renting houses.

Performing Time Based Analysis

In [35]:

df["Posted On"] = pd.to_datetime(df["Posted On"])

df["year"] = df["Posted On"].dt.year
df["month"] = df["Posted On"].dt.month
df["day"] = df["Posted On"].dt.day
df["weekday"] = df["Posted On"].dt.weekday  # Monday=0 ... Sunday=6
# Series.dt.week was removed in pandas 2.0; isocalendar().week gives the ISO week number
df["week"] = df["Posted On"].dt.isocalendar().week
df["quarter"] = df["Posted On"].dt.quarter
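
As a small check of what these derived columns contain, the sketch below extracts the same kind of features from two of the posting dates shown in the sample rows. The names `posted` and `features` are introduced here for illustration. `dt.isocalendar().week` is used for the week number because `Series.dt.week` is no longer available in pandas 2.0 and later.

```python
import pandas as pd

# Two posting dates taken from the sample rows shown earlier
posted = pd.to_datetime(pd.Series(["2022-05-18", "2022-07-04"]))

features = pd.DataFrame({
    "month": posted.dt.month,
    "weekday": posted.dt.weekday,           # Monday=0 ... Sunday=6
    "day_name": posted.dt.day_name(),       # readable label for the weekday number
    "week": posted.dt.isocalendar().week,   # ISO 8601 week of the year
    "quarter": posted.dt.quarter,
})
print(features)
```

Keeping a `day_name` column next to the numeric weekday makes it easier to read plots where the axis only shows 0 to 6.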


In [44]:

df.head(5)

Out[44]:

    Posted On  BHK  Size            Floor    Area Type             Area Locality     City  Furnishing Status  Tenant Preferred
0  2022-05-18    2  1100  Ground out of 2   Super Area                    Bandel  Kolkata        Unfurnished  Bachelors/Family
1  2022-05-13    2   800       1 out of 3   Super Area  Phool Bagan, Kankurgachi  Kolkata     Semi-Furnished  Bachelors/Family
2  2022-05-16    2  1000       1 out of 3   Super Area   Salt Lake City Sector 2  Kolkata     Semi-Furnished  Bachelors/Family
3  2022-07-04    2   800       1 out of 2   Super Area               Dumdum Park  Kolkata        Unfurnished  Bachelors/Family
4  2022-05-09    2   850       1 out of 2  Carpet Area             South Dum Dum  Kolkata        Unfurnished         Bachelors

Rent Varies on Month

In [52]:

def show_values_on_bars(axs):
    """Write each bar's height above it, for a single Axes or an array of Axes."""
    def _show_on_single_plot(ax):
        for p in ax.patches:
            _x = p.get_x() + p.get_width() / 2
            _y = p.get_y() + p.get_height()
            value = '{:.2f}'.format(p.get_height())
            ax.text(_x, _y, value, ha="center")

    if isinstance(axs, np.ndarray):
        for idx, ax in np.ndenumerate(axs):
            _show_on_single_plot(ax)
    else:
        _show_on_single_plot(axs)


In [61]:

sns.barplot(x='month', y='Rent_log', data=df)
show_values_on_bars(plt.gca())

Observations:

1. Maximum rent was observed in the 7th month.
2. Minimum rent was observed in the 4th month.

Rents of houses varying on weekdays


In [62]:

sns.barplot(x='weekday', y='Rent_log', data=df)
show_values_on_bars(plt.gca())

Observations

1. Weekday 5 (Saturday) has the maximum rent.
2. Weekday 4 (Friday) has the minimum rent.

Rents varying on weeks of the year


In [60]:

sns.barplot(x='week', y='Rent_log', data=df)
show_values_on_bars(plt.gca())

Observations

1. Maximum rent was observed in week 15.
2. Minimum rent was observed in week 16.

Rents varying on quarters of 2022 (the data was taken from magicbricks.com for the year 2022)


In [66]:

sns.barplot(x='quarter', y='Rent_log', data=df)
show_values_on_bars(plt.gca())

Observations

1. The dataset only has data for the 2nd and 3rd quarters.
2. Rent was highest in the 3rd quarter of 2022 and lowest in the 2nd quarter.

Heatmaps for Correlation


In [71]:

sns.heatmap(df.corr(numeric_only=True), annot=True)  # restrict to numeric columns (needed on pandas 2.x)

Out[71]:

<matplotlib.axes._subplots.AxesSubplot at 0x7f21cf7aad60>

Observations:

1. BHK, Size, Bathrooms are highly correlated with the Rent feature
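
To read such correlations as numbers rather than colours, the column for the target can be pulled out of the correlation matrix and sorted. A minimal sketch on a made-up frame (the name `toy` and all values are hypothetical, so the coefficients below say nothing about the real data):

```python
import pandas as pd

# Hypothetical listings with one non-numeric column, for illustration only
toy = pd.DataFrame({
    "BHK": [1, 2, 2, 3, 4],
    "Size": [400, 800, 850, 1200, 2000],
    "Rent_log": [8.9, 9.6, 9.7, 10.4, 11.5],
    "City": ["Kolkata", "Delhi", "Chennai", "Mumbai", "Mumbai"],
})

# Pearson correlation of every numeric feature with the target, strongest first;
# numeric_only=True skips text columns such as City
corr_with_target = (
    toy.corr(numeric_only=True)["Rent_log"]
    .drop("Rent_log")
    .sort_values(ascending=False)
)
print(corr_with_target)
```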

BHK vs Rent


In [78]:

sns.scatterplot(x='BHK', y='Rent_log', data=df)

Out[78]:

<matplotlib.axes._subplots.AxesSubplot at 0x7f21cf0a8bb0>

Observations

As BHK increases, the range of house rents also increases.

Bathrooms vs Rent


In [79]:

sns.scatterplot(x='Bathroom', y='Rent_log', data=df)

Out[79]:

<matplotlib.axes._subplots.AxesSubplot at 0x7f21cee0d8e0>

Observations

As the number of bathrooms increases, the range of house rents also increases.

Size vs Rent


In [83]:

sns.scatterplot(x='Size', y='Rent_log', data=df)

Out[83]:

<matplotlib.axes._subplots.AxesSubplot at 0x7f21cf4c47f0>

Observations

1. As house size increases, the range of house rents also increases, but in some cases we may need to
consider the city in which the house is rented and the number of BHK it has.
2. Depending on City and BHK, rents may increase or decrease.
