Project
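The import and data-loading cells are not part of this export. The cells below assume pandas is available as `pd`, seaborn as `sns`, and that the sales records sit in a DataFrame named `data`. A minimal sketch of the loading step, assuming the data comes from a CSV file; the file name is not given, so a made-up two-row CSV held in memory stands in for it here:

```python
import io

import pandas as pd

# Made-up CSV text standing in for the real file (name and full contents unknown).
csv_text = """User_ID,Cust_name,Gender,Orders,Amount
1,A,F,2,1500.0
2,B,M,1,
"""

# With a real file, the first argument would be its path instead of a StringIO buffer.
data = pd.read_csv(io.StringIO(csv_text))

print(data.shape)
print(data["Amount"].isnull().sum())
```

The empty `Amount` field in the second row is read as NaN, which is the kind of missing value the cleaning steps further down deal with.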
data.shape
(11251, 15)
data.head()
data.tail()
Status unnamed1
11246 NaN NaN
11247 NaN NaN
11248 NaN NaN
11249 NaN NaN
11250 NaN NaN
data.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 11251 entries, 0 to 11250
Data columns (total 15 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 User_ID 11251 non-null int64
1 Cust_name 11251 non-null object
2 Product_ID 11251 non-null object
3 Gender 11251 non-null object
4 Age Group 11251 non-null object
5 Age 11251 non-null int64
6 Marital_Status 11251 non-null int64
7 State 11251 non-null object
8 Zone 11251 non-null object
9 Occupation 11251 non-null object
10 Product_Category 11251 non-null object
11 Orders 11251 non-null int64
12 Amount 11239 non-null float64
13 Status 0 non-null float64
14 unnamed1 0 non-null float64
dtypes: float64(3), int64(4), object(8)
memory usage: 1.3+ MB
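The next `data.info()` call reports 13 columns, so the two completely empty columns, `Status` and `unnamed1`, were evidently dropped in a cell that did not survive the export. A minimal sketch of that step, assuming `DataFrame.drop` was used; the small `demo` frame is made up and stands in for `data`:

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for `data`: two real columns plus the two empty ones.
demo = pd.DataFrame({
    "User_ID": [1, 2, 3],
    "Amount": [100.0, np.nan, 250.0],
    "Status": [np.nan, np.nan, np.nan],
    "unnamed1": [np.nan, np.nan, np.nan],
})

# Drop the columns that hold no values at all.
demo = demo.drop(columns=["Status", "unnamed1"])

print(demo.columns.tolist())
```

On the real frame the equivalent call would be `data.drop(columns=['Status', 'unnamed1'], inplace=True)`.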
data.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 11251 entries, 0 to 11250
Data columns (total 13 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 User_ID 11251 non-null int64
1 Cust_name 11251 non-null object
2 Product_ID 11251 non-null object
3 Gender 11251 non-null object
4 Age Group 11251 non-null object
5 Age 11251 non-null int64
6 Marital_Status 11251 non-null int64
7 State 11251 non-null object
8 Zone 11251 non-null object
9 Occupation 11251 non-null object
10 Product_Category 11251 non-null object
11 Orders 11251 non-null int64
12 Amount 11239 non-null float64
dtypes: float64(1), int64(4), object(8)
memory usage: 1.1+ MB
data.shape
(11251, 13)
data.dropna(inplace= True)
data['Amount'] = data['Amount'].astype('int')
data['Amount'].dtypes
dtype('int32')
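According to the `info()` output, `Amount` has 11239 non-null values in 11251 rows, so `dropna` removes 12 rows here. The integer cast has to come after that step, because NaN cannot be converted to an integer. A small sketch of the same sequence on a made-up frame (`demo` and its values are illustrative only):

```python
import numpy as np
import pandas as pd

demo = pd.DataFrame({
    "Orders": [1, 3, 2, 4],
    "Amount": [100.0, np.nan, 199.5, 206.0],
})

# Count missing values per column before cleaning.
missing_before = demo.isnull().sum()

# Drop rows with any missing value, then cast Amount to a fixed-width integer.
# The cast truncates the fractional part (199.5 becomes 199).
demo = demo.dropna().astype({"Amount": "int64"})

print(missing_before["Amount"], len(demo), demo["Amount"].dtype)
```

Naming `int64` explicitly keeps the result the same across platforms; the `int32` shown above is what `astype('int')` produces where NumPy's default integer is 32 bits wide.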
Data Analysis
Gender
ax = sns.countplot(x='Gender', hue='Gender', data=data)
# Label each bar with its count
for bars in ax.containers:
    ax.bar_label(bars)
sales_gen = (data.groupby(['Gender'], as_index=False)['Amount'].sum()
             .sort_values(by=['Amount'], ascending=False))
sns.barplot(x='Gender', y='Amount', hue='Gender', data=sales_gen)
Age
rx = sns.countplot(x='Age Group', hue='Gender', data=data)
# Label each bar with its count
for bars in rx.containers:
    rx.bar_label(bars)
sales_age = (data.groupby(['Age Group'], as_index=False)['Amount'].sum()
             .sort_values(by=['Amount'], ascending=False))
sns.barplot(x='Age Group', y='Amount', hue='Age Group', data=sales_age)
State
sales_state = (data.groupby(['State'], as_index=False)['Orders'].sum()
               .sort_values(by='Orders', ascending=False).head(10))
sns.set(rc={'figure.figsize':(17,5)})
sns.barplot(x='State',
y='Orders',hue='State',palette='tab10',data=sales_state)
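# sales_amt is not defined in this export; assumed here: top 10 states by total Amount
sales_amt = (data.groupby(['State'], as_index=False)['Amount'].sum()
             .sort_values(by='Amount', ascending=False).head(10))
sns.set(rc={'figure.figsize':(17,5)})
sns.barplot(x='State', y='Amount', hue='State', palette='Paired', data=sales_amt)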
The graphs above show that Uttar Pradesh, Maharashtra and Karnataka account for the most
orders and the highest total sales amount.
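The State cells, like most cells in this analysis, repeat one pattern: group by a column, sum a numeric column, sort in descending order, and keep the top rows. A sketch of that pattern as a reusable function; `top_n_by_sum` is a hypothetical helper name, not part of the notebook, and the `demo` data is made up:

```python
import pandas as pd


def top_n_by_sum(df, group_col, value_col, n=10):
    """Sum value_col per group_col and return the n groups with the largest totals."""
    totals = df.groupby(group_col, as_index=False)[value_col].sum()
    return totals.sort_values(by=value_col, ascending=False).head(n)


# Made-up data: three states, six orders.
demo = pd.DataFrame({
    "State": ["A", "B", "A", "C", "B", "A"],
    "Orders": [1, 2, 3, 1, 1, 2],
})

top2 = top_n_by_sum(demo, "State", "Orders", n=2)
print(top2)
```

With the real frame, `top_n_by_sum(data, 'State', 'Orders')` would give the same table as `sales_state`.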
Marital Status
# Set the figure size before plotting so that it applies to this chart
sns.set(rc={'figure.figsize':(9,5)})
px = sns.countplot(x='Marital_Status', hue='Marital_Status', palette='tab10', data=data)
for bars in px.containers:
    px.bar_label(bars)
sales_status = (data.groupby(['Marital_Status', 'Gender'], as_index=False)['Amount'].sum()
                .sort_values(by='Amount', ascending=False))
sns.set(rc={'figure.figsize':(6,5)})
sns.barplot(x='Marital_Status',
y='Amount',hue='Gender',palette='tab10', data=sales_status)
Occupation
sns.set(rc={'figure.figsize':(20,5)})
ux = sns.countplot(x='Occupation', hue='Occupation',
palette='Accent',data=data)
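# sales_occ is not defined in this export; assumed here: total Amount per occupation
sales_occ = (data.groupby(['Occupation'], as_index=False)['Amount'].sum()
             .sort_values(by='Amount', ascending=False))
sns.set(rc={'figure.figsize':(20,5)})
sns.barplot(x='Occupation', y='Amount', hue='Occupation', palette='Dark2', data=sales_occ)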
The graphs above show that most buyers work in the IT, Healthcare and Aviation sectors.
Product Category
sns.set(rc={'figure.figsize':(25,8)})
ex = sns.countplot(x='Product_Category',
hue='Product_Category',palette='GnBu', data=data)
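# sales_prod is not defined in this export; assumed here: total Amount per product category
sales_prod = (data.groupby(['Product_Category'], as_index=False)['Amount'].sum()
              .sort_values(by='Amount', ascending=False))
sns.set(rc={'figure.figsize':(20,5)})
sns.barplot(x='Product_Category', y='Amount', hue='Product_Category',
            palette='Paired', data=sales_prod)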
The graphs above show that most of the products sold are from the Food, Clothing and
Electronics categories.
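# sales_ID is not defined in this export; assumed here: top 10 products by total Orders
sales_ID = (data.groupby(['Product_ID'], as_index=False)['Orders'].sum()
            .sort_values(by='Orders', ascending=False).head(10))
sns.set(rc={'figure.figsize':(20,5)})
sns.barplot(x='Product_ID', y='Orders', hue='Product_ID', palette='RdYlGn', data=sales_ID)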
Conclusion
Married women aged 26-35 from Uttar Pradesh, Maharashtra and Karnataka who work in IT,
Healthcare and Aviation are more likely than other groups to buy products from the Food,
Clothing and Electronics categories.
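The conclusion combines findings that each came from a separate chart, and the top value in each single dimension does not automatically form the top combined segment. One way to check it directly is to group on several columns at once and rank the resulting segments. A sketch on made-up data, using the column names from `data.info()`; which `Marital_Status` code stands for "married" is not stated in this export, so the codes are left as they are:

```python
import pandas as pd

# Made-up rows using the notebook's column names.
demo = pd.DataFrame({
    "Marital_Status": [0, 0, 1, 1, 0, 1],
    "Gender": ["F", "F", "M", "F", "M", "F"],
    "Age Group": ["26-35", "26-35", "26-35", "18-25", "36-45", "26-35"],
    "Amount": [500, 300, 200, 100, 150, 250],
})

# Total Amount for every (Marital_Status, Gender, Age Group) combination, largest first.
segments = (
    demo.groupby(["Marital_Status", "Gender", "Age Group"], as_index=False)["Amount"]
    .sum()
    .sort_values(by="Amount", ascending=False)
)

# Share of overall sales contributed by each segment.
segments["share"] = segments["Amount"] / segments["Amount"].sum()

top = segments.iloc[0]
print(segments)
```

The same grouping can be extended with `State`, `Occupation` or `Product_Category` to test the remaining parts of the conclusion.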