Walmart Solution PDF

The document analyzes transaction data from Walmart. It performs exploratory data analysis on the dataset, which has over 550,000 rows and 10 columns. Key insights include that most transactions were made by male customers, by customers aged 26-35, and by unmarried customers.

Uploaded by

ASWINKUMAR R

walmart

April 28, 2024

WALMART - CASE ANALYSIS

[ ]: #importing libraries
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from scipy.stats import t
import warnings
warnings.filterwarnings('ignore')
import copy

[ ]: !gdown https://d2beiqkhq929f0.cloudfront.net/public_assets/assets/000/001/293/original/walmart_data.csv?1641285094

Downloading…
From: https://d2beiqkhq929f0.cloudfront.net/public_assets/assets/000/001/293/original/walmart_data.csv?1641285094
To: /content/walmart_data.csv?1641285094
100% 23.0M/23.0M [00:00<00:00, 87.3MB/s]
1. Exploratory Data Analysis
[ ]: # loading the dataset
df = pd.read_csv('walmart_data.csv')

[ ]: df.head()

[ ]:    User_ID Product_ID Gender   Age  Occupation City_Category  \
     0  1000001  P00069042      F  0-17          10             A
     1  1000001  P00248942      F  0-17          10             A
     2  1000001  P00087842      F  0-17          10             A
     3  1000001  P00085442      F  0-17          10             A
     4  1000002  P00285442      M   55+          16             C

       Stay_In_Current_City_Years  Marital_Status  Product_Category  Purchase
     0                          2               0                 3      8370
     1                          2               0                 1     15200
     2                          2               0                12      1422
     3                          2               0                12      1057
     4                         4+               0                 8      7969

[ ]: df.tail()

[ ]:         User_ID Product_ID Gender    Age  Occupation City_Category  \
     550063  1006033  P00372445      M  51-55          13             B
     550064  1006035  P00375436      F  26-35           1             C
     550065  1006036  P00375436      F  26-35          15             B
     550066  1006038  P00375436      F    55+           1             C
     550067  1006039  P00371644      F  46-50           0             B

            Stay_In_Current_City_Years  Marital_Status  Product_Category  Purchase
     550063                          1               1                20       368
     550064                          3               0                20       371
     550065                         4+               1                20       137
     550066                          2               0                20       365
     550067                         4+               1                20       490

[ ]: df.shape

[ ]: (550068, 10)

[ ]: df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 550068 entries, 0 to 550067
Data columns (total 10 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 User_ID 550068 non-null int64
1 Product_ID 550068 non-null object
2 Gender 550068 non-null object
3 Age 550068 non-null object
4 Occupation 550068 non-null int64
5 City_Category 550068 non-null object
6 Stay_In_Current_City_Years 550068 non-null object
7 Marital_Status 550068 non-null int64
8 Product_Category 550068 non-null int64
9 Purchase 550068 non-null int64
dtypes: int64(5), object(5)
memory usage: 42.0+ MB
Insights:
The data has a total of 10 features, several of which hold mixed alphanumeric values.
Apart from the Purchase column, every column is categorical in nature, even where it is stored as an integer. We will change the
datatypes of all such columns to category.
Changing the Datatype of Columns:
[ ]: for i in df.columns[:-1]:
    df[i] = df[i].astype('category')
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 550068 entries, 0 to 550067
Data columns (total 10 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 User_ID 550068 non-null category
1 Product_ID 550068 non-null category
2 Gender 550068 non-null category
3 Age 550068 non-null category
4 Occupation 550068 non-null category
5 City_Category 550068 non-null category
6 Stay_In_Current_City_Years 550068 non-null category
7 Marital_Status 550068 non-null category
8 Product_Category 550068 non-null category
9 Purchase 550068 non-null int64
dtypes: category(9), int64(1)
memory usage: 10.3 MB
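
The drop in reported memory after the conversion can be illustrated on a small scale. The following is only a sketch on a synthetic column (an assumption, not the Walmart data): a low-cardinality text column is measured once as stored and once as `category`.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for a low-cardinality column such as Age (not the real dataset)
rng = np.random.default_rng(0)
ages = pd.Series(rng.choice(['0-17', '18-25', '26-35', '36-45'], size=100_000))

# deep=True also counts the memory held by the string values themselves
as_stored = ages.memory_usage(deep=True)
as_category = ages.astype('category').memory_usage(deep=True)

print(f"as stored: {as_stored:,} bytes, as category: {as_category:,} bytes")
```

The saving comes from storing one small integer code per row plus a single copy of each distinct label, so it is largest for columns with few unique values.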
2. Statistical Summary:
a. Statistical summary of category type columns:
[ ]: df.describe(include = 'category')

[ ]: User_ID Product_ID Gender Age Occupation City_Category \

count 550068 550068 550068 550068 550068 550068
unique 5891 3631 2 7 21 3
top 1001680 P00265242 M 26-35 4 B
freq 1026 1880 414259 219587 72308 231173

Stay_In_Current_City_Years Marital_Status Product_Category

count 550068 550068 550068
unique 5 2 20
top 1 0 5
freq 193821 324731 150933

Insights:
1. User_ID - Among 550,068 transactions there are 5,891 unique user IDs, indicating that the same customers bought multiple products.
2. Product_ID - Among 550,068 transactions there are 3,631 unique products. Product P00265242 is the best seller, appearing in 1,880 transactions.
3. Gender - Out of 550,068 transactions, 414,259 (nearly 75%) were made by male customers, indicating a significant disparity in purchase behavior between males and females during the Black Friday event.
4. Age - There are 7 unique age groups in the dataset. The 26-35 age group has the most transactions (219,587). This feature is analysed in detail in section 3.2.2.
5. Stay_In_Current_City_Years - Customers with 1 year of stay in their current city account for the most transactions (193,821) among all stay durations (0, 1, 2, 3, 4+).
6. Marital_Status - 59% of the total transactions were made by unmarried customers and 41% by married customers.
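
The percentages quoted in these insights can be derived directly from the category counts. Below is a minimal sketch; `share_table` is a hypothetical helper name, and the series is rebuilt from the male and female transaction counts reported in this notebook rather than read from the CSV.

```python
import pandas as pd

def share_table(series: pd.Series) -> pd.DataFrame:
    """Return the count and percentage share of each distinct value."""
    counts = series.value_counts()
    percent = (counts / counts.sum() * 100).round(1)
    return pd.DataFrame({'count': counts, 'percent': percent})

# Rebuilt from the reported counts: 414,259 male and 135,809 female transactions
gender = pd.Series(['M'] * 414259 + ['F'] * 135809)
print(share_table(gender))
```

On the real data the same helper could be called as `share_table(df['Gender'])` or on any other categorical column.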
b. Statistical summary of numerical data type columns:
[ ]: df.describe()

[ ]: Purchase
count 550068.000000
mean 9263.968713
std 5023.065394
min 12.000000
25% 5823.000000
50% 8047.000000
75% 12054.000000
max 23961.000000

c.Duplicate Detection:
[ ]: df.duplicated().value_counts()

[ ]: False 550068
Name: count, dtype: int64

Insight: There are no duplicate entries in the dataset.

c. Sanity Check for columns
[ ]: # checking the unique values for columns
for i in df.columns:
    print('Unique Values in',i,'column are :-')
    print(df[i].unique())
    print('-'*70)

Unique Values in User_ID column are :-

[1000001, 1000002, 1000003, 1000004, 1000005, …, 1004588, 1004871, 1004113,
1005391, 1001529]
Length: 5891
Categories (5891, int64): [1000001, 1000002, 1000003, 1000004, …, 1006037,
1006038, 1006039, 1006040]
----------------------------------------------------------------------
Unique Values in Product_ID column are :-
['P00069042', 'P00248942', 'P00087842', 'P00085442', 'P00285442', …,
'P00375436', 'P00372445', 'P00370293', 'P00371644', 'P00370853']
Length: 3631
Categories (3631, object): ['P00000142', 'P00000242', 'P00000342', 'P00000442',
…, 'P0099642', 'P0099742', 'P0099842', 'P0099942']
----------------------------------------------------------------------
Unique Values in Gender column are :-
['F', 'M']
Categories (2, object): ['F', 'M']
----------------------------------------------------------------------
Unique Values in Age column are :-
['0-17', '55+', '26-35', '46-50', '51-55', '36-45', '18-25']
Categories (7, object): ['0-17', '18-25', '26-35', '36-45', '46-50', '51-55',
'55+']
----------------------------------------------------------------------
Unique Values in Occupation column are :-
[10, 16, 15, 7, 20, …, 18, 5, 14, 13, 6]
Length: 21
Categories (21, int64): [0, 1, 2, 3, …, 17, 18, 19, 20]
----------------------------------------------------------------------
Unique Values in City_Category column are :-
['A', 'C', 'B']
Categories (3, object): ['A', 'B', 'C']
----------------------------------------------------------------------
Unique Values in Stay_In_Current_City_Years column are :-
['2', '4+', '3', '1', '0']
Categories (5, object): ['0', '1', '2', '3', '4+']
----------------------------------------------------------------------
Unique Values in Marital_Status column are :-
[0, 1]
Categories (2, int64): [0, 1]
----------------------------------------------------------------------
Unique Values in Product_Category column are :-
[3, 1, 12, 8, 5, …, 10, 17, 9, 20, 19]
Length: 20
Categories (20, int64): [1, 2, 3, 4, …, 17, 18, 19, 20]
----------------------------------------------------------------------
Unique Values in Purchase column are :-
[ 8370 15200 1422 … 135 123 613]
----------------------------------------------------------------------
Insights:
The dataset does not contain any abnormal values.
We will replace the 0 and 1 in the Marital_Status column with 'Unmarried' and 'Married' respectively.
[ ]: #replacing the values in marital_status column
df['Marital_Status'] = df['Marital_Status'].replace({0:'Unmarried',1:'Married'})
df['Marital_Status'].unique()

[ ]: ['Unmarried', 'Married']
Categories (2, object): ['Unmarried', 'Married']

d. Missing value Analysis
[ ]: df.isnull().sum()

[ ]: User_ID 0
Product_ID 0
Gender 0
Age 0
Occupation 0
City_Category 0
Stay_In_Current_City_Years 0
Marital_Status 0
Product_Category 0
Purchase 0
dtype: int64

Insights: The dataset does not contain any missing values.

3.Univariate Analysis:
3.1 Numerical Variables
3.1.1 Purchase Amount Distribution
[ ]: #setting the plot style
fig = plt.figure(figsize = (15,10))
gs = fig.add_gridspec(2,1,height_ratios=[0.65, 0.35])

#creating purchase amount histogram
ax0 = fig.add_subplot(gs[0,0])
ax0.hist(df['Purchase'],color= '#5C8374',linewidth=0.5,edgecolor='black',bins = 20)
ax0.set_xlabel('Purchase Amount',fontsize = 12,fontweight = 'bold')
ax0.set_ylabel('Frequency',fontsize = 12,fontweight = 'bold')

#removing the axis lines
for s in ['top','left','right']:
    ax0.spines[s].set_visible(False)

#setting title for visual
ax0.set_title('Purchase Amount Distribution',{'font':'serif', 'size':15,'weight':'bold'})

#creating box plot for purchase amount
ax1 = fig.add_subplot(gs[1,0])
boxplot = ax1.boxplot(x = df['Purchase'],vert = False,patch_artist = True,widths = 0.5)

# Customize box color
boxplot['boxes'][0].set(facecolor='#5C8374')
# Customize median line
boxplot['medians'][0].set(color='red')
# Customize outlier markers
for flier in boxplot['fliers']:
    flier.set(marker='o', markersize=8, markerfacecolor= "#4b4b4c")

#removing the axis lines
for s in ['top','left','right']:
    ax1.spines[s].set_visible(False)

#adding 5 point summary annotations
#each whisker's x-data is a pair: (Q1, lower limit) and (Q3, upper limit)
info = [i.get_xdata() for i in boxplot['whiskers']]
median = df['Purchase'].quantile(0.5) #getting Q2

for i,j in info: #each entry of info unpacks into the two ends of one whisker
    ax1.annotate(text = f"{i:.1f}", xy = (i,1), xytext = (i,1.4),fontsize = 12,
                 arrowprops= dict(arrowstyle="<-", lw=1, connectionstyle="arc,rad=0"))
    ax1.annotate(text = f"{j:.1f}", xy = (j,1), xytext = (j,1.4),fontsize = 12,
                 arrowprops= dict(arrowstyle="<-", lw=1, connectionstyle="arc,rad=0"))

#adding the median separately because it is not part of the whisker data in info
ax1.annotate(text = f"{median:.1f}",xy = (median,1),xytext = (median + 1,1.4),fontsize = 12,
             arrowprops= dict(arrowstyle="<-", lw=1, connectionstyle="arc,rad=0"))

#removing y-axis ticks
ax1.set_yticks([])
#adding axis label
ax1.set_xlabel('Purchase Amount',fontweight = 'bold',fontsize = 12)
plt.show()

Calculating the Number of Outliers:
As seen above, the upper whisker of the box plot ends at 21399, so any purchase amount above 21399 is treated as an outlier. We count
these outliers below.
[ ]: len(df.loc[df['Purchase'] > 21399,'Purchase'])

[ ]: 2677
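
The 21399 cut-off is read off the box plot. As a cross-check, here is a minimal sketch of the usual box-plot rule, assuming the default whisker length of 1.5 times the interquartile range; `upper_fence` is a hypothetical helper name, and the quartiles are the ones reported by `df.describe()` above.

```python
import numpy as np

def upper_fence(values, k=1.5):
    """Upper outlier fence: Q3 + k * IQR."""
    q1, q3 = np.percentile(values, [25, 75])
    return q3 + k * (q3 - q1)

# Quartiles taken from the describe() output: Q1 = 5823, Q3 = 12054
q1, q3 = 5823, 12054
fence = q3 + 1.5 * (q3 - q1)
print(fence)  # 21400.5
```

The whisker ends at the largest observation that does not exceed the fence, which is why the plot shows 21399 rather than 21400.5. On the real data, `upper_fence(df['Purchase'])` should reproduce the same fence.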

Insights:
Outliers:
There are 2,677 outliers in total, roughly 0.49% of the purchase amount data. We will not remove them, as they reflect a broad
range of spending behaviors during the sale and highlight the importance of tailoring marketing strategies to both regular and
high-value customers to maximize revenue.
Distribution:
The middle 50% of transactions fall between 5,823 USD and 12,054 USD, with a median purchase amount of 8,047 USD. The lower
limit of 12 USD and the upper limit of 21,399 USD reveal significant variability in customer spending.
3.2 Categorical Variables:
3.2.1 Gender, Marital Status and City Category Distribution:

[ ]: #setting the plot style
fig = plt.figure(figsize = (15,12))
gs = fig.add_gridspec(1,3)

# creating pie chart for gender distribution
ax0 = fig.add_subplot(gs[0,0])
color_map = ["#3A7089", "#4b4b4c"]
ax0.pie(df['Gender'].value_counts().values,labels = df['Gender'].value_counts().index,autopct = '%.1f%%',
        shadow = True,colors = color_map,textprops={'fontsize': 13, 'color': 'black'})
#setting title for visual
ax0.set_title('Gender Distribution',{'font':'serif', 'size':15,'weight':'bold'})

# creating pie chart for marital status
ax1 = fig.add_subplot(gs[0,1])
color_map = ["#3A7089", "#4b4b4c"]
ax1.pie(df['Marital_Status'].value_counts().values,labels = df['Marital_Status'].value_counts().index,autopct = '%.1f%%',
        shadow = True,colors = color_map,textprops={'fontsize': 13, 'color': 'black'})
#setting title for visual
ax1.set_title('Marital Status Distribution',{'font':'serif', 'size':15,'weight':'bold'})

# creating pie chart for city category
ax2 = fig.add_subplot(gs[0,2])
color_map = ["#3A7089", "#4b4b4c",'#99AEBB']
ax2.pie(df['City_Category'].value_counts().values,labels = df['City_Category'].value_counts().index,autopct = '%.1f%%',
        shadow = True,colors = color_map,textprops={'fontsize': 13, 'color': 'black'})
#setting title for visual
ax2.set_title('City Category Distribution',{'font':'serif', 'size':15,'weight':'bold'})

plt.show()

Insights:
1. Gender Distribution - Data indicates a significant disparity in purchase behavior between
males and females during the Black Friday event.
2. Marital Status - Given that unmarried customers account for a higher percentage of transactions,
it may be worthwhile to consider specific marketing campaigns or promotions that
appeal to this group.
3. City Category - City B saw the highest number of transactions, followed by City C and City A
respectively.
3.2.2 Customer Age Distribution
[ ]: #setting the plot style
fig = plt.figure(figsize = (15,7))
gs = fig.add_gridspec(1,2,width_ratios=[0.6, 0.4])

# creating bar chart for age distribution
ax0 = fig.add_subplot(gs[0,0])
temp = df['Age'].value_counts()
color_map = ["#3A7089", "#4b4b4c",'#99AEBB','#5C8374','#6F7597','#7A9D54','#9EB384']
ax0.bar(x=temp.index,height = temp.values,color = color_map,zorder = 2)

#adding the value_counts
for i in temp.index:
    ax0.text(i,temp[i]+5000,temp[i],{'font':'serif','size' : 10},ha = 'center',va = 'center')

#adding grid lines
ax0.grid(color = 'black',linestyle = '--', axis = 'y', zorder = 0, dashes = (5,10))

#removing the axis lines
for s in ['top','left','right']:
    ax0.spines[s].set_visible(False)

#adding axis label
ax0.set_ylabel('Count',fontweight = 'bold',fontsize = 12)
ax0.set_xlabel('Age Group',fontweight = 'bold',fontsize = 12)
ax0.set_xticklabels(temp.index,fontweight = 'bold')

#creating an info table for age
ax1 = fig.add_subplot(gs[0,1])
age_info = [['26-35','40%'],['36-45','20%'],['18-25','18%'],['46-50','8%'],['51-55','7%'],['55+','4%'],
            ['0-17','3%']]
color_2d =␣
↪[["#3A7089",'#FFFFFF'],["#4b4b4c",'#FFFFFF'],['#99AEBB','#FFFFFF'],['#5C8374','#FFFFFF'],['#

['#7A9D54','#FFFFFF'],['#9EB384','#FFFFFF']]
table = ax1.table(cellText = age_info, cellColours=color_2d, cellLoc='center',colLabels =['Age Group','Percent Dist.'],
                  colLoc = 'center',bbox =[0, 0, 1, 1])
table.set_fontsize(15)

#removing axis
ax1.axis('off')

#setting title for visual
fig.suptitle('Customer Age Distribution',font = 'serif', size = 18, weight = 'bold')
plt.show()

Insights:
The 26-35 age group represents the largest share of Walmart's Black Friday sales, accounting
for 40% of the transactions. This suggests that young and middle-aged adults are the most active and
interested in shopping for deals and discounts.
The 36-45 and 18-25 age groups are the second and third largest segments, with 20%
and 18% of the transactions respectively. This indicates that Walmart has a diverse customer base that covers different
life stages and preferences.
The 46-50, 51-55, 55+, and 0-17 age groups are the smallest customer segments, each with less than 10%
of the total transactions. This implies that Walmart may need to improve its marketing strategies
and product offerings to attract more customers from these age groups, especially seniors and
children.
3.2.3 Customer Stay In current City Distribution
[ ]: #setting the plot style
fig = plt.figure(figsize = (15,7))
gs = fig.add_gridspec(1,2,width_ratios=[0.6, 0.4])

# creating bar chart for Customer Stay In current City
ax1 = fig.add_subplot(gs[0,0])
temp = df['Stay_In_Current_City_Years'].value_counts()
color_map = ["#3A7089", "#4b4b4c",'#99AEBB','#5C8374','#6F7597']
ax1.bar(x=temp.index,height = temp.values,color = color_map,zorder = 2,width = 0.6)

#adding the value_counts
for i in temp.index:
    ax1.text(i,temp[i]+4000,temp[i],{'font':'serif','size' : 10},ha = 'center',va = 'center')

#adding grid lines
ax1.grid(color = 'black',linestyle = '--', axis = 'y', zorder = 0, dashes = (5,10))

#removing the axis lines
for s in ['top','left','right']:
    ax1.spines[s].set_visible(False)

#adding axis label
ax1.set_ylabel('Count',fontweight = 'bold',fontsize = 12)
ax1.set_xlabel('Stay in Years',fontweight = 'bold',fontsize = 12)
ax1.set_xticklabels(temp.index,fontweight = 'bold')

#creating an info table for Customer Stay In current City
ax2 = fig.add_subplot(gs[0,1])
stay_info = [['1','35%'],['2','19%'],['3','17%'],['4+','15%'],['0','14%']]
color_2d =␣
↪[["#3A7089",'#FFFFFF'],["#4b4b4c",'#FFFFFF'],['#99AEBB','#FFFFFF'],['#5C8374','#FFFFFF'],['#

table = ax2.table(cellText = stay_info, cellColours=color_2d, cellLoc='center',colLabels =['Stay in Years','Percent Dist.'],
                  colLoc = 'center',bbox =[0, 0, 1, 1])
table.set_fontsize(15)

#removing axis
ax2.axis('off')

#setting title for visual
fig.suptitle('Customer Current City Stay Distribution',font = 'serif', size = 18, weight = 'bold')
plt.show()

Insights:
The data suggests that many customers are either new to the city or move frequently, and may have
different preferences and needs than long-term residents.
Nearly half of the transactions (49%) come from customers who have stayed in their current city for one year or less. This
suggests that Walmart has a strong appeal to newcomers who may be looking for affordable and
convenient shopping options.
The 4+ years category (15%) indicates that Walmart also has a base of customers who have
been living in the same city for a long time.
Beyond the first year, the share of transactions decreases as the stay in the current city increases, which suggests that
Walmart may benefit from targeting long-term residents with loyalty programs and promotions.
3.2.4 Top 10 Products and Categories:
A snapshot of the 10 products and the 10 product categories with the most transactions during the Black Friday
sale.
[ ]: #setting the plot style
fig = plt.figure(figsize = (15,6))
gs = fig.add_gridspec(1,2)

#Top 10 Product_ID Sales
ax = fig.add_subplot(gs[0,0])
temp = df['Product_ID'].value_counts()[0:10]
# reversing the order so the best seller is drawn at the top
temp = temp.iloc[-1:-11:-1]
color_map = ['#99AEBB' for i in range(7)] + ["#3A7089" for i in range(3)]
#creating the plot
ax.barh(y = temp.index,width = temp.values,height = 0.2,color = color_map)
ax.scatter(y = temp.index, x = temp.values, s = 150 , color = color_map )
#removing x-axis ticks
ax.set_xticks([])
#adding label to each bar
for y,x in zip(temp.index,temp.values):
    ax.text( x + 50 , y , x,{'font':'serif', 'size':10,'weight':'bold'},va='center')

#removing the axis lines
for s in ['top','bottom','right']:
    ax.spines[s].set_visible(False)

#adding axis labels
ax.set_xlabel('Units Sold',{'font':'serif', 'size':10,'weight':'bold'})
ax.set_ylabel('Product ID',{'font':'serif', 'size':12,'weight':'bold'})
#creating the title
ax.set_title('Top 10 Product_ID with Maximum Sales',
             {'font':'serif', 'size':15,'weight':'bold'})

#Top 10 Product Category Sales
ax = fig.add_subplot(gs[0,1])
temp = df['Product_Category'].value_counts()[0:10]
# reversing the order so the best-selling category is drawn at the top
temp = temp.iloc[-1:-11:-1]
#creating the plot
ax.barh(y = temp.index,width = temp.values,height = 0.2,color = color_map)
ax.scatter(y = temp.index, x = temp.values, s = 150 , color = color_map )
#removing x-axis ticks
ax.set_xticks([])
#adding label to each bar
for y,x in zip(temp.index,temp.values):
    ax.text( x + 5000 , y , x,{'font':'serif', 'size':10,'weight':'bold'},va='center')

#removing the axis lines
for s in ['top','bottom','right']:
    ax.spines[s].set_visible(False)

#adding axis labels
ax.set_xlabel('Units Sold',{'font':'serif', 'size':12,'weight':'bold'})
ax.set_ylabel('Product Category',{'font':'serif', 'size':12,'weight':'bold'})
#creating the title
ax.set_title('Top 10 Product Category with Maximum Sales',
             {'font':'serif', 'size':15,'weight':'bold'})
plt.show()

Insights:
1. Top 10 Products Sold - The top-selling products during Walmart's Black Friday sales show
a relatively small variation in sales numbers, suggesting that Walmart offers
a variety of products that many different customers like to buy.
2. Top 10 Product Categories - Categories 5, 1 and 8 have significantly outperformed the other
categories, together accounting for nearly 75% of all transactions, which suggests a strong preference
for these products among customers.
3.2.5 Top 10 Customer Occupation
Top 10 Occupation of Customer in Black Friday Sales
[ ]: temp = df['Occupation'].value_counts()[0:10]
#setting the plot style
fig,ax = plt.subplots(figsize = (13,6))
color_map = ["#3A7089" for i in range(3)] + ['#99AEBB' for i in range(7)]
#creating the plot
ax.bar(temp.index,temp.values, color = color_map, zorder = 2)
#adding value counts
for x,y in zip(temp.index,temp.values):
    ax.text(x, y + 2000, y,{'font':'serif', 'size':10,'weight':'bold'},va='center',ha = 'center')

#setting grid style
ax.grid(color = 'black',linestyle = '--',axis = 'y',zorder = 0,dashes = (5,10))
#customizing the axis labels
ax.set_xticklabels(temp.index,fontweight = 'bold',fontfamily='serif')
ax.set_xlabel('Occupation Category',{'font':'serif', 'size':12,'weight':'bold'})
ax.set_ylabel('Count',{'font':'serif', 'size':12,'weight':'bold'})
#removing the axis lines
for s in ['top','left','right']:
    ax.spines[s].set_visible(False)

#adding title to the visual
ax.set_title('Top 10 Occupation of Customers',
             {'font':'serif', 'size':15,'weight':'bold'})
plt.show()

Insights:
Customers in occupation categories 4, 0 and 7 together account for almost 37% of the total
purchases, suggesting that these occupations have a high demand for Walmart products or services,
or that they have more disposable income to spend on Black Friday.
4.Bivariate Analysis:
4.1 Exploring Purchase Patterns
[ ]: #setting the plot style
fig = plt.figure(figsize = (15,20))
gs = fig.add_gridspec(3,2)
for i,j,k in␣
↪[(0,0,'Gender'),(0,1,'City_Category'),(1,0,'Marital_Status'),(1,1,'Stay_In_Current_City_Year

    #plot position
    if i <= 1:
        ax0 = fig.add_subplot(gs[i,j])
    else:
        ax0 = fig.add_subplot(gs[i,:])

    #plot
    color_map = ["#3A7089", "#4b4b4c",'#99AEBB','#5C8374','#6F7597','#7A9D54','#9EB384']
    sns.boxplot(data = df, x = k, y = 'Purchase' ,ax = ax0,width = 0.5, palette = color_map)

    #plot title
    ax0.set_title(f'Purchase Amount Vs {k}',{'font':'serif', 'size':12,'weight':'bold'})

    #customizing axis
    #keeping the tick labels assigned by seaborn (category order); df[k].unique() is in
    #order of appearance and would put the wrong label under some boxes
    ax0.tick_params(axis = 'x', labelsize = 12)
    plt.setp(ax0.get_xticklabels(), fontweight = 'bold')
    ax0.set_ylabel('Purchase Amount',fontweight = 'bold',fontsize = 12)
    ax0.set_xlabel('')

plt.show()

Insights:
Out of all the variables analysed above, it’s noteworthy that the purchase amount remains relatively
stable regardless of the variable under consideration. As indicated in the data, the median purchase
amount consistently hovers around 8,000 USD , regardless of the specific variable being examined.
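
The medians that the box plots display can also be tabulated directly. The sketch below uses a made-up five-row frame (an assumption, not the real dataset) to show the pattern; on the notebook's data the equivalent call would be something like `df.groupby('Gender', observed=True)['Purchase'].median()`.

```python
import pandas as pd

# Toy data for illustration only
toy = pd.DataFrame({
    'Gender': ['F', 'F', 'M', 'M', 'M'],
    'Purchase': [8370, 1422, 7969, 15200, 1057],
})

# One median per group, the same statistic a box plot marks with its centre line
medians = toy.groupby('Gender')['Purchase'].median()
print(medians)
```

Repeating the call for each variable examined above gives a compact numeric check on the claim that the medians stay close to one another.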

5. Gender vs Purchase Amount:
5.1 Data Visualization:
[ ]: #creating a df for purchase amount vs gender
temp = df.groupby('Gender')['Purchase'].agg(['sum','count']).reset_index()
#calculating the amount in billions
temp['sum_in_billions'] = round(temp['sum'] / 10**9,2)
#calculating each gender's share of the total purchase amount (as a fraction)
temp['%sum'] = round(temp['sum']/temp['sum'].sum(),3)
#calculating the average amount per purchase
temp['per_purchase'] = round(temp['sum']/temp['count'])
#renaming the gender
temp['Gender'] = temp['Gender'].replace({'F':'Female','M':'Male'})
temp

[ ]: Gender sum count sum_in_billions %sum per_purchase

0 Female 1186232642 135809 1.19 0.233 8735.0
1 Male 3909580100 414259 3.91 0.767 9438.0

[ ]: #setting the plot style
fig = plt.figure(figsize = (15,14))
gs = fig.add_gridspec(3,2,height_ratios =[0.10,0.4,0.5])

#Distribution of Purchase Amount
ax = fig.add_subplot(gs[0,:])
#plotting the visual (both segments share one bar; the male segment starts where the female one ends)
ax.barh(temp.loc[0,'Gender'],width = temp.loc[0,'%sum'],color = "#3A7089",label = 'Female')
ax.barh(temp.loc[0,'Gender'],width = temp.loc[1,'%sum'],left =temp.loc[0,'%sum'], color = "#4b4b4c",label = 'Male' )

#inserting the text
txt = 0.0 #running left offset of the current segment
for i in temp.index:
    #for amount
    ax.text(temp.loc[i,'%sum']/2 + txt,0.15,f"${temp.loc[i,'sum_in_billions']} Billion",
            va = 'center', ha='center',fontsize=18, color='white')
    #for gender
    ax.text(temp.loc[i,'%sum']/2 + txt,- 0.20 ,f"{temp.loc[i,'Gender']}",
            va = 'center', ha='center',fontsize=14, color='white')
    txt += temp.loc[i,'%sum']

#removing the axis lines
for s in ['top','left','right','bottom']:
    ax.spines[s].set_visible(False)

#customizing ticks
ax.set_xticks([])
ax.set_yticks([])
ax.set_xlim(0,1)
#plot title
ax.set_title('Gender-Based Purchase Amount Distribution',{'font':'serif', 'size':15,'weight':'bold'})

#Distribution of Purchase Amount per Transaction
ax1 = fig.add_subplot(gs[1,0])
color_map = ["#3A7089", "#4b4b4c"]
#plotting the visual
ax1.bar(temp['Gender'],temp['per_purchase'],color = color_map,zorder = 2,width = 0.3)

#adding average transaction line
avg = round(df['Purchase'].mean())
ax1.axhline(y = avg, color ='red', zorder = 0,linestyle = '--')
#adding text for the line
ax1.text(0.4,avg + 300, f"Avg. Transaction Amount ${avg:.0f}",
         {'font':'serif','size' : 12},ha = 'center',va = 'center')
#adjusting the ylimits
ax1.set_ylim(0,11000)
#adding the per-purchase values inside the bars
for i in temp.index:
    ax1.text(temp.loc[i,'Gender'],temp.loc[i,'per_purchase']/2,f"${temp.loc[i,'per_purchase']:.0f}",
             {'font':'serif','size' : 12,'color':'white','weight':'bold' },ha = 'center',va = 'center')

#adding grid lines
ax1.grid(color = 'black',linestyle = '--', axis = 'y', zorder = 0, dashes = (5,10))

#removing the axis lines
for s in ['top','left','right']:
    ax1.spines[s].set_visible(False)

#adding axis label
ax1.set_ylabel('Purchase Amount',fontweight = 'bold',fontsize = 12)
ax1.set_xticklabels(temp['Gender'],fontweight = 'bold',fontsize = 12)
#setting title for visual
ax1.set_title('Average Purchase Amount per Transaction',{'font':'serif', 'size':15,'weight':'bold'})

# creating pie chart for gender distribution
ax2 = fig.add_subplot(gs[1,1])
color_map = ["#3A7089", "#4b4b4c"]
ax2.pie(temp['count'],labels = temp['Gender'],autopct = '%.1f%%',
        shadow = True,colors = color_map,wedgeprops = {'linewidth': 5},textprops={'fontsize': 13, 'color': 'black'})
#setting title for visual
ax2.set_title('Gender-Based Transaction Distribution',{'font':'serif', 'size':15,'weight':'bold'})

# creating kdeplot for purchase amount distribution
ax3 = fig.add_subplot(gs[2,:])
#plotting the kdeplot
sns.kdeplot(data = df, x = 'Purchase', hue = 'Gender', palette = color_map,fill = True, alpha = 1,ax = ax3)

#removing the axis lines
for s in ['top','left','right']:
    ax3.spines[s].set_visible(False)

# adjusting axis labels
ax3.set_yticks([])
ax3.set_ylabel('')
ax3.set_xlabel('Purchase Amount',fontweight = 'bold',fontsize = 12)
#setting title for visual
ax3.set_title('Purchase Amount Distribution by Gender',{'font':'serif', 'size':15,'weight':'bold'})

plt.show()

Insights:
1. Total Sales and Transactions Comparison The total purchase amount and the number of transactions
by male customers were each more than three times those of female
customers, indicating that male customers had a more significant impact on the Black Friday sales.
2. Average Transaction Value The average purchase amount per transaction was slightly higher
for male customers than for female customers ($9438 vs $8735).
3. Distribution of Purchase Amount As seen above, the purchase amount is not normally distributed
for either gender.
5.2 Confidence Interval Construction: Estimating Average Purchase Amount per Transaction
1. Step 1 - Building CLT Curve As seen above, the purchase amount distribution is not normal,
so we rely on the Central Limit Theorem. It states that, for a sufficiently large sample size, the distribution of sample means
approximates a normal distribution regardless of the shape of the underlying population distribution (provided the population variance is finite).
2. Step 2 - Building Confidence Interval After building the CLT curve, we will create confidence
intervals for the population mean at the 99%, 95% and 90% confidence levels.
Note - We will use sample sizes of [100, 1000, 5000, 50000].
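
The two steps above can be sketched in a few lines before the full plotting code. This is only an illustration under stated assumptions: the population is a synthetic right-skewed sample (not the Walmart data), `bootstrap_means` is a hypothetical helper, and the resampling is vectorised with NumPy instead of looping.

```python
import numpy as np

def bootstrap_means(values, sample_size, n_resamples=2000, seed=0):
    """Means of n_resamples samples of size sample_size, drawn with replacement."""
    rng = np.random.default_rng(seed)
    values = np.asarray(values)
    # Each row of idx holds the positions of one resample
    idx = rng.integers(0, len(values), size=(n_resamples, sample_size))
    return values[idx].mean(axis=1)

# Synthetic, clearly non-normal population (assumption for illustration)
population = np.random.default_rng(1).exponential(scale=9000, size=50_000)

for n in (100, 1000):
    means = bootstrap_means(population, n)
    lower, upper = np.percentile(means, [2.5, 97.5])  # central 95% of the sample means
    print(f"n={n}: 95% interval {lower:.0f} to {upper:.0f}, width {upper - lower:.0f}")
```

Even though the population is skewed, the sample means cluster symmetrically around the population mean, and the interval narrows as the sample size grows.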

[55]: #creating a function to calculate confidence interval
def confidence_interval(data,ci):
    #lower and upper percentiles that bound the central ci% of the data
    l_ci = (100-ci)/2
    u_ci = (100+ci)/2

    #calculating lower limit and upper limit of confidence interval
    interval = np.percentile(data,[l_ci,u_ci]).round(0)
    return interval

[77]: #defining a function for plotting the visual for given confidence interval
def plot(ci):
    #setting the plot style
    fig = plt.figure(figsize = (15,8))
    gs = fig.add_gridspec(2,2)

    #creating separate series of purchase amounts for each gender
    df_male = df.loc[df['Gender'] == 'M','Purchase']
    df_female = df.loc[df['Gender'] == 'F','Purchase']

    #sample sizes and corresponding plot positions
    sample_sizes = [(100,0,0),(1000,0,1),(5000,1,0),(50000,1,1)]
    #number of samples to be taken from purchase amount
    bootstrap_samples = 20000
    male_samples = {}
    female_samples = {}

    for i,x,y in sample_sizes:
        male_means = [] #list for collecting the means of male sample
        female_means = [] #list for collecting the means of female sample
        for j in range(bootstrap_samples):
            #drawing one random sample of size i (with replacement) for each gender
            male_bootstrapped_samples = np.random.choice(df_male,size = i)
            female_bootstrapped_samples = np.random.choice(df_female,size = i)
            #calculating mean of those samples
            male_sample_mean = np.mean(male_bootstrapped_samples)
            female_sample_mean = np.mean(female_bootstrapped_samples)
            #appending the mean to the list
            male_means.append(male_sample_mean)
            female_means.append(female_sample_mean)

        #storing the above sample generated
        male_samples[f'{ci}%_{i}'] = male_means
        female_samples[f'{ci}%_{i}'] = female_means

        #creating a temporary dataframe for creating kdeplot
        temp_df = pd.DataFrame(data = {'male_means':male_means,'female_means':female_means})

        #plotting kdeplots
        #plot position
        ax = fig.add_subplot(gs[x,y])
        #plots for male and female
        sns.kdeplot(data = temp_df,x = 'male_means',color ="#3A7089" ,fill = True, alpha = 0.5,ax = ax,label = 'Male')
        sns.kdeplot(data = temp_df,x = 'female_means',color ="#4b4b4c" ,fill = True, alpha = 0.5,ax = ax,label = 'Female')

        #calculating confidence intervals for given confidence level(ci)
        m_range = confidence_interval(male_means,ci)
        f_range = confidence_interval(female_means,ci)
        #plotting confidence interval on the distribution
        for k in m_range:
            ax.axvline(x = k,ymax = 0.9, color ="#3A7089",linestyle = '--')
        for k in f_range:
            ax.axvline(x = k,ymax = 0.9, color ="#4b4b4c",linestyle = '--')

        #removing the axis lines
        for s in ['top','left','right']:
            ax.spines[s].set_visible(False)
        # adjusting axis labels
        ax.set_yticks([])
        ax.set_ylabel('')
        ax.set_xlabel('')
        #setting title for visual
        ax.set_title(f'CLT Curve for Sample Size = {i}',{'font':'serif', 'size':11,'weight':'bold'})
        plt.legend()

    #setting title for visual
    fig.suptitle(f'{ci}% Confidence Interval',font = 'serif', size = 18, weight = 'bold')
    plt.show()

    return male_samples,female_samples

[78]: m_samp_90,f_samp_90 = plot(90)

[79]: m_samp_95,f_samp_95 = plot(95)

[80]: m_samp_99,f_samp_99 = plot(99)

Are confidence intervals of average male and female spending overlapping?

[83]: fig = plt.figure(figsize = (20,10))
gs = fig.add_gridspec(3,1)
for i,j,k,l in [(m_samp_90,f_samp_90,90,0),(m_samp_95,f_samp_95,95,1),(m_samp_99,f_samp_99,99,2)]:

    #list for collecting ci for given cl
    m_ci = ['Male']
    f_ci = ['Female']

    #finding ci for each sample size (males)
    for m in i:
        m_range = confidence_interval(i[m],k)
        m_ci.append(f"CI = ${m_range[0]:.0f} - ${m_range[1]:.0f}, Range = {(m_range[1] - m_range[0]):.0f}")

    #finding ci for each sample size (females)
    for f in j:
        f_range = confidence_interval(j[f],k)
        f_ci.append(f"CI = ${f_range[0]:.0f} - ${f_range[1]:.0f}, Range = {(f_range[1] - f_range[0]):.0f}")

    #plotting the summary
    ax = fig.add_subplot(gs[l])

    #contents of the table
    ci_info = [m_ci,f_ci]

    #plotting the table
    table = ax.table(cellText = ci_info, cellLoc='center',
                     colLabels =['Gender','Sample Size = 100','Sample Size = 1000','Sample Size = 5000','Sample Size = 50000'],
                     colLoc = 'center',colWidths = [0.05,0.2375,0.2375,0.2375,0.2375],bbox =[0, 0, 1, 1])

    table.set_fontsize(13)
    #removing axis
    ax.axis('off')

    #setting title
    ax.set_title(f"{k}% Confidence Interval Summary",{'font':'serif', 'size':14,'weight':'bold'})

Insights:
1. Sample Size: The analysis highlights the importance of sample size in estimating population
parameters. As the sample size increases, the confidence intervals become narrower and the
estimates more precise. In business terms, larger samples provide more reliable insights and
estimates.
2. Confidence Intervals: Except at a sample size of 100, the confidence intervals for males and
females do not overlap. This indicates a statistically significant difference between the
average spending per transaction of men and women within the given samples.
3. Population Average: We are 95% confident that the true population average for males falls
between $9,393 and $9,483, and for females between $8,692 and $8,777.
4. Women spend less: On average, men spend more per transaction than women; the upper bounds
of the confidence intervals for men are consistently higher than those for women across the
different sample sizes.
5. How can Walmart leverage this conclusion to make changes or improvements?
5.1. Segmentation Opportunities: Walmart can create targeted marketing campaigns, loyalty
programs, or product bundles to cater to the distinct spending behaviors of male and female
customers. This approach may help maximize revenue from each customer segment.
5.2. Pricing Strategies: Based on the average spending per transaction by gender, Walmart might
adjust pricing or discount strategies to incentivize higher spending among male customers
while ensuring competitive pricing for female-oriented products.
Note: Moving forward in our analysis, we will use the 95% confidence level only.
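The overlap check behind insights 1 and 2 can be sketched on its own. The snippet below is a minimal illustration, assuming a percentile-based bootstrap interval. The helper names bootstrap_means, percentile_ci and intervals_overlap are hypothetical (they are not the notebook's confidence_interval function), and the spending values are synthetic stand-ins rather than the Walmart data.

```python
import numpy as np

def bootstrap_means(values, sample_size, n_boot=1000, seed=0):
    """Means of n_boot resamples (with replacement), each of the given size."""
    rng = np.random.default_rng(seed)
    samples = rng.choice(values, size=(n_boot, sample_size), replace=True)
    return samples.mean(axis=1)

def percentile_ci(means, level=95):
    """Percentile interval: trim (100 - level) / 2 percent from each tail."""
    tail = (100 - level) / 2
    lower, upper = np.percentile(means, [tail, 100 - tail])
    return lower, upper

def intervals_overlap(a, b):
    """True if the closed intervals a = (lo, hi) and b = (lo, hi) share a point."""
    return a[0] <= b[1] and b[0] <= a[1]

# Synthetic, right-skewed "purchase amounts" (an assumption, NOT the real dataset)
rng = np.random.default_rng(42)
group_a = rng.gamma(shape=3.0, scale=3100.0, size=20000)
group_b = rng.gamma(shape=3.0, scale=2900.0, size=20000)

for n in (100, 2000):
    ci_a = percentile_ci(bootstrap_means(group_a, n, seed=1))
    ci_b = percentile_ci(bootstrap_means(group_b, n, seed=2))
    print(f"n={n}: A={ci_a}, B={ci_b}, overlap={intervals_overlap(ci_a, ci_b)}")
```

Applied to the 95% intervals quoted in insight 3, intervals_overlap((9393, 9483), (8692, 8777)) returns False, which is the non-overlap the insight relies on.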
6. Marital Status vs Purchase Amount:
6.1. Data Visualization
[84]: #creating a df for purchase amount vs marital status
temp = df.groupby('Marital_Status')['Purchase'].agg(['sum','count']).reset_index()

#calculating the amount in billions
temp['sum_in_billions'] = round(temp['sum'] / 10**9,2)
#calculating percentage distribution of purchase amount
temp['%sum'] = round(temp['sum']/temp['sum'].sum(),3)
#calculating per purchase amount
temp['per_purchase'] = round(temp['sum']/temp['count'])
temp

[84]: Marital_Status sum count sum_in_billions %sum per_purchase

0 Unmarried 3008927447 324731 3.01 0.59 9266.0
1 Married 2086885295 225337 2.09 0.41 9261.0

[85]: #setting the plot style

fig = plt.figure(figsize = (15,14))
gs = fig.add_gridspec(3,2,height_ratios =[0.10,0.4,0.5])
#Distribution of Purchase Amount
ax = fig.add_subplot(gs[0,:])
#plotting the visual
ax.barh(temp.loc[0,'Marital_Status'],width = temp.loc[0,'%sum'],color = "#3A7089",label = 'Unmarried')
ax.barh(temp.loc[0,'Marital_Status'],width = temp.loc[1,'%sum'],left =temp.loc[0,'%sum'], color = "#4b4b4c",label = 'Married')

#inserting the text
txt = 0.0 #running left offset for ax.text()
for i in temp.index:
    #for amount
    ax.text(temp.loc[i,'%sum']/2 + txt,0.15,f"${temp.loc[i,'sum_in_billions']} Billion",
            va = 'center', ha='center',fontsize=18, color='white')

    #for marital status
    ax.text(temp.loc[i,'%sum']/2 + txt,- 0.20 ,f"{temp.loc[i,'Marital_Status']}",
            va = 'center', ha='center',fontsize=14, color='white')

    txt += temp.loc[i,'%sum']

#removing the axis lines
for s in ['top','left','right','bottom']:
    ax.spines[s].set_visible(False)

#customizing ticks
ax.set_xticks([])
ax.set_yticks([])
ax.set_xlim(0,1)
#plot title
ax.set_title('Marital_Status-Based Purchase Amount Distribution',{'font':'serif', 'size':15,'weight':'bold'})

#Distribution of Purchase Amount per Transaction
ax1 = fig.add_subplot(gs[1,0])
color_map = ["#3A7089", "#4b4b4c"]
#plotting the visual
ax1.bar(temp['Marital_Status'],temp['per_purchase'],color = color_map,zorder = 2,width = 0.3)

#adding average transaction line

avg = round(df['Purchase'].mean())
ax1.axhline(y = avg, color ='red', zorder = 0,linestyle = '--')
#adding text for the line
ax1.text(0.4,avg + 300, f"Avg. Transaction Amount ${avg:.0f}",
{'font':'serif','size' : 12},ha = 'center',va = 'center')
#adjusting the ylimits
ax1.set_ylim(0,11000)
#adding the value_counts
for i in temp.index:
    ax1.text(temp.loc[i,'Marital_Status'],temp.loc[i,'per_purchase']/2,f"${temp.loc[i,'per_purchase']:.0f}",
             {'font':'serif','size' : 12,'color':'white','weight':'bold' },ha = 'center',va = 'center')

#adding grid lines
ax1.grid(color = 'black',linestyle = '--', axis = 'y', zorder = 0, dashes = (5,10))

#removing the axis lines
for s in ['top','left','right']:
    ax1.spines[s].set_visible(False)

#adding axis label

ax1.set_ylabel('Purchase Amount',fontweight = 'bold',fontsize = 12)
ax1.set_xticklabels(temp['Marital_Status'],fontweight = 'bold',fontsize = 12)
#setting title for visual
ax1.set_title('Average Purchase Amount per Transaction',{'font':'serif', 'size':15,'weight':'bold'})

# creating pie chart for Marital_Status distribution
ax2 = fig.add_subplot(gs[1,1])
color_map = ["#3A7089", "#4b4b4c"]
ax2.pie(temp['count'],labels = temp['Marital_Status'],autopct = '%.1f%%',
        shadow = True,colors = color_map,wedgeprops = {'linewidth': 5},textprops={'fontsize': 13, 'color': 'black'})

#setting title for visual
ax2.set_title('Marital_Status-Based Transaction Distribution',{'font':'serif', 'size':15,'weight':'bold'})

# creating kdeplot for purchase amount distribution

ax3 = fig.add_subplot(gs[2,:])
color_map = [ "#4b4b4c","#3A7089"]
#plotting the kdeplot
sns.kdeplot(data = df, x = 'Purchase', hue = 'Marital_Status', palette = color_map,fill = True, alpha = 1,
            ax = ax3,hue_order = ['Married','Unmarried'])
#removing the axis lines
for s in ['top','left','right']:
    ax3.spines[s].set_visible(False)

# adjusting axis labels

ax3.set_yticks([])
ax3.set_ylabel('')
ax3.set_xlabel('Purchase Amount',fontweight = 'bold',fontsize = 12)
#setting title for visual
ax3.set_title('Purchase Amount Distribution by Marital_Status',{'font':'serif', 'size':15,'weight':'bold'})

plt.show()

Insights:
1. Total Sales and Transactions Comparison: Unmarried customers accounted for about 59% of both
the total purchase amount and the number of transactions, versus about 41% for married
customers, indicating that they had a larger impact on the Black Friday sales.
2. Average Transaction Value: The average purchase amount per transaction was nearly identical
for married and unmarried customers ($9,261 vs $9,266).
3. Distribution of Purchase Amount: As seen above, the purchase amount is not normally
distributed for either married or unmarried customers.
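The shares and per-transaction averages quoted in these insights can be recomputed directly from the totals printed by cell [84]. The following is a small sketch rather than part of the original notebook: the sums and counts are copied from that output, while the names summary, gap and relative_gap are introduced here for illustration.

```python
import pandas as pd

# Totals and transaction counts copied from the output of cell [84]
summary = pd.DataFrame({
    "Marital_Status": ["Unmarried", "Married"],
    "sum": [3008927447, 2086885295],
    "count": [324731, 225337],
})

# Share of total purchase amount and of transactions per group
summary["amount_share"] = summary["sum"] / summary["sum"].sum()
summary["transaction_share"] = summary["count"] / summary["count"].sum()
# Average purchase amount per transaction
summary["per_purchase"] = summary["sum"] / summary["count"]

# Absolute and relative gap between the two per-transaction averages
gap = summary.loc[0, "per_purchase"] - summary.loc[1, "per_purchase"]
relative_gap = gap / summary.loc[1, "per_purchase"]

print(summary.round(3))
print(f"Per-purchase gap: ${gap:.2f} ({relative_gap:.3%} of the married average)")
```

Expressing the gap relative to the married average makes it easier to judge how small the difference in per-transaction spending is compared with the difference in total volume.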
7. Customer Age vs Purchase Amount:
7.1. Data Visualization
[86]: #creating a df for purchase amount vs age group
temp = df.groupby('Age')['Purchase'].agg(['sum','count']).reset_index()

#calculating the amount in billions
temp['sum_in_billions'] = round(temp['sum'] / 10**9,2)
#calculating percentage distribution of purchase amount
temp['%sum'] = round(temp['sum']/temp['sum'].sum(),3)
#calculating per purchase amount
temp['per_purchase'] = round(temp['sum']/temp['count'])
temp

[86]: Age sum count sum_in_billions %sum per_purchase

0 0-17 134913183 15102 0.13 0.026 8933.0
1 18-25 913848675 99660 0.91 0.179 9170.0
2 26-35 2031770578 219587 2.03 0.399 9253.0
3 36-45 1026569884 110013 1.03 0.201 9331.0
4 46-50 420843403 45701 0.42 0.083 9209.0
5 51-55 367099644 38501 0.37 0.072 9535.0
6 55+ 200767375 21504 0.20 0.039 9336.0

[87]: #setting the plot style

fig = plt.figure(figsize = (20,14))
gs = fig.add_gridspec(3,1,height_ratios =[0.10,0.4,0.5])
#Distribution of Purchase Amount
ax = fig.add_subplot(gs[0])
color_map = ["#3A7089", "#4b4b4c",'#99AEBB','#5C8374','#6F7597','#7A9D54','#9EB384']

#plotting the visual

left = 0
for i in temp.index:
    ax.barh(temp.loc[0,'Age'],width = temp.loc[i,'%sum'],left = left,color = color_map[i],label = temp.loc[i,'Age'])
    left += temp.loc[i,'%sum']
#inserting the text
txt = 0.0 #for left parameter in ax.text()
for i in temp.index:
    #for amount
    ax.text(temp.loc[i,'%sum']/2 + txt,0.15,f"{temp.loc[i,'sum_in_billions']}B",
            va = 'center', ha='center',fontsize=14, color='white')

    #for age grp
    ax.text(temp.loc[i,'%sum']/2 + txt,- 0.20 ,f"{temp.loc[i,'Age']}",
            va = 'center', ha='center',fontsize=12, color='white')

    txt += temp.loc[i,'%sum']

#removing the axis lines
for s in ['top','left','right','bottom']:
    ax.spines[s].set_visible(False)

#customizing ticks
ax.set_xticks([])
ax.set_yticks([])
ax.set_xlim(0,1)
#plot title
ax.set_title('Age Group Purchase Amount Distribution',{'font':'serif', 'size':15,'weight':'bold'})

#Distribution of Purchase Amount per Transaction
ax1 = fig.add_subplot(gs[1])
#plotting the visual
ax1.bar(temp['Age'],temp['per_purchase'],color = color_map,zorder = 2,width = 0.3)

#adding average transaction line

avg = round(df['Purchase'].mean())
ax1.axhline(y = avg, color ='red', zorder = 0,linestyle = '--')
#adding text for the line
ax1.text(0.4,avg + 300, f"Avg. Transaction Amount ${avg:.0f}",
{'font':'serif','size' : 12},ha = 'center',va = 'center')
#adjusting the ylimits
ax1.set_ylim(0,11000)
#adding the value_counts
for i in temp.index:
    ax1.text(temp.loc[i,'Age'],temp.loc[i,'per_purchase']/2,f"${temp.loc[i,'per_purchase']:.0f}",
             {'font':'serif','size' : 12,'color':'white','weight':'bold' },ha = 'center',va = 'center')

#adding grid lines
ax1.grid(color = 'black',linestyle = '--', axis = 'y', zorder = 0, dashes = (5,10))

#removing the axis lines
for s in ['top','left','right']:
    ax1.spines[s].set_visible(False)

#adding axis label

ax1.set_ylabel('Purchase Amount',fontweight = 'bold',fontsize = 12)
ax1.set_xticklabels(temp['Age'],fontweight = 'bold',fontsize = 12)
#setting title for visual
ax1.set_title('Average Purchase Amount per Transaction',{'font':'serif', 'size':15,'weight':'bold'})

# creating kdeplot for purchase amount distribution

ax3 = fig.add_subplot(gs[2,:])
#plotting the kdeplot
sns.kdeplot(data = df, x = 'Purchase', hue = 'Age', palette = color_map,fill = True, alpha = 0.5,
            ax = ax3)
#removing the axis lines
for s in ['top','left','right']:
    ax3.spines[s].set_visible(False)

# adjusting axis labels

ax3.set_yticks([])
ax3.set_ylabel('')
ax3.set_xlabel('Purchase Amount',fontweight = 'bold',fontsize = 12)
#setting title for visual
ax3.set_title('Purchase Amount Distribution by Age Group',{'font':'serif', 'size':15,'weight':'bold'})

plt.show()

Insights:
1. Total Sales Comparison: The 26-35 and 36-45 age groups together account for almost 60% of
total sales, suggesting that Walmart's Black Friday sales are most popular among these age
groups. The 0-17 age group has the lowest sales share (2.6%), which is expected, as they may
not have as much purchasing power.
2. Average Transaction Value: While per-purchase spending does not differ much across age
groups, the 51-55 age group has a relatively low sales share (7.2%) but the highest
per-purchase spending at $9,535. Walmart could consider strategies to attract and retain this
high-spending demographic.
3. Distribution of Purchase Amount: As seen above, the purchase amount is not normally
distributed for any age group.
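The age-group figures cited above can be checked against the table printed by cell [86]. The sketch below rebuilds that table from the printed sums and counts. The variable names age_summary, share_26_45, top_spender and lowest_share are introduced here for illustration and do not appear in the original notebook.

```python
import pandas as pd

# Sums and transaction counts copied from the output of cell [86]
age_summary = pd.DataFrame({
    "Age": ["0-17", "18-25", "26-35", "36-45", "46-50", "51-55", "55+"],
    "sum": [134913183, 913848675, 2031770578, 1026569884,
            420843403, 367099644, 200767375],
    "count": [15102, 99660, 219587, 110013, 45701, 38501, 21504],
}).set_index("Age")

# Share of total sales and average amount per transaction for each age group
age_summary["share"] = age_summary["sum"] / age_summary["sum"].sum()
age_summary["per_purchase"] = age_summary["sum"] / age_summary["count"]

# Combined share of the 26-35 and 36-45 groups
share_26_45 = age_summary.loc[["26-35", "36-45"], "share"].sum()
# Group with the highest per-transaction spending and group with the lowest share
top_spender = age_summary["per_purchase"].idxmax()
lowest_share = age_summary["share"].idxmin()

print(age_summary.round(3))
print(f"26-45 share of sales: {share_26_45:.1%}")
print(f"Highest per-purchase spending: {top_spender}")
print(f"Lowest sales share: {lowest_share}")
```

Keeping share and per_purchase side by side shows why a group can contribute little to total sales while still spending the most per transaction.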