Rajendra Task-2
Roll No : 21F01A0503
College : ST.ANN'S COLLEGE OF ENGINEERING AND TECHNOLOGY
Mail id: [email protected]
Course : Datascience
Task No : Task 2
PROGRAM:-
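# Parts of this listing are missing from the report; lines marked "assumed" are reconstructions.
import pandas as pd
import matplotlib.pyplot as plt
import gender_guesser.detector as gender_detector  # assumed library; its labels match those used below

detector = gender_detector.Detector()  # assumed

try:
    # file_path is set in a part of the listing that is missing
    df = pd.read_csv(file_path, encoding='latin-1')
except Exception:  # assumed; the original except clause is missing
    print('error')  # the start of the original message is missing
    exit()

def infer_gender_from_name(name):
    gender = detector.get_gender(name)  # assumed call
    if gender == 'male' or gender == 'mostly_male':
        return 'male'
    elif gender == 'female' or gender == 'mostly_female':
        return 'female'
    else:
        return 'unknown'

df['gender'] = df['name'].apply(infer_gender_from_name)
gender_counts = df['gender'].value_counts()
plt.figure(figsize=(8,6))
gender_counts.plot(kind='bar',color=['blue','pink'])
plt.title("Count of Males and Females")
plt.xlabel('Gender')
plt.ylabel('Count')
plt.show()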
OUTPUT:-
2) Give the sum of amounts spent by each gender and plot the corresponding graph.
gender
female     18725393.49
male       22052081.95
unknown    65471656.99
Name: Amount, dtype: float64
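The program for this task is not included in the report. A minimal sketch of how such per-gender totals could be computed is given below. It assumes a DataFrame with the `gender` and `Amount` columns seen in the output above; the sample rows and the function name `amount_by_gender` are made up for illustration and are not taken from the dataset.

```python
import pandas as pd

# Hypothetical sample rows; the report itself reads 'Diwali Sales Data.csv'.
sample = pd.DataFrame({
    "gender": ["female", "male", "female", "unknown"],
    "Amount": [100.0, 250.0, 50.0, 75.0],
})

def amount_by_gender(df):
    """Sum the 'Amount' column within each 'gender' group."""
    return df.groupby("gender")["Amount"].sum()

totals = amount_by_gender(sample)
print(totals)
```

Calling `totals.plot(kind='bar')` followed by `plt.show()`, as in the other listings, would draw the corresponding graph.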
3)Count each age group and provide individual counts grouped by gender.
OUTPUT:-
Gender        F     M
Age Group
0-17        162   134
18-25      1305   574
26-35      3271  1272
36-45      1581   705
46-50       696   291
51-55       554   278
55+         273   155
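The program for this task is also missing. A table of this shape could be produced with `pd.crosstab`, as in the sketch below. The column names `Age Group` and `Gender` are taken from the output above; the sample rows and the function name `age_group_counts` are hypothetical.

```python
import pandas as pd

# Hypothetical sample rows, not taken from the dataset.
sample = pd.DataFrame({
    "Age Group": ["26-35", "26-35", "18-25", "26-35"],
    "Gender": ["F", "M", "F", "F"],
})

def age_group_counts(df):
    """Count rows for every (age group, gender) pair; missing pairs become 0."""
    return pd.crosstab(df["Age Group"], df["Gender"])

counts = age_group_counts(sample)
print(counts)
```

Summing a row, for example `counts.loc["26-35"].sum()`, gives the overall count for that age group.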
4)Plot the total amount spent by each age group.
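The program and output for this task are missing from the report. One possible approach is sketched below, assuming the `Age Group` and `Amount` columns used elsewhere in the report. The sample rows and both function names are hypothetical, and the plot labels are only suggestions.

```python
import pandas as pd

# Hypothetical sample rows, not taken from the dataset.
sample = pd.DataFrame({
    "Age Group": ["18-25", "26-35", "26-35", "55+"],
    "Amount": [1000.0, 2500.0, 500.0, 750.0],
})

def amount_by_age_group(df):
    """Sum the 'Amount' column within each age group."""
    return df.groupby("Age Group")["Amount"].sum()

def plot_amount_by_age_group(totals):
    """Draw the totals as a bar chart, in the style of the other listings."""
    import matplotlib.pyplot as plt  # imported here so the aggregation runs without it
    plt.figure(figsize=(10, 6))
    totals.plot(kind="bar")
    plt.title("Total Amount Spent by Each Age Group")
    plt.xlabel("Age Group")
    plt.ylabel("Total Amount Spent")
    plt.xticks(rotation=45)
    plt.show()

age_totals = amount_by_age_group(sample)
print(age_totals)
```

Calling `plot_amount_by_age_group(age_totals)` would then display the chart.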
5)Plot the total number of orders from the top 10 states.
PROGRAM:-
import pandas as pd
import matplotlib.pyplot as plt
df=pd.read_csv('Diwali Sales Data.csv',encoding='unicode_escape')
state_order_counts=df['State'].value_counts()
top_10_states=state_order_counts.head(10)
plt.figure(figsize=(10,6))
top_10_states.plot(kind='bar')
plt.title('Total Number of Orders from Top 10 States')
plt.xlabel('State')
plt.ylabel('Number of Orders')
plt.xticks(rotation=45)
plt.show()
OUTPUT:-
6)Determine the total amount spent in the top 10 states.
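The program and output for this task are missing from the report. The sketch below shows one way to do it, assuming "top 10 states" means the states with the largest total `Amount` (it could also mean the top states by order count). The `State` and `Amount` columns appear in the other listings; the sample rows and the function name `top_states_by_amount` are hypothetical.

```python
import pandas as pd

# Hypothetical sample rows, not taken from the dataset.
sample = pd.DataFrame({
    "State": ["Delhi", "Kerala", "Delhi", "Gujarat", "Kerala"],
    "Amount": [500.0, 200.0, 300.0, 100.0, 150.0],
})

def top_states_by_amount(df, n=10):
    """Total 'Amount' per state, keeping the n states with the largest totals."""
    return df.groupby("State")["Amount"].sum().nlargest(n)

# n=2 only because the sample has three states; the task would use n=10.
top_states = top_states_by_amount(sample, n=2)
print(top_states)
```

The result is sorted from largest to smallest, so it can be passed directly to `.plot(kind='bar')`.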
7)Compare the number of married and unmarried individuals.
PROGRAM:-
import pandas as pd
import matplotlib.pyplot as plt
df=pd.read_csv('Diwali Sales Data.csv',encoding='unicode_escape')
num_married=(df['Marital_Status']==1).sum() #Assuming 1 represents married
num_unmarried=(df['Marital_Status']==0).sum() #Assuming 0 represents unmarried
Statuses=['Married','Unmarried']
counts=[num_married,num_unmarried]
plt.figure(figsize=(8,6))
plt.bar(Statuses,counts,color=['blue','orange'])
plt.title('Comparison between Married and Unmarried Individuals')
plt.xlabel('Marital Status')
plt.ylabel('Count')
plt.show()
OUTPUT:-
8)Plot the amount spent by males and females based on marital status.
PROGRAM:-
import pandas as pd
import matplotlib.pyplot as plt
df=pd.read_csv('Diwali Sales Data.csv',encoding='unicode_escape')
grouped=df.groupby(['Gender','Marital_Status'])['Amount'].sum().reset_index()
fig,ax=plt.subplots(figsize=(8,6))
male_data=grouped[grouped['Gender']=='M']
female_data=grouped[grouped['Gender']=='F']
bar_width=0.35
bar_positions_male=male_data['Marital_Status']
bar_positions_female=female_data['Marital_Status']+bar_width
ax.bar(bar_positions_male,male_data['Amount'],width=bar_width,label='Male')
ax.bar(bar_positions_female,female_data['Amount'],width=bar_width,label='Female')
ax.set_xlabel('Marital Status(0: Single, 1: Married)')
ax.set_ylabel('Amount Spent')
ax.set_title('Amount Spent by Marital Status and Gender')
ax.set_xticks([bar_width/2,1+bar_width/2]) #Centre each tick between the paired bars
ax.set_xticklabels(['Single','Married'])
ax.legend()
plt.tight_layout()
plt.show()
OUTPUT:-
9)Plot the count of each occupation present in the dataset.
PROGRAM:-
OUTPUT:-
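The program for task 9 is missing from the report. A minimal sketch using `value_counts` is given below, assuming the `Occupation` column used in the other listings. The sample rows and the function name `occupation_counts` are hypothetical.

```python
import pandas as pd

# Hypothetical sample rows, not taken from the dataset.
sample = pd.DataFrame({
    "Occupation": ["IT Sector", "Healthcare", "IT Sector", "Banking", "IT Sector"],
})

def occupation_counts(df):
    """Number of rows per occupation, most frequent first."""
    return df["Occupation"].value_counts()

occ_counts = occupation_counts(sample)
print(occ_counts)
```

Calling `occ_counts.plot(kind='bar')` followed by `plt.show()` would draw the counts as a bar chart.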
10)Plot the total amount spent by each occupation in descending order.
PROGRAM:-
import pandas as pd
import matplotlib.pyplot as plt
df=pd.read_csv('Diwali Sales Data.csv',encoding='unicode_escape')
occupation_amounts=df.groupby('Occupation')['Amount'].sum()
occupation_amounts_sorted=occupation_amounts.sort_values(ascending=False)
plt.figure(figsize=(10,6))
occupation_amounts_sorted.plot(kind='bar',color='green')
plt.title('Total Amount Spent by Each Occupation(Descending Order)')
plt.xlabel('Occupation')
plt.ylabel('Total Amount Spent')
plt.xticks(rotation=45)
plt.show()
OUTPUT:-
11)Provide a statistical analysis of each product category based on the percentage of orders completed.
PROGRAM:-
OUTPUT:-
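The program for task 11 is missing from the report. The sketch below shows one possible reading of the task: each category's share of all orders, together with a few summary statistics. The column names `Product_Category` and `Orders` are assumptions (they do not appear in the surviving listings), and the sample rows and the function name `order_share_by_category` are hypothetical.

```python
import pandas as pd

# Hypothetical sample rows; 'Product_Category' and 'Orders' are assumed column names.
sample = pd.DataFrame({
    "Product_Category": ["Food", "Auto", "Food", "Auto", "Food"],
    "Orders": [3, 1, 2, 1, 1],
})

def order_share_by_category(df):
    """Per-category order total, mean and row count, plus share of all orders in percent."""
    stats = df.groupby("Product_Category")["Orders"].agg(["sum", "mean", "count"])
    stats["percent_of_orders"] = 100 * stats["sum"] / stats["sum"].sum()
    return stats.sort_values("percent_of_orders", ascending=False)

category_stats = order_share_by_category(sample)
print(category_stats)
```

The `percent_of_orders` column adds up to 100 across all categories, which gives a quick check on the result.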
PROGRAM:-
OUTPUT:-
ANSWER:-
4)Marital Status:-
5)Zone:-
6)Occupation:-
1)Popular Products:-
2)Demographic Preferences:-
3)Regional Variances:-
4)Seasonal Trends:-
6)Optimization Opportunities:-