
Untitled 11


import pandas as pd

import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
sns.set(style='whitegrid')

1:Data Loading
file_path=(r'C:\Users\Administrator\Desktop\DA-jupyter\Book1.xlsx')
data=pd.ExcelFile(file_path)
data.sheet_names

# Load individual sheets for processing


gameplay_data = data.parse('User Gameplay Data')
print(gameplay_data.columns)

deposit_data = data.parse('Deposite Data')


print(deposit_data.columns)

withdraw_data = data.parse('Withdraw Data')


print(withdraw_data.columns)

Index(['User ID', 'Games Played', 'Datetime'], dtype='object')


Index(['User Id', 'Datetime', 'Amount'], dtype='object')
Index(['User Id', 'Datetime', 'Amount'], dtype='object')
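The sheets spell the user-id column differently ('User ID' in the gameplay sheet, 'User Id' in the deposit and withdrawal sheets), which matters later when the summaries are merged on a single key. A minimal sketch of normalizing the name up front; the helper name `standardize_user_id` and the toy values are illustrative assumptions, not part of the notebook:

```python
import pandas as pd

def standardize_user_id(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy whose user-id column is always named 'User ID'."""
    # rename() ignores keys that are absent, so this is safe for every sheet.
    return df.rename(columns={"User Id": "User ID"})

# Toy frame that mimics the deposit sheet's header (values are made up).
toy_deposits = pd.DataFrame({"User Id": [1, 2], "Amount": [500, 1200]})
toy_deposits = standardize_user_id(toy_deposits)
print(list(toy_deposits.columns))
```

With the name fixed at load time, the later `.rename(columns={'User Id':'User ID'})` calls after each groupby would no longer be needed.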

2: Data Cleaning
#converting the datetime columns for accurate time filtering

#1: gameplay_data
gameplay_data['Datetime']=pd.to_datetime(gameplay_data['Datetime'])

#2: deposit_data
deposit_data['Datetime']=pd.to_datetime(deposit_data['Datetime'])

#3: withdraw_data
withdraw_data['Datetime']=pd.to_datetime(withdraw_data['Datetime'])

# getting only unique user


unique_user1=gameplay_data['User ID'].unique()

unique_user2=deposit_data['User Id'].unique()

unique_user3=withdraw_data['User Id'].unique()

gameplay_data.info()
deposit_data.info()
withdraw_data.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 355266 entries, 0 to 355265
Data columns (total 3 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 User ID 355266 non-null int64
1 Games Played 355266 non-null int64
2 Datetime 355266 non-null datetime64[ns]
dtypes: datetime64[ns](1), int64(2)
memory usage: 8.1 MB
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 17438 entries, 0 to 17437
Data columns (total 3 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 User Id 17438 non-null int64
1 Datetime 17438 non-null datetime64[ns]
2 Amount 17438 non-null int64
dtypes: datetime64[ns](1), int64(2)
memory usage: 408.8 KB
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 3566 entries, 0 to 3565
Data columns (total 3 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 User Id 3566 non-null int64
1 Datetime 3566 non-null datetime64[ns]
2 Amount 3566 non-null int64
dtypes: datetime64[ns](1), int64(2)
memory usage: 83.7 KB

A: Final Loyalty Point Formula

Loyalty Point = (0.01 * Deposit amount) + (0.005 * Withdrawal amount) + (0.001 * max(#deposits - #withdrawals, 0)) + (0.2 * Number of games played)

At the end of each month, the total loyalty points are allotted to all players, and the top 50 players receive cash benefits.
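The formula can be checked by hand for a single player. The sketch below writes it as a plain function; the function name, argument names, and the sample numbers are illustrative assumptions rather than values from the dataset:

```python
def loyalty_points(deposit_amount, withdrawal_amount,
                   num_deposits, num_withdrawals, games_played):
    """Loyalty points for one player, following the formula term by term."""
    return (
        0.01 * deposit_amount                              # 1% of deposits
        + 0.005 * withdrawal_amount                        # 0.5% of withdrawals
        + 0.001 * max(num_deposits - num_withdrawals, 0)   # never negative
        + 0.2 * games_played                               # 0.2 per game
    )

# Hypothetical player: 1000 deposited over 3 deposits, 500 withdrawn once, 10 games.
# Terms: 10 + 2.5 + 0.002 + 2.0
example = loyalty_points(1000, 500, 3, 1, 10)
print(round(example, 3))
```

Note that the third term uses the number of deposits and withdrawals, not their amounts, and drops to zero whenever withdrawals outnumber deposits.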

#1: gameplay_data
gameplay_data['Datetime']=pd.to_datetime(gameplay_data['Datetime'])

#2: deposit_data
deposit_data['Datetime']=pd.to_datetime(deposit_data['Datetime'])
#3: withdraw_data
withdraw_data['Datetime']=pd.to_datetime(withdraw_data['Datetime'])

october_gameplay=gameplay_data[gameplay_data['Datetime'].dt.month==10]

october_deposite=deposit_data[deposit_data['Datetime'].dt.month==10]

october_withdraw=withdraw_data[withdraw_data['Datetime'].dt.month==10]

# The agg() function (an alias for aggregate()) performs aggregation
# operations on DataFrames and Series.
# df.agg(['sum', 'mean'])            applies sum and mean to all numeric columns
# df.agg({'A': 'sum', 'B': 'mean'})  applies sum to 'A' and mean to 'B'

deposit_summary = october_deposite.groupby('User Id').agg(
    total_deposit_amount=('Amount', 'sum'),
    num_deposits=('Amount', 'size')
).reset_index().rename(columns={'User Id':'User ID'})

withdraw_summary = october_withdraw.groupby('User Id').agg(
    total_withdraw_amount=('Amount', 'sum'),
    num_withdraw=('Amount', 'size')
).reset_index().rename(columns={'User Id':'User ID'})

gameplay_summery=october_gameplay.groupby('User ID').agg(
    total_gameplay=('Games Played','sum')
).reset_index()

# Outer merges keep users who appear in only one sheet; their missing totals become 0
gameplay_summary=pd.merge(deposit_summary,withdraw_summary,on='User ID',how='outer')
gameplay_summary=pd.merge(gameplay_summary,gameplay_summery,on='User ID',how='outer').fillna(0)

gameplay_summary['Deposite_points']=0.01*gameplay_summary['total_deposit_amount']
gameplay_summary['Withdrwal_points']=0.005*gameplay_summary['total_withdraw_amount']
# Clip at zero so the term matches the formula: 0.001 * max(#deposits - #withdrawals, 0)
gameplay_summary['num_points']=0.001*(gameplay_summary['num_deposits']-gameplay_summary['num_withdraw']).clip(lower=0)
gameplay_summary['gameplay_points']=0.2*gameplay_summary['total_gameplay']

gameplay_summary['Loyalty_points']=(gameplay_summary['Deposite_points']
    +gameplay_summary['Withdrwal_points']
    +gameplay_summary['num_points']
    +gameplay_summary['gameplay_points'])

gameplay_sort=gameplay_summary.sort_values(['Loyalty_points'],ascending=False)

print(gameplay_sort[['User ID','Loyalty_points']])

#Figure...
gameplay_sort_top_5=gameplay_sort.head(5)
plt.figure(figsize=(10,6))
sns.barplot(x='Loyalty_points',y='User ID',data=gameplay_sort_top_5)
plt.title('Loyalty Point Formula ')
plt.show()

User ID Loyalty_points
377 795 14649.440
196 599 10716.765
433 852 10250.183
544 972 10068.710
112 502 10049.121
.. ... ...
657 643 0.400
624 388 0.400
678 993 0.400
645 507 0.400
620 384 0.200

[680 rows x 2 columns]
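The aggregate-then-merge pattern used above is easier to follow on a few rows. The sketch below runs the same steps (named aggregation, outer merge, `fillna(0)`) on made-up toy frames; the frame names and values are assumptions for illustration only:

```python
import pandas as pd

# Toy inputs: user 1 only deposits, user 3 only plays, user 2 does both.
deposits = pd.DataFrame({"User Id": [1, 1, 2], "Amount": [100, 300, 50]})
games = pd.DataFrame({"User ID": [2, 3], "Games Played": [4, 7]})

# Named aggregation: one output column per (source column, function) pair.
deposit_summary = (
    deposits.groupby("User Id")
    .agg(total_deposit_amount=("Amount", "sum"), num_deposits=("Amount", "size"))
    .reset_index()
    .rename(columns={"User Id": "User ID"})
)
game_summary = (
    games.groupby("User ID")
    .agg(total_gameplay=("Games Played", "sum"))
    .reset_index()
)

# The outer merge keeps every user; activity a user never had becomes 0.
summary = pd.merge(deposit_summary, game_summary, on="User ID", how="outer").fillna(0)
print(summary)
```

An inner merge here would keep only user 2, which is why the notebook needs `how='outer'` to score players who never deposited or never played.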


1. Find player-wise loyalty points earned by
players in the following slots:
a. 2nd October Slot S1
b. 16th October Slot S2
c. 18th October Slot S1
d. 26th October Slot S2

gameplay_data['Datetime'] = pd.to_datetime(gameplay_data['Datetime'],
errors='coerce')
withdraw_data['Datetime'] = pd.to_datetime(withdraw_data['Datetime'],
errors='coerce')
deposit_data['Datetime'] = pd.to_datetime(deposit_data['Datetime'],
errors='coerce')

def filter_data(data,date,slot):
    # S1 covers 00:00 to 12:00 and S2 covers 12:00 to 23:59 (both ends inclusive)
    start_time,end_time=('00:00','12:00') if slot=='S1' else ('12:00','23:59')
    start_datetime=pd.to_datetime(f'{date} {start_time}')
    end_datetime=pd.to_datetime(f'{date} {end_time}')
    return data[(data['Datetime']>=start_datetime) &
                (data['Datetime']<=end_datetime)]

target_slots = [
{"date": "2022-10-02", "slot": "S1"},
{"date": "2022-10-16", "slot": "S2"},
{"date": "2022-10-18", "slot": "S1"},
{"date": "2022-10-26", "slot": "S2"}
]

Loyalty_points=[]

for slot_info in target_slots:
    date,slot=slot_info['date'],slot_info['slot']
    slot_game=filter_data(gameplay_data,date,slot)
    slot_withdraw=filter_data(withdraw_data,date,slot)
    slot_deposite=filter_data(deposit_data,date,slot)

    deposit_summary = slot_deposite.groupby('User Id').agg(
        total_deposit_amount=('Amount', 'sum'),
        num_deposits=('Amount', 'size')
    ).reset_index().rename(columns={'User Id':'User ID'})

    withdraw_summary = slot_withdraw.groupby('User Id').agg(
        total_withdraw_amount=('Amount', 'sum'),
        num_withdraw=('Amount', 'size')
    ).reset_index().rename(columns={'User Id':'User ID'})

    gameplay_summery=slot_game.groupby('User ID').agg(
        total_gameplay=('Games Played','sum')
    ).reset_index()

    gameplay_summary=pd.merge(deposit_summary,withdraw_summary,on='User ID',how='outer')
    gameplay_summary=pd.merge(gameplay_summary,gameplay_summery,on='User ID',how='outer').fillna(0)

    gameplay_summary['total_loyality_points']=(
        0.01*gameplay_summary['total_deposit_amount']+
        0.005*gameplay_summary['total_withdraw_amount']+
        0.001*(gameplay_summary['num_deposits']-gameplay_summary['num_withdraw']).clip(lower=0)+
        0.2*gameplay_summary['total_gameplay']
    )

    gameplay_summary['date']=date
    gameplay_summary['slot']=slot

    Loyalty_points.append(gameplay_summary[['User ID','total_loyality_points','date','slot']])

    # Combine the slots processed so far and preview the first rows
    dp=pd.concat(Loyalty_points)
    print(dp.head())

#figure
dp_head=dp.head(10)
plt.figure(figsize=(10,6))
sns.lineplot(x='total_loyality_points',y='User ID',data=dp_head)
plt.show()

Empty DataFrame
Columns: [User ID, total_loyality_points, date, slot]
Index: []
User ID total_loyality_points date slot
0 541 637.013 2022-10-16 S2
1 542 167.555 2022-10-16 S2
2 543 1.001 2022-10-16 S2
3 544 1490.423 2022-10-16 S2
4 546 75.002 2022-10-16 S2
User ID total_loyality_points date slot
0 541 637.013 2022-10-16 S2
1 542 167.555 2022-10-16 S2
2 543 1.001 2022-10-16 S2
3 544 1490.423 2022-10-16 S2
4 546 75.002 2022-10-16 S2
User ID total_loyality_points date slot
0 541 637.013 2022-10-16 S2
1 542 167.555 2022-10-16 S2
2 543 1.001 2022-10-16 S2
3 544 1490.423 2022-10-16 S2
4 546 75.002 2022-10-16 S2
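The slot boundaries in `filter_data` are inclusive at both ends, so an event at exactly 12:00:00 would be counted in both S1 and S2, and an event after 23:59:00 would fall in neither. One possible alternative is a half-open interval, sketched below under the assumption that S1 is meant to cover midnight to noon and S2 noon to midnight; the helper names and toy timestamps are hypothetical:

```python
import pandas as pd

def slot_bounds(date, slot):
    """Return (start, end) for a slot; the end is exclusive."""
    day = pd.Timestamp(date)
    noon = day + pd.Timedelta(hours=12)
    if slot == "S1":
        return day, noon
    return noon, day + pd.Timedelta(days=1)

def in_slot(df, date, slot, column="Datetime"):
    """Rows whose timestamp satisfies start <= t < end."""
    start, end = slot_bounds(date, slot)
    return df[(df[column] >= start) & (df[column] < end)]

# Toy events placed right on the boundaries.
events = pd.DataFrame({
    "Datetime": pd.to_datetime([
        "2022-10-16 11:59:59", "2022-10-16 12:00:00",
        "2022-10-16 23:59:30", "2022-10-17 00:00:00",
    ]),
    "Games Played": [1, 2, 3, 4],
})
s1 = in_slot(events, "2022-10-16", "S1")
s2 = in_slot(events, "2022-10-16", "S2")
print(s1["Games Played"].tolist(), s2["Games Played"].tolist())
```

With half-open bounds every timestamp on a given day belongs to exactly one slot, so per-slot totals add up to the daily total.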

2. Calculate overall loyalty points earned and rank players on the basis of
loyalty points in the month of October.
In case of a tie, the number of games played should be taken as the next
criterion for ranking.

#1: gameplay_data
gameplay_data['Datetime']=pd.to_datetime(gameplay_data['Datetime'])

#2: deposit_data
deposit_data['Datetime']=pd.to_datetime(deposit_data['Datetime'])
#3: withdraw_data
withdraw_data['Datetime']=pd.to_datetime(withdraw_data['Datetime'])

october_gameplay=gameplay_data[gameplay_data['Datetime'].dt.month==10]

october_deposite=deposit_data[deposit_data['Datetime'].dt.month==10]

october_withdraw=withdraw_data[withdraw_data['Datetime'].dt.month==10]

# The agg() function (an alias for aggregate()) performs aggregation
# operations on DataFrames and Series.
# df.agg(['sum', 'mean'])            applies sum and mean to all numeric columns
# df.agg({'A': 'sum', 'B': 'mean'})  applies sum to 'A' and mean to 'B'

deposit_summary = october_deposite.groupby('User Id').agg(
    total_deposit_amount=('Amount', 'sum'),
    num_deposits=('Amount', 'size')
).reset_index().rename(columns={'User Id':'User ID'})

withdraw_summary = october_withdraw.groupby('User Id').agg(
    total_withdraw_amount=('Amount', 'sum'),
    num_withdraw=('Amount', 'size')
).reset_index().rename(columns={'User Id':'User ID'})

gameplay_summery=october_gameplay.groupby('User ID').agg(
    total_gameplay=('Games Played','sum')
).reset_index()

gameplay_summary=pd.merge(deposit_summary,withdraw_summary,on='User ID',how='outer')
gameplay_summary=pd.merge(gameplay_summary,gameplay_summery,on='User ID',how='outer').fillna(0)

gameplay_summary['Deposite_points']=0.01*gameplay_summary['total_deposit_amount']
gameplay_summary['Withdrwal_points']=0.005*gameplay_summary['total_withdraw_amount']
# Clip at zero so the term matches the formula: 0.001 * max(#deposits - #withdrawals, 0)
gameplay_summary['num_points']=0.001*(gameplay_summary['num_deposits']-gameplay_summary['num_withdraw']).clip(lower=0)
gameplay_summary['gameplay_points']=0.2*gameplay_summary['total_gameplay']

gameplay_summary['Total_Loyalty_points']=(gameplay_summary['Deposite_points']
    +gameplay_summary['Withdrwal_points']
    +gameplay_summary['num_points']
    +gameplay_summary['gameplay_points'])

# Rank by loyalty points; ties are broken by the number of games played
gameplay_sort=gameplay_summary.sort_values(['Total_Loyalty_points','total_gameplay'],ascending=False)

print(gameplay_sort[['User ID','Total_Loyalty_points']])

User ID Total_Loyalty_points
377 795 14649.440
196 599 10716.765
433 852 10250.183
544 972 10068.710
112 502 10049.121
.. ... ...
657 643 0.400
624 388 0.400
678 993 0.400
645 507 0.400
620 384 0.200

[680 rows x 2 columns]
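The tie-break rule stated for this question (equal loyalty points are ordered by games played) can be sketched by sorting on both columns and numbering the rows. The toy scores below are made up, and the `Rank` column is an illustrative addition:

```python
import pandas as pd

# Users 10 and 12 tie on points; user 12 has played more games.
scores = pd.DataFrame({
    "User ID": [10, 11, 12, 13],
    "Total_Loyalty_points": [50.0, 80.0, 50.0, 20.0],
    "total_gameplay": [5, 1, 9, 30],
})

ranked = scores.sort_values(
    ["Total_Loyalty_points", "total_gameplay"], ascending=[False, False]
).reset_index(drop=True)
ranked["Rank"] = ranked.index + 1  # 1 = best
print(ranked)
```

User 13 has by far the most games but still ranks last, because games played only matters between players whose loyalty points are equal.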

3. What is the average deposit amount?


deposit_data = data.parse('Deposite Data')

average_deposite_amount=deposit_data['Amount'].mean()
round_avg=round(average_deposite_amount)
print(f'Average Deposite Amount is {round_avg}')

Average Deposite Amount is 5492

4. What is the average deposit amount per user in a month?

deposit_data['Datetime']=pd.to_datetime(deposit_data['Datetime'])
october_deposite=deposit_data[deposit_data['Datetime'].dt.month==10]
# Total deposited by each user in October, then the mean of those totals
user_deposite=october_deposite.groupby('User Id')['Amount'].sum()
mean_month=user_deposite.mean()
round_month=mean_month.round()
print(f'The average deposit amount per user in a month is {round_month}')

The average deposit amount per user in a month is 110400.0
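Questions 3 and 4 give very different figures because they average over different units: question 3 averages individual deposit transactions, while question 4 first totals each user's deposits and then averages those totals. A small sketch on made-up numbers:

```python
import pandas as pd

# User 1 makes three small deposits, user 2 makes one large deposit.
deposits = pd.DataFrame({"User Id": [1, 1, 1, 2], "Amount": [100, 200, 300, 1000]})

# Mean over the 4 transactions: 1600 / 4
avg_per_transaction = deposits["Amount"].mean()

# Mean over the 2 per-user totals: (600 + 1000) / 2
avg_per_user = deposits.groupby("User Id")["Amount"].sum().mean()

print(avg_per_transaction, avg_per_user)
```

The per-user average grows with how often each user deposits, which is why it can sit far above the per-transaction average.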


5. What is the average number of games played per user?

# Total games per user across the whole dataset, then the mean of those totals
gameplay_groupby=gameplay_data.groupby('User ID')['Games Played'].sum()
mean_gameplay=gameplay_groupby.mean()
mean=mean_gameplay.round()
print(f'The average number of games played per user are {mean} ')

The average number of games played per user are 355.0

Part B - How much bonus should be allocated to leaderboard players?
After calculating the loyalty points for the whole month find out which 50 players are at the top
of the leaderboard. The company has allocated a pool of Rs 50000 to be given away as bonus
money to the loyal players.

Now the company needs to determine how much bonus money should be given to the players.

Should they base it on the amount of loyalty points? Should it be based on number of games? Or
something else?

That’s for you to figure out.

Suggest a suitable way to divide the allocated money keeping in mind the following points:

1. Only top 50 ranked players are awarded bonus


# # gameplay_data['Datetime']=pd.to_datetime(gameplay_data['Datetime'])
# # deposit_data['Datetime']=pd.to_datetime(deposit_data['Datetime'])
# # withdraw_data['Datetime']=pd.to_datetime(withdraw_data['Datetime'])

# # october_gameplay=gameplay_data[gameplay_data['Datetime'].dt.month==10]
# # october_deposit=deposit_data[deposit_data['Datetime'].dt.month==10]
# # october_withdraw=withdraw_data[withdraw_data['Datetime'].dt.month==10]

# # gameplay_summery=october_gameplay.groupby('User ID').agg(
# #     total_gameplay=('Games Played','sum')
# # ).reset_index()

# # deposit_summary = october_deposite.groupby('User Id').agg(
# #     total_deposit_amount=('Amount', 'sum'),
# #     num_deposits=('Amount', 'size')
# # ).reset_index().rename(columns={'User Id':'User ID'})

# # withdraw_summary = october_withdraw.groupby('User Id').agg(
# #     total_withdraw_amount=('Amount', 'sum'),
# #     num_withdraw=('Amount', 'size')
# # ).reset_index().rename(columns={'User Id':'User ID'})

# # gameplay_summary=pd.merge(deposit_summary,withdraw_summary,on='User ID',how='outer')
# # gameplay_summary=pd.merge(gameplay_summary,gameplay_summery,how='outer',on='User ID').fillna(0)

# # gameplay_summary['deposite_points']=0.01*deposit_summary['total_deposit_amount']
# # gameplay_summary['withdraw_points']=0.005*withdraw_summary['total_withdraw_amount']
# # gameplay_summary['gameplay_points']=0.01*gameplay_summery['total_gameplay']
# # gameplay_summary['num_points']=0.001*gameplay_summary['num_deposits']-gameplay_summary['num_withdraw']

# # gameplay_summary['Total_points']=(gameplay_summary['deposite_points']
# #     +gameplay_summary['withdraw_points']
# #     +gameplay_summary['gameplay_points']+gameplay_summary['num_points'])

# # total_gameplay=gameplay_summary.sort_values(['Total_points'],ascending=(False))
# # print(total_gameplay[['Total_points']].sum().round())

# # #sorting

# # gameplay_summery=october_gameplay.groupby('User ID').agg(
# #     total_gameplay=('Games Played','sum')
# # ).reset_index().head(50)

# # deposit_summary = october_deposite.groupby('User Id').agg(
# #     total_deposit_amount=('Amount', 'sum'),
# #     num_deposits=('Amount', 'size')
# # ).reset_index().rename(columns={'User Id':'User ID'}).head(50)

# # withdraw_summary = october_withdraw.groupby('User Id').agg(
# #     total_withdraw_amount=('Amount', 'sum'),
# #     num_withdraw=('Amount', 'size')
# # ).reset_index().rename(columns={'User Id':'User ID'}).head(50)

# # gameplay_summary=pd.merge(deposit_summary,withdraw_summary,on='User ID',how='outer')
# # gameplay_summary=pd.merge(gameplay_summary,gameplay_summery,how='outer',on='User ID').fillna(0)

# # gameplay_summary['deposite_points']=0.01*deposit_summary['total_deposit_amount']
# # gameplay_summary['withdraw_points']=0.005*withdraw_summary['total_withdraw_amount']
# # gameplay_summary['gameplay_points']=0.01*gameplay_summery['total_gameplay']
# # gameplay_summary['num_points']=0.001*gameplay_summary['num_deposits']-gameplay_summary['num_withdraw']

# # gameplay_summary['sort_points']=(gameplay_summary['deposite_points']
# #     +gameplay_summary['withdraw_points']
# #     +gameplay_summary['gameplay_points']+gameplay_summary['num_points'])

# # sort_gameplay=gameplay_summary.sort_values(['sort_points'],ascending=(False))
# # print(sort_gameplay[['sort_points']].sum().round())

# # #formula for the points

# final=(0.7*total_gameplay)/(0.3*sort_gameplay)*50000
# final

# Top 50 by loyalty points, with games played as the tie-breaker
top_50_players = gameplay_sort.nlargest(50, ['Total_Loyalty_points', 'total_gameplay'])
a = 0.7  # weight on loyalty points
b = 0.3  # weight on games played

top_50_players['weighted_score'] = (a * top_50_players['Total_Loyalty_points'] +
                                    b * top_50_players['total_gameplay'])

total_weighted_score = top_50_players['weighted_score'].sum()

# Each player's bonus is their share of the total weighted score times the Rs 50000 pool
top_50_players['bonus'] = (top_50_players['weighted_score'] /
                           total_weighted_score) * 50000

print(top_50_players[['User ID', 'Total_Loyalty_points', 'total_gameplay', 'bonus']])

#figure

plt.figure(figsize=(10,6))
sns.lineplot(x='Total_Loyalty_points',y='bonus',data=top_50_players)
plt.title('Top 50 players according to the play')
plt.show()

User ID Total_Loyalty_points total_gameplay bonus


377 795 14649.440 617.0 1913.804910
196 599 10716.765 9.0 1375.711419
433 852 10250.183 321.0 1332.996555
544 972 10068.710 18.0 1293.045475
112 502 10049.121 4494.0 1536.693532
35 421 10014.412 0.0 1285.087820
425 844 9519.217 448.0 1246.180673
417 836 9328.400 2430.0 1330.696240
126 519 8607.857 1980.0 1213.485225
378 796 8444.574 90.0 1088.589811
229 634 8420.383 24.0 1081.855798
431 850 8155.805 8.0 1047.024197
552 980 8121.590 63.0 1045.658373
213 618 8046.938 7084.0 1422.205182
152 548 7992.038 31.0 1027.273893
418 837 7877.001 335.0 1029.230667
557 985 7691.396 880.0 1035.385903
332 744 7663.769 579.0 1015.286925
257 663 7661.617 6401.0 1335.197089
483 909 7490.180 1.0 961.223669
238 644 7335.630 343.0 960.199828
517 944 7237.920 28.0 930.337587
464 888 7127.014 582.0 946.573449
481 907 7087.915 57.0 912.683251
169 565 7016.760 1312.0 972.572263
232 637 6901.929 5.0 885.957024
522 949 6471.514 1.0 830.504534
373 790 6412.871 238.0 836.013275
494 920 6411.371 932.0 873.987966
188 587 6359.509 734.0 856.443641
412 831 6331.506 344.0 831.401774
500 927 6261.458 1033.0 860.305143
342 754 6194.309 5702.0 1108.464333
248 654 6102.142 998.0 837.936244
34 352 6087.649 0.0 781.190507
62 450 6012.742 0.0 771.578153
410 829 5917.497 49.0 762.050749
361 776 5858.482 105.0 757.557490
550 978 5824.309 74.0 751.467405
509 936 5434.194 161.0 706.190996
523 950 5423.464 4.0 696.179720
70 458 5408.389 2190.0 814.466345
125 518 5143.473 156.0 668.609581
489 915 5112.019 8.0 656.433884
242 648 5008.265 3.0 642.844793
531 959 5004.174 13.0 642.869779
558 987 4938.355 2171.0 753.104854
598 765 4910.710 24096.0 1955.343154
114 505 4899.165 2308.0 755.610285
220 625 4856.648 23.0 624.488636
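The weighted score above adds loyalty points and raw game counts directly, so the two weights act on quantities with different scales. One possible alternative, shown only as a sketch and not as the notebook's method, is to convert each metric to a share of its column total before weighting; the function name `allocate_by_shares` and the toy players are assumptions, and the sketch assumes both column totals are non-zero:

```python
import pandas as pd

def allocate_by_shares(players, pool, w_points=0.7, w_games=0.3):
    """Split `pool` using weighted shares of points and games (weights sum to 1)."""
    out = players.copy()
    # Each share column sums to 1, so the weights apply on a common scale.
    points_share = out["Total_Loyalty_points"] / out["Total_Loyalty_points"].sum()
    games_share = out["total_gameplay"] / out["total_gameplay"].sum()
    out["bonus"] = pool * (w_points * points_share + w_games * games_share)
    return out

# Toy leaderboard: user 3 has the same points as user 2 but played all the games.
players = pd.DataFrame({
    "User ID": [1, 2, 3],
    "Total_Loyalty_points": [100.0, 50.0, 50.0],
    "total_gameplay": [0, 0, 100],
})
result = allocate_by_shares(players, pool=1000)
print(result[["User ID", "bonus"]])
```

Because the two weights sum to 1 and each share column sums to 1, the bonuses always add up to the full pool.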
Part C
Would you say the loyalty point formula is fair or unfair?

Can you suggest any way to make the loyalty point formula more robust?

Answer:
In my view, the loyalty point formula is fair, because it awards points for each type of activity:
Deposits: points are awarded at 1% of the total deposit amount.
Withdrawals: points are awarded at 0.5% of the total withdrawal amount.
Deposit-withdrawal balance: a small number of additional points is awarded when deposits outnumber withdrawals, based on the difference between the two counts.
Games played: points are heavily weighted here, with 0.2 points awarded per game played.
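To judge how the four components weigh against each other, it can help to look at each one's share of a player's total. The sketch below does this for a hypothetical player; the function name and all input values are assumptions, not figures from the dataset:

```python
def point_shares(deposit_amount, withdrawal_amount,
                 num_deposits, num_withdrawals, games_played):
    """Fraction of a player's loyalty points contributed by each formula term."""
    parts = {
        "deposit": 0.01 * deposit_amount,
        "withdrawal": 0.005 * withdrawal_amount,
        "count_difference": 0.001 * max(num_deposits - num_withdrawals, 0),
        "games": 0.2 * games_played,
    }
    total = sum(parts.values())
    if total == 0:
        return parts  # no activity: every share stays 0
    return {name: value / total for name, value in parts.items()}

# Hypothetical player: 10000 deposited in 5 deposits, 2000 withdrawn in 2, 50 games.
shares = point_shares(10000, 2000, 5, 2, 50)
for name, share in shares.items():
    print(f"{name}: {share:.2%}")
```

For these made-up inputs most of the points come from the deposit term and the count-difference term is close to zero, which is the kind of imbalance a fairness review of the formula would want to examine.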
