Soln

The document outlines various questions that can be answered using pandas groupby and aggregation methods on a dataset related to matches. It provides Python code snippets for each question, including how to find the team with the most wins in 2017, the average runs by which teams win, and the total number of wickets taken by each team. Additionally, it covers statistics related to umpires, venues, and match outcomes.

Uploaded by

sarojgiri853
Copyright
© © All Rights Reserved
Here are different questions you can ask based on the dataset, which can be answered using pandas groupby or other aggregation methods:

1. Which team won the most matches in 2017?

python
# Count wins per team in the 2017 season, then show the team with the most wins.
team_wins_2017 = df[df['season'] == 2017].groupby('winner').size().reset_index(name='wins')
team_wins_2017 = team_wins_2017.sort_values(by='wins', ascending=False)
print(team_wins_2017.head(1))
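
The snippets in this document use a DataFrame `df` that the source never defines. As a minimal sketch, assuming one row per match and the column names the snippets reference, the toy frame below (team names and values are made up for illustration) shows the same filter-then-count pattern. `value_counts()` is used here as a shorter equivalent of `groupby(...).size()` followed by a descending sort.

```python
import pandas as pd

# Hypothetical rows; only the column names come from the snippet above.
df = pd.DataFrame({
    'season': [2016, 2017, 2017, 2017],
    'winner': ['Team A', 'Team B', 'Team A', 'Team B'],
})

# Keep 2017 matches, then count wins per team.
wins_2017 = df.loc[df['season'] == 2017, 'winner'].value_counts()

top_team = wins_2017.idxmax()    # label of the team with the most wins
top_wins = int(wins_2017.max())  # how many matches that team won
print(top_team, top_wins)
```

With real data, `df` would be loaded from the match file instead of being built by hand.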

2. How many matches were played in each city?

python
# One row per match, so the group size is the number of matches in that city.
matches_by_city = df.groupby('city').size().reset_index(name='matches_played')
print(matches_by_city)

3. What is the average number of runs by which teams win in each season?

python
# Mean run margin per season, taken over every match in that season.
average_runs_by_season = df.groupby('season')['win_by_runs'].mean().reset_index(name='average_runs_won')
print(average_runs_by_season)
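
One caveat, stated as an assumption about the data: if `win_by_runs` is recorded as 0 for matches that were won by wickets, those zeros pull the mean down. The sketch below, on made-up rows, compares the mean over all matches with the mean over matches that were actually decided by runs.

```python
import pandas as pd

# Hypothetical rows; win_by_runs == 0 is assumed to mean "not won by runs".
df = pd.DataFrame({
    'season': [2016, 2016, 2017, 2017, 2017],
    'win_by_runs': [10, 0, 20, 40, 0],
})

# Mean over all matches, zeros included (same as the snippet above).
mean_all = df.groupby('season')['win_by_runs'].mean()

# Mean over matches with a positive run margin only.
won_by_runs = df[df['win_by_runs'] > 0]
mean_run_wins = won_by_runs.groupby('season')['win_by_runs'].mean()

print(mean_all)
print(mean_run_wins)
```

Which of the two numbers answers the question depends on whether "teams win" is meant to cover all wins or only wins by runs.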

4. Which player won the most "Player of the Match" awards in 2017?

python
# Count awards per player in 2017, then show the player with the most awards.
player_of_the_match_2017 = df[df['season'] == 2017].groupby('player_of_match').size().reset_index(name='awards')
player_of_the_match_2017 = player_of_the_match_2017.sort_values(by='awards', ascending=False)
print(player_of_the_match_2017.head(1))

5. How many matches did each team win at home (based on the venue)?

python
# x.name is the (team1, venue) group key; team1 is assumed to be the home side.
team_home_wins = df.groupby(['team1', 'venue'])['winner'].apply(
    lambda x: (x == x.name[0]).sum()).reset_index(name='home_wins')
print(team_home_wins)

6. What is the win percentage for each team in 2017?

python
import pandas as pd

season_2017 = df[df['season'] == 2017]
# Note: total_matches counts only the matches where the team is listed as team1.
team_total_matches_2017 = season_2017.groupby('team1').size().reset_index(name='total_matches')
team_wins_2017 = season_2017.groupby('winner').size().reset_index(name='wins')
team_win_percentage = pd.merge(team_total_matches_2017, team_wins_2017,
                               left_on='team1', right_on='winner', how='left')
# Teams with no wins are NaN after the left merge; treat them as 0 wins.
team_win_percentage['wins'] = team_win_percentage['wins'].fillna(0)
team_win_percentage['win_percentage'] = (
    team_win_percentage['wins'] / team_win_percentage['total_matches']) * 100
print(team_win_percentage[['team1', 'win_percentage']])
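
The snippet above counts a team's matches from `team1` only, so a team that also plays as the second-listed side is undercounted. If the data has a `team2` column (an assumption; that column does not appear in the source), appearances on both sides can be counted before dividing. A sketch on made-up rows:

```python
import pandas as pd

# Hypothetical 2017 matches; 'team2' is an assumed column name.
df = pd.DataFrame({
    'season': [2017, 2017, 2017],
    'team1':  ['A', 'B', 'A'],
    'team2':  ['B', 'C', 'C'],
    'winner': ['A', 'B', 'A'],
})

matches_2017 = df[df['season'] == 2017]

# Matches played = appearances as team1 plus appearances as team2.
played = pd.concat([matches_2017['team1'], matches_2017['team2']]).value_counts()

# Wins per team, aligned to the teams that played; teams with no wins get 0.
wins = matches_2017['winner'].value_counts().reindex(played.index, fill_value=0)

win_percentage = (wins / played * 100).round(1)
print(win_percentage.sort_values(ascending=False))
```

Reindexing the win counts against the teams that played keeps winless teams in the result with 0% instead of dropping them.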

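7. What was the total number of wickets taken by each team?

python
# Note: win_by_wickets reads as a winning margin, not a count of wickets taken.
# The sum is grouped by winner so each margin is credited to the team that won.
total_wickets_by_team = df.groupby('winner')['win_by_wickets'].sum().reset_index(name='total_wickets_taken')
print(total_wickets_by_team)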

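8. How many matches did each umpire officiate in total?

python
import pandas as pd

# Stack both umpire columns so each umpire is counted once per match officiated.
umpires = pd.concat([df['umpire1'], df['umpire2']])
umpire_match_count = umpires.value_counts().rename_axis('umpire').reset_index(name='matches_officiated')
print(umpire_match_count)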

9. Which venue hosted the most matches in 2017?

python
# Count 2017 matches per venue, then show the venue with the most matches.
matches_by_venue_2017 = df[df['season'] == 2017].groupby('venue').size().reset_index(name='matches_hosted')
matches_by_venue_2017 = matches_by_venue_2017.sort_values(by='matches_hosted', ascending=False)
print(matches_by_venue_2017.head(1))

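10. Which team won by the highest average number of runs in 2017?

python
# Keep matches with a positive run margin (0 is assumed to mean "not won by runs")
# and group by winner, since the margin belongs to the winning team, not team1.
won_by_runs_2017 = df[(df['season'] == 2017) & (df['win_by_runs'] > 0)]
avg_runs_by_team_2017 = won_by_runs_2017.groupby('winner')['win_by_runs'].mean().reset_index(name='avg_runs_won')
avg_runs_by_team_2017 = avg_runs_by_team_2017.sort_values(by='avg_runs_won', ascending=False)
print(avg_runs_by_team_2017.head(1))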

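11. What is the total number of wins by wickets for each team in 2017?

python
# Count matches with a positive wicket margin, credited to the winning team.
won_by_wickets_2017 = df[(df['season'] == 2017) & (df['win_by_wickets'] > 0)]
wins_by_wickets_2017 = won_by_wickets_2017.groupby('winner').size().reset_index(name='wins_by_wickets')
print(wins_by_wickets_2017)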

12. Which umpire pair was the most common in 2017?

python
# Count each (umpire1, umpire2) combination in 2017 and show the most frequent one.
umpire_pair_count_2017 = df[df['season'] == 2017].groupby(['umpire1', 'umpire2']).size().reset_index(name='pair_count')
umpire_pair_count_2017 = umpire_pair_count_2017.sort_values(by='pair_count', ascending=False)
print(umpire_pair_count_2017.head(1))
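
The grouping above treats (X, Y) and (Y, X) as different pairs. If the order of `umpire1` and `umpire2` carries no meaning (an assumption about the data), each pair can be sorted before counting. A sketch with made-up names, where `umpire_a` and `umpire_b` are hypothetical column names for the sorted pair:

```python
import numpy as np
import pandas as pd

# Hypothetical rows; the same two umpires appear in both orders.
df = pd.DataFrame({
    'season':  [2017, 2017, 2017, 2017],
    'umpire1': ['U1', 'U2', 'U3', 'U1'],
    'umpire2': ['U2', 'U1', 'U4', 'U2'],
})

matches_2017 = df[df['season'] == 2017]

# Sort names within each row so (U2, U1) and (U1, U2) become the same pair.
# This assumes neither umpire column has missing values.
sorted_names = np.sort(matches_2017[['umpire1', 'umpire2']].to_numpy(dtype=object), axis=1)
pair_df = pd.DataFrame(sorted_names, columns=['umpire_a', 'umpire_b'])

pair_counts = pair_df.groupby(['umpire_a', 'umpire_b']).size().sort_values(ascending=False)
print(pair_counts.head(1))
```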

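13. How many times did each team win by 0 runs in 2017?

python
# Group by winner so each result is credited to the team that won, not to team1.
zero_run_matches_2017 = df[(df['season'] == 2017) & (df['win_by_runs'] == 0)]
team_zero_runs_wins = zero_run_matches_2017.groupby('winner').size().reset_index(name='wins_by_zero_runs')
print(team_zero_runs_wins)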

14. Which team won the most matches in a particular city?

python
# Count wins for every (city, winner) combination, largest counts first.
city_team_wins = df.groupby(['city', 'winner']).size().reset_index(name='city_wins')
city_team_wins = city_team_wins.sort_values(by='city_wins', ascending=False)
print(city_team_wins)
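
The sorted table above lists every (city, winner) combination. To keep only the top team for each city, one option is to select the row with the largest count within each city group. A sketch on made-up rows (when two teams tie, the first row encountered is kept):

```python
import pandas as pd

# Hypothetical rows for illustration.
df = pd.DataFrame({
    'city':   ['X', 'X', 'X', 'Y', 'Y'],
    'winner': ['A', 'A', 'B', 'B', 'B'],
})

city_team_wins = df.groupby(['city', 'winner']).size().reset_index(name='city_wins')

# For each city, idxmax gives the row label of the highest city_wins value.
top_rows = city_team_wins.groupby('city')['city_wins'].idxmax()
top_team_by_city = city_team_wins.loc[top_rows].reset_index(drop=True)
print(top_team_by_city)
```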

15. What was the total number of runs and wickets won by each team?

python
# Margins belong to the winning team, so group by winner rather than team1.
team_runs_and_wickets = df.groupby('winner').agg({'win_by_runs': 'sum', 'win_by_wickets': 'sum'}).reset_index()
print(team_runs_and_wickets)
