Indian Premier League (IPL) - Data Analysis - Colab
#Importing libraries
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import warnings
warnings.filterwarnings('ignore')
%matplotlib inline
import plotly.graph_objs as go
import plotly.offline as offline
PLAYER_MATCH
player_match.head(2)
   Player_match_SK  PlayerMatch_key  Match_Id  Player_Id  Player_Name  DOB         Batting_hand    Bowling_skill       Country_Name  Role_Desc  ...
1  12694            33598700006      335987    6          R Dravid     1973-01-11  Right-hand bat  Right-arm offbreak  India         Captain    ...
2 rows × 22 columns
player_match.describe(include='all')
https://colab.research.google.com/#fileId=https%3A//storage.googleapis.com/kaggle-colab-exported-notebooks/indian-premier-league-ipl-data-an… 1/9
3/4/25, 10:08 PM Indian Premier League(IPL): Data Analysis - Colab
       Player_match_SK  PlayerMatch_key  Match_Id      Player_Id     Player_Name  DOB         Batting_hand    Bowling_skill       Country_Name  ...
count  13993.000000     1.399300e+04     1.399300e+04  13993.000000  13992        13992       13992           12862               13992         ...
top    NaN              NaN              NaN           NaN           SK Raina     1987-04-30  Right-hand bat  Right-arm offbreak  India         ...
freq   NaN              NaN              NaN           NaN           160          251         10026           3359                9045          ...
mean   19688.092832     6.371377e+10     6.371377e+05  168.732152    NaN          NaN         NaN             NaN                 NaN           ...
std    4042.570934      2.350311e+10     2.350311e+05  129.453471    NaN          NaN         NaN             NaN                 NaN           ...
min    -1.000000        -1.000000e+00    -1.000000e+00 -1.000000     NaN          NaN         NaN             NaN                 NaN           ...
25%    16191.000000     4.191540e+10     4.191540e+05  56.000000     NaN          NaN         NaN             NaN                 NaN           ...
50%    19689.000000     5.483820e+10     5.483820e+05  136.000000    NaN          NaN         NaN             NaN                 NaN           ...
75%    23187.000000     8.297460e+10     8.297460e+05  267.000000    NaN          NaN         NaN             NaN                 NaN           ...
max    26685.000000     1.082650e+11     1.082650e+06  497.000000    NaN          NaN         NaN             NaN                 NaN           ...
11 rows × 22 columns
array(['R Dravid', 'SC Ganguly', 'Yuvraj Singh', 'V Sehwag', 'SK Warne',
'Harbhajan Singh', 'VVS Laxman', 'SM Pollock', 'SR Tendulkar',
'SR Watson', 'MS Dhoni', 'KP Pietersen', 'BB McCullum', 'A Kumble',
'G Gambhir', 'SK Raina', 'DPMD Jayawardene', 'KC Sangakkara',
'DJ Bravo', 'DL Vettori', 'V Kohli', 'JR Hopes', 'CL White',
'DJ Hussey', 'SPD Smith', 'RT Ponting', 'AD Mathews',
'LRPL Taylor', 'AJ Finch', 'RG Sharma', 'DA Warner', 'GJ Bailey',
'S Dhawan', 'DJG Sammy', 'JP Duminy', 'Z Khan', 'DA Miller',
'M Vijay', 'GJ Maxwell', 'AM Rahane', 'KK Nair'], dtype=object)
plt.figure(figsize=(14,6))
sns.countplot(x='Age_As_on_match',data=player_match)
Age is roughly normally distributed. There are some young players, probably talented enough to start playing early. We also observe some older
players, well into their 40s, still playing in the IPL.
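The shape described above can also be checked numerically instead of by eye. The following is a minimal sketch, assuming only that the `Age_As_on_match` column used in the plot is numeric; the helper name `summarize_age` and the tiny sample frame are made up for illustration and stand in for `player_match`.

```python
import pandas as pd

def summarize_age(df: pd.DataFrame, col: str = "Age_As_on_match") -> pd.Series:
    """Return describe() statistics plus skewness for an age column."""
    ages = df[col].dropna()
    summary = ages.describe()
    # Skewness near 0 suggests a roughly symmetric distribution
    summary["skew"] = ages.skew()
    return summary

# Made-up ages, NOT taken from the IPL data; replace with player_match
sample = pd.DataFrame({"Age_As_on_match": [19, 24, 26, 27, 28, 28, 29, 31, 34, 41]})
age_summary = summarize_age(sample)
print(age_summary)
```

On the real data, `summarize_age(player_match)` would give the minimum and maximum ages behind the "young" and "older" players mentioned above, and the skew value gives a rough indication of how symmetric the distribution is.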
Match
match.head(2)
   Match_SK  match_id  Team1                        Team2                  match_date  Season_Year  Venue_Name                              City_Name  Country_Name  Toss_Winner            ...
0  546       980964    Royal Challengers Bangalore  Kolkata Knight Riders  2016-05-02  2016         M Chinnaswamy Stadium                   Bangalore  India         Kolkata Knight Riders  ...
1  547       980966    Gujarat Lions                Delhi Daredevils       2016-05-03  2016         Saurashtra Cricket Association Stadium  Rajkot     India         Delhi Daredevils       ...
(match_winner and the columns after it are cut off in the export)
match.describe(include='all')
       Match_SK    match_id      Team1                        Team2           match_date                     Season_Year  Venue_Name             City_Name  Country_Name  ...
count  637.000000  6.370000e+02  637                          637             637                            637.000000   636                    637        637           ...
top    NaN         NaN           Royal Challengers Bangalore  Mumbai Indians  NaN                            NaN          M Chinnaswamy Stadium  Mumbai     India         ...
mean   318.000000  6.378825e+05  NaN                          NaN             2012-10-27 10:46:31.836734720  2012.497645  NaN                    NaN        NaN           ...
min    0.000000    3.359870e+05  NaN                          NaN             2008-04-18 00:00:00            2008.000000  NaN                    NaN        NaN           ...
25%    159.000000  4.191550e+05  NaN                          NaN             2010-04-11 00:00:00            2010.000000  NaN                    NaN        NaN           ...
50%    318.000000  5.483830e+05  NaN                          NaN             2012-05-22 00:00:00            2012.000000  NaN                    NaN        NaN           ...
75%    477.000000  8.297480e+05  NaN                          NaN             2015-04-22 00:00:00            2015.000000  NaN                    NaN        NaN           ...
max    636.000000  1.082650e+06  NaN                          NaN             2017-05-21 00:00:00            2017.000000  NaN                    NaN        NaN           ...
std    184.030342  2.356312e+05  NaN                          NaN             NaN                            2.776600     NaN                    NaN        NaN           ...
(Toss_Winner and the columns after it are cut off in the export)
match.isnull().sum(axis=0)
Match_SK 0
match_id 0
Team1 0
Team2 0
match_date 0
Season_Year 0
Venue_Name 1
City_Name 0
Country_Name 0
Toss_Winner 1
match_winner 3
Toss_Name 1
Win_Type 2
Outcome_Type 0
ManOfMach 4
Win_Margin 9
Country_id 0
dtype: int64
#Unique team names (unique() returns the names themselves; match.Team1.nunique() would give the count)
print("Number of unique teams: ", match.Team1.unique())
Number of unique teams: ['Royal Challengers Bangalore' 'Gujarat Lions' 'Kolkata Knight Riders'
'Delhi Daredevils' 'Sunrisers Hyderabad' 'Kings XI Punjab'
'Mumbai Indians' 'Rising Pune Supergiants' 'Rajasthan Royals'
'Deccan Chargers' 'Chennai Super Kings' 'Kochi Tuskers Kerala'
'Pune Warriors']
match_winner
ManOfMach
CH Gayle 18
YK Pathan 16
AB de Villiers 15
DA Warner 15
RG Sharma 14
dtype: int64
Big cities with a home team have hosted the most matches, with M Chinnaswamy Stadium leading through 2017, followed by Eden Gardens and Feroz
Shah Kotla.
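The venue ranking above comes from a plot whose code did not survive the export. A minimal sketch of how such a ranking could be computed, assuming the `Venue_Name` column shown in `match.head()`; the helper `matches_per_venue` and the sample rows are hypothetical and only stand in for `match`.

```python
import pandas as pd

def matches_per_venue(match_df: pd.DataFrame, top_n: int = 3) -> pd.Series:
    """Count matches per venue and keep the top_n busiest venues."""
    # value_counts() sorts descending and ignores missing venue names
    return match_df["Venue_Name"].value_counts().head(top_n)

# Made-up rows, NOT the real fixture list; replace with match
sample_match = pd.DataFrame({"Venue_Name": [
    "M Chinnaswamy Stadium", "Eden Gardens", "M Chinnaswamy Stadium",
    "Feroz Shah Kotla", "Eden Gardens", "M Chinnaswamy Stadium", None,
]})
venue_counts = matches_per_venue(sample_match)
print(venue_counts)
```

Passing the result to `venue_counts.plot(kind="bar")` would give a chart comparable to the one the commentary refers to.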
Mumbai has the most wins, followed by Chennai and then Kolkata. Now, let's see who wins the toss more often.
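Both tallies, match wins and toss wins, can be put side by side in one table. This is a sketch under the assumption that the `match_winner` and `Toss_Winner` columns listed in the null-count output hold team names; the helper `wins_and_tosses` and the sample rows are invented for illustration.

```python
import pandas as pd

def wins_and_tosses(match_df: pd.DataFrame) -> pd.DataFrame:
    """Count match wins and toss wins per team, side by side."""
    tally = pd.DataFrame({
        "wins": match_df["match_winner"].value_counts(),
        "tosses_won": match_df["Toss_Winner"].value_counts(),
    })
    # A team absent from one column gets NaN there; treat that as zero
    return tally.fillna(0).astype(int).sort_values("wins", ascending=False)

# Made-up results, NOT real IPL outcomes; replace with match
sample_match = pd.DataFrame({
    "match_winner": ["Mumbai Indians", "Chennai Super Kings", "Mumbai Indians",
                     "Kolkata Knight Riders", "Mumbai Indians", "Chennai Super Kings"],
    "Toss_Winner": ["Mumbai Indians", "Mumbai Indians", "Kolkata Knight Riders",
                    "Mumbai Indians", "Chennai Super Kings", "Kolkata Knight Riders"],
})
tally = wins_and_tosses(sample_match)
print(tally)
```

Rows with a missing winner (the real data has a few) are simply left out of the counts, because `value_counts()` skips missing values by default.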
Again, it's Mumbai that wins the toss most often. Let's see what teams choose to do after winning the toss over the years.
plt.figure(figsize=(12,6))
sns.countplot(x='Season_Year', hue='Toss_Name', data=match)
Teams tended to bat first after winning the toss during the initial years of the IPL, but there is a clear change in this pattern, especially in the
last couple of years.
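The shift described above can be quantified as the share of each toss decision per season, which is easier to compare across years than raw counts. A minimal sketch, assuming `Season_Year` and `Toss_Name` as used in the countplot; the labels `"bat"` and `"field"` in the sample are an assumption about how `Toss_Name` is coded, and the helper `toss_decision_share` is hypothetical.

```python
import pandas as pd

def toss_decision_share(match_df: pd.DataFrame) -> pd.DataFrame:
    """Share of each toss decision within every season (rows sum to 1)."""
    return pd.crosstab(match_df["Season_Year"], match_df["Toss_Name"],
                       normalize="index")

# Made-up decisions, NOT real IPL data; replace with match
sample_match = pd.DataFrame({
    "Season_Year": [2008, 2008, 2008, 2008, 2017, 2017, 2017, 2017],
    "Toss_Name": ["bat", "bat", "bat", "field", "field", "field", "field", "bat"],
})
share = toss_decision_share(sample_match)
print(share)
```

Normalizing by row removes the effect of seasons having different numbers of matches, so a change in the proportions reflects a change in preference rather than in schedule length.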
match.head()
   Match_SK  match_id  Team1                        Team2                    match_date  Season_Year  Venue_Name                                 City_Name  Country_Name  Toss_Winner              ...
0  546       980964    Royal Challengers Bangalore  Kolkata Knight Riders    2016-05-02  2016         M Chinnaswamy Stadium                      Bangalore  India         Kolkata Knight Riders    ...
1  547       980966    Gujarat Lions                Delhi Daredevils         2016-05-03  2016         Saurashtra Cricket Association Stadium     Rajkot     India         Delhi Daredevils         ...
2  548       980968    Kings XI Punjab              Kolkata Knight Riders    2016-05-04  2016         Eden Gardens                               Kolkata    India         Kings XI Punjab          ...
3  549       980970    Delhi Daredevils             Rising Pune Supergiants  2016-05-05  2016         Feroz Shah Kotla                           Delhi      India         Rising Pune Supergiants  ...
4  550       980972    Sunrisers Hyderabad          Gujarat Lions            2016-05-06  2016         Rajiv Gandhi International Stadium, Uppal  Hyderabad  India         Sunrisers Hyderabad      ...
(match_winner and the columns after it are cut off in the export)
PLAYER
players.head(2)
plt.figure(figsize=(8,6))
sns.countplot(x='Batting_hand', data=players)
plt.figure(figsize=(12,6))
# Plot only the 10 most common bowling skills, in descending order of frequency
ax=sns.countplot(x='Bowling_skill', data=players, order=players['Bowling_skill'].value_counts().iloc[:10].index)
plt.setp(ax.get_xticklabels(), rotation=40, ha="right")
plt.show()
Right-hand bat and Right-arm medium are clearly the most common.
Most players are from India, followed by Australia and South Africa.
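The country breakdown above refers to a plot whose code is missing from the export. A rough sketch of the underlying calculation, assuming `players` has a `Country_Name` column like the one in `player_match`; the helper `country_share` and the sample rows are invented for illustration.

```python
import pandas as pd

def country_share(players_df: pd.DataFrame, col: str = "Country_Name") -> pd.Series:
    """Fraction of players per country, largest first."""
    return players_df[col].value_counts(normalize=True)

# Made-up roster, NOT the real player list; replace with players
sample_players = pd.DataFrame({
    "Country_Name": ["India"] * 5 + ["Australia"] * 3 + ["South Africa"] * 2
})
shares = country_share(sample_players)
print(shares)
```

Using `normalize=True` reports proportions rather than counts, which makes it easier to state how dominant the leading country is.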
BALL FACT
I will now take a quick look at the next dataset, Ball Fact, which has a lot of information about each ball bowled in the IPL.
ball_fact.describe(include='all')
11 rows × 53 columns
An extra is a run scored by a means other than a batsman hitting the ball. Other than runs scored off the bat from a no ball, a batsman is not