Project Report Bhupender 6
DECLARATION
Place: Panipat
Date: 3rd Dec 2024 Bhupender
ACKNOWLEDGEMENT
Last but not least, I would like to express my sincere thanks to
Ms. Lovleen (Asst. Prof.) and all the other staff members who have provided
me with excellent knowledge and support throughout my graduation.
I am very thankful to my parents, brother/sister and friends for their
continuous support.
Place: Panipat
Date: 3rd Dec 2024 Bhupender
INDEX
1. Introduction
4. Introduction of Tools
5. Working of Project
7. Screenshots of Project
9. Bibliography
SCREENSHOTS INDEX
3. Distribution of Runs
4. Top decision in each season of IPL
6. Number of total matches per season
INTRODUCTION
The project offered a treasure trove of insights into various cricket metrics.
Notably, a fascinating revelation surfaced: Siddhartha Trivedi, last seen in 2013,
held the title of the highest wicket-taker for the Rajasthan Royals. Beyond
statistical revelations, I delved into the functionalities of Hugging Face datasets
and spaces.
Finding insightful information that enables businesses to make wise decisions is
the core of data analytics. I used Python to conduct exploratory data analysis for
this project. The renowned Indian Premier League is a cricket competition that
takes place in India annually in the months of March through May. The game is
played in a professional T20 format. Eight teams representing eight different
Indian cities compete in this league.
Since its launch in 2008, this league has grown significantly and is now one of
the most popular cricket leagues in the world. Every team uses its earnings to
buy some of the best players in the world. It is also a major factor in the
success of Indian teams, since the league attracts the greatest players from
India’s billion-person population.
The Indian Premier League (IPL) is a clear example of how a more
sophisticated and nuanced approach to analytics is required when it comes to
Twenty20 cricket (T20). Data analysis in cricket was limited to recording runs
scored and wickets taken during the days when Test matches were the ultimate
format. The world was awakened to strike rates, economy rates, and chase
precision, among other things, when One-Day Internationals (ODIs) were
introduced.
OBJECTIVE OF THE PROJECT
Cricket, often dubbed as a religion in some parts of the world, has evolved
beyond being just a sport into a global phenomenon. Within this vast cricketing
landscape, the Indian Premier League (IPL) stands out as a unique spectacle,
blending athleticism, entertainment, and high-stakes competition. In this report,
we embark on a journey of Exploratory Data Analysis (EDA) on the IPL dataset
to uncover insights, patterns, and stories hidden within the numbers.
The IPL dataset contains lots of information about different seasons, teams,
players, matches, and venues. It’s like a big treasure chest waiting to be explored.
With Exploratory Data Analysis (EDA), we want to dig into this data to learn
more about the IPL and what makes it so exciting.
The Exploratory Data Analysis (EDA) on the IPL dataset covers both
univariate and bivariate analysis. In the univariate analysis, I examined
individual variables within the dataset, uncovering insights into each feature’s
distribution, central tendency, and dispersion. This process allowed me to
understand the characteristics and patterns of each variable in isolation. Moving
on to bivariate analysis, I explored the relationships between pairs of variables,
seeking correlations and dependencies. By visualizing these relationships, I
gained deeper insights into how different aspects of the IPL dataset interact with
each other, paving the way for more comprehensive analysis and informed
decision-making.
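The difference between the two kinds of analysis can be sketched in a few lines of pandas. This is only an illustration: the rows below are a small made-up sample, and only the column names (season, toss_decision, winner, win_by_runs) are taken from the IPL matches data.

```python
import io
import pandas as pd

# Hypothetical sample rows; only the column names come from the IPL matches data.
sample_csv = io.StringIO(
    "season,toss_decision,winner,win_by_runs\n"
    "2017,field,Sunrisers Hyderabad,35\n"
    "2017,field,Rising Pune Supergiant,0\n"
    "2017,bat,Royal Challengers Bangalore,15\n"
    "2018,field,Chennai Super Kings,0\n"
)
matches = pd.read_csv(sample_csv)

# Univariate: one variable at a time (frequency, central tendency, dispersion).
toss_counts = matches["toss_decision"].value_counts()
runs_summary = matches["win_by_runs"].describe()

# Bivariate: how two variables relate, here the toss decision within each season.
toss_by_season = pd.crosstab(matches["season"], matches["toss_decision"])

print(toss_counts)
print(runs_summary)
print(toss_by_season)
```

value_counts() and describe() look at a single column, while crosstab() counts combinations of two columns.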
The main objective of this report is to cover the steps involved in data
pre-processing, feature engineering, and the different stages of Exploratory Data
Analysis, which is an essential step in any research analysis. Data
pre-processing, feature engineering, and EDA are fundamental early steps after
data collection. Still, EDA is not limited to simply visualizing, plotting, and
manipulating the data without any assumptions; it also helps to assess the
quality of the data and to guide model building. This report walks through
data pre-processing, feature engineering, and exploratory data analysis (EDA)
using Python.
In our data-driven processes, we prioritize refining our raw data through the
crucial stages of EDA (Exploratory Data Analysis). Both data pre-processing
and feature engineering play pivotal roles in this endeavor. EDA involves a
comprehensive range of activities, including data integration, analysis,
cleaning, transformation, and dimension reduction.
Steps Involved in Project:
Import and clean the data to handle missing values, duplicates, and
incorrect data types.
2. Data Exploration:
TOOLS USED IN PROJECT
Introduction to tools:
There are a myriad of data analytics tools that help us get important
information from the given data. Some of these free and open-source
tools can be used even without any coding knowledge. These tools are used for
deriving useful insights from the given data without too much effort. For
example, you could use them to determine which of several cricket
players is better, based on various statistics and yardsticks. They have helped in
strengthening the decision-making process by providing useful information
that can help reach better conclusions.
1. PYTHON:-
2. Power BI:-
3. Microsoft Excel:
Data Analysis with Excel is a detailed lesson that gives readers a clear
understanding of the newest and most sophisticated functions offered by
Microsoft Excel. It describes in detail how to use MS Excel's capabilities to
carry out various data analysis tasks. The guide includes a good
number of screenshots that demonstrate step by step how to use various
features. Microsoft Excel is one of the most widely used programs for data
analysis. You can import, browse, clean, analyze, and display your data
using this all-in-one data management tool.
INTRODUCTION OF TOOLS
Jupyter Notebook:
1. Pandas:
Pandas is one of the most used libraries in Python for data science or data
analysis. It can read data from CSV or Excel files, manipulate the data,
and generate insights from it. Pandas can also be used to clean data, filter
data, and visualize data.
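A minimal sketch of these Pandas operations (reading, cleaning, and filtering) is given below. The CSV text is an in-memory stand-in with made-up missing data, assumed only for illustration; the column names follow the IPL matches data.

```python
import io
import pandas as pd

# Hypothetical sample; the third row has a missing city on purpose.
raw = io.StringIO(
    "id,season,city,winner\n"
    "1,2017,Hyderabad,Sunrisers Hyderabad\n"
    "2,2017,Pune,Rising Pune Supergiant\n"
    "3,2017,,Kolkata Knight Riders\n"
)
df = pd.read_csv(raw)

# Clean: fill the missing city with a placeholder value.
df["city"] = df["city"].fillna("Unknown")

# Filter: keep only the matches played in Pune.
pune_matches = df[df["city"] == "Pune"]
print(pune_matches)
```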
2. NumPy:
3. Matplotlib:
4. Seaborn:
5. GitHub:
In the context of this project, GitHub served as a vital tool for the analysis
of the “EXPLORATORY DATA ANALYSIS ON IPL DATA”, streamlining
version control, collaboration, and deployment processes.
1. Version Control:
GitHub facilitates efficient tracking of code changes, allowing developers to
maintain a history of updates, revert to previous versions when necessary, and
ensure code integrity throughout the development lifecycle.
2. Collaboration:
Through features like pull requests, code reviews, and issue tracking, GitHub
promotes teamwork by enabling multiple contributors to work simultaneously
on different aspects of the project.
3. Branching and Merging:
GitHub's branching model allows developers to experiment with new
features, fix bugs, or test changes in isolated environments. Once tested,
changes can be merged into the main code base seamlessly.
GitHub played a crucial role in organizing and managing the code base for this
project. Its version control system ensured that all updates were systematically
tracked, enabling efficient collaboration among team members. Features like
branching allowed the data analysis and visualization components to be
developed simultaneously, while issue tracking facilitated the management of
tasks and resolution of bugs.
By leveraging GitHub, the analysis team maintained a streamlined workflow,
ensuring that the ‘EXPLORATORY DATA ANALYSIS ON IPL DATA’ was
carried out with precision, collaboration, and efficiency. The platform's robust
features not only enhanced the quality of the project but also provided a secure
and scalable repository for its continued growth and maintenance.
Key Features of Data Analysis:
The analyst has to understand the task and the stakeholder’s expectations
for the solution. A stakeholder is a person who has invested money
and resources in a project. The analyst must be able to ask different
questions in order to find the right solution to the problem, and has to
find the root cause in order to understand it fully. The analyst must
avoid distractions while analyzing the problem and communicate effectively
with the stakeholders and other colleagues to completely understand what the
underlying problem is.
2. Collecting Data:
The analyst has to collect the data for the given task from multiple
sources, which may be internal or external. Internal data is the data
available in the organization that you work for, while external data is the
data available in sources other than your organization. Data that an
individual or organization collects from its own resources is called
first-party data. Data that another organization collected first-hand and then
shares or sells directly is called second-party data. Data that is collected
from outside sources is called third-party data. The common sources from
which data is collected are interviews, surveys, feedback, and questionnaires.
The collected data can be stored in a spreadsheet or an SQL database.
3. Data Cleaning:
Clean data means data that is free from misspellings, redundancies, and
irrelevant entries. Clean data largely depends on data integrity. There might be
duplicate data, or the data might not be in a consistent format, so the
unnecessary data is removed and the rest is cleaned. SQL and Excel provide
different functions to clean the data. This is one of the most
important steps in data analysis, as clean and formatted data helps in
finding trends and solutions. The most important part of this
phase is to check whether your data is biased or not. Bias means
favoring a particular group or community while ignoring the rest. Bias must
be avoided, as it can distort the overall data analysis. The data analyst
must make sure to include every group while the data is being collected.
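As a small illustration of this step in Python, the sketch below removes duplicate rows and unifies inconsistent spellings. The rows are hypothetical; the two spellings of the Pune team ("Rising Pune Supergiant" and "Rising Pune Supergiants") do appear as separate entries in the notebook output of this project, which is why they are used here.

```python
import pandas as pd

# Hypothetical rows: one exact duplicate, one spelling variant, one padded name.
df = pd.DataFrame({
    "id": [1, 2, 2, 3],
    "winner": [
        "Rising Pune Supergiants",
        "Rising Pune Supergiant",
        "Rising Pune Supergiant",
        " Mumbai Indians ",
    ],
})

# Remove exact duplicate rows.
df = df.drop_duplicates()

# Strip stray whitespace and map the spelling variant to a single name.
df["winner"] = df["winner"].str.strip().replace(
    {"Rising Pune Supergiants": "Rising Pune Supergiant"}
)
print(df["winner"].value_counts())
```

Without this kind of normalization, value_counts() reports the same franchise under two different names.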
4. Analyzing Data:
The cleaned data is used for analyzing and identifying trends. In this step the
analyst also performs calculations and combines data for better results. The
tools used for performing calculations are Excel or SQL. These tools provide
built-in functions to perform calculations, or sample code is written in SQL
to perform them. Using Excel, we can create pivot tables and
perform calculations, while SQL creates temporary tables to perform
calculations. Programming languages are another way of solving
problems. They make it much easier to solve problems by providing
packages. The most widely used programming languages for data
analysis are R and Python.
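The pivot-table style of calculation described above can also be done in Python with pandas. The sketch below is an assumption-laden example: the four rows are invented for illustration, and only the column names come from the IPL matches data.

```python
import pandas as pd

# Hypothetical match results.
df = pd.DataFrame({
    "season": [2017, 2017, 2018, 2018],
    "winner": [
        "Mumbai Indians",
        "Mumbai Indians",
        "Chennai Super Kings",
        "Mumbai Indians",
    ],
    "win_by_runs": [10, 20, 5, 0],
})

# Pivot table: number of wins per team (rows) and season (columns).
wins = df.pivot_table(
    index="winner",
    columns="season",
    values="win_by_runs",
    aggfunc="count",
    fill_value=0,
)

# Grouped calculation: average winning margin in runs per team.
mean_margin = df.groupby("winner")["win_by_runs"].mean()

print(wins)
print(mean_margin)
```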
5. Data Visualization:
The reason for making data visualizations is that there might be people,
mostly stakeholders, who are non-technical. Visualizations are made for a
simple understanding of complex data. Tableau and Looker are two
popular tools used for compelling data visualizations. Tableau is a simple
drag-and-drop tool that helps in creating compelling visualizations. Looker
is a data visualization tool that connects directly to the database and creates
visualizations. Tableau and Looker are both widely used by data analysts
for creating visualizations. Python also has packages, such as Matplotlib and
Seaborn, that provide beautiful data visualizations, and R has a plotting
package named ggplot2 with a variety of data visualizations. A presentation is
given based on the data findings. Sharing the insights with the team members
and stakeholders helps in making more informed decisions and leads to better
outcomes.
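A visualization of this kind can be sketched with Matplotlib as follows. The player names and award counts are the top five values reported in the notebook output of this project; the title, axis labels, off-screen backend, and in-memory PNG buffer are choices made only for this example.

```python
import io

import matplotlib
matplotlib.use("Agg")  # render off-screen so the sketch also runs without a display
import matplotlib.pyplot as plt

# Top five player-of-the-match counts as reported in the notebook output.
players = ["CH Gayle", "AB de Villiers", "RG Sharma", "MS Dhoni", "DA Warner"]
awards = [21, 20, 17, 17, 17]

fig, ax = plt.subplots(figsize=(8, 5))
bars = ax.bar(players, awards, color="g")
ax.set_title("Top 5 players by player-of-the-match awards")
ax.set_xlabel("Player")
ax.set_ylabel("Awards")
fig.tight_layout()

# Write the chart to an in-memory PNG; a file path could be passed instead.
buf = io.BytesIO()
fig.savefig(buf, format="png")
```

Labelling the axes and adding a title makes the chart readable for non-technical stakeholders without further explanation.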
Role in the Exploratory Data Analysis on IPL Data Project:
In this project, Python serves as the backbone for data analysis. In the EDA on
IPL Data project, my role involved the following key responsibilities:
2. Data Cleaning:
4. Feature Engineering:
5. Insights and Reporting:
6. Collaboration:
WORKING OF THE PROJECT
The first step in using Python is to understand and explore
our data using libraries. Import all the libraries required for
our analysis, covering data loading, statistical analysis, visualization,
data transformation, merges and joins, etc.
Pandas and NumPy have been used for data manipulation and numerical
calculations. Matplotlib and Seaborn have been used for data visualizations.
2. Reading Dataset:
The Pandas library offers a wide range of possibilities for loading data into
a pandas DataFrame from formats such as JSON, .csv, .xlsx, .sql, .pickle, and
.html.
Most of the data is available in the tabular format of CSV files, which is
popular and easy to access. Using the read_csv() function, a CSV file can be
loaded into a pandas DataFrame.
In this report, the IPL matches data is used as an example. In
this data set, we analyze match results, and EDA
focuses on identifying the factors that influence them. We have stored
the data in the DataFrame ipl.
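A minimal sketch of loading CSV data with read_csv() is shown below. To keep it self-contained, the CSV text is held in memory instead of being read from disk; the two rows reproduce the first two matches shown in the notebook output, and the name ipl_sample and the use of parse_dates are assumptions made for this example.

```python
import io
import pandas as pd

# In-memory stand-in for the CSV file; in the project a file path is passed instead.
csv_text = (
    "id,season,city,date,team1,team2\n"
    "1,2017,Hyderabad,2017-04-05,Sunrisers Hyderabad,Royal Challengers Bangalore\n"
    "2,2017,Pune,2017-04-06,Mumbai Indians,Rising Pune Supergiant\n"
)

# parse_dates converts the date column from text to datetime values.
ipl_sample = pd.read_csv(io.StringIO(csv_text), parse_dates=["date"])
print(ipl_sample.head())
```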
3. Analyzing the Data:
The main goal of data understanding is to gain general insights about the
data, which covers the number of rows and columns, values in the data,
datatypes, and Missing values in the dataset.
info() helps to understand the data: it reports the number of non-null records
in each column, the data type of each column, and the memory usage of the
dataset.
ipl.info() shows which variables have missing values; in the first rows of the
data, umpire1, umpire2, and umpire3 already contain NaN. Numeric variables like
win_by_runs and win_by_wickets are of datatype int64. Categorical variables
like city, toss_decision, result, and venue are of object data type.
nunique(): based on the number of unique values in each column and the data
description, we can identify the continuous and categorical columns in
the data. Duplicated data can be handled or removed based on further
analysis.
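The checks described above can be sketched as follows. The small DataFrame is a hypothetical stand-in that only borrows column names from the IPL matches data; isnull().sum() is added here as one common way to count missing values per column.

```python
import numpy as np
import pandas as pd

# Hypothetical sample with one column that is entirely missing.
df = pd.DataFrame({
    "season": [2017, 2017, 2017, 2017],
    "toss_decision": ["field", "field", "field", "bat"],
    "win_by_runs": [35, 0, 0, 15],
    "umpire3": [np.nan, np.nan, np.nan, np.nan],
})

# Data types, non-null counts, and memory usage in one summary.
df.info()

# Number of distinct values per column (missing values are not counted).
unique_counts = df.nunique()

# Number of missing values per column.
missing_counts = df.isnull().sum()

print(unique_counts)
print(missing_counts)
```

A column with few unique values, such as toss_decision, is a candidate categorical variable, while a column with many distinct numeric values is more likely continuous.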
6. Understand the Size of the Data:
The shape attribute in Pandas is a simple yet powerful tool used to quickly
understand the size and structure of a dataset during data analysis. It provides
the dimensions of a DataFrame or Series in the form of a tuple (rows, columns).
It is useful for confirming changes in the data set after operations like
removing duplicates, handling missing values, or filtering rows.
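As a small sketch of this use of shape, the example below compares the dimensions before and after removing duplicates and filtering rows. The data is hypothetical; the filter win_by_runs != 0 mirrors the one used in the project notebook to select matches won by the team batting first.

```python
import pandas as pd

# Hypothetical sample with one duplicated row.
df = pd.DataFrame({
    "id": [1, 2, 2, 3],
    "win_by_runs": [35, 0, 0, 15],
})

shape_before = df.shape            # (rows, columns) of the raw data
deduped = df.drop_duplicates()
shape_after = deduped.shape        # one row fewer after dropping the duplicate

# Filtering rows changes the row count but not the column count.
batting_first = deduped[deduped["win_by_runs"] != 0]

print(shape_before, shape_after, batting_first.shape)
```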
CODING OF PROJECT
{
"cells": [
{
"cell_type": "code",
"execution_count": 27,
"id": "c503a7f9-8c94-410a-9479-9f118c2945ff",
"metadata": {},
"outputs": [],
"source": [
"# Loading the required libraries\n",
"import pandas as pd\n",
"from matplotlib import pyplot as plt\n",
"import seaborn as sns"
]
},
{
"cell_type": "code",
"execution_count": 30,
"id": "9d052511-04f6-4e8d-997f-8169c0310fbf",
"metadata": {},
"outputs": [],
"source": [
"# Loading the IPL matches dataset\n",
"ipl=pd.read_csv(\"C:/Users/Tamanna Rana/Downloads/matches (1).csv\")"
]
},
{
"cell_type": "code",
"execution_count": 31,
"id": "796cd901-c5ce-4688-9c9f-be9c374aaf46",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>id</th>\n",
" <th>season</th>\n",
" <th>city</th>\n",
" <th>date</th>\n",
" <th>team1</th>\n",
" <th>team2</th>\n",
" <th>toss_winner</th>\n",
" <th>toss_decision</th>\n",
" <th>result</th>\n",
" <th>dl_applied</th>\n",
" <th>winner</th>\n",
" <th>win_by_runs</th>\n",
" <th>win_by_wickets</th>\n",
" <th>player_of_match</th>\n",
" <th>venue</th>\n",
" <th>umpire1</th>\n",
" <th>umpire2</th>\n",
" <th>umpire3</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>1</td>\n",
" <td>2017</td>\n",
" <td>Hyderabad</td>\n",
" <td>2017-04-05</td>\n",
" <td>Sunrisers Hyderabad</td>\n",
" <td>Royal Challengers Bangalore</td>\n",
" <td>Royal Challengers Bangalore</td>\n",
" <td>field</td>\n",
" <td>normal</td>\n",
" <td>0</td>\n",
" <td>Sunrisers Hyderabad</td>\n",
" <td>35</td>\n",
" <td>0</td>\n",
" <td>Yuvraj Singh</td>\n",
" <td>Rajiv Gandhi International Stadium, Uppal</td>\n",
" <td>AY Dandekar</td>\n",
" <td>NJ Llong</td>\n",
" <td>NaN</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>2</td>\n",
" <td>2017</td>\n",
" <td>Pune</td>\n",
" <td>2017-04-06</td>\n",
" <td>Mumbai Indians</td>\n",
" <td>Rising Pune Supergiant</td>\n",
" <td>Rising Pune Supergiant</td>\n",
" <td>field</td>\n",
" <td>normal</td>\n",
" <td>0</td>\n",
" <td>Rising Pune Supergiant</td>\n",
" <td>0</td>\n",
" <td>7</td>\n",
" <td>SPD Smith</td>\n",
" <td>Maharashtra Cricket Association Stadium</td>\n",
" <td>A Nand Kishore</td>\n",
" <td>S Ravi</td>\n",
" <td>NaN</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>3</td>\n",
" <td>2017</td>\n",
" <td>Rajkot</td>\n",
" <td>2017-04-07</td>\n",
" <td>Gujarat Lions</td>\n",
" <td>Kolkata Knight Riders</td>\n",
" <td>Kolkata Knight Riders</td>\n",
" <td>field</td>\n",
" <td>normal</td>\n",
" <td>0</td>\n",
" <td>Kolkata Knight Riders</td>\n",
" <td>0</td>\n",
" <td>10</td>\n",
" <td>CA Lynn</td>\n",
" <td>Saurashtra Cricket Association Stadium</td>\n",
" <td>Nitin Menon</td>\n",
" <td>CK Nandan</td>\n",
" <td>NaN</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>4</td>\n",
" <td>2017</td>\n",
" <td>Indore</td>\n",
" <td>2017-04-08</td>\n",
" <td>Rising Pune Supergiant</td>\n",
" <td>Kings XI Punjab</td>\n",
" <td>Kings XI Punjab</td>\n",
" <td>field</td>\n",
" <td>normal</td>\n",
" <td>0</td>\n",
" <td>Kings XI Punjab</td>\n",
" <td>0</td>\n",
" <td>6</td>\n",
" <td>GJ Maxwell</td>\n",
" <td>Holkar Cricket Stadium</td>\n",
" <td>AK Chaudhary</td>\n",
" <td>C Shamshuddin</td>\n",
" <td>NaN</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>5</td>\n",
" <td>2017</td>\n",
" <td>Bangalore</td>\n",
" <td>2017-04-08</td>\n",
" <td>Royal Challengers Bangalore</td>\n",
" <td>Delhi Daredevils</td>\n",
" <td>Royal Challengers Bangalore</td>\n",
" <td>bat</td>\n",
" <td>normal</td>\n",
" <td>0</td>\n",
" <td>Royal Challengers Bangalore</td>\n",
" <td>15</td>\n",
" <td>0</td>\n",
" <td>KM Jadhav</td>\n",
" <td>M Chinnaswamy Stadium</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" id season city date team1 \\\n",
"0 1 2017 Hyderabad 2017-04-05 Sunrisers Hyderabad \n",
"1 2 2017 Pune 2017-04-06 Mumbai Indians \n",
"2 3 2017 Rajkot 2017-04-07 Gujarat Lions \n",
"3 4 2017 Indore 2017-04-08 Rising Pune Supergiant \n",
"4 5 2017 Bangalore 2017-04-08 Royal Challengers Bangalore \n",
"\n",
" team2 toss_winner toss_decision \\\n",
"0 Royal Challengers Bangalore Royal Challengers Bangalore field \n",
"1 Rising Pune Supergiant Rising Pune Supergiant field \n",
"2 Kolkata Knight Riders Kolkata Knight Riders field \n",
"3 Kings XI Punjab Kings XI Punjab field \n",
"4 Delhi Daredevils Royal Challengers Bangalore bat \n",
"\n",
" result dl_applied winner win_by_runs \\\n",
"0 normal 0 Sunrisers Hyderabad 35 \n",
"1 normal 0 Rising Pune Supergiant 0 \n",
"2 normal 0 Kolkata Knight Riders 0 \n",
"3 normal 0 Kings XI Punjab 0 \n",
"4 normal 0 Royal Challengers Bangalore 15 \n",
"\n",
" win_by_wickets player_of_match venue \\\n",
"0 0 Yuvraj Singh Rajiv Gandhi International Stadium, Uppal \n",
"1 7 SPD Smith Maharashtra Cricket Association Stadium \n",
"2 10 CA Lynn Saurashtra Cricket Association Stadium \n",
"3 6 GJ Maxwell Holkar Cricket Stadium \n",
"4 0 KM Jadhav M Chinnaswamy Stadium \n",
"\n",
" umpire1 umpire2 umpire3 \n",
"0 AY Dandekar NJ Llong NaN \n",
"1 A Nand Kishore S Ravi NaN \n",
"2 Nitin Menon CK Nandan NaN \n",
"3 AK Chaudhary C Shamshuddin NaN \n",
"4 NaN NaN NaN "
]
},
"execution_count": 31,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# having a glance at the first five records of the dataset\n",
"ipl.head()"
]
},
{
"cell_type": "code",
"execution_count": 32,
"id": "cd6a277b-c16e-4ad1-839d-57522515d709",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"(756, 18)"
]
},
"execution_count": 32,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# looking at the number of rows and columns in the dataset\n",
"ipl.shape"
]
},
{
"cell_type": "code",
"execution_count": 33,
"id": "6ea3411f-1168-49ca-88e2-f349afdfb4df",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"player_of_match\n",
"CH Gayle 21\n",
"AB de Villiers 20\n",
"RG Sharma 17\n",
"MS Dhoni 17\n",
"DA Warner 17\n",
" ..\n",
"PD Collingwood 1\n",
"NV Ojha 1\n",
"AC Voges 1\n",
"J Theron 1\n",
"S Hetmyer 1\n",
"Name: count, Length: 226, dtype: int64"
]
},
"execution_count": 33,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Getting how often each player won the man of the match award\n",
"ipl['player_of_match'].value_counts()"
]
},
{
"cell_type": "code",
"execution_count": 34,
"id": "0b677338-aacd-4ece-b993-aca8bc9fde8d",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"player_of_match\n",
"CH Gayle 21\n",
"AB de Villiers 20\n",
"RG Sharma 17\n",
"MS Dhoni 17\n",
"DA Warner 17\n",
"YK Pathan 16\n",
"SR Watson 15\n",
"SK Raina 14\n",
"G Gambhir 13\n",
"MEK Hussey 12\n",
"Name: count, dtype: int64"
]
},
"execution_count": 34,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Getting the top 10 players with the most man of the match awards\n",
"ipl['player_of_match'].value_counts()[0:10]"
]
},
{
"cell_type": "code",
"execution_count": 35,
"id": "ca809335-f4f8-4f6d-b9c0-e72f439a4ef3",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"player_of_match\n",
"CH Gayle 21\n",
"AB de Villiers 20\n",
"RG Sharma 17\n",
"MS Dhoni 17\n",
"DA Warner 17\n",
"Name: count, dtype: int64"
]
},
"execution_count": 35,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Getting the top 5 players with the most man of the match award\n",
"ipl['player_of_match'].value_counts()[0:5]"
]
},
{
"cell_type": "code",
"execution_count": 36,
"id": "f978d6dc-15de-4061-9999-3a6d926947e2",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"['CH Gayle', 'AB de Villiers', 'RG Sharma', 'MS Dhoni', 'DA Warner']"
]
},
"execution_count": 36,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Getting the names of the top 5 players as a plain list\n",
"list(ipl['player_of_match'].value_counts()[0:5].keys())"
]
},
{
"cell_type": "code",
"execution_count": 38,
"id": "6d611764-d993-4555-971d-b514a063752f",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"<Figure size 800x500 with 1 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"# making a bar-plot for the top 5 players with most man of the match awards\n",
"plt.figure(figsize=(8,5))\n",
"plt.bar(list(ipl['player_of_match'].value_counts()[0:5].keys()),list(ipl['player_of_match'].value_counts()[0:5]),color='g')\n",
"plt.show()"
]
},
{
"cell_type": "code",
"execution_count": 40,
"id": "d1794bbe-b91e-42f9-8da2-847301fb5d70",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"result\n",
"normal 743\n",
"tie 9\n",
"no result 4\n",
"Name: count, dtype: int64"
]
},
"execution_count": 40,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# getting the frequency of result column\n",
"ipl['result'].value_counts()"
]
},
{
"cell_type": "code",
"execution_count": 41,
"id": "fa4c46e5-1ae2-4351-9f99-50f6f1cad810",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"toss_winner\n",
"Mumbai Indians 98\n",
"Kolkata Knight Riders 92\n",
"Chennai Super Kings 89\n",
"Royal Challengers Bangalore 81\n",
"Kings XI Punjab 81\n",
"Delhi Daredevils 80\n",
"Rajasthan Royals 80\n",
"Sunrisers Hyderabad 46\n",
"Deccan Chargers 43\n",
"Pune Warriors 20\n",
"Gujarat Lions 15\n",
"Delhi Capitals 10\n",
"Kochi Tuskers Kerala 8\n",
"Rising Pune Supergiants 7\n",
"Rising Pune Supergiant 6\n",
"Name: count, dtype: int64"
]
},
"execution_count": 41,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# finding out the number of toss wins w.r.t each team\n",
"ipl['toss_winner'].value_counts()"
]
},
{
"cell_type": "code",
"execution_count": 43,
"id": "cbbf2e93-2d47-4067-ae56-643c05ca022c",
"metadata": {},
"outputs": [],
"source": [
"# extracting the records where a team won batting first\n",
"batting_first=ipl[ipl['win_by_runs']!=0]"
]
},
{
"cell_type": "code",
"execution_count": 44,
"id": "537908a9-78cd-4a3a-93cb-1ec789040579",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>id</th>\n",
" <th>season</th>\n",
" <th>city</th>\n",
" <th>date</th>\n",
" <th>team1</th>\n",
" <th>team2</th>\n",
" <th>toss_winner</th>\n",
" <th>toss_decision</th>\n",
" <th>result</th>\n",
" <th>dl_applied</th>\n",
" <th>winner</th>\n",
" <th>win_by_runs</th>\n",
" <th>win_by_wickets</th>\n",
" <th>player_of_match</th>\n",
" <th>venue</th>\n",
" <th>umpire1</th>\n",
" <th>umpire2</th>\n",
" <th>umpire3</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>1</td>\n",
" <td>2017</td>\n",
" <td>Hyderabad</td>\n",
" <td>2017-04-05</td>\n",
" <td>Sunrisers Hyderabad</td>\n",
" <td>Royal Challengers Bangalore</td>\n",
" <td>Royal Challengers Bangalore</td>\n",
" <td>field</td>\n",
" <td>normal</td>\n",
" <td>0</td>\n",
" <td>Sunrisers Hyderabad</td>\n",
" <td>35</td>\n",
" <td>0</td>\n",
" <td>Yuvraj Singh</td>\n",
" <td>Rajiv Gandhi International Stadium, Uppal</td>\n",
" <td>AY Dandekar</td>\n",
" <td>NJ Llong</td>\n",
" <td>NaN</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>5</td>\n",
" <td>2017</td>\n",
" <td>Bangalore</td>\n",
" <td>2017-04-08</td>\n",
" <td>Royal Challengers Bangalore</td>\n",
" <td>Delhi Daredevils</td>\n",
" <td>Royal Challengers Bangalore</td>\n",
" <td>bat</td>\n",
" <td>normal</td>\n",
" <td>0</td>\n",
" <td>Royal Challengers Bangalore</td>\n",
" <td>15</td>\n",
" <td>0</td>\n",
" <td>KM Jadhav</td>\n",
" <td>M Chinnaswamy Stadium</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" </tr>\n",
" <tr>\n",
" <th>8</th>\n",
" <td>9</td>\n",
" <td>2017</td>\n",
" <td>Pune</td>\n",
" <td>2017-04-11</td>\n",
" <td>Delhi Daredevils</td>\n",
" <td>Rising Pune Supergiant</td>\n",
" <td>Rising Pune Supergiant</td>\n",
" <td>field</td>\n",
" <td>normal</td>\n",
" <td>0</td>\n",
" <td>Delhi Daredevils</td>\n",
" <td>97</td>\n",
" <td>0</td>\n",
" <td>SV Samson</td>\n",
" <td>Maharashtra Cricket Association Stadium</td>\n",
" <td>AY Dandekar</td>\n",
" <td>S Ravi</td>\n",
" <td>NaN</td>\n",
" </tr>\n",
" <tr>\n",
" <th>13</th>\n",
" <td>14</td>\n",
" <td>2017</td>\n",
" <td>Kolkata</td>\n",
" <td>2017-04-15</td>\n",
" <td>Kolkata Knight Riders</td>\n",
" <td>Sunrisers Hyderabad</td>\n",
" <td>Sunrisers Hyderabad</td>\n",
" <td>field</td>\n",
" <td>normal</td>\n",
" <td>0</td>\n",
" <td>Kolkata Knight Riders</td>\n",
" <td>17</td>\n",
" <td>0</td>\n",
" <td>RV Uthappa</td>\n",
" <td>Eden Gardens</td>\n",
" <td>AY Dandekar</td>\n",
" <td>NJ Llong</td>\n",
" <td>NaN</td>\n",
" </tr>\n",
" <tr>\n",
" <th>14</th>\n",
" <td>15</td>\n",
" <td>2017</td>\n",
" <td>Delhi</td>\n",
" <td>2017-04-15</td>\n",
" <td>Delhi Daredevils</td>\n",
" <td>Kings XI Punjab</td>\n",
" <td>Delhi Daredevils</td>\n",
" <td>bat</td>\n",
" <td>normal</td>\n",
" <td>0</td>\n",
" <td>Delhi Daredevils</td>\n",
" <td>51</td>\n",
" <td>0</td>\n",
" <td>CJ Anderson</td>\n",
" <td>Feroz Shah Kotla</td>\n",
" <td>YC Barde</td>\n",
" <td>Nitin Menon</td>\n",
" <td>NaN</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" id season city date team1 \\\n",
"0 1 2017 Hyderabad 2017-04-05 Sunrisers Hyderabad \n",
"4 5 2017 Bangalore 2017-04-08 Royal Challengers Bangalore \n",
"8 9 2017 Pune 2017-04-11 Delhi Daredevils \n",
"13 14 2017 Kolkata 2017-04-15 Kolkata Knight Riders \n",
"14 15 2017 Delhi 2017-04-15 Delhi Daredevils \n",
"\n",
" team2 toss_winner toss_decision \\\n",
"0 Royal Challengers Bangalore Royal Challengers Bangalore field \n",
"4 Delhi Daredevils Royal Challengers Bangalore bat \n",
"8 Rising Pune Supergiant Rising Pune Supergiant field \n",
"13 Sunrisers Hyderabad Sunrisers Hyderabad field \n",
"14 Kings XI Punjab Delhi Daredevils bat \n",
"\n",
" result dl_applied winner win_by_runs \\\n",
"0 normal 0 Sunrisers Hyderabad 35 \n",
"4 normal 0 Royal Challengers Bangalore 15 \n",
"8 normal 0 Delhi Daredevils 97 \n",
"13 normal 0 Kolkata Knight Riders 17 \n",
"14 normal 0 Delhi Daredevils 51 \n",
"\n",
" win_by_wickets player_of_match venue \\\n",
"0 0 Yuvraj Singh Rajiv Gandhi International Stadium, Uppal \n",
"4 0 KM Jadhav M Chinnaswamy Stadium \n",
"8 0 SV Samson Maharashtra Cricket Association Stadium \n",
"13 0 RV Uthappa Eden Gardens \n",
"14 0 CJ Anderson Feroz Shah Kotla \n",
"\n",
" umpire1 umpire2 umpire3 \n",
"0 AY Dandekar NJ Llong NaN \n",
"4 NaN NaN NaN \n",
"8 AY Dandekar S Ravi NaN \n",
"13 AY Dandekar NJ Llong NaN \n",
"14 YC Barde Nitin Menon NaN "
]
},
"execution_count": 44,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# looking at the head\n",
"batting_first.head()"
]
},
{
"cell_type": "code",
"execution_count": 46,
"id": "ee0275d7-469f-4b6c-bc9d-9b92a84e25bd",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"<Figure size 700x700 with 1 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"# making a histogram\n",
"plt.figure(figsize=(7,7))\n",
"plt.hist(batting_first['win_by_runs'])\n",
"plt.title(\"Distribution of Runs\")\n",
"plt.xlabel(\"Runs\")\n",
"plt.show()\n"
]
},
{
"cell_type": "code",
"execution_count": 47,
"id": "37062ff0-9f5b-4541-8254-2cf3669fe26b",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"winner\n",
"Mumbai Indians 57\n",
"Chennai Super Kings 52\n",
"Kings XI Punjab 38\n",
"Kolkata Knight Riders 36\n",
"Royal Challengers Bangalore 35\n",
"Sunrisers Hyderabad 30\n",
"Rajasthan Royals 27\n",
"Delhi Daredevils 25\n",
"Deccan Chargers 18\n",
"Pune Warriors 6\n",
"Rising Pune Supergiant 5\n",
"Delhi Capitals 3\n",
"Kochi Tuskers Kerala 2\n",
"Rising Pune Supergiants 2\n",
"Gujarat Lions 1\n",
"Name: count, dtype: int64"
]
},
"execution_count": 47,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# finding out the number of wins w.r.t each team after batting first\n",
"batting_first['winner'].value_counts()"
]
},
{
"cell_type": "code",
"execution_count": 49,
"id": "c3b919f5-135b-4375-abbc-1f9cc8ac4332",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"<Figure size 700x700 with 1 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"# making a bar-plot for top 3 teams with most wins after batting first\n",
"plt.figure(figsize=(7,7))\n",
"plt.bar(list(batting_first['winner'].value_counts()[0:3].keys()),list(batting_first['winner'].value_counts()[0:3]),color=[\"blue\",\"yellow\",\"orange\"])\n",
"plt.show()"
]
},
{
"cell_type": "code",
"execution_count": 52,
"id": "e9fdcbac-194f-48d1-979d-6d1552da6c87",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"<Figure size 700x700 with 1 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"# making a pie chart\n",
"plt.figure(figsize=(7,7))\n",
"plt.pie(list(batting_first['winner'].value_counts()),labels=list(batting_first['winner'].value_counts().keys()),autopct='%0.1f%%')\n",
"plt.show()"
]
},
{
"cell_type": "code",
"execution_count": 53,
"id": "efb46e54-e425-4c30-a27e-dd4e918fde6a",
"metadata": {},
"outputs": [],
"source": [
"# extracting those records where a team has won after batting second\n",
"batting_second=ipl[ipl['win_by_wickets']!=0]"
]
},
{
"cell_type": "code",
"execution_count": 54,
"id": "0c6b8af0-7b5d-49b7-a87a-67b652cfe1a8",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>id</th>\n",
" <th>season</th>\n",
" <th>city</th>\n",
" <th>date</th>\n",
" <th>team1</th>\n",
" <th>team2</th>\n",
" <th>toss_winner</th>\n",
" <th>toss_decision</th>\n",
" <th>result</th>\n",
" <th>dl_applied</th>\n",
" <th>winner</th>\n",
" <th>win_by_runs</th>\n",
" <th>win_by_wickets</th>\n",
" <th>player_of_match</th>\n",
" <th>venue</th>\n",
" <th>umpire1</th>\n",
" <th>umpire2</th>\n",
" <th>umpire3</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>2</td>\n",
" <td>2017</td>\n",
" <td>Pune</td>\n",
" <td>2017-04-06</td>\n",
" <td>Mumbai Indians</td>\n",
" <td>Rising Pune Supergiant</td>\n",
" <td>Rising Pune Supergiant</td>\n",
" <td>field</td>\n",
" <td>normal</td>\n",
" <td>0</td>\n",
" <td>Rising Pune Supergiant</td>\n",
" <td>0</td>\n",
" <td>7</td>\n",
" <td>SPD Smith</td>\n",
" <td>Maharashtra Cricket Association Stadium</td>\n",
" <td>A Nand Kishore</td>\n",
" <td>S Ravi</td>\n",
" <td>NaN</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>3</td>\n",
" <td>2017</td>\n",
" <td>Rajkot</td>\n",
" <td>2017-04-07</td>\n",
" <td>Gujarat Lions</td>\n",
" <td>Kolkata Knight Riders</td>\n",
" <td>Kolkata Knight Riders</td>\n",
" <td>field</td>\n",
" <td>normal</td>\n",
" <td>0</td>\n",
" <td>Kolkata Knight Riders</td>\n",
" <td>0</td>\n",
" <td>10</td>\n",
" <td>CA Lynn</td>\n",
" <td>Saurashtra Cricket Association Stadium</td>\n",
" <td>Nitin Menon</td>\n",
" <td>CK Nandan</td>\n",
" <td>NaN</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>4</td>\n",
" <td>2017</td>\n",
" <td>Indore</td>\n",
" <td>2017-04-08</td>\n",
" <td>Rising Pune Supergiant</td>\n",
" <td>Kings XI Punjab</td>\n",
" <td>Kings XI Punjab</td>\n",
" <td>field</td>\n",
" <td>normal</td>\n",
" <td>0</td>\n",
" <td>Kings XI Punjab</td>\n",
" <td>0</td>\n",
" <td>6</td>\n",
" <td>GJ Maxwell</td>\n",
" <td>Holkar Cricket Stadium</td>\n",
" <td>AK Chaudhary</td>\n",
" <td>C Shamshuddin</td>\n",
" <td>NaN</td>\n",
" </tr>\n",
" <tr>\n",
" <th>5</th>\n",
" <td>6</td>\n",
" <td>2017</td>\n",
" <td>Hyderabad</td>\n",
" <td>2017-04-09</td>\n",
" <td>Gujarat Lions</td>\n",
" <td>Sunrisers Hyderabad</td>\n",
" <td>Sunrisers Hyderabad</td>\n",
" <td>field</td>\n",
" <td>normal</td>\n",
" <td>0</td>\n",
" <td>Sunrisers Hyderabad</td>\n",
" <td>0</td>\n",
" <td>9</td>\n",
" <td>Rashid Khan</td>\n",
" <td>Rajiv Gandhi International Stadium, Uppal</td>\n",
" <td>A Deshmukh</td>\n",
" <td>NJ Llong</td>\n",
" <td>NaN</td>\n",
" </tr>\n",
" <tr>\n",
" <th>6</th>\n",
" <td>7</td>\n",
" <td>2017</td>\n",
" <td>Mumbai</td>\n",
" <td>2017-04-09</td>\n",
" <td>Kolkata Knight Riders</td>\n",
" <td>Mumbai Indians</td>\n",
" <td>Mumbai Indians</td>\n",
" <td>field</td>\n",
" <td>normal</td>\n",
" <td>0</td>\n",
" <td>Mumbai Indians</td>\n",
" <td>0</td>\n",
" <td>4</td>\n",
" <td>N Rana</td>\n",
" <td>Wankhede Stadium</td>\n",
" <td>Nitin Menon</td>\n",
" <td>CK Nandan</td>\n",
" <td>NaN</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" id season city date team1 \\\n",
"1 2 2017 Pune 2017-04-06 Mumbai Indians \n",
"2 3 2017 Rajkot 2017-04-07 Gujarat Lions \n",
"3 4 2017 Indore 2017-04-08 Rising Pune Supergiant \n",
"5 6 2017 Hyderabad 2017-04-09 Gujarat Lions \n",
"6 7 2017 Mumbai 2017-04-09 Kolkata Knight Riders \n",
"\n",
" team2 toss_winner toss_decision result \\\n",
"1 Rising Pune Supergiant Rising Pune Supergiant field normal \n",
"2 Kolkata Knight Riders Kolkata Knight Riders field normal \n",
"3 Kings XI Punjab Kings XI Punjab field normal \n",
"5 Sunrisers Hyderabad Sunrisers Hyderabad field normal \n",
"6 Mumbai Indians Mumbai Indians field normal \n",
"\n",
" dl_applied winner win_by_runs win_by_wickets \\\n",
"1 0 Rising Pune Supergiant 0 7 \n",
"2 0 Kolkata Knight Riders 0 10 \n",
"3 0 Kings XI Punjab 0 6 \n",
"5 0 Sunrisers Hyderabad 0 9 \n",
"6 0 Mumbai Indians 0 4 \n",
"\n",
" player_of_match venue umpire1 \\\n",
"1 SPD Smith Maharashtra Cricket Association Stadium A Nand
Kishore \n",
41
"2 CA Lynn Saurashtra Cricket Association Stadium Nitin
Menon \n",
"3 GJ Maxwell Holkar Cricket Stadium AK Chaudhary
\n",
"5 Rashid Khan Rajiv Gandhi International Stadium, Uppal A
Deshmukh \n",
"6 N Rana Wankhede Stadium Nitin Menon \n",
"\n",
" umpire2 umpire3 \n",
"1 S Ravi NaN \n",
"2 CK Nandan NaN \n",
"3 C Shamshuddin NaN \n",
"5 NJ Llong NaN \n",
"6 CK Nandan NaN "
]
},
"execution_count": 54,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# looking at the head\n",
"batting_second.head()"
]
},
{
"cell_type": "code",
"execution_count": 56,
"id": "ff843db9-8d20-4c7a-9279-9becc8deb776",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"<Figure size 700x700 with 1 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"# making a histogram for frequency of wins w.r.t number of wickets\n",
"plt.figure(figsize=(7,7))\n",
"plt.hist(batting_second['win_by_wickets'],bins=30)\n",
"plt.show()"
]
},
{
"cell_type": "code",
"execution_count": 57,
"id": "50dba461-b296-42d7-8cad-96ec7172269c",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"winner\n",
"Kolkata Knight Riders 56\n",
"Mumbai Indians 50\n",
"Royal Challengers Bangalore 48\n",
"Chennai Super Kings 48\n",
"Rajasthan Royals 46\n",
"Kings XI Punjab 42\n",
"Delhi Daredevils 42\n",
"Sunrisers Hyderabad 27\n",
"Gujarat Lions 12\n",
"Deccan Chargers 11\n",
"Pune Warriors 6\n",
"Delhi Capitals 6\n",
"Rising Pune Supergiant 5\n",
"Kochi Tuskers Kerala 4\n",
"Rising Pune Supergiants 3\n",
"Name: count, dtype: int64"
]
},
"execution_count": 57,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# finding out the frequency of number of wins w.r.t each time after batting
second\n",
"batting_second['winner'].value_counts()"
]
},
{
"cell_type": "code",
"execution_count": 63,
"id": "17c4beb0-7643-41bd-8225-b8a82f51d057",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"<Figure size 700x700 with 1 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"# making a bar plot for top 3 teams with most wins after batting second\n",
"plt.figure(figsize=(7,7))\n",
"plt.bar(list(batting_second['winner'].value_counts()[0:3].keys()),list(batting_se
cond['winner'].value_counts()[0:3]),color=[\"purple\",\"red\",\"pink\"])\n",
"plt.show()"
]
},
{
"cell_type": "code",
"execution_count": 68,
"id": "cc32deb4-59f3-46ab-92f3-4332cfce8c0b",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"<Figure size 700x700 with 1 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"# making a pie chart for distribution of most wins after batting second \n",
"plt.figure(figsize=(7,7))\n",
"plt.pie(list(batting_second['winner'].value_counts()),labels=list(batting_second
['winner'].value_counts().keys()),autopct='%0.1f%%')\n",
44
"plt.show()"
]
},
{
"cell_type": "code",
"execution_count": 69,
"id": "73546272-5d01-4795-88c6-f3bd380e1006",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"season\n",
"2013 76\n",
"2012 74\n",
"2011 73\n",
"2010 60\n",
"2014 60\n",
"2016 60\n",
"2018 60\n",
"2019 60\n",
"2017 59\n",
"2015 59\n",
"2008 58\n",
"2009 57\n",
"Name: count, dtype: int64"
]
},
"execution_count": 69,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# looking at the number of matches played each season\n",
"ipl['season'].value_counts()"
]
},
{
"cell_type": "code",
"execution_count": 70,
"id": "8c57ba6c-b2dc-4de8-9996-885069a23dfc",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"393"
]
},
"execution_count": 70,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# finding out how many times a team has won the match after winning the
toss\n",
"import numpy as np\n",
"np.sum(ipl['toss_winner']==ipl['winner'])"
]
},
{
"cell_type": "code",
"execution_count": 76,
"id": "d1009955-0896-4b33-b421-d85b295a2fad",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"0.6179245283018868"
]
},
"execution_count": 76,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"393/636"
]
},
{
"cell_type": "code",
"execution_count": 84,
"id": "aef12d3a-1520-4803-b5e0-afe7027c5be8",
"metadata": {},
"outputs": [],
"source": [
"deliveries=pd.read_csv(\"C:/Users/Tamanna Rana/Downloads/deliveries.csv (1)/deliveries.csv\")"
]
},
{
"cell_type": "code",
"execution_count": 85,
"id": "c023678b-fe73-4b59-8ee3-dad8f41c8757",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>match_id</th>\n",
" <th>inning</th>\n",
" <th>batting_team</th>\n",
" <th>bowling_team</th>\n",
" <th>over</th>\n",
" <th>ball</th>\n",
" <th>batter</th>\n",
" <th>bowler</th>\n",
" <th>non_striker</th>\n",
" <th>batsman_runs</th>\n",
" <th>extra_runs</th>\n",
" <th>total_runs</th>\n",
" <th>extras_type</th>\n",
" <th>is_wicket</th>\n",
" <th>player_dismissed</th>\n",
" <th>dismissal_kind</th>\n",
" <th>fielder</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>335982</td>\n",
" <td>1</td>\n",
" <td>Kolkata Knight Riders</td>\n",
" <td>Royal Challengers Bangalore</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" <td>SC Ganguly</td>\n",
" <td>P Kumar</td>\n",
" <td>BB McCullum</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" <td>1</td>\n",
" <td>legbyes</td>\n",
" <td>0</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>335982</td>\n",
" <td>1</td>\n",
" <td>Kolkata Knight Riders</td>\n",
" <td>Royal Challengers Bangalore</td>\n",
" <td>0</td>\n",
" <td>2</td>\n",
" <td>BB McCullum</td>\n",
" <td>P Kumar</td>\n",
" <td>SC Ganguly</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>NaN</td>\n",
" <td>0</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>335982</td>\n",
" <td>1</td>\n",
" <td>Kolkata Knight Riders</td>\n",
" <td>Royal Challengers Bangalore</td>\n",
" <td>0</td>\n",
" <td>3</td>\n",
" <td>BB McCullum</td>\n",
" <td>P Kumar</td>\n",
" <td>SC Ganguly</td>\n",
" <td>0</td>\n",
" <td>1</td>\n",
" <td>1</td>\n",
" <td>wides</td>\n",
" <td>0</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>335982</td>\n",
" <td>1</td>\n",
" <td>Kolkata Knight Riders</td>\n",
" <td>Royal Challengers Bangalore</td>\n",
" <td>0</td>\n",
" <td>4</td>\n",
" <td>BB McCullum</td>\n",
" <td>P Kumar</td>\n",
" <td>SC Ganguly</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>NaN</td>\n",
" <td>0</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>335982</td>\n",
" <td>1</td>\n",
" <td>Kolkata Knight Riders</td>\n",
" <td>Royal Challengers Bangalore</td>\n",
" <td>0</td>\n",
" <td>5</td>\n",
" <td>BB McCullum</td>\n",
" <td>P Kumar</td>\n",
" <td>SC Ganguly</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>NaN</td>\n",
" <td>0</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" match_id inning batting_team bowling_team over \\\n",
"0 335982 1 Kolkata Knight Riders Royal Challengers Bangalore
0 \n",
"1 335982 1 Kolkata Knight Riders Royal Challengers Bangalore
0 \n",
"2 335982 1 Kolkata Knight Riders Royal Challengers Bangalore
0 \n",
"3 335982 1 Kolkata Knight Riders Royal Challengers Bangalore
0 \n",
"4 335982 1 Kolkata Knight Riders Royal Challengers Bangalore
0 \n",
"\n",
" ball batter bowler non_striker batsman_runs extra_runs \\\n",
"0 1 SC Ganguly P Kumar BB McCullum 0 1 \n",
"1 2 BB McCullum P Kumar SC Ganguly 0 0 \n",
"2 3 BB McCullum P Kumar SC Ganguly 0 1 \n",
"3 4 BB McCullum P Kumar SC Ganguly 0 0 \n",
"4 5 BB McCullum P Kumar SC Ganguly 0 0 \n",
"\n",
" total_runs extras_type is_wicket player_dismissed dismissal_kind
fielder \n",
"0 1 legbyes 0 NaN NaN NaN \n",
"1 0 NaN 0 NaN NaN NaN \n",
"2 1 wides 0 NaN NaN NaN \n",
"3 0 NaN 0 NaN NaN NaN \n",
"4 0 NaN 0 NaN NaN NaN "
]
},
"execution_count": 85,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"deliveries.head()"
]
},
{
"cell_type": "code",
"execution_count": 91,
"id": "200d003a-ec0a-4dfb-89bb-2b2d8c77234d",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"array([ 335982, 335983, 335984, ..., 1426310, 1426311, 1426312],\n",
" dtype=int64)"
]
},
"execution_count": 91,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"deliveries['match_id'].unique()"
]
},
{
"cell_type": "code",
"execution_count": 92,
"id": "d4ba9c8e-7f06-4638-9435-6087f945395e",
"metadata": {},
"outputs": [],
"source": [
"match_1=deliveries[deliveries['match_id']==1]"
]
},
{
"cell_type": "code",
"execution_count": 93,
"id": "63c19589-7813-44c8-aefa-51e169f00967",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>match_id</th>\n",
" <th>inning</th>\n",
" <th>batting_team</th>\n",
" <th>bowling_team</th>\n",
" <th>over</th>\n",
" <th>ball</th>\n",
" <th>batter</th>\n",
" <th>bowler</th>\n",
" <th>non_striker</th>\n",
" <th>batsman_runs</th>\n",
" <th>extra_runs</th>\n",
" <th>total_runs</th>\n",
" <th>extras_type</th>\n",
" <th>is_wicket</th>\n",
" <th>player_dismissed</th>\n",
" <th>dismissal_kind</th>\n",
" <th>fielder</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
"Empty DataFrame\n",
"Columns: [match_id, inning, batting_team, bowling_team, over, ball,
batter, bowler, non_striker, batsman_runs, extra_runs, total_runs, extras_type,
is_wicket, player_dismissed, dismissal_kind, fielder]\n",
"Index: []"
]
},
"execution_count": 93,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"match_1.head()"
]
},
{
"cell_type": "code",
"execution_count": 94,
"id": "e8fab8ef-d26f-4a5b-868f-feb3e839c836",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"(0, 17)"
]
},
"execution_count": 94,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"match_1.shape"
]
},
{
"cell_type": "code",
"execution_count": 95,
"id": "baf60a6b-c67b-4a15-a3da-12cefb645d98",
"metadata": {},
"outputs": [],
"source": [
"srh=match_1[match_1['inning']==1]"
]
},
{
"cell_type": "code",
"execution_count": 96,
"id": "8b55c305-09fd-4d94-8a8a-bdc13e4e95c4",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"Series([], Name: count, dtype: int64)"
]
},
"execution_count": 96,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"srh['batsman_runs'].value_counts()"
]
},
{
"cell_type": "code",
"execution_count": 97,
"id": "b0bf4590-f007-4b1e-8101-56bba168cc96",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"Series([], Name: count, dtype: int64)"
]
},
"execution_count": 97,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"srh['dismissal_kind'].value_counts()"
]
},
{
"cell_type": "code",
"execution_count": 98,
"id": "e21e2b33-1bf7-4295-adaa-502e883a2884",
"metadata": {},
"outputs": [],
"source": [
"rcb=match_1[match_1['inning']==2]"
]
},
{
"cell_type": "code",
"execution_count": 99,
"id": "01e08a77-c0b3-4f24-9df2-9dd6e43b38e7",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"Series([], Name: count, dtype: int64)"
]
},
"execution_count": 99,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"rcb['batsman_runs'].value_counts()"
]
},
{
"cell_type": "code",
"execution_count": 100,
"id": "60ea9ede-9115-4574-b059-26369bcb224a",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"Series([], Name: count, dtype: int64)"
]
},
"execution_count": 100,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"rcb['dismissal_kind'].value_counts()"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "34e7bc18-3392-4a5e-8d18-c4a6f9ea52bc",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.12.4"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
SCREENSHOTS OF THE PROJECT
1. Player Statistics:
I developed a simple function on top of the IPL dataset that retrieves the
basic information about an IPL player. The user enters a player’s name and
receives essential details such as the player’s team, batting and bowling
averages, strike rates, and other relevant statistics, which makes it a
convenient way to look up any player in the dataset.
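The lookup described above could be sketched roughly as follows. This is a minimal illustration, not the project's actual function: the helper name `player_summary`, the made-up `sample` rows, and the set of dismissal kinds not credited to the bowler are all assumptions; only the column names are taken from the `deliveries.head()` output in the notebook. It covers runs, batting average, strike rate, and wickets only.

```python
import pandas as pd


def player_summary(deliveries: pd.DataFrame, player: str) -> dict:
    """Basic batting and bowling figures for one player (hypothetical helper)."""
    batting = deliveries[deliveries["batter"] == player]
    bowling = deliveries[deliveries["bowler"] == player]

    runs = int(batting["batsman_runs"].sum())
    # Wides do not count as balls faced by the batter.
    balls_faced = int((~batting["extras_type"].eq("wides")).sum())
    dismissals = int(deliveries["player_dismissed"].eq(player).sum())

    # Assumption: these dismissal kinds are not credited to the bowler.
    not_bowler = {"run out", "retired hurt", "obstructing the field"}
    credited = bowling["is_wicket"].eq(1) & ~bowling["dismissal_kind"].isin(not_bowler)
    wickets = int(credited.sum())

    # Teams the player has batted or bowled for.
    teams = set(batting["batting_team"]) | set(bowling["bowling_team"])
    return {
        "teams": sorted(teams),
        "runs": runs,
        "batting_average": round(runs / dismissals, 2) if dismissals else None,
        "strike_rate": round(100 * runs / balls_faced, 2) if balls_faced else None,
        "wickets": wickets,
    }


# Tiny made-up sample using the same column names as deliveries.csv.
sample = pd.DataFrame({
    "batter": ["A", "A", "A", "A", "B", "C"],
    "bowler": ["X", "X", "X", "X", "A", "A"],
    "batting_team": ["T1", "T1", "T1", "T1", "T2", "T2"],
    "bowling_team": ["T2", "T2", "T2", "T2", "T1", "T1"],
    "batsman_runs": [4, 0, 1, 0, 0, 0],
    "extras_type": [None, "wides", None, None, None, None],
    "is_wicket": [0, 0, 0, 1, 1, 1],
    "player_dismissed": [None, None, None, "A", "B", "C"],
    "dismissal_kind": [None, None, None, "caught", "bowled", "run out"],
})
summary = player_summary(sample, "A")
```

On the real data the same call would take the loaded `deliveries` DataFrame and a name written exactly as it appears in the `batter` or `bowler` column, for example `"SC Ganguly"`.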
Team-wise Analysis:
Total Runs by IPL Team:
Top 5 players with most man of the match awards
A bar chart displays total final wins by IPL teams. Mumbai Indians lead with 5
trophies, followed by Chennai Super Kings with 4 trophies. The visualization
offers a clear comparison of final wins for each team.
2. Total Matches Won by the Teams
A pie chart depicts each team’s share of match wins. Mumbai Indians lead
with 16.9%, followed by Chennai Super Kings with 15.4%; these shares match
the batting-first win counts in the notebook (57 and 52 of 337 wins). The
visualization provides a succinct overview of each team’s share of
victories.
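The percentages on such a pie chart can also be computed as a table, which makes them easier to check. The sketch below is illustrative only: `win_shares` and the sample rows are hypothetical, and the batting-first filter `win_by_runs != 0` is an assumption modelled on the notebook's `win_by_wickets != 0` filter for `batting_second`.

```python
import pandas as pd


def win_shares(matches: pd.DataFrame, batting_first_only: bool = False) -> pd.Series:
    """Percentage of wins per team, rounded to one decimal (hypothetical helper)."""
    # Matches without a winner (no result) are left out of the denominator.
    decided = matches.dropna(subset=["winner"])
    if batting_first_only:
        # Assumed filter: a non-zero run margin means the winner batted first.
        decided = decided[decided["win_by_runs"] != 0]
    return (decided["winner"].value_counts(normalize=True) * 100).round(1)


# Made-up sample with the matches.csv column names used in the notebook.
sample_matches = pd.DataFrame({
    "winner": ["MI", "MI", "CSK", "KKR", None],
    "win_by_runs": [10, 0, 5, 0, 0],
})
all_shares = win_shares(sample_matches)
first_shares = win_shares(sample_matches, batting_first_only=True)
```

Passing the resulting Series to `plt.pie(shares, labels=shares.index, autopct='%0.1f%%')` would draw the same kind of chart as in the notebook.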
3. Distribution of Runs:
Top 3 teams with most wins after batting second:
Distribution of most wins after batting second:
4. Top decision in each season of IPL:
5. Winning Ratio Analysis:
Top 10 IPL Hosted Cities:
6. Number of Total Matches Per Season:
FUTURE USES AND CONCLUSIONS
5. Academic and Research Contributions
7. Technological Extensions
We analyzed the IPL datasets using several techniques and by plotting charts.
The analysis shows that Mumbai Indians have played more matches than any other
team and also have a good winning ratio, so Mumbai Indians emerge as the most
successful team in the IPL. Virat Kohli emerges as the most valuable player,
because he has won the most player-of-the-match awards and has scored the most
runs in IPL history, and DJ Bravo is the highest wicket-taker in IPL history.
We also found that most toss-winning teams chose to field first and that the
toss winner went on to win the match more often than not (393 matches in the
notebook’s count). This suggests that winning the toss is associated with
winning the match.
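A minimal sketch of how the toss observation can be checked, dividing by the number of decided matches rather than a hard-coded total. The helper name `toss_effect` and the sample rows are hypothetical; the column names `toss_winner`, `winner`, and `toss_decision` are those shown in the notebook's match table.

```python
import pandas as pd


def toss_effect(matches: pd.DataFrame):
    """Share of decided matches won by the toss winner (hypothetical helper)."""
    # Drop matches with no recorded winner so they do not dilute the ratio.
    decided = matches.dropna(subset=["winner"])
    won = decided["toss_winner"] == decided["winner"]
    overall = won.mean()
    # The same ratio split by what the toss winner chose to do.
    by_decision = won.groupby(decided["toss_decision"]).mean()
    return overall, by_decision


# Made-up sample rows for illustration.
sample_matches = pd.DataFrame({
    "toss_winner": ["A", "B", "A", "B", "A"],
    "winner": ["A", "A", "B", "B", None],
    "toss_decision": ["field", "bat", "field", "field", "bat"],
})
overall, by_decision = toss_effect(sample_matches)
```

A ratio close to 0.5 would mean the toss gives little advantage, so the split by `toss_decision` is usually the more informative of the two results.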
The analysis also identifies the season in which the most matches were played
(2013, with 76 matches) and the city that hosted the most IPL matches. We also
found which team has won the IPL title most often and which team has won the
most tosses.
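The season and host-city findings come down to two frequency counts. A hedged sketch, in which `season_and_city_summary` and the sample rows are hypothetical and the column names `season` and `city` are those shown in the notebook's match table:

```python
import pandas as pd


def season_and_city_summary(matches: pd.DataFrame, top_n: int = 10):
    """Matches per season, the busiest season, and the top host cities."""
    # Sort by season so the counts read chronologically.
    per_season = matches["season"].value_counts().sort_index()
    busiest_season = per_season.idxmax()
    # value_counts() already sorts by frequency, so head() gives the top hosts.
    top_cities = matches["city"].value_counts().head(top_n)
    return per_season, busiest_season, top_cities


# Made-up sample rows for illustration.
sample_matches = pd.DataFrame({
    "season": [2008, 2008, 2009, 2009, 2009],
    "city": ["Mumbai", "Delhi", "Mumbai", "Mumbai", "Delhi"],
})
per_season, busiest, top_cities = season_and_city_summary(sample_matches, top_n=1)
```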
BIBLIOGRAPHY
https://pandas.pydata.org/docs/user_guide/index.html
https://matplotlib.org/stable/api/index.html
https://seaborn.pydata.org/tutorial.html
https://www.kaggle.com/datasets/vora1011/ipl-2008-to-2021-all-match-dataset?select=IPL_Matches_2008_2022.csv