
Davp Ipl Pradeep

Uploaded by

gammappt001
Copyright
© All Rights Reserved

DATA ANALYSIS AND VISUALIZATION OF

IPL USING PYTHON

A PROJECT REPORT
Submitted by

PRADEEP KU. NAYAK


Reg No.- 240301120232
in partial fulfilment for the
award of the degree of

BACHELOR OF TECHNOLOGY
in
COMPUTER SCIENCE AND ENGINEERING

SCHOOL OF ENGINEERING AND TECHNOLOGY


BHUBANESWAR CAMPUS
CENTURION UNIVERSITY OF TECHNOLOGY AND MANAGEMENT
DECEMBER : 2024
DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING
SCHOOL OF ENGINEERING AND TECHNOLOGY
BHUBANESWAR CAMPUS

BONAFIDE CERTIFICATE
Certified that this project report “DATA ANALYSIS AND VISUALIZATION OF
IPL USING PYTHON” is the Bonafide work of PRADEEP KU. NAYAK who
carried out the project work under my supervision. This is to further certify to the best
of my knowledge, that this project has not been carried out earlier in this institute and
the university.

SIGNATURE
Prof. SABYASACHI MOHANTY
Professor of Computer Science and Engineering

Certified that the above mentioned project has been duly carried out as per the
norms of the college and statutes of the university.

SIGNATURE
Prof. Raj Kumar Mohanta
HEAD OF THE DEPARTMENT / DEAN OF THE SCHOOL
Professor of Computer Science and Engineering

DEPARTMENT SEAL
DECLARATION

I hereby declare that the project entitled “DATA ANALYSIS AND VISUALIZATION OF
IPL USING PYTHON” submitted for the “Minor Project” of 1st semester B. Tech in Computer
Science and Engineering is my original work and the project has not formed the basis for the award
of any Degree / Diploma or any other similar titles in any other University / Institute.

Name of the Student: PRADEEP KU. NAYAK


Signature of the Student:
Registration No: 240301120232
Place: BHUBANESWAR
Date: 26-12-2024
ACKNOWLEDGEMENTS

I wish to express my profound and sincere gratitude to Prof. SABYASACHI MOHANTY,
Department of Computer Science and Engineering, SoET, Bhubaneswar Campus, who guided me
through the intricacies of this project with matchless magnanimity.

I thank Prof. Raj Kumar Mohanta, Head of the Dept. of Computer Science and Engineering, SoET,
Bhubaneswar Campus, and Prof. Sujata Chakrabarty, Dean, School of Engineering and Technology,
Bhubaneswar Campus, for extending their support during the course of this investigation.

I would be failing in my duty if I did not acknowledge the cooperation rendered during various stages
of image interpretation by Prof. SABYASACHI MOHANTY.

I am highly grateful to my friends who evinced keen interest and invaluable support in the progress
and successful completion of my project work.

I am indebted to my group members for their constant encouragement, co-operation and help. Words of
gratitude are not enough to describe the accommodation and fortitude which they have shown
throughout my endeavour.

Name of the Student: PRADEEP KU. NAYAK

Signature of the Student:

Registration No: 240301120232

Place: BHUBANESWAR

Date: 26-12-2024
CONTENT

SL. NO.   INDEX

1         INTRODUCTION

2         DATA ANALYSIS AND VISUALIZATION OF IPL

3         CONCLUSION

4         FUTURE SCOPE

5         REFERENCE
LIST OF TABLE

1. CHAPTER – 1 INTRODUCTION

2. CHAPTER – 2 PROJECT WORK

3. CHAPTER – 3 SUMMARY, CONCLUSIONS & SCOPE FOR FURTHER STUDY


DATA ANALYSIS AND VISUALIZATION OF IPL (2008–2023) USING PYTHON
INTRODUCTION

The Indian Premier League (IPL) is an Indian professional Twenty20 (T20) cricket league that was
established in 2008 and has developed into one of the richest sports leagues in the world.
The brainchild of the Board of Control for Cricket in India (BCCI), the IPL is based on a
round-robin group and knockout format and has teams in major Indian cities.

Matches generally begin in the late afternoon or evening so that at least a portion of each is played
under floodlights at night, maximizing the television audience for worldwide broadcasts. Initially,
league matches were played on a home-and-away basis between all teams, but with the planned
expansion to 10 clubs (divided into two groups of five) in 2011, the format changed so that matches
between some teams were limited to a single encounter. The four teams that finish at the top of the
table contest three playoff matches, in which one team that lost its first-round game is given a
second chance to reach the final, a wrinkle aimed at maximizing potential television revenue.

With the advent of the IPL, almost overnight the world’s best cricketers, who had seldom made the
kind of money earned by their counterparts in other professional sports, became millionaires. The
owners of the IPL franchises, including major companies, Bollywood film stars, and media moguls,
bid for the best players in auctions organized by the league. At the outset of the IPL, the
well-financed Mumbai Indians had the league’s biggest payroll, more than $100 million. It cost
the Chennai Super Kings $1.5 million to secure the services of Mahendra Singh Dhoni in the initial
auction for the 2008 season and the Kolkata Knight Riders $2.4 million to sign Gautam Gambhir, the
opening batsman for the Indian national team, in the bidding for the 2011 season. Yuvraj Singh
(2014 and 2015), Ben Stokes (2017 and 2018), Pat Cummins (2020), Chris Morris (2021), and Sam
Curran (2023) are some of the other players who have been secured with the highest bids.

The eight founding franchises were the Mumbai Indians, the Chennai Super Kings, the Royal
Challengers Bangalore, the Deccan Chargers (based in Hyderabad), the Delhi Daredevils, the Kings
XI Punjab (Mohali), the Kolkata Knight Riders, and the Rajasthan Royals (Jaipur). In late 2010 two
franchises, Rajasthan and Punjab, were expelled from the league by the BCCI for breaches of
ownership policy, but they were later reinstated in time for the 2011 tournament. Two new
franchises, the Pune Warriors India and the Kochi Tuskers Kerala, joined the IPL for the 2011
tournament.

Analysis of IPL Data

1. Import libraries

2. Load the data

3. Analyse the data

Import Libraries
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
# Jupyter/IPython magic; remove this line when running as a plain script
%matplotlib inline

Load the data


# Raw strings (r'...') stop Python from treating the backslashes in Windows paths as escapes
match=pd.read_csv(r'D:\Data Science\IPL\Dataset\matches.csv')
delivery=pd.read_csv(r'D:\Data Science\IPL\Dataset\deliveries.csv')

Analyse the Data

Now take a look at the data we are working on:


match.head(5)
delivery.head(5)

When you run the cell, the matches table shows that the first match, back in 2008, was played
between KKR and RCB. KKR won the match at M Chinnaswamy Stadium, the Player of the Match was
BB McCullum, and the result was decided by runs.

The ball-by-ball deliveries table can be read in the same way. To look at the last five rows of a
table instead, run match.tail(5) in a Jupyter Notebook or Google Colab cell.

More Information about Matches and Ball Deliveries between 2008–2020


match.info()     # 816 entries (matches)
delivery.info()  # 193468 entries (deliveries)

List of the Participating Teams
all_teams = match['team1'].tolist() + match['team2'].tolist()
all_teams = list(set(all_teams))
all_teams

You will get the list of teams that played between 2008 and 2020. Keen IPL fans will spot some older
team names on the list: these teams no longer play, but they remain part of IPL history.
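The de-duplication step can be tried on a small stand-in table. This is only a sketch: toy_match and its rows are invented for illustration, and only the team1 and team2 column names come from the code above.

```python
import pandas as pd

# Toy stand-in for matches.csv (invented rows); only team1/team2 are used.
toy_match = pd.DataFrame({
    "team1": ["Kolkata Knight Riders", "Chennai Super Kings", "Mumbai Indians"],
    "team2": ["Royal Challengers Bangalore", "Mumbai Indians", "Deccan Chargers"],
})

# Union of both columns: a set drops duplicates, sorted() gives a stable order.
all_teams = sorted(set(toy_match["team1"]) | set(toy_match["team2"]))
print(all_teams)
```

Sorting is optional; it simply makes the list easier to scan than the arbitrary order of a plain set.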

Number of Matches per Venue


# Recent seaborn versions require the column to be passed by keyword (x=...)
sns.countplot(x='venue', data=match)
plt.xticks(rotation='vertical')

Number of matches per venue in IPL (2008–2020)

As you can see, Eden Gardens is the most-used IPL ground, having hosted nearly 80 matches.

Matches Played by Each Team


x = match['team1'].value_counts()
y = match['team2'].value_counts()
# fill_value=0 keeps teams that appear in only one of the two columns
x.add(y, fill_value=0).plot(kind='barh')

Matches played by each team in IPL (2008–2020)

We count how often each team appears in the team1 column and add the number of times it appears in
the team2 column. For example, if CSK appeared 90 times as team1 and 85 times as team2, the graph
shows a total of 175 matches. You can see that Mumbai Indians have played the highest number of
matches in the IPL.
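The addition of the two counts can be checked on a toy table. The sketch below uses invented rows and hypothetical variable names (as_team1, as_team2, matches_played). It also shows why Series.add with fill_value=0 is safer than a plain +, which yields NaN for a team that appears in only one column.

```python
import pandas as pd

# Invented fixtures; only the team1/team2 column names come from the dataset.
toy_match = pd.DataFrame({
    "team1": ["CSK", "CSK", "MI", "KTK"],
    "team2": ["MI", "RCB", "CSK", "MI"],
})

as_team1 = toy_match["team1"].value_counts()
as_team2 = toy_match["team2"].value_counts()

# Plain addition leaves NaN for teams missing from one side (KTK, RCB here).
naive_total = as_team1 + as_team2

# fill_value=0 treats a missing count as zero before adding.
matches_played = (
    as_team1.add(as_team2, fill_value=0).astype(int).sort_values(ascending=False)
)
print(matches_played)
```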

Matches Won by Each Team


x=match['winner'].value_counts()
print(x)

When you run this cell you will see that Mumbai Indians have won the highest number of matches,
followed by CSK and the other teams. To plot this result as a graph, run the following in the next
cell.


sns.countplot(x='winner', data=match)
plt.xticks(rotation='vertical')

Matches won by each team in IPL (2008–2020)

Top 5 Players with the Highest Number of Man of the Match Awards

If you are a team management official and these players go under the hammer, keep an eye on them:
they have the highest number of Man of the Match awards.

Let’s check out how to find them:


temp_data=match['player_of_match'].value_counts().head()
print(temp_data)
sns.barplot(x=temp_data.index,y=temp_data.values)
plt.title("Top 5 MoM")
plt.xticks(rotation=90)
# Players are on the x-axis and award counts on the y-axis
plt.xlabel("Player")
plt.ylabel("Match Count")
plt.show()

Players with the highest number of MoM awards in IPL (2008–2020)

Is your favourite player present in the above list?

The Top Batsman in the IPL

For this, we have to find the player with the highest number of runs. To do so, we group the
delivery dataset by batsman and sum the runs credited to each batsman. It’s simple logic, right?

Demonstrated below:
top_batsman=(delivery.groupby('batsman')['batsman_runs'].agg('sum').reset_index()
             .sort_values('batsman_runs', ascending=False).head(10))
top_batsman.set_index('batsman', inplace=True)
top_batsman.plot(kind='bar')

We grouped the deliveries by batsman, summed each batsman’s runs, and kept the 10 highest totals.
After this, we plotted the result as a bar graph.
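The group, sum, sort, and head pattern is easy to verify on a handful of rows. The following is a minimal sketch with invented deliveries. top_run_scorers is a hypothetical helper name, and only the batsman and batsman_runs columns come from the dataset.

```python
import pandas as pd

# Six invented deliveries for three placeholder batsmen.
toy_delivery = pd.DataFrame({
    "batsman": ["Batsman A", "Batsman B", "Batsman A",
                "Batsman C", "Batsman B", "Batsman A"],
    "batsman_runs": [4, 1, 6, 2, 4, 1],
})

def top_run_scorers(deliveries, n=10):
    """Total runs per batsman, highest first, limited to n rows."""
    totals = deliveries.groupby("batsman")["batsman_runs"].sum()
    return totals.sort_values(ascending=False).head(n)

top2 = top_run_scorers(toy_delivery, n=2)
print(top2)
```

Returning a Series indexed by batsman means the result can be plotted directly with .plot(kind='bar') without a separate set_index step.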

The top batsmen in IPL (2008–2020)

Virat Kohli is at the top, followed by Suresh Raina and other batsmen.

The Bowlers Who Have Conceded the Highest Number of Runs


delivery.groupby('bowler')['total_runs'].agg('sum').reset_index().sort_values('total_runs',
ascending=False).head(10)

For this, we grouped the deliveries by bowler, summed the runs conceded off each bowler’s
deliveries, and kept the 10 bowlers with the highest totals.

The Bowler with Team-wise Performance

Let’s suppose you are playing against CSK and want to find out which bowlers have performed well
against this team in previous years. To analyse a bowler’s team-wise performance, run the following
program in your Notebook cell:
mask=delivery['bowler']=='PP Chawla'
delivery[mask].groupby('batting_team')['total_runs'].agg('sum').plot(kind='bar')

We take PP Chawla as the example. This bowler conceded the highest number of runs in IPL history
up to 2020. We summed the total runs he conceded against each opposing team.

PP Chawla's bowling performance against IPL teams.

On these totals, PP Chawla has been most expensive against MI, CSK, RCB, RR, and DC, so a team that
has him may prefer not to play him against those sides.

Over-wise Batting Performance of Each Team in the IPL (2008–2020)


# Use every delivery here; the earlier mask would restrict the table to PP Chawla's bowling
delivery6=delivery[['batting_team','over','batsman_runs']]
# aggfunc='sum' totals the runs per over; 'count' would only count the deliveries
x=delivery6.pivot_table(values='batsman_runs', index='batting_team', columns='over',
aggfunc='sum')
sns.heatmap(x, cmap='summer')

For this, we use a pivot table that totals the batsmen’s runs for each batting team in each over.
The table is then drawn as a heatmap, as given below:
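The effect of the aggregation function in a pivot table can be seen on a toy table. The sketch below uses invented deliveries and two hypothetical result names, runs_by_over and balls_by_over, to contrast aggfunc='sum' (runs scored in each over) with aggfunc='count' (deliveries recorded in each over).

```python
import pandas as pd

# Invented deliveries: two teams, two overs.
toy_delivery = pd.DataFrame({
    "batting_team": ["MI", "MI", "MI", "CSK", "CSK", "CSK"],
    "over":         [1,    1,    2,    1,     2,     2],
    "batsman_runs": [4,    0,    6,    1,     2,     4],
})

# Runs scored by each team in each over.
runs_by_over = toy_delivery.pivot_table(
    values="batsman_runs", index="batting_team", columns="over",
    aggfunc="sum", fill_value=0,
)

# Number of deliveries recorded for each team in each over.
balls_by_over = toy_delivery.pivot_table(
    values="batsman_runs", index="batting_team", columns="over",
    aggfunc="count", fill_value=0,
)

print(runs_by_over)
print(balls_by_over)
```

Either table can be passed to sns.heatmap; which one is appropriate depends on whether the question is about scoring or about how many balls were bowled.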

Over-wise batting performance of each team in IPL (2008–2020)

As you can see, if you are playing against MI or CSK, you have to field your best bowling attack
from the first over. MI’s batsmen are quiet in the second and third overs; after that they go on the
rampage against their opponents. The same goes for CSK and RCB. This data is helpful not only from
the bowling team’s perspective but also from the batting team’s. If you are a team manager and this
data shows that your team is not performing well in the death overs, then you should probably focus
on buying a good finisher in the next auction. As the heatmap above shows, most teams lag towards
the end of the innings, except CSK and MI.

I think that’s why MI and CSK are the two most successful franchises in the IPL.

Dismissal Kind
sns.countplot(x='dismissal_kind', data=delivery)
plt.xticks(rotation='vertical')

Now, if you want to know how many runs Virat Kohli scored off Jasprit Bumrah, use the following
code:


mask=delivery['bowler']=='JJ Bumrah'
mask2=delivery['batsman']=='V Kohli'
# Keep only deliveries where both conditions hold, then total the batsman's runs
delivery[mask & mask2]['batsman_runs'].sum()

This sums the batsman’s runs over the deliveries where the bowler is JJ Bumrah and the batsman is
V Kohli.
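A head-to-head query of this kind can be wrapped in a small function. This is a sketch on invented deliveries. head_to_head is a hypothetical helper, and the player names are used only as labels that match the dataset's naming style. It combines both masks with & and returns the runs scored together with the number of deliveries recorded.

```python
import pandas as pd

# Invented deliveries; the run values are not real match data.
toy_delivery = pd.DataFrame({
    "bowler":  ["JJ Bumrah", "JJ Bumrah", "JJ Bumrah", "PP Chawla", "JJ Bumrah"],
    "batsman": ["V Kohli", "V Kohli", "AB de Villiers", "V Kohli", "V Kohli"],
    "batsman_runs": [4, 0, 1, 6, 2],
})

def head_to_head(deliveries, batsman, bowler):
    """Return (runs scored, deliveries recorded) for one batsman against one bowler."""
    both = (deliveries["batsman"] == batsman) & (deliveries["bowler"] == bowler)
    faced = deliveries.loc[both, "batsman_runs"]
    return int(faced.sum()), int(faced.count())

runs, balls = head_to_head(toy_delivery, "V Kohli", "JJ Bumrah")
print(runs, balls)
```

Note the difference between the two aggregates: sum gives runs, while count gives the number of deliveries, which is a different question.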

Who scored more runs by boundaries?
The IPL is known for some heavy hitting, and spectators enjoy seeing the ball arc from the center of
the field into the upper tiers of the stands. But who scores more of their runs this way? This flipped,
stacked bar plot shows the batsmen who have scored the highest percentage of their runs in
boundaries. Each bar is broken down into the percentage of a batsman’s runs scored in fours and in
sixes. For example, astonishingly, more than half (50.4%) of the runs scored by AD Russell in the
IPL have come from sixes, and 78.6% of his total runs have come from boundaries.
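The percentages behind such a plot could be computed along the following lines. This is a sketch under stated assumptions: the deliveries are invented, boundary_share is a hypothetical helper, and every delivery with batsman_runs equal to 4 or 6 is treated as a boundary, which ignores the rare case of four runs that were actually run.

```python
import pandas as pd

# Invented deliveries for two placeholder batsmen.
toy_delivery = pd.DataFrame({
    "batsman": ["A", "A", "A", "A", "B", "B", "B"],
    "batsman_runs": [6, 4, 1, 1, 4, 2, 2],
})

def boundary_share(deliveries):
    """Percentage of each batsman's runs that came from fours and from sixes."""
    runs = deliveries["batsman_runs"]
    by_batsman = deliveries["batsman"]
    summary = pd.DataFrame({
        "total": runs.groupby(by_batsman).sum(),
        # where() keeps the value when the condition holds, else substitutes 0.
        "fours": runs.where(runs == 4, 0).groupby(by_batsman).sum(),
        "sixes": runs.where(runs == 6, 0).groupby(by_batsman).sum(),
    })
    summary["pct_fours"] = 100 * summary["fours"] / summary["total"]
    summary["pct_sixes"] = 100 * summary["sixes"] / summary["total"]
    summary["pct_boundaries"] = summary["pct_fours"] + summary["pct_sixes"]
    return summary.sort_values("pct_boundaries", ascending=False)

share = boundary_share(toy_delivery)
print(share)
```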

How do the top run scorers score their runs?


This faceted bar chart looks at the 16 all-time leading run scorers in the IPL and breaks down how
they scored their runs by percentage.

Mean dot balls per over
This lollipop plot shows the mean number of dot balls (non-scoring balls) per over for the leading
dot-ball bowlers in the IPL. DW Steyn averages close to 3 dot balls per over across his IPL
career. The minimum qualification for inclusion is more than 260 overs bowled.
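One way to arrive at a mean-dot-balls-per-over figure is sketched below on invented data. The match_id column and the mean_dot_balls_per_over helper are assumptions made for illustration (the real file may name its match identifier differently), and a dot ball is taken to be any delivery with total_runs equal to 0.

```python
import pandas as pd

# Three invented overs: (match_id, over, bowler, runs off each ball).
overs = [
    (1, 1, "Bowler A", [0, 0, 1, 0, 4, 0]),
    (1, 2, "Bowler B", [0, 1, 1, 1, 1, 1]),
    (1, 3, "Bowler A", [1, 1, 0, 0, 2, 6]),
]
toy_delivery = pd.DataFrame(
    [(m, o, b, r) for m, o, b, runs in overs for r in runs],
    columns=["match_id", "over", "bowler", "total_runs"],  # match_id is hypothetical
)

def mean_dot_balls_per_over(deliveries):
    """Count dot balls in each over, then average those counts per bowler."""
    is_dot = deliveries["total_runs"] == 0
    dots_per_over = is_dot.groupby(
        [deliveries["bowler"], deliveries["match_id"], deliveries["over"]]
    ).sum()
    return dots_per_over.groupby(level="bowler").mean()

dot_rate = mean_dot_balls_per_over(toy_delivery)
print(dot_rate)
```

A qualification threshold such as the 260 overs mentioned above would be applied by counting each bowler's overs in dots_per_over and filtering before taking the mean.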

Team totals distribution


This histogram with rug shows the distribution of team totals across all editions of the IPL. It
approximates a normal distribution.
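The team totals that feed such a histogram can be derived from the ball-by-ball table. The sketch below uses invented rows and assumes the deliveries carry match_id and inning columns; both names are hypothetical here, since the report does not list the file's columns.

```python
import pandas as pd

# Invented deliveries; match_id and inning are assumed column names.
toy_delivery = pd.DataFrame({
    "match_id":     [1, 1, 1, 1, 2, 2],
    "inning":       [1, 1, 2, 2, 1, 1],
    "batting_team": ["MI", "MI", "CSK", "CSK", "MI", "MI"],
    "total_runs":   [4, 1, 6, 0, 2, 2],
})

# One total per innings: sum every delivery's runs within (match, innings, team).
team_totals = (
    toy_delivery.groupby(["match_id", "inning", "batting_team"])["total_runs"].sum()
)

print(team_totals)
print(team_totals.describe())  # mean, spread, and quartiles of the totals
```

The resulting Series is what would be passed to a histogram function, and describe() gives a quick numerical check on the shape before plotting.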

Who runs farthest?
Each time a player scores a non-boundary run, both they and the non-striking batsman traverse the
length of the playing wicket (the stumps are 22 yards apart, and the popping creases are 19.3 yards
apart; for this estimate we will assume each runner covers 22 yards per run). This flipped bar chart
shows which batsmen have covered the greatest distance over their IPL careers.
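The estimate amounts to multiplying runs actually run by 22 yards. A minimal sketch on invented deliveries follows. It assumes a non_striker column is available, treats any delivery with batsman_runs of 4 or 6 as a boundary with no running, and ignores extras such as byes. yards_run and YARDS_PER_RUN are hypothetical names.

```python
import pandas as pd

YARDS_PER_RUN = 22  # simplifying assumption stated in the text

# Invented deliveries; non_striker is an assumed column name.
toy_delivery = pd.DataFrame({
    "batsman":      ["A", "A", "B", "A", "C"],
    "non_striker":  ["B", "B", "A", "C", "A"],
    "batsman_runs": [1,   4,   2,   6,   3],
})

def yards_run(deliveries):
    """Estimated yards covered by each player, as striker and as non-striker."""
    ran = deliveries[~deliveries["batsman_runs"].isin([4, 6])]
    as_striker = ran.groupby("batsman")["batsman_runs"].sum()
    as_non_striker = ran.groupby("non_striker")["batsman_runs"].sum()
    # Both batsmen cover the pitch on every run, so the two counts are added.
    runs_covered = as_striker.add(as_non_striker, fill_value=0)
    return (runs_covered * YARDS_PER_RUN).sort_values(ascending=False)

distance = yards_run(toy_delivery)
print(distance)
```

In the toy data player A runs 1 as striker and 5 as non-striker, so 6 runs at 22 yards each gives 132 yards.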

Maximum and minimum team totals each season for the major franchises

This animated/interactive Plotly Cleveland dot plot shows the maximum and minimum team totals
for each of the major IPL franchises, per edition of the IPL. Data is missing for Chennai Super
Kings and Rajasthan Royals in 2016 and 2017, as they were suspended from participating in those
editions over allegations of corruption.

Distribution of each dismissal by over


This faceted line plot shows in which over each of the major dismissal types occur.
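The counts behind a dismissal-by-over plot can be tabulated with a cross-tabulation. The sketch below uses invented rows and assumes that dismissal_kind is empty for deliveries on which no wicket fell; the variable names are hypothetical.

```python
import pandas as pd

# Invented deliveries; None marks a ball on which no wicket fell.
toy_delivery = pd.DataFrame({
    "over": [1, 1, 5, 19, 19, 20],
    "dismissal_kind": ["bowled", None, "caught", "caught", "run out", "caught"],
})

# Keep only deliveries that produced a wicket.
wickets = toy_delivery.dropna(subset=["dismissal_kind"])

# Rows are dismissal types, columns are overs, cells are wicket counts.
dismissals_by_over = pd.crosstab(wickets["dismissal_kind"], wickets["over"])
print(dismissals_by_over)
```

Each row of the resulting table corresponds to one facet of the line plot, with the over number along the horizontal axis.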

How the leading run scorers are dismissed


This faceted bar plot shows the ways in which the 16 all-time leading run scorers have been
dismissed.

CONCLUSION

The analysis and visualization of IPL cricket data reveal intriguing insights into the performance of
teams and players. Examining batting, bowling, and fielding statistics allows us to identify trends
and patterns that shape the outcome of these prestigious tournaments.

In the IPL, team performance is characterized by dynamic player contributions, with top performers
often determining success. Visualization of batting strike rates, bowling averages, and fielding
efficiency highlights the critical roles played by key players. Additionally, analyzing match data over
the seasons reveals the evolution of team strategies and player dynamics.

Across franchises, the data also brings out team strengths and weaknesses. Data visualization
exposes the impact of key players, showcasing their consistency and adaptability.
Team performance metrics unveil the competitive balance among franchises and provide insights into
the factors influencing outcomes.

In conclusion, data analysis and visualization of IPL cricket showcase the intricate dynamics of the
game. They not only highlight individual and team performances but also offer a nuanced
understanding of cricket's evolution at both domestic and international levels. These insights
contribute to a deeper appreciation of the sport's complexity and the factors influencing success on
different cricketing platforms.

FUTURE SCOPE

The future of data analysis and visualization for IPL (Indian Premier League) promises to be even
more dynamic and insightful. With advancements in technology, data scientists and analysts will
harness the power of machine learning algorithms to derive deeper insights from player statistics,
match outcomes, and team performances.

Predictive modeling will play a crucial role in anticipating match results, player form, and strategic
decisions. Interactive dashboards and visualizations will become more immersive, allowing fans and
analysts alike to explore the data in real-time. Virtual and augmented reality may enhance the fan
experience, offering a more engaging and personalized way to interact with match data.

Furthermore, data storytelling will gain prominence, enabling analysts to convey complex insights in
a compelling and understandable manner. Social media integration will foster a sense of community
as fans share and discuss data-driven analyses. Ethical considerations will also be paramount,
ensuring the responsible use of data for fair play and unbiased assessments.

In summary, the future of data analysis and visualization in IPL is poised to be technologically
advanced, interactive, and socially connected, enriching the experience for both cricket enthusiasts
and professionals alike.

REFERENCE
Data analysis and visualization play crucial roles in gaining insights into the performance and trends
of cricket tournaments like the Indian Premier League (IPL). These tools allow enthusiasts, analysts,
and teams to explore and interpret various aspects of the game, enhancing the overall understanding
and enjoyment of cricket.[1]

In the context of the IPL, teams and players are subjected to intense scrutiny through statistical
analysis. Metrics such as batting averages, bowling strike rates, and economy rates are meticulously
examined to evaluate player performances. Advanced analytics can also reveal patterns in team
strategies, identifying factors that contribute to success or failure. Visualization tools like heat maps,
run-rate graphs, and player performance charts provide a comprehensive overview of the tournament
dynamics.[2]

Visualization tools bring these analyses to life, making complex statistics more accessible.
Infographics, interactive dashboards, and charts allow fans and analysts to explore data in an
engaging manner. For example, a geographical representation of player performances or a timeline of
key moments in a match can enhance the overall viewing experience.[3]

ASSESSMENT
Internal:

SL NO   RUBRICS                                                           FULL MARK   MARKS OBTAINED   REMARKS

1       Understanding the relevance, scope and dimension of the project   10

2       Methodology                                                       10

3       Quality of Analysis and Results                                   10

4       Interpretations and Conclusions                                   10

5       Report                                                            10

        Total                                                             50

Date: Signature of the Faculty

COURSE OUTCOME (COs) ATTAINMENT
➢ Expected Course Outcomes (COs):
(Refer to COs Statement in the Syllabus)

________________________________________________________________________________

________________________________________________________________________________

________________________________________________________________________________

________________________________________________________________________________

➢ Course Outcome Attained:


How would you rate your learning of the subject based on the specified COs?

1 2 3 4 5 6 7 8 9 10
LOW HIGH
➢ Learning Gap (if any):
___________________________________________________________________________
___________________________________________________________________________
___________________________________________________________________________
___________________________________________________________________________
➢ Books / Manuals Referred:
________________________________________________________________________________

________________________________________________________________________________

________________________________________________________________________________

________________________________________________________________________________

Date: Signature of the Student


➢ Suggestions / Recommendations:
(By the Course Faculty)

________________________________________________________________________________

________________________________________________________________________________

________________________________________________________________________________

________________________________________________________________________________

Date: Signature of the Faculty
