
Ip Project

Uploaded by

anantacharya290

ST. XAVIER’S SENIOR SEC. SCHOOL,
JAIPUR

INFORMATICS PRACTICES PROJECT

IPL Player Performance Analysis

Submitted in partial fulfillment of the requirements for the
Senior Secondary Examination (AISSCE)

By:
Name: Anant Acharya
Class: 12 E
Roll No:
Session: 2024-25
CERTIFICATE
This is to certify that Anant Acharya of Class XII E has successfully completed
the Informatics Practices project entitled IPL Player Performance Analysis

for the academic year 2024-25 under my supervision.

Place: Jaipur Signature of the Internal Supervisor


Date: Name: Nimmi Sam

Place: Jaipur Signature of the External Supervisor


Date: Name:
ID-SHEET

ROLL NO.:
NAME OF STUDENT: Anant Acharya
CONTACT NO.: 8290375558
EMAIL ADDRESS: [email protected]
INTERNAL SUPERVISOR: Mrs. Nimmi Sam
PROJECT TITLE: IPL Player Performance Analysis
FRONT END: PyCharm
PROGRAMMING LANGUAGE: Python
DATA SOURCE: CSV file
ACKNOWLEDGMENT
I take this opportunity to thank Rev. Fr. Principal M. Arockiam for providing
all the facilities required to carry out my project.

I would like to express my sincere gratitude to my supervisor Mrs. Nimmi Sam
for helping me develop the project and also for her constant encouragement
towards becoming more professionally qualified.
LANGUAGE SPECIFICATION
This project has been developed using the PYTHON programming language.

Data science is an essential part of any industry in this era of big data. It is
a field that works with vast volumes of data, using modern tools and techniques
to derive meaningful information, support business decisions and enable
predictive analysis. The data used for analysis can come from multiple sources
and in various formats.

Python is one of the most sought-after programming languages among data
professionals for data analysis.

Python provides the tools needed to analyse large datasets. It is supported by
powerful statistical, numerical and visualisation libraries such as pandas,
NumPy and Matplotlib, along with many more advanced libraries.
INTRODUCTION

In today’s fast-growing world, information has a vital role to play. The IT
revolution has affected not only business, education, science and technology but
also the way people think. Rapid changes in the economy and globalization are
putting more and more stress on cutting-edge technology and on processing information
swiftly, accurately and reliably. The conventional system could not deliver this
accuracy and speed.

Thus it has been replaced by a computer-based system that is reliable, accurate,
secure, versatile and efficient enough to process information speedily. The
computer-based system has proved revolutionary in satisfying the basic needs of
today’s modern business world: quick availability, processing and analysis of information.

Problems with the conventional (Manual) system:

1. Lack of immediate information retrieval.
2. Lack of immediate information storage.
3. Lack of prompt updating of transactions.
4. Lack of sorting of information.
5. Redundancy of information.
6. The time and effort required to generate accurate and prompt reports are high.

Need and benefits of computerisation

1. To make the information available accurately and speedily.
2. To minimise the burden of paper documents.
SCOPE OF THE PROJECT

This project is designed as an analytical tool for exploring and visualizing
statistics from the Indian Premier League (IPL). The goal is to give cricket enthusiasts,
analysts, and fans a structured, menu-driven interface for accessing insights about player
performances and match data. It combines three datasets (player statistics, match details,
and ball-by-ball deliveries) to offer both detailed statistical summaries and
visualizations.

The tool supports several key functionalities. Users can retrieve detailed individual player
statistics, including batting and bowling averages, strike rates, and fielding contributions like
catches and stumpings. It also allows for the identification of top players in various categories
such as runs scored, wickets taken, strike rates, and economy rates. For in-depth analysis, users
can compare multiple players' performances side-by-side or examine trends in a specific
batsman's runs or a bowler's dismissals over matches. Additionally, users can analyze top run-
scorers across different IPL seasons.
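
The measures named above follow the standard cricket definitions. The sketch below is
only an illustration of those definitions: the player statistics file already stores these
values, and the function names and the dismissals count used here are assumptions made
for the example, not part of the project code.

```python
def batting_average(runs: int, dismissals: int) -> float:
    """Runs scored per dismissal; NaN if the batter was never dismissed."""
    return runs / dismissals if dismissals else float("nan")


def batting_strike_rate(runs: int, balls_faced: int) -> float:
    """Runs scored per 100 balls faced."""
    return 100 * runs / balls_faced if balls_faced else float("nan")


def bowling_economy(runs_conceded: int, balls_bowled: int) -> float:
    """Runs conceded per over, taking one over as 6 balls."""
    return 6 * runs_conceded / balls_bowled if balls_bowled else float("nan")


def bowling_average(runs_conceded: int, wickets: int) -> float:
    """Runs conceded per wicket taken; NaN if no wickets were taken."""
    return runs_conceded / wickets if wickets else float("nan")


# Made-up figures, used only to show the arithmetic
print(batting_average(450, 12))       # 450 / 12
print(batting_strike_rate(450, 300))  # 100 * 450 / 300
print(bowling_economy(240, 180))      # 6 * 240 / 180
print(bowling_average(240, 8))        # 240 / 8
```

Returning NaN for an undefined ratio matches how pandas represents missing values, so
such results can be filtered out with pd.isna if needed.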

The implementation relies on Python's powerful data analysis libraries. The pandas library is
used for efficient data processing and manipulation, while matplotlib provides capabilities for
creating clear, insightful visualizations. The tool also employs techniques for handling and
validating datasets, ensuring robust and accurate analysis.
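
As a rough illustration of the dataset validation mentioned above, the sketch below checks
a small in-memory CSV for required columns using a set difference. The helper name
missing_columns and the sample data are hypothetical; the column names are the ones the
project expects in the deliveries dataset.

```python
import io

import pandas as pd

# Columns the project expects in the deliveries dataset
REQUIRED_DELIVERIES = {"match_id", "batter", "bowler", "batsman_runs", "dismissal_kind"}


def missing_columns(df: pd.DataFrame, required: set) -> set:
    """Return the required column names that are absent from df."""
    return set(required) - set(df.columns)


# Hypothetical sample that deliberately lacks the dismissal_kind column
sample_csv = io.StringIO("match_id,batter,bowler,batsman_runs\n1,A,B,4\n")
sample = pd.read_csv(sample_csv)

missing = missing_columns(sample, REQUIRED_DELIVERIES)
if missing:
    print(f"Missing columns in deliveries dataset: {missing}")
```

Checking the columns once, right after loading, means that later analysis steps can
assume the columns exist instead of failing midway with a KeyError.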

By automating these statistical operations and presenting the results visually, the project
streamlines the exploration of IPL data for anyone interested in cricket analytics, whether
they are analyzing past performances or comparing players.
DATA SOURCE
CSV file names: deliveries.csv, IPL Player Stat.csv, matches.csv
IMPLEMENTATION
SOURCE CODE
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
from typing import Dict, List


class IPLStatsAnalyzer:
    def __init__(self, stats_file: str, matches_file: str, deliveries_file: str):
        """Initialize the IPL Stats Analyzer with the datasets"""
        self.stats = pd.read_csv(stats_file)
        self.matches = pd.read_csv(matches_file)
        self.deliveries = pd.read_csv(deliveries_file)
        self._verify_columns()

    def _verify_columns(self):
        """Verify that all required columns are present"""
        required_columns_stats = {
            'player', 'runs', 'boundaries', 'balls_faced', 'wickets',
            'balls_bowled', 'runs_conceded', 'matches', 'batting_avg',
            'batting_strike_rate', 'bowling_economy', 'bowling_avg',
            'bowling_strike_rate', 'catches', 'stumpings'
        }
        required_columns_matches = {'id', 'season', 'winner', 'team1', 'team2'}
        required_columns_deliveries = {'match_id', 'batter', 'bowler',
                                       'batsman_runs', 'dismissal_kind'}

        # Set difference leaves only the required names that the file lacks
        missing_stats = required_columns_stats - set(self.stats.columns)
        missing_matches = required_columns_matches - set(self.matches.columns)
        missing_deliveries = required_columns_deliveries - set(self.deliveries.columns)

        if missing_stats:
            raise ValueError(f"Missing columns in stats dataset: {missing_stats}")
        if missing_matches:
            raise ValueError(f"Missing columns in matches dataset: {missing_matches}")
        if missing_deliveries:
            raise ValueError(f"Missing columns in deliveries dataset: {missing_deliveries}")

    def get_player_stats(self, player_name: str) -> Dict:
        """Get comprehensive stats for a player"""
        # Case-insensitive match on the player name
        player_data = self.stats[self.stats['player'].str.lower() == player_name.lower()]
        if len(player_data) == 0:
            return {"error": f"No data found for player: {player_name}"}

        stats_dict = player_data.iloc[0].to_dict()
        return {
            'name': stats_dict['player'],
            'matches': int(stats_dict['matches']),
            'runs': int(stats_dict['runs']),
            'batting_avg': round(stats_dict['batting_avg'], 2),
            'batting_strike_rate': round(stats_dict['batting_strike_rate'], 2),
            'boundaries': int(stats_dict['boundaries']),
            # Bowling and fielding figures may be missing (NaN); report them as 0
            'wickets': int(stats_dict['wickets']) if not pd.isna(stats_dict['wickets']) else 0,
            'bowling_economy': (round(stats_dict['bowling_economy'], 2)
                                if not pd.isna(stats_dict['bowling_economy']) else 0),
            'catches': int(stats_dict['catches']) if not pd.isna(stats_dict['catches']) else 0,
            'stumpings': (int(stats_dict['stumpings'])
                          if not pd.isna(stats_dict['stumpings']) else 0)
        }

    def get_top_players(self, category: str, limit: int = 10) -> pd.DataFrame:
        """Get top players in various categories"""
        if category == 'runs':
            return self.stats.nlargest(limit, 'runs')[
                ['player', 'runs', 'batting_avg', 'batting_strike_rate', 'matches']]
        elif category == 'wickets':
            return self.stats.nlargest(limit, 'wickets')[
                ['player', 'wickets', 'bowling_economy', 'bowling_avg', 'matches']]
        elif category == 'batting_strike_rate':
            # Only batters who have faced at least 100 balls are ranked.
            # 'boundaries' is shown because it is one of the verified columns;
            # 'boundaries_percent' is not checked by _verify_columns and would
            # raise a KeyError if the file lacks it.
            return self.stats[self.stats['balls_faced'] >= 100].nlargest(
                limit, 'batting_strike_rate')[
                ['player', 'batting_strike_rate', 'runs', 'boundaries', 'matches']]
        elif category == 'bowling_economy':
            # Only bowlers who have bowled at least 100 balls are ranked;
            # a lower economy rate is better, hence nsmallest
            return self.stats[self.stats['balls_bowled'] >= 100].nsmallest(
                limit, 'bowling_economy')[
                ['player', 'bowling_economy', 'wickets', 'bowling_avg', 'matches']]
        else:
            raise ValueError(f"Invalid category: {category}")

    def plot_player_comparison(self, player_names: List[str]):
        """Compare multiple players' statistics"""
        players_data = self.stats[self.stats['player'].isin(player_names)]

        if len(players_data) == 0:
            print("No data found for the specified players.")
            return

        fig, ((ax1, ax2), (ax3, ax4)) = plt.subplots(2, 2, figsize=(15, 12))

        # Batting Stats
        players_data.plot(kind='bar', x='player', y='batting_avg', ax=ax1, color='skyblue')
        ax1.set_title('Batting Average Comparison')
        ax1.tick_params(axis='x', rotation=45)

        players_data.plot(kind='bar', x='player', y='batting_strike_rate',
                          ax=ax2, color='lightgreen')
        ax2.set_title('Batting Strike Rate Comparison')
        ax2.tick_params(axis='x', rotation=45)

        # Bowling Stats (if applicable)
        bowling_data = players_data[players_data['wickets'] > 0]
        if len(bowling_data) > 0:
            bowling_data.plot(kind='bar', x='player', y='bowling_economy',
                              ax=ax3, color='salmon')
            ax3.set_title('Bowling Economy Comparison')
            ax3.tick_params(axis='x', rotation=45)

            bowling_data.plot(kind='bar', x='player', y='bowling_avg', ax=ax4,
                              color='orange')
            ax4.set_title('Bowling Average Comparison')
            ax4.tick_params(axis='x', rotation=45)

        plt.tight_layout()
        plt.show()

    def plot_batsman_performance(self, batsman_name: str):
        """Analyze the performance of a specific batsman"""
        batsman_data = self.deliveries[self.deliveries['batter'] == batsman_name]
        if batsman_data.empty:
            print(f"No data found for batsman: {batsman_name}")
            return

        # Total runs scored by the batsman in each match
        batsman_grouped = batsman_data.groupby('match_id')['batsman_runs'].sum()
        plt.figure(figsize=(12, 6))
        batsman_grouped.plot(kind='bar', color='blue', alpha=0.7)
        plt.title(f"{batsman_name}'s Runs per Match")
        plt.xlabel("Match ID")
        plt.ylabel("Runs")
        plt.show()

    def plot_bowler_performance(self, bowler_name: str):
        """Analyze the performance of a specific bowler"""
        bowler_data = self.deliveries[self.deliveries['bowler'] == bowler_name]
        if bowler_data.empty:
            print(f"No data found for bowler: {bowler_name}")
            return

        # Count, per match, the deliveries on which any dismissal was recorded.
        # This includes run-outs, which are not credited to the bowler.
        dismissals = bowler_data['dismissal_kind'].notna().groupby(
            bowler_data['match_id']).sum()
        plt.figure(figsize=(12, 6))
        dismissals.plot(kind='bar', color='green', alpha=0.7)
        plt.title(f"{bowler_name}'s Dismissals per Match")
        plt.xlabel("Match ID")
        plt.ylabel("Dismissals")
        plt.show()

    def plot_top_run_scorers_by_season(self):
        """Analyze top run-scorers for each season"""
        # Attach the season to every delivery through the match id
        merged_data = pd.merge(self.deliveries, self.matches,
                               left_on='match_id', right_on='id')
        season_runs = merged_data.groupby(
            ['season', 'batter'])['batsman_runs'].sum().reset_index()
        # idxmax selects the highest-scoring row of each season directly and
        # avoids groupby.apply, whose handling of the grouping column is
        # deprecated in recent pandas releases
        top_scorers = season_runs.loc[
            season_runs.groupby('season')['batsman_runs'].idxmax()]
        plt.figure(figsize=(12, 6))
        for season in top_scorers['season'].unique():
            season_data = top_scorers[top_scorers['season'] == season]
            plt.bar(season_data['season'], season_data['batsman_runs'],
                    label=season_data['batter'].values[0])

        plt.legend(title="Top Scorers")
        plt.title("Top Run Scorers by Season")
        plt.xlabel("Season")
        plt.ylabel("Runs")
        plt.show()

def main():
    try:
        analyzer = IPLStatsAnalyzer('IPL Player Stat.csv', 'matches.csv',
                                    'deliveries.csv')
        while True:
            print("\n=== IPL Stats Analysis Tool ===")
            print("1. Player Statistics")
            print("2. Top Players by Category")
            print("3. Compare Players")
            print("4. Batsman Performance")
            print("5. Bowler Performance")
            print("6. Top Run Scorers by Season")
            print("7. Exit")

            choice = input("\nEnter your choice (1-7): ")

            if choice == '1':
                player = input("Enter player name: ")
                stats = analyzer.get_player_stats(player)
                if "error" in stats:
                    print(f"\n{stats['error']}")
                else:
                    print(f"\nStatistics for {stats['name']}:")
                    print(f"Matches Played: {stats['matches']}")
                    print(f"Runs Scored: {stats['runs']}")
                    print(f"Batting Average: {stats['batting_avg']}")
                    print(f"Strike Rate: {stats['batting_strike_rate']}")
                    print(f"Boundaries: {stats['boundaries']}")
                    if stats['wickets'] > 0:
                        print(f"Wickets: {stats['wickets']}")
                        print(f"Bowling Economy: {stats['bowling_economy']}")
                    print(f"Catches: {stats['catches']}")
                    print(f"Stumpings: {stats['stumpings']}")

            elif choice == '2':
                print("\nCategories:")
                print("1. Most Runs")
                print("2. Most Wickets")
                print("3. Best Strike Rate")
                print("4. Best Economy Rate")

                category_choice = input("Choose category (1-4): ")

                if category_choice == '1':
                    print("\nTop Run Scorers:")
                    print(analyzer.get_top_players('runs'))
                elif category_choice == '2':
                    print("\nTop Wicket Takers:")
                    print(analyzer.get_top_players('wickets'))
                elif category_choice == '3':
                    print("\nBest Strike Rates (min 100 balls):")
                    print(analyzer.get_top_players('batting_strike_rate'))
                elif category_choice == '4':
                    print("\nBest Economy Rates (min 100 balls):")
                    print(analyzer.get_top_players('bowling_economy'))

            elif choice == '3':
                players = input("Enter player names (comma-separated): ").split(',')
                players = [p.strip() for p in players]
                analyzer.plot_player_comparison(players)

            elif choice == '4':
                batsman = input("Enter batsman name: ")
                analyzer.plot_batsman_performance(batsman)

            elif choice == '5':
                bowler = input("Enter bowler name: ")
                analyzer.plot_bowler_performance(bowler)

            elif choice == '6':
                analyzer.plot_top_run_scorers_by_season()

            elif choice == '7':
                print("Exiting the tool. Goodbye!")
                break

            else:
                print("Invalid choice. Please try again.")

    except Exception as e:
        print(f"An error occurred: {e}")


if __name__ == "__main__":
    main()
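
The season-wise analysis in the listing above depends on linking each delivery to its
match and then aggregating. The sketch below walks through that idea on a tiny made-up
dataset so the steps can be followed without the real CSV files. The player names and
numbers are invented for illustration, and idxmax is one possible way to pick the top
row per group, not necessarily the only one.

```python
import pandas as pd

# Invented sample data with the same column names as the project files
matches = pd.DataFrame({"id": [1, 2, 3], "season": [2008, 2008, 2009]})
deliveries = pd.DataFrame({
    "match_id": [1, 1, 2, 2, 3, 3],
    "batter": ["A", "B", "A", "B", "A", "B"],
    "batsman_runs": [4, 6, 1, 2, 6, 1],
})

# Step 1: attach the season to every delivery through the match id
merged = deliveries.merge(matches, left_on="match_id", right_on="id")

# Step 2: total runs per batter per season
season_runs = merged.groupby(["season", "batter"], as_index=False)["batsman_runs"].sum()

# Step 3: keep the row with the most runs in each season
top = season_runs.loc[season_runs.groupby("season")["batsman_runs"].idxmax()]

# Plain dict of season -> top scorer, for easy printing
top_by_season = {int(s): b for s, b in zip(top["season"], top["batter"])}
print(top_by_season)
```

In this sample, batter B has 8 runs against A's 5 in 2008, and A has 6 against B's 1 in
2009, so the dictionary maps 2008 to B and 2009 to A.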
SAMPLE OUTPUTS
Main Menu

Player Statistics

Top Players By Category


Batsman Performance

Bowler Performance
Top Run Scorers by Season
