
Computer Project

The document outlines a project focused on designing and developing a system for managing Indian Premier League (IPL) match data from 2008 to 2017 using Python and MySQL. It emphasizes the importance of CSV file handling, programming concepts, and database management while detailing the features of Python in these areas. The project aims to demonstrate practical skills in data analysis and visualization through the use of libraries such as pandas and matplotlib.

Uploaded by

spachudhan

Introduction:

Computers and technology play an integral role in modern life, shaping the way we communicate, solve problems, and innovate. For this final project, I have chosen to design and develop a system for analysing Indian Premier League (IPL) match data.

This project serves as a practical application of the concepts and skills I have learned during my computer studies. It integrates programming, problem-solving, and system design to access the details of IPL matches from 2008 to 2017 (read from the CSV file of the IPL database).

CSV (Comma-Separated Values) file handling refers to the process of reading, writing, and manipulating data stored in CSV files using programming languages. Since CSV files are widely used for storing tabular data, efficient handling is crucial for tasks like data analysis, database integration, or transferring data between applications.

Key Concepts in CSV File Handling:

1. Reading CSV Files:
   o Data can be extracted from a CSV file to perform operations or analysis.
   o Libraries like Python’s csv or pandas simplify reading rows and columns.
2. Writing to CSV Files:
   o Programs can generate CSV files to save output or export processed data.
   o Customizable formats allow including headers, delimiters, or specific data structures.
3. Manipulating Data:
   o Using programming, users can filter, sort, transform, or summarize data in CSV files.
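
The three operations listed above can be sketched with Python's built-in csv module. This is a minimal illustration rather than the project's own code: the sample rows and the helper names (read_rows, filter_by_season, write_rows) are hypothetical, and the CSV text is held in memory instead of in a real file.

```python
import csv
import io

# Hypothetical sample rows; the real ipl1.csv has many more columns.
SAMPLE = """id,season,winner,win_by_runs
1,2008,Kolkata Knight Riders,140
2,2008,Chennai Super Kings,33
3,2009,Mumbai Indians,19
"""


def read_rows(text):
    """Reading: parse CSV text into a list of dicts keyed by the header row."""
    return list(csv.DictReader(io.StringIO(text)))


def filter_by_season(rows, season):
    """Manipulating: keep only the rows whose 'season' field matches."""
    # csv returns every field as a string, so compare against str(season).
    return [row for row in rows if row["season"] == str(season)]


def write_rows(rows, fieldnames):
    """Writing: turn the rows back into CSV text, header included."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


rows = read_rows(SAMPLE)
season_2008 = filter_by_season(rows, 2008)
exported = write_rows(season_2008, ["id", "season", "winner", "win_by_runs"])
print(exported)
```

With a file on disk, the same reader and writer objects would be wrapped around open(path, newline="") instead of io.StringIO.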
Objective:
• To demonstrate a clear understanding of programming languages like Python, database management, and file handling.
• To apply theoretical knowledge to a real-world scenario.
• To develop an intuitive and functional system that addresses user needs effectively.
• To apply theoretical concepts of Python programming and database management in a real-world scenario.
• To demonstrate proficiency in designing, implementing, and connecting databases with Python applications.
• To solve a practical problem by creating a reliable and scalable system for managing the IPL database.

This project not only showcases my understanding of Python programming and MySQL database management but also emphasizes the importance of creating systems that are both functional and efficient in addressing everyday data management challenges.


Features of Python in Database Management:
1. Database Connectivity:
o Python supports various database
management systems like MySQL, SQLite,
PostgreSQL, and MongoDB through dedicated
libraries (mysql-connector-python, sqlite3,
psycopg2, etc.).
2. Ease of Query Execution:
o Python allows the execution of SQL queries
directly from code, enabling seamless
interaction with databases.
3. Data Manipulation:
o Python can fetch, insert, update, and delete
records from a database using standard SQL
commands integrated into scripts.
4. Database Abstraction:
o Libraries like SQLAlchemy provide Object-
Relational Mapping (ORM) to abstract
database operations into Python objects,
simplifying development.
5. Portability:
o Python’s database management features are
platform-independent, allowing code to run
across different operating systems.
6. Scalability and Performance:
o Python can handle large datasets and complex
database operations using efficient libraries
and frameworks.
7. Integration with Tools:
o Data fetched from a database in Python can be
passed to analysis and visualization tools like
pandas, Matplotlib, and NumPy.
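
The connectivity and data manipulation points above can be illustrated with a short sketch. It uses sqlite3 rather than MySQL, only because sqlite3 ships with Python and needs no server; the table name, columns, and team names are hypothetical. With mysql-connector-python the overall pattern (connect, cursor, execute, commit) is similar, but the placeholder is %s instead of ?.

```python
import sqlite3

# In-memory database, so the sketch leaves no file behind.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Hypothetical table; the column names are illustrative only.
cur.execute(
    "CREATE TABLE matches (id INTEGER PRIMARY KEY, season INTEGER, winner TEXT)"
)

# INSERT with ? placeholders, which keep the values separate from the SQL text.
sample = [(1, 2008, "Team A"), (2, 2008, "Team B"), (3, 2009, "Team A")]
cur.executemany("INSERT INTO matches VALUES (?, ?, ?)", sample)

# UPDATE and DELETE go through the same execute() call.
cur.execute("UPDATE matches SET winner = ? WHERE id = ?", ("Team C", 2))
cur.execute("DELETE FROM matches WHERE id = ?", (3,))
conn.commit()

# SELECT: fetch the remaining rows as a list of tuples.
cur.execute("SELECT id, season, winner FROM matches ORDER BY id")
remaining = cur.fetchall()
print(remaining)

conn.close()
```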

Features of Python in File Handling:
1. Cross-Platform Support:
o Python’s file handling works across operating
systems and supports various file formats
like .txt, .csv, .json, .xml, and more, making it
ideal for diverse use cases.
2. Simple Syntax:
o Python’s file handling uses intuitive functions
like open(), read(), write(), and close() for
performing basic operations.
3. Modes of Operation:
o Supports multiple modes (r, w, a, rb, wb, etc.)
for flexible file interactions, including binary
file handling.
4. Exception Handling:
o Python includes robust exception handling to
manage file errors like file not found,
permission issues, or read/write errors.
5. Large Data Handling:
o Python can efficiently handle large files by
reading them in chunks or using libraries like
pandas for structured data files (e.g., CSV,
Excel).
6. Support for Structured Data:
o Specialized libraries (csv, json, xml.etree)
simplify parsing and writing structured data
formats.
7. Automation Capabilities:
o Python scripts can automate repetitive file
handling tasks like backups, data migration, or
log file management.
8. Security Features:
o Sensitive data can be protected with libraries
such as cryptography (for encryption) or
hashlib (for hashing).
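
A few of the features above (file modes, exception handling, and reading in chunks) can be shown in one short sketch. The file names and the helper functions read_in_chunks and safe_read are hypothetical, and the example writes to a temporary folder that is deleted afterwards.

```python
import os
import tempfile


def read_in_chunks(path, chunk_size=1024):
    """Yield the file's bytes in fixed-size chunks ('rb' = read, binary)."""
    with open(path, "rb") as handle:
        while True:
            chunk = handle.read(chunk_size)
            if not chunk:          # empty bytes object means end of file
                break
            yield chunk


def safe_read(path):
    """Return the file's text, or None if the file cannot be opened."""
    try:
        with open(path, "r", encoding="utf-8") as handle:
            return handle.read()
    except (FileNotFoundError, PermissionError) as error:
        print(f"Could not read {path}: {error}")
        return None


with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "notes.txt")   # hypothetical file name

    # 'w' creates the file (or overwrites it); 'a' appends to the end.
    with open(path, "w", encoding="utf-8", newline="\n") as handle:
        handle.write("first line\n")
    with open(path, "a", encoding="utf-8", newline="\n") as handle:
        handle.write("second line\n")

    text = safe_read(path)
    total_bytes = sum(len(chunk) for chunk in read_in_chunks(path, chunk_size=8))
    missing = safe_read(os.path.join(folder, "missing.txt"))
```

Using with blocks means close() is called automatically, even if an error occurs while the file is open.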

Hardware and Software used:

Windows edition:
• Windows 7 Ultimate

System hardware:
• Manufacturer: Intel
• Processor: Intel® Core™2 Duo CPU E7500 @ 2.93GHz 2.93 GHz
• Installed memory (RAM): 4.00 GB
• System type: 64-bit Operating System
• Monitor: Acer

Libraries and modules used:


1. Core Libraries

Library/Module     Purpose
-----------------  ------------------------------------------------------------
numpy              Provides support for numerical computations. (Not used
                   directly in the code, but imported.)
pandas             For data manipulation and analysis, e.g., reading the
                   dataset, filtering data, and summary stats.

2. Visualization Libraries

Library/Module     Purpose
-----------------  ------------------------------------------------------------
matplotlib.pyplot  Used to create basic visualizations, such as setting plot
                   dimensions and displaying the plots.
seaborn            A modern visualization library that makes creating
                   attractive and informative statistical graphics easier.

Code-Specific Usage of Libraries

pandas

Used for:
• Reading the dataset (pd.read_csv).
• Analyzing data with functions like info(), unique(), value_counts(), and slicing with iloc.
• Filtering rows using conditions (df[df['column'].condition]).

matplotlib.pyplot

Used for:
• Setting figure size: plt.rcParams['figure.figsize'].
• Displaying plots with plt.show().

seaborn

Used for:
• Applying a modern visualization style: sns.set_style("darkgrid").
• Creating bar plots and count plots (sns.barplot() and sns.countplot()).

Imports in the Code


import numpy as np                # Numerical computing
import pandas as pd               # Data analysis and manipulation
import matplotlib.pyplot as plt   # Visualization
import seaborn as sns             # Statistical data visualization

Functions used:
1. Functions from pandas

Function                     Purpose                                   Usage in Code
---------------------------  ----------------------------------------  -----------------------------------------
pd.read_csv()                Reads data from a CSV file into a pandas  Used to load the dataset (ipl1.csv).
                             DataFrame.
df.info()                    Provides a concise summary of the         Prints information about the dataset.
                             DataFrame, including column data types
                             and non-null counts.
df['column'].max()           Returns the maximum value in a specific   Used to find the maximum match ID (total
                             column.                                   matches).
df['column'].unique()        Returns unique values in a column.        Used to find all unique IPL seasons.
df.iloc[]                    Accesses specific rows and columns by     Extracts details of specific matches
                             position.                                 (e.g., largest margin wins).
df['column'].idxmax()        Returns the index of the maximum value    Finds the row index for maximum win
                             in a column.                              margins (win_by_runs, win_by_wickets).
df['column'].ge(x)           Checks if the column values are greater   Filters rows with values >= 1 for minimum
                             than or equal to a specific value.        run and wicket victories.
df['column'].value_counts()  Counts occurrences of unique values in a  Counts the number of matches won by each
                             column.                                   team or awards won by players.
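
The pandas functions in this table can be tried out on a tiny, made-up DataFrame. The rows below are hypothetical stand-ins for ipl1.csv (only the column names are taken from the project's source code), so the values have no connection to real matches.

```python
import pandas as pd

# Hypothetical stand-in for ipl1.csv; only the column names come from the project.
df = pd.DataFrame({
    "id": [1, 2, 3, 4],
    "season": [2008, 2008, 2009, 2009],
    "winner": ["Team A", "Team B", "Team A", "Team A"],
    "win_by_runs": [0, 35, 12, 0],
})

# Largest id; this equals the number of matches only if ids run from 1 to N.
total_matches = df["id"].max()

# Distinct seasons, in order of first appearance.
seasons = df["season"].unique()

# Number of wins per team, most wins first.
wins = df["winner"].value_counts()

# Row with the largest run margin. idxmax() returns an index label, which
# matches the row position here because the frame has the default 0..N-1 index.
biggest = df.iloc[df["win_by_runs"].idxmax()]

# Boolean filter: keep rows whose run margin is >= 1.
won_by_runs = df[df["win_by_runs"].ge(1)]

print(total_matches, list(seasons), biggest["winner"], len(won_by_runs))
```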

2. Functions from matplotlib.pyplot

Function         Purpose                                   Usage in Code
---------------  ----------------------------------------  -----------------------------------------
plt.rcParams[]   Modifies default plot parameters (e.g.,   Sets the default figure size for all
                 figure size, font size, etc.).            plots (figure.figsize = (14, 8)).
plt.show()       Displays the current figure or plot.      Used to show the Seaborn plots.

3. Functions from seaborn

Function         Purpose                                   Usage in Code
---------------  ----------------------------------------  -----------------------------------------
sns.set_style()  Sets the overall aesthetic style of the   Applies the "darkgrid" style to all
                 plots.                                    visualizations.
sns.countplot()  Creates a count plot to show the          Visualizes the number of matches played
                 frequency of categories.                  in each season.
sns.barplot()    Creates a bar plot to display data in a   Used for visualizing team wins and
                 horizontal or vertical orientation.       "Player of the Match" counts.

4. Built-in Python Functions

Function         Purpose                                   Usage in Code
---------------  ----------------------------------------  -----------------------------------------
print()          Prints text or data to the console.       Used extensively to display results and
                                                           insights from the dataset.

Summary of Key Uses

• Data Analysis: Functions from pandas are used to load, filter, and analyze the dataset (e.g., read_csv(), iloc[], value_counts()).
• Visualization: matplotlib and seaborn functions are used to create visual representations of the data (e.g., countplot(), barplot()).
• Console Output: Python's print() function is used to display text-based results directly in the terminal.

Source code:
import numpy as np                # numerical computing (imported, not used directly)
import pandas as pd               # data processing, CSV file I/O (e.g. pd.read_csv)
import matplotlib.pyplot as plt   # visualization
import seaborn as sns             # modern visualization

plt.rcParams['figure.figsize'] = (14, 8)   # default size for every figure
sns.set_style("darkgrid")

# Raw string so the backslash in the Windows path is not read as an escape.
df = pd.read_csv(r"E:\ipl1.csv")

SEPARATOR = '---------------------------------------------------------------------------'

def print_separator():
    """Print the two-line divider shown before every result."""
    print(SEPARATOR)
    print(SEPARATOR)

# df.info() prints its own report and returns None, so print() also shows "None".
print_separator()
print(df.info())
print()

print_separator()
print('Total Matches are::::', df['id'].max())
print()

print_separator()
print('How many seasons data we’ve got in the dataset?')
print(df['season'].unique())
print()

# idxmax()/idxmin() return index labels; with the default RangeIndex these
# equal the row positions, so iloc selects the intended row.
print_separator()
print('Which Team had won by maximum runs?')
print(df.iloc[df['win_by_runs'].idxmax()])
print()

print_separator()
print('Which Team had won by maximum wickets?')
print(df.iloc[df['win_by_wickets'].idxmax()]['winner'])
print()

# .ge(1) drops the matches won by wickets, whose run margin is 0.
print_separator()
print('Which Team had won by (closest margin) minimum runs?')
print(df.iloc[df[df['win_by_runs'].ge(1)].win_by_runs.idxmin()]['winner'])
print()

print_separator()
print('Which Team had won by minimum wickets?')
print(df.iloc[df[df['win_by_wickets'].ge(1)].win_by_wickets.idxmin()])
print()

print_separator()
print('Which season had most number of matches?')
sns.countplot(x='season', data=df)
plt.show()
print()

print_separator()
print('The Most Successful IPL Team is:::')
data = df.winner.value_counts()
sns.barplot(y=data.index, x=data, orient='h')
plt.show()   # show this chart before the next figure is created
print()

print_separator()
print('The Players who got maximum times Man of the Match are:::')
top_players = df.player_of_match.value_counts()[:10]
fig, ax = plt.subplots()
sns.barplot(x=top_players.index, y=top_players, orient='v', ax=ax)
# Set the limits and labels after plotting so seaborn does not overwrite them.
ax.set_ylim([0, 20])
ax.set_ylabel("Count")
ax.set_title("Top player of the match Winners")
plt.show()

OUTPUT:

---------------------------------------------------------------------------
---------------------------------------------------------------------------
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 670 entries, 0 to 669
Data columns (total 18 columns):
 #   Column           Non-Null Count  Dtype
---  ------           --------------  -----
 0   id               670 non-null    int64
 1   season           670 non-null    int64
 2   city             670 non-null    object
 3   date             670 non-null    object
 4   team1            670 non-null    object
 5   team2            670 non-null    object
 6   toss_winner      670 non-null    object
 7   toss_decision    670 non-null    object
 8   result           636 non-null    object
 9   dl_applied       636 non-null    float64
 10  winner           670 non-null    object
 11  win_by_runs      636 non-null    float64
 12  win_by_wickets   636 non-null    float64
 13  player_of_match  633 non-null    object
 14  venue            636 non-null    object
 15  umpire1          635 non-null    object
 16  umpire2          635 non-null    object
 17  umpire3          0 non-null      float64
dtypes: float64(4), int64(2), object(12)
memory usage: 62.9+ KB
None

---------------------------------------------------------------------------
---------------------------------------------------------------------------
Total Matches are:::: 670

---------------------------------------------------------------------------
---------------------------------------------------------------------------
How many seasons data we’ve got in the dataset?
[2017 2008 2009 2010 2011 2012 2013 2014 2015 2016 2018]

---------------------------------------------------------------------------
---------------------------------------------------------------------------
Which Team had won by maximum runs?
id                 44
season             2017
city               Delhi
date               5/6/2017
team1              Mumbai Indians
team2              Delhi Daredevils
toss_winner        Delhi Daredevils
toss_decision      field
result             normal
dl_applied         0
winner             Mumbai Indians
win_by_runs        146
win_by_wickets     0
player_of_match    LMP Simmons
venue              Feroz Shah Kotla
umpire1            Nitin Menon
umpire2            CK Nandan
umpire3            NaN
Name: 43, dtype: object

---------------------------------------------------------------------------
---------------------------------------------------------------------------
Which Team had won by maximum wickets?
Kolkata Knight Riders

---------------------------------------------------------------------------
---------------------------------------------------------------------------
Which Team had won by (closest margin) minimum runs?
Mumbai Indians

---------------------------------------------------------------------------
---------------------------------------------------------------------------
Which Team had won by minimum wickets?
id                 560
season             2015
city               Kolkata
date               5/9/2015
team1              Kings XI Punjab
team2              Kolkata Knight Riders
toss_winner        Kings XI Punjab
toss_decision      bat
result             normal
dl_applied         0
winner             Kolkata Knight Riders
win_by_runs        0
win_by_wickets     1
player_of_match    AD Russell
venue              Eden Gardens
umpire1            AK Chaudhary
umpire2            HDPK Dharmasena
umpire3            NaN
Name: 559, dtype: object

---------------------------------------------------------------------------
---------------------------------------------------------------------------
Which season had most number of matches?

---------------------------------------------------------------------------
---------------------------------------------------------------------------
The Most Successful IPL Team is:::

---------------------------------------------------------------------------
---------------------------------------------------------------------------
The Players who got maximum times Man of the Match are:::

How the Code Works:
1. Loads the IPL Dataset:
   df = pd.read_csv(r"E:\ipl1.csv")
   o Reads the dataset from a CSV file into a DataFrame.

2. Data Inspection:
   o The info() function gives a summary of the dataset.
   o df['id'].max() is used to determine the total matches (this assumes id is a unique match identifier numbered consecutively from 1).

3. Identifies Key Insights:
   o Uses the .idxmax() function to find the rows with maximum values for certain columns like win_by_runs and win_by_wickets.
   o Applies filtering to determine minimum values, excluding zeros (.ge(1) keeps only values greater than or equal to 1).

4. Seasonal Trends:
   o Creates a count plot of matches per season with sns.countplot().

5. Team and Player Analysis:
   o Counts the number of matches won by each team using df.winner.value_counts() and visualizes it.
   o Lists and visualizes the top 10 players with the most "Player of the Match" awards using df.player_of_match.value_counts().
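
The "filter, then find the minimum" step in point 3 can be sketched on a small, made-up DataFrame. One detail is worth noting: idxmax() and idxmin() return index labels, so .loc is the matching accessor. With the default 0..N-1 index, labels and positions are the same and .iloc picks the same row; the sample below deliberately uses a different index to make the distinction visible. The data and the helper name closest_run_win are hypothetical.

```python
import pandas as pd

# Hypothetical rows; the index is deliberately not 0..3.
df = pd.DataFrame(
    {
        "winner": ["Team A", "Team B", "Team C", "Team D"],
        "win_by_runs": [0, 35, 1, 146],
    },
    index=[10, 11, 12, 13],
)


def closest_run_win(frame):
    """Return the row with the smallest run margin, ignoring margins of 0."""
    # A margin of 0 means the match was won by wickets, so drop those rows.
    run_wins = frame[frame["win_by_runs"].ge(1)]
    # idxmin() gives the index LABEL of the smallest remaining margin.
    label = run_wins["win_by_runs"].idxmin()
    # .loc looks the row up by label.
    return frame.loc[label]


closest = closest_run_win(df)
largest = df.loc[df["win_by_runs"].idxmax()]
print(closest["winner"], largest["winner"])
```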
Key Objectives of the Code:

1. Basic Dataset Exploration:
   o The dataset is loaded using pandas, and basic information (df.info()) is displayed, such as column names, data types, and non-null counts.

2. Statistical Questions Answered:
   o Total Matches: The total number of matches in the dataset is printed.
   o Seasons Analyzed: Identifies and lists all unique IPL seasons available in the dataset.

3. Team Performance Analysis:
   o Team with the Largest Victory (by Runs):
     • Identifies the team that won a match by the highest number of runs using the win_by_runs column.
   o Team with the Largest Victory (by Wickets):
     • Finds the team that won a match with the most wickets in hand, using the win_by_wickets column.
   o Team with Closest Victory (by Runs):
     • Determines the team that won a match by the smallest margin of runs, ensuring that the value is greater than 0.
   o Team with Smallest Victory (by Wickets):
     • Finds the team that won with the fewest wickets in hand, ensuring the value is greater than 0.

4. Seasonal Analysis:
   o Season with Most Matches:
     • Uses a count plot to show how many matches occurred in each season, making it easy to spot the busiest IPL season.

5. Success of Teams and Players:
   o Most Successful IPL Team:
     • Identifies and visualizes the teams with the highest number of match wins using a horizontal bar plot.
   o Players with Most "Player of the Match" Awards:
     • Displays the top 10 players who have received the "Player of the Match" award the most, using a vertical bar chart.

6. Visualization:
   o Uses Seaborn and Matplotlib to generate intuitive plots for better understanding and presentation of data.
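
The seasonal and player questions above are answered with plots in the project, but the same answers can also be read off as numbers. The sketch below is one possible way to do that, not part of the original program: the sample rows and the helper names busiest_season and top_n are hypothetical.

```python
import pandas as pd

# Hypothetical sample; None stands for a match with no award recorded.
df = pd.DataFrame({
    "season": [2008, 2008, 2009, 2009, 2009, 2010],
    "winner": ["Team A", "Team B", "Team A", "Team C", "Team A", "Team B"],
    "player_of_match": ["P1", "P2", "P1", None, "P1", "P2"],
})


def busiest_season(frame):
    """Return (season, match_count) for the season with the most rows."""
    counts = frame["season"].value_counts()
    return int(counts.idxmax()), int(counts.max())


def top_n(frame, column, n=10):
    """Return the n most frequent values in a column with their counts."""
    # value_counts() sorts in descending order and skips missing values.
    return frame[column].value_counts().head(n)


season, matches = busiest_season(df)
best_team = top_n(df, "winner", n=1)
top_players = top_n(df, "player_of_match", n=2)
print(season, matches)
print(top_players)
```

The Series returned by top_n has the names in its index and the counts as its values, which is the shape that a bar plot of the top players needs.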

What Can Be Learned from This Code?

• Match Trends: Total matches played and which season was the busiest.
• Team Performance: Which teams have been the most successful and dominant in the IPL.
• Player Excellence: Players who consistently delivered match-winning performances.
• Winning Margins: Insights into the biggest and closest wins in IPL history.

