Sports Analytics Management Answers

The document provides exam answers related to sports analytics management, covering functions in Python, dataset creation and manipulation using pandas, and data analysis techniques. It discusses correlations in player performance, data visualization methods, data cleaning, and skewness types. Additionally, it includes code examples and interpretations of graphical data representations.

Uploaded by

Naman Juneja

Sports Analytics Management - Exam Answers

Q.1 Attempt Any 4 out of 6 Questions (5 Marks Each)


a) Purpose of the following functions:

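 • del – Deletes a variable, a list element or slice, or a dictionary entry. It is a statement (del x) rather than a function, although del(x) is also accepted.
 • pop() – Removes and returns an element: by index for a list (the last element by default), by key for a dictionary.
 • type() – Returns the data type (class) of an object.
 • std() – Calculates the standard deviation of a pandas DataFrame or Series (the sample standard deviation by default).
 • head() – Returns the first n rows of a DataFrame or Series (n = 5 by default).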
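
As a quick illustration of the five items above, here is a minimal sketch. The list, dictionary, and Series values are made up for the example and are not part of the exam data.

```python
import pandas as pd

scores = [45, 30, 60, 70]

# del is a statement: remove the element at index 1 (the value 30)
del scores[1]                 # scores is now [45, 60, 70]

# pop() removes an element and returns it; with no index it takes the last one
last_score = scores.pop()     # scores is now [45, 60]

# pop() on a dictionary takes a key and returns the matching value
player = {"name": "Player1", "score": 45}
removed_score = player.pop("score")

# type() reports the class of an object
kind = type(scores)

# std() and head() are pandas methods
series = pd.Series([45, 30, 60, 70, 55, 35])
spread = series.std()         # sample standard deviation (ddof=1 by default)
first_rows = series.head()    # first 5 values

print(last_score, removed_score, kind, round(spread, 2))
print(first_rows)
```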

b) Code for given steps:

import pandas as pd

# Step (i) - Create dataset
df = pd.DataFrame({
    'Player': ['Player1', 'Player2', 'Player3'],
    'Playing position': ['Batsman', 'Bowler', 'All-rounder'],
    'Score': [45, 30, 60]
})

# Step (ii) - Add 3 more rows
new_data = pd.DataFrame({
    'Player': ['Player4', 'Player5', 'Player6'],
    'Playing position': ['Batsman', 'Bowler', 'Wicketkeeper'],
    'Score': [70, 55, 35]
})
df = pd.concat([df, new_data], ignore_index=True)

# Step (iii) - Display in ascending order of Score
print(df.sort_values(by='Score'))

c) Interpretation of Graph:

• There is a negative correlation between miles per gallon and displacement.

• As fuel efficiency increases, engine displacement decreases.

• Cars with high displacement tend to have lower miles per gallon.
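
A negative correlation like the one read off the graph can also be checked numerically. The sketch below uses a handful of invented values (not the data behind the exam graph), and the column names mpg and displacement are assumptions chosen for the example.

```python
import pandas as pd

# Hypothetical values chosen only to illustrate a negative relationship
cars = pd.DataFrame({
    "displacement": [100, 150, 200, 300, 400],
    "mpg": [35, 30, 25, 18, 14],
})

# Pearson correlation coefficient between the two columns
r = cars["mpg"].corr(cars["displacement"])
print(round(r, 2))  # a value close to -1 indicates a strong negative correlation
```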
d) Dataset Analysis:

• Number of rows: 1355

• Number of columns: 8

• Missing values: 0

• Data types:

 - Player: object
 - Overs: float64
 - Others: int64

Code: df.info()  (it prints its summary directly and returns None, so wrapping it in print() would also print "None")
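
Each figure in the answer above can also be obtained separately. The following is a minimal sketch on a small stand-in DataFrame; the three rows and the Runs column are hypothetical, since the exam dataset itself is not included here.

```python
import pandas as pd

# Hypothetical stand-in for the exam dataset (the real one has 1355 rows and 8 columns)
df = pd.DataFrame({
    "Player": ["Player1", "Player2", "Player3"],
    "Overs": [4.0, 3.2, 10.0],
    "Runs": [24, 18, 51],
})

n_rows, n_cols = df.shape                    # (number of rows, number of columns)
missing_total = int(df.isna().sum().sum())   # total count of missing cells
column_types = df.dtypes                     # data type of each column

print(n_rows, n_cols, missing_total)
print(column_types)

df.info()  # one-call summary of the same facts
```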

e) Difference between Bar Plot and Histogram:

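 • Bar Plot: Compares a value (such as a count or total) across categories. Bars are separated by gaps and can be reordered freely.
 • Histogram: Shows the distribution of a numerical variable by counting values in contiguous bins. Adjacent bars touch and the bin order is fixed.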
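
The difference shows up in the data each plot needs. This sketch, on made-up positions and scores, computes one count per category (bar plot input) and counts per numeric bin (histogram input); the choice of 4 bins is arbitrary.

```python
import numpy as np
import pandas as pd

# Hypothetical data for illustration
positions = pd.Series(["Batsman", "Bowler", "Batsman", "All-rounder", "Bowler", "Batsman"])
scores = np.array([45, 30, 60, 70, 55, 35])

# Bar plot input: one count per category, with no inherent order between categories
category_counts = positions.value_counts()

# Histogram input: counts per contiguous numeric bin, ordered along the number line
bin_counts, bin_edges = np.histogram(scores, bins=4)

print(category_counts)
print(bin_counts, bin_edges)
```

With matplotlib installed, category_counts.plot(kind="bar") and plt.hist(scores, bins=4) would draw the two charts.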

f) Outliers and Detection Methods:

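 • Outliers are values that deviate significantly from other data points.
 • Detection Methods:
 - IQR method: flag values below Q1 − 1.5 × IQR or above Q3 + 1.5 × IQR
 - Z-score method: flag values whose absolute z-score exceeds a chosen threshold (3 is a common choice)
 - Box plot: points drawn beyond the whiskers are potential outliers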
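
The IQR and z-score methods can be sketched in a few lines of pandas. The scores below are invented, with 250 planted as an obvious outlier. The z-score cutoff of 2 is an assumption made for this tiny sample: with only seven values no z-score can reach 3, the threshold often used on larger datasets.

```python
import pandas as pd

# Hypothetical scores; 250 is planted as an outlier
scores = pd.Series([45, 30, 60, 70, 55, 35, 250])

# IQR method: anything beyond 1.5 * IQR from the quartiles is flagged
q1 = scores.quantile(0.25)
q3 = scores.quantile(0.75)
iqr = q3 - q1
lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr
iqr_outliers = scores[(scores < lower) | (scores > upper)]

# Z-score method: distance from the mean in units of standard deviation
z_scores = (scores - scores.mean()) / scores.std()
z_outliers = scores[z_scores.abs() > 2]   # cutoff of 2 chosen for this small sample

print(iqr_outliers.tolist(), z_outliers.tolist())
```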

Q.2 Attempt Any 5 out of 7 Questions (8 Marks Each)


a) 4 Variables Positively Correlated with Player Performance:

 • Number of Innings Played
 • Total Runs Scored
 • Number of Wickets Taken
 • Batting Average or Strike Rate

b) 4 Functions of Sets with Example:

 • union(): Combines the elements of two sets – A.union(B)
 • intersection(): Elements common to both sets – A.intersection(B)
 • difference(): Elements in A that are not in B – A.difference(B)
 • issubset(): Checks whether A is a subset of B – A.issubset(B)
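
A short sketch of the four methods on two small example sets (the values of A and B are arbitrary):

```python
A = {1, 2, 3}
B = {2, 3, 4, 5}

combined = A.union(B)          # every element that is in A or B
common = A.intersection(B)     # elements that are in both A and B
only_in_a = A.difference(B)    # elements of A that are not in B
a_in_b = A.issubset(B)         # False, because 1 is not in B
pair_in_b = {2, 3}.issubset(B) # True, because both 2 and 3 are in B

print(combined, common, only_in_a, a_in_b, pair_in_b)
```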

c) Pie Plot Analysis on Captaincy Duration:

 • Mohammad Azharuddin has the longest captaincy duration.
 • Sourav Ganguly and MS Dhoni follow.
 • Other players like Virat Kohli and Rahul Dravid had shorter stints.

d) Code Explanation:

 • df.groupby("team")['score'].mean() – Returns mean score per team.
 • df[df["score"] > 100] – Filters rows with score greater than 100.
 • df.team.value_counts() – Returns frequency of each team.
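
The three expressions can be tried on a small DataFrame. The team names and scores here are hypothetical; only the column names team and score come from the question.

```python
import pandas as pd

# Hypothetical data with the two columns the expressions refer to
df = pd.DataFrame({
    "team": ["India", "India", "Australia", "Australia", "England"],
    "score": [120, 80, 95, 105, 60],
})

mean_by_team = df.groupby("team")["score"].mean()  # one mean per team
high_scores = df[df["score"] > 100]                # keeps only rows where score > 100
team_counts = df.team.value_counts()               # number of rows per team

print(mean_by_team)
print(high_scores)
print(team_counts)
```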

e) Data Validity in Data Cleaning:

 • Ensures data is accurate and falls within acceptable ranges.


 • Involves type checks, range checks, duplicate removal, and consistency checks.
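
A few of these checks can be sketched in pandas as follows. The data is invented, and the rule that Overs and Score cannot be negative is an assumed validity rule for the example.

```python
import pandas as pd

# Hypothetical data containing one duplicate row and one impossible value
df = pd.DataFrame({
    "Player": ["Player1", "Player2", "Player2", "Player3"],
    "Overs": [4.0, 3.0, 3.0, -2.0],
    "Score": [45, 30, 30, 60],
})

# Type check: is the Overs column numeric?
overs_is_numeric = pd.api.types.is_numeric_dtype(df["Overs"])

# Range check (assumed rule): overs and scores cannot be negative
out_of_range = df[(df["Overs"] < 0) | (df["Score"] < 0)]

# Duplicate removal: drop rows that repeat an earlier row exactly
deduplicated = df.drop_duplicates()

# Keep only the rows that pass the range check
valid = deduplicated[(deduplicated["Overs"] >= 0) & (deduplicated["Score"] >= 0)]

print(overs_is_numeric, len(out_of_range), len(deduplicated), len(valid))
```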

f) Types of Skewness:

 • Positive Skew: Long tail on the right; the mean is typically greater than the median, e.g., income distribution.
 • Negative Skew: Long tail on the left; the mean is typically less than the median, e.g., age at retirement.
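
The sign of the skewness can be checked with the pandas skew() method. The two Series below are artificial: the second is the mirror image of the first, so the signs come out opposite.

```python
import pandas as pd

# Artificial samples: one with a long right tail, and its mirror image
right_tailed = pd.Series([1, 2, 2, 3, 3, 3, 4, 20])
left_tailed = -right_tailed

right_skew = right_tailed.skew()   # positive for a long right tail
left_skew = left_tailed.skew()     # negative for a long left tail

# In the right-tailed sample the mean is pulled above the median
mean_above_median = right_tailed.mean() > right_tailed.median()

print(right_skew > 0, left_skew < 0, mean_above_median)
```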

g) Formula 1 Correlation Matrix Interpretation:

 • Driver and constructor are highly correlated (0.83).
 • Constructor has moderate correlation with position (0.51).
 • Car number has low correlation with performance.
 • Suggests team and driver impact performance more than car number.
