Python Class 6 Assignment Solution

Uploaded by

Arpit Dubey

PANDAS:

1- Create a Pandas DataFrame from the given data and add a new column “Voter” based on voter
age: if age > 18, the Voter column should be “Yes”; otherwise it should be “No”.

raw_Data = {'Voter_name': ['Geek1', 'Geek2', 'Geek3', 'Geek4',
                           'Geek5', 'Geek6', 'Geek7', 'Geek8'],
            'Voter_age': [15, 23, 25, 9, 67, 54, 42, np.nan]}

Solution:

import numpy as np
import pandas as pd

# np.nan is used for the missing age; the np.NaN alias was removed in NumPy 2.0
raw_Data = {'Voter_name': ['Geek1', 'Geek2', 'Geek3', 'Geek4',
                           'Geek5', 'Geek6', 'Geek7', 'Geek8'],
            'Voter_age': [15, 23, 25, 9, 67, 54, 42, np.nan]}

df = pd.DataFrame(raw_Data)

# Create a new column "Voter" based on voter age
df['Voter'] = np.where(df['Voter_age'] > 18, 'Yes', 'No')

print(df)
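One caveat with the np.where approach: the comparison NaN > 18 evaluates to False, so Geek8, whose age is missing, ends up labelled “No”. If a missing age should stay missing instead, a sketch along the following lines could be used. The Series.where step is an addition for illustration, not part of the original solution.

```python
import numpy as np
import pandas as pd

raw_Data = {'Voter_name': ['Geek1', 'Geek2', 'Geek3', 'Geek4',
                           'Geek5', 'Geek6', 'Geek7', 'Geek8'],
            'Voter_age': [15, 23, 25, 9, 67, 54, 42, np.nan]}
df = pd.DataFrame(raw_Data)

# Label every row first, then blank out the rows whose age is unknown
voter = pd.Series(np.where(df['Voter_age'] > 18, 'Yes', 'No'), index=df.index)
df['Voter'] = voter.where(df['Voter_age'].notna())

print(df)
```

Whether an unknown age should count as “No” or stay empty depends on how the column is used later; the assignment itself does not say.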

2- Create a Pandas DataFrame from the given data and collapse the First and Last columns into
one column, Full Name, so that the output contains only Full Name and Age; then convert the
Age column to the index.

raw_Data = {'First': ['Manan ', 'Raghav ', 'Sunny '],
            'Last': ['Goel', 'Sharma', 'Chawla'],
            'Age' : [12, 24, 56]}

Solution:

raw_Data = {'First': ['Manan ', 'Raghav ', 'Sunny '],
            'Last': ['Goel', 'Sharma', 'Chawla'],
            'Age': [12, 24, 56]}

df = pd.DataFrame(raw_Data)

# Combine First and Last columns into Full Name
# (strip the trailing space that the given first names carry)
df['Full Name'] = df['First'].str.strip() + ' ' + df['Last']

# Keep only Full Name and Age, then set Age as index
df = df[['Full Name', 'Age']].set_index('Age')

print(df)

3- Create a Pandas DataFrame from the given data -

raw_Data = {'Date':['10/2/2011', '11/2/2011', '12/2/2011', '13/2/2011'],
            'Product':['Umbrella', 'Matress', 'Badminton','Shuttle'],
            'Price':[1250, 1450, 1550, 400],
            'Expense': [ 21525220.653, 31125840.875, 23135428.768, 56245263.942]}

a- Add Index as Item1, Item2, Item3, Item4

b- Find the index labels of all items whose 'Price' is greater than 1000.

c- Replace products using map() with respective codes- Umbrella : 'U', Matress : 'M',
Badminton : 'B', Shuttle : 'S'

d- Round off the Expense column values to two decimal places.

e- Create a new column called 'Discounted_Price' after applying a 10% discount on the existing
'Price' column. (Try using a lambda function.)

f- Convert the column type of 'Date' to datetime format

g- Create a column Rank which ranks the products based on the price (the one with the highest
price will be rank 1).

Solution:

raw_Data = {'Date': ['10/2/2011', '11/2/2011', '12/2/2011', '13/2/2011'],
            'Product': ['Umbrella', 'Matress', 'Badminton', 'Shuttle'],
            'Price': [1250, 1450, 1550, 400],
            'Expense': [21525220.653, 31125840.875, 23135428.768, 56245263.942]}

df = pd.DataFrame(raw_Data)

# Task a: Add Index as Item1, Item2, Item3, Item4
df.index = ['Item1', 'Item2', 'Item3', 'Item4']

# Task b: Find index labels with Price > 1000
indexes_with_price_gt_1000 = df[df['Price'] > 1000].index.tolist()
print(indexes_with_price_gt_1000)

# Task c: Replace products using map()
product_map = {'Umbrella': 'U', 'Matress': 'M', 'Badminton': 'B', 'Shuttle': 'S'}
df['Product'] = df['Product'].map(product_map)

# Task d: Round off Expense column to two decimal places
df['Expense'] = df['Expense'].round(2)

# Task e: Create 'Discounted_Price' column with 10% discount
df['Discounted_Price'] = df['Price'] * 0.9

# Task f: Convert 'Date' column to datetime format
# The dates are day/month/year ('13/2/2011' cannot be month 13), so state the format explicitly
df['Date'] = pd.to_datetime(df['Date'], format='%d/%m/%Y')

# Task g: Create 'Rank' column based on price (highest price gets rank 1)
df['Rank'] = df['Price'].rank(ascending=False).astype(int)

print(df)
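Task e suggests trying a lambda function, while the solution multiplies the column directly. A minimal sketch of the lambda variant, rebuilding only the Price values from the given data (the names `prices` and `discounted` are illustrative), might look like this:

```python
import pandas as pd

prices = pd.Series([1250, 1450, 1550, 400],
                   index=['Item1', 'Item2', 'Item3', 'Item4'], name='Price')

# Apply a 10% discount to each price with a lambda
discounted = prices.apply(lambda price: price * 0.9)

print(discounted)
```

Both forms should give the same values; the vectorized multiplication is usually the faster of the two on large columns, and the lambda form is mainly useful when the per-row rule is more involved.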
Assignment: Exploring NBA Player Data

Download the nba.csv file containing NBA player data. Complete the following tasks using
Python, Pandas, and data visualization libraries:

1. Load Data:
● Load the nba.csv data into a Pandas DataFrame.
● Display basic information about the DataFrame.

2. Data Cleaning:
● Handle missing values by either removing or imputing them.
● Remove duplicate rows.

3. Data Transformation:
● Create a new column 'BMI' (Body Mass Index) using the formula: BMI = (weight in
pounds / (height in inches)^2) * 703, assuming a fixed height value of 70 inches (5 feet
10 inches).

4. Exploratory Data Analysis (EDA):
● Display summary statistics of the 'age', 'weight', and 'salary' columns.
● Calculate the average age, weight, and salary of players in each 'position' category.

5. Data Visualization:
● Create a histogram of player ages.
● Create a box plot of player salaries for each 'position'.
● Plot a scatter plot of 'age' vs. 'salary' with a different color for each 'position'.

6. Top Players:
● Display the top 10 players with the highest salaries.

7. College Analysis:
● Determine the top 5 colleges with the most represented players.

8. Position Distribution:
● Plot a pie chart to show the distribution of players across different 'positions'.

9. Team Analysis:
● Display the average salary of players for each 'team'.
● Plot a bar chart to visualize the average salary of players for each 'team'.

10. Extras:
● Get the index at which the minimum weight value is present.
● Sort the rows by name in alphabetical order (the sorting of the original DataFrame
should not change).
● Create a Series from the “name” column of the given DataFrame and display its first 10
and last 10 values.

Guidelines:

1. Write Python code to complete each task.
2. Provide comments explaining your code.
3. Use meaningful variable names.
4. Include necessary library imports.
5. Present your findings in a clear and organized manner.
6. Feel free to use additional code cells for each task.

Solution:

1. Load Data:

import pandas as pd

# Load the data into a Pandas DataFrame
df = pd.read_csv('nba.csv')

# Display basic information about the DataFrame
# (df.info() prints its summary itself and returns None, so it is not wrapped in print)
df.info()
print(df.head())

2. Data Cleaning:

# Handle missing values by dropping every row that has at least one missing value
df.dropna(inplace=True)

# Remove duplicate rows
df.drop_duplicates(inplace=True)
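The task allows either removing or imputing missing values, and the solution removes them. As a rough sketch of the imputing alternative, assuming a numeric 'Salary' column and a text 'College' column, and using a made-up three-player table rather than the real nba.csv:

```python
import numpy as np
import pandas as pd

# Hypothetical miniature of the player table; the real nba.csv is not reproduced here
players = pd.DataFrame({
    'Name': ['A', 'B', 'C', 'C'],
    'College': ['Duke', np.nan, 'Kansas', 'Kansas'],
    'Salary': [1000000.0, np.nan, 3000000.0, 3000000.0],
})

# Impute instead of dropping: median for numeric gaps, a placeholder for text gaps
cleaned = players.copy()
cleaned['Salary'] = cleaned['Salary'].fillna(cleaned['Salary'].median())
cleaned['College'] = cleaned['College'].fillna('Unknown')

# Duplicate rows are still removed afterwards
cleaned = cleaned.drop_duplicates().reset_index(drop=True)

print(cleaned)
```

Imputing keeps more rows for the later analysis, at the cost of introducing values that were never observed; the median and the 'Unknown' placeholder are illustrative choices only.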

3. Data Transformation: Create 'BMI' column using a fixed height value

# Assuming a fixed height value of 70 inches (5 feet 10 inches)
fixed_height = 70

# Create 'BMI' column
df['BMI'] = (df['Weight'] / (fixed_height ** 2)) * 703
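Because the height is fixed, the formula reduces to a constant multiple of the weight (703 / 70^2). The sketch below makes that explicit; the helper name `bmi_from_weight` and the sample weights are hypothetical and not taken from nba.csv.

```python
import pandas as pd

FIXED_HEIGHT_IN = 70  # inches, as assumed in the assignment


def bmi_from_weight(weight_lb, height_in=FIXED_HEIGHT_IN):
    """Return BMI for a weight in pounds and a height in inches."""
    return weight_lb / height_in ** 2 * 703


# Hypothetical weights in pounds
weights = pd.Series([180.0, 220.0, 250.0], name='Weight')
bmi = bmi_from_weight(weights)

print(bmi.round(2))
```

With a single assumed height, the BMI column carries the same information as the weight column, so it ranks players exactly as weight does.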

4. Exploratory Data Analysis (EDA):

# Summary statistics
print(df[['Age', 'Weight', 'Salary']].describe())

# Average age, weight, and salary by position
avg_by_position = df.groupby('Position')[['Age', 'Weight', 'Salary']].mean()
print(avg_by_position)

5. Data Visualization:

import matplotlib.pyplot as plt

# Histogram of player ages
plt.hist(df['Age'], bins=20)
plt.xlabel('Age')
plt.ylabel('Frequency')
plt.title('Distribution of Player Ages')
plt.show()

# Box plot of player salaries by position
# Pass an explicit Axes so that boxplot draws on this figure instead of opening a new one
fig, ax = plt.subplots(figsize=(10, 6))
df.boxplot(column='Salary', by='Position', ax=ax)
ax.set_ylabel('Salary')
ax.set_title('Box Plot of Player Salaries by Position')
fig.suptitle('')
plt.xticks(rotation=45)
plt.show()

# Scatter plot of 'age' vs. 'salary' by position
plt.figure(figsize=(10, 6))
colors = {'PG': 'red', 'SG': 'blue', 'SF': 'green', 'PF': 'purple', 'C': 'orange'}

# Plot each position separately so that the legend gets one labelled entry per color
for position, color in colors.items():
    subset = df[df['Position'] == position]
    plt.scatter(subset['Age'], subset['Salary'], c=color, alpha=0.5, label=position)

plt.xlabel('Age')
plt.ylabel('Salary')
plt.title('Age vs. Salary by Position')
plt.legend(title='Position')
plt.show()

6. Top Players:

top_players = df.nlargest(10, 'Salary')
print(top_players)

7. College Analysis:

top_colleges = df['College'].value_counts().nlargest(5)
print(top_colleges)

8. Position Distribution:

position_counts = df['Position'].value_counts()

plt.pie(position_counts, labels=position_counts.index, autopct='%1.1f%%', startangle=140)
plt.title('Position Distribution of Players')
plt.axis('equal')
plt.show()

9. Team Analysis:

avg_salary_by_team = df.groupby('Team')['Salary'].mean()
print(avg_salary_by_team)

plt.figure(figsize=(10, 6))
avg_salary_by_team.plot(kind='bar')
plt.xlabel('Team')
plt.ylabel('Average Salary')
plt.title('Average Salary of Players by Team')
plt.xticks(rotation=45)
plt.show()

10. Extras:

# Index label of the row with the minimum weight
min_weight_index = df['Weight'].idxmin()
print("Index with minimum weight value:", min_weight_index)

# Sorted copy by name; df itself keeps its original order
df_sorted = df.sort_values(by='Name', ignore_index=True)
print(df_sorted)

# Series built from the 'Name' column
name_series = df['Name']
print("Top 10 names:\n", name_series.head(10))
print("\nLast 10 names:\n", name_series.tail(10))
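Two details of the Extras code are easy to miss: idxmin returns an index label rather than a row position, and sort_values returns a new DataFrame instead of reordering the original. The sketch below illustrates both on a made-up three-row table (the names, weights and index labels are hypothetical stand-ins for nba.csv).

```python
import pandas as pd

# Hypothetical rows standing in for nba.csv, with non-default index labels
players = pd.DataFrame({'Name': ['Zed', 'Amy', 'Moe'],
                        'Weight': [220.0, 180.0, 205.0]},
                       index=[10, 11, 12])

# idxmin returns the index label of the smallest weight, not its position
lightest_label = players['Weight'].idxmin()

# sort_values returns a new DataFrame, so `players` keeps its original order;
# ignore_index=True renumbers the sorted copy from 0
sorted_players = players.sort_values(by='Name', ignore_index=True)

print(lightest_label)
print(sorted_players['Name'].tolist())
print(players['Name'].tolist())
```

The label-versus-position distinction matters here because rows were dropped during cleaning, so the remaining index labels are not guaranteed to be consecutive.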
