
6600 Week 6 - Assignment 6 - Questions

Uploaded by

Hemanth Kumar

6/3/24, 11:09 PM Assignment_6 - Jupyter Notebook

Module 6 Assignment

Assignment Instructions
You'll be working with the "120 years of Olympic History" dataset acquired by Randi Griffin. Your assignment is to identify the top five sports by number of medals awarded in 2016 and then perform the following analysis:

Tasks & Grading Rubric: (100 points total)

1. Open the dataset and format it as a pandas DataFrame.


2. Filter the DataFrame only to include the rows corresponding to medal winners in 2016. (15 Points)
3. Find out the medals awarded in 2016 for each sport. (10 Points)
4. List the top five sports by number of medals awarded in 2016. (5 Points)
5. Filter the DataFrame one more time to include the records for the top five sports in 2016. (10 Points)
6. Generate a bar plot of record counts corresponding to the top five sports in 2016. (20 Points)
7. Generate a histogram for the Age feature of all medal winners in the top five sports in 2016. (20 Points)
8. Generate a bar plot indicating how many medals were won by each country's team in the top five sports in 2016, and sort them. Make sure to set the figure size to 15x5 (20 Points)

In [ ]: # Import the necessary libraries


import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

Step 1: Data Import and Formatting

In [ ]: # Read the dataset into a DataFrame


df = pd.read_csv("athlete_events.csv")
df
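
As a quick illustration of what `pd.read_csv` returns, the sketch below parses a tiny inline CSV instead of the real file. The three rows and the column subset are made up for the example; the actual `athlete_events.csv` has many more rows and columns.

```python
import io

import pandas as pd

# Hypothetical three-row excerpt (assumption); not taken from the real dataset.
csv_text = """Name,Age,Team,Year,Sport,Medal
A,24,Norway,2016,Athletics,Gold
B,31,Kenya,2016,Athletics,
C,19,Japan,2012,Judo,Bronze
"""

# read_csv accepts a path or any file-like object; empty fields become NaN.
sample_df = pd.read_csv(io.StringIO(csv_text))

print(sample_df.shape)
print(sample_df.dtypes)
```

Checking `shape` and `dtypes` right after loading is a cheap way to confirm that the file was parsed as expected before any filtering.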

Step 2: Filter the DataFrame only to include the rows corresponding to medal winners in 2016 (15 Points)

a. Filter the DataFrame to include only records from the year 2016 (5 Points)

In [ ]: # GRADED Task 2a: Please add your code below in the 'your code here' sections

# Filter the DataFrame to include only records from the year 2016 and store in temp_df
temp_df = # your code here

In [ ]: # This cell is read-only and includes hidden test cases that will run in autograding.

https://www.coursera.org/learn/ie6600-computation-and-visualization-for-analytics/programming/Al1qz/python-data-analysis-and-visualization/lab?path=%2Fnotebooks%2Frelease%2FModule6%2FAssignment_6.ipynb 1/5

In [ ]: # Print the first 5 rows of the temporary DataFrame


temp_df.head()
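
Row filtering in pandas is usually done with a boolean mask. The sketch below shows the general pattern on a small made-up frame; the column name `Year` is assumed to match the dataset, and the data and variable names are hypothetical.

```python
import pandas as pd

# Made-up records for illustration only.
sample_df = pd.DataFrame(
    {
        "Year": [2012, 2016, 2016, 2008],
        "Sport": ["Judo", "Rowing", "Judo", "Rowing"],
        "Medal": ["Gold", None, "Silver", None],
    }
)

# The comparison yields a boolean Series; indexing with it keeps the True rows.
is_2016 = sample_df["Year"] == 2016
sample_2016 = sample_df[is_2016]

print(sample_2016)
```

The same mask could be passed to `sample_df.loc[is_2016]`, which behaves identically here and makes the row selection explicit.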

b. Find the number of null and non-null values in the 'Medal' column (5 Points)

In [ ]: # GRADED Task 2b: Please add your code below in the 'your code here' sections

# Display the number of null and non-null values in the 'Medal' column for temp_df
# Store this summary in the provided null_summary variable where True indicates the count of null values
# and False indicates the count of non-null values.

# Expected Final Display format:

# True int
# False int

null_summary = # your code here

null_summary

In [ ]: # This cell is read-only and includes hidden test cases that will run in autograding.
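
One common way to get a True/False tally of missing values is to chain `isnull()` with `value_counts()`. The sketch below uses a made-up column, and it is only one possible approach, not necessarily the one the autograder expects.

```python
import pandas as pd

# Made-up medal column: two missing entries, three present.
medals = pd.Series(["Gold", None, "Silver", None, "Bronze"], name="Medal")

# isnull() marks each entry True (missing) or False (present);
# value_counts() then tallies how often each flag occurs.
null_counts = medals.isnull().value_counts()

print(null_counts)
```

The result is a Series indexed by `True` and `False`, so the two counts always add up to the number of rows.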

c. Remove rows where the 'Medal' column has missing values. (5 Points)

In [ ]: # GRADED Task 2c: Please add your code below in the 'your code here' sections

# The summary above shows that the 'Medal' column contains null values.

# Remove rows where the 'Medal' column has missing values and store back in temp_df
# This operation filters the DataFrame to keep only records of athletes who won medals.

temp_df = # your code here

temp_df

In [ ]: # This cell is read-only and includes hidden test cases that will run in autograding.

In [ ]: temp_df.head()
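
Dropping rows with a missing value in one specific column can be sketched with `dropna(subset=...)`. The frame below is made up, and the column name `Medal` is assumed to match the dataset.

```python
import pandas as pd

# Made-up records; Age is also missing in one row to show that only
# the columns listed in `subset` are considered.
sample_df = pd.DataFrame(
    {
        "Name": ["A", "B", "C", "D"],
        "Age": [24.0, None, 30.0, 27.0],
        "Medal": ["Gold", "Silver", None, None],
    }
)

# Keep rows where 'Medal' is present; missing values elsewhere are ignored.
medalists = sample_df.dropna(subset=["Medal"])

print(medalists)
```

Without `subset`, `dropna()` would also discard row "B" because of its missing age, which is usually not what is wanted when only medal winners should be kept.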

Step 3: Find out the medals awarded in 2016 for each sport. (10 Points)


In [ ]: # GRADED Task 3: Please add your code below in the 'your code here' sections

# Find out the medals awarded in 2016 for each sport and store in sport_medal_summary

sport_medal_summary = # your code here

sport_medal_summary

In [ ]: # This cell is read-only and includes hidden test cases that will run in autograding.
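
Counting medals per sport is a group-and-count operation. The sketch below shows two equivalent ways to do it on made-up data; column names `Sport` and `Medal` are assumed. Each row is treated as one medal record, so every member of a medal-winning team would be counted separately.

```python
import pandas as pd

# Made-up medal records for illustration.
medalists = pd.DataFrame(
    {
        "Sport": ["Rowing", "Judo", "Rowing", "Rowing", "Judo", "Golf"],
        "Medal": ["Gold", "Silver", "Gold", "Bronze", "Gold", "Silver"],
    }
)

# Option 1: group by sport and count the non-null medal entries.
per_sport = medalists.groupby("Sport")["Medal"].count()

# Option 2: count how often each sport appears (sorted by frequency, descending).
per_sport_alt = medalists["Sport"].value_counts()

print(per_sport)
print(per_sport_alt)
```

`groupby(...).count()` returns the sports in alphabetical order, while `value_counts()` returns them by descending count, which can save a sorting step later.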

Step 4: List the top five sports by number of medals awarded in 2016. (5 Points)

In [ ]: # GRADED Task 4: Please add your code below in the 'your code here' sections

# List the top five sports by number of medals awarded and store in top_five_sports below.
# Expected Final Display format:

# Sport Name    Medal Count (int)
# Sport Name    Medal Count (int)
# Sport Name    Medal Count (int)
# Sport Name    Medal Count (int)
# Sport Name    Medal Count (int)

top_five_sports = # your code here

top_five_sports

In [ ]: # This cell is read-only and includes hidden test cases that will run in autograding.
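
Picking the five largest entries of a count Series can be sketched with a sort followed by `head(5)`. The counts below are invented for the example and do not reflect the real 2016 results.

```python
import pandas as pd

# Invented medal counts per sport (assumption, not real data).
medal_counts = pd.Series(
    {
        "Athletics": 9,
        "Swimming": 8,
        "Rowing": 7,
        "Judo": 3,
        "Hockey": 6,
        "Golf": 1,
        "Football": 5,
    },
    name="Medal",
)

# Sort from largest to smallest, then keep the first five rows.
top_five = medal_counts.sort_values(ascending=False).head(5)

print(top_five)
```

`medal_counts.nlargest(5)` gives the same five entries in one call; which form reads better is largely a matter of taste.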

Step 5: Filter the DataFrame one more time to include the records for the top five sports in 2016. (10 Points)

In [ ]: # GRADED Task 5: Please add your code below in the 'your code here' sections

# Filter the DataFrame one more time to include the records for the top five sports in 2016.
# Store this filtered DataFrame in top5_df provided below.

top5_df = # your code here

top5_df

In [ ]: # This cell is read-only and includes hidden test cases that will run in autograding.


In [ ]: top5_df.head()
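
Keeping only the rows whose sport belongs to a given set is typically done with `Series.isin`. The sketch below uses made-up data and a hypothetical two-sport list in place of the real top five.

```python
import pandas as pd

# Made-up medal records.
medalists = pd.DataFrame(
    {
        "Sport": ["Athletics", "Judo", "Swimming", "Golf", "Athletics"],
        "Medal": ["Gold", "Silver", "Bronze", "Gold", "Silver"],
    }
)

# Hypothetical list of sports to keep; in practice this could come from
# the index of a top-five count Series.
sports_to_keep = ["Athletics", "Swimming"]

# isin() returns True for rows whose sport is in the list.
subset = medalists[medalists["Sport"].isin(sports_to_keep)]

print(subset)
```

If the top sports are held in a Series of counts, passing its `.index` to `isin` avoids retyping the names.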

Step 6: Generate a bar plot of record counts corresponding to the top five sports in 2016. (20 Points)

In [ ]: # GRADED Task 6: Please add your code below in the 'your code here' sections

# Plot a bar chart that shows the record counts of the Top 5 2016 sports

top_5_plot = # your code here

plt.show()

In [ ]: # This cell is read-only and includes hidden test cases that will run in autograding.
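
A bar plot of record counts can be sketched by counting values and calling the pandas plotting wrapper, which returns a matplotlib `Axes`. The data below is made up, and the `Agg` backend line is an assumption so that the sketch also runs outside a notebook.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend (assumption); not needed in Jupyter

import pandas as pd

# Made-up records: one row per medal record.
records = pd.DataFrame(
    {"Sport": ["Rowing", "Rowing", "Rowing", "Judo", "Judo", "Golf"]}
)

# Count records per sport, then draw one bar per sport.
counts = records["Sport"].value_counts()
ax = counts.plot(kind="bar")

# Label the axes so the chart is readable on its own.
ax.set_xlabel("Sport")
ax.set_ylabel("Number of records")
ax.set_title("Record counts per sport (made-up data)")
```

Because `value_counts()` sorts by frequency, the bars appear from tallest to shortest without an extra sorting step.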

Step 7: Generate a histogram for the Age feature of all medal winners in the top five sports in 2016 (20 Points)

In [ ]: # GRADED Task 7: Please add your code below in the 'your code here' sections

# Plot a histogram and store in the age_histogram variable

age_histogram = # your code here

age_histogram

In [ ]: # This cell is read-only and includes hidden test cases that will run in autograding for partial points

In [ ]: # This cell is read-only and includes hidden test cases that will run in autograding for partial points
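
A histogram groups a numeric column into bins and draws one bar per bin. The sketch below uses made-up ages and an arbitrary bin count of 4; both are assumptions for illustration, as is the `Agg` backend line.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend (assumption); not needed in Jupyter

import matplotlib.pyplot as plt
import pandas as pd

# Made-up ages, including one missing value.
ages = pd.Series([19, 22, 22, 25, 27, 27, 28, 31, 35, None], name="Age")

fig, ax = plt.subplots()

# Drop missing ages explicitly, then bin the rest.
# hist() returns the per-bin counts, the bin edges, and the drawn bars.
bin_counts, bin_edges, _ = ax.hist(ages.dropna(), bins=4)

ax.set_xlabel("Age")
ax.set_ylabel("Number of medal winners")
ax.set_title("Age distribution (made-up data)")
```

The returned `bin_counts` always sum to the number of non-missing ages, which is a handy sanity check on the plot.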

Step 8: Generate a bar plot indicating how many medals were won by each country's team in the top five sports in 2016, and sort them. Make sure to set the figure size to 15x5 (20 Points)

In [ ]: # GRADED Task 8: Please add your code below in the 'your code here' sections

# Plot a bar chart indicating medals won by each team in the top 5 2016 sports.
# Store this in final_bar_plot

final_bar_plot = # your code here


In [ ]: # This cell is read-only and includes hidden test cases that will run in autograding for partial points

In [ ]: # This cell is read-only and includes hidden test cases that will run in autograding for partial points
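
A sorted per-team bar chart with a fixed figure size can be sketched as follows. The teams and counts are made up, the column name `Team` is assumed to match the dataset, and descending order is one possible reading of "sort them".

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend (assumption); not needed in Jupyter

import pandas as pd

# Made-up medal records: one row per medal won.
medalists = pd.DataFrame(
    {"Team": ["Kenya", "Japan", "Kenya", "Norway", "Japan", "Kenya"]}
)

# Count medals per team and sort from most to fewest.
team_counts = medalists["Team"].value_counts().sort_values(ascending=False)

# figsize is given in inches as (width, height).
ax = team_counts.plot(kind="bar", figsize=(15, 5))

ax.set_xlabel("Team")
ax.set_ylabel("Medals won")
ax.set_title("Medals per team (made-up data)")
```

Passing `figsize` to the pandas `plot` call creates a new figure of that size; when plotting onto an existing `Axes`, the size would instead be set with `plt.subplots(figsize=(15, 5))`.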
