Data Analysis using Python: Homework 5
This assignment is designed to give you practice writing code and applying the lessons and
topics from the current module.
The Assignment
In this assignment, you will draw two figures using the matplotlib library. You are expected
to perform the data analysis that each question requires and plot a figure to visualize your
results. Each question has clear instructions in its cell of the provided Jupyter
Notebook file. Follow those instructions and write your code after each “# your code here”.
We’ll use nbgrader, a Jupyter Notebooks testing platform, to test whether each function
implementation is correct. You can see the exact test we are running in the cell right below your
solution.
All of the data is in the “imdb.xlsx” file, which contains 3 sheets:
● “imdb”: contains records of movies and ratings scraped from the IMDB website
○ There are 8 columns: movie_title, director_id, country_id, content_rating,
title_year, imdb_score, gross, duration
During this exercise, you’ll need to merge the “imdb” data with the “countries” data using the
“country_id” and “id” columns, respectively. This will give you the country of origin for each
movie. You’ll also need to merge the “imdb” data with the “directors” data using the
“director_id” and “id” columns, respectively. This will give you the director for each movie.
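The two merges described above can be sketched with pandas as follows. This is a minimal illustration on tiny stand-in tables, not the notebook's expected solution. The “country” and “director_name” columns, the sample rows, and the choice of a left join are assumptions made for the example; only “country_id”, “director_id”, and “id” come from the assignment text.

```python
import pandas as pd

# Stand-in tables. In the notebook, the real tables come from the sheets of imdb.xlsx.
imdb = pd.DataFrame({
    "movie_title": ["Movie A", "Movie B", "Movie C"],
    "director_id": [10, 11, 10],
    "country_id": [1, 2, 1],
    "imdb_score": [7.1, 6.4, 8.0],
})
# Assumed layout: an "id" key plus one descriptive column (names are hypothetical).
countries = pd.DataFrame({"id": [1, 2], "country": ["USA", "UK"]})
directors = pd.DataFrame({"id": [10, 11], "director_name": ["Director X", "Director Y"]})

# Merge imdb.country_id with countries.id, then drop the redundant key so the
# second merge does not produce clashing "id" columns.
with_country = imdb.merge(
    countries, left_on="country_id", right_on="id", how="left"
).drop(columns="id")

# Merge imdb.director_id with directors.id in the same way.
merged = with_country.merge(
    directors, left_on="director_id", right_on="id", how="left"
).drop(columns="id")

print(merged[["movie_title", "country", "director_name"]])
```

A left join keeps every movie even when its country or director is missing from the lookup table. If the notebook instructions ask for a different join type, follow the notebook.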
Submission
Open the Jupyter Notebook directly in Coursera (which you will find in the item soon after this
reading). The Coursera lab includes the imdb.xlsx file. To complete the assignment, complete
the provided Jupyter Notebook file, following the detailed instructions in each cell. Test your
submission before submitting by following the instructions on the assignment page in
Coursera. When you’re happy with your solutions, click the ‘Submit Assignment’ button in the
top right.
Evaluation
Q1:
● 2 pts - scatter plot for movies after 2000
● 2 pts - scatter plot for movies before 2000
● 1 pt - legend and labels
Q2:
● 2 pts - histogram for R-Rated movies
● 2 pts - histogram for PG-13 movies
● 1 pt - legend and labels
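Both questions share the same pattern that the rubric above rewards: two data series drawn on one figure, plus a legend and axis labels. The sketch below shows that pattern for a scatter plot on made-up data. The choice of axes, the sample values, and the treatment of the year 2000 itself are assumptions. The notebook cells state what to plot.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Made-up rows for illustration; column names are taken from the "imdb" sheet.
movies = pd.DataFrame({
    "title_year": [1995, 1998, 2003, 2010, 2015],
    "imdb_score": [7.2, 6.5, 8.1, 5.9, 7.4],
})

# Split into two groups. Which group the year 2000 belongs to is an assumption here.
before = movies[movies["title_year"] < 2000]
after = movies[movies["title_year"] >= 2000]

fig, ax = plt.subplots()

# One scatter call per group, each with a label that the legend will pick up.
ax.scatter(before["title_year"], before["imdb_score"], label="Before 2000")
ax.scatter(after["title_year"], after["imdb_score"], label="After 2000")

# Axis labels, title, and legend.
ax.set_xlabel("Title year")
ax.set_ylabel("IMDB score")
ax.set_title("IMDB score by release year")
ax.legend()
```

The histogram question follows the same shape: call ax.hist once per group with a label, then add the axis labels and the legend.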