
Data Analysis using Python_Homework 5.docx

This homework assignment focuses on visualizing movie and ratings data using Python's matplotlib library, specifically through histograms and scatterplots. Students are required to analyze data from an 'imdb.xlsx' file, which includes movie records, country names, and director names, and to merge these datasets accordingly. The assignment will be evaluated based on the accuracy of the visualizations and the implementation of code in a provided Jupyter Notebook.

Uploaded by

Hanif Nur Ilham

Data Analysis using Python

Homework 5: Visualize Movie & Ratings Data

This assignment is designed to give you practice writing code and applying lessons and
topics for the current module.

This homework deals with the following topics:

● Jupyter notebook magic functions


● The matplotlib library
● Histograms
● Scatterplots

The Assignment

In this assignment, you will draw two figures using the matplotlib library. You are expected
to analyze the data as required and plot each figure to visualize your
results. Each question has clear instructions in the corresponding cell of the provided Jupyter
Notebook file. Follow those instructions and write your code after each “# your code here”.

We’ll use nbgrader, a Jupyter Notebooks testing platform, to test whether each function
implementation is correct. You can see the exact test we are running in the cell right below your
solution.

About the Data

All of the data is in the “imdb.xlsx” file, which contains 3 sheets:

● “imdb”: contains records of movies and ratings scraped from the IMDb website
○ There are 8 columns: movie_title, director_id, country_id, content_rating,
title_year, imdb_score, gross, duration

● “countries”: contains the country (of origin) names
○ There are 2 columns: id, country

● “directors”: contains the director names
○ There are 2 columns: id, director_name

During this exercise, you’ll need to merge the “imdb” data with the “countries” data using the
“country_id” and “id” columns, respectively. This will give you the country of origin for each
movie. You’ll also need to merge the “imdb” data with the “directors” data using the
“director_id” and “id” columns, respectively. This will give you the director for each movie.
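
The two merges described above can be sketched with pandas roughly as follows. This is a minimal illustration, not the notebook's reference solution: the three small DataFrames are made-up stand-ins for the sheets, the helper name merge_movie_tables is hypothetical, and both the left join and the dropping of the redundant “id” column after each merge are assumptions.

```python
import pandas as pd

# Made-up stand-in rows. In the notebook, each frame would instead be read
# from the matching sheet, e.g. pd.read_excel("imdb.xlsx", sheet_name="imdb").
imdb = pd.DataFrame({
    "movie_title": ["Movie A", "Movie B", "Movie C"],
    "director_id": [1, 2, 1],
    "country_id": [10, 20, 10],
    "imdb_score": [7.1, 6.4, 8.0],
})
countries = pd.DataFrame({"id": [10, 20], "country": ["USA", "UK"]})
directors = pd.DataFrame({"id": [1, 2],
                          "director_name": ["Director One", "Director Two"]})


def merge_movie_tables(imdb, countries, directors):
    """Attach country and director names to each movie row (hypothetical helper)."""
    # imdb.country_id matches countries.id; a left join keeps every movie row.
    merged = imdb.merge(countries, left_on="country_id", right_on="id", how="left")
    # Drop the lookup table's "id" so it cannot clash with the next merge.
    merged = merged.drop(columns="id")
    # imdb.director_id matches directors.id.
    merged = merged.merge(directors, left_on="director_id", right_on="id", how="left")
    return merged.drop(columns="id")


movies = merge_movie_tables(imdb, countries, directors)
print(movies[["movie_title", "country", "director_name"]])
```

If the notebook's tests expect a different join type or expect the “id” columns to be kept, adjust the how argument and the drop calls accordingly.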

Submission

Open the Jupyter Notebook directly in Coursera (you will find it in the item shortly after this
reading). The Coursera lab includes the imdb.xlsx file. To finish the assignment, complete
the provided Jupyter Notebook file, following the detailed instructions in each cell. Test your
work before submitting by following the instructions on the assignment page in
Coursera. When you’re happy with your solutions, click the ‘Submit Assignment’ button in the
top right.

Evaluation

Each question is worth 5 points, broken down as follows:



Q1:
● 2 pts - scatter plot for movies after 2000
● 2 pts - scatter plot for movies before 2000
● 1 pt - legend and labels
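
A figure with the three graded parts of Q1 (two scatter series plus legend and labels) might look like the sketch below. The handout does not say which columns go on the axes, so plotting imdb_score against gross is an assumption, as are the made-up rows, the label text, and the decision to group the year 2000 with the later movies. Follow the notebook cell's instructions where they differ.

```python
import matplotlib
matplotlib.use("Agg")  # lets the sketch run outside a notebook; omit this line in Jupyter
import matplotlib.pyplot as plt
import pandas as pd

# Made-up stand-in rows; the real values come from the "imdb" sheet.
movies = pd.DataFrame({
    "title_year": [1985, 1994, 1999, 2003, 2010, 2015],
    "imdb_score": [7.2, 8.1, 6.5, 7.0, 5.9, 8.3],
    "gross": [40e6, 100e6, 25e6, 150e6, 60e6, 300e6],
})

# Assumption: the year 2000 itself is grouped with the later movies.
after_2000 = movies[movies["title_year"] >= 2000]
before_2000 = movies[movies["title_year"] < 2000]

fig, ax = plt.subplots()
# One scatter call per group so that each group gets its own legend entry.
ax.scatter(after_2000["imdb_score"], after_2000["gross"],
           alpha=0.6, label="2000 and later")
ax.scatter(before_2000["imdb_score"], before_2000["gross"],
           alpha=0.6, label="Before 2000")
ax.set_xlabel("IMDb score")
ax.set_ylabel("Gross")
ax.set_title("Gross vs. IMDb score")
ax.legend()
```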
Q2:
● 2 pts - histogram for R-Rated movies
● 2 pts - histogram for PG-13 movies
● 1 pt - legend and labels
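
The Q2 parts (two overlaid histograms plus legend and labels) could be sketched as below. Which column to histogram is not stated in this handout, so using imdb_score is an assumption, along with the made-up rows, the bin edges, and the label text.

```python
import matplotlib
matplotlib.use("Agg")  # lets the sketch run outside a notebook; omit this line in Jupyter
import matplotlib.pyplot as plt
import pandas as pd

# Made-up stand-in rows; the real values come from the "imdb" sheet.
movies = pd.DataFrame({
    "content_rating": ["R", "R", "R", "R", "PG-13", "PG-13", "PG-13"],
    "imdb_score": [6.5, 7.2, 7.8, 5.1, 6.1, 6.9, 8.4],
})

r_scores = movies.loc[movies["content_rating"] == "R", "imdb_score"]
pg13_scores = movies.loc[movies["content_rating"] == "PG-13", "imdb_score"]

# Shared bin edges (0 to 10 in steps of 1) keep the two histograms comparable.
bin_edges = list(range(0, 11))

fig, ax = plt.subplots()
# alpha < 1 keeps both distributions visible where they overlap.
counts_r, _, _ = ax.hist(r_scores, bins=bin_edges, alpha=0.5, label="R")
counts_pg13, _, _ = ax.hist(pg13_scores, bins=bin_edges, alpha=0.5, label="PG-13")
ax.set_xlabel("IMDb score")
ax.set_ylabel("Number of movies")
ax.set_title("IMDb score distribution by content rating")
ax.legend()
```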
