Python Codes and Comments

The document outlines a series of tasks for analyzing a COVID-19 dataset using Python in a Jupyter Notebook. Key tasks include structuring the notebook with appropriate sections, cleaning the dataset, performing data analysis, and visualizing results. Specific requirements include handling missing values, calculating statistics, and creating visualizations using matplotlib.

Uploaded by Tom

You will be given a COVID-19 dataset, loaded with Python code, and are required to complete the tasks below:

1. (4.5 marks, 1.5 marks each) Good structure of the Python Jupyter Notebook

a. Contains title cells and subtitle cells.

b. Python code is reasonably separated into groups (code cells) by functionality:

• Code for importing libraries
• Code for loading data
• Code for data cleaning
• Code for data analysis
• Code for visualizations

c. Contains meaningful comments and sensible variable and function names.


2. (10 marks, 2 marks each) Download the CSV data with pandas using the code below, then clean it:

• import pandas as pd

  deaths_df = pd.read_csv('https://github.com/CSSEGISandData/COVID-19/blob/master/csse_covid_19_data/csse_covid_19_time_series/time_series_covid19_deaths_global.csv')

• Identify and handle missing values in the dataset.
• Remove duplicate entries, if any.
• Convert the date column to a consistent date format (e.g., YYYY-MM-DD).
• Save the cleaned dataset to a new CSV file.
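
The cleaning steps above could be sketched roughly as follows. This is only one possible approach, not a model answer. It assumes the file has one row per region, four descriptive columns, and then one cumulative-deaths column per date written as M/D/YY; the in-memory sample, the `clean_deaths` helper, the "Unknown" placeholder and the output file name are all hypothetical.

```python
import io
import pandas as pd

# Tiny stand-in for the downloaded file. Assumed layout: one row per region,
# four descriptive columns, then one cumulative-deaths column per date (M/D/YY).
SAMPLE_CSV = """Province/State,Country/Region,Lat,Long,1/22/20,1/23/20,1/24/20
,Afghanistan,33.9,67.7,0,0,1
,Afghanistan,33.9,67.7,0,0,1
Hubei,China,30.9,112.2,17,17,24
,Italy,,,0,1,3
"""

ID_COLS = ["Province/State", "Country/Region", "Lat", "Long"]


def clean_deaths(raw_df):
    """Return a de-duplicated, long-format copy with YYYY-MM-DD dates."""
    df = raw_df.copy()
    # Missing values: label absent provinces; missing coordinates stay as NaN.
    df["Province/State"] = df["Province/State"].fillna("Unknown")
    # Duplicates: drop rows that repeat an earlier row exactly.
    df = df.drop_duplicates()
    # Dates: move the date headers into one column, then normalise the format.
    long_df = df.melt(id_vars=ID_COLS, var_name="Date", value_name="Deaths")
    parsed = pd.to_datetime(long_df["Date"], format="%m/%d/%y")
    long_df["Date"] = parsed.dt.strftime("%Y-%m-%d")
    return long_df


raw_df = pd.read_csv(io.StringIO(SAMPLE_CSV))
print(raw_df.isna().sum())  # missing values per column, before cleaning
cleaned_df = clean_deaths(raw_df)
print(cleaned_df.head())
# Saving step (hypothetical file name):
# cleaned_df.to_csv("covid19_deaths_cleaned.csv", index=False)
```

When working with the real file, note that a GitHub link containing `/blob/` normally returns an HTML page rather than the CSV itself, so the raw-file form of the link may be needed for `pd.read_csv` to parse it.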

3. (3 marks) Display the first 5 rows of the loaded data (1 mark) and write a short summary of the data (2 marks).
4. (2.5 marks) Calculate the mean and median of the daily cases.
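
A minimal sketch of tasks 3 and 4, using a small made-up sample in place of the real download. It assumes the first four columns are descriptive and the rest are cumulative counts per date, and it reads "daily cases" as the day-over-day change in the worldwide total; other readings are possible.

```python
import io
import pandas as pd

# Made-up sample in the assumed layout of the real file.
SAMPLE_CSV = """Province/State,Country/Region,Lat,Long,1/22/20,1/23/20,1/24/20,1/25/20
Hubei,China,30.9,112.2,17,17,24,40
,Italy,41.9,12.6,0,1,3,3
,Spain,40.5,-3.7,0,0,1,2
"""
deaths_df = pd.read_csv(io.StringIO(SAMPLE_CSV))

# Task 3: first five rows, followed by a short summary of the data.
print(deaths_df.head(5))
print("rows, columns:", deaths_df.shape)
print("missing values per column:")
print(deaths_df.isna().sum())

# Task 4: mean and median of the daily figures.
date_cols = deaths_df.columns[4:]              # assumes four descriptive columns
world_cumulative = deaths_df[date_cols].sum()  # worldwide total for each date
daily_new = world_cumulative.diff().dropna()   # change from one day to the next
mean_daily = daily_new.mean()
median_daily = daily_new.median()
print("mean:", mean_daily, "median:", median_daily)
```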

5. (3 marks) Get the daily death cases worldwide (hint: sum the daily death cases over all countries).
6. (3 marks) Get the daily increase in death cases by defining a function (hint: subtract yesterday's death cases from today's, using the data obtained in task 5).
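
Tasks 5 and 6 might be approached along these lines. The function names `worldwide_totals` and `daily_increase` are hypothetical, the sample values are made up, and setting the first day's increase to 0 is a choice made here because that day has no "yesterday" to compare against.

```python
import io
import pandas as pd

# Made-up sample in the assumed layout of the real file.
SAMPLE_CSV = """Province/State,Country/Region,Lat,Long,1/22/20,1/23/20,1/24/20,1/25/20
Hubei,China,30.9,112.2,17,17,24,40
,Italy,41.9,12.6,0,1,3,3
,Spain,40.5,-3.7,0,0,1,2
"""


def worldwide_totals(deaths_df):
    """Task 5: sum every date column over all rows (countries and regions)."""
    date_cols = deaths_df.columns[4:]  # assumes four descriptive columns
    totals = deaths_df[date_cols].sum()
    totals.index = pd.to_datetime(totals.index, format="%m/%d/%y")
    return totals


def daily_increase(cumulative):
    """Task 6: today's count minus yesterday's; the first day is set to 0."""
    return cumulative.diff().fillna(0).astype(int)


deaths_df = pd.read_csv(io.StringIO(SAMPLE_CSV))
world_totals = worldwide_totals(deaths_df)
increase = daily_increase(world_totals)
print(world_totals)
print(increase)
```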

7. (4 marks) Visualize the data obtained in task 5 using the matplotlib library.
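
A simple line chart is one way to do this. The sketch below uses an illustrative stand-in series rather than real figures; the title and axis labels are suggestions only.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Stand-in for the worldwide series from task 5 (values are illustrative only).
world_totals = pd.Series(
    [17, 18, 28, 45],
    index=pd.to_datetime(["2020-01-22", "2020-01-23", "2020-01-24", "2020-01-25"]),
)

fig, ax = plt.subplots(figsize=(8, 4))
ax.plot(world_totals.index, world_totals.values, marker="o")
ax.set_title("Worldwide COVID-19 deaths")
ax.set_xlabel("Date")
ax.set_ylabel("Deaths")
fig.autofmt_xdate()  # tilt the date labels so they do not overlap
fig.tight_layout()
```

In a Jupyter Notebook the figure is normally rendered when the cell finishes; in a plain script, `plt.show()` would be called at the end.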
