pandas__prac

This document outlines a series of exercises aimed at practicing data analysis using pandas, covering various datasets including the Titanic, Superstore, Climate, Stock Prices, and Netflix. Each exercise includes tasks related to data loading, cleaning, exploration, feature engineering, and visualizations. The exercises are designed to enhance skills in data manipulation and analysis through practical applications.

Pandas Learning Exercises

Prepared by Mehizel Ali


January 25, 2025

Introduction
This document contains exercises designed to help you practice data analysis using
pandas. Each exercise includes tasks that involve data exploration, cleaning,
manipulation, and visualization. The datasets can be downloaded from the provided links.

Exercise 1: Exploring and Analyzing the Titanic Dataset


Dataset: Titanic Dataset (Download train.csv).

Tasks:

1. Loading Data:

• Load the dataset into a pandas DataFrame.


• Display the first 10 rows and check the data types of each column.

2. Basic Exploration:

• Calculate the number of rows and columns.


• Identify columns with missing values and their proportions.
• Generate descriptive statistics for numerical and categorical columns.

3. Data Cleaning:

• Replace missing Age values with the median age.


• Fill missing Embarked values with the most frequent port.
• Drop the Cabin column.

4. Feature Engineering:

• Create a column FamilySize as the sum of SibSp and Parch.


• Create a binary column IsAlone (1 if the passenger was alone, 0 otherwise).

5. Exploratory Data Analysis:

• Compute survival rates overall and by Pclass.

• Analyze survival rates by gender (Sex) and embarkation port (Embarked).
• Create a pivot table showing survival rates by Pclass and Sex.

6. Visualizations:

• Plot the distribution of passenger ages.


• Create a bar chart showing survival rates by Pclass.
• Visualize survival rates by gender and IsAlone.
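
The cleaning, feature engineering, and survival-rate tasks above can be sketched as follows. This is a minimal sketch, assuming the column names of the usual Titanic train.csv (Survived, Pclass, Sex, Age, SibSp, Parch, Embarked, Cabin). The small inline DataFrame is made-up data standing in for the real file, which would be loaded with pd.read_csv("train.csv").

```python
import numpy as np
import pandas as pd

# Made-up stand-in for train.csv (assumed column names, invented values).
df = pd.DataFrame({
    "Survived": [0, 1, 1, 0, 1, 0],
    "Pclass":   [3, 1, 3, 1, 2, 3],
    "Sex":      ["male", "female", "female", "male", "female", "male"],
    "Age":      [22.0, 38.0, np.nan, 54.0, 14.0, np.nan],
    "SibSp":    [1, 1, 0, 0, 1, 0],
    "Parch":    [0, 0, 0, 0, 0, 0],
    "Embarked": ["S", "C", "S", None, "C", "S"],
    "Cabin":    [None, "C85", None, "E46", None, None],
})

# Basic exploration: shape, dtypes, and share of missing values per column.
n_rows, n_cols = df.shape
missing_share = df.isna().mean()

# Data cleaning: median age, most frequent port, drop Cabin.
df["Age"] = df["Age"].fillna(df["Age"].median())
df["Embarked"] = df["Embarked"].fillna(df["Embarked"].mode()[0])
df = df.drop(columns="Cabin")

# Feature engineering: FamilySize is SibSp + Parch as stated in the task,
# so a passenger travelling alone has FamilySize == 0.
df["FamilySize"] = df["SibSp"] + df["Parch"]
df["IsAlone"] = (df["FamilySize"] == 0).astype(int)

# Survival rates: the mean of a 0/1 column is the share of survivors.
overall_rate = df["Survived"].mean()
rate_by_class = df.groupby("Pclass")["Survived"].mean()
rate_pivot = df.pivot_table(values="Survived", index="Pclass",
                            columns="Sex", aggfunc="mean")
```

For the visualization tasks, a Series such as rate_by_class can be drawn with rate_by_class.plot(kind="bar"), and the age distribution with df["Age"].plot(kind="hist"), provided matplotlib is installed.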

Exercise 2: Analyzing the Superstore Dataset


Dataset: Superstore Dataset (Download Sample - Superstore.csv).

Tasks:

1. Data Cleaning:

• Handle missing values appropriately.


• Standardize column names by making them lowercase and replacing spaces
with underscores.

2. Exploratory Data Analysis:

• Identify the 5 states with the highest total sales.


• Calculate the profit margin for each product category.
• Find the most profitable sub-category overall.

3. Visualizations:

• Plot sales, profit, and discount by region.


• Create a bar chart to visualize the most profitable product categories.
• Generate a time series plot of monthly sales trends.
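
One possible approach to the column renaming and aggregation tasks above is sketched below. The column names (Order Date, State, Category, Sub-Category, Sales, Profit) and the month/day/year date format are assumptions about the Superstore file, and the rows are invented for illustration.

```python
import pandas as pd

# Made-up stand-in for "Sample - Superstore.csv" (assumed columns, invented values).
df = pd.DataFrame({
    "Order Date":   ["1/5/2017", "1/20/2017", "2/3/2017", "2/14/2017"],
    "State":        ["California", "Texas", "California", "New York"],
    "Category":     ["Furniture", "Technology", "Technology", "Furniture"],
    "Sub-Category": ["Chairs", "Phones", "Phones", "Tables"],
    "Sales":        [200.0, 100.0, 300.0, 400.0],
    "Profit":       [20.0, 30.0, 90.0, -40.0],
})

# Standardize column names: lowercase, spaces replaced with underscores.
# Hyphens are left alone, so "Sub-Category" becomes "sub-category".
df.columns = df.columns.str.strip().str.lower().str.replace(" ", "_")

# Top 5 states by total sales.
top_states = df.groupby("state")["sales"].sum().nlargest(5)

# Profit margin per category, defined here as total profit / total sales.
totals = df.groupby("category")[["sales", "profit"]].sum()
margin = totals["profit"] / totals["sales"]

# Most profitable sub-category overall.
best_subcat = df.groupby("sub-category")["profit"].sum().idxmax()

# Monthly sales trend: parse the dates, then group by calendar month.
df["order_date"] = pd.to_datetime(df["order_date"], format="%m/%d/%Y")
monthly_sales = df.groupby(df["order_date"].dt.to_period("M"))["sales"].sum()
```

Computing the margin from category totals, rather than averaging per-row margins, weights each order by its sales value; either definition may be intended by the exercise.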

Exercise 3: Climate Data Analysis


Dataset: Global Land Temperature Dataset (Use GlobalLandTemperaturesByCountry.csv).

Tasks:

1. Loading Data:

• Load the dataset and parse dates into datetime objects.


• Display the number of records for each country.

2. Data Cleaning:

• Handle missing temperature values by filling them with the rolling mean
(window = 12 months).

• Remove records with invalid country names (e.g., empty strings).

3. Exploratory Data Analysis:

• Rank the top 10 hottest and coldest countries by average temperature.


• Analyze temperature trends for a selected country over the last 50 years.

4. Visualizations:

• Plot the temperature trend for the top 5 hottest countries.


• Create a heatmap of average annual temperatures for each continent.
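
The cleaning and ranking tasks above could look roughly like this. The sketch assumes the columns dt, AverageTemperature, and Country, and uses invented monthly values for two placeholder countries "A" and "B". With the real file, the dates would be parsed at load time with pd.read_csv("GlobalLandTemperaturesByCountry.csv", parse_dates=["dt"]).

```python
import numpy as np
import pandas as pd

# Made-up monthly data for two placeholder countries plus one invalid row.
dates = pd.date_range("2000-01-01", periods=6, freq="MS")
df = pd.DataFrame({
    "dt": list(dates) * 2 + [dates[0]],
    "AverageTemperature": [10.0, 12.0, np.nan, 14.0, 16.0, 18.0,
                           -2.0, np.nan, 0.0, 2.0, 4.0, 6.0, 5.0],
    "Country": ["A"] * 6 + ["B"] * 6 + [""],
})

# Remove records with missing or empty country names.
df = df[df["Country"].notna() & (df["Country"].str.strip() != "")]

# Number of records per country.
records_per_country = df["Country"].value_counts()

# Fill missing temperatures with a trailing 12-row rolling mean per country.
# With one row per month and no gaps, 12 rows correspond to 12 months.
# min_periods=1 lets the mean use whatever non-missing values are in the window.
df = df.sort_values(["Country", "dt"]).reset_index(drop=True)
rolling_mean = df.groupby("Country")["AverageTemperature"].transform(
    lambda s: s.rolling(window=12, min_periods=1).mean()
)
df["AverageTemperature"] = df["AverageTemperature"].fillna(rolling_mean)

# Rank countries by average temperature.
mean_by_country = df.groupby("Country")["AverageTemperature"].mean()
hottest = mean_by_country.nlargest(10)
coldest = mean_by_country.nsmallest(10)

# Average annual temperature per country, a starting point for trend plots.
annual = df.groupby(["Country", df["dt"].dt.year])["AverageTemperature"].mean()
```

The sketch does not assume a continent column, so the continent heatmap would additionally need a country-to-continent mapping supplied by the reader.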

Exercise 4: Financial Stock Market Analysis


Dataset: Stock Prices Dataset (Focus on all_stocks_5yr.csv).

Tasks:

1. Loading Data:

• Load the dataset and check for missing or duplicate records.

2. Exploratory Data Analysis:

• Analyze the daily average stock price for a chosen company.


• Calculate monthly and yearly returns for all companies.
• Find the top 5 companies with the highest average closing prices.

3. Feature Engineering:

• Create a column for daily price range (High - Low).


• Add a column for cumulative returns for each company.

4. Visualizations:

• Plot the price trend of the top 3 performing stocks over time.
• Create a histogram of daily returns for a selected stock.

5. Advanced Analysis:

• Use rolling statistics to calculate a 50-day moving average of stock prices.
• Identify the best-performing sectors based on average returns.
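
As an illustration of the return and rolling-window tasks above, here is a minimal sketch. It assumes lowercase columns date, high, low, close and a ticker column Name, and it uses invented prices for two hypothetical tickers. WINDOW is set to 2 only so that the tiny sample produces values; the exercise asks for 50.

```python
import pandas as pd

# Made-up prices for two hypothetical tickers (assumed column names).
df = pd.DataFrame({
    "date":  pd.to_datetime(["2017-01-02", "2017-01-03", "2017-01-04"] * 2),
    "high":  [11.0, 12.0, 13.0, 52.0, 50.0, 56.0],
    "low":   [9.0, 10.0, 11.0, 48.0, 46.0, 50.0],
    "close": [10.0, 11.0, 12.1, 50.0, 48.0, 54.0],
    "Name":  ["AAA"] * 3 + ["BBB"] * 3,
})

# Check for missing and duplicate records.
n_missing = int(df.isna().sum().sum())
n_duplicates = int(df.duplicated().sum())

# Sort so that per-company calculations run in date order.
df = df.sort_values(["Name", "date"]).reset_index(drop=True)

# Daily price range and daily return per company.
df["price_range"] = df["high"] - df["low"]
df["daily_return"] = df.groupby("Name")["close"].pct_change()

# Cumulative return: compound the daily returns within each company.
growth = 1 + df["daily_return"].fillna(0)
df["cum_return"] = growth.groupby(df["Name"]).cumprod() - 1

# Moving average of the closing price (use WINDOW = 50 on the real data).
WINDOW = 2
df["moving_avg"] = df.groupby("Name")["close"].transform(
    lambda s: s.rolling(WINDOW).mean()
)

# Monthly returns: last close of each month, then percentage change.
month = df["date"].dt.to_period("M")
monthly_close = df.groupby(["Name", month])["close"].last()
monthly_return = monthly_close.groupby(level="Name").pct_change()

# Top 5 companies by average closing price.
top_by_close = df.groupby("Name")["close"].mean().nlargest(5)
```

Yearly returns follow the same pattern with to_period("Y"). Sector information is not among the columns assumed here, so the sector comparison would require joining a separate ticker-to-sector table.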

Exercise 5: Netflix Data Analysis
Dataset: Netflix Movies and TV Shows Dataset (Use netflix_titles.csv).

Tasks:

1. Exploring the Dataset:

• Load the dataset and check for null values.


• Count the number of unique movies and TV shows in the dataset.

2. Data Cleaning:

• Handle missing values in director and cast by replacing nulls with Unknown.
• Extract the year from the date_added column.

3. Exploratory Data Analysis:

• Analyze the distribution of content by type (Movie vs. TV Show).


• Find the top 10 genres based on the number of titles.
• Analyze content trends over the years (e.g., movies vs. TV shows).

4. Visualizations:

• Create a bar chart for the top 5 countries producing Netflix content.
• Plot the distribution of movie durations using a histogram.
• Visualize the trend of Netflix content added over time.

5. Advanced Analysis:

• Use groupby to analyze the average duration of movies by genre.


• Identify countries producing the highest number of movies and TV shows
in the Action genre.
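
The cleaning, genre, and groupby tasks above might be approached as in the sketch below. It rests on several assumptions about netflix_titles.csv: genres are stored as a comma-separated string in a listed_in column, duration is text such as "90 min" or "2 Seasons", date_added looks like "September 25, 2021", and the Action genre appears under labels containing the word "Action". The rows are invented.

```python
import pandas as pd

# Made-up stand-in for netflix_titles.csv (assumed columns, invented values).
df = pd.DataFrame({
    "type":       ["Movie", "TV Show", "Movie", "Movie"],
    "title":      ["A", "B", "C", "D"],
    "director":   ["X", None, "Y", None],
    "cast":       [None, "P, Q", "R", "S"],
    "country":    ["United States", "India", "India, United States", None],
    "date_added": ["September 25, 2021", "January 1, 2020", "March 3, 2020", None],
    "duration":   ["90 min", "2 Seasons", "120 min", "100 min"],
    "listed_in":  ["Action & Adventure, Comedies", "TV Dramas",
                   "Action & Adventure", "Comedies"],
})

# Null check, then replace missing director and cast values with "Unknown".
null_counts = df.isna().sum()
df[["director", "cast"]] = df[["director", "cast"]].fillna("Unknown")

# Extract the year from date_added; unparseable or missing dates become NaN.
added = pd.to_datetime(df["date_added"].str.strip(),
                       format="%B %d, %Y", errors="coerce")
df["year_added"] = added.dt.year

# Distribution of content by type.
type_counts = df["type"].value_counts()

# One row per (title, genre) so that genres can be counted and grouped.
genres = df.assign(genre=df["listed_in"].str.split(", ")).explode("genre")
top_genres = genres["genre"].value_counts().head(10)

# Average movie duration in minutes by genre.
movies = genres[genres["type"] == "Movie"].copy()
movies["minutes"] = movies["duration"].str.extract(r"(\d+)", expand=False).astype(float)
avg_minutes = movies.groupby("genre")["minutes"].mean()

# Countries with the most Action titles (one row per country per title).
action = genres[genres["genre"].str.contains("Action", na=False)]
action_countries = (
    action.assign(country=action["country"].str.split(", "))
          .explode("country")["country"]
          .value_counts()
)
```

Exploding the comma-separated fields counts a title once for every genre or country it lists, which is usually what a "top genres" or "top countries" ranking intends.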
