qxc6bs1pw: 0.0.1 Matplotlib Assignment

The document outlines a series of data visualization tasks using Python's Pandas and Matplotlib libraries, focusing on automobile sales and Netflix data. It includes instructions for creating line charts, scatter plots, pie charts, heatmaps, and bar plots to analyze trends and correlations in sales, advertising expenditure, and IMDb ratings. Each section provides code snippets and explanations for visualizing the data effectively.

Uploaded by

anuj rawat


qxc6bs1pw

December 26, 2024

0.0.1 Matplotlib Assignment

[1]: import pandas as pd

# Load the dataset

url = 'https://itv-contentbucket.s3.ap-south-1.amazonaws.com/Exams/AWP/Matplotlib/historical_automobile_sales.csv'

df = pd.read_csv(url)

1. Develop a Line chart using pandas to show how automobile sales fluctuate from year to year
[2]: import matplotlib.pyplot as plt

# Aggregate automobile sales by year

sales_per_year = df.groupby('Year')['Automobile_Sales'].sum().reset_index()

# Plot the line chart

plt.figure(figsize=(12, 6))
plt.plot(sales_per_year['Year'], sales_per_year['Automobile_Sales'],
         marker='o', linestyle='-')

plt.title('Automobile Sales Fluctuation from Year to Year')

plt.xlabel('Year')
plt.ylabel('Automobile Sales')
plt.grid(True)
plt.show()
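
Since the task asks for a line chart "using pandas", the same figure can also be drawn with the plotting method pandas provides on a Series. The sketch below is a minimal illustration, assuming a small made-up frame (`toy`) in place of the real CSV; only the column names `Year` and `Automobile_Sales` come from the notebook.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical stand-in for the real dataset: two rows per year
toy = pd.DataFrame({
    "Year": [2019, 2019, 2020, 2020, 2021, 2021],
    "Automobile_Sales": [100.0, 150.0, 80.0, 90.0, 120.0, 160.0],
})

# Sum sales per year; the result is a Series indexed by Year
toy_sales_per_year = toy.groupby("Year")["Automobile_Sales"].sum()

# Series.plot draws onto a Matplotlib Axes and returns it
ax = toy_sales_per_year.plot(
    kind="line", marker="o", figsize=(12, 6), grid=True,
    title="Automobile Sales Fluctuation from Year to Year",
)
ax.set_xlabel("Year")
ax.set_ylabel("Automobile Sales")
```

In a notebook the figure is displayed when the cell finishes; in a script, a call to `plt.show()` would follow.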

2. Plot different lines for categories of vehicle type and analyze the trend during recession periods
[3]: # Aggregate automobile sales by year and vehicle type
sales_per_year_vehicle = df.groupby(['Year', 'Vehicle_Type'])['Automobile_Sales'].sum().unstack()

# Plot the line chart with recession shading

plt.figure(figsize=(14, 7))
for vehicle_type in sales_per_year_vehicle.columns:
    plt.plot(sales_per_year_vehicle.index, sales_per_year_vehicle[vehicle_type],
             marker='o', linestyle='-', label=vehicle_type)

# Highlight recession periods

recession_periods = df[df['Recession'] == 1]['Year'].unique()
for year in recession_periods:
    plt.axvspan(year - 0.5, year + 0.5, color='gray', alpha=0.3)

plt.title('Sales Trends by Vehicle Type During Recession Periods')

plt.xlabel('Year')
plt.ylabel('Automobile Sales')
plt.legend(title='Vehicle Type')
plt.grid(True)
plt.show()
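
The `groupby(...).sum().unstack()` step is what turns the long table into one column per vehicle type, which is the shape the plotting loop iterates over. A minimal sketch of that reshaping follows, assuming a tiny made-up frame; the category names `Sports` and `SUV` are hypothetical and not taken from the dataset.

```python
import pandas as pd

# Hypothetical long-format data: one row per (year, vehicle type)
toy = pd.DataFrame({
    "Year": [2019, 2019, 2020, 2020],
    "Vehicle_Type": ["Sports", "SUV", "Sports", "SUV"],  # made-up categories
    "Automobile_Sales": [10.0, 20.0, 30.0, 40.0],
})

# Grouping by two keys gives a Series with a two-level index;
# unstack() moves the inner level (Vehicle_Type) into the columns
wide = toy.groupby(["Year", "Vehicle_Type"])["Automobile_Sales"].sum().unstack()

print(wide)
```

Each column of `wide` is then one line in the chart, with the years on the index.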

3. Visualization to compare the sales trend per vehicle type for recession and non-recession periods
[4]: # Separate recession and non-recession periods
recession_sales = df[df['Recession'] == 1].groupby(['Year', 'Vehicle_Type'])['Automobile_Sales'].sum().unstack()

non_recession_sales = df[df['Recession'] == 0].groupby(['Year', 'Vehicle_Type'])['Automobile_Sales'].sum().unstack()

# Plot the comparison

fig, axes = plt.subplots(1, 2, figsize=(18, 8), sharey=True)

# Recession period sales

axes[0].set_title('Sales Trend During Recession Periods')
for vehicle_type in recession_sales.columns:
    axes[0].plot(recession_sales.index, recession_sales[vehicle_type],
                 marker='o', linestyle='-', label=vehicle_type)

axes[0].set_xlabel('Year')
axes[0].set_ylabel('Automobile Sales')
axes[0].legend(title='Vehicle Type')
axes[0].grid(True)

# Non-recession period sales

axes[1].set_title('Sales Trend During Non-Recession Periods')
for vehicle_type in non_recession_sales.columns:
    axes[1].plot(non_recession_sales.index, non_recession_sales[vehicle_type],
                 marker='o', linestyle='-', label=vehicle_type)

axes[1].set_xlabel('Year')
axes[1].legend(title='Vehicle Type')
axes[1].grid(True)

plt.show()

4. Scatter plot to identify the correlation between average vehicle price and sales volume during recessions
[5]: # Calculate average price and total sales during recession periods
avg_price_sales_recession = (
    df[df['Recession'] == 1]
    .groupby('Vehicle_Type')
    .agg({'Price': 'mean', 'Automobile_Sales': 'sum'})
    .reset_index()
)

# Scatter plot
plt.figure(figsize=(10, 6))
plt.scatter(avg_price_sales_recession['Price'],
            avg_price_sales_recession['Automobile_Sales'])

for i, txt in enumerate(avg_price_sales_recession['Vehicle_Type']):
    plt.annotate(txt, (avg_price_sales_recession['Price'][i],
                       avg_price_sales_recession['Automobile_Sales'][i]))

plt.title('Correlation Between Average Vehicle Price and Sales Volume During Recessions')

plt.xlabel('Average Vehicle Price')

plt.ylabel('Total Automobile Sales')
plt.grid(True)
plt.show()
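
A scatter plot shows the relationship visually; a single number can back it up. The sketch below computes the Pearson correlation coefficient with `Series.corr`, assuming a made-up summary table shaped like `avg_price_sales_recession`. The values and the type labels `A` to `D` are hypothetical, chosen only so that sales fall as price rises.

```python
import pandas as pd

# Hypothetical per-vehicle-type summary (values invented for illustration)
toy_summary = pd.DataFrame({
    "Vehicle_Type": ["A", "B", "C", "D"],
    "Price": [10000.0, 20000.0, 30000.0, 40000.0],
    "Automobile_Sales": [900.0, 700.0, 400.0, 200.0],
})

# Series.corr uses the Pearson method by default; the result lies in [-1, 1]
r = toy_summary["Price"].corr(toy_summary["Automobile_Sales"])
print(f"Pearson r = {r:.3f}")
```

A value near -1 indicates that higher average prices go with lower sales in this toy table. With only a handful of vehicle types, such a coefficient should be read as a rough indication rather than firm evidence.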

5. Pie chart to display the portion of advertising expenditure of Automotives during recession and non-recession periods
[6]: # Calculate total advertising expenditure during recession and non-recession periods

ad_exp_recession = df[df['Recession'] == 1]['Advertising_Expenditure'].sum()

ad_exp_non_recession = df[df['Recession'] == 0]['Advertising_Expenditure'].sum()

# Create pie chart

labels = ['Recession Period', 'Non-Recession Period']
sizes = [ad_exp_recession, ad_exp_non_recession]
colors = ['#ff9999','#66b3ff']
explode = (0.1, 0)

plt.figure(figsize=(8, 8))
plt.pie(sizes, explode=explode, labels=labels, colors=colors,
        autopct='%1.1f%%', shadow=True, startangle=140)

plt.title('Advertising Expenditure During Recession and Non-Recession Periods')

plt.axis('equal')
plt.show()
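
The percentages that `autopct` prints on the pie can be checked numerically. A minimal sketch, assuming a small made-up frame in place of the real data (only the column names `Recession` and `Advertising_Expenditure` come from the notebook):

```python
import pandas as pd

# Hypothetical rows: two recession records and three non-recession records
toy = pd.DataFrame({
    "Recession": [1, 1, 0, 0, 0],
    "Advertising_Expenditure": [100.0, 200.0, 300.0, 150.0, 250.0],
})

# Total spend per period, then each period's share of the overall total
totals = toy.groupby("Recession")["Advertising_Expenditure"].sum()
shares = (totals / totals.sum() * 100).round(1)

print(shares)
```

Here `shares` is indexed by the `Recession` flag, so `shares.loc[1]` is the recession share in percent.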

6) Heatmap to Understand Correlation Between IMDb Score, Hidden Gem Score, and IMDb Votes
[12]: import pandas as pd

# Load the dataset

url = 'https://itv-contentbucket.s3.ap-south-1.amazonaws.com/Exams/AWP/pandas/Netflix.csv'

df = pd.read_csv(url)

# Print the column names

print(df.columns)

Index(['Title', 'Genre', 'Languages', 'Series or Movie', 'Hidden Gem Score',
       'Country Availability', 'Runtime', 'Director', 'Writer', 'Actors',
       'View Rating', 'IMDb Score', 'Rotten Tomatoes Score',
       'Metacritic Score', 'Awards Nominated For', 'Boxoffice', 'Release Date',
       'Netflix Release Date', 'Netflix Link', 'IMDb Votes'],
      dtype='object')

[16]: import seaborn as sns

import matplotlib.pyplot as plt

# Ensure relevant columns are numeric
df['IMDb Score'] = pd.to_numeric(df['IMDb Score'], errors='coerce')
df['Hidden Gem Score'] = pd.to_numeric(df['Hidden Gem Score'], errors='coerce')
df['IMDb Votes'] = pd.to_numeric(df['IMDb Votes'], errors='coerce')

# Drop rows with NaN values in the relevant columns

correlation_data = df[['IMDb Score', 'Hidden Gem Score', 'IMDb Votes']].dropna()

# Calculate the correlation matrix

correlation_matrix = correlation_data.corr()

# Create the heatmap

plt.figure(figsize=(10, 6))
sns.heatmap(correlation_matrix, annot=True, cmap='coolwarm', fmt='.2f')
plt.title('Correlation between IMDb Score, Hidden Gem Score, and IMDb Votes')
plt.show()
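
The `errors='coerce'` argument used above decides what happens to values that cannot be parsed as numbers: they become `NaN` instead of raising an error, which is why the later `dropna()` is needed. A small sketch on a made-up Series (the sample strings are hypothetical, not taken from the Netflix file):

```python
import pandas as pd

# Hypothetical raw column mixing numeric strings, a placeholder and a missing value
raw = pd.Series(["7.5", "N/A", "8", None])

# Unparseable entries are turned into NaN rather than raising a ValueError
clean = pd.to_numeric(raw, errors="coerce")

print(clean)
print("rows lost to coercion or missing data:", int(clean.isna().sum()))
```

Counting the `NaN` values this way gives a quick sense of how many rows the correlation matrix will silently exclude.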

7) Plot lines for categories of every movie type and analyze how they have received IMDb Votes. Create a subplot to compare the same categories with IMDb Score
[17]: # Ensure the 'IMDb Votes' and 'IMDb Score' columns are numeric
df['IMDb Votes'] = pd.to_numeric(df['IMDb Votes'], errors='coerce')
df['IMDb Score'] = pd.to_numeric(df['IMDb Score'], errors='coerce')

# Aggregate IMDb Votes and IMDb Score by 'Series or Movie'

votes_by_type = df.groupby('Series or Movie')['IMDb Votes'].sum().reset_index()
score_by_type = df.groupby('Series or Movie')['IMDb Score'].mean().reset_index()

# Create subplots
fig, axes = plt.subplots(1, 2, figsize=(16, 6))

# IMDb Votes plot

sns.lineplot(ax=axes[0], data=votes_by_type, x='Series or Movie', y='IMDb Votes',
             marker='o')

axes[0].set_title('IMDb Votes by Movie Type')

axes[0].set_xlabel('Movie Type')
axes[0].set_ylabel('IMDb Votes')
axes[0].grid(True)

# IMDb Score plot

sns.lineplot(ax=axes[1], data=score_by_type, x='Series or Movie', y='IMDb Score',
             marker='o')

axes[1].set_title('IMDb Score by Movie Type')

axes[1].set_xlabel('Movie Type')
axes[1].set_ylabel('IMDb Score')
axes[1].grid(True)

plt.tight_layout()
plt.show()

8) Create 2 bar plots to understand movies and web series by the languages in which they were made

[ ]: # Extract languages data for movies and web series
movie_languages = df[df['Series or Movie'] == 'Movie']['Languages'].value_counts().reset_index()

movie_languages.columns = ['Language', 'Count']

series_languages = df[df['Series or Movie'] == 'Series']['Languages'].value_counts().reset_index()

series_languages.columns = ['Language', 'Count']

# Create subplots for bar plots

# Each panel lists its own languages, so the categorical y-axis is not shared
fig, axes = plt.subplots(1, 2, figsize=(18, 8))

# Movies by language
sns.barplot(ax=axes[0], data=movie_languages.head(10), x='Count', y='Language',
            hue='Language', palette='viridis', dodge=False, legend=False)

axes[0].set_title('Top 10 Languages of Movies')

axes[0].set_xlabel('Count')
axes[0].set_ylabel('Language')

# Web series by language

sns.barplot(ax=axes[1], data=series_languages.head(10), x='Count', y='Language',
            hue='Language', palette='inferno', dodge=False, legend=False)

axes[1].set_title('Top 10 Languages of Web Series')

axes[1].set_xlabel('Count')
axes[1].set_ylabel('Language')

plt.tight_layout()
plt.show()
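
The counts above treat each distinct value of `Languages` as one category. If that column holds comma-separated lists such as `English, Japanese` (an assumption about the data that the notebook does not confirm), a title is counted under the whole combination rather than under each language. A minimal sketch of counting individual languages instead, on a made-up frame:

```python
import pandas as pd

# Hypothetical rows; the comma-separated format is an assumption about the data
toy = pd.DataFrame({
    "Series or Movie": ["Movie", "Movie", "Series", "Movie"],
    "Languages": ["English, Japanese", "English", "Korean, English", None],
})

movies = toy[toy["Series or Movie"] == "Movie"]

# Split each entry into a list, give every language its own row,
# trim surrounding whitespace, then count occurrences
per_language = (
    movies["Languages"]
    .dropna()
    .str.split(",")
    .explode()
    .str.strip()
    .value_counts()
)

print(per_language)
```

The resulting Series could be passed through `reset_index()` and plotted with the same bar plot calls as above.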

