
Task3.Ipynb - Colaboratory Dip

Uploaded by

Mario

Task3.ipynb - Colaboratory

Dipanshu Nanhe Class: R&A Div 02 Roll No: PB32

from google.colab import drive


drive.mount('/content/drive')

Mounted at /content/drive

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

# Create the sales data dictionary (500 synthetic orders, one per day)
sales_data = {
    'OrderID': range(1, 501),
    'Product': np.random.choice(['Product A', 'Product B', 'Product C'], 500),
    'Salesperson': np.random.choice(['Salesperson 1', 'Salesperson 2', 'Salesperson 3'], 500),
    'Revenue': np.random.randint(100, 1000, 500),
    'Date': pd.date_range(start='2023-01-01', periods=500)
}

# Create a Pandas DataFrame from the sales data


df = pd.DataFrame(sales_data)

# Save the DataFrame to a CSV file


df.to_csv('sales_data.csv', index=False)
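The random draws above are not seeded, so rerunning the notebook produces different products, salespeople and revenues than the outputs printed in this document. A minimal sketch of a reproducible variant follows; the helper name `make_sales_data` and the seed value 42 are assumptions for illustration and are not part of the original notebook.

```python
import numpy as np
import pandas as pd

def make_sales_data(seed=42, n=500):
    """Build the synthetic sales table from a seeded generator (hypothetical helper)."""
    rng = np.random.default_rng(seed)  # assumed seed; any fixed integer works
    return pd.DataFrame({
        'OrderID': range(1, n + 1),
        'Product': rng.choice(['Product A', 'Product B', 'Product C'], n),
        'Salesperson': rng.choice(['Salesperson 1', 'Salesperson 2', 'Salesperson 3'], n),
        # integers() excludes the upper bound, like np.random.randint(100, 1000)
        'Revenue': rng.integers(100, 1000, n),
        'Date': pd.date_range(start='2023-01-01', periods=n),
    })

df_a = make_sales_data()
df_b = make_sales_data()
print(df_a.equals(df_b))  # two builds with the same seed are identical
```

With a fixed seed, the totals reported in later cells would stay the same from run to run, although they would not match the unseeded numbers shown here.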

df = pd.read_csv('sales_data.csv')
df

     OrderID    Product    Salesperson  Revenue        Date
0          1  Product A  Salesperson 3      239  2023-01-01
1          2  Product A  Salesperson 1      348  2023-01-02
2          3  Product C  Salesperson 1      283  2023-01-03
3          4  Product C  Salesperson 3      768  2023-01-04
4          5  Product B  Salesperson 1      883  2023-01-05
..       ...        ...            ...      ...         ...
495      496  Product B  Salesperson 3      750  2024-05-10
496      497  Product A  Salesperson 3      619  2024-05-11
497      498  Product C  Salesperson 1      775  2024-05-12
498      499  Product B  Salesperson 2      947  2024-05-13
499      500  Product C  Salesperson 3      300  2024-05-14

500 rows × 5 columns

# Display column names and data types


print("Task 1: Column names and data types")
print(df.dtypes)

Task 1: Column names and data types

OrderID         int64
Product        object
Salesperson    object
Revenue         int64
Date           object
dtype: object
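The `Date` column comes back as `object` because `pd.read_csv` does not parse dates unless asked to. The sketch below shows one way to get a datetime column directly at load time using the `parse_dates` parameter; the two-row in-memory CSV is a stand-in built from the first rows shown above, used so the example runs without the file.

```python
import io
import pandas as pd

# Stand-in for sales_data.csv: the first two rows shown in the table above
csv_text = "OrderID,Revenue,Date\n1,239,2023-01-01\n2,348,2023-01-02\n"

raw = pd.read_csv(io.StringIO(csv_text))                           # Date stays as text
parsed = pd.read_csv(io.StringIO(csv_text), parse_dates=['Date'])  # Date parsed on load

print(raw['Date'].dtype)
print(parsed['Date'].dtype)
print(parsed['Date'].dt.month.tolist())  # .dt accessors work on the parsed column
```

Calling `pd.to_datetime(df['Date'])` after loading, as the notebook does in a later cell, gives the same result.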

# Identify missing data by counting null values in each column


missing_data = df.isnull().sum()
print("\nTask 2: Missing data count by column")
print(missing_data)

Task 2: Missing data count by column

OrderID        0
Product        0
Salesperson    0
Revenue        0
Date           0
dtype: int64
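The generated data has no missing values, so the notebook never reaches the handling step. As an illustration only, the sketch below applies two common strategies to a small hypothetical frame (the `demo` values are invented for this example): dropping rows that lack a category, and filling numeric gaps with the column median.

```python
import numpy as np
import pandas as pd

# Hypothetical frame with gaps; not taken from the notebook's data
demo = pd.DataFrame({
    'Product': ['Product A', None, 'Product C', 'Product B'],
    'Revenue': [239.0, 348.0, np.nan, 768.0],
})
print(demo.isnull().sum())

# Drop rows with no product label, then fill missing revenue with the median
cleaned = demo.dropna(subset=['Product']).copy()
cleaned['Revenue'] = cleaned['Revenue'].fillna(cleaned['Revenue'].median())
print(cleaned)
```

Which strategy is appropriate depends on the analysis; median filling keeps the row count but can flatten real variation.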

https://colab.research.google.com/drive/1A9bUrXNLM4IPSEJ9cJYJh_XCz0vrBTu0#scrollTo=glCKJVkcU8lY&printMode=true 1/3
9/30/23, 11:53 PM Task3.ipynb - Colaboratory
# Calculate total sales revenue for each 'Product' category and visualize it using a bar plot
product_sales = df.groupby('Product')['Revenue'].sum()
print("\nTask 3: Total sales revenue by product")
print(product_sales)

product_sales.plot(kind='bar', rot=0)
plt.title('Total Sales Revenue by Product')
plt.xlabel('Product')
plt.ylabel('Total Revenue')
plt.show()

Task 3: Total sales revenue by product

Product
Product A    94194
Product B    94713
Product C    95970
Name: Revenue, dtype: int64
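The three totals are close to each other, which is what uniform random draws would be expected to produce. A small sketch that converts the printed totals into percentage shares makes this easier to see; the Series is rebuilt by hand from the output above, and the variable name `share` is an assumption.

```python
import pandas as pd

# Totals copied from the Task 3 output above
product_sales = pd.Series(
    {'Product A': 94194, 'Product B': 94713, 'Product C': 95970},
    name='Revenue',
)

# Each product's share of overall revenue, in percent
share = (product_sales / product_sales.sum() * 100).round(1)
print(share.sort_values(ascending=False))
```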

# Find the top 5 salespeople based on total sales amount
# (the data has only three salespeople, so nlargest(5) returns all three)


top_salespeople = df.groupby('Salesperson')['Revenue'].sum().nlargest(5)
print("\nTask 4: Top 5 salespeople based on total sales amount")
print(top_salespeople)

Task 4: Top 5 salespeople based on total sales amount

Salesperson
Salesperson 1    101908
Salesperson 3     96506
Salesperson 2     86463
Name: Revenue, dtype: int64
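Only three rows appear because `nlargest(n)` returns at most as many entries as the Series holds; asking for five from three groups does not raise an error. The sketch below reproduces this with the totals printed above, rebuilt by hand.

```python
import pandas as pd

# Totals copied from the Task 4 output above
totals = pd.Series(
    {'Salesperson 1': 101908, 'Salesperson 2': 86463, 'Salesperson 3': 96506},
    name='Revenue',
)

top = totals.nlargest(5)   # request 5, but only 3 groups exist
print(len(top))            # 3
print(top.index.tolist())  # sorted by revenue, highest first
```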

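# Calculate total sales revenue by month (to find the highest-revenue month)
# and visualize it using a line plot.
# Note: the dates run from 2023-01-01 to 2024-05-14, so grouping on dt.month
# alone pools January-May of 2023 and 2024 into the same buckets.
df['Date'] = pd.to_datetime(df['Date'])
df['Month'] = df['Date'].dt.month
monthly_sales = df.groupby('Month')['Revenue'].sum()
print("\nTask 5: Total sales revenue by month")
print(monthly_sales)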

monthly_sales.plot(kind='line', marker='o')
plt.title('Total Sales Revenue by Month')
plt.xlabel('Month')
plt.ylabel('Total Revenue')
plt.xticks(range(1, 13), ['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun', 'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec'])
plt.show()


Task 5: Total sales revenue by month

Month
1     34378
2     34975
3     37199
4     35057
5     25384
6     15852
7     18332
8     17357
9     17494
10    15454
11    15425
12    17970
Name: Revenue, dtype: int64
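January through April show roughly double the revenue of June through December. This is because the 500 daily dates span 2023-01-01 to 2024-05-14, so grouping by month number adds two years together for the early months. A minimal sketch of grouping by year and month instead, and of picking out the best month with `idxmax`, is shown below. It regenerates revenue with an assumed seed, so its numbers will not match the output above.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)  # assumed seed, for a repeatable illustration
dates = pd.date_range(start='2023-01-01', periods=500)
df = pd.DataFrame({'Date': dates, 'Revenue': rng.integers(100, 1000, 500)})

# Month number only: 2023 and 2024 share buckets for Jan-May
pooled = df.groupby(df['Date'].dt.month)['Revenue'].sum()

# Year-month periods: every calendar month gets its own bucket
by_year_month = df.groupby(df['Date'].dt.to_period('M'))['Revenue'].sum()

print(len(pooled))         # 12
print(len(by_year_month))  # 17 (all of 2023 plus Jan-May 2024)

best = by_year_month.idxmax()
print(best, by_year_month.max())
```

Note that May 2024 covers only 14 days, so it will usually rank low for reasons unrelated to sales performance.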
