
Cap 793


NAME – DIPANSHU REGISTRATION NO – 12217249

COURSE CODE-CAP 793

Part 1: Data Preparation and Basic Plots


Q1 - Create a DataFrame with the following columns: `Month`, `Product_A_Sales`, `Product_B_Sales`, `Product_C_Sales`.

ANS 1 -

import pandas as pd

# Monthly sales figures for the three products
data = {
    'Month': ['January', 'February', 'March', 'April', 'May', 'June', 'July', 'August',
              'September', 'October', 'November', 'December'],
    'Product_A_Sales': [5000, 6000, 7000, 8000, 7500, 8200, 7800, 8400, 8200, 8600, 8800, 9000],
    'Product_B_Sales': [7000, 6500, 7200, 6800, 7000, 7300, 7100, 7600, 7200, 7300, 7400, 7500],
    'Product_C_Sales': [8000, 7000, 6800, 7500, 8200, 7900, 8500, 8100, 7900, 7800, 8000, 8300]
}

df = pd.DataFrame(data)

# Row-wise total and mean across the three product columns
df['Total_Sales'] = df[['Product_A_Sales', 'Product_B_Sales', 'Product_C_Sales']].sum(axis=1)
df['Average_Sales'] = df[['Product_A_Sales', 'Product_B_Sales', 'Product_C_Sales']].mean(axis=1)

print(df)
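
Each later answer rebuilds the same dictionary. As a minimal sketch, the table could instead be built once by a small helper and reused. The names `build_sales_df` and `PRODUCT_COLS` are hypothetical, introduced here only for illustration.

```python
import pandas as pd

# Hypothetical constant: the three product columns used throughout the assignment
PRODUCT_COLS = ['Product_A_Sales', 'Product_B_Sales', 'Product_C_Sales']


def build_sales_df():
    """Return the monthly sales table with Total_Sales and Average_Sales added."""
    data = {
        'Month': ['January', 'February', 'March', 'April', 'May', 'June', 'July', 'August',
                  'September', 'October', 'November', 'December'],
        'Product_A_Sales': [5000, 6000, 7000, 8000, 7500, 8200, 7800, 8400, 8200, 8600, 8800, 9000],
        'Product_B_Sales': [7000, 6500, 7200, 6800, 7000, 7300, 7100, 7600, 7200, 7300, 7400, 7500],
        'Product_C_Sales': [8000, 7000, 6800, 7500, 8200, 7900, 8500, 8100, 7900, 7800, 8000, 8300],
    }
    df = pd.DataFrame(data)
    # Row-wise aggregates over the product columns only (Month is excluded)
    df['Total_Sales'] = df[PRODUCT_COLS].sum(axis=1)
    df['Average_Sales'] = df[PRODUCT_COLS].mean(axis=1)
    return df


df = build_sales_df()
print(df.head(3))
```

Any plotting answer could then start from `df = build_sales_df()` instead of repeating the data.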

Q2 - Line Plot:

- Create a line plot to display the sales trends of Product A, Product B, and Product C over the months using Matplotlib or Seaborn.

- Ensure the plot has a title, labeled axes, and a legend.

ANS 2 -

import pandas as pd
import matplotlib.pyplot as plt

data = {
    'Month': ['January', 'February', 'March', 'April', 'May', 'June', 'July', 'August',
              'September', 'October', 'November', 'December'],
    'Product_A_Sales': [5000, 6000, 7000, 8000, 7500, 8200, 7800, 8400, 8200, 8600, 8800, 9000],
    'Product_B_Sales': [7000, 6500, 7200, 6800, 7000, 7300, 7100, 7600, 7200, 7300, 7400, 7500],
    'Product_C_Sales': [8000, 7000, 6800, 7500, 8200, 7900, 8500, 8100, 7900, 7800, 8000, 8300]
}

df = pd.DataFrame(data)

# One line per product, all sharing the month axis
plt.figure(figsize=(10, 6))
plt.plot(df['Month'], df['Product_A_Sales'], label='Product A')
plt.plot(df['Month'], df['Product_B_Sales'], label='Product B')
plt.plot(df['Month'], df['Product_C_Sales'], label='Product C')

plt.title('Sales of Product A, B, and C over the Months')
plt.xlabel('Month')
plt.ylabel('Sales')
plt.legend()
plt.show()

Q3 - Bar Plot:

- Create a bar plot to compare the monthly sales of all three products.

- Ensure the plot has a title, labeled axes, and a legend.

ANS 3 -

import pandas as pd
import matplotlib.pyplot as plt

data = {
    'Month': ['January', 'February', 'March', 'April', 'May', 'June', 'July', 'August',
              'September', 'October', 'November', 'December'],
    'Product_A_Sales': [5000, 6000, 7000, 8000, 7500, 8200, 7800, 8400, 8200, 8600, 8800, 9000],
    'Product_B_Sales': [7000, 6500, 7200, 6800, 7000, 7300, 7100, 7600, 7200, 7300, 7400, 7500],
    'Product_C_Sales': [8000, 7000, 6800, 7500, 8200, 7900, 8500, 8100, 7900, 7800, 8000, 8300]
}

df = pd.DataFrame(data)

# Stacked bars: each product starts where the previous one ends (via bottom=),
# so the full bar height is the month's total sales
plt.figure(figsize=(10, 6))
plt.bar(df['Month'], df['Product_A_Sales'], label='Product A')
plt.bar(df['Month'], df['Product_B_Sales'], bottom=df['Product_A_Sales'], label='Product B')
plt.bar(df['Month'], df['Product_C_Sales'],
        bottom=df['Product_A_Sales'] + df['Product_B_Sales'], label='Product C')

plt.title('Monthly Sales of Product A, B, and C')
plt.xlabel('Month')
plt.ylabel('Sales')
plt.legend()
plt.show()
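
The answer above stacks the bars, which shows the monthly total well but makes Product B and Product C harder to compare because they do not share a baseline. A possible alternative, sketched here as an assumption rather than the required solution, places the three bars side by side. The bar width of 0.25 and the 45-degree label rotation are arbitrary choices.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

df = pd.DataFrame({
    'Month': ['January', 'February', 'March', 'April', 'May', 'June', 'July', 'August',
              'September', 'October', 'November', 'December'],
    'Product_A_Sales': [5000, 6000, 7000, 8000, 7500, 8200, 7800, 8400, 8200, 8600, 8800, 9000],
    'Product_B_Sales': [7000, 6500, 7200, 6800, 7000, 7300, 7100, 7600, 7200, 7300, 7400, 7500],
    'Product_C_Sales': [8000, 7000, 6800, 7500, 8200, 7900, 8500, 8100, 7900, 7800, 8000, 8300],
})

x = np.arange(len(df))  # one slot per month
width = 0.25            # assumed bar width; three bars fit inside each slot

fig, ax = plt.subplots(figsize=(12, 6))
# Shift each product left/right of the slot centre so the bars sit side by side
ax.bar(x - width, df['Product_A_Sales'], width, label='Product A')
ax.bar(x, df['Product_B_Sales'], width, label='Product B')
ax.bar(x + width, df['Product_C_Sales'], width, label='Product C')

ax.set_xticks(x)
ax.set_xticklabels(df['Month'], rotation=45, ha='right')
ax.set_title('Monthly Sales of Product A, B, and C (grouped)')
ax.set_xlabel('Month')
ax.set_ylabel('Sales')
ax.legend()
fig.tight_layout()
plt.show()
```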

Q4 - Pie Chart:

- Create a pie chart to show the proportion of total annual sales contributed by each product.

- Ensure the plot has a title and data labels.

ANS 4 -

import pandas as pd
import matplotlib.pyplot as plt

data = {
    'Month': ['January', 'February', 'March', 'April', 'May', 'June', 'July', 'August',
              'September', 'October', 'November', 'December'],
    'Product_A_Sales': [5000, 6000, 7000, 8000, 7500, 8200, 7800, 8400, 8200, 8600, 8800, 9000],
    'Product_B_Sales': [7000, 6500, 7200, 6800, 7000, 7300, 7100, 7600, 7200, 7300, 7400, 7500],
    'Product_C_Sales': [8000, 7000, 6800, 7500, 8200, 7900, 8500, 8100, 7900, 7800, 8000, 8300]
}

df = pd.DataFrame(data)

# Annual total for each product (sum over the 12 months)
total_product_a_sales = df['Product_A_Sales'].sum()
total_product_b_sales = df['Product_B_Sales'].sum()
total_product_c_sales = df['Product_C_Sales'].sum()

plt.figure(figsize=(6, 6))
# autopct prints each slice's percentage share as the data label
plt.pie([total_product_a_sales, total_product_b_sales, total_product_c_sales],
        labels=['Product A', 'Product B', 'Product C'], autopct='%1.1f%%')

plt.title('Total Annual Sales by Product')
plt.axis('equal')  # keep the pie circular
plt.show()
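
The percentages drawn by `autopct` can be cross-checked numerically. The following is a minimal sketch that computes the same annual totals and shares with pandas only; the variable names `annual_totals` and `shares` are introduced here for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    'Product_A_Sales': [5000, 6000, 7000, 8000, 7500, 8200, 7800, 8400, 8200, 8600, 8800, 9000],
    'Product_B_Sales': [7000, 6500, 7200, 6800, 7000, 7300, 7100, 7600, 7200, 7300, 7400, 7500],
    'Product_C_Sales': [8000, 7000, 6800, 7500, 8200, 7900, 8500, 8100, 7900, 7800, 8000, 8300],
})

annual_totals = df.sum()                            # one annual total per product
shares = annual_totals / annual_totals.sum() * 100  # percentage of all sales

for product, pct in shares.items():
    print(f'{product}: {annual_totals[product]} ({pct:.1f}%)')
```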
Part 2: Advanced Data Visualization

Q1 - Scatter Plot:

- Create a scatter plot to analyze the relationship between `Product_A_Sales` and `Product_B_Sales`.

- Ensure the plot has a title, labeled axes, and a trendline.

ANS 1 -

import pandas as pd
import matplotlib.pyplot as plt
import numpy as np

data = {
    'Product_A_Sales': [5000, 6000, 7000, 8000, 7500, 8200, 7800, 8400, 8200, 8600, 8800, 9000],
    'Product_B_Sales': [7000, 6500, 7200, 6800, 7000, 7300, 7100, 7600, 7200, 7300, 7400, 7500]
}

df = pd.DataFrame(data)

plt.figure(figsize=(8, 6))
plt.scatter(df['Product_A_Sales'], df['Product_B_Sales'])
plt.title('Relationship between Product A Sales and Product B Sales')
plt.xlabel('Product A Sales')
plt.ylabel('Product B Sales')

# Trendline: least-squares straight line (degree-1 polynomial) through the points
z = np.polyfit(df['Product_A_Sales'], df['Product_B_Sales'], 1)
p = np.poly1d(z)
plt.plot(df['Product_A_Sales'], p(df['Product_A_Sales']), "r--")

plt.show()
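
The dashed trendline is only a picture of the fit. As a small sketch, the slope, intercept, and Pearson correlation behind it could be printed alongside the plot; this is an optional addition and not something the question asks for.

```python
import numpy as np

a = np.array([5000, 6000, 7000, 8000, 7500, 8200, 7800, 8400, 8200, 8600, 8800, 9000], dtype=float)
b = np.array([7000, 6500, 7200, 6800, 7000, 7300, 7100, 7600, 7200, 7300, 7400, 7500], dtype=float)

# np.polyfit returns coefficients from the highest degree down: [slope, intercept]
slope, intercept = np.polyfit(a, b, 1)

# Pearson correlation between the two sales series
r = np.corrcoef(a, b)[0, 1]

print(f'trendline: B = {slope:.3f} * A + {intercept:.1f}')
print(f'Pearson r = {r:.3f}')
```

For a straight-line fit, the slope equals `r` multiplied by the ratio of the standard deviations of B and A, so the two numbers always share the same sign.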

Q2 - Histogram:

- Create a histogram to show the distribution of `Total_Sales` across the months.

- Ensure the plot has a title and labeled axes.

ANS 2 -

import pandas as pd
import matplotlib.pyplot as plt

data = {
    'Month': ['January', 'February', 'March', 'April', 'May', 'June', 'July', 'August',
              'September', 'October', 'November', 'December'],
    'Product_A_Sales': [5000, 6000, 7000, 8000, 7500, 8200, 7800, 8400, 8200, 8600, 8800, 9000],
    'Product_B_Sales': [7000, 6500, 7200, 6800, 7000, 7300, 7100, 7600, 7200, 7300, 7400, 7500],
    'Product_C_Sales': [8000, 7000, 6800, 7500, 8200, 7900, 8500, 8100, 7900, 7800, 8000, 8300]
}

df = pd.DataFrame(data)

# Total sales per month across the three products
df['Total_Sales'] = df['Product_A_Sales'] + df['Product_B_Sales'] + df['Product_C_Sales']

plt.figure(figsize=(8, 6))
plt.hist(df['Total_Sales'], bins=10, edgecolor='black')

plt.title('Distribution of Total Sales Across Months')
plt.xlabel('Total Sales')
plt.ylabel('Frequency')
plt.show()
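
With only 12 monthly values, `bins=10` spreads the data thinly. A quick way to judge a bin count before plotting is to tabulate the counts with `np.histogram`, which uses equal-width bins over the data range. The sketch below compares 10 bins with 5; the choice of 5 is an assumption for illustration, not a recommendation from the assignment.

```python
import numpy as np

a = np.array([5000, 6000, 7000, 8000, 7500, 8200, 7800, 8400, 8200, 8600, 8800, 9000])
b = np.array([7000, 6500, 7200, 6800, 7000, 7300, 7100, 7600, 7200, 7300, 7400, 7500])
c = np.array([8000, 7000, 6800, 7500, 8200, 7900, 8500, 8100, 7900, 7800, 8000, 8300])

total_sales = a + b + c  # monthly totals, same as the Total_Sales column

# Equal-width bins spanning min..max of the data
counts10, edges10 = np.histogram(total_sales, bins=10)
counts5, edges5 = np.histogram(total_sales, bins=5)

print('10 bins:', counts10.tolist())
print(' 5 bins:', counts5.tolist())
```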

Q3 - Box Plot:

- Create a box plot to visualize the distribution of sales for Product A, Product B, and Product C.

- Ensure the plot has a title and labeled axes.

ANS 3 -

import pandas as pd
import matplotlib.pyplot as plt

data = {
    'Month': ['January', 'February', 'March', 'April', 'May', 'June', 'July', 'August',
              'September', 'October', 'November', 'December'],
    'Product_A_Sales': [5000, 6000, 7000, 8000, 7500, 8200, 7800, 8400, 8200, 8600, 8800, 9000],
    'Product_B_Sales': [7000, 6500, 7200, 6800, 7000, 7300, 7100, 7600, 7200, 7300, 7400, 7500],
    'Product_C_Sales': [8000, 7000, 6800, 7500, 8200, 7900, 8500, 8100, 7900, 7800, 8000, 8300]
}

df = pd.DataFrame(data)

plt.figure(figsize=(8, 6))
# One box per product; boxes are vertical by default, patch_artist=True fills them
plt.boxplot([df['Product_A_Sales'], df['Product_B_Sales'], df['Product_C_Sales']],
            patch_artist=True)

plt.title('Distribution of Sales for Product A, Product B, and Product C')
plt.xlabel('Product')
plt.ylabel('Sales')
# Boxes sit at x positions 1, 2, 3; replace the numbers with product names
plt.xticks([1, 2, 3], ['Product A', 'Product B', 'Product C'])
plt.show()
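
The box plot encodes a five-number summary per product. As a rough sketch, the same numbers can be read off with `DataFrame.describe()`, and the points that Matplotlib would draw as outliers under its default 1.5 × IQR whisker rule can be listed. The names `summary` and `flagged` are introduced here for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    'Product_A_Sales': [5000, 6000, 7000, 8000, 7500, 8200, 7800, 8400, 8200, 8600, 8800, 9000],
    'Product_B_Sales': [7000, 6500, 7200, 6800, 7000, 7300, 7100, 7600, 7200, 7300, 7400, 7500],
    'Product_C_Sales': [8000, 7000, 6800, 7500, 8200, 7900, 8500, 8100, 7900, 7800, 8000, 8300],
})

# Minimum, quartiles, and maximum for each product
summary = df.describe().loc[['min', '25%', '50%', '75%', 'max']]

# Fences at 1.5 * IQR beyond the quartiles
iqr = summary.loc['75%'] - summary.loc['25%']
lower = summary.loc['25%'] - 1.5 * iqr
upper = summary.loc['75%'] + 1.5 * iqr

# Values outside the fences, per product
flagged = {}
for col in df.columns:
    mask = (df[col] < lower[col]) | (df[col] > upper[col])
    flagged[col] = df.loc[mask, col].tolist()

print(summary)
print(flagged)
```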

Q4 - Heatmap:

- Create a heatmap to show the correlation between the sales of Product A, Product B, and Product C.

- Ensure the plot has a title and a color bar.

ANS 4 -

import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

data = {
    'Month': ['January', 'February', 'March', 'April', 'May', 'June', 'July', 'August',
              'September', 'October', 'November', 'December'],
    'Product_A_Sales': [5000, 6000, 7000, 8000, 7500, 8200, 7800, 8400, 8200, 8600, 8800, 9000],
    'Product_B_Sales': [7000, 6500, 7200, 6800, 7000, 7300, 7100, 7600, 7200, 7300, 7400, 7500],
    'Product_C_Sales': [8000, 7000, 6800, 7500, 8200, 7900, 8500, 8100, 7900, 7800, 8000, 8300]
}

df = pd.DataFrame(data)

# Pairwise Pearson correlation between the three product columns
corr_matrix = df[['Product_A_Sales', 'Product_B_Sales', 'Product_C_Sales']].corr()

plt.figure(figsize=(8, 6))
# sns.heatmap draws the color bar by default; vmin/vmax fix the scale to [-1, 1]
sns.heatmap(corr_matrix, annot=True, cmap='coolwarm', vmin=-1, vmax=1, square=True)

plt.title('Correlation Between Product A, Product B, and Product C Sales')
plt.show()
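
If Seaborn is not available, a comparable heatmap can be drawn with Matplotlib alone. The sketch below is an assumed fallback, not the answer's method: it uses `imshow` for the cells, `fig.colorbar` for the color bar, and `ax.text` for the cell annotations.

```python
import pandas as pd
import matplotlib.pyplot as plt

df = pd.DataFrame({
    'Product_A_Sales': [5000, 6000, 7000, 8000, 7500, 8200, 7800, 8400, 8200, 8600, 8800, 9000],
    'Product_B_Sales': [7000, 6500, 7200, 6800, 7000, 7300, 7100, 7600, 7200, 7300, 7400, 7500],
    'Product_C_Sales': [8000, 7000, 6800, 7500, 8200, 7900, 8500, 8100, 7900, 7800, 8000, 8300],
})

corr = df.corr()  # 3 x 3 Pearson correlation matrix
n = len(corr)

fig, ax = plt.subplots(figsize=(6, 5))
im = ax.imshow(corr.to_numpy(), cmap='coolwarm', vmin=-1, vmax=1)
fig.colorbar(im, ax=ax, label='Pearson correlation')

# Label rows and columns with the product column names
ax.set_xticks(range(n))
ax.set_xticklabels(corr.columns, rotation=45, ha='right')
ax.set_yticks(range(n))
ax.set_yticklabels(corr.index)

# Write each coefficient inside its cell (x is the column, y is the row)
for i in range(n):
    for j in range(n):
        ax.text(j, i, f'{corr.iloc[i, j]:.2f}', ha='center', va='center')

ax.set_title('Correlation Between Product Sales')
fig.tight_layout()
plt.show()
```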
Part 3: Customization and Presentation

Q1 - Plot Customization:

- Customize the plots with appropriate colors, markers, and styles to make them visually appealing and informative.

- Use annotations where necessary to highlight key data points.

ANS 1 -

import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

data = {
    'Month': ['January', 'February', 'March', 'April', 'May', 'June', 'July', 'August',
              'September', 'October', 'November', 'December'],
    'Product_A_Sales': [5000, 6000, 7000, 8000, 7500, 8200, 7800, 8400, 8200, 8600, 8800, 9000],
    'Product_B_Sales': [7000, 6500, 7200, 6800, 7000, 7300, 7100, 7600, 7200, 7300, 7400, 7500],
    'Product_C_Sales': [8000, 7000, 6800, 7500, 8200, 7900, 8500, 8100, 7900, 7800, 8000, 8300]
}

df = pd.DataFrame(data)

# Line plot with a distinct marker, line style, and color per product
plt.figure(figsize=(10, 6))
plt.plot(df['Month'], df['Product_A_Sales'], marker='o', linestyle='-', color='black', label='Product A')
plt.plot(df['Month'], df['Product_B_Sales'], marker='s', linestyle='--', color='red', label='Product B')
plt.plot(df['Month'], df['Product_C_Sales'], marker='D', linestyle='-.', color='green', label='Product C')

plt.title('Monthly Sales of Product A, Product B, and Product C')
plt.xlabel('Month')
plt.ylabel('Sales')
plt.legend()
plt.grid(True)

# Annotate each product's best month; idxmax() gives the row of the maximum
max_sales_a = df['Product_A_Sales'].max()
max_sales_b = df['Product_B_Sales'].max()
max_sales_c = df['Product_C_Sales'].max()

plt.annotate(f'Max Sales: {max_sales_a}',
             xy=(df['Month'][df['Product_A_Sales'].idxmax()], max_sales_a),
             xytext=(10, 10), textcoords='offset points', ha='center')
plt.annotate(f'Max Sales: {max_sales_b}',
             xy=(df['Month'][df['Product_B_Sales'].idxmax()], max_sales_b),
             xytext=(10, 10), textcoords='offset points', ha='center')
plt.annotate(f'Max Sales: {max_sales_c}',
             xy=(df['Month'][df['Product_C_Sales'].idxmax()], max_sales_c),
             xytext=(10, 10), textcoords='offset points', ha='center')

plt.show()

# Box plot of the sales distribution per product
plt.figure(figsize=(8, 6))
plt.boxplot([df['Product_A_Sales'], df['Product_B_Sales'], df['Product_C_Sales']],
            patch_artist=True)
plt.title('Distribution of Sales for Product A, Product B, and Product C')
plt.xlabel('Product')
plt.ylabel('Sales')
plt.xticks([1, 2, 3], ['Product A', 'Product B', 'Product C'])
plt.show()

# Correlation heatmap with annotated cells
corr_matrix = df[['Product_A_Sales', 'Product_B_Sales', 'Product_C_Sales']].corr()

plt.figure(figsize=(8, 6))
sns.heatmap(corr_matrix, annot=True, cmap='coolwarm', vmin=-1, vmax=1, square=True)
plt.title('Correlation Between Product A, Product B, and Product C Sales')
plt.show()
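
The three `plt.annotate` calls above differ only in the column they read. One possible tidy-up, sketched here under the assumption that the per-product styling lives in a dictionary (the name `styles` is hypothetical), is to plot and annotate in a single loop. Because the months are plotted in row order, the row index returned by `idxmax()` is also the x position of the point.

```python
import pandas as pd
import matplotlib.pyplot as plt

df = pd.DataFrame({
    'Month': ['January', 'February', 'March', 'April', 'May', 'June', 'July', 'August',
              'September', 'October', 'November', 'December'],
    'Product_A_Sales': [5000, 6000, 7000, 8000, 7500, 8200, 7800, 8400, 8200, 8600, 8800, 9000],
    'Product_B_Sales': [7000, 6500, 7200, 6800, 7000, 7300, 7100, 7600, 7200, 7300, 7400, 7500],
    'Product_C_Sales': [8000, 7000, 6800, 7500, 8200, 7900, 8500, 8100, 7900, 7800, 8000, 8300],
})

# Hypothetical style table: column -> (legend label, marker, line style, color)
styles = {
    'Product_A_Sales': ('Product A', 'o', '-', 'black'),
    'Product_B_Sales': ('Product B', 's', '--', 'red'),
    'Product_C_Sales': ('Product C', 'D', '-.', 'green'),
}

fig, ax = plt.subplots(figsize=(10, 6))
for col, (label, marker, linestyle, color) in styles.items():
    ax.plot(df['Month'], df[col], marker=marker, linestyle=linestyle, color=color, label=label)
    peak = df[col].idxmax()  # row index of the maximum, used as the x position
    ax.annotate(f'Max {label}: {df[col].max()}',
                xy=(peak, df[col].max()),
                xytext=(0, 10), textcoords='offset points',
                ha='center', color=color)

ax.set_title('Monthly Sales of Product A, Product B, and Product C')
ax.set_xlabel('Month')
ax.set_ylabel('Sales')
ax.tick_params(axis='x', rotation=45)
ax.grid(True)
ax.legend()
fig.tight_layout()
plt.show()
```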

Q2 - Subplots:

- Create a single figure with multiple subplots to present the data in a cohesive manner.

- Ensure the figure has a main title and each subplot is clearly labelled.

ANS 2 -

import matplotlib.pyplot as plt
import pandas as pd

data = {
    'Month': ['January', 'February', 'March', 'April', 'May', 'June', 'July', 'August',
              'September', 'October', 'November', 'December'],
    'Product_A_Sales': [5000, 6000, 7000, 8000, 7500, 8200, 7800, 8400, 8200, 8600, 8800, 9000],
    'Product_B_Sales': [7000, 6500, 7200, 6800, 7000, 7300, 7100, 7600, 7200, 7300, 7400, 7500],
    'Product_C_Sales': [8000, 7000, 6800, 7500, 8200, 7900, 8500, 8100, 7900, 7800, 8000, 8300]
}

df = pd.DataFrame(data)

df['Total_Sales'] = df[['Product_A_Sales', 'Product_B_Sales', 'Product_C_Sales']].sum(axis=1)
df['Average_Sales'] = df[['Product_A_Sales', 'Product_B_Sales', 'Product_C_Sales']].mean(axis=1)

# Three stacked panels: per-product sales, monthly total, monthly average
fig, axs = plt.subplots(3, 1, figsize=(10, 12))

axs[0].plot(df['Month'], df['Product_A_Sales'], label='Product A')
axs[0].plot(df['Month'], df['Product_B_Sales'], label='Product B')
axs[0].plot(df['Month'], df['Product_C_Sales'], label='Product C')
axs[0].set_title('Monthly Sales by Product')
axs[0].set_xlabel('Month')
axs[0].set_ylabel('Sales')
axs[0].legend()

axs[1].plot(df['Month'], df['Total_Sales'])
axs[1].set_title('Total Monthly Sales')
axs[1].set_xlabel('Month')
axs[1].set_ylabel('Total Sales')

axs[2].plot(df['Month'], df['Average_Sales'])
axs[2].set_title('Average Monthly Sales')
axs[2].set_xlabel('Month')
axs[2].set_ylabel('Average Sales')

fig.suptitle('Monthly Sales Analysis')
# rect leaves room at the top of the figure for the main title
fig.tight_layout(rect=[0, 0.03, 1, 0.95])

plt.show()
