matplotlib

The document contains Python code for data analysis and visualization using matplotlib and pandas. It includes exercises for calculating monthly profits, identifying top months, and creating various plots such as line charts, bar charts, and pie charts based on sales data. The code also demonstrates reading data from a CSV file and plotting sales over time.

Uploaded by

jjim49029
Copyright
© All Rights Reserved

Python Code in LaTeX

Python Code
import matplotlib
matplotlib.use('TkAgg')
import matplotlib.pyplot as plt
import pandas as pd

# Exercise 1
data = {
    'Month': ['January', 'February', 'March', 'April', 'May', 'June',
              'July', 'August', 'September', 'October', 'November', 'December'],
    'Revenue': [5000, 7000, 8000, 7500, 8500, 9000, 9500,
                8700, 9200, 9700, 8800, 10200],
    'Expenses': [3000, 4000, 3500, 3200, 3800, 4200, 4000,
                 4500, 4100, 4300, 3900, 4600]
}
df = pd.DataFrame(data)
# 1. Calculate the monthly profit: Profit = Revenue - Expenses
print(df)
monthly_profit = df['Revenue'] - df['Expenses']
print(monthly_profit)
# Add it as a new column to the DataFrame
df['MonthlyProfit'] = monthly_profit
print(df)

# 2. Identify the top 3 months with the highest profits
df_by_monthly_profit = df.sort_values(by='MonthlyProfit', ascending=False)
print(df_by_monthly_profit)
# Top 3 months with highest profits
print(f"The top 3 months with the highest profits:\n{df_by_monthly_profit.head(3)['Month']}")

# Plot a line chart comparing revenue, expenses, and profit over the months.
print(df)
# Plot each column separately using the default color cycle
plt.plot(df['Month'], df['Revenue'], label='Revenue', marker='o')
plt.plot(df['Month'], df['Expenses'], label='Expenses', marker='o')
plt.plot(df['Month'], df['MonthlyProfit'], label='Monthly Profit', marker='o')
# Customization
plt.title('Revenue, Expenses, and Profit Over the Months')
plt.xlabel('Month')
plt.ylabel('Amount')
plt.xticks(rotation=45)  # To rotate the labels
plt.grid(True)
plt.legend(title="Metrics")
plt.show()

# Bar plot showing revenue per month
plt.bar(df['Month'], df['Revenue'], color='red')
plt.title('Revenue per Month')
plt.xlabel('Month')
plt.ylabel('Revenue ($)')
plt.xticks(rotation=45)
plt.show()

# Exercise 2
import pandas as pd
# Read the CSV file into a DataFrame
df = pd.read_csv('Sales.csv')
# Convert the 'Date' column to datetime format
df['Date'] = pd.to_datetime(df['Date'])
# Display the DataFrame
print(df)

# 1. Bar chart showing total sales for each region:
sales_by_region = df.groupby('Region')['Sales'].sum()  # Series including sum of sales for each region
# Bar Chart for Sales by Region
x = sales_by_region.index  # Region name
plt.bar(x, sales_by_region, color='green', label='Total Sales')
plt.title('Total Sales by Region', fontsize=14)
plt.xlabel('Region', fontsize=12)
plt.ylabel('Total Sales', fontsize=12)
plt.grid(axis='y', linestyle='--', alpha=0.6)
plt.show()

# You can also use pandas plotting
sales_by_region = df.groupby('Region')['Sales'].sum()
# Plot the bar chart
sales_by_region.plot(kind='bar', color='g')
plt.title('Total Sales by Region')
plt.xlabel('Region')
plt.ylabel('Total Sales')
plt.grid(axis='y', linestyle='--', alpha=0.6)
plt.show()

# Pie chart for total quantity sold by category (with explode effect for the category with the highest quantity):
# Group by Category and sum quantity
quantity_by_category = df.groupby('Category')['Quantity'].sum().sort_values(ascending=False)
print(quantity_by_category)

# The category with the highest quantity will be the first one after sorting in descending order
# (the three-element explode and colors lists assume exactly three categories)
explode = [0.2, 0, 0]

# Plot the pie chart
quantity_by_category.plot(kind='pie', autopct='%1.1f%%', explode=explode,
                          colors=['g', 'y', 'r'], shadow=True)
plt.title('Total Quantity Sold by Category')
plt.ylabel('')  # Setting ylabel to an empty string to remove the label 'Quantity' from the pie chart
plt.savefig('QuantityByCategoryPie.png')
plt.show()

# Adjust the figure size
plt.figure(figsize=(10, 10))
# Top Plot: Pie Chart for Total Quantity Sold by Category
plt.subplot(2, 1, 1)  # 2 rows, 1 column, 1st subplot
quantity_by_category = df.groupby('Category')['Quantity'].sum().sort_values(ascending=False)

# The category with the highest quantity will be the first one
explode = [0.2, 0, 0]

# Plot the pie chart using pyplot
plt.pie(quantity_by_category, labels=quantity_by_category.index, autopct='%1.1f%%',
        explode=explode,
        colors=['lightgreen', 'lightblue', 'salmon'],
        shadow=True)
"""quantity_by_category.plot(kind='pie', autopct='%1.1f%%', explode=explode,
                          colors=['g', 'y', 'r'], shadow=True)"""
subplot_title_font_dict = {
    'fontsize': 14,        # Font size
    'fontweight': 'bold',  # Font weight
    'color': 'purple',     # Font color
}
plt.title('Total Quantity Sold by Category', fontdict=subplot_title_font_dict)

# Bottom Plot: Scatter Plot for Sales vs Date
plt.subplot(2, 1, 2)  # 2 rows, 1 column, 2nd subplot
plt.scatter(df['Date'], df['Sales'], color='purple', alpha=0.7)
plt.title('Sales Over Time', fontdict=subplot_title_font_dict)
plt.xlabel('Date', fontsize=12, fontweight='normal')
plt.ylabel('Sales', fontsize=12, fontweight='bold')
plt.grid(True, linestyle='--', alpha=0.6)

# Add a suptitle to the entire figure
# Default family used is 'DejaVu Sans'
plt.suptitle('Sales Data Analysis: Quantity by Category and Sales Over Time',
             fontsize=16, fontweight='bold', style='italic', family='monospace')
# Save the figure, including the suptitle, to a JPEG file
plt.savefig('Sales_Data_Analysis_Plot.jpg')

# Show the plots
plt.show()

Listing 1: Python Code
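
The listing hard-codes explode = [0.2, 0, 0], which only works when Sales.csv contains exactly three categories. The sketch below is one possible way to build the explode list from the data instead, so the largest slice is offset whatever the number of categories. It is an illustration under assumptions, not part of the original exercise: the sample DataFrame is a hypothetical stand-in for Sales.csv (only the column names Category and Quantity come from the listing), and the Agg backend replaces TkAgg so the code runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, no window is opened
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical stand-in for Sales.csv; the values are invented for illustration.
sample = pd.DataFrame({
    "Category": ["A", "B", "A", "C", "D", "B", "A"],
    "Quantity": [5, 3, 7, 2, 3, 1, 6],
})

# Same grouping and sorting as in the listing.
quantity_by_category = (
    sample.groupby("Category")["Quantity"].sum().sort_values(ascending=False)
)

# Offset only the slice with the highest quantity, for any number of categories.
top_category = quantity_by_category.idxmax()
explode = [0.2 if name == top_category else 0
           for name in quantity_by_category.index]

fig, ax = plt.subplots()
ax.pie(quantity_by_category, labels=quantity_by_category.index,
       autopct="%1.1f%%", explode=explode, shadow=True)
ax.set_title("Total Quantity Sold by Category")
plt.close(fig)  # release the figure; call fig.savefig(...) first to keep an image
```

Because the list is derived from quantity_by_category.index, it always has the same length as the data passed to pie, which avoids the length-mismatch error a fixed three-element list would raise with four or more categories. The colors argument is left out here for the same reason, so matplotlib falls back to its default color cycle.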
