SE Matplotlib

Uploaded by

Messon Vantard
SE_matplotlib

April 22, 2024

0.0.1 Discrete Data


• Line graph: visualizes trends over time or relationships between variables.
• Stem plot: draws each data point as a vertical line rising from a common baseline (this is what plt.stem produces; it is different from a textual stem-and-leaf display).
– Stems: vertical lines drawn from the baseline to each data value.
– Markers (the "leaves"): the individual data points, drawn at the tip of each stem.
– Baseline: the horizontal reference line from which the stems are drawn.
• Histogram: shows the frequency of numerical data using rectangles whose heights are the counts in each bin.
• Violin plot: visualizes the distribution of numerical data and its probability density.
– It is similar to a box plot but adds a kernel density estimation (KDE) plot mirrored on each side.
– This shows the data's central tendency, spread, and shape in a single figure.

0.0.2 Continuous Data


• Scatter plots: visualizing the relationship between two continuous variables

0.0.3 Categorical Data


• Bar chart:
– represents each category with a rectangular bar whose length or height is proportional to the value it represents.
– makes it easy to compare discrete categories.
• Stackplot/stacked area plot: displays the contribution of different categories or components
to the total value over a continuous range, typically time.
• Box plot/box-and-whisker plot: useful for comparing distributions across different groups or
identifying anomalies in the data.
• Pie Chart
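
The stacked area plot described in the list above is not demonstrated later in this notebook, so here is a minimal sketch of it. The years and the three series (solar, wind, hydro) are made-up illustrative values, not data from any of the notebook's datasets.

```python
import matplotlib.pyplot as plt

# Made-up illustrative values (hypothetical, not from the notebook's datasets)
years = [2019, 2020, 2021, 2022, 2023]
solar = [1, 2, 4, 6, 9]
wind = [3, 4, 5, 7, 8]
hydro = [6, 6, 7, 7, 7]

fig, ax = plt.subplots()

# Each series is drawn on top of the previous one, so the upper edge
# of the last layer is the total for that year
layers = ax.stackplot(years, solar, wind, hydro,
                      labels=['solar', 'wind', 'hydro'])

# The totals that the top of the stack traces out
totals = [s + w + h for s, w, h in zip(solar, wind, hydro)]

ax.legend(loc='upper left')
ax.set_xlabel('Year')
ax.set_ylabel('Contribution')
ax.set_title('Stacked area plot of three components')
plt.show()
```

The height of each colored band at a given year is that component's contribution; the outline of the whole stack is the total.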
[ ]: import pandas as pd
df = pd.read_csv("/home/mymate/UDM/Modules/Python_Introduction to Data Science/Datasets/iris.csv")

# titanic.csv
df.head()

1 LINE PLOT
Example 1
[ ]: # Python program to illustrate line plotting
import matplotlib.pyplot as plt

# year contains the x-axis values;
# e_india and e_bangladesh are the y-axis values for plotting
year = [1972, 1982, 1992, 2002, 2012]
e_india = [100.6, 158.61, 305.54, 394.96, 724.79]
e_bangladesh = [10.5, 25.21, 58.65, 119.27, 274.87]

# plot x-axis (year) against y-axis (power consumption),
# with a differently colored, labeled line for each country
plt.plot(year, e_india, color='orange', label='India')
plt.plot(year, e_bangladesh, color='g', label='Bangladesh')

# name the x-axis and y-axis
plt.xlabel('Years')
plt.ylabel('Power consumption in kWh')

# set the title of the plot
plt.title('Electricity consumption per capita of India and Bangladesh')

plt.legend()
plt.show()

[ ]:

Example 2
[ ]: # Python program to illustrate line plotting with markers and line styles
import matplotlib.pyplot as plt

# year contains the x-axis values;
# e_india and e_bangladesh are the y-axis values for plotting
year = [1972, 1982, 1992, 2002, 2012]
e_india = [100.6, 158.61, 305.54, 394.96, 724.79]
e_bangladesh = [10.5, 25.21, 58.65, 119.27, 274.87]

# plot x-axis (year) against y-axis (power consumption),
# with a differently colored, labeled line for each country
plt.plot(year, e_india, color='orange', label='India',
         marker='o', markersize=12)
plt.plot(year, e_bangladesh, color='g', label='Bangladesh',
         linestyle='dashed', linewidth=2)

# name the x-axis and y-axis
plt.xlabel('Years')
plt.ylabel('Power consumption in kWh')

# set the title of the plot
plt.title('Electricity consumption per capita of India and Bangladesh')

plt.legend()
plt.show()

[ ]:

2 STEM PLOT


[ ]: # importing libraries
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0.1, 2 * np.pi, 41)


y = np.exp(np.sin(x))

plt.stem(x, y)
plt.show()
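
The stems, markers, and baseline named in the introduction can each be styled separately, because stem() returns a container that unpacks into those three parts. The sketch below reuses the x and y from the cell above; the baseline position (bottom=1.0) and all colors and sizes are arbitrary choices for illustration.

```python
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0.1, 2 * np.pi, 41)
y = np.exp(np.sin(x))

fig, ax = plt.subplots()

# stem() returns a container that unpacks into the three parts of the plot;
# bottom sets the y-position of the baseline (1.0 is an arbitrary choice)
markerline, stemlines, baseline = ax.stem(x, y, bottom=1.0)

# Style each part on its own (colors and sizes are illustrative only)
plt.setp(markerline, color='orange', markersize=6)
plt.setp(stemlines, color='gray', linewidth=1)
plt.setp(baseline, color='black', linewidth=2)

ax.set_title('Stem plot with the baseline at y = 1')
plt.show()
```

Values below the baseline produce stems that point downward, which makes the baseline a convenient reference level.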

[ ]:

3 BAR CHART
[ ]: titanic = pd.read_csv("/home/mymate/UDM/Modules/Python_Introduction to Data Science/Datasets/titanic.csv")

[ ]: titanic.columns

Survival counts per class


[ ]: pclass_ct = titanic.groupby('class')['alive'].value_counts().unstack()
pclass_ct
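
To see what the groupby / value_counts / unstack chain produces, here is a small sketch on a hypothetical six-row table (not the real Titanic data). It also passes fill_value=0 to unstack, an optional addition that replaces missing class/outcome combinations with 0 instead of NaN.

```python
import pandas as pd

# Hypothetical miniature passenger table (not the real Titanic data)
toy = pd.DataFrame({
    'class': ['First', 'First', 'Second', 'Third', 'Third', 'Third'],
    'alive': ['yes', 'no', 'yes', 'no', 'no', 'yes'],
})

# One count per (class, alive) pair, indexed by both keys
counts = toy.groupby('class')['alive'].value_counts()

# Move the 'alive' level into the columns: one row per class,
# one column per outcome; combinations that never occur become 0
table = counts.unstack(fill_value=0)
print(table)
```

The resulting table has one row per class and one column per value of 'alive', which is the shape that DataFrame.plot(kind='bar') turns into grouped bars.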

[ ]: # Set up a grouped bar chart
pclass_ct.plot(kind='bar')

plt.legend(('Died', 'Survived'), loc='best')
plt.title('Survivors by Pclass')
plt.xlabel('Pclass')
plt.ylabel('Count')
plt.xticks(rotation=0)

plt.show()

[ ]: # Set up a stacked bar chart
pclass_ct.plot(kind='bar', stacked=True, figsize=(10, 5))

plt.legend(('Died', 'Survived'), loc='best')
plt.title('Survivors by Pclass')
plt.xlabel('Pclass')
plt.ylabel('Count')
plt.xticks(rotation=0)

plt.show()

[ ]:

4 HISTOGRAM
[ ]: iris = pd.read_csv("/home/mymate/UDM/Modules/Python_Introduction to Data Science/Datasets/iris.csv")

iris.columns

version 1
[ ]: # plotting histograms
plt.hist(iris['petal.length'], label='petal length')
plt.hist(iris['sepal.length'], label='sepal length')

plt.legend(loc='upper right')
plt.title('Overlapping')
plt.show()

version 2 - Transparency parameter ‘alpha’
Play around by changing the alpha value for both sepal length and petal length.
[ ]: # plotting histograms
plt.hist(iris['petal.length'], label='petal length', alpha = 0.5)

plt.hist(iris['sepal.length'], label='sepal length', alpha = 0.5)

plt.legend(loc='upper right')
plt.title('Overlapping')
plt.show()
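
When histograms overlap, each call to hist() chooses its own bins by default, so the bars of the two datasets need not line up. One possible refinement, sketched below, is to compute a single set of bin edges and pass it to both calls. The two samples are synthetic stand-ins generated with NumPy (not the iris columns), and the bin count of 15 is an arbitrary choice.

```python
import matplotlib.pyplot as plt
import numpy as np

# Synthetic stand-ins for two measurement columns (not the iris data)
rng = np.random.default_rng(0)
a = rng.normal(loc=3.8, scale=1.7, size=150)
b = rng.normal(loc=5.8, scale=0.8, size=150)

# One set of bin edges spanning both samples (15 bins is arbitrary)
edges = np.histogram_bin_edges(np.concatenate([a, b]), bins=15)

fig, ax = plt.subplots()

# Passing the same edges to both calls makes the bars directly comparable
counts_a, _, _ = ax.hist(a, bins=edges, alpha=0.5, label='sample a')
counts_b, _, _ = ax.hist(b, bins=edges, alpha=0.5, label='sample b')

ax.legend(loc='upper right')
ax.set_title('Overlapping histograms with shared bins')
plt.show()
```

With shared edges, a taller bar in one color always means more observations in exactly the same interval.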

version 3
Create more than 2 overlapping histograms with customized colors.

[ ]: plt.hist(iris['sepal.width'], alpha=0.5, label='sepal width', color='red')  # customized color parameter
plt.hist(iris['petal.width'], alpha=0.5, label='petal width', color='green')
plt.hist(iris['petal.length'], alpha=0.5, label='petal length', color='yellow')
plt.hist(iris['sepal.length'], alpha=0.5, label='sepal length', color='purple')

plt.legend(loc='upper right')
plt.show()

[ ]:

4.0.1 Introduction to Sub Plots in Matplotlib


To create multiple plots, use the matplotlib.pyplot.subplots function, which returns the Figure together with either a single Axes object or an array of Axes objects.
The nrows and ncols parameters of subplots() determine the number of rows and columns of the subplot grid.

Subplots in Matplotlib Using 1-D Array of Subplots
The code below generates a figure with two subplots.
The data in the lists ‘x’, ‘y’, and ‘z’ is plotted on separate axes within the figure: ‘y’ against ‘x’ in the first subplot and ‘z’ against ‘x’ in the second.
[ ]: # importing library
import matplotlib.pyplot as plt

# Some data to display


x = [1, 2, 3]
y = [0, 1, 0]
z = [1, 0, 1]

# Creating 2 subplots
fig, ax = plt.subplots(2)

# Accessing each axes object to plot the data through returned array
ax[0].plot(x, y)
ax[1].plot(x, z)
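
What subplots() returns as its second value depends on the grid shape, which is why the example above indexes ax with a single number. The following sketch, with arbitrary grid sizes and variable names chosen for illustration, shows the three cases and a common way to loop over a 2-D grid.

```python
import matplotlib.pyplot as plt

# No grid arguments: a single Axes object, not an array
fig1, ax_single = plt.subplots()

# One row, three columns: a 1-D array of Axes, indexed as ax_row[i]
fig2, ax_row = plt.subplots(nrows=1, ncols=3)

# Two rows, two columns: a 2-D array of Axes, indexed as ax_grid[r, c]
fig3, ax_grid = plt.subplots(nrows=2, ncols=2)

print(ax_row.shape, ax_grid.shape)

# flatten() allows one loop over every Axes regardless of the grid shape
for i, ax in enumerate(ax_grid.flatten()):
    ax.set_title(f'subplot {i}')

fig3.tight_layout()
plt.show()
```

Indexing a 2-D grid with one number (for example ax_grid[0]) returns a whole row of Axes rather than a single one, which is a frequent source of errors.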

[ ]:

Matplotlib Multiple Plots Same Figure
The code below creates a 2×2 grid of subplots.
Each subplot shows a different type of plot: line plot, scatter plot, bar plot, and histogram.
The Axes objects are accessed through the 2-D array ‘axs’ using a row index and a column index.
[ ]: # Create a 2x2 grid of subplots
fig, axs = plt.subplots(2, 2)

# Now axs is a 2D array of Axes objects
axs[0, 0].plot([1, 2, 3], [4, 5, 6])
axs[0, 1].scatter([1, 2, 3], [4, 5, 6])
axs[1, 0].bar([1, 2, 3], [4, 5, 6])
axs[1, 1].hist([1, 2, 2, 3, 3, 3, 3, 3, 3, 4, 4, 5])

plt.show()

5 VIOLIN PLOT
[ ]: iris.columns

[ ]: len(iris)

[ ]: iris['variety'].value_counts()

[ ]: plt.violinplot( iris['sepal.length'] )

[ ]:

[ ]: iris_petalLength = iris[['petal.length','variety']]

[ ]: iris_petalLength['variety'].value_counts()

[ ]: iris_petalLength_setosa = iris_petalLength[iris_petalLength['variety'] == 'Setosa']
iris_petalLength_versicolor = iris_petalLength[iris_petalLength['variety'] == 'Versicolor']
iris_petalLength_virginica = iris_petalLength[iris_petalLength['variety'] == 'Virginica']

[ ]: #plt.figure(figsize=(12,8))

plt.subplot(2,2,1)
plt.violinplot( iris_petalLength_setosa['petal.length'] )

plt.subplot(2,2,2)
plt.violinplot(iris_petalLength_versicolor['petal.length'] )

plt.subplot(2,2,3)
plt.violinplot(iris_petalLength_virginica['petal.length'] )
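
Separate subplots give each violin its own y-axis scale, which makes the groups hard to compare. As an alternative, violinplot() accepts a list of datasets and draws them side by side on one Axes. The sketch below uses three synthetic groups (generated with NumPy, not the iris data) and hypothetical group names.

```python
import matplotlib.pyplot as plt
import numpy as np

# Three synthetic groups standing in for per-variety measurements
rng = np.random.default_rng(1)
groups = {
    'group A': rng.normal(1.5, 0.2, 50),
    'group B': rng.normal(4.3, 0.5, 50),
    'group C': rng.normal(5.5, 0.6, 50),
}

fig, ax = plt.subplots()

# A list of datasets yields one violin per dataset at positions 1, 2, 3;
# showmedians adds a horizontal line at each median
parts = ax.violinplot(list(groups.values()), showmedians=True)

# Label the violin positions with the group names
ax.set_xticks([1, 2, 3])
ax.set_xticklabels(list(groups.keys()))
ax.set_ylabel('value')
ax.set_title('Violin plots on a shared axis')
plt.show()
```

Because all violins share one y-axis, differences in location and spread between the groups are visible at a glance.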

[ ]:

6 BOXPLOT
version 1
[ ]: plt.subplot(2,2,1)
plt.boxplot( iris_petalLength_setosa['petal.length'] )

plt.subplot(2,2,2)
plt.boxplot(iris_petalLength_versicolor['petal.length'] )

plt.subplot(2,2,3)
plt.boxplot(iris_petalLength_virginica['petal.length'] )

[ ]:

version 2
The parameters of .boxplot() define the following:
– x is your data.
– vert sets the plot orientation to horizontal when False. The default orientation is vertical.
– showmeans shows the mean of your data when True.
– meanline represents the mean as a line when True. The default representation is a point.
– labels: the labels of your data.
– patch_artist determines how to draw the graph: when True, the boxes are drawn as filled patches instead of lines.
– medianprops denotes the properties of the line representing the median.
– meanprops indicates the properties of the line or dot representing the mean.
[ ]:

[ ]: fig, ax = plt.subplots()
ax.boxplot((iris_petalLength_setosa['petal.length'],
            iris_petalLength_versicolor['petal.length'],
            iris_petalLength_virginica['petal.length']),
           showmeans=True, meanline=True,
           labels=('Setosa', 'Versicolor', 'Virginica'), patch_artist=True,
           medianprops={'linewidth': 2, 'color': 'purple'},
           meanprops={'linewidth': 2, 'color': 'red'})

ax.set_xlabel('Variety of Iris')
ax.set_ylabel('Petal Length')

plt.show()
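
The cell above does not use the orientation parameter from the list, so here is a minimal sketch of a horizontal box plot with vert=False. The three samples are synthetic (not the iris data) and the group names are hypothetical. The tick labels are set on the Axes afterwards rather than through a boxplot argument, so the sketch does not depend on how a given Matplotlib version names that argument; newer releases may also prefer a different spelling for the orientation option.

```python
import matplotlib.pyplot as plt
import numpy as np

# Synthetic samples standing in for three groups of measurements
rng = np.random.default_rng(2)
samples = [rng.normal(1.5, 0.2, 50),
           rng.normal(4.3, 0.5, 50),
           rng.normal(5.5, 0.6, 50)]
names = ['group A', 'group B', 'group C']

fig, ax = plt.subplots()

# vert=False lays the boxes out horizontally;
# showmeans=True marks each mean (as a point by default)
result = ax.boxplot(samples, vert=False, showmeans=True)

# In a horizontal box plot the groups sit on the y-axis at positions 1, 2, 3
ax.set_yticks([1, 2, 3])
ax.set_yticklabels(names)
ax.set_xlabel('value')
plt.show()
```

The dictionary returned by boxplot() holds the drawn artists (boxes, medians, means, whiskers, caps, fliers), which can be restyled after the call.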

[ ]:

[ ]:

7 SCATTER PLOT
[ ]:

Example 1
[ ]: import matplotlib.pyplot as plt

x = df['petal.length']
y = df['sepal.length']

# Create a scatter plot (single default color; no color mapping is applied)
plt.scatter(x, y)

# Set plot title and labels
plt.title('Sepal Length vs. Petal Length')
plt.xlabel('Petal Length (cm)')
plt.ylabel('Sepal Length (cm)')

# Display the plot


plt.show()
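
A scatter plot can encode a third variable as color by passing it through the c parameter together with a colormap. The sketch below is one way to do this; x, y, and z are synthetic values generated with NumPy (not the iris columns), and 'viridis' is an arbitrary colormap choice.

```python
import matplotlib.pyplot as plt
import numpy as np

# Synthetic data: x and y are related; z is a third variable shown as color
rng = np.random.default_rng(3)
x = rng.uniform(1, 7, 100)
y = 4 + 0.4 * x + rng.normal(0, 0.3, 100)
z = rng.uniform(0, 2.5, 100)

fig, ax = plt.subplots()

# c supplies one number per point; cmap maps those numbers to colors
points = ax.scatter(x, y, c=z, cmap='viridis')

# The colorbar acts as the legend for the color scale
fig.colorbar(points, ax=ax, label='z (third variable)')

ax.set_xlabel('x')
ax.set_ylabel('y')
ax.set_title('Scatter plot with color mapping')
plt.show()
```

For a categorical third variable, plotting each category with its own scatter() call and label is usually clearer than a continuous colormap.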

[ ]:

Example 2
[ ]: versicolor = df[df["variety"] == "Versicolor"]
setosa = df[df["variety"] == "Setosa"]
virginica = df[df["variety"] == "Virginica"]

fig, ax = plt.subplots()
fig.set_size_inches(8, 8)

plt.scatter(versicolor["petal.length"], versicolor["petal.width"], label="Versicolor")
plt.scatter(setosa["petal.length"], setosa["petal.width"], label="Setosa")
plt.scatter(virginica["petal.length"], virginica["petal.width"], label="Virginica")

plt.xlabel("Petal Length (cm)")
plt.ylabel("Petal Width (cm)")
plt.title("Iris Petal Sizes")
plt.legend()

[ ]:

8 PIE CHART
version 1
[ ]: from matplotlib import pyplot as plt

# Creating dataset
cars = ['AUDI', 'BMW', 'FORD', 'TESLA', 'JAGUAR', 'MERCEDES']
data = [23, 17, 35, 29, 12, 41]

# Creating plot
#fig = plt.figure(figsize=(10, 7))
plt.pie(data, labels=cars)

# show plot
plt.show()

[ ]:

version 2
[ ]: iris_flowers = ['SETOSA', 'VERSICOLOR', 'VIRGINICA']

# DATA: one subset per variety
setosa = iris[iris["variety"] == "Setosa"]
versicolor = iris[iris["variety"] == "Versicolor"]
virginica = iris[iris["variety"] == "Virginica"]

# Sum of petal length per variety (same order as iris_flowers)
setosa_sumPetalLength = setosa['petal.length'].sum()
versicolor_sumPetalLength = versicolor['petal.length'].sum()
virginica_sumPetalLength = virginica['petal.length'].sum()

data = [setosa_sumPetalLength, versicolor_sumPetalLength, virginica_sumPetalLength]

# Creating plot
#fig = plt.figure(figsize=(10, 7))
plt.pie(data, labels=iris_flowers)

# show plot
plt.show()

[ ]:

version 3
[ ]: iris_flowers = ['SETOSA', 'VERSICOLOR', 'VIRGINICA']

# DATA: one subset per variety
setosa = iris[iris["variety"] == "Setosa"]
versicolor = iris[iris["variety"] == "Versicolor"]
virginica = iris[iris["variety"] == "Virginica"]

# Sum of petal length per variety (same order as iris_flowers)
setosa_sumPetalLength = setosa['petal.length'].sum()
versicolor_sumPetalLength = versicolor['petal.length'].sum()
virginica_sumPetalLength = virginica['petal.length'].sum()

data = [setosa_sumPetalLength, versicolor_sumPetalLength, virginica_sumPetalLength]

# Creating plot with percentage labels on each slice
fig, ax = plt.subplots()
ax.pie(data, labels=iris_flowers, autopct='%1.1f%%')
ax.set_title("Pie Chart showcasing the sum of Petal Length for the 3 IRIS Varieties")

plt.show()
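
The autopct format string used above is applied to each slice's share of the total, expressed as a percentage. The sketch below computes those shares by hand for three made-up values (hypothetical, not the iris sums) and collects the text Matplotlib draws on the slices, so the two can be compared.

```python
import matplotlib.pyplot as plt

# Made-up slice values (hypothetical, not the iris sums)
labels = ['A', 'B', 'C']
data = [10, 30, 60]

# Share of each slice in percent: 100 * value / total
total = sum(data)
shares = [100 * v / total for v in data]

fig, ax = plt.subplots()

# With autopct set, pie() also returns the percentage text objects
wedges, texts, autotexts = ax.pie(data, labels=labels, autopct='%1.1f%%')

# The strings drawn inside the slices
shown = [t.get_text() for t in autotexts]
print(shares, shown)

plt.show()
```

In the format string, '%1.1f' prints the share with one decimal place and '%%' produces a literal percent sign.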

[ ]:
