This Python program performs data visualization using the matplotlib library. It creates line plots showing the salaries of data scientists and software engineers at different experience levels. It demonstrates various plotting options like adding titles, labels, legends, changing line styles, colors and widths, adding markers and grids. It also shows stacking multiple plots and using different plotting styles.


DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING (DATA SCIENCE)

COURSE CODE: DJS22DSL305 DATE: 1/12/2023
COURSE NAME: Python Laboratory CLASS: SYBTECH
Name: Minal Joshi SAP ID: 60009220180

EXPERIMENT NO. 8

CO/LO:
CO5: Apply various advanced modules of Python for data analysis.
AIM / OBJECTIVE: Write a Python program to perform visualization using matplotlib.
DESCRIPTION OF EXPERIMENT:

Importing Libraries
In [1]:
from matplotlib import pyplot as plt

In [2]: import seaborn as sns

In [3]: from matplotlib import font_manager as fm

In [4]: import pandas as pd

In [5]: import numpy as np

In [6]:

from datetime import datetime, timedelta #It's for time series

Matplotlib
Matplotlib is a comprehensive library for creating static, animated, and interactive visualizations in Python.

Matplotlib makes easy things easy and hard things possible.

Official Page of Matplotlib: https://matplotlib.org/stable/index.html

Pyplot
Pyplot is a collection of functions that make matplotlib work like MATLAB. Each pyplot function makes
some change to a figure, for example creating a figure, creating a plotting area in it, or drawing lines in that area.

In [7]:

x = [0,2,4,5,6,7,8,9,10]
y = [60,13,45,29,48,77,102,95,58]

In [8]:

plt.plot(x, y)
plt.show()

Line Plot
A line plot displays data as a series of points connected by straight line segments, which makes it easy to
see how one quantity changes with another.

We will plot data scientists' salaries with respect to their years of experience.

In [9]:

experience = [1,3,4,5,7,8,10,12]

salary = [6500, 9280, 12050, 13200, 16672, 21000, 23965, 29793]

In [10]:

plt.plot(experience,salary)
plt.show()

Adding a Title

In [11]:

plt.plot(experience,salary)
plt.title("Salary of Data Scientists by their experiences")
plt.show()

Adding Labels to x and y

In [12]:

plt.plot(experience,salary)
plt.title("Salary of Data Scientists by their experiences")
plt.xlabel("Experience")
plt.ylabel("Salary")
plt.show()

Plotting Multiple Graphs in One Graph

We will also add software engineer's salary to our graph.

In [13]:

experience = [1,3,4,5,7,8,10,12]
data_scientists_salary = [6500, 9280, 12050, 13200, 16672, 21000, 23965,
29793]
software_engineers_salary = [9020, 12873, 15725, 18000, 19790, 20196,
25769,32000 ]

In [14]:

plt.plot(experience,data_scientists_salary)
plt.plot(experience,software_engineers_salary)
plt.title("Salary of Data Scientists and Software Engineers by their experiences")
plt.xlabel("Experience")
plt.ylabel("Salary")
plt.show()

Without a legend we cannot tell which line represents which group, so we need to add one. We can pass the
labels to legend() as a list, or we can set them up front with the label argument of each plot() call.

In [15]:

In [16]:
plt.plot(experience,data_scientists_salary, label= "Data Scientists")
plt.plot(experience,software_engineers_salary, label= "Software Engineers" )
plt.title("Salary of Data Scientists and Software Engineers by their experiences")
plt.xlabel("Experience")
plt.ylabel("Salary")
plt.legend()
plt.show()
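
The list form mentioned above is not shown in the cells, so here is a minimal sketch of it, assuming the same salary data. The Agg backend line is only there so the sketch runs without a display and can be dropped in a notebook.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend (assumption: no display available)
from matplotlib import pyplot as plt

experience = [1, 3, 4, 5, 7, 8, 10, 12]
data_scientists_salary = [6500, 9280, 12050, 13200, 16672, 21000, 23965, 29793]
software_engineers_salary = [9020, 12873, 15725, 18000, 19790, 20196, 25769, 32000]

plt.figure()
plt.plot(experience, data_scientists_salary)
plt.plot(experience, software_engineers_salary)

# Labels are matched to the lines in the order in which the lines were drawn.
legend = plt.legend(["Data Scientists", "Software Engineers"])
legend_labels = [text.get_text() for text in legend.get_texts()]
print(legend_labels)
```

Passing label= to each plot() call, as in the cell above, is usually less error-prone because the label stays next to the data it describes.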

We can also change the location of the legend with the loc argument.

In [17]:

plt.plot(experience,data_scientists_salary, label= "Data Scientists")

plt.plot(experience,software_engineers_salary, label= "Software Engineers" )
plt.title("Salary of Data Scientists and Software Engineers by their experiences")
plt.xlabel("Experience")
plt.ylabel("Salary")
plt.legend(loc="lower right")
plt.show()

A format string consists of a part for color, marker and line:

In [18]: fmt = '[marker][line][color]'

Each of them is optional. If not provided, the value from the style cycle is used. Exception: If line is given,
but no marker, the data will be a line without markers.
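
As an illustration of the format string, the sketch below passes 'o--r' (circle marker, dashed line, red) as the third positional argument of plot() and then reads the properties back from the returned line. The data reuses the salary lists from the earlier cells; the Agg backend line is an assumption for running without a display.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend (assumption: no display available)
from matplotlib import pyplot as plt

experience = [1, 3, 4, 5, 7, 8, 10, 12]
salary = [6500, 9280, 12050, 13200, 16672, 21000, 23965, 29793]

plt.figure()
# "o" = circle marker, "--" = dashed line, "r" = red
(line,) = plt.plot(experience, salary, "o--r")

print(line.get_marker(), line.get_linestyle(), line.get_color())
```

The same result can be obtained with the keyword arguments marker, linestyle and color, which is the form used in the following cells.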

We can also specify the arguments:

In [19]:

plt.plot(experience,data_scientists_salary,color="r", label= "Data Scientists")
plt.plot(experience,software_engineers_salary, color="g", label= "Software Engineers" )
plt.title("Salary of Data Scientists and Software Engineers by their experiences")
plt.xlabel("Experience")
plt.ylabel("Salary")
plt.legend()
plt.show()

In [20]:

#We can also make lines different
plt.plot(experience,data_scientists_salary,color="r", linestyle="--", label= "Data Scientists")
plt.plot(experience,software_engineers_salary, color="g",linestyle=':', label= "Software Engineers" )
plt.title("Salary of Data Scientists and Software Engineers by their experiences")
plt.xlabel("Experience")
plt.ylabel("Salary")
plt.legend()
plt.show()

In [21]:

#We can also add markers

plt.plot(experience,data_scientists_salary,color="r", linestyle="--",marker="o", label= "Data Scientists")
plt.plot(experience,software_engineers_salary, color="g",linestyle=':',marker=".", label= "Software Engineers" )
plt.title("Salary of Data Scientists and Software Engineers by their experiences")
plt.xlabel("Experience")
plt.ylabel("Salary")
plt.legend()
plt.show()

In [22]:

#We can also adjust line width by using linewidth argument.

plt.plot(experience,data_scientists_salary,color="r", linestyle="--",linewidth=6,marker="o", label= "Data Scientists")
plt.plot(experience,software_engineers_salary, color="g",linestyle=':',marker=".",linewidth=6, label= "Software Engineers" )
plt.title("Salary of Data Scientists and Software Engineers by their experiences")
plt.xlabel("Experience")
plt.ylabel("Salary")
plt.legend()
plt.show()

tight_layout automatically adjusts subplot parameters so that the subplot(s) fit into the figure area. This is an
experimental feature and may not work in some cases. It only checks the extents of tick labels, axis labels,
and titles.
For more examples and details:
https://matplotlib.org/stable/tutorials/intermediate/tight_layout_guide.html
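
A minimal sketch of tight_layout together with a grid, assuming two side-by-side subplots of the salary data (the two-panel layout is an illustration, not part of the original notebook). After tight_layout() the two plotting areas should not overlap even though both carry titles and axis labels.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend (assumption: no display available)
from matplotlib import pyplot as plt

experience = [1, 3, 4, 5, 7, 8, 10, 12]
data_scientists_salary = [6500, 9280, 12050, 13200, 16672, 21000, 23965, 29793]
software_engineers_salary = [9020, 12873, 15725, 18000, 19790, 20196, 25769, 32000]

fig, (left, right) = plt.subplots(1, 2, figsize=(8, 3))

left.plot(experience, data_scientists_salary, color="r")
left.set_title("Data Scientists")
left.set_xlabel("Experience")
left.set_ylabel("Salary")
left.grid(True)   # draw grid lines on the left panel

right.plot(experience, software_engineers_salary, color="g")
right.set_title("Software Engineers")
right.set_xlabel("Experience")
right.set_ylabel("Salary")
right.grid(True)

# Adjust the subplot parameters so that labels and titles fit inside the figure.
fig.tight_layout()
```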

In [23]:

#We can also add a grid by calling plt.grid()

In [24]:

We can fill the area below a line by using stackplot.

In [25]:
plt.stackplot(experience,data_scientists_salary, colors=["g"])  # colors expects a list, one entry per layer
plt.title("Salary of Data Scientists by their experiences")
plt.xlabel("Experience")
plt.ylabel("Salary")
plt.show()

We can change the style of the plots. In order to see all available styles:

In [26]: plt.style.available
Out[26]:

['Solarize_Light2',
'_classic_test_patch',
'bmh',
'classic',
'dark_background',
'fast',
'fivethirtyeight',
'ggplot',
'grayscale',
'seaborn',
'seaborn-bright',
'seaborn-colorblind',
'seaborn-dark',
'seaborn-dark-palette',
'seaborn-darkgrid',
'seaborn-deep',
'seaborn-muted',
'seaborn-notebook',
'seaborn-paper',
'seaborn-pastel',
'seaborn-poster',
'seaborn-talk',
'seaborn-ticks',
'seaborn-white',
'seaborn-whitegrid',
'tableau-colorblind10']

In [27]:
plt.style.use('dark_background')
plt.plot(experience,data_scientists_salary,color="r", linestyle="--",linewidth=6,marker="o", label= "Data Scientists")
plt.plot(experience,software_engineers_salary, color="g",linestyle=':',marker=".",linewidth=6, label= "Software Engineers" )
plt.title("Salary of Data Scientists and Software Engineers by their experiences")
plt.xlabel("Experience")
plt.ylabel("Salary")
plt.legend()
plt.tight_layout()
plt.grid(True)
plt.show()

We can save figures by using the savefig() function.

In [28]:
plt.style.use('seaborn-dark')  # named 'seaborn-v0_8-dark' in Matplotlib 3.6 and later
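
The savefig() call itself does not appear in the cell above. The following is a minimal sketch of how it could look, assuming the salary data from earlier; the file name salary_plot.png and the temporary directory are hypothetical choices made for this illustration.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # headless backend (assumption: no display available)
from matplotlib import pyplot as plt

experience = [1, 3, 4, 5, 7, 8, 10, 12]
salary = [6500, 9280, 12050, 13200, 16672, 21000, 23965, 29793]

plt.figure()
plt.plot(experience, salary)
plt.title("Salary of Data Scientists by their experiences")
plt.xlabel("Experience")
plt.ylabel("Salary")

# Hypothetical output location; any writable path works.
output_path = os.path.join(tempfile.mkdtemp(), "salary_plot.png")

# Save before calling plt.show(); the file format is inferred from the extension.
plt.savefig(output_path, dpi=150, bbox_inches="tight")
print("saved to", output_path)
```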

Bar Plot
A bar plot (or bar chart) is one of the most common types of graphic. It shows the relationship between a
numeric and a categorical variable. Each entity of the categorical variable is represented as a bar. The size of the
bar represents its numeric value.

In [29]:
x = ["A", "B", "C", "D"]
y = [3, 8, 1, 10]

In [30]:
plt.bar(x,y)
plt.show()

In [31]:

experience = [1,2,3,4,5,6,7,8]
data_scientists_salary = [6500, 9280, 12050, 13200, 16672, 21000, 23965,
29793]

In [32]:
plt.style.use('seaborn-paper')  # named 'seaborn-v0_8-paper' in Matplotlib 3.6 and later
plt.bar(experience,data_scientists_salary,color="b")
plt.title("Salary of Data Scientists")
plt.xlabel("Experience")
plt.ylabel("Salary")
plt.tight_layout()
plt.grid(False)

plt.show()
We can combine bar and line plots.

In [33]: experience = [1,2,3,4,5,6,7,8]

data_scientists_salary = [6500, 9280, 12050, 13200, 16672, 21000, 23965,
29793]
software_engineers_salary = [9020, 12873, 15725, 18000, 19790, 20196,
25769,32000 ]

In [34]:
plt.style.use('tableau-colorblind10')
plt.bar(experience,data_scientists_salary,color="r", label= "Data Scientists")
plt.plot(experience,software_engineers_salary, color="g",label= "Software Engineers" )
plt.title("Salary of Data Scientists and Software Engineers by their experiences")
plt.xlabel("Experience")
plt.ylabel("Salary")
plt.legend()
plt.grid(False)
plt.show()

We can specify the width with width argument.

In [35]:
width = 0.2
plt.style.use('tableau-colorblind10')
plt.bar(experience,data_scientists_salary,color="m",width=width, label= "Data Scientists")
plt.title("Salary of Data Scientists by their experiences")
plt.xlabel("Experience")
plt.ylabel("Salary")
plt.legend()
plt.grid(False)
plt.show()

We can also plot multiple bar plots.

In [36]: plt.style.use("fivethirtyeight")

plt.bar(experience,software_engineers_salary, color="g",linewidth=3,label= "Software Engineers" )
plt.bar(experience,data_scientists_salary,color="r",linewidth=3, label= "Data Scientists")
plt.title("Salary of Data Scientists and Software Engineers by their experiences")
plt.xlabel("Experience")
plt.ylabel("Salary")
plt.legend()
plt.grid(False)
plt.show()

Because both sets of bars are drawn at the same x positions, they overlap and are hard to compare. We can shift the bars so that they sit side by side.

In [37]: experience_indexes = np.arange(len(experience))

In [38]:

experience_indexes

Out[38]:
array([0, 1, 2, 3, 4, 5, 6, 7])

In [39]:
plt.style.use("fivethirtyeight")
width = 0.4

# Shift each series by half a bar width so that the two bars of a group touch
# without overlapping the neighbouring group.
plt.bar(experience_indexes - width/2,software_engineers_salary,
color="g",width=width,linewidth=3,label= "Software Engineers" )
plt.bar(experience_indexes + width/2,data_scientists_salary,color="r",linewidth=3,
width=width, label= "Data Scientists")
plt.title("Salary of Data Scientists and Software Engineers by their experiences")
plt.xlabel("Experience")
plt.ylabel("Salary")
plt.legend()
plt.grid(False)
plt.show()

As you can see, the x axis now shows the bar indexes rather than the true experience values. We can fix this
with the xticks() function.

In [40]:
plt.style.use("fivethirtyeight")
width = 0.25
plt.bar(experience_indexes - width/2,software_engineers_salary,
color="g",width=width,linewidth=3,label= "Software Engineers" )
plt.bar(experience_indexes + width/2,data_scientists_salary,color="r",linewidth=3,
width=width, label= "Data Scientists")
plt.title("Salary of Data Scientists and Software Engineers by their experiences")
plt.xlabel("Experience")
plt.ylabel("Salary")
plt.xticks(ticks=experience_indexes, labels=experience)

plt.legend()
plt.grid(True)

plt.show()
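
The shifting idea generalizes to any number of series. The sketch below computes the x offsets that centre a group of adjacent bars on each tick; bar_offsets is a hypothetical helper written for this illustration, not a Matplotlib function.

```python
import numpy as np


def bar_offsets(n_series, width):
    """Return the x offsets that centre n_series adjacent bars on each tick."""
    return (np.arange(n_series) - (n_series - 1) / 2) * width


experience = [1, 2, 3, 4, 5, 6, 7, 8]
experience_indexes = np.arange(len(experience))
width = 0.4

offsets = bar_offsets(2, width)
software_x = experience_indexes + offsets[0]   # left bar of each group
data_x = experience_indexes + offsets[1]       # right bar of each group

print(offsets)
print(bar_offsets(3, 0.25))
```

For the grouped bars to stay clear of the next tick, n_series * width should be smaller than 1, the spacing between consecutive indexes.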

Pie Chart
A pie chart (or a circle chart) is a circular statistical graphic, which is divided into slices to illustrate
numerical proportion. In a pie chart, the arc length of each slice (and consequently its central angle and
area), is proportional to the quantity it represents.

In [41]:

experience = [1,2,3,4,5,6,7,8]
data_scientists_salary = [6500, 9280, 12050, 13200, 16672, 21000, 23965,
29793]
software_engineers_salary = [9020, 12873, 15725, 18000, 19790, 20196,
25769,32000 ]

In [42]:
plt.title("Pie Chart Example")
slices = [60,40]
plt.pie(slices)
plt.tight_layout()
plt.show()

The sum of the wedge values does not have to be 100. The size of each wedge is determined by comparing
its value with all the other values, using this formula:

The value divided by the sum of all values: x/sum(x)
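
The formula can be checked by hand. The sketch below applies x/sum(x) to the values used in the next cell and converts each fraction to a wedge angle in degrees; it is plain arithmetic and does not call Matplotlib.

```python
values = [40, 56, 72, 38, 4]

total = sum(values)
fractions = [v / total for v in values]   # share of the whole pie
angles = [360 * f for f in fractions]     # central angle of each wedge

for v, f, a in zip(values, fractions, angles):
    print(f"value={v:3d}  fraction={f:.3f}  angle={a:6.1f} degrees")
```

The fractions always add up to 1 and the angles to 360, whatever the sum of the input values is.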

In [43]:
list_1 = [40,56,72,38,4]
plt.pie(list_1)
plt.show()

We can add labels.

In [44]:

incomes = [40,56,72,38,4]
persons = ["Josh","Berkay","Maria","Michael","Anastacia"]
plt.pie(incomes,labels=persons)

plt.show()

By default, the first wedge starts at the positive x-axis and the wedges are drawn counterclockwise.
You can change the start angle with the startangle parameter, which is given in degrees; the default
angle is 0.

In [45]:

incomes = [40,56,72,38,4]
persons = ["Josh","Berkay","Maria","Michael","Anastacia"]
plt.pie(incomes,labels=persons,startangle=180)
plt.show()

If we want one of the wedges to stand out, we can use the explode parameter. It takes an array with one
value for each wedge.

In [46]:

incomes = [40,56,72,38,4]
persons = ["Josh","Berkay","Maria","Michael","Anastacia"]
myexplode = [0,0.2, 0, 0, 0]
plt.pie(incomes,labels=persons,startangle=180,explode = myexplode)
plt.show()

We can also change the size of the chart with the figsize argument.

In [47]:

plt.figure(figsize=(10,10))

plt.rcParams['font.size'] = 20
incomes = [40,56,72,38,4]
persons = ["Josh","Berkay","Maria","Michael","Anastacia"]
myexplode = [0,0.2, 0, 0, 0]
plt.pie(incomes,labels=persons,startangle=180,explode = myexplode)
plt.show()

We can also add shadows by making shadow argument True.

In [48]:

plt.figure(figsize=(10,10))

plt.rcParams['font.size'] = 20

incomes = [40,56,72,38,4]

persons = ["Josh","Berkay","Maria","Michael","Anastacia"]
myexplode = [0,0.2, 0, 0, 0]
plt.pie(incomes,labels=persons,startangle=180,explode = myexplode,shadow=True)
plt.show()

We can also set the color of each wedge with the colors parameter. Some of the possible color options are:

Shorthand   Color
"r"         Red
"g"         Green
"b"         Blue
"c"         Cyan
"m"         Magenta
"y"         Yellow
"k"         Black
"w"         White

For more color options: https://www.w3schools.com/colors/colors_names.asp

In [49]:

plt.figure(figsize=(10,10))

plt.rcParams['font.size'] = 20

incomes = [40,56,72,38,4]
persons = ["Josh","Berkay","Maria","Michael","Anastacia"]
myexplode = [0,0.2, 0, 0, 0]
colors = ["black","g","y","hotpink","#4CAF70"]
plt.pie(incomes,labels=persons,startangle=180,explode =
myexplode,shadow=True,colors=colors)

plt.show()

To add a list that explains each wedge, we can use the legend() function.

In [50]:

plt.figure(figsize=(7,7))

incomes = [40,56,72,38,4]
persons = ["Josh","Berkay","Maria","Michael","Anastacia"]
colors = ["black","g","y","hotpink","#4CAF70"]
plt.pie(incomes,labels=persons,colors=colors)
plt.legend()
plt.show()

We can also add a title to legends by using title parameter.

In [51]:
plt.style.use("fivethirtyeight")
plt.figure(figsize=(7,7))

incomes = [40,56,72,38,4]
persons = ["Josh","Berkay","Maria","Michael","Anastacia"]
colors = ["black","g","y","hotpink","#4CAF70"]
plt.pie(incomes,labels=persons,colors=colors)
plt.legend(title="Persons")
plt.show()

We can add percentages of slices by using autopct argument.

In [52]:
plt.style.use("fivethirtyeight")
plt.figure(figsize=(7,7))

In [53]:

fig = plt.figure(1, figsize=(6,6))

ax = fig.add_axes([0.1, 0.1, 0.8, 0.8])

plt.title('Raining Hogs and Dogs')
labels = 'Frogs', 'Hogs', 'Dogs', 'Logs'
fracs = [15,30,45, 10]

patches, texts, autotexts = ax.pie(fracs, labels=labels, autopct='%1.1f%%')

proptease = fm.FontProperties()
proptease.set_size('xx-small')
plt.setp(autotexts, fontproperties=proptease)
plt.setp(texts, fontproperties=proptease)

plt.show()
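
The same autopct argument can be applied to the incomes example from the earlier cells. The sketch below is an illustration under that assumption; it collects the percentage strings that pie() generates so they can be inspected without looking at the figure.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend (assumption: no display available)
from matplotlib import pyplot as plt

incomes = [40, 56, 72, 38, 4]
persons = ["Josh", "Berkay", "Maria", "Michael", "Anastacia"]

plt.figure(figsize=(7, 7))
# "%1.1f%%" formats each share with one decimal place followed by a percent sign.
wedges, texts, autotexts = plt.pie(incomes, labels=persons, autopct="%1.1f%%")

percent_labels = [t.get_text() for t in autotexts]
print(percent_labels)
```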

Stack Plot
The idea of stack plots is to show “parts to a whole” over time; basically, it’s like a pie-chart, only over
time.

We can use stackplot() built-in function.

In [54]: days = [1,2,3,4,5,6]

sleep = [6,7,5,8,6,7]
drinking_water = [2,2,1,2,1,1]
work = [5,7,10,8,6,9]
exercise = [3,3,0,1,3,2]

In [55]:

plt.style.use("fivethirtyeight")
plt.plot([],[],color='green', label='sleep', linewidth=3)
plt.plot([],[],color='blue', label='drinking_water', linewidth=3)
plt.plot([],[],color='red', label='work', linewidth=3)
plt.plot([],[],color='orange', label='exercise', linewidth=3)  # label matches the exercise series plotted below
plt.stackplot(days, sleep, drinking_water, work, exercise,
colors=['green','blue','red','orange'])

plt.xlabel('days')
plt.ylabel('activities')
plt.title('6 DAY ROUTINE STACK PLOT EXAMPLE')
plt.legend(loc="lower right")
plt.tight_layout()
plt.show()

We can visualize data that has a specific total.

In [56]:

stock1 = [5,3,3,6,1,8,2,7,9]
stock2 = [2,4,1,3,5,0,3,1,0]
stock3 = [2,2,5,1,3,1,4,1,0]

days = [1,2,3,4,5,6,7,8,9]

In [57]:
stocks= ["stock1","stock2","stock3"]
colors = ["#F9CDAD", "#FC9D9A", "#83AF9B"]
plt.title("Stack Plot of Stock Rates")
plt.stackplot(days,stock1,stock2,stock3,labels=stocks,colors=colors)
plt.legend()
plt.tight_layout()
plt.show()
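
With the default baseline, a stack plot draws each layer on top of the running sum of the layers below it. The sketch below reproduces those layer boundaries with NumPy for the stock data above, as a way to see the "parts of a whole" numerically; it does not draw anything.

```python
import numpy as np

stock1 = [5, 3, 3, 6, 1, 8, 2, 7, 9]
stock2 = [2, 4, 1, 3, 5, 0, 3, 1, 0]
stock3 = [2, 2, 5, 1, 3, 1, 4, 1, 0]

layers = np.vstack([stock1, stock2, stock3])

# Row i holds the upper edge of layer i: the sum of layers 0..i for each day.
upper_edges = np.cumsum(layers, axis=0)

# The top edge of the last layer is the total for each day.
totals = upper_edges[-1]
print(totals)
```

The totals show that the three stocks add up to 9 on every day except day 4, where they add up to 10.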

Histograms
A histogram is a graph showing frequency distributions.
It is a graph showing the number of observations within each given interval.

We use hist() function in order to create histograms.

In [58]: notes = [30,74,94,14,55,47,63,28,88,44,53,18,66,74,81]

In [59]:
plt.style.use("fivethirtyeight")
plt.hist(notes)
plt.show()

In [60]:
plt.style.use("fivethirtyeight")
plt.hist(notes,color="r")
plt.title("Notes")
plt.xlabel("Notes")
plt.ylabel("Person")
plt.tight_layout()
plt.grid(False)
plt.show()

We can add edge colors to the bars in order to make the chart easier to read.

In [61]: plt.style.use("fivethirtyeight")
plt.hist(notes,color="r",edgecolor="black")
plt.title("Notes")
plt.xlabel("Notes")
plt.ylabel("Person")
plt.tight_layout()
plt.grid(False)
plt.show()

We can specify the number of bins.

In [62]: plt.style.use("fivethirtyeight")
plt.hist(notes,bins=5,color="g",edgecolor="black")
plt.title("Notes")
plt.xlabel("Notes")
plt.ylabel("Person")
plt.tight_layout()
plt.grid(False)
plt.show()

We can also give the bin edges explicitly.

In [63]:

plt.style.use("fivethirtyeight")
bins = [10,45,65,80,100]
plt.hist(notes,bins=bins,color="g",edgecolor="black")
plt.title("Notes")
plt.xlabel("Notes")
plt.ylabel("Person")
plt.tight_layout()
plt.grid(False)
plt.show()
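
To see what the explicit bin edges do, the counts can be computed directly with np.histogram, which uses the same binning convention as hist(): every bin includes its left edge and excludes its right edge, except the last bin, which includes both. This is a numerical sketch of the cell above, not additional notebook output.

```python
import numpy as np

notes = [30, 74, 94, 14, 55, 47, 63, 28, 88, 44, 53, 18, 66, 74, 81]
bins = [10, 45, 65, 80, 100]

counts, edges = np.histogram(notes, bins=bins)

for left, right, count in zip(edges[:-1], edges[1:], counts):
    print(f"[{left}, {right}): {count} notes")
```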
Let's plot a normal distribution (bell shape).

In [64]:
x = np.random.normal(170, 10, 250)
plt.hist(x,color="gray",edgecolor="black")
plt.title("Normal Distribution")
plt.xlabel("Numbers")
plt.ylabel("Count")
plt.tight_layout()
plt.grid(False)
plt.show()

For a real-world example, we will work with the Human Resources Data Set.
The dataset can be downloaded from here: https://www.kaggle.com/rhuebner/human-resources-data-set
We will read it with pandas.
In [65]: df = pd.read_csv("../input/human-resources-data-set/HRDataset_v14.csv")

In [66]: df.head()
Out[66]:

[Table: the first five rows of the data set, with columns such as Employee_Name, EmpID, Salary, ManagerName, RecruitmentSource, PerformanceScore and Absences]
5 rows × 36 columns


We will work with Salary column.

In [67]:
plt.style.use("fivethirtyeight")
bins = [40000,55000,70000,85000,100000,120000]
plt.hist(df.Salary,bins=bins,color="blue",edgecolor="black")
plt.title("Salaries of Workers")
plt.xlabel("Salary")
plt.ylabel("Count")
plt.tight_layout()
plt.grid(False)
plt.show()

If some values are much higher than the rest, we can plot the counts on a logarithmic scale.

In [68]:
plt.style.use("fivethirtyeight")
bins = [40000,55000,70000,85000,100000,120000]
plt.hist(df.Salary,bins=bins,color="blue",edgecolor="black",log=True)
plt.title("Salaries of Workers")
plt.xlabel("Salary")
plt.ylabel("Count")
plt.tight_layout()
plt.grid(False)
plt.show()

We can also add the median and mean of the salaries.

In [69]:
plt.style.use("fivethirtyeight")
salary_median = df.Salary.median()
salary_mean = df.Salary.mean()
bins = [40000,55000,70000,85000,100000,120000]
plt.hist(df.Salary,bins=bins,color="blue",edgecolor="black")
plt.axvline(salary_median, color="gray", label="Salary Median", linewidth=3)
plt.axvline(salary_mean, color="green", label="Salary Mean", linewidth=3)
plt.legend()
plt.title("Salaries of Workers")
plt.xlabel("Salary")
plt.ylabel("Count")
plt.tight_layout()
plt.grid(False)
plt.show()

Scatter Plots
Scatter plots are used to plot data points on horizontal and vertical axes in an attempt to show how much
one variable is affected by another.

In [70]:

first_exam_grades = [89, 90, 70, 89, 100, 80, 90, 100, 80, 34]
second_exam_grades = [30, 29, 49, 48, 100, 48, 38, 45, 20, 30]

In [71]: plt.title("Exam Grades Scatter plot")

plt.scatter(first_exam_grades,second_exam_grades)
plt.tight_layout()
plt.xlabel("First Exam Grades")
plt.ylabel("Second Exam Grades")
plt.grid(True)
plt.show()

We can change the dot size and color.

In [72]: plt.title("Exam Grades Scatter plot")

plt.scatter(first_exam_grades,second_exam_grades,s=100,color="r")
plt.tight_layout()
plt.xlabel("First Exam Grades")
plt.ylabel("Second Exam Grades")
plt.grid(True)
plt.show()
We can also give each dot its own size by passing an array of sizes.

In [73]:

plt.title("Exam Grades Scatter plot")

sizes = np.array([20,50,100,200,500,1000,60,90,10,300])
plt.scatter(first_exam_grades,second_exam_grades,s=sizes,color="r")
plt.tight_layout()
plt.xlabel("First Exam Grades")
plt.ylabel("Second Exam Grades")
plt.grid(True)
plt.show()

We can also change the marker.

In [74]:
plt.title("Exam Grades Scatter plot")

plt.scatter(first_exam_grades,second_exam_grades,s=100,color="green",marker="x")
plt.tight_layout()
plt.xlabel("First Exam Grades")
plt.ylabel("Second Exam Grades")
plt.grid(True)
plt.show()

We can also draw two different data sets in the same scatter plot.

In [75]:

first_exam_grades = [89, 90, 70, 89, 100, 80, 90, 100, 80, 34]
first_study_hours = [6,8,3,9,9,1,4,2,2,5]

In [76]:
second_exam_grades = [30, 29, 49, 48, 100, 48, 38, 45, 20, 30]
second_study_hours = [2,7,1,5,3,3,2,6,3,2]

In [77]:
plt.title("Exam Grades Scatter plot")

plt.scatter(first_exam_grades,first_study_hours,s=100,color="green",marker="x")
plt.scatter(second_exam_grades,second_study_hours,s=100,color="red")
plt.tight_layout()
plt.xlabel("Exam Grades")
plt.ylabel("Study Hours")
plt.grid(True)
plt.show()

You can also add features and colormaps.

In [78]:

first_exam_grades = [89, 90, 70, 89, 100, 80, 90, 100, 80, 34]
second_exam_grades = [30, 29, 49, 48, 100, 48, 38, 45, 20, 30]
colors = [7, 5, 9, 7, 5, 7, 2, 5, 3, 7]
sizes = [209, 486, 381, 255, 191, 315, 185, 228, 174,538]

In [79]:

plt.title("Exam Grades Scatter plot")

plt.scatter(first_exam_grades,second_exam_grades,s=sizes,c=colors,cmap="Blues"
,edgecolor="black")
cbar = plt.colorbar()
cbar.set_label("Exam Grades")
plt.tight_layout()
plt.xlabel("First Exam Grades")
plt.ylabel("Second Exam Grades")
plt.grid(True)
plt.show()

Contour(Level) Plots
Contour plots (sometimes called level plots) are a way to show a three-dimensional surface on a
two-dimensional plane. A contour plot graphs two predictor variables, X and Y, on the x- and y-axes and a
response variable Z as contours. These contours are sometimes called the z-slices or the iso-response values.

In [80]:

x = [0,3,6,9,13,15,19,23,26,29,33,35,39,41,47,56]
y = [5,8,13,16,17,20,25,26,30,33,37,39,41,44,48,59]

In order to create contour plot, first we will use numpy's meshgrid function, and then use contour
function.

In [81]:

# Creating 2-D grid of features

[X, Y] = np.meshgrid(x, y)
fig, ax = plt.subplots(1, 1)
Z = np.sqrt(X**2+Y**2)

# plots contour lines
ax.contour(X, Y, Z)
ax.set_title('Contour Plot')
ax.set_xlabel('X values')
ax.set_ylabel('Y values')
plt.show()
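
To make the meshgrid step more concrete, the sketch below builds a very small grid and prints the resulting arrays. The short x and y lists are made up for this illustration; the Z formula is the one used in the cell above.

```python
import numpy as np

x = [0, 3, 6]   # three x positions (illustrative values)
y = [5, 8]      # two y positions (illustrative values)

# X repeats the x values along each row, Y repeats the y values along each column.
X, Y = np.meshgrid(x, y)

# Z is evaluated at every (x, y) pair of the grid at once.
Z = np.sqrt(X**2 + Y**2)

print(X)
print(Y)
print(Z.shape)
```

Both X and Y have shape (len(y), len(x)), which is the layout that contour() and contourf() expect for Z.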

We can also fill the regions between the contour lines by using the contourf() function.

In [82]:

# Creating 2-D grid of features

[X, Y] = np.meshgrid(x, y)
fig, ax = plt.subplots(1, 1)
Z = np.sqrt(X**2+Y**2)

# plots filled contours
ax.contourf(X, Y, Z)
ax.set_title('Contour Plot')
ax.set_xlabel('X values')
ax.set_ylabel('Y values')
plt.show()

Violin Plots
Violin plots are similar to box plots, except that they also show the probability density of the data at
different values. These plots include a marker for the median of the data and a box indicating the
interquartile range, as in the standard box plots.

In [83]:

x = [0,3,6,9,13,15,19,23,26,29,33,35,39,41,47,56]
y = [5,8,13,16,17,20,25,26,30,33,37,39,41,44,48,59]

In [84]:
data=[x,y] #First we will combine the collections
fig = plt.figure()
ax = fig.add_axes([0,0,1,1])
bp = ax.violinplot(data)
plt.grid(False)
plt.title("Violin Plot")
plt.show()
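
Matplotlib's violinplot() draws the extrema by default but not the median, so the median marker described above has to be requested. The sketch below does that with showmedians=True, assuming the same x and y data; the tick labels "x" and "y" are added only for readability.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend (assumption: no display available)
from matplotlib import pyplot as plt

x = [0, 3, 6, 9, 13, 15, 19, 23, 26, 29, 33, 35, 39, 41, 47, 56]
y = [5, 8, 13, 16, 17, 20, 25, 26, 30, 33, 37, 39, 41, 44, 48, 59]
data = [x, y]

fig, ax = plt.subplots()
# showmedians adds a horizontal line at the median of each data set.
parts = ax.violinplot(data, showmedians=True, showextrema=True)

# Violins are placed at positions 1, 2, ... by default.
ax.set_xticks([1, 2])
ax.set_xticklabels(["x", "y"])
ax.set_title("Violin Plot with Medians")

print(sorted(parts.keys()))
```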

Plotting Time Series

A time series is a sequence of numerical data points in successive order. In investing, a time series tracks
the movement of the chosen data points, such as a security's price, over a specified period of time with
data points recorded at regular intervals.

In [85]:

dates = [
    datetime(2021, 3, 10),
    datetime(2021, 3, 13),
    datetime(2021, 3, 14),
    datetime(2021, 3, 15),
    datetime(2021, 3, 16),
    datetime(2021, 3, 17),
    datetime(2021, 3, 18),
    datetime(2021, 3, 19)
]

values = [0,3,4,7,5,3,5,6]

In [86]:
plt.title("Time Series")
plt.plot_date(dates, values)
plt.xticks(rotation='vertical')
plt.show()

We can add a line to plot.

In [87]: plt.title("Time Series")

plt.plot_date(dates, values,linestyle="solid",marker = 'o',ms = 20, mfc =
'r',c="b" )
plt.xticks(rotation='vertical')
plt.xlabel("Dates")
plt.ylabel("Values")
plt.grid(False)

plt.show()
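
As an alternative sketch, the same time series can be drawn with the regular plot() function, which accepts datetime objects directly. This may be useful because plot_date() has been marked as deprecated in recent Matplotlib releases (to my knowledge since version 3.9); the styling values below mirror the cell above.

```python
from datetime import datetime

import matplotlib
matplotlib.use("Agg")  # headless backend (assumption: no display available)
from matplotlib import pyplot as plt

dates = [
    datetime(2021, 3, 10), datetime(2021, 3, 13), datetime(2021, 3, 14),
    datetime(2021, 3, 15), datetime(2021, 3, 16), datetime(2021, 3, 17),
    datetime(2021, 3, 18), datetime(2021, 3, 19),
]
values = [0, 3, 4, 7, 5, 3, 5, 6]

fig, ax = plt.subplots()
# plot() converts the datetime objects to date coordinates automatically.
(line,) = ax.plot(dates, values, linestyle="solid", marker="o", ms=20, mfc="r", c="b")

ax.set_title("Time Series")
ax.set_xlabel("Dates")
ax.set_ylabel("Values")
fig.autofmt_xdate()  # rotate and align the date labels so they do not overlap
```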
Box Plot
In descriptive statistics, a box plot or boxplot is a method for graphically depicting groups of numerical
data through their quartiles. Box plots may also have lines extending from the boxes (whiskers) indicating
variability outside the upper and lower quartiles. We will use plt.boxplot() function for that.

In [88]:

Salaries = [6900,7500,4700,11997,22000,16550,9655,8670,15090,29000,7600,14980,1250]

In [89]:
plt.boxplot(Salaries)
plt.title("Box Plot of Salaries")
plt.ylabel("Salaries")
plt.show()
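
The numbers behind the box can be computed by hand. The sketch below uses np.percentile for the quartiles and the common 1.5 × IQR rule for the whisker limits, which matches the default whis=1.5 of boxplot(); the variable names are chosen for this illustration.

```python
import numpy as np

Salaries = [6900, 7500, 4700, 11997, 22000, 16550, 9655, 8670, 15090, 29000, 7600, 14980, 1250]

# Quartiles: the box spans q1..q3 and the line inside it marks the median.
q1, median, q3 = np.percentile(Salaries, [25, 50, 75])
iqr = q3 - q1

# Points beyond these fences are drawn as individual outlier markers.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = [s for s in Salaries if s < lower_fence or s > upper_fence]

print("Q1:", q1, "median:", median, "Q3:", q3)
print("fences:", lower_fence, upper_fence)
print("outliers:", outliers)
```

For this data the only value outside the fences is 29000, which is the single outlier point shown above the upper whisker.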

By making notch argument True, we can create notched boxes.

In [90]:
plt.boxplot(Salaries, notch=True)
plt.title("Box Plot of Salaries")
plt.ylabel("Salaries")
plt.show()

We can change the color and shape of the outlier markers.

In [91]: green_diamond = dict(markerfacecolor='g', marker='D')

plt.boxplot(Salaries, notch=True, flierprops=green_diamond)
plt.title("Box Plot of Salaries")
plt.ylabel("Salaries")
plt.show()

We can make showfliers argument False in order to hide Outlier Points.

In [92]:
plt.boxplot(Salaries, notch=True, showfliers=False)
plt.title("Box Plot of Salaries")
plt.ylabel("Salaries")
plt.show()

We can plot it horizontally by making the vert argument False.

In [93]:
plt.boxplot(Salaries, notch=True, showfliers=False, vert=False)
plt.title("Box Plot of Salaries")
plt.xlabel("Salaries")
plt.show()

Heatmap
It is often desirable to show data which depends on two independent variables as a color coded image
plot. This is often referred to as a heatmap. If the data is categorical, this would be called a categorical
heatmap.

A heat map is a data visualization technique that shows magnitude of a phenomenon as color in two
dimensions. The variation in color may be by hue or intensity, giving obvious visual cues to the reader
about how the phenomenon is clustered or varies over space. It's generally used to understand
correlations between variables.
Matplotlib's imshow() function or seaborn's heatmap() function makes production of such plots particularly easy.
matplotlib.pyplot.pcolormesh() is an alternative function.

In [94]:

data = np.random.random(( 6 , 6 ))
data

Out[94]:

array([[0.80507909, 0.33227263, 0.11982315, 0.43404586, 0.00439907,
0.36893734],
[0.32868725, 0.49988084, 0.27837337, 0.34234132, 0.04036771,
0.139306 ],
[0.32850119, 0.5059023 , 0.22190802, 0.74768769, 0.25377415,
0.06561068],
[0.28074898, 0.93220202, 0.28393501, 0.0436093 , 0.7720071 ,
0.33618152],
[0.69157098, 0.51986173, 0.35829982, 0.35398844, 0.67410653,
0.15522212],
[0.10798746, 0.73882111, 0.50958755, 0.62619836, 0.28166925,
0.22563274]])

In [95]:
plt.imshow( data , cmap = 'autumn' )
plt.title( "2-D Heat Map" )
plt.show()
In [96]:

sns.heatmap( data , linewidth = 0.5 , cmap = 'coolwarm' )

plt.title( "2-D Heat Map" )

plt.show()

In [97]:

plt.pcolormesh( data , cmap = 'summer' )

plt.title( '2-D Heat Map' )
plt.show()

For a real world example, we will use flights dataset of Seaborn.

In [98]:
flights = sns.load_dataset("flights")

In [99]:
flights.head()

Out[99]:

year month passengers

0 1949 Jan 112
1 1949 Feb 118
2 1949 Mar 132
3 1949 Apr 129
4 1949 May 121
Let's make a pivot table in order to make this dataset ready to plot heatmap. Otherwise heatmap will not
work.

In [100]:
flights = flights.pivot(index="month", columns="year", values="passengers")  # keyword arguments are required in pandas 2.0 and later
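
To show what the pivot step does, here is a small sketch on a hand-made table with the same three columns. The four rows are copied from the flights output in this section; the names long_df and wide_df are chosen for this illustration.

```python
import pandas as pd

# Long format: one row per (year, month) observation.
long_df = pd.DataFrame({
    "year": [1949, 1949, 1950, 1950],
    "month": ["Jan", "Feb", "Jan", "Feb"],
    "passengers": [112, 118, 115, 126],
})

# Wide format: one row per month, one column per year, passengers as cell values.
wide_df = long_df.pivot(index="month", columns="year", values="passengers")
print(wide_df)
```

A heatmap needs this wide layout because each cell of the grid corresponds to one row label and one column label. In this small example the months are plain strings and are therefore sorted alphabetically; in the seaborn dataset the month column keeps calendar order.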

In [101]: flights
Out[101]:

year 1949 1950 1951 1952 1953 1954 1955 1956 1957 1958 1959 1960
month
Jan 112 115 145 171 196 204 242 284 315 340 360 417
Feb 118 126 150 180 196 188 233 277 301 318 342 391
Mar 132 141 178 193 236 235 267 317 356 362 406 419
Apr 129 135 163 181 235 227 269 313 348 348 396 461
May 121 125 172 183 229 234 270 318 355 363 420 472
Jun 135 149 178 218 243 264 315 374 422 435 472 535
Jul 148 170 199 230 264 302 364 413 465 491 548 622
Aug 148 170 199 242 272 293 347 405 467 505 559 606
Sep 136 158 184 209 237 259 312 355 404 404 463 508
Oct 119 133 162 191 211 229 274 306 347 359 407 461
Nov 104 114 146 172 180 203 237 271 305 310 362 390
Dec 118 140 166 194 201 229 278 306 336 337 405 432
In [102]:

sns.heatmap( flights , linewidth = 0.5 , cmap = 'YlGn' )

plt.title( "Flights Heat Map" )
plt.show()

OBSERVATIONS / DISCUSSION OF RESULT:

1. Plot
2. Pie Plot
3. Violin Plot
4. Box Plot
5. Bar Plot
6. Scatter Plot
7. Heatmap
8. Stack Plot

CONCLUSION:
Therefore, we have learnt the matplotlib and seaborn modules in Python and how to use them to
create bar graphs, pie plots, violin plots, scatter plots, box plots, etc. for data
visualization.

REFERENCES:
Website References:
[1] https://www.mygreatlearning.com/
[2] https://www.geeksforgeeks.org/

72 pages
Data Visualization
No ratings yet
Data Visualization
66 pages
Lab 10
No ratings yet
Lab 10
16 pages
Chapter11_DataVisualization2
No ratings yet
Chapter11_DataVisualization2
43 pages
Matplotlib
No ratings yet
Matplotlib
17 pages
Black and White Blank Note Document
No ratings yet
Black and White Blank Note Document
57 pages
MatplotLib - Charts
No ratings yet
MatplotLib - Charts
30 pages
IDS U-5
No ratings yet
IDS U-5
110 pages
XII-IP - Data Visualisation
No ratings yet
XII-IP - Data Visualisation
65 pages
DSP LAB-3(part-a)
No ratings yet
DSP LAB-3(part-a)
16 pages
FOD Record Sem 1
No ratings yet
FOD Record Sem 1
25 pages
Unit 4 python
No ratings yet
Unit 4 python
12 pages
Mat Plot Lib
No ratings yet
Mat Plot Lib
22 pages
Python For Beginners
From Everand
Python For Beginners
Célio Azevedo
No ratings yet
Tes Bahasa Inggris
No ratings yet
Tes Bahasa Inggris
17 pages
Math-8 Q4 M3
No ratings yet
Math-8 Q4 M3
4 pages
CHEM 3412: Final Exam: Potentially Useful Facts
No ratings yet
CHEM 3412: Final Exam: Potentially Useful Facts
5 pages
Feap Element Library
No ratings yet
Feap Element Library
6 pages
Safety Requirements For Scaffolds
75% (8)
Safety Requirements For Scaffolds
17 pages
Ferrotherm 4742
No ratings yet
Ferrotherm 4742
2 pages
3 4 4 Awebsoilsurvey
No ratings yet
3 4 4 Awebsoilsurvey
4 pages
007 0032 (2017) PDF
No ratings yet
007 0032 (2017) PDF
3 pages
Emerging-Trends-In-Infrastructure-2022 KPMG
No ratings yet
Emerging-Trends-In-Infrastructure-2022 KPMG
20 pages
AP Interview Questions
No ratings yet
AP Interview Questions
3 pages
Statement 2
No ratings yet
Statement 2
6 pages
On Line Chloride and Sulfate Monitoring
No ratings yet
On Line Chloride and Sulfate Monitoring
8 pages
Sales Territory Design of FMCG
No ratings yet
Sales Territory Design of FMCG
3 pages
Presentation BIM PT Brantas Abipraya - Rev
No ratings yet
Presentation BIM PT Brantas Abipraya - Rev
29 pages
Kami Export - 18th Century Colonies
No ratings yet
Kami Export - 18th Century Colonies
10 pages
Comparison of Construction Classification Systems Used For Classifying Building Product Models
No ratings yet
Comparison of Construction Classification Systems Used For Classifying Building Product Models
8 pages
Bathymetry Survey at Kamarajar Port Limited, Ennore (Phase-2)
No ratings yet
Bathymetry Survey at Kamarajar Port Limited, Ennore (Phase-2)
1 page
(Jon Hild) Ans1 The Land of The Insect Men (23 06 2019)
100% (1)
(Jon Hild) Ans1 The Land of The Insect Men (23 06 2019)
31 pages
Assignment 02 AE675
No ratings yet
Assignment 02 AE675
3 pages
Chemistry The Central Science 12th Edition instant download
100% (1)
Chemistry The Central Science 12th Edition instant download
31 pages
White - 1990 - Environmental History, Ecology, and Meaning
No ratings yet
White - 1990 - Environmental History, Ecology, and Meaning
7 pages
Ford Ranger Spare Parts
No ratings yet
Ford Ranger Spare Parts
2 pages
Morris Library First Floor: Periodicals
No ratings yet
Morris Library First Floor: Periodicals
4 pages
Type A Quarters - GAD
No ratings yet
Type A Quarters - GAD
3 pages
Quality Assurence Syallbus
No ratings yet
Quality Assurence Syallbus
23 pages
Samsung GY15VS GY17VS Service Manual
No ratings yet
Samsung GY15VS GY17VS Service Manual
52 pages
Final Roselyn TTL Report
50% (2)
Final Roselyn TTL Report
16 pages
One_Piece_-_I_Am_A
No ratings yet
One_Piece_-_I_Am_A
2,347 pages
Jimin - Lesson68P (J)
No ratings yet
Jimin - Lesson68P (J)
9 pages