Python Univ V
Syllabus
Data Visualization: Importing Matplotlib – Line plots – Scatter plots – Visualizing Errors –
Density and Contour plots – Histograms – Legends – Colors – Subplots – Text and Annotation.
Importing Matplotlib
• Matplotlib is a cross-platform, data visualization and graphical plotting library for Python and
its numerical extension NumPy.
• Matplotlib is a comprehensive library for creating static, animated, and interactive
visualizations in Python.
• Matplotlib is a plotting library for the Python programming language. It lets you make
quality charts in a few lines of code. Several other Python plotting libraries, such as seaborn,
are built on top of Matplotlib.
• The library is mainly used for 2D output, which already provides the means to express data
patterns graphically; 3D plots are available through its mplot3d toolkit.
Visualizing Information: Starting with Graph
• Data visualization is the presentation of quantitative information in a graphical form. In other
words, data visualizations turn large and small datasets into visuals that are easier for the human
brain to understand and process.
• Good data visualizations are created when communication, data science, and design collide.
Data visualizations done right offer key insights into complicated datasets in ways that are
meaningful and intuitive.
• A graph is simply a visual representation of numeric data. Matplotlib supports a large number
of graph and chart types.
• Matplotlib is a popular Python package used to build plots. Matplotlib can also be used to
make 3D plots and animations.
• Line plots can be created in Python with Matplotlib's pyplot library. To build a line plot, first
import Matplotlib. It is a standard convention to import Matplotlib's pyplot library as plt.
• To define a plot, you need some values, the matplotlib.pyplot module, and an idea of what you
want to display.
import matplotlib.pyplot as plt
plt.plot([1,2,3],[5,7,4])
plt.show()
• plt.plot() draws the plot in the background. Nothing appears on screen until we ask for it,
which we do once everything we intend to graph has been plotted.
• plt.show() brings the graph to the screen. If no window appears, it may have opened behind
other windows, or an error may have occurred. The graph should show a line through the
points (1, 5), (2, 7) and (3, 4).
• The window that opens is a Matplotlib window, which lets us view the graph, interact with it,
and navigate it.
Line Plot
• More than one line can be drawn in the same plot. To add another line, just call plt.plot(x, y)
again. In the example below we have two different sets of y values (y1, y2) that are plotted onto
the same chart.
import matplotlib.pyplot as plt
import numpy as np
x = np.linspace(-1, 1, 50)
y1 = 2*x+ 1
y2 = 2**x + 1
plt.figure(num = 3, figsize=(8, 5))
plt.plot(x, y2)
plt.plot(x, y1,linewidth=1.0,linestyle='--')
plt.show()
• The output of the above code shows the curve y2 = 2**x + 1 as a solid line and the straight line
y1 = 2*x + 1 as a thin dashed line.
Example 5.1.1: Write a simple python program that draws a line graph where x = [1,2,3,4]
and y = [1,4,9,16] and gives both axis label as "X-axis" and "Y-axis".
Solution:
import matplotlib.pyplot as plt
import numpy as np
# Define data values
x = np.array([1, 2, 3, 4]) # X-axis points
y = x**2 # Y-axis points: 1, 4, 9, 16
print("Values of X:")
print(x)
print("Values of Y:")
print(y)
plt.plot(x, y)
# Set the x axis label of the current axis.
plt.xlabel('X-axis')
# Set the y axis label of the current axis.
plt.ylabel('Y-axis')
# Set a title
plt.title('Draw a line.')
# Display the figure.
plt.show()
Saving Work to Disk
• Matplotlib plots can be saved as image files using the plt.savefig() function.
• plt.savefig() requires a filename as its first argument. This filename can be a full path. It can
also include a particular file extension if desired. If no extension is provided, the configuration
value of savefig.format is used instead.
• plt.savefig() also has a number of useful optional arguments:
1. dpi sets the resolution of the file, in dots per inch, to a numeric value.
2. transparent can be set to True, which causes the background of the chart to be transparent.
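As a minimal sketch of the options above, the following saves a small line plot as a PNG file. The file name line_plot.png and the temporary output directory are placeholders chosen for illustration, and the Agg backend is selected only so that the script runs without a display.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # assumption: headless run; no plot window is needed to save a file
import matplotlib.pyplot as plt

plt.plot([1, 2, 3], [5, 7, 4])

# Hypothetical output location; any writable path works.
out_dir = tempfile.mkdtemp()
png_path = os.path.join(out_dir, "line_plot.png")

# dpi sets the resolution; transparent=True makes the background transparent.
plt.savefig(png_path, dpi=150, transparent=True)
plt.close()
```

The extension in the file name (.png here) selects the output format, so changing it to .pdf or .svg should be enough to write a vector file instead.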
Setting the Axis, Ticks, Grids
• The axes define the x and y plane of the graphic. The x axis runs horizontally, and the y axis
runs vertically.
• An Axes object is added to a figure. It can be thought of as a set of x and y axes on which lines
and bars are drawn. An Axes object has child attributes such as axis labels, tick labels, and line
thickness.
• The following code shows how to obtain access to the axes for a plot:
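import matplotlib.pyplot as plt
import numpy as np
# Sample data to draw
x = np.linspace(0, 5, 20)
y = x**2
fig = plt.figure()
axes = fig.add_axes([0.1, 0.1, 0.8, 0.8]) # left, bottom, width, height (range 0 to 1)
axes.plot(x, y, 'r') # 'r' draws a red line
axes.set_xlabel('x')
axes.set_ylabel('y')
axes.set_title('title')
plt.show()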
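• A grid can be added to a Matplotlib plot using the plt.grid() command. By default, the grid is
turned off. To turn on the grid use:
plt.grid(True)
• The first argument switches the grid on or off: plt.grid(True) or plt.grid(False). Note that True
and False are capitalized and are not enclosed in quotes. Optional keyword arguments such as
axis ('x', 'y' or 'both') and which ('major', 'minor' or 'both') select the grid lines that are affected.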
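The section heading also mentions ticks, so here is a small sketch of setting axis limits, tick positions and a grid on an Axes object. The data and the particular tick values are illustrative assumptions, not taken from the notes.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; remove this line and call plt.show() to see a window
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 5, 20)
y = x ** 2

fig, axes = plt.subplots()
axes.plot(x, y, 'r')

# Fix the visible range of each axis.
axes.set_xlim(0, 5)
axes.set_ylim(0, 25)

# Choose where the tick marks are placed.
axes.set_xticks([0, 1, 2, 3, 4, 5])
axes.set_yticks([0, 5, 10, 15, 20, 25])

# Draw dotted grid lines for the y axis only.
axes.grid(True, axis='y', linestyle=':')
```

Here plt.subplots() is used instead of fig.add_axes() simply because it creates the figure and a single Axes in one call.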
Defining the Line Appearance and Working with Line Style
• Line styles help differentiate graphs by drawing the lines in various ways. Matplotlib supports
the line styles described below.
• plt.plot() accepts an additional parameter, a format string, to control the colour and style of
the plot.
plt.plot(xa, ya, 'g')
• This will make the line green. You can choose red, green, blue, cyan, magenta, yellow, white
or black by using the first character of the colour name in lower case (use "k" for black, as "b"
means blue).
• You can also alter the line style; for example, two dashes -- make a dashed line. The line style
can be appended to the colour character, like this:
plt.plot(xa, ya, 'r--')
• You can use "-" for a solid line (the default), "-." for dash-dot lines, or ":" for a dotted line.
Here is an example :
from matplotlib import pyplot as plt
import numpy as np
xa = np.linspace(0, 5, 20)
ya = xa**2
plt.plot(xa, ya, 'g')
ya = 3*xa
plt.plot(xa, ya, 'r--')
plt.show()
Output: a solid green curve (ya = xa**2) and a dashed red straight line (ya = 3*xa).
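The same appearance settings can also be written as keyword arguments instead of a format string, which is sometimes easier to read. The sketch below is one way to do this; the third line (xa + 5) and the chosen line width are made up for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; remove this line and call plt.show() to see a window
import matplotlib.pyplot as plt
import numpy as np

xa = np.linspace(0, 5, 20)

fig, ax = plt.subplots()
# Solid green line, equivalent to the format string 'g'.
(solid,) = ax.plot(xa, xa ** 2, color='green', linestyle='-')
# Dashed red line, equivalent to 'r--', drawn thicker than the default.
(dashed,) = ax.plot(xa, 3 * xa, color='red', linestyle='--', linewidth=2.0)
# Dotted black line with a circle marker on each data point.
(dotted,) = ax.plot(xa, xa + 5, color='black', linestyle=':', marker='o')
```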
Scatter Plots
• When two variables x and y have an association (or relationship), we say there exists
a correlation between them. Alternatively, we could say x and y are correlated. To find such an
association, we usually look at a scatterplot and try to find a pattern.
• Scatterplot (or scatter diagram) is a graph in which the paired (x, y) sample data are plotted
with a horizontal x axis and a vertical y axis. Each individual (x, y) pair is plotted as a single
point.
• One variable is called the independent variable (X) and the other is called the dependent
variable (Y).
• Matplotlib also supports scatter plots. The scatter() function displays data values as a
collection of (x, y) coordinates represented by standalone dots.
import matplotlib.pyplot as plt
# X axis values:
x = [2,3,7,29,8,5,13,11,22,33]
# Y axis values:
y = [4,7,55,43,2,4,11,22,33,44]
# Create scatter plot:
plt.scatter(x, y)
plt.show()
The scatter() Function in Matplotlib
matplotlib.pyplot.scatter()
Scatter plots are used to see how different variables are related to each other. The dots on the
plot show how the variables are related. A scatter plot is made with the scatter() function of the
Matplotlib library.
Syntax
The most commonly used parameters of scatter() are:
matplotlib.pyplot.scatter(x_axis_value, y_axis_value, s=None, c=None, marker=None,
cmap=None, vmin=None, vmax=None, alpha=None, linewidths=None, edgecolors=None)
The parameters have the following meanings:
x_axis_value - an array containing the x-axis data.
y_axis_value - an array containing the y-axis data.
s - the marker size in points squared (a scalar, or an array with the same length as the data).
c - the marker colors (a single color, a sequence of colors, or a sequence of numbers that are
mapped to colors through cmap).
marker - the marker style.
cmap - the colormap used when c is a sequence of numbers.
vmin, vmax - the data values mapped to the two ends of the colormap when c is a sequence of
numbers.
alpha - the blending value, between 0 (transparent) and 1 (opaque).
linewidths - the width of the marker border.
edgecolors - the color of the marker border.
Example: the following example shows a scatter plot using Python. Here, we use a
different color for each of the two plotted data sets. We can call scatter() with different colors,
sizes, edge colors, and border line widths.
import matplotlib.pyplot as plt
import numpy as np
x_axis_value = np.array([4, 9, 6, 7, 12, 13, 21])
y_axis_value = np.array([90, 80, 89, 88, 101, 82, 102])
plt.scatter(x_axis_value, y_axis_value, color = 'orange', s = 150, edgecolor = 'blue',
            linewidth = 2, marker = 'x')
x_axis_value = np.array([5, 12, 8, 11, 16, 8])
y_axis_value = np.array([101, 106, 85, 105, 90, 98])
plt.scatter(x_axis_value, y_axis_value, color = 'red', s = 300, edgecolor = 'green', linewidth = 1)
plt.show()
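The parameters c, cmap, vmin, vmax and alpha are not used in the example above. As a rough sketch, they can be combined so that each marker gets its own size and color. The rule for the sizes, the choice of the 'viridis' colormap and the range 80 to 105 are assumptions made for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; remove this line and call plt.show() to see a window
import matplotlib.pyplot as plt
import numpy as np

x_axis_value = np.array([4, 9, 6, 7, 12, 13, 21])
y_axis_value = np.array([90, 80, 89, 88, 101, 82, 102])

sizes = 10 * x_axis_value   # illustrative: marker area grows with x
colors = y_axis_value       # illustrative: marker color follows y

fig, ax = plt.subplots()
points = ax.scatter(x_axis_value, y_axis_value,
                    s=sizes, c=colors, cmap='viridis',
                    vmin=80, vmax=105, alpha=0.7,
                    edgecolors='black', linewidths=1)

# A colorbar shows which color corresponds to which y value.
fig.colorbar(points, ax=ax, label='y value')
```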
• Comparing plt.scatter() and plt.plot(): We can also produce a scatter plot using another
function within matplotlib.pyplot. Matplotlib's plt.plot() is a general-purpose plotting function
that allows the user to create various line or marker plots.
• We can achieve the same kind of scatter plot with the following call to plt.plot(), using the
same x and y data:
plt.plot(x, y, "o")
plt.show()
• In this case, we had to include the marker "o" as a third argument, as otherwise plt.plot()
would plot a line graph. For a single data set with uniform markers, the result looks the same as
the plot created with plt.scatter(x, y).
• Here's a rule of thumb that we can use:
a) If we need a basic scatter plot, use plt.plot(), especially if we want to prioritize performance.
b) If we want to customize our scatter plot by using more advanced plotting features, use
plt.scatter().
• Example: We can create a simple scatter plot in Python by passing x and y values to
plt.scatter():
# scatter_plotting.py
import matplotlib.pyplot as plt
plt.style.use('fivethirtyeight')
x = [2, 4, 6, 6, 9, 2, 7, 2, 6, 1, 8, 4, 5, 9, 1, 2, 3, 7, 5, 8, 1, 3]
y = [7, 8, 2, 4, 6, 4, 9, 5, 9, 3, 6, 7, 2, 4, 6, 7, 1, 9, 4, 3, 6, 9]
plt.scatter(x, y)
plt.show()
Output: a scatter plot of the 22 (x, y) points, drawn in the fivethirtyeight style.
Creating Advanced Scatterplots
• Scatterplots are especially important for data science because they can show data patterns that
aren't obvious when viewed in other ways.
import matplotlib.pyplot as plt
x_axis1 =[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
y_axis1 =[5, 16, 34, 56, 32, 56, 32, 12, 76, 89]
x_axis2 = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
y_axis2 = [53, 6, 46, 36, 15, 64, 73, 25, 82, 9]
plt.title("Prices over 10 years")
plt.scatter(x_axis1, y_axis1, color = 'darkblue', marker='x', label="item 1")
plt.scatter(x_axis2, y_axis2, color='darkred', marker='x', label="item 2")
plt.xlabel("Time (years)")
plt.ylabel("Price (dollars)")
plt.grid(True)
plt.legend()
plt.show()
• The chart displays two data sets. We distinguish between them by the colour of the marker.
What is a Histogram?
A histogram is a visual depiction of a frequency distribution table in which continuous data have
been grouped into classes. It consists of a series of rectangles whose bases equal the class widths
(the distances between class boundaries) and whose areas are proportional to the frequencies of
the corresponding classes.
When to Use a Histogram?
A histogram is appropriate in the following circumstances:
o The data are quantitative.
o We want to examine the shape of the data distribution.
o We want to determine whether a process changes from one time period to the next.
o We want to assess whether the outcomes of two or more processes differ.
o We want to determine whether a process satisfies the customer's requirements.
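A histogram can be drawn with Matplotlib's hist() function. The sketch below groups 500 randomly generated values into 10 classes; the data, the seed, the number of bins and the colors are all made up for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; remove this line and call plt.show() to see a window
import matplotlib.pyplot as plt
import numpy as np

# Made-up sample: 500 values scattered around 50.
rng = np.random.default_rng(seed=0)
data = rng.normal(loc=50, scale=10, size=500)

fig, ax = plt.subplots()
# hist() returns the frequency of each class, the class boundaries, and the drawn rectangles.
counts, bin_edges, patches = ax.hist(data, bins=10, color='skyblue', edgecolor='black')

ax.set_xlabel('Value')
ax.set_ylabel('Frequency')
ax.set_title('Histogram of 500 sample values')
```

With bins=10 the classes all have the same width, so the heights of the rectangles are proportional to the frequencies as well as their areas.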
Legend:
A legend is an area describing the elements of the graph. Matplotlib provides a function called
legend(), which places a legend on the axes. Plot legends give meaning to a visualization by
assigning labels to the various plot elements. On maps, legends describe the pictorial language or
symbology of the map. In line graphs, legends explain the function or the values underlying the
different lines of the graph.
import numpy as np
import matplotlib.pyplot as plt
# X-axis values
x = [1, 2, 3, 4, 5]
# Y-axis values
y = [1, 4, 9, 16, 25]
# Function to plot
plt.plot(x, y)
# Function add a legend
plt.legend(['single element'])
# function to show the plot
plt.show()
Change the Position of the Legend
In this example, two data series, represented by `y1` and `y2`, are plotted. Each series is
drawn in its own color, and the legend provides the color-based labels “blue” and “green”
for clarity. The loc argument places the legend in the lower right corner of the plot.
# importing modules
import matplotlib.pyplot as plt
# Y-axis values of the first series
y1 = [2, 3, 4.5]
# Y-axis values of the second series
y2 = [1, 1.5, 5]
# Plot each series in the color named in the legend
plt.plot(y1, color="blue")
plt.plot(y2, color="green")
# Add a legend in the lower right corner
plt.legend(["blue", "green"], loc="lower right")
# Show the plot
plt.show()
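An alternative to passing a list of names to legend() is to attach a label to each line when it is plotted, so that labels and lines cannot get out of order. This is a small sketch of that approach; the x values, the label texts and the legend title are illustrative assumptions.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; remove this line and call plt.show() to see a window
import matplotlib.pyplot as plt

x = [1, 2, 3]
y1 = [2, 3, 4.5]
y2 = [1, 1.5, 5]

fig, ax = plt.subplots()
# The label given here is the text shown in the legend.
ax.plot(x, y1, color='blue', label='y1')
ax.plot(x, y2, color='green', label='y2')

# loc chooses the corner; title adds a heading above the entries.
legend = ax.legend(loc='upper left', title='Series')
```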
Visualizing Errors
• Error bars can be added to Matplotlib line plots and graphs. The error is the difference
between the calculated value and the actual value.
• Without error bars, graphs give the impression that a measured or calculated value is known
to a high level of precision. The function matplotlib.pyplot.errorbar() plots y versus x as lines
and/or markers with attached error bars.
• Adding an error bar in Matplotlib is simple: we just pass the value of the error. We use the
command:
plt.errorbar(x, y, yerr = 2, capsize=3)
Where:
x = The data of the X axis.
y = The data of the Y axis.
yerr = The error value on the Y axis. It can be a single value applied to every point, or a
sequence giving each point its own error value.
xerr = The error value on the X axis, given in the same way.
capsize = The length of the caps at the lower and upper ends of the error bar.
• A simple example, where we plot only one point. The error is 10% of the y value.
import matplotlib.pyplot as plt
x = 1
y = 20
y_error = 20*0.10 # 10% error
# fmt='o' draws the point itself as a marker
plt.errorbar(x, y, yerr = y_error, fmt='o', capsize=3)
plt.show()
Output: a single point at (1, 20) with a vertical error bar running from 18 to 22.
• We plot using the command plt.errorbar(...), passing the desired settings as arguments.
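The parameter list above notes that each point can have its own error value. The following sketch illustrates this with a per-point yerr (10% of each y value) and a single xerr shared by all points. The measurements are invented for the example.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; remove this line and call plt.show() to see a window
import matplotlib.pyplot as plt
import numpy as np

x = np.array([1, 2, 3, 4, 5])
y = np.array([20.0, 35.0, 30.0, 45.0, 50.0])  # made-up measurements

y_error = 0.10 * y   # one error value per point: 10% of each y value
x_error = 0.2        # a single value, applied to every point

fig, ax = plt.subplots()
# fmt='o' draws markers only, so the points are not joined by a line.
container = ax.errorbar(x, y, yerr=y_error, xerr=x_error, fmt='o', capsize=3)
ax.set_xlabel('x')
ax.set_ylabel('y')
```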
• Multiple lines with error bars in Matplotlib: It is often necessary to draw several lines in the
same plot. The following example draws several sets of error bars in the same graph.
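import numpy as np
import matplotlib.pyplot as plt
fig = plt.figure()
x = np.arange(20)
y = 4 * np.sin(x / 20 * np.pi)
yerr = np.linspace(0.06, 0.3, 20)
# Plain symmetric error bars
plt.errorbar(x, y + 8, yerr = yerr,
label = 'both limits (default)')
# Upper limits only
plt.errorbar(x, y + 6, yerr = yerr,
uplims = True,
label = 'uplims=True')
# Upper and lower limits together
plt.errorbar(x, y + 4, yerr = yerr,
uplims = True,
lolims = True,
label = 'uplims=True, lolims=True')
# One flag per point (20 points), alternating between upper and lower limits
upperlimits = [True, False] * 10
lowerlimits = [False, True] * 10
plt.errorbar(x, y, yerr = yerr,
uplims = upperlimits,
lolims = lowerlimits,
label = 'subsets of uplims and lolims')
# The labels given above are the entries shown in the legend
plt.legend(loc='upper left')
plt.title('Example')
plt.show()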
Output: four curves with error bars, offset vertically by 8, 6, 4 and 0.