Python Univ V

Uploaded by

pg.7730857428
Copyright
© © All Rights Reserved

UNIT V: Data Visualization

Syllabus
Data Visualization: Importing Matplotlib – Line plots – Scatter plots – Visualizing Errors –
Density and Contour plots – Histograms – Legends – Colors – Subplots – Text and Annotation.
Importing Matplotlib
• Matplotlib is a cross-platform, data visualization and graphical plotting library for Python and
its numerical extension NumPy.
• Matplotlib is a comprehensive library for creating static, animated, and interactive
visualizations in Python.
• Matplotlib is a plotting library for the Python programming language. It lets you make
quality charts in a few lines of code. Many other Python plotting libraries are built on top of
Matplotlib.
• The library is primarily designed for 2D output, with basic 3D plotting available through its
mplot3d toolkit, and it provides the means to express data patterns graphically.
Visualizing Information: Starting with Graph
• Data visualization is the presentation of quantitative information in a graphical form. In other
words, data visualizations turn large and small datasets into visuals that are easier for the human
brain to understand and process.
• Good data visualizations are created when communication, data science, and design collide.
Data visualizations done right offer key insights into complicated datasets in ways that are
meaningful and intuitive.
• A graph is simply a visual representation of numeric data. Matplotlib supports a large number
of graph and chart types.
• Matplotlib is a popular Python package used to build plots. Matplotlib can also be used to
make 3D plots and animations.
• Line plots can be created in Python with Matplotlib's pyplot library. To build a line plot, first
import Matplotlib. It is a standard convention to import Matplotlib's pyplot library as plt.
• To define a plot, you need some values, the matplotlib.pyplot module, and an idea of what you
want to display.
import matplotlib.pyplot as plt
plt.plot([1,2,3],[5,7,4])
plt.show()
• plt.plot() draws the plot in the background; it is brought to the screen only when we are
ready, after graphing everything we intend to.
• plt.show(): With that, the graph should pop up. If it does not, the window may have opened
behind other windows, or an error may have occurred. Your graph should look like this:

• This window is a Matplotlib window, which allows us to see our graph, as well as interact with
it and navigate it.
Line Plot
• More than one line can be in the plot. To add another line, just call the plt.plot(x, y)
function again. In the example below we have two different sets of y values (y1, y2) that are
plotted onto the chart.
import matplotlib.pyplot as plt
import numpy as np
x = np.linspace(-1, 1, 50)
y1 = 2*x+ 1
y2 = 2**x + 1
plt.figure(num = 3, figsize=(8, 5))
plt.plot(x, y2)
plt.plot(x, y1,linewidth=1.0,linestyle='--')
plt.show()
• Output of the above code will look like this:

Example 5.1.1: Write a simple python program that draws a line graph where x = [1,2,3,4]
and y = [1,4,9,16] and gives both axis label as "X-axis" and "Y-axis".
Solution:
import matplotlib.pyplot as plt
import numpy as np
# define data values
x = np.array([1, 2, 3, 4]) # X-axis points
y = x**2 # Y-axis points: [1, 4, 9, 16]
print("Values of X:")
print(x)
print("Values of Y:")
print(y)
plt.plot(x, y)
# Set the x axis label of the current axis.
plt.xlabel('X-axis')
# Set the y axis label of the current axis.
plt.ylabel('Y-axis')
# Set a title
plt.title('Draw a line.')
# Display the figure.
plt.show()
Saving Work to Disk
• Matplotlib plots can be saved as image files using the plt.savefig() function.
• The .savefig() method requires a filename be specified as the first argument. This filename can
be a full path. It can also include a particular file extension if desired. If no extension is
provided, the configuration value of savefig.format is used instead.
• The .savefig() also has a number of useful optional arguments :
1. dpi can be used to set the resolution of the file to a numeric value.
2. transparent can be set to True, which causes the background of the chart to be transparent.
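As a minimal sketch of the options above, the following saves one figure twice. The file names and the temporary output directory are assumptions chosen for illustration, and the Agg backend is selected only so that the script runs without a display.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # non-interactive backend: no window is needed to save a file
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
ax.plot([1, 2, 3, 4], [1, 4, 9, 16])

# Hypothetical output location: a temporary directory, so nothing is overwritten.
out_dir = tempfile.mkdtemp()
png_path = os.path.join(out_dir, "line_plot.png")
pdf_path = os.path.join(out_dir, "line_plot.pdf")

# dpi sets the resolution; transparent=True makes the chart background transparent.
fig.savefig(png_path, dpi=150, transparent=True)

# The file extension selects the output format, here PDF.
fig.savefig(pdf_path)

plt.close(fig)
print("Saved:", png_path, "and", pdf_path)
```

plt.savefig() takes the same arguments and saves the current figure, so it can be used in place of fig.savefig() in pyplot-style scripts.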
Setting the Axis, Ticks, Grids
• The axes define the x and y plane of the graphic. The x axis runs horizontally, and the y axis
runs vertically.
• An Axes object is added to a figure. An Axes can be thought of as a set of x and y axes that
lines and bars are drawn on. An Axes contains child attributes like axis labels, tick labels, and
line thickness.
• The following code shows how to obtain access to the axes for a plot :
import matplotlib.pyplot as plt
import numpy as np
x = np.linspace(0, 5, 10) # sample data for illustration
y = x**2
fig = plt.figure()
axes = fig.add_axes([0.1, 0.1, 0.8, 0.8]) # left, bottom, width, height (range 0 to 1)
axes.plot(x, y, 'r')
axes.set_xlabel('x')
axes.set_ylabel('y')
axes.set_title('title')
plt.show()

• A grid can be added to a Matplotlib plot using the plt.grid() command. By default, the grid is
turned off. To turn on the grid use:
plt.grid(True)
• plt.grid(True) turns the grid on and plt.grid(False) turns it off. Note that True and False are
capitalized and are not enclosed in quotes. Optional keyword arguments such as which and axis
select which grid lines are affected.
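The heading also mentions ticks, so here is a small sketch of setting axis limits, tick positions, tick labels, and a grid on one Axes. The sine data and the label strings are assumptions made for illustration only.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the script runs without a display
import matplotlib.pyplot as plt
import numpy as np

# Illustrative data: one sine curve.
x = np.linspace(0, 10, 50)
y = np.sin(x)

fig, ax = plt.subplots()
ax.plot(x, y)

# Axis limits.
ax.set_xlim(0, 10)
ax.set_ylim(-1.5, 1.5)

# Tick positions, with custom text for the y ticks.
ax.set_xticks([0, 2, 4, 6, 8, 10])
ax.set_yticks([-1, 0, 1])
ax.set_yticklabels(["low", "zero", "high"])

# Dotted grid lines for the y axis only.
ax.grid(True, axis="y", linestyle=":")
```

In an interactive session, calling plt.show() afterwards displays the figure.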
Defining the Line Appearance and Working with Line Style
• Line styles help differentiate graphs by drawing the lines in various ways. Matplotlib supports
several line styles.
• Matplotlib has an additional parameter to control the colour and style of the plot.
plt.plot(xa, ya, 'g')

• This will make the line green. You can use any colour of red, green, blue, cyan, magenta,
yellow, white or black just by using the first character of the colour name in lower case (use "k"
for black, as "b" means blue).
• You can also alter the line style; for example, two dashes -- make a dashed line. This can be
added to the colour selector, like this:
plt.plot(xa, ya, 'r--')
• You can use "-" for a solid line (the default), "-." for dash-dot lines, or ":" for a dotted line.
Here is an example :
from matplotlib import pyplot as plt
import numpy as np
xa = np.linspace(0, 5, 20)
ya = xa**2
plt.plot(xa, ya, 'g')
ya = 3*xa
plt.plot(xa, ya, 'r--')
plt.show()
OUTPUT:

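Matplotlib's single-character colour codes are as follows: b (blue), g (green), r (red), c (cyan),
m (magenta), y (yellow), k (black), w (white).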


Setting the line color
By default, the plt.plot() function plots a blue line. However, you can change the line color by
passing a color parameter to the function.
Setting the line width
You can also change the line width by passing a linewidth parameter to the plt.plot() function.
plt.plot(dates, closing_price, linewidth=3)
Setting the line style
You can change the line style by passing a linestyle parameter to the plt.plot() function.
The linestyle parameter takes a string that represents the line style.
Adding markers to line plots
Markers can be used to highlight specific points in the line plot:
plt.plot(df['Date'], df['Close'], marker='x')
Adding labels and title
To make the plot more informative, we can add axis labels and a title. We can achieve this by
using the plt.xlabel(), plt.ylabel(), and plt.title() functions, respectively.

plt.plot(dates, closing_price, color='red', linewidth=2)
plt.xlabel('Date')
plt.ylabel('Closing Price')
plt.title('DJIA Stock Price')
Adding grid lines
We can also add grid lines to our plot to make it more readable. We can achieve this by using
the plt.grid() function.
plt.grid(True)
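The fragments in this section use dates and closing_price without defining them. The sketch below fills them with made-up values (they are not real DJIA data) so that colour, line width, line style, marker, labels, title, and grid can be tried together. The Agg backend is chosen only so the script runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the script runs without a display
import matplotlib.pyplot as plt

# Hypothetical values standing in for the undefined dates and closing_price.
dates = ["Mon", "Tue", "Wed", "Thu", "Fri"]
closing_price = [101.2, 102.8, 101.9, 104.1, 105.3]

plt.figure()
# plt.plot() returns a list of Line2D objects; unpack the single line.
(line,) = plt.plot(
    dates,
    closing_price,
    color="red",      # line colour
    linewidth=2,      # line width in points
    linestyle="--",   # dashed line
    marker="x",       # mark each data point
)

plt.xlabel("Date")
plt.ylabel("Closing Price")
plt.title("Sample Closing Prices")
plt.grid(True)
```

Calling plt.show() at the end displays the chart in an interactive session.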

Scatter Plots
• When two variables x and y have an association (or relationship), we say there exists
a correlation between them. Alternatively, we could say x and y are correlated. To find such an
association, we usually look at a scatterplot and try to find a pattern.
• Scatterplot (or scatter diagram) is a graph in which the paired (x, y) sample data are plotted
with a horizontal x axis and a vertical y axis. Each individual (x, y) pair is plotted as a single
point.
• One variable is called independent (X) and the second is called dependent (Y).

• Matplotlib also supports more advanced plots, such as scatter plots. In this case, the scatter
function is used to display data values as a collection of x, y coordinates represented by
standalone dots.
import matplotlib.pyplot as plt
# X axis values:
x = [2,3,7,29,8,5,13,11,22,33]
# Y axis values:
y = [4,7,55,43,2,4,11,22,33,44]
# Create scatter plot:
plt.scatter(x, y)
plt.show()
The scatter() plot in Matplotlib
matplotlib.pyplot.scatter()
Scatter plots are used to see how different variables are related to each other. The dots on the
plot show how the variables are related. A scatter plot is made with the matplotlib
library's scatter() method.
Syntax
Here is the signature of the scatter() method:
matplotlib.pyplot.scatter (x_axis_value, y_axis_value, s = None, c = None, vmin = None, vmax
= None, marker = None, cmap = None, alpha = None, linewidths = None, edgecolors = None)
The following are the syntax parameters for the scatter() method:
x_axis_value - an array containing the x-axis data.
y_axis_value - an array containing the y-axis data.
s - the size of the marker (a scalar, or an array of the same size as the x-axis or y-axis data).
c - the marker colors (a single color, or a sequence of colors or numbers, one per point).
marker - the marker style.
cmap - the name of the colormap, used when c is an array of numbers.
linewidths - the width of the marker border.
edgecolors - the border color of a marker.
alpha - the blending value, between 0 (transparent) and 1 (opaque).
Example: the following example shows a scatter plot using Python. Here, we use a
different color for each data set. We can use scatter with different colors, sizes, edges,
and line widths of the border.
import matplotlib.pyplot as plt
import numpy as np
x_axis_value = np.array([4, 9, 6, 7, 12, 13, 21])
y_axis_value = np.array([90, 80, 89, 88, 101, 82, 102])
plt.scatter(x_axis_value, y_axis_value, color = 'orange', s = 150, edgecolor = 'blue', linewidth = 2, marker = 'x')
x_axis_value = np.array([5, 12, 8, 11, 16, 8])
y_axis_value = np.array([101, 106, 85, 105, 90, 98])
plt.scatter(x_axis_value, y_axis_value, color = 'red', s = 300, edgecolor = 'green', linewidth = 1)
plt.show()
• Comparing plt.scatter() and plt.plot(): We can also produce a scatter plot
using another function within matplotlib.pyplot. Matplotlib's plt.plot() is a general-purpose
plotting function that allows the user to create various line or marker plots.
We can achieve the same kind of scatter plot as the one obtained in the section above with the
following call to plt.plot(), using the same data:
plt.plot(x, y, "o")
plt.show()
• In this case, we had to include the marker "o" as a third argument, as otherwise plt.plot()
would plot a line graph. The plot created with this code looks the same as one created
with plt.scatter() using default settings.
• Here's a rule of thumb that we can use :
a) If we need a basic scatter plot, use plt.plot(), especially if we want to prioritize performance.
b) If we want to customize our scatter plot by using more advanced plotting features, use
plt.scatter().
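As an illustration of part b), the sketch below uses plt.scatter() features that a plain marker plot does not offer: a separate size and colour for each point, with the colours taken from a colormap. The random data, the seed, and the way sizes and colour values are derived are assumptions made for the example.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the script runs without a display
import matplotlib.pyplot as plt
import numpy as np

# Illustrative random data (seeded so the result is repeatable).
rng = np.random.default_rng(0)
x = rng.uniform(0, 10, 40)
y = rng.uniform(0, 10, 40)

values = x + y        # one number per point, mapped to a colour through cmap
sizes = 20 + 15 * x   # one marker size per point

fig, ax = plt.subplots()
points = ax.scatter(
    x, y,
    c=values,            # per-point colour values
    s=sizes,             # per-point marker sizes
    cmap="viridis",      # colormap used to turn values into colours
    alpha=0.7,           # partly transparent markers
    edgecolors="black",  # marker border colour
    linewidths=0.5,      # marker border width
)

# A colorbar shows how colours correspond to the values.
fig.colorbar(points, ax=ax, label="x + y")
```

With plt.plot(x, y, "o") every marker shares one size and one colour, which is why the simpler call is enough for a basic scatter plot.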
• Example: We can create a simple scatter plot in Python by passing x and y values to
plt.scatter():
# scatter_plotting.py
import matplotlib.pyplot as plt
plt.style.use('fivethirtyeight')
x = [2, 4, 6, 6, 9, 2, 7, 2, 6, 1, 8, 4, 5, 9, 1, 2, 3, 7, 5, 8, 1, 3]
y = [7, 8, 2, 4, 6, 4, 9, 5, 9, 3, 6, 7, 2, 4, 6, 7, 1, 9, 4, 3, 6, 9]
plt.scatter(x, y)
plt.show()
Output:
Creating Advanced Scatterplots
• Scatterplots are especially important for data science because they can show data patterns that
aren't obvious when viewed in other ways.
import matplotlib.pyplot as plt
x_axis1 =[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
y_axis1 =[5, 16, 34, 56, 32, 56, 32, 12, 76, 89]
x_axis2 = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
y_axis2 = [53, 6, 46, 36, 15, 64, 73, 25, 82, 9]
plt.title("Prices over 10 years")
plt.scatter(x_axis1, y_axis1, color = 'darkblue', marker='x', label="item 1")
plt.scatter(x_axis2, y_axis2, color='darkred', marker='x', label="item 2")
plt.xlabel("Time (years)")
plt.ylabel("Price (dollars)")
plt.grid(True)
plt.legend()
plt.show()
• The chart displays two data sets. We distinguish between them by the colour of the marker.
What is Histogram?
A histogram is a graphical representation of a grouped frequency distribution with continuous
classes. It is an area diagram made up of a series of rectangles whose bases equal the distances
between class boundaries and whose areas are proportional to the frequencies of the
corresponding classes.
When to Use Histogram?
A histogram is used in the following circumstances:
o The data is quantitative.
o To examine the shape of the data distribution.
o To determine whether a process changes from one time period to the next.
o To assess whether the outcome differs when two or more processes are involved.
o To determine whether a given process satisfies the customer's needs.

Creating a Matplotlib Histogram


To create a Matplotlib histogram, the first step is to divide the whole range of the values into a
series of intervals called bins, and then count the values that fall into each interval. Bins are
consecutive, non-overlapping intervals of a variable.
A histogram provides a visual interpretation of numerical data by showing the number of data
points that fall within each bin.
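The binning step can be seen without drawing anything by using NumPy's np.histogram(), which returns the counts and the bin edges. The ten data values and the choice of four bins over the range 1 to 9 are assumptions picked so that the counts are easy to check by hand.

```python
import numpy as np

# Small illustrative data set.
data = np.array([1, 2, 2, 3, 3, 3, 4, 4, 5, 9])

# Split the range 1..9 into 4 equal-width bins and count the values in each.
counts, edges = np.histogram(data, bins=4, range=(1, 9))

for left, right, count in zip(edges[:-1], edges[1:], counts):
    print(f"bin [{left:.0f}, {right:.0f}): {count} values")

print(counts)  # [3 5 1 1]
print(edges)   # [1. 3. 5. 7. 9.]
```

Each bin includes its left edge and excludes its right edge, except the last bin, which also includes the right edge; that is why the value 9 is counted in the final bin.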

The matplotlib.pyplot.hist() function is used to compute and create a histogram of x.


The following table shows the parameters accepted by matplotlib.pyplot.hist() function :
Parameter   Description
x           array or sequence of arrays
bins        optional; an integer, a sequence, or a string
density     optional; boolean (if True, the histogram is normalized to form a probability density)
range       optional; lower and upper range of the bins
histtype    optional; type of histogram to draw [bar, barstacked, step, stepfilled], default is "bar"
align       optional; horizontal alignment of the histogram bars [left, mid, right]
weights     optional; array of weights with the same shape as x
bottom      optional; location of the baseline of each bin
rwidth      optional; relative width of the bars as a fraction of the bin width
color       optional; color or sequence of color specs
label       optional; string or sequence of strings to match multiple datasets
log         optional; if True, the histogram axis is set to a log scale

import matplotlib.pyplot as plt
import numpy as np
# Generate random data for the histogram
data = np.random.randn(1000)
# Plotting a basic histogram
plt.hist(data, bins=30, color='skyblue', edgecolor='black')
# Adding labels and title
plt.xlabel('Values')
plt.ylabel('Frequency')
plt.title('Basic Histogram')
# Display the plot
plt.show()
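Several of the parameters from the table can be combined. The sketch below overlays two normalized histograms and uses the values returned by hist() to check that a density histogram has total area 1. The two random samples, their means and spreads, and the seed are assumptions for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the script runs without a display
import matplotlib.pyplot as plt
import numpy as np

# Two illustrative samples drawn from normal distributions.
rng = np.random.default_rng(42)
data_a = rng.normal(0, 1, 1000)
data_b = rng.normal(2, 1.5, 1000)

fig, ax = plt.subplots()

# hist() returns the bar heights, the bin edges, and the drawn patches.
heights, edges, _ = ax.hist(
    data_a, bins=30, density=True, histtype="stepfilled",
    alpha=0.5, color="skyblue", label="sample A",
)
ax.hist(
    data_b, bins=30, density=True, histtype="step",
    color="darkred", label="sample B",
)
ax.legend()

# With density=True the bar areas (height x width) sum to 1.
area = np.sum(heights * np.diff(edges))
print("Total area of sample A histogram:", area)
```

Normalizing with density=True makes histograms of differently sized samples comparable on the same axes.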

Legend:
A legend is an area describing the elements of the graph. In the Matplotlib library, there is a
function called legend() which is used to place a legend on the axes. Plot legends give meaning
to a visualization, assigning labels to the various plot elements. On maps, legends
describe the pictorial language or symbology of the map. In line graphs, legends
explain the function or the values underlying the different lines of the graph.

Python Matplotlib.pyplot.legend() Syntax


Syntax: matplotlib.pyplot.legend(["blue", "green"], bbox_to_anchor=(0.75, 1.15), ncol=2)
Attributes:
• shadow: [None or bool] Whether to draw a shadow behind the legend. The default is None.
• markerscale: [None or int or float] The relative size of legend markers compared with the
originally drawn ones. The default is None.
• numpoints: [None or int] The number of marker points in the legend when creating a
legend entry for a Line2D (line). The default is None.
• fontsize: The font size of the legend. If the value is numeric, the size will be the absolute
font size in points.
• facecolor: [None or "inherit" or color] The legend's background color.
• edgecolor: [None or "inherit" or color] The legend's background patch edge color.
The loc argument of legend() specifies the location of the legend. The default is
loc="best", which places the legend where it overlaps the plotted data least. The strings
'upper left', 'upper right', 'lower left', and 'lower right' place the legend at the corresponding
corner of the axes/figure.

import numpy as np
import matplotlib.pyplot as plt
# X-axis values
x = [1, 2, 3, 4, 5]
# Y-axis values
y = [1, 4, 9, 16, 25]
# Function to plot
plt.plot(x, y)
# Function add a legend
plt.legend(['single element'])
# function to show the plot
plt.show()
Change the Position of the Legend
In this example, two data series, `y1` and `y2`, are plotted. Each series is drawn in an
explicitly chosen color, and the legend provides the color-based labels "blue" and "green"
for clarity. The colors are set in the plot calls because Matplotlib's default second line color is
orange, not green.
# importing modules
import matplotlib.pyplot as plt

# Y-axis values of the first series
y1 = [2, 3, 4.5]
# Y-axis values of the second series
y2 = [1, 1.5, 5]
# Plot each series in the color that its legend label names
plt.plot(y1, color="blue")
plt.plot(y2, color="green")
# Add a legend in the lower-right corner
plt.legend(["blue", "green"], loc="lower right")
# Show the plot
plt.show()
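A sketch of a common alternative: each plot call carries its own label, and legend() picks the labels up automatically. It also uses the bbox_to_anchor and ncol arguments from the syntax line to place the legend above the axes in two columns. The data and the label names "squares" and "cubes" are assumptions for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the script runs without a display
import matplotlib.pyplot as plt

x = [1, 2, 3, 4, 5]

fig, ax = plt.subplots()
# The label given to each line becomes its legend entry.
ax.plot(x, [1, 4, 9, 16, 25], color="blue", label="squares")
ax.plot(x, [1, 8, 27, 64, 125], color="green", label="cubes")

# Anchor the legend's upper-centre point just above the axes, in two columns.
legend = ax.legend(
    loc="upper center",
    bbox_to_anchor=(0.5, 1.15),
    ncol=2,
    shadow=True,
    fontsize=10,
)
```

Attaching labels to the plot calls keeps each label next to the data it describes, so the legend stays correct if the plotting order changes.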
Visualizing Errors
• Error bars can be added to Matplotlib line plots and graphs. Error is the difference between the
calculated value and the actual value.
• Without error bars, graphs give the impression that a measured or calculated value
is known to a high level of precision. The function matplotlib.pyplot.errorbar() plots y versus x
as lines and/or markers with attached error bars.
• Adding an error bar in Matplotlib is simple: we just supply the value of the error. We use
the command:
plt.errorbar(x, y, yerr = 2, capsize=3)
Where:
x = The data of the X axis.
y = The data of the Y axis.
yerr = The error value on the Y axis. A single number applies to every point; an array gives each point its own error value.
xerr = The error value on the X axis.
capsize = The length of the caps at the lower and upper ends of the error bar.
• A simple example, where we plot only one point. The error is 10% of the Y value.
import matplotlib.pyplot as plt
x = 1
y = 20
y_error = 20*0.10 # 10% error
plt.errorbar(x, y, yerr = y_error, capsize=3)
plt.show()
Output:
• We plot using the command "plt.errorbar (...)", giving it the desired characteristics.

import matplotlib.pyplot as plt
import numpy as np
x = np.arange(1,8)
y = np.array([20,10,45,32,38,21,27])
y_error = y * 0.10 # 10% error on each point
plt.errorbar(x, y, yerr = y_error,
linestyle="None", fmt="ob", capsize=3, ecolor="k")
plt.show()
• Parameters of errorbar():
a) yerr is the error value at each point.
b) linestyle="None" indicates that no line is drawn between the points.
c) fmt is the marker format, in this case a circle ("o") in blue ("b").
d) capsize is the length of the caps at the lower and upper ends of the error bar.
e) ecolor is the color of the error bar. The default is the marker color.
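Errors need not be symmetric. As a sketch, errorbar() also accepts a pair of arrays for yerr (lower errors first, then upper errors), and xerr adds horizontal bars. The data values and the 5% / 15% error fractions below are assumptions for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the script runs without a display
import matplotlib.pyplot as plt
import numpy as np

x = np.arange(1, 6)
y = np.array([20, 10, 45, 32, 38])

# Asymmetric vertical errors: 5% below each point and 15% above it.
lower = 0.05 * y
upper = 0.15 * y

# The same horizontal error for every point.
x_error = 0.2

fig, ax = plt.subplots()
container = ax.errorbar(
    x, y,
    yerr=[lower, upper],  # two rows: lower errors, then upper errors
    xerr=x_error,
    fmt="ob",             # blue circles, no connecting line
    capsize=3,
    ecolor="k",           # black error bars
)
ax.set_title("Asymmetric error bars")
```

The returned container records which error bars were drawn, which can help when styling them afterwards.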
Output:

• Multiple lines with errorbar() in Matplotlib : It is often necessary to draw several lines in
the same plot. We can draw many sets of error bars in the same graph in this way.
import numpy as np
import matplotlib.pyplot as plt
fig = plt.figure()
x = np.arange(20)
y = 4 * np.sin(x / 20 * np.pi)
yerr = np.linspace(0.06, 0.3, 20)
# Labels are needed so that plt.legend() has entries to show
plt.errorbar(x, y + 8, yerr = yerr, label = 'both limits (default)')
plt.errorbar(x, y + 6, yerr = yerr,
uplims = True, label = 'uplims=True')
plt.errorbar(x, y + 4, yerr = yerr,
uplims = True,
lolims = True, label = 'uplims=True, lolims=True')
# One flag per data point, so each list needs 20 entries
upperlimits = [True, False] * 10
lowerlimits = [False, True] * 10
plt.errorbar(x, y, yerr = yerr,
uplims = upperlimits,
lolims = lowerlimits, label = 'alternating limits')
plt.legend(loc='upper left')
plt.title('Example')
plt.show()
Output:
