
Unit 2

Uploaded by

Mihir Bhayani

Unit 2

Prepared by: Prof. Megha Mehta


Quick setup and importing Matplotlib's essential modules
Understanding figure structure
Creating plots: lines, scatter plots, bars, histograms, and pie charts
Saving and exporting
Customizing aesthetics
 Matplotlib is a powerful plotting library in Python used for creating static, animated, and
interactive visualizations.
 Matplotlib’s primary purpose is to provide users with the tools and functionality to represent
data graphically, making it easier to analyze and understand.
 It was originally developed by John D. Hunter in 2003 and is now maintained by a large
community of developers.
 Matplotlib is installed with pip. The commands below use the py launcher, which is available on Windows:
 py -m pip install --upgrade pip
 py --version
 py -m pip --version
 py -m pip install matplotlib
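After installation, a quick check such as the following sketch can confirm that the library imports correctly. It also shows the two imports used throughout the examples in this unit; the aliases plt and np are conventions, not requirements.

```python
import matplotlib
import matplotlib.pyplot as plt   # plotting interface used in all examples
import numpy as np                # numerical arrays used as plot data

# Print the installed version to confirm that the installation worked.
version = matplotlib.__version__
print("Matplotlib version:", version)
```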
 Versatility: Matplotlib can generate a wide range of plots, including line plots, scatter plots,
bar plots, histograms, pie charts, and more.
 Customization: It offers extensive customization options to control every aspect of the plot,
such as line styles, colors, markers, labels, and annotations.
 Integration with NumPy: Matplotlib integrates seamlessly with NumPy, making it easy to plot
data arrays directly.
 Publication Quality: Matplotlib produces high-quality plots suitable for publication with fine-
grained control over the plot aesthetics.
 Extensible: Matplotlib is highly extensible, with a large ecosystem of add-on toolkits and
extensions like Seaborn, Pandas plotting functions, and Basemap for geographical plotting.
 Cross-Platform: It is platform-independent and can run on various operating systems,
including Windows, macOS, and Linux.
 Interactive Plots: Matplotlib supports interactive plotting through the use of widgets and
event handling, enabling users to explore data dynamically.
 In Matplotlib, a figure is the top-level container that holds all the elements of a plot. It
represents the entire window or page on which the plot is drawn.
 The parts of a Matplotlib figure include (as shown in the figure above):
1. Figures in Matplotlib: The Figure object is the top-level container for all elements of the
plot. It serves as the canvas on which the plot is drawn. You can think of it as the blank sheet
of paper on which you’ll create your visualization.
2. Axes in Matplotlib: Axes are the rectangular areas within the figure where data is plotted.
Each figure can contain one or more axes, arranged in rows and columns if necessary. Axes
provide the coordinate system and are where most of the plotting occurs.
3. Axis in Matplotlib: Axis objects represent the x-axis and y-axis of the plot. They define the
data limits, tick locations, tick labels, and axis labels. Each axis has a scale and a locator that
determine how the tick marks are spaced.
4. Marker in Matplotlib: Markers are symbols used to denote individual data points on a plot.
They can be shapes such as circles, squares, triangles, or custom symbols. Markers are often
used in scatter plots to visually distinguish between different data points.
5. Adding lines to Figures: Lines connect data points on a plot and are commonly used in line
plots, scatter plots with connected points, and other types of plots. They represent the
relationship or trend between data points and can be styled with different colors, widths,
and styles to convey additional information.
6. Matplotlib Title: The title is a text element that describes the plot. It
typically appears at the top of the figure and provides context or information about the data
being visualized.
7. Axis Labels in Matplotlib: Labels are text elements that provide descriptions for the x-axis
and y-axis. They help identify the data being plotted and provide units or other relevant
information.
8. Ticks: Tick marks are small marks along the axis that indicate specific data points or
intervals. They help users interpret the scale of the plot and locate specific data values.
9. Tick Labels: Tick labels are text elements that provide labels for the tick marks. They
usually display the data values corresponding to each tick mark and can be customized to
show specific formatting or units.
10. Matplotlib Legend: Legends provide a key to the symbols or colors used in the plot to
represent different data series or categories. They help users interpret the plot and
understand the meaning of each element.
11. Matplotlib Grid Lines: Grid lines are horizontal and vertical lines that extend across the
plot, corresponding to specific data intervals or divisions. They provide a visual guide to the
data and help users identify patterns or trends.
12. Spines of Matplotlib Figures: Spines are the lines that form the borders of the plot area.
They separate the plot from the surrounding whitespace and can be customized to change
the appearance of the plot borders.
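The parts listed above can be seen together in one small sketch. The data and label texts here are hypothetical and chosen only for illustration; each comment names the figure part that the line controls.

```python
import matplotlib.pyplot as plt
import numpy as np

# Figure: the top-level container. Axes: the plotting area inside it.
fig, ax = plt.subplots(figsize=(6, 4))

x = np.arange(1, 6)   # hypothetical sample data
y = x ** 2

# Line with a circular marker at each data point; the label feeds the legend.
ax.plot(x, y, marker="o", label="y = x squared")

ax.set_title("Anatomy of a figure")        # title
ax.set_xlabel("x value")                   # x-axis label
ax.set_ylabel("y value")                   # y-axis label
ax.set_xticks(x)                           # tick locations
ax.set_xticklabels([f"P{i}" for i in x])   # tick labels
ax.legend()                                # legend
ax.grid(True)                              # grid lines

# Spines: hide the top and right borders of the plot area.
ax.spines["top"].set_visible(False)
ax.spines["right"].set_visible(False)
```

Calling plt.show() afterwards displays the figure.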
 Matplotlib offers a wide range of plot types to suit various data visualization needs.
 Here are some of the most commonly used types of plots in Matplotlib:
 Line Graph
 Stem Plot
 Bar chart
 Histograms
 Scatter Plot
 Stack Plot
 Box Plot
 Pie Chart
 Error Plot
 Violin Plot
 3D Plots
 Matplotlib is popular due to its ease of use, extensive documentation, and wide range of
plotting capabilities.
 It offers flexibility in customization, supports various plot types, and integrates well with other
Python libraries like NumPy and Pandas.
 Matplotlib is a suitable choice for various data visualization tasks, including exploratory data
analysis, scientific plotting, and creating publication-quality plots.
 It excels in scenarios where users require fine-grained control over plot customization and
need to create complex or specialized visualizations.
 Matplotlib is a widely used plotting library in Python that provides a variety of plotting tools
and capabilities. Here are some of the advantages of using Matplotlib:
1. Versatility: Matplotlib can create a wide range of plots, including line plots, scatter plots,
bar plots, histograms, pie charts, and more.
2. Customization: It offers extensive customization options to control every aspect of the plot,
such as line styles, colors, markers, labels, and annotations.
3. Integration with NumPy: Matplotlib integrates seamlessly with NumPy, making it easy to
plot data arrays directly.
4. Publication Quality: Matplotlib produces high-quality plots suitable for publication with
fine-grained control over the plot aesthetics.
5. Wide Adoption: Due to its maturity and flexibility, Matplotlib is widely adopted in the
scientific and engineering communities.
6. Extensible: Matplotlib is highly extensible, with a large ecosystem of add-on toolkits and
extensions like Seaborn, Pandas plotting functions, and Basemap for geographical plotting.
7. Cross-Platform: It is platform-independent and can run on various operating systems,
including Windows, macOS, and Linux.
8. Interactive Plots: Matplotlib supports interactive plotting through the use of widgets and
event handling, enabling users to explore data dynamically.
9. Integration with Jupyter Notebooks: Matplotlib works seamlessly with Jupyter Notebooks,
allowing for interactive plotting and inline display of plots.
10. Rich Documentation and Community Support: Matplotlib has comprehensive
documentation and a large community of users and developers, making it easy to find help,
tutorials, and examples.
 While Matplotlib is a powerful and versatile plotting library, it also has some disadvantages
that users might encounter:
1. Steep Learning Curve: For beginners, Matplotlib can have a steep learning curve due to its
extensive customization options and sometimes complex syntax.
2. Verbose Syntax: Matplotlib’s syntax can be verbose and less intuitive compared to other
plotting libraries like Seaborn or Plotly, making it more time-consuming to create and
customize plots.
3. Default Aesthetics: The default plot aesthetics in Matplotlib are often considered less
visually appealing compared to other libraries, requiring more effort to make plots visually
attractive.
4. Limited Interactivity: While Matplotlib does support interactive plotting to some extent, it
does not offer as many interactive features and options as other libraries like Plotly.
5. Limited 3D Plotting Capabilities: Matplotlib’s 3D plotting capabilities are not as advanced
and user-friendly as some other specialized 3D plotting libraries.
6. Performance Issues with Large Datasets: Matplotlib can sometimes be slower and less
efficient when plotting large datasets, especially compared to more optimized plotting
libraries.
7. Documentation and Error Messages: Although Matplotlib has comprehensive
documentation, some users find it challenging to navigate, and error messages can
sometimes be cryptic and hard to debug.
8. Dependency on External Libraries: Matplotlib depends on other libraries such as NumPy,
which can occasionally lead to version-compatibility and dependency-management
issues.
9. Limited Native Support for Statistical Plotting: While Matplotlib can create basic
statistical plots, it lacks some advanced statistical plotting capabilities that are available in
specialized libraries like Seaborn.
10. Less Modern Features: Matplotlib has been around for a long time, and some users find
that it lacks some of the modern plotting features and interactive visualization capabilities
found in newer libraries.
 Pyplot is a Matplotlib module that provides a MATLAB-like interface.
 matplotlib.pyplot is a collection of plotting functions used mainly for 2D graphics in the Python
programming language. It can be used in Python scripts, the interactive shell, web application
servers, and graphical user interface toolkits.
 Matplotlib is designed to be as usable as MATLAB, with the ability to use Python and the
advantage of being free and open-source.
 Each pyplot function makes some change to a figure: e.g., creates a figure, creates a plotting
area in a figure, plots some lines in a plotting area, decorates the plot with labels, etc.
 The plots available through Pyplot include Line Plot, Histogram, Scatter, 3D
Plot, Image, Contour, and Polar.
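The way each pyplot function changes the current figure can be sketched as follows. The data values and the title text are hypothetical.

```python
import matplotlib.pyplot as plt

plt.figure()                        # creates a new figure and makes it current
plt.plot([1, 2, 3], [2, 4, 6])      # creates a plotting area and draws a line in it
plt.title("Pyplot state example")   # decorates the current plotting area
plt.xlabel("x")
plt.ylabel("y")

# pyplot keeps track of the current figure and Axes behind the scenes;
# gca() ("get current axes") returns the Axes that the calls above modified.
current_axes = plt.gca()
print(current_axes.get_title())
```

Because every call acts on the current figure, the order of the calls matters, but no figure object has to be passed around.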
 A line chart represents the relationship between two variables, X and Y, plotted on the two axes.
 A line plot connects successive data points with straight line segments.
 It is used to show continuous data.
 A line plot is used to visualize growth or decline in data over a time interval.
 Line plots are a powerful tool for visualizing trends and patterns in data, and Matplotlib
provides a user-friendly interface to create them.
 Setting the line color
 By default, the plt.plot() function plots a blue line. However, you can change the line color by
passing a color parameter to the function. The color parameter takes a string holding
a color name or a hexadecimal code.
 Setting the line width
 You can also change the line width by passing a linewidth parameter to the plt.plot() function.
The linewidth parameter takes a floating-point value representing the line's width.
 Setting the line style
 You can change the line style by passing a linestyle parameter to the plt.plot() function. The
linestyle parameter takes a string that represents the line style: 'solid', 'dashed', 'dotted', or
'dashdot' (or the short forms '-', '--', ':', and '-.').
 Adding markers to line plots
 Markers highlight individual points in the line plot. Various symbols can
be used as markers, set through the marker parameter.
 Changing the marker size
 The markersize parameter sets the size of the markers.
 Changing the marker edge color
 The markeredgecolor parameter sets the color of the marker edge.
import matplotlib.pyplot as plt
import numpy as np

# Values to plot and the category label for each value
y = np.array([35, 25, 25, 15])
mylabels = ["Apples", "Bananas", "Cherries", "Dates"]

plt.title('Sample Chart', color='red')
# Star markers of size 10 with a yellow edge at each data point
plt.plot(mylabels, y, marker='*', markersize=10, markeredgecolor='yellow')
plt.show()
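The color, linewidth and linestyle parameters described earlier can be combined with the marker parameters in a single call. The following is a minimal sketch; the weekly sales figures are hypothetical.

```python
import matplotlib.pyplot as plt

weeks = [1, 2, 3, 4, 5]          # hypothetical data
sales = [12, 18, 15, 22, 27]

fig, ax = plt.subplots()
(line,) = ax.plot(
    weeks,
    sales,
    color="#1f77b4",           # color name or hexadecimal code
    linewidth=2.5,             # line width in points
    linestyle="--",            # '-', '--', ':' or '-.'
    marker="o",                # symbol drawn at each data point
    markersize=8,
    markeredgecolor="black",
)
ax.set_title("Styled line plot")
```

Calling plt.show() afterwards displays the figure.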
 A line is not an efficient way to depict a comparison between separate categories, such as the
sales figures of individual weeks.
 To show comparisons, bar charts are preferred, because each category gets its own
bar.
 A bar plot or bar chart is a graph that represents categories of data with rectangular bars
whose lengths or heights are proportional to the values they represent.
 Bar plots can be drawn horizontally or vertically. A bar chart shows comparisons
between discrete categories.
 One axis of the plot represents the categories being compared, while the other
axis represents the measured values corresponding to those categories.
 A bar graph uses bars to compare data among different categories. It is also well suited to
showing changes over a period of time.
 Keep in mind that the longer the bar, the greater the value.
 Types of bar chart: Simple Bar Chart, Stacked Bar Chart, Grouped Bar Chart, Horizontal Bar
Chart.
 Changing Color
 The color parameter changes the color of the bars.

 Adding Edgecolor
 The edgecolor parameter adds a border of a different color around each bar.

 LineWidth
 The linewidth parameter sets the width of the border around each bar.

 Line Style
 The linestyle parameter changes the style of the border.

 Width
 The width parameter changes the width of the bars.
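import matplotlib.pyplot as plt
import numpy as np

# Bar heights and the category label for each bar
y = np.array([35, 25, 25, 15])
mylabels = ["Apples", "Bananas", "Cherries", "Dates"]

plt.title('Sample Chart', color='red')
# The two colors are applied to the bars alternately; each bar is 0.8 units wide
plt.bar(mylabels, y, color=['red', 'green'], width=0.8)
plt.show()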
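The grouped, stacked and horizontal bar chart types can be sketched as below, together with the border parameters. The second data series (shop_b) and the names shop_a and shop_b are hypothetical and added only to have something to group and stack.

```python
import matplotlib.pyplot as plt
import numpy as np

fruits = ["Apples", "Bananas", "Cherries", "Dates"]
shop_a = np.array([35, 25, 25, 15])
shop_b = np.array([20, 30, 10, 25])   # hypothetical second series

fig, (ax1, ax2, ax3) = plt.subplots(1, 3, figsize=(12, 4))

# Grouped bars: shift each series by half a bar width around the tick position.
pos = np.arange(len(fruits))
width = 0.4
ax1.bar(pos - width / 2, shop_a, width=width, label="Shop A")
ax1.bar(pos + width / 2, shop_b, width=width, label="Shop B")
ax1.set_xticks(pos)
ax1.set_xticklabels(fruits)
ax1.legend()

# Stacked bars: the second series starts where the first one ends.
ax2.bar(fruits, shop_a, label="Shop A")
ax2.bar(fruits, shop_b, bottom=shop_a, label="Shop B")
ax2.legend()

# Horizontal bars with a dashed black border of width 2.
bars = ax3.barh(fruits, shop_a, color="orange",
                edgecolor="black", linewidth=2, linestyle="--")
```

Calling plt.show() afterwards displays the three charts side by side.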
 A pie chart is a circular graph divided into segments, i.e. slices of a pie. It is
used to show percentage or proportional data, where each slice of the pie represents
a category.
 A pie chart is a circular statistical plot that can display only one series of data.
 The whole circle represents 100% of the given data. Pie charts are commonly used in
business presentations on sales, operations, survey results, resources, etc., as they provide a
quick summary.
 Syntax :
matplotlib.pyplot.pie(x, explode=None, labels=None, colors=None, autopct=None,
shadow=False)
 Labels :
The labels parameter assigns a label to each slice.
 Explode :
The explode parameter offsets slices from the center by a fraction of the pie radius, which adds space around them.

 Shadow :
Setting shadow=True draws a shadow beneath the pie chart.
 Colors :
The colors parameter sets the color of each slice.
 Autopct :
The autopct parameter takes a format string that is used to show the percentage on each slice.
 Wedgeprops :
The wedges in the pie chart can be given a border color and border width using the wedgeprops
parameter, which takes a dictionary.
 Startangle :
By default, pie() starts drawing the slices at 0 degrees, i.e. from the positive x-axis. The startangle
parameter changes the starting position.
 Counterclock :
By default, pie() draws the slices in the counterclockwise direction. Setting the
counterclock parameter to False plots the slices clockwise.
import matplotlib.pyplot as plt

# The slices are ordered and plotted counter-clockwise
product = 'Product A', 'Product B', 'Product C', 'Product D'
stock = [15, 30, 35, 20]
explode = (0.1, 0, 0.1, 0)  # offset the first and third slices

plt.pie(stock, explode=explode, labels=product, autopct='%1.1f%%', shadow=True,
        startangle=90, wedgeprops={"edgecolor": "black", 'linewidth': 3, 'antialiased': True})
plt.show()
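The percentage that autopct prints on a slice is the slice value divided by the total, times 100. The sketch below computes these shares by hand for the same stock data and also shows the colors and counterclock parameters; the color names chosen are only an example.

```python
import matplotlib.pyplot as plt

product = ["Product A", "Product B", "Product C", "Product D"]
stock = [15, 30, 35, 20]

# Share of each slice: value / total * 100
total = sum(stock)
shares = [round(value / total * 100, 1) for value in stock]
print(shares)

fig, ax = plt.subplots()
wedges, texts, autotexts = ax.pie(
    stock,
    labels=product,
    colors=["gold", "tomato", "skyblue", "lightgreen"],
    autopct="%1.1f%%",     # one decimal place followed by a percent sign
    startangle=90,         # start at the top of the circle
    counterclock=False,    # draw the slices clockwise
)
```

Calling plt.show() afterwards displays the chart.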
 Histograms are column-charts, where each column represents a range of values, and the height
of a column corresponds to how many values are in that range.
 To make a histogram, the data is sorted into "bins" and the number of data points in each bin is
counted.
 The height of each column in the histogram is then proportional to the number of data points
its bin contains.
 Histograms are used to show a distribution whereas a bar chart is used to compare different
entities.
 Histograms are useful when you have arrays or a very long list.
 Consider an example that plots the ages of a population grouped into bins. A
bin is one of a series of intervals into which the range of values is divided. Bins are usually
of the same size.
 Edgecolor
Used to set the edge color.
 Linewidth
Used to set the width of the border line.
 Linestyle
Used to set the style of the border line.
 Fill :
The default, True, fills each bar with color; False leaves each bar
empty.
 Hatch :
Fills each bar with a pattern ( '-', '+', 'x', '\\', '*', 'o', 'O', '.').
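import matplotlib.pyplot as plt
import numpy as np

# 250 random values from a normal distribution (mean 170, standard deviation 10)
x = np.random.normal(170, 10, 250)
# Unfilled bars with a dotted green border and a circle hatch pattern
plt.hist(x, edgecolor='green', linewidth=2, linestyle=':', fill=False, hatch='o')
plt.show()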
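How values are sorted into bins and counted can be seen with a small data set and explicit bin edges. The ages below are hypothetical; the bins parameter of hist() accepts a list of edges, and hist() returns the count for each bin.

```python
import matplotlib.pyplot as plt
import numpy as np

# Hypothetical ages of twelve people
ages = np.array([22, 25, 31, 35, 38, 41, 45, 52, 58, 63, 67, 79])
bin_edges = [20, 40, 60, 80]   # three bins of equal width: 20-40, 40-60, 60-80

fig, ax = plt.subplots()
counts, edges, patches = ax.hist(ages, bins=bin_edges, edgecolor="black")

# counts holds the number of ages that fell into each bin,
# which is also the height of the corresponding column.
print(counts)
```

Calling plt.show() afterwards displays the histogram with three columns.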
 A scatter chart is a two-dimensional data visualization method that uses dots to represent the
values obtained for two different variables —one plotted along the x-axis and the other plotted
along the y-axis.
 Scatter plots are used when you want to show the relationship between two variables.
 Scatter plots are sometimes called correlation plots because they show how two variables are
correlated.
 Additionally, the size, shape or color of the dot can represent a third (or even a fourth)
variable.
 Scatter plots are usually used to compare variables, for example to see how much
one variable is affected by another. The data is displayed as
a collection of points: the value of one variable determines the position on
the horizontal axis, and the value of the other variable determines the position on the vertical axis.
 s :
Sets the size of the markers.
 color :
Sets the color of the markers.
 linewidth :
Sets the width of the marker border.
 marker :
Sets the marker symbol.
 edgecolor :
Sets the color of the marker edge.
import numpy as np
import matplotlib.pyplot as plt

discount = np.array([10, 20, 30, 40, 50])
saleInRs = np.array([40000, 45000, 48000, 50000, 100000])
# Marker size grows with the discount
size = discount * 10
plt.scatter(x=discount, y=saleInRs, s=size, color='red', edgecolor='pink')
plt.title('Sales Vs Discount')
plt.xlabel('Discount offered')
plt.ylabel('Sales in Rs')
plt.show()
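A third variable can be shown through the color of the dots by passing an array to the c parameter together with a colormap. In this sketch the units_sold values are hypothetical, and 'viridis' is just one of the built-in colormaps.

```python
import matplotlib.pyplot as plt
import numpy as np

discount = np.array([10, 20, 30, 40, 50])
sale_in_rs = np.array([40000, 45000, 48000, 50000, 100000])
units_sold = np.array([80, 95, 110, 120, 260])   # hypothetical third variable

fig, ax = plt.subplots()
points = ax.scatter(
    discount,
    sale_in_rs,
    s=discount * 10,       # marker size follows the discount
    c=units_sold,          # marker color follows the third variable
    cmap="viridis",
    marker="o",
    edgecolors="black",
    linewidths=1,
)
fig.colorbar(points, ax=ax, label="Units sold")   # key for the color scale
ax.set_xlabel("Discount offered")
ax.set_ylabel("Sales in Rs")
```

Calling plt.show() afterwards displays the chart with its color bar.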
 Adding title
 The chart title is one of the most important elements for communicating what a chart is about at first glance. Interpretation
and observation usually start after reading the title.
 plt.title("Progress Grid", color="blue", size=14, loc="left")

 Axis Label
 Axis labels are created with the xlabel() and ylabel() functions, which add
an x-axis label and a y-axis label to the chart. Here is an example:
 plt.xlabel("Day of Month")
 plt.ylabel("Task Progress")
 For the y-axis it is often necessary to adjust the rotation of the axis label as well. This is done by passing a
rotation angle to the rotation parameter.
 plt.ylabel("Task Progress", rotation=90)

 Axis Size
 Matplotlib adjusts the axis limits automatically to fit the data, but you can also fix them to specific values,
in which case the axes no longer change with the data.
 You can assign lower and upper limits to the plot axes using the code below:
 plt.xlim(0,7)
 plt.ylim(0,5)
 Adding grid lines
 We can also add grid lines to our plot to make it more readable. We can achieve this by using the
plt.grid() function. The plt.grid() function takes a boolean value representing whether the grid should
be shown.
 Adding Axes ticks
 Ticks are another fundamental component of a chart. You can adjust the names, frequencies,
colors and even the rotation of axis ticks to suit the chart better.
 Axes ticks can be adjusted using the xticks() and yticks() functions. The first parameter is the
list of tick positions, for which NumPy's arange function is usually very
convenient, and the optional second parameter is the list of tick labels.
 plt.xticks(np.arange(0, 30, 7), ['Wk1', 'Wk2', "Wk3", "Wk4", "Wk5"], rotation=35, color="red")

 Adding legend
 A legend is another chart element that can enhance a visualization. To show the legend, give each
plotted series a label argument and then call the function below.
 plt.legend()
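The title, axis labels, axis limits, grid, ticks and legend can be combined in one script, as in this sketch. The progress values and the series label "Team A" are hypothetical.

```python
import matplotlib.pyplot as plt
import numpy as np

days = np.arange(0, 30, 7)               # tick positions 0, 7, 14, 21, 28
progress = [0.5, 1.5, 2.5, 3.5, 4.5]     # hypothetical data

plt.figure()
plt.plot(days, progress, marker="o", label="Team A")   # label is used by the legend
plt.title("Progress Grid", color="blue", size=14, loc="left")
plt.xlabel("Day of Month")
plt.ylabel("Task Progress", rotation=90)
plt.xlim(0, 30)                          # fixed x-axis limits
plt.ylim(0, 5)                           # fixed y-axis limits
plt.grid(True)
plt.xticks(days, ["Wk1", "Wk2", "Wk3", "Wk4", "Wk5"], rotation=35, color="red")
plt.legend()

ax = plt.gca()   # the Axes that all the calls above modified
```

Calling plt.show() afterwards displays the customized chart.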
 To import an Excel file into Python using Pandas:
import pandas as pd
df = pd.read_excel(r"Path where the Excel file is stored\File name.xlsx")
print(df)
 And if you have a specific Excel sheet that you’d like to import, you may then apply:
import pandas as pd
df = pd.read_excel(r"Path of Excel file\File name.xlsx", sheet_name="your Excel sheet name")
print(df)
 Example:
import pandas as pd
df = pd.read_excel(r"C:\Users\Ron\Desktop\my_products.xlsx")
print(df)
 If you need only specific columns from the file, you can specify the column names as shown below:

import pandas as pd
data = pd.read_excel(r"C:\Users\Ron\Desktop\my_products.xlsx")
df = pd.DataFrame(data, columns=["product_name", "price"])
print(df)
 CSV stands for "comma-separated values": a CSV file is a text file whose values are separated by
commas, and it can be opened in Excel like a spreadsheet.
 In Python, Pandas is one of the most important libraries for data science. Data analysis often
involves huge datasets, which are commonly stored in CSV format.
 To access data from a CSV file, we use the Pandas function read_csv(), which returns the
data in the form of a data frame.
 Example
import pandas as pd
df = pd.read_csv(r"C:\Users\Ron\Desktop\my_products.csv")
#df = pd.DataFrame(data, columns=["product", "price"])
print(df)
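Once the data is in a data frame, its columns can be passed directly to the plotting functions. The sketch below uses an in-memory text buffer in place of a CSV file on disk so that it runs anywhere; the product names and prices are hypothetical.

```python
import io

import matplotlib.pyplot as plt
import pandas as pd

# In-memory stand-in for a CSV file; the columns and values are hypothetical.
csv_text = """product,price
Laptop,1200
Printer,150
Tablet,300
"""
df = pd.read_csv(io.StringIO(csv_text))
print(df)

# Plot one column against another
fig, ax = plt.subplots()
ax.bar(df["product"], df["price"])
ax.set_xlabel("Product")
ax.set_ylabel("Price")
```

With a real file, the io.StringIO(...) argument is replaced by the path of the CSV file.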
 Matplotlib is a widely used Python library for plotting graphs and charts. The show() function
displays a figure as output but does not save it to a file.
 The savefig() function, as the name implies, saves the figure produced by
plotting.
 With it, the generated figure can be saved to the local computer.

import matplotlib.pyplot as plt


year = ['2010', '2002', '2004', '2006', '2008']
production = [25, 15, 35, 30, 10]
plt.bar(year, production)
plt.savefig("output.jpg")
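The output format is taken from the file extension, and savefig() accepts options such as dpi (resolution) and bbox_inches (trimming of the surrounding whitespace). The sketch below uses the same data sorted by year and writes to a temporary folder; the file names are only examples.

```python
import os
import tempfile

import matplotlib.pyplot as plt

year = ["2002", "2004", "2006", "2008", "2010"]
production = [15, 35, 30, 10, 25]

fig, ax = plt.subplots()
ax.bar(year, production)

# Save the same figure in two formats inside a temporary folder.
out_dir = tempfile.mkdtemp()
png_path = os.path.join(out_dir, "output.png")
pdf_path = os.path.join(out_dir, "output.pdf")

fig.savefig(png_path, dpi=300, bbox_inches="tight")   # high-resolution image
fig.savefig(pdf_path)                                 # vector format
print("Saved:", png_path, pdf_path)
```

When show() is also used, it is safer to call savefig() first, since the figure may no longer be available after the display window is closed.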
