
Basic Plotting

The document provides an overview of plotting with the matplotlib library, specifically using its pyplot sub-module for creating various types of visualizations with NumPy and pandas. It includes examples of basic plots, data preparation, and advanced plotting techniques, as well as a list of common plot types and embellishments to enhance visual clarity. Key practices for effective data visualization, such as labeling, styling, and saving plots, are also discussed.

Uploaded by

Akash Aku

PLOTTING Important points:

Plotting for NumPy and pandas is commonly done using the matplotlib library,
particularly its sub-module pyplot, which provides a MATLAB-like interface for creating
static, animated, and interactive visualizations.

Here's a basic overview and example to help you get started:

Importing pyplot
python
import matplotlib.pyplot as plt

Basic Plot Example with NumPy


python
import numpy as np
import matplotlib.pyplot as plt

# Create some data


x = np.linspace(0, 10, 100)
y = np.sin(x)

# Plot
plt.plot(x, y)
plt.title("Sine Wave")
plt.xlabel("x-axis")
plt.ylabel("y = sin(x)")
plt.grid(True)
plt.show()

📊 Basic Plot Example with pandas


python
import pandas as pd
import matplotlib.pyplot as plt

# Create DataFrame
data = {
    'Year': [2020, 2021, 2022, 2023],
    'Sales': [150, 200, 250, 300]
}
df = pd.DataFrame(data)

# Plot
df.plot(x='Year', y='Sales', kind='line', marker='o')
plt.title("Annual Sales")
plt.ylabel("Sales")
plt.grid(True)

plt.show()

✅ Notes:

• matplotlib.pyplot is usually imported as plt.
• pandas uses matplotlib as the default backend for its built-in .plot() method.
• You can create various types of plots: line, bar, histogram, pie, scatter, etc.

The next example walks through basic plotting with Matplotlib's pyplot by analyzing
alcohol consumption data across U.S. states over time. The code is explained step by
step below:

1. Data Preparation
python

import matplotlib, matplotlib.pyplot as plt


import pickle, pandas as pd

• Imports essential libraries: matplotlib for plotting, pandas for data handling,
  pickle for loading pre-saved data.

python

alco = pickle.load(open("alco.pickle", "rb"))


del alco["Total"]

• Loads the pre-processed alcohol consumption data (probably a pandas DataFrame).
• Removes the "Total" column because only specific types (Beer, Wine, Spirits) are of
  interest.

python

columns, years = alco.unstack().columns.levels

• Unstacks the DataFrame, which moves the innermost index level (the year) into the
  columns. The resulting columns form a two-level index whose levels are the beverage
  types (assigned to columns) and the years (assigned to years).
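
The effect of unstack() is easier to see on a tiny table. The sketch below is only an illustration: it assumes the data is indexed by (State, Year) with one column per beverage, and every number in it is made up.

```python
import pandas as pd

# Synthetic stand-in for the alcohol data: rows indexed by (State, Year),
# one column per beverage. The numbers are invented for illustration.
index = pd.MultiIndex.from_product(
    [["Colorado", "Utah"], [2008, 2009]], names=["State", "Year"])
alco = pd.DataFrame(
    {"Beer": [1.4, 1.3, 0.8, 0.7], "Wine": [0.5, 0.6, 0.2, 0.2]},
    index=index)

# unstack() moves the innermost index level (Year) into the columns,
# giving two-level columns of the form (beverage, year).
wide = alco.unstack()
columns, years = wide.columns.levels

print(list(columns))  # beverage names
print(list(years))    # years found in the index
print(wide.shape)     # one row per state, one column per (beverage, year)
```

Each state now occupies a single row, which is the layout the later per-beverage plots rely on.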

python

states = pd.read_csv("states.csv",
                     names=("State", "Standard", "Postal", "Capital"))
states.set_index("State", inplace=True)

• Loads state abbreviation info and sets state names as the index for easier merging.

python

frames = [pd.merge(alco[column].unstack(), states,
                   left_index=True, right_index=True).sort_values(2009)
          for column in columns]

• For each beverage type:
  o alco[column].unstack() reshapes the data with years as columns.
  o Merges state info.
  o Sorts states by 2009 consumption.
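
The merge-and-sort step can be tried on a small hand-made table. This is a sketch under assumptions: the values are invented, and the lookup table carries only a Postal column rather than all the columns of states.csv.

```python
import pandas as pd

# Made-up per-state values for one beverage, already unstacked so that
# the years are the (integer) column labels.
beer = pd.DataFrame(
    {2008: [1.4, 0.8, 1.9], 2009: [1.3, 0.7, 1.8]},
    index=pd.Index(["Colorado", "Utah", "New Hampshire"], name="State"))

# State lookup table indexed by state name (Postal column only).
states = pd.DataFrame(
    {"Postal": ["CO", "NH", "UT"]},
    index=pd.Index(["Colorado", "New Hampshire", "Utah"], name="State"))

# Join on the two indexes, then order the rows by the 2009 column.
frame = pd.merge(beer, states,
                 left_index=True, right_index=True).sort_values(2009)
print(frame)
```

Because both tables are indexed by state name, no explicit key column is needed; rows are matched by index label.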

python

span = max(years) - min(years) + 1

• Calculates the number of years covered by the data, counting both the first and the last year.

2. Plotting
python

matplotlib.style.use("ggplot")

• Applies the "ggplot" style to make plots visually appealing.

python

STEP = 5

• Tick interval: a label is drawn for every fifth year on the x-axis and every fifth state on the y-axis.

python

for pos, (draw, style, column, frame) in enumerate(zip(
        (plt.contourf, plt.contour, plt.imshow),
        (plt.cm.autumn, plt.cm.cool, plt.cm.spring),
        columns, frames)):
    # One subplot per beverage, each drawn with a different function
    plt.subplot(2, 2, pos + 1)
    draw(frame[frame.columns[:span]], cmap=style, aspect="auto")
    plt.colorbar()
    plt.title(column)
    plt.xlabel("Year")
    plt.xticks(range(0, span, STEP), frame.columns[:span:STEP])
    plt.yticks(range(0, frame.shape[0], STEP), frame.Postal[::STEP])
    plt.xticks(rotation=-17)

Key Concepts:

• plt.subplot(2, 2, pos + 1): Arranges subplots in a 2x2 grid; pos + 1 selects the cell
  for each plot (subplot positions are numbered from 1).
• Plot types:
  o contourf: filled contours
  o contour: contour lines
  o imshow: image plot (like a heatmap)
• cmap: Defines the color map (visual palette).
• aspect="auto": An imshow option that stretches the image to fill the subplot instead of
  forcing square cells.
• xticks and yticks: Set tick positions and labels on the x and y axes.
• colorbar(): Adds a color scale that maps colors to data values.
• Rotation: Tilts x-axis labels for better readability.
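
To compare the three drawing functions without the alcohol data, the sketch below draws the same synthetic grid three times. The grid, the titles, and the headless "Agg" backend are assumptions made for this illustration; aspect="auto" is passed only to imshow, since it is an imshow option.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; remove for interactive use
import matplotlib.pyplot as plt
import numpy as np

# Synthetic 2-D grid standing in for a states-by-years table.
rows, cols = np.mgrid[0:20, 0:30]
data = np.sin(rows / 4.0) + np.cos(cols / 6.0)

plots = zip((plt.contourf, plt.contour, plt.imshow),
            ("autumn", "cool", "spring"),
            ("contourf", "contour", "imshow"))

for pos, (draw, cmap, name) in enumerate(plots):
    plt.subplot(2, 2, pos + 1)
    # aspect is specific to imshow, so it is passed only to that call.
    extra = {"aspect": "auto"} if draw is plt.imshow else {}
    mappable = draw(data, cmap=cmap, **extra)
    plt.colorbar(mappable)  # color scale for this subplot
    plt.title(name)

plt.tight_layout()
fig = plt.gcf()
```

Each colorbar is drawn in its own small axes, so the figure ends up with six axes: three plots and three color scales.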

3. Saving or Displaying
python

plt.tight_layout()
plt.savefig("../images/pyplot-all.pdf")
# plt.show()

• tight_layout() fixes layout overlaps.
• savefig() saves the figure to a file (here: a PDF).
• plt.show() would display it interactively (commented out here).
• plt.clf() could be used later to clear the figure if desired.
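
A minimal save-then-clear sequence might look like the sketch below. The temporary directory and file names are hypothetical, and the "Agg" backend is assumed so that the code runs without a display.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # assumption: headless run
import matplotlib.pyplot as plt

plt.plot([0, 1, 2], [0, 1, 4])
plt.title("Saved figure")
plt.tight_layout()

# Hypothetical output location; the file format follows the extension.
out_dir = tempfile.mkdtemp()
pdf_path = os.path.join(out_dir, "example.pdf")
png_path = os.path.join(out_dir, "example.png")
plt.savefig(pdf_path)
plt.savefig(png_path, dpi=150)

# Clear the current figure so later plotting starts from a blank canvas.
plt.clf()
axes_left = len(plt.gcf().axes)
```

Calling savefig() before clf() matters: once the figure is cleared there is nothing left to write out.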

📝 Summary

This example:

• Visualizes three types of alcohol consumption (Beer, Wine, Spirits) across states and
  time.
• Demonstrates merging datasets, styling subplots, using different plot types, and
  exporting figures.
• Illustrates good practices like colorbars, labeling, and layout adjustment.

If you want to replicate or modify this code, ensure:

• alco.pickle and states.csv are available.
• The data has the expected multi-level index for beverage types and years.
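
If the two files are not at hand, a stand-in with the same structure is enough to experiment with the code. The sketch below assumes a (State, Year) row index and Beer/Wine/Spirits columns; the values are random placeholders, not real consumption figures, and only three states and a Postal column are included.

```python
import numpy as np
import pandas as pd

# Placeholder values only: random numbers, not real consumption data.
rng = np.random.default_rng(0)
state_names = ["Colorado", "New Hampshire", "Utah"]
year_range = range(2000, 2010)

index = pd.MultiIndex.from_product(
    [state_names, year_range], names=["State", "Year"])
alco = pd.DataFrame(
    rng.uniform(0.2, 2.0, size=(len(index), 3)),
    index=index, columns=["Beer", "Wine", "Spirits"])

# Stand-in for the table read from states.csv (Postal column only).
states = pd.DataFrame(
    {"Postal": ["CO", "NH", "UT"]},
    index=pd.Index(state_names, name="State"))

# Quick structural checks before running the plotting code.
columns, years = alco.unstack().columns.levels
span = max(years) - min(years) + 1
print(alco.index.names, list(columns), span)
```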

Plot Types

Matplotlib's pyplot module supports many plot types. The table below summarizes the
most commonly used pyplot plotting functions and their purposes:

📊 Conventional Plot Types in PyPlot

Plot Type            Function      Description
Vertical Bar Plot    bar()         Displays data using rectangular bars (height = value).
Horizontal Bar Plot  barh()        Same as bar(), but bars are horizontal.
Box Plot             boxplot()     Shows distribution, median, quartiles, and outliers ("whiskers" included).
Error Bar Plot       errorbar()    Adds error bars to data points (useful for experimental data).
Histogram            hist()        Displays the frequency distribution of a dataset.
Log-Log Plot         loglog()      Both X and Y axes use a logarithmic scale.
Semilog X Plot       semilogx()    X-axis is logarithmic, Y-axis is linear.
Semilog Y Plot       semilogy()    Y-axis is logarithmic, X-axis is linear.
Pie Chart            pie()         Displays proportions in a circular chart.
Line Plot            plot()        The most common plot type; connects data points with lines.
Date Plot            plot_date()   Line plot with date/time on the X-axis (deprecated in recent Matplotlib releases; plot() accepts datetime values directly).
Polar Plot           polar()       Plots data in polar coordinates (θ, r).
Scatter Plot         scatter()     Shows individual data points (size and color can be varied).
Step Plot            step()        Useful for showing changes at discrete intervals.
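
A few of these functions are shown side by side in the sketch below. The data is synthetic and the "Agg" backend is an assumption for running without a display; the point is only how each call is made.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(1)
samples = rng.normal(size=200)        # synthetic measurements
x = np.arange(5)
heights = np.array([3, 7, 2, 5, 4])   # synthetic category values

plt.subplot(2, 2, 1)
plt.bar(x, heights)
plt.title("bar()")

plt.subplot(2, 2, 2)
counts, edges, _ = plt.hist(samples, bins=10)  # counts per bin, bin edges
plt.title("hist()")

plt.subplot(2, 2, 3)
plt.scatter(samples[:50], samples[50:100], s=20)
plt.title("scatter()")

plt.subplot(2, 2, 4)
plt.step(x, heights, where="mid")
plt.title("step()")

plt.tight_layout()
fig = plt.gcf()
```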

Mastering Embellishments in Matplotlib (pyplot)

Once you've chosen a plot type, the next step is to embellish the plot so that it is clear,
attractive, and informative. Embellishments help tell the story behind the data by
clarifying the plot and highlighting key points. Here's what you can control and customize:

1. Axes Scale and Limits

• Linear vs. logarithmic scale:

python

plt.xscale("log")
plt.yscale("log")

• Set axis limits:

python

plt.xlim(1975, 2010)
plt.ylim(1.0, 2.5)
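
As a small illustration of scale and limits working together, the sketch below plots a synthetic curve that grows 2% per year (an invented series, not taken from the alcohol data) on a logarithmic y-axis. The "Agg" backend is assumed for a headless run.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run
import matplotlib.pyplot as plt
import numpy as np

years = np.arange(1975, 2011)
values = 1.0 * 1.02 ** (years - 1975)  # synthetic 2%-per-year growth

plt.plot(years, values)
plt.yscale("log")      # exponential growth appears as a straight line
plt.xlim(1975, 2010)
plt.ylim(1.0, 2.5)
ax = plt.gca()
```

On a linear y-axis the same curve bends upward; the logarithmic scale makes a constant growth rate easy to spot.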

2. Styling and Themes

• Use themes/styles:

python

# Other options: 'bmh', 'classic', 'seaborn-v0_8' (named 'seaborn' before Matplotlib 3.6), etc.
plt.style.use("ggplot")

• Use comic-style:

python

plt.xkcd() # Fun, hand-drawn style (not for serious use!)

• Customize fonts (for Unicode or design):

python

import matplotlib
matplotlib.rc("font", family="Arial")
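
Calls such as plt.style.use() change settings for the rest of the session. When a style should apply to one figure only, a context manager is one option. The sketch below uses plt.style.context() and inspects a single setting (axes.facecolor) to show that the change is temporary; the plotted numbers are arbitrary and the "Agg" backend is assumed.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run
import matplotlib.pyplot as plt

before = matplotlib.rcParams["axes.facecolor"]

# The ggplot style is active only inside the with-block.
with plt.style.context("ggplot"):
    inside = matplotlib.rcParams["axes.facecolor"]
    plt.plot([1, 2, 3], [2, 1, 3])
    plt.title("Styled with ggplot")

after = matplotlib.rcParams["axes.facecolor"]
print(before, inside, after)
```

plt.xkcd() can be used the same way, as in "with plt.xkcd():", which keeps the comic look from leaking into later figures.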

3. Annotations and Arrows

• Mark key events or peaks:

python

plt.annotate("Peak",
             xy=(year, value),
             xytext=(year + 0.5, value + 0.1),
             arrowprops=dict(facecolor='black', shrink=0.2))
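
In the snippet above, year and value are placeholders. One way to fill them in is to locate the maximum of the plotted series, as in this sketch; the series is made up and the "Agg" backend is assumed.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run
import matplotlib.pyplot as plt
import numpy as np

x = np.arange(2000, 2010)
y = np.array([1.0, 1.2, 1.5, 1.9, 2.3, 2.1, 1.8, 1.6, 1.5, 1.4])  # made up

# Position of the largest value, converted to plain Python numbers.
peak = int(np.argmax(y))
year, value = int(x[peak]), float(y[peak])

plt.plot(x, y, "-o")
note = plt.annotate("Peak",
                    xy=(year, value),                  # point being marked
                    xytext=(year + 0.5, value + 0.1),  # where the text sits
                    arrowprops=dict(facecolor="black", shrink=0.2))
```

xy is the data point the arrow points at, while xytext is the position of the label itself, so the offsets control how far the text sits from the peak.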

4. Legends and Labels

• Add legends, titles, and axis labels:

python

plt.legend(["New Hampshire", "Colorado", "Utah"])


plt.title("Beer Consumption Over Time")
plt.ylabel("Beer Consumption")
plt.xlabel("Year")
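
Passing a list to plt.legend() relies on the names being in the same order as the plotted lines. An alternative, sketched below with invented numbers, is to attach a label to each line when it is plotted and then call plt.legend() without arguments. The "Agg" backend is assumed.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run
import matplotlib.pyplot as plt

years = list(range(2005, 2010))
# Made-up values, one series per state.
series = {
    "New Hampshire": [1.9, 1.9, 1.8, 1.8, 1.7],
    "Colorado": [1.3, 1.3, 1.4, 1.3, 1.3],
    "Utah": [0.8, 0.7, 0.7, 0.8, 0.7],
}

for state, values in series.items():
    plt.plot(years, values, "-o", label=state)  # label travels with the line

legend = plt.legend(loc="best")
plt.title("Beer Consumption Over Time")
plt.ylabel("Beer Consumption")
plt.xlabel("Year")

labels = [text.get_text() for text in legend.get_texts()]
```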

5. Save Plot as Image

• Save to file (PNG, PDF, etc.):

python

plt.savefig("plot.png", dpi=300, bbox_inches='tight')

🧪 Example Code


python

import matplotlib.pyplot as plt


import matplotlib
import pickle
import pandas as pd

# Load pre-saved DataFrame (multilevel index: state and year)


alco = pickle.load(open("alco.pickle", "rb"))

# Setup
BEVERAGE = "Beer"
years = alco.index.levels[1]
states = ("New Hampshire", "Colorado", "Utah")

# Use xkcd comic style and ggplot theme


plt.xkcd()
matplotlib.style.use("ggplot")

# Plot for each state
for state in states:
    ydata = alco.loc[state][BEVERAGE]
    plt.plot(years, ydata, "-o", label=state)

    # Add annotation for this state's peak
    peak_year = ydata.idxmax()
    peak_value = ydata.max()
    plt.annotate("Peak",
                 xy=(peak_year, peak_value),
                 xytext=(peak_year + 0.5, peak_value + 0.1),
                 arrowprops={"facecolor": "black", "shrink": 0.2})

# Add embellishments
plt.ylabel(BEVERAGE + " consumption")
plt.title("And now in xkcd...")
plt.legend()
plt.savefig("pyplot-legend-xkcd.pdf")
plt.show()

Note on alco.ix[state]

The .ix indexer was deprecated in pandas 0.20 and removed in pandas 1.0, so alco.ix[state] no longer works. Use alco.loc[state] instead.
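
The sketch below shows what .loc returns on a table indexed by (State, Year). The table is a made-up stand-in for the pickled data, with a single Beer column.

```python
import pandas as pd

# Invented stand-in: rows indexed by (State, Year), one Beer column.
index = pd.MultiIndex.from_product(
    [["Colorado", "Utah"], [2007, 2008, 2009]], names=["State", "Year"])
alco = pd.DataFrame({"Beer": [1.3, 1.5, 1.4, 0.7, 0.8, 0.6]}, index=index)

# Selecting one state drops the State level, leaving a Year-indexed table.
ydata = alco.loc["Utah"]["Beer"]

peak_year = ydata.idxmax()  # index label (year) of the largest value
peak_value = ydata.max()
print(peak_year, peak_value)
```

Because the result is indexed by year, idxmax() returns the year of the peak directly, which is what the annotation code expects.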
