1/5/25, 12:09 AM Preprocessing1.ipynb - Colab
!pip install watermark
Collecting watermark
Requirement already satisfied: ipython>=6.0 in /usr/local/lib/python3.10/dist-packages (from watermark) (7.34.0)
Requirement already satisfied: importlib-metadata>=1.4 in /usr/local/lib/python3.10/dist-packages (from watermark) (8.5.0)
Requirement already satisfied: setuptools in /usr/local/lib/python3.10/dist-packages (from watermark) (75.1.0)
Requirement already satisfied: zipp>=3.20 in /usr/local/lib/python3.10/dist-packages (from importlib-metadata>=1.4->watermark) (3.21
Collecting jedi>=0.16 (from ipython>=6.0->watermark)
Requirement already satisfied: decorator in /usr/local/lib/python3.10/dist-packages (from ipython>=6.0->watermark) (4.4.2)
Requirement already satisfied: pickleshare in /usr/local/lib/python3.10/dist-packages (from ipython>=6.0->watermark) (0.7.5)
Requirement already satisfied: traitlets>=4.2 in /usr/local/lib/python3.10/dist-packages (from ipython>=6.0->watermark) (5.7.1)
Requirement already satisfied: prompt-toolkit!=3.0.0,!=3.0.1,<3.1.0,>=2.0.0 in /usr/local/lib/python3.10/dist-packages (from ipython
Requirement already satisfied: pygments in /usr/local/lib/python3.10/dist-packages (from ipython>=6.0->watermark) (2.18.0)
Requirement already satisfied: backcall in /usr/local/lib/python3.10/dist-packages (from ipython>=6.0->watermark) (0.2.0)
Requirement already satisfied: matplotlib-inline in /usr/local/lib/python3.10/dist-packages (from ipython>=6.0->watermark) (0.1.7)
Requirement already satisfied: pexpect>4.3 in /usr/local/lib/python3.10/dist-packages (from ipython>=6.0->watermark) (4.9.0)
Requirement already satisfied: parso<0.9.0,>=0.8.4 in /usr/local/lib/python3.10/dist-packages (from jedi>=0.16->ipython>=6.0->waterm
Requirement already satisfied: ptyprocess>=0.5 in /usr/local/lib/python3.10/dist-packages (from pexpect>4.3->ipython>=6.0->watermark
Requirement already satisfied: wcwidth in /usr/local/lib/python3.10/dist-packages (from prompt-toolkit!=3.0.0,!=3.0.1,<3.1.0,>=2.0.0
Installing collected packages: jedi, watermark
Successfully installed jedi-0.19.2 watermark-2.5.0
%load_ext watermark
%watermark -u -v -d -p matplotlib,numpy
Last updated: 2024-12-27
Python implementation: CPython
Python version : 3.10.12
IPython version : 7.34.0
matplotlib: 3.8.0
numpy : 1.26.4
More info about the %watermark extension
%matplotlib inline
Boxplots in matplotlib
Sections
Simple Boxplot
Black and white Boxplot
Horizontal Boxplot
Filled and cylindrical boxplots
Boxplots with custom fill colors
Violin plots
Simple Boxplot
import matplotlib.pyplot as plt
import numpy as np

# three samples of 100 normally distributed values with increasing spread
all_data = [np.random.normal(0, std, 100) for std in range(5, 8)]

fig = plt.figure(figsize=(4,3))

plt.boxplot(all_data,
            notch=False, # box instead of notch shape
            sym='rs',    # red squares for outliers
            vert=True)   # vertical box alignment

plt.xticks([y+1 for y in range(len(all_data))], ['x1', 'x2', 'x3'])
plt.xlabel('measurement x')
plt.title('Box plot')
plt.show()
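Median: The middle value of the dataset, drawn as the line inside each box.
Quartiles: The box spans the first quartile (Q1) to the third quartile (Q3), i.e. the middle 50% of the data.
Whiskers: Extend from the box to the most extreme data points that lie within 1.5 times the interquartile range of the box (matplotlib's default), so they exclude outliers.
Outliers: Data points that fall beyond the whiskers, marked here as red squares (sym='rs').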
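The four quantities listed above can also be computed directly, which may help when checking what a box plot is showing. The following is a minimal NumPy sketch, assuming the 1.5 × IQR whisker rule (matplotlib's default `whis` setting); the seed and the variable names are illustrative and not part of the notebook.

```python
import numpy as np

rng = np.random.default_rng(0)      # fixed seed, chosen only for reproducibility
data = rng.normal(0, 5, 100)        # one sample like those plotted above

# median and quartiles: the line and the edges of the box
q1, median, q3 = np.percentile(data, [25, 50, 75])
iqr = q3 - q1

# fences under the 1.5 * IQR rule (assumed default)
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

# whiskers end at the most extreme points that are still inside the fences
inside = data[(data >= lower_fence) & (data <= upper_fence)]
whisker_low, whisker_high = inside.min(), inside.max()

# everything beyond the fences would be drawn as an outlier marker
outliers = data[(data < lower_fence) | (data > upper_fence)]

print(f"median={median:.2f}  Q1={q1:.2f}  Q3={q3:.2f}  IQR={iqr:.2f}")
print(f"whiskers: {whisker_low:.2f} to {whisker_high:.2f}")
print(f"number of outliers: {outliers.size}")
```

Because the whiskers stop at actual data points, they are usually shorter than the fences themselves.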
Black and white Boxplot
import matplotlib.pyplot as plt
import numpy as np

all_data = [np.random.normal(0, std, 100) for std in range(1, 4)]

fig = plt.figure(figsize=(8,6))

bplot = plt.boxplot(all_data,
            notch=False, # box instead of notch shape
            sym='rs',    # red squares for outliers
            vert=True)   # vertical box alignment

plt.xticks([y+1 for y in range(len(all_data))], ['x1', 'x2', 'x3'])
plt.xlabel('measurement x')

# boxplot returns a dict that maps component names to lists of artists
print(bplot.keys())
for components in bplot.keys():
    for line in bplot[components]:
        line.set_color('black')     # black lines

t = plt.title('Black and white box plot')
plt.show()
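The recoloring loop above relies on the dictionary that `boxplot` returns. As a rough sketch of what that dictionary contains, the snippet below lists each component name and how many artists it holds. The `Agg` backend, the seed, and the use of `plt.subplots` are assumptions made so the example runs without a display; they are not part of the notebook.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs headless
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)
all_data = [rng.normal(0, std, 100) for std in range(1, 4)]

fig, ax = plt.subplots(figsize=(8, 6))
bplot = ax.boxplot(all_data)

# each key maps to a list of artists, e.g. two whiskers and two caps per box
for name, artists in bplot.items():
    print(f"{name:10s} {len(artists)} artist(s)")
    for artist in artists:
        artist.set_color("black")   # same recoloring as in the loop above

plt.close(fig)
```

Iterating over `bplot.items()` avoids a second dictionary lookup, but it does the same work as the `keys()` loop in the notebook.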
Horizontal Boxplot
import matplotlib.pyplot as plt
import numpy as np

all_data = [np.random.normal(0, std, 100) for std in range(1, 4)]

fig = plt.figure(figsize=(8,6))

plt.boxplot(all_data,
            notch=False, # box instead of notch shape
            sym='rs',    # red squares for outliers
            vert=False)  # horizontal box alignment

# with horizontal boxes the categories sit on the y-axis
plt.yticks([y+1 for y in range(len(all_data))], ['x1', 'x2', 'x3'])
plt.ylabel('measurement x')
t = plt.title('Horizontal Box plot')
plt.show()
Filled and cylindrical boxplots
import matplotlib.pyplot as plt
import numpy as np

all_data = [np.random.normal(0, std, 100) for std in range(1, 4)]

fig = plt.figure(figsize=(8,6))

plt.boxplot(all_data,
            notch=True,         # notch shape
            sym='bs',           # blue squares for outliers
            vert=True,          # vertical box alignment
            patch_artist=True)  # fill with color

plt.xticks([y+1 for y in range(len(all_data))], ['x1', 'x2', 'x3'])
plt.xlabel('measurement x')
t = plt.title('Box plot')
plt.show()
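With `notch=True` the box is pinched around the median to indicate a confidence interval for it. The sketch below estimates such an interval with the commonly used approximation median ± 1.57 × IQR / √n. Treat that formula as an assumption about how the notch is sized rather than a statement of matplotlib's exact implementation; the seed and variable names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
data = rng.normal(0, 2, 100)

q1, median, q3 = np.percentile(data, [25, 50, 75])
iqr = q3 - q1

# approximate confidence interval around the median (assumed notch formula)
half_width = 1.57 * iqr / np.sqrt(data.size)
notch_low = median - half_width
notch_high = median + half_width

print(f"median={median:.3f}")
print(f"notch from {notch_low:.3f} to {notch_high:.3f}")
```

If the notches of two boxes do not overlap, this is often read as informal evidence that the medians differ.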
Boxplots with custom fill colors
import matplotlib.pyplot as plt
import numpy as np

all_data = [np.random.normal(0, std, 100) for std in range(1, 4)]

fig = plt.figure(figsize=(8,6))

bplot = plt.boxplot(all_data,
            notch=False,        # box instead of notch shape
            vert=True,          # vertical box alignment
            patch_artist=True)  # fill with color

# one fill color per box
colors = ['pink', 'lightblue', 'lightgreen']
for patch, color in zip(bplot['boxes'], colors):
    patch.set_facecolor(color)

plt.xticks([y+1 for y in range(len(all_data))], ['x1', 'x2', 'x3'])
plt.xlabel('measurement x')
t = plt.title('Box plot')
plt.show()
Violin plots
Violin plots are closely related to Tukey's (1977) box plots but add useful information such as the distribution of the sample data (density trace).
Violin plots were added in matplotlib 1.4.
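The density trace mentioned above is a kernel density estimate of each sample. As an illustration only, here is a small NumPy sketch of a one-dimensional Gaussian kernel density estimate. The helper `gaussian_kde_1d` is hypothetical, and the Scott's-rule bandwidth is an assumption; it is not taken from the notebook.

```python
import numpy as np

def gaussian_kde_1d(sample, grid):
    """Evaluate a Gaussian kernel density estimate of `sample` on `grid`."""
    sample = np.asarray(sample, dtype=float)
    n = sample.size
    # Scott's rule for one dimension (assumed bandwidth choice)
    bandwidth = sample.std(ddof=1) * n ** (-1 / 5)
    # one Gaussian bump per data point, evaluated at every grid position
    z = (grid[:, None] - sample[None, :]) / bandwidth
    kernels = np.exp(-0.5 * z**2) / np.sqrt(2 * np.pi)
    return kernels.sum(axis=1) / (n * bandwidth)

rng = np.random.default_rng(0)
sample = rng.normal(0, 6, 100)
grid = np.linspace(sample.min() - 20, sample.max() + 20, 400)
density = gaussian_kde_1d(sample, grid)

# trapezoid rule: a density should integrate to roughly 1
area = np.sum((density[1:] + density[:-1]) / 2 * np.diff(grid))
print(f"area under the density trace: {area:.3f}")
```

A violin plot mirrors a curve like `density` on both sides of a central axis, which is why its width shows where the data are concentrated.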
import matplotlib.pyplot as plt
import numpy as np

fig, axes = plt.subplots(nrows=1, ncols=2, figsize=(12,5))

all_data = [np.random.normal(0, std, 100) for std in range(6, 10)]

#fig = plt.figure(figsize=(8,6))

axes[0].violinplot(all_data,
                   showmeans=False,
                   showmedians=True
                   )
axes[0].set_title('violin plot')

axes[1].boxplot(all_data,
                )
axes[1].set_title('box plot')

# adding horizontal grid lines
for ax in axes:
    ax.yaxis.grid(True)
    ax.set_xticks([y+1 for y in range(len(all_data))])
    ax.set_xlabel('xlabel')
    ax.set_ylabel('ylabel')

# set tick positions and labels on both axes at once
plt.setp(axes, xticks=[y+1 for y in range(len(all_data))],
         xticklabels=['x1', 'x2', 'x3', 'x4'],
         )

plt.show()