Plotting
Plotting Quantitative Variables
We now present a few useful methods for visualizing quantitative
data, again using the nutri data set. We will first focus on continuous features (e.g., 'age') and then
add some specific graphs related to discrete features (e.g., 'tea').
The aim is to describe the variability present in a single feature. This typically involves a
central tendency: a value around which observations tend to gather, with fewer observations further away.
The main aspects of the distribution are the location (or center) of the variability, the spread
of the variability (how far the values extend from the center), and the shape of the variability;
e.g., whether or not values are spread symmetrically on either side of the center.

1.5.2.1 Boxplot
A boxplot can be viewed as a graphical representation of the five-number summary of the data,
consisting of the minimum, maximum, and the first, second, and third quartiles.
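
To make the five-number summary concrete, here is a minimal sketch that computes it with Python's standard library. The helper name five_number_summary and the sample ages are hypothetical, chosen only for illustration; they are not taken from the nutri data. Quartile conventions differ between software packages, so other tools may report slightly different values for Q1 and Q3.

```python
import statistics

def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) for a sequence of numbers."""
    data = sorted(values)
    # method='inclusive' interpolates linearly between order statistics,
    # treating the sample as containing its own minimum and maximum.
    q1, q2, q3 = statistics.quantiles(data, n=4, method='inclusive')
    return data[0], q1, q2, q3, data[-1]

# Hypothetical ages, for illustration only (not the nutri data).
ages = [65, 67, 68, 70, 72, 74, 75, 78, 91]
print(five_number_summary(ages))
```

These five numbers are what the box (Q1, median, Q3) and the extremes of a boxplot are built from.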
Figure 1.2 gives a boxplot for the 'age' feature of the nutri data.
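    import matplotlib.pyplot as plt
    plt.boxplot(nutri['age'], widths=width, vert=False)
    plt.xlabel('age')
    plt.show()

The widths parameter determines the width of the box (width here is a numeric variable assumed to have been
set beforehand). Boxplots are drawn vertically by default; passing vert=False draws the box horizontally.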
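
The whisker rule spelled out in the next paragraph can be checked by hand. The following is a minimal sketch under stated assumptions: the function name whisker_limits and the sample ages are hypothetical, and the quartiles use the standard library's 'inclusive' convention, which may differ slightly from the one used by a given plotting library.

```python
import statistics

def whisker_limits(values):
    """Return (left, right, outliers) under the 1.5 IQR rule."""
    data = sorted(values)
    q1, _, q3 = statistics.quantiles(data, n=4, method='inclusive')
    iqr = q3 - q1  # interquartile range
    left = max(data[0], q1 - 1.5 * iqr)    # largest of min and Q1 - 1.5 IQR
    right = min(data[-1], q3 + 1.5 * iqr)  # smallest of max and Q3 + 1.5 IQR
    # Points strictly outside the whisker limits are flagged as outliers.
    outliers = [x for x in data if x < left or x > right]
    return left, right, outliers

# Hypothetical ages, for illustration only (not the nutri data).
ages = [65, 67, 68, 70, 72, 74, 75, 78, 91]
left, right, outliers = whisker_limits(ages)
print(left, right, outliers)
```

Note that plotting libraries such as matplotlib draw each whisker to the most extreme observation lying inside these limits, so a drawn whisker may end short of the value computed here.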
The left whisker extends to the largest of (a) the minimum of the data and (b) Q1 − 1.5 IQR.
Similarly, the right whisker extends to the smallest of (a) the maximum of the data and (b) Q3 + 1.5
IQR. Any data point outside the whiskers is marked by a small hollow dot, flagging it as a suspicious or
deviant point (outlier). Note that a boxplot may also be used for discrete quantitative features