R Graphics Chapter1
An Introduction to R Graphics
Chapter preview
This chapter provides the most basic information needed to start producing
plots in R. First, a three-line code example demonstrates the fundamental
steps involved in producing a plot. This is followed by a series of figures
that demonstrate the range of images R can produce. There is also a section
on the organization of R graphics, giving information on where to look for
a particular function. The final section describes the different graphical
output formats that R can produce and how to obtain a particular output
format.
The following code provides a simple example of how to produce a plot using
R (see Figure 1.1).
> plot(pressure)
> text(150, 600,
"Pressure (mm Hg)\nversus\nTemperature (Celsius)")
∗ The pressure data set, available in the datasets package, contains 19 recordings of
the relationship between vapor pressure (in millimeters of mercury) and temperature (in
degrees Celsius).
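Before plotting, it can help to look at the data itself. The following is a minimal sketch, assuming only the pressure data set described in the footnote; the variable names n_rows and col_names are introduced here for illustration.

```r
# Inspect the built-in 'pressure' data frame from the datasets package.
str(pressure)                  # compact summary of the columns

n_rows    <- nrow(pressure)    # number of recordings
col_names <- names(pressure)   # the two variables drawn by plot(pressure)

print(n_rows)
print(col_names)
```

Because pressure is a data frame with two numeric columns, plot(pressure) draws the second column against the first.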
[Figure 1.1 image: scatterplot with x-axis "temperature" and y-axis "pressure",
with a text label inside the plot region.]
Figure 1.1
A simple scatterplot of vapor pressure of mercury as a function of temperature.
The plot is produced from two simple R expressions: one expression to draw the
basic plot, consisting of axes, data symbols, and bounding rectangle; and another
expression to add the text label within the plot.
∗ The barplot makes use of data on death rates in the state of Virginia for different age
groups and population groups, available as the VADeaths data set in the datasets package.
The boxplot example makes use of data on the effect of vitamin C on tooth growth in guinea
pigs, available as the ToothGrowth data set, also from the datasets package. These and
many other data sets distributed with R were obtained from “Interactive Data Analysis” by
Don McNeil[40] rather than directly from the original source.
For more information on the Trellis system and how to produce Trellis plots
using the lattice package, see Chapter 4.
∗ All of the figures in this book, apart from the figures in Chapter 7 that only contain R
[Figure 1.2 image: six panels. Scatterplot (x-axis "Travel Time (s)", label
"Bird 131"); histogram ("Histogram of Y", y-axis "Density"); barplot (Rural
Male, Rural Female, Urban Male, Urban Female); boxplot (y-axis "tooth length",
x-axis "Vitamin C dose (mg)", legend "Ascorbic acid" and "Orange juice");
3D surface (axes x, y, z); piechart (Cherry, Blueberry, Apple, Vanilla, Other,
Boston Cream).]
Figure 1.2
Some standard plots produced using R: (from left-to-right and top-to-bottom) a
scatterplot, a histogram, a barplot, a boxplot, a 3D surface, and a piechart. In the
first four cases, the basic plot type has been augmented by adding additional labels,
lines, and axes. (The boxplot is adapted from an idea by Roger Bivand.)
[Figure 1.3 image: scatterplot with x-axis "Sampling Fraction", left y-axis
"Completeness", and an additional right y-axis "Number of Vessels".]
Figure 1.3
A customized scatterplot produced using R. This is created by starting with a simple
scatterplot and augmenting it by adding an additional y-axis and several additional
sets of lines, polygons, and text labels.
[Figure 1.4 image: six dotplot panels, one per site (Waseca, Crookston, Morris,
University Farm, Duluth, Grand Rapids). Each panel lists the varieties Trebi,
Wisconsin No. 38, No. 457, Glabron, Peatland, Velvet, No. 475, Manchuria,
No. 462, and Svansota against a common x-axis running from 20 to 60, with
separate plotting symbols for 1932 and 1931.]
Figure 1.4
A Trellis dotplot produced using R. The relationship between the yield of barley and
species of barley is presented, with a separate dotplot for different experimental sites
and different plotting symbols for data gathered in different years. This is a small
modification of Figure 1.1 from Bill Cleveland’s “Visualizing Data” (reproduced with
permission from Hobart Press).
Figure 1.5
A map of New Zealand produced using R, Ray Brownrigg’s maps package, and
Thomas Minka’s mapproj package. The map (of New Zealand) is drawn as a
series of polygons, and then text, an arrow, and a data point have been added to
indicate the location of Auckland, the birthplace of R. A separate world map has
been drawn in the bottom-right corner, with a circle to help people locate New
Zealand.
Figure 1.6
Some polar-coordinate plots produced using R (top-left), the CircStats package by
Ulric Lund and Claudio Agostinelli (top-right), and code submitted to the R-help
mailing list by Karsten Bjerre (bottom-left). The plot at bottom-right is a ternary
plot produced using the vcd package (by David Meyer, Achim Zeileis, Alexandros
Karatzoglou, and Kurt Hornik).
[Figure 1.7 image: decision tree with splits at node 1 (vari, p < 0.001),
node 2 (vasg, p < 0.001), and node 3 (vart, p = 0.001); each of the four
terminal nodes shows the distribution of "glau" versus "norm".]
Figure 1.7
A novel decision tree plot, visualizing the distribution of the dependent variable in
each terminal node. Produced using the party package by Torsten Hothorn, Kurt
Hornik, and Achim Zeileis.
[Figure 1.8 image: table-like plot with columns "Centre" and "cases". Rows:
Carcinoma in situ (Thailand 327, Philippines 319); Invasive cancer (Colombia 96,
Spain 115); All 1673. The x-axis is labelled "OR" and runs from 0 to 4.]
Figure 1.8
A table-like plot produced using R. This is a typical presentation of the results
from a meta-analysis. The original motivation and data were provided by Martyn
Plummer[48].
[Figure 1.9 image: two diagrams, each with axes labelled X1, X2, and X3.]
Figure 1.9
Didactic diagrams produced using R and functions provided by Arden Miller. The
figures show a geometric representation of extensions to F-tests.
[Figure 1.10 image: music score titled "A Little Culture".]
Figure 1.10
A music score produced using R (code by Steven Miller).
Figure 1.11
A piece of clip art produced using R.
This section briefly describes how R’s graphics functions are organized so that
the user knows where to start looking for a particular function.
The R graphics system can be broken into four distinct levels: graphics
packages; graphics systems; a graphics engine, including standard graphics
devices; and graphics device packages (see Figure 1.12).
Level                        Packages
---------------------------  -------------------------
Graphics Packages            maps, ..., lattice, ...
Graphics Systems             graphics, grid
Graphics Engine & Devices    grDevices
Graphics Device Packages     gtkDevice, ...
Figure 1.12
The structure of the R graphics system showing the main packages that provide
graphics functions in R. Arrows indicate where one package builds on the functions
in another package. The packages described in this book are highlighted with thicker
borders and grey backgrounds.
Functions in the graphics systems and graphics packages can be broken down
into three main types: high-level functions that produce complete plots; low-
level functions that add further output to an existing plot; and functions for
working interactively with graphical output.
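The difference between the first two types of function can be sketched as follows. This is only an illustration: the output file is a temporary PDF chosen so that the code runs without a screen device, and the name out_file is an assumption made for the example.

```r
# Draw to a temporary PDF file so that no screen device is needed.
out_file <- tempfile(fileext = ".pdf")
pdf(file = out_file)

# High-level function: starts a new, complete plot
# (axes, data symbols, and bounding rectangle).
plot(pressure)

# Low-level functions: add further output to the existing plot.
lines(pressure$temperature, pressure$pressure, col = "grey")
text(150, 600, "Pressure (mm Hg)\nversus\nTemperature (Celsius)")

dev.off()
```

Functions of the third type, such as locator() in the traditional system, wait for mouse input and so are only useful on an interactive screen device.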
The traditional system and the graphics packages built on top of it provide
the majority of the high-level functions currently available in R. The most
significant exception is the lattice package (see Chapter 4), which provides
complete plots based on the grid system.
Both the traditional and grid systems provide many low-level graphics
functions, and grid also provides functions for interacting with graphical
output (editing, extracting, deleting parts of an image).
Most functions in graphics packages produce complete plots and typically offer
specialized plots for a specific sort of analysis or a specific field of study. For
example: the hexbin package[10] from the Bioconductor project has functions
for producing hexagonal binning plots for visualizing large amounts of data;
the maps package[7] provides functions for visualizing geographic data (see, for
example, Figure 1.5); and the scatterplot3d package[35] produces a variety
of 3-dimensional plots. If there is a need for a particular sort of plot, there
is a reasonable chance that someone has already written a function to do it.
For example, a common request on the R-help mailing list is for a way to
add error bars to scatterplots or barplots. This can be achieved via the
plotCI() function from the gplots package in the gregmisc bundle or the
errbar() function from the Hmisc package. There are some search facilities
linked from the main R home page to help find a particular function
for a particular purpose (also see Section A.2.10).
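When neither of those packages is installed, error bars can also be improvised with low-level functions from the traditional system. The sketch below uses arrows() with flat arrowheads rather than plotCI() or errbar(); the data values and the names x, y, se, and out_file are made up purely for illustration.

```r
# Hypothetical data: five means and their standard errors.
x  <- 1:5
y  <- c(2.1, 3.4, 2.9, 4.8, 5.2)
se <- c(0.3, 0.5, 0.4, 0.6, 0.2)

out_file <- tempfile(fileext = ".pdf")
pdf(file = out_file)

# Leave enough vertical room for the bars, then draw the points.
plot(x, y, ylim = range(y - se, y + se), pch = 16)

# angle = 90 with code = 3 draws a flat "head" at both ends of each
# segment, which gives the usual error-bar shape.
arrows(x, y - se, x, y + se, angle = 90, code = 3, length = 0.05)

dev.off()
```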
While there is no detailed discussion of the high-level graphics functions in
graphics packages other than lattice, the general comments in Chapter 2
concerning the behavior of high-level functions in the traditional graphics
system will often apply as well to high-level graphics functions in graphics
packages built on the traditional system.
The existence of two distinct graphics systems in R raises the issue of when
to use each system.
For the purpose of producing complete plots from a single function call, which
graphics system to use will largely depend on what type of plot is required.
The choice of graphics system is largely irrelevant if no further output needs
to be added to the plot.
If it is necessary to add further output to a plot, the most important thing to
know is which graphics system was used to produce the original plot. In
general, the same graphics system should be used to add further output (though
see Appendix B for ways around this).
In some cases, the same sort of plot can be produced by both lattice and
traditional functions. The lattice versions offer more flexibility for adding
further output and for interacting with the plot, plus Trellis plots have a
better design in terms of visually decoding the information in the plot.
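As a rough illustration of the same sort of plot in both systems, the sketch below draws the pressure data once with the traditional plot() function and once with xyplot() from lattice. It assumes that the lattice package is installed; the names out_file and trellis_plot are introduced only for this example.

```r
library(lattice)

out_file <- tempfile(fileext = ".pdf")
pdf(file = out_file)

# Traditional graphics: the plot is drawn as a side effect of the call.
plot(pressure)

# lattice: xyplot() returns an object of class "trellis", which is
# drawn (on a new page) when it is printed.
trellis_plot <- xyplot(pressure ~ temperature, data = pressure)
print(trellis_plot)

dev.off()
```

Because the lattice plot is an object, it can be stored, modified, and drawn later, which is part of what makes further output and interaction more flexible.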
For producing graphical scenes starting from a blank page, the grid system
offers the benefit of a much wider range of possibilities, at the cost of having
to learn a few additional concepts.
For the purpose of writing new graphical functions for others to use, grid
again provides better support for producing more general output that can be
combined with other output more easily. Grid also provides more possibilities
for interaction.
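A very small grid scene, built from a blank page, might look like the following. This is only a sketch: the viewport size, the label text, and the name out_file are arbitrary choices for the example, and the viewport is one of the additional concepts that grid introduces.

```r
library(grid)

out_file <- tempfile(fileext = ".pdf")
pdf(file = out_file)

grid.newpage()                                      # start from a blank page
pushViewport(viewport(width = 0.8, height = 0.8))   # a region within the page
grid.rect(gp = gpar(col = "grey"))                  # border of that region
grid.text("A scene built with grid")                # text centred in the region
popViewport()                                       # return to the whole page

dev.off()
```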
At the start of this chapter, there is a simple example of the sort of R
expressions that are required to produce a plot. When using R interactively,
the result is a plot drawn on screen. However, it is also possible to produce
a file that contains the plot, for example, as a PostScript document. This
section describes how to control the format in which a plot is produced.
R graphics output can be produced in a wide variety of graphical formats.
In R’s terminology, output is directed to a particular output device and that
dictates the output format that will be produced. A device must be created or
“opened” in order to receive graphical output and, for devices that create a file
on disk, the device must also be closed in order to complete the output. For
example, for producing PostScript output, R has a function postscript()
that opens a file to receive PostScript commands. Graphical output sent to
this device is recorded by writing PostScript commands into the file. The
function dev.off() closes a device.
The following code shows how to produce a simple scatterplot in PostScript
format. The output is stored in a file called myplot.ps:
> postscript(file="myplot.ps")
> plot(pressure)
> dev.off()
To produce the same output in PNG format (in a file called myplot.png), the
code simply becomes:
> png(file="myplot.png")
> plot(pressure)
> dev.off()
Table 1.1
Graphics formats that R supports and the functions that open
an appropriate graphics device
Device function     Graphical format
------------------  --------------------------------
Screen/GUI Devices
  x11() or X11()    X Window window
  windows()         Microsoft Windows window
  quartz()          Mac OS X Quartz window
File Devices
  postscript()      Adobe PostScript file
  pdf()             Adobe PDF file
  pictex()          LaTeX PicTeX file
  xfig()            XFIG file
  bitmap()          GhostScript conversion to file
  png()             PNG bitmap file
  jpeg()            JPEG bitmap file
 (Windows only)
  win.metafile()    Windows Metafile file
  bmp()             Windows BMP file
Table 1.1 gives a full list of functions that open devices and the output formats
that they correspond to.
All of these functions provide several arguments to allow the user to specify
things such as the physical size of the window or document being created. The
documentation for individual functions should be consulted for descriptions
of these arguments.
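For instance, the pdf() function takes width and height arguments, measured in inches. The sketch below is an illustration only; the 6 by 4 inch size and the names out_file and size_in are arbitrary choices made for the example.

```r
out_file <- tempfile(fileext = ".pdf")

# Open a PDF device that is 6 inches wide and 4 inches high.
pdf(file = out_file, width = 6, height = 4)

# dev.size() reports the size of the active device.
size_in <- dev.size("in")

plot(pressure)
dev.off()

print(size_in)
```

Units differ between devices (bitmap devices such as png() default to sizes in pixels), so the help page for each device function remains the place to check.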
It is possible to have more than one device open at the same time, but only
one device is currently “active” and all graphics output is sent to that device.
If multiple devices are open, there are functions to control which device is
active. The list of open devices can be obtained using dev.list(). This gives
the name (the device format) and number for each open device. The function
dev.cur() returns this information only for the currently active device. The
dev.set() function can be used to make a device active, by specifying the
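The device-management functions can be tried out with two file devices open at once. This is a minimal sketch under the assumption that temporary PDF files are acceptable stand-ins for real output; all variable names are introduced only for the example.

```r
file1 <- tempfile(fileext = ".pdf")
file2 <- tempfile(fileext = ".pdf")

pdf(file = file1)
dev1 <- dev.cur()            # name and number of the first device

pdf(file = file2)
dev2 <- dev.cur()            # the most recently opened device is active

open_devices <- dev.list()   # all open devices

dev.set(dev1)                # make the first device active again
active <- dev.cur()
plot(pressure)               # this output goes to file1

# Close both devices explicitly by number.
dev.off(dev2)
dev.off(dev1)

print(open_devices)
print(active)
```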
For a screen device, starting a new page involves clearing the window before
producing more output. On Windows there is a facility for returning to
previous screens of output (see the “History” menu, which is available when a
graphics window has focus), but on most screen devices, the output of
previous pages is lost.
For file devices, the output format dictates whether multiple pages are
supported. For example, PostScript and PDF allow multiple pages, but PNG does
not. It is usually possible, especially for devices that do not support multiple
pages of output, to specify that each page of output produces a separate file.
This is achieved by specifying the argument onefile=FALSE when opening
a device and specifying a pattern for the file name like file="myplot%03d"
so that the %03d is replaced by a three-digit number (padded with zeroes)
indicating the “page number” for each file that is created.
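The file name pattern can be seen in action with the pdf() device. The following sketch writes two pages into a temporary directory; the directory, the second plot, and the names out_dir and page_files are assumptions made for the illustration.

```r
# A fresh temporary directory to hold the per-page files.
out_dir <- tempfile("pages")
dir.create(out_dir)

# onefile = FALSE asks for one file per page; %03d becomes the page number.
pdf(file = file.path(out_dir, "myplot%03d.pdf"), onefile = FALSE)

plot(pressure)                                   # page 1
plot(pressure$temperature, log(pressure$pressure),
     xlab = "temperature", ylab = "log(pressure)")   # page 2

dev.off()

page_files <- list.files(out_dir)
print(page_files)
```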
R maintains a display list for each open device, which is a record of the output
on the current page of a device. This is used to redraw the output when
a device is resized and can also be used to copy output from one device to
another.
The function dev.copy() copies all output from the active device to another
device. The copy may be distorted if the aspect ratio of the destination device
(the ratio of its physical height to its width) is not the same as
the aspect ratio of the active device. The function dev.copy2eps() is similar
to dev.copy(), but it preserves the aspect ratio of the copy and creates a file
in EPS (Encapsulated PostScript) format that is ideal for embedding in other
documents (e.g., a LaTeX document). The dev2bitmap() function is similar
in that it also tries to preserve the aspect ratio of the image, but it produces
one of the output formats available via the bitmap() device.
The function dev.print() attempts to print the output on the active device.
By default, this involves making a PostScript copy and then invoking the print
command given by options("printcmd").
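Copying between devices can be sketched without a screen by using two PDF files. One assumption here deserves emphasis: file devices do not normally record a display list, so the example switches recording on with dev.control() before drawing. In interactive use the source would more typically be a screen device, where this step is not needed. All file and variable names are made up for the example.

```r
src_file  <- tempfile(fileext = ".pdf")
copy_file <- tempfile(fileext = ".pdf")

# Source device; recording must be enabled for a file device
# so that there is a display list to copy from.
pdf(file = src_file, width = 7, height = 7)
src_dev <- dev.cur()
dev.control(displaylist = "enable")

plot(pressure)

# Open a second PDF device with the same size (hence the same aspect
# ratio) and replay the display list into it. The copy becomes active.
dev.copy(pdf, file = copy_file, width = 7, height = 7)
copy_dev <- dev.cur()

dev.off(copy_dev)
dev.off(src_dev)
```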
The display list can consume a reasonable amount of memory if a plot is
particularly complex or if there are very many devices open at the same time.
For this reason it is possible to disable the display list, by typing the
expression dev.control(displaylist="inhibit"). If the display list is disabled,
output will not be redrawn when a device is resized, and output cannot be
copied between devices.
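Whether recording is currently switched on for the active device can be checked with dev.displaylist(). The sketch below toggles the setting on a temporary PDF device; the names tmp_file, recording_on, and recording_off are introduced only for illustration.

```r
tmp_file <- tempfile(fileext = ".pdf")
pdf(file = tmp_file)

# Switch recording on, then query the state of the active device.
dev.control(displaylist = "enable")
recording_on <- dev.displaylist()

# Switch recording off again; later output is drawn but not recorded.
dev.control(displaylist = "inhibit")
recording_off <- dev.displaylist()

plot(pressure)
dev.off()

print(c(enabled = recording_on, inhibited = recording_off))
```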
Chapter summary