Data visualization with ggplot2 :: CHEATSHEET

Basics
ggplot2 is based on the grammar of graphics, the idea that you can build every graph from the same components: a data set, a coordinate system, and geoms (visual marks that represent data points).

Geoms
Use a geom function to represent data points, and use the geom's aesthetic properties to represent variables. Each function returns a layer.

GRAPHICAL PRIMITIVES
a <- ggplot(economics, aes(date, unemploy))
b <- ggplot(seals, aes(x = long, y = lat))

TWO VARIABLES
both continuous
e <- ggplot(mpg, aes(cty, hwy))
continuous bivariate distribution
h <- ggplot(diamonds, aes(carat, price))
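As a minimal sketch of the three components named under Basics, the following builds one plot from a data set, a geom, and a coordinate system. It assumes ggplot2 is installed and uses its bundled mpg data. The name basic_plot is illustrative, and coord_cartesian() is written out only to make the default coordinate system visible.

```r
library(ggplot2)

# Data set: mpg (ships with ggplot2), with cty and hwy mapped to x and y
# Geom: points; coordinate system: Cartesian (the default, spelled out here)
basic_plot <- ggplot(mpg, aes(cty, hwy)) +
  geom_point() +
  coord_cartesian()

basic_plot
```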

[Figure: data + geom (x = F, y = A) + coordinate system = plot]

GRAPHICAL PRIMITIVES (continued)
a + geom_blank() and a + expand_limits() - Ensure limits include values across all plots.
b + geom_curve(aes(yend = lat + 1, xend = long + 1), curvature = 1) - x, xend, y, yend, alpha, angle, color, curvature, linetype, size
a + geom_path(lineend = "butt", linejoin = "round", linemitre = 1) - x, y, alpha, color, group, linetype, size
a + geom_polygon(aes(alpha = 50)) - x, y, alpha, color, fill, group, subgroup, linetype, size

TWO VARIABLES, both continuous (continued)
e + geom_label(aes(label = cty), nudge_x = 1, nudge_y = 1) - x, y, label, alpha, angle, color, family, fontface, hjust, lineheight, size, vjust
e + geom_point() - x, y, alpha, color, fill, shape, size, stroke
e + geom_quantile() - x, y, alpha, color, group, linetype, size, weight
e + geom_rug(sides = "bl") - x, y, alpha, color, linetype, size

TWO VARIABLES, continuous bivariate distribution (continued)
h + geom_bin2d(binwidth = c(0.25, 500)) - x, y, alpha, color, fill, linetype, size, weight
h + geom_density_2d() - x, y, alpha, color, group, linetype, size
h + geom_hex() - x, y, alpha, color, fill, size

TWO VARIABLES, continuous function
i <- ggplot(economics, aes(date, unemploy))

To display values, map variables in the data to visual properties of the geom (aesthetics) like size, color, and x and y locations.
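To illustrate mapping variables to aesthetics, here is a small sketch that contrasts mapping (inside aes()) with setting a fixed value (outside aes()). The choice of class and displ as the mapped variables is an assumption made for the example, not something the cheatsheet prescribes.

```r
library(ggplot2)

# Inside aes(): variables from the data drive color and size
# Outside aes(): alpha is set to one fixed value for every point
mapped <- ggplot(mpg, aes(x = cty, y = hwy)) +
  geom_point(aes(color = class, size = displ), alpha = 0.6)

mapped
```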

[Figure: data + geom (x = F, y = A, color = F, size = A) + coordinate system = plot]

GRAPHICAL PRIMITIVES (continued)
b + geom_rect(aes(xmin = long, ymin = lat, xmax = long + 1, ymax = lat + 1)) - xmax, xmin, ymax, ymin, alpha, color, fill, linetype, size
a + geom_ribbon(aes(ymin = unemploy - 900, ymax = unemploy + 900)) - x, ymax, ymin, alpha, color, fill, group, linetype, size

LINE SEGMENTS
common aesthetics: x, y, alpha, color, linetype, size
b + geom_abline(aes(intercept = 0, slope = 1))
b + geom_hline(aes(yintercept = lat))
b + geom_vline(aes(xintercept = long))
b + geom_segment(aes(yend = lat + 1, xend = long + 1))
b + geom_spoke(aes(angle = 1:1155, radius = 1))

ONE VARIABLE
continuous
c <- ggplot(mpg, aes(hwy)); c2 <- ggplot(mpg)

TWO VARIABLES, both continuous (continued)
e + geom_smooth(method = lm) - x, y, alpha, color, fill, group, linetype, size, weight
e + geom_text(aes(label = cty), nudge_x = 1, nudge_y = 1) - x, y, label, alpha, angle, color, family, fontface, hjust, lineheight, size, vjust

TWO VARIABLES, one discrete, one continuous
f <- ggplot(mpg, aes(class, hwy))
f + geom_col() - x, y, alpha, color, fill, group, linetype, size
f + geom_boxplot() - x, y, lower, middle, upper, ymax, ymin, alpha, color, fill, group, linetype, shape, size, weight
f + geom_dotplot(binaxis = "y", stackdir = "center") - x, y, alpha, color, fill, group

TWO VARIABLES, continuous function (continued)
i + geom_area() - x, y, alpha, color, fill, linetype, size
i + geom_line() - x, y, alpha, color, group, linetype, size
i + geom_step(direction = "hv") - x, y, alpha, color, group, linetype, size

TWO VARIABLES, visualizing error
df <- data.frame(grp = c("A", "B"), fit = 4:5, se = 1:2)
j <- ggplot(df, aes(grp, fit, ymin = fit - se, ymax = fit + se))
j + geom_crossbar(fatten = 2) - x, y, ymax, ymin, alpha, color, fill, group, linetype, size
j + geom_errorbar() - x, ymax, ymin, alpha, color, group, linetype, size, width. Also geom_errorbarh().
j + geom_linerange() - x, ymin, ymax, alpha, color, group, linetype, size

Complete the template below to build a graph.
ggplot(data = <DATA>) +
<GEOM_FUNCTION>(mapping = aes(<MAPPINGS>), stat = <STAT>, position = <POSITION>) +
<COORDINATE_FUNCTION> +
<FACET_FUNCTION> +
<SCALE_FUNCTION> +
<THEME_FUNCTION>
The data, the geom function, and its mappings are required. The stat, position, coordinate, facet, scale, and theme parts are not required; sensible defaults are supplied.

ggplot(data = mpg, aes(x = cty, y = hwy)) Begins a plot that you finish by adding layers to. Add one geom function per layer.
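One possible way to fill in every slot of the template is sketched below. The particular choices (jittered points, facets by drv, a renamed x axis, theme_bw()) are assumptions made for illustration; only the first two lines are required, and the rest could be dropped to fall back on the defaults.

```r
library(ggplot2)

filled_template <- ggplot(data = mpg) +                   # <DATA>
  geom_point(mapping = aes(x = cty, y = hwy),             # <GEOM_FUNCTION>, <MAPPINGS>
             stat = "identity", position = "jitter") +    # <STAT>, <POSITION>
  coord_cartesian() +                                     # <COORDINATE_FUNCTION>
  facet_wrap(~ drv) +                                     # <FACET_FUNCTION>
  scale_x_continuous(name = "City miles per gallon") +    # <SCALE_FUNCTION>
  theme_bw()                                              # <THEME_FUNCTION>

filled_template
```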
Basics (continued)
last_plot() Returns the last plot.
ggsave("plot.png", width = 5, height = 5) Saves last plot as a 5 in x 5 in file named "plot.png" in working directory. Matches file type to file extension.

ONE VARIABLE, continuous (continued)
c + geom_area(stat = "bin") - x, y, alpha, color, fill, linetype, size
c + geom_density(kernel = "gaussian") - x, y, alpha, color, fill, group, linetype, size, weight
c + geom_dotplot() - x, y, alpha, color, fill
c + geom_freqpoly() - x, y, alpha, color, group, linetype, size
c + geom_histogram(binwidth = 5) - x, y, alpha, color, fill, linetype, size, weight
c2 + geom_qq(aes(sample = hwy)) - x, y, alpha, color, fill, linetype, size, weight

ONE VARIABLE, discrete
d <- ggplot(mpg, aes(fl))
d + geom_bar() - x, alpha, color, fill, linetype, size, weight

TWO VARIABLES, both continuous (continued)
e + geom_jitter(height = 2, width = 2) - x, y, alpha, color, fill, shape, size

TWO VARIABLES, one discrete, one continuous (continued)
f + geom_violin(scale = "area") - x, y, alpha, color, fill, group, linetype, size, weight

TWO VARIABLES, both discrete
g <- ggplot(diamonds, aes(cut, color))
g + geom_count() - x, y, alpha, color, fill, shape, size, stroke

TWO VARIABLES, visualizing error (continued)
j + geom_pointrange() - x, y, ymin, ymax, alpha, color, fill, group, linetype, shape, size

TWO VARIABLES, maps
Draw the appropriate geometric object depending on the simple features present in the data. aes() arguments: map_id, alpha, color, fill, linetype, linewidth.
nc <- sf::st_read(system.file("shape/nc.shp", package = "sf"))
ggplot(nc) + geom_sf(aes(fill = AREA))

THREE VARIABLES
seals$z <- with(seals, sqrt(delta_long^2 + delta_lat^2)); l <- ggplot(seals, aes(long, lat))
l + geom_contour(aes(z = z)) - x, y, z, alpha, color, group, linetype, size, weight
l + geom_contour_filled(aes(fill = z)) - x, y, alpha, color, fill, group, linetype, size, subgroup
l + geom_raster(aes(fill = z), hjust = 0.5, vjust = 0.5, interpolate = FALSE) - x, y, alpha, fill
l + geom_tile(aes(fill = z)) - x, y, alpha, color, fill, linetype, size, width

Aes
Common aesthetic values.
color and fill - string ("red", "#RRGGBB")
linetype - integer or string (0 = "blank", 1 = "solid", 2 = "dashed", 3 = "dotted", 4 = "dotdash", 5 = "longdash", 6 = "twodash")
size - integer (in mm for size of points and text)
linewidth - integer (in mm for widths of lines)
shape - integer/shape name or a single character ("a")
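The aesthetic value formats listed under Aes can be tried out with a short sketch like the one below. The specific values (the hex color, shape 17, the sizes) are arbitrary choices for the example, and the linewidth argument assumes ggplot2 3.4.0 or later.

```r
library(ggplot2)

styled <- ggplot(economics, aes(date, unemploy)) +
  # color as a hex string, linetype by name, linewidth in mm
  geom_line(color = "#1B9E77", linetype = "dashed", linewidth = 0.8) +
  # color by name, shape as an integer code, size in mm
  geom_point(color = "red", shape = 17, size = 1)

styled
```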

CC BY SA Posit Software, PBC • [email protected] • posit.co • Learn more at ggplot2.tidyverse.org • HTML cheatsheets at pos.it/cheatsheets • ggplot2 3.5.1 • Updated: 2024-05
Stats
An alternative way to build a layer. A stat builds new variables to plot (e.g., count, prop).
[Figure: data (fl, cty, cyl) + stat (x, count) + geom (x = x, y = count) + coordinate system = plot]
Visualize a stat by changing the default stat of a geom function, geom_bar(stat = "count"), or by using a stat function, stat_count(geom = "bar"), which calls a default geom to make a layer (equivalent to a geom function). Use after_stat(name) syntax to map the stat variable name to an aesthetic.

Coordinate Systems
r <- d + geom_bar()
r + coord_cartesian(xlim = c(0, 5)) - xlim, ylim - The default cartesian coordinate system.
r + coord_fixed(ratio = 1/2) - ratio, xlim, ylim - Cartesian coordinates with fixed aspect ratio between x and y units.
r + coord_flip() - Flip cartesian coordinates by switching x and y aesthetic mappings.
r + coord_polar(theta = "x", direction = 1) - theta, start, direction - Polar coordinates.

Faceting
Facets divide a plot into subplots based on the values of one or more discrete variables.
t <- ggplot(mpg, aes(cty, hwy)) + geom_point()
t + facet_grid(. ~ fl) - Facet into columns based on fl.
t + facet_grid(year ~ .) - Facet into rows based on year.
t + facet_grid(year ~ fl) - Facet into both rows and columns.

Scales
Override defaults with scales package. Scales map data values to the visual values of an aesthetic. To change a mapping, add a new scale.
n <- d + geom_bar(aes(fill = fl))
n + scale_fill_manual(values = c("skyblue", "royalblue", "blue", "navy"), limits = c("d", "e", "p", "r"), breaks = c("d", "e", "p", "r"), name = "fuel", labels = c("D", "E", "P", "R"))
In this call, scale_ is followed by the aesthetic to adjust (fill) and the prepackaged scale to use (manual); the remaining arguments are scale-specific: limits is the range of values to include in the mapping, name is the title to use in the legend/axis, labels are the labels to use in the legend/axis, and breaks are the breaks to use in the legend/axis.

GENERAL PURPOSE SCALES
Use with most aesthetics
scale_*_continuous() - Map continuous values to visual ones.
scale_*_discrete() - Map discrete values to visual ones.
scale_*_binned() - Map continuous values to discrete bins.
scale_*_identity() - Use data values as visual ones.

Coordinate Systems (continued)
r + coord_trans(y = "sqrt") - x, y, xlim, ylim - Transformed cartesian coordinates. Set x and y to the name of a transformation function.

Faceting (continued)
t + facet_wrap(~ fl) - Wrap facets into a rectangular layout.
Set scales to let axis limits vary across facets.
t + facet_grid(drv ~ fl, scales = "free")

Stats (continued)
i + stat_density_2d(aes(fill = after_stat(level)), geom = "polygon")
Here stat_density_2d() is the stat function, geom = "polygon" is the geom to use, aes() holds the geom mappings, and level is the variable created by the stat.
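The two routes to a stat layer described under Stats, and the after_stat() syntax, can be sketched as follows. The object names are illustrative, and using prop with group = 1 is just one common way to show a computed variable; layer_data() is used here only to inspect what the stat produced.

```r
library(ggplot2)

# Two equivalent ways to draw a bar chart of counts for mpg$fl
by_geom <- ggplot(mpg, aes(fl)) + geom_bar(stat = "count")
by_stat <- ggplot(mpg, aes(fl)) + stat_count(geom = "bar")

# after_stat() maps a variable computed by the stat (here prop) to y;
# group = 1 makes the proportions relative to the whole data set
props <- ggplot(mpg, aes(fl)) +
  geom_bar(aes(y = after_stat(prop), group = 1))

# layer_data() returns the data frame the stat produced
head(layer_data(props)[, c("x", "count", "prop")])
```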
GENERAL PURPOSE SCALES (continued)
scale_*_manual(values = c()) - Map discrete values to manually chosen visual ones.
scale_*_date(date_labels = "%m/%d", date_breaks = "2 weeks") - Treat data values as dates.
scale_*_datetime() - Treat data values as date times. Same as scale_*_date(). See ?strptime for label formats.

X & Y LOCATION SCALES
Use with x or y aesthetics (x shown here)
scale_x_log10() - Plot x on log10 scale.
scale_x_reverse() - Reverse the direction of the x axis.
scale_x_sqrt() - Plot x on square root scale.

Stats (continued)
c + stat_bin(binwidth = 1, boundary = 10) - x, y | count, ncount, density, ndensity
c + stat_count(width = 1) - x, y | count, prop
c + stat_density(adjust = 1, kernel = "gaussian") - x, y | count, density, scaled
e + stat_bin_2d(bins = 30, drop = T) - x, y, fill | count, density
e + stat_bin_hex(bins = 30) - x, y, fill | count, density
e + stat_density_2d(contour = TRUE, n = 100) - x, y, color, size | level
e + stat_ellipse(level = 0.95, segments = 51, type = "t")
l + stat_contour(aes(z = z)) - x, y, z, order | level

Coordinate Systems (continued)
π + coord_sf() - xlim, ylim, crs. Ensures all layers use a common Coordinate Reference System.

Position Adjustments
Position adjustments determine how to arrange geoms that would otherwise occupy the same space.
s <- ggplot(mpg, aes(fl, fill = drv))
s + geom_bar(position = "dodge") - Arrange elements side by side.

Faceting (continued)
With scales = "free", x and y axis limits adjust to individual facets.
"free_x" - x axis limits adjust
"free_y" - y axis limits adjust
Set labeller to adjust facet label:
t + facet_grid(. ~ fl, labeller = label_both) - strips read fl: c, fl: d, fl: e, fl: p, fl: r
t + facet_grid(fl ~ ., labeller = label_bquote(alpha ^ .(fl))) - strips read α^c, α^d, α^e, α^p, α^r
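The faceting options above (a row and column grid, free scales, and a labeller) can be combined in one call. This is only a sketch: the object name faceted is an assumption, and t is redefined here so the block runs on its own.

```r
library(ggplot2)

t <- ggplot(mpg, aes(cty, hwy)) + geom_point()

# Rows by drv, columns by fl; axis limits may vary across facets,
# and label_both prints "variable: value" in the strips
faceted <- t + facet_grid(drv ~ fl, scales = "free", labeller = label_both)

faceted
```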
Stats (continued)
l + stat_summary_hex(aes(z = z), bins = 30, fun = max) - x, y, z, fill | value
l + stat_summary_2d(aes(z = z), bins = 30, fun = mean) - x, y, z, fill | value
f + stat_boxplot(coef = 1.5) - x, y | lower, middle, upper, width, ymin, ymax

COLOR AND FILL SCALES (DISCRETE)
n + scale_fill_brewer(palette = "Blues") - For palette choices: RColorBrewer::display.brewer.all()
n + scale_fill_grey(start = 0.2, end = 0.8, na.value = "red")

Position Adjustments (continued)
s + geom_bar(position = "fill") - Stack elements on top of one another, normalize height.
e + geom_point(position = "jitter") - Add random noise to X and Y position of each element to avoid overplotting.
e + geom_label(position = "nudge") - Nudge labels away from points.

Labels and Legends
Use labs() to label the elements of your plot.
t + labs(x = "New x axis label", y = "New y axis label", title = "Add a title above the plot", subtitle = "Add a subtitle below title", caption = "Add a caption below plot", alt = "Add alt text to the plot", <aes> = "New <aes> legend title")
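A concrete labs() call might look like the sketch below. The label strings and the added color = drv mapping are assumptions chosen so that the <aes> legend-title slot has something to rename.

```r
library(ggplot2)

labelled <- ggplot(mpg, aes(cty, hwy, color = drv)) +
  geom_point() +
  labs(x = "City miles per gallon",
       y = "Highway miles per gallon",
       title = "Fuel economy in the mpg data",
       caption = "Data: ggplot2::mpg",
       color = "Drive train")   # <aes> = legend title for the color aesthetic

labelled
```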
Stats (continued)
f + stat_ydensity(kernel = "gaussian", scale = "area") - x, y | density, scaled, count, n, violinwidth, width
e + stat_ecdf(n = 40) - x, y | x, y
e + stat_quantile(quantiles = c(0.1, 0.9), formula = y ~ log(x), method = "rq") - x, y | quantile
e + stat_smooth(method = "lm", formula = y ~ x, se = T, level = 0.95) - x, y | se, x, y, ymin, ymax

COLOR AND FILL SCALES (CONTINUOUS)
o <- c + geom_dotplot(aes(fill = x))
o + scale_fill_distiller(palette = "Blues")
o + scale_fill_gradient(low = "red", high = "yellow")

Labels and Legends (continued)
t + annotate(geom = "text", x = 8, y = 9, label = "A") - Places a geom with manually selected aesthetics.
p + guides(x = guide_axis(n.dodge = 2)) - Avoid crowded or overlapping labels with guide_axis(n.dodge or angle).
n + guides(fill = "none") - Set legend type for each aesthetic: colorbar, legend, or none (no legend).

Position Adjustments (continued)
s + geom_bar(position = "stack") - Stack elements on top of one another.
Each position adjustment can be recast as a function with manual width and height arguments:
s + geom_bar(position = position_dodge(width = 1))
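To compare the bar position adjustments side by side, a sketch along these lines may help. The object names are illustrative, s is redefined so the block is self-contained, and layer_data() is used only to inspect the computed bar heights.

```r
library(ggplot2)

s <- ggplot(mpg, aes(fl, fill = drv))

dodged    <- s + geom_bar(position = "dodge")   # bars side by side
stacked   <- s + geom_bar(position = "stack")   # bars stacked on top of one another
filled    <- s + geom_bar(position = "fill")    # stacked and normalized in height
dodged_fn <- s + geom_bar(position = position_dodge(width = 1))  # function form

# Tallest bar top in each layout
c(dodge = max(layer_data(dodged)$ymax),
  stack = max(layer_data(stacked)$ymax),
  fill  = max(layer_data(filled)$ymax))
```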

Stats (continued)
ggplot() + xlim(-5, 5) + stat_function(fun = dnorm, n = 20, geom = "point") - x | x, y
ggplot() + stat_qq(aes(sample = 1:100)) - x, y, sample | sample, theoretical
e + stat_sum() - x, y, size | n, prop
e + stat_summary(fun.data = "mean_cl_boot")
h + stat_summary_bin(fun = "mean", geom = "bar")

COLOR AND FILL SCALES (CONTINUOUS) (continued)
o + scale_fill_gradient2(low = "red", high = "blue", mid = "white", midpoint = 25)
o + scale_fill_gradientn(colors = topo.colors(6)) - Also: rainbow(), heat.colors(), terrain.colors(), cm.colors(), RColorBrewer::brewer.pal()

Themes
r + theme_bw() - White background with grid lines.
r + theme_classic()
r + theme_light()

Labels and Legends (continued)
n + theme(legend.position = "bottom") - Place legend at "bottom", "top", "left", or "right".
n + scale_fill_discrete(name = "Title", labels = c("A", "B", "C", "D", "E")) - Set legend title and labels with a scale function.
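The legend controls in this sheet can be combined as in the sketch below. The legend title "Fuel type" and the upper-case labels are placeholder choices for the example, and n is redefined so the block runs on its own.

```r
library(ggplot2)

n <- ggplot(mpg, aes(fl)) + geom_bar(aes(fill = fl))

# Move the legend below the plot and rename its title and entries
with_legend <- n +
  theme(legend.position = "bottom") +
  scale_fill_discrete(name = "Fuel type",
                      labels = c("C", "D", "E", "P", "R"))

# Drop the fill legend entirely
no_legend <- n + guides(fill = "none")

with_legend
```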
Stats (continued)
e + stat_identity()
e + stat_unique()

SHAPE AND SIZE SCALES
p <- e + geom_point(aes(shape = fl, size = cyl))
p + scale_shape() + scale_size()
p + scale_shape_manual(values = c(3:7))
p + scale_radius(range = c(1, 6))
p + scale_size_area(max_size = 6)

Themes (continued)
r + theme_gray() - Grey background (default theme).
r + theme_dark() - Dark for contrast.
r + theme_linedraw()
r + theme_minimal() - Minimal theme.
r + theme_void() - Empty theme.
r + theme() - Customize aspects of the theme such as axis, legend, panel, and facet properties.
r + ggtitle("Title") + theme(plot.title.position = "plot")
r + theme(panel.background = element_rect(fill = "blue"))

Zooming
Without clipping (preferred):
t + coord_cartesian(xlim = c(0, 100), ylim = c(10, 20))
With clipping (removes unseen data points):
t + xlim(0, 100) + ylim(10, 20)
t + scale_x_continuous(limits = c(0, 100)) + scale_y_continuous(limits = c(0, 100))
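The difference between the two zooming approaches shows up in the built plot data. In the sketch below, the limits c(10, 20) and the object names are assumptions picked so that some mpg rows fall outside the range. With coord_cartesian() every x value is kept, while with xlim() out-of-range values are replaced by NA before drawing.

```r
library(ggplot2)

t <- ggplot(mpg, aes(cty, hwy)) + geom_point()

zoomed  <- t + coord_cartesian(xlim = c(10, 20))  # view changes, data kept
clipped <- t + xlim(10, 20)                       # out-of-range x values become NA

# Count the x values that survive in each built plot
kept_zoomed  <- sum(!is.na(layer_data(zoomed)$x))
kept_clipped <- sum(!is.na(layer_data(clipped)$x))
c(zoomed = kept_zoomed, clipped = kept_clipped)
```

This matters most for layers that compute a stat (a smoother or a boxplot, for instance), since clipping changes the data the stat sees.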
