Data Visualization with ggplot2 : : CHEAT SHEET

Basics

ggplot2 is based on the grammar of graphics, the idea that you can build every graph from the same components: a data set, a coordinate system, and geoms, visual marks that represent data points.

[Figure: data (F, M, A) + geom (x = F, y = A) + coordinate system = plot]

To display values, map variables in the data to visual properties of the geom (aesthetics) like size, color, and x and y locations.

[Figure: data (F, M, A) + geom (x = F, y = A, color = F, size = A) + coordinate system = plot]

Complete the template below to build a graph. Required: the data and the geom function with its mappings. Not required (sensible defaults supplied): stat, position, and the coordinate, facet, scale, and theme functions.

ggplot (data = <DATA> ) +
<GEOM_FUNCTION> (mapping = aes( <MAPPINGS> ),
stat = <STAT> , position = <POSITION> ) +
<COORDINATE_FUNCTION> +
<FACET_FUNCTION> +
<SCALE_FUNCTION> +
<THEME_FUNCTION>

ggplot(data = mpg, aes(x = cty, y = hwy)) Begins a plot that you finish by adding layers to. Add one geom function per layer.

qplot(x = cty, y = hwy, data = mpg, geom = "point") Creates a complete plot with given data, geom, and mappings (arguments in order: aesthetic mappings, data, geom). Supplies many useful defaults.

last_plot() Returns the last plot

ggsave("plot.png", width = 5, height = 5) Saves last plot as a 5 in x 5 in file named "plot.png" in working directory. Matches file type to file extension.

Geoms

Use a geom function to represent data points, use the geom's aesthetic properties to represent variables. Each function returns a layer.

GRAPHICAL PRIMITIVES
a <- ggplot(economics, aes(date, unemploy))
b <- ggplot(seals, aes(x = long, y = lat))

a + geom_blank() (Useful for expanding limits)
b + geom_curve(aes(yend = lat + 1, xend=long+1,curvature=z)) - x, xend, y, yend, alpha, angle, color, curvature, linetype, size
a + geom_path(lineend="butt", linejoin="round", linemitre=1) - x, y, alpha, color, group, linetype, size
a + geom_polygon(aes(group = group)) - x, y, alpha, color, fill, group, linetype, size
b + geom_rect(aes(xmin = long, ymin=lat, xmax= long + 1, ymax = lat + 1)) - xmax, xmin, ymax, ymin, alpha, color, fill, linetype, size
a + geom_ribbon(aes(ymin=unemploy - 900, ymax=unemploy + 900)) - x, ymax, ymin, alpha, color, fill, group, linetype, size

LINE SEGMENTS
common aesthetics: x, y, alpha, color, linetype, size

b + geom_abline(aes(intercept=0, slope=1))
b + geom_hline(aes(yintercept = lat))
b + geom_vline(aes(xintercept = long))
b + geom_segment(aes(yend=lat+1, xend=long+1))
b + geom_spoke(aes(angle = 1:1155, radius = 1))

ONE VARIABLE

continuous
c <- ggplot(mpg, aes(hwy)); c2 <- ggplot(mpg)

c + geom_area(stat = "bin") - x, y, alpha, color, fill, linetype, size
c + geom_density(kernel = "gaussian") - x, y, alpha, color, fill, group, linetype, size, weight
c + geom_dotplot() - x, y, alpha, color, fill
c + geom_freqpoly() - x, y, alpha, color, group, linetype, size
c + geom_histogram(binwidth = 5) - x, y, alpha, color, fill, linetype, size, weight
c2 + geom_qq(aes(sample = hwy)) - x, y, alpha, color, fill, linetype, size, weight

discrete
d <- ggplot(mpg, aes(fl))

d + geom_bar() - x, alpha, color, fill, linetype, size, weight

TWO VARIABLES

continuous x , continuous y
e <- ggplot(mpg, aes(cty, hwy))

e + geom_label(aes(label = cty), nudge_x = 1, nudge_y = 1, check_overlap = TRUE) - x, y, label, alpha, angle, color, family, fontface, hjust, lineheight, size, vjust
e + geom_jitter(height = 2, width = 2) - x, y, alpha, color, fill, shape, size
e + geom_point() - x, y, alpha, color, fill, shape, size, stroke
e + geom_quantile() - x, y, alpha, color, group, linetype, size, weight
e + geom_rug(sides = "bl") - x, y, alpha, color, linetype, size
e + geom_smooth(method = lm) - x, y, alpha, color, fill, group, linetype, size, weight
e + geom_text(aes(label = cty), nudge_x = 1, nudge_y = 1, check_overlap = TRUE) - x, y, label, alpha, angle, color, family, fontface, hjust, lineheight, size, vjust

discrete x , continuous y
f <- ggplot(mpg, aes(class, hwy))

f + geom_col() - x, y, alpha, color, fill, group, linetype, size
f + geom_boxplot() - x, y, lower, middle, upper, ymax, ymin, alpha, color, fill, group, linetype, shape, size, weight
f + geom_dotplot(binaxis = "y", stackdir = "center") - x, y, alpha, color, fill, group
f + geom_violin(scale = "area") - x, y, alpha, color, fill, group, linetype, size, weight

discrete x , discrete y
g <- ggplot(diamonds, aes(cut, color))

g + geom_count() - x, y, alpha, color, fill, shape, size, stroke

continuous bivariate distribution
h <- ggplot(diamonds, aes(carat, price))

h + geom_bin2d(binwidth = c(0.25, 500)) - x, y, alpha, color, fill, linetype, size, weight
h + geom_density2d() - x, y, alpha, colour, group, linetype, size
h + geom_hex() - x, y, alpha, colour, fill, size

continuous function
i <- ggplot(economics, aes(date, unemploy))

i + geom_area() - x, y, alpha, color, fill, linetype, size
i + geom_line() - x, y, alpha, color, group, linetype, size
i + geom_step(direction = "hv") - x, y, alpha, color, group, linetype, size

visualizing error
df <- data.frame(grp = c("A", "B"), fit = 4:5, se = 1:2)
j <- ggplot(df, aes(grp, fit, ymin = fit-se, ymax = fit+se))

j + geom_crossbar(fatten = 2) - x, y, ymax, ymin, alpha, color, fill, group, linetype, size
j + geom_errorbar() - x, ymax, ymin, alpha, color, group, linetype, size, width (also geom_errorbarh())
j + geom_linerange() - x, ymin, ymax, alpha, color, group, linetype, size
j + geom_pointrange() - x, y, ymin, ymax, alpha, color, fill, group, linetype, shape, size

maps
data <- data.frame(murder = USArrests$Murder, state = tolower(rownames(USArrests)))
map <- map_data("state")
k <- ggplot(data, aes(fill = murder))

k + geom_map(aes(map_id = state), map = map) + expand_limits(x = map$long, y = map$lat) - map_id, alpha, color, fill, linetype, size

THREE VARIABLES
seals$z <- with(seals, sqrt(delta_long^2 + delta_lat^2))
l <- ggplot(seals, aes(long, lat))

l + geom_contour(aes(z = z)) - x, y, z, alpha, colour, group, linetype, size, weight
l + geom_raster(aes(fill = z), hjust=0.5, vjust=0.5, interpolate=FALSE) - x, y, alpha, fill
l + geom_tile(aes(fill = z)) - x, y, alpha, color, fill, linetype, size, width
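As a rough illustration of the graph template from the Basics section, the sketch below fills every slot with one concrete choice. The specific choices (jittered points, faceting by fl, the "Dark2" palette, theme_bw()) are assumptions picked only for the example and are not prescribed by the cheat sheet; the data set is the mpg data used throughout.

```r
library(ggplot2)

# Each line corresponds to one slot of the template; the values are illustrative.
p <- ggplot(data = mpg) +                                   # <DATA>
  geom_point(mapping = aes(x = cty, y = hwy, color = drv),  # <GEOM_FUNCTION>, <MAPPINGS>
             stat = "identity", position = "jitter") +      # <STAT>, <POSITION>
  coord_cartesian() +                                       # <COORDINATE_FUNCTION>
  facet_wrap(~ fl) +                                        # <FACET_FUNCTION>
  scale_color_brewer(palette = "Dark2") +                   # <SCALE_FUNCTION>
  theme_bw()                                                # <THEME_FUNCTION>

print(p)
```

Only the first two lines are needed to get a plot; dropping any of the later lines falls back to the default coordinate system, facets, scales, and theme.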

RStudio® is a trademark of RStudio, Inc. • CC BY SA RStudio • 844-448-1212 • rstudio.com • Learn more at http://ggplot2.tidyverse.org • ggplot2 2.1.0 • Updated: 2016-11
Stats

An alternative way to build a layer

A stat builds new variables to plot (e.g., count, prop).

[Figure: data (fl, cty, cyl) + stat (x, ..count..) + geom (x = x, y = ..count..) + coordinate system = plot]

Visualize a stat by changing the default stat of a geom function, geom_bar(stat="count"), or by using a stat function, stat_count(geom="bar"), which calls a default geom to make a layer (equivalent to a geom function). Use ..name.. syntax to map stat variables to aesthetics.

i + stat_density2d(aes(fill = ..level..), geom = "polygon")
(stat function: stat_density2d; geom mappings: aes(fill = ..level..); variable created by stat: ..level..; geom to use: "polygon")

In the entries below, aesthetics are listed before the bar and variables created by the stat after it.

c + stat_bin(binwidth = 1, origin = 10) - x, y | ..count.., ..ncount.., ..density.., ..ndensity..
c + stat_count(width = 1) - x, y | ..count.., ..prop..
c + stat_density(adjust = 1, kernel = "gaussian") - x, y | ..count.., ..density.., ..scaled..

e + stat_bin_2d(bins = 30, drop = T) - x, y, fill | ..count.., ..density..
e + stat_bin_hex(bins=30) - x, y, fill | ..count.., ..density..
e + stat_density_2d(contour = TRUE, n = 100) - x, y, color, size | ..level..
e + stat_ellipse(level = 0.95, segments = 51, type = "t")

l + stat_contour(aes(z = z)) - x, y, z, order | ..level..
l + stat_summary_hex(aes(z = z), bins = 30, fun = max) - x, y, z, fill | ..value..
l + stat_summary_2d(aes(z = z), bins = 30, fun = mean) - x, y, z, fill | ..value..

f + stat_boxplot(coef = 1.5) - x, y | ..lower.., ..middle.., ..upper.., ..width.., ..ymin.., ..ymax..
f + stat_ydensity(kernel = "gaussian", scale = "area") - x, y | ..density.., ..scaled.., ..count.., ..n.., ..violinwidth.., ..width..

e + stat_ecdf(n = 40) - x, y | ..x.., ..y..
e + stat_quantile(quantiles = c(0.1, 0.9), formula = y ~ log(x), method = "rq") - x, y | ..quantile..
e + stat_smooth(method = "lm", formula = y ~ x, se=T, level=0.95) - x, y | ..se.., ..x.., ..y.., ..ymin.., ..ymax..

ggplot() + stat_function(aes(x = -3:3), n = 99, fun = dnorm, args = list(sd=0.5)) - x | ..x.., ..y..
e + stat_identity(na.rm = TRUE)
ggplot() + stat_qq(aes(sample=1:100), dist = qt, dparam=list(df=5)) - sample, x, y | ..sample.., ..theoretical..
e + stat_sum() - x, y, size | ..n.., ..prop..
e + stat_summary(fun.data = "mean_cl_boot")
h + stat_summary_bin(fun.y = "mean", geom = "bar")
e + stat_unique()

Scales

Scales map data values to the visual values of an aesthetic. To change a mapping, add a new scale.

(n <- d + geom_bar(aes(fill = fl)))

n + scale_fill_manual(
values = c("skyblue", "royalblue", "blue", "navy"),
limits = c("d", "e", "p", "r"), breaks =c("d", "e", "p", "r"),
name = "fuel", labels = c("D", "E", "P", "R"))

The function name is built from scale_, the aesthetic to adjust (fill), and the prepackaged scale to use (manual). values is a scale-specific argument; limits is the range of values to include in the mapping; breaks are the breaks to use in the legend/axis; name is the title to use in the legend/axis; labels are the labels to use in the legend/axis.

GENERAL PURPOSE SCALES
Use with most aesthetics

scale_*_continuous() - map continuous values to visual ones
scale_*_discrete() - map discrete values to visual ones
scale_*_identity() - use data values as visual ones
scale_*_manual(values = c()) - map discrete values to manually chosen visual ones
scale_*_date(date_labels = "%m/%d", date_breaks = "2 weeks") - treat data values as dates.
scale_*_datetime() - treat data x values as date times. Use same arguments as scale_x_date(). See ?strptime for label formats.

X & Y LOCATION SCALES
Use with x or y aesthetics (x shown here)

scale_x_log10() - Plot x on log10 scale
scale_x_reverse() - Reverse direction of x axis
scale_x_sqrt() - Plot x on square root scale

COLOR AND FILL SCALES (DISCRETE)
n <- d + geom_bar(aes(fill = fl))

n + scale_fill_brewer(palette = "Blues") - For palette choices: RColorBrewer::display.brewer.all()
n + scale_fill_grey(start = 0.2, end = 0.8, na.value = "red")

COLOR AND FILL SCALES (CONTINUOUS)
o <- c + geom_dotplot(aes(fill = ..x..))

o + scale_fill_distiller(palette = "Blues")
o + scale_fill_gradient(low="red", high="yellow")
o + scale_fill_gradient2(low="red", high="blue", mid = "white", midpoint = 25)
o + scale_fill_gradientn(colours=topo.colors(6)) - Also: rainbow(), heat.colors(), terrain.colors(), cm.colors(), RColorBrewer::brewer.pal()

SHAPE AND SIZE SCALES
p <- e + geom_point(aes(shape = fl, size = cyl))

p + scale_shape() + scale_size()
p + scale_shape_manual(values = c(3:7))
p + scale_radius(range = c(1,6))
p + scale_size_area(max_size = 6)

Coordinate Systems

r <- d + geom_bar()

r + coord_cartesian(xlim = c(0, 5)) - xlim, ylim. The default cartesian coordinate system
r + coord_fixed(ratio = 1/2) - ratio, xlim, ylim. Cartesian coordinates with fixed aspect ratio between x and y units
r + coord_flip() - xlim, ylim. Flipped Cartesian coordinates
r + coord_polar(theta = "x", direction=1 ) - theta, start, direction. Polar coordinates
r + coord_trans(ytrans = "sqrt") - xtrans, ytrans, limx, limy. Transformed cartesian coordinates. Set xtrans and ytrans to the name of a transformation, such as "sqrt".
π + coord_quickmap()
π + coord_map(projection = "ortho", orientation=c(41, -74, 0)) - projection, orientation, xlim, ylim. Map projections from the mapproj package (mercator (default), azequalarea, lagrange, etc.)

Position Adjustments

Position adjustments determine how to arrange geoms that would otherwise occupy the same space.

s <- ggplot(mpg, aes(fl, fill = drv))

s + geom_bar(position = "dodge") - Arrange elements side by side
s + geom_bar(position = "fill") - Stack elements on top of one another, normalize height
e + geom_point(position = "jitter") - Add random noise to X and Y position of each element to avoid overplotting
e + geom_label(position = "nudge") - Nudge labels away from points
s + geom_bar(position = "stack") - Stack elements on top of one another

Each position adjustment can be recast as a function with manual width and height arguments
s + geom_bar(position = position_dodge(width = 1))

Themes

r + theme_bw() - White background with grid lines
r + theme_gray() - Grey background (default theme)
r + theme_dark() - dark for contrast
r + theme_classic(), r + theme_light(), r + theme_linedraw(), r + theme_minimal() - Minimal themes
r + theme_void() - Empty theme

Faceting

Facets divide a plot into subplots based on the values of one or more discrete variables.

t <- ggplot(mpg, aes(cty, hwy)) + geom_point()

t + facet_grid(. ~ fl) - facet into columns based on fl
t + facet_grid(year ~ .) - facet into rows based on year
t + facet_grid(year ~ fl) - facet into both rows and columns
t + facet_wrap(~ fl) - wrap facets into a rectangular layout

Set scales to let axis limits vary across facets

t + facet_grid(drv ~ fl, scales = "free") - x and y axis limits adjust to individual facets
"free_x" - x axis limits adjust
"free_y" - y axis limits adjust

Set labeller to adjust facet labels

t + facet_grid(. ~ fl, labeller = label_both) - labels read fl: c, fl: d, fl: e, fl: p, fl: r
t + facet_grid(fl ~ ., labeller = label_bquote(alpha ^ .(fl))) - labels read α^c, α^d, α^e, α^p, α^r
t + facet_grid(. ~ fl, labeller = label_parsed) - labels read c, d, e, p, r

Labels

t + labs( x = "New x axis label", y = "New y axis label",
title ="Add a title above the plot",
subtitle = "Add a subtitle below title",
caption = "Add a caption below plot",
<aes> = "New <aes> legend title")

Use scale functions to update legend labels.

t + annotate(geom = "text", x = 8, y = 9, label = "A")
(geom is the geom to place; the remaining arguments are manual values for the geom's aesthetics)

Legends

n + theme(legend.position = "bottom") - Place legend at "bottom", "top", "left", or "right"
n + guides(fill = "none") - Set legend type for each aesthetic: colorbar, legend, or none (no legend)
n + scale_fill_discrete(name = "Title", labels = c("A", "B", "C", "D", "E")) - Set legend title and labels with a scale function.

Zooming

Without clipping (preferred)
t + coord_cartesian(xlim = c(0, 100), ylim = c(10, 20))

With clipping (removes unseen data points)
t + xlim(0, 100) + ylim(10, 20)
t + scale_x_continuous(limits = c(0, 100)) + scale_y_continuous(limits = c(0, 100))
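
The difference between zooming with and without clipping can be checked by inspecting the data each plot is built from. The sketch below is a minimal illustration, not part of the cheat sheet: the narrower range c(10, 20) and the variable names zoomed, clipped, n_zoomed, and n_clipped are assumptions chosen so that some mpg observations actually fall outside the limits.

```r
library(ggplot2)

t <- ggplot(mpg, aes(cty, hwy)) + geom_point()

# Without clipping: only the visible window changes, all rows are kept.
zoomed <- t + coord_cartesian(xlim = c(10, 20))

# With clipping: cty values outside the scale limits are replaced by NA
# and are dropped when the plot is drawn.
clipped <- t + xlim(10, 20)

# Count the usable x values in the data built for the single point layer.
n_zoomed  <- sum(!is.na(ggplot_build(zoomed)$data[[1]]$x))
n_clipped <- sum(!is.na(ggplot_build(clipped)$data[[1]]$x))

cat("points kept without clipping:", n_zoomed, "\n")
cat("points kept with clipping:   ", n_clipped, "\n")
```

This matters most for layers that compute a summary, such as geom_smooth() or geom_boxplot(): with scale limits the summary is calculated from the reduced data, while coord_cartesian() leaves it unchanged.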
