
Chapter 2

This document provides an introduction to data visualization using the ggplot2 package in R. It discusses how to map variables from a dataset to different aesthetics like x, y, color, size. It explains how to modify positions and scales to avoid overlapping points. Common aesthetics that can be mapped include x, y, color, size, shape. Position adjustments include identity, dodge, stack, and jitter. Scale functions like scale_x_continuous() allow customizing axis labels, limits, breaks and expansion.

Visible aesthetics

Introduction to Data Visualization with ggplot2

Rick Scavetta
Founder, Scavetta Academy
Mapping onto the X and Y axes
ggplot(iris, aes(x = Sepal.Length,
                 y = Sepal.Width)) +
  geom_point()


Mapping onto the color aesthetic
ggplot(iris, aes(x = Sepal.Length,
                 y = Sepal.Width,
                 color = Species)) +
  geom_point()

Type     Variable
Color    Species

Species, a dataframe column, is mapped onto color, a visible aesthetic.

Map aesthetics in aes().


Mapping onto the color aesthetic in geom
ggplot(iris) +
  geom_point(aes(x = Sepal.Length,
                 y = Sepal.Width,
                 col = Species))

Only necessary if:

Not all layers should inherit the same aesthetics

Mixing different data sources
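
Both cases can be sketched in one plot. The example below is only an illustration: `iris_means` and `p_layers` are hypothetical names that do not appear in the slides, and the per-species means are computed here purely to have a second data source.

```r
library(ggplot2)

# Hypothetical second data source: per-species means computed from iris.
iris_means <- aggregate(cbind(Sepal.Length, Sepal.Width) ~ Species,
                        data = iris, FUN = mean)

p_layers <- ggplot(iris, aes(x = Sepal.Length, y = Sepal.Width)) +
  # Color is mapped in this layer only, so it applies to the raw points.
  geom_point(aes(color = Species), alpha = 0.5) +
  # This layer draws from a different data frame. It inherits x and y
  # from ggplot() and adds no color mapping of its own.
  geom_point(data = iris_means, shape = 4, size = 5)

p_layers
```

Because the color mapping lives in the first geom_point() only, the mean markers in the second layer keep the default point color.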

Typical visible aesthetics
Aesthetic    Description
x            X axis position
y            Y axis position
fill         Fill color
color        Color of points, outlines of other geoms
size         Area or radius of points, thickness of lines
alpha        Transparency
linetype     Line dash pattern
labels       Text on a plot or axes
shape        Shape


Let's Practice
Using attributes
Aesthetics? Attributes!
ggplot(iris, aes(x = Sepal.Length,
                 y = Sepal.Width)) +
  geom_point(color = "red")

Type     Property
Color    "red"

Set attributes in geom_*().

The color attribute is set to "red".
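
A short sketch of why the distinction matters, assuming the same iris plot as above (`p_attr` and `p_map` are hypothetical names). Passing "red" outside aes() sets an attribute. Passing it inside aes() instead maps a one-level variable whose only value is the string "red", and the color scale, not the string, decides the color.

```r
library(ggplot2)

# Attribute: "red" is passed outside aes(), so every point is drawn in red.
p_attr <- ggplot(iris, aes(x = Sepal.Length, y = Sepal.Width)) +
  geom_point(color = "red")

# Mapping: inside aes(), "red" is treated as a one-level data value.
# The color scale then picks the color, and a legend is drawn.
p_map <- ggplot(iris, aes(x = Sepal.Length, y = Sepal.Width)) +
  geom_point(aes(color = "red"))

# Inspect the color actually used for the points in each plot.
unique(layer_data(p_attr)$colour)  # "red"
unique(layer_data(p_map)$colour)   # a scale-chosen color, not "red"
```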

Aesthetics? Attributes!
ggplot(iris, aes(x = Sepal.Length,
                 y = Sepal.Width)) +
  geom_point(size = 10)

Type     Property
Size     10


Aesthetics? Attributes!
ggplot(iris, aes(x = Sepal.Length,
                 y = Sepal.Width)) +
  geom_point(shape = 4)

Type     Property
Shape    4


Let's practice!
Modifying aesthetics
Positions
Adjustment for overlapping

identity

dodge

stack

fill

jitter

jitterdodge

nudge
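
The stack, dodge and fill adjustments are easiest to compare on bars. The sketch below is an assumption-laden illustration rather than part of the slides: it uses the built-in mtcars data instead of iris, and `p_base`, `p_stack`, `p_dodge` and `p_fill` are hypothetical names.

```r
library(ggplot2)

# mtcars ships with R; cyl and am are converted to factors for grouping.
p_base <- ggplot(mtcars, aes(x = factor(cyl), fill = factor(am)))

p_stack <- p_base + geom_bar(position = "stack")  # bars on top of each other
p_dodge <- p_base + geom_bar(position = "dodge")  # bars side by side
p_fill  <- p_base + geom_bar(position = "fill")   # stacked, rescaled to 1

p_dodge
```

With "fill", every stack reaches a height of 1, so the plot shows proportions rather than counts.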


position = "identity" (default)
ggplot(iris, aes(x = Sepal.Length,
                 y = Sepal.Width,
                 color = Species)) +
  geom_point(position = "identity")

position = "jitter"
ggplot(iris, aes(x = Sepal.Length,
                 y = Sepal.Width,
                 color = Species)) +
  geom_point(position = "jitter")


position_jitter()
posn_j <- position_jitter(0.1,
                          seed = 136)

ggplot(iris, aes(x = Sepal.Length,
                 y = Sepal.Width,
                 color = Species)) +
  geom_point(position = posn_j)

Set arguments for the position

Consistency across plots & layers
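
One way to check the consistency claim is to reuse the same position object in two plots and compare the jittered coordinates. This is a sketch only: `p1` and `p2` are hypothetical names, and the first argument of position_jitter() is written out as `width` for readability.

```r
library(ggplot2)

# Same width and seed as in the slide; the seed fixes the random offsets.
posn_j <- position_jitter(width = 0.1, seed = 136)

p1 <- ggplot(iris, aes(x = Sepal.Length, y = Sepal.Width, color = Species)) +
  geom_point(position = posn_j)

p2 <- ggplot(iris, aes(x = Sepal.Length, y = Sepal.Width)) +
  geom_point(position = posn_j, shape = 1)

# Compare the jittered x positions computed for the two plots.
identical(layer_data(p1)$x, layer_data(p2)$x)
```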



Scale functions
scale_x_*()

scale_y_*()

scale_color_*()
Also scale_colour_*()

scale_fill_*()

scale_shape_*()

scale_linetype_*()

scale_size_*()



Scale functions
scale_x_continuous()

scale_y_*()

scale_color_discrete()
Alternatively, scale_colour_*()

scale_fill_*()

scale_shape_*()

scale_linetype_*()

scale_size_*()

scale_*_*()
ggplot(iris, aes(x = Sepal.Length,
                 y = Sepal.Width,
                 color = Species)) +
  geom_point(position = "jitter") +
  scale_x_continuous("Sepal Length") +
  scale_color_discrete("Species")

The limits argument
ggplot(iris, aes(x = Sepal.Length,
                 y = Sepal.Width,
                 color = Species)) +
  geom_point(position = "jitter") +
  scale_x_continuous("Sepal Length",
                     limits = c(2, 8)) +
  scale_color_discrete("Species")

The breaks argument
ggplot(iris, aes(x = Sepal.Length,
                 y = Sepal.Width,
                 color = Species)) +
  geom_point(position = "jitter") +
  scale_x_continuous("Sepal Length",
                     limits = c(2, 8),
                     breaks = seq(2, 8, 3)) +
  scale_color_discrete("Species")

The expand argument
ggplot(iris, aes(x = Sepal.Length,
                 y = Sepal.Width,
                 color = Species)) +
  geom_point(position = "jitter") +
  scale_x_continuous("Sepal Length",
                     limits = c(2, 8),
                     breaks = seq(2, 8, 3),
                     expand = c(0, 0)) +
  scale_color_discrete("Species")

The labels argument
ggplot(iris, aes(x = Sepal.Length,
                 y = Sepal.Width,
                 color = Species)) +
  geom_point(position = "jitter") +
  scale_x_continuous("Sepal Length",
                     limits = c(2, 8),
                     breaks = seq(2, 8, 3),
                     expand = c(0, 0),
                     labels = c("Setosa",
                                "Versicolor",
                                "Virginica")) +
  scale_color_discrete("Species")
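
The four arguments can be read back from the built plot, which makes their effect easier to see. In this sketch the species names are swapped for hypothetical "cm" labels (the iris measurements are in centimeters) only to show that labels takes one entry per break; `p_scale` and `x_scale` are names introduced here, not in the slides.

```r
library(ggplot2)

p_scale <- ggplot(iris, aes(x = Sepal.Length,
                            y = Sepal.Width,
                            color = Species)) +
  geom_point(position = "jitter") +
  scale_x_continuous("Sepal Length",
                     limits = c(2, 8),       # axis runs from 2 to 8
                     breaks = seq(2, 8, 3),  # tick marks at 2, 5 and 8
                     expand = c(0, 0),       # no padding beyond the limits
                     labels = c("2 cm", "5 cm", "8 cm")) +  # one per break
  scale_color_discrete("Species")

# layer_scales() returns the x and y scales of the built plot.
x_scale <- layer_scales(p_scale)$x
x_scale$get_breaks()  # 2 5 8
x_scale$get_labels()
```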

labs()
ggplot(iris, aes(x = Sepal.Length,
                 y = Sepal.Width,
                 color = Species)) +
  geom_point(position = "jitter") +
  labs(x = "Sepal Length",
       y = "Sepal Width",
       color = "Species")


Let's try it out!
Aesthetics best practices
Which aesthetics?
Use your creative know-how, and

Follow some clear guidelines

Jacques Bertin
The Semiology of Graphics, 1967

William Cleveland
The Elements of Graphing Data, 1985

Visualizing Data, 1993

Form follows function

Function

Primary: Accurate and efficient representations

Secondary: Visually appealing, beautiful plots

Guiding principles

Never: Misrepresent or obscure data; confuse viewers with complexity

Always: Consider the audience and purpose of every plot


Extracting information from Data

The best choices for aesthetics
Efficient
Provides a faster overview than numeric summaries

Accurate
Minimizes information loss


Aesthetics - continuous variables
ggplot(iris, aes(x = Sepal.Length,
                 y = Sepal.Width,
                 color = Species)) +
  geom_point()


Aesthetics - continuous variables
ggplot(iris, aes(color = Sepal.Length,
                 y = Sepal.Width,
                 x = Species)) +
  geom_point()


Three iris scatter plots

Three iris scatter plots, unaligned y-axes

Single faceted plot, common y-axis
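
The code behind these figures is not shown in the slides. One possible way to get a single faceted plot with a common y-axis is facet_wrap(), sketched below with hypothetical names (`p_points`, `p_common`, `p_free`); the free-scale variant is included only for contrast with the unaligned case.

```r
library(ggplot2)

p_points <- ggplot(iris, aes(x = Sepal.Length, y = Sepal.Width)) +
  geom_point()

# One panel per species; by default all panels share the same y-axis.
p_common <- p_points + facet_wrap(~ Species)

# For contrast, scales = "free_y" gives each panel its own y-axis range.
p_free <- p_points + facet_wrap(~ Species, scales = "free_y")

p_common
```

A shared y-axis lets the three species be compared directly by vertical position, which is harder when each panel has its own range.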

Aesthetics - categorical variables
ggplot(iris, aes(x = Sepal.Length,
                 y = Sepal.Width,
                 col = Species)) +
  geom_point()


Aesthetics - categorical variables
ggplot(iris, aes(x = Sepal.Length,
                 y = Sepal.Width,
                 col = Species)) +
  geom_point(position = "jitter",
             alpha = 0.5)


Now it's your turn