Graphics Using R

The document provides an overview of creating graphics using R. It discusses vector vs raster images, graphical devices, the Cartesian coordinate system, and best practices for creating plots. Examples are given for creating histograms, pie charts, scatterplots, and dealing with overlapping points using techniques like jittering and sunflower plots. Resources for learning graphics in R, like manuals, guides and tutorials, are also mentioned.


Graphics using R

Master's degree course in Work and Organizational Psychology

8 May 2017

Giovanni Luca Lo Magno


INVALSI test
A bad graph
An interesting graph

From “Le Scienze”, September 2011


Hertzsprung–Russell diagram
Pen's parade
Circular layout
Why R for graphics?
Resources for learning

● manuals
● guides
● cookbooks
● tutorials
● blogs
● forums
● www.stackoverflow.com
R is made up of packages

Load a package:
library(packagename)
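
As a small illustration, the sketch below attaches the graphics package, which ships with every R installation, and checks that it is on the search path. The checks with search() and requireNamespace() are additions for illustration and are not part of the original slide.

```r
# Attach a package by name; graphics is part of base R
library(graphics)

# search() lists the attached packages and environments
is_attached <- "package:graphics" %in% search()
print(is_attached)

# requireNamespace() reports whether a package is installed
# without attaching it (lattice is used here only as an example)
has_lattice <- requireNamespace("lattice", quietly = TRUE)
print(has_lattice)
```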
Main graphic packages

● grDevices
● graphics
● grid
● maps
● lattice
Vector vs. raster images

[Figure: the same image drawn as a vector image and as a raster image of 50 x 50 pixels]
Resolution affects image quality

[Figure: the same raster image at 50 x 50, 25 x 25 and 10 x 10 pixels]


Anti-aliasing

[Figure: the same image with no anti-aliasing and with anti-aliasing]


Alpha blending

[Figure: one line compared with two overlapped lines]


Best practices

● prefer vector graphics


● ≥ 300 dpi for raster images
● check paper output
Let's have a look

demo(graphics)
More examples?

example(plot)
example(hist)
example(barplot)
example(boxplot)
Graphical user interface (GUI) vs. text-based interface

GUI
● point-and-click
● easy to learn
● easy to use
● little automation

Text-based interface
● type commands
● not easy to learn
● not easy to remember
● excellent automation
Painter's model

First

Second

New paint partially or completely obscures the old


Functions in the graphic system

● high-level functions
● low-level functions
● interactive functions
Graphical devices

Graphic commands → Device → Graphic output
One input, several outputs

Input: graphic commands

● windows() device: output on screen
● bitmap() device: output to a .png file
A typical session

1) Open device bitmap(file="rastertest")

2) Graphic commands plot(1:10)

3) Close device dev.off()
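
The three steps can be sketched as a complete session. This version swaps in the pdf() device and a temporary file, both assumptions made so that the example runs without extra software (bitmap() relies on Ghostscript being installed).

```r
# 1) Open device: write the plot to a temporary PDF file
out_file <- tempfile(fileext = ".pdf")
pdf(file = out_file)

# 2) Graphic commands
plot(1:10)

# 3) Close device: the file is only complete after dev.off()
dev.off()

# The file now exists on disk
file.exists(out_file)
```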


Graphical devices

Screen
● x11()
● windows()
● quartz()

File
● postscript()
● pdf()
● pictex()
● xfig()
● bitmap()
● png()
● jpeg()
● win.metafile()
● bmp()

Other
● devGTK()
● devJava()
● devSVG()
Managing devices

Return open devices


dev.list()

Return current device


dev.cur()

Close current device


dev.off()

Close all open devices


graphics.off()
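
A minimal sketch of these four functions working together. It opens two throwaway devices with pdf(file = NULL), which creates no file; the variable names are illustrative only.

```r
graphics.off()                  # start from a clean state
pdf(file = NULL)                # first device, no file is written
pdf(file = NULL)                # second device becomes the current one

n_open  <- length(dev.list())   # number of open devices
current <- dev.cur()            # named integer identifying the current device
print(current)

dev.off()                       # close the current device only
n_after_off <- length(dev.list())

graphics.off()                  # close everything that is still open
none_left <- is.null(dev.list())
```

dev.list() returns NULL when no device is open, which is why the last check uses is.null().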
Cartesian coordinate system
[Figure: a point P(x, y) in the plane, with origin O, horizontal axis X and vertical axis Y]
The graphic box model

[Figure: the plot region is enclosed by figure margins 1 to 4, which together with it form the figure region; the figure region is enclosed by outer margins 1 to 4. Margins are numbered 1 = bottom, 2 = left, 3 = top, 4 = right.]
The graphic box model: an example
[Figure: an example scatterplot with its plot region and figure margins 1 to 4 highlighted]
x = rnorm(50)
y = rnorm(50)
plot(x, y, main="An example graph",
xlim=c(-3, 3), ylim=c(-3, 3))
Adding boxes

plot.new()
box(which="plot")
box(which="figure")
box(which="outer")

Note: no outer margins by default
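
Because the outer margins are zero by default, the "outer" box coincides with the "figure" box. The sketch below sets an outer margin with par(oma=...) so that the three boxes become distinct; the margin size and line types are arbitrary choices for illustration.

```r
pdf(file = NULL)                       # throwaway device, no file written
default_oma <- par("oma")              # outer margins before any change

op <- par(oma = c(2, 2, 2, 2))         # 2 lines of outer margin on each side
plot.new()
box(which = "plot")                    # solid: plot region
box(which = "figure", lty = "dashed")  # dashed: figure region
box(which = "outer", lty = "dotted")   # dotted: outer edge of the device
new_oma <- par("oma")

par(op)                                # restore the previous setting
dev.off()
```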


Exploring the margins

plot.new()
plot.window(c(0,10), c(0,2))
points(c(0,0,10,10), c(0, 2, 0, 2))

(0,2) (10,2)

(0,0) (10,0)
Exploring the margins and the box
plot.new()
plot.window(c(0,10), c(0,2))
points(c(0,0,10,10), c(0, 2, 0, 2))
box()

(0,2) (10,2)

(0,0) (10,0)
Multiple figure regions
[Figure: six figure regions, each containing its own plot region, arranged in three rows and two columns and surrounded by outer margins 1 to 4]
Coordinate system in the plot region

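[Figure: a point (x, y) in the plot region; the horizontal axis runs from the min x value to the max x value, the vertical axis from the min y value to the max y value]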
Several types of plot
plot(y, type="p")  # points
plot(y, type="l")  # lines
plot(y, type="b")  # both points and lines
plot(y, type="n")  # nothing: set up the plot without drawing the data


Plot step-by-step: data

> set.seed(123456)
> y <- rnorm(20)
> y
[1] 0.83373317 -0.27604777 -0.35500184 0.08748742
[5] 2.25225573 0.83446013 1.31241551 2.50264541
[9] 1.16823174 -0.42616558 -0.99612975 -1.11394990
[13] -0.05573154 1.17443240 1.05321861 0.05760597
[17] -0.73504289 0.93052842 1.66821097 0.55968789
> range(y)
[1] -1.113950 2.502645
Plot step-by-step: start a new plot
plot.new()
Plot step-by-step: set up coordinate system
plot.window(c(1, 20), c(-1.2, 2.6))
Plot step-by-step: add grid
grid(col="lightgray", lty="solid")
Plot step-by-step: add points
points(y)
Plot step-by-step: add x-axis
axis(1, at=c(1, 10, 20))
Plot step-by-step: add y-axis
axis(2, at=c(-1.2, 0, 2.6))
Plot step-by-step: add x-axis title
title(xlab="X")
Plot step-by-step: add y-axis title
title(ylab="Y")
Plot step-by-step: add main title
title(main="My graph title")
Plot step-by-step: let's review all the code

set.seed(123456)
y <- rnorm(20)
plot.new()
plot.window(c(1, 20), c(-1.2, 2.6))
grid(col="lightgray", lty="solid")
points(y)
axis(1, at=c(1, 10, 20))
axis(2, at=c(-1.2, 0, 2.6))
title(xlab="X")
title(ylab="Y")
title(main="My graph title")
Plot step-by-step: create SVG file

set.seed(123456)
y <- rnorm(20)
# Open SVG device
svg(file="mygraph.svg")

# Graphic commands
plot.new()
plot.window(c(1, 20), c(-1.2, 2.6))
grid(col="lightgray", lty="solid")
points(y)
axis(1, at=c(1, 10, 20))
axis(2, at=c(-1.2, 0, 2.6))
title(xlab="X")
title(ylab="Y")
title(main="My graph title")

# Close device
dev.off()
Best practices: comment and save the script
# Data
set.seed(123456)
y <- rnorm(20)

# Open device
svg(file="final.svg")

# Init frame
plot.new()
plot.window(c(1, 20), c(-1.2, 2.6))

# Grid
grid(col="lightgray", lty="solid")

# Points
points(y)

# Axes
axis(1, at=c(1, 10, 20))
axis(2, at=c(-1.2, 0, 2.6))

# Titles
title(xlab="X")
title(ylab="Y")
title(main="My graph title")

# Close device
dev.off()
Overlapping points: the problem
x <- c(1, 2, 2, 3, 3, 3, 4, 4, 4, 4)
y <- c(2, 6, 6, 8, 8, 8, 10, 10, 10, 10)
plot(x=x, y=y)
The plot shows only four symbols, which hide 1, 2, 3 and 4 coincident points (from left to right).
Overlapping points: jitter (add noise)
x <- c(1, 2, 2, 3, 3, 3, 4, 4, 4, 4)
y <- c(2, 6, 6, 8, 8, 8, 10, 10, 10, 10)
plot(x=jitter(x), y=jitter(y), xlab="x", ylab="y")
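
jitter() adds random noise, so every run gives a slightly different picture. The sketch below fixes the seed and uses the amount argument of jitter() to bound the noise explicitly; the seed and the value 0.1 are arbitrary choices for illustration.

```r
set.seed(1)                      # make the random noise reproducible
x <- c(1, 2, 2, 3, 3, 3, 4, 4, 4, 4)
y <- c(2, 6, 6, 8, 8, 8, 10, 10, 10, 10)

# amount = 0.1 adds uniform noise in [-0.1, 0.1] to every value
jx <- jitter(x, amount = 0.1)
jy <- jitter(y, amount = 0.1)

# Largest displacement actually applied to any coordinate
max_shift <- max(abs(jx - x), abs(jy - y))
print(max_shift)

plot(x = jx, y = jy, xlab = "x", ylab = "y")
```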
Overlapping points: sunflower plot

[Figure: the sunflower symbols used for 1 to 10 overlapping points]
Overlapping points: a sunflower plot example
x <- c(1, 2, 2, 3, 3, 3, 4, 4, 4, 4)
y <- c(2, 6, 6, 8, 8, 8, 10, 10, 10, 10)
sunflowerplot(x=x, y=y)
Overlapping points: another sunflower plot example
sunflowerplot(x=iris$Petal.Length, y=iris$Petal.Width)
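
To see how many observations sit on each position, the counts can also be tabulated directly. A minimal sketch using xyTable() from grDevices on the small example data; digits = 6 is passed explicitly and the data frame is built only for readable printing.

```r
x <- c(1, 2, 2, 3, 3, 3, 4, 4, 4, 4)
y <- c(2, 6, 6, 8, 8, 8, 10, 10, 10, 10)

# One entry per distinct (x, y) pair, with its multiplicity
counts <- xyTable(x, y, digits = 6)
overlap <- data.frame(x = counts$x, y = counts$y, number = counts$number)
print(overlap)
```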
Pie chart

data <- c(12, 5, 4)
labels <- c("Italian", "English", "Spanish")
pie(data, labels=labels)

Example: the “Italian” pie slice covers 12/21 ≈ 57% of the circle.
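
The size of every slice can be checked with a few lines of R. This is a sketch of the arithmetic only (share of the total times 360 degrees); the variable names share and angle are illustrative.

```r
data <- c(12, 5, 4)
labels <- c("Italian", "English", "Spanish")

share <- data / sum(data)        # proportion of the circle for each slice
angle <- share * 360             # central angle in degrees
names(angle) <- labels
print(round(angle, 1))
```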


From vector data to distribution

> data <- c("No", "Maybe", "Maybe", "Yes", "No",
"Yes", "Yes", "Yes", "No", "Yes")
> distribution <- table(data)
> distribution
data
Maybe    No   Yes
    2     3     5
Pie chart of distribution
> data <- c("No", "Maybe", "Maybe", "Yes", "No",
"Yes", "Yes", "Yes", "No", "Yes")
> distribution <- table(data)
> pie(distribution)
Histogram with absolute frequencies
set.seed(123456)
data <- rnorm(1000)
hist(data)

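With the default freq=TRUE, the height of each bar is the number of obs. in that bin, so the heights of all bins add up to the total number of obs.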
Histogram with relative frequencies (density)
set.seed(123456)
data <- rnorm(1000)
hist(data, freq=FALSE)

With freq=FALSE, the height of each bar is a density, so the areas of all bins add up to 1.
Histogram with not equal bins
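hist(data, breaks=c(-4, 0, 1, 3))

For each bin $i$, the density is $d_i = \dfrac{f_i}{w_i} = \dfrac{n_i}{n \, w_i}$, where
$n$ = number of obs.
$n_i$ = number of obs. in the i-th bin
$f_i = n_i / n$ = relative frequency of the i-th bin
$w_i$ = width of the i-th bin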
Calculate density for histogram: an example
> n <- length(data)
> n1 <- length(data[which(data > -4 & data <=0)])
> f1 <- n1 / n
> f1
[1] 0.479
> w1 <- 4
> d1 <- f1 / w1
> d1
[1] 0.11975
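
The same calculation can be compared with what hist() itself computes. The sketch below uses a small hypothetical data vector, chosen so that every value lies inside the breaks, and hist(..., plot=FALSE) to obtain the counts and densities without drawing.

```r
# Hypothetical data, chosen so that every value lies inside the breaks
x <- c(-3.5, -2, -1, -0.5, 0.2, 0.4, 0.9, 1.5, 2, 2.8)
breaks <- c(-4, 0, 1, 3)

h <- hist(x, breaks = breaks, plot = FALSE)

n <- length(x)
w <- diff(breaks)                 # bin widths: 4, 1, 2
d_manual <- h$counts / (n * w)    # relative frequency divided by width

print(all.equal(d_manual, h$density))
total_area <- sum(h$density * w)  # total area of the bars
print(total_area)
```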

Box plot (or box-and-whisker plot)

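From top to bottom:
● upper whisker, ending at the max or other value
● top of the box: third quartile
● line inside the box: second quartile (median)
● bottom of the box: first quartile
● lower whisker, ending at the min or other value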


Box plot: highlighting outliers
> data <- airquality$Ozone
> s <- summary(data)
> print(s)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.    NA's
   1.00   18.00   31.50   42.13   63.25  168.00      37
> q1 <- s[[2]]
> q3 <- s[[5]]
> iqr <- q3 - q1
> q1 - 1.5*iqr
[1] -49.875
> q3 + 1.5*iqr
[1] 131.125
> boxplot(airquality$Ozone)

In the resulting plot, the upper whisker ends at the highest value within 1.5·IQR of the box, the lower whisker ends at the lowest value within 1.5·IQR, and the points beyond the whiskers are drawn individually as outliers.
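
The whisker ends and the outliers can also be read off numerically. A minimal sketch using boxplot.stats() from grDevices; the variable names are illustrative.

```r
ozone <- airquality$Ozone
ozone <- ozone[!is.na(ozone)]           # drop the missing values

bs <- boxplot.stats(ozone, coef = 1.5)  # coef plays the role of boxplot's range
lower_whisker <- bs$stats[1]            # lowest value within 1.5 * IQR
upper_whisker <- bs$stats[5]            # highest value within 1.5 * IQR
outliers <- bs$out                      # values drawn as individual points

print(c(lower = lower_whisker, upper = upper_whisker))
print(outliers)
```

Note that boxplot.stats() works with hinges rather than the quartiles reported by summary(), so its fences can differ slightly from a hand calculation based on summary().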
Box plot: highlighting min and max
boxplot(airquality$Ozone, range=0)

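From top to bottom, the plot now marks max, 3Q, 2Q, 1Q and min: with range=0 the whiskers extend to the data extremes.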
Box plot: groups of data

d <- read.table(file="Employee_data.txt")

boxplot(d$salary ~ d$gender)
boxplot(d$salary ~ d$gender, range=0)


Box plot: horizontal orientation
d <- read.table(file="Employee_data.txt")
boxplot(d$salary ~ d$gender, horizontal=TRUE)
Box plot + data points

d <- read.table("Employee_data.txt")
salaryf <- d$salary[which(d$gender=="Female")]
boxplot(salaryf, range=0)
x <- rep(1, length(salaryf))
points(x, salaryf)
Box plot + jittered data points

d <- read.table("Employee_data.txt")
salaryf <- d$salary[which(d$gender=="Female")]
boxplot(salaryf, range=0)
x <- rep(1, length(salaryf))
x <- jitter(x, factor=8)
points(x, salaryf)

jitter() adds random horizontal noise, so points with equal salaries no longer overlap.
Bar plot
d <- read.table("Employee_data.txt")
jobcattable <- table(d$jobcat)
barplot(jobcattable)
Stacked bar plot

> d <- read.table("Employee_data.txt")
> subd <- data.frame(gender = d$gender,
jobcat = d$jobcat)
> t <- table(subd)
> print(t)
        jobcat
gender   Clerical Custodial Manager
  Female      206         0      10
  Male        156        27      74
> barplot(t)
Stacked bar plot: add a legend (inside the plot area)
barplot(t, legend.text=c("female", "male"))
Stacked bar plot: relative frequencies
> d <- read.table("Employee_data.txt")
> subd <- data.frame(gender=d$gender, jobcat=d$jobcat)
> t <- table(subd)
> rt <- prop.table(t, 2)
> print(rt)
        jobcat
gender    Clerical Custodial   Manager
  Female 0.5690608 0.0000000 0.1190476
  Male   0.4309392 1.0000000 0.8809524
> barplot(rt)
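
The second argument of prop.table() selects which margin is normalised. The sketch below rebuilds the gender by job category counts shown earlier as a plain matrix (an assumption made so that the example runs without Employee_data.txt) and compares the three options.

```r
# Counts copied from the gender by jobcat table
counts <- matrix(c(206, 156, 0, 27, 10, 74), nrow = 2,
                 dimnames = list(gender = c("Female", "Male"),
                                 jobcat = c("Clerical", "Custodial", "Manager")))

by_column <- prop.table(counts, 2)   # each job category sums to 1
by_row    <- prop.table(counts, 1)   # each gender sums to 1
overall   <- prop.table(counts)      # the whole table sums to 1

print(colSums(by_column))
print(rowSums(by_row))
```

With margin 2, every bar of the stacked bar plot has the same total height, which is what makes the bars directly comparable.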
The device as a state machine: the “par” command

List graphic parameters:


par()

Set a graphic parameter:


par(col=2)
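
Because a parameter set with par() stays in force for all later plots on the device, it is often useful to restore the previous state afterwards. A minimal sketch, using a throwaway pdf(file = NULL) device; the variable names are illustrative.

```r
pdf(file = NULL)              # throwaway device so that no file is created

old <- par(col = 2)           # par() returns the previous value(s) invisibly
col_during <- par("col")      # the setting now in force
plot(1:10)                    # points are drawn with colour 2

par(old)                      # restore the previous setting
col_after <- par("col")
dev.off()
```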
Multiple plots: basic layouts

par(mfrow=c(3,2)) fills the layout by rows:

1 2
3 4
5 6

par(mfcol=c(3,2)) fills the layout by columns:

1 4
2 5
3 6
Multiple plots: designing our first layout
par(mfcol=c(2,1))

[Figure: a sketch of two stacked panels, "Male" above "Female", each plotting Salary against Experience]

Tip: use paper and pencil when designing a layout


Multiple plots: basic layout
male <- d[d$gender=="Male",]
female <- d[d$gender=="Female",]
par(mfcol=c(2,1))
plot(x=male$prevexp, y=male$salary,
main="Male", xlab="Experience",
ylab="Salary", ylim=c(15000, 135000))
plot(x=female$prevexp, y=female$salary,
main="Female", xlab="Experience",
ylab="Salary", ylim=c(15000, 135000))
More advanced multiple plots: the “layout” command

m <- matrix(c(1,1,2,3), nrow=2, ncol=2, byrow=TRUE)
layout(m, widths=c(4,4), heights=c(2, 3))

1 1   (height=2)
2 3   (height=3)

Both columns have width=4.
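
A layout can be previewed before any real plot is drawn. The sketch below uses layout.show(), which outlines and numbers the figure regions, on a throwaway pdf(file = NULL) device; on a screen device the preview appears in the graphics window.

```r
pdf(file = NULL)                          # throwaway device, no file written

m <- matrix(c(1, 1, 2, 3), nrow = 2, ncol = 2, byrow = TRUE)
n_fig <- layout(m, widths = c(4, 4), heights = c(2, 3))

layout.show(n_fig)                        # outline and number each figure
dev.off()
```

layout() returns the number of figures in the layout, which is the value passed on to layout.show().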
An example of use of the “layout” command
l <- matrix(c(1,1,2,3), nrow=2, ncol=2,
byrow=TRUE)
layout(l, heights=c(2, 3))
barplot(table(d$jobcat), main="Job category")
plot(x=male$prevexp, y=male$salary,
main="Male", xlab="Experience",
ylab="Salary", ylim=c(15000, 135000))
plot(x=female$prevexp, y=female$salary,
main="Female", xlab="Experience",
ylab="Salary", ylim=c(15000, 135000))
An overlapping legend
edudata <- matrix(c(0.4, 0.6, 0.3, 0.7, 0.2, 0.8), nrow=2, ncol=3)
colors <- c("gray50", "gray80")
barplot(edudata, xlab="Education", names.arg=c("low", "medium", "high"),
col=colors, legend.text=c("female", "male"))
Adding legend by using the “layout” command
edudata <- matrix(c(0.4, 0.6, 0.3, 0.7, 0.2, 0.8), nrow=2, ncol=3)
mlayout <- matrix(c(1,2), nrow=2, ncol=1)
colors <- c("gray50", "gray80")
par(mai=c(0.8, 0.6, 0.1, 0.2)) # bottom, left, top, right
layout(mlayout, heights=c(9, 3)) # bar plot on top, legend below
barplot(edudata, xlab="Education", names.arg=c("low", "medium", "high"),
col=colors)
plot.new()
par(mai=c(0, 0, 0, 0)) # bottom, left, top, right
plot.window(xlim=c(0,1), ylim=c(0,1))
# Same label order as in the previous slide: row 1 (gray50) is female
legend(x=0.5, y=0.5, xjust=0.5, yjust=0.5, legend = c("female", "male"),
fill = colors)
Multiple graphs setting the figure regions

par(fig=c(0, 0.8, 0, 0.8), new=FALSE)


plot(x=d$prevexp, y=d$salary,
xlab="Experience", ylab="Salary")
par(fig=c(0, 0.8, 0.55, 1), new=TRUE)
boxplot(d$prevexp, horizontal=TRUE,
axes=FALSE)
par(fig=c(0.65, 1, 0, 0.8), new=TRUE)
boxplot(d$salary, axes=FALSE)
Plotting fitted regression line

# Data
n <- 50
x <- 0:(n-1)
real_a <- 5
real_b <- 0.1
logy <- real_a + real_b*x +rnorm(n)
y <- exp(logy)

# Estimation
est <- lm(log(y) ~ x)

# Graph
plot(log(y) ~ x)
abline(est, col="red")
Plotting fitted regression line for log-linear model
# Data
n <- 50
x <- 0:(n-1)
real_a <- 5
real_b <- 0.1
logy <- real_a + real_b*x +rnorm(n)
y <- exp(logy)

# Estimation
est <- lm(log(y) ~ x)
a <- est$coefficients[[1]]
b <- est$coefficients[[2]]
fitted <- exp(a+b*x)

# Graph
plot(y ~ x)
lines(y=fitted, x=x, col="red")
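
The back-transformed curve can also be obtained with predict() instead of extracting the coefficients by hand. A minimal sketch on the same simulated model; the seed and the use of predict() with newdata are additions for illustration.

```r
set.seed(1)                       # seed added here for reproducibility
n <- 50
x <- 0:(n - 1)
y <- exp(5 + 0.1 * x + rnorm(n))

est <- lm(log(y) ~ x)

# Manual back-transformation from the estimated coefficients
a <- coef(est)[[1]]
b <- coef(est)[[2]]
fitted_manual <- exp(a + b * x)

# Same curve from predict() on the original x values
fitted_pred <- exp(predict(est, newdata = data.frame(x = x)))

plot(y ~ x)
lines(x, fitted_pred, col = "red")
```

Passing a different data frame as newdata would evaluate the fitted curve at other x values, for example on a finer grid.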
Thanks for your kind attention
