
Introduction to ggplot2

STAT 361/661 Data Analysis

Saier (Vivien) Ye
September 16, 2013

1 Intro
When it comes to producing graphics in R, there are basically three options:
1. base graphics
2. lattice
3. ggplot2
We introduced base graphics in last week's R session. Base graphics are attractive and
flexible, but for more complex plots the code you have to write becomes cumbersome,
often involving many loops.
Both lattice and ggplot2 make creating complex plots easier. The lattice package uses grid
graphics to implement the trellis graphics system and is a considerable improvement over base
graphics. However, lattice graphics lack a formal model, which can make them hard to extend.
ggplot2 has gained significant popularity in recent years and has become a mainstream package
for making complex graphics.

2 Basics
In ggplot2: Elegant Graphics for Data Analysis, Hadley Wickham (the author of ggplot2) describes
ggplot2 as an R package for producing statistical, or data, graphics that differs from other
graphics packages because it has a deep underlying grammar. This grammar is based on the
Grammar of Graphics, hence the name gg-plot. The basic notion is that there is a grammar to the
composition of graphical components in statistical graphics. This makes ggplot2 very powerful:
by controlling the grammar directly, you can generate a large set of graphics tailored to your
particular needs, and you are no longer limited to a set of pre-specified graphics.
To install ggplot2, make sure you have a recent version of R (at least version 2.8 at the time of
writing). Type install.packages("ggplot2") in the R console to install the package. If you work
in RStudio, you can instead go to the Packages window, click on Install Packages and search for
ggplot2. Details on installing R and RStudio can be found in the notes from last week's R help
session. To use the package, you have to load it in every new R session with the command
library("ggplot2").
The ggplot2 package comes with several built-in data sets for demonstration purposes. In this
session, we will work with a data set called mpg. All the examples are from the book
mentioned above.
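The install-then-load steps can be sketched as a snippet that installs ggplot2 only when it is missing. The use of requireNamespace() is an illustrative choice here, not something these notes prescribe.

```r
# Install ggplot2 only if it is not already available, then load it.
if (!requireNamespace("ggplot2", quietly = TRUE)) {
  install.packages("ggplot2")
}
library(ggplot2)
```

Running this at the top of a script means the install step is skipped on machines where the package is already present.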

Saier (Vivien) Ye, Department of Statistics, Yale University 2013

An overview of the data mpg:
> data(mpg)
> ?mpg
Before plotting, it is always useful to perform a sanity check on the data. This helps you gain a
general idea of the structure of the data, and spot abnormality in the data if there is any.
> summary(mpg)
> head(mpg)
> library("YaleToolkit")
> whatis(mpg)
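If the YaleToolkit package is not installed, a similar overview can be sketched with base R functions alone. The variable name missing_per_column is only illustrative.

```r
library(ggplot2)

# Dimensions and column types of the mpg data frame
dim(mpg)
str(mpg)

# Number of missing values in each column
missing_per_column <- colSums(is.na(mpg))
missing_per_column
```

A column with an unexpected type or a large number of missing values is usually worth investigating before any plotting.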

There's a quick plotting function in ggplot2 called qplot(), which is very similar to the plot()
function from base graphics. A simple qplot() call looks like the following:
> qplot(displ, hwy, data = mpg, colour = factor(cyl))
You can do a lot with qplot() alone, but the main disadvantage is that it only permits a single
dataset and a single set of aesthetic mappings. ggplot2 is designed to work in a layered fashion,
such that each graphical component is added to the plot as a layer. Each layer can come from a
different dataset and have a different aesthetic mapping, allowing us to create plots that could not
be generated using qplot().
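As a rough sketch of what "each layer can come from a different dataset" means, the example below draws the raw mpg points in one layer and a small summary data frame in a second layer. It uses ggplot(), which is explained next; the names cyl_means and p_layers are hypothetical and chosen only for this illustration.

```r
library(ggplot2)

# A second, much smaller data set: mean displ and hwy for each cylinder count
cyl_means <- aggregate(cbind(displ, hwy) ~ cyl, data = mpg, FUN = mean)

p_layers <- ggplot(mpg, aes(displ, hwy)) +
  # Layer 1: every car in mpg, drawn in grey
  geom_point(colour = "grey60") +
  # Layer 2: one large point per cylinder count, taken from cyl_means
  geom_point(data = cyl_means, aes(colour = factor(cyl)), size = 5)
p_layers
```

The second layer reuses the x and y mappings from ggplot() but supplies its own data and its own colour mapping.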
A more systematic way to use the ggplot2 package is to build plots with the function ggplot().
The function takes two primary arguments: data and an aesthetic mapping. These arguments set up
defaults for the plot and can be omitted if you specify data and aesthetics when adding each layer.
data is the data frame that you want to visualize, and the aes() mappings are passed on to the
plot elements. A simple example:
> p <- ggplot(mpg, aes(displ, hwy))
With this call, we have set up a plot that draws from the data frame given as the first argument;
the first variable inside aes() is mapped to the x-axis and the second to the y-axis.
However, if you just type p or print(p) in the R console, you'll get back a message saying that the
plot lacks any layers (recent versions of ggplot2 draw an empty panel instead). Looking at the
command, we have not specified which kind of geometric object will represent the data. Let's add
points, for a scatterplot.
> p+geom_point()
You add geometries to a plot with one of the geom_*() functions, using the + operator. Our
command now has two parts connected by +: the plot set up by ggplot() and a layer of points.
This is what we meant by saying ggplot2 works in layers. We use layers to add various features
to the graph, and to customize the graph to our needs.
Notice how we didn't write any arguments in geom_point(). In order to map points to values
on the x and y axes, geom_point() needs to know what variables we're mapping to the x and y
axes. It inherited this information from ggplot(). If, however, you supply arguments to the
geom_*() functions, they will override what is in the main ggplot() function.
The best way to demonstrate this is to make a few plots.

> ggplot(mpg, aes(displ, hwy)) +
+   geom_point(aes(color = factor(cyl))) +
+   geom_line()

[Figure: hwy against displ, with a legend for factor(cyl)]

The points are colored, the line is not, and a legend has been added automatically.
Next, we'll pass the color mapping to the line, not the points:

> ggplot(mpg, aes(displ, hwy)) +
+   geom_point() +
+   geom_line(aes(color = factor(cyl)))

[Figure: hwy against displ, with a legend for factor(cyl)]

Now the line is colored, and the points are not. It's kind of hard to tell from this plot, but lines
of different colors are not connected to each other. The legend also reflects the fact that the
lines are colored.
Finally, we can include the color mapping in ggplot(), meaning that all the geoms that follow
will inherit this mapping:

> ggplot(mpg, aes(displ, hwy, color = factor(cyl))) +
+   geom_point() +
+   geom_line()

[Figure: hwy against displ, with a legend for factor(cyl)]

3 Displaying Statistics
You'll frequently want to add statistical analyses to your plots, or your plots may consist of
statistical summaries in the first place. ggplot2 has a few built-in statistics to make plotting easier.
The statistic I use most often is a smoothing line, added with stat_smooth(). There are a number
of different smoothing lines you can add, from local regression lines (loess) to linear or logistic
regressions. Let's start with the mpg data again.

> p <- ggplot(mpg, aes(displ, hwy))
> p + geom_point() + stat_smooth()

[Figure: hwy against displ]

By default, stat_smooth() has added a loess line, with a 95% confidence band around it drawn as a
semi-transparent ribbon. You can also specify the method argument to add a different smoothing
line:

> p + geom_point() + stat_smooth(method = "lm")

[Figure: hwy against displ]
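The two smoothing calls shown so far rely on defaults. The sketch below shows a few other stat_smooth() arguments that are commonly adjusted: span (loess only), se, and level. The object names p_smooth, p_loess and p_lm are hypothetical, and the particular values are arbitrary choices for illustration.

```r
library(ggplot2)

p_smooth <- ggplot(mpg, aes(displ, hwy)) + geom_point()

# Loess curve without the ribbon; a smaller span follows the data more closely
p_loess <- p_smooth + stat_smooth(method = "loess", span = 0.5, se = FALSE)

# Straight-line fit with a 99% confidence band instead of the default 95%
p_lm <- p_smooth + stat_smooth(method = "lm", level = 0.99)

p_loess
p_lm
```

Setting se = FALSE is often useful when several smoothing lines share one panel and the overlapping ribbons would obscure the points.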

Each statistic is drawn with a default geom. For stat_smooth(), the default geoms
are the semi-transparent ribbon and the smoothing line. You could also represent the output with
points and errorbars.

> p + stat_smooth(geom = "point") + stat_smooth(geom = "errorbar")

[Figure: hwy against displ]

To compare a numeric variable across the levels of a categorical variable, you can calculate the
statistics that make up boxplots:

> ggplot(mpg, aes(class, hwy)) +
+   geom_boxplot()

[Figure: boxplots of hwy by class (2seater, compact, midsize, minivan, pickup, subcompact, suv)]
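To see the statistics behind the boxes, one option is to pull the computed values back out of the plot with layer_data(). This is a sketch; the names bp, box_stats and class_median are hypothetical, and the comparison with tapply() is added here only as a cross-check.

```r
library(ggplot2)

bp <- ggplot(mpg, aes(class, hwy)) + geom_boxplot()

# One row per box: whisker ends, hinges and the median computed by the stat
box_stats <- layer_data(bp)[, c("x", "ymin", "lower", "middle", "upper", "ymax")]
box_stats

# The "middle" column should agree with the per-class medians from base R
class_median <- tapply(mpg$hwy, mpg$class, median)
class_median
```

Here lower and upper are the hinges of each box, and ymin and ymax are the whisker ends.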

4 Grouping
An important feature of ggplot2 is that you can easily represent data as grouped, and draw geoms
and calculate statistics according to these groupings. We've already seen an example of this, where
lines of different colors aren't connected:

> ggplot(mpg, aes(displ, hwy, color = factor(cyl))) +
+   geom_point() +
+   stat_smooth(method = "lm")

[Figure: hwy against displ, with a legend for factor(cyl)]

We mapped the color aesthetic to the variable .... in ggplot(). When we add points to the
plot, their color is set according to their color group. The same holds for the regression lines.
There are various ways of mapping groups to the plot, for example, point shape:

> ggplot(mpg, aes(displ, hwy, shape = factor(cyl))) +
+   geom_point() +
+   stat_smooth(method = "lm")

[Figure: hwy against displ, with a shape legend for factor(cyl): 4, 5, 6, 8]

Now the color of the smoothing lines isn't meaningful anymore, but they've been grouped and
separated.
We could also group by size:

> ggplot(mpg, aes(displ, hwy, size = factor(cyl))) +
+   geom_point() +
+   stat_smooth(method = "lm")

[Figure: hwy against displ, with a size legend for factor(cyl)]

A silly plot though.

We could also define a grouping which is only meaningful for geom_smooth() and not geom_point().
This will cause each smoothing line to be calculated and appear separately, but the points will be
undifferentiated.

> ggplot(mpg, aes(displ, hwy, linetype = factor(cyl))) +
+   geom_point() +
+   stat_smooth(method = "lm")

[Figure: hwy against displ, with a linetype legend for factor(cyl)]

If you use multiple grouping variables, groups will be defined as unique combinations of each of
the levels.

> library(MASS)
> ggplot(mpg, aes(displ, hwy, color = factor(cyl),
+                 shape = factor(year),
+                 linetype = factor(year))) +
+   geom_point() +
+   stat_smooth(method = "rlm")

[Figure: hwy against displ, with legends for factor(cyl) and factor(year): 1999, 2008]
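Grouping by unique combinations can also be written out explicitly with the group aesthetic and interaction(), without changing color, shape or linetype. This is a sketch under the assumption that a plain linear fit is acceptable for illustration; the name g_explicit is hypothetical.

```r
library(ggplot2)

# One group per (cyl, year) combination that occurs in the data
g_explicit <- ggplot(mpg, aes(displ, hwy, group = interaction(cyl, year))) +
  geom_point() +
  stat_smooth(method = "lm", se = FALSE)
g_explicit

# Count the groups assigned to the points layer
n_groups <- length(unique(layer_data(g_explicit, 1)$group))
n_groups
```

Because no visual aesthetic is mapped, the fitted lines all look alike; the explicit group only controls how the data are split before the statistic is computed.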

Grouping isn't only useful for smoothing functions. Boxplots, for example, can be grouped:

> ggplot(mpg, aes(class, hwy, fill = factor(year))) +
+   geom_boxplot()

[Figure: boxplots of hwy by class (2seater, compact, midsize, minivan, pickup, subcompact, suv), filled by factor(year): 1999, 2008]

You can reorder class according to median(hwy):

> ggplot(mpg, aes(reorder(class, hwy, median), hwy, fill = factor(year))) +
+   geom_boxplot()

[Figure: boxplots of hwy by reorder(class, hwy, median), in the order pickup, suv, minivan, 2seater, subcompact, compact, midsize, filled by factor(year): 1999, 2008]
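What reorder() does to the class variable can be checked outside of any plot. The sketch below compares the factor levels it produces with the sorted per-class medians; class_medians and ordered_class are hypothetical names.

```r
library(ggplot2)

# Median highway mileage per class, sorted from lowest to highest
class_medians <- sort(tapply(mpg$hwy, mpg$class, median))
class_medians

# reorder() returns a factor whose levels follow that ordering
ordered_class <- reorder(mpg$class, mpg$hwy, median)
levels(ordered_class)
```

ggplot2 places discrete values along the axis in the order of the factor levels, which is why the boxes appear sorted by median.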

5 Faceting
A very useful visualization technique is the small multiple, i.e. several panels arranged in rows
and columns within one graph. In the base graphics of R, this is achieved with par(mfrow=c()). In
ggplot2, it is known as faceting, and there are two ways of achieving it: facet_wrap() and
facet_grid().
facet_wrap() creates and labels a plot for every level of a factor which is passed to it. For
example:

> ggplot(mpg, aes(displ, hwy)) +
+   geom_point() +
+   stat_smooth() +
+   facet_wrap(~year)

[Figure: hwy against displ, in two panels: 1999 and 2008]

> ggplot(mpg, aes(displ, hwy)) +
+   geom_point() +
+   facet_wrap(~manufacturer)

[Figure: hwy against displ, one panel per manufacturer (audi, chevrolet, dodge, ford, honda, hyundai, jeep, land rover, lincoln, mercury, nissan, pontiac, subaru, toyota, volkswagen)]

One important thing to note here is that the x and y scales are the same in every facet.
If you would like free scales in each facet, modify your facet line to: facet_wrap(~factor, scales = "free").
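A sketch of the free-scales variant, applied to the manufacturer example; the name p_free is hypothetical. The scales argument also accepts "free_x" and "free_y" to free only one axis.

```r
library(ggplot2)

# Each manufacturer panel gets its own x and y range
p_free <- ggplot(mpg, aes(displ, hwy)) +
  geom_point() +
  facet_wrap(~manufacturer, scales = "free")
p_free
```

Free scales make the pattern within each panel easier to see, at the cost of making panels harder to compare with each other.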
With two variables, you can facet with facet_grid(). Recall the tips data shown in class by Prof.
Chen:

> tips <- read.table("tips.dat", header = TRUE)
> head(tips)
  TOTBILL  TIP FEMALE SMOKER DAY TIME SIZE
1   16.99 1.01      1      0   6    1    2
2   10.34 1.66      0      0   6    1    3
3   21.01 3.50      0      0   6    1    3
4   23.68 3.31      0      0   6    1    2
5   24.59 3.61      1      0   6    1    4
6   25.29 4.71      0      0   6    1    4
> ggplot(tips, aes(SIZE, TIP/TOTBILL)) +
+   geom_point(position = position_jitter(width = 0.2, height = 0)) +
+   facet_grid(TIME ~ FEMALE)

[Figure: TIP/TOTBILL against SIZE, in a grid of panels with rows for TIME and columns for FEMALE]

In the facet_grid() command, TIME defines the rows of the grid and FEMALE defines the
columns. Across the columns, 0 stands for male and 1 stands for female; down the rows, 0
stands for lunch and 1 stands for dinner. For example, it looks like a male bill payer in a
dinner party of two tipped about 70%.
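Since tips.dat is not distributed with ggplot2, the rows ~ columns convention of facet_grid() can be tried on the built-in mpg data instead. This is only a sketch; the choice of year and drv as faceting variables and the name p_grid are arbitrary.

```r
library(ggplot2)

# Rows are defined by year, columns by drv
# (in mpg, drv is coded 4 = four-wheel, f = front-wheel, r = rear-wheel drive)
p_grid <- ggplot(mpg, aes(displ, hwy)) +
  geom_point() +
  facet_grid(year ~ drv)
p_grid
```

The variable on the left of the ~ labels the strips on the side of the grid, and the variable on the right labels the strips along the top.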

6 Positioning
How geoms are positioned relative to each other is another feature of plots that you might want to
adjust. The possible position adjustments in ggplot2 are:
position_dodge()
position_fill()
position_identity()
position_jitter()
position_stack()
We will use another data set for the demonstration of positioning in ggplot2. The data set is
called diamonds.
> data(diamonds)
> head(diamonds)
> summary(diamonds)
Here are some demonstrations of these positions:

> p <- ggplot(diamonds, aes(clarity, fill = cut))
> p + geom_histogram(aes(y = ..count..), position = "stack")

[Figure: stacked bars of count by clarity, filled by cut (Fair, Good, Very Good, Premium, Ideal)]

> p + geom_histogram(aes(y = ..count..), position = "fill")

[Figure: bars by clarity rescaled to run from 0.00 to 1.00, filled by cut (Fair, Good, Very Good, Premium, Ideal)]

> p + geom_histogram(aes(y = ..count..), position = "dodge")

[Figure: side-by-side bars of count by clarity, filled by cut (Fair, Good, Very Good, Premium, Ideal)]

> p + geom_histogram(aes(y = ..count..), position = "identity", alpha = 0.2)

[Figure: overlapping semi-transparent bars of count by clarity, filled by cut (Fair, Good, Very Good, Premium, Ideal)]

In the identity case, the parameter alpha controls the level of transparency. The lower the
number, the more transparent the bins are.
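In current ggplot2 releases, geom_histogram() expects a continuous x variable, so for a discrete variable such as clarity the usual choice is geom_bar(), which counts the rows in each category. The sketch below assumes such a recent version; the position adjustments themselves behave as described above. The object names are hypothetical.

```r
library(ggplot2)

pb <- ggplot(diamonds, aes(clarity, fill = cut))

# Bars for each cut stacked on top of each other
p_stack <- pb + geom_bar(position = "stack")

# Stacked and rescaled so that every bar has height 1 (proportions)
p_fill <- pb + geom_bar(position = "fill")

# Bars for each cut placed side by side
p_dodge <- pb + geom_bar(position = "dodge")

# Bars drawn on top of each other, made transparent so all remain visible
p_identity <- pb + geom_bar(position = "identity", alpha = 0.2)
```

Printing any of these objects draws the corresponding plot; "fill" is the one to use when relative shares matter more than absolute counts.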

7 Scales
Every aesthetic which is mapped to the data expresses the magnitude of its value along some scale.
These can be adjusted using the scale_*() functions.
The most common scale adjustments are for the x and y axes. The most basic way to adjust the
x and y scales for continuous data is with scale_x_continuous() or scale_y_continuous().
Some examples of scale manipulation:
> p <- ggplot(mpg, aes(displ, hwy)) + geom_point()
> # Set the x-axis title
> p + scale_x_continuous(name = "Engine Displacement in Liters")
> # or
> p + xlab("Engine Displacement in Liters")
> # Restrict the range of the x-axis
> p + scale_x_continuous(limits = c(2, 4))
> # or
> p + xlim(2, 4)
> # Put the x-axis on a log10 scale
> p + scale_x_continuous(trans = "log10")
> # or
> p + scale_x_log10()
Some people don't like the default discrete colors. With scale_color_brewer() you can set the
color palette to one of the RColorBrewer palettes. To see the possible options:
> library(RColorBrewer)
> display.brewer.all()
If you like Set1 for qualitative differences:
> p + scale_color_brewer(palette = "Set1")
where p is a ggplot2 object with a discrete variable mapped to color.
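A brewer scale only has a visible effect when some variable is mapped to color. The sketch below builds such a plot from mpg; the legend title "Cylinders" and the names p_col and p_set1 are illustrative choices, not part of the original example.

```r
library(ggplot2)

# A plot with a discrete variable mapped to colour
p_col <- ggplot(mpg, aes(displ, hwy, colour = factor(cyl))) + geom_point()

# Replace the default discrete colours with the Set1 palette
p_set1 <- p_col + scale_colour_brewer(palette = "Set1", name = "Cylinders")
p_set1
```

scale_colour_brewer() and scale_color_brewer() are the same function under two spellings.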

8 Some Examples
In this section, we look at some more advanced examples of plots made with ggplot2.
We've seen the basic histogram in ggplot2, where frequency is represented by bars. There are a
number of variations on the histogram. They all use the same underlying statistical
transformation, the bin stat, but use different geoms to display the results.

> d <- ggplot(diamonds, aes(carat)) + xlim(0, 3)
> d + stat_bin(aes(ymax = ..count..), binwidth = 0.1, geom = "area")

[Figure: area plot of count against carat]

> d + stat_bin(
+   aes(size = ..density..), binwidth = 0.1,
+   geom = "point", position = "identity"
+ )

[Figure: points of count against carat, sized by density]

> d + stat_bin(
+   aes(y = 1, fill = ..count..), binwidth = 0.1,
+   geom = "tile", position = "identity"
+ )

[Figure: tiles along carat at y = 1, filled by count]

The first histogram shown here uses the area geom to display frequency, the second uses the point
geom, and the third uses the tile geom.
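The ..count.. and ..density.. notation refers to variables computed by the stat. In more recent ggplot2 releases the same variables are written with after_stat(); the sketch below assumes such a version and redraws the area variant. The names d2 and p_area are hypothetical.

```r
library(ggplot2)

d2 <- ggplot(diamonds, aes(carat)) + xlim(0, 3)

# Same bin stat as a histogram, displayed with the area geom;
# after_stat(count) refers to the per-bin count computed by stat_bin()
p_area <- d2 + stat_bin(aes(y = after_stat(count)), binwidth = 0.1, geom = "area")
p_area
```

Because xlim(0, 3) drops the few diamonds above 3 carats, ggplot2 reports a warning about removed rows when the plot is drawn.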
The next plot was shown briefly in last week's session. It again uses the diamonds data set and
shows the distribution of depth, broken down by cut.

> depth_dist <- ggplot(diamonds, aes(depth)) + xlim(58, 68)
> depth_dist +
+   geom_histogram(aes(fill = cut), binwidth = 0.1, position = "fill")

[Figure: bars along depth rescaled to run from 0.00 to 1.00, filled by cut (Fair, Good, Very Good, Premium, Ideal)]

This is essentially a histogram with a very small binwidth, rescaled so that each bin shows
proportions, and it gives a much clearer picture than traditional bars.
The last example that I'd like to discuss is drawing maps in ggplot2. ggplot2 provides some
tools that make it easy to combine maps from the maps package with other ggplot2 graphics. To
draw maps, you have to install the maps package in addition to ggplot2 (and, of course,
don't forget to load it before use).
The example below shows crime statistics for the states of the United States (excluding HI
and AK). The crime data, USArrests, is built into R (in the datasets package); the state outlines
come from the maps package.

> library(maps)
> states <- map_data("state")
> arrests <- USArrests
> names(arrests) <- tolower(names(arrests))
> arrests$region <- tolower(rownames(USArrests))
> choro <- merge(states, arrests, by = "region")
> # Reorder the rows because order matters when drawing polygons
> choro <- choro[order(choro$order), ]
> qplot(long, lat, data = choro, group = group, fill = assault,
+       geom = "polygon")

[Figure: map of the US states (lat against long), filled by assault]

> qplot(long, lat, data = choro, group = group, fill = assault / murder,
+       geom = "polygon")

[Figure: map of the US states (lat against long), filled by assault/murder]

Drawing maps usually involves the polygon geom, which is otherwise used relatively rarely.
Details of this geom can be found here: http://docs.ggplot2.org/current/geom_polygon.html.
Briefly, two pieces of information are needed: the coordinates of each polygon (positions) and
the values associated with each polygon (values). Here they come from two data frames, states
and arrests, which we merged into choro. Because polygons are drawn by connecting vertices in
row order and merge() does not preserve that order, we had to reorder choro by its order column.
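The same map can be sketched with ggplot() and geom_polygon() in place of qplot(), which is deprecated in recent ggplot2 releases. The data preparation repeats the steps above; coord_quickmap() is an addition made here to keep the aspect ratio map-like, and the name us_map is hypothetical. The maps package must be installed.

```r
library(ggplot2)
library(maps)

# State outlines and arrest statistics, joined on the lower-case state name
states <- map_data("state")
arrests <- USArrests
names(arrests) <- tolower(names(arrests))
arrests$region <- tolower(rownames(USArrests))
choro <- merge(states, arrests, by = "region")

# Restore the vertex order that merge() scrambled
choro <- choro[order(choro$order), ]

us_map <- ggplot(choro, aes(long, lat, group = group, fill = assault)) +
  geom_polygon() +
  coord_quickmap()
us_map
```

The group aesthetic keeps each polygon separate, so that the outline of one state is not joined to the next.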

9 Resources ggplot2
The best resource for low-level details will always be the built-in documentation. This
documentation is accessible online at: http://docs.ggplot2.org/current/. You can also use the usual help
syntax (help() or ?) in R to access the contents. The online documentation is more convenient,
as you can see all the example plots and navigate between topics easily.

The official CRAN website, http://cran.r-project.org/web/packages/ggplot2/, is another
useful resource. This page links to what is new and different in each release.
Lastly, the book mentioned earlier by Hadley Wickham, the author of ggplot2,
http://www.amazon.com/dp/0387981403/ref=cm_sw_su_dp?tag=ggplot2-20, provides a detailed and
comprehensive introduction to data analysis with ggplot2. The book website,
http://ggplot2.org/book/, contains updates to the book, as well as all graphics used in the
book, with the code and data needed to reproduce them.
