Getting Started with R
An Introduction for Biologists
Second Edition

ANDREW P. BECKERMAN
DYLAN Z. CHILDS
Department of Animal and Plant Sciences, University of Sheffield

OWEN L. PETCHEY
Department of Evolutionary Biology and Environmental Studies,
University of Zurich

Great Clarendon Street, Oxford, OX2 6DP,
United Kingdom
Oxford University Press is a department of the University of Oxford.
It furthers the University’s objective of excellence in research, scholarship,
and education by publishing worldwide. Oxford is a registered trade mark of
Oxford University Press in the UK and in certain other countries
© Andrew Beckerman, Dylan Childs, & Owen Petchey 2017
The moral rights of the authors have been asserted
First Edition published in 2012
Second Edition published in 2017
Impression: 1
All rights reserved. No part of this publication may be reproduced, stored in
a retrieval system, or transmitted, in any form or by any means, without the
prior permission in writing of Oxford University Press, or as expressly permitted
by law, by licence or under terms agreed with the appropriate reprographics
rights organization. Enquiries concerning reproduction outside the scope of the
above should be sent to the Rights Department, Oxford University Press, at the
address above
You must not circulate this work in any other form
and you must impose this same condition on any acquirer
Published in the United States of America by Oxford University Press
198 Madison Avenue, New York, NY 10016, United States of America
British Library Cataloguing in Publication Data
Data available
Library of Congress Control Number: 2016946804
ISBN 978–0–19–878783–9 (hbk.)
ISBN 978–0–19–878784–6 (pbk.)
DOI 10.1093/acprof:oso/9780198787839.001.0001
Printed and bound by
CPI Litho (UK) Ltd, Croydon, CR0 4YY
Links to third party websites are provided by Oxford in good faith and
for information only. Oxford disclaims any responsibility for the materials
contained in any third party website referenced in this work.
Contents

Preface
Introduction to the second edition
What this book is about
How the book is organized
Why R?
Updates
Acknowledgements

Chapter 1: Getting and Getting Acquainted with R
1.1 Getting started
1.2 Getting R
1.3 Getting RStudio
1.4 Let’s play
1.5 Using R as a giant calculator (the size of your computer)
1.6 Your first script
1.7 Intermezzo remarks
1.8 Important functionality: packages
1.9 Getting help
1.10 A mini-practical: some in-depth play
1.11 Some more top tips and hints for a successful first (and more) R experience
Appendix 1a Mini-tutorial solutions
Appendix 1b File extensions and operating systems

Chapter 2: Getting Your Data into R
2.1 Getting data ready for R
2.2 Getting your data into R
2.3 Checking that your data are your data
2.4 Basic troubleshooting while importing data
2.5 Summing up
Appendix Advanced activity: dealing with untidy data

Chapter 3: Data Management, Manipulation, and Exploration with dplyr
3.1 Summary statistics for each variable
3.2 dplyr verbs
3.3 Subsetting
3.4 Transforming
3.5 Sorting
3.6 Mini-summary and two top tips
3.7 Calculating summary statistics about groups of your data
3.8 What have you learned . . . lots
Appendix 3a Comparing classic methods and dplyr
Appendix 3b Advanced dplyr

Chapter 4: Visualizing Your Data
4.1 The first step in every data analysis: making a picture
4.2 ggplot2: a grammar for graphics
4.3 Box-and-whisker plots
4.4 Distributions: making histograms of numeric variables
4.5 Saving your graphs for presentation, documents, etc.
4.6 Closing remarks

Chapter 5: Introducing Statistics in R
5.1 Getting started doing statistics in R
5.2 χ² contingency table analysis
5.3 Two-sample t-test
5.4 Introducing . . . linear models
5.5 Simple linear regression
5.6 Analysis of variance: the one-way ANOVA
5.7 Wrapping up
Appendix Getting packages not on CRAN

Chapter 6: Advancing Your Statistics in R
6.1 Getting started with more advanced statistics
6.2 The two-way ANOVA
6.3 Analysis of covariance (ANCOVA)
6.4 Overview: an analysis workflow

Chapter 7: Getting Started with Generalized Linear Models
7.1 Introduction
7.2 Counts and rates: Poisson GLMs
7.3 Doing it wrong
7.4 Doing it right: the Poisson GLM
7.5 When a Poisson GLM isn’t good for counts
7.6 Summary, and beyond simple Poisson regression

Chapter 8: Pimping Your Plots: Scales and Themes in ggplot2
8.1 What you already know about graphs
8.2 Preparation
8.3 What you may want to customize
8.4 Axis labels, axis limits, and annotation
8.5 Scales
8.6 The theme
8.7 Summing up

Chapter 9: Closing Remarks: Final Comments and Encouragement

General Appendices
Appendix 1 Data Sources
Appendix 2 Further Reading
Appendix 3 R Markdown

Index

Preface

Introduction to the second edition

This is a book about how to use R, an open source programming language and
environment for statistics. It is not a book about statistics per se, but a
book about getting started using R. It is a book that we hope will teach you
how using R can make your life (research career) easier.
Several years ago we published the first edition of this book, aiming to help
people move from ‘hearing about R’ to ‘using R’. We had realized that there
were lots of books about exploring data and doing statistics with R, but none
specifically designed for people that didn’t have a lot of experience or
confidence in using much more than a spreadsheet, people that didn’t have a
lot of time, and people that appreciated an engaging and sometimes humorous
initial journey into R. The first edition was also designed for people who did
know statistics and other packages, but wanted a quick ‘getting started’
guide, because, well, it is hard to get started with R in some ways. Overall,
we aimed to make the somewhat steep learning curve more of a walk in the park.
Over the past five years much has changed. Most significantly, R has evolved
as a platform for doing data analysis, for managing data, and for producing
figures. Other things have not changed. People still seem to need and
appreciate help in navigating the process of getting started working with R.
Thus, this new version of the book does two things. First, it retains our
focus on helping you get started using R. We love doing this and we’ve been
teaching this for 15 years. Not surprisingly, many of you are also finding
that this getting-started book is great for undergraduate and graduate
teaching. We thank you all for your feedback!
Second, we have substantially revised how we use, and thus suggest you use, R.
Our changes and suggestions take advantage of some new and very cool,
efficient, and straightforward tools. We think these changes will help you
focus even more on your data and questions. This is good.
If you compare this second edition with the first, you will find several
differences. First, we no longer rely on base R tools and graphics for data
manipulation and figure making, instead introducing dplyr and ggplot2. Second,
we’ve expanded the set of basic statistics we introduce to you, including new
examples of a simple regression and a one-way and a two-way ANOVA, in addition
to the old ANCOVA example. Third, we provide an entire new chapter on the
generalized linear model. Oh, yes, and we have added an author, Dylan.

WHAT’S SO DIFFERENT FROM THE FIRST EDITION?

We teach a particular workflow for quantitative problem solving: have a clear
question, get the right data for that question, inspect and visualize the
data, use the visualization to reveal the answer to the question, make a
statistical model that reflects your question, check the assumptions of the
model, interpret the model to confirm or refute your answer, and clearly and
beautifully communicate your answer in a figure.
In R there are many different tools, and combinations of these tools, for
accomplishing this workflow. In the first edition of this book we introduced a
set of ‘classic’ R tools drawn from the base R installation. These classic
tools worked and, importantly, continue to work very well. We taught them in
our courses for years. We used them in our research for years. We still use
them sometimes. And as you start to use R, and interact with people using R,
and perhaps share code, you will find many people using these classic tools
and methods.
But the tools and their syntax were designed a long time ago. Many employ a
rather idiosyncratic set of symbols and syntax to accomplish tasks. For
example, square brackets are used for selecting parts of datasets, and dollar
signs for referring to particular variables. Sometimes different tools that
perform similar tasks work in very different ways. This makes for rather
idiosyncratic instructions that are not so easy for people to read or to
remember how to write.
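
To make the remark about square brackets and dollar signs concrete, here is a
minimal sketch using base R only. The data frame limpets and its columns are
made up for illustration; they are not one of the book's datasets.

```r
# A small, made-up data frame (hypothetical; not one of the book's datasets).
limpets <- data.frame(
  site = c("A", "A", "B", "B"),
  eggs = c(12, 7, 15, 3),
  stringsAsFactors = FALSE
)

# Dollar sign: refer to one variable (column) by name.
egg_counts <- limpets$eggs

# Square brackets: choose rows (before the comma) and columns (after it).
# Here: rows where eggs is greater than 10, keeping only the 'site' column.
big_clutch_sites <- limpets[limpets$eggs > 10, "site"]

print(egg_counts)
print(big_clutch_sites)
```

Both lines work, but neither reads much like a sentence, which is the sort of
idiosyncrasy the paragraph above has in mind.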
So after much deliberation, and some good experiences, we decided that in this
second edition we would introduce a popular and new set of tools contributed
by Sir¹ Hadley Wickham and many key collaborators (http://had.co.nz). These
new tools introduce a set of quite standardized and coherent syntax and exist
in a set of add-on packages; you will learn exactly what these are and how to
use them later. And you will also learn some base R. In fact, you will learn a
great deal of base R.
We decided to teach this new way of using R because:

• The tools use a more ‘natural language’ that is easier for humans to
work with.
• The standardization and coherence among the tools make them easy
to learn and use.
• The tools work very well for simple and small problems, but also scale
very intuitively and naturally to quite complex and large problems.
• There are tools for every part of the workflow, from data management
to statistical analysis and making beautiful graphs.
• Each of us independently migrated to this new set of tools, giving
us greater confidence that it’s the way forward. (Well, Andrew was
forced a bit.)

Though we are confident that teaching newcomers these new tools is the right
thing to do, there are some risks and, in particular, people taught only these
new tools may not be able to work easily with people or code using the classic
way. Furthermore, some colleagues have questioned the wisdom of teaching this
‘modern’ approach to entry-level students (i.e. those with no or little
previous experience with R), especially if taught in the absence of the
classic approach (funnily enough, many of these ‘concerned’ colleagues don’t
use R at all!). Certainly the risks mentioned above are real, and for that
reason we provide a short appendix in Chapter 3 (the chapter on Data
management) that links the classic and new methods. The classic way can still
sometimes be the best way. And old dogs don’t often agree to learning new
tricks.

¹ Unofficial knighthood for contributions to making our R-lives so much easier
and beautiful.

Another concern voiced asks why we’re teaching ‘advanced R’ at entry level,
with the idea that the use of new tools and add-on packages implies
‘advanced’. After all, why wouldn’t the ‘base’ R distribution contain
everything an entry-level user needs? Well, it does, but we’ve found the
standardization and syntax in the add-on packages to be valuable even for us
as seasoned users. And one should not read ‘base’ R distribution as ‘basic’ R
distribution, or ‘add-on’ package as ‘advanced’ package. The ‘base’
distribution contains many advanced tools, and many add-on packages contain
very basic tools.
We hope you enjoy this new Getting Started with R.

What this book is about

We love R. We use statistics in our everyday life as researchers and teachers.
Sometimes even more: Owen used it to explore the nursing behaviour of his
firstborn. We are first and foremost evolutionary and community ecologists,
but over the past 15 years we have developed, first in parallel and then
together, an affinity for R. We want to share our 40+ years of combined
experience using R to show you how easy, important, and exciting it can be.
This book is based on 3–5-day courses we give in various guises around the
world. The courses are designed to give students and staff alike a boost up
the steep initial learning curve associated with R.
We assume that course participants, and you as readers, already use some
spreadsheet, statistical, and graphing programs (such as Excel, SPSS, Minitab,
SAS, JMP, Statistica, and SigmaPlot). Most participants, and we hope you, have
some grasp of common statistical methods, including the chi-squared test, the
t-test, and ANOVA. In return for a few days of their lives, we give
participants knowledge about how to easily use R, and R only, to manage data,
make figures, and do statistics. R changed our research lives, and many
participants agree that it has done the same for them.
The efforts we put into developing the course and this book are, however,
minuscule compared with the efforts of the R Core Development Team. Please
remember to acknowledge them and package contributors when you use R to
analyse and publish your amazing findings.

WHAT YOU NEED TO KNOW TO MAKE THIS BOOK WORK FOR YOU

There are a few things that you need to know to make this book, and our
ideas, work for you. Many of you already know how to do most of these
things, having been in the Internet age for long enough now, but just to
be sure:

1. You need to know how to download things from the Internet. If you
use Windows, Macintosh, or Linux, the principles are the same, but
the details are different. Know your operating system. Know your
browser and know your mouse/trackpad.
2. You need to know how to make folders on your computer and save
files to them. This is essential for being organized and efficient.
3. It is useful, though not essential, to understand what a ‘path’ is on
your computer. This is the address of a folder or a file (i.e. the path
to a file). On Windows, depending on the version you are using, this
involves a drive name, a colon (:), and slashes (\ or /). On a Macintosh
and Linux/Unix, this requires the name of your hard drive, the
name of your home directory, a tilde (~), the names of folders, and
slashes (/).
4. Finally, you need at least a basic understanding of how to do, and
why you are doing, statistics. We recommend that you know the
types of questions a t-test, a chi-squared test, linear regression,
ANOVA, and ANCOVA are designed to help you answer before you
use this book. As we said, we are not aiming to teach you statistics per
se, but how to do some of the most common plotting and most
frequent statistics in R, and understand what R is providing as output.
That said, we’ll try and teach a bit along the way.
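
As a small illustration of item 3, the sketch below builds and inspects a path
from within R using base functions. The folder and file names are hypothetical,
and nothing here needs the file to exist.

```r
# Build a path from its parts. file.path() joins them with forward slashes,
# which R accepts on Windows as well as on Macintosh and Linux/Unix.
data_file <- file.path("~", "MyProject", "data", "limpets.csv")  # hypothetical names
print(data_file)

# The tilde is shorthand for your home directory; path.expand() spells it out.
print(path.expand(data_file))

# The last element of the path is the file name itself.
print(basename(data_file))
```

What path.expand() prints depends on your operating system and user name, so
the output will differ from one computer to the next.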

How the book is organized

In this book, we will show you how to use R in the context of everyday
research in biology (or, indeed, in many other disciplines). Our philosophy
assumes that you have some data and would like to derive some understanding
from it. Typically you need to manage your data, explore your data (e.g. by
plotting it), and then analyse your data. Before any attempt at analysis, we
suggest (no, demand!) that you always plot your data. As always, analysing
(modelling) your data involves first developing a model that accurately
reflects your question, and then testing critical assumptions associated with
the statistical method (model). Only after this do you attempt interpretation.
Our focus is on developing a rigorous and efficient routine (workflow) and a
template for using R for data exploration, visualization, and analysis. We
believe that this will give you a functional approach to using R, in which you
always have the goal (understanding your data, answering your question) in
mind.
Chapter 1 is about getting R and getting acquainted with it. The chapter is a
bit like when you first meet someone who might be your friend, or might not,
so you take some time to get to know each other. We also introduce you to
another friend, RStudio, and strongly recommend that you get to know this one,
as well as R. RStudio is just great. You will fall in love with it.
Chapter 2 is about getting your data ready for R, getting it into R, and
checking it got into R correctly. Not many courses cover data preparation as
much as in this chapter, but it’s really essential for an efficient experience
with R. Good preparation makes for great performance. We give tips about what
can go wrong here, how to recognize this, and how to fix it.
Chapter 3 focuses on how you work with data once it’s in R. Usually you’ll
need to do some data manipulation before making a graph or doing a statistical
analysis. You might need to subset your data, or want to calculate mean ± SE.
We walk you through some very efficient and clear methods for doing all kinds
of data manipulations.
Chapter 4 is about visualizing your data, and comes before the chapters about
statistical analyses because we always visualize our data before we do any
statistics (you will hear that again and again throughout this book). We
introduce you to scatterplots, histograms, and box-and-whisker plots. In later
chapters, we also introduce you to plots of means and standard errors. (But we
do not introduce you to bar charts with error bars, because they are evil².)
Chapters 5, 6, and 7 finally get on to doing some statistics. Chapter 5
introduces ‘basic’ statistical tests (t-test, chi-squared contingency table
analyses, simple linear regression, and the one-way ANOVA). Chapter 6
introduces slightly more complex tests (two-way ANOVA and ANCOVA). And
Chapter 7 takes us to new territory, where we introduce about the simplest
generalized linear model around: a Poisson regression. As we said, we are
introducing how to do stuff in R and we’re not aiming to cover lots of
statistics in great detail, but along the way we try and ensure that your
understanding of statistics maps onto the output you can get from using R.
We’ve added this ‘getting started with generalized linear models’ chapter
because so many types of question in the biological sciences demand it. Our
goal is that you should have seen enough variety of analysis methods to be
comfortable and confident in moving forward and learning more yourself.
Chapter 8 comes back to figures and graphs. It is about how to make your
graphs look even more beautiful than they were during the previous chapters.
Put another way, it’s about pimping your graphs. Making the labels, symbols,
colours, shading, sizes, and everything else you might like to change look
beautiful, coordinated, and clear, so readers are amazed by the clarity with
which they can see your findings. It will also give you the skills and
flexibility to make atrocious graphs . . . be careful.
The final chapter, Chapter 9, wraps all this up and provides encouragement. It
is brief. We figure that by this point, you’ll have had enough of us, and will
be raring to get your own data into R. And that is great, because that is when
you’ll really solidify your learning.

² http://dx.doi.org/10.1371/journal.pbio.1002128

SOME CONVENTIONS IN THE BOOK

We have attempted to be consistent in the typefaces and colours of text in the
book, so that you can easily recognize different types of R command. So the
text is rather colourful. Hopefully, the advantages of clarity about what is
what will outweigh any concerns you might have about colour choices.
Throughout the book, we highlight where you can work along with us on your own
computer using R, through the use of the symbol at the side of the page.
Finally, all of the datasets we use are available online at
http://www.r4all.org/the-book/datasets/.

Why R?

If you’ve got this far, you probably know you want to learn R. Some of you
will have established research careers based around using a variety of
statistical and graphing packages. Some of you will be starting your research
career and wondering whether you should use some of the packages and
applications that your supervisor/research group uses, or jump ship to R.
Perhaps your group already uses R and you are just looking for that ‘getting
started’ book that answers what you think are embarrassing questions.
Regardless of your stage or background, we think an informal but structured
introduction to an approach and routine for using R will help. And regardless
of the motivation, we finish the Preface here by introducing a core set of
features and characteristics of R that we think make it worth using and worth
making a transition to from other applications.
First, we think you should invest the effort because it is freely available
and cross-platform (e.g. it works on Windows, Macs (OS X), and Linux). This
means that no matter where you are and with whom you work, you can share data,
figures, analyses, and, most importantly, the instructions (also known as
scripts and code) used to generate the figures and analyses. Anyone, anywhere
in the world, with any kind of Windows, Macintosh, or Linux operating system,
can use R, without a licence. If you, your department, or your university
invest heavily in multiple statistical packages, R can save a great deal of
money. When you change institutions, R doesn’t become inaccessible, get lost,
or become unusable.
Second, R is an interpreted programming language. It does not involve
extensive use of menus; you type commands instead. As a result, you have to
know what to ask R, know why you are asking R for this, and know what to
expect from R. You can’t just click on menus and get some results. This means
that by using R, you continually learn a great deal about statistics and data
analysis.
Third, it’s free. Oh, we said that already. Actually, it’s more accurate to
state that it’s freely available. Lots of people put an awful lot of effort
into developing R . . . that effort wasn’t free. Please acknowledge this
effort by citing R when you use it.
Fourth, we believe that R can replace common combinations of programs that you
might use in the process of analysing your data. For example, we have, at
times, used two or three of Excel, Minitab, SAS, Systat, JMP, SigmaPlot, and
CricketGraph, to name a few. This results in not only costly licensing of
multiple programs, but also software-specific files of various formats, all
floating around in various places on your computer (or desk) that are
necessary for the exploration, plotting, and analysis that make up a research
project. Keeping a research project organized is hard enough without having to
manage multiple files and file types, proprietary data formats, and the tools
to put them all together. Furthermore, moving data between applications
introduces extra steps into your workflow. And how much fun is it piecing all
of this together 3–6 months after submitting a manuscript, and needing to make
changes? These steps and frustrations are removed by investing in using R.
Fifth, with R you can make outstanding publication-quality and
publication-ready figures, and export them in many different formats,
including pdf. We now use only R for making graphs, and when submitting
manuscripts to journals we usually send only pdf files generated directly from
R. One of the nice things about pdfs is that they are resolution independent
(you can zoom in as far as you like and they don’t get blocky). This means
that publishers have the best possible version of your figure. And if the
quality is poor in the published version of your paper, you know it is down to
something the publishers have done!
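
The pdf export mentioned above can be sketched with the graphics devices that
come with base R. This is only an illustration, with made-up data and a
temporary, hypothetical file name; the book's own figures are built with
ggplot2 in later chapters.

```r
# Made-up data for illustration only.
body_size <- c(1.2, 2.4, 3.1, 4.8, 5.5)
growth    <- c(0.8, 1.9, 2.7, 4.1, 5.2)

# Open a pdf graphics device; width and height are given in inches.
figure_file <- file.path(tempdir(), "growth_figure.pdf")  # hypothetical file name
pdf(figure_file, width = 5, height = 4)

# Everything drawn now goes into the pdf file rather than onto the screen.
plot(body_size, growth, xlab = "Body size", ylab = "Growth rate")

# Close the device so that the file is written to disk.
dev.off()

print(file.exists(figure_file))
```

Because the file is written by code, re-running these lines after changing the
data regenerates the figure without any manual exporting.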
Finally, and quite importantly, R makes it very easy to write down and save
the instructions you want R to execute; this is called a script in R. In fact,
the script becomes a permanent, repeatable, annotated, cross-platform,
shareable record of your analysis. Your entire analysis, from transferring
your data from field or lab notebook to making figures and performing
analyses, is all in one, secure, repeatable, annotated place.
Take your time and learn the magic of R. Let’s get started.

Updates

RStudio evolves quickly, so don’t worry if what you see on your computer is a
little different from what’s printed in this book. For example, as this book
went to press, RStudio started using a new method for importing data. We
quickly updated the most important parts of the book, but for a full account
of this change, and any others, look on the book web site
www.r4all.org/the-book.

Acknowledgements

Thanks to our wives, Sophie, Amanda, and Sara, for everything. After all these
years, they know about R too. Many thanks to Ian Sherman and Lucy Nash at OUP
for their guidance, support and encouragement, to Douglas Meekison for
excellent copy-editing, and to Philip Alexander for patiently dealing with
countless “final” fixes!
Chapter 1: Getting and Getting Acquainted with R

1.1 Getting started

One of the most challenging bits of getting started with R is actually getting
R, installing it, and understanding how it works with your computer. Despite
R’s cross-platform capacity (OS X, Windows, Linux, Unix), there remain several
differences in how things can look on each platform. Thankfully, a new
application, RStudio, provides a way to standardize most of what you see and
do with R, once it is on your computer. In this chapter, we’ll walk you
through the steps of getting R and RStudio, installing them on your computer,
understanding what you’ve done, and then working through various aspects of
using R and RStudio.
This introduction will make you feel comfortable using R, via RStudio. It will
make you understand that R is a giant calculator that does whatever you ask it
to do (within reason). It will also familiarize you with how R does things,
both ‘out of the box’ and via additional ‘add-on’ packages that make R one of
the most fun and widely used programs for doing statistics and visualizing
data.

We will first walk you through getting and installing R and getting and
installing RStudio. While for many this will be trivial, our experience
suggests that many of you probably need a tiny bit of hand-holding every once
in a while.

1.2 Getting R

We assume you don’t yet have R on your computer. It will run on Macintosh,
Windows, Linux, and Unix operating systems. R has a homepage, r-project.org,
but the software itself is located for download on the Comprehensive R Archive
Network (CRAN), which you can find at cran.r-project.org (Figure 1.1).

Figure 1.1 The CRAN website front page, from where you can find the links to
download the R application.

The top box on CRAN provides access to the three major classes of operating
systems. Simply click on the link for your operating system. As we mentioned
in the Preface, R remains freely available.
You’ll hear our next recommendation quite a bit throughout the book: read the
instructions. The instructions will take you through the processes of
downloading R and installing it on your computer. It might also make sense to
examine some of the Frequently Asked Questions found at the bottom of the web
page. R has been around quite a long time now, and these FAQs reflect more
than a decade of beginners like you asking questions about how R works, etc.
Go on . . . have a look!

1.2.1 LINUX/UNIX

Moving along now, the Linux link takes you to several folders for flavours of
Linux and Unix. Within each of those is a set of instructions. We’ll assume
that if you know enough to have a Linux or Unix machine under your fine
fingertips, you can follow these instructions and take advantage of the
various tools.

1.2.2 WINDOWS

The Windows link takes you to a page with three more links. The link you want
to focus on is ‘base’. You will also notice that there is a link to the
aforementioned R FAQs and an additional R for Windows FAQs. Go on . . . have a
look! There is a tonne of good stuff in there about the various ways R works
on Windows NT, Vista, 8, 10, etc. The base link moves you further on to
instructions and the installer, as shown in Figure 1.2.

1.2.3 MACINTOSH

The (Mac) OS X link takes you to a page with several links as well
(Figure 1.3). Unless you are on a super-old machine, the first link is the one
on which you want to focus. It will download the latest version of R for
several recent distributions of OS X and offer, via a .dmg installer, to put
everything where it needs to be. Note that while not required for ‘getting
started’, getting the XQuartz X11 windowing system is a good idea; a link is
provided just below the paragraph describing the installer (see Figure 1.3).
As with Windows, the R FAQs and an additional R for OS X FAQs are
provided . . . they are good things.

Figure 1.2 Two steps to download the Windows version of R.

Figure 1.3 The download page for R for Macintosh.

1.3 Getting RStudio

So, at this stage, you should have downloaded and installed R. Well done!
However, we are not going to use R directly. Our experience suggests that you
will enjoy your R-life a lot more if you interact with R via a different
program, also freely available: the software application RStudio. RStudio is a
lovely, cross-platform application that makes interacting with R quite a bit
easier and more pleasurable. Among other things, it makes importing data a
breeze, has a standardized look and feel on all platforms, and has several
tools that make it much easier to keep track of the instructions you have to
give R to make the magic happen.

Figure 1.4 The RStudio website front page, from where you can find the links
to download the RStudio application. (Note: you must (as you have done) also
download the R application from the CRAN website.)


