
Programming in R

R is a popular open-source programming language for statistical analysis and graphics. This document introduces basic R features including: - Downloading and installing the R software from the CRAN website. - Using R as a calculator to perform arithmetic and assign values to variables. - Entering small datasets directly into R by defining vectors of values. - Getting help on R functions and commands through the help menu, documentation, and search functions.

Uploaded by

Matthew Grayson
Copyright
© © All Rights Reserved

Focus Article

Programming in R
Jane M. Horgan∗
R programming is now a major resource for statistical analysis, research, and
teaching and has an impressive suite of applications and packages. In this
article, we introduce the basic features of the language including data entry,
data description, graphical procedures, writing simple programs, and simulation.
It is not the intention to provide an exhaustive list of features of the language, just
enough to give a flavor of the structure of R, and how to get help to proceed further.
© 2011 Wiley Periodicals, Inc.

How to cite this article:


WIREs Comp Stat 2012, 4:75–84. doi: 10.1002/wics.183

Keywords: data entry analysis; graphical displays; simulation

∗ Correspondence to: [email protected]
School of Computing, Dublin City University, Dublin, Ireland

WHAT IS R?
R is a data-analysis system that provides an environment for statistical
analysis and graphics. It can be used for anything from simple calculation up to
producing elaborate graphics, performing simulations, and statistical modeling.
It is in fact a complete object-oriented programming language, is open source,
and is available from the web under the General Public License (GPL), which
allows free use. It exists for Microsoft Windows, Linux, and Unix platforms, and
for Apple Macintosh (OS versions newer than 8.6). Unlike standard statistical
packages such as SPSS and Minitab, which use point-and-click graphical user
interfaces, R is command driven; the user types commands at a prompt, and R
responds. The purpose of this article is to describe enough of the main features
of R to enable the new user to get started. In the next section we show how to
download R and describe some of its basic operations, editing, and help
procedures. The methods used to read and edit statistical data are discussed in
Section ‘Data Entry’, and an introduction to data analysis is given in Section
‘Data Analysis’. Some of the graphical features of R are examined in Section
‘Graphical Displays’, and Section ‘Simulation’ deals with an example in queuing,
to illustrate the powerful simulation tools available in R. We conclude with
some suggestions for further reading.

BASICS
As a start we look at how to download R, and get it to perform simple
calculations.

Installing R
R is obtained from the website called CRAN (Comprehensive R Archive Network),
and is downloaded by proceeding as follows:

• Go to the CRAN website at http://cran.r-project.org/
• Click ‘Download and Install R’
• Choose an operating system
• Choose the ‘base’ package
• Click on ‘Download R’
• Press the option ‘Run’

R is now installed.

To start, click on the R icon, or go to ‘Programs’, select R, and then click on
the R icon. When the R program is started, and after it prints an introductory
message on the screen, the interpreter prompts for input with ‘>’.

R as a Calculator
Expressions that are typed at the command prompt (>) are executed by the
interpreter. For example:

6+7*3/2

returns

[1] 16.5

x <- 1:4

Here the integers 1, 2, 3, 4 are assigned to the vector x. To check the contents
of x, type

Volume 4, January/February 2012  2011 Wiley Periodicals, Inc. 75


Focus Article wires.wiley.com/compstats

x

which returns

[1] 1 2 3 4

xx <- x**2

causes each element in the vector x to be squared and stored in the vector xx
(the operator ** is accepted as a synonym for the more usual ^). To examine the
contents of xx, type

xx

which gives

[1] 1 4 9 16

To multiply a vector by a constant, type

X <- 10
prod1 <- X*x
prod1
[1] 10 20 30 40

Here the value 10 is stored in X, and X*x causes each element of the vector x to
be multiplied by 10.

Some points to note:

• <- is the assignment operator; in the illustration ‘x <- 1:4’, the vector
  (1, 2, 3, 4) is assigned to x. An alternative assignment operator is just ‘=’;
• R is case sensitive; x and X represent different variables;
• Variable names can consist of any combination of lower and upper case
  letters, numerals, periods, and underscores, but cannot begin with a numeral
  or an underscore;
• All of the above examples of variables are numeric, but R supports many other
  types of data, such as nonnumeric strings and matrices.

The entities that R creates and manipulates are called objects. These include
variables, arrays of numbers, strings, or functions. All objects are stored in
what is known as the workspace.

Getting Help
The easiest way of getting help when working in the R environment is to click
the Help button on the toolbar. Alternatively you can type

help()

for on-line help, or

help.start()

for an HTML browser interface.

It could be helpful to look at some demonstrations of R by typing

demo()

which gives a list of available demonstrations. For example,

demo(graphics)

returns some examples of graphical procedures, along with the code used to
implement them.

A more specific way of getting help is to type a question mark followed by the
name of the function you require. For example:

?read.table

This will provide details on the exact syntactic structure of the instruction
‘read.table’.

If you do not know the name of the command, type some words from the topic as
follows:

help.search("data.entry")

Check yourself what this will give.

DATA ENTRY
Before carrying out a statistical analysis, it is necessary to get the data into
the computer. How this is done varies depending on the amount of data involved.
We illustrate the various options of data entry with the Anscombe Quartet
(Ref. 1) given in Table 1. It consists of four data sets (x1, y1), (x2, y2),
(x3, y3), (x4, y4), each consisting of two variables in each of which there are
11 observations.

Reading and Displaying Data on Screen
A small data set may be entered directly from the screen. It is usually stored
as a vector, which is essentially a list of numbers. To input the x1 values
given in Table 1 from the screen, type

x1 <- c(10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5)

The construct c(...) is used to define a vector containing the data points.
These are then assigned to a vector called x1. Similarly for y1, type

y1 <- c(8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68)
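Once the two vectors are entered, they can be checked and combined in the same calculator style as before. The following is a minimal sketch using only base R; the choice of checks is illustrative and not part of the original article.

```r
# First data set of Table 1, entered as two numeric vectors.
x1 <- c(10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5)
y1 <- c(8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68)

length(x1)   # number of observations entered
x1[1:3]      # first three entries, to spot typing errors
sum(x1)      # total of the x1 values

# Arithmetic works element by element, as with x and xx earlier.
y1 - x1 / 2
```

Comparing length(x1) with the 11 rows of Table 1 is a quick way to catch a value that was skipped or typed twice.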



WIREs Computational Statistics Programming in R

TABLE 1 The Anscombe Quartet


x1 y1 x2 y2 x3 y3 x4 y4
10 8.04 10 9.14 10 7.46 8 6.58
8 6.95 8 8.14 8 6.77 8 5.76
13 7.58 13 8.74 13 12.74 8 7.71
9 8.81 9 8.77 9 7.11 8 8.84
11 8.33 11 9.26 11 7.81 8 8.47
14 9.96 14 8.10 14 8.84 8 7.04
6 7.24 6 6.13 6 6.08 8 5.25
4 4.26 4 3.10 4 5.39 19 12.50
12 10.84 12 9.13 12 8.15 8 5.56
7 4.82 7 7.26 7 6.42 8 7.91
5 5.68 5 4.74 5 5.73 8 6.89
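As a preview of the file-based entry described under ‘Entering Data from a File’, the sketch below writes the first two data sets of Table 1 to a text file and reads them back. The temporary file path and the object name anscombe12 are assumptions made for illustration; they stand in for a real file such as anscombe.txt.

```r
# First four columns of Table 1, one text line per row, header first.
table1_lines <- c(
  "x1 y1 x2 y2",
  "10 8.04 10 9.14",
  "8 6.95 8 8.14",
  "13 7.58 13 8.74",
  "9 8.81 9 8.77",
  "11 8.33 11 9.26",
  "14 9.96 14 8.10",
  "6 7.24 6 6.13",
  "4 4.26 4 3.10",
  "12 10.84 12 9.13",
  "7 4.82 7 7.26",
  "5 5.68 5 4.74"
)

# Hypothetical stand-in for a data file; tempfile() keeps the sketch
# independent of any particular drive or directory.
path <- tempfile(fileext = ".txt")
writeLines(table1_lines, path)

# header = TRUE tells read.table that the first line holds variable names.
anscombe12 <- read.table(path, header = TRUE)
dim(anscombe12)      # rows and columns read
anscombe12$x1[5]     # fifth observation of x1
```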

Entering Data from a File
When the data set is large, it is better to set up a text file and access the
data from this, rather than to enter it directly from the screen. For example,
if the data in Table 1 are stored in a file called anscombe.txt in the G
directory and data subdirectory, the file can be read into R using

anscombe <- read.table("G:/data/anscombe.txt", header = TRUE)

Here header = TRUE specifies that the first line is a header, in this case
containing the names of variables. Notice that the forward slash (/) is used in
the filename, not the backslash (\) which would be expected in the Windows
environment. The backslash is an escape character within R strings, so a single
backslash cannot be used in this context.

In R, this type of data set is stored in what is referred to as a data frame,
which is an object with rows and columns. Equivalently it is a list of vectors
of the same length; the columns denote the variables, while the rows are the
observations on the variables. The read.table instruction above assigns the data
to a data frame called anscombe.

The convention for accessing the column variables is to use the name of the data
frame followed by the name of the relevant column. For example:

anscombe$x1[5]

returns

[1] 11

which is the 5th observation in the column labeled x1. An easier way of doing
this is to type

attach(anscombe)

After attach there is no need to use the data frame name. Now

x1[5]

returns

[1] 11

read.table assumes that the data in the text file are separated by spaces, as in
Table 1. Other forms include:

read.csv, used when the data points are separated by commas;

read.csv2, used when the data are separated by semicolons.

Spread Sheets
It is also possible to enter data into a spreadsheet and store it in a data
frame as follows:

anscombe <- data.frame()
fix(anscombe)

This brings up a blank spreadsheet called anscombe, and the user may then enter
the variable labels and the variable values. When the data have been entered,
right-clicking and choosing Close creates a data frame anscombe in which the new
information is stored.

Editing
If you subsequently need to amend anscombe, type and enter

fix(anscombe)


This brings up the spreadsheet with the data, which can be changed as you wish.
Alternatively click on Edit on the tool bar to get access to the Data Editor.

Missing Values
R allows vectors to contain a special NA value to indicate that a data point is
not available. These absent values are referred to as missing values. They have
to be set aside at the analysis stage: many functions, such as mean, return NA
when a missing value is present unless the argument na.rm = TRUE is supplied.

Saving and Retrieving the Workspace
To save the entire workspace use

save.image()

When exiting from R, you are given the opportunity to save the workspace: at
File on the toolbar, clicking on Exit gets a response

{Save workspace image?}

to which you answer yes or no. Alternatively, the command

q()

instructs R to quit; it also responds with

{Save workspace image?}

A saved workspace may be retrieved at File on the toolbar, clicking on Load
Workspace, and specifying the location of a workspace that you have previously
saved.

DATA ANALYSIS
We now look at how to perform some data analysis in R, by using some of its
built-in functions and by writing simple programs and functions.

Summarizing Statistical Data
First we implement some of the most commonly used descriptive statistical
measures. For the mean of x1 write

mean(x1)

which gives

[1] 9

For the standard deviation of x1

sd(x1)

gives

[1] 3.316625

Similarly for median, range, quartiles, and deciles; the function names in R are
usually close to the names of the statistics themselves.

To get an overall summary of x1, write

summary(x1)

which gives the minimum, the first quartile, the median, the mean, the third
quartile, and the maximum:

   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
    4.0     6.5     9.0     9.0    11.5    14.0

To get an overall summary of all the variables in the anscombe data frame, type

summary(anscombe)

which gives

       x1             y1               x2             y2              x3
 Min.   : 4.0   Min.   : 4.260   Min.   : 4.0   Min.   :3.100   Min.   : 4.0
 1st Qu.: 6.5   1st Qu.: 6.315   1st Qu.: 6.5   1st Qu.:6.695   1st Qu.: 6.5
 Median : 9.0   Median : 7.580   Median : 9.0   Median :8.140   Median : 9.0
 Mean   : 9.0   Mean   : 7.501   Mean   : 9.0   Mean   :7.501   Mean   : 9.0
 3rd Qu.:11.5   3rd Qu.: 8.570   3rd Qu.:11.5   3rd Qu.:8.950   3rd Qu.:11.5
 Max.   :14.0   Max.   :10.840   Max.   :14.0   Max.   :9.260   Max.   :14.0
       y3              x4           y4
 Min.   : 5.39   Min.   : 8   Min.   : 5.520
 1st Qu.: 6.25   1st Qu.: 8   1st Qu.: 6.170
 Median : 7.11   Median : 8   Median : 7.040
 Mean   : 7.50   Mean   : 9   Mean   : 7.525
 3rd Qu.: 7.98   3rd Qu.: 8   3rd Qu.: 8.190
 Max.   :12.74   Max.   :19   Max.   :12.500

One can get quite far using R to execute simple expressions from the command
line; some users may never need to go beyond this. However, at a more advanced
level, it is possible to write programs and functions in R when what is required
is not among what has been built in. This facility is one of the great benefits
of R, and we look now at how it is done.
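Before moving on to programs, the other summary measures named above (median, range, quartiles, and deciles) and the handling of missing values can be sketched as follows. This uses only base R functions; the vector x1_na is a hypothetical variant of x1 with one missing value appended for illustration.

```r
x1 <- c(10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5)

median(x1)                              # middle value
range(x1)                               # minimum and maximum
quantile(x1, c(0.25, 0.75))             # first and third quartiles
quantile(x1, seq(0.1, 0.9, by = 0.1))   # deciles

# Hypothetical example: the same data with one missing observation.
x1_na <- c(x1, NA)
mean(x1_na)                 # NA, because a missing value is present
mean(x1_na, na.rm = TRUE)   # missing value dropped before averaging
```

The quartiles agree with the 1st Qu. and 3rd Qu. entries reported by summary(x1).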


Writing R Programs
Programming is not too difficult in R because vectors and data frames are
treated as single objects, and calculations can be done on these as if they were
ordinary numbers. To illustrate we write a program to calculate the mean:

$$\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i \tag{1}$$

In standard programming languages, the calculation of Eq. (1) would involve
‘initialization’ and ‘loops’; with R however the conventions of vector
calculations make it very easy to calculate statistical functions.

Example 1: A program to calculate the mean

sx1 <- sum(x1) #sum the elements in x1
n <- length(x1) #count the number of elements in x1
averx1 <- sx1/n

Here we see that sum adds all the elements in the vector, and length counts the
number of elements in the vector. Notice also the symbol #: what follows is
ignored by R, and is just a comment for the benefit of the reader. To print the
value obtained, type

averx1
[1] 9

Alternatively, we could calculate the mean in just one statement:

averx1 <- sum(x1)/length(x1)

Let us look at how to calculate the standard deviation, which is defined as:

$$\mathrm{sd} = \sqrt{\frac{\sum_{i=1}^{n}(x_i - \bar{x})^2}{n-1}} \tag{2}$$

We illustrate step by step how to calculate Eq. (2) for the data in x1.

Example 2: A program to calculate the standard deviation

diffx1 <- x1-mean(x1) #subtract the mean from each data point
diffsq <- (diffx1)^2 #obtain the squares of these differences
sumdiffsq <- sum(diffsq) #sum the squared differences
std <- sqrt(sumdiffsq/(length(x1)-1)) #divide this sum by (length(x1)-1), take the square root

Writing

std

gives

[1] 3.316625

Of course, the standard deviation may be calculated in a single statement, as
follows:

std <- sqrt(sum((x1-mean(x1))^2)/(length(x1)-1))

We have seen that R has functions built in which calculate the most commonly
used statistical measures. You will recall that the mean and standard deviation
can be obtained directly with

mean(x1)
[1] 9
sd(x1)
[1] 3.316625

We took you through the calculations just to illustrate how easy it is to
program in R.

Program Development
There are various ways of developing programs in R. If the program is short, it
may be developed while working interactively at the workstation.

Alternatively, programs may also be developed in a text editor like Notepad,
saved and retrieved using the source statement:

source("C:/test.R")

which retrieves the program named test.R from the C directory. An easy way of
doing this while working in R is to click on File on the tool bar where you will
be given the option to Source R code, and to browse and retrieve the program you
require.

Scripts
The most useful way of writing programs is by means of an editor built in to R,
called Scripts. From File at the toolbar click on New Script (File/New Script).
You are then presented with a blank screen to develop your program. When done,
you may save and retrieve this program as you wish. File/Save causes the file to
be saved; you may designate what name you want to call it, and it will be given
a .R extension. In subsequent sessions, File/Open Script brings up all the .R
files you have saved, and you can select the one you wish to use. When you want
to execute a line or group of lines, highlight them and press Ctrl-R, that is,
Ctrl and the letter R simultaneously. The commands are then transferred to the
console window and executed.
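The save-and-retrieve cycle described for program files can also be tried entirely from the command line. The sketch below is an illustration only: it writes the mean program of Example 1 to a temporary .R file (the path is generated by tempfile() and is an assumption, standing in for a file such as test.R) and then runs it with source.

```r
# Hypothetical script file in a temporary location.
script <- tempfile(fileext = ".R")

# Save a two-line program to the file, as an editor would.
writeLines(c(
  "x1 <- c(10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5)",
  "averx1 <- sum(x1) / length(x1)   # mean, as in Example 1"
), script)

# Retrieve and execute the saved program.
source(script)
averx1
```

After source has run, the objects created by the script (here x1 and averx1) are available in the workspace, just as if the lines had been typed at the prompt.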

Creating Functions
Users can write functions of their own when what they need is not available as a
built-in function in R. We take as an example the skewness coefficient, which
measures how much the data differ from symmetry and is defined as

$$\mathrm{skew} = \frac{\sqrt{n}\,\sum_{i=1}^{n}(x_i - \bar{x})^3}{\left(\sum_{i=1}^{n}(x_i - \bar{x})^2\right)^{3/2}}. \tag{3}$$

A perfectly symmetrical set of data will have a skewness of zero; when the
skewness coefficient is substantially greater than zero, the data are asymmetric
with a long tail to the right, and a negative skewness coefficient means that
the data have a long tail to the left.

The following syntax calculates the skewness coefficient, and assigns it to a
function called skew which has one argument (x).

Example 3: Function which calculates the skewness coefficient

skew <- function(x)
{
  sum2 <- sum((x-mean(x))^2)
  sum3 <- sum((x-mean(x))^3)
  skew <- (sqrt(length(x))* sum3)/(sum2^(1.5))
  return(skew)
}

The function skew can be applied to any data set. For example

skew(y1)

gives

[1] -0.05580807

which indicates that the y1 data are slightly negatively skewed.

GRAPHICAL DISPLAYS
As well as numerical summaries, there are various pictorial representations and
graphical displays available which have a more dramatic impact on the user and
make for a better understanding of the data. The ease and speed with which
graphical displays can be produced is one of the important features of R. We
look at some of the most commonly used.

Histogram
The traditional way of examining the ‘shape’ of a set of data is a histogram.

hist(y1)

yields Figure 1.

FIGURE 1 | A histogram (title: Histogram of y1; x-axis: y1; y-axis: Frequency).

Boxplots
A boxplot is a graphical summary based on the median, quartiles, and extreme
values. To display the y1 data using a boxplot, type

boxplot(y1)

which gives Figure 2.

FIGURE 2 | A simple boxplot.

Often called the Box and Whiskers Plot, the box represents the interquartile
range which contains 50% of cases. The whiskers are the lines that extend from
the box to the highest and lowest values. The line across the box indicates the
median.

Multiple boxplots can be displayed on the same axis, by adding extra arguments
to the boxplot


function or by using the complete data frame. For example

boxplot(y1, y2)

yields Figure 3.

FIGURE 3 | Boxplot of y1 and y2.

Notice the point below the whiskers of the boxplot in y2. This data point is
called an outlier and represents a case more than 1.5 box lengths from the upper
or lower end of the box. This point is considered atypical of the data in
general, being extremely low compared to the rest of the data.

Boxplots of all the variables in the data frame anscombe are obtained with

boxplot(anscombe)

which gives Figure 4.

FIGURE 4 | Boxplots of all the variables in the data frame.

Scatter Plots
Scatter plots are useful to investigate relationships between variables. To
examine, for example, the relationship between x1 and y1, we could write:

plot(x1, y1)

to obtain Figure 5.

FIGURE 5 | A scatter plot (y1 against x1).

Here you can see that there is what is called a linear trend in these data. The
line that ‘best fits’ these data is obtained and displayed with

abline(lm(y1~x1))

This gives Figure 6.

FIGURE 6 | The line of best fit.

When more than two variables are involved, R provides a facility for producing
scatter plots of all possible pairs. Writing

pairs(anscombe)

will generate Figure 7.

Graphical Display versus Summary Statistics
Looking again at the Anscombe data set given in Table 1, we calculate the means
(rounded to one decimal place) as follows:

round(colMeans(anscombe), 1)

gives

 x1  x2  x3  x4  y1  y2  y3  y4
9.0 9.0 9.0 9.0 7.5 7.5 7.5 7.5
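Column-wise summaries of this kind can also be collected in one step. The sketch below is an illustration, not the article's own code: it assumes the copy of the quartet that ships with R (datasets::anscombe), stored under the hypothetical name quartet so that a data frame read from file is not overwritten, and applies a small helper to every column with sapply.

```r
# Built-in copy of the Anscombe quartet (assumption: standard R installation).
quartet <- datasets::anscombe

# For each column, return its mean and standard deviation.
stats <- sapply(quartet, function(v) c(mean = mean(v), sd = sd(v)))

# One row per statistic, one column per variable.
round(stats, 2)
```

The result is a matrix, so a single statistic can be picked out by name, for example stats["sd", "y3"].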


FIGURE 7 | Use of the pairs function.

The standard deviations (rounded to two decimal places) are calculated with

round(sapply(anscombe, sd), 2)

which gives

  x1   x2   x3   x4   y1   y2   y3   y4
3.32 3.32 3.32 3.32 2.03 2.03 2.03 2.03

Notice that the four sets of data (x1, y1), (x2, y2), (x3, y3), (x4, y4) have
the same mean and standard deviation, which might lead to the conclusion that
the four data sets are essentially the same.

Investigating further using graphical displays gives a different picture. A
scatter plot is the obvious exploratory technique to use with paired data:

par(mfrow = c(2, 2)) #gives a two by two display
plot(x1, y1, xlim = c(0, 20), ylim = c(0, 13))
plot(x2, y2, xlim = c(0, 20), ylim = c(0, 13))
plot(x3, y3, xlim = c(0, 20), ylim = c(0, 13))
plot(x4, y4, xlim = c(0, 20), ylim = c(0, 13))

generates Figure 8. We use xlim = c(0, 20) and ylim = c(0, 13) to make the
scales on the axes the same in the four plots, to allow for a valid comparison.

FIGURE 8 | Plots of four data sets with same means and standard deviations.

Examining Figure 8, we see that there are very great differences in the data
sets:

1. Data set 1 is linear with some scatter;
2. Data set 2 is quadratic;
3. Data set 3 has an outlier. If the outlier were removed the data would be
   linear;
4. Data set 4 contains x values which are equal except for one outlier. If the
   outlier were removed, the data would be vertical.

Graphical displays are the core of getting ‘insight/feel’ for the data. Such
‘insight/feel’ does not come from the quantitative statistics; on the contrary,
calculations of quantitative statistics should be done after the exploratory
data analysis using graphical displays. The powerful graphical procedures of R
facilitate this approach.

SIMULATION
With the computational power of R it is easy to simulate problems that might
otherwise be difficult to understand. We illustrate with an example from the
theory of queues.

Queues
There is an extensive literature on queuing theory; R enables us to sidestep the
theory, and to concentrate instead on experimentation. We use as an example the
M/M/1 queue, where there is one server dealing with customers on a first-in
first-out basis. Customers are usually assumed to arrive in accordance with a
Poisson distribution, and are served immediately if the queue is empty,
otherwise they join the end of the queue. The service rates are also assumed to
be Poisson.

Traffic intensity (I) is the ratio of the arrival rate to the service rate. When
the arrival rate is greater than the service rate I > 1, when it is equal I = 1,
and


when it is less than the service rate I < 1. We investigate each of these three
scenarios in turn.

The following code simulates a queue in which customers arrive at the rate of 4
per minute, and are serviced at 3.8 per minute (I > 1). It generates 10,000
random Poisson arrivals (rpois(10000, 4)), and 10,000 Poisson services
(rpois(10000, 3.8)), and calculates the queue length at each time interval.

Example 4: A program to simulate a simple queue

arrivals <- rpois(10000, 4) #generates 10,000 values from a Poisson dist with mean = 4
service <- rpois(10000, 3.8) #generates 10,000 values from a Poisson dist with mean = 3.8
queue <- numeric(10000) #vector to hold the queue length at each time point
queue[1] <- max(arrivals[1] - service[1], 0)
for (t in 2:10000) queue[t] <- max(queue[t-1]+arrivals[t]-service[t], 0) #length of queue
plot(queue, xlab = "Time", ylab = "Queue length")

The service rates can be changed to rpois(10000, 4) for I = 1 and
rpois(10000, 4.2) for I < 1. Figure 9 gives the output with the three different
traffic intensities.

FIGURE 9 | Queue lengths (three panels: traffic intensity I > 1, I = 1, and
I < 1; x-axis: Time; y-axis: Queue length).

Figure 9 illustrates the severe problem that develops when the arrival rate is
greater than the service rate (I > 1): the length of the queue increases
steeply. With arrival and service rates equal (I = 1), the problem is not as
severe, but it does exist, and we see that in the long run it will become
serious. The only tenable solution to the queuing problem is to keep I < 1.

SUMMARY
We have tried to set before you some of the features of R which make it such a
flexible and accessible language within which to tackle your statistical
problems; we hope you have been convinced. For further and deeper information,
there are many books and manuals both on and off line which, between them, deal
with most statistical applications. Venables et al. (Ref. 2) provide a manual
which gives an introduction to the language and how to use R for doing
statistical analysis and graphics; it is downloadable from the CRAN website
(http://cran.r-project.org/). Chambers (Ref. 3) guides the reader in programming
with R, from interactive use and writing simple functions to the design of
packages and intersystem interfaces. Horgan (Ref. 4) deals with probability
problems. Statistical inference examples are tackled in Dalgaard (Ref. 5). The
book of Maindonald and Braun (Ref. 6) has extensive examples that illustrate
practical data analysis using R. Fox and Weisberg (Ref. 7) give an introduction
to the use of R in the context of applied regression analysis. Gentleman
(Ref. 8) uses R to address bioinformatics problems.
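For readers who want to repeat the queue experiment of Example 4 for all three traffic intensities, the simulation can be wrapped in a function. This is a sketch under stated assumptions rather than the article's own code: the function name simulate_queue, its arguments, and the fixed random seed are all introduced here for illustration.

```r
# Simulate queue lengths over n time steps with Poisson arrivals and services.
# (simulate_queue and its argument names are hypothetical.)
simulate_queue <- function(arrival_rate, service_rate, n = 10000) {
  arrivals <- rpois(n, arrival_rate)
  service  <- rpois(n, service_rate)
  queue <- numeric(n)                      # preallocate the result vector
  queue[1] <- max(arrivals[1] - service[1], 0)
  for (t in 2:n) {
    # Queue grows by arrivals, shrinks by services, never below zero.
    queue[t] <- max(queue[t - 1] + arrivals[t] - service[t], 0)
  }
  queue
}

set.seed(1)  # assumption: fixed seed so that repeated runs give the same output

q_over  <- simulate_queue(4, 3.8)   # I > 1
q_equal <- simulate_queue(4, 4)     # I = 1
q_under <- simulate_queue(4, 4.2)   # I < 1

# Average queue length under each traffic intensity.
c(over = mean(q_over), equal = mean(q_equal), under = mean(q_under))
```

Each returned vector can be passed to plot(queue, xlab = "Time", ylab = "Queue length") as in Example 4, with par(mfrow = c(1, 3)) to place the three panels side by side.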

ACKNOWLEDGMENT
Thanks to the referees whose observations and suggestions greatly improved this article.

REFERENCES
1. Anscombe FJ. Graphs in statistical analysis. Am Stat 1973, 27:17–21.
2. Venables WN, Smith DM, the R Development Core Team. An Introduction to R: A
   Programming Environment for Data Analysis and Graphics, 2004. Available at:
   http://www.r-project.org/. (Accessed July 04, 2011).
3. Chambers JM. Software for Data Analysis: Programming with R. New York:
   Springer; 2008.
4. Horgan JM. Probability with R: An Introduction with Computer Science
   Applications. Hoboken, NJ: John Wiley & Sons; 2008.
5. Dalgaard P. Introductory Statistics with R. 2nd ed. Heidelberg:
   Springer-Verlag; 2008.
6. Maindonald J, Braun J. Data Analysis and Graphics Using R. 2nd ed. Cambridge:
   Cambridge University Press; 2007.
7. Fox J, Weisberg S. An R Companion to Applied Regression. 2nd ed. Thousand
   Oaks, CA: Sage Publications; 2011.
8. Gentleman R. R Programming for Bioinformatics. London: Chapman and Hall/CRC;
   2009.
