Chapter R Programming

This document discusses programming in R, including flow control, vectorization, and user-defined functions. It covers basic programming structures like conditional statements (if, ifelse, switch) and loops (for, repeat, while). It emphasizes that vectorization is more efficient than loops in R. The document also provides an example of defining a function to calculate the normal likelihood function.

Uploaded by

Wyara Vanesa Moura

3 R programming
The R system is not only an interactive tool for exploring data sets and
graphic representations; it is also an excellent environment for
programming. Comparatively speaking, the programming syntax of R is easy
to learn. Even users without previous programming experience can get
started within a couple of hours by learning some basic R control
structures. This is exactly what we will do here. This chapter covers some
basic programming skills, focusing on the use of flow control and on how to
write functions for simple statistical problems.

3.1 Flow control and vectorization

The use of flow control, either conditional or repetitive, is an essential
programming skill.

3.1.1 Conditional execution

There are three functions that can be used for conditional execution: if,
ifelse, and switch.

• The if-statement

The syntax of the if-statement is

if (cond) expr
if (cond) expr_1 else expr_2
where cond stands for condition, and expr stands for expression.
The cond must evaluate to a single logical value, which determines which
branch is evaluated.¹ An expr is any valid R expression, and is often a
compound expression, that is, a series of expressions contained within
curly braces.
In the following example, we use the if-statement to decide the actual
grade given the score that a student has. The initial value for the grade
variable is grade = "PASS", and an actual score is score = 50. Assume
that the actual grade is based on a cut-off score of 60 (i.e., FAIL if
score < 60, or PASS otherwise). Then, the actual grade is obtained by:

¹ In recent versions of R, length-one numeric values also work, where zero
corresponds to FALSE and any non-zero value corresponds to TRUE.


> grade = "PASS"
> score = 50
> if (score<60) grade = "FAIL"
> grade # actual grade
[1] "FAIL"
The above example can also be implemented using the if-else form of the
statement.
> score<-50
> if (score<60) grade = "FAIL" else grade = "PASS"
> grade
[1] "FAIL"

• The ifelse-statement

The ifelse function provides a more concise, vectorized form, which takes
three arguments, cond, a, and b.
ifelse(cond, a, b)
which returns, for each element of cond, the corresponding element of a if
the condition is true, and that of b otherwise.
Retaking the previous example, we can use ifelse to do the same, yet in a
more concise form.
> score<-50
> grade <- ifelse(score>=60,"PASS","FAIL")
> grade
[1] "FAIL"
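The difference between if and ifelse shows up once the score is a vector rather than a single number. The following is a minimal sketch; the four scores are made up for illustration.

```r
# Hypothetical scores for four students
scores <- c(80, 45, 60, 59)

# ifelse() tests every element and returns a vector of the same length
grades <- ifelse(scores >= 60, "PASS", "FAIL")
print(grades)

# if () needs a single TRUE/FALSE, so it is applied to one element at a time
first_grade <- if (scores[1] >= 60) "PASS" else "FAIL"
print(first_grade)
```

Here ifelse handles all four scores in one call, whereas the if-statement only decides the grade of the first student.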

• The switch-function

The function switch is also a commonly used form of conditional execution,
the syntax of which is:
switch(expr, ...)
where the first argument expr is an expression to evaluate, and "..."
stands for a list of alternatives, given explicitly.
Any number of additional arguments can be supplied, and they can be
either named or unnamed. If the value of the expression is numeric, then
the additional argument at the corresponding position is evaluated and
returned. If the expression returns a character value, then the additional
argument with the matching name is evaluated and returned. If no argument
has a matching name, then the value of the unnamed argument (which acts as
the default) is returned.
In the following example, a character vector contains five elements, each
being either a single letter or a pair of letters. Each element is
evaluated in turn, and a message is displayed showing a number if a
matching single letter is found, or "No match!" otherwise.
> lett <- c("b","QQ","a","A","bb")
> for(ch in lett)
+ cat(ch,":",switch(EXPR = ch, a=1, A=1, b=2, B=2, "No match!"),"\n")
b : 2
QQ : No match!
a : 1
A : 1
bb : No match!
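Two further behaviours of switch are worth seeing side by side: selection by position when the expression is numeric, and "fall-through" when a named alternative is left empty. The sketch below uses a hypothetical helper, letter_code, that is not part of the original example.

```r
# Numeric EXPR: the alternative at that position is returned
second <- switch(2, "first", "second", "third")
print(second)

# A numeric EXPR with no alternative at that position gives NULL
nothing <- switch(4, "first", "second", "third")
print(is.null(nothing))

# Character EXPR: an empty alternative falls through to the next non-empty one
letter_code <- function(ch) {
  switch(ch,
         a = ,
         A = 1,
         b = ,
         B = 2,
         "No match!")
}
print(letter_code("a"))
print(letter_code("B"))
print(letter_code("QQ"))
```

With fall-through, upper-case and lower-case letters can share one value without repeating it.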

3.1.2 Repetitive execution

Repetitive execution is implemented by using a for-loop, a repeat command,
or a while command.

• The for-loop

The syntax of the for-loop is:

for (loop_variable in seq ) expr
where seq is actually any vector expression usually taking the form of a
regular sequence, such as 1:5, and the statements of expr are executed for
each value of the loop variable in the sequence.
A simple example using the for-loop follows:
> for(i in 1:5) print(1:i)
[1] 1
[1] 1 2
[1] 1 2 3
[1] 1 2 3 4
[1] 1 2 3 4 5

• The repeat command

The syntax of the repeat command is:

repeat expr
which repeatedly executes the expression expr until explicitly terminated
(e.g., by a call to break).

• The while command

The syntax of the while command is:

while (cond) expr
where the while loop continues the execution of the expression expr
while the condition cond holds true.
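The repeat and while commands are given above without an example. As a minimal sketch, the same small task (adding 1, 2, 3, ... until the total exceeds 20; the threshold is an arbitrary choice) can be written with either command:

```r
# repeat: sum the integers 1, 2, 3, ... until the total exceeds 20
total <- 0
i <- 0
repeat {
  i <- i + 1
  total <- total + i
  if (total > 20) break   # without break, repeat would never stop
}
print(c(i = i, total = total))

# while: the same computation, with the stopping rule stated up front
total_w <- 0
j <- 0
while (total_w <= 20) {
  j <- j + 1
  total_w <- total_w + j
}
print(c(j = j, total = total_w))
```

Both loops stop after six steps with a total of 21. The repeat form tests the condition inside the body, while the while form tests it before each pass.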

3.1.3 Vectorization
Loops in R are interpreted and are usually much slower than the equivalent
operations on whole vectors. So, the use of explicit loops should be
avoided whenever possible, and the technique of vectorization should be
used instead. Consider exam scores of five persons, which are represented
by s <- c(80, 45, 55, 90, 75). Let g be a grade vector, each element of
which is given the value 0 ("Fail") if the score is less than 60 and 1
("Pass") otherwise. By using the looping technique (i.e., a for-loop over
the elements), the function is defined as:
> grade<-function (s) {
+   n<-length(s)
+   g<-numeric(n)
+   for (i in seq_len(n)) {
+     g[i]<-if (s[i]<60) 0 else 1
+   }
+   return(g)
+ }
> score<-c( 80, 45, 55, 90, 75)
> grade(score)
[1] 1 0 0 1 1
More efficiently, the above can be done via vectorization:
> score<-c( 80, 45, 55, 90, 75)
> grade <- rep(1,5)
> grade[score<60] <- 0
> grade
[1] 1 0 0 1 1
Vectorized expressions are simpler to write and more efficient,
particularly with a large quantity of data. In R, many functions operate
on whole vectors or arrays at once (i.e., they can handle both scalars and
vectors), such as mean, sum, and apply, just to list a few.
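The claim about efficiency can be checked directly. The sketch below assumes two hypothetical helpers, grade_loop and grade_vec, and a simulated vector of 100,000 scores; it first confirms that both give the same grades and then times them. No timings are quoted because they depend on the machine.

```r
# Hypothetical helper names; scores are simulated for illustration only
grade_loop <- function(s) {
  g <- numeric(length(s))
  for (i in seq_along(s)) {
    g[i] <- if (s[i] < 60) 0 else 1
  }
  g
}

grade_vec <- function(s) as.numeric(s >= 60)

set.seed(1)
s <- round(runif(1e5, min = 0, max = 100))

# Both versions must agree element by element
print(identical(grade_loop(s), grade_vec(s)))

# Elapsed times depend on the machine, so no figures are quoted here
print(system.time(grade_loop(s)))
print(system.time(grade_vec(s)))
```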

3.2 User-defined functions

More often than not, we need to define our own functions. An R function is
defined by using the keyword function, followed by an opening parenthesis,
a list of formal arguments (separated by commas), and a closing
parenthesis, and then by the expression(s) for the body of the function.
The value returned by an R function is either the value that is explicitly
returned by a call to return() or simply the value of the last expression
evaluated.
In the following, three functions are defined. They do the same thing
(i.e., calculate the square of a number) though they look somewhat
different.
> sq1<-function(x) x*x
> sq1(5)
[1] 25
> sq2<-function(x) return(x*x)
> sq2(5)
[1] 25
> sq3<-function(x) {
+ y<-x*x
+ return(y)
+ }
> sq3(5)
[1] 25
Note that a single expression can be entered directly on the same line as
the function keyword (as in sq1 and sq2). However, if there are several
expressions or statements to execute, they must be enclosed in braces and
are usually entered on separate lines (as in sq3). Also note that the
above three functions are all vectorized. (Test them for yourself: if
x<-1:10, what will be the outputs?)

3.2.1 A function for the normal likelihood

In statistics, the likelihood function (often simply the likelihood) is a
function of the parameters of a statistical model. Informally, if we say
that "probability" allows predicting unknown outcomes based on known
parameters, then "likelihood" allows estimating unknown parameters based
on known outcomes.² So, likelihoods play a key role in statistical
inference.

Suppose we observe a sample of size n, and the observations y = (y_1, ...,
y_n) follow a normal distribution with mean µ and variance σ². The
likelihood function is (or is proportional to)

\[ L = \left(2\pi\sigma^2\right)^{-n/2} \exp\left(-\frac{\sum_{i=1}^{n}(y_i-\mu)^2}{2\sigma^2}\right) \tag{3.1} \]

Computationally, it is preferable to compute the logarithmic likelihood
(think why?):

\[ \log L = -\frac{n}{2}\log\left(2\pi\sigma^2\right) - \frac{1}{2\sigma^2}\sum_{i=1}^{n}(y_i-\mu)^2 \tag{3.2} \]

The R code for the logarithmic normal likelihood is:

> loglike <-function(mu,sigma,yobs) {
+ n <- length(yobs)
+ var <- sigma * sigma
+ logL <- 0.5*n*log(2.0*pi*var) + sum((yobs-mu)^2)/(2.0*var)
+ return(-logL)
+ }
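One way to gain confidence in a hand-written likelihood is to compare it with R's built-in normal density. The sketch below re-enters loglike so that it runs on its own, and uses a small made-up sample (y_check) that is not part of the original text. With log = TRUE, dnorm returns log densities, and their sum is the log-likelihood of an independent sample.

```r
# loglike() is re-entered here so that the block runs on its own
loglike <- function(mu, sigma, yobs) {
  n <- length(yobs)
  var <- sigma * sigma
  logL <- 0.5 * n * log(2.0 * pi * var) + sum((yobs - mu)^2) / (2.0 * var)
  return(-logL)
}

# A small, made-up sample used only for this check
y_check <- c(0.3, 1.7, 0.9, 2.4, -0.5)

by_formula <- loglike(mu = 1.0, sigma = 1.2, yobs = y_check)
by_dnorm   <- sum(dnorm(y_check, mean = 1.0, sd = 1.2, log = TRUE))

print(c(by_formula, by_dnorm))
print(isTRUE(all.equal(by_formula, by_dnorm)))
```

The two numbers should agree up to rounding error, since both evaluate expression (3.2).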

² In a sense, likelihood works backwards from conditional probability. In
forward reasoning, given parameter B, we use the conditional probability
Pr(A|B) to reason about outcome A. In backward reasoning, however, outcome
A is given and the likelihood function L(B|A) is used to reason about
parameter B. Formally, a likelihood function is a conditional probability
function considered as a function of its second argument, with its first
argument held fixed, and also any other function proportional to such a
function. Thus, the likelihood function for B is the equivalence class of
functions
L(b|A) = α Pr(A|B = b)
for any constant of proportionality α > 0.

Now, let us randomly generate a sample of size 100 from a normal
distribution with mean 1.0 and standard deviation 1.2, and then calculate
the logarithmic likelihood for parameters µ = 1.0 and σ = 1.2.
> mu<-1.0
> sigma<-1.2
> seed<-123456 # note: this only stores a number; it does not seed the generator
> y<-rnorm(n=100,mean=mu,sd=sigma)
> logL<-loglike(mu,sigma,y)
> logL # logarithm of likelihood
[1] -158.1327
> exp(logL) # likelihood
[1] 2.107775e-69
Here, rnorm(n=,mean=,sd=) is a function for generating n random values
from a normal distribution with the mean and standard deviation specified
by mean= and sd=, respectively. If the two parameters mean and sd are not
provided, then the random values are generated from a standard normal
distribution with mean 0 and unit standard deviation. Because the sample
is random, the values shown above will differ from run to run; calling
set.seed(seed) before rnorm makes a run reproducible.
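Since the sample is random, the numbers printed after a call to rnorm change from run to run. A minimal sketch of how a run can be made repeatable with set.seed follows; the seed value is reused from the text, and the sample size of 5 is an arbitrary choice.

```r
seed <- 123456

set.seed(seed)                 # seeds R's random number generator
y1 <- rnorm(n = 5, mean = 1.0, sd = 1.2)

set.seed(seed)                 # same seed, so the same draws follow
y2 <- rnorm(n = 5, mean = 1.0, sd = 1.2)

print(identical(y1, y2))

# With mean and sd omitted, draws come from the standard normal distribution
z <- rnorm(5)
print(length(z))
```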

3.2.2 Functions with default values

Using default values in an R function means that not every argument needs
to be given explicitly when calling the function. Arguments that have
commonly appropriate values can be given those values as defaults, and may
then be omitted from a call to the function. In practice, functions with
default values bring a lot of convenience in statistical computation.
In the logarithmic normal likelihood function, for example, the variance
may be assumed known, say σ² = 1.0, and we would like to calculate
likelihoods for a grid of µ values. Then, the R function for calculating
the logarithmic normal likelihood can be modified slightly, as shown below.
> loglike <-function(mu,sigma=1,yobs) {
+ n <- length(yobs)
+ var <- sigma * sigma
+ logL <- 0.5*n*log(2.0*pi*var) + sum((yobs-mu)^2)/(2.0*var)
+ return(-logL)
+ }
Further, by making use of the function loglike, a new function, likemu,
can be defined, which calculates the likelihoods for a grid of values for
mu with the variance fixed at σ² = 1.0.
> likemu<-function(vmu,yobs) {
+ m<-length(vmu)
+ like<-numeric(m)
+ for (i in 1:m) {
+ like[i]<-exp(loglike(mu=vmu[i],yobs=yobs))
+ }
+ return(like)
+}
Now, assume that there are 18 data points from a normal distribution
with the variance being approximately 1.0. The likelihood values for
varying values of the mean (i.e., from -2 to 2 with an increment of 0.1)
are calculated as:
> y=c(1.18,-0.84,-0.07,-2.00,-0.34,-1.84,-0.38,-2.39,-1.18,
+ 0.44,-0.21,0.43,-1.21,0.28,-1.19,0.19,-1.17,0.01)
> mean(y)
[1] -0.5716667
> mmu<-seq(-2,2,0.1)
> lkmu<-likemu(vmu=mmu,yobs=y)
In Figure 3.1, the maximum likelihood value is observed at a location
approximately corresponding to the sample mean (µ ≈ −0.57). Think what
this result implies.
> plot(mmu,lkmu,type=’’h’’)
[Figure: vertical-line plot of lkmu (y-axis, from 0 to about 2.5e-11)
against mmu (x-axis, from -2 to 2)]

FIGURE 3.1. Plot of likelihood values for a grid of mean values with the
variance fixed at 1.0
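The location of the peak can also be read off numerically rather than from the plot. The sketch below is self-contained: it re-enters the log-likelihood (returning the log-likelihood itself, i.e., with the negative sign applied as in Section 3.2.1), uses sapply in place of the explicit loop in likemu, and picks out the grid value with the largest likelihood using which.max. The name best_mu is introduced here for illustration.

```r
loglike <- function(mu, sigma = 1, yobs) {
  n <- length(yobs)
  var <- sigma * sigma
  -(0.5 * n * log(2.0 * pi * var) + sum((yobs - mu)^2) / (2.0 * var))
}

y <- c(1.18, -0.84, -0.07, -2.00, -0.34, -1.84, -0.38, -2.39, -1.18,
       0.44, -0.21, 0.43, -1.21, 0.28, -1.19, 0.19, -1.17, 0.01)

mmu <- seq(-2, 2, 0.1)

# sapply() replaces the explicit for-loop used in likemu
lkmu <- sapply(mmu, function(m) exp(loglike(mu = m, yobs = y)))

# Grid value with the largest likelihood, next to the sample mean
best_mu <- mmu[which.max(lkmu)]
print(best_mu)
print(mean(y))
```

Because the grid has a step of 0.1, the best grid value is the grid point closest to the sample mean rather than the sample mean itself.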

3.2.3 Functions as arguments

In R, a function (or several functions) can be passed as an argument to
another function. In the following example, the general plotting function
plots the values of a function f for a specified set of x values.
> genplot <- function(f, x=seq(-10,10,length=200),
+ ptype="l", colour=2) {
+ y <- f(x)
+ plot(x, y, type=ptype, col=colour)
+ }
> genplot(sin, ptype="h")
[Figure: vertical-line plot of y = sin(x) (y-axis, from -1.0 to 1.0)
against x (x-axis, from -10 to 10)]

FIGURE 3.2. Plots of a generic sin function for a specified set of x values

In the above, we use a generic sin function to generate values for y =
sin(x), and the values of y are plotted for a range of x values between -10
and 10 (Figure 3.2).
The function f, which is passed as an argument, can also be user-defined
(Figure 3.3).
> cubf <- function(x) x^3-6*x-6
> genplot(cubf, x=seq(-3,2,length=500))

[Figure: line plot of y = x^3 - 6x - 6 (y-axis, from about -15 to 0)
against x (x-axis, from -3 to 2)]

FIGURE 3.3. Plots of a user-defined function
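The same idea works without any plotting. As a minimal sketch, the hypothetical helper tabulate_fun below takes a function f and returns a table of x and f(x); it is called with a built-in function, a user-defined function, and an anonymous function written directly in the call.

```r
# Hypothetical helper: evaluate any function f on a grid and tabulate the result
tabulate_fun <- function(f, x = seq(-1, 1, by = 0.5)) {
  data.frame(x = x, y = f(x))
}

# A built-in function passed by name
print(tabulate_fun(cos))

# A user-defined function
cubf <- function(x) x^3 - 6 * x - 6
print(tabulate_fun(cubf, x = -3:2))

# An anonymous function written directly in the call
print(tabulate_fun(function(x) x^2 + 1, x = 0:3))
```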

3.2.4 Functions for Binary operators

As mentioned previously, a binary operation takes two values, such as
addition (+), subtraction (-), multiplication (*), and division (/). For
example, adding 10 and 2 is mathematically denoted by 10 + 2. The R syntax
follows this convention. In R, binary operators also include matrix
multiplication %*% and outer product %o%.
Essentially, a binary operator is a function. Consider the addition
operator ("+"), the code of which can be displayed by:
> get("+")
function (e1, e2) .Primitive("+")
Clearly, the addition operator ("+") is a function, which takes two
parameters, e1 and e2. When writing an addition expression in R, however,
we write, for example, 1 + 2 rather than `+`(1, 2). Using such binary
operators with the arguments on either side of the operator, instead of
following the function-call convention, is much easier for us to read.
R also allows us to define our own binary operators, in the form
%name%. Suppose we want to define a binary operator %m% such that
a %m% b = ab - b. The function is defined as:
> "%m%" <- function(a,b) a*b-b
Then, we can use it in the same way as we use other binary operators
(such as + or -).
> 1 %m% 2
[1] 0
> 5 %m% 6
[1] 24

[Figure: line plot of y (y-axis) against x (x-axis, from 0 to 20)]

FIGURE 3.4. Illustration of a binary operator for plotting y=log(x) over x

Next, a binary operator is defined for plotting y over x. Practically, y
can be a numeric vector, or any function of x. In Figure 3.4, for example,
we plot 0.3*cos(x) + 0.7*sin(2*x) over a grid of x values between 0.1 and
20.
> "%p%" <- function(y,x) plot(x, y, type="l", col=2)
> x <- seq(0.1, 20, length=400)
> (0.3*cos(x) + 0.7*sin(2*x)) %p% x
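Because an operator is an ordinary function, it can be called by name, passed to other functions, and applied to vectors. The following sketch illustrates these points; the operator %+% is a hypothetical example introduced here, while %m% is the one defined in the text.

```r
# An operator can be called like any other function by quoting it with backticks
print(`+`(1, 2))

# ... and passed as an argument, here to Reduce() to multiply 1 through 5 together
print(Reduce(`*`, 1:5))

# Hypothetical user-defined operator: paste two strings with a space
`%+%` <- function(a, b) paste(a, b)
print("maximum" %+% "likelihood")

# The operator from the text, applied element-wise to vectors
`%m%` <- function(a, b) a * b - b
print(c(1, 5) %m% c(2, 6))
```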

3.2.5 Recursive functions

A recursive function is a function that calls itself. Recursive functions
are convenient to use, but they can sometimes be an inefficient means of
solving problems in terms of run time.
Now, consider computing the factorial: n! = n * (n − 1)!. It is apparent
that a recursive function can be used here because, to compute n!, one can
compute (n − 1)! and then multiply it by n.
Numerically, we can see how this can be done using the recursive
algorithm. As a starting point, we have:
0! = 1
Then, the factorial can be understood in the following recursive way:
1! = 1 * 0! = 1 * 1 = 1
2! = 2 * 1! = 2 * 1 = 2
3! = 3 * 2! = 3 * 2 = 6
4! = 4 * 3! = 4 * 6 = 24
......
Here, we enter this function in another way. Use a text editor to enter
the following code.
factorial<-function (n) {
  if (n==0) return(1)
  else return(n*factorial(n-1))
}
Save this function as "factorial.R", and load it by
source("factorial.R"). Now, it is ready for use. (Note that this
definition masks the built-in R function of the same name.)
> source("factorial.R")
> factorial(0)
[1] 1
> factorial(1)
[1] 1
> factorial(2)
[1] 2
> factorial(3)
[1] 6
> factorial(4)
[1] 24
> factorial(10)
[1] 3628800
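The recursive definition can be checked against two non-recursive alternatives. In the sketch below, the recursive version is given the hypothetical name fact_rec so that R's built-in factorial stays available for comparison, and fact_prod (also a name introduced here) computes the product 1 * 2 * ... * n directly.

```r
# Recursive version under a hypothetical name, so that base R's factorial()
# stays available for comparison
fact_rec <- function(n) {
  if (n == 0) return(1)
  n * fact_rec(n - 1)
}

# Loop-free alternative: the product of 1, 2, ..., n (the empty product is 1)
fact_prod <- function(n) prod(seq_len(n))

n_values <- 0:10
rec  <- sapply(n_values, fact_rec)
prd  <- sapply(n_values, fact_prod)
base <- factorial(n_values)

print(rec)
print(all(rec == prd) && isTRUE(all.equal(rec, base)))
```

The recursive version makes one function call per level, which is the main reason why recursion can be slower than the direct product for large n.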

3.3 Some issues related to R programming

3.3.1 Lexical scope
Variables in the body of an R function can be grouped into three
categories: formal parameters, local variables and free variables. The
formal parameters of a function are those appearing in the argument list
of the function, and their values are determined when the function is
called (i.e., by the process of binding the actual function arguments to
the formal parameters). Local variables are those whose values are
determined by the evaluation of expressions in the body of the function.
Free variables are those belonging to neither of the two groups (i.e.,
neither formal parameters nor local variables).
Consider the following function that calculates the area of a rectangle.
area <- function(h, w) {
  s1 <- h * w
  print(h)
  print(w)
  print(s1)
  print(s2)
}
In this function, h and w are formal parameters, s1 is a local variable
and s2 is a free variable.
In R, the value of a free variable is resolved by first looking in the
environment in which the function was created. This is called lexical
scope, which marks one of the major differences between S-Plus and R.
Lexical scope can be confusing to R users, but, when properly used, it
provides a powerful mechanism for controlling evaluation and it ensures
that the intended sets of bindings between variables and values are used.
Define a function that calculates the volume of a cube.
cube <- function(w) {
  area <- function() w * w
  w * area()
}
The variable w is a formal parameter in the function cube, but it is a
free variable in the function area, so its value is determined by the
scoping rules. In S-Plus, the value of w is that associated with a global
variable named w. In R, however, it is the parameter to the function cube
because that is the active binding for the variable w at the time the
function area was defined (i.e., lexical scope). So, the difference is
that S-Plus looks for a global variable called w while R first looks for a
variable called w in the environment created when cube was invoked.
In S, a call to cube(2) fails if no global variable w exists. If there is
a global variable w=3, the call returns 18 (i.e., 2 * 3 * 3) as the cube
volume.
S> cube(2)
Error in area(): Object "w" not found
Dumped
S> w <- 3
S> cube(2)
[1] 18
In R, however, a call to the same function will return 8 as the answer.
> w<-3
> cube(2)
[1] 8
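The R side of this comparison can be run as is, and the same rule explains how a function can "remember" values from the call that created it. In the sketch below, make_power is a hypothetical example that is not part of the original text.

```r
w <- 3                      # a global variable that cube() should ignore

cube <- function(w) {
  area <- function() w * w  # w is free here; R finds it in cube()'s environment
  w * area()
}
print(cube(2))

# Hypothetical closure: each call to make_power() creates its own environment,
# and the returned function keeps access to the value of p stored there
make_power <- function(p) {
  function(x) x^p
}
square <- make_power(2)
cubic  <- make_power(3)
print(square(4))
print(cubic(4))
```

The functions square and cubic share the same code but carry different values of p, because each was created in a different environment.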

3.3.2 Exception handling

Exception handling is the process of dealing with the failure of a
computation to complete successfully and, in some sense, of allowing the
user to interrupt computation. There are a number of tools in R that allow
for general exception handling. The two most common sorts of exceptions
are errors (which can be raised by a call to stop) and warnings (which can
be raised by a call to warning).
The typical behavior for an error is to halt the current evaluation and
return control to the top-level R prompt.³ The default behavior for a
warning is to wait until the current evaluation is finished and then to
print the warning that occurred during the evaluation. Users can control
this behavior by making use of various R options, which are not discussed
in detail here.
Next is a simple example demonstrating the use of tryCatch for
conditionally evaluating expressions. In this example, two handlers are
established, one for errors and the other for warnings.
> foo <- function (x) {
+   if (x<3)
+     list() + x
+   else if (x<10)
+     warning("ouch")
+   else
+     33
+ }
>
> foo(2)
Error in list() + x : non-numeric argument to binary operator
> foo(5)
Warning message:
In foo(5) : ouch
> foo(29)
[1] 33
>
> tryCatch(foo(2),error=function(e) "This is an error",
+ warning = function(e) "This is a warning")
[1] "This is an error"
> tryCatch(foo(5),error=function(e) "This is an error",
+ warning = function(e) "This is a warning")
[1] "This is a warning"
> tryCatch(foo(29),error=function(e) "This is an error",
+ warning = function(e) "This is a warning")
[1] 33
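A common use of tryCatch is to keep a long computation going when individual steps fail. The sketch below assumes a hypothetical function, risky_step, that raises an error or a warning for some inputs; the loop records a missing value for those runs, stores the message obtained with conditionMessage, and continues.

```r
# Hypothetical simulation step that fails for some inputs
risky_step <- function(i) {
  if (i %% 4 == 0) stop("run ", i, " failed")
  if (i %% 3 == 0) warning("run ", i, " looks suspicious")
  sqrt(i)
}

results <- rep(NA_real_, 6)
log_msgs <- character(0)

for (i in 1:6) {
  results[i] <- tryCatch(
    risky_step(i),
    error = function(e) {
      log_msgs <<- c(log_msgs, conditionMessage(e))
      NA_real_                      # record a missing value and carry on
    },
    warning = function(w) {
      log_msgs <<- c(log_msgs, conditionMessage(w))
      NA_real_
    }
  )
}

print(results)
print(log_msgs)
```

Runs 3, 4 and 6 end up as NA, and the loop still reaches the last run instead of stopping at the first failure.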

³ In some situations, however, this may not be desired. For example, when
a large simulation is being run, one run may fail, which nevertheless
should not halt the entire simulation.

3.3.3 Classes and generic functions

A class is a description of a thing, and an object is an instance of a class.
For example,
> y<-1:20
> y
[1] 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20
> class(y)
[1] "integer"
Here, we see that y is an object of the integer class. In R, the class of
an object determines how it will be treated by what are known as generic
functions. Put the other way round, a generic function performs a task or
action on its arguments specific to the class of the argument itself. If the
argument lacks any class attribute, or has a class not catered for specifically
by the generic function in question, a default action is always provided.
The class mechanism offers the user the facility of designing and writ-
ing generic functions for special purposes. Examples of generic functions
are plot() for displaying objects graphically, summary() for summarizing
analyses of various types, and anova() for comparing statistical models.
The number of classes a generic function can handle can also be quite
large. For example, the summary() function has a default method and
variants for objects of many classes. A complete list can be shown by a
call to methods(summary)⁴:
> methods(summary)
 [1] summary.aov            summary.aovlist        summary.connection
 [4] summary.data.frame     summary.Date           summary.default
 [7] summary.ecdf*          summary.factor         summary.glm
[10] summary.infl           summary.lm             summary.loess*
[13] summary.manova         summary.matrix         summary.mlm
[16] summary.nls*           summary.packageStatus* summary.POSIXct
[19] summary.POSIXlt        summary.ppr*           summary.prcomp*
[22] summary.princomp*      summary.stepfun        summary.stl*
[25] summary.table          summary.tukeysmooth*
Non-visible functions are asterisked

⁴ In this example there are 26 methods. Most of them can be seen by typing
their names, such as summary.data.frame. However, nine of them are
asterisked, indicating that they cannot be viewed directly by typing their
names. We can read these methods by, e.g.,
> getAnywhere(summary.loess)
A single object matching 'summary.loess' was found
It was found in the following places
registered S3 method for summary from namespace stats
namespace:stats
with value
function (object, ...)
{
  class(object) <- "summary.loess"
  object
}
<environment: namespace:stats>

When the summary function is called, it performs a task or action on its
argument according to the class of the argument. In the following
examples, the summary function gives descriptive statistics (e.g.,
minimum, quartiles, and maximum) for the numeric object y, a frequency
table for the factor object x, and a list of regression results for the
"lm" object.
> y <- rnorm(20)
> class(y)
[1] "numeric"
> summary(y)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
-1.9370 -0.5221  0.2525  0.1297  1.1750  1.3420
> x <- sample(letters[1:4],20,replace=TRUE)
> x <- as.factor(x)
> class(x)
[1] "factor"
> summary(x)
a b c d
4 5 7 4
> fit <- lm(y~x)
> class(fit)
[1] "lm"
> summary(fit)
Call:
lm(formula = y ~ x)
Residuals:
    Min      1Q  Median      3Q     Max
-1.8809 -0.5416  0.1368  0.6934  1.3954
Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)   0.0777     0.5302   0.147    0.885
xb           -0.1310     0.7113  -0.184    0.856
xc           -0.1334     0.6646  -0.201    0.843
xd            0.6571     0.7498   0.876    0.394
Residual standard error: 1.06 on 16 degrees of freedom
Multiple R-squared: 0.09478, Adjusted R-squared: -0.07495
F-statistic: 0.5584 on 3 and 16 DF, p-value: 0.65
Advanced users of R may refer to Appendix A for detailed descriptions of
classes, generic functions, and object-oriented programming.
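Writing a method for an existing generic function takes only a few lines. The sketch below assumes a hypothetical class, "scores", with a constructor new_scores; neither is part of R or of the original text. A function named summary.scores is picked up automatically when summary is called on an object of that class.

```r
# Hypothetical class "scores": a numeric vector with a pass mark attached
new_scores <- function(x, cutoff = 60) {
  structure(list(values = x, cutoff = cutoff), class = "scores")
}

# Method for the existing generic summary(); dispatched on class "scores"
summary.scores <- function(object, ...) {
  passed <- object$values >= object$cutoff
  c(n = length(passed), pass = sum(passed), fail = sum(!passed))
}

s <- new_scores(c(80, 45, 55, 90, 75))
print(class(s))
print(summary(s))            # calls summary.scores

# An object without a matching method falls back to summary.default
print(summary(c(80, 45, 55, 90, 75)))
```

The same call, summary(), therefore gives pass and fail counts for a "scores" object and the usual descriptive statistics for a plain numeric vector.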

3.4 Exercises
1. Consider measurements of heights (in centimeters) of five persons at
ages 8 and 15, respectively. Let x1=c(75.1,108.9,105.3,83.9,101.2) and x2
= c(131.1,175.8,179.7,154.6,163.9). Now, we would like to know the change
of height per year for each of them. Mathematically, this is to calculate:
\[ \Delta = \frac{1}{15-8}\,(x_2 - x_1). \]
Then, (a) define a function that returns the change of height per year for
each of them, and (b) define a function (i.e., binary operator %∆%) that
takes x2 and x1 as the two parameters and returns the changes of height
per year (∆) for the five persons.
2. Define a function, namely center, which is expected to return either
the mean, or median, or mode, depending on the expression to be evaluated.
The mean and median are given by the generic functions mean and median,
and the mode is given by a user-defined function mode. (We’ll explain the
mode function in Chapter 4).
mode <- function (x) {
y <- as.integer(names(sort(-table(x)))[1])
print(y)
}
Next, sample 20 numbers randomly with replacement from numbers 1,
2, 3, 4, and 5. Find the mean, median, and mode using the center function.
3. In mathematics, the Kronecker product, denoted by ⊗, is an operation
on two matrices of arbitrary size resulting in a block matrix. If A is an
m × n matrix and B is a p × q matrix, then the Kronecker product A ⊗ B is
the mp × nq block matrix
\[ A \otimes B = \begin{pmatrix} a_{11}B & \cdots & a_{1n}B \\ \vdots & \ddots & \vdots \\ a_{m1}B & \cdots & a_{mn}B \end{pmatrix}, \quad \text{where } A = \begin{pmatrix} a_{11} & \cdots & a_{1n} \\ \vdots & \ddots & \vdots \\ a_{m1} & \cdots & a_{mn} \end{pmatrix}. \]
Then, (a) define a binary operator (denoted by %@%) for the Kronecker
product; (b) calculate
\[ \begin{pmatrix} 7 & 0 \\ 0 & 5 \end{pmatrix} \otimes \begin{pmatrix} 1.0 & 0.2 \\ 0.2 & 1.0 \end{pmatrix}. \]
4. In mathematics, the Fibonacci numbers⁵ are the following sequence of
numbers: 0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, ... Note that the first
two Fibonacci numbers are 0 and 1, and each remaining number is the sum of
the previous two:
0 + 1 = 1
1 + 1 = 2
1 + 2 = 3
2 + 3 = 5
3 + 5 = 8
5 + 8 = 13
...
In mathematical terms, the sequence F_n of Fibonacci numbers is defined
by the recurrence relation
\[ F_n = F_{n-1} + F_{n-2} \]
with seed values
\[ F_0 = 0 \quad \text{and} \quad F_1 = 1. \]
Write a recursive R function that gives the Fibonacci numbers.

⁵ The Fibonacci sequence is named after Leonardo of Pisa, who was known as
Fibonacci (a contraction of filius Bonaccio, "son of Bonaccio").

5. The Bayesian information criterion (BIC), or Schwarz criterion, is a
criterion for model selection among a class of parametric models with
different numbers of parameters.
The BIC is an asymptotic result derived under the assumption that the
data distribution is in the exponential family. Let x = the observed data;
n = the number of data points in x (i.e., the number of observations); k =
the number of free parameters to be estimated (if the estimated model is
a linear regression, k is the number of regressors, including the
constant); p(x|k) = the likelihood of the observed data given the number
of parameters; L = the maximized value of the likelihood function for the
estimated model.
The formula for the BIC is:
\[ \mathrm{BIC} = -2 \ln L + k \ln(n) \]
Under the assumption that the model errors or disturbances are normally
distributed, this becomes (up to an additive constant, which depends only
on n and not on the model):
\[ \mathrm{BIC} = n \ln\left(\frac{\mathrm{RSS}}{n}\right) + k \ln(n) \]
where RSS is the residual sum of squares from the estimated model.
Write a function that gives BIC values of normal data with overall mean
µ and variance σ².
6*. Lexical scope and exception handling: The following R code is used
to mimic a bank account. A functioning bank account needs to have a
balance or total, a function for making withdrawals, a function for making
deposits and a function for stating the current balance. This is achieved
by creating the three functions within open.account and then returning a
list containing them. When open.account is invoked it takes a numerical
argument total and returns a list containing the three functions. Because
these functions are defined in an environment which contains total, they
will have access to its value.
The special assignment operator, <<-, is used to change the value
associated with total. This operator looks back in enclosing environments
for an environment that contains the variable total. When such an
environment is found, it replaces the value, in that environment, with the
value of the right-hand side. If the global or top-level environment is
reached without finding the variable total, then that variable is created
and assigned there.
open.account <- function(total) {
  list(deposit = function(amount) {
         if(amount <= 0)
           stop("Deposits must be positive!\n")
         total <<- total + amount
         cat(amount, "deposited. Your balance is", total, "\n\n")
       },
       withdraw = function(amount) {
         if(amount > total)
           stop("You don't have that much money!\n")
         total <<- total - amount
         cat(amount, "withdrawn. Your balance is", total, "\n\n")
       },
       balance = function() {
         cat("Your balance is", total, "\n\n")
       }
  )
}
1) Predict what the outputs of the following will be. Then, run the code
and see what you actually get as the outputs.
ross <- open.account(100)
robert <- open.account(200)
ross$withdraw(30)
ross$balance()
robert$balance()
2) Modify the code by using the tryCatch function to handle the
overdrawing problem.

3.5 Project: Additive genetic relationship matrix

The probability of identical genes by descent occurring in two individuals
is termed the coancestry or the coefficient of kinship (Falconer, 1989),
and the additive genetic relationship between two individuals is twice
their coancestry. The matrix which indicates the additive genetic
relationship among individuals is called the numerator relationship matrix
(A). It is a symmetric matrix with its diagonal element for animal i
(a_ii) being equal to 1 + F_i, where F_i is the inbreeding coefficient of
animal i (Wright, 1922). The diagonal element represents twice the
probability that two gametes taken at random from animal i will carry
identical alleles by descent. The off-diagonal element, a_ij, equals the
numerator of the coefficient of relationship (Wright, 1922) between
animals i and j.
The matrix A can be computed using path coefficients, but a recursive
method has been described by Henderson (1976), which is computationally
more convenient. The algorithm of the recursive method is as follows.
Let there be n animals in the pedigree. First, code the animals from 1 to
n and order them such that parents precede their progeny. Then, the A
matrix can be computed recursively.
If both parents (say s and d) of animal i are known:
\[ a_{ji} = a_{ij} = 0.5\,(a_{js} + a_{jd}), \quad j = 1, \ldots, i-1; \qquad a_{ii} = 1 + 0.5\,a_{sd} \]
If only one parent (s) is known and assumed unrelated to the mate:
\[ a_{ji} = a_{ij} = 0.5\,a_{js}, \quad j = 1, \ldots, i-1; \qquad a_{ii} = 1 \]
If both parents are unknown and are assumed unrelated:
\[ a_{ji} = a_{ij} = 0, \quad j = 1, \ldots, i-1; \qquad a_{ii} = 1 \]
(1) Define a function for the numerator relationship matrix A for an
arbitrary pedigree with n individuals.
(2) Calculate the numerator relationship matrix A for the pedigree given
below.
Calf   Sire   Dam
3      1      2
4      1      unknown
5      4      3
6      5      2
(3) Multiplying the matrix A by the additive genetic variance σ²_u leads
to the covariance among breeding values of the individuals (denoted as
Aσ²_u).
Let u_i be the breeding value for animal i; then var(u_i) = (1 + F_i) σ²_u.
Define a function which takes two animal ids (say i and j) as the input
parameters and returns the covariance of breeding values between the two
individuals. Specifically, the function is expected to return the variance
of the breeding value of individual i if the two animal ids are the same
(i.e., i = j).
62 3. R programming

