Intro R

This document provides an introduction to using the statistical programming language R. It discusses downloading and installing R, loading sample datasets, managing packages, and performing basic operations. Key points covered include loading the iris dataset as an example, listing and removing objects in memory, performing arithmetic operations, creating vectors, and accessing vector elements by name or index. Basic functions for sequences, repetition, aggregation, and indexing vectors are also described.

Uploaded by

bhyjed35


1) Setting up

1- Getting started with the software

Obtaining the software
http://cran.r-project.org/bin/windows/base/R-2.11.0-win32.exe
Installing the software
Run the program R-2.11.0-win32.exe, then follow the instructions displayed on
the screen.
Press Ctrl+L to clear the 'R Console' window.
2- A first example of use:
R ships with a number of data sets
data()
Load the "iris" data set
data(iris)
The function ls() lists the objects in memory
ls()
Remove objects from memory
rm(a,b)
Remove all objects
rm(list=ls())
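The commands above can be tried without touching your own workspace. The following is a minimal sketch, assuming a scratch environment (demo_env, a hypothetical name that does not appear in the text) is used in place of the global workspace:

```r
# Use a scratch environment so that the real workspace is left untouched
# (demo_env is a hypothetical name chosen for this illustration).
demo_env <- new.env()
data(iris, envir = demo_env)               # load the built-in iris data set
assign("a", 1, envir = demo_env)
assign("b", 2, envir = demo_env)
before <- ls(demo_env)                     # names of the objects in demo_env
rm(list = c("a", "b"), envir = demo_env)   # remove two objects by name
after <- ls(demo_env)
rm(list = ls(demo_env), envir = demo_env)  # remove everything that is left
remaining <- ls(demo_env)
print(before); print(after); print(remaining)
```

At the console, the same calls without the envir argument act on the global workspace, which is what ls() and rm() do by default.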
3- R packages
R makes use of a system of packages.
• A collection of packages is called a library
• Some packages are already loaded when R starts up. Other
packages need to be loaded using the library() function
Several packages come pre-installed with R.
> rownames(installed.packages()) # or installed.packages() or library()
[1] "KernSmooth" "MASS" "base" "boot"
[5] "class" "cluster" "datasets" "foreign"
[9] "grDevices" "graphics" "grid" "lattice"
[13] "methods" "mgcv" "nlme" "nnet"
[17] "rpart" "spatial" "splines" "stats"
[21] "stats4" "survival" "tcltk" "tools"
[25] "utils"
It is also useful to run the following command regularly:
> update.packages()
which checks the versions of the installed packages against those available
on CRAN (this command can also be called from the "Packages" menu under
Windows). The user can then update the packages for which a more recent version
than the one installed on the computer is available.
There are also many (more than 300) other packages contributed
by various users of R available online, from the Comprehensive
R Archive Network (CRAN):
http://cran.us.r-project.org/src/contrib/PACKAGES.html
New packages can be downloaded and installed using the
install.packages function. For example, to install the ISwR
package (if it's not already installed), one can use
> install.packages("ISwR")
To uninstall a package
> remove.packages("ISwR")
Some packages are already loaded when R starts up. At any
point, the list of currently loaded packages can be displayed with the
search function:
> search()
[1] ".GlobalEnv" "package:lattice"
[3] "package:tools" "package:methods"
[5] "package:stats" "package:graphics"
[7] "package:grDevices" "package:utils"
[9] "package:datasets" "Autoloads"
[11] "package:base"
Other packages can be loaded by the user. We will be interested
in the ISwR package, which contains the datasets used in the
text. This can be loaded by:
> library(ISwR)
> library(help = "ISwR") # List the functions provided in "ISwR"
Most R packages provide some data sets as well as functions. Use function data() to see the
data sets that are loaded by default. Data sets have help pages. For example the page describing
the structure and variables of the data set named swiss is displayed by help(swiss). To get
information on the data sets that are included in the datasets
package, specify:
data(package="datasets") # Specify ’package’, not ’library’.
Replace "datasets" by the name of any other installed package
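Before calling library() it can help to check whether a package is installed at all. This is a small sketch using base functions only; ISwR is simply the package named in the text and may or may not be present on your machine:

```r
# List the installed packages and test for two of them.
installed <- rownames(installed.packages())
has_stats <- "stats" %in% installed                    # stats ships with R
has_iswr <- requireNamespace("ISwR", quietly = TRUE)   # TRUE or FALSE, never an error
if (!has_iswr) {
  message("ISwR is not installed; install.packages(\"ISwR\") would fetch it from CRAN")
}
attached <- search()                                   # packages currently attached
print(has_stats); print(has_iswr); print(attached)
```

Unlike library(), requireNamespace() returns FALSE instead of stopping when the package is missing, which makes it convenient inside scripts.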
4- Basic commands
R : starts R in interactive mode.
q() : quits R.
help(solve) or ?solve : shows the help page for solve.
help.start() : opens a browser with the HTML help.
help.search('chi') : searches the help for the keyword fragment 'chi'.
example(solve) : runs the examples from the documentation of solve.
demo(package = "stats") : lists the demos of the stats package.
demo(nlm, package = "stats") : runs the nlm demo from the stats package.

2) Arithmetic
R uses the usual symbols for addition +, subtraction -, multiplication *, division
/, and exponentiation ^. R calculates to a high precision, but by default numbers are
displayed with 7 significant digits. You can change the display to x significant digits
using options(digits = x).
R has a number of built-in functions, for example sin(x), cos(x), tan(x),
(all in radians), exp(x), log(x), and sqrt(x). Some special constants such
as pi are also predefined.
> exp(1)
[1] 2.718282
> options(digits = 16)
> exp(1)

[1] 2.718281828459045
> pi
[1] 3.141592653589793
> round(pi,digits=3)
[1] 3.142
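The difference between what R stores and what it displays can be illustrated as follows. This is a sketch only; the variable names are made up for the example, and the digits option is restored at the end:

```r
old <- options(digits = 7)               # make sure the default display is in effect
x <- exp(1)
shown_default <- format(x)               # 7 significant digits
shown_3 <- format(x, digits = 3)         # 3 significant digits, display only
rounded <- round(x, 3)                   # changes the value: 3 decimal places
sig <- signif(123456.789, 3)             # changes the value: 3 significant digits
options(old)                             # restore the previous setting
print(shown_default); print(shown_3); print(rounded); print(sig)
```

options(digits = ) and format() only affect how a number is printed, whereas round() and signif() return a new, less precise value.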

3) Vectors
Data vectors can be made with the c() function, which combines its arguments. The whale data
can be entered as follows:
> whales = c(74, 122, 235, 111, 292, 111, 211, 133, 156, 79)
The values are separated by a comma. Once stored, the values can be printed by typing the
variable name
> whales
[1] 74 122 235 111 292 111 211 133 156 79
The [1] refers to the first observation. If more than one row is output, then this number
refers to the first observation in that row.
The c() function can also combine data vectors. For example:
> x = c(74, 122, 235, 111, 292)
> y = c(111, 211, 133, 156, 79)
> c(x,y)
[1] 74 122 235 111 292 111 211 133 156 79
Data vectors have a type. One restriction on data vectors is that all the values have the
same type. This can be numeric, as in whales, character strings, as in
> simpsons = c("Homer",'Marge',"Bart","Lisa","Maggie")
or one of the other types we will encounter. Character strings are made with matching
quotes, either double, ", or single, '. If we mix types within a data vector, the data will be
coerced into a common type, which is usually character.
Data can be of different kinds, and R classifies them into groups called modes:
– numeric (numeric values, which can be of two types: integer and double)
– logical (Boolean values, true/false)
– complex (complex numbers)
– character (character strings)
The mode of a variable is given by the command mode() and its type by
typeof().
The following functions test the type of a value or variable and perform
conversions:
Test             Conversion
is.numeric()     as.numeric()
is.complex()     as.complex()
is.character()   as.character()
is.logical()     as.logical()
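Coercion, modes and conversions can be checked directly at the console. The following sketch uses made-up variable names and only base functions:

```r
mixed <- c(1, "a", TRUE)            # mixing types: everything becomes character
num_log <- c(1.5, TRUE)             # numeric and logical: TRUE becomes 1
types <- c(typeof(1L), typeof(1), typeof("a"), typeof(TRUE), typeof(1i))
modes <- c(mode(1L), mode(1))       # integer and double share the mode "numeric"
ok <- as.numeric("3.14")            # a string that looks like a number converts
bad <- suppressWarnings(as.numeric("abc"))  # anything else gives NA (with a warning)
print(mixed); print(num_log); print(types); print(modes); print(ok); print(bad)
```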

Character vectors
Data, reports and figures require frequent manipulation of characters. Character strings are
delineated by double or single quotes. Here is an example:
> (s <- c("Florida; a politician's","nightmare"))
[1] "Florida; a politician's" "nightmare"
The vector s has two elements. To create a single string from s[1] and s[2],
we paste() them:
> paste(s[1], s[2])
[1] "Florida; a politician's nightmare"
By default, paste() separates its arguments with a space. If you want a different
character for spacing elements of characters, use the argument sep:
> paste(s[1], s[2], sep = '-')
[1] "Florida; a politician's-nightmare"
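paste() is also vectorised, and its collapse argument joins a whole vector into one string. A brief sketch, assuming a version of R that provides paste0() (it is not available in very old releases):

```r
s <- c("Florida; a politician's", "nightmare")
joined <- paste(s, collapse = " ")         # one string built from the whole vector
labels <- paste("Trial", 1:3, sep = ".")   # element-wise pasting
tight <- paste0("row", 1:2)                # paste0() is paste() with sep = ""
print(joined); print(labels); print(tight)
```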
Giving data vectors named entries. A data vector can have its entries named. These
will show up when it is printed. The names() function is used to retrieve and set values
for the names. This is done as follows:
> names(simpsons) = c("dad","mom","son","daughter 1","daughter 2")
> names(simpsons)
[1] "dad"        "mom"        "son"        "daughter 1"
[5] "daughter 2"
> simpsons
       dad        mom        son daughter 1 daughter 2
   "Homer"    "Marge"     "Bart"     "Lisa"   "Maggie"
To remove the names:
> names(simpsons) <- NULL
Accessing by names. In R, when the data vector has names, then the values can be
accessed by their names. This is done by using the names in place of the indices. A
simple example follows:
> x = 1:3
> names(x) = c("one","two","three") # set the names
> x["one"]
one
1
Using data.entry() to edit data: data.entry(x) will allow us to edit the data vector x. The
function does not make a new variable. To use data.entry() to make a new variable, we can first
create a simple one, as we have done below, and then finish the data entry with the spreadsheet.
> x <- c(1)
> data.entry(x)
It is also possible, if one wants to enter some data on the keyboard, to use
the function scan with simply the default options:
> x <- scan()
1: 1.0 1.5 2.0 2.5 3.0 3.5 4.0 4.5 5.0
10:
Read 9 items
> x
[1] 1.0 1.5 2.0 2.5 3.0 3.5 4.0 4.5 5.0
Simple sequences
> (v <- seq(0, 1, by = 0.1))
[1] 0.0 0.1 0.2 ... 0.9 1.0
> (v <- seq(0, 1, length.out = 11)) # 11 values between 0 and 1
[1] 0.0 0.1 ... 1.0
> seq(10)
[1] 1 2 3 4 5 6 7 8 9 10
> v <- 1:3 # the elements of v are 1, 2, 3
Repeated numbers. When a vector of repeated values is desired, the rep() function is
used.
> rep(1, 10)
[1] 1 1 1 1 1 1 1 1 1 1
> rep(1:3,3) # or rep(1:3,times=3)
[1] 1 2 3 1 2 3 1 2 3
> rep(1:3,each=3)
[1] 1 1 1 2 2 2 3 3 3
> rep(c("long","short"),c(1,2)) # 1 long and 2 short
[1] "long" "short" "short"
> sum(v) # sum of the elements of v
> sd(v) # standard deviation of v
> sum(v[1:3]) # sum of the first 3 elements of v
> v[c(1,4)] # the elements at positions 1 and 4
> v[-c(1,4)] # drops the elements at positions 1 and 4
> x[1] <- 5 # changes the value of the element at position 1
To replace every value equal to 1 by the value 25, use the following
command:
> x[x == 1] <- 25
> x[c(1, 4)] <- c(20, 30) # the elements at positions 1 and 4 are set to 20 and 30 respectively
> v[v>2] # the elements whose values are greater than 2
If you have two vectors with the same number of elements, you
can look up the values of one for which the values of the other
are greater (or smaller) than a given value. For example, suppose the vectors x
and y each contain 5 values. You can extract from y the values
for which x is greater than 4 with the following command:
> x <- 1:5 ; y <- 10:14
> y[x>4]
[1] 14
> x[y>12]
[1] 4 5
> max(v)
> which(v == max(v)) # index in v of max(v)
> length(v) # number of elements of v
> cumsum(a) # cumulative sum of 'a'
> cumsum(c(1,3,5))
[1] 1 4 9
> cumprod(b) # cumulative product of 'b'
> cumprod(c(1,3,5))
[1] 1 3 15
Exercise
If you want to create a sequence of the same length as an existing vector, then use along
like this.
> x<-10:20
> seq(along = x)
[1] 1 2 3 4 5 6 7 8 9 10 11
> seq(88,50,along=x)
[1] 88.0 84.2 80.4 76.6 72.8 69.0 65.2 61.4 57.6 53.8 50.0
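The same sequences can be written with seq_along() and the full argument name length.out. This is a sketch under the assumption that these base functions are available in your version of R; it also shows why seq_along() is safer than 1:length(x) when a vector may be empty:

```r
x <- 10:20
idx <- seq_along(x)                          # same result as seq(along = x)
ramp <- seq(88, 50, length.out = length(x))  # 11 evenly spaced values from 88 down to 50
empty <- numeric(0)
safe <- seq_along(empty)                     # an empty index: a loop over it runs zero times
unsafe <- 1:length(empty)                    # 1:0 counts down and gives 1 0
print(idx); print(ramp); print(safe); print(unsafe)
```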

Working with Vectors and Logical Subscripts
Take the example of a vector containing the 11 numbers 0 to 10:
x<-0:10
There are two quite different kinds of things we might want to do with this. We might want
to add up the values of the elements:
sum(x)
[1] 55
Alternatively, we might want to count the elements that passed some logical criterion.
Suppose we wanted to know how many of the values were less than 5:
sum(x<5)
[1] 5
This replaces
> length(x[x<5])
[1] 5
You see the distinction. We use the vector function sum in both cases. But sum(x) adds
up the values of the xs and sum(x<5) counts up the number of cases that pass the logical
condition ‘x is less than 5’. Logical TRUE has been coerced to numeric 1 and logical FALSE has
been coerced to numeric 0.
To find the sum of the values of x that are less than 5, we write:
sum(x[x<5])
[1] 10
Let’s look at this in more detail. The logical condition x<5 is either true or false:
x<5
[1] TRUE TRUE TRUE TRUE TRUE FALSE FALSE FALSE FALSE
[10] FALSE FALSE
You can imagine false as being numeric 0 and true as being numeric 1. Then the vector of
subscripts [x<5] is five 1s followed by six 0s:
1*(x<5)
[1] 1 1 1 1 1 0 0 0 0 0 0
Now imagine multiplying the values of x by the values of the logical vector
x*(x<5)
[1] 0 1 2 3 4 0 0 0 0 0 0
When the function sum is applied, it gives us the answer we want: the sum of the values
of the numbers 0+1+2+3+4=10.
sum(x*(x<5))

[1] 10
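The three uses of a logical condition discussed above (counting, taking a proportion, and subsetting) can be put side by side. A short sketch with illustrative variable names:

```r
x <- 0:10
below <- x < 5                  # logical vector, one TRUE/FALSE per element
n_below <- sum(below)           # number of TRUEs
share_below <- mean(below)      # proportion of TRUEs
total_below <- sum(x[below])    # sum of the selected values
where <- which(below)           # positions of the TRUEs
print(n_below); print(share_below); print(total_below); print(where)
```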

Exercise :
Suppose we want to work out the sum of the three largest values in a vector. There are
two steps: first sort the vector into descending order. Then add up the values of the first
three elements of the sorted vector. Let’s do this in stages. First, the values of y:
y<-c(8,3,5,7,6,6,8,9,2,3,9,4,10,4,11)
Now if you apply sort to this, the numbers will be in ascending sequence, and this makes
life slightly harder for the present problem:
sort(y)
[1] 2 3 3 4 4 5 6 6 7 8 8 9 9 10 11
We can use the reverse function, rev like this (use the Up arrow key to save typing):
rev(sort(y))
[1] 11 10 9 9 8 8 7 6 6 5 4 4 3 3 2
So the answer to our problem is 11+10+9=30. But how to compute this? We can use
specific subscripts to discover the contents of any element of a vector. We can see that 10
is the second element of the sorted array. To compute this we just specify the subscript [2]:
rev(sort(y))[2]
[1] 10
A range of subscripts is simply a series generated using the colon operator. We want the
subscripts 1 to 3, so this is:
rev(sort(y))[1:3]
[1] 11 10 9
So the answer to the exercise is just
sum(rev(sort(y))[1:3])
[1] 30
This is equivalent to
sum(sort(y)[(length(y)-2):length(y)])
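Two other ways of writing the same computation, offered as a sketch rather than as the method of the text: sort() accepts decreasing = TRUE, and tail() returns the last elements of a vector.

```r
y <- c(8, 3, 5, 7, 6, 6, 8, 9, 2, 3, 9, 4, 10, 4, 11)
top3 <- sort(y, decreasing = TRUE)[1:3]   # largest first, then keep three
top3_alt <- tail(sort(y), 3)              # ascending sort, then the last three
total <- sum(top3)
print(top3); print(top3_alt); print(total)
```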

Addresses within Vectors

There are two important functions for finding addresses within arrays. The function which
is very easy to understand.
y
[1] 8 3 5 7 6 6 8 9 2 3 9 4 10 4 11
Suppose we wanted to know which elements of y contained values bigger than 5. We type
which(y>5)
[1] 1 4 5 6 7 8 11 13 15

Exercise :
To extract every nth element from a long vector we can use seq as an index. In this case
I want every 25th value in a 1000-long vector of normal random numbers with mean value
100 and standard deviation 10:
xv<-rnorm(1000,100,10)
xv[seq(25,length(xv),25)]
[1] 100.98176 91.69614 116.69185 97.89538 108.48568 100.32891 94.46233
[8] 118.05943 92.41213 100.01887 112.41775 106.14260 93.79951 105.74173
[15] 102.84938 88.56408 114.52787 87.64789 112.71475 106.89868 109.80862

[22] 93.20438 96.31240 85.96460 105.77331 97.54514 92.01761 97.78516
[29] 87.90883 96.72253 94.86647 90.87149 80.01337 97.98327 92.77398
[36] 121.47810 92.40182 87.65205 115.80945 87.60231

Finding Closest Values

Finding the value in a vector that is closest to a specified value is straightforward using
which. Here, we want to find the value of xv that is closest to 108.0:
which(abs(xv-108)==min(abs(xv-108)))
[1] 332
The closest value to 108.0 is in location 332. But just how close to 108.0 is this 332nd
value? We use 332 as a subscript on xv to find this out
xv[332]
[1] 108.0076
which amounts to
> xv[abs(xv-108)==min(abs(xv-108))]
Thus, we can write a function to return the closest value to a specified value sv
closest <- function(xv, sv) {
  xv[which(abs(xv - sv) == min(abs(xv - sv)))]
}
and run it like this:
closest(xv,108)
[1] 108.0076
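A possible variant uses which.min(), which returns the position of the first minimum. The sketch below uses a small fixed vector instead of random numbers so that the results are predictable; closest_first is a hypothetical name:

```r
# Version from the text: returns every value tied for closest.
closest <- function(xv, sv) {
  xv[which(abs(xv - sv) == min(abs(xv - sv)))]
}
# Variant: returns only the first closest value.
closest_first <- function(xv, sv) xv[which.min(abs(xv - sv))]

xv <- c(100, 107.5, 108.2, 110)
a <- closest(xv, 108)
b <- closest_first(xv, 108)
ties <- c(107, 109)                 # both are at distance 1 from 108
both <- closest(ties, 108)
first_only <- closest_first(ties, 108)
print(a); print(b); print(both); print(first_only)
```

Which behaviour is preferable depends on whether ties matter for the problem at hand.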

Trimming Vectors Using Negative Subscripts

Individual subscripts are referred to in square brackets. So if x is like this:
x<- c(5,8,6,7,1,5,3)
we can find the 4th element of the vector just by typing
x[4]
[1] 7
An extremely useful facility is to use negative subscripts to drop terms from a vector.
Suppose we wanted a new vector, z, to contain everything but the first element of x
z <- x[-1]
z
[1] 8 6 7 1 5 3

It is often useful to have the values in a vector labelled in some way. For instance, if our
data are counts of 0, 1, 2, … occurrences in a vector called counts (or frequencies)
> (counts<-c(25,12,7,4,6,2,1,0,2))
[1] 25 12 7 4 6 2 1 0 2
so that there were 25 zeros, 12 ones and so on, it would be useful to name each of the
counts with the relevant number 0 to 8:
> names(counts)<-0:8
Now when we inspect the vector called counts we see both the names and the frequencies:
counts
 0  1  2  3  4  5  6  7  8
25 12  7  4  6  2  1  0  2
If you have computed a table of counts, and you want to remove the names, then use the
as.vector function like this:

> (st<-table(rpois(2000,2.3))) # Poisson: P[X = k] = exp(-lambda) * lambda^k / k!, with lambda = 2.3
  0   1   2   3   4   5   6   7   8   9
205 455 510 431 233 102  43  13   7   1
> names(st)
[1] "0" "1" "2" "3" "4" "5" "6" "7" "8" "9"
> as.vector(st)
[1] 205 455 510 431 233 102 43 13 7 1

Exercise
Suppose our task is to calculate a trimmed mean of x which ignores both the smallest
and largest values (i.e. we want to leave out the 1 and the 8 in this example). There are two
steps to this. First, we sort the vector x. Then we remove the first element using x[-1] and
the last using x[-length(x)]. We can do both drops at the same time by concatenating both
instructions like this: -c(1,length(x)). Then we use the built-in function mean:
trim.mean <- function (x) mean(sort(x)[-c(1,length(x))])
Now try it out. The answer should be mean(c(5,6,7,5,3)) = 26/5 = 5.2:
trim.mean(x)
[1] 5.2

Table 1. Arithmetic and logical operators

Symbol   Meaning
!=       not equal
%%       the remainder of a division (modulo)
%/%      the integer part of a division (integer quotient)
*        multiplication
+        addition
-        subtraction
^        the power
/        division
<        less than
<=       less than or equal to
>        greater than
>=       greater than or equal to
==       logical equals (double =)
&        logical AND
|        logical OR
!        logical NOT
xor      exclusive OR

Examples
> x
[1] 5 8 6 7 1 5 3
> x[x!=5] <- 0
> x
[1] 5 0 0 0 0 5 0
> 10%%2
[1] 0
> x <- c(5,8,6,7,1,5,3) # restore the original x
> x[x%%2 == 0]
[1] 8 6
> 31%/%4
[1] 7
> x[!x<5] <- 0
> x
[1] 0 0 0 0 1 0 3
4) Functions
Custom functions to extend the R language can be created using the function
keyword. For example, a function to calculate the standard deviation of a sample
vector x could be defined as follows:
> se <- function(x) {
+   n <- length(x)
+   xbar <- mean(x)
+   sqrt(sum((x - xbar)^2) / (n - 1))
+ }
The function arguments are declared as the arguments to the function keyword.
Here there is just one argument, named x. The value returned by
the function is the value of its final line. The name of the function is the name of
the variable you assign it to. Here the function is named se. This function could
then be used as follows:
> y<-rnorm(100)
> se(y)
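Since se computes the sample standard deviation, its result can be checked against the built-in sd(). A minimal sketch; the seed value and the small test vector are arbitrary choices made for the illustration:

```r
se <- function(x) {
  n <- length(x)
  xbar <- mean(x)
  sqrt(sum((x - xbar)^2) / (n - 1))   # sample standard deviation
}
set.seed(42)                          # arbitrary seed, for reproducibility only
y <- rnorm(100)
agree <- isTRUE(all.equal(se(y), sd(y)))
small <- se(c(2, 4, 4, 4, 5, 5, 7, 9))  # mean 5, squared deviations sum to 32
print(agree); print(small)
```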
Mathematical functions in R
> a <- 0; b <- ifelse(a >= 1, "yes", "no"); b
[1] "no"

Function                       Meaning
log(x)                         log to base e of x
exp(x)                         antilog of x (e^x)
log(x,n)                       log to base n of x
log10(x)                       log to base 10 of x
sqrt(x)                        square root of x
factorial(x)                   x!
choose(n,x)                    binomial coefficients n!/(x! (n−x)!)
floor(x)                       greatest integer <= x
ceiling(x)                     smallest integer >= x
trunc(x)                       closest integer to x between x and 0: trunc(1.5) = 1, trunc(-1.5) = -1;
                               trunc is like floor for positive values and like ceiling for negative values
round(x, digits=0)             round the value of x to an integer
runif(n)                       generates n random numbers between 0 and 1 from a uniform distribution
cos(x)                         cosine of x in radians
sin(x)                         sine of x in radians
tan(x)                         tangent of x in radians
acos(x), asin(x), atan(x)      inverse trigonometric transformations of real or complex numbers
acosh(x), asinh(x), atanh(x)   inverse hyperbolic trigonometric transformations of real or complex numbers
abs(x)                         the absolute value of x
unique(x)                      as its name suggests, removes the duplicates from a vector
> unique(c(1,3,6,2,7,4,8,1,0))
[1] 1 3 6 2 7 4 8 0

Numbers with exponents

For very big numbers or very small numbers R uses the following scheme:
1.2e3 means 1200 because the e3 means ‘move the decimal point 3 places to the right’
1.2e-2 means 0.012 because the e-2 means ‘move the decimal point 2 places to the left’
3.9+4.5i is a complex number with real (3.9) and imaginary (4.5) parts, and i is the square
root of −1.
Modulo
Suppose we want to know the integer part of a division: say, how many 13s are there in 119:
> 119 %/% 13
[1] 9
Now suppose we wanted to know the remainder (what is left over when 119 is divided by
13): in maths this is known as modulo:
> 119 %% 13
[1] 2
Modulo is very useful for testing whether numbers are odd or even: odd numbers have
modulo 2 value 1 and even numbers have modulo 2 value 0:
> 9 %% 2
[1] 1
> 8 %% 2
[1] 0
Likewise, you use modulo to test if one number is an exact multiple of some other number.
For instance to find out whether 15 421 is a multiple of 7, ask:
> 15421 %% 7 == 0
[1] TRUE
Rounding
Various sorts of rounding (rounding up, rounding down, rounding to the nearest integer)
can be done easily. Take 5.7 as an example. The ‘greatest integer less than’ function is floor
> floor(5.7)
[1] 5
and the ‘next integer’ function is ceiling
> ceiling(5.7)
[1] 6
Exercise
Suppose now that we need to produce a vector containing the numbers 1 to 50 but
omitting all the multiples of seven (7, 14, 21, etc.).
First make a vector of all the numbers 1 to 50, including the multiples of 7, then keep
only the elements whose remainder modulo 7 is not zero:
> vec <- 1:50
> vec[!(vec %% 7 == 0)]
[1] 1 2 3 4 5 6 8 9 10 11 12 13 15 16 17 18 19 20 22 23 24 25
[23] 26 27 29 30 31 32 33 34 36 37 38 39 40 41 43 44 45 46 47 48 50
Infinity and Things that Are Not a Number (NaN)
Calculations can lead to answers that are plus infinity, represented in R by Inf, or minus
infinity, which is represented as -Inf:
> 3/0
[1] Inf
> -12/0
[1] -Inf
Calculations involving infinity can be evaluated: for instance,
> exp(-Inf)
[1] 0
> 0/Inf
[1] 0
Other calculations, however, lead to quantities that are not numbers. These are represented
in R by NaN (‘not a number’). Here are some of the classic cases:
> 0/0
[1] NaN
> Inf-Inf
[1] NaN
> Inf/Inf
[1] NaN
> is.finite(10)
[1] TRUE
> is.infinite(10)
[1] FALSE
> is.infinite(Inf)
[1] TRUE
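The test functions behave differently on missing values, NaN and infinities, which is easy to see on one small vector. A sketch with an illustrative vector v:

```r
v <- c(1, NA, NaN, Inf, -Inf)
finite <- is.finite(v)       # TRUE only for ordinary numbers
infinite <- is.infinite(v)   # TRUE for Inf and -Inf
nan <- is.nan(v)             # TRUE only for NaN
missing <- is.na(v)          # TRUE for both NA and NaN
print(rbind(finite, infinite, nan, missing))
```

Note that is.na() also reports NaN as missing, whereas is.nan() does not report NA.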

5) Missing values NA
Missing values in dataframes are a real source of irritation because they affect the way that
model-fitting functions operate and they can greatly reduce the power of the modelling that
we would like to do.
Some functions do not work with their default settings when there are missing values in
the data, and mean is a classic example of this:
> x<-c(1:8,NA)
> mean(x)
[1] NA
In order to calculate the mean of the non-missing values, you need to specify that the
NA are to be removed, using the na.rm=TRUE argument:
> mean(x,na.rm=T)
[1] 4.5
To check for the location of missing values within a vector, use the function is.na(x)
rather than x != "NA". Here is an example where we want to find the locations (7 and 8) of
missing values within a vector called vmv:
> vmv <- c(1:6,NA,NA,9:12)
> vmv
[1] 1 2 3 4 5 6 NA NA 9 10 11 12
> is.na(vmv)
[1] FALSE FALSE FALSE FALSE FALSE FALSE TRUE TRUE FALSE FALSE FALSE FALSE
Making an index of the missing values is achieved using which like this:
> which(is.na(vmv))
[1] 7 8
or
> which(is.na(vmv)==T)
[1] 7 8
This returns the indices (positions) of the elements of vmv for which the condition
is.na(vmv)==T holds.
If the missing values are genuine counts of zero, you might want to edit the NA to 0.
Use the is.na function to generate subscripts for this
> vmv[is.na(vmv)]<- 0
> vmv
[1] 1 2 3 4 5 6 0 0 9 10 11 12
This is the same kind of logical subscripting as in vmv[vmv<4].
Alternatively, use the ifelse function like this:
> vmv<-c(1:6,NA,NA,9:12)
> ifelse(is.na(vmv),0,vmv)
[1] 1 2 3 4 5 6 0 0 9 10 11 12
> sum(is.na(vmv)) # number of NAs
Exercise
How can we check that the positions of the missing values are the same in two vectors?
> x<-c(1,4,6,NA,4,NA)
> y<-c(4,3,2,NA,5,NA)
We write the following instruction:
> all(is.na(x)==is.na(y))
[1] TRUE
If we ignore the missing values, the two vectors are different. Indeed
> all(x[!is.na(x)]==y[!is.na(y)])
[1] FALSE
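The reason is.na() is needed, rather than a comparison with NA, is that any comparison involving NA is itself NA. A short sketch; anyNA() is assumed to be available, which is the case in reasonably recent versions of R:

```r
x <- c(1, 4, 6, NA, 4, NA)
cmp <- x == NA                 # every comparison with NA yields NA
pos <- which(is.na(x))         # positions of the missing values
has_na <- anyNA(x)             # is there at least one NA?
m <- mean(x, na.rm = TRUE)     # mean of 1, 4, 6, 4
clean <- x[!is.na(x)]          # the vector without its missing values
print(cmp); print(pos); print(has_na); print(m); print(clean)
```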
6) Vector functions
One of R’s great strengths is its ability to evaluate functions over entire vectors, thereby
avoiding the need for loops and subscripts. Important vector functions are listed in Table 2.
Table 2. Vector functions used in R.
Operation                Meaning
max(x)                   maximum value in x
min(x)                   minimum value in x
sum(x)                   total of all the values in x
mean(x)                  arithmetic average of the values in x
median(x)                median value in x
range(x)                 vector of min(x) and max(x)
var(x)                   sample variance of x
cor(x,y)                 correlation between vectors x and y
sort(x)                  a sorted version of x
rank(x)                  vector of the ranks of the values in x
order(x)                 an integer vector containing the permutation to sort x into ascending order
quantile(x)              vector containing the minimum, lower quartile, median, upper quartile, and maximum of x
quantile(x,probs=1:3/4)  gives the three quartiles Q1, Q2 and Q3
cumsum(x)                vector containing the sum of all of the elements up to that point
cumprod(x)               vector containing the product of all of the elements up to that point
cummax(x)                vector of non-decreasing numbers which are the cumulative maxima of the values in x up to that point
cummin(x)                vector of non-increasing numbers which are the cumulative minima of the values in x up to that point
pmax(x,y,z)              vector, of length equal to the longest of x, y or z, containing the maximum of x, y or z for the ith position in each
pmin(x,y,z)              vector, of length equal to the longest of x, y or z, containing the minimum of x, y or z for the ith position in each
colMeans(x)              column means of dataframe or matrix x
colSums(x)               column totals of dataframe or matrix x
rowMeans(x)              row means of dataframe or matrix x
rowSums(x)               row totals of dataframe or matrix x
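A few of the less obvious functions in the table are easier to grasp on tiny inputs. The vectors below are made up for this sketch:

```r
x <- c(8, 3, 5)
sorted <- sort(x)                        # the values in ascending order
ord <- order(x)                          # positions that would sort x
rnk <- rank(x)                           # rank of each value within x
same <- identical(x[ord], sorted)        # indexing by order() reproduces sort()
par_max <- pmax(c(1, 5, 3), c(4, 2, 6))  # element-wise maximum of two vectors
run_max <- cummax(c(1, 3, 2, 5, 4))      # running maximum
quartiles <- quantile(0:100, probs = 1:3 / 4)
print(ord); print(rnk); print(par_max); print(run_max); print(quartiles)
```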

The main data structures in R are:

– vectors (vector)
– factors (factor)
– matrices (matrix)
– arrays (array)
– lists (list)
– character strings (string)
– data frames (data frame)

7) The sample Function

This function shuffles the contents of a vector into a random sequence while maintaining all
the numerical values intact. It is extremely useful for randomization in experimental design,
in simulation and in computationally intensive hypothesis testing. Here is the original y
vector again:
y
[1] 8 3 5 7 6 6 8 9 2 3 9 4 10 4 11
and here are two samples of y:
sample(y)
[1] 8 8 9 9 2 10 6 7 3 11 5 4 6 3 4
sample(y)
[1] 9 3 9 8 8 6 5 11 4 6 4 7 3 2 10
The order of the values is different each time that sample is invoked, but the same numbers
are shuffled in every case. This is called sampling without replacement. You can specify
the size of the sample you want as an optional second argument:
sample(y,5)
[1] 9 4 10 8 11
sample(y,5)
[1] 9 3 4 2 8

The option replace=T allows for sampling with replacement. The vector produced by the sample
function with replace=T is the same length as the vector sampled, but some values are left out at
random and other values, again at random, appear two or more times. In this sample, 10 has been
left out, and there are now three 9s:
sample(y,replace=T)
[1] 9 6 11 2 9 4 6 8 8 4 4 4 3 9 3
In this next case, there are two 10s and only one 9:
sample(y,replace=T)
[1] 3 7 10 6 8 2 5 11 4 6 3 9 10 7 4
More advanced options in sample include specifying different probabilities with which
each element is to be sampled (prob=). For example, if we want to take four numbers at
random from the sequence 1:10 without replacement where the probability of selection (p)
is 5 times greater for the middle numbers (5 and 6) than for the first or last numbers, and
we want to do this five times, we could write
p <- c(1, 2, 3, 4, 5, 5, 4, 3, 2, 1)
x<-1:10
sapply(1:5,function(i) sample(x,4,prob=p))
     [,1] [,2] [,3] [,4] [,5]
[1,]    8    7    4   10    8
[2,]    7    5    7    8    7
[3,]    4    4    3    4    5
[4,]    9   10    8    7    6
so the four random numbers in the first trial were 8, 7, 4 and 9 (i.e. column 1).
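Because sample() is random, its output changes from run to run. Fixing the seed with set.seed() makes a draw repeatable; the sketch below uses an arbitrary seed value:

```r
y <- c(8, 3, 5, 7, 6, 6, 8, 9, 2, 3, 9, 4, 10, 4, 11)
set.seed(123)                  # arbitrary seed
s1 <- sample(y)
set.seed(123)                  # same seed again
s2 <- sample(y)
same_draw <- identical(s1, s2)                 # the shuffle is reproduced exactly
same_values <- identical(sort(s1), sort(y))    # a shuffle keeps all the values
boot <- sample(y, replace = TRUE)              # with replacement: same length,
                                               # values may repeat or be left out
print(same_draw); print(same_values); print(boot)
```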

8) factor
A factor is a type of vector that encodes a qualitative property (a nominal attribute). Each value is coded
internally by a number rather than by the character string that represents it: a factor is stored
internally as a numeric vector with values 1, 2, 3, …, k, where k is
the number of levels.
Statisticians typically recognise three basic types of variable: numeric, ordinal,
and categorical. Both ordinal and categorical variables take values from
some finite set, but the set is ordered for ordinal variables. For example in an
experiment one might grade the level of physical activity as low, medium, or
high, giving an ordinal measurement. An example of a categorical variable is
hair colour. In R the data type for ordinal and categorical vectors is factor.
The possible values of a factor are referred to as its levels.
To create factors in R, use the function factor(). However, many operations on data in R create
factors by default.
> fac <- factor(c("rouge", "vert", "rouge", "bleu", "vert"))
> fac
[1] rouge vert rouge bleu vert
Levels: bleu rouge vert
> levels(fac)
[1] "bleu" "rouge" "vert"
> as.numeric(fac)
[1] 2 3 2 1 3
The function as.numeric extracts the numerical coding as numbers
1–3 and levels extracts the names of the levels.
Exercise
If x is a factor with n levels and y is a length n vector, what happens if you compute y[x]?
Answer
Factor x gets treated as if it contained the integer codes.
x <- factor(c("Huey", "Dewey", "Louie", "Huey"))
y <- c("blue", "red", "green")
x
y[x]
y[as.numeric(x)]
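Running the answer step by step shows what the integer codes are and what gets selected. This sketch repeats the vectors of the exercise and adds intermediate variables for illustration:

```r
x <- factor(c("Huey", "Dewey", "Louie", "Huey"))
y <- c("blue", "red", "green")
lev <- levels(x)                 # levels are sorted alphabetically
codes <- as.integer(x)           # position of each value in levels(x)
picked <- y[x]                   # indexing by a factor uses the integer codes
same <- identical(picked, y[as.numeric(x)])
by_label <- y[as.character(x)]   # indexing by label fails here: y has no names
print(lev); print(codes); print(picked); print(by_label)
```

Converting the factor with as.character() before indexing is the usual way to look values up by label, provided the vector being indexed has names.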
Another example
> Sch<-sample(0:1,20,replace=T) # sample of 20 numbers drawn from the vector c(0, 1)
> Sch
[1] 0 1 1 0 0 1 0 1 1 ...
> Sch.f<-factor(Sch,labels=c("private","public"))
> Sch.f
[1] private public public private private public private public public
[10] private public public public public public public private private
[19] private public
Levels: private public
Another example:
> hair <- c("blond", "black", "brown", "brown", "black", "gray","none")
> is.character(hair)
[1] TRUE
> is.factor(hair)
[1] FALSE
> hair <- factor(hair)
> levels(hair)
[1] "black" "blond" "brown" "gray" "none"
> hair <- factor(hair, levels = c("black", "gray", "brown", "blond", "white", "none"))
> table(hair)
hair
black  gray brown blond white  none
    2     1     2     1     0     1
Note the use of the function table to calculate the number of times each level of the factor
appears. table can be applied to other modes of vectors as well as factors.
To create an ordered factor we just include the option ordered = TRUE in the factor command. In
this case it is usual to specify the levels of the factor yourself, as that determines the ordering.
> phys.act <- c("L", "H", "H", "L", "M", "M")
> phys.act <- factor(phys.act, levels = c("L", "M", "H"), ordered = TRUE)
Another possibility:
> phys.act <- ordered(phys.act, levels = c("L", "M", "H"))
> is.ordered(phys.act)
[1] TRUE
> phys.act[2] > phys.act[1]
[1] TRUE
Often abbreviations or numerical codes are used to represent the levels of a
factor. You can change the names of the levels using the labels argument.
If you do this then it is good practice to specify the levels too, so you know
which label goes with which level.
> phys.act <- factor(phys.act, levels = c("L", "M", "H"),
+ labels = c("Low", "Medium", "High"), ordered = TRUE)
> table(phys.act)
phys.act
   Low Medium   High
     2      2      2
> which(phys.act == "High")
[1] 2 3
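With an ordered factor, comparisons follow the order of the levels rather than alphabetical order. A brief sketch; act is an illustrative name for the same physical activity data:

```r
act <- factor(c("L", "H", "H", "L", "M", "M"),
              levels = c("L", "M", "H"), ordered = TRUE)
at_least_m <- act >= "M"              # compares positions in L < M < H
top <- as.character(max(act))         # max() is defined for ordered factors
codes <- as.integer(act)              # 1 = L, 2 = M, 3 = H
print(at_least_m); print(top); print(codes)
```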

9) list
A list is a special kind of vector whose elements can be of
any mode, including the mode list (which makes it possible to nest
lists). The basic function for creating lists is list. It is generally preferable to name
the elements of a list, since extracting elements by their label is simpler and safer.
Elements of a list can be extracted in two ways:
1. with double square brackets [[ ]];
2. by their label with list.name$element.label.
The names of the list elements are accessed with names(lis). The names of the
elements can also be changed by assignment: names(lis) <- c("f", "l", "a", "c")
Example:
> lis <- list(firstname = "jean", lastname = "dupond", age = 35, childAges = c(3, 5, 9))
> lis[[4]]
[1] 3 5 9
> lis$age
[1] 35
> names(lis)
[1] "firstname" "lastname" "age" "childAges"
> names(lis) <- c("f", "l", "a", "c")
> lis
$f
[1] "jean"

$l
[1] "dupond"

$a
[1] 35

$c
[1] 3 5 9
Warning: lis[1] returns a list made up of a single element, the first one
> lis[1]
$f
[1] "jean"
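The difference between single and double brackets can be checked with class(). A sketch using the list from the example, before its names are shortened:

```r
lis <- list(firstname = "jean", lastname = "dupond", age = 35, childAges = c(3, 5, 9))
sub <- lis[1]                    # single brackets: a list of length 1
elem <- lis[[1]]                 # double brackets: the element itself
kinds <- c(class(sub), class(elem))
next_age <- lis[["age"]] + 1     # works: lis[["age"]] is a number
second_child <- lis$childAges[2] # $ extracts, then [2] indexes the vector
print(kinds); print(next_age); print(second_child)
```

By contrast, lis["age"] + 1 would fail, because lis["age"] is still a list.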
10) matrix
> X <- matrix(1:12, nrow=4, ncol=3, byrow=TRUE)
Adding names to the rows and columns
> mat <- matrix ( c(3,2,…), nrow=3, dimnames = list (c("A","B","C"), c("a","b","c")))
> dimnames(mat)[[1]] <- letters[1:3] : the row names become a b c
or
> rownames (mat) <- c("a","b","c")
> class(X)
[1] "matrix"
(R 4.0.0 and later print "matrix" "array".)
> dim(X)
[1] 4 3
Modifying a matrix
> mat [1, 2] <- 5 : changes the element in row 1, column 2
> mat [1, ] <- c(5,6,7) : changes the first row
> round(mat, 2) : rounds the entries to 2 decimal places
> mat [1,2] : extracts the element in row 1, column 2
> mat[ , 2] : extracts the second column
> mat[cbind(c(1,3),c(2,5))] : extracts the elements at indices (1,2) and (3,5); each row of the
index matrix is a (row, column) pair
> which(m == 1) : positions of the elements of m equal to 1
> which(m == 1, arr.ind=TRUE) : row and column indices of the elements equal to 1
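The two forms of which, and indexing by a two-column matrix, can be compared on a small matrix. The matrix m below is made up for illustration.

```r
# Filled column by column: columns are (1,0), (2,1), (1,3)
m <- matrix(c(1, 0, 2, 1, 1, 3), nrow = 2)

which(m == 1)                         # linear, column-major positions: 1 4 5
idx <- which(m == 1, arr.ind = TRUE)  # one (row, col) pair per match
idx
m[idx]                                # a two-column index matrix extracts them
```

The result of which(..., arr.ind = TRUE) has exactly the shape expected by matrix indexing, so the two steps can be chained.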
Converting a matrix to a vector
> as.vector (X)
[1] 1 4 7 10 2 5 8 11 3 6 9 12
The elements are read column by column, whatever the value of byrow used to build X.
> cbind(vect1,vect2,vect3) : returns a matrix whose columns are the vectors vect1, vect2 and
vect3.
> m2 <- cbind(1,1:4)
> m2
[,1] [,2]
[1,] 1 1
[2,] 1 2
[3,] 1 3
[4,] 1 4
> rbind(vect1,vect2,vect3) : same, but the vectors become the rows.
rbind and cbind can also be used to insert a row or a column:
> rbind(mat[1:2,], NA, mat[3:4,]) # Insert a row
> cbind(mat[,1], NA, mat[,2:3]) # Insert a column
Other ways to create a matrix
> vector<-c(1,2,3,4,4,3,2,1)
> V<-matrix(vector,byrow=T,nrow=2)
> V
[,1] [,2] [,3] [,4]
[1,] 1 2 3 4
[2,] 4 3 2 1
Or by assigning dimensions to the vector:
> dim(vector)<-c(4,2)
To check that the result is indeed a matrix:
> is.matrix(vector)
[1] TRUE
> vector
[,1] [,2]
[1,] 1 4
[2,] 2 3
[3,] 3 2
[4,] 4 1
Consider the matrix:
> X<-matrix(rpois(20,1.5),nrow=4)
To add the row names Trial.1, Trial.2, Trial.3, Trial.4, write
> rownames(X)<-paste("Trial.",1:4,sep="")
which gives
> X
[,1] [,2] [,3] [,4] [,5]
Trial.1 2 2 1 1 3
Trial.2 1 0 2 2 1
Trial.3 2 3 0 2 2
Trial.4 1 0 4 1 1
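Column names are set in the same way with colnames.
To get the default names row1, row2, ..., first remove the existing names, then let rownames generate them:
> rownames(X) <- NULL
> rownames(X) <- rownames(X, do.NULL=FALSE)
> X
[,1] [,2] [,3] [,4] [,5]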
row1 2 2 1 1 3
row2 1 0 2 2 1
row3 2 3 0 2 2
row4 1 0 4 1 1

The function dimnames can also be used; it takes a list as argument (rows first, columns second):
> dimnames(X)<-list(NULL,paste("drug.",1:5,sep=""))
> X
drug.1 drug.2 drug.3 drug.4 drug.5
[1,] 2 2 1 1 3
[2,] 1 0 2 2 1
[3,] 2 3 0 2 2
[4,] 1 0 4 1 1
Aggregates on matrices
> rowSums (mat) (or colSums (mat)) : returns the vector of row sums (or column sums).
> rowMeans (mat) (or colMeans (mat)) : returns the vector of row means (or column means).
Use na.rm = TRUE to ignore NA values.
Matrix product (n,p) * (p,q) = (n,q)
> A %*% B
Scalar product t(x) y, where x and y are two vectors with the same number of components
> x %*% y
Element-by-element product
> A * B
where A and B are two vectors (or two matrices) of the same dimension.
Transpose of a matrix
> t(X)
Identity matrix
> I <- diag(n)
Inverse of a matrix
> solve (X)
Diagonal of a matrix
> diag (mat) : returns a vector containing the diagonal of mat
> diag (k,n) : returns a diagonal matrix of dimension n whose diagonal elements are all
equal to k
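The operators above can be checked against each other on a small example. The matrix A and the vectors x and y are made up for illustration.

```r
A <- matrix(c(2, 1, 1, 3), nrow = 2)   # a small invertible matrix
I2 <- diag(2)                          # 2 x 2 identity matrix
Ainv <- solve(A)                       # inverse of A

all.equal(A %*% Ainv, I2)              # TRUE, up to rounding error

x <- c(1, 2)
y <- c(3, 4)
drop(x %*% y)                          # scalar product: 1*3 + 2*4 = 11
x * y                                  # element-wise product: 3 8
```

Comparing with all.equal rather than == allows for the small rounding errors that matrix inversion introduces.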
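Exercise: compute the product of two matrices
f<-function(M,N){
# the number of columns of M must equal the number of rows of N
m<-dim(M)[2];n<-dim(N)[1]
if (m!=n) stop("non-conformable matrices")
p<-dim(M)[1];q<-dim(N)[2]
mat<-matrix(NA,p,q)
for (i in 1:p){
for (j in 1:q) {
# entry (i,j) is the scalar product of row i of M and column j of N
mat[i,j]<-M[i,]%*%N[,j]
}
}
return(mat)
}
M<-matrix(1,2,3)
N<-matrix(2,3,2)
f(M,N) # same result as the built-in product below
M%*%N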

11) data frame

For reading and writing files, R uses the working directory. The command getwd() (get working
directory) shows this directory, and setwd("C:/data") changes it. The path to a file must be given if
the file is not in the working directory.
R can read data stored in text (ASCII) files with the following functions:
read.table (which has several variants, see below) and scan. R
can also read files in other formats (Excel, SAS, SPSS, ...), and access SQL-type
databases, but the functions needed for this are not in the package base.
These functionalities are very useful for a more advanced use of R.
The function read.table creates a data frame, and so is
the main way to read data in tabular form.
Assume that the data from Table 1.1 are in a file called exemple.txt in the working
directory, with fields separated by one or more spaces.

V F
1 3
2 4
Table 1.1
To read this file, type
> exemple<-read.table("exemple.txt",header=T)
or
> exemple<-read.table("C:/livre/logiciel_R/cours_R/exemple.txt",header=T)
We have assumed that the fields in exemple.txt are separated by spaces (or tabs), as
allowed by the default setting (sep="") for read.table().
Notice header=T specifying that the first line is a header containing
the names of variables contained in the file. Also note that you use forward
slashes (/), not single backslashes (\), in the filename. With doubled backslashes you could have used
> exemple<-read.table("C:\\livre\\logiciel_R\\cours_R\\exemple.txt",sep="",header=T)
There are several commonly used variants of read.table. read.csv(file) is
for comma-separated data and is essentially equivalent to
read.table(file, header = TRUE, sep = ","). read.csv2(file) reads semicolon-separated data,
with the comma as decimal mark. read.delim(file) is for tab-delimited data and is essentially
equivalent to read.table(file, header = TRUE, sep = "\t").
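The relation between read.csv and read.table can be tried on a small file. This sketch writes a made-up comma-separated file to a temporary location (an assumption for illustration, so that no existing file is needed) and reads it back both ways.

```r
# Write a tiny comma-separated file to a temporary location
tmp <- tempfile(fileext = ".csv")
writeLines(c("V,F", "1,3", "2,4"), tmp)

a <- read.csv(tmp)                               # header = TRUE by default
b <- read.table(tmp, header = TRUE, sep = ",")   # explicit equivalent

all.equal(a, b)    # both calls give the same data frame for this file
str(a)             # two integer columns named V and F
unlink(tmp)        # remove the temporary file
```

For files containing quotes, comment characters or ragged lines the two calls can differ, since read.csv also changes a few other defaults.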
A dataframe is an object with rows and columns (a bit like a 2-dimensional matrix). The rows
contain different observations from your study, or measurements from your experiment. The
columns contain different variables. The values in the body of the dataframe can be numbers (as
they would be in a matrix), but they could also be text (e.g. the names of factor levels for
categorical variables, like “male” or “female” in a variable called “gender”), they could be
calendar dates (like 23/5/04), or they could be logical variables (like “TRUE” or “FALSE”). Here
is a dataframe with 7 variables, the left-most of which comprises the row names, and other
variables are numeric (Area, Slope, Soil pH and Worm density), categorical (Field Name and
Vegetation) or logical (Damp is either true = T or false = F).

Field.Name Area Slope Vegetation Soil.pH Damp Worm.density

Nash's.Field 3.6 11 Grassland 4.1 F 4
Silwood.Bottom 5.1 2 Arable 5. F 7
Nursery.Field 2.8 3 Grassland 4.3 F 2
Rush.Meadow 2.4 5 Meadow 4.9 T 5
Gunness'.Thicket 3.8 0 Scrub 4.2 F 6
Oak.Mead 3.1 2 Grassland 3.9 F 2
Church.Field 3.5 3 Grassland 4.2 F 3
Ashurst 2.1 0 Arable 4.8 F 4
The.Orchard 1.9 0 Orchard 5.7 F 9
Rookery.Slope 1.5 4 Grassland 5 T 7
Garden.Wood 2.9 10 Scrub 5.2 F 8
North.Gravel 3.3 1 Grassland 4.1 F 1
South.Gravel 3.7 2 Grassland 4 F 2
Observatory.Ridge 1.8 6 Grassland 3.8 F 0
Pond.Field 4.1 0 Meadow 5 T 6
Water.Meadow 3.9 0 Meadow 4.9 T 8
Cheapside 2.2 8 Scrub 4.7 T 4
Pound.Hill 4.4 2 Arable 4.5 F 5
Gravel.Pit 2.9 1 Grassland 3.5 F 1
Farm.Wood 0.8 10 Scrub 5.1 T 3

Perhaps the most important thing about analysing your own data properly is getting your
dataframe absolutely right. The expectation is that you will have used a spreadsheet like Excel to
enter and edit the data. Once you have made your dataframe in Excel and corrected all the
inevitable data-entry and spelling errors, then you need to save the dataframe in a file format that
can be read by R. Much the simplest way is to save all your dataframes from Excel as tab-
delimited text files: File / Save As / … then from the “Save as type” options choose “Text (Tab
delimited)”. There is no need to add a suffix, because Excel will automatically add “.txt” to your
file name. This file can then be read into R directly as a dataframe, using the read.table function.
It is important to note that read.table would fail if there were any spaces in any of the variable
names in row 1 of the dataframe (the header row) like Field Name, Soil pH or Worm Density.
We should replace all these spaces by dots “.” before saving the dataframe in Excel (use Edit
/Replace with " " replaced by "."). Now the dataframe can be read into R. There are 3 things to
remember:
• the whole path and file name needs to be enclosed in double quotes: "c:\\abc.txt"
• header=T says that the first row contains the variable names
• backslashes in the path must be doubled (\\), or replaced by forward slashes (/)

Think of a name for the data frame (say “worms” in this case).
worms<-read.table("c:\\temp\\worms.txt",header=T,row.names=1)
or
> worms<-read.table("http://www.bio.ic.ac.uk/research/mjcraw/therbook/data/worms.txt",header=T)
To open a data file through a file-selection dialog box, without typing its location:
> read.table(file.choose())

There are several ways to access the variables of a data frame:

By giving the name of the data frame for each variable
> plot(worms$Area, worms$Slope)
By column number
> plot(worms[,2], worms[,3])
By attaching the data frame (for reading only)
> attach(worms)
> mean(Slope)
> detach()
To see the contents of the dataframe, just type its name:
> worms
It is often useful to select certain rows, based on the logical tests on the values of one or more
variables. Here is the code to select only those rows with Area > 3 and Slope < 3 (this form requires
the dataframe to be attached with attach(worms)):
> worms[Area>3 & Slope <3,]
Without attaching, write
> worms[worms$Area>3 & worms$Slope <3,]
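Row selection can also be written without attach, which avoids leaving variables on the search path. The data frame w below is a small made-up table in the spirit of worms (its values are an assumption for illustration), and subset and with are base R alternatives not used elsewhere in these notes.

```r
# A few illustrative rows, not the real worms data
w <- data.frame(Area  = c(3.6, 5.1, 2.8, 4.1),
                Slope = c(11, 2, 3, 0),
                Damp  = c(FALSE, FALSE, FALSE, TRUE))

sel1 <- w[w$Area > 3 & w$Slope < 3, ]       # explicit indexing with $
sel2 <- subset(w, Area > 3 & Slope < 3)     # columns are looked up inside w
sel1
with(w, mean(Slope[Area > 3]))              # evaluate an expression inside w
```

Both selections keep rows 2 and 4 here; with evaluates a single expression using the columns of w and changes nothing globally.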
Another example
> fr <- data.frame(age = c(15,20,16), nom = c("pierre",
+ "jeanne","karim"),sexe=c("Masculin","Féminin","Masculin"))
> fr
age nom sexe
1 15 pierre Masculin
2 20 jeanne Féminin
3 16 karim Masculin
Row (or column) names can be set with rownames() (or colnames()):
> rownames(fr) <- c("I1", "I2", "I3")
A column can be extracted by name or by position: fr[, 1], fr[, "age"] and fr$age all return the vector
[1] 15 20 16
Writing fr["age"] instead returns a data frame with a single variable. To select the part of the
data frame made of the variables nom and sexe, write fr[, c(2,3)],
fr[, c("nom","sexe")] or fr[c("nom","sexe")].
To add a variable (column) note to the data frame:
> fr["note"] <- c(14,2,10)
or
> fr$note <- c(14,2,10)
> fr
age nom sexe note
I1 15 pierre Masculin 14
I2 20 jeanne Féminin 2
I3 16 karim Masculin 10
The columns are not visible by name outside the data frame:
> mean(age)
Error in mean(age) : object 'age' not found
After attaching the data frame,
> attach(fr)
the mean can be computed without the $ syntax:
> mean(age)
[1] 17
To extract the rows for male individuals, write
fr[fr$sexe=="Masculin",] or fr[fr[,3]=="Masculin",] or fr[fr[,"sexe"]=="Masculin",]
The result is
age nom sexe note
I1 15 pierre Masculin 14
I3 16 karim Masculin 10
head(fr) : returns the first 6 rows of a data frame.
head(fr, 10) : returns the first 10 rows of a data frame.
tail(fr, 10) : returns the last 10 rows of a data frame.

Another example
The table iris.f is a data frame containing data on 3 species of iris.
It holds 150 flowers, each described by 5 variables (dimensions of the data frame:
150, 5).
> iris.f[c(1:5),]
Sepale.long Sepale.larg Petale.long Petale.larg Espece
1 5.1 3.5 1.4 0.2 setosa
2 4.9 3.0 1.4 0.2 setosa
3 4.7 3.2 1.3 0.2 setosa
4 4.6 3.1 1.5 0.2 setosa
5 5.0 3.6 1.4 0.2 setosa
> dim(iris.f)
[1] 150 5
> iris.f$Sepale.long[1:5]
[1] 5.1 4.9 4.7 4.6 5.0
> iris.f[[2]][1:5]
[1] 3.5 3.0 3.2 3.1 3.6

12) Sorting, Ranking and Ordering

These three related concepts are important, and one of them (order) is difficult to understand
on first acquaintance. Let’s take a simple example:
houses<-read.table("http://www.bio.ic.ac.uk/research/mjcraw/therbook/data/houses.txt",header=T)
attach(houses)
names(houses)
[1] "Location" "Price"
> houses
Location Price
1 Ascot 325
2 Sunninghill 201
3 Bracknell 157
4 Camberley 162
5 Bagshot 164
6 Staines 101
7 Windsor 211
8 Maidenhead 188
9 Reading 95
10 Winkfield 117
11 Warfield 188
12 Newbury 121
Now we apply the three different functions to the vector called Price,
ranks<-rank(Price)
sorted<-sort(Price)
ordered<-order(Price)
and make a dataframe out of the four vectors like this:
view<-data.frame(Price,ranks,sorted,ordered)
view
Price ranks sorted ordered
1 325 12.0 95 9
2 201 10.0 101 6
3 157 5.0 117 10
4 162 6.0 121 12
5 164 7.0 157 3
6 101 2.0 162 4
7 211 11.0 164 5
8 188 8.5 188 8
9 95 1.0 188 11
10 117 3.0 201 2
11 188 8.5 211 7
12 121 4.0 325 1
Rank
The prices themselves are in no particular sequence. The ranks column contains the value
that is the rank of the particular data point (value of Price), where 1 is assigned to the
lowest data point and length(Price) – here 12 – is assigned to the highest data point. So the
first element, Price=325, is the highest value in Price. You should check that there are 11
values smaller than 325 in the vector called Price. Fractional ranks indicate ties. There are
two 188s in Price and they occupy ranks 8 and 9. Because they are tied, each gets the average
of their two ranks, (8+9)/2 = 8.5.
Sort
The sorted vector is very straightforward. It contains the values of Price sorted into ascending
order. If you want to sort into descending order, use the reverse order function rev like
this: y<-rev(sort(x)). Note that sort is potentially very dangerous, because it uncouples
values that might need to be in the same row of the dataframe (e.g. because they are the
explanatory variables associated with a particular value of the response variable). It is bad
practice, therefore, to write x<-sort(x), not least because there is no ‘unsort’ function.
Order
This is the most important of the three functions, and much the hardest to understand on
first acquaintance. The order function returns an integer vector containing the permutation
that will sort the input into ascending order. You will need to think about this one. The
lowest value of Price is 95. Look at the dataframe and ask yourself what is the subscript in
the original vector called Price where 95 occurred. Scanning down the column, you find it
in row number 9. This is the first value in ordered, ordered[1]. Where is the next smallest
value (101) to be found within Price? It is in position 6, so this is ordered[2]. The third
smallest Price (117) is in position 10, so this is ordered[3]. And so on.
This function is particularly useful in sorting dataframes. Using
order with subscripts is a much safer option than using sort, because with sort the values
of the response variable and the explanatory variables could be uncoupled with potentially
disastrous results if this is not realized at the time that modelling was carried out. The
beauty of order is that we can use order(Price) as a subscript for Location to obtain the
price-ranked list of locations:
Location[order(Price)]
[1] Reading Staines Winkfield Newbury
[5] Bracknell Camberley Bagshot Maidenhead
[9] Warfield Sunninghill Windsor Ascot
> houses[order(Price),]
Location Price
9 Reading 95
6 Staines 101
10 Winkfield 117
12 Newbury 121
3 Bracknell 157
4 Camberley 162
5 Bagshot 164
8 Maidenhead 188
11 Warfield 188
2 Sunninghill 201
7 Windsor 211
1 Ascot 325
When you see it used like this, you can see exactly why the function is called order. If you
want to reverse the order, just use the rev function like this:
Location[rev(order(Price))]
[1] Ascot Windsor Sunninghill Warfield
[5] Maidenhead Bagshot Camberley Bracknell
[9] Newbury Winkfield Staines Reading
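The relations between rank, sort and order can be verified on the Price values themselves. The sketch below retypes the twelve prices from the houses table so that it runs without the data file.

```r
# Prices copied from the houses table above
Price <- c(325, 201, 157, 162, 164, 101, 211, 188, 95, 117, 188, 121)

rank(Price)[c(8, 11)]             # 8.5 8.5: the tied 188s share the mean rank
sort(Price, decreasing = TRUE)    # same result as rev(sort(Price))
order(Price)[1:3]                 # 9 6 10: positions of the three lowest prices
Price[order(Price)]               # subscripting by order() reproduces sort()
```

Because order returns positions rather than values, the same subscript can be applied to every column of a dataframe, which keeps the rows together.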
Sorting by several criteria is done simply by passing several arguments to
order:
> aa<-data.frame(sexe=c("M","F","M","M","F","M"),age=c(22,54,44,15,41,40))
> aa[order(aa$sexe,aa$age),]
sexe age
5 F 41
2 F 54
4 M 15
1 M 22
6 M 40
3 M 44
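A common variation is to sort one key in ascending order and another in descending order. For a numeric key this can be sketched by negating it inside order; the data frame aa is rebuilt here so that the block runs on its own.

```r
aa <- data.frame(sexe = c("M", "F", "M", "M", "F", "M"),
                 age  = c(22, 54, 44, 15, 41, 40))

# Sort by sexe (ascending), then by age from oldest to youngest
res <- aa[order(aa$sexe, -aa$age), ]
res
```

The rows come out in the order 2, 5, 3, 6, 1, 4. Negation only works for numeric keys; for character keys one option is to wrap the call in rev or to use the decreasing argument of order.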

13) tapply, apply, sapply

One of the most important and useful vector functions to master is tapply. The ‘t’ stands
for ‘table’ and the idea is to apply a function to produce a table from the values in the
vector, based on one or more grouping variables (often the grouping is by factor levels).
This sounds much more complicated than it really is:
> data<-read.table("http://www.bio.ic.ac.uk/research/mjcraw/therbook/data/Daphnia.txt",header=T)
or
> data<-read.table("c:\\temp\\daphnia.txt",header=T)
read.table reads a file in table format and turns it into a data frame.
> class(data)
[1] "data.frame"
> attach(data)
> names(data)
[1] "Growth.rate" "Water" "Detergent" "Daphnia"
Once attach has been used, we can write
> Growth.rate instead of > data$Growth.rate
> head(data)
Growth.rate Water Detergent Daphnia
1 2.919086 Tyne BrandA Clone1
2 2.492904 Tyne BrandA Clone1
3 3.021804 Tyne BrandA Clone1
4 2.350874 Tyne BrandA Clone2
5 3.148174 Tyne BrandA Clone2
6 4.423853 Tyne BrandA Clone2
The response variable is Growth.rate and the other three variables are factors. Suppose we want
the mean growth rate for each detergent:
> tapply(Growth.rate,Detergent,mean)
BrandA BrandB BrandC BrandD
3.88 4.01 3.95 3.56
This produces a table with four entries, one for each level of the factor called Detergent.
To produce a two-dimensional table we put the two grouping variables in a list. Here we
calculate the median growth rate for water type and daphnia clone:
tapply(Growth.rate,list(Water,Daphnia),median)
Clone1 Clone2 Clone3
Tyne 2.87 3.91 4.62
Wear 2.59 5.53 4.30
The first variable in the list creates the rows of the table and the second the columns.
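The same two calls can be tried without downloading the Daphnia file. The vectors below are made-up values chosen so that the results are easy to check by hand; they are not the real data.

```r
# Made-up growth rates for two water types and two clones
growth <- c(2, 4, 3, 5, 6, 8)
water  <- factor(c("Tyne", "Tyne", "Tyne", "Wear", "Wear", "Wear"))
clone  <- factor(c("Clone1", "Clone2", "Clone1", "Clone1", "Clone2", "Clone2"))

m1 <- tapply(growth, water, mean)                  # one mean per water type
m1
m2 <- tapply(growth, list(water, clone), median)   # rows = water, cols = clone
m2
```

Here the Tyne mean is (2+4+3)/3 = 3, and the median for Wear and Clone2 is the median of 6 and 8, that is 7.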
apply (X, MARGIN, FUN)
> apply (mat, 1, sum) : returns a vector containing the row sums
> apply (mat, 1, mean) : returns a vector containing the row means
> apply (mat, c(1,2), function (x) {ifelse(x != 0, 1, 0)}) : returns a matrix of the same
dimension whose elements are 1 where the value is non-zero and 0 where it is zero.
MARGIN = 1 : apply over the rows
MARGIN = 2 : apply over the columns
MARGIN = c(1,2) : apply over rows and columns, i.e. element by element
FUN : the function to apply
Two objects can be compared with identical:
> identical (a,b)
[1] TRUE (if a and b are exactly equal) or FALSE (otherwise)
sapply
Use sapply to map a function to each column of a data frame. For example, with the built-in iris
data set:
> sapply(iris, class) # Apply class to the columns of iris
> sapply(iris[1:4], mean) # Apply mean to columns 1:4
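The two sapply calls can be compared with the related base functions lapply and colMeans, which are not used elsewhere in these notes; this is only a sketch of how the results relate.

```r
classes <- sapply(iris, class)       # named character vector, one per column
classes

means <- sapply(iris[1:4], mean)     # named numeric vector of column means
means

is.list(lapply(iris[1:4], mean))     # lapply returns a list, not a vector
all.equal(means, colMeans(iris[1:4]))   # same values as colMeans
```

sapply simplifies its result to a vector when every call returns a single value, whereas lapply always returns a list.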

14) Output to a file and input from a file

R provides a number of commands for writing output to a file. We will generally use write for
writing numeric values and cat for writing text, or a combination of numeric and character values.
> (x <- matrix(1:24, nrow = 4, ncol = 6))
[,1] [,2] [,3] [,4] [,5] [,6]
[1,] 1 5 9 13 17 21
[2,] 2 6 10 14 18 22
[3,] 3 7 11 15 19 23
[4,] 4 8 12 16 20 24
> write(t(x), file = "out.txt", ncolumns = 6)
Here is what the file out.txt looks like:
1 5 9 13 17 21
2 6 10 14 18 22
3 7 11 15 19 23
4 8 12 16 20 24
> cat("bonjour monsieur", file = "bonjour.txt", sep = " ", append = FALSE)
Here is what the file bonjour.txt looks like:
bonjour monsieur
Note that cat does not automatically write a newline after the expressions. If you want a newline
you must explicitly include the string \n.
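A short sketch of the newline rule: the block writes two lines to a temporary file (used here, as an assumption, so that nothing is left in the working directory) and reads them back.

```r
out <- tempfile(fileext = ".txt")    # temporary file, removed at the end

cat("bonjour monsieur\n", file = out)                # "\n" ends the first line
cat("x =", 3.14, "\n", file = out, append = TRUE)    # text mixed with a number

lines <- readLines(out)
lines                                # two lines were written
unlink(out)
```

Without append = TRUE the second call would overwrite the file, and without "\n" both pieces of text would end up on the same line.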
R provides a number of ways to read data from a file, the most flexible of
which is the scan function. We use scan to read a vector of values from a file.
For this example the file ba.txt was created beforehand using a text editor,
and is stored in working directory ../coursR.
321 543
432 543
> data<-scan(file="ba.txt")
> data
[1] 321 543 432 543
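R objects can also be stored in R's own binary format with save and restored with load. First, we create two vectors
> name <- c('Ira A', 'David A', 'Todd A')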
> height <- c(5 + 4 / 12, 6 + 11 / 12, 5 + 11 / 12)
Next, we combine them into a data frame
> faculty <- data.frame(name, height)
We save the data frame to a file
> save(faculty, file = 'faculty.rda')
> load('faculty.rda')
> faculty
name height
1 Ira A 5.333333
2 David A 6.916667
3 Todd A 5.916667
It is thus possible (and strongly recommended) to create several files with the extension .rda: one
for each project being worked on. These .rda files should then be created in
separate, appropriately named folders. For example, suppose we are working on two different
statistical projects, one about cars and the other about climate:
we could create a folder named Automobile containing a file auto.rda and another
folder named Climat containing a file named climat.rda (each holding the R objects
for the corresponding study).
The function save() writes objects from the workspace to a file, and the
function load() restores them from an existing file.
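The round trip through save and load can be sketched as follows. The file is written to a temporary location (an assumption for illustration) and the object is removed before loading, to show that load really restores it.

```r
faculty <- data.frame(name = c("Ira A", "David A", "Todd A"),
                      height = c(5 + 4 / 12, 6 + 11 / 12, 5 + 11 / 12))

f <- tempfile(fileext = ".rda")
save(faculty, file = f)        # write the object to disk

original <- faculty
rm(faculty)                    # the object is gone from the workspace
restored <- load(f)            # load() returns the names of restored objects
restored
all.equal(faculty, original)   # the data frame is back, unchanged
unlink(f)
```

Note that load recreates objects under their saved names, so it can silently overwrite an existing object with the same name.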

15) The data editor

When invoked on a data frame or matrix, edit brings up a separate spreadsheet-like environment
for editing. This is useful for making small changes once a data set has been read. The command
> xnew <- edit(xold)
will allow you to edit your data set xold, and on completion the changed object is assigned
to xnew. If you want to alter the original dataset xold, the simplest way is to use fix(xold),
which is equivalent to xold <- edit(xold).
Use
> xnew <- edit(data.frame())
to enter new data via the spreadsheet interface.

16) Dates
as.Date converts a character string into a "Date" object.
x<-"2007-10-17"
d<-as.Date(x, "%Y-%m-%d")
> x;d
[1] "2007-10-17"
[1] "2007-10-17"
> str(x)
chr "2007-10-17"
> str(d)
Date[1:1], format: "2007-10-17"
The default format has year, then month, then day of month
> dd <- as.Date(c("2003-08-24","2003-11-23","2004-02-22","2004-05-03"))
> diff(dd)
Time differences in days
[1] 91 91 71
> as.Date("1/1/1960", format="%d/%m/%Y")
[1] "1960-01-01"
> as.Date("1:12:1960",format="%d:%m:%Y")
[1] "1960-12-01"
Summary of the format codes:
%d day of the month (01–31)
%m month (01–12)
%Y year (4 digits)
%y year (2 digits), to be avoided!
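The same format codes work in the other direction, with format, to turn a date back into text. This is a small sketch using the date from the example above; the arithmetic on dates is in days.

```r
d <- as.Date("17/10/2007", format = "%d/%m/%Y")

format(d, "%Y-%m-%d")    # "2007-10-17": the form R prints by default
format(d, "%d/%m/%y")    # "17/10/07": a two-digit year is ambiguous
d + 30                   # adding a number shifts the date by that many days

# Number of days between two dates
as.numeric(as.Date("2004-05-03") - as.Date("2004-02-22"))   # 71
```

Subtracting two dates gives a difftime object; as.numeric extracts the number of days.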

Today's date and the current time

> Sys.Date()
[1] "2012-02-05"
> Sys.time()
[1] "2012-02-05 14:57:55 CET"

17) Exercises
Exercise 1
a) Write an R expression that creates the following list:
[[1]]
[1] 1 2 3 4 5
$data
[,1] [,2] [,3]
[1,] 1 3 5
[2,] 2 4 6
[[3]]
[1] 0 0 0
$test
[1] FALSE FALSE FALSE FALSE
b) Extract the labels of the list.
c) Find the mode and the length of the fourth element of the list.
d) Extract the dimensions of the second element of the list.
e) Extract the second and third elements of the second element of the
list.
f) Replace the third element of the list by the vector 3:8.
Solution
Let x be the name of the list.
a) > x<- list(1:5, data = matrix(1:6, 2, 3), numeric(3),
+ test = logical(4))
or
> x<-list(1:5,data=matrix(1:6,nrow=2,ncol=3),rep(0,3),test=rep(FALSE,4))
b) > names(x)
c) > mode(x$test) or > mode(x[[4]])
> length(x$test)
d) > dim(x$data)
e) > x[[2]][c(2, 3)] or x$data[c(2, 3)]
f) > x[[3]] <- 3:8
Exercise 2
Let obs be a vector containing the following values:
> obs
[1] 3 9 2 2 1 1 7 13 9 14 4 16 6 7 4 3
[17] 9 8 3 12
Write an R expression that extracts the following elements.
a) The second element of the sample.
b) The first five elements of the sample.
c) The elements strictly greater than 14.
d) All elements except those in positions 6, 10 and 12.
Solution
a) > obs[2]
b) > obs[1:5]
c) > obs[obs > 14]
d) > obs[-c(6, 10, 12)]
Exercise 3
Let mat be a 7×10 matrix generated at random with
> (mat <- matrix(sample(1:100, 70), 7, 10))
Write an R expression that returns each of the elements requested below.
a) Element (4,3) of the matrix.
b) The contents of the sixth row of the matrix.
c) The first and fourth columns of the matrix (simultaneously).
d) The rows of the matrix whose first element is greater than 50.
Solution
> mat
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
[1,] 4 81 89 76 77 35 41 3 96 26
[2,] 2 60 11 93 64 68 75 17 9 73
[3,] 98 90 28 46 24 69 1 84 61 8
[4,] 22 6 13 29 78 47 19 30 38 85
[5,] 72 95 52 94 79 82 48 10 57 18
[6,] 44 40 39 21 83 43 14 33 91 45
[7,] 12 86 23 49 67 65 5 97 55 34
a) > mat[4, 3]
b) > mat[6, ]
c) > mat[, c(1, 4)]
d) > which(mat[, 1] > 50)
[1] 3 5
> mat[c(3,5),]
Alternatively
> mat[mat[, 1] > 50, ]
Exercise 4
Using only the functions rep, seq and c, generate the following sequences.
a) 0 6 0 6 0 6
b) 1 4 7 10
c) 1 2 3 1 2 3 1 2 3 1 2 3
d) 1 2 2 3 3 3
e) 1 1 1 2 2 3
f) 1 5.5 10
g) 1 1 1 1 2 2 2 2 3 3 3 3
Solution
a) > rep(c(0, 6), 3)
b) > seq(1, 10, by = 3)
c) > rep(1:3, 4)
d) > rep(1:3, 1:3)
e) > rep(1:3, 3:1)
f) > seq(1, 10, length = 3)
g) > rep(1:3, each = 4)
Exercise 5
Generate the following sequences of numbers using only the functions : and rep,
that is, without using the function seq.
a) 1.1 1.2 1.3 1.4 1.5 1.6 1.7 1.8 1.9 2
b) 1 3 5 7 9 11 13 15 17 19
c) -2 -1 0 1 2 -2 -1 0 1 2
d) -2 -2 -1 -1 0 0 1 1 2 2
e) 10 20 30 40 50 60 70 80 90 100
Solution
a) > 11:20/10
b) > 2 * 0:9 + 1
c) > rep(-2:2, 2)
d) > rep(-2:2, each = 2)
e) > 10 * 1:10
Exercise 6
Using the function apply, write R expressions that could replace
the following functions.
a) rowSums
b) colSums
c) rowMeans
d) colMeans
Solution
Let mat be a matrix.
a) > apply(mat, 1, sum)
b) > apply(mat, 2, sum)
c) > apply(mat, 1, mean)
d) > apply(mat, 2, mean)
Exercise 7
Without using the function factorial, generate the sequence 1!, 2!, ..., 10!
Solution
> cumprod(1:10)
Exercise 8
Simulate a sample (x1, x2, x3, ..., x20) with the function sample.
Write an R expression that returns or computes each of the
results requested below.
a) The first five elements of the sample.
b) The maximum value of the sample.
c) The mean of the first five elements of the sample.
d) The mean of the last five elements of the sample.
Solution
> x<-sample(1:100, 20)
a) > x[1:5]
> head(x, 5)
b) > max(x)
c) > mean(x[1:5])
> mean(head(x, 5))
d) > mean(x[16:20])
> mean(x[(length(x) - 4):length(x)])
> mean(tail(x, 5))
> mean(rev(x)[1:5])
Exercise 9
Simulate a 7×10 matrix mat, then write R expressions that
carry out the tasks requested below.
a) Compute the sum of the elements of each row of the matrix.
b) Compute the mean of the elements of each column of the
matrix.
c) Compute the maximum value of the submatrix formed by the first three
rows and the first three columns.
d) Extract all rows of the matrix whose mean is
greater than 7.
Solution
> mat<-matrix(rnorm(70),7,10)
a) > rowSums(mat)
b) > colMeans(mat)
c) > max(mat[1:3, 1:3])
d) > mat[rowMeans(mat) > 7, ]

Exercise 10
Write a function that computes the means and variances of two vectors.
Solution
desc<-function (x,y){
moyenne<-numeric(2)
variance<-numeric(2) # named variance so that the function var is not masked
moyenne[1]<-mean(x)
moyenne[2]<-mean(y)
variance[1]<-var(x)
variance[2]<-var(y)
cat("Moyennes",moyenne,"\n")
cat("Variances",variance,"\n")
}
> desc(rnorm(32),rnorm(43))
Moyennes 0.1235007 0.08347782
Variances 1.045373 1.250843
Exercise 11
Write a function that computes the means and variances of an arbitrary
number of vectors.
Solution
many.means<-function (...){
data <- list(...)
n<-length(data)
means<-numeric(n)
vars<-numeric(n)
for (i in 1:n){
means[i]<-mean(data[[i]])
vars[i]<-var(data[[i]])
}
cat("Moyennes",round(means,3),"\n")
cat("Variances",round(vars,3),"\n")
}
Example:
> many.means(rnorm(100,4,2),rnorm(43))
Moyennes 3.982 -0.012
Variances 4.408 0.944
Exercise 12
Write a function that takes a set of values as its argument and returns a
list containing the number of values, the mean and the standard deviation.
Solution
desc <- function (x){
ans <- list ()
ans$taille <- length (x)
ans$moyenne <- mean (x)
ans$ecarttype <-sd(x)
return (ans)
}
Example
> desc(rnorm(32))
$taille
[1] 32
$moyenne
[1] -0.04655557
$ecarttype
[1] 0.9266683
Exercise 13
Write a function centrer() that centres the variables of the data frame data from section 13
(in other words, that subtracts from each element of a column the mean of that column).
This can be done in two ways:
– by computing the column means with mean;
– by using the function scale().
(Remark: check that the variables to be centred are numeric before
applying the transformation.)
Solution
1) centre<-function(DR)
{
d<-dim(DR)[2]
for (i in 1:d)
# only numeric columns are centred; the others are left unchanged
if (is.numeric(DR[,i]))
DR[,i]<-DR[,i]-mean(DR[,i])
return(DR)}
2) cen<-function(DR)
{
d<-dim(DR)[2]
for (i in 1:d)
if (is.numeric(DR[,i]))
# scale() returns a one-column matrix, converted back to a vector
DR[,i]<-as.vector(scale(DR[,i],scale=FALSE))
return(DR)}
Exercise 14
Compute the binomial coefficient "n choose p", which appears in the probability function of the
binomial distribution.
Solution
> binome <- function(n,p) factorial(n)/(factorial(p)*
+ factorial (n-p))
Alternatively: choose(n,p)
Exercice 15
Describe how to insert a value between two elements of a vector at a
given position by using the append function (use the help system to find
out). Without append, how would you do it?
Answer
1)
> x<-1:5
> x<-append(x,9,after=3)
> x
[1] 1 2 3 9 4 5
2)
> x<-1:5
> x<-c(x[1:3],9,x[4:length(x)])
> x
[1] 1 2 3 9 4 5
Exercise 16
1) Write the instruction that simulates a sample of size 100 drawn from the
Poisson distribution with parameter 2.2.
2) Write the instruction that returns the table of counts.
3) Write the instruction that returns the number of 0s and the number of 3s at the same time.
Solution
1) > x<-rpois(100,2.2)
2) > table(x)
x
0 1 2 3 4 5 8
14 22 23 20 15 5 1
3) > table(x)[c(1,4)]
x
0 3
14 20
Alternatively
> table(x)[c("0","3")]
x
0 3
14 20
Exercise 17
Write the instructions that compute the variance of x.
> x = c(2,3,5,7,11)

Solution
> x = c(2,3,5,7,11)
> xbar = mean(x)
> x-xbar # the difference
[1] -3.6 -2.6 -0.6 1.4 5.4
> (x-xbar)^2 # the squared difference
[1] 12.96 6.76 0.36 1.96 29.16
> sum((x-xbar)^2) # sum of squared differences
[1] 51.2
> n = length(x)
> n
[1] 5
> sum((x-xbar)^2)/(n-1) # same value as var(x)
[1] 12.8
Exercise 18

Generate a factor of 20 elements whose values are chosen at random among "oui", "non"
and "peut-être". Try the functions table() and levels() on this factor.
Solution
> x <- factor(sample(c("oui","non","peut-être"), 20, replace=T))
> table(x) # the three counts vary from run to run and sum to 20
> levels(x)
[1] "non" "oui" "peut-être"
Exercise 19
1. Generate a random 10×5 matrix M (with real values between 0 and 1)
(use the function runif).
2. Count the elements greater than 0.9.
3. Replace the elements of M smaller than 0.5 by 0.
4. Check its type and the nature of its elements.
5. Create a data frame from M. Check the result.
6. Extract the vector corresponding to the third column of M.
7. Extract the elements of the second row of M.
Solution
1. > M <- matrix(runif(50), nrow=10)
2. > length(M[M > 0.9]) or sum(M > 0.9)
3. > M[M < 0.5] <- 0
4. > mode(M); typeof(M)
> class(M)
5. > MDF <- as.data.frame(M) or MDF <- data.frame(M)
6. > M[, 3]
7. > M[2, ]
Exercise 20
Generate a data frame such that:
- the number of individuals (rows) is 5;
- the variables (columns) are named "Sexe", "Âge", then "Note 1", "Note 2";
- the sex of each individual is chosen at random between "Masculin" and "Féminin";
- the marks are generated at random between 0 and 20;
- the age of each individual is generated at random between 18 and 24.
1) Extract the subset of the data corresponding to the variables "Note 1", "Note 2".
2) Extract the subset of the data corresponding to the girls.
Solution
> sexe <- sample(c("M","F"), 5, replace=T)
> age <- sample(18:24, 5, replace=T)
> note1 <- sample(0:20, 5, replace=T)
> note2 <- sample(0:20, 5, replace=T)
> DF <- data.frame(Age=age, Sexe=sexe, note1=note1, note2=note2)
1) > DF[,c("note1","note2")] or DF[,c(3,4)]
2) > DF[DF$Sexe=="F",] or DF[DF[,2]=="F",]
Exercise 21
What do these instructions return?
> x<-as.factor(c("apple", "apple", "orange", "apple", "orange"))
> as.numeric(x)
> levels(x)
> x
Exercise 22
Write a function that computes the median of a vector x with n distinct components.
Recall that if n is even the median is the mean of the two middle points, and that if n is odd the
median is the middle point.
Hint: first sort the n values in increasing order.
Solution
f<-function(x){
n<-length(x)
x<-sort(x)
if ( n%%2==0) X<-(x[n/2]+x[n/2+1])/2
else X<-x[(n+1)/2]
return(X)
}
Exercise 23
Consider the function $y = f(x)$ defined by:
$$f(x) = \begin{cases} -x^3 & \text{if } x \le 0 \\ x^2 & \text{if } 0 < x \le 1 \\ \sqrt{x} & \text{if } x > 1 \end{cases}$$
Write an R function that returns the value of y for any value of x.
Solution
f <- function(x) ifelse(x <= 0, -x^3, ifelse(x > 1, sqrt(x), x^2))
or, for a single value x:
f <- function(x) {
  if (x <= 0) X <- -x^3
  else if (x > 1) X <- sqrt(x)
  else X <- x^2
  return(X)
}
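The two solutions differ in how they treat vector input: ifelse() is vectorized, whereas if () tests a single condition. The sketch below contrasts the two styles on a deliberately simple, hypothetical two-branch function (g_vec and g_scalar are names invented for this illustration, not the function of the exercise).

```r
# Hypothetical two-branch function, used only to contrast the two styles
g_vec <- function(x) ifelse(x < 0, -x, x)       # vectorized: one result per element
g_scalar <- function(x) if (x < 0) -x else x    # expects a single value

v <- c(-2, 0.5, 3)
res_vec <- g_vec(v)                  # works directly on the whole vector
res_scalar <- sapply(v, g_scalar)    # scalar version applied element by element

all.equal(res_vec, res_scalar)       # TRUE
```

Calling g_scalar(v) directly is an error in recent versions of R, because the condition of if () must have length one. Note also that ifelse() evaluates each branch on the whole vector, so a branch such as sqrt(x) can emit a NaN warning for negative elements even though those values are discarded.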
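Exercise 24
Write a function that computes the product of two arbitrary matrices.
Solution
f <- function(M, N) {
  # the product is defined only if ncol(M) equals nrow(N)
  m <- dim(M)[2]; n <- dim(N)[1]
  if (m != n) stop("non-conformable matrices: ncol(M) must equal nrow(N)")
  p <- dim(M)[1]; q <- dim(N)[2]
  mat <- matrix(NA, p, q)
  for (i in 1:p) {
    for (j in 1:q) {
      # entry (i, j) is the dot product of row i of M and column j of N
      mat[i, j] <- M[i, ] %*% N[, j]
    }
  }
  return(mat)
}
M <- matrix(1, 2, 3)
N <- matrix(2, 3, 2)
f(M, N)    # hand-written product
M %*% N    # built-in product, for comparison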
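As a further check, a hand-written product can be compared with %*% on random rectangular matrices, and the dimension test can be exercised with a non-conformable pair. The sketch below is a self-contained variant of the same double loop; the name matprod, the seed and the test matrices A and B are assumptions made for this illustration, and sum() over the elementwise product replaces the 1 x 1 result of %*%.

```r
# Hypothetical variant of the double-loop product
matprod <- function(M, N) {
  if (ncol(M) != nrow(N)) stop("non-conformable matrices")
  out <- matrix(NA_real_, nrow(M), ncol(N))
  for (i in seq_len(nrow(M))) {
    for (j in seq_len(ncol(N))) {
      out[i, j] <- sum(M[i, ] * N[, j])   # dot product of row i and column j
    }
  }
  out
}

set.seed(7)                       # arbitrary seed, only for reproducibility
A <- matrix(runif(6), 2, 3)       # 2 x 3
B <- matrix(runif(12), 3, 4)      # 3 x 4

all.equal(matprod(A, B), A %*% B)     # TRUE

# A is 2 x 3, so A times A is undefined; capture the error message
err <- tryCatch(matprod(A, A), error = function(e) conditionMessage(e))
err
```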
