Functions and Flow Control

The document provides an overview of user-defined functions in R, including their syntax, handling multiple outputs, default arguments, and data types. It also covers control structures such as loops, if-else statements, and switch statements, along with applications in curve fitting, solving equations, calculus, and optimization. Additionally, it emphasizes the importance of argument validation and provides examples for practical implementation.

Uploaded by

Soumyadeep Majumdar

Flow Control and Functions

Presidency University

November, 2024
User Defined Functions

The basic format of the code is

function_name <- function (arguments)

{
main computation to be done
}

testfunction <- function(x,y) #----define a function

{
x*y
}

testfunction(2,5) #-----call the function with the arguments 2 and 5

[1] 10
Doing more than one computation

- When a function computes more than one value, return() can be used to
return all of them together as a vector:

testfunction <-function(x,y)
{
prod=x*y
su= x+y
return(c(prod,su))
}

testfunction(2,5)

[1] 10 7

- Note that the two outputs can be accessed separately:

result=testfunction(2,5)
result[1]

[1] 10

result[2]

[1] 7

- Alternatively, multiple outputs can be returned as a list using list(). This
lets us extract them by name (as well as by index).
testfunction <- function(x,y)
{
prod=x*y
su= x+y
output=list(prod,su) #--- Creates the list
names(output)=c("Product", "Sum") #--- name them (optional)
return(output) #---- returns the list
}

result=testfunction(2,5)
result

$Product
[1] 10

$Sum
[1] 7

result$Product #---- result[[1]] gives the same output

[1] 10

result$Sum #---- result[[2]] gives the same output

[1] 7
Default argument of a function
- R lets us specify default values for arguments when defining a function.
These defaults are used when the function is called, unless the caller
supplies different values for those arguments.
testfunction <- function(x=1,y=1)
{
prod=x*y
su= x+y
output=list(prod,su) #--- Creates the list
names(output)=c("Product", "Sum") #--- name them (optional)
return(output) #---- returns the list
}
testfunction() #--call with no argument

$Product
[1] 1

$Sum
[1] 2

testfunction(x=4)

$Product
[1] 4

$Sum
[1] 5
Additional Arguments
- Additional arguments (typically optional ones that cannot be decided
beforehand) can be allowed for by adding ... to the argument list.

testfunction <- function(x=1,y=1,...)

{
prod=x*y
su= x+y
output=list(prod,su) #--- Creates the list
names(output)=c("Product", "Sum") #--- name them (optional)
return(output) #---- returns the list
}

#--z is an extra argument; it is absorbed by ... and is not used here

testfunction(2,5,z=12)

$Product
[1] 10

$Sum
[1] 7
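
In the call above z is simply ignored. A more typical use of ... is to pass the extra arguments on to another function. The following is a minimal sketch of that idea; the helper name scaled_mean is hypothetical and not part of the original slides.

```r
# Hypothetical helper: rescale x, then pass any extra arguments on to mean()
scaled_mean <- function(x, scale = 1, ...) {
  mean(x * scale, ...)   # '...' may carry e.g. na.rm or trim
}

v <- c(1, 2, NA, 4)
no_extra <- scaled_mean(v)                            # NA: mean() sees the missing value
res <- scaled_mean(v, scale = 2, na.rm = TRUE)        # na.rm is forwarded to mean()
print(no_extra)
print(res)
```

Here na.rm is not an argument of scaled_mean() itself; it travels through ... and is interpreted by mean().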
Data types of arguments
- Since the types of the arguments are not declared when the function is
defined, the arguments can be of any data type, provided the code inside
the function works with that data type.

testfunction <- function(x=1,y=1,...)

{
prod=x*y
su= x+y
output=list(prod,su) #--- Creates the list
names(output)=c("Product", "Sum") #--- name them (optional)
return(output) #---- returns the list
}

testfunction(c(1,2),c(3,4)) #--calling with vectors

$Product
[1] 3 8

$Sum
[1] 4 6

testfunction("F","M") #---calling with characters

Error in x * y: non-numeric argument to binary operator

Sanity checking argument

- So how can we stop a function if the user calls it with unsuitable
arguments? A good practice is to write the function so that it checks
whether the supplied arguments make sense before running the main body.

testfunction= function(x,y)
{
#---check if the arguments are not characters
stopifnot( typeof(x)!="character", typeof(y)!="character" )
prod=x*y
su= x+y
output=list(prod,su) #--- Creates the list
names(output)=c("Product", "Sum") #--- name them (optional)
return(output)
}
testfunction("F","M")

Error in testfunction("F", "M"): typeof(x) != "character" is not TRUE

- The stopifnot() function halts execution of the function (with an
error message) unless all of its arguments evaluate to TRUE.
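
The check above only rules out character input. A slightly stricter variant, sketched below, asks directly for numeric input and also checks the lengths. The function name safe_product_sum and the length rule are illustrative assumptions, not part of the original slides.

```r
# Hypothetical variant of testfunction with explicit argument checks
safe_product_sum <- function(x, y) {
  stopifnot(is.numeric(x), is.numeric(y))      # reject characters, lists, ...
  if (length(x) != length(y)) {
    stop("x and y must have the same length")  # custom error message
  }
  list(Product = x * y, Sum = x + y)
}

out <- safe_product_sum(2, 5)

# tryCatch() captures the error message without stopping the script
msg <- tryCatch(safe_product_sum("F", "M"),
                error = function(e) conditionMessage(e))
print(out)
print(msg)
```

stopifnot() is convenient for quick checks, while stop() lets us write an error message of our own.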
Applications: Curve fitting
- Any function can be plotted using curve(function, from, to, n, add=TRUE/FALSE, ...), where from and to
give the range over which the function is plotted and n (an integer) is the number of points at which it is
evaluated. add=TRUE/FALSE indicates whether to add the curve to an existing plot.

myfun= function(x) x*(1-x) #----in single line braces not required

curve(myfun, from= 0, to=1) #-other arguments take default values
[Figure: plot of myfun(x) against x for x between 0 and 1]

Example: Plotting normal curve

curve(dnorm,from=-4,to=4,n=500) #---dnorm gives pdf of N(0,1)

[Figure: plot of dnorm(x) against x for x between -4 and 4]
Example: Does $\lim_{x \to 0} \sin(1/x)$ exist?

curve( sin(1/x), from=-2, to=2)

Warning in sin(1/x): NaNs produced

[Figure: plot of sin(1/x) against x for x between -2 and 2]
Example (Contd.): Zoom at the origin

curve( sin(1/x), from=-0.1, to=0.1)

Warning in sin(1/x): NaNs produced

[Figure: plot of sin(1/x) against x for x between -0.10 and 0.10]
Applications: Solving Equation
- For an equation in one variable we can use uniroot(function, interval, ...). The function values at the two ends of the interval must have opposite signs.
- To solve e^x = sin(x) we write

uniroot( function(x) exp(x)-sin(x), c(-5,5))

$root
[1] -3.183063

$f.root
[1] -1.359327e-08

$iter
[1] 8

$init.it
[1] NA

$estim.prec
[1] 6.103516e-05

#-note how we write f(x) in one line

- To find the real or complex roots of a polynomial use polyroot(); to solve a
system of n nonlinear equations we can use multiroot() in the package rootSolve.
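
As a small illustration of polyroot(), the sketch below finds the roots of x^2 - 5x + 6. The polynomial is chosen here only as an example. Note that the coefficients are supplied in increasing order of power.

```r
# Coefficients in increasing order of power: 6 - 5x + x^2
roots <- polyroot(c(6, -5, 1))

# polyroot() always returns complex numbers; here the imaginary parts are
# (numerically) zero, so we look at the real parts
real_roots <- sort(Re(roots))
print(real_roots)
```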
Applications: Calculus

- A definite integral can be computed using integrate(), e.g. $\int_0^1 x \, dx$ can be computed
using

integrate (function(x) x, 0, 1)

0.5 with absolute error < 5.6e-15

integrate (dnorm, -3, 3)

0.9973002 with absolute error < 9.3e-07

- Expressions for derivatives can be obtained using deriv().
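
A closely related base R function, D(), returns the derivative as a symbolic expression, which is often the easiest way to see what is going on. The sketch below differentiates x^2 exp(-x) and compares the result with a numerical central difference. The function and the step size h are chosen purely for illustration.

```r
f_expr  <- quote(x^2 * exp(-x))
df_expr <- D(f_expr, "x")        # symbolic derivative with respect to x
print(df_expr)

x <- 1
exact <- eval(df_expr)           # evaluate the derivative at x = 1

# Numerical check with a central difference
f <- function(x) x^2 * exp(-x)
h <- 1e-6
numeric_d <- (f(1 + h) - f(1 - h)) / (2 * h)
print(c(exact = exact, numeric = numeric_d))
```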

Applications: Optimization

- The maximum or minimum value of a function over an interval can be found using

optimize(function, interval, maximum=TRUE/FALSE)   (the default, maximum=FALSE, finds a minimum)

optimize(function(x) exp(-x),c(0,5))

$minimum
[1] 4.999936

$objective
[1] 0.006738379
- There are other functions for optimization, such as optim(), nlm() and constrOptim().
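
The example above finds a minimum. To find a maximum we set maximum=TRUE. The sketch below does this for x(1-x) on [0, 1], the same function that was plotted with curve() earlier. Its maximum is at x = 0.5.

```r
best <- optimize(function(x) x * (1 - x), interval = c(0, 1), maximum = TRUE)

best$maximum     # location of the maximum, close to 0.5
best$objective   # value of the function there, close to 0.25
```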
Loops in R
- Loops help to repeat a job. We start with the for loop.

- The syntax is
for (variable in sequence)
{
expression to be evaluated
}

- Here sequence is an expression which evaluates to a vector (not
necessarily an arithmetic progression).

- For example, all the following are valid:

for (i in 1:10)
for (i in c(2,3,7,9,13,17,19,23))
for (i in c("A", "B", "C"))

- The number of times the expression in the loop is evaluated is the
length of the sequence.
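
A complete for loop might look like the sketch below, which adds up the squares of the elements of a vector that is not an arithmetic progression. The vector and variable names are illustrative only.

```r
values <- c(2, 3, 7, 11)   # any vector will do
total  <- 0
count  <- 0

for (v in values) {
  total <- total + v^2     # the body runs once per element
  count <- count + 1
}

print(total)
print(count)               # equals length(values)
```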
While Loop

- The syntax is
while (condition)
{
expression to be evaluated
}

- The loop repeats its action as long as the test condition is TRUE and
stops as soon as the condition becomes FALSE.

- Unlike a for loop, we need not know in advance how many
times the loop will repeat.
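
As an illustration, the sketch below keeps halving a number until it is no longer greater than 1 and counts how many halvings were needed. The starting value 100 is arbitrary.

```r
x <- 100
steps <- 0

while (x > 1) {      # the condition is checked before every pass
  x <- x / 2
  steps <- steps + 1
}

print(steps)
print(x)
```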
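If and If-Else

- The syntax for the if statement is

if (condition)
{
expression
}

- For a binary situation we can use if-else. The else should follow the
closing brace of the if block on the same line; otherwise, when the code is
run at top level, R treats the if statement as already complete.

if (condition)
{
expression 1
} else
{
expression 2
}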
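
A small worked example of if-else, wrapped in a function so that it can be reused. The function name classify_sign is hypothetical.

```r
classify_sign <- function(x) {
  if (x >= 0) {
    "non-negative"
  } else {
    "negative"
  }
}

a <- classify_sign(3)
b <- classify_sign(-2)
print(c(a, b))
```

Note that if() expects a single TRUE or FALSE, so this function is meant to be called with a single number.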
If-Else function

- A vectorized alternative to if-else statements is the ifelse()
function.

- The syntax is
new_variable = ifelse(condition, value if condition is TRUE,
value if condition is FALSE)

- e.g. category = ifelse(marks > 80, "Good", "Fair")
assigns the value Good if marks is more than 80 and Fair otherwise.

- An additional advantage is that the condition can compare a vector with
a scalar (each element is compared to the scalar), so the result is a
vector of the same length.
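
The vectorized behaviour is easiest to see with a vector of marks. The values below are made up for illustration.

```r
marks <- c(92, 67, 80, 85)

# Each element of marks is compared with 80 in turn
category <- ifelse(marks > 80, "Good", "Fair")
print(category)
```

A mark of exactly 80 is classified as Fair, because the condition is a strict inequality.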
Else if Ladder

- When we have more than two cases we can use an else-if ladder

f=function(x)
{
if (x==1) print("a")
else if(x==2) print("b")
else print("c")
}
Switch Statement

- An alternative, often more compact, way is the switch() function.

- The basic syntax is switch(statement, list)

- Here statement is evaluated and, based on its value, the
corresponding item in the list is returned.

- e.g. switch(2, "A", "B", "C") gives the answer "B". It selects
item no. 2 from the list.

- switch(4, "A", "B", "C") gives NULL as there is no item
with index 4 in the list.

- switch("color", "color"="red", "shape"="round",
"length"=5) gives the answer "red" (it matches the string).
Example

stat= function( x, type)

{
switch ( type, "mean"=mean(x), "median"=median(x), "sd"=sd(x))
} #----function ends here
stat(1:10, "mean") #call the function with mean

[1] 5.5

stat(1:9, "median") #call the function with median

[1] 5
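
With a character first argument, switch() quietly returns NULL when nothing matches, so a misspelt type would go unnoticed in stat(). One common remedy, sketched below, is to add an unnamed last alternative, which acts as the default. The name stat2 is hypothetical.

```r
stat2 <- function(x, type) {
  switch(type,
         "mean"   = mean(x),
         "median" = median(x),
         "sd"     = sd(x),
         stop("unknown type: ", type))   # unnamed last item = default case
}

ok <- stat2(1:10, "mean")
bad <- tryCatch(stat2(1:10, "mode"), error = function(e) conditionMessage(e))
no_default <- switch("mode", "mean" = 1, "median" = 2)   # NULL: no match, no default

print(ok)
print(bad)
print(is.null(no_default))
```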
Repeat Loop

- Basic syntax is
repeat
{
expression to be evaluated
}

- There is no built-in termination condition.

- We need to terminate the loop ourselves using a break
statement.

x=1 #--- start with x equal to 1

repeat
{ #--Loop begins here
print(x)
x=x+1
if (x==6) break #--manual instruction to exit loop
} #---Loop ends here
x #--- after the loop x is 6
Example: Fitting a Model

- Bigger cities tend to produce more economically per capita.
One proposed statistical model is

$Y = y_0 N^a + \epsilon$

where Y is the per-capita "gross metropolitan product" of a
city, N is its population, y_0 and a are parameters, and $\epsilon$ is
the random error.
gmp <- read.table("gmp.dat")
gmp$pop <- gmp$gmp/gmp$pcgmp
plot(pcgmp~pop, data=gmp, log="x", xlab="Population",
ylab="Per-Capita Economic Output ($/person-year)",
main="US Metropolitan Areas, 2006")
curve(6611*x^(1/8),add=TRUE,col="blue")
[Figure: "US Metropolitan Areas, 2006"; per-capita economic output ($/person-year) against population (log scale), with the curve 6611*x^(1/8) in blue]
- Suppose we choose y_0 = 6611. We want to fit the model

$Y = y_0 N^a + \epsilon$

by minimizing $MSE(a) = \frac{1}{n}\sum_{i=1}^{n} (Y_i - y_0 N_i^a)^2$ w.r.t. a.

- But how do we take the derivative w.r.t. a?

- Compute it numerically by

$MSE'(a) \approx \frac{MSE(a + h) - MSE(a)}{h}$

and move a in the direction in which the MSE decreases:

$a_{t+1} - a_t \propto -MSE'(a)$
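
Before applying this to the city data, the update rule can be tried on a toy problem where the answer is known. The sketch below minimizes (a - 2)^2 with a forward-difference derivative. The objective function, step sizes and stopping threshold are all illustrative assumptions and are not the values used for the gmp data.

```r
f <- function(a) (a - 2)^2       # toy objective, minimum at a = 2

h          <- 1e-6               # step for the numerical derivative
step.scale <- 0.1                # how far to move along -derivative
a          <- 0                  # initial guess

for (iter in 1:200) {
  deriv <- (f(a + h) - f(a)) / h         # forward-difference derivative
  if (abs(deriv) < 1e-4) break           # stop when the slope is nearly flat
  a <- a - step.scale * deriv            # gradient-descent update
}

print(a)
print(iter)
```

The stopping rule uses the absolute value of the derivative, so the loop stops only when the slope is small in either direction.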
First Attempt
maximum.iterations <- 100
deriv.step <- 1/1000
step.scale <- 1e-12
stopping.deriv <- 1/100
iteration <- 0
deriv <- Inf
a <- 0.15
while ((iteration < maximum.iterations) && (deriv > stopping.deriv)) {
iteration <- iteration + 1
mse.1 <- mean((gmp$pcgmp - 6611*gmp$pop^a)^2)
mse.2 <- mean((gmp$pcgmp - 6611*gmp$pop^(a+deriv.step))^2)
deriv <- (mse.2 - mse.1)/deriv.step
a <- a - step.scale*deriv
}
list(a=a,iterations=iteration,converged=(iteration < maximum.iterations))

$a
[1] 0.1258166

$iterations
[1] 58

$converged
[1] TRUE
What’s wrong with this?

- Not encapsulated: Re-run by cutting and pasting code, but
how much of it? Also, hard to make part of something larger
- Inflexible: To change the initial guess at a, we have to edit, cut,
paste, and re-run
- Error-prone: To change the data set, we have to edit, cut, paste,
re-run, and hope that all the edits are consistent
- Hard to fix: the loop should stop when the absolute value of the derivative is
small, but this one stops when the derivative is large and negative. Imagine having
five copies of this and needing to fix the same bug in each.

- We will turn this into a function and then improve it

estimate.scaling.exponent.1 <- function(a) {
maximum.iterations <- 100
deriv.step <- 1/1000
step.scale <- 1e-12
stopping.deriv <- 1/100
iteration <- 0
deriv <- Inf
while ((iteration < maximum.iterations) && (abs(deriv) > stopping.deriv)) {
iteration <- iteration + 1
mse.1 <- mean((gmp$pcgmp - 6611*gmp$pop^a)^2)
mse.2 <- mean((gmp$pcgmp - 6611*gmp$pop^(a+deriv.step))^2)
deriv <- (mse.2 - mse.1)/deriv.step
a <- a - step.scale*deriv
}
fit <- list(a=a,iterations=iteration,
converged=(iteration < maximum.iterations))
return(fit)
}
- Problem: All those magic numbers!
- Solution: Make them defaults
Third Attempt

estimate.scaling.exponent.2 <- function(a, y0=6611,

maximum.iterations=100, deriv.step = .001,
step.scale = 1e-12, stopping.deriv = .01) {
iteration <- 0
deriv <- Inf
while ((iteration < maximum.iterations) && (abs(deriv) > stopping.deriv)) {
iteration <- iteration + 1
mse.1 <- mean((gmp$pcgmp - y0*gmp$pop^a)^2)
mse.2 <- mean((gmp$pcgmp - y0*gmp$pop^(a+deriv.step))^2)
deriv <- (mse.2 - mse.1)/deriv.step
a <- a - step.scale*deriv
}
fit <- list(a=a,iterations=iteration,
converged=(iteration < maximum.iterations))
return(fit)
}
- Problem: Why type out the same calculation of the MSE
twice?
- Solution: Declare a function
Fourth Attempt

estimate.scaling.exponent.3 <- function(a, y0=6611,

maximum.iterations=100, deriv.step = .001,
step.scale = 1e-12, stopping.deriv = .01) {
iteration <- 0
deriv <- Inf
mse <- function(a) { mean((gmp$pcgmp - y0*gmp$pop^a)^2) }
while ((iteration < maximum.iterations) && (abs(deriv) > stopping.deriv)) {
iteration <- iteration + 1
deriv <- (mse(a+deriv.step) - mse(a))/deriv.step
a <- a - step.scale*deriv
}
fit <- list(a=a,iterations=iteration,
converged=(iteration < maximum.iterations))
return(fit)
}
- Problem: Locked in to using specific columns of gmp;
shouldn't have to re-write just to compare two data sets
- Solution: More arguments, with defaults
Fifth Attempt

estimate.scaling.exponent.4 <- function(a, y0=6611,

response=gmp$pcgmp, predictor = gmp$pop,
maximum.iterations=100, deriv.step = .001,
step.scale = 1e-12, stopping.deriv = .01) {
iteration <- 0
deriv <- Inf
mse <- function(a) { mean((response - y0*predictor^a)^2) }
while ((iteration < maximum.iterations) && (abs(deriv) > stopping.deriv)) {
iteration <- iteration + 1
deriv <- (mse(a+deriv.step) - mse(a))/deriv.step
a <- a - step.scale*deriv
}
fit <- list(a=a,iterations=iteration,
converged=(iteration < maximum.iterations))
return(fit)
}
- Respecting the interfaces: We could turn the while() loop into
a for() loop, and nothing outside the function would care
estimate.scaling.exponent.5 <- function(a, y0=6611,
response=gmp$pcgmp, predictor = gmp$pop,
maximum.iterations=100, deriv.step = .001,
step.scale = 1e-12, stopping.deriv = .01) {
mse <- function(a) { mean((response - y0*predictor^a)^2) }
for (iteration in 1:maximum.iterations) {
deriv <- (mse(a+deriv.step) - mse(a))/deriv.step
a <- a - step.scale*deriv
if (abs(deriv) <= stopping.deriv) { break }
}
fit <- list(a=a,iterations=iteration,
converged=(iteration < maximum.iterations))
return(fit)
}
Avoid using loops in R

- In R it is generally suggested that we avoid explicit for() loops
where a vectorized alternative exists. Instead we can perform iterative work
in the following ways:
- Indexing with logical conditions and vectorization

x[x>2]
sum(x*y)

- Using the apply family of functions: R offers a family of apply
functions, which allow you to apply a function across different
chunks of data. This offers an alternative to explicit iteration
using a for() loop. It can be simpler and faster,
though not always.
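
To make the first point concrete, the sketch below computes the same sum of products with an explicit loop and with a vectorized expression. The vectors x and y are made-up example data.

```r
x <- c(1, 4, 2, 5)
y <- c(2, 0, 1, 3)

# Explicit loop
total_loop <- 0
for (i in seq_along(x)) {
  total_loop <- total_loop + x[i] * y[i]
}

# Vectorized equivalent: elementwise product, then sum
total_vec <- sum(x * y)

# Logical indexing: keep only the elements greater than 2
big <- x[x > 2]

print(c(total_loop, total_vec))
print(big)
```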
Apply family

- A quick overview of these functions is as follows:

- apply(): apply a function to rows or columns of a matrix or
data frame
- lapply(): apply a function to elements of a list or vector
- sapply(): same as the above, but simplify the output (if
possible)
- tapply(): apply a function to subsets of a vector defined by the levels of a factor.
Using apply()

- The apply() function takes inputs of the following form:

- apply(x, MARGIN=1, FUN=my.fun), to apply my.fun() across
rows of a matrix or data frame x.
- apply(x, MARGIN=2, FUN=my.fun), to apply my.fun() across
columns of a matrix or data frame x.

- For example, the column-wise minima of the airquality data (after removing rows with missing values):

mydata=na.omit(airquality)
apply(mydata, MARGIN=2, FUN=min)

Ozone Solar.R Wind Temp Month Day

1.0 7.0 2.3 57.0 5.0 1.0
- Suppose we need to find which observation attains the maximum
of each variable.

apply(mydata, MARGIN=2, FUN=which.max)

Ozone Solar.R Wind Temp Month Day

77 12 30 79 83 24

- In fact this technique is particularly useful for obtaining the
summary of each variable.

apply(mydata, MARGIN=2, FUN=summary)

Ozone Solar.R Wind Temp Month Day

Min. 1.0000 7.0000 2.30000 57.00000 5.000000 1.00000
1st Qu. 18.0000 113.5000 7.40000 71.00000 6.000000 9.00000
Median 31.0000 207.0000 9.70000 79.00000 7.000000 16.00000
Mean 42.0991 184.8018 9.93964 77.79279 7.216216 15.94595
3rd Qu. 62.0000 255.5000 11.50000 84.50000 9.000000 22.50000
Max. 168.0000 334.0000 20.70000 97.00000 9.000000 31.00000
- We can also apply a user-defined function over rows or columns,
provided the function has been defined beforehand.
- (Example continued) Suppose we write a function which
computes the 10% symmetric trimmed mean and then apply it
as above.

trimmed_mean = function(v) {
q1 = quantile(v, prob=0.1)
q2 = quantile(v, prob=0.9)
return(mean(v[q1 <= v & v <= q2]))
}

apply(mydata, MARGIN=2, FUN=trimmed_mean)

Ozone Solar.R Wind Temp Month Day

37.177778 189.764045 9.927957 78.000000 7.216216 15.532609
- Sometimes it is more convenient to define the function "on the
fly" instead of defining it beforehand.
- We can alternatively define our trimmed mean function
directly inside the call to apply().

apply(state.x77, MARGIN=2, FUN=function(v) {

q1 = quantile(v, prob=0.1)
q2 = quantile(v, prob=0.9)
return(mean(v[q1 <= v & v <= q2]))
})

Population Income Illiteracy Life Exp Murder HS Grad

3384.27500 4430.07500 1.07381 70.91775 7.29750 53.33750
Frost Area
104.68293 56575.72500
- Suppose the user-defined function needs some extra
arguments. Extra arguments can be passed to the
function through apply(). More specifically we can use
apply(x, MARGIN=1, FUN=my.fun, extra.arg.1, extra.arg.2)
to pass two extra arguments, extra.arg.1 and extra.arg.2,
to my.fun().
- We can extend our trimmed mean function so that the user specifies the
trimming percentiles.

# Our custom function: trimmed mean, with user-specified percentiles

trimmed.mean = function(v, p1, p2) {
q1 = quantile(v, prob=p1)
q2 = quantile(v, prob=p2)
return(mean(v[q1 <= v & v <= q2]))
}

apply(state.x77, MARGIN=2, FUN=trimmed.mean, p1=0.01, p2=0.99)

Population Income Illiteracy Life Exp Murder HS Grad

3974.125000 4424.520833 1.136735 70.882708 7.341667 53.131250
Frost Area
104.895833 61860.687500
What does apply() return?

- What kind of data type will apply() give us? It depends on the
function we pass. Suppose we have FUN=my.fun, then:
- if my.fun() returns a single value, then apply() will return a
vector.
- if my.fun() returns k values, then apply() will return a matrix
with k rows (note: this is true regardless of whether
MARGIN=1 or MARGIN=2).
- if my.fun() returns outputs of different lengths for different inputs,
then apply() will return a list.
- if my.fun() returns a list, then apply() will return a list.
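
These rules can be checked on a small matrix. The sketch below uses a 2 x 3 matrix invented for illustration. Its rows are (1, 3, 5) and (2, 4, 6).

```r
m <- matrix(1:6, nrow = 2)

row_sums   <- apply(m, MARGIN = 1, FUN = sum)     # one value per row -> vector
row_ranges <- apply(m, MARGIN = 1, FUN = range)   # 2 values per row -> matrix with 2 rows
col_ranges <- apply(m, MARGIN = 2, FUN = range)   # 2 values per column -> matrix with 2 rows

# Different output lengths per row -> list
ragged <- apply(m, MARGIN = 1, FUN = function(v) v[v > 3])

print(row_sums)
print(dim(row_ranges))
print(dim(col_ranges))
print(ragged)
```

In both matrix results the k = 2 returned values run down the rows, and each column corresponds to one row (MARGIN=1) or one column (MARGIN=2) of m.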
A word of caution

- The apply family is useful most of the time, but we should
not overuse it. There are many
functions that are optimized for specific tasks and are both
simpler and faster than apply().

- For example
- rowSums(), colSums(): for computing row, column sums of a
matrix
- rowMeans(), colMeans(): for computing row, column means of
a matrix
- max.col(): for finding the position of the maximum in each row of a
matrix
- Combining these functions with logical indexing and vectorized
operations will enable you to do quite a lot.
- E.g., how to count the number of positives in each row of a
matrix?

x = matrix(rnorm(9), 3, 3)
# Don't do this (much slower for big matrices)
apply(x, MARGIN=1, function(v) { return(sum(v > 0)) })

[1] 2 1 0

# Do this instead (much faster, simpler)

rowSums(x > 0)

[1] 2 1 0
Using lapply()
The lapply() function takes inputs as in lapply(x, FUN=my.fun),
to apply my.fun() across the elements of a list or vector x. The output
is always a list.
Consider the following:
x=2:5
lapply(x, FUN=log) # same values as log(x), but returned as a list

[[1]]
[1] 0.6931472

[[2]]
[1] 1.098612

[[3]]
[1] 1.386294

[[4]]
[1] 1.609438
- Let us prepare a list and apply the mean function to every element
of the list

my.list=list(nums=c(0.1,0.2,0.3),chars=c("a", "b", "c"),bools=c(FALSE,TRUE, FALSE))

lapply(my.list, FUN=mean) # Get a warning: mean() can't be applied to chars

Warning in mean.default(X[[i]], ...): argument is not numeric or logical: returning NA

$nums
[1] 0.2

$chars
[1] NA

$bools
[1] 0.3333333

lapply(my.list, FUN=summary)

$nums
Min. 1st Qu. Median Mean 3rd Qu. Max.
0.10 0.15 0.20 0.20 0.25 0.30

$chars
Length Class Mode
3 character character

$bools
Mode FALSE TRUE
logical 2 1
Using sapply()
The sapply() function works just like lapply(), but tries to simplify
the return value whenever possible. The most common case is the
conversion of a list to a vector.
Let us use sapply() in the previous example:
sapply(my.list, FUN=mean) # Simplifies the result, now a vector

Warning in mean.default(X[[i]], ...): argument is not numeric or logical: returning NA

nums chars bools

0.2000000 NA 0.3333333

sapply(my.list, FUN=summary) # Can't simplify, so still a list

$nums
Min. 1st Qu. Median Mean 3rd Qu. Max.
0.10 0.15 0.20 0.20 0.25 0.30

$chars
Length Class Mode
3 character character

$bools
Mode FALSE TRUE
logical 2 1
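
The three possible outcomes of the simplification can be seen side by side in the sketch below. The input 1:3 and the anonymous functions are illustrative only.

```r
# One value per element -> simplified to a vector
squares <- sapply(1:3, function(i) i^2)

# Two values per element -> simplified to a matrix with 2 rows
tab <- sapply(1:3, function(i) c(value = i, square = i^2))

# Different lengths per element -> cannot simplify, stays a list
ragged <- sapply(1:3, function(i) seq_len(i))

print(squares)
print(tab)
print(class(ragged))
```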
Using tapply()

The function tapply() takes inputs as in tapply(x, INDEX=my.index, FUN=my.fun),
to apply my.fun() to subsets of
entries in x that share a common level in my.index.
Suppose we want to compute the mean and sd of Frost for
each region.
tapply(state.x77[,"Frost"], INDEX=state.region, FUN=mean)

Northeast South North Central West

132.7778 64.6250 138.8333 102.1538

tapply(state.x77[,"Frost"], INDEX=state.region, FUN=sd)

Northeast South North Central West

30.89408 31.30682 23.89307 68.87652
Using split()

The function split() splits up the rows of a data frame by the levels of a
factor, as in split(x, f=my.index), which splits a data frame x according
to the levels of my.index. Suppose we want to split up the state.x77 data
according to region.
state.by.reg = split(data.frame(state.x77), f=state.region)
class(state.by.reg) # The result is a list

[1] "list"

names(state.by.reg) # This has 4 elements for the 4 regions

[1] "Northeast" "South" "North Central" "West"

class(state.by.reg[[1]]) # Each element is a data frame

[1] "data.frame"
# For each region, display the first 3 rows of the data frame
lapply(state.by.reg, FUN=head, 3)

$Northeast
Population Income Illiteracy Life.Exp Murder HS.Grad Frost Area
Connecticut 3100 5348 1.1 72.48 3.1 56.0 139 4862
Maine 1058 3694 0.7 70.39 2.7 54.7 161 30920
Massachusetts 5814 4755 1.1 71.83 3.3 58.5 103 7826

$South
Population Income Illiteracy Life.Exp Murder HS.Grad Frost Area
Alabama 3615 3624 2.1 69.05 15.1 41.3 20 50708
Arkansas 2110 3378 1.9 70.66 10.1 39.9 65 51945
Delaware 579 4809 0.9 70.06 6.2 54.6 103 1982

$`North Central`
Population Income Illiteracy Life.Exp Murder HS.Grad Frost Area
Illinois 11197 5107 0.9 70.14 10.3 52.6 127 55748
Indiana 5313 4458 0.7 70.88 7.1 52.9 122 36097
Iowa 2861 4628 0.5 72.56 2.3 59.0 140 55941

$West
Population Income Illiteracy Life.Exp Murder HS.Grad Frost Area
Alaska 365 6315 1.5 69.31 11.3 66.7 152 566432
Arizona 2212 4530 1.8 70.55 7.8 58.1 15 113417
California 21198 5114 1.1 71.71 10.3 62.6 20 156361
# For each region, average each of the 8 numeric variables
lapply(state.by.reg, FUN=function(df) {
return(apply(df, MARGIN=2, mean))
})

$Northeast
Population Income Illiteracy Life.Exp Murder HS.Grad
5495.111111 4570.222222 1.000000 71.264444 4.722222 53.966667
Frost Area
132.777778 18141.000000

$South
Population Income Illiteracy Life.Exp Murder HS.Grad
4208.12500 4011.93750 1.73750 69.70625 10.58125 44.34375
Frost Area
64.62500 54605.12500

$`North Central`
Population Income Illiteracy Life.Exp Murder HS.Grad
4803.00000 4611.08333 0.70000 71.76667 5.27500 54.51667
Frost Area
138.83333 62652.00000

$West
Population Income Illiteracy Life.Exp Murder HS.Grad
2.915308e+03 4.702615e+03 1.023077e+00 7.123462e+01 7.215385e+00 6.200000e+01
Frost Area
Split Apply Combine Procedure

- This can be extended to a more general structure, which we
may call the split-apply-combine strategy. It is a combination of
the following three steps:

1. Split the data object into some convenient chunks.
2. Apply the function of interest to each chunk.
3. Combine the results from each chunk in a convenient structure.

- Often the apply and combine steps can be performed for us by
a single call to the appropriate function from the apply()
family.
- The split-apply-combine strategy is simple to conceptualize
and very effective, in the sense that it usually requires fewer
lines of code than an explicit for() loop.
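
The three steps can be sketched on the Frost variable used earlier, with split() doing the splitting and sapply() doing both the applying and the combining. The variable names frost.by.region and region.means are illustrative. The result should agree with the tapply() call shown before.

```r
# Split: one numeric vector of Frost values per region
frost.by.region <- split(state.x77[, "Frost"], f = state.region)

# Apply + combine: mean of each chunk, collected into a named vector
region.means <- sapply(frost.by.region, FUN = mean)
print(region.means)

# Same numbers as the one-step tapply() version
via.tapply <- tapply(state.x77[, "Frost"], INDEX = state.region, FUN = mean)
print(all.equal(as.vector(region.means), as.vector(via.tapply)))
```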
Example

- The strikes data set contains information on 18 countries over
35 years (compiled by Bruce Western, in the Sociology
Department at Harvard University). The measured variables
are:
- country, year: country and year of data collection
- strike.volume: days on strike per 1000 workers
- unemployment: unemployment rate
- inflation: inflation rate
- left.parliament: left wing share of the government
- centralization: centralization of unions
- density: density of unions
Example (Contd.)

strikes.df = read.csv("C:/Users/hp/Desktop/pendrive/R course/PG new/strikes.csv")

dim(strikes.df)

[1] 625 8

head(strikes.df)

country year strike.volume unemployment inflation left.parliament

1 Australia 1951 296 1.3 19.8 43.0
2 Australia 1952 397 2.2 17.2 43.0
3 Australia 1953 360 2.5 4.3 43.0
4 Australia 1954 3 1.7 0.7 47.0
5 Australia 1955 326 1.4 2.0 38.5
6 Australia 1956 352 1.8 6.3 38.5
centralization density
1 0.3748588 NA
2 0.3751829 NA
3 0.3745076 NA
4 0.3710170 NA
5 0.3752675 NA
6 0.3716072 NA
Example (Contd.)

- Is there a relationship between a country's ruling party
alignment (left versus right) and the volume of strikes?
- How do we answer this question statistically?

- One way to understand the relationship is to fit linear
models separately for each of the 18 countries and check
whether the variable left.parliament has a significant effect on
the response variable strike.volume.

- Computationally this can be executed in R in at least 3 ways:

- Worst way: manually write 18 separate code blocks
- Bad way: explicit for() loop, where we loop over countries
- Best way: split appropriately, then use sapply()
- Let us execute the split-apply-combine strategy through the
following steps.
- (Work with just one chunk of data) So let's write code to do
regression on the data from (say) just Italy
strikes.df.italy = strikes.df[strikes.df$country=="Italy", ] # Data for It
italy.lm = lm(strike.volume ~ left.parliament, data=strikes.df.italy)
summary(italy.lm)

Call:
lm(formula = strike.volume ~ left.parliament, data = strikes.df.italy)

Residuals:
Min 1Q Median 3Q Max
-930.2 -411.6 -137.3 387.2 1901.4

Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) -738.75 1200.62 -0.615 0.543
left.parliament 40.29 27.76 1.451 0.156

Residual standard error: 583.3 on 33 degrees of freedom

Multiple R-squared: 0.05999,Adjusted R-squared: 0.0315
F-statistic: 2.106 on 1 and 33 DF, p-value: 0.1562
plot(strikes.df.italy$left.parliament, strikes.df.italy$strike.volume,
main="Italy strike volume versus leftwing alignment",
ylab="Strike volume", xlab="Leftwing alignment")
abline(coef(italy.lm), col=2)
[Figure: scatter plot "Italy strike volume versus leftwing alignment"; x-axis: Leftwing alignment, y-axis: Strike volume; fitted regression line overlaid.]
I (Functionalization) The next step is to turn this into a function
my.strike.lm = function(country.df) {
  # Fit the regression on one country's data and return (intercept, slope)
  coef(lm(strike.volume ~ left.parliament, data=country.df))
}
my.strike.lm(strikes.df.italy)
(Intercept) left.parliament
-738.74531 40.29109
I (Split data into appropriate chunks) Next we shall split our
data into appropriate chunks, each of which can be handled by
our function. For this purpose, the function split() in R is
often helpful: split(df, f=my.factor) splits a data frame df into
several data frames, defined by constant levels of the factor
my.factor. So we want to split strikes.df into 18 smaller data
frames, each of which has the data for just one country.
strikes.by.country = split(strikes.df, f=strikes.df$country)
class(strikes.by.country)
[1] "list"
names(strikes.by.country) # It has one element for each country
[1] "Australia" "Austria" "Belgium" "Canada" "Denmark"
[6] "Finland" "France" "Germany" "Ireland" "Italy"
[11] "Japan" "Netherlands" "New.Zealand" "Norway" "Sweden"
[16] "Switzerland" "UK" "USA"
head(strikes.by.country$Italy) # Same as what we saw before

country year strike.volume unemployment inflation left.parliament

311 Italy 1951 437 8.8 14.3 37.5
312 Italy 1952 337 9.5 1.9 37.5
313 Italy 1953 545 10.0 1.4 40.2
314 Italy 1954 493 8.7 2.4 40.2
315 Italy 1955 511 7.5 2.3 40.2
316 Italy 1956 372 9.3 3.4 40.2
centralization density
311 0.2513799 NA
312 0.2489860 NA
313 0.2482739 NA
314 0.2466577 NA
315 0.2540366 NA
316 0.2457069 NA
I (Apply our function and combine the results) Let us apply our function to each chunk of data, and combine the results. Here, the functions lapply() or sapply() are often helpful. So we want to apply my.strike.lm() to each data frame in strikes.by.country. Think about what the output will be from each function call: a vector of length 2 (intercept and slope), so we can use sapply().
strikes.coefs = sapply(strikes.by.country, FUN=my.strike.lm)
strikes.coefs
Australia Austria Belgium Canada Denmark
(Intercept) 414.7712254 423.077279 -56.926780 -227.8218 -1399.35735
left.parliament -0.8638052 -8.210886 8.447463 17.6766 34.34477
Finland France Germany Ireland Italy Japan
(Intercept) 108.2245 202.4261408 95.657134 -94.78661 -738.74531 964.73750
left.parliament 12.8422 -0.4255319 -1.312305 55.46721 40.29109 -24.07595
Netherlands New.Zealand Norway Sweden Switzerland
(Intercept) -32.627678 721.3464 -458.22397 513.16704 -5.1988836
left.parliament 1.694387 -10.0106 10.46523 -8.62072 0.3203399
UK USA
(Intercept) 936.10154 111.440651
left.parliament -13.42792 5.918647
# We don't care about the intercepts, only the slopes (2nd row).
# Some are positive, some are negative! Let's plot them:
plot(1:ncol(strikes.coefs), strikes.coefs[2,], xaxt="n",
xlab="", ylab="Regression coefficient",
main="Countrywise labor activity by leftwing score")
axis(side=1, at=1:ncol(strikes.coefs),
labels=colnames(strikes.coefs), las=2, cex.axis=0.5)
abline(h=0, col="grey")
[Figure: "Countrywise labor activity by leftwing score"; y-axis: Regression coefficient; x-axis: one tick per country; grey horizontal reference line at 0.]
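The question posed earlier was whether left.parliament has a significant effect, and the slopes alone do not answer that. One possible extension, sketched below, is to return the slope together with its p-value from summary(lm). Since strikes.csv is not bundled with R, the sketch runs on simulated data: toy.df, its three countries "A", "B", "C" and the function name slope.and.pvalue are all hypothetical.

```r
# Simulated stand-in for the strikes data (hypothetical values, illustration only)
set.seed(1)
toy.df = data.frame(
  country = rep(c("A", "B", "C"), each = 30),
  left.parliament = runif(90, min = 30, max = 50)
)
toy.df$strike.volume = 100 + 5 * toy.df$left.parliament + rnorm(90, sd = 50)

# Return the slope of left.parliament and its p-value for one chunk of data
slope.and.pvalue = function(country.df) {
  fit = lm(strike.volume ~ left.parliament, data = country.df)
  coefs = summary(fit)$coefficients  # matrix: Estimate, Std. Error, t value, Pr(>|t|)
  c(slope = coefs["left.parliament", "Estimate"],
    p.value = coefs["left.parliament", "Pr(>|t|)"])
}

# Split by country, apply the function, combine into a 2 x 3 matrix
toy.by.country = split(toy.df, f = toy.df$country)
toy.results = sapply(toy.by.country, slope.and.pvalue)
toy.results
```

With the real data, the same function could be passed to sapply() over strikes.by.country, giving one slope and one p-value per country.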
Using plyr

I plyr was among the most downloaded R packages of all time. This is due to many good reasons!
I The plyr package is just another tool for doing split-apply-combine (SAC) procedures. Actually, plyr adds very little new functionality to R. What it does do is take the process of SAC and make it cleaner, tidier and easier.
I plyr functions have a neat naming convention. All plyr functions are of the form **ply().
I The first two letters of the function name tell the input and output data types, respectively. Replace ** with characters denoting types:
I First character: input type, one of a (array), d (data frame), l (list)
I Second character: output type, one of a, d, l, or _ (drop)
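As a quick way to internalize the convention, the sketch below decodes a function name into its input and output types. The helper describe.ply() is hypothetical, written only for this illustration; it is not part of plyr and does not call plyr.

```r
# Hypothetical helper (not part of plyr): decode the first two characters
# of a plyr-style function name into input and output types.
describe.ply = function(fname) {
  types = c(a = "array", d = "data frame", l = "list", "_" = "nothing")
  input = substr(fname, 1, 1)   # first character: input type
  output = substr(fname, 2, 2)  # second character: output type
  # "_" is only meaningful as an output type
  stopifnot(input %in% c("a", "d", "l"), output %in% names(types))
  paste0(fname, "(): input = ", types[[input]], ", output = ", types[[output]])
}

describe.ply("ddply")  # "ddply(): input = data frame, output = data frame"
describe.ply("a_ply")  # "a_ply(): input = array, output = nothing"
```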
a*ply() - the input is an array

I The signature for all a*ply() functions is:

a*ply(.data, .margins, .fun, ...)

I Here
I .data : an array
I .margins : index (or indices) to split the array by
I .fun : the function to be applied to each piece
I ... : additional arguments to be passed to the function.
I Note that this looks like:

apply(X, MARGIN, FUN, ...)

Example

I Consider a three dimensional array:

new.array = array(1:27, c(3,3,3))
I Also assign names to the rows, columns and the other
dimension:
rownames(new.array) = c("row1", "row2", "row3")
colnames(new.array) = c("column1", "column2", "column3")
dimnames(new.array)[[3]] = c("Group1", "Group2", "Group3")
I Let us have a final look at the array we created just now:
new.array
, , Group1

column1 column2 column3

row1 1 4 7
row2 2 5 8
row3 3 6 9

, , Group2

column1 column2 column3

row1 10 13 16
row2 11 14 17
row3 12 15 18

, , Group3

column1 column2 column3
row1 19 22 25
row2 20 23 26
row3 21 24 27

I Now we shall try different functions of the a*ply() family and note how the output changes.

library(plyr)

Warning: package 'plyr' was built under R version 4.3.2

aaply(new.array, 1, sum) # the output is an array

row1 row2 row3

117 126 135
adply(new.array, 1, sum) # puts the output in a data frame

X1 V1
1 row1 117
2 row2 126
3 row3 135

alply(new.array, 1, sum) # puts the output in a list

$`1`
[1] 117

$`2`
[1] 126

$`3`
[1] 135

attr(,"split_type")
[1] "array"
attr(,"split_labels")
X1
1 row1
2 row2
3 row3
I Now we change the index which will create a different splitting.

aaply(new.array, 2:3, sum) # Get back a 3 x 3 array

X2
X1 Group1 Group2 Group3
column1 6 33 60
column2 15 42 69
column3 24 51 78

adply(new.array, 2:3, sum) # Get back a data frame

X1 X2 V1
1 column1 Group1 6
2 column2 Group1 15
3 column3 Group1 24
4 column1 Group2 33
5 column2 Group2 42
6 column3 Group2 51
7 column1 Group3 60
8 column2 Group3 69
9 column3 Group3 78
alply(new.array, 2:3, sum) # Get back a list

$`1`
[1] 6

$`2`
[1] 15

$`3`
[1] 24

$`4`
[1] 33

$`5`
[1] 42

$`6`
[1] 51

$`7`
[1] 60

$`8`
[1] 69

$`9`
[1] 78
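Because a*ply() mirrors apply(), the numbers above can be cross-checked in base R without loading plyr. The sketch below rebuilds the same 3 x 3 x 3 array under the assumed name arr (dimension names are omitted for brevity) and uses the same margins.

```r
# Same values as new.array in the slides, rebuilt so the block is self-contained
arr = array(1:27, c(3, 3, 3))

# MARGIN = 1: one sum per row, pooling over columns and groups
row.sums = apply(arr, 1, sum)

# MARGIN = 2:3: one sum per (column, group) pair, returned as a 3 x 3 matrix
col.group.sums = apply(arr, 2:3, sum)

row.sums        # 117 126 135, as with aaply(new.array, 1, sum)
col.group.sums  # first column 6 15 24, as with aaply(new.array, 2:3, sum)
```

The difference lies in the output container: apply() decides the shape of the result itself, while the second letter of the plyr function name states it explicitly.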
l*ply() - the input is a list

The signature for all l*ply() functions is:

l*ply(.data, .fun, ...)

Here
I .data : a list
I .fun : the function to be applied to each element
I ... : additional arguments to be passed to the function
Note that this looks like:

lapply(X, FUN, ...)

my.list = list(nums=rnorm(1000), lets=letters, pops=state.x77[,"Population"])
laply(my.list, range) # Get back an array

1 2
[1,] "-3.66418302870311" "2.6689240524252"
[2,] "a" "z"
[3,] "365" "21198"
ldply(my.list, range) # Get back a data frame

.id V1 V2
1 nums -3.66418302870311 2.6689240524252
2 lets a z
3 pops 365 21198

llply(my.list, range) # Get back a list

$nums
[1] -3.664183 2.668924

$lets
[1] "a" "z"

$pops
[1] 365 21198
laply(my.list, summary) # Doesn't work! Outputs have different types/lengths

Error: Results must have one or more dimensions.

ldply(my.list, summary) # Doesn't work! Outputs have different types/lengths

Error in list_to_dataframe(res, attr(.data, "split_labels"), .id, id_as_factor): Results do not have equal lengths

llply(my.list, summary) # Works just fine

$nums
Min. 1st Qu. Median Mean 3rd Qu. Max.
-3.66418 -0.70961 -0.01693 -0.02758 0.63021 2.66892

$lets
Length Class Mode
26 character character

$pops
Min. 1st Qu. Median Mean 3rd Qu. Max.
365 1080 2838 4246 4968 21198
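Two details of the output above are worth checking in base R: the quoted numbers in the laply() result come from R coercing mixed numeric and character results to character, and results of unequal length cannot be packed into an array. The sketch below shows both with sapply(); the list toy.list is a small hypothetical stand-in for my.list.

```r
# Small stand-in for my.list (hypothetical values)
toy.list = list(nums = c(3, 1, 2), lets = letters)

# range() returns 2 values for every element, so sapply() can build a matrix,
# but mixing numbers and letters forces every entry to character
range.mat = sapply(toy.list, range)
is.character(range.mat)

# summary() returns 6 values for nums and 3 for lets, so sapply() cannot
# simplify and falls back to returning a list, much like llply()
summary.out = sapply(toy.list, summary)
is.list(summary.out)
```

Note that sapply() puts one list element per column, whereas laply() in the slides puts one per row.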
The fourth option for * I

I The fourth option for * is _: the function a_ply() (or l_ply(), d_ply()) has no explicit return object, but still runs the given function over the given array (or list, or data frame), possibly producing side effects.

par(mfrow=c(3,3), mar=c(4,4,1,1))
a_ply(new.array, 2:3, plot, ylim=range(new.array), pch=19, c
The fourth option for * II
[Figure: 3 x 3 grid of plots produced by a_ply(), one panel per slice of new.array; x-axis: Index, y-axis: piece.]
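The same "run for side effects, discard the result" idea can be sketched in base R by wrapping lapply() in invisible(). This is only an analogue of the _ply() functions, not how plyr implements them; arr and report.group are hypothetical names.

```r
# Same values as new.array in the slides, rebuilt so the block is self-contained
arr = array(1:27, c(3, 3, 3))

# Side effect only: print one line per group; the return value is not used
report.group = function(k) {
  cat("Group", k, "has total", sum(arr[, , k]), "\n")
}

# invisible() suppresses the list of NULLs that lapply() would otherwise print
invisible(lapply(1:3, report.group))
```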

d*ply() : the input is a data frame

I The signature for all d*ply() functions is:

d*ply(.data, .variables, .fun, ...)
I Here
I .data: data frame
I .variables : variable (or variables) to split the data frame by
I .fun : the function to be applied to each piece
I ... : additional arguments to be passed to the function
I Note that this resembles:
tapply(X, INDEX, FUN, ...)
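The resemblance to tapply() is only partial, which a small sketch on the built-in mtcars data may help show (mtcars is an assumption for illustration, not the strikes data). tapply() splits a single vector, whereas d*ply() hands a whole data frame piece to the function; the closest base R counterpart of the latter is split() on the data frame itself.

```r
# Illustration on the built-in mtcars data (not part of the original slides)

# tapply(X, INDEX, FUN): split the vector X by INDEX, then apply FUN
mean.mpg = tapply(mtcars$mpg, mtcars$cyl, mean)

# The same computation written as an explicit split + apply + combine
mean.mpg.sac = sapply(split(mtcars$mpg, mtcars$cyl), mean)

# To give the function access to every column, split the data frame itself;
# here each piece is a data frame and nrow() counts its rows
n.rows = sapply(split(mtcars, mtcars$cyl), nrow)
```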
Strikes data set, revisited

#Regression coefficients separately for each country, old way:

strikes.list = split(strikes.df, f=strikes.df$country)
strikes.coefs = sapply(strikes.list, my.strike.lm)
head(strikes.coefs)

Australia Austria Belgium Canada Denmark

(Intercept) 414.7712254 423.077279 -56.926780 -227.8218 -1399.35735
left.parliament -0.8638052 -8.210886 8.447463 17.6766 34.34477
Finland France Germany Ireland Italy Japan
(Intercept) 108.2245 202.4261408 95.657134 -94.78661 -738.74531 964.73750
left.parliament 12.8422 -0.4255319 -1.312305 55.46721 40.29109 -24.07595
Netherlands New.Zealand Norway Sweden Switzerland
(Intercept) -32.627678 721.3464 -458.22397 513.16704 -5.1988836
left.parliament 1.694387 -10.0106 10.46523 -8.62072 0.3203399
UK USA
(Intercept) 936.10154 111.440651
left.parliament -13.42792 5.918647
# Getting regression coefficient separately for each country, new way:
strikes.coefs.a = daply(strikes.df, .(country), my.strike.lm)
head(strikes.coefs.a) # Get back an array, note the difference to sapply()

country (Intercept) left.parliament

Australia 414.77123 -0.8638052
Austria 423.07728 -8.2108864
Belgium -56.92678 8.4474627
Canada -227.82177 17.6766029
Denmark -1399.35735 34.3447662
Finland 108.22451 12.8422018
strikes.coefs.d = ddply(strikes.df, .(country), my.strike.lm)
head(strikes.coefs.d) # Get back a data frame

country (Intercept) left.parliament

1 Australia 414.77123 -0.8638052
2 Austria 423.07728 -8.2108864
3 Belgium -56.92678 8.4474627
4 Canada -227.82177 17.6766029
5 Denmark -1399.35735 34.3447662
6 Finland 108.22451 12.8422018
strikes.coefs.l = dlply(strikes.df, .(country), my.strike.lm)
head(strikes.coefs.l) # Get back a list

$Australia
(Intercept) left.parliament
414.7712254 -0.8638052

$Austria
(Intercept) left.parliament
423.077279 -8.210886

$Belgium
(Intercept) left.parliament
-56.926780 8.447463

$Canada
(Intercept) left.parliament
-227.8218 17.6766

$Denmark
(Intercept) left.parliament
-1399.35735 34.34477

$Finland
(Intercept) left.parliament
108.2245 12.8422
Splitting on two (or more) variables

I The function d*ply() makes it very easy to split on two (or more) variables: we just specify them, separated by a “,” in the .variables argument.

# First create a variable that indicates whether the year is 1975 or earlier,
# and add it to the data frame
strikes.df$yearPre1975 = strikes.df$year <= 1975
# Then use (say) ddply() to compute regression coefficients for each country,
# separately for the years up to 1975 and the years after 1975

strikes.coefs.1975 = ddply(strikes.df, .(country, yearPre1975), my.strike.lm)
dim(strikes.coefs.1975) # Note that there are 18 x 2 = 36 rows

[1] 36 4
head(strikes.coefs.1975)

country yearPre1975 (Intercept) left.parliament

1 Australia FALSE 973.34088 -11.8094991
2 Australia TRUE -169.59900 12.0170866
3 Austria FALSE 19.51823 -0.3470889
4 Austria TRUE 400.83004 -7.7051918
5 Belgium FALSE -4182.06650 148.0049261
6 Belgium TRUE -103.67439 9.5802824
# We can also create factor variables on-the-fly with I(), as we've seen before
strikes.coefs.1975 = ddply(strikes.df, .(country, I(year<=1975)), my.strike.lm)
dim(strikes.coefs.1975) # Again, there are 18 x 2 = 36 rows

[1] 36 4
head(strikes.coefs.1975)

country I(year <= 1975) (Intercept) left.parliament
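For comparison, base R can also split on two variables by passing a list of factors to split(). The sketch below uses the built-in mtcars data (an assumption for illustration) and splits on cyl and am; pieces and group.means are hypothetical names.

```r
# Illustration on the built-in mtcars data (not part of the original slides)

# Split on two variables at once: one piece per (cyl, am) combination
pieces = split(mtcars, list(mtcars$cyl, mtcars$am))
names(pieces)  # names are formed as "cyl.am"

# Apply a function to each piece and combine into a named vector
group.means = sapply(pieces, function(piece) mean(piece$mpg))
group.means
```

By default split() also keeps combinations that do not occur in the data as empty pieces; its drop = TRUE argument removes them. In mtcars all six combinations are present.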
