Advanced R: Chapter 3.3: Functional Programming

The document discusses functional programming and recursion in R. It defines functional programming and outlines its key characteristics that R satisfies. Recursion is a core concept, where a function calls itself through recursive calls. Examples like calculating factorials and Fibonacci numbers recursively are provided. The recursion stack is also explained, which tracks nested function calls in R.

Advanced R

Chapter 3.3: Functional programming

Daniel Horn & Sheila Görz

Summer Semester 2022

Daniel Horn & Sheila Görz Advanced R Summer Semester 2022 1 / 64


What is functional programming?

Recapitulation

Imperative programming concerned itself with how problems should be solved. The program specified exactly what is to be done.
On the other hand, object-oriented programming was all about data
and structuring small chunks of imperative code through methods.
Declarative programming takes a contrary approach:

Declarative programming
Declarative programs merely describe the problem, i.e. they describe what
needs to be done. It’s left for the underlying programming language (not
the programmer!) to decide how to actually solve it.

To us, the most important representative of declarative programming is functional programming. However, there are others as well, e.g. SQL.

Concept of functional programming

A computer program is interpreted as a mathematical function that maps an input onto an output. Thus, we simply need to define this mathematical function in a proper way:
Example: Calculating factorials (n!).

Defining factorials by specifying the calculation steps:
    n! = 1
    for (i in 1:n)
        n! = n! * i

Defining factorials recursively:
    n! = \begin{cases} 1 & \text{if } n = 1 \\ n \cdot (n - 1)! & \text{else} \end{cases}

Based on the definition, the programming language must now decide on its
own how to calculate the function. All it can fall back on are arithmetic
and logical operators, conditionals and further (recursive) function calls.

Definition: Functional programming languages I

A programming language is called functional if it shares the following five characteristics:

1. Programs are functions


A computer program is a mathematical function: it maps an input onto an
output which shall only depend on the input and not any further (hidden)
parameters.
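
The first characteristic can be illustrated with a minimal sketch that contrasts a function depending only on its arguments with one that silently reads a global variable. The names addTax, addTaxHidden and rate are made up for this illustration and do not appear elsewhere in the slides.

```r
# Pure: the result depends only on the arguments.
addTax = function(price, rate) {
  price * (1 + rate)
}

# Impure: the result additionally depends on the hidden global variable `rate`.
rate = 0.19
addTaxHidden = function(price) {
  price * (1 + rate)
}

pure.before = addTax(100, 0.19)
hidden.before = addTaxHidden(100)

# Changing the global variable silently changes the second function's result.
rate = 0.07
pure.after = addTax(100, 0.19)
hidden.after = addTaxHidden(100)
```

Only addTax() behaves like a mathematical function here: the same input always yields the same output.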

Definition: Functional programming languages II

2. No imperative programming
Inside the function, no imperative programming is allowed except for the if
conditional. Only further (nested) function calls as well as arithmetic and
logical expressions are permitted.

3. Functions are data objects


Functions are regular data objects like, for example, double or integer.
This means in particular: Functions can act as input or output of other
functions. Furthermore, functions are not static objects and can be created
anew anytime at runtime.

Definition: Functional programming languages III

4. Functions are closures


A function is aware of the context it has been created in. It can access all
variables from said context even without being a part of it anymore. As
soon as the function leaves its context, the current state of the variables is
frozen. Such functions are called closures.

5. Functions don’t have to be named


When defining a new function, it doesn’t necessarily have to be named.
Particularly, in combination with the third characteristic, a nameless
function can be used directly as an input of another function. Functions
like these are called anonymous.

R fulfills all five characteristics

1. This is more of a philosophical matter, but ultimately every program can be a function in R. R also allows the execution of R scripts in which every instruction is a function call. Caution: Sometimes, R functions don’t solely depend on their input, but for example on a random number generator as well.
2. At first glance: No, since imperative programming inside of R functions is possible. However: Everything in R is a function call. Thus, every R program is a single nested function call internally.
3. Fulfilled. See chapter on functions.
4. Fulfilled. That’s why functions are called closures in R. Caution: The state of variables from a function’s context isn’t frozen in R.
5. Fulfilled. Assigning a function to a variable is optional.
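
The two caveats (hidden random number generator state, and captured variables that are not frozen) can be made concrete with a small sketch. The names addOffset and offset are hypothetical and only serve this illustration.

```r
# Caveat for characteristic 1: the result depends on the hidden RNG state.
set.seed(1)
a = runif(1)
b = runif(1)       # same call, but the hidden RNG state has advanced
set.seed(1)
a.again = runif(1) # resetting the hidden state reproduces the first result

# Caveat for characteristic 4: captured variables are not frozen.
offset = 1
addOffset = function(x) x + offset
first = addOffset(10)   # 11
offset = 100
second = addOffset(10)  # 110: the function sees the updated value
```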

R is a functional programming language

R fulfills the five essential characteristics of functional programming languages with only minor caveats. Thus, we can call R a functional programming language in good conscience.
In particular, R also offers us all imperative tools like loops. Internally,
these are executed according to the principles of functional
programming (Reminder: Loops are just another function call) while
still allowing us to combine these two paradigms in a flexible way.
Thus, we can cherry-pick at will and write quite elegant programs.

DRY: Don’t Repeat Yourself

Oftentimes, recurring tasks have to be performed inside our programs with only minor changes to the parameters. A common approach to this is copy and paste. As convenient as it may be, it’s also very error-prone. Instead, it’s more advisable to follow the DRY principle.

Dave Thomas and Andy Hunt


Every piece of knowledge must have a single, unambiguous, authoritative
representation within a system.

In many situations, functional programming is a powerful tool to realize the DRY principle.

Recursion

Definition: Recursion

The most essential control structure of functional programming languages is recursion, a concept that most people with either a programming or mathematical background should be familiar with.

Recursion and recursive functions


A function that calls itself is referred to as a recursive function, the actual
self-call as recursion.

Usually, a recursive function consists of two components:


A base case in which the result is directly calculated for a simple input.
A recursive step in which the function is called recursively upon a
somehow reduced input. The result of the self-call is then used to
calculate the return value of the function.

Recursion: Examples
We’ve already seen one example: calculating factorials. Two other common
examples of recursion are:

Calculating the n-th number of the Fibonacci series:

fib = function(n) {
  # Base case
  if (n < 2L)
    return(n)

  # Recursive step with two rec. calls
  res = Recall(n - 1L) + Recall(n - 2L)
  return(res)
}

Sorting according to Quicksort:

quicksort = function(x) {
  # Base case
  if (length(x) < 2L)
    return(x)

  # Recursive step with two rec. calls
  pivot = x[1L]
  sorted = c(Recall(x[x < pivot]),
    x[x == pivot], Recall(x[x > pivot]))
  return(sorted)
}

Caution: Since binding a function to a name is optional, the function Recall() should be used for the recursive call.
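
Why Recall() is the safer choice can be seen in a minimal sketch: a function that recurses via its own name stops working once that name is no longer bound, whereas the Recall() variant keeps working. The names facByName, facByRecall, f1 and f2 are assumptions made for this illustration.

```r
# Recursion by name: breaks if the original name is no longer bound.
facByName = function(n) {
  if (n <= 1)
    return(1)
  n * facByName(n - 1)
}

# Recursion via Recall(): independent of the name the function is bound to.
facByRecall = function(n) {
  if (n <= 1)
    return(1)
  n * Recall(n - 1)
}

f1 = facByName
f2 = facByRecall
rm(facByName, facByRecall)

res.name = try(f1(5), silent = TRUE)  # error: could not find function "facByName"
res.recall = f2(5)                    # 120
```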

The recursion stack I

If you’ve already used recursion in R, you may have come across the
following problem:
recursion = function(x) {
if (x == 1)
return(x)
return(Recall(x - 1))
}
recursion(10000)

## Error: evaluation nested too deeply: infinite recursion / options(expressions=)?

What’s obvious: Infinite recursions are impossible in a computer since a program that calls itself infinitely never terminates.

Conclusion: There must be an upper limit on how often a function can call itself, and an error must be raised once this limit is reached.

The recursion stack II

In particular, the computer must keep track of all open function calls. Upon
completing the execution of a single call, it must then continue with the
preceding call.

The recursion stack


All currently executing (nested) function calls are placed on the recursion
stack. In R, it also contains the respective execution environment.
Functions are executed from top to bottom. Each new function call is
placed on top of the stack.

The current stack size corresponds to the number of currently executing function calls. It’s also referred to as the recursion depth. In R, it’s bounded and can be adjusted via options(expressions = NUMBER).
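
The growth of the stack can be observed directly. The following sketch uses sys.nframe(), which reports the number of the frame it is called from, inside a hypothetical helper stackDepth() that is not part of the slides.

```r
# Report how many function calls are on the stack when the base case is reached.
stackDepth = function(n) {
  if (n == 0)
    return(sys.nframe())
  stackDepth(n - 1)
}

# Each additional recursive call adds exactly one frame.
extra.frames = stackDepth(5) - stackDepth(0)

# The current limit on nested expressions (5000 by default).
limit = getOption("expressions")
```

Only the difference between two depths is compared here, because the absolute frame number depends on where the code is called from.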

Recursion in R I

Because recursions usually require more computational time than their iterative equivalents in R, they should be avoided (mean times in nanoseconds):

Calculating factorials recursively:

library(microbenchmark)

facRec = function(x) {
  if (x == 1)
    return(1)
  return(x * Recall(x - 1))
}
for (n in c(20, 40, 60, 80, 100))
  print(mean(microbenchmark(
    facRec(n))$time))

## [1] 42057.01
## [1] 25540.04
## [1] 46313.05
## [1] 54705.03
## [1] 76953.95

Calculating factorials iteratively:

facIt = function(x) {
  res = 1
  for (i in 1:x)
    res = res * i  # multiply by the loop index i, not by x
  return(res)
}
for (n in c(20, 40, 60, 80, 100))
  print(mean(microbenchmark(
    facIt(n))$time))

## [1] 28629.02
## [1] 1532.01
## [1] 1965.05
## [1] 2660.05
## [1] 2792.02

Recursion in R II

To understand this, we need to recall the chapter on functions: When a function is called, the following things happen:
A new execution environment is created.
The enclosing environment is captured.
Argument matching is performed.
...
All of this takes place for every recursive function call and amounts to a lot
of time. That’s why recursions should be avoided in R. However, as we
already know: Every recursion has an equivalent iterative variant.

Therefore: Even though R is a functional programming language, iterative variants of recursions should be preferred in most cases. If the recursion depth is known to be shallow in advance, recursive functions can be viable.
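
As a sketch of how a recursion can be turned into a loop, here is the Fibonacci example in both forms. fibRec and fibIt are names chosen for this illustration; fibRec recurses by name rather than via Recall() to keep the comparison short.

```r
# Recursive definition.
fibRec = function(n) {
  if (n < 2L)
    return(n)
  fibRec(n - 1L) + fibRec(n - 2L)
}

# Iterative equivalent: carry the last two values along in a loop.
fibIt = function(n) {
  a = 0  # fib(0)
  b = 1  # fib(1)
  for (i in seq_len(n)) {
    tmp = a + b
    a = b
    b = tmp
  }
  a
}

same = all(sapply(0:15, fibRec) == sapply(0:15, fibIt))
```

The loop needs n iterations, whereas the naive recursion recomputes the same values many times.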

Iterating over functions

Iterating over functions I

In the last example, we haven’t actually followed the DRY principle: We have written the same benchmarking code twice, once for facRec() and once for facIt().

Functional programming offers us tools to prevent this. In R, we can iterate over a list of functions:
x = runif(20)
for(fun in c(median, mean, var, sd)) {
print(fun(x))
}

## [1] 0.640933
## [1] 0.6598144
## [1] 0.07202202
## [1] 0.2683692

Iterating over functions II

Here, we’re combining a concept of imperative programming (the for loop) and a concept of functional programming (a list of functions) to apply all functions in a compact way.

We can use this to realize the DRY principle in our example:


for (facFun in c(facRec, facIt)) {
for (n in c(20, 40, 60, 80, 100)) {
print(mean(microbenchmark(facFun(n))$time))
}
}

We’re not quite fond of the output and other aspects of this solution yet,
but we’ll refine this example over the course of this chapter.

Roadmap

We’re basing our approach in particular on the third characteristic of functional programming languages: Functions are treated as data objects.
This is what allowed us to iterate over functions. In the next couple of
subchapters, another aspect of this characteristic shall be our focus:

Functions can act as an input or an output of other functions.

We’ll be taking a look at the three possible scenarios:

1. Functions as input parameters
2. Functions as return values
3. Functions as input as well as output

Functions as input of other functions

Functionals

Functionals
A function that takes a function as its input and returns a vector is called a functional.

We’ll be looking at three kinds of functionals in this chapter:

1. Functionals that only apply the provided function, thus replacing loops.
2. Functionals that analyze the provided function.
3. Functionals that outsource part of their functionality.

All of you have probably already used functionals from all three classes.

Our first functional: lapply()

The first functional that (hopefully) all of you have already used is:
lapply

## function (X, FUN, ...)
## {
##     FUN <- match.fun(FUN)
##     if (!is.vector(X) || is.object(X))
##         X <- as.list(X)
##     .Internal(lapply(X, FUN))
## }
## <bytecode: 0x0000000006dcc3e0>
## <environment: namespace:base>

lapply() takes a vector X and a function FUN as its input. It applies FUN
to every element of X and returns the results as a list.
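
A small sketch of the call semantics, including the ... argument that appears in the signature above. The list samples is made-up example data.

```r
samples = list(a = c(1, 2, 3), b = c(2, NA, 4))

# Extra arguments after FUN are passed on to every call of FUN.
means = lapply(samples, mean, na.rm = TRUE)

# The result is a list with one element per element of X; names are kept.
str(means)
```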

lapply: Example
The most common use case of lapply() is as a replacement of a for loop:

Using lapply():

ns = c(20, 40, 60, 80, 100)
res = lapply(ns, function(n)
  mean(microbenchmark(facRec(n))$time)
)

Using a for loop:

res = vector(mode = "list", length = 5L)
for (i in seq_along(ns)) {
  res[[i]] = mean(
    microbenchmark(facRec(ns[i]))$time
  )
}

We can define a simplified lapply() ourselves:

lapply2 = function(X, FUN, ...) {
  res = vector(mode = "list", length = length(X))
  for (i in seq_along(X))
    res[[i]] = FUN(X[[i]], ...)
  names(res) = names(X)
  return(res)
}

The actual lapply() is implemented in C and is thus more efficient.



Why use lapply()? I

If lapply() is always equivalent to using a for loop, why use it at all?

1. Supposedly, it’s more efficient. In a loop, code inside the body must be interpreted in each iteration anew while this is only required once for lapply().

funct = function(n) {
  lapply(1:n, function(x) x^2)
}
mean(microbenchmark(funct(1e4))$time)
## [1] 6936928

loop = function(n) {
  res = numeric(n)
  for(i in 1:n)
    res[i] = i^2
}
mean(microbenchmark(loop(1e4))$time)
## [1] 522540

This claim is apparently outdated: The efficiency issue of loops has been resolved since R version 3.4.0, in which the byte-code compiler was enabled by default.

Why use lapply()? II

Runtime of both variants in different R versions:

R version   3.0.2     3.2.2    3.3.2    3.4.2
Loop        14.0 ms   7.8 ms   9.0 ms   0.6 ms
lapply()     8.7 ms   7.0 ms   7.3 ms   5.6 ms

Though these relations may change for other examples, loops and apply() variants take about the same time in general.

2. We’re saving four lines of code.

lapply() replaces a recurring pattern of a for loop: Initializing the result vector as a list and iterating over X. In compliance with the DRY principle, we’re replacing this recurring pattern by a function. Instead of a confusing (and bulky) loop, lapply() enables us to simply use a single function call for the same functionality.

The apply family

Aside from lapply(), there are also other apply functions that all replace
certain looping patterns:

Function   Pattern
lapply()   Input: Vector, Output: List
sapply()   Input: Vector, Output: Simplified as much as possible
vapply()   Input: Vector, Output: Pre-defined type
apply()    Input: Matrix, iterating over rows or columns, Output: Vector
tapply()   Input: Vector, Factor, iterating over factor levels, Output: Simplified list with an element per factor level
by()       Input: Data frame, else like tapply()
mapply()   Input: Multiple vectors, Output: Simplified list
eapply()   Input: Environment, Output: List
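
A few of these patterns side by side, as a minimal sketch on made-up toy data (all variable names are assumptions for this illustration):

```r
x = 1:3

s.res = sapply(x, function(v) v^2)              # simplified to a numeric vector
v.res = vapply(x, function(v) v^2, numeric(1))  # like sapply(), but the type is checked
m.res = mapply(function(a, b) a + b, 1:3, 4:6)  # iterates over both vectors in parallel

mat = matrix(1:6, nrow = 2)
row.sums = apply(mat, 1, sum)                   # MARGIN = 1: iterate over rows

values = c(1, 2, 3, 4)
groups = factor(c("a", "b", "a", "b"))
group.sums = tapply(values, groups, sum)        # one result per factor level
```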

The plyr package

The apply family belongs to the first kind of functionals that aim to
simply apply the given function to a provided data structure and
create a proper output.
Each apply function has its own quirks, e.g.: An argument might be
called simplify for one function and SIMPLIFY for the other.
The apply family is the key to mastering the functional aspects of R.
The R package plyr offers unified variants of apply.

Filter(), Map(), Reduce() I

Most functional programming languages build on three essential functionals: Filter(), Map() and Reduce(). R supports these as well:

Filter()
Filter() takes a vector and a function with a logical return value as input.
It returns a vector with the elements for which the function returns TRUE.
Filter(function(x) x > 3, 1:10)

## [1] 4 5 6 7 8 9 10

Filter(), Map(), Reduce() II

Map()
Map() takes a function and multiple vectors with arguments. The function
is then applied to every i-th element of the vectors. If necessary, vectors are
recycled.
Map(`+`, 1:2, 11:12)

## [[1]]
## [1] 12
##
## [[2]]
## [1] 14

For the most part, Map() corresponds to mapply().

Filter(), Map(), Reduce() III

Reduce()
Reduce() reduces a vector to a single value. First, the provided function is
applied to a starting value and the first element of the vector. After that,
the resulting value and the next element of the vector are used and so forth
until all elements of the vector have been processed.
Reduce(`+`, 1:10, init = 0)

## [1] 55

The reduction can be performed starting from the left or from the right. It’s also possible to return the accumulated intermediate results.
Reduce(`+`, 1:10, init = 0, right = TRUE, accumulate = TRUE)

## [1] 55 54 52 49 45 40 34 27 19 10 0
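
The three functionals are typically chained. The following sketch first shows a running sum from the left and then computes the sum of the squares of all even numbers from 1 to 10; the variable names are chosen for this illustration only.

```r
# Running sum from the left: each entry is the sum of all elements so far.
running = Reduce(`+`, 1:4, accumulate = TRUE)   # 1 3 6 10

# Sum of the squares of all even numbers from 1 to 10.
evens = Filter(function(x) x %% 2 == 0, 1:10)   # 2 4 6 8 10
squares = Map(function(x) x^2, evens)           # list(4, 16, 36, 64, 100)
total = Reduce(`+`, squares)                    # 220
```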

Anonymous functions

Recalling the fifth characteristic of functional programming:

Functions don’t have to be named; they can be anonymous.

We’ve already made use of this multiple times on the last couple of slides:
sapply(1:10, function(x) x^2)

## [1] 1 4 9 16 25 36 49 64 81 100

Here, function(x) x^2 is an anonymous function. It’s not assigned to any variable. Functionals are a common use case for anonymous functions.

The benchmark example


Using our newly acquired knowledge about functionals, we can now
implement our benchmark example more elegantly:
# Define a function to measure the time for one facFun and for one n
getTime = function(facFun, n)
  mean(microbenchmark(facFun(n))$time)
# Define a function that iterates over multiple facFuns and ns
# and produces a nice output
getTimes = function(facFuns, ns) {
  times = lapply(facFuns, function(facFun) sapply(ns, getTime, facFun = facFun))
  do.call(rbind, times)
}
# Apply the function
getTimes(
  facFuns = list(Rec = facRec, It = facIt),
  ns = c(20L, 40L, 60L, 80L, 100L)
)

##         [,1]     [,2]     [,3]     [,4]     [,5]
## Rec 12101.01 27432.07 38120.98 60236.90 78816.02
## It   1057.02  1906.04  2144.06  2516.05  3277.01

Type 2: Analyzing the provided function

The aim of some functionals is not to actually apply the provided function,
but to analyze it. We already know some examples for this:

optim(fun): Find a (local) minimum of fun.
integrate(fun): Determine the area under fun.
uniroot(fun): Determine a root of fun within a given interval.

All these functions have a thing in common: During their execution, the
function fun is evaluated multiple times to learn something about fun.
Usually, fun is interpreted as a purely mathematical function.
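
A minimal sketch of all three functionals applied to simple mathematical functions. The concrete test functions and the choice of method = "BFGS" are assumptions made for this illustration; note that the real calls need more arguments than just the function (a starting value, integration limits, a search interval).

```r
# Minimum of (x - 2)^2; "BFGS" is chosen here because the default
# Nelder-Mead method is not intended for one-dimensional problems.
opt = optim(par = 0, fn = function(x) (x - 2)^2, method = "BFGS")

# Area under x^2 between 0 and 1 (exact value: 1/3).
area = integrate(function(x) x^2, lower = 0, upper = 1)

# A root of x^2 - 2 inside the interval [0, 2] (exact value: sqrt(2)).
root = uniroot(function(x) x^2 - 2, interval = c(0, 2))
```

The results are stored in opt$par, area$value and root$root and are numerical approximations of the exact values.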

Type 3: Outsourcing functionality

Sometimes we’re writing a lengthy function in which a small task can be performed in multiple ways (depending on the input). Generally, there are two approaches:

1. Using nested if conditionals. A discrete input parameter specifies which of these ways shall be used. All variants are implemented in the function and through conditionals one of them is selected.
2. The functionality is outsourced to an external function which is then used as an input parameter and is called in the larger function. The user can implement the small task as desired.

Example: Inside the glm() function, the (distribution) family can be specified which is defined via an external function, e.g. binomial().
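
The two approaches can be contrasted in a small sketch. The functions centerByName and centerByFun are hypothetical and only serve to illustrate the design choice.

```r
# Approach 1: all variants live inside the function and are selected by name.
centerByName = function(x, method = "mean") {
  if (method == "mean") {
    x - mean(x)
  } else if (method == "median") {
    x - median(x)
  } else {
    stop("Unknown method.")
  }
}

# Approach 2: the small task is passed in as a function.
centerByFun = function(x, centerFun = mean) {
  x - centerFun(x)
}

x = c(1, 2, 3, 10)
by.name = centerByName(x, method = "median")
by.fun = centerByFun(x, centerFun = median)

# A variant the author of centerByFun() never had to anticipate.
trimmed = centerByFun(x, centerFun = function(v) mean(v, trim = 0.25))
```

With the second approach, new variants can be added by the caller without touching the larger function.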

Example: Weighted moving average I


In this example, we’re calculating the weighted moving average of a DAX
time series from the datasets package:
plot(EuStockMarkets[, "DAX"], ylab = "Euro")
[Figure: DAX time series, 1992 to 1998; x-axis: Time, y-axis: Euro]

Example: Weighted moving average II

Our function movingMean() takes a time series and an avgFun() function as input. The latter specifies how to smooth the time series.

movingMean = function(time.series, avgFun) {
  # Calculate moving average using supplied avgFun
  res = sapply(seq_along(time.series),
    function(i) avgFun(time.series, i)
  )
  # Format back to original time series
  ts(res, start = start(time.series),
    end = end(time.series), frequency = frequency(time.series))
}

Note: The input parameters as well as the output type of avgFun() are
fixed by the function’s definition. Thus, every supplied function must
comply with them.

We can now write arbitrary functions to calculate different moving averages:


Example: Weighted moving average III

identityAvgFun = function(time.series, i)
  time.series[i]

plot(movingMean(EuStockMarkets[, "DAX"], identityAvgFun), ylab = "Euro")

[Figure: DAX time series returned by movingMean() with identityAvgFun, 1992 to 1998; x-axis: Time, y-axis: Euro]

Example: Weighted moving average IV

meanAvgFun = function(time.series, i) {
  inds = i + -5:5
  inds = inds[inds >= 1 & inds <= length(time.series)]
  mean(time.series[inds])
}

[Figure: smoothed DAX time series, 1992 to 1998; x-axis: Time, y-axis: Euro]

Example: Weighted moving average V

weightMeanAvgFun = function(time.series, i) {
  inds = i + -2:2
  allowed.inds = inds >= 1 & inds <= length(time.series)
  inds = inds[allowed.inds]
  weights = 2 * (1:5)[allowed.inds]
  weighted.mean(time.series[inds], w = weights)
}

[Figure: smoothed DAX time series, 1992 to 1998; x-axis: Time, y-axis: Euro]

Functions as output of other functions

Function factories

Up until now, we have only allowed for functions to be input parameters. Next, we’ll cover functions that return functions. These are also called function factories:

Functions making functions. How perverse. (Loosely based on C3PO Ep II)

We’re recalling: Every function captures the environment in which it was created. If a function f1 is created inside another function f2, then the enclosing environment of f1 is the execution environment of f2.

Thus, our newly created function can also access all variables and
parameters of its factory. Therefore, if we often need similar functions that
only differ in a few parameters, we can create them using a function factory.

A simple function factory

As a simple example, let’s simplify sampling random numbers:

createUnifSampler = function(n, lower, upper) {
  fun = function()
    runif(n, lower, upper)
  return(fun)
}

u0To10 = createUnifSampler(1, 0, 10)
u0To10()

## [1] 4.13099

u10To20 = createUnifSampler(1, 10, 20)
u10To20()

## [1] 12.91354

The enclosing environment I


The respective parameters can be found inside the enclosing environment:

env.u0To10 = environment(u0To10)
as.list(env.u0To10)

## $fun
## function()
## runif(n, lower, upper)
## <environment: 0x000000001102aef8>
##
## $n
## [1] 1
##
## $lower
## [1] 0
##
## $upper
## [1] 10

env.u10To20 = environment(u10To20)
as.list(env.u10To20)

## $fun
## function()
## runif(n, lower, upper)
## <bytecode: 0x0000000010a253b0>
## <environment: 0x0000000009340458>
##
## $n
## [1] 1
##
## $lower
## [1] 10
##
## $upper
## [1] 20

The enclosing environment II

We can manipulate the enclosing environment as we please:

Modifying single variables:

rm("fun", envir = env.u0To10)
env.u0To10$n = 3L
str(as.list(environment(u0To10)))

## List of 3
## $ n : int 3
## $ lower: num 0
## $ upper: num 10

u0To10()

## [1] 9.026430 7.574375 3.637657

Replacing the whole environment:

pars = list(n = 2, lower = -5, upper = 0)
new.env = list2env(pars)
environment(u10To20) = new.env
str(as.list(environment(u0To10)))

## List of 3
## $ n : int 3
## $ lower: num 0
## $ upper: num 10

u10To20()

## [1] -3.806947 -1.969815

But really, we shouldn’t. If we desire other behavior from our function, we should simply create a new function with our factory.

Why use function factories?

At first glance, function factories appear to be a nice gimmick. However, we could have just as well realized the previous example by directly calling runif() with the respective parameters.

So, what value do factories actually add?

Sometimes, there are multiple layers of parameters some of which are supposed to be set while others are not. In this case, factories can reduce the number of parameters that remain in the eventual call.
Complicated, one-time-only calculations can be performed once upon creating the function.
Resulting program code is better structured.
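
The second point (one-time-only calculations) can be sketched as follows. makeStandardizer is a hypothetical factory invented for this illustration: mean and standard deviation of a reference sample are computed once, when the function is created, and reused in every later call.

```r
makeStandardizer = function(reference) {
  # Performed only once, when the function is created.
  center = mean(reference)
  spread = sd(reference)
  function(x) {
    (x - center) / spread
  }
}

standardize = makeStandardizer(c(2, 4, 6))  # mean 4, standard deviation 2
z = standardize(c(4, 6))                    # 0 1
```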

Example: Weight functions for our moving average

The weight functions in our examples were very similar. This doesn’t really
comply with the DRY principle. Instead, we can write a function factory for
weight functions:

mkAvgFun = function(window.size = 2L, weights = rep(1, 2L * window.size + 1L)) {
  function(time.series, i) {
    inds = i + -window.size:window.size
    allowed.inds = inds >= 1L & inds <= length(time.series)
    inds = inds[allowed.inds]
    weights = weights[allowed.inds]
    weighted.mean(time.series[inds], w = weights)
  }
}
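
A possible way to use the factory, as a sketch: the two hand-written averaging functions from the earlier slides can be recreated with one call each. The factory is repeated so that the block runs on its own; the names meanAvgFun2, weightMeanAvgFun2, simple.avg and the toy series are assumptions made for this illustration.

```r
# Factory copied from the slide above so that this sketch runs on its own.
mkAvgFun = function(window.size = 2L, weights = rep(1, 2L * window.size + 1L)) {
  function(time.series, i) {
    inds = i + -window.size:window.size
    allowed.inds = inds >= 1L & inds <= length(time.series)
    inds = inds[allowed.inds]
    weights = weights[allowed.inds]
    weighted.mean(time.series[inds], w = weights)
  }
}

# Counterparts of the two hand-written functions from the earlier slides.
meanAvgFun2 = mkAvgFun(window.size = 5L)
weightMeanAvgFun2 = mkAvgFun(window.size = 2L, weights = 2 * (1:5))

toy = c(1, 2, 3, 4, 5)  # hypothetical toy series
simple.avg = mkAvgFun(window.size = 1L)
at.border = simple.avg(toy, 1)  # mean of 1 and 2
in.middle = simple.avg(toy, 3)  # mean of 2, 3 and 4
```

Each of the created functions has the (time.series, i) signature that movingMean() expects for its avgFun argument.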

Example: Likelihood function for logistic regression I

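The log-likelihood function of logistic regression is known as

\[
\sum_{i=1}^{n} \left( y_i x_i^T \theta - \log(1 + \exp(x_i^T \theta)) \right) = \mathbf{1}^T \left( y \odot X\theta - \log(1 + \exp(X\theta)) \right),
\]

where X is the design matrix with rows x_i^T, \odot denotes the elementwise product, and log and exp are applied elementwise.

It depends on the data x_i ∈ R^b (where b: number of columns in design matrix) as well as y_i ∈ {0, 1}.

If we want to implement it, we could create an individual implementation for every dataset. Or, we could just use a function factory instead: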

Example: Likelihood function for logistic regression II

mkLogLik = function(X, y) {
  if (nrow(X) != length(y))
    stop("Number of rows of X and length of y must be equal.")

  if (is.data.frame(X))
    X = as.matrix(X)

  # Convert a two-level factor to 0/1
  if (is.factor(y)) {
    y = droplevels(y)
    y = c(0, 1)[y]
  }

  # Note the leading minus: the returned function computes the
  # negative log-likelihood
  function(theta) {
    cp <- X %*% theta
    -sum(y * cp - log(1 + exp(cp)))
  }
}
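
Because the created function returns the negative log-likelihood, it can be handed directly to a minimizer. The following sketch assumes the mtcars dataset, a single covariate and method = "BFGS"; these choices are made for this illustration and are not taken from the slides. The factory is repeated so that the block runs on its own.

```r
# Factory copied from the slide above so that this sketch runs on its own.
mkLogLik = function(X, y) {
  if (nrow(X) != length(y))
    stop("Number of rows of X and length of y must be equal.")
  if (is.data.frame(X))
    X = as.matrix(X)
  if (is.factor(y)) {
    y = droplevels(y)
    y = c(0, 1)[y]
  }
  function(theta) {
    cp <- X %*% theta
    -sum(y * cp - log(1 + exp(cp)))
  }
}

# Design matrix with an intercept column and one covariate (car weight).
X = cbind(1, mtcars$wt)
y = mtcars$am

negLogLik = mkLogLik(X, y)

# For theta = 0, every observation contributes log(2).
at.zero = negLogLik(c(0, 0))

# Minimizing the negative log-likelihood maximizes the likelihood.
fit = optim(par = c(0, 0), fn = negLogLik, method = "BFGS")
```

The estimated parameters are found in fit$par.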

Example: Likelihood function for logistic regression III

Now, we can properly create a log-likelihood for two different classification problems:

irisLogLik = mkLogLik(iris[1:100, 1:4],
  iris[1:100, 5])
irisLogLik(1:4)

## [1] 861.6

irisLogLik = mkLogLik(iris[51:150, 1:4],
  iris[51:150, 5])
irisLogLik(1:4)

## [1] 1478

Caution: The enclosing environment of the function now contains a copy of the data. This can lead to increased memory requirements for large datasets. Sometimes, the enclosing environment accidentally contains a dataset that’s actually not needed.

Functions with memory


Recalling: The enclosing environment of a function can be used for the function to remember things.

Up until now:

counter = function() {
  i <<- i + 1
  i
}
environment(counter) = new.env()
environment(counter)$i = 0
counter()

## [1] 1

counter()

## [1] 2

Now also possible:

makeCounter <- function() {
  i <- 0
  function() {
    i <<- i + 1
    i
  }
}
counter = makeCounter()
counter()

## [1] 1

counter()

## [1] 2

Beware: Lazy evaluation can be tricky I

We’re writing a function that captures the current content of x and returns it upon being called.

capture = function(x) {
  function()
    x
}

Now, we’re setting x to 10 and checking the result:

x = 10
f1 = capture(x)
f2 = capture(x)
f1()

## [1] 10

x = 20
f2()

## [1] 20

Due to lazy evaluation, x is evaluated upon the first call of the functions respectively. The promise object looks for the current value of x and stores it in the respective enclosing environment.

Beware: Lazy evaluation can be tricky II

After both functions have been called for the first time, the values of x are
stored and no longer change:

x = 30
f1()

## [1] 10

x = 40
f2()

## [1] 20

We can avoid the delayed evaluation with force(x), which evaluates x as soon
as capture() is called:

capture = function(x) {
  force(x)
  function()
    x
}
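To compare both behaviors side by side, here is a small sketch. The names captureLazy and captureForced are chosen for illustration only; they correspond to the two versions of capture() shown on the slides.

```r
# Version without force(): x stays an unevaluated promise
captureLazy = function(x) {
  function() x
}

# Version with force(): x is evaluated when the factory is called
captureForced = function(x) {
  force(x)
  function() x
}

x = 10
lazy = captureLazy(x)
forced = captureForced(x)

x = 20
lazyValue = lazy()      # promise is evaluated only now and sees x = 20
forcedValue = forced()  # x was already evaluated while it was still 10
lazyValue
forcedValue
```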



Functions as input as well as output of functions


The next step

Having considered functions as either input or output of other functions, the
final step is to combine the two.

Possible scenarios:

- A function factory creates a function whose parameters depend on the result
  of calling another function.
- The function created by the function factory makes use of the input function
  itself.
- Most often, however, we want to modify the provided function and return the
  modified version. Such functions are also called functional operators.


Counting the number of calls

makeCountingFunction <- function(f) {
  counter = 0
  function(...) {
    counter <<- counter + 1
    f(...)
  }
}
sinCnt = makeCountingFunction(sin)
sinCnt(1)

## [1] 0.841471

sinCnt(1)

## [1] 0.841471

environment(sinCnt)$counter

## [1] 2
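Reading the counter through environment() works, but it exposes an implementation detail. One possible alternative, sketched here under the assumption that returning a list is acceptable, is to return the wrapped function together with an accessor. Both closures share the same enclosing environment. The name makeCounted is hypothetical.

```r
makeCounted = function(f) {
  counter = 0
  list(
    # Wrapped function: counts the call, then delegates to f
    fun = function(...) {
      counter <<- counter + 1
      f(...)
    },
    # Accessor: reads the counter from the shared enclosing environment
    count = function() counter
  )
}

cosCnt = makeCounted(cos)
cosCnt$fun(0)
cosCnt$fun(pi)
callsSoFar = cosCnt$count()
callsSoFar
```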


Memoization I

Store the results of all previous calls (example for simple numeric functions):

memoise = function(f) {
  cache = data.frame(x = NULL, val = NULL)
  function(x) {
    # Has x been evaluated before?
    inds = cache$x == x
    if(any(inds)) {
      # Look up the value for x in the cache
      return(cache$val[inds])
    } else {
      # Evaluate f at x and cache the result
      val = f(x)
      cache <<- rbind(cache, data.frame(x = x, val = val))
      return(val)
    }
  }
}
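As a variation on the same idea, the cache can also be kept in an environment that is used as a key-value store. The following is only a sketch under the assumption that as.character(x) identifies an input uniquely. The names memoiseEnv, slowSquare and nCalls are hypothetical; the call counter merely shows that the second call is served from the cache.

```r
memoiseEnv = function(f) {
  cache = new.env()
  function(x) {
    key = as.character(x)
    # Evaluate f only if no result is stored under this key yet
    if (!exists(key, envir = cache, inherits = FALSE)) {
      assign(key, f(x), envir = cache)
    }
    get(key, envir = cache, inherits = FALSE)
  }
}

nCalls = 0
slowSquare = function(x) {
  nCalls <<- nCalls + 1  # count how often the original function runs
  x^2
}

fastSquare = memoiseEnv(slowSquare)
first = fastSquare(3)
second = fastSquare(3)  # served from the cache, slowSquare is not called
nCalls
```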


Memoization II

Without memoization:

fibo = function(x) {
  if(x < 2)
    return(x)
  Recall(x - 1) + Recall(x - 2)
}
system.time(fibo(30))

##    user  system elapsed
##    2.01    0.00    2.03

With memoization:

fibo2 = memoise(function(x) {
  if(x < 2)
    return(x)
  fibo2(x - 1) + fibo2(x - 2)
})
system.time(fibo2(30))

##    user  system elapsed
##    0.02    0.00    0.02

Caution: Recall() doesn't work in the memoized version, since it would always
call the inner, non-memoized function and thus bypass the cache.

The memoise package offers an implementation for arbitrary inputs. However,
the issue with Recall() remains.
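The effect of Recall() can be made visible by counting how often the inner function is actually evaluated. This is a sketch with a simplified list-based memoizer. The names memoiseSketch, fiboRecall, fiboNamed, nRecall and nNamed are hypothetical.

```r
# Simplified memoizer: results are stored in a list, keyed by the input
memoiseSketch = function(f) {
  cache = list()
  function(x) {
    key = as.character(x)
    if (is.null(cache[[key]])) {
      val = f(x)
      cache[[key]] <<- val
    }
    cache[[key]]
  }
}

nRecall = 0
fiboRecall = memoiseSketch(function(x) {
  nRecall <<- nRecall + 1
  if (x < 2) return(x)
  Recall(x - 1) + Recall(x - 2)        # calls the inner function directly
})

nNamed = 0
fiboNamed = memoiseSketch(function(x) {
  nNamed <<- nNamed + 1
  if (x < 2) return(x)
  fiboNamed(x - 1) + fiboNamed(x - 2)  # goes through the memoized wrapper
})

fiboRecall(15)
fiboNamed(15)
c(nRecall = nRecall, nNamed = nNamed)
```

With Recall(), only the top-level call passes through the cache, so the inner function is evaluated far more often than in the variant that calls the memoized function by name.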


Transform each input with another function

From a mathematical point of view, we form the composition of two functions:

# Note: this definition masks base::transform()
transform = function(fun, trans) {
  function(...) {
    fun(trans(...))
  }
}

square = function(x) x^2
square(4)

## [1] 16

transformed = transform(square, sqrt)
transformed(4)

## [1] 4
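The same idea extends to more than two functions. As a sketch, Reduce() can apply any number of functions one after another. The name composeAll is hypothetical, and the functions are applied from left to right, i.e. the first argument is applied first.

```r
composeAll = function(...) {
  funs = list(...)
  function(x) {
    # Start with x and feed the intermediate result into the next function
    Reduce(function(acc, f) f(acc), funs, x)
  }
}

square = function(x) x^2

# Equivalent to square(sqrt(x))
sqrtThenSquare = composeAll(sqrt, square)
result = sqrtThenSquare(4)
result

# Three steps: sqrt, then add one, then square
threeSteps = composeAll(sqrt, function(x) x + 1, square)
result3 = threeSteps(16)
result3
```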


General pattern
(Almost) all functional operators follow the same pattern:

funOp <- function(f, otherargs) {
  # Maybe initialize some objects
  function(...) {
    # Maybe do something
    res <- f(...)
    # Maybe do something else
    res
  }
}

Following this pattern, we can add almost arbitrary functionality to existing
functions without knowing, let alone understanding, their implementation:

- Transforming inputs/outputs
- Additional parameter checks
- Executing the function repeatedly (e.g. Vectorize())
- Printing additional output to the console
- Remembering things
- ...
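As one instance of this pattern, the following sketch adds a parameter check to an existing function. The name withNumericCheck and the error message are hypothetical; the wrapped function sqrt is used only as an example.

```r
withNumericCheck = function(f) {
  function(x, ...) {
    # "Maybe do something": validate the input before calling f
    if (!is.numeric(x)) {
      stop("x must be numeric")
    }
    f(x, ...)
  }
}

safeSqrt = withNumericCheck(sqrt)
okResult = safeSqrt(9)
okResult

# A non-numeric input is rejected before sqrt() is ever called
errMessage = tryCatch(safeSqrt("9"), error = function(e) conditionMessage(e))
errMessage
```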
Summary


Functional programming (in R): What’s it about?

In R, functional programming is based on two essential principles:

1. Functions are not static. They are ordinary data objects that can be
   manipulated, stored in lists or passed as input to other functions.
2. Functions are closures and can remember things. We can manipulate
   functions by modifying their enclosing environment. The concept of
   closures resembles object-oriented programming:

   Objects are data with methods. Closures are functions with data.
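The first principle can be illustrated with a short sketch: functions are stored in a list like any other object and then passed to another function. The list summaries and the helper spread are illustrative names only.

```r
# Functions stored in a named list, including an anonymous one
summaries = list(
  mean = mean,
  median = median,
  spread = function(v) max(v) - min(v)
)

x = c(1, 2, 3, 10)

# Each list element is a function that is passed to, and called by, sapply()
res = sapply(summaries, function(f) f(x))
res
```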


Why should I use functional programming?

It is not necessary to rely on functional programming in R: everything can
also be realized using imperative programming.

Yet functional programs have some advantages over imperative programs:

- Short and simple functions are much easier to write and maintain.
- Programs tend to be well structured.
- FP is close to mathematics and therefore often meets us halfway.

Functional programming is another powerful tool for us in R. It is up to us
to decide which tool is best for a given task.

In particular, if we want to adjust the behavior of existing functions without
being able to modify their source code, we have to resort to functional
programming.
