08 Functions

The document discusses creating and using user-defined functions in R. It covers topics like named functions, anonymous functions, using functions in loops, and the apply family of functions. Examples are provided to illustrate how to write functions and apply them across variables or datasets using loops or apply functions.

Chapter 8: Your Own Functions!

Tyson S. Barrett
Summer 2017
Utah State University

Introduction

Creating a Function

Named Functions

Anonymous Functions

Why Write Your Own?

Loops with User-Defined Functions

Conclusions

Introduction

User-defined functions (functions you create yourself) are an important tool.
We will show how you can make your own functions.

What is a Function Anyway?

A function is a special R object that does something to another object or to the environment.
For example,

• ggplot() takes a data.frame object and variable names and produces a plot.
• lm() takes a formula and a data.frame and fits a statistical model.
• gather() takes a data.frame and produces another data.frame in another form.

Creating a Function

It all starts with the function function().

myfunction <- function(arguments){
  stuff = that(you)
  want = it + to + do + 4
  you
}

After function() we use { and }. Everything in between is what you want the function to do.
All you need to do is run the function definition once, and then you can use the function in the R session.
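As a concrete, runnable version of the template above, here is a minimal sketch. The name add_four and its argument are made up for illustration and are not part of the original slides.

```r
# Hypothetical example: a function that adds 4 to whatever it is given
add_four <- function(x){
  result <- x + 4
  result   # the last evaluated expression is what the function returns
}

# Once the definition has been run, the function can be called like any other
add_four(1:3)
```

Running the definition only creates the function; nothing is computed until it is called.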

Named Functions

These are functions that you assign a name to.

mean2 <- function(x){
  n <- length(x)
  m <- (1/n) * sum(x)
  return(m)
}

Now, mean2() is a function you can use.
What does this function do?

Let’s give it a try using the vector v1.

v1 <- c(1,3,2,4,2,1,2,1,1,1) ## vector to try

mean2(v1) ## our function

## [1] 1.8

mean(v1) ## the base R function

## [1] 1.8

For practice, we will write one more function. Let’s make a function
that takes a vector and gives us the N, the mean, and the standard
deviation.

important_statistics <- function(x, na.rm=FALSE){
  N <- length(x)
  M <- mean(x, na.rm=na.rm)
  SD <- sd(x, na.rm=na.rm)

  final <- c(N, M, SD)
  return(final)
}


One of the first things you should note is that we included a second
argument in the function seen as na.rm=FALSE (you can have as
many arguments as you want within reason).

• This argument has a default that we provide as FALSE.
• We take what is provided in na.rm and give that to both the mean() and sd() functions.
• Finally, you should notice that we took several pieces of information and combined them into the final object and returned that.
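To see what the na.rm argument changes, here is a small sketch using a made-up vector with one missing value (v_missing is hypothetical, not from the slides). It also shows that length() still counts the missing value, so N is not affected by na.rm in this version of the function.

```r
# The function from the slides, repeated so this block runs on its own
important_statistics <- function(x, na.rm = FALSE){
  N  <- length(x)
  M  <- mean(x, na.rm = na.rm)
  SD <- sd(x, na.rm = na.rm)
  c(N, M, SD)
}

v_missing <- c(1, 3, NA, 4)   # hypothetical vector with one missing value

important_statistics(v_missing)                # mean and SD are NA by default
important_statistics(v_missing, na.rm = TRUE)  # NA is dropped inside mean() and sd()

# length() counts the NA as well; this counts only the observed values
sum(!is.na(v_missing))
```

If N should reflect only the observed values, sum(!is.na(x)) could be used in place of length(x); that is a design choice rather than something the slides prescribe.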


Let’s try it out with the vector we created earlier.

important_statistics(v1)

## [1] 10.000000 1.800000 1.032796


Looks good, but we may want to change a few aesthetics. In the following code, we adjust it so we have each one labeled.

important_statistics2 <- function(x, na.rm=FALSE){
  N <- length(x)
  M <- mean(x, na.rm=na.rm)
  SD <- sd(x, na.rm=na.rm)

  final <- data.frame(N, "Mean"=M, "SD"=SD)
  return(final)
}
important_statistics2(v1)

##    N Mean       SD
## 1 10  1.8 1.032796
Anonymous Functions

Sometimes it is not worth saving a function, but you still want to use it, generally within loops.

function(x) thing(that, you, want, it, to, do, with, x)

We will show a few of these in the looping section. They are identical in nature to named functions; they just aren’t named.
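A minimal sketch of an anonymous function in use, next to the equivalent named function. The name square is made up for illustration.

```r
# Square each number with a function that is never given a name
sapply(1:5, function(x) x^2)

# The same thing with a named function, for comparison
square <- function(x) x^2
sapply(1:5, square)
```

Both calls give the same result; the anonymous version simply skips the step of saving the function first.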

Why Write Your Own?

Several reasons exist.

• Looping
• Adjusting output
• Performing a special function
• Other customization

We are going to talk about looping in depth.

Looping

Writing your own functions for looping is very common and practical.
A loop repeats the same operation over a set of items.
For example:

• We may want a certain statistic (like a mean) for every continuous variable in the data set.
• We may want to remove 999 from every variable in a data set.
• We may want to change variable types of certain variables across the whole data set.
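As an illustration of the second task in the list above, here is a sketch that treats 999 as a missing-value code and replaces it with NA in every variable. The data set survey and the function recode_999 are hypothetical.

```r
# Hypothetical data where 999 was used as a missing-value code
survey <- data.frame(age    = c(25, 999, 40),
                     income = c(999, 52, 61))

# Replace 999 with NA in a single vector
recode_999 <- function(x){
  x[x == 999] <- NA
  x
}

# Apply it to every variable; assigning into survey[] keeps the data.frame structure
survey[] <- lapply(survey, recode_999)
survey
```

The function only knows how to handle one vector; the loop (here lapply()) is what repeats it across the whole data set.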


Examples in R include:

• for loops
• the apply family of functions

For Loops

for (i in 1:10){
  print(mean(data[, i])) ## print() is needed to see a value inside a for loop
}


Another example:

library(tidyverse)
data = read.csv("~/Dropbox/Teaching/R for Social Sciences/D
select(Prod1, MentalApt, PhysApt, Income, Children, SubsU
thing = list()
for (i in 1:8){
  thing[[i]] = cbind(mean(data[, i]), sd(data[, i]))
}


thing

## [[1]]
##      [,1]     [,2]
## [1,]  3.2 1.146423
##
## [[2]]
##      [,1]     [,2]
## [1,]  5.2 2.305273
##
## [[3]]
##          [,1]     [,2]
## [1,] 5.733333 1.869556
##
## [[4]]


I like for loops. They are easy to understand and fiddle with, after
some practice.
However, they used to be slow in R and so they have a bad
reputation.
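A for loop does not keep or print its results on its own, so a common pattern is to create a container first and fill it in. The sketch below assumes a small made-up data set called toy.

```r
# Hypothetical data set with three numeric variables
toy <- data.frame(x = c(1, 2, 3), y = c(4, 5, 6), z = c(7, 8, 9))

# Create a place to store one mean per variable
means <- numeric(ncol(toy))
names(means) <- names(toy)

# seq_along(toy) gives 1, 2, 3: one index per variable
for (i in seq_along(toy)){
  means[i] <- mean(toy[, i])
}
means
```

Using seq_along() rather than a fixed range such as 1:10 keeps the loop in step with however many variables the data set actually has.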

The apply family

There are several apply functions in R that do loops for you.

• apply()
• sapply()
• lapply()
• tapply()
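Of the functions listed above, apply() works over the rows or columns of a matrix. A minimal sketch with a made-up matrix m:

```r
# Hypothetical 2 x 3 matrix
m <- matrix(1:6, nrow = 2)

apply(m, 1, sum)   # MARGIN = 1: one sum per row
apply(m, 2, sum)   # MARGIN = 2: one sum per column
```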

sapply

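sapply() repeats a function over each element and simplifies the results into a vector where it can.
Both of these compute the same means. The for loop has to store each result itself, while sapply() returns them directly.

means <- numeric(10)
for (i in 1:10){
  means[i] <- mean(data[, i])
}
sapply(1:10, function(i) mean(data[, i]))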


You can also just provide the data.frame, and sapply() assumes you want the
function (in this case mean()) applied to each variable.

sapply(data, mean)

##      Prod1  MentalApt    PhysApt     Income   Children
##  3.2000000  5.2000000  5.7333333 53.3333333  0.4000000
##      Ment1      Ment2
## 13.8666667  6.0666667
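Passing a whole data.frame to sapply() with mean() works cleanly only when every variable is numeric. When a data set also holds other variable types, one option is to select the numeric variables first. This is a sketch with a made-up data set toy.

```r
# Hypothetical data with one non-numeric variable
toy <- data.frame(score = c(1, 2, 3),
                  group = c("a", "b", "a"),
                  hours = c(10, 20, 30))

# sapply(toy, is.numeric) gives TRUE/FALSE per variable; use it to pick columns
numeric_vars <- sapply(toy, is.numeric)
sapply(toy[numeric_vars], mean)
```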

lapply

lapply() produces a list. Just like sapply(), it takes a data.frame and a function and applies the function across the variables.

lapply(data, mean)

## $Prod1
## [1] 3.2
##
## $MentalApt
## [1] 5.2
##
## $PhysApt
## [1] 5.733333
##
## $Income

tapply

tapply() is a bit different from the rest. It doesn’t necessarily do much in terms of
looping (although you can have it do that). Instead, it applies a function within each
level of a grouping variable. With the tidyverse, however, it is not used as often anymore.

tapply(data$Income, data$Children, mean)

##        0        1        2
## 53.18182 55.00000 52.50000

How could you do this in the tidyverse framework?
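One possible answer, sketched with dplyr's group_by() and summarise(). The data set toy is a made-up stand-in for the data used in the slides, and mean_income is just a label chosen for the new column.

```r
library(dplyr)

# Hypothetical stand-in for the data set used in the slides
toy <- data.frame(Income   = c(50, 60, 55, 40, 65),
                  Children = c(0, 0, 1, 2, 2))

# One row per value of Children, with the mean Income in each group
toy %>%
  group_by(Children) %>%
  summarise(mean_income = mean(Income))
```

Unlike tapply(), which returns a named vector, this returns a data frame, so the result can be passed on to further tidyverse steps.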

Loops with User-Defined Functions

Going back to our important_statistics2() function:

important_statistics2 <- function(x, na.rm=FALSE){
  N <- length(x)
  M <- mean(x, na.rm=na.rm)
  SD <- sd(x, na.rm=na.rm)

  final <- data.frame(N, "Mean"=M, "SD"=SD)
  return(final)
}

Let’s put it in a loop.


lapply(data, important_statistics2)

## $Prod1
##    N Mean       SD
## 1 15  3.2 1.146423
##
## $MentalApt
##    N Mean       SD
## 1 15  5.2 2.305273
##
## $PhysApt
##    N     Mean       SD
## 1 15 5.733333 1.869556
##
## $Income


sapply(data, important_statistics2)

##      Prod1    MentalApt PhysApt  Income   Children  Subs
## N    15       15        15       15       15        15
## Mean 3.2      5.2       5.733333 53.33333 0.4       0.26
## SD   1.146423 2.305273  1.869556 12.63027 0.7367884 0.45
##      Ment2
## N    15
## Mean 6.066667
## SD   2.65832


It applied our function across the variables in the data.frame, so we can easily get the information we want.
This was a very simplified version of how I created the table1() function in the furniture package.
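If a single summary table is wanted rather than a list, one option is to stack the one-row data.frames that lapply() returns. This is only a sketch on a made-up data set toy; it is not how table1() is implemented.

```r
# The function from the slides, repeated so this block runs on its own
important_statistics2 <- function(x, na.rm = FALSE){
  N  <- length(x)
  M  <- mean(x, na.rm = na.rm)
  SD <- sd(x, na.rm = na.rm)
  data.frame(N, "Mean" = M, "SD" = SD)
}

# Hypothetical data set
toy <- data.frame(a = c(1, 2, 3), b = c(2, 4, 6))

# lapply() gives a list of one-row data.frames; rbind stacks them into one table
summary_table <- do.call(rbind, lapply(toy, important_statistics2))
summary_table
```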

Conclusions

Writing your own functions takes time and practice but it can be a
worthwhile tool in using R.
I recommend you start simple and start soon.
Ultimately, you can make your own group of functions you use often
and create a package for it so others can use them too :)
