Functions
Estimated time needed: 30 minutes
Objectives
After completing this lab you will be able to:
You will be working on this lab in cloud-based RStudio hosted by IBM Skills Network Labs.
Note that your lab will be reset after about 1 hour of inactivity. It’s recommended to back up the files you create.
The table has one row for each movie and several columns
getwd()
In the Files panel on the right, if your current directory is not /resources/rstudio,
click the resources folder, where you should find an rstudio folder. This will be your
current working directory in RStudio.
After the download, you should see the dataset in the working directory as movies-db.csv.
What is a Function?
A function is a re-usable block of code which performs operations specified in the function.
There are two types of functions:
Pre-defined functions
User-defined functions
Pre-defined functions are those that are already defined for you, whether it’s in R or within a package.
For example, sum() is a pre-defined function that returns the sum of its numeric inputs.
User-defined functions are custom functions created and defined by the user.
For example, you can create a custom function to print Hello World.
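As a quick illustration of the two kinds, the sketch below calls the pre-defined sum() and then defines a small custom function. The names total and sayHello are hypothetical, chosen for this example only.

```r
# sum() is pre-defined in base R: it adds up its numeric inputs
total <- sum(1, 2, 3)
print(total)  # [1] 6

# sayHello is a hypothetical user-defined function (not part of the lab files)
sayHello <- function() {
  print("Hello World")
}

# Calling the function runs the code inside its body
sayHello()
```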
Pre-defined functions
There are many pre-defined functions, so let’s start with some simple ones.
Using the mean() function, let’s get the average of these three movie ratings:
11/24/24, 7:15 AM about:blank
Star Wars (1977) - rating of 8.7
Jumanji - rating of 6.9
Back to the Future - rating of 8.5
Let’s try some pre-defined functions in RStudio. Click File->New File->R Script, create a file called
predefined.R.
Copy and run the following lines in predefined.R to call the pre-defined mean() function:
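A minimal sketch of what this step could look like, assuming the three ratings listed above are stored in a vector named ratings (the name used by the sort() call in the next step). The variable average is a hypothetical name.

```r
# Store the three movie ratings in a numeric vector
ratings <- c(8.7, 6.9, 8.5)

# mean() returns the arithmetic average of the values in the vector
average <- mean(ratings)
print(average)  # [1] 8.033333
```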
We can use the sort() function to sort the movie ratings in ascending order.
sort(ratings)
You can also sort in decreasing order by adding the argument decreasing = TRUE.
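The two sort orders can be compared side by side, as in this sketch. It assumes ratings holds the three ratings listed earlier; ascending and descending are hypothetical variable names.

```r
# The three movie ratings from the example above
ratings <- c(8.7, 6.9, 8.5)

# Default behaviour: ascending order
ascending <- sort(ratings)
print(ascending)   # [1] 6.9 8.5 8.7

# decreasing = TRUE reverses the order
descending <- sort(ratings, decreasing = TRUE)
print(descending)  # [1] 8.7 8.5 6.9
```

Note that sort() returns a new vector and leaves ratings itself unchanged.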
Coding Exercise: Get the maximum (max() function) and minimum (min() function) rating values from the ratings vector
Click here to see solution
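One possible solution, sketched under the assumption that ratings holds the three ratings listed earlier. The names highest and lowest are hypothetical.

```r
# The three movie ratings from the example above
ratings <- c(8.7, 6.9, 8.5)

# max() and min() return the largest and smallest values in the vector
highest <- max(ratings)
lowest <- min(ratings)

print(highest)  # [1] 8.7
print(lowest)   # [1] 6.9
```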
We will be introducing a variety of pre-defined functions to you as you learn more about R.
There are just too many functions, so there’s no way we can teach them all in one sitting.
But if you’d like to take a quick peek, here’s a short reference card for some of the commonly-used pre-defined functions:
R Reference Card
User-defined functions
Now let’s move on to user-defined functions. First, create another R script file called userdefined.R to
hold all the functions we create.
This time, we will write all user-defined functions in the script file, then run them from the console.
This gives us a clean separation between code definition and code execution.
Let’s start with a simple print function. Copy the following function into userdefined.R:
printHelloWorld <- function(){
  print("Hello World")
}
Let’s create the function object (without calling it yet) by clicking the Source button. Also make sure the last line
of the file is an empty line.
You should then see a printHelloWorld function object in the workspace, which means the R interpreter has created a function object for us to call.
It can be called from both the console and script files.
printHelloWorld()
As you can see, the printHelloWorld() function has no inputs or arguments, but what if you want the function to
provide some output based on some inputs?
Let’s take a look at an add function. Copy the following lines into userdefined.R and
click the Source icon to create the add function object.
Remember to click the Source button again every time you update the script file.
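A minimal sketch of such an add function, assuming it takes two numeric inputs. The argument names x and y and the variable result are assumptions, not taken from the lab files. This version relies on R returning the value of the last evaluated expression.

```r
# x and y are assumed argument names for the two numbers to add
add <- function(x, y) {
  x + y  # the value of the last evaluated expression is returned implicitly
}

# Call the function with two inputs
result <- add(3, 4)
print(result)  # [1] 7
```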
As you can see above, you can create functions with the following syntax to take in inputs (as their arguments), then provide some output.
f <- function(<arguments>) {
  Do something
  Do something
  return(some_output)
}
It’s good practice to use the return() function to explicitly tell the function what to return, so please
update the previous add function to use return().
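The updated function might look like the sketch below. As before, the argument names x and y are assumptions; total and result are hypothetical variable names.

```r
# x and y are assumed argument names for the two numbers to add
add <- function(x, y) {
  total <- x + y
  return(total)  # explicitly hand the result back to the caller
}

result <- add(3, 4)
print(result)  # [1] 7
```

The behaviour is the same as with an implicit return; the explicit return() only makes the intent easier to read.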
Copy the following isGoodRating function into userdefined.R and run the script file:
isGoodRating <- function(rating){
  # This function returns "NO" if the input value is less than 7. Otherwise it returns "YES".
  if(rating < 7){
    return("NO") # return NO if the rating is below 7
  }else{
    return("YES") # otherwise return YES
  }
}
You can call isGoodRating in the console with different inputs to cover the two logic branches:
isGoodRating(6)
isGoodRating(9.5)
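The discussion that follows passes a second argument named threshold to isGoodRating. A minimal sketch of the function extended with that argument is shown here, assuming a default value of 7 so that it matches the fixed cutoff used so far.

```r
# threshold = 7 is an assumed default, chosen to match the earlier fixed cutoff
isGoodRating <- function(rating, threshold = 7) {
  if (rating < threshold) {
    return("NO")   # rating is below the threshold
  } else {
    return("YES")  # rating meets or exceeds the threshold
  }
}

# With no second argument, the default threshold of 7 is used
print(isGoodRating(6))    # [1] "NO"
print(isGoodRating(9.5))  # [1] "YES"
```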
Notice how we did not have to explicitly specify the second argument (threshold), but we could specify it.
Let’s say we have a higher standard for movie ratings, so let’s bring our threshold up to 8.5:
isGoodRating(8, threshold = 8.5)
Great! Now you know how to use default values. Note that if you know
the order of the arguments, you do not need to write out the argument name, as in:
isGoodRating(8, 8.5)
Coding Practice: Write an is-bad-rating function that prints YES if the rating is under 5 and prints NO if the rating is 5 or above
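One way to approach this practice is sketched below. The function name isBadRating and the threshold argument are hypothetical, and treating a rating of exactly 5 as not bad is an assumption.

```r
# isBadRating and its threshold argument are hypothetical names for this exercise
isBadRating <- function(rating, threshold = 5) {
  if (rating < threshold) {
    print("YES")  # rating is under the threshold, so it is a bad rating
  } else {
    print("NO")   # rating is at or above the threshold (assumed to count as not bad)
  }
}

isBadRating(3)  # prints "YES"
isBadRating(8)  # prints "NO"
```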
Using functions within functions is no big deal.
In fact, you’ve already used the print() and return() functions.
Let’s create a function that can help us decide on which movie to watch, based on its rating.
We should be able to provide the name of the movie, and it should return NO
if the movie rating is below 7, and YES otherwise.
First, in the console, let’s read the movies data into the workspace so that all functions can use it:
my_data <- read.csv("movies-db.csv")
head(my_data)
You should see the first rows of the data in the console and my_data in the Environment panel.
Next, do you remember how to check the value of the average_rating column if we specify a movie name?
Here’s how.
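The lookup can be sketched with a small stand-in table. The data frame toy_data and its contents are made up for illustration and are not taken from movies-db.csv; only the column names name and average_rating follow the lab dataset.

```r
# toy_data is a made-up stand-in for my_data; its titles and ratings are illustrative only
toy_data <- data.frame(
  name = c("Movie A", "Movie B"),
  average_rating = c(6.1, 8.2),
  stringsAsFactors = FALSE
)

# Keep the rows whose name matches, and select only the average_rating column
rating <- toy_data[toy_data$name == "Movie B", "average_rating"]
print(rating)  # [1] 8.2
```

With the lab dataset loaded, the same pattern would apply to my_data, using a movie title from the name column in place of "Movie B".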
Open the userdefined.R script file, add a new watchMovie function, and click Source to run the file:
watchMovie <- function(data, moviename){
  rating <- data[data["name"] == moviename,"average_rating"]
  return(isGoodRating(rating))
}
Calling watchMovie(my_data, "Akira") in the console should print YES, meaning we should watch this movie because of its high rating.
Make sure you take the time to understand the function above. Notice how the function expects two inputs: data and moviename, and so when we use the function,
we must also input two arguments.
But what if we only want to watch really good movies? How do we use the rating threshold that we created earlier?
Here’s how we can do it. Update the watchMovie function with a third input, my_threshold, which has a default value:
watchMovie <- function(data, moviename, my_threshold = 7){
  rating <- data[data[,1] == moviename,"average_rating"]
  return(isGoodRating(rating, threshold = my_threshold))
}
Now our watchMovie takes three inputs: data, moviename, and my_threshold. Let’s call it from the console:
watchMovie(my_data, "Akira", 7)
As you can imagine, if we assign the output to a variable, the variable will hold the value YES:
is_watch <- watchMovie(my_data, "Akira")
is_watch
We can also use the built-in paste() function to concatenate a sequence of character strings together into a single string.
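A short sketch of paste() on its own is given here. The variables moviename, label, and joined are hypothetical names used only for this example.

```r
moviename <- "Akira"

# By default, paste() joins its arguments with a single space
label <- paste("The movie is", moviename)
print(label)   # [1] "The movie is Akira"

# The sep argument changes the separator placed between the pieces
joined <- paste("a", "b", sep = "-")
print(joined)  # [1] "a-b"
```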
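Now update the watchMovie function to print the movie name and actual rating:
watchMovie <- function(data, moviename, my_threshold = 7){
  rating <- data[data[,1] == moviename,"average_rating"]
  memo <- paste("The movie rating for", moviename, "is", rating)
  print(memo)
  return(isGoodRating(rating, threshold = my_threshold))
}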
Coding Exercise: update the watchMovie function to use the mean rating of all movies as the threshold
Click here to see solution
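watchMovie <- function(moviename, my_threshold = 7){
  rating <- my_data[my_data[,1] == moviename,"average_rating"]
  # Get the mean rating of all movies and print it for inspection
  print(my_data$average_rating)
  mean_threshold <- mean(my_data$average_rating)
  print(mean_threshold)
  memo <- paste("The movie rating for", moviename, "is", rating)
  print(memo)
  # Use the mean rating, rather than my_threshold, as the threshold
  return(isGoodRating(rating, threshold = mean_threshold))
}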
Now call the function from the console, then try to print memo:
watchMovie("Akira")
memo
The second command fails with an "object 'memo' not found" error. This is because all the variables we create in a function remain within the function. In technical terms,
memo is a local variable, meaning that the variable assignment does not persist outside the function.
The memo variable only exists within the function.
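The same behaviour can be reproduced without the dataset. In this sketch, buildMemo is a hypothetical function, and the check assumes no variable named memo has been created in the console beforehand.

```r
# buildMemo is a hypothetical function used only to illustrate local variables
buildMemo <- function(moviename) {
  memo <- paste("Now watching", moviename)  # memo is local to buildMemo
  print(memo)
}

buildMemo("Akira")

# memo was created inside the function, so it does not exist out here
memo_exists <- exists("memo", inherits = FALSE)
print(memo_exists)
```

To use a value computed inside a function, return it and assign the result to a variable in the calling code.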
Author(s)
Hi! It’s Aditya Walia, the author of this lab.
I hope you found R easy to learn! There’s lots more to learn about R but you’re well on your way.
Feel free to connect with me if you have any questions.
Other Contributor(s)
Yan Luo