R-Unit 2
UNIT - 2
Functions
Syntax:
func_name <- function (argument) {
statement
}
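As a minimal sketch of this syntax, the function below takes two arguments and prints a sentence describing the result. The slide does not show a definition for pow(), so the body here is an assumption, written to agree with the sample runs in the sections that follow.

```r
# Assumed definition: computes x raised to the power y and prints a message
pow <- function(x, y) {
  result <- x^y
  print(paste(x, "raised to the power", y, "is", result))
}

pow(8, 2)
```

Arguments can then be passed by position, by name, or as a mixture of both.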
Named Arguments
Sample run:
> pow(8, 2)
[1] "8 raised to the power 2 is 64"
> pow(x = 8, y = 2)
[1] "8 raised to the power 2 is 64"
> pow(y = 2, x = 8)
[1] "8 raised to the power 2 is 64"
Sample run:
> pow(x=8, 2)
[1] "8 raised to the power 2 is 64"
> pow(2, x=8)
[1] "8 raised to the power 2 is 64"
Default values for Arguments
Example:
Sample run:
> pow(3)
[1] "3 raised to the power 2 is 9"
> pow(3,1)
[1] "3 raised to the power 1 is 3"
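The example code for this slide is not present, so the following is a sketch that is consistent with the sample run above. It assumes the same pow() function with a default of 2 for the second argument.

```r
# Assumed definition: y defaults to 2 when the caller does not supply it
pow <- function(x, y = 2) {
  result <- x^y
  print(paste(x, "raised to the power", y, "is", result))
}

pow(3)      # y takes its default value, 2
pow(3, 1)   # the default is overridden by the supplied value
```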
Return Value from Function
Syntax:
return(expression)
Example:
check <- function(x) {
  if (x > 0) {
    result <- "Positive"
  } else if (x < 0) {
    result <- "Negative"
  } else {
    result <- "Zero"
  }
  return(result)
}
Sample run:
> check(1)
[1] "Positive"
> check(-10)
[1] "Negative"
> check(0)
[1] "Zero"
Functions without return()
Example (the value of the last evaluated expression is returned):
check <- function(x) {
  if (x > 0) {
    result <- "Positive"
  } else if (x < 0) {
    result <- "Negative"
  } else {
    result <- "Zero"
  }
  result
}
Example (the same function with an explicit return() in each branch):
check <- function(x) {
  if (x > 0) {
    return("Positive")
  } else if (x < 0) {
    return("Negative")
  } else {
    return("Zero")
  }
}
Multiple Returns
Example:
multi_return <- function() {
my_list <- list("color" = "red", "size" = 20, "shape" = "round")
return(my_list)
}
Sample run:
> multi_return()
$color
[1] "red"
$size
[1] 20
$shape
[1] "round"
R Programming Environment
• An R environment can be thought of as a place to store and manage variables.
• Whenever an object or a function is created in R, an entry is added to the environment.
• An environment is therefore a collection of objects (functions, variables, etc.).
• An environment is created when we first start the R interpreter. Any variable we define is stored in this environment.
• By default, the top-level environment is the global environment, R_GlobalEnv.
• The global environment can also be referred to as .GlobalEnv in R code.
• The ls() function shows which variables and functions are defined in the current environment.
• The environment() function returns the current environment.
Example:
> a <- 2
> b <- 5
> f <- function(x) x<-0
> ls()
[1] "a" "b" "f"
> environment()
<environment: R_GlobalEnv>
> .GlobalEnv
<environment: R_GlobalEnv>
Cascading of environments
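The content of this slide is not present in the text. As a hedged illustration of the general idea, the sketch below shows that a function defined inside another function has the outer function's environment as its enclosing (parent) environment. The names outer_fn, inner_fn and outer_var are hypothetical.

```r
outer_fn <- function() {
  outer_var <- 10
  inner_fn <- function() {
    # environment() is inner_fn's own execution environment;
    # parent.env() gives the environment that encloses it
    parent.env(environment())
  }
  local_names <- ls()   # names defined so far inside outer_fn
  list(outer_env = environment(),
       inner_parent = inner_fn(),
       local_names = local_names)
}

res <- outer_fn()
identical(res$outer_env, res$inner_parent)   # TRUE
res$local_names
```

Calling ls() inside a function lists that function's local names rather than those of the global environment.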
R Programming Scope – Cont.
Recursive Function
Example:
# Recursive function to find factorial
recursive.factorial <- function(x) {
if (x == 0) return (1)
else return (x * recursive.factorial(x-1))
}
Sample run:
> recursive.factorial(0)
[1] 1
> recursive.factorial(5)
[1] 120
> recursive.factorial(7)
[1] 5040
Infix Operator
• Most of the operators that we use in R are binary operators (having two operands). Hence, they are
infix operators, used between the operands. Actually, these operators do a function call in the
background.
• For example, the expression a+b is actually calling the function `+`() with the arguments a and b, as
`+`(a, b).
Example:
> 5+3
[1] 8
> `+`(5,3) #operator within backtick or backquote
[1] 8
> 5-3
[1] 2
> `-`(5,3)
[1] 2
> 5*3-1
[1] 14
> `-`(`*`(5,3),1)
[1] 14
User-defined Infix Operator
• It is possible to create user-defined infix operators in R. This is done by naming a
function that starts and ends with %.
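A small sketch of this rule follows. The operator name %divisible% is hypothetical and chosen only for illustration; the name must be wrapped in backticks when the function is defined.

```r
# Hypothetical operator: TRUE when x is exactly divisible by y
`%divisible%` <- function(x, y) {
  x %% y == 0
}

10 %divisible% 5
10 %divisible% 3
`%divisible%`(10, 5)   # the same call written in function form
```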
Predefined infix operators
%% Remainder operator
%/% Integer division
%*% Matrix multiplication
%o% Outer product
%x% Kronecker product
%in% Matching operator
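A few of these operators in use, as a short sketch with arbitrary values:

```r
7 %% 2                       # remainder
7 %/% 2                      # integer division
m <- matrix(1:4, nrow = 2)
m %*% m                      # matrix multiplication
(1:2) %o% (1:3)              # outer product, a 2 x 3 matrix
3 %in% c(1, 2, 3)            # is 3 present in the vector?
```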
switch() function
Syntax:
switch (expression, list)
Example:
> switch(2,"red","green","blue")
[1] "green"
> switch(1,"red","green","blue")
[1] "red"
switch() function – Cont.
Examples:
> x <- switch(4,"red","green","blue")
> x
NULL
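The examples above select by position. switch() also accepts a character string, in which case the value is matched against the names of the remaining arguments. The sketch below assumes a hypothetical helper, day_type(), to show name matching, fall-through and a default value.

```r
# Hypothetical helper: classify a day name
day_type <- function(day) {
  switch(day,
         "Sat" = ,              # empty value falls through to the next one
         "Sun" = "weekend",
         "weekday")             # unnamed final value acts as the default
}

day_type("Sun")
day_type("Mon")
is.null(switch(4, "red", "green", "blue"))   # numeric index out of range gives NULL
```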
Creating Vectors
The vector() function is used to create a vector of a given type and length.
vector ("numeric", 5) # numeric vector with 0 at every index
vector ("complex", 5) # complex vector with 0+0i at every index
vector ("logical", 5) # logical vector with FALSE at every index
vector ("character", 5) # character vector with "" at every index
[1] 0 0 0 0 0
[1] 0+0i 0+0i 0+0i 0+0i 0+0i
[1] FALSE FALSE FALSE FALSE FALSE
[1] "" "" "" "" ""
Creating Vectors by Concatenation
Example:
> x <- c(1, 5, 4, 9, 0)
> typeof(x)
[1] "double"
> length(x)
[1] 5
> a <- 1
> is.vector(a)
[1] TRUE
Creating a vector using : operator
> x <- 1:7; x   # : is a binary operator
[1] 1 2 3 4 5 6 7
Using logical expression as index
Examples:
> x
[1] -3 -2 -1 0 1 2
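The indexing expressions for this slide are missing. As a sketch using the same vector, a logical condition applied to x produces a logical vector, and only the positions that are TRUE are kept.

```r
x <- c(-3, -2, -1, 0, 1, 2)
x[x < 0]             # elements where the condition is TRUE
x[x > 0]
x[c(TRUE, FALSE)]    # a shorter logical index is recycled: positions 1, 3, 5
```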
Using character as index
Examples:
> x["second"]
second
0
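Indexing by name works only when the vector has names. The slide does not show how x was created, so the sketch below assumes a small named vector; the names "first" and "third" and the values other than 0 are made up for illustration.

```r
# Assumed named vector; only the element "second" = 0 comes from the slide
x <- c(first = 3, second = 0, third = 9)
x["second"]
x[c("first", "third")]
```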
Modifying a vector
Examples:
> x
[1] -3 -2 -1 0 1 2
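The modification steps themselves are not shown on the slide. A possible sequence, offered as a sketch, assigns to an index to change elements and reassigns a subset to truncate the vector.

```r
x <- c(-3, -2, -1, 0, 1, 2)
x[2] <- 0          # modify the 2nd element
x[x < 0] <- 5      # modify every element that is less than 0
x <- x[1:4]        # truncate x to its first 4 elements
x
```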
Inserting Elements in a Vector
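myVector <- c(1, 2, 3, 4)
cat("Original Vector: ")
print(myVector)
myVector <- c(0, myVector)
cat("Appending 0 at the start of the vector: ")
print(myVector)
myVector <- c(myVector, 5)
cat("Appending 5 at the end of the vector: ")
print(myVector)
myVector <- c(myVector, c(6, 7, 8))
cat("Appending another vector at the end of the original vector: ")
print(myVector)
Output:
Original Vector: [1] 1 2 3 4
Appending 0 at the start of the vector: [1] 0 1 2 3 4
Appending 5 at the end of the vector: [1] 0 1 2 3 4 5
Appending another vector at the end of the original vector: [1] 0 1 2 3 4 5 6 7 8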
Deleting a vector
Examples:
> x
[1] -3 -2 -1 0 1 2
> x <- NULL   # assigning NULL deletes the vector
> x[4]
NULL
Operation on Vectors
Example:
> x <- c(2,8,3)
> y <- c(6,4,1)
> x+y
[1] 8 12 4
> x>y
[1] FALSE TRUE TRUE
Operation on Vectors
When the operand vectors differ in length (number of elements), the elements of the
shorter one are recycled cyclically to match the length of the longer one.
Examples:
> x <- c(2,1,8,3)
> y <- c(9,4)
> x+y # Element of y is recycled to 9,4,9,4
[1] 11 5 17 7
> x+c(1,2,3)
[1] 3 3 11 4
Warning message:
In x + c(1, 2, 3) :
  longer object length is not a multiple of shorter object length
Matrix
• Matrices are the R objects in which the elements are arranged in a two-dimensional rectangular layout.
• They contain elements of the same atomic types.
Syntax:
matrix(data, nrow, ncol, byrow, dimnames)
data is the input vector whose elements become the data elements of the matrix.
nrow is the number of rows to be created.
ncol is the number of columns to be created.
byrow is a logical flag. If TRUE, the input vector elements are arranged by row.
dimnames is a list of the names assigned to the rows and columns.
Creating a matrix
Example:
> colnames(x)
[1] "A" "B" "C"
> rownames(x)
[1] "X" "Y" "Z"
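The call that created x is not shown. A sketch that would give the row and column names printed above is given here; the data values 1 to 9 are an assumption.

```r
# Assumed data (1 to 9); the row and column names match the output above
x <- matrix(1:9, nrow = 3, ncol = 3,
            dimnames = list(c("X", "Y", "Z"), c("A", "B", "C")))
x
colnames(x)
rownames(x)
```

By default the matrix is filled column by column; passing byrow = TRUE fills it row by row instead.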
Creating a matrix – Cont.
Example:
> x <- c(1,2,3,4,5,6)
> x
[1] 1 2 3 4 5 6
> class(x)
[1] "numeric"
> class(x)
[1] "matrix"
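The step that turns the vector into a matrix between the two class() calls is missing from the text. One way to do it, shown as a sketch, is to assign a dimension attribute with dim(); the 2 x 3 shape used here is an assumption.

```r
x <- c(1, 2, 3, 4, 5, 6)
class(x)            # "numeric"
dim(x) <- c(2, 3)   # assumed shape: 2 rows, 3 columns
is.matrix(x)        # TRUE
x
```

In R 4.0.0 and later, class() on a matrix returns both "matrix" and "array", so is.matrix() is the more portable check.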
Accessing the elements of a matrix
Example:
> x
     [,1] [,2] [,3]
[1,]    1    4    7
[2,]    2    5    8
[3,]    3    6    9
> x[c(3,2),]   # leaving the column field blank selects all columns
     [,1] [,2] [,3]
[1,]    3    6    9
[2,]    2    5    8
Accessing the elements of a matrix
Example:
> x[,]   # leaving both the row and column fields blank selects the entire matrix
     [,1] [,2] [,3]
[1,]    1    4    7
[2,]    2    5    8
[3,]    3    6    9
Matrix
Example:
> a
     [,1] [,2] [,3]
[1,]    1    4    7
[2,]    2    5    8
[3,]    3    6    9
> class(a)
[1] "matrix"
> attributes(a)
$dim
[1] 3 3
> dim(a)
[1] 3 3
Accessing the elements of a matrix
Example:
> x[1,]
[1] 1 4 7
> class(x[1,])
[1] "integer"
> x[1,,drop=FALSE]   # now the result is a 1x3 matrix rather than a vector
     [,1] [,2] [,3]
[1,]    1    4    7
> class(x[1,,drop=FALSE])
[1] "matrix"
Indexing a matrix with a single vector
Example:
> x
     [,1] [,2] [,3]
[1,]    4    8    3
[2,]    6    0    7
[3,]    1    2    9
> x[1:4]
[1] 4 6 1 8
> x[c(3,5,7)]
[1] 1 0 3
Using logical vector as index
> x
     [,1] [,2] [,3]
[1,]    4    8    3
[2,]    6    0    7
[3,]    1    2    9
> x[c(TRUE,FALSE,TRUE),c(TRUE,TRUE,FALSE)]
     [,1] [,2]
[1,]    4    8
[2,]    1    2
> x[,"A"]
[1] 4 6 1
> x[TRUE,c("A","C")]
     A C
[1,] 4 3
[2,] 6 7
[3,] 1 9
> x[2:3,c("A","C")]
     A C
[1,] 6 7
[2,] 1 9
Modifying a Matrix
Example:
> x
     [,1] [,2] [,3]
[1,]    1    4    7
[2,]    2    5    8
[3,]    3    6    9
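Elements of a matrix are modified by assigning to an index. The assignments are not shown on the slide, so the two steps below are an assumption: they are one sequence that turns the matrix above into the one used in the rbind(), cbind() and t() example that follows.

```r
x <- matrix(1:9, nrow = 3)
x[2, 2] <- 10      # modify a single element
x[x < 5] <- 0      # modify every element that is less than 5
x
```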
Example:
> rbind(x,c(1,2,3))   # add row
     [,1] [,2] [,3]
[1,]    0    0    7
[2,]    0   10    8
[3,]    0    6    9
[4,]    1    2    3
> cbind(x, c(1, 2, 3))   # add column
     [,1] [,2] [,3] [,4]
[1,]    0    0    7    1
[2,]    0   10    8    2
[3,]    0    6    9    3
> t(x)   # transpose a matrix
     [,1] [,2] [,3]
[1,]    0    0    0
[2,]    0   10    6
[3,]    7    8    9
> x <- x[1:2,]; x   # remove last row
     [,1] [,2] [,3]
[1,]    0    0    7
[2,]    0   10    8
Modifying a Matrix
Example:
> x
     [,1] [,2] [,3]
[1,]    1    3    5
[2,]    2    4    6
Lists
• A list is a data structure whose components can be of mixed data types.
• A vector whose elements are all of the same type is called an atomic vector; a vector whose elements may be of different types is called a list.
list_data <- list("Shubha","Arpita",c(1,2,3,4,5),TRUE,FALSE,22.5,12L)
print(list_data)
Output:
[[1]]
[1] "Shubha"
[[2]]
[1] "Arpita"
[[3]]
[1] 1 2 3 4 5
[[4]]
[1] TRUE
[[5]]
[1] FALSE
[[6]]
[1] 22.5
[[7]]
[1] 12
# Creating a list containing a vector, a matrix and a list.
list_data <- list(c("Shubha","Nisha","Guna"), matrix(c(40,80,60,70,90,80), nrow = 2),
                  list("BCA","MCA","B.tech"))
# Giving names to the elements in the list.
names(list_data) <- c("Students", "Marks", "Course")
# Show the list.
print(list_data)
Output:
$Students
[1] "Shubha" "Nisha"  "Guna"
$Marks
     [,1] [,2] [,3]
[1,]   40   60   90
[2,]   80   70   80
$Course
$Course[[1]]
[1] "BCA"
$Course[[2]]
[1] "MCA"
$Course[[3]]
[1] "B.tech"
Accessing List Elements
[[1]]
[1] 1
Each element in a list can itself be another list, so single square brackets [ ] return a sub-list. To obtain a single element, use double square brackets [[ ]] instead.
[1] 1
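The difference between the two kinds of brackets can be sketched as follows. The list my_list is hypothetical, since the list used on the slide is not shown.

```r
my_list <- list(1, "a", TRUE)   # hypothetical list
my_list[1]           # single brackets: a list of length 1
my_list[[1]]         # double brackets: the element itself
class(my_list[1])
class(my_list[[1]])
```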
x <- list(TRUE, 25, "Apple")
names(x) <- c("In Stock", "Quantity", "Product")
print(x$'In Stock')
print(x$Quantity)
print(x$Product)
[1] TRUE
[1] 25
[1] "Apple"
Manipulation of list elements
# Creating a list containing a vector, a matrix and a list.
list_data <- list(c("Shubham","Arpita","Nishka"), matrix(c(40,80,60,70,90,80), nrow = 2),
list("BCA","MCA","B.tech"))
[[1]]
[1] "blackcurrant"
[[2]]
[1] "banana"
[[3]]
[1] "cherry"
thislist <- list("apple", "banana", "cherry")
[[1]]
[1] "apple"
[[2]]
[1] "banana"
[[3]]
[1] "orange"
[[4]]
[1] "cherry"
thislist <- list("apple", "banana", "cherry")
[[1]]
[1] "banana"
[[2]]
[1] "cherry"
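The statements that produced the three outputs above (an item changed to "blackcurrant", "orange" inserted after the second item, and "apple" removed) are missing from the text. The sketch below shows operations that give those results; the variable names changed, added and removed are hypothetical.

```r
thislist <- list("apple", "banana", "cherry")

changed <- thislist
changed[1] <- "blackcurrant"                      # change an item by index

added <- append(thislist, "orange", after = 2)    # insert after the 2nd item

removed <- thislist[-1]                           # drop the 1st item
```

Each operation here works on a copy, so thislist itself is left unchanged.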
thislist <- list("apple", "banana", "cherry", "orange", "kiwi", "melon", "mango")
(thislist)[2:5]
[[1]]
[1] "banana"
[[2]]
[1] "cherry"
[[3]]
[1] "orange"
[[4]]
[1] "kiwi"
list1 <- list("a", "b", "c")
list2 <- list(1, 2, 3)
list3 <- c(list1,list2)
list3
Output:
[[1]]
[1] "a"
[[2]]
[1] "b"
[[3]]
[1] "c"
[[4]]
[1] 1
[[5]]
[1] 2
[[6]]
[1] 3
# Creating two lists.
Even_list <- list(2,4,6,8,10)
Odd_list <- list(1,3,5,7,9)
Data Frame
A data frame is a table or a two-dimensional array-like structure in which each column contains
values of one variable and each row contains one set of values from each column.
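As a minimal sketch, a data frame is built with data.frame(), giving one named vector per column. The data below is the same Data_Frame that the later slides use.

```r
Data_Frame <- data.frame(
  Training = c("Strength", "Stamina", "Other"),
  Pulse = c(100, 150, 120),
  Duration = c(60, 30, 45)
)
Data_Frame
str(Data_Frame)   # one line per column, with its type
```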
Summarize the Data
Data_Frame <- data.frame (
  Training = c("Strength", "Stamina", "Other"),
  Pulse = c(100, 150, 120),
  Duration = c(60, 30, 45)
)
Data_Frame
summary(Data_Frame)
Output of summary(Data_Frame):
   Training             Pulse          Duration
 Length:3           Min.   :100.0   Min.   :30.0
 Class :character   1st Qu.:110.0   1st Qu.:37.5
 Mode  :character   Median :120.0   Median :45.0
                    Mean   :123.3   Mean   :45.0
                    3rd Qu.:135.0   3rd Qu.:52.5
                    Max.   :150.0   Max.   :60.0
Accessing the items of a Data Frame
Data_Frame <- data.frame (
  Training = c("Strength", "Stamina", "Other"),
  Pulse = c(100, 150, 120),
  Duration = c(60, 30, 45)
)
Data_Frame[1]
Data_Frame[["Training"]]
Data_Frame$Training
Output:
  Training
1 Strength
2  Stamina
3    Other
[1] Strength Stamina Other
[1] Strength Stamina Other
Add Rows
Data_Frame <- data.frame (
  Training = c("Strength", "Stamina", "Other"),
  Pulse = c(100, 150, 120),
  Duration = c(60, 30, 45)
)
# Add a new row
# Note: c("Strength", 110, 110) is a character vector, so Pulse and Duration
# become character columns in the result
New_row_DF <- rbind(Data_Frame, c("Strength", 110, 110))
New_row_DF
Output:
  Training Pulse Duration
1 Strength   100       60
2  Stamina   150       30
3    Other   120       45
4 Strength   110      110
Add Columns
Data_Frame <- data.frame (
  Training = c("Strength", "Stamina", "Other"),
  Pulse = c(100, 150, 120),
  Duration = c(60, 30, 45)
)
# Add a new column
New_col_DF <- cbind(Data_Frame, Steps = c(1000, 6000, 2000))
New_col_DF
Output:
  Training Pulse Duration Steps
1 Strength   100       60  1000
2  Stamina   150       30  6000
3    Other   120       45  2000
Remove Rows and Columns
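The example for this slide is missing. A common approach, sketched here with the same Data_Frame as the neighbouring slides, is to use negative indices; the name Data_Frame_New is hypothetical.

```r
Data_Frame <- data.frame(
  Training = c("Strength", "Stamina", "Other"),
  Pulse = c(100, 150, 120),
  Duration = c(60, 30, 45)
)

# Negative indices drop rows and columns; here the first row and the first column
Data_Frame_New <- Data_Frame[-c(1), -c(1)]
Data_Frame_New
```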
Dimension of Dataframe
dim(Data_Frame)
Data_Frame <- data.frame (
Training = c("Strength", "Stamina", "Other"),
Pulse = c(100, 150, 120),
Duration = c(60, 30, 45)
)
Try the length() function as well; for a data frame it returns the number of columns.
ncol(Data_Frame)
nrow(Data_Frame)
[1] 3
[1] 3
Combining Data Frames
Data_Frame3 <- data.frame (
  Training = c("Strength", "Stamina", "Other"),
  Pulse = c(100, 150, 120),
  Duration = c(60, 30, 45)
)
Data_Frame4 <- data.frame (
  Steps = c(3000, 6000, 2000),
  Calories = c(300, 400, 300)
)
# Combine the two data frames column-wise
cbind(Data_Frame3, Data_Frame4)
Output:
  Training Pulse Duration Steps Calories
1 Strength   100       60  3000      300
2  Stamina   150       30  6000      400
3    Other   120       45  2000      300
Data Slicing
The data frame df:
  ID       items store price
1 10        book  TRUE   2.5
2 20         pen FALSE   8.0
3 30    textbook  TRUE  10.0
4 40 pencil_case FALSE   7.0
# Select row 1 in column 2
df[1,2]
[1] book
# Select Rows 1 to 2
df[1:2,]
  ID items store price
1 10  book  TRUE   2.5
2 20   pen FALSE   8.0
# Select Columns 1
df[,1]
[1] 10 20 30 40
# Select Rows 1 to 3 and columns 3 to 4
df[1:3, 3:4]
  store price
1  TRUE   2.5
2 FALSE   8.0
3  TRUE  10.0
# Slice with column name
df[, c('ID', 'store')]
  ID store
1 10  TRUE
2 20 FALSE
3 30  TRUE
4 40 FALSE
# Select price above 5
subset(df, subset = price > 5)
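The code that creates df is not shown. The sketch below rebuilds it from the values printed in the table above and runs the subset() call; the variable name expensive is hypothetical.

```r
df <- data.frame(
  ID = c(10, 20, 30, 40),
  items = c("book", "pen", "textbook", "pencil_case"),
  store = c(TRUE, FALSE, TRUE, FALSE),
  price = c(2.5, 8, 10, 7)
)

expensive <- subset(df, subset = price > 5)   # keep rows where price exceeds 5
expensive
```

Only the book row is dropped, because its price of 2.5 does not satisfy the condition.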
Factors
Factors are used to represent categorical data.
They are useful for columns that have a limited number of unique values, e.g. "Male", "Female" or TRUE, FALSE.
They are useful in data analysis for statistical modeling.
print(data)
print(is.factor(data))
Output:
[1] "East" "West" "East" "North" "North" "East" "West" "West" "West" "East" "North"
# Create the vectors for data frame.
height <- c(132,151,162,139,166,147,122)
weight <- c(48,49,66,53,67,52,40)
gender <- c("male","male","female","female","male","female","male")
# Create the data frame.
input_data <- data.frame(height,weight,gender)
print(input_data)
# Test if the gender column is a factor.
print(is.factor(input_data$gender))
# Print the gender column to see the levels.
print(input_data$gender)
Output:
  height weight gender
1    132     48   male
2    151     49   male
3    162     66 female
4    139     53 female
5    166     67   male
6    147     52 female
7    122     40   male
[1] FALSE
[1] "male" "male" "female" "female" "male" "female" "male"
[1] TRUE
Changing the Order of Levels
[1] East West East North North East West West West East North
Levels: East North West
[1] "East" "North" "West"
[1] 3
[1] East West East North North East West West West East North
Levels: East West North
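The statements behind the outputs in this section are not present. The sketch below, using the same region values printed earlier, shows one way to obtain them: factor() creates the factor, levels() and nlevels() inspect it, and a second call to factor() with an explicit levels argument changes the order. The names factor_data and new_order_data are assumptions.

```r
data <- c("East", "West", "East", "North", "North", "East",
          "West", "West", "West", "East", "North")

factor_data <- factor(data)
levels(factor_data)     # levels are sorted alphabetically by default
nlevels(factor_data)

# Re-create the factor with the levels in a chosen order
new_order_data <- factor(factor_data, levels = c("East", "West", "North"))
levels(new_order_data)
```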
Accessing elements of a Factor
[1] male
Levels: female male
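Factor elements are accessed with the same indexing rules as vectors. The indexing call that produced the output above is not shown, so this is a sketch using the gender values from the earlier data frame.

```r
gender <- factor(c("male", "male", "female", "female", "male", "female", "male"))
gender[1]           # a single element; all levels are still listed
gender[c(2, 4)]     # the 2nd and 4th elements
gender[-1]          # everything except the first element
```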
Modifying a Factor
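No example survives for this slide. As a hedged sketch, an element of a factor can be reassigned only to a value that is already one of its levels; a new value needs its level added first. The level "other" is made up for illustration.

```r
gender <- factor(c("male", "male", "female"))
gender[2] <- "female"                          # allowed: "female" is an existing level
levels(gender) <- c(levels(gender), "other")   # add a new level first
gender[3] <- "other"                           # now this assignment is valid
gender
```

Assigning a value that is not among the levels produces NA with a warning instead of adding the level automatically.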
Simple Calculator
# Program to make a simple calculator that can add, subtract, multiply and divide
# using functions
add <- function(x, y) {
return(x + y)
}
subtract <- function(x, y) {
return(x - y)
}
multiply <- function(x, y) {
return(x * y)
}
divide <- function(x, y) {
return(x / y)
}
# take input from the user
print("Select operation.")
print("1.Add")
print("2.Subtract")
print("3.Multiply")
print("4.Divide")
choice <- as.integer(readline(prompt="Enter choice[1/2/3/4]: "))
num1 <- as.integer(readline(prompt="Enter first number: "))
num2 <- as.integer(readline(prompt="Enter second number: "))
operator <- switch(choice,"+","-","*","/")
result <- switch(choice, add(num1, num2), subtract(num1, num2),
                 multiply(num1, num2), divide(num1, num2))
print(paste(num1, operator, num2, "=", result))
Sample Run:
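The recorded run is missing from the text. Because readline() needs keyboard input, the sketch below wraps the same switch() logic in a hypothetical function, calculate(), so that the behaviour can be checked without typing anything.

```r
# Hypothetical wrapper so the calculator logic can run without keyboard input
calculate <- function(choice, num1, num2) {
  operator <- switch(choice, "+", "-", "*", "/")
  result <- switch(choice, num1 + num2, num1 - num2, num1 * num2, num1 / num2)
  paste(num1, operator, num2, "=", result)
}

print(calculate(1, 20, 4))   # "20 + 4 = 24"
print(calculate(4, 20, 4))   # "20 / 4 = 5"
```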