3202 R Program
Contents
UNIT – I
Introduction (Hours 1-2)
   Evolution of R
   Features of R
Environment Setup (Hour 3)
   Windows Installation
   Linux Installation
Basic Syntax (Hour 4)
   R Command Prompt
   R Script File
   Comments
Revision (Hour 5)
UNIT – II
Datatypes (Hours 6-7)
   1. Vectors
   2. Lists
   3. Matrices
   4. Arrays
   5. Factors
   6. Data Frames
Variables (Hour 8)
   Variable Assignment
   Data Type of a Variable
   Finding Variables
   Deleting Variables
Operators (Hours 9-10)
   1. Arithmetic Operators
   2. Relational Operators
   3. Logical Operators
   4. Assignment Operators
   5. Miscellaneous Operators
UNIT – III
Decision Making (Hours 11-12)
   1. if statement
   2. if...else statement
   3. if...else if...else statement
   4. switch statement
Loops (Hours 13-15)
   1. repeat loop
   2. while loop
   3. for loop
Loop Control Statements (Hour 16)
   1. break statement
   2. next statement
UNIT – IV
Functions (Hours 17-19)
   Definition
   Components
   Built-in functions
   User-defined functions
   Calling a function
Strings (Hour 20)
   Rules Applied in String Construction
   String Manipulation
Vector (Hours 21-22)
   Vector Creation
   Accessing Vector Elements
   Vector Manipulation
Lists (Hours 23-24)
   Creating a List
   Naming List Elements
   Accessing List Elements
   Manipulating List Elements
   Merging Lists
   Converting List to Vector
UNIT – V
Matrices (Hours 25-26)
   Accessing Elements of a Matrix
   Matrix Computations
   Matrix Addition & Subtraction
   Matrix Multiplication & Division
Arrays (Hours 27-28)
   Naming Columns and Rows
   Accessing Array Elements
   Manipulating Array Elements
   Calculations Across Array Elements
Review Questions (Hours 29-30)
Fundamentals of R Programming Unit – I
Introduction
R is a programming language and software environment for statistical analysis, graphics
representation and reporting.
The language is named R after the first letter of the first names of its two authors,
Robert Gentleman and Ross Ihaka.
R is freely available under the GNU General Public License, and pre-compiled binary
versions are provided for various operating systems (Linux, Windows and Mac).
R is free software distributed under a GNU-style copyleft and is an official part of
the GNU project, where it is also known as GNU S.
R allows integration with procedures written in the C, C++, .NET, Python or
FORTRAN languages for efficiency.
Evolution of R
R is an interpreted programming language which was created by Ross Ihaka and
Robert Gentleman at the University of Auckland, New Zealand, and is currently
developed by the R Core Team.
R made its first appearance in 1993.
A large group of individuals has contributed to R by sending code and bug reports.
Since mid-1997 there has been a core group (the "R Core Team") who can modify
the R source code archive.
Features of R
R is a well-developed, simple and effective programming language which includes
conditionals, loops, user-defined recursive functions and input and output facilities.
R has an effective data handling and storage facility.
R provides a suite of operators for calculations on arrays, lists, vectors and
matrices.
R provides a large, coherent and integrated collection of tools for data analysis.
R provides graphical facilities for data analysis and display, either directly on the
computer screen or printed on paper.
Environment Setup
1. Windows Installation
First, we have to download the R setup from
https://cloud.r-project.org/bin/windows/base/
2. Linux Installation
In the first step, we update the package lists on our system using the
sudo apt-get update command:
$ sudo apt-get update
In the second step, we install R on our system with the help of the
sudo apt-get install r-base command:
$ sudo apt-get install r-base
R Basic Syntax
R Command Prompt
Once you have the R environment set up, it is easy to start the R command prompt
by typing the following command at your command prompt:
$ R
This will launch the R interpreter and you will get a prompt > where you can start typing
your program as follows:
> myString <- "Hello, SAQ!"
> print(myString)
[1] "Hello, SAQ!"
R Script File
Usually, you will write your programs in script files and then execute those scripts
at your command prompt with the help of the R interpreter called Rscript. So let's
start by writing the following code in a text file called test.R:
# My first program in R Programming
myString <- "Hello, SAQ!"
print(myString)
Save the file and run it from your command prompt:
$ Rscript test.R
Comments
Comments are helper text in your R program and they are ignored by the
interpreter while executing your actual program.
A single-line comment is written using # at the beginning of the statement as follows:
# My first program in R Programming
R does not support multi-line comments, but you can use the following trick:
if(FALSE)
{
   "This is a demo for multi-line comments and it should be put inside either a
   single OR double quote"
}
myString <- "Hello, SAQ!"
print(myString)
OUTPUT
[1] "Hello, SAQ!"
- The text inside the if(FALSE) block is still parsed by the R interpreter, but it is
never executed, so it does not interfere with your actual program.
- Such comments must be put inside either single or double quotes so that the block
remains valid R code.
Datatypes
In contrast to other programming languages such as C and Java, variables in R are
not declared with a data type.
Variables are assigned R-objects, and the data type of the R-object
becomes the data type of the variable.
There are many types of R-objects.
The frequently used ones are
1. Vectors
2. Lists
3. Matrices
4. Arrays
5. Factors
6. Data Frames
The simplest of these objects is the vector object and there are six data types of these
atomic vectors, also termed as six classes of vectors.
S. No  Data Type  Example                       Verify                                     Output
1      Logical    TRUE, FALSE                   v <- TRUE; print(class(v))                 [1] "logical"
2      Numeric    12.3, 5, 999                  v <- 23.5; print(class(v))                 [1] "numeric"
3      Integer    2L, 34L, 0L                   v <- 2L; print(class(v))                   [1] "integer"
4      Complex    3 + 2i                        v <- 2+5i; print(class(v))                 [1] "complex"
5      Character  'a', "good", "TRUE", '23.4'   v <- "TRUE"; print(class(v))               [1] "character"
6      Raw        "Hello" is stored as          v <- charToRaw("Hello"); print(class(v))   [1] "raw"
                  48 65 6c 6c 6f
In R programming, the very basic data types are the R-objects called vectors which
hold elements of different classes.
2. Lists
A list is an R-object which can contain many different types of elements inside it
like vectors, functions and even another list inside it.
# Create a list.
list1 <- list(c(2,5,3),21.3,sin)
# Print the list.
print(list1)
OUTPUT
[[1]]
[1] 2 5 3
[[2]]
[1] 21.3
[[3]]
function (x) .Primitive("sin")
3. Matrices
A matrix is a two-dimensional rectangular data set. It can be created using a vector
input to the matrix function.
# Create a matrix.
M = matrix( c('a','a','b','c','b','a'), nrow = 2, ncol = 3, byrow = TRUE)
print(M)
OUTPUT
     [,1] [,2] [,3]
[1,] "a"  "a"  "b"
[2,] "c"  "b"  "a"
4. Arrays
While matrices are confined to two dimensions, arrays can be of any number
of dimensions.
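As a minimal sketch of this idea, the following builds a three-dimensional array with the array() function. The values 1:12 and the variable name a are chosen here only for illustration.

```r
# Build a 2 x 3 x 2 array from the numbers 1 to 12 (illustrative values).
a <- array(1:12, dim = c(2, 3, 2))

# dim() reports the size of each dimension.
print(dim(a))

# Use one subscript per dimension: row 1, column 2, layer 2.
print(a[1, 2, 2])
```

The array is filled column by column within each layer, so the element at row 1, column 2 of the second layer is the ninth value.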
5. Factors
Factors are the R-objects which are created using a vector.
A factor stores the vector along with the distinct values of its elements as labels.
The labels are always of type character, irrespective of whether the input vector is
numeric, character or Boolean. Factors are useful in statistical
modelling.
Factors are created using the factor() function. The nlevels() function gives the
count of levels.
# Create a vector.
apple_colors <- c('green','green','yellow','red','red','red','green')
# Create a factor object.
factor_apple <- factor(apple_colors)
# Print the factor.
print(factor_apple)
print(nlevels(factor_apple))
OUTPUT
[1] green green yellow red red red green
Levels: green red yellow
[1] 3
Variable
A variable provides us with named storage that our programs can manipulate.
A variable in R can store an atomic vector, a group of atomic vectors or a
combination of many R-objects.
A variable name is
1. Valid if it consists of letters, numbers, underscore and dot, and starts with a
letter or with a dot that is not followed by a number.
2. Invalid if it
a. Starts with a number or an underscore (_).
b. Starts with a dot that is followed by a number.
c. Uses special characters other than dot (.) and underscore (_).
Variable Name Example   Validity
var_name2.              Valid
.var_name, var.name     Valid
var_name%               Invalid
2var_name               Invalid
.2var_name              Invalid
_var_name               Invalid
Variable Assignment
The variables can be assigned values using leftward, rightward and equal to
operator. The values of the variables can be printed using print() or cat() function.
The cat() function combines multiple items into a continuous print output.
# Assignment using equal operator.
var.1 = c(0,1,2,3)
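The other two assignment forms and the two printing functions described above can be sketched as follows. The variable names var.2 and var.3 and their values are assumptions made for this illustration.

```r
# Assignment using the leftward operator.
var.2 <- c("learn", "R")

# Assignment using the rightward operator.
c(TRUE, 1) -> var.3

# print() shows one object at a time.
print(var.2)

# cat() joins several items into one continuous line of output.
cat("var.2 is", var.2, "\n")
cat("var.3 is", var.3, "\n")
```

Note that c(TRUE, 1) is a numeric vector: the logical value TRUE is converted to 1 so that all elements share one type.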
Finding Variables
To know all the variables currently available in the workspace we use ls() function.
Also the ls() function can use patterns to match the variable names.
print(ls())
OUTPUT
[1] "my var" "my_new_var" "my_var" "var.1"
[5] "var.2" "var.3" "var.name" "var_name2."
[9] "var_x" "varname"
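The pattern matching mentioned above can be sketched with the pattern argument of ls(). The variable names below are created only for this illustration, and the printed result also includes any other matching variables already in the workspace.

```r
# Create a few variables (names chosen for illustration).
var.1 <- 1
var_x <- 2
other <- 3

# List only the variables whose names contain "var".
matches <- ls(pattern = "var")
print(matches)
```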
Deleting Variables
Variables can be deleted by using the rm() function.
Below we delete the variable var.3.
Printing the variable afterwards throws an error.
rm(var.3)
print(var.3)
OUTPUT
Error in print(var.3) : object 'var.3' not found
All the variables can be deleted by using the rm() and ls() function together.
rm(list = ls())
print(ls())
OUTPUT
character(0)
Operators
An operator is a symbol that tells the interpreter to perform specific mathematical or
logical manipulations. The R language is rich in built-in operators.
Types of Operators
1. Arithmetic Operators
2. Relational Operators
3. Logical Operators
4. Assignment Operators
5. Miscellaneous Operators
Arithmetic Operators
Following table shows the arithmetic operators supported by R language.
The operators act on each element of the vector.
All examples in this table use the following two vectors:
v <- c( 2,5.5,6)
t <- c(8, 3, 4)
Operator  Description                                            Example        Output
+         Adds two vectors                                       print(v+t)     [1] 10.0  8.5 10.0
-         Subtracts second vector from the first                 print(v-t)     [1] -6.0  2.5  2.0
*         Multiplies both vectors                                print(v*t)     [1] 16.0 16.5 24.0
/         Divides the first vector by the second                 print(v/t)     [1] 0.250000 1.833333 1.500000
%%        Gives the remainder of the first vector divided by     print(v%%t)    [1] 2.0 2.5 2.0
          the second
%/%       Gives the result of division of the first vector by    print(v%/%t)   [1] 0 1 1
          the second (quotient)
^         The first vector raised to the exponent of the         print(v^t)     [1]  256.000  166.375 1296.000
          second vector
Relational Operators
Following table shows the relational operators supported by R language.
Each element of the first vector is compared with the corresponding element of the
second vector.
The result of each comparison is a Boolean value.
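The relational operators can be sketched as follows. The two vectors are illustrative values chosen here, and the comments list the element-wise results.

```r
v <- c(2, 5.5, 6, 9)
t <- c(8, 2.5, 14, 9)

print(v > t)    # FALSE TRUE FALSE FALSE
print(v < t)    # TRUE FALSE TRUE FALSE
print(v == t)   # FALSE FALSE FALSE TRUE
print(v <= t)   # TRUE FALSE TRUE TRUE
print(v >= t)   # FALSE TRUE FALSE TRUE
print(v != t)   # TRUE TRUE TRUE FALSE
```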
Logical Operators
Following table shows the logical operators supported by R language.
They are applicable only to vectors of type logical, numeric or complex.
All non-zero numbers are treated as the logical value TRUE, and zero is treated as FALSE.
Each element of the first vector is combined with the corresponding element of the
second vector.
The result is a Boolean value.
Operator: &
Description: It is called the Element-wise Logical AND operator. It combines each
element of the first vector with the corresponding element of the second vector and
gives an output TRUE if both the elements are TRUE.
Example
v <- c(3,1,TRUE,2+3i)
t <- c(4,1,FALSE,2+3i)
print(v&t)
OUTPUT
[1]  TRUE  TRUE FALSE  TRUE
Operator: |
Description: It is called the Element-wise Logical OR operator. It combines each
element of the first vector with the corresponding element of the second vector and
gives an output TRUE if one of the elements is TRUE.
Example
v <- c(3,0,TRUE,2+2i)
t <- c(4,0,FALSE,2+3i)
print(v|t)
OUTPUT
[1]  TRUE FALSE  TRUE  TRUE
Operator: !
Description: It is called the Logical NOT operator. It takes each element of the
vector and gives the opposite logical value.
Example
v <- c(3,0,TRUE,2+2i)
print(!v)
OUTPUT
[1] FALSE  TRUE FALSE FALSE
Assignment Operators
These operators are used to assign values to vectors.
Operators: <-, =, <<-
Description: Called Left Assignment
Example
v1 <- c(3,1,TRUE,2+3i)
v2 <<- c(3,1,TRUE,2+3i)
print(v1)
print(v2)
OUTPUT
[1] 3+0i 1+0i 1+0i 2+3i
[1] 3+0i 1+0i 1+0i 2+3i
Operators: -> (or) ->>
Description: Called Right Assignment
Example
c(3,1,TRUE,2+3i) -> v1
c(3,1,TRUE,2+3i) ->> v2
print(v1)
print(v2)
OUTPUT
[1] 3+0i 1+0i 1+0i 2+3i
[1] 3+0i 1+0i 1+0i 2+3i
Miscellaneous Operators
These operators are used for specific purposes and not for general mathematical or
logical computation.
Operator: :
Description: Colon operator. It creates the series of numbers in sequence for a
vector.
Example
v <- 2:8
print(v)
OUTPUT
[1] 2 3 4 5 6 7 8
Operator: %in%
Description: This operator is used to identify if an element belongs to a vector.
Example
v1 <- 8
v2 <- 12
t <- 1:10
print(v1 %in% t)
print(v2 %in% t)
OUTPUT
[1] TRUE
[1] FALSE
Operator: %*%
Description: This operator performs matrix multiplication. In the example it is used
to multiply a matrix with its transpose.
Example
M = matrix(c(2,6,5,1,10,4), nrow = 2, ncol = 3, byrow = TRUE)
t = M %*% t(M)
print(t)
OUTPUT
     [,1] [,2]
[1,]   65   82
[2,]   82  117
Decision Making
Example
x <- 30L
if(is.integer(x))
{
print("X is an Integer")
}
OUTPUT
[1] "X is an Integer"
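Example
x <- c("what","is","truth")
if("Truth" %in% x) {
   print("Truth is found")
} else {
   print("Truth is not found")
}
OUTPUT
[1] "Truth is not found"
- Here "Truth" and "truth" are two different strings.
- The else keyword must be on the same line as the closing brace of the if block;
otherwise R reports a syntax error when the code is run at the top level.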
Loops
In general, statements are executed sequentially: the first statement in a function is
executed first, followed by the second, and so on. There may be a situation when
you need to execute a block of code several times.
A loop statement allows us to execute a statement or group of statements multiple
times.
The R programming language provides the following kinds of loops to handle looping
requirements.
1. repeat loop
2. while loop
3. for loop
1. repeat loop
Executes a sequence of statements repeatedly. A repeat loop has no condition of
its own; it runs until a break statement inside the body is reached.
Syntax
repeat
{
   commands
   if(condition)
   {
      break
   }
}
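A minimal sketch of a repeat loop follows. The vector v, the counter cnt and the limit 5 are illustrative choices, not part of the syntax.

```r
v <- c("Hello", "loop")
cnt <- 2

repeat {
   print(v)
   cnt <- cnt + 1
   # Leave the loop once the counter passes 5.
   if (cnt > 5) {
      break
   }
}
```

The body runs four times here: the counter goes from 2 to 6, and the break is reached on the fourth pass.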
2. While loop
Repeats a statement or group of statements while a given condition is true.
It tests the condition before executing the loop body.
Syntax
while (test_expression)
{
statement
}
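A minimal sketch of a while loop follows. The counter cnt and the limit 7 are illustrative choices.

```r
cnt <- 2

# The condition is checked before every pass through the body.
while (cnt < 7) {
   print(cnt)
   cnt <- cnt + 1
}
```

This prints the numbers 2 to 6. If the condition were FALSE at the start, the body would not run at all.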
3. For loop
A for loop is a repetition control structure that allows you to efficiently write
a loop that needs to execute a specific number of times.
Unlike a while loop, it does not test a condition; it runs the loop body once for
each element of a vector, in order.
Syntax
for (value in vector)
{
statements
}
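A minimal sketch of a for loop follows. It uses the built-in constant LETTERS, and the variable names v and i are illustrative.

```r
# The first four upper-case letters.
v <- LETTERS[1:4]

# The body runs once for each element of v.
for (i in v) {
   print(i)
}
```

The loop prints "A", "B", "C" and "D" in turn, one element per pass.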
Functions
A function is a set of statements organized together to perform a specific task.
R has in-built functions and the user can create their own functions.
In R, a function is an object so the R interpreter is able to pass control to the function,
along with arguments that may be necessary for the function to accomplish the
actions.
The function in turn performs its task and returns control to the interpreter as well
as any result which may be stored in other objects.
Definition
- An R function is created by using the keyword function.
- Syntax
function_name <- function(arg_1, arg_2, ...)
{
   function body
}
Components
The different parts of a function are −
1. Function Name − this is the actual name of the function. It is stored in R
environment as an object with this name.
2. Arguments − an argument is a placeholder. When a function is invoked, you
pass a value to the argument. Arguments are optional; i.e., a function may
contain no arguments. Also arguments can have default values.
3. Function Body – it contains a collection of statements that defines what the
function does.
4. Return Value – it is the last expression in the function body to be evaluated.
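The four components can be seen together in a small user-defined function. This is only a sketch: the name scaled_sum and its arithmetic are hypothetical and chosen for illustration.

```r
# Function name: scaled_sum (hypothetical).
# Arguments: a (required) and b (optional, with default value 2).
scaled_sum <- function(a, b = 2) {
   # Function body: the last evaluated expression is the return value.
   (a + 1) * b
}

# Calling the function by position, using the default for b.
print(scaled_sum(4))             # [1] 10

# Overriding the default value.
print(scaled_sum(4, b = 10))     # [1] 50

# Arguments supplied by name may appear in any order.
print(scaled_sum(b = 3, a = 1))  # [1] 6
```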
Built-in Function
- Simple examples of built-in functions are seq(), mean(), max(), sum() and
paste().
- They are called directly by user-written programs.
# Create a sequence of numbers from 32 to 44.
print(seq(32, 44))
# Find mean of numbers from 25 to 82.
print(mean(25:82))
# Find sum of numbers from 41 to 68.
print(sum(41:68))
OUTPUT
[1] 32 33 34 35 36 37 38 39 40 41 42 43 44
[1] 53.5
[1] 1526
String
Any value written within a pair of single quotes or double quotes is treated as a
string. Internally, R stores every string within double quotes, even when you create
it with single quotes.
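The equivalence of the two quoting styles can be checked with a short sketch. The variable names and string values below are illustrative.

```r
a <- 'single quotes'
b <- "single quotes"

# Both forms create the same character vector.
print(identical(a, b))   # [1] TRUE

# print() displays the string in double quotes either way.
print(a)

# A single quote may appear inside a double-quoted string, and
# a double quote may appear inside a single-quoted string.
s1 <- "It's R"
s2 <- 'He said "hi"'
print(nchar(s1))         # [1] 6
```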
Vector
Vectors are the most basic R data objects and there are six types of atomic vectors.
They are logical, integer, double, complex, character and raw.
Vector Creation
Single Element Vector
Even when you write just one value in R, it becomes a vector of length 1 and
belongs to one of the above vector types.
# Atomic vector of type character.
print("abc")
# Atomic vector of type double.
print(12.5)
# Atomic vector of type integer.
print(63L)
# Atomic vector of type logical.
print(TRUE)
# Atomic vector of type complex.
print(2+3i)
# Atomic vector of type raw.
print(charToRaw('hello'))
OUTPUT
[1] "abc"
[1] 12.5
[1] 63
[1] TRUE
[1] 2+3i
[1] 68 65 6c 6c 6f
Vector Manipulation
1. Vector arithmetic
Two vectors of same length can be added, subtracted, multiplied or divided
giving the result as a vector output.
# Create two vectors.
v1 <- c(3,8,4,5,0,11)
v2 <- c(4,11,0,8,1,2)
# Vector addition.
add.result <- v1+v2
print(add.result)
# Vector subtraction.
sub.result <- v1-v2
print(sub.result)
# Vector multiplication.
multi.result <- v1*v2
print(multi.result)
# Vector division.
divi.result <- v1/v2
print(divi.result)
OUTPUT
[1] 7 19 4 13 1 13
[1] -1 -3 4 -3 -1 9
[1] 12 88 0 40 0 22
[1] 0.7500000 0.7272727 Inf 0.6250000 0.0000000 5.5000000
Lists
Lists are R objects which can contain elements of different types, such as numbers,
strings, vectors and even another list.
A list can also contain a matrix or a function as an element.
A list is created using the list() function.
Creating a List
# Create a list containing a string, a vector, a logical value and numbers.
list_data <- list("Red", c(21,32,11), TRUE, 51.23, 119.1)
print(list_data)
OUTPUT
[[1]]
[1] "Red"
[[2]]
[1] 21 32 11
[[3]]
[1] TRUE
[[4]]
[1] 51.23
[[5]]
[1] 119.1
# Access the third element. Single brackets return a list, so the result is printed
# as a list with one element.
print(list_data[3])
Matrices
Matrices are the R objects in which the elements are arranged in a two-dimensional
rectangular layout. They contain elements of the same atomic types.
Though we can create a matrix containing only characters or only logical values,
they are not of much use. We use matrices containing numeric elements to be used
in mathematical calculations.
A Matrix is created using the matrix() function.
Syntax
matrix(data, nrow, ncol, byrow, dimnames)
- data is the input vector which becomes the data elements of the matrix.
- nrow is the number of rows to be created.
- ncol is the number of columns to be created.
- byrow is a logical value. If TRUE then the input vector elements are arranged
by row.
- dimnames is the names assigned to the rows and columns.
Example
Create a matrix taking a vector of numbers as input.
# Elements are arranged sequentially by row.
M <- matrix(c(3:14), nrow = 4, byrow = TRUE)
print(M)
# Elements are arranged sequentially by column.
N <- matrix(c(3:14), nrow = 4, byrow = FALSE)
print(N)
# Define the column and row names.
rownames = c("row1", "row2", "row3", "row4")
colnames = c("col1", "col2", "col3")
P <- matrix(c(3:14), nrow = 4, byrow = TRUE, dimnames = list(rownames, colnames))
print(P)
OUTPUT
     [,1] [,2] [,3]
[1,]    3    4    5
[2,]    6    7    8
[3,]    9   10   11
[4,]   12   13   14
     [,1] [,2] [,3]
[1,]    3    7   11
[2,]    4    8   12
[3,]    5    9   13
[4,]    6   10   14
     col1 col2 col3
row1    3    4    5
row2    6    7    8
row3    9   10   11
row4   12   13   14
Matrix Computations
Various mathematical operations are performed on the matrices using the R
operators. The result of the operation is also a matrix.
The dimensions (number of rows and columns) should be same for the matrices
involved in the operation.
Matrix Addition & Subtraction
# Create two 2x3 matrices.
matrix1 <- matrix(c(3, 9, -1, 4, 2, 6), nrow = 2)
print(matrix1)
matrix2 <- matrix(c(5, 2, 0, 9, 3, 4), nrow = 2)
print(matrix2)
# Add the matrices.
result <- matrix1 + matrix2
cat("Result of addition","\n")
print(result)
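The remaining element-wise operations can be sketched in the same way, reusing the two matrices from the example above. The result variable names are illustrative. Note that * and / work element by element; the matrix product uses the %*% operator instead.

```r
matrix1 <- matrix(c(3, 9, -1, 4, 2, 6), nrow = 2)
matrix2 <- matrix(c(5, 2, 0, 9, 3, 4), nrow = 2)

# Subtract the matrices element by element.
sub_result <- matrix1 - matrix2
cat("Result of subtraction", "\n")
print(sub_result)

# Multiply the matrices element by element.
mul_result <- matrix1 * matrix2
cat("Result of multiplication", "\n")
print(mul_result)

# Divide the matrices element by element.
div_result <- matrix1 / matrix2
cat("Result of division", "\n")
print(div_result)
```

Because matrix2 contains a zero, the corresponding element of the division result is -Inf rather than an error.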
Arrays
Arrays are the R data objects which can store data in more than two dimensions.
For example, if we create an array of dimension (2, 3, 4), it creates 4
rectangular matrices, each with 2 rows and 3 columns. Arrays can store only one data
type.
An array is created using the array() function. It takes vectors as input and uses the
values in the dim parameter to create an array.
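A minimal sketch of creating an array and calculating across its elements follows. The two input vectors and the names new.array and result are assumptions made for this illustration; the values were chosen to be consistent with the output excerpt that follows in this section.

```r
# Two input vectors (illustrative values).
vector1 <- c(5, 9, 3)
vector2 <- c(10, 11, 12, 13, 14, 15)

# Combine them into a 3 x 3 x 2 array. The nine values fill the first
# layer and are reused to fill the second layer.
new.array <- array(c(vector1, vector2), dim = c(3, 3, 2))
print(new.array)

# Sum all elements that share the same row index, across both layers.
result <- apply(new.array, 1, sum)
print(result)
```

The second argument of apply() selects the dimension to keep: 1 means one result per row.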
, , 2
     [,1] [,2] [,3]
[1,]    5   10   13
[2,]    9   11   14
[3,]    3   12   15
[1] 56 68 60