R Programming for Statistical Analysis

Uploaded by

Harshitha B

Statistical Analysis and R Programming 2024-25

UNIT I: INTRODUCTION

R is an interpreted programming language and software environment widely used for statistical analysis, graphical representation, reporting, and data modelling. The R programming language was created by Ross Ihaka and Robert Gentleman at the University of Auckland, New Zealand.
FEATURES OF R PROGRAMMING
1. R is an interpreted language.
2. It is a simple, effective, and well-developed programming language.
3. It supports user-defined functions, loops, conditionals, and various I/O facilities.
4. R contains a suite of operators for different types of calculation on arrays, lists, and vectors.
5. It provides effective data handling and storage facilities.
6. It is open-source, powerful, and highly extensible software.
7. It provides extensive graphical techniques.

VARIABLES
Variables store the information that an R program manipulates and references. An R variable can hold an atomic vector, a group of atomic vectors, or a combination of many R objects.
R supports two common ways of variable assignment:
1. Using the equal operator (=): the value on the right is assigned to the variable on the left.
Syntax: variable_name = value
Ex: x = 10
2. Using the leftward operator (<-): the value is copied from right to left into the variable. The operator is written without a space between < and -; x < -20 would instead compare x with -20.
Syntax: variable_name <- value
Ex: x <- 20

Shruthi S, Asst. Professor, GSSS SSFGC, Mysuru Page 1


The following rules must be followed when naming a variable.

• A valid variable name consists of a combination of letters, digits, dot (.), and underscore (_) characters. Ex: var.1_ is valid.
• No other special characters are allowed. Ex: var$1 and var#1 are both invalid.
• A variable name can start with a letter or a dot, but not with a digit or an underscore. Ex: .var and var are valid.
• If a variable name starts with a dot, the character after the dot cannot be a digit; it should be a letter. Ex: .3var is invalid.
• The variable name must not be a reserved keyword in R. Ex: TRUE, FALSE, etc. are not valid names.
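The naming rules can be checked from within R. The following is a minimal sketch that relies on the base function make.names(), which rewrites a string into a syntactically valid name; the helper is_valid_name is a hypothetical name introduced only for this illustration.

```r
# Hypothetical helper: a name is treated as valid when make.names()
# leaves it unchanged.
is_valid_name <- function(name) {
  make.names(name) == name
}

candidates <- c("var.1_", ".var", "var$1", ".3var", "_var", "TRUE")
validity <- is_valid_name(candidates)
names(validity) <- candidates

# Only the first two candidates follow every rule listed above.
print(validity)
```
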

DATA TYPES
A data type specifies the kind of data that can be stored in a variable. Memory is allocated according to the data type of the variable, which also decides what can be stored in the reserved memory.
The following data types are used in R programming:
1. Integer: This data type stores whole numbers. An integer literal is written with the suffix L.
Ex: 3L, 66L, 2346L
2. Numeric: A decimal value is called numeric in R, and it is the default computational data type. A number written without the L suffix is numeric even when it has no fractional part.
Ex: 12, 32, 112, 54.32
3. Complex: A complex value in R consists of a real part and an imaginary part, the latter marked with i.
Ex: z = 1+2i, t = 7+3i
4. Logical: A special data type with only two possible values, TRUE and FALSE.
Ex: TRUE and FALSE
5. Character: In R programming, a character value represents a string.
Ex: 'a', "good", "TRUE", '35.4'
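As a short illustration of the five data types, the sketch below asks R for the class of one literal of each kind. The variable names values and types are chosen here for the example only.

```r
# One literal for each basic data type (12 is numeric, not integer,
# because it has no L suffix).
values <- list(3L, 54.32, 12, 1+2i, TRUE, "good")

# class() reports the data type of each value.
types <- sapply(values, class)
print(types)
```
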


OPERATORS
An operator is a symbol that tells the interpreter to perform a specific logical or mathematical manipulation. In R programming, there are different types of operators, and each operator performs a different task.
They are as follows:
1. Arithmetic Operators
2. Relational Operators
3. Logical Operators
4. Assignment Operators
5. Miscellaneous Operators

1. Arithmetic Operators
Arithmetic operators perform operations such as addition, subtraction, multiplication, division, and modulo. An arithmetic expression comprises arithmetic operators and variables or constants; the variables and constants are called operands. The arithmetic operators are as follows.
+ : addition
- : subtraction
* : multiplication
/ : division
%% : modulo
^ : power
Ex : a+b, a-b etc.
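A small worked example of the arithmetic operators, using the illustrative operands a = 7 and b = 2 (values chosen only for this sketch):

```r
a <- 7
b <- 2

# Apply each arithmetic operator to the same pair of operands.
results <- c(sum = a + b,
             difference = a - b,
             product = a * b,
             quotient = a / b,
             remainder = a %% b,
             power = a ^ b)
print(results)
```
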

2. Relational Operators
Relational operators are used to construct relational expressions, which compare two quantities. A relational expression is of the form operand1 operator operand2. The relational operators are as follows.
< : is less than
> : is greater than
>= : is greater than or equal to
<= : is less than or equal to
== : is equal to
!= : is not equal to
Ex: a < b, a == 10

3. Logical Operators
These are used to construct compound conditional expressions. The operators & and | combine two logical vectors element by element; && and || combine two single logical values and are the form normally used when making a decision; ! negates a conditional expression.
The logical operators are
& : element-wise logical AND
| : element-wise logical OR
&& : logical AND on single values
|| : logical OR on single values
! : logical NOT

Ex: 1 & 0, TRUE || FALSE

4. Assignment Operators
Assignment operators assign the result of an expression to a variable.
Syntax: variable <- expression or variable = expression
<- : left assignment operator
= : equal operator
Ex: a <- 20, C = 90

5. Miscellaneous Operators
Miscellaneous operators are used for a special and specific purpose. These operators are not
used for general mathematical or logical computation.

: The colon operator creates a sequence of consecutive numbers as a vector.
Ex: v <- 1:8
print(v)
Output:
1 2 3 4 5 6 7 8


class() FUNCTION
class() is a built-in function that returns the data type (class) of the variable passed to it.

Syntax: class(variable)
Ex: var1 = "hello"
print(class(var1))

Output: "character"

VECTORS
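In R, a sequence of elements that share the same data type is known as a vector. Vectors are classified into two kinds: atomic vectors, whose elements all have the same type, and lists, whose elements may be of different types.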

CREATION OF VECTOR
1) Using the c() function
A vector can be created with the c() function, which combines its arguments into a one-dimensional array, or simply a vector.
Syntax: Vector_Name <- c(list of elements)
Ex: Myvec <- c(1,3,1,4,2)
print(Myvec)

Output: 1 3 1 4 2

2) Using the colon (:) operator
The colon operator creates a vector of consecutive values from x to y.
Syntax: Vector_Name <- x:y
Ex: Z <- 1:10
print(Z)

Output: 1 2 3 4 5 6 7 8 9 10

3) Using the seq() function
The seq() function creates a sequence of elements as a vector. The step size is set with the 'by' parameter.
Syntax: Vector_Name <- seq(start, stop, by=value)

Ex: seq_vec <- seq(1, 4, by=0.5)
print(seq_vec)

Output: 1.0 1.5 2.0 2.5 3.0 3.5 4.0

• Numeric vector: A vector that contains numeric elements is known as a numeric vector. A variable that is assigned a decimal value becomes numeric.
Ex: num_vec <- c(10.1, 10.2, 33.2)
print(num_vec)
class(num_vec)

Output: 10.1 10.2 33.2
"numeric"

• Integer vector: A numeric value with no fractional part can be stored as integer data. An integer value is assigned to a variable by appending L to the value.
Ex: int_vec1 <- c(1L,2L,3L,4L,5L)
print(int_vec1)
class(int_vec1)

Output: 1 2 3 4 5
"integer"

• Character vector: A vector that contains character elements is known as a character vector. In R, a character value is created using double quotes ("") or single quotes ('').
Ex: char_vec1 <- c("shubham","arpita","nishka","vaishali")
print(char_vec1)
class(char_vec1)

Output: "shubham" "arpita" "nishka" "vaishali"
"character"

• Logical vector: The logical data type has only two values, TRUE and FALSE, depending on whether a condition is satisfied. A vector that contains such Boolean values is known as a logical vector.
Ex: d<- 5
e<- 6
f<- 7
log_vec<-c(d<e, d<f, e<d,e<f,f<d,f<e)
print(log_vec)
class(log_vec)

Output: TRUE TRUE FALSE TRUE FALSE FALSE

"logical"

ACCESSING ELEMENTS OF VECTORS

The elements of a vector are accessed by indexing. An index denotes the position at which a value is stored in the vector. In R, indexing starts from 1. To index, specify an integer position in square brackets [] after the vector name.
Ex: seq_vec <- seq(1,4,length.out=6)
print(seq_vec)
print(seq_vec[2])

Output: 1.0 1.6 2.2 2.8 3.4 4.0
1.6
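Indexing is not limited to a single position. The sketch below, using an illustrative vector v that is not part of the notes, shows a few common forms: several positions at once, a negative index that drops a position, and the last element.

```r
v <- c(10, 20, 30, 40, 50)

first_and_third <- v[c(1, 3)]   # select positions 1 and 3
without_first <- v[-1]          # a negative index drops that position
last_element <- v[length(v)]    # length() gives the last valid index

print(first_and_third)
print(without_first)
print(last_element)
```
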

VECTOR OPERATION
1) Combining vectors
Combining two or more vectors forms a new vector that contains all the elements of each vector. Because a vector holds a single data type, combining a numeric vector with a character vector converts the numbers to character strings.
Ex: p <- c(1,2,4,5,7,8)
q <- c("shubham","arpita","nishka","gunjan","vaishali","sumit")
r <- c(p, q)
print(r)

Output: "1" "2" "4" "5" "7" "8" "shubham" "arpita" "nishka" "gunjan" "vaishali" "sumit"

2) Arithmetic operations
All the arithmetic operations can be performed on vectors. They are carried out member by member.
Ex: a<-c(1,3,5,7)
b<-c(2,4,6,8)
print (a+b)
print (a-b)
print (a*b)
print (a%%b)

Output: 3 7 11 15
-1 -1 -1 -1
2 12 30 56
1 3 5 7

MATRICES
In R, a two-dimensional rectangular data set is known as a matrix. A matrix is created by passing a vector to the matrix() function.

Syntax: matrix(data, nrow, ncol, byrow, dimnames)

Where data: the input vector whose elements become the elements of the matrix.
nrow: the number of rows to create in the matrix.
ncol: the number of columns to create in the matrix.
byrow: a logical value. If it is TRUE, the input vector elements are arranged by row; the default, FALSE, fills the matrix by column.
dimnames: the names assigned to the rows and columns.

Ex: p <- matrix(c(5:16), nrow=4, ncol=3, byrow=TRUE)
print(p)

Output:
[,1] [,2] [,3]
[1,] 5 6 7
[2,] 8 9 10
[3,] 11 12 13
[4,] 14 15 16
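To make the effect of byrow concrete, the following sketch builds two matrices from the same input vector, one filled by row and one filled by column. The names by_row and by_col are illustrative.

```r
# Same data, different fill order.
by_row <- matrix(1:6, nrow = 2, byrow = TRUE)
by_col <- matrix(1:6, nrow = 2)   # byrow defaults to FALSE

print(by_row)   # first row holds 1 2 3
print(by_col)   # first row holds 1 3 5
```
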


ASSIGNING NAMES TO THE MATRIX

The row and column names can be defined through vectors passed to the dimnames argument.
Ex: row_names = c("row1", "row2", "row3", "row4")
col_names = c("col1", "col2", "col3")
R <- matrix(c(3:14), nrow = 4, byrow = TRUE, dimnames = list(row_names, col_names))
print(R)

Output:
col1 col2 col3
row1 3 4 5
row2 6 7 8
row3 9 10 11
row4 12 13 14

ACCESSING MATRIX ELEMENTS IN R

There are three ways to access the elements of a matrix.
• Access the element in the nth row and mth column.
• Access all the elements in the nth row.
• Access all the elements in the mth column.

Ex: For the matrix R created above, the elements are accessed as follows
#Accessing the element in the 3rd row and 2nd column
print(R[3,2])
#Accessing all elements in the 3rd row
print(R[3,])
#Accessing all elements in the 2nd column
print(R[, 2])

MODIFICATION OF THE MATRIX

In matrix modification, the first method is to assign a single element at a particular position. Assigning a new value to that position replaces the old value.

Syntax: matrix[n, m] <- y

Here, n and m are the row and column of the element, respectively, and y is the value assigned to modify the matrix.
Ex:
R <- matrix(c(5:16), nrow = 4, byrow = TRUE, dimnames = list(row_names, col_names))
#Assigning value 20 to the element at 3rd row and 2nd column
R[3,2] <- 20
print(R)

Output:
col1 col2 col3
row1 5 6 7
row2 8 9 10
row3 11 20 13
row4 14 15 16

Use of Relational Operator

A relational expression inside the brackets selects every element that satisfies the condition. This example starts again from the original matrix, in which the element at row3, col2 is 12.
Ex:
R <- matrix(c(5:16), nrow = 4, byrow = TRUE, dimnames = list(row_names, col_names))
#Replacing every element equal to 12 with 0
R[R == 12] <- 0
print(R)

Output:
col1 col2 col3
row1 5 6 7
row2 8 9 10
row3 11 0 13
row4 14 15 16

ADDITION OF ROWS AND COLUMNS

The rbind() and cbind() functions are used to add a row and a column, respectively. Both return a new matrix, so the result must be assigned back for R to change.
• rbind(): adds a row to the existing matrix.
Ex: R <- matrix(c(5:16), nrow = 4, byrow = TRUE, dimnames = list(row_names, col_names))
R <- rbind(R, c(17,18,19))
print(R)

Output:
col1 col2 col3
row1 5 6 7
row2 8 9 10
row3 11 12 13
row4 14 15 16
17 18 19

• cbind(): adds a column to the existing matrix.

Ex: R <- matrix(c(5:16), nrow = 4, byrow = TRUE, dimnames = list(row_names, col_names))
R <- cbind(R, c(17,18,19,20))
print(R)

Output:
col1 col2 col3
row1 5 6 7 17
row2 8 9 10 18
row3 11 12 13 19
row4 14 15 16 20

MATRIX OPERATIONS
In R, mathematical operations such as addition, subtraction, multiplication, and division can be performed on matrices. These operators work element by element on matrices of the same dimensions.
Ex: R <- matrix(c(5:16), nrow = 4,ncol=3)
S <- matrix(c(1:12), nrow = 4,ncol=3)
sum<-R+S
print(sum)
sub<-R-S
print(sub)
mul<-R*S
print(mul)
div<-R/S
print(div)

Output:
[,1] [,2] [,3]
[1,] 6 14 22
[2,] 8 16 24
[3,] 10 18 26
[4,] 12 20 28

[,1] [,2] [,3]

[1,] 4 4 4
[2,] 4 4 4
[3,] 4 4 4
[4,] 4 4 4

[,1] [,2] [,3]

[1,] 5 45 117
[2,] 12 60 140
[3,] 21 77 165
[4,] 32 96 192

[,1] [,2] [,3]

[1,] 5.000000 1.800000 1.444444
[2,] 3.000000 1.666667 1.400000
[3,] 2.333333 1.571429 1.363636
[4,] 2.000000 1.500000 1.333333
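Note that * multiplies matrices element by element. Linear-algebra matrix multiplication uses the separate %*% operator. The sketch below contrasts the two on a small illustrative 2 x 2 matrix A, which is not part of the example above.

```r
A <- matrix(1:4, nrow = 2)   # columns are (1, 2) and (3, 4)

element_wise <- A * A        # each element multiplied by itself
matrix_product <- A %*% A    # rows of A combined with columns of A

print(element_wise)
print(matrix_product)
```
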

ARRAYS
In R, arrays are data objects that can store data in more than two dimensions. An array is created using the array() function, which takes a vector as input and uses the values in the dim parameter to shape it.
Ex: An array of dimension (2, 3, 4) consists of 4 rectangular matrices, each with 2 rows and 3 columns.


Syntax:
array_name <- array(data, dim = c(row_size, column_size, matrices), dimnames = dim_names)
Where data: the input vector given to the array.
row_size: the number of rows in each matrix of the array.
column_size: the number of columns in each matrix of the array.
matrices: the number of matrices that make up the array.
dim_names: a list used to change the default names of the rows, columns, and matrices.

Ex: vec1 <-c(1,3,5)

vec2 <-c(10,11,12,13,14,15)
res <- array(c(vec1,vec2), dim=c(3,3,2))
print(res)

Output:
,,1
[,1] [,2] [,3]
[1,] 1 10 13
[2,] 3 11 14
[3,] 5 12 15
,,2
[,1] [,2] [,3]
[1,] 1 10 13
[2,] 3 11 14
[3,] 5 12 15
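In the example above the input vector has 9 values but the array needs 18, so R reuses (recycles) the values, which is why both matrices are identical. The following sketch, with illustrative names arr and recycled, shows how dim shapes the data and how recycling behaves.

```r
# 24 values arranged as 4 matrices of 2 rows and 3 columns.
arr <- array(1:24, dim = c(2, 3, 4))
print(dim(arr))
print(arr[1, 1, 2])   # first element of the second matrix

# Only 6 values for 12 cells: the values are recycled,
# so the second matrix repeats the first.
recycled <- array(1:6, dim = c(2, 3, 2))
print(identical(recycled[, , 1], recycled[, , 2]))
```
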

NAMING ROWS AND COLUMNS

Names can be given to the rows, columns, and matrices of an array through the dimnames parameter of the array() function.
Ex: vec1 <- c(1,3,5) #Creating two vectors of different lengths
vec2 <- c(10,11,12,13,14,15)
col_names <- c("Col1","Col2","Col3") #Initializing names for rows, columns and matrices
row_names <- c("Row1","Row2","Row3")
matrix_names <- c("Matrix1","Matrix2")
res <- array(c(vec1,vec2), dim=c(3,3,2), dimnames=list(row_names, col_names, matrix_names))
print(res)

Output:
, , Matrix1
Col1 Col2 Col3
Row1 1 10 13
Row2 3 11 14
Row3 5 12 15

, , Matrix2
Col1 Col2 Col3
Row1 1 10 13
Row2 3 11 14
Row3 5 12 15

ACCESSING ARRAY ELEMENTS

Array elements are accessed by index, giving the row, column, and matrix positions in square brackets.
Ex: For the array created above
print(res[3, ,2]) #To print the third row of the second matrix
Output: 5 12 15
print(res[3,2,2]) #To print the element in the third row and second column of the 2nd matrix
Output: 12
print(res[ ,2,1]) #To print the second column of the 1st matrix
Output: 10 11 12

MANIPULATION OF ELEMENTS
An array is made up of matrices, so operations on the elements of an array are carried out by accessing the matrices it contains.
Ex: #Creating two vectors of different lengths
vec1 <- c(1,3,5)
vec2 <- c(10,11,12,13,14,15)
res1 <- array(c(vec1,vec2),dim=c(3,3,1))
print(res1)
vec1 <- c(8,4,7)
vec2 <- c(16,73,48,46,36,73)
res2 <- array(c(vec1,vec2),dim=c(3,3,1))
print(res2)
#Extracting the matrix from each array and adding them
mat1 <- res1[,,1]
mat2 <- res2[,,1]
res3 <- mat1+mat2
print(res3)

Output:
,,1
[,1] [,2] [,3]
[1,] 1 10 13
[2,] 3 11 14
[3,] 5 12 15
,,1
[,1] [,2] [,3]
[1,] 8 16 46
[2,] 4 73 36
[3,] 7 48 73

[,1] [,2] [,3]

[1,] 9 26 59
[2,] 7 84 50
[3,] 12 60 88


LISTS
In R, a list is a data structure whose components can be of mixed data types. A list can contain elements of different types, such as numbers, vectors, strings, and even another list.
A list is created with the list() function.
Ex: list_1 <- list(1,2,3)
list_2 <- list("Shubham","Arpita","Vaishali")
print(list_1)
print(list_2)

Output (the index labels [[1]], [[2]], ... that R prints before each element are omitted here):
1
2
3
"Shubham"
"Arpita"
"Vaishali"

Ex: list_data <- list("Shubham","Arpita",c(1,2,3,4,5), TRUE, FALSE, 22.5, 12L)
print(list_data)

Output:
"Shubham"
"Arpita"
1 2 3 4 5
TRUE
FALSE
22.5
12

GIVING A NAME TO LIST ELEMENTS

Printing list data by name takes three steps:
1. Create a list.
2. Assign names to the list elements with the names() function.
3. Print the list data.
Ex: list_data <- list(c("Shubham","Nishka","Gunjan"), matrix(c(40,80,60,70,90,80), nrow = 2),
list("BCA","MCA","B.tech"))
names(list_data) <- c("Students", "Marks", "Course")
print(list_data)

Output:
$Students
[1] "Shubham" "Nishka" "Gunjan"
$Marks
[,1] [,2] [,3]
[1,] 40 60 90
[2,] 80 70 80
$Course
$Course[[1]]
[1] "BCA"
$Course[[2]]
[1] "MCA"
$Course[[3]]
[1] "B.tech"

ACCESSING LIST ELEMENTS

R provides two ways to access the elements of a list.
1) By index, in the same way as for a vector.
2) By name.
Single brackets [ ] return a sub-list, so the element is printed together with its index or name.

Ex: Accessing elements using index

list_data <- list(c("Shubham","Arpita","Nishka"), matrix(c(40,80,60,70,90,80), nrow = 2),
list("BCA","MCA","B.tech"))
print(list_data[1])

Output:
[[1]]
[1] "Shubham" "Arpita" "Nishka"

Ex: Accessing elements using names

list_data <- list(c("Shubham","Arpita","Nishka"), matrix(c(40,80,60,70,90,80), nrow = 2),
list("BCA","MCA","B.tech"))
names(list_data) <- c("Student", "Marks", "Course")
print(list_data["Student"])

Output:
$Student
[1] "Shubham" "Arpita" "Nishka"
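Besides single brackets, R also offers double brackets [[ ]] and the $ operator, which return the element itself rather than a sub-list. A minimal sketch with an illustrative list named student_info:

```r
student_info <- list(Student = c("Shubham", "Arpita"), Marks = c(40, 80))

sub_list <- student_info["Student"]     # still a list of length 1
element <- student_info[["Student"]]    # the character vector itself
second_mark <- student_info$Marks[2]    # $ also returns the element

print(class(sub_list))
print(class(element))
print(second_mark)
```
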

MANIPULATION OF LIST ELEMENTS

R allows us to add, delete, or update elements in a list.
Ex: list_data <- list(c("Shubham","Arpita","Nishka"), matrix(c(40,80,60,70,90,80), nrow = 2),
list("BCA","MCA","B.tech"))
names(list_data) <- c("Student", "Marks", "Course")
list_data[4] <- "Moradabad" #Adding an element at the end of the list
print(list_data[4])
list_data[4] <- NULL #Removing the last element
print(list_data[4])
list_data[3] <- "Masters of computer applications" #Updating the 3rd element
print(list_data[3])

Output:
[[1]]
[1] "Moradabad"
$<NA>
NULL
$Course
[1] "Masters of computer applications"


CONVERTING LIST TO VECTOR

Lists have a drawback: arithmetic operations cannot be applied directly to list elements. This drawback is overcome with the unlist() function, which converts a list into a vector.
Ex: list1 <- list(1:5)
print(list1)
list2 <-list(10:14)
print(list2)
v1 <- unlist(list1)
v2 <- unlist(list2)
result <- v1+v2
print(result)

Output: 1 2 3 4 5
10 11 12 13 14
11 13 15 17 19

MERGING LISTS
R allows us to merge two or more lists into one list. To merge lists, pass all of them to the c() function, which returns a single list containing all the elements of the given lists. Passing them to list() instead would produce a list of two lists.

Ex: Even_list <- list(2,4,6,8)
Odd_list <- list(3,5,7,9)
merged.list <- c(Even_list,Odd_list) # Merging the two lists.
print(merged.list)

Output:
2
4
6
8
3
5
7
9
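The difference between combining lists with c() and wrapping them with list() can be seen from the length of the result. The names flat and nested below are illustrative.

```r
even_list <- list(2, 4, 6, 8)
odd_list <- list(3, 5, 7, 9)

flat <- c(even_list, odd_list)       # one list with eight elements
nested <- list(even_list, odd_list)  # a list holding two lists

print(length(flat))
print(length(nested))
```
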

DATA FRAMES
A data frame is a two-dimensional, table-like structure in which each column contains the values of one variable and each row contains one set of values from each column.
A data frame is a special case of a list in which every component has the same length. A matrix can contain only one type of data, but a data frame can contain different data types such as numeric, character, and factor.
A data frame has the following characteristics.
• The column names should be non-empty.
• The row names should be unique.
• The data stored in a data frame can be of factor, numeric, or character type.
• Each column contains the same number of data items.

In R, data frames are created with the data.frame() function. Its arguments are vectors of any type, such as numeric, character, or integer.
Ex: Create a data frame that contains employee id (integer vector), employee name (character vector), salary (numeric vector), and starting date (Date vector).
emp.data <- data.frame(employee_id = c(1:5),
employee_name = c("Shubham","Arpita","Nishka","Gunjan","Sumit"),
sal = c(623.3,915.2,611.0,729.0,843.25),
starting_date = as.Date(c("2012-01-01", "2013-09-23", "2014-11-15", "2014-05-11",
"2015-03-27")),
stringsAsFactors = FALSE)
print(emp.data)

Output:
employee_id employee_name sal starting_date
1 1 Shubham 623.30 2012-01-01
2 2 Arpita 915.20 2013-09-23
3 3 Nishka 611.00 2014-11-15
4 4 Gunjan 729.00 2014-05-11
5 5 Sumit 843.25 2015-03-27

EXTRACTING DATA FROM DATA FRAME

Data can be extracted from a data frame in three ways:
• Extract specific columns using the column names.
• Extract specific rows.
• Extract specific rows together with specific columns.

Extracting the specific columns from a data frame

Ex: For the above created data frame
final <- data.frame(emp.data$employee_id, emp.data$sal)
print(final)

Output:
emp.data.employee_id emp.data.sal
1 1 623.30
2 2 915.20
3 3 611.00
4 4 729.00
5 5 843.25

Extracting the specific rows from a data frame

Ex: final <- emp.data[1,]
print(final)

Output:
employee_id employee_name sal starting_date
1 1 Shubham 623.3 2012-01-01

Extracting specific rows corresponding to specific columns

Ex: # Extracting 2nd and 3rd row corresponding to the 1st and 4th column
final <- emp.data[c(2,3),c(1,4)]
print(final)

Output:
employee_id starting_date
2 2 2013-09-23
3 3 2014-11-15
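Rows can also be selected by a condition instead of by position. The sketch below uses a small hypothetical data frame named staff (not the emp.data frame above) so that it runs on its own.

```r
# Hypothetical data frame used only for this illustration.
staff <- data.frame(id = 1:4,
                    name = c("A", "B", "C", "D"),
                    sal = c(623.3, 915.2, 611.0, 729.0),
                    stringsAsFactors = FALSE)

# Keep the rows whose salary exceeds 700, with all columns.
high_paid <- staff[staff$sal > 700, ]
print(high_paid)

# Same rows, but only the name column.
high_paid_names <- staff[staff$sal > 700, "name"]
print(high_paid_names)
```
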

MODIFICATION IN DATA FRAME

Rows and columns can be added to a data frame using the rbind() and cbind() functions. Both return a new data frame; assign the result to a variable to keep it.
• rbind(): used to add a row to the data frame
Syntax: rbind(dataframe_name, new_row_variable)

Ex: x <- list(6,"Vaishali",547,"2015-09-01")
rbind(emp.data, x)

Output:
employee_id employee_name sal starting_date
1 1 Shubham 623.30 2012-01-01
2 2 Arpita 915.20 2013-09-23
3 3 Nishka 611.00 2014-11-15
4 4 Gunjan 729.00 2014-05-11
5 5 Sumit 843.25 2015-03-27
6 6 Vaishali 547.00 2015-09-01

• cbind(): used to add a column to the data frame
Syntax: cbind(dataframe_name, new_column_variable)

Ex: y <- c("Moradabad","Lucknow","Etah","Sambhal","Khurja")
cbind(emp.data, Address=y)

Output:
employee_id employee_name sal starting_date Address
1 1 Shubham 623.30 2012-01-01 Moradabad
2 2 Arpita 915.20 2013-09-23 Lucknow
3 3 Nishka 611.00 2014-11-15 Etah
4 4 Gunjan 729.00 2014-05-11 Sambhal
5 5 Sumit 843.25 2015-03-27 Khurja

NON-NUMERIC VALUES
LOGICAL VALUES
• Logical values can take only two values: TRUE or FALSE.
• Logical values represent binary states such as yes/no and one/zero.
• Logical values are used to indicate whether a condition has been met or not.
• Logical values are written as TRUE and FALSE, which can be abbreviated to T and F.

Assigning logical values
Ex: b1 <- TRUE
b2 <- FALSE

Vectors can be filled with logical values using T or F.

Ex: myvec <- c(T,T,F,F,F)

The length of a vector can be determined using the `length` function.

Ex: length(myvec) # It returns 5

A Logical Outcome: Relational Operators

• Relational operators are used to find the relationship between two operands.
• The output of a relational expression is either TRUE or FALSE.
• The two operands may be constants, variables, or expressions.
Ex: 4 > 5 returns FALSE
4 < 5 returns TRUE
4 >= 5 returns FALSE
4 <= 5 returns TRUE
4 == 5 returns FALSE
4 != 5 returns TRUE


any & all Functions

any(): Checks whether at least one element of a vector meets a specific condition. It returns TRUE if any element satisfies the condition; otherwise, it returns FALSE.
Ex: vector1 <- c(1, 2, 3, 4, 5)
result <- any(vector1 > 3)
print(result)

Output:
TRUE

all(): Checks whether all elements of a vector meet a specific condition. It returns TRUE if all elements satisfy the condition; otherwise, it returns FALSE.
Ex: vector2 <- c(1, 2, 3, 4, 5)
result <- all(vector2 > 0)
print(result)

Output:
TRUE

SHORT AND LONG VERSIONS

There are two versions of the logical operators: i) short versions ii) long versions.
i) Short versions are for element-wise comparisons. Short versions return multiple logical values.
Ex: `&`, `|`
Element-wise comparisons are performed when comparing two vectors of equal length.
Element-wise comparisons return a vector of logical values.
Ex: `b1 <- c(T, F, F)`
`b2 <- c(F, T, F)`
`b1 & b2` returns `FALSE FALSE FALSE`.
`b1 | b2` returns `TRUE TRUE FALSE`.

ii) Long versions are for comparing individual values. Long versions return a single logical value.
Ex: `&&`, `||`
The long versions expect operands of length one. In versions of R before 4.3.0, applying them to longer vectors evaluated only the first pair of logicals; from R 4.3.0 onward this is an error, so the elements to compare are selected explicitly.
Ex: `b1 <- c(T, F, F)`
`b2 <- c(F, T, F)`
`b1[1] && b2[1]` returns `FALSE`.
`b1[1] || b2[1]` returns `TRUE`.
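The two versions typically appear in different places: the short form when testing every element of a vector, the long form inside an if condition on a single value. A minimal sketch with an illustrative vector scores:

```r
scores <- c(35, 62, 48, 90)

# Short version: one result per element.
in_range <- scores >= 40 & scores <= 70
print(in_range)

# Long version: a single TRUE or FALSE for an if condition.
x <- scores[2]
if (x >= 40 && x <= 70) {
  label <- "in range"
} else {
  label <- "out of range"
}
print(label)
```
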

LOGICAL SUBSETTING AND EXTRACTION

Logical subsetting and extraction use logical conditions to select the elements that satisfy a particular criterion.
• First, create a logical vector of TRUE and FALSE values based on the condition.
• Then, use this vector to subset or extract elements.
Ex: numeric_vector <- c(1, 2, 3, 4, 5)
even_numbers <- numeric_vector[numeric_vector %% 2 == 0]
print(even_numbers)

Output:
2 4
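The two steps can also be written separately, which makes the intermediate logical vector visible. The sketch below additionally uses the base functions which() and sum() to report the positions and the count of matching elements; the data are illustrative.

```r
values <- c(4, 9, 12, 7, 15)

is_large <- values > 8             # step 1: the logical vector
large_values <- values[is_large]   # step 2: use it to subset

positions <- which(is_large)       # indices of the TRUE entries
how_many <- sum(is_large)          # TRUE counts as 1, FALSE as 0

print(large_values)
print(positions)
print(how_many)
```
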

STRING
• A string is a data type.
• It is used to represent text or character data.
• Strings can consist of almost any combination of characters, including numbers.
• Strings are commonly used for storing and manipulating textual information.
For ex: names, sentences, and text data extracted from files or databases.
Strings can be created by using single or double quotation marks.
Ex: single_quoted <- 'This is a single-quoted string.'
double_quoted <- "This is a double-quoted string."

nchar(): It determines the number of characters in a given string, that is, the length of the string.
Ex: my_string <- "Hello, World!"
string_length <- nchar(my_string)
cat("The length of the string is:", string_length)


Output:
The length of the string is: 13

CONCATENATION
Two main functions are used for concatenating strings: `cat` and `paste`.
1) Using the cat() Function
cat() concatenates its arguments and prints the result, with an optional separator. It prints the text directly, without quotation marks, and does not return the combined string.
Ex: cat("Hello", "World")

Output:
Hello World

2) Using the paste() Function

paste() can be used to concatenate multiple strings into one, with optional separator and other
arguments.
Ex: concatenated <- paste("Hello", "World")

Output:
"Hello World"

An optional argument `sep` is used as a separator between concatenated strings.

Ex: concatenated <- paste("Hello", "World", sep = ", ")

Output:
"Hello, World"
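paste() also works on vectors. As a brief sketch, the base function paste0() joins without any separator, and the collapse argument of paste() merges a whole vector into one string. The names labels and joined are illustrative.

```r
# paste0() is paste() with sep = "": one result per vector element.
labels <- paste0("item", 1:3)
print(labels)

# collapse joins all elements of the vector into a single string.
joined <- paste(labels, collapse = ", ")
print(joined)
```
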

ESCAPE SEQUENCES
• The backslash (\) is used to invoke an escape sequence.
• Escape sequences make it possible to enter characters that control the format and spacing of a string.
Ex: `\n` starts a newline.
`\t` represents a horizontal tab.
`\b` invokes a backspace.
`\\` is used to include a single backslash.
`\"` includes a double quote.
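Escape sequences are interpreted when the string is displayed with cat(), and each one counts as a single character. A small sketch with an illustrative string:

```r
# Tabs align the values, \n breaks the line, \\ yields one backslash.
text <- "Name:\tR\nPath:\tC:\\data"
cat(text, "\n")

# An escape sequence is stored as one character.
newline_length <- nchar("a\nb")
backslash_length <- nchar("\\")
print(newline_length)
print(backslash_length)
```
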


SUBSTRINGS AND MATCHING

PATTERN MATCHING
Pattern matching makes it possible to inspect a given string and identify smaller strings within it.
1. substr(): it is used to extract substrings within a given string
Ex: original_string <- "Hello, World!"
substring <- substr(original_string, start = 1, stop = 5)

Output:
"Hello"

2. sub(): It is used for replacing the first occurrence of a pattern within a string
Ex: text <- "I like apples, but apples are red."
new_text <- sub("apple", "banana", text)

Output:
I like bananas, but apples are red.

3. gsub(): It is used for replacing all occurrences of a pattern within a string.

Ex: text <- "I like apples, but apples are red."
new_text <- gsub("apple", "banana", text)

Output:
I like bananas, but bananas are red.
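By default, sub() and gsub() treat the pattern as a regular expression, in which some characters such as the dot have a special meaning. The sketch below, using an illustrative file name, shows how the fixed = TRUE argument makes the pattern match literally.

```r
file_name <- "file.v1.txt"

# As a regular expression, "." matches any character,
# so the first character of the string is replaced.
regex_first <- sub(".", "_", file_name)

# With fixed = TRUE the dot is matched literally.
fixed_first <- sub(".", "_", file_name, fixed = TRUE)
fixed_all <- gsub(".", "_", file_name, fixed = TRUE)

print(regex_first)
print(fixed_first)
print(fixed_all)
```
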

SPECIAL VALUES
When a data set has missing observations, or when a practically infinite number is calculated, R has some special terms reserved for these situations.
They are
• Inf and -Inf: When a number is too large for R to represent, the value is given as infinite.
Ex: 1 / 0 Output: Inf
Inf + 1 Output: Inf

• NaN (Not a Number): In some situations it is impossible to express the result of a calculation as a number; in such cases NaN is given as the output.
Ex: -Inf + Inf Output: NaN
Inf / Inf Output: NaN
0 / 0 Output: NaN

• NULL: This value is used to explicitly define an empty entity.
Ex: f <- NULL
print(f) Output: NULL

• NA (Not Available): If a value is not defined or is missing, for example when an index is out of range, NA is printed as the output.
Ex: X <- c(1,2,3)
X[4] Output: NA
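Special values are detected with dedicated base functions rather than with ==. A minimal sketch on an illustrative vector x:

```r
x <- c(1, NA, 3, NaN, Inf)

missing_flags <- is.na(x)      # TRUE for NA and also for NaN
nan_flags <- is.nan(x)         # TRUE only for NaN
finite_flags <- is.finite(x)   # FALSE for NA, NaN, Inf and -Inf

print(missing_flags)
print(nan_flags)
print(finite_flags)

# Keep only the ordinary numbers before computing a summary.
clean_mean <- mean(x[finite_flags])
print(clean_mean)

# NULL is an empty object of length zero.
print(is.null(NULL))
print(length(NULL))
```
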

COERCION
In R programming, converting from one object or data type to another is referred to as coercion.
There are two types of coercion
1. Implicit coercion: This type of coercion occurs automatically.
Ex: In arithmetic, the logical value TRUE is treated as 1 and FALSE is treated as 0.
2. Explicit coercion: This type of coercion is requested by the programmer with the as-dot coercion functions. The is-dot object-checking functions are used alongside them to test what type an object currently has.

Is-Dot Object-Checking Functions

These check whether an object is of a specific class or data type and return a TRUE or FALSE logical value.
Ex: num.vec1 <- 1:4
print(num.vec1)
is.integer(num.vec1) # TRUE
is.numeric(num.vec1) # TRUE
is.matrix(num.vec1) # FALSE
is.data.frame(num.vec1) # FALSE
is.vector(num.vec1) # TRUE
is.logical(num.vec1) # FALSE

As-Dot Coercion Functions

Explicit coercion is achieved with the as-dot functions.
Ex: as.numeric(c(T,F,F,T)) # 1 0 0 1
foo <- 34
foo.ch <- as.character(foo) # "34"
as.logical(c("1","0","1","0","0")) # NA NA NA NA NA
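A few more cases of both kinds of coercion, as a sketch with illustrative variable names:

```r
# Implicit coercion: logicals become 1 and 0 in arithmetic.
true_count <- sum(c(TRUE, FALSE, TRUE))
print(true_count)

# Implicit coercion in c(): mixed types fall back to character.
mixed <- c(1, "a", TRUE)
print(mixed)

# Explicit coercion with as-dot functions.
twelve_plus_one <- as.integer("12") + 1L
print(twelve_plus_one)

# Only recognised spellings convert to logical; others give NA.
flags <- as.logical(c("TRUE", "T", "yes"))
print(flags)

# Text that is not a number becomes NA (R also issues a warning,
# suppressed here to keep the output clean).
not_a_number <- suppressWarnings(as.numeric("abc"))
print(not_a_number)
```
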

BASIC PLOTTING

Coordinates are usually written as a pair: (x value, y value). The R function plot() is used to
plot graphs. It takes two vectors: one vector of x locations and one vector of y locations.
plot(): By default this function draws points (markers).
Ex: foo <- c(1.1, 2, 3.5, 3.9, 4.2)
bar <- c(2, 2.2, -1.3, 0, 0.2)
plot(foo,bar)

Output: [Figure: scatter plot of bar against foo]

GRAPHICAL PARAMETERS
A wide range of graphical parameters can be supplied as arguments to the plot() function.
• type: The type parameter tells R how to plot the supplied coordinates.
The default value for type is "p", which can be interpreted as "points only". Setting type="l"
means "lines only", "b" draws both points and lines, and "o" overplots the points with
lines. The option type="n" results in no points or lines being plotted.
Ex: foo <- c(1.1, 2, 3.5, 3.9, 4.2)
bar <- c(2, 2.2, -1.3, 0, 0.2)
plot(foo,bar,type="b")

• main, xlab, ylab: Options to include the plot title, the horizontal axis label, and the vertical
axis label, respectively. A "\n" inside the title string starts a new line.
Ex: plot(foo,bar,type="b",main="My lovely plot", xlab="x axis label", ylab="location y")
plot(foo,bar,type="b",main="My lovely plot\ntitle on two lines",xlab="", ylab="")

• col: This is the color to use for plotting points and lines. The simplest options are to use an
integer selector or a character string. The default color is integer 1 or the character string
"black". There are eight possible integer values and around 650 character strings to specify
color. Colors can also be specified using RGB (red, green, and blue) levels.
Ex: plot(foo,bar,type="b",main="My lovely plot",xlab="",ylab="",col=2)

• pch: pch stands for point character. This parameter controls the character used to plot
individual data points. You can supply a single character to use for each point, or specify
an integer value between 1 and 25. The symbols corresponding to each integer are shown below.

[Figure: plotting symbols for pch values 1 to 25]

Ex: foo <- c(1.1, 2, 3.5, 3.9, 4.2)
bar <- c(2, 2.2, -1.3, 0, 0.2)
plot(foo,bar,type="b",main="My lovely plot",xlab="",ylab="", col=4,pch=8)

• cex: It stands for character expansion. This controls the size of plotted point characters.
Ex: plot(foo,bar,type="b",main="My lovely plot",xlab="",ylab="", col=4, pch=8, cex=2.3)

• lty: It stands for line type. This specifies the type of line to use to connect the points (for
example, solid, dotted, or dashed). It takes the values 1 through 6. These options are shown
in the figure below.
Ex: plot(foo,bar,type="b",main="My lovely plot",xlab="",ylab="", col=4,pch=8,lty=2)

• lwd: It stands for line width. This controls the thickness of plotted lines.
Ex: plot(foo,bar,type="b",main="My lovely plot", xlab="",ylab="",col=4, pch=8,lty=2,
cex=2, lwd=3.3)

• xlim, ylim: These provide limits for the horizontal range and vertical range (respectively)
of the plotting region.
Ex: plot(foo,bar,type="b",main="My lovely plot",xlab="",ylab="", col=6, pch=15, lty=3,
cex=0.7,lwd=2, xlim=c(3,5), ylim=c(-0.5,0.2))
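
Restricting the limits means that some points fall outside the plotting region. The sketch below checks which of the example points lie within the chosen limits; the names x_limits, y_limits and inside are illustrative.

```r
foo <- c(1.1, 2, 3.5, 3.9, 4.2)
bar <- c(2, 2.2, -1.3, 0, 0.2)

x_limits <- c(3, 5)
y_limits <- c(-0.5, 0.2)

# TRUE for points lying inside both ranges
inside <- foo >= x_limits[1] & foo <= x_limits[2] &
          bar >= y_limits[1] & bar <= y_limits[2]
inside          # FALSE FALSE FALSE TRUE TRUE

# Only the points inside the limits are visible as markers
plot(foo, bar, type = "b", xlim = x_limits, ylim = y_limits)
```

Here only the last two points, (3.9, 0) and (4.2, 0.2), lie within both ranges.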

ADVANTAGES AND DISADVANTAGES

ADVANTAGES
• Open Source: R is an open-source language, which means we can use it without any need
for a license or a fee.

• Platform Independent: R is a platform-independent (cross-platform) programming language,
which means its code can run on all operating systems.
• Machine Learning Operations: R allows us to do various machine learning operations such
as classification and regression.
• Quality plotting and graphing: R simplifies quality plotting and graphing using the plot()
function.
• Statistics: R is mainly known as the language of statistics.

DISADVANTAGES
• Basic Security: R lacks basic security features, which are an essential part of most
programming languages. Because of this, R is difficult to embed securely in a web application.
• Lesser Speed: R is often slower than other programming languages such as MATLAB and
Python, and R packages can be slower than their counterparts in those languages.
• Complicated Language: People who do not have prior knowledge or programming
experience may find it difficult to learn R.
