Ids Unit LLL Jntuh Cse

ids unit 3
UNIT - III
Vectors: Creating and Naming Vectors, Vector Arithmetic, Vector Subsetting.
Matrices: Creating and Naming Matrices, Matrix Subsetting, Arrays, Class.
Factors and Data Frames: Introduction to Factors: Factor Levels, Summarizing a Factor, Ordered Factors, Comparing Ordered Factors, Introduction to Data Frame, Subsetting of Data Frames, Extending Data Frames, Sorting Data Frames.
Lists: Introduction, Creating a List: Creating a Named List, Accessing List Elements, Manipulating List Elements, Merging Lists, Converting Lists to Vectors.

Vectors

• A vector is a sequence of elements that all share the same data type. It is the basic data structure of R programming.
• A vector can hold logical, integer, double, character, complex, or raw data. The elements of a vector are known as its components. We can check the type of a vector with the typeof() function.
• To combine items into a vector, use the c() function and separate the items with commas.
• Vectors in R are similar to arrays in C: both hold multiple values of the same type. One key difference is that indexing in R starts from 1, not from 0. We can create numeric vectors and character vectors, among others.
• The length of a vector is the number of elements it contains, and it is obtained with the length() function.
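The points above can be tried out with a short sketch. The variable names and values here (marks, counts, grades, mixed) are made up for illustration and are not part of the original notes.

```r
# Hypothetical example values, chosen only for illustration
marks  <- c(72.5, 88, 91)          # plain numbers are stored as double
counts <- c(3L, 7L, 11L)           # the L suffix requests integer storage
grades <- c("low", "mid", "high")  # character vector

typeof(marks)    # "double"
typeof(counts)   # "integer"
typeof(grades)   # "character"
length(grades)   # 3

# Mixing types in c() coerces every element to one common type
mixed <- c(1, "a", TRUE)
typeof(mixed)    # "character"
mixed[1]         # indexing starts at 1, so this is the first element
```

Because a vector can hold only one type, c() silently converts the number and the logical value in mixed to strings.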
Types of vectors
• R uses several types of vectors. The most common ones are described below.

1. Numeric vectors
Numeric vectors contain numeric values, stored either as double (the default) or as integer.

Ex: 1

# R program to create numeric vectors

# creation of vectors using c() function
v1 <- c(4, 5, 6, 7)

# display type of vector
typeof(v1)

# the 'L' suffix specifies that we want integer values
v2 <- c(1L, 4L, 2L, 5L)

# display type of vector
typeof(v2)

O/P:
[1] "double"
[1] "integer"
2. Character vectors
Character vectors contain strings, which may include letters, digits, and special characters.

Ex: 2

# R program to create character vectors

# numeric values combined with strings
# are converted into characters
v1 <- c('geeks', '2', 'hello', 57)

# displaying type of vector
typeof(v1)

Output:
[1] "character"

3. Logical vectors
Logical vectors contain the boolean values TRUE and FALSE, and NA for missing values.

Ex: 3

# R program to create logical vectors

# creating logical vector using c() function
v1 <- c(TRUE, FALSE, TRUE, NA)

# displaying type of vector
typeof(v1)

Output:
[1] "logical"
Creating and Naming Vectors
We use the c() function to create a vector. This function returns a one-dimensional array, or simply a vector. The c() function is a generic function which combines its arguments.
All arguments are coerced to a common data type, which is the type of the returned value. There are several other ways to create a vector as well.

Types:

1. Using c() function
2. Using the colon (:) operator
3. Using the seq() function
4. Using assign() function

1. Using c() function

The c in c() stands for 'combine'. The function takes the values passed as arguments and returns them combined into a single vector.
EX: 1

# R program to create vectors

# we can use the c function
# to combine the values as a vector.
# By default the type will be double
X <- c(61, 4, 21, 67, 89, 2)

cat('using c function', X, '\n')

# print the values, option 1
X
# print the values, option 2
# print(X)

O/P:
using c function 61 4 21 67 89 2
[1] 61  4 21 67 89  2


2. Using the colon (:) operator

The colon operator (":") generates regular sequences in steps of 1. It is most commonly used in for loops, for indexing, and to create a vector with an increasing or decreasing sequence. It is a binary operator, i.e. it takes two arguments.
Syntax: z <- x:y
EX: 1
# Vector with numerical values in a sequence
numbers <- 1:10
numbers

EX: 2
# Vector with decimal values in a sequence
numbers1 <- 1.5:6.5
numbers1

3. Using the seq() function

In R, we can also create a vector with the seq() function, which generates a sequence of elements as a vector.
The seq() function is used in two ways: by setting the step size with the 'by' parameter, or by specifying the length of the sequence with the 'length.out' parameter.

EX: 1

seq_vec <- seq(1, 4, by = 0.5)
seq_vec
class(seq_vec)
Output
[1] 1.0 1.5 2.0 2.5 3.0 3.5 4.0
[1] "numeric"

EX: 2

seq_vec <- seq(1, 4, length.out = 10)
seq_vec
class(seq_vec)

Output
 [1] 1.000000 1.333333 1.666667 2.000000 2.333333 2.666667 3.000000 3.333333
 [9] 3.666667 4.000000
[1] "numeric"

EX: 3

seq_vec <- seq(1, 4, length.out = 5)
seq_vec
class(seq_vec)

Output
[1] 1.00 1.75 2.50 3.25 4.00
[1] "numeric"

EX: 4

# Creating a sequence from 5 to 13.
v <- 5:13
print(v)
# Creating a sequence from 6.6 to 12.6.
v <- 6.6:12.6
print(v)

EX: 5
# R program to illustrate
# assigning vectors

# Assigning a vector using seq() function
V <- seq(1, 3, by = 0.2)

# Printing the vector
print(V)

Output
 [1] 1.0 1.2 1.4 1.6 1.8 2.0 2.2 2.4 2.6 2.8 3.0

4. Using assign() function

The assign() function takes the following mandatory parameters:
x: the variable name, given as a character string.
value: the value to be assigned to the variable named by x.
EX: 1

assign("vec2", c(6, 7, 8, 9, 10))
vec2

Output
[1]  6  7  8  9 10

Vector Arithmetic Operations

We can perform arithmetic operations on vectors, such as addition, subtraction, multiplication and division. Each operation is applied element by element.
Ideally the two vectors have the same length and the same type, or one of them is a single value of the same type.
If the vectors are not of the same length, vector recycling happens implicitly.
Vector Recycling:
If two vectors are of unequal length, the shorter one is recycled to match the longer vector. For example, the following vectors u and v have different lengths, and their sum is computed by recycling the values of the shorter vector u.
> u = c(10, 20, 30)
> v = c(1, 2, 3, 4, 5, 6, 7, 8, 9)
> u + v
[1] 11 22 33 14 25 36 17 28 39
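A minimal sketch of the recycling rule, reusing u and v from the text. The vector w and the use of suppressWarnings() are additions for illustration only: they show what happens when the longer length is not a multiple of the shorter one.

```r
u <- c(10, 20, 30)
v <- c(1, 2, 3, 4, 5, 6, 7, 8, 9)

# length(v) is a multiple of length(u): u is reused three times, silently
total <- u + v
total              # 11 22 33 14 25 36 17 28 39

# A single value is a vector of length 1, so it is recycled for every element
doubled <- v * 2
doubled            # 2 4 6 8 10 12 14 16 18

# 4 is not a multiple of 3: R still recycles u, but issues a warning.
# The warning is suppressed here only to keep the output clean.
w <- c(1, 2, 3, 4)
partial <- suppressWarnings(u + w)
partial            # 11 22 33 14
```

In practice the warning is worth keeping visible, since unequal lengths that do not divide evenly usually indicate a mistake.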

Types:

1. Addition
Addition operator takes two vectors as operands, and returns the result
of sum of two vectors.
a+b

Example
In the following program, we create two numeric vectors and add them using the addition operator.

Ex:1
a <- c(10, 20, 30, 40, 50)

b <- c(1, 3, 5, 7, 9)

result <- a + b

print(result)

Output
[1] 11 23 35 47 59

2. Subtraction
Subtraction operator takes two vectors as operands, and returns the
result of difference of two vectors.
a-b

Example
In the following program, we create two numeric vectors and find their difference using the subtraction operator.

Ex:1
a <- c(10, 20, 30, 40, 50)

b <- c(1, 3, 5, 7, 9)

result <- a - b

print(result)

Output
[1] 9 17 25 33 41

3. Multiplication
Multiplication operator takes two vectors as operands, and returns the element-wise product of the two vectors.
a*b

Example
In the following program, we create two numeric vectors and find their product using the multiplication operator.
Ex:
a <- c(10, 20, 30, 40, 50)

b <- c(1, 3, 5, 7, 9)

result <- a * b

print(result)

Output
[1] 10 60 150 280 450

4. Division
Division operator takes two vectors as operands, and returns the result of dividing the first vector by the second, element by element.
a/b

Example
In the following program, we create two numeric vectors and divide them using the division operator.

Ex:
a <- c(10, 20, 30, 40, 50)

b <- c(1, 3, 5, 7, 9)

result <- a / b

print(result)

Output
[1] 10.000000 6.666667 6.000000 5.714286 5.555556
Accessing elements of vectors
We can access the elements of a vector with the help of vector indexing. An index denotes the position at which a value is stored in the vector. Indexing can be performed with an integer, character, or logical vector.

1) Indexing with integer vector

Integer indexing works much as in C, C++, and Java. There is one difference: in C, C++, and Java the indexing starts from 0, but in R the indexing starts from 1. As in other programming languages, we index by specifying an integer value in square brackets [] next to our vector.

Example:

seq_vec <- seq(1, 4, length.out = 6)
seq_vec
seq_vec[2]
Output

[1] 1.0 1.6 2.2 2.8 3.4 4.0
[1] 1.6

2) Indexing with a character vector

In character vector indexing, we assign a unique name to each element of the vector when creating it. An element can then be accessed by its name. Let's see an example to understand how it is performed.

Example:

char_vec <- c("shubham" = 22, "arpita" = 23, "vaishali" = 25)
char_vec
char_vec["arpita"]

Output

 shubham   arpita vaishali
      22       23       25
arpita
    23
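Names can also be attached to a vector after it has been created, using the names() function. The sketch below reuses the values from the example above; the variable name ages is made up for illustration.

```r
ages <- c(22, 23, 25)
names(ages) <- c("shubham", "arpita", "vaishali")   # attach names after creation

names(ages)                                  # the three names, in order
picked <- ages[c("vaishali", "shubham")]     # select by name, in the order requested
picked

# unname() drops the names and leaves only the values
unname(ages["arpita"])                       # 23
```

Selecting by name keeps working even if the elements are later reordered, which is the main advantage over positional indexing.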

3) Indexing with a logical vector

Logical indexing returns the values at those positions where the index vector is TRUE. Let's see an example to understand how it is performed on vectors.

Example:

a <- c(1, 2, 3, 4, 5, 6)
a[c(TRUE, FALSE, TRUE, TRUE, FALSE, TRUE)]
Output

[1] 1 3 4 6

Vector subsetting

• Vectors are basic objects in R and they can be subsetted using the [ operator.
• EX:
> x <- c("a", "b", "c", "c", "d", "a")
> x[1] ## Extract the first element
[1] "a"
• EX:
> x[2] ## Extract the second element
[1] "b"

• The [ operator can be used to extract multiple elements of a vector by passing the operator an integer sequence. Here we extract the first four elements of the vector.
> x[1:4]
[1] "a" "b" "c" "c"
• The sequence does not have to be in order; you can specify any arbitrary integer vector.

• EX:
> x[c(1, 3, 4)]
[1] "a" "c" "c"
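Two further forms of subsetting with the [ operator are sketched below on the same vector x: negative indices, which drop elements, and a logical condition, which keeps the elements for which it is TRUE. The result names (without_first, only_c, a_positions) are introduced here for illustration.

```r
x <- c("a", "b", "c", "c", "d", "a")

without_first <- x[-1]        # drop the first element
without_first

x[-c(1, 2)]                   # drop the first two elements

only_c <- x[x == "c"]         # keep only the elements equal to "c"
only_c

a_positions <- which(x == "a")   # positions of "a" in x
a_positions
```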
Matrices: Creating and Naming Matrices

• In R, a two-dimensional rectangular data set is known as a matrix. A matrix is created by passing a vector as input to the matrix() function. On R matrices, we can perform addition, subtraction, multiplication, and division.
• In a matrix, elements are arranged in a fixed number of rows and columns, and all the elements must share a common basic type.
• To create a matrix in R you use the function called matrix(). Its first argument is the vector of elements.
• You also pass the number of rows and the number of columns you want in your matrix. Note: by default, matrices are filled in column-wise order.
• Matrices contain elements of the same atomic type. Although we can create a matrix containing only characters or only logical values, matrices most commonly contain numeric elements for use in mathematical calculations.

Syntax

• The basic syntax for creating a matrix in R is:

• matrix(data, nrow, ncol, byrow, dimnames)

Note:

Following is the description of the parameters used:

• data is the input vector which becomes the data elements of the matrix.
• nrow is the number of rows to be created.
• ncol is the number of columns to be created.
• byrow is a logical value. Matrices are filled column-wise by default; if byrow is TRUE, the input vector elements are arranged by row.
• dimnames is the names assigned to the rows and columns (a list of two character vectors: row names, then column names).

How to create a matrix in R:

• Like vectors and lists, matrices have their own constructor: R provides the matrix() function to create a matrix. This function plays an important role in data analysis. Its syntax is:

matrix(data, nrow, ncol, byrow, dimnames)

data

The first argument of the matrix function is data. It is the input vector which supplies the data elements of the matrix.

nrow

The second argument is the number of rows which we want to create in the matrix.

ncol

The third argument is the number of columns which we want to create in the matrix.

byrow

The byrow parameter is a logical value. If it is TRUE, the input vector elements are arranged by row.

dimnames

The dimnames parameter is a list holding the names assigned to the rows and columns.

Example to understand how the matrix function is used to create a matrix and arrange the elements sequentially by row or column.

EX:

# Arranging elements sequentially by row.
P <- matrix(c(5:16), nrow = 4, byrow = TRUE)
print(P)

Output
     [,1] [,2] [,3]
[1,]    5    6    7
[2,]    8    9   10
[3,]   11   12   13
[4,]   14   15   16

# Arranging elements sequentially by column.
Q <- matrix(c(3:14), nrow = 4, byrow = FALSE)
print(Q)

     [,1] [,2] [,3]
[1,]    3    7   11
[2,]    4    8   12
[3,]    5    9   13
[4,]    6   10   14

# Defining the column and row names.
row_names = c("row1", "row2", "row3", "row4")
col_names = c("col1", "col2", "col3")

R <- matrix(c(3:14), nrow = 4, byrow = TRUE, dimnames = list(row_names, col_names))
print(R)

     col1 col2 col3
row1    3    4    5
row2    6    7    8
row3    9   10   11
row4   12   13   14
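Row and column names do not have to be supplied through dimnames when the matrix is created. A minimal sketch, assuming a small made-up matrix m, shows how rownames() and colnames() can name an existing matrix:

```r
m <- matrix(1:6, nrow = 2)          # 2 x 3 matrix, filled column by column
rownames(m) <- c("r1", "r2")        # name the rows after creation
colnames(m) <- c("a", "b", "c")     # name the columns after creation

m
dim(m)            # number of rows and columns: 2 3
dimnames(m)       # list holding the row names and the column names
m["r2", "b"]      # element in row "r2", column "b": 4
```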

Accessing matrix elements in R

There are three ways to access the elements of a matrix.

1. We can access the element in the nth row and mth column.
2. We can access all the elements in the nth row.
3. We can access all the elements in the mth column.

Example

# Defining the column and row names.
row_names = c("row1", "row2", "row3", "row4")
col_names = c("col1", "col2", "col3")
# Creating matrix
R <- matrix(c(5:16), nrow = 4, byrow = TRUE, dimnames = list(row_names, col_names))
print(R)

# Accessing element present on 3rd row and 2nd column
print(R[3,2])

# Accessing elements present in 3rd row
print(R[3,])

# Accessing elements present in 2nd column
print(R[,2])

Output

     col1 col2 col3
row1    5    6    7
row2    8    9   10
row3   11   12   13
row4   14   15   16

[1] 12
col1 col2 col3
  11   12   13

row1 row2 row3 row4
   6    9   12   15

Modification of the matrix

R allows us to modify a matrix. There are several ways to do so, which are as follows:

Assign a single element

• The first method is to assign a single element to the matrix at a particular position. Assigning a new value to that position replaces the old value with the new one. The basic syntax is as follows:

matrix[n, m] <- y
Here, n and m are the row and column of the element, respectively, and y is the value which we assign to modify our matrix.

Example

# Defining the column and row names.
row_names = c("row1", "row2", "row3", "row4")
col_names = c("col1", "col2", "col3")

R <- matrix(c(5:16), nrow = 4, byrow = TRUE, dimnames = list(row_names, col_names))
print(R)

# Assigning value 20 to the element at 3rd row and 2nd column
R[3,2] <- 20
print(R)

Output

     col1 col2 col3
row1    5    6    7
row2    8    9   10
row3   11   12   13
row4   14   15   16

     col1 col2 col3
row1    5    6    7
row2    8    9   10
row3   11   20   13
row4   14   15   16
Use of Relational Operator

• R provides another way to perform matrix modification. In this method, we use relational operators such as >, <, and == to select the elements to replace. Like the first method, it is quite simple to use. Let's see an example to understand how this method modifies the matrix.

Example 1

# Defining the column and row names.
row_names = c("row1", "row2", "row3", "row4")
col_names = c("col1", "col2", "col3")

R <- matrix(c(5:16), nrow = 4, byrow = TRUE, dimnames = list(row_names, col_names))
print(R)

# Replacing elements that are equal to 12
R[R==12] <- 0
print(R)

Output

     col1 col2 col3
row1    5    6    7
row2    8    9   10
row3   11   12   13
row4   14   15   16

     col1 col2 col3
row1    5    6    7
row2    8    9   10
row3   11    0   13
row4   14   15   16

Example 2

# Defining the column and row names.
row_names = c("row1", "row2", "row3", "row4")
col_names = c("col1", "col2", "col3")

R <- matrix(c(5:16), nrow = 4, byrow = TRUE, dimnames = list(row_names, col_names))
print(R)

# Replacing elements whose values are greater than 12
R[R>12] <- 0
print(R)

Output

     col1 col2 col3
row1    5    6    7
row2    8    9   10
row3   11   12   13
row4   14   15   16

     col1 col2 col3
row1    5    6    7
row2    8    9   10
row3   11   12    0
row4    0    0    0

Addition of Rows and Columns

• The third method of matrix modification is the addition of rows and columns using the rbind() and cbind() functions. rbind() adds a row and cbind() adds a column. Both return a new matrix and leave the original unchanged unless the result is assigned back. Let's see an example to understand the working of the cbind() and rbind() functions.

Example 1

# Defining the column and row names.
row_names = c("row1", "row2", "row3", "row4")
col_names = c("col1", "col2", "col3")

R <- matrix(c(5:16), nrow = 4, byrow = TRUE, dimnames = list(row_names, col_names))
print(R)

# Adding row
rbind(R, c(17,18,19))

# Adding column
cbind(R, c(17,18,19,20))

# Transpose of the matrix using the t() function
t(R)

# Modifying the dimension of the matrix using the dim() function
dim(R) <- c(1,12)
print(R)

Output

     col1 col2 col3
row1    5    6    7
row2    8    9   10
row3   11   12   13
row4   14   15   16

     col1 col2 col3
row1    5    6    7
row2    8    9   10
row3   11   12   13
row4   14   15   16
       17   18   19

     col1 col2 col3
row1    5    6    7 17
row2    8    9   10 18
row3   11   12   13 19
row4   14   15   16 20

     row1 row2 row3 row4
col1    5    8   11   14
col2    6    9   12   15
col3    7   10   13   16

     [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12]
[1,]    5    8   11   14    6    9   12   15    7    10    13    16

Matrix operations

• In R, we can perform mathematical operations on matrices such as addition, subtraction, multiplication, etc. These operators work element by element, so both matrices should have the same dimensions.

Example 1

R <- matrix(c(5:16), nrow = 4, ncol = 3)
S <- matrix(c(1:12), nrow = 4, ncol = 3)
R
S
# Addition
sum <- R+S
print(sum)
# Subtraction
sub <- R-S
print(sub)

# Multiplication (element by element)
mul <- R*S
print(mul)

# Multiplication by constant
mul1 <- R*12
print(mul1)

# Division
div <- R/S
print(div)

Output

     [,1] [,2] [,3]
[1,]    5    9   13
[2,]    6   10   14
[3,]    7   11   15
[4,]    8   12   16

     [,1] [,2] [,3]
[1,]    1    5    9
[2,]    2    6   10
[3,]    3    7   11
[4,]    4    8   12

     [,1] [,2] [,3]
[1,]    6   14   22
[2,]    8   16   24
[3,]   10   18   26
[4,]   12   20   28

     [,1] [,2] [,3]
[1,]    4    4    4
[2,]    4    4    4
[3,]    4    4    4
[4,]    4    4    4

     [,1] [,2] [,3]
[1,]    5   45  117
[2,]   12   60  140
[3,]   21   77  165
[4,]   32   96  192

     [,1] [,2] [,3]
[1,]   60  108  156
[2,]   72  120  168
[3,]   84  132  180
[4,]   96  144  192

         [,1]     [,2]     [,3]
[1,] 5.000000 1.800000 1.444444
[2,] 3.000000 1.666667 1.400000
[3,] 2.333333 1.571429 1.363636
[4,] 2.000000 1.500000 1.333333
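Note that the * operator used above multiplies matching positions; it is not the row-by-column matrix product from linear algebra. R provides the %*% operator for that. The sketch below contrasts the two on small made-up matrices A and B, which are not part of the original example.

```r
A <- matrix(1:4, nrow = 2)       # columns are (1, 2) and (3, 4)
B <- matrix(5:8, nrow = 2)       # columns are (5, 6) and (7, 8)

elementwise <- A * B             # multiplies matching positions
product     <- A %*% B           # row-by-column matrix product

elementwise
product
```

For %*% the number of columns of the first matrix must equal the number of rows of the second; the two matrices do not need to have identical dimensions.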

Applications of matrices

1. In geology, matrices are used to record survey data, plot graphs, and compute statistics.
2. A matrix is a convenient representation for tabulating and plotting survey results.
3. In robotics and automation, matrices are the basic elements used to describe robot movements.
4. In economics, matrices are used in calculating gross domestic product, and they also help in estimating the production of goods and products.
5. In computer-based applications, matrices play a crucial role in the creation of realistic-seeming motion.
Subset matrix by row names

• The rownames(mat) function in R is used to get or set the names of the rows of a matrix.
• To subset by row name, check which row names of the matrix are present in a vector of specified row names, and keep those rows.

# creating matrix
matr <- matrix(1:12, nrow = 4)
print("Original Matrix")
print(matr)

# assigning row names to the matrix
rownames(matr) <- c("row1", "row2", "row3", "row4")

# getting rows
rows <- c("row1", "row3")

# keeping only the rows whose names appear in rows
print(matr[rownames(matr) %in% rows, ])

Matrix subsetting

• A matrix is subset with two arguments within single brackets, [], separated by a comma. The first argument specifies the rows, and the second the columns. Leaving an argument empty selects all rows or all columns.

EX:

a <- matrix(1:9, nrow = 3)
a
colnames(a) <- c("A", "B", "C")
a[1:2, ]

Output
     [,1] [,2] [,3]
[1,]    1    4    7
[2,]    2    5    8
[3,]    3    6    9
     A B C
[1,] 1 4 7
[2,] 2 5 8

Ex: 2

a <- matrix(1:9, nrow = 3)
a
colnames(a) <- c("A", "B", "C")
a[1:3, ]

Output
     [,1] [,2] [,3]
[1,]    1    4    7
[2,]    2    5    8
[3,]    3    6    9
     A B C
[1,] 1 4 7
[2,] 2 5 8
[3,] 3 6 9

Class in R programming
Classes and objects are basic concepts of object-oriented programming that revolve around real-life entities.
Everything in R is an object. An object is simply a data structure that has some methods and attributes. A class is a blueprint or sketch of these objects: it represents the set of properties or methods that are common to all objects of one type.
Unlike most other programming languages, R has three class systems. These are S3, S4, and Reference Classes.

1. S3 Class in R
• S3 is the most widely used class system in the R programming language. Most of the classes that come predefined in R are of this type. S3 is built into R, and it governs how R handles objects of different classes: certain R functions look up an object's S3 class and then behave differently in response.

• First we create a list with the required components, then we set its class using the class() function. For example,
# create a list with required components
student1 <- list(name = "John", age = 21, GPA = 3.5)

# name the class appropriately
class(student1) <- "Student_Info"

# print the object
student1

Output

$name
[1] "John"

$age
[1] 21

$GPA
[1] 3.5

attr(,"class")
[1] "Student_Info"

In the above example, we have created a list named student1 with three components. Notice how the class is set:

class(student1) <- "Student_Info"

Here, Student_Info is the name of the class. Assigning this name to class(student1) makes the student1 list an object of that class.

Finally, we have printed the student1 object of the Student_Info class.
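The statement that R functions "behave differently in response" to an object's S3 class can be illustrated with a method for the print() generic. This is a minimal sketch: the method print.Student_Info and its output format are assumptions made here for illustration, not something defined in the notes above.

```r
student1 <- list(name = "John", age = 21, GPA = 3.5)
class(student1) <- "Student_Info"

# An S3 method is an ordinary function named generic.class
print.Student_Info <- function(x, ...) {
  cat("Student:", x$name, "| age:", x$age, "| GPA:", x$GPA, "\n")
  invisible(x)
}

print(student1)                        # dispatches to print.Student_Info()
inherits(student1, "Student_Info")     # TRUE
```

Once the method exists, typing student1 at the console also uses it, because auto-printing calls the print() generic.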
2. S4 Class in R
S4 classes are an improvement over S3 classes. They have a formally defined structure, so all objects of the same class have the same set of slots.

In R, we use the setClass() function to define an S4 class. For example,

setClass("Student_Info", slots=list(name="character", age="numeric", GPA="numeric"))

Here, we have created a class named Student_Info with three slots (member variables): name, age, and GPA.

Now to create an object, we use the new() function. For example,

student1 <- new("Student_Info", name = "John", age = 21, GPA = 3.5)

Here, inside new(), we have provided the name of the class "Student_Info" and a value for each of the three slots.
We have successfully created the object named student1.

Example: S4 Class in R

# create a class "Student_Info" with three member variables
setClass("Student_Info", slots=list(name="character", age="numeric", GPA="numeric"))

# create an object of class
student1 <- new("Student_Info", name = "John", age = 21, GPA = 3.5)

# call student1 object
student1

Output

An object of class "Student_Info"
Slot "name":
[1] "John"

Slot "age":
[1] 21

Slot "GPA":
[1] 3.5

Here, we have created an S4 class named Student_Info using the setClass() function and an object named student1 using the new() function.
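A short sketch of what the formal definition provides in practice: slots are read and written with @, and new() rejects a value whose type does not match the slot declaration. The string "rejected" and the variable bad are illustrative choices made here.

```r
library(methods)   # normally attached already; loaded explicitly for safety

setClass("Student_Info",
         slots = list(name = "character", age = "numeric", GPA = "numeric"))
student1 <- new("Student_Info", name = "John", age = 21, GPA = 3.5)

student1@name            # slots are read with @
student1@GPA <- 3.8      # and can be reassigned the same way
student1@GPA

# A value of the wrong type is rejected when the object is created
bad <- tryCatch(
  new("Student_Info", name = "John", age = "twenty-one", GPA = 3.5),
  error = function(e) "rejected"
)
bad
```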
3. Reference Class in R
Reference classes were introduced later than the other two. They are more similar to the object-oriented programming we are used to seeing in other major programming languages.

Defining a reference class is similar to defining an S4 class. Instead of setClass() we use the setRefClass() function. For example,

# create a class "Student_Info" with three member variables
Student_Info <- setRefClass("Student_Info",
fields = list(name = "character", age = "numeric", GPA = "numeric"))

# Student_Info() is our generator function which can be used to create new objects
student1 <- Student_Info(name = "John", age = 21, GPA = 3.5)

# call student1 object
student1

Output

Reference class object of class "Student_Info"
Field "name":
[1] "John"
Field "age":
[1] 21
Field "GPA":
[1] 3.5

In the above example, we have created a reference class named Student_Info using the setRefClass() function.

Comparison Between S3 vs S4 vs Reference Class

S3 Class                           | S4 Class                           | Reference Class
Lacks formal definition            | Class defined using setClass()     | Class defined using setRefClass()
Objects are created by setting     | Objects are created using new()    | Objects are created using generator
the class attribute                |                                    | functions
Attributes are accessed using $    | Attributes are accessed using @    | Attributes are accessed using $
Methods belong to generic function | Methods belong to generic function | Methods belong to the class
Follows copy-on-modify semantics   | Follows copy-on-modify semantics   | Does not follow copy-on-modify semantics
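The last row of the comparison can be checked with a small experiment. This is a sketch under assumed names: the class Student_RC and the variables s3_a, s3_b, rc_a, rc_b, and rc_c are hypothetical and used only for this illustration.

```r
library(methods)   # normally attached already; loaded explicitly for safety

# S3-style object: modifying the second name leaves the first untouched
s3_a <- list(name = "John", GPA = 3.5)
class(s3_a) <- "Student_Info"
s3_b <- s3_a
s3_b$GPA <- 4.0
s3_a$GPA           # still 3.5 (copy-on-modify)

# Reference class object: both names refer to the same object
Student_RC <- setRefClass("Student_RC",
                          fields = list(name = "character", GPA = "numeric"))
rc_a <- Student_RC(name = "John", GPA = 3.5)
rc_b <- rc_a
rc_b$GPA <- 4.0
rc_a$GPA           # now 4.0 as well

# copy() gives an independent object when one is needed
rc_c <- rc_a$copy()
rc_c$GPA <- 2.0
rc_a$GPA           # unchanged: 4.0
```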

R factors
• A factor is a data structure used for fields which take only a predefined, finite number of values (categorical data).
• Factors are data objects used to categorize data and store it as levels.
• A factor can be built from both integer and string values, and is useful for columns that have a limited number of unique values.
• Internally a factor stores integer codes, each associated with a label. The set of allowed values is known as the levels, and by default R sorts the levels in alphabetical order.

Arguments of the factor() function
The factor() function accepts the following arguments:
a. x
It is the input vector which is to be transformed into a factor.
b. levels
It is an optional vector of the unique values which x may take.
c. labels
It is an optional character vector of labels for the levels, given in the same order as levels.
d. exclude
It is used to specify the values which we want to be excluded when forming the levels.
e. ordered
It is a logical value which determines whether the levels are ordered.
f. nmax
It is used to specify an upper bound on the number of levels.
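The levels and labels arguments, and the usual ways of summarizing a factor, can be sketched as follows. The response codes in x are made-up illustration data.

```r
# Hypothetical survey responses coded as short strings
x <- c("m", "f", "f", "m", "f", NA)

f <- factor(x, levels = c("f", "m"), labels = c("Female", "Male"))

levels(f)                # "Female" "Male"
codes <- as.integer(f)   # the integer codes stored internally
codes                    # 2 1 1 2 1 NA
counts <- summary(f)     # count per level, plus a count of NA values
counts
table(f)                 # count per level, NA omitted by default
nlevels(f)               # 2
```

summary() and table() are the two most common ways to summarize a factor; they differ mainly in how missing values are reported.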
How to create a factor?
In R, it is quite simple to create a factor. A factor is created in two steps:

1. In the first step, we create a vector.
2. In the next step, we convert the vector into a factor.

R provides the factor() function to convert a vector into a factor. The syntax of the factor() function is:

factor_data <- factor(vector)

Example

# Creating a vector as input.
data <- c("Shubham","Nishka","Arpita","Nishka","Shubham","Sumit","Nishka","Shubham","Sumit","Arpita","Sumit")

print(data)
print(is.factor(data))
# Applying the factor function.
factor_data <- factor(data)

print(factor_data)
print(is.factor(factor_data))

OUTPUT:

 [1] "Shubham" "Nishka"  "Arpita"  "Nishka"  "Shubham" "Sumit"   "Nishka"
 [8] "Shubham" "Sumit"   "Arpita"  "Sumit"
[1] FALSE
 [1] Shubham Nishka  Arpita  Nishka  Shubham Sumit   Nishka  Shubham Sumit
[10] Arpita  Sumit
Levels: Arpita Nishka Shubham Sumit
[1] TRUE

Accessing components of factor

• Like vectors, we can access the components of a factor. The process is very similar to that for vectors: we can access elements by index or by using logical vectors.
• Example

# Creating a vector as input.
data <- c("Shubham","Nishka","Arpita","Nishka","Shubham","Sumit","Nishka","Shubham","Sumit","Arpita","Sumit")

# Applying the factor function.
factor_data <- factor(data)
# Printing all elements of factor
print(factor_data)
# Accessing 4th element of factor
print(factor_data[4])
# Accessing 5th and 7th element
print(factor_data[c(5,7)])
# Accessing all elements except 4th one
print(factor_data[-4])
Output

 [1] Shubham Nishka  Arpita  Nishka  Shubham Sumit   Nishka  Shubham Sumit
[10] Arpita  Sumit
Levels: Arpita Nishka Shubham Sumit

[1] Nishka
Levels: Arpita Nishka Shubham Sumit

[1] Shubham Nishka
Levels: Arpita Nishka Shubham Sumit

 [1] Shubham Nishka  Arpita  Shubham Sumit   Nishka  Shubham Sumit   Arpita
[10] Sumit
Levels: Arpita Nishka Shubham Sumit

Modification of factor
• Like data frames, factors can be modified. We can modify a value of a factor by simply re-assigning it.
• We cannot assign a value that is outside the predefined levels: if the level is not present, the assignment produces NA with a warning.
• To insert such a value, we first have to add a level for it, and then we can assign it to our factor.
Example

# Creating a vector as input.
data <- c("Shubham","Nishka","Arpita","Nishka","Shubham")
# Applying the factor function.
factor_data <- factor(data)
# Printing all elements of factor
print(factor_data)
# Change 4th element of factor to "Arpita"
factor_data[4] <- "Arpita"
print(factor_data)
# Change 4th element of factor to "G"
factor_data[4] <- "G" # cannot assign values outside levels
print(factor_data)
# Adding the value to the levels
levels(factor_data) <- c(levels(factor_data), "G") # adding new level
factor_data[4] <- "G"
print(factor_data)
Output

[1] Shubham Nishka  Arpita  Nishka  Shubham
Levels: Arpita Nishka Shubham

[1] Shubham Nishka  Arpita  Arpita  Shubham
Levels: Arpita Nishka Shubham

Warning message:
In `[<-.factor`(`*tmp*`, 4, value = "G") :
  invalid factor level, NA generated

[1] Shubham Nishka  Arpita  <NA>    Shubham
Levels: Arpita Nishka Shubham

[1] Shubham Nishka  Arpita  G       Shubham
Levels: Arpita Nishka Shubham G

Generating Factor Levels

• R provides the gl() function to generate factor levels.
• Its main arguments are n, k, and labels.
• Here, n and k are integers which indicate how many levels we want and how many times each level is repeated.

The syntax of the gl() function is as follows:

gl(n, k, labels)

1. n indicates the number of levels.
2. k indicates the number of replications.
3. labels is a vector of labels for the resulting factor levels.

Example

gen_factor <- gl(3, 5, labels = c("BCA","MCA","B.Tech"))
gen_factor

Output

 [1] BCA    BCA    BCA    BCA    BCA    MCA    MCA    MCA    MCA    MCA
[11] B.Tech B.Tech B.Tech B.Tech B.Tech
Levels: BCA MCA B.Tech

R – Level Ordering of Factors

• Factors are data objects used to categorize data and store it as levels.
• They can be built from strings as well as integers.
• They are suited to columns that have a limited number of unique values.
Factors in R are created using the factor() function, which takes a vector as input. The c() function is used to create a vector with explicitly provided values.
Example:

x <- c("Pen", "Pencil", "Brush", "Pen",
       "Brush", "Brush", "Pencil", "Pencil")

print(x)
print(is.factor(x))

# Apply the factor function.
factor_x = factor(x)
levels(factor_x)

Output:
[1] "Pen"    "Pencil" "Brush"  "Pen"    "Brush"  "Brush"  "Pencil" "Pencil"
[1] FALSE
[1] "Brush"  "Pen"    "Pencil"

• In the above code, x is a vector with 8 elements.
• To convert it to a factor, the function factor() is used.
• The resulting factor has 8 elements and 3 levels.
• Levels are the unique values in the data. They can be found using the levels() function.

Ordering Factor Levels

• An ordered factor is an extension of a factor in which the levels have a defined order, from lowest to highest. We create one with the factor() function and its ordered argument.
Syntax: factor(data, levels = c(""), ordered = TRUE)
Parameters:
data: input vector with explicitly defined values.
levels: the levels, listed in increasing order inside c().
ordered: set to TRUE to enable ordering.

Example:

# creating size vector

size = c("small", "large", "large", "small",

"medium", "large", "medium", "medium")

# converting to factor

size_factor<- factor(size)

print(size_factor)

# ordering the levels

ordered.size<- factor(size, levels = c(

"small", "medium", "large"), ordered = TRUE)

print(ordered.size)

Output:

[1] small large large small medium large medium medium


Levels: large medium small
[1] small large large small medium large medium medium
Levels: small < medium < large
 In the above code, the size vector is created using the c() function and then converted to a factor. To order the levels, factor() is called again with the levels and ordered arguments described above, so the sizes are arranged in the order small < medium < large.

The same can be done using the ordered() function, as shown below:

Example:

# creating vector size

size = c("small", "large", "large", "small",

"medium", "large", "medium", "medium")

sizes <- ordered(c("small", "large", "large",

"small", "medium"))

# ordering the levels

sizes <- ordered(sizes, levels = c("small", "medium",


"large"))

print(sizes)

Output:
[1] small large large small medium
Levels: small < medium < large
Comparing Ordered Factors using is.ordered() Function

 is.ordered() function in R Programming Language is used


to check if the passed factor is an ordered factor.
 Syntax: is.ordered(factor)
 Parameters:
 factor: Factor to be checked

 EX: 1

# creating vector size

size = c("small", "large", "large", "small",

"medium", "large", "medium", "medium")

sizes <- ordered(c("small", "large", "large",

"small", "medium"))

# ordering the levels

sizes <- ordered(sizes, levels = c("small",


"medium", "large"))

# Checking if the factor is ordered


# using is.ordered() function

is.ordered(sizes)

 Output:
 [1] TRUE

EX: 2

# Creating a vector

x<-c("female", "male", "male", "female")

# Converting vector into factor

gender <- factor(x)

# Using is.ordered() Function

# to check if a factor is ordered

is.ordered(gender)

Output:
[1] FALSE
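Once a factor is ordered, its values can also be compared with the usual relational operators. The following is a minimal sketch, assuming a small sizes factor built for illustration; the variable names are not from the original notes.

```r
# Ordered factor with levels small < medium < large
sizes <- factor(c("small", "large", "medium", "small"),
                levels = c("small", "medium", "large"),
                ordered = TRUE)

# Compare two elements: is the first size smaller than the second?
first_lt_second <- sizes[1] < sizes[2]

# Compare every element against a level given as a string
at_least_medium <- sizes >= "medium"

# max() and min() respect the level order of an ordered factor
largest <- max(sizes)

print(first_lt_second)
print(at_least_medium)
print(largest)
```

For an unordered factor such as gender in EX: 2, these comparisons are not meaningful, which is why is.ordered() is a useful check beforehand.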
Introduction to Data Frame
R Data Frame

 A data frame is a two-dimensional, array-like structure or table in which each column contains the values of one variable and each row contains one set of values from each column. A data frame is a special case of a list in which every component has the same length.
 A data frame is used to store a data table; the vectors it holds as a list are all of equal length.
 Put simply, it is a list of equal-length vectors. A matrix can contain only one type of data, but a data frame can contain different data types such as numeric, character, factor, etc.

R Data Frame-characteristics

A data frame has the following characteristics:

 The column names should be non-empty.
 The row names should be unique.
 The data stored in a data frame can be of numeric, character, or factor type.
 Each column contains the same number of data items.

Create Data Frame

 In R, data frames are created with the data.frame() function. It accepts vectors of any type, such as numeric, character, or integer. For example, a data frame can hold an employee id (integer vector), employee name (character vector), salary (numeric vector), and starting date (Date vector).

In R, we use the data.frame() function to create a Data Frame.


The syntax of the data.frame() function is

dataframe1<- data.frame(
first_col =c(val1, val2, ...),
second_col = c(val1, val2, ...),
...
)

Here,

 first_col - a vector with values val1, val2, ... of same data type
 second_col - another vector with values val1, val2, ... of same data type and so
on

# Create a data frame
dataframe1 <- data.frame(
  Name = c("Juan", "Alcaraz", "Simantha"),
  Age = c(22, 15, 19),
  Vote = c(TRUE, FALSE, TRUE)
)

print(dataframe1)

Output

      Name Age  Vote
1     Juan  22  TRUE
2  Alcaraz  15 FALSE
3 Simantha  19  TRUE

In the above example, we have used the data.frame() function to create a data frame named dataframe1. Notice the arguments passed inside data.frame():

data.frame(
  Name = c("Juan", "Alcaraz", "Simantha"),
  Age = c(22, 15, 19),
  Vote = c(TRUE, FALSE, TRUE)
)

Here, Name, Age, and Vote are the column names for vectors of character, numeric, and logical type respectively. Finally, the data frame is printed in tabular format.

Access Data Frame Columns

 There are different ways to extract columns from a data frame. We can
use [ ] , [[ ]] , or $ to access specific column of a data frame in R. For
example,

# Create a data frame


dataframe1 <- data.frame (
Name = c("Juan", "Alcaraz", "Simantha"),
Age = c(22, 15, 19),
Vote = c(TRUE, FALSE, TRUE)
)

# pass index number inside [ ]


print(dataframe1[1])
# pass column name inside [[ ]]
print(dataframe1[["Name"]])

# use $ operator and column name


print(dataframe1$Name)

Output

Name
1 Juan
2 Alcaraz
3 Simantha
[1] "Juan" "Alcaraz" "Simantha"
[1] "Juan" "Alcaraz" "Simantha"

 In the above example, we have created a data frame named dataframe1 with three columns: Name, Age, and Vote.
 Here, we have used different operators to access the Name column of dataframe1.
 Accessing with [[ ]] or $ gives the same result. [ ] differs: it returns a data frame, whereas the other two return the column as a vector.
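The difference between the operators can be checked directly. The sketch below assumes a small data frame df with hypothetical variable names sub_df and age_vec; it only uses the operators described above.

```r
# Small data frame for illustration
df <- data.frame(
  Name = c("Juan", "Alcaraz", "Simantha"),
  Age = c(22, 15, 19)
)

sub_df  <- df["Age"]     # [ ] keeps the data frame structure (one column)
age_vec <- df[["Age"]]   # [[ ]] extracts the column as a plain vector

print(class(sub_df))
print(class(age_vec))

# A vector can be used directly in calculations
print(mean(age_vec))
print(mean(df$Age))      # $ gives the same vector as [[ ]]
```

Because [[ ]] and $ return a vector, they are the usual choice when a column is needed for arithmetic.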

Combine Data Frames


In R, we use the rbind() and the cbind() function to combine two data frames
together.
 rbind() - combines two data frames vertically
 cbind() - combines two data frames horizontally
Combine Vertically Using rbind()

If we want to combine two data frames vertically, the column names of the two data frames must be the same. For example,

# create a data frame


dataframe1<- data.frame (
Name = c("Juan", "Alcaraz"),
Age = c(22, 15)
)

# create another data frame


dataframe2<- data.frame (
Name = c("Yiruma", "Bach"),
Age = c(46, 89)
)

# combine two data frames vertically


updated<- rbind(dataframe1, dataframe2)
print(updated)

Output

Name Age
1 Juan 22
2 Alcaraz 15
3 Yiruma 46
4 Bach 89

Here, we have used the rbind() function to combine the two data
frames: dataframe1 and dataframe2 vertically.
Combine Horizontally Using cbind()

The cbind() function combines two or more data frames horizontally. For
example,

# create a data frame


dataframe1<- data.frame (
Name = c("Juan", "Alcaraz"),
Age = c(22, 15)
)

# create another data frame


dataframe2<- data.frame (
Hobby = c("Tennis", "Piano")
)

# combine two data frames horizontally


updated<- cbind(dataframe1, dataframe2)
print(updated)

Output

Name Age Hobby


1 Juan 22 Tennis
2 Alcaraz 15 Piano

Here, we have used cbind() to combine two data frames horizontally.

Note: The data frames being combined must have the same number of rows; otherwise we get an error such as: arguments imply differing number of rows.
Length of a Data Frame in R

In R, we use the length() function to find the number of columns in a data


frame. For example,

# Create a data frame


dataframe1 <- data.frame (
Name = c("Juan", "Alcaraz", "Simantha"),
Age = c(22, 15, 19),
Vote = c(TRUE, FALSE, TRUE)
)

cat("Total Elements:", length(dataframe1))

Output

Total Elements: 3

Here, we have used length() to find the total number of columns in dataframe1.
Since there are 3 columns, the length() function returns 3.
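Since length() counts columns only, the row count needs a different function. The following sketch, assuming a hypothetical data frame scores with 4 rows and 2 columns, shows nrow(), ncol(), and dim() side by side.

```r
# Hypothetical data frame with 4 rows and 2 columns
scores <- data.frame(
  Name = c("Juan", "Alcaraz", "Simantha", "Yiruma"),
  Age = c(22, 15, 19, 46)
)

n_cols <- length(scores)  # number of columns, same as ncol(scores)
n_rows <- nrow(scores)    # number of rows
shape  <- dim(scores)     # rows first, then columns

cat("Rows:", n_rows, "Columns:", n_cols, "\n")
print(shape)
```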

Note: Additional

Extracting the specific columns from a data frame

Example

# Creating the data frame.
emp.data <- data.frame(
  employee_id = c(1:5),
  employee_name = c("Shubham","Arpita","Nishka","Gunjan","Sumit"),
  sal = c(623.3,515.2,611.0,729.0,843.25),
  starting_date = as.Date(c("2012-01-01", "2013-09-23", "2014-11-15", "2014-05-11",
                            "2015-03-27")),
  stringsAsFactors = FALSE
)
# Extracting specific columns from a data frame
final <- data.frame(emp.data$employee_id, emp.data$sal)
print(final)

Output

  emp.data.employee_id emp.data.sal
1                    1       623.30
2                    2       515.20
3                    3       611.00
4                    4       729.00
5                    5       843.25

Extracting the specific rows from a data frame

Example

# Creating the data frame.
emp.data <- data.frame(
  employee_id = c(1:5),
  employee_name = c("Shubham","Arpita","Nishka","Gunjan","Sumit"),
  sal = c(623.3,515.2,611.0,729.0,843.25),
  starting_date = as.Date(c("2012-01-01", "2013-09-23", "2014-11-15", "2014-05-11",
                            "2015-03-27")),
  stringsAsFactors = FALSE
)
# Extracting the first row from a data frame
final <- emp.data[1,]
print(final)
# Extracting the last two rows from a data frame
final <- emp.data[4:5,]
print(final)

Output

  employee_id employee_name   sal starting_date
1           1       Shubham 623.3    2012-01-01

  employee_id employee_name    sal starting_date
4           4        Gunjan 729.00    2014-05-11
5           5         Sumit 843.25    2015-03-27

Extracting specific rows corresponding to specific columns

Example

# Creating the data frame.
emp.data <- data.frame(
  employee_id = c(1:5),
  employee_name = c("Shubham","Arpita","Nishka","Gunjan","Sumit"),
  sal = c(623.3,515.2,611.0,729.0,843.25),
  starting_date = as.Date(c("2012-01-01", "2013-09-23", "2014-11-15", "2014-05-11",
                            "2015-03-27")),
  stringsAsFactors = FALSE
)
# Extracting the 2nd and 3rd rows corresponding to the 1st and 4th columns
final <- emp.data[c(2,3),c(1,4)]
print(final)

Output

  employee_id starting_date
2           2    2013-09-23
3           3    2014-11-15

Modification in Data Frame

 R allows us to modify a data frame. As with matrices, we can modify a data frame through re-assignment. We can not only add rows and columns, but also delete them. The data frame is expanded by adding rows and columns.

We can

1. Add a column by adding a column vector with the help of a new column
name using cbind() function.

2. Add rows by adding new rows in the same structure as the existing data
frame and using rbind() function

3. Delete the columns by assigning a NULL value to them.

4. Delete rows by re-assigning the data frame with negative row indices.


Let's see an example to understand how the rbind() and cbind() functions work and how the modification is done in our data frame.

Example: Adding rows and columns

# Creating the data frame.
emp.data <- data.frame(
  employee_id = c(1:5),
  employee_name = c("Shubham","Arpita","Nishka","Gunjan","Sumit"),
  sal = c(623.3,515.2,611.0,729.0,843.25),
  starting_date = as.Date(c("2012-01-01", "2013-09-23", "2014-11-15", "2014-05-11",
                            "2015-03-27")),
  stringsAsFactors = FALSE
)
print(emp.data)

# Adding a row to the data frame
x <- list(6,"Vaishali",547,"2015-09-01")
rbind(emp.data,x)

# Adding a column to the data frame
y <- c("Moradabad","Lucknow","Etah","Sambhal","Khurja")
cbind(emp.data,Address=y)

Output

  employee_id employee_name    sal starting_date
1           1       Shubham 623.30    2012-01-01
2           2        Arpita 515.20    2013-09-23
3           3        Nishka 611.00    2014-11-15
4           4        Gunjan 729.00    2014-05-11
5           5         Sumit 843.25    2015-03-27
  employee_id employee_name    sal starting_date
1           1       Shubham 623.30    2012-01-01
2           2        Arpita 515.20    2013-09-23
3           3        Nishka 611.00    2014-11-15
4           4        Gunjan 729.00    2014-05-11
5           5         Sumit 843.25    2015-03-27
6           6      Vaishali 547.00    2015-09-01
  employee_id employee_name    sal starting_date   Address
1           1       Shubham 623.30    2012-01-01 Moradabad
2           2        Arpita 515.20    2013-09-23   Lucknow
3           3        Nishka 611.00    2014-11-15      Etah
4           4        Gunjan 729.00    2014-05-11   Sambhal
5           5         Sumit 843.25    2015-03-27    Khurja
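Note that rbind() and cbind() return a new data frame and leave the original unchanged, so the result has to be assigned back if the change should persist. A minimal sketch, assuming a small hypothetical data frame emp (its column names and values are illustrative only):

```r
# Hypothetical two-row data frame
emp <- data.frame(id = 1:2,
                  name = c("Shubham", "Arpita"),
                  stringsAsFactors = FALSE)

# New row with the same column structure
new_row <- data.frame(id = 3L, name = "Nishka", stringsAsFactors = FALSE)

rows_before <- nrow(emp)

# Re-assign so that emp itself is extended
emp <- rbind(emp, new_row)

# Add a column the same way; its length must match the number of rows
emp <- cbind(emp, city = c("Moradabad", "Lucknow", "Etah"))

print(emp)
```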

Example: Delete rows and columns

# Creating the data frame.
emp.data <- data.frame(
  employee_id = c(1:5),
  employee_name = c("Shubham","Arpita","Nishka","Gunjan","Sumit"),
  sal = c(623.3,515.2,611.0,729.0,843.25),
  starting_date = as.Date(c("2012-01-01", "2013-09-23", "2014-11-15", "2014-05-11",
                            "2015-03-27")),
  stringsAsFactors = FALSE
)
print(emp.data)

# Delete the first row from the data frame
emp.data <- emp.data[-1,]
print(emp.data)

# Delete a column from the data frame
emp.data$starting_date <- NULL
print(emp.data)

Output

  employee_id employee_name    sal starting_date
1           1       Shubham 623.30    2012-01-01
2           2        Arpita 515.20    2013-09-23
3           3        Nishka 611.00    2014-11-15
4           4        Gunjan 729.00    2014-05-11
5           5         Sumit 843.25    2015-03-27
  employee_id employee_name    sal starting_date
2           2        Arpita 515.20    2013-09-23
3           3        Nishka 611.00    2014-11-15
4           4        Gunjan 729.00    2014-05-11
5           5         Sumit 843.25    2015-03-27
  employee_id employee_name    sal
2           2        Arpita 515.20
3           3        Nishka 611.00
4           4        Gunjan 729.00
5           5         Sumit 843.25
Sorting Data Frames

 A data frame can be sorted in R using any of the methods listed below. The most common is the base R order() function. By default, sorting is ASCENDING.

Methods to sort a dataframe:

1. order() function (increasing and decreasing order)

2. arrange() function from dplyr package

3. setorder() function from data.table package

Method 1: Using order() function

 order() is used to sort the dataframe based on a particular column in the dataframe.

 Syntax: dataframe[order(dataframe$column_name, decreasing = TRUE), ]

where

 dataframe is the input dataframe

 column_name is the column on which the dataframe is sorted

 decreasing specifies the sorting order: if it is TRUE the dataframe is sorted in descending order, otherwise in increasing order

Return type: order() returns the index positions of the elements in sorted order; using them as row indices produces the sorted dataframe.
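To make the "index positions" idea concrete, the sketch below prints what order() returns and then sorts by two keys. The data frame marks and its columns are hypothetical, introduced only for this illustration.

```r
# Hypothetical marks table
marks <- data.frame(
  name  = c("a", "b", "c", "d"),
  dept  = c("cse", "ece", "cse", "ece"),
  score = c(70, 85, 90, 60),
  stringsAsFactors = FALSE
)

# order() returns row positions, not sorted values
idx <- order(marks$score)
print(idx)

# Using the positions as row indices gives the sorted data frame
print(marks[idx, ])

# Several keys: dept ascending, then score descending within each dept
# (negating a numeric column reverses its order)
by_dept_then_score <- marks[order(marks$dept, -marks$score), ]
print(by_dept_then_score)
```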

Example 1: R program to create a dataframe with 2 columns and display it sorted in decreasing order, first based on subjects and then based on rollno.

Coding:

# create dataframe with roll no and
# subjects columns
data = data.frame(
  rollno = c(1, 5, 4, 2, 3),
  subjects = c("java", "python", "php", "sql", "c")
)

print(data)

print("sort the data in decreasing order based on subjects ")
print(data[order(data$subjects, decreasing = TRUE), ])

print("sort the data in decreasing order based on rollno ")
print(data[order(data$rollno, decreasing = TRUE), ])


Output:

rollno subjects

1 1 java

2 5 python

3 4 php

4 2 sql

5 3 c

[1] "sort the data in decreasing order based on subjects "

rollno subjects

4 2 sql

2 5 python

3 4 php

1 1 java

5 3 c

[1] "sort the data in decreasing order based on rollno "

rollno subjects

2 5 python
3 4 php

5 3 c

4 2 sql

1 1 java

Example 2: R program to create a dataframe with 3 columns named rollno, names, and subjects, and display it sorted in increasing order based on subjects, then rollno, then names.

Coding:

# create dataframe with roll no, names
# and subjects columns
data = data.frame(
  rollno = c(1, 5, 4, 2, 3),
  names = c("sravan", "bobby", "pinkey", "rohith", "gnanesh"),
  subjects = c("java", "python", "php", "sql", "c")
)

print(data)

print("sort the data in increasing order based on subjects")
print(data[order(data$subjects, decreasing = FALSE), ])

print("sort the data in increasing order based on rollno")
print(data[order(data$rollno, decreasing = FALSE), ])

print("sort the data in increasing order based on names")
print(data[order(data$names, decreasing = FALSE), ])

Output:

rollno names subjects

1 1 sravan java

2 5 bobby python

3 4 pinkey php
4 2 rohith sql

5 3 gnanesh c

[1] "sort the data in increasing order based on subjects"

rollno names subjects

5 3 gnanesh c

1 1 sravan java

3 4 pinkey php

2 5 bobby python

4 2 rohith sql

[1] "sort the data in increasing order based on rollno"

rollno names subjects

1 1 sravan java

4 2 rohith sql

5 3 gnanesh c

3 4 pinkey php

2 5 bobby python

[1] "sort the data in increasing order based on names"


rollno names subjects

2 5 bobby python

5 3 gnanesh c

3 4 pinkey php

4 2 rohith sql

1 1 sravan java

Method 2: Using arrange() Function from dplyr.

arrange() sorts the dataframe in increasing order based on the given column of the dataframe.

Syntax: arrange(dataframe, column)

where

 dataframe is the input dataframe

 column is the name of the column on which the dataframe is sorted

arrange() belongs to the dplyr package, so the package must be installed first.

Syntax: install.packages("dplyr")

Example: R program to sort dataframe based on columns


In this program, we created three columns using the vector and

sorted the dataframe based on the subjects column

Code:

# load the package
library("dplyr")

# create dataframe with roll no, names
# and subjects columns
data = data.frame(
  rollno = c(1, 5, 4, 2, 3),
  names = c("sravan", "bobby", "pinkey", "rohith", "gnanesh"),
  subjects = c("java", "python", "php", "sql", "c")
)

# sort the data based on subjects
print(arrange(data, subjects))

Output:

rollno names subjects

1 3 gnanesh c

2 1 sravan java
3 4 pinkey php

4 5 bobby python

5 2 rohith sql

Method 3: Using setorder() from data.table package

 setorder() reorders the rows of a dataframe by reference, that is, in place without making a copy. By default it sorts in increasing order.

Syntax: setorder(dataframe, column)

 where dataframe is the input dataframe

 column is the name of the column to sort by

Example: R program to sort dataframe based on columns

In this program, we created the dataframe with three columns

using vector and sorted the dataframe using setorder function

based on subjects column

Code:

# load the library
library("data.table")

# create dataframe with roll no, names
# and subjects columns
data = data.frame(
  rollno = c(1, 5, 4, 2, 3),
  names = c("sravan", "bobby", "pinkey", "rohith", "gnanesh"),
  subjects = c("java", "python", "php", "sql", "c")
)

# sort the data based on subjects
print(setorder(data, subjects))

Output:

rollno names subjects

5 3 gnanesh c

1 1 sravan java

3 4 pinkey php

2 5 bobby python

4 2 rohith sql
Lists
 A list in R can contain many different data types. A list is a collection of data which is ordered and changeable.
 A list is a vector with heterogeneous data elements. A list in R is created with the list() function. R allows accessing elements of a list by index value. In R, the indexing of a list starts with 1, not 0 as in many other programming languages.
 In R, lists are the second type of vector. Lists are R objects which can contain elements of different types such as numbers, vectors, strings, and other lists. A list can also contain a function or a matrix as an element.
 A list is a data structure which has components of mixed data types. We can say a list is a generic vector which contains other objects.

Creating a List

 To create a list in R, use the list() function.

 Suppose we want to build a list of employee details. The example below combines a numeric vector, a character vector of employee names, and a logical vector into one list.

Example:

vec<- c(3,4,5,6)
char_vec<-c("shubham","nishka","gunjan","sumit")
logic_vec<-c(TRUE,FALSE,FALSE,TRUE)
out_list<-list(vec,char_vec,logic_vec)
out_list

Output
[[1]]
[1] 3 4 5 6

[[2]]
[1] "shubham" "nishka" "gunjan" "sumit"

[[3]]
[1] TRUE FALSE FALSE TRUE
Lists creation
 Creating a list is similar to creating a vector. In R, a vector is created with the c() function; likewise, the list() function is used to create a list.
 A list avoids the main restriction of a vector, namely that all elements must share one data type. We can add elements of different data types to a list.

Syntax

list()

Example 1: Creating list with same data type

list_1<-list(1,2,3)
list_2<-list("Shubham","Arpita","Vaishali")
list_3<-list(c(1,2,3))
list_4<-list(TRUE,FALSE,TRUE)
list_1
list_2
list_3
list_4

Output:

[[1]]
[1] 1
[[2]]
[1] 2
[[3]]
[1] 3

[[1]]
[1] "Shubham"
[[2]]
[1] "Arpita"
[[3]]
[1] "Vaishali"

[[1]]
[1] 1 2 3

[[1]]
[1] TRUE
[[2]]
[1] FALSE
[[3]]
[1] TRUE

Example 2: Creating the list with different data type

list_data<-

list("Shubham","Arpita",c(1,2,3,4,5),TRUE,FALSE,

22.5,12L)

print(list_data)

In the above example, the list() function creates a list with character, vector, logical, numeric, and integer elements. It gives the following output:

Output:

[[1]]
[1] "Shubham"
[[2]]
[1] "Arpita"
[[3]]
[1] 1 2 3 4 5
[[4]]
[1] TRUE
[[5]]
[1] FALSE
[[6]]
[1] 22.5
[[7]]
[1] 12
Giving a name to list elements
 R provides a very easy way to access elements: giving a name to each element of a list. Once the elements are named, we can access them easily. There are only three steps to print the list data corresponding to a name:
1. Create a list.
2. Assign names to the list elements with the help of the names() function.
3. Print the list data.

Example

# Creating a list containing a vector, a matrix and a list.
list_data <- list(c("Shubham","Nishka","Gunjan"),
                  matrix(c(40,80,60,70,90,80), nrow = 2),
                  list("BCA","MCA","B.tech"))
# Giving names to the elements in the list.
names(list_data) <- c("Students", "Marks", "Course")
# Show the list.
print(list_data)

Output:

$Students
[1] "Shubham" "Nishka" "Gunjan"

$Marks
[,1] [,2] [,3]
[1,] 40 60 90
[2,] 80 70 80

$Course
$Course[[1]]
[1] "BCA"

$Course[[2]]
[1] "MCA"

$Course[[3]]
[1] "B.tech"

Accessing List Elements


R provides two ways to access the elements of a list. The first is indexing, performed in the same way as for a vector. The second is access by name, which is possible only with a named list; we cannot access elements by name if the list has no names.
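As with data frames, single brackets and double brackets behave differently on lists. The sketch below assumes a small named list lst with hypothetical contents; sub_list and marks are illustrative names.

```r
# Small named list for illustration
lst <- list(Student = c("Shubham", "Arpita"), Marks = c(40, 80))

sub_list <- lst["Marks"]    # [ ] returns a list containing the element
marks    <- lst[["Marks"]]  # [[ ]] returns the element itself

print(class(sub_list))
print(class(marks))

# $ is shorthand for [[ ]] with a name
print(lst$Marks[2])

# Index and name access can be chained
print(lst[[1]][1])
```

When the element itself is needed, for example for arithmetic on the marks, [[ ]] or $ is the appropriate form.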
Example 1: Accessing elements using index

# Creating a list containing a vector, a matrix and a list

list_data <-

list(c("Shubham","Arpita","Nishka"), matrix(c(40,80,60,

70,90,80), nrow = 2),

list("BCA","MCA","B.tech"))

# Accessing the first element of the list.

print(list_data[1])

# Accessing the third element. The third element is also a list, so all its elements will be printed.

print(list_data[3])

Output:

[[1]]
[1] "Shubham" "Arpita" "Nishka"

[[1]]
[[1]][[1]]
[1] "BCA"

[[1]][[2]]
[1] "MCA"

[[1]][[3]]
[1] "B.tech"
Example 2: Accessing elements using names

# Creating a list containing a vector, a matrix and a list


list_data <-
list(c("Shubham","Arpita","Nishka"), matrix(c(40,80,60,
70,90,80), nrow = 2),list("BCA","MCA","B.tech"))
# Giving names to the elements in the list.
names(list_data) <- c("Student", "Marks", "Course")
# Accessing the first element of the list.
print(list_data["Student"])
print(list_data$Marks)
print(list_data)

Output:

$Student
[1] "Shubham" "Arpita" "Nishka"

[,1] [,2] [,3]


[1,] 40 60 90
[2,] 80 70 80

$Student
[1] "Shubham" "Arpita" "Nishka"

$Marks
[,1] [,2] [,3]
[1,] 40 60 90
[2,] 80 70 80

$Course
$Course[[1]]
[1] "BCA"
$Course[[2]]
[1] "MCA"
$Course[[3]]
[1] "B.tech"

Manipulation of list elements


R allows us to add, delete, or update elements in a list. We can update an element at any position by assigning a new value to it, and remove an element at any index by assigning NULL to it. New elements are most simply added at the end of the list. The following example shows how to add, delete, and update list elements.

Example

# Creating a list containing a vector, a matrix and a list

list_data <-

list(c("Shubham","Arpita","Nishka"), matrix(c(40,80,60,

70,90,80), nrow = 2),

list("BCA","MCA","B.tech"))

# Giving names to the elements in the list.

names(list_data) <- c("Student", "Marks", "Course")


# Adding element at the end of the list.

list_data[4] <- "Moradabad"

print(list_data[4])

# Removing the last element.

list_data[4] <- NULL

# Printing the 4th Element.

print(list_data[4])

# Updating the 3rd Element.

list_data[3] <- "Masters of computer applications"

print(list_data[3])

Output:

[[1]]
[1] "Moradabad"

$<NA>
NULL

$Course
[1] "Masters of computer applications"
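Elements are not restricted to the end of the list. The following sketch, using a hypothetical list items, inserts an element in the middle with the base R append() function and removes one from the front by assigning NULL.

```r
# Hypothetical list of three strings
items <- list("a", "b", "d")

# Insert "c" after the 2nd element
items <- append(items, list("c"), after = 2)

# Remove the 1st element; the remaining elements shift left
items[[1]] <- NULL

# Update what is now the 2nd element
items[[2]] <- "C"

print(unlist(items))
```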

Converting list to vector


There is a drawback with lists: we cannot perform arithmetic operations directly on list elements. To overcome this drawback, R provides the unlist() function, which converts a list into a vector. In some cases a list must be converted into a vector so that its elements can be used for further manipulation. The unlist() function takes the list as a parameter and returns a vector.
Example

# Creating lists.
list1 <- list(1:5)
print(list1)
list2 <- list(10:14)
print(list2)
# Converting the lists to vectors.
v1 <- unlist(list1)
v2 <- unlist(list2)
print(v1)
print(v2)
# Adding the vectors.
result <- v1 + v2
print(result)

Output:

[[1]]
[1] 1 2 3 4 5

[[1]]
[1] 10 11 12 13 14

[1] 1 2 3 4 5
[1] 10 11 12 13 14
[1] 11 13 15 17 19
Merging Lists
R allows us to combine two or more lists into one list. In the example below this is done with the list() function: the lists are passed to list() as parameters, and it returns a list whose elements are the original lists, so the result is nested. To obtain a single flat list containing all the elements, the lists can be passed to c() instead.
Example

# Creating two lists.

Even_list <- list(2,4,6,8,10)

Odd_list <- list(1,3,5,7,9)

# Merging the two lists.

merged.list <- list(Even_list,Odd_list)

# Printing the merged list.

print(merged.list)

Output:

[[1]]
[[1]][[1]]
[1] 2

[[1]][[2]]
[1] 4

[[1]][[3]]
[1] 6

[[1]][[4]]
[1] 8
[[1]][[5]]
[1] 10

[[2]]
[[2]][[1]]
[1] 1

[[2]][[2]]
[1] 3

[[2]][[3]]
[1] 5

[[2]][[4]]
[1] 7

[[2]][[5]]
[1] 9
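The nested result above can be contrasted with a flat merge. This is a minimal sketch, assuming shortened versions of the two lists; nested and flat are illustrative names.

```r
# Shortened versions of the lists from the example above
even_list <- list(2, 4, 6)
odd_list  <- list(1, 3, 5)

nested <- list(even_list, odd_list)  # a list of two lists
flat   <- c(even_list, odd_list)     # one list with six elements

print(length(nested))
print(length(flat))
print(unlist(flat))
```

list() keeps each input as one element, while c() concatenates the elements of its inputs.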
Note:

The only differences between vectors, matrices, and arrays are:

 Vectors are uni-dimensional arrays

 Matrices are two-dimensional arrays

 Arrays can have more than two dimensions

R Arrays
 In R, arrays are data objects which allow us to store data in more than two dimensions. An array is created with the array() function, which takes a vector as input and uses the values in the dim parameter to set the size of each dimension.
 For example, if we create an array of dimension (2, 3, 4), it will contain 4 rectangular matrices of 2 rows and 3 columns each.
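The (2, 3, 4) case can be verified with a short sketch. The array is filled with the values 1 to 24 purely for illustration; arr and first_matrix are hypothetical names.

```r
# 2 rows x 3 columns x 4 matrices, filled column by column with 1..24
arr <- array(1:24, dim = c(2, 3, 4))

print(dim(arr))

# Each slice along the third dimension is a 2 x 3 matrix
first_matrix <- arr[, , 1]
print(dim(first_matrix))

# Single elements are addressed as [row, column, matrix]
print(arr[1, 1, 2])   # first value of the second matrix
print(arr[2, 3, 4])   # last value of the last matrix
```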

R Array Syntax

The syntax of R arrays is as follows:

array_name <- array(data, dim = c(row_size, column_size, matrices), dimnames = dim_names)

 data

The data is the first argument in the array() function. It is an input vector which is given to the array.

 row_size

This parameter defines the number of rows in each matrix of the array.

 column_size

This parameter defines the number of columns in each matrix of the array.

 matrices

This parameter defines the number of matrices the array contains.

 dim_names

This parameter is a list used to change the default names of rows, columns, and matrices.
TYPES

Uni-Dimensional Array
 A vector is a uni-dimensional array, which is specified by a
single dimension, length. A Vector can be created using ‘c()‘
function. A list of values is passed to the c() function to create
a vector.
 EX:

vec1 <- c(1, 2, 3, 4, 5, 6, 7, 8, 9)

print (vec1)

# cat is used to concatenate

# strings and print it.

cat ("Length of vector : ", length(vec1))


Output
[1] 1 2 3 4 5 6 7 8 9
Length of vector : 9

Multi-Dimensional Array
 A two-dimensional matrix is an array specified by a fixed
number of rows and columns, each containing the same data
type. A matrix is created by using array() function to which
the values and the dimensions are passed.
 EX:1

# arranges data from 2 to 13

# in two matrices of dimensions 2x3

arr = array(2:13, dim = c(2, 3, 2))

print(arr)

Output
,,1

[,1] [,2] [,3]


[1,] 2 4 6
[2,] 3 5 7

,,2
[,1] [,2] [,3]
[1,] 8 10 12
[2,] 9 11 13

Creation of arrays

 In R, array creation is quite simple. We can easily create an array using vectors and the array() function. In an array, data is stored in the form of matrices. There are only two steps to create an array:

1. First, we create two vectors of different lengths.
2. Once the vectors are created, we pass them as input to the array() function.

Example

#Creating two vectors of different lengths
vec1 <- c(1,3,5)
vec2 <- c(10,11,12,13,14,15)
#Taking these vectors as input to the array
res <- array(c(vec1,vec2))
res[1]
print(res)
Output
[1] 1
[1] 1 3 5 10 11 12 13 14 15

Naming rows and columns

 In R, we can give names to the rows, columns, and matrices of the array. This is done with the dimnames parameter of the array() function.
 It is not necessary to name the rows and columns. Names are only used to tell the rows and columns apart for better understanding.
 EX:1

#Creating two vectors of different lengths

vec1 <-c(1,3,5)

vec2 <-c(10,11,12,13,14,15)

#Initializing names for rows, columns and matrices

col_names <- c("Col1","Col2","Col3")

row_names <- c("Row1","Row2","Row3")

matrix_names <- c("Matrix1","Matrix2")

#Taking the vectors as input to the array


res <- array(c(vec1,vec2), dim = c(3,3,2),
             dimnames = list(row_names, col_names, matrix_names))

print(res)

Output
, , Matrix1

Col1 Col2 Col3


Row1 1 10 13
Row2 3 11 14
Row3 5 12 15

, , Matrix2

Col1 Col2 Col3


Row1 1 10 13
Row2 3 11 14
Row3 5 12 15

Accessing arrays
 The arrays can be accessed by using indices for different
dimensions separated by commas. Different components can
be specified by any combination of elements’ names or
positions.
Accessing Uni-Dimensional Array
The elements can be accessed by using indexes of the
corresponding elements.

EX:1

vec <- c(1:10)

# accessing entire vector

cat ("Vector is : ", vec)

# accessing elements

cat ("Third element of vector is : ", vec[3])

Output
Vector is : 1 2 3 4 5 6 7 8 9 10Third element of vector is : 3

Access Entire Row or Column

EX: 1

# create an array of two 2 by 3 matrices


array1 <- array(c(1:12), dim = c(2,3,2))

print(array1)
# access entire elements at 2nd column of 1st matrix
cat("\n2nd Column Elements of 1st matrix:",
array1[,c(2),1])

# access entire elements at 1st row of 2nd matrix


cat("\n1st Row Elements of 2nd Matrix:", array1[c(1),
,2])

Output
,,1

[,1] [,2] [,3]


[1,] 1 3 5
[2,] 2 4 6

,,2

[,1] [,2] [,3]


[1,] 7 9 11
[2,] 8 10 12

2nd Column Elements of 1st matrix: 3 4


1st Row Elements of 2nd Matrix: 7 9 11
Manipulating Array Elements

 As an array is made up of matrices in multiple dimensions, operations on the elements of an array are carried out by accessing elements of the matrices.

 EX: 1

# Create two vectors of different lengths.

vector1 <- c(5,9,3)

vector2 <- c(10,11,12,13,14,15)

# Take these vectors as input to the array.

array1 <- array(c(vector1,vector2),dim = c(3,3,2))

array1

# vector3 and vector4 are defined but not used below;
# array2 is built from vector1 and vector2 again, so it equals array1.

vector3 <- c(9,1,0)

vector4 <- c(6,0,11,3,14,1,2,6,9)

array2 <- array(c(vector1,vector2),dim = c(3,3,2))

array2

# create matrices from these arrays.

matrix1 <- array1[,,2]


matrix2 <- array2[,,2]

# Add the matrices.

result <- matrix1+matrix2

print(result)

Output
,,1

[,1] [,2] [,3]


[1,] 5 10 13
[2,] 9 11 14
[3,] 3 12 15

,,2

[,1] [,2] [,3]


[1,] 5 10 13
[2,] 9 11 14
[3,] 3 12 15

,,1

[,1] [,2] [,3]


[1,] 5 10 13
[2,] 9 11 14
[3,] 3 12 15

,,2

[,1] [,2] [,3]


[1,] 5 10 13
[2,] 9 11 14
[3,] 3 12 15

[,1] [,2] [,3]


[1,] 10 20 26
[2,] 18 22 28
[3,] 6 24 30

Calculations across array elements

 For calculations, R provides the apply() function. It takes three parameters: x, margin, and function.
 The x parameter is the array on which we have to perform the calculations.
Using apply() function
 apply() is a base R function that helps to write code in an easier and more efficient way. The examples below use it to calculate sums across the elements of an array.

The syntax for apply() is :

apply(x, margin, function)

The argument above indicates that:


x: An array, or two-dimensional data such as a matrix.
margin: Indicates the dimension over which the function is applied: c(1) for rows, c(2) for columns, and c(1,2) for both rows and columns.
function: Indicates the R built-in or user-defined function to be applied over the given data.
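The effect of the margin value can be seen on a small array. The sketch below assumes a 2 x 3 x 2 array filled with 1 to 12; it also uses margin 3, which applies the function to each matrix of the array (a standard use of apply(), though not listed above).

```r
# 2 rows x 3 columns x 2 matrices, filled with 1..12
arr <- array(1:12, dim = c(2, 3, 2))

row_sums <- apply(arr, 1, sum)  # one sum per row, across all columns and matrices
col_sums <- apply(arr, 2, sum)  # one sum per column, across all rows and matrices
mat_sums <- apply(arr, 3, sum)  # one sum per matrix

print(row_sums)
print(col_sums)
print(mat_sums)
```

Each result has one value per index of the chosen dimension, and all three sets of sums add up to the same grand total.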

EX: 1

#Creating two vectors of different lengths

vec1 <-c(1,3,5)

vec2 <-c(10,11,12,13,14,15)

#Taking the vectors as input to the array1


res1 <- array(c(vec1,vec2),dim=c(3,3,2))

print(res1)

#using apply function

result <- apply(res1,c(1),sum)

print(result)

Output
,,1

[,1] [,2] [,3]


[1,] 1 10 13
[2,] 3 11 14
[3,] 5 12 15

,,2

[,1] [,2] [,3]


[1,] 1 10 13
[2,] 3 11 14
[3,] 5 12 15

[1] 48 56 64
Using c(2) as the margin, that is result <- apply(res1,c(2),sum), sums over the columns instead:

Output
,,1

[,1] [,2] [,3]


[1,] 1 10 13
[2,] 3 11 14
[3,] 5 12 15

,,2

[,1] [,2] [,3]


[1,] 1 10 13
[2,] 3 11 14
[3,] 5 12 15

[1] 18 66 84

EX: 2

#Creating two vectors of different lengths

vec1 <-c(1,3,5)

vec2 <-c(10,11,12,13,14,15)

#Taking the vectors as input to the array1


res1 <- array(c(vec1,vec2),dim=c(3,3,2))

print(res1)

#using apply function

result <- apply(res1,c(1,2),sum)

print(result)

Output
,,1

[,1] [,2] [,3]


[1,] 1 10 13
[2,] 3 11 14
[3,] 5 12 15

,,2

[,1] [,2] [,3]


[1,] 1 10 13
[2,] 3 11 14
[3,] 5 12 15

[,1] [,2] [,3]


[1,] 2 20 26
[2,] 6 22 28
[3,] 10 24 30

IMPORTANT QUESTIONS
1) Vectors
2) Factors

3) Data Frames

4) Lists

5) Arrays

6) Matrices

7) CLASS

They may ask any subpart of the above concepts for short questions/ big questions
