DSR Unit V

The document provides an overview of various data structures in R, including Vectors, Matrices, Arrays, Data Frames, Lists, and Factors. It explains how to create, name, access, and manipulate these structures, along with examples of vector arithmetic and subsetting techniques. The document serves as a guide for organizing and processing data efficiently in R programming.

DSR: Unit – V: Vectors, Matrix, Factors, Data Frames, List

 Vectors: Creating vectors, Naming vectors, Vector arithmetic, Vector subsetting
 Matrix: Creating matrices, Naming matrices, Matrix subsetting, Array, Class
 Array: Creating arrays, Naming arrays, Accessing arrays, Array subsetting
 Data Frame: Introduction, Creating, Subsetting, Sorting, Extending data frames
 List: Introduction, Creating lists, Naming lists, Accessing lists, Manipulating lists, Merging lists, Converting a list into a vector
 Factors: Introduction, Factor levels, Ordered factors, Summarizing factors

 A data structure is a way of organizing data in computer memory.

 R supports various data structures to organize and process data efficiently:
 Vectors: One-dimensional; data is organized along a single dimension (a row or a column)
 Atomic vector: Homogeneous; all elements have the same data type
 Lists: Heterogeneous; elements may have different data types
 Matrices: Two-dimensional; data is organized in rows and columns
 Two-dimensional homogeneous data structure
 Data Frames: Two-dimensional; data is organized in rows and columns like a table
 Two-dimensional heterogeneous data structure
 Arrays: Multi-dimensional; data can be organized in more than two dimensions, like a data cube
 Factors: A data structure for fields that take only a predefined, finite set of values

 A vector is an ordered collection of elements.

• A vector is created in R with the c() function
o Syntax 1: Vname <- c(V1, V2, …, Vn)
 c() is R's built-in combine (concatenate) function
 Vname is the vector name
 Example: V1 <- c(3, 8, -5, 23)
o Syntax 2: Vname <- 1:5 (the colon operator generates the integer sequence 1, 2, 3, 4, 5)

Stanley College of Engineering and Technology for Women, Dept of IT, DSR Dr.V.Sidda Reddy
 Example : V2 <- 1:5


• List (heterogeneous): A collection of elements of different data types, created with list(); c() would coerce every element to a single type
 Example: L <- list(101, "Vani", "F", 20.5)
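
The difference between an atomic vector and a list can be checked directly. The following is a minimal sketch that reuses the values from the examples above; the name `mixed` is only illustrative.

```r
# Atomic vectors: every element has the same type
V1 <- c(3, 8, -5, 23)   # numeric vector
V2 <- 1:5               # integer sequence 1, 2, 3, 4, 5

# c() coerces mixed values to one common type (here: character)
mixed <- c(101, "Vani", "F", 20.5)
class(mixed)            # "character"

# list() keeps each element's own type
L <- list(101, "Vani", "F", 20.5)
class(L[[1]])           # "numeric"
class(L[[2]])           # "character"
```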

 Naming vectors: R provides the built-in function names() to assign names to the elements of a vector after it is created:
• Syntax: names(Vname) <- c("N1", "N2", …, "Nn")
 names() is an R built-in function
 Vname is the vector name
 Example: names(Vname) <- c("No1", "No2", "No3")
• Example:
 V1 <- c(5, 8, 7)
 names(V1) <- c("No1", "No2", "No3")
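
As a short sketch of how the assigned names can be used afterwards (the variable `second` is an illustrative name, not part of the notes):

```r
V1 <- c(5, 8, 7)
names(V1) <- c("No1", "No2", "No3")

names(V1)            # "No1" "No2" "No3"
second <- V1["No2"]  # a named element can be selected by its name
second               # value 8, printed under the name No2
```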

 R: Vector arithmetic:
 R supports the arithmetic operators (+, -, *, /) on vectors; each operation is applied element by element:
 Example:
o V1 <- c(5, 7, 8, 0)
o V2 <- c(9, 3, 0, 5)
 V1 + V2
 V1 - V2
 V1 * V2
 V1 / V2
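
The results of these four operations can be sketched as follows, using the same two vectors. The result names (`add`, `sub`, `mul`, `div`) are illustrative. Note that dividing a non-zero number by zero yields `Inf` in R rather than an error.

```r
V1 <- c(5, 7, 8, 0)
V2 <- c(9, 3, 0, 5)

add <- V1 + V2   # 14 10  8  5
sub <- V1 - V2   # -4  4  8 -5
mul <- V1 * V2   # 45 21  0  0
div <- V1 / V2   # third element is 8 / 0 = Inf, fourth is 0 / 5 = 0
```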

 Vector subsetting / indexing: R supports several ways to subset a vector
• Index range: a vector can be subset by placing an index range inside the subset operator [ ]
 V <- c(10, 20, 30, 40, 50)
 V1 <- V[1:3] → V1 = {10, 20, 30}
 V2 <- V[3:5] → V2 = {30, 40, 50}
• Positive integers: a vector of positive integers selects the elements at those positions
o Syntax: Sname <- Vname[c(I1, I2, I3)]
 Sname is the subset name
 Vname is the source vector name
 I1, I2, I3 are the indices of the source vector to include in the subset
o Example:
 V <- 1:10
 V1 <- V[c(1, 2, 3)] → 1 2 3
 Negative integers: a vector of negative integers removes the elements at those positions
o Syntax: Sname <- Vname[-c(I1, I2, I3)]
 Sname is the subset name
 Vname is the source vector name
 I1, I2, I3 are the indices of the source vector to exclude from the subset
o Example:
 V <- 1:10 → 1 2 3 4 5 6 7 8 9 10
 V1 <- V[-c(1, 2, 3, 10)] → 4 5 6 7 8 9
 Logical subsetting: a vector can be subset with a logical (relational) expression; only the elements for which the expression is TRUE are kept
o Syntax: Sname <- Vname[Vname Logic]
 Sname is the subset name
 Vname is the source vector name
 Logic is a relational expression
o Example:
 V <- c(12, 5, 34, 78, 23, 8)
 V1 <- V[V > 20] → 34 78 23
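
The four subsetting methods can be compared side by side on one vector. This is a minimal sketch; the result names (`by_range`, `by_pos`, `by_neg`, `mask`, `by_logic`) are illustrative.

```r
V <- c(10, 20, 30, 40, 50)

by_range <- V[1:3]         # index range:        10 20 30
by_pos   <- V[c(1, 3, 5)]  # positive integers:  10 30 50
by_neg   <- V[-c(1, 2)]    # negative integers:  30 40 50 (positions 1 and 2 dropped)

mask     <- V > 20         # FALSE FALSE TRUE TRUE TRUE
by_logic <- V[mask]        # logical subsetting: 30 40 50
```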

 A list is a one-dimensional, heterogeneous data structure

• A list is a generic vector that can store various kinds of data (integers, numbers, strings, logical values).
• An R list is a container whose components can themselves be vectors, arrays, data frames, or other lists
 R list creation: A list can be created with two built-in functions, list() and as.list()

• List creation with list()

o Syntax: list(data)
 data: individual values, vectors, lists
o Example: list(101, "Radha", 19.4, c(67, 78, 59))
• List creation with as.list()
o Syntax: as.list(data)
 data: a vector; each element becomes one list component
o Example: as.list(c(101, "Radha", 19.4, c(67, 78, 59)))
 Note: c() first coerces the mixed values to a character vector, so this list has six character components
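
The two creation functions give different results for the same values, which the following sketch illustrates (the names `L1` and `L2` are illustrative):

```r
# list() keeps four components, each with its own type
L1 <- list(101, "Radha", 19.4, c(67, 78, 59))
length(L1)       # 4
class(L1[[1]])   # "numeric"
class(L1[[4]])   # "numeric" (the whole vector is one component)

# c() flattens and coerces to character before as.list() sees the data
L2 <- as.list(c(101, "Radha", 19.4, c(67, 78, 59)))
length(L2)       # 6
class(L2[[1]])   # "character"
```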

 Naming lists and list components: R supports naming a list (named list) and its components, so that the list and its individual components (objects) can be accessed by name:

• Naming a list (named list):

o Syntax: Lname <- list(data)
 Lname: Name of the list
 data: individual values, vectors, lists
o Example: Emp <- list(101, "Vani", "Software", 35000.00)
• Naming list components:
o Syntax: Lname <- list(C1name = data1, C2name = data2, …)

 Lname: Name of the list
 C1name, C2name, …: Names of the list components
o Example: Emp <- list(Eno = 101, Ename = "Vani", Esal = 35000.00)

 Accessing list components/elements: In R, list components can be accessed in two ways:
1. Index: List components can be accessed by index inside [ ] or [[ ]]:
• Components are indexed in sequence [1, 2, …]: the first component has index 1, the second has index 2, and so on
o Syntax: Lname[Index] or Lname[[Index]]
 Lname: List name
 Index: Component index
 [ ] returns a sub-list containing the component; [[ ]] returns the component's value itself
o Example: Emp <- list(101, "Vani", "Software", 35000.00)
 To access the first component: Emp[[1]] → 101
 To access the second component: Emp[[2]] → "Vani"
2. Names: List components can be accessed by component name inside single brackets [ ] or double brackets [[ ]].

• Syntax: Lname["Cname"] or Lname[["Cname"]]

 Lname: List name
 Cname: Component name within double quotes " "
• Example: Emp <- list(Eno = 101, Ename = "Vani", Esal = 35000.00)
 To access the first component: Emp[["Eno"]] → 101
 To access the second component: Emp[["Ename"]] → "Vani"
 To access the third component: Emp[["Esal"]] → 35000
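
The difference between single and double brackets is easy to overlook. A minimal sketch using the Emp list (the names `sub`, `val` and `second` are illustrative):

```r
Emp <- list(Eno = 101, Ename = "Vani", Esal = 35000.00)

sub <- Emp["Eno"]     # single brackets: a list of length 1
val <- Emp[["Eno"]]   # double brackets: the value itself

class(sub)            # "list"
class(val)            # "numeric"

second <- Emp[[2]]    # access by index: "Vani"
```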

 List manipulation: A list can be modified by selecting a component by index or by name and assigning the required data/value to it.

• Syntax: Lname[Index] = data/value or Lname["Cname"] = data/value

 Lname: List name
 Index: Component index
 Cname: Component name within double quotes " "

• Example: Emp <- list(Eno = 101, Ename = "Vani", Esal = 35000.00)

 To modify the first component by index: Emp[1] = 201
 To modify the third component by name: Emp["Esal"] = 45000.00
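
A short sketch of these modifications follows. The component name `Edept` is a hypothetical addition, used only to show that assigning to a name that does not yet exist appends a new component; this is also why a misspelled name such as "Esal " with a trailing space would add a component instead of updating Esal.

```r
Emp <- list(Eno = 101, Ename = "Vani", Esal = 35000.00)

Emp[1] <- 201             # modify the first component by index
Emp["Esal"] <- 45000.00   # modify the third component by name

# Assigning to a new name appends a component (Edept is hypothetical)
Emp[["Edept"]] <- "IT"
length(Emp)               # 4
```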
 List merging: Two or more lists can be merged with the concatenation function c().

• Syntax: Cname <- c(List1, List2)

 Cname: Name of the concatenated list
 List1, List2: Names of the lists to merge (concatenate)
• Example:
 Emp1 <- list(Eno = 101, Ename = "Vani", Esal = 35000.00)
 Emp2 <- list(Eno = 102, Ename = "Ravi", Esal = 25000.00)
 Emp12 <- c(Emp1, Emp2)

 Converting a list into a vector: R provides the built-in function unlist() to convert a list into a vector

• Syntax: unlist(Lname)
 Lname: Name of the list to convert into a vector
 Example:
 Emp <- list(Eno = 101, Ename = "Vani", Esal = 35000.00)
 unlist(Emp)
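
Merging and unlisting can be sketched together with the employee lists from the examples. The variable `v` is an illustrative name. Because the list mixes numbers and strings, unlist() returns a character vector.

```r
Emp1 <- list(Eno = 101, Ename = "Vani", Esal = 35000.00)
Emp2 <- list(Eno = 102, Ename = "Ravi", Esal = 25000.00)

Emp12 <- c(Emp1, Emp2)   # one flat list with six components
length(Emp12)            # 6
names(Emp12)             # component names repeat: Eno, Ename, Esal, Eno, Ename, Esal

v <- unlist(Emp1)        # named vector; mixed types are coerced to character
class(v)                 # "character"
v[["Ename"]]             # "Vani"
```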

 A matrix is a two-dimensional, homogeneous data structure that arranges data in rows and columns.

• Creating a matrix: R provides the built-in function matrix() to create a matrix from a sequence or a vector
o Syntax: matrix(values, nrow, ncol, byrow, dimnames)
 values: Elements of the matrix
 nrow: Number of rows
 ncol: Number of columns
 byrow: TRUE to fill the matrix row by row; the default FALSE fills it column by column
 dimnames: A list with the names of the rows and columns
o Example 1: matrix(1:9, nrow = 3, ncol = 3, byrow = TRUE)
o Example 2: matrix(c(3, 6, 1, 5), nrow = 2, ncol = 2)
o Example 3: V <- c(3, 6, 1, 5)
matrix(V, nrow = 2, ncol = 2)

 Matrix creation from a set of vectors: A matrix can be created from a set of atomic vectors:
• Create the atomic vectors with c()
o V1 <- c(10, 20, 30)
o V2 <- c(40, 50, 60)
o V3 <- c(70, 80, 90)
• Create the matrix with matrix() from the combined vectors c(V1, V2, V3)

o matrix(c(V1, V2, V3), nrow = 3, byrow = TRUE)
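
The effect of byrow is easiest to see by inspecting individual cells. The sketch below reuses the example values; the names `m1`, `m2`, `m3` are illustrative, and only two source vectors are combined to keep it short.

```r
# byrow = TRUE fills row by row: rows are 1 2 3 / 4 5 6 / 7 8 9
m1 <- matrix(1:9, nrow = 3, ncol = 3, byrow = TRUE)
m1[2, 1]   # 4

# Default (byrow = FALSE) fills column by column: columns are 3 6 / 1 5
m2 <- matrix(c(3, 6, 1, 5), nrow = 2, ncol = 2)
m2[1, 2]   # 1

# A matrix built from atomic vectors, one vector per row
V1 <- c(10, 20, 30)
V2 <- c(40, 50, 60)
m3 <- matrix(c(V1, V2), nrow = 2, byrow = TRUE)
dim(m3)    # 2 3
m3[2, 3]   # 60
```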
 Naming a matrix, its rows and its columns: R supports naming a matrix as well as its rows and columns:
• Naming a matrix:
o Syntax: Name <- matrix(values, nrow, ncol, byrow, dimnames)
 Name: Name of the matrix
o Example: mat <- matrix(1:4, nrow = 2, ncol = 2, byrow = TRUE)
 mat: Name of the matrix
• Naming rows and columns: R provides the functions rownames() and colnames() to name the rows and columns of a matrix:
o Syntax: rownames(Name) <- c("R1", "R2", …, "Rn")
 Name: Name of the matrix
 R1, R2, …, Rn: Names of the rows
o Syntax: colnames(Name) <- c("C1", "C2", …, "Cn")
 Name: Name of the matrix
 C1, C2, …, Cn: Names of the columns
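
A minimal sketch of both naming routes: rownames()/colnames() after creation, and the dimnames argument at creation. The name `mat2` is illustrative.

```r
mat <- matrix(1:4, nrow = 2, ncol = 2, byrow = TRUE)
rownames(mat) <- c("R1", "R2")
colnames(mat) <- c("C1", "C2")
mat["R2", "C1"]   # 3 (row R2 is 3 4)

# The same names supplied through dimnames when the matrix is created
mat2 <- matrix(1:4, nrow = 2, ncol = 2, byrow = TRUE,
               dimnames = list(c("R1", "R2"), c("C1", "C2")))
```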
 Matrix elements: accessing / navigating: R supports accessing matrix elements by row and column index within [ ]:
• Syntax: mat[Rowid, Colid]
 mat: Name of the matrix
 Rowid: Row index
 Colid: Column index
 Row and column indices are integers that start from 1: 1, 2, …
• Example: mat <- matrix(c(3, 6, 0, 5, -8, 1, 23, 9, 4), nrow = 3, byrow = TRUE)
• mat
     [,1] [,2] [,3]
[1,]    3    6    0
[2,]    5   -8    1
[3,]   23    9    4

• mat[2, 2] → -8

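 Matrix subsetting: R supports matrix subsetting by row and column index within [Rid, Cid]:
• mat[2, 2] → -8 (a single element)
• mat[3, ] → 23 9 4 (the whole third row)
• mat[, 1] → 3 5 23 (the whole first column)
• mat[mat < 5] → 3 -8 0 1 4 (elements less than 5, returned in column order)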
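
These subsetting forms can be sketched as runnable code on the same matrix. The result names are illustrative. The logical subset walks the matrix column by column, which determines the order of the returned values.

```r
mat <- matrix(c(3, 6, 0, 5, -8, 1, 23, 9, 4), nrow = 3, byrow = TRUE)

elem  <- mat[2, 2]      # single element: -8
row3  <- mat[3, ]       # third row:      23 9 4
col1  <- mat[, 1]       # first column:   3 5 23
small <- mat[mat < 5]   # column order:   3 -8 0 1 4
```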

 class() is a built-in R function that returns the class (type) of a data structure or object; the assignment form class(x) <- value can also set an object's class
 Syntax: class(object_name)
 mat <- matrix(c(3, 6, 0, 5, -8, 1, 23, 9, 4), nrow = 3, byrow = TRUE)
 class(mat) → "matrix" "array" (in R 4.0.0 and later; earlier versions return only "matrix")
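
As a brief sketch, class() can be applied to each of the data structures covered in this unit. The variable names are illustrative. Since the result for a matrix depends on the R version, inherits() is used here as a version-independent check.

```r
cls_num  <- class(c(1.5, 2))             # "numeric"
cls_int  <- class(1:5)                   # "integer"
cls_chr  <- class("Vani")                # "character"
cls_list <- class(list(101, "Vani"))     # "list"
cls_df   <- class(data.frame(Sno = 101)) # "data.frame"
cls_fac  <- class(factor("M"))           # "factor"

mat <- matrix(1:4, nrow = 2)
is_mat <- inherits(mat, "matrix")        # TRUE
```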

 An R array is a multi-dimensional data structure (data object):

• An array can store data in more than two dimensions
• An array is homogeneous: it stores data of a single type
• Array example: a data cube
 R array creation:
• R provides the array() function to create a one-dimensional or multi-dimensional array:
o Syntax (1-dim): array(data)
 data: Elements of the array
o Example (1-dim): array(1:27)
o Syntax (multi-dim): array(data, dim = c(nrow, ncol, nmat))
 nrow: Number of rows

 ncol: Number of columns
 nmat: Number of matrices (the size of the third dimension)
o Example: array(1:27, dim = c(3, 3, 3))
 Naming an array, its rows and its columns:
Array naming: R supports creating a named array
• Syntax: name <- array(data, dim = c(nrow, ncol, nmat))
 name: Name of the array
• Example: arr <- array(1:27, dim = c(3, 3, 3))
Naming rows and columns: R supports naming rows, columns and matrices through the dimnames parameter:
• Syntax: array(data, dim = c(nrow, ncol, nmat), dimnames = list(rname, cname, mname))
 dimnames: a list() holding the names of the rows, columns and matrices (one vector per dimension)
• Example: create vectors for the row, column and matrix names
 rname <- c("R1", "R2", "R3")
 cname <- c("C1", "C2", "C3")
 mname <- c("M1", "M2", "M3")
 arr <- array(1:27, dim = c(3, 3, 3), dimnames = list(rname, cname, mname))
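
A minimal sketch of how the values 1:27 are laid out in the named array: rows vary fastest, then columns, then matrices.

```r
rname <- c("R1", "R2", "R3")
cname <- c("C1", "C2", "C3")
mname <- c("M1", "M2", "M3")
arr <- array(1:27, dim = c(3, 3, 3), dimnames = list(rname, cname, mname))

dim(arr)                # 3 3 3
arr["R1", "C1", "M1"]   # 1  (first value)
arr["R2", "C1", "M1"]   # 2  (next row)
arr["R1", "C2", "M1"]   # 4  (next column)
arr["R1", "C1", "M2"]   # 10 (next matrix)
```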

 Array accessing: R supports accessing array elements by row, column and matrix index inside [ ]
• Syntax: name[rid, cid, mid]
 name: Array name
 rid: Row index
 cid: Column index
 mid: Matrix index
• To access matrix 1: arr[, , 1]
• To access the first row of every matrix: arr[1, , ]
• To access the first element of matrix M1 by name: arr["R1", "C1", "M1"]

 R1, C1, M1 are the names of a row, a column and a matrix of the array


 Array subsetting: Subsetting extracts specific elements, rows, columns or matrices from an array in several ways:
• A specific element, by row, column and matrix names within [ ]
o Example: arr["Rname", "Cname", "Mname"]
• A specific matrix of the array, by matrix index within [, , Mid]
o Example: arr[, , Mid]
• A specific row of every matrix, by row index within [Rid, , ]
o Example: arr[1, , ]
• A specific column of every matrix, by column index within [, Cid, ]
o Example: arr[, 2, ]
• Elements satisfying a logical condition, by a relational expression inside [ ]
o Example: arr[arr > 15]
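
The subsetting forms listed above can be sketched on a 3 x 3 x 3 array without names. The result names are illustrative.

```r
arr <- array(1:27, dim = c(3, 3, 3))

m1   <- arr[, , 1]       # first matrix: a 3 x 3 matrix holding 1..9
rows <- arr[1, , ]       # row 1 of every matrix: 3 x 3 (column index x matrix index)
cols <- arr[, 2, ]       # column 2 of every matrix: 3 x 3 (row index x matrix index)
big  <- arr[arr > 15]    # logical subsetting: the values 16..27 as a plain vector

dim(m1)      # 3 3
rows[2, 3]   # arr[1, 2, 3] = 22
cols[1, 1]   # arr[1, 2, 1] = 4
length(big)  # 12
```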

 A data frame is a two-dimensional data structure that organizes data like a table
• A data frame is a two-dimensional, heterogeneous data structure
• It supports various kinds of data (integers, numbers, strings, logical values); each column holds a single type
• Creation: R provides the data.frame() function to create a data frame:
o Syntax: name <- data.frame(C1 = data1, C2 = data2, …, Cn = datan)
 C1, C2, …, Cn: Names of the columns

 data1, data2, …, datan: Values of columns C1, C2, …, Cn


o Example : Stu <- data.frame(
Sno = c(101, 102, 103),
Sname = c("Amar", "Bindu", "Ravi"),
Sgen = c("M", "F", "M")
)
 Accessing a data frame: R supports accessing the columns of a data frame with single brackets [ ], double brackets [[ ]] or $:
• Syntax: name[Cid]
o name: Data frame name
o Cid: Column index
• To access the first column: Stu[1] (returns a one-column data frame)
• To access a column by its name "Sname": Stu[["Sname"]] or Stu$Sname (returns the column as a vector)
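
The three access operators can be compared on the Stu data frame. This is a minimal sketch; the result names are illustrative, and the last line shows one common way to subset rows with a logical condition, which the notes list under data frame subsetting but do not spell out.

```r
Stu <- data.frame(
  Sno   = c(101, 102, 103),
  Sname = c("Amar", "Bindu", "Ravi"),
  Sgen  = c("M", "F", "M")
)

col_df  <- Stu[1]          # single brackets: a one-column data frame
col_vec <- Stu[["Sno"]]    # double brackets: the column as a vector
col_dlr <- Stu$Sno         # $ gives the same vector

# Row subsetting with a logical condition on a column
males <- Stu[Stu$Sgen == "M", ]
nrow(males)                # 2
```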

 Sorting a data frame is the process of reordering its rows based on the values in one or more columns.
• Sorting a data frame is useful for various purposes, such as organizing data for analysis or presentation.
• R supports various methods to sort a data frame
o order(): base R function that returns the row order for sorting in increasing or decreasing order
o arrange(): sorting function from the dplyr package
o setorder(): sorting function from the data.table package
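
A minimal sketch of sorting with base R's order(), which needs no extra package. The data frame below is a hypothetical, deliberately unsorted variant of the Stu example, and `asc` and `desc` are illustrative names.

```r
# Hypothetical unsorted data frame for illustration
Stu <- data.frame(
  Sno   = c(103, 101, 102),
  Sname = c("Ravi", "Amar", "Bindu")
)

# order() returns the row positions in sorted order; use them as a row index
asc  <- Stu[order(Stu$Sno), ]                      # increasing Sno
desc <- Stu[order(Stu$Sno, decreasing = TRUE), ]   # decreasing Sno
```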

 Factors introduction: A factor is a data structure used to represent categorical data as a set of levels:

• A factor organizes categorical data into levels

 Categorical data has a finite, fixed set of values: Gender, Grade
• A factor holds a finite, fixed set of possible values as its levels
• Example: Designation = {Principal, HoD, Prof, Assoc. Prof, Asst. Prof}

Principal > HoD > Prof > Assoc. Prof > Asst. Prof
1. Principal
2. HoD
3. Prof
4. Assoc. Prof
5. Asst. Prof
 Factor creation: R provides factor() to create a factor

• Syntax: factor(data, levels, ordered)

o data: Vector of data
o levels: Vector of distinct values that defines the levels in sequence
o ordered: TRUE/FALSE; logical attribute that defines whether the levels are ordered
• Creating a factor:
o factor(c("Prof", "HoD", "Asso.Prof", "Asst.Prof"))
 or, with the data stored first: data <- c("Prof", "HoD", "Asso.Prof", "Asst.Prof"), then factor(data)
• Naming a factor:
o gender <- factor(c("M", "F", "F", "M", "F", "F"))
o print(gender)
 Level factor and ordered factor: The factor() function supports the levels and ordered arguments to create factors with explicit levels and ordered factors
• Syntax: factor(data, levels, ordered)
o data: Vector of data
o levels: Vector of distinct values that defines the levels in sequence
o ordered: TRUE/FALSE; logical attribute that defines whether the levels are ordered
• Creating a level factor:
o data <- c("Prof", "HoD", "Asso.Prof", "Asst.Prof")
o level <- c("HoD", "Prof", "Asso.Prof", "Asst.Prof")
o factor(data, levels = level)
• Creating an ordered factor: set the ordered argument to TRUE
o data <- c("Prof", "HoD", "Asso.Prof", "Asst.Prof")
o level <- c("HoD", "Prof", "Asso.Prof", "Asst.Prof")

o factor(data, levels = level, ordered = TRUE)
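
One detail worth checking with a sketch: in an ordered factor, R treats the levels vector as running from lowest to highest. With the levels as written above, HoD is therefore the lowest level. If HoD should rank highest, as in the designation hierarchy, the levels can be listed in reverse. The names `f` and `f2` are illustrative.

```r
data  <- c("Prof", "HoD", "Asso.Prof", "Asst.Prof")
level <- c("HoD", "Prof", "Asso.Prof", "Asst.Prof")

f <- factor(data, levels = level, ordered = TRUE)
is.ordered(f)   # TRUE
f[2] < f[1]     # TRUE: HoD < Prof, because HoD is the first (lowest) level

# Reversed levels make HoD the highest level
f2 <- factor(data, levels = rev(level), ordered = TRUE)
f2[2] > f2[1]   # TRUE: HoD > Prof
max(f2)         # HoD
```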

 Factor summarizing: R provides the summary() function, which reports the number of observations in each category (level) of a factor
• Syntax: summary(name)
o name: Name of the factor
• Example: Summarizing a factor
o data <- c("Prof", "HoD", "Asso.Prof", "Asst.Prof", "Prof", "Asso.Prof")
o fdes <- factor(data)
o summary(fdes)
Asso.Prof Asst.Prof       HoD      Prof
        2         1         1         2
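
The summary is a named vector of counts, so individual counts can be looked up by level name. A minimal sketch (the name `counts` is illustrative):

```r
data <- c("Prof", "HoD", "Asso.Prof", "Asst.Prof", "Prof", "Asso.Prof")
fdes <- factor(data)

levels(fdes)             # levels are sorted alphabetically by default
counts <- summary(fdes)  # named vector: one count per level
counts[["Prof"]]         # 2
```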
