
01b Data Structures

Data Structures

Prof. Jacob Escobar ([email protected])


Data Structures

• Vector
• Matrix
• Array
• Data Frame
• List
Vector: 1D Array (either row or column)
Matrix: 2D Array (rows, columns)
Array: 3D or more dimensions (rows, columns, planes)
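As a minimal sketch of these three structures, dim() can be used to check how many dimensions an object has. The names v, m and a are illustrative and not part of the original slides.

```r
# Illustrative objects; the names v, m and a are assumptions for this sketch.
v <- c(10, 20, 30)                      # vector: 1D, has no dim attribute
m <- matrix(1:6, nrow = 2, ncol = 3)    # matrix: 2 rows, 3 columns
a <- array(1:24, dim = c(2, 3, 4))      # array: 2 rows, 3 columns, 4 planes

length(v)  # 3
dim(v)     # NULL, a plain vector carries no dimensions
dim(m)     # 2 3
dim(a)     # 2 3 4
```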
Data Frames
• A data frame is used for storing data tables. It can be
seen as a collection of column-vectors of equal length.
• Columns:
• Typically called variables or fields
• Must have unique column names
• Each column must have a single Data Type, but the
Data Type of each column can be different.
• Rows:
• Typically called observations or records
• Row names are optional; by default the rows are
labeled with the numerical sequence 1, 2, 3, ...
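The properties listed above can be checked on a small data frame. This is only a sketch: the name stocks, its column names and its values are hypothetical example data, and stringsAsFactors = FALSE is passed so the text column stays character in older R versions as well.

```r
# Hypothetical example data; names and values are illustrative only.
stocks <- data.frame(
  ticker = c("AMZN", "AAPL", "DIS"),   # character column
  shares = c(10L, 25L, 5L),            # integer column
  held   = c(TRUE, FALSE, TRUE),       # logical column
  stringsAsFactors = FALSE
)

names(stocks)          # column (variable) names
nrow(stocks)           # number of rows (observations): 3
sapply(stocks, class)  # each column has a single data type
rownames(stocks)       # default row names: "1" "2" "3"
```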
Differences Between Computer Science and Math

• Coordinates in databases: (row, col) vs coordinates in math: (x, y)
• Example: in a database, (3, 2) means row 3, column 2; in Cartesian
coordinates, (3, 2) means x = 3, y = 2
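In R, matrix indexing follows the (row, col) convention. A short sketch, using an illustrative matrix m that is not part of the slides:

```r
# Illustrative 4 x 5 matrix, filled column by column with 1..20.
m <- matrix(1:20, nrow = 4, ncol = 5)

m[3, 2]  # row 3, column 2 -> 7
m[2, 3]  # row 2, column 3 -> 10, a different cell
```

Swapping the two indices selects a different element, so it helps to read m[i, j] as "row i, column j" rather than as an (x, y) pair.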
Different ways to create arrays (1/2)
1. With the combine function “c( )”
w <- c("a", "b", "c", "d", "e", "f")

2. With the “matrix(data, nrow, ncol)” function


h <- matrix(0, 4, 5)

3. With the “array(data, c(row, col, plane))” function

z <- array(w, c(3, 2))
z <- array(c("AMZN", "AAPL", "DIS", "FB", "MCD", "TSLA"), c(2, 3))
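One detail worth checking when creating matrices and arrays is the fill order: R fills them column by column by default. The sketch below reuses w from the slide; zr and the byrow = TRUE variant are additions for comparison.

```r
w <- c("a", "b", "c", "d", "e", "f")

# Default: elements are placed down the columns first.
z <- array(w, c(3, 2))
z[1, 2]   # "d" (first column holds a, b, c)

# matrix() can fill row by row instead with byrow = TRUE.
zr <- matrix(w, nrow = 3, ncol = 2, byrow = TRUE)
zr[1, 2]  # "b" (first row holds a, b)
```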
Different ways to create arrays (2/2)
4. Adding elements to an existing variable in the workspace or
environment
x <- 5
x[2] <- 6
x[3] <- 7
If the variable doesn't exist already, we will get an error (here, y
has not been defined yet):
y[2] <- "b"
However, we can add non-consecutive elements to an existing variable,
and any undefined elements in between will be NA. For example, the
command below generates NAs in x[4] and x[5]:
x[6] <- 10
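The behavior described above can be verified with a short sketch. The name undefined_var is hypothetical and is assumed not to exist in the workspace; tryCatch() is used only so the error does not stop the script.

```r
x <- 5
x[2] <- 6
x[3] <- 7
x[6] <- 10    # positions 4 and 5 are filled with NA

x             # 5 6 7 NA NA 10
length(x)     # 6
is.na(x)      # FALSE FALSE FALSE TRUE TRUE FALSE

# Assigning into a variable that does not exist raises an error.
# undefined_var is a hypothetical name assumed to be undefined.
msg <- tryCatch(
  undefined_var[2] <- "b",
  error = function(e) conditionMessage(e)
)
msg           # the error message reported by R
```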
Other two-dimensional objects

Data frames:
table <- data.frame(x, w)

Other useful data structures exist, such as:
• time series: ts()
• extensible time series: xts(), from the xts package
• tibbles: tibble(), from the tibble package
Subsetting

• Subsetting (or slicing): Once an array is created, its information
can be retrieved either by calling the whole variable or by calling
a subset of the array.
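A few common subsetting patterns, as a sketch. The vector w matches the one created earlier in the slides; the matrix h here is filled with 1..20 instead of zeros so that the selected values are visible.

```r
w <- c("a", "b", "c", "d", "e", "f")
w             # the whole vector
w[2]          # single element: "b"
w[c(1, 3)]    # several elements: "a" "c"
w[-1]         # everything except the first element

# Illustrative 4 x 5 matrix filled column by column.
h <- matrix(1:20, nrow = 4, ncol = 5)
h[2, 3]       # row 2, column 3 -> 10
h[2, ]        # all of row 2 -> 2 6 10 14 18
h[, 5]        # all of column 5 -> 17 18 19 20
h[1:2, 4:5]   # 2 x 2 sub-matrix (rows 1-2, columns 4-5)
```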
Recycling

• Recycling: When performing an operation on structures of
different lengths, R reuses the elements of the shorter object
as many times as necessary to match the length of the longer
one. If the longer length is a multiple of the shorter length
(including when the shorter length is 1), R does this silently.
Otherwise, for vectors, R still returns a result but issues a
warning.
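The recycling rule can be seen with plain vector arithmetic. In this sketch the names a, partial and warned are illustrative; tryCatch() is used only to detect whether R signals a warning.

```r
a <- c(1, 2, 3, 4, 5, 6)

a + 10          # length 1 is recycled: 11 12 13 14 15 16
a + c(10, 20)   # 6 is a multiple of 2: 11 22 13 24 15 26

# 6 is not a multiple of 4: R still computes a result but warns.
partial <- suppressWarnings(a + c(10, 20, 30, 40))
partial         # 11 22 33 44 15 26

# Detect the warning without printing it.
warned <- tryCatch(
  { a + c(10, 20, 30, 40); FALSE },
  warning = function(w) TRUE
)
warned          # TRUE
```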
Data Types & Data Structures: Troubleshooting
• Functions to confirm / identify data types & data structures:
str( object ) → structure
typeof( object ) → storage mode
class( object ) → class attribute(s)
• These are mainly useful when troubleshooting, since sometimes the
variable types aren't the ones we assumed, and the implicit coercion
in some operations might not work as expected.
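A short sketch of how these functions can expose an unexpected coercion. The names x, y, h and df are illustrative.

```r
x <- c(5, 6, 7)
typeof(x)        # "double"
class(x)         # "numeric"

# Mixing a number vector with text silently coerces everything to character.
y <- c(x, "b")
typeof(y)        # "character"
str(y)           # shows a chr vector of length 4

h <- matrix(0, 4, 5)
typeof(h)        # "double" (how the values are stored)
class(h)         # includes "matrix" (also "array" in R 4.0 and later)

df <- data.frame(x = x)
class(df)        # "data.frame"
typeof(df)       # "list" (a data frame is stored as a list of columns)
```

Here typeof() reports how the values are stored, while class() reports how R treats the object, which is why the two can differ for the same variable.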
Next steps…
All rights reserved 2018 Tecnológico de Monterrey
Reproduction of this work, in whole or in part, is prohibited without
the express authorization of Tecnológico de Monterrey.
