Chapter-5-slides
Thomas Maierhofer
Fall 2024
Learning Objectives
Basic Definitions and Functions
Why Matrices in R?
▶ Matrices offer a natural introduction to data frames, which are the most
commonly used objects in R for storing rectangular data.
▶ Much of the syntax and functions for matrices also apply to data frames, making
them essential for understanding data analysis in R.
The matrix() Function
▶ A matrix is a two-dimensional (rectangular) array of values.
▶ In R, every value in a matrix must be of the same type (integer, double,
character, or logical).
Creating a Matrix
▶ The matrix() function takes a vector of values (data) and arranges them into a
matrix.
▶ You specify the number of rows (nrow) and columns (ncol).
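A minimal sketch of such calls. It assumes the matrices A and B used on later slides hold the values 1 to 6 and 1 to 9; their defining code is not shown here, so these definitions are inferred from the printed outputs. The byrow argument exists in matrix() but is not introduced above.

```r
# Fill a 2 x 3 matrix column by column (the default)
A <- matrix(1:6, nrow = 2, ncol = 3)
A

# Fill a 3 x 3 matrix row by row instead (byrow = TRUE)
B <- matrix(1:9, nrow = 3, ncol = 3, byrow = TRUE)
B
```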
Behavior of nrow and ncol arguments
▶ The default values for the matrix() function are nrow = 1 and ncol = 1.
▶ If both nrow and ncol are left blank, R will produce a matrix with a single
column, i.e., a column vector
## [,1]
## [1,] 1
## [2,] 2
## [3,] 3
## [4,] 4
## [5,] 5
If only nrow or ncol is defined, the other value is automatically computed based
on the length of the data vector.
## [,1] [,2]
## [1,] 1 4
## [2,] 2 5
## [3,] 3 6
Key Takeaway: Be cautious when the data vector is shorter than the matrix
dimensions—recycling can happen silently or with a warning.
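A short sketch of both behaviors, using made-up data vectors. tryCatch() is used here only to capture the warning message as text; it is not part of the slides.

```r
# Only nrow is given: ncol is computed from the length of the data
matrix(1:6, nrow = 3)

# Data shorter than the matrix: 6 cells is a multiple of the data
# length 3, so the values are recycled without any message
matrix(1:3, nrow = 2, ncol = 3)

# 6 cells is not a multiple of the data length 5: R warns
recycled <- tryCatch(
  matrix(1:5, nrow = 2, ncol = 3),
  warning = function(w) conditionMessage(w)
)
recycled  # the text of the warning
```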
dim(): The Dimension of a Matrix
▶ The dimension of a matrix is defined by the number of rows and the number of
columns.
▶ This is often written as nrow × ncol (read as “nrow by ncol”).
▶ For example, matrix A with 2 rows and 3 columns is a 2 × 3 matrix.
▶ A matrix is called square if the number of rows equals the number of columns.
The dim() function returns a numeric vector specifying the dimension of a matrix.
dim(A)
## [1] 2 3
dim(B)
## [1] 3 3
nrow() and ncol(): The Number of Rows and Columns of a Matrix
▶ The functions nrow() and ncol() input a two-dimensional object and output the
number of rows or columns, respectively.
▶ These return the individual entries of the vector returned by dim().
nrow(A)
## [1] 2
ncol(A)
## [1] 3
Question: How would you extract the number of rows from the output of dim(A)? In
other words, how would you write an nrow() function that relies on the dim()
function?
The cbind() and rbind() Functions
▶ The cbind() (column bind) and rbind() (row bind) functions allow you to
create matrices by binding columns or rows together.
▶ Each column (for cbind()) or row (for rbind()) is provided as a separate
unnamed argument.
# An alternative way to create the matrix A
cbind(c(1, 2), c(3, 4), c(5, 6))
Recycling in rbind() and cbind()
▶ Side Note: The rbind() and cbind() functions recycle values when necessary.
▶ This is useful for adding columns or rows of repeated values.
Application to Statistics
▶ In linear regression, the observed values of the predictor variables are often
organized into a design matrix X .
▶ The design matrix usually includes a column of 1’s to account for the intercept
term in the model (see matrix formulation of linear regression in STATS 101A for
details).
▶ Every value in a matrix must be of the same type because matrices are
internally stored as vectors in R.
▶ This can be verified using the mode() function on a matrix.
## [1] "numeric"
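To tie together recycling in cbind() and the single-type rule, here is a small sketch. The predictor values x1 and x2 are made up for illustration, and the object names X and mixed are hypothetical.

```r
# Hypothetical predictor values, made up for illustration
x1 <- c(2.1, 3.4, 5.0)
x2 <- c(10, 12, 9)

# The single value 1 is recycled to fill the intercept column
X <- cbind(1, x1, x2)
X
mode(X)      # every entry shares one type

# Mixing numbers and text coerces every entry to character
mixed <- cbind(1:2, c("a", "b"))
mode(mixed)
```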
Matrices vs. Vectors
attributes(A)
## $dim
## [1] 2 3
attributes(1:6)
## NULL
We can strip the matrix A of its dim attribute by assigning NULL to attributes(A).
The matrix then reverts to a plain vector.
## [1] 1 2 3 4 5 6
attributes(A)
## NULL
We can also give a vector the dim attribute in a similar way to convert a vector into a
matrix.
## [1] 2 3
## [1] 1 2 3 4 5 6
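The round trip between matrix and vector could look like the sketch below. It assumes A is the 2 x 3 matrix holding 1 to 6; the exact calls used on the slides are not shown, so this is one possible version.

```r
A <- matrix(1:6, nrow = 2, ncol = 3)

# Removing all attributes drops dim, leaving a plain vector
attributes(A) <- NULL
A

# Assigning a dim attribute turns the vector back into a 2 x 3 matrix
dim(A) <- c(2, 3)
A
```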
Naming Rows and Columns of Two-Dimensional Objects
Naming One-Dimensional Objects (Vectors) in R
▶ A named vector in R is a regular vector where each element is associated with a
name.
▶ The names are stored as an attribute of the vector and can be used for more
intuitive indexing.
# Create a named vector
heights <- c("Leslie" = 62, "Ron" = 71, "April" = 66, "Tom" = 68)
heights
heights["April"]
## April
## 66
Getting and Setting Vector Names
You can also add names to an existing vector using the names() function:
You can also access the names using the names() function:
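A minimal sketch of both uses of names(), reusing the heights values from the named-vector example but starting from an unnamed vector.

```r
# Start from an unnamed vector
heights <- c(62, 71, 66, 68)

# Set the names
names(heights) <- c("Leslie", "Ron", "April", "Tom")

# Get the names
names(heights)
```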
parks_mat <- cbind(c(62, 71, 66), c(115, 201, 119), c(4000, NA, 2000))
parks_mat # Make sure the data was entered correctly
## NULL
## NULL
Heads up: Whenever you feel like your matrix really needs row and column names you
should probably use a data frame instead. More later.
Setting Row and Column Names
You can set names by creating a character vector of names and assigning it using the
<- operator.
The names are now associated with the rows and columns of the matrix, providing
more context to the data.
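A sketch of the assignment for parks_mat. The row and column names are taken from the attributes() output shown on a later slide; the exact calls used originally are an assumption.

```r
parks_mat <- cbind(c(62, 71, 66), c(115, 201, 119), c(4000, NA, 2000))

# Assign character vectors of names with <-
rownames(parks_mat) <- c("Leslie", "Ron", "April")
colnames(parks_mat) <- c("Height", "Weight", "Income")
parks_mat
```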
Attributes and dimnames
▶ Technically, the rownames() and colnames() functions modify the attributes of
the object.
▶ Setting names will add a dimnames attribute to the object.
attributes(parks_mat)
## $dim
## [1] 3 3
##
## $dimnames
## $dimnames[[1]]
## [1] "Leslie" "Ron" "April"
##
## $dimnames[[2]]
## [1] "Height" "Weight" "Income"
Side Note: The dimnames() function can get and set both the row and column name
attributes at once. The assignment input using dimnames() needs to be a list with
two vector components.
# Add the same names as before
dimnames(parks_mat) <- list(c("Leslie", "Ron", "April"),
c("Height", "Weight", "Income"))
parks_mat
Naming Matrices using matrix(..., dimnames = list())
Side Note: There are a few other ways to add names to rows and columns. The
matrix() function has an optional dimnames argument that allows us to add names
directly when creating a matrix object. The syntax is the same as the dimnames()
function.
## A B C
## a 1 4 7
## b 2 5 8
## c 3 6 9
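A call along these lines would produce the named matrix printed above. The object name named_mat is hypothetical.

```r
# dimnames takes a list: row names first, then column names
named_mat <- matrix(1:9, nrow = 3, ncol = 3,
                    dimnames = list(c("a", "b", "c"), c("A", "B", "C")))
named_mat
```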
Naming Matrices created using rbind() and cbind()
The rbind() and cbind() functions allow us to name, respectively, each row or
column by just typing the name of the row or column in quotation marks, as shown
below.
## A B C
## [1,] 1 4 7
## [2,] 2 5 8
## [3,] 3 6 9
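One way to obtain the column-named matrix printed above is sketched here. The object names by_col and by_row are hypothetical, and the rbind() variant is added for comparison.

```r
# Name each column directly in the cbind() call
by_col <- cbind("A" = 1:3, "B" = 4:6, "C" = 7:9)
by_col

# The same idea with rbind() names the rows instead
by_row <- rbind("a" = 1:3, "b" = 4:6)
by_row
```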
Extracting Data From Two-Dimensional Objects
▶ Recall that square brackets are used to extract specific parts of data from
objects in R.
▶ Vectors are one-dimensional, so you can extract elements by providing a single
index inside square brackets.
▶ For two-dimensional objects, such as matrices or data frames:
▶ Use two index coordinates inside square brackets, separated by a comma.
▶ The general format is [i, j], where:
▶ i is the row index.
▶ j is the column index.
▶ This extracts the entry in the ith row and jth column, also called the ijth entry.
## [1] 6
Numeric Indices in Two-Dimensional Objects
▶ Leaving one value blank means extracting all the values in that dimension.
▶ Positive, negative, and fractional indices work the same way as they do for
vectors.
▶ Row and column indices are independent, allowing for mixed positive and
negative indices.
B # Notice the row and column indices in the output
## [1] 4
B[2, ] # Extract the second row
## [1] 4 5 6
## [1] 3 6 9
## [,1] [,2]
## [1,] 1 3
## [2,] 4 6
## [3,] 7 9
B[-1, c(2, 3)] # Remove the first row and extract the second and third columns
## [,1] [,2]
## [1,] 5 6
## [2,] 8 9
Note: Notice that when the resulting output is one-dimensional (i.e., a single row or a
single column), the output object is a vector, not a one-dimensional matrix.
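The indexing patterns above can be sketched as follows, assuming B is the 3 x 3 matrix of 1 to 9 filled by row (inferred from the printed outputs). The drop = FALSE argument is not covered on these slides; it is shown as one way to keep a single row as a matrix.

```r
B <- matrix(1:9, nrow = 3, byrow = TRUE)

B[2, 1]        # a single entry
B[, 3]         # third column, returned as a vector
B[, c(1, 3)]   # first and third columns, still a matrix

# By default a single row is simplified to a vector
dim(B[2, ])

# drop = FALSE keeps the result as a 1 x 3 matrix
row2 <- B[2, , drop = FALSE]
dim(row2)
```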
Caution: Single Indexing in Matrices
▶ Caution: Using a single index [] instead of an index pair [,] will not throw a
warning or error for matrix objects.
▶ Matrices are stored as one long column-major vector:
▶ Values are stored top-to-bottom down the first column, then the second column, and
so on.
▶ Using a single index [] will return the corresponding entries in the vector
B[8] # extracts the 8th value from the matrix as if it were a vector
## [1] 6
B[c(2, 3)] # extracts the second and third element as if it were a vector
## [1] 4 7
B[2, 3] # extracts the element in the second row and third column
Logical Indices
▶ Logical vectors can also be used to subset two-dimensional objects.
▶ The behavior of logical indices is similar to vectors, allowing us to extract rows
or columns that satisfy specific conditions.
Question: How can we use the tall_index vector to extract only the
rows/observations for people in the data who are at most 65 inches tall?
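A sketch of logical row indexing on parks_mat. The definition of tall_index is an assumption (TRUE for people taller than 65 inches), since its original code is not shown. The example extracts the tall rows and leaves the question above open.

```r
parks_mat <- cbind(c(62, 71, 66), c(115, 201, 119), c(4000, NA, 2000))
dimnames(parks_mat) <- list(c("Leslie", "Ron", "April"),
                            c("Height", "Weight", "Income"))

# Assumed definition: TRUE where Height exceeds 65 inches
tall_index <- parks_mat[, "Height"] > 65
tall_index

# A logical vector in the row position keeps the TRUE rows
parks_mat[tall_index, ]
```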
Named (Character) Indices
If the rows or columns of a two-dimensional object are named, we can use the name as
an index.
▶ row names can only be used as a row index,
▶ column names can only be used as a column index.
## [1] 71
## Leslie April
## 115 119
Note: Notice that we do not need to know the numeric index for the observations or
variables. Using names as indices can also increase the readability of your code.
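Calls like the following would give the two results printed above. They are inferred from the outputs, so treat them as a plausible reconstruction.

```r
parks_mat <- cbind(c(62, 71, 66), c(115, 201, 119), c(4000, NA, 2000))
dimnames(parks_mat) <- list(c("Leslie", "Ron", "April"),
                            c("Height", "Weight", "Income"))

# One row name and one column name select a single entry
parks_mat["Ron", "Height"]

# Several row names with one column name return a named vector
parks_mat[c("Leslie", "April"), "Weight"]
```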
Matrix Operations
Entrywise Arithmetic Operations
▶ Matrices are stored as vectors, so arithmetic operations (+, -, *, /, etc.) on
numeric matrices work just like on numeric vectors.
▶ Entrywise operations: Arithmetic operations are applied to each entry in the
matrix.
▶ Matrices must be conformable (same dimensions) for entrywise operations
between two matrices.
A + 10 # Add 10 to every entry in A
If we try to apply the operators to matrices that are not conformable, R will throw an
error.
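A brief sketch of entrywise operations and of the error for non-conformable matrices. The matrix C and its values are hypothetical; A and B are assumed to be the 2 x 3 and 3 x 3 matrices used earlier.

```r
A <- matrix(1:6, nrow = 2)                 # 2 x 3
C <- matrix(6:1, nrow = 2)                 # 2 x 3, hypothetical values
B <- matrix(1:9, nrow = 3, byrow = TRUE)   # 3 x 3

A + C   # entrywise sum
A * C   # entrywise product

# A (2 x 3) and B (3 x 3) have different dimensions, so A + B fails;
# tryCatch() captures the error message as text
msg <- tryCatch(A + B, error = function(e) conditionMessage(e))
msg
```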
Matrix Multiplication (the one from MATH 33A)
Review: The Dot Product
For two vectors $a = [a_1, a_2, \dots, a_n]$ and $b = [b_1, b_2, \dots, b_n]$, the dot product is:
\[ a \cdot b = \sum_{i=1}^{n} a_i b_i = a_1 b_1 + a_2 b_2 + \cdots + a_n b_n \]
Your turn: Computing the Dot Product
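For vectors
\[ v = \begin{bmatrix} v_1 \\ v_2 \\ v_3 \end{bmatrix} = \begin{bmatrix} 1 \\ 3 \\ -5 \end{bmatrix} \quad \text{and} \quad w = \begin{bmatrix} w_1 \\ w_2 \\ w_3 \end{bmatrix} = \begin{bmatrix} 2 \\ -4 \\ 6 \end{bmatrix}, \]
compute the dot product $v \cdot w$ (which is the same as the matrix multiplication $v^T w$).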
What is the dimensionality of your result?
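To check a hand computation, the dot product can be evaluated in R in two ways. This is a sketch; the %*% operator is the matrix multiplication operator in R.

```r
v <- c(1, 3, -5)
w <- c(2, -4, 6)

# Entrywise product, then sum: returns a plain number
dot_vw <- sum(v * w)
dot_vw

# Matrix form v^T w: returns a 1 x 1 matrix
mat_vw <- t(v) %*% w
dim(mat_vw)
```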
Matrix Multiplication is a generalization of the Dot Product
▶ The (i, j)th entry of the product AB is the dot product of the ith row of A with
the jth column of B.
▶ If A has only one row (a row vector) and B has only one column (a column
vector), then the matrix product AB reduces to the dot product of the two vectors.
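For example, let A be a 2 × 3 matrix and B be a 3 × 2 matrix, denoted by
\[ A = \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \end{bmatrix} \quad \text{and} \quad B = \begin{bmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \\ b_{31} & b_{32} \end{bmatrix}. \]
Written generally, if $A = [a_{ik}]_{m \times n}$ and $B = [b_{kj}]_{n \times p}$, then
\[ AB = \left[ \sum_{k=1}^{n} a_{ik} b_{kj} \right]_{m \times p}. \]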
Your turn: Computing the Matrix Multiplication
Questions
▶ What is the dimensionality of the resulting matrix?
▶ What is the resulting matrix AB?
Matrix Multiplication in R
In R, matrix multiplication is performed using the %*% operator.
Important:
▶ Matrices A ∈ [2 × 3] and C ∈ [2 × 3] are conformable for entrywise
multiplication but not for matrix multiplication.
▶ Matrices A ∈ [2 × 3] and B ∈ [3 × 2] are conformable for matrix multiplication
but not for entrywise multiplication.
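A minimal sketch contrasting the two operators. The entries of A, B, and C are hypothetical; only their dimensions (2 x 3, 3 x 2, and 2 x 3) follow the statements above.

```r
A <- matrix(1:6, nrow = 2)   # 2 x 3
B <- matrix(1:6, nrow = 3)   # 3 x 2, hypothetical values
C <- matrix(6:1, nrow = 2)   # 2 x 3, hypothetical values

# Matrix product: (2 x 3) times (3 x 2) gives a 2 x 2 matrix
AB <- A %*% B
AB

# Entrywise product: needs equal dimensions
A * C

# A and C are not conformable for %*% (3 columns vs. 2 rows)
msg <- tryCatch(A %*% C, error = function(e) conditionMessage(e))
msg
```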
What’s up with the two Matrix Multiplications?
Yes, there are two versions of matrix multiplication (in this class).
1. Element-wise matrix multiplication: For A ∈ [m × n] and B ∈ [m × n], compute
the ijth entry of A ∗ B ∈ [m × n] as
\[ [A * B]_{ij} = a_{ij} \, b_{ij} \]
2. Standard matrix product: For A ∈ [m × n] and B ∈ [n × p], compute the ijth
entry of AB ∈ [m × p] as
\[ [AB]_{ij} = \sum_{k=1}^{n} a_{ik} b_{kj} \]
So which one should I use?
That depends on what you are doing. Keep in mind that:
▶ A matrix is nothing but a clever way to arrange a lot of numbers.
▶ Matrix operators / multiplication are just a clever way to notate concisely what
computations to perform on all these numbers.
▶ There is no deeper logic; these are just different ways to crunch large sets of numbers.
The Transpose of a Matrix
Using the t() Function to Transpose a Matrix
In R, you can compute the transpose of a matrix using the t() function
t(A) # transpose of A
## [,1] [,2]
## [1,] 1 2
## [2,] 3 4
## [3,] 5 6
t(B) # transpose of B
The Identity Matrix
Mathematical Definition
\[ [I_n]_{ij} = \begin{cases} 1 & \text{if } i = j \\ 0 & \text{if } i \neq j \end{cases} \]
Purpose
The identity matrix acts like the 1 in scalar multiplication but for matrix
multiplication, leaving other matrices unchanged when multiplying them by $I_n$.
Identity Matrices Example
Examples:
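Identity matrices can be any size, but they must be square (number of rows equals
number of columns).
\[ I_1 = \begin{bmatrix} 1 \end{bmatrix}, \quad I_2 = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}, \quad I_3 = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}, \quad \dots \]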
Example use:
Your turn: Compute $A I_3$ or $I_2 A$. Pick one; it does not matter which.
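After working it out by hand, the result can be checked in R. This sketch assumes A is the 2 x 3 matrix of 1 to 6 and builds the identity matrices with diag(), which is introduced a few slides later.

```r
A <- matrix(1:6, nrow = 2)   # 2 x 3

I2 <- diag(2)   # 2 x 2 identity
I3 <- diag(3)   # 3 x 3 identity

# Multiplying by the identity on either side leaves A unchanged
right <- A %*% I3
left  <- I2 %*% A
all(right == A)
all(left == A)
```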
Diagonal Matrices
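A diagonal matrix is a matrix in which the entries outside the main diagonal (entries
$A_{ij}$ where $i \neq j$) are all zero.
\[ \begin{bmatrix} 1 & 0 & 0 \\ 0 & 12 & 0 \\ 0 & 0 & 4 \end{bmatrix}, \quad \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 5 & 0 & 0 \\ 0 & 0 & 10 & 0 \end{bmatrix}, \quad \begin{bmatrix} 1 & 0 & 0 \\ 0 & 5 & 0 \\ 0 & 0 & 10 \\ 0 & 0 & 0 \end{bmatrix}, \]
\[ \begin{bmatrix} 1 & 0 & 0 & \cdots & 0 \\ 0 & 1 & 0 & \cdots & 0 \\ 0 & 0 & 1 & \cdots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & 1 \end{bmatrix} \]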
Creating Diagonal Matrices using the diag() Function
The diag() function has two main functionalities. By inputting a number, the diag()
function will generate an identity matrix of that size.
# inputting a number
diag(4) # Create a 4x4 identity matrix
By inputting a vector, the diag() function will generate a diagonal matrix (the only
nonzero entries are along the diagonal) with the vector values along the diagonal.
# inputting a vector
diag(c(1, 2, 3)) # Create a diagonal matrix with 1, 2, 3 along the diagonal
The nrow and ncol arguments can also be used to specify the dimensions for a
rectangular (non-square) matrix.
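A sketch of rectangular diagonal matrices built with diag(). The diagonal values 1, 5, 10 and the object names wide and tall are chosen for illustration.

```r
# 3 x 4 matrix with 1, 5, 10 on the main diagonal
wide <- diag(c(1, 5, 10), nrow = 3, ncol = 4)
wide

# 4 x 3 matrix with the same diagonal; the last row is all zeros
tall <- diag(c(1, 5, 10), nrow = 4, ncol = 3)
tall
```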
The Inverse of a Matrix
\[ 2 \times \frac{1}{2} = \frac{1}{2} \times 2 = 1. \]
Inverting Matrices using the solve() function
The function solve() computes the inverse of the inputted matrix:
## [,1] [,2]
## [1,] 1 2
## [2,] 4 1
## [,1] [,2]
## [1,] -0.1428571 0.2857143
## [2,] 0.5714286 -0.1428571
# Verify that M_inv is the inverse of M
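The verification could look like this sketch. M is rebuilt from the printed output above; all.equal() is used because the product may differ from the identity by tiny floating-point errors.

```r
# M as printed above (filled column by column)
M <- matrix(c(1, 4, 2, 1), nrow = 2)
M_inv <- solve(M)

# The product should be the 2 x 2 identity, up to rounding
M %*% M_inv
all.equal(M %*% M_inv, diag(2))
```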
Caution: Not all Matrices are Invertible
solve(B)
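A sketch of what happens for a singular matrix, assuming B is the 3 x 3 matrix of 1 to 9 filled by row. Its rows are linearly dependent (row 3 minus row 2 equals row 2 minus row 1), so no inverse exists. tryCatch() captures the error message instead of stopping the script.

```r
B <- matrix(1:9, nrow = 3, byrow = TRUE)

# The determinant is zero up to floating-point rounding
det(B)

# solve() signals an error; keep its message as text
result <- tryCatch(solve(B), error = function(e) conditionMessage(e))
result
```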
Singular Matrices in Statistics
As statisticians, we can (and will) run into cases where, when estimating linear
regression coefficients
\[ \beta = (X^T X)^{-1} X^T y, \]
the matrix $X^T X$ is singular, so its inverse does not exist.
Operations on Matrix Columns and Rows
The apply() Function
Suppose we want to compute the mean of each variable in the parks_mat matrix.
parks_mat
We could compute the mean of each variable individually, but it would require
repetitive code (or a for loop):
## [1] 66.33333
## [1] 145
## [1] 3000
For large matrices (or other data objects you will see later), using repetitive code is
inefficient and cumbersome.
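The three means printed above could come from repetitive calls like these. This is a reconstruction that assumes parks_mat carries the names set earlier; na.rm = TRUE is needed for Income because of the missing value.

```r
parks_mat <- cbind(c(62, 71, 66), c(115, 201, 119), c(4000, NA, 2000))
dimnames(parks_mat) <- list(c("Leslie", "Ron", "April"),
                            c("Height", "Weight", "Income"))

# One call per column: the same code repeated three times
mean(parks_mat[, "Height"])
mean(parks_mat[, "Weight"])
mean(parks_mat[, "Income"], na.rm = TRUE)
```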
The apply() function
The apply() function is used to apply a function to the rows or columns (the
margins) of matrices, arrays (higher dimension matrices), and data frames (which you
will see soon).
Similar to vapply(), the syntax of apply() is apply(X, MARGIN, FUN, ...),
where the arguments are:
▶ X: A matrix, array, or data frame.
▶ MARGIN: A vector giving the subscript(s) over which the function will be
applied. A 1 indicates rows, 2 indicates columns, and c(1, 2) indicates rows and
columns.
▶ FUN: The function to be applied.
▶ ...: Any optional arguments to be passed to the FUN function (for example,
na.rm = TRUE).
Using apply() to Compute Row / Column Means
Using apply(), we can apply the mean() function to each column in parks_mat
simultaneously with a single command.
# Compute the mean of every column of the parks_mat matrix
apply(X = parks_mat, MARGIN = 2, FUN = mean, na.rm = TRUE)
To compute the mean of each row, we can change the margin argument MARGIN from
2 (columns) to 1 (rows).
# Compute the mean of every row of the parks_mat matrix
apply(X = parks_mat, MARGIN = 1, FUN = mean, na.rm = TRUE)
## [1] 62 71
apply(X = parks_mat, MARGIN = 2, FUN = range, na.rm = TRUE) # range of every column
apply() also Follows the Split-Apply-Combine Strategy
1. Split the data set into a set of row vectors (MARGIN = 1) or column vectors
(MARGIN = 2)
2. Apply a function (specified in FUN) to each vector
3. Combine all results into a vector (if individual results were scalars) or a matrix (if
individual results were vectors)
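The combine step can be seen by comparing the shape of two results. This sketch assumes parks_mat with the names set earlier; col_means and col_ranges are hypothetical object names.

```r
parks_mat <- cbind(c(62, 71, 66), c(115, 201, 119), c(4000, NA, 2000))
dimnames(parks_mat) <- list(c("Leslie", "Ron", "April"),
                            c("Height", "Weight", "Income"))

# mean() returns one number per column: results combine into a vector
col_means <- apply(parks_mat, 2, mean, na.rm = TRUE)
col_means

# range() returns two numbers per column: results combine into a
# 2 x 3 matrix with one column per variable
col_ranges <- apply(parks_mat, 2, range, na.rm = TRUE)
col_ranges
```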