7. Gradients of Vector-Valued Functions & Matrices

Applied Mathematics for Machine Learning
José David Vega Sánchez
[email protected]
2025
Vector Calculus
Outline

1. Gradients of Vector-Valued Functions

2. Gradients of Matrices
Gradients of Vector-Valued Functions
1. Gradients of Vector-Valued Functions
Definition
The gradient of a vector-valued function collects the derivatives of a function that maps a vector input to a vector output. These derivatives are taken with respect to each component of the input vector and are arranged in a matrix called the Jacobian matrix.

Definition of a Vector-Valued Function

A vector-valued function maps an input vector $\mathbf{x} \in \mathbb{R}^n$ to an output vector in $\mathbb{R}^m$:

$$\mathbf{f}(\mathbf{x}) = \begin{bmatrix} f_1(\mathbf{x}) \\ \vdots \\ f_m(\mathbf{x}) \end{bmatrix}, \qquad \mathbf{f} : \mathbb{R}^n \to \mathbb{R}^m,$$

where:
- each $f_i : \mathbb{R}^n \to \mathbb{R}$ is a scalar-valued component function,
- $\mathbf{x} = (x_1, \ldots, x_n)^\top$ is the input vector.
1. Gradients of Vector-Valued Functions
Definition of a Vector-Valued Function

Jacobian Matrix: The gradient of a vector-valued function is represented as a Jacobian matrix, which has dimensions $m \times n$ (rows correspond to functions, columns correspond to input variables):

$$J = \nabla_{\mathbf{x}} \mathbf{f} =
\begin{bmatrix}
\dfrac{\partial f_1}{\partial x_1} & \cdots & \dfrac{\partial f_1}{\partial x_n} \\
\vdots & \ddots & \vdots \\
\dfrac{\partial f_m}{\partial x_1} & \cdots & \dfrac{\partial f_m}{\partial x_n}
\end{bmatrix}$$

Each element $\dfrac{\partial f_i}{\partial x_j}$ represents the partial derivative of the $i$-th function $f_i$ with respect to the $j$-th variable $x_j$.
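In PyTorch, the full Jacobian can be obtained with torch.autograd.functional.jacobian. A minimal sketch, assuming the illustrative function $\mathbf{f}(x) = (x_1^2 x_2,\ 5x_1 + \sin x_2)$ chosen here for demonstration:

```python
import torch
from torch.autograd.functional import jacobian

# Illustrative vector-valued function f: R^2 -> R^2
# (a hypothetical example, not one taken from the slides).
def f(x):
    return torch.stack([x[0] ** 2 * x[1],
                        5 * x[0] + torch.sin(x[1])])

x = torch.tensor([1.0, 2.0])
J = jacobian(f, x)   # shape (m, n) = (2, 2)
print(J)
# Analytic Jacobian: [[2*x1*x2, x1**2], [5, cos(x2)]]
# -> [[4.0, 1.0], [5.0, -0.4161]]
```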
1. Gradients of Vector-Valued Functions
Example
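One representative example (an illustrative choice of function, not necessarily the one used in the lecture): the polar-to-Cartesian map

$$\mathbf{f}(r, \theta) = \begin{bmatrix} r\cos\theta \\ r\sin\theta \end{bmatrix}
\qquad\Longrightarrow\qquad
J = \begin{bmatrix}
\dfrac{\partial f_1}{\partial r} & \dfrac{\partial f_1}{\partial \theta} \\
\dfrac{\partial f_2}{\partial r} & \dfrac{\partial f_2}{\partial \theta}
\end{bmatrix}
= \begin{bmatrix}
\cos\theta & -r\sin\theta \\
\sin\theta & r\cos\theta
\end{bmatrix}$$

Row $i$ collects the partial derivatives of $f_i$; here $m = n = 2$, so the Jacobian is square.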
Gradients of Matrices
2. Gradients of Matrices
Definition
In ML, we often face scenarios in which the input and output quantities are not simple scalars or vectors but are themselves matrices (or higher-dimensional tensors). The gradient of a scalar function $f$ with respect to a matrix $B$ is expressed as another matrix in which each entry is a partial derivative:

$$\left(\frac{\partial f}{\partial B}\right)_{kl} = \frac{\partial f}{\partial B_{kl}}$$

When $f$ is scalar, this gradient is simply a matrix of the same size as $B$.
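A minimal PyTorch sketch of the scalar case, assuming the illustrative choice $f(B) = \sum_{k,l} B_{kl}^2$ (so $\partial f / \partial B = 2B$):

```python
import torch

B = torch.randn(2, 3, requires_grad=True)
f = (B ** 2).sum()    # scalar-valued function of the matrix B
f.backward()

print(B.grad.shape)   # torch.Size([2, 3]) -- same shape as B
print(torch.allclose(B.grad, 2 * B.detach()))  # True: df/dB = 2B
```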


For matrix-to-matrix functions, however, the gradient expands to a more complex structure. The most general form is a Jacobian tensor, which spans multiple dimensions.
2. Gradients of Matrices
Gradient between two matrices
Let’s consider two matrices $A \in \mathbb{R}^{m \times n}$ and $B \in \mathbb{R}^{p \times q}$.

When computing the gradient of $A$ with respect to $B$, the output is a 4-dimensional tensor, because every element $A_{ij}$ of $A$ depends on every element $B_{kl}$ of $B$. This means:

$\dfrac{\partial A}{\partial B}$ is an $(m \times n) \times (p \times q)$ tensor. Each entry of this tensor is:

$$\left(\frac{\partial A}{\partial B}\right)_{ijkl} = \frac{\partial A_{ij}}{\partial B_{kl}}$$

Jacobian Tensor: capturing all partial derivatives of one matrix $A$ with respect to another matrix $B$.
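A minimal PyTorch sketch; the map $A = BC$ with a fixed matrix $C$ is an assumption made here for illustration:

```python
import torch
from torch.autograd.functional import jacobian

C = torch.randn(3, 4)

def g(B):
    # Illustrative matrix-to-matrix map A = g(B) = B C (hypothetical choice).
    return B @ C          # B: (2, 3) -> A: (2, 4)

B = torch.randn(2, 3)
J = jacobian(g, B)
print(J.shape)            # torch.Size([2, 4, 2, 3]), i.e. (m x n) x (p x q)
# J[i, j, k, l] = dA[i, j] / dB[k, l]
```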
2. Gradients of Matrices
Example of Gradient between two matrices
Let:

Define the relationship:

We want to compute the Jacobian tensor $\dfrac{\partial A}{\partial B}$:
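A concrete sketch with explicit numbers (the matrices and the relationship $A = BC$ below are illustrative assumptions, chosen so that single entries can be checked by hand):

```python
import torch
from torch.autograd.functional import jacobian

# Hypothetical example data (not from the lecture).
C = torch.tensor([[1.0, 2.0],
                  [3.0, 4.0]])
B = torch.tensor([[5.0, 6.0],
                  [7.0, 8.0]])

def g(B):
    return B @ C          # A = B C, so A[i, j] = sum_k B[i, k] * C[k, j]

J = jacobian(g, B)        # shape (2, 2, 2, 2)

# Check one entry by hand: A[0, 0] = B[0, 0]*C[0, 0] + B[0, 1]*C[1, 0],
# so dA[0, 0]/dB[0, 1] = C[1, 0] = 3.
print(J[0, 0, 0, 1])      # tensor(3.)
```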
2. Gradients of Matrices
PyTorch
Gradient between two matrices in neural networks
https://datahacker.rs/009-pytorch-how-to-apply-backpropagation-with-vectors-and-tensors/
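The linked post walks through backpropagation with vectors and tensors; the sketch below (an illustrative example, not code from the post) shows the vector-Jacobian product that backward computes when the output is a matrix:

```python
import torch

# With a non-scalar output A, backward() requires an explicit "gradient"
# tensor v and accumulates the vector-Jacobian product into B.grad,
# instead of materializing the full 4-D Jacobian tensor.
B = torch.randn(2, 3, requires_grad=True)
A = 2 * B                 # illustrative matrix-to-matrix relationship

v = torch.ones_like(A)    # weight for each output entry
A.backward(gradient=v)

print(B.grad)             # 2 * v: a (2, 3) matrix of 2s
```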
Thanks
