Notation Example
Ian Goodfellow
Yoshua Bengio
Aaron Courville
Contents
Notation
1 Commentary
1.1 Examples
Bibliography
Index
Notation
This section provides a concise reference describing notation used throughout this
document. If you are unfamiliar with any of the corresponding mathematical
concepts, Goodfellow et al. (2016) describe most of these ideas in chapters 2–4.
Indexing
a_i          Element i of vector a, with indexing starting at 1
a_{-i}       All elements of vector a except for element i
A_{i,j}      Element i, j of matrix A
A_{i,:}      Row i of matrix A
A_{:,i}      Column i of matrix A
A_{i,j,k}    Element (i, j, k) of a 3-D tensor A
A_{:,:,i}    2-D slice of a 3-D tensor
a_i          Element i of the random vector a
Calculus
dy/dx                      Derivative of y with respect to x
∂y/∂x                      Partial derivative of y with respect to x
∇_x y                      Gradient of y with respect to x
∇_X y                      Matrix derivatives of y with respect to X
∇_X y                      Tensor containing derivatives of y with respect to X
∂f/∂x                      Jacobian matrix J ∈ R^{m×n} of f : R^n → R^m
∇_x^2 f(x) or H(f)(x)      The Hessian matrix of f at input point x
∫ f(x) dx                  Definite integral over the entire domain of x
∫_S f(x) dx                Definite integral with respect to x over the set S
Functions
f : A → B        The function f with domain A and range B
f ∘ g            Composition of the functions f and g
f(x; θ)          A function of x parametrized by θ. (Sometimes we write f(x) and omit the argument θ to lighten notation)
log x            Natural logarithm of x
σ(x)             Logistic sigmoid, 1 / (1 + exp(−x))
ζ(x)             Softplus, log(1 + exp(x))
||x||_p          L^p norm of x
||x||            L^2 norm of x
x^+              Positive part of x, i.e., max(0, x)
1_condition      is 1 if the condition is true, 0 otherwise
Sometimes we use a function f whose argument is a scalar but apply it to a
vector, matrix, or tensor: f(x), f(X), or f(X). This denotes the application of f
to the array element-wise. For example, if C = σ(X), then C_{i,j,k} = σ(X_{i,j,k}) for all
valid values of i, j, and k.
Chapter 1
Commentary
This document is an example of how to use the accompanying files, together with
some commentary on them. The files are math_commands.tex and notation.tex. The
file math_commands.tex includes several useful LaTeX macros, and notation.tex
defines a notation page that can be placed at the front of any publication.
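As a rough sketch, the two files might be wired into a document as follows; the
document class and the chapter are placeholders, not part of the released files:

    % Sketch of a document that uses both files; class and chapter are assumed.
    \documentclass{book}
    \input{math_commands.tex}   % macro definitions used throughout the document

    \begin{document}

    \input{notation.tex}        % typesets the notation pages at the front

    \chapter{Commentary}
    % ... body text that uses the macros ...

    \end{document}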
We developed these files while writing Goodfellow et al. (2016). We release
these files for anyone to use freely, in order to help establish some standard notation
in the deep learning community.
1.1 Examples
We include this section as an example of some LaTeX commands and the macros
we created for the book.
Citations that support a sentence without actually being used in the sentence
should appear at the end of the sentence, using the citep command:
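A minimal sketch of this usage; the sentence itself and the BibTeX key
sparkes1996 are illustrative, and citep assumes natbib or a compatible package:

    % \citep produces a parenthetical citation at the end of the sentence.
    Greek pottery took on distinctive decoration over the
    centuries \citep{sparkes1996}.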
When the authors of a document, or the document itself, act as a noun in the
sentence, use the citet command:
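Again a minimal sketch; the sentence and the BibTeX key tandy1997 are
illustrative:

    % \citet lets the cited work itself act as the noun in the sentence.
    \citet{tandy1997} provides a translation of Works and Days.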
When introducing a new term, use the newterm macro to highlight it. If
there is a corresponding acronym, put the acronym in parentheses afterward. If
your document includes an index, also use the index command.
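A minimal sketch of this pattern, assuming newterm is the highlighting macro
from math_commands.tex described above; the term and index entry are
illustrative:

    % \newterm highlights the first use of a term; \index records an index entry.
    A \newterm{multilayer perceptron} (MLP) maps an input vector to an
    output value.\index{Multilayer perceptron}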
Sometimes you may want to make many entries in the index that all point to a
canonical index entry:
One of the simplest and most common kinds of parameter norm penalty
is the squared L^2 parameter norm penalty, commonly known as weight
decay. In other academic communities, L^2 regularization is also known
as ridge regression or Tikhonov regularization.
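A sketch of how such entries might be written with the standard makeidx
see-reference syntax; the exact entry names are illustrative:

    % All of these direct the reader to the canonical "Weight decay" entry.
    \index{Weight decay}
    \index{L2 regularization@$L^2$ regularization|see{Weight decay}}
    \index{Ridge regression|see{Weight decay}}
    \index{Tikhonov regularization|see{Weight decay}}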
Similarly, you can refer to different sections of the book using partref, Partref,
secref, Secref, etc.
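A minimal sketch, assuming these macros wrap LaTeX's usual \ref mechanism and
that the capitalized variants are meant for the start of a sentence; the label
name is illustrative:

    % \secref produces a lowercase reference such as "section 1.1";
    % \Secref capitalizes it for the start of a sentence (output format assumed).
    See \secref{sec:examples} for the worked examples.
    \Secref{sec:examples} also demonstrates the citation commands.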
Acknowledgments
We thank Catherine Olsson and Úlfar Erlingsson for proofreading and review of
this manuscript.
[Figure 1.1 shows nested regions labeled "AI", "Machine learning", and "Representation learning".]
Figure 1.1: An example of a figure. The figure is a PDF displayed without being rescaled
within LaTeX. The PDF was created at the right size to fit on the page, with the fonts at
the size they should be displayed. The fonts in the figure are from the Computer Modern
family so they match the fonts used by LaTeX.
Bibliography
Goodfellow, I., Bengio, Y., and Courville, A. (2016). Deep Learning. MIT Press.
Sparkes, B. (1996). The Red and the Black: Studies in Greek Pottery. Routledge.
Tandy, D. W. (1997). Works and Days: A Translation and Commentary for the Social
Sciences. University of California Press.
Index
Graph
Independence
Integral
Jacobian matrix
Kullback-Leibler divergence
Norm