Example Notation for Deep Learning

Ian Goodfellow
Yoshua Bengio
Aaron Courville
Contents

Notation

1 Commentary
1.1 Examples

Bibliography

Index
Notation

This section provides a concise reference describing notation used throughout this
document. If you are unfamiliar with any of the corresponding mathematical
concepts, Goodfellow et al. (2016) describe most of these ideas in chapters 2–4.

Numbers and Arrays


a            A scalar (integer or real)
a            A vector
A            A matrix
A            A tensor
I_n          Identity matrix with n rows and n columns
I            Identity matrix with dimensionality implied by context
e^(i)        Standard basis vector [0, . . . , 0, 1, 0, . . . , 0] with a 1 at position i
diag(a)      A square, diagonal matrix with diagonal entries given by a
a            A scalar random variable
a            A vector-valued random variable
A            A matrix-valued random variable
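
In math_commands.tex, symbols like these are produced by short macros. A hedged sketch of what using them might look like; the exact macro names (a one-letter type prefix such as v for vector, m for matrix, t for tensor) are assumptions about the file's conventions and should be verified against it:

    % Assumed macro names from math_commands.tex -- verify against the file.
    The vector $\va$, the matrix $\mA$, and the tensor $\tA$
    appear in the product $\mA \va$; $\mI$ denotes the identity matrix.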


Sets and Graphs


A                   A set
R                   The set of real numbers
{0, 1}              The set containing 0 and 1
{0, 1, . . . , n}   The set of all integers between 0 and n
[a, b]              The real interval including a and b
(a, b]              The real interval excluding a but including b
A \ B               Set subtraction, i.e., the set containing the elements of A that are not in B
G                   A graph
Pa_G(x_i)           The parents of x_i in G

Indexing
a_i          Element i of vector a, with indexing starting at 1
a_{-i}       All elements of vector a except for element i
A_{i,j}      Element i, j of matrix A
A_{i,:}      Row i of matrix A
A_{:,i}      Column i of matrix A
A_{i,j,k}    Element (i, j, k) of a 3-D tensor A
A_{:,:,i}    2-D slice of a 3-D tensor
a_i          Element i of the random vector a

Linear Algebra Operations


A^⊤          Transpose of matrix A
A^+          Moore-Penrose pseudoinverse of A
A ⊙ B        Element-wise (Hadamard) product of A and B
det(A)       Determinant of A


Calculus
dy/dx                    Derivative of y with respect to x
∂y/∂x                    Partial derivative of y with respect to x
∇_x y                    Gradient of y with respect to x
∇_X y                    Matrix derivatives of y with respect to X
∇_X y                    Tensor containing derivatives of y with respect to X
∂f/∂x                    Jacobian matrix J ∈ R^(m×n) of f : R^n → R^m
∇²_x f(x) or H(f)(x)     The Hessian matrix of f at input point x
∫ f(x) dx                Definite integral over the entire domain of x
∫_S f(x) dx              Definite integral with respect to x over the set S
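
Without the helper macros, a few of these expressions can be typeset in plain LaTeX as follows; this is a sketch, not the exact source of the table:

    % Plain-LaTeX renderings of some calculus entries
    % (requires amsmath, amssymb, and bm).
    \[
    \frac{dy}{dx}, \quad \frac{\partial y}{\partial x}, \quad
    \nabla_{\bm{x}} y, \quad \nabla_{\bm{x}}^2 f(\bm{x}), \quad
    \int f(x)\,dx, \quad \int_{\mathbb{S}} f(x)\,dx
    \]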

Probability and Information Theory


a ⊥ b                      The random variables a and b are independent
a ⊥ b | c                  They are conditionally independent given c
P(a)                       A probability distribution over a discrete variable
p(a)                       A probability distribution over a continuous variable, or over a variable whose type has not been specified
a ∼ P                      Random variable a has distribution P
E_{x∼P}[f(x)] or E f(x)    Expectation of f(x) with respect to P(x)
Var(f(x))                  Variance of f(x) under P(x)
Cov(f(x), g(x))            Covariance of f(x) and g(x) under P(x)
H(x)                       Shannon entropy of the random variable x
D_KL(P ‖ Q)                Kullback-Leibler divergence of P and Q
N(x; µ, Σ)                 Gaussian distribution over x with mean µ and covariance Σ
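
A couple of these entries in plain LaTeX (a sketch; math_commands.tex may well define shorthands for them):

    \[
    D_{\mathrm{KL}}(P \,\|\, Q), \qquad
    \mathcal{N}(\bm{x}; \bm{\mu}, \bm{\Sigma}), \qquad
    \mathbb{E}_{x \sim P}[f(x)]
    \]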


Functions
f : A → B      The function f with domain A and range B
f ◦ g          Composition of the functions f and g
f(x; θ)        A function of x parametrized by θ. (Sometimes we write f(x) and omit the argument θ to lighten notation)
log x          Natural logarithm of x
σ(x)           Logistic sigmoid, 1 / (1 + exp(−x))
ζ(x)           Softplus, log(1 + exp(x))
||x||_p        L^p norm of x
||x||          L^2 norm of x
x^+            Positive part of x, i.e., max(0, x)
1_condition    is 1 if the condition is true, 0 otherwise
Sometimes we use a function f whose argument is a scalar but apply it to a vector, matrix, or tensor: f(x), f(X), or f(X). This denotes the application of f to the array element-wise. For example, if C = σ(X), then C_{i,j,k} = σ(X_{i,j,k}) for all valid values of i, j, and k.

Datasets and Distributions


p_data           The data generating distribution
p̂_data           The empirical distribution defined by the training set
X                A set of training examples
x^(i)            The i-th example (input) from a dataset
y^(i) or y^(i)   The target associated with x^(i) for supervised learning
X                The m × n matrix with input example x^(i) in row X_{i,:}

Chapter 1

Commentary

This document shows how to use the accompanying files and offers some commentary on them. The files are math_commands.tex and notation.tex. The file math_commands.tex includes several useful LaTeX macros, and notation.tex defines a notation page that can be placed at the front of any publication.
We developed these files while writing Goodfellow et al. (2016). We release
these files for anyone to use freely, in order to help establish some standard notation
in the deep learning community.
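
A minimal sketch of how a publication might pull these files in; the document class and package list are assumptions, and the \input paths assume the two files sit next to the main .tex file:

    % Preamble sketch -- package choices are assumptions; math_commands.tex
    % uses bold math symbols, so bm (or similar) is likely needed.
    \documentclass{article}
    \usepackage{amsmath,amssymb,bm}
    \input{math_commands}   % defines the notation macros
    \begin{document}
    \input{notation}        % typesets the notation page at the front
    % ... body of the publication ...
    \end{document}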

1.1 Examples
We include this section as an example of some LaTeX commands and the macros
we created for the book.
Citations that support a sentence without actually being used in the sentence
should appear at the end of the sentence, using the citep command:

Inventors have long dreamed of creating machines that think. This desire dates back to at least the time of ancient Greece. The mythical figures Pygmalion, Daedalus, and Hephaestus may all be interpreted as legendary inventors, and Galatea, Talos, and Pandora may all be regarded as artificial life (Ovid and Martin, 2004; Sparkes, 1996; Tandy, 1997).
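
The end of that passage, sketched in source form with natbib's citep; the citation keys are hypothetical placeholders for your .bib entries:

    % Citation keys are hypothetical -- match them to your bibliography.
    ... and Galatea, Talos, and Pandora may all be regarded as
    artificial life \citep{ovid2004, sparkes1996, tandy1997}.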

When the authors of a document or the document itself are a noun in the
sentence, use the citet command:

Mitchell (1997) provides a succinct definition of machine learning: "A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P, if its performance at tasks in T, as measured by P, improves with experience E."
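
With citet the author names become part of the sentence; again the citation key is a hypothetical placeholder:

    \citet{mitchell1997} provides a succinct definition of machine
    learning: ``A computer program is said to learn from experience
    $E$ \ldots''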

When introducing a new term, use the newterm macro to highlight it. If
there is a corresponding acronym, put the acronym in parentheses afterward. If
your document includes an index, also use the index command.

Today, artificial intelligence (AI) is a thriving field with many practical applications and active research topics.
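
A sketch of the source for that sentence, assuming newterm takes the term as its single argument:

    % newterm's signature is assumed from its described behavior.
    Today, \newterm{artificial intelligence} (AI) is a thriving field
    with many practical applications and active research topics.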

Sometimes you may want to make many entries in the index that all point to a
canonical index entry:

One of the simplest and most common kinds of parameter norm penalty is the squared L^2 parameter norm penalty, commonly known as weight decay. In other academic communities, L^2 regularization is also known as ridge regression or Tikhonov regularization.
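
Standard makeidx syntax supports such redirections with the |see modifier; a sketch of how the redirecting entries might be written:

    % |see creates an index entry that points to the canonical one.
    ... is also known as ridge regression\index{Ridge regression|see{weight decay}}
    or Tikhonov regularization\index{Tikhonov regularization|see{weight decay}}.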

To refer to a figure, use either figref or Figref, depending on whether you want to capitalize the resulting word in the sentence.

See figure 1.1 for an example of how to include graphics in your document. Figure 1.1 shows how to include graphics in your document.
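
A sketch of those two sentences in source form; the label fig:venn is a hypothetical placeholder:

    See \figref{fig:venn} for an example of how to include graphics
    in your document. \Figref{fig:venn} shows how to include graphics
    in your document.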

Similarly, you can refer to different sections of the book using partref, Partref,
secref, Secref, etc.

You are currently reading section 1.1.
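
And likewise for sections; the label sec:examples is a hypothetical placeholder:

    You are currently reading \secref{sec:examples}.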

Acknowledgments
We thank Catherine Olsson and Úlfar Erlingsson for proofreading and review of
this manuscript.


[Figure 1.1 is a nested Venn diagram: AI contains machine learning, which contains representation learning, which contains deep learning. Examples at each level: knowledge bases (AI), logistic regression (machine learning), shallow autoencoders (representation learning), and MLPs (deep learning).]
Figure 1.1: An example of a figure. The figure is a PDF displayed without being rescaled within LaTeX. The PDF was created at the right size to fit on the page, with the fonts at the size they should be displayed. The fonts in the figure are from the Computer Modern family so they match the fonts used by LaTeX.

Bibliography

Goodfellow, I., Bengio, Y., and Courville, A. (2016). Deep Learning. MIT Press.

Mitchell, T. M. (1997). Machine Learning. McGraw-Hill, New York.

Ovid and Martin, C. (2004). Metamorphoses. W.W. Norton.

Sparkes, B. (1996). The Red and the Black: Studies in Greek Pottery. Routledge.

Tandy, D. W. (1997). Works and Days: A Translation and Commentary for the Social
Sciences. University of California Press.

Index

Artificial intelligence
Conditional independence
Covariance
Derivative
Determinant
Element-wise product, see Hadamard product
Graph
Hadamard product
Hessian matrix
Independence
Integral
Jacobian matrix
Kullback-Leibler divergence
Matrix
Norm
Ridge regression, see weight decay
Scalar
Set
Shannon entropy
Sigmoid
Softplus
Tensor
Tikhonov regularization, see weight decay
Transpose
Variance
Vector
Weight decay
