
Mathematics for Machine Learning: Essential Equations (V5)

1. Linear Algebra
• Addition of Vectors:
  $\mathbf{u} + \mathbf{v} = \begin{bmatrix} u_1 + v_1 \\ u_2 + v_2 \\ \vdots \\ u_n + v_n \end{bmatrix}$

• Scaling a Vector:
  $c \cdot \mathbf{v} = \begin{bmatrix} c v_1 \\ c v_2 \\ \vdots \\ c v_n \end{bmatrix}$

• Matrix Scalar Multiplication:
  $c \cdot A = \begin{bmatrix} c a_{11} & c a_{12} & \cdots & c a_{1n} \\ c a_{21} & c a_{22} & \cdots & c a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ c a_{m1} & c a_{m2} & \cdots & c a_{mn} \end{bmatrix}$

• Matrix-Vector Product:
  $A\mathbf{v} = \begin{bmatrix} \sum_{j=1}^{n} a_{1j} v_j \\ \sum_{j=1}^{n} a_{2j} v_j \\ \vdots \\ \sum_{j=1}^{n} a_{mj} v_j \end{bmatrix}$

• Matrix Trace:
  $\operatorname{tr}(A) = \sum_{i=1}^{n} a_{ii}$

• Matrix Determinant (2x2 Matrix):
  $\det(A) = a_{11} a_{22} - a_{12} a_{21}$

• Eigenvector Equation:
  $A\mathbf{v} = \lambda \mathbf{v}$

• Vector Projection:
  $\operatorname{proj}_{\mathbf{b}}(\mathbf{a}) = \dfrac{\mathbf{a} \cdot \mathbf{b}}{\mathbf{b} \cdot \mathbf{b}}\, \mathbf{b}$

• Inverse of a 2x2 Matrix:
  $A^{-1} = \dfrac{1}{\det(A)} \begin{bmatrix} a_{22} & -a_{12} \\ -a_{21} & a_{11} \end{bmatrix}$

• Orthogonality Condition:
  $\mathbf{u} \cdot \mathbf{v} = 0$ if $\mathbf{u}$ and $\mathbf{v}$ are orthogonal.
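As a quick numerical check of the identities above, here is a minimal NumPy sketch; the vectors u, v, the matrix A, and the projection pair a, b are arbitrary example values, not taken from the text.

# Minimal NumPy sketch of the linear-algebra identities above (toy values).
import numpy as np

u = np.array([1.0, 2.0, 3.0])
v = np.array([4.0, 5.0, 6.0])
A = np.array([[2.0, 1.0],
              [1.0, 3.0]])

print(u + v)                     # element-wise vector addition
print(2.0 * v)                   # scaling a vector
print(A @ np.array([1.0, 1.0]))  # matrix-vector product
print(np.trace(A))               # trace = sum of diagonal entries
print(np.linalg.det(A))          # 2x2 determinant: a11*a22 - a12*a21

# Eigenvector equation: A v = lambda v
eigvals, eigvecs = np.linalg.eig(A)
print(np.allclose(A @ eigvecs[:, 0], eigvals[0] * eigvecs[:, 0]))

# Projection of a onto b: (a.b / b.b) b
a, b = np.array([1.0, 2.0]), np.array([3.0, 0.0])
print((a @ b) / (b @ b) * b)

# Inverse of a 2x2 matrix via the adjugate formula
inv = np.array([[A[1, 1], -A[0, 1]], [-A[1, 0], A[0, 0]]]) / np.linalg.det(A)
print(np.allclose(inv, np.linalg.inv(A)))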

2. Probability and Statistics
• Joint Probability:
  $P(A \cap B) = P(A \mid B)\, P(B)$

• Bayes' Theorem (Alternative Form):
  $P(B \mid A) = \dfrac{P(A \mid B)\, P(B)}{P(A)}$

• Variance (Alternative):
  $\operatorname{Var}(X) = E[X^2] - (E[X])^2$

• Cumulative Distribution Function (CDF):
  $F_X(x) = P(X \le x)$

• Covariance (Alternative):
  $\operatorname{Cov}(X, Y) = E[XY] - E[X]\, E[Y]$

• Entropy:
  $H(X) = -\sum_{x} P(x) \log P(x)$

• KL Divergence:
  $D_{\mathrm{KL}}(P \,\|\, Q) = \sum_{x} P(x) \log \dfrac{P(x)}{Q(x)}$

• Conditional Expectation:
  $E[Y \mid X] = \int_{y} y\, f_{Y \mid X}(y \mid x)\, dy$

• Law of Iterated Expectations:
  $E[Y] = E[E[Y \mid X]]$

• Central Limit Theorem:
  $\dfrac{\bar{X} - \mu}{\sigma / \sqrt{n}} \xrightarrow{d} N(0, 1)$
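The variance, covariance, entropy, and KL formulas above are easy to verify numerically; the sketch below uses made-up samples and two invented discrete distributions P and Q purely for illustration.

# Numerical check of the variance, covariance, entropy, and KL formulas above.
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(loc=2.0, scale=3.0, size=100_000)
y = 0.5 * x + rng.normal(size=100_000)

var_x = np.mean(x**2) - np.mean(x)**2               # Var(X) = E[X^2] - (E[X])^2
cov_xy = np.mean(x * y) - np.mean(x) * np.mean(y)   # Cov(X,Y) = E[XY] - E[X]E[Y]

P = np.array([0.2, 0.5, 0.3])                       # example distributions
Q = np.array([0.3, 0.4, 0.3])
entropy = -np.sum(P * np.log(P))                    # H(X) = -sum P(x) log P(x)
kl = np.sum(P * np.log(P / Q))                      # D_KL(P || Q)

print(var_x, cov_xy, entropy, kl)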

3. Calculus
• Power Rule:
  $\dfrac{d}{dx}[x^n] = n x^{n-1}$

• Product Rule:
  $\dfrac{d}{dx}[uv] = u \dfrac{dv}{dx} + v \dfrac{du}{dx}$

• Quotient Rule:
  $\dfrac{d}{dx}\left[\dfrac{u}{v}\right] = \dfrac{v \dfrac{du}{dx} - u \dfrac{dv}{dx}}{v^2}$

• Exponential Derivative:
  $\dfrac{d}{dx}[e^x] = e^x$

• Logarithmic Derivative:
  $\dfrac{d}{dx}[\ln x] = \dfrac{1}{x}$

• Integral of a Power Function:
  $\int x^n\, dx = \dfrac{x^{n+1}}{n+1} + C \quad \text{for } n \neq -1$

• Fundamental Theorem of Calculus:
  $\int_a^b f'(x)\, dx = f(b) - f(a)$

• Chain Rule (Alternative Form):
  $\dfrac{dy}{dx} = \dfrac{dy}{du} \cdot \dfrac{du}{dx}$

• Taylor Expansion (Simplified):
  $f(x) \approx f(a) + f'(a)(x - a) + \dfrac{f''(a)}{2}(x - a)^2$

• Jacobian Matrix:
  $J_{ij} = \dfrac{\partial f_i}{\partial x_j}$
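A finite-difference sketch is one way to sanity-check a few of the rules above; the test functions, the point x0, and the interval [a, b] below are arbitrary choices for illustration.

# Finite-difference checks of the power rule, chain rule, and the
# fundamental theorem of calculus (toy functions and points).
import numpy as np

def numerical_derivative(f, x, h=1e-6):
    # central-difference approximation of f'(x)
    return (f(x + h) - f(x - h)) / (2 * h)

x0 = 1.7
# Power rule: d/dx x^3 = 3 x^2
print(numerical_derivative(lambda x: x**3, x0), 3 * x0**2)
# Chain rule: d/dx sin(x^2) = cos(x^2) * 2x
print(numerical_derivative(lambda x: np.sin(x**2), x0), np.cos(x0**2) * 2 * x0)
# Fundamental theorem: integral of f'(x) over [a, b] equals f(b) - f(a)
a, b = 0.0, 2.0
grid = np.linspace(a, b, 10_001)
print(np.trapz(3 * grid**2, grid), b**3 - a**3)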

4. Optimization
• Stochastic Gradient Descent (SGD):
  $w \leftarrow w - \eta\, \nabla J(w; x_i, y_i)$

• Momentum Gradient Descent:
  $v_t = \beta v_{t-1} + (1 - \beta)\, \nabla J(w), \qquad w \leftarrow w - \eta\, v_t$

• RMSProp Update Rule:
  $w \leftarrow w - \dfrac{\eta}{\sqrt{\nabla^2 J(w) + \epsilon}}\, \nabla J(w)$

• Nesterov Accelerated Gradient:
  $w_{t+1} = w_t - \eta\, \nabla J\big(w_t + \beta (w_t - w_{t-1})\big)$

• Adam Optimization:
  $m_t = \beta_1 m_{t-1} + (1 - \beta_1)\, \nabla J(w), \qquad v_t = \beta_2 v_{t-1} + (1 - \beta_2)\, (\nabla J(w))^2$

• Gradient Clipping:
  $\nabla J(w) \leftarrow \dfrac{\nabla J(w)}{\max(1, \|\nabla J(w)\| / c)}$

• Projected Gradient Descent:
  $w_{t+1} = \Pi_C\big(w_t - \eta\, \nabla J(w_t)\big)$

• Newton's Method:
  $w_{t+1} = w_t - \eta\, H^{-1} \nabla J(w_t)$

• Proximal Gradient Method:
  $w_{t+1} = \operatorname{prox}_g\big(w_t - \eta\, \nabla f(w_t)\big)$

• Learning Rate Decay:
  $\eta_t = \dfrac{\eta_0}{1 + \lambda t}$
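A toy sketch of the momentum and Adam updates above, applied to the quadratic J(w) = ||w||^2; the objective, hyperparameters, and iteration counts are illustrative assumptions, and Adam's bias correction is omitted for brevity.

# Momentum and (simplified) Adam updates on a toy quadratic objective.
import numpy as np

def grad_J(w):
    return 2 * w                                 # gradient of J(w) = w^T w

w = np.array([5.0, -3.0])
v = np.zeros_like(w)
eta, beta = 0.1, 0.9
for _ in range(100):
    v = beta * v + (1 - beta) * grad_J(w)        # momentum accumulator
    w = w - eta * v
print(w)                                         # approaches the minimum at 0

# Adam-style first/second moment updates (bias correction omitted)
w = np.array([5.0, -3.0])
m, s = np.zeros_like(w), np.zeros_like(w)
beta1, beta2, eps = 0.9, 0.999, 1e-8
for _ in range(500):
    g = grad_J(w)
    m = beta1 * m + (1 - beta1) * g              # first moment
    s = beta2 * s + (1 - beta2) * g**2           # second moment
    w = w - eta * m / (np.sqrt(s) + eps)
print(w)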

5. Regression Models
• Linear Regression Hypothesis:
  $\hat{y} = Xw + b$

• Ordinary Least Squares (OLS):
  $w = (X^T X)^{-1} X^T y$

• Ridge Regression Objective:
  $J(w) = \|y - Xw\|^2 + \lambda \|w\|^2$

• Lasso Regression Objective:
  $J(w) = \|y - Xw\|^2 + \lambda \|w\|_1$

• Logistic Regression Hypothesis:
  $\hat{y} = \sigma(Xw + b), \qquad \sigma(z) = \dfrac{1}{1 + e^{-z}}$

• Cross-Entropy Loss:
  $J(w) = -\dfrac{1}{m} \sum_{i=1}^{m} \big[ y_i \log(\hat{y}_i) + (1 - y_i) \log(1 - \hat{y}_i) \big]$

• Mean Absolute Error (MAE):
  $\text{MAE} = \dfrac{1}{m} \sum_{i=1}^{m} |y_i - \hat{y}_i|$

• Mean Squared Error (MSE):
  $\text{MSE} = \dfrac{1}{m} \sum_{i=1}^{m} (y_i - \hat{y}_i)^2$

• Coefficient of Determination (R-squared):
  $R^2 = 1 - \dfrac{\sum_{i=1}^{m} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{m} (y_i - \bar{y})^2}$

• Adjusted R-squared:
  $\bar{R}^2 = 1 - \dfrac{(1 - R^2)(n - 1)}{n - p - 1}$

• Gradient of MSE Loss:
  $\nabla J(w) = \dfrac{1}{m} X^T (Xw - y)$

• Hinge Loss for SVM:
  $J(w) = \dfrac{1}{m} \sum_{i=1}^{m} \max\big(0, 1 - y_i (w^T x_i + b)\big)$

• Huber Loss:
  $L_\delta(a) = \begin{cases} \frac{1}{2} a^2 & \text{if } |a| \le \delta, \\ \delta \left(|a| - \frac{1}{2}\delta\right) & \text{if } |a| > \delta \end{cases}$
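A NumPy sketch of the OLS solution, a ridge fit (solved via its standard closed form (X^T X + lambda I)^{-1} X^T y), and the error metrics above on synthetic data; the true weights, noise level, and lambda are invented for illustration.

# OLS, ridge, and regression metrics on synthetic data (toy values).
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 3))
true_w = np.array([1.5, -2.0, 0.5])
y = X @ true_w + 0.1 * rng.normal(size=200)

w_ols = np.linalg.solve(X.T @ X, X.T @ y)            # (X^T X)^{-1} X^T y
lam = 1.0
w_ridge = np.linalg.solve(X.T @ X + lam * np.eye(3), X.T @ y)

y_hat = X @ w_ols
mse = np.mean((y - y_hat)**2)                        # mean squared error
mae = np.mean(np.abs(y - y_hat))                     # mean absolute error
r2 = 1 - np.sum((y - y_hat)**2) / np.sum((y - y.mean())**2)
grad_mse = X.T @ (X @ w_ols - y) / len(y)            # gradient of MSE loss
print(w_ols, w_ridge, mse, mae, r2, grad_mse)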

6. Neural Networks
• Perceptron Update Rule:
  $w \leftarrow w + \eta\, (y - \hat{y})\, x$

• Sigmoid Activation Function:
  $\sigma(z) = \dfrac{1}{1 + e^{-z}}$

• ReLU Activation Function:
  $f(x) = \max(0, x)$

• Softmax Function:
  $\operatorname{Softmax}(z_i) = \dfrac{e^{z_i}}{\sum_{j=1}^{n} e^{z_j}}$

• Loss Function for Multi-Class Classification:
  $J(w) = -\dfrac{1}{m} \sum_{i=1}^{m} \sum_{k=1}^{K} y_{ik} \log(\hat{y}_{ik})$

• Forward Propagation (Single Layer):
  $a = \sigma(w^T x + b)$

• Backward Propagation (Gradient for Weights):
  $\dfrac{\partial J}{\partial w} = x\, (\hat{y} - y)$

• Gradient Descent for Neural Networks:
  $w \leftarrow w - \eta\, \dfrac{\partial J}{\partial w}$

• Dropout Regularization:
  $h_i^{(l)} = r_i\, h_i^{(l)}, \qquad r_i \sim \text{Bernoulli}(p)$

• Batch Normalization:
  $\hat{x}_i = \dfrac{x_i - \mu_B}{\sqrt{\sigma_B^2 + \epsilon}}, \qquad y_i = \gamma \hat{x}_i + \beta$
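A single-neuron sketch of the sigmoid, softmax, forward pass, weight gradient, and gradient step above; the weights, bias, input, target, and learning rate are toy values chosen for this example.

# Forward and backward pass for one sigmoid unit, plus softmax (toy values).
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def softmax(z):
    z = z - np.max(z)                 # shift for numerical stability
    e = np.exp(z)
    return e / e.sum()

w = np.array([0.2, -0.1])
b = 0.05
x = np.array([1.0, 2.0])
y = 1.0

a = sigmoid(w @ x + b)                # forward propagation (single layer)
grad_w = x * (a - y)                  # gradient of the loss w.r.t. w
w = w - 0.1 * grad_w                  # one gradient-descent step
print(a, grad_w, softmax(np.array([1.0, 2.0, 3.0])))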

7. Clustering
• k-Means Objective Function:
  $J = \sum_{k=1}^{K} \sum_{i \in C_k} \|x_i - \mu_k\|^2$

• Centroid Update Rule:
  $\mu_k = \dfrac{1}{|C_k|} \sum_{x \in C_k} x$

• Distance Metric (Euclidean Distance):
  $d(x, y) = \sqrt{\sum_{i=1}^{n} (x_i - y_i)^2}$

• Silhouette Score:
  $s(i) = \dfrac{b(i) - a(i)}{\max(a(i), b(i))}$

• DBSCAN Core Point Condition:
  $|N_\epsilon(x)| \ge \text{MinPts}$, where $N_\epsilon(x) = \{ y : d(x, y) \le \epsilon \}$

• Hierarchical Clustering Dendrogram Objective:
  Minimize the linkage criterion $L(A, B)$.

• Gaussian Mixture Model (GMM):
  $p(x) = \sum_{k=1}^{K} \pi_k\, \mathcal{N}(x \mid \mu_k, \Sigma_k)$

• Expectation-Maximization (E-step):
  $\gamma_{ik} = \dfrac{\pi_k\, \mathcal{N}(x_i \mid \mu_k, \Sigma_k)}{\sum_{j=1}^{K} \pi_j\, \mathcal{N}(x_i \mid \mu_j, \Sigma_j)}$

• Expectation-Maximization (M-step):
  $\mu_k = \dfrac{\sum_{i=1}^{N} \gamma_{ik}\, x_i}{\sum_{i=1}^{N} \gamma_{ik}} \quad \text{and} \quad \Sigma_k = \dfrac{\sum_{i=1}^{N} \gamma_{ik}\, (x_i - \mu_k)(x_i - \mu_k)^T}{\sum_{i=1}^{N} \gamma_{ik}}$

• Elbow Method for Optimal k:
  Choose $k$ where $J(k)$ has the largest drop.
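A bare-bones k-means loop implementing the assignment and centroid-update rules above, ending with the objective J; the two-blob data set and the choice k = 2 are synthetic assumptions for illustration.

# Minimal k-means: assign to nearest centroid, then recompute centroids.
import numpy as np

rng = np.random.default_rng(2)
X = np.vstack([rng.normal(0, 0.5, (50, 2)), rng.normal(4, 0.5, (50, 2))])
k = 2
centroids = X[rng.choice(len(X), k, replace=False)]   # random initial centroids

for _ in range(20):
    # assign each point to its nearest centroid (Euclidean distance)
    dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    labels = dists.argmin(axis=1)
    # centroid update: mean of the points in each cluster
    centroids = np.array([X[labels == j].mean(axis=0) for j in range(k)])

J = sum(np.sum((X[labels == j] - centroids[j])**2) for j in range(k))
print(centroids, J)   # J should be small for well-separated clusters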

8. Dimensionality Reduction
• Principal Component Analysis (PCA) Objective:
  Maximize $\|Xw\|^2$ subject to $\|w\| = 1$

• Covariance Matrix for PCA:
  $C = \dfrac{1}{m} X^T X$

• Eigen Decomposition for PCA:
  $Cw = \lambda w$

• t-SNE Objective:
  $C = \sum_{i \neq j} p_{ij} \log \dfrac{p_{ij}}{q_{ij}}$

• Singular Value Decomposition (SVD):
  $X = U \Sigma V^T$

• LDA Objective (Fisher's Criterion):
  $J(w) = \dfrac{w^T S_b w}{w^T S_w w}$

• Reconstruction Error for PCA:
  $\text{Error} = \|X - \hat{X}\|_F$

• Kernel PCA Transformation:
  $\phi(x) \rightarrow$ principal components in feature space

• Autoencoder Reconstruction:
  $X \approx g(f(X))$

• Explained Variance Ratio:
  $\text{Ratio} = \dfrac{\lambda_i}{\sum_j \lambda_j}$
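A PCA sketch built from the covariance eigen-decomposition, explained variance ratio, reconstruction error, and SVD above; the correlated 2-D data set and the choice to keep one component are illustrative assumptions.

# PCA via the covariance eigen-decomposition, plus an SVD of the same data.
import numpy as np

rng = np.random.default_rng(3)
X = rng.normal(size=(500, 2)) @ np.array([[3.0, 1.0], [1.0, 0.5]])
X = X - X.mean(axis=0)                      # center the data before PCA

C = X.T @ X / len(X)                        # covariance matrix C = X^T X / m
eigvals, eigvecs = np.linalg.eigh(C)        # eigen decomposition: C w = lambda w
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

explained_ratio = eigvals / eigvals.sum()   # explained variance ratio
Z = X @ eigvecs[:, :1]                      # project onto first component
X_hat = Z @ eigvecs[:, :1].T                # reconstruct from that component
recon_error = np.linalg.norm(X - X_hat)     # Frobenius reconstruction error

U, S, Vt = np.linalg.svd(X, full_matrices=False)   # X = U Sigma V^T
print(explained_ratio, recon_error, S)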

9. Probability Distributions
• Bernoulli Distribution:
  $P(X = x) = p^x (1 - p)^{1 - x}, \qquad x \in \{0, 1\}$

• Binomial Distribution:
  $P(X = k) = \dbinom{n}{k} p^k (1 - p)^{n - k}, \qquad k \in \{0, 1, \ldots, n\}$

• Poisson Distribution:
  $P(X = k) = \dfrac{\lambda^k e^{-\lambda}}{k!}, \qquad k \ge 0$

• Uniform Distribution:
  $f(x) = \begin{cases} \dfrac{1}{b - a}, & a \le x \le b \\ 0, & \text{otherwise} \end{cases}$

• Normal Distribution:
  $f(x) = \dfrac{1}{\sqrt{2\pi\sigma^2}} e^{-\frac{(x - \mu)^2}{2\sigma^2}}$

• Exponential Distribution:
  $f(x) = \begin{cases} \lambda e^{-\lambda x}, & x \ge 0 \\ 0, & x < 0 \end{cases}$

• Beta Distribution:
  $f(x; \alpha, \beta) = \dfrac{x^{\alpha - 1} (1 - x)^{\beta - 1}}{B(\alpha, \beta)}, \qquad x \in [0, 1]$

• Gamma Distribution:
  $f(x; \alpha, \beta) = \dfrac{\beta^\alpha x^{\alpha - 1} e^{-\beta x}}{\Gamma(\alpha)}, \qquad x \ge 0$

• Multinomial Distribution:
  $P(X_1 = x_1, \ldots, X_k = x_k) = \dfrac{n!}{x_1!\, x_2! \cdots x_k!}\, p_1^{x_1} p_2^{x_2} \cdots p_k^{x_k}$

• Chi-Square Distribution:
  $f(x; k) = \dfrac{x^{k/2 - 1} e^{-x/2}}{2^{k/2}\, \Gamma(k/2)}, \qquad x \ge 0$
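The pmf/pdf formulas above translate directly into code; the sketch below implements a few of them and checks numerically that they sum or integrate to 1. The parameter values p, n, lambda, mu, and sigma are arbitrary.

# Direct implementations of a few pmf/pdf formulas, with normalization checks.
import numpy as np
from math import comb, factorial

p, n, lam, mu, sigma = 0.3, 10, 2.0, 0.0, 1.0

binom_pmf = lambda k: comb(n, k) * p**k * (1 - p)**(n - k)
poisson_pmf = lambda k: lam**k * np.exp(-lam) / factorial(k)
normal_pdf = lambda x: np.exp(-(x - mu)**2 / (2 * sigma**2)) / np.sqrt(2 * np.pi * sigma**2)

print(sum(binom_pmf(k) for k in range(n + 1)))     # binomial pmf sums to 1
print(sum(poisson_pmf(k) for k in range(50)))      # Poisson pmf sums to ~1
grid = np.linspace(-10, 10, 100_001)
print(np.trapz(normal_pdf(grid), grid))            # normal density integrates to ~1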

10. Reinforcement Learning
• Bellman Equation for State-Value Function:
  $V(s) = E\big[ R_t + \gamma V(S_{t+1}) \mid S_t = s \big]$

• Bellman Equation for Action-Value Function:
  $Q(s, a) = E\big[ R_t + \gamma Q(S_{t+1}, A_{t+1}) \mid S_t = s, A_t = a \big]$

• Policy Improvement:
  $\pi'(s) = \arg\max_a Q(s, a)$

• Temporal Difference Update Rule:
  $V(S_t) \leftarrow V(S_t) + \alpha \big[ R_{t+1} + \gamma V(S_{t+1}) - V(S_t) \big]$

• Q-Learning Update Rule:
  $Q(S_t, A_t) \leftarrow Q(S_t, A_t) + \alpha \big[ R_{t+1} + \gamma \max_a Q(S_{t+1}, a) - Q(S_t, A_t) \big]$

• SARSA Update Rule:
  $Q(S_t, A_t) \leftarrow Q(S_t, A_t) + \alpha \big[ R_{t+1} + \gamma Q(S_{t+1}, A_{t+1}) - Q(S_t, A_t) \big]$

• Reward Function:
  $R(s, a) = E[R_t \mid S_t = s, A_t = a]$

• Value Iteration Update Rule:
  $V(s) \leftarrow \max_a \big[ R(s, a) + \gamma \sum_{s'} P(s' \mid s, a)\, V(s') \big]$

• Actor-Critic Policy Update:
  $\theta \leftarrow \theta + \alpha\, \nabla_\theta \log \pi_\theta(a \mid s)\, \delta$

• Discounted Return:
  $G_t = \sum_{k=0}^{\infty} \gamma^k R_{t+k+1}$
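A tabular Q-learning sketch applying the update rule above to a tiny deterministic chain environment invented for this example; the reward placement, epsilon-greedy exploration, and hyperparameters are assumptions, not part of the formula list.

# Tabular Q-learning on a 5-state chain; moving right eventually earns reward.
import numpy as np

n_states, n_actions = 5, 2           # actions: 0 = left, 1 = right
alpha, gamma, eps = 0.1, 0.9, 0.1
Q = np.zeros((n_states, n_actions))
rng = np.random.default_rng(4)

def step(s, a):
    s_next = min(s + 1, n_states - 1) if a == 1 else max(s - 1, 0)
    reward = 1.0 if s_next == n_states - 1 else 0.0   # reward at the right end
    return s_next, reward

for _ in range(2000):                 # episodes
    s = 0
    for _ in range(20):               # steps per episode
        a = rng.integers(n_actions) if rng.random() < eps else Q[s].argmax()
        s_next, r = step(s, a)
        # Q-learning update: Q(s,a) += alpha [r + gamma max_a' Q(s',a') - Q(s,a)]
        Q[s, a] += alpha * (r + gamma * Q[s_next].max() - Q[s, a])
        s = s_next

print(Q.argmax(axis=1))   # greedy policy should prefer moving right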
