CS229 Machine Learning Notes

Contents

I Supervised learning 5
1 Linear regression 8
1.1 LMS algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . 9
1.2 The normal equations . . . . . . . . . . . . . . . . . . . . . . . 13
1.2.1 Matrix derivatives . . . . . . . . . . . . . . . . . . . . . 13
1.2.2 Least squares revisited . . . . . . . . . . . . . . . . . . 14
1.3 Probabilistic interpretation . . . . . . . . . . . . . . . . . . . . 15
1.4 Locally weighted linear regression (optional reading) . . . . . . 17

2 Classification and logistic regression 20


2.1 Logistic regression . . . . . . . . . . . . . . . . . . . . . . . . 20
2.2 Digression: the perceptron learning algorithm . . . . . . . . . 23
2.3 Multi-class classification . . . . . . . . . . . . . . . . . . . . . 24
2.4 Another algorithm for maximizing ℓ(θ) . . . . . . . . . . . . . 27

3 Generalized linear models 29


3.1 The exponential family . . . . . . . . . . . . . . . . . . . . . . 29
3.2 Constructing GLMs . . . . . . . . . . . . . . . . . . . . . . . . 31
3.2.1 Ordinary least squares . . . . . . . . . . . . . . . . . . 32
3.2.2 Logistic regression . . . . . . . . . . . . . . . . . . . . 33

4 Generative learning algorithms 34


4.1 Gaussian discriminant analysis . . . . . . . . . . . . . . . . . . 35
4.1.1 The multivariate normal distribution . . . . . . . . . . 35
4.1.2 The Gaussian discriminant analysis model . . . . . . . 38
4.1.3 Discussion: GDA and logistic regression . . . . . . . . 40
4.2 Naive Bayes (optional reading) . . . . . . . . . . . . . . . . . 41
4.2.1 Laplace smoothing . . . . . . . . . . . . . . . . . . . . 44
4.2.2 Event models for text classification . . . . . . . . . . . 46


5 Kernel methods 48
5.1 Feature maps . . . . . . . . . . . . . . . . . . . . . . . . . . . 48
5.2 LMS (least mean squares) with features . . . . . . . . . . . . . 49
5.3 LMS with the kernel trick . . . . . . . . . . . . . . . . . . . . 49
5.4 Properties of kernels . . . . . . . . . . . . . . . . . . . . . . . 53

6 Support vector machines 59


6.1 Margins: intuition . . . . . . . . . . . . . . . . . . . . . . . . . 59
6.2 Notation (optional reading) . . . . . . . . . . . . . . . . . . 61
6.3 Functional and geometric margins (optional reading) . . . . . 61
6.4 The optimal margin classifier (optional reading) . . . . . . . 63
6.5 Lagrange duality (optional reading) . . . . . . . . . . . . . . . 65
6.6 Optimal margin classifiers: the dual form (optional reading) . 68
6.7 Regularization and the non-separable case (optional reading) . 72
6.8 The SMO algorithm (optional reading) . . . . . . . . . . . . . 73
6.8.1 Coordinate ascent . . . . . . . . . . . . . . . . . . . . . 74
6.8.2 SMO . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75

II Deep learning 79
7 Deep learning 80
7.1 Supervised learning with non-linear models . . . . . . . . . . . 80
7.2 Neural networks . . . . . . . . . . . . . . . . . . . . . . . . . . 84
7.3 Modules in Modern Neural Networks . . . . . . . . . . . . . . 92
7.4 Backpropagation . . . . . . . . . . . . . . . . . . . . . . . . . 98
7.4.1 Preliminaries on partial derivatives . . . . . . . . . . . 99
7.4.2 General strategy of backpropagation . . . . . . . . . . 102
7.4.3 Backward functions for basic modules . . . . . . . . . . 105
7.4.4 Back-propagation for MLPs . . . . . . . . . . . . . . . 107
7.5 Vectorization over training examples . . . . . . . . . . . . . . 109

III Generalization and regularization 112


8 Generalization 113
8.1 Bias-variance tradeoff . . . . . . . . . . . . . . . . . . . . . . . 115
8.1.1 A mathematical decomposition (for regression) . . . . . 120
8.2 The double descent phenomenon . . . . . . . . . . . . . . . . . 121
8.3 Sample complexity bounds (optional readings) . . . . . . . . . 126

8.3.1 Preliminaries . . . . . . . . . . . . . . . . . . . . . . . 126


8.3.2 The case of finite H . . . . . . . . . . . . . . . . . . . . 128
8.3.3 The case of infinite H . . . . . . . . . . . . . . . . . . 131

9 Regularization and model selection 135


9.1 Regularization . . . . . . . . . . . . . . . . . . . . . . . . . . . 135
9.2 Implicit regularization effect (optional reading) . . . . . . . . . 137
9.3 Model selection via cross validation . . . . . . . . . . . . . . . 139
9.4 Bayesian statistics and regularization . . . . . . . . . . . . . . 142

IV Unsupervised learning 144


10 Clustering and the k-means algorithm 145

11 EM algorithms 148
11.1 EM for mixture of Gaussians . . . . . . . . . . . . . . . . . . . 148
11.2 Jensen’s inequality . . . . . . . . . . . . . . . . . . . . . . . . 151
11.3 General EM algorithms . . . . . . . . . . . . . . . . . . . . . . 152
11.3.1 Other interpretation of ELBO . . . . . . . . . . . . . . 158
11.4 Mixture of Gaussians revisited . . . . . . . . . . . . . . . . . . 158
11.5 Variational inference and variational auto-encoder (optional reading) . . 160

12 Principal components analysis 165

13 Independent components analysis 171


13.1 ICA ambiguities . . . . . . . . . . . . . . . . . . . . . . . . . . 172
13.2 Densities and linear transformations . . . . . . . . . . . . . . . 173
13.3 ICA algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . 174

14 Self-supervised learning and foundation models 177


14.1 Pretraining and adaptation . . . . . . . . . . . . . . . . . . . . 177
14.2 Pretraining methods in computer vision . . . . . . . . . . . . . 179
14.3 Pretrained large language models . . . . . . . . . . . . . . . . 181
14.3.1 Open up the blackbox of Transformers . . . . . . . . . 183
14.3.2 Zero-shot learning and in-context learning . . . . . . . 186

V Reinforcement Learning and Control 188


15 Reinforcement learning 189
15.1 Markov decision processes . . . . . . . . . . . . . . . . . . . . 190
15.2 Value iteration and policy iteration . . . . . . . . . . . . . . . 192
15.3 Learning a model for an MDP . . . . . . . . . . . . . . . . . . 194
15.4 Continuous state MDPs . . . . . . . . . . . . . . . . . . . . . 196
15.4.1 Discretization . . . . . . . . . . . . . . . . . . . . . . . 196
15.4.2 Value function approximation . . . . . . . . . . . . . . 199
15.5 Connections between Policy and Value Iteration (Optional) . . 203

16 LQR, DDP and LQG 206


16.1 Finite-horizon MDPs . . . . . . . . . . . . . . . . . . . . . . . 206
16.2 Linear Quadratic Regulation (LQR) . . . . . . . . . . . . . . . 210
16.3 From non-linear dynamics to LQR . . . . . . . . . . . . . . . 213
16.3.1 Linearization of dynamics . . . . . . . . . . . . . . . . 214
16.3.2 Differential Dynamic Programming (DDP) . . . . . . . 214
16.4 Linear Quadratic Gaussian (LQG) . . . . . . . . . . . . . . . . 216

17 Policy Gradient (REINFORCE) 220
