Lecture 14: Introduction to PyTorch
[Figure: ML workflow with data engineering & preprocessing, training, model output, and accuracy evaluation]
[Figure: a single neuron. The input x passes through a linear function and then a nonlinear activation to produce the output y]
$y' = \sum_{i=1}^{d} w_i x_i + b, \qquad y = f(y')$
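As a minimal sketch, this single-neuron computation can be written directly in PyTorch; the dimension d, the input and weight values, and the choice of ReLU as the nonlinearity f are assumptions for illustration:

```python
import torch

# A single neuron: y' = sum_i w_i * x_i + b, then y = f(y').
# All values and the ReLU nonlinearity are illustrative assumptions.
d = 4
x = torch.randn(d)               # input vector x
w = torch.randn(d)               # weights w_1, ..., w_d
b = torch.tensor(0.5)            # bias b

y_prime = torch.inner(w, x) + b  # linear part
y = torch.relu(y_prime)          # nonlinear activation f
print(y)
```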
What is Deep Learning?
[Figure: a neural network with a single hidden layer of 3 neurons. The input x feeds the hidden neurons, whose outputs feed the output layer producing y]
What is Deep Learning?
[Figure: a neural network with 3 hidden layers whose widths are (3, 4, 2). The input x feeds the first hidden layer; the output layer produces y]
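As a sketch, a network with these hidden widths could be defined in PyTorch as follows; the input and output dimensions (set to 1 here) and the ReLU activation are assumptions, since the slide only fixes the hidden widths (3, 4, 2):

```python
import torch.nn as nn

# Hidden layers of widths 3, 4, and 2; input/output dims and ReLU are assumed.
model = nn.Sequential(
    nn.Linear(1, 3), nn.ReLU(),  # hidden layer 1 (width 3)
    nn.Linear(3, 4), nn.ReLU(),  # hidden layer 2 (width 4)
    nn.Linear(4, 2), nn.ReLU(),  # hidden layer 3 (width 2)
    nn.Linear(2, 1),             # output layer producing y
)
```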
What is Deep Learning?
Neural networks are a type of ML model:
• They use a cascade of multiple layers of nonlinear processing units (neurons); each successive layer uses the output of the previous layer as its input.
• They learn multiple levels of representation that correspond to different levels of abstraction; the levels form a hierarchy of concepts.
• They have a long history, perhaps dating back to 1943, but saw limited success until the 2000s.
$e_i$: error of fitting data point $i$
$$loss(w, b) = \frac{1}{N} \sum_{i} \big(y_i - (w x_i + b)\big)^2$$
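As a sketch, this loss can be evaluated directly in PyTorch; the data and parameter values below are made up for illustration:

```python
import torch

# Toy data (made up): roughly y = 2x + 1
x = torch.tensor([0.0, 1.0, 2.0, 3.0])
y = torch.tensor([1.0, 3.0, 5.0, 7.0])

w, b = torch.tensor(1.5), torch.tensor(0.0)

errors = y - (w * x + b)       # e_i for each data point i
loss = (errors ** 2).mean()    # (1/N) * sum_i e_i^2
print(loss)
```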
How does training work?
The training/fitting process finds the w,b with the smallest loss, but how?
Optimization: Gradients
Given a function $loss(\theta)$ that depends on a 2-dimensional parameter $\theta = [w, b]$, its gradient $\nabla loss(\theta)$ is the direction from $\theta$ that leads to the largest increase in $loss(\theta)$.
Optimization: Gradients
[Figure: contour plot of $loss(\theta)$ (example), with axes w and b. Think of it as a terrain: each contour connects parameter values with the same loss. The lowest loss is achieved at the origin; the farther away from the origin, the larger the loss]
Optimization: Gradients
To find a parameter with a lower loss, we should move from the current parameter $\theta$ along the negative gradient direction:
$$\theta \leftarrow \theta - \eta \, \nabla loss(\theta)$$
[Figure: the same contour plot of $loss(\theta)$, showing the current parameter $\theta$ and the negative gradient direction at the current iteration]
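A minimal sketch of this update rule using autograd; the toy data, the zero initialization, the learning rate $\eta$, and the number of iterations are assumptions:

```python
import torch

# Toy data (made up): exactly y = 2x + 1
x = torch.tensor([0.0, 1.0, 2.0, 3.0])
y = torch.tensor([1.0, 3.0, 5.0, 7.0])

theta = torch.zeros(2, requires_grad=True)   # theta = [w, b]
eta = 0.1                                    # learning rate (assumed)

for step in range(500):
    w, b = theta[0], theta[1]
    loss = ((y - (w * x + b)) ** 2).mean()   # forward pass: loss(theta)
    loss.backward()                          # compute grad loss(theta)
    with torch.no_grad():
        theta -= eta * theta.grad            # theta <- theta - eta * grad
    theta.grad.zero_()                       # reset gradient for the next step

print(theta)   # converges to approximately w = 2, b = 1 on this toy data
```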
$e_i$: error of fitting data point $i$
$$loss(w, b) = \frac{1}{N} \sum_{i} \big(y_i - (w x_i + b)\big)^2$$
Side note:
• In PyTorch, the tensor is the most basic building block; nn.Parameter is a special kind of Tensor used to represent model parameters.
• In our code, both parameters are initialized with torch.zeros, i.e. as all-zero tensors.
• Our __init__ function takes d as input, which is the input dimension (we will set d = 1).
PyTorch: Linear Regression Model
In the forward function, we define how the output is computed from the input. torch.inner computes an inner product, so this line of code simply computes $w_1 x_1 + \cdots + w_d x_d + b$.
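The original code listing is not reproduced here, so the following is a hedged reconstruction of the model class described in the side note and the text above (the class name is an assumption):

```python
import torch
import torch.nn as nn

class LinearRegression(nn.Module):
    def __init__(self, d):
        super().__init__()
        # d is the input dimension (we will set d = 1).
        # Both parameters are nn.Parameter tensors initialized with torch.zeros.
        self.w = nn.Parameter(torch.zeros(d))
        self.b = nn.Parameter(torch.zeros(1))

    def forward(self, x):
        # torch.inner computes the inner product w_1*x_1 + ... + w_d*x_d;
        # adding b gives the model output.
        return torch.inner(self.w, x) + self.b
```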
Gradient in PyTorch
The core of PyTorch (and TensorFlow) is their automatic differentiation (autograd)
1. Define a linear regression model
2. Generate some training data
3. Calculate gradient and conduct gradient descent
Gradient in PyTorch
Recall: we want to calculate the gradient of this loss function
$$loss(w, b) = \frac{1}{N} \sum_{i} \big(y_i - (w x_i + b)\big)^2$$
Steps in PyTorch:
• Step 1: Forward pass: calculate the loss function value.
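A sketch of this step, reusing the LinearRegression class sketched earlier; the training data is made up for illustration (y ≈ 2x + 1 plus noise), since the original data-generation code is not shown here:

```python
import torch

# Generate some toy training data (made up): y = 2x + 1 plus noise, with d = 1
x = torch.rand(100, 1)
y = 2 * x.squeeze() + 1 + 0.1 * torch.randn(100)

# LinearRegression is the model class sketched above
model = LinearRegression(d=1)

# Step 1: forward pass, calculate the loss function value
pred = model(x)                     # w*x_i + b for each data point
loss = ((y - pred) ** 2).mean()
print(loss)
```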
Gradient in PyTorch
Recall: we want to calculate the gradient of this loss function
$$loss(w, b) = \frac{1}{N} \sum_{i} \big(y_i - (w x_i + b)\big)^2$$
Steps in PyTorch:
• Step 2: Backward pass.
Before calling backward, let's first check the current gradient values.
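Continuing the sketch above:

```python
# Before calling backward, no gradients have been computed yet,
# so the .grad attribute of each parameter is still None.
print(model.w.grad, model.b.grad)   # None None
```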
Gradient in PyTorch
Recall: we want to calculate the gradient of this loss function
$$loss(w, b) = \frac{1}{N} \sum_{i} \big(y_i - (w x_i + b)\big)^2$$
Steps in PyTorch:
• Step 2: Backward pass.
Let's now do the backward pass and check the gradients again.
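Continuing the sketch:

```python
# Step 2: backward pass. autograd traverses the computation graph of `loss`
# and fills in the gradient of the loss w.r.t. each parameter.
loss.backward()
print(model.w.grad, model.b.grad)   # now both .grad attributes hold tensors
```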
Up next: gradient descent, i.e. iteratively compute the gradient and update the parameters!
Tell the optimizer which parameters to optimize!
How do we choose the learning rate and maxIter? Let's now visualize the gradient descent process!
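A hedged sketch of the full loop, reusing the model and data from the sketches above; torch.optim.SGD is one standard way to apply the update $\theta \leftarrow \theta - \eta \, \nabla loss(\theta)$, the learning rate 0.001 is the first trial from the slides below, and the maxIter value is an assumption:

```python
import torch

# Fresh model; x and y are the toy data from the earlier sketch.
model = LinearRegression(d=1)

# Tell the optimizer which parameters to optimize, and set the learning rate.
optimizer = torch.optim.SGD(model.parameters(), lr=0.001)
maxIter = 1000   # number of gradient descent iterations (an assumed value)

for it in range(maxIter):
    optimizer.zero_grad()              # clear gradients from the previous step
    pred = model(x)                    # forward pass
    loss = ((y - pred) ** 2).mean()    # loss function value
    loss.backward()                    # backward pass: compute gradients
    optimizer.step()                   # gradient descent update

print(model.w, model.b)
```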
Visualizing Gradient Descent
Learning rate = 0.001
Visualizing Gradient Descent
Learning rate = 0.0005 (smaller than our first trial)
Visualizing Gradient Descent
Learning rate = 0.005 (larger than our first trial)
Visualizing Gradient Descent
Learning rate = 0.025 (much larger than our first trial)
Visualizing Gradient Descent
Learning rate = 0.028 (much larger than our first trial)
Visualizing Gradient Descent
Learning rate = 0.03 (much larger than our first trial)
Lessons Learned on Learning Rate
• Learning rate too small:
  • Converges too slowly and takes many iterations
• Learning rate too large:
  • Exhibits unstable (oscillating) behavior and may diverge