Lecture 2 - GD Linear Regression

The document discusses activation functions used in deep learning neural networks. It defines eight activation functions (sigmoid, tanh, ReLU, Leaky ReLU, Parametric ReLU, Exponential Linear Unit (ELU), SoftPlus, and softmax) and gives the mathematical equation for each. It also poses questions about why activation functions are needed and whether they should be defined for a layer or for a neuron, and it introduces gradient descent and linear regression.


Deep Learning

Vazgen Mikayelyan

October 20, 2020



Activation functions

1. Sigmoid: σ(x) = 1 / (1 + e^(−x))
2. Tanh: tanh(x) = (e^x − e^(−x)) / (e^x + e^(−x))
3. Rectified linear unit: ReLU(x) = max(0, x)
4. Leaky ReLU: LR(x) = 0.01x for x < 0, and x for x ≥ 0
5. Parametric ReLU: PR(x) = ax for x < 0, and x for x ≥ 0
6. Exponential linear unit: ELU(x) = a(e^x − 1) for x < 0, and x for x ≥ 0
7. SoftPlus: SP(x) = log(1 + e^x)
8. Softmax: S(x₁, x₂, …, xₙ) = (e^(x₁) / Σᵢ e^(xᵢ), e^(x₂) / Σᵢ e^(xᵢ), …, e^(xₙ) / Σᵢ e^(xᵢ)), where each sum runs over i = 1, …, n
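The definitions above translate directly into code. Below is a minimal NumPy sketch of the eight functions; the function names, the default values of a, and the use of NumPy are my choices rather than anything specified in the slides.

```python
import numpy as np

def sigmoid(x):
    # σ(x) = 1 / (1 + e^(−x))
    return 1.0 / (1.0 + np.exp(-x))

def tanh(x):
    # tanh(x) = (e^x − e^(−x)) / (e^x + e^(−x))
    return np.tanh(x)

def relu(x):
    # ReLU(x) = max(0, x), applied elementwise
    return np.maximum(0.0, x)

def leaky_relu(x):
    # 0.01x for x < 0, x for x ≥ 0
    return np.where(x < 0, 0.01 * x, x)

def parametric_relu(x, a=0.1):
    # ax for x < 0, x for x ≥ 0; in practice a is a learned parameter
    return np.where(x < 0, a * x, x)

def elu(x, a=1.0):
    # a(e^x − 1) for x < 0, x for x ≥ 0
    return np.where(x < 0, a * (np.exp(x) - 1.0), x)

def softplus(x):
    # SP(x) = log(1 + e^x)
    return np.log1p(np.exp(x))

def softmax(x):
    # S(x)_j = e^(x_j) / Σ_i e^(x_i); shifting by max(x) leaves the result
    # unchanged but avoids overflow for large inputs
    z = np.exp(x - np.max(x))
    return z / np.sum(z)
```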


Questions

1 Why do we need activation functions?
2 How should we define activation functions, for a layer or for a neuron?


Outline

1 Gradient Descent

2 Linear and Logistic Regressions



Gradient Descent

Let f : R^k → R be a convex function whose global minimum we want to find. Gradient descent is based on the fact that the direction of fastest decrease of f is the direction opposite to its gradient, which gives the iteration

xₙ₊₁ = xₙ − α∇f(xₙ),

where α > 0 is the step size and x₀ ∈ R^k is an arbitrary starting point.
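As a concrete illustration of this update rule, here is a short sketch that runs the iteration on a simple convex function; the example f(x) = ‖x‖², the step size α = 0.1, and the fixed number of steps are assumptions of mine, not values from the lecture.

```python
import numpy as np

def gradient_descent(grad_f, x0, alpha=0.1, n_steps=100):
    """Iterate x_{n+1} = x_n − α ∇f(x_n) starting from x0."""
    x = np.asarray(x0, dtype=float)
    for _ in range(n_steps):
        x = x - alpha * grad_f(x)
    return x

# Example: f(x) = ||x||^2 is convex with gradient ∇f(x) = 2x,
# so the iterates should approach the global minimum at 0.
grad_f = lambda x: 2.0 * x
x_min = gradient_descent(grad_f, x0=[3.0, -2.0], alpha=0.1, n_steps=100)
print(x_min)  # close to [0, 0]
```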




Linear Regression

Let (xᵢ, yᵢ), i = 1, …, n, with xᵢ ∈ R^k and yᵢ ∈ R, be our training data. Consider the function

f(x) = f(x¹, x², …, xᵏ) = w¹x¹ + w²x² + … + wᵏxᵏ + b = wᵀx + b.

Our aim is to find parameters b, w¹, w², …, wᵏ such that

f(xᵢ) ≈ yᵢ, i = 1, …, n.

We choose the squared L2 distance (mean squared error) as our loss function:

(1/n) Σₗ₌₁ⁿ (f(xₗ) − yₗ)².
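Putting the two pieces together, the sketch below fits f(x) = wᵀx + b by gradient descent on the mean squared error above; the synthetic data, the learning rate, and the iteration count are illustrative assumptions, not values from the lecture.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic training data (x_i, y_i), x_i in R^k, y_i in R (assumed for illustration)
n, k = 200, 3
X = rng.normal(size=(n, k))
true_w, true_b = np.array([2.0, -1.0, 0.5]), 0.7
y = X @ true_w + true_b + 0.1 * rng.normal(size=n)

# Parameters of the model f(x) = w^T x + b
w = np.zeros(k)
b = 0.0
alpha = 0.1  # step size

for _ in range(500):
    residual = X @ w + b - y             # f(x_l) − y_l for all l
    grad_w = 2.0 / n * X.T @ residual    # gradient of the loss w.r.t. w
    grad_b = 2.0 / n * np.sum(residual)  # gradient of the loss w.r.t. b
    w -= alpha * grad_w
    b -= alpha * grad_b

print(w, b)  # should be close to true_w and true_b
```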


Questions

1 Should we minimize the loss function using gradient descent?
2 Can you represent this model as a neural network?
