
Lecture 3: Basic Neural Networks: multi-layer neural networks

Xuming He
SIST, ShanghaiTech
Fall, 2020


Announcement
 Tutorial and TA office hour
 Location: Teaching Center (教学中心), Room 101
 Tutorial: Tuesdays of weeks 2, 4, 8, and 11, 20:00–20:30
 Office hour: Tuesdays of weeks 3, 6, 9, and 12, 20:00–21:00

 Quiz 1 results are out
 Check with the TAs if you have any questions

 A1 is out
 Beijing Time (CST): 2020/09/30 23:59:59

 Reference reading is listed at the end of the lecture slides.


Outline
 Multi-layer neural networks
 Limitations of single layer networks
 Networks with single hidden layer
 Sequential network architecture and variants
 Inference and learning
 Forward and Backpropagation
 Examples: one-layer network
 General BP algorithm

Acknowledgement: Hugo Larochelle’s, Mehryar Mohri@NYU’s & Yingyu Liang@Princeton’s course notes
Capacity of single neuron
 Binary classification
 A neuron estimates
 Its decision boundary is linear, determined by its weights

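As a rough sketch (mine, not from the slides), a single neuron computes a linear score of the input followed by a squashing nonlinearity, so its decision boundary is the hyperplane where that score is zero:

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def neuron(x, w, b):
    # Linear score, then a sigmoid: an estimate of p(y = 1 | x) in (0, 1).
    return sigmoid(np.dot(w, x) + b)

# The decision boundary {x : w.x + b = 0} is a hyperplane, so a single
# neuron can only separate the two classes with a linear boundary.
print(neuron(np.array([1.0, 2.0]), np.array([0.5, -0.25]), 0.1))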


Capacity of single neuron
 Can solve linearly separable problems

 Examples



Capacity of single neuron
 Can’t solve non-linearly separable problems

 Can we use multiple neurons to achieve this?


Capacity of single neuron
 Can’t solve non-linearly separable problems
 Unless the input is transformed into a better representation
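
A small illustration (my own, not from the slides): XOR is not linearly separable in the raw inputs, but after a simple hand-crafted transformation of the representation a single linear neuron can separate it.

import numpy as np

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
t = [0, 1, 1, 0]                      # XOR targets: not linearly separable in 2-D

def transform(x):
    # A "better representation": append the product feature x1*x2.
    return np.array([x[0], x[1], x[0] * x[1]])

# In the transformed space a single linear rule suffices,
# e.g. predict 1 whenever x1 + x2 - 2*x1*x2 - 0.5 > 0.
w, b = np.array([1.0, 1.0, -2.0]), -0.5
preds = [int(w @ transform(x) + b > 0) for x in X]
print(preds, preds == t)              # [0, 1, 1, 0] True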


Adding one more layer
 Single hidden layer neural network
 A 2-layer neural network (not counting the input units)

 Q: What if we use a linear activation in the hidden layer?
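
A minimal sketch (assumed variable names, not from the slides) of the forward computation in a single-hidden-layer network:

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def two_layer_forward(x, W1, b1, W2, b2):
    h = sigmoid(W1 @ x + b1)   # hidden layer: linear map, then nonlinearity
    y = sigmoid(W2 @ h + b2)   # output layer
    return y

On the question above: if the hidden activation were linear, W2 (W1 x + b1) + b2 would still be an affine function of x, so the two-layer network would collapse to the capacity of a single layer.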


Capacity of neural network
 Single hidden layer neural network
 Partition the input space into regions



Capacity of neural network
 Single hidden layer neural network
 Form a stump/delta function

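A small sketch (my own) of the stump idea: subtracting two shifted, steep sigmoid hidden units produces an approximate indicator over an interval, and sums of such stumps can approximate a function region by region.

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def bump(x, a, b, sharpness=50.0):
    # Difference of two steep sigmoids: roughly 1 for a < x < b, roughly 0 elsewhere.
    return sigmoid(sharpness * (x - a)) - sigmoid(sharpness * (x - b))

xs = np.linspace(-2, 2, 9)
print(np.round(bump(xs, -0.5, 0.5), 2))   # high inside (-0.5, 0.5), near 0 outside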


Capacity of neural network
 Single hidden layer neural network



Multi-layer perceptron
 Boolean case
 Multilayer perceptrons (MLPs) can compute more complex Boolean functions
 MLPs can compute any Boolean function
 Since they can emulate individual gates
 MLPs are universal Boolean functions
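
To make the gate-emulation argument concrete, here is a small sketch (mine): threshold units implement OR, AND and NOT, so stacking them computes any Boolean function; XOR, for instance, is (x1 OR x2) AND NOT (x1 AND x2).

def step(z):
    return 1 if z >= 0 else 0

# Threshold neurons emulating the basic gates.
def OR(a, b):   return step(a + b - 0.5)
def AND(a, b):  return step(a + b - 1.5)
def NOT(a):     return step(0.5 - a)

def XOR(a, b):  # a two-layer circuit of threshold neurons
    return AND(OR(a, b), NOT(AND(a, b)))

print([XOR(a, b) for a, b in [(0, 0), (0, 1), (1, 0), (1, 1)]])   # [0, 1, 1, 0]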


Capacity of neural network
 Universal approximation
 Theorem (Hornik, 1991): A single hidden layer neural network with a linear output unit can approximate any continuous function arbitrarily well, given enough hidden units.
 The result applies for sigmoid, tanh and many other hidden layer activation functions

 Caveat: good result but not useful in practice
 How many hidden units?
 How to find the parameters by a learning algorithm?


General neural network
 Multi-layer neural network



Multilayer networks

Why more layers (deeper)?
 A deep architecture can represent certain functions more compactly
 (Montufar et al., NIPS’14): Functions representable with a deep rectifier net can require an exponential number of hidden units with a shallow one.


Why more layers (deeper)?
 A deep architecture can represent certain functions more compactly
 Example: Boolean functions
 There are Boolean functions which require an exponential number of hidden units in the single layer case
 but require only a polynomial number of hidden units if we can adapt the number of layers

 Example: multivariate polynomials (Rolnick & Tegmark, ICLR’18)
 The total number of neurons m required to approximate natural classes of multivariate polynomials of n variables grows only linearly with n for deep neural networks, but grows exponentially when merely a single hidden layer is allowed.


Why more layers (deeper)?

 https://youtu.be/aircAruvnKk?list=PLZHQObOWTQDNU6R1_67000Dx_ZCJB-3pi
Other network connectivity

Outline
 Multi-layer neural networks
 Limitations of single layer networks
 Neural networks with single hidden layer
 Sequential network architecture and variants
 Inference and learning
 Forward and Backpropagation
 Examples: one-layer network
 General BP algorithm


Computation in neural network
 We only need to know two algorithms
 Inference/prediction: simply a forward pass
 Parameter learning: needs a backward pass
 Basic fact:
 A neural network is a composition of simpler operations
 All the f functions are linear maps followed by (simple) nonlinear operators, differentiable almost everywhere


Inference example: Forward Pass
 What does the network compute?



Forward Pass in Python

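A minimal sketch (assumed names, not the original listing) of what a layer-by-layer forward pass looks like in Python:

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(x, weights, biases):
    # weights and biases are lists with one (W, b) pair per layer.
    activation = x
    for W, b in zip(weights, biases):
        activation = sigmoid(W @ activation + b)   # linear map, then nonlinearity
    return activation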


Parameter learning: Backward Pass
 Supervised learning framework



Backward pass
 Backpropagation
 An efficient method for computing gradients in NNs
 A neural network as a function of composed operations

Backward pass

 https://www.youtube.com/watch?v=Ilg3gGewQ5U


Gradient descent iteration
 Forward pass
 Backward pass
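
A sketch (my own, for a tiny linear model with squared error) of what one gradient descent iteration does: a forward pass to compute the loss, a backward pass to compute the gradients, then a parameter update.

import numpy as np

x = np.array([0.0, 1.0, 2.0, 3.0])
t = np.array([1.0, 3.0, 5.0, 7.0])    # targets generated by y = 2x + 1
w, b, lr = 0.0, 0.0, 0.1

for it in range(200):
    # Forward pass: predictions and loss
    y = w * x + b
    loss = 0.5 * np.mean((y - t) ** 2)
    # Backward pass: gradients of the loss w.r.t. w and b
    dL_dy = (y - t) / len(x)
    dL_dw = np.sum(dL_dy * x)
    dL_db = np.sum(dL_dy)
    # Update
    w -= lr * dL_dw
    b -= lr * dL_db

print(round(w, 2), round(b, 2))       # approaches 2.0 and 1.0
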
Example: Single Layer Network
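
A hedged reconstruction (my notation) of the standard single-layer example: logistic outputs with a squared-error loss, with the gradient of each weight obtained by the chain rule.

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def single_layer_forward_backward(x, t, W, b):
    # Forward pass: z = W x + b, y = sigmoid(z), L = 1/2 ||y - t||^2
    z = W @ x + b
    y = sigmoid(z)
    loss = 0.5 * np.sum((y - t) ** 2)

    # Backward pass, from the output back to the parameters
    dL_dy = y - t                      # dL/dy_k
    dL_dz = dL_dy * y * (1 - y)        # times sigmoid'(z_k) = y_k (1 - y_k)
    dL_dW = np.outer(dL_dz, x)         # dL/dW_kj = dL/dz_k * x_j
    dL_db = dL_dz
    return loss, dL_dW, dL_db

rng = np.random.default_rng(0)
x, t = rng.normal(size=3), np.array([0.0, 1.0])
W, b = rng.normal(size=(2, 3)), np.zeros(2)
print(single_layer_forward_backward(x, t, W, b)[0])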


Outline
 Multi-layer neural networks
 Limitations of single layer networks
 Neural networks with single hidden layer
 Sequential network architecture and variants
 Inference and learning
 Forward and Backpropagation
 Examples: one-layer network
 General BP algorithm


An implementation perspective
 Example: univariate logistic least squares model


Univariate chain rule
 A structured way to implement it
 The goal is to write a program that efficiently computes the
derivatives

Computing the loss: Computing the derivatives:

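A sketch (my own notation and example values) of the univariate logistic least squares model written as straight-line code, first computing the loss and then the derivatives in reverse order:

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

w, b = 0.5, -0.3          # parameters (arbitrary example values)
x, t = 1.2, 1.0           # one input and its target

# Computing the loss
z = w * x + b
y = sigmoid(z)
L = 0.5 * (y - t) ** 2

# Computing the derivatives (the error signals, in reverse order)
y_bar = y - t                   # dL/dy
z_bar = y_bar * y * (1 - y)     # dL/dz = dL/dy * sigmoid'(z)
w_bar = z_bar * x               # dL/dw
b_bar = z_bar                   # dL/db
print(L, w_bar, b_bar)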


Computation graph
 Represent the computations using a computation graph
 Nodes: inputs & computed quantities
 Edges: indicate which nodes are computed directly as a function of which other nodes
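
Since the lecture points to PyTorch later, one way to see a computation graph in code is to let autograd record it; a small sketch (mine) for the same univariate model:

import torch

w = torch.tensor(0.5, requires_grad=True)
b = torch.tensor(-0.3, requires_grad=True)
x, t = torch.tensor(1.2), torch.tensor(1.0)

# Each operation adds a node to the computation graph.
z = w * x + b
y = torch.sigmoid(z)
L = 0.5 * (y - t) ** 2

L.backward()                 # backpropagate through the recorded graph
print(w.grad, b.grad)        # dL/dw and dL/db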


Univariate chain rule
 A shorthand notation
 Use the error signal notation ȳ ≡ dL/dy for the derivative of the loss with respect to a quantity y
 Note that the error signals are values computed by the program

Computing the loss: Computing the derivatives:


Multivariate chain rule
 The computation graph has fan-out > 1



Multivariable chain rule
 Recall the distributed chain rule

 The shorthand notation:

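A small sketch (mine) of the fan-out case: when a quantity feeds into several downstream computations, its error signal is the sum of the contributions along each path.

# t feeds into two paths, x = t + 1 and y = 2*t, and then L = x * y.
# Multivariable chain rule: dL/dt = dL/dx * dx/dt + dL/dy * dy/dt.
t = 3.0
x, y = t + 1.0, 2.0 * t
L = x * y

dL_dx, dL_dy = y, x                       # local derivatives of L = x * y
dx_dt, dy_dt = 1.0, 2.0
dL_dt = dL_dx * dx_dt + dL_dy * dy_dt     # the two path contributions are summed
print(dL_dt)   # 6 + 8 = 14; directly, L(t) = 2t^2 + 2t, so dL/dt = 4t + 2 = 14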


General Backpropagation
 Given a computation graph



General Backpropagation
 Example: univariate logistic least square regression



General Backpropagation
 Backprop as message passing:
 Each node receives a set of messages from its children, which are aggregated into its error signal; it then passes messages to its parents
 Modularity: each node only has to know how to compute derivatives w.r.t. its arguments – local computation in the graph


Patterns in backward flow
 Multiplicative node



Patterns in backward flow
 Max node

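A sketch (mine) of the two patterns: a multiplicative node passes each input's gradient scaled by the other input, while a max node routes the gradient only to the input that attained the maximum.

def mul_backward(a, b, grad_out):
    # d(a*b)/da = b and d(a*b)/db = a: the upstream gradient is swapped and scaled.
    return grad_out * b, grad_out * a

def max_backward(a, b, grad_out):
    # The gradient flows only through the branch that achieved the max.
    return (grad_out, 0.0) if a >= b else (0.0, grad_out)

print(mul_backward(3.0, -2.0, 1.0))   # (-2.0, 3.0)
print(max_backward(3.0, -2.0, 1.0))   # (1.0, 0.0)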


Computation cost
 Forward pass: one add-multiply operation per weight
 Backward pass: two add-multiply operations per weight

 For a multilayer network, the cost is linear in the number of layers and quadratic in the number of units per layer


Backpropagation
 Backprop is used to train the majority of neural nets
 Even generative network learning and advanced (second-order) optimization algorithms use backprop to compute the weight updates
 However, backprop seems biologically implausible
 No evidence for biological signals analogous to error derivatives
 All the existing biologically plausible alternatives learn much more slowly on computers
 So how on earth does the brain learn?


Coding examples
 Getting familiar with PyTorch
 Python tutorial: https://cs231n.github.io/python-numpy-tutorial/
 PyTorch in 60 mins: https://pytorch.org/tutorials/beginner/deep_learning_60min_blitz.html

 Predicting house prices
 https://d2l.ai/chapter_multilayer-perceptrons/kaggle-house-price.html


Summary
 Multi-layer neural networks
 Inference and learning
 Forward and Backpropagation

 Next time …
 CNN

 Reference:
 d2l.ai: 4.1-4.3, 4.7
 DLBook: Chapter 6

