
Deep Learning

Lecture 3 – Deep Neural Networks

Prof. Dr.-Ing. Andreas Geiger


Autonomous Vision Group
University of Tübingen / MPI-IS
Agenda

3.1 Backpropagation with Tensors

3.2 The XOR Problem

3.3 Multi-Layer Perceptrons

3.4 Universal Approximation

3.1
Backpropagation with Tensors
Recap: Backpropagation with Scalars

Forward Pass:
(1) y = y(x)
(2) u = u(y)
(2) v = v(y)
(3) L = L(u, v)

Loss: L( u(y(x)), v(y(x)) )

Backward Pass:
(3) ∂L/∂u = (∂L/∂L) · (∂L/∂u) = ∂L/∂u
(3) ∂L/∂v = (∂L/∂L) · (∂L/∂v) = ∂L/∂v
(2) ∂L/∂y = (∂L/∂u) · (∂u/∂y) + (∂L/∂v) · (∂v/∂y)
(1) ∂L/∂x = (∂L/∂y) · (∂y/∂x)
Implementation: Each variable/node is an object with attributes x.value and x.grad. Values are computed forward, gradients backward:

Forward:
x.value = Input
y.value = y(x.value)
u.value = u(y.value)
v.value = v(y.value)
L.value = L(u.value, v.value)

Backward:
x.grad = y.grad = u.grad = v.grad = 0
L.grad = 1
u.grad += L.grad ∗ (∂L/∂u)(u.value, v.value)
v.grad += L.grad ∗ (∂L/∂v)(u.value, v.value)
y.grad += u.grad ∗ (∂u/∂y)(y.value)
y.grad += v.grad ∗ (∂v/∂y)(y.value)
x.grad += y.grad ∗ (∂y/∂x)(x.value)
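To make the recipe concrete, here is a minimal sketch for one particular choice of functions (the choice of y, u, v, L below is only for illustration and not from the lecture):

import numpy as np

# Toy graph: y = x^2, u = sin(y), v = cos(y), L = u * v
x_value = 0.7
y_value = x_value ** 2
u_value = np.sin(y_value)
v_value = np.cos(y_value)
L_value = u_value * v_value

# Backward pass: initialize gradients, then accumulate along each edge
x_grad = y_grad = u_grad = v_grad = 0.0
L_grad = 1.0
u_grad += L_grad * v_value              # dL/du = v
v_grad += L_grad * u_value              # dL/dv = u
y_grad += u_grad * np.cos(y_value)      # du/dy = cos(y)
y_grad += v_grad * (-np.sin(y_value))   # dv/dy = -sin(y)
x_grad += y_grad * 2 * x_value          # dy/dx = 2x

# Check against the closed form: L = 0.5 sin(2 x^2), so dL/dx = 2x cos(2 x^2)
assert np.isclose(x_grad, 2 * x_value * np.cos(2 * x_value ** 2))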
Scalar vs. Matrix Operations

So far we have considered computations on scalars:

y = σ(w1 x + w0 )

We now consider computations on vectors and matrices:

y = σ(Ax + b)

- Matrix A and vector b are objects with attributes value and grad
- A.grad stores ∇A L and b.grad stores ∇b L
- A.grad has the same shape/dimensions as A.value (since L is a scalar)
Backpropagation on Loops

The matrix/vector computation

y = σ(Ax + b),   with u = Ax

can be written as loops over scalar operations:

for i      u.value[i] = 0
for i,j    u.value[i] += A.value[i, j] ∗ x.value[j]
for i      y.value[i] = σ(u.value[i] + b.value[i])
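Written out as runnable code, the forward loops look as follows; a minimal NumPy sketch (shapes and random values are only for illustration):

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def affine_sigmoid_forward(A, b, x):
    """Compute y = sigma(Ax + b) with explicit scalar loops, mirroring the slides."""
    M, D = A.shape
    u = np.zeros(M)                      # for i: u[i] = 0
    for i in range(M):
        for j in range(D):               # for i,j: u[i] += A[i,j] * x[j]
            u[i] += A[i, j] * x[j]
    y = np.empty(M)
    for i in range(M):                   # for i: y[i] = sigma(u[i] + b[i])
        y[i] = sigmoid(u[i] + b[i])
    return u, y

# The loops agree with the vectorized form:
rng = np.random.default_rng(0)
A, b, x = rng.normal(size=(3, 2)), rng.normal(size=3), rng.normal(size=2)
u, y = affine_sigmoid_forward(A, b, x)
assert np.allclose(y, sigmoid(A @ x + b))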
Backpropagation on Loops

The backpropagated gradients for

for i      y.value[i] = σ(u.value[i] + b.value[i])

are:

for i      u.grad[i] += y.grad[i] ∗ σ′(u.value[i] + b.value[i])
for i      b.grad[i] += y.grad[i] ∗ σ′(u.value[i] + b.value[i])

- Red: back-propagated gradients   - Blue: local gradients
Backpropagation on Loops

The backpropagated gradients for

for i,j    u.value[i] += A.value[i, j] ∗ x.value[j]

are:

for i,j    A.grad[i, j] += u.grad[i] ∗ x.value[j]
for i,j    x.grad[j] += u.grad[i] ∗ A.value[i, j]

- Red: back-propagated gradients   - Blue: local gradients
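These backward rules can be checked numerically; a small sketch comparing them against a finite-difference approximation (the helper functions and the treatment of the upstream gradient y.grad are illustrative assumptions, not lecture code):

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(A, b, x):
    return sigmoid(A @ x + b)

def backward(A, b, x, y_grad):
    """Backward loops from the slides, written with NumPy broadcasting."""
    u = A @ x
    s = sigmoid(u + b)
    u_grad = y_grad * s * (1.0 - s)      # y.grad * sigma'(u + b)
    b_grad = u_grad
    A_grad = np.outer(u_grad, x)         # A.grad[i,j] += u.grad[i] * x[j]
    x_grad = A.T @ u_grad                # x.grad[j]  += u.grad[i] * A[i,j]
    return A_grad, b_grad, x_grad

rng = np.random.default_rng(1)
A, b, x = rng.normal(size=(3, 2)), rng.normal(size=3), rng.normal(size=2)
y_grad = rng.normal(size=3)              # pretend upstream gradient dL/dy
A_grad, b_grad, x_grad = backward(A, b, x, y_grad)

# Finite-difference check of one entry of A.grad
eps = 1e-6
E = np.zeros_like(A); E[0, 1] = eps
num = y_grad @ (forward(A + E, b, x) - forward(A - E, b, x)) / (2 * eps)
assert np.isclose(num, A_grad[0, 1], atol=1e-5)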
Backpropagation on Loops
In practice, all deep learning operations can be written using loops over scalar assignments. Example for a higher-order tensor:

for h,i,j,k    U.value[h, i, j] += A.value[h, i, k] ∗ B.value[h, j, k]
for h,i,j      Y.value[h, i, j] = σ(U.value[h, i, j])

Backpropagation loops:

for h,i,j      U.grad[h, i, j] += Y.grad[h, i, j] ∗ σ′(U.value[h, i, j])
for h,i,j,k    A.grad[h, i, k] += U.grad[h, i, j] ∗ B.value[h, j, k]
for h,i,j,k    B.grad[h, j, k] += U.grad[h, i, j] ∗ A.value[h, i, k]
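The higher-order forward loop corresponds to a batched matrix product; a brief NumPy sketch confirming the equivalence (shapes are illustrative):

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(2)
H, I, J, K = 2, 3, 4, 5
A = rng.normal(size=(H, I, K))
B = rng.normal(size=(H, J, K))

# Loop version from the slides
U = np.zeros((H, I, J))
for h in range(H):
    for i in range(I):
        for j in range(J):
            for k in range(K):
                U[h, i, j] += A[h, i, k] * B[h, j, k]
Y = sigmoid(U)

# Equivalent vectorized form: U[h,i,j] = sum_k A[h,i,k] * B[h,j,k]
U_fast = np.einsum('hik,hjk->hij', A, B)      # or A @ B.transpose(0, 2, 1)
assert np.allclose(U, U_fast)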
Minibatching

Source code has two components:

- Slow part: sequential operations (Python)
- Fast part: vector/matrix operations (NumPy, BLAS, CUDA)

Goal:
- The fast part should dominate computation (wall-clock time)
- Reduce the number of slow sequential operations (e.g., Python loops) by running the fast vector/matrix operations on several data points jointly
- This is called minibatching and is used in stochastic gradient descent
Minibatching
Affine + Sigmoid (applied to N data points simultaneously):

Y = σ(XA + B),   with U = XA

- Each row of X ∈ R^{N×D} is a data point; the bias b ∈ R^M is broadcast to B ∈ R^{N×M}

The loops now include a batch index b:

for b,i      U.value[b, i] = 0
for b,i,j    U.value[b, i] += X.value[b, j] ∗ A.value[j, i]
for b,i      Y.value[b, i] = σ(U.value[b, i] + B.value[i])

- Only inputs and outputs depend on the batch index b, not the parameters (e.g., A, B)
- By convention, the gradients are averaged over the batch
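Vectorized over the batch, the same computation becomes a single matrix product; a minimal sketch (shapes are illustrative):

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(3)
N, D, M = 8, 4, 3                       # batch size, input dim, output dim
X = rng.normal(size=(N, D))             # one data point per row
A = rng.normal(size=(D, M))
b = rng.normal(size=M)

Y = sigmoid(X @ A + b)                  # bias b is broadcast across the N rows

# Same result as explicit loops over the batch index
Y_loop = np.empty((N, M))
for n in range(N):
    Y_loop[n] = sigmoid(A.T @ X[n] + b)
assert np.allclose(Y, Y_loop)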
Implementation
Affine Transformation (applied to N data points simultaneously):

Y = XA + B

- Each row of X ∈ R^{N×D} is a data point; the bias b ∈ R^M is broadcast to B ∈ R^{N×M}

Implementation in EDF:

def forward(self):
    # Y = X A + b : (N, D) @ (D, M) + (M,) -> (N, M), bias broadcast over rows
    self.value = np.matmul(self.x.value, self.w.A.value) + self.w.b.value

def backward(self):
    # dL/dX = dL/dY A^T : (N, M) @ (M, D) -> (N, D)
    self.x.addgrad(np.matmul(self.grad, self.w.A.value.transpose()))
    # dL/dB: pass the upstream gradient dL/dY; addgrad accumulates it
    # (and handles any reduction over the batch dimension)
    self.w.b.addgrad(self.grad)
    # dL/dA: per-sample outer products x_n (dL/dy_n), shape (N, D, M)
    self.w.A.addgrad(self.x.value[:, :, np.newaxis] * self.grad[:, np.newaxis, :])

- Computation graphs are easy to understand using the loop notation
- An efficient implementation using NumPy primitives is not always obvious
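Regarding the last line of backward: summing those per-sample outer products over the batch yields the familiar matrix form of the weight gradient; a quick check (variable names are mine, not from EDF):

import numpy as np

rng = np.random.default_rng(4)
N, D, M = 8, 4, 3
X = rng.normal(size=(N, D))             # inputs, one row per data point
G = rng.normal(size=(N, M))             # upstream gradient dL/dY

per_sample = X[:, :, np.newaxis] * G[:, np.newaxis, :]    # shape (N, D, M)
assert np.allclose(per_sample.sum(axis=0), X.T @ G)       # dL/dA = X^T dL/dY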
3.2
The XOR Problem
The XOR Problem

Logistic Regression Model:

ŷ = σ(w⊤x)   with   σ(x) = 1 / (1 + e⁻ˣ)

- Which problems can we solve with such a simple linear classifier?
The XOR Problem

Example: 2D Logistic Regression

ŷ = σ(w⊤x + w₀)   with   σ(x) = 1 / (1 + e⁻ˣ)

- Let x ∈ R²
- Decision boundary: w⊤x + w₀ = 0
- Decide for class 1 ⇔ w⊤x > −w₀
- Decide for class 0 ⇔ w⊤x < −w₀

[Figure: sigmoid σ(x); the decision boundary corresponds to σ = 0.5, with class 0 on one side and class 1 on the other]
The XOR Problem

Linear Classifier:

Class 1 ⇔ w⊤x > −w₀,   here: (1  1) · (x₁, x₂)⊤ > 0.5   (i.e., w⊤ = (1, 1), −w₀ = 0.5)

  x₁  x₂  OR(x₁,x₂)
   0   0      0
   0   1      1
   1   0      1
   1   1      1

[Figure: the four inputs in the unit square; (0,0) is class 0, the other three corners are class 1, separated by a single line]
The XOR Problem

Linear Classifier:

Class 1 ⇔ w⊤x > −w₀,   here: (1  1) · (x₁, x₂)⊤ > 1.5   (i.e., w⊤ = (1, 1), −w₀ = 1.5)

  x₁  x₂  AND(x₁,x₂)
   0   0      0
   0   1      0
   1   0      0
   1   1      1

[Figure: the four inputs in the unit square; only (1,1) is class 1, separated by a single line]
The XOR Problem

Linear Classifier:

Class 1 ⇔ w⊤x > −w₀,   here: (−1  −1) · (x₁, x₂)⊤ > −1.5   (i.e., w⊤ = (−1, −1), −w₀ = −1.5)

  x₁  x₂  NAND(x₁,x₂)
   0   0      1
   0   1      1
   1   0      1
   1   1      0

[Figure: the four inputs in the unit square; only (1,1) is class 0, separated by a single line]
The XOR Problem

Linear Classifier:

Class 1 ⇔ w⊤x > −w₀,   here: (?  ?) · (x₁, x₂)⊤ > ?

  x₁  x₂  XOR(x₁,x₂)
   0   0      0
   0   1      1
   1   0      1
   1   1      0

[Figure: the four inputs in the unit square; (0,1) and (1,0) are class 1, (0,0) and (1,1) are class 0 — which single line could separate them?]
The XOR Problem

[Figure: the four XOR inputs in the unit square; (0,1) and (1,0) belong to class 1, (0,0) and (1,1) to class 0]

- Visually it is obvious that XOR is not linearly separable
- How can we formally prove this?
Convex Sets

- A set S is convex if any line segment connecting two points in S lies entirely within S:

  x₁, x₂ ∈ S  ⇒  λx₁ + (1 − λ)x₂ ∈ S   for λ ∈ [0, 1]
The XOR Problem

- Half-spaces (e.g., decision regions) are convex sets
- Suppose there were a feasible hypothesis. If the positive examples lie in the positive half-space, then the green line segment connecting them must lie there as well.
- Similarly, the red line segment connecting the negative examples must lie within the negative half-space.
- But the intersection point of the two segments cannot lie in both half-spaces. Contradiction! (An algebraic version of the same argument is given below.)

[Figure: the XOR points with the segment between the two class-1 points and the segment between the two class-0 points crossing at (0.5, 0.5)]
The XOR Problem

Some Historical Remarks:

- Linear classification showed some promising results in the 50s and 60s on simple image classification problems (Perceptron)
- However, its limitations became clear very soon (e.g., Minsky and Papert's book “Perceptrons”, 1969)
- The XOR problem is simple, yet it cannot be solved because the model capacity is limited to linear decision boundaries
- This led to a decline in neural network research in the 70s
- How can we solve non-linear problems?
The XOR Problem
Linear classifier with non-linear features ψ:

  w⊤ (x₁  x₂  x₁x₂)⊤ > −w₀,   with ψ(x) = (x₁, x₂, x₁x₂)⊤

  x₁  x₂  ψ₁(x)  ψ₂(x)  ψ₃(x)  XOR
   0   0    0      0      0     0
   0   1    0      1      0     1
   1   0    1      0      0     1
   1   1    1      1      1     0

- Non-linear features allow a linear classifier to solve non-linear classification problems!
- Analogous to polynomial curve fitting

[Figure: with the additional feature x₁x₂, the XOR classes become linearly separable]
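One concrete choice of weights that works (picked by hand for this sketch, not taken from the lecture): with ψ(x) = (x₁, x₂, x₁x₂)⊤, the weights w = (1, 1, −2)⊤ and threshold −w₀ = 0.5 classify XOR correctly:

import numpy as np

def psi(x1, x2):
    """Non-linear feature map psi(x) = (x1, x2, x1*x2)."""
    return np.array([x1, x2, x1 * x2])

w = np.array([1.0, 1.0, -2.0])     # hand-picked weights (illustrative)
threshold = 0.5                     # i.e. -w0

for x1, x2 in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    pred = int(w @ psi(x1, x2) > threshold)
    assert pred == (x1 ^ x2)        # matches XOR on all four inputs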
Representation Matters
Cartesian Coordinates vs. Polar Coordinates

[Figure: the same data shown in Cartesian coordinates (x, y) and in polar coordinates (r, θ); the change of representation makes the classes easy to separate]

- But how do we choose the transformation? This can be very hard in practice.
- Yet, hand-designed features were the dominant approach until the 2000s (vision, speech, ...)
- In this class we want to learn the representation ⇒ Representation learning
- The human then needs to choose the right function family rather than the correct function
The XOR Problem

Linear Classifier:   Class 1 ⇔ w⊤x > −w₀

XOR(x₁, x₂) = AND(OR(x₁, x₂), NAND(x₁, x₂))

  x₁  x₂  XOR(x₁,x₂)
   0   0      0
   0   1      1
   1   0      1
   1   1      0

[Figure: the OR and NAND decision boundaries together carve out the XOR (class 1) region between them]
The XOR Problem

XOR(x₁, x₂) = AND(OR(x₁, x₂), NAND(x₁, x₂))

The above expression can be rewritten as a program of logistic regressors:

  h₁ = σ(w_OR⊤ x + w_OR,0)
  h₂ = σ(w_NAND⊤ x + w_NAND,0)
  ŷ = σ(w_AND⊤ h + w_AND,0)

Note that h(x) is a non-linear feature of x. We call h(x) a hidden layer.
The XOR Problem

XOR(x₁, x₂) = AND(OR(x₁, x₂), NAND(x₁, x₂))

Writing the two 1D mappings h₁(x) and h₂(x) as a single 2D mapping h(x) yields:

  h = σ(W x + w),   where the rows of W are w_OR⊤ and w_NAND⊤, and w = (w_OR,0, w_NAND,0)⊤
  ŷ = σ(w_AND⊤ h + w_AND,0)

The parameters can be learned using backprop. This is our first Multi-Layer Perceptron!
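A minimal numerical sketch of this two-layer construction with hand-picked (not learned) OR/NAND/AND weights, scaled so the sigmoids are nearly saturated; the particular numbers are illustrative choices, not from the lecture:

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

s = 10.0                                    # scale factor: sharper sigmoids
W = s * np.array([[ 1.0,  1.0],             # row 1: OR   weights
                  [-1.0, -1.0]])            # row 2: NAND weights
w = s * np.array([-0.5, 1.5])               # biases: OR fires if x1+x2 > 0.5, NAND if x1+x2 < 1.5
w_and = s * np.array([1.0, 1.0])            # AND over the two hidden units
w_and0 = s * (-1.5)

def xor_mlp(x):
    h = sigmoid(W @ x + w)                  # hidden layer: (OR(x), NAND(x))
    return sigmoid(w_and @ h + w_and0)      # output layer: AND(h1, h2)

for x1, x2 in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    y_hat = xor_mlp(np.array([x1, x2], dtype=float))
    assert round(y_hat) == (x1 ^ x2)

In practice these weights would of course be learned with backprop rather than set by hand.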
3.3
Multi-Layer Perceptrons
Multi-Layer Perceptrons
- MLPs are feedforward neural networks (no feedback connections)
- They compose several non-linear functions f(x) = ŷ(h₃(h₂(h₁(x)))), where the hᵢ(·) are called hidden layers and ŷ(·) is the output layer
- The data specifies only the behavior of the output layer (thus the name “hidden”)
- Each layer i comprises multiple neurons j, implemented as affine transformations (a⊤x + b) followed by non-linear activation functions g:

  h_{ij} = g(a_{ij}⊤ h_{i−1} + b_{ij})

- Each neuron in each layer is fully connected to all neurons of the previous layer
- The overall length of the chain is the depth of the model ⇒ “Deep Learning”
- The name MLP is misleading, as we don't use threshold units as in Perceptrons
MLP Network Architecture
Network Depth = #Computation Layers = 4 (in the example below)
Layer Width = #Neurons in a Layer

[Figure: Input Layer → Hidden Layer 1 → Hidden Layer 2 → Hidden Layer 3 → Output Layer]

- Neurons are grouped into layers; each neuron is fully connected to all neurons of the previous layer
- Hidden layer hᵢ = g(Aᵢ hᵢ₋₁ + bᵢ) with activation function g(·) and weights Aᵢ, bᵢ (see the sketch below)
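A generic forward pass through such a stack of layers, as a minimal sketch (layer sizes and the sigmoid activation are illustrative choices):

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def mlp_forward(x, params, g=sigmoid):
    """Apply h_i = g(A_i h_{i-1} + b_i) for each layer; params is a list of (A_i, b_i)."""
    h = x
    for A, b in params:
        h = g(A @ h + b)
    return h

# Example: a 2 -> 3 -> 3 -> 1 network with random weights
rng = np.random.default_rng(5)
sizes = [2, 3, 3, 1]
params = [(rng.normal(size=(m, n)), rng.normal(size=m))
          for n, m in zip(sizes[:-1], sizes[1:])]
y_hat = mlp_forward(np.array([0.5, -1.0]), params)
print(y_hat.shape)    # (1,)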
Feature Learning Perspective
Linear Regressor / Classifier

[Figure: Input Layer → Hidden Layer 1 → Hidden Layer 2 → Hidden Layer 3 → Output Layer; the hidden layers transform the data so that class 0 and class 1, which are not linearly separable in input space, become linearly separable for the output layer, which acts as a simple linear regressor/classifier]
Activation Functions g(·)

[Figure: plots of common activation functions]
Neural Motivation

- Neurons in the brain are structured in layers
- They receive input from many other units and compute their own activation
- The sigmoid activation function is guided by neuroscientific observations
- However, the architecture and training of modern networks differ radically from the brain
- Our main goal is not to model the brain, but to achieve statistical generalization
Training
Algorithm for training an MLP using (stochastic) gradient descent:
1. Initialize weights w, pick learning rate η and minibatch size |X_batch|
2. Draw a (random) minibatch X_batch ⊆ X
3. For all elements (x, y) ∈ X_batch of the minibatch (in parallel) do:
   3.1 Forward propagate x through the network to calculate h₁, h₂, ..., ŷ
   3.2 Backpropagate gradients through the network to obtain ∇_w L(ŷ, y)
4. Update weights:  w_{t+1} = w_t − η · (1/|X_batch|) Σ_{(x,y)∈X_batch} ∇_w L(ŷ, y)
5. If the validation error decreases, go to step 2, otherwise stop

Remarks:
- Large datasets typically do not fit into GPU memory ⇒ |X_batch| < |X|
- Our examples on the next slides are small ⇒ |X_batch| = |X|
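A compact sketch of this training loop for a one-hidden-layer network with a squared-error loss; the toy dataset, architecture, loss and hyperparameters are illustrative choices, not the lecture's:

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(6)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)   # toy dataset: XOR
y = np.array([0.0, 1.0, 1.0, 0.0])

# 1. Initialize weights, pick learning rate eta and minibatch size (here: the full dataset)
A1, b1 = rng.normal(size=(3, 2)), np.zeros(3)
a2, b2 = rng.normal(size=3), 0.0
eta, batch_size = 1.0, 4

for step in range(5000):
    # 2. Draw a (random) minibatch
    idx = rng.choice(len(X), size=batch_size, replace=False)
    Xb, yb = X[idx], y[idx]
    # 3.1 Forward propagate (all minibatch elements in parallel)
    H = sigmoid(Xb @ A1.T + b1)                   # hidden layer, shape (batch, 3)
    y_hat = sigmoid(H @ a2 + b2)                  # output, shape (batch,)
    # 3.2 Backpropagate gradients of L = mean (y_hat - y)^2, averaged over the batch
    d_out = 2.0 * (y_hat - yb) * y_hat * (1.0 - y_hat) / batch_size
    g_a2, g_b2 = H.T @ d_out, d_out.sum()
    d_hid = np.outer(d_out, a2) * H * (1.0 - H)
    g_A1, g_b1 = d_hid.T @ Xb, d_hid.sum(axis=0)
    # 4. Update the weights
    A1 -= eta * g_A1
    b1 -= eta * g_b1
    a2 -= eta * g_a2
    b2 -= eta * g_b2

# Ideally prints [0. 1. 1. 0.]; success depends on the random initialization
print(np.round(sigmoid(sigmoid(X @ A1.T + b1) @ a2 + b2)))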
Levels of Abstraction

- When designing neural networks and machine learning algorithms, you will need to simultaneously think at multiple levels of abstraction
- “The psychological profiling [of a programmer] is mostly the ability to shift levels of abstraction, from low level to high level. To see something in the small and to see something in the large.” [Donald E. Knuth]
The XOR Problem

- Note that we have learned a Boolean circuit! ⇒ differentiable programming

https://cs.stanford.edu/people/karpathy/convnetjs/demo/classify2d.html
A More Challenging Problem

[Figure: decision boundaries learned by networks with 2, 5, and 15 hidden neurons on a harder 2D dataset]

https://cs.stanford.edu/people/karpathy/convnetjs/demo/classify2d.html
Expressiveness
The following two-layer MLP

  h = g(A₁x + b₁)
  y = g(A₂h + b₂)

can be written as

  y = g(A₂ g(A₁x + b₁) + b₂)

What if we used a linear activation function g(x) = x?

  y = A₂(A₁x + b₁) + b₂ = A₂A₁x + A₂b₁ + b₂ = Ax + b

- With linear activations, a multi-layer network can only express linear functions
- What is the model capacity of MLPs with non-linear activation functions?
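A quick numerical confirmation of this collapse (shapes and values are illustrative):

import numpy as np

rng = np.random.default_rng(7)
A1, b1 = rng.normal(size=(3, 2)), rng.normal(size=3)
A2, b2 = rng.normal(size=(4, 3)), rng.normal(size=4)
x = rng.normal(size=2)

two_layers = A2 @ (A1 @ x + b1) + b2           # y = A2 (A1 x + b1) + b2
A, b = A2 @ A1, A2 @ b1 + b2                   # collapsed single affine map
assert np.allclose(two_layers, A @ x + b)      # identical: no extra capacity gained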
3.4
Universal Approximation
Universal Approximation Theorem
Theorem 1
Let σ be any continuous discriminatory function. Then finite sums of the form

  G(x) = Σ_{j=1}^{N} αⱼ σ(aⱼ⊤ x + bⱼ)

are dense in the space of continuous functions C(Iₙ) on the n-dimensional unit cube Iₙ. In other words, given any f ∈ C(Iₙ) and ε > 0, there is a sum G(x) for which

  |G(x) − f(x)| < ε   for all x ∈ Iₙ

Remark: This has been proven for various activation functions (e.g., sigmoid, ReLU).

Cybenko: Approximation by superpositions of a sigmoidal function. Mathematics of Control, Signals, and Systems, 1989.
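To illustrate such a finite sum in 1D, one can fix random aⱼ, bⱼ and fit only the outer coefficients αⱼ by least squares; this fitting procedure is an illustration, not part of the theorem or the lecture:

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(8)
N = 50                                          # number of hidden sigmoid units
a = rng.normal(scale=10.0, size=N)              # random inner weights a_j
b = rng.uniform(-10.0, 10.0, size=N)            # random inner biases b_j

x = np.linspace(0.0, 1.0, 200)
f = np.sin(2 * np.pi * x)                       # target continuous function on [0, 1]

Phi = sigmoid(np.outer(x, a) + b)               # Phi[m, j] = sigma(a_j * x_m + b_j)
alpha, *_ = np.linalg.lstsq(Phi, f, rcond=None)   # fit the outer coefficients alpha_j
G = Phi @ alpha                                 # G(x) = sum_j alpha_j sigma(a_j x + b_j)

print(np.max(np.abs(G - f)))                    # approximation error shrinks as N grows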
Example: Binary Case
  x₁  x₂  x₃   y
   ⋮   ⋮   ⋮   ⋮
   0   1   0   0
   0   1   1   1
   1   0   0   0
   ⋮   ⋮   ⋮   ⋮

  ŷ = Σᵢ [aᵢ⊤ x + bᵢ > 0],   where each summand hᵢ = [aᵢ⊤ x + bᵢ > 0] is a linear threshold unit

- Each hidden linear threshold unit hᵢ recognizes one possible input vector
- We need 2^D hidden units to recognize all 2^D possible inputs in the binary case (see the sketch below)
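A sketch of this construction for D = 3: one linear threshold unit per binary pattern, whose output weight is the stored label of that pattern. The specific weight/bias choice aᵢ = 2pᵢ − 1, bᵢ = 0.5 − |pᵢ| is one way (chosen here, not in the lecture) to make unit i fire only on input pᵢ:

import itertools
import numpy as np

D = 3
patterns = np.array(list(itertools.product([0, 1], repeat=D)))    # all 2^D binary inputs
targets = np.array([int(p.sum() % 2) for p in patterns])          # an arbitrary Boolean function (here: parity)

# One hidden linear threshold unit per pattern: unit i fires only on input p_i
A = 2 * patterns - 1                    # a_i = 2 p_i - 1 (entries in {-1, +1})
b = 0.5 - patterns.sum(axis=1)          # b_i = 0.5 - |p_i|

def predict(x):
    h = (A @ x + b > 0).astype(int)     # h_i = [a_i^T x + b_i > 0]; exactly one unit fires
    return int(targets @ h)             # output the stored label of the matching pattern

assert all(predict(x) == t for x, t in zip(patterns, targets))    # memorizes all 2^D inputs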
Soft Thresholds
[Figure: sigmoids σ(x), σ(2x), σ(5x), σ(50x); with increasing input weight the sigmoid approaches a hard step function]

- Learning linear threshold units is hard, as their gradient is 0 almost everywhere
- Solution: replace the hard threshold with a soft threshold (e.g., a sigmoid)
- Sigmoids approximate step functions when the input weight is increased
Network Width vs. Depth
- Universality of 2-layer networks is appealing but requires exponential width
- This leads to an exponential increase in memory and computation time
- Moreover, it does not lead to generalization ⇒ the network simply memorizes its inputs
- Deep networks can represent functions more compactly (with fewer parameters)
- Inductive bias: complex functions are modeled as compositions of simple functions
- This leads to more compact models and better generalization performance
- Example: the parity function

  f(x₁, ..., x_D) = 1 if Σᵢ xᵢ is odd, 0 otherwise

  requires an exponentially large shallow network, but can be computed by a deep network whose size is linear in the number of inputs D (see the sketch below).
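A sketch of the linear-size deep construction: parity is a chain of XORs, and each XOR can be computed by the small two-layer block from Section 3.2. The hand-picked weights below are illustrative choices:

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

s = 10.0                                        # scale for near-saturated sigmoids
W = s * np.array([[1.0, 1.0], [-1.0, -1.0]])    # OR and NAND rows
w = s * np.array([-0.5, 1.5])
w_and, w_and0 = s * np.array([1.0, 1.0]), s * (-1.5)

def xor_block(a, b):
    """Two-layer XOR unit: a constant number of neurons."""
    h = sigmoid(W @ np.array([a, b]) + w)
    return sigmoid(w_and @ h + w_and0)

def deep_parity(x):
    """Chain of D-1 XOR blocks: depth and size grow only linearly with D."""
    acc = x[0]
    for xi in x[1:]:
        acc = xor_block(acc, xi)
    return round(acc)

rng = np.random.default_rng(9)
for _ in range(20):
    x = rng.integers(0, 2, size=8)
    assert deep_parity(x.astype(float)) == int(x.sum() % 2)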
Space Folding Intuition

Space folding intuition for the case of absolute value rectification units:
- Geometric explanation of the exponential advantage of deeper networks
- The mirror axis of symmetry is given by the hyperplane (defined by weights and bias)
- Complex functions arise as mirrored images of simpler patterns

Montufar, Pascanu, Cho and Bengio: On the Number of Linear Regions of Deep Neural Networks. NIPS, 2014.
Effect of Network Depth

- Deeper networks generalize better (task: multi-digit number classification)

Goodfellow, Bulatov, Ibarz, Arnoud and Shet: Multi-digit number recognition from Street View imagery using deep convolutional neural networks. ICLR, 2014.
Effect of Network Depth

- Increasing the number of parameters is not as effective as increasing depth
- Shallow models even overfit at around 20 million parameters in this example
- Compositionality is a useful prior over the space of functions the model can learn

Goodfellow, Bulatov, Ibarz, Arnoud and Shet: Multi-digit number recognition from Street View imagery using deep convolutional neural networks. ICLR, 2014.
