3 - Deep Learning and CNNs
Intelligent Systems
ELE 4643
What is Deep Learning?
Artificial Intelligence:
• A broad concept where machines think and act more like humans
Machine Learning:
• An application of AI where machines use data to automatically improve at performing tasks
Deep Learning:
• A machine learning technique that processes data through a multi-layered neural network, much like the human brain
Deep Learning Animation
How neural networks learn | Deep learning
https://youtu.be/IHZwWFHWa-w
Backpropagation
• How do you train a Multi-Layer Neural Network's weights? How does it learn?
• The Backpropagation algorithm uses Gradient Descent
• For each training step (a minimal sketch follows this list):
– Compute the output error
– Compute how much each neuron in the previous hidden layer contributed to that error
– Back-propagate that error in a reverse pass
– Tweak the weights to reduce the error using gradient descent
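As a minimal sketch of these steps, here is a one-hidden-layer network trained with backpropagation and gradient descent in NumPy. The sigmoid activation, squared-error loss, and layer sizes are illustrative assumptions, not fixed by the slides:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
W1 = rng.normal(size=(4, 3))   # hidden-layer weights (4 neurons, 3 inputs) - assumed sizes
W2 = rng.normal(size=(1, 4))   # output-layer weights
x = rng.normal(size=(3, 1))    # one training example
y = np.array([[1.0]])          # desired output
lr = 0.1                       # learning rate

for step in range(100):
    # Forward pass
    h = sigmoid(W1 @ x)                       # hidden activations
    y_hat = sigmoid(W2 @ h)                   # network output
    # 1. Compute the output error (gradient of squared-error loss)
    delta2 = (y_hat - y) * y_hat * (1 - y_hat)
    # 2./3. Reverse pass: how much each hidden neuron contributed
    delta1 = (W2.T @ delta2) * h * (1 - h)
    # 4. Tweak the weights with gradient descent
    W2 -= lr * delta2 @ h.T
    W1 -= lr * delta1 @ x.T
```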
Backpropagation
Information flow
$\mathbf{y} = f_{\mathrm{NN}}(\mathbf{x}) = o\big(\mathbf{W}_{L}\,\sigma\big(\mathbf{W}_{L-1}\,\sigma\big(\cdots\,\mathbf{W}_{1}\mathbf{x}\big)\big)\big)$
[Figure: information flows forward from the input layer, through the hidden layers, to the output layer; the error is calculated by comparing the network output with the desired output from the training data.]
[Figure] Potential problems during gradient descent: a) finding local minima instead of the global minimum, b) near halting due to small gradients, c) oscillation in valleys, d) leaving good minima.
Learning Flow
[Figure: the training loop. Step 1: random initialization of the weights. Inputs pass through the network to produce an output; a loss function compares the output with the desired output, and an optimizer updates the weights. Learning is performed on the training set.]
How is this achieved?
• ReLU activation function
• Loss functions
• Interactive demo: playground.tensorflow.org
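For concreteness, a sketch of the ReLU activation and two common loss functions in NumPy (the slide names ReLU but does not fix the losses; mean squared error and cross-entropy are the usual choices):

```python
import numpy as np

def relu(z):
    # ReLU: keep positive values, clamp negatives to zero
    return np.maximum(0.0, z)

def mse_loss(y_hat, y):
    # Mean squared error, typical for regression
    return np.mean((y_hat - y) ** 2)

def cross_entropy_loss(probs, y_onehot):
    # Cross-entropy over predicted class probabilities, typical for classification
    eps = 1e-12  # avoid log(0)
    return -np.mean(np.sum(y_onehot * np.log(probs + eps), axis=1))
```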
Convolutional Neural Network (CNN)
• What is a Convolutional Neural Network?
– A CNN is a neural network that uses convolution operations instead of matrix multiplications in at least one of its layers
[Figure: a 32 x 32 RGB image fed to a classifier over categories such as bird, car, cat, deer, dog, horse, ship.]
An RGB image can be represented as 32 x 32 x 3 pixels, so the input vector dimension is N = 32 × 32 × 3 = 3072.
CNN – Motivation
Property 1: Some patterns are much smaller than the whole image → Convolution
Property 2: The same patterns appear in different regions → Convolution
Property 3: Downsampling the pixels does not change the object → Max Pooling
Typical pipeline (the Convolution + Max Pooling pair can repeat many times):
Convolution → Max Pooling → Convolution → Max Pooling → … → Flatten → Fully Connected Feedforward Network → Prediction
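This pipeline could be sketched in Keras roughly as follows; the filter counts and layer sizes here are illustrative assumptions, not values from the slides:

```python
from tensorflow import keras
from tensorflow.keras import layers

model = keras.Sequential([
    layers.Input(shape=(32, 32, 3)),           # a 32 x 32 RGB image
    layers.Conv2D(32, 3, activation="relu"),   # Convolution
    layers.MaxPooling2D(2),                    # Max Pooling
    layers.Conv2D(64, 3, activation="relu"),   # Convolution (repeated)
    layers.MaxPooling2D(2),                    # Max Pooling (repeated)
    layers.Flatten(),                          # Flatten
    layers.Dense(64, activation="relu"),       # Fully connected feedforward
    layers.Dense(10, activation="softmax"),    # Prediction over 10 classes
])
```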
Convolutional Layers
• Apply small filters to detect small patterns
• Each filter has a size of 3 x 3

6 x 6 image:      Filter 1:      Filter 2:
0 1 0 0 1 0       -1  1 -1        1 -1 -1
0 1 0 0 1 0       -1  1 -1       -1  1 -1
0 1 0 0 1 0       -1  1 -1       -1 -1  1
1 0 0 0 0 1
0 1 0 0 1 0
0 0 1 1 0 0

• Note: Only the size of the filters is specified; the weights are initialised to arbitrary values before the start of training.
• The weights of the filters are learnt through the CNN training process.
Convolutional Layers
• Key Parameters
– Filter size – defines the height and width of the filter kernel
• E.g., a filter kernel of size 3 x 3 would have nine weights
– Stride – determines the number of steps to move in each spatial direction while performing convolution
– Padding – appends zeroes to the boundary of an image to control the size of the output of convolution
• When we convolve an image of a specific size with a filter, the resulting image is generally smaller than the original image
Convolutional Layers
stride = 1
Compute the dot product between the filter and a small 3 x 3 chunk of the image:
• First window (top-left 3 x 3 chunk of the 6 x 6 image, with Filter 1): dot product = 3
• Slide the filter one pixel to the right: dot product = -3
Convolutional Layers
stride = 2
With stride = 2 the filter moves two pixels at a time, so fewer windows are evaluated:
• First window: dot product = 3
• Two pixels to the right: dot product = -3
We set stride = 1 below.
Compute the dot product between the filter and a small 3 x 3 chunk of the image.
Convolutional Layers
stride = 1
Sliding Filter 1 over the whole 6 x 6 image with stride 1 gives a 4 x 4 feature map:
 3 -3 -3  3
 1 -2 -2  1
 1 -2 -2  1
-1 -1 -1 -1
Convolutional Layers
A 6 x 6 image convolved with a 3 x 3 filter yields a 4 x 4 image:
output size: (6 - 3) / 1 + 1 = 4
Convolutional Layers
For an N x N image and filter size F:
output size = (N - F) / stride + 1
For example, N = 6, F = 3:
stride = 1 → (6 - 3)/1 + 1 = 4
stride = 2 → (6 - 3)/2 + 1 = 2.5 (not an integer, so stride 2 does not fit this image)
stride = 3 → (6 - 3)/3 + 1 = 2
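A small helper makes it easy to check which strides fit; this function is an illustration, not part of the slides:

```python
def conv_output_size(n, f, stride):
    """Output size of an N x N image convolved with an F x F filter (no padding)."""
    out = (n - f) / stride + 1
    if out != int(out):
        raise ValueError(f"stride {stride} does not fit: output would be {out}")
    return int(out)

print(conv_output_size(6, 3, 1))   # 4
print(conv_output_size(6, 3, 3))   # 2
# conv_output_size(6, 3, 2) raises an error: output would be 2.5
```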
Convolutional Layers
stride = 1
Filter 1 detects a vertical line: the two vertical lines in the 6 x 6 image both produce the maximum response, 3, in the feature map.
The same pattern in different locations is detected with the same filter.
Convolutional Layers
stride = 1
Do the same process for every filter. Each filter produces its own 4 x 4 feature map:

Filter 1:             Filter 2:
 3 -3 -3  3           -1 -1 -1 -1
 1 -2 -2  1           -1  0 -2 -1
 1 -2 -2  1           -3  0  0 -3
-1 -1 -1 -1            3 -1 -3 -1

Stacking the feature maps gives an output of size 4 x 4 x (#filters).
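A NumPy sketch of this sliding-window computation (technically a cross-correlation, which is what CNN layers compute) reproduces both 4 x 4 feature maps above:

```python
import numpy as np

image = np.array([[0, 1, 0, 0, 1, 0],
                  [0, 1, 0, 0, 1, 0],
                  [0, 1, 0, 0, 1, 0],
                  [1, 0, 0, 0, 0, 1],
                  [0, 1, 0, 0, 1, 0],
                  [0, 0, 1, 1, 0, 0]])

filter1 = np.array([[-1, 1, -1], [-1, 1, -1], [-1, 1, -1]])   # vertical-line detector
filter2 = np.array([[1, -1, -1], [-1, 1, -1], [-1, -1, 1]])   # diagonal detector

def convolve(img, filt, stride=1):
    f = filt.shape[0]
    out = (img.shape[0] - f) // stride + 1
    fmap = np.zeros((out, out), dtype=int)
    for i in range(out):
        for j in range(out):
            # dot product between the filter and a 3 x 3 chunk of the image
            chunk = img[i*stride : i*stride+f, j*stride : j*stride+f]
            fmap[i, j] = np.sum(chunk * filt)
    return fmap

print(convolve(image, filter1))   # first row: [ 3 -3 -3  3]
print(convolve(image, filter2))   # first row: [-1 -1 -1 -1]
```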
Convolutional Layers
RGB images
[Figure: a 6 x 6 x 3 RGB image (3 channels) convolved with two 3 x 3 x 3 filters; each filter has one 3 x 3 slice per channel.]
Filters always extend the full depth of the input volume: for a 3-channel 6 x 6 x 3 image, each filter is 3 x 3 x 3.
Convolutional Layers – Parameters
• Key Parameters:
● Accepts an input of size W1 x H1 x D1
● Requires 4 hyperparameters:
○ Number of filters K
○ Size of the filters F
○ The stride S
○ The amount of zero padding P
● Common settings:
K: powers of 2, such as 32, 64, 128, 512
F = 3, S = 1, P = 1
F = 5, S = 1, P = 2
● Produces an output of size W2 x H2 x D2, where:
○ W2 = (W1 - F + 2P)/S + 1
○ H2 = (H1 - F + 2P)/S + 1
○ D2 = K
● With parameter sharing, it introduces F x F x D1 weights per filter, for a total of (F x F x D1) x K weights and K biases.
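These formulas can be checked with a few lines of Python; the example input and filter settings below are assumptions for illustration:

```python
def conv_layer_stats(w1, h1, d1, k, f, s, p):
    """Output size and learnable-parameter count of a convolutional layer."""
    w2 = (w1 - f + 2 * p) // s + 1
    h2 = (h1 - f + 2 * p) // s + 1
    weights = (f * f * d1) * k   # parameter sharing: F x F x D1 weights per filter
    biases = k                   # one bias per filter
    return (w2, h2, k), weights + biases

# A 32 x 32 x 3 input with K = 32 filters, F = 3, S = 1, P = 1:
print(conv_layer_stats(32, 32, 3, k=32, f=3, s=1, p=1))
# ((32, 32, 32), 896)  ->  3*3*3*32 weights + 32 biases = 896 parameters
```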
Convolution vs Fully Connected
[Figure: the same 6 x 6 image processed two ways. Top: convolution with Filter 1 gives the 4 x 4 feature map (3, -3, -3, 3, …). Bottom: the image is flattened into a vector x1 … xN and fed to a fully connected layer.]
Convolution vs Fully Connected
[Figure: the flattened image as inputs 1 … 36; the first feature-map value, 3, is connected only to the 9 pixels under the filter (inputs 1, 2, 3, 7, 8, 9, 13, 14, 15).]
Instead of 36, only 9 inputs are connected to each output value.
Fewer parameters to learn!
Convolution vs Fully Connected
[Figure: the next feature-map value, -3, is computed from a shifted set of 9 pixels using the same 9 filter weights.]
The weights are shared between output cells.
Even fewer parameters to learn!
Max Pooling
• Down-sampling operation: pooling size = 2 x 2, stride = 2
• Operates over each feature map independently
• Invariant to small differences in the input

Convolving the 6 x 6 image with Filter 1 and Filter 2 gives two 4 x 4 feature maps; max pooling then keeps the largest value in each 2 x 2 window:

Feature map 1:        Feature map 2:
 3 -3 -3  3           -1 -1 -1 -1
 1 -2 -2  1           -1  0 -2 -1
 1 -2 -2  1           -3  0  0 -3
-1 -1 -1 -1            3 -1 -3 -1

After max pooling:
 3  3                  0 -1
 1  1                  3  0
Key Parameters:
● Accepts an input of size W1 x H1 x D1
● Requires 2 hyperparameters:
○ Size of the filters F
○ The stride S
● Common settings: F = 2, S = 2 or F = 3, S = 2
● Produces an output of size W2 x H2 x D2, where:
○ W2 = (W1 - F)/S + 1
○ H2 = (H1 - F)/S + 1
○ D2 = D1
● It introduces zero learnable parameters since it computes a fixed function of the input.
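A NumPy sketch of 2 x 2 max pooling with stride 2, reproducing the pooled result for feature map 1 above:

```python
import numpy as np

def max_pool(fmap, f=2, stride=2):
    out = (fmap.shape[0] - f) // stride + 1
    pooled = np.zeros((out, out), dtype=fmap.dtype)
    for i in range(out):
        for j in range(out):
            # keep only the maximum of each F x F window
            pooled[i, j] = fmap[i*stride : i*stride+f, j*stride : j*stride+f].max()
    return pooled

fmap1 = np.array([[ 3, -3, -3,  3],
                  [ 1, -2, -2,  1],
                  [ 1, -2, -2,  1],
                  [-1, -1, -1, -1]])
print(max_pool(fmap1))   # [[3 3]
                         #  [1 1]]
```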
Convolve, Pool, Repeat
[Figure: Convolution followed by Max Pooling, repeated several times.]
The output can be regarded as a new image:
• Smaller than the original image
• Its depth is the number of filters
Transfer Learning
• Features learned by CNNs on a large dataset can be helpful for other tasks. It is very common to pre-train a CNN on ImageNet and then use it as a fixed feature extractor or as a network initialisation.
• Feature extractor: remove the last layer and use the remaining network to extract representations from the hidden layers directly, which can then be utilised as features for other applications.
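A sketch of the feature-extractor recipe in Keras, assuming a ResNet50 backbone pre-trained on ImageNet (the backbone, input size, and head are illustrative choices, not specified in the slides):

```python
from tensorflow import keras
from tensorflow.keras import layers

# Pre-trained backbone with its final classification layer removed
base = keras.applications.ResNet50(weights="imagenet", include_top=False,
                                   input_shape=(224, 224, 3), pooling="avg")
base.trainable = False   # use it as a fixed feature extractor

# New task-specific head trained on the extracted representations
model = keras.Sequential([
    base,
    layers.Dense(10, activation="softmax"),   # e.g. a new 10-class task
])
```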