Dlai DL CNN

Uploaded by

Guillaume Rossi


Copyright Notice

These slides are distributed under the Creative Commons License.

DeepLearning.AI makes these slides available for educational purposes. You may not use or distribute
these slides for commercial purposes. You may make copies of these slides and use or distribute them for
educational purposes as long as you cite DeepLearning.AI as the source of the slides.

For the rest of the details of the license, see https://creativecommons.org/licenses/by-sa/2.0/legalcode

Convolutional
Neural Networks

Computer vision
deeplearning.ai
Computer Vision Problems
Image Classification Neural Style Transfer

Cat? (0/1)

64x64

Object detection

Andrew Ng
Deep Learning on large images

Cat? (0/1)

64x64

$x_1, x_2, \dots, x_n \rightarrow \hat{y}$
Convolutional
Neural Networks

Edge detection example
Computer Vision Problem

vertical edges

horizontal edges

Vertical edge detection
6×6 image:
3 0 1 2 7 4
1 5 8 9 3 1
2 7 2 5 1 3
0 1 3 1 7 8
4 2 1 6 2 8
2 4 5 2 3 9

∗ 3×3 filter:
1 0 -1
1 0 -1
1 0 -1

= 4×4 output (rows 3 and 4 as printed on the slide):
0 -2 -4 -7
-3 -2 -3 -16
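The vertical edge detection example can be reproduced with a short NumPy sketch. It slides the 3×3 filter over the 6×6 image with no padding and stride 1 and sums the elementwise products at each position. The function name `conv2d_valid` is illustrative, not from the slides.

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Slide `kernel` over `image` (no padding, stride 1) and sum elementwise products."""
    n_h, n_w = image.shape
    f_h, f_w = kernel.shape
    out = np.zeros((n_h - f_h + 1, n_w - f_w + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + f_h, j:j + f_w] * kernel)
    return out

# 6x6 image and vertical edge filter from the slide
image = np.array([
    [3, 0, 1, 2, 7, 4],
    [1, 5, 8, 9, 3, 1],
    [2, 7, 2, 5, 1, 3],
    [0, 1, 3, 1, 7, 8],
    [4, 2, 1, 6, 2, 8],
    [2, 4, 5, 2, 3, 9],
])
vertical = np.array([[1, 0, -1]] * 3)

result = conv2d_valid(image, vertical)
print(result)
```

The last two rows of `result`, [0, -2, -4, -7] and [-3, -2, -3, -16], match the values printed on the slide.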
Vertical edge detection
6×6 image:
10 10 10 0 0 0
10 10 10 0 0 0
10 10 10 0 0 0
10 10 10 0 0 0
10 10 10 0 0 0
10 10 10 0 0 0

∗ 3×3 filter:
1 0 -1
1 0 -1
1 0 -1

= 4×4 output:
0 30 30 0
0 30 30 0
0 30 30 0
0 30 30 0
Convolutional
Neural Networks

More edge detection
Vertical edge detection examples
Light-to-dark edge, 6×6 image:
10 10 10 0 0 0
10 10 10 0 0 0
10 10 10 0 0 0
10 10 10 0 0 0
10 10 10 0 0 0
10 10 10 0 0 0

∗ 3×3 filter:
1 0 -1
1 0 -1
1 0 -1

= 4×4 output:
0 30 30 0
0 30 30 0
0 30 30 0
0 30 30 0

Dark-to-light edge, 6×6 image:
0 0 0 10 10 10
0 0 0 10 10 10
0 0 0 10 10 10
0 0 0 10 10 10
0 0 0 10 10 10
0 0 0 10 10 10

∗ 3×3 filter:
1 0 -1
1 0 -1
1 0 -1

= 4×4 output:
0 -30 -30 0
0 -30 -30 0
0 -30 -30 0
0 -30 -30 0
Vertical and Horizontal Edge Detection
Vertical:
1 0 -1
1 0 -1
1 0 -1

Horizontal:
1 1 1
0 0 0
-1 -1 -1

6×6 image:
10 10 10 0 0 0
10 10 10 0 0 0
10 10 10 0 0 0
0 0 0 10 10 10
0 0 0 10 10 10
0 0 0 10 10 10

∗ horizontal filter:
1 1 1
0 0 0
-1 -1 -1

= 4×4 output:
0 0 0 0
30 10 -10 -30
30 10 -10 -30
0 0 0 0
Learning to detect edges
1 0 -1
1 0 -1
1 0 -1

6×6 image:
3 0 1 2 7 4
1 5 8 9 3 1
2 7 2 5 1 3
0 1 3 1 7 8
4 2 1 6 2 8
2 4 5 2 3 9

∗ 3×3 filter with learned parameters:
$w_1$ $w_2$ $w_3$
$w_4$ $w_5$ $w_6$
$w_7$ $w_8$ $w_9$
Convolutional
Neural Networks

Padding
deeplearning.ai
Padding

∗ =

Andrew Ng
Valid and Same convolutions

“Valid”:

“Same”: Pad so that the output size is the same as the input size.
Convolutional
Neural Networks

Strided convolutions
Strided convolution
7×7 image:
2 3 7 4 6 2 9
6 6 9 8 7 4 3
3 4 8 3 8 9 7
7 8 3 6 6 3 4
4 2 1 8 3 4 6
3 2 4 1 9 8 3
0 1 3 9 2 1 4

∗ 3×3 filter:
3 4 4
1 0 2
-1 0 3

=
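A minimal sketch of a strided convolution on the 7×7 example. The stride of 2 is an assumption (the extracted slide text does not state it), and the helper name `conv2d_strided` is illustrative. The filter moves `stride` positions at a time, so the output is ⌊(7 − 3)/2⌋ + 1 = 3 on each side.

```python
import numpy as np

def conv2d_strided(image, kernel, stride=1):
    """Cross-correlate a square image with a square kernel, no padding."""
    n, f = image.shape[0], kernel.shape[0]
    out_size = (n - f) // stride + 1
    out = np.zeros((out_size, out_size))
    for i in range(out_size):
        for j in range(out_size):
            r, c = i * stride, j * stride  # top-left corner of the current window
            out[i, j] = np.sum(image[r:r + f, c:c + f] * kernel)
    return out

image = np.array([
    [2, 3, 7, 4, 6, 2, 9],
    [6, 6, 9, 8, 7, 4, 3],
    [3, 4, 8, 3, 8, 9, 7],
    [7, 8, 3, 6, 6, 3, 4],
    [4, 2, 1, 8, 3, 4, 6],
    [3, 2, 4, 1, 9, 8, 3],
    [0, 1, 3, 9, 2, 1, 4],
])
kernel = np.array([
    [3, 4, 4],
    [1, 0, 2],
    [-1, 0, 3],
])

out = conv2d_strided(image, kernel, stride=2)  # stride 2 is assumed here
print(out.shape)
print(out)
```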
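Summary of convolutions

$n \times n$ image, $f \times f$ filter

padding $p$, stride $s$

Output size: $\left\lfloor \frac{n + 2p - f}{s} + 1 \right\rfloor \times \left\lfloor \frac{n + 2p - f}{s} + 1 \right\rfloor$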
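The output-size formula can be checked with a few lines of Python. The function names are illustrative. `same_padding` assumes stride 1 and an odd filter size, in which case p = (f − 1)/2 keeps the output the same size as the input.

```python
def conv_output_size(n, f, p=0, s=1):
    """Output side length: floor((n + 2p - f) / s) + 1."""
    return (n + 2 * p - f) // s + 1

def same_padding(f):
    """Padding for a 'same' convolution, assuming stride 1 and odd f."""
    return (f - 1) // 2

print(conv_output_size(6, 3))                     # "valid": 6x6 image, 3x3 filter
print(conv_output_size(6, 3, p=same_padding(3)))  # "same": output size equals input size
print(conv_output_size(7, 3, p=0, s=2))           # strided: 7x7 image, 3x3 filter, stride 2
```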
Technical note on cross-correlation vs. convolution
Convolution in math textbook:

6×6 image:
2 3 7 4 6 2
6 6 9 8 7 4
3 4 8 3 8 9
7 8 3 6 6 3
4 2 1 8 3 4
3 2 4 1 9 8

∗ 3×3 filter:
3 4 5
1 0 2
-1 9 7
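A sketch of the distinction, using the image and filter from this slide. The textbook convolution flips the filter along both axes before sliding it; the operation used in deep learning skips the flip and is, strictly speaking, cross-correlation. The flip below is the standard 180° rotation, and the function names are illustrative.

```python
import numpy as np

def cross_correlate(image, kernel):
    """What deep learning calls 'convolution': slide the kernel without flipping it."""
    n, f = image.shape[0], kernel.shape[0]
    out = np.zeros((n - f + 1, n - f + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + f, j:j + f] * kernel)
    return out

def convolve_textbook(image, kernel):
    """Math-textbook convolution: flip the kernel on both axes, then slide it."""
    return cross_correlate(image, kernel[::-1, ::-1])

image = np.array([
    [2, 3, 7, 4, 6, 2],
    [6, 6, 9, 8, 7, 4],
    [3, 4, 8, 3, 8, 9],
    [7, 8, 3, 6, 6, 3],
    [4, 2, 1, 8, 3, 4],
    [3, 2, 4, 1, 9, 8],
])
kernel = np.array([
    [3, 4, 5],
    [1, 0, 2],
    [-1, 9, 7],
])

flipped = kernel[::-1, ::-1]
print(flipped)
print(cross_correlate(image, kernel)[0, 0], convolve_textbook(image, kernel)[0, 0])
```

Because the filter weights are learned, the flip makes no practical difference to a network, which is why it is usually omitted.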
Convolutional
Neural Networks

Convolutions over volumes
Convolutions on RGB images

Andrew Ng
Convolutions on RGB image

∗ =

4x4

Andrew Ng
Multiple filters

6x6x3 ∗ 3x3x3 (first filter) = 4x4
6x6x3 ∗ 3x3x3 (second filter) = 4x4
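A minimal sketch of convolving a 6×6×3 volume with two 3×3×3 filters. Each filter spans all input channels and produces one 4×4 map; stacking the maps gives a 4×4×2 output. The random inputs and the function name `conv_volume` are placeholders for illustration.

```python
import numpy as np

def conv_volume(volume, filters):
    """volume: (n, n, n_c); filters: (f, f, n_c, n_filters) -> (n-f+1, n-f+1, n_filters)."""
    n = volume.shape[0]
    f, _, n_c, n_filters = filters.shape
    assert volume.shape[2] == n_c, "filter depth must match the number of input channels"
    out = np.zeros((n - f + 1, n - f + 1, n_filters))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            patch = volume[i:i + f, j:j + f, :]
            for k in range(n_filters):
                # Sum over height, width and channels for filter k
                out[i, j, k] = np.sum(patch * filters[..., k])
    return out

rng = np.random.default_rng(0)
volume = rng.standard_normal((6, 6, 3))      # e.g. an RGB image
filters = rng.standard_normal((3, 3, 3, 2))  # two 3x3x3 filters
out = conv_volume(volume, filters)
print(out.shape)
```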
Convolutional
Neural Networks

One layer of a convolutional network
Example of a layer

∗
3x3x3

6x6x3
∗
3x3x3

Andrew Ng
Number of parameters in one layer

If you have 10 filters that are 3 x 3 x 3 in one layer of a neural network, how many parameters does that layer have?
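The count can be worked out directly: each 3 × 3 × 3 filter has 27 weights plus one bias, so 10 filters give (27 + 1) × 10 parameters. A small sketch, with an illustrative function name:

```python
def conv_layer_params(f, n_c_prev, n_filters):
    """Weights (f * f * n_c_prev) plus one bias per filter, times the number of filters."""
    return (f * f * n_c_prev + 1) * n_filters

n_params = conv_layer_params(f=3, n_c_prev=3, n_filters=10)
print(n_params)
```

Note that the count does not depend on the size of the input image.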
Summary of notation
If layer $l$ is a convolution layer:
$f^{[l]}$ = filter size
$p^{[l]}$ = padding
$s^{[l]}$ = stride
$n_c^{[l]}$ = number of filters
Input:
Output:
Each filter is:
Activations:
Weights:
bias:
Convolutional
Neural Networks

A simple convolution network example
Example ConvNet

Andrew Ng
Types of layer in a convolutional network:

- Convolution
- Pooling
- Fully connected

Andrew Ng
Convolutional
Neural Networks

Pooling layers
deeplearning.ai
Pooling layer: Max pooling

1 3 2 1
2 9 1 1
1 3 2 3
5 6 1 2

Andrew Ng
Pooling layer: Max pooling

1 3 2 1 3
2 9 1 1 5
1 3 2 3 2
8 3 5 1 0
5 6 1 2 9

Andrew Ng
Pooling layer: Average pooling

1 3 2 1
2 9 1 1
1 4 2 3
5 6 1 2

Andrew Ng
Summary of pooling
Hyperparameters:

f : filter size
s : stride
Max or average pooling
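The pooling examples from the preceding slides can be reproduced with a short sketch. The window sizes and strides below (f = 2, s = 2 for the 4×4 inputs, and f = 3, s = 1 for the 5×5 input) are assumptions, since the extracted text does not state them. The function name `pool2d` is illustrative.

```python
import numpy as np

def pool2d(x, f, s, mode="max"):
    """Apply max or average pooling with window size f and stride s."""
    reduce_fn = np.max if mode == "max" else np.mean
    out_h = (x.shape[0] - f) // s + 1
    out_w = (x.shape[1] - f) // s + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = reduce_fn(x[i * s:i * s + f, j * s:j * s + f])
    return out

x_max = np.array([[1, 3, 2, 1], [2, 9, 1, 1], [1, 3, 2, 3], [5, 6, 1, 2]])
x_avg = np.array([[1, 3, 2, 1], [2, 9, 1, 1], [1, 4, 2, 3], [5, 6, 1, 2]])
x_5x5 = np.array([
    [1, 3, 2, 1, 3],
    [2, 9, 1, 1, 5],
    [1, 3, 2, 3, 2],
    [8, 3, 5, 1, 0],
    [5, 6, 1, 2, 9],
])

print(pool2d(x_max, f=2, s=2, mode="max"))
print(pool2d(x_5x5, f=3, s=1, mode="max"))
print(pool2d(x_avg, f=2, s=2, mode="avg"))
```

Pooling has hyperparameters (f, s, max or average) but no parameters to learn.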

Andrew Ng
Convolutional
Neural Networks

Convolutional neural network example
Neural network example

# parameters (table column): 608, 3216, 48120, 10164, 850
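The five numbers above are consistent with the parameter counts of a small LeNet-style network. The layer sizes in this sketch are an assumption, since they do not survive in the extracted text: a 32×32×3 input, two 5×5 convolution layers with 8 and 16 filters, a flattened 5×5×16 = 400 volume, then fully connected layers of 120, 84 and 10 units.

```python
def conv_params(f, n_c_prev, n_filters):
    """(f * f * n_c_prev weights + 1 bias) per filter."""
    return (f * f * n_c_prev + 1) * n_filters

def fc_params(n_in, n_out):
    """n_in * n_out weights plus n_out biases."""
    return n_in * n_out + n_out

# Assumed layer sizes (not stated in the extracted slide text)
counts = [
    conv_params(5, 3, 8),        # CONV1
    conv_params(5, 8, 16),       # CONV2
    fc_params(5 * 5 * 16, 120),  # FC3
    fc_params(120, 84),          # FC4
    fc_params(84, 10),           # softmax
]
print(counts)
```

Under these assumptions most parameters sit in the fully connected layers, while the convolution layers stay small.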
Convolutional
Neural Networks

Why convolutions?
deeplearning.ai
Why convolutions

Andrew Ng
Why convolutions
6×6 image:
10 10 10 0 0 0
10 10 10 0 0 0
10 10 10 0 0 0
10 10 10 0 0 0
10 10 10 0 0 0
10 10 10 0 0 0

∗ 3×3 filter:
1 0 -1
1 0 -1
1 0 -1

= 4×4 output:
0 30 30 0
0 30 30 0
0 30 30 0
0 30 30 0

Parameter sharing: A feature detector (such as a vertical edge detector) that’s useful in one part of the image is probably useful in another part of the image.

Sparsity of connections: In each layer, each output value depends only on a small number of inputs.
Putting it together
Training set $(x^{(1)}, y^{(1)}), \dots, (x^{(m)}, y^{(m)})$.

Cost $J = \frac{1}{m} \sum_{i=1}^{m} \mathcal{L}(\hat{y}^{(i)}, y^{(i)})$

Use gradient descent to optimize parameters to reduce $J$
Case Studies

Why look at case studies?
Outline
Classic networks:
• LeNet-5
• AlexNet
• VGG

ResNet

Inception

Andrew Ng
Case Studies

Classic networks
deeplearning.ai
LeNet - 5
32×32×1 → CONV 5×5, s=1 → 28×28×6 → avg pool f=2, s=2 → 14×14×6 → CONV 5×5, s=1 → 10×10×16 → avg pool f=2, s=2 → 5×5×16 → FC 120 → FC 84 → $\hat{y}$

[LeCun et al., 1998. Gradient-based learning applied to document recognition]

AlexNet
227×227×3 → CONV 11×11, s=4 → 55×55×96 → MAX-POOL 3×3, s=2 → 27×27×96 → CONV 5×5, same → 27×27×256 → MAX-POOL 3×3, s=2 → 13×13×256

13×13×256 → CONV 3×3, same → 13×13×384 → CONV 3×3 → 13×13×384 → CONV 3×3 → 13×13×256 → MAX-POOL 3×3, s=2 → 6×6×256 = 9216 → FC 4096 → FC 4096 → Softmax 1000

[Krizhevsky et al., 2012. ImageNet classification with deep convolutional neural networks]
VGG - 16
CONV = 3×3 filter, s = 1, same
MAX-POOL = 2×2, s = 2

224×224×3 → [CONV 64] ×2 → 224×224×64 → POOL → 112×112×64 → [CONV 128] ×2 → 112×112×128 → POOL → 56×56×128

56×56×128 → [CONV 256] ×3 → 56×56×256 → POOL → 28×28×256 → [CONV 512] ×3 → 28×28×512 → POOL → 14×14×512

14×14×512 → [CONV 512] ×3 → 14×14×512 → POOL → 7×7×512 → FC 4096 → FC 4096 → Softmax 1000

[Simonyan & Zisserman 2015. Very deep convolutional networks for large-scale image recognition]
Case Studies

Residual Networks (ResNets)
Residual block
$a^{[l]} \rightarrow a^{[l+1]} \rightarrow a^{[l+2]}$

$z^{[l+1]} = W^{[l+1]} a^{[l]} + b^{[l+1]}$
$a^{[l+1]} = g(z^{[l+1]})$
$z^{[l+2]} = W^{[l+2]} a^{[l+1]} + b^{[l+2]}$
$a^{[l+2]} = g(z^{[l+2]})$

[He et al., 2015. Deep residual networks for image recognition]
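A minimal sketch contrasting the plain two-layer path with a residual block. The skip connection, which adds $a^{[l]}$ to $z^{[l+2]}$ before the final non-linearity, is the standard ResNet formulation and is stated here as an assumption because it does not appear in the extracted slide text. ReLU is assumed for $g$, and all names are illustrative.

```python
import numpy as np

def relu(z):
    return np.maximum(0, z)

def plain_block(a_l, W1, b1, W2, b2):
    """Two plain layers: a[l+2] = g(W2 g(W1 a[l] + b1) + b2)."""
    a1 = relu(W1 @ a_l + b1)
    return relu(W2 @ a1 + b2)

def residual_block(a_l, W1, b1, W2, b2):
    """Same layers plus a skip connection: a[l+2] = g(z[l+2] + a[l])."""
    a1 = relu(W1 @ a_l + b1)
    z2 = W2 @ a1 + b2
    return relu(z2 + a_l)

# With zero weights and biases, the residual block passes a non-negative input through unchanged
a_l = np.array([1.0, 0.5, 2.0])
W = np.zeros((3, 3))
b = np.zeros(3)
print(plain_block(a_l, W, b, W, b))
print(residual_block(a_l, W, b, W, b))
```

The zero-weight case shows why the identity function is easy for a residual block to learn, whereas the plain block collapses to zero.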
Residual Network

$x \rightarrow \dots \rightarrow a^{[l]}$

[Plots: training error vs. # layers, for Plain and ResNet]

[He et al., 2015. Deep residual networks for image recognition]
Case Studies

Why ResNets work
Why do residual networks work?

Andrew Ng
ResNet
Plain

ResNet

[He et al., 2015. Deep residual networks for image recognition] Andrew Ng
Case Studies

Network in Network and 1×1 convolutions
What does a 1 × 1 convolution do?
6×6 image:
1 2 3 6 5 8
3 5 5 1 3 4
2 1 3 4 9 3
4 7 8 5 7 9
1 5 3 7 4 8
5 4 9 8 3 5

∗ 1×1 filter:
2

=

6 × 6 × 32 ∗ 1 × 1 × 32 = 6 × 6 × # filters
[Lin et al., 2013. Network in network]
Using 1×1 convolutions
28 × 28 × 192 → CONV 1 × 1, 32 filters, ReLU → 28 × 28 × 32

[Lin et al., 2013. Network in network]

Case Studies

Inception network motivation
Motivation for inception network
Input: 28 × 28 × 192

Parallel branches, with output channels stacked:
• 1×1 → 28 × 28 × 64
• 3×3 → 28 × 28 × 128
• 5×5 → 28 × 28 × 32
• MAX-POOL → 28 × 28 × 32

[Szegedy et al. 2014. Going deeper with convolutions]

The problem of computational cost

28 × 28 × 192 → CONV 5 × 5, same, 32 → 28 × 28 × 32
Using 1×1 convolution

28 × 28 × 192 → CONV 1 × 1, 16 (filters are 1 × 1 × 192) → 28 × 28 × 16 → CONV 5 × 5, 32 (filters are 5 × 5 × 16) → 28 × 28 × 32
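The saving from the 1 × 1 bottleneck can be checked by counting multiplications, using the shapes on these two slides. The cost of a layer is taken as (number of output values) × (multiplications per output value). The function name is illustrative.

```python
def conv_multiplications(out_h, out_w, n_filters, f, n_c_in):
    """Output values (out_h * out_w * n_filters) times multiplications per value (f * f * n_c_in)."""
    return out_h * out_w * n_filters * f * f * n_c_in

# Direct 5x5 convolution: 28x28x192 -> 28x28x32
direct = conv_multiplications(28, 28, 32, 5, 192)

# Bottleneck: 1x1 convolution down to 16 channels, then 5x5 up to 32 channels
bottleneck = (conv_multiplications(28, 28, 16, 1, 192)
              + conv_multiplications(28, 28, 32, 5, 16))

print(direct, bottleneck, direct / bottleneck)
```

The direct layer needs about 120 million multiplications, and the bottleneck version about 12.4 million, roughly a tenfold reduction.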
Case Studies

Inception network
deeplearning.ai
Inception module
Previous Activation feeds four parallel branches:
• 1×1 CONV
• 1×1 CONV → 3×3 CONV
• 1×1 CONV → 5×5 CONV
• MAXPOOL 3 × 3, s = 1, same → 1×1 CONV

The branch outputs are joined by Channel Concat.
Inception network

[Szegedy et al., 2014, Going Deeper with Convolutions] Andrew Ng

http://knowyourmeme.com/memes/we-need-to-go-deeper
Convolutional
Neural Networks

MobileNet
Motivation for MobileNets
• Low computational cost at deployment
• Useful for mobile and embedded vision applications
• Key idea: Normal vs. depthwise-separable convolutions

[Howard et al. 2017, MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications]
Normal Convolution

6x6x3 * 3x3x3 = 4x4x5

Computational cost = # filter params x # filter positions x # of filters
Depthwise Separable Convolution
Normal Convolution

* =

Depthwise Separable Convolution

* * =

Depthwise Pointwise

Andrew Ng
Depthwise Convolution

6x6x3 * 3x3 = 4x4x3

Computational cost = # filter params x # filter positions x # of filters
Depthwise Separable Convolution
Depthwise Convolution

* =

Pointwise Convolution

* =

Andrew Ng
Pointwise Convolution

4x4x3 * 1x1x3 = 4x4x5

Computational cost = # filter params x # filter positions x # of filters
Cost Summary
Cost of normal convolution

Cost of depthwise separable convolution
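Applying the cost rule from the preceding slides (# filter params × # filter positions × # of filters) to the 6x6x3 → 4x4x5 example gives the following sketch. The function names are illustrative.

```python
def normal_cost(f, n_c, n_out, n_filters):
    """f*f*n_c filter params, n_out*n_out positions, n_filters filters."""
    return (f * f * n_c) * (n_out * n_out) * n_filters

def depthwise_cost(f, n_c, n_out):
    """One f*f filter per input channel, applied at n_out*n_out positions."""
    return (f * f) * (n_out * n_out) * n_c

def pointwise_cost(n_c, n_out, n_filters):
    """1x1xn_c filters applied at n_out*n_out positions."""
    return n_c * (n_out * n_out) * n_filters

normal = normal_cost(f=3, n_c=3, n_out=4, n_filters=5)
separable = depthwise_cost(f=3, n_c=3, n_out=4) + pointwise_cost(n_c=3, n_out=4, n_filters=5)
print(normal, separable, separable / normal)
```

For this example the depthwise separable version needs 672 multiplications against 2160 for the normal convolution, a ratio of about 0.31, which equals 1/5 + 1/9 (one over the number of filters plus one over f²).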

[Howard et al. 2017, MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications] Andrew Ng
Convolutional
Neural Networks

MobileNet
Architecture
MobileNet
MobileNet v1

MobileNet v2
Residual Connection

Expansion Depthwise Projection

[Sandler et al. 2019, MobileNetV2: Inverted Residuals and Linear Bottlenecks] Andrew Ng
MobileNet v2 Bottleneck
Residual Connection

Expansion Depthwise Pointwise

[Sandler et al. 2019, MobileNetV2: Inverted Residuals and Linear Bottlenecks] Andrew Ng
MobileNet v2 Full Architecture

conv2d avgpool conv2d

conv2d 1x1 7x7 1x1

[Sandler et al. 2019, MobileNetV2: Inverted Residuals and Linear Bottlenecks] Andrew Ng
Convolutional
Neural Networks

EfficientNet
EfficientNet

[Figure: a Baseline network and four scaled variants, labeled Wider, Deeper, Higher Resolution, and Compound scaling, each producing $\hat{y}$]

[Tan and Le, 2019, EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks]
Practical advice for
using ConvNets

Transfer Learning
deeplearning.ai
Practical advice for
using ConvNets

Data augmentation
deeplearning.ai
Common augmentation method
Mirroring

Random Cropping Rotation

Shearing
Local warping
…
Andrew Ng
Color shifting

+20,-20,+20

-20,+20,+20

+5,0,+50
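Two of the augmentation methods listed above, mirroring and color shifting, can be sketched in a few lines of NumPy. The image layout (height, width, RGB) with 8-bit values and the clipping to [0, 255] are assumptions, and the function names are illustrative.

```python
import numpy as np

def mirror(image):
    """Flip the image horizontally (reverse the width axis)."""
    return image[:, ::-1, :]

def color_shift(image, shift):
    """Add a per-channel (R, G, B) offset, clipping to the valid 8-bit range."""
    shifted = image.astype(np.int16) + np.array(shift, dtype=np.int16)
    return np.clip(shifted, 0, 255).astype(np.uint8)

image = np.full((2, 2, 3), 250, dtype=np.uint8)
print(color_shift(image, (20, -20, 20))[0, 0])  # shift from the slide: +20, -20, +20

ramp = np.arange(12, dtype=np.uint8).reshape(2, 2, 3)
print(mirror(ramp)[0, 0], ramp[0, 1])
```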

Andrew Ng
Implementing distortions during training

Andrew Ng
Practical advice for
using ConvNets

The state of computer vision
Data vs. hand-engineering

Two sources of knowledge

• Labeled data
• Hand engineered features/network architecture/other components
Andrew Ng
Tips for doing well on benchmarks/winning competitions
Ensembling
• Train several networks independently and average their outputs

Multi-crop at test time

• Run classifier on multiple versions of test images and average results

Andrew Ng
Use open source code

• Use architectures of networks published in the literature

• Use open source implementations if possible

• Use pretrained models and fine-tune on your dataset

Andrew Ng
Object Detection

Object localization
What are localization and detection?
Image classification | Classification with localization | Detection
Classification with localization

⋯ ⋮

1- pedestrian
2- car
3- motorcycle
4- background
Andrew Ng
Defining the target label y
1- pedestrian
2- car
3- motorcycle
4- background

Need to output $b_x, b_y, b_h, b_w$, class label (1-4)
Object Detection

Landmark detection
Landmark detection ConvNet

$b_x, b_y, b_h, b_w$
Object Detection

Object detection
Car detection example
Training set:
x y
1

Andrew Ng
Sliding windows detection

Andrew Ng
Object Detection

Convolutional implementation of sliding windows
Turning FC layer into convolutional layers

MAX POOL FC FC

5×5 2×2 ⋮ ⋮
y
14 × 14 × 3 10 × 10 × 16 5 × 5 × 16 400 400 softmax (4)

MAX POOL FC FC

5×5 2×2 5×5 1×1

14 × 14 × 3 10 × 10 × 16 5 × 5 × 16 1 × 1× 400 1 × 1 × 400 1×1×4

Andrew Ng
Convolution implementation of sliding windows
14×14×3 → 5×5 → 10×10×16 → MAX POOL 2×2 → 5×5×16 → FC (5×5) → 1×1×400 → FC (1×1) → 1×1×400 → FC (1×1) → 1×1×4

16×16×3 → 5×5 → 12×12×16 → MAX POOL 2×2 → 6×6×16 → FC (5×5) → 2×2×400 → FC (1×1) → 2×2×400 → FC (1×1) → 2×2×4

28×28×3 → 5×5 → 24×24×16 → MAX POOL 2×2 → 12×12×16 → 5×5 → 8×8×400 → 1×1 → 8×8×400 → 1×1 → 8×8×4

[Sermanet et al., 2014, OverFeat: Integrated recognition, localization and detection using convolutional networks]
Convolution implementation of sliding windows

MAX POOL

5×5 2×2 5×5 1×1 1×1

28×28 16×16 12×12 8×8×400 8×8×400 8×8×4

Andrew Ng
Object Detection

Bounding box predictions
Output accurate bounding boxes

Andrew Ng
YOLO algorithm
Labels for training
For each grid cell:

100

[Redmon et al., 2015, You Only Look Once: Unified real-time object detection] Andrew Ng
Specify the bounding boxes

100

[Redmon et al., 2015, You Only Look Once: Unified real-time object detection] Andrew Ng
Object Detection

Intersection over union
Evaluating object localization

“Correct” if IoU ≥ 0.5

More generally, IoU is a measure of the overlap between two bounding boxes.
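A minimal sketch of intersection over union for two axis-aligned boxes. The corner format (x1, y1, x2, y2) is an assumption made for illustration and is not specified on the slide.

```python
def iou(box_a, box_b):
    """IoU of two boxes given as (x1, y1, x2, y2) with x1 < x2 and y1 < y2."""
    # Corners of the intersection rectangle
    xi1, yi1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    xi2, yi2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, xi2 - xi1) * max(0.0, yi2 - yi1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union

print(iou((0, 0, 2, 2), (1, 1, 3, 3)))  # partial overlap
print(iou((0, 0, 2, 2), (0, 0, 2, 2)))  # identical boxes
print(iou((0, 0, 1, 1), (2, 2, 3, 3)))  # no overlap
```

With the "correct if IoU ≥ 0.5" convention above, only the identical-box case would count as a correct localization.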
Andrew Ng
Object Detection

Non-max suppression
Non-max suppression example

Andrew Ng
Non-max suppression example

0.6
0.8

0.9
0.3
0.5

Andrew Ng
Non-max suppression example

0.6
0.8

0.9
0.7
0.7

Andrew Ng
Non-max suppression algorithm

Each output prediction is: $[p_c, b_x, b_y, b_h, b_w]^T$ (19×19 grid)

Discard all boxes with $p_c \le 0.6$
While there are any remaining boxes:
• Pick the box with the largest $p_c$. Output that as a prediction.
• Discard any remaining box with IoU ≥ 0.5 with the box output in the previous step
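The steps above can be sketched as follows for a single class. The box format (x1, y1, x2, y2), the helper `iou`, and the example boxes are assumptions for illustration; the thresholds 0.6 and 0.5 are the ones on the slide.

```python
def iou(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2)."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union

def non_max_suppression(boxes, scores, score_threshold=0.6, iou_threshold=0.5):
    """Return (score, box) pairs that survive non-max suppression."""
    # Discard all boxes with p_c <= score_threshold
    remaining = [(s, b) for s, b in zip(scores, boxes) if s > score_threshold]
    remaining.sort(key=lambda sb: sb[0], reverse=True)
    kept = []
    while remaining:
        best = remaining.pop(0)  # box with the largest p_c
        kept.append(best)
        # Discard any remaining box that overlaps the chosen box too much
        remaining = [sb for sb in remaining if iou(best[1], sb[1]) < iou_threshold]
    return kept

boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30), (21, 21, 31, 31)]
scores = [0.9, 0.8, 0.7, 0.3]
kept = non_max_suppression(boxes, scores)
print(kept)
```

With several classes, the same procedure would be run once per class.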
Object Detection

Anchor boxes
deeplearning.ai
Overlapping objects:
Anchor box 1: Anchor box 2:

$y = [p_c, b_x, b_y, b_h, b_w, c_1, c_2, c_3]^T$

[Redmon et al., 2015, You Only Look Once: Unified real-time object detection]
Anchor box algorithm
Previously: Each object in training image is assigned to grid cell that contains that object’s midpoint.

With two anchor boxes: Each object in training image is assigned to grid cell that contains object’s midpoint and anchor box for the grid cell with highest IoU.
Anchor box example

Anchor box 1: Anchor box 2:

$y = [p_c, b_x, b_y, b_h, b_w, c_1, c_2, c_3, p_c, b_x, b_y, b_h, b_w, c_1, c_2, c_3]^T$
Object Detection

Putting it together: YOLO algorithm
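Training
1 - pedestrian
2 - car
3 - motorcycle

$y = [p_c, b_x, b_y, b_h, b_w, c_1, c_2, c_3, p_c, b_x, b_y, b_h, b_w, c_1, c_2, c_3]^T$

Grid cell with no object: $y = [0, ?, ?, ?, ?, ?, ?, ?, 0, ?, ?, ?, ?, ?, ?, ?]^T$
Grid cell with an object in the second anchor box: $y = [0, ?, ?, ?, ?, ?, ?, ?, 1, b_x, b_y, b_h, b_w, 0, 1, 0]^T$

y is 3×3×2×8

[Redmon et al., 2015, You Only Look Once: Unified real-time object detection]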
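A sketch of how a 3×3×2×8 training label could be filled in for one object. Several conventions here are assumptions: object coordinates are given as fractions of the image size, $b_x$ and $b_y$ are stored relative to the grid cell, $b_h$ and $b_w$ are stored in units of the cell size, and the caller picks the anchor index (the IoU-based anchor choice is not implemented). All names are illustrative.

```python
import numpy as np

GRID = 3       # 3x3 grid, as on the slide
N_ANCHORS = 2  # two anchor boxes
N_CLASSES = 3  # pedestrian, car, motorcycle

def empty_label():
    """All-zero label: p_c = 0 for every grid cell and anchor box."""
    return np.zeros((GRID, GRID, N_ANCHORS, 5 + N_CLASSES))

def encode_object(y, cx, cy, h, w, class_id, anchor):
    """Write one object into y; cx, cy, h, w are fractions of the image, class_id is 1-based."""
    col = min(int(cx * GRID), GRID - 1)  # grid cell containing the midpoint
    row = min(int(cy * GRID), GRID - 1)
    b_x, b_y = cx * GRID - col, cy * GRID - row  # midpoint relative to the cell
    b_h, b_w = h * GRID, w * GRID                # size in units of the cell size
    classes = np.zeros(N_CLASSES)
    classes[class_id - 1] = 1.0
    y[row, col, anchor] = np.concatenate(([1.0, b_x, b_y, b_h, b_w], classes))
    return row, col

y = empty_label()
row, col = encode_object(y, cx=0.5, cy=0.8, h=0.2, w=0.5, class_id=2, anchor=1)
print(y.shape, (row, col))
print(y[row, col])
```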
Making predictions
image → ⋯ → $\hat{y}$ of shape 3×3×2×8

For each grid cell: $\hat{y} = [p_c, b_x, b_y, b_h, b_w, c_1, c_2, c_3, p_c, b_x, b_y, b_h, b_w, c_1, c_2, c_3]^T$
Outputting the non-max suppressed outputs

• For each grid cell, get 2 predicted bounding boxes.

• Get rid of low probability predictions.

• For each class (pedestrian, car, motorcycle) use non-max suppression to generate final predictions.
Object Detection

Region proposals (Optional)
Region proposal: R-CNN

[Girshick et al., 2013, Rich feature hierarchies for accurate object detection and semantic segmentation]
Faster algorithms

R-CNN: Propose regions. Classify proposed regions one at a time. Output label + bounding box.

Fast R-CNN: Propose regions. Use convolution implementation of sliding windows to classify all the proposed regions.

Faster R-CNN: Use convolutional network to propose regions.

[Girshick et al., 2013. Rich feature hierarchies for accurate object detection and semantic segmentation]
[Girshick, 2015. Fast R-CNN]
[Ren et al., 2016. Faster R-CNN: Towards real-time object detection with region proposal networks]
Convolutional
Neural Networks

Semantic segmentation
with U-Net
Object Detection vs. Semantic Segmentation

Input image Object Detection Semantic Segmentation

Andrew Ng
Motivation for U-Net

Chest X-Ray Brain MRI

[Novikov et al., 2017, Fully Convolutional Architectures for Multi-Class Segmentation in Chest Radiographs]
[Dong et al., 2017, Automatic Brain Tumor Detection and Segmentation Using U-Net Based Fully Convolutional Networks ] Andrew Ng
Per-pixel class labels
000000000000000000000000
000000000000000000000000
000000000000000000000000
000000000000000000000000
000000000000000000000000
000000000000000000000000
000000000000000000000000
000000000000000000000000
000000011111100000000000
001111111111111100000000
001111111111111111111110
001111111111111111111110
000011100000000000111000
000000000000000000000000
000000000000000000000000
000000000000000000000000
000000000000000000000000

1. Car
0. Not Car
Per-pixel class labels
222222222222222222222222
222222222222222222222222
222222222222222222222222
222222222222222222222222
222222222222222222222222
222222222222222222222222
222222222222222222222222
222222222222222222222222
222222211111122222222222
221111111111111122222222
221111111111111111111112
221111111111111111111112
333311133333333333111333
333333333333333333333333
333333333333333333333333
333333333333333333333333
333333333333333333333333

1. Car
2. Building
3. Road

Segmentation Map
Deep Learning for Semantic Segmentation

$\hat{y}$

Andrew Ng
Transpose Convolution
Normal Convolution

* =

Transpose Convolution

* =

Andrew Ng
Transpose Convolution
2×2 input:
2 1
3 2

3×3 weight filter:
1 2 1
2 0 1
0 2 1

4×4 output:
0 4 0 1
10 7 6 3
0 7 0 2
6 3 4 2

filter f x f = 3 x 3 padding p = 1 stride s = 2
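A sketch of the transpose convolution for this example: a 2×2 input with values 2, 1, 3, 2, the 3×3 weight filter 1 2 1 / 2 0 1 / 0 2 1, padding 1, stride 2 and a 4×4 output. Each input value scales a copy of the filter, the copies are placed `stride` positions apart on a padded canvas, overlapping entries are summed, and the padding border is cropped. The function name and the explicit `out_size` argument are illustrative choices.

```python
import numpy as np

def transpose_conv2d(x, kernel, stride, padding, out_size):
    """Place x[i, j] * kernel at (i*stride, j*stride) on a padded canvas, sum overlaps, crop padding."""
    f = kernel.shape[0]
    canvas = np.zeros((out_size + 2 * padding, out_size + 2 * padding))
    for i in range(x.shape[0]):
        for j in range(x.shape[1]):
            r, c = i * stride, j * stride
            canvas[r:r + f, c:c + f] += x[i, j] * kernel
    return canvas[padding:padding + out_size, padding:padding + out_size]

x = np.array([[2, 1],
              [3, 2]])
kernel = np.array([[1, 2, 1],
                   [2, 0, 1],
                   [0, 2, 1]])

out = transpose_conv2d(x, kernel, stride=2, padding=1, out_size=4)
print(out)
```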
Deep Learning for Semantic Segmentation
Skip connection

$\hat{y}$

Andrew Ng
U-Net

Conv, RELU
Max Pool
Trans Conv
Skip Connection
Conv (1x1)

[Ronneberger et al., 2015, U-Net: Convolutional Networks for Biomedical Image Segmentation] Andrew Ng
U-Net

hxwx3 h x w x # classes

Conv, RELU
Max Pool
Trans Conv
Skip Connection
Conv (1x1)

[Ronneberger et al., 2015, U-Net: Convolutional Networks for Biomedical Image Segmentation] Andrew Ng
Face recognition

What is face recognition?
Face recognition

[Courtesy of Baidu] Andrew Ng

Face verification vs. face recognition
Verification
• Input image, name/ID
• Output whether the input image is that of the claimed person

Recognition
• Has a database of K persons
• Get an input image
• Output ID if the image is any of the K persons (or “not recognized”)
Face recognition

One-shot learning
deeplearning.ai
One-shot learning
Learning from one example to recognize the person again
Learning a “similarity” function
d(img1,img2) = degree of difference between images
If d(img1,img2) ≤ $\tau$
> $\tau$
Face recognition

Siamese network
deeplearning.ai
Siamese network

$x^{(1)}$ → ⋮ ⋮

$x^{(2)}$ → ⋮ ⋮

[Taigman et al., 2014. DeepFace closing the gap to human level performance]
Goal of learning

$x^{(1)}$ → ⋮ ⋮ → $f(x^{(1)})$

Parameters of NN define an encoding $f(x^{(i)})$

Learn parameters so that:
If $x^{(i)}, x^{(j)}$ are the same person, $\| f(x^{(i)}) - f(x^{(j)}) \|^2$ is small.
If $x^{(i)}, x^{(j)}$ are different persons, $\| f(x^{(i)}) - f(x^{(j)}) \|^2$ is large.
Face recognition

Triplet loss
deeplearning.ai
Learning Objective

Anchor Positive Anchor Negative

[Schroff et al.,2015, FaceNet: A unified embedding for face recognition and clustering] Andrew Ng
Loss function

Training set: 10k pictures of 1k persons

[Schroff et al.,2015, FaceNet: A unified embedding for face recognition and clustering] Andrew Ng
Choosing the triplets A,P,N

During training, if A,P,N are chosen randomly, $d(A, P) + \alpha \le d(A, N)$ is easily satisfied.

Choose triplets that are “hard” to train on.

[Schroff et al., 2015, FaceNet: A unified embedding for face recognition and clustering]
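A minimal sketch of the triplet loss for one (A, P, N) triplet, using the standard FaceNet form max(d(A, P) − d(A, N) + α, 0), where d is the squared Euclidean distance between encodings. The margin value 0.2 and the example encodings are assumptions for illustration.

```python
import numpy as np

def triplet_loss(f_a, f_p, f_n, alpha=0.2):
    """max(||f(A) - f(P)||^2 - ||f(A) - f(N)||^2 + alpha, 0) for one triplet."""
    d_ap = np.sum((f_a - f_p) ** 2)
    d_an = np.sum((f_a - f_n) ** 2)
    return max(d_ap - d_an + alpha, 0.0)

anchor = np.array([0.0, 0.0])

# Easy triplet: the negative is far away, so the constraint already holds and the loss is 0
easy = triplet_loss(anchor, np.array([0.1, 0.0]), np.array([3.0, 0.0]))

# Hard triplet: d(A, P) is about equal to d(A, N), so the loss is about alpha
hard = triplet_loss(anchor, np.array([1.0, 0.0]), np.array([1.0, 0.0]))

print(easy, hard)
```

Easy triplets contribute zero loss and so no gradient, which is why choosing hard triplets matters for training.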
Training set using triplet loss
Anchor Positive Negative

⋮ ⋮ ⋮

Andrew Ng
Face recognition

Face verification and binary classification
Learning the similarity function

$x^{(i)}$ → ⋮ → $f(x^{(i)})$
$x^{(j)}$ → ⋮ → $f(x^{(j)})$
Both encodings feed the output $\hat{y}$

[Taigman et al., 2014. DeepFace closing the gap to human level performance]
Face verification supervised learning
$ (

[Taigman et. al., 2014. DeepFace closing the gap to human level performance] Andrew Ng
Neural Style
Transfer

What is neural style transfer?
Neural style transfer

Content Style Content Style

Generated image Generated image

[Images generated by Justin Johnson] Andrew Ng
Neural Style
Transfer

What are deep ConvNets learning?
Visualizing what a deep network is learning

224×224×3 → 110×110×96 → 55×55×96 → 26×26×256 → 13×13×256 → 13×13×384 → 13×13×384 → 6×6×256 → FC 4096 → FC 4096 → $\hat{y}$

Pick a unit in layer 1. Find the nine image patches that maximize the unit’s activation.
Repeat for other units.

[Zeiler and Fergus, 2013, Visualizing and understanding convolutional networks]
Visualizing deep layers

Layer 1 Layer 2 Layer 3 Layer 4 Layer 5

Andrew Ng
Visualizing deep layers: Layer 1

Layer 1 Layer 2 Layer 3 Layer 4 Layer 5

Andrew Ng
Visualizing deep layers: Layer 2

Layer 1 Layer 2 Layer 3 Layer 4 Layer 5

Andrew Ng
Visualizing deep layers: Layer 3

Layer 1 Layer 2 Layer 3 Layer 4 Layer 5

Andrew Ng
Visualizing deep layers: Layer 3

Layer 1 Layer 2 Layer 3 Layer 4 Layer 5

Andrew Ng
Visualizing deep layers: Layer 4

Layer 1 Layer 2 Layer 3 Layer 4 Layer 5

Andrew Ng
Visualizing deep layers: Layer 5

Layer 1 Layer 2 Layer 3 Layer 4 Layer 5

Andrew Ng
Neural Style
Transfer

Cost function
deeplearning.ai
Neural style transfer cost function

Content C Style S

Generated image G
[Gatys et al., 2015. A neural algorithm of artistic style. Images on slide generated by Justin Johnson] Andrew Ng
Find the generated image G
1. Initiate G randomly
G: 100×100×3

2. Use gradient descent to minimize $J(G)$

[Gatys et al., 2015. A neural algorithm of artistic style] Andrew Ng

Neural Style
Transfer

Content cost function
Content cost function
$J(G) = \alpha J_{content}(C, G) + \beta J_{style}(S, G)$
• Say you use hidden layer $l$ to compute content cost.
• Use pre-trained ConvNet. (E.g., VGG network)
• Let $a^{[l](C)}$ and $a^{[l](G)}$ be the activation of layer $l$ on the images
• If $a^{[l](C)}$ and $a^{[l](G)}$ are similar, both images have similar content

[Gatys et al., 2015. A neural algorithm of artistic style]

Neural Style
Transfer

Style cost function

deeplearning.ai
Meaning of the “style” of an image
[Figure: activation volume of layer $l$, with dimensions $n_H$, $n_W$, $n_C$]

Say you are using layer $l$’s activation to measure “style.”

Define style as correlation between activations across channels.

How correlated are the activations across different channels?

[Gatys et al., 2015. A neural algorithm of artistic style]

Intuition about style of an image
Style image | Generated Image

[Figure: one activation volume per image, each with dimensions $n_H$, $n_W$, $n_C$]

[Gatys et al., 2015. A neural algorithm of artistic style]

Style matrix
Let $a^{[l]}_{i,j,k}$ = activation at $(i, j, k)$. $G^{[l]}$ is $n_c^{[l]} \times n_c^{[l]}$

[Gatys et al., 2015. A neural algorithm of artistic style]

Style cost function
$J_{style}^{[l]}(S, G) = \frac{1}{\left(2 n_H^{[l]} n_W^{[l]} n_C^{[l]}\right)^2} \sum_{k} \sum_{k'} \left( G^{[l](S)}_{kk'} - G^{[l](G)}_{kk'} \right)^2$

[Gatys et al., 2015. A neural algorithm of artistic style]
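A sketch of the style matrix and the per-layer style cost. It assumes the usual Gram-matrix definition, in which entry (k, k') sums the product of channel k and channel k' activations over all spatial positions, and the normalization 1/(2 n_H n_W n_C)². The random activations stand in for real layer outputs, and the function names are illustrative.

```python
import numpy as np

def gram_matrix(a):
    """Style matrix G of shape (n_c, n_c): G[k, k'] = sum over (i, j) of a[i, j, k] * a[i, j, k']."""
    n_h, n_w, n_c = a.shape
    flat = a.reshape(n_h * n_w, n_c)  # one row per spatial position
    return flat.T @ flat

def layer_style_cost(a_s, a_g):
    """Squared difference of the two style matrices, scaled by 1 / (2 * n_h * n_w * n_c)^2."""
    n_h, n_w, n_c = a_s.shape
    diff = gram_matrix(a_s) - gram_matrix(a_g)
    return np.sum(diff ** 2) / (2 * n_h * n_w * n_c) ** 2

rng = np.random.default_rng(0)
a_style = rng.standard_normal((4, 4, 3))      # stand-in for layer-l activations of S
a_generated = rng.standard_normal((4, 4, 3))  # stand-in for layer-l activations of G

G_style = gram_matrix(a_style)
print(G_style.shape)
print(layer_style_cost(a_style, a_generated))
```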

Convolutional
Networks in 1D or 3D

1D and 3D generalizations of models
Convolutions in 2D and 1D

2D input image 14×14 ∗ 2D filter 5×5

1D input: 1 20 15 3 18 12 4 17 ∗ 1D filter: 1 3 10 3 1
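A minimal 1D example using the numbers on this slide. Reading the row of thirteen numbers as an 8-sample signal (1 20 15 3 18 12 4 17) followed by a 5-tap filter (1 3 10 3 1) is an assumption. `np.correlate` with `mode="valid"` slides the filter without flipping or padding, so the output has 8 − 5 + 1 = 4 values.

```python
import numpy as np

signal = np.array([1, 20, 15, 3, 18, 12, 4, 17])  # assumed 1D input
kernel = np.array([1, 3, 10, 3, 1])               # assumed 1D filter

# Each output value is the dot product of the filter with a 5-sample window
out = np.correlate(signal, kernel, mode="valid")
print(out)
```

The 2D case on the same slide follows the same rule: a 14×14 image with a 5×5 filter gives a 10×10 output.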
3D data

Andrew Ng
3D convolution

∗
3D filter

3D volume

Andrew Ng
