
CS230: Lecture 2
Practical Approaches to Deep Learning Projects

Kian Katanforoosh, Andrew Ng, Younes Bensouda Mourri


Recap



Learning Process

[Diagram: Input → Model → Output; the Output feeds a Loss, whose Gradients update the model's parameters.]

Model = Architecture + Parameters

Things that can change:
- Activation function
- Optimizer
- Hyperparameters
- …
Logistic Regression as a Neural Network

[Diagram: "image2vector" flattens a cat image into pixel values x^(i) = (255, 231, …, 94, 142)ᵀ, each divided by 255; a single unit computes σ(wᵀx^(i) + b) = 0.73; since 0.73 > 0.5, the output is "it's a cat".]


Multi-class

[Diagram: the same flattened pixel vector x^(i) feeds three independent logistic units:
σ(wᵀx^(i) + b) = 0.12 < 0.5 → Dog? No
σ(wᵀx^(i) + b) = 0.73 > 0.5 → Cat? Yes
σ(wᵀx^(i) + b) = 0.04 < 0.5 → Giraffe? No]


Neural Network (Multi-class)

[Diagram: the flattened pixel vector x^(i) feeds a layer of three neurons, each computing σ(wᵀx^(i) + b) with its own weights w and bias b, one per class.]


Neural Network (1 hidden layer)

[Diagram: the flattened pixel vector x^(i) feeds a hidden layer of three units a_1^[1], a_2^[1], a_3^[1], followed by an output layer unit a_1^[2] = 0.73; since 0.73 > 0.5, the output is "Cat".]


Deeper network: Encoding

[Diagram: inputs x_1^(i), …, x_4^(i) feed a hidden layer a_1^[1], …, a_4^[1], then a smaller hidden layer a_1^[2], …, a_3^[2], then an output layer a_1^[3]; each layer compresses the previous representation.]

Technique called "encoding"
Summary of learnings: Introduction

• A model is defined by its architecture and its parameters.

• The labelling strategy matters for successfully training your models. For example, if you're training a 3-class (dog, cat, giraffe) classifier under the constraint of one animal per picture, you might use one-hot vectors to label your data.

• We introduced a set of notations to differentiate indices for neurons, layers and examples.

• In deep learning, feature learning replaces feature engineering.


Let’s build intuition on concrete applications



Today's outline

We will learn tips and tricks to:
- Analyze a problem from a deep learning approach
- Choose an architecture
- Choose a loss and a training strategy

I. Day'n'Night classification
II. Face verification and recognition
III. Neural style transfer (Art generation)
IV. Trigger-word detection


Day'n'Night classification

Goal: Given an image, classify it as taken "during the day" (0) or "during the night" (1)

1. Data? 10,000 images. Split? Bias?

2. Input? Resolution? (64, 64, 3)

3. Output? y = 0 or y = 1. Last activation? sigmoid

4. Architecture? A shallow network should do the job pretty well.

5. Loss? L = −[y log(ŷ) + (1 − y) log(1 − ŷ)] (an easy warm-up)
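To make steps 3 and 5 concrete, here is a minimal NumPy sketch: a single logistic unit with a sigmoid output scored against the binary cross-entropy loss above. The input shape matches the (64, 64, 3) resolution; the random data, zero-initialized weights, and variable names are illustrative, not from the lecture.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def bce_loss(y, y_hat, eps=1e-12):
    # L = -[y log(y_hat) + (1 - y) log(1 - y_hat)]
    y_hat = np.clip(y_hat, eps, 1 - eps)
    return -(y * np.log(y_hat) + (1 - y) * np.log(1 - y_hat))

# One training example: a 64x64 RGB image flattened and scaled to [0, 1].
x = np.random.rand(64 * 64 * 3)
w = np.zeros_like(x)  # parameters of a single logistic unit
b = 0.0

y_hat = sigmoid(w @ x + b)   # predicted probability of "night"
print(bce_loss(1.0, y_hat))  # loss if the true label is y = 1
```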
Summary of learnings: Day'n'Night classification

• Use a known proxy project to evaluate how much data you need.

• Be scrappy. For example, if you'd like to find a good resolution of images to use for your data, but don't have time for a large-scale experiment, approximate human-level performance by testing your friends as classifiers.


Face Verification

Goal: A school wants to use face verification for validating student IDs in facilities (dining halls, gym, pool, …)

1. Data? A picture of every student, labelled with their name.

2. Input? The camera picture of the card holder (e.g. Bertrand). Resolution? (412, 412, 3)

3. Output? y = 1 (it's you) or y = 0 (it's not you)
Face Verification

Goal: A school wants to use face verification for validating student IDs in facilities (dining halls, gym, pool, …)

4. What architecture?

Simple solution: compute the distance pixel by pixel between the database image and the input image; if it is less than a threshold, then y = 1.

Issues:
- Background lighting differences
- A person can wear make-up, grow a beard…
- The ID photo can be outdated


Face Verification

Goal: A school wants to use face verification for validating student IDs in facilities (dining halls, gym, pool, …)

4. What architecture?

Our solution: encode information about a picture in a vector.

[Diagram: a deep network maps the database image and the input image to 128-d encodings, e.g. (0.931, 0.433, 0.331, …, 0.039)ᵀ and (0.922, 0.343, 0.312, …, 0.024)ᵀ; their distance is 0.4; since 0.4 < threshold, y = 1.]

We gather all students' face encodings in a database. Given a new picture, we compute its distance to the encoding of the card holder.
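A minimal sketch of this verification step, assuming a trained `encoder` that maps a face image to a 128-d vector; the encoder, the database layout, and the threshold value below are hypothetical illustrations, not the lecture's implementation.

```python
import numpy as np

def verify(encoder, input_image, database, student_id, threshold=0.7):
    """Return True (y = 1) if input_image matches the stored encoding."""
    enc_input = encoder(input_image)    # 128-d encoding of the new picture
    enc_stored = database[student_id]   # precomputed encoding of the card holder
    distance = np.linalg.norm(enc_input - enc_stored)
    return distance < threshold

# Illustrative usage with a dummy encoder and a one-student database:
dummy_encoder = lambda img: np.full(128, 0.1)
db = {"bertrand": dummy_encoder(None)}
print(verify(dummy_encoder, None, db, "bertrand"))  # True: distance is 0
```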
Face Recognition

Goal: A school wants to use face verification for validating student IDs in facilities (dining hall, gym, pool, …)

4. Loss? Training?

We need more data so that our model understands how to encode: use public face datasets.

What we really want: similar encodings for pictures of the same person, different encodings for pictures of different people.

So let's generate triplets (anchor, positive, negative):
- minimize the encoding distance between anchor and positive
- maximize the encoding distance between anchor and negative




Recap: Learning Process

[Diagram: the triplet (anchor, positive, negative) is the Input; the Model (Architecture + Parameters) produces the three encodings Enc(A), Enc(P), Enc(N) as Output; they feed the Loss, whose Gradients update the Parameters.]

L = ‖Enc(A) − Enc(P)‖₂² − ‖Enc(A) − Enc(N)‖₂²
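As a minimal NumPy sketch, the triplet loss can be written directly from the formula above. The margin `alpha` and the clipping at zero are common refinements (as in FaceNet) that the slide's formula omits; they are noted as such in the code.

```python
import numpy as np

def triplet_loss(enc_a, enc_p, enc_n, alpha=0.2):
    # L = ||Enc(A) - Enc(P)||_2^2 - ||Enc(A) - Enc(N)||_2^2 + alpha, clipped at 0.
    # The slide's version is the special case alpha = 0 without the clipping.
    d_pos = np.sum((enc_a - enc_p) ** 2)
    d_neg = np.sum((enc_a - enc_n) ** 2)
    return max(d_pos - d_neg + alpha, 0.0)

# Illustrative: anchor close to the positive, far from the negative -> loss 0.
a, p, n = np.zeros(128), np.full(128, 0.01), np.full(128, 0.5)
print(triplet_loss(a, p, n))
```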
Face Recognition

Goal: A school wants to use face identification to recognize students in facilities (dining hall, gym, pool, …)
Approach: K-Nearest Neighbors

Goal: You want to use face clustering to group pictures of the same people on your smartphone
Approach: the K-Means algorithm

Maybe we need to detect the faces first?
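As a rough sketch of the identification step above: nearest-neighbour search (the K = 1 case) over the encoding database. The encoder and database contents are hypothetical; a real system would also reject matches whose distance exceeds a threshold.

```python
import numpy as np

def identify(encoder, image, database):
    """Return the name whose stored encoding is closest to the image's encoding."""
    enc = encoder(image)
    names = list(database.keys())
    distances = [np.linalg.norm(enc - database[name]) for name in names]
    return names[int(np.argmin(distances))]
```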
Summary of learnings: Face Recognition

• In face verification, we have used an encoder network to learn a lower-dimensional representation (called an "encoding") for a set of data by training the network to focus on non-noisy signals.

• Triplet loss is a loss function where an (anchor) input is compared to a positive input and a negative input. The distance from the anchor to the positive input is minimized, whereas the distance from the anchor to the negative input is maximized.

• You learnt the difference between face verification, face identification and face clustering.


Art generation (Neural Style Transfer)

Goal: Given a picture, make it look beautiful

1. Data? Let's say we have any data

2. Input? A content image and a style image

3. Output? A generated image

Leon A. Gatys, Alexander S. Ecker, Matthias Bethge: A Neural Algorithm of Artistic Style, 2015
Art generation (Neural Style Transfer)

4. Architecture? We use a pre-trained model because it extracts important information from images.

[Diagram: a deep network pretrained on ImageNet classification encodes the content image as a vector Content_C = (0.43, 0.39, …, 0.53)ᵀ; the same network processes the style image, and a Gram matrix of its activations yields Style_S = (0.13, 0.32, …, 0.92)ᵀ.]

Leon A. Gatys, Alexander S. Ecker, Matthias Bethge: A Neural Algorithm of Artistic Style, 2015
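The Gram matrix that summarizes style is just the channel-by-channel correlation of a layer's activations. A minimal sketch, assuming activations of shape (channels, height, width); the shapes and random input are illustrative.

```python
import numpy as np

def gram_matrix(features):
    # features: one layer's activations, shape (channels, height, width).
    c, h, w = features.shape
    f = features.reshape(c, h * w)  # one row per channel
    return f @ f.T                  # (c, c) matrix of channel correlations

print(gram_matrix(np.random.rand(8, 4, 4)).shape)  # (8, 8)
```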
Art generation (Neural Style Transfer)

Image generation process

[Diagram: the generated image is passed through the pretrained network to obtain Content_G = (0.29, 0.31, …, 0.44)ᵀ and, via the Gram matrix, Style_G = (0.12, 0.10, …, 0.92)ᵀ; a loss compares these to Content_C = (0.43, 0.39, …, 0.53)ᵀ and Style_S = (0.13, 0.32, …, 0.92)ᵀ, and the pixels are updated using the gradients ∂L/∂x. After 2000 iterations, the generated image combines the content and the style.]

Leon A. Gatys, Alexander S. Ecker, Matthias Bethge: A Neural Algorithm of Artistic Style, 2015
Art generation (Neural Style Transfer)

Which loss should we minimize?

Art generation (Neural Style Transfer)

[Same image generation diagram as above, now annotated with the loss:]

L = ‖Content_C − Content_G‖₂² + ‖Style_S − Style_G‖₂²

Leon A. Gatys, Alexander S. Ecker, Matthias Bethge: A Neural Algorithm of Artistic Style, 2015
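A minimal sketch of the loss above in NumPy. In practice the Content and Style vectors would come from the pretrained network's activations, and the gradient ∂L/∂x would be obtained by backpropagating through that (frozen) network; the function below only shows the loss itself.

```python
import numpy as np

def nst_loss(content_c, content_g, style_s, style_g):
    # L = ||Content_C - Content_G||_2^2 + ||Style_S - Style_G||_2^2
    content_term = np.sum((content_c - content_g) ** 2)
    style_term = np.sum((style_s - style_g) ** 2)
    return content_term + style_term
```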
[Figure: content image]
Summary of learnings: Neural Style Transfer

• In the neural style transfer algorithm proposed by Gatys et al., you optimize image pixels rather than model parameters. Model parameters are pretrained and non-trainable.

• You leverage the "knowledge" of a pretrained model to extract the content of a content image and the style of a style image.

• The loss proposed by Gatys et al. aims to minimize the distances between the content of the generated and content images, and the style of the generated and style images.


Trigger word detection

Goal: Given a 10-second audio clip, detect the word "activate".

1. Data? A bunch of 10s audio clips. Distribution?

2. Input? x = a 10-second audio clip. Resolution? (sample rate)

3. Output? y = 0 or y = 1
Let’s have an experiment!

y=1
y=0
y=1

y = 000000000000000000000000000000000000000010000000000

y = 000000000000000000000000000000000000000000000000000

y = 000000000001000000000000000000000000000000000000000
Trigger word detection

Goal: Given a 10-second audio clip, detect the word "activate".

1. Data? A bunch of 10s audio clips. Distribution?

2. Input? x = a 10-second audio clip. Resolution? (sample rate)

3. Output? y = 0 or y = 1
           y = 00..0000100000..000 (sequential)
           y = 00..00001..1000..000 (sequential)
   Last activation? sigmoid

4. Architecture? Sounds like it should be an RNN

5. Loss? L = −(y log(ŷ) + (1 − y) log(1 − ŷ)) (applied sequentially)
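A minimal sketch of the sequential version of this loss: the binary cross-entropy is applied at every timestep of the label sequence and averaged. The sequence length, label window, and predicted probabilities below are illustrative.

```python
import numpy as np

def sequential_bce(y, y_hat, eps=1e-12):
    # Per-timestep L = -(y log(y_hat) + (1 - y) log(1 - y_hat)), then averaged.
    y_hat = np.clip(y_hat, eps, 1 - eps)
    per_step = -(y * np.log(y_hat) + (1 - y) * np.log(1 - y_hat))
    return per_step.mean()

# Labels: 1s for a short window right after the trigger word, 0 elsewhere.
y = np.zeros(50)
y[30:34] = 1.0
y_hat = np.full(50, 0.1)
y_hat[30:34] = 0.8
print(sequential_bce(y, y_hat))
```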
Trigger word detection

What is critical to the success of this project?

1. Strategic data collection / labelling process:
   positive word, negative words, background noise; automated labelling + error analysis.

2. Architecture search & hyperparameter tuning:
   [Diagram: two candidate architectures. (a) Fourier transform → LSTM at every timestep → σ at every timestep → 000000..000001..10000..000. (b) Fourier transform → CONV + BN → stacked GRU + BN layers → σ at every timestep → 000000..000001..10000..000.]
   Never give up.
Summary of learnings: Trigger word detection

• Your data collection strategy is critical to the success of your project. (If applicable) Don't hesitate to get out of the building.

• You can gain insights on your labelling strategy by using a human experiment.

• Refer to expert advice to save time and be guided towards a good direction.


Featured in the magazine "The Most Beautiful Loss Functions of 2015"

Joseph Redmon, Santosh Divvala, Ross Girshick, Ali Farhadi: You Only Look Once: Unified, Real-Time Object Detection
Duties for next week

For Tuesday 10/08, 8am:

C1M3
• Quiz: Shallow Neural Networks
• Programming Assignment: Planar data classification with one hidden layer

C1M4
• Quiz: Deep Neural Networks
• Programming Assignment: Building a deep neural network - Step by Step
• Programming Assignment: Deep Neural Network Application

Others:
• TA project mentorship (mandatory this week)
• Friday TA section (10/04)
• Fill in the AWS form to get GPU credits for your projects
