Lecture 3: Transfer Learning

The document outlines an introductory course on Applied Machine Learning led by Dr. Tao Han at the New Jersey Institute of Technology. It covers key concepts such as transfer learning, model fine-tuning, multitask learning, domain adaptation, and zero-shot learning, emphasizing their applications and challenges. The course is designed to equip students with practical skills in training classifiers and adapting models to different tasks and domains.

ECE 498 ST: Introduction to Applied Machine Learning

• Tao Han, Ph.D.
• Associate Professor
• Electrical and Computer Engineering
• Newark College of Engineering
• New Jersey Institute of Technology
• https://tao-han-njit.netlify.app

Slides are based on Prof. Hung-yi Lee's Machine Learning courses at National Taiwan University.

Transfer Learning

Example: a Dog/Cat Classifier (input: image; output: cat or dog), plus data not directly related to the task considered:

• elephant and tiger images — similar domain, different tasks
• cat and dog images from a different source — different domains, same task


Transfer Learning - Overview

Source data: labelled (not directly related to the task)
• Target data labelled → Model Fine-tuning
• Target data unlabeled → (addressed later in this lecture)

Warning: different terminology in different literature


Model Fine-tuning

One-shot learning: only a few examples in the target domain.

• Task description
  • Source data: $(x^s, y^s)$ — a large amount
  • Target data: $(x^t, y^t)$ — very little
• Example: (supervised) speaker adaptation
  • Source data: audio data and transcriptions from many speakers
  • Target data: audio data and its transcriptions from a specific user
• Idea: train a model on the source data, then fine-tune the model on the target data (a minimal sketch follows below)
• Challenge: only limited target data, so be careful about overfitting
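As a concrete illustration, here is a minimal PyTorch sketch of the fine-tuning recipe: start from the source-trained model and continue training briefly on the small target set with a low learning rate. The model, data loader, and hyperparameters are assumptions for illustration, not from the slides.

```python
# A minimal fine-tuning sketch (illustrative assumptions throughout).
import torch
import torch.nn as nn

def fine_tune(pretrained_model: nn.Module, target_loader, epochs: int = 3):
    # A small learning rate keeps the model close to the source-trained
    # solution, which helps avoid overfitting the tiny target set.
    optimizer = torch.optim.SGD(pretrained_model.parameters(), lr=1e-4)
    loss_fn = nn.CrossEntropyLoss()
    pretrained_model.train()
    for _ in range(epochs):
        for x_t, y_t in target_loader:  # very little labelled target data
            optimizer.zero_grad()
            loss = loss_fn(pretrained_model(x_t), y_t)
            loss.backward()
            optimizer.step()
    return pretrained_model
```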
Conservative Training

[Figure: a network trained on source data (e.g., audio data of many speakers) initializes a network trained on target data (e.g., a little audio data from the target speaker). Constraints keep the new network's output and/or parameters close to the original network's.]

Keeping the fine-tuned model close to the source-trained model guards against overfitting the small target set. A minimal sketch of the parameter-closeness variant follows below.
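Below is a minimal sketch of the parameter-closeness variant: an L2 penalty pulls the fine-tuned weights toward the source-trained ones. The penalty weight `alpha` and the surrounding training loop are assumptions for illustration.

```python
# Conservative training via an L2 penalty toward the pretrained weights.
import torch
import torch.nn as nn

def conservative_loss(model: nn.Module, ref_model: nn.Module,
                      task_loss: torch.Tensor, alpha: float = 0.01):
    # Penalize deviation of each parameter from its source-trained value.
    closeness = sum(
        ((p - p_ref.detach()) ** 2).sum()
        for p, p_ref in zip(model.parameters(), ref_model.parameters())
    )
    return task_loss + alpha * closeness

# Usage (hypothetical): keep a frozen copy of the pretrained model as the
# reference, e.g. ref_model = copy.deepcopy(pretrained_model), then
# minimize conservative_loss(model, ref_model, ce(model(x_t), y_t)).
```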
Layer Transfer

Copy some parameters (whole layers) from the source-trained network into the target network, then:
1. Train only the remaining layers (prevents overfitting when target data is scarce).
2. Fine-tune the whole network (if there is sufficient target data).

A freezing sketch follows below.
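A minimal PyTorch sketch of step 1, assuming a model whose transferred layers live under a `features` attribute (an illustrative assumption about the architecture):

```python
import torch.nn as nn

def freeze_transferred_layers(model: nn.Module):
    # Step 1: freeze the copied layers so only the rest are trained.
    for param in model.features.parameters():  # `features` is assumed
        param.requires_grad = False
    # Hand only the still-trainable parameters to the optimizer.
    return [p for p in model.parameters() if p.requires_grad]

# Step 2 (if there is sufficient target data): set requires_grad = True
# everywhere again and fine-tune the whole network with a small
# learning rate.
```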
Layer Transfer
• Which layers can be transferred (copied)?
• Image: usually copy the first few layers, since early layers capture generic low-level patterns while later layers become task-specific.

[Figure: a network maps pixels x1, x2, …, xN through Layer 1, Layer 2, …, Layer L to an output label such as "elephant".]
Transfer Learning - Overview

Source data: labelled (not directly related to the task)
• Target data labelled → Model Fine-tuning, Multitask Learning
• Target data unlabeled → (addressed later in this lecture)

Warning: different terminology in different literature


Multitask Learning
• The multi-layer structure makes NNs suitable for multitask learning.

[Figure: two architectures. Left: a single input feature feeds shared lower layers that branch into separate output layers for Task A and Task B. Right: separate input features for Task A and Task B pass through their own lower layers, share some middle layers, then branch into task-specific outputs.]

A minimal sketch of the shared-lower-layers variant follows below.
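A minimal sketch of the left (shared lower layers) architecture; the layer sizes and class counts are arbitrary assumptions:

```python
import torch
import torch.nn as nn

class MultitaskNet(nn.Module):
    def __init__(self, in_dim=128, hidden=256, classes_a=10, classes_b=5):
        super().__init__()
        # Shared lower layers: representations common to both tasks.
        self.shared = nn.Sequential(
            nn.Linear(in_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
        )
        # Task-specific output heads.
        self.head_a = nn.Linear(hidden, classes_a)
        self.head_b = nn.Linear(hidden, classes_b)

    def forward(self, x):
        h = self.shared(x)
        return self.head_a(h), self.head_b(h)

# Training sums the per-task losses, e.g.:
# loss = loss_fn_a(out_a, y_a) + loss_fn_b(out_b, y_b)
```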
Multitask Learning - Multilingual Speech Recognition

[Figure: acoustic features feed a network whose lower layers are shared across languages, with separate output layers predicting the states of French, German, Spanish, Italian, and Mandarin.]

Human languages share some common characteristics, so the shared layers can learn from all of them.

Similar idea in translation: Daxiang Dong, Hua Wu, Wei He, Dianhai Yu, and Haifeng Wang, "Multi-task learning for multiple language translation", ACL 2015.
Multitask Learning - Multilingual

[Figure: character error rate (roughly 25-50%) vs. hours of Mandarin training data (1 to 1000, log scale). Across the whole range, training jointly with European languages yields a lower error rate than training on Mandarin only.]

Huang, Jui-Ting, et al., "Cross-language knowledge transfer using multilingual deep neural network with shared hidden layers", ICASSP, 2013.
Transfer Learning - Overview

Source data: labelled (not directly related to the task)
• Target data labelled → Model Fine-tuning, Multitask Learning
• Target data unlabeled → Domain Adaptation

Warning: different terminology in different literature


You have learned a lot about ML. Training a classifier is not a big deal for you. ☺

[Figure: a digit classifier reaches 99.5% accuracy on the training data but only 57.5% on testing data drawn from a different distribution.]

The results are from: http://proceedings.mlr.press/v37/ganin15.pdf

Domain shift: training and testing data have different distributions. The remedy is domain adaptation.
Domain Shift

[Figure: training data from the source domain vs. testing data from the target domain. The shift can affect the input distribution (the same digits rendered differently), the output distribution over classes 1-5, or even the labeling itself (the same image labeled "0" in one domain and "1" in the other).]
Domain Adaptation (with labeled source data)

[Figure: labeled source-domain digits ("4", "0", "1", "8") alongside whatever knowledge of the target domain is available.]

Two cases for the target-domain data:
• Little but labeled data. Idea: train a model on the source data, then fine-tune the model on the target data. Challenge: only limited target data, so be careful about overfitting.
• A large amount of unlabeled data. This is the setting addressed by domain adversarial training on the following slides.
Basic Idea: learn to ignore domain-specific factors (e.g., colors).

[Figure: source and target images follow different distributions, but after the Feature Extractor (a network), their features follow the same distribution.]
Domain Adversarial Training

[Figure: an image passes through a Feature Extractor and then a Label Predictor, which outputs a class distribution (e.g., "4"). In feature space, source (labeled) examples appear as blue points and target (unlabeled) examples as red points; the goal is to make the two clouds overlap.]
Domain Adversarial Training

[Figure: the Feature Extractor ($\theta_f$, acting as a generator) feeds both the Label Predictor ($\theta_p$, classification loss $L$) and the Domain Classifier ($\theta_d$, acting as a discriminator, domain loss $L_d$), which guesses whether each feature came from the source or the target.]

• Label predictor: $\theta_p^* = \arg\min_{\theta_p} L$
• Domain classifier: $\theta_d^* = \arg\min_{\theta_d} L_d$
• Feature extractor: $\theta_f^* = \arg\min_{\theta_f} (L - L_d)$ — learn to "fool" the domain classifier, while also supporting the label predictor. (The slide's aside "always zero?" flags that maximizing $L_d$ alone could degenerate; the label-prediction loss $L$ keeps the features useful.)

A gradient-reversal sketch follows after the references below.
Domain Adversarial Training

Yaroslav Ganin, Victor Lempitsky, "Unsupervised Domain Adaptation by Backpropagation", ICML, 2015.
Hana Ajakan, Pascal Germain, Hugo Larochelle, François Laviolette, Mario Marchand, "Domain-Adversarial Training of Neural Networks", JMLR, 2016.
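Below is a minimal sketch of the objective using a gradient reversal layer, the mechanism from Ganin & Lempitsky (2015). The module names, batch handling, and lambda value are illustrative assumptions.

```python
import torch
import torch.nn as nn

class GradReverse(torch.autograd.Function):
    """Identity in the forward pass; flips (and scales) the gradient backward."""
    @staticmethod
    def forward(ctx, x, lamb):
        ctx.lamb = lamb
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.lamb * grad_output, None

def dann_loss(feat_ext, label_pred, domain_clf, x_s, y_s, x_t, lamb=1.0):
    ce = nn.CrossEntropyLoss()
    f_s, f_t = feat_ext(x_s), feat_ext(x_t)
    # L: classification loss on labeled source data (min over theta_p).
    L = ce(label_pred(f_s), y_s)
    # L_d: domain loss. The domain classifier minimizes it (min over
    # theta_d); gradient reversal makes the feature extractor maximize it
    # instead, implementing min over theta_f of (L - lamb * L_d).
    feats = torch.cat([f_s, f_t])
    domains = torch.cat([torch.zeros(len(f_s)), torch.ones(len(f_t))]).long()
    L_d = ce(domain_clf(GradReverse.apply(feats, lamb)), domains)
    return L + L_d  # one backward pass updates all three modules
```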
Limitation

[Figure: feature space containing class 1 (source), class 2 (source), and target data of unknown class, with decision boundaries learned from the source domain.]

Source and target data are aligned, but alignment alone is not enough: we also want the (unlabeled) target data to lie far from the decision boundary.
Considering the Decision Boundary

[Figure: an unlabeled example goes through the Feature Extractor and Label Predictor, producing a distribution over classes 1-5. A peaked prediction has small entropy (far from the boundary, good); a flat prediction has large entropy (near the boundary, bad).]

Encouraging small prediction entropy on unlabeled target data pushes target examples away from the decision boundary (a minimal entropy-loss sketch follows below). This idea is used in:

• Decision-boundary Iterative Refinement Training with a Teacher (DIRT-T): https://arxiv.org/abs/1802.08735
• Maximum Classifier Discrepancy: https://arxiv.org/abs/1712.02560
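A minimal sketch of the entropy term on unlabeled target predictions; the weighting in the usage line is an illustrative assumption.

```python
import torch
import torch.nn.functional as F

def entropy_loss(logits: torch.Tensor) -> torch.Tensor:
    # H(p) = -sum_c p_c log p_c, averaged over the batch. Small entropy
    # means a confident prediction, i.e., far from the decision boundary.
    p = F.softmax(logits, dim=1)
    log_p = F.log_softmax(logits, dim=1)
    return -(p * log_p).sum(dim=1).mean()

# Usage (hypothetical):
# total = task_loss_on_source + 0.1 * entropy_loss(model(x_target))
```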


Transfer Learning - Overview

Source data: labelled (not directly related to the task)
• Target data labelled → Model Fine-tuning, Multitask Learning
• Target data unlabeled → Domain Adaptation, Zero-shot Learning

Warning: different terminology in different literature


Zero-shot Learning

• Source data: $(x^s, y^s)$ — training data
• Target data: $x^t$ — testing data
• The tasks differ: the test classes never appear in training.

Example: the training images $x^s$ carry labels $y^s \in \{$cat, dog, …$\}$, but the test image $x^t$ shows an alpaca, a class unseen during training.

How do we solve this problem?

Zero-shot Learning
• Represent each class by its attributes.

Training: instead of predicting the class directly, the NN is trained to output each image's attribute vector — e.g., a chimp image maps to (furry = 1, 4 legs = 0, tail = 0) and a dog image to (furry = 1, 4 legs = 1, tail = 1). A database stores the attributes of every class:

class   | furry | 4 legs | tail | …
--------|-------|--------|------|---
Dog     |   O   |   O    |  O   |
Fish    |   X   |   X    |  O   |
Chimp   |   O   |   X    |  X   |

The attribute set must be rich enough to give a one-to-one mapping between classes and attribute vectors.
Zero-shot Learning
• Represent each class by its attributes.

Testing: the NN predicts the test image's attribute vector — e.g., (furry = 0, 4 legs = 0, tail = 1) — and we find the class with the most similar attributes in the database (here, Fish). Again, the attributes must be sufficient for a one-to-one mapping between classes and attribute vectors. A matching sketch follows below.
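A minimal sketch of the attribute-matching step; the attribute database mirrors the slide's table, and the distance-based lookup is an illustrative assumption.

```python
import torch

# Class -> attribute vector (furry, 4 legs, tail), from the slide's table.
ATTRS = {
    "Dog":   torch.tensor([1., 1., 1.]),
    "Fish":  torch.tensor([0., 0., 1.]),
    "Chimp": torch.tensor([1., 0., 0.]),
}

def classify_by_attributes(pred_attrs: torch.Tensor) -> str:
    # Pick the class whose stored attributes are closest to the prediction.
    return min(ATTRS, key=lambda c: torch.dist(pred_attrs, ATTRS[c]).item())

# Example: a predicted vector of (0, 0, 1) is matched to "Fish".
# classify_by_attributes(torch.tensor([0., 0., 1.]))  # -> "Fish"
```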
Zero-shot Learning
• Attribute embedding

Learn two functions $f$ and $g$ (both can be NNs) that map images and class-attribute vectors into a shared embedding space. Training target: make $f(x^n)$ and $g(y^n)$ as close as possible for every training pair $(x^n, y^n)$.

[Figure: images $x^1, x^2, x^3$ and attribute vectors $y^1$ (chimp), $y^2$ (dog), $y^3$ (alpaca) are mapped into the embedding space, where each $f(x^n)$ lands near its matching $g(y^n)$ — including the unseen class, alpaca.]

A training sketch follows below.
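A minimal sketch of the attribute-embedding objective. The dimensions and the plain closeness loss are illustrative assumptions (the slide only asks that $f(x^n)$ and $g(y^n)$ be close).

```python
import torch
import torch.nn as nn

f = nn.Sequential(nn.Linear(1024, 64))  # image features -> embedding
g = nn.Sequential(nn.Linear(3, 64))     # attribute vector -> embedding
opt = torch.optim.Adam(list(f.parameters()) + list(g.parameters()), lr=1e-3)

def embed_step(x_batch, y_attr_batch):
    # Pull f(x^n) toward g(y^n) for each training pair (x^n, y^n).
    # Note: a practical objective also pushes non-matching pairs apart,
    # otherwise f and g can collapse to a constant.
    loss = ((f(x_batch) - g(y_attr_batch)) ** 2).sum(dim=1).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()

# At test time, embed the image with f and pick the class whose embedded
# attributes g(y) are nearest — this works even for unseen classes.
```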
More about Zero-shot Learning
• Mark Palatucci, Dean Pomerleau, Geoffrey E. Hinton, Tom M. Mitchell, "Zero-shot Learning with Semantic Output Codes", NIPS 2009
• Zeynep Akata, Florent Perronnin, Zaid Harchaoui, Cordelia Schmid, "Label-Embedding for Attribute-Based Classification", CVPR 2013
• Andrea Frome, Greg S. Corrado, Jon Shlens, Samy Bengio, Jeff Dean, Marc'Aurelio Ranzato, Tomas Mikolov, "DeViSE: A Deep Visual-Semantic Embedding Model", NIPS 2013
• Mohammad Norouzi, Tomas Mikolov, Samy Bengio, Yoram Singer, Jonathon Shlens, Andrea Frome, Greg S. Corrado, Jeffrey Dean, "Zero-Shot Learning by Convex Combination of Semantic Embeddings", arXiv preprint 2013
• Subhashini Venugopalan, Lisa Anne Hendricks, Marcus Rohrbach, Raymond Mooney, Trevor Darrell, Kate Saenko, "Captioning Images with Diverse Objects", arXiv preprint 2016
