04 Transfer Learning with TensorFlow Part 1: Feature Extraction

This document discusses transfer learning with TensorFlow. It begins with an overview of transfer learning and some common use cases in computer vision and natural language processing. It then explains that transfer learning allows leveraging existing neural network architectures and learned patterns from similar data to build models that perform better than training from scratch with less data. The document outlines how transfer learning works by training earlier layers on a large dataset and then tuning the later layers on a new smaller dataset. It also introduces TensorFlow Hub as a place to find pre-trained models for transfer learning.


Transfer Learning with TensorFlow

Part 1: Feature extraction


Where can you get help?

“If in doubt, run the code”

• Follow along with the code
• Try it for yourself
• Press SHIFT + CMD + SPACE to read the docstring
• Search for it
• Try again
• Ask (don’t forget the Discord chat! Yes, including the “dumb” questions)
“What is transfer learning?”
Surely someone has spent the time crafting the right model for the job…
Example transfer learning use cases

Computer vision

Natural language processing (e.g. spam detection):

Not spam:
To: [email protected]
Hey Daniel,
This deep learning course is incredible! I can’t wait to use what I’ve learned!

Spam:
To: [email protected]
Hay daniel…
C0ongratu1ations! U win $1139239230

A model learns patterns/weights from a similar problem space, then those patterns get used/tuned for a specific problem.
“Why use transfer learning?”

• Can leverage an existing neural network architecture proven to work on problems similar to our own

• Can leverage a working network architecture which has already learned patterns on similar data to our own (often results in great results with less data)

Learn patterns in a wide variety of images (using ImageNet) → EfficientNet architecture (already works really well on computer vision tasks) → tune patterns/weights to our own problem (Food Vision) → model performs better than from scratch
What we’re going to cover (broadly)

• Introduce transfer learning with TensorFlow
• Use a small dataset to experiment faster (10% of training samples)
• Build a transfer learning feature extraction model with TensorFlow Hub
• Use TensorBoard to track modelling experiments and results

👩🍳 👩🔬 (we’ll be cooking up lots of code!)

How:
Let’s code!
What are callbacks?
• Callbacks are tools which add helpful functionality to your models during training, evaluation or inference

• Some popular callbacks include:

Callback name: TensorBoard
Code: tf.keras.callbacks.TensorBoard()
Use case: Log the performance of multiple models and then view and compare these models in a visual way on TensorBoard (a dashboard for inspecting neural network parameters). Helpful to compare the results of different models on your data.

Callback name: Model checkpointing
Code: tf.keras.callbacks.ModelCheckpoint()
Use case: Save your model as it trains so you can stop training if needed and come back to continue off where you left. Helpful if training takes a long time and can't be done in one sitting.

Callback name: Early stopping
Code: tf.keras.callbacks.EarlyStopping()
Use case: Leave your model training for an arbitrary amount of time and have it stop training automatically when it ceases to improve. Helpful when you've got a large dataset and don't know how long training will take.
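The three callbacks above can be sketched in code like so (the file paths, monitored metric and patience value here are illustrative choices, not prescribed by the slides):

```python
import tensorflow as tf

# TensorBoard: log model performance so runs can be compared visually
tensorboard_cb = tf.keras.callbacks.TensorBoard(log_dir="logs/experiment_1")

# ModelCheckpoint: save the model's weights as it trains
checkpoint_cb = tf.keras.callbacks.ModelCheckpoint(
    filepath="checkpoints/model.weights.h5",  # hypothetical path
    save_weights_only=True,
    save_best_only=True,   # keep only the best weights seen so far
    monitor="val_loss")

# EarlyStopping: stop automatically when the monitored metric stops improving
early_stopping_cb = tf.keras.callbacks.EarlyStopping(
    monitor="val_loss",
    patience=3)            # wait 3 epochs of no improvement before stopping

# These would then be passed to model.fit(..., callbacks=callbacks)
callbacks = [tensorboard_cb, checkpoint_cb, early_stopping_cb]
```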
What is TensorFlow Hub?

• A place to find a plethora of pre-trained machine learning models (ready to be applied and fine-tuned for your own problems)

🤔 “Does my problem exist on TensorFlow Hub?”

https://tfhub.dev/tensorflow/efficientnet/b0/feature-vector/1

TensorFlow Hub makes using a pre-trained model as simple as calling a URL
ResNet50* feature extractor

Input data: 10 classes of Food101.
The pre-trained backbone stays the same (frozen, pre-trained on ImageNet); only the output layer changes (same shape as the number of classes: 10).

*Note: In the code, we’re actually using ResNet50, a slightly larger architecture than ResNet34.
Image source: https://arxiv.org/abs/1512.03385
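A minimal sketch of this kind of feature extractor using tf.keras.applications (here weights=None keeps the sketch offline-friendly; in practice you would load weights="imagenet" to get the frozen ImageNet-trained patterns the slide describes):

```python
import tensorflow as tf

NUM_CLASSES = 10  # 10 classes of Food101

# Pre-trained backbone without its original 1000-class top
base_model = tf.keras.applications.ResNet50(include_top=False,
                                            weights=None,  # normally "imagenet"
                                            input_shape=(224, 224, 3))
base_model.trainable = False  # freeze: patterns stay the same during training

# Attach a new trainable output layer, shaped to our number of classes
inputs = tf.keras.Input(shape=(224, 224, 3))
x = base_model(inputs, training=False)           # run backbone in inference mode
x = tf.keras.layers.GlobalAveragePooling2D()(x)  # condense feature maps to a vector
outputs = tf.keras.layers.Dense(NUM_CLASSES, activation="softmax")(x)
model = tf.keras.Model(inputs, outputs)
```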
Original Model vs. Feature Extraction

Original model: Input Layer → Layer 2 → … → Layer 234 → Layer 235 → Output Layer (shape = 1000). A working architecture (e.g. EfficientNet) where every layer changes during training on a large dataset (e.g. ImageNet).

Feature extraction transfer learning model: the same layers, but the original model’s layers stay the same (frozen, they don’t update during training); only the new output layer(s) (shape = 10) get trained on a different dataset (e.g. 10 classes of food).
Kinds of Transfer Learning

Original model: every layer changes during training on a large dataset (e.g. ImageNet); Output Layer (shape = 1000).

Feature extraction: all of the original layers stay the same (frozen); only the new output layer (shape = 10) gets trained on the new data.

Fine-tuning: the lower layers stay the same (frozen), while the top layers (e.g. Layer 234, Layer 235) are unfrozen and might change, along with the new output layer (shape = 10), when trained on the different dataset (e.g. 10 classes of food).

Fine-tuning usually requires more data than feature extraction.
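Freezing and unfreezing layers can be sketched in code like this (EfficientNetB0, weights=None and the number of unfrozen layers are illustrative choices; in practice you would load weights="imagenet"):

```python
import tensorflow as tf

# A backbone as used in the feature-extraction step
base_model = tf.keras.applications.EfficientNetB0(include_top=False,
                                                  weights=None)  # normally "imagenet"
base_model.trainable = False  # feature extraction: everything frozen

# Fine-tuning: unfreeze only the top N layers so they can adapt to the new data
N_UNFROZEN = 10  # illustrative value
base_model.trainable = True
for layer in base_model.layers[:-N_UNFROZEN]:
    layer.trainable = False  # all layers below the top N stay frozen

trainable_layers = [l for l in base_model.layers if l.trainable]
```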
Kinds of Transfer Learning

Type: Original model (“As is”)
Description: Take a pretrained model as it is and apply it to your task without any changes.
What happens: The original model remains unchanged.
When to use: Helpful if you have the exact same kind of data the original model was trained on.

Type: Feature extraction
Description: Take the underlying patterns (also called weights) a pretrained model has learned and adjust its outputs to be more suited to your problem.
What happens: Most of the layers in the original model remain frozen during training (only the top 1-3 layers get updated).
When to use: Helpful if you have a small amount of custom data (similar to what the original model was trained on) and want to utilise a pretrained model to get better results on your specific problem.

Type: Fine-tuning
Description: Take the weights of a pretrained model and adjust (fine-tune) them to your own problem.
What happens: Some, many or all of the layers in the pretrained model are updated during training.
When to use: Helpful if you have a large amount of custom data and want to utilise a pretrained model and improve its underlying patterns to your specific problem.
What is TensorBoard?

• A way to visually explore your machine learning model’s performance and internals

• Host, track and share your machine learning experiments on TensorBoard.dev

(TensorBoard also integrates with websites like Weights & Biases)

Example: comparing the results of two different model architectures (ResNet50V2 & EfficientNetB0) on the same dataset.

Source: https://tensorboard.dev/experiment/73taSKxXQeGPQsNBcVvY3g/#scalars
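To keep experiments comparable on TensorBoard, a common pattern (the helper name and timestamp format here are illustrative) is to log each run to its own timestamped directory:

```python
import datetime
import tensorflow as tf

def create_tensorboard_callback(dir_name, experiment_name):
    """Create a TensorBoard callback that logs to a timestamped directory,
    so each experiment gets its own logs to view and compare later."""
    log_dir = dir_name + "/" + experiment_name + "/" + \
              datetime.datetime.now().strftime("%Y%m%d-%H%M%S")
    tensorboard_callback = tf.keras.callbacks.TensorBoard(log_dir=log_dir)
    print(f"Saving TensorBoard log files to: {log_dir}")
    return tensorboard_callback
```

The returned callback is then passed to model.fit(..., callbacks=[...]) for each experiment you want to track.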
🍔👁 Food Vision: Dataset(s) we’re using
Note: For randomly selected data, the Food101 dataset was downloaded and modified using the Image Data Modification Notebook.

Dataset name: pizza_steak
Source: Food101 | Classes: pizza, steak (2)
Training data: 750 images of pizza and steak (same as original Food101 dataset)
Testing data: 250 images of pizza and steak (same as original Food101 dataset)

Dataset name: 10_food_classes_1_percent
Source: Food101 | Classes: chicken curry, chicken wings, fried rice, grilled salmon, hamburger, ice cream, pizza, ramen, steak, sushi (10)
Training data: 7 randomly selected images of each class (1% of original training data)
Testing data: 250 images of each class (same as original Food101 dataset)

Dataset name: 10_food_classes_10_percent
Source and classes: same as above
Training data: 75 randomly selected images of each class (10% of original training data)
Testing data: same as above

Dataset name: 10_food_classes_100_percent
Source and classes: same as above
Training data: 750 images of each class (100% of original training data)
Testing data: same as above

Dataset name: 101_food_classes_10_percent
Source: Food101 | Classes: all classes from Food101 (101)
Training data: 75 images of each class (10% of original Food101 training dataset)
Testing data: 250 images of each class (same as original Food101 dataset)
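The per-class subset sizes in the table follow directly from Food101’s 750 training images per class; a quick sketch of the arithmetic (the helper function is purely illustrative):

```python
# Food101 has 750 training images per class; each subset keeps a percentage
FULL_TRAIN_PER_CLASS = 750

def subset_size(percent, full=FULL_TRAIN_PER_CLASS):
    """Number of training images per class for a given percentage subset."""
    return int(full * percent / 100)

one_percent = subset_size(1)      # 7 images per class (10_food_classes_1_percent)
ten_percent = subset_size(10)     # 75 images per class (10_food_classes_10_percent)
full_set = subset_size(100)       # 750 images per class (10_food_classes_100_percent)
```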
Useful computer vision architectures

• tf.keras.applications and keras.applications have many of the most popular and best performing computer vision architectures built-in & pre-trained, ready to use for your own problems

Source: https://keras.io/api/applications/
Source: https://www.tensorflow.org/api_docs/python/tf/keras/applications

Improving a model (from a model’s perspective)

Common ways to improve a deep model (going from a smaller model to a larger model):

• Add layers
• Increase the number of hidden units
• Change the activation functions
• Change the optimization function
• Change the learning rate
• Fit on more data
• Fit for longer

(Because you can alter each of these, they’re hyperparameters.)
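A few of those knobs expressed in code (the input shape, layer sizes and learning rate are illustrative values, not recommendations):

```python
import tensorflow as tf

# A slightly "larger" model: an extra hidden layer and more hidden units
model = tf.keras.Sequential([
    tf.keras.Input(shape=(784,)),
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(64, activation="relu"),   # "add layers"
    tf.keras.layers.Dense(10, activation="softmax")  # "change the activation functions"
])

# "Change the optimization function" and "change the learning rate":
# pass a custom learning rate to the optimizer when compiling
model.compile(loss="sparse_categorical_crossentropy",
              optimizer=tf.keras.optimizers.Adam(learning_rate=0.0001),
              metrics=["accuracy"])
```

Fitting on more data and fitting for longer happen at model.fit() time (e.g. by raising the epochs parameter).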
