04 Transfer Learning With Tensorflow Part 1 Feature Extraction
Model learns patterns/weights from a similar problem space. Patterns get used/tuned to a specific problem.
Why use transfer learning?
• Can leverage an existing neural network architecture proven to work on problems similar to our own
• Can leverage a working network architecture which has already learned patterns on similar data to our own (often achieving great results with less data)
(we’ll be cooking up lots of code!)
How:
Let’s code!
What are callbacks?
• Callbacks are a tool which can add helpful functionality to your models during training, evaluation or inference
• TensorBoard (tf.keras.callbacks.TensorBoard()): Log the performance of multiple models and then view and compare these models in a visual way on TensorBoard (a dashboard for inspecting neural network parameters). Helpful to compare the results of different models on your data.
• Model checkpointing (tf.keras.callbacks.ModelCheckpoint()): Save your model as it trains so you can stop training if needed and come back to continue off where you left. Helpful if training takes a long time and can't be done in one sitting.
• Early stopping (tf.keras.callbacks.EarlyStopping()): Leave your model training for an arbitrary amount of time and have it stop training automatically when it ceases to improve. Helpful when you've got a large dataset and don't know how long training will take.
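The three callbacks above can be created like this (a minimal sketch; the log directory name, checkpoint path, monitored metric and patience value are all illustrative choices, not fixed by the slides):

```python
import datetime
import tensorflow as tf

def create_tensorboard_callback(dir_name, experiment_name):
    """Create a TensorBoard callback with a timestamped log directory
    so repeated experiments don't overwrite each other's logs."""
    log_dir = (dir_name + "/" + experiment_name + "/"
               + datetime.datetime.now().strftime("%Y%m%d-%H%M%S"))
    return tf.keras.callbacks.TensorBoard(log_dir=log_dir)

# Save the best weights seen so far during training (path is illustrative)
checkpoint_callback = tf.keras.callbacks.ModelCheckpoint(
    filepath="checkpoints/checkpoint.weights.h5",
    save_weights_only=True,
    save_best_only=True,
    monitor="val_loss",
)

# Stop training automatically once val_loss hasn't improved for 3 epochs
early_stopping_callback = tf.keras.callbacks.EarlyStopping(
    monitor="val_loss",
    patience=3,
)
```

All three would then be passed to `model.fit(..., callbacks=[...])`.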
What is TensorFlow Hub?
• A place to find a plethora of pre-trained machine learning models (ready to be applied and fine-tuned for your own problems)
🤔 “Does my problem exist on TensorFlow Hub?”
https://tfhub.dev/tensorflow/efficientnet/b0/feature-vector/1
TensorFlow Hub makes using a pre-trained model as simple as calling a URL
ResNet50* feature extractor
[Figure: input data (10 classes of Food101) flows through the ResNet backbone, which stays the same (frozen, pre-trained on ImageNet), into an output layer which changes (shape 10, the same shape as the number of classes).]
*Note: In the code, we’re actually using ResNet50, a slightly larger architecture than ResNet34.
Image source: https://arxiv.org/abs/1512.03385
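In tf.keras, this setup (frozen ResNet50 backbone, new 10-class output layer) can be sketched as follows. Note: `weights=None` is used here only so the example runs without downloading the pre-trained weights; for actual transfer learning you would pass `weights="imagenet"`:

```python
import tensorflow as tf

# Backbone: stays the same (frozen). Use weights="imagenet" in practice;
# weights=None here just avoids the ImageNet weight download.
base_model = tf.keras.applications.ResNet50(
    include_top=False,        # drop the original 1000-class ImageNet head
    weights=None,
    input_shape=(224, 224, 3),
)
base_model.trainable = False  # frozen: patterns don't update during training

# New head: changes (same shape as the number of classes, here 10)
model = tf.keras.Sequential([
    base_model,
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(10, activation="softmax"),
])
```

Only the pooling and Dense head have trainable weights, so training updates just the new output layers.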
Original Model vs. Feature Extraction
[Figure: three columns of the same architecture (e.g. EfficientNet), each running Input Layer → Layer 2 → … → Layer 234 → Layer 235 → output layer.
• Original model: trained on a large dataset (e.g. ImageNet); every layer changes during training and the output layer has shape = 1000.
• Feature extraction: the original model’s layers stay the same (frozen, don’t update during training); only a new output layer (shape = 10) gets trained on new data (a different dataset, e.g. 10 classes of food).
• Fine-tuning: the lower layers stay the same (frozen); the top layers (e.g. Layer 234, Layer 235) are unfrozen and might change. Fine-tuning usually requires more data than feature extraction.]
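The difference between these columns comes down to which layers have trainable weights. A minimal sketch in tf.keras (the choice of unfreezing the top 3 layers is illustrative, and `weights=None` just avoids the ImageNet download):

```python
import tensorflow as tf

# Any pre-trained backbone works here; weights="imagenet" in practice
base_model = tf.keras.applications.ResNet50(include_top=False, weights=None)

# Feature extraction: freeze the whole original model
base_model.trainable = False

# Fine-tuning: unfreeze only the top few layers (3 here, an arbitrary choice)
base_model.trainable = True
for layer in base_model.layers[:-3]:
    layer.trainable = False
```

After the loop, only the top 3 layers of the backbone (plus whatever new head you add) would be updated during training.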
Feature extraction: Take the underlying patterns (also called weights) a pretrained model has learned and adjust its outputs to be more suited to your problem. Most of the layers in the original model remain frozen during training (only the top 1-3 layers get updated). Helpful if you have a small amount of custom data (similar to what the original model was trained on) and want to utilise a pretrained model to get better results on your specific problem.
Source: https://tensorboard.dev/experiment/73taSKxXQeGPQsNBcVvY3g/#scalars
🍔👁 Food Vision: Dataset(s) we’re using
Note: For randomly selected data, the Food101 dataset was downloaded and modified using the Image Data Modification Notebook
• pizza_steak (from Food101), 2 classes: pizza, steak. Training data: 750 images of pizza and steak (same as original Food101 dataset). Test data: 250 images of pizza and steak (same as original Food101 dataset).
Smaller model