09. PyTorch Model Deployment
But so far our PyTorch models have only been accessible to us.
How about we bring FoodVision Mini to life and make it publicly accessible?
In other words, we're going to deploy our FoodVision Mini model to the internet as a usable app.
Trying out the deployed version of FoodVision Mini (what we're going to build) on my lunch. The
model got it right too!
Someone else being a person who can interact with your model in some way.
For example, someone taking a photo on their smartphone of food and then having our
FoodVision Mini model classify it into pizza, steak or sushi.
Something else might be another program, app or even another model that interacts with your
machine learning model(s).
For example, a banking database might rely on a machine learning model making predictions as
to whether a transaction is fraudulent or not before transferring funds.
Or an operating system may lower its resource consumption based on a machine learning
model making predictions on how much power someone generally uses at speci�c times of day.
For example, a Tesla car's computer vision system will interact with the car's route planning
program (something else) and then the route planning program will get inputs and feedback
from the driver (someone else).
Machine learning model deployment involves making your model available to someone or
something else. For example, someone might use your model as part of a food recognition app
(such as FoodVision Mini or Nutrify). And something else might be another model or program
using your model such as a banking system using a machine learning model to detect if a
transaction is fraud or not.
Because although you can get a pretty good idea of how your model's going to function by
evaluating it on a well-crafted test set or visualizing its results, you never really know how it's
going to perform until you release it into the wild.
Having people who've never used your model interact with it will often reveal edge cases you
never thought of during training.
For example, what happens if someone was to upload a photo that wasn't of food to our
FoodVision Mini model?
One solution would be to create another model that first classifies images as "food" or "not
food" and pass the target image through that model first (this is what Nutrify does).
Then if the image is of "food", it goes to our FoodVision Mini model and gets classified into
pizza, steak or sushi.
This highlights the importance of model deployment: it helps you figure out errors in your
model that aren't obvious during training/testing.
We covered a PyTorch workflow back in 01. PyTorch Workflow. But once you've got a good
model, deployment is a good next step. Monitoring involves seeing how your model goes on the
most important data split: data from the real world. For more resources on deployment and
monitoring see PyTorch Extra Resources.
"What is the most ideal scenario for my machine learning model to be used?"
Of course, you may not know this ahead of time. But you're smart enough to imagine such
things.
Easy.
1. The model should work on a mobile device (this means there will be some compute
constraints).
2. The model should make predictions fast (because a slow app is a boring app).
And of course, depending on your use case, your requirements may vary.
You may notice the above two points break down into another two questions:
When starting to deploy machine learning models, it's helpful to start by asking what's the most
ideal use case and then work backwards from there, asking where the model's going to go and
then how it's going to function.
When you deploy your machine learning model, where does it live?
The main debate here is usually on-device (also called edge/in the browser) or on the cloud (a
computer/server that isn't the actual device someone/something calls the model from).
Deploy on-device (edge/in the browser):
Pros:
• Can be very fast (since no data leaves the device)
• Privacy preserving (again, no data has to leave the device)
Cons:
• Limited compute power (larger models take longer to run)
• Limited storage space (smaller model sizes required)

Deploy on the cloud:
Pros:
• Near unlimited compute power (can scale up when needed)
• Can deploy one model and use it everywhere (via API)
• Links into existing cloud ecosystem
Cons:
• Costs can get out of hand (if proper scaling limits aren't set)
• Predictions can be slower due to data having to leave the device and results having to come back (network latency)
• Data has to leave the device (this may cause privacy concerns)
There are more details to these but I've left resources in the extra-curriculum to learn more.
If we're deploying FoodVision Mini as an app, we want it to perform well and fast.
1. A model on-device that performs at 95% accuracy with an inference time (latency) of one
second per prediction.
2. A model on the cloud that performs at 98% accuracy with an inference time of 10 seconds
per prediction (bigger, better model but takes longer to compute).
I've made these numbers up but they showcase a potential difference between on-device and on
the cloud.
Option 1 could potentially be a smaller, less performant model that runs fast because it's able to
fit on a mobile device.
Option 2 could potentially be a larger, more performant model that requires more compute and
storage, but it takes a bit longer to run because we have to send data off the device and get it
back (so even though the actual prediction might be fast, the network time and data transfer have
to be factored in).
For FoodVision Mini, we'd likely prefer option 1, because the small hit in performance is far
outweighed by the faster inference speed.
In the case of a Tesla car's computer vision system, which would be better? A smaller model
that performs well on device (model is on the car) or a larger model that performs better that's
on the cloud? In this case, you'd much prefer the model being on the car. The extra network time
it would take for data to go from the car to the cloud and then back to the car just wouldn't be
worth it (or potentially even impossible with poor signal areas).
For a full example of seeing what it's like to deploy a PyTorch model to an
edge device, see the PyTorch tutorial on achieving real-time inference (30fps+)
with a computer vision model on a Raspberry Pi.
Back to the ideal use case, when you deploy your machine learning model, how should it work?
The main difference between the two (online/real-time inference vs. offline/batch inference) being: predictions made
immediately or periodically. And periodically can have a varying timescale, from every few seconds to every few hours or
days.
In the case of FoodVision Mini, we'd want our inference pipeline to happen online (real-time), so
when someone uploads an image of pizza, steak or sushi, the prediction results are returned
immediately (any slower than real-time would make a boring experience).
But for our training pipeline, it's okay for it to happen in a batch (offline) fashion, which is what
we've been doing throughout the previous chapters.
We've discussed a couple of options for deploying machine learning models (on-device and
cloud).
For example, Apple's Core ML and the coremltools Python package enable on-device deployment (across all Apple devices).
Many more...
Which option you choose will be highly dependent on what you're building/who you're working
with.
And one of the best ways to do so is by turning your machine learning model into a demo app
with Gradio and then deploying it on Hugging Face Spaces.
A handful of places and tools to host and deploy machine learning models. There are plenty I've
missed so if you'd like to add more, please leave a discussion on GitHub.
Our goal is to deploy our FoodVision Mini model via a demo Gradio app with the following metrics:
1. 95%+ accuracy.
2. Real-time inference of 30FPS+ (each prediction has a latency of lower than ~0.03s).
We'll start by running an experiment to compare our best two models so far: EffNetB2 and ViT
feature extractors.
Then we'll deploy the one which performs closest to our goal metrics.
• We've written a fair bit of useful code over the past few sections, let's make sure we can use it again.
• A ViT feature extractor has been the best performing model yet on our data.
• We've built two of the best performing models yet, let's make predictions with them.
• Let's compare our models to see which performs best with our goals.
• One of our models performs better than the other (in terms of our goal metrics).
• Our Gradio app demo works locally, let's prepare it for deployment!
• Let's take FoodVision Mini to the web and make it publicly accessible.
If you run into trouble, you can ask a question on the course GitHub Discussions page.
And of course, there's the PyTorch documentation and PyTorch developer forums, a very helpful
place for all things PyTorch.
0. Getting setup
As we've done previously, let's make sure we've got all of the modules we'll need for this section.
We'll import the Python scripts (such as data_setup.py and engine.py ) we created in 05.
PyTorch Going Modular.
To do so, we'll download the going_modular directory from the pytorch-deep-learning repository (if
we don't already have it).
And since later on we'll be using the torchvision v0.13 package (available as of July 2022), we'll
make sure we've got the latest versions.
If you're using Google Colab, and you don't have a GPU turned on yet, it's
now time to turn one on via Runtime -> Change runtime type -> Hardware
accelerator -> GPU .
# For this notebook to run with updated APIs, we need torch 1.12+ and torchvision 0.13+
try:
    import torch
    import torchvision
    assert int(torch.__version__.split(".")[1]) >= 12, "torch version should be 1.12+"
    assert int(torchvision.__version__.split(".")[1]) >= 13, "torchvision version should be 0.13+"
    print(f"torch version: {torch.__version__}")
    print(f"torchvision version: {torchvision.__version__}")
except:
    print(f"[INFO] torch/torchvision versions not as required, installing nightly versions.")
    !pip3 install -U torch torchvision torchaudio --extra-index-url https://download.pytorch.org/
    import torch
    import torchvision
    print(f"torch version: {torch.__version__}")
    print(f"torchvision version: {torchvision.__version__}")
If you're using Google Colab and the cell above starts to install various
software packages, you may have to restart your runtime after running the above
cell. After restarting, you can run the cell again and verify you've got the right
versions of torch and torchvision .
Now we'll continue with the regular imports, setting up device agnostic code and this time we'll
also get the helper_functions.py script from GitHub.
• set_seeds() to set the random seeds (created in 07. PyTorch Experiment Tracking section
0).
• download_data() to download a data source given a link (created in 07. PyTorch
Experiment Tracking section 1).
• plot_loss_curves() to inspect our model's training results (created in 04. PyTorch
Custom Datasets section 7.8)
# Try to import the going_modular directory, download it from GitHub if it doesn't work
try:
    from going_modular.going_modular import data_setup, engine
    from helper_functions import download_data, set_seeds, plot_loss_curves
except:
    # Get the going_modular scripts
    print("[INFO] Couldn't find going_modular or helper_functions scripts... downloading them from GitHub.")
    !git clone https://github.com/mrdbourke/pytorch-deep-learning
    !mv pytorch-deep-learning/going_modular .
    !mv pytorch-deep-learning/helper_functions.py .
    !rm -rf pytorch-deep-learning
    from going_modular.going_modular import data_setup, engine
    from helper_functions import download_data, set_seeds, plot_loss_curves
Finally, we'll setup device-agnostic code to make sure our models run on the GPU.
'cuda'
1. Getting data
We left off in 08. PyTorch Paper Replicating comparing our own Vision Transformer (ViT)
feature extractor model to the EfficientNetB2 (EffNetB2) feature extractor model we created in
07. PyTorch Experiment Tracking.
The EffNetB2 model was trained on 20% of the pizza, steak and sushi data from Food101, whereas
the ViT model was trained on 10%.
Since our goal is to deploy the best model for our FoodVision Mini problem, let's start by
downloading the 20% pizza, steak and sushi dataset and train an EffNetB2 feature extractor and
ViT feature extractor on it and then compare the two models.
This way we'll be comparing apples to apples (one model trained on a dataset to another model
trained on the same dataset).
We can download the data using the download_data() function we created in 07. PyTorch
Experiment Tracking section 1 from helper_functions.py .
data_20_percent_path = download_data(source="https://github.com/mrdbourke/pytorch-deep-learning/raw/main/data/pizza_steak_sushi_20_percent.zip",
                                     destination="pizza_steak_sushi_20_percent")
data_20_percent_path
Wonderful!
Now we've got a dataset, let's create training and test paths.
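As a quick sketch (assuming download_data() returns a pathlib.Path and the dataset follows the standard train/test ImageFolder layout), the paths can be set up like this:

# Setup train and test directory paths
train_dir = data_20_percent_path / "train"
test_dir = data_20_percent_path / "test"

train_dir, test_dir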
The ideal deployed FoodVision Mini model performs well and fast.
Real-time in this case being ~30FPS (frames per second) because that's about how fast the
human eye can see (there is debate on this but let's just use ~30FPS as our benchmark).
And for classifying three different classes (pizza, steak and sushi), we'd like a model that
performs at 95%+ accuracy.
Of course, higher accuracy would be nice, but this might sacrifice speed.
FoodVision Mini deployment goals. We'd like a fast-predicting, well-performing model (because a
slow app is boring).
We'll put an emphasis on speed, meaning we'd prefer a model performing at 90%+ accuracy at
~30FPS over a model performing at 95%+ accuracy at 10FPS.
To try and achieve these results, let's bring in our best performing models from the previous
sections:
A "feature extractor model" often starts with a model that has been
pretrained on a dataset similar to your own problem. The pretrained model's base
layers are often left frozen (the pretrained patterns/weights stay the same) whilst
some of the top (or classifier/classification head) layers get customized to your
own problem by training on your own data. We covered the concept of a feature
extractor model in 06. PyTorch Transfer Learning section 3.4.
So let's now recreate it here so we can compare its results to a ViT feature extractor trained on
the same data.
To do so we can:
# 4. Freeze the base layers in the model (this will freeze all layers to begin with)
for param in effnetb2.parameters():
param.requires_grad = False
Now to change the classifier head, let's first inspect it using the classifier attribute of our
model.
Sequential(
  (0): Dropout(p=0.3, inplace=True)
  (1): Linear(in_features=1408, out_features=1000, bias=True)
)
Excellent! To change the classifier head to suit our own problem, let's replace the out_features
value with the same number of classes we have (in our case, out_features=3, one each for pizza,
steak and sushi).
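Here's a minimal sketch of what that replacement might look like (the seed value and exact structure are illustrative):

from torch import nn

# Update the classifier head to output 3 classes (pizza, steak, sushi)
torch.manual_seed(42) # seed for reproducible head initialization
effnetb2.classifier = nn.Sequential(
    nn.Dropout(p=0.3, inplace=True),
    nn.Linear(in_features=1408, # keep in_features the same as EffNetB2's feature dimension
              out_features=3))  # one output per class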
Beautiful!
We'll call it create_effnetb2_model() and it'll take a customizable number of classes and a
random seed parameter for reproducibility.
Ideally, it will return an EffNetB2 feature extractor along with its associated transforms.
def create_effnetb2_model(num_classes:int=3,
seed:int=42):
"""Creates an EfficientNetB2 feature extractor model and transforms.
Args:
num_classes (int, optional): number of classes in the classifier head.
Defaults to 3.
seed (int, optional): random seed value. Defaults to 42.
Returns:
model (torch.nn.Module): EffNetB2 feature extractor model.
transforms (torchvision.transforms): EffNetB2 image transforms.
"""
    # 1, 2, 3. Create EffNetB2 pretrained weights, transforms and model
    weights = torchvision.models.EfficientNet_B2_Weights.DEFAULT
    transforms = weights.transforms()
    model = torchvision.models.efficientnet_b2(weights=weights)

    # 4. Freeze all layers in the base model
    for param in model.parameters():
        param.requires_grad = False

    # 5. Change classifier head with random seed for reproducibility
    torch.manual_seed(seed)
    model.classifier = nn.Sequential(
        nn.Dropout(p=0.3, inplace=True),
        nn.Linear(in_features=1408, out_features=num_classes),
    )

    return model, transforms
No errors, nice, now to really try it out, let's get a summary with torchinfo.summary() .
We'll use a batch_size of 32 and transform our images using the effnetb2_transforms so
they're in the same format that our effnetb2 model was trained on.
# Setup DataLoaders
from going_modular.going_modular import data_setup
train_dataloader_effnetb2, test_dataloader_effnetb2, class_names = data_setup.create_dataloaders(
    train_dir=train_dir,
    test_dir=test_dir,
    transform=effnetb2_transforms,
    batch_size=32)
Just like in 07. PyTorch Experiment Tracking section 7.6, ten epochs should be enough to get
good results.
# Setup optimizer
optimizer = torch.optim.Adam(params=effnetb2.parameters(),
lr=1e-3)
# Setup loss function
loss_fn = torch.nn.CrossEntropyLoss()
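The training call itself might look like the following sketch, assuming the engine.train() signature from 05. PyTorch Going Modular (which returns a dictionary of per-epoch metrics):

# Train the EffNetB2 feature extractor for 10 epochs
set_seeds()
effnetb2_results = engine.train(model=effnetb2,
                                train_dataloader=train_dataloader_effnetb2,
                                test_dataloader=test_dataloader_effnetb2,
                                optimizer=optimizer,
                                loss_fn=loss_fn,
                                epochs=10,
                                device=device)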
As we saw in 07. PyTorch Experiment Tracking, the EffNetB2 feature extractor model works
quite well on our data.
Let's turn its results into loss curves to inspect them further.
Loss curves are one of the best ways to visualize how your model's
performing. For more on loss curves, check out 04. PyTorch Custom Datasets
section 8: What should an ideal loss curve look like?
plot_loss_curves(effnetb2_results)
Woah!
It looks like our model is performing quite well and perhaps would benefit from a little longer
training and potentially some data augmentation (to help prevent potential overfitting occurring
from longer training).
To save our model we can use the utils.save_model() function we created in 05. PyTorch
Going Modular section 5.
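A sketch of the save call (assuming utils is importable from the going_modular package downloaded earlier; the filename matches the one referenced in the demo directory later on):

from going_modular.going_modular import utils

# Save the EffNetB2 feature extractor to the models/ directory
utils.save_model(model=effnetb2,
                 target_dir="models",
                 model_name="09_pretrained_effnetb2_feature_extractor_pizza_steak_sushi_20_percent.pth")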
Well, while not always the case, the size of a model can influence its inference speed.
As in, if a model has more parameters, it generally performs more operations and each one of
these operations requires some computing power.
And because we'd like our model to work on devices with limited computing power (e.g. on a
mobile device or in a web browser), generally, the smaller the size the better (as long as it still
performs well in terms of accuracy).
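To check the EffNetB2 model's size on disk, one option is to inspect the saved .pth file (a sketch, assuming the save path used above):

from pathlib import Path

# Get the model size in megabytes from the saved file
pretrained_effnetb2_model_size = Path("models/09_pretrained_effnetb2_feature_extractor_pizza_steak_sushi_20_percent.pth").stat().st_size // (1024 * 1024)
print(f"Pretrained EffNetB2 feature extractor model size: {pretrained_effnetb2_model_size} MB")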
And we'll calculate an extra one for fun, total number of parameters.
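A one-liner sketch for counting parameters:

# Count the total number of parameters in EffNetB2
effnetb2_total_params = sum(torch.numel(param) for param in effnetb2.parameters())
effnetb2_total_params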
7705221
Excellent!
Now let's put everything in a dictionary so we can make comparisons later on.
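For example (a sketch, assuming engine.train() returned a dictionary of per-epoch metric lists and using the size/parameter values from above):

# Create a dictionary of EffNetB2 statistics
effnetb2_stats = {"test_loss": effnetb2_results["test_loss"][-1],
                  "test_acc": effnetb2_results["test_acc"][-1],
                  "number_of_parameters": effnetb2_total_params,
                  "model_size (MB)": pretrained_effnetb2_model_size}
effnetb2_stats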
{'test_loss': 0.28128674924373626,
'test_acc': 0.96875,
'number_of_parameters': 7705221,
'model_size (MB)': 29}
Epic!
And we'll do it in much the same way as the EffNetB2 feature extractor except this time with
torchvision.models.vit_b_16() instead of torchvision.models.efficientnet_b2() .
We'll start by creating a function called create_vit_model() which will be very similar to
create_effnetb2_model() except of course returning a ViT feature extractor model and
transforms rather than EffNetB2.
Another slight difference is that torchvision.models.vit_b_16() 's output layer is called heads
rather than classifier .
Sequential(
(head): Linear(in_features=768, out_features=1000, bias=True)
)
Knowing this, we've got all the pieces of the puzzle we need.
def create_vit_model(num_classes:int=3,
seed:int=42):
"""Creates a ViT-B/16 feature extractor model and transforms.
Args:
num_classes (int, optional): number of target classes. Defaults to 3.
seed (int, optional): random seed value for output layer. Defaults to 42.
Returns:
model (torch.nn.Module): ViT-B/16 feature extractor model.
transforms (torchvision.transforms): ViT-B/16 image transforms.
"""
    # Create ViT_B_16 pretrained weights, transforms and model
    weights = torchvision.models.ViT_B_16_Weights.DEFAULT
    transforms = weights.transforms()
    model = torchvision.models.vit_b_16(weights=weights)

    # Freeze all layers in the base model
    for param in model.parameters():
        param.requires_grad = False

    # Change the classifier head (with a random seed for reproducibility)
    torch.manual_seed(seed)
    model.heads = nn.Sequential(nn.Linear(in_features=768,
                                          out_features=num_classes))

    return model, transforms
Now let's get a nice-looking summary of our ViT model using torchinfo.summary() .
# # Print ViT feature extractor model summary (uncomment for full output)
# summary(vit,
# input_size=(1, 3, 224, 224),
# col_names=["input_size", "output_size", "num_params", "trainable"],
# col_width=20,
# row_settings=["var_names"])
Just like our EffNetB2 feature extractor model, our ViT model's base layers are frozen and the
output layer is customized to our needs!
Our ViT model has far more parameters than our EffNetB2 model. Perhaps this will come into
play when we compare our models across speed and performance later on.
We'll do this in the same way we did for EffNetB2 except we'll use vit_transforms to transform
our images into the same format the ViT model was trained on.
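A sketch of the DataLoader creation (mirroring the EffNetB2 version above):

# Setup ViT DataLoaders
train_dataloader_vit, test_dataloader_vit, class_names = data_setup.create_dataloaders(
    train_dir=train_dir,
    test_dir=test_dir,
    transform=vit_transforms,
    batch_size=32)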
...it's traininggggggg time (sung in the same tune as the song Closing Time).
Let's train our ViT feature extractor model for 10 epochs using our engine.train() function with
torch.optim.Adam() and a learning rate of 1e-3 as our optimizer and
torch.nn.CrossEntropyLoss() as our loss function.
We'll use our set_seeds() function before training to try and make our results as reproducible
as possible.
# Setup optimizer
optimizer = torch.optim.Adam(params=vit.parameters(),
lr=1e-3)
# Setup loss function
loss_fn = torch.nn.CrossEntropyLoss()
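And a sketch of the training call (same engine.train() assumptions as before):

# Train the ViT feature extractor for 10 epochs
set_seeds()
vit_results = engine.train(model=vit,
                           train_dataloader=train_dataloader_vit,
                           test_dataloader=test_dataloader_vit,
                           optimizer=optimizer,
                           loss_fn=loss_fn,
                           epochs=10,
                           device=device)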
Don't forget you can see what an ideal set of loss curves should look like in
04. PyTorch Custom Datasets section 8.
plot_loss_curves(vit_results)
Ohh yeah!
Those are some nice looking loss curves. Just like our EffNetB2 feature extractor model, it looks
like our ViT model might benefit from a little longer training time and perhaps some data
augmentation (to help prevent overfitting).
We can do so using the utils.save_model() function we created in 05. PyTorch Going Modular
section 5.
utils.save_model(model=vit,
target_dir="models",
                 model_name="09_pretrained_vit_feature_extractor_pizza_steak_sushi_20_percent.pth")
Hmm, how does the ViT feature extractor model size compare to our EffNetB2 model size?
We'll find this out shortly when we compare all of our models' characteristics.
We saw it in the summary output above but we'll calculate its total number of parameters.
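A sketch of the calculation:

# Count the total number of parameters in the ViT model
vit_total_params = sum(torch.numel(param) for param in vit.parameters())
vit_total_params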
85800963
Woah, that looks like a fair bit more than our EffNetB2!
Now let's create a dictionary with some important characteristics of our ViT model.
vit_stats
{'test_loss': 0.06418210905976593,
'test_acc': 0.984659090909091,
'number_of_parameters': 85800963,
'model_size (MB)': 327}
Nice! Looks like our ViT model achieves over 95% accuracy too.
Now how about we test them out doing what we'd like them to do?
We know both of our models are performing at over 95% accuracy on the test dataset, but how
fast are they?
Ideally, if we're deploying our FoodVision Mini model to a mobile device so people can take
photos of their food and identify it, we'd like the predictions to happen at real-time (~30 frames
per second).
To find out how long each of our models takes to perform inference, let's create a function
called pred_and_store() to iterate over each of the test dataset images one by one and perform
a prediction.
We'll time each of the predictions as well as store the results in a common prediction format: a
list of dictionaries (where each element in the list is a single prediction and each single
prediction is a dictionary).
We time the predictions one by one rather than by batch because when our
model is deployed, it will likely only be making a prediction on one image at a
time. As in, someone takes a photo and our model predicts on that single image.
Since we'd like to make predictions across all the images in the test set, let's first get a list of all
of the test image paths so we can iterate over them.
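A sketch of collecting the test image paths (assuming the test_dir path from section 1 and .jpg images):

from pathlib import Path

# Get all test image paths
print(f"[INFO] Finding all filepaths ending with '.jpg' in directory: {test_dir}")
test_data_paths = list(Path(test_dir).glob("*/*.jpg"))
test_data_paths[:5]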
1. Create a function that takes a list of paths, a trained PyTorch model, a series of transforms
(to prepare images), a list of target class names and a target device.
2. Create an empty list to store prediction dictionaries (we want the function to return a list of
dictionaries, one for each prediction).
3. Loop through the target input paths (steps 4-14 will happen inside the loop).
4. Create an empty dictionary for each iteration in the loop to store prediction values per
sample.
5. Get the sample path and ground truth class name (we can do this by inferring the class
from the path).
6. Start the prediction timer using Python's timeit.default_timer() .
7. Open the image using PIL.Image.open(path) .
8. Transform the image so it's capable of being used with the target model as well as add a
batch dimension and send the image to the target device.
9. Prepare the model for inference by sending it to the target device and turning on eval()
mode.
10. Turn on torch.inference_mode() and pass the target transformed image to the model and
calculate the prediction probability using torch.softmax() and the target label using
torch.argmax() .
11. Add the prediction probability and prediction class to the prediction dictionary created in
step 4. Also make sure the prediction probability is on the CPU so it can be used with non-
GPU libraries such as NumPy and pandas for later inspection.
12. End the prediction timer started in step 6 and add the time to the prediction dictionary
created in step 4.
13. See if the predicted class matches the ground truth class from step 5 and add the result to
the prediction dictionary created in step 4.
14. Append the updated prediction dictionary to the empty list of predictions created in step 2.
15. Return the list of prediction dictionaries.
Let's do it.
import pathlib
import torch

from PIL import Image
from timeit import default_timer as timer
from tqdm.auto import tqdm
from typing import List, Dict

# 1. Create a function to return a list of dictionaries with sample, truth label, prediction, prediction probability and prediction time
def pred_and_store(paths: List[pathlib.Path],
                   model: torch.nn.Module,
                   transform: torchvision.transforms,
                   class_names: List[str],
                   device: str = "cuda" if torch.cuda.is_available() else "cpu") -> List[Dict]:
    # 2. Create an empty list to store prediction dictionaries
    pred_list = []
    # 3. Loop through the target input paths
    for path in tqdm(paths):
        # 4, 5. Create an empty dictionary per sample and store the path and ground truth class name
        pred_dict = {}
        pred_dict["image_path"] = path
        class_name = path.parent.stem
        pred_dict["class_name"] = class_name
        # 6, 7. Start the prediction timer and open the image
        start_time = timer()
        img = Image.open(path)
        # 8. Transform the image, add batch dimension and put image on target device
        transformed_image = transform(img).unsqueeze(0).to(device)
        # 9. Prepare model for inference by sending it to target device and turning on eval() mode
        model.to(device)
        model.eval()
        # 10. Get the prediction probability, prediction label and prediction class
        with torch.inference_mode():
            pred_logit = model(transformed_image)
            pred_prob = torch.softmax(pred_logit, dim=1)
            pred_label = torch.argmax(pred_prob, dim=1)
            pred_class = class_names[pred_label.cpu()]
            # 11. Make sure things in the dictionary are on CPU (required for inspecting predictions later on)
            pred_dict["pred_prob"] = round(pred_prob.unsqueeze(0).max().cpu().item(), 4)
            pred_dict["pred_class"] = pred_class
            # 12. End the timer and calculate the time per prediction
            end_time = timer()
            pred_dict["time_for_pred"] = round(end_time - start_time, 4)
        # 13, 14. Check if the prediction matches the ground truth and append the dictionary to the list
        pred_dict["correct"] = class_name == pred_class
        pred_list.append(pred_dict)
    # 15. Return the list of prediction dictionaries
    return pred_list
Ho, ho!
And you know what, since our pred_and_store() is a pretty good utility function for making and
storing predictions, it could be stored to going_modular.going_modular.predictions.py for later
use. That might be an extension you'd like to try, check out 05. PyTorch Going Modular for ideas.
Let's start by using it to make predictions across the test dataset with our EffNetB2 model,
paying attention to two details:
1. We'll hard code the device parameter to use "cpu", because when we deploy our
model, we won't always have access to a "cuda" (GPU) device.
◦ Making the predictions on CPU will be a good indicator of inference speed too,
because predictions on CPU devices are generally slower than on GPU devices.
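A sketch of the call (mirroring the ViT version used later on):

# Make list of prediction dictionaries with EffNetB2 feature extractor model on test images
effnetb2_test_pred_dicts = pred_and_store(paths=test_data_paths,
                                          model=effnetb2,
                                          transform=effnetb2_transforms,
                                          class_names=class_names,
                                          device="cpu") # make predictions on CPU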
Let's inspect the first couple and see what they look like.
[{'image_path': PosixPath('data/pizza_steak_sushi_20_percent/test/
steak/831681.jpg'),
'class_name': 'steak',
'pred_prob': 0.9293,
'pred_class': 'steak',
'time_for_pred': 0.0494,
'correct': True},
{'image_path': PosixPath('data/pizza_steak_sushi_20_percent/test/
steak/3100563.jpg'),
'class_name': 'steak',
'pred_prob': 0.9534,
'pred_class': 'steak',
'time_for_pred': 0.0264,
'correct': True}]
Woohoo!
Thanks to our list of dictionaries data structure, we've got plenty of useful information we can
further inspect.
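A sketch of turning the predictions into a DataFrame (the variable names here are assumptions):

import pandas as pd

# Turn the list of prediction dictionaries into a DataFrame
effnetb2_test_pred_df = pd.DataFrame(effnetb2_test_pred_dicts)
effnetb2_test_pred_df.head()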
   image_path                                          class_name  pred_prob  pred_class  time_for_pred  correct
0  data/pizza_steak_sushi_20_percent/test/steak/8...   steak       0.9293     steak       0.0494         True
1  data/pizza_steak_sushi_20_percent/test/steak/3...   steak       0.9534     steak       0.0264         True
2  data/pizza_steak_sushi_20_percent/...               steak       0.7532     steak       0.0256         True
Beautiful!
Look how easily those prediction dictionaries turn into a structured format we can perform
analysis on.
Such as finding how many predictions our EffNetB2 model got wrong...
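For example, via value_counts() on the correct column:

effnetb2_test_pred_df.correct.value_counts()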
True 145
False 5
Name: correct, dtype: int64
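And the average time per prediction (a sketch, using the DataFrame from above):

# Find the average time per prediction on CPU
effnetb2_average_time_per_pred = round(effnetb2_test_pred_df.time_for_pred.mean(), 4)
print(f"EffNetB2 average time per prediction: {effnetb2_average_time_per_pred} seconds")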
Hmm, how does that average prediction time live up to our criteria of our model performing at
real-time (~30FPS or 0.03 seconds per prediction)?
Let's add our EffNetB2 average time per prediction to our effnetb2_stats dictionary.
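# Add EffNetB2 average prediction time to the stats dictionary
effnetb2_stats["time_per_pred_cpu"] = effnetb2_average_time_per_pred
effnetb2_stats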
{'test_loss': 0.28128674924373626,
'test_acc': 0.96875,
'number_of_parameters': 7705221,
'model_size (MB)': 29,
'time_per_pred_cpu': 0.0269}
To do so, we can use the pred_and_store() function we created above except this time we'll
pass in our vit model as well as the vit_transforms .
And we'll keep the predictions on the CPU via device="cpu" (a natural extension here would be
to test the prediction times on CPU and on GPU).
# Make list of prediction dictionaries with ViT feature extractor model on test images
vit_test_pred_dicts = pred_and_store(paths=test_data_paths,
model=vit,
transform=vit_transforms,
class_names=class_names,
device="cpu")
Predictions made!
[{'image_path': PosixPath('data/pizza_steak_sushi_20_percent/test/
steak/831681.jpg'),
'class_name': 'steak',
'pred_prob': 0.9933,
'pred_class': 'steak',
'time_for_pred': 0.1313,
'correct': True},
{'image_path': PosixPath('data/pizza_steak_sushi_20_percent/test/
steak/3100563.jpg'),
'class_name': 'steak',
'pred_prob': 0.9893,
'pred_class': 'steak',
'time_for_pred': 0.0638,
'correct': True}]
Wonderful!
And just like before, since our ViT model's predictions are in the form of a list of dictionaries, we
can easily turn them into a pandas DataFrame for further inspection.
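A sketch (mirroring the EffNetB2 DataFrame above):

# Turn the ViT prediction dictionaries into a DataFrame
vit_test_pred_df = pd.DataFrame(vit_test_pred_dicts)
vit_test_pred_df.head()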
   image_path                                          class_name  pred_prob  pred_class  time_for_pred  correct
0  data/pizza_steak_sushi_20_percent/test/steak/8...   steak       0.9933     steak       0.1313         True
1  data/pizza_steak_sushi_20_percent/test/steak/3...   steak       0.9893     steak       0.0638         True
2  data/pizza_steak_sushi_20_percent/...               steak       0.9971     steak       0.0627         True

True     148
False      2
Name: correct, dtype: int64
Woah!
Our ViT model did a little better than our EffNetB2 model in terms of correct predictions, only
two samples wrong across the whole test dataset.
As an extension you might want to visualize the ViT model's wrong predictions and see if there's
any reason why it might've got them wrong.
How about we calculate how long the ViT model took per prediction?
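# Calculate the average time per prediction for the ViT model
vit_average_time_per_pred = round(vit_test_pred_df.time_for_pred.mean(), 4)
print(f"ViT average time per prediction: {vit_average_time_per_pred} seconds")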
Well, that looks a little slower than our EffNetB2 model's average time per prediction but how
does it look in terms of our second criteria: speed?
For now, let's add the value to our vit_stats dictionary so we can compare it to our EffNetB2
model's stats.
The average time per prediction values will be highly dependent on the
hardware you make them on. For example, for the ViT model, my average time per
prediction (on the CPU) was 0.0693-0.0777 seconds on my local deep learning PC
with an Intel i9 CPU. Whereas on Google Colab, my average time per prediction
with the ViT model was 0.6766-0.7113 seconds.
vit_stats["time_per_pred_cpu"] = vit_average_time_per_pred
vit_stats
{'test_loss': 0.06418210905976593,
'test_acc': 0.984659090909091,
'number_of_parameters': 85800963,
'model_size (MB)': 327,
'time_per_pred_cpu': 0.0641}
Now let's put them head to head and compare across their different statistics.
To do so, let's turn our effnetb2_stats and vit_stats dictionaries into a pandas DataFrame.
We'll add a column to view the model names as well as convert the test accuracy to a whole
percentage rather than decimal.
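A sketch of building the comparison DataFrame:

# Turn stat dictionaries into a DataFrame
df = pd.DataFrame([effnetb2_stats, vit_stats])

# Add a column for model names and convert the test accuracy to a percentage
df["model"] = ["EffNetB2", "ViT"]
df["test_acc"] = round(df["test_acc"] * 100, 2)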
df

[Output: a DataFrame with one row per model (EffNetB2, ViT) and columns test_loss, test_acc, number_of_parameters, model_size (MB), time_per_pred_cpu and model]
Wonderful!
It seems our models are quite close in terms of overall test accuracy, but how do they look
across the other fields?
One way to find out would be to divide the ViT model statistics by the EffNetB2 model statistics
to find the different ratios between the models.
pd.DataFrame(data=(df.set_index("model").loc["ViT"] / df.set_index("model").loc["EffNetB2"]),
             columns=["ViT to EffNetB2 ratios"]).T
[Output: a one-row DataFrame of ViT to EffNetB2 ratios across test_loss, test_acc, number_of_parameters, model_size (MB) and time_per_pred_cpu]
It seems our ViT model outperforms the EffNetB2 model across the performance metrics (test
loss, where lower is better, and test accuracy, where higher is better) but at the expense of
having many more parameters, a larger model size and a longer time per prediction.
These tradeoffs might be okay if we had unlimited compute power, but for our use case of deploying the FoodVision
Mini model to a smaller device (e.g. a mobile phone), we'd likely start out with the EffNetB2
model for faster predictions at a slightly reduced performance but a dramatically smaller size.
However, our EffNetB2 model performs predictions faster and has a much smaller model size.
1. Create a scatter plot from the comparison DataFrame to compare EffNetB2 and ViT
time_per_pred_cpu and test_acc values.
2. Add titles and labels respective of the data and customize the fontsize for aesthetics.
3. Annotate the samples on the scatter plot from step 1 with their appropriate labels (the
model names).
4. Create a legend based on the model sizes ( model_size (MB) ).
# 1. Create a scatter plot from the model comparison DataFrame
import matplotlib.pyplot as plt

fig, ax = plt.subplots(figsize=(12, 8))
scatter = ax.scatter(data=df,
                     x="time_per_pred_cpu",
                     y="test_acc",
                     c=["blue", "orange"], # what colours to use?
                     s="model_size (MB)")  # size the dots by the model sizes
Woah!
The plot really visualizes the speed vs. performance tradeoff, in other words, when you have a
larger, better performing deep model (like our ViT model), it generally takes longer to perform
inference (higher latency).
There are exceptions to the rule and new research is being published all the time to help make
larger models perform faster.
And it can be tempting to just deploy the best performing model but it's also good to take into
consideration where the model is going to be performing.
In our case, the differences between our model's performance levels (on the test loss and test
accuracy) aren't too extreme.
But since we'd like to put an emphasis on speed to begin with, we're going to stick with
deploying EffNetB2 since it's faster and has a much smaller footprint.
Prediction times will be different across different hardware types (e.g. Intel
i9 vs Google Colab CPU vs GPU) so it's important to think about and test where
your model is going to end up. Asking questions like "where is the model going to
be run?" or "what is the ideal scenario for running the model?" and then running
experiments to try and provide answers on your way to deployment is very helpful.
There are several ways to deploy a machine learning model each with speci�c use cases (as
discussed above).
We're going to be focused on perhaps the quickest and certainly one of the most fun ways to get
a machine learning model into other people's hands: building a demo with Gradio.
What's Gradio?
Gradio is the fastest way to demo your machine learning model with a friendly
web interface so that anyone can use it, anywhere!
Because metrics on the test set look nice but you never really know how your model performs
until you use it in the wild.
We'll start by importing Gradio with the common alias gr and if it's not present, we'll install it.
# Import/install Gradio
try:
import gradio as gr
except:
!pip -q install gradio
import gradio as gr
Gradio ready!
The overall premise of Gradio is very similar to what we've been repeating throughout the
course.
In our case, for FoodVision Mini, our inputs are images of food, our ML model is EffNetB2 and
our outputs are classes of food (pizza, steak or sushi).
Though the concepts of inputs and outputs can be bridged to almost any other kind of ML
problem.
• Images
• Text
• Video
• Tabular data
• Audio
• Numbers
• & more
And the ML model you build will depend on your inputs and outputs.
Gradio provides a very helpful Interface class to easily create an inputs -> model/function ->
outputs work�ow where the inputs and outputs could be almost anything you want. For
example, you might input Tweets (text) to see if they're about machine learning or not or input a
text prompt to generate images.
Gradio has a vast number of possible inputs and outputs options known
as "Components" from images to text to numbers to audio to videos and more.
You can see all of these in the Gradio Components documentation.
We created a function earlier called pred_and_store() to make predictions with a given model
across a list of target �les and store them in a list of dictionaries.
How about we create a similar function but this time focusing on making a prediction on a
single image with our EffNetB2 model?
More specifically, we want a function that takes an image as input, preprocesses (transforms) it,
makes a prediction with EffNetB2 and then returns the prediction (pred or pred label for short)
as well as the prediction probability (pred prob).
And while we're here, let's return the time it took to do so too:
input: image -> transform -> predict with EffNetB2 -> output: pred, pred prob, time taken
First, let's make sure our EffNetB2 model is on the CPU (since we're sticking with CPU-only
predictions, however you could change this if you have access to a GPU).
device(type='cpu')
And now let's create a function called predict() to replicate the work�ow above.
    # Create a prediction label and prediction probability dictionary for each prediction class (this is the required format for Gradio's output parameter)
    pred_labels_and_probs = {class_names[i]: float(pred_probs[0][i]) for i in range(len(class_names))}
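The two lines above are the surviving fragment of that function; a full sketch might look like this (assuming effnetb2, effnetb2_transforms and class_names from earlier in the notebook):

from typing import Tuple, Dict
from timeit import default_timer as timer

def predict(img) -> Tuple[Dict, float]:
    """Transforms and performs a prediction on img and returns prediction and time taken."""
    # Start the timer
    start_time = timer()

    # Transform the target image and add a batch dimension
    img = effnetb2_transforms(img).unsqueeze(0)

    # Put model into evaluation mode and turn on inference mode
    effnetb2.eval()
    with torch.inference_mode():
        # Pass the transformed image through the model and turn the prediction logits into prediction probabilities
        pred_probs = torch.softmax(effnetb2(img), dim=1)

    # Create a prediction label and prediction probability dictionary for each prediction class
    pred_labels_and_probs = {class_names[i]: float(pred_probs[0][i]) for i in range(len(class_names))}

    # Calculate the prediction time
    pred_time = round(timer() - start_time, 5)

    # Return the prediction dictionary and prediction time
    return pred_labels_and_probs, pred_time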
Beautiful!
Now let's see our function in action by performing a prediction on a random image from the test
dataset.
We'll start by getting a list of all the image paths from the test directory and then randomly
selecting one.
import random
from PIL import Image
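# A sketch of picking a random test image and predicting on it (assumes test_data_paths from section 5)
random_image_path = random.sample(test_data_paths, k=1)[0]
image = Image.open(random_image_path)
print(f"[INFO] Predicting on image at path: {random_image_path}\n")

# Predict on the target image and print out the outputs
pred_dict, pred_time = predict(img=image)
print(f"Prediction label and probability dictionary: \n{pred_dict}")
print(f"Prediction time: {pred_time} seconds")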
Nice!
Running the cell above a few times we can see different prediction probabilities for each label
from our EffNetB2 model as well as the time it took per prediction.
But before we create the demo, let's create one more thing: a list of examples.
So let's create a list of lists containing random filepaths to our test images.
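A sketch of creating the example list (a list of lists, the format Gradio's examples parameter expects):

# Create a list of example inputs for our Gradio demo
example_list = [[str(filepath)] for filepath in random.sample(test_data_paths, k=3)]
example_list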
[['data/pizza_steak_sushi_20_percent/test/sushi/804460.jpg'],
['data/pizza_steak_sushi_20_percent/test/steak/746921.jpg'],
['data/pizza_steak_sushi_20_percent/test/steak/2117351.jpg']]
Perfect!
Our Gradio demo will showcase these as example inputs to our demo so people can try it out
and see what it does without uploading any of their own data.
input: image -> transform -> predict with EffNetB2 -> output: pred, pred prob, time taken
• fn - a Python function to map inputs to outputs , in our case, we'll use our predict()
function.
• inputs - the input to our interface, such as an image using gradio.Image() or "image" .
• outputs - the output of our interface once the inputs have gone through the fn , such as
a label using gradio.Label() (for our model's predicted labels) or number using
gradio.Number() (for our model's prediction time).
◦ Gradio comes with many in-built inputs and outputs options known as
"Components".
Once we've created our demo instance of gr.Interface() , we can bring it to life using
gradio.Interface().launch() or demo.launch() command.
Easy!
import gradio as gr
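# A sketch of the FoodVision Mini Gradio demo (the title/description wording is illustrative)
title = "FoodVision Mini"
description = "An EfficientNetB2 feature extractor computer vision model to classify images of food as pizza, steak or sushi."

# Create the Gradio demo
demo = gr.Interface(fn=predict, # mapping function from input to output
                    inputs=gr.Image(type="pil"), # what are the inputs?
                    outputs=[gr.Label(num_top_classes=3, label="Predictions"), # what are the outputs?
                             gr.Number(label="Prediction time (s)")],
                    examples=example_list, # example inputs created earlier
                    title=title,
                    description=description)

# Launch the demo (share=True creates a temporary shareable link)
demo.launch(debug=False,
            share=True)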
This share link expires in 72 hours. For free permanent hosting, check out Spaces: https://huggingface.co/spaces
(<gradio.routes.App at 0x7f122dd0f0d0>,
 'http://127.0.0.1:7860/',
 'https://27541.gradio.app')
FoodVision Mini Gradio demo running in Google Colab and in the browser (the link when running
from Google Colab only lasts for 72 hours). You can see the permanent live demo on Hugging
Face Spaces.
FoodVision Mini has officially come to life in an interface someone could use and try out.
If you set the parameter share=True in the launch() method, Gradio also provides you with a
shareable link such as https://123XYZ.gradio.app (this link is an example only and likely
expired) which is valid for 72 hours.
The link provides a proxy back to the Gradio interface you launched.
For more permanent hosting, you can upload your Gradio app to Hugging Face Spaces or
anywhere that runs Python code.
We've seen our FoodVision Mini model come to life through a Gradio demo.
Well, we could use the provided Gradio link, however, the shared link only lasts for 72-hours.
To make our FoodVision Mini demo more permanent, we can package it into an app and upload
it to Hugging Face Spaces.
Hugging Face Spaces is a resource that allows you to host and share machine learning apps.
Building a demo is one of the best ways to showcase and test what you've done.
If having a good GitHub portfolio showcases your coding abilities, having a good Hugging Face
portfolio can showcase your machine learning abilities.
There are many other places we could upload and host our Gradio app, such
as Google Cloud, AWS (Amazon Web Services) or other cloud vendors, however,
we're going to use Hugging Face Spaces due to the ease of use and wide
adoption by the machine learning community.
To upload our demo Gradio app, we'll want to put everything relating to it into a single directory.
For example, our demo might live at the path demos/foodvision_mini/ with the file structure:
demos/
└── foodvision_mini/
├── 09_pretrained_effnetb2_feature_extractor_pizza_steak_sushi_20_percent.pth
├── app.py
├── examples/
│ ├── example_1.jpg
│ ├── example_2.jpg
│ └── example_3.jpg
├── model.py
└── requirements.txt
Where:
• 09_pretrained_effnetb2_feature_extractor_pizza_steak_sushi_20_percent.pth is our
trained PyTorch model file.
• app.py contains our Gradio app (similar to the code that launched the app).
◦ app.py is the default filename used for Hugging Face Spaces; if you deploy
your app there, Spaces will by default look for a file called app.py to run. This is
changeable in settings.
The quicker we can run smaller experiments, the better our bigger ones will be.
We're going to work towards recreating the structure above but you can see a live demo app
running on Hugging Face Spaces as well as the file structure:
8.3 Creating a demos folder to store our FoodVision Mini app files
To begin, let's first create a demos/ directory to store all of our FoodVision Mini app files.
import shutil
from pathlib import Path
# Create FoodVision Mini demo path
foodvision_mini_demo_path = Path("demos/foodvision_mini/")

# Remove files that might already exist there and create a new directory
if foodvision_mini_demo_path.exists():
    shutil.rmtree(foodvision_mini_demo_path)

# Create the demo directory
foodvision_mini_demo_path.mkdir(parents=True,
                                exist_ok=True)
8.4 Creating a folder of example images to use with our FoodVision Mini
demo
Now we've got a directory to store our FoodVision Mini demo files, let's add some examples to
it.
To do so we'll:
import shutil
from pathlib import Path
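# A sketch of copying a few test images across as demo examples (the chosen images and filenames will vary)
# 1. Create an examples/ directory inside the demo directory
foodvision_mini_examples_path = foodvision_mini_demo_path / "examples"
foodvision_mini_examples_path.mkdir(parents=True, exist_ok=True)

# 2. Pick three random test images (assumes test_data_paths and random from earlier) and copy them across
import random
for example_path in random.sample(test_data_paths, k=3):
    destination = foodvision_mini_examples_path / example_path.name
    print(f"[INFO] Copying {example_path} to {destination}")
    shutil.copy2(src=example_path, dst=destination)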
Now to verify our examples are present, let's list the contents of our demos/foodvision_mini/examples/
directory with os.listdir() and then format the filepaths into a list of lists (so it's
compatible with Gradio's gradio.Interface() examples parameter).
import os
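# Get the example filepaths in a list of lists (the format Gradio's examples parameter expects)
example_list = [["examples/" + example] for example in os.listdir(foodvision_mini_demo_path / "examples")]
example_list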
8.5 Moving our trained EffNetB2 model to our FoodVision Mini demo directory
We previously saved our FoodVision Mini EffNetB2 feature extractor model under
models/09_pretrained_effnetb2_feature_extractor_pizza_steak_sushi_20_percent.pth .
And rather than double up on saved model files, let's move our model to our demos/foodvision_mini
directory.
We can do so using Python's shutil.move() method and passing in the src (the source path of the
target file) and dst (the destination path the target file should be moved to) parameters.
import shutil
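# A sketch of moving the saved model into the demo directory (shutil.move errors if the source file doesn't exist)
shutil.move(src="models/09_pretrained_effnetb2_feature_extractor_pizza_steak_sushi_20_percent.pth",
            dst="demos/foodvision_mini/09_pretrained_effnetb2_feature_extractor_pizza_steak_sushi_20_percent.pth")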
To do this in a modular fashion we'll create a script called model.py which contains our
create_effnetb2_model() function we created in section 3.1: Creating a function to make an
EffNetB2 feature extractor.
That way we can import the function in another script (see app.py below) and then use it to
create our EffNetB2 model instance as well as get its appropriate transforms.
Just like in 05. PyTorch Going Modular, we'll use the %%writefile path/to/file magic
command to turn a cell of code into a file.
%%writefile demos/foodvision_mini/model.py
import torch
import torchvision

from torch import nn
def create_effnetb2_model(num_classes:int=3,
seed:int=42):
"""Creates an EfficientNetB2 feature extractor model and transforms.
Args:
num_classes (int, optional): number of classes in the classifier head.
Defaults to 3.
seed (int, optional): random seed value. Defaults to 42.
Returns:
model (torch.nn.Module): EffNetB2 feature extractor model.
transforms (torchvision.transforms): EffNetB2 image transforms.
"""
# Create EffNetB2 pretrained weights, transforms and model
weights = torchvision.models.EfficientNet_B2_Weights.DEFAULT
transforms = weights.transforms()
model = torchvision.models.efficientnet_b2(weights=weights)
    # Freeze all layers in the base model
    for param in model.parameters():
        param.requires_grad = False

    # Change classifier head with random seed for reproducibility
    torch.manual_seed(seed)
    model.classifier = nn.Sequential(
        nn.Dropout(p=0.3, inplace=True),
        nn.Linear(in_features=1408, out_features=num_classes),
    )

    return model, transforms
Writing demos/foodvision_mini/model.py
8.7 Turning our FoodVision Mini Gradio app into a Python script ( app.py )
We've now got a model.py script as well as a path to a saved model state_dict that we can
load in.
We call it app.py because by default when you create a Hugging Face Space, it looks for a file
called app.py to run and host (though you can change this in settings).
Our app.py script will put together all of the pieces of the puzzle to create our Gradio demo and
will have four main parts:
◦ We'll have to create the example list on the fly via the examples parameter. We
can do so by creating a list of the files inside the examples/ directory with:
[["examples/" + example] for example in os.listdir("examples")] .
4. Gradio app: this is where the main logic of our demo will live. We'll create a
gradio.Interface() instance called demo to put together our inputs, predict() function
and outputs. And we'll finish the script by calling demo.launch() to launch our FoodVision
Mini demo!
%%writefile demos/foodvision_mini/app.py
### 1. Imports and class names setup ###
import gradio as gr
import os
import torch
    # Create a prediction label and prediction probability dictionary for each prediction class (this is the required format for Gradio's output parameter)
    pred_labels_and_probs = {class_names[i]: float(pred_probs[0][i]) for i in range(len(class_names))}
Writing demos/foodvision_mini/app.py
This will be a text file containing all of the required dependencies for our demo.
When we deploy our demo app to Hugging Face Spaces, it will search through this file and install
the dependencies we define so our app can run.
1. torch==1.12.0
2. torchvision==0.13.0
3. gradio==3.1.4
Defining the version number is not 100% required, but we will for now so that if any breaking updates
occur in future releases, our app still runs (PS: if you find any errors, feel free to post on the
course GitHub Issues).
%%writefile demos/foodvision_mini/requirements.txt
torch==1.12.0
torchvision==0.13.0
gradio==3.1.4
Writing demos/foodvision_mini/requirements.txt
Nice!
We've officially got all the files we need to deploy our FoodVision Mini demo!
There are two main options for uploading to a Hugging Face Space (also called a Hugging Face
Repository, similar to a git repository):
◦ You can also use the huggingface_hub library to interact with Hugging Face,
this would be a good extension to the above two options.
Feel free to read the documentation on both options but we're going to go with option two.
To host anything on Hugging Face, you will need to sign up for a free
Hugging Face account.
ls stands for "list" and the ! means we want to execute the command at the shell level.
!ls demos/foodvision_mini
09_pretrained_effnetb2_feature_extractor_pizza_steak_sushi_20_percent.pth
app.py
examples
model.py
requirements.txt
To begin uploading our files to Hugging Face, let's now download them from Google Colab (or
wherever you're running this notebook).
To do so, we'll first compress the files into a single zip file via the command:
Where:
• zip stands for "zip" as in "please zip together the files in the following directory".
• -r stands for "recursive" as in "go through all of the files in the target directory".
• ../foodvision_mini.zip is the target path we'd like the zipped files to end up at.
• * stands for "all the files in the current directory".
• -x stands for "exclude these files".
We can download our zip file from Google Colab using google.colab.files.download("demos/foodvision_mini.zip")
(we'll put this inside a try and except block just in case we're not
running the code inside Google Colab, and if so, we'll print a message saying to manually
download the files).
# Change into and then zip the foodvision_mini folder but exclude certain files
!cd demos/foodvision_mini && zip -r ../foodvision_mini.zip * -x "*.pyc" "*.ipynb" "*__pycache__*"
# Download the zipped FoodVision Mini app (if running in Google Colab)
try:
from google.colab import files
files.download("demos/foodvision_mini.zip")
except:
print("Not running in Google Colab, can't use google.colab.files.download(), please manually
Woohoo!
If you're running this notebook in Google Colab, you should see a file start to download in your
browser.
Otherwise, you can find the foodvision_mini.zip file (and more) on the course GitHub under
the demos/ directory.
If you download the foodvision_mini.zip file, you can test it locally by:
1. Unzipping the file.
2. Installing the requirements with pip install -r requirements.txt .
◦ This step may take 5-10 minutes depending on your internet connection. And if
you're facing errors, you may need to upgrade pip first: pip install --upgrade pip .
3. Running the app with python app.py .
This should result in a Gradio demo just like the one we built above running locally on your
machine at a URL such as http://127.0.0.1:7860/ .
If you run the app locally and you notice a flagged/ directory appear, it
contains samples that have been "flagged".
For example, if someone tries the demo and the model produces an incorrect
result, the sample can be "flagged" and reviewed later.
Running the demo locally is great, but the real magic of a machine learning demo is to show it to other people and allow them to use it.
To do so, we're going to upload our FoodVision Mini demo to Hugging Face.
The following series of steps uses a Git (a file tracking system) workflow.
For more on how Git works, I'd recommend going through the Git and GitHub for
Beginners tutorial on freeCodeCamp.
3. Give the Space a name, for example, mine is called mrdbourke/foodvision_mini , you can
see it here: https://huggingface.co/spaces/mrdbourke/foodvision_mini
4. Select a license (I used MIT).
5. Select Gradio as the Space SDK (software development kit).
◦ You can use other options such as Streamlit but since our app is built with
Gradio, we'll stick with that.
6. Choose whether your Space is public or private (I selected public since I'd like my
Space to be available to others).
7. Click "Create Space".
8. Clone the repo locally by running something like: git clone https://huggingface.co/spaces/[YOUR_USERNAME]/[YOUR_SPACE_NAME] in terminal or command prompt.
◦ You can also add files by uploading them under the "Files and versions" tab.
9. Copy/move the contents of the downloaded foodvision_mini folder to the cloned repo
folder.
10. To upload and track larger files (e.g. files over 10MB or in our case, our PyTorch model file)
you'll need to install Git LFS (which stands for "git large file storage").
11. After you've installed Git LFS, you can activate it by running git lfs install .
12. In the foodvision_mini directory, track the files over 10MB with Git LFS with git lfs track "*.file_extension" .
13. Track .gitattributes (automatically created when cloning from Hugging Face, this file will
help ensure our larger files are tracked with Git LFS). You can see an example
.gitattributes file on the FoodVision Mini Hugging Face Space.
14. Add the rest of the foodvision_mini app files and commit them with:
◦ git add *
◦ git commit -m "first commit"
◦ git push
15. Wait 3-5 minutes for the build to happen (future builds are faster) and your app to become
live!
If everything worked, you should see a live running example of our FoodVision Mini Gradio demo
like the one here: https://huggingface.co/spaces/mrdbourke/foodvision_mini
And we can even embed our FoodVision Mini Gradio demo into our notebook as an iframe with
IPython.display.IFrame and a link to our space in the format https://hf.space/embed/
[YOUR_USERNAME]/[YOUR_SPACE_NAME]/+ .
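For example, a minimal sketch of embedding the demo (the width and height values here are just illustrative):

from IPython.display import IFrame

# Embed the FoodVision Mini Gradio demo as an iframe in the notebook
IFrame(src="https://hf.space/embed/mrdbourke/foodvision_mini/+", width=900, height=750)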
And now we've seen it working in a live demo, how about we step things up a notch?
How?
FoodVision Big!
Since FoodVision Mini is trained on pizza, steak and sushi images from the Food101 dataset
(101 classes of food x 1000 images each), how about we make FoodVision Big by training a
model on all 101 classes!
From pizza, steak, sushi to pizza, steak, sushi, hot dog, apple pie, carrot cake, chocolate cake,
french fries, garlic bread, ramen, nachos, tacos and more!
How?
Well, we've got all the steps in place, all we have to do is alter our EffNetB2 model slightly as
well as prepare a different dataset.
To finish Milestone Project 3, let's recreate a Gradio demo similar to FoodVision Mini (three
classes) but for FoodVision Big (101 classes).
FoodVision Mini works with three food classes: pizza, steak and sushi. And FoodVision Big
steps it up a notch to work across 101 food classes: all of the classes in the Food101 dataset.
We can create an EffNetB2 feature extractor for Food101 by using our create_effnetb2_model()
function we created above, in section 3.1, and passing it the parameter num_classes=101 (since
Food101 has 101 classes).
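A minimal sketch of that step (assuming the create_effnetb2_model() helper from section 3.1 is available in this notebook):

# Create an EffNetB2 feature extractor and transforms with 101 output classes for Food101
effnetb2_food101, effnetb2_transforms = create_effnetb2_model(num_classes=101)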
Beautiful!
# # Get a summary of EffNetB2 feature extractor for Food101 with 101 output classes (uncomment for full output)
# summary(effnetb2_food101,
# input_size=(1, 3, 224, 224),
# col_names=["input_size", "output_size", "num_params", "trainable"],
# col_width=20,
# row_settings=["var_names"])
Nice!
See how, just like our EffNetB2 model for FoodVision Mini, the base layers are frozen (these are
pretrained on ImageNet) and the outer layers (the classifier layers) are trainable with an
output shape of [batch_size, 101] ( 101 for the 101 classes in Food101).
Now since we're going to be dealing with a fair bit more data than usual, how about we add a
little data augmentation to our transforms ( effnetb2_transforms ) to augment the training data.
# Create Food101 training data transforms (only perform data augmentation on the training images)
food101_train_transforms = torchvision.transforms.Compose([
    torchvision.transforms.TrivialAugmentWide(),
    effnetb2_transforms,
])
Epic!
Now let's compare food101_train_transforms (for the training data) and effnetb2_transforms
(for the testing/inference data).
print(f"Training transforms:\n{food101_train_transforms}\n")
print(f"Testing transforms:\n{effnetb2_transforms}")
Training transforms:
Compose(
TrivialAugmentWide(num_magnitude_bins=31, interpolation=InterpolationMode.NEAREST, fill=
ImageClassification(
crop_size=[288]
resize_size=[288]
mean=[0.485, 0.456, 0.406]
std=[0.229, 0.224, 0.225]
interpolation=InterpolationMode.BICUBIC
)
)
Testing transforms:
ImageClassification(
crop_size=[288]
resize_size=[288]
mean=[0.485, 0.456, 0.406]
std=[0.229, 0.224, 0.225]
interpolation=InterpolationMode.BICUBIC
)
Then we'll download and transform the training and testing dataset splits using
food101_train_transforms and effnetb2_transforms to transform each dataset respectively.
If you're using Google Colab, the cell below will take ~3-5 minutes to fully
run and download the Food101 images from PyTorch.
This is because there are over 100,000 images being downloaded (101 classes x
1,000 images per class). If you restart your Google Colab runtime and come back
to this cell, the images will have to be re-downloaded. Alternatively, if you're running this
notebook locally, the images will be cached and stored in the directory specified
by the root parameter of torchvision.datasets.Food101() .
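Here's a sketch of the download step (assuming a data/ directory; torchvision.datasets.Food101 handles downloading each split):

from pathlib import Path
import torchvision

# Setup data directory
data_dir = Path("data")

# Get Food101 training data (with data augmentation transforms)
train_data = torchvision.datasets.Food101(root=data_dir,
                                          split="train",
                                          transform=food101_train_transforms,
                                          download=True)

# Get Food101 testing data (no data augmentation, just the EffNetB2 transforms)
test_data = torchvision.datasets.Food101(root=data_dir,
                                         split="test",
                                         transform=effnetb2_transforms,
                                         download=True)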
Data downloaded!
Now we can get a list of all the class names using train_data.classes .
['apple_pie',
'baby_back_ribs',
'baklava',
'beef_carpaccio',
'beef_tartare',
'beet_salad',
'beignets',
'bibimbap',
'bread_pudding',
'breakfast_burrito']
Ho ho! Those are some delicious sounding foods (although I've never heard of "beignets"...
update: after a quick Google search, beignets also look delicious).
You can see a full list of the Food101 class names on the course GitHub under extras/food101_class_names.txt .
We don't need to create another subset of the Food101 dataset; we could train and evaluate a
model across the whole 101,000 images.
But to keep training fast, let's create a 20% split of the training and test datasets.
Our goal will be to see if we can beat the original Food101 paper's best results with only 20% of
the data.
| Notebook(s) | Project | Dataset | Number of classes | Training images | Testing images |
| --- | --- | --- | --- | --- | --- |
| 04, 05, 06, 07, 08 | FoodVision Mini (10% data) | Food101 custom split | 3 (pizza, steak, sushi) | 225 | 75 |
| 07, 08, 09 | FoodVision Mini (20% data) | Food101 custom split | 3 (pizza, steak, sushi) | 450 | 150 |
| 09 | FoodVision Big (20% data) | Food101 custom split | 101 (all Food101 classes) | 15150 | 5050 |
Just like our model size has slowly increased over time, so has the size of the dataset we've been
using for experiments.
To truly beat the original Food101 paper's results with 20% of the data, we'd
have to train a model on 20% of the training data and then evaluate our model on
the whole test set rather than the split we created. I'll leave this as an extension
exercise for you to try. I'd also encourage you to try training a model on the entire
Food101 training dataset.
To make our FoodVision Big (20% data) split, let's create a function called split_dataset() to
split a given dataset into certain proportions.
We can use torch.utils.data.random_split() to create splits of given sizes using the lengths
parameter.
The lengths parameter accepts a list of desired split lengths where the total of the list must
equal the overall length of the dataset.
For example, with a dataset of size 100, you could pass in lengths=[20, 80] to receive a 20%
and 80% split.
We'll want our function to return two splits, one with the target length (e.g. 20% of the training
data) and the other with the remaining length (e.g. the remaining 80% of the training data).
def split_dataset(dataset: torchvision.datasets, split_size: float = 0.2, seed: int = 42):
    """Randomly splits a given dataset into two proportions based on split_size and seed.

    Args:
        dataset (torchvision.datasets): A PyTorch Dataset, typically one from torchvision.datasets.
        split_size (float, optional): How much of the dataset should be split?
            E.g. split_size=0.2 means there will be a 20% split and an 80% split. Defaults to 0.2.
        seed (int, optional): Seed for random generator. Defaults to 42.

    Returns:
        tuple: (random_split_1, random_split_2) where random_split_1 is of size split_size*len(dataset) and
            random_split_2 is of size (1-split_size)*len(dataset).
    """
    # Create split lengths based on original dataset length
    length_1 = int(len(dataset) * split_size)  # desired length
    length_2 = len(dataset) - length_1  # remaining length

    # Print out info
    print(f"[INFO] Splitting dataset of length {len(dataset)} into splits of size: {length_1} ({int(split_size*100)}%), {length_2} ({int((1-split_size)*100)}%)")

    # Create the splits with the given random seed (for reproducibility)
    random_split_1, random_split_2 = torch.utils.data.random_split(dataset,
                                                                   lengths=[length_1, length_2],
                                                                   generator=torch.manual_seed(seed))
    return random_split_1, random_split_2
Now let's test it out by creating a 20% training and testing dataset split of Food101.
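A quick sketch of those calls (using the train_data and test_data datasets from above, keeping only the 20% portion of each and discarding the rest):

# Create a 20% split of the Food101 training data
train_data_food101_20_percent, _ = split_dataset(dataset=train_data,
                                                 split_size=0.2)

# Create a 20% split of the Food101 testing data
test_data_food101_20_percent, _ = split_dataset(dataset=test_data,
                                                split_size=0.2)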
len(train_data_food101_20_percent), len(test_data_food101_20_percent)
[INFO] Splitting dataset of length 75750 into splits of size: 15150 (20%), 60600 (80%)
[INFO] Splitting dataset of length 25250 into splits of size: 5050 (20%), 20200 (80%)
(15150, 5050)
Excellent!
Now let's turn our Food101 20% dataset splits into DataLoader 's using
torch.utils.data.DataLoader() .
We'll set shuffle=True for the training data only and the batch size to 32 for both datasets.
And we'll set num_workers to 2 if the machine has 4 or fewer CPUs, otherwise 4 (the value of
num_workers is very experimental and will depend on the hardware you're using; there's an active
discussion thread about this on the PyTorch forums).
import os
import torch
BATCH_SIZE = 32
NUM_WORKERS = 2 if os.cpu_count() <= 4 else 4 # this value is very experimental and will depend on the hardware you're using
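And here's a sketch of the DataLoader creation itself (assuming the 20% dataset splits from above):

# Turn the Food101 20% splits into DataLoaders
train_dataloader_food101_20_percent = torch.utils.data.DataLoader(train_data_food101_20_percent,
                                                                  batch_size=BATCH_SIZE,
                                                                  shuffle=True,  # shuffle the training data
                                                                  num_workers=NUM_WORKERS)

test_dataloader_food101_20_percent = torch.utils.data.DataLoader(test_data_food101_20_percent,
                                                                 batch_size=BATCH_SIZE,
                                                                 shuffle=False,  # don't shuffle the test data
                                                                 num_workers=NUM_WORKERS)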
And because we've got so many classes, we'll also set up a loss function using
torch.nn.CrossEntropyLoss() with label_smoothing=0.1 , in line with torchvision 's state-of-the-art
training recipe.
What's label smoothing?
Label smoothing is a regularization technique (regularization is the
process of preventing overfitting) that reduces the value a model gives to any one label and
spreads it across the other labels.
In essence, rather than a model getting too confident on a single label, label smoothing gives a
non-zero value to other labels to help aid in generalization.
For example, if a model without label smoothing had outputs for 5 classes like [0, 0, 0.99, 0.01, 0],
with label smoothing these might become something like [0.02, 0.02, 0.92, 0.02, 0.02].
The model is still confident in its prediction of class 3, but giving small values to the other labels
forces the model to at least consider other options.
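In code, label smoothing is a single argument to torch.nn.CrossEntropyLoss() (a quick sketch of the loss function setup):

# Setup loss function with label smoothing
loss_fn = torch.nn.CrossEntropyLoss(label_smoothing=0.1)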
Finally, to keep things quick, we'll train our model for five epochs using the engine.train()
function we created in 05. PyTorch Going Modular section 4 with the goal of beating the original
Food101 paper's result of 56.4% accuracy on the test set.
Running the cell below will take ~15-20 minutes to run on Google Colab.
This is because it's training the biggest model with the largest amount of data
we've used so far (15,150 training images, 5050 testing images). And it's a reason
we decided to split 20% of the full Food101 dataset off before (so training didn't
take over an hour).
# Setup optimizer
optimizer = torch.optim.Adam(params=effnetb2_food101.parameters(),
                             lr=1e-3)

# Want to beat original Food101 paper with 20% of data, need 56.4%+ acc on test dataset
set_seeds()
effnetb2_food101_results = engine.train(model=effnetb2_food101,
                                        train_dataloader=train_dataloader_food101_20_percent,
                                        test_dataloader=test_dataloader_food101_20_percent,
                                        optimizer=optimizer,
                                        loss_fn=loss_fn,
                                        epochs=5,
                                        device=device)
Woohoo!!!!
Looks like we beat the original Food101 paper's result of 56.4% accuracy with only 20% of the
training data (though we only evaluated on 20% of the testing data too; to fully replicate the
results, we could evaluate on 100% of the testing data).
Nice!!!
It looks like our regularization techniques (data augmentation and label smoothing) helped
prevent our model from overfitting. The training loss is still higher than the test loss, which
indicates our model has a bit more capacity to learn and could improve with further training.
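Let's save our FoodVision Big model so we can reuse it for the demo. A minimal sketch of the saving step (assuming a models/ directory and the filename our demo folder uses later):

from pathlib import Path

# Create a models directory (if it doesn't exist) and a save path
model_dir = Path("models")
model_dir.mkdir(parents=True, exist_ok=True)
effnetb2_food101_model_path = model_dir / "09_pretrained_effnetb2_feature_extractor_food101_20_percent.pth"

# Save the FoodVision Big model's state_dict()
torch.save(obj=effnetb2_food101.state_dict(), f=effnetb2_food101_model_path)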
Model saved!
Before we move on, let's make sure we can load it back in.
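A quick sketch of the loading step and a file size check (assuming the create_effnetb2_model() helper and the save path from the saving sketch above):

# Create a fresh instance of the EffNetB2 feature extractor with 101 output classes
loaded_effnetb2_food101, _ = create_effnetb2_model(num_classes=101)

# Load in the saved state_dict()
loaded_effnetb2_food101.load_state_dict(torch.load(effnetb2_food101_model_path))

# Check the saved model's file size in megabytes
model_size_mb = effnetb2_food101_model_path.stat().st_size // (1024 * 1024)
print(f"FoodVision Big model size: {model_size_mb} MB")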
Hmm, it looks like the model size stayed largely the same (30 MB for FoodVision Big and 29 MB
for FoodVision Mini) despite the large increase in the number of classes.
This is because all the extra parameters for FoodVision Big are only in the last layer (the
classifier head).
All of the base layers are the same between FoodVision Big and FoodVision Mini.
Going back up and comparing the model summaries will give more details.
And instead of letting our model live in a folder all its life, let's deploy it!
We'll deploy our FoodVision Big model in the same way we deployed our FoodVision Mini model,
as a Gradio demo on Hugging Face Spaces.
To begin, let's create a demos/foodvision_big/ directory to store our FoodVision Big demo files
as well as a demos/foodvision_big/examples directory to hold an example image to test the
demo with.
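A minimal sketch of creating those directories with pathlib:

from pathlib import Path

# Create FoodVision Big demo path
foodvision_big_demo_path = Path("demos/foodvision_big/")

# Make the demo directory along with an examples/ subdirectory
(foodvision_big_demo_path / "examples").mkdir(parents=True, exist_ok=True)

In the end, the demo directory will have the following structure: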
demos/
foodvision_big/
09_pretrained_effnetb2_feature_extractor_food101_20_percent.pth
app.py
class_names.txt
examples/
example_1.jpg
model.py
requirements.txt
This mirrors the file structure of our FoodVision Mini demo, with the addition of a class_names.txt file to store the 101 Food101 class names.
For our example image, we're going to use the faithful pizza-dad image (a photo of my dad
eating pizza).
So let's download it from the course GitHub via the !wget command and then we can move it to
demos/foodvision_big/examples with the !mv command (short for "move").
While we're here we'll move our trained Food101 EffNetB2 model from
models/09_pretrained_effnetb2_feature_extractor_food101_20_percent.pth to demos/
foodvision_big as well.
# Move trained model to FoodVision Big demo folder (will error if model is already moved)
!mv models/09_pretrained_effnetb2_feature_extractor_food101_20_percent.pth demos/foodvision_big
Our FoodVision Big demo also needs a list of class names. We'll just remind ourselves what they look like first by checking out food101_class_names .
['apple_pie',
'baby_back_ribs',
'baklava',
'beef_carpaccio',
'beef_tartare',
'beet_salad',
'beignets',
'bibimbap',
'bread_pudding',
'breakfast_burrito']
Wonderful, now we can write these to a text file by first creating a path to
demos/foodvision_big/class_names.txt , then opening a file with Python's open() and writing
each class name on a new line.
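A sketch of the writing step (assuming the food101_class_names list from above):

from pathlib import Path

# Create path to the class names file
foodvision_big_class_names_path = Path("demos/foodvision_big") / "class_names.txt"

# Write the class names to file, one per line
with open(foodvision_big_class_names_path, "w") as f:
    f.write("\n".join(food101_class_names))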
apple_pie
baby_back_ribs
baklava
beef_carpaccio
beef_tartare
...
Now let's make sure we can read them back in.
To do so we'll use Python's open() in read mode ( "r" ) and then use the readlines() method to
read each line of our class_names.txt file.
And we can save the class names to a list by stripping the newline value of each of them with a
list comprehension and strip() .
# Open Food101 class names file and read each line into a list
with open(foodvision_big_class_names_path, "r") as f:
    food101_class_names_loaded = [food.strip() for food in f.readlines()]
11.3 Turning our FoodVision Big model into a Python script ( model.py )
Just like the FoodVision Mini demo, let's create a script that's capable of instantiating an
EffNetB2 feature extractor model along with its necessary transforms.
%%writefile demos/foodvision_big/model.py
import torch
import torchvision

from torch import nn


def create_effnetb2_model(num_classes: int = 3,
                          seed: int = 42):
    """Creates an EfficientNetB2 feature extractor model and transforms.

    Args:
        num_classes (int, optional): number of classes in the classifier head.
            Defaults to 3.
        seed (int, optional): random seed value. Defaults to 42.

    Returns:
        model (torch.nn.Module): EffNetB2 feature extractor model.
        transforms (torchvision.transforms): EffNetB2 image transforms.
    """
    # Create EffNetB2 pretrained weights, transforms and model
    weights = torchvision.models.EfficientNet_B2_Weights.DEFAULT
    transforms = weights.transforms()
    model = torchvision.models.efficientnet_b2(weights=weights)

    # Freeze all layers in the base model (so only the classifier head is trainable)
    for param in model.parameters():
        param.requires_grad = False

    # Change the classifier head (with a random seed for reproducibility)
    torch.manual_seed(seed)
    model.classifier = nn.Sequential(
        nn.Dropout(p=0.3, inplace=True),
        nn.Linear(in_features=1408, out_features=num_classes),
    )

    return model, transforms
Overwriting demos/foodvision_big/model.py
11.4 Turning our FoodVision Big Gradio app into a Python script ( app.py )
We've got a FoodVision Big model.py script, now let's create a FoodVision Big app.py script.
This will again mostly be the same as the FoodVision Mini app.py script, except we'll change: the class names (loaded from class_names.txt ), the model (created with num_classes=101 ) and the saved model path (our FoodVision Big model file).
We'll also make sure to save it to demos/foodvision_big/app.py using the %%writefile magic
command.
%%writefile demos/foodvision_big/app.py
### 1. Imports and class names setup ###
import gradio as gr
import os
import torch
# Create model
effnetb2, effnetb2_transforms = create_effnetb2_model(
num_classes=101, # could also use len(class_names)
)
# Create a prediction label and prediction probability dictionary for each prediction class (the format Gradio's Label output expects)
pred_labels_and_probs = {class_names[i]: float(pred_probs[0][i]) for i in range(len(class_names))}
Overwriting demos/foodvision_big/app.py
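For reference, here's a condensed sketch of how the full app.py typically comes together, following the FoodVision Mini pattern above (the title and description strings, the num_top_classes value and the exact file paths here are assumptions rather than the exact course code):

# A condensed sketch of app.py, following the same structure as the FoodVision Mini version
import os
from timeit import default_timer as timer
from typing import Tuple, Dict

import gradio as gr
import torch

from model import create_effnetb2_model

# Setup class names (read from the class_names.txt file we created earlier)
with open("class_names.txt", "r") as f:
    class_names = [food_name.strip() for food_name in f.readlines()]

# Create the model and transforms, then load the saved weights onto the CPU
effnetb2, effnetb2_transforms = create_effnetb2_model(num_classes=101)
effnetb2.load_state_dict(
    torch.load(f="09_pretrained_effnetb2_feature_extractor_food101_20_percent.pth",
               map_location=torch.device("cpu"))
)

# Predict function: transform the image, predict and return ({label: prob}, prediction time)
def predict(img) -> Tuple[Dict, float]:
    start_time = timer()
    img = effnetb2_transforms(img).unsqueeze(0)  # add a batch dimension
    effnetb2.eval()
    with torch.inference_mode():
        pred_probs = torch.softmax(effnetb2(img), dim=1)
    pred_labels_and_probs = {class_names[i]: float(pred_probs[0][i]) for i in range(len(class_names))}
    pred_time = round(timer() - start_time, 5)
    return pred_labels_and_probs, pred_time

# Gradio interface: image in, top predicted classes and prediction time out
example_list = [["examples/" + example] for example in os.listdir("examples")]
demo = gr.Interface(fn=predict,
                    inputs=gr.Image(type="pil"),
                    outputs=[gr.Label(num_top_classes=5, label="Predictions"),
                             gr.Number(label="Prediction time (s)")],
                    examples=example_list,
                    title="FoodVision Big",
                    description="An EfficientNetB2 feature extractor computer vision model to classify images of food into 101 different classes.")
demo.launch()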
Now all we need is a requirements.txt file to tell our Hugging Face Space what dependencies
our FoodVision Big app requires.
%%writefile demos/foodvision_big/requirements.txt
torch==1.12.0
torchvision==0.13.0
gradio==3.1.4
Overwriting demos/foodvision_big/requirements.txt
We'll use the same process we used for downloading the FoodVision Mini app files above in section 9.1:
zip the contents of demos/foodvision_big into foodvision_big.zip and then download it.
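As a sketch, mirroring the FoodVision Mini zip command from earlier (with the same excluded file patterns):

# Change into and then zip the foodvision_big folder but exclude certain files
!cd demos/foodvision_big && zip -r ../foodvision_big.zip * -x "*.pyc" "*.ipynb" "*__pycache__*"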
# Download the zipped FoodVision Big app (if running in Google Colab)
try:
    from google.colab import files
    files.download("demos/foodvision_big.zip")
except:
    print("Not running in Google Colab, can't use google.colab.files.download()")
Let's deploy our FoodVision Big Gradio demo to Hugging Face Spaces so we can test it
interactively and let others experience the magic of our machine learning efforts!
There are several ways to upload files to Hugging Face Spaces. The
following steps treat Hugging Face as a git repository to track files. However, you
can also upload directly to Hugging Face Spaces via the web interface or via the
huggingface_hub library.
The good news is, we've already done the steps to do so with FoodVision Mini, so now all we
have to do is customize them to suit FoodVision Big:
3. Give the Space a name, for example, mine is called mrdbourke/foodvision_big , you can
see it here: https://huggingface.co/spaces/mrdbourke/foodvision_big
4. Select a license (I used MIT).
5. Select Gradio as the Space SDK (software development kit).
◦ You can use other options such as Streamlit but since our app is built with
Gradio, we'll stick with that.
6. Choose whether your Space is public or private (I selected public since I'd like my Space to
be available to others).
7. Click "Create Space".
8. Clone the repo locally by running: git clone https://huggingface.co/spaces/[YOUR_USERNAME]/[YOUR_SPACE_NAME] in terminal or command prompt.
◦ You can also add files by uploading them under the "Files and versions" tab.
9. Copy/move the contents of the downloaded foodvision_big folder to the cloned repo
folder.
10. To upload and track larger files (e.g. files over 10MB or in our case, our PyTorch model file)
you'll need to install Git LFS (which stands for "git large file storage").
11. After you've installed Git LFS, you can activate it by running git lfs install .
12. In the foodvision_big directory, track the files over 10MB with Git LFS with git lfs track "*.file_extension" .
13. Track .gitattributes (automatically created when cloning from Hugging Face, this file will
help ensure our larger files are tracked with Git LFS). You can see an example
.gitattributes file on the FoodVision Big Hugging Face Space.
14. Add the rest of the foodvision_big app files and commit them with:
◦ git add *
◦ git commit -m "first commit"
◦ git push
15. Wait 3-5 minutes for the build to happen (future builds are faster) and your app to become
live!
If everything worked correctly, our FoodVision Big Gradio demo should be ready to classify!
Or we can even embed our FoodVision Big Gradio demo right within our notebook as an iframe
with IPython.display.IFrame and a link to our space in the format https://hf.space/embed/
[YOUR_USERNAME]/[YOUR_SPACE_NAME]/+ .
We've come a long way from building PyTorch models to predict a straight line... now we're
building computer vision models accessible to people all around the world!
Main takeaways
• Three questions to keep asking when deciding how to deploy a machine learning model:
1. What’s the most ideal use case for the model (how well and how fast does it
perform)?
2. Where’s the model going to go (is it on-device or on the cloud)?
3. How’s the model going to function (are predictions online or o�ine)?
you’ll have to update your model. Or new research gets released and there’s a better
architecture to use.
◦ So deploying one model is an excellent step, but you'll likely want to update it over
time.
• MLOps (machine learning operations) is an extension of DevOps (development
operations) and involves all the engineering parts around training a model: data collection
and storage, data preprocessing, model deployment, model monitoring, versioning and
more. It's a rapidly evolving field but there are some solid resources out there to learn
more, many of which are in PyTorch Extra Resources.
Exercises
You should be able to complete them by referencing each section or by following the
resource(s) linked.
◦ See a live video walkthrough of the solutions on YouTube (errors and all).
1. Make and time predictions with both feature extractor models on the test dataset using
the GPU ( device="cuda" ). Compare the model's prediction times on GPU vs CPU - does
this close the gap between them? As in, does making predictions on the GPU make the ViT
feature extractor prediction times closer to the EffNetB2 feature extractor prediction
times?
◦ You'll find code to do these steps in section 5. Making predictions with our trained
models and timing them and section 6. Comparing model results, prediction times
and size.
2. The ViT feature extractor seems to have more learning capacity (due to more parameters)
than EffNetB2, how does it go on the larger 20% split of the entire Food101 dataset?
◦ Train a ViT feature extractor on the 20% Food101 dataset for 5 epochs, just like we
did with EffNetB2 in section 10. Creating FoodVision Big.
3. Make predictions across the 20% Food101 test dataset with the ViT feature extractor from
exercise 2 and find the "most wrong" predictions.
◦ The predictions will be the ones with the highest prediction probability but with the
wrong predicted label.
◦ Write a sentence or two about why you think the model got these predictions wrong.
4. Evaluate the ViT feature extractor across the whole Food101 test dataset rather than just
the 20% version, how does it perform?
◦ Does it beat the original Food101 paper's best result of 56.4% accuracy?
5. Head to Paperswithcode.com and find the current best performing model on the Food101
dataset.
6. Write down 1-3 potential failure points of our deployed FoodVision models and what some
potential solutions might be.
◦ For example, what happens if someone was to upload a photo that wasn't of food to
our FoodVision Mini model?
7. Pick any dataset from torchvision.datasets and train a feature extractor model on it
using a model from torchvision.models (you could use one of the models we've already
created, e.g. EffNetB2 or ViT) for 5 epochs and then deploy your model as a Gradio app to
Hugging Face Spaces.
◦ You may want to pick a smaller dataset or make a smaller split of it so training doesn't
take too long.
◦ I'd love to see your deployed models! So be sure to share them in Discord or on the
course GitHub Discussions page.
Extra-curriculum
◦ Inside you'll find recommendations for resources such as Chip Huyen's book
Designing Machine Learning Systems (especially chapter 7 on model deployment)
and Goku Mohandas's Made with ML MLOps course.
• As you start to build more and more of your own projects, you'll likely start using Git (and
potentially GitHub) quite frequently. To learn more about both, I'd recommend the Git and
GitHub for Beginners - Crash Course video on the freeCodeCamp YouTube channel.
• We've only scratched the surface with what's possible with Gradio. For more, I'd
recommend checking out the full documentation, especially:
• Edge devices aren't limited to mobile phones, they include small computers like the
Raspberry Pi and the PyTorch team have a fantastic blog post tutorial on deploying a
PyTorch model to one.
• For a fantastic guide on developing AI and ML-powered applications, see Google's People
+ AI Guidebook. One of my favourites is the section on setting the right expectations.
◦ I covered more of these kinds of resources, including guides from Apple, Microsoft
and more in the April 2021 edition of Machine Learning Monthly (a monthly
newsletter I send out with the latest and greatest of the ML field).
• If you'd like to speed up your model's runtime on CPU, you should be aware of TorchScript,
ONNX (Open Neural Network Exchange) and OpenVINO. Going from pure PyTorch to
ONNX/OpenVINO models I've seen a ~2x+ increase in performance.
• For turning models into a deployable and scalable API, see the TorchServe library.
• For a terrific example and rationale as to why deploying a machine learning model in the
browser (a form of edge deployment) offers several benefits (no network transfer latency
delay), see Jo Kristian Bergum's article on Moving ML Inference from the Cloud to the
Edge.