03_pytorch_computer_vision
Neural Networks
with
Where can you get help?
“If in doubt, run the code”
https://www.github.com/mrdbourke/pytorch-deep-learning/discussions
“What is a computer vision problem?”
Example computer vision problems
“Is this a photo of steak or pizza?” “Where’s the thing we’re looking for?”
Source: Tesla AI Day Video (49:49). PS see 2:01:31 of the same video for surprise ;)
Tesla Computer Vision
• Evaluating a model
👩‍🍳 👩‍🔬
(we’ll be cooking up lots of code!)
How:
Computer vision inputs and outputs
Input (224 × 224 image):
[[0.31, 0.62, 0.44…],
 [0.92, 0.03, 0.27…],
 [0.25, 0.78, 0.07…],
 …,
Output (prediction probabilities) for 🍣 🥩 🍕:
[0.00, 0.97, 0.03]
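The prediction probabilities in the example above ([0.00, 0.97, 0.03]) can be produced from raw model outputs with a softmax. The sketch below is a minimal illustration, assuming the model emits one raw score (logit) per class; the logit values and the class order are made up for the example and are not taken from the slide.

```python
import torch

# Hypothetical raw model outputs (logits) for a single image; values are made up.
logits = torch.tensor([[-2.0, 4.0, 0.5]])

# Class order assumed from the emoji order on the slide (sushi, steak, pizza).
class_names = ["sushi", "steak", "pizza"]

# Softmax rescales each row so its values are non-negative and sum to 1.
pred_probs = torch.softmax(logits, dim=1)

# The predicted class is the index with the highest probability.
pred_index = pred_probs.argmax(dim=1).item()
pred_label = class_names[pred_index]

print(pred_probs)
print(pred_label)
```

The largest logit always maps to the largest probability, so the predicted label here is the class at index 1.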
torchvision.datasets.FashionMNIST
Input and output shapes
Input (28 × 28 image, gets represented as a tensor):
[[0.00, 0.62, 0.44…],
 [0.00, 0.03, 0.27…],
 [0.01, 0.78, 0.07…],
 …,
Output (prediction probabilities) for 🥾 👕 👖…:
[0.00, 0.97, …]
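To make the shapes concrete, here is a small sketch that uses a random tensor as a stand-in for one 28 × 28 grayscale FashionMNIST image (no data is downloaded). The tiny `Flatten` + `Linear` model is only a placeholder chosen for illustration, not the model from the slides; the 10 output values correspond to FashionMNIST's 10 clothing classes.

```python
import torch
from torch import nn

# Stand-in for one FashionMNIST image: random values in [0, 1), not real data.
# Shape is [color_channels, height, width]; grayscale means 1 channel.
image = torch.rand(1, 28, 28)

# Models expect a leading batch dimension: [batch, channels, height, width].
batch = image.unsqueeze(dim=0)

# Placeholder model (an assumption for illustration): flatten, then one linear layer.
num_classes = 10
model = nn.Sequential(
    nn.Flatten(),                                   # [1, 1, 28, 28] -> [1, 784]
    nn.Linear(in_features=28 * 28, out_features=num_classes),
)

output = model(batch)

print(image.shape)   # torch.Size([1, 28, 28])
print(batch.shape)   # torch.Size([1, 1, 28, 28])
print(output.shape)  # torch.Size([1, 10])
```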
[Figure: torchvision.datasets.FashionMNIST is passed to torch.utils.data.DataLoader with shuffle=True (samples all mixed up). Rows are batches 0, 1, 2, 3, 4, …; columns are samples 0, 1, 2, 3, 4, 5, … 32 within each batch. Number of batches = num samples / batch_size.]
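The batching shown above can be sketched without downloading FashionMNIST by wrapping random tensors in a `TensorDataset`. The dataset size of 100 and the batch size of 32 are assumptions for illustration. Note that when the number of samples does not divide evenly, `DataLoader` keeps a smaller final batch by default, so the number of batches is num samples / batch_size rounded up.

```python
import math

import torch
from torch.utils.data import DataLoader, TensorDataset

# Stand-in dataset: 100 random 1x28x28 "images" with random labels (10 classes).
images = torch.rand(100, 1, 28, 28)
labels = torch.randint(low=0, high=10, size=(100,))
dataset = TensorDataset(images, labels)

BATCH_SIZE = 32  # assumed batch size for this sketch
dataloader = DataLoader(dataset, batch_size=BATCH_SIZE, shuffle=True)

# Number of batches = number of samples / batch size, rounded up,
# because the last batch holds whatever samples are left over.
num_batches = len(dataloader)
expected_batches = math.ceil(len(dataset) / BATCH_SIZE)

# Collect the size of every batch by iterating once over the dataloader.
batch_sizes = [len(batch_images) for batch_images, _ in dataloader]

print(num_batches, expected_batches)
print(batch_sizes)
```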
Architecture of a CNN (typical)*
Steak 🥩
Pizza 🍕
Sushi 🍣
*Note: there is an almost unlimited number of ways you could stack together a convolutional neural network; this slide demonstrates only one.
Typical architecture of a CNN (coloured block edition)
Simple CNN
Deeper CNN
CNN Explainer model
Input layer | Conv2d layers | ReLU activation layers | Pooling layers | Output layer
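The layer types listed above can be stacked in many ways. The following is one minimal sketch, assuming 28 × 28 grayscale inputs and 10 output classes; `hidden_units=10`, the 3 × 3 kernels and the padding are illustrative choices, not values taken from the slide.

```python
import torch
from torch import nn

hidden_units = 10  # illustrative value, not from the slide

model = nn.Sequential(
    # Conv2d layers followed by ReLU activation layers
    nn.Conv2d(in_channels=1, out_channels=hidden_units, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.Conv2d(in_channels=hidden_units, out_channels=hidden_units, kernel_size=3, padding=1),
    nn.ReLU(),
    # Pooling layer: halves height and width (28x28 -> 14x14)
    nn.MaxPool2d(kernel_size=2),
    # Output layer: flatten the feature maps, then one value per class
    nn.Flatten(),
    nn.Linear(in_features=hidden_units * 14 * 14, out_features=10),
)

# Input layer: a dummy batch of 32 grayscale 28x28 images (random values).
dummy_batch = torch.rand(32, 1, 28, 28)
logits = model(dummy_batch)

print(logits.shape)  # torch.Size([32, 10])
```

With `padding=1` and 3 × 3 kernels the convolutions keep the 28 × 28 spatial size, which makes the `in_features` of the final linear layer easy to work out by hand.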
in_channels: Defines the number of input channels of the input data, e.g. 1 (grayscale), 3 (RGB color images)
📖 Resource: For an interactive demonstration of the above hyperparameters, see the CNN Explainer website.
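As a small illustration of `in_channels`, the sketch below passes a grayscale batch and an RGB batch through `torch.nn.Conv2d` layers. The `out_channels=16` and `kernel_size=3` values are arbitrary choices for the example. It also shows that a layer built for 1 input channel rejects a 3-channel input.

```python
import torch
from torch import nn

# Random stand-in batches in [batch, channels, height, width] format.
gray_batch = torch.rand(2, 1, 28, 28)    # 1 channel: grayscale
rgb_batch = torch.rand(2, 3, 224, 224)   # 3 channels: RGB

# out_channels and kernel_size are arbitrary values for illustration.
gray_conv = nn.Conv2d(in_channels=1, out_channels=16, kernel_size=3)
rgb_conv = nn.Conv2d(in_channels=3, out_channels=16, kernel_size=3)

gray_out = gray_conv(gray_batch)
rgb_out = rgb_conv(rgb_batch)

# With no padding and stride 1, a 3x3 kernel trims 2 pixels from each spatial dim.
print(gray_out.shape)  # torch.Size([2, 16, 26, 26])
print(rgb_out.shape)   # torch.Size([2, 16, 222, 222])

# A layer expecting 1 channel raises an error when given 3 channels.
try:
    gray_conv(rgb_batch)
    mismatch_raised = False
except RuntimeError:
    mismatch_raised = True

print(mismatch_raised)  # True
```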
Breakdown of torch.nn.Conv2d layer (Visually)
FashionMNIST -> CNN
Inputs -> Numerical encoding -> Layers learn numerical representation -> Output layer outputs predictions: 👡, 👗, … (keep going until number of classes is fulfilled)
torchvision.transforms
torch.utils.data.Dataset
torch.utils.data.DataLoader
torchmetrics
torch.save
torch.load
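Of the tools listed above, `torch.save` and `torch.load` cover saving and reloading a model. A minimal sketch, assuming you save the model's `state_dict` (a common approach); the small `nn.Linear` model and the file name `model_state.pth` are placeholders for illustration.

```python
import os
import tempfile

import torch
from torch import nn

# Placeholder model for illustration.
model = nn.Linear(in_features=4, out_features=2)

with tempfile.TemporaryDirectory() as tmp_dir:
    save_path = os.path.join(tmp_dir, "model_state.pth")  # hypothetical file name

    # Save only the learned parameters (the state_dict).
    torch.save(model.state_dict(), save_path)

    # Create a fresh instance with the same architecture and load the parameters.
    loaded_model = nn.Linear(in_features=4, out_features=2)
    loaded_model.load_state_dict(torch.load(save_path))

# Both models should now produce identical outputs for the same input.
x = torch.rand(3, 4)
with torch.inference_mode():
    same_outputs = torch.equal(model(x), loaded_model(x))

print(same_outputs)  # True
```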
For example, a student who studies the course materials too hard and then isn’t able to perform well on the final exam, or who tries to put their knowledge into practice at the workplace and finds what they learned has nothing to do with the real world.
Smaller model
Better data: Not all data samples are created equal. Removing poor samples from or adding better samples to your dataset can improve your model’s performance.
*Note: There are many more different kinds of data augmentation, such as cropping, replacing and shearing. This slide only demonstrates a few.
Popular & useful computer vision architectures: see torchvision.models
Architecture: MobileNet(s)
Release date: 2017
Paper: https://arxiv.org/abs/1704.04861
Use in PyTorch: torchvision.models.mobilenet…
When to use: Lightweight architecture suitable for devices with less computing power
The machine learning explorer’s motto
“Visualize, visualize, visualize”
Data
Training
Predictions
The machine learning practitioner’s motto
👩‍🍳 👩‍🔬
(try lots of things and see what tastes good)