01. PyTorch Workflow Fundamentals - Zero to Mastery Learn PyTorch for Deep Learning
The essence of machine learning and deep learning is to take some data from the past,
build an algorithm (like a neural network) to discover patterns in it and use the
discovered patterns to predict the future.
There are many ways to do this and many new ways are being discovered all the time. In this module, we'll see if we can build a PyTorch model that learns the pattern of a straight line and matches it.
For now, we'll use this workflow to predict a simple straight line but the workflow steps
can be repeated and changed depending on the problem you're working on.
| Topic | Contents |
| --- | --- |
| 1. Getting data ready | Data can be almost anything, but to get started we're going to create a simple straight line. |
| 2. Building a model | Here we'll create a model to learn patterns in the data; we'll also choose a loss function, an optimizer and build a training loop. |
| 3. Fitting the model to data (training) | We've got data and a model, now let's let the model (try to) find patterns in the (training) data. |
| 4. Making predictions and evaluating a model (inference) | Our model's found patterns in the data, let's compare its findings to the actual (testing) data. |
| 5. Saving and loading a model | You may want to use your model elsewhere, or come back to it later; here we'll cover that. |
| 6. Putting it all together | Let's take all of the above and combine it. |
And if you run into trouble, you can ask a question on the course GitHub Discussions page too.
There's also the PyTorch developer forums, a very helpful place for all things PyTorch.
Let's start by putting what we're covering into a dictionary to reference later.
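A sketch of such a dictionary, with the keys and values taken from the table above (the variable name is just an example):

```python
what_were_covering = {
    1: "data (prepare and load)",
    2: "build model",
    3: "fitting the model to data (training)",
    4: "making predictions and evaluating a model (inference)",
    5: "saving and loading a model",
    6: "putting it all together",
}
```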
And now let's import what we'll need for this module.
We're going to get torch , torch.nn ( nn stands for neural network and this package
contains the building blocks for creating neural networks in PyTorch) and matplotlib .
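Here's a sketch of the import cell (the last line checks the PyTorch version, which produced the output below):

```python
import torch
from torch import nn  # nn contains all of PyTorch's building blocks for neural networks
import matplotlib.pyplot as plt

# Check PyTorch version
torch.__version__
```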
Out[2]: '1.12.1+cu113'
1. Getting data ready

We'll use linear regression to create the data with known parameters (things that can be learned by a model) and then we'll use PyTorch to see if we can build a model to estimate these parameters using gradient descent.
Don't worry if the terms above don't mean much now, we'll see them in action and I'll put
extra resources below where you can learn more.
```python
# Create known parameters (referenced later: weight=0.7, bias=0.3)
weight = 0.7
bias = 0.3

# Create data
start = 0
end = 1
step = 0.02
X = torch.arange(start, end, step).unsqueeze(dim=1)
y = weight * X + bias

X[:10], y[:10]
```
Out[3]: (tensor([[0.0000],
[0.0200],
[0.0400],
[0.0600],
[0.0800],
[0.1000],
[0.1200],
[0.1400],
[0.1600],
[0.1800]]),
tensor([[0.3000],
[0.3140],
[0.3280],
[0.3420],
[0.3560],
[0.3700],
[0.3840],
[0.3980],
[0.4120],
[0.4260]]))
Beautiful! Now we're going to move towards building a model that can learn the
relationship between X (features) and y (labels).
One of the most important steps in a machine learning project is creating a training and test set (and when required, a validation set).

For now, we'll just use a training and test set; this means we'll have one dataset for our model to learn on and another to evaluate it on.
Note: When dealing with real-world data, this step is typically done right at the start of a project (the test set should always be kept separate from all other data). We want our model to learn from training data and then evaluate it on test data to get an indication of how well it generalizes to unseen examples.
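Here's a sketch of the split (an 80/20 split on our 50 samples gives the 40/10 counts below):

```python
# Create train/test split (80% of data used for training set, 20% for testing)
train_split = int(0.8 * len(X))
X_train, y_train = X[:train_split], y[:train_split]
X_test, y_test = X[train_split:], y[train_split:]

len(X_train), len(y_train), len(X_test), len(y_test)
```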
Wonderful, we've got 40 samples for training ( X_train & y_train ) and 10 samples for
testing ( X_test & y_test ).
The model we create is going to try and learn the relationship between X_train &
y_train and then we will evaluate what it learns on X_test and y_test .
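Let's create a function to visualize it. Below is a minimal sketch of such a plot_predictions() helper (the exact figure styling is an assumption): it plots the training data in blue, the testing data in green and any predictions in red.

```python
def plot_predictions(train_data=X_train, train_labels=y_train,
                     test_data=X_test, test_labels=y_test,
                     predictions=None):
    """Plots training data, test data and compares predictions."""
    plt.figure(figsize=(10, 7))
    plt.scatter(train_data, train_labels, c="b", s=4, label="Training data")
    plt.scatter(test_data, test_labels, c="g", s=4, label="Testing data")
    if predictions is not None:
        # Predictions are made on the test data, so plot them against it
        plt.scatter(test_data, predictions, c="r", s=4, label="Predictions")
    plt.legend(prop={"size": 14})
```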
In [6]: plot_predictions();
Epic!
Now instead of just being numbers on a page, our data is a straight line.
Note: Now's a good time to introduce you to the data explorer's motto... "visualize,
visualize, visualize!"
Think of this whenever you're working with data and turning it into numbers, if you can
visualize something, it can do wonders for understanding.
Machines love numbers and we humans like numbers too but we also like to look at
things.
2. Build model
Now we've got some data, let's build a model to use the blue dots to predict the green
dots.
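Here's a sketch of such a model: a linear regression model class that subclasses nn.Module (the parameter names weights and bias line up with the state_dict() output further down).

```python
class LinearRegressionModel(nn.Module):  # almost everything in PyTorch is a nn.Module
    def __init__(self):
        super().__init__()
        # Start with random values for weights and bias and let the model update them
        self.weights = nn.Parameter(torch.randn(1, dtype=torch.float), requires_grad=True)
        self.bias = nn.Parameter(torch.randn(1, dtype=torch.float), requires_grad=True)

    # forward() defines the computation in the model
    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.weights * x + self.bias  # the linear regression formula
```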
Alright there's a fair bit going on above but let's break it down bit by bit.
Resource: We'll be using Python classes to create bits and pieces for building neural networks. If you're unfamiliar with Python class notation, I'd recommend reading Real Python's Object-Oriented Programming in Python 3 guide a few times.
PyTorch has four (give or take) essential modules you can use to create almost any kind
of neural network you can imagine.
torch.nn.Module - The base class for all neural network modules; all the building blocks for neural networks are subclasses. If you're building a neural network in PyTorch, your models should subclass nn.Module . Requires a forward() method be implemented.
If the above sounds complex, think of it like this: almost everything in a PyTorch neural network comes from torch.nn ,

- nn.Parameter contains the smaller parameters like weights and biases (put these together to make nn.Module (s))
- forward() tells the larger blocks how to make calculations on inputs (tensors full of data) within nn.Module (s)
Basic building blocks of creating a PyTorch model by subclassing nn.Module . For objects
that subclass nn.Module , the forward() method must be defined.
Resource: See more of these essential modules and their use cases in the PyTorch
Cheat Sheet.
Now we've got these out of the way, let's create a model instance with the class we've
made and check its parameters using .parameters() .
We can also get the state (what the model contains) of the model using .state_dict() .
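A sketch of what that looks like (the model_0 name and the manual seed value of 42 come from later references in this section):

```python
# Set a manual seed since nn.Parameter values are randomly initialized
torch.manual_seed(42)

# Create an instance of the model (this is a subclass of nn.Module)
model_0 = LinearRegressionModel()

# Check out the parameters and the state (what the model contains)
list(model_0.parameters()), model_0.state_dict()
```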
Notice how the values for weights and bias from model_0.state_dict() come out
as random float tensors?
Essentially we want to start from random parameters and get the model to update them
towards parameters that fit our data best (the hardcoded weight and bias values we
set when creating our straight line data).
Exercise: Try changing the torch.manual_seed() value two cells above, see what
happens to the weights and bias values.
Because our model starts with random values, right now it'll have poor predictive power.
To check this we can pass it the test data X_test to see how closely it predicts y_test .
When we pass data to our model, it'll go through the model's forward() method and
produce a result using the computation we've defined.
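For example, a sketch:

```python
# Make predictions with the (untrained) model
with torch.inference_mode():
    y_preds = model_0(X_test)
y_preds
```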
Notice we made the predictions inside a torch.inference_mode() context manager. Inference mode turns off the gradient tracking PyTorch does during training; it isn't needed for inference and turning it off makes the forward pass faster.
Note: In older PyTorch code, you may also see torch.no_grad() being used for
inference. While torch.inference_mode() and torch.no_grad() do similar things,
torch.inference_mode() is newer, potentially faster and preferred. See this Tweet
from PyTorch for more.
We've made some predictions, let's see what they look like.

Notice there's one prediction value per testing sample; this is because of the kind of data we're using. For our straight line, one X value maps to one y value.
However, machine learning models are very flexible. You could have 100 X values
mapping to one, two, three or 10 y values. It all depends on what you're working on.
Our predictions are still numbers on a page, let's visualize them with our
plot_predictions() function we created above.
In [12]: plot_predictions(predictions=y_preds)
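Those predictions are quite far off the test labels. A sketch of checking the difference directly (consistent with the output below):

```python
# Check the difference between the test labels and the predictions
y_test - y_preds
```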
Out[13]: tensor([[0.4618],
[0.4691],
[0.4764],
[0.4836],
[0.4909],
[0.4982],
[0.5054],
[0.5127],
[0.5200],
[0.5272]])
This makes sense though when you remember our model is just using random parameter values to make predictions.
It hasn't even looked at the blue dots to try to predict the green dots.
3. Train model
Right now our model is making predictions using random parameters to make
calculations, it's basically guessing (randomly).
To fix that, we can update its internal parameters (I also refer to parameters as patterns), the weights and bias values we set randomly using nn.Parameter() and torch.randn() , to be something that better represents the data.
We could hard code this (since we know the default values weight=0.7 and bias=0.3 )
but where's the fun in that?
Much of the time you won't know what the ideal parameters are for a model.
Instead, it's much more fun to write code to see if the model can try and figure them out
itself.
For our model to update its parameters on its own, we'll need to add a few more things to
our recipe.
Let's create a loss function and an optimizer we can use to help improve our model.
Which loss function and optimizer you should use depends on the kind of problem you're working on. However, there are some common choices that are known to work well, such as the SGD (stochastic gradient descent) or Adam optimizers, and the MAE (mean absolute error) loss function for regression problems (predicting a number) or the binary cross entropy loss function for classification problems (predicting one thing or another).
For our problem, since we're predicting a number, let's use MAE (which is under
torch.nn.L1Loss() ) in PyTorch as our loss function.
- params is the target model's parameters you'd like to optimize (e.g. the weights and bias values we randomly set before).
- lr is the learning rate you'd like the optimizer to update the parameters at. Higher means the optimizer will try larger updates (these can sometimes be too large and the optimizer will fail to work); lower means the optimizer will try smaller updates (these can sometimes be too small and the optimizer will take too long to find the ideal values). The learning rate is considered a hyperparameter (because it's set by a machine learning engineer). Common starting values for the learning rate are 0.01 , 0.001 , 0.0001 , however, these can also be adjusted over time (this is called learning rate scheduling).
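Putting these together, a sketch of creating both:

```python
# Create the loss function (MAE in PyTorch is torch.nn.L1Loss)
loss_fn = nn.L1Loss()

# Create the optimizer (stochastic gradient descent)
optimizer = torch.optim.SGD(params=model_0.parameters(),  # parameters of target model to optimize
                            lr=0.01)  # learning rate
```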
Woohoo! Now we've got a loss function and an optimizer, it's now time to create a
training loop (and testing loop).
The training loop involves the model going through the training data and learning the
relationships between the features and labels .
The testing loop involves going through the testing data and evaluating how good the
patterns are that the model learned on the training data (the model never sees the
testing data during training).
Each of these is called a "loop" because we want our model to look at (loop through) each sample in each dataset.
To create these we're going to write a Python for loop in the theme of the unofficial
PyTorch optimization loop song (there's a video version too).
The unofficial PyTorch optimization loops song, a fun way to remember the steps in a
PyTorch training (and testing) loop.
For the training loop, we'll build the following steps:

1. Forward pass - The model goes through all of the training data once, performing its forward() function calculations ( model(X_train) ).
2. Calculate the loss - The model's outputs (predictions) are compared to the ground truth and evaluated to see how wrong they are ( loss = loss_fn(y_pred, y_train) ).
3. Zero gradients - The optimizer's gradients are set to zero (they are accumulated by default) so they can be recalculated for the specific training step ( optimizer.zero_grad() ).
4. Perform backpropagation on the loss - Computes the gradient of the loss with respect to every model parameter to be updated ( loss.backward() ).
5. Step the optimizer (gradient descent) - Update the parameters with respect to the loss gradients to improve them ( optimizer.step() ).
Note: The above is just one example of how the steps could be ordered or described.
With experience you'll find making PyTorch training loops can be quite flexible.
And on the ordering of things, the above is a good default order but you may see slightly different orders. Some rules of thumb: calculate the loss ( loss = ... ) before performing backpropagation on it ( loss.backward() ), zero the gradients ( optimizer.zero_grad() ) before computing new ones, and step the optimizer ( optimizer.step() ) only after performing backpropagation on the loss ( loss.backward() ).
For resources to help understand what's happening behind the scenes with
backpropagation and gradient descent, see the extra-curriculum section.
As for the testing loop (evaluating our model), the typical steps include:
1. Forward pass - The model goes through all of the testing data once ( model(X_test) ).
2. Calculate the loss - The model's outputs (predictions) are compared to the test labels ( test_loss = loss_fn(test_pred, y_test) ).
3. Calculate any other evaluation metrics (optional).
Let's put all of the above together and train our model for 100 epochs (forward passes
through the data) and we'll evaluate it every 10 epochs.
In [15]:
```python
torch.manual_seed(42)
epochs = 100  # number of passes over the training data

# Track loss values over time
epoch_count, train_loss_values, test_loss_values = [], [], []

for epoch in range(epochs):
    ### Training
    model_0.train()
    y_pred = model_0(X_train)        # 1. Forward pass
    loss = loss_fn(y_pred, y_train)  # 2. Calculate loss
    optimizer.zero_grad()            # 3. Zero gradients
    loss.backward()                  # 4. Loss backwards
    optimizer.step()                 # 5. Step the optimizer

    ### Testing
    model_0.eval()
    with torch.inference_mode():
        test_pred = model_0(X_test)             # 1. Forward pass on test data
        test_loss = loss_fn(test_pred, y_test)  # 2. Calculate test loss

    if epoch % 10 == 0:
        epoch_count.append(epoch)
        train_loss_values.append(loss.detach().numpy())
        test_loss_values.append(test_loss.detach().numpy())
        print(f"Epoch: {epoch} | MAE Train Loss: {loss} | MAE Test Loss: {test_loss}")
```
Epoch: 0 | MAE Train Loss: 0.31288138031959534 | MAE Test Loss: 0.48106518387794495
Epoch: 10 | MAE Train Loss: 0.1976713240146637 | MAE Test Loss: 0.3463551998138428
Epoch: 20 | MAE Train Loss: 0.08908725529909134 | MAE Test Loss: 0.21729660034179688
Epoch: 30 | MAE Train Loss: 0.053148526698350906 | MAE Test Loss: 0.14464017748832703
Epoch: 40 | MAE Train Loss: 0.04543796554207802 | MAE Test Loss: 0.11360953003168106
Epoch: 50 | MAE Train Loss: 0.04167863354086876 | MAE Test Loss: 0.09919948130846024
Epoch: 60 | MAE Train Loss: 0.03818932920694351 | MAE Test Loss: 0.08886633068323135
Epoch: 70 | MAE Train Loss: 0.03476089984178543 | MAE Test Loss: 0.0805937647819519
Epoch: 80 | MAE Train Loss: 0.03132382780313492 | MAE Test Loss: 0.07232122868299484
Epoch: 90 | MAE Train Loss: 0.02788739837706089 | MAE Test Loss: 0.06473556160926819
Oh would you look at that! Looks like our loss is going down with every epoch, let's plot it
to find out.
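A sketch of plotting the loss values we tracked during training:

```python
# Plot the loss curves
plt.plot(epoch_count, train_loss_values, label="Train loss")
plt.plot(epoch_count, test_loss_values, label="Test loss")
plt.title("Training and test loss curves")
plt.ylabel("Loss")
plt.xlabel("Epochs")
plt.legend();
```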
Nice! The loss curves show the loss going down over time. Remember, loss is the
measure of how wrong your model is, so the lower the better.
Well, thanks to our loss function and optimizer, the model's internal parameters
( weights and bias ) were updated to better reflect the underlying patterns in the data.
Let's inspect our model's .state_dict() to see how close our model gets to the original
values we set for weights and bias.
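A sketch (the printed line matches the output below):

```python
print("The model learned the following values for weights and bias:")
print(model_0.state_dict())
```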
The model learned the following values for weights and bias:
OrderedDict([('weights', tensor([0.5784])), ('bias', tensor([0.3513]))])
Our model got very close to calculating the exact original values for weight and bias
(and it would probably get even closer if we trained it for longer).
Exercise: Try changing the epochs value above to 200, what happens to the loss
curves and the weights and bias parameter values of the model?
It'd likely never guess them perfectly (especially when using more complicated datasets)
but that's okay, often you can do very cool things with a close approximation.
This is the whole idea of machine learning and deep learning: there are some ideal values that describe our data, and rather than figuring them out by hand, we can train a model to figure them out programmatically.
4. Making predictions and evaluating a model (inference)

We've already seen a glimpse of this in the training and testing code above; the steps to do it outside of the training/testing loop are similar.
There are three things to remember when making predictions (also called performing inference) with a PyTorch model:

1. Set the model in evaluation mode ( model.eval() ).
2. Make the predictions using the inference mode context manager ( with torch.inference_mode(): ... ).
3. All predictions should be made with objects on the same device (e.g. data and model on GPU only or data and model on CPU only).
The first two items make sure all helpful calculations and settings PyTorch uses behind
the scenes during training but aren't necessary for inference are turned off (this results in
faster computation). And the third ensures that you won't run into cross-device errors.
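A sketch of all three in action (our model and data are both on the CPU here, so no device changes are needed):

```python
# 1. Set the model in evaluation mode
model_0.eval()

# 2. Make predictions with the inference mode context manager
with torch.inference_mode():
    y_preds = model_0(X_test)
y_preds
```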
Out[18]: tensor([[0.8141],
[0.8256],
[0.8372],
[0.8488],
[0.8603],
[0.8719],
[0.8835],
[0.8950],
[0.9066],
[0.9182]])
Nice! We've made some predictions with our trained model, now how do they look?
In [19]: plot_predictions(predictions=y_preds)
Woohoo! Those red dots are looking far closer than they were before!
5. Saving and loading a model

Once you've trained a model, chances are you'll want to use it somewhere other than where you trained it. As in, you might train it on Google Colab or your local machine with a GPU but you'd like to now export it to some sort of application where others can use it.
Or maybe you'd like to save your progress on a model and come back and load it back
later.
For saving and loading models in PyTorch, there are three main methods you should be
aware of (all of below have been taken from the PyTorch saving and loading models
guide):
1. torch.save - saves a serialized object to disk using Python's pickle utility; models, tensors and various other Python objects (like dictionaries) can be saved with it.
2. torch.load - uses pickle 's deserialization features to load saved Python objects (like models, tensors or dictionaries) into memory.
3. torch.nn.Module.load_state_dict - loads a model's parameter dictionary ( model.state_dict() ) using a saved state_dict() object.
The recommended way for saving and loading a model for inference (making
predictions) is by saving and loading a model's state_dict() .
1. We'll create a directory called models for saving models to, using Python's pathlib module.
2. We'll create a file path to save the model to.
3. We'll call torch.save(obj, f) where obj is the target model's state_dict() and f is the filename of where to save the model.
Note: It's common convention for PyTorch saved models or objects to end with .pt
or .pth , like saved_model_01.pth .
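Put together, a sketch (the model filename here is just an example):

```python
from pathlib import Path

# 1. Create models directory
MODEL_PATH = Path("models")
MODEL_PATH.mkdir(parents=True, exist_ok=True)

# 2. Create model save path (example filename)
MODEL_NAME = "01_pytorch_workflow_model_0.pth"
MODEL_SAVE_PATH = MODEL_PATH / MODEL_NAME

# 3. Save the model's state_dict()
print(f"Saving model to: {MODEL_SAVE_PATH}")
torch.save(obj=model_0.state_dict(), f=MODEL_SAVE_PATH)
```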
Saving the entire model rather than just the state_dict() is more intuitive, however, to
quote the PyTorch documentation (italics mine):
> The disadvantage of this approach (saving the whole model) is that the serialized data is bound to the specific classes and the exact directory structure used when the model is saved...
Because of this, your code can break in various ways when used in other projects or
after refactors.
So instead, we're using the flexible method of saving and loading just the state_dict() ,
which again is basically a dictionary of model parameters.
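Loading it back in is a sketch of two steps: create a new instance of the model class, then load the saved state_dict() into it (the loaded_model_0 name is an example):

```python
# Instantiate a new model (this will start with random weights)
loaded_model_0 = LinearRegressionModel()

# Load the saved state_dict() (this updates the new instance with trained weights)
loaded_model_0.load_state_dict(torch.load(f=MODEL_SAVE_PATH))
```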
Now to test our loaded model, let's perform inference with it (make predictions) on the
test data.
Remember the three things for performing inference with a PyTorch model:

1. Set the model in evaluation mode ( model.eval() ).
2. Make the predictions using the inference mode context manager ( with torch.inference_mode(): ... ).
3. All predictions should be made with objects on the same device (e.g. data and model on GPU only or data and model on CPU only).
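A sketch:

```python
# 1. Put the loaded model into evaluation mode
loaded_model_0.eval()

# 2. Use the inference mode context manager to make predictions
with torch.inference_mode():
    loaded_model_preds = loaded_model_0(X_test)

# Compare previous model predictions with loaded model predictions
y_preds == loaded_model_preds
```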
Now we've made some predictions with the loaded model, let's see if they're the same as
the previous predictions.
Out[24]: tensor([[True],
[True],
[True],
[True],
[True],
[True],
[True],
[True],
[True],
[True]])
Nice!
It looks like the loaded model predictions are the same as the previous model
predictions (predictions made prior to saving). This indicates our model is saving and
loading as expected.
Note: There are more methods to save and load PyTorch models but I'll leave these for
extra-curriculum and further reading. See the PyTorch guide for saving and loading
models for more.
6. Putting it all together

We've covered a fair bit. But once you've had some practice, you'll be performing the above steps like dancing down the street.

Let's go through the full workflow again, except this time we'll make our code device agnostic (so if there's a GPU available, it'll use it and if not, it will default to the CPU).
There'll be far less commentary in this section than above since what we're going to go
through has already been covered.
Note: If you're using Google Colab, to setup a GPU, go to Runtime -> Change runtime
type -> Hardware acceleration -> GPU. If you do this, it will reset the Colab runtime and
you will lose saved variables.
Out[25]: '1.12.1+cu113'
Now let's start making our code device agnostic by setting device="cuda" if it's
available, otherwise it'll default to device="cpu" .
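A sketch of the standard pattern:

```python
# Set up device agnostic code
device = "cuda" if torch.cuda.is_available() else "cpu"
print(f"Using device: {device}")
```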
Otherwise, you'll be using a CPU for the following computations. This is fine for our small
dataset but it will take longer for larger datasets.
6.1 Data

First we'll create some known parameters (the same weight=0.7 and bias=0.3 as before).

Then we'll make a range of numbers between 0 and 1; these will be our X values.

Finally, we'll use the X values, as well as the weight and bias values, to create y using the linear regression formula ( y = weight * X + bias ).
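A sketch (same values as before):

```python
# Create weight and bias
weight = 0.7
bias = 0.3

# Create range values
start = 0
end = 1
step = 0.02

# Create X and y (features and labels)
X = torch.arange(start, end, step).unsqueeze(dim=1)
y = weight * X + bias
X[:10], y[:10]
```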
Out[27]: (tensor([[0.0000],
[0.0200],
[0.0400],
[0.0600],
[0.0800],
[0.1000],
[0.1200],
[0.1400],
[0.1600],
[0.1800]]),
tensor([[0.3000],
[0.3140],
[0.3280],
[0.3420],
[0.3560],
[0.3700],
[0.3840],
[0.3980],
[0.4120],
[0.4260]]))
Wonderful!
Now we've got some data, let's split it into training and test sets.
We'll use an 80/20 split with 80% training data and 20% testing data.
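A sketch of the split:

```python
# Split data into training and test sets (80/20)
train_split = int(0.8 * len(X))
X_train, y_train = X[:train_split], y[:train_split]
X_test, y_test = X[train_split:], y[train_split:]

len(X_train), len(y_train), len(X_test), len(y_test)
```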
6.2 Build model

We'll create the same style of model as before, except this time, instead of defining the weight and bias parameters of our model manually using nn.Parameter() , we'll use nn.Linear(in_features, out_features) to do it for us.
Where in_features is the number of dimensions your input data has and
out_features is the number of dimensions you'd like it to be output to.
In our case, both of these are 1 since our data has 1 input feature ( X ) per label ( y ).
Creating a linear regression model using nn.Parameter versus using nn.Linear . There
are plenty more examples of where the torch.nn module has pre-built computations,
including many popular and useful neural network layers.
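A sketch of the model (the class and layer names match the output below):

```python
class LinearRegressionModelV2(nn.Module):
    def __init__(self):
        super().__init__()
        # Use nn.Linear() to create the model parameters
        self.linear_layer = nn.Linear(in_features=1, out_features=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.linear_layer(x)

# Set the manual seed when creating the model (for reproducible parameter values)
torch.manual_seed(42)
model_1 = LinearRegressionModelV2()
model_1, model_1.state_dict()
```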
Out[30]: (LinearRegressionModelV2(
            (linear_layer): Linear(in_features=1, out_features=1, bias=True)
          ),
          OrderedDict([('linear_layer.weight', tensor([[0.7645]])),
                       ('linear_layer.bias', tensor([0.8300]))]))
Now let's put our model on the GPU (if it's available).
We can change the device our PyTorch objects are on using .to(device) .
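For example, a quick check of where the model's parameters currently live:

```python
# Check the model's current device (new models are created on the CPU by default)
next(model_1.parameters()).device
```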
Out[31]: device(type='cpu')
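And a sketch of sending the model to the target device, then checking again:

```python
# Set the model to use the target device
model_1.to(device)
next(model_1.parameters()).device
```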
Nice! Because of our device agnostic code, the above cell will work regardless of
whether a GPU is available or not.
If you do have access to a CUDA-enabled GPU, you should see an output of something
like:
device(type='cuda', index=0)
6.3 Training
Let's use the same functions we used earlier, nn.L1Loss() and torch.optim.SGD() .
We'll have to pass the new model's parameters ( model.parameters() ) to the optimizer
for it to adjust them during training.
The learning rate of 0.01 worked well before too so let's use that again.
```python
# Create loss function
loss_fn = nn.L1Loss()

# Create optimizer
optimizer = torch.optim.SGD(params=model_1.parameters(),  # optimize newly created model's parameters
                            lr=0.01)
```
Beautiful, loss function and optimizer ready, now let's train and evaluate our model using
a training and testing loop.
The only different thing we'll be doing in this step compared to the previous training loop
is putting the data on the target device .
We've already put our model on the target device using model_1.to(device) .
That way if the model is on the GPU, the data is on the GPU (and vice versa).
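A sketch of putting the data on the target device:

```python
# Put data on the available device (errors occur if model and data are on different devices)
X_train = X_train.to(device)
X_test = X_test.to(device)
y_train = y_train.to(device)
y_test = y_test.to(device)
```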
If you need a reminder of the PyTorch training loop steps, see below.
1. Forward pass - The model goes through all of the training data once, performing its forward() function calculations ( model(x_train) ).
2. Calculate the loss - The model's outputs (predictions) are compared to the ground truth and evaluated to see how wrong they are ( loss = loss_fn(y_pred, y_train) ).
3. Zero gradients - The optimizer's gradients are set to zero (they are accumulated by default) so they can be recalculated for the specific training step ( optimizer.zero_grad() ).
4. Perform backpropagation on the loss - Computes the gradient of the loss with respect to every model parameter to be updated (each parameter with requires_grad=True ). This is known as backpropagation, hence "backwards" ( loss.backward() ).
5. Step the optimizer (gradient descent) - Update the parameters with requires_grad=True with respect to the loss gradients in order to improve them ( optimizer.step() ).
In [34]:
```python
torch.manual_seed(42)
epochs = 1000  # train for longer this time

for epoch in range(epochs):
    ### Training
    model_1.train()

    # 1. Forward pass
    y_pred = model_1(X_train)

    # 2. Calculate loss
    loss = loss_fn(y_pred, y_train)

    # 3. Zero grad optimizer
    optimizer.zero_grad()

    # 4. Loss backward
    loss.backward()

    # 5. Step the optimizer
    optimizer.step()

    ### Testing
    model_1.eval()  # put the model in evaluation mode for testing (inference)
    with torch.inference_mode():
        # 1. Forward pass
        test_pred = model_1(X_test)
        # 2. Calculate the loss
        test_loss = loss_fn(test_pred, y_test)

    if epoch % 100 == 0:
        print(f"Epoch: {epoch} | Train loss: {loss} | Test loss: {test_loss}")
```
Epoch: 0 | Train loss: 0.5551779866218567 | Test loss: 0.5739762187004089
Epoch: 100 | Train loss: 0.006215683650225401 | Test loss: 0.014086711220443249
Epoch: 200 | Train loss: 0.0012645035749301314 | Test loss: 0.013801801018416882
Epoch: 300 | Train loss: 0.0012645035749301314 | Test loss: 0.013801801018416882
Epoch: 400 | Train loss: 0.0012645035749301314 | Test loss: 0.013801801018416882
Epoch: 500 | Train loss: 0.0012645035749301314 | Test loss: 0.013801801018416882
Epoch: 600 | Train loss: 0.0012645035749301314 | Test loss: 0.013801801018416882
Epoch: 700 | Train loss: 0.0012645035749301314 | Test loss: 0.013801801018416882
Epoch: 800 | Train loss: 0.0012645035749301314 | Test loss: 0.013801801018416882
Epoch: 900 | Train loss: 0.0012645035749301314 | Test loss: 0.013801801018416882
Note: Due to the random nature of machine learning, you will likely get slightly
different results (different loss and prediction values) depending on whether your
model was trained on CPU or GPU. This is true even if you use the same random seed
on either device. If the difference is large, you may want to look for errors, however, if
it is small (ideally it is), you can ignore it.
Let's check the parameters our model has learned and compare them to the original
parameters we hard-coded.
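A sketch of comparing the learned values to the originals (using the weight and bias variables from earlier):

```python
print("The model learned the following values for weights and bias:")
print(model_1.state_dict())
print(f"\nAnd the original values are: weight={weight}, bias={bias}")
```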
Remember though, in practice, it's rare that you'll know the perfect parameters ahead of
time.
And if you knew the parameters your model had to learn ahead of time, what would be
the fun of machine learning?
Plus, in many real-world machine learning problems, the number of parameters can well
exceed tens of millions.
I don't know about you but I'd rather write code for a computer to figure those out rather
than doing it by hand.
6.4 Making predictions

Now we've got a trained model, let's turn on its evaluation mode and make some predictions.
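A sketch (remember the three inference rules from earlier):

```python
# 1. Turn the model into evaluation mode
model_1.eval()

# 2. Make predictions on the test data (model and data are both on the target device)
with torch.inference_mode():
    y_preds = model_1(X_test)
y_preds
```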
Out[36]: tensor([[0.8600],
[0.8739],
[0.8878],
[0.9018],
[0.9157],
[0.9296],
[0.9436],
[0.9575],
[0.9714],
[0.9854]], device='cuda:0')
If you're making predictions with data on the GPU, you might notice the output of the
above has device='cuda:0' towards the end. That means the data is on CUDA device 0
(the first GPU your system has access to due to zero-indexing), if you end up using
multiple GPUs in the future, this number may be higher.
Note: Many data science libraries such as pandas, matplotlib and NumPy aren't
capable of using data that is stored on GPU. So you might run into some issues when
trying to use a function from one of these libraries with tensor data not stored on the
CPU. To fix this, you can call .cpu() on your target tensor to return a copy of your
target tensor on the CPU.
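A sketch of plotting them (the predictions need to come off the GPU first, per the note above):

```python
# matplotlib works with CPU data, so bring the predictions back with .cpu()
plot_predictions(predictions=y_preds.cpu())
```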
Woah! Look at those red dots, they line up almost perfectly with the green dots. I guess the extra epochs helped.

6.5 Saving and loading a model

We're happy with our model's predictions, so let's save it to file so it can be used later.
And just to make sure everything worked well, let's load it back in.

We'll:

- Create a new instance of the model class
- Load in the saved state_dict()
- Send the new instance of the model to the target device (to ensure our code is device-agnostic)
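A sketch of the save and load steps (the filename is an example):

```python
from pathlib import Path

# Save the model's state_dict()
MODEL_PATH = Path("models")
MODEL_PATH.mkdir(parents=True, exist_ok=True)
MODEL_SAVE_PATH = MODEL_PATH / "01_pytorch_workflow_model_1.pth"  # example filename
torch.save(obj=model_1.state_dict(), f=MODEL_SAVE_PATH)

# Create a new instance of the model and load the saved state_dict()
loaded_model_1 = LinearRegressionModelV2()
loaded_model_1.load_state_dict(torch.load(MODEL_SAVE_PATH))

# Send the loaded model to the target device
loaded_model_1.to(device)
```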
print(f"Loaded model:\n{loaded_model_1}")
https://www.learnpytorch.io/01_pytorch_workflow/ 40/43
2024/10/8 清晨7:44 01. PyTorch Workflow Fundamentals - Zero to Mastery Learn PyTorch for Deep Learning
print(f"Model on
device:\n{next(loaded_model_1.parameters()).device}")
Loaded model:
LinearRegressionModelV2(
  (linear_layer): Linear(in_features=1, out_features=1, bias=True)
)
Model on device:
cuda:0
Now we can evaluate the loaded model to see if its predictions line up with the
predictions made prior to saving.
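A sketch:

```python
# Evaluate the loaded model
loaded_model_1.eval()
with torch.inference_mode():
    loaded_model_1_preds = loaded_model_1(X_test)
y_preds == loaded_model_1_preds
```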
Out[40]: tensor([[True],
[True],
[True],
[True],
[True],
[True],
[True],
[True],
[True],
[True]], device='cuda:0')
Well, we've come a long way. You've now built and trained your first two neural network
models in PyTorch!
Exercises
All exercises have been inspired from code throughout the notebook.
Note: For all exercises, your code should be device agnostic (meaning it could run on
CPU or GPU if it's available).
1. Create a straight line dataset using the linear regression formula ( weight * X + bias ).
   - Set weight=0.3 and bias=0.9 ; there should be at least 100 datapoints total.
2. Build a PyTorch model by subclassing nn.Module .
   - Implement the forward() method to compute the linear regression function you used to create the dataset in 1.
   - Once you've constructed the model, make an instance of it and check its state_dict() .
3. Create a loss function and optimizer.
   - Set the learning rate of the optimizer to be 0.01 and the parameters to optimize should be the model parameters from the model you created in 2.
   - Write a training loop to perform the appropriate training steps for 300 epochs.
   - The training loop should test the model on the test dataset every 20 epochs.
4. Make predictions with the trained model on the test data.
   - Visualize these predictions against the original training and testing data (note: you may need to make sure the predictions are not on the GPU if you want to use non-CUDA-enabled libraries such as matplotlib to plot).
5. Save your trained model's state_dict() to file.
   - Create a new instance of your model class you made in 2. and load in the state_dict() you just saved to it.
   - Perform predictions on your test data with the loaded model and confirm they match the original model predictions from 4.
Resource: See the exercises notebooks templates and solutions on the course
GitHub.
Extra-curriculum
Listen to The Unofficial PyTorch Optimization Loop Song (to help remember the
steps in a PyTorch training/testing loop).
Spend 10-minutes scrolling through and checking out the PyTorch documentation
cheatsheet for all of the different PyTorch modules you might come across.
Spend 10-minutes reading the loading and saving documentation on the PyTorch
website to become more familiar with the different saving and loading options in
PyTorch.
Spend 1-2 hours reading/watching the following for an overview of the internals of
gradient descent and backpropagation, the two main algorithms that have been
working in the background to help our model learn.