PyTorch Fundamentals - Zero To Mastery Learn PyTorch For Deep Learning
For example, Andrej Karpathy (head of AI at Tesla) has given several talks (PyTorch
DevCon 2019, Tesla AI Day 2021) about how Tesla use PyTorch to power their self-
driving computer vision models.
PyTorch is also used in other industries such as agriculture to power computer vision
on tractors.
PyTorch also helps take care of many things such as GPU acceleration (making your
code run faster) behind the scenes.
So you can focus on manipulating data and writing algorithms and PyTorch will make
sure it runs fast.
And if companies such as Tesla and Meta (Facebook) use it to build models they
deploy to power hundreds of applications, drive thousands of cars and deliver content
to billions of people, it's clearly capable on the development front too.
Subsequent notebooks build upon knowledge from the previous one (numbering
starts at 00, 01, 02 and goes to whatever it ends up going to).
This notebook deals with the basic building block of machine learning and deep
learning, the tensor.
Topic Contents
And if you run into trouble, you can ask a question on the Discussions page there too.
There's also the PyTorch developer forums, a very helpful place for all things PyTorch.
Importing PyTorch
Note: Before running any of the code in this notebook, you should have
gone through the PyTorch setup steps.
Let's start by importing PyTorch and checking the version we're using.
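The import cell itself isn't shown in this extract; given the output below, it would have been:

In [1]: # Import PyTorch and check the version
import torch
torch.__version__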
Out[1]: '1.13.1+cu116'
This means if you're going through these materials, you'll see most compatibility with
PyTorch 1.10.0+, however if your version number is far higher than that, you might
notice some inconsistencies.
And if you do have any issues, please post on the course GitHub Discussions page.
Introduction to tensors
Now we've got PyTorch imported, it's time to learn about tensors.
For example, you could represent an image as a tensor with shape [3, 224, 224]
which would mean [colour_channels, height, width] , as in the image has 3
colour channels (red, green, blue), a height of 224 pixels and a width of 224 pixels.
In tensor-speak (the language used to describe tensors), the tensor would have three
dimensions, one for colour_channels , height and width .
Creating tensors
PyTorch loves tensors. So much so there's a whole documentation page dedicated to
the torch.Tensor class.
Let's code.
Note: That's a trend for this course. We'll focus on writing specific code.
But often I'll set exercises which involve reading and getting familiar
with the PyTorch documentation. Because after all, once you're finished
this course, you'll no doubt want to learn more. And the documentation
is somewhere you'll be finding yourself quite often.
In [2]: # Scalar
scalar = torch.tensor(7)
scalar
Out[2]: tensor(7)
In [3]: scalar.ndim
Out[3]: 0
In [4]: # Get the Python number within a tensor (only works with one-element tensors)
scalar.item()
Out[4]: 7
The important trend here is that a vector is flexible in what it can represent (the same
with tensors).
In [5]: # Vector
vector = torch.tensor([7, 7])
vector
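Out[5]: tensor([7, 7])

Now let's check the number of dimensions of vector (the cell below is reconstructed, but it matches the output that follows):

In [6]: vector.ndim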
Out[6]: 1
Hmm, that's strange, vector contains two numbers but only has a single dimension.
You can tell the number of dimensions a tensor in PyTorch has by the number of
square brackets on the outside ( [ ) and you only need to count one side.
Another important concept for tensors is their shape attribute. The shape tells you
how the elements inside them are arranged.
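In [7]: vector.shape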
Out[7]: torch.Size([2])
The above returns torch.Size([2]) which means our vector has a shape of [2] .
This is because of the two elements we placed inside the square brackets ( [7, 7] ).
In [8]: # Matrix
MATRIX = torch.tensor([[7, 8],
[9, 10]])
MATRIX
Wow! More numbers! Matrices are as flexible as vectors, except they've got an extra
dimension.
Out[9]: 2
MATRIX has two dimensions (did you count the number of square brackets on the outside of one side?).
In [10]: MATRIX.shape
Out[10]: torch.Size([2, 2])
We get the output torch.Size([2, 2]) because MATRIX is two elements deep
and two elements wide.
In [11]: # Tensor
TENSOR = torch.tensor([[[1, 2, 3],
[3, 6, 9],
[2, 4, 5]]])
TENSOR
The one we just created could be the sales numbers for a steak and almond butter
store (two of my favourite foods).
How many dimensions do you think it has? (hint: use the square bracket counting
trick)
Note: You might've noticed me using lowercase letters for scalar and
vector and uppercase letters for MATRIX and TENSOR . This was on
purpose. In practice, you'll often see scalars and vectors denoted as
lowercase letters such as y or a . And matrices and tensors denoted
as uppercase letters such as X or W .
You also might notice the names matrix and tensor used interchangeably. This is common, since in PyTorch you're often dealing with torch.Tensor s (hence the tensor name); however, the shape and number of dimensions of what's inside will dictate what it actually is.
Let's summarise.
Name | What is it? | Number of dimensions (usually) | Lower or upper (usually/example)
scalar | a single number | 0 | lower ( a )
vector | a number with direction, but can also contain many other numbers | 1 | lower ( y )
MATRIX | a 2-dimensional array of numbers | 2 | upper ( Q )
TENSOR | an n-dimensional array of numbers | can be any number | upper ( X )
And machine learning models such as neural networks manipulate and seek patterns
within tensors.
But when building machine learning models with PyTorch, it's rare you'll create tensors by hand (like what we've been doing).
Instead, a machine learning model often starts out with large random tensors of
numbers and adjusts these random numbers as it works through data to better
represent it.
In essence:
Start with random numbers -> look at data -> update random numbers -
> look at data -> update random numbers...
As a data scientist, you can define how the machine learning model starts
(initialization), looks at data (representation) and updates (optimization) its random
numbers.
For example, say you wanted a random tensor in the common image shape of [224, 224, 3] ( [height, width, colour_channels] ).
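You can create one with torch.rand() and the size parameter (the variable name below is just illustrative):

# Create a random tensor of size (224, 224, 3)
random_image_size_tensor = torch.rand(size=(224, 224, 3))
random_image_size_tensor.shape, random_image_size_tensor.ndim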
Sometimes you'll just want to fill tensors with zeros or ones. This happens a lot with masking (like masking some of the values in one tensor with zeros to let a model know not to learn them).
For example, you might want a tensor of all zeros with the same shape as a previous tensor.
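A quick sketch of how you might do that with torch.zeros() , torch.ones() and torch.zeros_like() :

# Create a tensor of all zeros
zeros = torch.zeros(size=(3, 4))

# Create a tensor of all ones
ones = torch.ones(size=(3, 4))

# Create a tensor of all zeros with the same shape as a previous tensor
zeros_like = torch.zeros_like(input=ones)
zeros, ones, zeros_like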
Tensor datatypes
There are many different tensor datatypes available in PyTorch.
Some are specific for CPU and some are better for GPU.
Generally if you see torch.cuda anywhere, the tensor is being used for GPU (since
Nvidia GPUs use a computing toolkit called CUDA).
But there's also 16-bit floating point ( torch.float16 or torch.half ) and 64-bit
floating point ( torch.float64 or torch.double ).
And to confuse things even more there's also 8-bit, 16-bit, 32-bit and 64-bit integers.
Plus more!
This matters in deep learning and numerical computing because you're making so many operations; the more detail you have to calculate in, the more compute you have to use.
So lower precision datatypes are generally faster to compute on but sacrifice some
performance on evaluation metrics like accuracy (faster to compute but less accurate).
Resources:
See the PyTorch documentation for a list of all available tensor datatypes.
Read the Wikipedia page for an overview of what precision in computing) is.
Let's see how to create some tensors with specific datatypes. We can do so using the
dtype parameter.
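For example (the values here are only illustrative):

# Default datatype for tensors is float32
float_32_tensor = torch.tensor([3.0, 6.0, 9.0],
                               dtype=None,   # defaults to None, which is torch.float32
                               device=None,  # defaults to None, which uses the default tensor device
                               requires_grad=False)  # if True, operations performed on the tensor are recorded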
float_32_tensor.shape, float_32_tensor.dtype, float_32_tensor.device
Aside from shape issues (tensor shapes don't match up), two of the other most
common issues you'll come across in PyTorch are datatype and device issues.
Or one of your tensors is on the CPU and the other is on the GPU (PyTorch likes
calculations between tensors to be on the same device).
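Let's create a torch.float16 tensor by converting the float32 tensor above (a minimal sketch):

float_16_tensor = float_32_tensor.type(torch.float16)  # could also use torch.half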
float_16_tensor.dtype
Out[21]: torch.float16
We've seen these before but three of the most common attributes you'll want to find
out about tensors are:
shape - what shape is the tensor? (some operations require specific shape
rules)
dtype - what datatype are the elements within the tensor stored in?
device - what device is the tensor stored on? (usually GPU or CPU)
Let's create a random tensor and find out details about it.
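Something like this does the trick (the values will differ every run because the tensor is random):

# Create a tensor
some_tensor = torch.rand(3, 4)

# Find out details about it
print(some_tensor)
print(f"Shape of tensor: {some_tensor.shape}")
print(f"Datatype of tensor: {some_tensor.dtype}")
print(f"Device tensor is stored on: {some_tensor.device}")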
Note: When you run into issues in PyTorch, it's very often to do with one of the three attributes above. So when the error messages show up, sing yourself a little song called "what, what, where":
"what shape are my tensors? what datatype are they and where are they
stored? what shape, what datatype, where where where"
Addition
Subtraction
Multiplication (element-wise)
Division
Matrix multiplication
And that's it. Sure there are a few more here and there but these are the basic building
blocks of neural networks.
Stacking these building blocks in the right way, you can create the most sophisticated
of neural networks (just like lego!).
Basic operations
Let's start with a few of the fundamental operations: addition ( + ), subtraction ( - ), multiplication ( * ).
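The cell creating the tensor isn't shown above, but given the outputs that follow, it would have been along these lines:

# Create a tensor of values and add a number to it
tensor = torch.tensor([1, 2, 3])
tensor + 10
# -> tensor([11, 12, 13])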
In [24]: # Multiply it by 10
tensor * 10
Out[24]: tensor([10, 20, 30])
Notice how the tensor values above didn't end up being tensor([110, 120, 130]) ; this is because the values inside the tensor don't change unless they're reassigned.
Let's subtract a number and this time we'll reassign the tensor variable.
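For example:

# Subtract and reassign
tensor = tensor - 10
tensor
# -> tensor([-9, -8, -7])

# Add and reassign (back to the original values)
tensor = tensor + 10
tensor
# -> tensor([1, 2, 3])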
PyTorch also has a bunch of built-in functions like torch.mul() (short for
multiplication) and torch.add() to perform basic operations.
However, it's more common to use the operator symbols like * instead of
torch.mul()
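For example, these produce the same results as the operators above (the original tensor stays unchanged):

torch.mul(tensor, 10)   # equivalent to tensor * 10 -> tensor([10, 20, 30])
torch.add(tensor, 10)   # equivalent to tensor + 10 -> tensor([11, 12, 13])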
Resource: You can see all of the rules for matrix multiplication using
torch.matmul() in the PyTorch documentation.
Out[31]: torch.Size([3])
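(The shape above belongs to our tensor([1, 2, 3]) from earlier.) For two 1-dimensional tensors, torch.matmul() performs a dot product: 1*1 + 2*2 + 3*3 = 14. The cell producing the output below would have been:

In [33]: # Matrix multiplication
torch.matmul(tensor, tensor)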
Out[33]: tensor(14)
In [34]: # Can also use the "@" symbol for matrix multiplication, though not recommended
tensor @ tensor
Out[34]: tensor(14)
In [35]: %%time
# Matrix multiplication by hand
# (avoid doing operations with for loops at all costs, they are computationally expensive)
value = 0
for i in range(len(tensor)):
value += tensor[i] * tensor[i]
value
CPU times: user 773 µs, sys: 0 ns, total: 773 µs
Wall time: 499 µs
Out[35]: tensor(14)
In [36]: %%time
torch.matmul(tensor, tensor)
CPU times: user 146 µs, sys: 83 µs, total: 229 µs
Wall time: 171 µs
Out[36]: tensor(14)
---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
/home/daniel/code/pytorch/pytorch-course/pytorch-deep-learning/00_pytorch_fundamentals.ipynb Cell 75 in <cell line: 10>()
      2 tensor_A = torch.tensor([[1, 2],
      3                          [3, 4],
      4                          [5, 6]], dtype=torch.float32)
      6 tensor_B = torch.tensor([[7, 10],
      7                          [8, 11],
      8                          [9, 12]], dtype=torch.float32)
---> 10 torch.matmul(tensor_A, tensor_B)
One of the ways to do this is with a transpose (switch the dimensions of a given
tensor).
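For example, a quick sketch using the tensors defined above (transposing tensor_B so the inner dimensions match):

print(f"Original shapes: tensor_A = {tensor_A.shape}, tensor_B = {tensor_B.shape}")
print(f"New shapes: tensor_A = {tensor_A.shape} (same as above), tensor_B.T = {tensor_B.T.shape}")
output = torch.matmul(tensor_A, tensor_B.T)
print(output)
print(f"Output shape: {output.shape}")
# -> tensor([[ 27.,  30.,  33.],
#            [ 61.,  68.,  75.],
#            [ 95., 106., 117.]])
# Output shape: torch.Size([3, 3])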
Without the transpose, the rules of matrix multiplication aren't fulfilled and we get an error like the one above.
How about a visual?
You can create your own matrix multiplication visuals like this at
http://matrixmultiplication.xyz/.
The torch.nn.Linear() module (we'll see this in action later on), also known as a
feed-forward layer or fully connected layer, implements a matrix multiplication
between an input x and a weights matrix A .
y = x ⋅ A^T + b
Where:
A is the weights matrix created by the layer; this starts out as random numbers that get adjusted as a neural network learns to better represent patterns in the data (notice the " T ", that's because the weights matrix gets transposed).
Note: You might also often see W or another letter like X used to
showcase the weights matrix.
b is the bias term used to slightly offset the weights and inputs.
y is the output (a manipulation of the input in the hopes to discover
patterns in it).
This is a linear function (you may have seen something like y = mx + b in high school or elsewhere), and can be used to draw a straight line!
Try changing the values of in_features and out_features below and see what
happens.
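The cell that produced the output below isn't shown in this extract; it would have been along these lines (the seed and feature values here are assumptions, chosen to be consistent with the output shape):

torch.manual_seed(42)
# This uses matrix multiplication under the hood
linear = torch.nn.Linear(in_features=2,  # in_features = matches inner dimension of input
                         out_features=6) # out_features = describes outer value
x = tensor_A  # shape (3, 2)
output = linear(x)
print(f"Output:\n{output}\n\nOutput shape: {output.shape}")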
Output:
tensor([[2.2368, 1.2292, 0.4714, 0.3864, 0.1309, 0.9838],
[4.4919, 2.1970, 0.4469, 0.5285, 0.3401, 2.4777],
[6.7469, 3.1648, 0.4224, 0.6705, 0.5493, 3.9716]],
grad_fn=<AddmmBackward0>)
If you've never done it before, matrix multiplication can be a confusing topic at first.
But after you've played around with it a few times and even cracked open a few neural
networks, you'll notice it's everywhere.
First we'll create a tensor and then find the max, min, mean and sum of it.
Out[43]: tensor([ 0, 10, 20, 30, 40, 50, 60, 70, 80, 90])
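Assuming the tensor above was assigned to a variable called x (the creation cell isn't shown here), the aggregation calls look like this:

x = torch.arange(0, 100, 10)
print(f"Minimum: {x.min()}")
print(f"Maximum: {x.max()}")
# print(f"Mean: {x.mean()}") # this would error: torch.mean() requires a float (or complex) dtype
print(f"Mean: {x.type(torch.float32).mean()}")  # won't work without converting to a float datatype
print(f"Sum: {x.sum()}")
# -> Minimum: 0, Maximum: 90, Mean: 45.0, Sum: 450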
Positional min/max
You can also find the index of a tensor where the max or minimum occurs with
torch.argmax() and torch.argmin() respectively.
This is helpful in case you just want the position where the highest (or lowest) value is and not the actual value itself (we'll see this in a later section when using the softmax activation function).
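For example, using the same x as above:

print(f"Index where max value occurs: {x.argmax()}")  # 9
print(f"Index where min value occurs: {x.argmin()}")  # 0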
First we'll create a tensor and check its datatype (the default is torch.float32 ).
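The creation cell isn't shown here; given the values in the outputs below, it would have been:

In [47]: # Create a tensor and check its datatype
tensor = torch.arange(10., 100., 10.)
tensor.dtype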
Out[47]: torch.float32
Now we'll create another tensor the same as before but change its datatype to
torch.float16 .
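In [48]: # Change the datatype with .type()
tensor_float16 = tensor.type(torch.float16)
tensor_float16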
Out[48]: tensor([10., 20., 30., 40., 50., 60., 70., 80., 90.], dtype=torch.float16)
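And the same method works for an 8-bit integer tensor:

In [49]: tensor_int8 = tensor.type(torch.int8)
tensor_int8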
Out[49]: tensor([10, 20, 30, 40, 50, 60, 70, 80, 90], dtype=torch.int8)
Exercise: So far we've covered a fair few tensor methods but there's a bunch more in the torch.Tensor documentation. I'd recommend spending 10 minutes scrolling through and looking into any that catch your eye. Click on them and then write them out in code yourself to see what happens.
Reshaping, stacking, squeezing and unsqueezing
Often times you'll want to reshape or change the dimensions of your tensors without
actually changing the values inside them.
Because deep learning models (neural networks) are all about manipulating tensors in
some way. And because of the rules of matrix multiplication, if you've got shape
mismatches, you'll run into errors. These methods help you make sure the right
elements of your tensors are mixing with the right elements of other tensors.
Remember though, changing the view of a tensor with torch.view() really only
creates a new view of the same tensor.
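The cell producing the output below isn't shown here; it would have been something like changing the first element of the view and then inspecting both tensors:

x = torch.arange(1., 8.)
z = x.view(1, 7)
# Changing z changes x (because a view of a tensor shares the same memory as the original)
z[:, 0] = 5
z, x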
Out[53]: (tensor([[5., 2., 3., 4., 5., 6., 7.]]), tensor([5., 2., 3., 4., 5., 6., 7.]))
If we wanted to stack our new tensor on top of itself five times, we could do so with
torch.stack() .
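For example:

x_stacked = torch.stack([x, x, x, x, x], dim=0)  # try changing dim to dim=1 and see what happens
x_stacked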
You can also rearrange the order of axes values with torch.permute(input,
dims) , where the input gets turned into a view with new dims .
Note: Because permuting returns a view (shares the same data as the
original), the values in the permuted tensor will be the same as the
original tensor and if you change the values in the view, it will change
the values of the original.
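For example, permuting an image-shaped tensor to put the colour channels first (a common operation in PyTorch):

# Create a tensor with a specific shape
x_original = torch.rand(size=(224, 224, 3))  # [height, width, colour_channels]

# Permute the original tensor to rearrange the axis order
x_permuted = x_original.permute(2, 0, 1)  # shifts axis 0->1, 1->2, 2->0

print(f"Previous shape: {x_original.shape}")  # torch.Size([224, 224, 3])
print(f"New shape: {x_permuted.shape}")       # torch.Size([3, 224, 224])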
If you've ever done indexing on Python lists or NumPy arrays, indexing in PyTorch with
tensors is very similar.
Indexing values goes outer dimension -> inner dimension (check out the square
brackets).
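The cell creating the tensor we're indexing isn't shown in this extract; given the outputs below, it would have been:

# Create a tensor
x = torch.arange(1, 10).reshape(1, 3, 3)
x, x.shape
# -> (tensor([[[1, 2, 3],
#              [4, 5, 6],
#              [7, 8, 9]]]), torch.Size([1, 3, 3]))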
In [60]: # Get all values of 0th dimension and the 0 index of 1st dimension
x[:, 0]
In [61]: # Get all values of 0th & 1st dimensions but only index 1 of 2nd dimension
x[:, :, 1]
In [62]: # Get all values of the 0 dimension but only the 1 index value of the 1st and 2nd dimension
x[:, 1, 1]
Out[62]: tensor([5])
In [63]: # Get index 0 of 0th and 1st dimension and all values of 2nd dimension
x[0, 0, :] # same as x[0][0]
Indexing can be quite confusing to begin with, especially with larger tensors (I still
have to try indexing multiple times to get it right). But with a bit of practice and
following the data explorer's motto (visualize, visualize, visualize), you'll start to get
the hang of it.
The two main methods you'll want to use for NumPy to PyTorch (and back again) are:
So if you want to convert your NumPy array (float64) -> PyTorch tensor
(float64) -> PyTorch tensor (float32), you can use tensor =
torch.from_numpy(array).type(torch.float32) .
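For example:

# NumPy array to tensor
import numpy as np
array = np.arange(1.0, 8.0)
tensor = torch.from_numpy(array)  # warning: keeps NumPy's default float64 dtype unless converted
array, tensor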
Because we reassigned tensor above, if you change the tensor, the array stays the
same.
And if you want to go from PyTorch tensor to NumPy array, you can call
tensor.numpy() .
And the same rule applies as above, if you change the original tensor , the new
numpy_tensor stays the same.
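The cell itself isn't shown above; it would have been along these lines:

# Tensor to NumPy array
tensor = torch.ones(7)  # create a tensor of ones with dtype=float32
numpy_tensor = tensor.numpy()  # will be dtype=float32 unless changed
tensor, numpy_tensor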
In [67]: # Change the tensor, keep the array the same
tensor = tensor + 1
tensor, numpy_tensor
Well, pseudorandomness that is. Because after all, as they're designed, a computer is fundamentally deterministic (each step is predictable), so the randomness they create is simulated randomness (though there is debate on this too, but since I'm not a computer scientist, I'll let you find out more yourself).
How does this relate to neural networks and deep learning then?
We've discussed neural networks start with random numbers to describe patterns in
data (these numbers are poor descriptions) and try to improve those random numbers
using tensor operations (and a few other things we haven't discussed yet) to better
describe patterns in data.
In short:
start with random numbers -> tensor operations -> try to make better
(again and again and again)
Although randomness is nice and powerful, sometimes you'd like there to be a little
less randomness.
Why?
And then your friend tries it out to verify you're not crazy.
We'll start by creating two random tensors. Since they're random, you'd expect them to be different, right?
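The start of the cell that created them isn't shown here; it would have been:

import torch

# Create two random tensors
random_tensor_A = torch.rand(3, 4)
random_tensor_B = torch.rand(3, 4)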
print(f"Tensor A:\n{random_tensor_A}\n")
print(f"Tensor B:\n{random_tensor_B}\n")
print(f"Does Tensor A equal Tensor B? (anywhere)")
random_tensor_A == random_tensor_B
Tensor A:
tensor([[0.8016, 0.3649, 0.6286, 0.9663],
[0.7687, 0.4566, 0.5745, 0.9200],
[0.3230, 0.8613, 0.0919, 0.3102]])
Tensor B:
tensor([[0.9536, 0.6002, 0.0351, 0.6826],
[0.3743, 0.5220, 0.1336, 0.9666],
[0.9754, 0.8474, 0.8988, 0.1105]])
Just as you might've expected, the tensors come out with different values.
But what if you wanted to create two random tensors with the same values? As in, the tensors would still contain random values, but they would be of the same flavour.
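That's where torch.manual_seed() comes in; setting the seed before each creation "flavours" the randomness. The cell that produced the output below would have been along these lines (the exact seed value is an assumption):

import torch
RANDOM_SEED = 42  # the seed value here is an assumption, any fixed integer gives repeatable results

# Create two random tensors, setting the seed before each one
torch.manual_seed(RANDOM_SEED)
random_tensor_C = torch.rand(3, 4)

torch.manual_seed(RANDOM_SEED)  # have to reset the seed every time a new rand() is called
random_tensor_D = torch.rand(3, 4)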
print(f"Tensor C:\n{random_tensor_C}\n")
print(f"Tensor D:\n{random_tensor_D}\n")
print(f"Does Tensor C equal Tensor D? (anywhere)")
random_tensor_C == random_tensor_D
Tensor C:
tensor([[0.8823, 0.9150, 0.3829, 0.9593],
[0.3904, 0.6009, 0.2566, 0.7936],
[0.9408, 0.1332, 0.9346, 0.5936]])
Tensor D:
tensor([[0.8823, 0.9150, 0.3829, 0.9593],
[0.3904, 0.6009, 0.2566, 0.7936],
[0.9408, 0.1332, 0.9346, 0.5936]])
Nice!
And by default these operations are often done on a CPU (central processing unit).
If so, you should look to use it whenever you can to train neural networks because
chances are it'll speed up the training time dramatically.
There are a few ways to first get access to a GPU and secondly get PyTorch to use the
GPU.
1. Getting a GPU
You may already know what's going on when I say GPU. But if not, there are a few
ways to get access to one.
There are more options for using GPUs but the above three will suffice for now.
Personally, I use a combination of Google Colab and my own personal computer for
small scale experiments (and creating this course) and go to cloud resources when I
need more compute power.
Resource: If you're looking to purchase a GPU of your own but not sure
what to get, Tim Dettmers has an excellent guide.
To check if you've got access to a Nvidia GPU, you can run !nvidia-smi where the
! (also called bang) means "run this on the command line".
In [70]: !nvidia-smi
Sat Jan 21 08:34:23 2023
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 515.48.07    Driver Version: 515.48.07    CUDA Version: 11.7     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA TITAN RTX    On   | 00000000:01:00.0 Off |                  N/A |
| 40%   30C    P8     7W / 280W |    177MiB / 24576MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A      1061      G   /usr/lib/xorg/Xorg                 53MiB |
|    0   N/A  N/A   2671131      G   /usr/lib/xorg/Xorg                 97MiB |
|    0   N/A  N/A   2671256      G   /usr/bin/gnome-shell                9MiB |
+-----------------------------------------------------------------------------+
If you don't have a Nvidia GPU accessible, the above will output something like:
If you do have a GPU, the line above will output something like:
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+
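You can also check whether PyTorch itself can see a GPU with torch.cuda.is_available() :

In [71]: # Check for GPU
import torch
torch.cuda.is_available()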
Out[71]: True
If the above outputs True , PyTorch can see and use the GPU, if it outputs False , it
can't see the GPU and in that case, you'll have to go back through the installation
steps.
Now, let's say you wanted to set up your code so it ran on the CPU or the GPU if it was available.
That way, if you or someone decides to run your code, it'll work regardless of the
computing device they're using.
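A common pattern for this (and presumably the cell that produced the output below) is:

In [72]: # Set device type
device = "cuda" if torch.cuda.is_available() else "cpu"
device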
Out[72]: 'cuda'
If the above outputs "cuda" it means we can set all of our PyTorch code to use the available CUDA device (a GPU) and if it outputs "cpu" , our PyTorch code will stick with the CPU.
Note: In PyTorch, it's best practice to write device agnostic code. This
means code that'll run on CPU (always available) or GPU (if available).
If you want to do faster computing you can use a GPU but if you want to do much
faster computing, you can use multiple GPUs.
You can count the number of GPUs PyTorch has access to using
torch.cuda.device_count() .
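In [73]: # Count number of devices
torch.cuda.device_count()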
Out[73]: 1
Knowing the number of GPUs PyTorch has access to is helpful in case you wanted to run a specific process on one GPU and another process on another (PyTorch also has features to let you run a process across all GPUs).
Why do this?
GPUs offer far faster numerical computing than CPUs do and if a GPU isn't available,
because of our device agnostic code (see above), it'll run on the CPU.
Note: Putting a tensor on GPU using to(device) (e.g. some_tensor.to(device) ) returns a copy of that tensor, i.e. the same tensor will be on CPU and GPU. To overwrite tensors, reassign them:
some_tensor = some_tensor.to(device)
Let's try creating a tensor and putting it on the GPU (if it's available).
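# Create tensor (default on CPU)
tensor = torch.tensor([1, 2, 3])
print(tensor, tensor.device)

# Move tensor to GPU (if available)
tensor_on_gpu = tensor.to(device)
tensor_on_gpu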
If you have a GPU available, the above code will output something like:
Notice the second tensor has device='cuda:0' , this means it's stored on the 0th
GPU available (GPUs are 0 indexed, if two GPUs were available, they'd be 'cuda:0'
and 'cuda:1' respectively, up to 'cuda:n' ).
For example, you'll want to do this if you want to interact with your tensors with
NumPy (NumPy does not leverage the GPU).
# If tensor is on GPU, can't transform it to NumPy (this will error)
tensor_on_gpu.numpy()

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
/home/daniel/code/pytorch/pytorch-course/pytorch-deep-learning/00_pytorch_fundamentals.ipynb Cell 157 in <cell line: 2>()
      1 # If tensor is on GPU, can't transform it to NumPy (this will error)
----> 2 tensor_on_gpu.numpy()
Instead, to get a tensor back to CPU and usable with NumPy we can use
Tensor.cpu() .
This copies the tensor to CPU memory so it's usable with CPUs.
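For example:

# Instead, copy the tensor back to CPU first
tensor_back_on_cpu = tensor_on_gpu.cpu().numpy()
tensor_back_on_cpu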
The above returns a copy of the GPU tensor in CPU memory so the original tensor is
still on GPU.
In [77]: tensor_on_gpu
Exercises
All of the exercises are focused on practicing the code above.
You should be able to complete them by referencing each section or by following the
resource(s) linked.
Resources:
Exercise template notebook for 00.
Example solutions notebook for 00 (try the exercises before looking at this).
6. Create two random tensors of shape (2, 3) and send them both to the
GPU (you'll need access to a GPU for this). Set torch.manual_seed(1234)
when creating the tensors (this doesn't have to be the GPU random seed).
10. Make a random tensor with shape (1, 1, 1, 10) and then create a new tensor with all the 1 dimensions removed to be left with a tensor of shape (10) . Set the seed to 7 when you create it and print out the first tensor and its shape as well as the second tensor and its shape.
Extra-curriculum
Spend 1-hour going through the PyTorch basics tutorial (I'd recommend the
Quickstart and Tensors sections).
To learn more on how a tensor can represent data, see this video: What's a
tensor?