Chapter 3
Activation functions
INTRODUCTION TO DEEP LEARNING WITH PYTORCH
The rectified linear unit (ReLU): f(x) = max(x, 0)
In PyTorch:

import torch.nn as nn

relu = nn.ReLU()
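For instance, applying relu from the snippet above to a small tensor keeps positive values and zeroes out negatives (the example values are illustrative):

import torch

x = torch.tensor([-2.0, 0.0, 3.0])
print(relu(x))  # tensor([0., 0., 3.])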
Increasing the number of hidden layers = increasing the number of parameters = increasing
the model capacity
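As a running example, consider a network with 8 input features, a first hidden layer of 4 neurons, and an output layer of 2 neurons (a sketch inferred from the parameter counts below; the variable name model is assumed by the counting code):

import torch.nn as nn

model = nn.Sequential(
    nn.Linear(8, 4),  # 4 neurons, each with 8 weights + 1 bias = 36 parameters
    nn.Linear(4, 2),  # 2 neurons, each with 4 weights + 1 bias = 10 parameters
)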
total = 0
for parameter in model.parameters():
    total += parameter.numel()
print(total)

46

Manually calculating the number of parameters:
first layer has 4 neurons, each neuron has 8 + 1 parameters = 36 parameters
second layer has 2 neurons, each neuron has 4 + 1 parameters = 10 parameters
total = 46 learnable parameters
Two parameters:
learning rate: controls the step size
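As a sketch, the learning rate is set when the optimizer is created; torch.optim.SGD, the value 0.001, and reusing model from the example above are assumptions for illustration:

import torch.optim as optim

# lr controls the step size taken at each parameter update
optimizer = optim.SGD(model.parameters(), lr=0.001)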
The outputs of a layer can explode if the inputs and the weights are not normalized.
The weights can be initialized using different methods (for example, using a uniform
distribution)
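A minimal sketch of uniform initialization, assuming a small custom module with a single linear sublayer named fc (the names and sizes are illustrative, chosen to match the print statement below):

import torch.nn as nn

class CustomLayer(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(64, 128)

custom_layer = CustomLayer()
# Re-initialize the weights in place with a uniform distribution
nn.init.uniform_(custom_layer.fc.weight)

Checking the range of the initialized weights: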
print(custom_layer.fc.weight.min(), custom_layer.fc.weight.max())
For example, suppose we have trained a first model on a large dataset of data scientist salaries across the US, and we now want to train a new model on a smaller dataset of salaries in Europe. Rather than starting from scratch, we can transfer the layers learned on the first task and reuse them in the second model.
import torch

# Load a layer saved earlier with torch.save(layer, 'layer.pth')
new_layer = torch.load('layer.pth')
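A sketch of how the loaded layer could be reused in a new model, assuming the saved object is an nn.Linear layer (all names and sizes here are illustrative):

import torch.nn as nn

# Reuse the pre-trained layer as the first layer of a new network
model = nn.Sequential(
    new_layer,
    nn.Linear(new_layer.out_features, 1),
)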
Rule of thumb: freeze the early layers of the network and fine-tune the layers closer to the output layer, as in the sketch below.
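A minimal sketch of freezing the first layer of a small network while leaving later layers trainable (the architecture and sizes are illustrative):

import torch.nn as nn

model = nn.Sequential(
    nn.Linear(64, 128),
    nn.ReLU(),
    nn.Linear(128, 256),
)

for name, param in model.named_parameters():
    # Parameters of the first Linear layer are named '0.weight' and '0.bias'
    if name.startswith('0.'):
        param.requires_grad = False  # frozen: excluded from gradient updates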