# [ Deep Learning Using PyTorch ] [ cheatsheet ]

Tensor Creation and Manipulation

● Create a tensor from a list: tensor = torch.tensor([1, 2, 3])
● Create a tensor of zeros: tensor = torch.zeros(shape)
● Create a tensor of ones: tensor = torch.ones(shape)
● Create a tensor with random values: tensor = torch.rand(shape)
● Create a tensor with normally distributed random values: tensor =
torch.randn(shape)
● Create a tensor with a range of values: tensor = torch.arange(start, end,
step)
● Create a tensor with evenly spaced values: tensor = torch.linspace(start,
end, steps)
● Reshape a tensor: tensor = tensor.view(new_shape)
● Transpose a tensor: tensor = tensor.transpose(dim1, dim2)
● Flatten a tensor: tensor = tensor.flatten()
● Concatenate tensors along a dimension: tensor = torch.cat([tensor1,
tensor2], dim)
● Stack tensors along a new dimension: tensor = torch.stack([tensor1,
tensor2], dim)
● Squeeze a tensor (remove dimensions of size 1): tensor =
tensor.squeeze()
● Unsqueeze a tensor (add a dimension of size 1): tensor =
tensor.unsqueeze(dim)
● Permute the dimensions of a tensor: tensor = tensor.permute(dims)
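
For example, a minimal sketch chaining a few of the calls above (the shapes are chosen only for illustration):

```python
import torch

x = torch.arange(0, 12)        # tensor([0, 1, ..., 11])
x = x.view(3, 4)               # reshape to 3 rows, 4 columns
x = x.transpose(0, 1)          # shape becomes (4, 3)
x = x.unsqueeze(0)             # add a leading dimension: (1, 4, 3)
y = torch.cat([x, x], dim=0)   # concatenate along dim 0: (2, 4, 3)
```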

Tensor Operations

● Addition: result = tensor1 + tensor2
● Subtraction: result = tensor1 - tensor2
● Multiplication (element-wise): result = tensor1 * tensor2
● Division (element-wise): result = tensor1 / tensor2
● Matrix multiplication: result = tensor1.matmul(tensor2)
● Exponential: result = torch.exp(tensor)
● Logarithm: result = torch.log(tensor)
● Square root: result = torch.sqrt(tensor)
● Sine: result = torch.sin(tensor)
● Cosine: result = torch.cos(tensor)
● Tangent: result = torch.tan(tensor)
● Sigmoid: result = torch.sigmoid(tensor)
● ReLU: result = torch.relu(tensor)
● Tanh: result = torch.tanh(tensor)
● Softmax: result = torch.softmax(tensor, dim)
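
A small sketch contrasting element-wise and matrix operations on the same tensors (values are arbitrary):

```python
import torch

a = torch.tensor([[1.0, 2.0], [3.0, 4.0]])
b = torch.tensor([[5.0, 6.0], [7.0, 8.0]])

elementwise = a * b               # element-wise product, shape (2, 2)
matmul = a.matmul(b)              # matrix product, shape (2, 2)
probs = torch.softmax(a, dim=1)   # each row sums to 1
```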

Neural Network Layers

● Linear layer: layer = nn.Linear(in_features, out_features)
● Convolutional layer: layer = nn.Conv2d(in_channels, out_channels, kernel_size, stride, padding)
● Transposed convolutional layer: layer = nn.ConvTranspose2d(in_channels,
out_channels, kernel_size, stride, padding)
● Max pooling layer: layer = nn.MaxPool2d(kernel_size, stride, padding)
● Average pooling layer: layer = nn.AvgPool2d(kernel_size, stride, padding)
● Batch normalization layer: layer = nn.BatchNorm2d(num_features)
● Dropout layer: layer = nn.Dropout(p)
● Recurrent layer (RNN): layer = nn.RNN(input_size, hidden_size,
num_layers)
● Long Short-Term Memory layer (LSTM): layer = nn.LSTM(input_size,
hidden_size, num_layers)
● Gated Recurrent Unit layer (GRU): layer = nn.GRU(input_size, hidden_size,
num_layers)
● Embedding layer: layer = nn.Embedding(num_embeddings, embedding_dim)
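
A minimal sketch combining several of these layers into a module (the channel counts, dropout rate, and 32x32 input size are illustrative assumptions):

```python
import torch
import torch.nn as nn

class SmallCNN(nn.Module):
    def __init__(self, num_classes=10):
        super().__init__()
        self.conv = nn.Conv2d(3, 16, kernel_size=3, stride=1, padding=1)
        self.bn = nn.BatchNorm2d(16)
        self.pool = nn.MaxPool2d(kernel_size=2)
        self.drop = nn.Dropout(p=0.25)
        self.fc = nn.Linear(16 * 16 * 16, num_classes)  # assumes 32x32 inputs

    def forward(self, x):
        x = self.pool(torch.relu(self.bn(self.conv(x))))
        x = self.drop(x.flatten(start_dim=1))
        return self.fc(x)

model = SmallCNN()
out = model(torch.randn(8, 3, 32, 32))   # logits of shape (8, 10)
```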

Loss Functions

● Mean Squared Error (MSE) loss: loss_fn = nn.MSELoss()
● Cross-Entropy loss: loss_fn = nn.CrossEntropyLoss()
● Binary Cross-Entropy loss: loss_fn = nn.BCELoss()
● Negative Log-Likelihood loss: loss_fn = nn.NLLLoss()
● Kullback-Leibler Divergence loss: loss_fn = nn.KLDivLoss()
● Margin Ranking loss: loss_fn = nn.MarginRankingLoss()
● Triplet Margin loss: loss_fn = nn.TripletMarginLoss()
● Cosine Embedding loss: loss_fn = nn.CosineEmbeddingLoss()
● Hinge Embedding loss: loss_fn = nn.HingeEmbeddingLoss()
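
For instance, nn.CrossEntropyLoss expects raw logits and integer class indices (it applies log-softmax internally); the numbers below are arbitrary:

```python
import torch
import torch.nn as nn

loss_fn = nn.CrossEntropyLoss()
logits = torch.randn(4, 3)             # raw scores for 4 samples, 3 classes
targets = torch.tensor([0, 2, 1, 2])   # class indices, not one-hot vectors
loss = loss_fn(logits, targets)
```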

Optimization Algorithms

● Stochastic Gradient Descent (SGD): optimizer =
torch.optim.SGD(model.parameters(), lr)
● Adam: optimizer = torch.optim.Adam(model.parameters(), lr)
● RMSprop: optimizer = torch.optim.RMSprop(model.parameters(), lr)
● Adagrad: optimizer = torch.optim.Adagrad(model.parameters(), lr)
● Adadelta: optimizer = torch.optim.Adadelta(model.parameters(), lr)
● Adamax: optimizer = torch.optim.Adamax(model.parameters(), lr)
● Sparse Adam: optimizer = torch.optim.SparseAdam(model.parameters(), lr)
● LBFGS: optimizer = torch.optim.LBFGS(model.parameters(), lr)
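
A single optimization step on a toy regression problem (the momentum value is an illustrative extra, not required):

```python
import torch
import torch.nn as nn

model = nn.Linear(10, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9)

x, y = torch.randn(32, 10), torch.randn(32, 1)
loss = nn.MSELoss()(model(x), y)
optimizer.zero_grad()   # clear old gradients
loss.backward()         # compute new gradients
optimizer.step()        # update the parameters
```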

Learning Rate Schedulers

● Step LR: scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size, gamma)
● Multi-Step LR: scheduler =
torch.optim.lr_scheduler.MultiStepLR(optimizer, milestones, gamma)
● Exponential LR: scheduler =
torch.optim.lr_scheduler.ExponentialLR(optimizer, gamma)
● Cosine Annealing LR: scheduler =
torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max)
● Reduce LR on Plateau: scheduler =
torch.optim.lr_scheduler.ReduceLROnPlateau(optimizer, mode, factor,
patience)
● Cyclic LR: scheduler = torch.optim.lr_scheduler.CyclicLR(optimizer,
base_lr, max_lr, step_size_up)
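
A sketch of stepping a scheduler once per epoch (StepLR with step_size=10 and gamma=0.1, chosen only for illustration):

```python
import torch
import torch.nn as nn

model = nn.Linear(10, 2)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=10, gamma=0.1)

for epoch in range(30):
    # ... one epoch of training goes here ...
    optimizer.step()     # optimizer.step() should precede scheduler.step()
    scheduler.step()     # lr: 0.1 -> 0.01 after 10 epochs -> 0.001 after 20

# note: ReduceLROnPlateau is stepped with a metric instead: scheduler.step(val_loss)
```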

Model Training and Evaluation

● Move model to device: model = model.to(device)
● Set model to training mode: model.train()
● Set model to evaluation mode: model.eval()
● Forward pass: outputs = model(inputs)
● Compute loss: loss = loss_fn(outputs, targets)
● Backward pass: loss.backward()
● Update model parameters: optimizer.step()
● Zero gradients: optimizer.zero_grad()
● Get model parameters: parameters = model.parameters()
● Get model state dictionary: state_dict = model.state_dict()
● Load model state dictionary: model.load_state_dict(state_dict)
● Save model checkpoint: torch.save(model.state_dict(), 'checkpoint.pth')
● Load model checkpoint: model.load_state_dict(torch.load('checkpoint.pth'))
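
Putting the steps above together, a minimal train/eval loop on toy data (sizes and hyperparameters are arbitrary):

```python
import torch
import torch.nn as nn

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

# toy classification data purely for illustration
inputs = torch.randn(256, 20)
targets = torch.randint(0, 3, (256,))
dataset = torch.utils.data.TensorDataset(inputs, targets)
loader = torch.utils.data.DataLoader(dataset, batch_size=32, shuffle=True)

model = nn.Linear(20, 3).to(device)
loss_fn = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

for epoch in range(5):
    model.train()
    for x, y in loader:
        x, y = x.to(device), y.to(device)
        optimizer.zero_grad()
        loss = loss_fn(model(x), y)
        loss.backward()
        optimizer.step()

    model.eval()
    with torch.no_grad():
        preds = model(inputs.to(device)).argmax(dim=1)
        accuracy = (preds == targets.to(device)).float().mean()

torch.save(model.state_dict(), 'checkpoint.pth')
model.load_state_dict(torch.load('checkpoint.pth'))
```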

Data Loading and Processing

● Create a dataset: dataset = torch.utils.data.TensorDataset(inputs, targets)
● Create a data loader: data_loader = torch.utils.data.DataLoader(dataset,
batch_size, shuffle)
● Iterate over data loader: for batch in data_loader: inputs, targets =
batch
● Normalize data: data = (data - data.mean()) / data.std()
● Resize images: images = torch.nn.functional.interpolate(images, size)
● Random crop images: images =
torchvision.transforms.RandomCrop(size)(images)
● Random horizontal flip images: images =
torchvision.transforms.RandomHorizontalFlip()(images)
● Convert images to tensors: images =
torchvision.transforms.ToTensor()(images)
● Normalize images: images = torchvision.transforms.Normalize(mean,
std)(images)
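
A typical image pipeline as a sketch; torchvision.datasets.FakeData stands in for a real dataset, and the crop size and normalization constants are arbitrary:

```python
import torch
import torchvision
from torchvision import transforms

transform = transforms.Compose([
    transforms.RandomCrop(28),
    transforms.RandomHorizontalFlip(),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.5, 0.5, 0.5], std=[0.5, 0.5, 0.5]),
])
dataset = torchvision.datasets.FakeData(size=100, image_size=(3, 32, 32),
                                        transform=transform)
loader = torch.utils.data.DataLoader(dataset, batch_size=16, shuffle=True)

for images, labels in loader:
    print(images.shape)   # torch.Size([16, 3, 28, 28])
    break
```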

Pretrained Models

● Load a pretrained model: model = torchvision.models.resnet18(pretrained=True)
● Freeze model weights: for param in model.parameters():
param.requires_grad = False
● Replace the last layer of a pretrained model: model.fc = nn.Linear(512,
num_classes)
● Extract features from a pretrained model: features = model(inputs)
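
A minimal sketch of this pattern (the 5-class head is an arbitrary example; pretrained=True downloads the ImageNet weights on first use):

```python
import torch
import torch.nn as nn
import torchvision

model = torchvision.models.resnet18(pretrained=True)
for param in model.parameters():
    param.requires_grad = False                    # freeze the backbone
model.fc = nn.Linear(model.fc.in_features, 5)      # new trainable head

logits = model(torch.randn(1, 3, 224, 224))        # shape (1, 5)
```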

Model Evaluation Metrics

● Accuracy: accuracy = (predicted == targets).float().mean()
● Precision: precision = torch.sum(predicted * targets) / torch.sum(predicted)
● Recall: recall = torch.sum(predicted * targets) / torch.sum(targets)
● F1 score: f1_score = 2 * (precision * recall) / (precision + recall)
● Mean Absolute Error (MAE): mae = torch.abs(predicted - targets).mean()
● Mean Squared Error (MSE): mse = torch.square(predicted - targets).mean()
● Root Mean Squared Error (RMSE): rmse = torch.sqrt(mse)
● Intersection over Union (IoU): iou = torch.sum(predicted * targets) /
torch.sum((predicted + targets) > 0)
● Area Under the ROC Curve (AUC): auc =
torchmetrics.functional.auroc(predicted, targets)
● Average Precision (AP): ap =
torchmetrics.functional.average_precision(predicted, targets)
● Confusion Matrix: cm =
torchmetrics.functional.confusion_matrix(predicted, targets)
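
A worked example of the binary metrics above on hand-picked 0/1 predictions:

```python
import torch

predicted = torch.tensor([1, 0, 1, 1, 0, 1]).float()
targets   = torch.tensor([1, 0, 0, 1, 0, 1]).float()

accuracy  = (predicted == targets).float().mean()                   # 0.833
precision = torch.sum(predicted * targets) / torch.sum(predicted)   # 0.75
recall    = torch.sum(predicted * targets) / torch.sum(targets)     # 1.0
f1_score  = 2 * (precision * recall) / (precision + recall)         # 0.857
```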

Model Visualization

● Visualize the model architecture: print(model)
● Visualize the model graph: torchviz.make_dot(outputs, params=dict(model.named_parameters())).render('model_graph', format='png')
● Visualize the model summary: torchsummary.summary(model, input_size)

Transfer Learning

● Freeze the weights of the feature extractor: for param in model.features.parameters(): param.requires_grad = False
● Fine-tune the last layer: for param in model.classifier.parameters():
param.requires_grad = True
● Load a pretrained model and replace the last layer: model =
torchvision.models.resnet18(pretrained=True); model.fc = nn.Linear(512,
num_classes)
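
A sketch of this workflow on a VGG-style model (the 10-class head and learning rate are illustrative; only the parameters that still require gradients are passed to the optimizer):

```python
import torch
import torch.nn as nn
import torchvision

model = torchvision.models.vgg16(pretrained=True)
for param in model.features.parameters():
    param.requires_grad = False                    # freeze the feature extractor

in_features = model.classifier[-1].in_features
model.classifier[-1] = nn.Linear(in_features, 10)  # new head, trainable by default

optimizer = torch.optim.Adam(
    (p for p in model.parameters() if p.requires_grad), lr=1e-4)
```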

Adversarial Attacks

● Fast Gradient Sign Method (FGSM) attack: perturbed_inputs = inputs + epsilon * torch.sign(inputs.grad)
● Projected Gradient Descent (PGD) attack: for _ in range(num_steps):
perturbed_inputs = torch.clamp(perturbed_inputs + alpha *
torch.sign(perturbed_inputs.grad), min=inputs-epsilon,
max=inputs+epsilon)
● Carlini & Wagner (C&W) attack: adversarial_inputs = torch.clamp(inputs + perturbations, min=0, max=1), where perturbations are found by optimizing the C&W objective
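
An FGSM sketch end to end, using a stand-in linear model and an arbitrary epsilon:

```python
import torch
import torch.nn as nn

model = nn.Linear(10, 2)            # stand-in classifier
loss_fn = nn.CrossEntropyLoss()
epsilon = 0.1

inputs = torch.randn(4, 10, requires_grad=True)
targets = torch.tensor([0, 1, 0, 1])

loss = loss_fn(model(inputs), targets)
loss.backward()                     # populates inputs.grad

# step in the direction that increases the loss
perturbed_inputs = inputs + epsilon * torch.sign(inputs.grad)
```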

Model Pruning

● Prune a layer's weights at random (in place): torch.nn.utils.prune.random_unstructured(module, name='weight', amount=pruning_ratio)
● Prune a layer's biases by L1 magnitude: torch.nn.utils.prune.l1_unstructured(module, name='bias', amount=pruning_ratio)
● Structured pruning of whole channels: torch.nn.utils.prune.ln_structured(module, name='weight', amount=pruning_ratio, n=2, dim=0)
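
A sketch of pruning a single layer (the amounts are arbitrary; pruning adds a weight_mask buffer and can be made permanent with prune.remove):

```python
import torch
import torch.nn as nn
import torch.nn.utils.prune as prune

module = nn.Linear(10, 10)
prune.random_unstructured(module, name='weight', amount=0.3)  # zero 30% of the weights
prune.l1_unstructured(module, name='bias', amount=0.5)        # prune half the biases by magnitude

print(module.weight_mask.sum())   # number of weights kept
prune.remove(module, 'weight')    # bake the mask into the weight tensor
```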

Model Quantization

● Quantize model weights: quantized_model = torch.quantization.quantize_dynamic(model, {torch.nn.Linear}, dtype=torch.qint8)
● Quantize activations (static quantization): model.qconfig = torch.quantization.get_default_qconfig('fbgemm'); model = torch.quantization.prepare(model)
● Convert a prepared and calibrated model to its quantized version: quantized_model = torch.quantization.convert(model)
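
A dynamic quantization sketch on a small CPU model (layer sizes are arbitrary):

```python
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(20, 50), nn.ReLU(), nn.Linear(50, 2))
quantized_model = torch.quantization.quantize_dynamic(
    model, {nn.Linear}, dtype=torch.qint8)   # int8 weights for the Linear layers

out = quantized_model(torch.randn(1, 20))
```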

Distributed Training

● Initialize distributed training: torch.distributed.init_process_group(backend='nccl', init_method='tcp://localhost:23456', rank=args.rank, world_size=args.world_size)
● Wrap model with DistributedDataParallel: model =
torch.nn.parallel.DistributedDataParallel(model,
device_ids=[args.local_rank])
● Synchronize processes across devices: torch.distributed.barrier()
● Reduce gradients across devices: torch.distributed.all_reduce(tensor,
op=torch.distributed.ReduceOp.SUM)
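
A DDP setup sketch, assuming the script is launched with torchrun (which sets LOCAL_RANK and the rendezvous environment variables):

```python
import os
import torch
import torch.distributed as dist
import torch.nn as nn

# e.g. torchrun --nproc_per_node=2 train.py
dist.init_process_group(backend='nccl')
local_rank = int(os.environ['LOCAL_RANK'])
torch.cuda.set_device(local_rank)

model = nn.Linear(10, 2).cuda(local_rank)
model = nn.parallel.DistributedDataParallel(model, device_ids=[local_rank])

# ... training loop: DDP averages gradients across processes during backward() ...

dist.destroy_process_group()
```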

Model Interpretability

● Compute gradients w.r.t. inputs: gradients = torch.autograd.grad(outputs, inputs, grad_outputs=torch.ones_like(outputs))
● Compute saliency maps: saliency_maps = torch.abs(gradients).max(dim=1,
keepdim=True)[0]
● Compute guided backpropagation (simplified): guided_backprop = torch.clamp(gradients, min=0)
● Compute class activation maps (CAM): cam = torch.einsum('bchw,kc->bkhw', features, fc_weights), weighting the feature maps by the final linear layer's weights
● Compute Grad-CAM: weights = gradients.mean(dim=(2, 3), keepdim=True); grad_cam = torch.relu((weights * features).sum(dim=1))
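
A saliency-map sketch on a stand-in convolutional model (the architecture and input size are arbitrary):

```python
import torch
import torch.nn as nn

model = nn.Sequential(nn.Conv2d(3, 8, 3, padding=1), nn.ReLU(),
                      nn.Flatten(), nn.Linear(8 * 32 * 32, 10))
inputs = torch.randn(1, 3, 32, 32, requires_grad=True)

outputs = model(inputs)
score = outputs[0, outputs.argmax()]                         # score of the predicted class
gradients = torch.autograd.grad(score, inputs)[0]
saliency_map = gradients.abs().max(dim=1, keepdim=True)[0]   # shape (1, 1, 32, 32)
```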

Model Debugging

● Print model gradients: for name, param in model.named_parameters(): if param.requires_grad: print(name, param.grad)
● Print model activations: for name, module in model.named_modules(): if isinstance(module, nn.ReLU): module.register_forward_hook(lambda module, input, output, name=name: print(name, output))
● Print model parameters: for name, param in model.named_parameters():
print(name, param)
● Print model buffers: for name, buffer in model.named_buffers():
print(name, buffer)
● Set anomaly detection for debugging:
torch.autograd.set_detect_anomaly(True)
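
A short debugging sketch combining anomaly detection, a forward hook, and gradient inspection (the model is a stand-in; the name=name default argument pins the layer name inside the lambda):

```python
import torch
import torch.nn as nn

torch.autograd.set_detect_anomaly(True)   # report the op that produced a NaN/Inf gradient

model = nn.Sequential(nn.Linear(10, 10), nn.ReLU(), nn.Linear(10, 1))

# print the output shape of every ReLU during the forward pass
for name, module in model.named_modules():
    if isinstance(module, nn.ReLU):
        module.register_forward_hook(
            lambda module, inp, out, name=name: print(name, out.shape))

loss = model(torch.randn(4, 10)).sum()
loss.backward()

for name, param in model.named_parameters():
    print(name, param.grad.norm())
```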

By: Waleed Mousa
