
CISC 351, CISC 372, CMPE 351

School of Computing, Queen's University, Canada

Advanced Data Analytics


Tutorial for Deep Learning Pipeline

y.tian [at] queensu.ca


Recap: Traditional DS/ML Pipeline

01 Define the Goal
02 Collect Data and Clean Data
03 Exploratory Data Analysis
04 Feature Engineering
05 Modeling (Model Selection, Hyper-parameter Optimization, Validation)
06 Deployment

How about the Deep Learning Pipeline?
Traditional ML practice vs. Deep Learning

What are the architectures & hyperparameters we can tune for deep learning approaches?
Deep Learning Pipeline

Image source: "Deep Learning with PyTorch"


Before Tuning

1. Enough of the essential work of problem formulation, data cleaning, etc. has already been done that spending time on the model architecture and training configuration makes sense.
2. There is already a pipeline set up that does training and evaluation, and it is easy to execute training and prediction jobs for various models of interest.
3. The appropriate metrics have been selected and implemented. These should be as representative as possible of what would be measured in the deployed environment.
DL Tuning Playbook
Source: https://github.com/google-research/tuning_playbook
Tuning Steps

1. Choosing the model architecture
2. Choosing the optimizer
3. Choosing the batch size
4. Choosing the initial configuration
5. A scientific approach to improving model performance (next lecture)

Note: we will not discuss how to design new neural architectures.
Choosing the Model Architecture

1. Image classification -> CNN, Transformer
2. Text classification, time series forecasting -> RNN, Transformer
3. Machine translation -> Seq2Seq
4. Question answering, conversation -> Seq2Seq, generative models, GPT
5. Recommending content/friends of interest -> Graph Neural Networks
6. Generating images/music -> Generative Adversarial Network, Diffusion Model
7. Code generation -> OpenAI Codex for GitHub Copilot

Select a model based on the nature of the data and on prior work in the literature, e.g., https://paperswithcode.com/ and conferences in different research domains.
Choosing the Optimizer
Start with the most popular optimizer for the type of problem at hand: Adam, NAdam, or SGD with Nesterov momentum.

How to fairly compare different optimizers?
• Compare both performance and training speed.
• Tune the best hyper-parameters for each optimizer!
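As a minimal illustration (a sketch, not part of the tutorial code; the model here is just a placeholder), the optimizers above can be created in PyTorch as follows, each with its own hyper-parameters that should be tuned before comparing:

import torch

model = torch.nn.Linear(10, 2)  # placeholder model; any nn.Module works

# Adam / NAdam / SGD with Nesterov momentum, each with its own tunable hyper-parameters.
adam  = torch.optim.Adam(model.parameters(), lr=3e-4, weight_decay=1e-5)
nadam = torch.optim.NAdam(model.parameters(), lr=2e-3)
sgd   = torch.optim.SGD(model.parameters(), lr=1e-2, momentum=0.9, nesterov=True)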
Choosing the Batch Size

Why is batch size important? Training time = (time per step) x (total number of steps)

Practice: often, the ideal batch size will be the largest batch size supported by the available hardware.

How? Run training jobs at different batch sizes (increasing in powers of 2), starting from a small number, until one exceeds the available memory.
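A minimal sketch of this scan (assuming a CUDA GPU and a hypothetical run_one_step helper that performs one forward/backward pass at a given batch size; not part of the tutorial code):

import torch

def largest_feasible_batch_size(run_one_step, start=16, limit=4096):
    # Double the batch size until a training step runs out of GPU memory.
    batch_size, largest_ok = start, None
    while batch_size <= limit:
        try:
            run_one_step(batch_size)   # hypothetical helper: one forward/backward pass
            largest_ok = batch_size
            batch_size *= 2
        except RuntimeError as e:
            if "out of memory" in str(e):
                torch.cuda.empty_cache()
                break
            raise
    return largest_ok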
Choosing the Initial Configuration
https://pytorch.org/vision/stable/models.html

This includes specifying
(1) the model configuration (e.g., number of layers, what those layers are),
(2) the optimizer hyperparameters (e.g., learning rate),
(3) and the number of training steps.

Practice: find a simple, relatively fast, relatively low-resource-consumption configuration that obtains a "reasonable" result.
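As an example of such a starting point (a sketch; the hyper-parameter values are illustrative and torchvision >= 0.13 is assumed for the weights API), a small pretrained torchvision model like ResNet-18 is a simple, relatively fast baseline for image classification:

import torch.nn as nn
from torchvision import models

# Small, fast baseline: ResNet-18 pretrained on ImageNet, with the classifier head
# replaced for the 11 food categories used later in this tutorial.
model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
model.fc = nn.Linear(model.fc.in_features, 11)

initial_config = {"lr": 3e-4, "batch_size": 64, "n_epochs": 4}  # illustrative values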
PyTorch Hub: https://pytorch.org/hub/
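As a quick illustration (a sketch; the entrypoint string follows the examples on the PyTorch Hub page and may change between releases), pretrained models can be pulled directly from PyTorch Hub:

import torch

# Download and load a pretrained ResNet-18 from PyTorch Hub.
model = torch.hub.load('pytorch/vision:v0.10.0', 'resnet18', pretrained=True)
model.eval()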
Build your first CNN model for Food Image Classification

Using Google Colab: https://colab.research.google.com/signup
Dataset source: https://www.epfl.ch/labs/mmspg/downloads/food-image-datasets/

Step 1: Exploring the dataset
This is a dataset containing 16643 food images grouped in 11 major food categories. The 11 categories are Bread, Dairy product, Dessert, Egg, Fried food, Meat, Noodles/Pasta, Rice, Seafood, Soup, and Vegetable/Fruit.
Step 1: Data Exploration
How to upload data to colab?
Option 1: GUI based, upload a local file

# upload file from local system
from google.colab import files
uploaded = files.upload()

Option 2: directly download from URL:

! wget URLtoData
Or
! wget -O food11.zip "URLtoData"

Don't forget to unzip the dataset:
! unzip food11.zip
Step 1: Data Exploration (cont.)
Pick a random image to check, and print its original shape.

import imageio
import os
import glob
from collections import Counter
import random

myseed = 12345  # set a random seed for reproducibility
random.seed(myseed)

from google.colab.patches import cv2_imshow

# let's take a look at one random image
random_pic_file = random.choice(os.listdir('./food11/training/'))
pic = imageio.imread('./food11/training/' + random_pic_file)
cv2_imshow(pic)
height, width, channels = pic.shape
print(f'original height, width, and channels of each image: {height} {width} {channels}')
Step 1: Data Exploration (cont.)
How many train/test/validation images do we have?
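A minimal sketch of such a count (assuming the splits unzip to ./food11/training, ./food11/validation, and ./food11/evaluation; the exact folder names depend on the archive):

import os, glob

for split in ['training', 'validation', 'evaluation']:
    n_images = len(glob.glob(os.path.join('./food11', split, '*.jpg')))
    print(f'{split}: {n_images} images')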
Step 2: Data Loader

How to prepare data for PyTorch pipeline?


Step 1: define pipelines of basic preprocessing functions

from torchvision import transforms

# All we need here is to resize the PIL image and transform it into a Tensor.
train_tfm = transforms.Compose([
    # Resize the image into a fixed shape (height = width = 128)
    transforms.Resize((128, 128)),
    # You may add some transforms here.
    transforms.ToTensor(),
])

test_tfm = transforms.Compose([
    transforms.Resize((128, 128)),
    transforms.ToTensor(),
])

If you want to scale images or crop the image (around the center) and then transform it to a tensor, you should also do it here.
Step 2: Data Loader

How to prepare data for PyTorch pipeline?


Step 2: create a dataset class for image loading (see the sketch below)

Image source: "Deep Learning with PyTorch"
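A minimal sketch of such a dataset class (assuming, as in the Food-11 archive used here, that each training/validation file name starts with its class label, e.g. "3_127.jpg"; this is an illustration, not the exact tutorial code):

import os
from PIL import Image
from torch.utils.data import Dataset

class FoodDataset(Dataset):
    # Loads Food-11 images from a folder; the label is parsed from the file name.
    def __init__(self, path, tfm):
        self.files = sorted(os.path.join(path, f) for f in os.listdir(path) if f.endswith('.jpg'))
        self.transform = tfm

    def __len__(self):
        return len(self.files)

    def __getitem__(self, idx):
        fname = self.files[idx]
        im = self.transform(Image.open(fname))
        try:
            label = int(os.path.basename(fname).split('_')[0])  # e.g. "3_127.jpg" -> class 3
        except ValueError:
            label = -1  # test images may carry no label in the file name
        return im, label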
Step 3: Define a Neural Network Model
Let's first look at a simple case:
Suppose we have a 32 * 32 * 3 channel image and we want to fit it into a fully connected model with a hidden layer containing 512 neurons.
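A sketch of that fully connected model (the number of output classes, 11, is an assumption to match the food dataset; the slide itself shows the model as a figure):

import torch.nn as nn

# Flatten the 32 x 32 x 3 image into a 3072-dim vector, then apply two linear layers.
fc_model = nn.Sequential(
    nn.Flatten(),
    nn.Linear(32 * 32 * 3, 512),  # 3072 inputs -> 512 hidden neurons
    nn.ReLU(),
    nn.Linear(512, 11),           # 11 output classes (assumed)
)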
Step 3: Define a Neural Network Model

1. Inherit from the provided model class.
2. Fill in two functions: initialization (self.cnn, self.fc) and the forward function.
3. For the CNN model, be aware of the input/output shapes.
4. The CNN model is a combination of multiple Conv2d layers, ReLU, and MaxPool2d, then a flatten, then multiple fully connected layers.
5. BatchNorm can also be added (see the sketch below).
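A minimal sketch of what such a model (the FirstCNN used later in the tutorial) might look like for 128 x 128 x 3 inputs and 11 classes; the exact layer sizes here are illustrative, not the tutorial's actual architecture:

import torch.nn as nn

class FirstCNN(nn.Module):
    def __init__(self):
        super().__init__()
        # Convolutional feature extractor: Conv2d -> BatchNorm -> ReLU -> MaxPool, twice.
        self.cnn = nn.Sequential(
            nn.Conv2d(3, 64, kernel_size=3, stride=1, padding=1),    # 3 x 128 x 128 -> 64 x 128 x 128
            nn.BatchNorm2d(64),
            nn.ReLU(),
            nn.MaxPool2d(2),                                         # -> 64 x 64 x 64
            nn.Conv2d(64, 128, kernel_size=3, stride=1, padding=1),  # -> 128 x 64 x 64
            nn.BatchNorm2d(128),
            nn.ReLU(),
            nn.MaxPool2d(2),                                         # -> 128 x 32 x 32
        )
        # Fully connected classifier head.
        self.fc = nn.Sequential(
            nn.Flatten(),
            nn.Linear(128 * 32 * 32, 256),
            nn.ReLU(),
            nn.Linear(256, 11),
        )

    def forward(self, x):
        return self.fc(self.cnn(x))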
Determine the Input/Output Shape

Input: 128 * 128 * 3 channels
If we set the first Conv2d layer with output_channel = 64, kernel_size = 3, stride = 1, and padding = 1, what would be the output of the first Conv2d layer?
Output: 64 * 128 * 128
By default, dilation = 1.
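This follows from the Conv2d output-size formula: out = floor((in + 2*padding - dilation*(kernel_size - 1) - 1) / stride + 1) = floor((128 + 2 - 2 - 1) / 1 + 1) = 128. A quick sanity check (a sketch, not tutorial code):

import torch
import torch.nn as nn

conv = nn.Conv2d(in_channels=3, out_channels=64, kernel_size=3, stride=1, padding=1)
out = conv(torch.randn(1, 3, 128, 128))  # a dummy batch with one 128 x 128 RGB image
print(out.shape)  # torch.Size([1, 64, 128, 128])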
Input Channel and Output Channel
Animation source: link
Input: 3 * 3 * 3 Output: 3 * 3 * 2

Kernel: 1 * 1 * 3 * 2 parameters

Stride = 1, Padding = 0
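A quick check of that kernel shape (a sketch): a 1 x 1 convolution from 3 input channels to 2 output channels has a weight tensor of shape (2, 3, 1, 1), i.e. 1 * 1 * 3 * 2 = 6 weights (plus 2 bias terms):

import torch.nn as nn

conv1x1 = nn.Conv2d(in_channels=3, out_channels=2, kernel_size=1, stride=1, padding=0)
print(conv1x1.weight.shape)  # torch.Size([2, 3, 1, 1])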
Step 4: Initialize the Train/validation Process
First, set up data loader

from torch.utils.data import DataLoader

_exp_name = "sample"
batch_size = 64
_dataset_dir = "./food11"

# Construct datasets.
# The argument "tfm" tells the dataset how to transform the images.
train_set = FoodDataset(os.path.join(_dataset_dir, "training"), tfm=train_tfm)
train_loader = DataLoader(train_set, batch_size=batch_size, shuffle=True,
                          num_workers=0, pin_memory=True)

valid_set = FoodDataset(os.path.join(_dataset_dir, "validation"), tfm=test_tfm)
valid_loader = DataLoader(valid_set, batch_size=batch_size, shuffle=True,
                          num_workers=0, pin_memory=True)
Step 4: Initialize the Train/validation Process
Then set up hyper-parameters
# "cuda" only when GPUs are available.
device = "cuda" if torch.cuda.is_available() else "cpu"

# The number of training epochs and patience.


n_epochs = 4
patience = 300 # If no improvement in 'patience' epochs, early stop

# Initialize a model, and put it on the device specified.


model = FirstCNN().to(device)

# For the classification task, we use cross-entropy as the measurement of


performance.
criterion = nn.CrossEntropyLoss()

# Initialize optimizer, you may fine-tune some hyperparameters such as learning rate
on your own.
optimizer = torch.optim.Adam(model.parameters(), lr=0.0003, weight_decay=1e-5)

# Initialize trackers, these are not parameters and should


24 not be changed
stale = 0
best_acc = 0
Step 5: Train and Validation
# ---------- Training ----------
for epoch in range(n_epochs):

    …[code to set up one epoch]

    for batch in tqdm(train_loader):
        …[code for one batch of training; see the next slide]

    # Record the loss and accuracy.
    # Calculate and print train_loss and train_acc for this epoch.

    # ---------- Validation ----------
    # Apply the model to the validation set, pick the best model at a certain epoch, and save the model.

    # If no improvement for X consecutive epochs, early stop.

For the details, refer to the tutorial code.
Code for one batch training
# A batch consists of image data and corresponding labels.
imgs, labels = batch

# Forward the data. (Make sure data and model are on the same device.)
logits = model(imgs.to(device))

# Calculate the cross-entropy loss.
loss = criterion(logits, labels.to(device))

# Gradients stored in the parameters in the previous step should be cleared out first.
optimizer.zero_grad()

# Compute the gradients for parameters.
loss.backward()

# Clip the gradient norms for stable training.
grad_norm = nn.utils.clip_grad_norm_(model.parameters(), max_norm=10)

# Update the parameters with computed gradients.
optimizer.step()

# Compute the accuracy for current batch.
acc = (logits.argmax(dim=-1) == labels.to(device)).float().mean()

# Record the loss and accuracy.
train_loss.append(loss.item())
train_accs.append(acc)
Step 6: Apply the model on Test dataset

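The slide shows this step only as a figure; a minimal sketch of test-time inference (the "evaluation" folder name and variable names are assumptions):

# Build the test set with the non-augmenting transform and run the model in eval mode.
test_set = FoodDataset(os.path.join(_dataset_dir, "evaluation"), tfm=test_tfm)
test_loader = DataLoader(test_set, batch_size=batch_size, shuffle=False)

model.eval()
predictions = []
with torch.no_grad():
    for imgs, _ in test_loader:
        logits = model(imgs.to(device))
        predictions.extend(logits.argmax(dim=-1).cpu().tolist())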

How to improve?
A scientific approach to improving model performance:
• Identify an appropriately-scoped goal for the next round of experiments. Sample goals: trying a new regularizer or preprocessing choice, finding which hyper-parameter matters most, which hyper-parameters need to be re-tuned together, etc.
• Design and run a set of experiments that makes progress towards this goal.
• Learn what we can from the results.
• Consider whether to launch the new best configuration.

Warning: we will only cover a few tricks for performance improvement; there are many more to explore.
Preprocessing Choice for Image Classification: Data Augmentation

Modify the image data so that non-identical inputs are given to the model each epoch, to prevent overfitting of the model.

Image source: link

Visit torchvision.transforms for a list of choices and their corresponding effects.
Data Augmentation

train_tfm = transforms.Compose([
    # Data augmentation: add more transforms here.
    transforms.RandomResizedCrop(128),
    transforms.RandomHorizontalFlip(0.5),
    # ToTensor() should be the last one of the transforms.
    transforms.ToTensor(),
])
How to get insights from experiment results?

If the best points cluster towards the edge of a search space (in some dimension), then the search space boundaries might
need to be expanded until the best observed point is no longer close to the boundary.
How to visualize running results for one pipeline?

Why does visualization help?

Tutorial: https://pytorch.org/tutorials/recipes/recipes/tensorboard_with_pytorch.html
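Following that tutorial, a minimal TensorBoard logging sketch looks like this (the scalar values here are dummy placeholders; in the real pipeline they would be the epoch's training loss and validation accuracy):

from torch.utils.tensorboard import SummaryWriter

writer = SummaryWriter()  # logs to ./runs/ by default; view with `tensorboard --logdir=runs`
for epoch in range(4):
    writer.add_scalar("Loss/train", 1.0 / (epoch + 1), epoch)      # placeholder training loss
    writer.add_scalar("Accuracy/valid", 0.5 + 0.1 * epoch, epoch)  # placeholder validation accuracy
writer.flush()
writer.close()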
Wandb: Weights & Biases (Tune + Visualize)

https://wandb.ai/site
Wandb + PyTorch (step 1)

First, you need to sign up for wandb to get an API token (free).

# use wandb for trial tracking and hyper-parameter tuning
import wandb
wandb.login(relogin=True)

By setting a configuration dictionary (sweep_config), you can choose how you want to tune parameters, i.e., how to search for hyper-parameter values.

Parameter tuning method: grid/random/bayes (see the sketch below)
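A minimal sweep_config sketch (the metric name and parameter values are placeholders and must match what the train function logs):

sweep_config = {
    "method": "random",  # or "grid" / "bayes"
    "metric": {"name": "val_acc", "goal": "maximize"},
    "parameters": {
        "learning_rate": {"values": [1e-3, 3e-4, 1e-4]},
        "batch_size": {"values": [32, 64, 128]},
    },
}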
Wandb + PyTorch (step 2)
Put the train and validation process in one train function, and pass the configuration values in (see the sketch below).
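A sketch of that pattern (the body is abbreviated; val_acc is whatever metric the sweep is configured to optimize):

def train(config=None):
    # wandb.agent passes the sweep's hyper-parameter values in through wandb.init.
    with wandb.init(config=config):
        config = wandb.config
        # ... build the model and optimizer from config.learning_rate, config.batch_size, then train ...
        for epoch in range(4):
            val_acc = 0.0  # placeholder: compute validation accuracy for this epoch here
            wandb.log({"val_acc": val_acc, "epoch": epoch})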
Wandb + PyTorch (step 3)
Start experiment:

sweep_id = wandb.sweep(sweep_config, project=config.project_name)


wandb.agent(sweep_id, train, count=50)

36
