Video 9 - PyTorch Datasets and Dataloaders

The document explains how to create Dataset and DataLoader objects in PyTorch to manage images and target values for model training. It emphasizes the importance of batch dimensions and shuffling training data to improve training efficiency. Additionally, it highlights the connection between DataLoaders and Stochastic Gradient Descent, noting that DataLoaders facilitate the random sampling necessary for this optimization technique.

Uploaded by

shubham jha

PyTorch Datasets and Dataloaders

Antonio Rueda-Toicen
Learning goals

○ Create Dataset objects in PyTorch to wrap images and target values together
○ Implement DataLoader PyTorch objects to feed data to a model
○ Understand the connection between DataLoader and Stochastic Gradient Descent
The batch dimension is important for our models

torch image tensors follow the format: [N, C, H, W]

where:

● N = batch size (number of images)

● C = channels (e.g., 1 for grayscale, 3 for RGB)
● H = height in pixels
● W = width in pixels

print(torch_tensor_gray.shape)

# torch.Size([1, 1, 28, 28])
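
A minimal sketch of how the batch and channel dimensions come about, assuming a single 28x28 grayscale image stored as a plain 2D tensor. The variable names (`image`, `image_chw`, `image_nchw`, `batch`) are illustrative and do not come from the slides.

```python
import torch

# Hypothetical 28x28 grayscale image with no channel or batch dimension yet
image = torch.rand(28, 28)

# Add the channel dimension C first, then the batch dimension N
image_chw = image.unsqueeze(0)        # [C, H, W]    = [1, 28, 28]
image_nchw = image_chw.unsqueeze(0)   # [N, C, H, W] = [1, 1, 28, 28]
print(image_nchw.shape)  # torch.Size([1, 1, 28, 28])

# Stacking several [C, H, W] images produces a batch with N > 1
batch = torch.stack([image_chw, image_chw, image_chw])
print(batch.shape)  # torch.Size([3, 1, 28, 28])
```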

Feeding torch tensors to a PyTorch model

Tip: use the nvidia-smi bash command to check the memory available on your GPU.

from torch.utils.data import DataLoader

# Define the train loader with batch size and shuffling
batch_size = 32
train_loader = DataLoader(dataset=train_dataset,
                          batch_size=batch_size,
                          shuffle=True)

# Move model to device (GPU/CPU)
model = model.to(device)

# Training loop
for images, labels in train_loader:
    # Remember that data has to be explicitly sent to the GPU
    images, labels = images.to(device), labels.to(device)
    output = model(images)
    batch_loss = loss_function(output, labels)
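
The loop above stops at the loss computation. As a hedged sketch, one complete pass over a loader could look like the following. The synthetic data, the tiny linear model, the SGD optimizer, and the learning rate are all assumptions made for illustration; they stand in for the `train_dataset`, `model`, and `loss_function` that the slides leave undefined.

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

torch.manual_seed(0)

# Synthetic stand-in for MNIST: 256 random "images" with random labels 0-9
images = torch.rand(256, 1, 28, 28)
labels = torch.randint(0, 10, (256,))
train_dataset = TensorDataset(images, labels)
train_loader = DataLoader(train_dataset, batch_size=32, shuffle=True)

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Hypothetical tiny classifier, used only to make the loop runnable
model = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 10)).to(device)
loss_function = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

num_batches = 0
for batch_images, batch_labels in train_loader:
    # Data has to be sent explicitly to the same device as the model
    batch_images = batch_images.to(device)
    batch_labels = batch_labels.to(device)

    optimizer.zero_grad()                       # clear gradients from the previous batch
    output = model(batch_images)                # shape [N, 10]
    batch_loss = loss_function(output, batch_labels)
    batch_loss.backward()                       # gradients for this batch only
    optimizer.step()                            # one parameter update per batch
    num_batches += 1

print(num_batches)  # 256 samples / 32 per batch = 8 batches
```

Each iteration consumes one batch and performs one parameter update, so the number of updates per epoch equals the number of batches the loader yields.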
Shuffling the training set only

# Create data loaders

batch_size = 32
# Notice that we shuffle the training loader, but not the validation or test loaders.
# This practice of shuffling the training set is one of the techniques
# that have been shown to improve training.
train_loader = DataLoader(train_subset, batch_size=batch_size, shuffle=True)
valid_loader = DataLoader(valid_subset, batch_size=batch_size, shuffle=False)
test_loader = DataLoader(test_set, batch_size=batch_size, shuffle=False)
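
The slides do not show how `train_subset` and `valid_subset` were created. The sketch below assumes they come from `torch.utils.data.random_split` and uses a toy dataset whose labels equal the sample indices, so the order in which a loader yields samples is easy to inspect. The helper `epoch_order` is hypothetical.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset, random_split

# Toy dataset: sample i has feature i and label i
features = torch.arange(10).float().unsqueeze(1)
targets = torch.arange(10)
full_set = TensorDataset(features, targets)

# Assumed split: 8 training samples, 2 validation samples
generator = torch.Generator().manual_seed(42)
train_subset, valid_subset = random_split(full_set, [8, 2], generator=generator)

train_loader = DataLoader(train_subset, batch_size=4, shuffle=True)
valid_loader = DataLoader(valid_subset, batch_size=4, shuffle=False)


def epoch_order(loader):
    """Return the labels in the order the loader yields them for one epoch."""
    return [int(t) for _, batch_targets in loader for t in batch_targets]


valid_first = epoch_order(valid_loader)
valid_second = epoch_order(valid_loader)
train_first = epoch_order(train_loader)
train_second = epoch_order(train_loader)

print(valid_first == valid_second)                  # unshuffled loaders repeat the same order
print(sorted(train_first) == sorted(train_second))  # same samples, possibly in a different order
```

Keeping the validation and test loaders unshuffled makes evaluation runs reproducible and lets predictions be matched back to samples by position.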
Dataloader depends on Dataset

from torch.utils.data import Dataset

# Define the MNIST Dataset class as a subclass of Dataset
class MNISTDataset(Dataset):
    # Content of the dataset
    # gets called on dataset_instance = MNISTDataset(<params>)
    def __init__(self, dataframe, labels=True, transform=None):
        …

    def __len__(self):
        # What gets called on len(dataset_instance)
        return len(self.data)

    def __getitem__(self, idx):
        # What gets called on dataset_instance[index]
        …
        return image_tensor, label_tensor
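
One possible way to fill in the elided method bodies is sketched below. It is not the author's implementation. It assumes the dataframe follows the layout of the Kaggle digit-recognizer CSV (a `label` column followed by 784 pixel columns with values 0-255), scales pixels to [0, 1], and returns a placeholder label of -1 when `labels=False`. The synthetic dataframe exists only to make the example runnable.

```python
import numpy as np
import pandas as pd
import torch
from torch.utils.data import DataLoader, Dataset


class MNISTDataset(Dataset):
    def __init__(self, dataframe, labels=True, transform=None):
        self.data = dataframe.reset_index(drop=True)
        self.labels = labels          # False for an unlabeled test set
        self.transform = transform    # optional callable applied to each image

    def __len__(self):
        return len(self.data)

    def __getitem__(self, idx):
        row = self.data.iloc[idx]
        if self.labels:
            label_tensor = torch.tensor(int(row["label"]), dtype=torch.long)
            pixels = row.drop("label")
        else:
            # Assumption: -1 marks "no label available"
            label_tensor = torch.tensor(-1, dtype=torch.long)
            pixels = row
        # Reshape the 784 flat pixels to [C, H, W] and scale to [0, 1]
        image = pixels.to_numpy(dtype=np.float32).reshape(1, 28, 28) / 255.0
        image_tensor = torch.from_numpy(image)
        if self.transform is not None:
            image_tensor = self.transform(image_tensor)
        return image_tensor, label_tensor


# Synthetic dataframe standing in for the real CSV
rng = np.random.default_rng(0)
pixel_columns = [f"pixel{i}" for i in range(784)]
frame = pd.DataFrame(rng.integers(0, 256, size=(6, 784)), columns=pixel_columns)
frame.insert(0, "label", rng.integers(0, 10, size=6))

dataset = MNISTDataset(frame)
image_tensor, label_tensor = dataset[0]
loader = DataLoader(dataset, batch_size=4, shuffle=False)
batch_images, batch_labels = next(iter(loader))
print(len(dataset), image_tensor.shape, batch_images.shape)
```

The DataLoader only relies on `__len__` and `__getitem__`: it picks indices, calls `dataset[index]` for each one, and stacks the results, which is where the batch dimension N comes from.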
Practice creating a Dataset

https://www.kaggle.com/competitions/digit-recognizer
Summary

Dataset objects wrap images and labels together

● Provide a standard way to access data through the __getitem__ and __len__ methods

Dataloaders split datasets into batches

● With shuffle=True, a Dataloader reshuffles the training data every epoch; with shuffle=False, validation and test sets keep their sequential order

Dataloaders are fundamental to implementing Stochastic Gradient Descent

● Their random sampling and shuffling of training samples provide the ‘stochastic’ part of Stochastic Gradient Descent
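
To make the link to Stochastic Gradient Descent concrete, here is a hedged sketch of mini-batch SGD written by hand on a synthetic one-parameter regression problem. The data (y = 3x plus noise), the learning rate, and the number of epochs are assumptions chosen for illustration. Each update uses the gradient of a single randomly composed batch, which is a noisy estimate of the gradient over the full dataset.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

torch.manual_seed(0)

# Synthetic regression data: y = 3x + small Gaussian noise
x = torch.linspace(-1, 1, 200).unsqueeze(1)
y = 3.0 * x + 0.1 * torch.randn_like(x)

# shuffle=True draws a new random ordering of the samples every epoch
loader = DataLoader(TensorDataset(x, y), batch_size=20, shuffle=True)

w = torch.zeros(1, requires_grad=True)  # single weight to learn
learning_rate = 0.1

for epoch in range(20):
    for batch_x, batch_y in loader:
        # Mean squared error on this batch only
        loss = ((batch_x * w - batch_y) ** 2).mean()
        loss.backward()
        with torch.no_grad():
            w -= learning_rate * w.grad  # plain SGD update
        w.grad.zero_()

print(w.item())  # should end up close to the true slope of 3
```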
References

Datasets and Dataloaders

● https://pytorch.org/tutorials/beginner/basics/data_tutorial.html

MNIST on PyTorch’s datasets

● https://pytorch.org/vision/0.20/generated/torchvision.datasets.MNIST.html

MNIST on FiftyOne

● https://try.fiftyone.ai/datasets/mnist/samples
