Video 9 - PyTorch Datasets and Dataloaders
Antonio Rueda-Toicen
Learning goals
○ Create Dataset objects in PyTorch to wrap images and target values together
○ Implement DataLoader PyTorch objects to feed data to a model
○ Understand the connection between DataLoader and Stochastic Gradient Descent
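The first goal above can be sketched as a minimal custom Dataset subclass. This is a hedged illustration using random tensors in place of real images; the class name `ImageDataset` and the data shapes are assumptions, not part of the original slides.

```python
import torch
from torch.utils.data import Dataset

class ImageDataset(Dataset):
    """Wraps image tensors and target values together (hypothetical example)."""
    def __init__(self, images, labels):
        self.images = images  # e.g. a tensor of shape (N, C, H, W)
        self.labels = labels  # a tensor of shape (N,)

    def __len__(self):
        # The DataLoader uses this to know how many samples exist
        return len(self.images)

    def __getitem__(self, idx):
        # Returns one (image, label) pair; the DataLoader batches these
        return self.images[idx], self.labels[idx]

# Random data standing in for 100 grayscale 28x28 images
images = torch.randn(100, 1, 28, 28)
labels = torch.randint(0, 10, (100,))
dataset = ImageDataset(images, labels)
print(len(dataset))         # 100
print(dataset[0][0].shape)  # torch.Size([1, 28, 28])
```

Any object implementing `__len__` and `__getitem__` in this way can be passed straight to a `DataLoader`.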
The batch dimension is important for our models
print(torch_tensor_gray.shape)
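A single image usually lacks the leading batch dimension that models expect. A minimal sketch, assuming `torch_tensor_gray` is a single-channel 28x28 image (the shape here is illustrative):

```python
import torch

# A single grayscale image: (channels, height, width)
torch_tensor_gray = torch.randn(1, 28, 28)
print(torch_tensor_gray.shape)  # torch.Size([1, 28, 28])

# Models expect a leading batch dimension: (batch, channels, height, width)
batched = torch_tensor_gray.unsqueeze(0)
print(batched.shape)  # torch.Size([1, 1, 28, 28])
```

A DataLoader adds this batch dimension automatically when it stacks samples into batches.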
# Training loop
for images, labels in train_loader:
    # Remember that data has to be explicitly sent to the GPU
    images, labels = images.to(device), labels.to(device)
    output = model(images)
    batch_loss = loss_function(output, labels)
    # Backpropagate and update the weights
    optimizer.zero_grad()
    batch_loss.backward()
    optimizer.step()
Shuffling the training set only
https://www.kaggle.com/competitions/digit-recognizer
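The idea above can be shown directly with two DataLoaders: `shuffle=True` for training, `shuffle=False` for validation. The datasets here are random stand-ins (not the actual Kaggle digits), and the sizes and batch size are assumptions for illustration.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Stand-in datasets: random tensors in place of real digit images
train_set = TensorDataset(torch.randn(60, 1, 28, 28), torch.randint(0, 10, (60,)))
val_set = TensorDataset(torch.randn(20, 1, 28, 28), torch.randint(0, 10, (20,)))

# Shuffle only the training data; keep validation order fixed
train_loader = DataLoader(train_set, batch_size=16, shuffle=True)
val_loader = DataLoader(val_set, batch_size=16, shuffle=False)

# Each iteration yields one batch with a leading batch dimension
images, labels = next(iter(train_loader))
print(images.shape)  # torch.Size([16, 1, 28, 28])
```

Reshuffling the training set every epoch is what makes each gradient step a random sample of the data, i.e. the stochastic part of SGD.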
Summary
● DataLoaders shuffle the training data and maintain sequential order for the validation and
test sets
● Their random sampling and shuffling of training samples provide the ‘stochastic’ part of
Stochastic Gradient Descent
References
● https://pytorch.org/tutorials/beginner/basics/data_tutorial.html
MNIST on Fiftyone
● https://try.fiftyone.ai/datasets/mnist/samples