Lab (localization and detection)

Deep learning

Uploaded by

enssifan

11/27/24, 8:13 AM Untitled5.ipynb - Colab

Lab 1: Object Localization with Bounding Box Prediction

1. Import Required Libraries

import torch
import torch.nn as nn
import torch.optim as optim
from torchvision import datasets, transforms
from torch.utils.data import DataLoader
import numpy as np

2. Prepare the Dataset with Bounding Boxes

class MNISTWithBoundingBoxes:
    def __init__(self, train=True):
        self.dataset = datasets.MNIST(
            root='./data',
            train=train,
            download=True,
            transform=transforms.ToTensor()
        )

    def __getitem__(self, idx):
        img, label = self.dataset[idx]
        img_np = img.squeeze(0).numpy()  # Convert to numpy for bounding box calculation

        # Find the non-zero region (digit) in the image
        rows, cols = np.where(img_np > 0)
        y_min, x_min = rows.min(), cols.min()
        y_max, x_max = rows.max(), cols.max()

        # Normalize bounding box coordinates to [0, 1]
        bbox = torch.tensor([x_min / 28, y_min / 28, x_max / 28, y_max / 28], dtype=torch.float32)

        return img, label, bbox

    def __len__(self):
        return len(self.dataset)

# Create DataLoader for batch processing
train_dataset = MNISTWithBoundingBoxes(train=True)
test_dataset = MNISTWithBoundingBoxes(train=False)
train_loader = DataLoader(train_dataset, batch_size=32, shuffle=True)
test_loader = DataLoader(test_dataset, batch_size=32, shuffle=False)

What’s happening?

We load the MNIST dataset and compute bounding boxes based on non-zero pixels.

Bounding boxes are normalized to [0, 1] by dividing each pixel coordinate by the image size (28).
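To make the normalization concrete, here is a minimal sketch in plain Python that scales a box to [0, 1] and maps it back to pixel indices. The helper names `normalize_bbox` and `denormalize_bbox` and the sample coordinates are illustrative assumptions, not part of the lab; only the division by 28 comes from the dataset code.

```python
IMG_SIZE = 28  # MNIST images are 28x28 pixels

def normalize_bbox(x_min, y_min, x_max, y_max, size=IMG_SIZE):
    """Scale pixel indices to fractions of the image size (hypothetical helper)."""
    return [x_min / size, y_min / size, x_max / size, y_max / size]

def denormalize_bbox(bbox, size=IMG_SIZE):
    """Map normalized coordinates back to the nearest pixel indices (hypothetical helper)."""
    return [round(v * size) for v in bbox]

# Hypothetical digit occupying columns 6..21 and rows 4..23
norm = normalize_bbox(6, 4, 21, 23)
pixels = denormalize_bbox(norm)
print(norm)
print(pixels)
```

Because the largest pixel index is 27, coordinates normalized this way never quite reach 1.0.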

3. Define the Localization Model

class LocalizationModel(nn.Module):
    def __init__(self):
        super(LocalizationModel, self).__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, stride=1, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(2, 2),
            nn.Conv2d(16, 32, kernel_size=3, stride=1, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(2, 2)
        )
        self.fc = nn.Sequential(
            nn.Flatten(),
            nn.Linear(32 * 7 * 7, 128),
            nn.ReLU(),
            nn.Linear(128, 4)  # 4 outputs: [x_min, y_min, x_max, y_max]
        )

    def forward(self, x):
        features = self.backbone(x)
        bbox = self.fc(features)
        return bbox

# Initialize model
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
model = LocalizationModel().to(device)
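
The `32 * 7 * 7` input size of the first linear layer follows from the convolution and pooling settings in the backbone. The sketch below walks through that arithmetic with the standard output-size formula for these layers; the helper `out_size` is a hypothetical name introduced only for illustration.

```python
def out_size(size, kernel, stride=1, padding=0):
    """Spatial output size of a conv or pooling layer (hypothetical helper)."""
    return (size + 2 * padding - kernel) // stride + 1

size = 28                                             # MNIST input: 1 x 28 x 28
size = out_size(size, kernel=3, stride=1, padding=1)  # Conv2d(1, 16): stays 28
size = out_size(size, kernel=2, stride=2)             # MaxPool2d(2, 2): 14
size = out_size(size, kernel=3, stride=1, padding=1)  # Conv2d(16, 32): stays 14
size = out_size(size, kernel=2, stride=2)             # MaxPool2d(2, 2): 7

flat_features = 32 * size * size                      # 32 channels after the second conv
print(size, flat_features)  # prints "7 1568"
```

If the input resolution or the number of pooling layers changes, the first `nn.Linear` size has to be recomputed the same way.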

4. Define Loss Function and Optimizer

criterion = nn.MSELoss()  # Mean Squared Error for bounding box regression
optimizer = optim.Adam(model.parameters(), lr=0.001)

Loss Function: Mean squared error between the predicted and ground-truth bounding box coordinates.

Optimizer: Adam optimizer updates model weights.
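
As a worked example of what the loss measures, the sketch below computes mean squared error by hand for a single box. The coordinate values are made up for illustration; `nn.MSELoss` with its default `reduction='mean'` averages the squared differences over all elements in the same way.

```python
def mse(pred, target):
    """Mean of squared element-wise differences between two equal-length lists."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)

# Hypothetical normalized boxes: [x_min, y_min, x_max, y_max]
pred_bbox = [0.25, 0.25, 0.75, 0.75]
true_bbox = [0.25, 0.125, 0.75, 0.875]

loss = mse(pred_bbox, true_bbox)
print(loss)  # (0 + 0.015625 + 0 + 0.015625) / 4 = 0.0078125
```

Only the two y coordinates are off here, each by 0.125 (3.5 pixels on a 28-pixel image), yet the loss is small because the errors are squared and averaged over all four coordinates.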

5. Train the Model

epochs = 5
for epoch in range(epochs):
    model.train()
    total_loss = 0
    for imgs, _, bboxes in train_loader:
        imgs, bboxes = imgs.to(device), bboxes.to(device)

        optimizer.zero_grad()
        pred_bboxes = model(imgs)
        loss = criterion(pred_bboxes, bboxes)
        loss.backward()
        optimizer.step()

        total_loss += loss.item()

    print(f"Epoch [{epoch+1}/{epochs}], Loss: {total_loss / len(train_loader):.4f}")

Explanation:

For each batch:

Forward pass to predict bounding boxes.

Compute loss between predictions and ground truth.

Backpropagate to compute gradients, then step the optimizer to update model weights.

6. Test the Model

model.eval()
with torch.no_grad():
    for imgs, _, bboxes in test_loader:
        imgs, bboxes = imgs.to(device), bboxes.to(device)
        pred_bboxes = model(imgs)
        print("Predicted BBox:", pred_bboxes[0].cpu().numpy())
        print("Ground Truth BBox:", bboxes[0].cpu().numpy())
        break

Explanation:

Use the trained model to predict bounding boxes for unseen test data.

Compare predictions with ground truth bounding boxes.
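
Printing one pair of boxes gives only a qualitative check. One common way to quantify the comparison is intersection over union (IoU). The lab itself does not use it, so the sketch below is an optional addition, and the `iou` helper and sample boxes are hypothetical. It assumes well-formed boxes in the `[x_min, y_min, x_max, y_max]` format used throughout.

```python
def iou(box_a, box_b):
    """Intersection over union of two [x_min, y_min, x_max, y_max] boxes."""
    # Corners of the overlapping rectangle (may be empty)
    ix_min = max(box_a[0], box_b[0])
    iy_min = max(box_a[1], box_b[1])
    ix_max = min(box_a[2], box_b[2])
    iy_max = min(box_a[3], box_b[3])

    # Clamp at zero so disjoint boxes contribute no overlap
    inter = max(0.0, ix_max - ix_min) * max(0.0, iy_max - iy_min)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

# Hypothetical normalized boxes: the prediction is twice as tall as the target
pred_bbox = [0.25, 0.25, 0.75, 0.75]
true_bbox = [0.25, 0.25, 0.75, 0.5]
print(iou(pred_bbox, true_bbox))  # 0.125 / 0.25 = 0.5
```

With tensors from the test loop, the boxes could be passed in as `pred_bboxes[0].tolist()` and `bboxes[0].tolist()`.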


https://colab.research.google.com/drive/1bctWzlVrcVlhhHqOU9PH8yYSJBuzMpU7#scrollTo=u9fZNMbFkYhT&printMode=true 2/3