Cep Dip
ISLAMABAD
DIGITAL SIGNAL PROCESSING LAB
Breast cancer is one of the most prevalent cancers affecting women worldwide,
with Invasive Ductal Carcinoma (IDC) being the most common subtype. Early
detection of IDC plays a crucial role in effective treatment and patient survival.
This project aims to develop a machine learning model using Convolutional
Neural Networks (CNNs) to identify IDC from histopathology images. The
dataset consists of approximately 5,000 labeled 50x50 pixel RGB images of
H&E-stained breast tissue samples, with each patch labeled as IDC (1) or
non-IDC (0).
2. Methodology
2.1 Dataset
The dataset comprises:
Images: 50x50 pixel RGB images of histopathology samples.
Labels:
o 0: Does not contain IDC.
o 1: Contains IDC.
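As a rough illustration of the data layout described above, the sketch below builds a small synthetic batch of 50x50 RGB patches and converts it to the channels-first float format that PyTorch convolution layers expect. The function name, the random stand-in data, and the scaling to [0, 1] are assumptions made for illustration; the report does not show how the real patches are loaded.

```python
import numpy as np

def to_channels_first(patches):
    """Convert uint8 patches of shape (N, 50, 50, 3) to float32 (N, 3, 50, 50) in [0, 1]."""
    if patches.ndim != 4 or patches.shape[1:] != (50, 50, 3):
        raise ValueError(f"expected (N, 50, 50, 3), got {patches.shape}")
    scaled = patches.astype(np.float32) / 255.0
    return np.transpose(scaled, (0, 3, 1, 2))

# Synthetic stand-in data (hypothetical, not the real dataset)
rng = np.random.default_rng(0)
demo_patches = rng.integers(0, 256, size=(4, 50, 50, 3), dtype=np.uint8)
demo_labels = np.array([0, 1, 1, 0], dtype=np.int64)  # 0 = non-IDC, 1 = IDC

demo_inputs = to_channels_first(demo_patches)
print(demo_inputs.shape, demo_inputs.dtype)
```

The shape check fails early if a patch has the wrong size, which is a common source of silent errors before the flatten step in the network.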
1. Two convolutional layers (32 and 64 filters) with ReLU activations and
max-pooling.
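The spatial sizes produced by these two stages can be checked with a short calculation. The sketch below assumes 3x3 kernels with padding 1 and 2x2 max-pooling with stride 2; the kernel size and padding are not stated in the report and are chosen here because they reproduce the 64 * 12 * 12 flatten size used in the code. The helper names are hypothetical.

```python
def conv_output_size(size, kernel, padding=0, stride=1):
    """Spatial output size of a convolution or pooling layer (floor division)."""
    return (size + 2 * padding - kernel) // stride + 1

def feature_map_sizes(input_size=50, kernel=3, padding=1):
    """Track the spatial size through two conv + 2x2 max-pool stages."""
    sizes = [input_size]
    size = input_size
    for _ in range(2):
        size = conv_output_size(size, kernel, padding)       # convolution
        size = conv_output_size(size, kernel=2, stride=2)     # 2x2 max-pooling
        sizes.append(size)
    return sizes

sizes = feature_map_sizes()
flattened = 64 * sizes[-1] * sizes[-1]  # 64 filters in the second layer
print(sizes, flattened)  # [50, 25, 12] 9216
```

Without padding, the same kernels would give an 11x11 map after the second pooling, so the flatten size would no longer match.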
The model was trained to minimize loss on the training set and evaluated on
unseen test data to measure its generalization ability.
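To make the two quantities in this step concrete, the following sketch computes a cross-entropy loss and an accuracy by hand in plain Python. The logits and labels are hypothetical values invented for illustration, and the functions only approximate what `nn.CrossEntropyLoss` and the accuracy bookkeeping in the training code do on tensors.

```python
import math

def cross_entropy(logits, label):
    """Cross-entropy loss for one sample: -log(softmax(logits)[label])."""
    peak = max(logits)  # subtract the max for numerical stability
    log_sum_exp = peak + math.log(sum(math.exp(z - peak) for z in logits))
    return log_sum_exp - logits[label]

def accuracy(predictions, labels):
    """Fraction of predictions that match the true labels."""
    if len(predictions) != len(labels) or not labels:
        raise ValueError("predictions and labels must be non-empty and equal in length")
    return sum(p == t for p, t in zip(predictions, labels)) / len(labels)

# Hypothetical logits for four test patches (columns: non-IDC, IDC)
demo_logits = [[2.0, -1.0], [0.5, 1.5], [0.0, 0.0], [-0.3, 0.2]]
demo_labels = [0, 1, 1, 1]

mean_loss = sum(cross_entropy(z, y) for z, y in zip(demo_logits, demo_labels)) / len(demo_labels)
predicted = [max(range(len(z)), key=z.__getitem__) for z in demo_logits]
print(f"mean loss {mean_loss:.4f}, accuracy {accuracy(predicted, demo_labels):.2f}")
```

Computing accuracy only on patches the model never saw during training is what makes it an estimate of generalization rather than of memorization.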
2.4 Code
Training Code:
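import torch
import torch.nn as nn
import numpy as np
from torch.utils.data import Dataset, DataLoader

class SimpleCNN(nn.Module):
    def __init__(self):
        super(SimpleCNN, self).__init__()
        # Assumed: 3x3 kernels with padding 1, which keeps a 50x50 input at
        # 12x12 after two poolings and matches the 64 * 12 * 12 flatten below.
        self.conv1 = nn.Conv2d(3, 32, kernel_size=3, padding=1)
        self.conv2 = nn.Conv2d(32, 64, kernel_size=3, padding=1)
        self.pool = nn.MaxPool2d(2, 2)
        self.fc1 = nn.Linear(64 * 12 * 12, 128)
        self.fc2 = nn.Linear(128, 2)

    def forward(self, x):
        x = self.pool(torch.relu(self.conv1(x)))  # 50x50 -> 25x25
        x = self.pool(torch.relu(self.conv2(x)))  # 25x25 -> 12x12
        x = x.view(-1, 64 * 12 * 12)              # flatten for the dense layers
        x = torch.relu(self.fc1(x))
        x = self.fc2(x)                           # raw logits for the two classes
        return x

# Dataset preparation
class CustomDataset(Dataset):
    def __init__(self, x_data, y_data):
        self.x_data = x_data
        self.y_data = y_data

    def __len__(self):
        return len(self.x_data)

    def __getitem__(self, idx):
        return self.x_data[idx], self.y_data[idx]

# Load data
# Assumed: x_train is a float tensor of shape (N, 3, 50, 50) and y_train a
# long tensor of shape (N,). The batch size is also an assumption.
train_loader = DataLoader(CustomDataset(x_train, y_train), batch_size=32, shuffle=True)

model = SimpleCNN()
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=0.001)  # optimizer and rate assumed

# Training loop
for epoch in range(10):
    model.train()
    running_loss, correct, total = 0.0, 0, 0
    for inputs, labels in train_loader:
        optimizer.zero_grad()
        outputs = model(inputs)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()
        running_loss += loss.item()
        _, predicted = torch.max(outputs, 1)
        correct += (predicted == labels).sum().item()
        total += labels.size(0)
    print(f"Epoch {epoch + 1}: loss {running_loss / len(train_loader):.4f}, accuracy {100 * correct / total:.2f}%")

torch.save(model.state_dict(), 'simple_cnn.pth')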
Testing Code:
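import torch
from PIL import Image
from torchvision import transforms

# SimpleCNN is the class defined in the training code
model = SimpleCNN()
model.load_state_dict(torch.load('simple_cnn.pth'))
model.eval()

# Load the image and bring it to the 50x50 tensor format used in training
img_path = 'test_image.png'
img = Image.open(img_path).convert('RGB')
transform = transforms.Compose([
    transforms.Resize((50, 50)),
    transforms.ToTensor(),
])
img_tensor = transform(img).unsqueeze(0)  # add batch dimension: (1, 3, 50, 50)

# Prediction
with torch.no_grad():  # no gradients are needed for inference
    output = model(img_tensor)
_, predicted = torch.max(output, 1)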
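The index stored in `predicted` is only a class number. One possible way to turn the model's two logits into a readable label with a softmax score is sketched below in plain Python. The helper names and the example logits are hypothetical; in practice the logits could be obtained with something like `output[0].tolist()`.

```python
import math

CLASS_NAMES = {0: "non-IDC", 1: "IDC"}  # label coding from the dataset description

def softmax(logits):
    """Convert raw logits to probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def describe_prediction(logits):
    """Return the class name with the highest probability and that probability."""
    probs = softmax(logits)
    index = max(range(len(probs)), key=probs.__getitem__)
    return CLASS_NAMES[index], probs[index]

# Hypothetical logits for a single patch
name, confidence = describe_prediction([-0.4, 1.1])
print(f"{name} ({confidence:.1%})")
```

The softmax score is not a calibrated probability of disease, so it is better read as a relative ranking of the two classes than as a clinical confidence.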
The model was trained for 10 epochs, and the training and testing performance
improved over time. Below are the key metrics:
Simulation Output:
4. Conclusion
The developed Convolutional Neural Network (CNN) successfully identified
IDC in histopathology images with a final test accuracy of 78.92%. This
demonstrates the model's potential in assisting pathologists with cancer
diagnosis. Future work includes: