Deep Learning Computer Vision Lecture Notes
Author: Salam Kalam
Date: June 21, 2025
Table of Contents
1. Convolutional Neural Networks (CNNs)
2. Transfer Learning
3. Object Detection (YOLO, Faster R-CNN)
4. Semantic Segmentation
5. Generative Models (GANs)
6. References
1. Convolutional Neural Networks (CNNs)
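CNNs stack convolutional layers to learn a spatial hierarchy of features, from
edges and textures in early layers to object parts in deeper ones. Key components
are convolution kernels (learned filters), pooling layers that downsample feature
maps, and fully connected layers that map the final features to outputs.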
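To make the kernel and pooling steps concrete, here is a minimal pure-Python sketch. The function names, the toy image, and the hand-written edge kernel are illustrative assumptions; in a real CNN the kernel values are learned, and a fully connected layer would follow on the flattened output.

```python
def conv2d(image, kernel):
    """Valid cross-correlation, the operation deep learning libraries call convolution."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[i + a][j + b] * kernel[a][b]
                for a in range(kh) for b in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def max_pool2d(fmap, size=2):
    """Non-overlapping max pooling; any ragged border is dropped."""
    return [
        [
            max(fmap[i + a][j + b] for a in range(size) for b in range(size))
            for j in range(0, len(fmap[0]) - size + 1, size)
        ]
        for i in range(0, len(fmap) - size + 1, size)
    ]


# Toy 5x5 image with a vertical edge between columns 1 and 2.
image = [[0, 0, 1, 1, 1] for _ in range(5)]
# Hypothetical hand-written kernel that responds to left-to-right brightness increases.
edge_kernel = [[-1, 1], [-1, 1]]

features = conv2d(image, edge_kernel)  # 4x4 feature map, nonzero only at the edge
pooled = max_pool2d(features)          # 2x2 map after 2x2 pooling
```

The feature map is nonzero only in the column where the edge sits, and pooling keeps that response while halving the resolution.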
2. Transfer Learning
Transfer learning reuses CNNs (e.g., VGG, ResNet) that were pretrained on large
datasets such as ImageNet and fine-tunes them for a specific vision task, which
reduces training time and the amount of labeled data required.
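The idea can be sketched with a toy example in which a frozen feature extractor stands in for the pretrained network and only a small new head is trained. Everything here is an assumption made for illustration: the fixed weights, the made-up dataset, and the logistic-regression head. A real workflow would load VGG or ResNet weights from a deep learning library and might also unfreeze some backbone layers.

```python
import math

# Stand-in for a pretrained backbone: fixed weights that are never updated.
FROZEN_WEIGHTS = [[1.0, -1.0], [0.5, 0.5]]


def backbone(x):
    """Frozen feature extractor: a linear map followed by ReLU."""
    return [max(0.0, sum(w * xi for w, xi in zip(row, x))) for row in FROZEN_WEIGHTS]


def predict(head, x):
    """New task-specific head: logistic regression on the backbone features."""
    feats = backbone(x)
    z = sum(w * f for w, f in zip(head["w"], feats)) + head["b"]
    return 1.0 / (1.0 + math.exp(-z))


def train_head(data, epochs=200, lr=0.5):
    """Train only the head with stochastic gradient descent."""
    head = {"w": [0.0, 0.0], "b": 0.0}
    for _ in range(epochs):
        for x, y in data:
            feats = backbone(x)
            err = predict(head, x) - y  # gradient of cross-entropy w.r.t. the logit
            head["w"] = [w - lr * err * f for w, f in zip(head["w"], feats)]
            head["b"] -= lr * err
    return head


# Made-up dataset: label 1 when the first input exceeds the second.
data = [([2.0, 0.0], 1), ([1.5, 0.5], 1), ([0.0, 2.0], 0), ([0.5, 1.5], 0)]
head = train_head(data)
```

Because the backbone is fixed, only three parameters are learned, which is the source of the savings in training time and data.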
3. Object Detection
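- YOLO (You Only Look Once): Single-stage detector that predicts bounding boxes
  and class scores in one forward pass, enabling real-time performance.
- Faster R-CNN: Two-stage detector in which a region proposal network (RPN)
  proposes candidate regions that a second stage classifies and refines.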
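Both detector families commonly post-process their raw predictions by scoring box overlap with intersection over union (IoU) and pruning duplicates with non-maximum suppression (NMS). These notes do not cover that step, so the sketch below is a supplementary illustration; the (x1, y1, x2, y2) box format, the 0.5 threshold, and the sample boxes are assumptions.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, iou_threshold=0.5):
    """Greedy NMS: visit boxes by descending score, keep a box only if it
    does not overlap an already kept box by more than the threshold.
    Returns the indices of the kept boxes."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[k]) <= iou_threshold for k in keep):
            keep.append(i)
    return keep


# Two heavily overlapping detections of one object, plus a separate object.
boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
kept = nms(boxes, scores)
```

Here the second box overlaps the first with an IoU of 81/119, above the threshold, so only the first and third boxes survive.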
4. Semantic Segmentation
Model    Description
U-Net    Encoder-decoder architecture with skip connections, introduced for biomedical image segmentation.
SegNet   Encoder-decoder that reuses the encoder's max-pooling indices to upsample efficiently in the decoder.
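The max-pooling index transfer used by SegNet can be illustrated in a few lines of pure Python. This is a simplified single-channel sketch with hypothetical function names and a made-up feature map; the actual network applies the same idea per channel and follows each unpooling step with learned convolutions.

```python
def max_pool_with_indices(fmap, size=2):
    """Max pooling that also records the position of each maximum."""
    pooled, indices = [], []
    for i in range(0, len(fmap) - size + 1, size):
        pooled_row, index_row = [], []
        for j in range(0, len(fmap[0]) - size + 1, size):
            window = [(fmap[i + a][j + b], (i + a, j + b))
                      for a in range(size) for b in range(size)]
            value, pos = max(window, key=lambda t: t[0])
            pooled_row.append(value)
            index_row.append(pos)
        pooled.append(pooled_row)
        indices.append(index_row)
    return pooled, indices


def max_unpool(pooled, indices, height, width):
    """Decoder-side upsampling: put each value back at its recorded position
    and fill every other position with zero."""
    out = [[0] * width for _ in range(height)]
    for pooled_row, index_row in zip(pooled, indices):
        for value, (r, c) in zip(pooled_row, index_row):
            out[r][c] = value
    return out


fmap = [[1, 3, 2, 0],
        [4, 2, 1, 5],
        [0, 1, 7, 2],
        [6, 2, 3, 1]]
pooled, indices = max_pool_with_indices(fmap)
restored = max_unpool(pooled, indices, 4, 4)
```

Storing only the indices is cheaper than storing full encoder feature maps, while still restoring each maximum to its original spatial location.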
5. Generative Models (GANs)
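Generative Adversarial Networks (GANs) consist of a generator and a discriminator
trained adversarially: the generator synthesizes images while the discriminator
learns to tell them apart from real ones, which pushes the generator toward
realistic outputs.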
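The adversarial setup can be expressed through two losses computed from the discriminator's outputs. The sketch below assumes the discriminator returns probabilities strictly between 0 and 1, and it uses the non-saturating generator loss suggested in the original GAN paper; the function names are illustrative.

```python
import math


def discriminator_loss(d_real, d_fake):
    """Batch mean of -[log D(x) + log(1 - D(G(z)))].

    d_real: discriminator outputs on real images.
    d_fake: discriminator outputs on generated images.
    """
    real_term = sum(math.log(p) for p in d_real) / len(d_real)
    fake_term = sum(math.log(1.0 - p) for p in d_fake) / len(d_fake)
    return -(real_term + fake_term)


def generator_loss(d_fake):
    """Non-saturating generator loss: batch mean of -log D(G(z))."""
    return -sum(math.log(p) for p in d_fake) / len(d_fake)
```

A discriminator that cannot tell real from fake outputs 0.5 everywhere, giving a discriminator loss of 2 log 2; the generator loss falls as the discriminator assigns higher probabilities to generated images.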
6. References
1. Goodfellow, I. et al. (2014). Generative Adversarial Nets. NeurIPS.
2. He, K. et al. (2016). Deep Residual Learning for Image Recognition. CVPR.
3. Ronneberger, O. et al. (2015). U-Net: Convolutional Networks for Biomedical
Image Segmentation. MICCAI.