Going Deeper with Convolutions

Christian Szegedy et al.

Presented by Anand Sharma (20211055)


ILSVRC 2014 Classification Challenge Overview

▶ Task: Classify each image into one of 1000 categories
▶ Dataset:
  ▶ 1.2 million training images
  ▶ 50,000 validation images
  ▶ 100,000 test images
▶ Performance Metrics:
  ▶ Top-1 error rate: Fraction of images whose highest-scoring
    predicted label does not match the ground truth
  ▶ Top-5 error rate: Fraction of images whose ground truth is
    not among the five highest-scoring predicted labels (the
    metric used for ranking)
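
The two metrics can be sketched in a few lines of plain Python. This is an illustrative sketch, not the challenge's official evaluation code; the function name `top_k_error` and the toy scores are made up for the example.

```python
def top_k_error(scores, labels, k):
    """Fraction of images whose true label is not among the k highest scores."""
    misses = 0
    for row, label in zip(scores, labels):
        # Class indices sorted by descending score; keep the k best.
        ranked = sorted(range(len(row)), key=lambda c: row[c], reverse=True)
        if label not in ranked[:k]:
            misses += 1
    return misses / len(labels)


# Toy example (made-up scores): 3 images, 6 classes.
scores = [
    [0.10, 0.50, 0.20, 0.10, 0.05, 0.05],  # true class 1: ranked first
    [0.30, 0.10, 0.25, 0.20, 0.10, 0.05],  # true class 5: ranked last
    [0.05, 0.10, 0.40, 0.30, 0.10, 0.05],  # true class 3: ranked second
]
labels = [1, 5, 3]

top1 = top_k_error(scores, labels, k=1)  # two of three images missed
top5 = top_k_error(scores, labels, k=5)  # one of three images missed
```

The third image shows why the top-5 rate is the more forgiving metric: the ground truth is ranked second, so it counts as a top-1 error but not as a top-5 error.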

GoogLeNet Model Performance

▶ Achieved a top-5 error rate of 6.67 percent
▶ Represented a roughly 40 percent relative reduction compared
  to the previous year's best approach (Clarifai, 2013)
▶ No external data used for training; relied solely on the
  ILSVRC 2014 dataset
▶ Demonstrated the efficiency and effectiveness of the Inception
  architecture
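
The relative reduction can be checked with a short calculation. The 6.67 percent figure comes from the slide; the 11.7 percent top-5 error for Clarifai is not on the slides and is assumed here from the comparison table in the GoogLeNet paper, so treat it as an assumption.

```python
def relative_reduction(previous_error, new_error):
    """Share of the previous error rate that the new result removes."""
    return (previous_error - new_error) / previous_error


clarifai_top5 = 11.7   # percent; assumed from the paper's table, not on the slides
googlenet_top5 = 6.67  # percent; from the slide

reduction = relative_reduction(clarifai_top5, googlenet_top5)
print(f"Relative reduction: {reduction:.1%}")
```

Under that assumption the reduction comes out slightly above 40 percent, which is consistent with the rounded figure quoted on the slide.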

Ensemble of Models

▶ Seven versions of GoogLeNet trained independently
▶ Models differed only in sampling methodology and in the
  random order in which they saw the training images
▶ Captured diverse features, improving overall robustness
▶ Combining predictions from different models reduced the
  likelihood of errors
▶ Ensemble approach led to more accurate and reliable final
  predictions
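
Aggressive Cropping

▶ Resized images to four different scales:
  ▶ Shorter dimension: 256, 288, 320, 352 pixels
▶ Multiple crops taken from each scale:
  ▶ Three squares per scale: left, center, right (top, center,
    bottom for portrait images)
  ▶ Six views per square: the four corner and center 224×224
    crops, plus the whole square resized to 224×224
  ▶ Each view also taken in mirrored form
▶ Total: 4 × 3 × 6 × 2 = 144 crops per image
▶ Ensured model could handle variations in object sizes and
  positions
▶ Improved accuracy by capturing different perspectives of the
  objects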
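
A minimal sketch of how the 144 crops can be enumerated. It follows the paper's description, in which each square contributes four corner crops, a center crop, and the whole square resized to 224×224, and every view is also mirrored. The function name `enumerate_crops`, the tuple layout, and the 500×375 example image are hypothetical; the code only computes crop boxes and does not touch pixel data.

```python
def enumerate_crops(width, height, scales=(256, 288, 320, 352), crop=224):
    """Yield (scale, square, view, mirrored, box) for every test-time crop.

    box is (left, top, right, bottom) in the coordinates of the resized image.
    """
    for scale in scales:
        # Resize so the shorter side equals `scale`, keeping the aspect ratio.
        ratio = scale / min(width, height)
        w, h = round(width * ratio), round(height * ratio)
        # Three squares of side `scale` along the longer axis.
        if w >= h:
            names = ("left", "center", "right")
            offsets = [(0, 0), ((w - scale) // 2, 0), (w - scale, 0)]
        else:
            names = ("top", "center", "bottom")
            offsets = [(0, 0), (0, (h - scale) // 2), (0, h - scale)]
        for name, (sx, sy) in zip(names, offsets):
            m = scale - crop  # margin left around a crop inside the square
            corners = {
                "top_left": (0, 0), "top_right": (m, 0),
                "bottom_left": (0, m), "bottom_right": (m, m),
                "center": (m // 2, m // 2),
            }
            boxes = {view: (sx + dx, sy + dy, sx + dx + crop, sy + dy + crop)
                     for view, (dx, dy) in corners.items()}
            # Sixth view: the whole square, to be resized down to crop x crop.
            boxes["full_square"] = (sx, sy, sx + scale, sy + scale)
            for view, box in boxes.items():
                for mirrored in (False, True):
                    yield scale, name, view, mirrored, box


crops = list(enumerate_crops(500, 375))
print(len(crops))  # 4 scales x 3 squares x 6 views x 2 mirrorings
```

For a perfectly square input the three squares coincide, so the 144 entries would contain duplicates; the count itself is unchanged.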

Softmax Averaging

▶ Averaged softmax probabilities from different crops and
  models
▶ Provided robust final predictions
▶ Reduced chance of errors from individual models or specific
  crops
▶ Smoothed out inconsistencies in predictions
▶ Significant improvement in classification accuracy
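
The averaging step can be sketched as follows, assuming each crop and model pair yields one vector of class logits. The helper names and the toy logits are made up for illustration and are not taken from the authors' code.

```python
import math


def softmax(logits):
    """Convert a list of logits into probabilities."""
    shift = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def average_softmax(logits_per_view):
    """Average class probabilities over all views (one view per crop/model pair)."""
    probs = [softmax(v) for v in logits_per_view]
    n_classes = len(probs[0])
    return [sum(p[c] for p in probs) / len(probs) for c in range(n_classes)]


# Made-up logits for 3 classes from 4 views (e.g. 2 crops x 2 models).
views = [
    [2.0, 1.0, 0.1],
    [1.5, 1.8, 0.2],  # this view alone would pick class 1
    [2.2, 0.9, 0.3],
    [1.9, 1.1, 0.0],
]
avg = average_softmax(views)
prediction = max(range(len(avg)), key=lambda c: avg[c])
```

The second view disagrees with the others, but after averaging the prediction is still class 0, which illustrates how a single outlying crop or model gets outvoted.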

ILSVRC 2014 Detection Challenge Overview

▶ Task: Produce bounding boxes around objects in images from
  200 categories
▶ Complexity: Identify and localize multiple objects of varying
  scales within a single image
▶ Performance Metric: Mean average precision (mAP)
▶ Detected objects considered correct if:
  ▶ Class matches ground truth
  ▶ Bounding boxes overlap by at least 50 percent (measured
    with the Jaccard index)
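
The correctness criterion can be sketched as below, using the Jaccard index (intersection over union) with the 50 percent threshold that the paper states for this challenge. Boxes are assumed to be `(x1, y1, x2, y2)` tuples with `x1 < x2` and `y1 < y2`; the function names and the example boxes are hypothetical.

```python
def iou(box_a, box_b):
    """Jaccard index (intersection over union) of two axis-aligned boxes."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    # Clamp at zero so disjoint boxes get an empty intersection.
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)


def is_correct(pred_class, pred_box, true_class, true_box, threshold=0.5):
    """A detection counts only if the class matches and the overlap is large enough."""
    return pred_class == true_class and iou(pred_box, true_box) >= threshold


truth = (0, 0, 10, 10)
half = (0, 0, 10, 5)      # covers half of the ground truth: overlap of 0.5
shifted = (5, 0, 15, 10)  # shifted by half a box width: overlap of 1/3

accepted = is_correct("dog", half, "dog", truth)
rejected = is_correct("dog", shifted, "dog", truth)
```

Note that a box shifted by half its width already falls below the threshold, because the union grows as the intersection shrinks.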

Detection Challenge Techniques

▶ Achieved 43.9 percent mAP
▶ Techniques used:
  ▶ Selective Search and Multi-box Predictions: Combined
    methods for generating region proposals; improved object
    bounding box recall
  ▶ Increased Superpixel Size: Doubled the superpixel size to
    halve the number of Selective Search proposals, cutting
    down on false positives; improved single-model mAP by 1
    percent
  ▶ Ensemble of ConvNets: Six Convolutional Networks
    (ConvNets) used for region classification; raised detection
    accuracy from 40 to 43.9 percent

Ensemble for Detection

▶ Ensemble strategy similar to classification, but tailored for
  region proposals
▶ Each ConvNet in the ensemble processed region proposals
  independently
▶ Outputs combined for the final decision
▶ Captured various aspects of objects, leading to more accurate
  detection
▶ Unlike R-CNN, did not employ bounding box regression (the
  paper cites lack of time); gains came from region proposal
  quality and the ensemble

Improvements and Future Work

▶ Inception architecture effective in approximating sparse
  structures with dense components
▶ Provided significant quality gains with a modest increase in
  computational requirements
▶ Future work to focus on developing sparser, more refined
  neural network structures, ideally in automated ways
▶ Potential to change how neural networks are designed and
  implemented
▶ Holds promise for further advances in computer vision and
  other fields

Thank You

▶ Thank you for your attention
▶ Open to any questions about the presentation or the Inception
  architecture
