
Deep Residual Learning for Image Recognition
Authors: Kaiming He, Xiangyu Zhang, Shaoqing Ren, Jian Sun

Presenters:
• Akash Gadicherla, Computer Engineering, [email protected]
• Sashidhar Reddy Avuthu, Computer Engineering, [email protected]
Outline
• MOTIVATION/BACKGROUND
• APPROACH
• RESULTS
• RELATED WORK
• CONCLUSION
• OUR THOUGHTS
Motivation/Background

Background
• Common convention for image recognition was using CNNs (VGG, AlexNet)
• Depth of the network is crucially important
• Unable to scale networks beyond 20 layers
Motivation
• Network scaling
• Degradation problem: accuracy saturates and then drops as networks get deeper
• Optimization issues
Approach
• Residual Mapping:
  • Instead of directly learning the target transformation H(x), residual learning reformulates it as H(x) = F(x) + x.
  • The residual function F(x) = H(x) - x represents the difference between the target mapping and the input, making it easier to learn (see the sketch below).
• Advantages:
  • Simplifies learning by focusing on small adjustments to the input.
  • Addresses degradation by allowing layers to "do nothing" if needed, since F(x) can be driven to zero.
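
The reformulation above can be made concrete with a short code sketch. This is our own illustration in PyTorch (the slides do not prescribe a framework, and the class name is hypothetical): two 3x3 convolutions compute the residual F(x), and the identity shortcut adds x back before the final nonlinearity, realizing H(x) = F(x) + x.

import torch
import torch.nn as nn

class BasicResidualBlock(nn.Module):
    """F(x) = two 3x3 conv layers; output = ReLU(F(x) + x)."""
    def __init__(self, channels):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, kernel_size=3, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv2 = nn.Conv2d(channels, channels, kernel_size=3, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(channels)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        # Residual branch F(x)
        out = self.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        # Identity shortcut: if the best mapping is near-identity, F(x) can learn to be ~0
        return self.relu(out + x)

# Usage: a block on 64-channel feature maps; shapes are preserved
block = BasicResidualBlock(64)
y = block(torch.randn(1, 64, 56, 56))  # -> (1, 64, 56, 56)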
Approach
• Shortcut Connections
• Types of Shortcuts:
  • Identity Shortcuts: add the input directly, requiring no extra parameters.
  • Projection Shortcuts: use a 1x1 convolution to match input and output dimensions when necessary (sketched below).
• Function of Shortcuts:
  • Enable effective information flow across layers.
  • Minimize optimization difficulty by bypassing transformations when they are unnecessary.
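
To illustrate the projection case, the following sketch (again our own PyTorch rendering with a hypothetical class name) halves the spatial resolution and doubles the channels in the residual branch, so a 1x1 convolution with the same stride is applied on the shortcut to make the addition dimensionally valid.

import torch
import torch.nn as nn

class DownsampleResidualBlock(nn.Module):
    """Residual block whose output has different dimensions than its input."""
    def __init__(self, in_ch, out_ch, stride=2):
        super().__init__()
        self.conv1 = nn.Conv2d(in_ch, out_ch, kernel_size=3, stride=stride, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(out_ch)
        self.conv2 = nn.Conv2d(out_ch, out_ch, kernel_size=3, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(out_ch)
        self.relu = nn.ReLU(inplace=True)
        # Projection shortcut: 1x1 conv matches both channel count and spatial size
        self.projection = nn.Sequential(
            nn.Conv2d(in_ch, out_ch, kernel_size=1, stride=stride, bias=False),
            nn.BatchNorm2d(out_ch),
        )

    def forward(self, x):
        out = self.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        return self.relu(out + self.projection(x))

# Usage: 64 -> 128 channels, 56x56 -> 28x28 feature maps
block = DownsampleResidualBlock(64, 128)
y = block(torch.randn(1, 64, 56, 56))  # -> (1, 128, 28, 28)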
Approach
• Architecture Overview
• Plain Networks vs. ResNets:
  • Plain Networks: sequential layers, vulnerable to degradation.
  • ResNets: introduce shortcut connections every few layers to mitigate the issue.
• Bottleneck Architecture:
  • For deep networks (50+ layers), ResNet employs a 3-layer bottleneck (1x1-3x3-1x1) to reduce computational cost (sketched below).
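
A sketch of the 3-layer bottleneck follows. The 1x1-3x3-1x1 pattern and the 4x channel expansion match the design described for ResNet-50/101/152, but the code itself (PyTorch, hypothetical class name) is our illustration, not the authors' implementation.

import torch
import torch.nn as nn

class BottleneckBlock(nn.Module):
    """1x1 conv reduces width, 3x3 conv works on the narrow representation,
    1x1 conv expands back, keeping very deep ResNets computationally affordable."""
    expansion = 4  # output channels = bottleneck width * 4

    def __init__(self, in_ch, width):
        super().__init__()
        out_ch = width * self.expansion
        self.reduce = nn.Sequential(
            nn.Conv2d(in_ch, width, kernel_size=1, bias=False),
            nn.BatchNorm2d(width), nn.ReLU(inplace=True))
        self.conv3x3 = nn.Sequential(
            nn.Conv2d(width, width, kernel_size=3, padding=1, bias=False),
            nn.BatchNorm2d(width), nn.ReLU(inplace=True))
        self.expand = nn.Sequential(
            nn.Conv2d(width, out_ch, kernel_size=1, bias=False),
            nn.BatchNorm2d(out_ch))
        # Identity shortcut when dimensions already match, otherwise a 1x1 projection
        self.shortcut = nn.Identity() if in_ch == out_ch else nn.Sequential(
            nn.Conv2d(in_ch, out_ch, kernel_size=1, bias=False),
            nn.BatchNorm2d(out_ch))
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        return self.relu(self.expand(self.conv3x3(self.reduce(x))) + self.shortcut(x))

# Usage: 256-channel input, 64-channel bottleneck width, 256-channel output
block = BottleneckBlock(256, 64)
y = block(torch.randn(1, 256, 56, 56))  # -> (1, 256, 56, 56)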
Experimental Results
• Results on ImageNet
• Setup:
  • ResNets with 18, 34, 50, 101, and 152 layers tested.
  • Dataset: ImageNet, over a million images across 1,000 classes.
• Results:
  • 152-layer ResNet: achieved state-of-the-art single-model accuracy despite being far deeper than prior winners.
  • Ensemble of ResNets: reached 3.57% top-5 error, winning the 2015 ImageNet (ILSVRC) classification competition.
Experimental Results
• Results on CIFAR-10
• Objective: evaluate extreme network depth.
• Setup: ResNets tested up to 1,202 layers on CIFAR-10.
• Findings: ResNets trained successfully without degradation, even at 1,202 layers.
• Conclusion: residual learning enables training of very deep networks, unlike plain networks, which degrade as depth increases.
Related Work
• GoogLeNet
  • Uses the Inception module
  • 22 layers
  • Parallel convolution branches within each module
• ResNeXt
  • Uses grouped convolutions to learn different feature transformations
Conclusion and Future Work

Summary
• RESNETS ACHIEVE HIGH ACCURACY WITH HIGHER DEPTH
• ABLE TO SOLVE THE DEGRADATION PROBLEM ASSOCIATED WITH VERY DEEP NEURAL NETWORKS
• EASY OPTIMIZATION
• SET THE STANDARD FOR FUTURE IMAGE CLASSIFICATION ARCHITECTURES
Our Thoughts
• Overall, the paper had a lot of detail and thoroughly explained residual networks.
• We were impressed with how relevant it still is today.
• The model is versatile and can be extended to other vision-based applications.
Thank You!
Any Questions?
