SSL 18 Mar 23 PDF
Spring 2023
Sudeshna Sarkar
Self-Supervised Learning
17 Mar 2023
Self-supervised Learning
• Learn what?
• How to learn?
• Learn from what?
[Figure: example images labeled Coral and Fish]
Compact mental image representation
im2vec: layer 3 representation of image
Image representations??
Must be good for transfer learning
Data Dropout Prediction
Prediction Objective
• Unsupervised / self-supervised by predicting part of the data from the other part
Self-supervised pretext tasks
Learn to predict image transformations or complete corrupted images.
1. Solving the pretext tasks allows the model to learn good features.
2. We can automatically generate labels for the pretext tasks.
How to evaluate a self-supervised learning method?
Self-supervised learning by rotating the entire input images.
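The rotation pretext task can be sketched as follows. This is a minimal illustration, not the method from any particular paper: it assumes images are plain 2-D lists, that rotations are restricted to multiples of 90 degrees, and that the label is the number of quarter turns. The names `rotate90` and `make_rotation_batch` are hypothetical.

```python
import random

def rotate90(img, k):
    """Rotate a 2-D list-of-lists image counter-clockwise by k * 90 degrees."""
    img = [list(row) for row in img]  # work on a copy
    for _ in range(k % 4):
        # Transpose, then reverse the row order: one counter-clockwise quarter turn.
        img = [list(row) for row in zip(*img)][::-1]
    return img

def make_rotation_batch(images, rng):
    """Return (rotated_images, labels); label k means a rotation of k * 90 degrees.

    The labels come for free from the transformation itself, so no human
    annotation is needed (assumption: four rotation classes).
    """
    rotated, labels = [], []
    for img in images:
        k = rng.randrange(4)
        rotated.append(rotate90(img, k))
        labels.append(k)
    return rotated, labels

if __name__ == "__main__":
    batch = [[[1, 2], [3, 4]], [[5, 6], [7, 8]]]
    x, y = make_rotation_batch(batch, random.Random(0))
    print(x, y)
```

A classifier trained to recover the label from the rotated image would then be the pretext model; its intermediate features are what gets transferred.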
Self-supervised learning on CIFAR10 (entire training set)
No pretraining
Self-supervised learning on ImageNet (entire training set) with AlexNet
Self-supervised learning with rotation prediction
Pretext task: predict relative patch locations
The model predicts the relative location of two patches from the same image.
Discriminative pretraining task
Doersch et al, “Unsupervised Visual Representation Learning by Context Prediction”, ICCV 2015
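A minimal sketch of how training examples for relative patch location could be generated, under simplifying assumptions: the image is a 2-D list, patches sit on a gap-free 3x3 grid around a random centre patch, and the class label is the index of the neighbour among the 8 surrounding positions. The names `NEIGHBOUR_OFFSETS`, `crop`, and `sample_patch_pair` are hypothetical.

```python
import random

# (row, col) offsets of the 8 neighbours around the centre patch;
# the index into this list is the class label (assumed ordering).
NEIGHBOUR_OFFSETS = [(-1, -1), (-1, 0), (-1, 1),
                     (0, -1),           (0, 1),
                     (1, -1),  (1, 0),  (1, 1)]

def crop(img, top, left, size):
    """Cut a size x size patch out of a 2-D list image."""
    return [row[left:left + size] for row in img[top:top + size]]

def sample_patch_pair(img, patch, rng):
    """Return (centre_patch, neighbour_patch, label) for one image.

    Requires the image to be at least 3 * patch pixels on each side.
    """
    h, w = len(img), len(img[0])
    # Choose a centre position that leaves room for a patch on every side.
    top = rng.randrange(patch, h - 2 * patch + 1)
    left = rng.randrange(patch, w - 2 * patch + 1)
    label = rng.randrange(len(NEIGHBOUR_OFFSETS))
    dr, dc = NEIGHBOUR_OFFSETS[label]
    centre = crop(img, top, left, patch)
    neighbour = crop(img, top + dr * patch, left + dc * patch, patch)
    return centre, neighbour, label

if __name__ == "__main__":
    image = [[r * 6 + c for c in range(6)] for r in range(6)]
    print(sample_patch_pair(image, 2, random.Random(0)))
```

The model would receive both patches and be trained as an 8-way classifier on `label`, which is why the task is discriminative.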
[slide credit: Justin Johnson]
Pretext task: solving “jigsaw puzzles”
Feature Learning by Inpainting
Deepak Pathak, Philipp Krähenbühl, Jeff Donahue, Trevor Darrell, Alexei Efros. CVPR 2016
Learning to inpaint by reconstruction
Encoder: $\phi$    Decoder: $\psi$
Context Encoders: Learning by Inpainting
Input image → predict missing pixels
Encoder: $\phi$    Decoder: $\psi$
L2 loss (best for feature learning)
Pathak et al, “Context Encoders: Feature Learning by Inpainting”, CVPR 2016
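The L2 inpainting objective can be illustrated with a small sketch. It assumes a square region is blanked out and the loss is the mean squared error over the removed pixels only; `pred` stands in for the decoder output, since no network is defined here. The names `mask_square` and `masked_l2_loss` are hypothetical.

```python
def mask_square(img, top, left, size, fill=0.0):
    """Return (corrupted_image, mask); mask is 1 where pixels were removed."""
    corrupted = [list(row) for row in img]
    mask = [[0] * len(row) for row in img]
    for r in range(top, top + size):
        for c in range(left, left + size):
            corrupted[r][c] = fill
            mask[r][c] = 1
    return corrupted, mask

def masked_l2_loss(pred, target, mask):
    """Mean squared error computed over the masked (missing) pixels only."""
    total, count = 0.0, 0
    for p_row, t_row, m_row in zip(pred, target, mask):
        for p, t, m in zip(p_row, t_row, m_row):
            if m:
                total += (p - t) ** 2
                count += 1
    return total / count if count else 0.0

if __name__ == "__main__":
    image = [[1.0] * 4 for _ in range(4)]
    corrupted, mask = mask_square(image, 1, 1, 2)
    # Using the corrupted image itself as a stand-in "prediction".
    print(masked_l2_loss(corrupted, image, mask))
```

In a full pipeline the corrupted image would pass through the encoder and decoder, and this loss would compare the decoder output with the original pixels.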
L2 + adversarial loss (best for nice images)
Hadsell et al, “Dimensionality Reduction by Learning an Invariant Mapping”, CVPR 2006
Similar images should have similar features.
Dissimilar images should have dissimilar features.
Pull features together: $L_S(x_1, x_2) = d^2$
Push features apart (up to margin $m$): $L_D(x_1, x_2) = \max(0, m - d)^2$
where $d$ is the distance between the features of $x_1$ and $x_2$.
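The two pairwise losses can be written out as a short sketch. It assumes the squared-hinge form from Hadsell et al for dissimilar pairs, Euclidean distance between feature vectors, and plain Python lists as features; the function names and the default margin are hypothetical.

```python
import math

def euclidean(a, b):
    """Euclidean distance d between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def contrastive_pair_loss(f1, f2, similar, margin=1.0):
    """Loss for one pair of feature vectors.

    similar=True  -> d^2                 (pull features together)
    similar=False -> max(0, margin-d)^2  (push apart, up to the margin)
    """
    d = euclidean(f1, f2)
    if similar:
        return d ** 2
    return max(0.0, margin - d) ** 2

if __name__ == "__main__":
    print(contrastive_pair_loss([0.0, 0.0], [3.0, 4.0], similar=True))
    print(contrastive_pair_loss([0.0, 0.0], [0.5, 0.0], similar=False))
    print(contrastive_pair_loss([0.0, 0.0], [3.0, 4.0], similar=False))
```

Dissimilar pairs that are already farther apart than the margin contribute zero loss, so the model is not pushed to separate them indefinitely.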
Contrastive Learning
Problem: Where to get positive and negative pairs?
Contrastive Learning with Data Augmentation
Batch of N images
[Figure: batch elements $x_1, \dots, x_6$]
Hadsell et al, “Dimensionality Reduction by Learning an Invariant Mapping”, CVPR 2006
Wu et al, “Unsupervised Feature Learning by Non-Parametric Instance-Level Discrimination”, CVPR 2018
Van den Oord et al, “Representation Learning with Contrastive Predictive Coding”, NeurIPS 2018
Hjelm et al, “Learning deep representations by mutual information estimation and maximization”, ICLR 2019
Bachman et al, “Learning Representations by Maximizing Mutual Information Across Views”, NeurIPS 2019
Henaff et al, “Data-Efficient Image Recognition with Contrastive Predictive Coding”, ICML 2020
Tian et al, “Contrastive Multiview Coding”, ECCV 2020
He et al, “Momentum Contrast for Unsupervised Visual Representation Learning”, CVPR 2020
Chen et al, “A Simple Framework for Contrastive Learning of Visual Representations”, ICML 2020
Interpretation: cross-entropy loss over the other 2N-1 elements in the batch!
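The cross-entropy interpretation can be made concrete with a small sketch. It is written under several assumptions not stated on the slide: similarity is cosine similarity scaled by a temperature (as in Chen et al), the batch holds 2N feature vectors, and views at indices 2k and 2k+1 come from the same image. The function names and the pairing convention are hypothetical.

```python
import math

def cosine(a, b):
    """Cosine similarity between two non-zero feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def contrastive_batch_loss(features, temperature=0.5):
    """Average cross-entropy over all 2N views.

    For view i, the 2N-1 other views are the "classes" and the view that
    shares its source image (index i ^ 1, by assumption) is the correct one.
    """
    n = len(features)
    total = 0.0
    for i in range(n):
        j = i ^ 1  # partner view from the same image
        logits = [cosine(features[i], features[k]) / temperature
                  for k in range(n) if k != i]
        pos = j if j < i else j - 1  # position of j after dropping i
        m = max(logits)  # subtract the max for a stable log-sum-exp
        log_sum = m + math.log(sum(math.exp(l - m) for l in logits))
        total += log_sum - logits[pos]
    return total / n

if __name__ == "__main__":
    aligned = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
    shuffled = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0], [0.0, 1.0]]
    print(contrastive_batch_loss(aligned), contrastive_batch_loss(shuffled))
```

The loss is lower when the two views of each image have matching features and the other views differ, which is the behaviour the pull/push picture describes.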