
Joint Embeddings of Shapes and Images Via CNN Image Purification

This document describes a method for jointly embedding 3D shapes and images into a shared embedding space using CNN image purification. Key aspects include: 1. Extracting Light Field HoG descriptors from shapes to obtain shape embeddings and computing distance matrices between shapes. 2. Training a CNN to map images to the same 128-dimensional embedding space as the shapes by learning from image-shape pairs, with shape embeddings providing supervision. 3. Evaluating quantitatively by measuring retrieval performance between shapes and images in the shared embedding space.


Joint Embeddings of Shapes and Images

via CNN Image Purification


Yangyan Li* Hao Su* Charles R. Qi Noa Fish
Daniel Cohen-Or Leonidas J. Guibas
(*Joint First Authors)
Deep learning is so cool for so many problems…
Deep learning, yay or nay?
Y = 𝑓(𝑋)
A piece of cake, elementary math…
What the hell is the 𝑓?

It eats, a lot!
128 dim space visualized by t-SNE
Image based Shape Retrieval
Shape based Image Retrieval
Cross-View Image Retrieval
Text Images Shapes
Text based Shape Retrieval
Shape Embedding

𝑆𝑖𝑚𝑖𝑙𝑎𝑟𝑖𝑡𝑦(𝑆𝑖, 𝑆𝑗) = ‖𝒫𝑖 − 𝒫𝑗‖
Many choices for 𝒫𝑖 :
Shape Histograms, Spin Images, Spherical
Harmonics, Shape Distributions, etc.
LFD-HoG
Very Strong!

Light Field Rendering → HoG on each rendered view → Concatenate

[Figure: the concatenated descriptors of shapes 𝑆1 … 𝑆𝑛 form a matrix with 203,760 columns; PCA reduces it to 128 columns.]
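[Figure: distance matrix over shapes 𝑆1 … 𝑆𝑛, with category labels chairs, planes, cars.]

Distance Matrix: 𝑑(𝑆𝑖, 𝑆𝑗) in the (𝑖, 𝑗)-th element

MDS with Sammon's Error:

$$E = \frac{1}{\sum_{i<j} d^{*}_{ij}} \sum_{i<j} \frac{\left(d^{*}_{ij} - d_{ij}\right)^{2}}{d^{*}_{ij}}$$

where $d^{*}_{ij}$ is the original distance between shapes $S_i$ and $S_j$, and $d_{ij}$ is the distance between their embedding points.

[Figure: MDS maps the n × n distance matrix (rows 𝑆1 … 𝑆𝑛) to a matrix with 128 columns.]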


Each row can serve as the embedding point
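
A small worked example may help read Sammon's error. The numbers are made up for illustration: take three shapes whose original distances are all 1 ($d^{*}_{12} = d^{*}_{13} = d^{*}_{23} = 1$) and place their embedding points on a line at 0, 1, and 2, so that $d_{12} = d_{23} = 1$ and $d_{13} = 2$. Only the pair (1, 3) is distorted:

$$E = \frac{1}{1 + 1 + 1}\left[\frac{(1-1)^{2}}{1} + \frac{(1-2)^{2}}{1} + \frac{(1-1)^{2}}{1}\right] = \frac{1}{3}$$

An embedding that reproduces every original distance exactly would give $E = 0$.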
[Plot: number of neighbors by original distance (y-axis, 0 to 250) versus neighborhood size in embedding space (x-axis, 0 to 250), comparing Sammon, PCA, LLE, NPE, and Optimal.]
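
One plausible reading of this comparison is a neighborhood-preservation count: for a neighborhood of size k, how many of a shape's k nearest neighbors by original distance are also among its k nearest neighbors in the embedding. The sketch below follows that reading, which is an assumption, as are the names `knn` and `mean_neighbor_overlap`; the slides do not spell out the exact metric.

```python
def knn(dist_row, i, k):
    """Indices of the k nearest neighbors of item i, given its row of distances."""
    others = [j for j in range(len(dist_row)) if j != i]
    others.sort(key=lambda j: dist_row[j])
    return set(others[:k])


def mean_neighbor_overlap(orig_dist, emb_dist, k):
    """Average number of k-nearest neighbors shared by the two distance matrices.

    orig_dist: n x n distances between the original descriptors.
    emb_dist:  n x n distances between the embedding points.
    A value of k means every neighborhood is preserved exactly.
    """
    n = len(orig_dist)
    shared = 0
    for i in range(n):
        shared += len(knn(orig_dist[i], i, k) & knn(emb_dist[i], i, k))
    return shared / n
```

Under this reading, sweeping k and plotting the result would give one curve per embedding method, with the diagonal as the best possible outcome.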
Shape Embedding

𝑆𝑖𝑚𝑖𝑙𝑎𝑟𝑖𝑡𝑦(𝑆𝑖, 𝑆𝑗) = ‖𝒫𝑖 − 𝒫𝑗‖
Our choice of embedding point 𝒫𝑖 :
1. Extract Light Field HoG Descriptors
2. Compute Distance Matrix
3. MDS with Sammon’s Error
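
Steps 2 and 3 can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes Euclidean distance between descriptor vectors (the slides do not name the metric), it minimizes Sammon's error with simple gradient descent and step halving rather than any particular MDS solver, and the names `pairwise_distances`, `sammon_stress`, and `sammon_embed` are hypothetical. Step 1 (descriptor extraction) is omitted; any list of distinct, equal-length vectors stands in for the descriptors.

```python
import math
import random


def pairwise_distances(descriptors):
    """Step 2: n x n matrix of Euclidean distances (the metric is an assumption)."""
    return [[math.dist(a, b) for b in descriptors] for a in descriptors]


def sammon_stress(d_star, pts):
    """Sammon's error between original distances d_star and embedded points pts."""
    n = len(pts)
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    c = sum(d_star[i][j] for i, j in pairs)
    err = sum((d_star[i][j] - math.dist(pts[i], pts[j])) ** 2 / d_star[i][j]
              for i, j in pairs)
    return err / c


def sammon_embed(d_star, dim=2, iters=200, step=0.1, seed=0):
    """Step 3: gradient descent on Sammon's error; returns (points, final error)."""
    rng = random.Random(seed)
    n = len(d_star)
    pts = [[rng.uniform(-1.0, 1.0) for _ in range(dim)] for _ in range(n)]
    c = sum(d_star[i][j] for i in range(n) for j in range(i + 1, n))
    best = sammon_stress(d_star, pts)
    for _ in range(iters):
        grad = [[0.0] * dim for _ in range(n)]
        for i in range(n):
            for j in range(n):
                if i == j:
                    continue
                d = math.dist(pts[i], pts[j]) or 1e-12  # avoid division by zero
                coef = -2.0 * (d_star[i][j] - d) / (d_star[i][j] * d * c)
                for a in range(dim):
                    grad[i][a] += coef * (pts[i][a] - pts[j][a])
        trial = [[p - step * g for p, g in zip(pts[i], grad[i])]
                 for i in range(n)]
        trial_err = sammon_stress(d_star, trial)
        if trial_err < best:   # accept only steps that reduce the error
            pts, best = trial, trial_err
        else:                  # otherwise retry with a smaller step
            step *= 0.5
    return pts, best
```

Calling `sammon_embed(pairwise_distances(descriptors), dim=128)` would mirror the 128-dimensional target on the slides; the pure-Python loops are only practical for a handful of shapes.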
Image Embedding
via CNN Image Purification
Deep learning, yay or nay?

𝒫𝑖 = 𝑓(𝐼𝑖)
‖𝒫2 − 𝒫3‖ < ‖𝒫1 − 𝒫2‖
A piece of cake, elementary math…
What the hell is the 𝑓?
http://shapenet.org
Shape Embedding Image Synthesis

Many image-point pairs (𝐼𝑆𝑖 , 𝒫𝑖 )


≠ 1014 ∗

It’s not only the number…


Training Phase Testing Phase
Input: many image-point pairs (𝐼𝑆𝑖 , 𝒫𝑖 )
Task: learn the function 𝒫𝑖 = 𝑓(𝐼𝑆𝑖 )
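
The task above is a regression from images to embedding points, so the training objective can be sketched as a distance between predicted and target points. The snippet assumes the Euclidean loss named on the backup slide with the common 1/(2N) scaling; that scaling and the name `euclidean_loss` are assumptions, and the CNN that produces the predictions is left out entirely.

```python
def euclidean_loss(predicted, targets):
    """Half the mean squared Euclidean distance between paired points.

    predicted: list of N embedding points output by the network for N images.
    targets:   list of N embedding points of the shapes the images came from.
    """
    n = len(predicted)
    total = 0.0
    for p, t in zip(predicted, targets):
        total += sum((a - b) ** 2 for a, b in zip(p, t))
    return total / (2 * n)
```

For a single image whose prediction is (0, 0) and whose target is (3, 4), the squared distance is 25 and the loss is 12.5; it reaches 0 only when every prediction lands on its target point.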
Hey, wake up!
Here comes the most important slide!
Shape Embedding Precious High Quality Supervision

Image Synthesis Messy but Nutritional Training Data

Training Phase
Testing Phase 𝒫𝑖 = 𝑓(𝐼𝑆𝑖 ), the hell function
Quantitative Evaluation

AUC of image to image retrieval precision-recall curve

First and last image match rankings in shape to image retrieval
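
Retrieval in a joint embedding space amounts to sorting candidates by distance to the query point, and the first and last match rankings can then be read off the sorted list. The sketch below assumes Euclidean distance in the embedding space and a known set of correct matches per query; the names `rank_candidates` and `first_last_match_rank` are hypothetical, and this is not the evaluation code behind the reported numbers.

```python
import math


def rank_candidates(query_point, candidate_points):
    """Candidate indices sorted from nearest to farthest in the embedding space."""
    return sorted(range(len(candidate_points)),
                  key=lambda i: math.dist(query_point, candidate_points[i]))


def first_last_match_rank(ranking, relevant):
    """1-based ranks of the first and last relevant candidates in a ranking."""
    ranks = [rank for rank, idx in enumerate(ranking, start=1) if idx in relevant]
    return ranks[0], ranks[-1]
```

For shape-to-image retrieval, `query_point` would be a shape's embedding point, `candidate_points` the embedded images, and `relevant` the indices of images that depict that shape. Lower ranks are better for both numbers.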


Quantitative Evaluation

Image to shape retrieval


Key Steps towards 3D Reconstruction

Similar Shape Retrieval


+
Viewpoint estimation
Render for CNN: Viewpoint Estimation in Images Using CNNs
Trained with Rendered 3D Model Views, ICCV 2015 Oral
Limitations & Future Work
• Dynamic embedding space construction
• Similarity: visual → semantic
  - For example, the upcoming SHED!
• Similarity: scalar → vector/matrix
• Whole shape → Parts
• Joint analysis of shapes and images
• …
http://shapenet.github.io/JointEmbedding/
Stay Cool with http://shapenet.github.io/RenderForCNN/
Thank you!
[Figure: network diagram with labels CONV Layer, FC Layer, Softmax Loss Layer (Class Label), and Euclidean Loss Layer (Embedding Point, m Dimensions).]
