CBIR Using CNN
Outline
What is CBIR?
Image Features
Feature Weighting and Relevance Feedback
User Interface and Visualization
What is Content-Based Image Retrieval (CBIR)?
Image search systems that retrieve images by image content,
in contrast to keyword-based image/video retrieval
(e.g., Google Image Search, YouTube)
Applications of CBIR
Consumer Digital Photo Albums
Digital Cameras
Flickr
Medical Images
Digital Museum
Trademarks Search
MPEG-7 Content Descriptors
Basic Components of CBIR
Feature Extractor
Create the metadata
Query Engine
Calculate similarity
User Interface
How does CBIR work?
Extract features from images
Let the user pose a query:
Query by Sketch
Query by Keywords
Query by Example
Refine the result by relevance feedback:
give feedback on the previous result
Query by Example
Pick example images, then ask the system to retrieve "similar" images.
[Figure: a query sample and its retrieved results]
Relevance Feedback
The user gives feedback on the query results
The system recalculates the feature weights
[Figure: initial sample, 1st result, 2nd result]
Basic Components of CBIR
Feature Extractor
Create the metadata
Query Engine
Calculate similarity
User Interface
Image Features (Metadata)
Color
Texture
Structure
etc.
Color Features
Which Color Space?
RGB, CMY, YCrCb, CIE, YIQ, HLS, …
Our Favorite is HSV
Designed to be similar to human perception
HSV Color Space
H (Hue)
Dominant color (spectral)
S (Saturation)
Amount of white
V (Value)
Brightness
A Straightforward Way to Use HSV as Color Features
Histogram for each of H, S, and V
Then compare in each bin
Is this a good idea?
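A small numerical sketch of the problem (the bin count and synthetic data are made up): a hue shift far smaller than one bin width can still move every pixel into a different bin.

```python
import numpy as np

def hsv_histograms(hsv, bins=16):
    """Normalized per-channel histograms of an HSV image with values in [0, 1]."""
    return [np.histogram(hsv[..., c], bins=bins, range=(0.0, 1.0))[0] / hsv[..., c].size
            for c in range(3)]

def l1_distance(h1, h2):
    """Bin-by-bin L1 distance, summed over the three channels (maximum is 6)."""
    return float(sum(np.abs(a - b).sum() for a, b in zip(h1, h2)))

# Two synthetic "images" identical except for a tiny hue offset that
# happens to straddle the bin edge at 0.5.
rng = np.random.default_rng(0)
base = rng.random((64, 64, 3))
img_a, img_b = base.copy(), base.copy()
img_a[..., 0] = 0.49 + 0.004 * base[..., 0]   # hue just below the bin edge
img_b[..., 0] = img_a[..., 0] + 0.02          # shifted just above it

d = l1_distance(hsv_histograms(img_a), hsv_histograms(img_b))
print(d)  # 2.0: the two hue histograms no longer overlap at all
```

Perceptually the two images are nearly identical, yet the bin-wise distance is the maximum possible for one channel.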
Are these two that different?
Bin-by-bin histogram comparison is very sensitive to small shifts.
Color Moments [Stricker ‘95]
For each image, the color distribution in each of H, S, and V is summarized by its 1st (mean), 2nd (variance), and 3rd moments:

E_i = \frac{1}{N} \sum_{j=1}^{N} p_{ij}

\sigma_i = \left( \frac{1}{N} \sum_{j=1}^{N} (p_{ij} - E_i)^2 \right)^{1/2}

s_i = \left( \frac{1}{N} \sum_{j=1}^{N} (p_{ij} - E_i)^3 \right)^{1/3}

where i is the color channel (i = h, s, v), p_{ij} is the value of pixel j in channel i, and N is the number of pixels in the image. Total: 9 features.
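These nine moments are a few lines of NumPy (the sign-preserving cube root for a negative third moment is an implementation choice, not something the slides specify):

```python
import numpy as np

def color_moments(hsv):
    """9-D feature vector: E_i, sigma_i, and s_i for each HSV channel i."""
    feats = []
    for c in range(3):
        p = hsv[..., c].ravel().astype(float)
        e = p.mean()                                   # E_i, 1st moment
        sigma = np.sqrt(np.mean((p - e) ** 2))         # sigma_i, 2nd moment
        m3 = np.mean((p - e) ** 3)
        s = np.sign(m3) * np.abs(m3) ** (1.0 / 3.0)    # s_i, sign-preserving cube root
        feats.extend([e, sigma, s])
    return np.array(feats)

rng = np.random.default_rng(1)
f = color_moments(rng.random((32, 32, 3)))
print(f.shape)  # (9,)
```

Two images are then compared by a (possibly weighted) distance between their 9-D moment vectors.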
Shape Features
Region-Based Shape
Outer Boundary
Contour-Based Shape
Features of Contour
Edge-Based Shape
e.g., histogram of edge lengths and orientations
Region-based vs. Contour-based
Region-based
Suitable for complex objects with disjoint regions
Contour-based
Suitable when the semantics are contained in the contour
Region-based vs. Contour-based
[Figure: example shapes illustrating region complexity vs. contour complexity]
Angular Radial Transformation (ART) [Kim '99]
A region-based shape descriptor
Calculate coefficients from the image intensities in polar coordinates (n < 3, m < 12):

F_{nm} = \int_0^{2\pi} \int_0^1 V_{nm}^*(\rho, \theta) \, f(\rho, \theta) \, \rho \, d\rho \, d\theta

f(\rho, \theta): image intensity in polar coordinates
V_{nm}(\rho, \theta): ART basis function

V_{nm}(\rho, \theta) = \frac{1}{2\pi} \exp(jm\theta) \, R_n(\rho)

R_n(\rho) = \begin{cases} 1 & n = 0 \\ 2\cos(\pi n \rho) & n \neq 0 \end{cases}

Total 35 coefficients in 140 bits (4 bits/coefficient)
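A numerical sketch of the ART integral on a discrete polar grid (the grid sizes, midpoint sampling, and the constant-disc check are illustrative assumptions; a real MPEG-7 implementation also normalizes the coefficients by |F_00| and maps the image onto the unit disc first):

```python
import numpy as np

def art_coefficients(f_polar, n_max=3, m_max=12):
    """Approximate F_nm = integral of conj(V_nm(rho,theta)) f(rho,theta) rho drho dtheta
    for f sampled on a (num_rho, num_theta) polar grid over the unit disc."""
    nr, nt = f_polar.shape
    rho = (np.arange(nr) + 0.5) / nr              # midpoint samples in (0, 1)
    theta = 2.0 * np.pi * np.arange(nt) / nt
    drho, dtheta = 1.0 / nr, 2.0 * np.pi / nt
    coeffs = np.empty((n_max, m_max), dtype=complex)
    for n in range(n_max):
        r_n = np.ones_like(rho) if n == 0 else 2.0 * np.cos(np.pi * n * rho)
        for m in range(m_max):
            basis = np.exp(-1j * m * theta)[None, :] * r_n[:, None] / (2.0 * np.pi)
            coeffs[n, m] = np.sum(basis * f_polar * rho[:, None]) * drho * dtheta
    return coeffs

# Rotationally symmetric input: every coefficient with m != 0 vanishes,
# and F_00 of a constant unit-intensity disc is 1/2.
c = art_coefficients(np.ones((64, 128)))
print(round(abs(c[0, 0]), 6), round(abs(c[1, 5]), 6))  # 0.5 0.0
```

The |F_nm| magnitudes are rotation-invariant because rotating the image only changes the phase of each coefficient.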
Curvature Scale Space (CSS) [Mokhtarian '92]
A contour-based shape descriptor
1) Apply a lowpass filter repeatedly until the concave contour segments are smoothed out
2) "How the contour gets filtered" becomes the feature:
• zero crossings of the curvature function after each application of the lowpass filter
• the CSS image
CSS Image
Tracks the zero-crossing locations of each concavity in the contour as filtering proceeds
[Figure: a contour and its curvature zero crossings after 3, 29, and 100 filter iterations]
CSS Features
# of peaks in CSS images
Highest peak
Circularity (perimeter² / area)
Eccentricity
Etc.
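The CSS machinery can be sketched numerically; everything concrete below (the binomial kernel, 256 samples, the five-lobed star contour) is an illustrative assumption, not Mokhtarian's exact formulation:

```python
import numpy as np

def curvature(x, y):
    """Signed curvature of a sampled closed contour (only the sign matters here)."""
    dx, dy = np.gradient(x), np.gradient(y)
    ddx, ddy = np.gradient(dx), np.gradient(dy)
    return (dx * ddy - dy * ddx) / np.maximum((dx**2 + dy**2) ** 1.5, 1e-12)

def css_zero_crossings(x, y, iterations=800):
    """Repeatedly lowpass-filter the contour and record how many curvature
    zero crossings survive each iteration (the raw material of a CSS image)."""
    kernel = np.array([0.25, 0.5, 0.25])          # small binomial lowpass
    counts = []
    for _ in range(iterations):
        # pad with the opposite endpoints so the convolution wraps circularly
        x = np.convolve(np.r_[x[-1], x, x[0]], kernel, mode='valid')
        y = np.convolve(np.r_[y[-1], y, y[0]], kernel, mode='valid')
        k = curvature(x, y)
        counts.append(int(np.sum(np.signbit(k[:-1]) != np.signbit(k[1:]))))
    return counts

# Five-lobed star: 10 zero crossings at first; every concavity is
# eventually smoothed away and the count drops to 0.
t = np.linspace(0.0, 2.0 * np.pi, 256, endpoint=False)
r = 1.0 + 0.3 * np.cos(5 * t)
counts = css_zero_crossings(r * np.cos(t), r * np.sin(t))
print(counts[0], counts[-1])  # 10 0
```

Plotting each crossing's arc-length position against the iteration at which it disappears yields the CSS image, whose peak count and highest peak are the features listed above.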
Texture Features
Wavelet-based Texture Features
[Smith’94]
Wavelet Filter Bank
[Figure: the original image passes through a wavelet filter, splitting into coarse information (low frequency) and detail (high frequency)]
Texture Features from Wavelet
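As a sketch of how texture features come out of the subbands (a plain Haar filter bank and mean/std statistics are simplifying assumptions; Smith & Chang's transform features differ in detail):

```python
import numpy as np

def haar_level(img):
    """One level of a 2-D Haar decomposition: (LL, (LH, HL, HH))."""
    lo = (img[:, 0::2] + img[:, 1::2]) / 2.0      # lowpass along columns
    hi = (img[:, 0::2] - img[:, 1::2]) / 2.0      # highpass along columns
    ll = (lo[0::2, :] + lo[1::2, :]) / 2.0
    lh = (lo[0::2, :] - lo[1::2, :]) / 2.0
    hl = (hi[0::2, :] + hi[1::2, :]) / 2.0
    hh = (hi[0::2, :] - hi[1::2, :]) / 2.0
    return ll, (lh, hl, hh)

def wavelet_texture(img, levels=3):
    """Texture feature: mean absolute value and std of each detail subband
    across several decomposition levels."""
    feats = []
    for _ in range(levels):
        img, details = haar_level(img)            # recurse on the coarse band
        for band in details:
            feats.extend([np.abs(band).mean(), band.std()])
    return np.array(feats)

rng = np.random.default_rng(2)
f = wavelet_texture(rng.random((64, 64)))
print(f.shape)  # (18,): 3 levels x 3 detail bands x 2 statistics
```

Energy in the detail subbands captures texture at different scales and orientations; a smooth region yields near-zero detail statistics everywhere.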
Other approaches: Region-Based
Global features often fail to capture local content in an image
[Figure: a global description of a scene, e.g., {Green, Grassy, Hillside} via color, texture, and shape]
Other approaches: Region-Based
Segmentation-Based
Images are segmented by color/texture similarities: Blobworld [Carson '99], Netra [Ma and Manjunath '99]
Grid-Based
Images are partitioned and features are calculated from blocks: [Tian '00], [Moghaddam '99]
Other approaches: Region-Based
Combine Grid and Segmentation: [Dagli and Huang, ‘04]
Basic Components of CBIR
Feature Extractor
Create the metadata
Query Engine
Calculate similarity
User Interface
Now we have many features (too many?)
Visual Similarity?
"Similarity" is subjective and context-dependent.
"Similarity" is a high-level concept: cars, flowers, …
But our features are low-level features.
Semantic gap!
Which features are most important?
Not all features are always important.
The "similarity" measure is always changing.
The system has to weight features on the fly.
How?
Online Feature Weighting
Approach #1 - Manual
Ask the user to specify numbers:
“35% of color and 50% of texture…”
Very difficult to determine the numbers
Approach #2 - Automatic
Learn feature weights from examples
Relevance Feedback
Online Feature Weighting
From the query examples, the system determines the feature weighting matrix W
[Diagram: Query -> CBIR -> Result, feeding back into "Calculate W"]

\mathrm{distance}(\vec{x}, \vec{y}) = (\vec{x} - \vec{y})^T \, W \, (\vec{x} - \vec{y})
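As code, the weighted distance is a one-liner (the vectors below are toy values; with W = I it reduces to squared Euclidean distance):

```python
import numpy as np

def weighted_distance(x, y, W):
    """Generalized (Mahalanobis-style) distance (x - y)^T W (x - y)."""
    d = np.asarray(x, float) - np.asarray(y, float)
    return float(d @ W @ d)

# With the identity weighting, this is just squared Euclidean distance.
x, y = np.array([1.0, 2.0, 3.0]), np.array([2.0, 2.0, 1.0])
print(weighted_distance(x, y, np.eye(3)))  # 5.0
```

Relevance feedback amounts to re-estimating W so that this distance ranks the relevant images first.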
How to Calculate W?
No negative examples (1-class)
Positive and negative examples (2-class)
One positive and many negative classes ((1+x)-class)
Many positive and many negative classes ((x+y)-class)
When Only Relevant Images Are Available…
We want to give more weight to features that are common among the example images.
Use the variance:
features with low variance
-> common features
-> give higher weight
One-Class Relevance Feedback in MARS [Rui '98]
Calculate the variance among the relevant examples.
The inverse of the variance becomes the weight of each feature.
This means "common features" among the positive examples get larger weights.

W = \mathrm{diag}(1/\sigma_1^2,\ 1/\sigma_2^2,\ 1/\sigma_3^2,\ \ldots,\ 1/\sigma_k^2)

W is a k × k diagonal matrix
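The one-class weighting is a few lines of NumPy (the feature vectors below are toy values; the eps guard against zero-variance features is an added convenience, not part of the slides):

```python
import numpy as np

def variance_weights(positives, eps=1e-6):
    """W = diag(1 / sigma_k^2) computed from relevant examples
    (rows = examples, columns = features)."""
    var = np.var(np.asarray(positives, float), axis=0)
    return np.diag(1.0 / (var + eps))   # eps keeps zero-variance features finite

# The examples agree on feature 0 but disagree on feature 1,
# so feature 0 ends up with the far larger weight.
P = np.array([[0.50, 0.1],
              [0.51, 0.9],
              [0.49, 0.5]])
W = variance_weights(P)
print(W[0, 0] > W[1, 1])  # True
```

The resulting diagonal W plugs directly into the weighted distance above.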
Relevance Feedback as a Two-Class Problem (Positive and Negative)
Fisher's Discriminant Analysis (FDA)
Find a W that:
minimizes the scatter of each class cluster (within scatter)
maximizes the scatter between the clusters (between scatter)
[Figure: positive and negative clusters]
Two-Class Problem
Target function (W is a full matrix):

W = \arg\max_W \frac{W^T S_B W}{W^T S_W W}

S_B: between-scatter matrix
S_W: within-scatter matrix

S_W = \sum_{i=1}^{2} \sum_{j \in \text{class } i} (x_j - m_i)(x_j - m_i)^T

S_B = (m_1 - m_2)(m_1 - m_2)^T

m_1, m_2: means of the two classes
Solution
The problem reduces to a generalized eigenvalue problem:

S_B w_i = \lambda_i S_W w_i

W = \Phi \Lambda^{1/2}

\Lambda: diagonal matrix of the eigenvalues
\Phi: matrix of eigenvectors
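A sketch of the whole two-class pipeline on toy data, solving the generalized eigenproblem via eig(S_W^{-1} S_B) (the regularization term and the synthetic clusters are implementation conveniences, not part of the slides):

```python
import numpy as np

def fda(pos, neg, reg=1e-6):
    """Two-class FDA: build S_W and S_B, then solve the generalized
    eigenproblem S_B w = lambda S_W w via eig(S_W^{-1} S_B)."""
    pos, neg = np.asarray(pos, float), np.asarray(neg, float)
    m1, m2 = pos.mean(axis=0), neg.mean(axis=0)
    # within scatter: each class around its own mean
    sw = (pos - m1).T @ (pos - m1) + (neg - m2).T @ (neg - m2)
    sw += reg * np.eye(sw.shape[0])               # keep S_W invertible
    # between scatter: outer product of the mean difference
    sb = np.outer(m1 - m2, m1 - m2)
    lam, vec = np.linalg.eig(np.linalg.inv(sw) @ sb)
    order = np.argsort(-lam.real)
    return lam.real[order], vec.real[:, order]

# Classes separated along the first axis: the leading eigenvector
# should point (mostly) along that axis.
rng = np.random.default_rng(3)
pos = rng.normal([0.0, 0.0], 0.1, (20, 2))
neg = rng.normal([5.0, 0.0], 0.1, (20, 2))
lam, vec = fda(pos, neg)
print(abs(vec[0, 0]) > abs(vec[1, 0]))  # True
```

Since S_B here has rank one, only the leading eigenvalue is (noticeably) non-zero, and its eigenvector is the discriminating direction.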
From Two-class to (1+x)-class
Positive examples are usually from one class, such as "flower".
Negative examples can be from any classes, such as "car", "elephant", "orange"…
It is not desirable to treat the negative images as one class.
RF as (1+x)-Class Problem
• Biased Discriminant Analysis [Zhou et al. ‘01]
• Negative examples can be any images
• Each negative image has its own group
S_W = \sum_{x \in \text{positive}} (x - m)(x - m)^T

S_B = \sum_{x \in \text{negative}} (x - m)(x - m)^T

where m is the mean of the positive examples
[Figure: a tight positive cluster with scattered negative examples]
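A sketch of the BDA scatter computation on toy data (the eigen-solution step is the same as in the two-class case and is omitted here):

```python
import numpy as np

def bda_scatter(pos, neg):
    """BDA scatter matrices: both are centered on the POSITIVE mean m,
    since the negatives are not assumed to form one coherent class."""
    pos, neg = np.asarray(pos, float), np.asarray(neg, float)
    m = pos.mean(axis=0)
    sw = (pos - m).T @ (pos - m)     # positives scattered around their own mean
    sb = (neg - m).T @ (neg - m)     # negatives scattered away from that mean
    return sw, sb

# Tight positives near the origin, loose negatives far away:
# the "between" scatter dwarfs the "within" scatter.
rng = np.random.default_rng(4)
pos = rng.normal(0.0, 0.1, (10, 3))
neg = rng.normal(3.0, 1.0, (15, 3))
sw, sb = bda_scatter(pos, neg)
print(np.trace(sb) > np.trace(sw))  # True
```

The bias toward the positive mean is the whole point: the transform squeezes the positives together while pushing every negative, whatever its class, away from them.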
RF as (x+y)-Class Problem
Group BDA [Nakazato, Dagli ‘03]
Multiple Positive classes
Scattered Negative classes
S_W = \sum_i \sum_{x \in \text{positive class } i} (x - m_i)(x - m_i)^T

S_B = \sum_i \sum_{y \in \text{negative}} (y - m_i)(y - m_i)^T

m_i: mean of positive class i
[Figure: multiple positive clusters with scattered negative examples]
Basic Components of CBIR
Feature Extractor
Create the metadata
Query Engine
Calculate similarity
User Interface
User Interface and Visualization
Basic GUI
Direct Manipulation GUI
El Nino [UC San Diego]
Image Grouper [Nakazato and Huang]
3D Virtual Reality Display
Traditional GUI for Relevance Feedback
The user selects relevant images (slider or checkbox)
If good images are found, add them
When there are no more images to add, the search converges
ImageGrouper [Nakazato and Huang]
Query by Groups
Make a query by creating groups of images
Easier to try different combinations of query sets (trial-and-error queries)
ImageGrouper
[Figure: the ImageGrouper interface]
Note
Trial-and-error queries are very important because:
image similarity is subjective and context-dependent, and
we are using low-level image features (semantic gap).
Thus, it is VERY difficult to express the user's concept with these features.
Image Retrieval in 3D
Image retrieval and browsing in 3D virtual reality
The user can see more images without occlusion
Query results can be displayed by various criteria:
by color features, by texture, or by a combination of color and texture
3D MARS
[Figure: initial display and query result in 3D, with color, texture, and structure as the axes]
3D MARS in CAVE™
Demos
Traditional GUI
IBM QBIC
• http://wwwqbic.almaden.ibm.com/
UIUC MARS
• http://chopin.ifp.uiuc.edu:8080
ImageGrouper
http://www.ifp.uiuc.edu/~nakazato/grouper
References (Image Features)
Bober, M., “MPEG-7 Visual Descriptors,” In IEEE Transactions on Circuits and
Systems for Video Technology, Vol. 11, No. 6, June 2001.
Stricker, M. and Orengo, M., “Similarity of Color Images,” In Proceedings of SPIE,
Vol. 2420 (Storage and Retrieval of Image and Video Databases III), SPIE Press,
Feb. 1995.
Zhou, X. S. and Huang, T. S., "Edge-based structural features for content-based image retrieval," Pattern Recognition Letters, Special Issue on Image and Video Indexing, 2000.
Smith, J. R. and Chang, S.-F., "Transform features for texture classification and discrimination in large image databases," In Proceedings of IEEE Intl. Conf. on Image Processing, 1994.
Smith, J. R. and Chang, S.-F., "Quad-Tree Segmentation for Texture-based Image Query," In Proceedings of ACM 2nd International Conference on Multimedia, 1994.
Dagli, C. K. and Huang, T.S., “A Framework for Color Grid-Based Image Retrieval,”
In Proceedings of International Conference on Pattern Recognition, 2004.
Tian, Q., et al., "Combine user defined region-of-interest and spatial layout in image retrieval," in IEEE Intl. Conf. on Image Processing, 2000.
Moghaddam, B., et al., "Defining image content with multiple regions-of-interest," in IEEE Workshop on Content-Based Access of Image and Video Libraries, 1999.
56
References (Relevance Feedback)
Rui, Y., et al., "Relevance Feedback: A Power Tool for Interactive Content-Based Image Retrieval," In IEEE Trans. on Circuits and Systems for Video Technology, Vol. 8, No. 5, Sept. 1998.
Zhou, X. S., Petrovic, N. and Huang, T. S. “Comparing
Discriminating Transformations and SVM for Learning
during Multimedia Retrieval.” In Proceedings of ACM
Multimedia ‘01, 2001.
Ishikawa, Y., Subramanya, R. and Faloutsos, C., "MindReader: Querying databases through multiple examples," In Proceedings of the 24th VLDB Conference, 1998.
References (User Interfaces and
Visualizations)
Nakazato, M. and Huang, T. S. “3D MARS: Immersive Virtual
Reality for Content-Based Image Retrieval.“ In Proceedings of
2001 IEEE International Conference on Multimedia and Expo
(ICME2001), Tokyo, August 22-25, 2001
Nakazato, M., Manola, L. and Huang, T.S., “ImageGrouper: Search,
Annotate and Organize Images by Groups,” In Proc. of 5th Intl.
Conf. On Visual Information Systems (VIS’02), 2002.
Nakazato, M., Dagli C.K., and Huang T.S., “Evaluating Group-Based
Relevance Feedback for Content-Based Image Retrieval,” In
Proceedings of International Conference on Image Processing,
2003.
Santini, S. and Jain, R., "Integrated Browsing and Querying for Image Databases," IEEE Multimedia, Vol. 7, No. 3, 2000, pp. 26-39.
Similar Region Shape, Different Contour
[Figure: shapes with similar region content but different contours]