Data Representation and Pattern Recognition in Image Mining-N D Thokare
THOKARE NITIN D
M.E. SSA
Dept of EE, IISc Bangalore
Sr.No. 4910-412-091-07119
Guide: Prof M N Murty
Dept of CSA, IISc Bangalore
3.1.1. Color

3.1.2. Texture and Shape

Widely used textural features are coarseness, contrast, directionality, line-likeness, regularity and roughness. All these features are present in all images at different levels. Similarly, shape information such as lines, circles/ellipses and rectangles is considered as a set of features representing an image. Here, the given image is divided into patches (overlapping or non-overlapping) and the above features (texture and shapes) are extracted from each patch. Collectively, all patches form the feature vector corresponding to that image, which can be used for classification or comparison purposes.

3.2. Image Annotation

Labeling a given image with semantically correct labels that equivalently explain the contents of the image is the task of image annotation algorithms. Image annotation can be done in different ways. Given an image, divide it into segments that correspond to possibly different objects using image segmentation, and apply an object recognition algorithm to each segment. Finally, annotate the image with all the labels found in the object recognition process. As a part of this, I completed face detection and recognition using improved LBP under a Bayesian framework [10]. One result of this work is shown in Figure 1.

But image segmentation itself is an active research area due to the complex variation of object appearance from image to image. Also, for object recognition, every object that may be present in a test image must have been trained for, which is not a feasible task. To avoid such complexity we follow another approach: learning a probabilistic model from each image with its local and/or global features and the corresponding labels given to the image. This work is further explored in the preliminary experiments section.

3.3. Image Retrieval

For image retrieval we can use a text based or a content based approach. In the text based approach, the textual information given about images is analysed and the appropriate label(s) for each image are decided. The result of image annotation can be used as a textual representation of an image for text based image retrieval. This (text based) approach has been in use since the birth of the image retrieval concept. But nowadays the latter concept, i.e. content based image retrieval, is getting more attention from researchers. Here, local and global features like color, texture and shape are extracted from an image and used for comparison between two images. [11] and [12] discuss seven rotation, translation and scale invariant moments in image analysis. The first four of them are as follows:

φ1 = µ20 + µ02 (1)
φ2 = (µ20 − µ02)² + 4µ11² (2)
φ3 = (µ30 − 3µ12)² + (3µ21 − µ03)² (3)
φ4 = (µ30 + µ12)² + (µ21 + µ03)² (4)

where µij is the (i + j)th order normalized central moment. In image retrieval systems, relevance feedback is commonly used to improve the query and obtain more relevant results: using relevance feedback the query can be internally modified and reused. Let X(i) denote the query at the ith step in a particular relevance feedback session; then the modified query X(i+1) can be computed as:
X(i+1) = αX(i) + (β/|R|) Σ_{Yk∈R} Yk − (γ/|N|) Σ_{Yk∈N} Yk

where Yk is an image from the images retrieved at the ith stage, R is the set of relevant examples and N is the set of non-relevant examples decided by the user in feedback, and α, β, γ are constants controlling the importance given to the previous query, the relevant examples and the non-relevant examples respectively.

This is known as hard relevance feedback, where only relevant or non-relevant options are available to the user. Soft relevance feedback can be obtained by providing the user with more than two options [13], such as 'Highly Relevant (HR)', 'Relevant (R)', 'No Opinion (NO)', 'Non-relevant (NR)', 'Highly Non-relevant (HN)'. In this case the modified query can be obtained by

X(i+1) = αX(i) + β (Σ_{Yk∈R} rk Yk)/(Σ_{Yk∈R} rk) + γ (Σ_{Yk∈N} rk Yk)/(Σ_{Yk∈N} rk)

where rk are relevance weights for the different options, for example:

rk = 0.5 if Yk ∈ HR; 0.1 if Yk ∈ R; 0.0 if Yk ∈ NO; −0.1 if Yk ∈ NR; −0.5 if Yk ∈ HN

3.4. Association Rule Mining

Association analysis is useful for discovering useful relationships hidden in large data collections [14]. These relations, if represented as rules, can help the user in important decision making. By collecting the user feedback during relevance feedback sessions, it is possible to extract the user's relevance concept from feedback and association rules, which in turn helps to improve the precision rate.

When a user starts a new query session, the a priori relevance association rules about this query are first retrieved [13] and the results based on these associations are shown. Then, depending upon the soft relevance feedback given by the user, the query can be recomputed and used to further improve the result. The association rules formed from many users' experience are useful for getting better image retrieval results in future sessions.

3.5. Heavy-Tailed Distributions

Heavy-tailed distributions are probability distributions whose tails are not exponentially bounded, i.e. the tails are heavier than those of the exponential distribution. More precisely, if F(x) denotes the cumulative distribution function of a random variable X, and F̄(x) = 1 − F(x), then F(x) is said to be heavy-tailed if [9]

F̄(x) ∼ cx⁻α

where c is a positive constant, 0 < α < 2, and a(x) ∼ b(x) means lim_{x→∞} a(x)/b(x) = 1.

The Pareto distribution, the Cauchy distribution and Zipf's law are some common heavy-tailed distributions. The Cauchy distribution has the density function given by

f(x; x0, γ) = 1/(πγ[1 + ((x − x0)/γ)²]) = (1/π) · γ/((x − x0)² + γ²) (5)

where x0 is the location parameter, specifying the location of the peak of the distribution, and γ is the scale parameter, which specifies the half-width at half-maximum (HWHM).

Zipf's law states that the number of appearances of an object, denoted by R, and its rank, denoted by n, are related by

R = cn⁻β

for some positive constants c and β. For β = 1, Zipf's law states that popularity (R) and rank (n) are inversely proportional.

4. PRELIMINARY EXPERIMENTS AND RESULTS

In this section, details of the image retrieval part and its results are given. The initial approach for the image annotation part is also explained here.

4.1. Image Retrieval

Image retrieval can be done using textual or visual information. Textual image retrieval uses the text available in the information/explanation about an image; textual information also includes labels or annotations of the image. Visual-content based image retrieval, on the other hand, uses the visual information of an image, which can be obtained in the form of different features like color, texture and shape.

In this work we completed the image retrieval part using visual features as follows:

• Initially convert the image from RGB to HSV color space, then divide the image into 6 × 4 (or 4 × 6), i.e. 24, same-size patches (overlapping or non-overlapping) for each of H, S and V separately.

• For each patch find the histogram in the range [0 : 0.25 : 1] and concatenate the three histograms (each with dimension 5) corresponding to the H, S and V planes.

• To add texture features, the grayscale image is transformed into an image containing local shape and texture information with the help of the Local Binary Pattern (LBP), which is an illumination invariant feature [15]. A 256-bin histogram is formed from the image LBP.
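The steps above can be sketched in plain Python. This is a minimal illustration, not the code used in the experiments: the tiny synthetic image, the non-overlapping 6 × 4 patch grid and the basic 3 × 3 LBP are simplifying assumptions, but the resulting descriptor length matches the 616 dimensions used here (24 patches × 15 color bins + 256 LBP bins).

```python
# Sketch of the 616-D descriptor: per-patch H,S,V histograms plus an LBP histogram.
import colorsys

def hsv_histograms(rgb, rows=4, cols=6, bins=5):
    """Per-patch 5-bin histograms of H, S and V; rgb is an HxW grid of (r,g,b) in [0,1]."""
    h_img, w_img = len(rgb), len(rgb[0])
    ph, pw = h_img // rows, w_img // cols          # non-overlapping patches (an assumption)
    feats = []
    for pr in range(rows):
        for pc in range(cols):
            hist = [[0] * bins for _ in range(3)]  # one histogram per H, S, V channel
            for y in range(pr * ph, (pr + 1) * ph):
                for x in range(pc * pw, (pc + 1) * pw):
                    hsv = colorsys.rgb_to_hsv(*rgb[y][x])
                    for c, v in enumerate(hsv):    # bin edges [0 : 0.25 : 1] as in the text
                        hist[c][min(int(v / 0.25), bins - 1)] += 1
            for c in range(3):
                feats.extend(hist[c])
    return feats                                   # 24 patches * 15 bins = 360 values

def lbp_histogram(gray):
    """256-bin histogram of basic 3x3 LBP codes; gray is an HxW grid of ints."""
    hist = [0] * 256
    offs = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]
    for y in range(1, len(gray) - 1):
        for x in range(1, len(gray[0]) - 1):
            code = 0
            for bit, (dy, dx) in enumerate(offs):  # compare each neighbour to the centre
                if gray[y + dy][x + dx] >= gray[y][x]:
                    code |= 1 << bit
            hist[code] += 1
    return hist                                    # 256 values

# Tiny synthetic 8x12 image so the 4x6 patch grid divides evenly.
rgb = [[((x % 3) / 2.0, float(y % 2), 0.5) for x in range(12)] for y in range(8)]
gray = [[int(255 * sum(px) / 3) for px in row] for row in rgb]
descriptor = hsv_histograms(rgb) + lbp_histogram(gray)
print(len(descriptor))  # 360 + 256 = 616
```

The two histogram blocks are simply concatenated, so the color and texture parts can be weighted or swapped independently when experimenting with distance metrics.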
4.2. Image Annotation

[7] discusses probabilistic modelling for image annotation and retrieval using semantic labels given to training images. Using a Gaussian Mixture Model (GMM), the algorithm trains the model from training images provided with labels. On similar lines we propose the following algorithm for image annotation.

Let ω = {ω1, ω2, ..., ωm} be the set of labels given to the images I = {I1, I2, ..., In}. Each image can have multiple labels. To annotate a test image we need to know how the features in the image are related to their corresponding labels. To model these relationships we learn a mixture model, with a heavy-tailed distribution function, corresponding to each ωi ∈ ω. Let Ii ⊆ I be the set of images that are annotated with label ωi. Then for each image:

1. Convert the image from RGB to HSV color space. Divide the HSV color image into 16 × 16 overlapping samples. These samples can be obtained by scanning the image in left-to-right, top-to-bottom sequence with an overlap of 50% area between each adjacent pair.

2. For each patch compute the 616-dimensional feature vector as explained in Section 4.1 above.

PX/W(x/Ii) = Σ_{k=1}^{K} πIi,k F(x, θIi,k)

PW(wi) is computed from the training set as the proportion of images containing annotation wi, and PX is a constant in the computation above for all wi.

7. Annotate the test image with the classes wi having posterior probability, log PW/X(wi/Xt), greater than some threshold.

5. OBSERVATIONS AND CONCLUSIONS

For image representation, the HSV color model is found to retain more visual information and to be more relevant. For texture representation, out of the many methods available, LBP is found to be the simplest and gives more relevant results compared to the others. With the further addition of shape information using image invariant moments, the results were found to be less relevant than those obtained with only HSV color and LBP texture features. Still, for some queries (containing larger objects with similar color and texture but different shapes) the true positives in the results are few and demand shape features to be considered, so a better shape model is necessary.

After experiments with different distance metrics, the Euclidean distance metric as a dissimilarity measure is found to be simple and most appropriate for image retrieval among a collection of a large variety of images.
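The relevance-feedback query updates of Section 3.3 can be sketched as follows. This is a minimal sketch: the rk weights follow the example values given in the text, while α, β, γ and the two-dimensional feature vectors are illustrative assumptions.

```python
# Soft relevance-feedback update:
# X(i+1) = a*X(i) + b*sum_{Yk in R}(rk*Yk)/sum_{R}(rk) + g*sum_{Yk in N}(rk*Yk)/sum_{N}(rk)
WEIGHTS = {"HR": 0.5, "R": 0.1, "NO": 0.0, "NR": -0.1, "HN": -0.5}  # rk per option

def update_query(x, feedback, alpha=0.6, beta=0.3, gamma=0.1):
    """x: current query vector; feedback: (vector, option) pairs from one round."""
    rel = [(v, WEIGHTS[o]) for v, o in feedback if WEIGHTS[o] > 0]  # set R
    non = [(v, WEIGHTS[o]) for v, o in feedback if WEIGHTS[o] < 0]  # set N
    new = [alpha * xi for xi in x]
    for group, coef in ((rel, beta), (non, gamma)):
        wsum = sum(w for _, w in group)
        if wsum:
            # rk-weighted sum of the group's vectors, normalized by the sum of rk
            for v, w in group:
                for d in range(len(new)):
                    new[d] += coef * w * v[d] / wsum
    return new

q = update_query([1.0, 0.0], [([0.0, 1.0], "HR"), ([2.0, 2.0], "HN")])
# q is approximately [0.8, 0.5]
```

With only two options (rk fixed at +1 for relevant, -1 for non-relevant) the same function reduces to the hard relevance feedback rule given first in Section 3.3.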
1. N. Vasconcelos, "From Pixels to Semantic Spaces: Advances in Content-Based Image Retrieval", Computer, Vol. 40, No. 7, pp. 20-26, 2007

2. Arnold W. M. Smeulders, Marcel Worring, Simone Santini, Amarnath Gupta, Ramesh Jain, "Content based image retrieval at the end of the early years", IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 22, pp. 1349-1380, 2000

8. J. Li, J. Z. Wang, "Real-Time Computerized Annotation of Pictures", IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 30, Issue 6, pp. 985-1002, 2008

9. Mark Crovella, "Performance Evaluation with Heavy Tailed Distributions", Revised Papers from the 7th International Workshop on Job Scheduling Strategies for Parallel Processing, pp. 1-10, June 16, 2001

10. H. Jin, Q. Liu, H. Lu, X. Tong, "Face Detection Using Improved LBP Under Bayesian Framework", Third International Conference on Image and Graphics, pp. 306-309, 2004

11. M. K. Hu, "Visual Pattern Recognition by Moment Invariants", IRE Transactions on Information Theory, Vol. 8, Issue 2, pp. 179-187, 1962

12. J. Flusser, "Moment Invariants in Image Analysis", Proceedings of World Academy of Science, Engineering and Technology, Vol. 11, pp. 196-201, 2006

13. Peng-Yeng Yin, Shin-Huei Li, "Content-based image retrieval using association rule mining with soft relevance feedback", Journal of Visual Communication and Image Representation, Vol. 17, Issue 5, pp. 1108-1125, 2006

14. P. N. Tan, M. Steinbach, V. Kumar, "Introduction to Data Mining", Pearson, 2009

15. T. Ojala, M. Pietikäinen, T. Mäenpää, "Multiresolution Gray-scale and Rotation Invariant Texture Classification with Local Binary Patterns", IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 24, No. 7, pp. 971-987, July 2002

16. Simplicity1000 Image dataset, "http://wang.ist.psu.edu/docs/related/"

17. R. O. Duda, P. E. Hart, D. G. Stork, "Pattern Classification", 2nd edition, 2000