Al Kadi 2008
Keywords: Image classification, texture analysis, Bayesian classifier

Abstract

This paper aims to improve the accuracy of texture classification by extracting texture features using five different texture measures and classifying the patterns with a naïve Bayesian classifier. Three statistical-based and two model-based methods are used to extract texture features from eight different texture images, and their accuracy is ranked after using each method individually and in pairs. The accuracy improved to 97.01% when the two model-based methods, the Gaussian Markov random field (GMRF) and fractional Brownian motion (fBm), were used together for classification, exceeding the highest accuracy achieved using each of the five methods alone; the model-based methods also proved better at classification than the statistical methods. In addition, using the GMRF with statistical-based methods, namely the grey level co-occurrence matrix (GLCM) and run-length matrix (RLM), improved the overall accuracy to 96.94% and 96.55%, respectively.

1. Introduction

… pixels, while the latter exploits the self-similarity of texture at varying scales. For the statistical-based methods, first- and second-order statistics are derived by analysing the spatial distributions of pixel grey-level values. The grey level co-occurrence, run-length and autocovariance function methods were selected for feature extraction.

The texture features obtained by the different methods are used individually and in combination with each other for classification. A supervised learning approach was adopted for training and testing, using a naïve Bayesian classifier on features extracted from image segments sampled from each image class.

2. Methodology

2.1 Data set preparation

Different types of texture images, ranging from fine to coarse, were used for texture classification in this paper, as shown in Fig. 1. Eight texture images of size 256 x 256 with 8-bit grey levels were selected from the Brodatz album [2]. Each image, which defines a separate class, was divided into 32 x 32 image segments with 50% overlap. One quarter of the image segments (64 samples) from each class were used for training, while the rest (192 samples) were used for testing.
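As a rough illustration of this segmentation scheme (a minimal sketch, not the authors' code), the following Python divides a 256 x 256 grey-level image into 32 x 32 windows with a 16-pixel stride (i.e., 50% overlap) and holds out 64 segments per class for training. Note that a plain sliding window over a 256 x 256 image yields 15 x 15 = 225 segments rather than the 256 implied by the 64/192 split; the exact border handling is not specified in the text, so the counts below are specific to this sketch.

```python
import numpy as np

def extract_segments(image, size=32, overlap=0.5):
    """Divide a grey-level image into overlapping square segments.

    With size=32 and overlap=0.5 the window slides in steps of 16 pixels.
    """
    step = int(size * (1 - overlap))          # 16-pixel stride for 50% overlap
    h, w = image.shape
    segments = []
    for i in range(0, h - size + 1, step):
        for j in range(0, w - size + 1, step):
            segments.append(image[i:i + size, j:j + size])
    return np.array(segments)

def train_test_split_segments(segments, n_train=64, seed=0):
    """Hold out n_train segments per class for training, the rest for testing."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(segments))
    return segments[idx[:n_train]], segments[idx[n_train:]]

# Example on a synthetic 256 x 256 8-bit "texture"
img = (np.arange(256 * 256).reshape(256, 256) % 256).astype(np.uint8)
segs = extract_segments(img)                  # 15 x 15 = 225 segments
train, test = train_test_split_segments(segs)
```

Each 32 x 32 segment would then be passed to the feature extraction methods described below.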
$$
\begin{aligned}
s_{ij;1} &= I_{i-1,j} + I_{i+1,j}, & s_{ij;3} &= I_{i-2,j} + I_{i+2,j}, & s_{ij;5} &= I_{i-1,j-1} + I_{i+1,j+1} \\
s_{ij;2} &= I_{i,j-1} + I_{i,j+1}, & s_{ij;4} &= I_{i,j-2} + I_{i,j+2}, & s_{ij;6} &= I_{i-1,j+1} + I_{i+1,j-1}
\end{aligned}
$$

For an image segment of size M x N, the GMRF parameters α and σ are estimated using the least square error estimation method, as follows:

$$
\begin{pmatrix} \alpha_1 \\ \vdots \\ \alpha_n \end{pmatrix}
= \left\{ \sum_{ij}
\begin{bmatrix}
s_{ij;1}\,s_{ij;1} & \cdots & s_{ij;1}\,s_{ij;n} \\
\vdots & \ddots & \vdots \\
s_{ij;n}\,s_{ij;1} & \cdots & s_{ij;n}\,s_{ij;n}
\end{bmatrix}
\right\}^{-1}
\sum_{ij} I_{ij}
\begin{pmatrix} s_{ij;1} \\ \vdots \\ s_{ij;n} \end{pmatrix}
\qquad (2)
$$

$$
\sigma^2 = \frac{\displaystyle\sum_{ij} \Bigl[ I_{ij} - \sum_{l=1}^{n} \alpha_l\, s_{ij;l} \Bigr]^2}{(M-2)(N-2)}
\qquad (3)
$$

By operating pixel by pixel, an FD (fractal dimension) image was generated for each sample image segment, in which each pixel has its own FD value. First-order statistical features were then derived from it, namely: mean, variance, lacunarity (i.e., variance divided by mean), skewness and kurtosis.

2.2.2 Statistical-based features methods

Co-occurrence matrices

The grey level co-occurrence matrix (GLCM) $P_g(i, j \mid \theta, d)$ represents the joint probability of certain sets of pixels having certain grey-level values. It counts how many times a pixel with grey level i occurs jointly with another pixel having grey level j. By varying the displacement vector d between each pair of pixels, many GLCMs with different directions can be generated. For each sample image segment, with the distance set to one, four GLCMs
                          Texture feature extraction method
Texture   GLCM              GMRF              RLM               fBm               ACF
type      Train    Test     Train    Test     Train    Test     Train    Test     Train    Test
D16       100%     82.81%   100%     99.48%   100%     77.08%   98.44%   83.33%   7.81%    1.56%
D20       100%     100%     100%     98.96%   100%     100%     100%     80.73%   95.31%   86.46%
D74       100%     100%     98.44%   83.85%   98.44%   98.44%   100%     97.40%   21.88%   25.52%
D24       100%     88.54%   95.31%   91.67%   100%     85.94%   96.88%   76.56%   21.88%   12.50%
D93       100%     99.48%   100%     98.44%   98.44%   88.02%   90.63%   83.85%   56.25%   54.69%
D98       100%     97.40%   100%     100%     100%     92.19%   93.75%   96.35%   4.69%    0.52%
D106      100%     97.92%   100%     99.48%   100%     98.44%   92.19%   76.04%   45.31%   41.15%
D112      100%     100%     89.06%   91.67%   100%     89.06%   95.31%   91.67%   65.63%   45.31%
Overall   100%     95.77%   97.85%   95.44%   99.61%   91.15%   95.90%   85.74%   39.84%   33.46%

Table I: Overall accuracy of classification using each texture feature extraction method individually
having directions of 0°, 45°, 90° and 135° were generated. Having normalized the GLCM, eight second-order statistical features, also known as Haralick features [6], can then be derived for each sample; these are: contrast, correlation, energy, entropy, homogeneity, dissimilarity, inverse difference moment and maximum probability.

Run-length matrices

The grey level run-length matrix (RLM) $P_r(i, j \mid \theta)$ is defined as the number of runs of pixels with grey level i and run length j in a given direction θ [7]. RLMs were generated for each sample image segment with directions of 0°, 45°, 90° and 135°, and the following five statistical features were then derived: short run emphasis, long run emphasis, grey level non-uniformity, run length non-uniformity and run percentage.

Autocovariance function

The autocovariance function (ACF) is the autocorrelation function after subtracting the mean. It is a way to investigate non-randomness by looking for the replication of certain patterns in an image. The ACF is defined as:

$$
\rho(x, y) = \frac{\displaystyle\sum_{i=1}^{M-x} \sum_{j=1}^{N-y} \bigl( I(i,j) - \mu \bigr)\bigl( I(i+x, j+y) - \mu \bigr)}{(M-x)(N-y)}
\qquad (5)
$$

where $I(i, j)$ is the grey value of an M x N image, µ is the mean of the image before processing, and x, y are the amounts of shift. After calculating the ACF for each sample image segment, the peaks of its horizontal and vertical margins were fitted with an exponential function using least squares. Each sample is therefore represented by four different parameters, which are the horizontal and vertical margin values referring to the ACF and exponential fittings.

2.3 Classification algorithm

The naïve Bayesian classifier (nBC) is a simple probabilistic classifier which assumes that the attributes are independent. Yet it is a robust method with, on average, good classification accuracy, even in the possible presence of dependent attributes [8]. From Bayes' theorem,

$$
P(C_i \mid X) = \frac{P(X \mid C_i)\, P(C_i)}{P(X)}
\qquad (6)
$$

Given a data sample X representing the extracted texture feature vector $(f_1, f_2, f_3, \ldots, f_j)$ with probability density function (PDF) $P(X \mid C_i)$, we aim to maximize the posterior probability $P(C_i \mid X)$ (i.e., assign sample X to the class $C_i$ that yields the highest probability value). $P(C_i \mid X)$ is the probability of class i given feature vector X; $P(X \mid C_i)$ is the likelihood of X given class i; $P(C_i)$ is the probability that class i occurs in the whole data set; and $P(X)$ is the probability of occurrence of feature vector X in the data set.

$P(C_i)$ and $P(X)$ can be ignored, since we assume they are equally probable for all samples. Maximizing $P(C_i \mid X)$ is then equivalent to maximizing $P(X \mid C_i)$, which can be estimated using maximum likelihood after assuming a Gaussian PDF [9], as follows:

$$
P(X \mid C_i) = \frac{1}{(2\pi)^{n/2}\, \lvert \Sigma_i \rvert^{1/2}}
\exp\!\Bigl[ -\tfrac{1}{2}\, (X - \mu_i)^{T}\, \Sigma_i^{-1}\, (X - \mu_i) \Bigr]
\qquad (7)
$$
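The Gaussian decision rule above (Eqs. 6 and 7) can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the small ridge term added to each covariance matrix (to guarantee invertibility) and the synthetic two-class feature data are assumptions of this sketch, not part of the paper.

```python
import numpy as np

def fit_classes(X, y):
    """Estimate the mean vector mu_i and covariance Sigma_i of each class (Eq. 7)."""
    params = {}
    for c in np.unique(y):
        Xc = X[y == c]
        mu = Xc.mean(axis=0)
        # Small ridge (an assumption of this sketch) keeps Sigma_i invertible
        sigma = np.cov(Xc, rowvar=False) + 1e-6 * np.eye(X.shape[1])
        params[c] = (mu, sigma)
    return params

def log_likelihood(x, mu, sigma):
    """Log of the Gaussian PDF in Eq. (7); working in logs avoids underflow."""
    n = len(mu)
    d = x - mu
    _, logdet = np.linalg.slogdet(sigma)
    return -0.5 * (n * np.log(2 * np.pi) + logdet + d @ np.linalg.solve(sigma, d))

def classify(x, params):
    """Assign x to the class C_i maximizing P(X | C_i), as in Eq. (6)."""
    return max(params, key=lambda c: log_likelihood(x, *params[c]))

# Two well-separated synthetic "texture feature" clusters, 64 samples each
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 1, (64, 3)), rng.normal(8, 1, (64, 3))])
y = np.array([0] * 64 + [1] * 64)
params = fit_classes(X, y)
label = classify(np.array([7.5, 8.2, 8.0]), params)  # -> 1
```

Because equal priors are assumed, comparing log-likelihoods directly is equivalent to comparing the posteriors of Eq. (6).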
where $\Sigma_i$ and $\mu_i$ are the covariance matrix and mean vector of the feature vectors X of class $C_i$; $\lvert \Sigma_i \rvert$ and $\Sigma_i^{-1}$ are the determinant and inverse of the covariance matrix; and $(X - \mu_i)^T$ is the transpose of $(X - \mu_i)$.

3. Experimental Results and Discussion

Initially, each method was applied individually to each of the eight texture images to show which gives the better classification accuracy. The overall classification accuracies for each of the eight Brodatz images are shown in Table I. The GLCM and GMRF achieved the highest classification rates with 95.77% and 95.44%, while the RLM and fBm scored 91.15% and 85.74%, respectively. All methods achieved relatively good accuracy except for the ACF.

The method that achieved the lowest misclassification across all eight texture images was the GMRF. Its lowest accuracy was on the third texture (D74) with 83.85%, due to the large structure of that image texture lying beyond the size of the third-order neighbourhood used to capture it.

The highest classification accuracy, achieved by the GLCM method, was then set as the accuracy improvement criterion for the next part, where texture features from different methods are combined to investigate whether they may assist in increasing the accuracy rate.

It was found that using the two model-based texture feature methods (GMRF and fBm) together improved the overall accuracy to 97.01%. Also, combining the statistical-based RLM and GLCM with the GMRF texture features increased the overall accuracy to 96.94% and 96.55%, respectively. The RLM with the fBm gave nearly the same accuracy as using the GLCM alone, while the remaining combinations scored below the predefined accuracy improvement criterion. Table II summarizes the classification accuracies and the number of texture features used for all possible paired combinations.

It was also noticed that the GMRF texture features appear in all the paired combinations which improved the overall accuracy, and with only 12 features the GMRF and fBm combination surpassed the accuracy achieved by the GLCM individually, which needs 32 features. Combining more than two methods did not improve the accuracy much (e.g., GMRF with fBm and RLM achieved the highest with 97.07% classification accuracy) and increases the computation time as well. Accuracy could be further improved if a feature selection method were used to remove possibly highly correlated features; this needs to be further investigated.

Combined methods   No. of features   Train set accuracy   Test set accuracy
GMRF & fBm         12                99.80%               97.01%
GMRF & RLM         27                100%                 96.94%
GMRF & GLCM        39                100%                 96.55%
RLM & fBm          25                100%                 95.70%
GLCM & ACF         36                100%                 95.05%
fBm & GLCM         37                100%                 94.66%
GMRF & ACF         11                97.66%               92.84%
GLCM & RLM         52                100%                 92.12%
RLM & ACF          24                99.61%               89.58%
fBm & ACF          9                 96.48%               85.48%

Table II: Overall accuracy of texture feature extraction methods combined in pairs

4. Conclusion

As texture feature extraction methods tend to capture different image texture characteristics, using different combinations could assist in improving the classifier accuracy. Using an nBC, it was shown that the combined model-based texture feature extraction methods (GMRF with fBm) proved better at classification than the statistical methods. The combined model-based features improved the overall classification accuracy above the highest achieved using each of the five methods individually. Moreover, using the GMRF features with the statistical methods (RLM and GLCM) improved the overall accuracy as well.

5. References

[1] M. Tuceryan and A. Jain, The Handbook of Pattern Recognition and Computer Vision, 2nd ed.: World Scientific Publishing Co., 1998.
[2] P. Brodatz, Textures: A Photographic Album for Artists and Designers. New York: Dover, 1966.
[3] M. Petrou and P. García Sevilla, Image Processing: Dealing with Texture: Wiley, 2006.
[4] B. B. Mandelbrot, The Fractal Geometry of Nature. San Francisco, CA: Freeman, 1982.
[5] C. C. Chen, J. S. Daponte, and M. D. Fox, "Fractal Feature Analysis and Classification in Medical Imaging," IEEE Transactions on Medical Imaging, vol. 8, pp. 133-142, 1989.
[6] R. M. Haralick, K. Shanmugam, and I. Dinstein, "Textural Features for Image Classification," IEEE Transactions on Systems, Man and Cybernetics, vol. SMC-3, pp. 610-621, 1973.
[7] M. M. Galloway, "Texture Analysis Using Gray Level Run Lengths," Computer Graphics and Image Processing, vol. 4, pp. 172-179, 1975.
[8] F. Demichelis, P. Magni, P. Piergiorgi, M. A. Rubin, and R. Bellazzi, "A Hierarchical Naïve Bayes Model for Handling Sample Heterogeneity in Classification Problems: An Application to Tissue Microarrays," BMC Bioinformatics, vol. 7, 2006.
[9] R. C. Gonzalez and R. E. Woods, Digital Image Processing, 2nd ed.: Prentice Hall, 2002.