
Sparse Coded Spatial Pyramid Matching and Multi-Kernel - CP

This paper presents a novel approach for non-linear scene classification using a sparse coded multi-scale method combined with multi-kernel integrated support vector machines (SVM). The proposed method enhances classification accuracy by optimizing feature extraction through parameterization of SIFT and the fusion of multiple kernels, achieving a maximum accuracy of 91.12%. The study demonstrates the effectiveness of this approach over traditional linear multi-kernel SVM methods, particularly for large class natural scene datasets.

Journal of ELECTRICAL ENGINEERING, VOL 72(2021), NO6, 374–380

sciendo
PAPERS
Sparse coded spatial pyramid matching and multi-kernel
integrated SVM for non-linear scene classification

Bhavinkumar Gajjar1∗, Hiren Mewada2, Ashwin Patani1

Support vector machine (SVM) techniques and deep learning have been prevalent in object classification for many years. However, deep learning is computation-intensive and can require a long training time. SVM is significantly faster than a convolutional neural network (CNN). However, the application of SVM to mid-size datasets has been limited because it requires proper tuning. Recently, the parameterization of multiple kernels has shown greater flexibility in the characterization of the dataset. Therefore, this paper proposes a sparse coded multi-scale approach to reduce the training complexity and tuning of SVM using a non-linear fusion of kernels for large-class natural scene classification. The optimum features are obtained by parameterizing the dictionary, the scale-invariant feature transform (SIFT) parameters, and the fusion of multiple kernels. Experiments were conducted on a large dataset to examine the capability of the multi-kernel space to find distinct features for better classification. The proposed approach is found to be more promising than linear multi-kernel SVM approaches, achieving a maximum accuracy of 91.12 %.
Keywords: multiple kernel learning, support vector machine, classification, SIFT, spatial pyramid matching (SPM)

1 Introduction

Kernels play a significant role in machine learning approaches to various computer vision applications and in many classification algorithms, including SVM, regression algorithms, kernel-based principal component analysis, and convolutional neural networks. CNNs are helpful for high-dimensional data and are increasingly used in image classification and pattern recognition applications. Deep CNNs require a lot of processing power and take a long time to train, making it challenging to extensively examine, repeat, and refine their accuracy. Even though CNNs have gained traction in classification problems, they are more susceptible to becoming trapped in local minima, and over-fitting is a major concern, in contrast to the support vector machine. In addition, the wide variety of kernels supported by SVM can help classify data more accurately for datasets where the class labels and their features are related nonlinearly. Therefore, this paper proposes the use of SVM in multiclass scene classification applications.

For multiclass classification problems, it is difficult to claim accuracy using a single kernel because a large interclass relation may be present, and the limited feature map of two similar categories makes their separation difficult. To address this issue, a multi-kernel approach is necessary for better bifurcation of features derived from objects. Proper kernel formation is complex for the multiclass problem. The goal of the prediction task is to identify a subset of relevant features optimally, and this is the basis for multiple kernel learning (MKL) [1]. However, MKL is computationally very expensive, which limits its use with very large feature sets. In our study, we focus on sparse coded spatial pyramid matching (ScSPM) features to be used for multi-kernel SVM. The types of MKL and their performance are outside the scope of this study. We focus on the accuracy enhancement of multiclass SVM using the simple MKL approach.

This paper studies and compares overall performance with the baseline work [26] and a few well-known SVM-based MKL methods. As there is no universal solution stating that a particular multi-kernel works well with all multiclass scene data [2], we propose kernels based on simple MKL that combine the standard SVM kernels, Gaussian and polynomial. Differentiation between classes can be more accurate when the best combinations of kernel functions are found using MKL.

We report on related work on classification and multiple kernel learning, and explain the implementation of the proposed algorithm with a discussion of the ScSPM feature. After providing experimental results and comparing them with sparse coded SPM and other MKL approaches, we conclude with the future scope for further improvement.

1 Department of Electronics and Communication Engineering, Indus University, Rancharda, Ahmedabad, 382115, India, 2 Department of Electrical Engineering, Prince Mohammad Bin Fahd University, PO Box 1664, Al Khobar 31952, Saudi Arabia, ∗ corresponding author
[email protected]

https://doi.org/10.2478/jee-2021-0053, Print (till 2015) ISSN 1335-3632, On-line ISSN 1339-309X

© This is an open access article licensed under the Creative Commons Attribution-NonCommercial-NoDerivs License (http://creativecommons.org/licenses/by-nc-nd/3.0/).

2 Related work

Vision classification and recognition have gained more importance in the past few years. They consist of three components: point of interest detection, description of the region of interest, and classification. Description of the region includes the extraction of various features that discriminate between scenes, such as local binary patterns, histograms of gradients, and visual words. These robust and powerful feature sets are used in the categorization of the scene. Further improvement in the classification can be achieved using either multiple feature sets or multiple kernels in SVM. Scene classification plays an essential role in urban planning, land management, environment monitoring and exploration, and object classification. This section presents a study of single and multi-kernel SVM for multiclass classification applications.

The combination of multiple descriptors using multiple kernel SVM was proposed in [3] and showed remarkable improvement in varied scene classification. The multi-label least-squares SVM method was proposed in [4]. The authors used a multi-kernel RBF-based SVM for the multi-label scene classification problem. The algorithm was validated on a mixture of four datasets, achieving a maximum accuracy of 85 %. The effect of the kernel in SVM was verified by Kancherla et al [5]. They used different feature sets with various SVM kernels and simulated the algorithm with a 3 to 4 class dataset. They found that the RBF kernel provides a better classification rate, 82.06 % on the MIT dataset, than the other kernels.

A comparative analysis of SVM with the decision tree and k-nearest neighbors (KNN) algorithms for satellite scene classification is presented in [6]. Speeded-up robust features (SURF) and bag of visual words (BOVW) models were used as image features for SVM classification. An SVM-based scene classification model for robotic applications was presented in [7]. Robotic applications require fast execution. Therefore, heuristic metric-based key points were identified from the captured scene and used in the SVM model. The authors conclude that the integration of local binary pattern and SURF features with SVM achieved better accuracy than a VGG-based neural network model. A combined model of SVM and DNN is presented for acoustic scene classification in [8]. A discordancy using scene utterances was created for the SVM model. Dimensionality is reduced using support vector decomposition. Their hybrid DNN model achieved a 66.1 % classification rate. Nazir et al [9] extracted features using the ResNet network, and these features were used to classify actions in videos. They used an RBF-based multi-kernel SVM with an L2 regularization function, achieving 70 % classification accuracy. A hybrid approach of spatial, spectral, and semantic features was proposed in [10] to classify hyperspectral images. Gabor-based structural features are integrated with morphological-based spatial features and with K-means and entropy-based semantic features. A composite kernel corresponding to these three feature types is then created for SVM-based classification, achieving an average accuracy of 98 %.

A multi-label scene classification using multi-instance learning was proposed in [11], where images are categorized into several graphs using multi-instance learning. An RBF kernel with varying parameters was then associated with each graph, and a composite kernel was created for SVM-based classification. The authors obtained a 61 % F1 score on the scene dataset. A further comparison between SVM and CNN was established by Hasan et al [12]. In the SVM approach, they used a combination of linear and RBF kernels to classify the features from hyperspectral images. The features were optimized using principal component analysis (PCA). They showed that SVM using PCA optimization outperforms the CNN, with 98.84 % accuracy against 94 % for the CNN. A multi-kernel SVM based on a fuzzy algorithm was proposed in [13]. Histogram of oriented gradients (HOG) features are extracted from handwritten digit images, and a multi-kernel SVM was proposed using the fuzzy triangle membership function. Patel and Mewada [14] analyzed various large multi-class scene classification algorithms and concluded that the over-fitting and poor convergence of the CNN algorithm limit its use in scene classification. In contrast, the geometrical interpretation of the features in SVM outperforms the NN for a dataset with a large number of classes. The fusion of features like dense SIFT, color SIFT, and structure similarity was used along with localized multi-kernel SVM for real-world scene classification [15]. The authors achieved 81.92 % and 42.12 % accuracy for the Caltech-101 and Caltech-256 datasets respectively. A sparse dictionary approach was presented in [14] to reduce the features and use robust features in SVM. Initially, a rigorous study of various dictionaries, ie DCT, DWT, K-SVD, was presented, and the effect of patches on the features was analyzed. Later, a reduced feature-based multi-kernel SVM was proposed to classify scenes.

In summary, the multi-kernel SVM has played an essential role in many recognition and classification applications. The studies suggest that even though multi-kernel SVM has outperformed recent CNN approaches in classifying scenes amongst a large number of categories, further improvement is required to reduce the misclassification rate for databases containing a large number of classes. Moreover, this can be achieved if robust features are used by reducing redundancy and designing an SVM kernel with optimum parameters that correspond to these feature sets.

3 ScSPM features and MKL implementation

A complete workflow of our algorithm is shown in Fig. 1. We used a pre-trained KSVD dictionary for sparsifying features. For a given dictionary V, the best coefficients U for a signal X can be found using sparse coding. This paper adopts this encoding of SIFT features into sparse code

Fig. 1. Workflow of the proposed algorithm. Blocks: training and testing images; tunable SIFT feature extraction; sample features and offline KSVD codebook learning; sparse coding; spatial pooling; training labels; non-linear multi-kernel SVM classifier; predictions. Dark lines indicate training, dotted lines indicate testing.

and investigates the performance by tuning the parameters. In the proposed algorithm, the conversion of SIFT features from vector quantization to sparse code is

$$\min_{U,V} \sum_{m=1}^{M} \|x_m - u_m V\|_2^2 + \lambda |u_m|, \tag{1}$$

subject to $\|v_k\| \le 1$, $k = 1, 2, \ldots, K$, where $V$ is an $N \times K$ over-complete ($K > N$) dictionary and $u_m$ is the sparse coefficient vector for signal $x_m$.

Here, the L2-norm constraint on $V$ and the L1-norm penalty on $u_m$ are typically applied with regularization parameter $\lambda$. The problem in (1) is convex in $V$ when $U$ is fixed and convex in $U$ when $V$ is fixed, but not in both simultaneously. It can therefore be solved by alternating, for a fixed number of iterations, between optimizing over $V$ and over $U$ while keeping the other fixed. Fixing the codebook $V$, (1) can be solved for each coefficient $u_m$ as a linear regression problem with L1-norm regularization on the sparse coefficients

$$\min_{u_m} \|x_m - u_m V\|_2^2 + \lambda |u_m|. \tag{2}$$

Fixing $U$, the same problem reduces to a least-squares problem with quadratic constraints

$$\min_{V} \|X - U V\|_F^2, \tag{3}$$

subject to $\|v_k\| \le 1$, $\forall k = 1, 2, \ldots, K$, which can be handled cleanly with the Lagrange dual [16].

The ScSPM feature is computed by the histogram pooling method

$$z = \frac{1}{M} \sum_{m=1}^{M} u_m. \tag{4}$$

For a pre-chosen pooling function $F$, the sparse matrix $U$ yields the ScSPM feature

$$z = F(U), \tag{5}$$

where the max pooling function $F$ is applied on each column of the absolute sparse code $U$ as

$$z_j = \max\{|u_{1j}|, |u_{2j}|, \ldots, |u_{Mj}|\}, \tag{6}$$

where $z_j$ is the $j$-th element of $z$, $u_{ij}$ is the element at the $i$-th row and $j$-th column of $U$, and $M$ is the number of local descriptors in the region.

Multiclass classification is usually decomposed into groups of binary $\{1, -1\}$ problems that standard SVM can handle through the well-known one-versus-rest and one-versus-one approaches. In this experiment, we implemented SVM as proposed in [17] to solve the following convex optimization problem using kernel weights $d_m$

$$J(d) = \sum_{p \in P} J_p(d), \tag{7}$$

where $P$ is the set of all pairs to be considered, and $J_p(d)$ is the binary SVM objective value shown in (8) for the classification problem pertaining to pair $p$.

$$J_p(d) = \begin{cases} \max_{\alpha} \; -\dfrac{1}{2} \sum_{i,j} \alpha_i \alpha_j y_i y_j \sum_{m} d_m K_m(z_i, z_j) \\ \text{with } 0 \le \alpha_i \le \dfrac{1}{\nu l} \;\; \forall i, \\ \sum_{i} \alpha_i = 1, \end{cases} \tag{8}$$

where $\alpha_i$ is the Lagrange multiplier and $y_i \in \{1, -1\}$ is the binary label of $z_i$. The gradient of the function in (8) is

$$\frac{\partial J}{\partial d_m} = -\frac{1}{2} \sum_{p \in P} \sum_{i,j} \alpha^{*}_{i,p} \alpha^{*}_{j,p} y_i y_j K_m(z_i, z_j) \quad \forall m, \tag{9}$$

where $\alpha_{i,p}$ is the Lagrange multiplier of the $i$-th example involved in the $p$-th decision function. The Lagrange multipliers are obtained independently for every pair.
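The coding step in (2) and the max pooling in (6) can be illustrated with a minimal sketch. It assumes the iterative shrinkage-thresholding algorithm (ISTA) as a stand-in solver, which is swapped in here for brevity and is not claimed to be the solver used in this work. The function names, the toy dictionary, and the storage of the dictionary as K rows of N values (so that the product u V is defined) are illustrative assumptions.

```python
def soft_threshold(value, threshold):
    """Shrink a scalar toward zero by `threshold` (proximal map of the L1 norm)."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def sparse_code(x, V, lam, n_iter=200):
    """Approximately minimise ||x - u V||_2^2 + lam * |u|_1 over u with ISTA.

    x : one descriptor, a list of N floats
    V : dictionary stored as K atoms (rows) of N floats each (assumed layout)
    """
    K, N = len(V), len(x)
    # 2 * ||V||_F^2 upper-bounds the Lipschitz constant of the smooth term's gradient.
    L = 2.0 * sum(v * v for atom in V for v in atom)
    u = [0.0] * K
    for _ in range(n_iter):
        # Residual r = u V - x.
        r = [sum(u[k] * V[k][n] for k in range(K)) - x[n] for n in range(N)]
        # Gradient of ||x - u V||^2 with respect to u is 2 * r V^T.
        grad = [2.0 * sum(r[n] * V[k][n] for n in range(N)) for k in range(K)]
        # Gradient step of size 1/L followed by soft-thresholding at lam/L.
        u = [soft_threshold(u[k] - grad[k] / L, lam / L) for k in range(K)]
    return u


def max_pool(U):
    """Eq. (6): z_j = max_i |u_ij| over the M sparse codes (rows) of U."""
    return [max(abs(row[j]) for row in U) for j in range(len(U[0]))]


# Toy over-complete dictionary: 3 unit-norm atoms in R^2 (illustrative values only).
V = [[1.0, 0.0], [0.0, 1.0], [0.6, 0.8]]
X = [[1.0, 0.1], [0.2, -0.9]]
U = [sparse_code(x, V, lam=0.4) for x in X]
z = max_pool(U)  # one pooled value per dictionary atom
```

The pooled vector has one entry per dictionary atom, so its length does not depend on how many descriptors fall in the region, which is what allows regions of different sizes in the spatial pyramid to be concatenated.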

Fig. 2. Sample images of datasets used in this experiment: (a) – Caltech-101, (b) – Scene-15

Table 1. Kernels and their parameter values

MKL   Base kernel               Coefficient value                                   Number of kernels
K1    Polynomial                (1, 2, 3)                                           3
K2    Gaussian                  (0.5, 1, 2, 5, 7, 10, 12, 15, 17, 20)               10
K3    Gaussian and polynomial   (0.5, 1, 2, 5, 7, 10, 12, 15, 17, 20), (1, 2, 3)    13
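Table 1 can be read as three banks of base kernels whose weighted sum enters (8). The sketch below builds those banks and evaluates the combined Gram matrix and the gradient in (9) for a single binary problem. Several points are assumptions rather than details stated in the paper: the polynomial coefficients are read as degrees with the form (1 + <a, b>)^degree, the Gaussian coefficients are read as widths sigma with the form exp(-||a - b||^2 / (2 sigma^2)), and the weights d_m are taken as non-negative and summing to one, as in SimpleMKL [17]. All function names are hypothetical.

```python
import math

POLY_DEGREES = (1, 2, 3)                                # Table 1, K1
GAUSS_WIDTHS = (0.5, 1, 2, 5, 7, 10, 12, 15, 17, 20)     # Table 1, K2


def dot(a, b):
    return sum(p * q for p, q in zip(a, b))


def poly_kernel(degree):
    # Assumed form (1 + <a, b>)^degree; the paper lists only the coefficient values.
    return lambda a, b: (1.0 + dot(a, b)) ** degree


def gauss_kernel(sigma):
    # Assumed form exp(-||a - b||^2 / (2 sigma^2)); the paper lists only the values.
    def kernel(a, b):
        sq_dist = sum((p - q) ** 2 for p, q in zip(a, b))
        return math.exp(-sq_dist / (2.0 * sigma ** 2))
    return kernel


KERNEL_BANKS = {
    "K1": [poly_kernel(d) for d in POLY_DEGREES],
    "K2": [gauss_kernel(s) for s in GAUSS_WIDTHS],
}
KERNEL_BANKS["K3"] = KERNEL_BANKS["K2"] + KERNEL_BANKS["K1"]  # 10 + 3 = 13 kernels


def gram(kernel, Z):
    """Gram matrix of one base kernel over the feature vectors in Z."""
    return [[kernel(a, b) for b in Z] for a in Z]


def combined_gram(kernels, weights, Z):
    """sum_m d_m K_m(z_i, z_j), the combined kernel that appears in Eq. (8)."""
    grams = [gram(k, Z) for k in kernels]
    n = len(Z)
    return [[sum(d * G[i][j] for d, G in zip(weights, grams)) for j in range(n)]
            for i in range(n)]


def mkl_gradient(kernels, alpha, y, Z):
    """Eq. (9) for one binary problem: -1/2 sum_ij a_i a_j y_i y_j K_m(z_i, z_j)."""
    n = len(Z)
    return [-0.5 * sum(alpha[i] * alpha[j] * y[i] * y[j] * k(Z[i], Z[j])
                       for i in range(n) for j in range(n))
            for k in kernels]


# Illustrative pooled features and uniform weights on the simplex.
Z = [[0.0, 0.0], [1.0, 0.0], [0.0, 2.0]]
d = [1.0 / len(KERNEL_BANKS["K3"])] * len(KERNEL_BANKS["K3"])
G = combined_gram(KERNEL_BANKS["K3"], d, Z)
```

A combined Gram matrix of this kind can be passed to any SVM solver that accepts a precomputed kernel; the gradient is then used to update the weights d_m between SVM solves.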

4 Experiments and results

Improving the classification accuracy of SVM requires fine-tuning of the kernel within SVM. Therefore, different fusions of kernels are used, and classification accuracy is analysed rigorously. The benchmark datasets Caltech-101 [27] and Scene-15 [28–30] are used in the experiment. The Caltech-101 dataset contains 9145 images in 101 different classes with various object categories like animals, instruments, vehicles, flowers, plants, etc. The dataset contains 40 to 800 images per class. The Scene-15 dataset contains mainly indoor and outdoor scenes like the kitchen, living room, offices, etc. Though the number of classes is smaller, it has low inter-class covariance, making it difficult to achieve high accuracy. A total of 4000 images are available in 15 classes, ranging from 200 to 400 images per class. Sparse coding of features and a multi-resolution approach using spatial pyramid matching, which is robust to local spatial translation [18], are used. This reduces the training complexity from O(n^3) to O(n) and keeps the testing complexity constant.

For Caltech-101, the sparsified SIFT feature sets were split using 15 and 30 training samples per class, with the remainder used for testing. For Scene-15, the dataset was split using 50 and 100 training images per class, with the remainder left for testing. In this experiment, we extracted SIFT features as per our previous work on parametrizing SIFT and sparse dictionaries [26]. A total of six parameters are involved in the extraction of SIFT features: the number of Gaussian functions, their variance, the amount of image scaling, the orientation of the histogram bins, their radius, and the feature vector size. The empirical study presented in [26] suggests that the size of the SIFT feature depends on the number of bins and angles. The analysis indicates that large bins with fewer angles fail to extract local features and hence reduce classification accuracy. Therefore, the proposed method uses 16 orientations and four bins for SIFT features. We omit a detailed discussion of other types of SPM (KSPM and LSPM) and MKL; their results are used only in the comparison. ScSPM [18] used a linear kernel on spatial-pyramid pooling for sparsified SIFT features, whereas in the proposed experiment the linear kernel is replaced by MKL as suggested in [17].

From a rigorous study of the literature and a detailed examination of ScSPM, it has been observed that the patch size relative to the dictionary size, the number of training and testing samples, and the SVM contribute significantly to the improvement of the classification rate.

The patch size relative to the dictionary size contributes to the sparsity of the features as expressed in (3). In the proposed experiment, sizes of 256 × 1024 and 16 × 16 are used for the dictionary and the patch respectively. The dictionary is trained for 30 KSVD iterations. The average coefficient value for the learned KSVD dictionary over 30 iterations is shown in Fig. 3. The one-versus-rest SVM approach is used in training. The fusions of the kernels with the values of their coefficients are listed in Tab. 1. We tested the performance over five independent runs and noted the average accuracy achieved. This experiment was conducted on a machine with an Intel Core i3 at 2.50 GHz, 8 GB RAM, and 64-bit Windows 10.

Table 2 presents a comparison of the obtained results with other state-of-the-art methods.

The proposed method uses the same dictionary size as used in [26] and [18]. Wang et al [31] presented an SVM-based scene classification model where images are characterized using SIFT features obtained from the

Table 2. Comparison with other SPM or multi-kernel approaches

Dataset: Caltech-101

Algorithm                    Average accuracy (%)   Training images   Method name
ScSPM [18]                   67.0 ± 0.45                              SPM sparse coding
                             73.02 ± 0.54           30
BOW(400) [19]                72.02
BOW(1000) [19]               70.11                  30                Bag of words
BOW(4000) [19]               71.24
NBNN [20]                    70.4                   15                Naive-Bayes nearest-neighbor
LVFC-HSF [21]                70.7                                     Local visual feature coding based on heterogeneous structure fusion
                             78.7
CLGC(RGB-RGB) [22]           72.6                   30                Concatenation of local and global color
CSAE [23]                    64.0                   15                Convolutional sparse auto-encoder
                             71.4
LMMK [24]                    62.3                                     Large margin multiple kernel
Parameterizing ScSPM [26]    77.08 ± 0.31           30                Parameterizing SPM sparse coding
Proposed method              79.29 ± 0.43           15                Kernel K1
Proposed method              85.06 ± 0.31           30                Kernel K1
Proposed method              79.87 ± 0.36           15                Kernel K2
Proposed method              85.72 ± 0.47           30                Kernel K2
Proposed method              78.96 ± 0.24           15                Kernel K3
Proposed method              84.97 ± 0.21           30                Kernel K3

Dataset: Scene-15

Algorithm                    Average accuracy (%)   Training images   Method name
ScSPM [18]                   87.28 ± 0.93                             SPM sparse coding
LVFC-HSF [21]                87.23                  100               Local visual feature coding based on heterogeneous structure fusion
OVH [25]                     87.07                                    Orthogonal vector histogram
Parameterizing ScSPM [26]    81.13 ± 0.53                             Parameterizing SPM sparse coding
Proposed method              81.94 ± 0.54           50                Kernel K1
Proposed method              89.12 ± 0.41           100               Kernel K1
Proposed method              83.32                  50                Kernel K2
Proposed method              91.12 ± 0.57           100               Kernel K2
Proposed method              85.55 ± 0.40           50                Kernel K3
Proposed method              90.52 ± 0.21           100               Kernel K3
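The protocol behind the proposed-method rows of Table 2, a fixed number of training images per class, the remainder for testing, and an average over five independent runs, can be sketched as follows. This is a minimal illustration under stated assumptions: the ± entries are assumed to be the sample standard deviation over the runs (the paper does not say which spread measure is reported), the `evaluate` callback stands for whatever feature extraction and classifier are plugged in, and all function names are hypothetical.

```python
import random
import statistics


def split_per_class(labels, n_train, rng):
    """Pick n_train random indices per class for training; the rest go to testing."""
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)
    train, test = [], []
    for label in sorted(by_class):
        idxs = by_class[label][:]
        rng.shuffle(idxs)
        train.extend(idxs[:n_train])
        test.extend(idxs[n_train:])
    return train, test


def summarize(accuracies):
    """Mean and sample standard deviation over independent runs."""
    return statistics.mean(accuracies), statistics.stdev(accuracies)


def run_protocol(labels, n_train, evaluate, n_runs=5, seed=0):
    """Repeat the random split n_runs times and summarise the accuracies.

    `evaluate(train_idx, test_idx)` is a user-supplied callback (hypothetical)
    that trains on train_idx and returns the accuracy on test_idx.
    """
    accuracies = []
    for run in range(n_runs):
        rng = random.Random(seed + run)  # a fresh, reproducible split per run
        train_idx, test_idx = split_per_class(labels, n_train, rng)
        accuracies.append(evaluate(train_idx, test_idx))
    return summarize(accuracies)
```

Seeding each run separately keeps the splits reproducible while still making the five runs independent of one another.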

spatial pyramid. For the Caltech-256 dataset, their accuracy is limited to only 31 % because a single-kernel SVM is used in classification. In the proposed algorithm, selective sparsified features with a combination of Gaussian kernels in SVM can transform the data into a more separable dimension space. Performance is therefore better than the SPM-based SIFT approach [18], where only local features of different scales are concatenated. Similarly, our previous work [26] shows that SIFT feature tuning can outperform ScSPM. In this work, we used the tuned SIFT features with MKL so that a better grouping of homogeneous features can be achieved. Our method shows ∼ 8 % higher accuracy than [26]. LVFC-HSF offers greater optimization between local and global features, but it cannot achieve more linearity in a higher dimension space than our algorithm. In the LMMK method [24], a higher weight is assigned to features that discriminate more between classes. The base kernels were calculated from the ten different image descriptors given in [24] using a Gaussian function, increasing the computation complexity for higher dimensional feature vectors. The overall accuracy they reported is ∼ 3 % higher than ours, but in their experiment the test images from each class are fixed to 15. For the Scene-15 dataset, OVH [25] calculates a global rotation-invariant geometric visual word to relate with BoVW as spatial information but cannot take advantage of distinct local information.

Fig. 3. Average coefficient value for each KSVD iteration on the Caltech-101 and Scene-15 datasets (vertical axis: sparsity; horizontal axis: iteration, 1 to 29)

The proposed approach increases the accuracy to 85.72 % for the large multiclass Caltech-101 dataset and 91.12 % for the Scene-15 dataset. Feature classification using kernels K1, K2, and K3 provides a better classification rate than large margin multiple kernel (LMMK). The improvement of the sparse-based learning model depends on the configuration of the parameters in the sparse dictionary and the extraction of robust feature sets. Also, the optimum selection of dictionary size, patch size, and integration of the kernels plays a vital role in classifying a large, confusing multi-label dataset. The non-linear nature of the polynomial and Gaussian kernels helped distinguish these features in SVM, and hence the proposed model achieved a better classification rate.

5 Conclusions

CNN has gained great popularity in classification models at the cost of long training time and increased computation cost. In comparison with CNN, SVM is found to have greater flexibility in characterization if an appropriate kernel is used for challenging datasets. A single kernel limits its application to datasets that admit linear classification. Therefore, multi-kernel SVM was revisited experimentally with the aim of optimizing the selection of the kernels and studying the various parameters affecting kernel performance in classification. The investigation of simple MKL over ScSPM features for classification accuracy is presented first, and the role of various parameters has been explored to minimize redundant features. Then a sparse dictionary is created to minimize the feature size.

After obtaining the maximum sparsity of the dictionary, the effect of MKL on overall classification accuracy is presented. We noted that even with a minimal combination of a single kernel type, such as polynomial, as shown in Tab. 2, the accuracy is higher than that of the single-kernel SVM algorithm. Multiple combinations of Gaussian kernels increase the classification accuracy to 85.72 % for the 101-class dataset. We observe that training time and storage requirements also increase with a higher number of Gaussian kernels, which makes it difficult to work on a large dataset like Caltech-256 with minimal hardware. Hence we conclude that, with good features and multiple kernels, object recognition is still an open area of work. Coral reef classification using image augmentation in [32] gives promising results on a limited dataset. In that work, RGB and gray colours were used as features for predicting corals that are most similar to one another. In this work, many classes in each dataset have this kind of similarity, eg sunflower and water lily in Caltech-101, and bedroom and living room in Scene-15. In future work, we will examine the effect of this feature on similar classes.

References

[1] H. Liu and L. Yu, “Toward integrating feature selection algorithms for classification and clustering”, IEEE Transactions on knowledge and data engineering, vol. 17, no. 4, pp. 491–502, 2005.
[2] S. S. Bucak, R. Jin, and A. K. Jain, “Multiple kernel learning for visual object recognition: A review”, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 36, no. 7, pp. 1354–1369, 2013.
[3] M. Varma and D. Ray, “Learning the discriminative power-invariance trade-off”, in 2007 IEEE 11th International Conference on Computer Vision, pp. 1–8, IEEE, 2007.
[4] S. Xu and X. An, “ML2S-SVM: multi-label least-squares support vector machine classifiers”, The Electronic Library, 2019.
[5] D. Kancherla, J. D. Bodapati, and N. Veeranjaneyulu, “Effect of different kernels on the performance of an SVM based classification”, Int. J. Recent Technol. Eng, no. 5, pp. 1–6, 2019.
[6] S. Bouteldja and A. Kourgli, “A comparative analysis of SVM, k-NN, and decision trees for high resolution satellite image scene classification”, in Twelfth International Conference on Machine Vision (ICMV 2019), vol. 11433, p. 114331I, International Society for Optics and Photonics, 2020.
[7] D. Santos, E. Lopez-Lopez, X. M. Pardo, R. Iglesias, S. Barro, and X. R. Fdez-Vidal, “Robust and fast scene recognition in robotics through the automatic identification of meaningful images”, Sensors, vol. 19, no. 18, p. 4024, 2019.
[8] X. Bai, J. Du, Z.-R. Wang, and C.-H. Lee, “A hybrid approach to acoustic scene classification based on universal acoustic models”, in Interspeech, pp. 3619–3623, 2019.
[9] S. Nazir, Y. Qian, M. Yousaf, S. A. V. Carroza, E. Izquierdo, and E. Vazquez, “Human action recognition using multi-kernel learning for temporal residual network”, 2019.
[10] Y. Wang, W. Yu, and Z. Fang, “Multiple kernel based SVM classification of hyperspectral images by combining spectral, spatial, and semantic information”, Remote Sensing, vol. 12, no. 1, p. 120, 2020.
[11] C. Tong-Tong, L. Chan-Juan, Z. Hai-Lin, Z. Shu-Sen, L. Ying, and D. Xin-Miao, “A multi-instance multi-label scene classification method based on multi-kernel fusion”, in 2015 SAI Intelligent Systems Conference (IntelliSys), pp. 782–787, IEEE, 2015.
[12] H. Hasan, H. Z. Shafri, and M. Habshi, “A comparison between support vector machine (SVM) and convolutional neural network (CNN) models for hyperspectral image classification”, in IOP Conference Series: Earth and Environmental Science, vol. 357, p. 012035, IOP Publishing, 2019.
[13] A. Sampath and N. Gomathi, “Fuzzy-based multi-kernel spherical support vector machine for effective handwritten character recognition”, Sadhana, vol. 42, no. 9, pp. 1513–1525, 2017.
[14] H. Patel and H. Mewada, “Analysis of machine learning based scene classification algorithms and quantitative evaluation”, International Journal of Applied Engineering Research, vol. 13, no. 10, pp. 7811–7819, 2018.
[15] F. Zamani and M. Jamzad, “A feature fusion based localized multiple kernel learning system for real world image classification”, EURASIP Journal on image and Video processing, vol. 2017, no. 1, pp. 1–11, 2017.
[16] H. Lee, A. Battle, R. Raina, and A. Y. Ng, “Efficient sparse coding algorithms”, in Advances in neural information processing systems, pp. 801–808, 2007.

[17] A. Rakotomamonjy, F. Bach, S. Canu, and Y. Grandvalet, “SimpleMKL”, Journal of Machine Learning Research, vol. 9, pp. 2491–2521, 2008.
[18] J. Yang, K. Yu, Y. Gong, and T. Huang, “Linear spatial pyramid matching using sparse coding for image classification”, in 2009 IEEE Conference on computer vision and pattern recognition, pp. 1794–1801, IEEE, 2009.
[19] H. Liao, J. Xiang, W. Sun, and S. Yu, “Adaptive aggregating multi-resolution feature coding for image classification”, Mathematical Problems in Engineering, vol. 2014, 2014.
[20] O. Boiman, E. Shechtman, and M. Irani, “In defense of nearest neighbor based image classification”, in 2008 IEEE Conference on Computer Vision and Pattern Recognition, pp. 1–8, IEEE, 2008.
[21] G. Lin, C. Fan, H. Zhu, Y. Miu, and X. Kang, “Visual feature coding based on heterogeneous structure fusion for image classification”, Information Fusion, vol. 36, pp. 275–283, 2017.
[22] L. Kabbai, M. Abdellaoui, and A. Douik, “Image classification by combining local and global features”, The Visual Computer, vol. 35, no. 5, pp. 679–693, 2019.
[23] W. Luo, J. Li, J. Yang, W. Xu, and J. Zhang, “Convolutional sparse autoencoders for image classification”, IEEE transactions on neural networks and learning systems, vol. 29, no. 7, pp. 3289–3294, 2017.
[24] B. Hosseini and B. Hammer, “Large-margin multiple kernel learning for discriminative features selection and representation learning”, in 2019 International Joint Conference on Neural Networks (IJCNN), pp. 1–8, IEEE, 2019.
[25] B. Zafar, R. Ashraf, N. Ali, M. Ahmed, S. Jabbar, and S. A. Chatzichristofis, “Image classification by addition of spatial information based on histograms of orthogonal vectors”, PloS one, vol. 13, no. 6, p. e0198175, 2018.
[26] B. Gajjar and H. M. A. Patani, “Parameterizing SIFT and sparse dictionary for SVM based multi-class object classification”, International Journal of Artificial Intelligence, vol. 19, pp. 95–108, 2021.
[27] L. Fei-Fei, R. Fergus, and P. Perona, “One-shot learning of object categories”, IEEE transactions on pattern analysis and machine intelligence, vol. 28, no. 4, pp. 594–611, 2006.
[28] A. Oliva and A. Torralba, “Modeling the shape of the scene: A holistic representation of the spatial envelope”, International journal of computer vision, vol. 42, no. 3, pp. 145–175, 2001.
[29] L. Fei-Fei and P. Perona, “A Bayesian hierarchical model for learning natural scene categories”, in 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR’05), vol. 2, pp. 524–531, IEEE, 2005.
[30] S. Lazebnik, C. Schmid, and J. Ponce, “Beyond bags of features: Spatial pyramid matching for recognizing natural scene categories”, in 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR’06), vol. 2, pp. 2169–2178, IEEE, 2006.
[31] H.-H. Wang, C.-W. Tu, and C.-K. Chiang, “Sparse representation for image classification via paired dictionary learning”, Multimedia Tools and Applications, vol. 78, no. 12, pp. 16945–16963, 2019.
[32] S. Sharan, S. Kininmonth, U. V. Mehta, et al, “Automated CNN based coral reef classification using image augmentation and deep learning”, International Journal of Engineering Intelligent Systems, vol. 29, no. 4, pp. 253–261, 2021.

Received 3 May 2021

Bhavinkumar Gajjar holds an MTech degree in Communication System Engineering from Gujarat Technological University, India. His fields of interest are image processing, computer vision and optimization algorithms. He is currently working on accuracy enhancement for multiclass classification techniques as a research scholar at Indus University. He is a professional software developer working at Arohi Operations Pvt Ltd. He has 7 years of academic and 3.5 years of industrial experience. He has six international and two national publications in reputed journals/conferences.

Hiren Mewada obtained his MTech and PhD degrees from Sardar Vallabhbhai National Institute of Technology-Surat, Gujarat, India. Presently he is Assistant Research Professor at Prince Mohammad Bin Fahd University, Kingdom of Saudi Arabia. Previously he was associate professor at Charotar University of Science and Technology, Gujarat, India. He has more than 17 years of teaching experience. His current areas of interest are computer vision, signal processing, machine learning and embedded system design. He has published more than 60 research papers and completed several funded research projects. He is coauthor of one book and has published five book chapters. He is a member of IETE and ISTE.

Ashwin Patani obtained his MTech from Gujarat University, Gujarat, and his PhD degree from Meghalaya University, Meghalaya, India. Presently he is senior Assistant Professor at Indus University, Ahmedabad, Gujarat. He has more than 15 years of teaching experience. His current areas of interest are sensors & networks, machine learning and embedded system design. He has published more than 20 research papers. He is the author of one book. He is a member of IETE and ISTE.
