
Compact Structure Hashing via Sparse and Similarity Preserving Embedding

Renzhen Ye and Xuelong Li, Fellow, IEEE

Manuscript received October 13, 2014; revised January 26, 2015; accepted March 10, 2015. Date of publication April 20, 2015; date of current version February 12, 2016. This work was supported in part by the National Basic Research Program of China (973 Program) under Grant 2012CB316400, in part by the National Natural Science Foundation of China under Grant 61125106 and Grant 61300142, and in part by the Key Research Program of the Chinese Academy of Sciences under Grant KGZD-EW-T03. This paper was recommended by Associate Editor Y. Zhao.

R. Ye is with the Center for Optical Imagery Analysis and Learning, State Key Laboratory of Transient Optics and Photonics, Xi'an Institute of Optics and Precision Mechanics, Chinese Academy of Sciences, Xi'an 710119, China, and also with the School of Electronics and Information Engineering, Xi'an Jiaotong University, Xi'an 710119, China.

X. Li is with the Center for Optical Imagery Analysis and Learning, State Key Laboratory of Transient Optics and Photonics, Xi'an Institute of Optics and Precision Mechanics, Chinese Academy of Sciences, Xi'an 710119, China.

Digital Object Identifier 10.1109/TCYB.2015.2414299

Abstract—Over the past few years, fast approximate nearest neighbor (ANN) search has become desirable or even essential, e.g., in huge databases, and many hashing-based ANN techniques have therefore been presented to return the nearest neighbors of a given query from huge databases. Hashing-based ANN techniques have become popular due to their low memory cost and good computational complexity. Recently, most hashing methods have realized the importance of the relationships among the data and have exploited different structures of the data to improve retrieval performance. However, a limitation of the aforementioned methods is that the sparse reconstructive relationship of the data is neglected. In this case, few methods can find the discriminating power and the local properties of the data for learning compact and effective hash codes. To take this crucial issue into account, this paper proposes a method named special structure-based hashing (SSBH). SSBH can preserve the underlying geometric information among the data and exploit the prior information that a sparse reconstructive relationship exists among the data, for learning compact and effective hash codes. Extensive experimental results demonstrate that SSBH is more robust and more effective than state-of-the-art hashing methods.

Index Terms—Hashing, nearest neighbor search, structure sparse-based hashing.

I. INTRODUCTION

IN RECENT years, large-scale image search has attracted considerable attention because of the rapid growth of Web data, including documents, images, and videos [1]–[4]. For example, the popular photo sharing website Flickr has surpassed six billion photos, and the popular video sharing website YouTube receives more than 48 h of uploaded videos every minute. It is very important to retrieve relevant information from such massive image databases. Similarity search, also known as nearest neighbor search, has been a hot topic in information retrieval, databases, and computer science. The task of image search is to take a query image and accurately find its nearest neighbors within a large database. The direct method of finding neighbors is to search over the given database and sort the items according to their similarity to the query. However, the cost of exhaustive search becomes prohibitively expensive when the number of database items is large. Moreover, the search performance drops significantly because the original high-dimensional data must be stored. Hence, it is necessary to consider approximate nearest neighbor (ANN) techniques to make large-scale search practical.

Over the past decade, fast indexing and similarity search in large databases have attracted considerable attention, and many ANN techniques have been developed for information retrieval. Broadly, fast nearest neighbor search methods can be categorized into two families: 1) tree-based methods [5]–[9] and 2) hashing-based methods [10]–[13].

Tree-based methods exploit various tree structures, including M-trees [5], k-d trees [6], [7], cover trees [8], and metric trees [9], to find approximate spatial partitions of the feature space. These methods try to perform fast similarity search. However, they suffer from the curse of dimensionality and cannot deal with large-scale databases due to memory constraints. Moreover, their search performance is prone to drop on data with high dimensionality.

Accordingly, research on hashing-based methods has received a large amount of interest in recent years [14]–[16]. Hashing-based methods try to find a map from the space of high-dimensional data to a space of low-dimensional binary codes while preserving the topological structure of the original data space. They are promising for fast similarity search because they generate compact binary codes for a large-scale database. In this case, similar neighbors can be retrieved by returning the images from a given database within a small Hamming distance. By encoding images as a set of compact binary codes, hashing-based methods make the search operation extremely fast. Hence, hashing-based methods reduce the storage requirement and achieve fast query times. Recently, much emphasis has been directed at data-dependent or learning-based hashing methods [17]–[19]. Most recent research on data-dependent methods aims to find the inherent neighborhood structure while the original data are embedded into a low-dimensional space.
Most of the aforementioned hashing methods have realized the importance of the relationships among the data and have exploited different structure information of the data to improve retrieval performance. However, a limitation of the aforementioned methods is that the sparse reconstructive relationship of the data is neglected. Hence, few methods can find the discriminating power and the local properties of the data for learning compact and effective hash codes.

Realizing the importance of the sparse reconstructive relationship of the data for learning compact, effective hash codes, we propose a special structure-based hashing (SSBH) framework that can preserve the underlying geometric information among the data. The proposed objective function of the SSBH framework is composed of three components: 1) the combined empirical fitness term; 2) information theoretic regularization; and 3) structure sparsity representation regularization. First, we construct an objective function to balance the maximization of the empirical accuracy combined with the information theoretic term and the minimization of the sparse reconstruction error provided by each bit. Second, sparsity is introduced to encode the data domain via locality preserving embedding, which can effectively reflect the intrinsic geometric properties of the data. Finally, the L2,1 norm is introduced into a modified sparse representation framework so that the discriminative information of the hash codes can be preserved in the weight matrix.

The rest of this paper is organized as follows. The related work on several hashing methods is briefly reviewed in Section II. Then, Section III presents the proposed SSBH framework. A comprehensive set of comparison experiments is reported in Section IV, and finally Section V concludes this paper.

II. RELATED WORK

Many data-dependent methods have been developed, and they can be divided into two categories: 1) supervised methods and 2) unsupervised methods. Supervised methods exploit the label information to preserve the semantic similarity and thus improve the ability of fast similarity search for a large-scale database [17], [20]. Unsupervised hash-based methods [such as anchor graph hashing (AGH)] generate compact hash codes by employing machine learning techniques.

To exploit the label information, many supervised hashing methods have been proposed. A boosted similarity sensitive coding method was proposed to design a series of weighted hash functions by using labeled data [21]. Liu et al. [24] tried to learn efficient kernelized hash functions by utilizing pairwise labels. Other supervised hashing methods, such as a deep neural network stacked with restricted Boltzmann machines, have been developed in recent years [22]; they retain the semantic similarity structure of the data. A binary reconstructive embedding technique was proposed to learn hash functions by minimizing the reconstruction error between the metric space and the Hamming space [23]. Although existing supervised hashing techniques can improve retrieval performance due to their ability to exploit label information [10], they need many labeled images, and labeling is a tough task [24], [25].

Unsupervised methods obtain binary codes for the given data points from unlabeled data in an unsupervised way [26]; examples include locality sensitive hashing (LSH) [12], [27], [28], spectral hashing (SH) [29], principal component analysis hashing [30], and its rotational variant [16]. LSH is the most notable unsupervised hashing method and provides probabilistic guarantees of retrieving similar examples [31], [32]. An LSH-based method for retrieving similar examples under arbitrary kernel functions was presented in [13] and can be exploited with many existing useful similarity measures. Another effective unsupervised hashing method, SH, was proposed recently by Weiss et al. [29]. Compared to the SH method, the optimal code obtained using multidimensional SH (MDSH) is guaranteed to faithfully reproduce the affinities as the number of bits increases [33]. By exploring the geometric structure of the data, density sensitive hashing (DSH) avoids purely random projection selection and uses those projective functions that best agree with the distribution of the data [34].

Recently, the AGH method was proposed to design appropriate compact codes by exploiting the inherent neighborhood structure in the data [2], [35], [36]. Zhang et al. [37] proposed a novel research problem, composite hashing with multiple information sources, which incorporates the features from different information sources into the binary hashing codes. In addition, the existing state-of-the-art methods, including [16], [38], [39], and [40], have shown good performance in retrieving high-dimensional data. The method dubbed iterative quantization [16] performs a rotation of zero-centered data so as to alleviate the problem of unbalanced variances. The robust sparse hashing (RSH) method [38] and the sparse embedding and least variance encoding (SELVE) method [39] achieve good hashing with binary codes by exploiting sparse coding. A novel algorithm, named iterative expanding hashing, was proposed to exploit a very small Hamming radius and iteratively expand a few nearest candidates, and it can obtain high recall and low search time simultaneously [40].

In the following subsection, sequential learning for hashing for the application of image retrieval is briefly discussed. For later convenience, some notation is introduced. Given the database X = [x_1, x_2, ..., x_i, ..., x_N] ∈ R^{d×N}, the task of a hashing-based method is to exploit K hash functions to map a data point x_i to a K-bit hash code H(x_i) = [h_1(x_i), ..., h_K(x_i)], where h_k is the kth hash function and h_k(x) ∈ {−1, 1}.


A. Sequential Learning for Hashing Methods

Sequential projection learning for hashing [30] introduces a neighbor-pair set M and a nonneighbor-pair set C to learn hash functions, where the same bits are desired for (x_i, x_j) ∈ M and different bits for (x_i, x_j) ∈ C. In [30], a semi-supervised hashing framework is proposed to minimize the empirical error over the labeled data together with an information theoretic constraint over all data points. Without loss of generality, let us denote the matrix formed by the l labeled columns of X as X_l, and let H = [h_1, ..., h_K] be a sequence of K hash functions. The objective function in [30] for the empirical accuracy over the labeled data can be denoted as

J(H) = \sum_{k} \bigg\{ \sum_{(x_i, x_j) \in M} h_k(x_i)\, h_k(x_j) - \sum_{(x_i, x_j) \in C} h_k(x_i)\, h_k(x_j) \bigg\}.   (1)

According to [30], the empirical fitness in (1) can be expressed in a compact matrix form

J(H) = \frac{1}{2} \operatorname{Tr}\{ H(X_l)\, T\, H(X_l)^T \} \;\Rightarrow\; J(W) = \frac{1}{2} \operatorname{Tr}\{ \operatorname{sgn}(W^T X_l)\, T\, \operatorname{sgn}(W^T X_l)^T \}   (2)

where W = [w_1, ..., w_K] ∈ R^{d×K} is the sequence of projection vectors, sgn(W^T X_l) is the matrix of signs of the individual elements, and the pairwise matrix T, incorporating the pairwise labeled information from X_l, is defined as

T_{ij} = \begin{cases} 1, & (x_i, x_j) \in M \\ -1, & (x_i, x_j) \in C \\ 0, & \text{otherwise.} \end{cases}   (3)

After relaxation [26], [41], the relaxed empirical fitness term can be obtained from (2):

J(W) = \frac{1}{2} \operatorname{Tr}\{ W^T X_l T X_l^T W \}.   (4)

Measuring the empirical accuracy only on the labeled data easily leads to overfitting. Hence, it is necessary to consider the information provided by each binary bit to obtain desirable properties of the hash codes. By incorporating an information theoretic constraint into the relaxed empirical fitness term as a regularizer, the objective function is obtained as follows:

J_1(W) = \frac{1}{2} \operatorname{Tr}\{ W^T X_l T X_l^T W \} + \frac{\lambda}{2} \operatorname{Tr}\{ W^T X X^T W \}   (5)

where Tr{W^T X X^T W} is the information theoretic term. Sequential projection learning for hashing maximizes the objective function in (5) to learn the optimal W, which can then be exploited to design the hash functions.

III. l2,1-NORM REGULARIZED HASHING LEARNING VIA SIMILARITY PRESERVING EMBEDDING

In this paper, we try to find a hash function H that maps data points X_l into binary codes by using the following formula:

H(X_l) = \operatorname{sign}( W^T X_l )   (6)

where W is the projection matrix. H(X_l) maps X_l into binary codes. Hence, it is important to learn the projection matrix for deriving the final hash function. To obtain the projection matrix, a novel regularized technique is formulated by extending the existing hashing method into the proposed method in this paper. The proposed method has three features. First, the constructed hash functions capture the intrinsic attributes of the data, which generates interpretable binary codes. Second, only a linear combination of a few entries in the dictionary is exploited to reconstruct the original data; thus, the low-dimensional nature of sparse representation makes it appropriate for encoding unseen data. Finally, the proposed method can provide mappings to encode unseen data points through a linear regression function and can effectively cope with reconstructive coefficients that vary smoothly along the geodesics of the data manifold. In this case, the locality and the similarity of the data can be preserved when the binary codes of the data are generated, which makes the encoding process highly efficient.

A. Objective Function

Given the data X = [x_1, x_2, ..., x_i, ..., x_n] ∈ R^{d×n}, each sample x_i is expected to be reconstructed from a few samples of the data matrix X. Hence, we try to find a sparse representation vector s_i for each sample x_i by solving the following l1 minimization problem:

\min_{s_i} \sum_{i=1}^{n} \| s_i \|_1 \quad \text{s.t.} \quad x_i = X s_i, \; i = 1, 2, \ldots, n   (7)

where s_i is an n-dimensional vector, and the element s_{ij} is regarded as the contribution of each sample x_j to x_i. The weight matrix S = [s_1, s_2, ..., s_i, ..., s_n] not only measures the similarity among different samples, but also captures some intrinsic structure properties and the discriminative information of the data. This is because the elements are invariant to rotations and rescalings according to the constraint in (7). Moreover, the nonzero entries in the sparse matrix may help to distinguish the samples of a given class even if no class labels are provided. In this case, the sparse reconstructive weight vector over all the samples tends to include potential discriminant information. As described above, the sparse weight matrix can capture intrinsic structure properties of the data and contains natural discrimination information. One might further expect that these desirable characteristics of the high-dimensional features can be preserved in a low-dimensional Hamming space where items can be efficiently searched. In this paper, we construct hash functions based on a structure sparse representation framework such that the hash coding can be characterized by the sparse weight matrix of the data. In real applications, similar samples in the training data space are often reconstructed with different reconstruction coefficients. Moreover, the decomposition error of similar data increases in sparse linear reconstruction if the local topological structure of the data space is not considered [42], [43]. Therefore, the prior information of the locality and similarity constraint can be exploited to regularize the sparse reconstruction [43]–[45]. Based on the above discussion, we seek K hash functions {h_k}_{k=1}^{K}, which best preserve the optimal weight vectors s_i, to map a data point x_i to a K-bit hash code H(x) = [h_1(x), ..., h_K(x)]:

\min_{s_i} \sum_{i=1}^{n} \| s_i \|_1 + \beta \sum_{i=1}^{n} \sum_{j=1}^{n} \| s_i - s_j \|^2 A_{ij} \quad \text{s.t.} \quad h_k(x_i) = h_k(X s_i), \; i = 1, \ldots, n; \; k = 1, \ldots, K   (8)

where A_{ij} = exp(−‖x_i − x_j‖²/σ) (σ is the heat kernel parameter) and β is a regularization parameter.
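The locality weights A_ij and the graph Laplacian V = B − A used a few lines below in (10) can be computed directly from the data. The following NumPy sketch (our own code, with a hypothetical bandwidth sigma) is one straightforward way to build them.

    import numpy as np

    def heat_kernel_affinity(X, sigma=1.0):
        # X: d x n data matrix -> n x n affinity A_ij = exp(-||x_i - x_j||^2 / sigma)
        sq = np.sum(X ** 2, axis=0)
        d2 = sq[:, None] + sq[None, :] - 2.0 * (X.T @ X)
        np.maximum(d2, 0.0, out=d2)        # clip tiny negatives from round-off
        return np.exp(-d2 / sigma)

    def graph_laplacian(A):
        B = np.diag(A.sum(axis=0))         # degree matrix with B_ii = sum_j A_ji
        return B - A                       # V = B - A

    X = np.random.randn(32, 200)
    A = heat_kernel_affinity(X, sigma=2.0)
    V = graph_laplacian(A)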


For convenience, the problem in (8) can be transformed as follows:

\min_{h_k, s_i} \sum_{k=1}^{K} \sum_{i=1}^{n} \| h_k(x_i) - h_k(X s_i) \|^2 + \alpha \sum_{i=1}^{n} \| s_i \|_1 + \beta \sum_{i=1}^{n} \sum_{j=1}^{n} \| s_i - s_j \|^2 A_{ij}   (9)

where α is a balance parameter. Let B be a diagonal matrix whose elements are defined as B_{ii} = \sum_j A_{ji}; then V = B − A is the graph Laplacian matrix. Equation (9) can be transformed as follows:

\min_{h_k, s_i} \sum_{k=1}^{K} \sum_{i=1}^{n} \| h_k(x_i) - h_k(X s_i) \|^2 + \alpha \| S \|_1 + \beta \operatorname{Tr}(S V S^T).   (10)

It can be seen from (10) that the low-dimensional binary embeddings preserve the original structure information of the data. Since the sparse matrix S has natural discriminating power, the hash functions learned from the feature space to the Hamming space have good discriminative power even if no supervised information is provided.

In [41], the l2,1-norm of a matrix is exploited as a rotational invariant l1-norm, and it has attracted increasing attention. Different from the flat penalty introduced by the l1-norm, the l2,1-norm term regularizes all the elements {s_i}_{i=1}^{n} corresponding to the training data as a whole and computes the l1-norm over s = [‖s^{1.}‖_2, ..., ‖s^{i.}‖_2, ..., ‖s^{n.}‖_2]^T. The samples in the original data X corresponding to the nonzero entries of the weight matrix can be automatically selected to approximate the given data vector. Therefore, when the l2,1-norm is imposed on all the elements {s_i}_{i=1}^{n}, the objective function in (10) can be formulated as follows:

\min_{h_k, s_i} \sum_{k=1}^{K} \sum_{i=1}^{n} \| h_k(x_i) - h_k(X s_i) \|^2 + \alpha \| S \|_{2,1} + \beta \operatorname{Tr}(S V S^T)   (11)

where the l2,1-norm of S is defined as ‖S‖_{2,1} = \sum_{i=1}^{n} ‖s^{i.}‖_2 and s^{i.} is the ith row of S. The objective function in (11) is not differentiable because h_k(x_i) = sgn(w_k^T x_i). Therefore, the objective function in (11) is relaxed by replacing sgn(·) with its signed magnitude and representing the objective as a function of W:

J_2(W, S) = \sum_{k=1}^{K} \sum_{i=1}^{n} \| w_k^T x_i - w_k^T X_l s_i \|^2 + \alpha \| S \|_{2,1} + \beta \sum_{i=1}^{n} \sum_{j=1}^{n} \| s_i - s_j \|^2 A_{ij}
= \sum_{k=1}^{K} \sum_{i=1}^{n} \operatorname{Tr}\{ w_k^T (x_i - X_l s_i)(x_i - X_l s_i)^T w_k \} + \alpha \operatorname{Tr}(S U S^T) + \beta \operatorname{Tr}(S V S^T)
= \operatorname{Tr}\{ W^T ( X_l X_l^T - X_l S X_l^T - X_l S^T X_l^T + X_l S^T S X_l^T ) W \} + \alpha \operatorname{Tr}(S U S^T) + \beta \operatorname{Tr}(S V S^T)
= \operatorname{Tr}\{ W^T X_l ( I - S - S^T + S^T S ) X_l^T W \} + \alpha \operatorname{Tr}(S U S^T) + \beta \operatorname{Tr}(S V S^T)   (12)

where U is a diagonal matrix whose ith diagonal element is U_{ii} = 1/(2‖s^{i.}‖_2), and s^{i.} is the ith row of S. In practice, the ith diagonal element is redefined as U_{ii} = 1/(2‖s^{i.}‖_2 + ε) (ε is a small constant) because ‖s^{i.}‖_2 may be zero. In summary, we try to learn the optimal map W and weight matrix S by minimizing the objective function

(W, S) = \arg\min_{W, S} J_2(W, S).   (13)

Although recent hashing methods obtain promising results in image retrieval, they may neglect the sparse reconstructive relationship of the data and cannot find the discriminating power and the algebraic structure of the data. Consequently, it is necessary to exploit the sparse relationship of the data points in hashing learning. In this paper, we construct an objective function that builds a balance between the maximization of the empirical accuracy combined with the information theoretic term and the minimization of the sparse reconstruction of the data points. By considering the terms in (5) and (13), the following objective function of the proposed method is presented:

(W, S) = \arg\min_{W, S} \frac{J_2(W, S)}{J_1(W)}
= \arg\min_{W, S} \frac{ \operatorname{Tr}\{ W^T X_l ( I - S - S^T + S^T S ) X_l^T W \} + \alpha \| S \|_{2,1} + \beta \operatorname{Tr}(S V S^T) }{ \frac{1}{2} \operatorname{Tr}\{ W^T X_l T X_l^T W \} + \frac{\lambda}{2} \operatorname{Tr}\{ W^T X X^T W \} }
= \arg\min_{W, S} \frac{ \operatorname{Tr}\{ W^T X_l L X_l^T W \} + \alpha \operatorname{Tr}(S U S^T) + \beta \operatorname{Tr}(S V S^T) }{ \frac{1}{2} \operatorname{Tr}\{ W^T M W \} }   (14)

where the matrix M = (1/2)(X_l T X_l^T + λ X X^T) and the matrix L = I − S − S^T + S^T S. The proposed objective function is optimized in two alternating steps: 1) learning the sparse weight matrix S (fixing W) and 2) learning the map W (fixing S).

B. Computing Map Matrix W

Fixing the sparse weight matrix S, (14) can be written as the new objective function

W = \arg\min_{W} \frac{ \operatorname{Tr}\{ W^T X_l ( I - S - S^T + S^T S ) X_l^T W \} }{ \operatorname{Tr}\{ W^T M W \} }.   (15)

For compactness, the objective function in (15) can be transformed as follows:

W = \arg\min_{W} \frac{ \operatorname{Tr}\{ W^T X_l L X_l^T W \} }{ \frac{1}{2} \operatorname{Tr}\{ W^T M W \} }.   (16)

The minimization problem in (16) can be solved as a generalized eigenvalue decomposition problem. Therefore, the optimal map matrix is obtained by computing the eigenvectors corresponding to the smallest eigenvalues.

C. Computing Sparse Weight Matrix S

If W is fixed, the minimization problem in (13) can be transformed as

S = \arg\min_{S} \operatorname{Tr}\{ W^T X_l ( I - S - S^T + S^T S ) X_l^T W \} + \alpha \| S \|_{2,1} + \beta \operatorname{Tr}(S V S^T)
= \arg\min_{S} \operatorname{Tr}\{ D ( I - S - S^T + S^T S ) D^T \} + \alpha \operatorname{Tr}(S U S^T) + \beta \operatorname{Tr}(S V S^T)
= \arg\min_{S} C(S)   (17)

where D = W^T X_l and C(S) = Tr{W^T X_l (I − S − S^T + S^T S) X_l^T W} + αTr(SUS^T) + βTr(SVS^T). Setting ∂C(S)/∂S equal to zero, namely

\frac{\partial C(S)}{\partial S} = -2 D^T D + 2 S D^T D + 2 \alpha S U + 2 \beta S V = 0   (18)

or equivalently

S = D^T D ( D^T D + \alpha U + \beta V )^{-1}.   (19)

The corresponding procedure for computing the map matrix W and the sparse weight matrix S is summarized as Algorithm 1.


Algorithm 1: Computing Map Matrix W and Sparse Weight Matrix S

Input: a set of d-dimensional training data points X_l = [x_1, ..., x_t], the parameters α and β, and the number of iterations J.

Initialize: set S_0 = I_{n×n}, where I_{n×n} is the n×n matrix of ones, and compute the diagonal matrix U_0 with diagonal entries

(U_0)_{ii} = \frac{1}{2 \| s^{0}_{i.} \|_2 + \varepsilon}, \quad i = 1, \ldots, n.

For t = 1 : J
1) Compute the map matrix W_t by solving the following minimization problem via generalized eigenvalue decomposition in the tth iteration:

W_t = \arg\min_{W_t} \frac{ \operatorname{Tr}\{ W_t^T X_l L X_l^T W_t \} }{ \frac{1}{2} \operatorname{Tr}\{ W_t^T M W_t \} }.

2) Compute the sparse weight matrix S_t by exploiting the following equation:

S_t = D_t^T D_t ( D_t^T D_t + \alpha U_t + \beta V )^{-1}

where D_t = W_t^T X_l and S_t = [s^{t}_{1.}, s^{t}_{2.}, \ldots, s^{t}_{n.}].

3) Update the diagonal matrix U_t with diagonal entries

(U_t)_{ii} = \frac{1}{2 \| s^{t}_{i.} \|_2 + \varepsilon}, \quad i = 1, \ldots, n.

End

Output: the sparse weight matrix S_J and the map matrix W_J.
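As a rough illustration of the two updates in Algorithm 1, the NumPy/SciPy sketch below implements the W-step as the generalized eigenvalue problem behind (16) and the S-step as the closed form (19), followed by the diagonal update of U. The names, the small ridge term, and the toy outer loop are our own choices and do not reproduce the authors' MATLAB implementation.

    import numpy as np
    from scipy.linalg import eigh

    def update_W(Xl, L, M, K, ridge=1e-6):
        # W-step: eigenvectors of (Xl L Xl^T, M) with the smallest eigenvalues
        A = Xl @ L @ Xl.T
        d = A.shape[0]
        # the ridge is only a numerical safeguard so both matrices stay well conditioned
        evals, evecs = eigh(A + ridge * np.eye(d), M + ridge * np.eye(d))
        return evecs[:, :K]                              # d x K map matrix

    def update_S(W, Xl, U, V, alpha, beta):
        # S-step: closed form (19) with D = W^T Xl
        D = W.T @ Xl
        DtD = D.T @ D
        return DtD @ np.linalg.inv(DtD + alpha * U + beta * V)

    def update_U(S, eps=1e-8):
        # diagonal matrix with U_ii = 1 / (2 ||s^i.||_2 + eps)
        return np.diag(1.0 / (2.0 * np.linalg.norm(S, axis=1) + eps))

    # one possible outer loop mirroring Initialize / For / End in Algorithm 1:
    # S = np.ones((n, n)); U = update_U(S)
    # for t in range(J):
    #     L = np.eye(n) - S - S.T + S.T @ S
    #     W = update_W(Xl, L, M, K)
    #     S = update_S(W, Xl, U, V, alpha, beta)
    #     U = update_U(S)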

IV. EXPERIMENTAL RESULTS

This section reports a set of experiments to verify the efficiency of the proposed method. To validate the performance of the proposed method in image retrieval, we compare eight hashing methods, including our method and seven other unsupervised state-of-the-art methods: the SELVE method [39], the DSH method [34], the complementary hashing (CH) method [26], the LSH method [12], the SH method [29], the MDSH method [33], and the unsupervised sequential projection learning for hashing (USPLH) method [30]. In our experiments, to perform a fair evaluation, all methods are run in an unsupervised manner. Similar to [34], a returned point is regarded as a true neighbor if it lies in the top two percentile of points closest (measured by the Euclidean distance in the original space) to the query. For each query, all the data points in the database are ranked according to their Hamming distances to the query. In this setting, the pairwise matrix T in (3) is not available. Similar to [30], a pseudolabel T_{ij} = 1 is assigned for a pair of samples (x_i, x_j) ∈ M and T_{ij} = −1 is assigned for samples (x_i, x_j) ∈ C, where M and C are the neighbor-pair and nonneighbor-pair sets, respectively. The two sets are defined as follows:

M = \{ (x_i, x_j) : h(x_i)\, h(x_j) = -1, \; | w^T (x_i - x_j) | \le \varepsilon \}
C = \{ (x_i, x_j) : h(x_i)\, h(x_j) = 1, \; | w^T (x_i - x_j) | \ge \xi \}.   (20)

A. Datasets and Settings

Four benchmark datasets are exploited to verify the efficiency of the proposed method. Specifically, the experimental settings are as follows.

1) CIFAR-10 Dataset: The CIFAR-10 dataset is a labeled subset of the well-known 80M tiny image collection [46]. It contains ten classes with 60 000 samples in total, and each class consists of 6000 samples of 32 × 32 color images. We randomly split this database into two parts: 1) a training set with 59 000 samples and 2) a test set with 1000 samples. The training set is exploited to learn the hash functions and construct the hash lookup. A few image samples from the CIFAR-10 dataset are shown in Fig. 1.

2) SIFT-1M Dataset: The SIFT-1M dataset consists of one million local scale invariant feature transform (SIFT) descriptors, which are extracted from random images and described in [47]. Each sample in this dataset is a 128-dimensional vector representing histograms of gradient orientations. Similar to [24], one million samples in the SIFT-1M dataset are chosen for training and an additional 10 000 are regarded as test samples.

3) 22K LabelMe Dataset: The 22K LabelMe dataset, compiled by Torralba et al. [48], contains 22 019 images and 2000 test images. The size of each image is 32 × 32 pixels, with a 512-dimensional Gist descriptor.


Fig. 1. Gallery of images from the CIFAR-10 dataset. From top to bottom, the image classes are airplane, automobile, bird, cat, deer, dog, horse, ship, and truck.

Fig. 2. Parameter selection results on the CIFAR-10 dataset: MAP varies with the parameter β for 64-bit codes.

Fig. 3. Parameter selection results on the CIFAR-10 dataset: MAP varies with the parameter α for 64-bit codes.

Fig. 4. Parameter selection results on the CIFAR-10 dataset: MAP varies with the parameter λ for 64-bit codes.
4) 100K Tiny Image Dataset: The 100K tiny image dataset consists of 100 000 images sampled from the large collection of 80 million tiny images [46]. The tiny image data aim to provide a visualization of all the nouns in the English language and were mainly collected with the Google image search engine. The size of the original images is 32 × 32 pixels, with a 384-dimensional Gist descriptor. The entire dataset is divided into two parts: 1) a test set with 4000 images and 2) a training set with 96 000 images.

B. Parameter Selection

To further verify the proposed method, we choose the CIFAR-10 dataset for a thorough study of the three important parameters in the objective function (14). In the parameter analysis, mean average precision (MAP) is exploited as the hashing performance measure to select the optimal parameters. First, we fix the parameter pair (α, λ) = (0.1, 0.1) and study how MAP varies with the parameter β when α and λ are kept unchanged. Fig. 2 plots the experimental results for β varying from 0.1 to 0.9 with 64-bit codes when fixing α = 0.1 and λ = 0.1. It can be seen from Fig. 2 that our method achieves the best MAP performance when β = 0.3. Second, we fix the parameter pair (β, λ) = (0.3, 0.1) and then select the best parameter α. Fig. 3 presents the results for α varying from 0.1 to 0.9 with 64-bit codes when fixing β = 0.3 and λ = 0.1. It can be seen from Fig. 3 that the best parameter α is 0.5. Finally, β and α are fixed at 0.3 and 0.5, respectively, and Fig. 4 shows the MAP varying with λ for 64-bit codes. In Fig. 4, we find that λ = 0.8 achieves the best performance while the other parameters are kept unchanged. Thus, the best parameter triple selected for the proposed method is (β, α, λ) = (0.3, 0.5, 0.8).

Fig. 5. Precision–recall curve on CIFAR-10 image data. (a)–(h) Performances for the hash codes of 8, 12, 16, 24, 32, 48, 64, and 128 bits, respectively.

Fig. 6. Precision curves on CIFAR-10 image data. (a)–(h) Performances for the hash codes of 8, 12, 16, 24, 32, 48, 64, and 128 bits, respectively.

C. Competitors

In this paper, the results of the proposed method are compared with seven state-of-the-art methods. Details of these methods are as follows.

1) The SELVE method [39] is one of the latest hashing methods; it learns a dictionary to encode the sparse embedding features and binarizes the coding coefficients as the hash codes.

2) The LSH method [12] is a classic unsupervised method for multitemporal data and has wide application in change detection domains.

3) The DSH method [34] is considered to be an extension of LSH. Compared with LSH, DSH avoids purely random projection selection and achieves better performance.

4) The SH method [29] is a classic method that conducts principal component analysis on the original data and exploits Laplacian eigenfunctions computed along the principal directions of the data to generate the hash codes.

5) The CH method [26] is a recent hashing method that aims to learn a series of hash functions which cross the sparse data regions and generate balanced hash buckets.

6) The MDSH method [33] is a popular method that finds binary codes so that the inner product among the codes approximates the affinity between data points.

7) The USPLH (unsupervised sequential projection learning for hashing) method [30] is an unsupervised hashing method that tries to learn efficient hash codes by a simple linear mapping that can handle semantic similarity/dissimilarity among the data.

Fig. 7. Recall curves on CIFAR-10 image data. (a)–(h) Performance for hash codes of 8, 12, 16, 24, 32, 48, 64, and 128 bits, respectively.

Fig. 8. Results on the SIFT-1M image dataset. Precision curves with (a) 8 and (b) 12 bits. Recall curves with (c) 8 and (d) 12 bits.

Fig. 9. Precision–recall curves on LabelMe image data. (a)–(d) Performance for hash codes of 8, 16, 24, and 32 bits, respectively.

Fig. 10. Precision–recall curves on tiny image data. (a)–(h) Performance for hash codes of 8, 12, 16, 24, 32, 48, 64, and 128 bits, respectively.

Fig. 11. Precision of the top 500 returned samples on different datasets using Hamming ranking. (a) CIFAR-10, (b) SIFT-1M, (c) LabelMe, and (d) tiny image datasets.

D. Experimental Results

A number of experiments, in which the number of bits is varied from 8 to 128, are conducted on the CIFAR-10, SIFT-1M, 22K LabelMe, and 100K tiny image datasets. The proposed method and the other methods are implemented in a MATLAB environment on a modern desktop computer (3.8 GHz quad-core CPU with 8 GB of random access memory) to measure the training time and compression time. The precision–recall curves on the CIFAR-10 image dataset are shown in Fig. 5, and the precision and recall curves for CIFAR-10 are listed in Figs. 6 and 7, respectively. It can be seen in Fig. 5 that the precision of all methods drops significantly when the recall increases, and vice versa. This is because the value of precision is sensitive to the true positive rate while that of recall is sensitive to the false positive rate. As shown in Fig. 6, our method has much better precision–recall curves compared with the other methods. The proposed method performs best in all cases in Figs. 5–7. The higher precision and recall in Figs. 5–7 demonstrate the advantage of our method. Moreover, compared with the other methods, the drop in precision for the proposed method is much smaller when the code length increases, which indicates fewer query failures for the proposed method. Fig. 8 presents the results of the six methods on the SIFT-1M dataset using the precision and recall curves. It can be observed in Fig. 8 that the results illustrate a significant performance improvement of the proposed method over the other methods. In Fig. 9, we show the comparison of precision–recall curves on the LabelMe image dataset. It can be seen in Fig. 9 that the proposed method confirms its superiority in search performance compared with the other methods.

The experimental results of the different methods on the tiny image dataset from 8 to 128 bits are listed in Fig. 10.

Fig. 12. Training time and compression time on the LabelMe dataset. (a) Training and (b) compression times.

Fig. 13. Sample top retrieved images for the query in (a) using 32 bits. Red rectangles denote false positives. Best viewed in color. (a) Query, (b) our method (precision: 0.69), (c) CH method (precision: 0.36), (d) LSH method (precision: 0.44), (e) SH method (precision: 0.23), (f) MDSH method (precision: 0.38), (g) USPLH method (precision: 0.58), (i) DSH method (precision: 0.44), and (j) SELVE method (precision: 0.50).

We can find that the proposed method significantly improves the performance on the tiny image dataset. This indicates that the proposed method can find the discriminating power and capture the algebraic structure among the data points. In Fig. 11, we present the experimental results on the four different image datasets to demonstrate the effectiveness of the proposed hashing method. It is not very surprising to see that the proposed hashing method obtains the best performance over all four datasets. Therefore, the proposed method leads to fewer query failures compared with the other methods. Additionally, the comparison of computational cost, including compression time and training time, for the different methods is reported in Fig. 12. The compression time indicates the encoding time of transforming the original test data into binary codes, and the training time refers to the computational cost of learning the hash functions from the training data. It can be observed in Fig. 12(a) that the LSH method needs negligible training time compared to the other methods. This is because the projections in the LSH method are not learned but randomly generated. It is not surprising that the eigenvalue decomposition-based techniques, including SELVE, CH, DSH, SH, MDSH, and our method, have similar training costs. However, the training time of the USPLH method is longer than that of the other methods since it needs to perform eigenvalue decomposition and update the pairwise label matrix. As shown in Fig. 12(b), the SH method needs a little more compression time compared with the other methods since the sinusoidal function in the compression process must be calculated. In addition, the other methods incur more compression time than the LSH method. Finally, the experimental results of the unsupervised test on a sample query from the CIFAR-10 dataset are demonstrated in Fig. 13. It can be clearly seen from Fig. 13 that the proposed method provides more visually consistent search results than the other methods.

V. CONCLUSION

In this paper, we propose an SSBH framework that can preserve the underlying geometric information among the data. The proposed objective function of SSBH is composed of three components: 1) the combined empirical fitness term; 2) information theoretic regularization; and 3) structure sparsity representation regularization.
First, we construct an objective function to build a balance between the maximization of the empirical accuracy combined with the information theoretic term and the minimization of the sparse reconstruction provided by each bit. Second, sparsity is introduced to encode the data domain via locality preserving embedding, which can effectively reflect the intrinsic geometric properties of the data. Finally, the L2,1 norm is introduced into a modified sparse representation framework so that the discriminative information of the hash codes can be preserved in the weight matrix. Experimental results on different datasets show that our method outperforms the state-of-the-art methods by a large margin.

REFERENCES

[1] M.-S. Chen, M. Lo, P. S. Yu, and H. C. Young, "Applying segmented right-deep trees to pipelining multiple hash joins," IEEE Trans. Knowl. Data Eng., vol. 7, no. 4, pp. 656–668, Aug. 1995.
[2] J. Song, Y. Yang, X. Li, Z. Huang, and Y. Yang, "Robust hashing with local models for approximate similarity search," IEEE Trans. Cybern., vol. 44, no. 7, pp. 1225–1236, Jul. 2014.
[3] L. Chen, D. Xu, I. Tsang, and X. Li, "Spectral embedded hashing for scalable image retrieval," IEEE Trans. Cybern., vol. 44, no. 7, pp. 1180–1190, Jul. 2014.
[4] Y. Yuan, X. Lu, and X. Li, "Learning hash functions using sparse reconstruction," in Proc. ACM Conf. Internet Multimedia Comput. Serv. (ICIMCS), Xiamen, China, 2014, pp. 14–18.
[5] P. Ciaccia, M. Patella, and P. Zezula, "An efficient access method for similarity search in metric spaces," in Proc. 23rd Int. Conf. Very Large Data Bases, vol. 23. Athens, Greece, 1997, pp. 426–435.
[6] C. Silpa-Anan and R. Hartley, "Optimised KD-trees for fast image descriptor matching," in Proc. IEEE Conf. Comput. Vis. Pattern Recognit. (CVPR), Anchorage, AK, USA, 2008, pp. 1–8.
[7] J. L. Bentley, "Multidimensional binary search trees used for associative searching," Commun. ACM, vol. 18, no. 9, pp. 509–517, 1975.
[8] A. Beygelzimer, S. Kakade, and J. Langford, "Cover trees for nearest neighbor," in Proc. 23rd Int. Conf. Mach. Learn., Pittsburgh, PA, USA, 2006, pp. 97–104.
[9] J. K. Uhlmann, "Satisfying general proximity/similarity queries with metric trees," Inf. Process. Lett., vol. 40, no. 4, pp. 175–179, 1991.
[10] C. Wu, J. Zhu, D. Cai, C. Chen, and J. Bu, "Semi-supervised nonlinear hashing using bootstrap sequential projection learning," IEEE Trans. Knowl. Data Eng., vol. 25, no. 6, pp. 1380–1393, Jun. 2013.
[11] A. Gionis, P. Indyk, and R. Motwani, "Similarity search in high dimensions via hashing," in Proc. 25th Int. Conf. Very Large Data Bases (VLDB), vol. 99. Edinburgh, Scotland, 1999, pp. 518–529.
[12] M. Raginsky and S. Lazebnik, "Locality-sensitive binary codes from shift-invariant kernels," in Proc. Adv. Neural Inf. Process. Syst., Vancouver, BC, Canada, 2009, pp. 1509–1517.
[13] B. Kulis and K. Grauman, "Kernelized locality-sensitive hashing for scalable image search," in Proc. IEEE 12th Int. Conf. Comput. Vis., Kyoto, Japan, 2009, pp. 2130–2137.
[14] G. E. Hinton and R. R. Salakhutdinov, "Reducing the dimensionality of data with neural networks," Science, vol. 313, no. 5786, pp. 504–507, 2006.
[15] Y. Zhen, Y. Gao, R. Ji, D. Yeung, and X. Li, "Spectral multimodal hashing and its application to multimedia retrieval," IEEE Trans. Cybern., vol. 44, no. 7, pp. 1225–1236, Jul. 2014.
[16] Y. Gong and S. Lazebnik, "Iterative quantization: A procrustean approach to learning binary codes," in Proc. IEEE Conf. Comput. Vis. Pattern Recognit. (CVPR), Providence, RI, USA, 2011, pp. 817–824.
[17] J. Wang, S. Kumar, and S.-F. Chang, "Sequential projection learning for hashing with compact codes," in Proc. 27th Int. Conf. Mach. Learn. (ICML), Haifa, Israel, 2010, pp. 1127–1134.
[18] X. Liu, Y. Mu, D. Zhang, B. Lang, and X. Li, "Large-scale unsupervised hashing with shared structure learning," IEEE Trans. Cybern., vol. 45, no. 3, pp. 358–369, Mar. 2015.
[19] P. Jain, B. Kulis, and K. Grauman, "Fast image search for learned metrics," in Proc. IEEE Conf. Comput. Vis. Pattern Recognit. (CVPR), Anchorage, AK, USA, 2008, pp. 1–8.
[20] J. Wang, S. Kumar, and S.-F. Chang, "Semi-supervised hashing for scalable image retrieval," in Proc. IEEE Conf. Comput. Vis. Pattern Recognit. (CVPR), San Francisco, CA, USA, 2010, pp. 3424–3431.
[21] G. Shakhnarovich, "Learning task-specific similarity," Ph.D. dissertation, Dept. Electr. Eng. Comput. Sci., MIT, Cambridge, MA, USA, 2005.
[22] R. Salakhutdinov and G. Hinton, "Semantic hashing," Int. J. Approx. Reason., vol. 50, no. 7, pp. 969–978, 2009.
[23] B. Kulis and T. Darrell, "Learning to hash with binary reconstructive embeddings," in Proc. Adv. Neural Inf. Process. Syst., Vancouver, BC, Canada, 2009, pp. 1042–1050.
[24] W. Liu, J. Wang, R. Ji, Y.-G. Jiang, and S.-F. Chang, "Supervised hashing with kernels," in Proc. IEEE Conf. Comput. Vis. Pattern Recognit. (CVPR), Providence, RI, USA, 2012, pp. 2074–2081.
[25] K. Grauman and T. Darrell, "Pyramid match hashing: Sub-linear time indexing over partial correspondences," in Proc. IEEE Conf. Comput. Vis. Pattern Recognit. (CVPR), Minneapolis, MN, USA, 2007, pp. 1–8.
[26] H. Xu et al., "Complementary hashing for approximate nearest neighbor search," in Proc. IEEE Int. Conf. Comput. Vis. (ICCV), Barcelona, Spain, 2013, pp. 1631–1638.
[27] Q. Lv, W. Josephson, Z. Wang, M. Charikar, and K. Li, "Multi-probe LSH: Efficient indexing for high-dimensional similarity search," in Proc. 33rd Int. Conf. Very Large Data Bases, Vienna, Austria, 2007, pp. 950–961.
[28] M. Bawa, T. Condie, and P. Ganesan, "LSH forest: Self-tuning indexes for similarity search," in Proc. 14th Int. Conf. World Wide Web, Chicago, IL, USA, 2005, pp. 651–660.
[29] Y. Weiss, A. Torralba, and R. Fergus, "Spectral hashing," in Proc. Adv. Neural Inf. Process. Syst., Vancouver, BC, Canada, 2008, pp. 1753–1760.
[30] J. Wang, S. Kumar, and S. Chang, "Semi-supervised hashing for large scale search," IEEE Trans. Pattern Anal. Mach. Intell., vol. 34, no. 12, pp. 2393–2406, Dec. 2012.
[31] Y. Mu, J. Shen, and S. Yan, "Weakly-supervised hashing in kernel space," in Proc. IEEE Conf. Comput. Vis. Pattern Recognit. (CVPR), San Francisco, CA, USA, 2010, pp. 3344–3351.
[32] B. Kulis, P. Jain, and K. Grauman, "Fast similarity search for learned metrics," IEEE Trans. Pattern Anal. Mach. Intell., vol. 31, no. 12, pp. 2143–2157, Dec. 2009.
[33] Y. Weiss, R. Fergus, and A. Torralba, "Multidimensional spectral hashing," in Proc. 12th Eur. Conf. Comput. Vision (ECCV), Florence, Italy, 2012, pp. 340–353.
[34] Z. Jin, C. Li, Y. Lin, and D. Cai, "Density sensitive hashing," IEEE Trans. Cybern., vol. 44, no. 8, pp. 1362–1371, Aug. 2014.
[35] W. Liu, J. Wang, S. Kumar, and S.-F. Chang, "Hashing with graphs," in Proc. 28th Int. Conf. Mach. Learn. (ICML), Bellevue, WA, USA, 2011, pp. 1–8.
[36] L. Chen, D. Xu, I. W.-H. Tsang, and X. Li, "Spectral embedded hashing for scalable image retrieval," IEEE Trans. Cybern., vol. 44, no. 7, pp. 1180–1190, Jul. 2014.
[37] D. Zhang, F. Wang, and L. Si, "Composite hashing with multiple information sources," in Proc. ACM SIGIR Conf. Res. Develop. Inf. Retrieval, Beijing, China, 2011, pp. 225–234.
[38] A. Cherian, S. Sra, V. Morellas, and N. Papanikolopoulos, "Efficient nearest neighbors via robust sparse hashing," IEEE Trans. Image Process., vol. 23, no. 8, pp. 3646–3655, Aug. 2014.
[39] X. Zhu, L. Zhang, and Z. Huang, "A sparse embedding and least variance encoding approach to hashing," IEEE Trans. Image Process., vol. 23, no. 9, pp. 3737–3750, Sep. 2014.
[40] Z. Jin et al., "Fast and accurate hashing via iterative nearest neighbors expansion," IEEE Trans. Cybern., vol. 44, no. 11, pp. 2167–2177, Nov. 2014.
[41] C. Ding, D. Zhou, X. He, and H. Zha, "R1-PCA: Rotational invariant L1-norm principal component analysis for robust subspace factorization," in Proc. 23rd Int. Conf. Mach. Learn., Pittsburgh, PA, USA, 2006, pp. 281–288.
[42] X. Lu, H. Yuan, P. Yan, Y. Yuan, and X. Li, "Geometry constrained sparse coding for single image super-resolution," in Proc. IEEE Conf. Comput. Vis. Pattern Recognit. (CVPR), Providence, RI, USA, 2012, pp. 1648–1655.
[43] X. Lu, Y. Yuan, and P. Yan, "Image super-resolution via double sparsity regularized manifold learning," IEEE Trans. Circuits Syst. Video Technol., vol. 23, no. 12, pp. 2022–2033, Dec. 2013.
[44] X. Lu, Y. Wang, and Y. Yuan, "Sparse coding from a Bayesian perspective," IEEE Trans. Neural Netw. Learn. Syst., vol. 24, no. 6, pp. 929–939, Jun. 2013.
[45] X. Lu, Y. Yuan, and P. Yuan, "Alternatively constrained dictionary learning for image super-resolution," IEEE Trans. Cybern., vol. 44, no. 3, pp. 366–377, Mar. 2014.


[46] A. Torralba, R. Fergus, and W. T. Freeman, "80 million tiny images: A large data set for nonparametric object and scene recognition," IEEE Trans. Pattern Anal. Mach. Intell., vol. 30, no. 11, pp. 1958–1970, Nov. 2008.
[47] D. G. Lowe, "Object recognition from local scale-invariant features," in Proc. 7th IEEE Int. Conf. Comput. Vision, vol. 2. Kerkyra, Greece, 1999, pp. 1150–1157.
[48] A. Torralba, R. Fergus, and Y. Weiss, "Small codes and large image databases for recognition," in Proc. IEEE Conf. Comput. Vis. Pattern Recognit. (CVPR), Anchorage, AK, USA, 2008, pp. 1–8.

Renzhen Ye is currently pursuing the Ph.D. degree with the Center for Optical Imagery Analysis and Learning, State Key Laboratory of Transient Optics and Photonics, Xi'an Institute of Optics and Precision Mechanics, Chinese Academy of Sciences, Xi'an, China. She is an Associate Professor with the Department of Mathematics, Huazhong Agricultural University, Wuhan, China. Her current research interests include partial differential equations, mathematical mechanization and mathematical physics, and machine learning.

Xuelong Li (M'02–SM'07–F'12) is a Full Professor with the Center for Optical Imagery Analysis and Learning, State Key Laboratory of Transient Optics and Photonics, Xi'an Institute of Optics and Precision Mechanics, Chinese Academy of Sciences, Xi'an, China.
