0% found this document useful (0 votes)

16 views7 pages

Double-Bit Quantization For Hashing: Weihao Kong Wu-Jun Li

Uploaded by

xingyanzhou687

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

0% found this document useful (0 votes)

16 views7 pages

Double-Bit Quantization For Hashing: Weihao Kong Wu-Jun Li

Uploaded by

xingyanzhou687

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

You are on page 1/ 7

Double-Bit Quantization for Hashing

Weihao Kong and Wu-Jun Li

Shanghai Key Laboratory of Scalable Computing and Systems
Department of Computer Science and Engineering, Shanghai Jiao Tong University, China
{kongweihao,liwujun}@cs.sjtu.edu.cn

Abstract massive data sets, and many hashing methods have been pro-
posed by researchers from different research communities.
Hashing, which tries to learn similarity-preserving bi-
nary codes for data representation, has been widely The existing hashing methods can be mainly divided into
used for efficient nearest neighbor search in massive two categories (Gong and Lazebnik 2011; Liu et al. 2011;
databases due to its fast query speed and low storage 2012): data-independent methods and data-dependent meth-
cost. Because it is NP hard to directly compute the best ods.
binary codes for a given data set, mainstream hash- Representative data-independent methods include
ing methods typically adopt a two-stage strategy. In the locality-sensitive hashing (LSH) (Gionis, Indyk, and
first stage, several projected dimensions of real values Motwani 1999; Andoni and Indyk 2008) and its ex-
are generated. Then in the second stage, the real val-
ues will be quantized into binary codes by thresholding.
tensions (Datar et al. 2004; Kulis and Grauman 2009;
Currently, most existing methods use one single bit to Kulis, Jain, and Grauman 2009), and shift invariant kernel
quantize each projected dimension. One problem with hashing (SIKH) (Raginsky and Lazebnik 2009). LSH and
this single-bit quantization (SBQ) is that the threshold its extensions use simple random projections which are
typically lies in the region of the highest point density independent of the training data for hash functions. SIKH
and consequently a lot of neighboring points close to chooses projection vectors similar to those of LSH, but
the threshold will be hashed to totally different bits, SIKH uses a shifted cosine function to generate hash values.
which is unexpected according to the principle of hash- Both LSH and SIKH have an important property that points
ing. In this paper, we propose a novel quantization strat- with high similarity will have high probability to be mapped
egy, called double-bit quantization (DBQ), to solve the to the same hashcodes. Compared with the data-dependent
problem of SBQ. The basic idea of DBQ is to quantize
each projected dimension into double bits with adap-
methods, data-independent methods need longer codes
tively learned thresholds. Extensive experiments on two to achieve satisfactory performance (Gong and Lazebnik
real data sets show that our DBQ strategy can signifi- 2011), which will be less efficient due to the higher storage
cantly outperform traditional SBQ strategy for hashing. and computational cost.
Considering the shortcomings of data-independent meth-
Introduction ods, more and more recent works have focused on data-
dependent methods whose hash functions are learned from
With the explosive growth of data on the Internet, there the training data. Semantic hashing (Salakhutdinov and Hin-
has been increasing interest in approximate nearest neigh- ton 2007; 2009) adopts a deep generative model to learn the
bor (ANN) search in massive data sets. Common approaches hash functions. Spectral hashing (SH) (Weiss, Torralba, and
for efficient ANN search are based on similarity-preserving Fergus 2008) uses spectral graph partitioning for hashing
hashing techniques (Gionis, Indyk, and Motwani 1999; with the graph constructed from the data similarity relation-
Andoni and Indyk 2008) which encode similar points in ships. Binary reconstruction embedding (BRE) (Kulis and
the original space into close binary codes in the hashcode Darrell 2009) learns the hash functions by explicitly min-
space. Most methods use the Hamming distance to measure imizing the reconstruction error between the original dis-
the closeness between points in the hashcode space. By us- tances and the Hamming distances of the corresponding bi-
ing hashing codes, we can achieve constant or sub-linear nary codes. Semi-supervise hashing (SSH) (Wang, Kumar,
search time complexity (Torralba, Fergus, and Weiss 2008; and Chang 2010) exploits some labeled data to help hash
Liu et al. 2011). Furthermore, the storage needed to store the function learning. Selt-taught hashing (Zhang et al. 2010)
binary codes will be greatly decreased. For example, if we uses supervised learning algorithms for hashing based on
encode each point with 128 bits, we can store a data set of 1 self-labeled data. Composite hashing (Zhang, Wang, and
million points with only 16M memory. Hence, hashing pro- Si 2011) integrates multiple information sources for hash-
vides a very effective and efficient way to perform ANN for ing. Minimal loss hashing (MLH) (Norouzi and Fleet 2011)
Copyright c 2012, Association for the Advancement of Artificial formulates the hashing problem as a structured prediction
Intelligence (www.aaai.org). All rights reserved. problem. Both accuracy and time are jointly optimized to
learn the hash functions in (He et al. 2011). One of the Problem Definition
most recent data-dependent methods is iterative quantization Given a set of n data points S = {x1 , x2 , · · · , xn } with
(ITQ) (Gong and Lazebnik 2011) which finds an orthogonal xi ∈ Rd , the goal of hashing is to learn a mapping
rotation matrix to refine the initial projection matrix learned to encode point xi with a binary string yi ∈ {0, 1}c ,
by principal component analysis (PCA) so that the quanti- where c denotes the code size. To achieve the similarity-
zation error of mapping the data to the vertices of binary preserving property, we require close points in the original
hypercube is minimized. It outperforms most other state-of- space Rd to have similar binary codes in the code space
the-art methods with relatively short codes. {0, 1}c . To get the c-bit codes, we need c binary hash func-
Because it is NP hard to directly compute the best bi- tions {hk (·)}ck=1 . Then the binary code can be computed
nary codes for a given data set (Weiss, Torralba, and Fer- as yi = [h1 (xi ), h2 (xi ), · · · , hc (xi )]T . Most hashing algo-
gus 2008), both data-independent and data-dependent hash- rithms adopt the following two-stage strategy:
ing methods typically adopt a two-stage strategy. In the first
stage, several projected dimensions of real values are gener- • In the first stage, c real-valued functions {fk (·)}ck=1
ated. Then in the second stage, the real values will be quan- are used to generate an intermediate vector
tized into binary codes by thresholding. Currently, most ex- zi = [f1 (xi ), f2 (xi ), · · · , fc (xi )]T , where zi ∈ Rc .
isting methods use one single bit to quantize each projected These real-valued functions are often called
dimension. More specifically, given a point x, each projected projection functions (Andoni and Indyk 2008;
dimension i will be associated with a real-valued projection Wang, Kumar, and Chang 2010; Gong and Lazeb-
function fi (x). The ith hash bit of x will be 1 if fi (x) ≥ θ. nik 2011), and each function corresponds to one of the c
Otherwise, it will be 0. One problem with this single-bit projected dimensions;
quantization (SBQ) is that the threshold θ (0 for most cases • In the second stage, the real-valued vector zi is encoded
if the data are zero centered) typically lies in the region of into binary vector yi , typically by thresholding. When the
the highest point density and consequently a lot of neighbor- data have been normalized to have zero mean which is
ing points close to the threshold might be hashed to totally adopted by most methods, a common encoding approach
different bits, which is unexpected according to the princi- is to use function sgn(x), where sgn(x) = 1 if x ≥ 0 and
ple of hashing. Figure 1 illustrates an example distribution 0 otherwise. For a matrix or a vector, sgn(·) will denote
of the real values before thresholding which is computed the result of element-wise application of the above func-
by PCA1 . We can find that point “B” and point “C” in Fig- tion. Hence, let yi = sgn(zi ), we can get the binary code
ure 1(a) which lie in the region of the highest density will be of xi . This also means that hk (xi ) = sgn(fk (xi )).
quantized into 0 and 1 respectively although they are very
We can see that the above sgn(·) function actually quan-
close to each other. On the contrary, point “A” and point “B”
tizes each projected dimension into one single bit with the
will be quantized into the same code 0 although they are far
threshold 0. As stated in the Introduction section, this SBQ
away from each other. Because a lot of points will lie close
strategy is unreasonable, which motivates the DBQ work of
to the threshold, it is very unreasonable to adopt this kind of
this paper.
SBQ strategy for hashing.
To the best of our knowledge, only one existing method,
called AGH (Liu et al. 2011), has found this problem of
Double-Bit Quantization
SBQ and proposed a new quantization method called hierar- This section will introduce our double-bit quantization
chical hashing (HH) to solve it. The basic idea of HH is to (DBQ) in detail. First, we will describe the motivation of
use three thresholds to divide the real values of each dimen- DBQ based on observation from real data. Then, the adap-
sion into four regions, and encode each dimension with dou- tive threshold learning algorithm for DBQ will be proposed.
ble bits. However, for any projected dimension, the Ham- Finally, we will do some qualitative analysis and discussion
ming distance between the two farthest points is the same about the performance of DBQ.
as that between two relatively close points, which is unrea-
sonable. Furthermore, although the HH strategy can achieve Observation and Motivation
very promising performance when combined with AGH pro- Figure 1 illustrates the point distribution (histogram) of the
jection functions (Liu et al. 2011), it is still unclear whether real values before thresholding on one of the projected di-
HH will be truly better than SBQ when it is combined with mensions computed by PCA on 22K LabelMe data set (Tor-
other projection functions. ralba, Fergus, and Weiss 2008) which will be used in our
In this paper, we clearly claim that using double bits experiments. It clearly reveals that the point density is high-
with adaptively learned thresholds to quantize each pro- est near the mean, which is zero here. Note that unless oth-
jected dimension can completely solve the problem of erwise stated, we assume the data have been normalized to
SBQ. The result is our novel quantization strategy called have zero mean, which is a typical choice by existing meth-
double-bit quantization (DBQ). Extensive experiments on ods.
real data sets demonstrate that our DBQ can significantly The popular coding strategy SBQ which adopts zero as
outperform SBQ and HH. the threshold is shown in Figure 1(a). Due to the threshold-
ing, the intrinsic neighboring structure in the original space
1 is destroyed. For example, points A, B, C, and D are four
Distribution of real-valued points computed by other hashing
methods, such as SH and ITQ, are similar. points sampled from the X-axis of the point distribution
2000 learning algorithm to push the thresholds far way from the
dense regions, and solve the problems of SBQ and HH.
Sample Numbe
1500

1000
Adaptive Threshold Learning
500
Now we describe how to adaptively learn the optimal thresh-
0
í1.5 í1 í0.5 0 0.5 1 1.5
olds from data. To find the reasonable thresholds, we want
X the neighboring structure in the original space to be kept as
(a) A 0 BC 1 D much as possible. The equivalent goal is to make the points
01 10 11
in each region as similar as possible.
(b) 00
Let a denote the left threshold, b denote the right threshold
(c) 01 00 10 and a < b, S denote real values of the whole point set on one
projected dimension, S1 , S2 , S3 denote the subsets divided
Figure 1: Point distribution of the real values computed by by the thresholds, i.e., S1 = {x| − ∞ < x ≤ a, x ∈ S},
PCA on 22K LabelMe data set, and different coding results S2 = {x|a < x ≤ b, x ∈ S}, S3 = {x|b < x < ∞, x ∈ S}.
based on the distribution: (a) single-bit quantization (SBQ); Our goal is to find a and b to minimize the following objec-
(b) hierarchical hashing (HH); (c) double-bit quantization tive function:
(DBQ).
X X X
E= (x − µ1 )2 + (x − µ2 )2 + (x − µ3 )2 ,
x∈S1 x∈S2 x∈S3

graph. After SBQ, points A and B, two distant and almost where µi is the mean of the points in Si .
irrelevant points, receive the same code 0 in this dimension. As we have discussed above, cutting off right on 0 is not
However, B and C, two points which are extremely close in a wise way as the densest region is right there. So we set µ2
the original space, get totally different codes (0 for B, and 1 to be 0, which means that a < 0 and b > 0. Then E can be
for C). Because the threshold zero lies in the densest region, calculated as follows:
the occurrence probability of the cases like B and C is very
high. Hence, it is obvious that SBQ is not very reasonable
X X X X X
E= x2 − 2 xµ1 + µ21 − 2 xµ3 + µ23
for coding.
x∈S x∈S1 x∈S1 x∈S3 x∈S3
The HH strategy (Liu et al. 2011) is shown in Figure 1(b). X
2
Besides the threshold zero which has been shown to be = x − |S1 |µ21 − |S3 |µ23
a bad choice, HH uses two other thresholds to divide the x∈S
whole dimension into four regions, and encode each re- X ( x∈S1 x)2
P
( x∈S3 x)2
P
2
gion with double bits. Note that the thresholds are shown = x − − ,
in vertical lines in Figure 1(b). If we use d(A, B) to de- |S1 | |S3 |
x∈S
note the Hamming distance between A and B, we can
where |S| denotes the number of elements in set S.
find that d(A, D) = d(A, B) = d(C, D) = d(B, C) = 1
Because x∈S x2 is a constant, minimizing E equals to
P
for HH, which is obviously not reasonable.
maximizing:
In fact, if we adopt double bits to encode four regions
( x∈S1 x)2 ( x∈S3 x)2
P P
like those in Figure 1(b), the neighboring structure will
be destroyed no matter how we encode the four regions. F = +
|S1 | |S3 |
That is to say, no matter how we assign the four codes
(‘00’,‘01’,‘10’,‘11’) to the four regions, we cannot get any subject to : µ2 = 0.
result which can preserve the neighboring structure. This re- Algorithm 1 outlines the procedure to learn the thresholds,
sult is caused by the limitation of Hamming distance. More where sum(S) denotes the summation of all points in set S.
specifically, the largest Hamming distance between 2-bit The basic idea of our algorithm is to expand S2 from
codes is 2. However, to keep the relative distances between empty set by moving one point from either S1 or S3 each
4 different points, the largest Hamming distance should time while simultaneously keeping sum(S2 ) close to 0. Af-
be at least 3. Hence, no matter how we choose the 2-bit ter all the elements in initial S1 and S3 have been moved
codes for the four regions, we cannot get any neighborhood- to S2 , all possible candidate thresholds (points in S) have
preserving result. been checked, and those achieving the largest F have been
In this paper, DBQ is proposed to preserve the neighbor- recorded in a and b. After we have sorted the points in the
ing structure by omitting the ‘11’ code, which is shown in initial S1 and S3 , the while loop is just an one-time scan of
Figure 1(c). More specifically, we find two thresholds which all the points, and hence the total number of operations in
do not lie in the densest region to divide the dimension the while loop is just n where n is the number of points in
into three regions, and then use double bits to code. With S. Each operation is of constant time complexity if we keep
our DBQ code, d(A, D) = 2, d(A, B) = d(C, D) = 1, and sum(S1 ), sum(S2 ), sum(S3 ) in memory. Hence, the most
d(B, C) = 0, which is obviously reasonable to preserve the time-consuming part in Algorithm 1 is to sort the initial S1
similarity relationships in the original space. Please note that and S3 , the time complexity of which is O(n log n).
the neighboring structure near the thresholds can still be de- After we have learned a and b, we can use them to divide
stroyed in DBQ. But we can design some adaptive threshold the whole set into S1 , S2 and S3 , and then use the DBQ in
Algorithm 1 The algorithm to adaptively learn the thresh- r will be 1%, 0.01%, and 10−16 , respectively. Much less en-
olds for DBQ. tries will gain higher collision probability (improve recall),
Input: The whole point set S faster query speed and less storage. In fact, the number of
Initialize with possible entries of c bits of DBQ code only equals to that of
c log2 3
S1 ← {x| − ∞ < x ≤ 0, x ∈ S} 2 ≈ 0.79c bits of ordinary SBQ or HH code.
S2 ← ∅
S3 ← {x|0 < x < +∞, x ∈ S} Experiment
max ← 0
sort the points in S1
Data Sets
sort the points in S3 We evaluate our methods on two widely used data sets,
while S1 6= ∅ or S3 6= ∅ do CIFAR (Krizhevsky 2009) and LabelMe (Torralba, Fergus,
if sum(S2 ) ≤ 0 then and Weiss 2008).
move the smallest point in S3 to S2 The CIFAR data set (Krizhevsky 2009) includes different
else versions. The version we use is CIFAR-10, which consists of
move the largest point in S1 to S2 60,000 images. These images are manually labeled into 10
end if classes, which are airplane, automobile, bird, cat, deer, dog,
compute F frog, horse, ship, and truck. The size of each image is 32×32
if F > max then pixels. We represent them with 512-dimensional gray-scale
set a to be the largest point in S1 , and b to be the GIST descriptors (Oliva and Torralba 2001).
largest point in S2 The second data set is 22K LabelMe used in (Torralba,
max ← F Fergus, and Weiss 2008; Norouzi and Fleet 2011). It con-
end if sists of 22,019 images. We represent each image with 512-
end while dimensional GIST descriptors (Oliva and Torralba 2001).

Evaluation Protocols and Baseline Methods

Figure 1(c) to quantize the points in these subsets into 01, As the protocols widely used in recent papers (Weiss,
00, 10, respectively. Torralba, and Fergus 2008; Raginsky and Lazebnik 2009;
Gong and Lazebnik 2011), Euclidean neighbors in the orig-
Discussion inal space are considered as ground truth. More specifi-
cally, a threshold of the average distance to the 50th nearest
Let us analyze the expected performance of our DBQ. The neighbor is used to define whether a point is a true positive
first advantage of DBQ is about accuracy. From Figure 1 or not. Based on the Euclidean ground truth, we compute
and the corresponding analysis, it is expected that DBQ will the precision-recall curve and the mean average precision
achieve better accuracy than SBQ and HH because DBQ can (mAP) (Liu et al. 2011; Gong and Lazebnik 2011). For all
better preserve the similarity relationships between points. experiments, we randomly select 1000 points as queries, and
This will be verified by our experimental results. leave the rest as training set to learn the hash functions. All
The second advantage of DBQ is on time complexity, in- the experimental results are averaged over 10 random train-
cluding both coding (training) time and query time. For a c- ing/test partitions.
bit DBQ, we only need to generate c/2 projected dimensions Our DBQ can be used to replace the SBQ stage for any
while SBQ need c projected dimensions. For most meth- existing hashing method to get a new version. In this paper,
ods, the projection step is the most time-consuming step. we just select the most representative methods for evalua-
Although some extra cost is needed to adaptively learn the tion, which contain SH (Weiss, Torralba, and Fergus 2008),
thresholds for DBQ, this extra computation is just sorting PCA (Gong and Lazebnik 2011), ITQ (Gong and Lazebnik
and an one-time scan of the training points which are actu- 2011), LSH (Andoni and Indyk 2008), and SIKH (Raginsky
ally very fast. Hence, to get the same size of code, the over- and Lazebnik 2009). SH, PCA, and ITQ are data-dependent
all coding time complexity of DBQ will be lower than SBQ, methods, while LSH and SIKH are data-independent meth-
which will also be verified in our experiment. ods. By adopting different quantization methods, we can
As for the query time, similar to coding time, the num- get different versions of a specific hashing method. Let’s
ber of projection operations for DBQ is only half of that for take SH as an example. “SH-SBQ” denotes the original SH
SBQ. Hence, it is expected that the query speed of DBQ will method based on single-bit quantization, “SH-HH” denotes
be faster than SBQ. The query time of DBQ is similar to that the combination of SH projection with HH quantization (Liu
of HH because HH also uses half of the number of projection et al. 2011), and “SH-DBQ” denotes the method combining
operations for SBQ. Furthermore, if we use inverted index the SH projection with double-bit quantization. Please note
or hash table to accelerate searching, ordinary c-bit SBQ or that threshold optimization techniques in (Liu et al. 2011)
HH coding would have 2c possible entries in the hash table. for the two non-zero thresholds in HH can not be used for
As DBQ does not allow code ‘11’, the possible number of the above five methods. In our experiments, we just use the
entries for DBQ is 3c/2 . This difference is actually very sig- same thresholds as those in DBQ. For all the evaluated meth-
nificant. Let r denote the ratio between the number of entries ods, we use the source codes provided by the authors. For
for DBQ and that for SBQ (or HH). When c = 32, 64, 256, ITQ, we set the iteration number to be 100. To run SIKH, we
Table 1: mAP on LabelMe data set. The best mAP among SBQ, HH and DBQ under the same setting is shown in bold face.
# bits 32 64 128 256
SBQ HH DBQ SBQ HH DBQ SBQ HH DBQ SBQ HH DBQ
ITQ 0.2926 0.2592 0.3079 0.3413 0.3487 0.4002 0.3675 0.4032 0.4650 0.3846 0.4251 0.4998
SH 0.0859 0.1329 0.1815 0.1071 0.1768 0.2649 0.1730 0.2034 0.3403 0.2140 0.2468 0.3468
PCA 0.0535 0.1009 0.1563 0.0417 0.1034 0.1822 0.0323 0.1083 0.1748 0.0245 0.1103 0.1499
LSH 0.1657 0.105 0.12272 0.2594 0.2089 0.2577 0.3579 0.3311 0.4055 0.4158 0.4359 0.5154
SIKH 0.0590 0.0712 0.0772 0.1132 0.1514 0.1737 0.2792 0.3147 0.3436 0.4759 0.5055 0.5325

use a Gaussian kernel and set the bandwidth to the average that our method can achieve the best performance with code
distance of the 50th nearest neighbor, which is the same as size larger than 64, the overall performance of DBQ is still
that in (Raginsky and Lazebnik 2009). All experiments are the best under most settings with small code size such as the
conducted on our workstation with Intel(R) Xeon(R) CPU case of 32 bits.
[email protected] and 64G memory. Figure 2, Figure 3 and Figure 4 show precision-recall
curves for ITQ, SH and LSH with different code sizes on
Accuracy the 22K LabelMe data set. The relative performance among
Table 1 and Table 2 show the mAP results for different SBQ, HH, and DBQ in the precision-recall curves for PCA
methods with different code sizes on 22K LabelMe and and SIKH is similar to that for ITQ. We do not show these
CIFAR-10, respectively. Each entry in the Tables denotes the curves due to space limitation. From Figure 2, Figure 3 and
mAP of a combination of a hashing method with a quan- Figure 4, it is clear that our DBQ method significantly out-
tization method under a specific code size. For example, performs SBQ and HH under most settings.
the value “0.2926” in the upper left corner of Table 1 de-
Computational Cost
notes the mAP of ITQ-SBQ with the code size 32. The best
mAP among SBQ, HH and DBQ under the same setting Table 3 shows the training time on CIFAR-10. Although
is shown in bold face. For example, in Table 1, when the some extra cost is needed to adaptively learn the thresholds
code size is 32 and the hashing method is ITQ, the mAP for DBQ, this extra computation is actually very fast. Be-
of DBQ (0.3079) is the best one compared with those of cause the number of projected dimensions for DBQ is only
SBQ (0.2926) and HH (0.2592). Hence, the value 0.3079 half of that for SBQ, the training of DBQ is still faster than
will be in bold face. From Table 1 and Table 2, we can find SBQ. This can be seen from Table 3. For query time, DBQ is
that when the code size is small, the performance of data- also faster than SBQ, which has been analyzed above. Due
independent methods LSH and SIKH is relatively poor and to space limitation, we omit the detailed query time compar-
ITQ achieves the best performance under most settings es- ison here.
pecially for those with small code size, which verifies the Table 3: Training time on CIFAR-10 date set (in seconds).
claims made in existing work (Raginsky and Lazebnik 2009; # bits 32 64 256
Gong and Lazebnik 2011). This also indicates that our im- SBQ DBQ SBQ DBQ SBQ DBQ
plementations are correct. ITQ 14.48 8.46 29.95 14.12 254.14 80.09
Our DBQ method achieves the best performance under SIKH 1.76 1.46 2.00 1.57 4.55 2.87
most settings, and it outperforms HH under all settings ex- LSH 0.30 0.19 0.53 0.30 1.80 0.95
cept the ITQ with 32 bits on the CIFAR-10 data set. This im- SH 5.60 3.74 11.72 5.57 133.50 37.57
plies that our DBQ with adaptively learned thresholds is very PCA 4.03 3.92 4.31 3.99 5.86 4.55
effective. The exceptional settings where our DBQ method
is outperformed by SBQ are LSH and ITQ with code size
smaller than 64. One possible reason might be from the fact
that the c-bit code in DBQ only utilizes c/2 projected di- Conclusion
mensions while c projected dimensions are utilized in SBQ. The SBQ strategy adopted by most existing hashing methods
When the code size is too small, the useful information for will destroy the neighboring structure in the original space,
hashing is also very weak, especially for data-independent which violates the principle of hashing. In this paper, we
methods like LSH. Hence, even if our DBQ can find the best propose a novel quantization strategy called DBQ to effec-
way to encode, the limited information kept in the projected tively preserve the neighboring structure among data. Exten-
dimensions can not guarantee a good performance. Fortu- sive experiments on real data sets demonstrate that our DBQ
nately, when the code size is 64, the worst performance of can achieve much better accuracy with lower computational
DBQ is still comparable with that of SBQ. When the code cost than SBQ.
size is 128 or larger, the performance of DBQ will signifi-
cantly outperform SBQ under any setting. As stated in the Acknowledgments
Introduction session, the storage cost is still very low when This work is supported by the NSFC (No. 61100125) and the 863
the code size is 128. Hence, the setting with code size 128 Program of China (No. 2011AA01A202, No. 2012AA011003). We
can be seen as a good tradeoff between accuracy and stor- thank Yunchao Gong and Wei Liu for sharing their codes and pro-
age cost in real systems. Please note that although we argue viding useful help for our experiments.
Table 2: mAP on CIFAR-10 data set. The best mAP among SBQ, HH and DBQ under the same setting is shown in bold face.
# bits 32 64 128 256
SBQ HH DBQ SBQ HH DBQ SBQ HH DBQ SBQ HH DBQ
ITQ 0.2716 0.2240 0.2220 0.3293 0.3006 0.3350 0.3593 0.3826 0.4395 0.3727 0.4140 0.5221
SH 0.0511 0.0742 0.1114 0.0638 0.0936 0.1717 0.0998 0.1209 0.2501 0.1324 0.1697 0.3337
PCA 0.0357 0.0646 0.1072 0.0311 0.0733 0.1541 0.0261 0.0835 0.1966 0.0217 0.1127 0.2053
LSH 0.1192 0.0665 0.0660 0.1882 0.1457 0.1588 0.2837 0.2601 0.3153 0.3480 0.3640 0.4680
SIKH 0.0417 0.0359 0.0466 0.0953 0.0911 0.1063 0.1836 0.1969 0.2263 0.3677 0.3601 0.3975

1 1 1 1
ITQ−SBQ ITQ−SBQ ITQ−SBQ ITQ−SBQ
ITQ−HH ITQ−HH ITQ−HH ITQ−HH
0.8 ITQ−DBQ 0.8 ITQ−DBQ 0.8 ITQ−DBQ 0.8 ITQ−DBQ
Precision

Precision

Precision
0.6 0.6 0.6 0.6

0.4 0.4 0.4 0.4

0.2 0.2 0.2 0.2

0 0 0 0
0 0.2 0.4 0.6 0.8 1 0 0.2 0.4 0.6 0.8 1 0 0.2 0.4 0.6 0.8 1 0 0.2 0.4 0.6 0.8 1
Recall Recall Recall Recall

(a) ITQ 32 bits (b) ITQ 64 bits (c) ITQ 128 bits (d) ITQ 256 bits

Figure 2: Precision-recall curve for ITQ on 22K LabelMe data set

1 1 1 1
SH−SBQ SH−SBQ SH−SBQ SH−SBQ
SH−HH SH−HH SH−HH SH−HH
0.8 SH−DBQ 0.8 SH−DBQ 0.8 SH−DBQ 0.8 SH−DBQ
Precision

Precision

Precision
0.6 0.6 0.6 0.6

0.4 0.4 0.4 0.4

0.2 0.2 0.2 0.2

0 0 0 0
0 0.2 0.4 0.6 0.8 1 0 0.2 0.4 0.6 0.8 1 0 0.2 0.4 0.6 0.8 1 0 0.2 0.4 0.6 0.8 1
Recall Recall Recall Recall

(a) SH 32 bits (b) SH 64 bits (c) SH 128 bits (d) SH 256 bits

Figure 3: Precision-recall curve for SH on 22K LabelMe data set

1 1 1 1
LSH−SBQ LSH−SBQ LSH−SBQ LSH−SBQ
LSH−HH LSH−HH LSH−HH LSH−HH
0.8 LSH−DBQ 0.8 LSH−DBQ 0.8 LSH−DBQ 0.8 LSH−DBQ
Precision

Precision

0.6 0.6 0.6 0.6

0.4 0.4 0.4 0.4

0.2 0.2 0.2 0.2

0 0 0 0
0 0.2 0.4 0.6 0.8 1 0 0.2 0.4 0.6 0.8 1 0 0.2 0.4 0.6 0.8 1 0 0.2 0.4 0.6 0.8 1
Recall Recall Recall Recall

(a) LSH 32 bits (b) LSH 64 bits (c) LSH 128 bits (d) LSH 256 bits

Figure 4: Precision-recall curve for LSH on 22K LabelMe data set

References Weiss, Y.; Torralba, A.; and Fergus, R. 2008. Spectral hash-
Andoni, A., and Indyk, P. 2008. Near-optimal hashing algo- ing. In Proceedings of Neural Information Processing Sys-
rithms for approximate nearest neighbor in high dimensions. tems.
Commun. ACM 51(1):117–122. Zhang, D.; Wang, J.; Cai, D.; and Lu, J. 2010. Self-taught
Datar, M.; Immorlica, N.; Indyk, P.; and Mirrokni, V. S. hashing for fast similarity search. In Proceedings of Inter-
2004. Locality-sensitive hashing scheme based on p-stable national ACM SIGIR Conference on Research and Develop-
distributions. In Proceedings of the ACM Symposium on ment in Information Retrieval.
Computational Geometry. Zhang, D.; Wang, F.; and Si, L. 2011. Composite hash-
Gionis, A.; Indyk, P.; and Motwani, R. 1999. Similarity ing with multiple information sources. In Proceedings of
search in high dimensions via hashing. In Proceedings of International ACM SIGIR Conference on Research and De-
International Conference on Very Large Data Bases. velopment in Information Retrieval.
Gong, Y., and Lazebnik, S. 2011. Iterative quantization: A
procrustean approach to learning binary codes. In Proceed-
ings of Computer Vision and Pattern Recognition.
He, J.; Radhakrishnan, R.; Chang, S.-F.; and Bauer, C. 2011.
Compact hashing with joint optimization of search accuracy
and time. In Proceedings of Computer Vision and Pattern
Recognition.
Krizhevsky, A. 2009. Learning multiple layers of features
from tiny images. Tech report, University of Toronto.
Kulis, B., and Darrell, T. 2009. Learning to hash with bi-
nary reconstructive embeddings. In Proceedings of Neural
Information Processing Systems.
Kulis, B., and Grauman, K. 2009. Kernelized locality-
sensitive hashing for scalable image search. In Proceedings
of International Conference on Computer Vision.
Kulis, B.; Jain, P.; and Grauman, K. 2009. Fast similarity
search for learned metrics. IEEE Trans. Pattern Anal. Mach.
Intell. 31(12):2143–2157.
Liu, W.; Wang, J.; Kumar, S.; and Chang, S.-F. 2011. Hash-
ing with graphs. In Proceedings of International Conference
on Machine Learning.
Liu, W.; Wang, J.; Ji, R.; Jiang, Y.-G.; and Chang, S.-F.
2012. Supervised hashing with kernels. In Proceedings of
Computer Vision and Pattern Recognition.
Norouzi, M., and Fleet, D. J. 2011. Minimal loss hashing
for compact binary codes. In Proceedings of International
Conference on Machine Learning.
Oliva, A., and Torralba, A. 2001. Modeling the shape of
the scene: A holistic representation of the spatial envelope.
International Journal of Computer Vision 42(3):145–175.
Raginsky, M., and Lazebnik, S. 2009. Locality-sensitive
binary codes from shift-invariant kernels. In Proceedings of
Neural Information Processing Systems.
Salakhutdinov, R., and Hinton, G. 2007. Semantic Hashing.
In SIGIR workshop on Information Retrieval and applica-
tions of Graphical Models.
Salakhutdinov, R., and Hinton, G. E. 2009. Semantic hash-
ing. Int. J. Approx. Reasoning 50(7):969–978.
Torralba, A.; Fergus, R.; and Weiss, Y. 2008. Small codes
and large image databases for recognition. In Proceedings
of Computer Vision and Pattern Recognition.
Wang, J.; Kumar, S.; and Chang, S.-F. 2010. Sequential
projection learning for hashing with compact codes. In Pro-
ceedings of International Conference on Machine Learning.

Class Actvity 1 Answers
55% (11)
Class Actvity 1 Answers
10 pages
Learning Hash Functions Using Column Generation: Xi Li Guosheng Lin Chunhua Shen Anton Van Den Hengel Anthony Dick
No ratings yet
Learning Hash Functions Using Column Generation: Xi Li Guosheng Lin Chunhua Shen Anton Van Den Hengel Anthony Dick
9 pages
SSDH
No ratings yet
SSDH
7 pages
Binary Hashing For Approximate Nearest Neighbor Search On Big Data A Survey
No ratings yet
Binary Hashing For Approximate Nearest Neighbor Search On Big Data A Survey
16 pages
A Survey On Learning To Hash
No ratings yet
A Survey On Learning To Hash
22 pages
Nearest Neighbor Retrieval Using Distance-Based Hashing
No ratings yet
Nearest Neighbor Retrieval Using Distance-Based Hashing
10 pages
Bit Reduction For Locality-Sensitive Hashing
No ratings yet
Bit Reduction For Locality-Sensitive Hashing
12 pages
Locality-Sensitive Hashing Scheme Based On P-Stable Distributions
No ratings yet
Locality-Sensitive Hashing Scheme Based On P-Stable Distributions
10 pages
Learning To Hash For Indexing Big Data - A Survey
No ratings yet
Learning To Hash For Indexing Big Data - A Survey
22 pages
CSQ - Yuan Central Similarity Quantization For Efficient Image and Video Retrieval CVPR 2020 Paper
No ratings yet
CSQ - Yuan Central Similarity Quantization For Efficient Image and Video Retrieval CVPR 2020 Paper
10 pages
Lshanalysis Preprint
No ratings yet
Lshanalysis Preprint
12 pages
Online Hashing: Long-Kai Huang, Qiang Yang, and Wei-Shi Zheng
No ratings yet
Online Hashing: Long-Kai Huang, Qiang Yang, and Wei-Shi Zheng
14 pages
CAIM: Cerca I Anàlisi D'informació Massiva: FIB, Grau en Enginyeria Informàtica
No ratings yet
CAIM: Cerca I Anàlisi D'informació Massiva: FIB, Grau en Enginyeria Informàtica
19 pages
Composite Hashing With Multiple Information Sources: Dan Zhang Fei Wang Luo Si
No ratings yet
Composite Hashing With Multiple Information Sources: Dan Zhang Fei Wang Luo Si
10 pages
Locality-Sensitive Hashing Scheme Based On P-Stable Distributions
No ratings yet
Locality-Sensitive Hashing Scheme Based On P-Stable Distributions
10 pages
Near-Optimal Hashing Algorithms For Approximate Near (Est) Neighbor Problem
No ratings yet
Near-Optimal Hashing Algorithms For Approximate Near (Est) Neighbor Problem
31 pages
24 SimilaritySearch
No ratings yet
24 SimilaritySearch
52 pages
Toc CS246 PRK
No ratings yet
Toc CS246 PRK
17 pages
Tutti Gli Articoli
No ratings yet
Tutti Gli Articoli
140 pages
PSLSH
No ratings yet
PSLSH
10 pages
1 s2.0 S0167865510001169 Main
No ratings yet
1 s2.0 S0167865510001169 Main
11 pages
CS246 Hw1
No ratings yet
CS246 Hw1
5 pages
Learning To Hash With Binary Deep Neural Network: October 2016
No ratings yet
Learning To Hash With Binary Deep Neural Network: October 2016
17 pages
p117 Andoni
No ratings yet
p117 Andoni
6 pages
Locality-Sensitive Binary Codes From Shift-Invariant Kernels
No ratings yet
Locality-Sensitive Binary Codes From Shift-Invariant Kernels
9 pages
Efficient Distributed Locality Sensitive Hashing: Bahman Bahmani Ashish Goel Rajendra Shinde
No ratings yet
Efficient Distributed Locality Sensitive Hashing: Bahman Bahmani Ashish Goel Rajendra Shinde
5 pages
Iterative Quantization: A Procrustean Approach To Learning Binary Codes
No ratings yet
Iterative Quantization: A Procrustean Approach To Learning Binary Codes
8 pages
Compact Structure Hashing Via Sparse and Similarity Preserving Embedding
No ratings yet
Compact Structure Hashing Via Sparse and Similarity Preserving Embedding
12 pages
Theory of Locality Sensitive Hashing - CS246 Stanford (Slides)
No ratings yet
Theory of Locality Sensitive Hashing - CS246 Stanford (Slides)
52 pages
Product Quantization For Nearest Neighbor Search
No ratings yet
Product Quantization For Nearest Neighbor Search
13 pages
L3: Finding Similar Items: Locality Sensitive Hashing
No ratings yet
L3: Finding Similar Items: Locality Sensitive Hashing
54 pages
Locality Sensitive Hashing Towards Data Science
No ratings yet
Locality Sensitive Hashing Towards Data Science
16 pages
Hashing For Similarity Search: A Survey: Jingdong Wang, Heng Tao Shen, Jingkuan Song, and Jianqiu Ji
No ratings yet
Hashing For Similarity Search: A Survey: Jingdong Wang, Heng Tao Shen, Jingkuan Song, and Jianqiu Ji
29 pages
Asymmetric Distances For Binary Embeddings
No ratings yet
Asymmetric Distances For Binary Embeddings
8 pages
Fast Exact Search in Hamming Space With Multi-Index Hashing: Mohammad Norouzi, Ali Punjani, David J. Fleet
No ratings yet
Fast Exact Search in Hamming Space With Multi-Index Hashing: Mohammad Norouzi, Ali Punjani, David J. Fleet
14 pages
(IJCST-V12I2P10) :CH. Nikitha Reddy, P.V.Shilohini Angel, P. Hrithika Malkan, V. Nikitha, Mr.K. Anil Kumar
No ratings yet
(IJCST-V12I2P10) :CH. Nikitha Reddy, P.V.Shilohini Angel, P. Hrithika Malkan, V. Nikitha, Mr.K. Anil Kumar
4 pages
Efficient Filtering With Sketches in The Ferret Toolkit: Qin LV, William Josephson, Zhe Wang, Moses Charikar and Kai Li
No ratings yet
Efficient Filtering With Sketches in The Ferret Toolkit: Qin LV, William Josephson, Zhe Wang, Moses Charikar and Kai Li
10 pages
Big Data Unit II
No ratings yet
Big Data Unit II
23 pages
Fast and Exact Fixed-Radius Neighbor Search Based On Sorting
No ratings yet
Fast and Exact Fixed-Radius Neighbor Search Based On Sorting
17 pages
Nips 11
No ratings yet
Nips 11
9 pages
Wang 等 - 2019 - A Memory-Efficient Sketch Method for Estimating Hi
No ratings yet
Wang 等 - 2019 - A Memory-Efficient Sketch Method for Estimating Hi
10 pages
Binary Code Ranking With Weighted Hamming Distance
No ratings yet
Binary Code Ranking With Weighted Hamming Distance
8 pages
39 Remotesensing 13 02924 v2
No ratings yet
39 Remotesensing 13 02924 v2
16 pages
Similarity 1
No ratings yet
Similarity 1
53 pages
Principles of Hash-Based Text Retrieval.
100% (1)
Principles of Hash-Based Text Retrieval.
8 pages
LSH Lecture
No ratings yet
LSH Lecture
101 pages
Online Product Quantization
No ratings yet
Online Product Quantization
18 pages
Randomized Algorithms Notes
No ratings yet
Randomized Algorithms Notes
13 pages
DM LSH en PF
No ratings yet
DM LSH en PF
31 pages
Attribute Discovery Via Predictable Discriminative Binary Codes
No ratings yet
Attribute Discovery Via Predictable Discriminative Binary Codes
14 pages
Chapter 5
No ratings yet
Chapter 5
53 pages
UNIT 2 Bigdata Mining and Analytics
No ratings yet
UNIT 2 Bigdata Mining and Analytics
18 pages
Finding Similar Items
No ratings yet
Finding Similar Items
85 pages
Probabilistic Data Structures
No ratings yet
Probabilistic Data Structures
26 pages
Maximally Consistent Sampling and The Jaccard Index of Probability Distributions - 2018 (1809.04052)
No ratings yet
Maximally Consistent Sampling and The Jaccard Index of Probability Distributions - 2018 (1809.04052)
11 pages
CS246: Mining Massive Datasets Jure Leskovec,: Stanford University
No ratings yet
CS246: Mining Massive Datasets Jure Leskovec,: Stanford University
58 pages
Online Product Quantization: Donna Xu, Ivor W. Tsang, and Ying Zhang
No ratings yet
Online Product Quantization: Donna Xu, Ivor W. Tsang, and Ying Zhang
14 pages
Locality-Sensitive Hashing
No ratings yet
Locality-Sensitive Hashing
10 pages
Anchorhash: A Scalable Consistent Hash
No ratings yet
Anchorhash: A Scalable Consistent Hash
12 pages
E Cient Histogram-Based Similarity Search in Ultra-High Dimensional Space
No ratings yet
E Cient Histogram-Based Similarity Search in Ultra-High Dimensional Space
15 pages
Geometric Hashing: Efficient Algorithms for Image Recognition and Matching
From Everand
Geometric Hashing: Efficient Algorithms for Image Recognition and Matching
Fouad Sabry
No ratings yet
Harrison's Rheumatology, 2nd Edition Scribd Download
100% (13)
Harrison's Rheumatology, 2nd Edition Scribd Download
15 pages
The Travelers Property Casualty Co. v. Saint-Gobain Technical Fabrics Canada Ltd.
No ratings yet
The Travelers Property Casualty Co. v. Saint-Gobain Technical Fabrics Canada Ltd.
11 pages
Above SSC Applicationform
No ratings yet
Above SSC Applicationform
1 page
Key To Corrections - LEVEL 2 MODULE 3
No ratings yet
Key To Corrections - LEVEL 2 MODULE 3
10 pages
TOS TLE 8 Agricrop For Sharing
No ratings yet
TOS TLE 8 Agricrop For Sharing
2 pages
Punzalan, Joshua Mitchell L. Case-Scenarios-NICU
No ratings yet
Punzalan, Joshua Mitchell L. Case-Scenarios-NICU
2 pages
Residential Plots For Sale in Wadakpally - Bheeramguda
No ratings yet
Residential Plots For Sale in Wadakpally - Bheeramguda
2 pages
Separation by Drying
No ratings yet
Separation by Drying
30 pages
Sample Study Matter JEE (Advanced) PDF
100% (1)
Sample Study Matter JEE (Advanced) PDF
89 pages
Survey Instrument Validation Rating Scale SHS 2023
No ratings yet
Survey Instrument Validation Rating Scale SHS 2023
1 page
Ficha Técnica de Balatas-001 Noviembre 2011
No ratings yet
Ficha Técnica de Balatas-001 Noviembre 2011
4 pages
Todorov Theory
No ratings yet
Todorov Theory
1 page
Saint Louis College: Legislative Committee
No ratings yet
Saint Louis College: Legislative Committee
3 pages
Exam 2022 p2 Ans
No ratings yet
Exam 2022 p2 Ans
14 pages
Name: - Date: - Grade & Section: - Score
No ratings yet
Name: - Date: - Grade & Section: - Score
2 pages
RESUME CV Tabeti Abdelkader English 2017
No ratings yet
RESUME CV Tabeti Abdelkader English 2017
11 pages
Internship Report
No ratings yet
Internship Report
10 pages
Inderbir Singh Human Embryology 11th Edition by Subhadra Devi ISBN 9789352701155 9352701151 Instant Download
100% (4)
Inderbir Singh Human Embryology 11th Edition by Subhadra Devi ISBN 9789352701155 9352701151 Instant Download
46 pages
EG8145V5 Quick Start 01 (R20C00)
No ratings yet
EG8145V5 Quick Start 01 (R20C00)
16 pages
API FR - INR.RINR DS2 en Excel v2 2917298
No ratings yet
API FR - INR.RINR DS2 en Excel v2 2917298
74 pages
THEONE ? Sentence Improvement Pre 4th Oct Level Up Your English
No ratings yet
THEONE ? Sentence Improvement Pre 4th Oct Level Up Your English
145 pages
Tata Play Packs 20240904 - 0
No ratings yet
Tata Play Packs 20240904 - 0
243 pages
Lioba CV
No ratings yet
Lioba CV
5 pages
Blood Angels Army List (2000)
No ratings yet
Blood Angels Army List (2000)
1 page
Native Corn Recipes
100% (3)
Native Corn Recipes
115 pages
Multiple Choice Questions: University of Cape Town School of Economics Eco1010F Tutorial 8
No ratings yet
Multiple Choice Questions: University of Cape Town School of Economics Eco1010F Tutorial 8
6 pages
ST LINES + CIRCLES TOP 200 PYQs of JEE Mains 2022
No ratings yet
ST LINES + CIRCLES TOP 200 PYQs of JEE Mains 2022
60 pages
The Threats To The Objectivity in Internal Auditing
No ratings yet
The Threats To The Objectivity in Internal Auditing
2 pages
Mentee Application Form: PERIOD: 1 October 2022 - 1 May 2023
No ratings yet
Mentee Application Form: PERIOD: 1 October 2022 - 1 May 2023
2 pages

Double-Bit Quantization For Hashing: Weihao Kong Wu-Jun Li

Uploaded by

Double-Bit Quantization For Hashing: Weihao Kong Wu-Jun Li

Uploaded by

Double-Bit Quantization for Hashing

Weihao Kong and Wu-Jun Li

Evaluation Protocols and Baseline Methods

0.4 0.4 0.4 0.4

0.2 0.2 0.2 0.2

Figure 2: Precision-recall curve for ITQ on 22K LabelMe data set

0.4 0.4 0.4 0.4

0.2 0.2 0.2 0.2

Figure 3: Precision-recall curve for SH on 22K LabelMe data set

0.6 0.6 0.6 0.6

0.4 0.4 0.4 0.4

0.2 0.2 0.2 0.2

Figure 4: Precision-recall curve for LSH on 22K LabelMe data set

You might also like