
KI - Künstliche Intelligenz (2019) 33:319–330

https://doi.org/10.1007/s13218-019-00623-z

TECHNICAL CONTRIBUTION

An Introduction to Hyperdimensional Computing for Robotics


Peer Neubert1 · Stefan Schubert1 · Peter Protzel1

Received: 15 December 2018 / Accepted: 11 September 2019 / Published online: 18 September 2019
© Gesellschaft für Informatik e.V. and Springer-Verlag GmbH Germany, part of Springer Nature 2019

Abstract
Hyperdimensional computing combines very high-dimensional vector spaces (e.g. 10,000 dimensional) with a set of carefully designed operators to perform symbolic computations with large numerical vectors. The goal is to exploit their representational power and noise robustness for a broad range of computational tasks. Although there are surprising and impressive results in the literature, the application to practical problems in the area of robotics is so far very limited. In this work, we aim at providing an easy to access introduction to the underlying mathematical concepts and describe the existing computational implementations in form of vector symbolic architectures (VSAs). This is accompanied by references to existing applications of VSAs in the literature. To bridge the gap to practical applications, we describe and experimentally demonstrate the application of VSAs to three different robotic tasks: viewpoint invariant object recognition, place recognition and learning of simple reactive behaviors. The paper closes with a discussion of current limitations and open questions.

Keywords Hyperdimensional computing · Vector symbolic architectures · Robotics

1 Introduction

Humans typically gain an intuitive understanding of 2-D and 3-D Euclidean spaces very early in their lives. Higher dimensional spaces have some counterintuitive properties that render the generalization of many algorithms from low to high-dimensional spaces useless, a phenomenon known as the curse of dimensionality. However, there is a whole class of approaches that aims at exploiting these properties. These approaches work in vector spaces with thousands of dimensions and are referred to as hyperdimensional computing or vector symbolic architectures (VSAs) (previously they were also called high-dimensional computing or hypervector computing). They build upon a set of carefully designed operators to perform symbolic computations with large numerical vectors.

Another, better known class of algorithms that (internally) work with high-dimensional representations are (deep) artificial neural networks (ANN). Their recent success includes robotic subproblems, e.g., for robust perception. However, in many robotic tasks, deep learning approaches face (at least) three challenges [35]: (1) limited amount of training data, (2) often, there is prior knowledge that we want to integrate (models as well as algorithms), and (3) we want to be able to assess the generalization capabilities (e.g. from one environment to another or from simulation to real world). The latter is particularly important if the robot is an autonomous car. A resulting motivation for using VSAs is to combine the versatility, representational power and noise robustness of high-dimensional representations (for example learned by ANNs) with sample-efficient, programmable and better interpretable symbolic processing.

Although processing of vectors with thousands of dimensions is currently not very time efficient on standard CPUs, VSA operations can typically be highly parallelized. Further, VSAs support distributed representations, which are exceptionally robust towards noise [2], an omnipresent problem in robotics [36]. In the long term, this robustness can also allow to use very power efficient stochastic devices that are prone to bit errors but extend the battery life of a mobile robot [31].

The goal of this paper is to provide an easily accessible introduction to this field that spans the range from the mathematical properties of high-dimensional spaces in Sect. 2, over implementations and computing principles of VSAs in Sect. 3, and a short overview of existing applications in Sect. 4, to three (novel) demonstrations how hyperdimensional computing can address robotic problems in Sect. 5. These demonstrations are intended as showcases to inspire future applications in the field of robotics. Remaining impediments in form of current limitations and open questions are discussed in Sect. 6.

* Corresponding author: Peer Neubert
1 Chemnitz University of Technology, Chemnitz, Germany


2 Properties of High-Dimensional Spaces: Curse and Blessing

2.1 High-Dimensional Spaces Have Huge Capacity

The most obvious property is high capacity. For example, when we increase the number of dimensions in a binary vector, the number of possible stored patterns increases exponentially. For n dimensions, the capacity is 2^n. For real valued vector spaces and practical implementations with limited accuracy (i.e. a finite length representation in a computer) the capacity is also exponential in the number of dimensions. Interestingly, even for sparse binary vector spaces, the number of possibly stored patterns grows very fast. Figure 1 illustrates this behavior. For n dimensions and density d (the rate of ones in the vector), the capacity is the binomial coefficient (n choose ⌊d⋅n⌋). Even if there are only 5% non-zero entries, a 1000 dimensional vector can store more patterns than the supposed number of atoms in the universe (presumably about 10^80).

Fig. 1  Capacity of dense and sparse vector spaces quickly becomes very large (d is the ratio of ones). A discussion of properties of sparse representations can, e.g., be found in [2]
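As a quick plausibility check of these capacity figures, the following Python sketch (ours, not code from the paper) evaluates the dense capacity 2^n and the sparse binomial capacity for n = 1000 and d = 0.05:

# Rough numerical check of the capacity claims of Sect. 2.1 (our sketch).
import math

n = 1000                              # number of dimensions
d = 0.05                              # density: 5% non-zero entries
k = math.floor(d * n)

sparse_patterns = math.comb(n, k)     # number of distinct sparse binary patterns
print(math.log10(sparse_patterns))    # approx. 85, i.e. more than the supposed ~10^80 atoms
print(n * math.log10(2))              # dense binary case: log10(2^1000) approx. 301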


2.2 Nearest Neighbor Becomes Unstable or Meaningless

This is a less intuitive property but nevertheless it is very important since it lies at the heart of the curse of dimensionality. This term was coined by Bellman [3] to describe the downsides of the exponential growth of capacity (or volume) of the space: if there is a fixed number of known data points (e.g. training samples), the sampling density decreases with increasing number of dimensions. For an n-dimensional space and k samples, it is proportional to k^(1/n) (cf. [11, p. 23]). If we require 100 samples for an accurate representation of a one dimensional problem, the same problem in a 10 dimensional space would require 100^10 samples to achieve the same sample density.

Beyer et al. [4] showed a direct consequence for the nearest neighbor problem (given a set of data points in an n-dimensional metric space, the task is to find the closest data point to some query point). They define a query as unstable if the distance from the query point to most datapoints is less than (1 + ε) times the distance from the query to the nearest neighbor. Under a broad range of practically relevant conditions, for any fixed ε > 0 and increasing number of dimensions, the probability that a query is unstable converges to 1. In other words, the distance to the nearest neighbor approaches the distance to the farthest data point.

Based on these results on the contrast in nearest neighbor queries in high-dimensional spaces, Aggarwal et al. [1] investigated the influence of the choice of the metric. For example, the often used Euclidean L2 norm is not well suited for high-dimensional spaces; better choices are Lp norms with smaller p (for some applications this includes fractional norms with p < 1). Also angular distances for real vectors and Hamming distance for binary vectors are suitable choices.

2.3 Random Vectors are Very Likely Almost Orthogonal

Random vectors are created by sampling each dimension independently and uniformly from the underlying space. The distribution of angles between two such random vectors contradicts our intuition. In an n-dimensional real valued space, for any given vector, there are n − 1 exactly orthogonal vectors. However, the number of almost orthogonal vectors, whose angular distance to the given random vector is ≤ π/2 + ε, grows exponentially for any fixed ε > 0. An important consequence is that two randomly chosen vectors are very likely to be almost orthogonal.

Although we can not directly illustrate this effect in high-dimensional spaces, the transition from 2-D to 3-D space already gives an idea of what is happening. Figure 2 shows the sets of similar and almost orthogonal vectors on unit spheres in 2-D and 3-D space. It can be easily seen that in the higher dimensional space, a random point on the sphere is much more likely to lie in the red area of almost orthogonal vectors than in the yellow area of similar vectors, and that this effect increases from 2-D to 3-D.

Fig. 2  Example visualization for similar and almost orthogonal areas in 2-D and 3-D spaces (angular thresholds 0.1)

Figure 3 shows analytical results for higher dimensional spaces based on the equations of surface areas of n-spheres and n-caps (footnote 1). The first plot shows the surface areas of the similar and almost orthogonal ranges for an angular distance of ε = 0.1 (the similar and almost orthogonal one and two dimensional surfaces from Fig. 2 each provide two points on the corresponding curves in this first plot) (footnote 2). Although the value of the surface area also decreases beyond the local maximum (which is also a global maximum), it decreases much slower than the area value of the similar region. This is demonstrated in the second plot in Fig. 3 that shows the ratio of the almost orthogonal and similar surface areas. The linear shape in this logarithmic plot reveals the exponential growth of this ratio. The two right plots show the probability to randomly sample either a similar or an almost orthogonal vector. For a high number of dimensions (e.g. > 700) the probability that two random vectors are almost orthogonal (ε = 0.1) gets close to 1.

Fig. 3  Analytical results on n-spheres. Note the logarithmic scale in the second plot. Due to numerical reasons, the dashed extension for #dimensions > 300 in the right plot is not obtained analytically but using sampling

Footnote 1: n-sphere: a hypersphere in the (n+1)-dimensional space. n-cap: portion of an n-sphere cut off by a hyperplane.
Footnote 2: Please keep in mind that the unit of the surface area of an n-sphere is an n-dimensional object, thus the unit along the vertical axis changes and the values along the curves are not directly comparable. Nevertheless, the fact that there is a local maximum of the surface area of the almost orthogonal range is surprising. However, it is a direct consequence of the local maximum of the surface area of the whole unit n-sphere (which in turn becomes intuitive based on the recursive expression of the surface area A_{n+1} = A_n ⋅ 2π/n, since for n > 2π this factor becomes smaller than one).
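The almost-orthogonality of random vectors is also easy to verify numerically. The following sketch (ours; uniform random vectors as in the text, sample sizes chosen arbitrarily) estimates the mean angle and its spread between independent random vectors for growing n:

# Empirical check of Sect. 2.3 (our sketch): angles between independent random
# vectors concentrate around 90 degrees as the number of dimensions grows.
import numpy as np

rng = np.random.default_rng(0)

def angle_stats_deg(n, pairs=1000):
    a = rng.uniform(-1, 1, size=(pairs, n))
    b = rng.uniform(-1, 1, size=(pairs, n))
    cos = np.sum(a * b, axis=1) / (np.linalg.norm(a, axis=1) * np.linalg.norm(b, axis=1))
    ang = np.degrees(np.arccos(np.clip(cos, -1.0, 1.0)))
    return ang.mean(), ang.std()

for n in (3, 10, 100, 1000, 10000):
    mean, std = angle_stats_deg(n)
    print(n, round(mean, 1), round(std, 1))   # the spread shrinks roughly like 1/sqrt(n)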


2.4 Noise has Low Influence on Nearest Neighbor Queries with Random Vectors

Why is it important that random vectors are very likely almost orthogonal? If random vectors point in almost orthogonal directions, this creates a remarkable robustness when trying to recognize them from noisy measurements. Let us demonstrate this with an example (footnote 3): suppose there is a database of one million random feature vectors f_i ∈ 𝕍 = [0, 1]^n (again, each dimension is sampled independently from a uniform distribution). Also there is a query vector, which is a noisy measurement of one of the database vectors (each dimension has additive noise ∼ N(0, σ), as a consequence they can also leave their original range [0, 1]). Using the angular distance, what is the probability to get a wrong query answer (i.e. that a wrong vector from the database is closer to the noisy query than the correct one)? Figure 4 shows results for increasing number of dimensions and increasing amounts of noise. Even for noise with standard deviation of 0.5, that is half the available range of the initial value (this is illustrated in the right part of Fig. 4), using more than about 170 dimensions renders the probability of a false matching almost zero.

Fig. 4  (Left) Robustness towards different noise. (Right) For illustration: the blue example database vector [1/2, 1/2, …] is affected by additive noise ∼ N(0, 0.5) (the amount represented by the yellow curve on the left) and becomes the red vector (color figure online)

Footnote 3: Similar experiments can, e.g., be found in [31] and [2]; analytical results on VSA capacity can be found in [7].

Adding noise to each dimension is the same as adding a noise vector to the whole data vector. This yields an interesting application: What if this added vector is again a known data vector? This is known as the bundle of two vectors and will be the subject of Sect. 3, where we will use the symbol + to refer to this operator. Since both vectors act symmetrically as noise for the recognition of the other, a query with the sum of two data vectors is expected to return both vectors as the two nearest neighbors. Figure 5 shows results for this experiment using the above database of one million data vectors and also different numbers k of summed vectors. We evaluate the capability to return perfect query answers (i.e. exactly the k true vectors are the k-nearest neighbors). As a reading example for these plots: The purple curve in the left plot shows that in a 600 dimensional vector space, we can safely add five vectors and the result is almost certainly more similar to all of these five vectors than to any other of the one million data vectors from our example database.

Fig. 5  Performance using bundles. Correct query answers are those where all k-nearest neighbors are correct. (Left) Random vectors are from 𝕍 = [0, 1]^n. (Right) 𝕍 = [−1, 1]^n

This is the most straightforward way of implementing bundling and we can easily improve performance. E.g., to allow for increasing and decreasing values during summation, we could sample each dimension from [−1, 1] instead of [0, 1] as before. The results are shown in the right part of Fig. 5. In this configuration, 300 dimensions are sufficient to recognize each of a sum of five vectors, and a 600 dimensional sum can handle a sum of ten vectors. Presumably, there are many other ways to improve performance in this simple example.
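The following Python sketch (ours) mimics this bundling experiment at a much smaller scale; the database size, dimensionality and random seed are arbitrary stand-ins for the paper's setup with one million vectors:

# Scaled-down sketch of the bundling experiment around Fig. 5 (our code).
import numpy as np

rng = np.random.default_rng(1)
n, db_size, k = 600, 50_000, 5

db = rng.uniform(-1.0, 1.0, size=(db_size, n)).astype(np.float32)
true_idx = rng.choice(db_size, size=k, replace=False)
bundle = db[true_idx].sum(axis=0)                # bundling = elementwise sum

# angular (cosine) similarity of the bundle to every database vector
sims = db @ bundle / (np.linalg.norm(db, axis=1) * np.linalg.norm(bundle))
nearest = np.argsort(-sims)[:k]                  # indices of the k nearest neighbors

print(set(nearest) == set(true_idx))             # True for sufficiently large n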


The next section will present a more systematic approach to solve problems with hyperdimensional computing that builds upon the presented properties of high-dimensional spaces. While the examples from this section build upon random vectors, Sect. 5 will demonstrate the performance of these approaches when confronted with data from robot sensors.

3 How to do Hyperdimensional Computing: Vector Symbolic Architectures (VSA)

The previous Sect. 2 listed properties of high-dimensional vector spaces and demonstrated how a bundle of vectors can be represented by their sum. Formally, the resulting vector represents the unordered set of the bundled vectors. To be of broader value, we need to be able to represent more complex and compositional structures such as ordered lists, hierarchies, or object-part relations. A key element to storing structured data is to assign different roles to different parts of the data [15]. Think of the personal record: {name = Alice, year_of_birth = 1980, high_score = 1000}. Storing just the values {Alice, 1980, 1000} is of limited help, since, for example, this unordered set cannot distinguish between Alice's year of birth and her high score. We need information about the binding between the role (or variable) "year_of_birth" and its filler (or value) "1980".

There have been multiple approaches presented to do this using hyperdimensional computing, e.g. by Plate [28], Kanerva [15], Gayler [8], and others. In 2003, Gayler coined the term Vector Symbolic Architectures (VSA) for these approaches [9]. In a nutshell, a VSA combines a vector space with a set of carefully chosen (designed) operators with particular properties. Each of the above VSAs uses a different vector space. The set of operators has to include the two operators bundling + and binding ⊗, and for certain applications a permute (or protect) operator Π is required. The output of each operator is again a vector from the same space. Bundling shares some properties with addition and binding some properties with multiplication of numbers.

Table 1  Example VSAs

            | Smolensky [32]          | Plate [28]               | Kanerva [15]                | Gayler [8]
Space 𝕍     | Tensors of real numbers | Real and complex vectors | {0, 1}^n                    | [−1, 1]^n (or {−1, 1}^n)
Bundle +    | Elementwise sum         | Elementwise sum          | Thresholded elementwise sum | Limited elementwise sum
Bind ⊗      | Tensor product (a)      | Circular convolution     | Elementwise XOR             | Elementwise product
Protect Π   | (Not considered)        | (Not considered)         | Permutations                | Permutations

A more extensive list can be found in [31]. (a) This operator changes the vector size and shape.

Before we proceed with the formal requirements, let us give an example of the overall goal. Given are vector representations Alice_V, 1980_V and 1000_V of the string "Alice" and the two numbers "1980" and "1000"; they can be obtained using a suitable encoder or be just random vectors whose meaning we have stored for later decoding. Also, we have random vectors name_V, year_of_birth_V, and high_score_V that represent the corresponding roles. Using hyperdimensional computing, we want to be able to create a single vector H that contains the whole record using the above operators:

H = name_V ⊗ Alice_V
    + year_of_birth_V ⊗ 1980_V
    + high_score_V ⊗ 1000_V        (1)

Subsequently, we want to be able to query for the value of each part of the composite structure using the same operators, e.g. for the name:

H ⊗ name_V → Alice_V               (2)

This example will be explained in the following subsections. Querying a record is a simple example of high-dimensional computing. Before we proceed with more sophisticated demonstrations in Sects. 4 and 5, we will characterize the operators in more detail and explain how the properties of high-dimensional spaces are exploited in the query-record example. However, there is no exact definition available of the required properties of the VSA operators, the exact set of operators or the vector space 𝕍. The following is a summary of properties from the literature, i.e. the VSAs from Table 1.

3.1 Binding ⊗

Binding ⊗ : 𝕍 × 𝕍 → 𝕍 combines two input vectors into a single output vector that is not similar to the input vectors but allows to (approximately) recover any of the input vectors given the output vector and the other input vector. E.g., we can bind the filler Alice_V to the role name_V by N = Alice_V ⊗ name_V and later recover the filler Alice_V given N and the role vector name_V. To recover this vector, we need to unbind one vector from another. For VSAs where vectors are also (approximately) self inverse, unbinding and binding are the same operation (e.g. [8, 15]). Self-inverse means:

∀X ∈ 𝕍 : X ⊗ X = 𝟏

where 𝟏 is the neutral element of binding in the space 𝕍. We will use this property in the following examples.

An intuitive example of such a binding operator is the special case of Gayler's VSA with 𝕍 = {−1, 1}^n (instead of [−1, 1]^n) and binding and unbinding as elementwise multiplication. The self-invertibility is due to the limitation to ±1, since −1 ⋅ −1 = 1 ⋅ 1 = 1 and 1 is the neutral element of multiplication. With such a VSA, recovering the name in the above role-filler example works as follows:

N ⊗ name_V = (Alice_V ⊗ name_V) ⊗ name_V
           = Alice_V ⊗ (name_V ⊗ name_V)
           = Alice_V ⊗ 𝟏 = Alice_V

This example also requires the binding operator to be associative. Further, to allow to change the order of the vectors, binding is typically also commutative. Table 1 lists several available binding implementations. Section 2 illustrated that the distribution of similar and dissimilar vectors is an important property of high-dimensional vector spaces. Thus, the effect of VSA operations on these similarities is also important. E.g., binding should be similarity preserving: ∀A, B, X ∈ 𝕍 : dist(A, B) = dist(A ⊗ X, B ⊗ X), i.e. the distance of two vectors remains constant when binding both vectors to the same third vector. Moreover, as the first sentence of this section already said, the result vector has to be dissimilar to the two inputs. This is important for the combination with the bundle operator explained in the following section.
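For the bipolar special case, binding, unbinding and the self-inverse property can be written down in a few lines. The sketch below is ours and only illustrates the mechanism; the variable names alice and name mirror the role-filler example:

# Binding example of Sect. 3.1 in the bipolar special case V = {-1, 1}^n (our sketch).
import numpy as np

rng = np.random.default_rng(2)
n = 10_000
rand_vec = lambda: rng.choice([-1, 1], size=n)

alice, name = rand_vec(), rand_vec()

bound = alice * name              # N = Alice ⊗ name: dissimilar to both inputs
recovered = bound * name          # unbinding = binding again, since X ⊗ X = 1

cos = lambda a, b: a @ b / n      # for bipolar vectors, cosine = dot product / n
print(round(cos(bound, alice), 2))      # close to 0 (dissimilar to the input)
print(round(cos(recovered, alice), 2))  # 1.0 (exactly recovered)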


3.2 Bundling +

The goal of the bundling operation + : 𝕍 × 𝕍 → 𝕍 is to combine two input vectors such that the output vector is similar to both inputs. This is also called superposition of vectors. Typically, the bundling operator is some kind of elementwise sum of the vector elements (see Table 1). E.g., the Multiply–Add–Permute VSA of Gayler [8] uses elementwise sum on the same vector space [−1, 1]^n as the experiments from Fig. 5 (the sum is limited to the range of the vector space elements [−1, 1]). In these experiments, we already showed that the elementwise sum of vectors is similar to each of the vectors; this was a direct consequence of the almost orthogonality of random vectors.

According to Kanerva [17] the bundle and bind operations should "form an algebraic field or approximate a field". In particular, bundling should be associative and commutative and binding should distribute over bundling. Let us illustrate this with a closer look at the example of Alice's record. For brevity we use X, Y, Z for the role vectors "name", "year_of_birth" and "high_score" and A, B, C for the vector representations of their values "Alice", "1980" and "1000". The record vector is formed by H = (X ⊗ A) + (Y ⊗ B) + (Z ⊗ C). What happens when querying for the name by binding with its vector X?

X ⊗ H = X ⊗ ((X ⊗ A) + (Y ⊗ B) + (Z ⊗ C))
      = (X ⊗ X ⊗ A) + (X ⊗ Y ⊗ B) + (X ⊗ Z ⊗ C)
      = A + noise

The noise term includes the terms (X ⊗ Y ⊗ B) and (X ⊗ Z ⊗ C). Both are non-similar to each of their elements (a property of binding). Thus the only known vector that is similar to X ⊗ H is A. Again this exploits the property of high-dimensional vectors to be almost surely non-similar (i.e. almost orthogonal) to random vectors. The database experiments from Sect. 2.4 already illustrated how a noise-free version of vector A can be recovered: given a database with all elementary vectors, returning the nearest neighbor to A + noise results very likely in A (to not return a vector which is similar to noise, we need the property of binding to be non-similar to its inputs). In VSAs this database is typically called clean-up or item memory [16]. It can be as simple as our look-up table or, e.g., an attractor network [17]. Section 5.2 will evaluate properties of such a clean-up memory in combination with real-world data.

There can be trade-offs between the performance of the bundling and binding operators. For example, in the VSA of Gayler, the bundling operator works well for 𝕍 = [−1, 1]^n; however, the self-inverse property of binding holds only exactly for the special case of 𝕍 = {−1, 1}^n. The clean-up memory can also be used to restore exact values in the non-exact inversion case.
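Putting bundling, binding and a clean-up memory together, the record example of Eqs. (1) and (2) can be sketched as follows (our simplification: bipolar vectors and a plain dictionary as clean-up memory; the helper cleanup is ours and not part of any VSA library):

# Record encoding and querying (Eqs. 1 and 2) with a clean-up memory (our sketch).
import numpy as np

rng = np.random.default_rng(3)
n = 10_000
rand_vec = lambda: rng.choice([-1, 1], size=n)

roles  = {"name": rand_vec(), "year_of_birth": rand_vec(), "high_score": rand_vec()}
values = {"Alice": rand_vec(), "1980": rand_vec(), "1000": rand_vec()}

# H = name ⊗ Alice + year_of_birth ⊗ 1980 + high_score ⊗ 1000
H = (roles["name"] * values["Alice"]
     + roles["year_of_birth"] * values["1980"]
     + roles["high_score"] * values["1000"])

def cleanup(noisy, item_memory):
    # return the stored item most similar to the noisy query (dot-product similarity)
    return max(item_memory, key=lambda key: int(noisy @ item_memory[key]))

# H ⊗ name gives Alice + noise; the clean-up memory removes the noise
print(cleanup(H * roles["name"], values))   # -> 'Alice'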


3.3 Permutation (or Protect) Π

Gayler [8] discussed the benefit of using an additional operator in order to protect vectors. Think of a situation with two bound role-filler pairs: A ⊗ X and B ⊗ Y. When binding these two pairs to (A ⊗ X) ⊗ (B ⊗ Y), it becomes necessary to prevent mixing roles and fillers: Since binding is associative and commutative, this is equivalent to (A ⊗ Y) ⊗ (B ⊗ X). The permutation operator Π protects a term from associative and distributive rules. In the above example this is (A ⊗ X) ⊗ Π(B ⊗ Y). It is typically implemented as a permutation of vector dimensions. Its output is dissimilar to the input and by application of the reverse permutation, it is also invertible. For details please refer to [8].
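A minimal sketch of the protect operator (ours, using a fixed random permutation of the dimensions) shows why it prevents the mixing of roles and fillers described above:

# Protect operator of Sect. 3.3 with bipolar vectors (our sketch).
import numpy as np

rng = np.random.default_rng(4)
n = 10_000
rand_vec = lambda: rng.choice([-1, 1], size=n)
perm = rng.permutation(n)                    # Π: fixed random permutation
protect = lambda v: v[perm]                  # invertible via the inverse permutation

A, B, X, Y = (rand_vec() for _ in range(4))

cos = lambda a, b: a @ b / n
# without protection, the two bindings collapse onto the same vector
print(round(cos((A * X) * (B * Y), (A * Y) * (B * X)), 2))               # 1.0
# with protection, correct and swapped role-filler assignments stay distinct
print(round(cos((A * X) * protect(B * Y), (A * Y) * protect(B * X)), 2)) # close to 0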
4 Applications from the Literature

VSAs have been applied to various problems like text classification [20], fault detection [19], analogy mapping [30], and reinforcement learning [18]. Kanerva [17] discusses the general computational power of VSAs and concludes one could create a "High dimensional computing-Lisp". While this is still open, work in this direction includes synthesis of finite state automata [27] and hyperdimensional stack machines [38]. Danihelka et al. [5] (Deepmind) used a VSA to model long short-term memory. In the medical domain, Widdows and Cohen [37] used Predication-based Semantic Indexing, which exploits a VSA to represent traditional subject-predicate-object relationships (e.g. "aspirin TREATS headache") for fast approximate inference on the relationships of diseases, symptoms and treatments. Natural language processing is considered a challenging task. Jackendoff [13] specified this statement in the form of four theoretical challenges that a system that aims at processing language at a human level has to solve. According to Gayler [9], VSAs can solve these challenges. Hyperdimensional computing was also used to encode n-gram statistics to recognize the language of a text [14]. There is evidence that distributed high-dimensional representations are widely used for representation in the human brain [2]. This is extensively used in brain-inspired cognitive systems like Spaun [6] and in hierarchical temporal memory (HTM) [12], a computational model of working principles of the human neocortex. The latter was also applied for mobile robot place recognition [24].

5 Application to Robotic Tasks

This section showcases three examples of how hyperdimensional computing can be used for real robotic tasks. We do not claim that the presented approaches are better than existing solutions to the considered tasks; however, they demonstrate the versatility of hyperdimensional computing and its capability to work with real world data, and advocate its practical value. Before we start with the applications, we will describe how we bridge the gap between real world sensors and vector computations.

5.1 Encoding Real World Data

Section 2.4 used synthetic data to demonstrate the noise robustness of hyperdimensional computations and its application to bundling. The random vectors in this synthetic data fulfill the requirements to achieve pairwise almost orthogonal vectors by design. What if we want to work with real world data that does not provide thousands of independent random dimensions? For simple data structures and the particular case of sparse binary vectors, Purdy [29] discusses different encodings. Very recently, Kleyko et al. [20] discussed trade-offs in binary hyperdimensional encodings of images. A comprehensive discussion of encoding approaches of real world sensor data is beyond the scope of this paper. However, we want to shortly describe our approach to encode the real world image data in our experiments.

Any high-dimensional image feature vector can potentially be used. Based on their recent success, we decided to use image descriptors from early layers of deep convolutional neural networks in a similar fashion as they are used for place recognition [25, 33]. To get a descriptor for an image, it is fed to an off-the-shelf readily trained CNN (we use AlexNet [21]) and instead of using the final output (e.g. the 1000 dimensional soft-max class output), the intermediate output of an earlier layer is used [we use the 13 × 13 × 384 = 64,896 dimensional output of the third convolutional layer (conv3)]. To reduce computational effort and to get a distributed representation, we use a locality sensitive hashing (LSH) approach and project the normalized conv3 descriptor with a random matrix R to a lower dimensional space. Each row in R is the normal of a 64,896 dimensional hyperplane (obtained by sampling R from a standard normal distribution followed by normalization of rows to length one). Since these products of the normalized conv3 descriptor and each row (hyperplane normal) reflect the cosine of the angle between the vectors, they are in range [−1, 1] and can be directly used in the Multiply–Add–Permute architecture [8] (see Table 1). We use 8192 rows in R.
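The projection step can be sketched as follows (our code; conv3 here is a random stand-in for the flattened, normalized conv3 activations, and the number of rows is reduced from the paper's 8192 to keep the sketch light):

# LSH-style random projection of a CNN descriptor into [-1, 1]^m (our sketch).
import numpy as np

rng = np.random.default_rng(5)
in_dim, out_dim = 64_896, 512            # conv3 size; the paper uses 8192 rows in R

conv3 = rng.standard_normal(in_dim)      # stand-in for the flattened conv3 output
conv3 /= np.linalg.norm(conv3)           # normalize the descriptor to unit length

R = rng.standard_normal((out_dim, in_dim), dtype=np.float32)
R /= np.linalg.norm(R, axis=1, keepdims=True)   # each row: unit normal of a hyperplane

descriptor = R @ conv3                   # cosines between descriptor and hyperplane normals
print(descriptor.shape, float(descriptor.min()) >= -1.0, float(descriptor.max()) <= 1.0)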


5.2 Bundling Views for Object Recognition

Robotic task For the first robotic use-case, we demonstrate the application of hyperdimensional computing to recognize objects from multiple viewpoints. This is important for mobile robot localization by recognizing known landmarks, recognizing objects for manipulation, and other robotics tasks.

Motivation We use this task to transfer the results on synthetic data from Sect. 2.4 on bundling of high-dimensional vectors to real world data. Bundling allows to combine multiple vectors into one. This can be straightforwardly used to combine two or more known views. The motivation is threefold: (1) if we combine all known views into one representation, the comparison of a query vector to all known representations is a single vector comparison. (2) There might be a better interpolation between the known views. (3) This allows to straightforwardly update the representation of an object, particularly iteratively in an online-filter fashion.

Experimental setup We practically demonstrate this approach using the Amsterdam Library of Object Images (ALOI) dataset [10], in particular the collection of 72,000 images of 1000 objects seen from 72 different horizontal viewing angles (5° steps). Figure 6 shows example images. In our experiments, given are a database of k ∈ {1 … 1000} known images I_x^k and I_y^k at viewing angles x and y, as well as a query image I_z^q at viewing angle z = (x+y)/2 (the viewing angle in between) and image index q. The task is to associate q with the correct image index k.

Fig. 6  Example views of one of the 1000 ALOI objects from 0°, 90° and 180° viewing angle

VSA approach We bundle the image descriptors I_x^k and I_y^k, i.e. create I_x^k + I_y^k for each k (there will be one vector for each object in the database).

Results When comparing a query image I_z^q to the database, motivation (1) is achieved by design: instead of comparing I_z^q to I_x^k and I_y^k individually, we can now compare against the single bundle vector and reduce the number of required comparisons by a factor of two.

The results in Fig. 7 demonstrate the better interpolation capabilities from motivation (2): the bundled representation (red curve) has a smaller cosine distance to the object image under a novel viewing angle than the individual images (blue curve). This also results in a better object recognition accuracy (right part). See footnote 4 for details.

Fig. 7  Object recognition performance on ALOI dataset (color figure online)

Footnote 4: Details: the red curve in the left plot evaluates vector similarities (the query image index q is known and we compare the similarity of I_x^k + I_y^k and I_z^{q=k}), the red curve in the right plot evaluates the accuracy of a nearest neighbor query (the query image index q is not known to the system and it returns the index k of the nearest neighbor to I_z^q of all I_x^k + I_y^k, k ∈ {1 … 1000}). x is fixed at viewing angle 0°, y varies from 0° to 350°. The horizontal axis is the mean angular distance from z to x and y. As a reading example: in the left plot, the red curve evaluated at 90° means that for x = 0°, y = 180°, z = 90° (e.g. the images from Fig. 6), the average cosine distance of the bundle (I_0^k + I_180^k) and I_90^k is about 0.17, and the right plot tells us that for about 53% of the objects the query image was most similar to the correct bundle. For comparison without bundling, the blue curves in Fig. 7 show the results when comparing the query image to the individual images I_x^k and I_y^k (instead of their bundle). For the distance evaluation in the left plot, we use the closest of the two individual results for each query. For the query results in the right plot, all views I_x^k and I_y^k are stored in the database and a single query is made (the number of database entries and thus comparisons has now doubled compared to the bundling approach). The VSA approach not only reduces the number of comparisons, it also performs slightly better than using individual comparisons in both plots.

To evaluate towards continuously integrating more views [motivation (3)], the yellow and the purple curves in Fig. 7 show query results when bundling a (static) set of multiple views from the four angles {0, 90, 180, 270} and the eight angles {0, 45, 90, …, 315}. Although the distance values do not reach zero for known views for these larger bundles, the cosine distance varies the less the more views are bundled.
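Structurally, the bundling-based recognition looks as follows (our sketch with random stand-in descriptors instead of ALOI images; the query is simulated as a noisy mixture of the two stored views, so the numbers do not reproduce Fig. 7):

# Structural sketch of the Sect. 5.2 pipeline with synthetic descriptors (our code).
import numpy as np

rng = np.random.default_rng(6)
n_objects, dim = 1000, 8192

view_x = rng.uniform(-1, 1, size=(n_objects, dim))
view_y = rng.uniform(-1, 1, size=(n_objects, dim))
bundles = view_x + view_y                      # one vector per object: I_x^k + I_y^k

def recognize(query):
    sims = bundles @ query / (np.linalg.norm(bundles, axis=1) * np.linalg.norm(query))
    return int(np.argmax(sims))                # a single comparison per object

k = 42
query = 0.5 * (view_x[k] + view_y[k]) + rng.normal(0, 0.5, size=dim)  # simulated novel view
print(recognize(query) == k)                   # True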


5.3 Sequence Processing for Place Recognition

Robotic task Place recognition is a task similar to image retrieval: given a set of images from known places, find corresponding places to the current camera view of the robot. In contrast to the more general image retrieval task, for place recognition, we can assume that the robot does not jump arbitrarily between places and that the sequence of previous places provides some information about the current place.

Motivation The previous application used only bundling. This second example demonstrates the combination with the binding operator ⊗ to implement the important concept of role-filler pairs using hyperdimensional computing. We also use this example to showcase how hyperdimensional computing can provide a simple and concise implementation of an existing algorithm (SeqSLAM). This VSA implementation can also potentially benefit from the general advantages of hyperdimensional computing like noise tolerance and potential energy efficiency. Similar to the previous object recognition example, the superposition of vectors also reduces the number of required comparison operations.

Mimicked approach SeqSLAM [23] exploits image sequence information to approach the challenging problem of place recognition in changing environments (e.g. given a database of images taken in summer, the goal is to localize during winter). Input to SeqSLAM's sequence processing part is a pairwise image similarity matrix S shown in Fig. 8. Each entry s_{i,j} is the similarity between the ith image from the database and the jth image of the robot's current camera sequence. The output of SeqSLAM is a new value for the similarity of images i and j based on their similarity in S and the similarities between images before and after i and j. SeqSLAM assumes a constant velocity within each sequence, thus it sums over the similarities on a short linear segment in S centered at s_{i,j} (illustrated as red line in Fig. 8). See [23] for more details.

Fig. 8  (Left) Illustration of a similarity matrix and SeqSLAM post-processing. (Right) Three example places from the Nordland dataset in spring and winter

VSA approach To implement the SeqSLAM idea using hyperdimensional computing, we first encode each image as a vector using the approach from Sect. 5.1. Then, the basic idea is to replace each image vector in each sequence by a vector (of the same dimensionality) that encodes this image and the d images before and after this image in a bundle. To preserve the order of the images, each image is bound to a static position vector P_k before bundling. In our experiments, the position vectors are random vectors, thus they are very likely almost orthogonal. The encoding of each image is: Y_i = +_{k=−d..+d} (X_{i+k} ⊗ P_k); for the beginning and end of the sequence (e.g. i < d + 1), fewer vectors are bundled. Since the bundle of an arbitrary number of vectors has exactly the same shape, this is neatly handled. Finally, to obtain place recognition results, the Y_i encodings of the database and query image sets can be compared pairwise.

Why does this work? Consider the encodings of a sequence of two consecutive images from the database: (A_a ⊗ P_0) + (A_{a−1} ⊗ P_{−1}), and a sequence of two consecutive query images: (B_b ⊗ P_0) + (B_{b−1} ⊗ P_{−1}). When comparing these two bundles, they are the more similar, the more of their components are similar. Let us evaluate some component pairs: The similarity of (A_a ⊗ P_0) and (B_b ⊗ P_0) depends on the similarity of A_a and B_b since both are bound to the same vector P_0 and the operator ⊗ is similarity preserving. The same holds for A_{a−1} and B_{b−1}. In contrast, e.g., (A_a ⊗ P_0) and (B_{b−1} ⊗ P_{−1}) are known to be non-similar since P_0 and P_{−1} are almost orthogonal.

Experimental setup We use the Nordland dataset [34] which provides images from four 728 km train journeys through Norway, once each season (see Fig. 8 for example images). We use 288 equally spaced places along the tracks and perform place recognition between the spring and the winter image sets. The evaluation is done using precision-recall curves based on the known ground-truth place associations.

Results The experimental results in Fig. 9 show that the VSA SeqSLAM implementation (solid curves) can exploit sequential information to improve the place recognition performance. The results closely approximate the results of the original SeqSLAM sequence processing approach (dashed curves).

Fig. 9  (Left) Place recognition results on Nordland dataset. The original SeqSLAM sequence processing approach is well approximated by the vector sequence encoding. Both improve the place recognition performance compared to a direct pairwise comparison (top-right is better). (Right) Schematic overview of data flow for behavior learning

There is an additional theoretical benefit: the number of performed operations is significantly smaller for the VSA approach. For database size n, query size m, and sequence length d_s = 2 ⋅ d + 1 (past + future + current), the number of original SeqSLAM operations is m ⋅ n ⋅ d_s. For the VSA implementation it is m ⋅ d_s + n ⋅ d_s + m ⋅ n (the first two terms represent the descriptor bundling and the last term the final pairwise comparison). E.g., for our database and query size of 288 images and d = 5, the ratio of the numbers of operations is more than a factor of 10, and it becomes larger if any of these values increases. Unfortunately, while for the original SeqSLAM most of the operations deal with scalar similarity values, for the VSA approach all operations are high-dimensional vector operations. Presumably, a practical runtime improvement can only be achieved with special hardware for high-dimensional vector computations (which then could also exploit the energy saving potential of VSAs).

Extensions The Nordland data is perfectly suited for SeqSLAM and its vector variant since each train journey is a single long sequence with constant speed. To account for a slightly varying speed between the sequences, the original SeqSLAM algorithm evaluates line segments with different slopes and uses the best choice. The proposed VSA implementation can be straightforwardly extended to these varying velocities by superposing the different combinations of image vectors and sequence position vectors. Further interesting directions would be to control the similarity between neighboring sequence position vectors or to use other VSAs' ways of encoding sequence information, e.g. permutations [16].
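The sequence encoding and the pairwise comparison described above can be sketched in a few lines (ours; random stand-in descriptors, and the query sequence is simply a noisy copy of the database sequence):

# Sketch of the vector sequence encoding Y_i = sum_k (X_{i+k} ⊗ P_k) from Sect. 5.3 (our code).
import numpy as np

rng = np.random.default_rng(7)
num_images, dim, d = 288, 4096, 5

X = rng.uniform(-1, 1, size=(num_images, dim))      # image descriptors (stand-ins)
P = rng.choice([-1, 1], size=(2 * d + 1, dim))      # one random position vector per offset

def encode_sequence(X):
    Y = np.zeros_like(X)
    for i in range(len(X)):
        for k in range(-d, d + 1):
            if 0 <= i + k < len(X):                 # fewer vectors near the sequence ends
                Y[i] += X[i + k] * P[k + d]         # bind to position vector, then bundle
    return Y

def normalize(M):
    return M / np.linalg.norm(M, axis=1, keepdims=True)

Y_db = encode_sequence(X)
Y_query = encode_sequence(X + rng.normal(0, 0.2, X.shape))   # noisy second traversal
S = normalize(Y_db) @ normalize(Y_query).T                   # pairwise similarity matrix
correct = np.mean(np.argmax(S, axis=1) == np.arange(num_images))
print(S.shape, round(float(correct) * 100, 1), "% correct")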


5.4 Learning and Recall of Reactive Behavior

Robotic task The task is to learn simple reactive behaviors from demonstration. "Simple" means that we can represent them as a set of sensor-action (condition-result) pairs. Given a successful demonstration of a navigation run (e.g. from a human) by pairs of sensor input and actuator output, the system learns a representation that encodes this reactive behavior and can resemble it during new runs in the environment.

Motivation The goal of this final example is to showcase a more complex VSA-based system (footnote 5). In contrast to the previous applications, this does also involve action selection by the robot. The goal is to encode the whole robot program (a set of reactive behavior rules) in a single vector. When executing (and combining) such VSA-based behaviors, the advantages of vectors (i.e. the representational power and robustness to noise) are preserved. A particular beauty of this approach is that it can learn encodings for behaviors that have exactly the same form (a single vector) no matter how complex the sensor input or the behaviors are.

Footnote 5: This work was previously presented at an IROS workshop, see [26] for details.

Experimental setup We use the simulation task described in [22]. Figure 9 illustrates the used simple robot with differential drive (i.e. a left and a right motor), a left and a right distance sensor, and a central light sensor. The robot starts at a random location in a labyrinth and the task is to wander around while avoiding obstacles, until the robot finds a light source. Then the robot should stay under this light. This is a simple task that can be coded using a simple set of rules (e.g. see [22]).

VSA approach The listing in Algorithm 1 describes the learning procedure. Inputs are pairs of sensor measures and corresponding actuator commands. The idea is to (1) encode the sensor and actuator values individually, (2) combine all sensor encodings in a condition vector and all actuator encodings in a result vector, (3) combine the condition with the result vector to a rule vector, and finally (4) combine all rule vectors to a single vector that contains the whole "program". Algorithm 2 is used in the execution phase to find the best actuator commands for the current sensor input.

Algorithm 1: Learning
Data: k training samples [S, A]_{1:k} of sensor and actuator values, a VSA, an encoder, an empty program progHV and an empty vector knownCondHV of known conditions
Result: progHV - a vector representation of the behaviour
  // get vector representations for each sensor and actuator
  [sensor, actuator] = VSA.assignRandomVectors()
  // for each training sample [S, A]
  foreach pair [S = (s_1, ..., s_n), A = (a_1, ..., a_m)] do
      // encode values, bind to device and bundle condition/result
      conditionHV := +_{i=1..n} (sensor_i ⊗ encode(s_i))
      resultHV := +_{i=1..m} (actuator_i ⊗ encode(a_i))
      if isDissimilar(knownCondHV, Π(conditionHV)) then
          // protect the condition and append (bundle) to the program
          progHV := progHV + (Π(conditionHV) ⊗ resultHV)
          // also append (bundle) the condition to the set of known conditions
          knownCondHV := knownCondHV + Π(conditionHV)
          // insert the result and the actuator encodings into the clean-up memory
          VSA.addToCUM(resultHV)
          foreach actuator_i do
              VSA.addToCUM(actuator_i ⊗ encode(a_i))
          end
      end
  end

Algorithm 2: Query
Data: progHV - the output of the learning procedure Alg. 1, the VSA and encoder/decoder used in Alg. 1, the query sensor inputs S
Result: output actuator commands A
  // encode values, bind to device and bundle condition
  conditionHV := +_{i=1..n} (sensor_i ⊗ encode(s_i))
  // query the program to get a noisy version of the resultHV
  resultHVNoisy := Π(conditionHV) ⊗ progHV
  // remove noise
  resultHV := VSA.queryCUM(resultHVNoisy)
  // for each actuator, extract the command from the result vector
  foreach actuator_i do
      // unbind a noisy version from the result vector
      commandHVNoisy := actuator_i ⊗ resultHV
      // remove noise
      commandHV := VSA.queryCUM(commandHVNoisy)
      // decode the command value from the vector
      a_i := decode(commandHV)
  end

Figure 9 illustrates how VSA operators are used during training and query. Each encoded sensor value is bound to a random vector that represents the corresponding sensor (see (1) in Fig. 9). E.g., the left distance sensor has a random, but static vector representation (coined sensor_1) that indicates the role "left distance sensor". During training, for each input pair, all sensor-value bindings are bundled (2). The same happens on the actuator side. The complete sensor-action rule is stored as the binding of sensors and actuators (3). Since many of these rules are bundled to create the complete program (4), the condition vector has to be protected (think of it as using brackets that also prevent distribution in an equation) to prevent mixing up sensor conditions from different training pairs. To allow later recall from noisy vectors, each actuator-value pair [e.g. actuator_1 ⊗ encode(a_1)] and the result bundle have to be stored in the clean-up memory.

In this example, during query, the task is to obtain the left and right motor commands given the current sensor input and the program vector. To be able to get the correct commands, the encoder/decoder and the clean-up memory are also required (5). The given sensor information is combined into a condition vector as before (6). Binding this vector to the program vector retrieves the most similar rule vector from training (7). The clean-up memory can be used to obtain a noise-free version. By binding this result with an actuator role vector (e.g. actuator_1), a noisy version of the corresponding command is obtained (8). Using the clean-up memory and the decoder, this can be translated into a motor command and used to control the robot (9).

Results We implemented this system and were able to successfully learn behaviors that solve the described simulation task from [22] using human demonstration runs. For more details, please refer to [26].

This demonstrates that VSAs can also be used to implement more complex programs, including action selection. It is possible to encode a complete robot program in a single vector. However, the complexity of the program is limited by the capacity of this vector. More work is required to investigate the practical potential of this example.
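A condensed sketch of Algorithms 1 and 2 follows (ours, with several simplifications: bipolar vectors, a stand-in encoder that assigns a random vector to each discrete value, a dictionary as clean-up memory, and the isDissimilar check omitted):

# Condensed sketch of behavior learning and recall (our simplifications, not the paper's code).
import numpy as np

rng = np.random.default_rng(8)
n = 10_000
rand_vec = lambda: rng.choice([-1, 1], size=n)
perm = rng.permutation(n)
protect = lambda v: v[perm]                                  # Π: protect a rule's condition

sensors   = [rand_vec() for _ in range(3)]                   # roles: left/right distance, light
actuators = [rand_vec() for _ in range(2)]                   # roles: left/right motor
codebook  = {}
encode = lambda v: codebook.setdefault(v, rand_vec())        # stand-in value encoder

def learn(samples):
    prog, cum = np.zeros(n, dtype=int), {}                   # program vector, clean-up memory
    for S, A in samples:
        condition = np.sign(sum(sensors[i] * encode(s) for i, s in enumerate(S)))
        result    = sum(actuators[i] * encode(a) for i, a in enumerate(A))
        prog += protect(condition) * result                  # bundle the rule Π(cond) ⊗ result
        cum.update({a: encode(a) for a in A})                # store actuator value encodings
    return prog, cum

def query(prog, cum, S):
    condition = np.sign(sum(sensors[i] * encode(s) for i, s in enumerate(S)))
    noisy_result = protect(condition) * prog                 # approx. result of the matching rule
    commands = []
    for role in actuators:                                   # unbind each actuator command
        noisy_cmd = role * noisy_result
        commands.append(max(cum, key=lambda a: noisy_cmd @ cum[a]))
    return commands

prog, cum = learn([((0, 2, 0), (1.0, 1.0)),                  # two demonstrated rules
                   ((2, 0, 0), (-0.5, 1.0))])
print(query(prog, cum, (0, 2, 0)))                           # -> [1.0, 1.0]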


6 Limitations, Discussion and Open Questions

We demonstrated that hyperdimensional computing and its implementation in form of VSAs have interesting properties and a variety of applications in the literature, and that they can be used to address robotic problems. However, there are a couple of shortcomings and potential limitations we want to discuss.

In our opinion, the first challenge is the lack of a clear definition of VSAs. It is a name for a collection of approaches to exploit the properties of high-dimensional vector spaces based on a set of operators with similar properties. We tried to collect information about different VSAs in Sect. 3 to provide a coherent definition. However, the list of properties of the operators includes terms like "should" or "approximately" without further ascertainment. The ultimate goal would be a theoretically rigorous definition in form of axioms and derived theorems about the capabilities of such systems.

There are trade-offs like the one between the binding and bundling operators in the Multiply–Add–Permute architecture explained in Sect. 3: the first works better using {−1, 1}^n as vector space, the other when using the interval [−1, 1]^n. This is not an unsolvable problem, neither in theory (e.g. by using a clean-up memory) nor in practice (we also used this VSA in the robotics experiments in Sect. 5). Although some theoretical insights on properties of VSAs are available (e.g. on the bundle capacity [7]), better insights into such trade-offs and limitations would support the practical application.

A particularly important and challenging task is the encoding of real world data into vectors. Our examples in Sect. 2 and most applications from Sect. 4 use random vectors, which are very likely pairwise almost orthogonal. However, for the shown robotic experiments in Sects. 5.2 and 5.3, we used encodings obtained from images using a CNN and LSH (Sect. 5.1). The resulting vectors span only a subspace of the vector space. Thus, presumably, the VSA mechanisms work only approximately; nevertheless, they provide reasonable results. Insights into the requirements on properties of the encoding/decoding could have a huge influence on practical applications.

The fact that in hyperdimensional computing most things work only approximately requires a different engineering mindset. In the foreseeable future, complex machines like robots will very likely contain a lot of engineering work; easier access for non-mathematicians to what works why and when in these systems would presumably be a very helpful contribution. A very interesting direction would also be the connection to the probabilistic methods that are widely used in this field.

Besides access to theoretical findings for applications of hyperdimensional computing, a structured way to practically design systems using VSAs is missing. Currently, almost every problem that is solved using hyperdimensional computing is a somehow isolated application. Although the same principles are used on every occasion, a structured approach to solving problems, e.g. by means of design patterns, would be very desirable. Also related is the very fundamental question which parts of the system have to be designed manually and which parts can be learned. Currently, many results are due to elaborate design rather than learning. However, the high-dimensional representations presumably provide easy access to connectionist learning approaches, potentially an elegant bridge between (deep) artificial neural networks and (vector) symbolic processing.


References

1. Aggarwal CC, Hinneburg A, Keim DA (2001) On the surprising behavior of distance metrics in high dimensional space. In: Van den Bussche J, Vianu V (eds) Database theory – ICDT 2001. Springer, Berlin, Heidelberg, pp 420–434
2. Ahmad S, Hawkins J (2015) Properties of sparse distributed representations and their application to hierarchical temporal memory. CoRR arXiv:1503.07469
3. Bellman RE (1961) Adaptive control processes: a guided tour. MIT Press, Cambridge
4. Beyer K, Goldstein J, Ramakrishnan R, Shaft U (1999) When is nearest neighbor meaningful? In: Beeri C, Buneman P (eds) Database theory – ICDT'99. Springer, Berlin, Heidelberg, pp 217–235
5. Danihelka I, Wayne G, Uria B, Kalchbrenner N, Graves A (2016) Associative long short-term memory. In: Balcan MF, Weinberger KQ (eds) Proceedings of the 33rd international conference on machine learning, proceedings of machine learning research, vol 48. PMLR, New York, pp 1986–1994. http://proceedings.mlr.press/v48/danihelka16.html
6. Eliasmith C, Stewart TC, Choo X, Bekolay T, DeWolf T, Tang Y, Rasmussen D (2012) A large-scale model of the functioning brain. Science 338(6111):1202–1205. https://doi.org/10.1126/science.1225266
7. Frady EP, Kleyko D, Sommer FT (2018) A theory of sequence indexing and working memory in recurrent neural networks. Neural Comput 30(6):1449–1513. https://doi.org/10.1162/neco_a_01084
8. Gayler RW (1998) Multiplicative binding, representation operators, and analogy. In: Advances in analogy research: integration of theory and data from the cognitive, computational, and neural sciences. Bulgaria
9. Gayler RW (2003) Vector symbolic architectures answer Jackendoff's challenges for cognitive neuroscience. In: Proc. of ICCS/ASCS int. conf. on cognitive science, Sydney, Australia, pp 133–138
10. Geusebroek JM, Burghouts GJ, Smeulders AWM (2005) The Amsterdam library of object images. Int J Comput Vis 61(1):103–112
11. Hastie T, Tibshirani R, Friedman J (2009) The elements of statistical learning: data mining, inference and prediction, 2nd edn. Springer. http://www-stat.stanford.edu/~tibs/ElemStatLearn/
12. Hawkins J, Ahmad S (2016) Why neurons have thousands of synapses, a theory of sequence memory in neocortex. Front Neural Circ 10:23. https://doi.org/10.3389/fncir.2016.00023
13. Jackendoff R (2002) Foundations of language (brain, meaning, grammar, evolution). Oxford University Press, Oxford
14. Joshi A, Halseth JT, Kanerva P (2017) Language geometry using random indexing. In: de Barros JA, Coecke B, Pothos E (eds) Quantum interaction. Springer International Publishing, Cham, pp 265–274
15. Kanerva P (1997) Fully distributed representation. In: Proc. of real world computing symposium, Tokyo, Japan, pp 358–365
16. Kanerva P (2009) Hyperdimensional computing: an introduction to computing in distributed representation with high-dimensional random vectors. Cognit Comput 1(2):139–159
17. Kanerva P (2014) Computing with 10,000-bit words. In: 2014 52nd annual Allerton conference on communication, control, and computing (Allerton), pp 304–310. https://doi.org/10.1109/ALLERTON.2014.7028470
18. Kleyko D, Osipov E, Gayler RW, Khan AI, Dyer AG (2015) Imitation of honey bees' concept learning processes using vector symbolic architectures. Biol Inspired Cognit Arch 14:57–72. https://doi.org/10.1016/j.bica.2015.09.002
19. Kleyko D, Osipov E, Papakonstantinou N, Vyatkin V, Mousavi A (2015) Fault detection in the hyperspace: towards intelligent automation systems. In: 2015 IEEE 13th international conference on industrial informatics (INDIN), pp 1219–1224. https://doi.org/10.1109/INDIN.2015.7281909
20. Kleyko D, Rahimi A, Rachkovskij DA, Osipov E, Rabaey JM (2018) Classification and recall with binary hyperdimensional computing: tradeoffs in choice of density and mapping characteristics. IEEE Trans Neural Netw Learn Syst 29(12):5880–5898. https://doi.org/10.1109/TNNLS.2018.2814400
21. Krizhevsky A, Sutskever I, Hinton GE (2012) ImageNet classification with deep convolutional neural networks. In: Pereira F, Burges C, Bottou L, Weinberger K (eds) Advances in neural information processing systems, vol 25. Curran Associates, Inc., pp 1097–1105. http://papers.nips.cc/paper/4824-imagenet-classification-with-deep-convolutional-neural-networks.pdf
22. Levy SD, Bajracharya S, Gayler RW (2013) Learning behavior hierarchies via high-dimensional sensor projection. In: Proc. of AAAI conference on learning rich representations from low-level sensors, AAAIWS'13-12, pp 25–27
23. Milford M, Wyeth GF (2012) SeqSLAM: visual route-based navigation for sunny summer days and stormy winter nights. In: Proceedings of the IEEE international conference on robotics and automation (ICRA)
24. Neubert P, Ahmad S, Protzel P (2018) A sequence-based neuronal model for mobile robot localization. In: Proc. of KI: advances in artificial intelligence
25. Neubert P, Protzel P (2015) Local region detector + CNN based landmarks for practical place recognition in changing environments. In: Proceedings of the European conference on mobile robotics (ECMR)
26. Neubert P, Schubert S, Protzel P (2016) Learning vector symbolic architectures for reactive robot behaviours. In: Proc. of int. conf. on intelligent robots and systems (IROS) workshop on machine learning methods for high-level cognitive capabilities in robotics
27. Osipov E, Kleyko D, Legalov A (2017) Associative synthesis of finite state automata model of a controlled object with hyperdimensional computing. In: IECON 2017 – 43rd annual conference of the IEEE industrial electronics society, pp 3276–3281. https://doi.org/10.1109/IECON.2017.8216554
28. Plate TA (1994) Distributed representations and nested compositional structure. Ph.D. thesis, Toronto, Ont., Canada
29. Purdy S (2016) Encoding data for HTM systems. CoRR arXiv:1602.05925
30. Rachkovskij DA, Slipchenko SV (2012) Similarity-based retrieval with structure-sensitive sparse binary distributed representations. Comput Intell 28(1):106–129. https://doi.org/10.1111/j.1467-8640.2011.00423.x
31. Rahimi A, Datta S, Kleyko D, Frady EP, Olshausen B, Kanerva P, Rabaey JM (2017) High-dimensional computing as a nanoscalable paradigm. IEEE Trans Circ Syst I Regular Pap 64(9):2508–2521. https://doi.org/10.1109/TCSI.2017.2705051
32. Smolensky P (1990) Tensor product variable binding and the representation of symbolic structures in connectionist systems. Artif Intell 46(1–2):159–216
33. Sünderhauf N, Dayoub F, Shirazi S, Upcroft B, Milford M (2015) On the performance of ConvNet features for place recognition. CoRR arXiv:1501.04158
34. Sünderhauf N, Neubert P, Protzel P (2013) Are we there yet? Challenging SeqSLAM on a 3000 km journey across all four seasons. In: Proceedings of the IEEE international conference on robotics and automation (ICRA), workshop on long-term autonomy
35. Sünderhauf N, Brock O, Scheirer W, Hadsell R, Fox D, Leitner J, Upcroft B, Abbeel P, Burgard W, Milford M, Corke P (2018) The limits and potentials of deep learning for robotics. Int J Robot Res 37(4–5):405–420. https://doi.org/10.1177/0278364918770733
36. Thrun S, Burgard W, Fox D (2005) Probabilistic robotics (intelligent robotics and autonomous agents). The MIT Press, Cambridge
37. Widdows D, Cohen T (2015) Reasoning with vectors: a continuous model for fast robust inference. Logic J IGPL/Interest Group Pure Appl Logics 2:141–173
38. Yerxa T, Anderson A, Weiss E (2018) The hyperdimensional stack machine. In: Proceedings of Cognitive Computing, Hannover, pp 1–2
