Generation of Fiducial Marker Dictionaries Using Mixed Integer Linear Programming (2015)

This paper presents two Mixed Integer Linear Programming (MILP) approaches for generating fiducial marker dictionaries that maximize inter-marker distances, enhancing error correction capabilities. The first approach guarantees optimal solutions but is limited to smaller dictionaries due to long computation times, while the second provides suboptimal solutions more quickly, outperforming existing methods. The study emphasizes the importance of custom dictionaries for specific applications in augmented reality and computer vision.



Article in Pattern Recognition · October 2015


DOI: 10.1016/j.patcog.2015.09.023


All content following this page was uploaded by Rafael Muñoz-Salinas on 11 October 2017.

Generation of fiducial marker dictionaries using mixed integer linear programming

S. Garrido-Jurado¹, R. Muñoz-Salinas*, F.J. Madrid-Cuevas, R. Medina-Carnicer


Computing and Numerical Analysis Department, Córdoba University, Spain

Abstract
Square-based fiducial markers are one of the most popular approaches for camera pose estimation due to their fast detection and robustness. To maximize their error correction capabilities, the inner binary codification must have a large inter-marker distance. This paper proposes two Mixed Integer Linear Programming (MILP) approaches to generate configurable square-based fiducial marker dictionaries that maximize the inter-marker distance. The first approach guarantees the optimal solution; however, it can only be applied to relatively small dictionaries and numbers of bits, since the computing times are too long for many situations. The second approach is an alternative formulation that obtains suboptimal dictionaries within restricted time, achieving results that still significantly surpass the current state-of-the-art methods.
Keywords: fiducial markers, MILP, mixed integer linear programming, augmented reality, computer vision.

* Corresponding author.
Email addresses: [email protected] (S. Garrido-Jurado), [email protected] (R. Muñoz-Salinas), [email protected] (F.J. Madrid-Cuevas), [email protected] (R. Medina-Carnicer)
¹ Computing and Numerical Analysis Department, Edificio Einstein, Campus de Rabanales, Córdoba University, 14071, Córdoba, Spain. Tlfn: (+34) 957212255

1. Introduction

Camera pose estimation is a common problem in numerous computer vision applications such as robot navigation [1, 2] or augmented reality [3, 4, 5], and is usually based on obtaining correspondences between environment and image points. While the use of natural features, such as key points or textures [6, 7, 8, 9], is a very popular strategy which does not require altering the environment, the use of fiducial markers is still of great importance since it provides point correspondences more robustly, efficiently and precisely.

In particular, square-based fiducial markers are the most popular in the field of augmented reality [4, 10, 11], since a single marker provides the four points required to estimate the camera pose (given that the camera is properly calibrated). In general, square-based markers use an inner binary code for identification, error detection and correction.

The detection process of this type of marker can be split into two main steps. The first step is the candidate search, which consists of finding square shapes in the image that look like markers. The second step is the identification stage, where the inner codification of the candidates is analyzed in order to determine whether they really are markers and whether they belong to the considered set of valid ones, also known as the dictionary.

A key aspect of such dictionaries is the inter-marker distance [10], which is the minimum Hamming distance between the binary codes of the markers, considering the four possible rotations. This distance defines the maximum number of bits that can be corrected without producing an inter-marker confusion error, i.e. a marker being erroneously identified as a different one. As a consequence, the inter-marker distance is directly related to the error correction capabilities of a dictionary. The larger the inter-marker distance, the lower the false negative and inter-marker confusion rates and, therefore, the higher the robustness of the process.

For instance, Figure 1 shows an example of inter-marker confusion error and the importance of large inter-marker distances. The first two markers have a short distance of only 1 bit, while the third marker has larger distances of at least 5 bits to the rest of the markers. As can be seen in Figures 1d,e, a single erroneous bit is enough to cause a wrong identification of the second marker. On the other hand, the third marker is correctly identified despite having a higher number of errors.

Most related works propose their own predefined dictionary of markers with a fixed number of markers and bits, and a constant inter-marker distance. However, using a predefined dictionary for every application is not the optimal approach. Instead, if the number of required markers and their size is known, it is preferable to create a custom dictionary that maximizes the inter-marker distance and, consequently, the error detection and correction capabilities. Although this is the tendency of the latest proposals [12, 13], they rely on heuristic approaches, none of them being optimal.

This paper presents two novel dictionary generation methods based on the Mixed Integer Linear Programming (MILP) paradigm. The first MILP model proposed guarantees the optimal inter-marker distance for a specific number

Preprint submitted to Pattern Recognition October 5, 2015


Figure 1: Example of inter-marker confusion error and how it is avoided with large inter-marker distances. (a) Original marker images. The marker distances are $D(m^0, m^1) = 1$, $D(m^0, m^2) = 5$ and $D(m^1, m^2) = 6$. (b) Real image with the markers placed in the environment. (c) Marker images after removing the perspective. (d) Inner bits extracted from the images in (c). Erroneous bits are highlighted in red. (e) Final identifiers assigned to each marker. It can be observed that $m^1$ has been confused with $m^0$. On the other hand, $m^2$, despite a higher number of errors, can be correctly identified since its distance to the rest of the markers is higher. (Best seen in color.)

of markers and bits. To our knowledge, it is the first approach in the literature that assures optimal results in terms of inter-marker distance. However, since the convergence time of this model is too long for many applications, we also propose an alternative MILP formulation that converges faster and, although it does not guarantee optimality, obtains results that significantly surpass the current state of the art. The two approaches presented in this paper constitute relevant improvements to the error correction capabilities of fiducial marker systems.

The rest of the paper is structured as follows. Section 2 reviews related work. Section 3 presents a mathematical formalization of the problem. Section 4 details the proposed MILP models. Finally, Section 5 presents the experimentation carried out, and Section 6 draws some conclusions.

2. Related work

Fiducial markers are synthetic elements placed in the working area either to facilitate the camera pose estimation task or for labeling purposes. They are specially designed to be easily detected even at low resolutions, and most applications require not one but many different markers (a dictionary). Thus, the ability to identify them uniquely is an important feature. Several fiducial marker systems have been proposed in the literature, as shown in Figure 2.

The simplest proposals are those based on fiducial points. These markers constitute a single scene point and are usually based on LEDs, retroreflective spheres or planar dots [14, 15]. Their identification is typically based on the relative positions of different points, which can be a limiting and complex process.

A direct evolution of the point markers are the circular markers [16, 17] (Fig. 2a). These markers are similar to the previous ones except that they include information (as circular sectors or concentric rings) to facilitate the identification process. Their main drawback is that they provide only one correspondence point per marker.

Other types of fiducial markers are based on blob detection. For instance, Cybercode [18] and VisualCode [19] (Fig. 2b,c) are based on the same technology as QR and Maxicode [20] codes, but provide several correspondence points. Other popular fiducial markers based on blob detection are the ReacTIVision amoeba markers [21], which are designed using genetic algorithms (Fig. 2d).

One of the most popular families of fiducial markers is the square-based one. These markers contain a black border to ease their detection and employ their inner region for identification purposes. Their main benefit is that each marker provides four prominent points (i.e. its four corners) which can be easily detected and employed as correspondence points, thus allowing camera pose estimation using a single marker.

In this category, one of the most popular systems is ARToolKit [4], an open source project which has been extensively used in the last decade, especially in the academic community. ARToolKit markers include a pattern in their inner region for identification, which can be customized by the user (Fig. 2e). Despite its popularity, it presents some drawbacks. Firstly, it uses a template matching strategy to identify the markers, which produces a high false positive rate [22]. Secondly, the square detection is based on global thresholding, which makes it highly sensitive to the lighting conditions.

Instead of using template matching, the majority of square-based marker systems employ a binary codification in their inner region [23, 10, 11]. Most approaches use a

codification based on classic methods of signal coding, such as CRC codes [24], achieving a more robust identification and facilitating the error detection and correction processes.

Figure 2: Examples of fiducial markers proposed in previous works: (a) Intersense, (b) CyberCode, (c) VisualCode, (d) ReacTIVision, (e) ARToolKit, (f) Matrix, (g) ARTag, (h) ARToolKit Plus, (i) AprilTags, (j) ArUco.

Matrix [23] is one of the first and simplest proposals which uses a binary code with redundant bits for error detection (Fig. 2f). ARTag [10] (Fig. 2g) is based on the same idea but employs a more robust codification. Furthermore, it improves the square detection by using an edge-based method instead of the global thresholding of ARToolKit. ARTag provides its marker dictionary in a specific order so that the inter-marker distance is maximized. Its main drawbacks are that the marker size is fixed to 6 × 6 bits and that the error correction process can correct up to one bit, independently of the inter-marker distance of the marker subset in use.

ARToolKit Plus [11] (Fig. 2h) improves some of the features of its predecessor ARToolKit. Firstly, it employs a dynamic method to update the global threshold depending on the pixel values in the previously detected markers. Secondly, as in ARTag, it provides a binary codification for marker identification. The first ARToolKit Plus version includes 512 markers whose codification is based on repeating a 9-bit identifier four times, achieving a minimum distance of four bits between any pair of markers. The last known version instead proposes a dictionary of 4096 markers with 6 × 6 bits based on a BCH codification [25] that presents a minimum distance of two bits, so that error correction is not possible. The ARToolKitPlus project was halted and followed by the Studierstube Tracker [26] project, which is not publicly available.

The main problem of codifications based on classic coding techniques is that they need to deal with the different marker rotations, which negatively affect the inter-marker distances of the generated dictionaries. Some approaches employ special anchor points, such as QR and Maxicode [20], to remove the rotation ambiguity. However, this complicates the detection process and, more importantly, these anchor points are vulnerable to errors since they are not protected by any coding system.

Most recent approaches rely on heuristics to select a set of markers with large inter-marker distances, considering rotation. However, since the search space is very large even for small dictionaries with a low number of bits, an exhaustive search is unfeasible and optimality cannot be guaranteed.

One of the first and simplest proposals is the BinARyID system [27], whose dictionary generation process is based on selecting those markers that satisfy a minimum Hamming distance of one to any of the previously selected markers, so that rotation ambiguities are avoided. The markers are analyzed in ascending order until the desired number of markers is achieved. Its main problem is that it does not allow error detection and correction (since the distance is one), and the generation times can be prohibitive for large dictionaries.

The generation method proposed by the AprilTags library [13] (Fig. 2i) employs a similar approach to BinARyID, but with some significant improvements. Firstly, the minimum Hamming distance can be provided by the user, generating dictionaries with larger inter-marker distances and, hence, allowing error detection and correction. Secondly, instead of analyzing markers one by one in ascending order, larger increments are performed based on a heuristic approach, so that markers with larger inter-marker distances are found faster. Finally, selected markers also need to satisfy a minimum geometric complexity to increase the number of bit transitions. Its main downside is that the generation time is still very long, especially for large marker sizes.

In [12], the ArUco coding system is presented (Fig. 2j). Its generation is based on maximizing both the inter-marker distance and the number of bit transitions. Contrary to AprilTags, the minimum inter-marker distance does not need to be provided by the user; instead, it is automatically derived during the generation process. However, it has two main downsides. Firstly, it uses a time-consuming stochastic search strategy. Secondly, as the marker size increases, its memory requirements grow exponentially, limiting the maximum size to which it can be applied. Like ARTag, ArUco sorts the generated markers in a list so as to maximize the inter-marker distance.

This paper proposes two novel approaches to generate square-based fiducial marker dictionaries based on Mixed Integer Linear Programming (MILP) [28]. MILP methods can achieve the optimal results of a mathematical model represented by linear relationships and where some unknowns are constrained to be integers. MILP problems receive special attention from the community since they fit many real-life situations, such as embedded system design [29], industrial processes [30], automatic scheduling [31], distribution systems [32] or trajectory planning [33]. However, this kind of problem is known to be NP-hard and, as a consequence, there have been many efforts to develop techniques that speed up the convergence process.
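The greedy strategy described above for BinARyID and AprilTags (scan candidate markers in ascending order and keep those that are far enough from every marker already selected) can be sketched as follows. This is only an illustrative sketch under stated assumptions, not the implementation of either library: the function names (`rotate90`, `rotation_aware_distance`, `self_distance`, `greedy_dictionary`) are hypothetical, candidates are enumerated exhaustively in ascending binary order, and the larger increments and geometric complexity test of AprilTags are omitted.

```python
from itertools import product


def rotate90(bits, n):
    """Rotate an n x n marker (row-major tuple of bits) 90 degrees clockwise."""
    return tuple(bits[(n - 1 - c) * n + r] for r in range(n) for c in range(n))


def rotations(bits, n):
    """Return the marker followed by its three clockwise rotations."""
    out = [tuple(bits)]
    for _ in range(3):
        out.append(rotate90(out[-1], n))
    return out


def hamming(a, b):
    """Number of positions where two equally long bit tuples differ."""
    return sum(x != y for x, y in zip(a, b))


def rotation_aware_distance(a, b, n):
    """Minimum Hamming distance between a and any rotation of b."""
    return min(hamming(a, r) for r in rotations(b, n))


def self_distance(a, n):
    """Minimum Hamming distance between a and its three rotated versions."""
    return min(hamming(a, r) for r in rotations(a, n)[1:])


def greedy_dictionary(n, size, min_distance):
    """Greedily collect `size` markers of n x n bits (hypothetical helper).

    A candidate is accepted when its self-distance and its distance to every
    previously accepted marker are at least `min_distance`.
    """
    chosen = []
    for candidate in product((0, 1), repeat=n * n):  # ascending binary order
        if self_distance(candidate, n) < min_distance:
            continue
        if all(rotation_aware_distance(candidate, m, n) >= min_distance
               for m in chosen):
            chosen.append(candidate)
            if len(chosen) == size:
                break
    return chosen


# Toy usage: four markers of 3 x 3 bits with a minimum distance of 2.
example = greedy_dictionary(3, 4, 2)
```

Because every candidate is compared against all previously accepted markers under four rotations, the cost of such a scan grows quickly with the marker size, which is consistent with the generation-time drawbacks noted above. A greedy scan also gives no guarantee that the resulting minimum distance is the best achievable one.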

In contrast to previous works, our first proposed method achieves the optimal dictionary in terms of inter-marker distance for any number of markers and bits. To our knowledge, it is the first approach in the literature that guarantees optimal inter-marker distances. However, it suffers from the curse of dimensionality, and the computing times become too long as the size of the dictionaries or the number of bits increases. Thus, we propose a second approach that obtains suboptimal dictionaries within restricted computing times, which compare very favorably to the dictionaries obtained by previous works.

As shown in the experimental section, our methods obtain state-of-the-art dictionaries (in terms of inter-marker distances), which are a relevant improvement to the error correction capabilities of square fiducial marker systems.

3. Problem formulation

The most relevant aspects to consider during the design of a marker dictionary are the false positive rate, the false negative rate and the inter-marker confusion rate. The first two are usually handled using error detection and correction techniques. On the other hand, the inter-marker confusion rate depends only on the distance among the dictionary markers. If the distance between two markers is short, a marker could be confused with another one with just a few bit modifications, and the error could not be detected. As a consequence, this value also affects the maximum number of erroneous bits that can be corrected.

Let us denote by $\mathbb{D}$ the set of all possible markers of $n \times n$ bits and by $\mathcal{D}_d \subset \mathbb{D}$ the subset of all vectors formed by exactly $d$ markers. An element $D = (m^1, m^2, m^3, \ldots, m^d)$ of the set $\mathcal{D}_d$ is named a dictionary, where the super-index indicates the marker position in the dictionary.

An automatic dictionary generation process consists of selecting a dictionary $D$ from the set $\mathcal{D}_d$, so that each $m^i$, $i \in \{1, \ldots, d\}$, is as far from each $m^j$, $j \in \{1, \ldots, d\}$, $j \neq i$, as possible. In general, the problem is to find the dictionary $D^*$ that maximizes the desired criterion $\tau(D)$:

\[
D^* = \operatorname*{argmax}_{D \in \mathcal{D}_d} \{ \tau(D) \}. \tag{1}
\]

The criterion $\tau(D)$ employed in this work is the dictionary inter-marker distance, which is the minimum Hamming distance between any two markers of $D$. This value is of great importance since it indicates the maximum number of erroneous bits that can be corrected: $\lfloor (\tau(D) - 1)/2 \rfloor$. If the number of erroneous bits of a marker is lower than or equal to this value, it can be guaranteed that the closest marker in the dictionary is the correct one. However, if the number of erroneous bits is higher than $\tau(D)$, the assumption does not hold and, hence, the error correction cannot be performed because the closest marker might not be the correct one (inter-marker confusion mistake). Thus, the objective of a dictionary generation method is to maximize the function $\tau(D)$.

Since an exhaustive evaluation of the entire search space is not feasible, a MILP model to find the optimal solution is proposed in this work.

Let us start by defining the marker named $i$, i.e. a component of a dictionary $D$, as a binary matrix $m^i$ of size $n \times n$:

\[
m^i = (m^i_1, m^i_2, m^i_3, \ldots, m^i_{n \times n}) \;|\; m^i_k \in \{0, 1\}, \tag{2}
\]

where $m^i_k$ denotes the $k$-th bit of the marker matrix $m^i$ assuming a row-major order.

Note that a marker detected in an image can be rotated with respect to its original position and, thus, the marker bits will also be rotated. As shown in Fig. 3, a marker rotation can be formulated as a permutation of the marker bits. For a marker $m^i$, let us define its analogous set, $A(m^i)$, as the set of the markers obtained after the three possible rotations plus the marker itself:

\[
A(m^i) = \bigcup_{l=0}^{3} R_l(m^i), \tag{3}
\]

where $R_l$ is an operator that rotates the marker bits $l \times 90$ degrees in the clockwise direction. In fact, all the markers in an analogous set can be considered as equivalent solutions of the search space.

\[
m = \begin{pmatrix} m_1 & m_2 & m_3 \\ m_4 & m_5 & m_6 \\ m_7 & m_8 & m_9 \end{pmatrix}, \quad
R_1(m) = \begin{pmatrix} m_7 & m_4 & m_1 \\ m_8 & m_5 & m_2 \\ m_9 & m_6 & m_3 \end{pmatrix}, \quad
R_2(m) = \begin{pmatrix} m_9 & m_8 & m_7 \\ m_6 & m_5 & m_4 \\ m_3 & m_2 & m_1 \end{pmatrix}, \quad
R_3(m) = \begin{pmatrix} m_3 & m_6 & m_9 \\ m_2 & m_5 & m_8 \\ m_1 & m_4 & m_7 \end{pmatrix}
\]

Figure 3: Example of 90-degree rotations of a marker, $m$, composed of 3 × 3 bits. The marker $m$ is represented by the bits in row-major order $(m_1, \ldots, m_9)$. As can be observed, after one rotation, the bits are permuted and the obtained marker, $R_1(m)$, is represented by $(m_7, m_4, m_1, m_8, m_5, m_2, m_9, m_6, m_3)$. The same process repeats for the rest of the rotations.

Since our goal is to obtain a dictionary that maximizes the inter-marker distance, the distance between two markers, $m^i, m^j \;|\; i, j \in \{1, \ldots, d\}, i \neq j$, must be properly defined. Considering that all the elements in an analogous set are equivalent, the distance between markers from different analogous sets is defined as:

\[
D(m^i, m^j) = \min_{m^k \in A(m^j)} \{ H(m^i, m^k) \}, \tag{4}
\]

where the function $H$ is the Hamming distance between two markers.

Furthermore, since we want to obtain the camera pose with respect to the marker, its corners must be identified unequivocally. Therefore, the distance of a marker to the rest of the elements in its analogous set must be considered too. This distance, referred to as marker self-distance, is defined as:

\[
S(m^i) = \min_{m^k \in A'(m^i)} \{ H(m^i, m^k) \}. \tag{5}
\]
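The rotation operator, the analogous set and the two distances of Eqs. 3 to 5 can be sketched in a few lines of Python. This is a minimal illustration with hypothetical names (`rotation_permutation`, `rotate`, `analogous_set`, `marker_distance`, `self_distance`); markers are assumed to be row-major tuples. One further assumption of the sketch: the three rotations used for the self-distance are kept as a list rather than as a mathematical set, so a rotationally symmetric marker obtains a self-distance of 0.

```python
def rotation_permutation(n):
    """Indices p such that R_1(m)[k] == m[p[k]] (one clockwise 90-degree turn)."""
    return [(n - 1 - c) * n + r for r in range(n) for c in range(n)]


def rotate(m, l, n):
    """Apply R_l: rotate the row-major marker m by l x 90 degrees clockwise."""
    p = rotation_permutation(n)
    m = tuple(m)
    for _ in range(l % 4):
        m = tuple(m[i] for i in p)
    return m


def analogous_set(m, n):
    """The marker itself (l = 0) followed by its three rotations (Eq. 3)."""
    return [rotate(m, l, n) for l in range(4)]


def hamming(a, b):
    """Hamming distance H between two markers of the same size."""
    return sum(x != y for x, y in zip(a, b))


def marker_distance(mi, mj, n):
    """Distance between two markers, minimized over the rotations of mj (Eq. 4)."""
    return min(hamming(mi, mk) for mk in analogous_set(mj, n))


def self_distance(mi, n):
    """Distance of a marker to its own three rotated versions (Eq. 5)."""
    return min(hamming(mi, mk) for mk in analogous_set(mi, n)[1:])


# The permutation reproduces the 3 x 3 example of Figure 3.
labels = ("m1", "m2", "m3", "m4", "m5", "m6", "m7", "m8", "m9")
print(rotate(labels, 1, 3))
```

Representing the rotation as an index permutation mirrors the formulation in the text: the permutation depends only on $n$, so it can be computed once and reused for every marker.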

In Eq. 5, $A'(m^i)$ is the analogous set of $m^i$ without considering the marker itself:

\[
A'(m^i) = A(m^i) - \{m^i\}. \tag{6}
\]

In the end, the objective function $\tau(D)$ in Eq. 1 is the minimum distance among the marker self-distances and the distances between any pair of markers in the dictionary:

\[
\tau(D) = \min \left\{ \min_{m^i \in D} S(m^i), \; \min_{\substack{m^i, m^j \in D \\ m^i \neq m^j}} D(m^i, m^j) \right\}. \tag{7}
\]

This function represents the inter-marker distance of a dictionary, and the goal of the dictionary generation methods proposed in this work is to maximize it.

Figure 1 shows a marker detection example which illustrates the benefits of using dictionaries with large inter-marker distances. A dictionary composed of 3 markers of 6 × 6 bits is shown in Fig. 1a,b. While the distance between $m^0$ and $m^1$ is only one bit, the distances to $m^2$ are larger, $D(m^0, m^2) = 5$ and $D(m^1, m^2) = 6$. The markers have been detected using a standard marker detection process like the one in [12]. Figure 1c shows the images obtained after removing the perspective distortion of each marker, and Figure 1d shows the bits extracted from each image. It can be observed that the images in Figure 1c have a considerable amount of noise, which usually produces errors during the bit extraction process. Erroneous bits are highlighted in red in Figure 1d. Finally, Figure 1e shows the identifiers assigned to each of the markers applying a maximum error correction of 2 bits. The marker $m^1$ has been erroneously identified as $m^0$ due to the erroneously extracted bit and the short marker distance. Note that a single error is enough to make $m^1$ identical to $m^0$. On the other hand, $m^2$ has been correctly identified despite presenting two erroneous bits. Due to the large marker distance, it is less likely that $m^2$ gets erroneously identified during the identification step.

3.1. Maximum inter-marker distance: $\tau^n_{max}$

In order to reduce the search space and accelerate the MILP convergence, it is of great importance to know the maximum possible value of the objective function $\tau(D)$. Let us denote by $\tau^n_{max}$ the maximum possible inter-marker distance of a dictionary with markers of $n \times n$ bits. In this section, the derivation of this value is presented as already done in [12].

If we think of the simplest possible dictionary, we realize that it is composed of a single marker. Then, the marker self-distance constitutes the inter-marker distance of the dictionary. If a second marker is added to the dictionary, the new inter-marker distance will be smaller than or equal to the previous one. As a consequence, the maximum theoretical value of $\tau^n_{max}$ is given only by the self-distance (Eq. 5) of the first marker.

The key point to understand the derivation of $\tau^n_{max}$ is the concept of quartet, which is the set of four bits that interchange their positions at each rotation (see Figure 4). As can be observed, these four bits do not interact with the rest of the bits when the marker is rotated. Hence, a quartet contributes to the marker self-distance independently from the rest of the quartets. In general, the number of quartets of a marker is $C = \lfloor n^2 / 4 \rfloor$. If $n$ is odd, the central bit constitutes a quartet by itself that can be ignored since it does not influence the self-distance.

Figure 4: Example of quartet in a 4 × 4 bits marker.

Since a quartet is composed of 4 bits, there is a total of 16 different possible quartets. From a brief study, it can be observed that some of the quartets provide the same Hamming distances in each rotation. For the purpose of calculating $\tau^n_{max}$, these quartets can be considered equivalent and they can be grouped into the same quartet group, $Q_i$. Table 1 shows the 4 different quartet groups and the Hamming distances they provide in each rotation.

Table 1: Quartet groups and Hamming distances they provide in each rotation.

Group   Quartets                                          Hamming distances
                                                          90 deg   180 deg   270 deg
Q1      0000, 1111                                        0        0         0
Q2      1000, 0100, 0010, 0001, 1110, 0111, 1011, 1101    2        2         2
Q3      1100, 0110, 0011, 1001                            2        4         2
Q4      0101, 1010                                        4        0         4

Table 2: Quartet assignment for a 4 × 4 marker ($C = 4$) to obtain $\tau^4_{max} = 10$. It can be observed that the sequence $\{Q_3, Q_3, Q_4\}$ is repeated until filling all the quartets in the marker.

Quartet           Group   Hamming distances
                          90 degrees   180 degrees   270 degrees
1                 Q3      2            4             2
2                 Q3      2            4             2
3                 Q4      4            0             4
4                 Q3      2            4             2
Total distances           10           12            10
$\tau^4_{max}$            min(10, 12, 10) = 10

For instance, the quartet 1100 contributes with Hamming distances (2, 4, 2) as it rotates:

\[
H(1100, 0110) = 2; \quad H(1100, 0011) = 4; \quad H(1100, 1001) = 2,
\]

and quartet 1001, which belongs to the same quartet group, contributes with the same Hamming distances:

\[
H(1001, 1100) = 2; \quad H(1001, 0110) = 4; \quad H(1001, 0011) = 2.
\]

Hence, the problem of obtaining the maximum marker self-distance consists of assigning each quartet to a quartet group.
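The grouping of Table 1 and the total of Table 2 can be checked by brute force. The sketch below is an illustration only, with hypothetical names (`rotation_distances`, `best_min_total`). It assumes a quartet is written as a 4-character bit string in which one 90-degree rotation corresponds to a cyclic shift by one position, which matches the example $H(1100, 0110) = 2$ in the text.

```python
from itertools import product


def rotation_distances(q):
    """Hamming distances between quartet q and its 90/180/270-degree rotations.

    A rotation by s x 90 degrees is modelled as a cyclic right shift by s.
    """
    return tuple(
        sum(a != b for a, b in zip(q, q[-s:] + q[:-s]))
        for s in (1, 2, 3)
    )


# Group the 16 possible quartets by the distances they provide (Table 1).
groups = {}
for bits in product("01", repeat=4):
    quartet = "".join(bits)
    groups.setdefault(rotation_distances(quartet), []).append(quartet)


def best_min_total(num_quartets, profiles):
    """Best achievable minimum of the three per-rotation totals.

    Every assignment of a distance profile to each quartet is enumerated,
    so this is only practical for a handful of quartets.
    """
    best = 0
    for choice in product(profiles, repeat=num_quartets):
        totals = [sum(p[k] for p in choice) for k in range(3)]
        best = max(best, min(totals))
    return best


for profile, members in sorted(groups.items()):
    print(profile, members)

# Assignment of Table 2 for a 4 x 4 marker: Q3, Q3, Q4, Q3.
table2 = [(2, 4, 2), (2, 4, 2), (4, 0, 4), (2, 4, 2)]
print([sum(p[k] for p in table2) for k in range(3)])
print(best_min_total(4, list(groups)))
```

The enumeration grows as $4^C$, so it only serves as a sanity check for small markers.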

This assignment must be made so that the minimum marker distance over the three rotations is maximized. If this is understood as a multiobjective problem, the Pareto front is composed of the quartet groups $Q_3$ and $Q_4$, since they dominate all the other solutions. Thus, the problem is simplified to assigning each quartet to one of the two quartet groups $Q_3$ and $Q_4$. Then, it can be easily deduced that the maximum value $\tau^n_{max}$ is obtained by assigning the groups $\{Q_3, Q_3, Q_4\}$ (in this order) repeatedly until completing all the quartets of a marker.

For instance, for a 4 × 4 marker ($C = 4$), $\tau^4_{max}$ is obtained by assigning the groups $\{Q_3, Q_3, Q_4, Q_3\}$, as shown in Table 2.

In general, the maximum inter-marker distance is calculated as:

\[
\tau^n_{max} = 2 \left\lfloor \frac{4C}{3} \right\rfloor. \tag{8}
\]

4. Proposed solutions

This section presents our proposals to generate marker dictionaries using MILP. First, a short introduction to MILP is given. Then, our first model is presented, which obtains optimal solutions. Finally, our second model, which obtains suboptimal solutions within restricted time, is presented.

4.1. Mixed Integer Linear Programming (MILP)

An integer linear programming (ILP) problem is a mathematical optimization or feasibility program in which some or all of the decision variables are restricted to be integers and the objective function and the constraints are linear. The canonical form of an integer linear program is:

\[
\begin{aligned}
\text{maximize} \quad & c^{t} x \\
\text{subject to} \quad & A x \leq b \\
& x \geq 0 \\
& x \in \mathbb{Z},
\end{aligned} \tag{9}
\]

where $x$ is the vector of decision variables, $c$ is the coefficient vector of the objective function, $A$ is the coefficient matrix of the constraints and $b$ is the constant terms vector. The last constraint forces the decision variables to be integers, although in practice this constraint can be applied to some or all of the variables. In case all variables are constrained to be integers, the problem is known as a Pure Integer Linear Programming Problem (PILP); otherwise it is known as Mixed Integer Linear Programming (MILP). Whereas Linear Programming problems belong to complexity class P [34], which means they are efficiently solvable, ILP or MILP problems are known to be NP-hard due to the integer restriction and, thus, no polynomial-time algorithm is known for them [35]. In fact, the particular case where the decision variables are binary is one of the problems in the well known list of Karp's 21 NP-complete problems [36]. This also implies that the computational complexity cannot be determined, neither analytically nor experimentally.

However, this kind of problem has been extensively studied in the literature and many techniques have been proposed in order to obtain the optimal solution efficiently. The main techniques are based on the Branch and Cut method [37], which is an iterative process that combines the Branch and Bound [38] algorithm and the use of cutting planes [39].

Branch and Bound is an optimization algorithm based on a search tree which is explored by partitioning the search space on each node. It is composed of the branching step and the bounding step.

During the branching step, a node is split into several branches by dividing the possible values of a specific variable, so that the union of all the branches covers all the possibilities. In the MILP procedure, this is performed by splitting the different values that a decision variable can take.

The bounding step determines the lower and upper bounds of the optimization function in a particular branch. If these bounds cannot surpass the current best solution, the branch is pruned, reducing the search space. In the MILP case, non-integral solutions to LP relaxations, i.e. the problem without considering the integer constraints, serve as upper bounds and integral solutions serve as lower bounds.

Finally, cutting planes can be applied during the optimization process to further reduce the search space. Cutting planes generate new restrictions for the model that are satisfied by any feasible solution, i.e. any integer solution, but violated by the current solution of the LP relaxation, so that the next non-integer optimal solution should be closer to the integer one.

The Branch and Cut algorithm explores the tree until finding the optimal solution; nevertheless, many feasible non-optimal solutions can also be found during the process.

4.2. Optimal Dictionary

In order to obtain dictionaries with optimal inter-marker distances, we propose a MILP model to obtain the maximum value of the cost function $\tau(D)$. This model is processed by a MILP solver so that the Branch and Cut algorithm is applied to reduce the search space and speed up the convergence to the optimum.

The decision variables of the proposed model are the bits $m^i_j$ of the dictionary markers, i.e., one binary variable per bit. In order to formulate the MILP problem, let us rewrite the objective function in Eq. 7 as:

\[
\tau(D) = \min \left\{ \min_{\substack{m^i \in D \\ m^k \in A'(m^i)}} \{ H(m^i, m^k) \}, \; \min_{\substack{m^i, m^j \in D \\ m^i \neq m^j \\ m^k \in A(m^j)}} \{ H(m^i, m^k) \} \right\}. \tag{10}
\]

Thus, our goal can be enunciated as maximizing the minimum of a set of Hamming distances, some of which are self-distances and the others are distances between pairs of markers.

6
markers. The Hamming distance between two markers can be expressed as:

$$
H(m^i, m^j) = \sum_{k=1}^{n \times n} m^i_k \otimes m^j_k,
\tag{11}
$$

where $\otimes$ is the exclusive-or operator. Since this is a non-linear operation, it must be reformulated as a linear one in order to be represented in a MILP model. This is accomplished by introducing, for each exclusive-or operation, a new auxiliary binary decision variable, $\delta$, and the following set of constraints:

$$
\begin{aligned}
m^i_k \otimes m^j_k &= m^i_k + m^j_k - 2\delta \\
\delta &\le m^i_k \\
\delta &\le m^j_k \\
\delta &\ge m^i_k + m^j_k - 1.
\end{aligned}
\tag{12}
$$

Finally, since the objective function is the minimum of a set of values, a new auxiliary decision variable, $\tau^*$, is added to represent the minimum of all the Hamming distances (Eq. 10). The proposed problem formulation is then defined as:

$$
\begin{aligned}
& \text{maximize} \quad \tau^* \\
& \text{subject to, } \forall m^i \in D: \\
& \text{(I)} \quad H(m^i, m^k) - \tau^* \ge 0 \qquad \forall m^k \in A(m^j),\ \forall m^j \in D,\ j \neq i \\
& \text{(II)} \quad H(m^i, m^k) - \tau^* \ge 0 \qquad \forall m^k \in A'(m^i) \\
& \text{(III)} \quad 0 \le \tau^* \le \tau^n_{\max} \\
& \text{(IV)} \quad I(m^i) - I(m^k) \ge 0 \qquad \forall m^k \in A'(m^i) \\
& \text{(V)} \quad I(m^i) - I(m^{i+1}) \ge 0 \qquad m^{i+1} \in D \\
& \text{(VI)} \quad \sum_{m^j \in D} \sum_{k=1}^{n \times n} m^j_k \ge \frac{d\,n^2}{2},
\end{aligned}
\tag{13}
$$

so that the minimum distance $\tau^*$ is maximized, ensuring that every Hamming distance is larger than or equal to this value. Note that after the optimization process, the variable $\tau^*$ will contain the value of $\tau(D)$.

Constraint (I) guarantees that the distance between any pair of markers in the dictionary is greater than or equal to $\tau^*$, while constraint (II) guarantees the same condition for all the self-distances. Note that the previous model is a simplified version, since each Hamming distance is represented by a sum of exclusive-or operations (which include the decision variables associated with the marker bits, see Eq. 11). Furthermore, each exclusive-or operation requires the addition of the inequalities in Eq. 12 and the auxiliary variables $\delta$. We have decided not to represent all this auxiliary information in Eq. 13 for the sake of clarity.

Constraint (III) defines an upper bound for the value of $\tau^*$. This bound is the maximum inter-marker distance $\tau^n_{\max}$, which is theoretically obtained in Sec. 3.1. However, if an optimal dictionary with fewer markers has been previously generated, its optimal objective value can be employed as the upper bound, since it is not possible to surpass the optimal solution of a dictionary with fewer markers. This modification accelerates the convergence process.

Finally, in order to reduce the search space, constraints (IV), (V) and (VI) are added to remove symmetric solutions.

Note that for an optimal solution of the model, another optimal solution can be obtained by just rotating any of the markers (i.e., selecting another marker from its analogous set $A(m^i)$). To avoid this, only the markers with the highest encoded value in the analogous sets are considered. This is achieved by constraint (IV), where $I(m^i)$ represents the number encoded by the marker bits:

$$
I(m^i) = \sum_{k=1}^{n \times n} 2^{k-1} m^i_k.
\tag{14}
$$

Another symmetry arises from the fact that a dictionary is a vector of markers; thus, permutations of its elements lead to equivalent solutions. Constraint (V) is added to avoid this symmetry by forcing a strict ascending order of the markers.

Finally, an optimal solution can be converted into another one by just inverting all its bits. To avoid this symmetry, constraint (VI) forces the total number of ones to be higher than or equal to the total number of zeros. Thus, only one of the two opposite solutions is valid. The parameter $d$ is the cardinality of the dictionary.

4.3. Suboptimal Dictionary

The previous model achieves the optimal inter-marker distance results. However, as shown in Sec. 5, the convergence times are too long despite the efforts made to reduce the search space. As a consequence, it can only be applied to generate dictionaries with a relatively small number of markers and bits. In this section, an alternative formulation that obtains suboptimal results in much less time is proposed.

In this case, instead of generating all the markers in a single MILP optimization step, an iterative method, where markers are generated incrementally, is proposed. At each iteration, $t$, a new MILP model is defined to generate the next marker $m^t$, which maximizes its distance to all the previously generated markers and its self-distance. As in the previous case, the decision variables of this model are the marker bits. Then, the objective function at the $t$-th iteration is defined as:

$$
\begin{aligned}
\tau(D_t) &= \min\left\{ S(m^t),\; \min_{m^i \in D_{t-1}} \{D(m^t, m^i)\} \right\} \\
&= \min\left\{
\min_{m^k \in A'(m^t)} H(m^t, m^k),\;
\min_{\substack{m^i \in D_{t-1} \\ m^k \in A(m^i)}} \{H(m^t, m^k)\}
\right\}.
\end{aligned}
\tag{15}
$$
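The objective in Eq. 15 can be evaluated directly for a fixed candidate marker. The following Python sketch does so under the assumption that the analogous set $A(m)$ consists of the marker and its three 90-degree rotations, and that $A'(m)$ excludes the marker itself. The function names are illustrative and not part of the original formulation, and no MILP solver is involved: the code only scores a given candidate.

```python
from typing import List, Sequence, Tuple

Marker = Tuple[Tuple[int, ...], ...]  # n x n grid of bits, one tuple per row


def rotate90(m: Marker) -> Marker:
    """Rotate a square bit grid by 90 degrees clockwise."""
    return tuple(zip(*m[::-1]))


def rotations(m: Marker) -> List[Marker]:
    """Assumed analogous set A(m): the marker followed by its three rotations."""
    out = [m]
    for _ in range(3):
        out.append(rotate90(out[-1]))
    return out


def hamming(a: Marker, b: Marker) -> int:
    """H(a, b): number of bit positions in which two markers differ (Eq. 11)."""
    return sum(x != y for row_a, row_b in zip(a, b) for x, y in zip(row_a, row_b))


def self_distance(m: Marker) -> int:
    """S(m): minimum distance from m to its non-trivial rotations A'(m)."""
    return min(hamming(m, r) for r in rotations(m)[1:])


def marker_distance(a: Marker, b: Marker) -> int:
    """D(a, b): minimum distance from a to any rotation of b."""
    return min(hamming(a, r) for r in rotations(b))


def step_objective(candidate: Marker, previous: Sequence[Marker]) -> int:
    """Value of Eq. 15 for one candidate against the markers generated so far."""
    values = [self_distance(candidate)]
    values += [marker_distance(candidate, p) for p in previous]
    return min(values)


if __name__ == "__main__":
    corner = ((1, 0, 0), (0, 0, 0), (0, 0, 0))   # hypothetical 3 x 3 markers
    top_row = ((1, 1, 1), (0, 0, 0), (0, 0, 0))
    print(step_objective(top_row, [corner]))
```

In the MILP models the candidate bits are decision variables rather than fixed values, so the solver searches over all candidates instead of scoring one at a time.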
Once again, our goal is equivalent to the maximization of the minimum of a set of Hamming distances, some of them related to the self-distance of the new marker and the rest related to its distance to the previous markers. The self-distances can be expressed using the same transformation described for the previous model (Eq. 12). However, this transformation is not required to calculate the distance between two markers, since only the bits of one of them are variables. As a consequence, the transformation in Eq. 12 can be reformulated as:

$$
m^i_k \otimes m^j_k =
\begin{cases}
m^i_k, & \text{if } m^j_k = 0 \\
1 - m^i_k, & \text{otherwise}
\end{cases},
\tag{16}
$$

where $m^i_k$ is an unknown bit represented by a decision variable and $m^j_k$ is a bit from a marker in $D_{t-1}$.

The proposed model is similar to the previous one, including the $\tau^*$ variable which represents the $\tau(D)$ value. However, in this case the number of Hamming distances involved is smaller and so is the total number of constraints. The proposed MILP formulation is as follows:

$$
\begin{aligned}
& \text{maximize} \quad \tau^* \\
& \text{subject to} \\
& \text{(I)} \quad H(m^t, m^k) - \tau^* \ge 0 \qquad \forall m^k \in A'(m^t) \\
& \text{(II)} \quad H(m^t, m^k) - \tau^* \ge 0 \qquad \forall m^k \in A(m^i),\ \forall m^i \in D_{t-1} \\
& \text{(III)} \quad 0 \le \tau^* \le \tau(D_{t-1}) \\
& \text{(IV)} \quad I(m^t) - I(m^k) \ge 0 \qquad \forall m^k \in A'(m^t).
\end{aligned}
\tag{17}
$$

Similarly to the optimal model, constraint (I) guarantees that the self-distance of the newly generated marker, $m^t$, is greater than or equal to $\tau^*$. Constraint (II) guarantees the same condition for the marker distance (Eq. 4) between $m^t$ and any previously generated marker.

The upper bound of $\tau^*$ in constraint (III) refers to the fact that the maximum cost of a solution is lower than or equal to the maximum cost of the previous dictionary. This assumption is only valid if the previous markers have been created using the same suboptimal iterative method. For the first generated marker, the value $\tau^n_{\max}$ (Sec. 3.1) can be employed for constraint (III). Finally, constraint (IV) is the same as in the optimal model, i.e., it selects the marker with the highest encoded value from its analogous set.

The proposed method works iteratively, i.e., a new MILP model is generated and solved to obtain a new marker at each iteration. This process repeats until the desired number of markers, $d$, is generated. It must be noted that, at each iteration, the optimal solution may not be unique, and that the selection of a solution will condition the subsequent iterations. As a result, the marker dictionary obtained by this method is neither optimal nor unique, contrary to the previous model. However, the processing time spent by a MILP solver is much shorter compared to the optimal formulation and the results, as shown in Sec. 5, are still remarkable. The whole process is summarized in Algorithm 1.

Algorithm 1 Suboptimal dictionary generation.
    D_0 ← ∅                              # Empty dictionary
    for t from 1 to d do
        Generate MILP model for D_t      # See model in Eq. 17
        m^t ← Solve MILP model           # Get optimal marker
        D_t ← D_{t−1} ∪ {m^t}            # Add to previous markers
    end for
    Return last dictionary, D_d

Additionally, in order to ensure convergence within a restricted amount of time, a time limit can be set for each MILP model. Thus, if the optimization is not finished when the limit is reached, the best feasible solution obtained at that moment is selected. This is a common strategy in Mixed Integer Programming to guarantee convergence when suboptimal solutions are allowed.

5. Experiments and results

This section shows the results of the experimentation carried out to validate our proposals. First, the optimal formulation is studied. Then, the suboptimal model is analyzed in terms of inter-marker distances and generation times. The obtained results have been compared to those produced by the best alternatives in the literature: ArUco [12], AprilTags [13], ARTag [10] and ARToolKitPlus [11].

The Gurobi optimizer v5.6 [40] has been chosen to solve the MILP models since it is the solver achieving the fastest convergence times for our models. The default Gurobi configuration has been shown to be adequate for our proposals, since most of the parameters are configured automatically based on the model characteristics. All tests were performed using the twelve cores of a system equipped with an Intel Core i7-3930K 3.20 GHz processor, 16 GB of RAM and Ubuntu 14.04 as operating system, with a load average of 0.1.

The dictionaries generated by our proposals have been made publicly available as part of the ArUco library [12].

5.1. Optimal formulation

The generation of a marker dictionary is an off-line process which is typically performed only once. As a consequence, the generation time is not a critical aspect of the process. However, the main problem of the proposed optimal model is that its generation times are too long for a high number of markers and bits. Note that the search space for the optimal model is $2^{n \times n \times d}$ (where $d$ is the number of generated markers), which indicates an exponential growth. From our experimentation, it has been observed that for markers of sizes bigger than 5 × 5 bits, the convergence times are too long to study the results; thus, our experimentation has been restricted to smaller sizes. Even for marker sizes of 3 × 3 and 4 × 4 bits, we have been limited to dictionaries of 37 and 8 markers respectively.

Figure 5 shows the generation times for the optimal model. For each dictionary size, the $\tau(D)$ value obtained
in the previous generation was used as an upper bound of the objective function, as explained in the model description in Sec. 4.2.

Figure 5: Generation times (in days) for the optimal formulation proposal as a function of the dictionary size $d$, for 3 × 3 and 4 × 4 bits. As can be observed, the generation times increase considerably with the dictionary size and the number of bits. A formal study for bigger marker sizes or number of markers is not feasible.

It can be observed that the convergence times are indeed considerably long. For instance, the generation of an optimal dictionary of 37 markers and 3 × 3 bits lasted 6 days, and the generation of a dictionary of only 8 markers and 4 × 4 bits lasted more than 7 days. Due to this time limitation, we have not been able to study the optimal dictionaries for a higher number of bits or dictionary sizes, nor to compare the optimal results with the rest of the methods. Nevertheless, it must be noted that the formulation is suitable for those applications where the required number of markers and marker size are not too high, keeping in mind that dictionary generation is an off-line process which only needs to be performed once.

Furthermore, these long times justify the suboptimal model proposal, which converges notably faster, allowing the generation of bigger dictionaries, both in number of markers and bits.

5.2. Suboptimal formulation

5.2.1. Analysis of dictionary distances

This section compares the distances obtained by the suboptimal formulation with those obtained by the ArUco, AprilTags, ARTag and ARToolKitPlus methods. To that end, dictionaries with up to 250 markers have been generated, covering in our opinion the requirements of most fiducial marker applications. The marker sizes have been selected from a range of sizes from 4 × 4 to 25 × 25 bits.

A time restriction of 150 seconds has been set to solve each MILP model in order to ensure the convergence of the optimization process within restricted time. Once the limit is reached, the best feasible solution at that moment is selected.

The ArUco method is an iterative process which employs an objective distance value. This value is decremented after a number of unproductive iterations. To compare the results under the same conditions, the ArUco method was also configured to decrement this value after 150 seconds of unproductive iterations.

In the AprilTags method, the objective distance has to be specified by the user and the method does not propose any specific condition to reduce this value. Thus, we have employed the same condition as in the ArUco case, i.e. reducing the objective distance after 150 seconds of unproductive iterations.

The marker dictionaries of ARTag and ARToolKitPlus are fixed and their markers are composed of 6 × 6 bits. Thus, they cannot be compared for different marker sizes. In the ARTag case, different dictionary sizes are obtained by taking the specific subset of markers in the order recommended by the authors. On the other hand, ARToolKitPlus does not provide a recommended order and hence its dictionary size is also fixed.

Figure 6 shows the mean $\tau(D)$ value for 30 executions as a function of the dictionary size and for different marker sizes. The results of the different executions only present deviations in the intervals where the objective distance is reduced, which correspond to the slopes of the curves. In the flat regions, there is no deviation and the same result is achieved for all the executions.

As can be observed, the proposed suboptimal method outperforms the results of all the other proposals. For the smallest size, 4 × 4 bits, there are no remarkable differences since the search space is smaller. However, the improvements increase with the marker size. For marker sizes of 6 × 6 bits and bigger, the suboptimal method achieves results which clearly surpass the other methods. For instance, for a dictionary composed of 22 markers of 10 × 10 bits, the suboptimal model achieves a $\tau(D)$ value of 47 while the second best alternative, the ArUco method, achieves a value of 42. This implies that the suboptimal dictionary can correct up to 23 erroneous bits whereas the ArUco method can only correct up to 20 bits, a difference of 3 bits. For 25 × 25 markers, the difference increases to 15 bits.

It is also remarkable how the results of the AprilTags method notably degrade as the marker size increases. This indicates that the employed strategy, which is based on an ascending order search, is not suitable for large search spaces.

The ARToolKitPlus library proposes two different dictionaries, known as ARToolKitPlus Simple and ARToolKitPlus BCH. However, both of them are fixed and, contrary to ARTag, they do not provide a recommended order and, consequently, the inter-marker distance cannot be analyzed as a function of the dictionary size. Instead, we have compared against the ARToolKitPlus dictionaries by generating dictionaries with the same characteristics in terms of dictionary size and number of bits. The ARToolKitPlus Simple dictionary is composed of 512 markers of 6 × 6 bits and achieves a $\tau(D)$ value of 4, while the ARToolKitPlus BCH dictionary is composed of 4096 markers of 6 × 6 bits, achieving a $\tau(D)$
Figure 6: Minimum inter-marker distances $\tau(D)$ for the suboptimal formulation and the ArUco, AprilTags and ARTag methods, as a function of the dictionary size $d$ (from 0 to 250 markers), for different numbers of bits: (a) 4 × 4 bits, (b) 6 × 6 bits, (c) 10 × 10 bits and (d) 25 × 25 bits. The ARTag dictionary is only shown for 6 × 6 bits since its dictionary is fixed. It can be observed that the suboptimal model outperforms the results of the other methods.

Dictionary             Size           Marker size   τ(D) Original   τ(D) Suboptimal   τ(D) ArUco   τ(D) AprilTags
ARToolKitPlus Simple   512 markers    6 × 6 bits    4               11                10           10
ARToolKitPlus BCH      4096 markers   6 × 6 bits    2               9                 8            8

Table 3: Inter-marker distance comparison between the dictionaries from the ARToolKitPlus library and those generated by ArUco, AprilTags and our suboptimal proposal with the same characteristics, i.e. same number of markers and bits. The column Original indicates the inter-marker distance of the original ARToolKitPlus dictionaries. The results of the original dictionaries are significantly lower than those obtained by the rest of the methods. The results of our suboptimal proposal surpass the other approaches, allowing the correction of one more bit in comparison to ArUco and AprilTags.

value of 2.

Table 3 shows the results obtained by the suboptimal, ArUco and AprilTags methods in comparison to the ARToolKitPlus dictionaries. It can be observed that the results of the original ARToolKitPlus dictionaries are considerably lower than those of the other methods. For instance, in the case of ARToolKitPlus Simple, the original dictionary can perform an error correction of 1 bit, whereas the ArUco and AprilTags dictionaries can correct up to 4 bits due to the larger inter-marker distance of 10. However, our suboptimal proposal achieves an inter-marker distance of 11, allowing a maximum error correction of 5 bits and surpassing the rest of the methods. The same situation occurs in the ARToolKitPlus BCH case: the original dictionary cannot perform error correction, while ArUco and AprilTags can correct up to 3 bits due to an inter-marker distance of 8. Once again, our suboptimal approach surpasses the rest of the methods, achieving a maximum inter-marker distance of 9 and allowing error correction of up to 4 bits.

The experimentation shows that the proposed suboptimal model produces dictionaries with the longest inter-marker distances in the literature, increasing the error correction capabilities and only surpassed by the results of the optimal model, which is not applicable to a high number of markers and bits.
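The correction figures quoted in this section are consistent with the usual minimum-distance bound, under which a dictionary with inter-marker distance $\tau$ can correct up to $\lfloor (\tau - 1)/2 \rfloor$ erroneous bits. The short Python check below applies that bound to the $\tau(D)$ values reported above. The bound is assumed here rather than taken from this section, and the function name `correctable_bits` is only illustrative.

```python
def correctable_bits(tau: int) -> int:
    """Bit errors correctable with minimum distance tau: floor((tau - 1) / 2)."""
    if tau < 1:
        raise ValueError("tau must be at least 1")
    return (tau - 1) // 2


# tau(D) values reported in the text and in Table 3.
reported = {
    "10 x 10 bits, 22 markers, suboptimal": 47,
    "10 x 10 bits, 22 markers, ArUco": 42,
    "ARToolKitPlus Simple, original": 4,
    "ARToolKitPlus Simple size, ArUco / AprilTags": 10,
    "ARToolKitPlus Simple size, suboptimal": 11,
    "ARToolKitPlus BCH, original": 2,
    "ARToolKitPlus BCH size, ArUco / AprilTags": 8,
    "ARToolKitPlus BCH size, suboptimal": 9,
}

if __name__ == "__main__":
    for name, tau in reported.items():
        print(f"{name}: tau = {tau}, correctable bits = {correctable_bits(tau)}")
```

Under this bound, raising the distance from an even value to the next odd one (10 to 11, or 8 to 9) is what yields the extra corrected bit.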
Figure 7: Generation times (in seconds) for the suboptimal, ArUco and AprilTags methods as a function of the dictionary size $d$ (from 0 to 250 markers), for different numbers of bits: (a) 4 × 4 bits, (b) 6 × 6 bits, (c) 10 × 10 bits and (d) 25 × 25 bits. The suboptimal times are, in general, shorter than the times of the other approaches, although the difference decreases as the marker size increases.

5.2.2. Generation Time

This section analyses the times employed by the suboptimal, ArUco and AprilTags methods to generate the dictionaries shown in the previous section. As for ARTag and ARToolKitPlus, the comparison is not feasible since their dictionaries are fixed and, hence, there is no generation process.

Figure 7 shows the mean generation times for 30 executions as a function of the dictionary size for different numbers of bits. As can be observed, the generation times are notably shorter compared to the optimal case. For instance, the generation of an optimal dictionary composed of 6 markers and 4 × 4 bits needs more than 1 day, while the suboptimal model employed less than 20 seconds to generate a dictionary of 250 markers with the same marker size.

Also, the generation times of the suboptimal method are, in general, shorter than those of ArUco and AprilTags. These differences are especially relevant for smaller marker sizes. AprilTags is faster than the suboptimal method only for 25 × 25 bits. However, as shown in Sec. 5.2.1, the inter-marker distances obtained by AprilTags in this case are completely unsatisfactory in comparison to those obtained by our proposal or the ArUco method.

As for the ArUco and AprilTags results, it can be noted that there are some slopes in the plots where the times increase sharply. This is especially noticeable for small marker sizes (4 × 4 and 6 × 6 bits). These peaks correspond to those dictionary sizes where the objective distance is decreased. Since the objective distance can only be reduced by reaching the time limit, the generation time increases considerably with each reduction.

On the other hand, the suboptimal proposal reduces the objective distance by adjusting its bounds during the MILP optimization based on the linear relaxation of the problem. This means that the time limit does not need to be reached every time the objective distance is reduced, and it explains why the suboptimal times are significantly shorter than the times of the other approaches for 4 × 4 bits. However, as the marker size increases, the times of the suboptimal method start to grow and some sharp slopes appear (similar to those on the ArUco or AprilTags curves). These peaks also correspond to the dictionary sizes where the time limit is reached. In these cases, the solver takes the best feasible solution found until that moment and continues
with the next generation.

Note that for the three methods, after the upper bound of the objective function is reduced, the next generated markers are less restricted and they can be generated faster, which explains the flat lines after each slope.

For the biggest marker size, 25 × 25 bits, there is a high number of objective distance reductions so that the slopes become less distinguishable in the three cases.

d    τ(D) Optimal   τ(D) Suboptimal
1    10             10
2    8              8
3    8              8
4    8              7.37
5    8              6.2
6    8              6
7    8              6
8    8              6

Table 4: Suboptimal distances compared to the optimal distances for 4 × 4 bits and dictionary size up to 8 markers.
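As a rough illustration of the incremental procedure of Algorithm 1, the following Python sketch builds a small dictionary of 3 × 3 markers. It is not the MILP formulation: the solver is replaced by exhaustive search over all $2^{9}$ candidate bit patterns, which is only practical for very small markers, and the symmetry-breaking constraint (IV) and the time limit are omitted. The analogous set is again assumed to be the four 90-degree rotations of a marker, and all function names are hypothetical.

```python
from itertools import product
from typing import List, Tuple

Marker = Tuple[int, ...]  # n * n bits in row-major order


def rotate90(m: Marker, n: int) -> Marker:
    """Rotate a row-major n x n bit tuple by 90 degrees clockwise."""
    return tuple(m[(n - 1 - c) * n + r] for r in range(n) for c in range(n))


def rotations(m: Marker, n: int) -> List[Marker]:
    """The marker followed by its three successive rotations."""
    out = [m]
    for _ in range(3):
        out.append(rotate90(out[-1], n))
    return out


def hamming(a: Marker, b: Marker) -> int:
    return sum(x != y for x, y in zip(a, b))


def greedy_dictionary(n: int, d: int) -> Tuple[List[Marker], List[int]]:
    """Generate d markers one at a time, maximizing Eq. 15 at each step."""
    dictionary: List[Marker] = []
    seen: List[Marker] = []  # every rotation of every accepted marker
    taus: List[int] = []     # objective value reached at each iteration
    for _ in range(d):
        best_marker, best_tau = None, -1
        for cand in product((0, 1), repeat=n * n):
            # Self-distance of the candidate (rotations other than itself).
            self_dist = min(hamming(cand, r) for r in rotations(cand, n)[1:])
            # Distance to every rotation of every previous marker.
            tau = min([self_dist] + [hamming(cand, s) for s in seen])
            if tau > best_tau:  # ties are broken by enumeration order
                best_marker, best_tau = cand, tau
        dictionary.append(best_marker)
        seen.extend(rotations(best_marker, n))
        taus.append(best_tau)
    return dictionary, taus


if __name__ == "__main__":
    markers, taus = greedy_dictionary(n=3, d=4)
    for marker, tau in zip(markers, taus):
        print(marker, tau)
```

The per-step values in `taus` can never increase from one iteration to the next, because each new marker is compared against a superset of the markers used in the previous step. This is the property that constraint (III) of Eq. 17 relies on. Since ties are resolved by enumeration order, a different tie-breaking rule could yield a different dictionary, in line with the observation that the suboptimal result is not unique.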

5.2.3. Comparison to optimal dictionary

In this section, the distances obtained by the suboptimal method are compared to those obtained by the optimal model. However, as shown in Section 5.1, the experimentation carried out with the optimal model is limited to small dictionaries and numbers of bits due to its time complexity. As a consequence, the results are not enough to draw any conclusion. Table 4 summarizes the distance results for dictionaries of up to 8 markers and 4 × 4 bits. The distances of the suboptimal model correspond to the mean of 30 executions.

As can be observed, the maximum difference between the distances of the optimal and suboptimal models is 2, although, as stated, we cannot draw a conclusion due to the reduced number of results.

6. Conclusions

This paper has proposed two novel methods to obtain fiducial marker dictionaries based on the Mixed Integer Linear Programming paradigm. The first model, contrary to any of the previous methods, guarantees the optimality of the dictionary in terms of inter-marker distance for any number of bits and markers. However, the generation times are too long for many practical situations. The second method proposes an iterative formulation that, although it does not guarantee optimality, achieves better results than the state-of-the-art methods within restricted time.

As a consequence, the dictionaries generated with our proposals allow the detection and correction of a higher number of erroneous bits than previous approaches. These results lead to a direct improvement in the marker detection process.

Finally, the dictionaries generated by our proposals have been made publicly available as part of the ArUco library [12].

7. Acknowledgments

We are grateful for the financial support provided by the Science and Technology Ministry of Spain and FEDER (projects TIN2012-32952 and BROCA).

References

[1] B. Williams, M. Cummins, J. Neira, P. Newman, I. Reid, J. Tardós, A comparison of loop closing techniques in monocular SLAM, Robotics and Autonomous Systems (2009) 1188–1197.

[2] E. Royer, M. Lhuillier, M. Dhome, J.-M. Lavest, Monocular vision for mobile robot localization and autonomous navigation, International Journal of Computer Vision 74 (3) (2007) 237–260.

[3] R. T. Azuma, A survey of augmented reality, Presence 6 (1997) 355–385.

[4] H. Kato, M. Billinghurst, Marker tracking and HMD calibration for a video-based augmented reality conferencing system, in: Proceedings of the 2nd IEEE and ACM International Workshop on Augmented Reality, IWAR '99, IEEE Computer Society, Washington, DC, USA, 1999, pp. 85–94.

[5] V. Lepetit, P. Fua, Monocular model-based 3d tracking of rigid objects: A survey, in: Foundations and Trends in Computer Graphics and Vision, 2005, pp. 1–89.

[6] W. Daniel, R. Gerhard, M. Alessandro, T. Drummond, S. Dieter, Real-time detection and tracking for augmented reality on mobile phones, IEEE Transactions on Visualization and Computer Graphics 16 (3) (2010) 355–368.

[7] G. Klein, D. Murray, Parallel tracking and mapping for small AR workspaces, in: Proceedings of the 2007 6th IEEE and ACM International Symposium on Mixed and Augmented Reality, ISMAR '07, IEEE Computer Society, Washington, DC, USA, 2007, pp. 1–10.

[8] K. Mikolajczyk, C. Schmid, Indexing based on scale invariant interest points, in: ICCV, 2001, pp. 525–531.

[9] D. G. Lowe, Object recognition from local scale-invariant features, in: Proceedings of the International Conference on Computer Vision, Volume 2, ICCV '99, IEEE Computer Society, Washington, DC, USA, 1999, pp. 1150–1157.
[10] M. Fiala, Designing highly reliable fiducial markers, IEEE Trans. Pattern Anal. Mach. Intell. 32 (7) (2010) 1317–1324.

[11] D. Wagner, D. Schmalstieg, ARToolKitPlus for pose tracking on mobile devices, in: Computer Vision Winter Workshop, 2007, pp. 139–146.

[12] S. Garrido-Jurado, R. Muñoz-Salinas, F. Madrid-Cuevas, M. Marín-Jiménez, Automatic generation and detection of highly reliable fiducial markers under occlusion, Pattern Recognition 47 (6) (2014) 2280–2292.

[13] E. Olson, AprilTag: A robust and flexible visual fiducial system, in: Proceedings of the IEEE International Conference on Robotics and Automation (ICRA), IEEE, 2011, pp. 3400–3407.

[14] K. Dorfmüller, H. Wirth, Real-time hand and head tracking for virtual environments using infrared beacons, in: Proceedings CAPTECH'98, Springer, 1998, pp. 113–127.

[15] M. Ribo, A. Pinz, A. L. Fuhrmann, A new optical tracking system for virtual and augmented reality applications, in: Proceedings of the IEEE Instrumentation and Measurement Technical Conference, 2001, pp. 1932–1936.

[16] V. A. Knyaz, R. V. Sibiryakov, The development of new coded targets for automated point identification and non-contact surface measurements, in: 3D Surface Measurements, International Archives of Photogrammetry and Remote Sensing, Vol. XXXII, part 5, 1998, pp. 80–85.

[17] L. Naimark, E. Foxlin, Circular data matrix fiducial system and robust image processing for a wearable vision-inertial self-tracker, in: Proceedings of the 1st International Symposium on Mixed and Augmented Reality, ISMAR '02, IEEE Computer Society, Washington, DC, USA, 2002, pp. 27–36.

[18] J. Rekimoto, Y. Ayatsuka, CyberCode: designing augmented reality environments with visual tags, in: Proceedings of DARE 2000 on Designing augmented reality environments, DARE '00, ACM, New York, NY, USA, 2000, pp. 1–10.

[19] M. Rohs, B. Gfeller, Using camera-equipped mobile phones for interacting with real-world objects, in: Advances in Pervasive Computing, 2004, pp. 265–271.

[20] E. Ottaviani, A. Pavan, M. Bottazzi, E. Brunelli, F. Caselli, M. Guerrero, A common image processing framework for 2d barcode reading, in: Image Processing and Its Applications. Seventh International Conference on (Conf. Publ. No. 465), Vol. 2, 1999, pp. 652–655.

[21] M. Kaltenbrunner, R. Bencina, reacTIVision: a computer-vision framework for table-based tangible interaction, in: Proceedings of the 1st international conference on Tangible and embedded interaction, TEI '07, ACM, New York, NY, USA, 2007, pp. 69–74.

[22] M. Fiala, Comparing ARTag and ARToolKit Plus fiducial marker systems, in: IEEE International Workshop on Haptic Audio Visual Environments and their Applications, 2005, pp. 147–152.

[23] J. Rekimoto, Matrix: A realtime object identification and registration method for augmented reality, in: Third Asian Pacific Computer and Human Interaction, Kanagawa, Japan, IEEE Computer Society, 1998, pp. 63–69.

[24] W. Peterson, D. Brown, Cyclic codes for error detection, Proceedings of the IRE 49 (1) (1961) 228–235.

[25] S. Lin, D. Costello, Error Control Coding: Fundamentals and Applications, Prentice Hall, 1983.

[26] D. Schmalstieg, A. Fuhrmann, G. Hesina, Z. Szalavári, L. M. Encarnação, M. Gervautz, W. Purgathofer, The Studierstube augmented reality project, Presence: Teleoper. Virtual Environ. 11 (1) (2002) 33–54.

[27] D. Flohr, J. Fischer, A Lightweight ID-Based Extension for Marker Tracking Systems, in: Eurographics Symposium on Virtual Environments (EGVE) Short Paper Proceedings, 2007, pp. 59–64.

[28] A. Schrijver, Theory of Linear and Integer Programming, John Wiley & Sons, Inc., New York, NY, USA, 1986.

[29] R. Niemann, P. Marwedel, An algorithm for hardware/software partitioning using mixed integer linear programming, Design Automation for Embedded Systems 2 (2) (1997) 165–193.

[30] C.-W. Hui, Y. Natori, An industrial application using mixed-integer programming technique: A multi-period utility system model, Computers and Chemical Engineering 20, Supplement 2 (1996) 1577–1582.

[31] H. Morais, P. Kádár, P. Faria, Z. A. Vale, H. Khodr, Optimal scheduling of a renewable micro-grid in an isolated load area using mixed-integer linear programming, Renewable Energy 35 (1) (2010) 151–156.

[32] T. Gönen, Distribution-system planning using mixed-integer programming, Generation, Transmission and Distribution, IEE Proceedings C 128 (1981) 70–79.

[33] A. Richards, J. P. How, Aircraft trajectory planning with collision avoidance using mixed integer linear programming, American Control Conference (ACC) 3 (2002) 1936–1941.
[34] J. A. Nelder, R. Mead, A simplex method for function minimization, The Computer Journal 7 (4) (1965) 308–313.

[35] A. Schrijver, Theory of Linear and Integer Programming, John Wiley & Sons, Chichester, 1986.

[36] M. R. Garey, D. S. Johnson, Computers and Intractability: A Guide to the Theory of NP-Completeness, Freeman, San Francisco, 1979.

[37] M. Padberg, G. Rinaldi, A branch-and-cut algorithm for the resolution of large-scale symmetric traveling salesman problems, SIAM Review 33 (1) (1991) 60–100.

[38] E. L. Lawler, D. E. Wood, Branch-and-bound methods: A survey, Operations Research 14 (4) (1966) 699–719.

[39] H. Marchand, A. Martin, R. Weismantel, L. Wolsey, Cutting planes in integer and mixed integer programming, Discrete Applied Mathematics 123 (1) (2002) 397–446.

[40] Gurobi Optimization, Inc., Gurobi optimizer reference manual, http://www.gurobi.com (2014).
