Generation of Fiducial Marker Dictionaries Using Mixed Integer Linear Programming (2015)
4 authors, including:
Rafael Medina-Carnicer
University of Cordoba (Spain)
All content following this page was uploaded by Rafael Muñoz-Salinas on 11 October 2017.
Abstract
Square-based fiducial markers are one of the most popular approaches for camera pose estimation due to their fast detection
and robustness. To maximize their error correction capabilities, their inner binary codification must have
a large inter-marker distance. This paper proposes two Mixed Integer Linear Programming (MILP) approaches to
generate configurable square-based fiducial marker dictionaries that maximize the inter-marker distance. The first approach
guarantees the optimal solution; however, it can only be applied to relatively small dictionaries and numbers of bits, since
the computing times are too long for many situations. The second approach is an alternative formulation that obtains
suboptimal dictionaries within restricted time, achieving results that still significantly surpass the current state-of-the-art
methods.
Keywords: fiducial markers, MILP, mixed integer linear programming, augmented reality, computer vision.
Figure 1: Example of inter-marker confusion error and how it is avoided with large inter-marker distances. (a) Original marker images.
The marker distances are D(m0, m1) = 1, D(m0, m2) = 5 and D(m1, m2) = 6. (b) Real image with the markers placed in the environment.
(c) Marker images after removing the perspective. (d) Inner bits extracted from the images in (c). Erroneous bits are highlighted in red. (e) Final
identifiers assigned to each marker. It can be observed that m1 has been confused with m0. On the other hand, m2, despite a higher number
of errors, can be correctly identified since its distance to the remaining markers is larger. (Best seen in color).
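The identification step that Figure 1 illustrates can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the detection pipeline of any particular library: the function names, the toy dictionary and the acceptance rule (accept the nearest marker only if at most ⌊(τ − 1)/2⌋ bits differ) are introduced here for the example. The toy markers were chosen so that the distances are easy to count by hand; their self-distances are poor, so they would not be used as real markers.

```python
import numpy as np

def hamming(a, b):
    """Number of differing bits between two equally sized bit matrices."""
    return int(np.count_nonzero(a != b))

def marker_distance(a, b):
    """Rotation-aware distance: minimum Hamming distance to the four rotations of b."""
    return min(hamming(a, np.rot90(b, k)) for k in range(4))

def inter_marker_distance(dictionary):
    """Minimum rotation-aware distance over all pairs of distinct markers."""
    return min(marker_distance(a, b)
               for i, a in enumerate(dictionary)
               for b in dictionary[:i])

def identify(bits, dictionary, tau):
    """Return the index of the nearest marker, or None if too many bits differ."""
    max_errors = (tau - 1) // 2          # assumed correction radius
    dists = [marker_distance(bits, m) for m in dictionary]
    best = int(np.argmin(dists))
    return best if dists[best] <= max_errors else None

# Toy 4 x 4 dictionary (hypothetical): all zeros, all ones, top half set.
half = np.zeros((4, 4), dtype=np.uint8)
half[:2, :] = 1
toy_dictionary = [np.zeros((4, 4), dtype=np.uint8),
                  np.ones((4, 4), dtype=np.uint8),
                  half]
tau = inter_marker_distance(toy_dictionary)

# Simulate a read of the third marker with three erroneous bits.
noisy = half.copy()
for r, c in [(0, 0), (1, 2), (3, 3)]:
    noisy[r, c] ^= 1
print(tau, identify(noisy, toy_dictionary, tau))
```

With a larger inter-marker distance the acceptance radius grows, which is the effect the figure shows for m2.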
of markers and bits. To our knowledge, it is the first approach in the literature that assures optimal results in terms of inter-marker distance. However, since the convergence time of this model is too long for many applications, we also propose an alternative MILP formulation that converges faster and, although it does not guarantee optimality, produces results that significantly surpass the current state of the art. The two approaches presented in this paper constitute relevant improvements to the error correction capabilities of fiducial marker systems.

The rest of the paper is structured as follows. Section 2 reviews related work. Section 3 presents a mathematical formalization of the problem. Section 4 details the proposed MILP models. Finally, Section 5 presents the experimentation carried out, and Section 6 draws some conclusions.

2. Related work

Fiducial markers are synthetic elements placed in the working area either to facilitate the camera pose estimation task or for labeling purposes. They are specially designed to be easily detected even at low resolutions, and most applications require not one but many different markers (a dictionary). Thus, the ability to identify them uniquely is an important feature. Several fiducial marker systems have been proposed in the literature, as shown in Figure 2.

The simplest proposals are those based on fiducial points. These markers constitute a single scene point and are usually based on LEDs, retroreflective spheres or planar dots [14, 15]. Their identification is typically based on the relative positions of different points, which can be a limiting and complex process.

A direct evolution of the point markers are the circular markers [16, 17] (Fig. 2a). These markers are similar to the previous ones except that they include information (as circular sectors or concentric rings) to facilitate the identification process. Their main drawback is that they provide only one correspondence point per marker.

Other types of fiducial markers are based on blob detection. For instance, Cybercode [18] and VisualCode [19] (Fig. 2b,c) are based on the same technology as QR and Maxicode [20] codes, but provide several correspondence points. Other popular fiducial markers based on blob detection are the ReacTIVision amoeba markers [21], which are designed using genetic algorithms (Fig. 2d).

One of the most popular families of fiducial markers is the square-based one. These markers contain a black border to ease their detection and employ their inner region for identification purposes. Their main benefit is that each marker provides four prominent points (i.e. its four corners) which can be easily detected and employed as correspondence points, thus allowing camera pose estimation using a single marker.

In this category, one of the most popular systems is ARToolKit [4], an open source project which has been extensively used in the last decade, especially in the academic community. ARToolKit markers include a pattern in their inner region for identification which can be customized by the user (Fig. 2e). Despite its popularity, it presents some drawbacks. Firstly, it uses a template matching strategy to identify the markers, which produces a high false positive rate [22]. Secondly, the square detection is based on global thresholding, which makes it highly sensitive to the lighting conditions.

Instead of using template matching, the majority of square-based marker systems employ a binary codification in their inner region [23, 10, 11]. Most approaches use a
codification based on classic methods of signal coding, such as CRC codes [24], achieving a more robust identification and facilitating the error detection and correction processes.

Figure 2: Examples of fiducial markers proposed in previous works. (a) Intersense, (b) CyberCode, (c) VisualCode, (d) ReacTIVision, (e) ARToolKit, (f) Matrix, (g) ARTag, (h) ARToolKit Plus, (i) AprilTags, (j) ArUco.

Matrix [23] is one of the first and simplest proposals which uses a binary code with redundant bits for error detection (Fig. 2f). ARTag [10] (Fig. 2g) is based on the same idea but employs a more robust codification. Furthermore, it improves the square detection using an edge-based method instead of the global thresholding of ARToolKit. ARTag provides its marker dictionary in a specific order so that the inter-marker distance is maximized. Its main drawbacks are that the marker size is fixed to 6 × 6 bits and that the error correction process can correct up to one bit, independently of the inter-marker distance of the marker subset in use.

ARToolKit Plus [11] (Fig. 2h) improves some of the features of its predecessor ARToolKit. Firstly, it employs a dynamic method to update the global threshold depending on the pixel values in the previously detected markers. Secondly, as in ARTag, it provides a binary codification for marker identification. The first ARToolKit Plus version includes 512 markers whose codification is based on repeating a 9-bit identifier four times, achieving a minimum distance of four bits between any pair of markers. The last known version instead proposes a dictionary of 4096 markers with 6 × 6 bits based on a BCH codification [25] that presents a minimum distance of two bits, so that error correction is not possible. The ARToolKitPlus project was halted and followed by the Studierstube Tracker [26] project, which is not publicly available.

The main problem of codifications based on classic coding techniques is that they need to deal with the different marker rotations, which negatively affect the inter-marker distances of the generated dictionaries. Some approaches employ special anchor points, such as QR and Maxicode [20], to remove the rotation ambiguity. However, this complicates the detection process and, more importantly, these anchor points are vulnerable to errors since they are not protected by any coding system.

Most recent approaches rely on heuristics to select a set of markers with large inter-marker distances, considering rotation. However, since the search space is very large even for small dictionaries with a low number of bits, an exhaustive search is infeasible and optimality cannot be guaranteed.

One of the first and simplest proposals is the BinARyID system [27], whose dictionary generation process is based on selecting those markers that have a minimum Hamming distance of one to any of the previously selected markers, so that rotation ambiguities are avoided. The markers are analyzed in ascending order until the desired number of markers is achieved. Its main problem is that it does not allow error detection and correction (since the distance is one) and the generation times can be prohibitive for large dictionaries.

The generation method proposed by the AprilTags library [13] (Fig. 2i) employs an approach similar to that of BinARyID, but with some significant improvements. Firstly, the minimum Hamming distance can be provided by the user, generating dictionaries with larger inter-marker distances and, hence, allowing error detection and correction. Secondly, instead of analyzing markers one by one in ascending order, larger increments are performed based on a heuristic approach, so that markers with larger inter-marker distances are found faster. Finally, selected markers also need to reach a minimum geometric complexity to increase the number of bit transitions. Its main downside is that the generation time is still very long, especially for large marker sizes.

In [12], the ArUco coding system is presented (Fig. 2j). Its generation is based on maximizing both the inter-marker distance and the number of bit transitions. Contrary to AprilTags, the minimum inter-marker distance does not need to be provided by the user; instead, it is automatically derived during the generation process. However, it has two main downsides. Firstly, it uses a time-consuming stochastic search strategy. Secondly, as the marker size increases, its memory requirements grow exponentially, limiting the maximum size to which it can be applied. Like ARTag, ArUco sorts the generated markers in a list so as to maximize the inter-marker distance.

This paper proposes two novel approaches to generate square-based fiducial marker dictionaries based on Mixed Integer Linear Programming (MILP) [28]. MILP methods can achieve the optimal results of a mathematical model represented by linear relationships and where some unknowns are constrained to be integers. MILP problems receive special attention from the community since they fit many real-life situations, such as embedded system design [29], industrial processes [30], automatic scheduling [31], distribution systems [32] or trajectory planning [33]. However, this kind of problem is known to be NP-hard and, as a consequence, there have been many efforts to develop techniques that speed up the convergence process.
In contrast to previous works, our first proposed method achieves the optimal dictionary in terms of inter-marker dis-

[Figure: a marker m and its rotations R1(m), R2(m), R3(m).]
where A'(m^i) is the analogous set of m^i without considering the marker itself:

In the end, the objective function τ(D) in Eq. 1 is the minimum distance among the marker self-distances and the distances between any pair of markers in the dictionary:

                                                            Hamming distances
Group   Quartets                                            90 deg   180 deg   270 deg
Q1      0000, 1111                                          0        0         0
Q2      1000, 0100, 0010, 0001, 1110, 0111, 1011, 1101      2        2         2
Q3      1100, 0110, 0011, 1001                              2        4         2
Q4      0101, 1010                                          4        0         4
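The quartet groups in the table above can be reproduced with a short enumeration. The sketch below assumes that a 90-degree rotation of the marker acts on the four bits of a quartet as a cyclic shift by one position, which is consistent with the table listing 1000, 0100, 0010 and 0001 in the same group; the function and variable names are introduced here for illustration only.

```python
from itertools import product

def rotation_distances(quartet):
    """Hamming distances between a quartet and its 90, 180 and 270 degree rotations.

    A rotation by r quarter turns is modelled as a cyclic shift of the
    4-character bit string by r positions (an assumption of this sketch).
    """
    return tuple(
        sum(a != b for a, b in zip(quartet, quartet[-r:] + quartet[:-r]))
        for r in (1, 2, 3)
    )

# Group the 16 possible quartets by their (90, 180, 270) distance profile.
groups = {}
for bits in product("01", repeat=4):
    quartet = "".join(bits)
    groups.setdefault(rotation_distances(quartet), []).append(quartet)

for profile, members in sorted(groups.items()):
    print(profile, members)
```

Each distinct distance profile corresponds to one row of the table, so the enumeration yields four groups covering all sixteen quartets.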
group, so that the minimum marker distance of the three rotations is maximized. If this is understood as a multiobjective problem, the Pareto front is composed of the quartet groups Q3 and Q4, since they dominate all the other solutions. Thus, the problem is simplified to assigning each quartet to one of the two quartet groups Q3 and Q4. Then, it can be easily deduced that the maximum value τ^n_max is obtained by assigning the groups {Q3, Q3, Q4} (in this order) repeatedly until completing all the quartets of a marker. For instance, for a 4 × 4 marker (C = 4), τ^4_max is obtained by assigning the groups {Q3, Q3, Q4, Q3}, as shown in Table 2.

In general, the maximum inter-marker distance is calculated as:

    \tau^n_{max} = 2 \left\lfloor \frac{4C}{3} \right\rfloor.    (8)

4. Proposed solutions

This section presents our proposals to generate marker dictionaries using MILP. First, a short introduction to MILP is given. Then, our first model, which obtains optimal solutions, is presented. Finally, our second model, which obtains suboptimal solutions within restricted time, is presented.

4.1. Mixed Integer Linear Programming (MILP)

An integer linear programming (ILP) problem is a mathematical optimization or feasibility program in which some or all of the decision variables are restricted to be integers and the objective function and the constraints are linear. The canonical form of an integer linear program is:

    maximize    c^T x
    subject to  A x \le b
                x \ge 0
                x \in \mathbb{Z},    (9)

where x is the vector of decision variables, c is the coefficient vector of the objective function, A is the coefficient matrix of the constraints and b is the vector of constant terms. The last constraint forces the decision variables to be integers, although in practice this constraint can be applied to some or all of the variables. If all variables are constrained to be integers, the problem is known as a Pure Integer Linear Programming (PILP) problem; otherwise it is known as a Mixed Integer Linear Programming (MILP) problem.

Whereas Linear Programming problems belong to complexity class P [34], which means they are efficiently solvable, ILP and MILP problems are known to be NP-hard due to the integer restriction and, thus, no polynomial-time algorithm is known for them [35]. In fact, the particular case where the decision variables are binary is one of the problems in the well-known list of Karp's 21 NP-complete problems [36]. This also implies that the computational complexity cannot be determined, neither analytically nor experimentally.

However, this kind of problem has been extensively studied in the literature and many techniques have been proposed in order to obtain the optimal solution efficiently. The main techniques are based on the Branch and Cut method [37], which is an iterative process that combines the Branch and Bound [38] algorithm and the use of cutting planes [39].

Branch and Bound is an optimization algorithm based on a search tree which is explored by partitioning the search space on each node. It comprises the branching step and the bounding step.

During the branching step, a node is split into several branches by dividing the possible values of a specific variable, so that the union of all the branches covers all the possibilities. In the MILP procedure, this is performed by splitting the different values that a decision variable can take.

The bounding step determines the lower and upper bounds of the optimization function in a particular branch. If these bounds cannot surpass the current best solution, the branch is pruned, reducing the search space. In the MILP case, non-integral solutions to LP relaxations, i.e. the problem without considering the integer constraints, serve as upper bounds and integral solutions serve as lower bounds.

Finally, cutting planes can be applied during the optimization process to further reduce the search space. Cutting planes generate new restrictions for the model that are satisfied by any feasible solution, i.e. any integer solution, but violated by the current solution of the LP relaxation, so that the next non-integer optimal solution should be closer to the integer one.

The Branch and Cut algorithm explores the tree until it finds the optimal solution; nevertheless, many feasible non-optimal solutions can also be found during the process.

4.2. Optimal Dictionary

In order to obtain dictionaries with optimal inter-marker distances, we propose a MILP model to obtain the maximum value of the cost function τ(D). This model is processed by a MILP solver so that the Branch and Cut algorithm is applied to reduce the search space and speed up the convergence to the optimum.

The decision variables of the proposed model are the bits m^i_j of the dictionary markers, i.e., one binary variable per bit. In order to formulate the MILP problem, let us rewrite the objective function in Eq. 7 as:

    \tau(D) = \min \left\{ \min_{m^i \in D,\; m^k \in A'(m^i)} \{ H(m^i, m^k) \},\; \min_{m^i, m^j \in D,\; m^i \neq m^j,\; m^k \in A(m^j)} \{ H(m^i, m^k) \} \right\}.    (10)

Thus, our goal can be enunciated as maximizing the minimum of a set of Hamming distances, some of which are self-distances and the others are distances between pairs of
markers. The Hamming distance between two markers can be expressed as:

    H(m^i, m^j) = \sum_{k=1}^{n \times n} m^i_k \otimes m^j_k,    (11)

where ⊗ is the exclusive-or operator. Since this is a non-linear operation, it must be reformulated as a linear one in order to be represented in a MILP model. This is accomplished by introducing, for each exclusive-or operation, a new auxiliary binary decision variable, δ, and the following set of constraints:

    m^i_k \otimes m^j_k = m^i_k + m^j_k - 2\delta
    \delta \le m^i_k
    \delta \le m^j_k
    \delta \ge m^i_k + m^j_k - 1.    (12)

Finally, since the objective function is the minimum of a set of values, a new auxiliary decision variable, τ*, is added to represent the minimum of all the Hamming distances (Eq. 10). The proposed problem formulation is then defined as:

    maximize \tau^*
    subject to, \forall m^i \in D,
    (I)    H(m^i, m^k) - \tau^* \ge 0      \forall m^k \in A(m^j), \forall m^j \in D, j \neq i
    (II)   H(m^i, m^k) - \tau^* \ge 0      \forall m^k \in A'(m^i)
    (III)  0 \le \tau^* \le \tau^n_{max}
    (IV)   I(m^i) - I(m^k) \ge 0           \forall m^k \in A'(m^i)
    (V)    I(m^i) - I(m^{i+1}) \ge 0       m^{i+1} \in D
    (VI)   \sum_{m^j \in D} \sum_{k=1}^{n \times n} m^j_k \ge \frac{d n^2}{2},    (13)

so that the minimum distance τ* is maximized, ensuring that every Hamming distance is larger than or equal to this value. Note that after the optimization process, the variable τ* will contain the value of τ(D).

Constraint (I) guarantees that the distance between any pair of markers in the dictionary is greater than or equal to τ*, while constraint (II) guarantees the same condition for all the self-distances. Note that the previous model is a simplified version, since each Hamming distance is represented by a sum of exclusive-or operations (which include the decision variables associated with the marker bits, see Eq. 11). Furthermore, each exclusive-or operation requires the addition of the inequalities in Eq. 12 and the auxiliary variables δ. We have decided not to represent all this auxiliary information in Eq. 13 for the sake of clarity.

Constraint (III) defines an upper bound for the value of τ*. This bound is the maximum inter-marker distance τ^n_max, which is theoretically obtained in Sec. 3.1. However, if an optimal dictionary with fewer markers has been previously generated, its optimal objective value can be employed as the upper bound, since it is not possible to surpass the optimal solution of a dictionary with fewer markers. This modification accelerates the convergence process.

Finally, in order to reduce the search space, constraints (IV), (V) and (VI) are added to remove symmetric solutions.

Note that for an optimal solution of the model, another optimal solution can be obtained by just rotating any of the markers (i.e., selecting another marker from its analogous set A(m^i)). To avoid this, only the markers with the highest encoded value in the analogous sets are considered. This is achieved by constraint (IV), where I(m^i) represents the number encoded by the marker bits:

    I(m^i) = \sum_{k=1}^{n \times n} 2^{k-1} m^i_k.    (14)

Another symmetry arises from the fact that a dictionary is a vector of markers; thus, permutations of its elements lead to equivalent solutions. Constraint (V) is added to avoid this symmetry by forcing a strict ascending order of the markers.

Finally, an optimal solution can be converted into another one by just inverting all its bits. To avoid this symmetry, constraint (VI) forces the total number of ones to be higher than or equal to the total number of zeros. Thus, only one of the two opposite solutions is valid. The parameter d is the cardinality of the dictionary.

4.3. Suboptimal Dictionary

The previous model achieves the optimal inter-marker distance results. However, as shown in Sec. 5, the convergence times are too long despite the efforts made to reduce the search space. As a consequence, it can only be applied to generate dictionaries with a relatively small number of markers and bits. In this section, an alternative formulation that obtains suboptimal results in much less time is proposed.

In this case, instead of generating all the markers in a single MILP optimization step, an iterative method, where markers are generated incrementally, is proposed. At each iteration, t, a new MILP model is defined to generate the next marker m^t, which maximizes its distance to all the previously generated markers and its self-distance. As in the previous case, the decision variables of this model are the marker bits. Then, the objective function at the t-th iteration is defined as:

    \tau(D_t) = \min \left\{ S(m^t),\; \min_{m^i \in D_{t-1}} \{ D(m^t, m^i) \} \right\}
              = \min \left\{ \min_{m^k \in A'(m^t)} H(m^t, m^k),\; \min_{m^i \in D_{t-1},\; m^k \in A(m^i)} \{ H(m^t, m^k) \} \right\}.    (15)
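The incremental scheme and the objective in Eq. 15 can be illustrated on very small markers with exhaustive search standing in for the MILP solver. The sketch below is an assumption-laden illustration rather than the proposed model: it enumerates all 2^(n×n) candidates, which is only practical for 3 × 3 bits, it breaks ties by enumeration order, and it omits the symmetry-breaking constraint on the encoded value. All function names are hypothetical.

```python
import numpy as np
from itertools import product

def hamming(a, b):
    """Number of differing bits between two bit matrices."""
    return int(np.count_nonzero(a != b))

def self_distance(m):
    """S(m): minimum Hamming distance between m and its 90/180/270 degree rotations."""
    return min(hamming(m, np.rot90(m, k)) for k in (1, 2, 3))

def marker_distance(a, b):
    """D(a, b): minimum Hamming distance between a and the four rotations of b."""
    return min(hamming(a, np.rot90(b, k)) for k in range(4))

def objective(candidate, previous):
    """Value of Eq. 15 for a candidate marker given the markers generated so far."""
    values = [self_distance(candidate)]
    values += [marker_distance(candidate, p) for p in previous]
    return min(values)

def generate(n, d):
    """Greedy generation: at each step keep the candidate with the best objective."""
    candidates = [np.array(bits, dtype=np.uint8).reshape(n, n)
                  for bits in product((0, 1), repeat=n * n)]
    dictionary, taus = [], []
    for _ in range(d):
        best = max(candidates, key=lambda c: objective(c, dictionary))
        taus.append(objective(best, dictionary))
        dictionary.append(best)
    return dictionary, taus

dictionary, taus = generate(n=3, d=4)
print(taus)  # objective value reached at each iteration
```

Because every new marker is scored against a growing set of previous markers, the sequence of objective values cannot increase from one iteration to the next, and the last value equals the τ(D) of the resulting dictionary.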
Once again, our goal is equivalent to the maximization of the minimum of a set of Hamming distances, some of them related to the self-distance of the new marker and the rest related to its distance to the previous markers. The self-distances can be expressed using the same transformation described for the previous model (Eq. 12). However, this transformation is not required to calculate the distance between two markers, since only the bits of one of them are variables. As a consequence, the transformation in Eq. 12 can be reformulated as:

    m^i_k \otimes m^j_k = \begin{cases} m^i_k, & \text{if } m^j_k = 0 \\ 1 - m^i_k, & \text{otherwise} \end{cases},    (16)

where m^i_k is an unknown bit represented by a decision variable and m^j_k is a bit from a marker in D_{t−1}.

The proposed model is similar to the previous one, including the τ* variable which represents the τ(D) value. However, in this case the number of involved Hamming distances is smaller and so is the total number of constraints. The proposed MILP formulation is as follows:

    maximize \tau^*
    subject to
    (I)    H(m^t, m^k) - \tau^* \ge 0      \forall m^k \in A'(m^t)
    (II)   H(m^t, m^k) - \tau^* \ge 0      \forall m^k \in A(m^i), \forall m^i \in D_{t-1}
    (III)  0 \le \tau^* \le \tau(D_{t-1})
    (IV)   I(m^t) - I(m^k) \ge 0           \forall m^k \in A'(m^t).    (17)

Similarly to the optimal model, constraint (I) guarantees that the self-distance of the newly generated marker, m^t, is greater than or equal to τ*. Constraint (II) guarantees the same condition for the marker distance (Eq. 4) between m^t and any previously generated marker.

The upper bound of τ* in constraint (III) reflects the fact that the maximum cost of a solution is lower than or equal to the maximum cost of the previous dictionary. This assumption is only valid if the previous markers have been created using the same suboptimal iterative method. For the first generated marker, the value τ^n_max (Sec. 3.1) can be employed for constraint (III). Finally, constraint (IV) is the same as in the optimal model, i.e., selecting the marker with the highest encoded value from its analogous set.

The proposed method works iteratively, i.e., a new MILP model is generated and solved to obtain a new marker at each iteration. This process repeats until the desired number of markers, d, is generated. It must be noted that, at each iteration, the optimal solution may not be unique, and that the selection of a solution will condition the subsequent iterations. As a result, the marker dictionary obtained by this method is not guaranteed to be optimal, nor is it unique, contrary to the previous model. However, the processing time spent by a MILP solver is much shorter compared to the optimal formulation and the results, as shown in Sec. 5, are still remarkable. The whole process is summarized in Algorithm 1.

Algorithm 1 Suboptimal dictionary generation.
    D_0 ← ∅                              # Empty dictionary
    for t from 1 to d do
        Generate MILP Model for D_t      # See Model in Eq. 17
        m^t ← Solve MILP Model           # Get optimal marker
        D_t ← D_{t−1} ∪ {m^t}            # Add to previous markers
    end for
    Return last dictionary, D_d

Additionally, in order to ensure convergence within a restricted amount of time, a time limit can be set for each MILP model. Thus, if the optimization is not finished when the limit is reached, the best feasible solution obtained at that moment is selected. This is a common strategy in Mixed Integer Programming to bound the solving time when suboptimal solutions are allowed.

5. Experiments and results

This section shows the results of the experimentation carried out to validate our proposals. First, the optimal formulation is studied. Then, the suboptimal model is analyzed in terms of inter-marker distances and generation times. The obtained results have been compared to those produced by the best alternatives in the literature: ArUco [12], AprilTags [13], ARTag [10] and ARToolKitPlus [11].

The Gurobi optimizer v5.6 [40] has been chosen to solve the MILP models since it is the solver achieving the fastest convergence times for our models. The default Gurobi configuration has been shown to be adequate for our proposals, since most of the parameters are configured automatically based on the model characteristics. All tests were performed using the twelve cores of a system equipped with an Intel Core i7-3930K 3.20 GHz processor, 16 GB of RAM and Ubuntu 14.04 as operating system, with a load average of 0.1.

The dictionaries generated by our proposals have been made publicly available as a part of the ArUco library [12].

5.1. Optimal formulation

The generation of a marker dictionary is an off-line process which is typically performed only once. As a consequence, the generation time is not a critical aspect of the process. However, the main problem of the proposed optimal model is that its generation times are too long for high numbers of markers and bits. Note that the size of the search space for the optimal model is 2^{n×n×d} (d being the number of generated markers), which indicates exponential growth.

From our experimentation, it has been observed that for markers of sizes bigger than 5 × 5 bits, the convergence times are too long to study the results; thus, our experimentation has been restricted to smaller sizes. Even for marker sizes of 3 × 3 and 4 × 4 bits, we have been limited to dictionaries of 37 and 8 markers respectively.

Figure 5 shows the generation times for the optimal model. For each dictionary size, the τ(D) value obtained
8 ter an amount of unproductive iterations. To compare the
7
3 × 3 bits results in the same conditions, the ArUco method was also
4 × 4 bits configured to decrement this value after 150 seconds of un-
Generation time (days)
6
productive iterations.
5 In the AprilTags method, the objective distance has to be
4 specified by the user and its method does not propose any spe-
cific condition to reduce this value. Thus, we have employed
3
the same condition than in the ArUco case, i.e. reducing
2 the objective distance after 150 seconds of unproductive it-
1 erations.
0 The marker dictionaries of ARTag and ARToolKitPlus
0 5 10 15 20 25 30 35 are fixed and their markers are composed by 6×6 bits. Thus,
d they cannot been compared for different marker sizes. In the
ARTag case, different dictionary sizes are obtained by tak-
Figure 5: Generation times for the optimal formulation proposal as a
function of the dictionary size for 3 × 3 and 4 × 4 bits. As it can be ob- ing the specific subset of markers in the order recommended
served, the generation times increase considerably with the dictionary by the authors. On the other hand, ARToolKitPlus does not
size and the number of bits. A formal study for bigger marker sizes or provide a recommended order and hence, its dictionary size
number of markers is not feasible. is also fixed.
Figure 6 shows the mean τ (D) value for 30 executions as
in the previous generation was used as an upper bound of a function of the dictionary size and for different marker
the objective function, as it is explained in the model de- sizes. The results of the different executions only present
scription in Sec. 4.2. deviations in the intervals where the objective distance is
It can be observed that the convergence times are indeed reduced, which correspond to the slopes of the curves. In
considerably long. For instance, the generation of an opti- the flat regions, there is no deviation and the same result is
mal dictionary of 37 markers and 3 × 3 bits lasted 6 days, achieved for all the executions.
and the generation of a dictionary of only 8 markers and As it can be observed, the proposed suboptimal method
4 × 4 bits lasted more than 7 days. Due to this time limita- outperforms the results of all the other proposals. For the
tion, we have not been able to study the optimal dictionar- smallest size, 4 × 4 bits, there are not remarkable differences
ies for a higher number of bits or dictionary sizes, neither to since the search space is smaller. However, the improve-
compare the optimal results with the rest of methods. Nev- ments increase with the marker size. For marker sizes of
ertheless, it must be noted that the formulation is suitable 6 × 6 bits and bigger, the suboptimal method achieves re-
for those applications where the required number of markers sults which clearly surpass the other methods. For instance,
and marker size are not too high, keeping in mind that dic- for a dictionary composed by 22 markers of 10 × 10 bits,
tionary generation is an off-line process that needs to be performed only once.
Furthermore, these long times justify the proposed suboptimal model, which converges notably
faster and allows the generation of bigger dictionaries, both in number of markers and in number
of bits.

5.2. Suboptimal formulation
5.2.1. Analysis of dictionary distances
This section compares the distances obtained by the suboptimal formulation with those obtained
by the ArUco, AprilTags, ARTag and ARToolKitPlus methods. To that end, dictionaries with up to
250 markers have been generated, covering in our opinion the requirements of most fiducial marker
applications. The marker sizes range from 4 × 4 to 25 × 25 bits.
A time limit of 150 seconds has been set for solving each MILP model so that the optimization
process finishes within restricted time. Once the limit is reached, the best feasible solution
found up to that moment is selected.
The ArUco method is an iterative process which employs an objective distance value. This value
is decremented af-

the suboptimal model achieves a τ(D) value of 47 while the second best alternative, the ArUco
method, achieves a value of 42. This implies that the suboptimal dictionary can correct up to 23
erroneous bits whereas the ArUco method can only correct up to 20 bits, a difference of 3 bits.
For 25 × 25 markers, the difference increases to 15 bits.
It is also remarkable how the results of the AprilTags method notably degrade as the marker size
increases. This indicates that the employed strategy, which is based on an ascending order
search, is not suitable for large search spaces.
The ARToolKitPlus library provides two different dictionaries, known as ARToolKitPlus Simple and
ARToolKitPlus BCH. However, both of them are fixed and, contrary to ARTag, they do not provide a
recommended order. Consequently, the inter-marker distance cannot be analyzed as a function of
the dictionary size. Instead, we have compared against the ARToolKitPlus dictionaries by
generating dictionaries with the same characteristics in terms of dictionary size and number of
bits. The ARToolKitPlus Simple dictionary is composed of 512 markers of 6 × 6 bits and achieves
a τ(D) value of 4, while the ARToolKitPlus BCH dictionary is composed of 4096 markers of 6 × 6
bits achieving a τ(D)
[Plots for Figure 6: four panels of τ(D) versus dictionary size d (0 to 250); legend: Suboptimal, ArUco, AprilTags, ARTag]
Figure 6: Minimum inter-marker distances τ(D) for the suboptimal formulation and the ArUco, AprilTags and ARTag methods, as a function
of the dictionary size for different numbers of bits. The ARTag dictionary is only shown for 6 × 6 bits since its dictionary is fixed. It can be
observed that the suboptimal model outperforms the other methods.
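As a reminder of what the plotted quantity measures, the following minimal sketch computes τ(D) for a small dictionary. It assumes the usual rotation-invariant definition: the minimum over all marker self-distances (a marker against its own 90, 180 and 270 degree rotations) and all pairwise distances (one marker against the four rotations of another). The function names and the 2 × 2 toy markers are illustrative only and do not come from any of the compared libraries.

```python
from itertools import combinations

def rotate90(marker):
    """Rotate a square bit matrix (list of rows) 90 degrees clockwise."""
    return [list(row) for row in zip(*marker[::-1])]

def rotations(marker):
    """Return the marker under its four rotations (0, 90, 180, 270 degrees)."""
    out = [marker]
    for _ in range(3):
        out.append(rotate90(out[-1]))
    return out

def hamming(a, b):
    """Number of differing bits between two equally sized bit matrices."""
    return sum(x != y for ra, rb in zip(a, b) for x, y in zip(ra, rb))

def marker_distance(a, b):
    """Rotation-invariant distance between two different markers."""
    return min(hamming(a, r) for r in rotations(b))

def self_distance(marker):
    """Distance of a marker to its own non-trivial rotations."""
    return min(hamming(marker, r) for r in rotations(marker)[1:])

def tau(dictionary):
    """Minimum over all self-distances and all pairwise marker distances."""
    values = [self_distance(m) for m in dictionary]
    values += [marker_distance(a, b) for a, b in combinations(dictionary, 2)]
    return min(values)

if __name__ == "__main__":
    # Hypothetical 2 x 2 markers, far smaller than any real dictionary.
    m0 = [[1, 0], [0, 0]]
    m1 = [[1, 1], [0, 1]]
    print("tau(D) =", tau([m0, m1]))
```

Including the self-distance matters: a rotationally symmetric marker has self-distance 0 and would make the orientation of a detected marker ambiguous, no matter how far it is from the other markers.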
Dictionary             Size           Marker size   τ(D)
                                                    Original   Suboptimal   ArUco   AprilTags
ARToolKitPlus Simple   512 markers    6 × 6 bits    4          11           10      10
ARToolKitPlus BCH      4096 markers   6 × 6 bits    2          9            8       8
Table 3: Inter-marker distance comparison between the dictionaries from the ARToolKitPlus library and those generated by ArUco, AprilTags
and our suboptimal proposal with the same characteristics, i.e. same number of markers and bits. The column Original indicates the
inter-marker distance of the original ARToolKitPlus dictionaries. The results of the original dictionaries are significantly lower than those
obtained by the other methods. The results of our suboptimal proposal surpass the other approaches, allowing the correction of one more
bit in comparison to ArUco and AprilTags.
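The correction capabilities quoted in this section follow from the minimum distance. A minimal sketch, assuming the standard coding-theory bound that a dictionary with minimum distance τ(D) can correct up to ⌊(τ(D) − 1)/2⌋ erroneous bits, which agrees with the figures given in the text (47 → 23, 42 → 20). The function name is illustrative.

```python
def correctable_bits(tau):
    """Largest number of bit errors guaranteed correctable for minimum distance tau."""
    return max(0, (tau - 1) // 2)

if __name__ == "__main__":
    # tau(D) values taken from Table 3 (512 markers of 6 x 6 bits).
    for method, tau in [("Original", 4), ("Suboptimal", 11), ("ArUco", 10), ("AprilTags", 10)]:
        print(f"{method}: tau(D) = {tau}, correctable bits = {correctable_bits(tau)}")
```

Under this bound, an increase of τ(D) from an even value to the next odd value (10 to 11, or 8 to 9) is enough to gain one correctable bit, which is the gain reported in Table 3.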
[Plots for Figure 7: four panels of generation time (s) versus dictionary size d (0 to 250); legend: Suboptimal, ArUco, AprilTags]
Figure 7: Generation times for the suboptimal, ArUco and AprilTags methods as a function of the dictionary size for different numbers of bits.
The suboptimal times are, in general, shorter than the times of the other approaches, although the difference decreases as the marker size
increases.
5.2.2. Generation Time
This section analyses the time required by the suboptimal, ArUco and AprilTags methods to
generate the dictionaries shown in the previous section. As for ARTag and ARToolKitPlus, the
comparison is not feasible since their dictionaries are fixed and, hence, there is no generation
process.
Figure 7 shows the mean generation times for 30 executions as a function of the dictionary size
for different numbers of bits. As can be observed, the generation times are notably shorter
compared to the optimal case. For instance, the generation of an optimal dictionary composed of
6 markers and 4 × 4 bits needs more than 1 day, while the suboptimal model needs less than 20
seconds to generate a dictionary of 250 markers with the same marker size.
Also, the generation times of the suboptimal method are, in general, shorter than those of ArUco
and AprilTags. These differences are especially relevant for smaller marker sizes. AprilTags is
faster only for 25 × 25 bits. However, as shown in Sec. 5.2.1, the inter-marker distances
obtained by AprilTags in this case are completely unsatisfactory in comparison to those obtained
by our proposal or the ArUco method.
As for the ArUco and AprilTags results, it can be noted that there are some slopes in the plots,
where the times increase sharply. This is especially remarkable for small marker sizes (4 × 4
and 6 × 6 bits). These peaks correspond to those dictionary sizes where the objective distance
is decreased. Since the objective distance can only be reduced by reaching the time limit, the
generation time increases considerably with each reduction.
On the other hand, the suboptimal proposal reduces the objective distance by adjusting its
bounds during the MILP optimization based on the linear relaxation of the problem. This means
that the time limit does not need to be reached every time the objective distance is reduced,
and it explains why the suboptimal times are significantly shorter than the times of the other
approaches for 4 × 4 bits. However, as the marker size increases, the times of the suboptimal
method start to grow and some sharp slopes appear (similar to those on the ArUco or AprilTags
curves). These peaks also correspond to the dictionary sizes where the time limit is reached. In
these cases, the solver takes the best feasible solution found until that moment and continues
with the next generation.
Note that for the three methods, after the upper bound of the objective function is reduced, the
next generated markers are less restricted and they can be generated faster, which explains the
flat lines after each slope.
For the biggest marker size, 25 × 25 bits, there is a high number of objective distance
reductions so that the slopes become less distinguishable in the three cases.

      τ(D)
d     Optimal   Suboptimal
1     10        10
2     8         8
3     8         8
4     8         7.37
5     8         6.2
6     8         6
7     8         6
8     8         6

Table 4: Suboptimal distances compared to the optimal distances for 4 × 4 bits and dictionary size up to 8 markers.
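The stepped shape of the generation-time curves discussed in Section 5.2.2 can be reproduced in miniature with a toy greedy generator. The sketch below is not the ArUco or AprilTags implementation: it is a simplified illustration in which random candidates are accepted only if they respect an objective distance, and a budget of failed attempts (a hypothetical stand-in for the time limit) must be exhausted before that objective is decremented. All names and parameter values are assumptions made for the example.

```python
import random

def rotations(bits, n):
    """Yield the four 90-degree rotations of an n x n marker stored as a flat tuple."""
    current = bits
    for _ in range(4):
        yield current
        # Clockwise rotation: new[r][c] = old[n - 1 - c][r]
        current = tuple(current[(n - 1 - c) * n + r] for r in range(n) for c in range(n))

def hamming(a, b):
    """Number of positions at which two flat bit tuples differ."""
    return sum(x != y for x, y in zip(a, b))

def distance(a, b, n):
    """Rotation-invariant Hamming distance between two markers."""
    return min(hamming(a, r) for r in rotations(b, n))

def self_distance(a, n):
    """Distance of a marker to its own 90, 180 and 270 degree rotations."""
    return min(hamming(a, r) for r in list(rotations(a, n))[1:])

def generate(size, n, start_target, budget, seed=0):
    """Toy greedy generator; `budget` failed attempts stand in for a time limit."""
    rng = random.Random(seed)
    target, failures, attempts = start_target, 0, 0
    dictionary, log = [], []
    while len(dictionary) < size:
        cand = tuple(rng.randint(0, 1) for _ in range(n * n))
        attempts += 1
        if self_distance(cand, n) >= target and all(
            distance(cand, m, n) >= target for m in dictionary
        ):
            dictionary.append(cand)
            # Record how many candidates this marker cost, including failed ones.
            log.append((len(dictionary), target, attempts))
            failures, attempts = 0, 0
        else:
            failures += 1
            if failures >= budget and target > 1:
                target -= 1  # the objective is relaxed only once the budget is spent
                failures = 0
    return dictionary, target, log

if __name__ == "__main__":
    _, final_target, log = generate(size=8, n=4, start_target=8, budget=200)
    for d, target, attempts in log:
        print(f"d={d} objective={target} attempts={attempts}")
```

In runs of this toy, markers added right after a decrement tend to carry the cost of the exhausted budget, while the following markers are found with few attempts, which mirrors the slopes and flat segments described for Figure 7.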
[10] M. Fiala, Designing highly reliable fiducial markers, [21] M. Kaltenbrunner, R. Bencina, reacTIVision: a
IEEE Trans. Pattern Anal. Mach. Intell. 32 (7) (2010) computer-vision framework for table-based tangible in-
1317–1324. teraction, in: Proceedings of the 1st international con-
ference on Tangible and embedded interaction, TEI ’07,
[11] D. Wagner, D. Schmalstieg, ARToolKitPlus for pose ACM, New York, NY, USA, 2007, pp. 69–74.
tracking on mobile devices, in: Computer Vision Win-
ter Workshop, 2007, pp. 139–146. [22] M. Fiala, Comparing ARTag and ARToolKit Plus fidu-
cial marker systems, in: IEEE International Workshop
[12] S. Garrido-Jurado, R. Muñoz-Salinas, F. Madrid- on Haptic Audio Visual Environments and their Appli-
Cuevas, M. Marı́n-Jiménez, Automatic generation and cations, 2005, pp. 147–152.
detection of highly reliable fiducial markers under oc-
clusion, Pattern Recognition 47 (6) (2014) 2280 – 2292. [23] J. Rekimoto, Matrix: A realtime object identifica-
tion and registration method for augmented reality, in:
[13] E. Olson, AprilTag: A robust and flexible visual fiducial Third Asian Pacific Computer and Human Interaction,
system, in: Proceedings of the IEEE International Con- Kangawa, Japan, IEEE Computer Society, 1998, pp.
ference on Robotics and Automation (ICRA), IEEE, 63–69.
2011, pp. 3400–3407.
[24] W. Peterson, D. Brown, Cyclic codes for error detec-
[14] K. Dorfmüller, H. Wirth, Real-time hand and head tion, Proceedings of the IRE 49 (1) (1961) 228–235.
tracking for virtual environments using infrared bea- [25] S. Lin, D. Costello, Error Control Coding: Fundamen-
cons, in: in Proceedings CAPTECH’98. 1998, Springer, tals and Applications, Prentice Hall, 1983.
1998, pp. 113–127.
[26] D. Schmalstieg, A. Fuhrmann, G. Hesina, Z. Szalavári,
[15] M. Ribo, A. Pinz, A. L. Fuhrmann, A new optical L. M. Encarnaçäo, M. Gervautz, W. Purgathofer,
tracking system for virtual and augmented reality ap- The Studierstube augmented reality project, Presence:
plications, in: In Proceedings of the IEEE Instrumenta- Teleoper. Virtual Environ. 11 (1) (2002) 33–54.
tion and Measurement Technical Conference, 2001, pp.
1932–1936. [27] D. Flohr, J. Fischer, A Lightweight ID-Based Extension
for Marker Tracking Systems, in: Eurographics Sym-
[16] V. A. Knyaz, R. V. Sibiryakov, The development of posium on Virtual Environments (EGVE) Short Paper
new coded targets for automated point identification Proceedings, 2007, pp. 59–64.
and non-contact surface measurements, in: 3D Surface
Measurements, International Archives of Photogram- [28] A. Schrijver, Theory of Linear and Integer Program-
metry and Remote Sensing, Vol. XXXII, part 5, 1998, ming, John Wiley & Sons, Inc., New York, NY, USA,
pp. 80–85. 1986.
[17] L. Naimark, E. Foxlin, Circular data matrix fiducial [29] R. Niemann, P. Marwedel, An algorithm for hard-
system and robust image processing for a wearable ware/software partitioning using mixed integer linear
vision-inertial self-tracker, in: Proceedings of the 1st In- programming, Design Automation for Embedded Sys-
ternational Symposium on Mixed and Augmented Real- tems 2 (2) (1997) 165–193.
ity, ISMAR ’02, IEEE Computer Society, Washington,
[30] C.-W. Hui, Y. Natori, An industrial application using
DC, USA, 2002, pp. 27–36.
mixed-integer programming technique: A multi-period
utility system model, Computers and Chemical Engi-
[18] J. Rekimoto, Y. Ayatsuka, CyberCode: designing aug-
neering 20, Supplement 2 (0) (1996) 1577–1582.
mented reality environments with visual tags, in: Pro-
ceedings of DARE 2000 on Designing augmented reality [31] H. Morais, P. Kádár, P. Faria, Z. A. Vale, H. Khodr,
environments, DARE ’00, ACM, New York, NY, USA, Optimal scheduling of a renewable micro-grid in an
2000, pp. 1–10. isolated load area using mixed-integer linear program-
ming, Renewable Energy 35 (1) (2010) 151–156.
[19] M. Rohs, B. Gfeller, Using camera-equipped mobile
phones for interacting with real-world objects, in: Ad- [32] T. Gönen, Distribution-system planning using mixed-
vances in Pervasive Computing, 2004, pp. 265–271. integer programming, Generation, Transmission and
Distribution, IEE Proceedings C 128 (1981) 70–79(9).
[20] E. Ouaviani, A. Pavan, M. Bottazzi, E. Brunelli,
F. Caselli, M. Guerrero, A common image processing [33] A. Richards, J. P. How, Aircraft trajectory planning
framework for 2d barcode reading, in: Image Processing with collision avoidance using mixed integer linear
and Its Applications. Seventh International Conference programming, American Control Conference (ACC) 3
on (Conf. Publ. No. 465), Vol. 2, 1999, pp. 652–655. (2002) 1936–1941 vol.3.
13
[34] J. A. Nelder, R. Mead, A simplex method for function
minimization, The computer journal 7 (4) (1965) 308–
313.
[35] A. Schrijver, Theory of Linear and Integer Program-
ming, John Wiley & Sons, Chichester, 1986.
14