
Augmented Reality Implementation Methods in Mainstream Applications

Article in Acta Universitatis Agriculturae et Silviculturae Mendelianae Brunensis · June 2011
DOI: 10.11118/actaun201159040257 · Source: arXiv

Authors: David Procházka and Tomas Koubek, Mendel University in Brno


ACTA UNIVERSITATIS AGRICULTURAE ET SILVICULTURAE MENDELIANAE BRUNENSIS

Volume LIX 30 Number 4, 2011

AUGMENTED REALITY IMPLEMENTATION METHODS IN MAINSTREAM APPLICATIONS

D. Procházka, T. Koubek

Received: August 31, 2010

Abstract

PROCHÁZKA, D., KOUBEK, T.: Augmented reality implementation methods in mainstream applications. Acta univ. agric. et silvic. Mendel. Brun., 2011, LIX, No. 4, pp. 257–266

Augmented reality has become a useful tool in many areas, from space exploration to military applications. Although the underlying theoretical principles have been well known for almost a decade, augmented reality is used almost exclusively in high-budget solutions with special hardware. However, in the last few years we could see the rising popularity of many projects focused on the deployment of augmented reality on different mobile devices. Our article is aimed at developers who are considering the development of an augmented reality application for the mainstream market. Such developers will be forced to keep the application price, and therefore also the development costs, at a reasonable level. Using an existing image processing software library can bring a significant cut-down of the development costs. The theoretical part of the article presents an overview of the augmented reality application structure. Further, an approach for selecting an appropriate library as well as a review of the existing software libraries focused on this area is described. The last part of the article outlines our implementation of the key parts of an augmented reality application using the OpenCV library.

augmented reality, OpenCV, template matching, ARToolkit, computer vision, image processing.

1 Introduction

The general principle of augmented reality (AR) is embedding digital information into a real-world scene; it is thus a step between virtual reality and the real world. The embedded information is usually based on the content of the scene. A selected real object can be augmented by a virtual object or completely replaced. Well-known examples are the presentation of a fighter status report on the head-up display in front of a pilot, or navigation information projected on the windshield of a car.

Applications based on augmented reality have been used to support decision making for many years. Military solutions for field operations did the pioneering work in this area – from the mentioned fighter head-up displays to tactical suits for troopers. Moreover, there is a number of applications for the construction and maintenance of complex systems, especially space devices, planes and helicopters. A number of these solutions is outlined in Ong – Nee (2004). The discussed applications are usually developed for a single specific purpose. Especially for this reason, they are rather expensive. Therefore, they are used only where they save lives or speed up the production or maintenance of very expensive devices.

However, in the last few years AR applications have been emerging in other areas – design, medicine and even consumer electronics. The development of AR applications for this field is discussed in this article. It is obvious that one of the most important prerequisites for success in this area is the price. This price is given by the price of the hardware (which is nowadays usually quite cheap) and the development costs.

The development costs can be cut significantly by using existing image processing software libraries. AR applications work on similar well-known principles, so for a number of complex algorithms it is possible to use an existing implementation (image preprocessing, edge detection, etc.). The goal of this article is to present the potential of these libraries for the development of mainstream AR applications and to clearly outline


the general AR application structure including the implementation in a selected library.

Section 2 shortly describes the augmented reality hardware and examples of its usage. In section 3, the key problem – the detection of an object in an image – is outlined. Section 4 is a short review of the existing software libraries for image processing. In section 5, our implementation of the key parts of an augmented reality application is presented.

2 Hardware used for augmented reality solutions

Our reality could be augmented in many ways. Widely spread are, for example, audio navigation tools for visually impaired people. However, in the following review we will focus especially on the visual augmentation of reality. This visual augmentation can be divided into three main categories. The first one is based on the usage of head-mounted displays. The other group is based on projectors; this kind of augmented reality is called spatial augmented reality. The last category is based on common displays (tablets, cell phones, etc.).

Currently used head-mounted displays are based on the optical composition of the scene or on the video composition. The optical composition is a projection of artificial objects on a semi-transparent screen before the user's eyes. The video composition combines an image from a camera with artificial objects and presents the result on a small LCD screen in a virtual helmet. The semi-transparent screens are suitable especially for applications where a camera signal blackout could be critical (fighters, troopers, etc.). On the same principle are in fact based the head-up displays in cars with status and navigation information. The most important problem of this solution is the exact overlaying of a real and a digital object.

The solution is the usage of a video composition based device. The image from the front-side camera is analysed, the position of the real object is found and this object can be seamlessly replaced. Drawbacks of this solution are usually a higher head-mounted display size and weight, a limited field of view and the price. Applications based on them are usually from the category discussed in the beginning of the article – special high budget solutions. However, these solutions are not suitable for a common customer.

The other group of products – solutions based on projectors – has been growing significantly in recent years. A computer analyses the scene using a camera and searches for predefined objects. If they are found, the attached projector can directly augment the real-world object surface with the given digital information. A quite common example of this spatial augmented reality is the adaptive projector used for projection on heterogeneous surfaces. The projected image is controlled by the software and adjusted to compensate the differences between the anticipated and the real image. A thorough description of this technology can be found in Bimber – Raskar (2005).

The last group of devices are solutions based on different screens. This area has been growing the fastest in recent years, driven especially by the huge emergence of advanced cellphones and tablets. These devices have all the necessary components: a suitable display, a high resolution camera, a processor fast enough for real-time image analysis and also a GPS accompanied by a compass. One of the most popular AR applications of this kind is the project Layar1. From the technical point of view, it is a video composition based application merging the camera image with additional information from map layers stored inside the device.

Cellphones and other portable devices present a platform with a significant economic potential. The number of users is incomparable to the previously mentioned categories. According to research done by the Gartner agency, 62 million smartphones with the ability to run such AR applications were sold in the second quarter of 2010 alone. The rapidly growing market of tablets founded by Apple's iPad2 is also important.

2.1 Comparison of AR implementations

Although the presented output devices are completely different, there is a number of common principles. The first key problem is the identification of the scene before the user. For this purpose, solely image processing can be used (searching for known objects), or other positioning techniques can be used. Frequently used are the triangulation from a cellphone network, Wi-Fi hotspots, the GPS or inertial sensors. At the moment the scene before the user is identified, it is necessary just to insert the appropriate information. These first two steps are based on well-known common principles described later.

A significant difference is in the method of presentation to the user. However, it is just a question of the used hardware. The software architecture of the systems is usually very similar.

3 Object detection methods

The general structure of any application based on image processing is the following: We acquire an image from a camera and store it into an inner representation (a kind of RGB color matrix). Further, we make an image analysis and identify a possible wanted object, its position and orientation. This potentially wanted object is compared with

1 http://www.layar.com/
2 http://www.apple.com/ipad/

1: Scheme outlining the basic functionality of the augmented reality application

a predefined pattern or patterns. In the case of success, the last step is the insertion of the artificial object. This process is illustrated in Fig. 1 and is quite common to all AR applications.
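The loop just described can be reduced to a short sketch. The input is assumed to be already preprocessed: `candidates` is a list of (vertices, pattern) pairs extracted from the camera image, and `known_patterns` maps pattern ids to templates. All names and the data layout are illustrative placeholders, not an API of any of the libraries discussed below.

```python
# Sketch of the Fig. 1 pipeline: every candidate object found in the
# frame is compared with the known patterns; on success the caller
# would insert the artificial object at the candidate's vertices.

def process_frame(candidates, known_patterns, limit=0.8):
    """Return (pattern_id, vertices) for every identified marker."""
    augmented = []
    for vertices, pattern in candidates:
        for pattern_id, template in known_patterns.items():
            if match_score(pattern, template) >= limit:
                augmented.append((pattern_id, vertices))
                break
    return augmented

def match_score(a, b):
    """Fraction of equal entries in two equally long binary patterns."""
    return sum(1 for x, y in zip(a, b) if x == y) / len(a)
```

The comparison step (here a simple fraction of equal pixels) is where real applications differ the most, as discussed below.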
Another significant difference is especially in the step of comparing the possible desired object with the patterns. It depends on whether the searched object is a face, a natural object (a building), a simple shape (a window) or an artificial marker (usually a black and white square with a predefined pattern). In the following part of the article we will focus on the artificial marker. This case is quite simple and could be the first step in building a robust AR application.

3.1 Artificial marker detection

An example of the marker detection process can be described by the following steps. The whole process is also outlined in Fig. 2. Our presented method is not the only possible solution; however, it is widely used by many well tested applications.

As already mentioned, a color image is taken from an input device and stored into the inner representation of the image processing library. Further, this image is transformed into gray scale. It is possible to make a standard conversion of all color channels or to prefer a specified color channel (e.g. green). The grey value now represents the brightness of the pixel (hence of an object).

Such an image contains a number of objects – markers, persons, furniture, etc. To improve performance, it is necessary to remove most of these objects from the image. This step is usually done via thresholding. Simple thresholding can generally be described by the following formula:

g(i, j) = \begin{cases} 0, & f(i, j) \le P \\ 1, & f(i, j) > P, \end{cases} \qquad (1)

where the function f(i, j) is the source image (the brightness of the pixel), the value P is the threshold and g(i, j) is the result image. The value of the threshold is usually determined according to the scene content.
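Equation 1 can be sketched in a few lines of plain Python; this only illustrates the formula, since a real application would use an optimized routine from the chosen library (and, for complex lighting, an adaptive variant).

```python
# Equation 1 applied to a tiny grayscale image stored as a list of rows.

def threshold(image, p):
    """Binarize an image: g(i, j) = 0 where f(i, j) <= p, else 1."""
    return [[0 if pixel <= p else 1 for pixel in row] for row in image]
```

For example, thresholding the image [[10, 200], [90, 130]] with P = 128 keeps only the two bright pixels.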
The gist of the thresholding is a transformation to a bitmap in such a way that allows removing most of the unnecessary objects. A well chosen thresholding method can significantly improve the performance of the application. Each object that is not filtered out is a potential marker and therefore must be tested as described further. If it is possible, it is recommended to also prepare an appropriate testing environment. Suitable lighting and high contrast markers can significantly simplify the preprocessing phase. Generally, a homogeneous controlled environment allows wiping out most of the unwanted objects (see e.g. Dutta – Chaudhuri (2009)). In case of complex lighting conditions, different adaptive thresholding techniques are usually used.

The result of the previous step is an image with a number of vertices and edges. The next step is the detection of connected components with the required shape. This can be done using an image morphology algorithm. This algorithm produces a list or tree of image components. For the identification of potential markers it is sufficient to browse this list of objects and test whether the provided entity fulfils the given criterion. In the case of a common white square marker with a black inner square with a pattern, it is an entity with four vertices with an inner entity again with four vertices.

As soon as we have potential marker vertices, they must be compared with the predefined patterns. For this comparison, a transformation of the marker pixels between the marker plane and the camera plane must be done (in other words: the perspective distortion must be eliminated). Equation 2 describes this transformation. If the defined transformation matrix is applied on a point [xm, ym, zm] in the marker plane, we will receive the position of this vertex in the camera plane. By inverting this process we can receive the original vertex position before the perspective distortion (generally, we will receive the original object shape). The elements T1–T3 represent a translation vector. The elements R11–R33 represent the well known 3 × 3 rotation matrix (see Neider et al. (2007), p. 806). The calculation of the transformation matrix elements is described in Šťastný et al. (2011). By this step we have restored the original shape of the object and it is possible to make a comparison with the marker patterns.

\begin{pmatrix} x_c \\ y_c \\ z_c \\ 1 \end{pmatrix} =
\begin{pmatrix} R_{11} & R_{12} & R_{13} & T_1 \\ R_{21} & R_{22} & R_{23} & T_2 \\ R_{31} & R_{32} & R_{33} & T_3 \\ 0 & 0 & 0 & 1 \end{pmatrix}
\begin{pmatrix} x_m \\ y_m \\ z_m \\ 1 \end{pmatrix} \qquad (2)

\begin{pmatrix} x_c \\ y_c \\ z_c \\ 1 \end{pmatrix} =
\begin{pmatrix} C_{11} & C_{12} & C_{13} & C_{14} \\ C_{21} & C_{22} & C_{23} & C_{24} \\ C_{31} & C_{32} & C_{33} & 1 \\ 0 & 0 & 0 & 1 \end{pmatrix}
\begin{pmatrix} R_{11} & R_{12} & R_{13} & T_1 \\ R_{21} & R_{22} & R_{23} & T_2 \\ R_{31} & R_{32} & R_{33} & T_3 \\ 0 & 0 & 0 & 1 \end{pmatrix}
\begin{pmatrix} x_m \\ y_m \\ z_m \\ 1 \end{pmatrix} \qquad (3)

A precondition of a successful transformation is a camera calibration step. The camera calibration matrix describes the optical properties of a given device and also compensates possible optical errors. This calibration matrix is calculated for a given camera only once. Therefore, the performance of the application is not affected. The application of the calibration matrix is described in equation 3. Its structure and calculation are outlined in Kato – Billinghurst (1999).

The last step – marker identification – is the most time consuming operation. It is a correlation calculation between the potential marker image and the pattern. In case the matching is above the given limit, the images are taken as corresponding. A number of methods can be used for the correlation calculation, from neural networks to the least squares algorithm. Selected approaches are described in Kato et al. (2003). Methods implemented directly in the OpenCV library can be found in Bradski – Kaehler (2008) on page 214. Each potential marker must be tested against all patterns using a selected method until all patterns are tested or the appropriate pattern is found. It is obvious that the computation complexity grows linearly with the number of patterns. That is the reason why many applications use special markers with patterns given by algorithms such as the well known Golay error-correcting code.
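The linear pattern search described above can be sketched in plain Python. Patterns are flat lists of binary pixels here, and the match score (fraction of equal pixels) merely stands in for the correlation methods surveyed in the cited literature.

```python
# Each potential marker is tested against every pattern until the match
# score exceeds the given limit; cost grows linearly with the pattern set.

def identify_marker(candidate, patterns, limit=0.9):
    """Return the index of the first matching pattern, or None."""
    for index, pattern in enumerate(patterns):
        matches = sum(1 for a, b in zip(candidate, pattern) if a == b)
        if matches / len(candidate) > limit:
            return index
    return None
```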
In case there is a corresponding pattern, the application will store the transformation matrix that defines the orientation of the marker in the scene. This matrix is the homography matrix described before.

3.2 Conclusion of the marker identification process

The basic principles of AR applications are quite similar, as is obvious from the description outlined

2: Stages of object detection and augmenting of the scene



above. The difference is mostly in the particular algorithms used for image comparison. That is the reason why it is not effective to implement the application from scratch, but to use an existing library that supports the mentioned well-known algorithms. The following part of the article describes the selection of an appropriate library.

4 Methodics of image processing library selection

Before selecting a library for our AR application it is necessary to specify exactly the functionality which is required. For a huge number of applications, the following criteria are important:
1. Required programming language support: The fulfilment of this criterion can be complicated. Most of the libraries support just C/C++. An exception is the OpenCV library, which supports also the Python language, and NyARToolKit, supporting many mainstream languages.
2. Required platform and architecture support: This problem is obviously the most limiting criterion, especially in case the application is targeted at mobile devices. On personal computers, only the support of the 64-bit architecture, which is necessary for complex applications, is limited.
3. Project is under active development: There is a huge number of already unsupported projects or projects with very limited numbers of active developers and users. The usage of such a library is not recommended. Support of new architectures, input devices, etc. is an essential feature of any library.
4. Documentation: This aspect is frequently underestimated in the beginning of a project. However, developers will be facing a number of situations where it is utmost important to understand thoroughly the implemented method. It is not enough to know that a method or function provides some correlation coefficient; it is necessary to know exactly how the algorithm works.
5. Number of provided functions: Again, it is quite an obvious requirement. It is recommended to consider the future development of the application. To base the project on a library with limited functionality could be expensive from a long term point of view.

The further section briefly reviews selected frequently used image processing toolkits.

4.1 OpenCV

OpenCV3 (Open Computer Vision) is an open source toolkit for real-time image processing. OpenCV is the most robust solution among all the frameworks in comparison. It supports the C++, C and Python languages. It consists of tools for image analysis, image transformations, camera calibration, stereo vision and also tools for a simple graphical user interface and more.

OpenCV is associated with the Intel company, which implements the support for OpenCV into hardware. This library uses Intel Integrated Performance Primitives, which provide high performance for low-level routines for sound, video, speech recognition, coding, decoding, cryptography etc. Intel Threading Building Blocks is used for parallel processing.

The library is cross-platform; there are versions for GNU/Linux, Microsoft Windows and Mac OS X, for 32-bit and 64-bit systems. A big advantage of this project is that it is still in development. In December 2010, version 2.2 was released, and the development of a new version is in progress. The community of OpenCV users is huge; there are a lot of manuals, tutorials and discussion forums. The basics of OpenCV, the mathematical principles of image processing and their implementation are described by Bradski – Kaehler (2008).

The easy usage of cameras is very useful for the development of augmented reality applications. This framework uses the resources of the operating system and the communication with hardware drivers is fully provided as well. If a camera driver is installed in the operating system, the camera can be initialized and used immediately. The disadvantage of this library is that no direct methods for marker registration are implemented. As shown below, for this purpose it is necessary to link several different OpenCV functions.

4.2 ARToolKit

ARToolKit4 is a software library for the development of augmented reality applications. It is cross-platform; there are versions for GNU/Linux, Microsoft Windows, Mac OS X and SGI, but officially only for 32-bit versions of the systems. This framework implements some basic tools for marker registration. This is more straightforward in comparison with OpenCV. The implemented methods are robust. The library supports the C language. The community of ARToolKit users is not as big as the OpenCV community, but still quite large.

Working with cameras is more complicated than in the case of OpenCV. However, a more substantial problem is that the development of the free version of this framework was stopped and it continues only for the paid version called ARToolKit Professional5. The last free version was released in February 2007 and since then no further update has been released. The development is stopped and probably no new version will be released in the future. Compatibility

3 More information on: http://opencv.willowgarage.com


4 http://www.hitl.washington.edu/artoolkit
5 http://www.artoolworks.com

with new versions of the operating systems or other features is not guaranteed. These can be found in the paid version, which is very costly. The development license costs 4995$ a year; the use of the framework in third party applications costs 995$ a year. The development of the commercial version is currently uncertain, because the last major changes of ARToolKit Professional were made, according to the documentation, in 2007. The library is still maintained.

4.3 NyARToolKit and derivatives

NyARToolKit6 was derived from ARToolKit version 2.72.1 (the last version of the free ARToolKit). Nowadays, NyARToolKit is developed by a Japanese author. In fact, it is a derivation of ARToolKit for other platforms and languages (ARToolKit is for the C language only). There is a version for Java, Flash, Android, Silverlight (SLARToolKit), Actionscript (FLARToolKit), or C#. The C++ version is in the beta stage (no support for 3D or cameras). There are still some restrictions, i.e. the documentation is in Japanese, but the project is active and other shortages can be removed in next versions. The community will be crucial.

4.4 Other AR libraries

Other frameworks for working with AR are Morgan, DART, Goblin XNA or Studierstube. Mostly these are university projects, but they are not so recent or their support is only marginal, as well as their user community is only limited. Studierstube is worth noticing because of the best documentation (among the university projects). It is developed by the Institute for Computer Graphics and Vision at Graz University of Technology. At first, Studierstube was used for collaborative AR; later it focused on mobile applications. This library uses other frameworks for tracking, video and registration.

It is advisable to check the development of Goblin XNA. It is a Columbia University project, derived from the project Goblin with added support for the Microsoft XNA framework. ARTag is used for marker tracking. The connection with XNA helps to bring augmented reality applications to the Microsoft Xbox platform. However, it is also available for GNU/Linux and Mac OS X.

4.5 Choice of an appropriate library

OpenCV, ARToolKit (Professional), NyARToolKit and Studierstube have some pros and cons, but they belong to the better ones. Yet none of these completely fulfilled the specified criteria for the framework choice. All frameworks are cross-platform. Support for 64-bit systems is provided for OpenCV, NyARToolKit and ARToolKit Professional. OpenCV, ARToolKit (Professional) and Studierstube have a lot of documentation (OpenCV has both official documentation and literature).

OpenCV has the best stability of development among all the frameworks. New versions are released often; a lot of mistakes are corrected and new features are added. For the other projects, the continuation of the development is uncertain. NyARToolKit publishes updates approximately in six-month intervals, the last in April 2011. Studierstube released the latest version of the framework two years ago.

The choice of a library is not clear at all, because it depends on the particular needs of the project. However, for production deployment NyARToolkit, ARToolKit and Studierstube are not appropriate. The big disadvantage is the uncertain or stopped development. These libraries have big potential, but there could be some issues in the future, especially new architecture or operating system compatibility etc. The development of ARToolKit Professional is also questionable. The project website was updated in 2011, but only version 2.72.1 (from 2007) is available for free.

The choice of an appropriate library is a very important decision at this point. From our point of view, there are two options – ARToolKit Professional and OpenCV. The first one is ready for the development of augmented reality applications, but it is very expensive. OpenCV has good documentation and user support and it is free, but it is not ready for AR application development. The user has to implement a method for marker tracking or recognition. However, OpenCV provides a lot of image processing functions which can be useful later. We decided to use OpenCV for the development of our AR application.

5 Detection of an artificial marker in the OpenCV environment

As we described earlier, the OpenCV library does not implement any direct methods for identifying and registering artificial markers in space. But it provides a lot of methods for image analysis and image processing, which can be used for the implementation of marker recognition and registration. For testing of AR applications, OpenCV offers functionality for finding a special type of marker – the chessboard. The use of the chessboard marker makes the development of simple AR applications much easier. The next section describes the main ideas of finding an artificial marker. The method details can be found in Bradski – Kaehler (2008), in the OpenCV reference manual on the project homepage and in the code examples distributed within the installation package.

5.1 Finding of artificial marker vertices

As mentioned in section 3, firstly the image has to be converted to gray-scale and thresholded. Gray-scaling is done by the function cvCvtColor, which converts the OpenCV internal format to different
6 http://nyatla.jp/nyartoolkit

color spaces. This conversion is possible to the color spaces RGB, CIE Luv, CIE Lab, HLS, HSV, YCrCb and CIE XYZ. One parameter of cvCvtColor specifies between which two color spaces the image is converted. To threshold the image, the function cvAdaptiveThreshold can be used. Its output is a binary image. The threshold value, threshold methods and their settings are specified in the function parameters.

The next step of our analysis is the finding of edges and vertices. OpenCV provides several functions which implement different algorithms for these actions. For edge detection, e.g. the Canny edge detector or the Hough transformation is available. The mentioned methods are described in Wang – Fan (2009), respectively Duda – Hart (1972). Another option is the OpenCV function cvFindContour. The process described below works with the well-known artificial marker – a black rectangle within a white field, with a unique picture (which can be compared with a template).

Our application uses the cvFindContour function to find contours in the thresholded image. There are two kinds of contours – an inner and an outer contour. All contours are represented by an OpenCV structure called cvSeq and are stored into a special storage. Some parameters of cvFindContour are important for the whole algorithm. The parameter mode of the function cvFindContour determines how the found contours will be organized. If mode is set to CV_RETR_CCOMP, all contours are organized into a two level hierarchy – parents and children (it is possible to obtain also a list or a tree of contours).

All contours are stored in the internal storage, and we must decide which contours belong to a particular marker. At first, we can approximate a polygon from the contours by the function ApproxPoly. It is an important step in finding a marker, because polygons have some additional attributes, e.g. the attribute total, which expresses the number of lines of a polygon.

The choice of the right contours is crucial. We can use the attributes of cvSeq and the attributes of polygons. We declare that a contour is a marker if it consists of 4 lines (being the attribute of the polygon – determining the outer contours of the black rectangle), if it has a child (being the attribute of cvSeq – determining the inner contours of the black rectangle) and the child consists of 4 lines. These conditions determine the edges of the marker. Each contour is the input for the structure CvSeqReader. This is used for finding the vertices of markers. For this purpose, the macro CV_READ_SEQ_ELEM is called. The output of this algorithm is a quadruple of marker vertices. It is used for the transformation of the coordinate system (see Fig. 2, step 6). Now we have a quadruple of points that describe the vertices of a potential marker.
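The contour-selection rule above can be sketched over a toy hierarchy instead of OpenCV's cvSeq: a contour is a potential marker if its approximating polygon consists of 4 lines and it has a child whose polygon also consists of 4 lines. The dict layout used here is illustrative only, not the OpenCV data structure.

```python
# A contour is {'total': polygon line count, 'children': [contours]}.

def is_potential_marker(contour):
    """Apply the 4-line-outer / 4-line-inner marker criterion."""
    return contour["total"] == 4 and any(
        child["total"] == 4 for child in contour["children"])

def marker_candidates(contours):
    """Keep only the top-level contours satisfying the marker criterion."""
    return [c for c in contours if is_potential_marker(c)]
```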
5.2 Transformation of vertices coordinates into the camera plane and image matching

For matching the found object with an image template, it is necessary to compensate its perspective transformation. For computing the transformation matrix, which describes the marker pose and position, the function cvFindHomography can be used. Its input is the matrix of points which represents the marker template and the matrix of points which represents the marker in the image (this is just a square of a given size). Their transformation is described by the matrix in equation 2. It is also used for the compensation of the marker rotation. For this, cvWarpPerspective is used. This function makes the inverse operation for the transformation which is described by the matrix (for the inverse transformation it is necessary to use the flag CV_WARP_INVERSE_MAP). The output of this function is an image without the perspective transformation. This image can be matched with the template.

For the comparison of two images, OpenCV provides its own implementation of a template matching method, described by Brunelli (2009). It is implemented in the function matchTemplate. The template slides through the image, compares the overlapped patches against the template using a specified method and stores the comparison results to a matrix (the result value expresses the probability of the template presence in the image and depends on the matching method). For finding the template position in the image, the function minMaxLoc must be called. This function finds the position and value of the global minimum and maximum in the result matrix.

Surely the matchTemplate method is not the only solution for pattern recognition. A number of methods could be used, including different clustering methods (see e.g. Fejfar et al. (2010)) or neural networks (image processing applications are outlined e.g. in Prochazka et al. (2011)).

5.3 Finding the chessboard marker

In this section, the detection of chessboard vertices is implemented. This is a special case of the issue described in section 5.1. For this chessboard vertices detection, OpenCV includes the function cvFindChessboardCorners, which finds all inner corners of the chessboard. The input of this algorithm is the chessboard size – the number of inner corners (height and width of the chessboard). There are a few methods for finding the corners. The function cvFindCornerSubPix can be used for a more precise determination of the corner position. After this, the positions of all inner corners of the chessboard are stored into an array of cvPoint. The vertices which represent the four edge vertices of the marker are at these array coordinates: 0, boardWidth-1, (boardHeight-1)*boardWidth and (boardHeight-1)*boardWidth+boardWidth-1, where boardWidth represents the number of inner chessboard corners in a row and boardHeight represents the number of corners in a column.
step 6). Now we have a quadruple of points that inner chessboard corners in rows and boardHeight
describe the vertices of a potential marker. represents the amount of corners in columns. This
algorithm output is a marker vertices quaternion.
5.2 Transformation of vertices coordinates These are the corners coordinates of the chessboard
into camera plane and image matching marker in the image. These corners are highlighted
For matching of the found object with an in fig. 3 by big circles. Other found corners are
image template it is necessary to compensate its highlighted by smaller circles. There is an apparent
perspective transformation. For computing of the robustness for rotation and bending of the
264 D. Procházka, T. Koubek

3: Example of chessboard marker detection

chessboard. A loss of inner corners occurs at big The augmented reality applications for common
deformation of the marker. users are emerging area. Low user-friendliness is
a crucial problem of the development, although
there is a significant research in this area for almost
DISCUSSION AND CONCLUSIONS
two decades. For instance, there are no standard
The implementation clearly shows that the approaches for user interface design, even despite
OpenCV library is able to implement same the fact these are commonly used for a desktop
functionality as the well-known ARToolkit and other and mobile applications design – see Saffer (2010),
libraries based on this project. This is a key issue for Kryštof (2009) and many others. One of the reasons
many developers considering the AR application is that many metrics and design patterns are not
development. According to our experience, the applicable to AR applications development.
OpenCV is more feasible solution than the ARToolkit, Assimilation of new metrics and patterns is
despite more complicated beginnings. a significant challenge. In future work we want to
focus on this area.
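The corner-index arithmetic from section 5.3 can be captured in a small helper. The formulas in that section assume the inner corners are stored row by row, so the four outermost corners of the grid sit at fixed positions in the flat array. The sketch below is ours (the function name is not part of the OpenCV API):

```python
def edge_corner_indices(board_width, board_height):
    """Indices of the four outermost inner corners in the flat,
    row-major corner array (board_width corners per row,
    board_height corners per column)."""
    return (
        0,                                  # first corner of the first row
        board_width - 1,                    # last corner of the first row
        (board_height - 1) * board_width,   # first corner of the last row
        (board_height - 1) * board_width + board_width - 1,  # last corner
    )

# A board with 7 inner corners per row and 5 per column:
print(edge_corner_indices(7, 5))  # (0, 6, 28, 34)
```

Applied to the array filled by cvFindChessboardCorners, these four indices select the vertices of the chessboard marker directly.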

SUMMARY
As outlined in our article, all augmented reality applications work on similar theoretical principles. The crucial difference lies in the position/orientation detection and in the template matching approach.
One group of applications uses only satellite navigation systems, compasses and motion sensors. However, these applications usually do not allow the representation of complex graphics objects. The other group of applications uses image analysis to obtain information about the orientation of the user. The basis of these applications lies in the composition of a signal from a camera with digital information. In our article we deal with this area, and we chose the problem of the identification of an artificial marker for illustration.
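The marker test described in section 5.1 – a four-sided outer polygon that contains a four-sided child – reduces to a simple check over the polygon list and the contour hierarchy. The sketch below uses synthetic data; the hierarchy rows follow the [next, previous, first_child, parent] layout produced by OpenCV's CV_RETR_CCOMP mode, and the function name is ours:

```python
def find_marker_candidates(polygons, hierarchy):
    """Return indices of contours that satisfy the marker conditions:
    the approximated polygon has 4 vertices, the contour has a child,
    and that child polygon also has 4 vertices.

    polygons  -- list of vertex lists (one per contour)
    hierarchy -- per-contour [next, previous, first_child, parent]
    """
    candidates = []
    for i, poly in enumerate(polygons):
        child = hierarchy[i][2]
        if (len(poly) == 4                       # outer contour of the black rectangle
                and child != -1                  # it must contain an inner contour
                and len(polygons[child]) == 4):  # the inner contour is 4-sided too
            candidates.append(i)
    return candidates

# Synthetic data: contour 0 is a quadrangle with a quadrangular child (1);
# contour 2 is a triangle without children.
polygons = [[(0, 0), (9, 0), (9, 9), (0, 9)],
            [(2, 2), (7, 2), (7, 7), (2, 7)],
            [(0, 0), (4, 0), (2, 4)]]
hierarchy = [[2, -1, 1, -1],   # contour 0: first child is contour 1
             [-1, -1, -1, 0],  # contour 1: parent is contour 0
             [-1, 0, -1, -1]]  # contour 2: no child
print(find_marker_candidates(polygons, hierarchy))  # [0]
```

With the modern API, the same check can run directly on the hierarchy array returned by cv2.findContours with cv2.RETR_CCOMP.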
The functionality of an application based on image analysis was defined in section 3.1 as the ability of: reading an image from an input device, its preprocessing (gray scale transformation, thresholding), segmentation into continuous objects, vertex detection, compensation of geometric distortions and the comparison with given patterns. The list of these processes could be supplemented by the ability to insert the model/information into the image; its implementation depends on the architecture of the application (whether OpenGL, Microsoft DirectX or another API is used). The aim of this section is to explain comprehensibly the basic principles of image processing in AR applications.
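The final comparison step works as described in section 5.2: the template comparison produces a result matrix, and the position of its global extreme marks the best match (in OpenCV, cvMatchTemplate followed by cvMinMaxLoc). A minimal squared-difference sketch in plain NumPy with synthetic data – the function name is ours:

```python
import numpy as np

def match_template_ssd(image, template):
    """Slide the template over the image, build the result matrix of
    summed squared differences, and return it together with the
    (row, column) of its global minimum, i.e. the best match."""
    ih, iw = image.shape
    th, tw = template.shape
    result = np.empty((ih - th + 1, iw - tw + 1))
    for y in range(result.shape[0]):
        for x in range(result.shape[1]):
            diff = image[y:y + th, x:x + tw] - template
            result[y, x] = np.sum(diff * diff)   # 0 means a perfect match
    best = tuple(int(v) for v in np.unravel_index(np.argmin(result),
                                                  result.shape))
    return result, best

# Synthetic example: plant a 2x2 patch at row 3, column 1 of a zero image.
image = np.zeros((6, 5))
template = np.array([[1.0, 2.0], [3.0, 4.0]])
image[3:5, 1:3] = template
result, best = match_template_ssd(image, template)
print(best)  # (3, 1) -- the patch is found where it was planted
```

A real application would run this on the perspective-corrected marker image; cvMatchTemplate additionally offers normalised variants that are more robust to lighting changes.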
The basic implementation scheme, which is briefly described in section 5, clearly presents the implementation of the discussed problems using the OpenCV library. The usage of the presented solutions can reduce development time. The discussed projects (and especially the OpenCV project) are open-source; therefore, it is possible to modify or extend any unsatisfactory existing implementations of the methods. On the basis of our experience with the discussed libraries, we recommend using the OpenCV library. The significance of the project, the open source code, the high-quality implementation, the available documentation and also the wide community and the support of the Intel company are indicators of project quality and, to some measure, a guarantee of further development.

Acknowledgements
This paper was written as part of the solution of the project IGA FBE MENDELU 31/2011 and of the research plan FBE MENDELU: MSM 6215648904.

REFERENCES

BIMBER, O. – RASKAR, R., 2005: Spatial Augmented Reality: Merging Real and Virtual Worlds. Wellesley, Massachusetts, USA: A K Peters. ISBN 1-56881-230-2.
BRADSKI, G. – KAEHLER, A., 2008: Learning OpenCV: Computer Vision with the OpenCV Library. USA: O'Reilly Media. ISBN 0-596-51613-4.
BRUNELLI, R., 2009: Template Matching Techniques in Computer Vision: Theory and Practice. USA: Wiley. ISBN 978-0-470-51706-2.
DUDA, R. O. – HART, P. E., 1972: Use of the Hough transformation to detect lines and curves in pictures. Commun. ACM, 15, 1, pp. 11–15. ISSN 0001-0782.
DUTTA, S. – CHAUDHURI, B. B., 2009: Homogenous Region based Color Image Segmentation. In: Proceedings of the World Congress on Engineering and Computer Science, San Francisco, CA, Vol. II, pp. 1301–1305. ISBN 978-988-182102-7.
FEJFAR, J. – WEINLICHOVÁ, J. – ŠŤASTNÝ, J., 2010: Musical Form Retrieval. In: MENDEL 2010, 16th International Conference on Soft Computing. Brno University of Technology.
KATO, H. – BILLINGHURST, M., 1999: Marker Tracking and HMD Calibration for a Video-based Augmented Reality Conferencing System. In: Proceedings of International Workshop on Augmented Reality, pp. 85–94, San Francisco, CA. IEEE. ISBN 0-7695-0359-4.
KATO, H. et al., 2003: A Registration Method based on Texture Tracking using ARToolKit. In: Proceedings of IEEE International Augmented Reality Toolkit Workshop, pp. 77–85. ISBN 0-7803-7680-3.
KRYŠTOF, J., 2009: Towards an MDA-based approach for development of a structural scope of the presentation layer. Acta univ. agric. et silvic. Mendel. Brun., LVII, 6, pp. 123–133. ISSN 1211-8516.
NEIDER, J. – DAVIS, T. – WOO, M., 2007: The OpenGL Programming Guide: The Official Guide to Learning OpenGL Version 3.0 and 3.1. USA: Addison-Wesley Publishing Company. ISBN 0-3214-8100-3.
ONG, S. – NEE, A., 2004: Virtual and Augmented Reality Applications in Manufacturing. London: Springer. ISBN 978-1-85233-796-4.
PROCHÁZKA, D. et al., 2011: Mobile Augmented Reality Applications. In: MENDEL 2011, 17th International Conference on Soft Computing. Brno University of Technology. ISSN 1803-3814.
SAFFER, D., 2010: Designing for Interaction: Creating Innovative Applications and Devices. Berkeley, CA: New Riders. ISBN 0-321-64339-9.
WANG, B. – FAN, S., 2009: An Improved CANNY Edge Detection Algorithm. In: IWCSE '09: Proceedings of the 2009 Second International Workshop on Computer Science and Engineering, pp. 497–500, Washington, DC, USA. IEEE Computer Society. ISBN 978-0-7695-3881-5.
ŠŤASTNÝ, J. et al., 2011: Augmented reality usage for prototyping speed up. Acta univ. agric. et silvic. Mendel. Brun., LIX, 2, pp. 353–360. ISSN 1211-8516.

Address
Ing. David Procházka, Ph.D., Ing. Tomáš Koubek, Ústav informatiky, Mendelova univerzita v Brně,
Zemědělská 1, 613 00 Brno, Česká republika, e-mail: [email protected]

