Augmented Reality: Department of ECE, Brill
ABSTRACT
Augmented reality is a technology that merges visual perception of real world environments
and objects with virtual, computer-generated content. Augmented reality systems achieve this
combination of the real and virtual using computers, displays, specialized devices for
geospatial and graphic alignment, wired and wireless networks and software.
CHAPTER 1
INTRODUCTION
Augmented reality is a technology that merges visual perception of real world
environments and objects with virtual, computer-generated content. The research literature
defines augmented reality as systems that have the following three characteristics: (1)
combine real and virtual content; (2) are interactive in real time and (3) are registered in 3-
D [1:356].
Augmented reality systems achieve this combination of the real and virtual using
computers, displays (head-mounted, hand-held, projected, screen and retinal), specialized
devices for geospatial and graphic alignment, wired and wireless networks and software.
Research in augmented reality draws on development from a number of other fields including
virtual reality, wearable and ubiquitous computing, and human computer interaction [2: 167].
Augmented reality is related to virtual reality and virtual worlds (such as Second Life) in its
use of virtual content; however, it does not fully immerse the user in a virtual environment.
Whereas virtual reality and virtual worlds immerse the subject in a computer-simulated
environment, augmented reality augments and annotates the natural environment with
virtual components. Augmented reality brings virtual reality into the real world and, in the
process, enhances what we can do in real-world scenarios.
There are many challenges to the widespread, everyday acceptance and use of augmented
reality applications. Alignment of virtual content with real-world environments is a complex
and vexing challenge. Wireless networks for mobile augmented reality applications are
inconsistent and often lack sufficient bandwidth for sophisticated applications. Head-
mounted displays, although improving constantly with advances in miniaturization of optical
and display technology, can be uncomfortable and socially awkward. All of these challenges
are being addressed. Augmented reality technology and applications are advancing rapidly.
The purpose of this report is to provide a background on this exciting field: it will cover the
history, technology, applications and future directions.
CHAPTER 2
TECHNOLOGY
The technology behind augmented reality draws from a number of disciplines and exhibits a
wide array of approaches to the problem of combining virtual and real-world objects and
scenes. Display technology solutions range from head-mounted, hand-held, projected, and
screen-based to what many consider the future of augmented reality – retinal displays.
Augmented reality applications can be stationary and tethered (wired to local networks),
mobile and wireless (using broadband wireless networks) or some mixture of the two. Few
technologies have explored as many different approaches to their problem space as
augmented reality. Human vision is the most reliable and complicated sense, providing
more than 70% of total sensory information [5:45]. With so much emphasis placed on the
visual, it is of utmost importance that augmented reality provide a seamless merging of the
virtual and real, and this is also its biggest challenge.
Registration
Alignment of computer-generated content with the real-world scene is accomplished
through two main approaches:
• Fiducials (markers)
• Image-based (markerless)
Fiducial registration. Fiducials are markers placed in the real-world scene to facilitate
alignment between computer-generated content and the world. These markers employ an
easily recognizable symbol or shape, preferably one that can be computationally recognized
in a maximum of different conditions and from any angle or alignment. The augmented
reality software deduces not only lateral alignments, but also depth and distance
information from the fiducials. Many popular augmented reality applications use fiducial
registration as it is the easiest to handle computationally.
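As a rough illustration of how depth and distance can be deduced from a fiducial, the sketch below estimates marker distance from its apparent size under a simple pinhole camera model. The focal length and marker size are hypothetical calibration values, not taken from any particular augmented reality system.

```python
# Sketch: estimating a fiducial marker's distance from its apparent size.
# Assumes a simple pinhole camera; focal_length_px and marker_size_m are
# hypothetical calibration values for illustration only.

def marker_distance(focal_length_px, marker_size_m, marker_size_px):
    """Depth = focal_length * real_size / apparent_size (pinhole model)."""
    return focal_length_px * marker_size_m / marker_size_px

# A 10 cm fiducial seen as 50 px wide by a camera with an 800 px focal length:
d = marker_distance(800.0, 0.10, 50.0)
print(round(d, 2))  # 1.6 metres from the camera
```

Real systems recover the full 3D pose (position and orientation) from the four marker corners, but the same size-to-depth relationship underlies the computation.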
Image-based registration. Markerless or image-based registration relies on a number of
environmental cues for alignment of computer-generated content with real-world scenes.
There are edge-based methods that use computer vision algorithms to detect object
boundaries. There are also texture-based methods that isolate and uniquely identify points
on a surface and then correlate the points in real-time to incorporate motion (both of the
real-world objects and of the user) while maintaining registration. A problem with
image-based registration is that for edge- or texture-based methods, a baseline scene must
be established. This is not always possible with real-world augmented reality applications;
however, faster processors and new approaches have enabled applications to establish
baselines on the fly.
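The edge-based cues mentioned above rest on low-level image operations such as gradient filters. The sketch below applies a simple horizontal difference filter to a tiny made-up grayscale image; the strong responses mark a vertical boundary of the kind edge-based registration methods track.

```python
# Sketch of an edge-based cue: a horizontal central-difference filter applied
# to a tiny grayscale image. The image data is made up for illustration; real
# systems use full computer vision pipelines over camera frames.

def horizontal_edges(img):
    """Return |right - left| differences for each interior pixel of each row."""
    return [[abs(row[x + 1] - row[x - 1]) for x in range(1, len(row) - 1)]
            for row in img]

# A dark-to-bright vertical boundary produces strong responses at the edge:
image = [[0, 0, 0, 9, 9, 9],
         [0, 0, 0, 9, 9, 9]]
edges = horizontal_edges(image)
print(edges)
```

Texture-based methods build on the same foundation, isolating distinctive points rather than whole boundaries and correlating them frame to frame.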
Tracking. Tracking allows augmented reality applications to properly render the virtual
components of a scene as the user’s point of view shifts. Virtual objects should not follow
the user’s gaze around a scene, unless that is the intent of the designers. Likewise, virtual
objects should not tilt if the user tilts his or her head. Tracking data is used by the
augmented reality application to make sure that the virtual and real components of a scene
align properly regardless of the position of the user’s head and the direction of gaze. Tracking can
be achieved with a wide variety of different technologies that are based on different
physical principles. Mechanical, magnetic, acoustic and optical tracking approaches are
commonly used [5:213]. Not all augmented reality applications require precise tracking.
Popular modern applications, such as those that employ cell phones, need only be
concerned with aligning virtual content with the camera’s view of the scene.
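The idea that virtual objects must not follow the user's gaze can be sketched in a few lines: the tracker reports the head's orientation, and rendering applies the inverse rotation so the object stays fixed in the world. The 2D yaw-only geometry and the angles below are simplifying assumptions.

```python
import math

# Sketch: keeping a virtual object world-fixed as the head rotates (yaw only).
# The tracker reports the head's yaw; rendering applies the inverse rotation
# so the object does not follow the user's gaze. Angles are hypothetical.

def world_to_view(point, head_yaw_rad):
    """Rotate a world-space 2D point by -head_yaw into view space."""
    x, y = point
    c, s = math.cos(-head_yaw_rad), math.sin(-head_yaw_rad)
    return (x * c - y * s, x * s + y * c)

# An object 2 m straight ahead; the user turns the head 90 degrees:
vx, vy = world_to_view((0.0, 2.0), math.pi / 2)
# The object now lies to the side of the new gaze direction, as expected.
```

A full system does the same with 3D rotations and translations, but the principle is identical: view-space positions are recomputed from tracking data every frame while world-space positions stay fixed.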
Displays
A wide variety of display technologies are employed for augmented reality applications. The
main categories of display are:
• Optical
• Video
• Hand-held
• Projected
• Retinal
• Screen-based
Display technologies vary according to requirements. Optical and video displays are
considered head-mounted displays and afford the user maximum freedom of motion. For
head-mounted displays, weight and balance of equipment is an important consideration for
comfort and freedom of movement. If stereoscopic vision is required, the distance between
the user’s eyes must be calibrated for accurate depth imaging.
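The need to calibrate the distance between the user's eyes follows from simple stereo geometry: perceived depth scales with the assumed eye separation. The sketch below uses an idealized stereo model with made-up numbers to show how a few millimetres of miscalibration shifts every depth estimate.

```python
# Sketch: why inter-eye distance matters for stereoscopic depth. Under an
# idealized stereo model, depth = focal_length * eye_separation / disparity,
# so a mis-calibrated separation scales every perceived depth. All values
# here are illustrative assumptions, not real calibration data.

def stereo_depth(focal_px, ipd_m, disparity_px):
    """Depth of a point seen with the given pixel disparity between eyes."""
    return focal_px * ipd_m / disparity_px

true_depth = stereo_depth(800.0, 0.064, 16.0)   # calibrated IPD of 64 mm
wrong_depth = stereo_depth(800.0, 0.070, 16.0)  # IPD assumed 6 mm too wide
print(true_depth, wrong_depth)  # the error inflates depth by ~9%
```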
Optical displays. Also known as see-through displays, optical displays allow the user to view
the real world directly with the addition of computer-generated graphics or text
superimposed on the scene. Optical displays are worn like glasses with image-generating and
blending components placed in a location that does not interfere with vision. An optical
combiner, typically a half-silvered mirror, is placed in the optical path of the viewer. Image-
generation devices, typically tiny LCD (liquid crystal display) screens, are situated at the side
of the user’s head, and their content conveyed via relay optics to the optical combiner.
Video displays. Video displays combine live video of real-world scenes with
computer-generated graphics or text. With video displays, the user does not view the real
world directly, but from the point of view of two small video cameras mounted on the
headset. Chroma-key (green screen) techniques, like those employed to combine
meteorologists with weather maps on the television news, are used to fuse video and
generated imagery. Although resolution is lower with video displays than optical, the
combination of generated and real-world imagery is much more seamless, as the entire visual
field can be rendered together for display to the user.
Hand-held displays. Cell phones, PDAs and tablet PCs are all examples of hand-held
augmented reality displays. Hand-held displays incorporate all aspects of augmented reality
equipment in one device: processor, memory, camera, display and interactive components.
Imagery from the device’s camera is combined with generated imagery to produce the
augmented scene on the screen.
While they do not afford the immersive experience of optical and video displays, hand-held
displays have a unique advantage: they employ widely accepted, inexpensive equipment that
most people already own or have access to. These devices have the greatest potential for
bringing augmented reality applications to a mass audience.
Projected displays. Projected displays are employed by location-specific augmented reality
applications. Museums, art galleries and scientific simulations are all candidates for projected
augmented reality. A camera captures the real-world scene, processors combine real and
virtual elements, and a projector projects the scene onto a screen or wall. Projected displays
are limited by their physical immobility as well as by the resolution and sharpness of the
projected imagery.
Retinal displays. Retinal displays project light directly onto the retina. This eliminates the
need for screens and imaging optics, theoretically allowing for very high resolutions and wide
field of view [5:48]. Low-powered lasers have been used to project images directly onto the
retina, but current technology limits the display to monochromatic red light. The US military
has pioneered the use of this technology: the Stryker armored vehicle augmented reality
system has a component that projects battlefield computer imagery onto the commander’s
retina.
Screen-based displays. Screen-based displays mix real and virtual imagery for display on a
regular computer or video monitor. This technology is usually combined with webcams to
produce personal computer-based augmented reality applications. Like hand-held displays,
this technology has a high potential for mass acceptance and has already seen a proliferation
of internet-based advertising applications that highlight a specific product.
Software
Early augmented reality researchers had to write all of their software from scratch.
This made research and development highly time consuming and required extensive
programming resources for each project. In recent years, frameworks for augmented reality
software development have emerged, making the creation of robust applications possible in
a fraction of the time and with far fewer bugs and errors than the early adopters faced. The
most popular frameworks in augmented reality software development are:
• ARVIKA
• ARTag
• ARToolkit
ARVIKA. ARVIKA is a consortium funded by the German government to research and create
augmented reality applications, and it is also the name of their framework for application
development. ARVIKA supports stationary applications using high-end graphics systems, as
well as mobile systems with lower resolution graphics. The principal mobile front-end used
by ARVIKA is a standard web browser with a plug-in. Particularly suited to industrial
applications, ARVIKA applications specialize in incorporating CAD (computer-aided design)
drawings and supporting text and graphics into augmented reality applications. ARVIKA’s
collaboration tools foster interaction between mobile users and lab- or factory-based experts
to facilitate remote support for highly technical repair and maintenance applications [6:244].
ARTag. ARTag, developed by the National Research Council of Canada, is a fiducial marker
system for augmented reality. ARTag markers are digitally generated and scientifically verified
as a trusted reference for augmented reality applications. ARTag employs special black-and-
white square markers to register virtual content accurately relative to the real world. Markers
are printed and placed where users would like the virtual elements to appear [7:5-6].
ARToolkit. ARToolkit is an open source software library for building augmented reality
applications. Registration is accomplished with fiducial markers, and tracking is done using
computer vision algorithms. ARToolkit supports both optical and video devices and can also
run on a PC or Mac with a USB webcam. A version of the toolkit uses Flash (a popular graphics
animation program for the web) to display virtual and real content in a YouTube-like viewer.
Wearable Computing
An essential ingredient for many augmented reality applications is complete portability of
hardware. For augmented reality to exist in the world while not being tethered to a bundle of
wires, an independent but related technology, wearable computing, is available. Wearable
computers are not solely used for augmented reality applications: virtual reality, multiplayer
gaming environments, and ubiquitous computing hobbyists and researchers all employ
aspects of wearable computing.
The onset of 4G is much closer than that of IPv6. Rollouts of large carriers’ 4G networks are
scheduled to begin in late 2009 into 2010. We will undoubtedly start to see the development
of serious mobile augmented reality applications soon after the arrival of 4G, as the
technology for full-blown 3D augmentation has been a capability of tethered augmented
reality applications for a number of years already.
For a wearable augmented reality system, there is still not enough computing power to create
stereo 3-D graphics. So researchers are using whatever they can get out of laptops and
personal computers, for now. Laptops are just now starting to be equipped with graphics
processing units (GPUs). Toshiba just added an NVIDIA GPU to their notebooks that is able to
process more than 17-million triangles per second and 286-million pixels per second, which
can enable CPU-intensive programs, such as 3-D games. Notebooks still lag far behind,
however: NVIDIA has developed a custom 300-MHz 3-D graphics processor for Microsoft's
Xbox game console that can produce 150 million polygons per second, and polygons are more
complicated than triangles. So you can see how far mobile graphics chips have to go before
they can create smooth graphics like the ones you see on your home video-game system.
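To put the throughput figures above in perspective, the back-of-the-envelope calculation below compares the raw pixel demand of a modest stereo head-mounted display against the 286-million-pixels-per-second notebook GPU mentioned. The resolution and frame rate are assumptions chosen for illustration.

```python
# Back-of-the-envelope check of the figures above: pixel throughput demanded
# by a stereo head-mounted display versus the quoted notebook GPU fill rate.
# The VGA resolution and 30 Hz frame rate are illustrative assumptions.

def pixels_per_second(width, height, fps, eyes=2):
    """Raw pixels per second a stereo display must receive."""
    return width * height * fps * eyes

demand = pixels_per_second(640, 480, 30)  # modest stereo VGA at 30 Hz
print(demand)                  # 18432000 pixels per second
print(demand < 286_000_000)    # raw fill rate is not the bottleneck
```

The comparison suggests that, as the text indicates, the shortfall for mobile 3-D augmentation lies in geometry processing (polygons per second) rather than in raw pixel fill rate.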
CHAPTER 3
WORKING
AR systems track the position and orientation of the user's head so that the overlaid material
can be aligned with the user's view of the world. Through this process, known as registration,
graphics software can place a three-dimensional image of a teacup, for example, on top of a
real saucer and keep the virtual cup fixed in that position as the user moves about the room.
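The teacup example can be sketched directly: the virtual cup is anchored at fixed world coordinates, and as the user's head moves only its view-space position is recomputed. The coordinates below are hypothetical, and the geometry is reduced to translation for clarity.

```python
# Sketch of registration: a virtual teacup is anchored at fixed world
# coordinates; as the user's head moves, only the view-space position is
# recomputed, so the cup stays "on the saucer". Positions are made up,
# and head rotation is omitted to keep the example minimal.

def to_view_space(world_point, head_position):
    """View-space position = world position relative to the head."""
    return tuple(w - h for w, h in zip(world_point, head_position))

cup_world = (1.0, 0.0, 3.0)                           # fixed: on the saucer
view_a = to_view_space(cup_world, (0.0, 0.0, 0.0))    # user at the origin
view_b = to_view_space(cup_world, (0.5, 0.0, 1.0))    # user steps forward
# The view-space coordinates change, but cup_world never does.
```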
AR systems employ some of the same hardware technologies used in virtual-reality research,
but there's a crucial difference: whereas virtual reality brashly aims to replace the real world,
augmented reality respectfully supplements it.
Augmented reality is still in an early stage of research and development at various universities
and high-tech companies. Eventually, possibly by the end of this decade, we will see the first
mass-marketed augmented-reality system, which one researcher calls
"the Walkman of the 21st century." What augmented reality attempts to do is not only
superimpose graphics over a real environment in real-time, but also change those graphics to
accommodate a user's head- and eye- movements, so that the graphics always fit the
perspective. Three components are needed to make an augmented-reality system work: a
head-mounted display, a tracking system and mobile computing power.
Optical see-through device: An optical see-through display places partially transmissive
optical combiners in front of the user's eyes, so that the real world is viewed directly while
computer-generated imagery is reflected into the line of sight. With a separate display for
each eye, the view can be in stereo. Sony makes a see-through display that some researchers
use, called the Glasstron.
Video See-through device: In contrast, a video see-through display uses video mixing
technology, originally developed for television special effects, to combine the image from a
headworn camera with synthesized graphics. The merged image is typically presented on an
opaque head-worn display. With careful design, the camera can be positioned so that its
optical path is close to that of the user's eye; the video image thus approximates what the
user would normally see. As with optical see-through displays, a separate system can be
provided for each eye to support stereo vision.
Video composition can be done in more than one way. A simple way is to use chroma-keying:
a technique used in many video special effects. The background of the computer graphic
images is set to a specific color, say green, which none of the virtual objects use. Then the
combining step replaces all green areas with the corresponding parts from the video of the
real world. This has the effect of superimposing the virtual objects over the real world. A more
sophisticated composition would use depth information. If the system had depth information
at each pixel for the real world images, it could combine the real and virtual images by a pixel-
by-pixel depth comparison. This would allow real objects to cover virtual objects and vice-
versa.
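The two composition strategies just described can be shown on single pixels: chroma-key replacement of a reserved background colour, and a per-pixel depth comparison that lets real objects occlude virtual ones. The colours and depth values below are made up for illustration.

```python
# Sketch of the two composition strategies above, applied to single pixels.
# Colours are (R, G, B) tuples; all values are illustrative.

GREEN = (0, 255, 0)  # reserved background colour no virtual object uses

def chroma_key(virtual_px, real_px):
    """Show the real world wherever the virtual frame is pure green."""
    return real_px if virtual_px == GREEN else virtual_px

def depth_composite(virtual_px, virtual_depth, real_px, real_depth):
    """Nearer surface wins, so real objects can cover virtual ones and vice-versa."""
    return virtual_px if virtual_depth < real_depth else real_px

real = (120, 90, 60)
print(chroma_key(GREEN, real))                     # real pixel shows through
print(depth_composite((255, 0, 0), 2.0, real, 1.5))  # nearer real pixel wins
print(depth_composite((255, 0, 0), 1.0, real, 1.5))  # nearer virtual pixel wins
```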
A different approach is the virtual retinal display, which forms images directly on the retina.
These displays, which Microvision is developing commercially, literally draw on the retina
with low-power lasers whose modulated beams are scanned by microelectromechanical
mirror assemblies that sweep the beam horizontally and vertically. Potential advantages
include high brightness and contrast, low power consumption, and large depth of field.
Each of the approaches to see-through display design has its pluses and minuses. Optical see-
through systems allow the user to see the real world with full resolution and field of view. But
the overlaid graphics in current optical see-through systems are not opaque and therefore
cannot completely obscure the physical objects behind them. As a result, the superimposed
text may be hard to read against some backgrounds, and the three-dimensional graphics may
not produce a convincing illusion. Furthermore, although a user focuses physical objects
depending on their distance, virtual objects are all focused in the plane of the display. This
means that a virtual object that is intended to be at the same position as a physical object
may have a geometrically correct projection, yet the user may not be able to view both
objects in focus at the same time.
In video see-through systems, virtual objects can fully obscure physical ones and can be
combined with them using a rich variety of graphical effects. There is also no discrepancy
between how the eye focuses virtual and physical objects, because both are viewed on the
same plane. The limitations of current video technology, however, mean that the quality of
the visual experience of the real world is significantly decreased, essentially to the level of
synthesized graphics, with everything focusing at the same apparent distance. At present, a
video camera and display are no match for the human eye.
An optical approach has the following advantages over a video
approach:
• Simplicity: Optical blending is simpler and cheaper than video blending. Optical approaches
have only one "stream" of video to worry about: the graphic images. The real world is seen
directly through the combiners, with a time delay of generally a few nanoseconds. Video
blending, on the other hand, must deal with separate video streams for the real and virtual
images. The two streams must be properly synchronized or temporal distortion results. Also,
optical see-through HMDs with narrow field-of-view combiners offer views of the real world
that have little distortion. Video cameras almost always have some amount of distortion that
must be compensated for, along with any distortion from the optics in front of the display
devices. Since video requires cameras and combiners that optical approaches do not need,
video will probably be more expensive and complicated to build than optical-based systems.
• Resolution: Video blending limits the resolution of what the user sees, both real and virtual,
to the resolution of the display devices. With current displays, this resolution is far less than
the resolving power of the fovea. Optical see-through also shows the graphic images at the
resolution of the display device, but the user's view of the real world is not degraded. Thus,
video reduces the resolution of the real world, while optical see-through does not.
• Safety: Video see-through HMDs are essentially modified closed-view HMDs. If the power
is cut off, the user is effectively blind. This is a safety concern in some applications. In contrast,
when power is removed from an optical see-through HMD, the user still has a direct view of
the real world. The HMD then becomes a pair of heavy sunglasses, but the user can still see.
• No Eye Offset: With video see-through, the user's view of the real world is provided by the
video cameras. In essence, this puts his "eyes" where the video cameras are. In most
configurations, the cameras are not located exactly where the user's eyes are, creating an
offset between the cameras and the real eyes. The distance separating the cameras may also
not be exactly the same as the user's interpupillary distance (IPD). This difference between
camera locations and eye locations introduces displacements from what the user sees
compared to what he expects to see. For example, if the cameras are above the user's eyes,
he will see the world from a vantage point slightly taller than he is used to.
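The eye-offset displacement can be quantified with simple geometry: a camera mounted above the eye views a nearby point at a different vertical angle than the eye would. The 5 cm offset and 1 m distance below are assumptions chosen for illustration, and the geometry is reduced to 2D.

```python
import math

# Sketch of the camera/eye offset problem: a camera mounted 5 cm above the
# eye sees a nearby point at a different vertical angle than the eye would.
# The offset and distance are illustrative assumptions; geometry is 2D.

def vertical_angle_deg(height_offset_m, distance_m):
    """Angle below horizontal at which a point at eye height appears."""
    return math.degrees(math.atan2(height_offset_m, distance_m))

eye = vertical_angle_deg(0.00, 1.0)  # point at eye height, 1 m away: 0 degrees
cam = vertical_angle_deg(0.05, 1.0)  # camera 5 cm above the eye: ~2.9 degrees
print(eye, round(cam, 2))
```

The discrepancy shrinks with distance, which is why the offset is most noticeable when manipulating nearby objects.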
Video blending offers the following advantages over optical blending:
• Flexibility in composition strategies: A basic problem with optical see-through is that the
virtual objects do not completely obscure the real-world objects, because the optical
combiners allow light from both virtual and real sources. Building an optical see-through HMD
that can selectively shut out the light from the real world is difficult. Any filter that would
selectively block out light must be placed in the optical path at a point where the image is in
focus, which obviously cannot be the user's eye. Therefore, the optical system must have two
places where the image is in focus: at the user's eye and the point of the hypothetical filter.
This makes the optical design much more difficult and complex. No existing optical see-
through HMD blocks incoming light in this fashion. Thus, the virtual objects appear ghostlike
and semi-transparent. This damages the illusion of reality because occlusion is one of the
strongest depth cues. In contrast, video see-through is far more flexible about how it merges
the real and virtual images. Since both the real and virtual are available in digital form, video
see-through compositors can, on a pixel-by-pixel basis, take the real, or the virtual, or some
blend between the two to simulate transparency. Because of this flexibility, video see-through
may ultimately produce more compelling environments than optical see-through approaches.
• Wide field-of-view: Distortions in optical systems are a function of the radial distance away
from the optical axis. The further one looks away from the center of the view, the larger the
distortions get. A digitized image taken through a distorted optical system can be undistorted
by applying image processing techniques to unwarp the image, provided that the optical
distortion is well characterized. This requires significant amounts of computation, but this
constraint will be less important in the future as computers become faster. It is harder to build
wide field-of-view displays with optical see-through techniques. Any distortions of the user's
view of the real world must be corrected optically, rather than digitally, because the system
has no digitized image of the real world to manipulate. Complex optics are expensive and add
weight to the HMD. Wide field-of-view systems are an exception to the general trend of
optical approaches being simpler and cheaper than video approaches.
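The digital unwarping described above can be sketched with a one-parameter radial distortion model and a fixed-point inversion, the simplest form of the image-processing correction the text refers to. The distortion coefficient is a made-up calibration value.

```python
# Sketch of digital unwarping: a one-parameter radial distortion model,
# r_distorted = r * (1 + k * r^2), inverted by fixed-point iteration.
# The coefficient k is a made-up calibration value; real systems fit
# several coefficients from calibration images.

def distort(x, y, k):
    """Apply radial distortion to a normalized image coordinate."""
    s = 1.0 + k * (x * x + y * y)
    return x * s, y * s

def undistort(xd, yd, k, iterations=20):
    """Invert the model iteratively: refine the undistorted estimate."""
    x, y = xd, yd
    for _ in range(iterations):
        s = 1.0 + k * (x * x + y * y)
        x, y = xd / s, yd / s
    return x, y

xd, yd = distort(0.4, 0.3, k=0.2)   # where the optics put the point
xu, yu = undistort(xd, yd, k=0.2)   # recovered undistorted position
```

Optical see-through systems have no digitized real-world image to run such a correction on, which is why their distortions must be corrected optically.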
• Real and virtual view delays can be matched : Video offers an approach for reducing or
avoiding problems caused by temporal mismatches between the real and virtual images.
Optical see-through HMDs offer an almost instantaneous view of the real world but a delayed
view of the virtual. This temporal mismatch can cause problems. With video approaches, it is
possible to delay the video of the real world to match the delay from the virtual image stream.
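Delaying the real-world video to match the virtual stream amounts to a small frame buffer. The sketch below holds camera frames in a queue sized to the rendering latency; the 2-frame latency figure is an assumption for illustration.

```python
from collections import deque

# Sketch of delay matching: buffer the real-world video by the number of
# frames the virtual stream takes to render, so both reach the compositor
# together. The 2-frame rendering latency is an illustrative assumption.

RENDER_LATENCY_FRAMES = 2
buffer = deque(maxlen=RENDER_LATENCY_FRAMES + 1)

def delayed_real_frame(new_frame):
    """Push the newest camera frame; return the one matching the virtual stream."""
    buffer.append(new_frame)
    return buffer[0] if len(buffer) == buffer.maxlen else None

# The first frames emerge only after the pipeline fills:
out = [delayed_real_frame(f) for f in ["f0", "f1", "f2", "f3"]]
print(out)
```

The cost of this approach is that the entire scene, real and virtual, is delayed; the benefit is that the two streams no longer drift apart temporally.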
• Additional registration strategies: In optical see-through, the only information the system
has about the user's head location comes from the head tracker. Video blending provides
another source of information: the digitized image of the real scene. This digitized image
means that video approaches can employ additional registration strategies unavailable to
optical approaches.
• Easier to match the brightness of real and virtual objects: Since both the real and the virtual
images are available in digital form, their relative brightness can be adjusted to match.
Both optical and video technologies have their roles, and the choice of technology depends
on the application requirements. Many of the mechanical assembly and repair prototypes use
optical approaches, possibly because of the cost and safety issues. If successful, the
equipment would have to be replicated in large numbers to equip workers on a factory floor.
In contrast, most of the prototypes for medical applications use video approaches, probably
for the flexibility in blending real and virtual and for the additional registration strategies
offered.
CHAPTER 4
APPLICATIONS
With little fanfare, augmented reality systems have been used in a wide variety of fields over
the past two decades. While many of the applications described here are purely research
prototypes, quite a few are beginning to see widespread acceptance, particularly the mobile
applications. This survey of augmented reality applications explores technological innovation
in a full spectrum of fields:
• Medical
• Military
• Industrial / Manufacturing
• Mobile
• Entertainment
• Education
Medical Applications
Augmented reality applications are appearing in both training and real-world medical
scenarios. The capability to look within a body without cutting it open has long been a goal of
medical technology research. Augmented reality systems are realizing this goal with displays
that mix real-world views of patients with virtual internal views facilitated by real-time
ultrasound, magnetic resonance imaging (MRI), computed tomography (CT) scan, and
laparoscopic data.
Augmented ultrasound. Researchers at the Department of Computer Science at the
University of North Carolina, Chapel Hill, have pioneered the development of medical
applications of augmented reality. Physicians were outfitted with head-mounted displays that
enabled viewing of a pregnant woman with an ultrasound scan of the fetus overlaying the
woman’s abdomen. Walking around the patient allowed the physicians to observe the fetus
from different angles and to diagnose its position in relation to the woman’s internal organs
[8:20].
UNC researchers also developed an application allowing physicians to see directly inside a
patient using ultrasound echographic imaging and laparoscopic range imaging combined with
a video head-mounted display. A high performance graphic computer was used to generate
imagery from the imaging systems and render it for integration with the live video feed of the
patient [12].
These systems provide physicians treating patients with accurate, realistic interior
viewpoints of the target anatomy. They also mixed computer models of a patient’s brain and
tumor with a live video image of the patient to plan the removal of diseased tissue [8:20].
A birth simulator. Siemens Corporate Research has developed an augmented reality system
for real-time augmentation in medical procedures (RAMP). It is optimized for accurate
registration, high resolution and fast refresh rate. Using RAMP, the Orthopedic Clinic of
Munich, Germany has developed a birth simulator for medical training.
The birth simulator provides a 3D visualization of the birth process on a head-mounted
display that supports stereo vision. Head movements are tracked and the display altered
accordingly, providing depth cues as if the user were viewing a live baby inside the mother.
The imagery is overlaid onto an anatomically correct partial dummy representing the mother.
The skin and hip bones of the virtual mother can be displayed or removed for different depth-
level visualization.
The user also sees real-time vital statistics for the mother and the baby projected on the head-
mounted display field of view. Blood pressure, heart rate, pain and oxygen supply data are
provided. Biomechanical data are provided as well, including position of the baby’s head,
friction in the birth canal and tissue forces. Sensors and force-feedback allow trainees to apply
forceps to the proper location on the articulated dummy and feel (and see) dynamic
simulation of a birth procedure.
The birth simulator system uses a high resolution video head-mounted display. Visual tracking
data is used for accurate rendering of virtual objects. The all-video nature of the system
facilitates third-party monitoring of the training process via traditional video monitors. Initial
results of the simulator in training were positive. Further development of the RAMP system
with other medical applications is in progress [13].
Military Applications
The US military, in conjunction with major defense contractors and
aerospace companies, has been researching and experimenting with augmented reality
systems for the better part of two decades. Their stated goal is to improve situational
awareness for pilots and soldiers and to facilitate enhanced communication with their peers
and the chain of command. Heads-up displays (transparent displays of data mounted in the
pilot’s line of sight) have long been a reality for fighter jet pilots, and recent developments
make use of advanced eye-tracking to allow pilots to acquire targets and fire upon them
simply by looking at them.
Land Warrior. Land Warrior is a US Army wearable computing and augmented reality
application that is part of the Future Combat Systems initiative. It combines commercial, off-
the-shelf technology and standard military equipment, integrating weapon systems
(M16 rifle or M4 carbine) with video, thermal and laser sighting in a head-mounted display
that overlays situational awareness data with real world views in real time [14].
The head-mounted display shows digital maps, intelligence information and troop locations
together with imagery from the weapon sighting systems. Thermal imaging enables the
soldier to see through obstacles as well as offering greatly enhanced night vision [15]. A GPS
receiver tracks the soldier’s location.
Battlefield Augmented Reality System (BARS). The system contains detailed 3D models of
real-world objects in the environment which are
used to generate the registered graphic overlays. The models are stored in a shared database
which also stores metadata such as descriptions, threat classifications and object relevance
(to each other and to the mission). Intelligent filters display or hide information based on
physical location and contextual information gleaned from the shared database.
User interaction with the system is facilitated by a handheld wireless mouse which
superimposes a cursor on the scene. Speech and gesture-based interaction is also being
developed.
The prototype BARS consists of:
• GPS receiver
• Orientation tracker
Industrial / Manufacturing Applications
Many modern manufacturing systems have largely abandoned the “one size fits all” approach
in favor of methodologies that allow highly customized, one-off versions of a product line as
customer demands become increasingly specialized. Workers must consult multiple versions of
assembly guides, templates, parts lists and other related documents in order to fulfill
customer orders.
Augmented reality can provide hands-free visual overlays of dynamic manufacturing
information targeted to specific, highly controllable automated and semi-automated
assembly environments. Problems of registration and tracking within busy, noisy factory
environments remain; however, these stumbling blocks are sure to be overcome in the pursuit
of the competitive advantages afforded by augmented reality and related technologies.
Boeing’s wire bundle assembly project. The first application of augmented reality
technology to manufacturing was Boeing’s wire bundle assembly project, started in 1990. The
term “augmented reality” was coined by Tom Caudell, a researcher on the project. Boeing’s
Everett, Washington engineering and manufacturing facility, the world’s largest factory
building, was a logical choice of sites to introduce this ground-breaking technology.
At Boeing, wiring bundles are assembled prior to installation in aircraft. The traditional
method is to prefabricate one or more 3’ by 8’ easel-like boards called formboards. Plotter
paper glued to the surface of the boards contains full-scale schematic diagrams of the wire
bundles to be assembled. Workers refer to the diagrams and also to stacks of printed material
in assembling the bundles on pegs mounted to the boards.
Boeing researchers developed an augmented reality system using a stereo optical
head-mounted display. Registration and tracking were limited to the highly controllable
environment of the formboard. When a worker looked through the headset at the
formboard, the 3D path of the next wire to mount in the bundle was indicated by a colored
line superimposed on the view. The wire gauge and type were indicated in a graphic shown
to the side. As the worker changed his or her perspective on the formboard, the graphical
indicators appeared to stay in the same location, as if painted on the board. With this new
approach, workers were able to better concentrate on the accuracy of the bundle assembly
without having to look away from the work to consult documents or change formboards for
every different assembly required [8:17-19].
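Keeping the overlaid wire path registered as the worker's viewpoint changes comes down to re-projecting the 3D path through the current head pose every frame. A minimal pinhole-camera sketch; the focal length and image-center values are illustrative, not Boeing's actual calibration:

```python
def project_point(point_board, head_pose, focal_px=800.0, cx=320.0, cy=240.0):
    """Project a 3D point (board/world frame) into 2D display pixels.

    head_pose is (R, t): a 3x3 rotation matrix and a translation vector
    taking world coordinates into the eye/camera frame. Focal length and
    image center are placeholder values for illustration.
    """
    R, t = head_pose
    # World -> eye frame: p_eye = R @ p_world + t, written out explicitly.
    p = [sum(R[i][j] * point_board[j] for j in range(3)) + t[i] for i in range(3)]
    x, y, z = p
    if z <= 0:
        return None  # behind the viewer; nothing to draw
    # Pinhole model: scale by focal length over depth, offset to image center.
    return (cx + focal_px * x / z, cy + focal_px * y / z)
```

Re-running this over every vertex of the wire's 3D path, each frame, is what makes the colored line appear "painted on" the formboard as the head moves.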
DaimlerChrysler’s augmented reality initiatives. DaimlerChrysler has used virtual reality
systems for design and modeling of automotive parts and assemblies. Recently it has
enhanced and extended its virtual reality initiatives to include augmented reality applications.
Its in-house virtual reality system, known as DBView, was used as the virtual image generation
platform for its augmented reality initiatives.
Truck wiring harness design. Purchasers of DaimlerChrysler trucks have a high degree of
freedom in configuring their vehicles. Because of this, wiring must be individually planned for
virtually every truck produced. DaimlerChrysler developed an augmented reality system for
designing customized wiring configurations for truck assembly.
The system uses head-mounted displays to project a virtual geometric line known as a spline
curve representing the wiring within the structure of the truck chassis. The workers can
interact with the system, changing the path of the line using a 3D pointing device. Once they
have configured the optimum wiring path, the design is exported in the form of
manufacturing orders for subcontractors or for their own factories [6:217].
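A spline of this kind can be evaluated with the standard Catmull-Rom formula, which passes through every control point — convenient when each waypoint is dragged with a 3D pointing device. The choice of Catmull-Rom is an assumption for illustration; the source does not name DaimlerChrysler's exact formulation.

```python
def catmull_rom(p0, p1, p2, p3, t):
    """Evaluate one Catmull-Rom segment at t in [0, 1].

    Returns a 3D point between p1 and p2. The curve interpolates the
    control points themselves, so moving a waypoint with the pointer
    moves the wire path through that exact position.
    """
    return tuple(
        0.5 * (
            2 * p1[i]
            + (-p0[i] + p2[i]) * t
            + (2 * p0[i] - 5 * p1[i] + 4 * p2[i] - p3[i]) * t * t
            + (-p0[i] + 3 * p1[i] - 3 * p2[i] + p3[i]) * t * t * t
        )
        for i in range(3)
    )
```

Chaining segments over successive waypoint quadruples gives the full wiring path; the endpoints of each segment land exactly on the interior control points.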
Visualization of data in airplane cabins. DaimlerChrysler developed an application for
interpreting computational fluid dynamics data within an airplane cabin. The user wears an
optical head-mounted display and data such as air temperature, velocity and pressure are
overlaid in color-coded volumetric graphics, like transparent smoke or vapor clouds. After
an initial calibration step, the application can be run in any airplane cabin [6:218- 219].
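At its core, a color-coded overlay like this maps each scalar sample to a translucent color. A minimal sketch; the temperature range and the blue-to-red palette are assumptions, since the source does not give DaimlerChrysler's actual color scale:

```python
def temperature_to_rgba(value, vmin=15.0, vmax=35.0, alpha=0.35):
    """Map a scalar sample (e.g. cabin air temperature in deg C) to a
    translucent blue-to-red color for vapor-cloud style overlays.

    Range, palette and alpha are illustrative assumptions.
    """
    # Clamp and normalise the sample to [0, 1].
    s = min(1.0, max(0.0, (value - vmin) / (vmax - vmin)))
    # Linear blend: cold = blue, hot = red; low alpha keeps the cloud see-through.
    return (s, 0.0, 1.0 - s, alpha)
```

Applying this per voxel of the CFD grid, then rendering the voxels as partially transparent billboards or volumes, yields the "vapor cloud" effect described above.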
Motor maintenance and repair. Site-specific repair of car engines was the target of an
augmented reality initiative by DaimlerChrysler. Rather than having to look away from the
work area to reference a paper manual or CD-ROM, the workers wear head-mounted displays
connected to an augmented reality service and maintenance system that overlays repair
information on real-world car engines. The information is conveyed as both static text
overlays and as video and 3D animated graphics. DaimlerChrysler also developed a user-
friendly authoring system to build the sequence-based instructions called PowerSpace, using
the slide metaphor of Microsoft’s popular PowerPoint software [6:220-222].
BMW’s intelligent welding gun. Constructing prototypes of experimental vehicles presents a
challenge to automobile manufacturers. Since only a few cars of an experimental design
are ever built, the process is largely based on manual work. Automated factories cannot be
customized quickly enough to accommodate prototype construction. BMW turned to
augmented reality technology in order to streamline the prototype construction process. Stud
welding is a time-consuming process for prototype construction.
Typically, it is a two-person process: the first person reads the coordinates of a stud from a
computer printout and positions the locater arm; the second person marks the position of
the stud with an etching device. Typically around 300 studs need to be welded to every car
frame. Once all of the stud positions have been marked, the welders place the studs at the
specific locations using a welding gun.
BMW’s augmented reality application skips the two-person stud marking process. The system
guides welders directly to the exact stud locations using visual overlays on a video screen
attached to the welding gun. They found this to be a safer solution than the usual head-
mounted display which would restrict the welder’s field of view and compromise safety. This
approach added a layer of complexity to the tracking and registration of the system: it not
only had to track the position of the welder’s head and viewpoint, but also the position of the
welding gun.
In testing the intelligent welding gun, BMW found that workers using this technology were
able to quadruple their speed without any loss of precision compared to unaided workers
[6:334].
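The guidance shown on the gun-mounted screen reduces to the distance and turn angle from the tracked gun to the next stud position. A simplified 2D sketch of that computation (the real system tracks full 6-DOF poses of both the welder's viewpoint and the gun; the coordinate conventions here are assumptions):

```python
import math

def guidance(gun_tip, gun_heading, stud_pos):
    """Distance and signed bearing from the gun tip to the next stud.

    gun_tip and stud_pos are (x, y) in a car-frame coordinate system;
    gun_heading is the gun's pointing direction in radians. A 2D
    illustration only -- BMW's system works with full 6-DOF tracking.
    """
    dx, dy = stud_pos[0] - gun_tip[0], stud_pos[1] - gun_tip[1]
    dist = math.hypot(dx, dy)
    # Signed angle the welder must turn the gun to face the stud.
    bearing = math.atan2(dy, dx) - gun_heading
    # Wrap to (-pi, pi] so "turn left" vs "turn right" is unambiguous.
    bearing = (bearing + math.pi) % (2 * math.pi) - math.pi
    return dist, bearing
```

Rendering an arrow scaled by `bearing` and a countdown of `dist` is enough to steer the gun onto each of the roughly 300 stud positions without any separate marking step.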
Mobile Applications
Mobile augmented reality combines augmented reality technology with
mobile devices, including wearable computers, PDAs and mobile phones. Geospatial
positioning and registration is accomplished with built-in digital compasses, GPS units, and, in
the case of the iPhone, a technology known as location service. Location service uses a
combination of wifi, cellular tower location, and GPS to determine the geospatial location of
the iPhone user.
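One common textbook way to combine several position fixes of differing quality is inverse-variance weighting. Apple has not published the actual location-service algorithm, so the sketch below is only an illustration of the general idea:

```python
def fuse_position(fixes):
    """Combine GPS / wifi / cell-tower fixes, weighting by reported accuracy.

    Each fix is (lat, lon, accuracy_m). Inverse-variance weighting is a
    standard estimation technique; it is an assumption here, not Apple's
    documented method.
    """
    wsum = lat = lon = 0.0
    for flat, flon, acc in fixes:
        w = 1.0 / (acc * acc)  # tighter reported accuracy -> larger weight
        wsum += w
        lat += w * flat
        lon += w * flon
    return lat / wsum, lon / wsum
```

A coarse cell-tower fix with 500 m accuracy thus barely perturbs a 5 m GPS fix, while still providing a usable estimate indoors where GPS fails entirely.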
While mobile augmented reality generally does not provide the precision or resolution of
tethered, indoor augmented reality, it has one dominant factor in its favor: near ubiquity of
mobile phones, GPS devices, and their supporting infrastructure. Mobile augmented reality
applications may seem like toys today, but they are the leading edge of mass acceptance of
this up and coming technology.
LifeClipper. LifeClipper is a mobile augmented reality application that has evolved
significantly across three major release cycles: LifeClipper, LifeClipper2, and LifeClipper3.
The original LifeClipper, designed by Swiss artist Jan Torpus, incorporates a walking tour of
the medieval quarter of Basel, Switzerland. The user’s location is tracked via GPS, and
audiovisual augmentations relevant to the location are presented on the head-mounted
display.
The equipment consists of:
• Video camera
• Microphone
• GPS receiver
• Compass
from tour guide-style information to users’ comments and social media tags. The goal is to
foster collaborative, wiki-style augmentation of the world [21].
Wikitude World Browser. Wikitude World Browser is a mobile augmented reality application
developed by Mobilizy for Android-based phones. Android is an open source operating
system created by Google for use in mobile phones, PDAs and netbooks. Wikitude was
originally developed to overlay Wikipedia information onto real-world scenes using camera-
equipped mobile phones. The user points the cell phone camera at a scene, and, using the
phone’s built-in location technology, the World Browser overlays the scene with relevant
information from Wikipedia. The software accomplishes this by determining the longitude
and latitude of objects in the camera’s field of view using the GPS receiver and digital compass
of the phone and matching this data with coordinate-enhanced entries in Wikipedia, of which
there are approximately 600,000 [23].
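The core test — does a coordinate-tagged entry fall within the camera's field of view? — can be sketched with a small-distance bearing computation. The 60° field of view and the equirectangular approximation are assumptions for illustration, not Wikitude's published internals:

```python
import math

def in_camera_view(user, poi, heading_deg, fov_deg=60.0):
    """Decide whether a coordinate-tagged point of interest is in view.

    user and poi are (lat, lon) in degrees; heading_deg comes from the
    digital compass. Uses an equirectangular small-distance approximation,
    which is adequate at city scale and simpler than great-circle math.
    """
    dlat = poi[0] - user[0]
    dlon = (poi[1] - user[1]) * math.cos(math.radians(user[0]))
    # Compass bearing from the user to the POI, 0 = north, clockwise.
    bearing = math.degrees(math.atan2(dlon, dlat)) % 360.0
    # Smallest angular difference between that bearing and the heading.
    diff = abs((bearing - heading_deg + 180.0) % 360.0 - 180.0)
    return diff <= fov_deg / 2.0
```

Entries passing this test are then drawn at a horizontal screen offset proportional to `bearing - heading_deg`, which is what anchors each label over the real object.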
With the development of Wikitude.me, users can add content to Wikitude by creating unique
points of interest and location-specific hyperlinked media content and saving them to the
Wikitude database. Then, any World Browser-equipped mobile device will overlay the user’s
content on the scene [22]. This application, in use today, goes a long way toward realizing the
concept of massively augmented reality as envisioned by the LifeClipper3 designers.
Seer. A special version of Wikitude called Seer was introduced at the 2009 Wimbledon tennis
tournament. Developed for the Android G1 handset by IBM and the advertising agency Ogilvy,
it overlaid information on the phone’s camera view about tennis courts, restaurants and bars
and provided live updates from the tennis matches [23].
TAT Augmented ID. Swedish software and design company TAT has developed a mobile
augmented reality application that matches cell phone camera images of people’s faces with
information from social networking sites to present text overlays such as Facebook messages.
The subject’s image must be in TAT’s database in order to match up the information, and
users have the ability to register facial images with TAT for this purpose. Augmented ID uses
technology from Polar Rose to match facial characteristics with the TAT database, enabling
the application to consistently identify faces in different viewing angles and lighting situations
[24].
TwittARound. Developed by WebIt, TwittARound is a mobile augmented reality application
for the iPhone 3GS which overlays graphics on the real world indicating Twitter tweets that
are occurring nearby in real time. The application uses the iPhone’s location service to
pinpoint the user’s location, and the built-in compass to determine the user’s viewing
direction. Location-stamped tweets appear on the phone’s screen, showing the content of
the tweet and how far away the tweet-creator is located [25].
Yelp. Yelp is a social networking company that specializes in local search and user reviews of
businesses. Yelp has a popular iPhone app that has a hidden feature called Monocle. To launch
Monocle, iPhone users with the Yelp app installed must shake their phone three times. This
will activate an augmented reality overlay onto the live camera view showing icons for nearby
Yelp-reviewed businesses, including restaurants. The icons include distance and location
information so that users can easily find the nearby businesses [26].
The Touring Machine displays virtual flags which appear to be planted in various locations
across the Columbia University campus. The flags represent locations that have stories
associated with them. If the user selects a flag, the application displays a series of still photos
and video snippets with a narrator’s voice-over playing over headphones.
One story recounts the student anti-war protests at Columbia in 1968. Another story
describes the Bloomingdale Asylum, which previously occupied the current site of the
Columbia campus. The asylum’s buildings, rendered in 3D models, are overlaid at their
original locations on the optical head-mounted display. Meanwhile, the hand-held display
presents an interactive annotated timeline of the asylum’s history. The user can choose
different dates on the timeline and the application synchronizes the overlay of relevant
buildings on the head-mounted display [4:55].
Entertainment Applications
The entertainment industry is a fertile ground for augmented reality applications. The
promise of combining virtual imagery with real-world scenes, particularly for live-action
entertainment categories such as sporting events and concerts, opens up a world of new
possibilities.
ARQuake. id Software’s classic first-person shooter game Quake was massively popular soon
after its release in 1996. This was due, in part, to its innovative capability to be played by
groups over the internet.
ARQuake uses the Quake game engine and moves the game action into the real world.
Developed at the University of South Australia, ARQuake uses a head-mounted display,
notebook computer, head tracker and a GPS system to overlay virtual monsters onto the
player’s point of view. As the player’s head moves, the game calculates which virtual monsters
should appear [7:8].
Since the development of ARQuake, other commercially available augmented reality game
systems have come on the market, including board games by Beyond Reality and the thrilling
Zombie Attack, which overlays miniature animated 3D zombies on a game board with
integrated registration and graphics, viewable on a smartphone.
Virtual sets. Several AR applications have been developed with virtual sets, a compositing
system that merges real actors with virtual backgrounds, in real time and 3D. This is frequently
utilized in television sports news and other scenarios featuring live commentators
superimposed on graphically generated sets. With this technology the entertainment industry
has been able to reduce costs, as creating and storing virtual sets is frequently more cost
effective than building new physical sets from scratch [1:8].
First down. Anyone who has watched a US football game in the past several years cannot help
but notice the colored first down line superimposed on the playing field. Hockey viewers may
have seen a colored trail indicating the location and direction of travel of the puck. These are
augmented reality elements that have made their way into the mainstream with little fanfare.
In Great Britain, rugby fields and cricket pitches are branded by their sponsors via
augmentation of the live video feed: giant logos are inserted onto the fields for the benefit of
the television viewing audience [27:32].
Concert Augmentation. British new wave band Duran Duran was the first performing group
to use augmented reality in a live show. Working with Charmed Technologies, the group
deployed projection screens which enabled virtual animated characters to appear onstage
during their 2000 Pop Trash tour [27:32].
Education Applications
Augmented reality has been associated with educational institutions since its beginnings.
Much of the research and many of the breakthroughs have been accomplished by teams in
colleges and universities. Augmented reality applications are beginning to find their way into
elementary and secondary schools, made possible by inexpensive yet powerful hand-held
devices and personal computers and widely available authoring systems like ARToolKit.
BBC Jam storybooks for kids. The BBC has been a leader in funding augmented reality
application development in education. The first trial application, BBC Jam, is an online
learning service available in the UK. The application consists of a series of story packs available
for download. Using a standard personal computer equipped with a USB webcam, the story
packs include booklets with registration markers which become animated 3D pop-up books
when viewed on the PC screen via the webcam. Narration and other educational materials
accompany the story packs [7:8].
The Invisible Train. The Vienna University of Technology has developed an augmented reality
application for children called The Invisible Train. The application is written for PDAs and can
accommodate multiple players. The players set up a real wooden track and then can control
virtual trains superimposed on the track via the PDA screen. Players use the stylus to steer the
trains, switch tracks and control train speed [7:9].
AR Polygonal Modeling. Purdue University has developed an augmented reality application
for 3D modeling. Extending the capabilities of the popular 3DS Max modeling and animation
suite, AR Polygonal Modeling uses a head-mounted display augmented reality component to
create and manipulate 3D models that appear on a physical desktop marked with registration
points. 3DS Max’s tools are represented with 3D overlays and can be manipulated via a 3D
mouse: a wireless mouse attached to a rigid array of markers that the head-mounted
cameras track, enabling precise user interaction with the models [7:11-12].
Architecture and Urban Planning Applications
Augmented reality applications for architectural visualization with walk-through capabilities
are currently in development. Likewise, collaborative design applications are being developed
which facilitate shared virtual models and data projected onto a shared platform, such as a
table or desktop.
As with many other industries and practices, the ability to superimpose imagery on the real
world is a boon to architects and designers as they can plan their creations in situ and
collaborate with colleagues using shared models.
The ARTHUR project. German and British architecture and design firms, in collaboration with
Aalborg University in Denmark and University College London, have developed the
Augmented Round Table for Architecture and Urban Planning (ARTHUR). The application uses
optical augmented reality glasses developed by Saab Avionics to view virtual models of urban
design schemes. Using printed registration markers, the ARTHUR environment can be
implemented on a table or desktop, around which collaborators sit. Models are projected
over placeholder objects, and the collaborators can move model components by physically
moving the placeholders. Designers can model different pedestrian and vehicular traffic flows
through urban models and even add animations of people and cars for increased realism [28].
tasks. However, AR also introduces many high-level tasks, such as the need to identify what
information should be provided, what the appropriate representation for that data is, and how the
user should make queries and reports. For example, a user might want to walk down a street, look in
a shop window, and query the inventory of that shop. To date, few have studied such issues. However,
we expect significant growth in this area because research AR systems with sufficient capabilities are
now more commonly available. For example, recent work suggests that the creation and presentation
of narrative performances and structures may lead to more realistic and richer AR experiences.
Technological Limitations
Although we’ve seen much progress in the basic enabling technologies, their limitations still
prevent the deployment of many AR applications. Displays, trackers, and AR systems in
general need to become more accurate, lighter, cheaper, and less power consuming. By
describing problems from our common experiences in building outdoor AR systems, we hope
to impart a sense of the many areas that still need improvement. Displays such as the Sony
Glasstron are intended for indoor consumer use and aren’t ideal for outdoor use. The display
isn’t very bright and completely washes out in bright sunlight. The image has a fixed focus
that makes it appear several feet away from the user, often closer than the outdoor landmarks.
The equipment isn’t nearly as portable as desired. Since the user must wear the PC, sensors,
display, batteries, and everything else required, the end result is a cumbersome and heavy
backpack. Laptops today have only one CPU, limiting the amount of visual and hybrid tracking
that we can do. Operating systems aimed at the consumer market aren’t built to support real-
time computing, but specialized real-time operating systems don’t have the drivers to support
the sensors and graphics in modern hardware. Tracking in unprepared environments remains
an enormous challenge. Outdoor demonstrations today have shown good tracking only with
significant restrictions in operating range, often with sensor suites that are too bulky and
expensive for practical use. Today’s systems generally require extensive calibration
procedures that an end user would find unacceptably complicated. Many connectors such as
universal serial bus (USB) connectors aren’t rugged enough for outdoor operation and are
prone to breaking. While we expect some improvements to naturally occur from other fields
such as wearable computing, research in AR can reduce these difficulties through improved
tracking in unprepared environments and calibration free or auto calibration approaches to
minimize set-up requirements.
Increase Sales
One obstacle to online shopping is that products are not always correctly
represented, oftentimes due to poor photo availability. With augmented reality, it is
possible to visualize an object in its "true" form before actually making a purchase.
Enrich Content
Augmented reality is a data-adding system that offers cultural, security, and time
savings benefits. The technology provides additional information in real time in a
defined position or in a specific environment, without the user having to look for it.
For example, augmented reality can provide users information about historic sites on
a sightseeing tour, stops along a scenic drive, or even about plants and formations
seen on a bike ride.
Improve Brand Awareness
Technological innovations and developments are increasingly popular with users and
potential customers. A company or brand that chooses to use augmented reality
acquires leverage to gain visibility, but also a certain image. By taking advantage of
this trend, the brand or the company can both attract new audiences and retain its
existing customers.
With the arrival of advances in wireless broadband networking and device addressing (4G
and IPv6) the environment will be filled with microscopic tagging and sensing devices. Tag-
based environments will provide real-time feedback to augmented reality systems for
accurate registration of mobile applications. As for augmented reality applications, the
imagination runs wild with possibilities. Computing tasks will be freed from the desktop and
laptop and will accompany us wherever we want them. Virtual animals, humans, and
objects will proliferate, filling the landscape and cityscape with helpers, game characters,
tour guides, and much more. Virtual meetings with participants located anywhere on earth
can happen anywhere, free of the cumbersome and expensive equipment required for
today’s teleconferencing systems. Work, play, health care, manufacturing, education,
advertising and more will be transformed by the fully realized potential of augmented
reality.
CONCLUSION
Augmented reality is far behind Virtual Environments in maturity. Several commercial
vendors sell complete, turnkey Virtual Environment systems. However, no commercial vendor
currently sells an HMD-based Augmented Reality system. A few monitor-based “virtual set”
systems are available, but today AR systems are primarily found in academic and industrial
research laboratories.
The first deployed HMD-based AR systems will probably be in aircraft manufacturing. Both
Boeing and McDonnell Douglas are exploring this technology. The
former uses optical approaches, while the latter is pursuing video approaches. Boeing has
performed trial runs with workers using a prototype system but has not yet made any
deployment decisions. Annotation and visualization applications in restricted, limited range
environments are deployable today, although much more work needs to be done to make
them cost effective and flexible.
Applications in medical visualization will take longer. Prototype visualization aids have been
used on an experimental basis, but the stringent registration requirements and ramifications
of mistakes will postpone common usage for many years. AR will probably be used for medical
training before it is commonly used in surgery.
The next generation of combat aircraft will have Helmet Mounted Sights with graphics
registered to targets in the environment. These displays, combined with short-range
steerable missiles that can shoot at targets off-boresight, give a tremendous combat advantage to
pilots in dogfights. Instead of having to be directly behind his target in order to shoot at it, a
pilot can now shoot at anything within a 60-90 degree cone of his aircraft’s forward centerline.
Russia and Israel currently have systems with this capability, and the U.S. is expected to field
the AIM-9X missile with its associated helmet-mounted sight in 2002.
Augmented Reality is a relatively new field, where most of the research efforts have occurred
in the past four years. Because of the numerous challenges and unexplored avenues in this
area, AR will remain a vibrant area of research for at least the next several years.
After the basic problems with AR are solved, the ultimate goal will be to generate virtual
objects that are so realistic that they are virtually indistinguishable from the real environment.
Photorealism has been demonstrated in feature films, but accomplishing this in an interactive
application will be much harder. Lighting conditions, surface reflections, and other properties
must be measured automatically, in real time. More sophisticated lighting, texturing, and
shading capabilities must run at interactive rates in future scene generators. Registration
must be nearly perfect, without manual intervention or adjustments.
While these are difficult problems, they are probably not insurmountable. It took about 25
years to progress from drawing stick figures on a screen to the photorealistic dinosaurs in
“Jurassic Park.” Within another 25 years, we should be able to wear a pair of AR glasses
outdoors to see and interact with photorealistic dinosaurs eating a tree in our backyard.