
Computers in Industry 119 (2020) 103229

Contents lists available at ScienceDirect

Computers in Industry
journal homepage: www.elsevier.com/locate/compind

A systematic design method of adaptive augmented reality work instruction for complex industrial operations

Junhao Geng a,e, Xinyu Song a, Yuntao Pan a, Jianjun Tang b, Yu Liu c, Dongping Zhao d, Yongsheng Ma e,∗

a Institute of Intelligent Manufacturing, Northwestern Polytechnical University, Xi'an 710072, China
b AVIC Chengdu Aircraft Industry (Group) Co., Ltd., Chengdu 610092, China
c Technical Research Department, CRRC Academy, Beijing 100160, China
d School of Aircraft Engineering, Xi'an Aeronautical University, Xi'an 710077, China
e Department of Mechanical Engineering, Faculty of Engineering, University of Alberta, Edmonton T6G 1H9, Canada

Article history: Received 1 November 2019; Received in revised form 15 March 2020; Accepted 17 March 2020; Available online 1 April 2020

Keywords: Industrial operation; Augmented reality; Work instruction; Adaptiveness; Design method

Abstract: One of the barriers that prevent augmented reality (AR) from being widely adopted in diverse and complex industrial operations is the lack of adaptive and scalable AR work instruction (ARWI). This paper suggests a systematic method to solve the problem. The proposed method presents an adaptive representation structure that expresses and controls the interrelated aspects of ARWI adaptiveness: authoring, environment, guiding and process control. The proposed method also provides a detailed approach to achieve the adaptiveness of ARWI in its authoring and running process. The suggested ARWI design workflow can incorporate industrial AR elements via on-site operators' inputs without programming, instead of relying on developers or engineers. At the same time, the adaptive ARWI can adapt to different people, environment objects, and processes of complex industrial operations at runtime. This advantage brings ARWI design closer to real applications and makes the technology more adoptable. Three user studies, using the disassembly operations of an aircraft engine's hydraulic actuator and two other practical cases, are used to assess the proposed method and support our claims.

© 2020 Elsevier B.V. All rights reserved.

∗ Corresponding author at: Department of Mechanical Engineering, Faculty of Engineering, University of Alberta, Edmonton T6G 1H9, Canada.
E-mail address: yongsheng.ma@ualberta.ca (Y. Ma).

https://doi.org/10.1016/j.compind.2020.103229
0166-3615/© 2020 Elsevier B.V. All rights reserved.

1. Introduction

For complex products, such as aircraft, machine tools, engines and transport vehicles, several major phases of their life cycles need to be implemented through complex industrial operations, such as assembly, commissioning, Maintenance, Repair and Overhaul (MRO) and decommissioning (Mo et al., 2015). These operations are characterized by complicated product structures, rigid dependencies on different components and tools, and strict and lengthy working processes. Such industrial operations are typically carried out by on-site operators, and heavily depend on manual maneuvers or human-machine collaboration (Regenbrecht et al., 2005; Fox, 2010); sophisticated experience and comprehensive performance measurement are required. The operator has to remember and understand all kinds of operation instructions, be familiar with a variety of complicated operation actions, master related tools with problem-solving skills, and identify various spatial structures of products and final states of components. Such complex manual operations have low efficiency and high quality fluctuation. Therefore, they are typically bottlenecks of intelligent manufacturing (Zhou et al., 2018). Developing effective and reliable assistive methods for on-site operators has become imperative in order to reduce experience requirements, improve operation efficiency, and stabilize quality.

Electronic Work Instruction (EWI) has been applied as an assistive method to guide the operation process on-site (Geng et al., 2014). It consists of texts, pictures, animations, 3D models and other digital information that can be browsed and interacted with on tablets, industrial computers and other display devices (Geng et al., 2015). Compared with paper-based instruction, EWI is more intuitive, interactive, informative and understandable, and can decrease the temporal demands and effort of operations (Li et al., 2018). Therefore, it has been widely used in aviation, automotive, heavy equipment, electronics and other industrial fields (Geng et al., 2014; Li et al., 2018). However, EWI cannot be immersed with, or merged into, the real environment of on-site operation. The operator has to switch his vision and cognition between virtual information
and real environment frequently. This increases the levels of mental and physical demands and the frustration of operators (Li et al., 2018).

With the rapid development of Augmented Reality (AR) technology, its derived operation assistance has been recognized as a promising technology (Jetter et al., 2018); exploratory applications are reported in manufacturing, maintenance and other industrial fields (Nee et al., 2012). AR can synchronize the information space on top of the physical space, and display real-time task-relevant information to reduce the operators' mental workload and cognitive distance (Uva et al., 2018). Evaluation results show that AR-based operation assistance can decrease execution time and significantly reduce error rates (Elia et al., 2016). However, AR technology has not been widely adopted in the real context of various complex industrial operations (Bottani and Vignali, 2019). Besides the high cost and uncomfortable wearing experience of AR hardware, susceptible tracking robustness and registration accuracy, unfriendly user interfaces and interaction, and unrealized integration with enterprise data, the lack of a flexible and scalable AR work instruction authoring procedure is also one of the major reasons (Nee et al., 2012; Bottani and Vignali, 2019; Palmarini et al., 2018; Wang et al., 2016a; Fite-Georgel, 2020).

The work instruction contents are the basis on which an AR system guides the operator in an orderly way. However, the current design and authoring of AR work instruction (ARWI) depend on specifically customized code development (Nee et al., 2012). That means ARWI has to be designed for the specific operation scene by highly qualified developers rather than end users. This situation leads to a high technical threshold, high cost and low adaptability (Palmarini et al., 2018). Therefore, the technology of authoring adaptive ARWI has attracted the attention of researchers, but no systematic solutions exist. This paper proposes a systematic design method of adaptive ARWI for complex industrial operations, which can be used to rapidly design ARWI for complex industrial operation scenes by engineers and operators rather than experienced developers.

In this paper, Section 2 introduces the previous research works in the field. Section 3 gives the overall design of the proposed adaptive ARWI design method. Section 4 illustrates and explains the details. Section 5 presents the system implementation and Section 6 presents user studies. Finally, conclusions and future research are presented in Section 7.

2. Related works

2.1. Industrial AR and adaptive AR

AR can superimpose virtual context-sensitive information onto physically observed real-world environment scenes (Regenbrecht et al., 2005). In the industrial sectors AR is considered to have great potential (Nee et al., 2012; Bottani and Vignali, 2019; Wang et al., 2016a), and hence the term "Industrial Augmented Reality" (IAR) is used for product design, manufacturing, assembly, maintenance/inspection, and training (Fite-Georgel, 2020).

Boeing used IAR to provide step-by-step instructions for workers in manufacturing and assembly (Mizell, 2020). IAR can help workers to perceive instructions with less effort, avoiding the frequent changes between a real-world context and a virtual one where the relevant information is accessed; and the technology can combine human abilities to provide efficient and complementary tools to assist industrial operation tasks (Nee et al., 2012; Fraga-Lamas et al., 2018). Subsequently IAR has received growing attention in various industrial areas throughout the phases of product life-cycles (Regenbrecht et al., 2005; Bottani and Vignali, 2019). IAR has been commonly applied to assist on-site workers in assembly, maintenance, and repair (Nee et al., 2012; Quandt et al., 2018). However, IAR has not yet reached its full potential (Makris et al., 2016); one key technical issue is its poor adaptiveness (Palmarini et al., 2018; Fraga-Lamas et al., 2018). Adaptiveness, also known in the literature as re-usability, applicability, scalability or flexibility, has been a big bottleneck that limits IAR from transferring from laboratory settings to real-world implementations (Wang et al., 2016a; Erkoyuncu et al., 2017).

Ideally, IAR should be easily customized to adapt to the variance of people, products, processes, working conditions and other operation elements (Wang et al., 2016a; Fite-Georgel, 2020). Furthermore, IAR applications should be neatly delivered by developers and deployed to the end-users without requiring re-programming; such self-deploy-ability, or adaptiveness, is necessary for IAR usability (Zhu et al., 2013).

The concept of adaptive AR (AAR) has been considered to overcome the current IAR limitations (Grubert et al., 2017; Hallaway et al., 2004). So far, only a few explorative studies have been achieved. For instance, a context-aware AR system is designed to assist workers in maintenance tasks by analyzing the contexts of the maintenance tasks and providing relevant and useful information (Zhu et al., 2013; Zhu et al., 2014). Another AR maintenance system uses an innovative method of wrist-based haptic tracking and image-based detection to provide the completion status of a maintenance step to an operator, which means the user could receive adaptive feedback, thus allowing the user to be more effective and willing to accept guidance information during a maintenance process (Siew et al., 2019). A user-adaptive AR system can recommend relevant contents according to the user's preferences as well as the history of interactive selection (Oh and Byun, 2012). An i-ARA architecture endowing context-awareness and user personalization is proposed to improve the user interfaces (Hervás et al., 2013). Furthermore, the adaptiveness in the runtime of AR applications can also be improved by machine learning, such as deep learning technology. One novel study shows that the Mask region-based CNN (Mask R-CNN) can be used to efficiently detect and segment the objects of tasks, so that AR task assistance can achieve a better performance with deep learning-based adaptiveness to the environment (Park et al., 2020). How to systematically design and express AR contents and make them more generically adaptive for end users, environment, operating objects and work processes still needs further research.

2.2. ARWI

A work instruction system advises the operator on what to do and how to do jobs (Geng et al., 2015; Haug, 2015). ARWI, different from paper-based WI and EWI, is a new form of carrier of work instructions; it organizes AR contents and provides assistance and guidance to on-site workers (Uva et al., 2018; Bottani and Vignali, 2019; Zhu et al., 2013). Its composition procedure adaptiveness, the authors believe, is critical for scalability (Siew et al., 2019; Hervás et al., 2013). ARWI contains elementary AR contents that define the local objects and varying context of operations, also known as various scenes; and they should be dynamically modified during the runtime (Erkoyuncu et al., 2017). Consequently, to improve the adaptiveness of IAR applications, on-site insertion of the AR elements of ARWI is explored. Three aspects need to be investigated, i.e. the contents, structure and authoring of ARWI (Nee et al., 2012; Siew et al., 2019).

AR contents in ARWI include text, image, audio, video, 3D model, and even physical feedback information (Palmarini et al., 2018). For instance, researchers used text messages and circle annotations to instruct a machinist to complete some preventive maintenance tasks of a CNC milling machine (Zhu et al., 2013). 3D animation and virtual position icons are superimposed onto the captured machine part to guide a trainee to accomplish an assembly operation (Webel et al., 2013). Virtual wrist motion flags are displayed on the captured video, and these flags are triggered based on the position of the wrist of a user in order to instruct the motion of the user's hands (Siew et al., 2019). A virtual avatar of the user's hand is attached to the movement of the haptic device to guide the user in grasping the components to assemble/disassemble (Ferrise et al., 2013). Virtual component models, instruction text and direction marks are employed to guide the user to find tools and to prompt the user on the status of components (Wang et al., 2016b). An IAR system also needs the contents of the real world to track and identify the physical operation environment (Bottani and Vignali, 2019). At present, those real application-specific AR contents are solidified beforehand with physical markers or as virtual features; such a rigid structure directly affects the scalability of IAR systems (Webel et al., 2013; Wang et al., 2016b; Fiorentino et al., 2014; Wang et al., 2016c). It can be appreciated that if the AR contents have a modular structure, and can be written into ARWI and customized on-site to accommodate the changes of operation scenes, it will greatly increase the adaptiveness of ARWI (Siew et al., 2019). ARWI contents also need to be modularly configured for many different aspects (Mattsson et al., 2016), e.g. quality management, with which the operator should do their work accordingly (Haug, 2015). Quality-related images can be used to detect whether a step has been conducted correctly (Siew et al., 2019). The contents also need to adapt to the user's expertise level or preferences for better cognitive processes and performance (Zhu et al., 2013; Oh and Byun, 2012); e.g. rules can be a part of ARWI to decide how to display AR instructions based on the user's ability level and action time (Syberfeldt et al., 2016). In reality, exception handling, parts/tools verification and other aspects also need to be reflected in the ARWI (Wang et al., 2016a; Radkowski et al., 2015). In general, allowing configuration of what contents should be reflected in the real world, and how, is essential.

Hence, ARWI needs a well-designed structure to organize rich and various AR contents modularly and to deal with the complexities of operation station design, work variance and disturbance handling (Mattsson et al., 2016). One effective way is to formally describe the structure of the ARWI model via its elements' relations or associations in potential operation scenes. From this angle, classes in the Ontology for Assembly Tasks Procedure (OATP) are used to illustrate three aspects of information in an assembly guidance process (Wang et al., 2016b), i.e. cognition, process, and component configuration. Each class can be linked to other classes with the relationships between them. Another way is organizing the AR data from the perspective of roles in the AR system. For example, the Maintenance Data Framework (MDF) is used to present the information for supporting maintenance activities. The Context Data Framework (CDF) and Rendering Data Framework (RDF) have been used to express the data required for the contextualization of information and its format (Erkoyuncu et al., 2017). Research so far has mainly focused on the operation objects and tools as well as the tasks, but has not progressed much yet regarding quality management, exception handling and beyond. How to organize AR contents focusing on the skills or preferences of operators as well as parts/tools needs to be studied.

Authoring ARWI is the key step of completing the ARWI (Palmarini et al., 2018). So far it has been a time-consuming and expensive programming procedure (Erkoyuncu et al., 2017; Zhu et al., 2014). ARWI still cannot be fully delivered to downstream users by developers without requiring constant editing of code (Fite-Georgel, 2020). Some studies tried to overcome this roadblock through real-time automated authoring via a 2D desktop user interface with which engineers can author AR contents on-site; this was an authorable context-aware AR system (Zhu et al., 2013). Another authoring method was to capture product assembly steps performed by an expert user, decompose the captured video into assembly actions, and turn these actions into AR contents for guiding the next user (Bhattacharya and Winer, 2015; Bhattacharya and Winer, 2019). Furthermore, some researchers proposed a valuable method to convert existing traditional manuals to new AR-compatible manuals based on three principles: the optimization of text usage with ASD Simplified Technical English, the conversion of text instructions into 2D graphic symbols, and the structuring of the content through the combination of Darwin Information Typing Architecture (DITA) and Information Mapping (IM) (Gattullo et al., 2019).

At present, there are also some commercial systems that can support the authoring of AR work instructions by non-programmers, such as ScopeAR, Dynamics 365 Guides for HoloLens and Cortona3D (HoloLens, 2019; Cortona3D, 2019; ScopeAR, 2019). These commercial authoring tools focus on the visualization of the authoring process and have the latest tracking and registration technologies. But how to support the cooperation between engineers and operators, display different contents to different operators dynamically, and comply with other management regulations, such as exception handling and quality control regulations, still needs to be investigated further. So far these methods have mainly focused on the generation of guidance information; action tracking and feature identification are not well studied. Some researchers also believe that a work instruction is a directive document of the enterprise; it must be designed and audited in a standardized way before it can be released to the production site (Li et al., 2018; Haug, 2015). Therefore, ARWI authoring approaches are still at an early stage and there is a lack of a formal method for the end users (Nee et al., 2012; Palmarini et al., 2018).

2.3. Summary

A review of the state of the art in ARWI adaptiveness and the corresponding design methods for complex industrial operations reveals that the adaptiveness of ARWI has become a bottleneck preventing AR applications from being widely adopted in diverse and complex industrial operations, but there is still a lack of a systematic analysis of ARWI's adaptiveness and a systematic design method for adaptive ARWI. So far, the previous methods have mainly focused on a certain aspect of adaptiveness, such as environment tracking, user feedback, operation workflow, or content generation, or on a certain field of industrial operations, such as maintenance and assembly. As a result, the corresponding methods can effectively design and express specific aspects of ARWI's adaptiveness, and establish a good foundation for further study of ARWI's adaptiveness. But how to more comprehensively and systematically express the adaptiveness of ARWI, and design the adaptive ARWI so as to make it more generically adaptive for end users, environment, operating objects and work processes, still needs further research.

Therefore, we propose a systematic design method of adaptive ARWI for complex industrial operations such that end users can quickly customize and deploy the adaptive ARWI in a non-programming mode as well as in a neutral way, thereby accelerating AR adoption in real industrial scenes. The main novel aspects of our method, compared to the earlier related work, are: 1) an adaptive representation structure of ARWI that expresses and controls the interrelated aspects of ARWI to enable its scalability; 2) a cooperative mode of ARWI authoring that supports the cooperation between engineers and operators for teamwork authoring; 3) an adaptive guiding method that can dynamically display different and suitable contents based on the operator's preference and performance; 4) an integrated process control that complies with enterprise regulations beyond the normal workflow, such as exception handling and quality control.
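The rule-based guiding adaptiveness surveyed in Section 2.2 (e.g. rules that select AR display content from the user's ability level and action time, as in Syberfeldt et al., 2016) can be illustrated with a small sketch. The skill levels, detail levels and thresholds below are invented for the example; this is not the implementation of any of the cited systems.

```python
# Illustrative sketch of a display-level rule: choose how much AR
# guidance to show for the next action, based on the operator's
# declared skill level and measured action times. All names and
# thresholds are hypothetical.

from statistics import mean

# Hypothetical detail levels, from a minimal cue to a full walk-through.
DETAIL_LEVELS = ["cue_only", "text_and_marker", "full_3d_animation"]

def select_detail_level(skill_level: str,
                        recent_action_times: list[float],
                        expected_time: float) -> str:
    """Pick an instruction detail level for the next action.

    skill_level: 'novice', 'intermediate' or 'expert' (assumed profile field).
    recent_action_times: seconds the operator took on recent actions.
    expected_time: nominal time for those actions from the process plan.
    """
    base = {"expert": 0, "intermediate": 1, "novice": 2}[skill_level]
    if recent_action_times:
        # Slow execution suggests the operator needs richer guidance;
        # fast execution suggests the guidance can be reduced.
        ratio = mean(recent_action_times) / expected_time
        if ratio > 1.5:
            base = min(base + 1, 2)
        elif ratio < 0.8:
            base = max(base - 1, 0)
    return DETAIL_LEVELS[base]

# Example: an expert running well over nominal time gets more help.
print(select_detail_level("expert", [95.0, 110.0], 60.0))  # text_and_marker
```

Because such a rule is plain data plus a condition, it could in principle travel with the work instruction package rather than being hard-coded into the AR runtime, which is the direction the proposed method pursues.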
3. Overall design of adaptive ARWI for complex industrial operations

3.1. ARWI design principles, lifecycle phases and morphology consideration

In order to achieve the suggested adaptiveness of ARWI and make it suitable for ever-changing industrial operations, the overall system design should follow three basic principles. The first is that the ARWI has to be generically reusable with dynamic loading capability for different AR systems; in other words, ARWI must be independent of any AR system and should not be hardcoded. Its contents should be subject to real-world physical hardware and software configurations. ARWI should be loaded dynamically by an AR system as a neutral information source. The second principle is that the ARWI system has real-time bi-directional interaction capability. ARWI should not only be usable in any real environment, but also support collecting real operation feedback information via its tracking function, and superimposing virtual information onto the ARWI environment with context-aware methods. On the other hand, the deployment of the ARWI system should not affect the physical state of the operating environment through placing markers or other physical means. The third principle is that the proposed on-site ARWI application does not need programming. ARWI contents should be planned, authored and used by end users in a programming-free way. At the same time, the structure and contents of ARWI should be adaptive to the frequent changes of operation scenes in a customizable manner.

Based on the above basic design principles, an overall cyclic procedure of adaptive ARWI application is proposed, as shown in Fig. 1. There are three important characteristics of the proposed ARWI system: operation scenes, life cycles, and dynamic morphology.

Fig. 1. Overall cyclic procedure of adaptive ARWI application.

An operation scene includes the operation environment, process, people, system and ARWI modules. These elements and their interrelationships constitute the external ecological environment of ARWI. (1) Operation environment refers to the physical circumstances where the operation activities take place; it includes the operation objects, such as the parts of the repaired engine, the various operation tools used to assist operators to complete the operation, and general background entities, including site, equipment, instruments, etc. (2) Operation process refers to a series of activities that the operator must carry out in order to complete the operation. These activities, generally consisting of a series of actions associated with operation objects and tools, provide detailed specifications and descriptions of the operation to guide the operators. These predefined activities must be strictly implemented as required in order to ensure the correct completion of the operation. (3) Operation people refers to the assigned engineer and/or operator. By default, an engineer is in charge of the procedure design of ARWI contents. The operator uses ARWI to complete the job. Unlike EWI, which is authored entirely by engineers, the authoring of ARWI needs the collaboration of the engineer and the operator.
(4) ARWI system refers to the AR hardware and software working with ARWI contents. The system is capable of loading and processing ARWI. In fact, the system loads and parses ARWI contents, tracks the operation environment under the guidance of the operation process, and superimposes the corresponding AR contents. (5) ARWI module refers to the functioning information subsystem that organizes the corresponding AR contents through a specific structure. The ARWI module is used to drive and guide the operator to complete his/her work in an AR-assisted environment. Its contents need careful planning and authoring. The contents and quality of the ARWI module should conform to enterprise management specifications and be suitable for the current operation scene.

From a system design perspective, the lifecycle of the ARWI application can be divided into four phases: planning, enriching, running, and revising. (1) In the planning phase, the engineer designs the overall structure of the work instruction and creates non-AR-related contents, e.g. job description, technical parameters, part models and simulation animations, in a file or a set of linked files, according to the operation process design requirements and the elements of the operation environment. At the same time, placeholders for the AR contents are inserted and designated for the next phase. At this stage, the ARWI file has the basic information for instructing the operator to complete the operation, but lacks the relevant contents reflecting the real environment, such as tracking images and on-site operation video. In this phase, the ARWI can be considered an initialized WI with some reserved empty blocks for AR details. (2) The next phase is enriching. Once the partial ARWI contents are read into the working AR system, the on-site operator carries out a trial operation to verify the skeleton ARWI contents so far, using the AR hardware to collect the required information about the actual physical environment in real-time operation and to correct errors. The collected information will form AR-related contents and replace those preset placeholders. Then the engineer will post-process the new ARWI contents, e.g. adding the registration locations for overlaying AR contents. After this step, the ARWI file becomes complete with full AR-relevant contents. If some abnormal results occur, the error handling process will be developed, and real-time messages or instructions will be collected and merged into the ARWI as exception handling information. (3) The third phase is running, through which the ARWI contents are uploaded into the AR system, which is switched to live mode, and the production operator carries out their tasks under the system guidance. In this process, the information on the production operator's performance will be recorded in the live ARWI system for the subsequent personalized evaluation. (4) The last phase is revising, after which a new lifecycle iteration starts from the planning phase again.

The morphology consideration of ARWI consists of three aspects that are basic requirements for realizing the above principles. Firstly, ARWI should have a neutral data format which can ensure its independence from AR systems and can be systematically and frequently updated outside the system. An XML data package is chosen because it is independent, scalable and widely supported. Secondly, ARWI should support a tracking function that does not affect the physical environment (Vignali et al., 2018). Marker-less tracking, like that based on feature recognition techniques, provides seamless integration of the augmented information with the real world (Blanco-Novoa et al., 2018). Vision-based natural feature tracking is more suitable for the industrial environment because it can be used on unprepared components and can thus greatly reduce the manual cost and time to physically set up the operations (Nee et al., 2012; Siew et al., 2019). Lastly, ARWI contents should have a modular structure ready for changes of operation scene and operator preference.

3.2. Concept of ARWI adaptiveness

The first step of designing for adaptive ARWI is to specify the concept of adaptiveness. That means the meaning of ARWI adaptiveness, and what aspects are involved in the adaptiveness, should be analysed and determined. This subsection discusses the concept of the adaptiveness only. The capabilities of dealing with the complex operation environment and the formalized operation process, and with its own authoring process, are the core preconditions for enabling flexible and friendly ARWI application. The adaptiveness of ARWI means the capability of the structure and contents of ARWI to be quickly customized in a non-programming mode as well as in a neutral way, so that it can adapt to various elements of different operation scenes once it is authored and implemented. The authors recognize that there is no omnipotent ARWI that can automatically adapt to any industrial operation scene; instead, a specific operation scene needs readily tailored ARWI contents that can be quickly configured. Fig. 2 shows the five suggested aspects of ARWI adaptiveness.

Fig. 2. Five aspects of ARWI adaptiveness.

(1) Content structure. In order to realize the on-site and interrelated aspects of each real operation, the ARWI content structural design accommodates AR information through rapid composing,
or tailored enrichment. The key is that ARWI should contain and organize various modularized elements and their logical relationships. (2) Authoring. The authoring of ARWI contents requires only on-site application engineers and operators rather than developers. (3) Environment. Various and diverse products, tools, sites and other elements are supported for different fields of complex industrial operations. The ARWI system conveniently manages the AR element types, tracking data and registration parameters coming from the operation environment. (4) Guiding interface. During the runtime, the AR system tracks the environment after parsing the AR contents, and guides the operator by displaying related instruction contents. The ARWI interface is customized according to the operator's preferences and performance requirements, e.g. giving sufficient and only necessary display, subject to the operator's ability to receive the messages and cues, in order to achieve better performance. (5) Process. The operation processes are configurable with the action types, complexity, exception handling, quality management procedures and other factors. The ARWI system flexibly supports operators in executing the task workflow, including exception handling and quality control, in accordance with operational specifications. Libraries of messages, videos, pictures, cues, and action steps are available and reusable.

4. Detailed design of ARWI contents

4.1. Ontological modeling

The structure of ARWI contents needs to generically represent all the elements of industrial operations, which contain extensive properties and interrelationships; their properties, semantic relationships, and dynamic changes must be carefully considered (Rajbabu et al., 2018; Lyu et al., 2017). The ARWI domain ontology modeling (Wang et al., 2016b) is designed into levels of abstraction: classes, properties, and corresponding restrictions and axioms, as shown in Fig. 3.

Fig. 3. Major ontology entities for ARWI structure design.
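To make the neutral XML packaging chosen in Section 3.1 and the process backbone of Fig. 3 more tangible, the sketch below encodes a tiny hypothetical ARWI fragment and parses it. The element and attribute names are illustrative inventions, not the paper's actual schema; the placeholder elements stand for AR items to be filled in during the enriching phase.

```python
# Illustrative sketch of a neutral XML ARWI package: a task composed of
# actions, with placeholders for AR contents captured on-site later.
# Element/attribute names are hypothetical, not the authors' schema.

import xml.etree.ElementTree as ET

ARWI_XML = """
<arwi id="hydraulic-actuator-disassembly" version="0.1">
  <task id="T1" name="Remove end cover">
    <action id="A1" type="identification">
      <text>Locate the four retaining bolts.</text>
      <placeholder kind="tracking-image" ref="A1-track"/>
    </action>
    <action id="A2" type="implementation">
      <text>Unscrew the bolts with the torque wrench.</text>
      <placeholder kind="onsite-video" ref="A2-demo"/>
    </action>
  </task>
</arwi>
"""

root = ET.fromstring(ARWI_XML)

# An AR system can load the package, list the pending placeholders
# (enriching phase), and read off the action workflow (running phase).
pending = [(p.get("kind"), p.get("ref")) for p in root.iter("placeholder")]
actions = [a.get("type") for a in root.iter("action")]

print(pending)  # [('tracking-image', 'A1-track'), ('onsite-video', 'A2-demo')]
print(actions)  # ['identification', 'implementation']
```

During enriching, an authoring tool could replace each placeholder with the captured tracking image or demonstration video, so the package stays neutral with respect to whichever AR runtime finally parses it.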
J. Geng, X. Song, Y. Pan et al. / Computers in Industry 119 (2020) 103229 7

Each class is given an abbreviation to make it easy to distinguish and use later. The process-related classes serve as the backbone of a hierarchical framework, including Procedure (PROC), Task (TASK), Action (ACTN), Exception Handling Action (EACT) and State (STAT). PROC represents the entire operation flow, consisting of a series of serial or parallel TASKs. TASK represents the work that needs to be carried out in a single workstation, consisting of ACTNs. TASK specifies the manipulated Items (ITEM) and the used Tools (TOOL). ACTN represents a basic step to complete a TASK and is divided into three categories: Identification (IDET), Implementation (IMPT) and InspectionTest (INST). IDET represents the action to confirm the operation objects or tools. IMPT represents the action to use tools or hands to manipulate the objects to meet the operation requirements. INST represents the action to check or test whether the result parameters of the objects meet the requirements. If abnormal operation results occur in an ACTN, they are processed through an EACT. EACT is identical to ACTN except for an exception type flag, and has the same subclasses. ACTN is executed in different STATs, including Execution (EXEC), Completion (COMP) and Diagnosis (DIAG). EXEC represents the starting state, which needs to track whether the operation environment meets the requirements for starting the action and to provide action guidance. COMP represents the completed state, which needs to determine whether the operation objects have been manipulated correctly after finishing the action execution. DIAG represents the diagnostic state, which needs to track the abnormal operation environment and retrieve the corresponding EACT if the results of the action execution are not correct.

Tracking the physical environment and overlaying virtual contents enable environment adaptiveness. The Tracker (TRAC), Indicator (INDC), Registration (REGI), Location (LOCT) and Display (DISP) classes are designed for this purpose. TRAC characterizes markerless images with natural features of the physical environment. INDC expresses the virtual information to be superimposed on real environments, including Text (TEXT), Picture (PICT), Audio (AUDO), Video (VIDO), Static Annotation (SANT), Dynamic Annotation (DANT), Static Model (SMOD) and Dynamic Model (DMOD). When an INDC is superimposed, REGI describes the overlying relation by specifying a TRAC and the LOCT, and DISP specifies the display format. TRAC and some INDCs come from the physical environment and are applied back to it.

Operator (OPRT), Preference (PREF) and Performance (PERF) are used to customize guiding interfaces. TASK records the operators who perform the task through OPRT. Personal preferences are grouped into Vision Preference and Content Type Preference, recorded by PREF and shared as global information. Personnel's real-time performance is recorded incrementally through PERF. The AR system can compute and decide when, how and which kind of INDC should be displayed in real time based on the operator's PREF and PERF.

Authoring activity involves all classes, but TRAC, INDC, REGI and Placeholder (PLHD) need special attention. The engineer reserves PLHDs at the planning phase. PLHD includes two categories: TrackerHolder (TRHD) and IndicatorHolder (IDHD). At the enriching phase, the trial operator captures the information of the environment and the live operation, transforms it into TRACs and INDCs, and replaces the PLHDs with specific TRACs or INDCs. Then the engineer specifies REGIs to express the overlying relationship between TRAC and INDC.

For the implementation of ontology modeling, the formal descriptions of the major ontology entities, based on Ontology Web Language 2 (OWL2) with functional-style syntax (Anon, 2019), are shown in Table 1. Here we assume that the prefix name of the IRI is a colon character (:). Other object properties and data properties are not listed due to limited space. The detailed designs of the other four aspects are described in the corresponding sections below.

4.2. Authoring ARWI based on tri-step coordination

A WI document must be authored by appropriate specialists using formal definitions, grammar and procedures. ARWI contains a large number of tracking, guiding and registration AR-related items associated with real physical scenes. Therefore, relying solely on engineers who are not familiar with on-site operation to author ARWI has proven not viable: the operator who deals with the real physical environment and operation should also be a co-author. Company management regulations and industry best practice typically require a trial operation in the real environment before the real operation, in order to verify the correctness of the work instructions. If exceptions occur during the trial operation, the exception handling procedures are also recorded as knowledge for solving the corresponding exceptions during the formal operation. Therefore, the environment, process and exception handling of the trial operation are also useful sources of ARWI contents.

This paper proposes tri-step coordinated authoring. The basic idea is that an engineer sets up the process framework and presets placeholders for AR contents, and the operator then instantiates the AR materials through the trial operation, so that engineers and operators collaboratively contribute to the authoring of ARWI with their respective knowledge and effective input, without programming. Fig. 4 shows the authoring process with its three steps of planning, enriching and post-processing.

4.2.1. Planning by the engineer

(1) The engineer designs the entire hierarchy of the Procedure, including a series of hierarchical, serial Tasks and Actions. At this time, the engineer has only created the ARWI text without physical environment-related contents, but has preset placeholders for such contents. The interface preferences of operators are also preset in the procedure but remain open for follow-up customization. The logical expressions are shown below:

Procedure = {Tasks, Preferences}, Procedure ∈ PROC,
Preferences = {Preference, Preference ∈ PREF}, Tasks = {Task, Task ∈ TASK};

Task = {Tools, Items, Actions, Operators}, Tools = {Tool, Tool ∈ TOOL}, Items = {Item, Item ∈ ITEM}, Actions = {Action, Action ∈ ACTN}, Operators = {Operator, Operator ∈ OPRT}.

(2) Each Action has the following state variables: ExecutionState, CompletionState and DiagnosisState, plus optional Exception Handling Actions (EHA). The EHA set is empty initially because no exception has been identified and no solution procedure has been developed. The performance of the operators for each action is recorded in real time as well.

Action = {ExecutionState, CompletionState, DiagnosisState, ExceptionHandlingActions = ∅, Performances},
Table 1
Major ontology entities.

Ontology Type Ontology Express with OWL2 Functional-style Syntax

Classes Declaration (Class(:Procedure)) Declaration (Class(:Operator)) Declaration (Class(:DynamicModel))


Declaration (Class(:Task)) Declaration (Class(:Tool)) Declaration (Class(:Tracker))
Declaration (Class(:Action)) Declaration (Class(:Item)) Declaration (Class(:Display))
Declaration (Class(:ExceptionHandlingAction)) Declaration (Class(:Indicator)) Declaration (Class(:Registration))
Declaration (Class(:Identification)) Declaration (Class(:Text)) Declaration (Class(:Location))
Declaration (Class(:Implementation)) Declaration (Class(:Audio)) Declaration (Class(:Placeholder))
Declaration (Class(:InspectionTest)) Declaration (Class(:Picture)) Declaration (Class(:TrackerHolder))
Declaration (Class(:State)) Declaration (Class(:Video)) Declaration (Class(:IndicatorHolder))
Declaration (Class(:Execution)) Declaration (Class(:StaticAnnotation))
Declaration (Class(:Completion)) Declaration (Class(:DynamicAnnotation))
Declaration (Class(:Diagnosis)) Declaration (Class(:StaticModel))
Declaration (Class(:Preference))
Declaration (Class(:Performance))
Class Hierarchies EquivalentClasses (:Action :ExceptionHandlingAction) SubClassOf (:Text :Indicator)
SubClassOf (:Identification :Action) SubClassOf (:Audio :Indicator)
SubClassOf (:Implementation :Action) SubClassOf (:Picture :Indicator)
SubClassOf (:InspectionTest :Action) SubClassOf (:Video :Indicator)
SubClassOf (:Execution :State) SubClassOf (:StaticAnnotation :Indicator)
SubClassOf (:Completion :State) SubClassOf (:DynamicAnnotation :Indicator)
SubClassOf (:Diagnosis :State) SubClassOf (:StaticModel :Indicator)
SubClassOf (:TrackerHolder :Placeholder) SubClassOf (:DynamicModel :Indicator)
SubClassOf (:IndicatorHolder :Placeholder)
Object Properties Declaration (ObjectProperty (:consistOfTask)) Declaration (ObjectProperty (:perfFrom))
Declaration (ObjectProperty (:prevTask)) Declaration (ObjectProperty (:executedBy))
Declaration (ObjectProperty (:useTool)) Declaration (ObjectProperty (:guideWith))
Declaration (ObjectProperty (:manipulateItem)) Declaration (ObjectProperty (:trackWith))
Declaration (ObjectProperty (:consistOfAction)) Declaration (ObjectProperty (:showWith))
Declaration (ObjectProperty (:prevAction)) Declaration (ObjectProperty (:overlyWith))
Declaration (ObjectProperty (:consistOfState)) Declaration (ObjectProperty (:overlyWhere))
Declaration (ObjectProperty Declaration (ObjectProperty (:overlyOn))
(:exceptionSolvedBy))
Declaration (ObjectProperty (:recordPref)) Declaration (ObjectProperty (:holdIndicator))
Declaration (ObjectProperty (:prefFrom)) Declaration (ObjectProperty (:holdTracker))
Declaration (ObjectProperty (:recordPerf))
Restrictions and ObjectPropertyDomain (:consistOfTask ObjectPropertyDomain (:guideWith : State)
Axioms :Procedure)
ObjectPropertyRange (:consistOfTask :Task) ObjectPropertyRange (:guideWith : ObjectUnionOf (:Text :Audio :Picture :Video
:StaticAnnotation :DynamicAnnotation :StaticModel :DynamicModel))
ObjectMinCardinality (1 :consistOfTask) ObjectMinCardinality (0 :guideWith)
ObjectPropertyDomain (:recordPref :Procedure) ObjectPropertyDomain (:trackWith : ObjectUnionOf(:Execution :Completion
:Diagnosis)) ObjectPropertyRange (:trackWith :Tracker)
ObjectPropertyRange (:recordPref :Preference) ObjectMinCardinality (0 :trackWith)
ObjectMinCardinality (0 : recordPref) ObjectPropertyDomain (:showWith : ObjectUnionOf (:Text :Audio :Picture
:Video :StaticAnnotation :DynamicAnnotation :StaticModel :DynamicModel))
ObjectPropertyDomain(:prefFrom :Preference) ObjectPropertyRange (:showWith :Display)
ObjectPropertyRange(:prefFrom :Operator) ObjectExactCardinality (1 : showWith)
ObjectExactCardinality(1 : prefFrom) ObjectPropertyDomain(:overlyWith : Registration)
ObjectPropertyDomain (:prevTask :Task) ObjectPropertyRange(:overlyWith : ObjectUnionOf (:Text :Audio :Picture :Video
:StaticAnnotation :DynamicAnnotation :StaticModel :DynamicModel))
ObjectPropertyRange (:prevTask :Task) ObjectExactCardinality (1 : overlyWith)
ObjectMinCardinality (0 :prevTask) ObjectPropertyDomain (:overlyOn : Registration) ObjectPropertyRange
(:overlyOn :Tracker)
ObjectMaxCardinality (1 :prevTask) ObjectExactCardinality (1 : overlyOn)
ObjectPropertyDomain (:executedBy :Task) ObjectPropertyDomain (:overlyWhere : Registration)
ObjectPropertyRange (:executedBy :Operator) ObjectPropertyRange (:overlyWhere :Location)
ObjectMinCardinality (1 :executedBy) ObjectExactCardinality (1 : overlyWhere)
ObjectPropertyDomain (:useTool :Task) ObjectPropertyDomain (:holdIndicator : ObjectUnionOf (:Text :Audio :Picture
:Video :StaticAnnotation :DynamicAnnotation :StaticModel :DynamicModel))
ObjectPropertyRange (:useTool :Tool) ObjectPropertyRange (:holdIndicator : IndicatorHolder)
ObjectMinCardinality (0 :useTool) ObjectMaxCardinality (1 : holdIndicator)
ObjectPropertyDomain (:manipulateItem :Task) ObjectPropertyDomain (:holdTracker : Tracker)
ObjectPropertyRange (:manipulateItem :Item) ObjectPropertyRange (:holdTracker : TrackerHolder)
ObjectMinCardinality (0 :manipulateItem) ObjectMaxCardinality (1 : holdTracker)
ObjectPropertyDomain (:consistOfAction :Task)
ObjectPropertyRange (:consistOfAction :Action)
ObjectMinCardinality (1 :consistOfAction)
ObjectPropertyDomain (:prevAction :Action)
ObjectPropertyRange (:prevAction :Action)
ObjectMinCardinality (0 :prevAction)
ObjectMaxCardinality (1 :prevAction)
ObjectPropertyDomain (:exceptionSolvedBy
:Action)
ObjectPropertyRange (:exceptionSolvedBy :
ExceptionHandlingAction)
ObjectMinCardinality (0 : exceptionSolvedBy)
ObjectPropertyDomain (:recordPerf :Action)
ObjectPropertyRange (:recordPerf
:Performance)
ObjectMinCardinality (0 :recordPerf)
ObjectPropertyDomain (:perfFrom :
Performance)
ObjectPropertyRange (:perfFrom :Operator)
ObjectExactCardinality (1 : perfFrom)
ObjectPropertyDomain (:consistOfState :Action)
ObjectPropertyRange (:consistOfState
:ObjectUnionOf(:Execution :Completion
:Diagnosis))
ObjectExactCardinality (1 :consistOfState
:Execution)
ObjectExactCardinality (1 :consistOfState
:Completion)
ObjectExactCardinality (1 :consistOfState
:Diagnosis)

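The SubClassOf axioms of Table 1 can be mirrored outside an OWL reasoner as a simple parent map with a transitive check. This is an illustrative pure-Python sketch only; in a real deployment the OWL2 file would be loaded into a proper reasoner. The entity names below are taken from Table 1.

```python
# Parent map mirroring the SubClassOf axioms of Table 1 (excerpt).
SUBCLASS_OF = {
    "Identification": "Action", "Implementation": "Action", "InspectionTest": "Action",
    "Execution": "State", "Completion": "State", "Diagnosis": "State",
    "Text": "Indicator", "Audio": "Indicator", "Picture": "Indicator",
    "Video": "Indicator", "StaticAnnotation": "Indicator", "DynamicAnnotation": "Indicator",
    "StaticModel": "Indicator", "DynamicModel": "Indicator",
    "TrackerHolder": "Placeholder", "IndicatorHolder": "Placeholder",
}

def is_subclass(cls, ancestor):
    """Transitive closure of SubClassOf (a class counts as its own subclass)."""
    while cls is not None:
        if cls == ancestor:
            return True
        cls = SUBCLASS_OF.get(cls)
    return False
```

Such a check is what an ARWI parser needs at runtime, e.g. to accept any Indicator subtype wherever the ontology's restrictions name :Indicator.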
ExecutionState ∈ EXEC, CompletionState ∈ COMP, DiagnosisState ∈ DIAG,
Performances = {Performance, Performance ∈ PERF}.

(3) For ExecutionState and CompletionState, each consists of Trackers and Indicators. DiagnosisState does not need Trackers and Indicators; it is used to diagnose the exception type. The exception handling process is only filled in when an exception handling action is encountered. In the planning step, because there is no real environment information, engineers only design virtual indicators for ExecutionState and CompletionState. These indicators are independent of the physical scene, or need not be superimposed on it: mainly text, audio, annotation and model information.

ExecutionState = {VirtualIndicators}, CompletionState = {VirtualIndicators}, DiagnosisState = {};

VirtualIndicators = {VirtualIndicator, VirtualIndicator ∈ TEXT ∪ AUDO ∪ SANT ∪ DANT ∪ SMOD ∪ DMOD}

(4) Trackers and realistic Indicators still lack real environment information at this point. Therefore, placeholders are earmarked for the corresponding environment-related contents to be filled in subsequently, and they specify their format and parameters. Different types of actions require different types of trackers; how a Tracker works is discussed in Sections 4.4 and 4.5.

ExecutionState = {TrackerHolders, IndicatorHolders, VirtualIndicators},
CompletionState = {TrackerHolders, IndicatorHolders, VirtualIndicators};

TrackerHolders = {TrackerHolder, TrackerHolder ∈ TRHD}, IndicatorHolders = {IndicatorHolder, IndicatorHolder ∈ IDHD}

4.2.2. Filling AR elements by the operator

At this stage, ARWI is a skeleton directive document (WI) ready to be filled in with real environment-related AR contents. The operator can now use the skeleton ARWI to carry out the trial operation in the real environment step by step. With all the steps explicitly recorded, the operator authors the real environment-related AR contents during the operation according to the placeholders and records the required AR data or files; eventually the full-fledged ARWI is completed.

(1) The ARWI file is uploaded to the AR system. The operator wears the AR equipment and gets ready to execute the expected operation.
(2) The states of the first Action of the ARWI's first Task are initialized, and their virtual indicators are used to guide the operator to execute the operation.
(3) Once the first AR placeholder is encountered, the operator is prompted to capture the AR contents from the corresponding real scene.
(4) If a TrackerHolder is triggered, the camera is activated and the operator takes a picture of the current scene, which is converted into a Tracker for AR tracking.
(5) If an IndicatorHolder is triggered, the camera is activated and the operator takes pictures of the current scene or records the operation video for superimposition, which are converted into realistic Indicators for guiding.

TrackerHolders → Trackers, Trackers = {Tracker, Tracker ∈ TRAC};
IndicatorHolders → RealIndicators, RealIndicators = {RealIndicator, RealIndicator ∈ PICT ∪ VIDO};
State = {Trackers, RealIndicators, VirtualIndicators};
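The enriching steps above amount to replacing each placeholder with material captured during the trial run. A minimal sketch, with a stubbed `capture` standing in for the AR headset camera; function and field names here are illustrative, not the paper's API:

```python
# capture() stands in for the headset camera; here it just returns a file name.
def capture(kind, step):
    return f"{step}_{kind}.jpg" if kind == "tracker" else f"{step}_{kind}.mp4"

def enrich(state, step):
    """Turn a planned State (with holders) into an enriched one
    (with Trackers and RealIndicators), keeping the virtual indicators."""
    return {
        "trackers": [capture("tracker", step) for _ in state.get("tracker_holders", [])],
        "real_indicators": [capture("indicator", step) for _ in state.get("indicator_holders", [])],
        "virtual_indicators": state.get("virtual_indicators", []),
    }

planned = {"tracker_holders": ["TH1"], "indicator_holders": ["IH1"],
           "virtual_indicators": ["text: insert bolt"]}
enriched = enrich(planned, "task1_action2")
```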
Fig. 4. Tri-step coordination process of authoring workflow.
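The planning-stage structures of Section 4.2.1 (a Procedure of Tasks and Actions, whose states hold only virtual indicators and placeholders, and whose EHA set starts empty) can be sketched as plain data classes. This is an illustrative sketch that mirrors the logical expressions, not the paper's implementation:

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class State:
    # At planning time only virtual indicators and placeholders are preset;
    # trackers and realistic indicators are filled in during enriching.
    virtual_indicators: List[str] = field(default_factory=list)
    tracker_holders: List[str] = field(default_factory=list)
    indicator_holders: List[str] = field(default_factory=list)

@dataclass
class Action:
    name: str
    execution: State = field(default_factory=State)
    completion: State = field(default_factory=State)
    diagnosis: State = field(default_factory=State)
    exception_handling_actions: list = field(default_factory=list)  # empty initially
    performances: list = field(default_factory=list)

@dataclass
class Task:
    name: str
    tools: List[str] = field(default_factory=list)
    items: List[str] = field(default_factory=list)
    actions: List[Action] = field(default_factory=list)
    operators: List[str] = field(default_factory=list)

@dataclass
class Procedure:
    tasks: List[Task]
    preferences: dict = field(default_factory=dict)

# A skeleton ARWI: one task, one action, EHA set empty as step (2) requires.
proc = Procedure(tasks=[Task(name="fit-bracket",
                             actions=[Action(name="align-hole")])])
```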

(2) When the Action is finished, the operation result of the Action needs to be judged in order to determine whether the exception handling process needs to be executed.
(3) If the Action is properly carried out and the result is acceptable, then a normal Action is recorded. Go to the next action, i.e. repeat step 2.
(4) If the Action is properly carried out but the result is not acceptable, then an EHA will be automatically created based on the template (with preset placeholders). The operator configures virtual indicators for the EHA and attaches it to the normal Action as part of the knowledge base of exception diagnosis and handling. Then the process goes to step 2 for exception handling, iteratively, until the exception is resolved.
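The judging loop of steps (2)-(4) can be sketched as follows, with hypothetical `execute` and `make_eha` callbacks standing in for the trial run and the EHA template:

```python
def run_trial(actions, execute, make_eha):
    """Steps (2)-(4): after each action, judge the result; on failure attach
    an exception handling action (EHA) and retry until the result is acceptable."""
    for action in actions:
        while not execute(action):          # step (2): judge the result
            eha = make_eha(action)          # step (4): create an EHA from the template
            action.setdefault("ehas", []).append(eha)
        # step (3): acceptable result, proceed to the next action

# Toy harness: "a2" fails once, then succeeds on the retry.
attempts = {"a1": [True], "a2": [False, True]}
def execute(action): return attempts[action["name"]].pop(0)
def make_eha(action): return {"solves": action["name"]}

acts = [{"name": "a1"}, {"name": "a2"}]
run_trial(acts, execute, make_eha)
```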
Action = {ExecutionState, CompletionState, DiagnosisState, ExceptionHandlingActions};

ExceptionHandlingActions = {ExceptionHandlingAction, ExceptionHandlingAction ∈ EACT}

(a) If the Action is not carried out properly and the result is unacceptable, the EHA loop will be executed iteratively until the result is accepted.
(b) If the Action is not carried out properly but the result is acceptable, the process will go to the normal Action with the EHA attached as a solution for the recorded exception cause. If the result of this normal Action confirms its effectiveness, the process will go to the next Action. If the result of this normal Action is still unacceptable, the process can either go to step 2 immediately for re-execution of this normal Action, or mark it for re-execution in the next trial operation. Eventually, the normal Action route should be executed with all results accepted.

4.2.3. Post-processing by the engineer

As the last step, the engineer performs the completion-processing of ARWI:

(1) The engineer registers each realistic Indicator on the tracking images; that is, the Registration information describes on which Tracker, and at what location, each Indicator needs to be superimposed. See Section 4.4 for details of the registration method.

State = {Trackers, RealIndicators, VirtualIndicators, Registrations};
Registrations = {Registration, Registration ∈ REGI};
Registration = {Tracker, Location, RealIndicator}, Location ∈ LOCT;

(2) Once the registrations are specified, the tri-step coordination process is finished, and the completed ARWI is saved and sent to the production environment for real operation.

4.3. AR environment adaptation based on natural features

There are four guidelines to follow to achieve AR adaptiveness. First, the background scene entities for tracking must come from the real operation environment, in order to keep conformance and to reduce the workload of identifying objects. Second, the environment tracking should not affect the physical state of the operation, i.e. placing markers or other physical indicators must be avoided. Third, the tracking should reflect the true view when the distance and angle of the operator's observation change. Fourth, the registration feature information, such as the overlying locations, should follow the changes of observing distance or angle.

Tracking and superimposing are functions of the AR system; ARWI only provides the supporting information for them. The design of ARWI should therefore focus on how to specify the tracking and registration information quickly and consistently. In the reported work, image-based visual tracking is used to capture features without placing physical markers. The operator takes images of the real environment as the tracking background, under the restriction of the placeholders, during the trial operation. The engineer is not supposed to spend much effort providing tracking information. When the ARWI is executed, these images are used to identify the operation objects.

In order to handle changes of observed distance or angle, multiple images should be prepared beforehand according to the expected view parameters. Specifying the overlying locations on different tracking images is the key work of environment adaptiveness design. When the operator has captured multiple tracking images for a specific State of an Action, the engineer needs to specify the overlying locations in the different images; however, these locations must indicate the same position in the physical environment. Designating them manually one by one is time-consuming and inaccurate. Therefore, we propose a multi-image synchronous registration method in which the operator only needs to specify the coordinate of the overlying point in any one image of the image group; the coordinates of the corresponding points in the other images are then automatically identified and extracted as the registration locations.

The calculation process is shown in Fig. 5; the steps are as follows. (1) Prepare the registering images: select any one image from the tracking image group as the reference image RI, and set the other images of the group as the target image set TI. (2) Select the reference coordinate point RP in RI. RP is the registration location where indicators will be overlaid in RI, and also the datum point for the registration locations of the other tracking images. (3) Extract the feature points RFP of RI using the SIFT algorithm, because SIFT deals with scaling and deflection of images effectively compared with other algorithms (Hassaballah et al., 2016; Lowe, 2004). (4) Extract the feature points TFP_i of each target image TI_i, also using SIFT, and then match RFP and TFP_i with their descriptors. (5) Calculate the epipolar line EL_i corresponding to RP for TI_i using epipolar geometry, which accurately reflects the relations between 3D points and their projections onto different 2D images (Zhang, 1998). (6) Search for the intersection TP_i of EL_i and TI_i. TP_i is the projection of the same spatial point of the physical environment that RP represents in RI; that means TP_i is the registration location in TI_i. (7) Set RP and all TP_i as the registration points; when all target tracking images have been processed, the multi-image synchronous registration flow completes.

4.4. Guiding detail design based on real-time behavior

ARWI provides guiding contents with a suitable level of detail according to the operator's preference. The guiding interface design has common feature characteristics. Satisfactory settings, i.e. the types to use (Oh and Byun, 2012), the amount of information (Golparvar-Fard et al., 2009), and the performance notification (Uva et al., 2018), are influenced by the operator's emotion, physiology and cognitive ability. This paper suggests that the design of guidance should consider the real-time behavior of the operator and balance the preset preference settings, i.e. the watching priority of AR content types, the visual range of observation, the recorded information of previous operation performance, and the guidance strength of the different types of AR content, in order to comprehensively calculate the specific guiding contents suitable for the operator in action.

The levels of guidance strength of AR contents, i.e. text, audio, picture, video, static annotation, dynamic annotation, static 3D model and dynamic 3D model, differ due to their characteristics. The characteristics of AR contents can be classified according to Intuitivity, Dynamicity and Spatiality. Intuitivity means the degree of effort required by a user to perceive the content directly by intuition, without rational thought; the less time and effort used, the more intuitive the content. Dynamicity means the degree of change of the content in appearance and location over time; the more changes occur, the more dynamic the content. Spatiality means the degree of change of the content in meaning richness over spatial
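Steps (5)-(6) of the multi-image synchronous registration can be sketched in pure Python, assuming the fundamental matrix F between RI and TI_i is already known (in practice it would be estimated from the SIFT matches), and approximating the intersection search by picking the candidate feature point closest to the epipolar line:

```python
import math

def epipolar_line(F, rp):
    """l = F · [x, y, 1]^T: the epipolar line (a, b, c) in TI_i for point rp in RI."""
    x, y = rp
    return tuple(F[r][0] * x + F[r][1] * y + F[r][2] for r in range(3))

def point_line_distance(line, pt):
    a, b, c = line
    return abs(a * pt[0] + b * pt[1] + c) / math.hypot(a, b)

def register_point(F, rp, candidates):
    """Step (6): take TP_i as the candidate feature point of TI_i
    closest to RP's epipolar line."""
    line = epipolar_line(F, rp)
    return min(candidates, key=lambda p: point_line_distance(line, p))

# Toy case: pure horizontal translation between the views, for which the
# epipolar line of RP = (x, y) is the horizontal line v = y in the target image.
F = [[0, 0, 0], [0, 0, -1], [0, 1, 0]]
tp = register_point(F, (30, 40), [(50, 12), (80, 40), (10, 38)])
```

With F for a pure horizontal translation, the line for RP = (30, 40) is v = 40, so the candidate (80, 40) is selected.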
Fig. 5. Process of multi-image synchronous registration.

dimensions. The more changes occur, the more spatial the content. Generally speaking, for guidance strength, a dynamic type is stronger than a static one, an intuitive one is stronger than a descriptive one, and a spatial type is stronger than a planar one. Each category of characteristics can be divided into three levels, i.e. the corresponding guidance strength increases from 1 to 3. Intuitivity is further classified into Descriptive, Interpretive and Intuitive. Dynamicity is divided into Static, Semi-dynamic and Dynamic. Spatiality is divided into Straight, Planar and Spatial. Because visuality and dynamicity have a greater impact on the mental model building of operators (Radkowski et al., 2015), the guidance strength of AR content types can be divided into three levels, low, medium and high, based on whether the content is intuitive or dynamic.

In general, AR contents with high guidance strength consume more time and increase information redundancy, but provide more details (Siew et al., 2019; Gattullo et al., 2019). Different levels of operators need different types of guidance contents (Palmarini et al., 2018). For novices, contents with more details and better inductiveness, such as video or 3D animation, are more suitable because novices lack rich practical experience and deep operation recall; correct actions are more critical for them than high operation efficiency. For experts, contents with less time consumption and visual occlusion, such as text or static annotation, are more suitable because experts have rich practical experience and subconscious movement; high operation efficiency is more critical for them, on the basis of correct actions. Beyond the operator level factor, the operation environment also affects the choice of visual AR contents (Radkowski et al., 2015). For example, audio contents are not suitable for a noisy work environment; accurate position indication prefers a static annotation with an arrow rather than text or video; and an action with a complex movement path needs a 3D animation rather than a static image or text. Therefore, conveying the same operation guidance to different people and environments needs different choices of AR contents. In fact, a combination of multiple content types is more common in complex operation scenes than a single content type, in order to fully describe the operation requirements and ensure the effect of guidance. The categories and guidance strength of AR contents are shown in Table 2.

Operator preferences can be divided into two aspects: the desired visual range of observation and the priority of AR content types. Observing too many contents at the same time will increase cognitive load and attention distraction. One operator may want to see the overlaid contents only near the center of camera vision, while others may want to see a wider range. Likewise, compared with video, some users may want to see 3D annotation first. The premise of guiding interface configuration is that the existing preference and past performance information have been recorded in the ARWI. When the ARWI is authored, the preference information of the specific operator should be included in the Procedure as global meta-data. When the ARWI is executed, after every Action finishes, the current operator's operation result and time for that Action are recorded. The guidance strength of AR contents can then be adjusted based on the previous operation performance records of the current operator. All AR contents with the corresponding guidance strength in the current action are selected to build the indicator set. The indicator set is then filtered according to the operator's desired visual range of observation, and only the indicators in the range of observation are retained. Finally, the indicator set is sequenced according to the operator's watching priority of AR content types; only the indicators with the highest priority are set to be on by default. The remaining indicators in the indicator set will be superimposed when the configuration is finished.

Ultimately, the final performance measures of the operator, operation result and time, are the decisive factors in deciding which AR contents should be superimposed. The operation result reflects the quality, and the operation time reflects the efficiency. By calculating the action error rate of the current action and the difference between the average operation time and the standard operation time, we can evaluate the performance of the current operator for the current action, and then decide the level of the guidance parameters. That is why the operation performance of the operator needs to be recorded in the ARWI system after each action is completed. The configuration process of guiding adaptiveness based on real-time behavior is shown in Fig. 6.

(1) Fetch the preference information of the current operator once the operation starts, together with the previous performance information specific to the first Action to be started.
(2) Calculate the Action Error Rate (AER) of the current operator who is executing this Action. Assume the Action has been executed N times by the current operator and the operation result of the Ith execution is R_I. The error count can be defined as F = Σ_{I=1..N} F(R_I), where F(R_I) = 0 if R_I == true and F(R_I) = 1 if R_I == false. As long as an error occurs in the operation process, even if the error has been resolved through exception
Table 2
Categories and guidance strength of AR contents.

Type | Characteristics | Guidance strength | Strength level | Suitable operator level | Suitable scenes
Text | Descriptive, Static, Straight | 3 | Low | Expert | Simple action or nonvisual
Audio | Interpretive, Semi-dynamic, Straight | 5 | Low | Expert | Not noisy scene, simple action
Picture | Intuitive, Static, Planar | 6 | Medium | Intermediate | Single-step action
Static Annotation | Intuitive, Static, Spatial | 7 | Medium | Intermediate | Position, direction or object
Static 3D Model | Intuitive, Static, Spatial | 7 | Medium | Intermediate | Augmented position, direction or object
Video | Intuitive, Dynamic, Planar | 8 | High | Novice | Complex or multi-step action
Dynamic Annotation | Intuitive, Dynamic, Spatial | 9 | High | Novice | Emphasized position, direction or object
Dynamic 3D Model | Intuitive, Dynamic, Spatial | 9 | High | Novice | Action with complex path or multi-step

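Table 2 can be encoded directly as a lookup, e.g. to fetch the content types suited to an operator level. This is an illustrative encoding, not the paper's data format; the Low/Expert values spanning Text and Audio follow the table layout.

```python
# (guidance strength, strength level, suitable operator level) per content type,
# transcribed from Table 2.
GUIDANCE = {
    "Text":               (3, "Low",    "Expert"),
    "Audio":              (5, "Low",    "Expert"),
    "Picture":            (6, "Medium", "Intermediate"),
    "Static Annotation":  (7, "Medium", "Intermediate"),
    "Static 3D Model":    (7, "Medium", "Intermediate"),
    "Video":              (8, "High",   "Novice"),
    "Dynamic Annotation": (9, "High",   "Novice"),
    "Dynamic 3D Model":   (9, "High",   "Novice"),
}

def types_for(operator_level):
    """Content types suited to an operator level, strongest guidance first."""
    rows = [(t, s) for t, (s, _, lvl) in GUIDANCE.items() if lvl == operator_level]
    return [t for t, s in sorted(rows, key=lambda r: -r[1])]
```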
Fig. 6. Calculation process of guiding adaptiveness.

handling, the result will still be recorded as an error. So the AER can be defined as AER = F / N. If it is the first-time operation, AER = 1.
(3) Calculate the Action Time Rate (ATR) of the current operator executing the current Action. Assume the standard operation time is T and the operation time of the Ith execution is T_I. The mean operation time can be defined as T̄ = (Σ_{I=1..N} T_I) / N. So ATR = (T̄ − T) / T. If it is the first-time operation, ATR = 1.
(4) Identify the Guidance Strength Level (GSL) based on the Error Rate Threshold (ERH) of AER and the Time Rate Threshold (TRH) of ATR. Both ERH and TRH can be customized to the actual situation of the enterprise. Assume the values of ERH are ERH_I and ERH_E respectively for the intermediate and expert operator, with ERH_I > ERH_E; and the values of TRH are TRH_I and TRH_E respectively for the intermediate and expert operator, with TRH_I ≥ 0 > TRH_E. The GSL needed by the current operator can be defined as:

GSL = Low, if AER < ERH_E or ATR < TRH_E;
GSL = Intermediate, if ERH_E ≤ AER ≤ ERH_I or TRH_E ≤ ATR ≤ TRH_I;
GSL = High, if AER > ERH_I or ATR > TRH_I.

(5) Judge the result of the last operation and take a punitive measure: if the last result is false or the last operation time exceeds the standard operation time, the GSL is increased by one level.
14 J. Geng, X. Song, Y. Pan et al. / Computers in Industry 119 (2020) 103229

when:ImageTriggered

⎪ Low → Intermediate, RI == false or TI > T

GSL = Intermediate → High, RI == false or TI > T

[r == “No”]

== ture
[in (T)]
High → High, RI == false or TI > T

final



(6) Fetch all indicators  with this GSL in the Action and build the

when:ImageTriggered

when:ImageTriggered
indicators set IALL = I, SLI = GSL where I is the indicator and

InterruptSignal ==
the GSLI is the guidance strength level of I. If there are no indica-
tors that have this GSL or the designer has specific requirements,

interrupting

[r == “No”]
the algorithm should adjust according to the special situation.

[after (T)]

[after (T)]
== false

== false
(7) Filter IALL according to the Preset Vision Range V . V represents

when:

true
the ratio of the maximum visual range to the maximum visual



radius of the center of camera vision, and 0 < V ≤ 1. Fetch the
PI of I. Assume DI is the distance from PI to the center of camera

when:ImageTriggered
vision, the maximum radius of camera vision is R. If PI is not in

exception-handling
the visual range, I will be removed. The IALL filtered according to
the V can be expressed as:

== ture
[in (T)]
DI
IALL = I, GSLI = GSL, ≤V



R

when:ImageTriggered
(8) Continue to filter IALL and retain the indicators with the highest
guidance strength value in the set. Assume GS I is the guidance
strength of I and the highest guidance strength in the set is

diagnosing

[after (T)]
GS MAX . The IALL filtered by guidance strength can be expressed

== false
as





DI
IALL = I|GSLI == GSL, ≤ V, GS I = GS MAX
R

CompletionSignal
(9) Then, filter IALL according to the Preset Type Priority. The priority
is the sort of content type. Assume TI is the type of I and the
checking

== true
preference degree of TI is F(TI ). Then the priority configuration
when:

is defined as F = (F(TI, ) ≥). If the highest preference degree in






F is F(TI )MAX , the final IALL can be expressed as:
when:ImageTriggered

DI
IALL = I, GSLI = GSL, ≤ V, GS I = GS MAX , F (TI ) = F(T )MAX
R
executing

== ture
[in (T)]

(10) Finally, the retained indicators in the IALL will be superimposed


at the registration location.
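The guiding-adaptiveness calculation above (Fig. 6) can be sketched in Python; the system itself is implemented in C#, and the `Indicator` record, function names and threshold values here are illustrative assumptions rather than the paper's code.

```python
from dataclasses import dataclass

@dataclass
class Indicator:
    """Hypothetical indicator record; field names are illustrative."""
    gsl: str      # guidance strength level the content is authored for
    gs: int       # guidance strength value (3..9, per the content-type table)
    ctype: str    # content type, e.g. "Video" or "Dynamic Annotation"
    dist: float   # D_I: distance from registration point P_I to the vision center

def error_and_time_rates(fail_count, times, t_std):
    """Steps (2)-(3): AER = N_F / N and ATR = (mean(T_I) - T) / T.
    For a first-time operation both rates default to 1."""
    if not times:
        return 1.0, 1.0
    aer = fail_count / len(times)
    atr = (sum(times) / len(times) - t_std) / t_std
    return aer, atr

def guidance_strength_level(aer, atr, erh_i, erh_e, trh_i, trh_e,
                            last_failed=False, overtime=False):
    """Steps (4)-(5): classify the needed GSL, then apply the punitive bump.
    The paper's piecewise 'or' conditions are resolved here sequentially."""
    if aer < erh_e or atr < trh_e:
        gsl = "Low"
    elif aer <= erh_i or atr <= trh_i:
        gsl = "Intermediate"
    else:
        gsl = "High"
    if last_failed or overtime:  # step (5): punitive measure, one level up
        gsl = {"Low": "Intermediate", "Intermediate": "High", "High": "High"}[gsl]
    return gsl

def select_indicators(indicators, gsl, v, r, priority):
    """Steps (6)-(9): keep indicators of the needed GSL, inside the preset
    vision range V (D_I / R <= V), with the highest guidance strength,
    and with the most preferred content type."""
    pool = [i for i in indicators if i.gsl == gsl]
    pool = [i for i in pool if i.dist / r <= v]
    if pool:
        gs_max = max(i.gs for i in pool)
        pool = [i for i in pool if i.gs == gs_max]
        f_max = max(priority[i.ctype] for i in pool)
        pool = [i for i in pool if priority[i.ctype] == f_max]
    return pool  # step (10): these are superimposed at their registration locations
```

For example, an operator with a high error rate is classified as needing "High" guidance, and among the in-view "High" contents the strongest and most preferred type is retained for overlay.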



4.5. Process control based on state machine


Operation consists of a series of Tasks, which in turn consist of Actions. Actions have three categories: Identification, Implementation and InspectionTest. Identification is used to confirm the required Tools and Items before implementing the operation. InspectionTest is used to inspect specific key results and test some key parameters. Implementation is used to carry out the main contents of the operation. Clusters of Actions form diverse and complete operation workflows. Therefore, accurate control of each Action is the key to process adaptiveness. In order to support operators to execute the task process, including normal operation, exception handling and quality control in accordance with operational specifications, the ARWI system uses a triggering mechanism to control the change of Action state, and thereby realizes the adaptive control.

Each Action has eight states: initial, entering, executing, checking, diagnosing, exception-handling, interrupting, and final. The state transition process is shown in Fig. 7 and the transition table is shown as Table 3. Image tracking method is applied to check the

Table 3
Action State Transition Table.
(—: impossible transition).

Fig. 7. State transition process of Action.

change of operation environment (Section 4.3). The configuration information for tracking is provided by the classes of EXEC, COMP and DIAG in ARWI. In this section, the Action state control method is introduced, which is based on the finite state machine (Shafique and Labiche, 2015) approach.

The finite state machine of Action control can be described by a five-element tuple M = (Σ, S, s0, δ, F). Σ is the input alphabet, which is a finite, non-empty set of event trigger signals like "InterruptSignal==true". S is a finite, non-empty set of states:

S = {initial, entering, executing, checking, diagnosing, exception-handling, interrupting, final}

s0 is the initial state, an element of S, and s0 = initial. δ is the state-transition function δ : S × Σ → S; it transits the current state to another state. F is the set of final states, and F = {final}.

5. System implementation

An AR-based Industrial Operation Smart Pilot System (AR-IOSPS) has been developed for validating the proposed method with OpenCV, EasyAR, Unity, Visual Studio 2015 and the C# language. OpenCV is used to realize the multi-image synchronous registration. EasyAR is used as the development platform for physical environment tracking, and Unity is used as the development platform for content overlaying and display. The AR-IOSPS framework is shown in Fig. 8. The framework is divided into four levels: data, logical, function and application.

At the data level, the XML format is used to express and store the ontology entities of the ARWI structure and to index tracking and indicator files. The tracking images and associated audio, video, model and other indicator files are stored in the file system supporting Windows and Android. All of them can be compressed into one work instruction package and be released to smart glasses or a pad.

At the logical level, five API managers are used to manage system data and control run-time logic. ARWI Manager reads and writes the ARWI structure and resources. TrackReg Manager tracks the environment and registers AR contents. Visualization Manager displays AR contents with a proper layout. Control Manager calculates the running status based on the proposed adaptive algorithms and controls the executive process. Interaction Manager is responsible for user input.

At the function level, ARWI Designer is responsible for planning and authoring the Procedures, Tasks, Actions, States and some indicators, reserving placeholders, and configuring the personnel behaviors. Smart Pilot is responsible for instantiating placeholders, creating environment-related indicators such as images and video, and recording exception handling knowledge during the trial operation. Furthermore, Smart Pilot is also responsible for controlling the operation process, tracking the operation environment and recognizing the current scene according to the tracking images, superimposing the corresponding AR contents for operation guidance, and recording the performance of the operator during the formal operation.
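The Action state machine M = (Σ, S, s0, δ, F) of Section 4.5 can be sketched in Python (the system itself is implemented in C#). Signal names such as ImageTriggered, InterruptSignal and CompletionSignal appear in Fig. 7; the transition entries below are an illustrative subset, not the full Table 3.

```python
# States S of the Action finite state machine.
STATES = {"initial", "entering", "executing", "checking", "diagnosing",
          "exception-handling", "interrupting", "final"}

# δ : S × Σ → S written as a lookup on (state, trigger signal) pairs.
# This is an illustrative subset of transitions, not the paper's Table 3.
DELTA = {
    ("initial", "StartAction"): "entering",
    ("entering", "ImageTriggered==true"): "executing",
    ("executing", "InterruptSignal==true"): "interrupting",
    ("interrupting", "ImageTriggered==true"): "executing",
    ("executing", "CompletionSignal==true"): "checking",
    ("checking", 'r=="Yes"'): "final",
    ("checking", 'r=="No"'): "diagnosing",
    ("diagnosing", "ImageTriggered==true"): "exception-handling",
    ("exception-handling", 'r=="Yes"'): "executing",
}

class ActionFSM:
    """Drives one Action through its eight states via trigger signals."""
    def __init__(self):
        self.state = "initial"  # s0

    def fire(self, signal):
        # Unknown (state, signal) pairs leave the state unchanged,
        # standing in for the "-" (impossible transition) cells of Table 3.
        self.state = DELTA.get((self.state, signal), self.state)
        return self.state

    @property
    def done(self):
        return self.state == "final"  # F = {final}
```

A normal run simply fires StartAction, ImageTriggered, CompletionSignal and a passing check result in sequence to reach the final state.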

Fig. 8. Architecture of AR-IOSPS.
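As a concrete illustration of the data level, an ARWI package fragment might be serialized along these lines; the element and attribute names here are hypothetical, not the paper's actual schema:

```xml
<!-- Hypothetical fragment; element/attribute names are illustrative only. -->
<Operation name="Hydraulic actuator disassembly">
  <Task name="Preparation">
    <Action name="Identify Tools" category="Identification">
      <TrackerPlaceholder id="tp-01" imageRef="trackers/tools.jpg"/>
      <Indicator type="StaticAnnotation" gsl="Intermediate"
                 contentRef="indicators/tools-note.xml"/>
    </Action>
  </Task>
</Operation>
```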

At the application level, the engineer uses ARWI Designer to plan the overall structure of the ARWI, preset placeholders, and send the ARWI to the AR devices. Then, the experienced operator of the trial operation wears or uses AR devices to create tracking images and some AR indicators online, record exception handling knowledge and complete the enriching of the ARWI. Then the production operator uses the completed ARWI to carry out the operation in the formal operation.

6. User studies

6.1. User study set-up

This work aims to provide a systematic method for designing the adaptive ARWI without programming. It implies that the adaptive ARWI should be feasible in the authoring and running process, and its structure should also be generalizable to different complex operations. Therefore, we conducted initial pilot studies in order to assess the assumptions made. There are three types of user studies: authoring load, running acceptability and structure generality. At the same time, an application case from authoring the ARWI to running the ARWI was also integrated into the user studies.

The first study, on the authoring load of ARWI and EWI, was used to gather the time and task load information with the NASA-TLX method (Hart, 2006), and to assess if the authoring process of ARWI is feasible compared to EWI. The reason why the EWI was chosen is that it has been widely adopted in industries, so the comparison results would be objective and evaluable. A comparison of the range of adaptiveness with other existing authoring methods of ARWI was also discussed in this study, and an authoring case was integrated. The second study, on the running acceptability of ARWI, was used to gather the users' feedback on ARWI running with a NASA-TLX-like method, and to assess if its running is acceptable for operators. In this study, the running acceptability of ARWI's adaptiveness is the assessment objective, because the task performance (completion times and error rates) of ARWI has been investigated in related works (Uva et al., 2018; Elia et al., 2016; Vignali et al., 2018). A running case was also integrated in this study. The last study, on the structure generality of ARWI, was used to assess if its structure is generalizable to different complex operations. In this study, the completeness of three ARWIs with different operation types and complexity was used as the basis for determination.

6.2. Study for authoring load of ARWI

For this study, a hydraulic actuator disassembly of an engine's tail nozzle was chosen as the study object. Five subjects (all male) with an average age of 25, all from the Aeronautical Maintenance Laboratory of Xi'an Aeronautical University, were recruited. Four of them are assistants and one of them is a laboratory operator. They were divided into four groups, and each group includes one assistant and the same operator.

Assistants planned and post-processed the ARWI and EWI separately, and the operator carried out the trial operations to provide

Fig. 9. Hydraulic actuator in aircraft engine’s tail nozzle.

the AR-related contents for the ARWI and provide the operation pictures and videos for the EWI. They all had received training in ARWI design and the use of AR glasses. Each of the assistants authored one ARWI with our system and one EWI with CAPPFramework. CAPPFramework is a commercial software package for EWI design which has been used in more than 200 Chinese manufacturing enterprises (CAPPFramework, 2020). The operator carried out the trial operation one time for the ARWI and one time for the EWI. All of them recorded the time consumption and filled in the form of the ARWI Authoring Load Index based on the NASA-TLX method. In order to keep the test conditions counterbalanced, the first two groups authored the ARWI before the EWI and the last two groups authored the ARWI after the EWI.

The authoring process was divided into three phases in order to ensure the same comparison benchmark for both the ARWI and EWI: planning, enriching and post-processing. Actually, the EWI design also needs some on-site contents besides technical drawings or 3D models to improve the guidance effect, so the enriching and post-processing phases are also necessary for it. But the difference is that EWI authoring doesn't need to collect the AR tracking contents and augmented visuals, and it also doesn't need to specify the registration positions when the pictures or videos are added.

Disassembly of a hydraulic actuator from an aircraft engine's tail nozzle is a typical complex industrial operation with a long workflow. The operation involves 19 operation actions and 13 kinds of parts. The hydraulic actuator is shown in Fig. 9 and the main disassembly tasks and actions are shown in Table 4. In order to ensure the reliability of the data used for comparison, all assistants and operators were required to author the ARWI and EWI with the designated operation workflow, operation contents, technical/management requirements and guidance information.

Table 4
Disassembly process of hydraulic actuator.

Task | Action
Preparation | Identify Tools; Confirm Environment; Confirm Hydraulic Actuator
Onboard Disassembly | Remove Safety Unit; Unscrew Outer Nuts; Unscrew Inner Nuts; Remove Retaining Pins
Accessories Disassembly | Remove Plane Liner; Remove Stressing Nuts; Remove Max-state Nuts
Sleeve Disassembly | Remove Stop Screws; Detach Sleeve; Unscrew Retaining Nuts; Detach Retaining Nuts
Piston Disassembly | Detach Sealing Cylinder Liner; Separate Front Piston; Remove Back Piston
Finishing Check | Check Parts; Clean Environment

(1) Planning ARWI. The assistant used ARWI Designer to plan the ARWI based on the technical requirements of the operation. The EWI was also planned at the same time. Fig. 10 shows how the assistant planned the ARWI. Fig. 11 shows how the assistants preset the placeholders. Fig. 12 shows how the assistant planned the EWI.

(2) Enriching ARWI. The experienced operator wore a pair of smart glasses and completed the whole trial operation of disassembly as shown in Fig. 13. In the trial operation, the operator instantiated all placeholders as shown in Fig. 14.

The AR device is a pair of split-type smart glasses, model Techlens T2. Its configuration consists of a 1.8 GHz Qualcomm Snapdragon CPU, 2 GB main memory, an OLED screen, a 35° FOV, an 8-megapixel camera and the Android system. At the same time, the operator also recorded the pictures and videos of the trial operation for the EWI design with a conventional digital camera.

(3) Post-processing ARWI. When the operator completed the enriching process of the ARWI and EWI, the assistant specified the registration locations for specific indicators on specific tracker images (Fig. 15) for the ARWI post-processing. The assistant also linked the corresponding images or videos in specific operation steps of the EWI.

When the authoring process was completed, the time consumption was collected and listed in Table 5.

Based on the recorded data, it can be seen that the mean values of the total times of authoring ARWI and EWI (119.25 vs. 114) are similar in this user study. Although the mean times are similar, the ARWI already has an essential difference from the EWI in terms of AR.

For the planning time, the ARWI takes more than the EWI (44.75 to 35.75) because the engineer needed to take some time to think about what and where the placeholders should be. This is different from the general EWI design. For the post-processing time, the ARWI takes more than the EWI (21 to 7.75). The main reason is that there are many registration locations that needed to be specified. If the number of AR tracking images could be reduced, the post-processing time of ARWI should decrease.

For the enriching time, considering that the time of video recording may vary from person to person and the designated pictures and videos are the same, the same operator just needed to record one time on ARWI and one time on EWI for all groups. An interesting finding was that the enriching time of ARWI is less than that of EWI (52 vs. 69), although the number of recorded contents for ARWI is more than for EWI. The main reason would be that the operator just recorded the contents under the guidance of placeholders for

Fig. 10. Planning tasks and actions of ARWI.

Fig. 11. Presetting tracker’s placeholders of ARWI.

Table 5
Time of authoring process (minutes).

User group | ARWI Planning | ARWI Enriching | ARWI Post-processing | ARWI Total | EWI Planning | EWI Enriching | EWI Post-processing | EWI Total
1 | 44 | 52 | 24 | 120 | 35 | 69 | 7 | 111
2 | 47 | 52 | 20 | 119 | 38 | 69 | 11 | 118
3 | 45 | 52 | 21 | 118 | 37 | 69 | 8 | 114
4 | 46 | 52 | 22 | 120 | 38 | 69 | 6 | 113
Mean time | 45.5 | 52 | 21.75 | 119.25 | 37 | 69 | 8 | 114

Fig. 12. Planning operation process of EWI.

Fig. 13. Trial operation for ARWI enriching.

Fig. 14. Instantiate placeholders for ARWI enriching.

communicate with engineers to make sure what contents should be recorded for the EWI enriching. The communication time was also considered a part of the enriching time.

The form data of the ARWI and EWI Authoring Load Index were collected and listed in Table 6. 0 means the minimum load and 100 the maximum load in this survey. MD, PD, TD, PE, EF, FR and W represent Mental Demand, Physical Demand, Temporal Demand, Performance, Effort, Frustration and Weight respectively.

The final weighted average scores of ARWI authoring load and EWI authoring load are 51.20 and 46.33. Based on the research on workload acceptability (Eitrheim and Fernandes, 2016), the final scores mean the task loads of both ARWI and EWI are acceptable, because workload levels below 50 were perceived as acceptable, and even higher levels of workload (>70) were in some situations perceived as acceptable too. It was within expectation that the ARWI's final workload is higher than the EWI's. After all, ARWI design is a new design mode of work instruction.

The mental demand score of ARWI is higher than the EWI's (60 to 50.45) because the authoring mode of ARWI is new for the users and most of the users needed to adapt to this new mode. The users needed

Fig. 15. Specify the registration location.

Table 6
Authoring Load Index weighted averages of ARWI and EWI (rating/weight).

ARWI:
User | MD/W | PD/W | TD/W | PE/W | EF/W | FR/W | Mean
Engineer 1 | 60/5 | 35/0 | 50/1 | 55/2 | 60/3 | 55/4 | 57.33
Engineer 2 | 65/5 | 50/0 | 45/1 | 50/4 | 70/3 | 35/2 | 56.67
Engineer 3 | 40/0 | 55/4 | 45/2 | 60/2 | 40/3 | 55/4 | 51.33
Engineer 4 | 55/5 | 35/1 | 40/3 | 65/3 | 45/3 | 20/0 | 50.67
Operator | 40/0 | 40/3 | 15/2 | 60/4 | 30/5 | 60/1 | 40.00
Diagnostic subscores | 60 | 46.88 | 37.22 | 57.67 | 46.76 | 51.82 |
Final score | | | | | | | 51.20

EWI:
User | MD/W | PD/W | TD/W | PE/W | EF/W | FR/W | Mean
Engineer 1 | 55/4 | 45/1 | 55/0 | 35/3 | 55/5 | 45/2 | 49.00
Engineer 2 | 50/4 | 50/1 | 30/2 | 25/2 | 60/3 | 50/3 | 46.00
Engineer 3 | 45/3 | 30/5 | 70/1 | 25/2 | 55/1 | 40/3 | 38.67
Engineer 4 | 45/0 | 45/4 | 45/4 | 30/3 | 40/3 | 45/1 | 41.00
Operator | 45/0 | 55/2 | 65/2 | 55/3 | 50/3 | 60/5 | 57.00
Diagnostic subscores | 50.45 | 41.15 | 48.89 | 31.54 | 52.00 | 50.36 |
Final score | | | | | | | 46.33
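The weighted averages in Table 6 follow the standard NASA-TLX scheme: each factor's rating is weighted by how often that factor was chosen in the 15 pairwise comparisons, so the weights sum to 15. A quick Python check against the Engineer 1 ARWI row:

```python
def tlx_weighted_score(ratings_and_weights):
    """NASA-TLX workload score: sum(rating * weight) / sum(weights)."""
    total = sum(rating * weight for rating, weight in ratings_and_weights)
    weight_sum = sum(weight for _, weight in ratings_and_weights)
    return total / weight_sum

# Engineer 1, ARWI row of Table 6: MD 60/5, PD 35/0, TD 50/1, PE 55/2, EF 60/3, FR 55/4
score = tlx_weighted_score([(60, 5), (35, 0), (50, 1), (55, 2), (60, 3), (55, 4)])
# round(score, 2) reproduces the 57.33 reported in the table
```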

to carefully think about how to set the placeholders of tracking images and indicators, which they were not used to. Obviously, it is a more abstract task with a higher mental demand. The temporal demand score of ARWI is lower than the EWI's (37.22 to 48.89) because the cooperation between engineers and operators has been preset in the ARWI; the engineers and operators could cooperate easily under the guidance of the ARWI. Another finding was that the performance score of ARWI is higher than the EWI's (57.67 to 31.54) because the users of ARWI couldn't see the final result of the work instruction until the ARWI ran on the AR device. If a preview function for ARWI were available, it would help to reduce the performance score.

There are also commercial ARWI authoring tools, such as Dynamics 365 Guides for HoloLens, ScopeAR and Cortona3D. Based on a detailed investigation of these systems' manuals, demos and some trials, we made a technical comparison focusing on adaptiveness, as shown in Table 7. The main goal of these commercial systems is also to design AR work instructions without programming. At the same time, these systems have given their instructions a certain adaptiveness in normal workflows and environments. Especially in the aspect of environmental adaptiveness, some of them support rich computer vision technologies, such as SLAM, with the support of specific hardware. Comparatively speaking, our method has advantages in the adaptiveness of collaborative authoring, operator behavior and management process. For these systems, the engineer needs more effort to build the fundamental data environment because the operator cannot participate in the authoring process. In addition, facing the same guiding contents for all operators regardless of their skill level and experience may confuse operators or affect the efficiency of AR guiding. Furthermore, lack of quality control and exception handling will reduce the coverage of AR work instruction. Our method provides these kinds of adaptiveness through its flexible structure and corresponding methods.

6.3. Study for running acceptability of ARWI

For this study, ten subjects (male: 8, female: 2) with an average age of 24 from the Aeronautical Maintenance Laboratory of Xi'an Aeronautical University were recruited. They were divided into different groups based on their mechanical engineering knowledge and experimental operation skills. Four of them are junior undergraduates with basic mechanical engineering knowledge and experimental operation skills. They didn't conduct the disassembly experiment of the hydraulic actuator but observed the whole disassembly process as training, so they were assigned to the novice group. Four of them are senior undergraduates with more mechanical engineering knowledge and experimental operation skills. They conducted the disassembly experiment of the hydraulic actuator without AR assistance and got scores above B. Therefore, they were assigned to the intermediate group. Two of them are laboratory operators with mechanical engineering knowledge and rich experimental operation experience. They have conducted this

Table 7
Adaptiveness comparison of existing ARWI tools.

Method | Structure | Authoring | Environment | People | Process
Dynamics 365 Guides for HoloLens | Flexible structure for processes and environments | For engineers | Image, model and SLAM based environment tracking and registration | Same contents for all operators | Complying with normal workflow
ScopeAR | Flexible structure for processes and environments | For engineers | Image and model based environment tracking and registration | Same contents for all operators | Complying with normal workflow
Cortona3D | Flexible structure for processes and environments | For engineers | Image or model based environment tracking and registration | Same contents for all operators | Complying with normal workflow
Our method | Flexible structure for collaborative authoring, various operators, processes and environments | Supporting the cooperation between engineers and operators based on placeholders | Multi-image based marker-less environment tracking and registration | Dynamic contents based on operator's preference and performance | Complying with normal workflow, exception handling and quality control

Fig. 16. Tools confirmation.

disassembly experiment of the hydraulic actuator several times. As a result, they could be seen as experts and were assigned to the expert group. They all had been trained how to use the AR glasses and the AR system.

In this study, their preferences about AR contents and estimated performances of operation were preset in the ARWI. Each subject wore AR glasses and carried out the disassembly of the hydraulic actuator under the guidance of the ARWI. After the operation, the subject filled in the form of the ARWI Running Acceptability Index, which is a scoring worksheet designed by us following the NASA-TLX format. There are three acceptability factors: Environment Adaptiveness, People Adaptiveness and Process Adaptiveness. In addition, there are three paired choices for weighting scores: Environment Adaptiveness to People Adaptiveness, Environment Adaptiveness to Process Adaptiveness, and People Adaptiveness to Process Adaptiveness.

In this study, the ARWI was used to overlay and display the AR contents according to the indicators, record the operation behavior in real time, and complete the operation tasks and actions:

(1) Confirming operation environment. The user used AR glasses to confirm whether the operation environment meets the requirements of operation preparation with the predefined tracking images. Fig. 16 shows the scene of tools confirmation.

(2) Carrying out the action under the guidance of AR contents. The user carried out the disassembly action under the guidance of visual AR contents such as annotations and models. Figs. 17 and 18 show the disassembly guidance based on the operator's behavior.

(3) Controlling the operation process. The user controlled the operation process according to the management regulations under the guidance of the ARWI. Figs. 19-21 show the quality inspection, the exception diagnosis and handling, and the parts check respectively.

When the running process was completed, the results of the Running Acceptability Index were collected and listed in Table 8. 0 means the minimum acceptability and 100 the maximum acceptability in this survey.

Overall, it can be seen that the scores of all three groups are more than 70 (73.33, 74.17 and 72.50). It means all groups considered the adaptiveness of ARWI acceptable. Although all the scores are close, their subscores differ, which indicates the preferences of different users.

For the novice group, the acceptability of people adaptiveness got the highest level (80), both among its own three subscores and among the corresponding scores of the three groups. Because the novices lack experience, they thought action guidance is more critical than other factors. But some operators said too many visual contents would affect the effect of guidance and they needed a better layout. For the expert group, the acceptability of process adaptiveness got the highest level (76.67) among its own three subscores, because exception handling and quality control could make operators focus on the technical actions. The experts have rich practical experience, so they thought high operation efficiency is more critical than other factors. It was also seen that the acceptability of environment adaptiveness got a somewhat low value because the tracking accuracy was affected by the metal texture and intense illumination. The operators sometimes needed to shake their AR glasses several times to get correct recognition results.

Fig. 17. Disassembly guidance based on operator’s behavior (1).

Fig. 18. Parts disassembly guiding based on operator’s behavior (2).

Fig. 19. Quality Inspection.

6.4. Study for structure generality of ARWI

For this study, six subjects (all male) with an average age of 26 were recruited. Three of them are engineers and three of them are laboratory operators from different factories and laboratories. Engineers planned and post-processed the ARWIs, and operators carried out the trial operations to provide the AR-related contents and ran the ARWIs for formal operation. They all have relevant professional experience, and had been trained how to use ARWI Designer for design and how to use the AR glasses and Smart Pilot. Three typi-

Fig. 20. Exception diagnosis and handling.

Fig. 21. Parts check.

Table 8
ARWI Running Acceptability Index weighted averages (rating/weight).

User group | Environment Adaptiveness/W | People Adaptiveness/W | Process Adaptiveness/W | Individual score | Group score
Novice 1 | 85/0 | 90/2 | 75/1 | 85.00 | 73.33
Novice 2 | 70/0 | 85/2 | 60/1 | 76.67 |
Novice 3 | 60/2 | 70/1 | 50/0 | 63.33 |
Novice 4 | 65/0 | 70/2 | 65/1 | 68.33 |
Novice diagnostic subscores | 60 | 80 | 66.67 | |
Intermediate 1 | 85/0 | 85/2 | 75/1 | 81.67 | 74.17
Intermediate 2 | 70/1 | 85/2 | 60/0 | 80.00 |
Intermediate 3 | 65/2 | 70/1 | 70/0 | 66.67 |
Intermediate 4 | 65/0 | 70/2 | 65/1 | 68.33 |
Intermediate diagnostic subscores | 66.67 | 78.29 | 70 | |
Expert 1 | 70/1 | 60/0 | 85/2 | 80.00 | 72.50
Expert 2 | 65/1 | 70/1 | 60/1 | 65.00 |
Expert diagnostic subscores | 67.50 | 70.00 | 76.67 | |

cal industrial operations with different types and complexity were chosen for ARWI design. One of them is the case above. Each pair of engineer and operator was responsible for one ARWI design and execution.

When the design and running process was completed, the design data from the structure of the three ARWIs were collected and listed in Table 9. Compared with the corresponding EWIs, the results show that three ARWIs for typical types and medium or high complexity of industrial operations were successfully designed with 100 % completeness, which means the major operation workflow, operation contents, technical/management requirements and guidance information could be successfully expressed in the ARWI and cover the corresponding contents of the EWIs. Their running screenshots are shown in Fig. 22.

6.5. Discussion

The results of the authoring load study show that the time consumption and work load are acceptable for engineers and operators. Because the comparison object is the EWI, whose authoring is relatively mature, we can say that ARWI has the foundation for rapid design and deployment. We found that most users are inexperienced in the new authoring mode of ARWI. The most challenging part is to plan what placeholders are needed and where to place these placeholders when authoring the ARWI. The difficulty of this work should decrease as the engineers become more familiar with it. Here we must point out that the authoring of the case ARWI was based on the given operation process and technical requirements, so the author-

Table 9
Design data from three ARWIs.

No. | Operation name | Task number | Action number | Placeholder number of tracking image | Placeholder number of image indicator | Placeholder number of video indicator | Completeness
1 | Machine tool collet maintenance | 3 | 10 | 180 | 9 | 6 | 100 %
2 | Assembly of hydraulic pipe | 4 | 14 | 260 | 13 | 10 | 100 %
3 | Actuator disassembly of engine's tail nozzle | 6 | 19 | 309 | 21 | 15 | 100 %

Fig. 22. Running screenshots of three ARWIs.

ing of ARWI for fully new industrial operations needs further research in the future.

The results of the running acceptability study show that users accepted the running adaptiveness of ARWI. But the running adaptiveness is still subject to some existing AR technologies such as tracking and visual layout. With the development of these technologies, the running adaptiveness of ARWI should be further improved. For novices, guidance is more critical than other factors, while higher operation efficiency is more critical for experts.

The results of the structure generality study indicate that the structure of ARWI is generalizable to typical industrial operations of different types and medium or high complexity. But more case studies should be examined to assess its generalizability, because the above three operations all belong to the field of mechanical engineering. Case studies in other fields should be examined in the future.

Overall, it can be seen that engineers and operators designed different ARWIs without any programmer within a reasonable time and work load. In addition, the technical and management requirements of operations can be met through controlling the normal operation, quality inspection and exception handling. But some limitations still exist. First, the trial operation is needed for creating real environment-related indicators and tracking images, so one-time or non-trial industrial operations cannot be supported by this method. Second, the live creation of indicators and tracking images according to the placeholders does take some time during the trial operation. Third, if the generation of a dynamic 3D model is desired, the engineer still has to accomplish the job; this method can't optimize this kind of work.

7. Conclusions

This paper addresses the demand for a fast, effective solution for developing adaptive ARWI applications and proposes a systematic method to design the ARWI with adaptiveness. The proposed design method can be adapted to complex industrial operation scenes through a generic semantic structure design and implementation via adaptive instruction authoring, operation guiding, environment tracking and workflow controlling. The contributions of this study are as follows:

1) The proposed method has an adaptive representation structure that expresses and controls those interrelated aspects of ARWI to enable its scalability: authoring, environment, people and process. Compared with other studies, this method further considers quality and exception management, engineer and operator team collaboration in the authoring process, and configuration of user environment settings. More importantly, it also provides an embedded syntax for ARWI ontology modeling, which ensures the consistency and standardization of the ARWI design.

2) It provides a detailed implementation method to achieve the adaptiveness of ARWI in its authoring and running process. Its main advantages include giving full play to the strengths of engineers and operators, utilizing widely used AR technologies such as marker-less tracking, and complying with the operation management regulations of the enterprise. These detailed aspects can be critical for AR technology to move from laboratory procedure towards real practice.

3) It enables the end users to complete the authoring of ARWI without programming, although the creation of some complex AR contents such as 3D animation still needs engineers' efforts. That means that the adaptive ARWI can be quickly customized in a non-programming mode as well as implemented in a natural way to be quickly deployed on the work site. This advantage brings ARWI closer to real applications and makes the technology more adoptable.

The method is deemed helpful to accelerate the AR adoption in real industrial scenes. For a more comprehensive assessment, further evaluation is required to compare the adaptiveness of ARWI with similar systems. In addition, more extensive user studies and more design cases need to be studied besides the field of mechanical engineering, for example electromechanical, electronics & electrical and energy & chemical, because these operations may have some unique requirements.

Furthermore, a critical future work is to deepen and expand the intelligent degree of ARWI's adaptiveness based on the proposed method, especially in terms of more complex environments and more natural guidance. With the rapid development of some emerging AR technologies, such as model-based tracking,
simultaneous localization and mapping (SLAM), spatial projection and multi-modal interaction, it is expected that the proposed method will integrate these emerging technologies through the expansion of the semantic structure and implementation algorithm of adaptive ARWI.

Declaration of Competing Interest

We, the authors, declare that we have no conflict of interest. This manuscript has not been published previously and has not been submitted for publication elsewhere.

Acknowledgements

This work was partially supported by the "top international university visiting program" for "outstanding young scholars" of Northwestern Polytechnical University and the "natural science basic research" project of Shaanxi Province, China (Grant No. 2019JM-435), and was largely conducted while the first author was visiting the University of Alberta. The corresponding author has been supported by a Canada NSERC Discovery grant (RGPIN 5641 Ma). The authors would also like to thank Zhaoxu Chen, Weiwei Wang, Xiaofeng Luo, Bo Ye, Fenghua Xiao and Bin Lu, who contributed to system development, prototype testing and application.

Appendix A. Supplementary data

Supplementary material related to this article can be found, in the online version, at doi:https://doi.org/10.1016/j.compind.2020.103229.

References

Mo, J., Bil, C., Sinha, A., 2015. The life cycles of complex engineering systems. In: Mo, J.P.T., Bil, C., Sinha, A. (Eds.), Engineering Systems Acquisition and Support. Woodhead Publishing, Oxford, pp. 19–36.
Regenbrecht, H., Baratoff, G., Wilke, W., 2005. Augmented reality projects in the automotive and aerospace industries. IEEE Comput. Graph. Appl. 25 (6), 48–56.
Fox, S., 2010. The importance of information and communication design for manual skills instruction with augmented reality. J. Manuf. Technol. Manag. 21 (2), 188–205.
Zhou, J., Li, P., Zhou, Y., Wang, B., Zang, J., Meng, L., 2018. Toward new-generation intelligent manufacturing. Engineering 4 (1), 11–20.
Geng, J., Tian, X., Bai, M., Jia, X., Liu, X., 2014. A design method for three-dimensional maintenance, repair and overhaul job card of complex products. Comput. Ind. 65 (1), 200–209.
Geng, J., Zhang, S., Yang, B., 2015. A publishing method of lightweight three-dimensional assembly instruction for complex products. J. Comput. Inf. Sci. Eng. 15 (3), 031004.
Li, D., Mattsson, S., Salunkhe, O., Fast-Berglund, Å., Skoogh, A., Broberg, J., 2018. Effects of information content in work instructions for operator performance. Procedia Manuf. 25, 628–635.
Jetter, J., Eimecke, J., Rese, A., 2018. Augmented reality tools for industrial applications: what are potential key performance indicators and who benefits? Comput. Human Behav. 87, 18–33.
Nee, A.Y.C., Ong, S.K., Chryssolouris, G., Mourtzis, D., 2012. Augmented reality applications in design and manufacturing. CIRP Ann. Manuf. Technol. 61 (2), 657–679.
Uva, A.E., Gattullo, M., Manghisi, V.M., Spagnulo, D., Cascella, G.L., Fiorentino, M., 2018. Evaluating the effectiveness of spatial augmented reality in smart manufacturing: a solution for manual working stations. Int. J. Adv. Manuf. Technol. 94 (1), 509–521.
Elia, V., Gnoni, M.G., Lanzilotto, A., 2016. Evaluating the application of augmented reality devices in manufacturing from a process point of view: an AHP based model. Expert Syst. Appl. 63, 187–197.
Bottani, E., Vignali, G., 2019. Augmented reality technology in the manufacturing industry: a review of the last decade. IISE Trans. 51 (3), 284–310.
Palmarini, R., Erkoyuncu, J.A., Roy, R., Torabmostaedi, H., 2018. A systematic review of augmented reality applications in maintenance. Robot. Comput.-Integr. Manuf. 49, 215–228.
Wang, X., Ong, S.K., Nee, A.Y.C., 2016a. A comprehensive survey of augmented reality assembly research. Adv. Manuf. 4 (1), 1–22.
Fite-Georgel, P., 2011. Is there a reality in industrial augmented reality? In: Proceedings of the 10th IEEE International Symposium on Mixed and Augmented Reality (ISMAR), pp. 201–210.
Mizell, D.W., 2020. Virtual reality and augmented reality in aircraft design and manufacturing. Conference proceedings.
Fraga-Lamas, P., Fernández-Caramés, T.M., Blanco-Novoa, Ó., Vilar-Montesinos, M.A., 2018. A review on industrial augmented reality systems for the industry 4.0 shipyard. IEEE Access 6, 13358–13375.
Quandt, M., Knoke, B., Gorldt, C., Freitag, M., Thoben, K.-D., 2018. General requirements for industrial augmented reality applications. Procedia CIRP 72, 1130–1135.
Makris, S., Karagiannis, P., Koukas, S., Matthaiakis, A.-S., 2016. Augmented reality system for operator support in human–robot collaborative assembly. CIRP Ann. Manuf. Technol. 65 (1), 61–64.
Erkoyuncu, J.A., del Amo, I.F., Dalle Mura, M., Roy, R., Dini, G., 2017. Improving efficiency of industrial maintenance with context aware adaptive authoring in augmented reality. CIRP Ann. Manuf. Technol. 66 (1), 465–468.
Zhu, J., Ong, S.K., Nee, A.Y.C., 2013. An authorable context-aware augmented reality system to assist the maintenance technicians. Int. J. Adv. Manuf. Technol. 66 (9), 1699–1714.
Grubert, J., Langlotz, T., Zollmann, S., Regenbrecht, H., 2017. Towards pervasive augmented reality: context-awareness in augmented reality. IEEE Trans. Vis. Comput. Graph. 23 (6), 1706–1724.
Hallaway, D., Feiner, S., Höllerer, T., 2004. Bridging the gaps: hybrid tracking for adaptive mobile augmented reality. Appl. Artif. Intell. 18 (6), 477–500.
Zhu, J., Ong, S.K., Nee, A.Y.C., 2014. A context-aware augmented reality system to assist the maintenance operators. Int. J. Interact. Des. Manuf. 8 (4), 293–304.
Siew, C.Y., Ong, S.K., Nee, A.Y.C., 2019. A practical augmented reality-assisted maintenance system framework for adaptive user support. Robot. Comput.-Integr. Manuf. 59, 115–129.
Oh, S., Byun, Y.-C., 2012. A user-adaptive augmented reality system in mobile computing environment. In: Lee, R. (Ed.), Software and Network Engineering. Springer Berlin Heidelberg, Berlin, Heidelberg, pp. 41–53.
Hervás, R., Bravo, J., Fontecha, J., Villarreal, V., 2013. Achieving adaptive augmented reality through ontological context-awareness applied to AAL scenarios. J. Univers. Comput. Sci. 19, 1334–1349.
Park, K.-B., Kim, M., Choi, S.H., Lee, J.Y., 2020. Deep learning-based smart task assistance in wearable augmented reality. Robot. Comput.-Integr. Manuf. 63, 101887.
Haug, A., 2015. Work instruction quality in industrial management. Int. J. Ind. Ergon. 50, 170–177.
Webel, S., Bockholt, U., Engelke, T., Gavish, N., Olbrich, M., Preusche, C., 2013. An augmented reality training platform for assembly and maintenance skills. Robot. Auton. Syst. 61 (4), 398–403.
Ferrise, F., Caruso, G., Bordegoni, M., 2013. Multimodal training and tele-assistance systems for the maintenance of industrial products. Virtual Phys. Prototyp. 8 (2), 113–126.
Wang, X., Ong, S.K., Nee, A.Y.C., 2016b. Multi-modal augmented-reality assembly guidance based on bare-hand interface. Adv. Eng. Inform. 30 (3), 406–421.
Fiorentino, M., Uva, A.E., Gattullo, M., Debernardis, S., Monno, G., 2014. Augmented reality on large screen for interactive maintenance instructions. Comput. Ind. 65 (2), 270–278.
Wang, X., Ong, S.K., Nee, A.Y.C., 2016c. Real-virtual components interaction for assembly simulation and planning. Robot. Comput.-Integr. Manuf. 41, 102–114.
Mattsson, S., Tarrar, M., Fast-Berglund, Å., 2016. Perceived production complexity – understanding more than parts of a system. Int. J. Prod. Res. 54 (20), 6008–6016.
Syberfeldt, A., Danielsson, O., Holm, M., Wang, L., 2016. Dynamic operator instructions based on augmented reality and rule-based expert systems. Procedia CIRP 41, 346–351.
Radkowski, R., Herrema, J., Oliver, J., 2015. Augmented reality-based manual assembly support with visual features for different degrees of difficulty. Int. J. Hum. Comput. Interact. 31 (5), 337–349.
Bhattacharya, B., Winer, E., 2015. A Method for Real-Time Generation of Augmented Reality Work Instructions Via Expert Movements. SPIE.
Bhattacharya, B., Winer, E.H., 2019. Augmented reality via expert demonstration authoring (AREDA). Comput. Ind. 105, 61–79.
Gattullo, M., Scurati, G.W., Fiorentino, M., Uva, A.E., Ferrise, F., Bordegoni, M., 2019. Towards augmented reality manuals for industry 4.0: a methodology. Robot. Comput.-Integr. Manuf. 56, 276–286.
Dynamics 365 Guides for HoloLens, https://docs.microsoft.com/en-us/dynamics365/mixed-reality/guides/hololens-authoring, 2019.
Cortona3D, http://www.cortona3d.com/en/products/authoring-publishing-solutions/authoring-solutions-augmented-reality, 2019.
ScopeAR, https://www.scopear.com/solutions/work-instructions/, 2019.
Vignali, G., Bertolini, M., Bottani, E., Di Donato, L., Ferraro, A., Longo, F., 2018. Design and testing of an augmented reality solution to enhance operator safety in the food industry. Int. J. Food Eng.
Blanco-Novoa, Ó., Fernández-Caramés, T.M., Fraga-Lamas, P., Vilar-Montesinos, M.A., 2018. A practical evaluation of commercial industrial augmented reality systems in an industry 4.0 shipyard. IEEE Access 6, 8201–8218.
Rajbabu, K., Srinivas, H., Sudha, S., 2018. Industrial information extraction through multi-phase classification using ontology for unstructured documents. Comput. Ind. 100, 137–147.
Lyu, G., Chu, X., Xue, D., 2017. Product modeling from knowledge, distributed computing and lifecycle perspectives: a literature review. Comput. Ind. 84, 1–13.
OWL 2 Web Ontology Language Structural Specification and Functional-Style Syntax, https://www.w3.org/TR/2012/REC-owl2-syntax-20121211/, 2019.
Hassaballah, M., Abdelmgeid, A.A., Alshazly, H.A., 2016. Image features detection, description and matching. In: Awad, A.I., Hassaballah, M. (Eds.), Image Feature Detectors and Descriptors: Foundations and Applications. Springer International Publishing, Cham, pp. 11–45.
Lowe, D.G., 2004. Distinctive image features from scale-invariant keypoints. Int. J. Comput. Vis. 60 (2), 91–110.
Zhang, Z., 1998. Determining the epipolar geometry and its uncertainty: a review. Int. J. Comput. Vis. 27 (2), 161–195.
Golparvar-Fard, M., Peña-Mora, F., Savarese, S., 2009. D4AR – a 4-dimensional augmented reality model for automating construction progress monitoring data collection, processing and communication. J. Inf. Technol. Constr. 14 (13), 129–153.
Shafique, M., Labiche, Y., 2015. A systematic review of state-based test tools. Int. J. Softw. Tools Technol. Transf. 17 (1), 59–76.
Hart, S.G., 2006. NASA-task load index (NASA-TLX); 20 years later. Hum. Factors Ergon. Soc. Annu. Meet. Proc. 50 (9), 904–908.
CAPPFramework, http://www.cappframework.com/index.asp, 2020.
Eitrheim, M.H., Fernandes, A., 2016. The NASA Task Load Index for Rating Workload Acceptability.