Industrial Transfer Learning
Procedia CIRP 107 (2022) 511–516
www.elsevier.com/locate/procedia
* Corresponding author. Tel.: +49 711 685 67294; fax: +49 711 685 67302. E-mail address: [email protected]
Abstract
Despite the high solution potential of machine learning for common problems in automation technology, there are only a few
examples of its application in real-world manufacturing practice. To find the reason for this, the authors
identify the hurdles for conventional machine learning using four exemplary use cases, namely self-learning robots, wear prediction,
visual object detection, and predictive quality in manufacturing. While these use cases differ in principle, the problems engineers
face when using conventional machine learning approaches to solve them are related, such as the lack of diverse training data or
the high dynamics of industrial processes. The authors showcase that utilizing deep transfer learning and continual learning approaches
in the industrial context, subsumed under the term industrial transfer learning, can overcome these hurdles. Even for industrial
transfer learning, the preconditions for a large-scale deployment of such approaches are not yet met, but unlike in
conventional machine learning, it is possible in principle to establish them. The article concludes with a discussion of these
prerequisites and makes suggestions as to how they could be fulfilled.
© 2022 The Authors. Published by Elsevier B.V.
This is an open access article under the CC BY-NC-ND license (https://creativecommons.org/licenses/by-nc-nd/4.0)
Peer-review under responsibility of the International Programme committee of the 55th CIRP Conference on Manufacturing Systems
Keywords: Continual Learning; Industrial Application; Simulation to Reality; Transfer Learning; Use Cases
This is a resupply of March 2023 as the template used in the publication of the original article contained errors. The content of the article has remained unaffected.
512 Benjamin Maschler et al. / Procedia CIRP 107 (2022) 511–516
The basic principle of data-based AI is to use mathematical methods to automatically recognize patterns in data sets. These patterns can then be understood and used as an abstract model, e.g. to simulatively "test" the system behavior under changed environmental conditions. A central prerequisite for the usability of such a model is that its predictions are precise enough. This depends to a large extent on the data used, which must be sufficiently diverse, for example, so that an algorithm can find descriptive and generally valid patterns in them.
Current work is very often based on artificial neural networks that are trained on datasets which are assumed to be representative of the given problem. However, this assumption is often not valid in industrial production:
On the one hand, obtaining a sufficiently large and varied database is problematic (see Fig. 1, left). Example reasons for this phenomenon are listed in Table 1. In addition, the minimum amount of data necessary is usually unknown.
On the other hand, many industrial processes are characterized by high dynamics (see Fig. 1, right). Example reasons for this phenomenon are listed in Table 1. Thus, data sets that have been created once and are sufficiently good at this point in time, together with the algorithms trained with them, lose their validity over time because the process under consideration develops away from them [7–9]. Furthermore, it is often not possible to identify the exact point in time when the process and the data set start to differ significantly.
These challenges hinder the wider, practical use of data-based AI approaches in production automation. The following four examples serve to illustrate these issues in detail.
Fig. 1: Challenges for data-driven AI in automation in manufacturing (left: scarce data; right: high problem dynamics)

The automation of joining processes by robots requires precise guidance of components along a trajectory. Reinforcement learning using self-learning agents offers the potential to learn control strategies for dynamic environments in which, for example, the position of the components varies greatly [10, 11].
The central challenge in the use of such agents is that they have to gain experience in order to build up an understanding of the task and then derive solutions from this. The required experience, in the form of thousands of iterations, is time-consuming and cost-intensive, so that training on real robots is usually not feasible. Furthermore, a reinforcement learning agent must also be able to gather negative experiences, which in the context of robotics means collisions with potential damage to hardware.
In order to save time as well as to avoid material damage and possibly personal injury, simulations offer a cost-effective alternative to the collection of experience. A major challenge here, however, is that simulations are only suitable for training agents to a limited extent due to deviations from real processes (the so-called simulation-to-reality gap).

2.2. Wear Prediction

Timely replacements of worn components can avoid failures and reduce expenses accordingly, but require precise prediction of the optimal replacement time. Data-driven machine learning offers the potential to enable wear predictions for predictive maintenance that are adapted to the respective operating conditions without having to understand and model the physical relationships.
The prerequisite for this is large quantities of diverse, accurate and labeled training data, which should thus also contain component failures for the present operating conditions if possible. However, these conditions are difficult to fulfill in practice: Due to the high costs, failures are often completely avoided and the variety of e.g. plants, manufacturers and operating conditions makes it difficult to find similar scenarios even for standard components with very high quantities.

2.3. Visual Object Detection
amount of data and new training runs, training costs and energy
requirements continually increase over the lifecycle of an AI
algorithm.
Concept | Description
Continual learning | Learning algorithm using knowledge transfer from source problem(s) to target problem to subsequently solve source and target problems better
Federated learning | Learning algorithm using knowledge transfer between decentralized clients and central server to jointly solve different variants of a problem without exchanging the underlying data
Few-shot learning | Collective term for learning algorithms whose goal is to minimize the amount of training data required. Continual and transfer learning methodologies are often used for this purpose.
Industrial transfer learning | Learning algorithm using knowledge transfer from source problem(s) to target problem to solve learning tasks in an industrial context in a more robust, accurate or data-efficient way. Both continual and transfer learning can be used.
Learning algorithm | Mathematical procedure mapped in software that can derive general relationships from data in a targeted manner to solve a specific problem
Problem | Learning task characterized by the format and type of input and output data, as well as the input and output connecting relationships
Source problem | Problem that is already known to the learning algorithm and can be solved by it
Transfer learning | Learning algorithm using knowledge transfer from source problem(s) to target problem to subsequently solve target problem better
Knowledge transfer | Exchange of linked information of high complexity, where the form can be abstract, i.e. not humanly comprehensible
Target problem | Problem that is as yet unknown to the learning algorithm and cannot yet be solved by it
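The definitions of source problem, target problem and transfer learning can be made concrete with a deliberately tiny example. The following sketch is an illustration only and is not taken from any of the cited studies: it assumes a one-dimensional linear model, made-up source and target relationships, and plain gradient descent. The model is first trained on the source problem; its weights are then reused as the starting point for a target problem for which only three samples and five training steps are available, and the result is compared with training from scratch under the same budget.

```python
def mse(data, w, b):
    """Mean squared error of the line y = w*x + b on (x, y) pairs."""
    return sum((w * x + b - y) ** 2 for x, y in data) / len(data)


def fit(data, w, b, steps, lr=0.1):
    """Full-batch gradient descent on the MSE, starting from (w, b)."""
    n = len(data)
    for _ in range(steps):
        grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in data) / n
        w, b = w - lr * grad_w, b - lr * grad_b
    return w, b


# Source problem (hypothetical): ample training on y = 2.0*x + 1.0.
source_data = [(x, 2.0 * x + 1.0) for x in (0.0, 0.5, 1.0, 1.5, 2.0)]
source_w, source_b = fit(source_data, 0.0, 0.0, steps=500)

# Target problem (hypothetical): similar relationship, three samples, five steps.
target_data = [(x, 2.2 * x + 0.8) for x in (0.0, 1.0, 2.0)]
scratch = fit(target_data, 0.0, 0.0, steps=5)                # no knowledge transfer
transferred = fit(target_data, source_w, source_b, steps=5)  # start from source weights

scratch_loss = mse(target_data, *scratch)
transfer_loss = mse(target_data, *transferred)
print(f"from scratch: {scratch_loss:.4f}, with transfer: {transfer_loss:.4f}")
```

In this toy setting the transferred model ends with the lower target error because the source weights already lie close to the target solution. How much is gained in practice depends on how similar the two problems are.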
Fig. 3: Schematic comparison of different learning methods with and without knowledge transfer
Fig. 4: Bridging the simulation-to-reality gap with domain randomization in order to overcome the disadvantages of training on the real process
Fig. 5: Cross-scenario learning for efficient data-based wear prediction on small datasets
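Domain randomization [23], named in Fig. 4, varies the parameters of the simulation between training episodes so that the real process appears to the agent as one more variation. The following is a minimal sketch of that sampling step only. The parameter names and ranges are hypothetical and not taken from the cited studies, and the simulator and the reinforcement learning agent are left out.

```python
import random

# Hypothetical simulation parameters and the ranges they are drawn from.
PARAM_RANGES = {
    "friction": (0.4, 1.0),
    "payload_mass_kg": (0.5, 2.0),
    "component_offset_mm": (-5.0, 5.0),
    "sensor_noise_std": (0.0, 0.02),
}


def sample_sim_params(rng):
    """Draw one randomized simulation configuration."""
    return {name: rng.uniform(lo, hi) for name, (lo, hi) in PARAM_RANGES.items()}


def randomized_episodes(num_episodes, seed=0):
    """Yield (episode index, parameters); each episode gets a fresh configuration."""
    rng = random.Random(seed)
    for episode in range(num_episodes):
        yield episode, sample_sim_params(rng)


episodes = list(randomized_episodes(3))
for episode, params in episodes:
    # A real setup would reset the simulator with `params` and run the agent here.
    print(episode, {name: round(value, 3) for name, value in params.items()})
```

Fixing the seed keeps the sequence of configurations reproducible, which makes training runs comparable.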
accelerate the learning process or to complete it successfully in the first place despite small amounts of data. Preliminary results indicate a broad applicability of the described approach.

3.3. Visual Object Detection

In [25] it was shown that continual learning not only works with data created specifically for this purpose, but also enables further learning with practically relevant industrial data in the area of wear prediction. However, unlike in visual object detection, the structure of the AI model in wear prediction remains the same over its life cycle. For example, the remaining lifetime should always be determined, regardless of how many times the AI model has been further trained. However, the structure of an AI model of visual object detection changes when, for example, a new object is learned. The type of data also differs: wear prediction uses time series data, whereas object detection is usually performed on a single camera image. Regularization methods work independently of the type of data, but not independently of the structure of the AI model.
A promising proposed solution for visual object detection is learning without forgetting (LwF) [26], where the AI model is forced by regularization during further training to find a trade-off between relearning and retaining what it learned earlier. The study conducted with academic test datasets achieved promising results and was also able to report a saving in computation time without quantifying it more precisely (see Fig. 6). A study is currently being conducted to investigate the extent to which regularization methods such as LwF are suitable for making a measurable contribution to saving computing and storage resources and thus costs in an industrial application environment.

3.4. Predictive Quality in Manufacturing

The use of continual learning methods in predictive quality applications makes it possible to use a learning algorithm continuously for emerging process variations [7, 27]. In [28], this is experimentally investigated for the use of artificial neural networks for process design in injection molding. In the application case, the goal is to train a neural network to estimate the deformation of plastic components based on various machine parameters, including pressure and time, and to use it for quality predictions. A central challenge for the model training is that the correlations in the process data change with each newly manufactured component variation.
As a proposed solution, a combination of continual learning (regularization method) and transfer learning (transfer of network structures) is used, with which it is possible to train the neural network over several component variations in a data- and resource-efficient way. On the one hand, the investigations show that the network does not forget its previous knowledge when learning new variations ("learning without forgetting"). On the other hand, the knowledge transfer ensures that the training of the network is increasingly data-efficient over time so that less of the costly process data is required for this. The results obtained confirm the great potential of being able to train neural networks for quality predictions in production sustainably and efficiently across process changes (see Fig. 7).

4. Insights: Putting the Science into Practice

Industrial transfer learning is currently far from being used on a large scale in industrial practice. To ensure that this does not remain the case and that the potential described above can actually be realized, the authors draw the following conclusions from the previously described studies:
In the scientific debate, better comparability of published approaches would be desirable, for example through uniform but realistic benchmark data sets. This would facilitate the transferability of results across the boundaries of individual pilot projects. Thematically, there is also a void in the area of similarity assessment. It is obvious that the more similar the problems under consideration are, the better knowledge transfer works, but which aspects play a role here and how these could be measured and thus made usable has not yet been explored in detail. This also seems to be due to the fact that the research focus is often more oriented towards basic research.
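To illustrate what a measurable notion of similarity between problems might look like, the sketch below computes a simple per-feature shift between a source and a target data set. This is one conceivable proxy chosen for illustration, not a measure proposed or validated in the cited studies. The function name, the data and the choice of statistic (absolute difference of the means, scaled by the pooled standard deviation) are all assumptions.

```python
from statistics import mean, pstdev


def feature_shift_score(source, target):
    """Average standardized mean difference over all features.

    `source` and `target` are lists of equally long feature vectors.
    0.0 means identical feature means; larger values mean a larger shift.
    """
    n_features = len(source[0])
    scores = []
    for j in range(n_features):
        s = [row[j] for row in source]
        t = [row[j] for row in target]
        pooled = pstdev(s + t)  # spread of both data sets taken together
        scores.append(0.0 if pooled == 0 else abs(mean(s) - mean(t)) / pooled)
    return mean(scores)


# Made-up process data: two features per sample.
source = [[1.0, 10.0], [2.0, 12.0], [3.0, 11.0]]
similar_target = [[1.1, 10.5], [2.1, 11.5], [2.9, 11.2]]
distant_target = [[6.0, 30.0], [7.0, 31.0], [8.0, 29.0]]

print(feature_shift_score(source, similar_target))
print(feature_shift_score(source, distant_target))
```

Such a score only compares input statistics. Whether it predicts how well knowledge transfer works for a given pair of problems is exactly the open question raised above.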
Fig. 6: Continual learning enables efficient continued training utilizing previously learnt knowledge (bottom) instead of training again with all data (top)
Fig. 7: Continual learning of an artificial neural network for quality estimation in injection molding
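The trade-off that learning without forgetting (LwF) [26] enforces in Section 3.3 can be sketched as a loss with two terms: a cross-entropy term for the newly learned classes and a distillation term that keeps the outputs for the old classes close to those recorded from the model before further training. The following is a simplified, framework-free sketch in the spirit of LwF for a single sample. The function names, the weighting `lam`, the temperature value and the example logits are assumptions for illustration, not the exact setup of the cited study.

```python
import math


def softmax(logits, temperature=1.0):
    """Softmax with temperature; higher temperatures give softer distributions."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(target_probs, predicted_probs):
    """Cross-entropy between a target and a predicted distribution."""
    return -sum(t * math.log(p) for t, p in zip(target_probs, predicted_probs) if t > 0)


def lwf_loss(new_logits, new_label, old_logits_current, old_logits_recorded,
             lam=1.0, temperature=2.0):
    """New-task cross-entropy plus a weighted distillation term on old-task outputs."""
    one_hot = [1.0 if i == new_label else 0.0 for i in range(len(new_logits))]
    new_term = cross_entropy(one_hot, softmax(new_logits))
    distill_term = cross_entropy(
        softmax(old_logits_recorded, temperature),  # what the old model answered
        softmax(old_logits_current, temperature),   # what the updated model answers
    )
    return new_term + lam * distill_term


recorded = [2.0, 0.5, -1.0]  # old-class outputs stored before further training
retained = lwf_loss([0.2, 1.5], 1, [2.0, 0.5, -1.0], recorded)
forgotten = lwf_loss([0.2, 1.5], 1, [-1.0, 0.5, 2.0], recorded)
print(f"old outputs retained: {retained:.4f}, old outputs drifted: {forgotten:.4f}")
```

Because the new-task term is identical in both calls, the difference between the two values comes entirely from the distillation term, which penalizes the drift of the old-class outputs. The weight `lam` sets how strongly retaining old knowledge is favored over learning the new classes.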
On the part of the future users in industry, tasks also have to be completed that can be summarized as automation of the industrial transfer learning workflow. Here, a central aspect is the automatic recognition of a process change that necessitates renewed training, e.g., as a result of a product change, plant wear, or changing environmental conditions. This training must then be able to run fully automatically, similar to the AutoML approach, and also include hyperparameter optimization and variation of synthetic training data without human intervention.
Finally, frameworks and toolboxes are needed that can be embedded into existing enterprise infrastructures and provide comprehensive user support across tasks ranging from integrating new data sources through model management to the definition of new problems to be solved by the learning algorithm. Further synergies could be realized by linking with the concepts of intelligent digital twin [21] or intelligent reconfiguration management [22].

5. Conclusion

The application spectrum in which artificial intelligence is used in industry could be much larger than it is today. This is due to, among other things, a lack of solutions for robust handling of small data volumes and high problem dynamics. The scenarios described here show possible solutions with the help of industrial transfer learning.
One advantage is the interlocking of simulation and reality through sim2real transfer learning, which demonstrably increases the robustness of AI agents and thus reduces costs for experiments in reality. The payoff here is that simulation has long been used as a central method in development and decision support in operations.
It should be noted that the use of industrial transfer learning must be accompanied by technical and organizational changes. For example, appropriate infrastructures in the form of decentralized storage and computing capacities must be created and new protocols for data and model exchange must be developed in order to implement the proposed solutions successfully.

References

[1] Javaid M, Haleem A, Singh R, Suman R. Artificial Intelligence Applications for Industry 4.0: A Literature-Based Study. J. Ind. Intg. Mgmt. 2021; 1–29.
[2] Kotsiopoulos T, Sarigiannidis P, Ioannidis D, Tzovaras D. Machine Learning and Deep Learning in smart manufacturing: The Smart Grid paradigm. Computer Science Review 2021; 100341.
[3] Lindemann B, Maschler B, Sahlab N, Weyrich M. A Survey on Anomaly Detection for Technical Systems using LSTM Networks. Computers in Industry 2021; 131:103498.
[4] Bernard T, Kühnert C, Campbell E. Web-based Machine Learning Platform for Condition-Monitoring. In Technologien für die intelligente Automation: Machine Learning for Cyber Physical Systems. Berlin, Heidelberg: Springer; 2019; 36–45.
[5] Krauß J, Frye M, Beck G, Schmitt R. Selection and Application of Machine Learning-Algorithms in Production Quality. In Technologien für die intelligente Automation: Machine Learning for Cyber Physical Systems. Berlin, Heidelberg: Springer; 2019; 46–57.
[6] Lindemann B et al. A survey on long short-term memory networks for time series prediction. Procedia CIRP 2021; 650–5.
[7] Tercan H, Guajardo A, Meisen T. Industrial Transfer Learning: Boosting Machine Learning in Production. 17th IEEE International Conference on Industrial Informatics (INDIN), Helsinki, Finland; 2019; 274–9.
[8] Maschler B, Tatiyosyan S, Weyrich M. Regularization-based Continual Learning for Fault Prediction in Lithium-Ion Batteries. 15th CIRP Conference on Intelligent Computation in Manufacturing Engineering (ICME), Gulf of Naples, Italy; 2021;
[9] Maschler B, Pham T, Weyrich M. Regularization-based Continual Learning for Anomaly Detection in Discrete Manufacturing. Procedia CIRP 2021; 452–457.
[10] Meyes R et al. Motion Planning for Industrial Robots using Reinforcement Learning. Procedia CIRP 2017; 107–12.
[11] Scheiderer C, Mosbach M, Posada-Moreno A, Meisen T. Transfer of Hierarchical Reinforcement Learning Structures for Robotic Manipulation Tasks. International Conference on Computational Science and Computational Intelligence (CSCI), Las Vegas, USA; 2020; 504–9.
[12] Oščádal P et al. Smart Building Surveillance System as Shared Sensory System for Localization of AGVs. Applied Sciences 2020; 23:8452.
[13] Mandal S et al. Lyft 3D object detection for autonomous vehicles. In Artificial Intelligence for Future Generation Robotics. Elsevier; 2021; 119–36.
[14] Ding G, Lu H, Bai J, Qin X. Development of a High Precision UWB/Vision-based AGV and Control System. 5th International Conference on Control and Robotics Engineering (ICCRE), Osaka, Japan; 2020; 99–103.
[15] Vietz H et al. A Methodology to Identify Cognition Gaps in Visual Recognition Applications Based on Convolutional Neural Networks. IEEE 17th International Conference on Automation Science and Engineering (CASE) 2021;
[16] French R. Catastrophic forgetting in connectionist networks. Trends in Cognitive Sciences 1999; 4:128–35.
[17] Pan S, Yang Q. A Survey on Transfer Learning. IEEE Trans. Knowl. Data Eng. 2010; 10:1345–59.
[18] Maschler B, Kamm S, Weyrich M. Deep Industrial Transfer Learning at Runtime for Image Recognition. at - Automatisierungstechnik 2021; 3:211–220.
[19] Maschler B, Weyrich M. Deep Transfer Learning for Industrial Automation: A Review and Discussion of New Techniques for Data-Driven Machine Learning. IEEE Industrial Electronics Magazine 2021; 65–75.
[20] Maschler B, Knodel T, Weyrich M. Towards Deep Industrial Transfer Learning for Anomaly Detection on Time Series Data. 26th IEEE International Conference on Emerging Technologies and Factory Automation (ETFA) 2021; 1–8.
[21] Maschler B, Braun D, Jazdi N, Weyrich M. Transfer learning as an enabler of the intelligent digital twin. Procedia CIRP 2021; 127–32.
[22] Maschler B, Müller T, Löcklin A, Weyrich M. Transfer Learning as an Enhancement for Reconfiguration Management of Cyber-Physical Production Systems. 15th CIRP Conference on Intelligent Computation in Manufacturing Engineering (ICME), Gulf of Naples, Italy; 2021;
[23] Joshua Tobin et al. Domain randomization for transferring deep neural networks from simulation to the real world. 2017 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) 2017; 23–30.
[24] Scheiderer C, Dorndorf N, Meisen T. Effects of Domain Randomization on Simulation-to-Reality Transfer of Reinforcement Learning Policies for Industrial Robots. In Advances in Artificial Intelligence and Applied Cognitive Computing. Springer, Cham; 2021; 157–69.
[25] Maschler B, Vietz H, Jazdi N, Weyrich M. Continual Learning of Fault Prediction for Turbofan Engines using Deep Learning with Elastic Weight Consolidation. 25th IEEE International Conference on Emerging Technologies and Factory Automation (ETFA) 2020; 1–8.
[26] Li Z, Hoiem D. Learning without Forgetting. IEEE Trans. Pattern Anal. Mach. Intell. 2018; 12:2935–47.
[27] Tercan H et al. Transfer-Learning: Bridging the Gap between Real and Simulation Data for Machine Learning in Injection Molding. Procedia CIRP 2018; 185–90.
[28] Tercan H, Deibert P, Meisen T. Continual learning of neural networks for quality prediction in production using memory aware synapses and weight transfer. J Intell Manuf 2021; 1–10.