Deviation Detection in Production Processes Based On Video Data Using Unsupervised Machine Learning Approaches
ScienceDirect
Procedia CIRP 112 (2022) 162–167
www.elsevier.com/locate/procedia
15th CIRP Conference on Intelligent Computation in Manufacturing Engineering, Gulf of Naples, Italy
Abstract
The detection of deviations within production processes is essential to ensure high productivity or avoid potential damage. Various approaches
are available for this purpose. In the field of video surveillance, unsupervised machine learning methods have made significant progress in
detecting deviations.
In this paper, the transferability of these generic approaches to production processes is investigated. First, an evaluation basis is created:
the variety of deviations that can occur in an automated production process is structured and covered as far as possible in video
benchmark data sets. Subsequently, existing unsupervised approaches are selected, adapted and tested on the created data sets. The
results show that the two chosen unsupervised autoencoder architectures can be partially used for generic deviation detection in the production
domain. The main challenges identified are the large variety of tasks and deviations in production processes. For further
investigations, the development of even more detailed benchmark sets is essential.
© 2022 The Authors. Published by Elsevier B.V.
This is an open access article under the CC BY-NC-ND license (https://creativecommons.org/licenses/by-nc-nd/4.0)
Peer-review under responsibility of the scientific committee of the 15th CIRP Conference on Intelligent Computation in Manufacturing Engineering,
14-16 July, Gulf of Naples, Italy
Keywords: Production process; deviation detection; anomaly detection; unsupervised learning; autoencoder; computer vision; video data
This is a resupply of March 2023 as the template used in the publication of the original article contained errors. The content of the article has remained unaffected.
Matthias Mühlbauer et al. / Procedia CIRP 112 (2022) 162–167 163
1.2. Objectives and structure of the paper
Within this paper, the potential of autoencoder architectures for the detection of deviations based on video data during technical processes is investigated. Autoencoders belong to the unsupervised learning methods and enable the detection of deviations based solely on the previously learned target state or sequence of the process. In the case of video surveillance of public places, the flexible detection of deviations by autoencoders has already been proven [5]. The objective is to apply these concepts to the production area. Furthermore, the use of video data makes technical intervention in the process for data collection unnecessary.
In the following, an overview of the state of research is given. Based on this, the conception and the recording of several reference data sets are described. In the next step, two chosen autoencoder architectures are slightly adapted and applied to the recorded data sets. Finally, the obtained results are discussed.
2. State of research
2.1. Monitoring of production processes
To maintain or increase process reliability and machine availability, automated monitoring and diagnostic procedures are increasingly used [6]. Central tasks of these systems are the recording of the actual condition and its comparison with a specified target condition. Deviations detected by this status comparison can subsequently serve as input for root cause analysis. A large number of options are available for the technical and organizational implementation of a monitoring task. For example, time-oriented variables (e.g. frequency and duration of monitoring), the degree of automation, the measured variable or the sensors used must be taken into account [6].
2.2. Deviation and anomalies
If technical processes and the deviations occurring in them are recorded with sensors, deviations can appear as anomalies in the data. The definitions of anomalies or outliers in data are diverse. Hawkins defines an anomaly as "an observation which deviates so significantly from other observations as to arouse suspicion that it was generated by a different mechanism" [7]. Patterns that do not correspond to the expected behaviour can thus be understood as anomalies. Further, it is common to distinguish between point, contextual and collective anomalies [8].
In the production area, a variety of deviations can occur, which are recognizable in different ways in video recordings. In particular, it is possible to differentiate between temporally and/or visually pronounced deviations. For example, incorrectly set process parameters can be reflected in a temporal deviation (e.g. too high speed) or in a spatial deviation (e.g. wrong target coordinates).
2.3. Anomaly detection with autoencoder
Various approaches are available to detect anomalies. Autoencoders are a promising method from the area of unsupervised learning. Unsupervised learning methods have the advantage that they can be applied to unlabeled data [9]. Thus, deviations do not have to be known in advance. Autoencoders represent a special arrangement of the layers of an artificial neural network and can be used for a variety of purposes [10].
The main components of an autoencoder, as shown in Figure 1, are the encoder, the bottleneck and the decoder.
Figure 1: Autoencoder architecture.
The encoder consists of layers with a decreasing number of neurons in the direction of the data flow. In the center, the layer with the lowest number of neurons represents the bottleneck. The following decoder is an inversion of the encoder, so that the output data has the same shape as the input data. The layers used can have different architectures depending on the purpose of the autoencoder. Convolutional layers are a standard for processing image data, whereas recurrent structures are often used for modelling sequential data [10].
Usually, an autoencoder is trained to reproduce the input as well as possible, despite the dimensionality reduction that takes place at the bottleneck. In the field of anomaly detection, autoencoders are used to learn the characteristics of a normal data set. If an autoencoder receives untypical, i.e. abnormal, data, it is less capable of reproducing them. In reconstruction-based approaches, an anomaly can thus be identified by a mismatch between input and output data. This mismatch is quantified by the reconstruction error.
2.4. Evaluation metrics
The classification of an anomaly based on the calculated reconstruction error can be done by a threshold value, see Figure 2. Depending on the definition of the threshold value, normal data points may be recognized as abnormal or abnormal data points may be overlooked. The following methods and metrics are used to evaluate the performance of different approaches under these circumstances.
Figure 2: Deviation detection based on the reconstruction error.
The True Positive Rate (TPR) describes the percentage of instances correctly recognized as abnormal, while the False Positive Rate (FPR) describes the percentage of instances incorrectly recognized as abnormal. They are defined based on the absolute number of normal and abnormal events and the absolute numbers of correctly and incorrectly detected events.
The Receiver Operating Characteristics (ROC) curve is a metric that can be used to represent the performance of a system in relation to its sensitivity using the values described. For this purpose, the TPR is plotted against the FPR for different threshold values. The Area under the Curve (AUC) provides information about the performance of the entire system. The Equal Error Rate (EER) describes the total error
rate for the threshold value at which FPR and False Negative Rate (FNR) have the same value.
These metrics can be used to evaluate the performance of an anomaly detection approach within a benchmark data set. However, the optimal threshold depends on the process being monitored and the costs or risks associated with a false alarm versus those associated with an undetected error.
2.5. Related work
The detection of deviations by optical data in the production area can be performed by conventional or machine learning methods. For example, object counting can be based on conventional methods, such as programmed edge detection, or machine learning methods, such as a trained object detector. However, the majority of the current publications are based on supervised methods and single images. Examples are Scime and Beuth [11], who predict defects on 3D-printed components, and the detection of remaining chips on piston rods [12].
Colosimo and Grasso [13], on the other hand, analyze image sequences of 20 to 70 images, which represent recordings of a melting process. After a dimensionality reduction with different principal component methods, the data is grouped in an unsupervised manner by cluster algorithms. However, further interpretation requires process knowledge.
A first unsupervised approach based on artificial neural networks has been published by Tan et al. [14]. The aim is to detect anomalies in image sequences from infrared recordings of a laser sintering process. A convolutional autoencoder architecture is used, which detects anomalies by the reconstruction error. However, the tested anomalies are limited to a varying deviation of the laser power from the target value [14]. This approach is promising, and further research is needed to find a flexible and process-independent approach and to consider a larger variety of anomalies.
In recent years, the field of video surveillance of public places has increasingly established itself as a testing ground for unsupervised anomaly detection in image sequences. The high public availability of prepared benchmark data sets, which allow a comparison of different approaches, contributes to this [15]. Another reason is the special suitability of the surveillance area for this application. Anomalies in the video recordings can be very different and unpredictable. Therefore, unsupervised approaches are of high importance to detect these unknown deviations. Most of the recent publications in this area utilize Deep Learning with convolutional layers for feature extraction and can achieve convincing results [5, 15–19].
2.6. Research gap
Due to the lack of benchmark data sets for the production area, there is a need for the design, recording and preparation of a suitable data set to provide a basis for comparison of different anomaly detection approaches. In particular, the variety of possible anomalies must be considered. Subsequently, it must be investigated to what extent the already successful approaches of autoencoder architectures in the video surveillance area can be transferred to the production area. The main objectives are to examine which deviations can be detected with which autoencoder architectures and where the restrictions of the existing methods begin.
3. Establishing a data basis
Benchmark data sets are one of the cornerstones of the development of anomaly detection methods, as they enable testing and comparison. For industrial manufacturing environments, no suitable comparative video data is currently publicly available. There are several reasons for this. One is the high amount of effort and time required to create a comprehensive data set, which would interfere with production and operative processes. Additionally, even if such data sets are created, they will be part of the producing companies’ intellectual property and not easily shared publicly. Therefore, the creation of such benchmark data is the first step.
3.1. Defining criteria for video benchmark data
Typically, the demands on an anomaly detection approach can be divided into two main groups. On the one hand, the observed process has a large impact on the detectability of anomalies, as highly repetitive processes can already be monitored by template approaches, whereas more irregular processes are much more difficult to characterize. On the other hand, the occurring anomalies influence their detectability either by their scale relative to the field of view (FOV) or by their type (point, contextual, collective). The latter determines whether a single image is sufficient for detection, or whether the anomaly can only be recognized by observing the temporal context of a video sequence. In the following, this concept will be clarified using some examples:
Table 1: Comparison of exemplary anomalies in manufacturing processes.
Anomaly             Process (temporal / spatial)   Detectability
Foreign object      steady* / repetitive**         single frame, template approaches
Machine standstill  steady / repetitive            multiple frames, template approaches
Foreign object      unsteady / non-repetitive      single frame, generalizing approaches
Machine standstill  unsteady / non-repetitive      multiple frames, generalizing approaches
* Temporal uniformity, e.g., process has constant speed and no interruptions.
** Spatial uniformity, e.g., filmed objects appear in same pattern or orientation.
3.2. Experimental setup
Due to the high number of possible combinations of these criteria, especially regarding the monitored processes, three different data sets are created. They represent two different processes, which differ greatly in terms of their uniformity and predictability. The two processes are:
A cyclically repeating process, simulated by the head of a fused deposition modelling 3D printer, which repeatedly runs through a given sequence of coordinates. Anomalies include positional deviations, deviations in movement speed, movement stops, wrong coordinate sequences or foreign objects. To implement those anomalies, two different data sets are created in which different coordinate sequences are monitored. Figure 3 shows the setup for one of those sequences.
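Before the approaches are applied to such data sets, the reconstruction-based detection principle from Sections 2.3 and 2.4 can be illustrated with a minimal Python sketch. It assumes that a trained model already supplies a reconstruction for every frame. The function names `reconstruction_error` and `flag_anomalies`, the use of the mean squared error, the toy frames and the threshold value are illustrative assumptions and are not taken from the evaluated architectures.

```python
from typing import List, Sequence

Frame = Sequence[float]  # a flattened grayscale frame (assumption for illustration)


def reconstruction_error(frame: Frame, reconstruction: Frame) -> float:
    """Mean squared error between an input frame and its reconstruction."""
    if len(frame) != len(reconstruction):
        raise ValueError("frame and reconstruction must have the same shape")
    return sum((x - y) ** 2 for x, y in zip(frame, reconstruction)) / len(frame)


def flag_anomalies(errors: Sequence[float], threshold: float) -> List[bool]:
    """Mark every frame whose reconstruction error exceeds the threshold."""
    return [e > threshold for e in errors]


# Toy data: the hypothetical model reproduces the first two frames well
# but fails on the third one, e.g. because a foreign object is visible.
frames = [[0.0, 0.5, 1.0], [0.1, 0.5, 0.9], [1.0, 1.0, 1.0]]
reconstructions = [[0.0, 0.5, 1.0], [0.0, 0.5, 1.0], [0.0, 0.5, 1.0]]

errors = [reconstruction_error(f, r) for f, r in zip(frames, reconstructions)]
flags = flag_anomalies(errors, threshold=0.1)  # threshold chosen arbitrarily
```

In this toy example only the third frame is flagged. Raising or lowering the threshold trades false alarms against overlooked deviations, which is the trade-off the evaluation metrics of Section 2.4 are meant to capture.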
complete application workflow is shown schematically in Figure 6.
distinguished. Part jams are the only anomaly type that can reliably be detected, since they represent a huge deviation from the normal behaviour. The only difference between both approaches is the ability of the second approach to detect the absence of parts in the second benchmark, which has a huge impact on the evaluation metrics listed in Table 4, due to the long duration of that anomaly type.
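How such evaluation metrics follow from per-frame reconstruction errors can be sketched in plain Python. The sketch below is an assumption about one possible implementation, not the evaluation code used for Table 4: the function names, the toy scores and the labels are hypothetical, the AUC is obtained with the trapezoidal rule, and the EER is approximated at the discrete threshold where FPR and FNR are closest.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]  # (FPR, TPR)


def roc_points(scores: Sequence[float], labels: Sequence[int]) -> List[Point]:
    """Return (FPR, TPR) pairs for every distinct threshold; label 1 = abnormal."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("both normal and abnormal frames are required")
    points = [(0.0, 0.0)]  # threshold above every score: nothing is flagged
    for thr in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= thr and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= thr and y == 0)
        points.append((fp / n_neg, tp / n_pos))
    return points


def auc(points: Sequence[Point]) -> float:
    """Area under the ROC curve, computed with the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


def equal_error_rate(points: Sequence[Point]) -> float:
    """Approximate EER: mean of FPR and FNR where the two are closest."""
    fpr, tpr = min(points, key=lambda p: abs(p[0] - (1.0 - p[1])))
    return (fpr + (1.0 - tpr)) / 2


# Hypothetical reconstruction errors and ground truth (1 = abnormal frame).
scores = [0.1, 0.2, 0.3, 0.8, 0.9, 0.25]
labels = [0, 0, 0, 1, 1, 1]

points = roc_points(scores, labels)
area = auc(points)
eer = equal_error_rate(points)
```

For these toy values the sketch yields an AUC of 8/9 and an EER of 1/3. Because every frame counts equally, an anomaly type that spans many frames dominates both figures, which is consistent with the strong influence of long-lasting anomalies noted above.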
Table 4: Overview of the results.