
Available online at www.sciencedirect.com

ScienceDirect
Procedia CIRP 112 (2022) 162–167
www.elsevier.com/locate/procedia
doi: 10.1016/j.procir.2022.09.066

15th CIRP Conference on Intelligent Computation in Manufacturing Engineering, Gulf of Naples, Italy

Deviation Detection in Production Processes based on Video Data using Unsupervised Machine Learning Approaches

Matthias Mühlbauer*, Henrik Epp, Hubert Würschinger, Nico Hanenkamp

University Erlangen-Nuremberg, Institute of Resource and Energy Efficient Production Systems, Dr.-Mack-Str. 81, 90762 Fürth, Germany

* Corresponding author. Tel.: +49-911-65078-64818; fax: +49-911-65078-64813. E-mail address: [email protected]

Abstract

The detection of deviations within production processes is essential to ensure high productivity and to avoid potential damage. Various approaches are available for this purpose. In the field of video surveillance, unsupervised machine learning methods have made significant progress in detecting deviations.
In this paper, the transferability of these generic approaches to production processes is investigated. First, an evaluation basis is created: the variety of deviations that can occur in an automated production process is structured and covered as far as possible in video benchmark data sets. Subsequently, existing unsupervised approaches are selected, adapted and tested on the created data sets. In conclusion, the results show that the two chosen unsupervised autoencoder architectures can be partially used for generic deviation detection in the production domain. The main challenges identified are the large variety of different tasks and deviations in production processes. For further investigations, the development of even more detailed benchmark sets is essential.
© 2022 The Authors. Published by Elsevier B.V.
This is an open access article under the CC BY-NC-ND license (https://creativecommons.org/licenses/by-nc-nd/4.0)
Peer-review under responsibility of the scientific committee of the 15th CIRP Conference on Intelligent Computation in Manufacturing Engineering,
14-16 July, Gulf of Naples, Italy

Keywords: Production process; deviation detection; anomaly detection; unsupervised learning; autoencoder; computer vision; video data

1. Introduction

1.1. Motivation

Due to increasing competitive pressure, production companies are driven to ensure high product quality and to make their processes more efficient. In particular, unplanned downtimes must be avoided or kept as short as possible. To detect deviations or malfunctions at an early stage, technical processes are often continuously monitored. For components that undergo a wear process, predictive maintenance is a prominent trend [1].
To gather data for process monitoring or predictive maintenance applications, new sensors often have to be applied if existing sensors do not provide the necessary information. This is accompanied by some obstacles. If, for example, relevant data is recorded by existing sensors in a machine, a major issue is a frequently necessary alteration or integration of software on the programmable logic controller. The integration of new sensors can lead to efforts for software and hardware as well as production downtimes if technical intervention in the production process becomes necessary. The challenges described can be avoided by recording data without technical intervention in the process. Image and airborne sound [2] recordings are particularly suitable in this context.
Further challenges occur in the phase of data analysis to detect deviations or to predict the remaining useful lifetime. For example, for supervised machine learning approaches, a sufficient number of instances must be recorded and labelled for all necessary classes [3]. If failure classes only occur rarely in the considered process, this can be a time-consuming task [4]. Furthermore, only deviations that have already been trained can be detected.


1.2. Objectives and structure of the paper

Within this paper, the potential of autoencoder architectures for the detection of deviations in technical processes based on video data is investigated. Autoencoders belong to the unsupervised learning methods and enable the detection of deviations based solely on the previously learned target state or sequence of the process. In the case of video surveillance of public places, the flexible detection of deviations by autoencoders has already been proven [5]. The objective is to apply these concepts to the production area. Furthermore, the use of video data makes technical intervention in the process for data collection unnecessary.
In the following, an overview of the state of research is given. Based on this, the conception and recording of several reference data sets are described. In the next step, two chosen autoencoder architectures are slightly adapted and applied to the recorded data sets. Finally, the obtained results are discussed.

2. State of research

2.1. Monitoring of production processes

To maintain or increase process reliability and machine availability, automated monitoring and diagnostic procedures are used increasingly [6]. Central tasks of these systems are the recording of the actual condition and the comparison with a specified target condition. Deviations detected by this status comparison can then serve as input for root cause analysis. A large number of options are available for the technical and organizational implementation of a monitoring task. For example, time-oriented variables (e.g. frequency and duration of monitoring), the degree of automation, the measured variable or the sensors used must be taken into account [6].

2.2. Deviations and anomalies

If technical processes and present deviations are recorded with sensors, deviations can appear as anomalies in the data. The definitions for anomalies or outliers in data are diverse. Hawkins defines an anomaly as "an observation which deviates so significantly from other observations as to arouse suspicion that it was generated by a different mechanism" [7]. Patterns which do not correspond to the expected behaviour can thus be understood as anomalies. Further, it is common to distinguish between point, contextual and collective anomalies [8].
In the production area a variety of deviations can occur, which are recognizable differently in video recordings. In particular, it is possible to differentiate between temporally and/or visually pronounced deviations. For example, incorrectly set process parameters can be reflected in a temporal deviation (e.g. too high a speed) or in a spatial deviation (e.g. wrong target coordinates).

2.3. Anomaly detection with autoencoders

Various approaches are available to detect anomalies. A promising method from the area of unsupervised learning are autoencoders. Unsupervised learning methods have the advantage that they can be applied to unlabelled data [9]. Thus, deviations do not have to be known in advance. Autoencoders represent a special arrangement of the layers of an artificial neural network and can be used for a variety of purposes [10]. The main components of an autoencoder, as shown in Figure 1, are the encoder, the bottleneck and the decoder.

Figure 1: Autoencoder architecture.

The encoder consists of layers with a decreasing number of neurons in the direction of the data flow. In the center, the layer with the lowest number of neurons represents the bottleneck. The following decoder is an inversion of the encoder, so that the output data has the same shape as the input data. The layers used can have different architectures depending on the purpose of the autoencoder. Convolutional layers are a standard for processing image data, whereas recurrent structures are often used for modelling sequential data [10].
Usually, an autoencoder is trained to reproduce the input as well as possible, despite the dimensionality reduction that takes place via the bottleneck. In the field of anomaly detection, autoencoders are used to learn the characteristics of a normal data set. If an autoencoder receives untypical, i.e. abnormal, data, it is less capable of reproducing it. In reconstruction-based approaches, an anomaly can thus be identified by a mismatch between input and output data. This mismatch is quantified by the reconstruction error.
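To make the reconstruction-based idea concrete, the following minimal sketch builds a small convolutional autoencoder in Keras, trains it on normal frames only and computes a per-frame reconstruction error. The 64x64 input size, the layer widths and the use of Keras are illustrative assumptions and do not reproduce the architectures used in this work.

```python
# Minimal convolutional autoencoder sketch (illustrative, not the paper's exact models).
# Assumed: grayscale frames resized to 64x64, values scaled to [0, 1].
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers, models

inputs = layers.Input(shape=(64, 64, 1))
# Encoder: decreasing spatial resolution towards the bottleneck
x = layers.Conv2D(16, 3, strides=2, padding="same", activation="relu")(inputs)
x = layers.Conv2D(8, 3, strides=2, padding="same", activation="relu")(x)   # bottleneck: 16x16x8
# Decoder: mirror of the encoder, restoring the input shape
x = layers.Conv2DTranspose(8, 3, strides=2, padding="same", activation="relu")(x)
x = layers.Conv2DTranspose(16, 3, strides=2, padding="same", activation="relu")(x)
outputs = layers.Conv2D(1, 3, padding="same", activation="sigmoid")(x)

autoencoder = models.Model(inputs, outputs)
autoencoder.compile(optimizer="adam", loss="mse")

# Train on normal frames only; random data is used here as a stand-in.
normal_frames = np.random.rand(256, 64, 64, 1).astype("float32")
autoencoder.fit(normal_frames, normal_frames, epochs=1, batch_size=32, verbose=0)

# Reconstruction error per frame: mean squared difference between input and output.
test_frames = np.random.rand(8, 64, 64, 1).astype("float32")
reconstruction = autoencoder.predict(test_frames, verbose=0)
errors = np.mean((test_frames - reconstruction) ** 2, axis=(1, 2, 3))
print(errors)  # high values indicate frames the model cannot reproduce well
```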
2.4. Evaluation metrics

The classification of an anomaly based on the calculated reconstruction error can be done with a threshold value, see Figure 2. Depending on the definition of the threshold value, normal data points may be recognized as abnormal or abnormal data points may be overlooked. The following methods and metrics are used to evaluate the performance of different approaches under these circumstances.

Figure 2: Deviation detection based on the reconstruction error.

The True Positive Rate (TPR) describes the percentage of instances correctly recognized as abnormal, while the False Positive Rate (FPR) describes the percentage of instances incorrectly recognized as abnormal. They are defined based on the absolute numbers of normal and abnormal events and the absolute numbers of correctly and incorrectly detected events. The Receiver Operating Characteristic (ROC) curve is a metric that can be used to represent the performance of a system in relation to its sensitivity using the values described. For this purpose, the TPR is plotted against the FPR for different threshold values. The Area under the Curve (AUC) provides information about the performance of the entire system. The Equal Error Rate (EER) describes the total error rate for the threshold value at which FPR and False Negative Rate (FNR) have the same value.
These metrics can be used to evaluate the performance of an anomaly detection approach within a benchmark data set. However, the optimal threshold depends on the process being monitored and the costs or risks associated with a false alarm versus those associated with an undetected error.
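As a brief illustration of how these metrics can be derived from per-frame reconstruction errors and binary ground-truth labels, the following sketch uses scikit-learn's ROC utilities; the error values and labels are dummy data, not results from the benchmarks presented later.

```python
# Sketch: ROC curve, AUC and EER from reconstruction errors (dummy data, scikit-learn).
import numpy as np
from sklearn.metrics import roc_curve, auc

# 1 = abnormal frame, 0 = normal frame; the score is the reconstruction error.
labels = np.array([0, 0, 0, 1, 0, 1, 1, 0, 1, 0])
errors = np.array([0.10, 0.12, 0.09, 0.35, 0.11, 0.40, 0.20, 0.13, 0.38, 0.10])

# TPR and FPR for all possible threshold values
fpr, tpr, thresholds = roc_curve(labels, errors)
roc_auc = auc(fpr, tpr)

# EER: operating point where FPR equals the False Negative Rate (FNR = 1 - TPR)
fnr = 1.0 - tpr
eer_index = np.argmin(np.abs(fpr - fnr))
eer = (fpr[eer_index] + fnr[eer_index]) / 2.0

print(f"AUC = {roc_auc:.3f}, EER = {eer:.3f}, threshold = {thresholds[eer_index]:.3f}")
```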
2.5. Related work

The detection of deviations from optical data in the production area can be performed by conventional or machine learning methods. For example, object counting can be based on conventional methods, such as programmed edge detection, or on machine learning methods, such as a trained object detector. However, the majority of current publications are based on supervised methods and single images. Examples are Scime and Beuth [11], who predict defects on 3D-printed components, or the detection of remaining chips on piston rods [12].
Colosimo and Grasso [13], on the other hand, analyze image sequences of 20 to 70 images, which represent recordings of a melting process. After a dimensionality reduction with different principal component methods, the data is grouped unsupervised by cluster algorithms. However, further interpretation requires process knowledge.
A first unsupervised approach based on artificial neural networks has been published by Tan et al. [14]. The aim is to detect anomalies in image sequences from infrared recordings of a laser sintering process. A convolutional autoencoder architecture is used, which detects anomalies by the reconstruction error. However, the tested anomalies are limited to a varying deviation of the laser power from the target value [14]. This approach is promising, and further research is needed to find a flexible and process-independent approach and to consider a larger variety of anomalies.
In recent years, the field of video surveillance of public places has increasingly established itself as a testing ground for unsupervised anomaly detection in image sequences. The high public availability of prepared benchmark data sets, which allow a comparison of different approaches, contributes to this [15]. Another reason is the special suitability of the surveillance area for this application. Anomalies in the video recordings can be very different and unpredictable. Therefore, unsupervised approaches are of high importance to detect these unknown deviations. Most of the recent publications in this area utilize deep learning with convolutional layers for feature extraction and achieve convincing results [5, 15–19].

2.6. Research gap

Due to the lack of benchmark data sets for the production area, there is a need for the design, recording and preparation of a suitable data set to provide a basis for comparing different anomaly detection approaches. Especially the variety of possible anomalies must be considered. Subsequently, it must be investigated to what extent the autoencoder architectures that are already successful in the video surveillance area can be transferred to the production area. The main objectives are to examine which deviations can be detected with which autoencoder architectures and where the restrictions of the existing methods begin.

3. Establishing a data basis

One of the cornerstones of the development of anomaly detection methods are benchmark data sets, as these enable testing and comparison. Regarding industrial manufacturing environments, currently no suitable comparative video data is publicly available. A multitude of reasons exists for this. One is the high amount of effort and time required to create a comprehensive data set, which would interfere with production and operative processes. Additionally, even if such data sets are created, they will be part of the producing companies' intellectual property and not easily shared publicly. Therefore, the creation of such benchmark data is the first step.

3.1. Defining criteria for video benchmark data

Typically, the demands on an anomaly detection approach can be divided into two main groups. On the one hand, the observed process has a large impact on the detectability of anomalies, as highly repetitive processes can already be monitored by template approaches, whereas more irregular processes are much more difficult to characterize. On the other hand, the occurring anomalies influence their detectability either by their scale regarding the field of view (FOV) or by their type (point, contextual, collective). The latter determines whether a single image is sufficient for detection, or if the anomaly can only be recognized by observing the temporal context of a video sequence. In the following, this concept is clarified using some examples:

Table 1: Comparison of exemplary anomalies in manufacturing processes.

Anomaly              Process (temporal / spatial)   Detectability
Foreign object       steady* / repetitive**         single frame, template approaches
Machine standstill   steady / repetitive            multiple frames, template approaches
Foreign object       unsteady / non-repetitive      single frame, generalizing approaches
Machine standstill   unsteady / non-repetitive      multiple frames, generalizing approaches

* Temporal uniformity, e.g., the process has constant speed and no interruptions.
** Spatial uniformity, e.g., the filmed objects appear in the same pattern or orientation.

3.2. Experimental setup

Due to the high number of possible combinations of these criteria, especially regarding the monitored processes, three different data sets are created. They represent two different processes, which differ greatly in terms of their uniformity and predictability.
The first process is cyclically repeating and is simulated by the head of a fused deposition modelling 3D printer, which repeatedly runs through a given sequence of coordinates. Anomalies include positional deviations, deviations in movement speed, movement stops, wrong coordinate sequences or foreign objects. To implement those anomalies, two different data sets are created in which different coordinate sequences are monitored. Figure 3 shows the setup for one of those sequences.

Figure 3: Experimental setup benchmark 1.

The second process simulates a disordered and irregular supply of screws. Here, disordered groups of 3-10 screws are filmed, which slide down a metal chute at irregular intervals. The tested anomalies are mainly foreign bodies of different scale or atypical behaviour of the monitored process (e.g. jams, atypical screw amounts). The setup is shown in Figure 4.

Figure 4: Experimental setup benchmark 2.

3.3. Data collection and annotation

This procedure is identical for both data sets. First, the process was filmed in its normal condition without anomalies. Afterwards, the same process was filmed with anomalies implemented. Between the occurrences of the different anomalies, the process always returns to its normal state first. Each anomaly type / scale was implemented on at least two occasions in different phases of the recording. Afterwards, the recorded videos were split into their respective grayscale frames, cut and scaled to their respective region of interest (ROI). The frames containing abnormal behaviour were annotated. Overall, the created benchmarks can be classified by the attributes shown in the following table. The data sets are available from the corresponding author upon request.

Table 2: Overview of the data sets.

                        Benchmark 1, 3D printer      Benchmark 2
                        Data set 1    Data set 2     Screw feed
Frame rate              30            30             30
ROI resolution          250x283       210x283        640x92
Num. frames train       9,007         8,606          92,628
Num. frames test        23,922        9,792          15,423
Num. abnormal events    47            15             30
Num. abnormal frames    6,692         2,519          2,465
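A minimal sketch of the described preparation step is given below: a recording is split into grayscale frames and cropped to a fixed region of interest with OpenCV. The file name and the ROI coordinates are placeholders, not the values used for the benchmarks.

```python
# Sketch: split a video into grayscale frames cropped to a region of interest (OpenCV).
# "process.mp4" and the ROI coordinates are placeholders.
import cv2
import numpy as np

def extract_roi_frames(video_path, x, y, width, height):
    """Return all frames of the video as grayscale ROI crops."""
    capture = cv2.VideoCapture(video_path)
    frames = []
    while True:
        ok, frame = capture.read()
        if not ok:  # end of video or read error
            break
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        frames.append(gray[y:y + height, x:x + width])
    capture.release()
    return np.stack(frames) if frames else np.empty((0, height, width), dtype=np.uint8)

frames = extract_roi_frames("process.mp4", x=100, y=50, width=250, height=283)
print(frames.shape)  # (num_frames, 283, 250)
```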
4. Application of ML methods for anomaly detection

4.1. Choosing the method

The methods tested are based on autoencoder architectures, the type of neural network described above, which have shown promising results in unsupervised video surveillance. Therefore, the chosen and tested architectures are based on the works of Chong and Tay [17] and Fan et al. [5], as those showed the best performance of autoencoder-based approaches in video surveillance [5]. The original architectures of both approaches can be looked up in the respective publications [5, 17]. Additionally, some variations have been made regarding the data processing.

4.2. Configuration and adaptions

The first autoencoder is based on Chong and Tay [17] and will be referred to as approach 1 in the following. It consists of convolutional elements to process spatial information as well as recurrent elements to process temporal information. It takes a sequence of a predefined number of individual images as input and returns a reconstruction of that sequence [17]. The length of the sequences as well as their temporal resolution can be optimized for a given scenario. As a metric for the irregularity of a given input sequence, the Euclidean distance between the original input data and the reconstructed data is calculated, representing the reconstruction error.
The second approach is a simplified version of the variational autoencoder published by Fan et al. [5], which also uses convolutional layers to process spatial information. Furthermore, this architecture uses small image patches of the video frames as input data, which are generated using a sliding window approach [20], trying to achieve a better capturing of details. The procedure of generating those patches is shown in Figure 5.

Figure 5: Second approach: Generation of image patches.

As those input patches are based on single frames and no recurrent elements are used within the autoencoder, the recognition of temporal context is not possible. For this reason, Fan et al. [5] train a second model of the same architecture, which uses so-called dynamic flow image patches as input data [5]. These are the result of a temporal pooling operation, based on optical flow, on a sequence of original video frames and allow the recognition of movement and temporal context [20, 21]. In addition, a type of temporal pooling which uses the colour channels of an RGB frame to stack the input grayscale images in temporal order was tested. As the criterion for abnormal events, the reconstruction error is used for both data streams.
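The following sketch illustrates the two data-preparation ideas described above: cutting a frame into patches with a sliding window and stacking three consecutive grayscale frames into the channels of one RGB-like array so that temporal order is encoded in the channel dimension. The patch size, the stride and the stack depth of three frames are arbitrary assumptions and not the settings of the adapted approaches.

```python
# Sketch: sliding-window patch extraction and channel-wise temporal stacking.
# Patch size, stride and the three-frame stack depth are illustrative assumptions.
import numpy as np

def sliding_window_patches(frame, patch_size=32, stride=16):
    """Cut a 2D grayscale frame into overlapping square patches."""
    patches = []
    height, width = frame.shape
    for top in range(0, height - patch_size + 1, stride):
        for left in range(0, width - patch_size + 1, stride):
            patches.append(frame[top:top + patch_size, left:left + patch_size])
    return np.stack(patches)

def stack_temporal_channels(frame_t0, frame_t1, frame_t2):
    """Stack three consecutive grayscale frames into one 3-channel array,
    so that a single 'image' carries temporal order in its colour channels."""
    return np.stack([frame_t0, frame_t1, frame_t2], axis=-1)

frame = np.random.rand(92, 640)                          # e.g. one grayscale ROI frame
patches = sliding_window_patches(frame)                  # shape: (num_patches, 32, 32)
stacked = stack_temporal_channels(frame, frame, frame)   # shape: (92, 640, 3)
print(patches.shape, stacked.shape)
```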

4.3. Training and application

First, the models are trained on the normal data of the given data set. Afterwards, the trained models are tested on the abnormal data set, computing a reconstruction error for each input. Thus, the reconstruction error can be plotted over the time course of the test data set. The reconstruction errors for the two models of the second approach are fused to create a single plot, which is used for anomaly evaluation. Using a threshold, the sensitivity of the system is set to an optimized value depending on the circumstances. Depending on the supervised process and the resulting data, the plots can also be normalized, smoothed, and processed further using standard time series processing methods, e.g. seasonal decomposition. The complete application workflow is shown schematically in Figure 6.

Figure 6: Complete workflow.
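A sketch of the kind of post-processing mentioned above is given below: the reconstruction-error series is smoothed, a seasonal component is removed with statsmodels' seasonal_decompose, and a threshold flags abnormal frames. The synthetic error signal, the period and the mean-plus-three-sigma threshold rule are assumptions for illustration only.

```python
# Sketch: smoothing, seasonal decomposition and thresholding of a reconstruction-error
# series. The synthetic signal, period and threshold rule are illustrative assumptions.
import numpy as np
import pandas as pd
from statsmodels.tsa.seasonal import seasonal_decompose

# Synthetic error signal: cyclic fluctuation from a repetitive process plus one anomaly.
t = np.arange(3000)
errors = 0.05 * np.sin(2 * np.pi * t / 300) + 0.01 * np.random.randn(3000) + 0.5
errors[1500:1550] += 0.3  # injected anomaly

series = pd.Series(errors).rolling(window=15, center=True).mean()  # smoothing

# Remove the cyclic (seasonal) component caused by the repeating process.
decomposition = seasonal_decompose(series.dropna(), model="additive", period=300)
residual = decomposition.resid.dropna()

# Threshold: flag frames whose residual error exceeds mean + 3 standard deviations.
threshold = residual.mean() + 3 * residual.std()
abnormal_frames = residual.index[residual > threshold]
print(f"threshold = {threshold:.3f}, flagged frames: {len(abnormal_frames)}")
```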


5. Results

5.1. Detectability of anomalies

The following section discusses whether the different anomalies affect the reconstruction error of the given approach and, therefore, whether a detection is theoretically possible. A qualitative assessment of the reconstruction error metric follows. Table 3 shows the overall detectability of the anomalies in both benchmarks.

Table 3: General detectability of anomalies.

Benchmark                  Anomaly type              Anomaly extent*  Approach 1  Approach 2
Benchmark 1, 3D printer    Standstill                0.1 s            +           +
(both data sets)           Error in motion sequence  -                o           o
                           Coordinate deviations     1 mm             +           +
                           Speed deviations          -50 %            +           +
                           Foreign objects           5 % FOV          +           +
Benchmark 2, screw feed    Part jam                  -                +           +
                           Absence of parts          5 s              -           o
                           Wrong parts               -                -           -

+ Direct influence on reconstruction error
o No direct influence, but detectable as a pattern in post-processing
- Not detectable at all
* The smallest recognizable anomaly extent is listed.

Considering benchmark 1, all occurrences of the different anomaly types result in a change of the reconstruction error, which either increases significantly or shows a pattern that clearly deviates from the normal state. Therefore, both approaches can detect the anomalies of the first benchmark. Thus, monitoring of repeating processes seems to be possible.
Regarding the second benchmark, both approaches seem to struggle to learn the defining characteristics of the process. Instead, the reconstruction error seems to depend much more on the number of parts or the general 'activity' of the events. Therefore, any occurrence of a group of components causes an increase in the reconstruction error, regardless of whether a normal or an abnormal group is shown. As a result, incorrect parts are not distinguished. Part jams are the only anomaly type that can reliably be detected, since they represent a huge deviation from the normal behaviour. The only difference between both approaches is the ability of the second approach to detect the absence of parts in the second benchmark, which has a large impact on the evaluation metrics listed in Table 4, due to the long duration of that anomaly type.

Table 4: Overview of the results.

Benchmark                            Metric  Approach 1  Approach 2
Benchmark 1, 3D printer, data set 1  ROC     85.7%       85.1%
                                     EER     20.6%       22.1%
Benchmark 1, 3D printer, data set 2  ROC     73.1%       65.8%
                                     EER     36.7%       38.3%
Benchmark 2, screw feed              ROC     56.6%       84.3%
                                     EER     48.6%       21.4%

Overall, different configurations (e.g. sequence length, choice of temporal data stream) as well as the different architectures and data processing of both approaches have almost no influence on the basic detectability based on the reconstruction error. Therefore, both approaches, apart from one anomaly type in benchmark 2, yield similar results. This becomes apparent when comparing the metrics for benchmark 1. The general autoencoder approach and the evaluation using reconstruction errors seem to be more important than the specific configuration.

5.2. Discussion of autoencoder architectures for anomaly detection in production environments

The use of autoencoders with the reconstruction error as the metric to identify anomalies shows two major drawbacks. The first is the inability to precisely control the learned patterns of the autoencoder, which may result in too much or too little generalization capability. For example, in benchmark 2, both approaches were able to reconstruct image data of foreign objects to a high level of detail.
The second is the dependence of the reconstruction error on the complexity of the processed data. As more complex video slices (e.g. containing more details or movement) are harder for the autoencoders to reconstruct, the error may rise even in the absence of anomalies simply by feeding more complex data. This results in a cyclic fluctuation of the reconstruction error during benchmark 1, which makes some anomalies only detectable after smoothing the resulting seasonal components of the error data. As this smoothing is only possible due to the repetitive nature of benchmark 1, it was not possible for benchmark 2. Within benchmark 2, every group of screws meant an increase in video complexity compared to the static background. Therefore, the reconstruction error mainly correlates with the number of visible parts and could only reliably detect jams. Foreign objects in the feed could not be detected due to the combination of the two drawbacks mentioned.

5.3. Discussion of the created benchmark data sets

Benchmark 1: This benchmark represents a good variety of anomaly types in varying degrees of severity and is well usable for assessment. Only reducing the anomaly frequency and increasing the total length of the data sets should be considered, to enable a better long-term assessment of the anomaly score.

Benchmark 2: The second benchmark, in the form of sparse events and a complex data set with anomalies that are often difficult to detect, poses a challenge for all the approaches tested. Nevertheless, the partially good ROC results reveal some weaknesses of this benchmark. Due to the sparseness of events, coupled with a high anomaly frequency, approaches already achieve comparatively good ROC values if they classify every event as an anomaly. In addition, protracted anomalies, such as partial pauses, should be included in a separate test data set; otherwise, the weighting of the anomaly types in the ROC results will be biased.
Superordinate consideration: Overall, it can be concluded that when using the ROC metrics, a reduction of the share of anomalies in the data sets of both benchmarks seems reasonable. Furthermore, when creating a benchmark, the sparsity of events should also be considered. In conclusion, the benchmarks and metrics used in this work are suitable only to a limited extent for a comprehensive evaluation of approaches for anomaly detection in production environments. Nevertheless, for a direct comparison of approaches, they are meaningful. Production environments represent an even more diverse area for anomaly detection than surveillance applications. Consequently, for a complete comparison of anomaly detection methods, a variety of larger and more detailed benchmarks is necessary.

6. Summary

6.1. Benchmark data and results

Within this work, the first step was the design and creation of suitable benchmark data sets. The created data sets can serve as a basis for the evaluation of unsupervised anomaly detection approaches in the production area. However, a reduction of the share of anomalies in the data sets is recommended. This will allow a more meaningful evaluation based on the mentioned metrics.
Two autoencoders were tested and evaluated on the created data sets. It was determined that the performance of the two tested architectures is comparable. Within the first benchmark, the anomalies could be detected very well after data smoothing. The detection of deviations within the second benchmark was significantly more difficult. Here, the sparseness and complexity of the data meant that only jams were reliably detected.

6.2. Conclusion and further research

In summary, generic unsupervised autoencoder approaches from the surveillance domain can be evaluated as partially transferable to production environments. The tested approaches show potential but also significant restrictions. In particular, the greater variety of processes in the production environment compared to video surveillance proved to be problematic. However, to fully evaluate these approaches, testing in other process environments and comparison with other methods is necessary. Thus, the following topics provide the basis for further research:
- Creation of further benchmark data sets
- Testing of further model architectures
- Comparison with classical and supervised methods
- Visualization and localization of anomalies
7. References

[1] Matyas, K., 2019. Instandhaltungslogistik: Qualität und Produktivität steigern, 7th edn. Hanser, München.
[2] Mühlbauer, M., Würschinger, H., Polzer, D., Ju, S. et al., 2020. Automated Data Labeling and Anomaly Detection Using Airborne Sound Analysis.
[3] Roh, Y., Heo, G., Whang, S.E., 2019. A Survey on Data Collection for Machine Learning: A Big Data - AI Integration Perspective.
[4] Lei, Y., Li, N., Guo, L., Li, N. et al., 2018. Machinery health prognostics: A systematic review from data acquisition to RUL prediction.
[5] Fan, Y., Wen, G., Li, D., Qiu, S. et al., 2020. Video anomaly detection and localization via Gaussian Mixture Fully Convolutional Variational Autoencoder.
[6] Brecher, C., Weck, M., 2021. Werkzeugmaschinen Fertigungssysteme 3. Springer Berlin Heidelberg.
[7] Hawkins, D.M., 1980. Identification of Outliers. Springer Netherlands, Dordrecht.
[8] Cook, A.A., Misirli, G., Fan, Z., 2020. Anomaly Detection for IoT Time-Series Data: A Survey.
[9] Mohri, M., Rostamizadeh, A., Talwalkar, A., 2018. Foundations of machine learning.
[10] Goodfellow, I., Bengio, Y., Courville, A., 2016. Deep learning. MIT Press, Cambridge, Massachusetts, London, England.
[11] Scime, L., Beuth, J., 2018. Anomaly detection and classification in a laser powder bed additive manufacturing process using a trained computer vision algorithm.
[12] Würschinger, H., Mühlbauer, M., Winter, M., Engelbrecht, M. et al., 2020. Implementation and potentials of a machine vision system in a series production using deep learning and low-cost hardware.
[13] Colosimo, B.M., Grasso, M., 2018. Spatially weighted PCA for monitoring video image data with application to additive manufacturing.
[14] Tan, Y., Jin, B., Nettekoven, A., Chen, Y. et al., 2019. An Encoder-Decoder Based Approach for Anomaly Detection with Application in Additive Manufacturing, in 2019 18th IEEE International Conference On Machine Learning And Applications (ICMLA).
[15] Lu, C., Shi, J., Jia, J., 2013. Abnormal Event Detection at 150 FPS in MATLAB, in 2013 IEEE International Conference on Computer Vision, p. 2720.
[16] Boiman, O., Irani, M., 2007. Detecting Irregularities in Images and in Video.
[17] Chong, Y.S., Tay, Y.H., 2017. Abnormal Event Detection in Videos Using Spatiotemporal Autoencoder, in Advances in Neural Networks - ISNN 2017, Springer International Publishing, Cham, p. 189.
[18] Pang, G., Yan, C., Shen, C., van den Hengel, A. et al., 2020. Self-Trained Deep Ordinal Regression for End-to-End Video Anomaly Detection, in 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[19] Medel, J.R., Savakis, A., 2016. Anomaly Detection in Video Using Predictive Convolutional Long Short-Term Memory Networks.
[20] Giannoukos, I., Vrachnakis, V., Anagnostopoulos, C.-N., Anagnostopoulos, I. et al., 2012. Block Operator Context Scanning for Commercial Tracking, in Artificial Intelligence: Theories and Applications, Springer Berlin Heidelberg, p. 369.
[21] Wang, J., Cherian, A., Porikli, F., 2017. Ordered Pooling of Optical Flow Sequences for Action Recognition, in 2017 IEEE Winter Conference on Applications of Computer Vision (WACV), p. 168.
[22] Bilen, H., Fernando, B., Gavves, E., Vedaldi, A., 2018. Action Recognition with Dynamic Image Networks. IEEE Trans Pattern Anal Mach Intell 40.

