Article
Quality Prediction and Yield Improvement in Process
Manufacturing Based on Data Analytics
Ji-hye Jun 1 , Tai-Woo Chang 1, * and Sungbum Jun 2
1 Department of Industrial and Management Engineering/Intelligence and Manufacturing Research Center,
Kyonggi University, Suwon, Gyeonggi 16227, Korea; [email protected]
2 Department of Industrial and Systems Engineering, Dongguk University, Seoul 04620, Korea; [email protected]
* Correspondence: [email protected]; Tel.: +82-31-249-9754
Received: 6 July 2020; Accepted: 19 August 2020; Published: 1 September 2020
1. Introduction
In the manufacturing industry, quality management is a key to the competitiveness, productivity, and profit of companies, because poor quality management can damage the trust and good image that a company has built up over a long time [1]. For this reason, the importance of quality management in various industries emerged early on. In the early 1950s, Juran introduced the idea of Total Quality Control (TQC) to Japanese industry as a whole. Lillrank underlined that maintenance is the most important activity in quality management [2]. Bergman and Klefsjö emphasized the role of quality maintenance and repair in production [3]. Especially in continuous-flow process manufacturing, managing quality and the defect rate is critical because a defect in one part directly affects the quality of all subsequent processes.
Recently, with advances in Internet of Things (IoT) technology, the amount of industrial data from sensors has surged, and this has raised the level of automation significantly. From a quality management perspective, this growing volume of data can be utilized to predict defects so that the quality of final products can be improved. For this reason, there has been a wide range of research on data mining, which supports decision making by modeling the relations, rules, patterns, and other knowledge hidden in databases. In addition, big data analysis, artificial intelligence, machine learning, and deep learning make it possible to conduct quality management more effectively and accurately.
Figure 1. Quality measurement example.
Table 1. Research Questions and Approaches.

Question: How will quality data be collected?
Approach: Quality data generation from device data through semi-supervised learning.

Question: What techniques will you use to conduct your research?
Approach: Semi-supervised learning, time series prediction, and machine learning.

Question: How do you evaluate performance?
Approach: Evaluation indicators that consider the characteristics of unbalanced data.
In order to answer all of these questions, a comprehensive framework combining semi-supervised learning, time series prediction, and machine learning is proposed for utilizing unlabeled data in the continuous manufacturing process. To verify the proposed framework, two experiments were conducted for quality prediction and yield improvement. In the first experiment, after data preprocessing, deep learning and multiple classifiers are applied to predict product quality. In order to handle the large amount of unlabeled data in continuous-flow process manufacturing, the second experiment is designed to predict product quality with a pipeline of three models (semi-supervised learning, time series prediction, and multiple classifiers). The detailed procedures are explained in Sections 4 and 5. Finally, based on the experimental results, an efficient production plan for the plastic extrusion process is derived for improving quality and yield in the continuous-flow process.
SSL has recently been applied in a variety of related studies, improving their results. Yan utilized SSL for the early detection and diagnosis of air handling unit faults and showed high fault-diagnosis accuracy [16]. Ellefsen applied SSL to predict the remaining useful life of degrading turbofan engines [17]. Sen used SSL to validate the authenticity of sensor data before it is used to judge defects in the pipe process [18].
2.4. RNN
RNN is a type of ANN. As one of the deep learning techniques, it feeds the output of a hidden layer back as an input in order to learn sequential (time-series) data for prediction or classification. An RNN extends the conventional feed-forward neural network with a loop that carries the previous state into the next step: it maintains a recurrent state in which the activation at each step depends on the activation of the previous step, which allows it to process sequential input. When sequence data $x = (x_1, x_2, x_3, \cdots, x_t)$ are given ($x_t$ is the data at time step $t$), the RNN updates the hidden state $h_t$ repeatedly, where $\phi$ is a nonlinear function such as the logistic sigmoid or hyperbolic tangent [12]:

$$h_t = \begin{cases} 0 & \text{if } t = 0 \\ \phi(h_{t-1}, x_t) & \text{otherwise} \end{cases} \qquad (1)$$
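As a concrete illustration, the recurrence in Equation (1) can be written in a few lines of NumPy. The weight matrices and the tanh nonlinearity below are illustrative assumptions, since Equation (1) only requires some nonlinear function $\phi$:

```python
import numpy as np

def rnn_hidden_states(x_seq, W_h, W_x, phi=np.tanh):
    """Return hidden states h_1..h_T for an input sequence x_1..x_T."""
    h = np.zeros(W_h.shape[0])          # h_0 = 0: the t = 0 case of Equation (1)
    states = []
    for x_t in x_seq:
        h = phi(W_h @ h + W_x @ x_t)    # h_t = phi(h_{t-1}, x_t)
        states.append(h)
    return np.stack(states)

# Example: a length-5 sequence of 3-dimensional inputs, 4 hidden units.
rng = np.random.default_rng(0)
states = rnn_hidden_states(rng.normal(size=(5, 3)),
                           W_h=rng.normal(size=(4, 4)) * 0.1,
                           W_x=rng.normal(size=(4, 3)) * 0.1)
```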
2.5. Classification
Machine learning is a method of inferring a system function from pairs of data and labels. The inference process depends on the input features. If the prediction target for new input data is a continuous value, the task is regression analysis; if it is a discrete value, the task is classification [24]. This study uses logistic regression, decision tree, random forest, linear discriminant analysis (LDA), Gaussian naive Bayes (GNB), K-nearest neighbor (KNN), and support vector classifier (SVC) among the machine learning classification techniques, and ANN as a deep learning technique.
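For reference, the seven scikit-learn classifiers named above could be instantiated as in the following sketch; the hyperparameters are library defaults, since the paper does not report them, and the ANN is built separately with Keras:

```python
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC

# The seven classical classifiers used in this study (default settings).
classifiers = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Decision Tree": DecisionTreeClassifier(),
    "Random Forest": RandomForestClassifier(),
    "LDA": LinearDiscriminantAnalysis(),
    "Gaussian NB": GaussianNB(),
    "KNN": KNeighborsClassifier(),
    "SVC": SVC(probability=True),  # probability=True enables log loss / ROC later
}
```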
This study addresses the prediction of the quality of the continuous process, which was not done well in previous studies. Quality prediction is carried out by collecting equipment data. In previous studies, sensor data were used to predict and maintain process quality, and deep learning and machine learning have been applied to sensor-based prediction [21,22]. To improve quality prediction from small data, we apply SSL for labeling and an RNN for generating predictive feature data. Finally, we classify the SSL- and RNN-processed data with classification techniques for quality prediction. The aim is to operate production efficiently in a continuous process and to use the quality prediction results to check how much the yield can ultimately be improved.
If a final product exceeds a given tolerance, the product is marked as defective. Based on the product specification, with an inner diameter of 6 inches and a thickness of 8 cm, the allowable process error is ±0.2 mm in thickness and ±0.5 mm in inner diameter. For the application of a classifier, the quality value for products was coded as a binary value (1 for defective and 0 for acceptable). In this study, 3839 defective records (about 18%) out of 20,801 data were used. Note that a quality value refers not to the quality of a single finished product, but to the quality of the part of a product manufactured each second.
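A minimal sketch of this binary coding, assuming the measurements sit in a pandas DataFrame; the column name `thickness_mm` is hypothetical, as the paper does not give the raw field names:

```python
import pandas as pd

# Nominal thickness is 8 cm = 80 mm, with a tolerance of +/-0.2 mm.
df = pd.DataFrame({"thickness_mm": [80.05, 80.31, 79.92, 79.70]})  # stand-in rows
df["quality"] = ((df["thickness_mm"] - 80.0).abs() > 0.2).astype(int)
print(df)  # quality: 1 = defective, 0 = acceptable
```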
To identify the characteristics of each feature, we conducted a statistical analysis using line graphs and box plots. Figure 2 illustrates the line graphs of the continuous features. TIEXT1~3 and Water_1~4 followed similar trends per sensor; nevertheless, the temperature level shifted up or down overall depending on the sensor position. In contrast, Outtemp and Outwet change irregularly. From this, it can be seen that TIEXT and Water are control variables, while Outtemp and Outwet are measurement variables.
Figure 2. Extrusion temperature, external temperature, external humidity, and water temperature graphs.
In the outlier analysis, the continuous features were checked with box plots. The box plot is one of the statistical techniques of exploratory data analysis; it is used to visually identify patterns that may be hidden in a data set. Figure 3 presents illustrative box plots of the continuous data. The darker the points representing outliers, the more outliers are distributed in the corresponding range. According to the analysis, Outtemp and Outwet had no outliers, while TIEXT2~3 and Water_1~4 had 60, 329, 5, 5, 315, 237, and 3174 outliers. Although there were a significant number of outliers, they were concentrated in a few particular sections. Taking this into consideration, this study did not treat them as outliers caused by data errors.
Figure 3. Illustrative box plots of continuous data.
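A sketch of the box-plot check described above, with a stand-in DataFrame in place of the real sensor columns; matplotlib is an assumed plotting choice (it is not among the libraries listed in Table 5):

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Stand-in frame for the continuous features.
rng = np.random.default_rng(0)
cols = ["TIEXT1", "TIEXT2", "TIEXT3",
        "Water_1", "Water_2", "Water_3", "Water_4",
        "Outtemp1", "Outwet1"]
df = pd.DataFrame(rng.normal(size=(500, len(cols))), columns=cols)

df.boxplot(rot=45, figsize=(10, 6))  # points beyond the whiskers are outliers
plt.title("Box plots of continuous features")
plt.tight_layout()
plt.show()
```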
The distribution of the discrete data is shown in Table 3. In the case of the dice temperature and cylinder temperature, the sensors are located at different positions, so the data values and ranges differ. The number of distinct values for the speed features RPM1_SCREW and RPM2_EXT is small.
The results of the correlation analysis on the features are shown in Figure 4. The Pearson correlation coefficient was applied; a correlation coefficient is a value between −1 and +1, and the closer the coefficient is to ±1, the stronger the correlation. The coefficients are presented with different color intensities: the larger a correlation coefficient is, the darker the color. According to the analysis, the correlation coefficients of sensors in close proximity, such as the extrusion temperatures (TIEXT1~3) and dice temperatures (TI_DIES2~3), were mostly high. Aside from that, the correlation coefficients between TIEXT, TI_DIES4, Outwet, and Outtemp1 were high. Additionally, the correlation coefficients between the quality values and the features were analyzed. In Table 4, the analysis results are presented in descending order of absolute value. Given that all of the correlation coefficients between quality values and features are small, no single feature determines the quality value.
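A minimal sketch of this correlation analysis, with a stand-in DataFrame in place of the real sensor data and binary quality values:

```python
import numpy as np
import pandas as pd

# Stand-in data; in the study these are the 25 sensor features plus quality.
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.normal(size=(500, 4)),
                  columns=["TIEXT1", "TI_DIES1", "Outtemp1", "Outwet1"])
df["quality"] = (rng.random(len(df)) < 0.18).astype(int)

corr = df.corr(method="pearson")   # feature-to-feature coefficients (Figure 4)
quality_corr = (corr["quality"].drop("quality")
                .abs().sort_values(ascending=False))  # ordering behind Table 4
print(quality_corr)
```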
Figure 4. Schematic of the correlation coefficient between features.
Table 4. Correlation coefficients between quality values and features.

Attribute      Correlation    Attribute    Correlation
TI_DIES1       0.2658         Outtemp1     0.0037
TI_CYL5        0.2278         Outwet3      0.0037
TI_CYL4        0.1411         TI_CYL2      0.0036
TI_CYL6        0.1034         TIEXT3       0.0033
Water_3        0.0461         Outwet2      0.0027
Water_4        0.0417         TI_DIES4     0.0015
Water_2        0.0341         Outwet1      0.0014
Water_1        0.0284         TI_DIES3     0.0012
Outtemp3       0.0257         TI_DIES2     0.0009
RPM1_SCREW     0.0192         RPM2_EXT     0.0009
TI_CYL3        0.0163         TIEXT2       0.0002
TI_CYL1        0.0075         TIEXT1       0.0001
Outtemp2       0.0075

4. Quality Prediction in Process Manufacturing

This chapter describes the first experiment, which predicts product quality using machine learning classification models. In all the analyses, Python 3.6.10 was used, and the software libraries are listed in Table 5.

Table 5. List of libraries used for Python programming.

Library        Version    Purpose
NumPy          1.18.1     fundamental package for scientific computing
pandas         1.0.1      data analysis and manipulation tool
Keras          2.3.1      deep-learning API
scikit-learn   0.22.1     tools for predictive data analysis
Methods          Description
Undersampling    Reduces the imbalance rate by randomly removing some elements of the majority class from the data set.
Oversampling     Takes the opposite approach to undersampling: instead of reducing the majority class, it adds minority-class samples to the training set as cloned or perturbed variants.
Hybrid method    Combines undersampling and oversampling in one data balancing approach.
Of the 20,801 original data, 18,218 are non-defective product data and 2583 are defective product data. In this study, the ratio of training data to test data was set to 5:5. Of the 10,401 training data, 9131 are non-defective product data (about 88%) and 1270 are defective product data (about 12%). Since there was a large difference between the majority and minority classes, oversampling was applied to balance the class ratio. After resampling, 9131 defective and 9131 non-defective product data were obtained. In addition, since the feature values vary and their scales are not directly comparable, feature scaling was applied to unify the ranges of the features. Feature scaling makes it easy to compare and analyze multi-dimensional values and improves the stability and convergence speed of the optimization process [26]. As described in Section 4.1, some features contain outliers. In consideration of this point, the RobustScaler of the Python scikit-learn library was used as the preprocessing method. RobustScaler removes the median and scales the data by the interquartile range (IQR), thereby minimizing the influence of outliers. This ensures that each feature has a median of 0 and an IQR of 1, so that all features are on the same magnitude [27].
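A sketch of this preprocessing chain follows. The paper does not name the oversampling implementation, so imbalanced-learn's RandomOverSampler stands in for the cloning-based oversampling described above; the stand-in arrays mimic the 88%/12% training split:

```python
import numpy as np
from imblearn.over_sampling import RandomOverSampler
from sklearn.preprocessing import RobustScaler

# Stand-in for the 10,401-row training split (about 88% good, 12% defective).
rng = np.random.default_rng(0)
X_train = rng.normal(size=(10_401, 25))
y_train = (rng.random(10_401) < 0.12).astype(int)

# Oversampling: clone minority-class rows until the classes are balanced.
X_res, y_res = RandomOverSampler(random_state=0).fit_resample(X_train, y_train)

# RobustScaler: subtract the median and divide by the IQR per feature,
# limiting the influence of the outliers noted in Section 4.1.
scaler = RobustScaler().fit(X_res)
X_res = scaler.transform(X_res)
```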
Balanced accuracy is applied to binary and multi-class classification on imbalanced data sets and is defined as the mean of the recalls obtained for the classes:

$$\text{accuracy} = \frac{\text{correctly predicted}}{\text{total predicted}} = \frac{TP + TN}{TP + TN + FP + FN}$$

$$\text{balanced accuracy} = \frac{\text{sensitivity} + \text{specificity}}{2} = \left(\frac{TP}{TP + FN} + \frac{TN}{FP + TN}\right)\Big/\,2$$
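The definition can be checked directly against scikit-learn's implementation, as in this small sketch:

```python
from sklearn.metrics import balanced_accuracy_score, confusion_matrix

# Balanced accuracy as defined above: the mean of the per-class recalls.
y_true = [0, 0, 0, 0, 1, 1]
y_pred = [0, 0, 1, 0, 1, 0]
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
manual = (tp / (tp + fn) + tn / (fp + tn)) / 2
assert abs(manual - balanced_accuracy_score(y_true, y_pred)) < 1e-12
```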
Of the applied classifiers, the ANN was designed as a model with two hidden layers of 100 and 80 neurons, respectively. The rectified linear unit (ReLU) and sigmoid were used as activation functions, and Adam was used as the optimizer. ReLU is the most frequently applied activation function. The final classification result should be a binary number, either non-defective (0) or defective (1); therefore, sigmoid was applied to the output layer. The number of epochs was set to 1000 and the batch size to 10. For model training, 30% of the data was set aside for validation, and early stopping was applied. Early stopping can prevent underfitting and overfitting: it stops training before the epoch limit is reached once the number of epochs without performance improvement exceeds a given patience. In this study, the patience was set to 50. The results of cross-validation (k = 5) are presented in Table 8.
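A minimal Keras sketch of the described network follows. The input dimension of 25 matches the feature count, and binary cross-entropy is an assumed (standard) loss choice, as the paper does not state it; the training arrays are stand-ins:

```python
import numpy as np
from keras.models import Sequential
from keras.layers import Dense
from keras.callbacks import EarlyStopping

rng = np.random.default_rng(0)
X_res = rng.normal(size=(2000, 25))          # stand-in for the resampled data
y_res = (rng.random(2000) < 0.5).astype(int)

# Two hidden layers (100 and 80 ReLU neurons) and a sigmoid output.
model = Sequential([
    Dense(100, activation="relu", input_dim=25),
    Dense(80, activation="relu"),
    Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])

# Stop once validation loss fails to improve for 50 epochs.
early_stop = EarlyStopping(monitor="val_loss", patience=50,
                           restore_best_weights=True)
model.fit(X_res, y_res, epochs=1000, batch_size=10,
          validation_split=0.3, callbacks=[early_stop])
```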
It was found that the models generalize best in the order of decision tree, LDA, and logistic regression. To evaluate the performance of the classification models, log loss and receiver operating characteristic (ROC) curves were analyzed in addition to the cross-validation performance. These two performance indexes can only be checked after fitting the model to the data, so they are not used in cross-validation. Log loss, also called cross-entropy, can be contrasted with accuracy: accuracy only measures whether a prediction equals the actual value, whereas log loss accounts for uncertainty by penalizing predictions according to how far their probabilities deviate from the labels, and therefore gives a more nuanced view of model performance. The lower the log loss, the better. In the ROC curve, the false positive rate (FPR) is displayed on the x-axis and the true positive rate (TPR) on the y-axis, where FPR is the rate of negative cases wrongly classified as positive, and TPR is the rate of positive cases labeled correctly [31]. The area under the ROC curve (AUC) provides a single score that can be used to compare models. Generally, the ROC curve is effective under severe class imbalance, when the minority class has few data [32]. An unskilled classifier scores 0.5, whereas a perfect classifier scores 1.0. The classification results with these added performance evaluation measures are presented in Table 9.
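Both measures are available in scikit-learn; a self-contained sketch with stand-in data in place of the scaled extrusion features:

```python
from sklearn.datasets import make_classification
from sklearn.metrics import log_loss, roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Stand-in imbalanced data (about 88% majority class, as in the study).
X, y = make_classification(n_samples=1000, weights=[0.88], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.5, random_state=0)

proba = DecisionTreeClassifier().fit(X_tr, y_tr).predict_proba(X_te)[:, 1]
print("log loss:", log_loss(y_te, proba))       # lower is better
print("ROC AUC :", roc_auc_score(y_te, proba))  # 0.5 = unskilled, 1.0 = perfect
```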
According to the final classification performance analysis, the best model in terms of balanced accuracy and log loss was the decision tree, and the best model in terms of the ROC curve was the random forest. We compared the two models using the confusion matrices shown in Table 10. For the random forest, FP (cases predicted as defective although actually non-defective) was very low, but FN (cases predicted as non-defective although actually defective) was high. For the decision tree, the errors were distributed in a balanced way. In terms of the number of incorrectly classified items, the decision tree misclassified 365 + 286 = 651 items and the random forest 2 + 1215 = 1217. Given that, we may conclude that the decision tree has better performance.

Table 10. Confusion matrices applied with simple classification.

Models                            Predicted as Good    Predicted as Bad
Decision Tree    Actual good      8722                 365
                 Actual bad       286                  1027
Random Forest    Actual good      9085                 2
                 Actual bad       1215                 98
4.3. Simple Classification Feature Importance

Feature importance was analyzed in order to find which features were critical factors in the data classification by the decision tree, as shown in Figure 5. In terms of feature importance, the top five features were TI_CYL5, TI_DIES1, and TIEXT1~3, which scored 0.1896, 0.1757, 0.1101, 0.1085, and 0.1046, respectively. These top five of the 25 features accounted for 69% of the total influence.
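A sketch of how this ranking can be read from a fitted decision tree; the data and feature names here are stand-ins for the 25 real features:

```python
import numpy as np
import pandas as pd
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 25))
y = (rng.random(1000) < 0.12).astype(int)
features = [f"feature_{i}" for i in range(25)]

tree = DecisionTreeClassifier(random_state=0).fit(X, y)
importance = pd.Series(tree.feature_importances_, index=features)
top5 = importance.sort_values(ascending=False).head(5)
print(top5)                                # the paper's top five sum to ~0.69
print("top-5 share:", top5.sum())
```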
5.2. STEP1—Semi-Supervised Learning
When a small amount of data is resampled or scaled, bias can occur; therefore, the data were analyzed without preprocessing. The ratio of training data to test data was set to 5:5. SSL was conducted using only the 10,400 training data. Of the training data, 6281 labels (about 60%) were randomly deleted, and the remaining 40% (4119 data) were used to label the deleted portion. The SSL methods applied were pseudo labeling and graph-based classifiers. Pseudo labeling was applied with the seven classifiers used in Section 4.2, excluding the ANN. Label spreading and label propagation, the graph-based SSL classifiers, were each applied with the radial basis function (RBF) and KNN kernels. Accordingly, SSL was applied in a total of eleven cases, and cross-validation (k = 5) was conducted. Table 11 presents the results. Given the cross-validation results, the decision tree and random forest produced the most general models.
Table 11. Results of cross-validation applied with semi-supervised learning (k = 5).
The results of applying the models to the actual data are shown in Table 12. In this case, SVC classified all values as non-defective, which makes comparison across the performance evaluation measures meaningless, so it was excluded. For the comparison of model performance, balanced accuracy, log loss, and the AUC of the ROC curve were used. In terms of balanced accuracy and log loss, the top five models were equal; in terms of the ROC curve, the decision tree and random forest were excellent. Confusion matrices were then analyzed in order to select a final model, and the random forest was found to perform well, as shown in Table 13. In conclusion, the data labeled by the random forest-based pseudo labeler are used for training in the final classification of Step 3, where quality is predicted.
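A minimal sketch of Step 1 under these settings; the arrays are stand-ins, with deleted labels marked as -1 in the convention scikit-learn's graph-based methods expect:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.semi_supervised import LabelSpreading

# Stand-in training split with about 60% of the labels deleted (-1).
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 25))
y_true = (rng.random(1000) < 0.12).astype(int)
y = y_true.copy()
y[rng.random(1000) < 0.6] = -1
labeled = y != -1

# Pseudo labeling: fit on the labeled 40%, then label the deleted 60%.
rf = RandomForestClassifier(random_state=0).fit(X[labeled], y[labeled])
y_full = y.copy()
y_full[~labeled] = rf.predict(X[~labeled])

# Graph-based alternative: label spreading with an RBF kernel.
ls = LabelSpreading(kernel="rbf").fit(X, y)
y_graph = ls.transduction_
```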
Table 12. Results of actual SSL prediction with performance evaluation measures.

Table 13. Confusion matrices of actual SSL prediction.

Classifiers          Models                            Predicted as Good    Predicted as Bad
Pseudo Labeler       Decision Tree    Actual good      5375                 109
                                      Actual bad       100                  697
                     Random Forest    Actual good      5458                 26
                                      Actual bad       124                  673
                     KNN              Actual good      5376                 108
                                      Actual bad       289                  508
Label Spreading      LS—RBF           Actual good      5335                 149
                                      Actual bad       164                  633
Label Propagation    LP—RBF           Actual good      5332                 152
                                      Actual bad       164                  633
5.4. STEP3—Classification
In Step 3, the final classification data are obtained for yield improvement. In Step 1, the missing labels were generated by the random forest-based pseudo labeler; in Step 2, feature data were predicted with the RNN. In this last step, classification is performed on the basis of the data generated in each previous step, and a future quality value is finally predicted. The classifiers used in Section 4.2 were also applied in this step. Tables 15 and 16 present the final classification performance and results after random forest (SSL) was applied. As shown in Table 15, the decision tree and random forest had excellent performance in terms of all performance indexes. The final objective of this study is to increase the yield through quality prediction. For this reason, based on the quality values predicted in Step 3, we calculate how much more the proposed process method can increase the yield compared with the conventional process method.
Table 15. Results of classification applied after semi-supervised learning (Random Forest).
Table 16. Confusion matrices of classification applied after semi-supervised learning (random forest).

Models                            Predicted as Good    Predicted as Bad
Decision Tree    Actual good      8797                 285
                 Actual bad       370                  943
Random Forest    Actual good      8947                 135
                 Actual bad       696                  671
(1) If the predicted quality of every item within a production length during a unit time is non-defective, the products are manufactured (see the sketch after this list).
(2) If the manufactured products include any defective item, they are regarded as defective products.
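A minimal sketch of these two rules; the per-second prediction and ground-truth arrays are stand-ins generated here for illustration:

```python
import numpy as np

def yield_count(pred, actual, unit_time):
    """Count non-defective products under the cutting rules above.

    pred, actual: arrays of per-second labels (0 = good, 1 = defective).
    """
    n = len(pred) // unit_time
    p = np.asarray(pred[: n * unit_time]).reshape(n, unit_time)
    a = np.asarray(actual[: n * unit_time]).reshape(n, unit_time)
    manufactured = p.sum(axis=1) == 0           # rule 1: all seconds predicted good
    good = manufactured & (a.sum(axis=1) == 0)  # rule 2: and actually defect-free
    return int(good.sum())

# Stand-in arrays; in the study these come from the Step 3 predictions.
rng = np.random.default_rng(0)
actual_quality = rng.binomial(1, 0.12, size=10_395)
predicted_quality = actual_quality.copy()
for t in (6, 10, 30):  # compare unit times, as in Table 18
    print(t, yield_count(predicted_quality, actual_quality, unit_time=t))
```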
Without the application of SSL and RNN, a small amount of data was applied to quality prediction. Of all the data (10,401 data), 60% were deleted randomly, and the remaining 40% (4120 data) were used to train a decision tree, which showed the best performance in Section 4.2. Based on this small amount of data, 10,395 data were predicted again. In addition, the quality of the Step 3 results was predicted with 40% of the data; those results are shown in Section 5.4. The confusion matrix of each prediction is shown in Table 17.
Table 17. Confusion matrices of each prediction.

Pre-Processing              Models                            Predicted as Good    Predicted as Bad
Semi-Supervised Learning    Decision Tree    Actual good      8797                 285
                                             Actual bad       370                  943
                            Random Forest    Actual good      8947                 135
                                             Actual bad       696                  671
Simple Classification       Decision Tree    Actual good      8437                 650
                                             Actual bad       260                  1053
Table 18 presents the numbers of products generated according to the prediction results and the proposed method. Considering the final number of non-defective products, there was no yield improvement effect even when product cutting was executed in consideration of defect occurrence. However, as the results show, the proposed method yielded more than the conventional production method when the production unit time was larger than six seconds (c, d). For the decision tree, the yield increased in all unit-time slots beyond six seconds and rose by a maximum of about 8.70% (c). For the random forest, the maximum yield increase was 6.76%, and the yield sporadically dropped below that of the conventional method (d). Given that, it is effective to apply the proposed manufacturing method with decision tree-based quality prediction (c).
Table 18. Number of non-defective products and yield increase rates by models (Conventional, Simple Classification, and Semi-Supervised Learning).
6. Conclusions
Most of the studies related to continuous-flow process quality have focused on the prediction of equipment quality or particular factors. In this study, statistical analysis was conducted on the basic data, and quality was predicted with the use of a classifier. In addition, the small amount of quality data was taken into account: a process for predicting quality efficiently from the small data set and increasing yield was investigated. The process consists of three steps. In Step 1, unlabeled data were labeled by random forest-based pseudo labeling; the whole quality was predicted with the use of a small amount of labeled data, which does not require the cost and effort of additional data collection. In Step 2, feature values were generated through time-series prediction with an RNN as a deep learning technique. In Step 3, the final quality was predicted with the decision tree. Finally, this study proposed a method for cutting a product in consideration of the defect occurrence point identified through quality prediction, thereby improving the yield of the continuous-flow process. SSL, RNN, and classification algorithms were widely used in previous studies; however, unlike those studies, this study used these algorithms to predict the quality of a continuous process. In addition, the proposed framework improved the yield by 8.7% in a continuous process. If a small amount of data is used for quality prediction with no separate data processing, even in consideration of defects, the yield can become lower than that of a conventional equidistant production method. However, the proposed method using semi-supervised learning and RNN showed better performance than the existing production method even with a small amount of data.
The production method proposed in this study applies to various areas. For example, Case 2 in Figure 1b commonly occurs in the continuous-flow process industry, so applying quality prediction can help improve yield. It can also be applied in other areas where quality must be predicted even though little data is available. If the proposed production method is deployed as a model whose predictive ability improves with larger amounts of data, it is expected to contribute greatly to saving raw materials and improving quality.
Author Contributions: Conceptualization, J.-h.J., T.-W.C.; Data curation, J.-h.J.; Formal analysis, J.-h.J., T.-W.C.,
S.J.; Funding acquisition, T.-W.C.; Methodology, J.-h.J.; Software, J.-h.J.; Supervision, T.-W.C.; Validation, J.-h.J.,
T.-W.C., S.J.; Visualization, J.-h.J., S.J.; Writing—original draft preparation, J.-h.J.; Writing—review and editing,
T.-W.C., S.J.; All authors have read and agreed to the published version of the manuscript.
Funding: This work was supported by the GRRC program of Gyeonggi province ((GRRC KGU 2020-B01), Research on Intelligent Industrial Data Analytics).
Conflicts of Interest: The authors declare no conflict of interest.
References
1. Berumen, S.; Bechmann, F.; Lindner, S.; Kruth, J.P.; Craeghs, T. Quality control of laser- and powder bed-based Additive Manufacturing (AM) technologies. Phys. Procedia 2010, 5, 617–622. [CrossRef]
2. Lillrank, P.M. Laatumaa—Johdatus Japanin Talouselämään Laatujohtamisen Näkökulmasta; Gummerus Kirjapaino
Oy: Jyväskylä, Finland, 1990; p. 277.
3. Angun, A.S. A Neural Network Applied to Pattern Recognition in Statistical Process Control. Comput. Ind. Eng.
1998, 35, 185–188. [CrossRef]
4. Schnell, J.; Nentwich, C.; Endres, F.; Kollenda, A.; Distel, F.; Knoche, T.; Reinhart, G. Data mining in
lithium-ion battery cell production. J. Power Sources 2019, 413, 360–366. [CrossRef]
5. Ramana, D.E.; Sapthagiri, S.; Srinivas, P. Data Mining Approach for Quality Prediction and Improvement of
Injection Molding Process through SANN, GCHAID and Association Rules. Int. J. Mech. Eng. 2016, 7, 31–40.
6. Wang, G.; Ledwoch, A.; Hasani, R.M.; Grosu, R.; Brintrup, A. A generative neural network model for the
quality prediction of work in progress products. Appl. Soft Comput. 2019, 85, 1–13. [CrossRef]
7. Ogorodnyk, O.; Lyngstad, O.V.; Larsen, M.; Wang, K.; Martinsen, K. Application of machine learning methods
for prediction of parts quality in thermoplastics injection molding. In Proceedings of the 8th International
Workshop of Advanced Manufacturing and Automation (IWAMA), Changzhou, China, 25–26 September 2018;
Volume 8, pp. 237–244.
8. Li, Y.Y.; Bridgwater, J. Prediction of extrusion pressure using an artificial neural network. Powder Technol. 2000, 108, 65–73. [CrossRef]
9. Lela, B.; Musa, A.; Zovko, O. Model-based controlling of extrusion process. Int. J. Adv. Manuf. Technol. 2014,
74, 9–12. [CrossRef]
10. Li, X.L.; Yu, P.S.; Liu, B.; Ng, S.K. Positive unlabeled learning for data stream classification. In Proceedings of
the 2009 SIAM International Conference on Data Mining, Sparks, NV, USA, 30 April–2 May 2009; pp. 259–270.
11. Cohen, I.; Cozman, F.G.; Sebe, N.; Cirelo, M.C.; Huang, T.S. Semisupervised learning of classifiers: Theory,
algorithms, and their application to human-computer interaction. IEEE Trans. Pattern Anal. Mach. Intell.
2004, 26, 1553–1566. [CrossRef]
12. Subramanya, A.; Talukdar, P.P. Graph-based semi-supervised learning. In Synthesis Lectures on Artificial
Intelligence and Machine Learning; Morgan & Claypool Publishers LLC: San Rafael, CA, USA, 2014; Volume 8,
pp. 1–125.
13. Zhu, X.; Ghahramani, Z. Learning from labeled and unlabeled data with label propagation. In Technical
Report CMU-CALD-02-107; Carnegie Mellon University: Pittsburgh, PA, USA, 2002.
14. Zhang, Z.; Jia, L.; Zhao, M.; Liu, G.; Wang, M.; Yan, S. Kernel-induced label propagation by mapping for
semi-supervised classification. IEEE Trans. Big Data 2018, 5, 148–165. [CrossRef]
15. Lee, D.H. Pseudo-label: The simple and efficient semi-supervised learning method for deep neural networks.
In Proceedings of the Workshop on Challenges in Representation Learning (ICML), Atlanta, GA, USA,
16–21 June 2013; Volume 3, p. 2.
16. Yan, K.; Zhong, C.; Ji, Z.; Huang, J. Semi-supervised learning for early detection and diagnosis of various air
handling unit faults. Energy Build. 2018, 181, 75–83. [CrossRef]
17. Ellefsen, A.L.; Bjørlykhaug, E.; Æsøy, V.; Ushakov, S.; Zhang, H. Remaining useful life predictions for turbofan
engine degradation using semi-supervised deep architecture. Reliab. Eng. Syst. Saf. 2019, 183, 240–251.
[CrossRef]
18. Sen, D.; Aghazadeh, A.; Mousavi, A.; Nagarajaiah, S.; Baraniuk, R.; Dabak, A. Data-driven semi-supervised
and supervised learning algorithms for health monitoring of pipes. Mech. Syst. Signal Process. 2019,
131, 524–537. [CrossRef]
19. Maknickienė, N.; Rutkauskas, A.V.; Maknickas, A. Investigation of financial market prediction by recurrent
neural network. Innov. Infotechnol. Sci. Bus. Educ. 2011, 2, 3–8.
20. Zhu, J.H.; Zaman, M.M.; Anderson, S.A. Modelling of shearing behaviour of a residual soil with recurrent
neural network. Int. J. Numer. Anal. Methods Geomech. 1998, 22, 671–687. [CrossRef]
21. Namuduri, S.; Narayanan, B.N.; Davuluru, V.S.P.; Burton, L.; Bhansali, S. Deep Learning Methods for Sensor
Based Predictive Maintenance and Future Perspectives for Electrochemical Sensors. J. Electrochem. Soc. 2020,
167, 037552. [CrossRef]
22. Narayanan, B.N.; Beigh, K.; Loughnane, G.; Powar, N. Support vector machine and convolutional neural network based approaches for defect detection in fused filament fabrication. In Proceedings of SPIE Optical Engineering + Applications, San Diego, CA, USA, 11–15 August 2019; Volume 11139.
23. Lu, C.H.; Tsai, C.C. Adaptive predictive control with recurrent neural network for industrial processes:
An application to temperature control of a variable-frequency oil-cooling machine. IEEE Trans. Ind. Electron.
2008, 55, 1366–1375. [CrossRef]
24. Nam, S.T.; Shin, S.Y.; Jin, C.Y. A Reconstruction of Classification for Iris Species Using Euclidean Distance
Based on a Machine Learning. Korea Inst. Inform. Commun. Eng. 2020, 24, 225–230.
25. He, H.; Garcia, E.A. Learning from imbalanced data. IEEE Trans. Knowl. Data Eng. 2009, 21, 1263–1284.
26. Craig, A.; Cloarec, O.; Holmes, E.; Nicholson, J.K.; Lindon, J.C. Scaling and normalization effects in NMR
spectroscopic metabonomic data sets. Anal. Chem. 2006, 78, 2262–2267. [CrossRef]
27. Shuai, Y.; Zheng, Y.; Huang, H. Hybrid Software Obsolescence Evaluation Model Based on
PCA-SVM-GridSearchCV. In Proceedings of the IEEE 9th International Conference on Software Engineering
and Service Science (ICSESS), Beijing, China, 23–25 November 2018; pp. 449–453.
28. Cawley, G.C.; Talbot, N.L. On over-fitting in model selection and subsequent selection bias in performance
evaluation. J. Mach. Learn. Res. 2010, 11, 2079–2107.
29. Bengio, Y.; Grandvalet, Y. No unbiased estimator of the variance of k-fold cross-validation. J. Mach. Learn.
Res. 2004, 5, 1089–1105.
30. Brodersen, K.H.; Ong, C.S.; Stephan, K.E.; Buhmann, J.M. The balanced accuracy and its posterior
distribution. In Proceedings of the 2010 20th International Conference on Pattern Recognition, Istanbul,
Turkey, 23–26 August 2010; pp. 3121–3124.
31. Khalilia, M.; Chakraborty, S.; Popescu, M. Predicting disease risks from highly imbalanced data using
random forest. BMC Med. Inform. Decis. Mak. 2011, 11, 51. [CrossRef] [PubMed]
32. Mansournia, M.A.; Geroldinger, A.; Greenland, S.; Heinze, G. Separation in logistic regression: Causes,
consequences, and control. Am. J. Epidemiol. 2018, 187, 864–870. [CrossRef] [PubMed]
33. Wang, F.; Zhang, C. Label propagation through linear neighborhoods. IEEE Trans. Knowl. Data Eng. 2007,
20, 55–67. [CrossRef]
© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access
article distributed under the terms and conditions of the Creative Commons Attribution
(CC BY) license (http://creativecommons.org/licenses/by/4.0/).