
A Hierarchical XGBoost Early Detection Method for Quality and Productivity Improvement of Electronics Manufacturing Systems

Alexandre Gaffet, Nathalie Barbosa Roa, Pauline Ribot, Elodie Chanthery, Christophe Merle

To cite this version:

Alexandre Gaffet, Nathalie Barbosa Roa, Pauline Ribot, Elodie Chanthery, Christophe Merle. A Hierarchical XGBoost Early Detection Method for Quality and Productivity Improvement of Electronics Manufacturing Systems. 7th European Conference of the Prognostics and Health Management Society 2022, Jul 2022, Turin, Italy. ⟨hal-03711267⟩

HAL Id: hal-03711267
https://hal.science/hal-03711267
Submitted on 4 Jul 2022

HAL is a multi-disciplinary open access archive for the deposit and dissemination of scientific research documents, whether they are published or not. The documents may come from teaching and research institutions in France or abroad, or from public or private research centers.
A Hierarchical XGBoost Early Detection Method for Quality and Productivity Improvement of Electronics Manufacturing Systems

Alexandre Gaffet (1,2), Nathalie Barbosa Roa (1), Pauline Ribot (2,3), Elodie Chanthery (2,4) and Christophe Merle (1)

(1) Vitesco Technologies France SAS, 44 Avenue du Général de Croutte, F-31100 Toulouse, France
[email protected]
[email protected]
[email protected]

(2) CNRS, LAAS, 7 avenue du colonel Roche, F-31400 Toulouse, France
[email protected]
[email protected]

(3) Univ. de Toulouse, UPS, LAAS, F-31400 Toulouse, France

(4) Univ. de Toulouse, INSA, LAAS, F-31400 Toulouse, France

EUROPEAN CONFERENCE OF THE PROGNOSTICS AND HEALTH MANAGEMENT SOCIETY 2022

ABSTRACT

This paper presents XGBoost classifier-based methods to solve three tasks proposed by the European Prognostics and Health Management Society (PHME) 2022 conference. These tasks are based on real data from a Surface Mount Technologies line. Each of these tasks aims to improve the efficiency of the Printed Circuit Board (PCB) manufacturing process, facilitate the operator's work and minimize the cases of manual intervention. Due to the structured nature of the problems proposed for each task, an XGBoost method based on encoding and feature engineering is proposed. The proposed methods utilise the fusion of test values and system characteristics extracted from two different testing equipments of the Surface Mount Technologies lines. This work also explores the problems of generalising prediction at the system level using information from the subsystem data; for this particular industrial case, this includes the challenges raised by changes in the number of subsystems. For Industry 4.0, the need for interpretability is very important. This is why the results of the models are analysed using Shapley values. With the proposed method, our team took first place, capable of successfully detecting at an early stage the defective components for tasks 2 and 3.

Alexandre Gaffet et al. This is an open-access article distributed under the terms of the Creative Commons Attribution 3.0 United States License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

1. INTRODUCTION

The 2022 PHME Data challenge encourages participants to solve multiple classification problems for a real production line from Bitron Spa. The dataset includes data from Solder Paste Inspection (SPI) and Automatic Optical Inspection (AOI) equipment of a real industrial production line equipped with automated, integrated and fully connected machines (Industry 4.0). A detailed description of the dataset is given in Section 3. The challenge is to design an algorithm to predict test labels for the components. Specifically, the goal is to develop a hierarchical classification predicting: 1. whether the AOI classifies the component as defective; 2. in the case of a defect, the label applied by the operator; 3. in the case of confirmation of the defect by the operator, the repair label.

To tackle this challenge, we pursue the following steps: data exploration and domain knowledge extraction, data cleaning, data preparation (normalization and encoding), data modelling (model training and validation) and results analysis. The four last steps were made recursively while trying different approaches, as shown in Section 4. Data exploration allowed us to identify three main issues within the given dataset: missing information, highly imbalanced classes (for all tasks) and high cardinality of the categorical features. The latter is not necessarily an issue but implies that a special treatment needs to be applied to these features a priori. We will elaborate on the issues in Section 4.1 and on the categorical encoding in Section 4.2.

To solve each task, we take different information units formed

by feature tuples, corresponding to different levels of the data hierarchy. At the same time, different features are kept as relevant for each task and followed by a specific normalization or encoding. The specific tools used for each task are described in detail in Section 4.3. Finally, after presenting the experimental setup used to tune the model hyperparameters (Section 4.4), the achieved results are discussed in Section 5.
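The hierarchical classification outlined above can be pictured as a cascade: each task's model only runs when the previous stage has flagged a defect. The sketch below is purely illustrative — the three predict_* functions are hypothetical placeholders for the trained models described later in the paper, and only the chaining logic comes from the challenge description.

```python
# Illustrative cascade for the three challenge tasks. The predict_* functions
# are hypothetical placeholders; only the chaining logic follows the paper.

def predict_aoi_defect(component) -> bool:
    """Task 1: would the AOI flag this component as defective? (placeholder)"""
    return component.get("aoi_defect", False)

def predict_operator_label(component) -> str:
    """Task 2: label the operator would apply to an AOI defect. (placeholder)"""
    return component.get("operator_label", "Good")

def predict_repair_label(component) -> str:
    """Task 3: repair label once the operator confirms the defect. (placeholder)"""
    return component.get("repair_label", "FalseScrap")

def hierarchical_prediction(component) -> dict:
    """Run the three tasks as a cascade: each stage is only reached when the
    previous stage predicts a defect, mirroring the challenge structure."""
    result = {"aoi_defect": predict_aoi_defect(component),
              "operator_label": None, "repair_label": None}
    if result["aoi_defect"]:                    # task 2 only for AOI defects
        result["operator_label"] = predict_operator_label(component)
        if result["operator_label"] == "Bad":   # task 3 only for confirmed defects
            result["repair_label"] = predict_repair_label(component)
    return result

# A component the (placeholder) task 1 model considers healthy stops at stage 1.
print(hierarchical_prediction({"aoi_defect": False}))
```

In the challenge itself the placeholders are XGBoost classifiers; the cascade is what makes the three predictions hierarchical.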

2. RELATED WORK

Several scientific articles already present machine learning applications for Surface Mount Technology production lines. (Richter, Streitferdt, & Rozova, 2017) proposes a convolutional neural network deep learning application working on the AOI system to automatically detect defects. In (Tavakolizadeh, Soto, Gyulai, & Beecks, 2017), some binary classifiers are tested to detect defects inside products using simulated production data. These data are simulated from several SMT lines and give a good classification score. In (Parviziomran, Cao, Yang, Park, & Won, 2019), a component shift prediction method is proposed to predict the shift of the pad during the reflow process. In (Park, Yoo, Kim, Lee, & Kim, 2020), SPI data are used to predict potential defects at an early stage; this work is based on a dual-level defect detection method. In (Jabbar et al., 2018), some tree-based machine learning methods are used to predict the defects found in AOI using SPI data. (Gaffet, Ribot, Chanthery, Roa, & Merle, 2021) proposes an unsupervised univariate method for monitoring the In-Circuit Testing machine (located at the end of the Surface Mount Technology lines) and its components. Another large topic of interest for this study is prognosis and health management at different levels: system level or sub-system level. In our case, we have to use information from the pin level to retrieve the health at the system level, which is the product component. This topic is linked to the decentralized diagnosis approach. (Zhang, 2010) proposes a decentralized model-based approach with a simulation example of automated highway systems. (Ferdowsi, Raja, & Jagannathan, 2012) proposes a decentralized fault diagnosis and prognosis methodology for large-scale systems adapted for aircraft, train, automobile, power plant and chemical plant applications. (Tamssaouet, Nguyen, Medjaher, & Orchard, 2021) proposes a component interaction-based method to provide the prognosis of a multi sub-system model.

3. DATA DESCRIPTION

Figure 1. Surface Mount Technologies Production Line (PHM Society, 2022).

The PHME provides the dataset used in this article as part of the 2022 conference data challenge (PHM Society, 2022). The dataset includes measurement information from two different steps of the PCB production (see Figure 1). The first step is the SPI, in which each solder pad is checked to verify its compliance and, accordingly, a sanction is generated depending on the quality of the solder (evaluated on several physical aspects). The second step is the AOI. In addition to checking the finished solder pads (their position, shape, etc.), this process also inspects the component itself, looking for defects like missing or misaligned components. The AOI inspection has two types of sanctions: the automatic sanction provided by the machine itself and the one given by an operator who verifies the first one in case of spotted defects.

Table 1. Summary characteristics of the datasets

Feature                SPI          AOI
Number of lines        5 985 382    31 617
Number of panels       1 924        1 924
Number of components   129          102
Figures/panel          {1,8}        {1,...,8}
Components/panel       {128,129}    {2,...,27}
Lines/panel            {3112,...}   {2,...,203}

Simple data exploration allows discovering anomalous entries and cleaning the datasets. A summary of the information found in each dataset is shown in Table 1. The cleaned SPI dataset is composed of 1921 panels, each with 3112 entries. A panel is an ensemble of 8 grouped PCBs, also called figures. Each figure is composed of 129 components. The component reference is found in the feature ComponentID. These components have several pins that can be identified using the PinNumber feature. The SPI test results provide the volume, area, height, size, shape and offset of each PadID as well as a final result flag. A PadID corresponds to a unique combination of {FigureID, ComponentID, PinNumber}. Each panel provided by the competition has at least one component detected as a defect by the AOI automatic sanction. Each line of the datasets describes a PadID of one electronic board.

For the AOI, each line corresponds to a unique entry of the set {PanelID, FigureID, MachineID, ComponentID, PinNumber}, where the PinNumber can be filled as NaN. In such cases, we believe the AOI does not inspect the solder paste but the component itself.

As introduced before, the challenge is divided into three tasks.

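The keying just described can be made concrete with a few toy rows. The values below are invented for illustration; only the key structure — a PadID as the combination {FigureID, ComponentID, PinNumber}, and AOI entries whose PinNumber is missing referring to the component itself — comes from the data description.

```python
# Toy illustration of the dataset keys (rows are invented; only the key
# structure follows the data description).
spi_rows = [
    {"PanelID": 1, "FigureID": 1, "ComponentID": "BC1", "PinNumber": 1, "Volume(%)": 98.0},
    {"PanelID": 1, "FigureID": 1, "ComponentID": "BC1", "PinNumber": 2, "Volume(%)": 55.0},
]
aoi_rows = [
    # PinNumber given: the AOI sanction concerns the solder paste of one pin.
    {"PanelID": 1, "FigureID": 1, "MachineID": "M1", "ComponentID": "BC1", "PinNumber": 2},
    # PinNumber missing (NaN): the sanction concerns the component itself.
    {"PanelID": 1, "FigureID": 1, "MachineID": "M1", "ComponentID": "BC1", "PinNumber": None},
]

def pad_id(row):
    """A PadID is the unique combination {FigureID, ComponentID, PinNumber}."""
    return (row["FigureID"], row["ComponentID"], row["PinNumber"])

# Pin-level AOI entries can be joined with the SPI on
# (PanelID, FigureID, ComponentID, PinNumber); component-level ones cannot.
pin_level = [r for r in aoi_rows if r["PinNumber"] is not None]
component_level = [r for r in aoi_rows if r["PinNumber"] is None]

spi_index = {(r["PanelID"],) + pad_id(r): r for r in spi_rows}
joined = [(r, spi_index[(r["PanelID"],) + pad_id(r)]) for r in pin_level]
print(len(joined), len(component_level))   # -> 1 1
```

This split of the AOI rows by presence of PinNumber is exactly the preprocessing step reused for task 2 below.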

Task 1 is to predict whether or not a component will be classified as defective by the AOI, using only the inputs provided by the SPI, i.e. the pad measurements. Task 2 is about predicting the operator's label using the SPI test results and the automatic defect classification provided by the AOI (AOILabel). Finally, task 3 concerns the reparation operation. Again, the objective is to predict whether the component detected as a defect by the operator can be repaired or not. For this task, the information used for the prediction is the SPI test result, the AOILabel and also the OperatorLabel.

4. METHODOLOGIES

4.1. Challenges

The exploratory analysis of the training data has revealed several issues that need to be tackled to correctly solve the different tasks:

1. Missing values: for three different PanelID, the proposed data missed some information in the SPI dataset. We choose to exclude the lines with no information.
2. Class imbalance: the number of components detected as defects by the AOI is much lower than the number of components from the SPI dataset. Similarly, the number of components really classified as a fault by the operator is much lower than the number of components classified as correct.
3. High cardinality of the categorical features: the categorical feature PadID has more than 1000 modes. Without any sort of variable encoding, classifiers are very difficult to use for such variables. PadID is already encoded in a sort, because the values are ordered by FigureID. In a way, the variable is encoded by component area.
4. High bias in continuous features: the continuous features such as volume, area, height, etc. are highly correlated to the categorical feature PadID.
5. Level of prediction: the prediction has to be done at the component level, whereas the available data are given at the PadID level. This leads to a lot of issues in creating the target for training and prediction. Indeed, for instance, it is unclear if the training target has to be created by component or by pad.
6. Different numbers of pins: the number of pins depends on the component. It is difficult to use all the pin results as input of a classifier for each component, as the number of pins and therefore of features varies. The generalisation of the training depending on the component is very difficult.

4.2. XGBoost algorithm and categorical feature encoding

For tabular data applications, Gradient Boosting Decision Trees (GBDT) are widely used, with XGBoost (Chen et al., 2015), CatBoost (Prokhorenkova, Gusev, Vorobev, Dorogush, & Gulin, 2018) and LightGBM (Ke et al., 2017) being the algorithms with the best results. Among these algorithms, we decide to use XGBoost, which is a scalable, parallel and distributed implementation of the original gradient boosting tree algorithm. GBDT is an ensemble model algorithm, i.e. it combines several decision trees to perform a better prediction than a single model. XGBoost uses, in particular, the idea of boosting: it uses a collection of weak models to generate a strong model. In practice, for XGBoost, the idea is to use a gradient descent algorithm over a cost function to iteratively generate and improve weak models. On each iteration, a new weak decision tree is generated based on the error residual of the previous weak model. The final prediction is a weighted sum of all the iterated weak trees. Among ensemble methods, boosting can minimize the model's bias. We propose one model for each task. In this section, these models as well as the used features are presented.

XGBoost is a very performant algorithm, although some caution has to be taken when categorical features are used. This is true for all tree-based or boosted tree methods. In particular, one-hot encoding can lead to very poor results when the categorical features have many levels. Indeed, a large number of levels leads to sparsity, as a new variable is created for each level. These new variables have only a small fraction of data points with the value 1 and the rest with the value 0, which is a problem for tree-based methods because tree splits search for the purest nodes. Indeed, a one-hot encoded variable is not very likely to lead to the purest nodes if it is very sparse. That is why the tree split will not be done using this one-hot encoded variable, even if the original categorical feature has a lot of importance for the prediction. In our case, other encoding techniques should be used.

First, a common technique is hash encoding. It is already present in our dataset with a numerical value for each PadID level. The PadID level values depend on the FigureID of the pad. Here the hash encoding is done with only one feature, but in general it could be encoded into more features. One of the most used hashing methods is described in (Yong-Xia & Ge, 2010).

The next approach is the frequency-based encoding method: it uses the frequency of the levels as the label value. If the frequency is linked to the target, it will help the prediction of the variable. For instance, in tasks 2 and 3, the frequency of one component is probably linked to the issues that can exist for each component. This encoding can be useful in that case.

Finally, the last type of encoding is label-based. The idea of label encoding is to replace each categorical value with the conditional probability of the class to be predicted knowing the categorical features. This can be done by several methods such as Leave One Out encoding and CatBoost encoding (Prokhorenkova et al., 2018). The main issue with this method is to learn the conditional probability without overfitting. It can be realized by not taking the observation into account in the learning of the probability for each observation, as in Leave One Out encoding. This can also be done more efficiently using CatBoost encoding. For this case, we found that the CatBoost encoding performs best. The frequency-based encoding and hash encoding have less success.

4.3. Solving the challenges

Task 1

The first task is the most challenging of all. The main difficulty arises from the fact that some defects are related to the pin, and others to the component itself. In our opinion, the most important question for each task is "Should we predict (model) by component or by pin?". For this first task, we decide to go for a per-pin prediction. This is mainly guided by the difficulty of generating coherent labels and features at the component level for this task. Indeed, only one pin can have an issue. It does not seem right to assign the same label to a component with only one pin detected as a defect by the AOI equipment and to another component with multiple defective pins. Moreover, the number of pins varies too much depending on the studied component. As a result, any aggregation of continuous variables will probably hide important information if only one pin has a defect. For instance, the aggregation with the mean of the Volume will not contain a lot of information if there are many pins and only one defective pin for the considered component.

For each tuple, we want to predict if the tuple is detected as a defect by the AOI equipment or not. Accordingly, the training target column is 1 if the tuple appears in the AOI dataset and 0 otherwise. It is worth noting that we are not considering as defective the tuples for which only PanelID, FigureID, ComponentID appear with PinNumber = NaN. Both categorical and continuous features are used as input. We use the following continuous variables: Volume(%), Area(%), OffsetX(%), OffsetY(%), Shape(um), PosX(mm), PosY(mm), SizeX, SizeY, which we simply call "numerical features" in the rest of the article. As a categorical value, we only keep ComponentID, which we encode using a CatBoost encoding method (Prokhorenkova et al., 2018).

Task 2

For this second task, we first split the AOI dataset into two parts according to whether or not PinNumber = NaN. In the case where PinNumber is specified, we can join the AOI and the SPI easily using the columns PanelID, FigureID, ComponentID, PinNumber in each dataset. From the joint dataset, we use the "numerical features" from the SPI test results as defined in task 1. For the categorical features, we keep three features: AOILabel, ComponentID and FigureID ComponentID, where the latter is the string concatenation of the variables with the same name. For these features, we use a CatBoost encoding based on the category_encoders Python library. We believe a better result can be achieved with deeper work on the optimization of the encoder hyper-parameters. Finally, we also create two new meta-features (not encoded), Count Pin Component and Count Pin Figure. These two variables count the number of pins detected as a defect by the AOI, respectively for the component and for the figure of the tuple PanelID, FigureID, ComponentID, PinNumber. The XGBoost classifier algorithm classes each tuple into the class "Bad" or "Good" of the OperatorLabel target column.

For the AOI defects without any associated PinNumber, we propose to also use an ensemble of categorical and continuous variables. We use the same categorical and meta features, while for the numerical features we use the mean values per component of the following variables: Volume(%), Area(%), OffsetX(%), OffsetY(%), to keep the information only at the PanelID, FigureID, ComponentID tuple level. Finally, we also use an XGBoost classifier algorithm to class each tuple PanelID, FigureID, ComponentID, PinNumber with PinNumber referenced as NaN, as described before.

For the final sanction (that must be given at the PanelID, FigureID, ComponentID three-tuple level), we use the following rule: if one of the four-tuple PanelID, FigureID, ComponentID, PinNumber entries is predicted as "Bad", then the associated three-tuple will also be considered as "Bad". If not, the label "Good" is assigned to the three-tuple.

Task 3

Task 3 is about the prediction of one categorical value presented in the AOI dataset: RepairLabel. This label can take two values, FalseScrap or NotPossibleToRepair. For this task, we tried the same approach as in task 2, predicting each four-tuple PanelID, FigureID, ComponentID, PinNumber and merging the results per component, but the approach was not successful. To improve the result, we choose to do the prediction for the PanelID, FigureID, ComponentID three-tuple directly. As input, we use the same method as for task 2, grouping the SPI values per three-tuple using the mean as an aggregation method for the "numerical features". The categorical variables used are ComponentID and FigureID ComponentID. As before, these variables were encoded using the CatBoost encoding method.

The meta-features generated are Count Pin Component, Count Pin Figure and Count Pin Panel: respectively, the number of pins detected as defects by the AOI per component, figure and panel. We also created one-hot encoded features from the AOILabel. For each labelled type of error, the associated feature has the value 1 if the three-tuple is detected as having this error by the AOI machine (in at least one pin) and 0 otherwise. To predict the class of each tuple, we use the XGBoost classifier.

4.4. Hyper-parameter tuning

The XGBoost model has been tuned using the Optuna python library (Akiba, Sano, Yanase, Ohta, & Koyama, 2019). This package is an optimization framework that searches for the best hyper-parameters in the space defined by the user. It uses distributed computation and early stopping to improve the speed of the solution. It is implemented with various optimization algorithms. In our case, we use the Tree-structured Parzen Estimator sampler (TPES) (Bergstra, Bardenet, Bengio, & Kégl, 2011). It is a sequential model-based optimization. As a Bayesian optimization algorithm, it computes a probability model of the optimization function and selects the best hyper-parameters according to this probability model and the real cost result. For each step, the tuning is done using the mean F1-score of a 4-fold cross-validation method. The dataset is split into four parts, and each of these parts is in turn the testing dataset while the others are used for training the model. For the sake of reproducibility, hyper-optimization techniques and training algorithms are available in (Gaffet, 2022). Our prediction can probably be easily improved with more iterations of the tuning phase, as we did not spend much time on it.

5. RESULTS AND DISCUSSION

The results obtained for the three tasks are detailed in this section.

5.1. Score

Table 2. F1-score for the three tasks with training and testing data sets

Dataset    Task 1    Task 2    Task 3    Score
Training   0.43      0.66      0.90      0.66
Testing    0.41      0.67      0.77      0.62

Table 2 shows the F1-score for each task of the challenge. The training and testing set results are close, showing good generalization capability. It seems that our models avoid overfitting issues, which are difficult to handle with this dataset. Indeed, the imbalance issue and the fact that some variables such as ComponentID have a lot of importance can lead to a large bias.

5.2. Feature importance

Figure 2. Task 1: feature importance of the XGBoost classifier

Figure 2 presents the feature importance of the classification model for task 1. The most important features are the component's position on the panel and the encoded ComponentID variable. Because almost 30 per cent of the defects found in the AOI dataset come from one component ("BC1"), this creates a bias in the model results that depends on ComponentID features.

Figure 3. Task 2: feature importance of the XGBoost classifier for the four-tuple (AOI defect is linked to a pin)

Figure 4. Task 2: feature importance of the XGBoost classifier for the three-tuple (AOI defect is not linked to a pin)

Figure 3 and Figure 4 present the feature importance for task 2 of the four-tuple and three-tuple classifiers respectively. The position and the ComponentID are also important for this task, but the continuous variables Volume(%), OffsetX(%), OffsetY(%) and Shape(um) also seem to have a great impact on the prediction. Even more, the generated meta-features Count Pin Component, representing the number of pins with a defect per component, and Count Pin Figure, representing the number of defective pins per figure, also have a large impact. Actually, these two features improve the F1-score for this task by almost 0.2.

Figure 5. Task 3: feature importance of the XGBoost classifier

Figure 5 shows the feature importance of the XGBoost classifier for task 3. Without surprise, given the previous results, ComponentID is shown as the feature with the biggest importance value. ComponentID could be thought of as a domain-specific key factor if, for example, depending on the type of component, the reparation is possible/allowed or not. For instance, if there is an issue with a microchip, this is far more difficult and costly to solve than an issue with a simple resistor. The following most important features are the meta-features, being in order of importance the number of defective pins per figure, then per component and finally per panel. This also seems correct, as the number of issues increases the potential damage to the product. The "Missing" AOILabel (one-hot encoded) also has a considerable impact. Maybe it is not possible to replace a missing component due to the oven operation.

5.3. SHAP values

The interpretation of machine learning models, often described as black-box models, is a really important topic in industry. Indeed, for the acceptance of a model, it is mandatory to explain the model decision to the process experts. The interpretation allows validating the model by comparing the model's and the experts' explanations of a phenomenon. It also allows recommending some repair actions to the experts. The SHAP (SHapley Additive exPlanation) interpretation (Lundberg & Lee, 2017) is based on game theory (Štrumbelj & Kononenko, 2014). The idea is to compute an interpretation value called the SHAP value, denoted \phi_j, for all variables and each sample of the dataset. The output of the model is described by the sum of the SHAP values \phi_j:

\phi_j = \sum_{S \subseteq J \setminus \{j\}} \frac{|S|!\,(M - |S| - 1)!}{M!} \left( f(S \cup \{j\}) - f(S) \right)    (1)

where M is the number of variables, J is the ensemble of variables, f the model output and j the variable index. SHAP is an additive method: the output of a classifier can be described as the cumulative sum of the impact of all variables.

Figure 6. Task 1: SHAP value of the XGBoost classifier

Figure 7. Task 2 for the four-tuple (AOI defect is linked to a pin): SHAP value of the XGBoost classifier

Figure 6, Figure 7, Figure 8 and Figure 9 present the computed impact of the features on the model output. For task 1, a


larger value of the CatBoost encoded variable ComponentID logically results in a larger model output (the encoding is done by replacing the level value with its conditional probability of the target). The variable SizeX does not seem to be connected to the model output. More interestingly, a larger value of the continuous variables Volume and Offset seems to be representative of a defect for task 1. For task 2, a low number of defective pins per component and per figure is more likely to lead to a "Good" OperatorLabel. This result was expected, as the more defective pins the AOI equipment finds, the more likely a real defect in the component is. The impact of continuous variables is very low for this task. For task 3, the output value 1 is associated with the "NotPossibleToRepair" label and the output value 0 with "FalseScrap". The SHAP values indicate that the higher the number of defective pins per component and per figure, the more likely the product is impossible to repair. If a component or a pin is missing, then the product also seems more likely to be impossible to repair. The same result is found for the "Unsoldered" AOILabel level. For the other levels of AOILabel, the results are unclear.

Figure 8. Task 2 for the three-tuple (AOI defect is not linked to a pin): SHAP value of the XGBoost classifier

Figure 9. Task 3: SHAP value of the XGBoost classifier

5.4. Issues with Task 1

In this subsection, the issues encountered in the prediction of task 1 are detailed. As previously reported, the results of the prediction of task 1 are highly biased by one component, "BC1". Table 3 and Table 4 present the confusion matrix results of the training of task 1.

Table 3. Confusion matrix for all training sets

                          Predicted Label
                          Defects    Not Defects
True labels  Defects      8263       23345
             Not Defects  4204       1937814

Table 4. Confusion matrix for the training set observations of the component "BC1"

                          Predicted Label
                          Defects    Not Defects
True labels  Defects      8256       1064
             Not Defects  4203       1922

It can be seen that most of the defects correctly detected by the classifier model of task 1 come from the results of the component "BC1". Only five other tuples are correctly classified as defects. Obviously, this is a concern for our method. Another observation is that the F1-score obtained from the results in Table 4 is 0.76, whereas the F1-score obtained considering all the tuples as defects is 0.75. Both scores are very close and, depending on the cross-validation random selection, the simple rule considering all the tuples of the "BC1" component as defects can be better than a more advanced classifier. To solve this issue, more advanced analyses are required. From our experience, it seems easier to predict the defects associated with the "Unsoldered" AOILabel than the others.
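The 0.76 and 0.75 F1-scores quoted above can be recomputed directly from the counts in Table 4; a minimal check in pure Python:

```python
# Recomputing the F1-scores discussed above from the Table 4 confusion matrix
# for component "BC1" (tp = true defects predicted as defects, etc.).

def f1_score(tp: int, fp: int, fn: int) -> float:
    """F1 = 2*TP / (2*TP + FP + FN), the harmonic mean of precision and recall."""
    return 2 * tp / (2 * tp + fp + fn)

# Table 4: classifier restricted to the "BC1" observations.
f1_classifier = f1_score(tp=8256, fp=4203, fn=1064)

# Simple rule "every 'BC1' tuple is a defect": all 8256+1064 true defects are
# caught (no false negatives) but all 4203+1922 healthy tuples become false positives.
f1_all_defects = f1_score(tp=8256 + 1064, fp=4203 + 1922, fn=0)

print(round(f1_classifier, 2), round(f1_all_defects, 2))   # -> 0.76 0.75
```

The margin over the trivial "always defect" rule is only about 0.005, which is why the cross-validation random selection can flip the ranking of the two approaches.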


AOILabel / Score Score Task 1 works (ijcnn) (pp. 1–7).


Coplanarity 0.00 Gaffet., A. (2022). Phme data contest 2022.
Translated 0.00
Soldered 0.00 https://github.com/alexandregft/PHME-data-contest.
Unsoldered 0.55 GitHub.
Size 0.00 Gaffet, A., Ribot, P., Chanthery, E., Roa, N. B., & Merle,
LeanSoldering 0.27 C. (2021). Data-driven capability-based health moni-
Misaligned 0.15 toring method for automative manufacturing. In Phm
Missing 0.00 society european conference (Vol. 6, pp. 12–12).
Jumper 0.00
Jabbar, E., Besse, P., Loubes, J.-M., Roa, N. B., Merle, C.,
Table 5. Task 1 score results for different AOILabel levels & Dettai, R. (2018). Supervised learning approach for
surface-mount device production. In International con-
ference on machine learning, optimization, and data
6. CONCLUSION AND FUTURE WORK

To solve the different challenges of Printed Circuit Board production, we proposed three different methods. These methods use the XGBoost classifier and are based on variable encoding and feature engineering. We have shown that generating meta-features representing the circuit state at different levels (component, figure and panel) improves the generalization of the classifiers. Among the encoding techniques tested for the categorical features, CatBoost encoding proved the most promising, due to the high-cardinality issues. The results of the first task are probably too biased to be used in production; however, the results of tasks 2 and 3 are promising for a production application.
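As an illustration of the meta-feature idea, the sketch below attaches to each tuple the mean of a raw measurement aggregated at the component and panel levels. It is a minimal sketch of how such aggregates can be built; the field names (panel, component, value) are hypothetical.

```python
# Hypothetical meta-features: per-group mean of a raw measurement,
# merged back onto every row at two aggregation levels.
from collections import defaultdict

rows = [
    {"panel": "P1", "component": "BC1", "value": 0.9},
    {"panel": "P1", "component": "BC1", "value": 1.1},
    {"panel": "P1", "component": "R42", "value": 0.4},
    {"panel": "P2", "component": "BC1", "value": 1.0},
]

def add_mean_meta_feature(rows, level):
    """Attach the mean of `value` over all rows sharing the same `level` key."""
    sums, counts = defaultdict(float), defaultdict(int)
    for r in rows:
        sums[r[level]] += r["value"]
        counts[r[level]] += 1
    for r in rows:
        r[f"mean_value_per_{level}"] = sums[r[level]] / counts[r[level]]
    return rows

for level in ("component", "panel"):
    add_mean_meta_feature(rows, level)
```

Each row then carries both its own measurement and the state of its surrounding component and panel, which is what lets a per-tuple classifier exploit circuit-level context.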
In future work, it could be interesting to add further information to improve the predictions. The time between operations, for instance, is missing (probably so as not to share cycle-time information). A product with a longer cycle time can be exposed to issues such as dust deposits or humidity changes, so this information could improve the performance on all tasks. Extra information such as the temperature curves of the oven could also help to better predict future issues. Finally, for task 1, unsupervised monitoring methods may be a better solution due to the class imbalance issue.
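On the encoding side, CatBoost-style encoding replaces each categorical value with target statistics computed in an ordered fashion, so that a row's own label never leaks into its encoding; this is what makes it robust on high-cardinality columns such as component references. The sketch below is a minimal illustration of that idea (simplified smoothing, hypothetical values), not the library's exact implementation.

```python
# Minimal sketch of ordered target (CatBoost-style) encoding for one
# high-cardinality categorical column. Values below are hypothetical.
def catboost_encode(categories, targets, prior=0.5):
    """Encode each value from target statistics of *preceding* rows only,
    so a row's own label is never used in its own encoding."""
    sums, counts, encoded = {}, {}, []
    for cat, y in zip(categories, targets):
        s, n = sums.get(cat, 0.0), counts.get(cat, 0)
        encoded.append((s + prior) / (n + 1))  # smoothed running mean
        sums[cat] = s + y
        counts[cat] = n + 1
    return encoded

cats = ["BC1", "BC1", "R42", "BC1", "R42"]
ys   = [1, 1, 0, 1, 0]
print(catboost_encode(cats, ys))
```

The first occurrence of each category receives only the prior; later occurrences converge toward the category's defect rate, keeping the encoding unbiased even for rarely seen references.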