Ipc2022-87060 - Estimating Measurement Performance With Truncated Data Sets
IPC 2022
September 26-30, 2022, Calgary, Canada
IPC2022-87060
1 Integral Engineering, Edmonton, Alberta, CA
2 ExxonMobil Research and Engineering, Spring, Texas, US
3 Esso Petroleum Company Limited, Hythe Terminal, Hampshire, UK
ABSTRACT
In September 2021, the API released the third edition of the 1163 Standard "In-line Inspection Systems Qualification". This edition brought many improvements over previous versions, including more detail in Section 8 "System Results Validation", which defines the methodologies used to validate ILI run tolerances. The standard describes three levels of validation, with 'Level 3' requiring the operator to calculate ILI tool measurement performance with real-world data measured in validation spools and excavation sites. Real-world inspection data sets have some characteristics that make them difficult to use to accurately estimate measurement performance, one of which is 'truncation', that is, data with a lower- or upper-bound threshold beyond which no data is reported. For example, most UTCD ILI tools have a lower truncation level, such as 1 mm for crack height, which represents a signal threshold below which measurements are either not reliable or not reported. Although small features below the reporting threshold exist on the pipeline, they are not normally reported by the ILI tool.

This paper describes a model to estimate ILI tool performance using API 1163 Level 3 methods when the data set has a lower-truncation threshold. The model is tested with simulation data to show how it responds over a wide range of feature population characteristics, and then applied to two real field data sets. Comparisons are made between the truncation algorithm and the standard non-truncated version of the algorithm, to show where the new algorithm performs best and is most useful to implement pipeline integrity mitigations. The model used in this study is consistent with the example documented in API 1163 - Appendix C, the Bayesian inference method. The results of the model produce measurement performance specifications that can be used as inputs in a pipeline risk or reliability analysis. The influence of truncated data sets is common in the field of inspection and NDE (including thickness measurements), as it reflects the reality that there are features below the reporting threshold. The steps required to format the results for use, and achieve more accurate measurement performance results (e.g., unity charts), are described in this paper.

Keywords: truncation, measurement error, measurement validation, ILI, field validation, API 1163

ABBREVIATIONS
API     American Petroleum Institute
Bc      back-wall corrosion
CI      credible interval
FS      front-surface
FSc     front-side corrosion
HBM     hierarchical Bayesian model
ILI     in-line inspection
MFL     magnetic flux leakage
NDE     non-destructive evaluation
NDT     non-destructive testing
OLS     ordinary least squares
SNR     signal-to-noise ratio
TR      truncated regression
UT      ultrasonic testing
UTCD    ultrasonic crack detection

NOMENCLATURE
$h$                              validation height
$L$                              truncation threshold
$N(0, \sigma_\varepsilon^2)$     normal probability distribution with mean 0 and standard deviation $\sigma_\varepsilon$
$X$                              exogenous variables including education
$Y$                              household income
$y$                              measured height
$\alpha$                         y-axis intercept
$\beta$                          slope of the line
∗ Corresponding author: [email protected]
Document version: Final, 1.0, 2022/04/23 .
1. INTRODUCTION
and because the wall loss signal can become contained within the larger front-surface (FS) echo. This type of truncation can occur for many types of ultrasonic measurements (both thickness and angle beam flaw inspection), other non-destructive testing (NDT) methods, and applications beyond pipelines. For ultrasonic thickness measurements, truncation is less likely to occur on flat parallel surfaces. However, loss of reliable signal measurements (i.e., truncation) is more likely when the wall loss is more severe in terms of remaining thickness and complex surface morphology.

In summary, truncation of UT thickness readings is less likely when the equipment has no degradation (e.g., smooth, flat and parallel surfaces). However, truncation becomes more likely as the level of corrosion and degradation increases (e.g., localized pitting, pin holes, and significant wall loss with a complex profile).

2. TRUNCATION ALGORITHM
In 1976, Hausman and Wise [3] published a study with the National Bureau of Economic Research on a social experiment called the New Jersey Income Maintenance Experiment. In this study, the researchers attempted to answer the question "will income maintenance programs which aid the working poor induce this group to work less? And if so, how much less?" Subjects were selected for the experiment based on their household income for the year prior to the study. To be accepted, they had to have a household income less than 1.5× the 1967 poverty threshold during the year prior to the experiment.

Once the experiment was running, there was no restriction on earnings. The experiment continued over several years, and incomes could increase while the subjects remained in the study. Due to the earnings cut-off applied during subject selection, the data set was upper-truncated: subjects initially with incomes higher than the threshold were excluded from the study. In wider society, these incomes existed, but they were truncated from the data set used for the study. Consider the plot of earnings vs education in Figure 3. The solid 'true line' represents the average relationship between education and earnings. The dots around the line represent the distribution around the mean for each level of education. The dashed 'estimated' line is the best fit line when all data above threshold $L$ are ignored. This underestimates the true effect of education because it ignores data above the threshold.

To properly account for the effect of education on income, a correction to go from the dashed line to the solid line was needed. A key insight into the solution is that truncation introduces a correlation between the right-hand variables and the error, leading to a predictable bias (see Figure 4). For any given value of education, the distribution of earnings is a distribution truncated at $L$, where $L$ depends on the level of education.

The authors developed an algorithm to 'counter' the effects of upper truncation on a data set. First, assume a linear relationship between the household income, $Y_i$, and participant education, $X_i$, as follows:

$$Y_i = X_i \beta + \varepsilon_i \tag{2}$$

The density function of $Y_i$ for a given value of $X_i \beta$ is zero
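For orientation, the standard textbook density of a normal linear model truncated from above at $L$ can be written as below. This is a sketch under the assumption that $\varepsilon_i \sim N(0, \sigma_\varepsilon^2)$, as listed in the Nomenclature; the symbols $\phi$ and $\Phi$ (standard normal density and cumulative distribution function) are introduced here for illustration and may not match the notation used by the authors.

```latex
f(Y_i \mid X_i) =
\begin{cases}
\dfrac{\dfrac{1}{\sigma_\varepsilon}\,
       \phi\!\left(\dfrac{Y_i - X_i\beta}{\sigma_\varepsilon}\right)}
      {\Phi\!\left(\dfrac{L - X_i\beta}{\sigma_\varepsilon}\right)},
  & Y_i \le L, \\[3ex]
0, & Y_i > L.
\end{cases}
```

For a lower-truncation threshold, such as an ILI reporting threshold, the same form applies with the denominator replaced by $1 - \Phi\!\left((L - X_i\beta)/\sigma_\varepsilon\right)$ and the density equal to zero for $Y_i < L$.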
FIGURE 3: NEW JERSEY INCOME EXPERIMENT (HAUSMAN, 1976) [plot not reproduced; horizontal axis: Education]
FIGURE 5: TRUNCATED REGRESSION WITH REAL DATA [plot not reproduced; horizontal axis: Feature Height (mm)]
3. APPLICATIONS
[Figure not reproduced: simulation examples with 80.0% CI; axes: ILI Height (mm) and Measured Height (mm). (a) example simulation with lower truncation (b) example simulation with upper censoring]
[Figure not reproduced: TR and OLS results for cases with intercept ≤ 0; axes: Ratio of Estimated Slope to True Slope, and Difference between Estimated Intercept and True Intercept]
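The contrast between OLS and TR on lower-truncated data can be illustrated with a small simulation. The following is a minimal sketch, not the authors' implementation: it uses maximum likelihood rather than the Bayesian inference method of API 1163 Appendix C, and the parameter values (true intercept 0, true slope 1, error standard deviation 1 mm, threshold 1 mm, uniformly distributed true heights), the function names, and the simple compass-search optimizer are all assumptions chosen for the example.

```python
import math
import random


def simulate_lower_truncated(n=2000, alpha=0.0, beta=1.0, sigma=1.0,
                             threshold=1.0, seed=1):
    """Simulate (true height, measured height) pairs and drop every pair
    whose measurement falls below the reporting threshold."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(n):
        h = rng.uniform(0.0, 5.0)                     # assumed true height, mm
        y = alpha + beta * h + rng.gauss(0.0, sigma)  # measured height, mm
        if y >= threshold:                            # lower truncation
            pairs.append((h, y))
    return pairs


def fit_ols(pairs):
    """Ordinary least squares; returns (intercept, slope, residual std)."""
    n = len(pairs)
    h_bar = sum(h for h, _ in pairs) / n
    y_bar = sum(y for _, y in pairs) / n
    sxx = sum((h - h_bar) ** 2 for h, _ in pairs)
    sxy = sum((h - h_bar) * (y - y_bar) for h, y in pairs)
    slope = sxy / sxx
    intercept = y_bar - slope * h_bar
    rss = sum((y - intercept - slope * h) ** 2 for h, y in pairs)
    return intercept, slope, math.sqrt(rss / (n - 2))


def neg_log_likelihood(params, pairs, threshold):
    """Negative log-likelihood of a normal linear model whose responses
    are only observed when y >= threshold (lower truncation)."""
    alpha, beta, log_sigma = params
    sigma = math.exp(log_sigma)
    total = 0.0
    for h, y in pairs:
        mu = alpha + beta * h
        z = (y - mu) / sigma
        log_pdf = -0.5 * z * z - log_sigma - 0.5 * math.log(2.0 * math.pi)
        # P(Y >= threshold | h): normal survival function, floored for safety
        survival = 0.5 * math.erfc((threshold - mu) / (sigma * math.sqrt(2.0)))
        total += log_pdf - math.log(max(survival, 1e-300))
    return -total


def fit_truncated(pairs, threshold, min_step=1e-3):
    """Truncated-regression fit by a simple compass (pattern) search,
    started from the OLS estimates. Returns (intercept, slope, sigma)."""
    h_bar = sum(h for h, _ in pairs) / len(pairs)
    centred = [(h - h_bar, y) for h, y in pairs]  # reduces parameter correlation
    a0, b0, s0 = fit_ols(centred)
    x = [a0, b0, math.log(s0)]
    best = neg_log_likelihood(x, centred, threshold)
    step, sweeps = 0.25, 0
    while step > min_step and sweeps < 2000:
        sweeps += 1
        improved = False
        for i in range(3):
            for delta in (step, -step):
                trial = list(x)
                trial[i] += delta
                value = neg_log_likelihood(trial, centred, threshold)
                if value < best:
                    x, best, improved = trial, value, True
                    break
        if not improved:
            step /= 2.0  # no better neighbour found: refine the search
    a, b, log_sigma = x
    return a - b * h_bar, b, math.exp(log_sigma)


data = simulate_lower_truncated()
ols_intercept, ols_slope, ols_sigma = fit_ols(data)
tr_intercept, tr_slope, tr_sigma = fit_truncated(data, threshold=1.0)
print(f"OLS: intercept={ols_intercept:.2f}, slope={ols_slope:.2f}")
print(f"TR : intercept={tr_intercept:.2f}, slope={tr_slope:.2f}")
```

In this set-up the OLS line is expected to be pulled toward a slope below 1 and an intercept above 0, because low measurements are missing at small true heights, while the TR fit accounts for the missing tail through the survival term in the likelihood.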
yields zero. This shows a similar trend to the slope recovery, with the TR method performing very well for simulations in which truncation removes significant portions of the dataset. The difference in performance between TR and OLS is more drastic for the intercept recovery, with OLS performing very poorly, particularly for the high error, low intercept cases.

5. CONCLUSION
Truncation is a phenomenon that exists in datasets measured with inspection tools. The purpose of data gathering is often to predict reliability, risk, measurement performance, and maintenance requirements. Without considering truncation, these estimates can be overly conservative, leading to inefficient use of the time and resources needed to maintain safety and reliability. The algorithms described in this paper can help account for truncation and increase the accuracy of model predictions.

Testing the TR method across a wide range of parameters illustrates that it performs very well for cases where the data has been truncated and reverts to performance similar to OLS for datasets with no truncation. In future work, additional testing will be performed to determine the performance of the TR method in these cases:

1. Cases in which the measurement error ($\varepsilon$) is not normally distributed. It is expected that performance will be worse since normality is an assumption of the model. However, the authors have plans to address this assumption, which may be published in future work.
2. Cases that more realistically represent ILI validation dig data. Typically, verification datasets are clustered in the bottom left of the unity plot since larger defects are much less common. This can be addressed by operators using a well-designed verification spool program (see Figure 5). In addition, real datasets typically contain fewer flaws, and validation spools can provide results more quickly, over a broader range of flaw sizes, and at lower cost compared to verification digs [7].

REFERENCES
[1] API. "API 1163: In-line Inspection Systems Qualification." Standard API 1163-2021. American Petroleum Institute. 2021.
[2] Fuller, Wayne A. Measurement Error Methods. Wiley series in probability and mathematical statistics, John Wiley & Sons, Inc.
[3] Hausman, Jerry A and Wise, David A. "The Evaluation of Results from Truncated Samples: The New Jersey Income Maintenance Experiment." Vol. 5 No. 4: p. 26.
[4] Al-Khafaji, Amir Wadi and Tooley, John R. Numerical Methods in Engineering Practice, 1st ed. Harcourt Brace Jovanovich, Inc (1986).
[5] Mathews, John H. and Fink, Curtis D. "Numerical Methods Using Matlab." Numerical Methods Using Matlab, 4th ed. Prentice-Hall Inc.
[6] Skow, Jason, Krynicki, Joseph W and Peng, Lujian. "Man-