Report IRA Group
Negin Alemazkoor
Graduate Research Assistant
Department of Civil and Environmental Engineering
University of Illinois at Urbana-Champaign
Newmark Lab, 205 N. Mathews Ave., Urbana, IL 61801
Hadi Meidani
Assistant Professor
Department of Civil and Environmental Engineering
University of Illinois at Urbana-Champaign
1211 Newmark Lab, 205 N. Mathews Ave., Urbana, IL 61801
Abstract
Track geometry defects can have critical implications for the safety of rail transportation. To make optimal
maintenance decisions and keep railroads safe and effective, it is necessary to analyze track geometry
defects and develop reliable defect deterioration models. This report supports our answer to the 2015 RAS
competition's problem on predicting the evolution of yellow tag geometric defects into conditions violating
the Federal Railroad Administration (FRA) track safety standards. In general, an infrastructure deterioration
process is unpredictable, since many factors that affect it cannot be captured by available data; hence,
probabilistic deterioration models that account for the stochastic nature of deterioration are superior to
deterministic curve-fitting models, and are the subject of our study. The training dataset provided by the
2015 RAS competition's organizing committee is used to develop probabilistic models for the lifetime of
yellow tag defects. The developed models are used to predict whether the yellow tag defects in the test data
will deteriorate into the red tag state within a given time interval.
Introduction
Like any other infrastructure system, railroads need maintenance. As railroads age, maintenance actions
become essential to ensure safe rides on the rails and minimize the probability of derailments. There are two
types of maintenance actions: (a) preventive maintenance actions and (b) corrective maintenance actions.
Individual defects with amplitudes greater than the threshold specified by the FRA track safety standards
(red tag defects) must be treated by corrective maintenance actions, as required by those standards. Yellow
tag defects are defects whose amplitudes exceed a particular railroad's own standards but remain below the
FRA standards. Understanding how yellow tag defects deteriorate into red tag defects allows better planning
of preventive maintenance actions, thereby reducing maintenance costs.
Different approaches can be used to analyze geometric defects, one of which is survival analysis. Survival
analysis is the analysis of the waiting time until a specific well-defined event happens. It has many
applications in different fields, such as biomedicine, engineering, and economics, where the system,
equipment, or organ of interest has two states: a functioning state and a failed state (1, 2). Survival analysis
aims to find, for a component, the probability distribution of its lifetime (the time spent in the functioning
state before failing) and also its survival function (the probability that failure has not occurred by a given time).
Track geometric defects can be analyzed using survival analysis, as geometric defects have two states
(yellow and red) and we seek to determine when a yellow tag becomes a red tag. The time a defect spends
in the yellow state can be considered the lifetime of the defect, and turning into a red tag can be the failure
event. The training dataset is used to find the survival functions of yellow tag defects. The survival
functions are then used to find the probability of failure for the yellow defects in the test dataset.
Methodology
Let us denote the yellow state by Y and the red state by R. The sequence of observed states for each
defect has the form Y, Y, Y, ..., R, ..., R. When a failure happens (a defect exceeds the FRA safety limits), i.e.,
a defect goes from the yellow state to the red state, the defect remains in the red condition until a corrective
maintenance action is performed. The time a defect stays in the yellow state depends on a number of
explanatory variables, such as the defect's amplitude, length, and track code. Our goal is to describe this
dependence mathematically. We treat the lifetime, T, as a continuous random variable with probability
density function f(t) and cumulative distribution function F(t) = P(T < t). F(t) gives the probability that
failure has happened by time t. The probability that failure has not happened by time t, i.e., the survival
probability, is 1 − F(t). Therefore, we define the survival function as the complement of the cumulative
distribution function:
S(t) = P(T > t) = 1 − ∫_0^t f(t′) dt′ = ∫_t^∞ f(t′) dt′    (1)
Using the concept of conditional probability, we can calculate the probability that failure happens between
time t and t + ∆t given that the failure has not happened by time t, denoted by h(t, ∆t):
h(t, ∆t) = P(t < T < t + ∆t | T > t) = [F(t + ∆t) − F(t)] / S(t)    (2)
The average rate of failure per unit of time is obtained by dividing h(t, ∆t) by ∆t, and the instantaneous
rate of failure, or hazard rate, denoted by z(t), is obtained in the limit of small ∆t:

z(t) = lim_{∆t→0} h(t, ∆t)/∆t = f(t)/S(t)    (3)

The hazard rate function gives fundamental information about the nature of the event being modeled.
When the hazard rate monotonically increases over time, the probability of failure increases with the time
spent without failure; when it monotonically decreases, the converse holds.
The simplest hazard function is a constant hazard function, which means that the probability of failure is
constant over time. A constant hazard function implies that the lifetime has an exponential density
function:

z(t) = λ = −d ln S(t)/dt    (4)

S(t) = exp(−λt) = ∫_t^∞ f(t′) dt′    (5)

f(t) = λ exp(−λt)    (6)
The exponential distribution has only one parameter, the scale parameter. The Weibull distribution is a
generalized form of the exponential distribution with an additional parameter, the shape parameter, which
makes the Weibull distribution extremely flexible and applicable to modeling lifetimes and hazard
occurrence (3, 4). The probability density function, survival function, and hazard function of the Weibull
distribution are as follows:
f(t) = λ^p p t^(p−1) exp(−(λt)^p)    (7)

S(t) = exp(−(λt)^p)    (8)

z(t) = f(t)/S(t) = λ^p p t^(p−1)    (9)
where p and λ are the shape and scale parameters, respectively. When p = 1, the Weibull distribution
reduces to the simple exponential distribution and the hazard rate takes the constant value λ. If p > 1, the
hazard function z(t) is monotonically increasing; if 0 < p < 1, it is monotonically decreasing. Figure 1 shows
the Weibull probability density for λ = 1 and different shape parameters.
We want the track geometric deterioration models to depend on the explanatory variables. However, the
Weibull hazard function does not account for the impact of the explanatory variables. To overcome this
limitation, an extension of the Weibull distribution is used, in which the scale parameter λ is itself a
function of the explanatory variables:

λ = exp(−βX)    (10)

f(t) = exp(−pβX) p t^(p−1) exp(−(exp(−βX) t)^p)    (11)

S(t) = exp(−(exp(−βX) t)^p)    (12)
where X is the column vector of explanatory variables and β is the row vector of coefficients to be
estimated.
Parameter Estimation
As discussed, the Weibull distribution is widely used to model the lifetimes of components. In this work, we
use the Weibull distribution to probabilistically model the time a defect spends in the yellow state before
exceeding the safety limits. We use the maximum likelihood method to estimate the parameters p and β in
Equations 11 and 12. For two consecutive reports of a repeated defect, we assume that the lifetime of the
yellow defect starts from the first report. This assumption is reasonable because we include the amplitude
of the defect as an explanatory variable. In other words, we want to know the lifetime of a defect after its
amplitude reached a certain value (the defect's amplitude in the first report). If the defect did not turn into
a red tag defect by the second report, we consider it a right-censored record, implying that we do not know
the lifetime of the defect, i.e., we only know that the defect survived until the second report. Even for
uncensored records, when the second inspection indicates a red tag, we do not know the exact failure time.
In this work, we assume the failure occurred midway between the two inspections.
Given m records, where n of them are uncensored with failure time t_i and m − n of them are censored
with survival time t_i, the likelihood function can be written as follows:

L = ∏_{i=1}^{n} f(β, p, X_i, t_i) × ∏_{i=n+1}^{m} S(β, p, X_i, t_i)    (13)

{p*, β*} = argmax_{p,β} [ Σ_{i=1}^{n} ln f(β, p, X_i, t_i) + Σ_{i=n+1}^{m} ln S(β, p, X_i, t_i) ]    (14)
Since Equation 14 includes non-linear functions of p and β, iterative methods are usually used to find the
values of p* and β* that maximize the log-likelihood (7, 8). Statistical packages such as R can be used to
carry out this optimization.
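To make Equation 14 concrete, a minimal sketch of the log-likelihood maximization in R with the
general-purpose optimizer optim(); the variable names are ours, and the survreg() fit shown later is the
more robust route in practice:

    # Negative log-likelihood of Equations 13-14 for the Weibull model of
    # Equations 10-12. t and event are length-n vectors, X is an n-by-k
    # matrix of explanatory variables.
    negloglik <- function(theta, t, event, X) {
      p    <- exp(theta[1])              # enforce p > 0
      beta <- theta[-1]
      lam  <- exp(-drop(X %*% beta))     # Equation 10
      logS <- -(lam * t)^p               # log of Equation 12
      logf <- log(p) + p * log(lam) + (p - 1) * log(t) + logS  # log of Eq. 11
      -sum(event * logf + (1 - event) * logS)
    }
    # Usage (hypothetical inputs):
    # est <- optim(rep(0, 1 + ncol(X)), negloglik, t = t, event = event, X = X)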
In this work, we used the statistical software R to find the parameters in Equations 11 and 12. In order to do
this, the input dataset should have the format shown in Table 1. The first column is the time; the second
column indicates whether the failure occurred within the time duration specified in the first column, i.e.,
one means the event occurred and zero means it did not. The remaining columns contain the explanatory
variables.

Table 1: Input data format (X1, X2, X3 denote explanatory variables)

Time  Event  X1  X2    X3
10    0      1   2.05  7
30    1      0   3.04  8
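The estimation itself can be carried out with survreg() from R's survival package. The sketch below fits
the model to synthetic data laid out as in Table 1 (the column names and simulated values are stand-ins,
not the competition data); for dist = "weibull", the Scale value that survreg() reports corresponds to 1/p in
Equations 7-12, which is how the shape parameter is listed in Tables 2-5:

    # Fit the Weibull lifetime model of Equations 10-12 to data in the
    # Table 1 format, using right-censoring information.
    library(survival)

    set.seed(1)
    n <- 200
    d <- data.frame(x1 = rbinom(n, 1, 0.5),     # e.g., a binary track code
                    x2 = runif(n, 1, 4),        # e.g., defect amplitude
                    x3 = sample(5:9, n, TRUE))  # e.g., defect length
    lifetime <- rweibull(n, shape = 1.2, scale = exp(3 - 0.3 * d$x2))
    horizon  <- runif(n, 0, 60)                 # time to next inspection
    d$time   <- pmin(lifetime, horizon)
    d$event  <- as.integer(lifetime <= horizon) # 1 = failure observed

    fit <- survreg(Surv(time, event) ~ x1 + x2 + x3,
                   data = d, dist = "weibull")
    summary(fit)   # coefficients, z-scores, and Scale (= 1/p)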
We need to generate a table with the format shown in Table 1 for each type of defect (surface, DIP, and
crosslevel) using the training dataset, since each type may have a very different deterioration process. The
data processing steps to generate these tables are described below.

In the training dataset, several defects are reported multiple times within a day. For those defects, the
average reported amplitude and length are calculated, and the multiple reports within a day are replaced
by a single report containing the average values.
The next step is identifying repeated defects. By definition, a defect of the same type found within 100 feet
on either side of a previously reported defect is considered a repeated defect. We used this definition to find
the repeated defects and sorted them by inspection date, i.e., from the earliest inspection to the latest.
Although we do not have any information about when and where maintenance actions were performed, we
are provided information about inspection runs, and we use it to make reasonable guesses about
maintenance actions. If, between two consecutive reports of a defect, there was an inspection run with no
reported defect, we assume a maintenance action was performed and consequently do not consider the
reports to describe a repeated defect. Additionally, if the absolute amplitude of the defect decreased from
one inspection to the next, or if the sign of the amplitude changed, we assume a maintenance action was
performed between the two inspections. We recognize that this might instead be a measurement error;
however, we are unable to distinguish measurement errors from cases where maintenance actions were
performed. Therefore, we prefer to be conservative and do not consider cases where the amplitude of the
defect decreased or its sign changed to be repeated defects.
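A minimal R sketch of this linking rule, under assumed column names (defect_type, milepost, amplitude)
and assuming locations are recorded in miles:

    # Decide whether a current report repeats a previous one. 100 feet is
    # 100/5280 miles. The clean-inspection-run check is omitted here for
    # brevity; it would veto the match as well.
    is_repeat <- function(prev, curr, tol_mi = 100 / 5280) {
      curr$defect_type == prev$defect_type &&
        abs(curr$milepost - prev$milepost) <= tol_mi &&
        abs(curr$amplitude) >= abs(prev$amplitude) &&  # no amplitude drop
        sign(curr$amplitude) == sign(prev$amplitude)   # no sign change
    }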
Once we have all the repeated defects sorted from the first to the last inspection, we can build a table in
the form of Table 1. The first column is the time between two consecutive inspections reporting a repeated
defect if the second inspection indicates a yellow tag; otherwise, it is half of the time between the two
inspections, i.e., we assume that the failure happened midway between them. The second column is zero if
the second inspection indicates a yellow tag, and one if it indicates a red tag, meaning that the failure has
occurred. We then include the initial absolute values of the defect's amplitude and length, track code, class
of track, operating speeds, and tonnage as explanatory variables in the table. The track code is a binary
variable: it is either tangent or curve (the dataset does not include the spiral case), and we assign zero to
the tangent code and one to the curve code. To indicate the class of the track, we define binary variables
class5, class4, and class3, whose values are one when the track belongs to class 5, 4, or 3, respectively.
When all three variables are zero, the track belongs to class 2, as the dataset does not include tracks
belonging to class 1.
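A small R sketch of this coding, with assumed raw columns track_code and track_class:

    raw <- data.frame(track_code  = c("tangent", "curve", "curve"),
                      track_class = c(5, 4, 2))
    raw$trackcode <- as.integer(raw$track_code == "curve")
    raw$class5    <- as.integer(raw$track_class == 5)
    raw$class4    <- as.integer(raw$track_class == 4)
    raw$class3    <- as.integer(raw$track_class == 3)
    # Class 2 is the baseline: all three class indicators equal zero.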
We are interested in including both traffic and tonnage as explanatory variables in the model; however, we
suspect that the traffic data suffer from substantial errors, as the majority of records imply a very small or
very large ratio of the number of cars to the number of trains, i.e., smaller than 10 or larger than 150.
Therefore, we discard the traffic data and include only the average tonnage load across the section where
the defect is located as an explanatory variable.
Results
DIP defects
The processed training dataset for DIP defects includes 540 records, of which 363 are censored. Table 2
shows the estimated parameters for the DIP defect lifetime model. As already discussed, Trackcode is a
binary variable, i.e., zero and one denote tangent and curve, respectively. The processed training dataset for
DIP defects only includes tracks from classes 5, 4, and 3. Therefore, two binary variables, class5 and class4,
are included in the model; when both are zero, the class of the track is 3.
Table 2: Parameter estimates for the DIP defect lifetime model (including all variables)

1/p: 0.85
It can be seen in Table 2 that the estimated value of p, the shape parameter of the Weibull distribution, is
greater than one (p = 1/0.85 ≈ 1.18). This means that the hazard rate is monotonically increasing, and the
probability of failure increases with the time spent without failure. The signs of the coefficients are
intuitively correct, i.e., an increase in a variable with a negative coefficient makes a shorter lifetime more
likely.
We want to investigate whether the coefficients are significantly different from zero. The z-score is a
measure of statistical significance that indicates whether or not to reject the null hypothesis that the
corresponding coefficient is zero. A z-score above 1.96 or below −1.96 indicates that the null hypothesis can
be rejected at the 5% significance level. As can be seen in Table 2, only four parameters are significant at
the 5% level. By removing the insignificant variables from the model, we obtain the parameter estimates
shown in Table 3.

Table 3: Parameter estimates for the DIP defect lifetime model (significant variables only)

1/p: 0.86
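With a survreg() fit such as the earlier sketch, this screening can be read off the summary table:

    # Keep only coefficients whose z-scores exceed 1.96 in absolute value.
    tab <- summary(fit)$table
    sig <- rownames(tab)[abs(tab[, "z"]) > 1.96]
    sig   # names of terms significant at the 5% level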
We designed a test to find out how discarding the insignificant variables affects the model's prediction
quality. We used half of the training data as a learning subset and the other half as a validation subset. We
used the learning subset to estimate the parameters of the survival function, and then predicted failures for
the validation subset based on the estimated parameters. The estimated parameters are used in Equation
15 to compute the failure probability:

P(failure) = 1 − S(t) = 1 − exp(−(exp(−βX) t)^p)    (15)
If the failure probability is greater than or equal to 0.5, we predict a failure; otherwise, we predict that the
yellow tag defect will not turn into a red tag defect. The model including all variables makes 71.5% correct
predictions, and the model including only the significant variables makes 71.1% correct predictions. Since
including all the variables does not improve the predictions substantially, we keep the model concise and
discard the insignificant variables.
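A minimal R sketch of this prediction step, continuing the earlier survreg() example (t_int is the prediction
interval length in the same time units used for fitting):

    # Probability that a yellow defect turns red within t_int (Equation 15),
    # followed by the 0.5 decision threshold.
    predict_failure <- function(fit, newdata, t_int) {
      lp <- predict(fit, newdata = newdata, type = "lp")  # beta * X
      p  <- 1 / fit$scale                                 # shape parameter
      p_fail <- 1 - exp(-(exp(-lp) * t_int)^p)
      ifelse(p_fail >= 0.5, "red", "yellow")
    }
    predict_failure(fit, d[1:3, ], t_int = 30)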
Crosslevel defects
The processed training dataset for crosslevel defects has 994 records: 540 uncensored and 454 censored.
Table 4 shows the estimated parameters for the crosslevel defect lifetime model. Only the parameters that
are significant at the 0.05 level are included in the model. The dataset for crosslevel defects includes very
few tracks from classes 3 or 2. Since there is no crosslevel defect from class 2 or 3 in the test data, we
deleted the class 2 and 3 records from the training dataset to get a more accurate model for classes 5 and 4.
Therefore, only one binary variable, class5, is included in the model; a zero value for class5 means the class
of the track is 4. The estimated value of p is greater than one, indicating a monotonically increasing hazard
rate.

Table 4: Parameter estimates for the crosslevel defect lifetime model

1/p: 0.86
Surface defects
The processed training dataset for surface defects includes 960 records, of which 275 are uncensored. Table
5 shows the estimated parameters for the surface defect lifetime model. Amplitude is the only parameter
significant at the 0.05 level. The dataset for surface defects mostly includes tracks from class 5 and very few
from class 4. Since all surface defects in the test dataset are from class 5, we deleted the class 4 records
from the training dataset to get a more accurate model for class 5. The value of p is estimated to be 1.25,
indicating that the hazard rate is monotonically increasing, which is intuitive.

Table 5: Parameter estimates for the surface defect lifetime model

1/p: 0.80
The parameters estimated in Tables 3, 4, and 5 are used in Equation 15 to find, for each yellow tag defect
in the test dataset, the probability of turning into a red tag within the given interval. If the failure
probability is greater than or equal to 0.5, we predict that failure will occur; if it is less than 0.5, we predict
that the yellow tag defect will not turn into a red tag defect.
Several defects in the test dataset have already exceeded the FRA safety limits and are considered red tag
defects. These defects will remain in the red tag state unless they undergo corrective maintenance actions.
We assumed that no maintenance action is performed during the given intervals for the test data; thus,
those red tag defects will remain in the red tag state through the end of the interval.
Conclusion
This report describes our approach to the 2015 RAS competition's problem on predicting the deterioration
of yellow tag geometric defects into red tag defects. Survival analysis is used to probabilistically model the
time the defects spend in the yellow state before turning red. The developed models are used to predict
whether the yellow tag defects in the test data will deteriorate into the red tag state within a given time
interval. An Excel file containing the predictions, along with the MATLAB code used to process the data,
accompanies this submission.
References
1. van Noortwijk, J. M., & Frangopol, D. M. (2004). Deterioration and maintenance models for insuring
safety of civil infrastructures at lowest life-cycle cost. In Life-Cycle Performance of Deteriorating Structures:
Assessment, Design and Management (pp. 384-391). ASCE.
3. Pinder, J. E., III, Wiener, J. G., & Smith, M. H. (1978). The Weibull distribution: A new method of
summarizing survivorship data. Ecology, 59(1), 175-179.
4. Nelson, W. B. (2005). Applied Life Data Analysis (Vol. 577). John Wiley & Sons.
5. Kleinbaum, D. G., & Klein, M. (2012). Parametric survival models. In Survival Analysis: A Self-Learning
Text (pp. 289-361). Springer.
6. Mishalani, R. G., & Madanat, S. M. (2002). Computation of infrastructure transition probabilities using
stochastic duration models. Journal of Infrastructure Systems, 8(4), 139-148.
7. Mauch, M., & Madanat, S. (2001). Semiparametric hazard rate models of reinforced concrete bridge deck
deterioration. Journal of Infrastructure Systems, 7(2), 49-57.
8. Aitkin, M., & Clayton, D. (1980). The fitting of exponential, Weibull and extreme value distributions to
complex censored survival data using GLIM. Applied Statistics, 29(2), 156-163.