Two Step Approach For Software Process Control: HLSRGM
s, that is,
$$P\{N(t+s) - N(t) = n\} = \frac{e^{-\lambda s}(\lambda s)^{n}}{n!}, \qquad n = 0, 1, 2, \ldots$$
Describing uncertainty about an infinite collection of random variables, one for each value of t, calls for a stochastic counting process, denoted by $\{N(t),\, t > 0\}$.
The process $\{N(t),\, t > 0\}$ is assumed to follow a Poisson distribution with characteristic MVF (Mean Value Function) m(t). Different models can be obtained by using different non-decreasing forms of m(t).
A Poisson process model for the number of software failures experienced in a given time interval (0, t) is given by the probability equation
$$P[N(t) = y] = \frac{e^{-m(t)}\,[m(t)]^{y}}{y!}, \qquad y = 0, 1, 2, \ldots,$$
where m(t) is a finite-valued, non-negative, non-decreasing function of t called the mean value function. Such a probability model for N(t) is said to be an NHPP (non-homogeneous Poisson process) model.
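As a quick illustration (not part of the original derivation), the sketch below evaluates this probability for a given mean value function; the exponential-type m(t) used here is an arbitrary placeholder, not one of the paper's models.

```python
import math
from scipy.stats import poisson

def nhpp_prob(y, t, m):
    """P[N(t) = y] = exp(-m(t)) * m(t)**y / y!  for an NHPP with MVF m."""
    return poisson.pmf(y, mu=m(t))

# Placeholder mean value function, used only for illustration.
m = lambda t: 10.0 * (1.0 - math.exp(-0.05 * t))

print(nhpp_prob(3, 20.0, m))  # probability of exactly 3 failures in (0, 20]
```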
2.2 Model description: HLSRGM
One simple class of finite-failure NHPP models is the HLSRGM, which assumes that the failure intensity is proportional to the number of faults remaining in the software and describes an exponential failure curve. It has two parameters: a, the expected total number of faults in the code, and b, the shape factor, defined as the rate at which the failure rate decreases. The
cumulative distribution function of the model is
$$F(t) = \frac{1 - e^{-bt}}{1 + e^{-bt}}.$$
The expected number of faults at time t is denoted by
$$m(t) = \frac{a\left(1 - e^{-bt}\right)}{1 + e^{-bt}}, \qquad a > 0,\; b > 0,\; t > 0,$$
where t can be calendar time (Krishna Mohan et al., 2012).
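A minimal sketch of these two functions in Python; the parameter values in the example call are placeholders rather than estimates from the paper.

```python
import math

def hlsrgm_cdf(t, b):
    """Half-logistic CDF: F(t) = (1 - exp(-b*t)) / (1 + exp(-b*t))."""
    return (1.0 - math.exp(-b * t)) / (1.0 + math.exp(-b * t))

def hlsrgm_mvf(t, a, b):
    """HLSRGM mean value function: m(t) = a * F(t)."""
    return a * hlsrgm_cdf(t, b)

# Illustrative values only.
print(hlsrgm_mvf(50.0, a=20.0, b=0.05))
```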
2.3 Parameter estimation methods
The main issue in the NHPP model is to determine an
appropriate mean value function to denote the expected
number of failures experienced up to a certain time point.
The method of least squares (LSE) and the method of maximum likelihood (MLE) have been suggested and are widely used for estimating the parameters of mathematical models (Kapur et al., 2008). Non-linear regression is a method of finding a
nonlinear model of the relationship between the
dependent variable and a set of independent variables.
Unlike traditional linear regression, which is restricted to
estimating linear models, nonlinear regression can
estimate models with arbitrary relationships between
independent and dependent variables. The model considered in this paper is non-linear, and it is difficult to find solutions for non-linear models using the simple least squares method. Therefore, the model has been transformed from non-linear to linear form.
The least squares method is widely used to estimate the numerical values of the parameters that fit a function to a set of data. We will use the method in the context of a linear regression problem. It exists in several variations: its simplest version is called Ordinary Least Squares (OLS), and a more sophisticated version is called Weighted Least Squares (WLS). More recent variations of the least squares method are Alternating Least Squares (ALS) and Partial Least Squares (PLS) (Lewis-Beck, 2003).
The standard approach of using derivatives is not always possible when estimating the parameters of a non-linear function with OLS. Therefore, iterative methods are often used: the best value of the estimate is found by searching in a stepwise fashion. These methods proceed by using, at each step, a linear approximation of the function and refining this approximation by successive corrections. The techniques involved include gradient descent and Gauss-Newton approximations. Neural networks constitute a popular recent application of these techniques.
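As one concrete way to carry out such an iterative non-linear fit (this is an illustration, not the estimation route used later in the paper), scipy's curve_fit, whose default Levenberg-Marquardt algorithm is a refinement of the Gauss-Newton idea, can fit the HLSRGM mean value function to cumulative failure counts. The cumulative times below are taken from the first few rows of Table 1 purely for illustration.

```python
import numpy as np
from scipy.optimize import curve_fit

def mvf(t, a, b):
    """HLSRGM mean value function m(t) = a(1 - exp(-b*t)) / (1 + exp(-b*t))."""
    return a * (1.0 - np.exp(-b * t)) / (1.0 + np.exp(-b * t))

# Cumulative failure times (first seven failures of Table 1) and cumulative counts.
t_obs = np.array([10.0, 19.0, 32.0, 43.0, 58.0, 70.0, 88.0])
n_obs = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0])

# Iterative non-linear least squares (Gauss-Newton / Levenberg-Marquardt family).
(a_hat, b_hat), _ = curve_fit(mvf, t_obs, n_obs, p0=[10.0, 0.1], maxfev=10000)
print(a_hat, b_hat)
```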
3. TWO STEP APPROACH FOR
PARAMETER ESTIMATION
MLE and LSE techniques are used to estimate the model parameters (Lyu, 1996; Musa et al., 1987). Sometimes the likelihood equations are difficult to solve explicitly; in such cases the parameters are estimated with numerical methods such as the Newton-Raphson method. On the other hand, LSE, like MLE, can be applied to small sample sizes and may provide better estimates (Huang and Kuo, 2002).
3.1 Algorithm for the 2-step approach.
o Consider the cumulative distribution function F(t) and equate it to $p_i$, i.e. $F(t_i) = p_i$, where $p_i = \frac{i}{n+1}$.
o Express the equation $F(t_i) = p_i$ in linear form, $y = mx + b$.
o Find the model parameters of the mean value function m(t), where $m(t) = a\,F(t)$:
 - The initial number of faults a is estimated through the MLE method, since it admits a closed-form solution.
 - The remaining parameter is estimated through the LSE regression approach (a sketch of the full pipeline is given below).
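A minimal Python sketch of this two-step pipeline, combining the regression estimate of b (Section 4.2) with the closed-form ML expression for a given b (Section 4.1); the inter-failure times are placeholders, and the resulting numbers need not reproduce those reported later in the paper.

```python
import numpy as np

def two_step_estimate(interfailure_times):
    """Two-step HLSRGM fit: b from the linearized LSE regression, a from MLE given b."""
    t = np.cumsum(np.asarray(interfailure_times, dtype=float))  # cumulative failure times
    n = len(t)

    # Step 1: linearize F(t_i) = p_i with p_i = i/(n+1) and regress to get b.
    p = np.arange(1, n + 1) / (n + 1.0)
    U = np.log((1.0 - p) / (1.0 + p))        # U_i = -b * t_i under the model
    beta = (np.sum(t * U) - n * t.mean() * U.mean()) / (np.sum(U**2) - n * U.mean()**2)
    b = -1.0 / beta

    # Step 2: closed-form ML estimate of a given b (Section 4.1).
    un = np.exp(-b * t[-1])
    a = n * (1.0 + un) / (1.0 - un)
    return a, b

print(two_step_estimate([10, 9, 13, 11, 15]))  # placeholder inter-failure times
```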
3.2 ML (Maximum Likelihood) Parameter Estimation
The idea behind maximum likelihood parameter
estimation is to determine the parameters that maximize
the probability of the sample data. The method of
maximum likelihood is considered to be more robust and
yields estimators with good statistical properties. In other
words, MLE methods are versatile and apply to many
models and to different types of data. Although the
methodology for MLE is simple, the implementation is
mathematically intense. Using today's computer power,
however, mathematical complexity is not a big obstacle. If
we conduct an experiment and obtain N independent observations $t_1, t_2, \ldots, t_N$, the likelihood function (Pham, 2003) may be given by the following product:
$$L(t_1, t_2, \ldots, t_N \mid \theta_1, \theta_2, \ldots, \theta_k) = \prod_{i=1}^{N} f(t_i; \theta_1, \theta_2, \ldots, \theta_k).$$
The likelihood function in terms of the intensity function $\lambda(t)$ is
$$L = e^{-m(t_n)} \prod_{i=1}^{n} \lambda(t_i).$$
The log-likelihood function for ungrouped data (Pham, 2006) is
$$\log L = \log\!\left(\prod_{i=1}^{n} \lambda(t_i)\, e^{-m(t_n)}\right) = \sum_{i=1}^{n} \log \lambda(t_i) - m(t_n).$$
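For concreteness, a small sketch (not taken from the paper) that evaluates this ungrouped-data log-likelihood for the HLSRGM, where the intensity $\lambda(t) = m'(t) = 2ab\,e^{-bt}/(1+e^{-bt})^{2}$ is obtained by differentiating the mean value function; the failure times and parameter values are placeholders.

```python
import numpy as np

def mvf(t, a, b):
    """m(t) = a(1 - exp(-b*t)) / (1 + exp(-b*t))."""
    return a * (1.0 - np.exp(-b * t)) / (1.0 + np.exp(-b * t))

def intensity(t, a, b):
    """lambda(t) = m'(t) = 2*a*b*exp(-b*t) / (1 + exp(-b*t))**2."""
    u = np.exp(-b * t)
    return 2.0 * a * b * u / (1.0 + u) ** 2

def log_likelihood(times, a, b):
    """Ungrouped-data NHPP log-likelihood: sum(log lambda(t_i)) - m(t_n)."""
    times = np.asarray(times, dtype=float)
    return np.sum(np.log(intensity(times, a, b))) - mvf(times[-1], a, b)

print(log_likelihood([10, 19, 32, 43, 58], a=18.0, b=0.1))  # placeholder data
```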
The maximum likelihood estimators (MLE) of $\theta_1, \theta_2, \ldots, \theta_k$ are obtained by maximizing L or $\Lambda$, where $\Lambda = \ln L$. By maximizing $\Lambda$, which is much easier to work with than L, the maximum likelihood estimators of $\theta_1, \theta_2, \ldots, \theta_k$ are the simultaneous solutions of the k equations
$$\frac{\partial \Lambda}{\partial \theta_j} = 0, \qquad j = 1, 2, \ldots, k.$$
The parameters a and b are estimated as follows. The parameter b is estimated by the iterative Newton-Raphson
method using
$$b_{n+1} = b_n - \frac{g(b_n)}{g'(b_n)},$$
which is then substituted to find a.
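A minimal sketch of this Newton-Raphson update in Python (the tolerance and iteration cap are arbitrary choices for illustration); the HLSRGM-specific g(b) and g'(b) are given in Section 4.1, and once b has converged, a follows from its closed-form ML expression.

```python
def newton_raphson(g, g_prime, b0, tol=1e-8, max_iter=100):
    """Iterate b_{n+1} = b_n - g(b_n)/g'(b_n) until the step size falls below tol."""
    b = b0
    for _ in range(max_iter):
        step = g(b) / g_prime(b)
        b -= step
        if abs(step) < tol:
            break
    return b
```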
3.3 LS (Least Square) parameter estimation
LSE is a popular technique, widely used in many fields for function fitting and parameter estimation (Liu,
2011). The least squares method finds values of the
parameters such that the sum of the squares of the
difference between the fitting function and the
experimental data is minimized. Least squares linear
regression is a method for predicting the value of a
dependent variable Y, based on the value of an
independent variable X.
o The Least Squares Regression Line
Linear regression finds the straight line, called the least squares regression line, that best represents the observations in a bivariate data set. Given a random sample of observations, the population regression line is estimated by
$$\hat{y} = bx + a,$$
where a is a constant, b is the regression coefficient, x is the value of the independent variable, and $\hat{y}$ is the predicted value of the dependent variable. The least squares method defines the estimates of these parameters as the values which minimize the sum of squares between the measurements and the model, which amounts to minimizing the expression
$$E = \sum_{i} \left(Y_i - \hat{Y}_i\right)^{2}$$
(Xie, 2001).
Taking the derivatives of E with respect to a and b and setting them to zero gives the following set of equations (called the normal equations):
$$\frac{\partial E}{\partial a} = 2Na + 2b\sum X_i - 2\sum Y_i = 0, \qquad \frac{\partial E}{\partial b} = 2b\sum X_i^{2} + 2a\sum X_i - 2\sum X_i Y_i = 0.$$
The least squares estimates of a and b are obtained by solving the above equations:
$$a = \bar{Y} - b\bar{X}, \qquad b = \frac{\sum_{i}\left(Y_i - \bar{Y}\right)\left(X_i - \bar{X}\right)}{\sum_{i}\left(X_i - \bar{X}\right)^{2}}.$$
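A compact Python sketch of these closed-form OLS estimates (the data in the example call are made up for illustration):

```python
import numpy as np

def ols_fit(x, y):
    """Return (a, b) minimizing sum((y - (a + b*x))**2) via the normal equations."""
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    b = np.sum((y - y.mean()) * (x - x.mean())) / np.sum((x - x.mean()) ** 2)
    a = y.mean() - b * x.mean()
    return a, b

a_hat, b_hat = ols_fit([1, 2, 3, 4], [2.1, 3.9, 6.2, 7.8])
print(a_hat, b_hat)
```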
4. ILLUSTRATING THE PARAMETER
ESTIMATION: HLSRGM
4.1 ML Estimation
Procedure to find parameter a using MLE.
The likelihood function of the HLSRGM is given as
$$L = e^{-m(t_n)}\prod_{i=1}^{n} \frac{2ab\,e^{-bt_i}}{\left(1 + e^{-bt_i}\right)^{2}}.$$
Taking the natural logarithm on both sides, the log-likelihood function is given as
$$\log L = n\log(2ab) - b\sum_{i=1}^{n} t_i - 2\sum_{i=1}^{n}\log\!\left(1 + e^{-bt_i}\right) - \frac{a\left(1 - e^{-bt_n}\right)}{1 + e^{-bt_n}}.$$
Taking the partial derivative with respect to a and equating it to 0 gives
$$a = \frac{n\left(1 + e^{-bt_n}\right)}{1 - e^{-bt_n}}.$$
Taking the partial derivative with respect to b and equating it to 0 gives
$$g(b) = \frac{n}{b} - \sum_{i=1}^{n} t_i + 2\sum_{i=1}^{n} \frac{t_i\,e^{-bt_i}}{1 + e^{-bt_i}} - \frac{2n\,t_n\,e^{-bt_n}}{\left(1 + e^{-bt_n}\right)\left(1 - e^{-bt_n}\right)} = 0.$$
Differentiating once more with respect to b, for use in the Newton-Raphson iteration, gives
$$g'(b) = -\frac{n}{b^{2}} - 2\sum_{i=1}^{n} \frac{t_i^{2}\,e^{-bt_i}}{\left(1 + e^{-bt_i}\right)^{2}} + \frac{2n\,t_n^{2}\,e^{-bt_n}\left(1 + e^{-2bt_n}\right)}{\left(1 - e^{-2bt_n}\right)^{2}}.$$
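The sketch below codes these expressions, together with the closed-form a given b; the starting value b0 is a placeholder (in practice a value close to the regression estimate of Section 4.2 is a natural choice), and the newton_raphson routine referred to in the final comment is the one sketched in Section 3.2.

```python
import numpy as np

def g(b, t):
    """g(b) from the HLSRGM log-likelihood; its root is the ML estimate of b."""
    t = np.asarray(t, dtype=float)
    n, tn = len(t), t[-1]
    u, un = np.exp(-b * t), np.exp(-b * t[-1])
    return (n / b - t.sum() + 2.0 * np.sum(t * u / (1.0 + u))
            - 2.0 * n * tn * un / ((1.0 + un) * (1.0 - un)))

def g_prime(b, t):
    """g'(b), the derivative used in the Newton-Raphson update."""
    t = np.asarray(t, dtype=float)
    n, tn = len(t), t[-1]
    u, un = np.exp(-b * t), np.exp(-b * t[-1])
    return (-n / b**2 - 2.0 * np.sum(t**2 * u / (1.0 + u) ** 2)
            + 2.0 * n * tn**2 * un * (1.0 + un**2) / (1.0 - un**2) ** 2)

def a_given_b(b, t):
    """Closed-form ML estimate a = n(1 + exp(-b*t_n)) / (1 - exp(-b*t_n))."""
    un = np.exp(-b * np.asarray(t, dtype=float)[-1])
    return len(t) * (1.0 + un) / (1.0 - un)

t_cum = np.cumsum([10, 9, 13, 11, 15, 12, 18, 15, 22, 25, 19, 30, 32, 25, 40])  # Table 1
b0 = 0.007  # hypothetical starting value
print(g(b0, t_cum), g_prime(b0, t_cum), a_given_b(b0, t_cum))
# b_hat = newton_raphson(lambda b: g(b, t_cum), lambda b: g_prime(b, t_cum), b0)
```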
4.2 LS Estimation
Procedure to find parameter b using regression
approach.
- The cumulative distribution function of the HLSRGM is
$$F(t) = \frac{1 - e^{-bt}}{1 + e^{-bt}}.$$
The c.d.f. is equated to $p_i$, where $p_i = \frac{i}{n+1}$.
- The equation $F(t_i) = p_i$ is expressed in the linear form $V_i = \beta U_i$, where $V_i = X_i$ (the observed failure time) and
$$U_i = \log\!\left(\frac{1 - p_i}{1 + p_i}\right);$$
solving $F(t_i) = p_i$ for $t_i$ shows that the slope is $\beta = -1/b$. The regression estimate of the slope is
$$\hat{\beta} = \frac{\sum_{i} V_i U_i - n\bar{V}\bar{U}}{\sum_{i} U_i^{2} - n\bar{U}^{2}},$$
from which b is obtained.
- Here $\hat{a}$ and $\hat{b}$ are the estimates of the parameters, and their values can be computed using the iterative method for the given time-between-failures data (Pham, 2006) shown in Tables 1 and 2. Using the a and b values we can compute m(t).
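Given estimates of a and b, the sketch below shows one way to compute the m(t) column and the successive differences reported in Tables 4 and 5; the a and b values used here are illustrative placeholders, so the printed numbers will not exactly match the tables.

```python
import numpy as np

def mvf_table(interfailure_times, a, b):
    """Cumulative failure times, m(t) values, and successive differences of m(t)."""
    t = np.cumsum(np.asarray(interfailure_times, dtype=float))
    m = a * (1.0 - np.exp(-b * t)) / (1.0 + np.exp(-b * t))
    return t, m, np.diff(m)

# Inter-failure times from Table 1 (IBM data); a and b are illustrative values only.
t, m, sds = mvf_table([10, 9, 13, 11, 15, 12, 18, 15, 22, 25, 19, 30, 32, 25, 40],
                      a=18.61, b=0.10)
print(np.round(m, 6))
print(np.round(sds, 6))
```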
Table 1. Time between failures of a software, IBM

No. of Error  Inter-failure time   No. of Error  Inter-failure time   No. of Error  Inter-failure time
1             10                   6             12                   11            19
2              9                   7             18                   12            30
3             13                   8             15                   13            32
4             11                   9             22                   14            25
5             15                   10            25                   15            40
Table 2. Time between failures of a software, NTDS

F.NO  TBF   F.NO  TBF   F.NO  TBF
1      9    10     7    19     6
2     12    11     1    20     1
3     11    12     6    21    11
4      4    13     1    22    33
5      7    14     9    23     7
6      2    15     4    24    91
7      5    16     1    25     2
8      8    17     3    26     1
9      5    18     3
Assuming an acceptable probability of false alarm of 0.27%, the control limits can be obtained as (Xie, 2002):
$$\frac{1 - e^{-bt_U}}{1 + e^{-bt_U}} = 0.99865, \qquad \frac{1 - e^{-bt_C}}{1 + e^{-bt_C}} = 0.5, \qquad \frac{1 - e^{-bt_L}}{1 + e^{-bt_L}} = 0.00135.$$
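A sketch of how these limits could be computed: each equation F(t) = q inverts in closed form as t_q = -(1/b) log((1 - q)/(1 + q)), and the corresponding mean-value limit is m(t_q) = a*q. The value of a below is close to the DS1 estimate implied by Table 3, while b is a placeholder.

```python
import math

def control_limits(a, b):
    """Time and mean-value control limits for the HLSRGM failure control chart."""
    levels = {"UCL": 0.99865, "CL": 0.5, "LCL": 0.00135}
    t_limits = {k: -math.log((1.0 - q) / (1.0 + q)) / b for k, q in levels.items()}
    m_limits = {k: a * q for k, q in levels.items()}  # m(t_q) = a * F(t_q) = a * q
    return t_limits, m_limits

print(control_limits(a=18.6107, b=0.1))
```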
These limits are converted to $m(t_U)$, $m(t_C)$ and $m(t_L)$ form. They are used to determine whether the software process is in control or not by placing the points on the failure control chart shown in Figures 1 and 2. A point below the control limit $m(t_L)$ indicates an alarming signal. A point above the control limit $m(t_U)$ indicates better quality. If the points fall within the control limits, the software process is in a stable condition. The values of the control limits are as follows.
Table 3: Parameter estimates and control limits

Data Set     UCL         CL          LCL
DS1 (IBM)    18.585601   9.3053625   0.0251245
DS2 (NTDS)   28.728418   14.383627   0.038836
Table 4. Successive differences of mean values: DS1

F.NO  CTBF  m(t)       SDs
1     10    8.994191   5.193557
2     19    14.187748  3.190792
3     32    17.378540  0.836914
4     43    18.215454  0.313307
5     58    18.528761  0.058803
6     70    18.587565  0.019688
7     88    18.607253  0.002758
8     103   18.610011  0.000644
9     125   18.610655  0.000065
10    150   18.610720  0.000004
11    169   18.610724  0.000001
12    199   18.610725  0.000000
13    231   18.610725  0.000000
14    256   18.610725  0.000000
15    296   18.610725
Table 5. Successive differences of mean values: DS2

F.NO  CTBF  m(t)       SDs
1     9     9.364585   9.547409
2     21    18.911995  5.079918
3     32    23.991912  1.160868
4     36    25.152781  1.421052
5     43    26.573833  0.295708
6     45    26.869540  0.580299
7     50    27.449840  0.587264
8     58    28.037104  0.226504
9     63    28.263607  0.204790
10    70    28.468398  0.021510
11    71    28.489908  0.100268
12    77    28.590176  0.012770
13    78    28.602946  0.080585
14    87    28.683532  0.021694
15    91    28.705225  0.004482
16    92    28.709707  0.011595
17    95    28.721302  0.009260
18    98    28.730563  0.013301
19    104   28.743863  0.001691
20    105   28.745554  0.012196
21    116   28.757750  0.008706
22    149   28.766456  0.000326
23    156   28.766782  0.000471
24    247   28.767253  0.000000
25    249   28.767254  0.000000
26    250   28.767254
Figure 1: Failure Control Chart, DS1 (successive differences of mean values against failure number, log scale; UCL = 18.585601, CL = 9.305363, LCL = 0.025124)
Figure 2: Failure Control Chart, DS2 (successive differences of mean values against failure number, log scale; UCL = 28.728418, CL = 14.383627, LCL = 0.038836)
Figures 1 and 2 are obtained by plotting the successive differences of the mean values, computed from the cumulative time-between-failures data of Tables 1 and 2, on the y-axis against the failure number on the x-axis, with the control limit values marked on the failure control chart. The failure control charts show that from the 6th failure of DS1 and the 10th failure of DS2 onwards, the data fall below $m(t_L)$, which signals the failure process. This is a significantly earlier detection of failures using the failure control chart. The software quality is assessed by detecting failures at an early stage.
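One possible way (not the authors' code) to reproduce such a chart with matplotlib, assuming the successive differences and control limits have already been computed as sketched earlier; only the first few non-zero DS1 differences from Table 4 are used, since zero values cannot be shown on a log scale.

```python
import numpy as np
import matplotlib.pyplot as plt

def failure_control_chart(successive_diffs, ucl, cl, lcl, title="Failure Control Chart"):
    """Plot successive differences of mean values on a log scale with control limits."""
    x = np.arange(1, len(successive_diffs) + 1)
    plt.semilogy(x, successive_diffs, "o-", label="Successive differences of m(t)")
    for name, value in [("UCL", ucl), ("CL", cl), ("LCL", lcl)]:
        plt.axhline(value, linestyle="--", label=f"{name} = {value}")
    plt.xlabel("Failure Number")
    plt.ylabel("Successive Differences of Mean Values")
    plt.title(title)
    plt.legend()
    plt.show()

sds_ds1 = [5.193557, 3.190792, 0.836914, 0.313307, 0.058803, 0.019688, 0.002758]
failure_control_chart(sds_ds1, ucl=18.585601, cl=9.305363, lcl=0.025124)
```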
6. CONCLUSION
The given inter-failure times are plotted through the estimated mean value function against the failure serial order. The parameter estimation is carried out by the two-step approach for the considered model. The graphs have shown out-of-control signals, i.e., points below the LCL. Hence we conclude that our method of estimation and the control chart give a positive recommendation for their use in identifying a preferable control process or a desirable out-of-control signal. By observing the failure control chart we identified that the failure situation is detected at the 6th point of Table 1 and the 10th point of Table 2 for the corresponding m(t), which is below $m(t_L)$. This indicates that the failure process is detected at an earlier stage compared with the Ramchand (2011) control chart, and then continued to fail. The early detection of software failures will improve software reliability. When the time between failures is less than the LCL, it is likely that there are assignable causes leading to significant process deterioration, and this should be investigated. On the other hand, when the time between failures exceeds the UCL, there are probably causes that have led to significant improvement. From Figures 1 and 2, the process is stabilized by approaching the X-axis. Since the aim of SPC is to stabilize the process at some point of time, the two-step approach is preferable.
REFERENCES
[1] Kimura, M., Yamada, S., Osaki, S., 1995.
Statistical Software reliability prediction and its
applicability based on mean time between failures.
Mathematical and Computer Modeling Volume 22,
Issues 10-12, Pages 149-155.
[2] Koutras, M.V., Bersimis, S., Maravelakis, P.E., 2007. Statistical process control using Shewhart control charts with supplementary runs rules. Springer Science + Business Media, 9:207-224.
[3] MacGregor, J.F., Kourti, T., 1995. Statistical
process control of multivariate processes. Control
Engineering Practice Volume 3, Issue 3, March
1995, Pages 403-414 .
[4] Musa, J.D., Iannino, A., Okumoto, K., 1987. Software Reliability: Measurement, Prediction, Application. McGraw-Hill, New York.
[5] Ohba, M., 1984. Software reliability analysis
model. IBM J. Res. Develop. 28, 428-443.
[6] Pham. H., 1993. Software reliability assessment:
Imperfect debugging and multiple failure types in
software development. EG&G-RAAM-10737; Idaho
National Engineering Laboratory.
[7] Pham. H., 2003. Handbook Of Reliability
Engineering, Springer.
[8] Pham. H., 2006. System software reliability,
Springer.
[9] Gokhale, S.S and Trivedi, K.S., 1998. Log-Logistic
Software Reliability Growth Model. The 3rd IEEE
International Symposium on High-Assurance
Systems Engineering. IEEE Computer Society.
[10] Xie, M., Goh, T.N., Ranjan, P., (2002). Some effective control chart procedures for reliability monitoring. Reliability Engineering and System Safety, 77, 143-150.
[11] Goel, A. L. and Okumoto, K., 1979, Time-
dependent error-detection rate model for software
reliability and other performance measures, IEEE
Transactions on Reliability, vol. 28, pp. 206-211.
[12] Huang, C.Y and Kuo, S.Y., (2002). Analysis of
incorporating logistic testing effort function into
software reliability modelling, IEEE Transactions
on Reliability, Vol.51, No. 3, pp. 261-270.
[13] Kapur, P.K., Gupta, D., Gupta, A. and Jha, P.C., (2008). Effect of Introduction of Fault and Imperfect Debugging on Release Time, Ratio Mathematica, 18, pp. 62-90.
[14] Krishna Mohan, G., Srinivasa Rao, B. and Satya
Prasad, R. (2012). A Comparative study of Software
Reliability models using SPC on ungrouped data,
International Journal of Advanced Research in
Computer Science and Software Engineering.
Volume 2, Issue 2, February.
[15] Lewis-Beck, M., Bryman, A. and Futing, T. (Eds)
(2003). Least Squares, Encyclopedia of Social
Sciences Research Methods. Thousand Oaks
(CA):Sage.
[16] Liu, J., (2011). Function based Nonlinear Least
Squares and application to Jelinski-Moranda
Software Reliability Model, stat. ME, 25th August.
[17] Lyu, M.R., (1996). Handbook of Software Reliability Engineering, McGraw-Hill, New York.
[18] Ramchand, K.H., (2011). Assessing Software Reliability Using SPC, Ph.D. thesis, Acharya Nagarjuna University.
[19] Xie, M., Yang, B. and Gaudoin, O. (2001).
Regression goodness-of-fit Test for Software
Reliability Model Validation, ISSRE and Chillarege
Corp.
Authors:
Dr. R. Satya Prasad received the Ph.D. degree in Computer Science in the faculty of Engineering in 2007 from Acharya Nagarjuna University, Andhra Pradesh. He received a gold medal from Acharya Nagarjuna University for his outstanding performance in the Master's degree. He is currently working as Associate Professor and H.O.D. in the Department of Computer Science & Engineering, Acharya Nagarjuna University. His current research is focused on Software Engineering. He has published 45 papers in national and international journals.
Miss Shaheen is working as Assistant Professor at the Institute of Public Enterprise (IPE), Osmania University Campus. She holds a Master's degree in Computer Science. She was associated with INFOSYS, Gachibowli, Hyderabad and Genpact, Uppal as an Academic Consultant training fresh recruits on technical subjects. She is currently pursuing a Ph.D. in Computer Science from Acharya Nagarjuna University.
Mr. G. Krishna Mohan is working as a Reader in the Department of Computer Science, P.B. Siddhartha College, Vijayawada. He obtained his M.C.A. degree from Acharya Nagarjuna University in 2000, M.Tech. from JNTU, Kakinada, M.Phil. from Madurai Kamaraj University, and is pursuing a Ph.D. at Acharya Nagarjuna University. His research interests lie in Data Mining and Software Engineering. He has 13 years of teaching experience. He qualified APSLET in 2012. He has published 15 papers in national and international journals.