
Project Costing With The Triangular Distribution and Moment Matching

1) The document describes a method for estimating project costs using triangular distributions to model activity costs and moment matching to model the overall project cost distribution. 2) It calculates the moments of a sample project with three activities and fits these to a two-component Gaussian sum distribution. 3) A Monte Carlo simulation is used to validate the fitted distribution, showing a good match between the cumulative distributions.

Uploaded by

Hamit Aydın
Copyright
© Attribution Non-Commercial (BY-NC)

MHF Journal

Issue3

Project Costing with the Triangular Distribution and Moment Matching


Ron Larham
Hart Plain Institute for Studies

Introduction

When estimating the cost to complete a project it is normal to break the project down into a number of activities and estimate the cost of each activity. The estimate for the cost of an activity is usually specified as a three-point estimate comprising an optimistic, a most likely and a worst-case cost. The three-point estimate is then turned into a convenient probability distribution; two common choices are the beta and triangular distributions. The total project cost is then the sum of the random variables representing the costs of the individual activities, which are assumed to be independent.

Since we need not just the mean (expected) cost of the project but also the standard deviation and the more important quantiles, a Monte-Carlo simulation of the project cost is usually run and the additional statistics are extracted from the simulation results. This is what the @Risk add-in for Excel, one of the more common tools, does.

In this note a method is described for obtaining the quartiles of the project cost analytically from the activity cost distributions. This is not an exact method but an approximation that is usable for project estimating, where the raw data is pretty ropey to start with. The method represents the project cost distribution as a Gaussian sum that matches the first N moments of the cost distribution, which are derived from the corresponding moments of the activity cost distributions. Here we will use the first five moments and fit a Gaussian sum with two components (which has five degrees of freedom, so with luck it should give an exact moment match).

The Triangular Distribution

The estimate of the cost or duration of an activity is characterized by three numbers: the best case, the most likely and the worst-case cost. We know little more than this about the distribution of cost for this activity.
This leaves us free to select a pdf that can represent a distribution characterized in this manner. This could be done with a beta distribution, but here I will use a triangular distribution (most of what we need to know about the triangular distribution is in chapter 1 of [1]). The triangular distribution is chosen because it is tractable (its properties are easily derived or looked up) and is commonly used for this sort of work (as is the beta, but for our purposes the triangular is easier to use). The pdf of the triangular distribution with mode $m$ and support $[a, b]$ (the set over which the pdf is non-zero) is:

$$
f_{a,m,b}(x) =
\begin{cases}
\dfrac{2(x-a)}{(b-a)(m-a)}, & a \le x \le m \\[2ex]
\dfrac{2(b-x)}{(b-a)(b-m)}, & m \le x \le b \\[2ex]
0, & \text{otherwise}
\end{cases}
$$

and the CDF is:

$$
F_{a,m,b}(x) =
\begin{cases}
0, & x \le a \\[1ex]
\dfrac{(x-a)^2}{(b-a)(m-a)}, & a \le x \le m \\[2ex]
1 - \dfrac{(b-x)^2}{(b-a)(b-m)}, & m \le x \le b \\[2ex]
1, & x \ge b
\end{cases}
$$

These are shown in figures 1 and 2.

Figure 1: Plot of the PDF of a Triangular Distribution

Figure 2: Plot of the CDF of a Triangular Distribution
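The pdf and CDF of the triangular distribution translate directly into code. The following is a minimal Python sketch, not taken from the original note; the function names `tri_pdf` and `tri_cdf` are illustrative, and the sketch assumes a < m < b so that neither denominator vanishes.

```python
def tri_pdf(x, a, m, b):
    """Density of the triangular distribution on [a, b] with mode m (assumes a < m < b)."""
    if a <= x <= m:
        return 2 * (x - a) / ((b - a) * (m - a))
    if m < x <= b:
        return 2 * (b - x) / ((b - a) * (b - m))
    return 0.0


def tri_cdf(x, a, m, b):
    """Cumulative distribution function of the same distribution (assumes a < m < b)."""
    if x <= a:
        return 0.0
    if x <= m:
        return (x - a) ** 2 / ((b - a) * (m - a))
    if x < b:
        return 1 - (b - x) ** 2 / ((b - a) * (b - m))
    return 1.0


# Illustrative three-point estimate: optimistic 10, most likely 12, worst case 50.
a, m, b = 10.0, 12.0, 50.0
print(tri_pdf(m, a, m, b))  # height of the peak, 2 / (b - a)
print(tri_cdf(m, a, m, b))  # probability of coming in below the mode, (m - a) / (b - a)
```

Two quick sanity checks are that the density peaks at 2/(b - a) and that the CDF at the mode equals (m - a)/(b - a).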


For what follows we will need five moments. The first is the mean $\mu$; the remaining moments are all central moments: the second is the variance $\sigma^2$, and the remaining three, denoted $\mu_3$, $\mu_4$ and $\mu_5$, are the third, fourth and fifth central moments. Of these we compute the first directly for the given distribution; for the remainder we do the calculations for the corresponding standardized triangular distribution with support $[0, 1]$ and mode $\theta = (m-a)/(b-a)$, and then multiply by the appropriate power of $(b-a)$ to get the corresponding moment for the original distribution [1]. Note that the third central moment here differs from that in [1], as there is a typo in the reference and what is given below is correct. The fifth moment is not given explicitly in the reference, but it can be obtained from what is said there, or by direct integration using the definition of the central moments and the pdf of the distribution.

$$
\begin{aligned}
\mu &= \frac{a+m+b}{3} \\
\sigma^2 &= (b-a)^2 \, \frac{1-\theta+\theta^2}{18} \\
\mu_3 &= (b-a)^3 \, \frac{(1-2\theta)(2-\theta)(1+\theta)}{270} \\
\mu_4 &= (b-a)^4 \, \frac{(1-\theta+\theta^2)^2}{135} \\
\mu_5 &= (b-a)^5 \, \frac{2(1-2\theta)(2-\theta)(1+\theta)(1-\theta+\theta^2)}{1701}
\end{aligned}
$$
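Since the text notes that these moments can also be obtained by direct integration, one way to guard against transcription slips is to compare the closed forms with a numerical integral. The Python sketch below is an illustration only: the function names are hypothetical, it assumes a < m < b, and it uses a simple midpoint rule rather than anything from the original note.

```python
def triangular_moments(a, m, b):
    """Mean and 2nd..5th central moments from the closed-form expressions."""
    w = b - a
    t = (m - a) / w                      # mode of the standardized distribution on [0, 1]
    q = 1 - t + t * t
    s = (1 - 2 * t) * (2 - t) * (1 + t)
    return ((a + m + b) / 3,
            w ** 2 * q / 18,
            w ** 3 * s / 270,
            w ** 4 * q ** 2 / 135,
            w ** 5 * 2 * s * q / 1701)


def numeric_central_moment(a, m, b, k, n=20_000):
    """k-th central moment by midpoint-rule integration of (x - mean)^k f(x) over [a, b]."""
    mean = (a + m + b) / 3
    h = (b - a) / n
    total = 0.0
    for i in range(n):
        x = a + (i + 0.5) * h
        if x <= m:
            f = 2 * (x - a) / ((b - a) * (m - a))
        else:
            f = 2 * (b - x) / ((b - a) * (b - m))
        total += (x - mean) ** k * f * h
    return total


# Compare the two routes for an illustrative estimate (20, 21, 30).
closed = triangular_moments(20.0, 21.0, 30.0)
for k in range(2, 6):
    print(k, closed[k - 1], numeric_central_moment(20.0, 21.0, 30.0, k))
```

The two columns should agree to several significant figures; a mismatch in one row would point at the corresponding closed-form expression.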

The Central Moments of a Sum of Random Variables

A project consists of a number of activities, each with a cost that is assumed to have a triangular distribution whose three values (the optimistic, most likely and worst-case estimates) are given by experts with experience of similar activities. The costs are all assumed to be independent random variables (RVs), so the mean of the project cost is the sum of the activity means and its variance is the sum of the activity variances. But what about the third through fifth central moments? The third central moment of the sum of two independent RVs is:

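$$
\begin{aligned}
\mu_{\mathrm{sum},3} &= E\left[(x_1 + x_2 - \mu_1 - \mu_2)^3\right] \\
&= E\left[(x_1-\mu_1)^3 + 3(x_1-\mu_1)^2(x_2-\mu_2) + 3(x_1-\mu_1)(x_2-\mu_2)^2 + (x_2-\mu_2)^3\right] \\
&= \mu_{1,3} + \mu_{2,3}
\end{aligned}
$$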
where $\mu_i$ and $\sigma_i^2$ denote the mean and variance of the $i$th RV, and $\mu_{i,j}$ denotes the $j$th central moment of the $i$th RV. Note that the terms that are linear in $x_i - \mu_i$ all disappear, because the expectation of this difference is zero and the RVs are by assumption independent. The above formula for the third central moment of the sum of two RVs can be extended to the sum of an arbitrary number of RVs, but that is unnecessary for our purposes: we are going to do the calculations in a spreadsheet or in a program of some other kind, so we can work in a running manner, only ever combining two RVs at a time.

Here we see that, like the mean and variance, the third central moment of the sum of two independent random variables is the sum of the third central moments. This does not happen for the fourth and fifth central moments of a sum. The equivalent formulas for the fourth and fifth central moments are:

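$$
\begin{aligned}
\mu_{\mathrm{sum},4} &= \mu_{1,4} + 6\sigma_1^2\sigma_2^2 + \mu_{2,4} \\
\mu_{\mathrm{sum},5} &= \mu_{1,5} + 10\mu_{1,3}\sigma_2^2 + 10\sigma_1^2\mu_{2,3} + \mu_{2,5}
\end{aligned}
$$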
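For readers who want to see where the cross terms come from, here is a short derivation sketch. The shorthand $u = x_1 - \mu_1$ and $v = x_2 - \mu_2$ is introduced only for this illustration; $u$ and $v$ are independent with $E[u] = E[v] = 0$, so every term containing a lone $u$ or $v$ vanishes and the expectation of each remaining product factorizes.

$$
\begin{aligned}
E\left[(u+v)^4\right] &= E[u^4] + 4E[u^3]E[v] + 6E[u^2]E[v^2] + 4E[u]E[v^3] + E[v^4] \\
&= \mu_{1,4} + 6\sigma_1^2\sigma_2^2 + \mu_{2,4} \\
E\left[(u+v)^5\right] &= E[u^5] + 5E[u^4]E[v] + 10E[u^3]E[v^2] + 10E[u^2]E[v^3] + 5E[u]E[v^4] + E[v^5] \\
&= \mu_{1,5} + 10\mu_{1,3}\sigma_2^2 + 10\sigma_1^2\mu_{2,3} + \mu_{2,5}
\end{aligned}
$$

The coefficients 6 and 10 are simply the binomial coefficients of the surviving terms.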

Project Cost Model

We model a project as a number of activities, with the cost of each represented by a triangular distribution summarizing the expectations of the estimator. We can use the combination formulas developed above to calculate the mean, standard deviation, skew and kurtosis, and the standardized fifth central moment of the project cost. We can also run a Monte-Carlo simulation (where we generate a sample of project costs by sampling the cost of each activity from its distribution and adding them to give instances of the project cost). Finally, we can fit the calculated moments of the project cost distribution to a distribution whose density is modeled as a sum of two Gaussians. We will do all of the above in what follows.

             Triangular Dist Parameters
             a        m        b
activity 1   5.00     7.00     9.00
activity 2   20.00    21.00    30.00
activity 3   10.00    12.00    50.00

Table 1: Activity distribution parameters for a toy project
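The running combination of activities described earlier can be sketched in a few lines of Python in place of a spreadsheet. This is an illustration under the note's independence assumption, not the author's own worksheet; the function names `triangular_moments` and `combine` are hypothetical.

```python
from math import sqrt


def triangular_moments(a, m, b):
    """(mean, variance, mu3, mu4, mu5) of a triangular distribution, closed form."""
    w = b - a
    t = (m - a) / w
    q = 1 - t + t * t
    s = (1 - 2 * t) * (2 - t) * (1 + t)
    return ((a + m + b) / 3, w ** 2 * q / 18, w ** 3 * s / 270,
            w ** 4 * q ** 2 / 135, w ** 5 * 2 * s * q / 1701)


def combine(p, q):
    """Moments of the sum of two independent RVs, each given as (mean, var, mu3, mu4, mu5)."""
    m1, v1, c1, k1, f1 = p
    m2, v2, c2, k2, f2 = q
    return (m1 + m2,
            v1 + v2,
            c1 + c2,
            k1 + 6 * v1 * v2 + k2,
            f1 + 10 * c1 * v2 + 10 * v1 * c2 + f2)


# (a, m, b) for the three activities of the toy project in Table 1.
activities = [(5.0, 7.0, 9.0), (20.0, 21.0, 30.0), (10.0, 12.0, 50.0)]

total = triangular_moments(*activities[0])
for act in activities[1:]:
    total = combine(total, triangular_moments(*act))  # only ever two RVs at a time

mean, var, mu3, mu4, mu5 = total
sd = sqrt(var)
skew, kurt, fifth = mu3 / sd ** 3, mu4 / sd ** 4, mu5 / sd ** 5
print(f"mean={mean:.2f} sd={sd:.2f} skew={skew:.2f} kurtosis={kurt:.2f} fifth={fifth:.2f}")
```

Because the partial sum of the first two activities is itself independent of the third, the two-RV formulas can be applied repeatedly, which is what the loop does.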

Table 1 gives the parameters of the toy project costs that we will play with (consider them thousands or millions (etc.) of pounds, depending on how big a project you want this to represent). Using the material presented earlier, we find for this model that the mean project cost is 54.67, with a standard deviation of 9.51, skew (the standardized third central moment $\mu_3/\sigma^3$, and so dimensionless) of 0.52, kurtosis (the standardized fourth central moment $\mu_4/\sigma^4$, which is also dimensionless) of 2.47, and standardized fifth central moment $\mu_5/\sigma^5$ of 3.11. These are calculated conveniently using a spreadsheet, but it is too wide to show legibly here.

Moment Matching Gaussian Sum Model

In order to be able to look at other statistics of the distribution of project costs I will fit the first five moments to a sum of two Gaussians. Thus I want to look for $\lambda$, $\mu_1$, $\sigma_1$, $\mu_2$ and $\sigma_2$ such that the first five moments (the mean and the second through fifth central moments) of the distribution:


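$$
p(x) = \lambda \, \frac{1}{\sqrt{2\pi}\,\sigma_1} \exp\left(-\frac{(x-\mu_1)^2}{2\sigma_1^2}\right) + (1-\lambda) \, \frac{1}{\sqrt{2\pi}\,\sigma_2} \exp\left(-\frac{(x-\mu_2)^2}{2\sigma_2^2}\right)
$$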

are identical (or at least as close as I can get them numerically) to those calculated for the project cost distribution. The objective to be minimized is the sum of the squared differences between the calculated moments of the cost distribution and the corresponding moments of $p(x)$. The moments of $p(x)$ are easily computed using a table of the (non-central) moments of the normal distribution. This minimization can be performed with the solver function in the spreadsheets Excel or Gnumeric (though extra care is required in Gnumeric to ensure that the constraints $0 \le \lambda \le 1$ and $\sigma_i > 0$, $i = 1, 2$, are observed). The result of the fitting process for our toy project is:

$\lambda = 0.5199$, $\mu_1 = 47.60$, $\sigma_1 = 4.28$, $\mu_2 = 62.32$, $\sigma_2 = 7.46$

Comparison of Fitted Distribution with Monte-Carlo Simulation Results for the Project Cost Distribution

In order to compare the fitted distribution with the true distribution (which is too tricky to compute directly) I run a Monte-Carlo simulation of the project cost. This involves drawing a random number from the distribution for each activity and summing these to give one sample from the project cost distribution. Repeating this many times gives us a random sample from the project cost distribution. Now we can compare $p(x)$ with the histogram of our Monte-Carlo sample of project costs, and also the corresponding cumulative distributions.

Figures 3 and 4 show the plots that we can use to make the desired comparison. From figure 3 we see that the general behavior of the fitted distribution follows that of the Monte-Carlo results, except that the fitted distribution has two peaks rather than the single peak of the Monte-Carlo results. When we move on to the cumulative distributions in figure 4 we see what appears to be a rather good fit.
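Returning to the fitting step, the moments of the two-component Gaussian sum and the least-squares objective can be sketched as follows. This is an illustrative Python version rather than the spreadsheet used in the note: the function names are hypothetical, and the normal-moment expressions are the standard raw moments of a normal distribution applied to each component after shifting by the mixture mean.

```python
from math import sqrt


def shifted_normal_moments(d, s):
    """E[Y^k] for k = 2..5 where Y ~ N(d, s^2), from the table of raw normal moments."""
    return (d ** 2 + s ** 2,
            d ** 3 + 3 * d * s ** 2,
            d ** 4 + 6 * d ** 2 * s ** 2 + 3 * s ** 4,
            d ** 5 + 10 * d ** 3 * s ** 2 + 15 * d * s ** 4)


def mixture_moments(lam, mu1, s1, mu2, s2):
    """[mean, var, mu3, mu4, mu5] of lam * N(mu1, s1^2) + (1 - lam) * N(mu2, s2^2)."""
    mean = lam * mu1 + (1 - lam) * mu2
    c1 = shifted_normal_moments(mu1 - mean, s1)   # component 1 about the mixture mean
    c2 = shifted_normal_moments(mu2 - mean, s2)   # component 2 about the mixture mean
    return [mean] + [lam * u + (1 - lam) * v for u, v in zip(c1, c2)]


def objective(params, target):
    """Sum of squared differences between mixture moments and the target moments."""
    return sum((f - t) ** 2 for f, t in zip(mixture_moments(*params), target))


# Evaluate the mixture at the fitted parameters reported in the text.
params = (0.5199, 47.60, 4.28, 62.32, 7.46)
mean, var, mu3, mu4, mu5 = mixture_moments(*params)
sd = sqrt(var)
print(f"mean={mean:.2f} sd={sd:.2f} skew={mu3 / sd**3:.2f} "
      f"kurtosis={mu4 / sd**4:.2f} fifth={mu5 / sd**5:.2f}")
```

A general-purpose minimizer could be pointed at `objective` in place of a spreadsheet solver, provided the constraints on the weight and the standard deviations are enforced; here the sketch only evaluates the moments of the reported fit so they can be compared with the project's calculated moments.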


Figure 3: Plot of the Histogram of Monte-Carlo Results (solid line) and the PDF of the Two Component Gaussian Sum Distribution (broken line)

Figure 4: Plot of Cumulative Distribution for Monte-Carlo Results (solid line) and the Two Component Gaussian Sum Distribution (broken line)
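The Monte-Carlo simulation behind these figures can be reproduced in outline with the Python standard library. The sketch below is an assumption about how such a simulation might be set up, not the author's own code: the function name, sample size and seed are arbitrary choices, so individual digits will differ from run to run and from the results quoted in the text.

```python
import random
import statistics


def simulate_project_costs(activities, n_samples, seed=0):
    """Return n_samples simulated total costs; activities is a list of (a, m, b) triples."""
    rng = random.Random(seed)
    # random.triangular takes its arguments as (low, high, mode), so the mode goes last.
    return [sum(rng.triangular(a, b, m) for a, m, b in activities)
            for _ in range(n_samples)]


# (a, m, b) for the three activities of the toy project.
activities = [(5.0, 7.0, 9.0), (20.0, 21.0, 30.0), (10.0, 12.0, 50.0)]
costs = simulate_project_costs(activities, n_samples=50_000)

mc_mean = statistics.fmean(costs)
mc_sd = statistics.stdev(costs)
q1, q2, q3 = statistics.quantiles(costs, n=4)   # sample quartiles
print(f"mean={mc_mean:.2f} sd={mc_sd:.2f} quartiles={q1:.2f}, {q2:.2f}, {q3:.2f}")
```

Sorting `costs` and plotting rank against value gives the empirical cumulative distribution used in the comparison.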


In fact, since the underlying model of the project is so crude, we are only interested in the most rudimentary predictions from our models. I would suggest that, other than the mean, standard deviation, skew and kurtosis of the project costs, the most interesting statistics are the quartiles.

                                   1st      2nd      3rd
Qtiles from MC                     46.98    53.22    61.35
Qtiles from fitted distribution    47.20    52.50    61.95
% error in fitted                  -0.5%    +1.3%    -1.0%

Table 2: Comparison Between Quartiles
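The quartiles of the fitted distribution have no closed form, but they can be found by solving CDF(x) = p numerically. The Python sketch below uses plain bisection on the two-component Gaussian CDF with the fitted parameters reported earlier; the function names and the bracketing interval [0, 200] are assumptions made for this illustration.

```python
from math import erf, sqrt


def normal_cdf(x, mu, sigma):
    """CDF of N(mu, sigma^2) expressed through the error function."""
    return 0.5 * (1 + erf((x - mu) / (sigma * sqrt(2))))


def mixture_cdf(x, lam, mu1, s1, mu2, s2):
    """CDF of the two-component Gaussian sum."""
    return lam * normal_cdf(x, mu1, s1) + (1 - lam) * normal_cdf(x, mu2, s2)


def mixture_quantile(p, params, lo=0.0, hi=200.0, tol=1e-9):
    """Solve mixture_cdf(x) = p by bisection; [lo, hi] must bracket the solution."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if mixture_cdf(mid, *params) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Fitted parameters (lambda, mu1, sigma1, mu2, sigma2) as reported in the text.
params = (0.5199, 47.60, 4.28, 62.32, 7.46)
quartiles = [mixture_quantile(p, params) for p in (0.25, 0.50, 0.75)]
print([round(q, 2) for q in quartiles])
```

Bisection is used because the CDF is monotone, so it cannot fail once the interval brackets the answer; any other one-dimensional root finder would serve equally well.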

Summary

I have shown a method of estimating the summary statistics of a project cost distribution when the project comprises a number of independent activities, each with an individual cost having a triangular distribution. I have also shown how to estimate the quartiles of the project cost distribution using just the summary statistics, by fitting these to a distribution with two Gaussian components from which the quartiles can be calculated. These have been illustrated using a toy project model. In a real project with more activities the distribution of total cost will usually be closer to normality, and we may expect that the process described here will perform even better.

Having extolled the virtues of the method of moment matching above, I will now observe that for the type of project cost model considered here it is in fact easier to construct the Monte-Carlo model and calculate everything required from its results. We may also observe that activity costs are often, if not usually, not independent random variables.

References

[1] Kotz S., van Dorp J.R., Beyond Beta, World Scientific, 2004
