Distrib: Probability Distribution Analysis

The document describes the DISTRIB program, which allows users to easily fit univariate data to probability distributions. The main features of the DISTRIB program include: 1. A main dialog window containing tools for data entry, distribution selection, results display, and plotting. 2. Calculators for the binomial and Poisson distributions accessible from the program menus. 3. The ability to perform predictive analysis by selecting a probability distribution and calculating return periods based on the distribution fit.
DISTRIB

Probability Distribution Analysis

Introduction

DISTRIB is a program designed to allow univariate data to be fit easily to a
probability distribution. The program interface is primarily contained in one
dialog window. This window contains all the analysis and entry tools normally
needed to perform predictive analysis using probability distributions.
Included with the program are a Poisson and a Binomial probability calculator.

I. The DISTRIB Main Dialog Window

The DISTRIB main dialog window contains all the features necessary to analyze a set of
univariate data.

There are 5 major parts to the main dialog window:

1. Select Distribution frame
2. Data Entry Spreadsheet (upper spreadsheet with “white” Data column)
3. Prediction Spreadsheet (lower spreadsheet with “white” Probability column)
4. Results frame
5. Status Bar (bottom of window)

Univariate data are entered directly into the Data Entry Spreadsheet (upper). The
white portion of the spreadsheet allows for the entry of this data. Data should be
entered with one entry per row.

Do not leave blank rows when entering data.

Once all data are entered, click in the Prediction column of the upper Data Entry
Spreadsheet or, in the Select Distribution frame, select the distribution type you
wish to fit to the data.

* DISTRIB Example 1 *
Using the maximum annual flowrate data as shown, predict the 25 year return period flow
using the Log Pearson Type III distribution.

Year    Flowrate (cms)
1956    141
1957    90
1958    50
1959    109
1960    103
1961    142
1962    111
1963    75
1964    85
1965    63
1966    144

The data should be entered as shown:
Next click the Log Pearson Type III button in the Select Distribution frame.

[Figure: Log Pearson Type III plot of Values versus Weibull Probability]

From the lower Prediction Spreadsheet we can see that the 25 year return period
prediction is 165.46 cms.

* End of DISTRIB Example 1 *

The DISTRIB main dialog window allows for a number of options. To see a
close-up of the plot, click on the actual distribution plot in the lower left
corner of the main dialog window - an enlarged plot of the data will appear.

[Figure: enlarged Log Pearson Type III plot showing Actual Data and Distribution]

You may now Print or Copy (to clipboard) the plot. You may also change the
grid displayed by clicking on the Grid button. Each consecutive click will add
horizontal lines, vertical lines, or return the plot to the empty grid shown above.

The “C to M” button switches the plot between color and monochrome
display. The Style button allows for various plot line displays.

Select the Done button to return to the main DISTRIB dialog window.

The Prediction Spreadsheet allows for the entry of different probabilities
and calculation of predictions based on different return periods. The return
period can be calculated from probability using:

$$RP = \frac{1}{1 - p}$$

where,
RP = return period
p = probability
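
As a quick check of this relation, the following minimal Python sketch converts between probability and return period. The function names are illustrative and are not part of DISTRIB, and p is assumed to be a non-exceedance probability, so that a 25 year return period corresponds to p = 0.96.

```python
def return_period(p: float) -> float:
    """Return period RP = 1 / (1 - p) for a non-exceedance probability p."""
    if not 0.0 <= p < 1.0:
        raise ValueError("p must satisfy 0 <= p < 1")
    return 1.0 / (1.0 - p)


def probability(rp: float) -> float:
    """Inverse relation: p = 1 - 1 / RP for a return period RP >= 1."""
    if rp < 1.0:
        raise ValueError("return period must be at least 1")
    return 1.0 - 1.0 / rp


# Tabulate a few probabilities and their return periods.
for p in (0.5, 0.9, 0.96, 0.99):
    print(f"p = {p:.2f} -> RP = {return_period(p):.1f}")
```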

Enter the desired probabilities in the Probabilities (white) column of the
Prediction Spreadsheet (lower) and click on the Return Period heading to
activate the calculation process.

DISTRIB allows for optional plotting position formulas. Clicking on the word
Weibull in the upper corner of the Data Entry Spreadsheet will allow you to scroll
through the plotting position formulas:

Weibull       m/(n+1)
California    m/n
Foster        (2m-1)/(2n)
Exceedence    (m-1)/n
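
The four formulas can be compared with a short Python sketch. This is only an illustration: the text does not define m and n, so the code assumes m is the rank of an observation and n is the number of observations, and it ranks the data in ascending order so that each position reads as a non-exceedance probability. The flowrates are those of Example 1.

```python
# Plotting position formulas from the list above.
# ASSUMPTION: m is the rank of an observation and n the number of observations.
PLOTTING_POSITIONS = {
    "Weibull": lambda m, n: m / (n + 1),
    "California": lambda m, n: m / n,
    "Foster": lambda m, n: (2 * m - 1) / (2 * n),
    "Exceedence": lambda m, n: (m - 1) / n,
}

# Maximum annual flowrates (cms) from DISTRIB Example 1.
flows = [141, 90, 50, 109, 103, 142, 111, 75, 85, 63, 144]


def plotting_positions(data, formula="Weibull"):
    """Pair each value with its plotting position, ranking in ascending order."""
    position = PLOTTING_POSITIONS[formula]
    ordered = sorted(data)
    n = len(ordered)
    return [(x, position(m, n)) for m, x in enumerate(ordered, start=1)]


for value, prob in plotting_positions(flows, "Weibull"):
    print(f"{value:5d}  {prob:.4f}")
```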
II. DISTRIB Options

A. Binomial Calculator

The Binomial Calculator is available from the Calculators menu. The Binomial
Calculator, as shown below, allows for calculation of the probability of any discrete
value using the binomial distribution.

* DISTRIB Example 2 *

If the probability of a certain event is 0.2, what is the probability of that event
occurring exactly 3 times in 10 tries?

$$\Pr(x) = \frac{n!}{x!\,(n-x)!}\, p^{x} (1-p)^{n-x}$$

[Figure: Binomial Calculator plot of Probability versus Number of Tries]

From the Binomial Calculator it can be seen that the probability is 0.20133 or
20.133 %.
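
For reference, the same value follows from writing out the binomial formula by hand with n = 10 tries, x = 3 occurrences, and p = 0.2:

```latex
\Pr(3) = \frac{10!}{3!\,7!}\,(0.2)^{3}\,(0.8)^{7}
       = 120 \times 0.008 \times 0.2097152
       \approx 0.20133
```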

* End of DISTRIB Example 2 *

* DISTRIB Example 3 *

What is the probability of getting less than 3 heads in a toss of 7 coins?

$$\Pr(x) = \frac{n!}{x!\,(n-x)!}\, p^{x} (1-p)^{n-x}$$

[Figure: Binomial Calculator plot of Probability versus Number of Tries]

The probability of less than 3 heads is the sum of the probabilities of 0, 1, and 2
heads in the toss.

Pr(0) = 0.00781
Pr(1) = 0.05469
Pr(2) = 0.16406

Pr(x < 3) = 0.22656, so the probability of having less than 3 heads in a toss of
7 coins is 22.656 %.

* End of DISTRIB Example 3 *
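
Both binomial examples can be checked outside the calculator with a few lines of Python. This is a minimal sketch of the binomial formula, not DISTRIB's own code, and the function names are illustrative.

```python
from math import comb


def binomial_pmf(x: int, n: int, p: float) -> float:
    """Probability of exactly x occurrences in n tries, each with probability p."""
    return comb(n, x) * p**x * (1.0 - p) ** (n - x)


def binomial_less_than(x: int, n: int, p: float) -> float:
    """Probability of fewer than x occurrences: Pr(0) + ... + Pr(x - 1)."""
    return sum(binomial_pmf(k, n, p) for k in range(x))


# Example 2: exactly 3 occurrences in 10 tries with p = 0.2.
print(round(binomial_pmf(3, 10, 0.2), 5))        # 0.20133

# Example 3: fewer than 3 heads in a toss of 7 fair coins.
print(round(binomial_less_than(3, 7, 0.5), 5))   # 0.22656
```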


B. Poisson Calculator

The Poisson Calculator is available from the Calculators menu. The Poisson
Calculator performs calculations for the Poisson distribution.

$$\Pr(x) = \frac{\lambda^{x} e^{-\lambda}}{x!}$$

Probability = 9.02235221577418E-02

[Figure: Poisson Calculator plot of Probability]
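
The Poisson probability can be reproduced with a short Python sketch. The text does not state which inputs produced the probability displayed above; a mean of 2 with x = 4 gives the same value to the digits shown, so those inputs are used here as an inference rather than as the calculator's documented settings.

```python
from math import exp, factorial


def poisson_pmf(x: int, lam: float) -> float:
    """Probability of exactly x events when the mean number of events is lam."""
    return lam**x * exp(-lam) / factorial(x)


# ASSUMPTION: lam = 2 and x = 4 are inferred inputs, not stated in the text.
print(poisson_pmf(4, 2.0))
```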
C. Annualization Factor

The Annualization Factor is available from the Edit menu. It allows partial
duration series data to be used in analysis with DISTRIB.

The Annualization Factor is the ratio of the number of data points used in the
analysis to the span of years the data covers. This factor is used in the prediction
to put the return period predictions on an annual basis.
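
The ratio itself is simple to compute, as the sketch below shows. How DISTRIB applies the factor is not described in the text, so the second function is only an assumption: it uses the common partial duration relation in which the return period in years is 1 / (factor × (1 - p)). The record of 33 peaks in 11 years is hypothetical.

```python
def annualization_factor(n_points: int, span_years: float) -> float:
    """Ratio of the number of data points to the span of years they cover."""
    if span_years <= 0:
        raise ValueError("span_years must be positive")
    return n_points / span_years


def annual_return_period(p: float, factor: float) -> float:
    """Return period in years for non-exceedance probability p.

    ASSUMPTION: RP = 1 / (factor * (1 - p)); the text does not give the
    formula DISTRIB uses, so this is one common convention only.
    """
    return 1.0 / (factor * (1.0 - p))


# Hypothetical partial duration series: 33 peaks recorded over 11 years.
factor = annualization_factor(33, 11)
print(factor)                                       # 3.0
print(f"{annual_return_period(0.96, factor):.2f}")  # shorter than the annual case
print(f"{annual_return_period(0.96, 1.0):.2f}")     # factor 1 leaves RP unchanged
```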

D. Statistics

The Statistics menu option allows you to activate calculation or plotting of the
statistics without having to click on any of the Results frames (Prediction
columns, Standard Deviation columns, or the Return Period column).

The Fit Distribution selection will calculate all of the Results frame data. The Plot
Distribution selection will activate the enlarged plot of the distribution without
having to click on the small plot in the lower left-hand corner of the main DISTRIB
dialog window.

III. Types of Distributions

DISTRIB uses the distribution types commonly used in the field of
water resources. These distributions are represented by probability density
functions, shown below:

A. Normal Distribution

The probability density function for the Normal distribution is:

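$$p_x(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left[-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^{2}\right]$$

where,

p_x(x) = probability density at magnitude x (integrating it up to x gives the
probability of an event of magnitude less than x)
μ = mean of population
σ = standard deviation of population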
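
A Normal fit can be sketched with Python's standard library. The text does not describe DISTRIB's fitting procedure, so this example assumes a simple method-of-moments fit (sample mean and sample standard deviation) to the Example 1 flowrates and does not claim to match the program's output.

```python
from statistics import NormalDist, mean, stdev

# Maximum annual flowrates (cms) from DISTRIB Example 1.
flows = [141, 90, 50, 109, 103, 142, 111, 75, 85, 63, 144]

# ASSUMPTION: method-of-moments fit; DISTRIB's own procedure is not documented.
fitted = NormalDist(mu=mean(flows), sigma=stdev(flows))

x = 100.0
print(f"density at {x}: {fitted.pdf(x):.5f}")
print(f"P(X < {x}): {fitted.cdf(x):.4f}")

# Magnitude with non-exceedance probability 0.96 (25 year return period).
print(f"25 year value: {fitted.inv_cdf(0.96):.2f} cms")
```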

B. Log Normal Distribution

The probability density function of the Log Normal distribution is:

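$$p_y(y) = \frac{1}{\sigma_y\sqrt{2\pi}} \exp\left[-\frac{1}{2}\left(\frac{y-\mu_y}{\sigma_y}\right)^{2}\right]$$

where,

y = ln x
p_y(y) = probability density at magnitude y (integrating it up to y gives the
probability of an event of magnitude less than y)
μ_y = mean of population of y
σ_y = standard deviation of population of y

C. Three Parameter Log Normal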

In the Three Parameter Log Normal distribution the parameter y of the log normal
distribution is calculated as:

y = ln (x - a)

where,

“a” is a constant which is determined in the analysis of the data.
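
The two and three parameter log normal fits differ only in the shift a, which the following sketch makes explicit. It assumes a method-of-moments fit on y = ln(x - a) and takes a as a given input; the value a = 20 is made up for illustration, since the text does not say how DISTRIB determines the constant.

```python
from math import exp, log
from statistics import NormalDist, mean, stdev

# Maximum annual flowrates (cms) from DISTRIB Example 1.
flows = [141, 90, 50, 109, 103, 142, 111, 75, 85, 63, 144]


def fit_log_normal(data, a=0.0):
    """Fit a normal distribution to y = ln(x - a); a = 0 is the two parameter case."""
    y = [log(x - a) for x in data]  # requires x > a for every observation
    return NormalDist(mu=mean(y), sigma=stdev(y))


def predict(dist, p, a=0.0):
    """Back-transform the p-quantile of y to the original units: x = a + exp(y)."""
    return a + exp(dist.inv_cdf(p))


two_param = fit_log_normal(flows)
three_param = fit_log_normal(flows, a=20.0)  # ASSUMPTION: a = 20 is a made-up value
print(f"{predict(two_param, 0.96):.2f} cms")
print(f"{predict(three_param, 0.96, a=20.0):.2f} cms")
```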

D. Pearson Distribution

The probability density function of the Pearson distribution is:

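$$p_x(x) = p_0 \left(1 + \frac{x}{\alpha}\right)^{\alpha/\delta} e^{-x/\delta}$$

where,

x = magnitude measured from the mode
δ = difference between mean and mode (δ = μ - x_m), with x_m = mode of population x
α = scale parameter of distribution
p_0 = value of p_x(x) at mode

E. Log Pearson Distribution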

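The probability density function of the Log Pearson distribution is:

$$p_y(y) = p_{y0} \left(1 + \frac{y}{\alpha}\right)^{\alpha/\delta} e^{-y/\delta}$$

where,

y = magnitude of the logarithm of x, measured from the mode
δ = difference between mean and mode (δ = μ_y - y_m), with y_m = mode of population y
α = scale parameter of distribution
p_{y0} = value of p_y(y) at mode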
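
A Log Pearson Type III prediction can be sketched as follows. This is not DISTRIB's algorithm, which the text does not describe. The sketch swaps in a commonly used approach: moments of log10(x) combined with the Wilson-Hilferty approximation to the Pearson Type III frequency factor. Because of that, its result for the Example 1 data need not reproduce the 165.46 cms reported there exactly.

```python
from math import log10
from statistics import NormalDist, mean, stdev


def sample_skew(values):
    """Bias-corrected sample skewness coefficient."""
    n, m, s = len(values), mean(values), stdev(values)
    return n * sum((v - m) ** 3 for v in values) / ((n - 1) * (n - 2) * s**3)


def frequency_factor(p, skew):
    """Wilson-Hilferty approximation to the Pearson Type III frequency factor."""
    z = NormalDist().inv_cdf(p)
    if abs(skew) < 1e-6:
        return z  # zero skew reduces to the normal distribution
    k = skew / 6.0
    return (2.0 / skew) * ((1.0 + k * z - k * k) ** 3 - 1.0)


def log_pearson3_quantile(data, p):
    """Magnitude with non-exceedance probability p, fitted by moments of log10(x)."""
    y = [log10(x) for x in data]
    k_t = frequency_factor(p, sample_skew(y))
    return 10.0 ** (mean(y) + k_t * stdev(y))


# Maximum annual flowrates (cms) from DISTRIB Example 1.
flows = [141, 90, 50, 109, 103, 142, 111, 75, 85, 63, 144]
print(f"{log_pearson3_quantile(flows, 0.96):.2f} cms")
```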

F. Gumbel Distribution

The probability density function of the Gumbel distribution is:

$$p_x(x) = \frac{1}{\alpha} \exp\left[-\frac{x-\beta}{\alpha} - \exp\left(-\frac{x-\beta}{\alpha}\right)\right]$$

where,

α = scale parameter of the distribution
β = location parameter of the distribution
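
A Gumbel fit and prediction can be sketched in a few lines. Two assumptions are made here and neither is confirmed by the text: the scale parameter is taken in the same units as x, so that the cumulative form is exp(-exp(-(x - β)/α)), and the parameters are estimated by the method of moments.

```python
from math import exp, log, pi, sqrt
from statistics import mean, stdev

EULER_GAMMA = 0.5772156649015329


def fit_gumbel(data):
    """Method-of-moments estimates of the scale (alpha) and location (beta)."""
    alpha = stdev(data) * sqrt(6.0) / pi
    beta = mean(data) - EULER_GAMMA * alpha
    return alpha, beta


def gumbel_cdf(x, alpha, beta):
    """Probability of an event of magnitude less than x."""
    return exp(-exp(-(x - beta) / alpha))


def gumbel_quantile(p, alpha, beta):
    """Magnitude with non-exceedance probability p (inverse of gumbel_cdf)."""
    return beta - alpha * log(-log(p))


# Maximum annual flowrates (cms) from DISTRIB Example 1.
flows = [141, 90, 50, 109, 103, 142, 111, 75, 85, 63, 144]
alpha, beta = fit_gumbel(flows)
print(f"{gumbel_quantile(0.96, alpha, beta):.2f} cms")
```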
IV. Performing Distribution Analysis

A. Dealing with Zero Values

A number of methods have been used to handle 0.0 values. DISTRIB does not
perform any of these conversions for you and may crash if you attempt to
fit zero data to a log distribution. To alleviate this problem you may:

1. Add 1.0 to all data.
2. Add a small positive value to all data.
3. Substitute 1.0 in place of all zero data.
4. Substitute a small positive number in the place of all zero readings.
5. Ignore all zero observations.
6. Consider the probability distribution as the sum of the probability mass at
   0.0 and a probability distribution over the remainder of the range.
   This method is described in Jennings and Benson.
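
The last method in the list can be sketched as a two-part model: a probability mass at zero plus a distribution fitted to the nonzero observations. The sketch below is an illustration under stated assumptions, not the procedure from Jennings and Benson: it uses a log normal fit for the nonzero part, and the data record is hypothetical.

```python
from math import exp, log
from statistics import NormalDist, mean, stdev


def fit_with_zeros(data):
    """Split data into a probability mass at zero and a log normal fit of the rest."""
    positive = [x for x in data if x > 0]
    p_zero = 1.0 - len(positive) / len(data)
    y = [log(x) for x in positive]
    return p_zero, NormalDist(mean(y), stdev(y))


def mixed_quantile(p, p_zero, dist):
    """Magnitude with overall non-exceedance probability p."""
    if p <= p_zero:
        return 0.0
    # Rescale p to a probability within the nonzero part of the distribution.
    return exp(dist.inv_cdf((p - p_zero) / (1.0 - p_zero)))


# Hypothetical record containing two zero observations.
data = [0.0, 12.0, 30.0, 0.0, 8.0, 45.0, 21.0, 16.0, 27.0, 9.0]
p_zero, dist = fit_with_zeros(data)
print(f"probability mass at zero: {p_zero:.2f}")
print(f"0.96 quantile: {mixed_quantile(0.96, p_zero, dist):.2f}")
```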

B. Selecting the “Best Fit” Distribution

In most real-world cases a type of distribution will not be specified. In this situation the
best distribution for the data must be selected. It is recommended that:

1. The data be fit to each distribution and plotted.
2. The plots be compared for best fit in the region of interest.
3. The selected best distribution be used for all similar data.

The “region of interest” is the portion of the fit which is going to be predicted. When
making predictions of extreme events, the left side of the plot (high return periods) should
fit the data well. It is not important that the low-end numbers reflect a
good fit in this case.
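
The comparison recommended above is visual, but a rough numerical stand-in can help when many data sets are screened. The sketch below is an assumption-laden illustration, not a DISTRIB feature: it fits Normal and Log Normal distributions by moments, pairs the ranked data with Weibull plotting positions (ascending rank), and scores each fit by its root-mean-square error over the upper half of the probabilities only, which is treated here as the region of interest.

```python
from math import exp, log, sqrt
from statistics import NormalDist, mean, stdev

# Maximum annual flowrates (cms) from DISTRIB Example 1.
flows = [141, 90, 50, 109, 103, 142, 111, 75, 85, 63, 144]

# Candidate fits, each expressed as a quantile function of probability p.
normal = NormalDist(mean(flows), stdev(flows))
logs = [log(x) for x in flows]
log_normal = NormalDist(mean(logs), stdev(logs))
candidates = {
    "Normal": normal.inv_cdf,
    "Log Normal": lambda p: exp(log_normal.inv_cdf(p)),
}


def tail_rmse(data, quantile, p_min=0.5):
    """RMS error between data and fitted quantiles where Weibull p >= p_min."""
    ordered = sorted(data)
    n = len(ordered)
    errors = [
        (x - quantile(m / (n + 1))) ** 2
        for m, x in enumerate(ordered, start=1)
        if m / (n + 1) >= p_min
    ]
    return sqrt(sum(errors) / len(errors))


# Lower scores indicate a closer fit in the region of interest.
scores = {name: tail_rmse(flows, q) for name, q in candidates.items()}
for name, score in sorted(scores.items(), key=lambda item: item[1]):
    print(f"{name:10s} {score:.2f}")
```

A score like this only supplements the plots; it depends on the chosen plotting position and on where the region of interest is cut off.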
