
2 Statistics

µ = Σ_{i=1}^{n} xi · f_X(xi)        σ² = Σ_{i=1}^{n} (xi − µ)² · f_X(xi)        C = σ / µ

[Image: stamp and currency of Germany, valid until 2001]

[Figure: Poisson distributions for µ = 0.3, µ = 1 and µ = 5; density function f(x) versus number per unit [time], x = 0 ... 12]
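The three Poisson curves of the title figure can be reproduced numerically. This is a sketch (the slides contain no code; Python is our choice of language) of the density f(x) = µ^x / x! · e^−µ that is derived later on page 2-27:

```python
from math import exp, factorial

def poisson_pmf(x: int, mu: float) -> float:
    """Ps(x | mu) = mu^x / x! * e^-mu, the density plotted in the title figure."""
    return mu ** x / factorial(x) * exp(-mu)

# Reproduce the three curves of the figure for x = 0 .. 12.
for mu in (0.3, 1.0, 5.0):
    curve = [poisson_pmf(x, mu) for x in range(13)]
    print(mu, [round(p, 3) for p in curve])
```

For µ = 0.3 the density is concentrated at x = 0, matching the tall leftmost curve in the figure.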

Integrated Circuit Manufacturing Prof.Dr.W.Hansch, Dipl.-Ing.E.Schober


Modul 1278 ICM, 2- 1
2. Statistics
2.1 Overview

Statistics is a science with two purposes:
to collect, prepare, describe and present information (descriptive statistics), and
to conclude from sample information to properties of the whole population (inductive statistics).

Collecting data from the total population (called a census) may be impossible, impractical or too expensive.
In that case only a sample of the total population is taken.

Descriptive statistics:
* collecting data (the sample)
* arranging the data
* combining the data
* presentation of the data
* searching for relations in the data
-> complete information about the sample
! Notation: for the sample, Latin letters are used: x̄, s, ...

Inductive statistics:
* conclusion (inference) from the sample to the total population, using probability calculations (stochastics)
* quantification of uncertainty: estimations <x>, <s> from the sample for the total population
* making decisions about the total population
-> uncertain information about the total population
! Notation: for the total population, Greek letters are used: µ, σ, ...

2 Statistics

2.1 Overview

2.2 Describing (descriptive) statistics = description of samples

2.3 Probability calculations = calculation of probabilities of future events,
    based on well-known probability functions

2.4 Concluding (inductive) statistics = estimation of properties of a total population,
    based on properties of a sample

2. Statistics
2.2 Descriptive Statistics Definitions

Total population of size N: e.g. all male residents of a town who are older than 18 years
Sample of size n: random selection of 100 of these persons
Attribute carrier: object on which the investigation is done, e.g. patient, animal, government, ...
Attribute x: selected, assignable property of a sample member, e.g. age, size, blood group, ...
Attribute value (observation value) xi: measured value (15 years) of the attribute (age) of sample member ni
Data collection: collecting all observation values
Root list: unsorted list of all observation values, as they are collected
1st manipulation -> Sorted root list: arranged observation values, e.g. by increasing age
!! (even though all values are still included, information may be lost with this step, e.g. trends)
Example 1:
From the unknown total population N of all people with a heart attack,
the attribute "creatine kinase CK" (this is an enzyme) is measured in
a sample of n = 45 and the results are collected in a root list:

Root list (patient: value):
 1: 224    2: 188    3: 214    4: 275    5: 191    6: 147    7: 201    8: 252    9: 294
10: 170   11: 119   12: 201   13: 288   14: 191   15: 194   16: 157   17: 409   18: 218
19: 189   20: 108   21: 297   22: 247   23: 278   24: 121   25: 215   26: 259   27: 437
28: 206   29: 158   30: 272   31: 191   32: 171   33: 194   34: 168   35: 274   36: 181
37: 248   38: 111   39: 190   40:  78   41: 169   42: 193   43:  50   44: 151   45: 129

Sorted root list (in increasing order):
 50   78  108  111  119  121  129  147  151  157  158  168  169  170  171
181  188  189  190  191  191  191  193  194  194  201  201  206  214  215
218  224  247  248  252  259  272  274  275  278  288  294  297  409  437

2. Statistics
2.2 Descriptive Statistics Sorting of Data

From the root list it is not easy to get an overview of the observed values; for this reason the values are
ordered in classes. The presentation of observation values in classes is called a frequency distribution.

Formation of classes: sorting the observation values into a number k of non-overlapping ranges (classes, bins, cells)

The choice may be:

non-equidistant classes: e.g. following superior aspects.
In our heart attack example we could create the classes: x < 115: medical treatment necessary;
115 - 170: only observation; 171 - 300: normal; ...

equidistant classes, rule of thumb for the number k of classes: k ≈ √n  or  k ≈ 5 · log(n)
equidistant class size: d = (max − min) / k
In addition, a definition is needed of the class to which the particular class limits belong,
e.g.: does 115 belong to k1 or k2?

Classification: sorting the observation values into the classes.

sample size n = 45 -> √45 ≈ 6.7 -> choose 6 or 7 classes
lowest value: 50, highest value: 437 -> max − min = 387 -> possible class size: d = 387 / 6 ≈ 65
fixing the class borders: ku ≤ xi < ko

class ki   range d           values from root list                  number ni   tally
k1         50 ≤ x < 115      50, 78, 108, 111                       4           IIII
k2         115 ≤ x < 180     119, 121, 129, 147, 151, 157,          11          IIIII IIIII I
                             158, 168, 169, 170, 171
k3         180 ≤ x < 245     .....                                  17          IIIII IIIII IIIII II
k4         245 ≤ x < 310     .....                                  11          IIIII IIIII I
k5         310 ≤ x < 375                                            0
k6         375 ≤ x < 440     ......                                 2           II
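The sorting and classification steps above can be checked with a short script (a sketch; Python is our choice, the slides give no code). It reproduces the class counts ni of the tally table:

```python
# Root list from the heart-attack example (n = 45 creatine-kinase values).
root_list = [224, 188, 214, 275, 191, 147, 201, 252, 294,
             170, 119, 201, 288, 191, 194, 157, 409, 218,
             189, 108, 297, 247, 278, 121, 215, 259, 437,
             206, 158, 272, 191, 171, 194, 168, 274, 181,
             248, 111, 190, 78, 169, 193, 50, 151, 129]

sorted_root_list = sorted(root_list)          # 1st manipulation: sorted root list
k = 6                                         # rule of thumb: sqrt(45) ~ 6.7
d = 65                                        # class size ~ (437 - 50) / 6
borders = [50 + i * d for i in range(k + 1)]  # 50, 115, 180, ..., 440

# Classification: count values with ku <= x < ko for each class.
counts = [sum(1 for x in root_list if lo <= x < hi)
          for lo, hi in zip(borders[:-1], borders[1:])]
print(counts)   # absolute frequencies n_i per class
```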

2. Statistics
2.2 Descriptive Statistics Frequency of Data
absolute frequency ni: number of observation values that belong to class i.
    Example: in lecture room i there are 20 students -> ni = 20.

relative frequency pi: ratio of the absolute frequency ni in class i to the total sample size n:  pi = ni / n
    Example: of n = 100 students, ni = 20 are in lecture room i -> 20% of all students are in lecture i.
    => this is a probability

absolute frequency density gi = ni / ∆xi (with equidistant classes: gi = ni / d);  d: class range.
    "Density" means normalization to a volume/size or interval.
    Example: ni = 20 students are in lecture room i with d = 25 seats (= volume) -> the room is 80% filled.
    => this is a density

relative frequency density fi = gi / n = pi / ∆xi (with equidistant classes: fi = pi / d)
    Example: of n = 100 students, ni = 20 are in lecture room i with d = 25 seats.
    => this is a probability density: 20% of all students fill this special lecture room i to 80%.

The general forms with ∆xi hold for non-equidistant class sizes; with equidistant class sizes ∆xi = d.

We want to know how large the part Ni from the smallest value x1 up to the value xi is -> we have to sum up:

absolute frequency sum:  Ni = Σ_{k=1}^{i} nk

relative frequency sum:  Pi = Σ_{k=1}^{i} pk

It is convenient not to use over-exact percentage values for the relative frequencies:
n < 30: no decimal places;  30 < n < 300: one decimal place;  300 < n: two decimal places.
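The four frequency measures and the two sums can be sketched in a few lines (Python as our choice of language; the class counts are those of the heart-attack example from the previous page):

```python
# Frequency measures for the heart-attack example: n = 45, class size d = 65.
n, d = 45, 65
n_i = [4, 11, 17, 11, 0, 2]                   # absolute frequencies per class

p_i = [ni / n for ni in n_i]                  # relative frequency (a probability)
g_i = [ni / d for ni in n_i]                  # absolute frequency density
f_i = [ni / (n * d) for ni in n_i]            # relative frequency density

# Frequency sums: cumulate from the smallest class upward.
N_i, P_i = [], []
for ni, pi in zip(n_i, p_i):
    N_i.append((N_i[-1] if N_i else 0) + ni)
    P_i.append((P_i[-1] if P_i else 0) + pi)
print(N_i)   # absolute frequency sums
```

The last relative frequency sum must reach 1, since all 45 values lie in some class.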

2. Statistics
2.2 Descriptive Statistics Frequency of Data
sample size: n = 45
class size: d = 65

class  range            absolute       relative       absolute freq. density   relative freq. density    absolute         relative
                        frequency ni   frequency pi   gi = ni / d              fi = ni / (n·d) [·10⁻³]   frequency sum    frequency sum
k1     50 ≤ x < 115     4              4/45 = 8.9%    4/65 = 0.062             1.4                       4                0.089
k2     115 ≤ x < 180    11             24.4%          11/65 = 0.169            3.8                       4 + 11 = 15      0.089 + 0.244 = 0.333
k3     180 ≤ x < 245    17             37.8%          17/65 = 0.262            5.8                       15 + 17 = 32     0.71
k4     245 ≤ x < 310    11             24.4%          11/65 = 0.169            3.8                       32 + 11 = 43     0.96
k5     310 ≤ x < 375    0              0%             0                        0                         43 + 0 = 43      0.96
k6     375 ≤ x < 440    2              4.4%           2/65 = 0.031             0.7                       43 + 2 = 45      0.999
       Σ = 45           Σ = 99.9%                                              final sum: 45             final sum: 0.999 ≈ 1

[Figure: three bar charts over the attribute in equidistant classes (50 ... 450): absolute and relative frequency, absolute and relative frequency density, and absolute and relative frequency sum]

2. Statistics
2.2 Descriptive Statistics Graphical Representation of Data

Most people show a lack of interest or have no time to go through facts and numbers given in reports.
But if these numbers are presented graphically, they become easier to grasp, catch the eye and
have a more lasting effect on the reader's mind.
The graphical representation of data makes reading more interesting, less time-consuming and
more easily understandable. The disadvantage of graphical presentation is that it lacks details and is less accurate.

A first impression of the shape and properties of a frequency distribution can be obtained from the tally.
Based on that, several methods exist to obtain a visual display of the data.

In a bar chart or bar graph the heights of the bars are proportional to the frequencies.
The width has no significance; therefore the graph is called one-dimensional.
Usually the bars are separated by a space.

In a histogram or pie chart the areas (width and height) are proportional to the frequency densities.
This is commonly used for non-equidistant classes; therefore the graph is called two-dimensional.
Usually the bars (pies) touch each other.

[Figure: bar chart of the number of department members (departments A, B, AB, O) and a pie chart of development costs per department in Mill. USD (Lithography, CVD, RIE, Inspection, Sawing, Cleaning, Bonding, NCCA, FA)]

2. Statistics
2.2 Descriptive Statistics Graphical Presentation of Data
The appearance of the same frequency distribution may be strongly influenced by the chosen classification:

[Figure: four histograms of the same body-size data (in cm) with class sizes 1.25, 3.0, 5.5 and 11.25; depending on the classification the distribution appears multimodal, unimodal, left skewed or right skewed]

Objective parameters to describe the properties of a frequency distribution must be defined.
2. Statistics
2.2 Descriptive Statistics Parameters of Frequency Distributions

Looking at a frequency distribution, one is interested in its structure and characteristic values.

With characteristic values the frequency distribution can be compared to other samples or frequency distributions.
For this reason, characteristic values are introduced:

Parameters describing the position (central tendency) of the distribution:
    Median
    Mode value
    Arithmetical mean
    Quantiles
    ...

Parameters describing the variance of the distribution:
    Range (Spannweite)
    Absolute deviance
    Variance
    Standard deviation
    ...

Parameters describing the shape of the distribution:
    Skew (Schiefe)
    Steepness (Steilheit, kurtosis)
    ...

[Figure: bar chart of sunny hours [h] per month (Jan ... Dec), annotated with mode, median, arithmetical mean, variance, steepness and skew (here: left skewed)]

2. Statistics
2.2 Descriptive Statistics Parameters of Frequency Distributions

Parameters describing the central tendency

The mode xH (Modus) represents the most likely observation value.
    advantage: can be determined on all scales; very stable with respect to outliers
    attention: several modes may be possible

The median xM (central value) separates the (increasingly sorted) sample into exactly two halves.

The mean x̄ of all sample values is a measure of the central tendency (location):  x̄ = (1/n) · Σ_{i=1}^{n} xi
    attention: sensitive to outliers

[Figure: three examples: a bar chart of languages spoken (German 18, English 5, Chinese 27) with its mode; a bar chart of grades (A+ ... F) with mode and median (29 students left, 29 right of the median); a histogram of body sizes of 58 students with mode, median and mean (1.72)]

Nominal scale: no sorting possible -> only the mode exists.
Ordinal scale: sorting possible (A+ better than A, and A better than A-, ...) -> mode and median exist.
Metrical scale (interval and ratio scale): -> all position values exist.
2. Statistics
2.2 Descriptive Statistics Parameters of Frequency Distributions

Parameters describing the central tendency


Example
Department X consists of 9 persons. The company is interested in the salary of the employees.
The 9 persons have the following salaries (in €):
1160, 1050, 980, 1200, 970, 1800, 6600 (department leader), 1180, 1090   (this is the so-called root list)

The mean x̄ of all sample values is:  x̄ = (1/n) · Σ_{i=1}^{n} xi,  which is here: 1/9 · 16030 € ≈ 1781 €

A publication of this value would suggest that the average salary is 1781 €.
In reality nearly all employees (7 of 9) earn significantly less money; the outlier raises the mean value dramatically.
Another possibility for a position value is the median.

The median xM (central value) separates the (increasingly sorted) sample into exactly two halves;
therefore the median is not so sensitive to outliers.

sorted root list: 970, 980, 1050, 1090, 1160, 1180, 1200, 1800, 6600  -> median = 1160 € (the 5th of 9 values)

The mode xH (Modus) represents the most likely observation value.
In the presented example each value is observed just once, so no mode exists.
We could create classes: 800-1000 € (2 values), 1000-1200 € (5 values), 1200-1400 € (0 values), etc.;
then the mode would be the class 1000-1200 €.

[Figure: bar chart of the salaries (Paul 970, Schulz 980, Wang 1050, Perka 1090, Borg 1160, Cao 1180, Kumar 1200, Schober 1800, Hansch 6600) with the mean and the median marked]

By comparing the various central tendency values the "conformity" of the distribution may be estimated.
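The salary example can be checked directly (a sketch in Python, our choice of language), showing how the single 6600 € outlier pulls the mean far above the median:

```python
# Salary example: mean vs. median sensitivity to the 6600 EUR outlier.
salaries = [1160, 1050, 980, 1200, 970, 1800, 6600, 1180, 1090]   # root list

mean = sum(salaries) / len(salaries)          # 16030 / 9 ~ 1781 EUR
sorted_s = sorted(salaries)
median = sorted_s[len(sorted_s) // 2]         # middle of 9 sorted values

print(round(mean, 1), median)
```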
2. Statistics
2.2 Descriptive Statistics Parameters of Frequency Distributions

Parameters describing the central tendency

Quantiles are a set of "cut points" that divide a sample of sorted data into groups
containing (as far as possible) equal numbers of observations.

2 equal parts: the cut point is the median
4 equal parts: quartiles are cut points that divide a sample of data into four groups
100 equal parts: percentiles are cut points that divide a sample of data into one hundred groups

[Figure: histogram of the body size [cm] of sportsmen (120 ... 220 cm) with the lower quartile, median and upper quartile marked, each quartile containing 100 sportsmen; the interquartile range IQR spans from the lower to the upper quartile]

Note: equal quantiles always contain the same numbers of values,
but on the attribute (x-)scale their ranges may be different.
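The quartile cut points can be computed with the standard library (a sketch; Python is our choice, and the body-size values below are hypothetical, not the data of the figure):

```python
import statistics

# Quartiles of a small sorted sample: cut points dividing it into 4 equal groups.
data = sorted([150, 155, 160, 162, 165, 170, 172, 175, 180, 185, 190, 200])

q1, q2, q3 = statistics.quantiles(data, n=4)   # lower quartile, median, upper quartile
iqr = q3 - q1                                  # interquartile range IQR

print(q1, q2, q3, iqr)
```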

2. Statistics
2.2 Descriptive Statistics Parameters of Frequency Distributions

Parameters describing the central tendency


Box-Whisker-Plot

The Box-Whisker-Plot is a quantile-based graphical display of location values (invented ~1975 by J. W. Tukey).

Above the attribute scale the 0.25-quantile and the 0.75-quantile are drawn as a box.
Within the box the 0.5-quantile (median) is marked as a line, and sometimes the mean value as a star.
As whiskers, usually the 0.1- and 0.9-quantiles are given (but other values are possible as well).
In addition, outliers can be shown.

This graphical display gives a quick overview of the position, the width (scattering) and the shape of a frequency distribution.
This property is a great advantage when comparing data.

[Figure: anatomy of a box plot over the size scale (box, median, mean, whiskers) and box plots of plant height [cm] for different fertilizers]

2. Statistics
2.2 Descriptive Statistics Parameters of Frequency Distributions

Parameters describing the spreading of data

The range R is a measurement for the absolute variation of the observed values:  R = xmax − xmin
    ! very sensitive to accidental (outlier) values

The absolute deviance AD indicates the average of all measured absolute deviations from the mean value:

    AD = (1/n) · Σ_{i=1}^{n} |xi − x̄|

The variance s² is a measurement for the variation of all observed values about the mean value:

    s² = (1/n) · Σ_{i=1}^{n} (xi − x̄)²

    ! due to the quadratic term, very sensitive to accidental (outlier) values

Note:
If we have measured n observation values, then the variance s² is calculated as given above (descriptive statistics).
But frequently in the literature the variance s² of the sample is treated as an estimation value <s²> for the variance σ² of the underlying,
unknown total population N (inductive statistics). This can be recognized if in the s²-equation the fraction 1/(n−1) occurs instead of 1/n.
The standard deviation s is the square root of the variance:  s = √s²

! The advantage of s is that its value has the same dimension as the data, so it is easy to compare.
! In graphical presentations, usually the mean and ±s (a band of total width 2s) are shown.

[Figure: frequency distribution of the heart-attack example from p. 2-7 with mode 192, mean 205, standard deviation ±74 and range 378 marked]
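The spread measures above can be sketched in a few lines (Python as our choice of language; the data reuse the salary root list from p. 2-12), including the 1/n versus 1/(n−1) distinction:

```python
from math import sqrt

# Spread measures for a small sample (descriptive statistics: divide by n).
x = [970, 980, 1050, 1090, 1160, 1180, 1200, 1800, 6600]   # salary root list
n = len(x)
mean = sum(x) / n

R = max(x) - min(x)                            # range
AD = sum(abs(xi - mean) for xi in x) / n       # absolute deviance
s2 = sum((xi - mean) ** 2 for xi in x) / n     # variance, 1/n form
s = sqrt(s2)                                   # standard deviation

# Inductive variant: the estimator <s2> uses 1/(n-1) instead of 1/n.
s2_est = sum((xi - mean) ** 2 for xi in x) / (n - 1)
print(R, round(AD, 1), round(s, 1))
```

Note how AD stays below s here: the quadratic term weights the 6600 € outlier much more strongly.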
2. Statistics
2.2 Descriptive Statistics Parameters of Frequency Distributions

Up to now we have seen some parameters of frequency distributions, e.g. the arithmetical mean or the variance.
But both are just special cases of a more general definition, the empirical moments.

The (empirical) moments of a sample frequency distribution are defined as:

    m_r(a) = (1/n) · Σ_{i=1}^{n} (xi − a)^r

    if the reference point a = zero -> common moments
    if the reference point a = arithmetical mean -> central moments

Special cases:

Arithmetical mean (position of the distribution):
    m1 = x̄ = (1/n) · Σ_{i=1}^{n} xi

Variance s² (spread):
    m2(x̄) = s² = (1/n) · Σ_{i=1}^{n} (xi − x̄)²            variation coefficient VC = s / x̄

Skewness (shape):
    m3(x̄) = (1/n) · Σ_{i=1}^{n} (xi − x̄)³                 skewness coefficient SC = m3 / s³

Steepness, kurtosis (shape):
    m4(x̄) = (1/n) · Σ_{i=1}^{n} (xi − x̄)⁴                 kurtosis coefficient KC = m4 / s⁴ − 3

and so on
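The general moment definition and the three coefficients can be sketched directly (Python as our choice; the eight sample values are hypothetical):

```python
from math import sqrt

# Empirical moments m_r(a) = (1/n) * sum((x_i - a)^r) and the shape coefficients.
def moment(x, r, a=0.0):
    return sum((xi - a) ** r for xi in x) / len(x)

x = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]    # hypothetical sample

mean = moment(x, 1)                   # 1st common moment: position
s2 = moment(x, 2, a=mean)             # 2nd central moment: variance
s = sqrt(s2)

VC = s / mean                         # variation coefficient
SC = moment(x, 3, a=mean) / s ** 3    # skewness coefficient
KC = moment(x, 4, a=mean) / s ** 4 - 3   # kurtosis coefficient
print(round(mean, 2), round(s2, 2), round(SC, 3), round(KC, 3))
```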

2. Statistics
2.2 Descriptive Statistics Parameters of Frequency Distributions
In the equations for the moments, products and sums occur:

    x̄_n = (1/n) · Σ_{i=1}^{n} xi
    s_n² = (1/n) · Σ_{i=1}^{n} (xi − x̄)²              (descriptive statistics: all n values are counted)
    s_{n−1}² = (1/(n−1)) · Σ_{i=1}^{n} (xi − x̄)²      (inductive statistics: counting only to n−1)

This makes it necessary, if new values are added to the sample, to make a complete recalculation
(because a new mean x̄_{n+1} is created). This can be avoided if the terms are rearranged to form
recursion formulas (for equally weighted values):

    x̄_{n+1} = (n/(n+1)) · x̄_n + (1/(n+1)) · x_{n+1}

For the variance, expanding the square gives a form that needs only the running sums Σ xi and Σ xi²:

    s_n² = (1/n) · Σ (xi − x̄)²
         = (1/n) · Σ (xi² − 2·xi·x̄ + x̄²)
         = (1/n) · Σ xi² − 2·x̄·(1/n)·Σ xi + x̄²
         = (1/n) · Σ xi² − x̄²
         = (1/n) · [ Σ xi² − (1/n)·(Σ xi)² ]

and analogously for the inductive variant:

    s_{n−1}² = (1/(n−1)) · Σ xi² − (1/(n·(n−1))) · (Σ xi)²

Example (adding the values 55, 54, 51 one at a time):

    n    xi    Σxi    Σxi²    x̄_n      s_n²     s_{n−1}²
    1    55     55    3025    55       0        /
    2    54    109    5941    54.5     0.25     0.5
    3    51    160    8542    53.33    2.89     4.33

Test for n = 3 using the values from n = 2:

    x̄_3 = x̄_{2+1} = (2/(2+1)) · x̄_2 + (1/(2+1)) · x_3 = (2/3) · 54.5 + (1/3) · 51 = 53.33

    s_3² = (1/3) · [ 8542 − (1/3) · 160² ] = 2847.33 − 2844.44 = 2.89

    s_{n−1}² = (1/2) · 8542 − (1/(3·2)) · 160² = 4271 − 4266.67 = 4.33

Note:
For large samples, both sum values are large and almost equally sized.
This may result in rounding errors!
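The recursion for the mean and the sum-based variance forms can be verified against the example table (a sketch; Python is our choice of language):

```python
# Recursion for the mean and the sum-based variance forms, reproducing the
# example table (values 55, 54, 51 added one at a time).
values = [55, 54, 51]
n = s1 = s2 = 0          # count, running sum of x_i, running sum of x_i^2
mean = 0.0
for x in values:
    mean = n / (n + 1) * mean + x / (n + 1)   # x_{n+1} recursion
    n += 1
    s1 += x
    s2 += x * x

var_n = s2 / n - (s1 / n) ** 2                # s_n^2 = (1/n) sum x^2 - mean^2
var_n1 = (s2 - s1 * s1 / n) / (n - 1)         # s_{n-1}^2, inductive variant
print(round(mean, 2), round(var_n, 2), round(var_n1, 2))
```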
2. Statistics
2.2 Descriptive Statistics Statistical Correlations

In many experiments, several attributes, e.g. size, weight, eye color, ..., are measured on one attribute carrier, e.g. a person.
It may be of interest whether correlations or dependencies exist between these attributes: e.g. do people of large size also
have large weight or special eye colors, does alcohol consumption influence driver reaction times?

In a random sample, multiplets (xi, yi, zi, ...) of attributes X, Y, Z, ... are measured.
For each attribute the statistical moments (usually mean x̄, ȳ, z̄, ... and variance s_x², s_y², s_z², ...) are calculated.

As a measurement of common tendency (meaning that larger values of x are connected with larger or smaller values of y or z),
a covariance is defined:

    Cov(X, Y) = s_xy = (1/n) · Σ_{i=1}^{n} (xi − x̄)(yi − ȳ)        Note: the symbol of the covariance is s_xy, not s²_xy

Note: in many textbooks, without explicit mention, the covariance is also treated as an expectation value <s_xy> for the unknown total
population N; therefore the 1/n is replaced by 1/(n−1). This is not correct when dealing with descriptive statistics.

If Cov is positive (negative), larger x-values are connected with larger (smaller) y-values.
If Cov -> 0, there is hardly any relationship.
The Cov is limited by: |Cov| ≤ s_x · s_y
If |Cov| -> s_x · s_y, the relationship is nearly linear.

Because the attributes X, Y, Z, ... may be measured on different scales, the value of the covariance depends on the scales.
A "normalized covariance" is obtained by dividing the covariance s_xy by the standard deviations s_x and s_y; it is called the
correlation coefficient r:

    r = s_xy / (s_x · s_y)
2. Statistics
2.2 Descriptive Statistics Statistical Correlations
Example:

From n = 5 persons the height X and the weight Y are measured. The data are: (172, 70); (155, 77); (180, 85); (162, 70); (191, 83)

    x̄ = (1/n) · Σ xi = (1/5) · (172 + 155 + 180 + 162 + 191) = 172 [cm]

    ȳ = (1/n) · Σ yi = (1/5) · (70 + 77 + 85 + 70 + 83) = 77 [kg]

    s_x² = (1/n) · Σ (xi − x̄)²
         = [ (172 − 172)² + (155 − 172)² + (180 − 172)² + (162 − 172)² + (191 − 172)² ] / 5 = 162.8 [cm²]

    s_x = √s_x² = 12.76 [cm]

    s_y² = (1/n) · Σ (yi − ȳ)²
         = [ (70 − 77)² + (77 − 77)² + (85 − 77)² + (70 − 77)² + (83 − 77)² ] / 5 = 39.6 [kg²]

    s_y = √s_y² = 6.29 [kg]

    Cov(X, Y) = s_xy = (1/n) · Σ (xi − x̄)(yi − ȳ)
              = [ (172 − 172)(70 − 77) + (155 − 172)(77 − 77) + ... ] / 5 = 49.6 [cm·kg]

The value of Cov is positive -> increasing x-values (height) are usually connected with increasing y-values (weight).

    Cov(X, Y) = s_xy = 49.6 [cm·kg] ≤ s_x · s_y = 12.8 [cm] · 6.3 [kg] = 80.64 [cm·kg]

    r = s_xy / (s_x · s_y) = 49.6 [cm·kg] / (12.8 [cm] · 6.3 [kg]) = 0.615

The correlation coefficient r ≈ 61% -> the linear correlation is not too strong.

[Figure: scatter plot of weight [kg] versus size [cm] for the five persons]
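The whole height/weight computation fits in a few lines (a sketch in Python, our choice of language), reproducing the numbers above:

```python
from math import sqrt

# Height/weight example: n = 5 persons, covariance and correlation coefficient.
X = [172, 155, 180, 162, 191]    # height [cm]
Y = [70, 77, 85, 70, 83]         # weight [kg]
n = len(X)

mx, my = sum(X) / n, sum(Y) / n
sx = sqrt(sum((x - mx) ** 2 for x in X) / n)
sy = sqrt(sum((y - my) ** 2 for y in Y) / n)
cov = sum((x - mx) * (y - my) for x, y in zip(X, Y)) / n

r = cov / (sx * sy)
print(mx, my, round(cov, 1), round(r, 3))
```

Computed with unrounded standard deviations, r comes out slightly above the slide's 0.615, which used the rounded values 12.8 and 6.3.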

2. Statistics
2.2 Descriptive Statistics Statistical Correlations

For estimating which value of y we would expect for a given x-value, we can assume a mathematical dependence
(linear, power law, ...) between these attributes. Because the (X,Y)-values are subject to statistical fluctuations, we try to find
a function with a minimum of deviations from all values. This task is done by the so-called regression analysis.

If a linear dependence is assumed, a "best fit" can be calculated by:

    Y = a + b · X        with:  b = Cov(X, Y) / s_x²    and    a = ȳ − b · x̄

In our example:

    b = Cov(X, Y) / s_x² = 49.6 [cm·kg] / 162.8 [cm²] = 0.3047 [kg/cm]

    a = ȳ − b · x̄ = 77 [kg] − 0.3047 [kg/cm] · 172 [cm] = 24.6 [kg]

We can make a "guess" or "forecast": given an x-value, we find an estimated y-value.
E.g. for an x-value of 175 cm we would expect a y-value of:

    y = a + b · x = 24.6 kg + 0.3047 kg/cm · 175 cm = 77.9 kg

In the opposite direction we can do the same to find an estimated x-value for a given y-value:

    X = a′ + b′ · Y        with:  b′ = Cov(X, Y) / s_y²    and    a′ = x̄ − b′ · ȳ

    b′ = Cov(X, Y) / s_y² = 49.6 [cm·kg] / 39.6 [kg²] = 1.2525 [cm/kg]

    a′ = x̄ − b′ · ȳ = 172 [cm] − 1.2525 [cm/kg] · 77 [kg] = 75.56 [cm]

[Figure: scatter plot of weight [kg] versus size [cm] with the fitted line Y = A + B·X; fit parameters (value, error): A = 24.59705 ± 38.62006, B = 0.30467 ± 0.22392]
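The best-fit coefficients and the 175 cm forecast can be verified with the same formulas (a sketch in Python, our choice of language):

```python
# Linear regression y = a + b*x with b = Cov(X, Y) / s_x^2, using the
# height/weight data of the example.
X = [172, 155, 180, 162, 191]
Y = [70, 77, 85, 70, 83]
n = len(X)
mx, my = sum(X) / n, sum(Y) / n

sx2 = sum((x - mx) ** 2 for x in X) / n                    # 162.8 cm^2
cov = sum((x - mx) * (y - my) for x, y in zip(X, Y)) / n   # 49.6 cm*kg

b = cov / sx2            # slope [kg/cm]
a = my - b * mx          # intercept [kg]

print(round(a + b * 175, 1))   # forecast for x = 175 cm
```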
2 Statistics

2.1 Overview

2.2 Describing (descriptive) Statistics

2.3 Probability Calculations

2.4 Concluding (inductive) Statistics

2. Statistics
2.3 Probability Calculations Overview

Historically, the origin of probability calculations lies in the prediction of events in gambling.

One of the pioneers was Blaise Pascal (1623 - 1662):

    1658: Histoire de la roulette,
    Suite de l'histoire de la roulette

Descriptive statistics, as presented in the previous chapter, only describes results of samples
(but sometimes it is pretended that such values are representative for the total population, which is not allowed).

In probability calculations, mathematical methods (e.g. mathematical functions such as the normal
distribution equation) are used to calculate the probability of future events. No experimental values are used!
Probability calculations together with mathematical statistics are called stochastics.

Inductive statistics, presented in the final chapter (2.4), will combine the experimental results from samples
with probability calculations to obtain estimations for the unknown total population.

2. Statistics
2.3 Probability Calculations Random Variables and Probability Distributions

Construction of probability distributions (functions):

We may know the probability for the occurrence of events

by theory:
- flipping a coin -> theory predicts a 50% probability of receiving head or tail
- rolling a die -> theory predicts a probability of 1/6 of receiving a particular number

by observation:
- collecting data -> we can fit a function through the observation values (see page 2-7)

[Figure: histogram of the frequency densities from page 2-7 with a fitted continuous curve]

We can construct two functions:

f(x): a probability density function (= number of events per interval [a,b])
F(x): a cumulative probability distribution (= summing up all values from the lowest to the selected value)

When the parameter x being measured can only take on certain values, such as colors (red, blue, ...) or integers (number of parts, ...),
the probability distribution is called a discrete distribution.
The single probability of each allowed event i corresponds to the relative frequency pi (see page 2-7).

When the parameter x can take "any" numerical value, the probability function is called a continuous distribution.
Because a continuous variable can exhibit "any" value within a smaller and smaller range (interval), no probability can be determined
for a single value of x. Instead, the probability P(a<x<b) to find a value x in a given interval [a,b] can be calculated.

2. Statistics
2.3 Probability Calculations Random Variables and Probability Distributions
Commonly used definitions:

X: the observed variable or attribute (e.g. color, size, number, ...)
xi: a special observed, random value of X (e.g. "red", 182 cm, 6, ...)
for discrete random variables, P(X = xi) is the probability of a special value xi
for continuous random variables, P(a < xi < b) is the probability to find xi in the interval [a,b]

Probability (Density) Function PDF:

    discrete:    P(X = x) = f(x) = p(x)                (single observed probabilities pi, e.g. P(xi = 3))
    continuous:  P(a < x < b) = ∫_a^b f_X(x) dx        (e.g. P(3 < xi < 5))

(Cumulative) Distribution Function CDF:  P(X ≤ xi) = F(xi)

    discrete:    F(x) = Σ_{xi ≤ x} f(xi)
    continuous:  F(x) = ∫_{−∞}^{x} f(x) dx

    P(a < xi < b) = F(b) − F(a)

[Figure: sketches of a discrete PDF and CDF (bars at x = 0, 1, 2, 3) and a continuous PDF and CDF, with P(xi ≤ 3) marked; F(x) approaches 1]
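For the discrete case, the PDF-to-CDF relation can be illustrated with a fair die (a sketch of our own in Python; the die reappears as the uniform-distribution example on page 2-28):

```python
# Discrete PDF -> CDF for a fair die: f(x) = 1/6, F(x) = sum of f(xi) for xi <= x.
f = {x: 1 / 6 for x in range(1, 7)}           # probability function

def F(x):
    """Cumulative distribution function F(x) = P(X <= x)."""
    return sum(p for xi, p in f.items() if xi <= x)

# Interval probability via the CDF (for a discrete variable: P(3 < X <= 5)).
p_3_to_5 = F(5) - F(3)
print(round(F(3), 3), round(p_3_to_5, 3))
```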

2. Statistics
2.3 Probability Calculations Parameters of Probability Distributions

If the probability distribution f(x) of a random variable x is known, so-called "expectation values" for this random variable can be predicted.
These expectation values are called moments of the random variable.

mean (1st common moment):

    discrete:    µ = Σ_{i=1}^{n} xi · f(xi)
    continuous:  µ = ∫_{−∞}^{∞} x · f(x) dx

variance (2nd central moment) and standard deviation σ = √σ²:

    discrete:    σ² = Σ_{i=1}^{n} (xi − µ)² · f(xi)
    continuous:  σ² = ∫_{−∞}^{∞} (x − µ)² · f(x) dx

with rearrangement (better suited for samples):

    discrete:    σ² = Σ_{i=1}^{n} xi² · f(xi) − µ²
    continuous:  σ² = ∫_{−∞}^{∞} x² · f(x) dx − µ²

coefficient of variation:

    VC = σ / µ

These expressions hold for all distributions.
This is the same measurement of "sharpness" of a distribution we have seen on
page 2-16, also normalized so that different distributions can be compared.
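The discrete moment formulas can be checked numerically, here (our own illustration, in Python) for the Poisson distribution from the title page, where mean and variance coincide:

```python
from math import exp, factorial

# Numerical check of the discrete moment formulas for the Poisson distribution:
# mu = sum x*f(x), sigma^2 = sum x^2*f(x) - mu^2; for Poisson, sigma^2 = mu.
mu_true = 5.0
f = [mu_true ** x / factorial(x) * exp(-mu_true) for x in range(80)]  # tail is negligible

mu = sum(x * fx for x, fx in enumerate(f))
var = sum(x * x * fx for x, fx in enumerate(f)) - mu ** 2
VC = var ** 0.5 / mu
print(round(mu, 6), round(var, 6))
```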

2. Statistics
2.3 Probability Calculations Special Probability Distributions

Discrete distributions:
    Uniform distribution
    Bernoulli distribution
    Binomial distribution
    Hypergeometrical distribution
    Negative binomial distribution
    Poisson distribution
    Geometrical distribution

Continuous distributions:
    Uniform distribution (rectangular distribution)
    Triangle distribution
    Normal distribution (Gaussian distribution)
    Standard normal distribution
    Logarithmical normal distribution
    Exponential distribution
    Weibull distribution
    Gamma distribution
    Beta distribution
    Bose-Einstein distribution
    Cauchy distribution
    Chi-square distribution
    Student distribution

* bold-marked distributions are frequently used in manufacturing science



Modul 1278 ICM, 2- 26
2. Statistics
2.3 Probability Calculations Family Tree of Statistical Distributions

Hypergeometric Distribution H(x|M,N,n):

  fX(x) = [C(M,x) · C(N−M, n−x)] / C(N,n)

  with  Θ = M/N   (number of attribute carriers / total population)
  and   AS = n/N  (sample size / total population)

Approximations:

  if 0.1 < Θ < 0.9, n > 10, AS < 5%:        H(x|M,N,n) → B(x|n, M/N)
  if 0.1 < Θ < 0.9, n > 30, AS < 5%:        H(x|M,N,n) → N(n·M/N, n·(M/N)·(1 − M/N))
  if Θ ≤ 0.1 or Θ ≥ 0.9, n > 30, AS < 5%:   H(x|M,N,n) → Ps(x|n·M/N)

Binomial Distribution B(x|n,p):

  fX(x) = C(n,x) · p^x · (1 − p)^(n−x)

  if n·p·(1 − p) > 9:             B(x|n,p) → N(np, np(1 − p))
  if n·p ≤ 10 and n > 1500·p:     B(x|n,p) → Ps(x|n·p)

Poisson Distribution Ps(x|µ):

  fX(x) = (µ^x / x!) · e^(−µ)

  if µ ≥ 10:   Ps(x|µ) → N(µ, µ)

Normal Distribution N(µ, σ²):

  fX(x) = 1/(σ√(2π)) · exp(−(x − µ)²/(2σ²))


Modul 1278 ICM, 2- 27
2. Statistics
2.3 Probability Calculations Discrete Distributions: - the Uniform Distribution
Example: For the throw of a single fair die, k = 6 possible outcomes exist, all equally probable -> pi = 1/k.
This probability function is called: uniform distribution.

Probability: the relative frequency of each possible outcome xi is constant: pi = 1/k.

Probability density function (discrete uniform distribution):

  f(x) = 1/k = p

mean:

  µ = Σ over i=1..k of xi · p = (1/k) · Σ xi = (1/6) · (1 + 2 + 3 + 4 + 5 + 6) = 21/6 = 3.5

variance:

  σ² = Σ over i=1..k of xi² · p − µ² = (1/6) · (1² + 2² + 3² + 4² + 5² + 6²) − 3.5² = 91/6 − 12.25 = 2.92
  σ = 1.71

Cumulative distribution function:

  F(x) = Σ over xi ≤ x of pi

  F(1) = 1/6
  F(2) = 1/6 + 1/6 = 2/6
  ...

(Figures: probability density function p(xi) = 1/6 and cumulative distribution function F(xi) for rolling a die, possible events xi = 1..6; µ = 3.5, σ = ±1.71)
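The die calculation above can be verified exactly with Python's `fractions` module:

```python
from fractions import Fraction

# Fair die: k = 6 equally likely outcomes, each with probability p = 1/k.
k = 6
p = Fraction(1, k)

mean = sum(x * p for x in range(1, k + 1))                 # 21/6 = 7/2 = 3.5
var = sum(x * x * p for x in range(1, k + 1)) - mean ** 2  # 91/6 - 49/4 = 35/12
std = float(var) ** 0.5                                    # ~ 1.71
```

Using exact fractions shows that the slide's 2.92 is the rounded value of 35/12.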



Modul 1278 ICM, 2- 28
2. Statistics
2.3 Probability Calculations Discrete Distributions: - the Uniform Distribution
We throw one die many times and compare the observed mean and standard deviation with the true distribution values.
see java-applet at: http://www.ds.unifi.it/VL/VL_EN/sample/sample7.html

theory (see prev. page):         calculated mean = 3.50,      calculated standard deviation: 1.71

1. experiment (10 throws):       experimental mean = 3.80,    experimental standard deviation: 1.81
2. experiment (new 10 throws):   experimental mean = 3.70,    experimental standard deviation: 1.57
3. experiment (100 ! throws):    experimental mean = 3.54,    experimental standard deviation: 1.67
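If the applet is unavailable, the experiment can be reproduced in a few lines of Python (the seed is arbitrary; as on the slide, each run gives slightly different values that approach 3.50 and 1.71 with more throws):

```python
import random
from statistics import mean, pstdev

random.seed(42)  # arbitrary seed so the run is repeatable

# Same sample sizes as the three experiments on the slide.
for n in (10, 10, 100):
    throws = [random.randint(1, 6) for _ in range(n)]
    print(n, "throws -> mean:", round(mean(throws), 2),
          " std:", round(pstdev(throws), 2))
```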



Modul 1278 ICM, 2- 29
2. Statistics
2.3 Probability Calculations Discrete Distributions: - the Bernoulli Distribution

The "Bernoulli trial" refers to a single event which can have one of two possible outcomes, each occurring with a fixed probability.
These events can be described as "yes or no" questions.
For example: Will the coin land heads? Will the newborn child be a girl? Are a random person's eyes green?

Probability: the random variable X can only take 2 values: k = 1 for "yes" and k = 0 for "no".
Usually we do not have a binary system (1 coin + 2 possibilities), but a more general one:
(of N = 15 persons (total population), M = 3 (number of attribute carriers) have green eyes)

The probability of a "yes" can be calculated:

  P(1) = p = M/N = 3/15 = 0.2 = 20%,   0 ≤ p ≤ 1   (* p corresponds to Θ from p. 2-27)

Probability density function:

  f(x) = p        if k = 1
  f(x) = 1 − p    if k = 0

mean:       µ = p             (example: µ = 0.2)
variance:   σ² = p · (1 − p)  (example: σ² = 0.2 · (1 − 0.2) = 0.16, σ = 0.4)

Cumulative distribution function:

  F(x) = 0        if k < 0
  F(x) = 1 − p    if 0 ≤ k < 1
  F(x) = 1        if k ≥ 1

skewness:   γ = ((1 − p) − p) / √(p · (1 − p))

(Figure: bar chart of p(xi) with bar height 1 − p at k = 0 "no" and p at k = 1 "yes")
Modul 1278 ICM, 2- 30
2. Statistics
2.3 Probability Calculations Discrete Distributions: - the Binomial Distribution
In probability theory and statistics, the Binomial Distribution is the discrete probability distribution of the number x of successes
in a sequence of n independent yes/no experiments (Bernoulli trials), each of which yields success with constant probability p.
In fact, when n = 1, the binomial distribution is a Bernoulli distribution.
"Independent" means that the composition M/N from the Bernoulli trial is not changed by taking out samples. This can
be achieved by putting back every sample after each draw, or if the numbers M, N are huge and the sample n taken out is small.

Probability: the random variable X counts the number of "yes" outcomes in n trials.
(example: of N = 15 persons, M = 3 have green eyes; we take out n = 5 and hope 2 are "green")

The probability of a "yes" in a single trial is:  P(1) = p = M/N,   0 ≤ p ≤ 1

Probability density function:

  f(x) = C(n,x) · p^x · (1 − p)^(n−x)

The binomial coefficients C(n,x) are defined:

  C(n,x) = [n · (n−1) · (n−2) · ... · (n−x+1)] / [1 · 2 · 3 · ... · x] = n! / ((n−x)! · x!)

mean:       µ = Σ over x=0..n of x · f(x) = ..... = n·p
variance:   σ² = Σ over x=0..n of x² · f(x) − µ² = ...... = n·p·(1 − p)
skewness:   γ = ((1 − p) − p) / √(n·p·(1 − p))

Cumulative distribution function:
  F(x) can be expressed in terms of the regularized incomplete beta function: very complicated!

Approximations: Poisson distribution, Normal distribution

(Figure: probability density fX(x) over the number x of defective parts, for n = 5, p = 0.05 and n = 8, p = 0.4)
Modul 1278 ICM, 2- 31
2. Statistics
2.3 Probability Calculations Discrete Distributions: - the Binomial Distribution
Example: Yield and quality control

On a 1-Gb DRAM (= the total number N of MOSFETs is huge) the share of defective memory cells is about 3% (p = 0.03).
For time reasons only n = 100 cells will be tested.
What is the probability of finding x = 4 defective cells?

Answer:

Please note:
In reality this is not a binomial experiment, but a hypergeometric one (very complicated, see next page),
because the testing of the memory-cell sequence is not independent (once a cell is measured, it will not be measured again);
this means: the measured cell is taken out of the total population, which changes the probability for the rest.
But: because the total number N is huge and the number of measured cells (100) is small, the probability basically will not change.

What is the probability of finding no more than 3 defective cells in this measurement of 100 cells?

Answer:
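A sketch of the two calculations, under the binomial assumption the slide argues for (n = 100, p = 0.03):

```python
from math import comb

n, p = 100, 0.03  # tested cells and defect probability from the example

def binom_pmf(x):
    # f(x) = C(n, x) * p^x * (1-p)^(n-x)
    return comb(n, x) * p**x * (1 - p) ** (n - x)

p4 = binom_pmf(4)                             # P(exactly 4 defective) ~ 0.171
p_le3 = sum(binom_pmf(x) for x in range(4))   # P(no more than 3)      ~ 0.647
```

So finding exactly 4 defective cells has about a 17% probability, and finding at most 3 about 65%.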



Modul 1278 ICM, 2- 32
2. Statistics
2.3 Probability Calculations Discrete Distributions: - the Hypergeometric Distribution

In probability theory and statistics, the Hypergeometric Distribution is a discrete probability distribution that describes
the number x of successes in a sequence of n draws without replacement from a finite population of size N containing M attribute carriers.
The numbers N, M therefore change during the draw sequence. This is the difference from the binomial distribution.

Probability density function:

  fX(x) = [C(M,x) · C(N−M, n−x)] / C(N,n)

  M: attribute carriers in the total population ("total green ones")
  N: total population
  n: sample size
  x: attribute carriers found in the sample ("expected greens")

mean:       µ = Σ over x=0..n of x · f(x) = ..... = n · (M/N) = n·p
variance:   σ² = Σ over x=0..n of x² · f(x) − µ² = ...... = n·p·(1 − p) · (N − n)/(N − 1)

Cumulative distribution function:  no analytical expression!

Approximations:

  If N and M are large compared to n and p is not close to 0 or 1,
  then the binomial distribution is ok (see exercise previous page).
  If in addition n is large, then a normal distribution is ok.

(Figure: hypergeometric density fX(x) for N = 1000, M = 400, n = 5)



Modul 1278 ICM, 2- 33
2. Statistics
2.3 Probability Calculations Discrete Distributions: - the Hypergeometric Distribution
Example:

From a total amount of N = 100 pieces, which contains M = 5 scrap parts (scrap rate Θ = M/N = 5%), n = 5 pieces are taken out.
How high is the probability of finding 1 or more defective parts among the 5 pieces drawn?

(= probability of not finding 0 defects = 1 − f(x=0))
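The answer can be checked numerically; a sketch using the hypergeometric density from the previous page:

```python
from math import comb

N, M, n = 100, 5, 5  # population, scrap parts, sample size from the example

def hyper_pmf(x):
    # f(x) = C(M, x) * C(N-M, n-x) / C(N, n)
    return comb(M, x) * comb(N - M, n - x) / comb(N, n)

p_zero = hyper_pmf(0)           # probability of no defective part in the sample
p_at_least_one = 1 - p_zero     # ~ 0.23
```

So despite the small 5% scrap rate, a 5-piece sample catches at least one defect with roughly 23% probability.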



Modul 1278 ICM, 2- 34
2. Statistics
2.3 Probability Calculations Discrete Distributions: - the Poisson Distribution
In probability theory and statistics, the Poisson distribution is a discrete probability distribution that expresses the probability of
a number x of random events occurring in a fixed unit (time, space), if these events occur with a known average rate λ.

Examples: • passing cars per hour (λ = 142) -> what is the probability that only xi = 100 cars will pass?
          • number of defects/wafer (λ = 6) -> what is the probability of finding a wafer with xi = 0 defects?

λ is a positive real number, equal to the expected number of occurrences during the given interval. For instance,
if the events occur on average every 4 minutes, and you are interested in the number of events occurring in a 10-minute interval,
you would use as model a Poisson distribution with λ = 10/4 = 2.5.

The advantage of the Poisson distribution is that probability calculations can be done immediately from a quick estimate of the mean value µ.

Probability density function:

  f(x) = (λ^x / x!) · e^(−λ)

mean:       µ = λ
variance:   σ² = λ
skewness:   γ = 1/√λ

Cumulative distribution function:

  F(x) = Σ over k=0..x of (λ^k / k!) · e^(−λ)

Approximations: for λ >> 1 the discrete Poisson distribution can be approximated by a continuous Normal distribution.

(Figure: Poisson density functions f(x) for µ = 0.3, µ = 1 and µ = 5, over the number per unit [time])
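The wafer-defect example above (λ = 6, x = 0) can be evaluated directly:

```python
from math import exp, factorial

def poisson_pmf(x, lam):
    # f(x) = lam^x / x! * exp(-lam)
    return lam**x * exp(-lam) / factorial(x)

lam = 6  # average defects per wafer, from the slide's example
p_zero = poisson_pmf(0, lam)  # chance of a defect-free wafer, ~ 0.25 %
```

With an average of 6 defects per wafer, only about 1 wafer in 400 is defect-free.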



Modul 1278 ICM, 2- 35
2. Statistics
2.3 Probability Calculations Continuous Distributions: - the Normal Distribution

In probability theory and statistics, the Normal distribution is a continuous probability distribution that expresses the probability
of a random event x, by assuming that many small, independent effects contribute additively to each observation xi.

Probability density function:

  f(x) = 1/(σ√(2π)) · exp(−(x − µ)²/(2σ²))

mean:       µ = µ
variance:   σ² = σ²

Compared to the Poisson distribution, the Normal distribution exhibits 2 independent shape parameters: mean µ and variance σ².
This makes calculations more complicated.
The σ-value is the distance from the mean µ to the inflection points of the curve.

Cumulative distribution function:

  F(x ≤ a) = 1/(σ√(2π)) · ∫ from −∞ to a of exp(−(x − µ)²/(2σ²)) dx

skewness:   γ = 0

A good indicator of data having a normal distribution is skewness in the range of −0.8 to 0.8 and kurtosis in the range of −3.0 to 3.0.

(Figures: normal density functions with µ = 20 and various σ = 2, 5, 10; standard normal distribution (µ = 0, σ = 1) with normalized density function f(x) and cumulative distribution function F(x))



Modul 1278 ICM, 2- 36
2. Statistics
2.3 Probability Calculations Parameters of the Normal Distribution
Derivatives of the Normal density function:

  f(x) = 1/(σ√(2π)) · exp(−(x − µ)²/(2σ²))

Calculating the maximum with the first derivative:

  f′(x) = 1/(σ√(2π)) · exp(−(x − µ)²/(2σ²)) · (−(x − µ)/σ²) = −((x − µ)/σ²) · f(x)

  f′(x) = 0 = −((x − µ)/σ²) · f(x)   ⇒   x = µ

  Maximum at:  x = µ
  Value of the function at the maximum:  f(µ) = 1/(σ·√(2π)) · 1 ≈ 0.4/σ

Calculating the inflection points with the second derivative:

  f″(x) = (1/σ²) · ((x − µ)²/σ² − 1) · f(x)

  f″(x) = 0   ⇒   x = µ ± σ

  Inflection points at:  x = µ ± σ
  Value of the function at the inflection points:  f(µ ± σ) = 1/(σ·√(2πe)) ≈ 0.24/σ
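Both characteristic values can be checked numerically (µ and σ are arbitrary example values):

```python
from math import exp, pi, sqrt

def normal_pdf(x, mu, sigma):
    # f(x) = 1 / (sigma * sqrt(2*pi)) * exp(-(x - mu)^2 / (2*sigma^2))
    return exp(-((x - mu) ** 2) / (2 * sigma**2)) / (sigma * sqrt(2 * pi))

mu, sigma = 20.0, 5.0                            # arbitrary example values
peak = normal_pdf(mu, mu, sigma)                 # value at the maximum, ~ 0.4 / sigma
inflection = normal_pdf(mu + sigma, mu, sigma)   # value at x = mu + sigma, ~ 0.24 / sigma
```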
Modul 1278 ICM, 2- 37
2. Statistics
2.3 Probability Calculations Continuous Distributions: - the Standard Normal Distribution

The standard normal distribution is a normal distribution with a mean of µ = 0 and a standard deviation of σ = 1.
Normal distributions f(x), F(x) can be transformed to standard normal distributions g(z), G(z) by the formula:

  z = (x − µx)/σx       x = µx + z·σx       X ~ N(µ, σ²)  ↔  Z ~ N(0, 1)

where x is a variable from the original normal distribution, µx is the mean of the original normal distribution,
and σx is the standard deviation of the original normal distribution.
The standard normal distribution is sometimes called the z-distribution.

With this transformation the old mathematicians (Laplace, Gauss) were able to calculate a universal table of z, g(z) and G(z) values
(but each value required a long-winded calculation procedure by hand).
Once the table was finished, any Normal distribution could be calculated very easily using the z-transformation and the z-table.

Transformation of x, f(x), F(x):

  z = (x − µx)/σx
  fx(x, µx, σx) = (1/σx) · gz((x − µx)/σx)
  Fx(x, µx, σx) = Gz((x − µx)/σx)

From z = (x − µx)/σx it results: dx = σx · dz, because:

  dz/dx = d/dx((x − µ)/σ) = d/dx(x/σ) − d/dx(µ/σ) = 1/σ − 0 = 1/σ

These transformation equations and relations are frequently used in calculations with Normal distributions.
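Today the z-table is replaced by the error function. A sketch of the z-transformation applied to the threshold-voltage example from the next slide (the 1.33 V mean is from the slides; the 0.1 V standard deviation is an assumed value for illustration):

```python
from math import erf, sqrt

def std_normal_cdf(z):
    # G(z) = 0.5 * (1 + erf(z / sqrt(2))) -- the tabulated FZ value
    return 0.5 * (1 + erf(z / sqrt(2)))

mu_x, sigma_x = 1.33, 0.1   # mean from the slides; sigma is an assumed value [V]
x = 1.5
z = (x - mu_x) / sigma_x    # z-transformation: z = (x - mu_x) / sigma_x = 1.7
prob = std_normal_cdf(z)    # P(X <= 1.5 V), same number the z-table gives for z = 1.7
```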



Modul 1278 ICM, 2- 38
2. Statistics
2.3 Probability Calculations Continuous Distributions: - the Standard Normal Distribution

Frequent questions using the normal distribution are:

What is the probability that x percent of the possible events take place?   Fz(−∞, a)
  We have to solve the integral Fz from −∞ to a (or −∞ to x for the given normal distribution).

Often we are interested in the probability that events differ from the mean value by up to a given amount, e.g.:   F1(0, a)
  - we expect 100 persons to come to our birthday party; what is the probability that up to 108 friends will come?
  - we fabricate devices on our IC chips with a threshold voltage of 1.33 V; what is the probability
    of finding devices with threshold voltages up to 1.5 V?
  We have to solve the integral F1 from µ to x (or µ = 0 to a).

Usually the customer asks the manufacturer for products which have a mean value µ and allowed variations of ±a.   F2(−a, +a)
  The manufacturer has to find out how many of his products are in the range of ±a,
  because only these products will the customer accept (and pay for!).
  We have to solve the integral F2 from −a to +a.

These "red" values Fz, F1 and F2 are tabulated on the next page.



Modul 1278 ICM, 2- 39
2. Statistics
2.3 Probability Calculations    Table of Standard Normal Distribution Function

(Sketches: Fz = area from −∞ to z; F1 = area from 0 to z; F2 = area from −z to +z)

Column layout, repeated six times per row:   z   g(z)   FZ   F1   F2
0,01 0,3989 0,5040 0,0040 0,0080 0,51 0,3503 0,6950 0,1950 0,3899 1,01 0,2396 0,8438 0,3438 0,6875 1,51 0,1276 0,9345 0,4345 0,8690 2,01 0,0529 0,9778 0,4778 0,9556 2,51 0,0171 0,9940 0,4940 0,9879
0,02 0,3989 0,5080 0,0080 0,0160 0,52 0,3485 0,6985 0,1985 0,3969 1,02 0,2371 0,8461 0,3461 0,6923 1,52 0,1257 0,9357 0,4357 0,8715 2,02 0,0519 0,9783 0,4783 0,9566 2,52 0,0167 0,9941 0,4941 0,9883
0,03 0,3988 0,5120 0,0120 0,0239 0,53 0,3467 0,7019 0,2019 0,4039 1,03 0,2347 0,8485 0,3485 0,6970 1,53 0,1238 0,9370 0,4370 0,8740 2,03 0,0508 0,9788 0,4788 0,9576 2,53 0,0163 0,9943 0,4943 0,9886
0,04 0,3986 0,5160 0,0160 0,0319 0,54 0,3448 0,7054 0,2054 0,4108 1,04 0,2323 0,8508 0,3508 0,7017 1,54 0,1219 0,9382 0,4382 0,8764 2,04 0,0498 0,9793 0,4793 0,9586 2,54 0,0158 0,9945 0,4945 0,9889
0,05 0,3984 0,5199 0,0199 0,0399 0,55 0,3429 0,7088 0,2088 0,4177 1,05 0,2299 0,8531 0,3531 0,7063 1,55 0,1200 0,9394 0,4394 0,8789 2,05 0,0488 0,9798 0,4798 0,9596 2,55 0,0154 0,9946 0,4946 0,9892
0,06 0,3982 0,5239 0,0239 0,0478 0,56 0,3410 0,7123 0,2123 0,4245 1,06 0,2275 0,8554 0,3554 0,7109 1,56 0,1182 0,9406 0,4406 0,8812 2,06 0,0478 0,9803 0,4803 0,9606 2,56 0,0151 0,9948 0,4948 0,9895
0,07 0,3980 0,5279 0,0279 0,0558 0,57 0,3391 0,7157 0,2157 0,4313 1,07 0,2251 0,8577 0,3577 0,7154 1,57 0,1163 0,9418 0,4418 0,8836 2,07 0,0468 0,9808 0,4808 0,9615 2,57 0,0147 0,9949 0,4949 0,9898
0,08 0,3977 0,5319 0,0319 0,0638 0,58 0,3372 0,7190 0,2190 0,4381 1,08 0,2227 0,8599 0,3599 0,7199 1,58 0,1145 0,9429 0,4429 0,8859 2,08 0,0459 0,9812 0,4812 0,9625 2,58 0,0143 0,9951 0,4951 0,9901
0,09 0,3973 0,5359 0,0359 0,0717 0,59 0,3352 0,7224 0,2224 0,4448 1,09 0,2203 0,8621 0,3621 0,7243 1,59 0,1127 0,9441 0,4441 0,8882 2,09 0,0449 0,9817 0,4817 0,9634 2,59 0,0139 0,9952 0,4952 0,9904
0,10 0,3970 0,5398 0,0398 0,0797 0,60 0,3332 0,7257 0,2257 0,4515 1,10 0,2179 0,8643 0,3643 0,7287 1,60 0,1109 0,9452 0,4452 0,8904 2,10 0,0440 0,9821 0,4821 0,9643 2,60 0,0136 0,9953 0,4953 0,9907
0,11 0,3965 0,5438 0,0438 0,0876 0,61 0,3312 0,7291 0,2291 0,4581 1,11 0,2155 0,8665 0,3665 0,7330 1,61 0,1092 0,9463 0,4463 0,8926 2,11 0,0431 0,9826 0,4826 0,9651 2,61 0,0132 0,9955 0,4955 0,9909
0,12 0,3961 0,5478 0,0478 0,0955 0,62 0,3292 0,7324 0,2324 0,4647 1,12 0,2131 0,8686 0,3686 0,7373 1,62 0,1074 0,9474 0,4474 0,8948 2,12 0,0422 0,9830 0,4830 0,9660 2,62 0,0129 0,9956 0,4956 0,9912
0,13 0,3956 0,5517 0,0517 0,1034 0,63 0,3271 0,7357 0,2357 0,4713 1,13 0,2107 0,8708 0,3708 0,7415 1,63 0,1057 0,9484 0,4484 0,8969 2,13 0,0413 0,9834 0,4834 0,9668 2,63 0,0126 0,9957 0,4957 0,9915
0,14 0,3951 0,5557 0,0557 0,1113 0,64 0,3251 0,7389 0,2389 0,4778 1,14 0,2083 0,8729 0,3729 0,7457 1,64 0,1040 0,9495 0,4495 0,8990 2,14 0,0404 0,9838 0,4838 0,9676 2,64 0,0122 0,9959 0,4959 0,9917
0,15 0,3945 0,5596 0,0596 0,1192 0,65 0,3230 0,7422 0,2422 0,4843 1,15 0,2059 0,8749 0,3749 0,7499 1,65 0,1023 0,9505 0,4505 0,9011 2,15 0,0396 0,9842 0,4842 0,9684 2,65 0,0119 0,9960 0,4960 0,9920
0,16 0,3939 0,5636 0,0636 0,1271 0,66 0,3209 0,7454 0,2454 0,4907 1,16 0,2036 0,8770 0,3770 0,7540 1,66 0,1006 0,9515 0,4515 0,9031 2,16 0,0387 0,9846 0,4846 0,9692 2,66 0,0116 0,9961 0,4961 0,9922
0,17 0,3932 0,5675 0,0675 0,1350 0,67 0,3187 0,7486 0,2486 0,4971 1,17 0,2012 0,8790 0,3790 0,7580 1,67 0,0989 0,9525 0,4525 0,9051 2,17 0,0379 0,9850 0,4850 0,9700 2,67 0,0113 0,9962 0,4962 0,9924
0,18 0,3925 0,5714 0,0714 0,1428 0,68 0,3166 0,7517 0,2517 0,5035 1,18 0,1989 0,8810 0,3810 0,7620 1,68 0,0973 0,9535 0,4535 0,9070 2,18 0,0371 0,9854 0,4854 0,9707 2,68 0,0110 0,9963 0,4963 0,9926
0,19 0,3918 0,5753 0,0753 0,1507 0,69 0,3144 0,7549 0,2549 0,5098 1,19 0,1965 0,8830 0,3830 0,7660 1,69 0,0957 0,9545 0,4545 0,9090 2,19 0,0363 0,9857 0,4857 0,9715 2,69 0,0107 0,9964 0,4964 0,9929
0,20 0,3910 0,5793 0,0793 0,1585 0,70 0,3123 0,7580 0,2580 0,5161 1,20 0,1942 0,8849 0,3849 0,7699 1,70 0,0940 0,9554 0,4554 0,9109 2,20 0,0355 0,9861 0,4861 0,9722 2,70 0,0104 0,9965 0,4965 0,9931
0,21 0,3902 0,5832 0,0832 0,1663 0,71 0,3101 0,7611 0,2611 0,5223 1,21 0,1919 0,8869 0,3869 0,7737 1,71 0,0925 0,9564 0,4564 0,9127 2,21 0,0347 0,9864 0,4864 0,9729 2,71 0,0101 0,9966 0,4966 0,9933
0,22 0,3894 0,5871 0,0871 0,1741 0,72 0,3079 0,7642 0,2642 0,5285 1,22 0,1895 0,8888 0,3888 0,7775 1,72 0,0909 0,9573 0,4573 0,9146 2,22 0,0339 0,9868 0,4868 0,9736 2,72 0,0099 0,9967 0,4967 0,9935
0,23 0,3885 0,5910 0,0910 0,1819 0,73 0,3056 0,7673 0,2673 0,5346 1,23 0,1872 0,8907 0,3907 0,7813 1,73 0,0893 0,9582 0,4582 0,9164 2,23 0,0332 0,9871 0,4871 0,9743 2,73 0,0096 0,9968 0,4968 0,9937
0,24 0,3876 0,5948 0,0948 0,1897 0,74 0,3034 0,7704 0,2704 0,5407 1,24 0,1849 0,8925 0,3925 0,7850 1,74 0,0878 0,9591 0,4591 0,9181 2,24 0,0325 0,9875 0,4875 0,9749 2,74 0,0093 0,9969 0,4969 0,9939
0,25 0,3867 0,5987 0,0987 0,1974 0,75 0,3011 0,7734 0,2734 0,5467 1,25 0,1826 0,8944 0,3944 0,7887 1,75 0,0863 0,9599 0,4599 0,9199 2,25 0,0317 0,9878 0,4878 0,9756 2,75 0,0091 0,9970 0,4970 0,9940
0,26 0,3857 0,6026 0,1026 0,2051 0,76 0,2989 0,7764 0,2764 0,5527 1,26 0,1804 0,8962 0,3962 0,7923 1,76 0,0848 0,9608 0,4608 0,9216 2,26 0,0310 0,9881 0,4881 0,9762 2,76 0,0088 0,9971 0,4971 0,9942
0,27 0,3847 0,6064 0,1064 0,2128 0,77 0,2966 0,7794 0,2794 0,5587 1,27 0,1781 0,8980 0,3980 0,7959 1,77 0,0833 0,9616 0,4616 0,9233 2,27 0,0303 0,9884 0,4884 0,9768 2,77 0,0086 0,9972 0,4972 0,9944
0,28 0,3836 0,6103 0,1103 0,2205 0,78 0,2943 0,7823 0,2823 0,5646 1,28 0,1758 0,8997 0,3997 0,7995 1,78 0,0818 0,9625 0,4625 0,9249 2,28 0,0297 0,9887 0,4887 0,9774 2,78 0,0084 0,9973 0,4973 0,9946
0,29 0,3825 0,6141 0,1141 0,2282 0,79 0,2920 0,7852 0,2852 0,5705 1,29 0,1736 0,9015 0,4015 0,8029 1,79 0,0804 0,9633 0,4633 0,9265 2,29 0,0290 0,9890 0,4890 0,9780 2,79 0,0081 0,9974 0,4974 0,9947
0,30 0,3814 0,6179 0,1179 0,2358 0,80 0,2897 0,7881 0,2881 0,5763 1,30 0,1714 0,9032 0,4032 0,8064 1,80 0,0790 0,9641 0,4641 0,9281 2,30 0,0283 0,9893 0,4893 0,9786 2,80 0,0079 0,9974 0,4974 0,9949
0,31 0,3802 0,6217 0,1217 0,2434 0,81 0,2874 0,7910 0,2910 0,5821 1,31 0,1691 0,9049 0,4049 0,8098 1,81 0,0775 0,9649 0,4649 0,9297 2,31 0,0277 0,9896 0,4896 0,9791 2,81 0,0077 0,9975 0,4975 0,9950
0,32 0,3790 0,6255 0,1255 0,2510 0,82 0,2850 0,7939 0,2939 0,5878 1,32 0,1669 0,9066 0,4066 0,8132 1,82 0,0761 0,9656 0,4656 0,9312 2,32 0,0270 0,9898 0,4898 0,9797 2,82 0,0075 0,9976 0,4976 0,9952
0,33 0,3778 0,6293 0,1293 0,2586 0,83 0,2827 0,7967 0,2967 0,5935 1,33 0,1647 0,9082 0,4082 0,8165 1,83 0,0748 0,9664 0,4664 0,9328 2,33 0,0264 0,9901 0,4901 0,9802 2,83 0,0073 0,9977 0,4977 0,9953
0,34 0,3765 0,6331 0,1331 0,2661 0,84 0,2803 0,7995 0,2995 0,5991 1,34 0,1626 0,9099 0,4099 0,8198 1,84 0,0734 0,9671 0,4671 0,9342 2,34 0,0258 0,9904 0,4904 0,9807 2,84 0,0071 0,9977 0,4977 0,9955
0,35 0,3752 0,6368 0,1368 0,2737 0,85 0,2780 0,8023 0,3023 0,6047 1,35 0,1604 0,9115 0,4115 0,8230 1,85 0,0721 0,9678 0,4678 0,9357 2,35 0,0252 0,9906 0,4906 0,9812 2,85 0,0069 0,9978 0,4978 0,9956
0,36 0,3739 0,6406 0,1406 0,2812 0,86 0,2756 0,8051 0,3051 0,6102 1,36 0,1582 0,9131 0,4131 0,8262 1,86 0,0707 0,9686 0,4686 0,9371 2,36 0,0246 0,9909 0,4909 0,9817 2,86 0,0067 0,9979 0,4979 0,9958
0,37 0,3725 0,6443 0,1443 0,2886 0,87 0,2732 0,8078 0,3078 0,6157 1,37 0,1561 0,9147 0,4147 0,8293 1,87 0,0694 0,9693 0,4693 0,9385 2,37 0,0241 0,9911 0,4911 0,9822 2,87 0,0065 0,9979 0,4979 0,9959
0,38 0,3712 0,6480 0,1480 0,2961 0,88 0,2709 0,8106 0,3106 0,6211 1,38 0,1539 0,9162 0,4162 0,8324 1,88 0,0681 0,9699 0,4699 0,9399 2,38 0,0235 0,9913 0,4913 0,9827 2,88 0,0063 0,9980 0,4980 0,9960
0,39 0,3697 0,6517 0,1517 0,3035 0,89 0,2685 0,8133 0,3133 0,6265 1,39 0,1518 0,9177 0,4177 0,8355 1,89 0,0669 0,9706 0,4706 0,9412 2,39 0,0229 0,9916 0,4916 0,9832 2,89 0,0061 0,9981 0,4981 0,9961
0,40 0,3683 0,6554 0,1554 0,3108 0,90 0,2661 0,8159 0,3159 0,6319 1,40 0,1497 0,9192 0,4192 0,8385 1,90 0,0656 0,9713 0,4713 0,9426 2,40 0,0224 0,9918 0,4918 0,9836 2,90 0,0060 0,9981 0,4981 0,9963
0,41 0,3668 0,6591 0,1591 0,3182 0,91 0,2637 0,8186 0,3186 0,6372 1,41 0,1476 0,9207 0,4207 0,8415 1,91 0,0644 0,9719 0,4719 0,9439 2,41 0,0219 0,9920 0,4920 0,9840 2,91 0,0058 0,9982 0,4982 0,9964
0,42 0,3653 0,6628 0,1628 0,3255 0,92 0,2613 0,8212 0,3212 0,6424 1,42 0,1456 0,9222 0,4222 0,8444 1,92 0,0632 0,9726 0,4726 0,9451 2,42 0,0213 0,9922 0,4922 0,9845 2,92 0,0056 0,9982 0,4982 0,9965
0,43 0,3637 0,6664 0,1664 0,3328 0,93 0,2589 0,8238 0,3238 0,6476 1,43 0,1435 0,9236 0,4236 0,8473 1,93 0,0620 0,9732 0,4732 0,9464 2,43 0,0208 0,9925 0,4925 0,9849 2,93 0,0055 0,9983 0,4983 0,9966
0,44 0,3621 0,6700 0,1700 0,3401 0,94 0,2565 0,8264 0,3264 0,6528 1,44 0,1415 0,9251 0,4251 0,8501 1,94 0,0608 0,9738 0,4738 0,9476 2,44 0,0203 0,9927 0,4927 0,9853 2,94 0,0053 0,9984 0,4984 0,9967
0,45 0,3605 0,6736 0,1736 0,3473 0,95 0,2541 0,8289 0,3289 0,6579 1,45 0,1394 0,9265 0,4265 0,8529 1,95 0,0596 0,9744 0,4744 0,9488 2,45 0,0198 0,9929 0,4929 0,9857 2,95 0,0051 0,9984 0,4984 0,9968
0,46 0,3589 0,6772 0,1772 0,3545 0,96 0,2516 0,8315 0,3315 0,6629 1,46 0,1374 0,9279 0,4279 0,8557 1,96 0,0584 0,9750 0,4750 0,9500 2,46 0,0194 0,9931 0,4931 0,9861 2,96 0,0050 0,9985 0,4985 0,9969
0,47 0,3572 0,6808 0,1808 0,3616 0,97 0,2492 0,8340 0,3340 0,6680 1,47 0,1354 0,9292 0,4292 0,8584 1,97 0,0573 0,9756 0,4756 0,9512 2,47 0,0189 0,9932 0,4932 0,9865 2,97 0,0048 0,9985 0,4985 0,9970
0,48 0,3555 0,6844 0,1844 0,3688 0,98 0,2468 0,8365 0,3365 0,6729 1,48 0,1334 0,9306 0,4306 0,8611 1,98 0,0562 0,9761 0,4761 0,9523 2,48 0,0184 0,9934 0,4934 0,9869 2,98 0,0047 0,9986 0,4986 0,9971
0,49 0,3538 0,6879 0,1879 0,3759 0,99 0,2444 0,8389 0,3389 0,6778 1,49 0,1315 0,9319 0,4319 0,8638 1,99 0,0551 0,9767 0,4767 0,9534 2,49 0,0180 0,9936 0,4936 0,9872 2,99 0,0046 0,9986 0,4986 0,9972
0,50 0,3521 0,6915 0,1915 0,3829 1,00 0,2420 0,8413 0,3413 0,6827 1,50 0,1295 0,9332 0,4332 0,8664 2,00 0,0540 0,9772 0,4772 0,9545 2,50 0,0175 0,9938 0,4938 0,9876 3,00 0,0044 0,9987 0,4987 0,9973
(1σ: F2 = 0,6827 at z = 1 · 2σ: F2 = 0,9545 at z = 2 · 3σ: F2 = 0,9973 at z = 3)
Modul 1278 ICM, 2- 40
2. Statistics
2.3 Probability Calculations Table of Standard Normal Distribution Function

In the table before, only positive z-values are listed, but negative z-values, corresponding to distribution values below 0.5,
can be calculated easily:

* the value of the probability function g(z) is symmetrical for ±z
* but the value of the integral G(z) is different: G(−z) = 1 − G(+z)

Example:
* we know the integral value G = 0.33
* because G < 0.5, the z-value must be negative: G(−z) = 0.33
* finding the corresponding value for positive z: G(+z) = 1 − G(−z) = 1 − 0.33 = 0.67
* finding the z-value: in the table, the z-value for FZ = 0.67 is z = 0.44
* thus, the z-value for G(−z) = 0.33 is z = −0.44
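The symmetry rule and the worked example can be verified with the error function:

```python
from math import erf, sqrt

def std_normal_cdf(z):
    # G(z) = 0.5 * (1 + erf(z / sqrt(2)))
    return 0.5 * (1 + erf(z / sqrt(2)))

# Worked example: G(-z) = 0.33 -> G(+z) = 1 - 0.33 = 0.67 -> z = 0.44
g_neg = std_normal_cdf(-0.44)   # ~ 0.33
g_pos = std_normal_cdf(0.44)    # ~ 0.67, the tabulated FZ value for z = 0.44
```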



Modul 1278 ICM, 2- 41
2. Statistics
2.3 Probability Calculations Properties of Normal Distributions

Linearity (= stretching)

If we have a normally distributed random variable X with µX and σX, X ~ N(µX, σX²),
then we can create a new random variable Y by Y = aX + b, which is also normally distributed, Y ~ N(µY, σY²), with:

  µY = a·µX + b     stretched mean value
  σY² = a²·σX²      stretched variance value

Example:
The organization board of a conference is confronted with the problem that they expect about 1000 participants (µX = 1000) with a
proposed standard deviation of about s = ±120 participants. This is our original X distribution.
They have to provide 8 sheets of paper for each participant (a = 8). This is an exact value, no distribution.
In addition, the organization board will need 500 sheets of paper for their own use (b = 500).
So if we have to tell the paper shop how many sheets of paper we need, we transform our X distribution (participants) into a
paper distribution Y by linear stretching:

  Y = aX + b, which is also normally distributed with µY = a·µX + b and σY² = a²·σX²
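Applying the stretching rules to the conference example:

```python
# Conference example: X ~ N(1000, 120^2) participants, Y = a*X + b sheets of paper.
a, b = 8, 500
mu_x, sigma_x = 1000, 120

mu_y = a * mu_x + b         # stretched mean: 8 * 1000 + 500 = 8500 sheets
var_y = a**2 * sigma_x**2   # stretched variance: a^2 * sigma_x^2
sigma_y = var_y ** 0.5      # 8 * 120 = 960 sheets standard deviation
```

So the paper shop should be told to expect an order around 8500 sheets, give or take roughly 960.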



Modul 1278 ICM, 2- 42
2. Statistics
2.3 Probability Calculations Properties of Normal Distributions

Additivity

If X and Y are two independent random variables and both are normally distributed, X ~ N(µX, σX²) and Y ~ N(µY, σY²),
then the new variable Z = X ± Y is also normally distributed, Z ~ N(µZ, σZ²), with:

  µZ = µX ± µY
  σZ² = σX² + σY²

Products

If X and Y are two independent random variables and both are normally distributed with X from (0, σX²) and Y from (0, σY²), then:
the product X·Y will result in a Bessel function, and the ratio X/Y in a Cauchy distribution.

Sigma σ rules:

  P(µ − σ ≤ x ≤ µ + σ) = 0.6827 ≈ 2/3

  Within ±1σ there are ~68% of all values,
  within ±3σ there are ~99.7% of all values.



Modul 1278 ICM, 2- 43
2. Statistics
2.3 Probability Calculations The Logarithmic Normal Distribution

In statistics, the probability of "somehow interacting" but basically independent random variables X, Y, ... can be described
by a logarithmic normal distribution (lognormal).

The lognormal distribution is frequently used in two fields:

- financial mathematics -> salary distribution of employees, calculation of damage for insurance, stock rates, ...
- medical/biological: two (or more) interacting, but basically independent parameters
  -> head volume of newborn babies (calculated from length, width and height)
- in reliability analysis, the lognormal distribution is often used to model times to repair a maintainable system

In all cases a lower (positive) limit exists in addition (usually zero), and lower values are more frequent than higher values.

Probability density function:

  f(x, µ, σ) = 1/(x·σ·√(2π)) · exp(−(ln(x) − µ)²/(2σ²))

Cumulative distribution function:

  F(x, µ, σ) = 1/2 + 1/2 · erf((ln(x) − µ)/(σ·√2))

mean:

  µ* = ∫ from 0 to ∞ of x · f(x) dx = exp(µ + σ²/2)

variance:

  σ*² = ∫ from 0 to ∞ of x² · f(x) dx − µ*² = (exp(σ²) − 1) · exp(2µ + σ²)
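The closed-form mean µ* can be cross-checked against the defining integral by a simple numerical integration (µ = 0, σ = 0.5 are arbitrary example parameters):

```python
from math import exp, log, pi, sqrt

def lognorm_pdf(x, mu, sigma):
    # f(x) = 1 / (x * sigma * sqrt(2*pi)) * exp(-(ln(x) - mu)^2 / (2*sigma^2))
    return exp(-((log(x) - mu) ** 2) / (2 * sigma**2)) / (x * sigma * sqrt(2 * pi))

mu, sigma = 0.0, 0.5                        # arbitrary example parameters
mean_formula = exp(mu + sigma**2 / 2)       # mu* from the closed form

# crude Riemann sum of the integral x * f(x) dx over (0, 20]
dx = 1e-3
mean_numeric = sum(x * lognorm_pdf(x, mu, sigma) * dx
                   for x in (i * dx for i in range(1, 20000)))
```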



Modul 1278 ICM, 2- 44
2. Statistics
2.3 Probability Calculations The Exponential Distribution
In statistics the probability of the number of random events in a given interval (e.g. (0,x] or (0,t]) can be described by a Poisson
distribution.
If we look instead at the distance (e.g. x or t) between the occurrences of the random events within the interval, the distribution
is exponential with the parameter α.

  number of events per interval (x1, x2, x3, x4)                 = Poisson distribution
  distance between events within an interval (t1, t2, t3, ...)   = Exponential distribution

Probability density function:

  f(t) = α · exp(−α·t)

mean:

  µ = ∫ from 0 to ∞ of t · f(t) dt = ∫ from 0 to ∞ of t·α·exp(−αt) dt
    = ∫ from 0 to ∞ of exp(−αt) dt = [−(1/α)·exp(−αt)] from 0 to ∞ = 1/α

variance:

  σ² = ∫ from 0 to ∞ of t² · f(t) dt − µ² = ∫ from 0 to ∞ of t²·α·exp(−αt) dt − 1/α²
     = 2/α² − 1/α² = 1/α²

Cumulative distribution function:

  F(t) = 1 − exp(−α·t)

Variation coefficient:

  VC = σ/µ = (1/α)/(1/α) = 1

(Figure: density function of the exponential distribution, f(t) = α·exp(−αt), for α = 0.5, 1, 2, 5)
Modul 1278 ICM, 2- 45
2. Statistics
2.3 Probability Calculations The Exponential Distribution
The exponential distribution is widely used in two fields:
- calculation of arrival rates in queuing systems
- in reliability engineering as a model of the time to failure of a component or system

Example: The mean time between random breakdowns of a piece of equipment was determined to be µ = 4 h.
Thus, a failure rate of α = 1/µ = 0.25/h can be calculated.

Question: What is the probability that the next breakdown will occur within the next 6 hours?

Answer: F(6) = 1 − exp(−0.25/h · 6 h) = 1 − 0.223 = 0.78
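The breakdown example can be reproduced with a few lines of Python (a sketch, not from the original slides):

```python
import math

def exponential_cdf(t, alpha):
    # F(t) = 1 - exp(-alpha * t): probability that the event occurs within time t
    return 1.0 - math.exp(-alpha * t)

# Mean time between breakdowns mu = 4 h -> failure rate alpha = 1/mu = 0.25/h
alpha = 1.0 / 4.0
p = exponential_cdf(6.0, alpha)   # probability of a breakdown within 6 h
print(round(p, 3))  # 0.777, i.e. about 0.78 as on the slide
```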

Characteristic of the exponential distribution: the event rate α is constant

[Figure: density of the exponential distribution on a logarithmic scale, log(f(t)) = log α − α·t, giving straight lines
for α = 0.5, 1, 2, 5.]

As a consequence it does not matter at which point in time we start looking: within the next interval of a given length,
on average the same number of events will occur.

-> To make statements about the future, it plays no role what has already happened in the past.

This characteristic of the exponential distribution is usually called: memory-less.
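The memory-less property P(T > s + t | T > s) = P(T > t) can be verified directly from the survival function exp(−α·t); the following Python sketch (illustrative, not from the slides) checks it for one set of values:

```python
import math

def survival(t, alpha):
    # P(T > t) = exp(-alpha * t) for the exponential distribution
    return math.exp(-alpha * t)

alpha, s, t = 0.25, 10.0, 6.0
# Conditional probability P(T > s + t | T > s) = P(T > s + t) / P(T > s)
conditional = survival(s + t, alpha) / survival(s, alpha)
unconditional = survival(t, alpha)
print(abs(conditional - unconditional) < 1e-12)  # True: the past does not matter
```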
2. Statistics
2.3 Probability Calculations The Erlang Distribution

The Erlang distribution is a continuous probability distribution, a generalization of the exponential distribution and a
special case of the gamma distribution. It was developed by A. Erlang for the statistical modelling of the interval lengths
between telephone calls.

The Erlang distribution describes the probability distribution of the time span until n random events have occurred in succession:
-> How long do I have to wait at an intersection until n = 4 cars have passed it?
-> The exponential distribution is the special case n = 1: what is the time gap between each individual car?
=> the Erlang distribution is the most important distribution in queueing theory


Probability density function:

    f(t) = αⁿ·t^(n−1) / (n−1)! · exp(−α·t)   for t ≥ 0,   f(t) = 0 for t < 0

mean:      µ = ∫₀^∞ t·f(t)dt = n/α

variance:  σ² = ∫₀^∞ t²·f(t)dt − µ² = n/α²

Cumulative distribution function:

    F(t) = 1 − Γ(n, αt)/(n−1)! = 1 − exp(−α·t) · Σ_{i=0}^{n−1} (α·t)^i / i!

Variation coefficient:  VC = σ/µ = (√n/α)/(n/α) = 1/√n
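The finite-sum form of the Erlang cumulative distribution function translates directly into code. A minimal Python sketch (names chosen for illustration):

```python
import math

def erlang_cdf(t, n, alpha):
    # F(t) = 1 - exp(-alpha*t) * sum_{i=0}^{n-1} (alpha*t)^i / i!,  F(t) = 0 for t < 0
    if t < 0:
        return 0.0
    s = sum((alpha * t) ** i / math.factorial(i) for i in range(n))
    return 1.0 - math.exp(-alpha * t) * s

# n = 1 reduces to the exponential distribution: F(t) = 1 - exp(-alpha*t)
print(abs(erlang_cdf(2.0, 1, 0.5) - (1.0 - math.exp(-1.0))) < 1e-12)  # True
```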

2. Statistics
2.3 Probability Calculations The Erlang Distribution

Example: The time span between the arrivals of two customers in a shop is exponentially distributed.
How is the time span distributed that passes until 10 customers have reached the shop?
Answer: It is Erlang-distributed (provided the customers are independent of each other, i.e. they do not arrive in groups).

Example: Cold redundancy: spare components are available that are only put into service once the preceding
components have failed.
When does the last spare component fail?
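Both examples describe a sum of n independent exponential waiting times, which is exactly the Erlang distribution. A small Monte-Carlo sketch in Python (illustrative only; the parameters are made up) confirms that the simulated mean approaches n/α:

```python
import random

# Cold redundancy: n spare components, each with an exponentially distributed
# lifetime (rate alpha); the time until the last one fails is the sum of the
# individual lifetimes, i.e. Erlang-distributed with mean n/alpha.
random.seed(1)
n, alpha = 10, 0.5
trials = 100_000
total = sum(sum(random.expovariate(alpha) for _ in range(n)) for _ in range(trials))
estimated_mean = total / trials
print(abs(estimated_mean - n / alpha) < 0.5)  # close to n/alpha = 20
```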

2 Statistics

2.1 Overview

2.2 Describing (descriptive) Statistics

2.3 Probability Calculations

2.4 Concluding (inductive) Statistics

2. Statistics
2.4 Inductive Statistics Overview

Statistics is a science whose purpose is:
to collect, prepare, describe and present information (descriptive statistics)
and to draw conclusions from sample information about properties of the whole population (inductive statistics).

Collecting data from the total population (called a census) may be:
- impossible
- impractical
- too expensive
-> therefore only a sample is taken from the total population.

Descriptive Statistics:
* Collecting data (sample)
* Arranging the data
* Combining the data
* Presentation of the data
* Search for relations in the data
-> complete information about the sample

Inductive Statistics:
* Conclusion (inference) from the sample to the total population, using probability calculations (stochastics)
* Quantification of uncertainty
* Making decisions
-> uncertain information about the total population

! Notation: estimations from sample to total population: <x>, <s>
! Notation: total population: Greek letters are used (µ, σ, ...); sample: Latin letters are used (x, s, ...)

2. Statistics
2.4 Inductive Statistics Universal Relations in Statistics

In statistics an expectation value <x> can be calculated as the summation of all values times their probability.
Sometimes in the literature the symbol Ê(X) is used instead of <x> to indicate the conclusion from the sample mean value x
to the unknown, true value µ of the total population.

Calculation of expectation values (see page 2.21: Ŷ(Q) = expectation value of profit loss, g(x) = density function of
demand X, G(x) = cumulative distribution function of demand X):

    Ê(x) = Σ_{x=x_min}^{x_max} x·p(x)   (discrete)   ->   ∫_{−∞}^{+∞} x·g(x)dx   (continuous)      always true!

In statistics, for any probability density function g(x) with cumulative distribution (sum) function G(x), the following
relations hold:

    ∫_{−∞}^{+∞} g(x)dx = 1
    ∫_{−∞}^{a}  g(x)dx = G(a)
    ∫_{−∞}^{0}  g(x)dx = G(0)
    ∫_{0}^{a}   g(x)dx = G(a) − G(0)
    ∫_{a}^{+∞}  g(x)dx = 1 − G(a)
    ∫_{0}^{a} = ∫_{−∞}^{a} − ∫_{−∞}^{0}   for a > 0

[Figure: probability density function g(x) with the shaded area under the curve up to a representing G(a);
G(x) is the cumulative sum function.]
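These relations hold for any density; they can be checked numerically for a concrete example. The Python sketch below (not from the slides) uses the standard normal density, where G(0) = 0.5 by symmetry, and a simple midpoint-rule integrator:

```python
import math

def g(x):
    # standard normal density as an example probability density function
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)

def integrate(f, a, b, steps=200_000):
    # simple midpoint rule, sufficient for a numeric sanity check
    h = (b - a) / steps
    return sum(f(a + (i + 0.5) * h) for i in range(steps)) * h

# Tails beyond +-10 are negligible for the standard normal density
total = integrate(g, -10, 10)          # ~ 1
a = 1.0
left   = integrate(g, -10, a)          # ~ G(a)
middle = integrate(g, 0, a)            # ~ G(a) - G(0), with G(0) = 0.5
right  = integrate(g, a, 10)           # ~ 1 - G(a)
print(abs(total - 1) < 1e-6,
      abs(left - (middle + 0.5)) < 1e-6,
      abs((left + right) - 1) < 1e-6)  # True True True
```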

2 Statistics
