The Gaussian Distribution
Probability distribution
Histograms plot the distribution of values in a sample. The histogram can be bell shaped.
Because the population is very large, the Y-axis of a probability distribution cannot show the number
of observations in each group; it shows probability density instead. The area under the curve over a
range of values gives the fraction of the population in that range.
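The switch from counts to density can be sketched in a few lines. This is only an illustration: the sample (mean 100, SD 15), the bin edges and the helper name density_histogram are assumptions, not part of the original notes. Each bar height is the bin count divided by (sample size x bin width), so the bar areas add up to about 1 regardless of how many observations there are.

```python
import random

def density_histogram(values, bin_edges):
    """Return one density per bin: count / (n * bin_width)."""
    n = len(values)
    densities = []
    last_bin = len(bin_edges) - 2
    for i, (lo, hi) in enumerate(zip(bin_edges[:-1], bin_edges[1:])):
        # Bins are half-open [lo, hi); the last bin also includes its upper edge.
        count = sum(1 for v in values if lo <= v < hi or (i == last_bin and v == hi))
        densities.append(count / (n * (hi - lo)))
    return densities

random.seed(0)
# Hypothetical sample: 10,000 values from a Gaussian with mean 100 and SD 15.
sample = [random.gauss(100, 15) for _ in range(10_000)]
edges = list(range(40, 161, 10))  # bins of width 10 from 40 to 160
dens = density_histogram(sample, edges)

# Area of all bars = sum(density * bin width); close to 1 by construction
# (slightly below 1 if a few values fall outside the outermost edges).
total_area = sum(d * 10 for d in dens)
print(f"total area under the histogram: {total_area:.4f}")
```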
The Gaussian Distribution
A variable will approximate a Gaussian distribution when its variation is caused by many
independent factors whose effects add up.
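A small simulation can make this plausible. The sketch below assumes each value is the sum of 50 independent factors, each drawn uniformly from -1 to 1; the number of factors, the uniform shape and the function name simulate_value are illustrative choices only. If the sums are close to Gaussian, about 68% of them should fall within 1 SD of the mean and about 95% within 1.96 SD.

```python
import random
import statistics

random.seed(1)

def simulate_value(n_factors=50):
    """One observation = sum of many small independent contributions (assumed uniform)."""
    return sum(random.uniform(-1, 1) for _ in range(n_factors))

values = [simulate_value() for _ in range(20_000)]
mean = statistics.fmean(values)
sd = statistics.stdev(values)

# Fractions of simulated values within 1 SD and within 1.96 SD of the mean.
within_1sd = sum(abs(v - mean) <= sd for v in values) / len(values)
within_196sd = sum(abs(v - mean) <= 1.96 * sd for v in values) / len(values)

print(f"within 1 SD:    {within_1sd:.3f}")
print(f"within 1.96 SD: {within_196sd:.3f}")
```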
When the variation is largely due to one factor (such as variation in a single gene), you expect to see
bimodal or skewed distributions rather than Gaussian distributions.
The mean is the centre of the Gaussian distribution and the standard deviation (SD) measures its spread.
The area under the curve within 1 SD of the mean represents about 68% of the observations, and the area
under the entire curve represents the entire population.
95% of the values in a Gaussian distribution lie within 1.96 SD of the mean.
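These areas can be checked numerically. For a Gaussian distribution, the fraction of values within k SD of the mean equals erf(k / sqrt(2)); the function name area_within below is just a label chosen for this sketch.

```python
import math

def area_within(k):
    """Fraction of a Gaussian population lying within k SDs of the mean."""
    return math.erf(k / math.sqrt(2))

for k in (1, 1.96, 2, 3):
    print(f"within {k} SD: {area_within(k):.4f}")
# within 1 SD: 0.6827
# within 1.96 SD: 0.9500
# within 2 SD: 0.9545
# within 3 SD: 0.9973
```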
Areas under a portion of the Gaussian distribution are tabulated as the "z" distribution, where z is the
number of standard deviations away from the mean.
There are two problems with these calculations. One problem is that you don't know the mean and SD of
the entire population. The second problem is that the population may not really be Gaussian.
By calculating the z-value, z = (value - mean) / SD, we can find the percentage of the area (and so of
the population) that lies below, above, or between any given values.
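As a sketch of such a calculation, the standard library class statistics.NormalDist can stand in for the printed z table. The mean of 100 and SD of 15 are hypothetical numbers picked for illustration, and they are treated as known population values, which is exactly the assumption that is questionable in practice.

```python
from statistics import NormalDist

def z_value(x, mean, sd):
    """Number of SDs that x lies away from the mean."""
    return (x - mean) / sd

std_normal = NormalDist()  # standard Gaussian: mean 0, SD 1

# Hypothetical population values, assumed known for this example.
mean, sd = 100.0, 15.0

z = z_value(130.0, mean, sd)   # 130 is 2 SDs above the mean
below = std_normal.cdf(z)      # fraction of the population below 130
above = 1 - below              # fraction above 130
between = std_normal.cdf(z_value(115.0, mean, sd)) - std_normal.cdf(z_value(85.0, mean, sd))

print(f"z = {z:.2f}")
print(f"below 130: {below:.4f}, above 130: {above:.4f}")
print(f"between 85 and 115: {between:.4f}")
```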
THE PREDICTION INTERVAL
The prediction interval is only valid when the population is distributed according to a
Gaussian distribution
To calculate a prediction interval, you need to go K SDs from the mean, where K is obtained from
Table
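The table itself is not reproduced here. As a rough illustration of why K depends on sample size, K can be estimated by simulation: draw a sample of n values from a Gaussian population, compute its mean and SD, draw one more value, and record how many sample SDs that new value lies from the sample mean. The 95th percentile of those distances is an estimate of K. The function name estimate_k, the number of trials and the seed are assumptions made for this sketch; it is not the method used to build the table.

```python
import math
import random

def estimate_k(n, coverage=0.95, trials=20_000, seed=0):
    """Monte Carlo estimate of K such that mean +/- K*SD of a sample of size n
    contains a new Gaussian observation with the requested probability."""
    rng = random.Random(seed)
    ratios = []
    for _ in range(trials):
        sample = [rng.gauss(0, 1) for _ in range(n)]
        m = sum(sample) / n
        s = math.sqrt(sum((x - m) ** 2 for x in sample) / (n - 1))  # sample SD
        new_value = rng.gauss(0, 1)
        ratios.append(abs(new_value - m) / s)
    ratios.sort()
    # Empirical quantile of the distances, in units of sample SD.
    return ratios[int(coverage * trials)]

k_by_n = {n: estimate_k(n) for n in (5, 20, 100)}
for n, k in k_by_n.items():
    print(f"n = {n:3d}: K is roughly {k:.2f}")
```

With these settings the estimate should come out near 3 for n = 5 and close to 2 for n = 100, so small samples need a noticeably wider interval.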
NORMAL LIMITS
The term normal limits can be defined in many ways. Using the Gaussian distribution to make rules
about "normal" values is not always useful.
The deviations from Gaussian are likely to be most apparent in the tails of the distribution. There is
also no reason to think that the population distribution is symmetrical around the mean.
There are complexities of "normal limits" and problems in defining normal limits as being the
With a large enough sample, the distribution of values in the population can be estimated; this
population distribution is known as the probability distribution.
The Gaussian distribution, also called the normal distribution, arises when variation is caused by many
independent factors.
For a small sample from a Gaussian population, a prediction interval should be used: the interval must
extend more than 2 SD from the mean to include 95% of the observations.