CHAPTERS
by
Nov. 2021
Example: A set of fifty readings of a length measurement was taken, as shown in the table below.
Use a histogram (frequency distribution curve) to present the data.
Length (mm)    Number of readings
99.7           1
99.8           4
99.9           12
100.0          19
100.1          10
100.2          3
100.3          1
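As a rough illustration of how the frequency table could be presented as a histogram, the sketch below draws one text bar per length value. The names readings and text_histogram are illustrative only; a plotting library could equally be used.

```python
# Frequency table from the example: length (mm) -> number of readings.
readings = {
    99.7: 1, 99.8: 4, 99.9: 12, 100.0: 19, 100.1: 10, 100.2: 3, 100.3: 1,
}

def text_histogram(freq):
    """Return one line per cell: the length, a bar of '#' marks, and the count."""
    lines = []
    for length in sorted(freq):
        count = freq[length]
        lines.append(f"{length:6.1f} | {'#' * count} ({count})")
    return lines

total_readings = sum(readings.values())
histogram_lines = text_histogram(readings)
print("\n".join(histogram_lines))
print("total readings:", total_readings)
```

The bars rise to a peak at 100.0 mm and fall off roughly symmetrically on either side.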
Mode: the value of the variate that occurs with the greatest frequency.
In general it is considered a poor measure of the central tendency of the
data, as quite often the most frequently occurring quantity does not lie near
the center of the data.
However, for descriptive analysis, the mode is a useful way to describe the
most frequently occurring value.
Median: the middle value when the measurements in the data set are written down in order of magnitude.
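The mode and median definitions can be checked against the fifty length readings of the earlier example. This is a minimal sketch using Python's standard statistics module; the variable names are illustrative.

```python
from statistics import median, mode

# Frequency table from the length example: length (mm) -> number of readings.
freq = {99.7: 1, 99.8: 4, 99.9: 12, 100.0: 19, 100.1: 10, 100.2: 3, 100.3: 1}

# Expand the table into the individual readings, in order of magnitude.
samples = [length for length, count in sorted(freq.items()) for _ in range(count)]

mode_value = mode(samples)      # most frequently occurring reading
median_value = median(samples)  # middle value of the ordered readings
print("mode:", mode_value, "median:", median_value)
```

For this data set both measures are 100.0 mm, since the 25th and 26th ordered readings both fall in the most populated cell.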
Range
Variance
The mean square deviation (MSD), also termed the variance, is traditionally a very commonly
used measure of the variability of the data.
The square of the standard deviation, $\sigma^2$, is called the variance. This $\sigma$ is sometimes called the
population or biased standard deviation because it strictly applies only when a large number
of samples is taken to describe the population.
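In many circumstances the engineer will not be able to collect as many data points
as are needed to describe the underlying population.
For small sets of data, an unbiased or sample standard deviation (adjusted standard
deviation) is defined by:
$$\sigma = \left[\frac{1}{n-1}\sum_{i=1}^{n}(x_i - x_m)^2\right]^{1/2}$$
where $x_m$ is the mean value and $n$ is the number of readings.
The sample or unbiased standard deviation should be used when the underlying
population is not known.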
Best Estimate of Uncertainty
The best estimate of uncertainty represents the extent of random error in the measured values.
Example: Ten samples of a steel wire were tested on a universal testing machine. The breaking
strengths in tonnes (t) of the samples were: 4.3, 4.5, 4.7, 4.2, 4.5, 4.6, 4.4, 4.6, 4.9, 4.5.
Compute the following:
a) The mean value of the breaking strength
b) Mean deviation of the data
c) Standard deviation
d) Best estimate of the precision of the apparatus, and
e) Best estimate of the uncertainty in the data
Solution:
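One possible way to work through the five quantities numerically is sketched below. Two interpretations are assumptions rather than definitions taken from these notes: the "best estimate of precision" is taken to be the sample (n - 1) standard deviation, and the "best estimate of uncertainty" is taken to be the standard deviation of the mean, i.e. the sample standard deviation divided by the square root of n.

```python
from math import sqrt

strengths = [4.3, 4.5, 4.7, 4.2, 4.5, 4.6, 4.4, 4.6, 4.9, 4.5]  # tonnes
n = len(strengths)

# a) Mean value.
mean_strength = sum(strengths) / n

# b) Mean deviation: average absolute deviation from the mean.
mean_deviation = sum(abs(x - mean_strength) for x in strengths) / n

# c) Standard deviation (population or biased form, divisor n).
sum_sq = sum((x - mean_strength) ** 2 for x in strengths)
std_population = sqrt(sum_sq / n)

# d) ASSUMPTION: best estimate of precision = sample standard deviation (divisor n - 1).
std_sample = sqrt(sum_sq / (n - 1))

# e) ASSUMPTION: best estimate of uncertainty = standard deviation of the mean.
std_of_mean = std_sample / sqrt(n)

print(f"mean = {mean_strength:.3f} t")
print(f"mean deviation = {mean_deviation:.3f} t")
print(f"standard deviation = {std_population:.4f} t")
print(f"precision estimate = {std_sample:.4f} t")
print(f"uncertainty estimate = {std_of_mean:.4f} t")
```

Under these assumptions the mean is 4.52 t, the mean deviation 0.144 t, the biased standard deviation about 0.189 t, the sample standard deviation about 0.199 t, and the standard deviation of the mean about 0.063 t.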
The Gaussian or Normal Distribution
In general, measuring instruments are associated with a number of factors causing random
errors. Therefore, the instrument readings exhibit a dispersion/scatter in the data.
The normal distribution is by far the most commonly occurring distribution.
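If the measurement is designated by x, the Gaussian distribution gives the probability that the
measurement will lie between x and x + dx and is written
$$P(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-(x - x_m)^2 / 2\sigma^2}$$
where $x_m$ is the mean reading and $\sigma$ is the standard deviation.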
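The sketch below evaluates the Gaussian density, assuming its usual form P(x) = exp(-(x - x_m)^2 / (2 sigma^2)) / (sigma sqrt(2 pi)), and integrates it numerically. The mean of 100.0 and standard deviation of 0.1 are hypothetical values chosen only for illustration, and the function names are not from the notes.

```python
from math import exp, pi, sqrt

def gaussian_pdf(x, x_m, sigma):
    """Gaussian probability density P(x) for mean x_m and standard deviation sigma."""
    return exp(-((x - x_m) ** 2) / (2 * sigma ** 2)) / (sigma * sqrt(2 * pi))

def probability_between(a, b, x_m, sigma, steps=10_000):
    """Approximate the integral of P(x) from a to b with the midpoint rule."""
    dx = (b - a) / steps
    return sum(gaussian_pdf(a + (i + 0.5) * dx, x_m, sigma) for i in range(steps)) * dx

X_M, SIGMA = 100.0, 0.1  # hypothetical mean and standard deviation

peak_density = gaussian_pdf(X_M, X_M, SIGMA)
total_area = probability_between(X_M - 8 * SIGMA, X_M + 8 * SIGMA, X_M, SIGMA)
print(f"P(x_m) = {peak_density:.4f}, total area = {total_area:.6f}")
```

The total area comes out very close to 1, as expected for a normalized distribution, and the peak value equals 1 / (sigma sqrt(2 pi)).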
The normal distribution model is employed in decision-making processes like the determination
of the probability that the measured value lies within a given range.
Alternatively, if the level of probability or the confidence level is a certain fixed value (which
may be a requirement for a certain situation) then it is possible to determine the allowable
scatter or dispersion from a given mean value.
Also, the properties of normal distribution are used for comparing various normally
distributed samples using statistical criteria known as significance tests.
Measured data can also be compared with the expected normally distributed values in the
different ranges of measurement using the criterion known as the $\chi^2$-test (pronounced
chi-square test). This $\chi^2$ criterion is also applicable for determining whether any
non-normal distribution conforms to another known theoretical distribution.
The standard deviation $\sigma$ is a measure of the width of the distribution curve; the
larger the value of $\sigma$, the flatter the curve and hence the larger the expected error of
all measurements in a controlled experiment.
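By inspection of the Gaussian distribution function we see that the maximum probability
occurs at $x = x_m$, where its value is $P(x_m) = 1/(\sigma\sqrt{2\pi})$.
$P(x_m)$ is sometimes called a measure of precision of the data because it has a larger
value for smaller values of the standard deviation.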
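The probability that a measurement will fall within a certain range $\pm x_1$ of the mean
reading is
$$P = \int_{x_m - x_1}^{x_m + x_1} \frac{1}{\sigma\sqrt{2\pi}}\, e^{-(x - x_m)^2 / 2\sigma^2}\, dx$$
This probability is commonly evaluated for $x_1/\sigma$ = 1, 2, and 3.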
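For a normal distribution, the probability of falling within k standard deviations of the mean can be written with the error function as erf(k / sqrt(2)). The short sketch below evaluates it for k = 1, 2, and 3; the helper name probability_within is illustrative.

```python
from math import erf, sqrt

def probability_within(k):
    """Probability that a normal reading lies within k standard deviations of the mean."""
    return erf(k / sqrt(2))

coverage = {k: probability_within(k) for k in (1, 2, 3)}
for k, p in coverage.items():
    print(f"within {k} sigma: {p:.4f}")
```

These are the familiar values of about 68.3, 95.4, and 99.7 percent.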
The probability that the value of a randomly selected observation will lie in this range
is called the confidence level.
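We thus expect that the mean value will lie within $\pm 2.57\,\sigma_m$, where $\sigma_m$ is the
standard deviation of the mean, with less than 1 percent error.
The level of significance is 1 minus the confidence level. Thus, for z = 2.57 the
confidence level is 99 percent and the level of significance is 1 percent.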
Example: A certain steel bar is measured with a device which has a known precision [...].
How many measurements are necessary to establish the mean length with a 5 percent level of
significance [...]?
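A question of this kind can be approached by finding the z value for the required confidence level and then solving z * sigma / sqrt(n) <= tolerance for n. The sketch below does this with the standard library. The precision of 0.5 mm and the tolerance of 0.2 mm are hypothetical placeholders, not the figures of the example, and the function names are illustrative.

```python
from math import ceil
from statistics import NormalDist

def z_for_confidence(confidence):
    """Two-sided z such that P(|Z| <= z) equals the confidence level."""
    return NormalDist().inv_cdf(0.5 + confidence / 2)

def measurements_needed(sigma, tolerance, significance):
    """Smallest n for which z * sigma / sqrt(n) does not exceed the tolerance."""
    z = z_for_confidence(1 - significance)
    return ceil((z * sigma / tolerance) ** 2)

z_99 = z_for_confidence(0.99)  # 1 percent level of significance
z_95 = z_for_confidence(0.95)  # 5 percent level of significance

# HYPOTHETICAL inputs: sigma = 0.5 mm, tolerance on the mean = 0.2 mm.
n_required = measurements_needed(sigma=0.5, tolerance=0.2, significance=0.05)
print(f"z(99%) = {z_99:.3f}, z(95%) = {z_95:.3f}, n = {n_required}")
```

With these placeholder values, z is about 1.96 and 25 measurements would be needed.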
Sometimes the experimenter finds that a few data points are noticeably different from the
majority of the data. If these data points were obtained under abnormal conditions involving
gross blunders and the experimenter is sure of this, they may be discarded.
However, the experimenter cannot reject a data point simply because it is different from
the others; certain standard mathematical methods must be relied on for rejecting any
experimental data.
The mean value and the standard deviation are first calculated using all data points. The
deviation of each reading from the mean is then compared with the standard deviation.
If the ratio of the deviation of a reading to the standard deviation exceeds the limit
given in the table above, that reading is rejected. The mean value and the standard
deviation are then recalculated with the rejected reading excluded from the data.
Another criterion for discarding a data point is that its deviation from the mean exceeds
four times the probable error of a single reading. This amounts to discarding data that lie
outside the confidence interval for a single reading at a confidence level of 0.993.
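The 0.993 figure can be checked numerically. The sketch assumes the usual definition of probable error for a normal distribution, namely the deviation that half of the readings are expected to exceed (about 0.6745 standard deviations).

```python
from math import erf, sqrt
from statistics import NormalDist

# ASSUMPTION: probable error = deviation exceeded by half of all readings.
probable_error_factor = NormalDist().inv_cdf(0.75)  # in units of sigma

# Rejection limit of four probable errors, expressed in units of sigma.
rejection_limit = 4 * probable_error_factor

# Probability that a single reading lies within that limit.
confidence = erf(rejection_limit / sqrt(2))
print(f"limit = {rejection_limit:.3f} sigma, confidence = {confidence:.4f}")
```

Four probable errors correspond to roughly 2.7 standard deviations, which gives a confidence level of about 0.993.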
Exercise
A laboratory experiment is conducted to measure the viscosity of a specimen of oil.
A series of tests give the values as 5.3×10⁻³, 5.73×10⁻³, 6.77×10⁻³, 5.26×10⁻³, 4.33×10⁻³,
5.45×10⁻³, 6.09×10⁻³, 5.64×10⁻³, 5.81×10⁻³, 5.75×10⁻³ m²/s. Point out any reading
that should be rejected.
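One way to approach the exercise is sketched below. It assumes that the rejection rule is Chauvenet's criterion, in which a reading is rejected when its probability of occurrence is less than 1/(2n), and it computes the limiting ratio from the normal distribution instead of reading it from a table. The use of the sample (n - 1) standard deviation is also an assumption.

```python
from math import sqrt
from statistics import NormalDist

viscosity = [5.30e-3, 5.73e-3, 6.77e-3, 5.26e-3, 4.33e-3,
             5.45e-3, 6.09e-3, 5.64e-3, 5.81e-3, 5.75e-3]  # m^2/s

def chauvenet_limit(n):
    """Largest allowed |deviation| / sigma for n readings (assumed Chauvenet rule)."""
    return NormalDist().inv_cdf(1 - 1 / (4 * n))

def reject_once(data):
    """Apply the criterion once, using the mean and sigma of all the data points."""
    n = len(data)
    mean = sum(data) / n
    sigma = sqrt(sum((x - mean) ** 2 for x in data) / (n - 1))  # sample form (assumption)
    limit = chauvenet_limit(n)
    kept = [x for x in data if abs(x - mean) / sigma <= limit]
    rejected = [x for x in data if abs(x - mean) / sigma > limit]
    return kept, rejected

kept, rejected = reject_once(viscosity)
new_mean = sum(kept) / len(kept)  # mean recalculated without the rejected reading
print("rejected:", rejected)
print(f"new mean = {new_mean:.4e} m^2/s")
```

Under these assumptions only the 4.33×10⁻³ reading exceeds the limit of about 1.96 for ten readings; the 6.77×10⁻³ reading falls just inside it.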
Calculations have been made of the probability that the actual measurements match the
expected distribution, and these probabilities are tabulated.
In this table F represents the number of degrees of freedom in the measurements and is given
by: F = n - k
where n is the number of cells and k is the number of imposed conditions on the
expected distribution.
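As an illustration, the sketch below computes a chi-square statistic and the degrees of freedom F = n - k for the fifty length readings of the earlier example, compared with a normal distribution. It assumes the standard statistic, the sum over cells of (observed - expected)^2 / expected, and assumes k = 3 imposed conditions (the total count, the mean, and the standard deviation are all taken from the data). The cell width of 0.1 mm follows from the spacing of the tabulated lengths.

```python
from math import sqrt
from statistics import NormalDist

# Observed frequencies: length (mm) -> number of readings.
observed = {99.7: 1, 99.8: 4, 99.9: 12, 100.0: 19, 100.1: 10, 100.2: 3, 100.3: 1}
cell_width = 0.1  # mm, spacing between tabulated lengths

total = sum(observed.values())
mean = sum(x * f for x, f in observed.items()) / total
sigma = sqrt(sum(f * (x - mean) ** 2 for x, f in observed.items()) / total)

# Expected count per cell from a normal distribution with the same mean and sigma.
model = NormalDist(mean, sigma)
expected = {
    x: total * (model.cdf(x + cell_width / 2) - model.cdf(x - cell_width / 2))
    for x in observed
}

chi_square = sum((observed[x] - expected[x]) ** 2 / expected[x] for x in observed)

n_cells = len(observed)
k_conditions = 3  # ASSUMPTION: total count, mean and sigma imposed from the data
degrees_of_freedom = n_cells - k_conditions
print(f"chi-square = {chi_square:.3f}, F = {degrees_of_freedom}")
```

The resulting chi-square value would then be looked up in the tabulated probabilities for F = 4. In practice, cells with very small expected counts, such as the outermost ones here, are often merged with their neighbours before the statistic is computed.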
Criteria for Goodness of Fit
Reading Assignment
Significance Test