
Dire Dawa University

Mechanical Engineering Department

Instrumentation and Process Control - MEng 6412


Chapter 4 Experimental Data Analysis

by

Dr. Addisu Bekele


Associate Professor of Thermal Engineering, ASTU

Nov. 2021

Statistical Analysis of Experimental Data


Experimental data are obtained from two kinds of test: the multi-sample test and the single-sample test.
Multi-Sample Test: repeated measurements of a given quantity are made under different test conditions, for example with different instruments, different methods of measurement and different observers.
Simply repeating measurements with the same equipment, procedure, technique and observer does not provide multi-sample results.
Single-Sample Test: a single measurement (or a succession of measurements) made under identical conditions except for time.
To get as close as possible to the true value of the quantity under measurement, tests should be done using as many different procedures, techniques and experiments as practicable.
It should be borne in mind that the statistical methods used to arrive at correct results are valid only for multi-sample tests.
Histogram: when a number of multi-sample observations are taken experimentally, the data scatter about some central value. One method of presenting the results is the histogram.

Example: a set of fifty readings of a length measurement is shown in the table below.
Use a histogram (frequency distribution curve) to present the data.

Length (mm)    Number of readings
 99.7           1
 99.8           4
 99.9          12
100.0          19
100.1          10
100.2           3
100.3           1
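As a rough illustration of how the table can be turned into a histogram, the sketch below prints one bar per tabulated length, with one `#` per reading. The names `frequency` and `text_histogram` are illustrative choices, not part of the original notes, and a plotting library could be used instead.

```python
# Frequency table from the example: length (mm) -> number of readings.
frequency = {
    99.7: 1, 99.8: 4, 99.9: 12, 100.0: 19, 100.1: 10, 100.2: 3, 100.3: 1,
}


def text_histogram(table):
    """Return one text line per length value, with one '#' per reading."""
    lines = []
    for length in sorted(table):
        count = table[length]
        lines.append(f"{length:6.1f} mm | {'#' * count} ({count})")
    return lines


total_readings = sum(frequency.values())
histogram_lines = text_histogram(frequency)

print("\n".join(histogram_lines))
print("total readings:", total_readings)
```

The bars peak at 100.0 mm and fall away on both sides, which is the scatter about a central value that the histogram is meant to show.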

Central Tendency of Data


An important property of a set of numerical data is the location of its central tendency. Common measures of central tendency are the arithmetic mean, the median and the mode.

Arithmetic Mean: the sum of the readings divided by the number of readings.

Mode: the value of the variate that occurs with the greatest frequency.
In general it is considered a poor measure of the central tendency of the data, as quite often the most frequently occurring value does not lie near the center of the data.
However, for descriptive analysis, the mode is a useful way to describe the most frequently occurring value.
Median: the middle value when the measurements in the data set are written down in ascending order of magnitude.

For a set of n measurements $x_1, x_2, \ldots, x_n$ of a constant quantity, written down in ascending order of magnitude, the median value is given by:

$$x_{\text{median}} = x_{(n+1)/2} \qquad (n \text{ odd})$$

$$x_{\text{median}} = \tfrac{1}{2}\left[ x_{n/2} + x_{n/2+1} \right] \qquad (n \text{ even})$$
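The three measures can be sketched in a few lines of Python. The example below applies them to the fifty length readings from the histogram example; the function names are illustrative, and the median is coded directly from the odd/even formulas rather than taken from a library.

```python
from collections import Counter

# Frequency table from the length example: length (mm) -> number of readings.
frequency = {
    99.7: 1, 99.8: 4, 99.9: 12, 100.0: 19, 100.1: 10, 100.2: 3, 100.3: 1,
}

# Expand the table into the 50 individual readings, in ascending order.
readings = sorted(x for x, f in frequency.items() for _ in range(f))


def arithmetic_mean(values):
    """Sum of the readings divided by the number of readings."""
    return sum(values) / len(values)


def median(values):
    """Middle value of the sorted data, using the odd/even formulas."""
    ordered = sorted(values)
    n = len(ordered)
    if n % 2 == 1:
        return ordered[(n + 1) // 2 - 1]  # x_((n+1)/2), shifted to 0-based
    return 0.5 * (ordered[n // 2 - 1] + ordered[n // 2])


def mode(values):
    """Value that occurs with the greatest frequency."""
    return Counter(values).most_common(1)[0][0]


mean_length = arithmetic_mean(readings)
median_length = median(readings)
mode_length = mode(readings)

print(f"mean = {mean_length:.3f} mm")
print(f"median = {median_length:.1f} mm, mode = {mode_length:.1f} mm")
```

For this data set the mean is about 99.992 mm, while the median and the mode both fall on 100.0 mm.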

Measure of Dispersion (spread or Variability)


Measures of central location alone usually do not give an adequate description of the experimental data. The variability or spread of the data should also be taken into account.

Range

It is the simplest way to represent dispersion.
It is the difference between the maximum and minimum values of the given data.
Disadvantages:
- It is based solely on the dispersion of the extreme values.
- It fails to provide information about the clustering, or lack of clustering, of the observed values between the two extreme values.
Hence, it is rarely employed as a measure of dispersion.
Standard Deviation
The square root of the variance, denoted $\sigma$ and termed the standard deviation or root mean square deviation, is often used as the measure of dispersion:

$$\sigma = \left[ \frac{1}{n} \sum_{i=1}^{n} (x_i - x_m)^2 \right]^{1/2}$$

where $x_m$ is the arithmetic mean of the $n$ readings.
The main advantage of the standard deviation over the variance is that its unit is the same as that of the measured quantity.

Variance
The mean square deviation (MSD), also termed the variance, is a very commonly used measure of the variability of the data.
The square of the standard deviation, $\sigma^2$, is called the variance. $\sigma$ is sometimes called the population or biased standard deviation because it strictly applies only when a large number of samples is taken to describe the population.

In many circumstances the engineer will not be able to collect as many data points as necessary to describe the underlying population. Generally speaking, at least 20 measurements are desirable in order to obtain reliable estimates of the standard deviation and of the general validity of the data.

For small sets of data, an unbiased or sample standard deviation (the adjusted standard deviation, denoted by $s_n$), which is also the best estimate of the precision of the apparatus, is defined by:

$$s_n = \left[ \frac{1}{n-1} \sum_{i=1}^{n} (x_i - x_m)^2 \right]^{1/2}$$

where $x_m$ is the arithmetic mean of the $n$ readings.

The sample or unbiased standard deviation should be used when the underlying population is not known.
Best Estimate of Uncertainty
The best estimate of uncertainty represents the extent of random error in the measured values.

$s_n$: the unbiased estimate of the best precision
$U_n$: the best estimate of internal uncertainty

Example: Ten samples of a steel wire were tested on a universal testing machine. The breaking
strengths in tonnes (t) of the samples were: 4.3, 4.5, 4.7, 4.2, 4.5, 4.6, 4.4, 4.6, 4.9, 4.5.
Compute the following:
a) The mean value of the breaking strength
b) Mean deviation of the data
c) Standard deviation
d) Best estimate of the precision of the apparatus, and
e) Best estimation of the uncertainty in the data
Solution:
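The worked solution is not reproduced in this copy of the notes, so the following is a minimal sketch of how parts (a) to (e) might be computed. Two points are assumptions rather than statements from the notes: the mean deviation is taken as the mean of the absolute deviations from the mean, and the best estimate of uncertainty is taken as the standard error of the mean, $U_n = s_n/\sqrt{n}$. The variable names are illustrative.

```python
import math

# Breaking strengths of the ten steel wire samples, in tonnes.
strengths = [4.3, 4.5, 4.7, 4.2, 4.5, 4.6, 4.4, 4.6, 4.9, 4.5]
n = len(strengths)

# a) Arithmetic mean.
mean = sum(strengths) / n

# b) Mean deviation. ASSUMPTION: mean of the absolute deviations.
mean_deviation = sum(abs(x - mean) for x in strengths) / n

# c) Standard deviation (population form, divisor n).
sum_sq = sum((x - mean) ** 2 for x in strengths)
sigma = math.sqrt(sum_sq / n)

# d) Best estimate of precision: sample standard deviation (divisor n - 1).
s_n = math.sqrt(sum_sq / (n - 1))

# e) Best estimate of uncertainty. ASSUMPTION: standard error of the mean.
u_n = s_n / math.sqrt(n)

print(f"a) mean               = {mean:.3f} t")
print(f"b) mean deviation     = {mean_deviation:.3f} t")
print(f"c) standard deviation = {sigma:.3f} t")
print(f"d) precision s_n      = {s_n:.3f} t")
print(f"e) uncertainty U_n    = {u_n:.3f} t")
```

Under these assumptions the results are roughly 4.52 t, 0.144 t, 0.189 t, 0.199 t and 0.063 t. Note that the sample standard deviation is slightly larger than the population form because of the $n - 1$ divisor.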
The Gaussian or Normal Distribution
In general, measuring instruments are subject to a number of factors that cause random errors, so the instrument readings exhibit dispersion (scatter) in the data.
The normal distribution is by far the most commonly occurring distribution.

If the measurement is designated by x, the Gaussian distribution gives the probability that the measurement will lie between x and x + dx and is written

$$P(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-(x - x_m)^2 / 2\sigma^2}$$

where $x_m$ is the mean reading and $\sigma$ is the standard deviation.

The normal distribution model is employed in decision-making processes such as determining the probability that the measured value lies within a given range.
Alternatively, if the level of probability or the confidence level is fixed (which may be a requirement in a certain situation), it is possible to determine the allowable scatter or dispersion about a given mean value.
The properties of the normal distribution are also used for comparing various normally distributed samples using statistical criteria known as significance tests.

Experimental data can further be compared with the expected normally distributed values in the different ranges of measurement using the criterion known as the $\chi^2$-test (pronounced chi-square test). This criterion is also applicable for determining whether a non-normal distribution conforms to some other known theoretical distribution.
The standard deviation is a measure of the width of the distribution curve; the larger the value of $\sigma$, the flatter the curve and hence the larger the expected error of all the measurements (see figure below).

Thus, as a matter of experimental verification, the Gaussian distribution is believed to represent the random errors in an adequate manner for a properly controlled experiment.

By inspection of the Gaussian distribution function we see that the maximum probability occurs at $x = x_m$, and the value of this probability is:

$$P(x_m) = \frac{1}{\sigma\sqrt{2\pi}}$$

Figure: The Gaussian or normal error distribution for two values of the standard deviation.
It is seen that smaller values of the standard deviation produce larger values of the maximum probability, as would be expected in an intuitive sense.

$P(x_m)$ is sometimes called a measure of precision of the data because it has a larger value for smaller values of the standard deviation.
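The effect of the standard deviation on the peak of the curve can be checked numerically. The sketch below evaluates the Gaussian density $\frac{1}{\sigma\sqrt{2\pi}} e^{-(x - x_m)^2/2\sigma^2}$ at the mean for two standard deviations; the mean of 100.0 and the two $\sigma$ values are hypothetical numbers chosen only for illustration.

```python
import math


def gaussian_pdf(x, x_m, sigma):
    """Gaussian (normal) probability density at x for mean x_m and std sigma."""
    exponent = -((x - x_m) ** 2) / (2 * sigma ** 2)
    return math.exp(exponent) / (sigma * math.sqrt(2 * math.pi))


def peak_value(sigma):
    """Maximum of the density, reached at x = x_m."""
    return 1 / (sigma * math.sqrt(2 * math.pi))


x_m = 100.0  # hypothetical mean reading
for sigma in (0.5, 1.0):  # hypothetical standard deviations
    print(f"sigma = {sigma}: P(x_m) = {gaussian_pdf(x_m, x_m, sigma):.4f}")
```

Halving $\sigma$ doubles the peak value, which is why $P(x_m)$ can serve as a measure of precision.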

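The probability that a measurement will fall within a certain range $x_1$ of the mean reading is

$$P = \int_{x_m - x_1}^{x_m + x_1} \frac{1}{\sigma\sqrt{2\pi}}\, e^{-(x - x_m)^2 / 2\sigma^2}\, dx$$

With the substitution $\eta = (x - x_m)/\sigma$ this becomes

$$P = \frac{1}{\sqrt{2\pi}} \int_{-\eta_1}^{+\eta_1} e^{-\eta^2/2}\, d\eta, \qquad \eta_1 = \frac{x_1}{\sigma}$$

Table: Values of the Gaussian normal error distribution (normal probability density function, P)
Table attached
Table: Integrals of the Gaussian normal error function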
Example: Calculate the probabilities that a measurement will fall within one, two, and three standard deviations of the mean value.

Solution: We perform the calculation using $\eta_1 = x_1/\sigma = 1$, 2, and 3.

The values of the integral may be obtained from the table of integrals of the Gaussian normal error function.
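Instead of reading the integral from a table, it can be evaluated with the error function, since the probability of falling within $\pm\eta_1$ standard deviations of the mean equals $\operatorname{erf}(\eta_1/\sqrt{2})$. The sketch below uses `math.erf` from the Python standard library; the function name `probability_within` is an illustrative choice.

```python
import math


def probability_within(eta_1):
    """Probability that a normal reading lies within eta_1 standard
    deviations of the mean: the integral of the unit normal density
    from -eta_1 to +eta_1."""
    return math.erf(eta_1 / math.sqrt(2))


probabilities = {eta_1: probability_within(eta_1) for eta_1 in (1, 2, 3)}

for eta_1, p in probabilities.items():
    print(f"within {eta_1} standard deviation(s): {p:.4f}")
```

The three probabilities come out at about 0.6827, 0.9545 and 0.9973, so roughly 68, 95 and 99.7 percent of readings are expected within one, two and three standard deviations.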

Confidence Interval and Level of Significance


The confidence interval expresses the probability that the mean value will lie within a certain number of $\sigma$ values and is given the symbol $z$. Thus, the interval is

$$x_m \pm z\sigma$$

The probability that the value of a randomly selected observation will lie in this range is called the confidence level.
For $z = 2.57$ we thus expect that the mean value will lie within $\pm 2.57\sigma$ with about 1 percent error (confidence level of 99 percent).

The level of significance is 1 minus the confidence level. Thus, for $z = 2.57$ the level of significance is 1 percent.
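The link between $z$ and the confidence level can be sketched with the Python standard library, assuming normally distributed readings. `math.erf` gives the confidence level for a given $z$, and `statistics.NormalDist().inv_cdf` gives the $z$ for a required confidence level. The function names below are illustrative.

```python
import math
from statistics import NormalDist


def confidence_level(z):
    """Probability of lying within +/- z standard deviations of the mean."""
    return math.erf(z / math.sqrt(2))


def z_for_confidence(level):
    """z such that the two-sided interval +/- z*sigma has the given level."""
    return NormalDist().inv_cdf(0.5 * (1 + level))


for z in (1.96, 2.57):
    level = confidence_level(z)
    print(f"z = {z}: confidence = {level:.4f}, significance = {1 - level:.4f}")

for level in (0.95, 0.99):
    print(f"confidence = {level}: z = {z_for_confidence(level):.3f}")
```

With $z = 2.57$ the confidence level evaluates to about 0.990, and a 5 percent level of significance corresponds to $z \approx 1.96$.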

For large data samples z should be replaced by

$$\Delta = \frac{z\sigma}{\sqrt{n}}$$

where $n$ is the number of measurements.

Example: A certain steel bar is measured with a device which has a known precision of ±0.5 mm when a large number of measurements is taken. How many measurements are necessary to establish the mean length with a 5 percent level of significance such that
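The required condition on the mean is cut off in this copy, so the sketch below only shows the general procedure. It assumes the interval on the mean takes the usual form $\Delta = z\sigma/\sqrt{n}$, which gives $n = (z\sigma/\Delta)^2$ rounded up. The tolerance of 0.2 mm is a hypothetical value chosen for illustration and is not taken from the notes.

```python
import math
from statistics import NormalDist


def measurements_needed(sigma, significance, tolerance):
    """Smallest n for which z*sigma/sqrt(n) does not exceed the tolerance.

    ASSUMPTION: the interval on the mean is delta = z*sigma/sqrt(n).
    """
    z = NormalDist().inv_cdf(1 - significance / 2)  # two-sided z value
    return math.ceil((z * sigma / tolerance) ** 2)


# sigma = 0.5 mm and the 5 percent significance come from the example;
# the 0.2 mm tolerance is HYPOTHETICAL.
n_required = measurements_needed(sigma=0.5, significance=0.05, tolerance=0.2)
print("measurements needed:", n_required)
```

Because $n$ grows with the square of $\sigma/\Delta$, halving the tolerance roughly quadruples the number of measurements.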


Rejection of Data
In most experiments, the experimenter finds that some of the data points are noticeably different from the majority of the data. If these data points were obtained under abnormal conditions involving gross blunders, and the experimenter is sure of their dubious nature, they can be discarded straight away.

However, the experimenter cannot reject a data point simply because it is different from the others; rejection must rely on standard mathematical methods.

The common methods of data rejection are: Chauvenet's criterion, use of confidence levels, and $3\sigma$ limits.

Chauvenet's criterion for rejecting data:

Chauvenet's criterion specifies that a reading may be rejected if the probability of obtaining the particular deviation from the mean is less than $1/(2n)$.


The above table gives the values of the ratio of deviation to standard deviation for various values of n according to this criterion.

When applying Chauvenet's criterion to eliminate any dubious data, the mean value and the standard deviation are first calculated using all data points. The deviations of the individual readings are then compared with the standard deviation.

If the ratio of the deviation of a reading to the standard deviation exceeds the limit given in the table above, that reading is rejected. The mean value and the standard deviation are then recalculated with the rejected reading excluded from the data.
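The limiting ratios in the table follow from the $1/(2n)$ rule, and can be regenerated as a check. The sketch below assumes normally distributed deviations: a two-sided probability of $1/(2n)$ leaves $1/(4n)$ in each tail, so the limiting ratio is the normal quantile at $1 - 1/(4n)$. The function name `chauvenet_ratio` and the chosen values of $n$ are illustrative.

```python
from statistics import NormalDist


def chauvenet_ratio(n):
    """Largest acceptable ratio of deviation to standard deviation for
    n readings under Chauvenet's criterion (normal errors assumed)."""
    # Probability of exceeding the ratio on either side is 1/(2n),
    # which leaves 1/(4n) in each tail.
    return NormalDist().inv_cdf(1 - 1 / (4 * n))


ratios = {n: chauvenet_ratio(n) for n in (3, 5, 10, 25, 50, 100)}

for n, ratio in ratios.items():
    print(f"n = {n:3d}: d_max / sigma = {ratio:.2f}")
```

The limit grows slowly with $n$, so larger data sets tolerate larger deviations before a reading is rejected.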

Rejecting Data Based on Confidence Level:

A criterion used for discarding a data point is that its deviation from the mean exceeds four times the probable error of a single reading. This amounts to discarding data outside the confidence interval for a single reading at a confidence level of 0.993.
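The quoted confidence level of 0.993 can be checked numerically. The sketch below assumes the usual definition of the probable error as the deviation that half of all normally distributed readings fall within, which is about $0.6745\sigma$; that definition is not stated in the notes.

```python
import math
from statistics import NormalDist

# ASSUMPTION: the probable error is the half-width of the interval that
# contains 50 percent of normally distributed readings (about 0.6745 sigma).
probable_error_ratio = NormalDist().inv_cdf(0.75)

# Rejection cutoff: four times the probable error, in units of sigma.
cutoff = 4 * probable_error_ratio

# Confidence level of the interval +/- cutoff * sigma.
level = math.erf(cutoff / math.sqrt(2))

print(f"probable error = {probable_error_ratio:.4f} sigma")
print(f"cutoff         = {cutoff:.3f} sigma")
print(f"confidence     = {level:.4f}")
```

The cutoff works out to about $2.7\sigma$, and the confidence level agrees with the value of 0.993 given in the text.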

Exercise
A laboratory experiment is conducted to measure the viscosity of a specimen of oil. A series of tests gives the values 5.3×10⁻³, 5.73×10⁻³, 6.77×10⁻³, 5.26×10⁻³, 4.33×10⁻³, 5.45×10⁻³, 6.09×10⁻³, 5.64×10⁻³, 5.81×10⁻³ and 5.75×10⁻³ m²/s. Point out any reading that can be rejected by applying Chauvenet's criterion. The ratio of maximum deviation to standard deviation should not exceed 1.96.
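One possible way to work the exercise is sketched below. It makes a single pass of the rejection test with the stated limit of 1.96 and then recomputes the mean and standard deviation without the rejected reading. The population form of the standard deviation (divisor $n$) is an assumption; the notes do not say which form to use. The function name `chauvenet_pass` is illustrative.

```python
import math

# Viscosity readings from the exercise, in m^2/s.
viscosity = [5.30e-3, 5.73e-3, 6.77e-3, 5.26e-3, 4.33e-3,
             5.45e-3, 6.09e-3, 5.64e-3, 5.81e-3, 5.75e-3]
LIMIT = 1.96  # limiting ratio of deviation to standard deviation for n = 10


def mean_and_sigma(readings):
    """Mean and standard deviation. ASSUMPTION: population form (divisor n)."""
    n = len(readings)
    mean = sum(readings) / n
    sigma = math.sqrt(sum((x - mean) ** 2 for x in readings) / n)
    return mean, sigma


def chauvenet_pass(readings, limit):
    """Split readings into kept and rejected using one pass of the test."""
    mean, sigma = mean_and_sigma(readings)
    kept = [x for x in readings if abs(x - mean) / sigma <= limit]
    rejected = [x for x in readings if abs(x - mean) / sigma > limit]
    return kept, rejected


kept, rejected = chauvenet_pass(viscosity, LIMIT)
new_mean, new_sigma = mean_and_sigma(kept)

print("rejected:", rejected)
print(f"revised mean = {new_mean:.3e} m^2/s")
print(f"revised standard deviation = {new_sigma:.3e} m^2/s")
```

Only the lowest reading is rejected. The highest reading has a ratio just under the limit and is kept, and using the sample form of the standard deviation instead would not change which reading is rejected.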


Comparison of Data with Normal Distribution
The Chi-Square Test of Goodness of Fit
The chi-square test of goodness of fit is a suitable way of checking the assumption that random experimental errors follow the Gaussian distribution. It is defined by

$$\chi^2 = \sum_{i=1}^{n} \frac{\left[(\text{observed value})_i - (\text{expected value})_i\right]^2}{(\text{expected value})_i}$$

Calculations have been made of the probability that the actual measurements match the expected distribution, and these probabilities are tabulated.
In this table F represents the number of degrees of freedom in the measurements and is given by: $F = n - k$
where n is the number of cells and k is the number of imposed conditions on the expected distribution.
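As a rough sketch of the procedure, the code below compares the fifty length readings from the histogram example with a normal distribution, using the usual statistic $\chi^2 = \sum (\text{observed} - \text{expected})^2/\text{expected}$. Several choices are assumptions made only for illustration: each tabulated length is treated as a cell of half-width 0.05 mm, the expected counts come from a normal distribution with the mean and standard deviation of the data, and $k = 3$ imposed conditions are taken (total count, mean and standard deviation).

```python
import math
from statistics import NormalDist

# Frequency table from the length example: length (mm) -> number of readings.
frequency = {
    99.7: 1, 99.8: 4, 99.9: 12, 100.0: 19, 100.1: 10, 100.2: 3, 100.3: 1,
}


def chi_square(observed, expected):
    """Sum over cells of (observed - expected)^2 / expected."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


lengths = sorted(frequency)
observed = [frequency[x] for x in lengths]
total = sum(observed)

# Mean and standard deviation of the data, weighted by frequency.
mean = sum(x * f for x, f in frequency.items()) / total
sigma = math.sqrt(
    sum(f * (x - mean) ** 2 for x, f in frequency.items()) / total
)

# Expected count in each cell. ASSUMPTION: cells of half-width 0.05 mm.
normal = NormalDist(mean, sigma)
half_width = 0.05
expected = [
    total * (normal.cdf(x + half_width) - normal.cdf(x - half_width))
    for x in lengths
]

chi2 = chi_square(observed, expected)
k = 3  # ASSUMPTION: total count, mean and standard deviation are imposed
F = len(observed) - k

print(f"chi-square = {chi2:.3f}, degrees of freedom F = {F}")
```

The computed value would then be looked up in the chi-square table at F degrees of freedom. In practice, cells with very small expected counts are often merged before the test is applied.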
Criteria for Goodness of Fit

Reading Assignment

Significance Test

Graphical Representation and Curve Fitting of Data
