CH7 - Statistical Data Treatment and Evaluation
A type I error occurs when we reject the null hypothesis that two quantities are the same when they are in fact statistically identical.
A type II error occurs when we accept that the two quantities are the same when they are not statistically identical.
Statistical data treatment has several common applications, which are described in the sections that follow.
The size of the confidence interval, which is computed from the sample standard deviation, depends on how well the sample standard deviation s estimates the population standard deviation σ.
Finding the confidence interval when σ is known or s is a good estimate of σ
In each of a series of five normal error curves, the relative frequency is plotted as a function of the quantity z. The shaded areas in each plot lie between the values of −z and +z indicated to the left and right of the curves. The numbers within the shaded areas are the percentage of the total area under the curve that is included within these values of z.
The confidence level (CL) is the probability that the true mean lies within a certain
interval and is often expressed as a percentage.
The probability that a result is outside the confidence interval is often
called the significance level.
CI for μ = x̄ ± zσ/√N
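As a small sketch of this formula, a 95% confidence interval with a known σ can be computed as follows (the replicate values and σ are assumed for illustration):

```python
import math

# Hypothetical data: five replicate measurements (values assumed for illustration).
x = [3.15, 3.21, 3.18, 3.30, 3.25]
N = len(x)
mean = sum(x) / N

sigma = 0.05   # assumed known population standard deviation
z = 1.96       # z value for the 95% confidence level

# CI for mu = x_bar +/- z*sigma/sqrt(N)
half_width = z * sigma / math.sqrt(N)
print(f"95% CI: {mean:.3f} ± {half_width:.3f}")
```

Note that the half-width shrinks as √N grows: quadrupling the number of replicates halves the interval.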
Tests of this kind use a null hypothesis, which assumes that the
numerical quantities being compared are the same.
Other significance levels, such as 0.01 (1%) or 0.001 (0.1%), may also
be adopted, depending on the certainty desired in the judgment.
Comparing an Experimental Mean with a Known Value
In many cases the mean of a data set needs to be compared with a
known value.
This is called a two-tailed test since rejection can occur for results in
either tail of the distribution.
For the 95% confidence level, the probability that z exceeds zcrit is 0.025
in each tail or 0.05 total.
If instead our alternative hypothesis is Ha: μ > μ0, the test is said to be a
one-tailed test. In this case, we can reject H0 only when z ≥ zcrit.
Rejection regions for the 95% confidence level.
(a) Two-tailed test for Ha: μ ≠ μ0. Note the critical value of z is 1.96.
(b) One-tailed test for Ha: μ > μ0. The critical value of z is 1.64 so that 95% of the area is to the left of zcrit and 5% of the area is to the right.
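The one- and two-tailed decisions can be sketched in a few lines; the sample mean, known value, σ, and N below are hypothetical:

```python
import math

# Hypothetical example: does a measured mean differ from a known value mu0?
x_bar, mu0 = 3.26, 3.19   # sample mean and accepted value (assumed)
sigma, N = 0.08, 5        # known population std dev and number of replicates

z = (x_bar - mu0) / (sigma / math.sqrt(N))

two_tailed_reject = abs(z) > 1.96   # Ha: mu != mu0 at the 95% level
one_tailed_reject = z > 1.64        # Ha: mu > mu0 at the 95% level
print(f"z = {z:.2f}")
print(f"two-tailed reject: {two_tailed_reject}, one-tailed reject: {one_tailed_reject}")
```

With these assumed numbers z falls just below 1.96 but above 1.64, so the same data lead to rejection in the one-tailed test but not in the two-tailed test.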
In other cases, the results are used to determine whether two analytical
methods give the same values or whether two analysts using the same
methods obtain the same means.
Typically, both sets contain only a few results, and we must use the t test.
The alternative hypothesis is Ha: μ1 ≠ μ2, and the test is a two-tailed test.
To get a better estimate of σ than is given by s1 or s2 (both estimates of the
population standard deviation) alone, we use the pooled standard
deviation.
The standard deviation of the mean of analyst 1 is s_m1 = s1/√N1.
The variance of the mean of analyst 1 is s²_m1 = s1²/N1.
The variance of the mean of analyst 2 is s²_m2 = s2²/N2.
The variance of the difference between the means is s²_d = s²_m1 + s²_m2.
The standard deviation of the difference between the means is
s_d = √(s1²/N1 + s2²/N2)
A further assumption is that the pooled standard deviation s_pooled is a better estimate of σ than s1 or s2, so that
s_d = √(s²_pooled/N1 + s²_pooled/N2) = s_pooled √((N1 + N2)/(N1·N2))
The test statistic t is now found from
t = (x̄1 − x̄2) / [s_pooled √((N1 + N2)/(N1·N2))]
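A minimal sketch of the pooled t test, using hypothetical results from two analysts:

```python
import math

# Hypothetical results from two analysts (values assumed for illustration).
a1 = [14.2, 14.5, 14.3, 14.6]
a2 = [14.8, 14.9, 14.7, 15.0, 14.9]

def mean(xs):
    return sum(xs) / len(xs)

def ss(xs):
    # Sum of squared deviations from the mean.
    m = mean(xs)
    return sum((x - m) ** 2 for x in xs)

N1, N2 = len(a1), len(a2)
# Pooled standard deviation: combines both estimates of sigma,
# with N1 + N2 - 2 degrees of freedom.
s_pooled = math.sqrt((ss(a1) + ss(a2)) / (N1 + N2 - 2))
# t = (x1_bar - x2_bar) / (s_pooled * sqrt((N1 + N2)/(N1*N2)))
t = (mean(a1) - mean(a2)) / (s_pooled * math.sqrt((N1 + N2) / (N1 * N2)))
print(f"s_pooled = {s_pooled:.4f}, t = {t:.2f}")
```

The computed |t| would then be compared with the critical t for N1 + N2 − 2 = 7 degrees of freedom at the chosen confidence level.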
If there is good reason to believe that the standard deviations of the two
data sets differ, the two-sample t test must be used.
However, the significance level for this t test is only approximate, and
the number of degrees of freedom is more difficult to calculate.
Paired Data
The paired t test uses the same type of procedure as the normal t test
except that we analyze pairs of data and compute the differences, d.
The test statistic is t = d̄√N / s_d, where d̄ = Σdᵢ/N is the average difference and s_d is the standard deviation of the differences.
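A short sketch of the paired t test, with hypothetical paired results from two methods applied to the same six samples:

```python
import math

# Hypothetical paired results: two methods applied to the same six samples.
method_A = [10.2, 9.8, 11.4, 10.0, 10.8, 11.1]
method_B = [10.5, 10.1, 11.5, 10.4, 11.0, 11.3]

# Work with the per-sample differences, not the raw values.
d = [a - b for a, b in zip(method_A, method_B)]
N = len(d)
d_bar = sum(d) / N                                    # mean difference
s_d = math.sqrt(sum((x - d_bar) ** 2 for x in d) / (N - 1))

# Paired t statistic: t = d_bar * sqrt(N) / s_d, with N - 1 df.
t = d_bar * math.sqrt(N) / s_d
print(f"d_bar = {d_bar:.3f}, t = {t:.2f}")
```

Pairing removes the sample-to-sample variation, so |t| here tests only whether the two methods differ systematically.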
The F test is also used in comparing more than two means and in linear
regression analysis.
The F test is based on the null hypothesis that the two population
variances under consideration are equal, H0: σ1² = σ2².
The test statistic F, which is defined as the ratio of the two sample
variances (F = s1²/s2²), is calculated and compared with the critical value
of F at the desired significance level.
The null hypothesis is rejected if the test statistic differs too much from
unity.
The F test can be used in either a one-tailed mode or in a two-tailed
mode.
For a one-tailed test we test the alternative hypothesis that one variance
is greater than the other.
For the two-tailed mode, the larger variance always appears in the
numerator. This arbitrary placement makes the outcome of the test less
certain; thus, the uncertainty level of the F values doubles from 5% to 10%.
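A minimal illustration of the F ratio; the two variances and the critical value below are assumed for illustration:

```python
# Hypothetical variance comparison for two methods (values assumed).
s1_sq = 0.0225   # variance of method 1 (the larger variance)
s2_sq = 0.0081   # variance of method 2

# The larger variance goes in the numerator, so F >= 1 by construction.
F = s1_sq / s2_sq

F_crit = 4.53   # assumed critical value; look it up for the actual df and level
reject = F > F_crit
print(f"F = {F:.2f}, reject H0: {reject}")
```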
7C Analysis of variance
The methods used for multiple comparisons fall under the general
category of analysis of variance, or ANOVA.
The alternative hypothesis is Ha: at least two of the μᵢ's are different.
For the calcium example, there are five levels corresponding to analyst 1, analyst
2, analyst 3, analyst 4, and analyst 5.
The grand average x̿ is given by
x̿ = (N1/N)x̄1 + (N2/N)x̄2 + (N3/N)x̄3 + ... + (N_I/N)x̄_I
The grand average can also be found by summing all the data values and
dividing by the total number of measurements N.
To calculate the variance ratio needed in the F test, it is necessary to obtain
several other quantities called sums of squares:
1. The sum of the squares due to the factor SSF is
SSF = N1(x̄1 − x̿)² + N2(x̄2 − x̿)² + N3(x̄3 − x̿)² + ... + N_I(x̄_I − x̿)²
2. The total sum of the squares SST is obtained as the sum of SSF and SSE:
SST = SSF + SSE
The total sum of the squares SST has N - 1 degrees of freedom.
Just as SST is the sum of SSF and SSE, the total number of degrees of freedom N - 1
can be decomposed into degrees of freedom associated with SSF and SSE.
Since there are I groups being compared, SSF has I - 1 degrees of freedom.
This leaves N - I degrees of freedom for SSE. Or,
SST = SSF + SSE
(N - 1) = (I - 1) + (N - I)
Thus, if the null hypothesis is true, the two mean squares should be nearly
identical. If the factor effect is significant, MSF is greater than MSE.
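The sums of squares, mean squares, and F ratio can be sketched as follows; the three groups of data are hypothetical:

```python
# Hypothetical one-way ANOVA with I = 3 analysts, 3 replicates each.
groups = [
    [10.3, 10.5, 10.4],
    [10.7, 10.9, 10.8],
    [10.2, 10.3, 10.4],
]

N = sum(len(g) for g in groups)       # total number of measurements
I = len(groups)                       # number of groups (factor levels)
grand = sum(sum(g) for g in groups) / N
means = [sum(g) / len(g) for g in groups]

# SSF: between-group sum of squares; SSE: within-group sum of squares.
SSF = sum(len(g) * (m - grand) ** 2 for g, m in zip(groups, means))
SSE = sum(sum((x - m) ** 2 for x in g) for g, m in zip(groups, means))
SST = SSF + SSE

MSF = SSF / (I - 1)   # mean square for the factor, I - 1 df
MSE = SSE / (N - I)   # mean square for error, N - I df
F = MSF / MSE
print(f"SSF={SSF:.4f}, SSE={SSE:.4f}, F={F:.2f}")
```

A large F (MSF much greater than MSE) indicates that the between-group spread is too big to be explained by the within-group scatter alone.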
The least significant difference (LSD) is given by
LSD = t √(2·MSE/N_g)
where MSE is the mean square for error and the value of t has N – I
degrees of freedom.
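A small numeric sketch of the LSD formula; the MSE, group size, and t value below are assumed:

```python
import math

# Hypothetical follow-up to an ANOVA (values assumed for illustration).
MSE = 0.01    # mean square for error from the ANOVA table
Ng = 3        # number of replicates per group (equal group sizes assumed)
t = 2.45      # approximate t value for N - I = 6 df at the 95% level

# LSD = t * sqrt(2 * MSE / Ng)
LSD = t * math.sqrt(2 * MSE / Ng)
print(f"LSD = {LSD:.3f}")
```

Any pair of group means that differ by more than the LSD would then be judged significantly different.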
7D Detection of gross errors
An outlier is a result that is quite different from the others in the data set.
The choice of criterion for the rejection of a suspected result has its perils. If
the standard is too strict so that it is quite difficult to reject a questionable result,
there is a risk of retaining a spurious value that has an inordinate effect on the
mean.
If we set a lenient limit and make the rejection of a result easy, we are likely to
discard a value that rightfully belongs in the set, thus introducing bias to the data.
2. If possible, estimate the precision that can be reasonably expected from the
procedure to be sure that the outlying result actually is questionable.
3. Repeat the analysis if sufficient sample and time are available. Agreement
between the newly acquired data and those of the original set that appear to be
valid will lend weight to the notion that the outlying result should be rejected.
4. If more data cannot be secured, apply the Q test to the existing set to see if the
doubtful result should be retained or rejected on statistical grounds.
5. If the Q test indicates retention, consider reporting the median of the set
rather than the mean. The median has the great virtue of allowing inclusion of all
data in a set without undue influence from an outlying value. In addition, the
median of a normally distributed set containing three measurements provides a
better estimate of the correct value than the mean of the set after the outlying
value has been discarded.
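Steps 4 and 5 can be sketched as follows; the data set is hypothetical, and the critical Q value used (0.710 for N = 5 at the 95% confidence level) should be checked against a Q-test table:

```python
# Hypothetical Q-test sketch for a suspected outlier.
data = [55.95, 56.00, 56.04, 56.08, 56.23]
xs = sorted(data)

# Test the most extreme value; here the largest one is the suspect.
gap = xs[-1] - xs[-2]    # |suspect - nearest neighbor|
rng = xs[-1] - xs[0]     # spread of the whole set
Q = gap / rng

Q_crit = 0.710           # critical Q for N = 5 at the 95% level (check a table)
reject = Q > Q_crit
print(f"Q = {Q:.3f}, reject outlier: {reject}")

# If the Q test indicates retention, consider reporting the median,
# which is not unduly influenced by the outlying value.
median = xs[len(xs) // 2]
print(f"median = {median}")
```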