
Lecture 2


1

Chapter 2

Statistics
{2}
Slides (1- 26)

2
Statistics:

- Experimental measurements always contain some variability, so no conclusion can be drawn with certainty.

- Statistics gives us tools to accept conclusions that have a high probability of being correct, and to reject conclusions that do not.

3
2.1. Gaussian Distribution:

- If an experiment is repeated a great many times, and if the errors are purely random, then the results tend to cluster symmetrically about the average value.

- In general, we cannot make so many measurements in a laboratory experiment.

- We are more likely to repeat an experiment 3 to 5 times than 2000 times.
4
However, from a small set of results we can estimate the statistical parameters that describe the large set. We can then make estimates of statistical behavior from the small number of measurements.

The more times the experiment is repeated, the more closely the results approach an ideal smooth curve called the Gaussian distribution.

5
Mean Value and Standard Deviation:
- Light bulb lifetimes, and the corresponding Gaussian curve, are characterized by two parameters: the mean and the standard deviation.
- The arithmetic mean, $\bar{x}$, also called the average, is the sum of the measured values divided by n, the number of measurements:

$\bar{x} = \frac{\sum_i x_i}{n}$

where $x_i$ is the lifetime of an individual bulb.

The Greek capital sigma, $\sum$, means summation:
$\sum_i x_i = x_1 + x_2 + x_3 + \dots + x_n$
6
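The standard deviation, s:

- measures how closely the data are clustered about the mean.

- The smaller the standard deviation, the more closely the data are clustered about the mean.

$s = \sqrt{\frac{\sum_i (x_i - \bar{x})^2}{n - 1}}$

The quantity n - 1 is called the degrees of freedom.

The standard deviation can also be expressed as a percentage of the mean value, $100 \times s / \bar{x}$ (the relative standard deviation).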
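The mean and the standard deviation with n - 1 degrees of freedom can be sketched in a few lines of Python. The four lifetimes below are hypothetical values chosen only for illustration, and the function names are not from the slides.

```python
import math

def mean(values):
    """Arithmetic mean: sum of the measured values divided by n."""
    return sum(values) / len(values)

def sample_std(values):
    """Sample standard deviation with n - 1 degrees of freedom."""
    n = len(values)
    x_bar = mean(values)
    squared_deviations = sum((x - x_bar) ** 2 for x in values)
    return math.sqrt(squared_deviations / (n - 1))

# Hypothetical light bulb lifetimes in hours (illustrative only).
lifetimes = [821.0, 783.0, 834.0, 855.0]

x_bar = mean(lifetimes)
s = sample_std(lifetimes)
relative_std_percent = 100 * s / x_bar  # s expressed as a % of the mean

print(f"mean = {x_bar:.2f} h, s = {s:.2f} h, s as % of mean = {relative_std_percent:.2f}%")
```

Python's standard library offers `statistics.mean` and `statistics.stdev`, which should give the same numbers; the explicit version is shown only to mirror the formulas.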

7
Fig. 2-2: Gaussian curves for two sets of light bulbs, one having a standard deviation half as great as the other. The number of bulbs described by each curve is the same.
8
Confidence Intervals

Student's t Test:
Student's t is the statistical tool used most frequently to express confidence intervals and to compare results from different experiments.

It is the tool you could use to evaluate the probability that your red blood cell count will be found in a certain range on "normal" days.
9
Calculating Confidence Intervals
- From a limited number of measurements, we cannot find the true population mean, μ, or the true standard deviation, σ.

- What we can determine are $\bar{x}$ and s, the sample mean and the sample standard deviation.

10
The confidence interval of μ is given by

Confidence interval: $\mu = \bar{x} \pm \frac{t s}{\sqrt{n}}$

where

s is the measured standard deviation,
n is the number of observations,
t is Student's t (taken from Table 2-1, page 31).

11
Example 1:

- The carbohydrate content of a glycoprotein (a protein with sugars attached to it) is determined to be 12.6, 11.9, 13.0, 12.7, and 12.5 g of carbohydrate per 100 g of protein in replicate analyses.

- Find the 50% and 90% confidence intervals for the carbohydrate content.

12
Solution:
First calculate $\bar{x} = 12.54$ and s = 0.40 for the five measurements.
For the 50% confidence interval, look up t in Table 2-1 under 50% and across from four degrees of freedom (n - 1 = 4).

The value of t is 0.741, so the 50% confidence interval is

$\mu = \bar{x} \pm \frac{t s}{\sqrt{n}} = 12.54 \pm \frac{(0.741)(0.40)}{\sqrt{5}} = 12.54 \pm 0.13$

13
The 90% confidence interval is:

$\mu = 12.54 \pm \frac{(2.132)(0.40)}{\sqrt{5}} = 12.54 \pm 0.38$

These calculations mean that:

- There is a 50% chance that the true mean, μ, lies within the range 12.54 ± 0.13 (12.41 to 12.67).

- There is a 90% chance that μ lies within the range 12.54 ± 0.38 (12.16 to 12.92).
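The arithmetic in Example 1 can be checked with a short Python sketch. The t values are the ones quoted from Table 2-1 for four degrees of freedom; the variable and function names are illustrative assumptions.

```python
import math
import statistics

# Replicate analyses from Example 1 (g carbohydrate per 100 g protein).
content = [12.6, 11.9, 13.0, 12.7, 12.5]

n = len(content)
x_bar = statistics.mean(content)
s = statistics.stdev(content)  # sample standard deviation, n - 1 in the denominator

# Student's t for 4 degrees of freedom, as quoted from Table 2-1.
T_TABLE = {50: 0.741, 90: 2.132}

def half_width(t, s, n):
    """Half-width of the confidence interval, t*s/sqrt(n)."""
    return t * s / math.sqrt(n)

intervals = {level: half_width(t, s, n) for level, t in T_TABLE.items()}

for level, hw in intervals.items():
    print(f"{level}% confidence interval: {x_bar:.2f} ± {hw:.2f} "
          f"({x_bar - hw:.2f} to {x_bar + hw:.2f})")
```

The sketch carries the unrounded standard deviation through the calculation, which gives the same half-widths to two decimal places as using s = 0.40.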

14
Example 2:

Comparison of two methods for measuring cholesterol

Cholesterol content (g/L)

Plasma sample   Method A   Method B   Difference ($d_i$)
1               1.46       1.42        0.04
2               2.22       2.38       -0.16
3               2.84       2.67        0.17
4               1.97       1.80        0.17
5               1.13       1.09        0.04
6               2.35       2.25        0.10
                           Mean difference: $\bar{d}$ = +0.060
15
Answer:

$t_{\text{calculated}} = \frac{\bar{d}}{s_d} \sqrt{n}$

where

$s_d = \sqrt{\frac{\sum_i (d_i - \bar{d})^2}{n - 1}}$

The quantity $\bar{d}$ is the average difference between methods A and B, and n is the number of pairs of data (six in this case).

16
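$s_d$ is

$s_d = \sqrt{\frac{(0.04 - \bar{d})^2 + (-0.16 - \bar{d})^2 + (0.17 - \bar{d})^2 + (0.17 - \bar{d})^2 + (0.04 - \bar{d})^2 + (0.10 - \bar{d})^2}{6 - 1}} = 0.122$ (using $\bar{d} = 0.060$)

Putting $s_d$ into the equation for t gives

$t_{\text{calculated}} = \frac{0.060}{0.122} \sqrt{6} = 1.20$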
17
We find that t(calculated) = 1.20 is less than t(table) = 2.571, listed in Table 2-1 for 95% confidence and 5 degrees of freedom.

The two techniques are therefore not significantly different at the 95% confidence level.
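The paired comparison in Example 2 can be reproduced with the following Python sketch. The tabulated t value is the one quoted from Table 2-1; the variable names are assumptions, and the magnitude of the mean difference is used so that the sign of A minus B does not affect the comparison.

```python
import math
import statistics

# Cholesterol content (g/L) for the six plasma samples in Example 2.
method_a = [1.46, 2.22, 2.84, 1.97, 1.13, 2.35]
method_b = [1.42, 2.38, 2.67, 1.80, 1.09, 2.25]

# Paired differences d_i = A - B for each sample.
differences = [a - b for a, b in zip(method_a, method_b)]

n = len(differences)
d_bar = statistics.mean(differences)   # average difference
s_d = statistics.stdev(differences)    # standard deviation of the differences (n - 1)

# t(calculated) = |d_bar| / s_d * sqrt(n)
t_calculated = abs(d_bar) / s_d * math.sqrt(n)

T_TABLE_95_5DF = 2.571  # quoted from Table 2-1: 95% confidence, 5 degrees of freedom
significantly_different = t_calculated > T_TABLE_95_5DF

print(f"d_bar = {d_bar:.3f}, s_d = {s_d:.3f}, t(calculated) = {t_calculated:.2f}")
print("Significantly different at 95%:", significantly_different)
```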

18
The F Test
- The F test is used to determine whether two variances are statistically different.
- It indicates whether there is a significant difference between two methods based on their standard deviations.
- F is defined in terms of the variances of the two methods, where the variance is the square of the standard deviation:

$F = s_1^2 / s_2^2$

If F(calculated) is greater than F(table), then the difference is significant.
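A minimal Python sketch of the F comparison follows. The two standard deviations are hypothetical, the tabulated F must be looked up for your own degrees of freedom (the 5.05 used here is the commonly tabulated 95% value for 5 and 5 degrees of freedom, not a value from these slides), and placing the larger variance in the numerator is an assumed convention so that F is at least 1.

```python
def f_statistic(s_a, s_b):
    """Ratio of the two variances, larger variance on top (assumed convention)."""
    var_a, var_b = s_a ** 2, s_b ** 2
    return max(var_a, var_b) / min(var_a, var_b)

# Hypothetical standard deviations for two methods, six measurements each.
s_1, s_2 = 0.30, 0.20

# Assumed tabulated F for 5 and 5 degrees of freedom at 95% confidence.
F_TABLE = 5.05

f_calculated = f_statistic(s_1, s_2)
significant = f_calculated > F_TABLE

print(f"F(calculated) = {f_calculated:.2f}, significant: {significant}")
```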

19
Q Test for Bad Data:
Sometimes one datum is inconsistent with the remaining data.

You can use the Q test to help decide whether to retain or discard a questionable datum.

Consider the five results 12.53, 12.56, 12.47, 12.67, and 12.48. Is 12.67 a "bad point"?

To apply the Q test, arrange the data in order of increasing value and calculate Q, defined as
20
Q(calculated) = gap / range

* The range is the total spread of the data.
* The gap is the difference between the questionable point and the nearest value.

12.47   12.48   12.53   12.56   12.67 (questionable value, too high)

Gap = 12.67 - 12.56 = 0.11
Range = 12.67 - 12.47 = 0.20

21
If Q(calculated) > Q(table), the questionable point should be discarded.
For the numbers above,
Q(calculated) = 0.11 / 0.20 = 0.55
From the table, we find Q(table) = 0.64.
Because Q(calculated) is less than Q(table), the questionable point should be retained.
There is more than a 10% chance that the value 12.67 is a member of the same population as the other four numbers.
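The Q test for the five results can be sketched in Python as below. The Q(table) value of 0.64 is the one quoted on the slide; the function name and its arguments are illustrative assumptions.

```python
def q_statistic(values, suspect):
    """Q = gap / range for one questionable value in a small data set."""
    data = sorted(values)
    data_range = data[-1] - data[0]   # total spread of the data
    others = list(data)
    others.remove(suspect)            # drop one occurrence of the suspect value
    gap = min(abs(suspect - v) for v in others)  # distance to the nearest value
    return gap / data_range

results = [12.53, 12.56, 12.47, 12.67, 12.48]
Q_TABLE = 0.64  # quoted on the slide for five observations

q_calculated = q_statistic(results, 12.67)
retain = q_calculated <= Q_TABLE  # discard only if Q(calculated) > Q(table)

print(f"Q(calculated) = {q_calculated:.2f}, retain 12.67: {retain}")
```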

22
Thank You

23
t Tests with a Spreadsheet
Excel has built-in procedures for conducting tests with Student's t. Enter the two sets of data in columns B and C of a spreadsheet.

In the TOOLS menu, you might find DATA ANALYSIS.

If not, select ADD-INS in the TOOLS menu and find Analysis ToolPak.

Put an x beside Analysis ToolPak and click OK. DATA ANALYSIS will then be available in the TOOLS menu.
24
We want to know whether the mean values of the two sets of data are statistically the same or not.

In the TOOLS menu, select DATA ANALYSIS. In the window that appears, select t-Test: Two-Sample Assuming Equal Variances. Click OK.
The next window asks you to indicate the cells in which the two sets of data are located. Write B5:B12 for Variable 1 and C5:C12 for Variable 2.
The routine will ignore the blank space in cell B12.
For the Hypothesized Mean Difference enter 0, and for Alpha enter 0.05. Alpha is the level of probability to which we are testing the difference in the means. With Alpha = 0.05, we are at the 95% confidence level.
For Output Range, select cell E1 and click OK.

25
Spreadsheet for comparing mean values of Rayleigh’s measurements

26
