
Lecture 9 - Parametric Statistics (Teaching)

1. The document outlines learning outcomes for a lecture on parametric statistics including how to interpret and compare means, proportions, and perform significance testing using the z-test, t-test, chi-square test, Fisher's exact test, ANOVA, and how to define null and alternative hypotheses. 2. It provides details on how to perform independent and paired t-tests to compare two population means, including the assumptions, test statistics, and how to interpret p-values and draw conclusions. 3. Examples are given of both independent and paired t-test analyses to compare means between two groups and before/after measurements to determine significance.

8/31/2015

Parametric statistics
Inference for population(s) - Mean & Proportion

Department of Biomedical Sciences
Faculty of Medicine
MAHSA University

Learning outcomes
At the end of this lecture, students should be able to:
• Interpret Z-test & t-test
• Interpret odds ratio
• Compare single population proportion (Chi-square test)
• Compare two population proportions (Fisher's exact test)
• Compare two population means (independent and paired t-tests)
• Compare means of more than 2 groups (ANOVA)

Significance testing – general overview
1. Define the null and alternative hypotheses under the study
2. Acquire data
3. Calculate the value of the test statistic
4. Compare the value of the test statistic to values from a known probability distribution
5. Interpret the p-value and draw conclusion

Z-test & t-test
• They are used to test whether a mean or proportion differs from a specified value. For two populations, they are used to compare the means or proportions.
• Z-test : variance known, sample size big (n > 30)
  t-test : variance unknown, sample size small
• The t-test is frequently used in research / clinical trials.

T-TEST – COMPARE TWO MEANS
1. Parametric test, since it compares means
2. Paired samples t-test – the mean difference between two linked groups
3. Independent samples t-test – the mean difference between two independent groups

Comparing Two Means
• What does the difference between sample means tell about the difference between two population means?
• Example :
(a) Is there any difference in monthly expenses between genders?
(b) Is BMI higher in the female group than in the male group?
• We can answer the above questions by conducting a significance test to compare the means.


INDEPENDENT SAMPLES T-TEST
Assumptions:
- In the population of interest the variable is normally distributed.
- The variances of the 2 groups are the same.

Independent t-test
• Example 1:
The body weights of two groups of students were recorded and are presented in the following table.

          Weight (kg)
Group A   55, 50, 48, 53, 49, 51, 60, 55
Group B   52, 48, 48, 53, 45, 62, 62, 61

Is the mean body weight different between the two groups of students?

Independent t-test
• SPSS printout :

Group Statistics
         Group     N   Mean      Std. Deviation   Std. Error Mean
Weight   Group A   8   52.6250   3.96187          1.40073
         Group B   8   53.8750   6.91659          2.44539

Independent t-test
• Ho : µA - µB = 0
  Ha : µA - µB ≠ 0
• p-value = 0.664
• Conclusion:
p > α, do not reject Ho. The test result is not significant. There is not enough evidence that the mean body weight differs between the two groups of students.
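The test statistic behind the independent t-test result for Example 1 can be sketched in a few lines of Python. This is a minimal illustration under the equal-variance assumption stated on the slide, not the SPSS procedure itself. The helper name `pooled_t` is hypothetical, and only the standard library is used, so the p-value is left to the printout.

```python
from math import sqrt
from statistics import mean, variance

# Body weights (kg) from Example 1.
group_a = [55, 50, 48, 53, 49, 51, 60, 55]
group_b = [52, 48, 48, 53, 45, 62, 62, 61]


def pooled_t(x, y):
    """Return (t, df) for the equal-variance two-sample t-test.

    Hypothetical helper written for this illustration.
    """
    n1, n2 = len(x), len(y)
    df = n1 + n2 - 2
    # Pooled variance: sample variances weighted by their degrees of freedom.
    sp2 = ((n1 - 1) * variance(x) + (n2 - 1) * variance(y)) / df
    # Standard error of the difference between the two sample means.
    se = sqrt(sp2 * (1 / n1 + 1 / n2))
    return (mean(x) - mean(y)) / se, df


t_stat, df = pooled_t(group_a, group_b)
print(f"mean A = {mean(group_a):.3f}, mean B = {mean(group_b):.3f}")
print(f"t = {t_stat:.3f} on {df} df")
```

The absolute value of t (about 0.44) is far below 2.145, the two-sided 5% critical value for 14 degrees of freedom, which agrees with the decision not to reject Ho.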

Paired t-test
• With dependent samples, each observation in one sample has a matched observation in the other sample (matched pairs data).
• In the analysis, we test the mean difference (µd) of the two matched samples.
• µd = µA - µB
• Example of paired t-test: test the effectiveness of a remedial class.
  10 students give 20 data points (10 pre-test & 10 post-test).

Paired t-test
• Example 2:
A doctor carried out research to investigate whether exercise helps reduce blood pressure. The results are as follows:

Subject   Before   After
1         150      130
2         155      145
3         130      120
4         131      135
5         145      140


Paired t-test
• SPSS printout :

Paired t-test
• Ho : µd = 0
  Ha : µd ≠ 0
• p-value = 0.104
  CI = -2.639 < µd < 19.039
• Conclusion:
p > α, do not reject Ho. The test result is not significant. There is not enough evidence that exercise helps reduce blood pressure.
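The paired t-test for Example 2 reduces to a one-sample t-test on the before-minus-after differences. The sketch below is only an illustration of that arithmetic, assuming the confidence interval on the slide is a 95% interval. The critical value 2.776 (two-sided 5%, 4 degrees of freedom) is taken from standard t-tables rather than computed.

```python
from math import sqrt
from statistics import mean, stdev

# Blood pressure readings from Example 2.
before = [150, 155, 130, 131, 145]
after = [130, 145, 120, 135, 140]

# Work on the within-subject differences (before - after).
diffs = [b - a for b, a in zip(before, after)]
n = len(diffs)

d_bar = mean(diffs)            # mean difference
se = stdev(diffs) / sqrt(n)    # standard error of the mean difference
t_stat = d_bar / se            # test statistic on n - 1 df

# Assumption: 95% interval, critical value read from a t-table (4 df).
t_crit = 2.776
ci = (d_bar - t_crit * se, d_bar + t_crit * se)

print(f"mean difference = {d_bar:.1f}, t = {t_stat:.2f} on {n - 1} df")
print(f"95% CI: {ci[0]:.3f} < mu_d < {ci[1]:.3f}")
```

The interval reproduces the printed limits to two decimal places and contains zero, which matches the non-significant p-value.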

Numerical data > 2 groups
Compare means from several groups
Single global test of difference in means
Also test for linear trend
1-way analysis of variance (ANOVA)

Extend t-test to > 2 groups, i.e. ANalysis Of VAriance (ANOVA)
Consider scores for contribution to energy intake from fat groups, milk groups and alcohol groups.
Does the mean score differ across the three categories of intake groups?

Koh ET, Owen WL. Introduction to Nutrition and Health Research. Kluwer Boston, 2000

One-Way ANOVA of Scores

Contributor to Energy Intake
Fat           Milk          Alcohol
n = 6         n = 6         n = 6
Mean = 4.22   Mean = 2.01   Mean = 0.167

One-Way ANOVA of Scores
The null hypothesis (H0) is 'there are no differences in mean score across the three groups':
$\mu_1 = \mu_2 = \mu_3$
Use SPSS One-Way ANOVA to carry out this test.


Assumptions of 1-Way ANOVA
1. Standard deviations are similar
2. Test variable (scores) are approx. normally distributed
If assumptions are not met, use the non-parametric equivalent, the Kruskal-Wallis test.

Results of ANOVA
ANOVA partitions variation into Within and Between group components.
Results in F-statistic – compared with values in F-tables.
F = 108.6, with 2 and 15 df, p<0.001

Results of ANOVA
The groups differ significantly and it is clear the Fat group contributes most to energy score with a mean = 4.22.
Further pair-wise comparisons can be made (3 possible) using multiple comparisons tests, e.g. Bonferroni or Tukey post hoc tests.

Summary of ANOVA
ANOVA is useful when there are a number of groups with a continuous summary in each.
SPSS does all pairwise group comparisons adjusted for multiple testing.
Note that ANOVA is just a form of linear regression.

Analysis of Frequency Data
An Introduction to the Chi-Square Distribution

TESTS OF INDEPENDENCE
• To test whether two criteria of classification are independent. For example, whether socioeconomic status and area of residence of people in a city are independent.
• We divide our sample according to status (low, medium and high income, etc.), and the same sample is categorized according to area (urban, rural, suburban, slums, etc.).
• Put the first criterion (socioeconomic status) in columns, one column per category, and the second criterion (area of the city) in rows, one row per category.


TESTS OF INDEPENDENCE
Characteristics that distinguish it from other chi-square tests are:
• A single sample is selected from a population of interest, and the subjects or objects are cross-classified on the basis of the two variables of interest.
• The rationale for calculating expected cell frequencies is based on the probability law, which states that if two events (here the two criteria of classification) are independent, the probability of their joint occurrence is equal to the product of their individual probabilities.
• The hypotheses and conclusions are stated in terms of the independence (or lack of independence) of two variables.

Contingency Tables (r x c)
1. Tables can be any size. For example 2 x 2, 3 x 5, 10 x 4, etc.
2. But with very large tables, it is difficult to interpret tests of association.
3. Cross tabulations in SPSS can give odds ratios as an option when the row or column variable has two categories.

The Contingency Table
• Table: Two-Way Classification of Sample

Second Criterion ↓   First Criterion of Classification →
                     1     2     3     …    c     Total
1                    N11   N12   N13   …    N1c   N1.
2                    N21   N22   N23   …    N2c   N2.
3                    N31   N32   N33   …    N3c   N3.
.                    .     .     .     …    .     .
r                    Nr1   Nr2   Nr3   …    Nrc   Nr.
Total                N.1   N.2   N.3   …    N.c   N

Text Book : Basic Concepts and Methodology for the Health Sciences

Observed versus Expected Frequencies
• O_ij : The frequencies in the ith row and jth column of any contingency table are called observed frequencies; they result from the cross classification according to the two classifications.
• e_ij : Expected frequencies, on the assumption of independence of the two criteria, are calculated by multiplying the marginal totals of the cell's row and column and then dividing by the total frequency.
• Formula: $e_{ij} = \dfrac{N_{i\cdot}\, N_{\cdot j}}{N}$

Chi-square Test
• After the calculation of expected frequencies, prepare a table of expected frequencies and use the chi-square statistic:
$\chi^2 = \sum_{i=1}^{k} \dfrac{(o_i - e_i)^2}{e_i}$
where the summation is over all r x c = k cells.
• D.F.: the degrees of freedom for using the table are (r-1)(c-1) for α level of significance.
• Note that the test is always one-sided.

Example
The researchers are interested in determining whether preconception use of folic acid and race are independent. The data are:

Observed Frequencies Table
          Use of Folic Acid
          Yes    No     Total
White     260    299    559
Black     15     41     56
Other     7      14     21
Total     282    354    636

Expected Frequencies Table
          Yes                       No                        Total
White     (282)(559)/636 = 247.86   (354)(559)/636 = 311.14   559
Black     (282)(56)/636 = 24.83     (354)(56)/636 = 31.17     56
Other     (282)(21)/636 = 9.31      (354)(21)/636 = 11.69     21
Total     282                       354                       636


Calculations and Testing
• Data: See the given table
• Assumption: Simple random sample
• Hypothesis: H0: race and use of folic acid are independent
  HA: the two variables are not independent. Let α = 0.05
• The test statistic is the chi-square statistic given earlier.
• Distribution: when H0 is true, the chi-square distribution is valid with (r-1)(c-1) = (3-1)(2-1) = 2 d.f.
• Decision Rule: Reject H0 if the value of $\chi^2$ is greater than $\chi^2_{\alpha,(r-1)(c-1)} = 5.991$
• Calculations: $\chi^2 = (260 - 247.86)^2/247.86 + (299 - 311.14)^2/311.14 + \dots + (14 - 11.69)^2/11.69 = 9.091$

Conclusion
• Statistical decision: We reject H0 since 9.08960 > 5.991.
• Conclusion: we conclude that H0 is false, and that there is a relationship between race and preconception use of folic acid.
• P value: Since 7.378 < 9.08960 < 9.210, 0.01 < p < 0.025.
• We also reject the hypothesis at the 0.025 level of significance but do not reject it at the 0.01 level.
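The expected frequencies and the chi-square statistic for the folic acid example can be checked with a short script. This is a sketch for illustration, not output from SPSS or the textbook. The variable names are made up, and the closed-form p-value `exp(-chi2 / 2)` is used only because the chi-square distribution with exactly 2 degrees of freedom has that survival function; it does not apply to other degrees of freedom.

```python
from math import exp

# Observed counts: race -> (used folic acid: yes, no).
observed = {
    "White": (260, 299),
    "Black": (15, 41),
    "Other": (7, 14),
}

# Marginal totals.
row_totals = {race: sum(cells) for race, cells in observed.items()}
col_totals = [sum(cells[j] for cells in observed.values()) for j in range(2)]
n = sum(row_totals.values())

# Expected count under independence: row total * column total / n.
chi2 = 0.0
for race, cells in observed.items():
    for j, o in enumerate(cells):
        e = row_totals[race] * col_totals[j] / n
        chi2 += (o - e) ** 2 / e

df = (len(observed) - 1) * (len(col_totals) - 1)

# Valid only because df == 2 here (see the note above).
p_value = exp(-chi2 / 2)

print(f"chi-square = {chi2:.3f} on {df} df, p = {p_value:.4f}")
```

With unrounded expected frequencies the statistic comes out near 9.091, and the p-value falls between 0.01 and 0.025, in line with the table lookup in the conclusion.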

ODDS RATIO
• In a retrospective study, samples are selected from those who have the disease, called 'cases', and those who do not have the disease, called 'controls'. The investigator looks back (takes a retrospective look) at the subjects and determines which ones have (or had) and which ones do not have (or did not have) the risk factor.
• The data are classified into a 2x2 table for comparing cases and controls on the risk factor, and the ODDS RATIO IS CALCULATED.
• ODDS are defined to be the ratio of the probability of success to the probability of failure.
• The estimate of the population odds ratio is $\widehat{OR} = \dfrac{a/b}{c/d} = \dfrac{ad}{bc}$

ODDS RATIO
• Where a, b, c and d are the numbers given in the following table:

Risk Factor ↓   Sample              Total
                Cases     Control
Present         a         b         a+b
Absent          c         d         c+d
Total           a+c       b+d

• We may construct a 100(1-α)% CI for OR by the formula: $\widehat{OR}^{\,1 \pm (z_{\alpha}/\sqrt{X^2})}$

Example for Odds Ratio
• Data relate to the obesity status of children aged 5-6 and the smoking status of their mothers during pregnancy.

Smoking status        Obesity status
(during pregnancy)    Cases    Non-cases   Total
Smoked throughout     64       342         406
Never smoked          68       3496        3564
Total                 132      3838        3970

• Hence OR for the table is: $\widehat{OR} = \dfrac{(64)(3496)}{(342)(68)} = 9.62$

Confidence Interval for Odds Ratio
The (1-α) 100% Confidence Interval for Odds Ratio is:
$\widehat{OR}^{\,1 \pm (z_{\alpha}/\sqrt{X^2})}$
where
$X^2 = \dfrac{n(ad - bc)^2}{(a+c)(b+d)(a+b)(c+d)}$
For example, we have a=64, b=342, c=68, d=3496, therefore:
$X^2 = \dfrac{3970\,(64 \times 3496 - 342 \times 68)^2}{(132)(3838)(406)(3564)} = 217.68$
Its 95% CI is:
$\widehat{OR}^{\,1 \pm (z_{\alpha}/\sqrt{X^2})} = 9.62^{\,1 \pm (1.96/\sqrt{217.6831})}$, or (7.12, 13.00)
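The odds ratio and its confidence interval for the smoking and obesity data can be reproduced as follows. This is a minimal sketch of the arithmetic above, with variable names chosen for illustration. It assumes the denominator of X² is the product of the four marginal totals, which is what the worked numbers (132, 3838, 406, 3564) use.

```python
from math import sqrt

# 2x2 table: rows = smoking status, columns = obesity status.
a, b = 64, 342     # smoked throughout: cases, non-cases
c, d = 68, 3496    # never smoked: cases, non-cases
n = a + b + c + d

# Point estimate of the odds ratio.
odds_ratio = (a * d) / (b * c)

# Chi-square statistic for the 2x2 table (product of the four margins below).
x2 = n * (a * d - b * c) ** 2 / ((a + c) * (b + d) * (a + b) * (c + d))

# 95% interval: OR raised to the power 1 -/+ z / sqrt(X^2), with z = 1.96.
z = 1.96
lower = odds_ratio ** (1 - z / sqrt(x2))
upper = odds_ratio ** (1 + z / sqrt(x2))

print(f"OR = {odds_ratio:.2f}, X^2 = {x2:.2f}")
print(f"95% CI: ({lower:.2f}, {upper:.2f})")
```

Because the interval is built on the exponent, it is not symmetric around 9.62, which is expected for a ratio measure.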


Interpretation of Example 6 data
• The 95% confidence interval (7.12, 13.00) means that we are 95% confident that the population odds ratio is somewhere between 7.12 and 13.00.
• Since the interval does not contain 1, and in fact contains only values larger than one, we conclude that, in the population, obese children (cases) are more likely than non-obese children (non-cases) to have had a mother who smoked throughout the pregnancy.

Interpretation of ODDS RATIO
• The sample odds ratio provides an estimate of the relative risk in the population in the case of a rare disease.
• The odds ratio can assume values between 0 and ∞.
• A value of 1 indicates no association between risk factor and disease status.
• A value greater than one indicates increased odds of having the disease among subjects in whom the risk factor is present.

Significance test about proportion
• Inference on "preference"
• It involves Bernoulli trials with two possible outcomes: success & failure.
  Z-test : large sample
  Binomial test : small sample (not common in practice)
• Example : prevalence of diabetes in a country
          : prevalence of hypertension in a city

Z-test & t-test
• Assumptions of a test about a proportion (large sample):
(a) the variable is categorical
(b) the data production employed randomization
(c) the sample size is sufficiently large & the sampling distribution of the sample proportion is approximately normal. [np ≥ 15 and n(1-p) ≥ 15]

Significance test about proportion
• Test statistic :
  (sample proportion – null hypothesis proportion) / (standard error when null hypothesis is true)
• po is the null hypothesis proportion, the one we define as success. (1-po) is the failure.

Single population proportion
• Example 3:
A company claims that less than 20% of its customers consume another brand of food supplement on a regular basis. A random sample of 100 customers yielded 18 who did in fact consume another brand of food supplement on a regular basis. Do these sample results support the company's claim? (Use a level of significance of 0.05.)


Chi-square test
• SPSS printout :

brand
              Observed N   Expected N   Residual
other brand   18           20.0         -2.0
same brand    82           80.0         2.0
Total         100

Test Statistics
              brand
Chi-Square    .250a
df            1
Asymp. Sig.   .617
a. 0 cells (.0%) have expected frequencies less than 5. The minimum expected cell frequency is 20.0.

Chi-square test
• Ho : p = 0.2 (20%)
  Ha : p < 0.2
• p-value = 0.617
• Conclusion:
p > α, do not reject Ho. There is not enough evidence to conclude that less than 20% of the customers take another brand of food supplement. Thus the result does not support the company's claim.
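The link between the z-test for a proportion and the chi-square value in the printout for Example 3 can be illustrated with the standard library. This sketch assumes the usual standard error under Ho, sqrt(po(1-po)/n), and relies on the fact that for one degree of freedom the chi-square statistic equals z². The variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

# Example 3: 18 of 100 customers use another brand; Ho: p = 0.2.
x, n, p0 = 18, 100, 0.2
p_hat = x / n

# Standard error computed under the null hypothesis.
se0 = sqrt(p0 * (1 - p0) / n)
z = (p_hat - p0) / se0

std_normal = NormalDist()                    # mean 0, standard deviation 1
p_two_sided = 2 * std_normal.cdf(-abs(z))    # matches the chi-square p-value
p_lower_tail = std_normal.cdf(z)             # one-sided, for Ha: p < 0.2

print(f"z = {z:.2f}, z^2 = {z * z:.3f}")
print(f"two-sided p = {p_two_sided:.3f}, lower-tail p = {p_lower_tail:.3f}")
```

The two-sided value reproduces the printed 0.617. For the one-sided alternative Ha : p < 0.2 the lower-tail probability is about 0.309, and both are above α = 0.05, so the conclusion is unchanged.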

Single population proportion
• Example 4:
The 49 students in a class at the University of Florida made blinded evaluations of pairs of cola drinks. For the 49 comparisons of Coke and Pepsi, Coke was preferred 29 times. In the population that this sample represents, is this strong evidence that a majority prefers one of the drinks?
Refer to the SPSS printout & MINITAB printout.

Chi-square test
• SPSS printout :

Cola
         Observed N   Expected N   Residual
Coke     29           24.5         4.5
Pepsi    20           24.5         -4.5
Total    49

Test Statistics
              Cola
Chi-Square    1.653a
df            1
Asymp. Sig.   .199
a. 0 cells (.0%) have expected frequencies less than 5. The minimum expected cell frequency is 24.5.

Chi-square test
• MINITAB printout :

Test and CI for One Proportion
Test of p = 0.5 vs p not = 0.5

X    N    Sample p   95% CI                 Z-Value   P-Value
29   49   0.591837   (0.454221, 0.729452)   1.29      0.199
Using the normal approximation.

Chi-square test
• Ho : p = 0.5 (50%)
• Ha : p > 0.5
• p-value = 0.199
• Conclusion:
p > α, do not reject Ho. There is not enough evidence to conclude that more than 50% of the students prefer one drink. Thus the result does not support the claim that a majority of the students prefers one of the drinks.
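The figures in the one-proportion printout for Example 4 can be approximated by hand. The sketch below assumes the interval is the normal-approximation interval built from the sample proportion's own standard error, while the z-value uses the standard error under Ho; that combination reproduces the printed numbers, but it is an inference from the output rather than documented MINITAB behaviour.

```python
from math import sqrt
from statistics import NormalDist

# Example 4: Coke preferred in 29 of 49 comparisons; Ho: p = 0.5.
x, n, p0 = 29, 49, 0.5
p_hat = x / n
std_normal = NormalDist()

# Test statistic: standard error evaluated at the null value p0.
z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)
p_two_sided = 2 * std_normal.cdf(-abs(z))

# Confidence interval: standard error evaluated at the sample proportion.
z_crit = std_normal.inv_cdf(0.975)           # about 1.96
margin = z_crit * sqrt(p_hat * (1 - p_hat) / n)
ci = (p_hat - margin, p_hat + margin)

print(f"sample p = {p_hat:.6f}, z = {z:.2f}, p-value = {p_two_sided:.3f}")
print(f"95% CI: ({ci[0]:.6f}, {ci[1]:.6f})")
```

The interval includes 0.5, which is consistent with the decision not to reject Ho.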


Comparing two population proportions
• The categorical variables are tabulated in a contingency table (a data summary for categorical variables).
• p1-p2 = 0 (two population proportions are equal)
• p1-p2 > 0 (p1 > p2)
• p1-p2 < 0 (p1 < p2)
• p1-p2 ≠ 0 (p1 ≠ p2)

Fisher's-exact test
• Test statistic :
  (Estimate – null hypothesis value) / standard error
• do = difference between parameters

Fisher's-exact test
• Example 5:
The products produced by two machines were examined. Do these results imply a difference in the reliability of these two machines?

Machine   Defective   Acceptable
A         15          210
B         20          190

Fisher's-exact test
• MINITAB printout :

Comparing two population proportions
• Ho : p1-p2 = 0
  Ha : p1-p2 ≠ 0
• p-value = 0.235
• CI = -0.09 < p1-p2 < 0.02
• Conclusion:
p > α / CI contains zero, do not reject Ho. There is insufficient evidence that there is a difference in the reliability of these two machines.

Categorical data > 2 groups
Unordered categories – Nominal
- Chi-squared test for association
Ordered categories – Ordinal
- Chi-squared test for trend


Q &A
Thank you
