
Stat 110, Lecture 12

Sampling Distributions, Estimation, and Hypothesis Testing (II)

[email protected]
Statistics

[diagram: a taxonomy of statistics by amount of data]
• No Data → Probability
• Some Data → Inferential Statistics: sampling distributions, hypothesis testing, estimation
• Way Too Much Data → Descriptive Statistics


topics

• comparing proportions
• paired vs two-sample (again)
• sample size calculations
• hypothesis testing
• power transformations
• the other two-sample test
• the k-sample problem



            even column   odd column   total
yields          40            28          68
no yield        60            72         132
total          100           100         200

odd and even columns have different test heads

Three measures
1. risk reduction: 40/100 – 28/100 = 0.12
2. relative risk (of no yield): (72/100)/(60/100) = 1.20
3. odds ratio: (40/60)/(28/72) = 1.71
Terms for Comparing Two Probabilities
Risk reduction:
• Rate1 – Rate2 (good≡bad)
Relative risk:
• Rate1 / Rate2 (bad, usually)
Odds:
• Rate / (1– Rate) (good≡bad)
Odds ratio:
• ratio of two odds (good≡bad)
• [Rate1/(1– Rate1)]/[Rate2/(1– Rate2)]
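As a quick illustration, here is a minimal Python sketch of the odds and odds-ratio definitions above, applied to the yields table; the helper names are mine, not from the course:

    def odds(rate):
        # odds corresponding to a rate (probability)
        return rate / (1.0 - rate)

    def odds_ratio(rate1, rate2):
        return odds(rate1) / odds(rate2)

    print(odds_ratio(40/100, 28/100))   # 1.71, matching the yields example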
… comparing probabilities

Risk reduction (delta)
  Pluses: simplest; aids cost-benefit analyses; smaller sample sizes
  Minuses: awkward to model; additive model less physical

Relative risk
  Pluses: more physical for modeling; … extrapolation
  Minuses: no symmetry between Pr{A} and Pr{not A}; requires prospective data; larger sample sizes

Odds ratio
  Pluses: easy to model; can use with retrospective datasets, rare events; simple formula for std err
  Minuses: less physical than relative risk; harder to explain


… comparing probabilities

                 New    Old
      #fail       a      b
      #pass       c      d
      total      nN     nO

notation: nN = a + c, nO = b + d, pN = a/nN, pO = b/nO

Which Index          Estimate                                   Standard Error^2
Risk reduction       pN – pO                                    pN(1–pN)/nN + pO(1–pO)/nO
Loge relative risk   loge(pN/pO)                                (1–pN)/a + (1–pO)/b
Loge odds ratio      loge(ad/bc) = loge[pN(1–pO)/((1–pN)pO)]    1/a + 1/b + 1/c + 1/d


Example confidence intervals

                  point      standard   lower conf   upper conf
                  estimate   error      limit        limit
risk reduction    0.12       0.066      –0.013       0.253
relative risk     1.2                    0.977       1.47
  (log RR)        0.182      0.103      –0.023       0.388
odds ratio        1.71                   0.937       3.14
  (log OR)        0.539      0.302      –0.065       1.14
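A Python sketch reproducing this table from the standard-error formulas above (assuming only the math module; intervals are ±2 standard errors, with log-scale intervals exponentiated at the end):

    from math import sqrt, log, exp

    # 2x2 yields table: even column 40 yield / 60 no yield, odd column 28 / 72
    a, c, b, d = 40, 60, 28, 72
    nN, nO = a + c, b + d
    pN, pO = a / nN, b / nO

    # risk reduction: estimate, std err, interval
    rr = pN - pO
    se = sqrt(pN*(1 - pN)/nN + pO*(1 - pO)/nO)
    print(rr, rr - 2*se, rr + 2*se)                      # 0.12 (-0.013, 0.253)

    # log relative risk of "no yield" (odd vs even), as on the earlier slide
    lrr = log((d/nO) / (c/nN))
    se = sqrt((1 - d/nO)/d + (1 - c/nN)/c)
    print(exp(lrr), exp(lrr - 2*se), exp(lrr + 2*se))    # 1.20 (0.977, 1.47)

    # log odds ratio
    lor = log(a*d / (c*b))
    se = sqrt(1/a + 1/b + 1/c + 1/d)
    print(exp(lor), exp(lor - 2*se), exp(lor + 2*se))    # 1.71 (0.937, 3.14)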


odds ratios and “recommend indices”

notation:
Pr{“Y”(0)} = this year's RI = 1 – Pr{“N”(0)}
Pr{“Y”(-1)} = last year's RI = 1 – Pr{“N”(-1)}

Pr{“Y”(0) | “Y”(-1)} = this year's retention rate
Pr{“Y”(0) | “N”(-1)} = this year's re-enlistment rate
Pr{“N”(0) | “Y”(-1)} = this year's de-enlistment rate
                     = 1 – Pr{“Y”(0) | “Y”(-1)} = 1 – retention rate

conversion rates: Pr{“Y”(0) | “N”(-1)} and Pr{“N”(0) | “Y”(-1)}

equilibrium: RI/(1 – RI) = Pr{“Y”(0) | “N”(-1)} / Pr{“N”(0) | “Y”(-1)}

[diagram: flows between the “yes” and “no” states via attrition and re-enlistment]
Schredder-Schredder chess match

                AMD=W (Intel=L)   Draw   AMD=L (Intel=W)
AMD white             16           44          11         (Intel black)
AMD black             11           40          19         (Intel white)

Ignoring draws, White odds = 35:22. This ignores AMD-vs-Intel effects
and any sample size imbalances.

AMD white odds = 16:11
AMD black odds = 11:19, i.e. Intel white odds = 19:11
AMD odds (both colors) = 27:30

White odds = 35:22, vs log(AMD white odds) + log(Intel white odds),
which implicitly does so adjust.
confidence intervals for odds

95% confidence interval for White odds:
White odds = 35:22 = 1.59
ln(35/22) = 0.464
1/35 = 0.0286, 1/22 = 0.0455
s.e. = [0.0286 + 0.0455]^1/2 = 0.272
0.464 ± 2×0.272 = (–0.080, 1.008) as log-odds
(0.923, 2.741) as odds

Adjusted: AMD white odds × Intel white odds = (16/11)×(19/11)
log odds “ratio” = 0.921
s.e. = [1/16 + 1/11 + 1/19 + 1/11]^1/2 = [0.297]^1/2 = 0.5449
0.921 ± 2×0.5449 = (–0.169, 2.011) as log odds
or (0.845, 7.47) as odds
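The same log-odds recipe in Python, a sketch under the slide's ±2 s.e. convention:

    from math import log, exp, sqrt

    w, l = 35, 22                               # White's wins and losses, draws ignored
    log_odds = log(w / l)                       # 0.464
    se = sqrt(1/w + 1/l)                        # 0.272
    lo, hi = log_odds - 2*se, log_odds + 2*se   # (-0.080, 1.008) as log-odds
    print(exp(lo), exp(hi))                     # (0.923, 2.74) as odds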


sample sizes from confidence intervals

• old and new processes; any difference in yield?

Suppose we know σo = σ+ = σ.

• standard error for the difference in means
  = σd(n) = σ(1/no + 1/n+)^1/2
  = σ(2/n)^1/2, where no = n+ = n
• with approx 95% confidence interval d ± 2σd(n)
sample sizes (solution)

fix the length of the confidence interval = Δ, solve for n:

Δ = (d + 2σd(n)) – (d – 2σd(n)) = 4σd(n) = 4σ(2/n)^1/2, or
Δ^2 = 16σ^2(2/n) = 32σ^2/n
n = 32σ^2/Δ^2

e.g. Δ = 2σ: n = 32σ^2/(2σ)^2 = 8 per group
e.g. Δ = σ: n = 32 per group


sample sizes (one-sample version)

When: process monitoring, paired data. Suppose we know σd.

• standard error for the mean difference = σd/n^1/2
• with approx 95% confidence interval d ± 2σd/n^1/2

fix length of the confidence interval = Δ, solve for n:

• Δ = (d + 2σd/√n) – (d – 2σd/√n) = 4σd/√n, or
• n = 16σd^2/Δ^2

e.g. Δ = 2σd: n = 16σd^2/(2σd)^2 = 4 pairs
e.g. Δ = σd: n = 16σd^2/σd^2 = 16 pairs
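Both sample-size rules as a Python sketch; the function names are illustrative, not from the course:

    def n_per_group_two_sample(sigma, Delta):
        # CI length Delta = 4*sigma*sqrt(2/n)  =>  n = 32*sigma^2/Delta^2
        return 32 * sigma**2 / Delta**2

    def n_pairs_one_sample(sigma_d, Delta):
        # CI length Delta = 4*sigma_d/sqrt(n)  =>  n = 16*sigma_d^2/Delta^2
        return 16 * sigma_d**2 / Delta**2

    print(n_per_group_two_sample(1.0, 2.0))   # 8 per group when Delta = 2*sigma
    print(n_pairs_one_sample(1.0, 1.0))       # 16 pairs when Delta = sigma_d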
The price of two-sample testing

1. Assuming σd is comparable to σx, the two-sample problem requires
   twice the number of observations. This is because the value of its
   control group is random.
2. In addition, the cost of a pair is usually less than that of two
   unrelated observations.
3. Finally, when pairing is feasible, it is reasonable to expect
   σd < σx. When pairing is at random, σd = σx√2, and the one-sample
   test is burdened by the loss of degrees of freedom.
Hypothesis testing

1. null hypothesis, whereby the population parameter is
   “uninteresting, unremarkable, default, null = zero.”
2. alternative hypothesis, which is implicitly accepted if the null
   hypothesis is rejected. The alternative hypothesis is usually not unique.
3. test statistic computed from the observed sample.
4. rejection region, which defines values of the test statistic that
   would reject the null hypothesis.


e.g. one-sample mean, σ known

Null hypothesis Ho: Δ = 0.
Alternative HΔ: Δ = ΔA.
Test statistic z = d/(σ/n^1/2) = d×n^1/2/σ
Rejection region: z > 1.645

[figure: standard normal z density]
…p-value version

Null hypothesis Ho: Δ = 0.
Alternative HΔ: Δ = ΔA.
Test statistic z = d/(σ/n^1/2) = d×n^1/2/σ

One-sided p-value:
p1-value = P(zobs > zpv | Ho)
         = P(d/(σ/n^1/2) > zpv | Δ=0)
         = 1 – Φ(d/(σ/n^1/2))
         = Φ(–d×n^1/2/σ)

[figure: standard normal z density]
…two-tailed p-value

Null hypothesis Ho: Δ = 0.
Alternative HΔ: Δ = ΔA.
Test statistic z = d/(σ/n^1/2) = d×n^1/2/σ

Two-sided p-value:
p2-value = P(|zobs| > zpv | Ho)
         = P(|d|/(σ/n^1/2) > zpv | Δ=0)
         = 1 – [Φ(|d|/(σ/n^1/2)) – Φ(–|d|/(σ/n^1/2))]
         = 2×Φ(–|d|×n^1/2/σ)
         = 2×p1-value

[figure: standard normal z density]
one-sample mean, σ unknown, p-value

Null hypothesis Ho: Δ = 0.
Alternative HΔ: Δ = ΔA.
Test statistic t = d/(s/n^1/2) = d×n^1/2/s
Rejection region: |t| > t(df=n–1, α/2)

Two-sided p-value:
p2-value = P(|tobs| > tpv | Ho)
         = P(|d|/(s/n^1/2) > tpv | Δ=0)
         = 2×T(–|d|×n^1/2/s, df=n–1)

[figure: t(df=4) density]


Example: overetch yield experiment

lot   split             01-12   13-24   delta
1     clearout 01-12      75      68       7
2     clearout 01-12      45      61     –16
3     clearout 01-12      81      79       2
4     clearout 01-12      78      87      –9
5     clearout 01-12      57      77     –20

mean = –7.20               t = d√n/s
stdev = 11.52                = –7.2×√5/11.52 = –1.40
t(df=4, 0.975) = 2.776     p-value = 0.117 (one tail)
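A quick check of this paired analysis in Python (assuming numpy and scipy are available; note scipy reports the two-sided p, twice the 0.117 above):

    import numpy as np
    from scipy import stats

    deltas = np.array([7, -16, 2, -9, -20])   # (01-12) minus (13-24), by lot
    t, p = stats.ttest_1samp(deltas, 0.0)
    print(t, p)                               # t = -1.40, two-sided p = 0.23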
Wafer position sequence effects

How to adjust for this effect in the 1st-12 vs 2nd-12 split experiment?

[figure: yield vs wafer processing sequence (wafers 1–24), showing an
early effect, an even-vs-odd zig-zag, and a wafer-position effect]
21 concurrent unsplit lots

lot   split       01-12   13-24      lot   split       01-12   13-24
6     no splits    62.5    58.8      17    no splits    66.7    72.9
7     no splits    50.5    30.6      18    no splits    75.4    68.6
8     no splits    72.5    71.6      19    no splits    78.3    81.4
9     no splits    86.0    73.8      20    no splits    75.5    77.8
10    no splits    68.6    59.3      21    no splits    84.0    73.6
11    no splits    76.6    78.2      22    no splits    79.6    78.5
12    no splits    55.6    44.3      23    no splits    77.9    74.7
13    no splits    64.6    71.3      24    no splits    64.8    61.7
14    no splits    73.5    77.5      25    no splits    69.6    70.2
15    no splits    81.3    77.7      26    no splits    70.7    71.4
16    no splits    66.0    53.1
Adjusting for the wafer position bias

split             n    mean    stdev
clearout 01-12    5    –7.20   11.52
no splits        21     3.49    7.04

df = 24
diff means = –10.69
pooled stdev = 7.96

t = –10.69 / [7.96×(1/5 + 1/21)^1/2]
  = –2.70 with df = 24
two-sided p-value = 0.0126

[figure: dot plots of the deltas for the clearout 01-12 and no-splits lots]
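The pooled two-sample test computed from the summary statistics, as a scipy sketch:

    from scipy import stats

    # clearout 01-12 lots vs concurrent unsplit lots
    t, p = stats.ttest_ind_from_stats(-7.20, 11.52, 5,
                                      3.49, 7.04, 21, equal_var=True)
    print(t, p)   # t = -2.70 with df = 24, two-sided p = 0.013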
Two errors

Type I error is the probability, given that the null hypothesis is
true, that the statistical procedure rejects the null hypothesis.

Type II error is the probability, given that the alternative
hypothesis is true, of not rejecting the null hypothesis. Type II
error is a strong function of the particular alternative.

                      truly delta=0     truly delta≠0
we say "delta=0"      1–α               β = type II
we say "delta≠0"      α = type I        1–β = power
IF … THEN power(Δ)

IF Δ=0, THEN the probability of a significant result = α.

IF Δ=ΔA, THEN the probability of a significant result = 1–β(ΔA).

[figure: power(Δ) curve rising from α at Δ=0 toward 1 as Δ grows]


one-sample mean… power(ΔA)

Null Ho: Δ=0. Alt HΔ: Δ=ΔA.
statistic z = d/(σ/n^1/2) = d×n^1/2/σ
reject z > 1.645
P(z > 1.645 | Δ=0) = α = 0.05

power(ΔA) = 1–β
  = P(z > zα | Δ=ΔA) = P(d×n^1/2/σ > zα | Δ=ΔA)
  = P((d – ΔA + ΔA)×n^1/2/σ > zα | Δ=ΔA)
  = P((d – ΔA)×n^1/2/σ + ΔA×n^1/2/σ > zα | Δ=ΔA)
  = P(z + ΔA×n^1/2/σ > zα) = P(z > zα – ΔA×n^1/2/σ)
  = Φ(–zα + ΔA×n^1/2/σ)

[figure: power(ΔA) curve]
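The last line is easy to compute directly; a minimal sketch (assuming scipy) of the one-sided power curve:

    from scipy.stats import norm

    def power(delta_A, sigma, n, alpha=0.05):
        # one-sided z test: power = Phi(-z_alpha + delta_A*sqrt(n)/sigma)
        z_alpha = norm.ppf(1 - alpha)           # 1.645 for alpha = 0.05
        return norm.cdf(-z_alpha + delta_A * n**0.5 / sigma)

    print(power(0.0, 1.0, 25))   # 0.05 = alpha when Delta = 0
    print(power(0.5, 1.0, 25))   # about 0.80 at Delta_A = sigma/2, n = 25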
connection to hypothesis tests:
• When the confidence interval contains zero, then
the conventional null hypothesis that the
population parameter is zero cannot be rejected
(at the given confidence level=1–significance).
• Confidence intervals consist of those null
hypotheses that cannot be rejected (at the given
confidence level=1–significance).
• Confidence intervals have sufficient information
to determine whether the null hypothesis is to be
rejected.



one-sample mean…

power(ΔA) = 1–β = P(z > zα – ΔA×n^1/2/σ)

so
z(1–β) = zα – ΔA×n^1/2/σ, or
–zβ = zα – ΔA×n^1/2/σ, or
ΔA×n^1/2/σ = zα + zβ, or
n^1/2 = (zα + zβ)×σ/ΔA
n = (zα + zβ)^2×σ^2/ΔA^2

[figure: normal densities under Δ=0 (with cutoff zα), Δ√n/σ = 1, and Δ√n/σ = 2]


Two-sample version:

σn^2 = σ^2(1/n + 1/n) = 2σ^2/n, so

n = 2[zα/2 + zβ]^2 σ^2/(μ1–μ0)^2

Guenther’s refinement:

n = 2[zα/2 + zβ]^2 σ^2/(μ1–μ0)^2 + (zα/2)^2/4


Comparison of Sample Size Calculations (Two-Sample Problem)

                      power=0.5                       power=0.9
alpha   Δ/σ     n*     Guenther   non-central    n*      Guenther   non-central
0.05    0.25   122.93   123.89     123.88       336.24    337.2      337.2
0.05    0.5     30.73    31.69      31.71        84.06     85.02      85.03
0.05    0.75    13.66    14.62      14.67        37.06     38.32      38.34
0.05    1        7.68     8.64       8.73        21.01     21.98      22.02
0.05    1.25     4.92     5.88       6.02        13.45     14.41      14.48
0.05    1.5      3.41     4.37       4.57         9.34     10.3       10.4
0.05    2        1.92     2.88       3.17         5.25      6.21       6.39
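A sketch (assuming scipy) of the n* and Guenther columns; e.g. Δ/σ = 1 at power 0.9 reproduces 21.01 and 21.98:

    from scipy.stats import norm

    def n_two_sample(delta_over_sigma, alpha=0.05, power=0.9, guenther=False):
        # per-group sample size for the two-sided two-sample problem
        za, zb = norm.ppf(1 - alpha/2), norm.ppf(power)
        n = 2 * (za + zb)**2 / delta_over_sigma**2
        if guenther:
            n += za**2 / 4                    # Guenther's refinement
        return n

    print(n_two_sample(1.0))                  # 21.01
    print(n_two_sample(1.0, guenther=True))   # 21.98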


Examples:

Yield:        standard process   100 dpw
              “new” process      110 dpw?
              Δ = new – std       10 dpw
              σ                   25 dpw

Reliability:  standard process    30
              “new” process       35, 40
              Δ = new – std        5, 10
              σ                    6


paired data (binary):

              Corporation   Named
              of Interest   Competitor   count
yeasayers =>      Yes           Yes        798
                  Yes           No         406
                  No            Yes         95
naysayers =>      No            No         220
                                total     1519

The key information is the patterns (yes,no) & (no,yes).

We proceed conditionally: CoI vs NC odds = 406:95 = 4.27
Log odds 95% CI = 1.45 ± 2×0.114 = (1.22, 1.68)
Odds 95% CI = (3.40, 5.36)
95% CI for Yes fraction = Odds/(1+Odds) = (0.773, 0.843)
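The conditional (discordant-pairs) calculation in Python, reproducing the intervals above:

    from math import log, exp, sqrt

    yes_no, no_yes = 406, 95                   # the discordant pairs
    log_odds = log(yes_no / no_yes)            # 1.45
    se = sqrt(1/yes_no + 1/no_yes)             # 0.114
    lo, hi = exp(log_odds - 2*se), exp(log_odds + 2*se)
    print(lo, hi)                              # odds CI (3.40, 5.36)
    print(lo/(1 + lo), hi/(1 + hi))            # Yes-fraction CI (0.773, 0.843)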
One-sample variance problem

Null hypothesis Ho: σ = σo
Alternative HA: σ > σo
Test statistic νs^2/σo^2, rejection region νs^2/σo^2 > χ^2(df=ν, 0.95)

one-sided p-value:
p1-value = P(χ^2(df=ν) > νs^2/σo^2 | Ho)
         = 1 – χ^2CDF(df=ν, νs^2/σo^2)

[figure: chi-square density with upper-tail rejection region]
Two-sample variance problem

Null hypothesis Ho: σ1 = σ2
Alternative HA: σ1 ≠ σ2
Test statistic s1^2/s2^2, rejection region
  s1^2/s2^2 < F(ν1, ν2, 0.025) or s1^2/s2^2 > F(ν1, ν2, 0.975)

two-sided p-value (label so that s1 > s2):
p2-value = 2×P(F(ν1, ν2) > s1^2/s2^2 | Ho)
         = 2×[1 – FCDF(ν1, ν2, s1^2/s2^2)]
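Both variance tests as a scipy sketch (the function names are mine); e.g. the clearout stdevs 11.52 vs 7.04 give a two-sided p of roughly 0.12:

    from scipy.stats import chi2, f

    def var_test_one_sample(s2, sigma0_sq, nu):
        # Ho: sigma = sigma0 vs HA: sigma > sigma0; one-sided p-value
        return chi2.sf(nu * s2 / sigma0_sq, df=nu)

    def var_test_two_sample(s1_sq, s2_sq, nu1, nu2):
        # label so that s1 > s2; two-sided p-value
        return 2 * f.sf(s1_sq / s2_sq, nu1, nu2)

    print(var_test_two_sample(11.52**2, 7.04**2, 4, 20))   # about 0.12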
CPU times (reprise)

[figure: normal quantile plots of the CPU times on three scales —
linear scale, square roots, and log (base 2)]
variability tracking with mean

raw data:
1.46  0.58  4.31  1.02
1.30  8.24  3.51  6.87
5.92  1.86  1.41  1.70
0.17  2.92  0.91  0.43
1.43  1.44  4.49  4.21
2.02  1.65  1.40  .40

group   mean   stdev
1       2.89    2.62
2       3.56    4.10
3       3.08    1.50
4       3.20    3.20
5       1.21    0.95
6       2.00    0.80
7       2.27    1.94
8       2.01    .96

[figure: dot plots of the raw data, scale 0 to 10]
a few power transformations

[figure: group stdev vs group mean after each transformation —
stdev (linear) vs mean (linear), stdev (sqrt) vs mean (sqrt),
and stdev (log) vs mean (log)]
Why power transformations?

Theoretical reasons
• align physical relationships to (linear) statistical models.
Empirical reasons
• to reduce correlations of group variances with group means.
• to reduce the influence of large values without making them into outliers.
• to reduce the skewness in right-skewed data (λ<1).
• to resolve an ambiguity in scale (e.g. a rate vs its reciprocal).
Preference order:
• λ = 0 (logs), 1/2 (square roots), –1 (inverses),
  1/3 (cube roots ~ logs with zeros)
Box-Cox transformations

What are they?
Response y → y^λ
Note: y → (y^λ – 1)/λ equals 0 at y = 1, with slope 1 there.

“poor man’s” Box-Cox procedure
1. For each group, calculate the mean and the standard deviation.
2. Plot log(stdev) vs log(mean).
3. Estimate the slope, say r.
4. The recommended power for transforming the raw data is 1 – r
   (suitably rounded).
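A minimal numpy sketch of this poor man's procedure (the function name is illustrative):

    import numpy as np

    def poor_mans_box_cox(groups):
        # groups: a list of 1-D arrays of raw data, one per group
        means = np.array([g.mean() for g in groups])
        stdevs = np.array([g.std(ddof=1) for g in groups])
        # slope r of log(stdev) vs log(mean)
        r = np.polyfit(np.log(means), np.log(stdevs), 1)[0]
        return 1 - r        # recommended power (round it in practice)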


two examples

theoretical:
Suppose the standard deviation is proportional to the mean:
σ(μ) = μ×σo
Box-Cox plots log(σ(μ)) vs log(μ):
log(σ(μ)) = 1×log(μ) + log(σo),
so the slope r is 1, 1–r = 0, and we transform by taking logs of the raw data.

empirical Box-Cox:
[figure: log stdev vs log mean, fitted slope = 1.24]
1–r = –0.24, which suggests logs or reciprocal square roots.
linear vs log: plots of transformed data

[figure: side-by-side dot plots of Ra226 by group (A–H),
on the linear scale and as log2 Ra226]
Mis-calibration:

[figure: thickness vs time; target thickness β×to vs actual thickness b×to]

Target thickness (mean) is β×to; the thickness deviation (β – b)×to is
proportional to the mean.

So multiplicative relationships tend to promote constant coefficients
of variation, and log transforms.
sums of small positive errors

actual thickness = Σ bi Δti
with variance = Σ Δti^2 Var(bi)
             = (Δt Var(b)) Σi Δti
             = σb^2 Δt to

so the variance is proportional to the mean, suggesting square roots.

Examples:
Poisson: mean = λ, variance = λ;
sums of independent Poissons are Poisson.
Chi-square (gamma): mean = ν, variance = 2ν;
sums of independent chi-squares are still chi-squares.
Why Box-Cox works:

Background theory:
g(X) ≈ g(μ) + g'(μ)(X – μ), or
g(X) – g(μ) ≈ g'(μ)(X – μ), so
E(g(X) – g(μ))^2 ≈ g'(μ)^2 E(X – μ)^2, so
Var(g(X)) ≈ g'(μ)^2 Var(X)

Setup:
log(σ(μ)) = k + r log(μ), or
log(σ^2(μ)) = 2k + 2r log(μ), or
σ^2(μ) = c μ^2r

Suppose g(x) = x^(1–r); then
g'(x) = (1–r) x^–r, or
g'(x)^2 = (1–r)^2 x^–2r, so
Var(g(X)) ≈ g'(μ)^2 Var(X) = (1–r)^2 μ^–2r × c μ^2r
≈ constant with respect to μ
The other two-sample t-test

In general, for independent observations from two populations,
E(X̄1 – X̄2) = μ1 – μ2,
Var(X̄1 – X̄2) = σ1^2/n1 + σ2^2/n2

So a natural two-sample t-statistic is

   (x̄1 – x̄2) / [s1^2/n1 + s2^2/n2]^1/2

not the “classical”

   (x̄1 – x̄2) / (sp [1/n1 + 1/n2]^1/2),

where sp^2 = [(n1–1)s1^2 + (n2–1)s2^2] / [(n1–1) + (n2–1)]
issues

The two t-statistics differ when n1 ≠ n2 or s1 ≠ s2.

In larger samples the differences among s1, s2, sp can be worrisome,
but M&S distinguish between them by whether n1 ≠ n2 or n1 = n2.

For the “unequal variances” procedure, there is no clear theory for
its sampling distribution… in particular we need to figure out the
associated degrees of freedom.


Degrees of freedom for unequal variances t

Lemma: Let s be a standard deviation from independent normals with
the same mean and variance σ^2, and ν degrees of freedom. Then
Var(s^2) = 2σ^4/ν.

So Var(s1^2/n1 + s2^2/n2)
  = 2[σ1^2/n1]^2/ν1 + 2[σ2^2/n2]^2/ν2, (set equal to)
  = 2σ^4/ν, where σ = [σ1^2/n1 + σ2^2/n2]^1/2.

Of course, we don’t know σ1 or σ2, so we “plug in” s1, s2:
[s1^2/n1 + s2^2/n2]^2/ν = (s1^2/n1)^2/ν1 + (s2^2/n2)^2/ν2,
from which we solve for ν.
Clearout 5 split + 21 unsplit

                                split      unsplit      sum
ng                                5           21
mean = x̄g                       –7.2         3.49
standard deviation = sg         11.52         7.04
df = νg = ng – 1                  4           20
sg^2/ng                         26.5421       2.360076   28.902
[sg^2/ng]^2/νg                 176.1205       0.278498  176.399

calc’d df = 28.902^2 / 176.399 = 4.735


…continued

t = (–7.2 – 3.49) / [28.902]^1/2 = –1.988, with df = 4.735

p-value (one tail) = 0.059
p-value (two tail) = 0.118

With df = 24 (the conventional pooled df): p-values = 0.0292, 0.0583

[figure: t(ν=4.735) density with 0.059 in each tail]
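scipy's Welch option reproduces this directly from the summary statistics (a sketch):

    from scipy import stats

    t, p = stats.ttest_ind_from_stats(-7.20, 11.52, 5,
                                      3.49, 7.04, 21, equal_var=False)
    print(t, p)   # t = -1.99 with df = 4.74, two-sided p = 0.118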


comments

Different variances in different groups:
• This can often be of intrinsic interest, with groups showing
  smaller variation usually more desirable.
• When variation tracks with the mean level (higher usually going
  with higher), Box-Cox power transformations are suggested.
• When differences in means are still of interest (in spite of
  differences among groups in variation), the alternative t-test
  conservatively adjusts the degrees of freedom.
• Note df ≈ 5 vs 24, p-value = 0.059 vs 0.029.
• This low power is why M&S recommend Wilcoxon.
Metrology study

Monitor of the same linewidth (same spot) on 10 days, 5 readings each day.

date     mean   stdev
17-Sep   1051    2.2
22-Sep   1062    4.3
28-Sep   1063    3.1
28-Sep   1058    4.7
29-Sep   1057    3.6
30-Sep   1060    3.3
1-Oct    1062    4.1
2-Oct    1066    4.7
5-Oct    1061    4.1
6-Oct    1060    3.4


components of variance

day-to-day: σday
meas-to-meas (repeatability): σmeas
total variation (reproducibility): σtotal = [σday^2 + σmeas^2]^1/2

[diagram: a day effect d plus a measurement effect m combine into d+m]
Estimating these two variances

Pooled within-day standard deviation = 3.82 = RMS(2.2, 4.3, …, 3.4)

Standard deviation of the daily averages = 4.055
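A numpy sketch of these two estimates; the final step, splitting the day-to-day component out of the daily-average variance, uses the standard components-of-variance identity and goes one step beyond the slide:

    import numpy as np

    stdevs = np.array([2.2, 4.3, 3.1, 4.7, 3.6, 3.3, 4.1, 4.7, 4.1, 3.4])
    means  = np.array([1051, 1062, 1063, 1058, 1057, 1060, 1062, 1066, 1061, 1060])

    sigma_meas = np.sqrt(np.mean(stdevs**2))   # pooled within-day sd = 3.82
    sd_means = means.std(ddof=1)               # sd of daily averages = 4.055

    # assumption beyond the slide: Var(daily mean of 5) = sigma_day^2 + sigma_meas^2/5
    sigma_day = np.sqrt(sd_means**2 - sigma_meas**2 / 5)
    print(sigma_meas, sd_means, sigma_day)     # 3.82, 4.06, about 3.68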
