
Formulas 12 Eng

This document provides a summary of key concepts for statistics exams, including: - Notation for parameters, samples, distributions, and test statistics for confidence intervals and hypothesis tests on means, proportions, differences of means, and regression. - Pivotal quantities and distributions for conducting inference on various parameters. - Formulas for sample statistics like means, proportions, covariance, and correlation. - Methodology for simple and multiple linear regression including model formulation, estimates, residuals, and ANOVA tables.

Uploaded by

Jose

Statistics II

Fact sheet for exams


Confidence intervals and hypothesis testing in one and two populations.
Notation:

$\mu_X$ and $\sigma_X^2$: population mean and variance of a random variable/population $X$; $\bar{x}$ and $s_X^2$: sample mean and quasi-variance

$p_X$: population proportion if $X \sim \text{Bernoulli}(p_X)$; $\hat{p}_X$: sample proportion

$X_n$: simple random sample (SRS) of size $n$ from $X$

$(1-\alpha)$: confidence level; $\alpha$: significance level

$z_\alpha$: an upper $\alpha$ quantile of the $N(0,1)$ distribution; $t_{n-1;\alpha}$: an upper $\alpha$ quantile of a $t_{n-1}$ distribution
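| Parameter | Assumptions: SRS(s) and | Pivotal quantity and distribution |
|---|---|---|
| $\mu_X$ | Normal population, known variance | $\dfrac{\bar{X}-\mu_X}{\sigma_X/\sqrt{n}} \sim N(0,1)$ |
| $\mu_X$ | Normal population, unknown variance | $\dfrac{\bar{X}-\mu_X}{s_X/\sqrt{n}} \sim t_{n-1}$ |
| $\mu_X$ | Nonnormal population, large sample size | $\dfrac{\bar{X}-\mu_X}{s_X/\sqrt{n}} \sim$ approx. $N(0,1)$ |
| $p_X$ | Bernoulli population, large sample size | $\dfrac{\hat{p}_X-p_X}{\sqrt{\hat{p}_X(1-\hat{p}_X)/n}} \sim$ approx. $N(0,1)$ |
| $\sigma_X^2$ and $\sigma_X$ | Normal population | $\dfrac{(n-1)s_X^2}{\sigma_X^2} \sim \chi^2_{n-1}$ |
| $\mu_X-\mu_Y$ | Normal difference $D_i = X_i - Y_i$, matched pairs | $\dfrac{\bar{D}-\mu_D}{s_D/\sqrt{n}} \sim t_{n-1}$ |
| $\mu_X-\mu_Y$ | Normal populations, common variance | $\dfrac{\bar{X}-\bar{Y}-(\mu_X-\mu_Y)}{s_p\sqrt{\frac{1}{n_X}+\frac{1}{n_Y}}} \sim t_{n_X+n_Y-2}$, where $s_p^2 = \dfrac{(n_X-1)s_X^2+(n_Y-1)s_Y^2}{n_X+n_Y-2}$ |
| $\mu_X-\mu_Y$ | Normal populations, known variances | $\dfrac{\bar{X}-\bar{Y}-(\mu_X-\mu_Y)}{\sqrt{\frac{\sigma_X^2}{n_X}+\frac{\sigma_Y^2}{n_Y}}} \sim N(0,1)$ |
| $\mu_X-\mu_Y$ | Nonnormal populations, large sample sizes | $\dfrac{\bar{X}-\bar{Y}-(\mu_X-\mu_Y)}{\sqrt{\frac{s_X^2}{n_X}+\frac{s_Y^2}{n_Y}}} \sim$ approx. $N(0,1)$ |
| $p_X-p_Y$ | Bernoulli populations, large sample sizes | $\dfrac{\hat{p}_X-\hat{p}_Y-(p_X-p_Y)}{\sqrt{\hat{p}_0(1-\hat{p}_0)\left(\frac{1}{n_X}+\frac{1}{n_Y}\right)}} \sim$ approx. $N(0,1)$, where $\hat{p}_0 = \dfrac{n_X\hat{p}_X+n_Y\hat{p}_Y}{n_X+n_Y}$ |
| $\sigma_X^2/\sigma_Y^2$ and $\sigma_X/\sigma_Y$ | Normal populations | $\dfrac{s_X^2/\sigma_X^2}{s_Y^2/\sigma_Y^2} \sim F_{n_X-1,\,n_Y-1}$ |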
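As an illustration of how two of the pivotal quantities above are used in practice, the following minimal Python sketch computes a large-sample confidence interval for a proportion and the pooled quasi-variance $s_p^2$. The function names and the numbers are hypothetical and not part of the fact sheet; only the standard library is assumed.

```python
from math import sqrt
from statistics import NormalDist


def proportion_ci(successes, n, alpha=0.05):
    """Large-sample (1 - alpha) confidence interval for a Bernoulli proportion."""
    p_hat = successes / n
    # Upper alpha/2 quantile of N(0, 1), i.e. z_{alpha/2} in the sheet's notation.
    z = NormalDist().inv_cdf(1 - alpha / 2)
    half_width = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - half_width, p_hat + half_width


def pooled_variance(s2_x, n_x, s2_y, n_y):
    """Pooled quasi-variance s_p^2 for two normal populations with common variance."""
    return ((n_x - 1) * s2_x + (n_y - 1) * s2_y) / (n_x + n_y - 2)


# Made-up data: 40 successes in 100 trials; quasi-variances 4.0 and 6.0.
low, high = proportion_ci(successes=40, n=100)
print(low, high)
print(pooled_variance(4.0, 10, 6.0, 12))
```

The interval is obtained by solving the approximate $N(0,1)$ pivot for $p_X$, which is why the standard error uses $\hat{p}_X$ rather than the unknown $p_X$.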

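Example: To construct a $(1-\alpha)$ confidence interval for $\mu_X$ if $X \sim N(\mu_X, \sigma_X^2)$ with $\sigma_X^2$ unknown we have:

$$CI_{1-\alpha}(\mu_X) = \left(\bar{x} - t_{n-1;\alpha/2}\,\frac{s_x}{\sqrt{n}}\;;\;\; \bar{x} + t_{n-1;\alpha/2}\,\frac{s_x}{\sqrt{n}}\right)$$

To perform a lower-tail test $H_0: \mu_X \geq \mu_0$ versus $H_1: \mu_X < \mu_0$, the rejection region at significance level $\alpha$, $RR_\alpha$, is:

$$RR_\alpha = \left\{ t : \overbrace{\frac{\bar{x}-\mu_0}{s_x/\sqrt{n}}}^{t} < t_{n-1;1-\alpha} \right\}$$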
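The example above can be sketched in Python as follows. This is only an illustration: the function names and the sample are made up, and because the standard library has no $t$ quantile function, the critical values are passed in as arguments (here taken from a standard $t$ table with 4 degrees of freedom). The code uses the symmetry $t_{n-1;1-\alpha} = -t_{n-1;\alpha}$.

```python
from math import sqrt
from statistics import mean, stdev


def t_confidence_interval(sample, t_crit):
    """CI for the mean; t_crit is the upper alpha/2 quantile of t_{n-1}."""
    n = len(sample)
    x_bar = mean(sample)
    s_x = stdev(sample)  # square root of the quasi-variance (n - 1 denominator)
    half_width = t_crit * s_x / sqrt(n)
    return x_bar - half_width, x_bar + half_width


def lower_tail_t_test(sample, mu_0, t_crit_alpha):
    """Test H0: mu >= mu_0 vs H1: mu < mu_0; t_crit_alpha is the upper alpha quantile."""
    n = len(sample)
    t_stat = (mean(sample) - mu_0) / (stdev(sample) / sqrt(n))
    # Reject when t < t_{n-1;1-alpha}, which equals -t_{n-1;alpha} by symmetry.
    return t_stat, t_stat < -t_crit_alpha


# Hypothetical sample of size n = 5, so n - 1 = 4 degrees of freedom.
data = [9.8, 10.2, 10.1, 9.9, 10.0]
print(t_confidence_interval(data, t_crit=2.776))   # t_{4;0.025} from a t table
print(lower_tail_t_test(data, mu_0=10.2, t_crit_alpha=2.132))  # t_{4;0.05}
```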

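Sample covariance and correlation based on bivariate observations $(x_1, y_1), \ldots, (x_n, y_n)$:

$$\overbrace{\mathrm{cov}(x,y)}^{s_{xy}} = \frac{\sum_{i=1}^{n}(x_i-\bar{x})(y_i-\bar{y})}{n-1} = \frac{\sum_{i=1}^{n} x_i y_i - n\,\bar{x}\,\bar{y}}{n-1}$$

$$\overbrace{\mathrm{cor}(x,y)}^{r_{(x,y)}} = \frac{\mathrm{cov}(x,y)}{s_x s_y} = \frac{\sum_{i=1}^{n} x_i y_i - n\,\bar{x}\,\bar{y}}{\sqrt{\sum_{i=1}^{n} x_i^2 - n\,\bar{x}^2}\;\sqrt{\sum_{i=1}^{n} y_i^2 - n\,\bar{y}^2}}$$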
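A short Python sketch can confirm that the definition and the computational shortcut for the covariance agree. The function names and the five data points are hypothetical, chosen only to keep the arithmetic easy to follow.

```python
from math import sqrt


def sample_cov(x, y):
    """Sample covariance from the definition, with n - 1 in the denominator."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    return sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y)) / (n - 1)


def sample_cov_shortcut(x, y):
    """Same quantity via (sum of x_i * y_i - n * x_bar * y_bar) / (n - 1)."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    return (sum(a * b for a, b in zip(x, y)) - n * x_bar * y_bar) / (n - 1)


def sample_cor(x, y):
    """Sample correlation cov(x, y) / (s_x * s_y)."""
    # sample_cov(x, x) is the quasi-variance of x, so its square root is s_x.
    return sample_cov(x, y) / sqrt(sample_cov(x, x) * sample_cov(y, y))


# Made-up bivariate observations.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
print(sample_cov(xs, ys), sample_cov_shortcut(xs, ys), sample_cor(xs, ys))
```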

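Slope and intercept estimates in the simple linear regression model $y_i = \beta_0 + \beta_1 x_i + u_i$, where $u_i \sim$ iid $N(0, \sigma^2)$, to obtain the fitted line $\hat{y}_i = \hat\beta_0 + \hat\beta_1 x_i$:

$$\hat\beta_1 = \frac{\mathrm{cov}(x,y)}{s_x^2} = \frac{\sum_{i=1}^{n}(x_i-\bar{x})(y_i-\bar{y})}{\sum_{i=1}^{n}(x_i-\bar{x})^2} = \frac{\sum_{i=1}^{n} x_i y_i - n\,\bar{x}\,\bar{y}}{\sum_{i=1}^{n} x_i^2 - n\,\bar{x}^2}, \qquad \hat\beta_0 = \bar{y} - \hat\beta_1\,\bar{x}$$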
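The estimates can be computed directly from the shortcut sums, as in this minimal Python sketch. The function name `fit_line` and the data are hypothetical and serve only as an illustration.

```python
def fit_line(x, y):
    """Least-squares intercept and slope via the shortcut-sum formulas."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    numerator = sum(a * b for a, b in zip(x, y)) - n * x_bar * y_bar
    denominator = sum(a * a for a in x) - n * x_bar ** 2
    b1 = numerator / denominator   # slope estimate
    b0 = y_bar - b1 * x_bar        # intercept estimate
    return b0, b1


# Made-up observations: the sums give slope 6 / 10 and intercept 4 - 0.6 * 3.
b0, b1 = fit_line([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
print(b0, b1)
```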

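Pivotal quantities for $\beta_1$, $\beta_0$, $\sigma^2$, with residuals $e_i = y_i - \hat{y}_i$ and residual variance $s_R^2 = \dfrac{\sum_{i=1}^{n} e_i^2}{n-2}$:

$$\frac{\hat\beta_1 - \beta_1}{\sqrt{\dfrac{s_R^2}{(n-1)s_X^2}}} \sim t_{n-2}, \qquad \frac{\hat\beta_0 - \beta_0}{\sqrt{s_R^2\left(\dfrac{1}{n} + \dfrac{\bar{x}^2}{(n-1)s_X^2}\right)}} \sim t_{n-2}, \qquad \frac{(n-2)\,s_R^2}{\sigma^2} \sim \chi^2_{n-2}$$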
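The denominators of the two $t$ pivots are the standard errors of $\hat\beta_1$ and $\hat\beta_0$. The following Python sketch computes them together with $s_R^2$ and the $t$ statistic for $H_0: \beta_1 = 0$. The function name, the returned keys and the data are assumptions made for illustration.

```python
from math import sqrt


def slope_inference(x, y):
    """Residual variance, standard errors and t statistic for H0: beta_1 = 0."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxx = sum((a - x_bar) ** 2 for a in x)  # equals (n - 1) * s_X^2
    b1 = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y)) / sxx
    b0 = y_bar - b1 * x_bar
    residuals = [b - (b0 + b1 * a) for a, b in zip(x, y)]
    s2_r = sum(e ** 2 for e in residuals) / (n - 2)  # residual variance
    se_b1 = sqrt(s2_r / sxx)
    se_b0 = sqrt(s2_r * (1 / n + x_bar ** 2 / sxx))
    return {"s2_r": s2_r, "se_b1": se_b1, "se_b0": se_b0, "t_b1": b1 / se_b1}


# Made-up observations; compare t_b1 with a t_{n-2} quantile from a table.
print(slope_inference([1, 2, 3, 4, 5], [2, 4, 5, 4, 5]))
```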

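Confidence intervals for the mean and individual response for $y_0$ given $X = x_0$:

$$\hat{y}_0 \pm t_{n-2,\alpha/2}\sqrt{s_R^2\left(\frac{1}{n} + \frac{(x_0-\bar{x})^2}{(n-1)\,s_X^2}\right)}, \qquad \hat{y}_0 \pm t_{n-2,\alpha/2}\sqrt{s_R^2\left(1 + \frac{1}{n} + \frac{(x_0-\bar{x})^2}{(n-1)\,s_X^2}\right)}$$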
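The two intervals differ only by the extra "1 +" inside the square root, which makes the interval for an individual response wider. A minimal Python sketch, with a hypothetical function name and made-up data, and with the $t$ critical value supplied from a table (here $t_{3;0.025} = 3.182$):

```python
from math import sqrt


def response_intervals(x, y, x0, t_crit):
    """Intervals for the mean response and for an individual response at x0.

    t_crit is the upper alpha/2 quantile of t_{n-2}, taken from a table.
    """
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxx = sum((a - x_bar) ** 2 for a in x)  # equals (n - 1) * s_X^2
    b1 = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y)) / sxx
    b0 = y_bar - b1 * x_bar
    s2_r = sum((b - (b0 + b1 * a)) ** 2 for a, b in zip(x, y)) / (n - 2)
    y0_hat = b0 + b1 * x0
    core = 1 / n + (x0 - x_bar) ** 2 / sxx
    half_mean = t_crit * sqrt(s2_r * core)        # mean response
    half_pred = t_crit * sqrt(s2_r * (1 + core))  # individual response
    return ((y0_hat - half_mean, y0_hat + half_mean),
            (y0_hat - half_pred, y0_hat + half_pred))


# Made-up observations with n = 5, so n - 2 = 3 degrees of freedom.
mean_ci, pred_ci = response_intervals([1, 2, 3, 4, 5], [2, 4, 5, 4, 5],
                                      x0=3, t_crit=3.182)
print(mean_ci, pred_ci)
```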
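ANOVA table for the simple linear regression model (R-squared $R^2 = SSM/SST$):

| Source of variability | SS | DF | Mean | F ratio |
|---|---|---|---|---|
| Model | $SSM = \sum_{i=1}^{n}(\hat{y}_i-\bar{y})^2$ | $1$ | $SSM/1$ | $SSM/s_R^2$ |
| Residuals/errors | $SSR = \sum_{i=1}^{n}(y_i-\hat{y}_i)^2 = \sum_{i=1}^{n} e_i^2$ | $n-2$ | $SSR/(n-2) = s_R^2$ | |
| Total | $SST = SSM + SSR$ | $n-1$ | | |

To test $H_0: \beta_1 = 0$ vs. $H_1: \beta_1 \neq 0$, the test statistic is $F = SSM/s_R^2 \sim F_{1,n-2}$ and $RR_\alpha = \{F > F_{1,n-2;\alpha}\}$.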
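The entries of the table can be reproduced with a few sums, as in this Python sketch. The function name `simple_anova` and the data are hypothetical. The critical value $F_{1,n-2;\alpha}$ is not computed here and would have to come from an $F$ table.

```python
def simple_anova(x, y):
    """Sums of squares, R-squared and F ratio for simple linear regression."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxx = sum((a - x_bar) ** 2 for a in x)
    b1 = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y)) / sxx
    b0 = y_bar - b1 * x_bar
    fitted = [b0 + b1 * a for a in x]
    ssm = sum((f - y_bar) ** 2 for f in fitted)          # model
    ssr = sum((b - f) ** 2 for b, f in zip(y, fitted))   # residuals
    sst = ssm + ssr                                      # total
    s2_r = ssr / (n - 2)
    return {"SSM": ssm, "SSR": ssr, "SST": sst,
            "R2": ssm / sst, "F": ssm / s2_r}


# Made-up observations; reject H0: beta_1 = 0 when F exceeds F_{1,n-2;alpha}.
print(simple_anova([1, 2, 3, 4, 5], [2, 4, 5, 4, 5]))
```

In simple regression the $F$ ratio equals the square of the $t$ statistic for $\beta_1 = 0$, so both tests lead to the same decision.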


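Model formulation, estimates, fitted model and residuals in multiple linear regression model $y_i = \beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + \cdots + \beta_k x_{ik} + u_i$, where $u_i \sim$ iid $N(0, \sigma^2)$, in matrix notation:

$$y = X\beta + u, \qquad \hat\beta = (X^T X)^{-1} X^T y, \qquad \hat{y} = X\hat\beta, \qquad e = y - \hat{y},$$

where

$$y = \begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{pmatrix}, \quad
X = \begin{pmatrix} 1 & x_{11} & x_{12} & \cdots & x_{1k} \\ 1 & x_{21} & x_{22} & \cdots & x_{2k} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 1 & x_{n1} & x_{n2} & \cdots & x_{nk} \end{pmatrix}, \quad
\beta = \begin{pmatrix} \beta_0 \\ \beta_1 \\ \beta_2 \\ \vdots \\ \beta_k \end{pmatrix}, \quad
u = \begin{pmatrix} u_1 \\ u_2 \\ \vdots \\ u_n \end{pmatrix}$$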

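Pivotal quantities for $\sigma^2$ and $\beta_j$, $j = 0, 1, \ldots, k$, with residual variance $s_R^2 = \sum_{i=1}^{n} e_i^2/(n-k-1)$:

$$\frac{(n-k-1)\,s_R^2}{\sigma^2} \sim \chi^2_{n-k-1}, \qquad \frac{\hat\beta_j - \beta_j}{s(\hat\beta_j)} \sim t_{n-k-1},$$

where $s(\hat\beta_j) = \sqrt{s^2(\hat\beta_j)}$ and $s^2(\hat\beta_j)$ is the $j$-th diagonal element of the (estimated) variance-covariance matrix of $\hat\beta$, with the matrix defined as $S = s_R^2\,(X^T X)^{-1}$.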
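For a small illustration of the matrix estimate $\hat\beta = (X^T X)^{-1} X^T y$, the Python sketch below builds $X$ with a leading column of ones and solves the normal equations $(X^T X)\hat\beta = X^T y$ by Gauss-Jordan elimination instead of forming the inverse explicitly. Both give the same $\hat\beta$ when $X^T X$ is invertible. All function names and the data are hypothetical, and the code assumes $X^T X$ is nonsingular.

```python
def solve(a, b):
    """Solve the square system a z = b by Gauss-Jordan elimination with pivoting."""
    n = len(a)
    aug = [list(a[i]) + [b[i]] for i in range(n)]
    for col in range(n):
        # Swap in the row with the largest pivot for numerical stability.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def ols(x_rows, y):
    """Return (beta_hat, residuals) for y = X beta + u with an intercept."""
    X = [[1.0] + list(row) for row in x_rows]   # prepend the column of ones
    cols = [list(c) for c in zip(*X)]           # rows of X^T
    xtx = [[sum(p * q for p, q in zip(ci, cj)) for cj in cols] for ci in cols]
    xty = [sum(p * q for p, q in zip(ci, y)) for ci in cols]
    beta = solve(xtx, xty)
    fitted = [sum(b * v for b, v in zip(beta, row)) for row in X]
    residuals = [yi - fi for yi, fi in zip(y, fitted)]
    return beta, residuals


# Made-up data generated exactly from y = 1 + 2 * x1 + 3 * x2 (no noise).
beta_hat, e = ols([(0, 0), (1, 0), (0, 1), (1, 1), (2, 1)], [1, 3, 4, 6, 8])
print(beta_hat, e)
```

The residual variance would then follow as the sum of squared residuals divided by $n - k - 1$.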
ANOVA table for the multiple linear regression model:

| Source of variability | SS | DF | Mean | F ratio |
|---|---|---|---|---|
| Model | $SSM$ | $k$ | $SSM/k$ | $(SSM/k)/s_R^2$ |
| Residuals/errors | $SSR$ | $n-k-1$ | $SSR/(n-k-1) = s_R^2$ | |
| Total | $SST$ | $n-1$ | | |
