
Formulasheetforfinal

Uploaded by

Rak ADUR
Copyright © All Rights Reserved
Formulas That May Be Needed

1 Laws of Probability
• If A and B are mutually exclusive events, then P(A or B) = P(A) + P(B).

• If A and B are independent events, then

– P(A and B) = P(A) × P(B),

– P(A | B) = P(A).

• If P(A and B) = P(A) × P(B) or P(A | B) = P(A) or P(B | A) = P(B), then

– A and B are independent events.

• If A and B are two events and $P(B) \neq 0$, the conditional probability of A given B is

\[ P(A \mid B) = \frac{P(A \text{ and } B)}{P(B)} = \frac{P(B \mid A) \times P(A)}{P(B)}. \]
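The conditional probability formula can be checked numerically. The sketch below is illustrative only: the function names and the probabilities are hypothetical, not taken from the sheet.

```python
def conditional_probability(p_a_and_b: float, p_b: float) -> float:
    """P(A | B) = P(A and B) / P(B), defined only when P(B) != 0."""
    if p_b == 0:
        raise ValueError("P(B) must be nonzero")
    return p_a_and_b / p_b


def bayes(p_b_given_a: float, p_a: float, p_b: float) -> float:
    """P(A | B) = P(B | A) * P(A) / P(B)."""
    return conditional_probability(p_b_given_a * p_a, p_b)


# Hypothetical inputs (assumed for illustration).
p_a, p_b, p_a_and_b = 0.3, 0.5, 0.12

p_a_given_b = conditional_probability(p_a_and_b, p_b)
p_b_given_a = conditional_probability(p_a_and_b, p_a)

# Both forms of the formula should give the same P(A | B).
p_a_given_b_via_bayes = bayes(p_b_given_a, p_a, p_b)

# Independence check: A and B are independent only if P(A | B) = P(A).
independent = abs(p_a_given_b - p_a) < 1e-12
```

With these numbers P(A and B) differs from P(A) × P(B), so the events are not independent.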

2 Discrete Random Variables (RV from now on)


\[ \bar{X} = E(X) = \mu_X = \sum_{i=1}^{n} P(X = x_i)\, x_i \]

\[ \mathrm{VAR}(X) = \sigma_X^2 = \sum_{i=1}^{n} P(X = x_i)\,(x_i - \mu_X)^2 \]

\[ \text{Std Dev}(X) = \sigma_X = \sqrt{\mathrm{VAR}(X)} \]
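As a rough illustration of these three formulas, the following sketch computes the mean, variance, and standard deviation of a discrete RV from its values and probabilities. The function names and the fair-die example are assumptions made for illustration.

```python
import math


def rv_mean(values, probs):
    """E(X) = sum of P(X = x_i) * x_i."""
    return sum(p * x for x, p in zip(values, probs))


def rv_variance(values, probs):
    """VAR(X) = sum of P(X = x_i) * (x_i - mu_X)^2."""
    mu = rv_mean(values, probs)
    return sum(p * (x - mu) ** 2 for x, p in zip(values, probs))


def rv_std_dev(values, probs):
    """Std Dev(X) = sqrt(VAR(X))."""
    return math.sqrt(rv_variance(values, probs))


# Hypothetical example: a fair six-sided die.
die_values = [1, 2, 3, 4, 5, 6]
die_probs = [1 / 6] * 6

die_mean = rv_mean(die_values, die_probs)      # close to 3.5
die_var = rv_variance(die_values, die_probs)   # close to 35/12
die_std = rv_std_dev(die_values, die_probs)
```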

3 Binomial Distribution with Parameters n and p


\[ \mu_X = np \qquad \sigma_X^2 = np(1-p) \]

\[ P(X = x) = \frac{n!}{x!\,(n-x)!}\; p^x (1-p)^{n-x}, \quad x = 0, 1, \ldots, n \]
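A minimal sketch of the binomial formulas, assuming Python 3.8+ for `math.comb`. The parameters n = 10 and p = 0.3 are hypothetical; the code recomputes the mean and variance from the probabilities so they can be compared with np and np(1 − p).

```python
import math


def binomial_pmf(x: int, n: int, p: float) -> float:
    """P(X = x) = n! / (x! (n - x)!) * p^x * (1 - p)^(n - x)."""
    return math.comb(n, x) * p ** x * (1 - p) ** (n - x)


# Hypothetical parameters (assumed for illustration).
n, p = 10, 0.3
pmf = [binomial_pmf(x, n, p) for x in range(n + 1)]

total = sum(pmf)                                   # should be close to 1
mean_from_pmf = sum(x * q for x, q in enumerate(pmf))
var_from_pmf = sum((x - mean_from_pmf) ** 2 * q for x, q in enumerate(pmf))

mean_formula = n * p                # mu_X = np
var_formula = n * p * (1 - p)       # sigma_X^2 = np(1 - p)
```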

4 Two Random Variables


\[ \mathrm{Cov}(X, Y) = \sum_{i=1}^{n} P(X = x_i \text{ and } Y = y_i)\,(x_i - \mu_X)(y_i - \mu_Y) \]

\[ \mathrm{Corr}(X, Y) = \frac{\mathrm{Cov}(X, Y)}{\sigma_X \sigma_Y} \]

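If X and Y are independent, then $\mathrm{Cov}(X, Y) = 0$ and $\mathrm{Corr}(X, Y) = 0$.

\[ E(aX + bY + c) = a\,E(X) + b\,E(Y) + c \]

\[ \mathrm{VAR}(aX + bY + c) = a^2\,\mathrm{VAR}(X) + b^2\,\mathrm{VAR}(Y) + 2ab\,\mathrm{Cov}(X, Y) \]

\[ = a^2 \sigma_X^2 + b^2 \sigma_Y^2 + 2ab\,\sigma_X \sigma_Y\,\mathrm{Corr}(X, Y) \]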
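The covariance, correlation, and linear-combination formulas can be verified on a small joint distribution. In this sketch the list of (x, y, probability) outcomes and the constants a, b, c are hypothetical. The variance of aX + bY + c is computed both from the formula and directly from the outcomes.

```python
import math


def joint_moments(outcomes):
    """outcomes: list of (x_i, y_i, p_i) with the p_i summing to 1."""
    mu_x = sum(p * x for x, y, p in outcomes)
    mu_y = sum(p * y for x, y, p in outcomes)
    var_x = sum(p * (x - mu_x) ** 2 for x, y, p in outcomes)
    var_y = sum(p * (y - mu_y) ** 2 for x, y, p in outcomes)
    cov = sum(p * (x - mu_x) * (y - mu_y) for x, y, p in outcomes)
    return mu_x, mu_y, var_x, var_y, cov


# Hypothetical joint distribution and constants (assumed for illustration).
outcomes = [(0, 1, 0.2), (1, 3, 0.5), (2, 2, 0.3)]
a, b, c = 2.0, -1.0, 5.0

mu_x, mu_y, var_x, var_y, cov = joint_moments(outcomes)
corr = cov / (math.sqrt(var_x) * math.sqrt(var_y))

# Formula: VAR(aX + bY + c) = a^2 VAR(X) + b^2 VAR(Y) + 2ab Cov(X, Y)
var_by_formula = a ** 2 * var_x + b ** 2 * var_y + 2 * a * b * cov

# Direct computation on W = aX + bY + c for comparison.
mu_w = sum(p * (a * x + b * y + c) for x, y, p in outcomes)
var_direct = sum(p * (a * x + b * y + c - mu_w) ** 2 for x, y, p in outcomes)
```

The constant c shifts the mean but does not appear in the variance.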
5 Uniform Distribution between a and b

\[ E(X) = \frac{a+b}{2} \qquad \mathrm{VAR}(X) = \frac{(b-a)^2}{12} \qquad P(X \leq x) = \frac{x-a}{b-a} \ \text{ if } a \leq x \leq b \]
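A short sketch of the uniform formulas follows. The sheet gives P(X ≤ x) only for a ≤ x ≤ b; returning 0 below a and 1 above b is an added assumption based on the standard definition of a cumulative probability. Function names are hypothetical.

```python
def uniform_mean(a: float, b: float) -> float:
    """E(X) = (a + b) / 2."""
    return (a + b) / 2


def uniform_variance(a: float, b: float) -> float:
    """VAR(X) = (b - a)^2 / 12."""
    return (b - a) ** 2 / 12


def uniform_cdf(x: float, a: float, b: float) -> float:
    """P(X <= x) = (x - a) / (b - a) for a <= x <= b."""
    if not a < b:
        raise ValueError("require a < b")
    if x < a:
        return 0.0  # assumption: outside the range covered by the sheet
    if x > b:
        return 1.0  # assumption: outside the range covered by the sheet
    return (x - a) / (b - a)
```

For example, with a = 2 and b = 6, `uniform_cdf(3, 2, 6)` evaluates (3 − 2)/(6 − 2).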

6 Normal Distribution
• If X is Normally distributed with mean $\mu$ and standard deviation $\sigma$, then $P(X \leq x) = F\left(\frac{x-\mu}{\sigma}\right)$, where $F(z)$ can be read from the “normal” table and $z = \frac{x-\mu}{\sigma}$.

• If X and Y are Normally distributed, then so is aX + bY + c.

• Assume that $X_1, \ldots, X_n$ are independent and identically distributed, $E(X_i) = \mu$, and $\mathrm{VAR}(X_i) = \sigma^2$. Let $S_n = \sum_{i=1}^{n} X_i$ be the sum and $\bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i$ be the average, then:

– Central Limit Theorem for the sum. If n is moderately large (say, 30 or more), then $S_n$ is approximately Normally distributed with mean $n\mu$ and standard deviation $\sigma\sqrt{n}$.

– Central Limit Theorem for the sample mean. If n is moderately large, then $\bar{X}$ is approximately Normally distributed with mean $\mu$ and standard deviation $\frac{\sigma}{\sqrt{n}}$.

• A binomial distribution can be approximated with a normal (with the correct parameters $\mu$ and $\sigma$) when $np \geq 5$ and $n(1-p) \geq 5$.
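The normal-distribution rules above can be sketched with Python's standard library, assuming Python 3.8+ for `statistics.NormalDist`, which stands in here for the “normal” table. Function names are hypothetical. Taking np and sqrt(np(1 − p)) as the “correct parameters” for the binomial approximation follows the binomial mean and variance formulas in Section 3.

```python
import math
from statistics import NormalDist


def normal_cdf(x: float, mu: float, sigma: float) -> float:
    """P(X <= x) = F(z) with z = (x - mu) / sigma."""
    z = (x - mu) / sigma
    return NormalDist().cdf(z)  # standard Normal table lookup


def clt_sum_params(n: int, mu: float, sigma: float):
    """Approximate mean and std dev of the sum S_n."""
    return n * mu, sigma * math.sqrt(n)


def clt_mean_params(n: int, mu: float, sigma: float):
    """Approximate mean and std dev of the sample mean."""
    return mu, sigma / math.sqrt(n)


def binomial_normal_ok(n: int, p: float) -> bool:
    """Rule of thumb from the sheet: np >= 5 and n(1 - p) >= 5."""
    return n * p >= 5 and n * (1 - p) >= 5


def binomial_normal_params(n: int, p: float):
    """Mean and std dev of the approximating normal."""
    return n * p, math.sqrt(n * p * (1 - p))
```

For instance, `binomial_normal_ok(20, 0.1)` is false because np = 2, so the approximation would not be recommended there.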

7 Statistical Inference for the Population Mean µ


• $\bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i$ is the sample mean. The observed sample mean $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$ is an estimate of the mean of the population $\mu$.

• $S = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n} (X_i - \bar{X})^2}$ is the standard deviation of the sample. The observed standard deviation of the sample $s = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n} (x_i - \bar{x})^2}$ is an estimate of the standard deviation of the population $\sigma$.

• The standard deviation of the sample mean is $\text{Std Dev}(\bar{X}) = \frac{\sigma}{\sqrt{n}}$, where $\sigma$ is the standard deviation of the population.

• If n is large (say, 30 or more), then $\frac{\bar{X} - \mu}{S/\sqrt{n}}$ is approximately a standard Normal RV.

• If n is small (say, less than 30) and the population distribution is “well-behaved”, then $\frac{\bar{X} - \mu}{S/\sqrt{n}}$ obeys a t-distribution with n − 1 degrees of freedom (dof).

• For $n \geq 30$, a $\beta$% confidence interval for the real mean $\mu$ is $\left[\bar{x} - c \times \frac{s}{\sqrt{n}},\ \bar{x} + c \times \frac{s}{\sqrt{n}}\right]$, where c can be found by solving $P(-c \leq Z \leq c) = \beta/100$ with Z being a standard Normal RV. For example:

For $\beta = 90$, c = 1.645; for $\beta = 95$, c = 1.960; for $\beta = 98$, c = 2.326; for $\beta = 99$, c = 2.576.

• For n < 30 and a “well-behaved” population distribution, a $\beta$% confidence interval for the real mean $\mu$ is $\left[\bar{x} - c \times \frac{s}{\sqrt{n}},\ \bar{x} + c \times \frac{s}{\sqrt{n}}\right]$, where c satisfies $P(-c \leq T \leq c) = \beta/100$ with T a RV that has a t-distribution with n − 1 dof.

• To construct a $\beta$% confidence interval that is within (plus or minus) L of the actual mean, the required sample size is $n = \frac{c^2 s^2}{L^2}$, where c satisfies $P(-c \leq Z \leq c) = \beta/100$ if Z is a standard Normal RV.
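The large-sample confidence interval and the sample-size formula can be sketched as follows, assuming Python 3.8+ for `statistics.NormalDist`. The function names and inputs are hypothetical, and the confidence level is passed in percent. Rounding the sample size up to the next integer is an added assumption; the sheet only gives the formula. For n < 30, c would come from a t table instead, which the standard library does not provide.

```python
import math
from statistics import NormalDist


def z_critical(confidence_percent: float) -> float:
    """c such that P(-c <= Z <= c) = confidence_percent / 100."""
    # P(Z <= c) = 0.5 + (confidence / 100) / 2 by symmetry of the Normal.
    return NormalDist().inv_cdf(0.5 + confidence_percent / 200)


def confidence_interval(x_bar: float, s: float, n: int, c: float):
    """[x_bar - c * s / sqrt(n), x_bar + c * s / sqrt(n)]."""
    half_width = c * s / math.sqrt(n)
    return x_bar - half_width, x_bar + half_width


def required_sample_size(s: float, L: float, c: float) -> int:
    """n = c^2 s^2 / L^2, rounded up (rounding is an assumption)."""
    return math.ceil(c ** 2 * s ** 2 / L ** 2)


# Hypothetical data: x_bar = 50, s = 10, n = 100, 95% confidence.
c95 = z_critical(95)
low, high = confidence_interval(50.0, 10.0, 100, c95)
n_needed = required_sample_size(s=10.0, L=2.0, c=1.96)
```

`z_critical` reproduces the tabulated values of c listed in the example to three decimals.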

8 Regression
• n = number of data points

• k = number of explanatory (independent) variables

• Based on observed data

\[ (y_1, x_{11}, \ldots, x_{k1}) \]
\[ \vdots \]
\[ (y_n, x_{1n}, \ldots, x_{kn}) \]

• Population relation: $Y_i = \beta_0 + \beta_1 x_{1i} + \ldots + \beta_k x_{ki} + \alpha_i$ where $\alpha_i$ is $N(0, \sigma)$

• $\hat{y}_i = b_0 + b_1 x_{1i} + \ldots + b_k x_{ki}$ is the predicted value

• $b_j$ is the regression coefficient and an estimate of $\beta_j$, $j = 0, 1, \ldots, k$

• $s_{b_j}$ is the standard deviation of $b_j$

• $e_i = y_i - \hat{y}_i$ is the residual

• A $\beta$% confidence interval for $\beta_j$ is $[b_j - c \times s_{b_j},\ b_j + c \times s_{b_j}]$, where c satisfies $P(-c \leq T \leq c) = \beta/100$ if T obeys a t-distribution with dof = n − k − 1

• The t-statistic is $t_{b_j} = \frac{b_j}{s_{b_j}}$

• Checklist for evaluating a linear regression model: (i) linearity, (ii) signs of regression coefficients, (iii) significance of independent variables, (iv) $R^2$, (v) normality of the residuals, (vi) heteroscedasticity, (vii) autocorrelation, and (viii) multicollinearity.
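To make the regression quantities concrete, here is a minimal sketch for the special case k = 1 (one explanatory variable). The sheet does not state how the coefficients or their standard deviations are computed, so the least-squares formulas used here are standard textbook ones and are an assumption. The data are hypothetical, and the value c = 3.182 is the usual t-table entry for 95% confidence with 3 degrees of freedom.

```python
import math


def simple_regression(xs, ys):
    """Least-squares fit of y = b0 + b1 * x (k = 1); returns b0, b1, s_b1, residuals."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    # Residuals e_i = y_i - y_hat_i
    residuals = [y - (b0 + b1 * x) for x, y in zip(xs, ys)]
    dof = n - 1 - 1  # n - k - 1 with k = 1
    s_e = math.sqrt(sum(e ** 2 for e in residuals) / dof)
    s_b1 = s_e / math.sqrt(sxx)  # standard deviation of b1
    return b0, b1, s_b1, residuals


# Hypothetical data (assumed for illustration).
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

b0, b1, s_b1, residuals = simple_regression(xs, ys)
t_b1 = b1 / s_b1                       # t-statistic for the slope
c = 3.182                              # t table: 95%, dof = n - k - 1 = 3
ci_b1 = (b1 - c * s_b1, b1 + c * s_b1) # confidence interval for the slope
```

A large absolute t-statistic, or equivalently a confidence interval that excludes 0, is what item (iii) of the checklist looks for.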
