Formula sheet for final
1 Laws of Probability
• If $A$ and $B$ are mutually exclusive events, then $P(A \text{ or } B) = P(A) + P(B)$.
• $P(A \mid B) = \dfrac{P(A \text{ and } B)}{P(B)} = \dfrac{P(B \mid A) \times P(A)}{P(B)}$.
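The conditional-probability identity above can be checked numerically. The following is a minimal sketch; the function name and the input probabilities are hypothetical and chosen only for illustration.

```python
def conditional_probability(p_b_given_a: float, p_a: float, p_b: float) -> float:
    """Return P(A | B) = P(B | A) * P(A) / P(B)."""
    if p_b <= 0:
        raise ValueError("P(B) must be positive")
    return p_b_given_a * p_a / p_b


# Hypothetical inputs: P(B | A) = 0.9, P(A) = 0.01, P(B) = 0.108.
p_a_given_b = conditional_probability(0.9, 0.01, 0.108)
print(round(p_a_given_b, 4))  # 0.0833
```

Note that $P(B)$ must be supplied (or computed separately); the function does not derive it.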
6 Normal Distribution
• If $X$ is normally distributed with mean $\mu$ and standard deviation $\sigma$, then $P(X \le x) = F\left(\frac{x-\mu}{\sigma}\right)$, where $F(z)$ can be read from the “normal” table and $z = \frac{x-\mu}{\sigma}$.
• Assume that $X_1, \ldots, X_n$ are independent and identically distributed, $E(X_i) = \mu$, and $\mathrm{VAR}(X_i) = \sigma^2$. Let $S_n = \sum_{i=1}^{n} X_i$ be the sum and $\bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i$ be the average. Then:
– Central Limit Theorem for the sum. If $n$ is moderately large (say, 30 or more), then $S_n$ is approximately Normally distributed with mean $n\mu$ and standard deviation $\sigma\sqrt{n}$.
– Central Limit Theorem for the sample mean. If $n$ is moderately large, then $\bar{X}$ is approximately Normally distributed with mean $\mu$ and standard deviation $\frac{\sigma}{\sqrt{n}}$.
• A binomial distribution with parameters $n$ and $p$ can be approximated with a normal distribution (with parameters $\mu = np$ and $\sigma = \sqrt{np(1-p)}$) when $np \ge 5$ and $n(1-p) \ge 5$.
• The standard deviation of the sample mean is $\mathrm{Std\,Dev}(\bar{X}) = \frac{\sigma}{\sqrt{n}}$, where $\sigma$ is the standard deviation of the population.
• If $n$ is large (say, 30 or more), then $\dfrac{\bar{X}-\mu}{S/\sqrt{n}}$ is approximately a standard Normal RV.
• If $n$ is small (say, less than 30) and the population distribution is “well-behaved”, then $\dfrac{\bar{X}-\mu}{S/\sqrt{n}}$ obeys a t-distribution with $n-1$ degrees of freedom (dof).
• For $n \ge 30$, an $\alpha\%$ confidence interval for the real mean $\mu$ is $\left[\bar{x} - c \times \frac{s}{\sqrt{n}},\ \bar{x} + c \times \frac{s}{\sqrt{n}}\right]$, where $c$ can be found by solving $P(-c \le Z \le c) = \alpha/100$ with $Z$ being a standard Normal RV. For example:
For $\alpha = 90$, $c = 1.645$; for $\alpha = 95$, $c = 1.960$; for $\alpha = 98$, $c = 2.326$; for $\alpha = 99$, $c = 2.576$.
• For $n < 30$ and a “well-behaved” population distribution, an $\alpha\%$ confidence interval for the real mean $\mu$ is $\left[\bar{x} - c \times \frac{s}{\sqrt{n}},\ \bar{x} + c \times \frac{s}{\sqrt{n}}\right]$, where $c$ satisfies $P(-c \le T \le c) = \alpha/100$ with $T$ an RV that has a t-distribution with $n-1$ dof.
• To construct an $\alpha\%$ confidence interval that is within (plus or minus) $L$ of the actual mean, the required sample size is $n = \dfrac{c^2 s^2}{L^2}$, where $c$ satisfies $P(-c \le Z \le c) = \alpha/100$ for a standard Normal RV $Z$.
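The formulas in this section can be evaluated without a printed table. The sketch below uses Python's standard-library `statistics.NormalDist` in place of the “normal” table. All function names, the parameter `level_percent` (the confidence level in percent), and the numeric inputs are assumptions made for illustration. The binomial approximation is applied without a continuity correction, and the small-sample case is handled only by letting the caller pass a value of c read from a t table.

```python
from math import ceil, sqrt
from statistics import NormalDist

Z = NormalDist()  # standard Normal RV: mean 0, standard deviation 1


def normal_cdf(x, mu, sigma):
    """P(X <= x) = F((x - mu) / sigma) for a normal X."""
    return Z.cdf((x - mu) / sigma)


def sum_cdf(x, mu, sigma, n):
    """CLT for the sum: S_n is roughly normal, mean n*mu, std dev sigma*sqrt(n)."""
    return normal_cdf(x, n * mu, sigma * sqrt(n))


def sample_mean_cdf(x, mu, sigma, n):
    """CLT for the sample mean: roughly normal, mean mu, std dev sigma/sqrt(n)."""
    return normal_cdf(x, mu, sigma / sqrt(n))


def binomial_cdf_normal_approx(k, n, p):
    """Approximate P(Binomial(n, p) <= k); requires np >= 5 and n(1 - p) >= 5."""
    if n * p < 5 or n * (1 - p) < 5:
        raise ValueError("normal approximation needs np >= 5 and n(1 - p) >= 5")
    return normal_cdf(k, n * p, sqrt(n * p * (1 - p)))


def critical_value(level_percent):
    """Solve P(-c <= Z <= c) = level_percent / 100 for c."""
    return Z.inv_cdf(0.5 + level_percent / 200)


def confidence_interval(xbar, s, n, c):
    """Return (xbar - c*s/sqrt(n), xbar + c*s/sqrt(n)).

    Pass c = critical_value(level) when n >= 30, or a t-table value when n < 30.
    """
    half_width = c * s / sqrt(n)
    return xbar - half_width, xbar + half_width


def required_sample_size(s, L, level_percent):
    """n = c^2 s^2 / L^2, rounded up to a whole observation."""
    c = critical_value(level_percent)
    return ceil(c ** 2 * s ** 2 / L ** 2)


# The critical values listed on the sheet.
print([round(critical_value(a), 3) for a in (90, 95, 98, 99)])
# [1.645, 1.96, 2.326, 2.576]

# Hypothetical sample: mean 50, sample std dev 10, n = 100, 95% level.
low, high = confidence_interval(50, 10, 100, critical_value(95))
print(round(low, 2), round(high, 2))  # 48.04 51.96

# Hypothetical target: within plus or minus 2 of the mean at the 95% level.
print(required_sample_size(10, 2, 95))  # 97
```

Rounding the sample size up is a design choice: the formula generally gives a non-integer, and rounding down would leave the interval slightly wider than L.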
8 Regression
• n = number of data points
• Checklist for evaluating a linear regression model: (i) linearity, (ii) signs of regression coefficients, (iii) significance of independent variables, (iv) $R^2$, (v) normality of the residuals, (vi) heteroscedasticity, (vii) autocorrelation, and (viii) multicollinearity.
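As a rough illustration of the quantities the checklist refers to, the sketch below fits a one-variable least-squares line and computes its slope, $R^2$, and residuals. It assumes a single independent variable; the data and function names are hypothetical, and it does not perform the significance, normality, heteroscedasticity, autocorrelation, or multicollinearity checks themselves.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


def residuals_and_r_squared(xs, ys, slope, intercept):
    """Return (residuals, R^2), where R^2 = 1 - SS_res / SS_tot."""
    y_bar = mean(ys)
    residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]
    ss_res = sum(r * r for r in residuals)
    ss_tot = sum((y - y_bar) ** 2 for y in ys)
    return residuals, 1 - ss_res / ss_tot


# Hypothetical data with n = 5 data points.
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

slope, intercept = fit_line(xs, ys)
residuals, r2 = residuals_and_r_squared(xs, ys, slope, intercept)
print(round(slope, 2), round(intercept, 2), round(r2, 4))
```

The residuals returned here are the starting point for items (v) through (vii): they can be plotted against the fitted values or against observation order to look for non-normality, changing variance, or serial patterns.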