
Statistics: 1. Basics

This document provides an overview of key statistical concepts: 1) It defines common probability distributions like the binomial and normal distributions. 2) It discusses confidence intervals and how to calculate margins of error and sample sizes. 3) It compares methods for comparing two proportions, two means, and using matched pairs analysis. 4) It also briefly introduces regression analysis and testing hypotheses about slopes and means.

Uploaded by

Cassia Lmt
Copyright
© All Rights Reserved

Statistics

1. Basics

a) Bernoulli(1/2): one trial with two possible outcomes, each with probability 1/2
b) Binomial distribution: number of successes in n independent trials, each with 2 possible outcomes
c) Gaussian/Normal Density Function (standard normal form): $f(x) = \frac{1}{\sqrt{2\pi}} e^{-x^2/2}$
68% within 1 s.d. from mean; 95% within 2 s.d.; 99.7% within 3 s.d.;
d) Terminology: μ (mean/expected value), σ (SD), σ² (variance)
e) As n → ∞, the variance of the sample mean (v/n, where v is the population variance) → 0
f) Unbiased variance: n-1 in denominator; Biased variance: n in denominator
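
The 68/95/99.7 rule and the two variance denominators can be checked numerically. The following is a minimal sketch using only Python's standard library; the helper name `prob_within` and the sample values in `data` are hypothetical and not part of the notes.

```python
from statistics import NormalDist, pvariance, variance

std_normal = NormalDist(mu=0.0, sigma=1.0)

def prob_within(k: float) -> float:
    """Probability that a standard normal value lies within k s.d. of the mean."""
    return std_normal.cdf(k) - std_normal.cdf(-k)

for k in (1, 2, 3):
    # Roughly 0.6827, 0.9545 and 0.9973
    print(f"within {k} s.d.: {prob_within(k):.4f}")

data = [2.0, 4.0, 4.0, 5.0, 7.0]   # hypothetical sample
biased = pvariance(data)           # n in the denominator
unbiased = variance(data)          # n - 1 in the denominator
print(biased, unbiased)
```

The biased estimate is always the smaller of the two, and the gap shrinks as n grows.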

2. Confidence interval

a) 95% CI for p (α = 0.05): p̂ ± 1.96 SE, where SE is the standard error of the sample proportion p̂; intervals built this way contain the true p in about 95% of repeated samples;
99% CI for p: p̂ ± 2.576 SE; the interval is wider (more conservative)
b) Margin of error: CI = [a,b]; MOE = (b-a)/2
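c) CI for a mean vs CI for a proportion: in both cases the confidence level describes how often an interval
calculated from a sample will include the population value (the mean or the proportion), not how probable it is that an individual value falls within the interval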
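
As an illustration of items a) and b), here is a small sketch that builds 95% and 99% intervals for a proportion and recovers the margin of error as (b-a)/2. The function names and the example figures (a sample proportion of 0.40 from 500 respondents) are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def proportion_ci(p_hat: float, n: int, confidence: float = 0.95):
    """Normal-approximation CI for a proportion: p_hat +/- z * SE."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # about 1.96 for 95%
    se = sqrt(p_hat * (1 - p_hat) / n)                   # standard error of p_hat
    return p_hat - z * se, p_hat + z * se

def margin_of_error(ci) -> float:
    """MOE = (b - a) / 2 for CI = [a, b]."""
    a, b = ci
    return (b - a) / 2

ci95 = proportion_ci(0.40, 500)          # hypothetical sample
ci99 = proportion_ci(0.40, 500, 0.99)
print(ci95, margin_of_error(ci95))
print(ci99, margin_of_error(ci99))
```

The 99% interval comes out wider than the 95% one, which is what "more conservative" means here.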

3. Sample size

a) $1.96\sqrt{\dfrac{p(1-p)}{n}} = \text{Margin of error}$
Given p, select an appropriate margin of error and solve for n. Note that 1.96 is the $z_{\alpha/2}$ critical value
for a 95% CI
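
Solving the margin-of-error equation for n gives n = z² p(1-p) / MOE², rounded up to the next whole number. A minimal sketch follows; the function name `sample_size` and the example inputs are hypothetical.

```python
from math import ceil

def sample_size(p: float, moe: float, z: float = 1.96) -> int:
    """Smallest n for which z * sqrt(p * (1 - p) / n) <= moe."""
    return ceil(z ** 2 * p * (1 - p) / moe ** 2)

# p = 0.5 is the worst case because p * (1 - p) is largest there
print(sample_size(0.5, 0.05))
print(sample_size(0.5, 0.03))
```

Halving the margin of error roughly quadruples the required sample size, since n scales with 1/MOE².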

4. T-distribution (when population variance is unknown)

a) Slightly wider than the normal distribution because of the additional uncertainty from estimating the sample
variance
b) Degrees of freedom = n-1
c) Replace 1.96 with appropriate value (will be >1.96) from mathematical table
d) Sample variance: $s^2 = \frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2$
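
To see how the t critical value widens the interval, the sketch below computes a CI for a mean twice: once with 1.96 and once with the tabulated 95% two-sided t value for 9 degrees of freedom (2.262). The sample values and the name `mean_ci` are hypothetical; the critical value is taken from a standard t table because the Python standard library does not provide t quantiles.

```python
from math import sqrt
from statistics import mean, stdev

def mean_ci(sample, crit: float):
    """CI for the mean: x_bar +/- crit * s / sqrt(n), with s using n - 1."""
    n = len(sample)
    x_bar = mean(sample)
    se = stdev(sample) / sqrt(n)   # stdev() divides by n - 1
    return x_bar - crit * se, x_bar + crit * se

sample = [5.1, 4.9, 5.6, 5.8, 6.0, 5.2, 4.7, 5.5, 5.3, 5.9]  # hypothetical, n = 10
T_CRIT_DF9 = 2.262   # table value: 95% two-sided, df = n - 1 = 9

t_ci = mean_ci(sample, T_CRIT_DF9)
z_ci = mean_ci(sample, 1.96)
print("t-based:", t_ci)
print("z-based:", z_ci)
```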
5. Matched pairs

a) Used when the observations come in naturally related pairs, e.g. twins, before/after measurements on the same subject, or two
methods applied to the same data
b) The two samples are not independent
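c) Analysis: find the mean of the pairwise differences, $\bar{d}$, and the s.d. of the differences, $s_d$; then
compute the p-value from the test statistic $t = \bar{d} / (s_d/\sqrt{n})$ with n-1 degrees of freedom, where n is the number of pairs;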


H0: Twin B- Twin A = 0, H1: Twin B- Twin A ≠ 0

Twin A    Twin B    Twin B - Twin A
  1         2          1
  3         6          3
  4         7          3
  7         9          2
  5         9          4
  8         7         -1
  9         5         -4

Mean = ?
S.d. = ?
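
The mean and s.d. of the differences in the twin table can be worked out as below. This is a sketch under the usual paired t-test assumptions; the variable names are hypothetical, and the critical value 2.447 is the tabulated 95% two-sided t value for 6 degrees of freedom.

```python
from math import sqrt
from statistics import mean, stdev

twin_a = [1, 3, 4, 7, 5, 8, 9]
twin_b = [2, 6, 7, 9, 9, 7, 5]

# Pairwise differences, Twin B - Twin A
diffs = [b - a for a, b in zip(twin_a, twin_b)]

d_bar = mean(diffs)                  # mean of the differences
s_d = stdev(diffs)                   # s.d. with n - 1 in the denominator
n = len(diffs)
t_stat = d_bar / (s_d / sqrt(n))     # paired t-statistic, df = n - 1 = 6

T_CRIT_DF6 = 2.447                   # table value: 95% two-sided, df = 6
reject_h0 = abs(t_stat) > T_CRIT_DF6

print(f"mean = {d_bar:.3f}, s.d. = {s_d:.3f}, t = {t_stat:.3f}, reject H0: {reject_h0}")
```

With these seven pairs the mean difference is about 1.14, the s.d. about 2.79 and t about 1.08, so H0 is not rejected at the 5% level.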

6. Comparing two proportions A & B (the two data sets are independent, unlike
matched pairs)

a) Analysis: you can either compute a CI (the range in which the true difference is likely to lie, at 95% confidence)
or carry out a hypothesis test (H0 vs HA)
b) CI: 95% CI = $(p_A - p_B) \pm 1.96\sqrt{\mathrm{Var}(p_A - p_B)}$, where
$\mathrm{Var}(p_A - p_B) = \dfrac{p_A(1-p_A)}{n_A} + \dfrac{p_B(1-p_B)}{n_B}$

c) Hypothesis method:

d) E.g.: A poll carried out in January surveyed 560 people and found that 45% of them supported a
political candidate. A poll about the same political candidate was also carried out in April and
showed that out of the 1100 people surveyed 52% supported the candidate. What is the CI? Is there
a significant difference between pA (0.45) & pB (0.52)?
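
The poll example can be worked through numerically. The CI part follows the variance formula in b). The notes do not preserve the formula for the hypothesis method, so the pooled two-proportion z-test used in the second half is an assumption, as are all variable names.

```python
from math import sqrt
from statistics import NormalDist

p_a, n_a = 0.45, 560     # January poll
p_b, n_b = 0.52, 1100    # April poll

diff = p_a - p_b

# 95% CI using Var(pA - pB) = pA(1-pA)/nA + pB(1-pB)/nB
se_unpooled = sqrt(p_a * (1 - p_a) / n_a + p_b * (1 - p_b) / n_b)
ci = (diff - 1.96 * se_unpooled, diff + 1.96 * se_unpooled)

# Assumed hypothesis test: pooled z-test of H0: pA = pB
p_pool = (p_a * n_a + p_b * n_b) / (n_a + n_b)
se_pooled = sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
z = diff / se_pooled
p_value = 2 * (1 - NormalDist().cdf(abs(z)))   # two-sided

print(f"95% CI for pA - pB: ({ci[0]:.4f}, {ci[1]:.4f})")
print(f"z = {z:.3f}, p-value = {p_value:.4f}")
```

The interval is roughly (-0.121, -0.019) and excludes 0, and z is about -2.70, so under these assumptions the difference between the two polls is significant at the 5% level.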

7. Comparing two independent means


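a) For analysis of H0: µA = µB when the variances are not equal,
the degrees of freedom are calculated by the Welch-Satterthwaite formula (a more involved formula)

$\sqrt{\mathrm{Var}(\bar{x}_A - \bar{x}_B)} = \sqrt{\dfrac{s_A^2}{n_A} + \dfrac{s_B^2}{n_B}}$

Plug into the formula $(\bar{x}_A - \bar{x}_B) / \sqrt{\mathrm{Var}(\bar{x}_A - \bar{x}_B)}$ to calculate the t-statistic, where $\bar{x}$ and $s^2$ are the sample means and sample variances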

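b) For analysis of H0: µA = µB with Var A = Var B (i.e. pooled variance), use the same plug-in formula
above, with the variance term replaced by $s_p^2\left(\frac{1}{n_A} + \frac{1}{n_B}\right)$, where $s_p^2 = \frac{(n_A-1)s_A^2 + (n_B-1)s_B^2}{n_A + n_B - 2}$, and with $n_A + n_B - 2$ degrees of freedom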
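
The two cases can be sketched side by side. The code below assumes the standard forms of the Welch and pooled-variance t-statistics and the usual Welch-Satterthwaite expression for the degrees of freedom; the function names and both samples are hypothetical.

```python
from math import sqrt
from statistics import mean, variance

def welch_t(a, b):
    """Unequal variances: t-statistic and Welch-Satterthwaite degrees of freedom."""
    na, nb = len(a), len(b)
    va, vb = variance(a) / na, variance(b) / nb      # s^2 / n for each group
    t = (mean(a) - mean(b)) / sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (na - 1) + vb ** 2 / (nb - 1))
    return t, df

def pooled_t(a, b):
    """Equal variances: t-statistic with pooled variance, df = nA + nB - 2."""
    na, nb = len(a), len(b)
    sp2 = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    t = (mean(a) - mean(b)) / sqrt(sp2 * (1 / na + 1 / nb))
    return t, na + nb - 2

group_a = [20.1, 22.3, 19.8, 21.5, 20.9]          # hypothetical
group_b = [18.2, 19.9, 17.5, 18.8, 19.1, 18.4]    # hypothetical

print("Welch :", welch_t(group_a, group_b))
print("Pooled:", pooled_t(group_a, group_b))
```

The Welch degrees of freedom are generally not an integer and never exceed nA + nB - 2. When both groups have the same size, the two t-statistics coincide and only the degrees of freedom differ.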

8. Regression

To answer the question "How close is the experimental slope b1 to the real slope β1?", you need to
consider R^2 and the SE of b1

a) Hypothesis testing:

SE = Standard error, computed with a complicated formula

b) Confidence Interval

c) Remember to check for outliers and confirm that the residuals are randomly scattered. If they are not, the data suggest
non-linearity.

Use a log scale on the X-axis to improve linearity
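
For simple linear regression, the slope b1, its standard error, R^2 and the t-statistic for H0: β1 = 0 can be computed directly. The sketch below assumes the usual least-squares formulas, since the notes do not preserve their own; the function name `slope_inference` and the data are hypothetical, and 3.182 is the tabulated 95% two-sided t value for n - 2 = 3 degrees of freedom.

```python
from math import sqrt
from statistics import mean

def slope_inference(x, y):
    """Least-squares slope b1, SE(b1), R^2 and t = b1 / SE(b1)."""
    n = len(x)
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    syy = sum((yi - y_bar) ** 2 for yi in y)
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))  # residual sum of squares
    se_b1 = sqrt(sse / (n - 2)) / sqrt(sxx)
    r2 = 1 - sse / syy
    return b1, se_b1, r2, b1 / se_b1

x = [1, 2, 3, 4, 5]                  # hypothetical
y = [2.1, 3.9, 6.2, 7.8, 10.1]       # hypothetical

b1, se_b1, r2, t_stat = slope_inference(x, y)
T_CRIT_DF3 = 3.182                   # table value: 95% two-sided, df = n - 2 = 3
ci_b1 = (b1 - T_CRIT_DF3 * se_b1, b1 + T_CRIT_DF3 * se_b1)

print(f"b1 = {b1:.3f}, SE = {se_b1:.4f}, R^2 = {r2:.4f}, t = {t_stat:.1f}")
print("95% CI for the slope:", ci_b1)
```

A t-statistic well above the critical value, together with a CI for the slope that excludes 0, indicates that the slope differs from zero; plotting the residuals is still needed to check linearity.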
