Important Formulas Table

This document defines key statistical concepts used in analyzing sample data: 1) It describes how to calculate the sample mean, variance, and standard deviation to estimate properties of the population from which the sample was drawn. 2) It introduces z-scores, standard error, and test statistics used to evaluate hypotheses about population parameters based on sample data. 3) It defines confidence intervals that indicate the reliability of sample estimates and stochastic models that relate variables with uncertainty.

Sample Mean: $\bar{x} = \hat{\mu} = \frac{\sum_{i=1}^{n} x_i}{n}$
Sum of all x values divided by the sample size; the sample mean is also the point estimate of the population mean.

Sample Variance: $s_x^2 = \hat{\sigma}_x^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n-1}$
Sum of the squared distances of each x value from the mean, divided by the sample size less one; measures how far the x values lie from the mean.

Sample Standard Deviation: $s_x = \hat{\sigma}_x = \sqrt{s_x^2}$
Square root of the variance; expresses variability in the population (how far values are dispersed from the estimated mean).

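As a quick check of these three formulas, here is a minimal Python sketch (NumPy only, with a made-up sample chosen purely for illustration) that computes the sample mean, variance, and standard deviation directly from the definitions above.

```python
import numpy as np

# Hypothetical sample data, for illustration only
x = np.array([4.2, 5.1, 3.8, 6.0, 5.5, 4.9])
n = len(x)

# Sample mean: sum of all x values divided by the sample size
x_bar = x.sum() / n

# Sample variance: sum of squared deviations from the mean, divided by n - 1
s2 = ((x - x_bar) ** 2).sum() / (n - 1)

# Sample standard deviation: square root of the variance
s = np.sqrt(s2)

print(x_bar, s2, s)
# np.var(x, ddof=1) and np.std(x, ddof=1) should give the same values
```
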
z-score: $z = \frac{x - \mu}{\sigma}$
Subtract the mean from x and divide by the standard deviation; transforms data from a normal distribution into a standard normal distribution.

Standard Error: $se(\bar{x}) = \frac{\sigma}{\sqrt{n}}$
Population standard deviation divided by the square root of the sample size; describes the spread of the sampling distribution of the sample mean (it is the standard deviation of that distribution).

Expected Value: $E(\bar{x}) = \mu$
The expected value of the sample mean equals the population mean, so the sample mean is an unbiased estimator of the population mean; in a normal distribution this value is also the mode.

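A short sketch of the z-score and standard error formulas; the population mean, standard deviation, observation, and sample size below are assumed values chosen only to illustrate the arithmetic.

```python
import numpy as np

mu, sigma = 100.0, 15.0      # assumed population mean and standard deviation
x = 112.0                    # a single observation
n = 36                       # sample size

# z-score: how many population standard deviations x lies from the mean
z = (x - mu) / sigma

# Standard error of the sample mean: sigma divided by sqrt(n)
se_xbar = sigma / np.sqrt(n)

print(z, se_xbar)            # 0.8, 2.5
```
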
Test Statistic: $t = \frac{\hat{\theta} - \theta_0}{se(\hat{\theta})}$
The sample estimate of θ minus its hypothesized value $\theta_0$, divided by the standard error; if its magnitude exceeds the critical value, the null hypothesis can be rejected.

Critical Value: $t_{n-1,\,\alpha/2}$
α is the significance level we are willing to accept; the critical value is the cutoff for determining whether a test result is significant.

Confidence Interval: $CI_{100(1-\alpha)\%}(\mu) = \hat{\mu} \pm t_{n-1,\,\alpha/2}\, se(\hat{\mu})$
The confidence interval at level α for the population mean is the estimated mean plus or minus the margin of error; it is the interval within which we believe the parameter falls, and if the hypothesized mean is not in the interval, we reject the null hypothesis.

Confidence Interval (95%): $CI_{95\%} = [\bar{x} - 1.96\, se(\bar{x}),\ \bar{x} + 1.96\, se(\bar{x})]$
With repeated sampling, 95% of intervals constructed this way will contain the true parameter μ.

Confidence Interval (99%): $CI_{99\%} = [\bar{x} - 2.576\, se(\bar{x}),\ \bar{x} + 2.576\, se(\bar{x})]$
With repeated sampling, 99% of intervals constructed this way will contain the true parameter μ.

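The sketch below puts the test statistic, critical value, and confidence interval together for a single mean, using SciPy's t distribution (assumed available); the sample values and hypothesized mean are made up for illustration.

```python
import numpy as np
from scipy import stats

x = np.array([5.1, 4.8, 5.6, 5.0, 4.7, 5.3, 5.2, 4.9])  # hypothetical sample
mu_0 = 5.0                       # hypothesized population mean
alpha = 0.05

n = len(x)
x_bar = x.mean()
se = x.std(ddof=1) / np.sqrt(n)  # estimated standard error of the mean

# Test statistic: (estimate - hypothesized value) / standard error
t_stat = (x_bar - mu_0) / se

# Critical value t_{n-1, alpha/2}
t_crit = stats.t.ppf(1 - alpha / 2, df=n - 1)

# 100(1 - alpha)% confidence interval: estimate +/- critical value * SE
ci = (x_bar - t_crit * se, x_bar + t_crit * se)

print(t_stat, t_crit, ci)
# Reject H0 if abs(t_stat) > t_crit, i.e. if mu_0 lies outside the interval
```
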
Stochastic Model: $y = f(x) + \varepsilon$
y is a function of x plus an error term that accounts for uncertainty; the relationship holds in general, but individual observations show small variations.

Simple Linear Regression Model: $y_i = \beta_0 + \beta_1 x_i + \varepsilon_i$
$\beta_0$ is the y-intercept and $\beta_1$ is the slope, so y changes by $\beta_1$ units for a one-unit change in x; under the null hypothesis that $\beta_1 = 0$, y does not depend on x and stays at $\beta_0$ on average.

Estimated β1: $\hat{\beta}_1 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2}$
The sum of the products of the deviations of x and y from their means, divided by the sum of squared deviations of x from its mean; this is the slope to use in the simple linear regression, and a one-unit change in x predicts a $\hat{\beta}_1$ change in y.

Estimated β0: $\hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x}$
The mean of y minus the estimated slope times the mean of x; this is the intercept to use in the simple linear regression.

Residual: $\hat{\varepsilon} = y - \hat{\beta}_0 - \hat{\beta}_1 x = y - \hat{y}$
The estimated error is the difference between the observed y and the fitted y; the deviation left over once the relationship with x is taken into account.

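A minimal sketch of the estimated slope, intercept, and residuals for a simple linear regression, computed straight from the formulas above; the x and y values are made up for illustration.

```python
import numpy as np

# Hypothetical data for illustration
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 2.9, 3.7, 5.2, 5.8])

x_bar, y_bar = x.mean(), y.mean()

# Estimated slope: sum of cross-deviations over sum of squared x deviations
beta1_hat = ((x - x_bar) * (y - y_bar)).sum() / ((x - x_bar) ** 2).sum()

# Estimated intercept: mean of y minus slope times mean of x
beta0_hat = y_bar - beta1_hat * x_bar

# Fitted values and residuals
y_hat = beta0_hat + beta1_hat * x
residuals = y - y_hat

print(beta1_hat, beta0_hat, residuals)
```
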
Sum of Squares Residual: $SS_{\text{Residual}} = \sum_{i=1}^{n} \hat{\varepsilon}_i^2 = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2$
The sum of squared differences between the observed y values and the fitted y values; least squares chooses the line that minimizes this quantity.

Sum of Squares Total: $SS_{\text{Total}} = \sum_{i=1}^{n} (y_i - \bar{y})^2$
The sum of squared differences between the y values and their mean; all of the variation in y in the model.

Sum of Squares Explained: $SS_{\text{Explained}} = \sum_{i=1}^{n} (\hat{y}_i - \bar{y})^2$
The sum of squared differences between the fitted y values and the mean of y; the variation of the fitted values around their mean.

Sum of Squares Decomposition: $SS_{\text{Total}} = SS_{\text{Residual}} + SS_{\text{Explained}}$
Total variation is the sum of residual (unexplained) variation and explained variation.

Coefficient of Determination: $R^2 = \frac{SS_{\text{Explained}}}{SS_{\text{Total}}} = 1 - \frac{SS_{\text{Residual}}}{SS_{\text{Total}}}$
The ratio of explained variation to total variation; a goodness-of-fit measure of how well the least-squares regression line fits the data.

Standard Error of Regression: $se_{\text{Regression}} = \sqrt{\hat{\sigma}_\varepsilon^2} = \hat{\sigma}_\varepsilon$
The square root of the residual variance; an estimate of the magnitude of the typical deviation from the regression line.

Residual Variance: $\hat{\sigma}_\varepsilon^2 = \frac{SS_{\text{Residual}}}{n-2} = \frac{\sum_{i=1}^{n} \hat{\varepsilon}_i^2}{n-2}$
The sum of squared residuals divided by the sample size less two; an estimate of the variance of the population errors.

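Continuing the simple regression sketch above (same hypothetical x and y), the sums of squares, R², residual variance, and standard error of the regression can be computed as follows.

```python
import numpy as np

# Reusing the hypothetical data and fitted values from the earlier sketch
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 2.9, 3.7, 5.2, 5.8])
beta1_hat = ((x - x.mean()) * (y - y.mean())).sum() / ((x - x.mean()) ** 2).sum()
beta0_hat = y.mean() - beta1_hat * x.mean()
y_hat = beta0_hat + beta1_hat * x

n = len(y)
ss_residual = ((y - y_hat) ** 2).sum()        # unexplained variation
ss_total = ((y - y.mean()) ** 2).sum()        # total variation
ss_explained = ((y_hat - y.mean()) ** 2).sum()

r_squared = ss_explained / ss_total           # equals 1 - ss_residual / ss_total
sigma2_eps = ss_residual / (n - 2)            # residual variance
se_regression = np.sqrt(sigma2_eps)           # standard error of the regression

print(r_squared, sigma2_eps, se_regression)
```
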
Sample Covariance: $s_{xy} = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{n-1}$
The sum of the products of the deviations of x and y from their means, divided by the sample size less one; a measure of how two variables move or vary together.

Sample Correlation: $r_{xy} = \frac{s_{xy}}{s_x s_y}$
The sample covariance divided by the product of the standard deviations of x and y; a measure of the strength of the association between two variables.

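A short sketch of the sample covariance and correlation formulas on the same made-up data; NumPy's built-in np.cov and np.corrcoef should give matching values.

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])      # hypothetical data
y = np.array([2.1, 2.9, 3.7, 5.2, 5.8])
n = len(x)

# Sample covariance: sum of cross-deviations divided by n - 1
s_xy = ((x - x.mean()) * (y - y.mean())).sum() / (n - 1)

# Sample correlation: covariance scaled by the two standard deviations
r_xy = s_xy / (x.std(ddof=1) * y.std(ddof=1))

print(s_xy, r_xy)
# Compare: np.cov(x, y, ddof=1)[0, 1] and np.corrcoef(x, y)[0, 1]
```
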
Multiple Linear Regression Model: $y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_k x_k + \varepsilon$
Regresses y on several x variables at once; each additional regressor can only reduce the unexplained error.

F-Statistic: $F = \frac{MS_{\text{Model}}}{MS_{\text{Residuals}}} = \frac{SS_{\text{Model}} / k}{SS_{\text{Residuals}} / (n - k - 1)}$
The model sum of squares divided by the number of regressors k, over the residual sum of squares divided by $n - k - 1$; tests the overall significance of the regression model, and if significant, at least one x variable is related to y.

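The following sketch fits a multiple regression by least squares with NumPy and computes the overall F-statistic from the sums of squares; the simulated data and coefficient values are assumptions made only so the example runs end to end.

```python
import numpy as np

# Hypothetical data: n observations, k = 2 regressors
rng = np.random.default_rng(0)
n, k = 30, 2
X = rng.normal(size=(n, k))
y = 1.0 + 2.0 * X[:, 0] - 1.5 * X[:, 1] + rng.normal(scale=0.5, size=n)

# Design matrix with an intercept column, coefficients via least squares
X_design = np.column_stack([np.ones(n), X])
beta_hat, *_ = np.linalg.lstsq(X_design, y, rcond=None)

y_hat = X_design @ beta_hat
ss_residual = ((y - y_hat) ** 2).sum()         # residual sum of squares
ss_model = ((y_hat - y.mean()) ** 2).sum()     # model (explained) sum of squares

# F = (SS_Model / k) / (SS_Residual / (n - k - 1))
F = (ss_model / k) / (ss_residual / (n - k - 1))
print(beta_hat, F)
```
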
OLS Formula for β1: $\beta_1 = \frac{\sigma_{x_1 y}}{\sigma_{x_1}^2}$
The covariance of $x_1$ and y divided by the variance of $x_1$; the population slope parameter in a multiple regression, to which pieces for omitted variables (e.g., $\beta_2$) must be added or subtracted.
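As a small check of this identity, the sketch below shows that the covariance-over-variance form gives the same slope as the least-squares formula used earlier; the data are the same hypothetical values as before.

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])      # hypothetical data
y = np.array([2.1, 2.9, 3.7, 5.2, 5.8])

# Slope as covariance of (x, y) divided by variance of x
beta1_cov = np.cov(x, y, ddof=1)[0, 1] / np.var(x, ddof=1)

# Slope from the least-squares formula used earlier
beta1_ols = ((x - x.mean()) * (y - y.mean())).sum() / ((x - x.mean()) ** 2).sum()

print(beta1_cov, beta1_ols)  # identical up to floating-point rounding
```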
