BA Module 2 Summary
→ It is often very useful to infer attributes of a large population from a smaller sample. To make sound inferences:
• Make sure the sample is sufficiently large and is representative of the population.
• Avoid biased results by
→ phrasing questions neutrally;
→ ensuring that the sampling method is appropriate for the demographic of the target population; and
→ pursuing high response rates.
• If a sample is sufficiently large and representative of the population, the sample statistics, x̄ and s, should be
reasonably good estimates of the population parameters, µ and σ, respectively.
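As a rough illustration of the last point, the sketch below draws a random sample from a simulated population and compares x̄ and s with µ and σ. The population (normal, mean 170, standard deviation 10) and the sample size of 400 are assumptions chosen only for the example.

```python
import random
import statistics

random.seed(1)  # fixed seed so the run is repeatable

# Hypothetical population: 50,000 values with mean near 170 and sd near 10.
population = [random.gauss(170, 10) for _ in range(50_000)]
mu = statistics.fmean(population)
sigma = statistics.pstdev(population)   # population standard deviation

# A simple random sample, so every member is equally likely to be chosen.
sample = random.sample(population, 400)
x_bar = statistics.fmean(sample)
s = statistics.stdev(sample)            # sample standard deviation (n - 1)

print(f"population: mu = {mu:.2f}, sigma = {sigma:.2f}")
print(f"sample:     x_bar = {x_bar:.2f}, s = {s:.2f}")
```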
→ The normal distribution has a unique symmetrical shape whose center and width are determined by its mean and
standard deviation respectively.
• Using the properties of the normal distribution, we can calculate a probability associated with any range of
values.
• Several rules of thumb are helpful for estimating probabilities for a normal distribution.
→ About 68% of the probability is contained in the range reaching one standard deviation away from the
mean on either side, that is, P(µ-σ≤ 𝑥 ≤µ+σ)≈ 68%.
→ About 95% of the probability is contained in the range reaching two standard deviations (1.96 to be exact)
away from the mean on either side, that is, P(µ-2σ≤ 𝑥 ≤µ+2σ)≈ 95%.
→ About 99.7% of the probability is contained in the range reaching three standard deviations away from the
mean on either side, that is, P(µ-3σ≤ 𝑥 ≤µ+3σ)≈ 99.7%.
• A z-value of a point x is the distance x lies from the mean, measured in standard deviations, z = (x − µ)/σ.
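The rules of thumb and the z-value, z = (x − µ)/σ, can be checked numerically. The sketch below uses Python's standard-library NormalDist; the function names and the example values (mean 100, standard deviation 15) are hypothetical.

```python
from statistics import NormalDist

def z_value(x, mu, sigma):
    """Distance of x from the mean, measured in standard deviations."""
    return (x - mu) / sigma

def prob_within(k, mu=0.0, sigma=1.0):
    """P(mu - k*sigma <= x <= mu + k*sigma) for a normal distribution."""
    dist = NormalDist(mu, sigma)
    return dist.cdf(mu + k * sigma) - dist.cdf(mu - k * sigma)

# Probability within 1, 1.96, 2 and 3 standard deviations of the mean.
for k in (1, 1.96, 2, 3):
    print(f"within {k} standard deviations: {prob_within(k):.4f}")

# Hypothetical example: a value of 130 when the mean is 100 and sd is 15.
print(z_value(130, mu=100, sigma=15))  # 2.0
```

The same probabilities come out for any mean and standard deviation, because the range is expressed in standard deviations rather than raw units.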
→ The Central Limit Theorem states that if we take enough sufficiently large samples from any population, the
means of those samples will be approximately normally distributed, regardless of the shape of the underlying population.
• The distribution of those samples' means, called the Distribution of Sample Means, more closely
approximates a normal curve as we increase the number of samples and/or the sample size.
• The mean of any single sample lies on the normally distributed Distribution of Sample Means, so we can use
the normal curve’s special properties to draw conclusions from a single sample mean.
• The mean of the Distribution of Sample Means equals the mean of the population distribution.
• The standard deviation of the Distribution of Sample Means equals the standard deviation of the population
distribution divided by the square root of the sample size. Thus, increasing the sample size decreases the
width of the Distribution of Sample Means.
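A small simulation can make these properties concrete. The sketch below assumes a deliberately skewed, hypothetical population (exponential with mean 1) and checks that the mean of the sample means is close to the population mean and that their standard deviation is close to σ/√n. The sample size and number of samples are arbitrary choices.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

# Hypothetical skewed population: exponential, mean and sd both near 1.
population = [random.expovariate(1.0) for _ in range(100_000)]
pop_mean = statistics.fmean(population)
pop_sd = statistics.pstdev(population)

n = 50               # size of each sample
num_samples = 2_000  # how many samples we draw

# One mean per sample: together they form the Distribution of Sample Means.
sample_means = [
    statistics.fmean(random.sample(population, n)) for _ in range(num_samples)
]

mean_of_means = statistics.fmean(sample_means)
sd_of_means = statistics.stdev(sample_means)
predicted_sd = pop_sd / n ** 0.5  # sigma divided by the square root of n

print(f"population mean {pop_mean:.3f}, mean of sample means {mean_of_means:.3f}")
print(f"predicted sd {predicted_sd:.3f}, observed sd {sd_of_means:.3f}")
```

Rerunning with a larger n should shrink both the predicted and the observed standard deviation, which is the narrowing described above.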
→ The sample mean is only a point estimate. Using the properties of the normal distribution and the Central Limit
Theorem, we can construct a range around the sample mean, called a confidence interval, to estimate the range
in which the true population mean likely lies.
• The width of the confidence interval depends on the level of confidence, our best estimate of the population
standard deviation, and the sample size. We can only control the level of confidence and the sample size.
• For large samples (n≥30), the lower and upper bounds are calculated using the following equation: x̄ ± z·s/√n.
• The function CONFIDENCE.NORM calculates the margin of error, which we add and subtract from the sample
mean to find the confidence interval.
• For small samples (n<30), the lower and upper bounds are calculated using the following equation: x̄ ± t·s/√n.
→ For small samples, we use a t-distribution, which is shorter and wider than a normal distribution. The t-
distribution provides a wider range, a more conservative estimate of where the true population mean lies.
→ The function CONFIDENCE.T calculates the margin of error, which we add and subtract from the sample
mean to find the confidence interval.
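Both intervals have the form x̄ ± (critical value)·s/√n, where the second term is the margin of error that CONFIDENCE.NORM and CONFIDENCE.T return in Excel. The sketch below mirrors that calculation in Python. All data and summary statistics are hypothetical, and because the standard library has no t-distribution, the t-value is hard-coded from a standard t-table (two-tailed, 95%, 14 degrees of freedom) as an assumption.

```python
from statistics import NormalDist, fmean, stdev

def margin_of_error(critical_value, s, n):
    """Half-width of the interval: critical_value * s / sqrt(n)."""
    return critical_value * s / n ** 0.5

def z_critical(confidence):
    """Two-sided z-value for a confidence level such as 0.95."""
    return NormalDist().inv_cdf(1 - (1 - confidence) / 2)

def confidence_interval(x_bar, margin):
    """Lower and upper bounds: x_bar minus and plus the margin of error."""
    return x_bar - margin, x_bar + margin

# Large sample (n >= 30): hypothetical summary statistics.
n_large, x_bar_large, s_large = 100, 50.0, 10.0
z = z_critical(0.95)
ci_large = confidence_interval(x_bar_large, margin_of_error(z, s_large, n_large))

# Small sample (n < 30): hypothetical measurements, n = 15.
small_sample = [12.1, 11.8, 12.4, 12.0, 11.9, 12.3, 12.2, 11.7,
                12.5, 12.0, 11.9, 12.1, 12.2, 11.8, 12.1]
T_95_DF_14 = 2.145  # assumed value from a t-table: 95%, n - 1 = 14 df
n_small = len(small_sample)
ci_small = confidence_interval(
    fmean(small_sample),
    margin_of_error(T_95_DF_14, stdev(small_sample), n_small),
)

print("large-sample 95% interval:", ci_large)
print("small-sample 95% interval:", ci_small)
```

Because the t-value (2.145) is larger than the z-value (about 1.96), the same s and n give a wider, more conservative interval with the t-distribution.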
→ We can also calculate confidence intervals for proportions. To do so, we must convert data to dummy (0, 1)
variables.
• After that, we can proceed as we would with any other confidence interval.
→ When estimating the true population proportion, we should ensure that the sample size is large enough by
checking that both of the following conditions are true: n·p ≥ 5 and n·(1−p) ≥ 5. If either of these
guidelines is not satisfied, we must collect a larger sample.
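The dummy-variable approach can be sketched as follows: code each response as 1 or 0, check the two sample-size conditions, then build the interval exactly as for a mean. The survey responses (60 "yes", 40 "no") and the function name are hypothetical.

```python
from statistics import NormalDist, fmean, stdev

def proportion_interval(dummies, confidence=0.95):
    """Confidence interval for a proportion from 0/1 dummy variables."""
    n = len(dummies)
    p_hat = fmean(dummies)  # the mean of 0/1 data is the sample proportion
    # Sample-size guidelines: n*p >= 5 and n*(1 - p) >= 5.
    if n * p_hat < 5 or n * (1 - p_hat) < 5:
        raise ValueError("sample too small: collect a larger sample")
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * stdev(dummies) / n ** 0.5
    return p_hat - margin, p_hat + margin

# Hypothetical survey: 60 "yes" and 40 "no" answers.
responses = ["yes"] * 60 + ["no"] * 40
dummies = [1 if r == "yes" else 0 for r in responses]

lower, upper = proportion_interval(dummies)
print(f"95% interval for the proportion: {lower:.3f} to {upper:.3f}")
```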
EXCEL SUMMARY
Recall the Excel functions and analyses covered in this course and make sure to familiarize yourself with all of the
necessary steps, syntax, and arguments. We have provided some additional information for the more complex
functions listed below. As usual, the arguments shown in square brackets are optional. The functions whose names
include “S” use the standard normal distribution.
→ =RAND()
→ =NORM.S.DIST(z, cumulative)
• When cumulative is set to “TRUE”, NORM.S.DIST finds the cumulative probability, that is, the probability of
being less than or equal to the specified value z for a standard normal distribution.
→ =IF(logical_test,[value_if_true],[value_if_false])
• Returns value_if_true if the specified condition is met, and returns value_if_false if the condition is not met.
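For readers who want to check spreadsheet results outside Excel, the sketch below gives rough Python analogues of the three functions above. The helper name norm_s_dist is hypothetical, and the cumulative=False branch (returning the density rather than a cumulative probability) is an assumption about how the non-cumulative case should behave.

```python
import random
from statistics import NormalDist

def norm_s_dist(z, cumulative=True):
    """Analogue of =NORM.S.DIST(z, cumulative) for the standard normal."""
    standard_normal = NormalDist()  # mean 0, standard deviation 1
    if cumulative:
        return standard_normal.cdf(z)  # P(value <= z)
    return standard_normal.pdf(z)      # height of the curve at z

# Analogue of =RAND(): a uniform random number in [0, 1).
u = random.random()

# Analogue of =IF(logical_test, value_if_true, value_if_false).
z = 1.96
label = "above the mean" if z > 0 else "at or below the mean"

print(u, norm_s_dist(z), label)
```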