BA Module 2 Summary

This document discusses sampling and estimation techniques. It explains that taking a sufficiently large, representative sample can provide reasonably good estimates of population parameters. It also describes properties of the normal distribution and how the central limit theorem allows using a normal distribution to analyze sample means. Confidence intervals estimate the range where the true population mean is likely to fall based on a sample's mean and standard deviation.

BUSINESS ANALYTICS MODULE 2

Sampling and Estimation

→ It is often very useful to infer attributes of a large population from a smaller sample. To make sound inferences:
• Make sure the sample is sufficiently large and is representative of the population.
• Avoid biased results by
→ phrasing questions neutrally;
→ ensuring that the sampling method is appropriate for the demographic of the target population; and
→ pursuing high response rates.
• If a sample is sufficiently large and representative of the population, the sample statistics, x̄ and s, should be
reasonably good estimates of the population parameters, µ and σ, respectively.

→ The normal distribution has a unique symmetrical shape whose center and width are determined by its mean and
standard deviation respectively.
• Using the properties of the normal distribution, we can calculate a probability associated with any range of
values.
• Several rules of thumb are helpful for estimating probabilities for a normal distribution.
→ About 68% of the probability is contained in the range reaching one standard deviation away from the
mean on either side, that is, P(µ-σ≤ 𝑥 ≤µ+σ)≈ 68%.
→ About 95% of the probability is contained in the range reaching two standard deviations (1.96 to be exact)
away from the mean on either side, that is, P(µ-2σ≤ 𝑥 ≤µ+2σ)≈ 95%.
→ About 99.7% of the probability is contained in the range reaching three standard deviations away from the
mean on either side, that is, P(µ-3σ≤ 𝑥 ≤µ+3σ)≈ 99.7%.
• A z-value of a point x is the distance x lies from the mean, measured in standard deviations: z = (x − µ)/σ.
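
The rules of thumb and the z-value can be checked numerically. The following is a minimal sketch using Python's standard-library `statistics.NormalDist`; the helper names `z_value` and `prob_within` and the example numbers (µ = 100, σ = 15) are illustrative assumptions, not part of the course material.

```python
from statistics import NormalDist

def z_value(x, mu, sigma):
    """Distance of x from the mean, measured in standard deviations."""
    return (x - mu) / sigma

def prob_within(k):
    """P(mu - k*sigma <= x <= mu + k*sigma) for any normal distribution."""
    std = NormalDist()  # standard normal: mean 0, standard deviation 1
    return std.cdf(k) - std.cdf(-k)

# Compare the rules of thumb (1, 2, 3) with the exact 95% multiplier (1.96).
for k in (1, 1.96, 2, 3):
    print(f"within {k} standard deviations: {prob_within(k):.4f}")

# Hypothetical example: the value 130 when mu = 100 and sigma = 15.
print(z_value(130, 100, 15))
```

The loop shows why 1.96 is quoted as the exact multiplier: two full standard deviations cover slightly more than 95% of the probability.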

→ The Central Limit Theorem states that if we take enough sufficiently large samples from any population, the
means of those samples will be approximately normally distributed, regardless of the shape of the underlying population.
• The distribution of those samples' means, called the Distribution of Sample Means, more closely
approximates a normal curve as we increase the number of samples and/or the sample size.
• The mean of any single sample lies on the normally distributed Distribution of Sample Means, so we can use
the normal curve’s special properties to draw conclusions from a single sample mean.
• The mean of the Distribution of Sample Means equals the mean of the population distribution.
• The standard deviation of the Distribution of Sample Means equals the standard deviation of the population
distribution divided by the square root of the sample size. Thus, increasing the sample size decreases the
width of the Distribution of Sample Means.
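
The two properties above (the mean of the sample means matches the population mean, and their standard deviation is σ/√n) can be illustrated with a small simulation. This is only a sketch: the skewed exponential population, the sample size of 40, the 5,000 samples, and the fixed seed are all assumptions chosen for illustration.

```python
import math
import random
import statistics

rng = random.Random(42)  # fixed seed so the run is repeatable

SAMPLE_SIZE = 40     # n, the number of observations in each sample
NUM_SAMPLES = 5000   # how many samples we draw

# Assumed population: exponential with rate 1, which is strongly skewed
# and has mean 1 and standard deviation 1.
POP_MEAN = 1.0
POP_SD = 1.0

sample_means = []
for _ in range(NUM_SAMPLES):
    sample = [rng.expovariate(1.0) for _ in range(SAMPLE_SIZE)]
    sample_means.append(statistics.fmean(sample))

mean_of_means = statistics.fmean(sample_means)
sd_of_means = statistics.stdev(sample_means)
predicted_sd = POP_SD / math.sqrt(SAMPLE_SIZE)  # sigma / sqrt(n)

print(f"mean of sample means: {mean_of_means:.3f} (population mean {POP_MEAN})")
print(f"sd of sample means:   {sd_of_means:.3f} (sigma/sqrt(n) = {predicted_sd:.3f})")
```

Raising `SAMPLE_SIZE` should shrink the spread of `sample_means`, in line with the square-root relationship.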

→ The sample mean is only a point estimate. Using the properties of the normal distribution and the Central Limit
Theorem, we can construct a range around the sample mean, called a confidence interval, to estimate the range
in which the true population mean likely lies.
• The width of the confidence interval depends on the level of confidence, our best estimate of the population
standard deviation, and the sample size. We can only control the level of confidence and the sample size.
• For large samples (n≥30), the lower and upper bounds are calculated using the following equation: x̄ ± z·(s/√n).

• The function CONFIDENCE.NORM calculates the margin of error, which we add and subtract from the sample
mean to find the confidence interval.
• For small samples (n<30), the lower and upper bounds are calculated using the following equation: x̄ ± t·(s/√n).

→ For small samples, we use a t-distribution, which is shorter and wider than a normal distribution.
The t-distribution provides a wider range, a more conservative estimate of where the true population mean lies.
→ The function CONFIDENCE.T calculates the margin of error, which we add and subtract from the sample
mean to find the confidence interval.
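
The margin-of-error calculation behind both intervals can be sketched in a few lines of Python. The function names and sample figures below are hypothetical. Python's standard library has no t-distribution quantile function, so the small-sample critical value (2.131 for 95% confidence with n − 1 = 15 degrees of freedom) is copied from a standard t-table rather than computed.

```python
from math import sqrt
from statistics import NormalDist

def margin_of_error(critical_value, s, n):
    """Critical value times the standard error s / sqrt(n)."""
    return critical_value * s / sqrt(n)

def z_critical(alpha):
    """Two-sided z-value for significance level alpha (1 - confidence level)."""
    return NormalDist().inv_cdf(1 - alpha / 2)

# Large sample (hypothetical data): sample mean 50, s = 10, n = 100, 95% confidence.
x_bar, s, n = 50.0, 10.0, 100
margin_norm = margin_of_error(z_critical(0.05), s, n)
print(f"large sample: {x_bar - margin_norm:.2f} to {x_bar + margin_norm:.2f}")

# Small sample (hypothetical data): same mean and s, but n = 16.
# Assumption: t critical value for 95% confidence and 15 degrees of
# freedom, taken from a t-table.
T_CRIT_95_DF15 = 2.131
margin_t = margin_of_error(T_CRIT_95_DF15, s, 16)
print(f"small sample: {x_bar - margin_t:.2f} to {x_bar + margin_t:.2f}")
```

The small-sample interval is wider for two reasons: the t critical value is larger than 1.96, and the standard error grows as n shrinks.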

→ We can also calculate confidence intervals for proportions. To do so, we must convert data to dummy (0, 1)
variables.
• After that, we can proceed as we would with any other confidence interval.
→ When estimating the true population proportion, we should ensure that the sample size is large enough by
checking that both of the following conditions are true: n·p ≥ 5 and n·(1−p) ≥ 5. If either of these
guidelines is not satisfied, we must collect a larger sample.
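
As a rough illustration of the proportion workflow, the sketch below codes survey answers as dummy variables, checks the two sample-size guidelines, and then builds a 95% interval the same way as for a mean. The data (30 "yes" answers out of 100) and all variable names are made up for the example.

```python
from math import sqrt
from statistics import NormalDist, fmean, stdev

# Hypothetical survey: 1 = "yes", 0 = "no".
responses = [1] * 30 + [0] * 70

n = len(responses)
p_hat = fmean(responses)  # the mean of the dummy variable is the sample proportion
s = stdev(responses)      # sample standard deviation of the 0/1 data

# Sample-size guidelines: n*p >= 5 and n*(1 - p) >= 5.
size_ok = n * p_hat >= 5 and n * (1 - p_hat) >= 5

# Proceed as with any other large-sample confidence interval.
z = NormalDist().inv_cdf(0.975)  # 95% confidence, so alpha = 0.05
margin = z * s / sqrt(n)

print(f"sample size adequate: {size_ok}")
print(f"95% interval: {p_hat - margin:.3f} to {p_hat + margin:.3f}")
```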

EXCEL SUMMARY

Recall the Excel functions and analyses covered in this course and make sure to familiarize yourself with all of the
necessary steps, syntax, and arguments. We have provided some additional information for the more complex
functions listed below. As usual, the arguments shown in square brackets are optional. The functions whose names
include “S” use the standard normal distribution.

→ =RAND()

→ =NORM.DIST(x, mean, standard_dev, cumulative)


• When cumulative is set to “TRUE”, NORM.DIST finds the cumulative probability, that is, the probability of
being less than or equal to the specified value x, for a normal distribution with the specified mean and standard
deviation. (Inserting the value “FALSE” provides the height of the normal distribution at the value x, which is
not covered in this course.)

→ =NORM.S.DIST(z, cumulative)
• When cumulative is set to “TRUE”, NORM.S.DIST finds the cumulative probability, that is, the probability of
being less than or equal to the specified value z for a standard normal distribution.

→ =NORM.INV(probability, mean, standard_dev)


• Returns the corresponding x-value on a normal distribution for the specified mean, standard deviation, and
cumulative probability.
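
For readers working outside Excel, the three normal-distribution functions above map closely onto Python's standard-library `statistics.NormalDist`. The wrapper names below are assumptions chosen to mirror the Excel names; they are not an official port.

```python
from statistics import NormalDist

def norm_dist(x, mean, standard_dev, cumulative):
    """Rough analogue of NORM.DIST: cumulative probability or curve height at x."""
    dist = NormalDist(mean, standard_dev)
    return dist.cdf(x) if cumulative else dist.pdf(x)

def norm_s_dist(z, cumulative):
    """Rough analogue of NORM.S.DIST: standard normal (mean 0, standard deviation 1)."""
    return norm_dist(z, 0.0, 1.0, cumulative)

def norm_inv(probability, mean, standard_dev):
    """Rough analogue of NORM.INV: x-value for a given cumulative probability."""
    return NormalDist(mean, standard_dev).inv_cdf(probability)

# Hypothetical example: mean 100, standard deviation 15.
p = norm_dist(115, 100, 15, True)  # probability of a value <= 115
print(round(p, 4))
print(round(norm_inv(p, 100, 15), 4))  # inverts the step above, back to 115
```

`norm_inv` undoes `norm_dist` with `cumulative` set to true, which is a quick way to sanity-check either calculation.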

→ =CONFIDENCE.NORM(alpha, standard_dev, size)


• Returns the margin of error using a normal distribution for a specified alpha, standard_dev, and size. Alpha is
the significance level, which equals one minus the confidence level (for example, a 95% confidence interval
would correspond to the significance level 0.05).

→ =CONFIDENCE.T(alpha, standard_dev, size)


• Returns the margin of error using a t-distribution for a specified alpha, standard_dev, and size.

→ =IF(logical_test,[value_if_true],[value_if_false])
• Returns value_if_true if the specified condition is met, and returns value_if_false if the condition is not met.
