IV AI-DS AD3491 FDSA Unit3
Populations – samples – random sampling – Sampling distribution- standard error of the mean -
Hypothesis testing – z-test – z-test procedure –decision rule – calculations – decisions –
interpretations - one-tailed and two-tailed tests – Estimation – point estimate – confidence
interval – level of confidence – effect of sample size.
3.1 POPULATIONS
In statistics, as well as in quantitative methodology, sets of data are collected and selected from a statistical population with the help of defined procedures. There are two different types of data sets, namely population and sample. When we calculate the mean deviation, variance, or standard deviation, it is therefore necessary to know whether we are referring to the entire population or only to sample data, because the formulas differ: population formulas divide by the population size N, whereas the corresponding sample formulas for variance and standard deviation divide by n − 1, where n is the sample size. Let us take a look at population data sets and sample data sets in detail.
Population
A population includes all the elements of the data set, and measurable characteristics of the population, such as the mean and standard deviation, are known as parameters. For example, all the people living in India constitute the population of India.
There are different types of population. They are:
Finite Population
Infinite Population
Existent Population
Hypothetical Population
Let us discuss all the types one by one.
Finite Population
The finite population is also known as a countable population in which the population can be
counted. In other words, it is defined as the population of all the individuals or objects that are
finite. For statistical analysis, the finite population is more advantageous than the infinite
population. Examples of finite populations are the employees of a company or the potential consumers in a market.
Infinite Population
The infinite population is also known as an uncountable population in which the counting of
units in the population is not possible. An example of an infinite population is the number of germs in a patient’s body, which is uncountable.
Existent Population
The existent population is defined as the population of concrete individuals. In other words, a population whose units are available in physical (solid) form is known as an existent population. Examples are books, students, etc.
Hypothetical Population
A population whose units are not available in physical (solid) form is known as a hypothetical population. A population consists of sets of observations, objects, etc., that all have something in common. In some situations, populations are only hypothetical. Examples are the outcomes of rolling a die or of tossing a coin.
Sample
A sample includes one or more observations that are drawn from the population, and a measurable characteristic of a sample is called a statistic. Sampling is the process of selecting a sample from the population. For example, a group of people living in India is a sample of the population of India.
Basically, there are two types of sampling. They are:
Probability sampling
Non-probability sampling
Probability Sampling
In probability sampling, the population units are not selected at the discretion of the researcher. Selection follows defined procedures that ensure every unit of the population has a fixed, known probability of being included in the sample. Such a method is also called random sampling. The main probability sampling techniques are simple random sampling, systematic sampling, stratified sampling, and cluster sampling; techniques such as quota sampling, judgement sampling, and purposive sampling are, in contrast, non-probability sampling methods.
Population and Sample Examples
All the people who have ID proofs form the population, and the group of people who have only a voter ID with them is a sample.
All the students in the class form the population, whereas the top 10 students in the class are a sample.
All the members of parliament form the population, and the female members among them are a sample.
Population and Sample Formulas
We list here the formulas for mean absolute deviation (MAD), variance and standard deviation for a population and for a sample. Suppose N denotes the size of the population and n denotes the size of a sample drawn from it; the population formulas divide by N, whereas the sample formulas for variance and standard deviation divide by n − 1. With μ the population mean and x̄ the sample mean:
MAD is the average of the absolute deviations of the values from the mean (μ for a population, x̄ for a sample).
Population variance: σ² = Σ(x − μ)² / N; sample variance: s² = Σ(x − x̄)² / (n − 1)
Population standard deviation: σ = √(σ²); sample standard deviation: s = √(s²)
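As a quick illustration of the N versus n − 1 convention, the following short Python sketch computes these quantities for a small made-up data set (the values are placeholders, not data from the text):
import statistics

data = [4, 8, 6, 5, 3, 7]

# Population formulas divide by N; sample formulas divide by n - 1.
pop_variance = statistics.pvariance(data)     # sigma^2 = sum((x - mu)^2) / N
pop_std_dev = statistics.pstdev(data)         # sigma
sample_variance = statistics.variance(data)   # s^2 = sum((x - xbar)^2) / (n - 1)
sample_std_dev = statistics.stdev(data)       # s

# Mean absolute deviation: average absolute distance of each value from the mean.
mean = statistics.mean(data)
mad = sum(abs(x - mean) for x in data) / len(data)

print(pop_variance, sample_variance, pop_std_dev, sample_std_dev, mad)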
3.2 SAMPLES
1. What is the Two-Sample t-test? The two-sample t-test is a hypothesis test that compares the
means of two independent groups to determine if they are statistically different. It is
specifically used when the data follow a normal distribution, and the variances of the two
groups are assumed to be equal.
2. Null and Alternative Hypotheses: Before conducting a two-sample t-test, we must establish
the null hypothesis (H0) and the alternative hypothesis (Ha). The null hypothesis assumes
that there is no significant difference between the means of the two groups, while the
alternative hypothesis suggests the presence of a significant difference.
3. Assumptions of the Two-Sample t-test: To ensure accurate results, the two-sample t-test
relies on several assumptions: a) The data in each group are independent and randomly
sampled. b) The data in each group follow a normal distribution. c) The variances of the two
groups are equal.
4. Calculating the Test Statistic: The test statistic for the two-sample t-test is calculated using
the formula:
t = (x1 − x2) / √((s1² / n1) + (s2² / n2))
where x1 and x2 are the sample means, s1 and s2 are the sample standard deviations, and n1 and
n2 are the sample sizes of the two groups.
Degrees of Freedom and Critical Value: The degrees of freedom (df) for the two-sample t-test is
calculated using the formula:
df = n1 + n2 − 2
The critical value is determined based on the desired significance level (e.g., 0.05) and the
degrees of freedom. If the test statistic exceeds the critical value, we reject the null hypothesis.
Interpreting the Results: If the p-value associated with the two-sample t-test is
less than the chosen significance level, typically 0.05, we reject the null hypothesis and
conclude that there is a significant difference between the means of the two groups.
Conversely, if the p-value is greater than the significance level, we fail to reject the null
hypothesis, indicating no significant difference.
Practical Applications in Data Science: The two-sample t-test finds applications in various
data science scenarios, including:
a) A/B testing: Comparing the performance of two different versions of a website or
application.
b) Market research: Analyzing the preferences of two different customer segments.
c) Medical research: Comparing the effectiveness of two treatment groups.
d) Quality control: Assessing if changes in manufacturing processes lead to significant
differences in product quality.
Python Implementation: Here’s an example implementation of the two-sample t-test in Python
using the SciPy library:
import scipy.stats as stats
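# Hypothetical sample data for the two groups; these values are illustrative
# placeholders, not data from the text.
group1 = [14.1, 15.3, 13.8, 16.2, 14.9, 15.7, 14.4, 15.0]
group2 = [12.9, 13.5, 14.0, 12.7, 13.8, 13.1, 14.2, 13.4]

# Independent two-sample t-test (equal variances assumed by default).
t_statistic, p_value = stats.ttest_ind(group1, group2)

print("t-statistic:", t_statistic)
print("p-value:", p_value)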
In this example, we have two groups, group1 and group2, with sample data representing some
metric of interest. The ttest_ind() function from the SciPy library is used to calculate the two-
sample t-test. It takes the two groups as input and returns the t-statistic and p-value.
The t-statistic measures the difference between the means of the two groups relative to the
variation within each group. A larger absolute t-statistic suggests a larger difference between
the group means. The p-value indicates the probability of obtaining a difference as extreme
as observed (or more extreme) if the null hypothesis is true. A smaller p-value indicates
stronger evidence against the null hypothesis.
By executing this code, you will obtain the t-statistic and p-value, which you can interpret to
draw conclusions about the significance of the difference between the means of the two
groups.
Conclusion: The two-sample t-test is a fundamental statistical tool in data science that allows
us to compare the means of two independent groups. By following the correct procedures
and interpreting the results appropriately, data scientists can make informed decisions based
on statistically significant differences.
Understanding the applications and limitations of the two-sample t-test empowers data
scientists to draw reliable conclusions from their analyses and contribute to evidence-based
decision-making processes. The Python implementation using the SciPy library provides a
practical way to perform the two-sample t-test and obtain the necessary statistics for further
analysis.
In summary, the two-sample t-test serves as a valuable tool for data scientists in various
domains, enabling them to gain insights and make data-driven decisions confidently.
Random sampling, or probability sampling, is a sampling method that allows for the
randomization of sample selection, i.e., each sample has the same probability as other samples to
be selected to serve as a representation of an entire population.
1. Simple random sampling
In simple random sampling, every member of the population has an exactly equal chance of being selected; this method is described in detail below.
2. Systematic sampling
Systematic sampling is the selection of specific individuals or members from an entire
population. The selection often follows a predetermined interval (k). The systematic sampling
method is comparable to the simple random sampling method; however, it is less complicated to
conduct.
3. Stratified sampling
Stratified sampling, which includes the partitioning of a population into subclasses with notable
distinctions and variances. The stratified sampling method is useful, as it allows the researcher to
make more reliable and informed conclusions by confirming that each respective subclass has
been adequately represented in the selected sample.
4. Cluster sampling
Cluster sampling, which, similar to the stratified sampling method, includes dividing a
population into subclasses. Each of the subclasses should portray comparable characteristics to
the entire selected sample. This method entails the random selection of a whole subclass, as
opposed to the sampling of members from each subclass. This method is ideal for studies that
involve widely spread populations.
A simple random sample is a randomly selected subset of a population. In this sampling method,
each member of the population has an exactly equal chance of being selected.
This method is the most straightforward of all the probability sampling methods, since it only
involves a single random selection and requires little advance knowledge about the population.
Because it uses randomization, any research performed on this sample should have high internal
and external validity, and be at a lower risk for research biases like sampling bias and selection
bias.
When to use simple random sampling
Simple random sampling is used to make statistical inferences about a population. It helps ensure
high internal validity: randomization is the best method to reduce the impact of potential
confounding variables.
In addition, with a large enough sample size, a simple random sample has high external validity:
it represents the characteristics of the larger population.
However, simple random sampling can be challenging to implement in practice. To use this
method, there are some prerequisites:
You have a complete list of every member of the population.
You can contact or access each member of the population if they are selected.
You have the time and resources to collect data from the necessary sample size.
Simple random sampling works best if you have a lot of time and resources to conduct your
study, or if you are studying a limited population that can easily be sampled.
In some cases, it might be more appropriate to use a different type of probability sampling:
Systematic sampling involves choosing your sample based on a regular interval, rather than a fully random selection. It can also be used when you don’t have a complete list of the population.
Stratified sampling is appropriate when you want to ensure that specific characteristics are proportionally represented in the sample. You split your population into strata (for example, divided by gender or race), and then randomly select from each of these subgroups.
Cluster sampling is appropriate when you are unable to sample from the entire population. You
divide the sample into clusters that approximately reflect the whole population, and then choose
your sample from a random selection of these clusters.
Step 1: Define the population
It’s important to ensure that you have access to every individual member of the population, so
that you can collect data from all those who are selected for the sample.
Example: Population
In the American Community Survey, the population is all 128 million households who live in the
United States (including households made up of citizens and non-citizens alike).
Step 2: Decide on the sample size
Next, you need to decide how large your sample size will be. Although larger samples provide
more statistical certainty, they also cost more and require far more work.
There are several potential ways to decide upon the size of your sample, but one of the simplest
involves using a formula with your desired confidence interval and confidence level, estimated
size of the population you are working with, and the standard deviation of whatever you want to
measure in your population.
The most commonly used values are a 0.05 margin of error (confidence interval) and a 0.95 confidence level, respectively. Since you
may not know the standard deviation of the population you are studying, you should choose a
number high enough to account for a variety of possibilities (such as 0.5).
You can then use a sample size calculator to estimate the necessary sample size.
Step 3: Randomly select your sample
In the lottery method, you choose the sample at random by “drawing from a hat” or by using a
computer program that will simulate the same action.
In the random number method, you assign every individual a number. By using a random
number generator or random number tables, you then randomly pick a subset of the population.
You can also use the random number function (RAND) in Microsoft Excel to generate random
numbers.
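As a minimal sketch of the random number method in code rather than Excel (the population size and sample size below are assumptions for illustration):
import random

population = list(range(1, 1001))   # assign every individual a number from 1 to 1,000
sample_size = 50

simple_random_sample = random.sample(population, sample_size)  # each member has an equal chance
print(simple_random_sample)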
Example: Random selection
The Census Bureau randomly selects addresses of 295,000 households monthly (or 3.5 million
per year). Each address has approximately a 1-in-480 chance of being selected.
Step 4: Collect data from your sample
Finally, you should collect data from your sample.
To ensure the validity of your findings, you need to make sure every individual selected actually
participates in your study. If some drop out or do not participate for reasons associated with the
question that you’re studying, this could bias your findings.
For example, if young participants are systematically less likely to participate in your study, your
findings might not be valid due to the underrepresentation of this group.
SAMPLING DISTRIBUTION
A lot of data drawn and used are actually samples rather than populations. A sample is a subset
of a population. Put simply, a sample is a smaller part of a larger group. As such, this smaller
portion is meant to be representative of the population as a whole.
Sampling distributions (or the distribution of data) are statistical metrics that determine whether
an event or certain outcome will take place. This distribution depends on a few different factors,
including the sample size, the sampling process involved, and the population as a whole. There are a few steps involved in constructing a sampling distribution; these are described later in this section.
Once the information is gathered, plotted, and analyzed, researchers can make inferences and
conclusions. This can help them make decisions about what to expect in the future. For instance,
governments may be able to invest in infrastructure projects based on the needs of a certain
community or a company may decide to proceed with a new business venture if the sampling
distribution suggests a positive outcome.
Special Considerations
The number of observations in a population, the number of observations in a sample, and the
procedure used to draw the sample sets determine the variability of a sampling distribution. The
standard deviation of a sampling distribution is called the standard error.
While the mean of a sampling distribution is equal to the mean of the population, the standard
error depends on the standard deviation of the population, the size of the population, and the size
of the sample.
Knowing how spread apart the means of the sample sets are from each other and from the population mean gives an indication of how close the sample mean is to the population mean.
The standard error of the sampling distribution decreases as the sample size increases.
Now suppose researchers take repeated random samples from the general population and compute the
sample mean for each sample group instead. So, for North America, they pull data for 100
newborn weights recorded in the U.S., Canada, and Mexico as follows:
Sampling Distribution of the Mean: This method shows a normal distribution where the middle
is the mean of the sampling distribution. As such, it represents the mean of the overall
population. In order to get to this point, the researcher must figure out the mean of each sample
group and map out the individual data.
Sampling Distribution of Proportion: This method involves choosing a sample set from the
overall population to get the proportion of the sample. The mean of the proportions ends up
becoming the proportions of the larger group.
T-Distribution: This type of sampling distribution is common in cases of small sample sizes. It
may also be used when there is very little information about the entire population. T-distributions
are used to make estimates about the mean and other statistical points.
Following our example, the population average weight of babies in North America and in South
America has a normal distribution because some babies will be underweight (below the mean) or
overweight (above the mean), with most babies falling in between (around the mean). If the
average weight of newborns in North America is seven pounds, the sample mean weight in each
of the 12 sets of sample observations recorded for North America will be close to seven pounds
as well.
But if you graph each of the averages calculated in each of the 1,200 sample groups, the resulting
shape may result in a uniform distribution, but it is difficult to predict with certainty what the
actual shape will turn out to be. The more samples the researcher uses from the population of
over a million weight figures, the more the graph will start forming a normal distribution.
What Is a Mean?
A mean is a metric used in statistics and research. It is the average of at least two numbers. The arithmetic mean is determined by adding up all the numbers and dividing the result by the count of numbers in that set. The geometric mean is determined by multiplying the values of a data set together and taking the nth root of the product, where n is the number of values in that data set.
A sampling distribution depends on three primary factors: the sample size (n), the population as a whole (N), and the sampling process. So how does it work?
Choose a random sample from the given population.
Calculate a statistic from that group, such as the standard deviation, mean or median.
Construct a frequency distribution of each sample statistic.
Plot the frequency distribution of each sample statistic on a graph. The resulting graph is the
sampling distribution.
For example, if you randomly sample data three times and determine the mean, or the average, of
each sample, all three means are likely to be different and fall somewhere along the graph. That's
variability. You do that many times, and eventually the data you plot may look like a bell curve.
That process is a sampling distribution.
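The process just described can be sketched in Python. The simulated population below (normally distributed weights around seven pounds) and the sample counts echo the example above, but all numbers are assumptions for illustration:
import random
import statistics

random.seed(42)
population = [random.gauss(7.0, 1.2) for _ in range(1_000_000)]  # simulated birth weights (pounds)

sample_means = []
for _ in range(1200):                        # 1,200 sample groups
    sample = random.sample(population, 100)  # 100 observations per sample
    sample_means.append(statistics.mean(sample))

# The distribution of these 1,200 sample means is the sampling distribution of the mean.
print(statistics.mean(sample_means), statistics.stdev(sample_means))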
STANDARD ERROR OF THE MEAN
The standard error of the mean, or simply standard error, indicates how different the
population mean is likely to be from a sample mean. It tells you how much the sample mean
would vary if you were to repeat a study using new samples from within a single population.
The standard error of the mean (SE or SEM) is the most commonly reported type of standard
error. But you can also find the standard error for other statistics, like medians or proportions.
The standard error is a common measure of sampling error—the difference between a
population parameter and a sample statistic.
In statistics, data from samples is used to understand larger populations. Standard error
matters because it helps you estimate how well your sample data represents the whole
population.
With probability sampling, where elements of a sample are randomly selected, you can
collect data that is likely to be representative of the population. However, even with
probability samples, some sampling error will remain. That’s because a sample will never
perfectly match the population it comes from in terms of measures like means and standard
deviations.
By calculating standard error, you can estimate how representative your sample is of your
population and make valid conclusions.
A high standard error shows that sample means are widely spread around the population
mean—your sample may not closely represent your population. A low standard error shows
that sample means are closely distributed around the population mean—your sample is
representative of your population.
You can decrease standard error by increasing sample size. Using a large, random sample is
the best way to minimize sampling bias.
The standard deviation is a descriptive statistic that can be calculated from sample data. In
contrast, the standard error is an inferential statistic that can only be estimated (unless the
real population parameter is known).
The standard deviation of a sample of math SAT scores tells you how spread out the individual scores are around the sample mean. The standard error of the math scores, on the other hand, tells you how much the sample mean score of 550 would differ from other sample mean scores, in samples of equal size, drawn from the population of all test takers in the region.
From the formula, you’ll see that the sample size is inversely proportional to the standard
error. This means that the larger the sample, the smaller the standard error, because the
sample statistic will be closer to approaching the population parameter.
Different formulas are used depending on whether the population standard deviation is
known. These formulas work for samples with more than 20 elements (n > 20).
When the population standard deviation is known
SE = σ / √n
where:
SE is the standard error
σ (sigma) is the population standard deviation
n is the number of elements in the sample
When population parameters are unknown
When the population standard deviation is unknown, you can use the below formula to only
estimate standard error. This formula takes the sample standard deviation as a point estimate
for the population standard deviation.
SE = s / √n
where:
SE is standard error
s is sample standard deviation
n is the number of elements in the sample
Example: Using the standard error formula
To estimate the standard error for math SAT scores, you follow two steps.
First, find the square root of your sample size (n).
Next, divide the sample standard deviation by the number you found in step one.
With a 95% confidence level, 95% of all sample means will be expected to lie within a
confidence interval of ± 1.96 standard errors of the sample mean.
Based on random sampling, the true population parameter is also estimated to lie within this
range with 95% confidence.
x̄ ± (1.96 × SE)
With random sampling, a 95% CI of [525, 575] tells you that there is a 0.95 probability that the population mean math SAT score is between 525 and 575.
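A minimal sketch of the SE and 95% CI calculation follows. The sample mean of 550 comes from the text; the sample standard deviation and sample size are hypothetical, chosen so the result roughly matches the [525, 575] interval mentioned above:
import math

sample_mean = 550   # from the text
s = 100             # assumed sample standard deviation
n = 61              # assumed sample size

se = s / math.sqrt(n)                # standard error of the mean
lower = sample_mean - 1.96 * se      # 95% CI lower limit
upper = sample_mean + 1.96 * se      # 95% CI upper limit
print(round(se, 2), round(lower, 1), round(upper, 1))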
Hypothesis testing is the detective work of statistics, where evidence is scrutinized to determine
the truth behind claims. From unraveling mysteries in science to guiding decisions in business,
this method empowers researchers to make sense of data and draw reliable conclusions. In this
article, we’ll explore the fascinating world of hypothesis testing, uncovering its importance and
practical applications in data analytics.
In this comprehensive guide, we will be learning the theory and types of hypothesis testing.
Additionally, we will be taking sample problem statements and solving them step-by-step using
hypothesis testing. We will be using Python as the programming language.
Hypothesis testing is a part of statistical analysis and machine learning, where we test the
assumptions made regarding a population parameter.
Scientific research: Testing the effectiveness of a new drug, evaluating the impact of a treatment
on patient outcomes, or examining the relationship between variables in a study.
Quality control: Assessing whether a manufacturing process meets specified standards or
determining if a product’s performance meets expectations.
Business decision-making: Investigating the effectiveness of marketing strategies, analyzing
customer preferences, or testing hypotheses about financial performance.
Social sciences: Studying the effects of interventions on societal outcomes, examining attitudes
and behaviors, or testing theories about human behavior.
Note: Don’t be confused between the terms Parameter and Statistic.
A Parameter is a number that describes the data from the population whereas, a Statistic is a
number that describes the data from a sample.
1. Null Hypothesis (H0): The null hypothesis is a statistical theory that suggests no statistical significance exists between the populations. It is denoted by H0 and read as H-naught.
2. Alternative Hypothesis (Ha): The alternative hypothesis is the statement that contradicts the null hypothesis, suggesting that a statistically significant difference does exist. It is denoted by Ha (or H1).
Note: H0 must always contain equality(=). Ha always contains difference(≠, >, <).
For example, if we were to test the equality of average means (µ) of two groups:
for a two-tailed test, we define H0: µ1 = µ2 and Ha: µ1≠µ2
for a one-tailed test, we define H0: µ1 = µ2 and Ha: µ1 > µ2 or Ha: µ1 < µ2
3. Test Statistic: The test statistic is the deciding factor for rejecting or retaining the Null Hypothesis, and its form depends on the test that we run. The four main test statistics are the z-statistic (Z-test), the t-statistic (t-test), the F-statistic (ANOVA), and the chi-square statistic (chi-square test).
4. Significance Level (α): The significance level, often denoted by α (alpha), represents the
probability of rejecting the null hypothesis when it is actually true. Commonly used significance
levels include 0.05 and 0.01, indicating a 5% and 1% chance of Type I error, respectively.
5. P-value: The p-value is the probability (assuming the Null Hypothesis is true) of obtaining a result at least as extreme as the observed test statistic. It is denoted by the letter p.
6. Critical Value: Denoted by C and it is a value in the distribution beyond which leads to the
rejection of the Null Hypothesis. It is compared to the test statistic.
Now, assume we are running a two-tailed Z-Test at 95% confidence. Then, the level of significance
(α) = 5% = 0.05. Thus, we will have (1-α) = 0.95 proportion of data at the center, and α = 0.05
proportion will be equally shared to the two tails. Each tail will have (α/2) = 0.025 proportion of
data.
The critical value, Z(α/2) = 1.96, is read from the Z-scores table.
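The same critical values can be obtained programmatically instead of from a Z-score table; a small sketch using SciPy:
from scipy.stats import norm

alpha = 0.05
z_two_tailed = norm.ppf(1 - alpha / 2)   # ≈ 1.96, cuts off 0.025 in each tail
z_one_tailed = norm.ppf(1 - alpha)       # ≈ 1.645, cuts off 0.05 in one tail
print(z_two_tailed, z_one_tailed)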
Formulate Hypotheses: State the null hypothesis and the alternative hypothesis.
Choose Significance Level (α): Select a significance level (α), which determines the threshold for
rejecting the null hypothesis. Commonly used significance levels include 0.05 and 0.01.
Select Appropriate Test: Choose a statistical test based on the research question, type of data, and
assumptions. Common tests include t-tests, chi-square tests, ANOVA, correlation tests, and
regression analysis, among others.
Collect Data and Calculate Test Statistic: Collect relevant sample data and calculate the appropriate
test statistic based on the chosen statistical test.
Determine Critical Region: Define the critical region or rejection region based on the chosen
significance level and the distribution of the test statistic.
Calculate P-value: Determine the probability of observing a test statistic as extreme as, or more
extreme than, the one obtained from the sample data, assuming the null hypothesis is true. The p-
value is compared to the significance level to make decisions about the null hypothesis.
Make Decision: If the p-value is less than or equal to the significance level (p ≤ α), reject the null
hypothesis in favor of the alternative hypothesis. If the p-value is greater than the significance
level (p > α), fail to reject the null hypothesis.
Draw Conclusion: Interpret the results based on the decision made in step 7. Provide implications of
the findings in the context of the research question or problem.
Check Assumptions and Validate Results: Assess whether the assumptions of the chosen statistical
test are met. Validate the results by considering the reliability of the data and the appropriateness
of the statistical analysis.
By following these steps systematically, researchers can conduct hypothesis tests, evaluate the
evidence, and draw valid conclusions from their analyses.
Decision Rules
The two methods of concluding the Hypothesis test are using the Test-statistic value and p-value.
In both methods, we start assuming the Null Hypothesis to be true, and then we reject the Null
hypothesis if we find enough evidence.
Power of test: The probability of rejecting a False Null Hypothesis i.e., the ability of the test to
detect a difference. It is denoted as (1-β) and its value lies between 0 and 1.
Type I error: Occurs when we reject a True Null Hypothesis and is denoted as α.
Type II error: Occurs when we accept a False Null Hypothesis and is denoted as β.
When dealing with continuous data, several common hypothesis tests are used, depending on the
research question and the characteristics of the data. Some of the most widely used hypothesis
tests for continuous data include:
One-Sample t-test: Used to compare the mean of a single sample to a known value or
hypothesized population mean.
Paired t-test: Compares the means of two related groups (e.g., before and after treatment) to
determine if there is a significant difference.
Independent Samples t-test: Compares the means of two independent groups to determine if
there is a significant difference between them.
Analysis of Variance (ANOVA): Used to compare means across three or more independent
groups to determine if there are any statistically significant differences.
Correlation Test (Pearson’s correlation coefficient): Determines if there is a linear relationship
between two continuous variables.
Regression Analysis: Evaluates the relationship between one dependent variable and one or more
independent variables.
When dealing with discrete data, several common hypothesis tests are used to analyze differences between groups, associations, or proportions. Some of the most widely used hypothesis tests for discrete data include the chi-square test of independence, the chi-square goodness-of-fit test, and one- and two-sample proportion (z) tests.
Type I error (False Positive): This happens when one incorrectly rejects the null hypothesis,
indicating a significant result when no true effect or difference exists in the population being
studied.
Type II error (False Negative): This occurs when one fails to reject the null hypothesis despite
the presence of a true effect or difference in the population.
These errors represent the trade-off between making incorrect conclusions and the risk of
missing important findings in hypothesis testing.
The Z-test is a statistical hypothesis test used to determine whether the distribution of the test statistic we are measuring, like the mean, is part of the normal distribution.
While there are multiple types of Z-tests, we’ll focus on the easiest and most well-known one,
the one-sample mean test. This is used to determine if the difference between the mean of a
sample and the mean of a population is statistically significant.
What Is a Z-Test?
A Z-test determines whether there are any statistically significant differences between the means
of two populations. A Z-test can only be applied if the standard deviation of each population is
known and a sample size of at least 30 data points is available.
The name Z-test comes from the Z-score of the normal distribution. This is a measure of how many standard deviations a raw score or sample statistic is from the population’s mean. Z-tests are among the most common statistical tests conducted in fields such as healthcare and data science, making them essential to understand.
Requirements for a Z-Test
In order to conduct a Z-test, your statistics need to meet a few requirements:
A sample size that’s greater than 30. This is because we want to ensure our sample mean
comes from a distribution that is normal. As stated by the central limit theorem, any
distribution can be approximated as normally distributed if it contains more than 30 data
points.
The standard deviation and mean of the population are known.
The sample data is collected/acquired randomly.
Z-Test Steps
There are four steps to complete a Z-test. Let’s examine each one:
1. State the Null Hypothesis
The first step in a Z-test is to state the null hypothesis, H_0. This is what you believe to be true
from the population, which could be the mean of the population, μ_0:
2. State the Alternate Hypothesis
Next, state the alternate hypothesis, H_1: what you believe to be true instead of the null hypothesis (for example, that the population mean differs from μ_0).
3. Determine the Critical Value and Compute the Test Statistic
Choose a significance level, look up the corresponding critical value, and compute the test statistic z = (x̄ − μ_0) / (σ / √n).
4. Draw a Conclusion
If the test statistic is greater (or lower depending on the test we are conducting) than the critical
value, then the alternate hypothesis is true because the sample’s mean is statistically
significant enough from the population mean.
Another way to think about this is if the sample mean is so far away from the population mean,
the alternate hypothesis has to be true or the sample is a complete anomaly.
3.7 DECISION RULE – CALCULATIONS – DECISIONS
The decision to either reject or not to reject a null hypothesis is guided by the distribution the
test statistic assumes. This means that if the variable involved follows a normal distribution,
we use the level of significance of the test to come up with critical values that lie along the
standard normal distribution.
Note that before one makes a decision to reject or not to reject a null hypothesis, one must
consider whether the test should be one-tailed or two-tailed. This is because the number of
tails determines the value of α (significance level). The following is a summary of the
decision rules under different scenarios.
Left One-tailed Test
H1: Parameter < X
Decision rule: Reject H0 if the test statistic is less than the critical value. Otherwise, do not reject H0.
Right One-tailed Test
H1: Parameter > X
Decision rule: Reject H0 if the test statistic is greater than the critical value. Otherwise, do not reject H0.
Two-tailed Test
H1: Parameter ≠ X (not equal to X)
Decision rule: Reject H0 if the test statistic is greater than the upper critical value or less than the lower critical value. Otherwise, do not reject H0.
Critical values link confidence intervals to hypothesis tests. For example, to construct a 95%
confidence interval assuming a normal distribution, we would need to determine the critical
values that correspond to a 5% significance level. Similarly, if we were to conduct a test of
some given hypothesis at the 5% significance level, we would use the same critical values
used for the confidence interval to subdivide the distribution space into rejection and non-
rejection regions.
Example: Hypothesis Testing
A survey carried out using a sample of 50 Level I candidates reveals an average IQ of 105.
Assuming that IQs are distributed normally, carry out a statistical test to determine whether
the mean IQ is greater than 100. You are instructed to use a 5% level of significance.
(Previous studies give a standard deviation of IQs of approximately 20.)
Solution
First, state the hypothesis:
H0: μ = 100 vs H1: μ > 100
Since IQs follow a normal distribution, under H0, (X̄ − 100) / (σ/√n) ~ N(0, 1).
Next, we compute the test statistic: (105 − 100) / (20/√50) = 1.768.
This is a right one-tailed test, and IQs are distributed normally. Therefore, we should
compare our test statistic to the upper 5% point of the normal distribution.
From the normal distribution table, this value is 1.6449. Since 1.768 is greater than 1.6449,
we have sufficient evidence to reject the H0 at the 5% significance level. Therefore, it
is reasonable to conclude that the mean IQ of CFA candidates is greater than 100.
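A short Python check of this worked example; the numbers n = 50, sample mean 105, σ = 20, and H0: μ = 100 are taken from the example above:
import math
from scipy.stats import norm

n, sample_mean, mu0, sigma = 50, 105, 100, 20

z = (sample_mean - mu0) / (sigma / math.sqrt(n))   # ≈ 1.768
z_critical = norm.ppf(0.95)                        # ≈ 1.6449 (right one-tailed, 5% level)

print(z, z_critical, "reject H0" if z > z_critical else "fail to reject H0")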
Statistical Significance vs. Economic Significance
Statistical significance refers to the use of a sample to carry out a statistical test meant to
reveal any significant deviation from the stated null hypothesis. We then decide whether to
reject or not reject the null hypothesis.
Economic significance entails the statistical significance and the economic effect inherent in
the decision made after data analysis and testing.
The need to separate statistical significance from economic significance arises because some
statistical results may be significant on paper but not economically meaningful. The
difference from the hypothesized value may carry some statistical weight but lack economic
feasibility, making implementation of the results very unlikely. Perhaps an example can help
you gain a deeper understanding of the two concepts.
Example: Statistical Significance and Economic Feasibility
A well-established pharmaceutical company wishes to assess the effectiveness of a newly
developed drug before commercialization. The company’s board of directors commissions a
pilot test. The drug is administered to a few patients to whom none of the existing drugs has
been prescribed. A statistical test follows and reveals a significant decrease in the average
number of days taken before full recovery. The company considers the evidence sufficient to
conclude that the new drug is more effective than existing alternatives.
However, the production of the new drug is significantly more expensive because of the
scarcity of the active ingredient. Furthermore, the company would have to engage in a year-
long lobbying exercise to convince the Food and Drug Administration and the general public
that the drug is indeed an improvement to the existing brands. At the end of the day, the
management decides to delay the commercialization of the drug because of the higher
production and introduction costs.
Other factors that may affect the economic feasibility of statistical results include:
Tax: Financial institutions generally avoid projects that may increase the tax payable.
Shareholders: They are often trying to increase returns on investment from one year to
the next, not taking into account the long run because their investment horizon is too
short.
Risk: We may have a statistically significant project that is too risky. Projects that are
capital intensive are, in the long term, particularly, very risky. In fact, the additional risk
is excluded from statistical tests.
Evidence of returns based solely on statistical analysis may not be enough to guarantee the
implementation of a project. In particular, large samples may produce results that have high
statistical significance but very low applicability.
Question
Using the same sample of 50 Level I candidates (sample mean IQ of 105, σ = 20), test at the 5% level of significance whether the mean IQ is greater than 102. Which of the following conclusions is most accurate?
A.
There is sufficient evidence to justify the rejection of the H0 and inform the conclusion
that the average IQ is greater than 102.
B.
There is insufficient evidence to justify the rejection of the H0 and guide the conclusion
that the average IQ is not more than 102.
C.
There is sufficient evidence to justify the rejection of the H0 and inform the conclusion
that the average IQ is greater than 102.
Solution
The correct answer is B.
Just like in the example above, start with the statement of the
hypothesis; H0: μ = 102 vs. H1: μ > 102
The test statistic is (105 − 102) / (20/√50) = 1.061.
Again, this is a right one-tailed test but this time, 1.061 is less than the upper 5% point of a
standard normal distribution (1.6449). Therefore, we do not have sufficient evidence to reject
the H0 at the 5% level of significance. It is, therefore, reasonable to conclude that the average
IQ of CFA candidates is not more than 102.
There are many applications of decision rules in business and finance, including:
Credit card companies use decision rules to approve credit card applications.
Retailers use associative rules to understand customers' habits and preferences (market
basket analysis) and apply the finding to launch effective promotions and advertising.
Banks use decision rules induced from data about bankrupt and non-bankrupt firms to
support credit-granting decisions.
Telemarketing and direct marketing companies use decision rules to reduce the number
of calls made and increase the ratio of successful calls.
DESCRIBING AND COMPARING INFORMATION ATTRIBUTES
The examples (information) from which decision rules are induced are expressed in terms of
some characteristic attributes. For instance, companies could be described by the following
attributes: sector of activity, localization, number of employees, total assets, profit, and risk
rating. From the viewpoint of conceptual content, attributes can be one of the following
types:
Qualitative attributes (symbolic, categorical, or nominal), including sector of activity or
localization
Quantitative attributes, including number of employees or total assets
Criteria or attributes whose domains are preferentially ordered, including profit, because
a company having large profit is preferred to a company having small profit or even loss
The objects are compared differently depending on the nature of the attributes considered. More
precisely, with respect to qualitative attributes, the objects are compared on the basis of an
indiscernibility relation: two objects are indiscernible if they have the same evaluation with respect to
the considered attributes. The indiscernibility relation is reflexive (i.e., each object is indiscernible
with itself), symmetric (if object A is indiscernible with object B, then object B also is indiscernible
with object A), and transitive (if object A is indiscernible with object B and object B is indiscernible
with object C, then object A also is indiscernible with object C). Therefore, the indiscernibility
relation is an equivalence relation.
With respect to quantitative attributes, the objects are compared on the basis of a similarity relation: two objects are considered similar if their evaluations differ by no more than a fixed threshold. For instance, with respect to the attribute “number of employees,” fixing a threshold at 10 percent, Company A having 2,710 employees is similar to Company B having 3,000 employees. The similarity relation is reflexive, but neither symmetric nor transitive; abandoning the transitivity requirement is easily justifiable, remembering, for example, Luce's paradox of the cups of tea (Luce, 1956). As for symmetry, one should notice that the proposition yRx, which means “y is similar to x,” is directional; there is a subject y and a referent x, and in general this is not equivalent to the proposition “x is similar to y.”
3.8 INTERPRETATIONS
Mean is an average value of a particular data set obtained or calculated by dividing the sum of
the values within that data set by the number of values within that same set.
Standard Deviation is a technique used to ascertain how responses align with or deviate from the average value, or mean. It relies on the mean to describe the consistency of the replies within a particular data set. You can use this when calculating the average pay for a certain profession and then displaying the upper and lower values in the data set.
Informed decision-making. The managing board must examine the data to take action and
implement new methods. This emphasizes the significance of well-analyzed data as well as a
well-structured data collection process.
Anticipating needs and identifying trends. Data analysis provides users with relevant insights
that they can use to forecast trends. It would be based on customer concerns and expectations.
For example, a large number of people are concerned about privacy and the leakage of personal
information. Products that provide greater protection and anonymity are more likely to become
popular.
Clear foresight. Companies that analyze and aggregate data better understand their own
performance and how consumers perceive them. This provides them with a better understanding
of their shortcomings, allowing them to work on solutions that will significantly improve their
performance.
Point estimators are functions that are used to find an approximate value of a population
parameter from random samples of the population. They use the sample data of a population to
calculate a point estimate or a statistic that serves as the best estimate of an unknown parameter
of a population.
Most often, the existing methods of finding the parameters of large populations are unrealistic.
For example, when finding the average age of kids attending kindergarten, it will be impossible
to collect the exact age of every kindergarten kid in the world. Instead, a statistician can use the
point estimator to make an estimate of the population parameter.
1. Bias
The bias of a point estimator is defined as the difference between the expected value of the estimator and the value of the parameter being estimated. When the expected value of the estimator and the value of the parameter being estimated are equal, the estimator is considered unbiased.
Also, the closer the expected value of a parameter is to the value of the parameter being
measured, the lesser the bias is.
2. Consistency
Consistency tells us how close the point estimator stays to the value of the parameter as the sample size increases. The point estimator requires a large sample size for it to be more consistent and accurate.
We can also check if a point estimator is consistent by looking at its corresponding expected value and variance. For the point estimator to be consistent, the expected value should move toward the true value of the parameter, and the variance should shrink toward zero, as the sample size increases.
3. Efficiency
The most efficient point estimator is the one with the smallest variance among unbiased estimators. Generally, the efficiency of the estimator depends on the distribution of the population. For
example, in a normal distribution, the mean is considered more efficient than the median, but the
same does not apply in asymmetrical distributions.
On the other hand, interval estimation uses sample data to calculate the interval of the possible
values of an unknown parameter of a population. The interval of the parameter is selected in a
way that it falls within a 95% or higher probability, also known as the confidence interval.
The confidence interval is used to indicate how reliable an estimate is, and it is calculated from
the observed data. The endpoints of the intervals are referred to as the upper and lower
confidence limits.
1. Method of moments
The method of moments of estimating parameters was introduced in 1887 by Russian
mathematician Pafnuty Chebyshev. It starts by taking known facts about a population and then
applying the facts to a sample of the population. The first step is to derive equations that relate
the population moments to the unknown parameters.
The next step is to draw a sample of the population to be used to estimate the population
moments. The equations derived in step one are then solved using the sample mean of the
population moments. This produces the best estimate of the unknown population parameters.
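A minimal sketch of the method of moments for a normally distributed population, where the first two sample moments are matched to the population mean and variance; the simulated data below are assumptions for illustration:
import numpy as np

rng = np.random.default_rng(0)
sample = rng.normal(loc=10.0, scale=2.0, size=500)   # simulated sample from the population

mu_hat = sample.mean()                        # first sample moment -> estimate of the mean
sigma2_hat = ((sample - mu_hat) ** 2).mean()  # second central sample moment -> estimate of the variance

print(mu_hat, sigma2_hat)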
2. Maximum likelihood estimator
The maximum likelihood estimator method of point estimation attempts to find the unknown
parameters that maximize the likelihood function. It takes a known model and uses the values to
compare data sets and find the most suitable match for the data.
For example, a researcher may be interested in knowing the average weight of babies born
prematurely. Since it would be impossible to measure all babies born prematurely in the
population, the researcher can take a sample from one location.
Because the weight of pre-term babies follows a normal distribution, the researcher can use the
maximum likelihood estimator to find the average weight of the entire population of pre-term
babies based on the sample data.
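A minimal sketch of maximum likelihood estimation for the pre-term baby-weight example: scipy.stats.norm.fit returns the maximum likelihood estimates of the mean and standard deviation for normally distributed data, and the sample weights below are made-up placeholders:
from scipy.stats import norm

sample_weights = [2.1, 1.8, 2.4, 2.0, 1.9, 2.2, 2.3, 1.7, 2.0, 2.1]  # hypothetical weights (kg)

mu_mle, sigma_mle = norm.fit(sample_weights)   # MLE of the mean and standard deviation
print(mu_mle, sigma_mle)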
Example: Constructing a confidence interval
Suppose the researchers have drawn a sample of size n = 46 and computed a sample mean of x̄ = 86.
If the population standard deviation is unknown, the researchers should use the standard deviation calculated from their sample instead.
Let's assume, for our example, that the researchers have chosen to compute the standard deviation
from their sample. They get a 6.2-gramme standard deviation.
S = 6.2.
For this example, let's assume that the researchers employ a 95 per cent confidence interval.
Step 5: Find the Z value for the confidence interval chosen in the previous step.
The researchers would subsequently use the following table to establish their Z value:
Confidence Interval Z
80% 1.282
85% 1.440
90% 1.645
95% 1.960
99% 2.576
99.5% 2.807
99.9% 3.291
86 ± 1.960 × (6.2/√46) = 86 ± 1.960 × (6.2/6.782)
This calculation yields a value of 86 ± 1.79, which the researchers use as their confidence interval.
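The arithmetic of this example can be reproduced with a few lines of Python (x̄ = 86, s = 6.2, n = 46, and z = 1.960 are the values used above):
import math

x_bar, s, n, z = 86, 6.2, 46, 1.960

margin = z * (s / math.sqrt(n))   # 1.960 × (6.2 / 6.782) ≈ 1.79
print(x_bar - margin, x_bar + margin)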
We explain the percentile, bias-corrected, and accelerated (BCa) versions of the bootstrap method for calculating confidence intervals in plain terms. This approach is suitable for both normal and non-normal data sets and may be used to calculate a broad range of metrics, including the mean, the median, the slope of a calibration curve, etc. As a practical example, the bootstrap method is used to determine the confidence interval around the median level of cocaine in femoral blood.
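A minimal sketch of the percentile bootstrap for a median confidence interval; the data values below are illustrative placeholders, not the femoral-blood measurements referred to in the text:
import random
import statistics

random.seed(1)
data = [0.12, 0.30, 0.45, 0.22, 0.18, 0.51, 0.27, 0.33, 0.40, 0.25]

boot_medians = []
for _ in range(10_000):
    resample = [random.choice(data) for _ in range(len(data))]  # resample with replacement
    boot_medians.append(statistics.median(resample))

boot_medians.sort()
lower = boot_medians[int(0.025 * len(boot_medians))]       # 2.5th percentile
upper = boot_medians[int(0.975 * len(boot_medians)) - 1]   # 97.5th percentile
print("95% percentile bootstrap CI for the median:", (lower, upper))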