Reading Material - Hypothesis Testing

Uploaded by

fatinisraqabir
Copyright
© © All Rights Reserved

What Is Hypothesis Testing?

Hypothesis testing is a procedure in statistics in which an analyst tests an assumption about a
population by examining a sample.
Hypothesis testing is used to assess the plausibility of a hypothesis by using sample data. Such data
may come from a larger population or a data-generating process.
Null & Alternative Hypotheses
The null and alternative hypotheses are two competing claims that researchers weigh evidence for
and against using a statistical test:
Null hypothesis (H0): There’s no effect in the population.
Alternative hypothesis (Ha or H1): There’s an effect in the population.
The null hypothesis is the claim that there’s no effect in the population. If the sample provides
enough evidence against the claim that there’s no effect in the population, then we can reject the
null hypothesis. Otherwise, we fail to reject the null hypothesis. Although “fail to reject” may
sound awkward, it’s the only wording that statisticians accept. Be careful not to say you “prove”
or “accept” the null hypothesis.

Example: Population on trial.

Think of a statistical test as being like a legal trial. The population is accused of the “crime” of
having an effect, and the sample is the criminal evidence. In many countries, a person accused of
a crime is assumed to be innocent until proven guilty. Similarly, we start by assuming the
population is “innocent” of having an effect.

In other words, the null hypothesis (i.e., that there is no effect) is assumed to be true until the
sample provides enough evidence to reject it. Null hypotheses often include phrases such as “no
effect,” “no difference,” or “no relationship.” When written in mathematical terms, they always
include an equality (usually =, but sometimes ≥ or ≤).
You can never know with complete certainty whether there is an effect in the population. Some
percentage of the time, your inference about the population will be incorrect. When you incorrectly
reject the null hypothesis, it’s called a type I error. When you incorrectly fail to reject it, it’s a type
II error.
Examples of null hypotheses

The table below gives examples of research questions and null hypotheses. There’s always more
than one way to answer a research question, but these null hypotheses can help you get started.

| Research question | Null hypothesis (H0): general | Null hypothesis (H0): test-specific |
|---|---|---|
| Does tooth flossing affect the number of cavities? | Tooth flossing has no effect on the number of cavities. | t test: The mean number of cavities per person does not differ between the flossing group (µ1) and the non-flossing group (µ2) in the population; µ1 = µ2. |
| Does the amount of text highlighted in the textbook affect exam scores? | The amount of text highlighted in the textbook has no effect on exam scores. | Linear regression: There is no relationship between the amount of text highlighted and exam scores in the population; β1 = 0. |
| Does daily meditation decrease the incidence of depression? | Daily meditation does not decrease the incidence of depression.* | Two-proportions z test: The proportion of people with depression in the daily-meditation group (p1) is greater than or equal to the no-meditation group (p2) in the population; p1 ≥ p2. |

*Note that some researchers prefer to always write the null hypothesis in terms of “no effect” and “=”. It would be
fine to say that daily meditation has no effect on the incidence of depression and p1 = p2.

The alternative hypothesis (Ha) is the other answer to your research question. It claims that there’s
an effect in the population. Often, your alternative hypothesis is the same as your research
hypothesis. In other words, it’s the claim that you expect or hope will be true.
The alternative hypothesis is a complement to the null hypothesis. Null and alternative hypotheses
are exhaustive, meaning that together they cover every possible outcome. They are also mutually
exclusive, meaning that only one can be true at a time.

Tip: Be careful with your words when you report the results of a statistical test in a research
paper or thesis. If you reject the null hypothesis, you can say that the alternative hypothesis
is supported. On the other hand, if you fail to reject the null hypothesis, then you can say that the
alternative hypothesis is not supported. Never say that you’ve proven or disproven a hypothesis.

Alternative hypotheses often include phrases such as “an effect,” “a difference,” or “a
relationship.” When alternative hypotheses are written in mathematical terms, they always include
an inequality (usually ≠, but sometimes < or >). As with null hypotheses, there are many acceptable
ways to phrase an alternative hypothesis.

Examples of alternative hypotheses

The table below gives examples of research questions and alternative hypotheses to help you get
started with formulating your own.

| Research question | Alternative hypothesis (Ha): general | Alternative hypothesis (Ha): test-specific |
|---|---|---|
| Does tooth flossing affect the number of cavities? | Tooth flossing has an effect on the number of cavities. | t test: The mean number of cavities per person differs between the flossing group (µ1) and the non-flossing group (µ2) in the population; µ1 ≠ µ2. |
| Does the amount of text highlighted in a textbook affect exam scores? | The amount of text highlighted in the textbook has an effect on exam scores. | Linear regression: There is a relationship between the amount of text highlighted and exam scores in the population; β1 ≠ 0. |
| Does daily meditation decrease the incidence of depression? | Daily meditation decreases the incidence of depression. | Two-proportions z test: The proportion of people with depression in the daily-meditation group (p1) is less than the no-meditation group (p2) in the population; p1 < p2. |

Similarities and differences between null and alternative hypotheses

Null and alternative hypotheses are similar in some ways:

• They’re both answers to the research question.


• They both make claims about the population.
• They’re both evaluated by statistical tests.

However, there are important differences between the two types of hypotheses, summarized in the
following table.

| | Null hypotheses (H0) | Alternative hypotheses (Ha) |
|---|---|---|
| Definition | A claim that there is no effect in the population. | A claim that there is an effect in the population. |
| Also known as | H0 | Ha, H1 |
| Typical phrases used | No effect; No difference; No relationship; No change; Does not increase; Does not decrease | An effect; A difference; A relationship; A change; Increases; Decreases |
| Symbols used | Equality symbol (=, ≥, or ≤) | Inequality symbol (≠, <, or >) |
| p ≤ α | Rejected | Supported |
| p > α | Failed to reject | Not supported |
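
The last two rows of the table amount to a simple decision rule. A minimal sketch in Python follows; the function name `decide` and the default `alpha` of 0.05 are illustrative choices, not something prescribed by the text.

```python
def decide(p_value: float, alpha: float = 0.05) -> str:
    """Map a p value to the wording used in the comparison table."""
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("p_value must be between 0 and 1")
    if p_value <= alpha:
        return "reject H0; Ha is supported"
    return "fail to reject H0; Ha is not supported"


print(decide(0.03))  # p is at or below alpha
print(decide(0.20))  # p is above alpha
```

Note that neither branch says a hypothesis was "proven" or "accepted", in line with the wording advice given earlier.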

Answering your research question with hypotheses


The null and alternative hypotheses offer competing answers to your research question.

The null hypothesis (H0) answers “No, there’s no effect in the population.”
The alternative hypothesis (Ha) answers “Yes, there is an effect in the population.”

The null and alternative hypotheses are always claims about the population. That’s because the goal of
hypothesis testing is to reach a conclusion about a population based on a sample. Often, we infer
whether there’s an effect in the population by looking at the sample. Writing strong hypotheses is
important for your research. You can use a statistical test to decide whether the evidence
collected is strong enough to reject the null hypothesis in favor of the alternative hypothesis.

Example of Hypothesis Testing


If an individual wants to test whether a penny has exactly a 50% chance of landing on heads, the null
hypothesis would be that 50% is correct, and the alternative hypothesis would be that 50% is not
correct.
A random sample of 100 coin flips is taken, and the null hypothesis is tested. If 40 of the 100
flips result in heads and 60 in tails, the analyst may conclude, depending on the significance
level chosen, that the penny does not have a 50% chance of landing on heads, reject the null
hypothesis, and treat the alternative hypothesis as supported.
If there were 48 heads and 52 tails, then it is plausible that the coin could be fair and still produce
such a result. In cases such as this, where the analyst fails to reject the null hypothesis, the analyst
states that the difference between the expected results (50 heads and 50 tails) and the observed
results (48 heads and 52 tails) is "explainable by random chance or coincidence."
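
The two coin outcomes can be checked numerically. The sketch below is one possible way to do it, not the method the text prescribes: it computes a two-sided exact binomial p value and, for comparison, a two-sided p value from the normal approximation. The function names are hypothetical, and only the Python standard library is used.

```python
from math import comb, sqrt
from statistics import NormalDist


def exact_binomial_p(heads: int, flips: int, p_null: float = 0.5) -> float:
    """Two-sided exact binomial p value.

    Sums the probabilities of every outcome that is no more likely,
    under the null hypothesis, than the observed number of heads.
    """
    pmf = [
        comb(flips, k) * p_null**k * (1 - p_null) ** (flips - k)
        for k in range(flips + 1)
    ]
    cutoff = pmf[heads] * (1 + 1e-9)  # small tolerance for floating-point ties
    return min(1.0, sum(p for p in pmf if p <= cutoff))


def z_test_p(heads: int, flips: int, p_null: float = 0.5) -> float:
    """Two-sided p value from the normal approximation (no continuity correction)."""
    standard_error = sqrt(p_null * (1 - p_null) / flips)
    z = (heads / flips - p_null) / standard_error
    return 2 * (1 - NormalDist().cdf(abs(z)))


for heads in (40, 48):
    exact = exact_binomial_p(heads, 100)
    approx = z_test_p(heads, 100)
    print(f"{heads} heads: exact p = {exact:.4f}, normal-approximation p = {approx:.4f}")
```

With 40 heads, the exact test gives a p value of about 0.057 and the normal approximation about 0.046, so at a 5% significance level the decision is borderline and depends on the test chosen. With 48 heads, both p values are far above 0.05, which matches the "explainable by random chance" reading.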
What are the Benefits of Hypothesis Testing?
Hypothesis testing helps assess the accuracy of new ideas or theories by testing them against data.
This allows researchers to determine whether the evidence supports their hypothesis, helping to
avoid false claims and conclusions. Hypothesis testing also provides a framework for decision-
making based on data rather than personal opinions or biases. By relying on statistical analysis,
hypothesis testing helps to reduce the effects of chance and confounding variables, providing a
robust framework for making informed conclusions.

What are the Limitations of Hypothesis Testing?


Hypothesis testing relies exclusively on data and doesn’t provide a comprehensive understanding
of the subject being studied. Additionally, the accuracy of the results depends on the quality of the
available data and the statistical methods used. Inaccurate data or inappropriate hypothesis
formulation may lead to incorrect conclusions or failed tests. Hypothesis testing can also lead to
errors, such as analysts rejecting a null hypothesis that is true or failing to reject one that is false.
These errors may result in false conclusions or missed opportunities to identify significant patterns
or relationships in the data.

How Hypothesis Testing Works


In hypothesis testing, an analyst tests a statistical sample to gauge how plausible the null
hypothesis is. Analysts measure and examine a random sample of the population being analyzed
and use it to test two different hypotheses: the null hypothesis and the alternative hypothesis.
The alternative hypothesis is effectively the opposite of the null hypothesis. Thus, they are
mutually exclusive, and only one can be true.

Four-Step Process
• State the hypotheses.
• Formulate an analysis plan, which outlines how the data will be evaluated.
• Carry out the plan and analyze the sample data.
• Analyze the results and either reject the null hypothesis, or state that the null hypothesis is
plausible, given the data.
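
The four steps can be sketched in code using the daily-meditation question from the earlier tables. This is an illustration under stated assumptions: the counts below are hypothetical, and the one-sided two-proportions z test with a pooled standard error is one common choice of analysis plan.

```python
from math import sqrt
from statistics import NormalDist

# Step 1: state the hypotheses.
#   H0: p1 >= p2 (daily meditation does not decrease depression)
#   Ha: p1 <  p2 (daily meditation decreases depression)

# Step 2: analysis plan. One-sided two-proportions z test at alpha = 0.05.
ALPHA = 0.05

# Step 3: analyze the sample data (hypothetical counts, for illustration only).
depressed_meditation, n_meditation = 30, 200
depressed_control, n_control = 45, 200

p1 = depressed_meditation / n_meditation
p2 = depressed_control / n_control
pooled = (depressed_meditation + depressed_control) / (n_meditation + n_control)
standard_error = sqrt(pooled * (1 - pooled) * (1 / n_meditation + 1 / n_control))
z = (p1 - p2) / standard_error
p_value = NormalDist().cdf(z)  # left tail, because Ha says p1 < p2

# Step 4: interpret the result.
decision = "reject H0" if p_value <= ALPHA else "fail to reject H0"
print(f"z = {z:.3f}, p = {p_value:.4f}, decision: {decision}")
```

Fixing the test and the significance level in step 2, before looking at the data, keeps the decision in step 4 from being tuned to the result.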
If the result of the research is statistically significant, it means that a result like the one
observed would be unlikely if only random variation were at work. This suggests, but does not
prove, that the effect seen in the sample also exists in the whole population. Statistical
significance does not rule out bias in how the data were collected.
If the result is not statistically significant, it means that the current study’s results
aren’t strong enough to draw a conclusion. The effect might be real in the sample, but it may not
be in the overall population. The sample size used may be too small, or other factors might
influence the outcome.

Type I & Type II Errors

Scene 1:
You decide to get tested for COVID-19 based on mild symptoms. There are two errors that could
potentially occur:
Type I error (false positive): The test result says you have coronavirus, but you actually don’t.
Type II error (false negative): The test result says you don’t have coronavirus, but you actually
do.
In statistics, a Type I error is a false positive conclusion, while a Type II error is a false negative
conclusion. Making a statistical decision always involves uncertainties, so the risks of making
these errors are unavoidable in hypothesis testing.
Scene 2:
You test whether a new drug intervention can alleviate symptoms of an autoimmune disease.
In this case:
• The null hypothesis (H0) is that the new drug has no effect on symptoms of the disease.
• The alternative hypothesis (H1) is that the drug is effective for alleviating symptoms of the
disease.
Then, you decide whether the null hypothesis can be rejected based on your data and the results of
a statistical test. Since these decisions are based on probability, there is always a risk of making
the wrong conclusion.
• If your results show statistical significance, that means they are very unlikely to occur if
the null hypothesis is true. In this case, you would reject your null hypothesis. But
sometimes, this may actually be a Type I error.
• If your findings do not show statistical significance, they have a high chance of occurring
if the null hypothesis is true. Therefore, you fail to reject your null hypothesis. But
sometimes, this may be a Type II error.

A Type I error happens when you get false positive results: you conclude that the drug intervention
improved symptoms when it actually didn’t. These improvements could have arisen from other
random factors or measurement errors.
A Type II error happens when you get false negative results: you conclude that the drug
intervention didn’t improve symptoms when it actually did. Your study may have missed key
indicators of improvements or attributed any improvements to other factors instead.

A Type I error means rejecting the null hypothesis when it’s actually true. It means concluding
that results are statistically significant when, in reality, they came about purely by chance or
because of unrelated factors.
The risk of committing this error is the significance level (alpha or α) you choose. That’s a threshold
that you set at the beginning of your study and compare against the p value, which is the probability
of obtaining results at least as extreme as yours if the null hypothesis is true.
The significance level is usually set at 0.05 or 5%. This means that you reject the null hypothesis
only when your results would have a 5% chance of occurring, or less, if the null hypothesis is
actually true.
If the p value of your test is lower than the significance level, it means your results are statistically
significant and consistent with the alternative hypothesis. If your p value is higher than the
significance level, then your results are considered statistically non-significant.
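
A small simulation can make the link between α and the Type I error rate concrete. The sketch below is illustrative only: it assumes normally distributed data with a known standard deviation of 1, uses a two-sided z test, and draws every sample from a population where the null hypothesis (mean = 0) is true, so every rejection is a false positive.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean

ALPHA = 0.05
CRITICAL_Z = NormalDist().inv_cdf(1 - ALPHA / 2)  # about 1.96 for a two-sided test


def rejects_null(sample: list[float], sigma: float = 1.0) -> bool:
    """Two-sided z test of H0: population mean = 0, with known sigma."""
    z = fmean(sample) / (sigma / sqrt(len(sample)))
    return abs(z) > CRITICAL_Z


rng = random.Random(42)  # fixed seed so the run is repeatable
trials, sample_size = 10_000, 30
false_positives = sum(
    rejects_null([rng.gauss(0.0, 1.0) for _ in range(sample_size)])
    for _ in range(trials)
)
type_i_rate = false_positives / trials
print(f"Share of true null hypotheses rejected: {type_i_rate:.3f}")
```

The printed share should land close to 0.05: over many studies of a true null hypothesis, roughly a fraction α of them are rejected by chance alone.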

Is a Type I or Type II error worse?


For statisticians, a Type I error is usually worse. In practical terms, however, either type of error
could be worse depending on your research context.
A Type I error means mistakenly going against the main statistical assumption of a null hypothesis.
This may lead to new policies, practices or treatments that are inadequate or a waste of resources.
Example: Consequences of a Type I error
Based on the incorrect conclusion that the new drug intervention is effective, over a million
patients are prescribed the medication, despite risks of severe side effects and inadequate research
on the outcomes. The consequences of this Type I error also mean that other treatment options are
rejected in favor of this intervention.
In contrast, a Type II error means failing to reject a null hypothesis that is actually false. It may
only result in missed opportunities to innovate, but these can also have important practical consequences.
Example: Consequences of a Type II error
If a Type II error is made, the drug intervention is considered ineffective when it can actually
improve symptoms of the disease. This means that a medication with important clinical
significance doesn’t reach a large number of patients who could tangibly benefit from it.
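
How often a real effect is missed can also be estimated by simulation. In this sketch the true effect of 0.5 standard deviations, the sample size of 30, and the two-sided z test with known standard deviation are all assumptions chosen for illustration, not values from the drug example.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean

ALPHA = 0.05
CRITICAL_Z = NormalDist().inv_cdf(1 - ALPHA / 2)
TRUE_EFFECT = 0.5  # hypothetical real improvement, in standard-deviation units

rng = random.Random(7)  # fixed seed so the run is repeatable
trials, sample_size = 5_000, 30
misses = 0
for _ in range(trials):
    sample = [rng.gauss(TRUE_EFFECT, 1.0) for _ in range(sample_size)]
    z = fmean(sample) * sqrt(sample_size)  # z test of H0: mean = 0, sigma = 1
    if abs(z) <= CRITICAL_Z:
        misses += 1  # failed to reject a false null: a Type II error

type_ii_rate = misses / trials
print(f"Share of real effects missed: {type_ii_rate:.3f}")
```

Under these assumptions roughly one study in five fails to detect the effect. Rerunning with a larger `sample_size` or a larger `TRUE_EFFECT` lowers that share.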

What Is a T-Test?
A t-test is an inferential statistical test used to determine whether there is a significant difference
between the means of two groups. It is used for hypothesis testing in statistics and relies on the
t-statistic, the t-distribution, and the degrees of freedom to determine statistical significance.

How Is the T-Distribution Table Used?


The T-Distribution Table is available in one-tailed and two-tailed formats. The one-tailed value is used for
assessing cases that have a fixed value or range with a clear direction, either positive or negative.
For instance, what is the probability of the output value remaining below -3, or getting more than
seven when rolling a pair of dice? The two-tailed value is used for range-bound analysis, such as
asking if the coordinates fall between -2 and +2.
A one-tailed test may be either left-tailed or right-tailed.
• A left-tailed test is used when the alternative hypothesis states that the true value of the
parameter specified in the null hypothesis is less than the null hypothesis claims.
• A right-tailed test is used when the alternative hypothesis states that the true value of the
parameter specified in the null hypothesis is greater than the null hypothesis claims.
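
In place of a printed table, the tail probabilities can be computed directly. The sketch below is a standard-library approximation, not a reference implementation: it integrates the Student's t density numerically with Simpson's rule, and the function names are hypothetical. It is meant for small to moderate degrees of freedom, since `math.gamma` overflows for very large arguments.

```python
from math import gamma, pi, sqrt


def t_pdf(x: float, df: int) -> float:
    """Density of Student's t distribution with df degrees of freedom."""
    coeff = gamma((df + 1) / 2) / (sqrt(df * pi) * gamma(df / 2))
    return coeff * (1 + x * x / df) ** (-(df + 1) / 2)


def t_cdf(x: float, df: int, steps: int = 2000) -> float:
    """P(T <= x), using Simpson's rule on [0, |x|] plus symmetry about 0."""
    upper = abs(x)
    h = upper / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if x >= 0 else 0.5 - area


def p_values(t_stat: float, df: int) -> dict[str, float]:
    """Left-tailed, right-tailed, and two-tailed p values for a t statistic."""
    left = t_cdf(t_stat, df)
    right = 1 - left
    return {"left": left, "right": right, "two": 2 * min(left, right)}


print(p_values(2.228, 10))
```

For t = 2.228 with 10 degrees of freedom, the right-tailed p value is about 0.025 and the two-tailed p value about 0.05, which agrees with the familiar table entry. The same statistic is therefore stronger evidence in a right-tailed test than in a two-tailed one.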

Calculating T-value

Does social media use affect well-being?


Researchers want to see if there’s a difference in well-being scores between people who use a lot
of social media and those who use very little. They collect data from two groups:
• Group 1: People who use social media for more than 2 hours a day (heavy users)
• Group 2: People who use social media for less than 30 minutes a day (light users)
They measure well-being using a standardized survey. The T-test will help them determine if the
average well-being scores between the two groups are significantly different.
The t-test formula uses the following information:
• x̄ (x bar): This is the average well-being score for one group (either heavy or light users).
• μ (mu): This is the hypothesized average well-being score for the entire population (not
just the two groups studied). Since the researchers don’t know the well-being score of the
entire population, they might use a previous study’s average score or set it to zero.
• s: This is the standard deviation, which measures how spread out the data points are in the
group.
• n: This is the number of people in the group.
By plugging this data into the formula, the t-test calculates a t-value. Researchers can then compare
this t-value to a t-distribution table, using the degrees of freedom (which depend on the sample
size), to see if the t-value is statistically significant.
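
The symbols listed here (a group mean, a hypothesized population mean μ, a standard deviation s, and a group size n) fit the standard one-sample t statistic, written out below as an assumption about the formula the text refers to. The second line gives Welch's two-sample form, also an assumption, which compares the two group means directly, with subscripts 1 and 2 marking the heavy and light users.

```latex
t = \frac{\bar{x} - \mu}{s / \sqrt{n}}
\qquad\text{and}\qquad
t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}
```

In the one-sample form the degrees of freedom are n − 1.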
In simpler terms,
The t-test takes the average score (x̄), subtracts the hypothesized average score (μ) for the whole
population, and scales the difference by the variability within the group (s) and the number of
people studied (n). The resulting t-value measures how large the observed difference is relative to
what random chance alone would typically produce; the farther it is from zero, the less plausible it
is that the difference arose by chance.
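
The calculation can be sketched in a few lines of Python. Everything specific here is hypothetical: the well-being scores are invented for illustration, the benchmark of 65 is an assumed population average, and the two-sample version uses Welch's form, which the text does not name.

```python
from math import sqrt
from statistics import mean, stdev

# Hypothetical well-being scores, invented purely for illustration.
heavy_users = [60, 62, 58, 65, 61, 59, 63, 64]
light_users = [68, 70, 66, 72, 69, 71, 67, 73]


def one_sample_t(sample: list[float], mu: float) -> float:
    """t = (x_bar - mu) / (s / sqrt(n)) for one group against a hypothesized mean."""
    return (mean(sample) - mu) / (stdev(sample) / sqrt(len(sample)))


def welch_t(a: list[float], b: list[float]) -> float:
    """Two-sample t statistic that does not assume equal variances (Welch's form)."""
    standard_error = sqrt(stdev(a) ** 2 / len(a) + stdev(b) ** 2 / len(b))
    return (mean(a) - mean(b)) / standard_error


t_one = one_sample_t(heavy_users, mu=65)  # 65 is an assumed benchmark score
t_two = welch_t(heavy_users, light_users)
print(f"one-sample t = {t_one:.3f} (df = {len(heavy_users) - 1})")
print(f"two-sample t = {t_two:.3f}")
```

Both t-values are negative because the heavy users' mean (61.5) is below the benchmark and below the light users' mean (69.5). The size of each value, read against the t-distribution for the relevant degrees of freedom, indicates whether the difference is statistically significant.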
