RM 5

The document outlines a syllabus for a course on research proposals, data analysis, and hypothesis testing, covering topics such as hypothesis formulation, types of tests, and report writing. It explains key concepts including null and alternative hypotheses, significance levels, and various statistical tests like t-tests and ANOVA. Additionally, it discusses the importance of hypothesis testing in drawing conclusions from sample data and the potential errors involved in the process.

Syllabus

• Unit 5: Research Proposal, Data Analysis and Hypothesis Testing: (12 Sessions)
• Elements of research proposal, Case study on drafting research proposal
• Data Analysis - Frequency distribution table, Charts: Bar, Pie, and Histogram.
• Hypotheses tests: Stages of Hypothesis testing, Hypothesis- meaning, importance and qualities of
good hypothesis, framing null & alternate hypothesis, Type-I & Type-II error, Concept of Level of
Significance and Confidence level.
• One sample t test, independent sample t test, paired sample t test, One way ANOVA, chi-square
test for independence (3 Case studies using EXCEL on hypotheses testing)
• Report Writing: Types of Reports, Writing a Research Report.
Definition
• A hypothesis is a tentative answer to your research question
that needs to be tested.

• Hypothesis testing is a commonly used strategy for deciding whether
sample data offer enough support for a hypothesis that a
generalisation can be made.
• Hypothesis testing enables us to make probability statements
about population parameter(s)
CHARACTERISTICS OF A GOOD HYPOTHESIS
1. A hypothesis should be capable of being tested. A
hypothesis “is testable if other deductions can be made
from it which, in turn, can be confirmed or disproved by
observation.”
2. A good hypothesis does not conflict with any law of
nature which is known to be true. (It is logically consistent
with established scientific principles and natural laws.)
3. A hypothesis should be clear and precise. If the
hypothesis is not clear and precise, the inferences
drawn on its basis cannot be taken as reliable.
4. A good hypothesis permits the application of
deductive reasoning. (Top-Down Approach)
5. A good hypothesis ensures that the sample is readily
approachable.
6. A good hypothesis indicates clearly the role of
different variables involved in the study.
An intuitive introduction to Hypothesis Testing

Suppose we want to find out the
average height of women in a town.

We’ll ignore the potential difference
between women of different age
groups and just keep it simple.
Let’s try and put it to the test
• Assume we take a sample of 20 women and their mean height
comes to 168.6 cm.
• So what does this observation mean for our hypothesis?
• How much doubt does it cast on our hypothesis?
• Taking a different sample: now let’s imagine another scenario
where we again randomly sample 20 women, but this time the
average of their heights is 161 cm.
Let’s formalize things a little bit now
• We call the original hypothesis the null hypothesis and
represent it as H₀.
• The competing hypothesis is called the alternative
hypothesis, often represented as H₁.
Formalizing Hypothesis Testing
• Let’s consider this number line representing the possible values of
the sample mean.
• We typically represent the sample mean with x̅ and the population mean
with μ.
• If you took a sample and computed its mean, you would expect
it to come out at 169, but you also know that due to random
variation it could easily be 168 or 170.
• What hypothesis testing does is set critical boundaries beyond
which we start rejecting the null hypothesis.
A Second Example
We are going to take a sample and put this
hypothesis to the test.

This time around, though, we take only five
people to make up our sample, and we find that
the average weight of these 5 chosen people is
only 68 kg.
This is casting a lot of doubt now.
The average weight hasn’t changed a bit, but what has changed is the number of
observations in the sample.
So we now have 500 people in the sample. What that means is that we’re more
confident about the sample mean.
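
The effect of sample size can be sketched numerically. The text here does not state the hypothesized weight or the spread of weights, so the values of 70 kg and 5 kg below are assumptions chosen only for illustration. The sketch shows how far the same sample mean of 68 kg sits from the hypothesized mean, measured in standard errors, for 5 and for 500 people.

```python
import math

def z_statistic(sample_mean, hypothesized_mean, sigma, n):
    """Distance of the sample mean from the hypothesized mean, in standard errors."""
    standard_error = sigma / math.sqrt(n)
    return (sample_mean - hypothesized_mean) / standard_error

# ASSUMED values (not given in the slides): hypothesized mean and population sd.
HYPOTHESIZED_MEAN = 70.0  # kg
SIGMA = 5.0               # kg

z_small = z_statistic(68.0, HYPOTHESIZED_MEAN, SIGMA, n=5)
z_large = z_statistic(68.0, HYPOTHESIZED_MEAN, SIGMA, n=500)

print(f"n = 5:   z = {z_small:.2f}")
print(f"n = 500: z = {z_large:.2f}")
```

Under these assumptions the same 2 kg gap is less than one standard error with 5 people but several standard errors with 500, which is why the larger sample casts far more doubt on the hypothesis.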
BASIC CONCEPTS CONCERNING TESTING OF
HYPOTHESES
• 1. Null hypothesis and alternative hypothesis
• 2. The level of significance
• 3. Decision rule or test of hypothesis
• 4. Type I and Type II errors
• 5. Two-tailed and One-tailed tests
Null and Alternative Hypothesis
• A hypothesis is an educated guess at the answer to your research question.
• The two types of hypotheses, null and alternative, work as a
complementary pair.
• Null Hypothesis (H0): “Null” meaning “nothing.” This hypothesis
states that there is no difference between groups or no
relationship between variables.
• Alternative Hypothesis (Ha): This is also known as the claim. This
hypothesis should state what you expect the data to show, based
on your research on the topic. This is your answer to your research
question.
How to write a Hypothesis?
• To write a null hypothesis, first start by asking a question.
Rephrase that question in a form that assumes no
relationship between the variables.
• Testing the null hypothesis can tell you whether your results
are due to the effect of manipulating an independent variable or
due to chance.
Contd..
• Alternative hypothesis is usually the one which one wishes to
prove and the null hypothesis is the one which one wishes to
disprove.
• Thus, a null hypothesis represents the hypothesis we are
trying to reject, and alternative hypothesis represents all other
possibilities.
• Rejecting a hypothesis does not mean an experiment was "bad" or
that it didn't produce results. In fact, it is often one of the first
steps toward further inquiry.
Two types of Alternative Hypotheses
• Alternative hypotheses are of two types: one-sided
(directional) and two-sided (non-directional).
One Tailed Test
• Hypothesis Testing: Does Age Affect Mathematical Ability?
• One-Tailed Hypothesis Test
• A one-tailed test checks for an effect in only one direction.
Two Tailed Test

• A two-tailed test checks for an effect in both directions (either
increase or decrease).
Try Yourself
• Does social media usage affect students' academic performance?
Test of Hypothesis
• A test of hypothesis or a test of significance is a procedure for
assessing the compatibility of the data with the null hypothesis.
• It provides us with the ways of using sample data to decide
between the two competing hypotheses.
• It is a rule for deciding whether to reject or not to reject the null, on
the basis of the sample data.
• We assume the null to be true. Then we want to see if the data
contradict the assumption.
• For any test, two possible conclusions are: 'reject H0' or 'fail to
reject H0'
Errors in Hypothesis testing
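• Rejecting H0 does not prove that the null hypothesis you have
formulated is false; rather, it means the data are incompatible with
H0, that is, the data provide sufficient evidence against H0.
• Likewise, failing to reject H0 does not prove that H0 is true; it
means the data provide insufficient evidence against H0.
• Thus two types of errors may creep into the statistical test.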
• We may reject H0 when H0 is true and we may accept H0 when in
fact H0 is not true.
• The former is known as Type I error and the latter as Type II error.
• In other words, Type I error means rejection of hypothesis which
should have been accepted and Type II error means accepting the
hypothesis which should have been rejected.
Contd..
The probability of a Type I error is denoted by α (alpha);
it is known as the α error and is also called the
level of significance of the test.
The probability of a Type II error is denoted by β (beta)
and is known as the β error.
Level of significance
• The significance level (α) is the risk you are willing to take of
guessing wrong.
• Think of Significance Level Like a Fire Alarm
• Imagine you are in a building, and there is a fire alarm system
that detects smoke and warns people if there is a fire. Now, fire
alarms aren’t perfect. Sometimes they ring even when there’s no
fire (like if you burn toast). Other times, they might fail to ring
when there actually is a fire.
Contd..
• Significance Level (α) = How Strict the Fire Alarm Is
• The significance level (α) is like setting the sensitivity of the alarm:
• α = 5% (0.05) → The alarm is moderately strict. When there is no
fire, it still rings about 5% of the time (a false alarm, i.e. a
Type I error).
• α = 1% (0.01) → The alarm is very strict. When there is no fire, it
rings only about 1% of the time, so false alarms are rarer. But it is
more likely to miss a real fire (a Type II error).
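
The false-alarm reading of α can be checked with a small simulation. This is only a sketch: the population mean of 169 and standard deviation of 6 are assumed values, and a two-tailed z-test with known standard deviation is used for simplicity. Every sample is drawn with the null hypothesis true, so each rejection is a false alarm.

```python
import random
from statistics import NormalDist, mean

def false_alarm_rate(alpha, trials=2000, n=20, seed=0):
    """Fraction of samples drawn under a TRUE H0 that a two-tailed z-test rejects."""
    rng = random.Random(seed)
    mu0, sigma = 169.0, 6.0  # ASSUMED population values, for illustration only
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    rejections = 0
    for _ in range(trials):
        sample = [rng.gauss(mu0, sigma) for _ in range(n)]
        z = (mean(sample) - mu0) / (sigma / n ** 0.5)
        if abs(z) > z_crit:
            rejections += 1  # H0 is true here, so this is a Type I error
    return rejections / trials

rate_05 = false_alarm_rate(0.05)
rate_01 = false_alarm_rate(0.01)
print(f"alpha = 0.05: false alarm rate = {rate_05:.3f}")
print(f"alpha = 0.01: false alarm rate = {rate_01:.3f}")
```

The simulated rates should land close to the chosen α, with the stricter 1% level producing fewer false alarms than the 5% level.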
What is a Sampling Distribution
• A sampling distribution is the probability distribution of a statistic
(e.g., mean, proportion, standard deviation) calculated from many
random samples taken from a population.
How the Sampling Distribution is Formed:
• Imagine taking many samples of size 20 from the population.
• Each sample has its own mean (some are higher, some are lower).
• If we plot all these sample means, we get the sampling
distribution of the mean.
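
The procedure above can be sketched in a few lines. The population here is simulated with an assumed mean of 169 cm and standard deviation of 6 cm; those numbers and the count of 1,000 samples are illustrative choices, not values from the slides.

```python
import random
from statistics import mean, stdev

rng = random.Random(42)

# ASSUMED population: 10,000 heights with mean 169 cm and sd 6 cm.
population = [rng.gauss(169.0, 6.0) for _ in range(10_000)]

# Take many samples of size 20 and record the mean of each one.
sample_means = [mean(rng.sample(population, 20)) for _ in range(1_000)]

print(f"population mean:        {mean(population):.2f}")
print(f"mean of sample means:   {mean(sample_means):.2f}")
print(f"population sd:          {stdev(population):.2f}")
print(f"sd of the sample means: {stdev(sample_means):.2f}")
```

The sample means cluster around the population mean and vary much less than individual heights do; their spread is the standard error of the mean.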
Rejection Region/ Critical region
• A rejection region, also called a critical region, is an area where
the null hypothesis is rejected.
• The number of rejection regions is determined by your alternate
hypothesis.
• The Size of rejection region is determined by the alpha
(significance) level.
• For example, if you want to be 95% confident that your results are
significant, you would select a 5% alpha level (100% – 95% = 5%).
This 5% alpha level represents the rejection region. In a one-tailed
test, the 5% is in one tail, while in a two-tailed test, the 5% is
split between both tails (2.5% in each).
Rejection Regions and Alpha Levels
• You, as a researcher, choose the alpha level you are willing to
accept.
• For example, if you wanted to be 95% confident that your results
are significant, you would choose a 5% alpha level (100% – 95%).
That 5% level is the rejection region. For a one tailed test, the 5%
would be in one tail. For a two tailed test, the rejection region
would be in two tails.
Rejection Regions and P-Values
• There are two ways you can test a hypothesis: with a p-value and
with a critical value.
• P-value method: When you run a hypothesis test (for example, a z
test), the result of that test will be a p-value (probability value).
• It tells you how likely it would be to see data at least as extreme
as yours if the null hypothesis were true.
• If the p-value is less than or equal to α (equivalently, the test
statistic falls in the rejection region), you have statistically
significant results and can reject the null hypothesis. If the p-value
is greater than α (the test statistic falls outside the rejection
region), your results aren’t enough to throw out the null hypothesis.
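
Both methods can be sketched side by side for a two-tailed z-test. The sample means (168.6 and 161), the sample size of 20 and the expected mean of 169 come from the height example earlier; the population standard deviation of 6 cm is an assumption added so the test can be computed.

```python
from statistics import NormalDist

def two_tailed_z_test(sample_mean, mu0, sigma, n, alpha=0.05):
    """Return (z, p_value, z_critical) for a two-tailed z-test."""
    std_normal = NormalDist()  # standard normal: mean 0, sd 1
    z = (sample_mean - mu0) / (sigma / n ** 0.5)
    p_value = 2 * (1 - std_normal.cdf(abs(z)))
    z_critical = std_normal.inv_cdf(1 - alpha / 2)
    return z, p_value, z_critical

SIGMA = 6.0  # ASSUMED population sd in cm (not given in the slides)

for sample_mean in (168.6, 161.0):
    z, p_value, z_critical = two_tailed_z_test(sample_mean, 169.0, SIGMA, 20)
    reject_by_p = p_value <= 0.05
    reject_by_critical = abs(z) >= z_critical
    print(f"mean {sample_mean}: z = {z:.2f}, p = {p_value:.4f}, "
          f"reject by p-value: {reject_by_p}, by critical value: {reject_by_critical}")
```

The two decision rules always agree, because the p-value is at most α exactly when the test statistic is at least as extreme as the critical value.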
Critical Values: Find a Critical Value in
Any Tail
• A critical value is a point or threshold on
a probability distribution that helps you
figure out whether to support or reject the
null hypothesis in hypothesis testing. It is
used to define the boundary between
the acceptance region and rejection
region for a given significance level.

• In the figure, the critical value is the red line.
Critical Values of Z
• The critical value of z is a term linked to the area under the standard
normal model. Critical values can tell you what probability any
particular data point will have.
• Consider a critical value of 1.28.
• We can show this with the help of a normal distribution curve.
• The graph has two parts
• Central region: The z-score is equal to the number of standard deviations from the mean.
A score of 1.28 indicates that the variable is 1.28 standard deviations from the mean. If
you look in the z-table for a z of 1.28, you’ll find the area is .3997. This is the region to
the right of the mean, so you’ll double it to get the area of the entire central region:
.3997*2 = .7994 or about 80 percent.
• Tail region: The area of the tails (the red areas) is 1 minus the central region. In this
example, 1-.8 = .20, or about 20 percent. The tail regions are sometimes calculated when
you want to know what share of values would be less than or more than a certain figure.
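
The table lookup for z = 1.28 can be reproduced with Python's standard library instead of a printed z-table. This is a minimal sketch; the variable names are illustrative.

```python
from statistics import NormalDist

z = 1.28
std_normal = NormalDist()  # standard normal: mean 0, sd 1

# Area between the mean and z (the value a z-table lists for 1.28).
half_central = std_normal.cdf(z) - 0.5
central = 2 * half_central   # area between -z and +z
tails = 1 - central          # combined area of both tails

print(f"area from mean to z: {half_central:.4f}")
print(f"central region:      {central:.4f}")
print(f"both tails:          {tails:.4f}")
```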
Example question: Find a critical value for a
90% confidence level (Two-Tailed Test).
• Step 1: Subtract the confidence level from 100% to find the α level: 100%
– 90% = 10%.
• Step 2: Convert Step 1 to a decimal: 10% = 0.10.
• Step 3: Divide Step 2 by 2 (this is called “α/2”). So: 0.10 / 2 = 0.05. This is the
area in each tail.
• Step 4: Subtract Step 3 from 1 (because we want the area to the left of the
critical value, not the area in the tail): So: 1 – 0.05 = .95.
• Step 5: Look up the area from Step 4 in the z-table. The area .95 is at z=1.645.
This is your critical value for a confidence level of 90%.
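
The same steps can be sketched in code, with the inverse of the standard normal CDF standing in for the z-table lookup. The function name is a hypothetical helper for illustration.

```python
from statistics import NormalDist

def two_tailed_critical_z(confidence_level):
    """Critical z for a two-tailed test at the given confidence level (e.g. 0.90)."""
    alpha = 1 - confidence_level     # alpha level as a decimal
    area_in_each_tail = alpha / 2    # alpha / 2
    area_to_left = 1 - area_in_each_tail
    return NormalDist().inv_cdf(area_to_left)  # replaces the table lookup

z_90 = two_tailed_critical_z(0.90)
z_95 = two_tailed_critical_z(0.95)
print(f"90% confidence: z = {z_90:.3f}")
print(f"95% confidence: z = {z_95:.3f}")
```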
Statistical tests

• Z score
• t-test
• Chi-square test
• ANOVA
t-Test
The t-test is a statistical test
procedure that tests whether there
is a significant difference between
the means of two groups.

The two groups could be, for
example, patients who
received drug A and patients who
received drug B, and you want to know if
there is a difference in blood
pressure between these two
groups.
Types of t-test
• One sample t-Test
• We use the one sample t-test when we want to compare the mean
of a sample with a known reference mean.
Example of a one sample t-test

A manufacturer of chocolate bars
claims that its chocolate bars weigh
50 grams on average. To verify this, a
sample of 30 bars is taken and
weighed. The mean value of this
sample is 48 grams.

We can now perform a one sample t-test
to see if the mean of 48 grams is
significantly different from the
claimed 50 grams.
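
A minimal sketch of the calculation for this example follows. The sample size, sample mean and claimed mean come from the example; the sample standard deviation of 3.5 g is an assumption, since the example does not give one. The critical value is the two-tailed 5% value for 29 degrees of freedom from a standard t-table.

```python
import math

def one_sample_t(sample_mean, sample_sd, n, reference_mean):
    """Return (t statistic, degrees of freedom) for a one sample t-test."""
    standard_error = sample_sd / math.sqrt(n)
    t = (sample_mean - reference_mean) / standard_error
    return t, n - 1

# From the example: n = 30, sample mean 48 g, claimed mean 50 g.
# ASSUMPTION: sample standard deviation of 3.5 g (not given in the example).
t_stat, df = one_sample_t(48.0, 3.5, 30, 50.0)

T_CRITICAL = 2.045  # two-tailed, alpha = 0.05, df = 29 (standard t-table)
reject_h0 = abs(t_stat) > T_CRITICAL

print(f"t = {t_stat:.3f}, df = {df}, reject H0: {reject_h0}")
```

With a larger assumed standard deviation the same 2 g gap could fail to reach significance, so the conclusion depends on the spread of the sample as well as its mean.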
t-test for independent samples
• We use the t-test for independent samples when we want to
compare the means of two independent groups or samples. We
want to know if there is a significant difference between these
means.
We would like to compare the effectiveness of
two painkillers, drug A and drug B.

To do this, we randomly divide 60 test subjects
into two groups. The first group receives drug
A, the second group receives drug B. With an
independent t-test we can now test whether
there is a significant difference in pain relief
between the two drugs.
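
The calculation behind the independent t-test can be sketched as below, using the pooled-variance form that assumes both groups have similar variances. The pain relief scores are made-up illustrative data, and only 5 subjects per group are shown instead of 30 to keep the example short.

```python
import math
from statistics import mean, variance

def independent_t(group_a, group_b):
    """Pooled-variance t statistic and degrees of freedom for two independent groups."""
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)) / df
    standard_error = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean(group_a) - mean(group_b)) / standard_error, df

# HYPOTHETICAL pain relief scores (higher = more relief), for illustration only.
drug_a = [6, 7, 5, 8, 6]
drug_b = [4, 5, 6, 4, 5]

t_stat, df = independent_t(drug_a, drug_b)
T_CRITICAL = 2.306  # two-tailed, alpha = 0.05, df = 8 (standard t-table)

print(f"t = {t_stat:.3f}, df = {df}, reject H0: {abs(t_stat) > T_CRITICAL}")
```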
Paired samples t-Test
• The t-test for dependent samples is used to compare the means
of two dependent groups.
We want to know how effective a diet is.
To do this, we weigh 30 people before the
diet and exactly the same people after the
diet.

Now we can see for each person how big
the weight difference is
between before and after. With a
dependent t-test we can now check
whether there is a significant difference.
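
Because the same people are measured twice, the paired test works on the per-person differences. The sketch below uses made-up weights for 5 people instead of 30; all numbers are hypothetical.

```python
import math
from statistics import mean, stdev

def paired_t(before, after):
    """t statistic and degrees of freedom for paired (dependent) samples."""
    differences = [b - a for b, a in zip(before, after)]
    standard_error = stdev(differences) / math.sqrt(len(differences))
    return mean(differences) / standard_error, len(differences) - 1

# HYPOTHETICAL weights in kg for the same 5 people, for illustration only.
weight_before = [82, 90, 76, 88, 95]
weight_after = [80, 87, 75, 85, 91]

t_stat, df = paired_t(weight_before, weight_after)
T_CRITICAL = 2.776  # two-tailed, alpha = 0.05, df = 4 (standard t-table)

print(f"t = {t_stat:.3f}, df = {df}, reject H0: {abs(t_stat) > T_CRITICAL}")
```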
One-way ANOVA
(Analysis of Variance)
• One-way ANOVA is the extension of the independent t-test to more
than two groups or samples.
• The one-way ANOVA (one factor ANOVA) tests whether there is a
difference between the means of more than 2 groups.
Example
• Imagine you're working in the marketing department of a company
that wants to test which type of advertisement is most effective in
increasing brand awareness.
• You decide to test three different advertising strategies:
• Social Media Ads (e.g., Instagram & Facebook)
• Email Marketing Campaigns
• TV Commercials
Contd..
• You expose three separate groups of 30 customers to one type of
advertisement each. After one week, you measure brand recall (on a
scale of 0 to 10) through a short survey.
• Now, you want to know:
Is there a significant difference in brand recall between the three
advertisement types?
• To answer this, you conduct a One-Way ANOVA, where:
• The independent variable is the type of advertisement (with 3
levels/groups).
• The dependent variable is the brand recall score.
• If the ANOVA result is significant, it suggests that at least one type of
advertisement leads to a different level of brand recall compared to
the others.
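
The F statistic behind this decision compares the variation between group means with the variation within groups. The sketch below uses made-up brand recall scores for 4 customers per group instead of 30; the data and the helper name are hypothetical.

```python
from statistics import mean

def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of groups of scores."""
    all_scores = [x for g in groups for x in g]
    grand_mean = mean(all_scores)
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of individual scores around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_scores) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# HYPOTHETICAL brand recall scores (0 to 10), for illustration only.
social_media = [7, 8, 6, 9]
email = [5, 6, 4, 5]
tv = [6, 7, 7, 8]

f_stat, df_between, df_within = one_way_anova_f([social_media, email, tv])
F_CRITICAL = 4.26  # alpha = 0.05, df = (2, 9), from a standard F-table

print(f"F = {f_stat:.2f}, df = ({df_between}, {df_within}), "
      f"significant: {f_stat > F_CRITICAL}")
```

A significant F only says that at least one group mean differs; it does not say which one.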
Chi-Square Test of Independence
• The Chi-Square Test of Independence is used when two
categorical variables are to be tested for independence.
• The research question that can be answered with the Chi-square
test is: Are the characteristics of gender and ownership of a Netflix
subscription independent of each other?
• In other words: does gender have an influence on whether a person
has a Netflix subscription or not?
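
The test compares the observed counts in a contingency table with the counts expected if the two variables were independent. The sketch below uses a made-up 2×2 table of 100 people; the counts and the helper name are hypothetical.

```python
def chi_square_independence(table):
    """Return (chi-square statistic, degrees of freedom) for a contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Count expected in this cell if rows and columns were independent.
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, df

# HYPOTHETICAL counts: rows = gender, columns = Netflix subscription (yes, no).
observed = [
    [30, 20],  # male
    [20, 30],  # female
]

chi2, df = chi_square_independence(observed)
CHI2_CRITICAL = 3.841  # alpha = 0.05, df = 1, from a standard chi-square table

print(f"chi-square = {chi2:.2f}, df = {df}, reject independence: {chi2 > CHI2_CRITICAL}")
```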
