Hypothesis Testing


12/9/24, 11:30 PM Practice Problems on Hypothesis Testing – Data Science Duniya

PRACTICE PROBLEMS ON HYPOTHESIS TESTING
April 17, 2022, Ashutosh Tripathi

In this post I have put together practice problems (from my academic study notes) to show how hypothesis testing works in practice. This post is written mostly for learners who want to dive deep into the statistics behind data science. The focus will be on problem solving. For the concepts, please refer to my previous posts on testing of hypothesis.

Prerequisites for understanding the hypothesis testing examples:

Understanding of hypothesis testing concepts

How to use the z-table, t-table and chi-square table.

https://ashutoshtripathi.com/2022/04/17/practice-problems-on-hypothesis-testing/ 1/28
Formula list: [image]

Critical Regions
In hypothesis testing, the critical region is the set of values of the test statistic for which the null hypothesis is rejected, so it is also known as the region of rejection. Its boundary values depend on the level of significance. The infographic below shows the region of rejection (the critical region) and the region of acceptance for a 1% level of significance.

Critical regions in Hypothesis Testing

LoS (α)              α = 1%            α = 5%            α = 10%
Two Tailed Test      (-2.58, +2.58)    (-1.96, +1.96)    (-1.645, +1.645)
Right Tailed Test    +2.33             +1.645            +1.28
Left Tailed Test     -2.33             -1.645            -1.28

critical region values for 1% level of significance
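
The critical values in a z-table can also be reproduced numerically. Below is a minimal sketch using only the Python standard library. The helper name z_critical is hypothetical and is not part of the original post.

```python
from statistics import NormalDist


def z_critical(alpha, tail="two"):
    """Return the critical z value(s) for significance level alpha.

    tail is 'two', 'right' or 'left' (hypothetical helper, for illustration).
    """
    std_normal = NormalDist()  # standard normal: mean 0, standard deviation 1
    if tail == "two":
        z = std_normal.inv_cdf(1 - alpha / 2)
        return (-z, z)
    if tail == "right":
        return std_normal.inv_cdf(1 - alpha)
    if tail == "left":
        return std_normal.inv_cdf(alpha)
    raise ValueError("tail must be 'two', 'right' or 'left'")


# Print one row per level of significance
for alpha in (0.01, 0.05, 0.10):
    low, high = z_critical(alpha, "two")
    print(f"alpha={alpha:.2f}  two-tailed=({low:.3f}, {high:.3f})  "
          f"right={z_critical(alpha, 'right'):.3f}  "
          f"left={z_critical(alpha, 'left'):.3f}")
```

The printed values should agree with a standard z-table up to rounding.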

Question 1

A telecom service provider claims that individual customers pay on average Rs. 400 per month, with a standard deviation of Rs. 25. A random sample of 50 customers' bills during a given month has a mean of Rs. 250 and a standard deviation of Rs. 15. What can we say about the claim made by the service provider?

Solution:
First things first, note down what is given in the question:

H0 (Null Hypothesis): μ = 400
H1 (Alternate Hypothesis): μ ≠ 400 (not equal means either μ > 400 or μ < 400, hence it will be validated with a two-tailed test)
σ = 25 (population standard deviation)

LoS (α) = 5% (take 5% if not given in the question)

n = 50 (sample size)
x̄ = 250 (sample mean)
s = 15 (sample standard deviation)

n >= 30, hence we will go with the z-test. Since the population standard deviation σ is given, it is used in the formula rather than s.

Step 1:
Calculate z using the z-test formula as below:

z = (x̄ - μ) / (σ/√n)

z = (250 - 400) / (25/√50)
z = -150 / 3.536
z = -42.43

Step 2:
Get the z critical values from the z-table for α = 5%:
z critical values = (-1.96, +1.96)
To accept the claim, the calculated z should lie between the critical values:
-1.96 < z < +1.96

But the calculated z (-42.43) < -1.96, which means we reject the null hypothesis. With the given data, the claim that customers pay Rs. 400 per month on average is not supported.

z-test example 1
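
The calculation for Question 1 can be sketched in a few lines of Python. This is only an illustration of the one-sample z-test with the numbers from the question. The function name one_sample_z is an assumption made for this sketch.

```python
import math
from statistics import NormalDist


def one_sample_z(sample_mean, pop_mean, sd, n):
    """z statistic for a one-sample test of the mean (hypothetical helper)."""
    return (sample_mean - pop_mean) / (sd / math.sqrt(n))


# Values from Question 1: claimed mean 400, sigma 25, n = 50, sample mean 250
z = one_sample_z(sample_mean=250, pop_mean=400, sd=25, n=50)

# Two-tailed critical value at alpha = 5%
z_crit = NormalDist().inv_cdf(1 - 0.05 / 2)

# Reject H0 when the statistic falls outside (-z_crit, +z_crit)
reject_h0 = abs(z) > z_crit

print(f"z = {z:.2f}, critical values = ±{z_crit:.2f}, reject H0: {reject_h0}")
```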

Question 2
From the data available, it is observed that 400 out of 850 customers purchased the groceries online. Can we say that most of the customers are moving towards online shopping even for groceries?

Solution:

Note down what is given:

400 out of 850, which indicates that this is a proportion problem.

Sample proportion p (small p) = 400/850 = 0.47

H0 (Null Hypothesis): P (capital P) >= 0.5 (the claim is that most of the customers are moving towards online shopping even for groceries, which means at least 50% should do online shopping)
H1 (Alternate Hypothesis): P < 0.5 (left-tailed test)

n = 850
LoS (α) = 5% (assume 5% as it is not given in the question)

n >= 30, hence we will go with the z-test

Step 1:
Calculate the z value using the z-test formula, where Q = 1 - P:
z = (p - P)/√(P*Q/n)
z = (0.47 - 0.50)/√(0.5*0.5/850)
z = -0.03/0.01715
z = -1.75 (with the unrounded p = 400/850, z = -1.71; the conclusion is the same)

Step 2:
Get the z critical value from the z-table for α = 5%.
From the z-table, for α = 5%, z = -1.645 (one value, as it is a one (left) tailed problem)

z (calculated) -1.75 < -1.645 (z from the z-table with α = 5%)

Conclusion:
Hence we reject the null hypothesis. That means, with the given data, we can conclude at the 5% level of significance that most of the customers are not moving towards online shopping for groceries.
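
As a rough check of the one-proportion z-test, here is a small Python sketch. The helper name one_proportion_z is hypothetical. The code uses the unrounded sample proportion 400/850, so the statistic differs slightly from a hand calculation that rounds p to 0.47.

```python
import math
from statistics import NormalDist


def one_proportion_z(successes, n, p0):
    """z statistic for a one-sample proportion test against p0."""
    p_hat = successes / n                          # sample proportion
    standard_error = math.sqrt(p0 * (1 - p0) / n)  # SE under the null value p0
    return (p_hat - p0) / standard_error


# Values from Question 2: 400 online buyers out of 850 customers, P = 0.5
z = one_proportion_z(successes=400, n=850, p0=0.5)

# Left-tailed critical value at alpha = 5%
z_crit = NormalDist().inv_cdf(0.05)

# The statistic is in the left-tail rejection region when z < z_crit
in_rejection_region = z < z_crit

print(f"z = {z:.3f}, critical value = {z_crit:.3f}, "
      f"in rejection region: {in_rejection_region}")
```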

Question 3
It is found that there are 250 errors in 1000 randomly selected lines of code from Team A and 300 errors in 800 lines of code from Team B. Can we assume that Team B's performance is superior to that of Team A?

Solution:

Note down what is given in the question:

There are two samples: Team A and Team B.

For each team a proportion is given, in terms of lines with errors out of the total lines of code.
Hence this problem can be solved using the two-proportion z-test.

For one- or two-proportion problems we use the z-test. (In the case of multiple proportions we use the χ², that is chi-square, test.)

Team A (Sample A):

proportion pA (small p) = 250/1000 = 0.25
nA = 1000

Team B (Sample B):

proportion pB (small p) = 300/800 = 0.375
nB = 800

Take α = 5% (assume α = 5% if not given in the question)

Claim: Team B's performance is superior to Team A's, which means Team B's error proportion is lower than Team A's. The claim goes into the alternate hypothesis:

H0 (Null Hypothesis): PB >= PA (Team B's population error proportion is not lower than Team A's)
H1 (Alternate Hypothesis): PB < PA, that is PA - PB > 0 (one right-tailed test on pA - pB)

Step 1:
Calculate the z value from the two-proportion z-test formula as below:

z = (pA - pB)/√[p^(1 - p^)(1/nA + 1/nB)]

where p^ (p hat) = (nA*pA + nB*pB)/(nA + nB)

p^ (p hat) = (1000*0.25 + 800*0.375) / (1000 + 800) = 550/1800 = 0.3056

z = (0.25 - 0.375) / √[0.3056*(1 - 0.3056)*(1/1000 + 1/800)]

z = -0.125/0.02185
z = -5.72

Step 2:

Get the z critical value from the z-table for α = 5% (right-tailed), which is z = +1.645.

Now the calculated z is -5.72 < +1.645, so it does not fall in the rejection region.

Hence we fail to reject the null hypothesis, which means the given data do not support the claim that Team B's performance is better than Team A's. In fact, the sample error proportion of Team B (0.375) is higher than that of Team A (0.25).
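
The two-proportion z statistic with a pooled proportion can be sketched as follows. The function name two_proportion_z and the variable names are assumptions for this illustration. The sketch only reports whether the statistic exceeds the right-tail critical value.

```python
import math
from statistics import NormalDist


def two_proportion_z(x_a, n_a, x_b, n_b):
    """z statistic for pA - pB using the pooled proportion (hypothetical helper)."""
    p_a = x_a / n_a
    p_b = x_b / n_b
    p_pooled = (x_a + x_b) / (n_a + n_b)  # same as (nA*pA + nB*pB)/(nA + nB)
    standard_error = math.sqrt(p_pooled * (1 - p_pooled) * (1 / n_a + 1 / n_b))
    return (p_a - p_b) / standard_error


# Values from Question 3: errors and lines of code for Team A and Team B
z = two_proportion_z(x_a=250, n_a=1000, x_b=300, n_b=800)

# Right-tailed critical value at alpha = 5%
z_crit = NormalDist().inv_cdf(0.95)

# True only if pA is significantly larger than pB, i.e. B has fewer errors
exceeds_critical = z > z_crit

print(f"z = {z:.2f}, critical value = +{z_crit:.3f}, "
      f"exceeds critical value: {exceeds_critical}")
```

A large positive z would indicate that Team A's error proportion is significantly higher than Team B's. A negative z, as here, points the other way.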

Question 4
Following is the record of the number of accidents that took place on the various days of the week.

Monday    Tuesday    Wednesday    Thursday    Friday    Saturday    Sunday
120       140        200          90          140       120         180

Accidents that took place on the various days of the given week | Data Science Duniya

Can we conclude that accidents are independent of the day of the week?


Solution:

H0 (Null Hypothesis): Accidents are independent of the day of the week
H1 (Alternate Hypothesis): Accidents are not independent of the day of the week

Here each day represents a category, and the observed accident count for each day is its share of the total.
Hence this problem can be categorized as a multi-proportion problem and will be solved using the χ² (chi-square) test.
The table below shows the observed number of accidents in the first column.

We want to validate that accidents are independent of the day of the week. If that is true, the number of accidents on each day should be about the same. Hence we need to calculate the average number of accidents per day, and this is called the expected value in the χ² test.

Take α = 5% (assume α = 5% if not given in the question)

Step 1: Calculate the expected values and the χ² values using the χ² formula, as shown in the table below.


Observed (o)    Expected (e = average of observed values = 990/7)    (o-e)²/e
120             141.43                                               3.25
140             141.43                                               0.01
200             141.43                                               24.26
90              141.43                                               18.70
140             141.43                                               0.01
120             141.43                                               3.25
180             141.43                                               10.52
Total = 990                                                          Total χ² = Σ[(o-e)²/e] = 60.00

χ² calculation example | χ² test in hypothesis testing

Step 2: Use the χ² table for α = 5% and get the χ² critical value. The degrees of freedom are (number of categories - 1) = 7 - 1 = 6.
From the table we get χ² (critical value at α = 5%, 6 degrees of freedom) = 12.592

Step 3: Compare both χ² values.

The chi-square value of 60.00 is much larger than the critical value of 12.59, so the null hypothesis can be rejected.

This means we reject the null hypothesis and accept the alternate hypothesis. With the given data we can conclude, at the 5% level of significance, that accidents are not independent of the day of the week. [This might not look realistic, but it is what the given data conclude.]
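
The χ² statistic for this kind of problem (equal expected counts in every category) can be recomputed with a short Python sketch. The function name chi_square_uniform is hypothetical. The critical value is hard-coded from a standard χ² table for 6 degrees of freedom at α = 5%, because the Python standard library has no chi-square distribution.

```python
def chi_square_uniform(observed):
    """Chi-square statistic when every category has the same expected count."""
    expected = sum(observed) / len(observed)  # average of the observed values
    return sum((o - expected) ** 2 / expected for o in observed)


# Accidents from Monday to Sunday (Question 4)
observed = [120, 140, 200, 90, 140, 120, 180]

chi2 = chi_square_uniform(observed)
df = len(observed) - 1      # degrees of freedom = categories - 1
critical_5pct = 12.592      # chi-square table value for df = 6, alpha = 5%

reject_h0 = chi2 > critical_5pct

print(f"chi-square = {chi2:.2f}, df = {df}, "
      f"critical value = {critical_5pct}, reject H0: {reject_h0}")
```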

Question 5
Analyze the data below and tell whether you can conclude that smoking causes cancer or not.

Category       Diagnosed as Cancer    Without Cancer    Total
Smokers        400                    300               700
Non-Smokers    300                    500               800
Total          700                    800               1500

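The chi-square test checks the independence of two categorical variables. In this question we need to test whether smoking and cancer are independent of or dependent on each other. Hence we will perform the chi-square test. Note that the test can show an association between the two variables, but an association on its own does not establish causation.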

Solution:

Step 1:
H0 (Null Hypothesis): Cancer is not dependent on smoking (the two variables are independent)
H1 (Alternate Hypothesis): Cancer is dependent on smoking

Step 2:
Calculate the expected value for each cell of the table (when the null hypothesis is true).
The expected values specify what the value of each cell of the table would be if there were no association between the two variables.
The formula for computing the expected values requires the sample size, the row totals, and the column totals.

expected value (e) = (row total * column total)/table total

Now let's create another table with both the observed and expected values:

Category       Diagnosed as Cancer                   Without Cancer                        Total
Smokers        o = 400, e = 700*700/1500 = 326.67    o = 300, e = 700*800/1500 = 373.33    700
Non-Smokers    o = 300, e = 800*700/1500 = 373.33    o = 500, e = 800*800/1500 = 426.67    800
Total          700                                   800                                   1500

Step 3:
Calculate the chi-square value:

χ² = Σ[(o-e)²/e]

χ² = (400-326.67)²/326.67 + (300-373.33)²/373.33 + (300-373.33)²/373.33 + (500-426.67)²/426.67

χ² ≈ 16.46 + 14.40 + 14.40 + 12.60

χ² ≈ 57.88 (using the unrounded values)

Step 4:
Decide if χ² is statistically significant.

The final step of the chi-square test of significance is to determine whether the value of the chi-square test statistic is large enough to reject the null hypothesis.

Now check the χ² table for the critical value with α = 5%. For a 2 x 2 table the degrees of freedom are (rows - 1) * (columns - 1) = 1.

So from the table we get χ² (critical value at α = 5%, 1 degree of freedom) = 3.841

The chi-square value of 57.88 is much larger than the critical value of 3.84, so the null hypothesis can be rejected.

This means that, with the given data, it can be concluded at the 5% level of significance that cancer is dependent on smoking, that is, the two variables are associated.

chi square test example 2
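
The expected-value formula (row total * column total)/table total and the χ² sum can be sketched in Python for any table of counts. The function name chi_square_independence is an assumption for this sketch. The critical value is taken from a standard χ² table for 1 degree of freedom.

```python
def chi_square_independence(table):
    """Return (chi-square statistic, degrees of freedom) for a table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)

    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if the two variables were independent
            expected = row_totals[i] * col_totals[j] / grand_total
            chi2 += (observed - expected) ** 2 / expected

    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, df


# Rows: Smokers, Non-Smokers; columns: Diagnosed as Cancer, Without Cancer
table = [[400, 300],
         [300, 500]]

chi2, df = chi_square_independence(table)
critical_5pct = 3.841       # chi-square table value for df = 1, alpha = 5%

reject_independence = chi2 > critical_5pct

print(f"chi-square = {chi2:.2f}, df = {df}, "
      f"reject independence: {reject_independence}")
```

Because the code keeps the expected values unrounded, the statistic can differ in the first decimal place from a hand calculation that rounds or truncates them.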

Question 6
It is claimed that the mean of the population is 67. The mean obtained from a random sample of size 100 is 64 with SD 3. Validate the claim at the 5% level of significance.

Solution:
First things first, note down what is given in the question:

H0 (Null Hypothesis): μ = 67
H1 (Alternate Hypothesis): μ ≠ 67 (not equal means either μ > 67 or μ < 67, hence it will be validated with a two-tailed test)

LoS (α) = 5%

n = 100 (sample size)
x̄ = 64 (sample mean)
s = 3 (sample standard deviation)

n >= 30, hence we will go with the z-test

Step 1: Calculate z using the z-test formula as below:

z = (x̄ - μ) / (σ/√n)

z = (64 - 67) / (3/√100) (the population standard deviation is not given in the question, so the sample standard deviation is used in its place)
z = -3/0.3
z = -10

Step 2:

Get the z critical values for α = 5% from the z-table.

From the z-table, z critical values = -1.96, +1.96 (we get two values because it is a two-tailed test)

Step 3:

If the calculated z value lies between the z critical values, accept the null hypothesis; if the calculated z lies outside the z critical values, reject the null hypothesis.

Here, the calculated z value = -10, which is much less than the left-side z critical value of -1.96, hence we will reject the null hypothesis.

Conclusion:
With the given data, it is concluded at the 5% level of significance that the population mean is not equal to 67.


z-test example 4
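
An equivalent way to reach the same decision is to compare a p-value with α instead of comparing z with the critical values. The sketch below is an addition for illustration and is not the method used in the solution above. It relies on the identity that the two-sided p-value of a z statistic equals erfc(|z|/√2). The helper name two_sided_p_value is hypothetical.

```python
import math


def two_sided_p_value(z):
    """Two-sided p-value for a z statistic: 2 * P(Z > |z|) = erfc(|z| / sqrt(2))."""
    return math.erfc(abs(z) / math.sqrt(2))


# Values from Question 6: sample mean 64, claimed mean 67, SD 3, n = 100
z = (64 - 67) / (3 / math.sqrt(100))
p_value = two_sided_p_value(z)

alpha = 0.05
reject_h0 = p_value < alpha  # same decision rule as |z| > 1.96

print(f"z = {z:.2f}, p-value = {p_value:.3g}, reject H0: {reject_h0}")
```

For a two-tailed test at α = 5%, a p-value below 0.05 corresponds exactly to a z value outside (-1.96, +1.96).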

Question 7
There is an assumption that there is no significant difference between boys and girls with respect to intelligence. Tests are conducted on two groups and the following are the observations:

         Mean    Standard Deviation    Size
Girls    75      8                     60
Boys     73      10                    100

Validate the claim with 5% LoS (Level of Significance).

Solution:
First things first, note down what is given in the question:

H0 (Null Hypothesis): No difference between boys and girls in terms of intelligence (μ1 = μ2)
H1 (Alternate Hypothesis): Boys and girls are different in terms of intelligence (μ1 ≠ μ2) => two-tailed test

x1bar = 75 (girls' sample mean)
x2bar = 73 (boys' sample mean)
LoS (α) = 5%

In the question we have two sample means, the girls' sample mean and the boys' sample mean. Hence this can be solved as a two-mean problem.

Next, both sample sizes, n1 = 60 and n2 = 100, are greater than 30, hence we will use the z-test.

Step 1:
Calculate the z value from the two-mean z-test formula as below:

z = [(x1bar - x2bar) - (μ1 - μ2)]/√(s1²/n1 + s2²/n2)

μ1 - μ2 = 0, assuming the null hypothesis is true

z = (75 - 73)/√(8²/60 + 10²/100)
z = 2/1.4376
z = 1.39

Step 2:

Get the z critical values for α = 5% from the z-table.

From the z-table, z critical values = -1.96, +1.96 (we get two values because it is a two-tailed test)

Step 3:

If the calculated z value lies between the z critical values, accept the null hypothesis; if the calculated z lies outside the z critical values, reject the null hypothesis.

Here, the calculated z value lies between the z critical values: -1.96 < 1.39 < 1.96
Hence we will accept (fail to reject) the null hypothesis.

Conclusion:
With the given data, there is no statistically significant difference between the intelligence of boys and girls at the 5% level of significance.


z-test example 5
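
The two-mean z-test can be sketched in Python as below. The function name two_sample_z is hypothetical. Group 1 is taken to be the girls and group 2 the boys, following the table in the question.

```python
import math
from statistics import NormalDist


def two_sample_z(mean1, sd1, n1, mean2, sd2, n2):
    """z statistic for the difference of two means, assuming mu1 - mu2 = 0 under H0."""
    standard_error = math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    return (mean1 - mean2) / standard_error


# Values from Question 7: girls (75, 8, 60) and boys (73, 10, 100)
z = two_sample_z(mean1=75, sd1=8, n1=60, mean2=73, sd2=10, n2=100)

# Two-tailed critical value at alpha = 5%
z_crit = NormalDist().inv_cdf(1 - 0.05 / 2)

reject_h0 = abs(z) > z_crit

print(f"z = {z:.2f}, critical values = ±{z_crit:.2f}, reject H0: {reject_h0}")
```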

Question 8
An automobile tyre manufacturer claims that the average life of a particular grade of tyre is more than 20,000 km. A random sample of 16 tyres has a mean life of 22,000 km with a standard deviation of 5,000 km.

Validate the claim of the manufacturer at 5% LoS.

Solution:
First things first, note down what is given in the question:

H0 (Null Hypothesis): μ <= 20000
H1 (Alternate Hypothesis): μ > 20000 (the manufacturer's claim; greater than, so a one (right) tailed test)

LoS (α) = 5% (take 5% if not given in the question)

n = 16 (sample size)
x̄ = 22000 (sample mean)
s = 5000 (sample standard deviation)

n < 30, hence we will go with the t-test

Step 1:

Calculate the t value from the t-test formula:

t = (x̄ - μ) / (s/√n)
t = (22000 - 20000) / (5000/√16)
t = 2000/1250
t = 1.60

Step 2:

Get the t critical value from the t-table for α = 5% and degrees of freedom = 16 - 1 = 15.
t critical value = 1.753

Step 3:

If t calculated < t critical, accept (fail to reject) the null hypothesis; else reject the null hypothesis.

Here, t calculated 1.60 < t critical 1.753, hence we will accept the null hypothesis.

Conclusion:

From the data given, there is not enough evidence at the 5% level of significance to conclude that the average life of the tyres is more than 20000 km, so the manufacturer's claim is not validated.

t-test example 1
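
The t statistic for a small sample can be checked with the Python sketch below. The helper name one_sample_t is an assumption. The critical value 1.753 is copied from the t-table (one-tailed, α = 5%, 15 degrees of freedom), because the Python standard library does not include the t distribution.

```python
import math


def one_sample_t(sample_mean, hypothesized_mean, sample_sd, n):
    """t statistic for a one-sample test of the mean (hypothetical helper)."""
    return (sample_mean - hypothesized_mean) / (sample_sd / math.sqrt(n))


# Values from Question 8: 16 tyres, mean 22000 km, SD 5000 km, claim > 20000 km
t = one_sample_t(sample_mean=22000, hypothesized_mean=20000,
                 sample_sd=5000, n=16)
df = 16 - 1                 # degrees of freedom = n - 1
t_critical = 1.753          # t-table value: one-tailed, alpha = 5%, df = 15

# True only if the sample mean is significantly above 20000 km
exceeds_critical = t > t_critical

print(f"t = {t:.2f}, df = {df}, critical value = {t_critical}, "
      f"exceeds critical value: {exceeds_critical}")
```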

That is all for now. Please share your thoughts using the comment section below.
