
Hypothesis Testing in Machine Learning
Hypothesis
Null and Alternative Hypothesis
One-Tailed and Two-Tailed Tests
Z-Test
T-Test
Chi-Square Test
ANOVA Test
SciPy Library Functions for Tests

Smitesh Tamboli
Hypothesis Testing
A hypothesis is a claim or belief. Hypothesis testing is a statistical process for rejecting
or retaining a claim, belief, or association related to a business context, product, service,
etc. It plays an important role in providing evidence of an association between an outcome
variable and predictor variables.

Example:
The new version of the eComm website has a better conversion rate.
The cash-on-delivery payment method increases sales.
The average annual salary of machine learning experts differs for males and females.

Null and Alternative Hypothesis


The Null hypothesis is the default or baseline assumption that there is no effect, no
difference, or no relationship between variables. It is denoted as H0. The Null hypothesis is
the claim that is assumed to be true initially.
The alternative hypothesis is the complement of the Null hypothesis. The alternative
hypothesis is the statement that indicates the presence of an effect, a difference, or a
relationship between variables. It is denoted as HA.
Example:
The new version of the eComm website has a better conversion rate
Null Hypothesis (H0): The conversion rate of the new website version is equal to the
conversion rate of the old website version.
Alternative Hypothesis (HA): The conversion rate of the new website version is better or
higher than the conversion rate of the old website version.

The cash-on-delivery payment method increases sales
Null Hypothesis (H0): The sales amount with the cash-on-delivery payment method is
equal to the sales amount with other payment methods.
Alternative Hypothesis (HA): The sales amount with the cash-on-delivery payment
method is higher than the sales amount with other payment methods.

The average annual salary of machine learning experts differs for males and females.
Null Hypothesis (H0): The average annual salary of male machine learning experts is
equal to the average annual salary of female machine learning experts.
Alternative Hypothesis (HA): The average annual salary of male machine learning
experts is different from the average annual salary of female machine learning experts.

Test Statistic, P-Value and Significance Level
Test Statistic
A test statistic is a standardized value that measures the distance between the observed
sample statistic and the parameter value specified in the null hypothesis. It shows how far
the sample data is from what we would expect if the null hypothesis were true, and it is the
value from which the p-value (probability value) is calculated.

P-Value
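The P-value is the conditional probability of observing a test statistic at least as extreme
as the one calculated from the sample, given that the null hypothesis is true. A small P-value
means the observed data would be unlikely if the null hypothesis were true, so it counts as
evidence against the null hypothesis.

P-value = P(Observing a test statistic at least as extreme as the sample value | Null hypothesis is true)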

Significance Level
The primary task in hypothesis testing is to decide whether to reject or fail to reject the
Null hypothesis. The significance level, denoted α, provides the criterion for this decision:
the calculated P-value is compared with it.
The significance level is the maximum threshold for the P-value at which the Null hypothesis
is still rejected. Usually, α = 0.05. A low value such as 0.05 is chosen because we start
the process of hypothesis testing with the assumption that the Null hypothesis is true.
Unless there is strong evidence against this assumption, we will not reject the Null
hypothesis.

[Figure: distribution with the rejection regions marked]

Criteria         Decision
P-value < α      Reject the Null hypothesis
P-value >= α     Retain or fail to reject the Null hypothesis
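
The decision rule can also be expressed through the rejection region. The sketch below is only an illustration, assuming a two-tailed test whose statistic follows a standard normal distribution; the helper name `decide` and the two observed z values are hypothetical and not part of the original slides.

```python
from scipy.stats import norm

alpha = 0.05  # significance level used in the slides

# Two-tailed critical value: alpha is split equally between the two tails.
z_critical = norm.ppf(1 - alpha / 2)

def decide(z_observed, alpha=0.05):
    """Return the two-tailed p-value and the decision for a z statistic."""
    p_value = 2 * norm.sf(abs(z_observed))
    decision = "reject H0" if p_value < alpha else "retain H0"
    return p_value, decision

# Hypothetical statistics: one outside and one inside the rejection region.
for z in (1.2, 2.5):
    p_value, decision = decide(z, alpha)
    in_region = abs(z) > z_critical
    print(f"z={z}: p-value={p_value:.4f}, in rejection region={in_region}, {decision}")
```

Comparing the P-value with the significance level and checking whether the statistic falls in the rejection region lead to the same decision.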

One-Tailed and Two-Tailed Test


One-Tailed Test
A one-tailed test is a statistical test where the critical region lies on one side (either
left or right) of the distribution.
The alternative hypothesis specifies that the parameter is either greater than or less than a
certain value, but not both. The critical region for rejecting the null hypothesis is therefore
located entirely in one tail of the distribution, either the left tail or the right tail.

Right-Tailed Test
In a right-tailed test, the critical region for rejecting the Null hypothesis is located in the
right tail of the probability distribution. This test is used when the alternative hypothesis HA
specifies that the parameter of interest is greater than a certain value.

Examples:
A pharmaceutical company claims that their new drug increases patient recovery rate more
than the standard treatment.
Null Hypothesis H0: The new drug does not increase the recovery rate more than the
standard treatment.
Alternate Hypothesis HA: The new drug increases the recovery rate more than the
standard treatment.

Examples:
A new marketing campaign increases the average number of customers visiting their website
daily compared to the old campaign.
Null Hypothesis H0: The new marketing campaign does not increase the average number
of daily website visitors compared to the old campaign.
Alternate Hypothesis HA: The new marketing campaign increases the average number of
daily website visitors compared to the old campaign.

Left-Tailed Test
In a left-tailed test, the critical region for rejecting the Null hypothesis is located in the left
tail of the probability distribution. This test is used when the alternative hypothesis HA
specifies that the parameter of interest is less than a certain value.

Examples:
An educational researcher wants to determine if a new teaching method decreases the
average number of student failures compared to the traditional teaching method.
Null Hypothesis H0: The new teaching method does not decrease the average number of
student failures compared to the traditional method.
Alternate Hypothesis HA: The new teaching method decreases the average number of
student failures compared to the traditional method.

Two-Tailed Test
A two-tailed test is a type of statistical test where the critical region falls in both
tails (left and right) of the distribution. This test is used when the alternative
hypothesis does not specify the direction of the effect but states that there is a difference.
If the test statistic falls into either of these regions, the Null hypothesis is rejected.
The significance level α is split equally between the two tails, so each tail has an area of α/2.

Examples:
A researcher wants to test whether a new diet program has a different effect on weight loss
compared to the standard diet program, without specifying if it is more or less effective.
Null Hypothesis H0: The new diet program has the same effect on weight loss as the
standard diet program.
Alternate Hypothesis HA: The new diet program has a different effect on weight loss
compared to the standard diet program.

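Hypothesis Condition     Tailed-Test
HA: μ > μ0               Right-Tailed Test
HA: μ < μ0               Left-Tailed Test
HA: μ ≠ μ0               Two-Tailed Test

(μ is the parameter of interest and μ0 is the value specified in the Null hypothesis.)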
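
To make the difference between the three tests concrete, the following sketch computes the p-value of the same test statistic under each alternative. It assumes a statistic that follows a standard normal distribution; the function name `p_value_from_z` and the observed value 1.8 are hypothetical.

```python
from scipy.stats import norm

def p_value_from_z(z, tail):
    """p-value of a z statistic for a 'right', 'left' or 'two' tailed test."""
    if tail == "right":   # HA: parameter is greater than the hypothesized value
        return norm.sf(z)
    if tail == "left":    # HA: parameter is less than the hypothesized value
        return norm.cdf(z)
    if tail == "two":     # HA: parameter differs from the hypothesized value
        return 2 * norm.sf(abs(z))
    raise ValueError("tail must be 'right', 'left' or 'two'")

z_observed = 1.8  # hypothetical test statistic
for tail in ("right", "left", "two"):
    print(tail, round(p_value_from_z(z_observed, tail), 4))
```

With this statistic the right-tailed p-value is below 0.05 while the two-tailed p-value is above it, which is why the direction of the alternative hypothesis should be fixed before looking at the data.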

Steps To Perform Hypothesis Test

1. Describe the hypothesis in words
e.g. The cash-on-delivery payment method increases sales

2. Define Null and Alternate Hypothesis
Null Hypothesis (H0): The sales amount with the cash-on-delivery payment method is equal
to the sales amount with other payment methods.
Alternative Hypothesis (HA): The sales amount with the cash-on-delivery payment method is
higher than the sales amount with other payment methods.

3. Identify the test statistic for validity of Null Hypothesis
z-test, t-test, chi-square test, ANOVA test

4. Decide the criteria for rejection and retention of Null hypothesis
Define the significance level α.
Generally 0.05 or 5%, but it can vary based on the criticality of the problem.

5. Calculate p-value
A small p-value is evidence against the Null hypothesis.

6. Decide to reject or retain the Null hypothesis
Based on the p-value and significance level, decide to either reject or fail to reject
(retain) the Null hypothesis.
p-value < α -> Reject Null hypothesis
p-value >= α -> Retain Null hypothesis
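
The six steps can be sketched end to end for the cash-on-delivery example. This is only an illustration under stated assumptions: the sales figures are synthetic (generated with NumPy), a two-sample t-test is chosen as the test statistic, and the `alternative="greater"` argument of `stats.ttest_ind` requires SciPy 1.6 or later.

```python
import numpy as np
from scipy import stats

# Steps 1-2: H0: mean COD sales = mean sales with other payment methods;
#            HA: mean COD sales is higher.
rng = np.random.default_rng(0)
cod_sales = rng.normal(loc=55, scale=10, size=25)     # synthetic order amounts
other_sales = rng.normal(loc=50, scale=10, size=25)   # synthetic order amounts

# Step 3: test statistic -> two-sample t-test (population variance unknown).
# Step 4: significance level.
alpha = 0.05

# Step 5: p-value for the right-tailed alternative.
t_stat, p_value = stats.ttest_ind(cod_sales, other_sales, alternative="greater")

# Step 6: decision.
decision = "reject H0" if p_value < alpha else "retain H0"
print(f"t={t_stat:.3f}, p-value={p_value:.4f} -> {decision}")
```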

Type-I and Type-II Errors
In hypothesis testing we end up with two decisions
Reject Null hypothesis
Fail to reject (or retain) Null hypothesis

Type-I Error (False Positive)
The conditional probability of rejecting a Null hypothesis when it is true. A Type-I error
occurs when the Null hypothesis H0 is True, but we incorrectly reject it.
Type-I Error = α = P(Rejecting Null hypothesis | H0 is True)
The significance level α is the probability of committing a Type-I error. The common choice
for α is 0.05, meaning there is a 5% risk of rejecting the Null hypothesis when it is actually True.

Type-II Error (False Negative)
The conditional probability of failing to reject (or retaining) a Null hypothesis when it is
false (that is, when the Alternate hypothesis is true). A Type-II error occurs when the Null
hypothesis H0 is False, but we fail to reject it. It is the error of not detecting a
significant effect or difference when there is one.

Type-II Error = β = P(Retain Null hypothesis | H0 is False)

Decision on Null hypothesis based on hypothesis test

Actual Value of H0    Reject H0                      Retain H0
H0 is True            Type-I Error                   Correct Decision
                      P(Reject H0 | H0 is True)      P(Retain H0 | H0 is True)
H0 is False           Correct Decision               Type-II Error
                      P(Reject H0 | H0 is False)     P(Retain H0 | H0 is False)
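
Both error rates can be estimated by simulation. The sketch below is an illustration only: the sample size, the number of repetitions, and the shifted mean of 0.5 are arbitrary assumptions, and a one-sample t-test stands in for whichever test is actually used.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
alpha = 0.05
n_experiments, sample_size = 2000, 25

# H0 is true by construction: every sample comes from a population with mean 0,
# so each rejection is a Type-I error.
samples = rng.normal(loc=0.0, scale=1.0, size=(n_experiments, sample_size))
_, p_values = stats.ttest_1samp(samples, popmean=0.0, axis=1)
type_1_rate = np.mean(p_values < alpha)

# H0 is false: the true mean is 0.5, so each retained H0 is a Type-II error.
shifted = rng.normal(loc=0.5, scale=1.0, size=(n_experiments, sample_size))
_, p_shifted = stats.ttest_1samp(shifted, popmean=0.0, axis=1)
type_2_rate = np.mean(p_shifted >= alpha)

print(f"Estimated Type-I error rate: {type_1_rate:.3f} (alpha = {alpha})")
print(f"Estimated Type-II error rate: {type_2_rate:.3f}")
```

The estimated Type-I error rate should land close to the chosen significance level, while the Type-II error rate depends on the size of the true effect and on the sample size.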

Z-Test
A z-test is a statistical test used to determine whether there is a significant difference
between the means of two groups, or to test if a sample mean significantly differs from a
known population mean. It is called a z-test because the test statistic follows a standard
normal distribution (z-distribution) under the Null hypothesis.
When to use z-test?
We need to test the value of the population mean, given that the population variance is
known.
The population follows a normal distribution and the population variance is known.
The sample size is large (n > 30) and the population variance is known.

One-sample z-test: used to test whether the sample mean is significantly different from a
known population mean

$$z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}}$$

$\bar{x}$ = Sample Mean
$\mu$ = Population Mean
$\sigma$ = Population Standard Deviation
n = Sample size

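Two-sample z-test: used to test whether the means of two independent samples are
significantly different

$$z = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}}$$

$\bar{x}_1, \bar{x}_2$ = Mean of sample 1 and sample 2
$\sigma_1, \sigma_2$ = s.d. of populations 1 and 2
n1, n2 = sample sizes of sample 1 and sample 2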

Example
Suppose an e-commerce platform receives an average of 100 visitors per day (known
population mean). We want to test if the average number of visitors for a recent sample of
40 days is significantly different from this known average. Verify the claim at significance
level alpha = 0.05.
Null Hypothesis (H0): The average number of visitors for the sample period is not
significantly different from the population mean, i.e. mean daily visitors = 100.
Alternative Hypothesis (HA): The average number of visitors for the sample period is
significantly different from the population mean, i.e. mean daily visitors ≠ 100.

One-Sample Z-Test
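
A minimal sketch of the one-sample z-test for the visitors example follows. The population mean (100), the sample size (40) and the significance level (0.05) come from the example; the observed sample mean and the population standard deviation are hypothetical values chosen for illustration. SciPy has no dedicated z-test function, so the statistic is computed directly from the formula.

```python
import math
from scipy.stats import norm

pop_mean = 100       # known population mean (from the example)
n = 40               # number of sampled days (from the example)
alpha = 0.05         # significance level (from the example)
sample_mean = 105    # hypothetical: observed mean over the 40 days
pop_std = 15         # hypothetical: known population standard deviation

# z = (sample mean - population mean) / (population s.d. / sqrt(n))
z_stat = (sample_mean - pop_mean) / (pop_std / math.sqrt(n))
p_value = 2 * norm.sf(abs(z_stat))   # two-tailed test

decision = "reject H0" if p_value < alpha else "retain H0"
print(f"z={z_stat:.3f}, p-value={p_value:.4f} -> {decision}")
```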

Two-Sample Z-Test
Example
Suppose an e-commerce platform runs two different campaigns to drive traffic to their
website. We want to test if there is a significant difference in the average number of daily
visitors between the two campaigns.

Null Hypothesis (H0): There is no significant difference in the average number of daily
visitors between the two campaigns, i.e. μ1 = μ2.
Alternative Hypothesis (HA): There is a significant difference in the average number of
daily visitors between the two campaigns, i.e. μ1 ≠ μ2.
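
The two-campaign comparison can be sketched in the same way. All numbers below (sample means, population standard deviations and sample sizes) are hypothetical, since the example does not supply data; the statistic is the difference of the sample means divided by its standard error.

```python
import math
from scipy.stats import norm

alpha = 0.05
# Hypothetical summary statistics for the two campaigns.
mean_1, mean_2 = 120, 110    # average daily visitors per campaign
sigma_1, sigma_2 = 20, 25    # known population standard deviations
n_1, n_2 = 50, 50            # number of days observed per campaign

# Standard error of the difference between the two sample means.
std_error = math.sqrt(sigma_1**2 / n_1 + sigma_2**2 / n_2)
z_stat = (mean_1 - mean_2) / std_error
p_value = 2 * norm.sf(abs(z_stat))   # two-tailed test

decision = "reject H0" if p_value < alpha else "retain H0"
print(f"z={z_stat:.3f}, p-value={p_value:.4f} -> {decision}")
```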

T-Test
A T-test is a statistical test used to compare a sample mean with a hypothesized population
mean, or to compare the means of two groups, and determine if they are significantly different.
When to use T-test?
The T-test is used when the population follows a normal distribution and the population
standard deviation is unknown and is estimated from the sample.
It is typically used when the sample size is small (n < 30).

One-Sample T-Test
Used when comparing the mean of a single sample to a known population mean when the
population variance is unknown.
Example: An eComm platform believes that the average number of daily visitors is 150. Test
if the average number of visitors for a sample of 20 days is significantly different from this
value.
H0: The sample mean is not significantly different from the population mean
HA: The sample mean is significantly different from the population mean
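
A short sketch of this example with `stats.ttest_1samp` is shown below. The 20 daily visitor counts are synthetic (generated with NumPy) because the slides do not include the data; the variable names `visitors` and `pop_mean` follow the call used in the summary.

```python
import numpy as np
from scipy import stats

pop_mean = 150   # claimed average daily visitors (from the example)
alpha = 0.05

rng = np.random.default_rng(1)
visitors = rng.normal(loc=160, scale=20, size=20).round()   # synthetic 20-day sample

t_stat, p_value = stats.ttest_1samp(visitors, popmean=pop_mean)

# The same statistic by hand: (sample mean - mu) / (sample s.d. / sqrt(n)).
manual_t = (visitors.mean() - pop_mean) / (visitors.std(ddof=1) / np.sqrt(len(visitors)))

decision = "reject H0" if p_value < alpha else "retain H0"
print(f"t={t_stat:.3f} (manual {manual_t:.3f}), p-value={p_value:.4f} -> {decision}")
```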

Two-Sample T-Test
Two-sample T-test is used to determine if there is a significant difference between the
means of two independent groups.
Example: A dataset contains the marks of Male and Female students. We need to test whether
the mean marks of Male students differ from those of Female students.
H0: The means of Marks for Male and Female students are equal
HA: The means of Marks for Male and Female students are not equal
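
The marks example can be sketched with `stats.ttest_ind`. The marks are synthetic, and the Welch variant (`equal_var=False`) is an addition not mentioned in the slides; it is shown only as an option for when the two groups may have different variances.

```python
import numpy as np
from scipy import stats

alpha = 0.05
rng = np.random.default_rng(7)
male_marks = rng.normal(loc=68, scale=10, size=25)     # synthetic marks
female_marks = rng.normal(loc=72, scale=10, size=25)   # synthetic marks

# Student's t-test: assumes both groups share the same variance.
t_statistic, p_value = stats.ttest_ind(male_marks, female_marks)

# Welch's t-test: does not assume equal variances.
t_welch, p_welch = stats.ttest_ind(male_marks, female_marks, equal_var=False)

decision = "reject H0" if p_value < alpha else "retain H0"
print(f"t={t_statistic:.3f}, p-value={p_value:.4f} -> {decision}")
print(f"Welch: t={t_welch:.3f}, p-value={p_welch:.4f}")
```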

Chi-Square Tests
The chi-square test is a statistical test used to determine whether there is a significant
association between two categorical variables.
Example: Consider a dataset that contains the preferred payment method (Credit Card, Cash,
PayPal) and the satisfaction level (Satisfied, Not Satisfied) of customers on an eComm
platform. We need to test whether there is any association between payment method and
satisfaction level.
H0: There is no association between the mode of payment and the satisfaction level of
customers.
HA: There is an association between the mode of payment and the satisfaction level of
customers.
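
A minimal sketch of this test with `chi2_contingency` follows. The counts in the contingency table are made up for illustration; in practice the table would be built from the dataset, for example by cross-tabulating the two columns.

```python
import numpy as np
from scipy.stats import chi2_contingency

alpha = 0.05
# Rows: Credit Card, Cash, PayPal; columns: Satisfied, Not Satisfied.
# The counts are hypothetical.
contingency_table = np.array([
    [60, 20],
    [45, 35],
    [50, 30],
])

chi2, p, dof, expected = chi2_contingency(contingency_table)

decision = "reject H0" if p < alpha else "retain H0"
print(f"chi2={chi2:.3f}, dof={dof}, p-value={p:.4f} -> {decision}")
print("Expected counts under H0:")
print(expected.round(2))
```

The degrees of freedom equal (rows - 1) x (columns - 1), and `expected` holds the counts that would be expected if payment method and satisfaction level were independent.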

ANOVA Tests
ANOVA or Analysis of Variance is a statistical test used to compare the means of three or
more groups to determine if there are statistically significant differences between them. It
assesses whether the means of several groups are equal or not by examining the variation
between and within groups.
Example: An eComm Platform wants to analyze the effect of different shipping options
(Standard, Express, Same-Day) on customers' purchase amounts.
H0: There is no significant difference between the group means.
HA: There are significant differences between the group means.
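
The shipping example can be sketched with `f_oneway`. The purchase amounts are synthetic, and the group means used to generate them are arbitrary assumptions; the variable names follow the call used in the summary.

```python
import numpy as np
from scipy.stats import f_oneway

alpha = 0.05
rng = np.random.default_rng(3)
# Synthetic purchase amounts for each shipping option.
standard_shipping = rng.normal(loc=50, scale=12, size=30)
express_shipping = rng.normal(loc=55, scale=12, size=30)
same_day_shipping = rng.normal(loc=62, scale=12, size=30)

f_statistic, p_value = f_oneway(standard_shipping, express_shipping, same_day_shipping)

decision = "reject H0" if p_value < alpha else "retain H0"
print(f"F={f_statistic:.3f}, p-value={p_value:.4f} -> {decision}")
```

Rejecting H0 only indicates that at least one group mean differs; it does not say which one.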

Summary
Z-Test
We need to test the value of the population mean, given that the population variance is
known.
The population follows a normal distribution and the population variance is known.
The sample size is large (n > 30) and the population variance is known.

One-sample z-test: $z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}}$
Two-sample z-test: $z = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}}$

T-Test
The T-test is used when the population follows a normal distribution and the
population standard deviation is unknown and is estimated from the sample.
The sample size is small (n < 30).

One-sample t-test: Used when comparing the mean of a single sample to a known population
mean when the population variance is unknown.
t_stat, p_value = stats.ttest_1samp(visitors, popmean=pop_mean)

Two-sample t-test: Used to determine if there is a significant difference between the
means of two independent groups.
t_statistic, p_value = stats.ttest_ind(male_marks, female_marks)

Chi-Square Test
The chi-square test is a statistical test used to determine whether there is a significant
association between two categorical variables.
chi2, p, dof, expected = chi2_contingency(contingency_table)

ANOVA Test
ANOVA is used to compare the means of three or more groups to determine if there are
statistically significant differences between them.

f_statistic, p_value = f_oneway(standard_shipping, express_shipping, same_day_shipping)

Thank You
Smitesh Tamboli
