Hypothesis Testing in Python

Hypothesis testing lets you answer questions about your datasets in a statistically rigorous way. In this course, you'll grow your Python analytical skills as you learn how and when to use common tests like t-tests, proportion tests, and chi-square tests. Working with real-world data, including Stack Overflow user feedback and supply-chain data for medical supply shipments, you'll gain a deep understanding of how these tests work and the key assumptions that underpin them.


Hypothesis tests and z-scores
HYPOTHESIS TESTING IN PYTHON

James Chapman
Curriculum Manager, DataCamp
A/B testing
In 2013, Electronic Arts (EA) released SimCity 5

They wanted to increase pre-orders of the game

They used A/B testing to test different advertising scenarios

This involves splitting users into control and treatment groups

1 Image credit: "Electronic Arts" by majaX1 CC BY-NC-SA 2.0



Retail webpage A/B test
[Screenshots: control webpage (with ad) vs. treatment webpage (without ad)]



A/B test results
The treatment group (no ad) got 43.4% more purchases than the control group (with ad)

The intuition that "showing an ad would increase sales" was false

Was this result statistically significant, or just chance?

We would need EA's data to determine this

Techniques from Sampling in Python plus this course let us do so



Stack Overflow Developer Survey 2020
import pandas as pd
print(stack_overflow)

respondent age_1st_code ... age hobbyist
0 36.0 30.0 ... 34.0 Yes
1 47.0 10.0 ... 53.0 Yes
2 69.0 12.0 ... 25.0 Yes
3 125.0 30.0 ... 41.0 Yes
4 147.0 15.0 ... 28.0 No
... ... ... ... ... ...
2259 62867.0 13.0 ... 33.0 Yes
2260 62882.0 13.0 ... 28.0 Yes

[2261 rows x 8 columns]



Hypothesizing about the mean
A hypothesis:

The mean annual compensation of the population of data scientists is $110,000

The point estimate (sample statistic):

mean_comp_samp = stack_overflow['converted_comp'].mean()

119574.71738168952



Generating a bootstrap distribution
import numpy as np
# Step 3. Repeat steps 1 & 2 many times, appending to a list
so_boot_distn = []
for i in range(5000):
    so_boot_distn.append(
        # Step 2. Calculate point estimate
        np.mean(
            # Step 1. Resample
            stack_overflow.sample(frac=1, replace=True)['converted_comp']
        )
    )

1 Bootstrap distributions are taught in Chapter 4 of Sampling in Python



Visualizing the bootstrap distribution
import matplotlib.pyplot as plt
plt.hist(so_boot_distn, bins=50)
plt.show()



Standard error
std_error = np.std(so_boot_distn, ddof=1)

5607.997577378606



z-scores

standardized value = (value − mean) / standard deviation

z = (sample stat − hypothesized parameter value) / standard error



z = (sample stat − hypothesized parameter value) / standard error
stack_overflow['converted_comp'].mean()

119574.71738168952

mean_comp_hyp = 110000

std_error

5607.997577378606

z_score = (mean_comp_samp - mean_comp_hyp) / std_error

1.7073326529796957




Testing the hypothesis
Is 1.707 a high or low number?
This is the goal of the course!

Hypothesis testing use case:

Determine whether sample statistics are close to or far away from expected (or "hypothesized") values



Standard normal (z) distribution
Standard normal distribution: the normal distribution with mean 0 and standard deviation 1
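As a quick check of this definition (a sketch, not from the slides), scipy's norm object defaults to the standard normal, so its CDF and inverse CDF behave as expected:

```python
from scipy.stats import norm

# scipy.stats.norm defaults to loc=0 (mean) and scale=1 (standard deviation),
# i.e. the standard normal distribution
print(norm.cdf(0))      # 0.5: half the probability lies below the mean
print(norm.cdf(1.96))   # ~0.975: the familiar two-tailed 95% cutoff
print(norm.ppf(0.975))  # ~1.96: ppf() is the inverse of cdf()
```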



Let's practice!

p-values

Criminal trials
Two possible true states:
1. Defendant committed the crime
2. Defendant did not commit the crime

Two possible verdicts:
1. Guilty
2. Not guilty

Initially, the defendant is assumed to be not guilty

The prosecution must present evidence "beyond reasonable doubt" for a guilty verdict


Age of first programming experience
age_first_code_cut classifies when a Stack Overflow user first started programming
"adult" means they started at 14 or older
"child" means they started before 14

Previous research: 35% of software developers started programming as children

Is there evidence that a greater proportion of data scientists started programming as children?


Definitions
A hypothesis is a statement about an unknown population parameter

A hypothesis test is a test of two competing hypotheses

The null hypothesis (H0 ) is the existing idea

The alternative hypothesis (HA ) is the new "challenger" idea of the researcher

For our problem:

H0 : The proportion of data scientists starting programming as children is 35%


HA : The proportion of data scientists starting programming as children is greater than 35%

1 "Naught" is British English for "zero". For historical reasons, "H-naught" is the international convention for pronouncing the null hypothesis.


Criminal trials vs. hypothesis testing
Either HA or H0 is true (not both)
Initially, H0 is assumed to be true

The test ends in either "reject H0 " or "fail to reject H0 "

If the evidence from the sample is "significant" that HA is true, reject H0; else choose H0

The significance level is the hypothesis-testing equivalent of "beyond a reasonable doubt"


One-tailed and two-tailed tests
Hypothesis tests check if the sample statistics
lie in the tails of the null distribution

Test Tails
alternative different from null two-tailed
alternative greater than null right-tailed
alternative less than null left-tailed

HA : The proportion of data scientists starting programming as children is greater than 35%

This is a right-tailed test


p-values
p-value: the probability of obtaining a result at least as extreme as the observed one, assuming the null hypothesis is true

Large p-value → large support for H0; the statistic is likely not in the tail of the null distribution

Small p-value → strong evidence against H0; the statistic is likely in the tail of the null distribution

"p" in p-value → probability
"small" means "close to zero"


Calculating the z-score
prop_child_samp = (stack_overflow['age_first_code_cut'] == "child").mean()

0.39141972578505085

prop_child_hyp = 0.35

std_error = np.std(first_code_boot_distn, ddof=1)

0.010351057228878566

z_score = (prop_child_samp - prop_child_hyp) / std_error

4.001497129152506



Calculating the p-value
norm.cdf() is the normal CDF from scipy.stats.

Left-tailed test → use norm.cdf() .

Right-tailed test → use 1 - norm.cdf() .

from scipy.stats import norm


1 - norm.cdf(z_score, loc=0, scale=1)

3.1471479512323874e-05



Let's practice!

Statistical significance

p-value recap
p-values quantify evidence for the null hypothesis
Large p-value → fail to reject null hypothesis

Small p-value → reject null hypothesis

Where is the cutoff point?



Significance level
The significance level of a hypothesis test (α) is the threshold point for "beyond a
reasonable doubt"

Common values of α are 0.2 , 0.1 , 0.05 , and 0.01

If p ≤ α, reject H0 , else fail to reject H0


α should be set prior to conducting the hypothesis test



Calculating the p-value
alpha = 0.05
prop_child_samp = (stack_overflow['age_first_code_cut'] == "child").mean()
prop_child_hyp = 0.35
std_error = np.std(first_code_boot_distn, ddof=1)

z_score = (prop_child_samp - prop_child_hyp) / std_error

p_value = 1 - norm.cdf(z_score, loc=0, scale=1)

3.1471479512323874e-05



Making a decision
alpha = 0.05
print(p_value)

3.1471479512323874e-05

p_value <= alpha

True

Reject H0 in favor of HA



Confidence intervals
For a significance level of α, it's common to choose a confidence interval level of 1 - α

α = 0.05 → 95% confidence interval

import numpy as np
lower = np.quantile(first_code_boot_distn, 0.025)
upper = np.quantile(first_code_boot_distn, 0.975)
print((lower, upper))

(0.37063246351172047, 0.41132242370632466)



Types of errors

                     Truly didn't commit crime   Truly committed crime
Verdict: not guilty  correct                     they got away with it
Verdict: guilty      wrongful conviction         correct

           actual H0        actual HA
chosen H0  correct          false negative
chosen HA  false positive   correct

False positives are Type I errors; false negatives are Type II errors.
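What α = 0.05 means for Type I errors can be made concrete with a simulation. This sketch (not part of the course; it reuses the one-proportion z-test covered elsewhere in this deck) runs many right-tailed tests on data where H0 is actually true and counts how often H0 is wrongly rejected:

```python
import numpy as np
from scipy.stats import norm

# Simulate many studies where H0 (p = 0.35) is actually true and count how
# often a right-tailed z-test at alpha = 0.05 rejects it anyway (Type I error)
rng = np.random.default_rng(42)
alpha = 0.05
p_0 = 0.35    # hypothesized proportion; also the true proportion here
n = 1000      # observations per simulated study
n_sims = 5000

false_positives = 0
for _ in range(n_sims):
    p_hat = rng.binomial(n, p_0) / n
    z_score = (p_hat - p_0) / np.sqrt(p_0 * (1 - p_0) / n)
    if 1 - norm.cdf(z_score) <= alpha:
        false_positives += 1

print(false_positives / n_sims)  # close to alpha
```

The observed false positive rate hovers near α, which is exactly what the significance level promises under H0.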



Possible errors in our example
If p ≤ α, we reject H0 :

A false positive (Type I) error: data scientists didn't start coding as children at a higher rate

If p > α, we fail to reject H0 :

A false negative (Type II) error: data scientists started coding as children at a higher rate



Let's practice!

Performing t-tests

Two-sample problems
Compare sample statistics across groups of a variable
converted_comp is a numerical variable

age_first_code_cut is a categorical variable with two levels ("child" and "adult")

Are users who first programmed as a child compensated higher than those that started as
adults?



Hypotheses
H0 : The mean compensation (in USD) is the same for those that coded first as a child and
those that coded first as an adult.

H0 : μchild = μadult

H0 : μchild − μadult = 0

HA : The mean compensation (in USD) is greater for those that coded first as a child
compared to those that coded first as an adult.

HA : μchild > μadult

HA : μchild − μadult > 0



Calculating groupwise summary statistics
stack_overflow.groupby('age_first_code_cut')['converted_comp'].mean()

age_first_code_cut
adult 111313.311047
child 132419.570621
Name: converted_comp, dtype: float64



Test statistics
Sample mean estimates the population mean

x̄ - a sample mean
x̄child - sample mean compensation for coding first as a child
x̄adult - sample mean compensation for coding first as an adult
x̄child − x̄adult - a test statistic
z-score - a (standardized) test statistic



Standardizing the test statistic

z = (sample stat − population parameter) / standard error

t = (difference in sample stats − difference in population parameters) / standard error

t = ((x̄_child − x̄_adult) − (μ_child − μ_adult)) / SE(x̄_child − x̄_adult)


Standard error

SE(x̄_child − x̄_adult) ≈ sqrt(s²_child / n_child + s²_adult / n_adult)

s is the standard deviation of the variable

n is the sample size (number of observations/rows in the sample)


Assuming the null hypothesis is true

t = ((x̄_child − x̄_adult) − (μ_child − μ_adult)) / SE(x̄_child − x̄_adult)

H0 : μ_child − μ_adult = 0  →  t = (x̄_child − x̄_adult) / SE(x̄_child − x̄_adult)

t = (x̄_child − x̄_adult) / sqrt(s²_child / n_child + s²_adult / n_adult)


Calculations assuming the null hypothesis is true
xbar = stack_overflow.groupby('age_first_code_cut')['converted_comp'].mean()

age_first_code_cut
adult 111313.311047
child 132419.570621
Name: converted_comp, dtype: float64

s = stack_overflow.groupby('age_first_code_cut')['converted_comp'].std()

age_first_code_cut
adult 271546.521729
child 255585.240115
Name: converted_comp, dtype: float64

n = stack_overflow.groupby('age_first_code_cut')['converted_comp'].count()

age_first_code_cut
adult 1376
child 885
Name: converted_comp, dtype: int64


Calculating the test statistic

t = (x̄_child − x̄_adult) / sqrt(s²_child / n_child + s²_adult / n_adult)

import numpy as np
# Extract the group-wise statistics calculated on the previous slide
xbar_child, xbar_adult = xbar['child'], xbar['adult']
s_child, s_adult = s['child'], s['adult']
n_child, n_adult = n['child'], n['adult']

numerator = xbar_child - xbar_adult
denominator = np.sqrt(s_child ** 2 / n_child + s_adult ** 2 / n_adult)
t_stat = numerator / denominator

1.8699313316221844


Let's practice!

Calculating p-values from t-statistics

t-distributions
The t statistic follows a t-distribution
t-distributions have a parameter named degrees of freedom, or df
They look like normal distributions, but with fatter tails
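The "fatter tails" claim can be checked numerically (a sketch using scipy.stats): for the same cutoff, a t-distribution places more probability in the tail than the standard normal, and the gap shrinks as degrees of freedom grow.

```python
from scipy.stats import norm, t

# Probability of exceeding 2 under t-distributions with growing df,
# compared with the standard normal: fatter tails, converging to normal
for df in [2, 10, 100]:
    print(df, 1 - t.cdf(2, df=df))
print("normal", 1 - norm.cdf(2))
```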



Degrees of freedom
Larger degrees of freedom → the t-distribution gets closer to the normal distribution

Normal distribution → a t-distribution with infinite degrees of freedom

Degrees of freedom: the maximum number of logically independent values in the data sample



Calculating degrees of freedom
Suppose a dataset has 5 independent observations
Four of the values are 2, 6, 8, and 5
The sample mean is 5
Then the last value must be 4
Here, there are 4 degrees of freedom

For the two-sample problem: df = n_child + n_adult − 2
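The slide's arithmetic as a one-liner: with the sample mean fixed, the fifth value is fully determined by the other four, so only four values are free to vary.

```python
# Four known values plus a known sample mean pin down the fifth value:
# the sum of all five must equal n * mean, so the last value is the difference
known_values = [2, 6, 8, 5]
n = 5
sample_mean = 5
last_value = n * sample_mean - sum(known_values)
print(last_value)  # 4
```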



Hypotheses
H0 : The mean compensation (in USD) is the same for those that coded first as a child and
those that coded first as an adult

HA : The mean compensation (in USD) is greater for those that coded first as a child
compared to those that coded first as an adult

Use a right-tailed test



Significance level
α = 0.1

If p ≤ α then reject H0 .



Calculating p-values: one proportion vs. a value
from scipy.stats import norm
1 - norm.cdf(z_score)

SE(x̄_child − x̄_adult) ≈ sqrt(s²_child / n_child + s²_adult / n_adult)

z-statistic: needed when using one sample statistic to estimate a population parameter

t-statistic: needed when using multiple sample statistics to estimate a population parameter


Calculating p-values: two means from different groups
numerator = xbar_child - xbar_adult
denominator = np.sqrt(s_child ** 2 / n_child + s_adult ** 2 / n_adult)
t_stat = numerator / denominator

1.8699313316221844

degrees_of_freedom = n_child + n_adult - 2

2259



Calculating p-values: two means from different groups
Use t-distribution CDF not normal CDF

from scipy.stats import t


1 - t.cdf(t_stat, df=degrees_of_freedom)

0.030811302165157595

Evidence that Stack Overflow data scientists who started coding as a child earn more.



Let's practice!

Paired t-tests

US Republican presidents dataset
state county repub_percent_08 repub_percent_12
0 Alabama Hale 38.957877 37.139882
1 Arkansas Nevada 56.726272 58.983452
2 California Lake 38.896719 39.331367
3 California Ventura 42.923190 45.250693
.. ... ... ... ...
96 Wisconsin La Crosse 37.490904 40.577038
97 Wisconsin Lafayette 38.104967 41.675050
98 Wyoming Weston 76.684241 83.983328
99 Alaska District 34 77.063259 40.789626

[100 rows x 4 columns]

100 rows; each row represents county-level votes in a presidential election.

1 https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/VOQCHQ



Hypotheses
Question: Was the percentage of Republican candidate votes lower in 2008 than 2012?

H0 : μ2008 − μ2012 = 0

HA : μ2008 − μ2012 < 0

Set α = 0.05 significance level.

Data is paired → each voter percentage refers to the same county


Want to capture voting patterns in model



From two samples to one
sample_data = repub_votes_potus_08_12
sample_data['diff'] = sample_data['repub_percent_08'] - sample_data['repub_percent_12']

import matplotlib.pyplot as plt


sample_data['diff'].hist(bins=20)
plt.show()



Calculate sample statistics of the difference
xbar_diff = sample_data['diff'].mean()

-2.877109041242944



Revised hypotheses
Old hypotheses:
H0 : μ_2008 − μ_2012 = 0
HA : μ_2008 − μ_2012 < 0

New hypotheses:
H0 : μ_diff = 0
HA : μ_diff < 0

t = (x̄_diff − μ_diff) / sqrt(s²_diff / n_diff)

df = n_diff − 1


Calculating the p-value

t = (x̄_diff − μ_diff) / sqrt(s²_diff / n_diff),  df = n_diff − 1

n_diff = len(sample_data)

100

s_diff = sample_data['diff'].std()

t_stat = (xbar_diff - 0) / np.sqrt(s_diff**2 / n_diff)

-5.601043121928489

degrees_of_freedom = n_diff - 1

99

from scipy.stats import t
p_value = t.cdf(t_stat, df=n_diff-1)

9.572537285272411e-08


Testing differences between two means using ttest()
import pingouin
pingouin.ttest(x=sample_data['diff'],
y=0,
alternative="less")

T dof alternative p-val CI95% cohen-d \


T-test -5.601043 99 less 9.572537e-08 [-inf, -2.02] 0.560104

BF10 power
T-test 1.323e+05 1.0

1 Details on the Returns from pingouin.ttest() are available in the pingouin API docs at https://pingouin-stats.org/generated/pingouin.ttest.html#pingouin.ttest.



ttest() with paired=True
pingouin.ttest(x=sample_data['repub_percent_08'],
y=sample_data['repub_percent_12'],
paired=True,
alternative="less")

T dof alternative p-val CI95% cohen-d \


T-test -5.601043 99 less 9.572537e-08 [-inf, -2.02] 0.217364

BF10 power
T-test 1.323e+05 0.696338



Unpaired ttest()
pingouin.ttest(x=sample_data['repub_percent_08'],
y=sample_data['repub_percent_12'],
paired=False, # The default
alternative="less")

T dof alternative p-val CI95% cohen-d BF10 \


T-test -1.536997 198 less 0.062945 [-inf, 0.22] 0.217364 0.927

power
T-test 0.454972

Unpaired t-tests on paired data increase the chances of false negative errors
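Why does ignoring the pairing cost power? A small simulation sketch (using scipy.stats rather than pingouin, and made-up data rather than the course dataset) shows the effect: a per-county component shared by both years inflates the unpaired test's standard error, while the paired test subtracts it away.

```python
import numpy as np
from scipy.stats import ttest_rel, ttest_ind

rng = np.random.default_rng(0)
# Hypothetical paired data: a strong county-level component shared by both
# years, plus a true mean shift of 1.5 percentage points between elections
county_effect = rng.normal(50, 15, size=100)
pct_08 = county_effect + rng.normal(0, 2, size=100)
pct_12 = county_effect + 1.5 + rng.normal(0, 2, size=100)

p_paired = ttest_rel(pct_08, pct_12, alternative="less").pvalue
p_unpaired = ttest_ind(pct_08, pct_12, alternative="less").pvalue
print(p_paired, p_unpaired)  # the paired p-value is far smaller
```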



Let's practice!

ANOVA tests

Job satisfaction: 5 categories
stack_overflow['job_sat'].value_counts()

Very satisfied 879
Slightly satisfied 680
Slightly dissatisfied 342
Neither 201
Very dissatisfied 159
Name: job_sat, dtype: int64



Visualizing multiple distributions
Is mean annual compensation different for
different levels of job satisfaction?

import seaborn as sns
import matplotlib.pyplot as plt

sns.boxplot(x="converted_comp",
            y="job_sat",
            data=stack_overflow)
plt.show()



Analysis of variance (ANOVA)
A test for differences between groups

alpha = 0.2

pingouin.anova(data=stack_overflow,
dv="converted_comp",
between="job_sat")

Source ddof1 ddof2 F p-unc np2


0 job_sat 4 2256 4.480485 0.001315 0.007882

0.001315 < α

At least two categories have significantly different compensation



Pairwise tests
μ_very dissatisfied ≠ μ_slightly dissatisfied
μ_very dissatisfied ≠ μ_neither
μ_very dissatisfied ≠ μ_slightly satisfied
μ_very dissatisfied ≠ μ_very satisfied
μ_slightly dissatisfied ≠ μ_neither
μ_slightly dissatisfied ≠ μ_slightly satisfied
μ_slightly dissatisfied ≠ μ_very satisfied
μ_neither ≠ μ_slightly satisfied
μ_neither ≠ μ_very satisfied
μ_slightly satisfied ≠ μ_very satisfied

Set significance level to α = 0.2.



pairwise_tests()
pingouin.pairwise_tests(data=stack_overflow,
dv="converted_comp",
between="job_sat",
padjust="none")

Contrast A B Paired Parametric ... dof alternative p-unc BF10 hedges


0 job_sat Slightly satisfied Very satisfied False True ... 1478.622799 two-sided 0.000064 158.564 -0.192931
1 job_sat Slightly satisfied Neither False True ... 258.204546 two-sided 0.484088 0.114 -0.068513
2 job_sat Slightly satisfied Very dissatisfied False True ... 187.153329 two-sided 0.215179 0.208 -0.145624
3 job_sat Slightly satisfied Slightly dissatisfied False True ... 569.926329 two-sided 0.969491 0.074 -0.002719
4 job_sat Very satisfied Neither False True ... 328.326639 two-sided 0.097286 0.337 0.120115
5 job_sat Very satisfied Very dissatisfied False True ... 221.666205 two-sided 0.455627 0.126 0.063479
6 job_sat Very satisfied Slightly dissatisfied False True ... 821.303063 two-sided 0.002166 7.43 0.173247
7 job_sat Neither Very dissatisfied False True ... 321.165726 two-sided 0.585481 0.135 -0.058537
8 job_sat Neither Slightly dissatisfied False True ... 367.730081 two-sided 0.547406 0.118 0.055707
9 job_sat Very dissatisfied Slightly dissatisfied False True ... 247.570187 two-sided 0.259590 0.197 0.119131

[10 rows x 11 columns]



As the number of groups increases...
[Figure: the number of pairwise comparisons grows rapidly as the number of groups increases]



Bonferroni correction
pingouin.pairwise_tests(data=stack_overflow,
dv="converted_comp",
between="job_sat",
padjust="bonf")

Contrast A B ... p-unc p-corr p-adjust BF10 hedges


0 job_sat Slightly satisfied Very satisfied ... 0.000064 0.000638 bonf 158.564 -0.192931
1 job_sat Slightly satisfied Neither ... 0.484088 1.000000 bonf 0.114 -0.068513
2 job_sat Slightly satisfied Very dissatisfied ... 0.215179 1.000000 bonf 0.208 -0.145624
3 job_sat Slightly satisfied Slightly dissatisfied ... 0.969491 1.000000 bonf 0.074 -0.002719
4 job_sat Very satisfied Neither ... 0.097286 0.972864 bonf 0.337 0.120115
5 job_sat Very satisfied Very dissatisfied ... 0.455627 1.000000 bonf 0.126 0.063479
6 job_sat Very satisfied Slightly dissatisfied ... 0.002166 0.021659 bonf 7.43 0.173247
7 job_sat Neither Very dissatisfied ... 0.585481 1.000000 bonf 0.135 -0.058537
8 job_sat Neither Slightly dissatisfied ... 0.547406 1.000000 bonf 0.118 0.055707
9 job_sat Very dissatisfied Slightly dissatisfied ... 0.259590 1.000000 bonf 0.197 0.119131

[10 rows x 11 columns]



More methods
padjust : string

Method used for testing and adjustment of pvalues.

'none' : no correction [default]

'bonf' : one-step Bonferroni correction

'sidak' : one-step Sidak correction

'holm' : step-down method using Bonferroni adjustments

'fdr_bh' : Benjamini/Hochberg FDR correction

'fdr_by' : Benjamini/Yekutieli FDR correction



Let's practice!

One-sample proportion tests

Chapter 1 recap
Is a claim about an unknown population proportion feasible?

1. Standard error of sample statistic from bootstrap distribution


2. Compute a standardized test statistic

3. Calculate a p-value

4. Decide which hypothesis made most sense

Now, calculate the test statistic without using the bootstrap distribution



Standardized test statistic for proportions
p: population proportion (unknown population parameter)

p̂: sample proportion (sample statistic)

p_0: hypothesized population proportion

z = (p̂ − mean(p̂)) / SE(p̂) = (p̂ − p) / SE(p̂)

Assuming H0 is true, p = p_0, so

z = (p̂ − p_0) / SE(p̂)


Simplifying the standard error calculations

SE(p̂) = sqrt(p_0 × (1 − p_0) / n) → under H0, SE(p̂) depends on the hypothesized p_0 and the sample size n

Assuming H0 is true,

z = (p̂ − p_0) / sqrt(p_0 × (1 − p_0) / n)

This only uses sample information (p̂ and n) and the hypothesized parameter (p_0)


Why z instead of t?

t = (x̄_child − x̄_adult) / sqrt(s²_child / n_child + s²_adult / n_adult)

s is calculated from x̄
x̄ estimates the population mean
s estimates the population standard deviation
→ increased uncertainty in our estimate of the parameter
t-distribution: fatter tails than a normal distribution

p̂ only appears in the numerator, so z-scores are fine


Stack Overflow age categories
H0 : Proportion of Stack Overflow users under thirty = 0.5

HA : Proportion of Stack Overflow users under thirty ≠ 0.5

alpha = 0.01

stack_overflow['age_cat'].value_counts(normalize=True)

Under 30 0.535604
At least 30 0.464396
Name: age_cat, dtype: float64



Variables for z
p_hat = (stack_overflow['age_cat'] == 'Under 30').mean()

0.5356037151702786

p_0 = 0.50

n = len(stack_overflow)

2261



Calculating the z-score

z = (p̂ − p_0) / sqrt(p_0 × (1 − p_0) / n)

import numpy as np
numerator = p_hat - p_0
denominator = np.sqrt(p_0 * (1 - p_0) / n)
z_score = numerator / denominator

3.385911440783663


Calculating the p-value
from scipy.stats import norm

Left-tailed ("less than"):
p_value = norm.cdf(z_score)

Right-tailed ("greater than"):
p_value = 1 - norm.cdf(z_score)

Two-tailed ("not equal"):
p_value = norm.cdf(-z_score) + 1 - norm.cdf(z_score)
p_value = 2 * (1 - norm.cdf(z_score))

0.0007094227368100725

p_value <= alpha

True


Let's practice!

Two-sample proportion tests

Comparing two proportions
H0 : Proportion of hobbyist users is the same for those under thirty as those at least thirty

H0 : p≥30 − p<30 = 0

HA : Proportion of hobbyist users is different for those under thirty than for those at least thirty

HA : p≥30 − p<30 ≠ 0

alpha = 0.05



Calculating the z-score
z-score equation for a proportion test:

z = ((p̂_≥30 − p̂_<30) − 0) / SE(p̂_≥30 − p̂_<30)

Standard error equation:

SE(p̂_≥30 − p̂_<30) = sqrt(p̂ × (1 − p̂) / n_≥30 + p̂ × (1 − p̂) / n_<30)

p̂ → weighted mean of p̂_≥30 and p̂_<30:

p̂ = (n_≥30 × p̂_≥30 + n_<30 × p̂_<30) / (n_≥30 + n_<30)

Only p̂_≥30, p̂_<30, n_≥30, and n_<30 are required from the sample to calculate the z-score


Getting the numbers for the z-score
p_hats = stack_overflow.groupby("age_cat")['hobbyist'].value_counts(normalize=True)

age_cat hobbyist
At least 30 Yes 0.773333
No 0.226667
Under 30 Yes 0.843105
No 0.156895
Name: hobbyist, dtype: float64

n = stack_overflow.groupby("age_cat")['hobbyist'].count()

age_cat
At least 30 1050
Under 30 1211
Name: hobbyist, dtype: int64



Getting the numbers for the z-score
p_hats = stack_overflow.groupby("age_cat")['hobbyist'].value_counts(normalize=True)

age_cat hobbyist
At least 30 Yes 0.773333
No 0.226667
Under 30 Yes 0.843105
No 0.156895
Name: hobbyist, dtype: float64

p_hat_at_least_30 = p_hats[("At least 30", "Yes")]


p_hat_under_30 = p_hats[("Under 30", "Yes")]
print(p_hat_at_least_30, p_hat_under_30)

0.773333 0.843105



Getting the numbers for the z-score
n = stack_overflow.groupby("age_cat")['hobbyist'].count()

age_cat
At least 30 1050
Under 30 1211
Name: hobbyist, dtype: int64

n_at_least_30 = n["At least 30"]


n_under_30 = n["Under 30"]
print(n_at_least_30, n_under_30)

1050 1211



Getting the numbers for the z-score
p_hat = (n_at_least_30 * p_hat_at_least_30 + n_under_30 * p_hat_under_30) / \
        (n_at_least_30 + n_under_30)

std_error = np.sqrt(p_hat * (1 - p_hat) / n_at_least_30 +
                    p_hat * (1 - p_hat) / n_under_30)

z_score = (p_hat_at_least_30 - p_hat_under_30) / std_error
print(z_score)

-4.223718652693034



Proportion tests using proportions_ztest()
stack_overflow.groupby("age_cat")['hobbyist'].value_counts()

age_cat hobbyist
At least 30 Yes 812
No 238
Under 30 Yes 1021
No 190
Name: hobbyist, dtype: int64

n_hobbyists = np.array([812, 1021])


n_rows = np.array([812 + 238, 1021 + 190])
from statsmodels.stats.proportion import proportions_ztest
z_score, p_value = proportions_ztest(count=n_hobbyists, nobs=n_rows,
alternative="two-sided")

(-4.223691463320559, 2.403330142685068e-05)



Let's practice!

Chi-square test of independence

Revisiting the proportion test
age_by_hobbyist = stack_overflow.groupby("age_cat")['hobbyist'].value_counts()

age_cat hobbyist
At least 30 Yes 812
No 238
Under 30 Yes 1021
No 190
Name: hobbyist, dtype: int64

from statsmodels.stats.proportion import proportions_ztest


n_hobbyists = np.array([812, 1021])
n_rows = np.array([812 + 238, 1021 + 190])
stat, p_value = proportions_ztest(count=n_hobbyists, nobs=n_rows,
alternative="two-sided")

(-4.223691463320559, 2.403330142685068e-05)



Independence of variables
Previous hypothesis test result: evidence that hobbyist and age_cat are associated

Statistical independence: the proportion of successes in the response variable is the same across all categories of the explanatory variable



Test for independence of variables
import pingouin
expected, observed, stats = pingouin.chi2_independence(data=stack_overflow, x='hobbyist',
y='age_cat', correction=False)
print(stats)

test lambda chi2 dof pval cramer power


0 pearson 1.000000 17.839570 1.0 0.000024 0.088826 0.988205
1 cressie-read 0.666667 17.818114 1.0 0.000024 0.088773 0.988126
2 log-likelihood 0.000000 17.802653 1.0 0.000025 0.088734 0.988069
3 freeman-tukey -0.500000 17.815060 1.0 0.000024 0.088765 0.988115
4 mod-log-likelihood -1.000000 17.848099 1.0 0.000024 0.088848 0.988236
5 neyman -2.000000 17.976656 1.0 0.000022 0.089167 0.988694

χ² statistic = 17.839570 = (−4.223691463320559)² = (z-score)²
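The same test can be reproduced without pingouin; this sketch uses scipy.stats.chi2_contingency on the observed counts above (with correction=False, matching the proportion test):

```python
import numpy as np
from scipy.stats import chi2_contingency

# Observed hobbyist counts by age category, taken from the slide above
observed = np.array([[812, 238],     # At least 30: Yes, No
                     [1021, 190]])   # Under 30:    Yes, No

chi2, p_value, dof, expected = chi2_contingency(observed, correction=False)
print(chi2, dof, p_value)  # chi2 equals the earlier z-score squared
```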



Job satisfaction and age category
stack_overflow['age_cat'].value_counts()

Under 30 1211
At least 30 1050
Name: age_cat, dtype: int64

stack_overflow['job_sat'].value_counts()

Very satisfied 879
Slightly satisfied 680
Slightly dissatisfied 342
Neither 201
Very dissatisfied 159
Name: job_sat, dtype: int64



Declaring the hypotheses
H0 : Age categories are independent of job satisfaction levels

HA : Age categories are not independent of job satisfaction levels

alpha = 0.1

Test statistic denoted χ2

Assuming independence, how far away are the observed results from the expected values?



Exploratory visualization: proportional stacked bar plot
props = stack_overflow.groupby('job_sat')['age_cat'].value_counts(normalize=True)
wide_props = props.unstack()
wide_props.plot(kind="bar", stacked=True)



[Figure: proportional stacked bar plot of age_cat proportions within each job_sat level]


Chi-square independence test
import pingouin
expected, observed, stats = pingouin.chi2_independence(data=stack_overflow, x="job_sat", y="age_cat")
print(stats)

test lambda chi2 dof pval cramer power


0 pearson 1.000000 5.552373 4.0 0.235164 0.049555 0.437417
1 cressie-read 0.666667 5.554106 4.0 0.235014 0.049563 0.437545
2 log-likelihood 0.000000 5.558529 4.0 0.234632 0.049583 0.437871
3 freeman-tukey -0.500000 5.562688 4.0 0.234274 0.049601 0.438178
4 mod-log-likelihood -1.000000 5.567570 4.0 0.233854 0.049623 0.438538
5 neyman -2.000000 5.579519 4.0 0.232828 0.049676 0.439419

Degrees of freedom:

(No. of response categories − 1) × (No. of explanatory categories − 1)

(2 − 1) × (5 − 1) = 4



Swapping the variables?
props = stack_overflow.groupby('age_cat')['job_sat'].value_counts(normalize=True)
wide_props = props.unstack()
wide_props.plot(kind="bar", stacked=True)



[Figure: the same proportional stacked bar plot with the variables swapped]


chi-square both ways
expected, observed, stats = pingouin.chi2_independence(data=stack_overflow, x="age_cat", y="job_sat")
print(stats[stats['test'] == 'pearson'])

test lambda chi2 dof pval cramer power


0 pearson 1.0 5.552373 4.0 0.235164 0.049555 0.437417

Ask: Are the variables X and Y independent?

Not: Is variable X independent from variable Y?

HYPOTHESIS TESTING IN PYTHON


What about direction and tails?
Squared differences between observed and expected counts are non-negative

chi-square tests are almost always right-tailed 1

1Left-tailed chi-square tests are used in statistical forensics to detect if a fit is suspiciously good because the
data was fabricated. Chi-square tests of variance can be two-tailed. These are niche uses, though.

HYPOTHESIS TESTING IN PYTHON


Let's practice!
HYPOTHESIS TESTING IN PYTHON
Chi-square
goodness of fit tests
HYPOTHESIS TESTING IN PYTHON

James Chapman
Curriculum Manager, DataCamp
Purple links
How do you feel when you discover that you've already visited the top resource?

purple_link_counts = stack_overflow['purple_link'].value_counts()

purple_link_counts = purple_link_counts.rename_axis('purple_link')\
.reset_index(name='n')\
.sort_values('purple_link')

purple_link n
2 Amused 368
3 Annoyed 263
0 Hello, old friend 1225
1 Indifferent 405

HYPOTHESIS TESTING IN PYTHON


Declaring the hypotheses
hypothesized = pd.DataFrame(
    {'purple_link': ['Amused', 'Annoyed', 'Hello, old friend', 'Indifferent'],
     'prop': [1/6, 1/6, 1/2, 1/6]})

         purple_link      prop
0             Amused  0.166667
1            Annoyed  0.166667
2  Hello, old friend  0.500000
3        Indifferent  0.166667

H0 : The sample matches the hypothesized distribution

HA : The sample does not match the hypothesized distribution

χ2 measures how far observed results are from expectations in each group

alpha = 0.01

HYPOTHESIS TESTING IN PYTHON


Hypothesized counts by category
n_total = len(stack_overflow)
hypothesized["n"] = hypothesized["prop"] * n_total

purple_link prop n
0 Amused 0.166667 376.833333
1 Annoyed 0.166667 376.833333
2 Hello, old friend 0.500000 1130.500000
3 Indifferent 0.166667 376.833333

HYPOTHESIS TESTING IN PYTHON


Visualizing counts
import matplotlib.pyplot as plt

plt.bar(purple_link_counts['purple_link'], purple_link_counts['n'],
color='red', label='Observed')
plt.bar(hypothesized['purple_link'], hypothesized['n'], alpha=0.5,
color='blue', label='Hypothesized')

plt.legend()
plt.show()

HYPOTHESIS TESTING IN PYTHON


Visualizing counts

HYPOTHESIS TESTING IN PYTHON


Chi-square goodness of fit test
print(hypothesized)

purple_link prop n
0 Amused 0.166667 376.833333
1 Annoyed 0.166667 376.833333
2 Hello, old friend 0.500000 1130.500000
3 Indifferent 0.166667 376.833333

from scipy.stats import chisquare


chisquare(f_obs=purple_link_counts['n'], f_exp=hypothesized['n'])

Power_divergenceResult(statistic=44.59840778416629, pvalue=1.1261810719413759e-09)
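The decision step follows the usual rule of comparing the p-value to alpha; a sketch using the p-value printed above:

```python
alpha = 0.01

# p-value from the chisquare() call above (copied from the printed result)
p_value = 1.1261810719413759e-09

# Standard decision rule: reject H0 when the p-value is at or below alpha
reject_h0 = p_value <= alpha
print(reject_h0)  # True
```

Here the sample does not match the hypothesized distribution at the 1% significance level.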

HYPOTHESIS TESTING IN PYTHON


Let's practice!
HYPOTHESIS TESTING IN PYTHON
Assumptions in
hypothesis testing
HYPOTHESIS TESTING IN PYTHON

James Chapman
Curriculum Manager, DataCamp
Randomness
Assumption
The samples are random subsets of larger
populations

Consequence
Sample is not representative of population

How to check this


Understand how your data was collected

Speak to the data collector/domain expert

1 Sampling techniques are discussed in "Sampling in Python".

HYPOTHESIS TESTING IN PYTHON


Independence of observations
Assumption
Each observation (row) in the dataset is independent

Consequence
Increased chance of false negative/positive error

How to check this


Understand how your data was collected

HYPOTHESIS TESTING IN PYTHON


Large sample size
Assumption
The sample is big enough to mitigate uncertainty, so that the Central Limit Theorem applies

Consequence
Wider confidence intervals

Increased chance of false negative/positive errors

How to check this


It depends on the test

HYPOTHESIS TESTING IN PYTHON


Large sample size: t-test
One sample
At least 30 observations in the sample
n ≥ 30
n: sample size

Two samples
At least 30 observations in each sample
n1 ≥ 30, n2 ≥ 30
ni : sample size for group i

Paired samples
At least 30 pairs of observations across the samples
Number of rows in our data ≥ 30

ANOVA
At least 30 observations in each sample
ni ≥ 30 for all values of i
HYPOTHESIS TESTING IN PYTHON


Large sample size: proportion tests
One sample
Number of successes in sample is greater than or equal to 10
n × p^ ≥ 10
Number of failures in sample is greater than or equal to 10
n × (1 − p^) ≥ 10
n: sample size
p^: proportion of successes in sample

Two samples
Number of successes in each sample is greater than or equal to 10
n1 × p^1 ≥ 10
n2 × p^2 ≥ 10
Number of failures in each sample is greater than or equal to 10
n1 × (1 − p^1 ) ≥ 10
n2 × (1 − p^2 ) ≥ 10
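A quick sketch of checking the one-sample conditions (the helper name is illustrative):

```python
def proportion_test_ok(n, p_hat, min_count=10):
    """Check the success and failure count conditions for a proportion test."""
    return n * p_hat >= min_count and n * (1 - p_hat) >= min_count

# Hypothetical examples
print(proportion_test_ok(100, 0.5))   # True: 50 successes, 50 failures
print(proportion_test_ok(100, 0.05))  # False: only 5 expected successes
```

For two samples, apply the same check to each group separately.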

HYPOTHESIS TESTING IN PYTHON


Large sample size: chi-square tests
The number of successes in each group is greater than or equal to 5

ni × p^i ≥ 5 for all values of i

The number of failures in each group is greater than or equal to 5

ni × (1 − p^i ) ≥ 5 for all values of i

ni : sample size for group i


p^i : proportion of successes in sample group i

HYPOTHESIS TESTING IN PYTHON


Sanity check
If the bootstrap distribution doesn't look normal, assumptions likely aren't valid

Revisit data collection to check for randomness, independence, and sample size

HYPOTHESIS TESTING IN PYTHON


Let's practice!
HYPOTHESIS TESTING IN PYTHON
Non-parametric
tests
HYPOTHESIS TESTING IN PYTHON

James Chapman
Curriculum Manager, DataCamp
Parametric tests
z-test, t-test, and ANOVA are all parametric tests
Assume a normal distribution

Require sufficiently large sample sizes

HYPOTHESIS TESTING IN PYTHON


Smaller Republican votes data
print(repub_votes_small)

state county repub_percent_08 repub_percent_12


80 Texas Red River 68.507522 69.944817
84 Texas Walker 60.707197 64.971903
33 Kentucky Powell 57.059533 61.727293
81 Texas Schleicher 74.386503 77.384464
93 West Virginia Morgan 60.857614 64.068711

HYPOTHESIS TESTING IN PYTHON


Results with pingouin.ttest()
5 pairs is not enough to meet the sample size condition for the paired t-test:
At least 30 pairs of observations across the samples.

alpha = 0.01
import pingouin
pingouin.ttest(x=repub_votes_potus_08_12_small['repub_percent_08'],
y=repub_votes_potus_08_12_small['repub_percent_12'],
paired=True,
alternative="less")

T dof alternative p-val CI95% cohen-d BF10 power


T-test -5.875753 4 less 0.002096 [-inf, -2.11] 0.500068 26.468 0.239034

HYPOTHESIS TESTING IN PYTHON


Non-parametric tests
Non-parametric tests avoid the parametric assumptions and conditions
Many non-parametric tests use ranks of the data

x = [1, 15, 3, 10, 6]

from scipy.stats import rankdata


rankdata(x)

array([1., 5., 2., 4., 3.])
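A detail worth noting: when values are tied, rankdata assigns the average of the tied ranks by default:

```python
from scipy.stats import rankdata

# The two tied values would occupy ranks 1 and 2, so each gets (1 + 2) / 2
print(rankdata([1, 1, 3]))
# [1.5 1.5 3. ]
```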

HYPOTHESIS TESTING IN PYTHON


Non-parametric tests
Non-parametric tests are more reliable than parametric tests for small sample sizes and
when data isn't normally distributed

HYPOTHESIS TESTING IN PYTHON


Non-parametric tests
Non-parametric tests are more reliable than parametric tests for small sample sizes and
when data isn't normally distributed

Wilcoxon-signed rank test


Developed by Frank Wilcoxon in 1945

One of the first non-parametric procedures

HYPOTHESIS TESTING IN PYTHON


Wilcoxon-signed rank test (Step 1)
Works on the ranked absolute differences between the pairs of data

repub_votes_small['diff'] = (repub_votes_small['repub_percent_08'] -
                             repub_votes_small['repub_percent_12'])
print(repub_votes_small)

state county repub_percent_08 repub_percent_12 diff


80 Texas Red River 68.507522 69.944817 -1.437295
84 Texas Walker 60.707197 64.971903 -4.264705
33 Kentucky Powell 57.059533 61.727293 -4.667760
81 Texas Schleicher 74.386503 77.384464 -2.997961
93 West Virginia Morgan 60.857614 64.068711 -3.211097

HYPOTHESIS TESTING IN PYTHON


Wilcoxon-signed rank test (Step 2)
Works on the ranked absolute differences between the pairs of data

repub_votes_small['abs_diff'] = repub_votes_small['diff'].abs()
print(repub_votes_small)

state county repub_percent_08 repub_percent_12 diff abs_diff


80 Texas Red River 68.507522 69.944817 -1.437295 1.437295
84 Texas Walker 60.707197 64.971903 -4.264705 4.264705
33 Kentucky Powell 57.059533 61.727293 -4.667760 4.667760
81 Texas Schleicher 74.386503 77.384464 -2.997961 2.997961
93 West Virginia Morgan 60.857614 64.068711 -3.211097 3.211097

HYPOTHESIS TESTING IN PYTHON


Wilcoxon-signed rank test (Step 3)
Works on the ranked absolute differences between the pairs of data

from scipy.stats import rankdata


repub_votes_small['rank_abs_diff'] = rankdata(repub_votes_small['abs_diff'])
print(repub_votes_small)

state county repub_percent_08 repub_percent_12 diff abs_diff rank_abs_diff


80 Texas Red River 68.507522 69.944817 -1.437295 1.437295 1.0
84 Texas Walker 60.707197 64.971903 -4.264705 4.264705 4.0
33 Kentucky Powell 57.059533 61.727293 -4.667760 4.667760 5.0
81 Texas Schleicher 74.386503 77.384464 -2.997961 2.997961 2.0
93 West Virginia Morgan 60.857614 64.068711 -3.211097 3.211097 3.0

HYPOTHESIS TESTING IN PYTHON


Wilcoxon-signed rank test (Step 4)
state county repub_percent_08 repub_percent_12 diff abs_diff rank_abs_diff
80 Texas Red River 68.507522 69.944817 -1.437295 1.437295 1.0
84 Texas Walker 60.707197 64.971903 -4.264705 4.264705 4.0
33 Kentucky Powell 57.059533 61.727293 -4.667760 4.667760 5.0
81 Texas Schleicher 74.386503 77.384464 -2.997961 2.997961 2.0
93 West Virginia Morgan 60.857614 64.068711 -3.211097 3.211097 3.0

Incorporate the sum of the ranks for negative and positive differences

import numpy as np

T_minus = 1 + 4 + 5 + 2 + 3
T_plus = 0
W = np.min([T_minus, T_plus])
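The four steps can be collected into one runnable sketch, reusing the diff values from the table above:

```python
import numpy as np
from scipy.stats import rankdata

# The five paired differences (repub_percent_08 - repub_percent_12) from above
diff = np.array([-1.437295, -4.264705, -4.667760, -2.997961, -3.211097])

# Rank the absolute differences, then sum ranks by the sign of the difference
ranks = rankdata(np.abs(diff))
T_minus = ranks[diff < 0].sum()  # 1 + 4 + 5 + 2 + 3 = 15
T_plus = ranks[diff > 0].sum()   # no positive differences here
W = min(T_minus, T_plus)
print(W)  # 0.0
```

This matches the W-val of 0.0 reported by pingouin.wilcoxon() on the next slide.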

HYPOTHESIS TESTING IN PYTHON


Implementation with pingouin.wilcoxon()
alpha = 0.01
pingouin.wilcoxon(x=repub_votes_potus_08_12_small['repub_percent_08'],
y=repub_votes_potus_08_12_small['repub_percent_12'],
alternative="less")

W-val alternative p-val RBC CLES


Wilcoxon 0.0 less 0.03125 -1.0 0.72

Fail to reject H0, since 0.03125 > 0.01

HYPOTHESIS TESTING IN PYTHON


Let's practice!
HYPOTHESIS TESTING IN PYTHON
Non-parametric
ANOVA and
unpaired t-tests
HYPOTHESIS TESTING IN PYTHON

James Chapman
Curriculum Manager, DataCamp
Wilcoxon-Mann-Whitney test
Also known as the Mann-Whitney U test
A t-test on the ranks of the numeric input

Works on unpaired data

HYPOTHESIS TESTING IN PYTHON


Wilcoxon-Mann-Whitney test setup
age_vs_comp = stack_overflow[['converted_comp', 'age_first_code_cut']]

age_vs_comp_wide = age_vs_comp.pivot(columns='age_first_code_cut',
values='converted_comp')

age_first_code_cut adult child


0 77556.0 NaN
1 NaN 74970.0
2 NaN 594539.0
... ... ...
2258 NaN 97284.0
2259 NaN 72000.0
2260 NaN 180000.0

[2261 rows x 2 columns]

HYPOTHESIS TESTING IN PYTHON


Wilcoxon-Mann-Whitney test
alpha=0.01

import pingouin
pingouin.mwu(x=age_vs_comp_wide['child'],
y=age_vs_comp_wide['adult'],
alternative='greater')

U-val alternative p-val RBC CLES


MWU 744365.5 greater 1.902723e-19 -0.222516 0.611258

HYPOTHESIS TESTING IN PYTHON


Kruskal-Wallis test
Kruskal-Wallis test is to Wilcoxon-Mann-Whitney test as ANOVA is to t-test

alpha=0.01

pingouin.kruskal(data=stack_overflow,
dv='converted_comp',
between='job_sat')

Source ddof1 H p-unc


Kruskal job_sat 4 72.814939 5.772915e-15

HYPOTHESIS TESTING IN PYTHON


Let's practice!
HYPOTHESIS TESTING IN PYTHON
Congratulations!
HYPOTHESIS TESTING IN PYTHON

James Chapman
Curriculum Manager, DataCamp
Course recap
Chapter 1
Workflow for testing proportions vs. a hypothesized value
False negative/false positive errors

Chapter 2
Testing differences in sample means between two groups using t-tests
Extending this to more than two groups using ANOVA and pairwise t-tests

Chapter 3
Testing differences in sample proportions between two groups using proportion tests
Using chi-square independence/goodness of fit tests

Chapter 4
Reviewing assumptions of parametric hypothesis tests
Examined non-parametric alternatives when assumptions aren't valid

HYPOTHESIS TESTING IN PYTHON


More courses
Inference
Statistics Fundamentals with Python skill track

Bayesian statistics
Bayesian Data Analysis in Python

Applications
Customer Analytics and A/B Testing in Python

HYPOTHESIS TESTING IN PYTHON


Congratulations!
HYPOTHESIS TESTING IN PYTHON
