BUG: stats.ttest_ind returns wrong p-values with permutations #16459

@jmkuebler

Describe your issue.

When running the two-sample t-test with permutations, we do not make any assumptions about the distributions.
A user rejects the two-sample null hypothesis when the p-value is smaller than or equal to their significance level, and we should guarantee that under the null hypothesis this happens with probability no larger than the significance level.
A p-value of 0 leads to rejection at every significance level, so if it can occur with positive probability under the null hypothesis, that guarantee is violated. Hence, the p-value should never be 0!

But with the current implementation it can be.

Note that if a large number of permutations is used, this does not cause a problem in practice. But an inexperienced user might wrongly reject their null hypothesis because they used few permutations and received a p-value equal to 0.
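To illustrate the point, here is a minimal sketch of one common remedy: counting the observed statistic as one of the permutations, so the p-value becomes (b + 1) / (m + 1), where b is the number of permuted statistics at least as extreme as the observed one and m is the number of permutations. The helper `permutation_pvalue` and its `adjust` flag are hypothetical names made up for this example, and it uses a plain difference in means rather than the t-statistic. It is not SciPy's implementation.

```python
import numpy as np


def permutation_pvalue(x, y, n_permutations, rng, adjust=True):
    """Two-sided permutation p-value for a difference in means.

    Hypothetical helper for illustration only; not part of SciPy.
    """
    pooled = np.concatenate([x, y])
    n_x = len(x)
    observed = abs(np.mean(x) - np.mean(y))

    # Count permuted statistics at least as extreme as the observed one.
    count = 0
    for _ in range(n_permutations):
        perm = rng.permutation(pooled)
        stat = abs(perm[:n_x].mean() - perm[n_x:].mean())
        if stat >= observed:
            count += 1

    if adjust:
        # Treat the observed labelling as one more permutation,
        # so the smallest possible value is 1 / (n_permutations + 1).
        return (count + 1) / (n_permutations + 1)
    # Unadjusted estimate: can be exactly 0 when count == 0.
    return count / n_permutations


rng = np.random.default_rng(0)
x = rng.normal(loc=5, scale=10, size=500)
y = rng.normal(loc=5, scale=10, size=500)

p_raw = permutation_pvalue(x, y, 1, rng, adjust=False)
p_adj = permutation_pvalue(x, y, 1, rng, adjust=True)
print(p_raw, p_adj)
```

With a single permutation, the unadjusted value can only be 0 or 1, while the adjusted value can only be 0.5 or 1, so no significance level below 0.5 can ever lead to a rejection.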

Reproducing Code Example

from scipy import stats
import numpy as np

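rng = np.random.default_rng()

# Both samples are drawn from the same distribution, so the null hypothesis is true.
for i in range(200):
    rvs1 = stats.norm.rvs(loc=5, scale=10, size=500, random_state=rng)
    rvs2 = stats.norm.rvs(loc=5, scale=10, size=500, random_state=rng)
    # With a single permutation, the printed p-value is frequently 0.
    print(stats.ttest_ind(rvs1, rvs2, permutations=1)[1])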

Error message

N/A

SciPy/NumPy/Python version information

1.7.3 1.21.6 sys.version_info(major=3, minor=7, micro=13, releaselevel='final', serial=0)

Labels

defect, scipy.stats
