Classification accuracy as a proxy for two sample testing

Kim, Ilmun; Ramdas, Aaditya; Singh, Aarti; Wasserman, Larry

arXiv:1602.02210 (cs)

[Submitted on 6 Feb 2016 (v1), last revised 17 Feb 2020 (this version, v4)]

Abstract: When data analysts train a classifier and check if its accuracy is significantly different from chance, they are implicitly performing a two-sample test. We investigate the statistical properties of this flexible approach in the high-dimensional setting. We prove two results that hold for all classifiers in any dimension: if a classifier's true error remains $\epsilon$-better than chance for some $\epsilon>0$ as $d,n \to \infty$, then (a) the permutation-based test is consistent (has power approaching one), and (b) a computationally efficient test based on a Gaussian approximation of the null distribution is also consistent. To get a finer understanding of the rates of consistency, we study a specialized setting of distinguishing Gaussians with mean-difference $\delta$ and common (known or unknown) covariance $\Sigma$, when $d/n \to c \in (0,\infty)$. We study variants of Fisher's linear discriminant analysis (LDA) such as "naive Bayes" in a nontrivial regime when $\epsilon \to 0$ (the Bayes classifier has true accuracy approaching 1/2), and contrast their power with corresponding variants of Hotelling's test. Surprisingly, the expressions for their power match exactly in terms of $n,d,\delta,\Sigma$, and the LDA approach is only worse by a constant factor, achieving an asymptotic relative efficiency (ARE) of $1/\sqrt{\pi}$ for balanced samples. We also extend our results to high-dimensional elliptical distributions with finite kurtosis. Other results of independent interest include minimax lower bounds, and the optimality of Hotelling's test when $d=o(n)$. Simulation results validate our theory, and we present practical takeaway messages along with natural open problems.
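The procedure the abstract describes (train a classifier, then ask whether its held-out accuracy beats chance) can be sketched in a few lines. The following is a minimal illustration, not the authors' implementation. The function names, the stratified half/half split, the diagonal-covariance ("naive Bayes") form of LDA, and the normal approximation to a Binomial(m, 1/2) count for the Gaussian-approximation p-value are all assumptions of this sketch, and the paper's exact statistics may differ.

```python
import math
import numpy as np


def fit_naive_lda(X0, X1):
    """Diagonal-covariance LDA (hypothetical helper): returns weights w and offset b."""
    mu0, mu1 = X0.mean(axis=0), X1.mean(axis=0)
    n0, n1 = len(X0), len(X1)
    # Pooled per-coordinate variance; off-diagonal covariances are ignored.
    pooled_var = ((n0 - 1) * X0.var(axis=0, ddof=1)
                  + (n1 - 1) * X1.var(axis=0, ddof=1)) / (n0 + n1 - 2)
    w = (mu1 - mu0) / pooled_var
    b = -w @ (mu0 + mu1) / 2.0
    return w, b


def heldout_accuracy(Z, labels, train_idx, test_idx):
    """Fit on the training rows and return accuracy on the held-out rows."""
    Xtr, ytr = Z[train_idx], labels[train_idx]
    w, b = fit_naive_lda(Xtr[ytr == 0], Xtr[ytr == 1])
    pred = (Z[test_idx] @ w + b > 0).astype(int)
    return float(np.mean(pred == labels[test_idx]))


def classifier_two_sample_test(X, Y, n_perm=200, seed=0):
    """Return (held-out accuracy, permutation p-value, Gaussian-approximation p-value)."""
    rng = np.random.default_rng(seed)
    Z = np.vstack([X, Y])
    labels = np.concatenate([np.zeros(len(X), dtype=int),
                             np.ones(len(Y), dtype=int)])

    # Stratified split: half of each sample for training, the rest for testing.
    idx0 = rng.permutation(len(X))
    idx1 = len(X) + rng.permutation(len(Y))
    train_idx = np.concatenate([idx0[: len(X) // 2], idx1[: len(Y) // 2]])
    test_idx = np.concatenate([idx0[len(X) // 2:], idx1[len(Y) // 2:]])

    acc = heldout_accuracy(Z, labels, train_idx, test_idx)

    # Permutation null: shuffle all labels, then retrain and re-evaluate.
    null_acc = np.array([
        heldout_accuracy(Z, rng.permutation(labels), train_idx, test_idx)
        for _ in range(n_perm)
    ])
    p_perm = (1 + np.sum(null_acc >= acc)) / (1 + n_perm)

    # Gaussian approximation (assumed form): with a balanced test set, the
    # number of correct predictions under the null is treated as roughly
    # Binomial(m, 1/2), so accuracy is roughly N(1/2, 1/(4m)).
    m = len(test_idx)
    z = (acc - 0.5) / math.sqrt(0.25 / m)
    p_gauss = 0.5 * math.erfc(z / math.sqrt(2.0))  # one-sided upper tail
    return acc, float(p_perm), p_gauss
```

Calling `classifier_two_sample_test(X, Y)` on two `(n, d)` arrays returns the held-out accuracy and the two p-values. The permutation route retrains the classifier `n_perm` times, whereas the Gaussian route needs a single fit, which is the computational trade-off the abstract alludes to.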

Comments:	71 pages, 4 figures. Accepted for publication at the Annals of Statistics (2020)
Subjects:	Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Statistics Theory (math.ST); Machine Learning (stat.ML)
Cite as:	arXiv:1602.02210 [cs.LG]
	(or arXiv:1602.02210v4 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.1602.02210

Submission history

From: Aaditya Ramdas
[v1] Sat, 6 Feb 2016 03:48:04 UTC (43 KB)
[v2] Fri, 24 May 2019 22:36:47 UTC (367 KB)
[v3] Fri, 17 Jan 2020 21:29:08 UTC (623 KB)
[v4] Mon, 17 Feb 2020 17:56:24 UTC (645 KB)
