Tea: A High-level Language and Runtime System for Automating Statistical Analysis

Jun, Eunice; Daum, Maureen; Roesch, Jared; Chasins, Sarah E.; Berger, Emery D.; Just, Rene; Reinecke, Katharina

doi:10.1145/3332165.3347940

arXiv:1904.05387 (cs)

[Submitted on 10 Apr 2019]

Abstract: Though statistical analyses are centered on research questions and hypotheses, current statistical analysis tools are not. Users must first translate their hypotheses into specific statistical tests and then perform API calls with functions and parameters. To do so accurately requires that users have statistical expertise. To lower this barrier to valid, replicable statistical analysis, we introduce Tea, a high-level declarative language and runtime system. In Tea, users express their study design, any parametric assumptions, and their hypotheses. Tea compiles these high-level specifications into a constraint satisfaction problem that determines the set of valid statistical tests, and then executes them to test the hypothesis. We evaluate Tea using a suite of statistical analyses drawn from popular tutorials. We show that Tea generally matches the choices of experts while automatically switching to non-parametric tests when parametric assumptions are not met. We simulate the effect of mistakes made by non-expert users and show that Tea automatically avoids both false negatives and false positives that could be produced by the application of incorrect statistical tests.
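As a rough illustration of the idea that a set of valid tests can be derived from declared properties of the data, the sketch below filters a small catalogue of tests by the assumptions each one requires. It is not Tea's API or its actual constraint solver: the names `StatTest`, `CATALOGUE`, and `valid_tests`, the property labels, and the choice of three tests are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StatTest:
    """A candidate test and the properties that must hold for it to be valid."""
    name: str
    requires: frozenset


# Hypothetical catalogue for comparing two independent groups.
# The property labels are illustrative assumptions, not Tea's vocabulary.
CATALOGUE = [
    StatTest("Student's t-test",
             frozenset({"two_groups", "independent", "normal", "equal_variance"})),
    StatTest("Welch's t-test",
             frozenset({"two_groups", "independent", "normal"})),
    StatTest("Mann-Whitney U",
             frozenset({"two_groups", "independent"})),
]


def valid_tests(properties):
    """Return the names of tests whose required properties all hold."""
    held = set(properties)
    return [t.name for t in CATALOGUE if t.requires <= held]


# When normality holds, parametric tests remain candidates.
print(valid_tests({"two_groups", "independent", "normal", "equal_variance"}))

# When normality does not hold, only the non-parametric test survives.
print(valid_tests({"two_groups", "independent"}))
```

A real system would also have to check the properties against the data (for example, by testing for normality) rather than take them as given; this sketch only shows the selection step.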

Comments:	11 pages
Subjects:	Programming Languages (cs.PL); Human-Computer Interaction (cs.HC); Mathematical Software (cs.MS)
Cite as:	arXiv:1904.05387 [cs.PL]
	(or arXiv:1904.05387v1 [cs.PL] for this version)
	https://doi.org/10.48550/arXiv.1904.05387
Related DOI:	https://doi.org/10.1145/3332165.3347940

Submission history

From: Emery Berger
[v1] Wed, 10 Apr 2019 18:44:55 UTC (2,830 KB)
