Tree-Values: selective inference for regression trees

Neufeld, Anna C.; Gao, Lucy L.; Witten, Daniela M.

Statistics > Methodology

arXiv:2106.07816 (stat)

[Submitted on 15 Jun 2021 (v1), last revised 17 Oct 2022 (this version, v2)]

Abstract: We consider conducting inference on the output of the Classification and Regression Tree (CART) [Breiman et al., 1984] algorithm. A naive approach to inference that does not account for the fact that the tree was estimated from the data will not achieve standard guarantees, such as Type 1 error rate control and nominal coverage. Thus, we propose a selective inference framework for conducting inference on a fitted CART tree. In a nutshell, we condition on the fact that the tree was estimated from the data. We propose a test for the difference in the mean response between a pair of terminal nodes that controls the selective Type 1 error rate, and a confidence interval for the mean response within a single terminal node that attains the nominal selective coverage. Efficient algorithms for computing the necessary conditioning sets are provided. We apply these methods in simulation and to a dataset involving the association between portion control interventions and caloric intake.
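The abstract's claim that naive inference fails to control the Type 1 error rate can be illustrated with a small simulation. The sketch below is not the authors' method and does not implement the selective test. It fits a single CART-style split to pure noise and then applies an ordinary two-sample z-test as if the split had been fixed in advance. The function names (`naive_p_value`, `best_split_index`, `rejection_rates`), the known noise standard deviation of 1, and the settings `n=100`, `min_leaf=10` are all assumptions made for illustration.

```python
import math
import numpy as np


def naive_p_value(left, right, sigma=1.0):
    """Two-sided z-test p-value for equal means, ignoring how the groups were chosen."""
    se = sigma * math.sqrt(1.0 / len(left) + 1.0 / len(right))
    z = (left.mean() - right.mean()) / se
    return math.erfc(abs(z) / math.sqrt(2.0))  # equals 2 * (1 - Phi(|z|))


def best_split_index(y_sorted, min_leaf):
    """Left-node size of the single split that minimizes within-node squared error.

    y_sorted holds the responses ordered by the covariate, so a split is a cut point.
    """
    n = len(y_sorted)
    k = np.arange(min_leaf, n - min_leaf + 1)      # candidate left-node sizes
    left_sum = np.cumsum(y_sorted)[k - 1]
    right_sum = y_sorted.sum() - left_sum
    # Minimizing the residual sum of squares is equivalent to maximizing this score.
    score = left_sum**2 / k + right_sum**2 / (n - k)
    return int(k[np.argmax(score)])


def rejection_rates(n_reps=500, n=100, min_leaf=10, alpha=0.05, seed=0):
    """Return (rate with data-driven split, rate with prespecified split) under the null."""
    rng = np.random.default_rng(seed)
    naive_rejections = 0
    fixed_rejections = 0
    for _ in range(n_reps):
        x = rng.uniform(size=n)
        y = rng.normal(size=n)                     # pure noise: y is independent of x
        y_sorted = y[np.argsort(x)]
        k_hat = best_split_index(y_sorted, min_leaf)   # split chosen by looking at y
        k_fix = n // 2                                 # split chosen without looking at y
        naive_rejections += naive_p_value(y_sorted[:k_hat], y_sorted[k_hat:]) < alpha
        fixed_rejections += naive_p_value(y_sorted[:k_fix], y_sorted[k_fix:]) < alpha
    return naive_rejections / n_reps, fixed_rejections / n_reps


if __name__ == "__main__":
    naive_rate, fixed_rate = rejection_rates()
    print(f"rejection rate, data-driven split:  {naive_rate:.3f}")
    print(f"rejection rate, prespecified split: {fixed_rate:.3f}")
```

Under these assumptions the prespecified split should reject at roughly the nominal 5% level, while the data-driven split rejects far more often, because the split was chosen precisely to make the two node means differ. This is the gap that conditioning on the estimated tree is meant to close.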

Subjects:	Methodology (stat.ME); Machine Learning (stat.ML)
Cite as:	arXiv:2106.07816 [stat.ME]
	(or arXiv:2106.07816v2 [stat.ME] for this version)
	https://doi.org/10.48550/arXiv.2106.07816

Submission history

From: Anna Neufeld
[v1] Tue, 15 Jun 2021 00:25:11 UTC (2,495 KB)
[v2] Mon, 17 Oct 2022 18:02:44 UTC (4,425 KB)
