Controlling the FDR in variable selection via multiple knockoffs

Emery, Kristen; Keich, Uri

arXiv:1911.09442 (stat)

[Submitted on 21 Nov 2019 (v1), last revised 22 Nov 2019 (this version, v2)]

Abstract: Barber and Candes recently introduced knockoff+, a feature selection method that controls the false discovery rate (FDR) among the selected features in the classical linear regression problem. Knockoff+ controls the FDR through a competition between the original features and artificially created knockoff features [1]. We generalize Barber and Candes' knockoff construction to generate multiple knockoffs and use them in conjunction with a recently developed general framework for multiple-competition-based FDR control [9].
We prove that, with our initial multiple-knockoff construction, the combined procedure rigorously controls the FDR in the finite-sample setting. Because this construction has somewhat limited utility, we introduce a heuristic we call "batching", which significantly improves the power of our multiple-knockoff procedures.
Finally, we combine the batched knockoffs with a new context-dependent resampling scheme that replaces the generic resampling scheme used in the general multiple-competition setup. Using simulations, we show that the resulting "multi-knockoff-select" procedure empirically controls the FDR in the finite-sample setting of the variable selection problem while often delivering substantially more power than knockoff+.
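
For readers unfamiliar with the competition step that knockoff+ relies on, the following is a minimal sketch of the single-knockoff selection rule of Barber and Candes, not of the multi-knockoff procedure proposed in this paper. It assumes the per-feature statistics `W` have already been computed, with a positive value meaning the original feature beat its knockoff and a negative value meaning the knockoff won. The function name `knockoff_plus_select` and the example values are hypothetical and chosen only for illustration.

```python
import numpy as np


def knockoff_plus_select(W, q):
    """Return indices selected by the knockoff+ threshold rule at target FDR q.

    W[j] > 0: original feature j won the competition against its knockoff.
    W[j] < 0: the knockoff won. How W is computed is outside this sketch.
    """
    W = np.asarray(W, dtype=float)
    # Candidate thresholds are the distinct nonzero magnitudes of W, ascending.
    candidates = np.unique(np.abs(W[W != 0]))
    for t in candidates:
        knockoff_wins = np.sum(W <= -t)   # proxy for the number of false discoveries
        original_wins = np.sum(W >= t)    # number of discoveries at threshold t
        # The "+1" in the numerator is what distinguishes knockoff+ from knockoff.
        estimated_fdr = (1 + knockoff_wins) / max(1, original_wins)
        if estimated_fdr <= q:
            return np.flatnonzero(W >= t)
    # No threshold meets the target level, so nothing is selected.
    return np.array([], dtype=int)


# Illustrative statistics only (not taken from the paper).
example_W = [5, 4, 3, -2, 1.5, 1, -0.5, 6, 7, 8]
selected = knockoff_plus_select(example_W, q=0.2)
```

The rule scans thresholds from smallest to largest and stops at the first one whose estimated FDR falls at or below `q`, so it selects as many features as the target level allows.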

Comments:	Fixed minor linguistic errors in the original submission
Subjects:	Methodology (stat.ME)
Cite as:	arXiv:1911.09442 [stat.ME]
	(or arXiv:1911.09442v2 [stat.ME] for this version)
	https://doi.org/10.48550/arXiv.1911.09442

Submission history

From: Uri Keich
[v1] Thu, 21 Nov 2019 12:41:48 UTC (586 KB)
[v2] Fri, 22 Nov 2019 03:16:18 UTC (586 KB)
