Risk Minimization from Adaptively Collected Data: Guarantees for Supervised and Policy Learning

Bibaut, Aurélien; Chambaz, Antoine; Dimakopoulou, Maria; Kallus, Nathan; van der Laan, Mark

arXiv:2106.01723 (stat)

[Submitted on 3 Jun 2021]

Abstract: Empirical risk minimization (ERM) is the workhorse of machine learning, whether for classification and regression or for off-policy policy learning, but its model-agnostic guarantees can fail when we use adaptively collected data, such as the result of running a contextual bandit algorithm. We study a generic importance sampling weighted ERM algorithm for using adaptively collected data to minimize the average of a loss function over a hypothesis class and provide first-of-their-kind generalization guarantees and fast convergence rates. Our results are based on a new maximal inequality that carefully leverages the importance sampling structure to obtain rates with the right dependence on the exploration rate in the data. For regression, we provide fast rates that leverage the strong convexity of squared-error loss. For policy learning, we provide rate-optimal regret guarantees that close an open gap in the existing literature whenever exploration decays to zero, as is the case for bandit-collected data. An empirical investigation validates our theory.
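
To make the idea of importance sampling weighted ERM concrete, here is a minimal sketch for the squared-error regression case. It is an illustration under stated assumptions, not the authors' implementation: the function name `iw_erm_linear`, the linear hypothesis class, the choice of weights as the ratio of a reference probability to the logging probability, and the synthetic data are all hypothetical and do not come from the paper.

```python
import numpy as np


def iw_erm_linear(X, y, logging_prob, reference_prob=None):
    """Importance-weighted least squares (hypothetical helper).

    Minimizes sum_t w_t * (y_t - x_t @ beta)^2 over beta, where
    w_t = reference_prob_t / logging_prob_t. Here logging_prob_t is assumed
    to be the probability with which the adaptive data-collection policy
    chose the action actually recorded at round t.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    logging_prob = np.asarray(logging_prob, dtype=float)
    if np.any(logging_prob <= 0):
        raise ValueError("logging probabilities must be strictly positive")
    if reference_prob is None:
        # Assumption: an unnormalized constant reference design.
        reference_prob = np.ones_like(logging_prob)
    w = np.asarray(reference_prob, dtype=float) / logging_prob
    # Weighted least squares is ordinary least squares on rows scaled by sqrt(w).
    sw = np.sqrt(w)
    beta, *_ = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)
    return beta


# Synthetic, noise-free data, used only to check the mechanics.
rng = np.random.default_rng(0)
n = 200
X = np.column_stack([np.ones(n), rng.normal(size=n)])
beta_true = np.array([1.0, -2.0])
y = X @ beta_true
# Exploration that decays over rounds, as with bandit-collected data.
logging_prob = np.linspace(0.5, 0.05, n)
beta_hat = iw_erm_linear(X, y, logging_prob)
```

As the logging probabilities shrink, the weights grow, which is why the dependence of the guarantees on the exploration rate matters. A real analysis would use the actual propensities recorded by the bandit algorithm and a hypothesis class suited to the problem.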

Subjects:	Machine Learning (stat.ML); Machine Learning (cs.LG); Statistics Theory (math.ST)
Cite as:	arXiv:2106.01723 [stat.ML]
	(or arXiv:2106.01723v1 [stat.ML] for this version)
	https://doi.org/10.48550/arXiv.2106.01723

Submission history

From: Aurélien Bibaut
[v1] Thu, 3 Jun 2021 09:50:13 UTC (812 KB)
