Showing 1–10 of 10 results for author: Watson, D S

Searching in archive cs.
  1. arXiv:2505.21441

    stat.ML cs.AI cs.LG

    Autoencoding Random Forests

    Authors: Binh Duc Vu, Jan Kapar, Marvin Wright, David S. Watson

    Abstract: We propose a principled method for autoencoding with random forests. Our strategy builds on foundational results from nonparametric statistics and spectral graph theory to learn a low-dimensional embedding of the model that optimally represents relationships in the data. We provide exact and approximate solutions to the decoding problem via constrained optimization, split relabeling, and nearest n…

    Submitted 27 May, 2025; originally announced May 2025.

    Comments: 10 pages main text, 25 pages total. 5 figures main text, 9 figures total

  2. arXiv:2404.04446

    stat.ME cs.AI

    Bounding Causal Effects with Leaky Instruments

    Authors: David S. Watson, Jordan Penn, Lee M. Gunderson, Gecia Bravo-Hermsdorff, Afsaneh Mastouri, Ricardo Silva

    Abstract: Instrumental variables (IVs) are a popular and powerful tool for estimating causal effects in the presence of unobserved confounding. However, classical approaches rely on strong assumptions such as the $\textit{exclusion criterion}$, which states that instrumental effects must be entirely mediated by treatments. This assumption often fails in practice. When IV methods are improperly applied to da…

    Submitted 8 May, 2024; v1 submitted 5 April, 2024; originally announced April 2024.

    Comments: Camera ready version (UAI 2024)

    Journal ref: 40th Conference on Uncertainty in Artificial Intelligence (UAI 2024)

  3. arXiv:2306.05724

    stat.ML cs.LG

    Explaining Predictive Uncertainty with Information Theoretic Shapley Values

    Authors: David S. Watson, Joshua O'Hara, Niek Tax, Richard Mudd, Ido Guy

    Abstract: Researchers in explainable artificial intelligence have developed numerous methods for helping users understand the predictions of complex supervised learning models. By contrast, explaining the $\textit{uncertainty}$ of model outputs has received relatively little attention. We adapt the popular Shapley value framework to explain various types of predictive uncertainty, quantifying each feature's…

    Submitted 31 October, 2023; v1 submitted 9 June, 2023; originally announced June 2023.

    Comments: Camera ready version (NeurIPS 2023)

  4. arXiv:2306.04027

    stat.ML cs.AI cs.LG

    Intervention Generalization: A View from Factor Graph Models

    Authors: Gecia Bravo-Hermsdorff, David S. Watson, Jialin Yu, Jakob Zeitler, Ricardo Silva

    Abstract: One of the goals of causal inference is to generalize from past experiments and observational data to novel conditions. While it is in principle possible to eventually learn a mapping from a novel experimental condition to an outcome of interest, provided a sufficient variety of experiments is available in the training data, coping with a large combinatorial space of possible interventions is hard…

    Submitted 8 November, 2023; v1 submitted 6 June, 2023; originally announced June 2023.

    Comments: Camera ready version (NeurIPS 2023)

  5. Conditional Feature Importance for Mixed Data

    Authors: Kristin Blesch, David S. Watson, Marvin N. Wright

    Abstract: Despite the popularity of feature importance (FI) measures in interpretable machine learning, the statistical adequacy of these methods is rarely discussed. From a statistical perspective, a major distinction is between analyzing a variable's importance before and after adjusting for covariates - i.e., between $\textit{marginal}$ and $\textit{conditional}$ measures. Our work draws attention to thi…

    Submitted 2 May, 2023; v1 submitted 6 October, 2022; originally announced October 2022.

    Journal ref: AStA Advances in Statistical Analysis (2023)

  6. arXiv:2205.09435

    stat.ML cs.AI cs.LG stat.CO

    Adversarial random forests for density estimation and generative modeling

    Authors: David S. Watson, Kristin Blesch, Jan Kapar, Marvin N. Wright

    Abstract: We propose methods for density estimation and data synthesis using a novel form of unsupervised random forests. Inspired by generative adversarial networks, we implement a recursive procedure in which trees gradually learn structural properties of the data through alternating rounds of generation and discrimination. The method is provably consistent under minimal assumptions. Unlike classic tree-b…

    Submitted 13 March, 2023; v1 submitted 19 May, 2022; originally announced May 2022.

    Comments: Camera ready version (AISTATS 2023)

    Journal ref: Proceedings of the 26th International Conference on Artificial Intelligence and Statistics (AISTATS 2023)

  7. arXiv:2205.05715

    stat.ME cs.AI stat.ML

    Causal discovery under a confounder blanket

    Authors: David S. Watson, Ricardo Silva

    Abstract: Inferring causal relationships from observational data is rarely straightforward, but the problem is especially difficult in high dimensions. For these applications, causal discovery algorithms typically require parametric restrictions or extreme sparsity constraints. We relax these assumptions and focus on an important but more specialized problem, namely recovering the causal order among a subgr…

    Submitted 28 June, 2022; v1 submitted 11 May, 2022; originally announced May 2022.

    Comments: Camera ready version (UAI 2022)

    Journal ref: 38th Conference on Uncertainty in Artificial Intelligence (UAI 2022)

  8. Rational Shapley Values

    Authors: David S. Watson

    Abstract: Explaining the predictions of opaque machine learning algorithms is an important and challenging task, especially as complex models are increasingly used to assist in high-stakes decisions such as those arising in healthcare and finance. Most popular tools for post-hoc explainable artificial intelligence (XAI) are either insensitive to context (e.g., feature attributions) or difficult to summarize…

    Submitted 16 May, 2022; v1 submitted 18 June, 2021; originally announced June 2021.

    Comments: To be presented at the 2022 ACM FAccT Conference

    Journal ref: 2022 ACM Conference on Fairness, Accountability, and Transparency

  9. arXiv:2106.05074

    cs.LG stat.ME

    Operationalizing Complex Causes: A Pragmatic View of Mediation

    Authors: Limor Gultchin, David S. Watson, Matt J. Kusner, Ricardo Silva

    Abstract: We examine the problem of causal response estimation for complex objects (e.g., text, images, genomics). In this setting, classical $\textit{atomic}$ interventions are often not available (e.g., changes to characters, pixels, DNA base-pairs). Instead, we only have access to indirect or $\textit{crude}$ interventions (e.g., enrolling in a writing program, modifying a scene, applying a gene therapy). In thi…

    Submitted 10 June, 2021; v1 submitted 9 June, 2021; originally announced June 2021.

    Journal ref: International Conference on Machine Learning 2021

  10. arXiv:1901.09917

    stat.ME cs.LG stat.ML

    Testing Conditional Independence in Supervised Learning Algorithms

    Authors: David S. Watson, Marvin N. Wright

    Abstract: We propose the conditional predictive impact (CPI), a consistent and unbiased estimator of the association between one or several features and a given outcome, conditional on a reduced feature set. Building on the knockoff framework of Candès et al. (2018), we develop a novel testing procedure that works in conjunction with any valid knockoff sampler, supervised learning algorithm, and loss functi…

    Submitted 13 May, 2021; v1 submitted 28 January, 2019; originally announced January 2019.