Showing 1–7 of 7 results for author: Khan, K

Searching in archive stat.
  1. arXiv:2501.16988  [pdf, other]

    stat.ML cs.LG

    Marginal and Conditional Importance Measures from Machine Learning Models and Their Relationship with Conditional Average Treatment Effect

    Authors: Mohammad Kaviul Anam Khan, Olli Saarela, Rafal Kustra

    Abstract: Interpreting black-box machine learning models is challenging due to their strong dependence on data and inherently non-parametric nature. This paper reintroduces the concept of importance through "Marginal Variable Importance Metric" (MVIM), a model-agnostic measure of predictor importance based on the true conditional expectation function. MVIM evaluates predictors' influence on continuous or di…

    Submitted 28 January, 2025; v1 submitted 28 January, 2025; originally announced January 2025.

  2. arXiv:2404.12356  [pdf, other]

    stat.ML cs.LG cs.SI

    Improving the interpretability of GNN predictions through conformal-based graph sparsification

    Authors: Pablo Sanchez-Martin, Kinaan Aamir Khan, Isabel Valera

    Abstract: Graph Neural Networks (GNNs) have achieved state-of-the-art performance in solving graph classification tasks. However, most GNN architectures aggregate information from all nodes and edges in a graph, regardless of their relevance to the task at hand, thus hindering the interpretability of their predictions. In contrast to prior work, in this paper we propose a GNN training approach that j…

    Submitted 18 April, 2024; originally announced April 2024.

  3. arXiv:2301.05743  [pdf, other]

    stat.ME

    Re-thinking Spatial Confounding in Spatial Linear Mixed Models

    Authors: Kori Khan, Candace Berrett

    Abstract: In the last two decades, considerable research has been devoted to a phenomenon known as spatial confounding. Spatial confounding is thought to occur when there is multicollinearity between a covariate and the random effect in a spatial regression model. This multicollinearity is considered highly problematic when the inferential goal is estimating regression coefficients and various methodologies…

    Submitted 21 June, 2024; v1 submitted 13 January, 2023; originally announced January 2023.

    Comments: 38 pages main text; 8 figures; code available upon request

  4. arXiv:2212.09931  [pdf, other]

    stat.CO stat.ML

    A Generalized Variable Importance Metric and Estimator for Black Box Machine Learning Models

    Authors: Mohammad Kaviul Anam Khan, Olli Saarela, Rafal Kustra

    Abstract: In this paper we define a population parameter, "Generalized Variable Importance Metric (GVIM)", to measure importance of predictors for black box machine learning methods, where the importance is not represented by model-based parameter. GVIM is defined for each input variable, using the true conditional expectation function, and it measures the variable's importance in affecting a continuous o…

    Submitted 23 December, 2023; v1 submitted 19 December, 2022; originally announced December 2022.

  5. arXiv:2209.14216  [pdf, other]

    stat.AP

    Shining a Light on Forensic Black-Box Studies

    Authors: Kori Khan, Alicia L. Carriquiry

    Abstract: Forensic science plays a critical role in the United States criminal justice system. For decades, many feature-based fields of forensic science, such as firearm and toolmark identification, developed outside the scientific community's purview. The results of these studies are widely relied on by judges nationwide. However, this reliance is misplaced. Black-box studies to date suffer from inappropr…

    Submitted 1 June, 2023; v1 submitted 28 September, 2022; originally announced September 2022.

  6. arXiv:2111.01285  [pdf, other]

    stat.AP

    Computing with R-INLA: Accuracy and reproducibility with implications for the analysis of COVID-19 data

    Authors: Kori Khan, Hengrui Luo, Wenna Xi

    Abstract: The statistical methods used to analyze medical data are becoming increasingly complex. Novel statistical methods increasingly rely on simulation studies to assess their validity. Such assessments typically appear in statistical or computational journals, and the methodology is later introduced to the medical community through tutorials. This can be problematic if applied researchers use the metho…

    Submitted 1 November, 2021; originally announced November 2021.

  7. Restricted Spatial Regression Methods: Implications for Inference

    Authors: Kori Khan, Catherine A. Calder

    Abstract: The issue of spatial confounding between the spatial random effect and the fixed effects in regression analyses has been identified as a concern in the statistical literature. Multiple authors have offered perspectives and potential solutions. In this paper, for the areal spatial data setting, we show that many of the methods designed to alleviate spatial confounding can be viewed as special cases…

    Submitted 18 August, 2020; v1 submitted 22 May, 2019; originally announced May 2019.

    Comments: Minor notation and typo edits. Primary change is to statement and proof of Theorem 2. Journal of the American Statistical Association (2020)

    Journal ref: Journal of the American Statistical Association, 117:537, 482-494 (2022)