Showing 1–17 of 17 results for author: van Smeden, M

  1. arXiv:2412.10288  [pdf]

    cs.LG stat.ME stat.ML

    Performance evaluation of predictive AI models to support medical decisions: Overview and guidance

    Authors: Ben Van Calster, Gary S. Collins, Andrew J. Vickers, Laure Wynants, Kathleen F. Kerr, Lasai Barreñada, Gael Varoquaux, Karandeep Singh, Karel G. M. Moons, Tina Hernandez-boussard, Dirk Timmerman, David J. Mclernon, Maarten Van Smeden, Ewout W. Steyerberg

    Abstract: A myriad of measures to illustrate performance of predictive artificial intelligence (AI) models have been proposed in the literature. Selecting appropriate performance measures is essential for predictive AI models that are developed to be used in medical practice, because poorly performing models may harm patients and lead to increased costs. We aim to assess the merits of classic and contempora…

    Submitted 13 December, 2024; originally announced December 2024.

    Comments: 60 pages, 8 tables, 11 figures, two supplementary appendices

  2. arXiv:2406.19673  [pdf]

    stat.ME

    Extended sample size calculations for evaluation of prediction models using a threshold for classification

    Authors: Rebecca Whittle, Joie Ensor, Lucinda Archer, Gary S. Collins, Paula Dhiman, Alastair Denniston, Joseph Alderman, Amardeep Legha, Maarten van Smeden, Karel G. Moons, Jean-Baptiste Cazier, Richard D. Riley, Kym I. E. Snell

    Abstract: When evaluating the performance of a model for individualised risk prediction, the sample size needs to be large enough to precisely estimate the performance measures of interest. Current sample size guidance is based on precisely estimating calibration, discrimination, and net benefit, which should be the first stage of calculating the minimum required sample size. However, when a clinically impo…

    Submitted 28 June, 2024; originally announced June 2024.

    Comments: 27 pages, 1 figure

  3. arXiv:2404.19494  [pdf, other]

    stat.ME

    The harms of class imbalance corrections for machine learning based prediction models: a simulation study

    Authors: Alex Carriero, Kim Luijken, Anne de Hond, Karel GM Moons, Ben van Calster, Maarten van Smeden

    Abstract: Risk prediction models are increasingly used in healthcare to aid in clinical decision making. In most clinical contexts, model calibration (i.e., assessing the reliability of risk estimates) is critical. Data available for model development are often not perfectly balanced with respect to the modeled outcome (i.e., individuals with vs. without the event of interest are not equally represented in…

    Submitted 30 April, 2024; originally announced April 2024.

  4. Risk-based decision making: estimands for sequential prediction under interventions

    Authors: Kim Luijken, Paweł Morzywołek, Wouter van Amsterdam, Giovanni Cinà, Jeroen Hoogland, Ruth Keogh, Jesse Krijthe, Sara Magliacane, Thijs van Ommen, Niels Peek, Hein Putter, Maarten van Smeden, Matthew Sperrin, Junfeng Wang, Daniala Weir, Vanessa Didelez, Nan van Geloven

    Abstract: Prediction models are used amongst others to inform medical decisions on interventions. Typically, individuals with high risks of adverse outcomes are advised to undergo an intervention while those at low risk are advised to refrain from it. Standard prediction models do not always provide risks that are relevant to inform such decisions: e.g., an individual may be estimated to be at low risk beca…

    Submitted 29 November, 2023; originally announced November 2023.

    Comments: 32 pages, 2 figures

    Journal ref: Luijken et al. (2024). Risk-Based Decision Making: Estimands for Sequential Prediction Under Interventions. Biometrical Journal, 66: e70011

  5. arXiv:2301.06570  [pdf]

    cs.CL stat.ME

    Cross-institution text mining to uncover clinical associations: a case study relating social factors and code status in intensive care medicine

    Authors: Madhumita Sushil, Atul J. Butte, Ewoud Schuit, Maarten van Smeden, Artuur M. Leeuwenberg

    Abstract: Objective: Text mining of clinical notes embedded in electronic medical records is increasingly used to extract patient characteristics otherwise not or only partly available, to assess their association with relevant health outcomes. As manual data labeling needed to develop text mining models is resource intensive, we investigated whether off-the-shelf text mining models developed at external in…

    Submitted 16 January, 2023; originally announced January 2023.

    MSC Class: 68T50; 68U35; 62-xx; 62P10; 92C60; 92D30
    ACM Class: I.2.7; G.3

  6. arXiv:2207.12892  [pdf]

    stat.ME stat.AP

    Minimum Sample Size for Developing a Multivariable Prediction Model using Multinomial Logistic Regression

    Authors: Alexander Pate, Richard D Riley, Gary S Collins, Maarten van Smeden, Ben Van Calster, Joie Ensor, Glen P Martin

    Abstract: Multinomial logistic regression models allow one to predict the risk of a categorical outcome with more than 2 categories. When developing such a model, researchers should ensure the number of participants (n) is appropriate relative to the number of events (E.k) and the number of predictor parameters (p.k) for each category k. We propose three criteria to determine the minimum n required in light…

    Submitted 26 July, 2022; originally announced July 2022.

  7. arXiv:2206.12295  [pdf]

    stat.ME

    Imputation and Missing Indicators for handling missing data in the development and implementation of clinical prediction models: a simulation study

    Authors: Rose Sisk, Matthew Sperrin, Niels Peek, Maarten van Smeden, Glen P. Martin

    Abstract: Background: Existing guidelines for handling missing data are generally not consistent with the goals of prediction modelling, where missing data can occur at any stage of the model pipeline. Multiple imputation (MI), often heralded as the gold standard approach, can be challenging to apply in the clinic. Clearly, the outcome cannot be used to impute data at prediction time. Regression imputation…

    Submitted 24 June, 2022; originally announced June 2022.

    Comments: 42 pages. Submitted to Statistical Methods in Medical Research in October 2021

  8. arXiv:2202.09101  [pdf]

    stat.ME

    The harm of class imbalance corrections for risk prediction models: illustration and simulation using logistic regression

    Authors: Ruben van den Goorbergh, Maarten van Smeden, Dirk Timmerman, Ben Van Calster

    Abstract: Methods to correct class imbalance, i.e. imbalance between the frequency of outcome events and non-events, are receiving increasing interest for developing prediction models. We examined the effect of imbalance correction on the performance of standard and penalized (ridge) logistic regression models in terms of discrimination, calibration, and classification. We examined random undersampling, ran…

    Submitted 18 February, 2022; originally announced February 2022.

    Comments: Main paper 21 pages, Supplement 53 pages

  9. arXiv:2104.09282  [pdf]

    stat.ME

    Risk prediction models for discrete ordinal outcomes: calibration and the impact of the proportional odds assumption

    Authors: Michael Edlinger, Maarten van Smeden, Hannes F Alber, Maria Wanitschek, Ben Van Calster

    Abstract: Calibration is a vital aspect of the performance of risk prediction models, but research in the context of ordinal outcomes is scarce. This study compared calibration measures for risk models predicting a discrete ordinal outcome, and investigated the impact of the proportional odds assumption on calibration and overfitting. We studied the multinomial, cumulative, adjacent category, continuation r…

    Submitted 18 November, 2021; v1 submitted 19 April, 2021; originally announced April 2021.

    Comments: Revised version submitted to Statistics in Medicine

  10. arXiv:2102.04791  [pdf, ps, other]

    stat.ME

    mecor: An R package for measurement error correction in linear regression models with a continuous outcome

    Authors: Linda Nab, Maarten van Smeden, Ruth H. Keogh, Rolf H. H. Groenwold

    Abstract: Measurement error in a covariate or the outcome of regression models is common, but is often ignored, even though measurement error can lead to substantial bias in the estimated covariate-outcome association. While several texts on measurement error correction methods are available, these methods remain seldomly applied. To improve the use of measurement error correction methodology, we developed…

    Submitted 9 February, 2021; originally announced February 2021.

    Comments: 34 pages (including appendix), software package

    MSC Class: 62-04

  11. arXiv:2101.01603  [pdf, other]

    stat.ME

    Comparing methods addressing multi-collinearity when developing prediction models

    Authors: Artuur M. Leeuwenberg, Maarten van Smeden, Johannes A. Langendijk, Arjen van der Schaaf, Murielle E. Mauer, Karel G. M. Moons, Johannes B. Reitsma, Ewoud Schuit

    Abstract: Clinical prediction models are developed widely across medical disciplines. When predictors in such models are highly collinear, unexpected or spurious predictor-outcome associations may occur, thereby potentially reducing face-validity and explainability of the prediction model. Collinearity can be dealt with by exclusion of collinear predictors, but when there is no a priori motivation (besides…

    Submitted 5 January, 2021; originally announced January 2021.

    MSC Class: 60
    ACM Class: G.3

  12. arXiv:1912.05800  [pdf, other]

    stat.ME

    Sensitivity analysis for bias due to a misclassfied confounding variable in marginal structural models

    Authors: Linda Nab, Rolf H. H. Groenwold, Maarten van Smeden, Ruth H. Keogh

    Abstract: In observational research treatment effects, the average treatment effect (ATE) estimator may be biased if a confounding variable is misclassified. We discuss the impact of classification error in a dichotomous confounding variable in analyses using marginal structural models estimated using inverse probability weighting (MSMs-IPW) and compare this with its impact in conditional regression models,…

    Submitted 12 December, 2019; originally announced December 2019.

    Comments: 25 pages, 3 figures, 3 tables

  13. arXiv:1907.11493  [pdf]

    stat.ME

    On the variability of regression shrinkage methods for clinical prediction models: simulation study on predictive performance

    Authors: Ben Van Calster, Maarten van Smeden, Ewout W. Steyerberg

    Abstract: When developing risk prediction models, shrinkage methods are recommended, especially when the sample size is limited. Several earlier studies have shown that the shrinkage of model coefficients can reduce overfitting of the prediction model and subsequently result in better predictive performance on average. In this simulation study, we aimed to investigate the variability of regression shrinkage…

    Submitted 26 July, 2019; originally announced July 2019.

    Comments: 138 pages (incl 114 supplementary pages). Main document: 5 figures and 2 tables

    MSC Class: 62J07

  14. arXiv:1901.04795  [pdf, other]

    stat.ME

    A weighting method for simultaneous adjustment for confounding and joint exposure-outcome misclassifications

    Authors: Bas B. L. Penning de Vries, Maarten van Smeden, Rolf H. H. Groenwold

    Abstract: Joint misclassification of exposure and outcome variables can lead to considerable bias in epidemiological studies of causal exposure-outcome effects. In this paper, we present a new maximum likelihood based estimator for the marginal causal odd-ratio that simultaneously adjusts for confounding and several forms of joint misclassification of the exposure and outcome variables. The proposed method…

    Submitted 15 January, 2019; originally announced January 2019.

    Comments: 36 pages, 7 tables, 1 figure

  15. arXiv:1809.07068  [pdf, other]

    stat.ME

    Measurement error in continuous endpoints in randomised trials: problems and solutions

    Authors: Linda Nab, Rolf H. H. Groenwold, Paco M. J. Welsing, Maarten van Smeden

    Abstract: In randomised trials, continuous endpoints are often measured with some degree of error. This study explores the impact of ignoring measurement error, and proposes methods to improve statistical inference in the presence of measurement error. Three main types of measurement error in continuous endpoints are considered: classical, systematic and differential. For each measurement error type, a corr…

    Submitted 29 August, 2019; v1 submitted 19 September, 2018; originally announced September 2018.

    Comments: 37 pages, 4 figures, 3 tables

  16. arXiv:1807.09462  [pdf, other]

    stat.ML cs.LG

    Propensity score estimation using classification and regression trees in the presence of missing covariate data

    Authors: Bas B. L. Penning de Vries, Maarten van Smeden, Rolf H. H. Groenwold

    Abstract: Data mining and machine learning techniques such as classification and regression trees (CART) represent a promising alternative to conventional logistic regression for propensity score estimation. Whereas incomplete data preclude the fitting of a logistic regression on all subjects, CART is appealing in part because some implementations allow for incomplete records to be incorporated in the tree…

    Submitted 25 July, 2018; originally announced July 2018.

    Comments: 29 pages, 5 tables

  17. arXiv:1806.10495  [pdf, other]

    stat.ME

    Impact of predictor measurement heterogeneity across settings on performance of prediction models: a measurement error perspective

    Authors: Kim Luijken, Rolf H. H. Groenwold, Ben van Calster, Ewout W. Steyerberg, Maarten van Smeden

    Abstract: It is widely acknowledged that the predictive performance of clinical prediction models should be studied in patients that were not part of the data in which the model was derived. Out-of-sample performance can be hampered when predictors are measured differently at derivation and external validation. This may occur, for instance, when predictors are measured using different measurement protocols…

    Submitted 5 February, 2019; v1 submitted 27 June, 2018; originally announced June 2018.

    Comments: 32 pages, 4 figures

    MSC Class: 97K80