Showing 1–24 of 24 results for author: Gorfine, M

  1. arXiv:2502.06492  [pdf, other]

    stat.ME

    An Overview and Recent Developments in the Analysis of Multistate Processes

    Authors: Malka Gorfine, Richard J. Cook, Per Kragh Andersen, Terry M. Therneau, Pierre Joly, Hein Putter, Maja Pohar Perme, Michal Abrahamowicz

    Abstract: Multistate models offer a powerful framework for studying disease processes and can be used to formulate intensity-based and more descriptive marginal regression models. They also represent a natural foundation for the construction of joint models for disease processes and dynamic marker processes, as well as joint models incorporating random censoring and intermittent observation times. This arti…

    Submitted 10 February, 2025; originally announced February 2025.

    Comments: 62 pages, 3 figures, 3 tables

  2. arXiv:2502.01575  [pdf, ps, other]

    stat.ML cs.LG

    Heterogeneous Treatment Effect in Time-to-Event Outcomes: Harnessing Censored Data with Recursively Imputed Trees

    Authors: Tomer Meir, Uri Shalit, Malka Gorfine

    Abstract: Tailoring treatments to individual needs is a central goal in fields such as medicine. A key step toward this goal is estimating Heterogeneous Treatment Effects (HTE), that is, the way treatments impact different subgroups. While crucial, HTE estimation is challenging with survival data, where time until an event (e.g., death) is key. Existing methods often assume complete observation, an assumption viola…

    Submitted 4 June, 2025; v1 submitted 3 February, 2025; originally announced February 2025.

  3. arXiv:2409.02888  [pdf, other]

    stat.ME stat.AP

    Cost-Effectiveness Analysis for Disease Prevention -- A Case Study on Colorectal Cancer Screening

    Authors: Yi Xiong, Kwun C G Chan, Malka Gorfine, Li Hsu

    Abstract: Cancer screening has been widely recognized as an effective strategy for preventing the disease. Despite its effectiveness, determining when to start screening is complicated, because starting too early increases the number of screenings over a lifetime and thus costs, whereas starting too late may miss cancers that could have been prevented. Therefore, to make an informed recommendation on the age to…

    Submitted 4 September, 2024; originally announced September 2024.

    Comments: 37 pages, 2 figures, 8 tables

  4. arXiv:2406.14009  [pdf, other]

    stat.ML cs.LG

    Confidence Intervals and Simultaneous Confidence Bands Based on Deep Learning

    Authors: Asaf Ben Arie, Malka Gorfine

    Abstract: Deep learning models have significantly improved prediction accuracy in various fields, gaining recognition across numerous disciplines. Yet, an aspect of deep learning that remains insufficiently addressed is the assessment of prediction uncertainty. Producing reliable uncertainty estimators could be crucial in practical terms. For instance, predictions associated with a high degree of uncertaint…

    Submitted 20 June, 2024; originally announced June 2024.

    Journal ref: Transactions on Machine Learning Research, 2024, https://openreview.net/forum?id=PdbaruPVUY

  5. arXiv:2406.13836  [pdf, other]

    stat.ME

    Mastering Rare Event Analysis: Optimal Subsample Size in Logistic and Cox Regressions

    Authors: Tal Agassi, Nir Keret, Malka Gorfine

    Abstract: In the realm of contemporary data analysis, the use of massive datasets has taken on heightened significance, albeit often entailing considerable demands on computational time and memory. While a multitude of existing works offer optimal subsampling methods for conducting analyses on subsamples with minimized efficiency loss, they notably lack tools for judiciously selecting the optimal subsample…

    Submitted 19 June, 2024; originally announced June 2024.

  6. arXiv:2403.18464  [pdf, other]

    stat.ME

    Cumulative Incidence Function Estimation Based on Population-Based Biobank Data

    Authors: Malka Gorfine, David M. Zucker, Shoval Shoham

    Abstract: Many countries have established population-based biobanks, which are being used increasingly in epidemiological and clinical research. These biobanks offer opportunities for large-scale studies addressing questions beyond the scope of traditional clinical trials or cohort studies. However, using biobank data poses new challenges. Typically, biobank data is collected from a study cohort recruited ov…

    Submitted 27 March, 2024; originally announced March 2024.

  7. arXiv:2309.12152  [pdf, other]

    stat.ME

    Unveiling Challenges in Mendelian Randomization for Gene-Environment Interaction

    Authors: Malka Gorfine, Conghui Qu, Ulrike Peters, Li Hsu

    Abstract: Many diseases and traits involve a complex interplay between genes and environment, generating significant interest in studying gene-environment interaction through observational data. However, lifestyle and environmental risk factors are often susceptible to unmeasured confounding, which may bias the assessment of the joint effect of gene and environment. Recently, Men…

    Submitted 21 September, 2023; originally announced September 2023.

  8. Unlocking Retrospective Prevalent Information in EHRs -- a Pairwise Pseudolikelihood Approach

    Authors: Nir Keret, Malka Gorfine

    Abstract: Typically, electronic health record data are not collected towards a specific research question. Instead, they comprise numerous observations recruited at different ages, whose medical, environmental and oftentimes also genetic data are being collected. Some phenotypes, such as disease-onset ages, may be reported retrospectively if the event preceded recruitment, and such observations are termed `…

    Submitted 3 September, 2023; originally announced September 2023.

  9. Discrete-time Competing-Risks Regression with or without Penalization

    Authors: Tomer Meir, Malka Gorfine

    Abstract: Many studies employ the analysis of time-to-event data that incorporates competing risks and right censoring. Most methods and software packages are geared towards analyzing data that comes from a continuous failure time distribution. However, failure-time data may sometimes be discrete either because time is inherently discrete or due to imprecise measurement. This paper introduces a new estimati…

    Submitted 5 February, 2025; v1 submitted 2 March, 2023; originally announced March 2023.

    Journal ref: Biometrics, Volume 81, Issue 2, June 2025

  10. arXiv:2205.05322  [pdf, ps, other]

    stat.ME

    Shared Frailty Methods for Complex Survival Data: A Review of Recent Advances

    Authors: Malka Gorfine, David M. Zucker

    Abstract: Dependent survival data arise in many contexts. One context is clustered survival data, where survival data are collected on clusters such as families or medical centers. Dependent survival data also arise when multiple survival times are recorded for each individual. Frailty models are one common approach to handling such data. In frailty models, the dependence is expressed in terms of a random effe…

    Submitted 11 May, 2022; originally announced May 2022.

    Comments: 22 pages, 1 figure, 2 tables

  11. arXiv:2205.03954  [pdf, other]

    stat.ME stat.AP

    An Accelerated Failure Time Regression Model for Illness-Death Data: A Frailty Approach

    Authors: Lea Kats, Malka Gorfine

    Abstract: This work presents a new model and estimation procedure for illness-death survival data where the hazard functions follow accelerated failure time (AFT) models. A shared frailty variate induces positive dependence among failure times of a subject for handling the unobserved dependency between the non-terminal and the terminal failure times given the observed covariates. Semi-parametric maximum…

    Submitted 8 May, 2022; originally announced May 2022.

  12. arXiv:2204.05731  [pdf, other]

    stat.ML cs.LG

    PyDTS: A Python Package for Discrete-Time Survival (Regularized) Regression with Competing Risks

    Authors: Tomer Meir, Rom Gutman, Malka Gorfine

    Abstract: Time-to-event analysis (survival analysis) is used when the response of interest is the time until a pre-specified event occurs. Time-to-event data are sometimes discrete either because time itself is discrete or due to grouping of failure times into intervals or rounding off measurements. In addition, the failure of an individual could be one of several distinct failure types, known as competing…

    Submitted 27 June, 2023; v1 submitted 12 April, 2022; originally announced April 2022.

  13. arXiv:2202.11743  [pdf, other]

    stat.ME

    Revisiting the Cumulative Incidence Function With Competing Risks Data

    Authors: David M. Zucker, Malka Gorfine

    Abstract: We consider estimation of the cumulative incidence function (CIF) in the competing risks Cox model. We study three methods. Methods 1 and 2 are existing methods while Method 3 is a newly proposed method. Method 3 is constructed so that the sum of the CIFs across all event types at the last observed event time is guaranteed, assuming no ties, to be equal to 1. The performance of the methods is exa…

    Submitted 23 February, 2022; originally announced February 2022.

    MSC Class: 62N02

  14. Optimal Cox Regression Subsampling Procedure with Rare Events

    Authors: Nir Keret, Malka Gorfine

    Abstract: Massive survival datasets are becoming increasingly prevalent with the development of the healthcare industry. Such datasets pose computational challenges unprecedented in traditional survival analysis use-cases. A popular way of coping with massive datasets is downsampling them to a more manageable size, such that the computational resources can be afforded by the researcher. Cox proportio…

    Submitted 30 January, 2022; v1 submitted 3 December, 2020; originally announced December 2020.

    Comments: Journal of the American Statistical Association (2023)

  15. arXiv:2010.04485  [pdf, other]

    stat.ME math.ST stat.AP

    Causal inference for semi-competing risks data

    Authors: Daniel Nevo, Malka Gorfine

    Abstract: An emerging challenge for time-to-event data is studying semi-competing risks, namely when two event times are of interest: a non-terminal event time (e.g. age at disease diagnosis), and a terminal event time (e.g. age at death). The non-terminal event is observed only if it precedes the terminal event, which may occur before or after the non-terminal event. Studying treatment or intervention effe…

    Submitted 9 October, 2020; originally announced October 2020.

    Comments: 35 pages, 3 figures, 3 tables

  16. arXiv:2009.13921  [pdf, other]

    stat.ME

    Efficient Study Design with Multiple Measurement Instruments

    Authors: Michal Bitan, Malka Gorfine, Laura Rosen, David M. Steinberg

    Abstract: Outcomes from studies assessing exposure often use multiple measurements. In previous work, using a model first proposed by Buonoccorsi (1991), we showed that combining direct (e.g. biomarkers) and indirect (e.g. self-report) measurements provides a more accurate picture of true exposure than estimates obtained when using a single type of measurement. In this article, we propose a valuable tool fo…

    Submitted 29 September, 2020; originally announced September 2020.

  17. Marginalized Frailty-Based Illness-Death Model: Application to the UK-Biobank Survival Data

    Authors: Malka Gorfine, Nir Keret, Asaf Ben Arie, David Zucker, Li Hsu

    Abstract: The UK Biobank is a large-scale health resource comprising genetic, environmental and medical information on approximately 500,000 volunteer participants in the UK, recruited at ages 40–69 during the years 2006–2010. The project monitors the health and well-being of its participants. This work demonstrates how these data can be used to estimate in a semi-parametric fashion the effects of genetic…

    Submitted 27 May, 2019; originally announced June 2019.

  18. $K$-sample omnibus non-proportional hazards tests based on right-censored data

    Authors: Malka Gorfine, Matan Schlesinger, Li Hsu

    Abstract: This work presents novel and powerful tests for comparing non-proportional hazard functions, based on sample-space partitions. Right censoring introduces two major difficulties which make the existing sample-space partition tests for uncensored data inapplicable: (i) the actual event times of censored observations are unknown; and (ii) the standard permutation procedure is invalid in case the ce…

    Submitted 27 October, 2019; v1 submitted 17 January, 2019; originally announced January 2019.

  19. arXiv:1812.00641  [pdf, other]

    stat.ME

    An improved fully nonparametric estimator of the marginal survival function based on case-control clustered data

    Authors: David M. Zucker, Malka Gorfine

    Abstract: A case-control family study is a study where individuals with a disease of interest (case probands) and individuals without the disease (control probands) are randomly sampled from a well-defined population. Possibly right-censored age at onset and disease status are observed for both probands and their relatives. Correlation among the outcomes within a family is induced by factors such as inherit…

    Submitted 3 December, 2018; originally announced December 2018.

  20. arXiv:1702.06407  [pdf]

    stat.CO cs.MS

    General Semiparametric Shared Frailty Model Estimation and Simulation with frailtySurv

    Authors: John V. Monaco, Malka Gorfine, Li Hsu

    Abstract: The R package frailtySurv for simulating and fitting semi-parametric shared frailty models is introduced. Package frailtySurv implements semi-parametric consistent estimators for a variety of frailty distributions, including gamma, log-normal, inverse Gaussian and power variance function, and provides consistent estimators of the standard errors of the parameters' estimators. The parameters' estim…

    Submitted 5 September, 2018; v1 submitted 21 February, 2017; originally announced February 2017.

    Journal ref: Monaco, J., Gorfine, M., & Hsu, L. (2018). General Semiparametric Shared Frailty Model: Estimation and Simulation with frailtySurv. Journal of Statistical Software, 86(4), 1 - 42. doi:http://dx.doi.org/10.18637/jss.v086.i04

  21. arXiv:1410.6758  [pdf, other]

    stat.ME

    Consistent distribution-free $K$-sample and independence tests for univariate random variables

    Authors: Ruth Heller, Yair Heller, Shachar Kaufman, Barak Brill, Malka Gorfine

    Abstract: A popular approach for testing if two univariate random variables are statistically independent consists of partitioning the sample space into bins, and evaluating a test statistic on the binned data. The partition size matters, and the optimal partition size is data dependent. While for detecting simple relationships coarse partitions may be best, for detecting complex relationships a great gain…

    Submitted 18 June, 2015; v1 submitted 24 October, 2014; originally announced October 2014.

    Comments: arXiv admin note: substantial text overlap with arXiv:1308.1559

    Journal ref: Journal of Machine Learning Research (JMLR) 2016, vol. 17, No. 29, 1-54

  22. arXiv:1404.7595  [pdf, other]

    stat.ME

    A Quantile Regression Model for Failure-Time Data with Time-Dependent Covariates

    Authors: Malka Gorfine, Yair Goldberg, Yaacov Ritov

    Abstract: Since survival data occur over time, often important covariates that we wish to consider also change over time. Such covariates are referred to as time-dependent covariates. Quantile regression offers flexible modeling of survival data by allowing the covariates to vary with quantiles. This paper provides a novel quantile regression model accommodating time-dependent covariates, for analyzing surviva…

    Submitted 30 April, 2014; originally announced April 2014.

  23. arXiv:1308.1559   

    stat.ME

    Consistent distribution-free tests of association between univariate random variables

    Authors: Ruth Heller, Yair Heller, Shachar Kaufman, Malka Gorfine

    Abstract: We consider the problem of testing whether pairs of univariate random variables are associated. Few tests of independence exist that are consistent against all dependent alternatives and are distribution free. We propose novel tests that are consistent, distribution free, and have excellent power properties. The tests have a simple form, and are surprisingly computationally efficient thanks to accom…

    Submitted 8 December, 2014; v1 submitted 7 August, 2013; originally announced August 2013.

    Comments: The paper has been withdrawn because we submitted a new manuscript, arXiv:1410.6758, that includes this work but is far more general and contains many new results; it should be read instead of this work

  24. A consistent multivariate test of association based on ranks of distances

    Authors: Ruth Heller, Yair Heller, Malka Gorfine

    Abstract: We are concerned with the detection of associations between random vectors of any dimension. Few tests of independence exist that are consistent against all dependent alternatives. We propose a powerful test that is applicable in all dimensions and is consistent against all alternatives. The test has a simple form and is easy to implement. We demonstrate its good power properties in simulations an…

    Submitted 31 May, 2012; v1 submitted 17 January, 2012; originally announced January 2012.

    Journal ref: Biometrika (2013), 100, 2, pp. 503-510