Estimating and comparing adverse event probabilities in the presence of varying follow-up times and competing events
Authors:
Regina Stegherr,
Claudia Schmoor,
Michael Lübbert,
Tim Friede,
Jan Beyersmann
Abstract:
Safety analyses in terms of adverse events (AEs) are an important aspect of benefit-risk assessments of therapies. Compared to efficacy analyses, AE analyses are often rather simplistic. The probability of an AE of a specific type is typically estimated by the incidence proportion; sometimes the incidence density or the Kaplan-Meier estimator is proposed. But these analyses either do not account for censoring, rely on a too restrictive parametric model, or ignore competing events. With the non-parametric Aalen-Johansen estimator as the gold standard, these potential sources of bias are investigated in a data example from oncology and in simulations, both in the one-sample and in the two-sample case. As the estimators may have large variances at the end of follow-up, they are compared not only at the maximal event time but also at two quantiles of the observed times. To date, consequences for safety comparisons have hardly been investigated in the literature. The impact of using different estimators for group comparisons is unclear: for example, the ratio of two estimators that both underestimate or both overestimate may or may not be comparable to the ratio of the gold-standard estimators. Therefore, the ratio of the AE probabilities is also calculated based on the different approaches. By simulations investigating constant and non-constant hazards, different censoring mechanisms, and different event frequencies, we show that ignoring competing events is more of a problem than falsely assuming constant hazards by use of the incidence density, and that the choice of the AE probability estimator is crucial for group comparisons.
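To make the four estimators named in the abstract concrete, the following is a minimal one-sample sketch and not the authors' code. The function name `ae_probability_estimates`, the event coding (0 = censored, 1 = AE, 2 = competing event), and the probability transform 1 - exp(-rate * tau) used for the incidence density are assumptions made for illustration.

```python
import numpy as np

def ae_probability_estimates(times, events, tau):
    """Estimate P(AE by time tau) in four ways (illustrative sketch).

    times  : observed time per patient (event or censoring time)
    events : 0 = censored, 1 = AE, 2 = competing event (assumed coding)
    """
    times = np.asarray(times, dtype=float)
    events = np.asarray(events, dtype=int)
    n = len(times)
    n_ae = np.sum((events == 1) & (times <= tau))

    # Incidence proportion: ignores censoring entirely.
    incidence_proportion = n_ae / n

    # Incidence density: AEs per unit of person-time, turned into a
    # probability under an assumed constant hazard.
    person_time = np.minimum(times, tau).sum()
    incidence_density = 1.0 - np.exp(-(n_ae / person_time) * tau)

    # One minus Kaplan-Meier (competing events treated as censoring)
    # and Aalen-Johansen (competing events handled explicitly).
    km_surv = 1.0    # "survival" when only AEs count as events
    all_surv = 1.0   # probability of being free of any event
    aalen_johansen = 0.0
    event_times = np.unique(times[(events > 0) & (times <= tau)])
    for t in event_times:
        at_risk = np.sum(times >= t)
        d_ae = np.sum((times == t) & (events == 1))
        d_comp = np.sum((times == t) & (events == 2))
        # AJ increment uses event-free probability just before t.
        aalen_johansen += all_surv * d_ae / at_risk
        all_surv *= 1.0 - (d_ae + d_comp) / at_risk
        km_surv *= 1.0 - d_ae / at_risk

    return {
        "incidence_proportion": float(incidence_proportion),
        "incidence_density": float(incidence_density),
        "one_minus_km": float(1.0 - km_surv),
        "aalen_johansen": float(aalen_johansen),
    }
```

In this sketch, one minus Kaplan-Meier can never fall below the Aalen-Johansen value, because treating competing events as censoring inflates the estimated AE probability. Without censoring before `tau`, the Aalen-Johansen value coincides with the incidence proportion.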
Submitted 16 January, 2020;
originally announced January 2020.
Netboost: Boosting-supported network analysis improves high-dimensional omics prediction in acute myeloid leukemia and Huntington's disease
Authors:
Pascal Schlosser,
Jochen Knaus,
Maximilian Schmutz,
Konstanze Döhner,
Christoph Plass,
Lars Bullinger,
Rainer Claus,
Harald Binder,
Michael Lübbert,
Martin Schumacher
Abstract:
Background: State-of-the-art selection methods fail to identify weak but cumulative effects of features found in many high-dimensional omics datasets. Nevertheless, these features play an important role in certain diseases.
Results: We present Netboost, a three-step dimension reduction technique. First, a boosting-based filter is combined with the topological overlap measure to identify the essential edges of the network. Second, sparse hierarchical clustering is applied to the selected edges to identify modules. Third, module information is aggregated by the first principal components. The primary analysis is then carried out on these summary measures instead of the original data. We demonstrate the application of the newly developed Netboost in combination with CoxBoost for survival prediction of DNA methylation and gene expression data from 180 acute myeloid leukemia (AML) patients and show, based on cross-validated prediction error curve estimates, its prediction superiority over variable selection on the full dataset as well as over an alternative clustering approach. The identified signature related to chromatin modifying enzymes was replicated in an independent dataset of AML patients in the phase II AMLSG 12-09 study. In a second application, we combine Netboost with Random Forest classification and improve the disease classification error in RNA-sequencing data of Huntington's disease mice.
Conclusion: Netboost improves the definition of predictive variables for survival analysis and classification. It is a freely available Bioconductor R package for dimension reduction and hypothesis generation in high-dimensional omics applications.
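The final aggregation step described in the abstract, summarizing each module by its first principal component, can be sketched as follows. This is a simplified illustration in Python rather than the Netboost R package itself: the function name `module_first_pcs` is hypothetical, the module labels are assumed to be given already (the boosting-based filter and the sparse hierarchical clustering are not shown), and features are only mean-centered, which may differ from the package's preprocessing.

```python
import numpy as np

def module_first_pcs(X, modules):
    """Summarize each feature module by its first principal component.

    X       : (n_samples, n_features) data matrix
    modules : length n_features array of module labels (assumed given)
    Returns a dict mapping module label -> (n_samples,) score vector.
    """
    X = np.asarray(X, dtype=float)
    modules = np.asarray(modules)
    X_centered = X - X.mean(axis=0)  # center each feature

    summaries = {}
    for m in np.unique(modules):
        block = X_centered[:, modules == m]
        # SVD of the centered block: the first left singular vector
        # scaled by its singular value gives the first PC scores.
        U, s, _ = np.linalg.svd(block, full_matrices=False)
        summaries[m] = U[:, 0] * s[0]
    return summaries
```

A downstream model would then be fitted on these per-module score vectors instead of the original features. The sign of a principal component is arbitrary, so scores are only defined up to a factor of -1.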
Submitted 27 September, 2019;
originally announced September 2019.