-
Probability Link Models with Symmetric Information Divergence
Authors:
Majid Asadi,
Karthik Devarajan,
Nader Ebrahimi,
Ehsan Soofi,
Lauren Spirko-Burns
Abstract:
This paper introduces link functions for transforming one probability distribution to another such that the Kullback-Leibler and Rényi divergences between the two distributions are symmetric. Two general classes of link models are proposed. The first model links two survival functions and is applicable to models such as the proportional odds and change point models, which are used in survival analysis and reliability modeling. A prototype application involving the proportional odds model demonstrates the advantages of symmetric divergence measures over asymmetric measures for assessing the efficacy of features and for model averaging. The advantages include unique ranks for models and unique information weights for model averaging, with half the computation required by asymmetric divergences. The second model links two cumulative probability distribution functions. This model produces generalized location models, which are continuous counterparts of binary probability models such as the probit and logit models. Examples include the generalized probit and logit models, which have appeared in the survival analysis literature, and a generalized Laplace model and a generalized Student-$t$ model, which are survival time models corresponding to the respective binary probability models. Lastly, extensions to symmetric divergence between survival functions and conditions for copula dependence information are presented.
Submitted 10 August, 2020;
originally announced August 2020.
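The symmetry claim in the abstract above can be checked numerically for the proportional odds case. The sketch below is illustrative only and is not taken from the paper: it assumes the link is written on the CDF scale as G(u) = u / (u + theta(1 - u)) with u = F0(x), which corresponds to survival odds scaled by a factor theta, and the names `link_density` and `kl_pair` are hypothetical. Under that assumption the density ratio f1/f0 equals g(F0(x)), where g is the derivative of G, so K(F1:F0) is the integral of g log g over (0, 1) and K(F0:F1) is minus the integral of log g.

```python
import math


def link_density(u, theta):
    """Derivative of the assumed link G(u) = u / (u + theta * (1 - u))."""
    return theta / (u + theta * (1.0 - u)) ** 2


def kl_pair(theta, n=100_000):
    """Midpoint-rule estimates of K(F1:F0) and K(F0:F1) on the u = F0(x) scale."""
    h = 1.0 / n
    forward = 0.0   # K(F1:F0) = integral of g(u) * log g(u) du
    backward = 0.0  # K(F0:F1) = -integral of log g(u) du
    for i in range(n):
        u = (i + 0.5) * h
        g = link_density(u, theta)
        forward += g * math.log(g) * h
        backward -= math.log(g) * h
    return forward, backward


if __name__ == "__main__":
    for theta in (0.5, 2.0, 5.0):
        forward, backward = kl_pair(theta)
        print(f"theta={theta}: K(F1:F0)={forward:.6f}, K(F0:F1)={backward:.6f}")
```

For each value of theta the two directed divergences should agree up to numerical error, which is why only one of them needs to be computed when ranking models under this link.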
-
Variable Selection with Random Survival Forest and Bayesian Additive Regression Tree for Survival Data
Authors:
Satabdi Saha,
Duchwan Ryu,
Nader Ebrahimi
Abstract:
In this paper we utilize a survival analysis methodology incorporating Bayesian additive regression trees to account for nonlinear and additive covariate effects. We compare the performance of Bayesian additive regression trees, Cox proportional hazards and random survival forest models for censored survival data, using simulation studies and a survival analysis of breast cancer with the U.S. SEER database for the year 2005. In the simulation studies, we compare the three models across varying sample sizes and censoring rates on the basis of bias and prediction accuracy. In the survival analysis of breast cancer, we retrospectively analyze a subset of 1500 patients with invasive ductal carcinoma, a common form of breast cancer mostly affecting older women. The predictive potential of the three models is then compared using some performance assessment measures widely used in the survival literature.
Submitted 2 November, 2019; v1 submitted 4 October, 2019;
originally announced October 2019.
-
On the Sample Information About Parameter and Prediction
Authors:
Nader Ebrahimi,
Ehsan S. Soofi,
Refik Soyer
Abstract:
The Bayesian measure of sample information about the parameter, known as Lindley's measure, is widely used in various problems such as developing prior distributions, models for the likelihood functions and optimal designs. The predictive information is defined similarly and used for model selection and optimal designs, though to a lesser extent. The parameter and predictive information measures are proper utility functions and have also been used in combination. Yet the relationship between the two measures and the effects of conditional dependence between the observable quantities on the Bayesian information measures remain unexplored. We address both issues. The relationship between the two information measures is explored through the information provided by the sample about the parameter and prediction jointly. The role of dependence is explored along with the interplay between the information measures, prior and sampling design. For a conditionally independent sequence of observable quantities, decompositions of the joint information characterize Lindley's measure as the sample information about the parameter and prediction jointly and the predictive information as part of it. For the conditionally dependent case, the joint information about parameter and prediction exceeds Lindley's measure by an amount due to the dependence. More specific results are shown for the normal linear models and a broad subfamily of the exponential family. Conditionally independent samples provide relatively little information for prediction, and the gap between the parameter and predictive information measures grows rapidly with the sample size.
Submitted 5 January, 2011;
originally announced January 2011.
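The contrast between parameter and predictive information in the abstract above can be illustrated with a textbook conjugate case. The sketch below is an assumption-laden example, not the paper's own setup: it takes n conditionally independent observations from N(theta, sigma2) with known sigma2 and a N(mu0, tau2) prior on theta, and the function names are hypothetical. In that case Lindley's measure is the mutual information between theta and the sample, and the predictive information is the mutual information between the sample and one future observation; both reduce to half the log of a ratio of variances.

```python
import math


def parameter_information(n, sigma2, tau2):
    """Mutual information between theta and n observations (normal mean, known variance)."""
    return 0.5 * math.log(1.0 + n * tau2 / sigma2)


def predictive_information(n, sigma2, tau2):
    """Mutual information between n observations and one future observation."""
    posterior_var = 1.0 / (1.0 / tau2 + n / sigma2)
    prior_predictive_var = tau2 + sigma2
    posterior_predictive_var = posterior_var + sigma2
    return 0.5 * math.log(prior_predictive_var / posterior_predictive_var)


if __name__ == "__main__":
    sigma2, tau2 = 1.0, 1.0  # illustrative values only
    for n in (1, 10, 100, 1000):
        p = parameter_information(n, sigma2, tau2)
        q = predictive_information(n, sigma2, tau2)
        print(f"n={n}: parameter={p:.4f}, predictive={q:.4f}, gap={p - q:.4f}")
```

In this example the predictive information stays below 0.5 * log(1 + tau2 / sigma2) for every n, while the parameter information keeps increasing, so the gap between the two widens as the sample grows.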