Causal Inference for Survival Analysis
Authors:
Vikas Ramachandra
Abstract:
In this paper, we propose the use of causal inference techniques for survival function estimation and prediction for subgroups of the data, up to individual units. Tree ensemble methods, specifically random forests, were modified for this purpose. We use a real-world healthcare dataset of about 1800 patients with breast cancer, which contains multiple patient covariates as well as disease-free survival days (DFS) and a binary death event indicator (y). We use the type of curative cancer intervention as the treatment variable (T=0 or 1; the treatment is binary in our example). The algorithm is a two-step approach. In step 1, we estimate heterogeneous treatment effects using a causalTree with DFS as the dependent variable. In step 2, for each selected leaf of the causalTree with a distinctly different average treatment effect (with respect to survival), we fit survival forests to the patients in that leaf, one forest for T=0 and one for T=1, to obtain estimated patient-level survival curves under each treatment (more generally, any model can be used at this step). We then subtract the two patient-level survival curves to obtain the differential survival curve for a given patient, which compares the survival function under the two treatments. The path to a selected leaf also gives the combination of patient features and their values that are causally important for the treatment effect difference at that leaf.
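As a rough illustration of step 2, the sketch below computes a differential survival curve for the patients in one subgroup. It is not the paper's implementation: it swaps the survival forests for a plain Kaplan-Meier estimator per treatment arm (the abstract notes that any model can be used at this step), so it yields a subgroup-level rather than a patient-level curve. It also assumes the subgroup has already been selected by step 1. The function names and the toy data are hypothetical.

```python
from bisect import bisect_right


def kaplan_meier(times, events):
    """Kaplan-Meier estimate: returns (event_times, survival_probs)."""
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    surv = 1.0
    out_times, out_probs = [], []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all patients whose follow-up ends at the same time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths > 0:
            surv *= 1.0 - deaths / n_at_risk
            out_times.append(t)
            out_probs.append(surv)
        # Deaths and censored patients both leave the risk set.
        n_at_risk -= leaving
    return out_times, out_probs


def survival_at(curve, t):
    """Evaluate the step function S(t) for a curve from kaplan_meier."""
    times, probs = curve
    idx = bisect_right(times, t)
    return 1.0 if idx == 0 else probs[idx - 1]


def differential_survival(dfs, y, T, grid):
    """S_{T=1}(t) - S_{T=0}(t) on a time grid, for one subgroup (leaf)."""
    curves = {}
    for arm in (0, 1):
        arm_times = [d for d, t in zip(dfs, T) if t == arm]
        arm_events = [e for e, t in zip(y, T) if t == arm]
        curves[arm] = kaplan_meier(arm_times, arm_events)
    return [survival_at(curves[1], g) - survival_at(curves[0], g) for g in grid]


# Hypothetical toy leaf: DFS in days, y = 1 if death observed, T = treatment.
dfs = [5, 8, 12, 20, 3, 6, 9, 15]
y = [1, 0, 1, 1, 1, 1, 0, 1]
T = [1, 1, 1, 1, 0, 0, 0, 0]
diff = differential_survival(dfs, y, T, grid=[4, 10, 16])
print(diff)
```

Positive values of `diff` indicate time points at which the estimated survival probability under T=1 exceeds that under T=0 for this subgroup.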
Submitted 21 March, 2018;
originally announced March 2018.
Deep Learning for Causal Inference
Authors:
Vikas Ramachandra
Abstract:
In this paper, we propose deep learning techniques for econometrics, specifically for causal inference and for estimating individual as well as average treatment effects. The contribution of this paper is twofold: 1. For generalized neighbor matching to estimate individual and average treatment effects, we analyze the use of autoencoders for dimensionality reduction while maintaining the local neighborhood structure among the data points in the embedding space. This deep-learning-based technique is shown to perform better than simple k-nearest-neighbor matching for estimating treatment effects, especially when the data points have several features/covariates but lie on a low-dimensional manifold in a high-dimensional space. We also observe better performance than manifold learning methods for neighbor matching. 2. Propensity score matching is one specific and popular way to perform matching in order to estimate average and individual treatment effects. We propose the use of deep neural networks (DNNs) for propensity score matching and present a network called PropensityNet for this purpose. This is a generalization of the logistic regression technique traditionally used to estimate propensity scores, and we show empirically that DNNs perform better than logistic regression at propensity score matching. Code for both methods will be made available shortly on GitHub at: https://github.com/vikas84bf
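The matching step that follows propensity score estimation can be sketched as below. This is a minimal illustration, not the paper's PropensityNet code: it assumes the propensity scores have already been produced by some model (logistic regression or a DNN) and uses one-nearest-neighbor matching with replacement to estimate the effect on treated units. The function name and the toy numbers are hypothetical.

```python
def match_on_propensity(scores, treated, outcomes):
    """Match each treated unit to the control with the closest score.

    Returns (per-unit effect estimates for treated units, their average).
    Matching is done with replacement, so a control can be reused.
    """
    controls = [(s, o) for s, t, o in zip(scores, treated, outcomes) if t == 0]
    if not controls:
        raise ValueError("need at least one control unit")

    effects = []
    for s, t, o in zip(scores, treated, outcomes):
        if t == 1:
            # Nearest control in propensity-score distance.
            _, matched_outcome = min(controls, key=lambda c: abs(c[0] - s))
            effects.append(o - matched_outcome)
    if not effects:
        raise ValueError("need at least one treated unit")
    return effects, sum(effects) / len(effects)


# Hypothetical toy data: estimated propensity scores, treatment flags, outcomes.
scores = [0.80, 0.60, 0.75, 0.55, 0.20]
treated = [1, 1, 0, 0, 0]
outcomes = [10, 7, 6, 5, 1]

unit_effects, avg_effect = match_on_propensity(scores, treated, outcomes)
print(unit_effects, avg_effect)
```

Because matching uses only the scalar score, the quality of the estimate depends on how well the upstream model estimates the propensity, which is where a more flexible model than logistic regression could plausibly help.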
Submitted 28 February, 2018;
originally announced March 2018.