-
Diagnosing and Rectifying Fake OOD Invariance: A Restructured Causal Approach
Authors:
Ziliang Chen,
Yongsen Zheng,
Zhao-Rong Lai,
Quanlong Guan,
Liang Lin
Abstract:
Invariant representation learning (IRL) encourages the prediction from invariant causal features to labels de-confounded from the environments, advancing the technical roadmap of out-of-distribution (OOD) generalization. Despite the attention IRL has attracted, recent theoretical results verified that some causal features recovered by IRL methods merely appear domain-invariant in the training environments but fail in unseen domains. This fake invariance severely endangers OOD generalization, since the trustful objective cannot be diagnosed and existing causal surgeries cannot rectify it. In this paper, we review an IRL family (InvRat) under the Partially and Fully Informative Invariant Feature Structural Causal Models (PIIF SCM / FIIF SCM), respectively, to certify their weaknesses in representing fake invariant features, and then unify their causal diagrams to propose the ReStructured SCM (RS-SCM). RS-SCM can ideally rebuild the spurious and the fake invariant features simultaneously. Given this, we further develop an approach based on conditional mutual information with respect to RS-SCM that rigorously rectifies the spurious and fake invariant effects. It can be easily implemented by a small feature selection subnet introduced in the IRL family, which is alternately optimized to achieve our goal. Experiments verified the superiority of our approach in combating the fake invariance issue across a variety of OOD generalization benchmarks.
Submitted 15 December, 2023;
originally announced December 2023.
-
SMIM: a unified framework of Survival sensitivity analysis using Multiple Imputation and Martingale
Authors:
Shu Yang,
Yilong Zhang,
Guanghan Frank Liu,
Qian Guan
Abstract:
Censored survival data are common in clinical trial studies. We propose a unified framework for sensitivity analysis to censoring at random in survival data using multiple imputation and martingale, called SMIM. The proposed framework adopts the δ-adjusted and control-based models, indexed by the sensitivity parameter, entailing censoring at random and a wide collection of censoring not at random assumptions. It also targets a broad class of treatment effect estimands defined as functionals of treatment-specific survival functions, taking into account missing data due to censoring. Multiple imputation facilitates the use of simple full-sample estimation; however, the standard Rubin's combining rule may overestimate the variance for inference in the sensitivity analysis framework. We decompose the multiple imputation estimator into a martingale series based on the sequential construction of the estimator and propose the wild bootstrap inference by resampling the martingale series. The new bootstrap inference has a theoretical guarantee for consistency and is computationally efficient compared to the non-parametric bootstrap counterpart. We evaluate the finite-sample performance of the proposed SMIM through simulation and an application to an HIV clinical trial.
Submitted 14 May, 2021; v1 submitted 5 July, 2020;
originally announced July 2020.
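The abstract above describes wild bootstrap inference obtained by resampling a martingale series. A minimal generic sketch of that idea follows; it is not the authors' construction. It assumes the estimator's deviation has already been decomposed into a list of mean-zero terms (the hypothetical argument `terms`) and uses Rademacher multipliers, one common choice of mean-zero, unit-variance weights.

```python
import random

def wild_bootstrap_variance(terms, n_boot=1000, seed=0):
    """Estimate the variance of sum(terms) by a generic wild bootstrap.

    `terms` is a hypothetical list of mean-zero martingale-difference
    terms; each replicate reweights them with independent Rademacher
    (+1/-1) multipliers. This is an illustration, not the SMIM procedure.
    """
    rng = random.Random(seed)
    squared_replicates = []
    for _ in range(n_boot):
        # One bootstrap replicate: the sum of randomly sign-flipped terms.
        replicate = sum(t * rng.choice((-1.0, 1.0)) for t in terms)
        squared_replicates.append(replicate ** 2)
    # Replicates have mean zero by construction, so the mean square
    # serves as the variance estimate.
    return sum(squared_replicates) / n_boot
```

Because only the multipliers are redrawn, each replicate costs a single pass over the terms and no re-imputation or re-estimation, which is the kind of saving over a non-parametric bootstrap that the abstract refers to.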
-
Deep Active Learning for Remote Sensing Object Detection
Authors:
Zhenshen Qu,
Jingda Du,
Yong Cao,
Qiuyu Guan,
Pengbo Zhao
Abstract:
Recently, CNN object detectors have achieved high accuracy on remote sensing images but require huge labor and time costs for annotation. In this paper, we propose a new uncertainty-based active learning method that selects the more informative images for annotation, so that the detector can still reach high performance with a fraction of the training images. Our method not only analyzes objects' classification uncertainty to find the least confident objects but also considers their regression uncertainty to identify outliers. In addition, we introduce two extra weights to overcome two difficulties in remote sensing datasets: class imbalance and differences in the number of objects per image. We evaluate our active learning algorithm on the DOTA dataset with CenterNet as the object detector. We achieve the same level of performance as full supervision with only half of the images. We even surpass full supervision with 55% of the images and augmented weights on the least confident images.
Submitted 17 March, 2020;
originally announced March 2020.
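The classification-uncertainty part of the abstract above can be illustrated with plain least-confidence sampling. The sketch below is a simplified assumption, not the paper's algorithm: it ignores regression uncertainty and the two extra weights, and the function names and the input layout (a dict from image id to per-object class-probability lists) are hypothetical.

```python
def least_confidence_scores(detections):
    """Score each image by the mean least-confidence of its detected objects.

    `detections` maps an image id to a list of per-object class-probability
    lists. An object's uncertainty is 1 minus its top class probability.
    """
    scores = {}
    for image_id, objects in detections.items():
        if not objects:
            # No detections: treat the image as carrying no uncertainty.
            scores[image_id] = 0.0
            continue
        uncertainties = [1.0 - max(probs) for probs in objects]
        scores[image_id] = sum(uncertainties) / len(uncertainties)
    return scores

def select_for_annotation(detections, budget):
    """Return the `budget` image ids with the highest uncertainty scores."""
    scores = least_confidence_scores(detections)
    # Sort by descending score; break ties by image id for determinism.
    ranked = sorted(scores, key=lambda image_id: (-scores[image_id], image_id))
    return ranked[:budget]
```

Averaging over objects is one possible aggregation; with very different object counts per image, a sum or a weighted mean would rank images differently, which is the sort of issue the abstract's extra weights are meant to address.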
-
A spatiotemporal recommendation engine for malaria control
Authors:
Qian Guan,
Brian J. Reich,
Eric B. Laber
Abstract:
Malaria is an infectious disease affecting a large population across the world, and interventions need to be efficiently applied to reduce the burden of malaria. We develop a framework to help policy-makers decide how to allocate limited resources in real time for malaria control. We formalize a policy for the resource allocation as a sequence of decisions, one per intervention decision, that map up-to-date disease-related information to a resource allocation. An optimal policy must control the spread of the disease while being interpretable and viewed as equitable to stakeholders. We construct an interpretable class of resource allocation policies that can accommodate allocation of resources residing in a continuous domain, and combine a hierarchical Bayesian spatiotemporal model for disease transmission with a policy-search algorithm to estimate an optimal policy for resource allocation within the pre-specified class. The estimated optimal policy under the proposed framework improves the cumulative long-term outcome compared with naive approaches in both simulation experiments and application to malaria interventions in the Democratic Republic of the Congo.
Submitted 10 March, 2020;
originally announced March 2020.
-
A Unified Inference Framework for Multiple Imputation Using Martingales
Authors:
Qian Guan,
Shu Yang
Abstract:
Multiple imputation is widely used to handle missing data. Although Rubin's combining rule is simple, it is not clear whether or not the standard multiple imputation inference is consistent when coupled with the commonly used full sample estimators. This article establishes a unified martingale representation of multiple imputation for a wide class of asymptotically linear full sample estimators. This representation invokes the wild bootstrap inference to provide consistent variance estimation under the correct specification of the imputation models. As a motivating application, we apply the proposed method to estimate the average causal effect (ACE) with partially observed confounders in causal inference. Our framework applies to asymptotically linear ACE estimators, including the regression imputation, weighting, and matching estimators. We extend the framework to scenarios in which both outcome and confounders are subject to missingness and in which the data are missing not at random.
Submitted 2 January, 2023; v1 submitted 11 November, 2019;
originally announced November 2019.
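For readers unfamiliar with Rubin's combining rule mentioned in the abstract above, a minimal sketch of the standard rule follows. The function name and argument layout are assumptions for illustration; the formula is the usual one, with total variance equal to the average within-imputation variance plus (1 + 1/M) times the between-imputation variance.

```python
from statistics import mean, variance

def rubin_combine(estimates, variances):
    """Combine M imputed-data estimates with Rubin's rule.

    `estimates` holds the point estimate from each imputed dataset and
    `variances` the matching within-imputation variance estimates.
    Returns the pooled estimate and its total variance.
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least two imputations with matching variances")
    pooled = mean(estimates)        # average of the M point estimates
    within = mean(variances)        # average within-imputation variance
    between = variance(estimates)   # sample variance (divisor M - 1)
    total = within + (1.0 + 1.0 / m) * between
    return pooled, total
```

For example, three imputations with estimates 1.0, 2.0, 3.0 and within-imputation variances of 0.5 each give a pooled estimate of 2.0 and a total variance of 0.5 + (4/3)(1.0). The abstract's question is whether this simple variance formula stays consistent for general full sample estimators.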
-
Bayesian Nonparametric Policy Search with Application to Periodontal Recall Intervals
Authors:
Qian Guan,
Brian J. Reich,
Eric B. Laber,
Dipankar Bandyopadhyay
Abstract:
Tooth loss from periodontal disease is a major public health burden in the United States. Standard clinical practice is to recommend a dental visit every six months; however, this practice is not evidence-based, and poor dental outcomes and increasing dental insurance premiums indicate room for improvement. We consider a tailored approach that recommends recall time based on patient characteristics and medical history to minimize disease progression without increasing resource expenditures. We formalize this method as a dynamic treatment regime which comprises a sequence of decisions, one per stage of intervention, that follow a decision rule which maps current patient information to a recommendation for their next visit time. The dynamics of periodontal health, visit frequency, and patient compliance are complex, yet the estimated optimal regime must be interpretable to domain experts if it is to be integrated into clinical practice. We combine non-parametric Bayesian dynamics modeling with policy-search algorithms to estimate the optimal dynamic treatment regime within an interpretable class of regimes. Both simulation experiments and an application to a rich database of electronic dental records from the HealthPartners HMO show that our proposed method leads to better dental health without increasing the average recommended recall time relative to competing methods.
Submitted 9 October, 2018;
originally announced October 2018.