-
User-Controllable Recommendation via Counterfactual Retrospective and Prospective Explanations
Authors:
Juntao Tan,
Yingqiang Ge,
Yan Zhu,
Yinglong Xia,
Jiebo Luo,
Jianchao Ji,
Yongfeng Zhang
Abstract:
Modern recommender systems utilize users' historical behaviors to generate personalized recommendations. However, these systems often lack user controllability, leading to diminished user satisfaction and trust in the systems. Acknowledging the recent advancements in explainable recommender systems that enhance users' understanding of recommendation mechanisms, we propose leveraging these advancements to improve user controllability. In this paper, we present a user-controllable recommender system that seamlessly integrates explainability and controllability within a unified framework. By providing both retrospective and prospective explanations through counterfactual reasoning, users can customize their control over the system by interacting with these explanations.
Furthermore, we introduce and assess two attributes of controllability in recommendation systems: the complexity of controllability and the accuracy of controllability. Experimental evaluations on MovieLens and Yelp datasets substantiate the effectiveness of our proposed framework. Additionally, our experiments demonstrate that offering users control options can potentially enhance recommendation accuracy in the future. Source code and data are available at https://github.com/chrisjtan/ucr.
Submitted 1 August, 2023;
originally announced August 2023.
-
Multi-Loss Convolutional Network with Time-Frequency Attention for Speech Enhancement
Authors:
Liang Wan,
Hongqing Liu,
Yi Zhou,
Jie Ji
Abstract:
The Dual-Path Convolution Recurrent Network (DPCRN) was proposed to effectively exploit time-frequency domain information. By combining the DPRNN module with Convolution Recurrent Network (CRN), the DPCRN obtained a promising performance in speech separation with a limited model size. In this paper, we explore self-attention in the DPCRN module and design a model called Multi-Loss Convolutional Network with Time-Frequency Attention (MNTFA) for speech enhancement. We use self-attention modules to exploit the long-time information, where the intra-chunk self-attentions are used to model the spectrum pattern and the inter-chunk self-attentions are used to model the dependence between consecutive frames. Compared to DPRNN, axial self-attention greatly reduces the need for memory and computation, which is more suitable for long sequences of speech signals. In addition, we propose a joint training method of a multi-resolution STFT loss and a WavLM loss using a pre-trained WavLM network. Experiments show that with only 0.23M parameters, the proposed model achieves a better performance than DPCRN.
Submitted 15 June, 2023;
originally announced June 2023.
-
A flexible approach for causal inference with multiple treatments and clustered survival outcomes
Authors:
Liangyuan Hu,
Jiayi Ji,
Ronald D. Ennis,
Joseph W. Hogan
Abstract:
When drawing causal inferences about the effects of multiple treatments on clustered survival outcomes using observational data, we need to address implications of the multilevel data structure, multiple treatments, censoring and unmeasured confounding for causal analyses. Few off-the-shelf causal inference tools are available to simultaneously tackle these issues. We develop a flexible random-intercept accelerated failure time model, in which we use Bayesian additive regression trees to capture arbitrarily complex relationships between censored survival times and pre-treatment covariates and use the random intercepts to capture cluster-specific main effects. We develop an efficient Markov chain Monte Carlo algorithm to draw posterior inferences about the population survival effects of multiple treatments and examine the variability in cluster-level effects. We further propose an interpretable sensitivity analysis approach to evaluate the sensitivity of drawn causal inferences about treatment effect to the potential magnitude of departure from the causal assumption of no unmeasured confounding. Expansive simulations empirically validate and demonstrate good practical operating characteristics of our proposed methods. Applying the proposed methods to a dataset on older high-risk localized prostate cancer patients drawn from the National Cancer Database, we evaluate the comparative effects of three treatment approaches on patient survival, and assess the ramifications of potential unmeasured confounding. The methods developed in this work are readily available in the R package riAFTBART.
Submitted 16 February, 2022;
originally announced February 2022.
-
CIMTx: An R package for causal inference with multiple treatments using observational data
Authors:
Liangyuan Hu,
Jiayi Ji
Abstract:
CIMTx provides efficient and unified functions to implement modern methods for causal inferences with multiple treatments using observational data with a focus on binary outcomes. The methods include regression adjustment, inverse probability of treatment weighting, Bayesian additive regression trees, regression adjustment with multivariate spline of the generalized propensity score, vector matching and targeted maximum likelihood estimation. In addition, CIMTx illustrates ways in which users can simulate data adhering to the complex data structures in the multiple treatment setting. Furthermore, the CIMTx package offers a unique set of features to address the key causal assumptions: positivity and ignorability. For the positivity assumption, CIMTx demonstrates techniques to identify the common support region for retaining inferential units using inverse probability of treatment weighting, Bayesian additive regression trees and vector matching. To handle the ignorability assumption, CIMTx provides a flexible Monte Carlo sensitivity analysis approach to evaluate how causal conclusions would be altered in response to different magnitudes of departure from ignorable treatment assignment.
Submitted 14 September, 2022; v1 submitted 19 October, 2021;
originally announced October 2021.
-
Estimating the causal effects of multiple intermittent treatments with application to COVID-19
Authors:
Liangyuan Hu,
Jiayi Ji,
Himanshu Joshi,
Erick Scott,
Fan Li
Abstract:
To draw real-world evidence about the comparative effectiveness of multiple time-varying treatments on patient survival, we develop a joint marginal structural survival model and a novel weighting strategy to account for time-varying confounding and censoring. Our methods formulate complex longitudinal treatments with multiple start/stop switches as the recurrent events with discontinuous intervals of treatment eligibility. We derive the weights in continuous time to handle a complex longitudinal dataset without the need to discretize or artificially align the measurement times. We further use machine learning models designed for censored survival data with time-varying covariates and the kernel function estimator of the baseline intensity to efficiently estimate the continuous-time weights. Our simulations demonstrate that the proposed methods provide better bias reduction and nominal coverage probability when analyzing observational longitudinal survival data with irregularly spaced time intervals, compared to conventional methods that require aligned measurement time points. We apply the proposed methods to a large-scale COVID-19 dataset to estimate the causal effects of several COVID-19 treatments on the composite of in-hospital mortality and ICU admission.
Submitted 4 August, 2023; v1 submitted 27 September, 2021;
originally announced September 2021.
-
Modeling and Decoupling Systemic Risk
Authors:
Jingyu Ji,
Deyuan Li,
Zhengjun Zhang
Abstract:
Identifying systemic risk patterns in geopolitical, economic, financial, environmental, transportation, epidemiological systems and their impacts is the key to risk management. This paper proposes a new nonlinear time series model: autoregressive conditional accelerated Fréchet (AcAF) model and introduces two new endopathic and exopathic competing risk measures for better learning risk patterns, decoupling systemic risk, and making better risk management. The paper establishes the probabilistic properties of stationarity and ergodicity of the AcAF model. Simulation demonstrates the efficiency of the proposed estimators and the AcAF model's flexibility in modeling heterogeneous data. Empirical studies on the stock returns in S&P 500 and the cryptocurrency trading show the superior performance of the proposed model in terms of the identified risk patterns, endopathic and exopathic competing risks, being informative with greater interpretability, enhancing the understanding of the systemic risks of a market and their causes, and making better risk management possible.
Submitted 2 September, 2021; v1 submitted 21 July, 2021;
originally announced July 2021.
-
Variable selection with missing data in both covariates and outcomes: Imputation and machine learning
Authors:
Liangyuan Hu,
Jung-Yi Joyce Lin,
Jiayi Ji
Abstract:
The missing data issue is ubiquitous in health studies. Variable selection in the presence of both missing covariates and outcomes is an important statistical research topic but has been less studied. Existing literature focuses on parametric regression techniques that provide direct parameter estimates of the regression model. Flexible nonparametric machine learning methods considerably mitigate the reliance on the parametric assumptions, but do not provide as naturally defined variable importance measure as the covariate effect native to parametric models. We investigate a general variable selection approach when both the covariates and outcomes can be missing at random and have general missing data patterns. This approach exploits the flexibility of machine learning modeling techniques and bootstrap imputation, which is amenable to nonparametric methods in which the covariate effects are not directly available. We conduct expansive simulations investigating the practical operating characteristics of the proposed variable selection approach, when combined with four tree-based machine learning methods, XGBoost, Random Forests, Bayesian Additive Regression Trees (BART) and Conditional Random Forests, and two commonly used parametric methods, lasso and backward stepwise selection. Numeric results suggest that when combined with bootstrap imputation, XGBoost and BART have the overall best variable selection performance with respect to the $F_1$ score and Type I error across various settings. In general, there is no significant difference in the variable selection performance due to imputation methods. We further demonstrate the methods via a case study of risk factors for 3-year incidence of metabolic syndrome with data from the Study of Women's Health Across the Nation.
Submitted 7 July, 2021; v1 submitted 6 April, 2021;
originally announced April 2021.
-
Neural networks behave as hash encoders: An empirical study
Authors:
Fengxiang He,
Shiye Lei,
Jianmin Ji,
Dacheng Tao
Abstract:
The input space of a neural network with ReLU-like activations is partitioned into multiple linear regions, each corresponding to a specific activation pattern of the included ReLU-like activations. We demonstrate that this partition exhibits the following encoding properties across a variety of deep learning models: (1) determinism: almost every linear region contains at most one training example. We can therefore represent almost every training example by a unique activation pattern, which is parameterized by a neural code; and (2) categorization: according to the neural code, simple algorithms, such as $K$-Means, $K$-NN, and logistic regression, can achieve fairly good performance on both training and test data. These encoding properties surprisingly suggest that normal neural networks well-trained for classification behave as hash encoders without any extra efforts. In addition, the encoding properties exhibit variability in different scenarios. Further experiments demonstrate that model size, training time, training sample size, regularization, and label noise contribute in shaping the encoding properties, while the impacts of the first three are dominant. We then define an activation hash phase chart to represent the space expanded by model size, training time, training sample size, and the encoding properties, which is divided into three canonical regions: under-expressive regime, critically-expressive regime, and sufficiently-expressive regime. The source code package is available at https://github.com/LeavesLei/activation-code.
Submitted 14 January, 2021;
originally announced January 2021.
-
A flexible sensitivity analysis approach for unmeasured confounding with multiple treatments and a binary outcome with application to SEER-Medicare lung cancer data
Authors:
Liangyuan Hu,
Jungang Zou,
Chenyang Gu,
Jiayi Ji,
Michael Lopez,
Minal Kale
Abstract:
In the absence of a randomized experiment, a key assumption for drawing causal inference about treatment effects is the ignorable treatment assignment. Violations of the ignorability assumption may lead to biased treatment effect estimates. Sensitivity analysis helps gauge how causal conclusions will be altered in response to the potential magnitude of departure from the ignorability assumption. However, sensitivity analysis approaches for unmeasured confounding in the context of multiple treatments and binary outcomes are scarce. We propose a flexible Monte Carlo sensitivity analysis approach for causal inference in such settings. We first derive the general form of the bias introduced by unmeasured confounding, with emphasis on theoretical properties uniquely relevant to multiple treatments. We then propose methods to encode the impact of unmeasured confounding on potential outcomes and adjust the estimates of causal effects in which the presumed unmeasured confounding is removed. Our proposed methods embed nested multiple imputation within the Bayesian framework, which allow for seamless integration of the uncertainty about the values of the sensitivity parameters and the sampling variability, as well as use of the Bayesian Additive Regression Trees for modeling flexibility. Expansive simulations validate our methods and gain insight into sensitivity analysis with multiple treatments. We use the SEER-Medicare data to demonstrate sensitivity analysis using three treatments for early stage non-small cell lung cancer. The methods developed in this work are readily available in the R package SAMTx.
Submitted 13 August, 2021; v1 submitted 10 December, 2020;
originally announced December 2020.
-
Estimating heterogeneous survival treatment effect in observational data using machine learning
Authors:
Liangyuan Hu,
Jiayi Ji,
Fan Li
Abstract:
Methods for estimating heterogeneous treatment effect in observational data have largely focused on continuous or binary outcomes, and have been relatively less vetted with survival outcomes. Using flexible machine learning methods in the counterfactual framework is a promising approach to address challenges due to complex individual characteristics, to which treatments need to be tailored. To evaluate the operating characteristics of recent survival machine learning methods for the estimation of treatment effect heterogeneity and inform better practice, we carry out a comprehensive simulation study presenting a wide range of settings describing confounded heterogeneous survival treatment effects and varying degrees of covariate overlap. Our results suggest that the nonparametric Bayesian Additive Regression Trees within the framework of accelerated failure time model (AFT-BART-NP) consistently yields the best performance, in terms of bias, precision and expected regret. Moreover, the credible interval estimators from AFT-BART-NP provide close to nominal frequentist coverage for the individual survival treatment effect when the covariate overlap is at least moderate. Including a non-parametrically estimated propensity score as an additional fixed covariate in the AFT-BART-NP model formulation can further improve its efficiency and frequentist coverage. Finally, we demonstrate the application of flexible causal machine learning estimators through a comprehensive case study examining the heterogeneous survival effects of two radiotherapy approaches for localized high-risk prostate cancer.
Submitted 19 May, 2021; v1 submitted 16 August, 2020;
originally announced August 2020.
-
Detecting Problem Statements in Peer Assessments
Authors:
Yunkai Xiao,
Gabriel Zingle,
Qinjin Jia,
Harsh R. Shah,
Yi Zhang,
Tianyi Li,
Mohsin Karovaliya,
Weixiang Zhao,
Yang Song,
Jie Ji,
Ashwin Balasubramaniam,
Harshit Patel,
Priyankha Bhalasubbramanian,
Vikram Patel,
Edward F. Gehringer
Abstract:
Effective peer assessment requires students to be attentive to the deficiencies in the work they rate. Thus, their reviews should identify problems. But what ways are there to check that they do? We attempt to automate the process of deciding whether a review comment detects a problem. We use over 18,000 review comments that were labeled by the reviewees as either detecting or not detecting a problem with the work. We deploy several traditional machine-learning models, as well as neural-network models using GloVe and BERT embeddings. We find that the best performer is the Hierarchical Attention Network classifier, followed by the Bidirectional Gated Recurrent Units (GRU) Attention and Capsule model with scores of 93.1% and 90.5% respectively. The best non-neural network model was the support vector machine with a score of 89.71%. This is followed by the Stochastic Gradient Descent model and the Logistic Regression model with 89.70% and 88.98%.
Submitted 29 May, 2020;
originally announced June 2020.
-
Estimation of Causal Effects of Multiple Treatments in Observational Studies with a Binary Outcome
Authors:
Liangyuan Hu,
Chenyang Gu,
Michael Lopez,
Jiayi Ji,
Juan Wisnivesky
Abstract:
There is a dearth of robust methods to estimate the causal effects of multiple treatments when the outcome is binary. This paper uses two unique sets of simulations to propose and evaluate the use of Bayesian Additive Regression Trees (BART) in such settings. First, we compare BART to several approaches that have been proposed for continuous outcomes, including inverse probability of treatment weighting (IPTW), targeted maximum likelihood estimator (TMLE), vector matching and regression adjustment. Results suggest that under conditions of non-linearity and non-additivity of both the treatment assignment and outcome generating mechanisms, BART, TMLE and IPTW using generalized boosted models (GBM) provide better bias reduction and smaller root mean squared error. BART and TMLE provide more consistent 95 per cent CI coverage and better large-sample convergence property. Second, we supply BART with a strategy to identify a common support region for retaining inferential units and for avoiding extrapolating over areas of the covariate space where common support does not exist. BART retains more inferential units than the generalized propensity score based strategy, and shows lower bias, compared to TMLE or GBM, in a variety of scenarios differing by the degree of covariate overlap. A case study examining the effects of three surgical approaches for non-small cell lung cancer demonstrates the methods.
Submitted 16 January, 2020;
originally announced January 2020.
-
Doubly Robust Sure Screening for Elliptical Copula Regression Model
Authors:
Yong He,
Liang Zhang,
Jiadong JI,
Xinsheng Zhang
Abstract:
Regression analysis has always been a hot research topic in statistics. We propose a very flexible semi-parametric regression model called Elliptical Copula Regression (ECR) model, which covers a large class of linear and nonlinear regression models such as the additive regression model and the single index model. Besides, the ECR model can capture the heavy-tail characteristic and tail dependence between variables, thus it could be widely applied in many areas such as econometrics and finance. In this paper we mainly focus on the feature screening problem for the ECR model in the ultra-high dimensional setting. We propose a doubly robust sure screening procedure for the ECR model, in which two types of correlation coefficient are involved: Kendall tau correlation and canonical correlation. Theoretical analysis shows that the procedure enjoys the sure screening property, i.e., with probability tending to 1, the screening procedure selects all important variables and substantially reduces the dimensionality to a moderate size against the sample size. Thorough numerical studies are conducted to illustrate its advantage over existing sure independence screening methods and thus it can be used as a safe replacement of the existing procedures in practice. Finally, the proposed procedure is applied to a real gene-expression data set to show its empirical usefulness.
Submitted 26 August, 2018;
originally announced August 2018.
-
Deploy Large-Scale Deep Neural Networks in Resource Constrained IoT Devices with Local Quantization Region
Authors:
Yi Yang,
Andy Chen,
Xiaoming Chen,
Jiang Ji,
Zhenyang Chen,
Yan Dai
Abstract:
Implementing large-scale deep neural networks with high computational complexity on low-cost IoT devices may inevitably be constrained by limited computation resources, making it hard for the devices to respond in real time. This disjunction makes the state-of-the-art deep learning algorithms, i.e. CNN (Convolutional Neural Networks), incompatible with the IoT world. We present a low-bit (ranging from 8-bit to 1-bit) scheme with our local quantization region algorithm. We use models in the Caffe model zoo as our example tasks to evaluate the effect of our low precision data representation scheme. With the availability of the local quantization region, we find that implementations on top of those schemes could largely retain the model accuracy, in addition to reducing computational complexity. For example, our 8-bit scheme has no drops on top-1 and top-5 accuracy with a 2x speedup on the Intel Edison IoT platform. Implementations based on our 4-bit, 2-bit or 1-bit scheme are also applicable to IoT devices, with the advantage of low computational complexity. For example, the drop on our task is only 0.7% when using the 2-bit scheme, a scheme which could largely save transistors. Making the low-bit scheme usable here opens a new door for further optimization on commodity IoT controllers, i.e. extra speed-up could be achieved by replacing multiply-accumulate operations with the proposed table look-up operations. The whole study offers a new approach to relieve the challenge of bringing advanced deep learning algorithms to resource-constrained low-cost IoT devices.
Submitted 23 May, 2018;
originally announced May 2018.