
Showing 1–43 of 43 results for author: Qi, Z

Searching in archive stat.
  1. arXiv:2506.20406  [pdf, ps, other]

    stat.ML cs.IT cs.LG stat.ME

    POLAR: A Pessimistic Model-based Policy Learning Algorithm for Dynamic Treatment Regimes

    Authors: Ruijia Zhang, Zhengling Qi, Yue Wu, Xiangyu Zhang, Yanxun Xu

    Abstract: Dynamic treatment regimes (DTRs) provide a principled framework for optimizing sequential decision-making in domains where decisions must adapt over time in response to individual trajectories, such as healthcare, education, and digital interventions. However, existing statistical methods often rely on strong positivity assumptions and lack robustness under partial data coverage, while offline rei…

    Submitted 25 June, 2025; originally announced June 2025.

  2. arXiv:2506.20048  [pdf, ps, other]

    stat.ML cs.LG

    A Principled Path to Fitted Distributional Evaluation

    Authors: Sungee Hong, Jiayi Wang, Zhengling Qi, Raymond Ka Wai Wong

    Abstract: In reinforcement learning, distributional off-policy evaluation (OPE) focuses on estimating the return distribution of a target policy using offline data collected under a different policy. This work focuses on extending the widely used fitted-Q evaluation -- developed for expectation-based reinforcement learning -- to the distributional OPE setting. We refer to this extension as fitted distributi…

    Submitted 24 June, 2025; originally announced June 2025.

  3. arXiv:2506.07140  [pdf, other]

    stat.ML cs.LG econ.EM

    Quantile-Optimal Policy Learning under Unmeasured Confounding

    Authors: Zhongren Chen, Siyu Chen, Zhengling Qi, Xiaohong Chen, Zhuoran Yang

    Abstract: We study quantile-optimal policy learning where the goal is to find a policy whose reward distribution has the largest $\alpha$-quantile for some $\alpha \in (0, 1)$. We focus on the offline setting whose generating process involves unobserved confounders. Such a problem suffers from three main challenges: (i) nonlinearity of the quantile objective as a functional of the reward distribution, (ii) unobserved…

    Submitted 8 June, 2025; originally announced June 2025.

  4. arXiv:2505.23783  [pdf, ps, other]

    stat.ML cs.AI cs.CL cs.LG

    Boosting In-Context Learning in LLMs Through the Lens of Classical Supervised Learning

    Authors: Korel Gundem, Juncheng Dong, Dennis Zhang, Vahid Tarokh, Zhengling Qi

    Abstract: In-Context Learning (ICL) allows Large Language Models (LLMs) to adapt to new tasks with just a few examples, but their predictions often suffer from systematic biases, leading to unstable performance in classification. While calibration techniques have been proposed to mitigate these biases, we show that, in the logit space, many of these methods are equivalent to merely shifting the LLM's decision bo…

    Submitted 22 May, 2025; originally announced May 2025.

  5. arXiv:2505.14999  [pdf, ps, other]

    cs.LG cs.AI cs.CL stat.ML

    Learning to Rank Chain-of-Thought: An Energy-Based Approach with Outcome Supervision

    Authors: Eric Hanchen Jiang, Haozheng Luo, Shengyuan Pang, Xiaomin Li, Zhenting Qi, Hengli Li, Cheng-Fu Yang, Zongyu Lin, Xinfeng Li, Hao Xu, Kai-Wei Chang, Ying Nian Wu

    Abstract: Mathematical reasoning presents a significant challenge for Large Language Models (LLMs), often requiring robust multi-step logical consistency. While Chain of Thought (CoT) prompting elicits reasoning steps, it doesn't guarantee correctness, and improving reliability via extensive sampling is computationally costly. This paper introduces the Energy Outcome Reward Model (EORM), an effective, light…

    Submitted 14 June, 2025; v1 submitted 20 May, 2025; originally announced May 2025.

  6. arXiv:2505.00304  [pdf, other]

    stat.ML cs.LG stat.ME

    Reinforcement Learning with Continuous Actions Under Unmeasured Confounding

    Authors: Yuhan Li, Eugene Han, Yifan Hu, Wenzhuo Zhou, Zhengling Qi, Yifan Cui, Ruoqing Zhu

    Abstract: This paper addresses the challenge of offline policy learning in reinforcement learning with continuous action spaces when unmeasured confounders are present. While most existing research focuses on policy evaluation within partially observable Markov decision processes (POMDPs) and assumes discrete action spaces, we advance this field by establishing a novel identification result to enable the no…

    Submitted 1 May, 2025; originally announced May 2025.

  7. arXiv:2504.09831  [pdf, other]

    stat.ML cs.AI cs.LG math.ST stat.AP

    Offline Dynamic Inventory and Pricing Strategy: Addressing Censored and Dependent Demand

    Authors: Korel Gundem, Zhengling Qi

    Abstract: In this paper, we study the offline sequential feature-based pricing and inventory control problem where the current demand depends on the past demand levels and any demand exceeding the available inventory is lost. Our goal is to leverage the offline dataset, consisting of past prices, ordering quantities, inventory levels, covariates, and censored sales levels, to estimate the optimal pricing an…

    Submitted 13 April, 2025; originally announced April 2025.

    MSC Class: 90B05; 68T05; 90C40; 62N02

  8. arXiv:2412.05783  [pdf, other]

    cs.LG stat.ML

    Two-way Deconfounder for Off-policy Evaluation in Causal Reinforcement Learning

    Authors: Shuguang Yu, Shuxing Fang, Ruixin Peng, Zhengling Qi, Fan Zhou, Chengchun Shi

    Abstract: This paper studies off-policy evaluation (OPE) in the presence of unmeasured confounders. Inspired by the two-way fixed effects regression model widely used in the panel data literature, we propose a two-way unmeasured confounding assumption to model the system dynamics in causal reinforcement learning and develop a two-way deconfounder algorithm that devises a neural tensor network to simultaneou…

    Submitted 7 December, 2024; originally announced December 2024.

  9. arXiv:2412.00664  [pdf, other]

    cs.LG cs.CV stat.ML

    Improving Decoupled Posterior Sampling for Inverse Problems using Data Consistency Constraint

    Authors: Zhi Qi, Shihong Yuan, Yulin Yuan, Linling Kuang, Yoshiyuki Kabashima, Xiangming Meng

    Abstract: Diffusion models have shown strong performance in solving inverse problems through posterior sampling, but they suffer from errors during earlier steps. To mitigate this issue, several Decoupled Posterior Sampling methods have been recently proposed. However, the reverse process in these methods ignores measurement information, leading to errors that impede effective optimization in subsequent s…

    Submitted 14 April, 2025; v1 submitted 30 November, 2024; originally announced December 2024.

  10. arXiv:2411.08126  [pdf, other]

    stat.ML cs.LG

    A Tale of Two Cities: Pessimism and Opportunism in Offline Dynamic Pricing

    Authors: Zeyu Bian, Zhengling Qi, Cong Shi, Lan Wang

    Abstract: This paper studies offline dynamic pricing without a data coverage assumption, thereby allowing for any price, including the optimal one, not being observed in the offline data. Previous approaches that rely on various coverage assumptions, such as that the optimal prices are observable, would lead to suboptimal decisions and, consequently, reduced profits. We address this challenge by framing the p…

    Submitted 12 November, 2024; originally announced November 2024.

  11. arXiv:2408.09155  [pdf, other]

    stat.ME math.ST stat.CO stat.ML

    Learning Robust Treatment Rules for Censored Data

    Authors: Yifan Cui, Junyi Liu, Tao Shen, Zhengling Qi, Xi Chen

    Abstract: There is a fast-growing literature on estimating optimal treatment rules directly by maximizing the expected outcome. In biomedical studies and operations applications, censored survival outcomes are frequently observed, in which case the restricted mean survival time and survival probability are of great interest. In this paper, we propose two robust criteria for learning optimal treatment rules wi…

    Submitted 17 August, 2024; originally announced August 2024.

  12. arXiv:2402.01900  [pdf, other]

    stat.ML cs.LG

    Distributional Off-policy Evaluation with Bellman Residual Minimization

    Authors: Sungee Hong, Zhengling Qi, Raymond K. W. Wong

    Abstract: We study distributional off-policy evaluation (OPE), whose goal is to learn the distribution of the return for a target policy using offline data generated by a different policy. The theoretical foundation of much existing work relies on supremum-extended statistical distances such as the supremum-Wasserstein distance, which are hard to estimate. In contrast, we study the more manageable ex…

    Submitted 12 March, 2025; v1 submitted 2 February, 2024; originally announced February 2024.

  13. arXiv:2310.18715  [pdf, other]

    cs.LG cs.AI stat.ML

    Robust Offline Reinforcement learning with Heavy-Tailed Rewards

    Authors: Jin Zhu, Runzhe Wan, Zhengling Qi, Shikai Luo, Chengchun Shi

    Abstract: This paper endeavors to augment the robustness of offline reinforcement learning (RL) in scenarios laden with heavy-tailed rewards, a prevalent circumstance in real-world applications. We propose two algorithmic frameworks, ROAM and ROOM, for robust off-policy evaluation and offline policy optimization (OPO), respectively. Central to our frameworks is the strategic incorporation of the median-of-m…

    Submitted 30 March, 2024; v1 submitted 28 October, 2023; originally announced October 2023.

    Comments: 23 pages, 6 figures. Proceedings of the 27th International Conference on Artificial Intelligence and Statistics (AISTATS) 2024

  14. arXiv:2306.08719  [pdf, other]

    stat.ME cs.LG

    Off-policy Evaluation in Doubly Inhomogeneous Environments

    Authors: Zeyu Bian, Chengchun Shi, Zhengling Qi, Lan Wang

    Abstract: This work aims to study off-policy evaluation (OPE) under scenarios where two key reinforcement learning (RL) assumptions -- temporal stationarity and individual homogeneity -- are both violated. To handle the "double inhomogeneities", we propose a class of latent factor models for the reward and observation transition functions, under which we develop a general OPE framework that consists of both m…

    Submitted 18 August, 2024; v1 submitted 14 June, 2023; originally announced June 2023.

  15. arXiv:2305.17083  [pdf, other]

    stat.ML cs.LG econ.EM math.ST stat.ME

    A Policy Gradient Method for Confounded POMDPs

    Authors: Mao Hong, Zhengling Qi, Yanxun Xu

    Abstract: In this paper, we propose a policy gradient method for confounded partially observable Markov decision processes (POMDPs) with continuous state and observation spaces in the offline setting. We first establish a novel identification result to non-parametrically estimate any history-dependent policy gradient under POMDPs using the offline data. The identification enables us to solve a sequence of c…

    Submitted 30 November, 2023; v1 submitted 26 May, 2023; originally announced May 2023.

    Comments: 95 pages, 3 figures

  16. arXiv:2303.14281  [pdf, other]

    stat.ML cs.LG

    Sequential Knockoffs for Variable Selection in Reinforcement Learning

    Authors: Tao Ma, Jin Zhu, Hengrui Cai, Zhengling Qi, Yunxiao Chen, Chengchun Shi, Eric B. Laber

    Abstract: In real-world applications of reinforcement learning, it is often challenging to obtain a state representation that is parsimonious and satisfies the Markov property without prior knowledge. Consequently, it is common practice to construct a state larger than necessary, e.g., by concatenating measurements over contiguous time points. However, needlessly increasing the dimension of the state may sl…

    Submitted 30 July, 2024; v1 submitted 24 March, 2023; originally announced March 2023.

  17. arXiv:2302.12670  [pdf, ps, other]

    stat.ME cs.LG econ.EM stat.ML

    Personalized Pricing with Invalid Instrumental Variables: Identification, Estimation, and Policy Learning

    Authors: Rui Miao, Zhengling Qi, Cong Shi, Lin Lin

    Abstract: Pricing based on individual customer characteristics is widely used to maximize sellers' revenues. This work studies offline personalized pricing under endogeneity using an instrumental variable approach. Standard instrumental variable methods in causal inference/econometrics either focus on a discrete treatment space or require the exclusion restriction of instruments from having a direct effect…

    Submitted 24 February, 2023; originally announced February 2023.

  18. arXiv:2302.03821  [pdf, other]

    cs.LG math.OC stat.ME stat.ML

    PASTA: Pessimistic Assortment Optimization

    Authors: Juncheng Dong, Weibin Mo, Zhengling Qi, Cong Shi, Ethan X. Fang, Vahid Tarokh

    Abstract: We consider a class of assortment optimization problems in an offline data-driven setting. A firm does not know the underlying customer choice model but has access to an offline dataset consisting of the historically offered assortment set, customer choice, and revenue. The objective is to use the offline dataset to find an optimal assortment. Due to the combinatorial nature of assortment optimiza…

    Submitted 7 February, 2023; originally announced February 2023.

  19. arXiv:2301.13152  [pdf, other]

    stat.ML cs.LG econ.EM stat.ME

    STEEL: Singularity-aware Reinforcement Learning

    Authors: Xiaohong Chen, Zhengling Qi, Runzhe Wan

    Abstract: Batch reinforcement learning (RL) aims at leveraging pre-collected data to find an optimal policy that maximizes the expected total rewards in a dynamic environment. The existing methods require an absolute continuity assumption (e.g., there do not exist non-overlapping regions) on the distribution induced by target policies with respect to the data distribution over either the state or action or b…

    Submitted 25 June, 2024; v1 submitted 30 January, 2023; originally announced January 2023.

  20. arXiv:2301.02220  [pdf, ps, other]

    stat.ML cs.LG

    Value Enhancement of Reinforcement Learning via Efficient and Robust Trust Region Optimization

    Authors: Chengchun Shi, Zhengling Qi, Jianing Wang, Fan Zhou

    Abstract: Reinforcement learning (RL) is a powerful machine learning technique that enables an intelligent agent to learn an optimal policy that maximizes the cumulative rewards in sequential decision making. Most methods in the existing literature are developed in online settings where the data are easy to collect or simulate. Motivated by high-stakes domains such as mobile health studies with l…

    Submitted 5 January, 2023; originally announced January 2023.

  21. arXiv:2212.12167  [pdf, ps, other]

    stat.ML cs.LG stat.ME

    Offline Reinforcement Learning for Human-Guided Human-Machine Interaction with Private Information

    Authors: Zuyue Fu, Zhengling Qi, Zhuoran Yang, Zhaoran Wang, Lan Wang

    Abstract: Motivated by human-machine interactions such as training chatbots to improve customer satisfaction, we study human-guided human-machine interaction involving private information. We model this interaction as a two-player turn-based game, where one player (Alice, a human) guides the other player (Bob, a machine) towards a common goal. Specifically, we focus on offline reinforcement learning (…

    Submitted 23 December, 2022; originally announced December 2022.

  22. arXiv:2211.06569  [pdf, other]

    cs.LG stat.AP stat.ME stat.ML

    RISE: Robust Individualized Decision Learning with Sensitive Variables

    Authors: Xiaoqing Tan, Zhengling Qi, Christopher W. Seymour, Lu Tang

    Abstract: This paper introduces RISE, a robust individualized decision learning framework with sensitive variables, where sensitive variables are collectible data and important to the intervention decision, but their inclusion in decision making is prohibited due to reasons such as delayed availability or fairness concerns. A naive baseline is to ignore these sensitive variables in learning decision rules,…

    Submitted 11 November, 2022; originally announced November 2022.

    Comments: Accepted at NeurIPS 2022

  23. arXiv:2210.14420  [pdf, other]

    stat.ML cs.LG

    Optimizing Pessimism in Dynamic Treatment Regimes: A Bayesian Learning Approach

    Authors: Yunzhe Zhou, Zhengling Qi, Chengchun Shi, Lexin Li

    Abstract: In this article, we propose a novel pessimism-based Bayesian learning method for optimal dynamic treatment regimes in the offline setting. When the coverage condition does not hold, which is common for offline data, the existing solutions would produce sub-optimal policies. The pessimism principle addresses this issue by discouraging recommendation of actions that are less explored conditioning on…

    Submitted 21 February, 2023; v1 submitted 25 October, 2022; originally announced October 2022.

    Comments: 18 pages, 6 figures. Proceedings of the 26th International Conference on Artificial Intelligence and Statistics (AISTATS) 2023

  24. arXiv:2209.15448  [pdf, other]

    cs.LG math.ST stat.ME

    Blessing from Human-AI Interaction: Super Reinforcement Learning in Confounded Environments

    Authors: Jiayi Wang, Zhengling Qi, Chengchun Shi

    Abstract: As AI becomes more prevalent throughout society, effective methods of integrating humans and AI systems that leverage their respective strengths and mitigate risk have become an important priority. In this paper, we introduce the paradigm of super reinforcement learning that takes advantage of Human-AI interaction for data-driven sequential decision making. This approach utilizes the observed acti…

    Submitted 20 October, 2023; v1 submitted 29 September, 2022; originally announced September 2022.

  25. arXiv:2209.10064  [pdf, other]

    stat.ML cs.LG math.ST

    Off-Policy Evaluation for Episodic Partially Observable Markov Decision Processes under Non-Parametric Models

    Authors: Rui Miao, Zhengling Qi, Xiaoke Zhang

    Abstract: We study the problem of off-policy evaluation (OPE) for episodic Partially Observable Markov Decision Processes (POMDPs) with continuous states. Motivated by the recently proposed proximal causal inference framework, we develop a non-parametric identification result for estimating the policy value via a sequence of so-called V-bridge functions with the help of time-dependent proxy variables. We th…

    Submitted 16 October, 2022; v1 submitted 20 September, 2022; originally announced September 2022.

  26. arXiv:2209.08666  [pdf, other]

    cs.LG stat.ME

    Offline Reinforcement Learning with Instrumental Variables in Confounded Markov Decision Processes

    Authors: Zuyue Fu, Zhengling Qi, Zhaoran Wang, Zhuoran Yang, Yanxun Xu, Michael R. Kosorok

    Abstract: We study offline reinforcement learning (RL) in the face of unmeasured confounders. Due to the lack of online interaction with the environment, offline RL faces the following two significant challenges: (i) the agent may be confounded by the unobserved state variables; (ii) the offline data collected a priori does not provide sufficient coverage for the environment. To tackle the above chal…

    Submitted 18 September, 2022; originally announced September 2022.

  27. arXiv:2207.11532  [pdf, other]

    stat.ME

    Change Point Detection for High-dimensional Linear Models: A General Tail-adaptive Approach

    Authors: Bin Liu, Zhengling Qi, Xinsheng Zhang, Yufeng Liu

    Abstract: We propose a novel approach for detecting change points in high-dimensional linear regression models. Unlike previous research that relied on strict Gaussian/sub-Gaussian error assumptions and had prior knowledge of change points, we propose a tail-adaptive method for change point detection and estimation. We use a weighted combination of composite quantile and least squares losses to build a new…

    Submitted 21 May, 2024; v1 submitted 23 July, 2022; originally announced July 2022.

  28. arXiv:2201.06169  [pdf, ps, other]

    math.ST cs.LG econ.EM stat.ML

    On Well-posedness and Minimax Optimal Rates of Nonparametric Q-function Estimation in Off-policy Evaluation

    Authors: Xiaohong Chen, Zhengling Qi

    Abstract: We study the off-policy evaluation (OPE) problem in an infinite-horizon Markov decision process with continuous states and actions. We recast the $Q$-function estimation into a special form of the nonparametric instrumental variables (NPIV) estimation problem. We first show that under one mild condition the NPIV formulation of $Q$-function estimation is well-posed in the sense of $L^2$-measure of…

    Submitted 26 June, 2022; v1 submitted 16 January, 2022; originally announced January 2022.

  29. Rejoinder: Learning Optimal Distributionally Robust Individualized Treatment Rules

    Authors: Weibin Mo, Zhengling Qi, Yufeng Liu

    Abstract: We thank the editors for the opportunity for this discussion and the discussants for their insightful comments and thoughtful contributions. We also want to congratulate Kallus (2020) for his inspiring work in improving the efficiency of policy learning by retargeting. Motivated by the discussion in Dukes and Vansteelandt (2020), we first point out interesting connections and distinctions bet…

    Submitted 17 October, 2021; originally announced October 2021.

    Journal ref: Journal of the American Statistical Association, 116:534, 699-707 (2021)

  30. arXiv:2109.04640  [pdf, other]

    cs.LG stat.ME

    Projected State-action Balancing Weights for Offline Reinforcement Learning

    Authors: Jiayi Wang, Zhengling Qi, Raymond K. W. Wong

    Abstract: Offline policy evaluation (OPE) is considered a fundamental and challenging problem in reinforcement learning (RL). This paper focuses on the value estimation of a target policy based on pre-collected data generated from a possibly different policy, under the framework of infinite-horizon Markov decision processes. Motivated by the recently developed marginal importance sampling method in RL and t…

    Submitted 9 June, 2022; v1 submitted 9 September, 2021; originally announced September 2021.

  31. arXiv:2105.10635  [pdf, other]

    cs.LG stat.ML

    Two-stage Training for Learning from Label Proportions

    Authors: Jiabin Liu, Bo Wang, Xin Shen, Zhiquan Qi, Yingjie Tian

    Abstract: Learning from label proportions (LLP) aims at learning an instance-level classifier with label proportions in grouped training data. Existing deep learning based LLP methods utilize end-to-end pipelines to obtain the proportional loss with Kullback-Leibler divergence between the bag-level prior and posterior class distributions. However, the unconstrained optimization on this objective can hardly…

    Submitted 21 May, 2021; originally announced May 2021.

    Comments: 10 pages, 4 figures, 5 tables, accepted by IJCAI 2021

  32. arXiv:2105.01187  [pdf, ps, other]

    stat.ME cs.LG stat.ML

    Proximal Learning for Individualized Treatment Regimes Under Unmeasured Confounding

    Authors: Zhengling Qi, Rui Miao, Xiaoke Zhang

    Abstract: Data-driven individualized decision making has recently received increasing research interest. Most existing methods rely on the assumption of no unmeasured confounding, which unfortunately cannot be ensured in practice, especially in observational studies. Motivated by the recently proposed proximal causal inference, we develop several proximal learning approaches to estimating optimal individualiz…

    Submitted 22 December, 2022; v1 submitted 3 May, 2021; originally announced May 2021.

  33. arXiv:2011.04185  [pdf, other]

    math.ST cs.LG stat.ML

    Robust Batch Policy Learning in Markov Decision Processes

    Authors: Zhengling Qi, Peng Liao

    Abstract: We study the offline data-driven sequential decision making problem in the framework of the Markov decision process (MDP). In order to enhance the generalizability and adaptivity of the learned policy, we propose to evaluate each policy by a set of the average rewards with respect to distributions centered at the policy induced stationary distribution. Given a pre-collected dataset of multiple traject…

    Submitted 9 November, 2021; v1 submitted 8 November, 2020; originally announced November 2020.

  34. arXiv:2007.11771  [pdf, other]

    math.ST stat.ML

    Batch Policy Learning in Average Reward Markov Decision Processes

    Authors: Peng Liao, Zhengling Qi, Runzhe Wan, Predrag Klasnja, Susan Murphy

    Abstract: We consider the batch (off-line) policy learning problem in the infinite horizon Markov Decision Process. Motivated by mobile health applications, we focus on learning a policy that maximizes the long-term average reward. We propose a doubly robust estimator for the average reward and show that it achieves semiparametric efficiency. Further, we develop an optimization algorithm to compute the optim…

    Submitted 17 September, 2022; v1 submitted 22 July, 2020; originally announced July 2020.

  35. arXiv:2006.15121  [pdf, other]

    stat.ML cs.LG

    Learning Optimal Distributionally Robust Individualized Treatment Rules

    Authors: Weibin Mo, Zhengling Qi, Yufeng Liu

    Abstract: Recent developments in data-driven decision science have seen great advances in individualized decision making. Given data with individual covariates, treatment assignments and outcomes, policy makers seek the best individualized treatment rule (ITR) that maximizes the expected outcome, known as the value function. Many existing methods assume that the training and testing distributions are the same. How…

    Submitted 26 June, 2020; originally announced June 2020.

  36. arXiv:1912.08865  [pdf, other]

    cs.LG math.ST stat.ML

    Adversarial VC-dimension and Sample Complexity of Neural Networks

    Authors: Zetong Qi, T. J. Wilder

    Abstract: Adversarial attacks during the testing phase of neural networks pose a challenge for the deployment of neural networks in security-critical settings. These attacks can be performed by adding noise that is imperceptible to humans on top of the original data. By doing so, an attacker can create an adversarial sample, which will cause neural networks to misclassify. In this paper, we seek to understa…

    Submitted 18 December, 2019; originally announced December 2019.

  37. arXiv:1911.08967  [pdf, ps, other]

    cs.LG stat.ML

    Transfer Learning Toolkit: Primers and Benchmarks

    Authors: Fuzhen Zhuang, Keyu Duan, Tongjia Guo, Yongchun Zhu, Dongbo Xi, Zhiyuan Qi, Qing He

    Abstract: The transfer learning toolkit wraps the code of 17 transfer learning models and provides integrated interfaces, allowing users to use those models by calling a simple function. It is easy for primary researchers to use this toolkit and to choose proper models for real-world applications. The toolkit is written in Python and distributed under the MIT open source license. In this paper, the current sta…

    Submitted 20 November, 2019; originally announced November 2019.

    Comments: A Transfer Learning Toolkit

  38. arXiv:1911.02685  [pdf, ps, other]

    cs.LG stat.ML

    A Comprehensive Survey on Transfer Learning

    Authors: Fuzhen Zhuang, Zhiyuan Qi, Keyu Duan, Dongbo Xi, Yongchun Zhu, Hengshu Zhu, Hui Xiong, Qing He

    Abstract: Transfer learning aims at improving the performance of target learners on target domains by transferring the knowledge contained in different but related source domains. In this way, the dependence on a large amount of target-domain data can be reduced for constructing target learners. Due to the wide application prospects, transfer learning has become a popular and promising area in machine learn…

    Submitted 23 June, 2020; v1 submitted 6 November, 2019; originally announced November 2019.

    Comments: 31 pages, 7 figures

  39. arXiv:1909.02180  [pdf, other]

    cs.LG stat.ML

    Learning from Label Proportions with Generative Adversarial Networks

    Authors: Jiabin Liu, Bo Wang, Zhiquan Qi, Yingjie Tian, Yong Shi

    Abstract: In this paper, we leverage generative adversarial networks (GANs) to derive an effective algorithm, LLP-GAN, for learning from label proportions (LLP), where only the bag-level proportional information in labels is available. Endowed with end-to-end structure, LLP-GAN performs approximation in the light of an adversarial learning mechanism, without imposing restrictive assumptions on distribution. Ac…

    Submitted 2 December, 2019; v1 submitted 4 September, 2019; originally announced September 2019.

    Comments: Accepted as a conference paper at NeurIPS 2019

  40. arXiv:1908.10742  [pdf, ps, other]

    math.OC cs.LG stat.ME stat.ML

    Estimation of Individualized Decision Rules Based on an Optimized Covariate-Dependent Equivalent of Random Outcomes

    Authors: Zhengling Qi, Ying Cui, Yufeng Liu, Jong-Shi Pang

    Abstract: Recent exploration of optimal individualized decision rules (IDRs) for patients in precision medicine has attracted a lot of attention due to the heterogeneous responses of patients to different treatments. In the existing literature of precision medicine, an optimal IDR is defined as a decision function mapping from the patients' covariate space into the treatment space that maximizes the expecte…

    Submitted 27 August, 2019; originally announced August 2019.

  41. arXiv:1903.04367  [pdf, other]

    stat.ME

    On Robustness of Individualized Decision Rules

    Authors: Zhengling Qi, Jong-Shi Pang, Yufeng Liu

    Abstract: With the emergence of precision medicine, estimating optimal individualized decision rules (IDRs) has attracted tremendous attention in many scientific areas. Most existing literature has focused on finding optimal IDRs that can maximize the expected outcome for each individual. Motivated by complex individualized decision making procedures and the popular conditional value at risk (CVaR) measure,…

    Submitted 26 June, 2022; v1 submitted 11 March, 2019; originally announced March 2019.

  42. arXiv:1812.07150  [pdf, other]

    cs.LG cs.CV stat.ML

    Interactive Naming for Explaining Deep Neural Networks: A Formative Study

    Authors: Mandana Hamidi-Haines, Zhongang Qi, Alan Fern, Fuxin Li, Prasad Tadepalli

    Abstract: We consider the problem of explaining the decisions of deep neural networks for image recognition in terms of human-recognizable visual concepts. In particular, given a test set of images, we aim to explain each classification in terms of a small number of image regions, or activation maps, which have been associated with semantic concepts by a human annotator. This allows for generating summary v…

    Submitted 20 December, 2018; v1 submitted 17 December, 2018; originally announced December 2018.

  43. arXiv:1803.06071  [pdf, other]

    cs.DB cs.LG stat.ML

    Impacts of Dirty Data: and Experimental Evaluation

    Authors: Zhixin Qi, Hongzhi Wang, Jianzhong Li, Hong Gao

    Abstract: Data quality issues have attracted widespread attention due to the negative impacts of dirty data on data mining and machine learning results. The relationship between data quality and the accuracy of results could be applied to the selection of an appropriate algorithm with the consideration of data quality and the determination of the data share to clean. However, little research has focused on e…

    Submitted 26 April, 2021; v1 submitted 16 March, 2018; originally announced March 2018.

    Comments: 22 pages, 192 figures