Search | arXiv e-print repository

arXiv:2505.19923 [pdf, ps, other]

Learning to Trust Bellman Updates: Selective State-Adaptive Regularization for Offline RL

Authors: Qin-Wen Luo, Ming-Kun Xie, Ye-Wen Wang, Sheng-Jun Huang

Abstract: Offline reinforcement learning (RL) aims to learn an effective policy from a static dataset. To alleviate extrapolation errors, existing studies often uniformly regularize the value function or policy updates across all states. However, due to substantial variations in data quality, the fixed regularization strength often leads to a dilemma: Weak regularization strength fails to address extrapolat… ▽ More Offline reinforcement learning (RL) aims to learn an effective policy from a static dataset. To alleviate extrapolation errors, existing studies often uniformly regularize the value function or policy updates across all states. However, due to substantial variations in data quality, the fixed regularization strength often leads to a dilemma: Weak regularization strength fails to address extrapolation errors and value overestimation, while strong regularization strength shifts policy learning toward behavior cloning, impeding potential performance enabled by Bellman updates. To address this issue, we propose the selective state-adaptive regularization method for offline RL. Specifically, we introduce state-adaptive regularization coefficients to trust state-level Bellman-driven results, while selectively applying regularization on high-quality actions, aiming to avoid performance degradation caused by tight constraints on low-quality actions. By establishing a connection between the representative value regularization method, CQL, and explicit policy constraint methods, we effectively extend selective state-adaptive regularization to these two mainstream offline RL approaches. Extensive experiments demonstrate that the proposed method significantly outperforms the state-of-the-art approaches in both offline and offline-to-online settings on the D4RL benchmark. △ Less

Submitted 26 May, 2025; originally announced May 2025.

Comments: Accepted to ICML 2025

arXiv:2404.09353 [pdf, other]

A Unified Combination Framework for Dependent Tests with Applications to Microbiome Association Studies

Authors: Xiufan Yu, Linjun Zhang, Arun Srinivasan, Min-ge Xie, Lingzhou Xue

Abstract: We introduce a novel meta-analysis framework to combine dependent tests under a general setting, and utilize it to synthesize various microbiome association tests that are calculated from the same dataset. Our development builds upon the classical meta-analysis methods of aggregating $p$-values and also a more recent general method of combining confidence distributions, but makes generalizations t… ▽ More We introduce a novel meta-analysis framework to combine dependent tests under a general setting, and utilize it to synthesize various microbiome association tests that are calculated from the same dataset. Our development builds upon the classical meta-analysis methods of aggregating $p$-values and also a more recent general method of combining confidence distributions, but makes generalizations to handle dependent tests. The proposed framework ensures rigorous statistical guarantees, and we provide a comprehensive study and compare it with various existing dependent combination methods. Notably, we demonstrate that the widely used Cauchy combination method for dependent tests, referred to as the vanilla Cauchy combination in this article, can be viewed as a special case within our framework. Moreover, the proposed framework provides a way to address the problem when the distributional assumptions underlying the vanilla Cauchy combination are violated. Our numerical results demonstrate that ignoring the dependence among the to-be-combined components may lead to a severe size distortion phenomenon. Compared to the existing $p$-value combination methods, including the vanilla Cauchy combination method, the proposed combination framework can handle the dependence accurately and utilizes the information efficiently to construct tests with accurate size and enhanced power. The development is applied to Microbiome Association Studies, where we aggregate information from multiple existing tests using the same dataset. The combined tests harness the strengths of each individual test across a wide range of alternative spaces, %resulting in a significant enhancement of testing power across a wide range of alternative spaces, enabling more efficient and meaningful discoveries of vital microbiome associations. △ Less

Submitted 14 April, 2024; originally announced April 2024.

arXiv:2403.09984 [pdf, ps, other]

Repro Samples Method for High-dimensional Logistic Model

Authors: Xiaotian Hou, Linjun Zhang, Peng Wang, Min-ge Xie

Abstract: This paper presents a novel method to make statistical inferences for both the model support and regression coefficients in a high-dimensional logistic regression model. Our method is based on the repro samples framework, in which we conduct statistical inference by generating artificial samples mimicking the actual data-generating process. The proposed method has two major advantages. Firstly, fo… ▽ More This paper presents a novel method to make statistical inferences for both the model support and regression coefficients in a high-dimensional logistic regression model. Our method is based on the repro samples framework, in which we conduct statistical inference by generating artificial samples mimicking the actual data-generating process. The proposed method has two major advantages. Firstly, for model support, we introduce the first method for constructing model confidence set in a high-dimensional setting and the proposed method only requires a weak signal strength assumption. Secondly, in terms of regression coefficients, we establish confidence sets for any group of linear combinations of regression coefficients. Our simulation results demonstrate that the proposed method produces valid and small model confidence sets and achieves better coverage for regression coefficients than the state-of-the-art debiasing methods. Additionally, we analyze single-cell RNA-seq data on the immune response. Besides identifying genes previously proved as relevant in the literature, our method also discovers a significant gene that has not been studied before, revealing a potential new direction in understanding cellular immune response mechanisms. △ Less

Submitted 14 March, 2024; originally announced March 2024.

arXiv:2402.15004 [pdf, other]

Repro Samples Method for a Performance Guaranteed Inference in General and Irregular Inference Problems

Authors: Minge Xie, Peng Wang

Abstract: Rapid advancements in data science require us to have fundamentally new frameworks to tackle prevalent but highly non-trivial "irregular" inference problems, to which the large sample central limit theorem does not apply. Typical examples are those involving discrete or non-numerical parameters and those involving non-numerical data, etc. In this article, we present an innovative, wide-reaching, a… ▽ More Rapid advancements in data science require us to have fundamentally new frameworks to tackle prevalent but highly non-trivial "irregular" inference problems, to which the large sample central limit theorem does not apply. Typical examples are those involving discrete or non-numerical parameters and those involving non-numerical data, etc. In this article, we present an innovative, wide-reaching, and effective approach, called "repro samples method," to conduct statistical inference for these irregular problems plus more. The development relates to but improves several existing simulation-inspired inference approaches, and we provide both exact and approximate theories to support our development. Moreover, the proposed approach is broadly applicable and subsumes the classical Neyman-Pearson framework as a special case. For the often-seen irregular inference problems that involve both discrete/non-numerical and continuous parameters, we propose an effective three-step procedure to make inferences for all parameters. We also develop a unique matching scheme that turns the discreteness of discrete/non-numerical parameters from an obstacle for forming inferential theories into a beneficial attribute for improving computational efficiency. We demonstrate the effectiveness of the proposed general methodology using various examples, including a case study example on a Gaussian mixture model with unknown number of components. This case study example provides a solution to a long-standing open inference question in statistics on how to quantify the estimation uncertainty for the unknown number of components and other associated parameters. Real data and simulation studies, with comparisons to existing approaches, demonstrate the far superior performance of the proposed method. △ Less

Submitted 22 February, 2024; originally announced February 2024.

arXiv:2310.13178 [pdf, other]

Exact Inference for Common Odds Ratio in Meta-Analysis with Zero-Total-Event Studies

Authors: Xiaolin Chen, Jerry Q Cheng, Lu Tian, Minge Xie

Abstract: Stemming from the high profile publication of Nissen and Wolski (2007) and subsequent discussions with divergent views on how to handle observed zero-total-event studies, defined to be studies which observe zero events in both treatment and control arms, the research topic concerning the common odds ratio model with zero-total-event studies remains to be an unresolved problem in meta-analysis. In… ▽ More Stemming from the high profile publication of Nissen and Wolski (2007) and subsequent discussions with divergent views on how to handle observed zero-total-event studies, defined to be studies which observe zero events in both treatment and control arms, the research topic concerning the common odds ratio model with zero-total-event studies remains to be an unresolved problem in meta-analysis. In this article, we address this problem by proposing a novel repro samples method to handle zero-total-event studies and make inference for the parameter of common odds ratio. The development explicitly accounts for sampling scheme and does not rely on large sample approximation. It is theoretically justified with a guaranteed finite sample performance. The empirical performance of the proposed method is demonstrated through simulation studies. It shows that the proposed confidence set achieves the desired empirical coverage rate and also that the zero-total-event studies contains information and impacts the inference for the common odds ratio. The proposed method is applied to combine information in the Nissen and Wolski study. △ Less

Submitted 19 October, 2023; originally announced October 2023.

arXiv:2301.12674 [pdf, other]

A Simulation Study of the Performance of Statistical Models for Count Outcomes with Excessive Zeros

Authors: Zhengyang Zhou, Dateng Li, David Huh, Minge Xie, Eun-Young Mun

Abstract: Background: Outcome measures that are count variables with excessive zeros are common in health behaviors research. There is a lack of empirical data about the relative performance of prevailing statistical models when outcomes are zero-inflated, particularly compared with recently developed approaches. Methods: The current simulation study examined five commonly used analytical approaches for c… ▽ More Background: Outcome measures that are count variables with excessive zeros are common in health behaviors research. There is a lack of empirical data about the relative performance of prevailing statistical models when outcomes are zero-inflated, particularly compared with recently developed approaches. Methods: The current simulation study examined five commonly used analytical approaches for count outcomes, including two linear models (with outcomes on raw and log-transformed scales, respectively) and three count distribution-based models (i.e., Poisson, negative binomial, and zero-inflated Poisson (ZIP) models). We also considered the marginalized zero-inflated Poisson (MZIP) model, a novel alternative that estimates the effects on overall mean while adjusting for zero-inflation. Extensive simulations were conducted to evaluate their the statistical power and Type I error rate across various data conditions. Results: Under zero-inflation, the Poisson model failed to control the Type I error rate, resulting in higher than expected false positive results. When the intervention effects on the zero (vs. non-zero) and count parts were in the same direction, the MZIP model had the highest statistical power, followed by the linear model with outcomes on raw scale, negative binomial model, and ZIP model. The performance of a linear model with a log-transformed outcome variable was unsatisfactory. When only one of the effects on the zero (vs. non-zero) part and the count part existed, the ZIP model had the highest statistical power. Conclusions: The MZIP model demonstrated better statistical properties in detecting true intervention effects and controlling false positive results for zero-inflated count outcomes. This MZIP model may serve as an appealing analytical approach to evaluating overall intervention effects in studies with count outcomes marked by excessive zeros. △ Less

Submitted 15 August, 2023; v1 submitted 30 January, 2023; originally announced January 2023.

arXiv:2301.00477 [pdf, other]

A Sequential Quadratic Programming Method with High Probability Complexity Bounds for Nonlinear Equality Constrained Stochastic Optimization

Authors: Albert S. Berahas, Miaolan Xie, Baoyu Zhou

Abstract: A step-search sequential quadratic programming method is proposed for solving nonlinear equality constrained stochastic optimization problems. It is assumed that constraint function values and derivatives are available, but only stochastic approximations of the objective function and its associated derivatives can be computed via inexact probabilistic zeroth- and first-order oracles. Under reasona… ▽ More A step-search sequential quadratic programming method is proposed for solving nonlinear equality constrained stochastic optimization problems. It is assumed that constraint function values and derivatives are available, but only stochastic approximations of the objective function and its associated derivatives can be computed via inexact probabilistic zeroth- and first-order oracles. Under reasonable assumptions, a high-probability bound on the iteration complexity of the algorithm to approximate first-order stationarity is derived. Numerical results on standard nonlinear optimization test problems illustrate the advantages and limitations of our proposed method. △ Less

Submitted 5 October, 2024; v1 submitted 1 January, 2023; originally announced January 2023.

Comments: 29 pages, 2 figures

arXiv:2209.09299 [pdf, other]

Finite- and Large- Sample Inference for Model and Coefficients in High-dimensional Linear Regression with Repro Samples

Authors: Peng Wang, Min-Ge Xie, Linjun Zhang

Abstract: In this paper, we present a new and effective simulation-based approach to conduct both finite- and large-sample inference for high-dimensional linear regression models. This approach is developed under the so-called repro samples framework, in which we conduct statistical inference by creating and studying the behavior of artificial samples that are obtained by mimicking the sampling mechanism of… ▽ More In this paper, we present a new and effective simulation-based approach to conduct both finite- and large-sample inference for high-dimensional linear regression models. This approach is developed under the so-called repro samples framework, in which we conduct statistical inference by creating and studying the behavior of artificial samples that are obtained by mimicking the sampling mechanism of the data. We obtain confidence sets for (a) the true model corresponding to the nonzero coefficients, (b) a single or any collection of regression coefficients, and (c) both the model and regression coefficients jointly. We also extend our approaches to drawing inferences on functions of the regression coefficients. The proposed approach fills in two major gaps in the high-dimensional regression literature: (1) lack of effective approaches to address model selection uncertainty and provide valid inference for the underlying true model; (2) lack of effective inference approaches that guarantee finite-sample performances. We provide both finite-sample and asymptotic results to theoretically guarantee the performances of the proposed methods. In addition, our numerical results demonstrate that the proposed methods are valid and achieve better coverage with smaller confidence sets than the existing state-of-art approaches, such as debiasing and bootstrap approaches. △ Less

Submitted 9 December, 2022; v1 submitted 19 September, 2022; originally announced September 2022.

arXiv:2208.02521 [pdf]

A class of Šidák-type tests based on maximal precedence and exceedance statistic

Authors: Niladri Chakrabortya, Di Cui, Min Xie

Abstract: A class of nonparametric two-sample tests has been proposed in this article. As a generalization of the original Šidáks' test, the proposed test statistic is developed as the sum of the maximal precedence and maximal exceedance statistics. Unlike the Šidák-type precedence-exceedance test and the maximal precedence test, the proposed test is suitable for a two-sided alternative while being free fro… ▽ More A class of nonparametric two-sample tests has been proposed in this article. As a generalization of the original Šidáks' test, the proposed test statistic is developed as the sum of the maximal precedence and maximal exceedance statistics. Unlike the Šidák-type precedence-exceedance test and the maximal precedence test, the proposed test is suitable for a two-sided alternative while being free from any parametric assumption. Exact distribution of the test statistic is obtained under the null as well as under the Lehmann alternative. Power value comparison has been carried out that shows the competency of the proposed test as a useful alternative to a number of existing tests based on precedence-exceedance statistics. Real-life example is provided to illustrate the application of the proposed test. △ Less

Submitted 4 August, 2022; originally announced August 2022.

arXiv:2207.07988 [pdf, ps, other]

Inference of high quantiles of a heavy-tailed distribution from block data

Authors: Yongcheng Qi, Mengzi Xie, Jingping Yang

Abstract: In this paper we consider the estimation problem for high quantiles of a heavy-tailed distribution from block data when only a few largest values are observed within blocks. We propose estimators for high quantiles and prove that these estimators are asymptotically normal. Furthermore, we employ empirical likelihood method and adjusted empirical likelihood method to constructing the confidence int… ▽ More In this paper we consider the estimation problem for high quantiles of a heavy-tailed distribution from block data when only a few largest values are observed within blocks. We propose estimators for high quantiles and prove that these estimators are asymptotically normal. Furthermore, we employ empirical likelihood method and adjusted empirical likelihood method to constructing the confidence intervals of high quantiles. Through a simulation study we also compare the performance of the normal approximation method and the adjusted empirical likelihood methods in terms of the coverage probability and length of the confidence intervals. △ Less

Submitted 24 June, 2023; v1 submitted 16 July, 2022; originally announced July 2022.

Comments: 28 pages

arXiv:2207.03935 [pdf, other]

ControlBurn: Nonlinear Feature Selection with Sparse Tree Ensembles

Authors: Brian Liu, Miaolan Xie, Haoyue Yang, Madeleine Udell

Abstract: ControlBurn is a Python package to construct feature-sparse tree ensembles that support nonlinear feature selection and interpretable machine learning. The algorithms in this package first build large tree ensembles that prioritize basis functions with few features and then select a feature-sparse subset of these basis functions using a weighted lasso optimization criterion. The package includes v… ▽ More ControlBurn is a Python package to construct feature-sparse tree ensembles that support nonlinear feature selection and interpretable machine learning. The algorithms in this package first build large tree ensembles that prioritize basis functions with few features and then select a feature-sparse subset of these basis functions using a weighted lasso optimization criterion. The package includes visualizations to analyze the features selected by the ensemble and their impact on predictions. Hence ControlBurn offers the accuracy and flexibility of tree-ensemble models and the interpretability of sparse generalized additive models. ControlBurn is scalable and flexible: for example, it can use warm-start continuation to compute the regularization path (prediction error for any number of selected features) for a dataset with tens of thousands of samples and hundreds of features in seconds. For larger datasets, the runtime scales linearly in the number of samples and features (up to a log factor), and the package support acceleration using sketching. Moreover, the ControlBurn framework accommodates feature costs, feature groupings, and $\ell_0$-based regularizers. The package is user-friendly and open-source: its documentation and source code appear on https://pypi.org/project/ControlBurn/ and https://github.com/udellgroup/controlburn/. △ Less

Submitted 8 July, 2022; originally announced July 2022.

Comments: 22 pages

arXiv:2206.06421 [pdf, other]

Repro Samples Method for Finite- and Large-Sample Inferences

Authors: Min-ge Xie, Peng Wang

Abstract: This article presents a novel, general, and effective simulation-inspired approach, called {\it repro samples method}, to conduct statistical inference. The approach studies the performance of artificial samples, referred to as {\it repro samples}, obtained by mimicking the true observed sample to achieve uncertainty quantification and construct confidence sets for parameters of interest with guar… ▽ More This article presents a novel, general, and effective simulation-inspired approach, called {\it repro samples method}, to conduct statistical inference. The approach studies the performance of artificial samples, referred to as {\it repro samples}, obtained by mimicking the true observed sample to achieve uncertainty quantification and construct confidence sets for parameters of interest with guaranteed coverage rates. Both exact and asymptotic inferences are developed. An attractive feature of the general framework developed is that it does not rely on the large sample central limit theorem and is likelihood-free. As such, it is thus effective for complicated inference problems which we can not solve using the large sample central limit theorem. The proposed method is applicable to a wide range of problems, including many open questions where solutions were previously unavailable, for example, those involving discrete or non-numerical parameters. To reduce the large computational cost of such inference problems, we develop a unique matching scheme to obtain a data-driven candidate set. Moreover, we show the advantages of the proposed framework over the classical Neyman-Pearson framework. We demonstrate the effectiveness of the proposed approach on various models throughout the paper and provide a case study that addresses an open inference question on how to quantify the uncertainty for the unknown number of components in a normal mixture model. To evaluate the empirical performance of our repro samples method, we conduct simulations and study real data examples with comparisons to existing approaches. Although the development pertains to the settings where the large sample central limit theorem does not apply, it also has direct extensions to the cases where the central limit theorem does hold. △ Less

Submitted 13 June, 2022; originally announced June 2022.

MSC Class: 62A99; 62F99; 62G99

arXiv:2206.01707 [pdf, other]

Approximate confidence distribution computing

Authors: Suzanne Thornton, Wentao Li, Minge Xie

Abstract: Approximate confidence distribution computing (ACDC) offers a new take on the rapidly developing field of likelihood-free inference from within a frequentist framework. The appeal of this computational method for statistical inference hinges upon the concept of a confidence distribution, a special type of estimator which is defined with respect to the repeated sampling principle. An ACDC method pr… ▽ More Approximate confidence distribution computing (ACDC) offers a new take on the rapidly developing field of likelihood-free inference from within a frequentist framework. The appeal of this computational method for statistical inference hinges upon the concept of a confidence distribution, a special type of estimator which is defined with respect to the repeated sampling principle. An ACDC method provides frequentist validation for computational inference in problems with unknown or intractable likelihoods. The main theoretical contribution of this work is the identification of a matching condition necessary for frequentist validity of inference from this method. In addition to providing an example of how a modern understanding of confidence distribution theory can be used to connect Bayesian and frequentist inferential paradigms, we present a case to expand the current scope of so-called approximate Bayesian inference to include non-Bayesian inference by targeting a confidence distribution rather than a posterior. The main practical contribution of this work is the development of a data-driven approach to drive ACDC in both Bayesian or frequentist contexts. The ACDC algorithm is data-driven by the selection of a data-dependent proposal function, the structure of which is quite general and adaptable to many settings. We explore two numerical examples that both verify the theoretical arguments in the development of ACDC and suggest instances in which ACDC outperform approximate Bayesian computing methods computationally. △ Less

Submitted 12 October, 2022; v1 submitted 3 June, 2022; originally announced June 2022.

Comments: Supplementary material available upon request

arXiv:2204.12724 [pdf, other]

Semiparametric transformation Model with measurement error in Covariates: An Instrumental variable approach

Authors: Sudheesh K. K., Deemat C. Mathew, Litty Mathew, Min Xie

Abstract: Linear transformation model provides a general framework for analyzing censored survival data with covariates. The proportional hazards and proportional odds models are special cases of the linear transformation model. In biomedical studies, covariates with measurement error may occur in survival data. In this work, we propose a method to obtain estimators of the regression coefficients in the lin… ▽ More Linear transformation model provides a general framework for analyzing censored survival data with covariates. The proportional hazards and proportional odds models are special cases of the linear transformation model. In biomedical studies, covariates with measurement error may occur in survival data. In this work, we propose a method to obtain estimators of the regression coefficients in the linear transformation model when the covariates are subject to measurement error. In the proposed method, we assume that instrumental variables are available. We develop counting process based estimating equations for finding the estimators of regression coefficients. We prove the large sample properties of the estimators using the martingale representation of the regression estimators. The finite sample performance of the estimators are evaluated through an extensive Monte Carlo simulation study. Finally, we illustrate the proposed method using an AIDS clinical trial (ACTG 175) data. △ Less

Submitted 9 May, 2022; v1 submitted 27 April, 2022; originally announced April 2022.

Comments: We proposed an instrumental variable approach to analyze the linear transformation model when covariates are measured with error

arXiv:2203.16330 [pdf]

Cointegration of SARS-CoV-2 Transmission with Weather Conditions and Mobility during the First Year of the COVID-19 Pandemic in the United States

Authors: Hong Qin, Syed Tareq, William Torres, Megan Doman, Cleo Falvey, Jamaree Moore, Meng Hsiu Tsai, Yingfeng Wang, Azad Hossain, Mengjun Xie, Li Yang

Abstract: Correlation between weather and the transmission of SARS-CoV-2 may suggest its seasonality. Cointegration analysis can avoid spurious correlation among time series data. We examined the cointegration of virus transmission with daily temperature, dewpoint, and confounding factors of mobility measurements during the first year of the pandemic in the United States. We examined the cointegration of th… ▽ More Correlation between weather and the transmission of SARS-CoV-2 may suggest its seasonality. Cointegration analysis can avoid spurious correlation among time series data. We examined the cointegration of virus transmission with daily temperature, dewpoint, and confounding factors of mobility measurements during the first year of the pandemic in the United States. We examined the cointegration of the effective reproductive rate, Rt, of the virus with the dewpoint at two meters, the temperature at two meters, Apple driving mobility, and Google workplace mobility measurements. We found that dewpoint and Apple driving mobility are the best factors to cointegrate with Rt, although temperature and Google workplace mobility also cointegrate with Rt at substantial levels. We found that the optimal lag is two days for cointegration between Rt and weather variables, and three days for Rt and mobility. We observed clusters of states that share similar cointegration results of Rt, weather, and mobility, suggesting regional patterns. Our results support the correlation of weather with the spread of SARS-CoV-2 and its potential seasonality. △ Less

Submitted 30 March, 2022; v1 submitted 28 March, 2022; originally announced March 2022.

Comments: 6 pages, 3 figures

arXiv:2112.05090 [pdf, other]

Extending the WILDS Benchmark for Unsupervised Adaptation

Authors: Shiori Sagawa, Pang Wei Koh, Tony Lee, Irena Gao, Sang Michael Xie, Kendrick Shen, Ananya Kumar, Weihua Hu, Michihiro Yasunaga, Henrik Marklund, Sara Beery, Etienne David, Ian Stavness, Wei Guo, Jure Leskovec, Kate Saenko, Tatsunori Hashimoto, Sergey Levine, Chelsea Finn, Percy Liang

Abstract: Machine learning systems deployed in the wild are often trained on a source distribution but deployed on a different target distribution. Unlabeled data can be a powerful point of leverage for mitigating these distribution shifts, as it is frequently much more available than labeled data and can often be obtained from distributions beyond the source distribution as well. However, existing distribu… ▽ More Machine learning systems deployed in the wild are often trained on a source distribution but deployed on a different target distribution. Unlabeled data can be a powerful point of leverage for mitigating these distribution shifts, as it is frequently much more available than labeled data and can often be obtained from distributions beyond the source distribution as well. However, existing distribution shift benchmarks with unlabeled data do not reflect the breadth of scenarios that arise in real-world applications. In this work, we present the WILDS 2.0 update, which extends 8 of the 10 datasets in the WILDS benchmark of distribution shifts to include curated unlabeled data that would be realistically obtainable in deployment. These datasets span a wide range of applications (from histology to wildlife conservation), tasks (classification, regression, and detection), and modalities (photos, satellite images, microscope slides, text, molecular graphs). The update maintains consistency with the original WILDS benchmark by using identical labeled training, validation, and test sets, as well as the evaluation metrics. On these datasets, we systematically benchmark state-of-the-art methods that leverage unlabeled data, including domain-invariant, self-training, and self-supervised methods, and show that their success on WILDS is limited. To facilitate method development and evaluation, we provide an open-source package that automates data loading and contains all of the model architectures and methods used in this paper. Code and leaderboards are available at https://wilds.stanford.edu. △ Less

Submitted 23 April, 2022; v1 submitted 9 December, 2021; originally announced December 2021.

arXiv:2111.13302 [pdf, ps, other]

doi 10.1103/PhysRevResearch.4.023023

Equivalence between algorithmic instability and transition to replica symmetry breaking in perceptron learning systems

Authors: Yang Zhao, Junbin Qiu, Mingshan Xie, Haiping Huang

Abstract: Binary perceptron is a fundamental model of supervised learning for the non-convex optimization, which is a root of the popular deep learning. Binary perceptron is able to achieve a classification of random high-dimensional data by computing the marginal probabilities of binary synapses. The relationship between the algorithmic instability and the equilibrium analysis of the model remains elusive.… ▽ More Binary perceptron is a fundamental model of supervised learning for the non-convex optimization, which is a root of the popular deep learning. Binary perceptron is able to achieve a classification of random high-dimensional data by computing the marginal probabilities of binary synapses. The relationship between the algorithmic instability and the equilibrium analysis of the model remains elusive. Here, we establish the relationship by showing that the instability condition around the algorithmic fixed point is identical to the instability for breaking the replica symmetric saddle point solution of the free energy function. Therefore, our analysis would hopefully provide insights towards other learning systems in bridging the gap between non-convex learning dynamics and statistical mechanics properties of more complex neural networks. △ Less

Submitted 7 March, 2022; v1 submitted 25 November, 2021; originally announced November 2021.

Comments: 24 pages, 2 figures, revision to journal

Journal ref: Phys. Rev. Research 4, 023023 (2022)

arXiv:2109.01898 [pdf, other]

Confidence Distribution and Distribution Estimation for Modern Statistical Inference

Authors: Yifan Cui, Min-ge Xie

Abstract: This paper introduces to readers the new concept and methodology of confidence distribution and the modern-day distributional inference in statistics. This discussion should be of interest to people who would like to go into the depth of the statistical inference methodology and to utilize distribution estimators in practice. We also include in the discussion the topic of generalized fiducial infe… ▽ More This paper introduces to readers the new concept and methodology of confidence distribution and the modern-day distributional inference in statistics. This discussion should be of interest to people who would like to go into the depth of the statistical inference methodology and to utilize distribution estimators in practice. We also include in the discussion the topic of generalized fiducial inference, a special type of modern distributional inference, and relate it to the concept of confidence distribution. Several real data examples are also provided for practitioners. We hope that the selected content covers the greater part of the developments on this subject. △ Less

Submitted 4 September, 2021; originally announced September 2021.

Comments: To appear as a chapter in Springer Handbook of Engineering Statistics, 2nd ed

arXiv:2107.00219 [pdf, other]

doi 10.1145/3447548.3467387

ControlBurn: Feature Selection by Sparse Forests

Authors: Brian Liu, Miaolan Xie, Madeleine Udell

Abstract: Tree ensembles distribute feature importance evenly amongst groups of correlated features. The average feature ranking of the correlated group is suppressed, which reduces interpretability and complicates feature selection. In this paper we present ControlBurn, a feature selection algorithm that uses a weighted LASSO-based feature selection method to prune unnecessary features from tree ensembles,… ▽ More Tree ensembles distribute feature importance evenly amongst groups of correlated features. The average feature ranking of the correlated group is suppressed, which reduces interpretability and complicates feature selection. In this paper we present ControlBurn, a feature selection algorithm that uses a weighted LASSO-based feature selection method to prune unnecessary features from tree ensembles, just as low-intensity fire reduces overgrown vegetation. Like the linear LASSO, ControlBurn assigns all the feature importance of a correlated group of features to a single feature. Moreover, the algorithm is efficient and only requires a single training iteration to run, unlike iterative wrapper-based feature selection methods. We show that ControlBurn performs substantially better than feature selection methods with comparable computational costs on datasets with correlated features. △ Less

Submitted 1 July, 2021; originally announced July 2021.

Comments: 15 pages

arXiv:2106.09226 [pdf, other]

Why Do Pretrained Language Models Help in Downstream Tasks? An Analysis of Head and Prompt Tuning

Authors: Colin Wei, Sang Michael Xie, Tengyu Ma

Abstract: Pretrained language models have achieved state-of-the-art performance when adapted to a downstream NLP task. However, theoretical analysis of these models is scarce and challenging since the pretraining and downstream tasks can be very different. We propose an analysis framework that links the pretraining and downstream tasks with an underlying latent variable generative model of text -- the downs… ▽ More Pretrained language models have achieved state-of-the-art performance when adapted to a downstream NLP task. However, theoretical analysis of these models is scarce and challenging since the pretraining and downstream tasks can be very different. We propose an analysis framework that links the pretraining and downstream tasks with an underlying latent variable generative model of text -- the downstream classifier must recover a function of the posterior distribution over the latent variables. We analyze head tuning (learning a classifier on top of the frozen pretrained model) and prompt tuning in this setting. The generative model in our analysis is either a Hidden Markov Model (HMM) or an HMM augmented with a latent memory component, motivated by long-term dependencies in natural language. We show that 1) under certain non-degeneracy conditions on the HMM, simple classification heads can solve the downstream task, 2) prompt tuning obtains downstream guarantees with weaker non-degeneracy conditions, and 3) our recovery guarantees for the memory-augmented HMM are stronger than for the vanilla HMM because task-relevant information is easier to recover from the long-term memory. Experiments on synthetically generated data from HMMs back our theoretical findings. △ Less

Submitted 20 April, 2022; v1 submitted 16 June, 2021; originally announced June 2021.

arXiv:2105.07338 [pdf, ps, other]

CCMN: A General Framework for Learning with Class-Conditional Multi-Label Noise

Authors: Ming-Kun Xie, Sheng-Jun Huang

Abstract: Class-conditional noise commonly exists in machine learning tasks, where the class label is corrupted with a probability depending on its ground-truth. Many research efforts have been made to improve the model robustness against the class-conditional noise. However, they typically focus on the single label case by assuming that only one label is corrupted. In real applications, an instance is usua… ▽ More Class-conditional noise commonly exists in machine learning tasks, where the class label is corrupted with a probability depending on its ground-truth. Many research efforts have been made to improve the model robustness against the class-conditional noise. However, they typically focus on the single label case by assuming that only one label is corrupted. In real applications, an instance is usually associated with multiple labels, which could be corrupted simultaneously with their respective conditional probabilities. In this paper, we formalize this problem as a general framework of learning with Class-Conditional Multi-label Noise (CCMN for short). We establish two unbiased estimators with error bounds for solving the CCMN problems, and further prove that they are consistent with commonly used multi-label loss functions. Finally, a new method for partial multi-label learning is implemented with unbiased estimator under the CCMN framework. Empirical studies on multiple datasets and various evaluation metrics validate the effectiveness of the proposed method. △ Less

Submitted 15 May, 2021; originally announced May 2021.

Comments: 18 pages

arXiv:2102.10771 [pdf, ps, other]

Divide-and-conquer methods for big data analysis

Authors: Xueying Chen, Jerry Q. Cheng, Min-ge Xie

Abstract: In the context of big data analysis, the divide-and-conquer methodology refers to a multiple-step process: first splitting a data set into several smaller ones; then analyzing each set separately; finally combining results from each analysis together. This approach is effective in handling large data sets that are unsuitable to be analyzed entirely by a single computer due to limits either from me… ▽ More In the context of big data analysis, the divide-and-conquer methodology refers to a multiple-step process: first splitting a data set into several smaller ones; then analyzing each set separately; finally combining results from each analysis together. This approach is effective in handling large data sets that are unsuitable to be analyzed entirely by a single computer due to limits either from memory storage or computational time. The combined results will provide a statistical inference which is similar to the one from analyzing the entire data set. This article reviews some recently developments of divide-and-conquer methods in a variety of settings, including combining based on parametric, semiparametric and nonparametric models, online sequential updating methods, among others. Theoretical development on the efficiency of the divide-and-conquer methods is also discussed. △ Less

Submitted 21 February, 2021; originally announced February 2021.

arXiv:2012.04550 [pdf, other]

In-N-Out: Pre-Training and Self-Training using Auxiliary Information for Out-of-Distribution Robustness

Authors: Sang Michael Xie, Ananya Kumar, Robbie Jones, Fereshte Khani, Tengyu Ma, Percy Liang

Abstract: Consider a prediction setting with few in-distribution labeled examples and many unlabeled examples both in- and out-of-distribution (OOD). The goal is to learn a model which performs well both in-distribution and OOD. In these settings, auxiliary information is often cheaply available for every input. How should we best leverage this auxiliary information for the prediction task? Empirically acro… ▽ More Consider a prediction setting with few in-distribution labeled examples and many unlabeled examples both in- and out-of-distribution (OOD). The goal is to learn a model which performs well both in-distribution and OOD. In these settings, auxiliary information is often cheaply available for every input. How should we best leverage this auxiliary information for the prediction task? Empirically across three image and time-series datasets, and theoretically in a multi-task linear regression setting, we show that (i) using auxiliary information as input features improves in-distribution error but can hurt OOD error; but (ii) using auxiliary information as outputs of auxiliary pre-training tasks improves OOD error. To get the best of both worlds, we introduce In-N-Out, which first trains a model with auxiliary inputs and uses it to pseudolabel all the in-distribution inputs, then pre-trains a model on OOD auxiliary outputs and fine-tunes this model with the pseudolabels (self-training). We show both theoretically and empirically that In-N-Out outperforms auxiliary inputs or outputs alone on both in-distribution and OOD error. △ Less

Submitted 7 April, 2021; v1 submitted 8 December, 2020; originally announced December 2020.

Comments: ICLR 2021

arXiv:2012.04464 [pdf, other]

Bridging Bayesian, frequentist and fiducial (BFF) inferences using confidence distribution

Authors: Suzanne Thornton, Minge Xie

Abstract: Bayesian, frequentist and fiducial (BFF) inferences are much more congruous than they have been perceived historically in the scientific community (cf., Reid and Cox 2015; Kass 2011; Efron 1998). Most practitioners are probably more familiar with the two dominant statistical inferential paradigms, Bayesian inference and frequentist inference. The third, lesser known fiducial inference paradigm was… ▽ More Bayesian, frequentist and fiducial (BFF) inferences are much more congruous than they have been perceived historically in the scientific community (cf., Reid and Cox 2015; Kass 2011; Efron 1998). Most practitioners are probably more familiar with the two dominant statistical inferential paradigms, Bayesian inference and frequentist inference. The third, lesser known fiducial inference paradigm was pioneered by R.A. Fisher in an attempt to define an inversion procedure for inference as an alternative to Bayes' theorem. Although each paradigm has its own strengths and limitations subject to their different philosophical underpinnings, this article intends to bridge these different inferential methodologies through the lenses of confidence distribution theory and Monte-Carlo simulation procedures. This article attempts to understand how these three distinct paradigms, Bayesian, frequentist, and fiducial inference, can be unified and compared on a foundational level, thereby increasing the range of possible techniques available to both statistical theorists and practitioners across all fields. △ Less

Submitted 15 June, 2022; v1 submitted 8 December, 2020; originally announced December 2020.

Comments: 30 pages, 5 figures, Handbook on Bayesian Fiducial and Frequentist (BFF) Inferences

MSC Class: 62-00 (Primary) 62A01 (Secondary)

arXiv:2011.07047 [pdf, other]

Nonparametric fusion learning: synthesize inferences from diverse sources using depth confidence distribution

Authors: Dungang Liu, Regina Y. Liu, Minge Xie

Abstract: Fusion learning refers to synthesizing inferences from multiple sources or studies to provide more effective inference and prediction than from any individual source or study alone. Most existing methods for synthesizing inferences rely on parametric model assumptions, such as normality, which often do not hold in practice. In this paper, we propose a general nonparametric fusion learning framewor… ▽ More Fusion learning refers to synthesizing inferences from multiple sources or studies to provide more effective inference and prediction than from any individual source or study alone. Most existing methods for synthesizing inferences rely on parametric model assumptions, such as normality, which often do not hold in practice. In this paper, we propose a general nonparametric fusion learning framework for synthesizing inferences of the target parameter from multiple sources. The main tool underlying the proposed framework is the notion of depth confidence distribution (depth-CD), which is also developed in this paper. Broadly speaking, a depth-CD is a data-driven nonparametric summary distribution of inferential information for the target parameter. We show that a depth-CD is a useful inferential tool and, moreover, is an omnibus form of confidence regions (or p-values), whose contours of level sets shrink toward the true parameter value. The proposed fusion learning approach combines depth-CDs from the individual studies, with each depth-CD constructed by nonparametric bootstrap and data depth. This approach is shown to be efficient, general and robust. Specifically, it achieves high-order accuracy and Bahadur efficiency under suitably chosen combining elements. It allows the model or inference structure to be different among individual studies. And it readily adapts to heterogeneous studies with a broad range of complex and irregular settings. This property enables it to utilize indirect evidence from incomplete studies to gain efficiency in the overall inference. The advantages of the proposed approach are demonstrated simulations and in a Federal Aviation Administration (FAA) study of aircraft landing performance. △ Less

Submitted 13 November, 2020; originally announced November 2020.

Comments: 47 pages, 10 figures

arXiv:2009.10265 [pdf, other]

doi 10.1002/sim.9161

A Bias Correction Method in Meta-analysis of Randomized Clinical Trials with no Adjustments for Zero-inflated Outcomes

Authors: Zhengyang Zhou, Minge Xie, David Huh, Eun-Young Mun

Abstract: Many clinical endpoint measures, such as the number of standard drinks consumed per week or the number of days that patients stayed in the hospital, are count data with excessive zeros. However, the zero-inflated nature of such outcomes is sometimes ignored in analyses of clinical trials. This leads to biased estimates of study-level intervention effect and, consequently, a biased estimate of the… ▽ More Many clinical endpoint measures, such as the number of standard drinks consumed per week or the number of days that patients stayed in the hospital, are count data with excessive zeros. However, the zero-inflated nature of such outcomes is sometimes ignored in analyses of clinical trials. This leads to biased estimates of study-level intervention effect and, consequently, a biased estimate of the overall intervention effect in a meta-analysis. The current study proposes a novel statistical approach, the Zero-inflation Bias Correction (ZIBC) method, that can account for the bias introduced when using the Poisson regression model, despite a high rate of inflated zeros in the outcome distribution of a randomized clinical trial. This correction method only requires summary information from individual studies to correct intervention effect estimates as if they were appropriately estimated using the zero-inflated Poisson regression model, thus it is attractive for meta-analysis when individual participant-level data are not available in some studies. Simulation studies and real data analyses showed that the ZIBC method performed well in correcting zero-inflation bias in most situations. △ Less

Submitted 25 June, 2021; v1 submitted 21 September, 2020; originally announced September 2020.

arXiv:2008.09148 [pdf, other]

Towards adversarial robustness with 01 loss neural networks

Authors: Yunzhe Xue, Meiyan Xie, Usman Roshan

Abstract: Motivated by the general robustness properties of the 01 loss we propose a single hidden layer 01 loss neural network trained with stochastic coordinate descent as a defense against adversarial attacks in machine learning. One measure of a model's robustness is the minimum distortion required to make the input adversarial. This can be approximated with the Boundary Attack (Brendel et. al. 2018) an… ▽ More Motivated by the general robustness properties of the 01 loss we propose a single hidden layer 01 loss neural network trained with stochastic coordinate descent as a defense against adversarial attacks in machine learning. One measure of a model's robustness is the minimum distortion required to make the input adversarial. This can be approximated with the Boundary Attack (Brendel et. al. 2018) and HopSkipJump (Chen et. al. 2019) methods. We compare the minimum distortion of the 01 loss network to the binarized neural network and the standard sigmoid activation network with cross-entropy loss all trained with and without Gaussian noise on the CIFAR10 benchmark binary classification between classes 0 and 1. Both with and without noise training we find our 01 loss network to have the largest adversarial distortion of the three models by non-trivial margins. To further validate these results we subject all models to substitute model black box attacks under different distortion thresholds and find that the 01 loss network is the hardest to attack across all distortions. At a distortion of 0.125 both sigmoid activated cross-entropy loss and binarized networks have almost 0% accuracy on adversarial examples whereas the 01 loss network is at 40%. Even though both 01 loss and the binarized network use sign activations their training algorithms are different which in turn give different solutions for robustness. Finally we compare our network to simple convolutional models under substitute model black box attacks and find their accuracies to be comparable. Our work shows that the 01 loss network has the potential to defend against black box adversarial attacks better than convex loss and binarized networks. △ Less

Submitted 20 August, 2020; originally announced August 2020.

Comments: arXiv admin note: text overlap with arXiv:2006.07800

arXiv:2006.16312 [pdf, other]

Dynamic Knapsack Optimization Towards Efficient Multi-Channel Sequential Advertising

Authors: Xiaotian Hao, Zhaoqing Peng, Yi Ma, Guan Wang, Junqi Jin, Jianye Hao, Shan Chen, Rongquan Bai, Mingzhou Xie, Miao Xu, Zhenzhe Zheng, Chuan Yu, Han Li, Jian Xu, Kun Gai

Abstract: In E-commerce, advertising is essential for merchants to reach their target users. The typical objective is to maximize the advertiser's cumulative revenue over a period of time under a budget constraint. In real applications, an advertisement (ad) usually needs to be exposed to the same user multiple times until the user finally contributes revenue (e.g., places an order). However, existing adver… ▽ More In E-commerce, advertising is essential for merchants to reach their target users. The typical objective is to maximize the advertiser's cumulative revenue over a period of time under a budget constraint. In real applications, an advertisement (ad) usually needs to be exposed to the same user multiple times until the user finally contributes revenue (e.g., places an order). However, existing advertising systems mainly focus on the immediate revenue with single ad exposures, ignoring the contribution of each exposure to the final conversion, thus usually falls into suboptimal solutions. In this paper, we formulate the sequential advertising strategy optimization as a dynamic knapsack problem. We propose a theoretically guaranteed bilevel optimization framework, which significantly reduces the solution space of the original optimization space while ensuring the solution quality. To improve the exploration efficiency of reinforcement learning, we also devise an effective action space reduction approach. Extensive offline and online experiments show the superior performance of our approaches over state-of-the-art baselines in terms of cumulative revenue. △ Less

Submitted 29 June, 2020; originally announced June 2020.

Comments: accepted by ICML 2020

arXiv:2006.16205 [pdf, other]

Composed Fine-Tuning: Freezing Pre-Trained Denoising Autoencoders for Improved Generalization

Authors: Sang Michael Xie, Tengyu Ma, Percy Liang

Abstract: We focus on prediction problems with structured outputs that are subject to output validity constraints, e.g. pseudocode-to-code translation where the code must compile. While labeled input-output pairs are expensive to obtain, "unlabeled" outputs, i.e. outputs without corresponding inputs, are freely available (e.g. code on GitHub) and provide information about output validity. We can capture the… ▽ More We focus on prediction problems with structured outputs that are subject to output validity constraints, e.g. pseudocode-to-code translation where the code must compile. While labeled input-output pairs are expensive to obtain, "unlabeled" outputs, i.e. outputs without corresponding inputs, are freely available (e.g. code on GitHub) and provide information about output validity. We can capture the output structure by pre-training a denoiser to denoise corrupted versions of unlabeled outputs. We first show that standard fine-tuning after pre-training destroys some of this structure. We then propose composed fine-tuning, which fine-tunes a predictor composed with the pre-trained denoiser, which is frozen to preserve output structure. For two-layer ReLU networks, we prove that composed fine-tuning significantly reduces the complexity of the predictor, thus improving generalization. Empirically, we show that composed fine-tuning improves over standard fine-tuning on two pseudocode-to-code translation datasets (3% and 6% relative). The improvement from composed fine-tuning is magnified on out-of-distribution (OOD) examples (4% and 25% relative). △ Less

Submitted 24 October, 2023; v1 submitted 29 June, 2020; originally announced June 2020.

Comments: ICML 2021 Long talk

arXiv:2006.07800 [pdf, other]

On the transferability of adversarial examples between convex and 01 loss models

Authors: Yunzhe Xue, Meiyan Xie, Usman Roshan

Abstract: The 01 loss gives different and more accurate boundaries than convex loss models in the presence of outliers. Could the difference of boundaries translate to adversarial examples that are non-transferable between 01 loss and convex models? We explore this empirically in this paper by studying transferability of adversarial examples between linear 01 loss and convex (hinge) loss models, and between… ▽ More The 01 loss gives different and more accurate boundaries than convex loss models in the presence of outliers. Could the difference of boundaries translate to adversarial examples that are non-transferable between 01 loss and convex models? We explore this empirically in this paper by studying transferability of adversarial examples between linear 01 loss and convex (hinge) loss models, and between dual layer neural networks with sign activation and 01 loss vs sigmoid activation and logistic loss. We first show that white box adversarial examples do not transfer effectively between convex and 01 loss and between 01 loss models compared to between convex models. As a result of this non-transferability we see that convex substitute model black box attacks are less effective on 01 loss than convex models. Interestingly we also see that 01 loss substitute model attacks are ineffective on both convex and 01 loss models mostly likely due to the non-uniqueness of 01 loss models. We show intuitively by example how the presence of outliers can cause different decision boundaries between 01 and convex loss models which in turn produces adversaries that are non-transferable. Indeed we see on MNIST that adversaries transfer between 01 loss and convex models more easily than on CIFAR10 and ImageNet which are likely to contain outliers. We show intuitively by example how the non-continuity of 01 loss makes adversaries non-transferable in a dual layer neural network. We discretize CIFAR10 features to be more like MNIST and find that it does not improve transferability, thus suggesting that different boundaries due to outliers are more likely the cause of non-transferability. As a result of this non-transferability we show that our dual layer sign activation network with 01 loss can attain robustness on par with simple convolutional networks. △ Less

Submitted 29 July, 2020; v1 submitted 14 June, 2020; originally announced June 2020.

arXiv:2005.01026 [pdf, other]

doi 10.1007/s11280-022-01046-x

Multi-Center Federated Learning: Clients Clustering for Better Personalization

Authors: Guodong Long, Ming Xie, Tao Shen, Tianyi Zhou, Xianzhi Wang, Jing Jiang, Chengqi Zhang

Abstract: Federated learning has received great attention for its capability to train a large-scale model in a decentralized manner without needing to access user data directly. It helps protect the users' private data from centralized collecting. Unlike distributed machine learning, federated learning aims to tackle non-IID data from heterogeneous sources in various real-world applications, such as those o… ▽ More Federated learning has received great attention for its capability to train a large-scale model in a decentralized manner without needing to access user data directly. It helps protect the users' private data from centralized collecting. Unlike distributed machine learning, federated learning aims to tackle non-IID data from heterogeneous sources in various real-world applications, such as those on smartphones. Existing federated learning approaches usually adopt a single global model to capture the shared knowledge of all users by aggregating their gradients, regardless of the discrepancy between their data distributions. However, due to the diverse nature of user behaviors, assigning users' gradients to different global models (i.e., centers) can better capture the heterogeneity of data distributions across users. Our paper proposes a novel multi-center aggregation mechanism for federated learning, which learns multiple global models from the non-IID user data and simultaneously derives the optimal matching between users and centers. We formulate the problem as a joint optimization that can be efficiently solved by a stochastic expectation maximization (EM) algorithm. Our experimental results on benchmark datasets show that our method outperforms several popular federated learning methods. △ Less

Submitted 5 February, 2023; v1 submitted 3 May, 2020; originally announced May 2020.

Comments: This paper has two duplicated versions: 2005.01026 and 2108.08647. The first one 2005.01026 is the right one, and the second one 2108.08647 should be deleted because it always causes misoperating

Journal ref: World Wide Web,26,(2003),481-500

arXiv:2004.08472 [pdf, other]

Leveraging the Fisher randomization test using confidence distributions: inference, combination and fusion learning

Authors: Xiaokang Luo, Tirthankar Dasgupta, Minge Xie, Regina Liu

Abstract: The flexibility and wide applicability of the Fisher randomization test (FRT) makes it an attractive tool for assessment of causal effects of interventions from modern-day randomized experiments that are increasing in size and complexity. This paper provides a theoretical inferential framework for FRT by establishing its connection with confidence distributions Such a connection leads to developme… ▽ More The flexibility and wide applicability of the Fisher randomization test (FRT) makes it an attractive tool for assessment of causal effects of interventions from modern-day randomized experiments that are increasing in size and complexity. This paper provides a theoretical inferential framework for FRT by establishing its connection with confidence distributions Such a connection leads to development of (i) an unambiguous procedure for inversion of FRTs to generate confidence intervals with guaranteed coverage, (ii) generic and specific methods to combine FRTs from multiple independent experiments with theoretical guarantees and (iii) new insights on the effect of size of the Monte Carlo sample on the results of FRT. Our developments pertain to finite sample settings but have direct extensions to large samples. Simulations and a case example demonstrate the benefit of these new developments. △ Less

Submitted 17 April, 2020; originally announced April 2020.

arXiv:2002.10716 [pdf, other]

Understanding and Mitigating the Tradeoff Between Robustness and Accuracy

Authors: Aditi Raghunathan, Sang Michael Xie, Fanny Yang, John Duchi, Percy Liang

Abstract: Adversarial training augments the training set with perturbations to improve the robust error (over worst-case perturbations), but it often leads to an increase in the standard error (on unperturbed test inputs). Previous explanations for this tradeoff rely on the assumption that no predictor in the hypothesis class has low standard and robust error. In this work, we precisely characterize the eff… ▽ More Adversarial training augments the training set with perturbations to improve the robust error (over worst-case perturbations), but it often leads to an increase in the standard error (on unperturbed test inputs). Previous explanations for this tradeoff rely on the assumption that no predictor in the hypothesis class has low standard and robust error. In this work, we precisely characterize the effect of augmentation on the standard error in linear regression when the optimal linear predictor has zero standard and robust error. In particular, we show that the standard error could increase even when the augmented perturbations have noiseless observations from the optimal linear predictor. We then prove that the recently proposed robust self-training (RST) estimator improves robust error without sacrificing standard error for noiseless linear regression. Empirically, for neural networks, we find that RST with different adversarial training methods improves both standard and robust error for random and adversarial rotations and adversarial $\ell_\infty$ perturbations in CIFAR-10. △ Less

Submitted 6 July, 2020; v1 submitted 25 February, 2020; originally announced February 2020.

Comments: Appearing at International Conference on Machine Learning (ICML) 2020

arXiv:2002.03444 [pdf, other]

Robust binary classification with the 01 loss

Authors: Yunzhe Xue, Meiyan Xie, Usman Roshan

Abstract: The 01 loss is robust to outliers and tolerant to noisy data compared to convex loss functions. We conjecture that the 01 loss may also be more robust to adversarial attacks. To study this empirically we have developed a stochastic coordinate descent algorithm for a linear 01 loss classifier and a single hidden layer 01 loss neural network. Due to the absence of the gradient we iteratively update… ▽ More The 01 loss is robust to outliers and tolerant to noisy data compared to convex loss functions. We conjecture that the 01 loss may also be more robust to adversarial attacks. To study this empirically we have developed a stochastic coordinate descent algorithm for a linear 01 loss classifier and a single hidden layer 01 loss neural network. Due to the absence of the gradient we iteratively update coordinates on random subsets of the data for fixed epochs. We show our algorithms to be fast and comparable in accuracy to the linear support vector machine and logistic loss single hidden layer network for binary classification on several image benchmarks, thus establishing that our method is on-par in test accuracy with convex losses. We then subject them to accurately trained substitute model black box attacks on the same image benchmarks and find them to be more robust than convex counterparts. On CIFAR10 binary classification task between classes 0 and 1 with adversarial perturbation of 0.0625 we see that the MLP01 network loses 27\% in accuracy whereas the MLP-logistic counterpart loses 83\%. Similarly on STL10 and ImageNet binary classification between classes 0 and 1 the MLP01 network loses 21\% and 20\% while MLP-logistic loses 67\% and 45\% respectively. On MNIST that is a well-separable dataset we find MLP01 comparable to MLP-logistic and show under simulation how and why our 01 loss solver is less robust there. We then propose adversarial training for our linear 01 loss solver that significantly improves its robustness on MNIST and all other datasets and retains clean test accuracy. Finally we show practical applications of our method to deter traffic sign and facial recognition adversarial attacks. We discuss attacks with 01 loss, substitute model accuracy, and several future avenues like multiclass, 01 loss convolutions, and further adversarial training. △ Less

Submitted 9 February, 2020; originally announced February 2020.

arXiv:2001.11945 [pdf, other]

p-Value as the Strength of Evidence Measured by Confidence Distribution

Authors: Sifan Liu, Regina Liu, Min-ge Xie

Abstract: The notion of p-value is a fundamental concept in statistical inference and has been widely used for reporting outcomes of hypothesis tests. However, p-value is often misinterpreted, misused or miscommunicated in practice. Part of the issue is that existing definitions of p-value are often derived from constructions under specific settings, and a general definition that directly reflects the evide… ▽ More The notion of p-value is a fundamental concept in statistical inference and has been widely used for reporting outcomes of hypothesis tests. However, p-value is often misinterpreted, misused or miscommunicated in practice. Part of the issue is that existing definitions of p-value are often derived from constructions under specific settings, and a general definition that directly reflects the evidence of the null hypothesis is not yet available. In this article, we first propose a general and rigorous definition of p-value that fulfills two performance-based characteristics. The performance-based definition subsumes all existing construction-based definitions of the p-value, and justifies their interpretations. The paper further presents a specific approach based on confidence distribution to formulate and calculate p-values. This specific way of computing p values has two main advantages. First, it is applicable for a wide range of hypothesis testing problems, including the standard one- and two-sided tests, tests with interval-type null, intersection-union tests, multivariate tests and so on. Second, it can naturally lead to a coherent interpretation of p-value as evidence in support of the null hypothesis, as well as a meaningful measure of degree of such support. In particular, it places a meaning of a large p-value, e.g. p-value of 0.8 has more support than 0.5. Numerical examples are used to illustrate the wide applicability and computational feasibility of our approach. We show that our proposal is effective and can be applied broadly, without further consideration of the form/size of the null space. As for existing testing methods, the solutions have not been available or cannot be easily obtained. △ Less

Submitted 31 January, 2020; originally announced January 2020.

Comments: 30 pages, 8 figures

MSC Class: 62F03 (Primary) 62H15 (Secondary)

arXiv:2001.09595 [pdf, other]

Developing Multi-Task Recommendations with Long-Term Rewards via Policy Distilled Reinforcement Learning

Authors: Xi Liu, Li Li, Ping-Chun Hsieh, Muhe Xie, Yong Ge, Rui Chen

Abstract: With the explosive growth of online products and content, recommendation techniques have been considered as an effective tool to overcome information overload, improve user experience, and boost business revenue. In recent years, we have observed a new desideratum of considering long-term rewards of multiple related recommendation tasks simultaneously. The consideration of long-term rewards is str… ▽ More With the explosive growth of online products and content, recommendation techniques have been considered as an effective tool to overcome information overload, improve user experience, and boost business revenue. In recent years, we have observed a new desideratum of considering long-term rewards of multiple related recommendation tasks simultaneously. The consideration of long-term rewards is strongly tied to business revenue and growth. Learning multiple tasks simultaneously could generally improve the performance of individual task due to knowledge sharing in multi-task learning. While a few existing works have studied long-term rewards in recommendations, they mainly focus on a single recommendation task. In this paper, we propose {\it PoDiRe}: a \underline{po}licy \underline{di}stilled \underline{re}commender that can address long-term rewards of recommendations and simultaneously handle multiple recommendation tasks. This novel recommendation solution is based on a marriage of deep reinforcement learning and knowledge distillation techniques, which is able to establish knowledge sharing among different tasks and reduce the size of a learning model. The resulting model is expected to attain better performance and lower response latency for real-time recommendation services. In collaboration with Samsung Game Launcher, one of the world's largest commercial mobile game platforms, we conduct a comprehensive experimental study on large-scale real data with hundreds of millions of events and show that our solution outperforms many state-of-the-art methods in terms of several standard evaluation metrics. △ Less

Submitted 27 January, 2020; originally announced January 2020.

arXiv:2001.08336 [pdf, other]

Geometric Conditions for the Discrepant Posterior Phenomenon and Connections to Simpson's Paradox

Authors: Yang Chen, Ruobin Gong, Min-ge Xie

Abstract: The discrepant posterior phenomenon (DPP) is a counter-intuitive phenomenon that can frequently occur in a Bayesian analysis of multivariate parameters. It refers to the phenomenon that a parameter estimate based on a posterior is more extreme than both of those inferred based on either the prior or the likelihood alone. Inferential claims that exhibit DPP defy the common intuition that the poster… ▽ More The discrepant posterior phenomenon (DPP) is a counter-intuitive phenomenon that can frequently occur in a Bayesian analysis of multivariate parameters. It refers to the phenomenon that a parameter estimate based on a posterior is more extreme than both of those inferred based on either the prior or the likelihood alone. Inferential claims that exhibit DPP defy the common intuition that the posterior is a prior-data compromise, and the phenomenon can be surprisingly ubiquitous in well-behaved Bayesian models. In this paper we revisit this phenomenon and, using point estimation as an example, derive conditions under which the DPP occurs in Bayesian models with exponential quadratic likelihoods and conjugate multivariate Gaussian priors. The family of exponential quadratic likelihood models includes Gaussian models and those models with local asymptotic normality property. We provide an intuitive geometric interpretation of the phenomenon and show that there exists a nontrivial space of marginal directions such that the DPP occurs. We further relate the phenomenon to the Simpson's paradox and discover their deep-rooted connection that is associated with marginalization. We also draw connections with Bayesian computational algorithms when difficult geometry exists. Our discovery demonstrates that DPP is more prevalent than previously understood and anticipated. Theoretical results are complemented by numerical illustrations. Scenarios covered in this study have implications for parameterization, sensitivity analysis, and prior choice for Bayesian modeling. △ Less

Submitted 12 January, 2022; v1 submitted 22 January, 2020; originally announced January 2020.

arXiv:1906.06032 [pdf, other]

Adversarial Training Can Hurt Generalization

Authors: Aditi Raghunathan, Sang Michael Xie, Fanny Yang, John C. Duchi, Percy Liang

Abstract: While adversarial training can improve robust accuracy (against an adversary), it sometimes hurts standard accuracy (when there is no adversary). Previous work has studied this tradeoff between standard and robust accuracy, but only in the setting where no predictor performs well on both objectives in the infinite data limit. In this paper, we show that even when the optimal predictor with infinit… ▽ More While adversarial training can improve robust accuracy (against an adversary), it sometimes hurts standard accuracy (when there is no adversary). Previous work has studied this tradeoff between standard and robust accuracy, but only in the setting where no predictor performs well on both objectives in the infinite data limit. In this paper, we show that even when the optimal predictor with infinite data performs well on both objectives, a tradeoff can still manifest itself with finite data. Furthermore, since our construction is based on a convex learning problem, we rule out optimization concerns, thus laying bare a fundamental tension between robustness and generalization. Finally, we show that robust self-training mostly eliminates this tradeoff by leveraging unlabeled data. △ Less

Submitted 26 August, 2019; v1 submitted 14 June, 2019; originally announced June 2019.

arXiv:1906.05533 [pdf, other]

Individualized Group Learning

Authors: Chencheng Cai, Rong Chen, Min-ge Xie

Abstract: Many massive data are assembled through collections of information of a large number of individuals in a population. The analysis of such data, especially in the aspect of individualized inferences and solutions, has the potential to create significant value for practical applications. Traditionally, inference for an individual in the data set is either solely relying on the information of the ind… ▽ More Many massive data are assembled through collections of information of a large number of individuals in a population. The analysis of such data, especially in the aspect of individualized inferences and solutions, has the potential to create significant value for practical applications. Traditionally, inference for an individual in the data set is either solely relying on the information of the individual or from summarizing the information about the whole population. However, with the availability of big data, we have the opportunity, as well as a unique challenge, to make a more effective individualized inference that takes into consideration of both the population information and the individual discrepancy. To deal with the possible heterogeneity within the population while providing effective and credible inferences for individuals in a data set, this article develops a new approach called the individualized group learning (iGroup). The iGroup approach uses local nonparametric techniques to generate an individualized group by pooling other entities in the population which share similar characteristics with the target individual, even when individual estimates are biased due to limited number of observations. Three general cases of iGroup are discussed, and their asymptotic performances are investigated. Both theoretical results and empirical simulations reveal that, by applying iGroup, the performance of statistical inference on the individual level are ensured and can be substantially improved from inference based on either solely individual information or entire population information. The method has a broad range of applications. Two examples in financial statistics and maritime anomaly detection are presented. △ Less

Submitted 16 September, 2019; v1 submitted 13 June, 2019; originally announced June 2019.

arXiv:1901.10517 [pdf, other]

Reparameterizable Subset Sampling via Continuous Relaxations

Authors: Sang Michael Xie, Stefano Ermon

Abstract: Many machine learning tasks require sampling a subset of items from a collection based on a parameterized distribution. The Gumbel-softmax trick can be used to sample a single item, and allows for low-variance reparameterized gradients with respect to the parameters of the underlying distribution. However, stochastic optimization involving subset sampling is typically not reparameterizable. To ove… ▽ More Many machine learning tasks require sampling a subset of items from a collection based on a parameterized distribution. The Gumbel-softmax trick can be used to sample a single item, and allows for low-variance reparameterized gradients with respect to the parameters of the underlying distribution. However, stochastic optimization involving subset sampling is typically not reparameterizable. To overcome this limitation, we define a continuous relaxation of subset sampling that provides reparameterization gradients by generalizing the Gumbel-max trick. We use this approach to sample subsets of features in an instance-wise feature selection task for model interpretability, subsets of neighbors to implement a deep stochastic k-nearest neighbors model, and sub-sequences of neighbors to implement parametric t-SNE by directly comparing the identities of local neighbors. We improve performance in all these tasks by incorporating subset sampling in end-to-end training. △ Less

Submitted 26 February, 2021; v1 submitted 29 January, 2019; originally announced January 2019.

Comments: IJCAI 2019

arXiv:1901.06247 [pdf, other]

Micro- and Macro-Level Churn Analysis of Large-Scale Mobile Games

Authors: Xi Liu, Muhe Xie, Xidao Wen, Rui Chen, Yong Ge, Nick Duffield, Na Wang

Abstract: As mobile devices become more and more popular, mobile gaming has emerged as a promising market with billion-dollar revenues. A variety of mobile game platforms and services have been developed around the world. A critical challenge for these platforms and services is to understand the churn behavior in mobile games, which usually involves churn at micro level (between an app and a specific user)… ▽ More As mobile devices become more and more popular, mobile gaming has emerged as a promising market with billion-dollar revenues. A variety of mobile game platforms and services have been developed around the world. A critical challenge for these platforms and services is to understand the churn behavior in mobile games, which usually involves churn at micro level (between an app and a specific user) and macro level (between an app and all its users). Accurate micro-level churn prediction and macro-level churn ranking will benefit many stakeholders such as game developers, advertisers, and platform operators. In this paper, we present the first large-scale churn analysis for mobile games that supports both micro-level churn prediction and macro-level churn ranking. For micro-level churn prediction, in view of the common limitations of the state-of-the-art methods built upon traditional machine learning models, we devise a novel semi-supervised and inductive embedding model that jointly learns the prediction function and the embedding function for user-app relationships. We model these two functions by deep neural networks with a unique edge embedding technique that is able to capture both contextual information and relationship dynamics. We also design a novel attributed random walk technique that takes into consideration both topological adjacency and attribute similarities. To address macro-level churn ranking, we propose to construct a relationship graph with estimated micro-level churn probabilities as edge weights and adapt link analysis algorithms on the graph. We devise a simple algorithm SimSum and adapt two more advanced algorithms PageRank and HITS. The performance of our solutions for the two-level churn analysis problems is evaluated on real-world data collected from the Samsung Game Launcher platform. △ Less

Submitted 14 January, 2019; originally announced January 2019.

Comments: arXiv admin note: substantial text overlap with arXiv:1808.06573

arXiv:1811.05932 [pdf, ps, other]

Streaming Network Embedding through Local Actions

Authors: Xi Liu, Ping-Chun Hsieh, Nick Duffield, Rui Chen, Muhe Xie, Xidao Wen

Abstract: Recently, considerable research attention has been paid to network embedding, a popular approach to construct feature vectors of vertices. Due to the curse of dimensionality and sparsity in graphical datasets, this approach has become indispensable for machine learning tasks over large networks. The majority of existing literature has considered this technique under the assumption that the network… ▽ More Recently, considerable research attention has been paid to network embedding, a popular approach to construct feature vectors of vertices. Due to the curse of dimensionality and sparsity in graphical datasets, this approach has become indispensable for machine learning tasks over large networks. The majority of existing literature has considered this technique under the assumption that the network is static. However, networks in many applications, nodes and edges accrue to a growing network as a streaming. A small number of very recent results have addressed the problem of embedding for dynamic networks. However, they either rely on knowledge of vertex attributes, suffer high-time complexity or need to be re-trained without closed-form expression. Thus the approach of adapting the existing methods to the streaming environment faces non-trivial technical challenges. These challenges motivate developing new approaches to the problems of streaming network embedding. In this paper, We propose a new framework that is able to generate latent features for new vertices with high efficiency and low complexity under specified iteration rounds. We formulate a constrained optimization problem for the modification of the representation resulting from a stream arrival. We show this problem has no closed-form solution and instead develop an online approximation solution. Our solution follows three steps: (1) identify vertices affected by new vertices, (2) generate latent features for new vertices, and (3) update the latent features of the most affected vertices. The generated representations are provably feasible and not far from the optimal ones in terms of expectation. Multi-class classification and clustering on five real-world networks demonstrate that our model can efficiently update vertex representations and simultaneously achieve comparable or even better performance. △ Less

Submitted 14 November, 2018; originally announced November 2018.

arXiv:1808.06573 [pdf, other]

A Semi-Supervised and Inductive Embedding Model for Churn Prediction of Large-Scale Mobile Games

Authors: Xi Liu, Muhe Xie, Xidao Wen, Rui Chen, Yong Ge, Nick Duffield, Na Wang

Abstract: Mobile gaming has emerged as a promising market with billion-dollar revenues. A variety of mobile game platforms and services have been developed around the world. One critical challenge for these platforms and services is to understand user churn behavior in mobile games. Accurate churn prediction will benefit many stakeholders such as game developers, advertisers, and platform operators. In this… ▽ More Mobile gaming has emerged as a promising market with billion-dollar revenues. A variety of mobile game platforms and services have been developed around the world. One critical challenge for these platforms and services is to understand user churn behavior in mobile games. Accurate churn prediction will benefit many stakeholders such as game developers, advertisers, and platform operators. In this paper, we present the first large-scale churn prediction solution for mobile games. In view of the common limitations of the state-of-the-art methods built upon traditional machine learning models, we devise a novel semi-supervised and inductive embedding model that jointly learns the prediction function and the embedding function for user-app relationships. We model these two functions by deep neural networks with a unique edge embedding technique that is able to capture both contextual information and relationship dynamics. We also design a novel attributed random walk technique that takes into consideration both topological adjacency and attribute similarities. To evaluate the performance of our solution, we collect real-world data from the Samsung Game Launcher platform that includes tens of thousands of games and hundreds of millions of user-app interactions. The experimental results with this data demonstrate the superiority of our proposed model against existing state-of-the-art methods. △ Less

Submitted 10 October, 2018; v1 submitted 20 August, 2018; originally announced August 2018.

Comments: to appear in ICDM 2018

arXiv:1805.10407 [pdf, other]

Semi-supervised Deep Kernel Learning: Regression with Unlabeled Data by Minimizing Predictive Variance

Authors: Neal Jean, Sang Michael Xie, Stefano Ermon

Abstract: Large amounts of labeled data are typically required to train deep learning models. For many real-world problems, however, acquiring additional data can be expensive or even impossible. We present semi-supervised deep kernel learning (SSDKL), a semi-supervised regression model based on minimizing predictive variance in the posterior regularization framework. SSDKL combines the hierarchical represe… ▽ More Large amounts of labeled data are typically required to train deep learning models. For many real-world problems, however, acquiring additional data can be expensive or even impossible. We present semi-supervised deep kernel learning (SSDKL), a semi-supervised regression model based on minimizing predictive variance in the posterior regularization framework. SSDKL combines the hierarchical representation learning of neural networks with the probabilistic modeling capabilities of Gaussian processes. By leveraging unlabeled data, we show improvements on a diverse set of real-world regression tasks over supervised deep kernel learning and semi-supervised methods such as VAT and mean teacher adapted for regression. △ Less

Submitted 4 March, 2019; v1 submitted 25 May, 2018; originally announced May 2018.

Comments: In Proceedings of Neural Information Processing Systems (NeurIPS) 2018

arXiv:1805.09757 [pdf, other]

Geographical Hidden Markov Tree for Flood Extent Mapping (With Proof Appendix)

Authors: Miao Xie, Zhe Jiang, Arpan Man Sainju

Abstract: Flood extent mapping plays a crucial role in disaster management and national water forecasting. Unfortunately, traditional classification methods are often hampered by the existence of noise, obstacles and heterogeneity in spectral features as well as implicit anisotropic spatial dependency across class labels. In this paper, we propose geographical hidden Markov tree, a probabilistic graphical m… ▽ More Flood extent mapping plays a crucial role in disaster management and national water forecasting. Unfortunately, traditional classification methods are often hampered by the existence of noise, obstacles and heterogeneity in spectral features as well as implicit anisotropic spatial dependency across class labels. In this paper, we propose geographical hidden Markov tree, a probabilistic graphical model that generalizes the common hidden Markov model from a one dimensional sequence to a two dimensional map. Anisotropic spatial dependency is incorporated in the hidden class layer with a reverse tree structure. We also investigate computational algorithms for reverse tree construction, model parameter learning and class inference. Extensive evaluations on both synthetic and real world datasets show that proposed model outperforms multiple baselines in flood mapping, and our algorithms are scalable on large data sizes. △ Less

Submitted 24 May, 2018; originally announced May 2018.

arXiv:1802.05380 [pdf, other]

Active Feature Acquisition with Supervised Matrix Completion

Authors: Sheng-Jun Huang, Miao Xu, Ming-Kun Xie, Masashi Sugiyama, Gang Niu, Songcan Chen

Abstract: Feature missing is a serious problem in many applications, which may lead to low quality of training data and further significantly degrade the learning performance. While feature acquisition usually involves special devices or complex process, it is expensive to acquire all feature values for the whole dataset. On the other hand, features may be correlated with each other, and some values may be… ▽ More Feature missing is a serious problem in many applications, which may lead to low quality of training data and further significantly degrade the learning performance. While feature acquisition usually involves special devices or complex process, it is expensive to acquire all feature values for the whole dataset. On the other hand, features may be correlated with each other, and some values may be recovered from the others. It is thus important to decide which features are most informative for recovering the other features as well as improving the learning performance. In this paper, we try to train an effective classification model with least acquisition cost by jointly performing active feature querying and supervised matrix completion. When completing the feature matrix, a novel target function is proposed to simultaneously minimize the reconstruction error on observed entries and the supervised loss on training data. When querying the feature value, the most uncertain entry is actively selected based on the variance of previous iterations. In addition, a bi-objective optimization method is presented for cost-aware active selection when features bear different acquisition costs. The effectiveness of the proposed approach is well validated by both theoretical analysis and experimental study. △ Less

Submitted 4 June, 2018; v1 submitted 14 February, 2018; originally announced February 2018.

Comments: 9 pages, 8 figures

arXiv:1802.03511 [pdf, other]

A General Framework For Frequentist Model Averaging

Authors: Priyam Mitra, Heng Lian, Ritwik Mitra, Hua Liang, Min-ge Xie

Abstract: Model selection strategies have been routinely employed to determine a model for data analysis in statistics, and further study and inference then often proceed as though the selected model were the true model that were known a priori. This practice does not account for the uncertainty introduced by the selection process and the fact that the selected model can possibly be a wrong one. Model avera… ▽ More Model selection strategies have been routinely employed to determine a model for data analysis in statistics, and further study and inference then often proceed as though the selected model were the true model that were known a priori. This practice does not account for the uncertainty introduced by the selection process and the fact that the selected model can possibly be a wrong one. Model averaging approaches try to remedy this issue by combining estimators for a set of candidate models. Specifically, instead of deciding which model is the 'right' one, a model averaging approach suggests to fit a set of candidate models and average over the estimators using certain data adaptive weights. In this paper we establish a general frequentist model averaging framework that does not set any restrictions on the set of candidate models. It greatly broadens the scope of the existing methodologies under the frequentist model averaging development. Assuming the data is from an unknown model, we derive the model averaging estimator and study its limiting distributions and related predictions while taking possible modeling biases into account. We propose a set of optimal weights to combine the individual estimators so that the expected mean squared error of the average estimator is minimized. Simulation studies are conducted to compare the performance of the estimator with that of the existing methods. The results show the benefits of the proposed approach over traditional model selection approaches as well as existing model averaging methods. △ Less

Submitted 9 February, 2018; originally announced February 2018.

arXiv:1705.10347 [pdf, ps, other]

An effective likelihood-free approximate computing method with statistical inferential guarantees

Authors: Suzanne Thornton, Wentao Li, Min-ge Xie

Abstract: Approximate Bayesian computing is a powerful likelihood-free method that has grown increasingly popular since early applications in population genetics. However, complications arise in the theoretical justification for Bayesian inference conducted from this method with a non-sufficient summary statistic. In this paper, we seek to re-frame approximate Bayesian computing within a frequentist context… ▽ More Approximate Bayesian computing is a powerful likelihood-free method that has grown increasingly popular since early applications in population genetics. However, complications arise in the theoretical justification for Bayesian inference conducted from this method with a non-sufficient summary statistic. In this paper, we seek to re-frame approximate Bayesian computing within a frequentist context and justify its performance by standards set on the frequency coverage rate. In doing so, we develop a new computational technique called approximate confidence distribution computing, yielding theoretical support for the use of non-sufficient summary statistics in likelihood-free methods. Furthermore, we demonstrate that approximate confidence distribution computing extends the scope of approximate Bayesian computing to include data-dependent priors without damaging the inferential integrity. This data-dependent prior can be viewed as an initial `distribution estimate' of the target parameter which is updated with the results of the approximate confidence distribution computing method. A general strategy for constructing an appropriate data-dependent prior is also discussed and is shown to often increase the computing speed while maintaining statistical inferential guarantees. We supplement the theory with simulation studies illustrating the benefits of the proposed method, namely the potential for broader applications and the increased computing speed compared to the standard approximate Bayesian computing methods. △ Less

Submitted 30 November, 2018; v1 submitted 29 May, 2017; originally announced May 2017.

arXiv:1606.08339 [pdf, other]

doi 10.1002/asmb.2161

Dynamic dependence networks: Financial time series forecasting and portfolio decisions (with discussion)

Authors: Zoey Yi Zhao, Meng Xie, Mike West

Abstract: We discuss Bayesian forecasting of increasingly high-dimensional time series, a key area of application of stochastic dynamic models in the financial industry and allied areas of business. Novel state-space models characterizing sparse patterns of dependence among multiple time series extend existing multivariate volatility models to enable scaling to higher numbers of individual time series. The… ▽ More We discuss Bayesian forecasting of increasingly high-dimensional time series, a key area of application of stochastic dynamic models in the financial industry and allied areas of business. Novel state-space models characterizing sparse patterns of dependence among multiple time series extend existing multivariate volatility models to enable scaling to higher numbers of individual time series. The theory of these "dynamic dependence network" models shows how the individual series can be "decoupled" for sequential analysis, and then "recoupled" for applied forecasting and decision analysis. Decoupling allows fast, efficient analysis of each of the series in individual univariate models that are linked-- for later recoupling-- through a theoretical multivariate volatility structure defined by a sparse underlying graphical model. Computational advances are especially significant in connection with model uncertainty about the sparsity patterns among series that define this graphical model; Bayesian model averaging using discounting of historical information builds substantially on this computational advance. An extensive, detailed case study showcases the use of these models, and the improvements in forecasting and financial portfolio investment decisions that are achievable. Using a long series of daily international currency, stock indices and commodity prices, the case study includes evaluations of multi-day forecasts and Bayesian portfolio analysis with a variety of practical utility functions, as well as comparisons against commodity trading advisor benchmarks. △ Less

Submitted 27 June, 2016; originally announced June 2016.

Comments: 31 pages, 9 figures, 3 tables

MSC Class: 62M10; 62F15; 62P20

Journal ref: Applied Stochastic Models in Business and Industry, 32, 311-339, 2016

arXiv:1505.05184 [pdf]

Multi-Objective Optimization of a Port-of-Entry Inspection Policy

Authors: Christina M. Young, Mingyu Li, Yada Zhu, Minge Xie, Elsayed A. Elsayed, Tsvetan Asamov

Abstract: At the port-of-entry containers are inspected through a specific sequence of sensor stations to detect the presence of nuclear materials, biological and chemical agents, and other illegal cargo. The inspection policy, which includes the sequence in which sensors are applied and the threshold levels used at the inspection stations, affects the probability of misclassifying a container as well as th… ▽ More At the port-of-entry containers are inspected through a specific sequence of sensor stations to detect the presence of nuclear materials, biological and chemical agents, and other illegal cargo. The inspection policy, which includes the sequence in which sensors are applied and the threshold levels used at the inspection stations, affects the probability of misclassifying a container as well as the cost and time spent in inspection. In this paper we consider a system operating with a Boolean decision function combining station results and present a multi-objective optimization approach to determine the optimal sensor arrangement and threshold levels while considering cost and time. The total cost includes cost incurred by misclassification errors and the total expected cost of inspection, while the time represents the total expected time a container spends in the inspection system. An example which applies the approach in a theoretical inspection system is presented. △ Less

Submitted 19 May, 2015; originally announced May 2015.

Showing 1–50 of 52 results for author: Xie, M