
Showing 1–31 of 31 results for author: Xia, Q

Searching in archive stat.
  1. arXiv:2506.07953  [pdf, ps, other]

    stat.ME stat.AP

    Mediation Analysis for Sparse and Irregularly Spaced Longitudinal Outcomes with Application to the MrOS Sleep Study

    Authors: Rui Ren, Haoyi Yang, Qian Xiao, Lingzhou Xue, Yuan Huang

    Abstract: Mediation analysis has become a widely used method for identifying the pathways through which an independent variable influences a dependent variable via intermediate mediators. However, limited research addresses the case where mediators are high-dimensional and the outcome is represented by sparse, irregularly spaced longitudinal data. To address these challenges, we propose a mediation analysis…

    Submitted 9 June, 2025; originally announced June 2025.

    Comments: 23 pages, 6 figures

  2. arXiv:2502.08808  [pdf, other]

    cs.LG math.OC stat.ML

    A First-order Generative Bilevel Optimization Framework for Diffusion Models

    Authors: Quan Xiao, Hui Yuan, A F M Saif, Gaowen Liu, Ramana Kompella, Mengdi Wang, Tianyi Chen

    Abstract: Diffusion models, which iteratively denoise data samples to synthesize high-quality outputs, have achieved empirical success across domains. However, optimizing these models for downstream tasks often involves nested bilevel structures, such as tuning hyperparameters for fine-tuning tasks or noise schedules in training dynamics, where traditional bilevel methods fail due to the infinite-dimensiona…

    Submitted 12 February, 2025; originally announced February 2025.

  3. Optimal design of experiments with quantitative-sequence factors

    Authors: Yaping Wang, Sixu Liu, Qian Xiao

    Abstract: A new type of experiment with joint considerations of quantitative and sequence factors is recently drawing much attention in medical science, bio-engineering, and many other disciplines. The input spaces of such experiments are semi-discrete and often very large. Thus, efficient and economical experimental designs are required. Based on the transformations and aggregations of good lattice point s…

    Submitted 5 February, 2025; originally announced February 2025.

    Comments: This is the English version of the published paper in Chinese by SCIENCE CHINA Mathematics

    MSC Class: 62K20; 62K99

    Journal ref: SCIENCE CHINA Mathematics, 2025, 55: 1-24 (in Chinese)

  4. arXiv:2501.13614  [pdf, ps, other]

    stat.ME

    Determining The Number of Factors in Two-Way Factor Model of High-Dimensional Matrix-Variate Time Series: A White-Noise based Method for Serial Correlation Models

    Authors: Qiang Xia

    Abstract: In this paper, we study a new two-way factor model for high-dimensional matrix-variate time series. To estimate the number of factors in this two-way factor model, we decompose the series into two parts: one being a non-weakly correlated series and the other being a weakly correlated noise. By comparing the difference between two series, we can construct white-noise based signal statistics to dete…

    Submitted 25 January, 2025; v1 submitted 23 January, 2025; originally announced January 2025.

  5. arXiv:2406.10148  [pdf, other]

    math.OC cs.LG stat.ML

    A Primal-Dual-Assisted Penalty Approach to Bilevel Optimization with Coupled Constraints

    Authors: Liuyuan Jiang, Quan Xiao, Victor M. Tenorio, Fernando Real-Rojas, Antonio G. Marques, Tianyi Chen

    Abstract: Interest in bilevel optimization has grown in recent years, partially due to its applications to tackle challenging machine-learning problems. Several exciting recent works have been centered around developing efficient gradient-based algorithms that can solve bilevel optimization problems with provable guarantees. However, the existing literature mainly focuses on bilevel problems either without…

    Submitted 25 August, 2024; v1 submitted 14 June, 2024; originally announced June 2024.

    Comments: In this version, we have made the following updates: (1) Added a sensitivity analysis of the algorithm's hyperparameters (stepsize and penalty constant) in Appendix G. (2) Included a computational complexity analysis and comparison in Appendix H. (3) Explicitly stated the inner-loop stepsizes in Remarks 2 and 3

  6. arXiv:2307.03832  [pdf, other]

    stat.ME stat.AP

    A Bayesian Circadian Hidden Markov Model to Infer Rest-Activity Rhythms Using 24-hour Actigraphy Data

    Authors: Jiachen Lu, Qian Xiao, Cici Bauer

    Abstract: 24-hour actigraphy data collected by wearable devices offer valuable insights into physical activity types, intensity levels, and rest-activity rhythms (RAR). RARs, or patterns of rest and activity exhibited over a 24-hour period, are regulated by the body's circadian system, synchronizing physiological processes with external cues like the light-dark cycle. Disruptions to these rhythms, such as i…

    Submitted 7 July, 2023; originally announced July 2023.

  7. arXiv:2306.07607  [pdf, other]

    cs.IR stat.ML

    Practice with Graph-based ANN Algorithms on Sparse Data: Chi-square Two-tower model, HNSW, Sign Cauchy Projections

    Authors: Ping Li, Weijie Zhao, Chao Wang, Qi Xia, Alice Wu, Lijun Peng

    Abstract: Sparse data are common. The traditional "handcrafted" features are often sparse. Embedding vectors from trained models can also be very sparse, for example, embeddings trained via the "ReLu" activation function. In this paper, we report our exploration of efficient search in sparse data with graph-based ANN algorithms (e.g., HNSW, or SONG which is the GPU version of HNSW), which are popular in…

    Submitted 13 June, 2023; originally announced June 2023.

  8. arXiv:2304.06585  [pdf, ps, other]

    stat.ME

    Adaptive Testing for Alphas in High-dimensional Factor Pricing Models

    Authors: Qiang Xia, Xianyang Zhang

    Abstract: This paper proposes a new procedure to validate the multi-factor pricing theory by testing the presence of alpha in linear factor pricing models with a large number of assets. Because the market's inefficient pricing is likely to occur to a small fraction of exceptional assets, we develop a testing procedure that is particularly powerful against sparse signals. Based on the high-dimensional Gaussi…

    Submitted 22 May, 2023; v1 submitted 13 April, 2023; originally announced April 2023.

  9. arXiv:2302.05185  [pdf, other]

    cs.LG math.OC stat.ML

    On Penalty-based Bilevel Gradient Descent Method

    Authors: Han Shen, Quan Xiao, Tianyi Chen

    Abstract: Bilevel optimization enjoys a wide range of applications in emerging machine learning and signal processing problems such as hyper-parameter optimization, image reconstruction, meta-learning, adversarial training, and reinforcement learning. However, bilevel optimization problems are traditionally known to be difficult to solve. Recent progress on bilevel algorithms mainly focuses on bilevel optim…

    Submitted 6 January, 2025; v1 submitted 10 February, 2023; originally announced February 2023.

  10. A Scalable Gaussian Process for Large-Scale Periodic Data

    Authors: Yongxiang Li, Yuting Pu, Changming Cheng, Qian Xiao

    Abstract: The periodic Gaussian process (PGP) has been increasingly used to model periodic data due to its high accuracy. Yet, computing the likelihood of PGP has a high computational complexity of $\mathcal{O}\left(n^{3}\right)$ ($n$ is the data size), which hinders its wide application. To address this issue, we propose a novel circulant PGP (CPGP) model for large-scale periodic data collected at grids th…

    Submitted 8 February, 2023; v1 submitted 3 January, 2023; originally announced January 2023.

    Comments: Accepted for publication in Technometrics

  11. arXiv:2211.07096  [pdf, other]

    cs.LG math.OC stat.ML

    Alternating Implicit Projected SGD and Its Efficient Variants for Equality-constrained Bilevel Optimization

    Authors: Quan Xiao, Han Shen, Wotao Yin, Tianyi Chen

    Abstract: Stochastic bilevel optimization, which captures the inherent nested structure of machine learning problems, is gaining popularity in many recent applications. Existing works on bilevel optimization mostly consider either unconstrained problems or constrained upper-level problems. This paper considers the stochastic bilevel optimization problems with equality constraints both in the upper and lower…

    Submitted 12 February, 2023; v1 submitted 13 November, 2022; originally announced November 2022.

    Comments: Submitted to conference in Oct 2022

  12. Modeling and Active Learning for Experiments with Quantitative-Sequence Factors

    Authors: Qian Xiao, Yaping Wang, Abhyuday Mandal, Xinwei Deng

    Abstract: A new type of experiment that aims to determine the optimal quantities of a sequence of factors is eliciting considerable attention in medical science, bioengineering, and many other disciplines. Such studies require the simultaneous optimization of both quantities and the sequence orders of several components which are called quantitative-sequence (QS) factors. Given the large and semi-discrete s…

    Submitted 12 September, 2022; v1 submitted 6 September, 2022; originally announced September 2022.

    Comments: Accepted by Journal of the American Statistical Association

  13. arXiv:2207.08556  [pdf, other]

    cs.CR stat.ML

    A Certifiable Security Patch for Object Tracking in Self-Driving Systems via Historical Deviation Modeling

    Authors: Xudong Pan, Qifan Xiao, Mi Zhang, Min Yang

    Abstract: Self-driving cars (SDC) commonly implement the perception pipeline to detect the surrounding obstacles and track their moving trajectories, which lays the ground for the subsequent driving decision making process. Although the security of obstacle detection in SDC is intensively studied, not until very recently the attackers start to exploit the vulnerability of the tracking module. Compared with…

    Submitted 18 July, 2022; originally announced July 2022.

  14. arXiv:2206.07126  [pdf, other]

    cs.LG math.OC stat.ML

    Lazy Queries Can Reduce Variance in Zeroth-order Optimization

    Authors: Quan Xiao, Qing Ling, Tianyi Chen

    Abstract: A major challenge of applying zeroth-order (ZO) methods is the high query complexity, especially when queries are costly. We propose a novel gradient estimation technique for ZO methods based on adaptive lazy queries that we term as LAZO. Different from the classic one-point or two-point gradient estimation methods, LAZO develops two alternative ways to check the usefulness of old queries from pre…

    Submitted 14 June, 2022; originally announced June 2022.

  15. arXiv:2206.03996  [pdf, other]

    cs.LG eess.SY math.OC stat.ML

    Sharp-MAML: Sharpness-Aware Model-Agnostic Meta Learning

    Authors: Momin Abbas, Quan Xiao, Lisha Chen, Pin-Yu Chen, Tianyi Chen

    Abstract: Model-agnostic meta learning (MAML) is currently one of the dominating approaches for few-shot meta-learning. Albeit its effectiveness, the optimization of MAML can be challenging due to the innate bilevel problem structure. Specifically, the loss landscape of MAML is much more complex with possibly more saddle points and local minimizers than its empirical risk minimization counterpart. To addres…

    Submitted 14 August, 2022; v1 submitted 8 June, 2022; originally announced June 2022.

    Comments: Note: While finalizing the Github repository, we found an error in the testing script. We have reimplemented the code and updated the results in this version. The new code has been uploaded to Github, and the revision includes tables 1-5 and figures 2-3

  16. arXiv:2204.08182  [pdf, other]

    cs.CV cs.AI cs.IR stat.ML

    Modality-Balanced Embedding for Video Retrieval

    Authors: Xun Wang, Bingqing Ke, Xuanping Li, Fangyu Liu, Mingyu Zhang, Xiao Liang, Qiushi Xiao, Cheng Luo, Yue Yu

    Abstract: Video search has become the main routine for users to discover videos relevant to a text query on large short-video sharing platforms. During training a query-video bi-encoder model using online search logs, we identify a modality bias phenomenon that the video encoder almost entirely relies on text matching, neglecting other modalities of the videos such as vision, audio. This modality imbalancer…

    Submitted 17 May, 2022; v1 submitted 18 April, 2022; originally announced April 2022.

    Comments: Accepted by SIGIR-2022, short paper

    Journal ref: SIGIR, 2022

  17. arXiv:2203.10130  [pdf, other]

    stat.ME

    EzGP: Easy-to-Interpret Gaussian Process Models for Computer Experiments with Both Quantitative and Qualitative Factors

    Authors: Qian Xiao, Abhyuday Mandal, C. Devon Lin, Xinwei Deng

    Abstract: Computer experiments with both quantitative and qualitative (QQ) inputs are commonly used in science and engineering applications. Constructing desirable emulators for such computer experiments remains a challenging problem. In this article, we propose an easy-to-interpret Gaussian process (EzGP) model for computer experiments to reflect the change of the computer model under the different level c…

    Submitted 18 March, 2022; originally announced March 2022.

    Journal ref: Journal on Uncertainty Quantification, 9(2), 333-353 (2021)

  18. arXiv:2109.01258  [pdf, other]

    cs.LG eess.SY stat.AP

    Estimating Demand Flexibility Using Siamese LSTM Neural Networks

    Authors: Guangchun Ruan, Daniel S. Kirschen, Haiwang Zhong, Qing Xia, Chongqing Kang

    Abstract: There is an opportunity in modern power systems to explore the demand flexibility by incentivizing consumers with dynamic prices. In this paper, we quantify demand flexibility using an efficient tool called time-varying elasticity, whose value may change depending on the prices and decision dynamics. This tool is particularly useful for evaluating the demand response potential and system reliabili…

    Submitted 2 September, 2021; originally announced September 2021.

    Comments: Author copy of the manuscript submitted to IEEE Trans on Power Systems

    Journal ref: IEEE Transactions on Power Systems, 2022

  19. arXiv:2103.12515  [pdf, other]

    physics.soc-ph stat.AP

    Quantitative Assessment of U.S. Bulk Power Systems and Market Operations during COVID-19

    Authors: Guangchun Ruan, Jiahan Wu, Haiwang Zhong, Qing Xia, Le Xie

    Abstract: Starting in early 2020, the novel coronavirus disease (COVID-19) severely affected the U.S., causing substantial changes in the operations of bulk power systems and electricity markets. In this paper, we develop a data-driven analysis to substantiate the pandemic's impacts from the perspectives of power system security, electric power generation, electric power demand and electricity prices. Our r…

    Submitted 30 August, 2020; originally announced March 2021.

    Comments: Journal paper, 19 pages, also available at EnerarXiv

  20. arXiv:2102.04671  [pdf, other]

    math.OC cs.LG stat.ML

    A Single-Timescale Method for Stochastic Bilevel Optimization

    Authors: Tianyi Chen, Yuejiao Sun, Quan Xiao, Wotao Yin

    Abstract: Stochastic bilevel optimization generalizes the classic stochastic optimization from the minimization of a single objective to the minimization of an objective function that depends on the solution of another optimization problem. Recently, stochastic bilevel optimization is regaining popularity in emerging machine learning applications such as hyper-parameter optimization and model-agnostic meta lea…

    Submitted 30 March, 2022; v1 submitted 9 February, 2021; originally announced February 2021.

    Comments: Minor edits in Table 1

  21. arXiv:2010.09154  [pdf, ps, other]

    stat.ME stat.CO

    Musings about Constructions of Efficient Latin Hypercube Designs with Flexible Run-sizes

    Authors: Hongzhi Wang, Qian Xiao, Abhyuday Mandal

    Abstract: Efficient Latin hypercube designs (LHDs), including maximin distance LHDs, maximum projection LHDs and orthogonal LHDs, are widely used in computer experiments. It is challenging to construct such designs with flexible sizes, especially for large ones. In the current literature, various algebraic methods and search algorithms have been proposed for identifying efficient LHDs, each having its own p…

    Submitted 8 January, 2021; v1 submitted 18 October, 2020; originally announced October 2020.

  22. arXiv:2006.07622  [pdf, other]

    cs.LG stat.ML

    Distant Transfer Learning via Deep Random Walk

    Authors: Qiao Xiao, Yu Zhang

    Abstract: Transfer learning, which is to improve the learning performance in the target domain by leveraging useful knowledge from the source domain, often requires that those two domains are very close, which limits its application scope. Recently, distant transfer learning has been studied to transfer knowledge between two distant or even totally unrelated domains via auxiliary domains that are usually un…

    Submitted 13 June, 2020; originally announced June 2020.

  23. arXiv:2006.04697  [pdf, other]

    cs.LG cs.AI stat.ML

    Supervised Whole DAG Causal Discovery

    Authors: Hebi Li, Qi Xiao, Jin Tian

    Abstract: We propose to address the task of causal structure learning from data in a supervised manner. Existing work on learning causal directions by supervised learning is restricted to learning pairwise relation, and not well suited for whole DAG discovery. We propose a novel approach of modeling the whole DAG structure discovery as a supervised learning. To fit the problem in hand, we propose to use per…

    Submitted 8 June, 2020; originally announced June 2020.

  24. arXiv:2004.12314  [pdf]

    cs.CV cs.LG eess.IV stat.ML

    A Global Benchmark of Algorithms for Segmenting Late Gadolinium-Enhanced Cardiac Magnetic Resonance Imaging

    Authors: Zhaohan Xiong, Qing Xia, Zhiqiang Hu, Ning Huang, Cheng Bian, Yefeng Zheng, Sulaiman Vesal, Nishant Ravikumar, Andreas Maier, Xin Yang, Pheng-Ann Heng, Dong Ni, Caizi Li, Qianqian Tong, Weixin Si, Elodie Puybareau, Younes Khoudli, Thierry Geraud, Chen Chen, Wenjia Bai, Daniel Rueckert, Lingchao Xu, Xiahai Zhuang, Xinzhe Luo, Shuman Jia, et al. (19 additional authors not shown)

    Abstract: Segmentation of cardiac images, particularly late gadolinium-enhanced magnetic resonance imaging (LGE-MRI) widely used for visualizing diseased cardiac structures, is a crucial first step for clinical diagnosis and treatment. However, direct segmentation of LGE-MRIs is challenging due to its attenuated contrast. Since most clinical studies have relied on manual and labor-intensive approaches, auto…

    Submitted 7 May, 2020; v1 submitted 26 April, 2020; originally announced April 2020.

  25. arXiv:2001.06780  [pdf, other]

    cs.CV stat.ML

    Image denoising via K-SVD with primal-dual active set algorithm

    Authors: Quan Xiao, Canhong Wen, Zirui Yan

    Abstract: K-SVD algorithm has been successfully applied to image denoising tasks dozens of years but the big bottleneck in speed and accuracy still needs attention to break. For the sparse coding stage in K-SVD, which involves $\ell_{0}$ constraint, prevailing methods usually seek approximate solutions greedily but are less effective once the noise level is high. The alternative $\ell_{1}$ optimization is p…

    Submitted 19 January, 2020; originally announced January 2020.

    Comments: 9 pages, 6 figures. The paper was accepted by IEEE WACV 2020 and will be placed in IEEE Xplore

  26. arXiv:1905.10729  [pdf, ps, other]

    cs.LG cs.CR cs.CV stat.ML

    Purifying Adversarial Perturbation with Adversarially Trained Auto-encoders

    Authors: Hebi Li, Qi Xiao, Shixin Tian, Jin Tian

    Abstract: Machine learning models are vulnerable to adversarial examples. Iterative adversarial training has shown promising results against strong white-box attacks. However, adversarial training is very expensive, and every time a model needs to be protected, such expensive training scheme needs to be performed. In this paper, we propose to apply iterative adversarial training scheme to an external auto-e…

    Submitted 26 May, 2019; originally announced May 2019.

  27. Application of Kriging Models for a Drug Combination Experiment on Lung Cancer

    Authors: Qian Xiao, Lin Wang, Hongquan Xu

    Abstract: Combinatorial drugs have been widely applied in disease treatment, especially chemotherapy for cancer, due to its improved efficacy and reduced toxicity compared with individual drugs. The study of combinatorial drugs requires efficient experimental designs and proper follow-up statistical modelling techniques. Linear and non-linear models are often used in the response surface modelling for such…

    Submitted 28 January, 2018; originally announced January 2018.

    Comments: Submitted to "Statistics in Medicine"

    Journal ref: Statistics in Medicine, Volume 38, Issue 2, 2019, Pages 236-246

  28. arXiv:1608.00738  [pdf, ps, other]

    stat.ME

    Calculating correlation coefficient for Gaussian copula

    Authors: Qing Xiao

    Abstract: When Gaussian copula with linear correlation coefficient is used to model correlated random variables, one crucial issue is to determine a suitable correlation coefficient $ρ_z$ in normal space for two variables with correlation coefficient $ρ_x$. This paper attempts to address this problem. For two continuous variables, the marginal transformation is approximated by a weighted sum of Hermite poly…

    Submitted 2 August, 2016; originally announced August 2016.

  29. arXiv:1508.06433  [pdf, ps, other]

    stat.ME

    Generating correlated random vector by polynomial normal transformation

    Authors: Qing Xiao

    Abstract: This paper develops a polynomial normal transformation model, whereby various non-normal probability distributions can be simulated by the standard normal distribution. Two methods are presented to determine the coefficients of polynomial model: (1) probability weighted moment (PWM) matching (2) percentile matching. Compared to the existing raw moment or L-moment matching, the proposed methods are…

    Submitted 26 August, 2015; originally announced August 2015.

  30. arXiv:1508.06125  [pdf, ps, other]

    stat.ME

    A method for calculating quantile function and its further use for data fitting

    Authors: Qing Xiao

    Abstract: This paper introduces a polynomial transformation model based on Weibull distribution, whereby the analytical representation of the quantile function for many probability distributions can be obtained. Firstly, the target random variable $x$ with specified distribution is expressed as a polynomial of a Weibull random variable $z$, the coefficients are conveniently determined by the percentile matc…

    Submitted 25 August, 2015; originally announced August 2015.

  31. arXiv:1205.6031  [pdf, ps, other]

    stat.ML cs.LG q-bio.GN

    Towards a Mathematical Foundation of Immunology and Amino Acid Chains

    Authors: Wen-Jun Shen, Hau-San Wong, Quan-Wu Xiao, Xin Guo, Stephen Smale

    Abstract: We attempt to set a mathematical foundation of immunology and amino acid chains. To measure the similarities of these chains, a kernel on strings is defined using only the sequence of the chains and a good amino acid substitution matrix (e.g. BLOSUM62). The kernel is used in learning machines to predict binding affinities of peptides to human leukocyte antigens DR (HLA-DR) molecules. On both fixed…

    Submitted 25 June, 2012; v1 submitted 28 May, 2012; originally announced May 2012.

    Comments: updated on June 25, 2012