Showing 1–34 of 34 results for author: Wen, Y

  1. arXiv:2505.11281  [pdf, ps, other]

    stat.ML cs.LG

    Adaptive Linear Embedding for Nonstationary High-Dimensional Optimization

    Authors: Yuejiang Wen, Paul D. Franzon

    Abstract: Bayesian Optimization (BO) in high-dimensional spaces remains fundamentally limited by the curse of dimensionality and the rigidity of global low-dimensional assumptions. While Random EMbedding Bayesian Optimization (REMBO) mitigates this via linear projections into low-dimensional subspaces, it typically assumes a single global embedding and a stationary objective. In this work, we introduce Self…

    Submitted 16 May, 2025; originally announced May 2025.

    Comments: working, to be submitted

  2. arXiv:2504.00024  [pdf, other]

    stat.ME cs.AI cs.LG

    A multi-locus predictiveness curve and its summary assessment for genetic risk prediction

    Authors: Changshuai Wei, Ming Li, Yalu Wen, Chengyin Ye, Qing Lu

    Abstract: With the advance of high-throughput genotyping and sequencing technologies, it becomes feasible to comprehensively evaluate the role of massive genetic predictors in disease prediction. There exists, therefore, a critical need for developing appropriate statistical measurements to assess the combined effects of these genetic variants in disease prediction. Predictiveness curve is commonly used as a…

    Submitted 28 March, 2025; originally announced April 2025.

  3. arXiv:2502.18826  [pdf, ps, other]

    cs.LG cs.IT stat.ML

    Adversarial Combinatorial Semi-bandits with Graph Feedback

    Authors: Yuxiao Wen

    Abstract: In combinatorial semi-bandits, a learner repeatedly selects from a combinatorial decision set of arms, receives the realized sum of rewards, and observes the rewards of the individual selected arms as feedback. In this paper, we extend this framework to include \emph{graph feedback}, where the learner observes the rewards of all neighboring arms of the selected arms in a feedback graph $G$. We est…

    Submitted 5 June, 2025; v1 submitted 26 February, 2025; originally announced February 2025.

    Comments: To appear in ICML 2025

  4. arXiv:2502.17292  [pdf, other]

    cs.LG cs.GT cs.IT stat.ME stat.ML

    Joint Value Estimation and Bidding in Repeated First-Price Auctions

    Authors: Yuxiao Wen, Yanjun Han, Zhengyuan Zhou

    Abstract: We study regret minimization in repeated first-price auctions (FPAs), where a bidder observes only the realized outcome after each auction -- win or loss. This setup reflects practical scenarios in online display advertising where the actual value of an impression depends on the difference between two potential outcomes, such as clicks or conversion rates, when the auction is won versus lost. We a…

    Submitted 24 February, 2025; originally announced February 2025.

  5. arXiv:2412.14497  [pdf, other]

    cs.LG cs.AI stat.ML

    Disentangled Graph Autoencoder for Treatment Effect Estimation

    Authors: Di Fan, Renlei Jiang, Yunhao Wen, Chuanhou Gao

    Abstract: Treatment effect estimation from observational data has attracted significant attention across various research fields. However, many widely used methods rely on the unconfoundedness assumption, which is often unrealistic due to the inability to observe all confounders, thereby overlooking the influence of latent confounders. To address this limitation, recent approaches have utilized auxiliary ne…

    Submitted 20 February, 2025; v1 submitted 18 December, 2024; originally announced December 2024.

    Comments: 22 pages, 6 figures

  6. arXiv:2411.01006  [pdf, other]

    cs.LG stat.ML

    Abstracted Shapes as Tokens -- A Generalizable and Interpretable Model for Time-series Classification

    Authors: Yunshi Wen, Tengfei Ma, Tsui-Wei Weng, Lam M. Nguyen, Anak Agung Julius

    Abstract: In time-series analysis, many recent works seek to provide a unified view and representation for time-series across multiple domains, leading to the development of foundation models for time-series data. Despite diverse modeling techniques, existing models are black boxes and fail to provide insights and explanations about their representations. In this paper, we present VQShape, a pre-trained, ge…

    Submitted 7 January, 2025; v1 submitted 1 November, 2024; originally announced November 2024.

    Comments: Published in Neural Information Processing Systems (NeurIPS) 2024

  7. arXiv:2407.05664  [pdf, other]

    stat.ML cs.AI cs.LG

    How DNNs break the Curse of Dimensionality: Compositionality and Symmetry Learning

    Authors: Arthur Jacot, Seok Hoan Choi, Yuxiao Wen

    Abstract: We show that deep neural networks (DNNs) can efficiently learn any composition of functions with bounded $F_{1}$-norm, which allows DNNs to break the curse of dimensionality in ways that shallow networks cannot. More specifically, we derive a generalization bound that combines a covering number argument for compositionality, and the $F_{1}$-norm (or the related Barron norm) for large width adaptiv…

    Submitted 6 March, 2025; v1 submitted 8 July, 2024; originally announced July 2024.

  8. arXiv:2405.18756  [pdf, other]

    cs.LG cs.AI cs.CV stat.AP stat.ML

    Provable Contrastive Continual Learning

    Authors: Yichen Wen, Zhiquan Tan, Kaipeng Zheng, Chuanlong Xie, Weiran Huang

    Abstract: Continual learning requires learning incremental tasks with dynamic data distributions. So far, it has been observed that employing a combination of contrastive loss and distillation loss for training in continual learning yields strong performance. To the best of our knowledge, however, this contrastive continual learning framework lacks convincing theoretical explanations. In this work, we fill…

    Submitted 29 May, 2024; originally announced May 2024.

    Comments: Accepted by ICML 2024

  9. arXiv:2402.08010  [pdf, other]

    cs.LG cs.AI stat.ML

    Which Frequencies do CNNs Need? Emergent Bottleneck Structure in Feature Learning

    Authors: Yuxiao Wen, Arthur Jacot

    Abstract: We describe the emergence of a Convolution Bottleneck (CBN) structure in CNNs, where the network uses its first few layers to transform the input representation into a representation that is supported only along a few frequencies and channels, before using the last few layers to map back to the outputs. We define the CBN rank, which describes the number and type of frequencies that are kept inside…

    Submitted 6 March, 2025; v1 submitted 12 February, 2024; originally announced February 2024.

  10. arXiv:2205.00403  [pdf, other]

    cs.LG stat.ML

    A Simple Approach to Improve Single-Model Deep Uncertainty via Distance-Awareness

    Authors: Jeremiah Zhe Liu, Shreyas Padhy, Jie Ren, Zi Lin, Yeming Wen, Ghassen Jerfel, Zack Nado, Jasper Snoek, Dustin Tran, Balaji Lakshminarayanan

    Abstract: Accurate uncertainty quantification is a major challenge in deep learning, as neural networks can make overconfident errors and assign high confidence predictions to out-of-distribution (OOD) inputs. The most popular approaches to estimate predictive uncertainty in deep learning are methods that combine predictions from multiple neural networks, such as Bayesian neural networks (BNNs) and deep ens…

    Submitted 30 December, 2022; v1 submitted 1 May, 2022; originally announced May 2022.

    Comments: arXiv admin note: text overlap with arXiv:2006.10108

  11. arXiv:2011.07478  [pdf, other]

    cs.LG stat.ML

    Towards Understanding the Regularization of Adversarial Robustness on Neural Networks

    Authors: Yuxin Wen, Shuai Li, Kui Jia

    Abstract: The problem of adversarial examples has shown that modern Neural Network (NN) models could be rather fragile. Among the more established techniques to solve the problem, one is to require the model to be {\it $ε$-adversarially robust} (AR); that is, to require the model not to change predicted labels when any given input examples are perturbed within a certain range. However, it is observed that s…

    Submitted 15 November, 2020; originally announced November 2020.

    Comments: Published as a conference paper at ICML 2020

  12. arXiv:2010.09875  [pdf, other]

    cs.LG stat.ML

    Combining Ensembles and Data Augmentation can Harm your Calibration

    Authors: Yeming Wen, Ghassen Jerfel, Rafael Muller, Michael W. Dusenberry, Jasper Snoek, Balaji Lakshminarayanan, Dustin Tran

    Abstract: Ensemble methods which average over multiple neural network predictions are a simple approach to improve a model's calibration and robustness. Similarly, data augmentation techniques, which encode prior information in the form of invariant feature transformations, are effective for improving calibration and robustness. In this paper, we show a surprising pathology: combining ensembles and data aug…

    Submitted 22 March, 2021; v1 submitted 19 October, 2020; originally announced October 2020.

  13. arXiv:2006.11967  [pdf, other]

    cs.LG stat.ML

    Exploiting Weight Redundancy in CNNs: Beyond Pruning and Quantization

    Authors: Yuan Wen, David Gregg

    Abstract: Pruning and quantization are proven methods for improving the performance and storage efficiency of convolutional neural networks (CNNs). Pruning removes near-zero weights in tensors and masks weak connections between neurons in neighbouring layers. Quantization reduces the precision of weights by replacing them with numerically similar values that require less storage. In this paper, we identify…

    Submitted 21 June, 2020; originally announced June 2020.

  14. arXiv:2005.10709  [pdf, other]

    cs.LG stat.ML

    TASO: Time and Space Optimization for Memory-Constrained DNN Inference

    Authors: Yuan Wen, Andrew Anderson, Valentin Radu, Michael F. P. O'Boyle, David Gregg

    Abstract: Convolutional neural networks (CNNs) are used in many embedded applications, from industrial robotics and automation systems to biometric identification on mobile devices. State-of-the-art classification is typically achieved by large networks, which are prohibitively expensive to run on mobile and embedded devices with tightly constrained memory and energy budgets. We propose an approach for ahea…

    Submitted 21 May, 2020; originally announced May 2020.

  15. arXiv:2005.07186  [pdf, other]

    cs.LG stat.ML

    Efficient and Scalable Bayesian Neural Nets with Rank-1 Factors

    Authors: Michael W. Dusenberry, Ghassen Jerfel, Yeming Wen, Yi-An Ma, Jasper Snoek, Katherine Heller, Balaji Lakshminarayanan, Dustin Tran

    Abstract: Bayesian neural networks (BNNs) demonstrate promising success in improving the robustness and uncertainty quantification of modern deep learning. However, they generally struggle with underfitting at scale and parameter efficiency. On the other hand, deep ensembles have emerged as alternatives for uncertainty quantification that, while outperforming BNNs on certain problems, also suffer from effic…

    Submitted 14 August, 2020; v1 submitted 14 May, 2020; originally announced May 2020.

    Comments: Published in the International Conference on Machine Learning (ICML) 2020. Code available at https://github.com/google/edward2

  16. arXiv:2004.10931  [pdf]

    stat.ML cs.LG stat.AP

    Active Learning for Gaussian Process Considering Uncertainties with Application to Shape Control of Composite Fuselage

    Authors: Xiaowei Yue, Yuchen Wen, Jeffrey H. Hunt, Jianjun Shi

    Abstract: In the machine learning domain, active learning is an iterative data selection algorithm for maximizing information acquisition and improving model performance with limited training samples. It is very useful, especially for the industrial applications where training samples are expensive, time-consuming, or difficult to obtain. Existing methods mainly focus on active learning for classification,…

    Submitted 22 April, 2020; originally announced April 2020.

  17. arXiv:2002.08697  [pdf, other]

    cs.LG stat.ML

    Performance Aware Convolutional Neural Network Channel Pruning for Embedded GPUs

    Authors: Valentin Radu, Kuba Kaszyk, Yuan Wen, Jack Turner, Jose Cano, Elliot J. Crowley, Bjorn Franke, Amos Storkey, Michael O'Boyle

    Abstract: Convolutional Neural Networks (CNN) are becoming a common presence in many applications and services, due to their superior recognition accuracy. They are increasingly being used on mobile devices, many times just by porting large models designed for server space, although several model compression techniques have been considered. One model compression technique intended to reduce computations is…

    Submitted 20 February, 2020; originally announced February 2020.

    Comments: A copy of this was published in IISWC'19

  18. arXiv:2002.06715  [pdf, other]

    cs.LG stat.ML

    BatchEnsemble: An Alternative Approach to Efficient Ensemble and Lifelong Learning

    Authors: Yeming Wen, Dustin Tran, Jimmy Ba

    Abstract: Ensembles, where multiple neural networks are trained individually and their predictions are averaged, have been shown to be widely successful for improving both the accuracy and predictive uncertainty of single neural networks. However, an ensemble's cost for both training and testing increases linearly with the number of networks, which quickly becomes untenable. In this paper, we propose Batc…

    Submitted 19 February, 2020; v1 submitted 16 February, 2020; originally announced February 2020.

    Journal ref: Eighth International Conference on Learning Representations (ICLR 2020)

  19. arXiv:2001.06937  [pdf, other]

    cs.LG stat.ML

    A Review on Generative Adversarial Networks: Algorithms, Theory, and Applications

    Authors: Jie Gui, Zhenan Sun, Yonggang Wen, Dacheng Tao, Jieping Ye

    Abstract: Generative adversarial networks (GANs) have recently become a hot research topic. GANs have been widely studied since 2014, and a large number of algorithms have been proposed. However, there are few comprehensive studies explaining the connections among different GAN variants, and how they have evolved. In this paper, we attempt to provide a review on various GAN methods from the perspectives of algorithm…

    Submitted 19 January, 2020; originally announced January 2020.

  20. arXiv:1907.02057  [pdf, other]

    cs.LG cs.AI cs.RO stat.ML

    Benchmarking Model-Based Reinforcement Learning

    Authors: Tingwu Wang, Xuchan Bao, Ignasi Clavera, Jerrick Hoang, Yeming Wen, Eric Langlois, Shunshi Zhang, Guodong Zhang, Pieter Abbeel, Jimmy Ba

    Abstract: Model-based reinforcement learning (MBRL) is widely seen as having the potential to be significantly more sample efficient than model-free RL. However, research in model-based RL has not been very standardized. It is fairly common for authors to experiment with self-designed environments, and there are several separate lines of research, which are sometimes closed-sourced or not reproducible. Acco…

    Submitted 3 July, 2019; originally announced July 2019.

    Comments: 8 main pages, 8 figures; 14 appendix pages, 25 figures

  21. arXiv:1905.05929  [pdf, other]

    cs.LG cs.CV stat.ML

    Orthogonal Deep Neural Networks

    Authors: Kui Jia, Shuai Li, Yuxin Wen, Tongliang Liu, Dacheng Tao

    Abstract: In this paper, we introduce the algorithms of Orthogonal Deep Neural Networks (OrthDNNs) to connect with recent interest of spectrally regularized deep learning methods. OrthDNNs are theoretically motivated by generalization analysis of modern DNNs, with the aim to find solution properties of network weights that guarantee better generalization. To this end, we first prove that DNNs are of local i…

    Submitted 15 October, 2019; v1 submitted 14 May, 2019; originally announced May 2019.

    Comments: To Appear in IEEE Transactions on Pattern Analysis and Machine Intelligence

    Journal ref: IEEE Transactions on Pattern Analysis and Machine Intelligence, 2019

  22. arXiv:1904.10762  [pdf, other]

    cs.LG cs.AI stat.ML

    Baconian: A Unified Open-source Framework for Model-Based Reinforcement Learning

    Authors: Linsen Dong, Guanyu Gao, Xinyi Zhang, Liangyu Chen, Yonggang Wen

    Abstract: Model-Based Reinforcement Learning (MBRL) is one category of Reinforcement Learning (RL) algorithms which can improve sampling efficiency by modeling and approximating system dynamics. It has been widely adopted in the research of robotics, autonomous driving, etc. Despite its popularity, there is still a lack of sophisticated and reusable open-source frameworks to facilitate MBRL research and exper…

    Submitted 15 March, 2021; v1 submitted 23 April, 2019; originally announced April 2019.

  23. arXiv:1904.04088  [pdf, other]

    stat.ML cs.CV cs.LG

    Large Margin Multi-modal Multi-task Feature Extraction for Image Classification

    Authors: Yong Luo, Yonggang Wen, Dacheng Tao, Jie Gui, Chao Xu

    Abstract: The features used in many image analysis-based applications are frequently of very high dimension. Feature extraction offers several advantages in high-dimensional cases, and many recent studies have used multi-task feature extraction approaches, which often outperform single-task feature extraction approaches. However, most of these methods are limited in that they only consider data represented…

    Submitted 8 April, 2019; originally announced April 2019.

    Journal ref: IEEE Transactions on Image Processing (Volume: 25, Issue: 1, Jan. 2016)

  24. Heterogeneous Multi-task Metric Learning across Multiple Domains

    Authors: Yong Luo, Yonggang Wen, Dacheng Tao

    Abstract: Distance metric learning (DML) plays a crucial role in diverse machine learning algorithms and applications. When the labeled information in the target domain is limited, transfer metric learning (TML) helps to learn the metric by leveraging the sufficient information from other related domains. Multi-task metric learning (MTML), which can be regarded as a special case of TML, performs transfer across…

    Submitted 8 April, 2019; originally announced April 2019.

    Journal ref: IEEE Transactions on Neural Networks and Learning Systems (Volume: 29, Issue: 9, Sept. 2018)

  25. arXiv:1904.04061  [pdf, other]

    stat.ML cs.CV cs.LG

    Transferring Knowledge Fragments for Learning Distance Metric from A Heterogeneous Domain

    Authors: Yong Luo, Yonggang Wen, Tongliang Liu, Dacheng Tao

    Abstract: The goal of transfer learning is to improve the performance of the target learning task by leveraging information (or transferring knowledge) from other related tasks. In this paper, we examine the problem of transfer distance metric learning (DML), which usually aims to mitigate the label information deficiency issue in the target DML. Most of the current Transfer DML (TDML) methods are not applicabl…

    Submitted 8 April, 2019; originally announced April 2019.

    Journal ref: IEEE Transactions on Pattern Analysis and Machine Intelligence (Volume: 41, Issue: 4, April 1 2019)

  26. arXiv:1904.03921  [pdf, other]

    stat.ML cs.CV cs.LG

    Multi-view Vector-valued Manifold Regularization for Multi-label Image Classification

    Authors: Yong Luo, Dacheng Tao, Chang Xu, Chao Xu, Hong Liu, Yonggang Wen

    Abstract: In computer vision, image datasets used for classification are naturally associated with multiple labels and comprised of multiple views, because each image may contain several objects (e.g. pedestrian, bicycle and tree) and is properly characterized by multiple visual features (e.g. color, texture and shape). Currently available tools ignore either the label relationship or the view complementary…

    Submitted 8 April, 2019; originally announced April 2019.

    Journal ref: IEEE Transactions on Neural Networks and Learning Systems (Volume: 24, Issue: 5, May 2013)

  27. arXiv:1903.01385  [pdf, other]

    cs.LG stat.ML

    Joint Perception and Control as Inference with an Object-based Implementation

    Authors: Minne Li, Zheng Tian, Pranav Nashikkar, Ian Davies, Ying Wen, Jun Wang

    Abstract: Existing model-based reinforcement learning methods often study perception modeling and decision making separately. We introduce joint Perception and Control as Inference (PCI), a general framework to combine perception and control for partially observable environments through Bayesian inference. Based on the fact that object-level inductive biases are critical in human perceptual learning and rea…

    Submitted 13 October, 2020; v1 submitted 4 March, 2019; originally announced March 2019.

  28. arXiv:1902.08234  [pdf, other]

    cs.LG stat.ML

    An Empirical Study of Large-Batch Stochastic Gradient Descent with Structured Covariance Noise

    Authors: Yeming Wen, Kevin Luk, Maxime Gazeau, Guodong Zhang, Harris Chan, Jimmy Ba

    Abstract: The choice of batch-size in a stochastic optimization algorithm plays a substantial role for both optimization and generalization. Increasing the batch-size used typically improves optimization but degrades generalization. To address the problem of improving generalization while maintaining optimal convergence in large-batch training, we propose to add covariance noise to the gradients. We demonst…

    Submitted 28 February, 2020; v1 submitted 21 February, 2019; originally announced February 2019.

    Journal ref: The 23rd International Conference on Artificial Intelligence and Statistics, 2020

  29. arXiv:1901.09207  [pdf, other]

    cs.LG cs.AI stat.ML

    Probabilistic Recursive Reasoning for Multi-Agent Reinforcement Learning

    Authors: Ying Wen, Yaodong Yang, Rui Luo, Jun Wang, Wei Pan

    Abstract: Humans are capable of attributing latent mental contents such as beliefs or intentions to others. The social skill is critical in daily life for reasoning about the potential consequences of others' behaviors so as to plan ahead. It is known that humans use such reasoning ability recursively by considering what others believe about their own beliefs. In this paper, we start from level-$1$ recursio…

    Submitted 1 March, 2019; v1 submitted 26 January, 2019; originally announced January 2019.

    Comments: ICLR 2019

  30. arXiv:1810.03944  [pdf, other]

    stat.ML cs.LG

    Transfer Metric Learning: Algorithms, Applications and Outlooks

    Authors: Yong Luo, Yonggang Wen, Ling-Yu Duan, Dacheng Tao

    Abstract: Distance metric learning (DML) aims to find an appropriate way to reveal the underlying data relationship. It is critical in many machine learning, pattern recognition and data mining algorithms, and usually requires a large amount of label information (such as class labels or pair/triplet constraints) to achieve satisfactory performance. However, the label information may be insufficient in real-wor…

    Submitted 12 November, 2018; v1 submitted 9 October, 2018; originally announced October 2018.

    Comments: 14 pages, 5 figures

  31. arXiv:1805.09496  [pdf, other]

    cs.LG cs.AI stat.ML

    Intelligent Trainer for Model-Based Reinforcement Learning

    Authors: Yuanlong Li, Linsen Dong, Xin Zhou, Yonggang Wen, Kyle Guan

    Abstract: Model-based reinforcement learning (MBRL) has been proposed as a promising alternative solution to tackle the high sampling cost challenge in the canonical reinforcement learning (RL), by leveraging a learned model to generate synthesized data for policy training purpose. The MBRL framework, nevertheless, is inherently limited by the convoluted process of jointly learning control policy and config…

    Submitted 5 June, 2019; v1 submitted 23 May, 2018; originally announced May 2018.

    Comments: 13 pages

  32. arXiv:1803.04386  [pdf, other]

    cs.LG stat.ML

    Flipout: Efficient Pseudo-Independent Weight Perturbations on Mini-Batches

    Authors: Yeming Wen, Paul Vicol, Jimmy Ba, Dustin Tran, Roger Grosse

    Abstract: Stochastic neural net weights are used in a variety of contexts, including regularization, Bayesian neural nets, exploration in reinforcement learning, and evolution strategies. Unfortunately, due to the large number of weights, all the examples in a mini-batch typically share the same weight perturbation, thereby limiting the variance reduction effect of large mini-batches. We introduce flipout,…

    Submitted 2 April, 2018; v1 submitted 12 March, 2018; originally announced March 2018.

    Comments: Published as a conference paper at ICLR 2018

  33. arXiv:1612.02295  [pdf, other]

    stat.ML cs.LG

    Large-Margin Softmax Loss for Convolutional Neural Networks

    Authors: Weiyang Liu, Yandong Wen, Zhiding Yu, Meng Yang

    Abstract: Cross-entropy loss together with softmax is arguably one of the most commonly used supervision components in convolutional neural networks (CNNs). Despite its simplicity, popularity and excellent performance, the component does not explicitly encourage discriminative learning of features. In this paper, we propose a generalized large-margin softmax (L-Softmax) loss which explicitly encourages intra-…

    Submitted 17 November, 2017; v1 submitted 7 December, 2016; originally announced December 2016.

    Comments: Published in ICML 2016 (with typo fixed)

  34. arXiv:1502.02330  [pdf, other]

    stat.ML cs.CV cs.LG

    Tensor Canonical Correlation Analysis for Multi-view Dimension Reduction

    Authors: Yong Luo, Dacheng Tao, Yonggang Wen, Kotagiri Ramamohanarao, Chao Xu

    Abstract: Canonical correlation analysis (CCA) has proven an effective tool for two-view dimension reduction due to its profound theoretical foundation and success in practical applications. In respect of multi-view learning, however, it is limited by its capability of only handling data represented by two-view features, while in many real-world applications, the number of views is frequently many more. Alt…

    Submitted 8 February, 2015; originally announced February 2015.

    Comments: 20 pages, 10 figures