Showing 1–25 of 25 results for author: Cheng, M

  1. arXiv:2410.02667  [pdf, other]

    cs.LG hep-th stat.ML

    GUD: Generation with Unified Diffusion

    Authors: Mathis Gerdes, Max Welling, Miranda C. N. Cheng

    Abstract: Diffusion generative models transform noise into data by inverting a process that progressively adds noise to data samples. Inspired by concepts from the renormalization group in physics, which analyzes systems across different scales, we revisit diffusion models by exploring three key design aspects: 1) the choice of representation in which the diffusion process operates (e.g. pixel-, PCA-, Fouri…

    Submitted 3 October, 2024; originally announced October 2024.

    Comments: 11 pages, 8 figures

  2. arXiv:2407.00256  [pdf, other]

    cs.AI cs.CL cs.LG stat.ML

    One Prompt is not Enough: Automated Construction of a Mixture-of-Expert Prompts

    Authors: Ruochen Wang, Sohyun An, Minhao Cheng, Tianyi Zhou, Sung Ju Hwang, Cho-Jui Hsieh

    Abstract: Large Language Models (LLMs) exhibit strong generalization capabilities to novel tasks when prompted with language instructions and in-context demos. Since this ability sensitively depends on the quality of prompts, various methods have been explored to automate the instruction design. While these methods demonstrated promising results, they also restricted the searched prompt to one instruction.…

    Submitted 28 June, 2024; originally announced July 2024.

    Comments: ICML 2024. code available at https://github.com/ruocwang/mixture-of-prompts

    MSC Class: 68T01

    Journal ref: Proceedings of the 41st International Conference on Machine Learning (ICML), Vienna, Austria, 2024

  3. arXiv:2212.06339  [pdf, other]

    cs.LG cs.CV stat.ML

    Regularized Optimal Transport Layers for Generalized Global Pooling Operations

    Authors: Hongteng Xu, Minjie Cheng

    Abstract: Global pooling is one of the most significant operations in many machine learning models and tasks, which works for information fusion and structured data (like sets and graphs) representation. However, without solid mathematical fundamentals, its practical implementations often depend on empirical mechanisms and thus lead to sub-optimal, even unsatisfactory performance. In this work, we develop a…

    Submitted 12 December, 2022; originally announced December 2022.

  4. arXiv:2209.13575  [pdf, other]

    cs.LG cs.AI cs.CV stat.ML

    Efficient Non-Parametric Optimizer Search for Diverse Tasks

    Authors: Ruochen Wang, Yuanhao Xiong, Minhao Cheng, Cho-Jui Hsieh

    Abstract: Efficient and automated design of optimizers plays a crucial role in full-stack AutoML systems. However, prior methods in optimizer search are often limited by their scalability, generability, or sample efficiency. With the goal of democratizing research and application of optimizer search, we present the first efficient, scalable and generalizable framework that can directly search on the tasks o…

    Submitted 27 September, 2022; originally announced September 2022.

    Comments: Accepted at NeurIPS 2022. Code will be released prior to the conference. This is only a preprint, not the final camera ready version

  5. arXiv:2208.13217  [pdf, other]

    cs.LG cs.CV stat.ML

    Leachable Component Clustering

    Authors: Miao Cheng, Xinge You

    Abstract: Clustering attempts to partition data instances into several distinctive groups, while the similarities among data belonging to the common partition can be principally reserved. Furthermore, incomplete data frequently occurs in many real-world applications, and brings perverse influence on pattern analysis. As a consequence, the specific solutions to data imputation and handling are developed to co…

    Submitted 28 August, 2022; originally announced August 2022.

    Comments: 7 pages, 24 figures

  6. arXiv:2206.01022  [pdf, other]

    cs.LG stat.ML

    Learning Disentangled Representations for Counterfactual Regression via Mutual Information Minimization

    Authors: Mingyuan Cheng, Xinru Liao, Quan Liu, Bin Ma, Jian Xu, Bo Zheng

    Abstract: Learning individual-level treatment effect is a fundamental problem in causal inference and has received increasing attention in many areas, especially in the user growth area which concerns many internet companies. Recently, disentangled representation learning methods that decompose covariates into three latent factors, including instrumental, confounding and adjustment factors, have witnessed g…

    Submitted 2 June, 2022; originally announced June 2022.

  7. arXiv:2103.11785  [pdf, other]

    cs.LG quant-ph stat.ML

    Entangled q-Convolutional Neural Nets

    Authors: Vassilis Anagiannis, Miranda C. N. Cheng

    Abstract: We introduce a machine learning model, the q-CNN model, sharing key features with convolutional neural networks and admitting a tensor network description. As examples, we apply q-CNN to the MNIST and Fashion MNIST classification tasks. We explain how the network associates a quantum state to each classification label, and study the entanglement structure of these network states. In both our exper…

    Submitted 5 March, 2021; originally announced March 2021.

  8. arXiv:2011.14031  [pdf, other]

    cs.LG cs.CR cs.CV stat.ML

    Voting based ensemble improves robustness of defensive models

    Authors: Devvrit, Minhao Cheng, Cho-Jui Hsieh, Inderjit Dhillon

    Abstract: Developing robust models against adversarial perturbations has been an active area of research and many algorithms have been proposed to train individual robust models. Taking these pretrained robust models, we aim to study whether it is possible to create an ensemble to further improve robustness. Several previous attempts tackled this problem by ensembling the soft-label prediction and have been…

    Submitted 27 November, 2020; originally announced November 2020.

  9. arXiv:2007.03966  [pdf, other]

    cs.LG stat.ML

    Semi-Supervised Learning with Meta-Gradient

    Authors: Xin-Yu Zhang, Taihong Xiao, Haolin Jia, Ming-Ming Cheng, Ming-Hsuan Yang

    Abstract: In this work, we propose a simple yet effective meta-learning algorithm in semi-supervised learning. We notice that most existing consistency-based approaches suffer from overfitting and limited model generalization ability, especially when training with only a small number of labeled data. To alleviate this issue, we propose a learn-to-generalize regularization term by utilizing the label informa…

    Submitted 17 March, 2021; v1 submitted 8 July, 2020; originally announced July 2020.

    Comments: 17 pages

  10. arXiv:2006.10355  [pdf, other]

    cs.LG cs.CV stat.ML

    DrNAS: Dirichlet Neural Architecture Search

    Authors: Xiangning Chen, Ruochen Wang, Minhao Cheng, Xiaocheng Tang, Cho-Jui Hsieh

    Abstract: This paper proposes a novel differentiable architecture search method by formulating it into a distribution learning problem. We treat the continuously relaxed architecture mixing weight as random variables, modeled by Dirichlet distribution. With recently developed pathwise derivatives, the Dirichlet parameters can be easily optimized with gradient-based optimizer in an end-to-end manner. This fo…

    Submitted 15 March, 2021; v1 submitted 18 June, 2020; originally announced June 2020.

    Comments: ICLR 2021, code is available at https://github.com/xiangning-chen/DrNAS

  11. arXiv:2002.06789  [pdf, other]

    cs.LG stat.ML

    CAT: Customized Adversarial Training for Improved Robustness

    Authors: Minhao Cheng, Qi Lei, Pin-Yu Chen, Inderjit Dhillon, Cho-Jui Hsieh

    Abstract: Adversarial training has become one of the most effective methods for improving robustness of neural networks. However, it often suffers from poor generalization on both clean and perturbed data. In this paper, we propose a new algorithm, named Customized Adversarial Training (CAT), which adaptively customizes the perturbation level and the corresponding label for each training sample in adversari…

    Submitted 17 February, 2020; originally announced February 2020.

  12. arXiv:1910.14655  [pdf, other]

    stat.ML cs.CV cs.LG

    Enhancing Certifiable Robustness via a Deep Model Ensemble

    Authors: Huan Zhang, Minhao Cheng, Cho-Jui Hsieh

    Abstract: We propose an algorithm to enhance certified robustness of a deep model ensemble by optimally weighting each base model. Unlike previous works on using ensembles to empirically improve robustness, our algorithm is based on optimizing a guaranteed robustness certificate of neural networks. Our proposed ensemble framework with certified robustness, RobBoost, formulates the optimal model selection an…

    Submitted 31 October, 2019; originally announced October 2019.

    Comments: This is an extended version of ICLR 2019 Safe Machine Learning Workshop (SafeML) paper, "RobBoost: A provable approach to boost the robustness of deep model ensemble". May 6, 2019, New Orleans, LA, USA

  13. arXiv:1909.10773  [pdf, other]

    cs.LG stat.ML

    Sign-OPT: A Query-Efficient Hard-label Adversarial Attack

    Authors: Minhao Cheng, Simranjit Singh, Patrick Chen, Pin-Yu Chen, Sijia Liu, Cho-Jui Hsieh

    Abstract: We study the most practical problem setup for evaluating adversarial robustness of a machine learning system with limited access: the hard-label black-box attack setting for generating adversarial examples, where limited model queries are allowed and only the decision is provided to a queried data input. Several algorithms have been proposed for this problem but they typically require huge amount…

    Submitted 13 February, 2020; v1 submitted 24 September, 2019; originally announced September 2019.

    Comments: Published in ICLR 2020

  14. arXiv:1906.02494  [pdf, other]

    stat.ML cs.LG

    Understanding Adversarial Behavior of DNNs by Disentangling Non-Robust and Robust Components in Performance Metric

    Authors: Yujun Shi, Benben Liao, Guangyong Chen, Yun Liu, Ming-Ming Cheng, Jiashi Feng

    Abstract: The vulnerability to slight input perturbations is a worrying yet intriguing property of deep neural networks (DNNs). Despite many previous works studying the reason behind such adversarial behavior, the relationship between the generalization performance and adversarial behavior of DNNs is still little understood. In this work, we reveal such relation by introducing a metric characterizing the ge…

    Submitted 6 June, 2019; originally announced June 2019.

  15. arXiv:1906.02481  [pdf, ps, other]

    cs.LG hep-th stat.ML

    Covariance in Physics and Convolutional Neural Networks

    Authors: Miranda C. N. Cheng, Vassilis Anagiannis, Maurice Weiler, Pim de Haan, Taco S. Cohen, Max Welling

    Abstract: In this proceeding we give an overview of the idea of covariance (or equivariance) featured in the recent development of convolutional neural networks (CNNs). We study the similarities and differences between the use of covariance in theoretical physics and in the CNN context. Additionally, we demonstrate that the simple assumption of covariance, together with the required properties of locality,…

    Submitted 6 June, 2019; originally announced June 2019.

  16. arXiv:1904.04780  [pdf, other]

    stat.ML cs.LG

    Time-Series Analysis via Low-Rank Matrix Factorization Applied to Infant-Sleep Data

    Authors: Sheng Liu, Mark Cheng, Hayley Brooks, Wayne Mackey, David J. Heeger, Esteban G. Tabak, Carlos Fernandez-Granda

    Abstract: We propose a nonparametric model for time series with missing data based on low-rank matrix factorization. The model expresses each instance in a set of time series as a linear combination of a small number of shared basis functions. Constraining the functions and the corresponding coefficients to be nonnegative yields an interpretable low-dimensional representation of the data. A time-smoothing r…

    Submitted 6 November, 2019; v1 submitted 9 April, 2019; originally announced April 2019.

    Comments: Machine Learning for Health (ML4H) at NeurIPS 2019 - Extended Abstract

  17. arXiv:1810.04064  [pdf, ps, other]

    cs.LG stat.AP stat.ML

    A Family of Maximum Margin Criterion for Adaptive Learning

    Authors: Miao Cheng, Zunren Liu, Hongwei Zou, Ah Chung Tsoi

    Abstract: In recent years, pattern analysis plays an important role in data mining and recognition, and many variants have been proposed to handle complicated scenarios. In the literature, it has been quite familiar with high dimensionality of data samples, but either such characteristics or large data have become usual sense in real-world applications. In this work, an improved maximum margin criterion (MM…

    Submitted 7 November, 2018; v1 submitted 9 October, 2018; originally announced October 2018.

    Comments: 14 pages

  18. arXiv:1807.04457  [pdf, other]

    cs.LG cs.AI stat.ML

    Query-Efficient Hard-label Black-box Attack:An Optimization-based Approach

    Authors: Minhao Cheng, Thong Le, Pin-Yu Chen, Jinfeng Yi, Huan Zhang, Cho-Jui Hsieh

    Abstract: We study the problem of attacking a machine learning model in the hard-label black-box setting, where no model information is revealed except that the attacker can make queries to probe the corresponding hard-label decisions. This is a very challenging problem since the direct extension of state-of-the-art white-box attacks (e.g., CW or PGD) to the hard-label black-box setting will require minimiz…

    Submitted 12 July, 2018; originally announced July 2018.

  19. arXiv:1805.11811  [pdf, ps, other]

    stat.ML cs.LG math.OC

    Stochastic Zeroth-order Optimization via Variance Reduction method

    Authors: Liu Liu, Minhao Cheng, Cho-Jui Hsieh, Dacheng Tao

    Abstract: Derivative-free optimization has become an important technique used in machine learning for optimizing black-box models. To conduct updates without explicitly computing gradient, most current approaches iteratively sample a random search direction from Gaussian distribution and compute the estimated gradient along that direction. However, due to the variance in the search direction, the convergenc…

    Submitted 2 August, 2018; v1 submitted 30 May, 2018; originally announced May 2018.

  20. arXiv:1712.00673  [pdf, other]

    cs.LG cs.CR stat.ML

    Towards Robust Neural Networks via Random Self-ensemble

    Authors: Xuanqing Liu, Minhao Cheng, Huan Zhang, Cho-Jui Hsieh

    Abstract: Recent studies have revealed the vulnerability of deep neural networks: A small adversarial perturbation that is imperceptible to human can easily make a well-trained deep neural network misclassify. This makes it unsafe to apply neural networks in security-critical applications. In this paper, we propose a new defense algorithm called Random Self-Ensemble (RSE) by combining two important concepts…

    Submitted 31 July, 2018; v1 submitted 2 December, 2017; originally announced December 2017.

    Comments: ECCV 2018 camera ready

  21. arXiv:1511.01124  [pdf, ps, other]

    stat.ME

    Greedy Forward Regression for Variable Screening

    Authors: Ming-Yen Cheng, Sanying Feng, Gaorong Li, Heng Lian

    Abstract: Two popular variable screening methods under the ultra-high dimensional setting with the desirable sure screening property are the sure independence screening (SIS) and the forward regression (FR). Both are classical variable screening methods and recently have attracted greater attention under the new light of high-dimensional data analysis. We consider a new and simple screening method that inco…

    Submitted 3 November, 2015; originally announced November 2015.

  22. arXiv:1501.00538  [pdf, ps, other]

    stat.ME

    Efficient estimation in semivarying coefficient models for longitudinal/clustered data

    Authors: Ming-Yen Cheng, Toshio Honda, Jialiang Li

    Abstract: In semivarying coefficient models for longitudinal/clustered data, of primary interest is usually the parametric component which involves unknown constant coefficients. First, we study semiparametric efficiency bound for estimation of the constant coefficients in a general setup. It can be achieved by spline regression provided that the within-cluster covariance matrices are all known, whi…

    Submitted 13 September, 2015; v1 submitted 3 January, 2015; originally announced January 2015.

  23. arXiv:1410.6556  [pdf, ps, other]

    stat.ME

    Forward variable selection for sparse ultra-high dimensional varying coefficient models

    Authors: Ming-Yen Cheng, Toshio Honda, Jin-Ting Zhang

    Abstract: Varying coefficient models have numerous applications in a wide scope of scientific areas. While enjoying nice interpretability, they also allow flexibility in modeling dynamic impacts of the covariates. But, in the new era of big data, it is challenging to select the relevant variables when there are a large number of candidates. Recently several works have focused on this important problem based o…

    Submitted 23 October, 2014; originally announced October 2014.

    Comments: 33 pages, 5 figures and 4 tables

  24. arXiv:1309.7376  [pdf, other]

    math.ST stat.AP stat.ME

    A New Test for One-Way ANOVA with Functional Data and Application to Ischemic Heart Screening

    Authors: Jin-Ting Zhang, Ming-Yen Cheng, Chi-Jen Tseng, Hau-Tieng Wu

    Abstract: We propose and study a new global test, namely the $F_{\max}$-test, for the one-way ANOVA problem in functional data analysis. The test statistic is taken as the maximum value of the usual pointwise $F$-test statistics over the interval the functional responses are observed. A nonparametric bootstrap method is employed to approximate the null distribution of the test statistic and to obtain an est…

    Submitted 27 September, 2013; originally announced September 2013.

  25. Nonparametric independence screening and structure identification for ultra-high dimensional longitudinal data

    Authors: Ming-Yen Cheng, Toshio Honda, Jialiang Li, Heng Peng

    Abstract: Ultra-high dimensional longitudinal data are increasingly common and the analysis is challenging both theoretically and methodologically. We offer a new automatic procedure for finding a sparse semivarying coefficient model, which is widely accepted for longitudinal data analysis. Our proposed method first reduces the number of covariates to a moderate order by employing a screening procedure, and…

    Submitted 23 September, 2014; v1 submitted 19 August, 2013; originally announced August 2013.

    Comments: Published at http://dx.doi.org/10.1214/14-AOS1236 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org)

    Report number: IMS-AOS-AOS1236

    MSC Class: 62G08 (Primary)

    Journal ref: Annals of Statistics 2014, Vol. 42, No. 5, 1819-1849