Showing 1–9 of 9 results for author: Theisen, R

  1. arXiv:2411.00328  [pdf, other]

    stat.ML cs.LG

    How many classifiers do we need?

    Authors: Hyunsuk Kim, Liam Hodgkinson, Ryan Theisen, Michael W. Mahoney

    Abstract: As performance gains through scaling data and/or model size experience diminishing returns, it is becoming increasingly popular to turn to ensembling, where the predictions of multiple models are combined to improve accuracy. In this paper, we provide a detailed analysis of how the disagreement and the polarization (a notion we introduce and define in this paper) among classifiers relate to the pe…

    Submitted 31 October, 2024; originally announced November 2024.

  2. arXiv:2310.12304  [pdf, other]

    stat.ML cs.AI cs.LG

    Preference Optimization for Molecular Language Models

    Authors: Ryan Park, Ryan Theisen, Navriti Sahni, Marcel Patek, Anna Cichońska, Rayees Rahman

    Abstract: Molecular language modeling is an effective approach to generating novel chemical structures. However, these models do not \emph{a priori} encode certain preferences a chemist may desire. We investigate the use of fine-tuning using Direct Preference Optimization to better align generated molecules with chemist preferences. Our findings suggest that this approach is simple, efficient, and highly ef…

    Submitted 18 October, 2023; originally announced October 2023.

  3. arXiv:2305.12313  [pdf, other]

    stat.ML cs.LG

    When are ensembles really effective?

    Authors: Ryan Theisen, Hyunsuk Kim, Yaoqing Yang, Liam Hodgkinson, Michael W. Mahoney

    Abstract: Ensembling has a long history in statistical data analysis, with many impactful applications. However, in many modern machine learning settings, the benefits of ensembling are less ubiquitous and less obvious. We study, both theoretically and empirically, the fundamental question of when ensembling yields significant performance improvements in classification tasks. Theoretically, we prove new res…

    Submitted 20 May, 2023; originally announced May 2023.

  4. arXiv:2202.02842  [pdf, other]

    cs.CL cs.LG

    Evaluating natural language processing models with generalization metrics that do not need access to any training or testing data

    Authors: Yaoqing Yang, Ryan Theisen, Liam Hodgkinson, Joseph E. Gonzalez, Kannan Ramchandran, Charles H. Martin, Michael W. Mahoney

    Abstract: Selecting suitable architecture parameters and training hyperparameters is essential for enhancing machine learning (ML) model performance. Several recent empirical studies conduct large-scale correlational analysis on neural networks (NNs) to search for effective \emph{generalization metrics} that can guide this type of model selection. Effective metrics are typically expected to correlate strong…

    Submitted 4 June, 2023; v1 submitted 6 February, 2022; originally announced February 2022.

    Journal ref: Proceedings of the 29th ACM SIGKDD international conference on knowledge discovery and data mining (2023)

  5. arXiv:2107.11228  [pdf, other]

    cs.LG

    Taxonomizing local versus global structure in neural network loss landscapes

    Authors: Yaoqing Yang, Liam Hodgkinson, Ryan Theisen, Joe Zou, Joseph E. Gonzalez, Kannan Ramchandran, Michael W. Mahoney

    Abstract: Viewing neural network models in terms of their loss landscapes has a long history in the statistical mechanics approach to learning, and in recent years it has received attention within machine learning proper. Among other things, local metrics (such as the smoothness of the loss landscape) have been shown to correlate with global properties of the model (such as good generalization performance).…

    Submitted 12 December, 2021; v1 submitted 23 July, 2021; originally announced July 2021.

    Journal ref: Thirty-fifth Annual Conference on Neural Information Processing Systems, 2021

  6. arXiv:2106.03357  [pdf, other]

    stat.ML cs.LG

    Evaluating State-of-the-Art Classification Models Against Bayes Optimality

    Authors: Ryan Theisen, Huan Wang, Lav R. Varshney, Caiming Xiong, Richard Socher

    Abstract: Evaluating the inherent difficulty of a given data-driven classification problem is important for establishing absolute benchmarks and evaluating progress in the field. To this end, a natural quantity to consider is the \emph{Bayes error}, which measures the optimal classification error theoretically achievable for a given data distribution. While generally an intractable quantity, we show that we…

    Submitted 7 June, 2021; originally announced June 2021.

  7. arXiv:2006.12625  [pdf, other]

    stat.ML cs.LG

    Good Classifiers are Abundant in the Interpolating Regime

    Authors: Ryan Theisen, Jason M. Klusowski, Michael W. Mahoney

    Abstract: Within the machine learning community, the widely-used uniform convergence framework has been used to answer the question of how complex, over-parameterized models can generalize well to new data. This approach bounds the test error of the worst-case model one could have fit to the data, but it has fundamental limitations. Inspired by the statistical mechanics approach to learning, we formally def…

    Submitted 4 March, 2021; v1 submitted 22 June, 2020; originally announced June 2020.

  8. arXiv:1910.10245  [pdf, other]

    stat.ML cs.LG

    Global Capacity Measures for Deep ReLU Networks via Path Sampling

    Authors: Ryan Theisen, Jason M. Klusowski, Huan Wang, Nitish Shirish Keskar, Caiming Xiong, Richard Socher

    Abstract: Classical results on the statistical complexity of linear models have commonly identified the norm of the weights $\|w\|$ as a fundamental capacity measure. Generalizations of this measure to the setting of deep networks have been varied, though a frequently identified quantity is the product of weight norms of each layer. In this work, we show that for a large class of networks possessing a posit…

    Submitted 22 October, 2019; originally announced October 2019.

  9. arXiv:1901.10756  [pdf, other]

    math.DS

    Stochastic versus deterministic consensus dynamics on graphs

    Authors: Dylan Weber, Ryan Theisen, Sebastien Motsch

    Abstract: We study two agent based models of opinion formation - one stochastic in nature and one deterministic. Both models are defined in terms of an underlying graph; we study how the structure of the graph affects the long time behavior of the models in all possible cases of graph topology. We are especially interested in the emergence of a consensus among the agents and provide a condition on the graph…

    Submitted 30 January, 2019; originally announced January 2019.