
Showing 1–21 of 21 results for author: Chételat, D

  1. arXiv:2501.17994  [pdf, other]

    cs.CL cs.LG

    InnerThoughts: Disentangling Representations and Predictions in Large Language Models

    Authors: Didier Chételat, Joseph Cotnareanu, Rylee Thompson, Yingxue Zhang, Mark Coates

    Abstract: Large language models (LLMs) contain substantial factual knowledge which is commonly elicited by multiple-choice question-answering prompts. Internally, such models process the prompt through multiple transformer layers, building varying representations of the problem within their hidden states. Ultimately, however, only the hidden state corresponding to the final layer and token position is used t…

    Submitted 29 January, 2025; originally announced January 2025.

    Comments: Accepted at AISTATS 2025

  2. arXiv:2412.13292  [pdf, other]

    cs.CL

    Refining Answer Distributions for Improved Large Language Model Reasoning

    Authors: Soumyasundar Pal, Didier Chételat, Yingxue Zhang, Mark Coates

    Abstract: Large Language Models (LLMs) have exhibited an impressive capability to perform reasoning tasks, especially if they are encouraged to generate a sequence of intermediate steps. Reasoning performance can be improved by suitably combining multiple LLM responses, generated either in parallel in a single query, or via sequential interactions with LLMs throughout the reasoning process. Existing strateg…

    Submitted 9 April, 2025; v1 submitted 17 December, 2024; originally announced December 2024.

  3. arXiv:2411.00843  [pdf, other]

    cs.LG cs.AI cs.AR cs.CL

    The Graph's Apprentice: Teaching an LLM Low Level Knowledge for Circuit Quality Estimation

    Authors: Reza Moravej, Saurabh Bodhe, Zhanguang Zhang, Didier Chetelat, Dimitrios Tsaras, Yingxue Zhang, Hui-Ling Zhen, Jianye Hao, Mingxuan Yuan

    Abstract: Logic synthesis is a crucial phase in the circuit design process, responsible for transforming hardware description language (HDL) designs into optimized netlists. However, traditional logic synthesis methods are computationally intensive, restricting their iterative use in refining chip designs. Recent advancements in large language models (LLMs), particularly those fine-tuned on programming lang…

    Submitted 14 February, 2025; v1 submitted 30 October, 2024; originally announced November 2024.

  4. arXiv:2405.11024  [pdf, other]

    cs.LG cs.AI

    GraSS: Combining Graph Neural Networks with Expert Knowledge for SAT Solver Selection

    Authors: Zhanguang Zhang, Didier Chetelat, Joseph Cotnareanu, Amur Ghose, Wenyi Xiao, Hui-Ling Zhen, Yingxue Zhang, Jianye Hao, Mark Coates, Mingxuan Yuan

    Abstract: Boolean satisfiability (SAT) problems are routinely solved by SAT solvers in real-life applications, yet solving time can vary drastically between solvers for the same instance. This has motivated research into machine learning models that can predict, for a given SAT instance, which solver to select among several options. Existing SAT solver selection methods all rely on some hand-picked instance…

    Submitted 17 May, 2024; originally announced May 2024.

    Comments: Accepted by KDD 2024

  5. arXiv:2310.10603  [pdf, other]

    cs.LG cs.AI cs.NE math.OC stat.ML

    Exploring the Power of Graph Neural Networks in Solving Linear Optimization Problems

    Authors: Chendi Qian, Didier Chételat, Christopher Morris

    Abstract: Recently, machine learning, particularly message-passing graph neural networks (MPNNs), has gained traction in enhancing exact optimization algorithms. For example, MPNNs speed up solving mixed-integer optimization problems by imitating computationally intensive heuristics like strong branching, which entails solving multiple linear optimization problems (LPs). Despite the empirical success, the rea…

    Submitted 16 October, 2023; originally announced October 2023.

  6. arXiv:2210.16934  [pdf, other]

    cs.LG cs.AI

    Learning to Compare Nodes in Branch and Bound with Graph Neural Networks

    Authors: Abdel Ghani Labassi, Didier Chételat, Andrea Lodi

    Abstract: Branch-and-bound approaches in integer programming require ordering portions of the space to explore next, a problem known as node comparison. We propose a new siamese graph neural network model to tackle this problem, where the nodes are represented as bipartite graphs with attributes. Similar to prior work, we train our model to imitate a diving oracle that plunges towards the optimal solution.…

    Submitted 30 October, 2022; originally announced October 2022.

    Comments: 7 pages, 3 figures, 2 tables

  7. arXiv:2206.14987  [pdf, other]

    cs.LG math.OC stat.ML

    Lookback for Learning to Branch

    Authors: Prateek Gupta, Elias B. Khalil, Didier Chetélat, Maxime Gasse, Yoshua Bengio, Andrea Lodi, M. Pawan Kumar

    Abstract: The expressive and computationally inexpensive bipartite Graph Neural Networks (GNN) have been shown to be an important component of deep learning based Mixed-Integer Linear Program (MILP) solvers. Recent works have demonstrated the effectiveness of such GNNs in replacing the branching (variable selection) heuristic in branch-and-bound (B&B) solvers. These GNNs are trained, offline and on a collec…

    Submitted 29 December, 2022; v1 submitted 29 June, 2022; originally announced June 2022.

    Comments: Published in Transactions on Machine Learning Research (TMLR)

  8. arXiv:2205.11107  [pdf, other]

    cs.LG math.OC

    Learning to branch with Tree MDPs

    Authors: Lara Scavuzzo, Feng Yang Chen, Didier Chételat, Maxime Gasse, Andrea Lodi, Neil Yorke-Smith, Karen Aardal

    Abstract: State-of-the-art Mixed Integer Linear Program (MILP) solvers combine systematic tree search with a plethora of hard-coded heuristics, such as the branching rule. The idea of learning branching rules from data has received increasing attention recently, and promising results have been obtained by learning fast approximations of the strong branching expert. In this work, we instead propose to learn…

    Submitted 13 October, 2022; v1 submitted 23 May, 2022; originally announced May 2022.

    Comments: 10 pages, 2 figures, plus supplementary material

  9. arXiv:2204.09122  [pdf, other]

    math.OC

    Continuous cutting plane algorithms in integer programming

    Authors: Didier Chételat, Andrea Lodi

    Abstract: Cutting planes for mixed-integer linear programs (MILPs) are typically computed in rounds by iteratively solving optimization problems, the so-called separation. Instead, we reframe the problem of finding good cutting planes as a continuous optimization problem over weights parametrizing families of valid inequalities. This problem can also be interpreted as optimizing a neural network to solve an…

    Submitted 6 July, 2023; v1 submitted 19 April, 2022; originally announced April 2022.

    Comments: To be published in Operations Research Letters

    MSC Class: 90C11

  10. arXiv:2203.02433  [pdf, ps, other]

    cs.LG cs.NE math.OC stat.ML

    The Machine Learning for Combinatorial Optimization Competition (ML4CO): Results and Insights

    Authors: Maxime Gasse, Quentin Cappart, Jonas Charfreitag, Laurent Charlin, Didier Chételat, Antonia Chmiela, Justin Dumouchelle, Ambros Gleixner, Aleksandr M. Kazachkov, Elias Khalil, Pawel Lichocki, Andrea Lodi, Miles Lubin, Chris J. Maddison, Christopher Morris, Dimitri J. Papageorgiou, Augustin Parjadis, Sebastian Pokutta, Antoine Prouvost, Lara Scavuzzo, Giulia Zarpellon, Linxin Yang, Sha Lai, Akang Wang, Xiaodong Luo, et al. (16 additional authors not shown)

    Abstract: Combinatorial optimization is a well-established area in operations research and computer science. Until recently, its methods have focused on solving problem instances in isolation, ignoring that they often stem from related data distributions in practice. However, recent years have seen a surge of interest in using machine learning as a new approach for solving combinatorial problems, either dir…

    Submitted 17 March, 2022; v1 submitted 4 March, 2022; originally announced March 2022.

    Comments: NeurIPS 2021 competition. arXiv admin note: text overlap with arXiv:2112.12251 by other authors

  11. arXiv:2104.02828  [pdf, ps, other]

    cs.LG math.OC

    Ecole: A Library for Learning Inside MILP Solvers

    Authors: Antoine Prouvost, Justin Dumouchelle, Maxime Gasse, Didier Chételat, Andrea Lodi

    Abstract: In this paper we describe Ecole (Extensible Combinatorial Optimization Learning Environments), a library to facilitate integration of machine learning in combinatorial optimization solvers. It exposes sequential decision making that must be performed in the process of solving as Markov decision processes. This means that, rather than trying to predict solutions to combinatorial optimization proble…

    Submitted 6 April, 2021; originally announced April 2021.

  12. arXiv:2102.09544  [pdf, ps, other]

    cs.LG cs.DS cs.NE math.OC stat.ML

    Combinatorial optimization and reasoning with graph neural networks

    Authors: Quentin Cappart, Didier Chételat, Elias Khalil, Andrea Lodi, Christopher Morris, Petar Veličković

    Abstract: Combinatorial optimization is a well-established area in operations research and computer science. Until recently, its methods have focused on solving problem instances in isolation, ignoring that they often stem from related data distributions in practice. However, recent years have seen a surge of interest in using machine learning, especially graph neural networks (GNNs), as a key building bloc…

    Submitted 23 September, 2022; v1 submitted 18 February, 2021; originally announced February 2021.

    Journal ref: Journal of Machine Learning Research, 24(130):1-61, 2023

  13. arXiv:2011.06069  [pdf, other]

    cs.LG math.OC

    Ecole: A Gym-like Library for Machine Learning in Combinatorial Optimization Solvers

    Authors: Antoine Prouvost, Justin Dumouchelle, Lara Scavuzzo, Maxime Gasse, Didier Chételat, Andrea Lodi

    Abstract: We present Ecole, a new library to simplify machine learning research for combinatorial optimization. Ecole exposes several key decision tasks arising in general-purpose combinatorial optimization solvers as control problems over Markov decision processes. Its interface mimics the popular OpenAI Gym library and is both extensible and intuitive to use. We aim at making this library a standardized p…

    Submitted 24 November, 2020; v1 submitted 11 November, 2020; originally announced November 2020.

    Comments: Published at the 1st Workshop on Learning Meets Combinatorial Algorithms @ NeurIPS 2020, Vancouver, Canada

  14. arXiv:2009.01358  [pdf, ps, other]

    cs.LG eess.SP stat.ML

    Change Point Detection by Cross-Entropy Maximization

    Authors: Aurélien Serre, Didier Chételat, Andrea Lodi

    Abstract: Many offline unsupervised change point detection algorithms rely on minimizing a penalized sum of segment-wise costs. We extend this framework by proposing to minimize a sum of discrepancies between segments. In particular, we propose to select the change points so as to maximize the cross-entropy between successive segments, balanced by a penalty for introducing new change points. We propose a dy…

    Submitted 2 September, 2020; originally announced September 2020.

    Comments: Preprint

  15. arXiv:1906.01629  [pdf, other]

    cs.LG math.OC stat.ML

    Exact Combinatorial Optimization with Graph Convolutional Neural Networks

    Authors: Maxime Gasse, Didier Chételat, Nicola Ferroni, Laurent Charlin, Andrea Lodi

    Abstract: Combinatorial optimization problems are typically tackled by the branch-and-bound paradigm. We propose a new graph convolutional neural network model for learning branch-and-bound variable selection policies, which leverages the natural variable-constraint bipartite graph representation of mixed-integer linear programs. We train our model via imitation learning from the strong branching expert rul…

    Submitted 30 October, 2019; v1 submitted 4 June, 2019; originally announced June 2019.

    Comments: Accepted paper at the NeurIPS 2019 conference

  16. arXiv:1705.03510  [pdf, other]

    math.PR math.ST

    The middle-scale asymptotics of Wishart matrices

    Authors: Didier Chételat, Martin T. Wells

    Abstract: We study the behavior of a real $p$-dimensional Wishart random matrix with $n$ degrees of freedom when $n,p\rightarrow\infty$ but $p/n\rightarrow 0$. We establish the existence of phase transitions when $p$ grows at the order $n^{(K+1)/(K+3)}$ for every $K\in\mathbb{N}$, and derive expressions for approximating densities between every two phase transitions. To do this, we make use of a novel tool…

    Submitted 9 May, 2017; originally announced May 2017.

    MSC Class: 60B20; 60B10 (Primary); 60E10 (Secondary)

  17. arXiv:1510.08873  [pdf, other]

    math.ST

    On the Domain of Attraction of a Tracy-Widom Law with Applications to Testing Multiple Largest Roots

    Authors: Didier Chételat, Rajendran Narayanan, Martin T. Wells

    Abstract: The greatest root statistic arises as the test statistic in several multivariate analysis settings. Suppose there is a global null hypothesis that consists of different independent sub-null hypotheses, and suppose the greatest root statistic is used as the test statistic for each sub-null hypothesis. Such problems may arise when conducting a batch MANOVA or several batches of pairwise testing for…

    Submitted 29 October, 2015; originally announced October 2015.

    MSC Class: 60G70; 62E20; 62H15

  18. arXiv:1509.02451  [pdf, other]

    math.ST

    Improved Second Order Estimation in the Singular Multivariate Normal Model

    Authors: Didier Chételat, Martin T. Wells

    Abstract: We consider the problem of estimating covariance and precision matrices, and their associated discriminant coefficients, from normal data when the rank of the covariance matrix is strictly smaller than its dimension and the available sample size. Using unbiased risk estimation, we construct novel estimators by minimizing upper bounds on the difference in risk over several classes. Our proposal est…

    Submitted 8 September, 2015; originally announced September 2015.

    Comments: 34 pages

    MSC Class: Primary 62C15; secondary 62F10; 62H12

  19. arXiv:1410.5014  [pdf, other]

    stat.ME math.ST

    Optimal Two-Step Prediction in Regression

    Authors: Didier Chételat, Johannes Lederer, Joseph Salmon

    Abstract: High-dimensional prediction typically comprises two steps: variable selection and subsequent least-squares refitting on the selected variables. However, the standard variable selection procedures, such as the lasso, hinge on tuning parameters that need to be calibrated. Cross-validation, the most popular calibration scheme, is computationally costly and lacks finite sample guarantees. In this pape…

    Submitted 5 June, 2017; v1 submitted 18 October, 2014; originally announced October 2014.

  20. arXiv:1408.6440  [pdf, other]

    math.ST stat.ME

    Noise Estimation in the Spiked Covariance Model

    Authors: Didier Chételat, Martin T. Wells

    Abstract: The problem of estimating a spiked covariance matrix in high dimensions under Frobenius loss, and the parallel problem of estimating the noise in spiked PCA, are investigated. We propose an estimator of the noise parameter by minimizing an unbiased estimator of the invariant Frobenius risk using calculus of variations. The resulting estimator is shown, using random matrix theory, to be strongly cons…

    Submitted 27 August, 2014; originally announced August 2014.

  21. Improved multivariate normal mean estimation with unknown covariance when p is greater than n

    Authors: Didier Chételat, Martin T. Wells

    Abstract: We consider the problem of estimating the mean vector of a p-variate normal $(θ,Σ)$ distribution under invariant quadratic loss, $(δ-θ)'Σ^{-1}(δ-θ)$, when the covariance is unknown. We propose a new class of estimators that dominate the usual estimator $δ^0(X)=X$. The proposed estimators of $θ$ depend upon X and an independent Wishart matrix S with n degrees of freedom; however, S is singular almo…

    Submitted 27 February, 2013; originally announced February 2013.

    Comments: Published at http://dx.doi.org/10.1214/12-AOS1067 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org)

    Report number: IMS-AOS-AOS1067

    Journal ref: Annals of Statistics 2012, Vol. 40, No. 6, 3137-3160