Stochastic Chameleons: Irrelevant Context Hallucinations Reveal Class-Based (Mis)Generalization in LLMs

Authors: Ziling Cheng, Meng Cao, Marc-Antoine Rondeau, Jackie Chi Kit Cheung

Abstract: The widespread success of large language models (LLMs) on NLP benchmarks has been accompanied by concerns that LLMs function primarily as stochastic parrots that reproduce texts similar to what they saw during pre-training, often erroneously. But what is the nature of their errors, and do these errors exhibit any regularities? In this work, we examine irrelevant context hallucinations, in which models integrate misleading contextual cues into their predictions. Through behavioral analysis, we show that these errors result from a structured yet flawed mechanism that we term class-based (mis)generalization, in which models combine abstract class cues with features extracted from the query or context to derive answers. Furthermore, mechanistic interpretability experiments on Llama-3, Mistral, and Pythia across 39 factual recall relation types reveal that this behavior is reflected in the model's internal computations: (i) abstract class representations are constructed in lower layers before being refined into specific answers in higher layers, (ii) feature selection is governed by two competing circuits -- one prioritizing direct query-based reasoning, the other incorporating contextual cues -- whose relative influences determine the final output. Our findings provide a more nuanced perspective on the stochastic parrot argument: through form-based training, LLMs can exhibit generalization leveraging abstractions, albeit in unreliable ways based on contextual cues -- what we term stochastic chameleons.

Submitted 29 May, 2025; v1 submitted 28 May, 2025; originally announced May 2025.

Comments: Accepted to ACL 2025 (Main Conference)

arXiv:2502.15657

Superintelligent Agents Pose Catastrophic Risks: Can Scientist AI Offer a Safer Path?

Authors: Yoshua Bengio, Michael Cohen, Damiano Fornasiere, Joumana Ghosn, Pietro Greiner, Matt MacDermott, Sören Mindermann, Adam Oberman, Jesse Richardson, Oliver Richardson, Marc-Antoine Rondeau, Pierre-Luc St-Charles, David Williams-King

Abstract: The leading AI companies are increasingly focused on building generalist AI agents -- systems that can autonomously plan, act, and pursue goals across almost all tasks that humans can perform. Despite how useful these systems might be, unchecked AI agency poses significant risks to public safety and security, ranging from misuse by malicious actors to a potentially irreversible loss of human control. We discuss how these risks arise from current AI training methods. Indeed, various scenarios and experiments have demonstrated the possibility of AI agents engaging in deception or pursuing goals that were not specified by human operators and that conflict with human interests, such as self-preservation. Following the precautionary principle, we see a strong need for safer, yet still useful, alternatives to the current agency-driven trajectory. Accordingly, we propose as a core building block for further advances the development of a non-agentic AI system that is trustworthy and safe by design, which we call Scientist AI. This system is designed to explain the world from observations, as opposed to taking actions in it to imitate or please humans. It comprises a world model that generates theories to explain data and a question-answering inference machine. Both components operate with an explicit notion of uncertainty to mitigate the risks of overconfident predictions. In light of these considerations, a Scientist AI could be used to assist human researchers in accelerating scientific progress, including in AI safety. In particular, our system can be employed as a guardrail against AI agents that might be created despite the risks involved. Ultimately, focusing on non-agentic AI may enable the benefits of AI innovation while avoiding the risks associated with the current trajectory. We hope these arguments will motivate researchers, developers, and policymakers to favor this safer path.

Submitted 24 February, 2025; v1 submitted 21 February, 2025; originally announced February 2025.

Comments: v2 with fixed formatting for URLs and hyperlinks

arXiv:2406.12018

CItruS: Chunked Instruction-aware State Eviction for Long Sequence Modeling

Authors: Yu Bai, Xiyuan Zou, Heyan Huang, Sanxing Chen, Marc-Antoine Rondeau, Yang Gao, Jackie Chi Kit Cheung

Abstract: Long sequence modeling has gained broad interest as large language models (LLMs) continue to advance. Recent research has identified that a large portion of hidden states within the key-value caches of Transformer models can be discarded (also termed evicted) without affecting the perplexity performance in generating long sequences. However, we show that these methods, despite preserving perplexity performance, often drop information that is important for solving downstream tasks, a problem which we call information neglect. To address this issue, we introduce Chunked Instruction-aware State Eviction (CItruS), a novel modeling technique that integrates the attention preferences useful for a downstream task into the eviction process of hidden states. In addition, we design a method for chunked sequence processing to further improve efficiency. Our training-free method exhibits superior performance on long sequence comprehension and retrieval tasks over several strong baselines under the same memory budget, while preserving language modeling perplexity. The code and data have been released at https://github.com/ybai-nlp/CItruS.

Submitted 8 October, 2024; v1 submitted 17 June, 2024; originally announced June 2024.

Comments: EMNLP 2024 Main Conference

arXiv:2401.11323

Identifying and Analyzing Performance-Critical Tokens in Large Language Models

Authors: Yu Bai, Heyan Huang, Cesare Spinoso-Di Piano, Marc-Antoine Rondeau, Sanxing Chen, Yang Gao, Jackie Chi Kit Cheung

Abstract: In-context learning (ICL) has emerged as an effective solution for few-shot learning with large language models (LLMs). However, how LLMs leverage demonstrations to specify a task and learn a corresponding computational function through ICL is underexplored. Drawing from the way humans learn from content-label mappings in demonstrations, we categorize the tokens in an ICL prompt into content, stopword, and template tokens. Our goal is to identify the types of tokens whose representations directly influence LLM's performance, a property we refer to as being performance-critical. By ablating representations from the attention of the test example, we find that the representations of informative content tokens have less influence on performance compared to template and stopword tokens, which contrasts with the human attention to informative words. We give evidence that the representations of performance-critical tokens aggregate information from the content tokens. Moreover, we demonstrate experimentally that lexical meaning, repetition, and structural cues are the main distinguishing characteristics of these tokens. Our work sheds light on how large language models learn to perform tasks from demonstrations and deepens our understanding of the roles different types of tokens play in large language models.

Submitted 23 February, 2025; v1 submitted 20 January, 2024; originally announced January 2024.

Comments: Work in progress

arXiv:2101.12012

Dynamic formation of spherical voids crossing linear defects

Authors: Youcef A. Bioud, Maxime Rondeau, Abderraouf Boucherif, Gilles Patriarche, Dominique Drouin, Richard Arès

Abstract: A predictive model for the evolution of porous Ge layer upon thermal treatment is reported. We represent an idealized etched dislocation core as an axially symmetric elongated hole and compute its dynamics during annealing. Numerical simulations of the shape change of a completely spherical void via surface diffusion have been performed. Simulations and experiments show individual large spherical voids, aligned along the dislocation core. The creation of voids could facilitate interactions between dislocations, enabling the dislocation network to change its connectivity in a way that facilitates the subsequent annihilation of dislocation segments. This confirms that thermally activated processes such as state diffusion of porous materials provide mechanisms whereby the defects are removed or arranged in configurations of lower energy. This model is intended to be indicative, and more detailed experimental characterization of process parameters such as annealing temperature and time, and could estimate the annealing time for given temperatures, or vice versa, with the right parameters.

Submitted 7 January, 2021; originally announced January 2021.

Comments: 7 pages, 3 figures

arXiv:2002.09127

Learning Dynamic Belief Graphs to Generalize on Text-Based Games

Authors: Ashutosh Adhikari, Xingdi Yuan, Marc-Alexandre Côté, Mikuláš Zelinka, Marc-Antoine Rondeau, Romain Laroche, Pascal Poupart, Jian Tang, Adam Trischler, William L. Hamilton

Abstract: Playing text-based games requires skills in processing natural language and sequential decision making. Achieving human-level performance on text-based games remains an open challenge, and prior research has largely relied on hand-crafted structured representations and heuristics. In this work, we investigate how an agent can plan and generalize in text-based games using graph-structured representations learned end-to-end from raw text. We propose a novel graph-aided transformer agent (GATA) that infers and updates latent belief graphs during planning to enable effective action selection by capturing the underlying game dynamics. GATA is trained using a combination of reinforcement and self-supervised learning. Our work demonstrates that the learned graph-based representations help agents converge to better policies than their text-only counterparts and facilitate effective generalization across game configurations. Experiments on 500+ unique games from the TextWorld suite show that our best agent outperforms text-based baselines by an average of 24.2%.

Submitted 11 May, 2021; v1 submitted 20 February, 2020; originally announced February 2020.

Comments: Bug fixed in Table 1

arXiv:1208.2237

doi 10.1103/PhysRevB.86.125422

Phase diagram of insulating crystal and quantum Hall states in ABC-stacked trilayer graphene

Authors: R. Côté, Maxime Rondeau, Anne-Marie Gagnon, Yafis Barlas

Abstract: In the presence of a perpendicular magnetic field, ABC-stacked trilayer graphene's chiral band structure supports a 12-fold degenerate N=0 Landau level (LL). Along with the valley and spin degrees of freedom, the zeroth LL contains additional quantum numbers associated with the LL orbital index $n=0,1,2$. Remote inter-layer hopping terms and external potential difference $\Delta_{B}$ between the layers lead to LL splitting by introducing a gap $\Delta_{LL}$ between the degenerate zero-energy triplet LL orbitals. Assuming that the spin and valley degrees of freedom are frozen, we study the phase diagram of this system resulting from competition of the single particle LL splitting and Coulomb interactions within the Hartree-Fock approximation at integer filling factors. Above a critical value $\Delta_{LL}^{c}$ of the external potential difference, i.e., for $|\Delta_{LL}| > \Delta_{LL}^{c}$, the ground state is a uniform quantum Hall state where the electrons occupy the lowest unoccupied LL orbital index. For $|\Delta_{LL}| < \Delta_{LL}^{c}$ (which corresponds to large positive or negative values of $\Delta_{B}$) the uniform QH state is unstable to the formation of a crystal state at integer filling factors. This phase transition should be characterized by a Hall plateau transition as a function of $\Delta_{LL}$ at a fixed filling factor. We also study the properties of this crystal state and discuss its experimental detection.

Submitted 10 August, 2012; originally announced August 2012.

Comments: 16 pages with 13 figures

Journal ref: Phys. Rev. B 86, 125422 (2012)

arXiv:1112.2729

doi 10.1103/PhysRevLett.109.126804

Quantum Hall to charge-density-wave phase transitions in ABC-trilayer graphene

Authors: Yafis Barlas, R. Cote, Maxime Rondeau

Abstract: ABC-stacked trilayer graphene's chiral band structure results in three ($n=0,1,2$) Landau level orbitals with zero kinetic energy. This unique feature has important consequences on the interaction driven states of the 12-fold degenerate (including spin and valley) N=0 Landau level. In particular, at many filling factors $\nu_{T} = \pm5, \pm4, \pm2, \pm1$ a quantum phase transition from a quantum Hall liquid state to a triangular charge density wave occurs as a function of the single-particle induced LL orbital splitting $\Delta_{LL}$. This phase transition should be characterized by a re-entrant integer quantum Hall effect with the Hall conductivity corresponding to the adjacent interaction driven integer quantum Hall plateau.

Submitted 4 August, 2012; v1 submitted 12 December, 2011; originally announced December 2011.

Comments: 4+ pages

Journal ref: Phys. Rev. Lett. 109, 126804 (2012)

arXiv:1102.0984

doi 10.1038/ncomms1440

Fermi-surface reconstruction by stripe order in cuprate superconductors

Authors: F. Laliberte, J. Chang, N. Doiron-Leyraud, E. Hassinger, R. Daou, M. Rondeau, B. J. Ramshaw, R. Liang, D. A. Bonn, W. N. Hardy, S. Pyon, T. Takayama, H. Takagi, I. Sheikin, L. Malone, C. Proust, K. Behnia, L. Taillefer

Abstract: Quantum oscillations have revealed the presence of a small pocket in the Fermi surface of the cuprate superconductor YBCO, whose nature and origin are the subject of much debate. Interpretations include electron and hole pockets; scenarios include Fermi-surface reconstruction by antiferromagnetism, d-density-wave order, and stripe order. Here we report quantum oscillations in the Seebeck and Nernst coefficients of YBCO and show, from the magnitude and sign of the Seebeck coefficient, that they come from an electron pocket. Using measurements of the Seebeck coefficient as a function of hole doping p, we show that the evolution of the Fermi surface in YBCO is the same as in Eu-LSCO, a cuprate where stripe order (a modulation of spin and charge densities) is well established. The electron pocket is most prominent where stripe order is strongest, at p = 1/8. This shows that Fermi-surface reconstruction is a generic mechanism of underdoped cuprates, intimately related to stripe order.

Submitted 6 September, 2011; v1 submitted 4 February, 2011; originally announced February 2011.

Comments: 15 pages, 5 figures, Supplementary information now integrated into article

Journal ref: Nature Communications 2, 432 (2011)

Showing 1–9 of 9 results for author: Rondeau, M