-
In Praise of Stubbornness: An Empirical Case for Cognitive-Dissonance Aware Continual Update of Knowledge in LLMs
Authors:
Simone Clemente,
Zied Ben Houidi,
Alexis Huet,
Dario Rossi,
Giulio Franzese,
Pietro Michiardi
Abstract:
Through systematic empirical investigation, we uncover a fundamental and concerning property of Large Language Models: while they can safely learn facts that don't contradict their knowledge, attempting to update facts with contradictory information triggers catastrophic corruption of unrelated knowledge. Unlike humans, who naturally resist contradictory information, these models indiscriminately accept contradictions, leading to devastating interference, destroying up to 80% of unrelated knowledge even when learning as few as 10-100 contradicting facts. To understand whether this interference could be mitigated through selective plasticity, we experiment with targeted network updates, distinguishing between previously used (stubborn) and rarely used (plastic) neurons. We uncover another asymmetry: while sparing frequently-used neurons significantly improves retention of existing knowledge for non-contradictory updates (98% vs 93% with standard updates), contradictory updates trigger catastrophic interference regardless of targeting strategy. This effect, which persists across tested model scales (GPT-2 to GPT-J-6B), suggests a fundamental limitation in how neural networks handle contradictions. Finally, we demonstrate that contradictory information can be reliably detected (95%+ accuracy) using simple model features, offering a potential protective mechanism. These findings motivate new architectures that can, like humans, naturally resist contradictions rather than allowing destructive overwrites.
Submitted 10 June, 2025; v1 submitted 5 February, 2025;
originally announced February 2025.
-
Episodic Memories Generation and Evaluation Benchmark for Large Language Models
Authors:
Alexis Huet,
Zied Ben Houidi,
Dario Rossi
Abstract:
Episodic memory -- the ability to recall specific events grounded in time and space -- is a cornerstone of human cognition, enabling not only coherent storytelling, but also planning and decision-making. Despite their remarkable capabilities, Large Language Models (LLMs) lack a robust mechanism for episodic memory: we argue that integrating episodic memory capabilities into LLMs is essential for advancing AI towards human-like cognition, increasing their potential to reason consistently and ground their output in real-world episodic events, hence avoiding confabulations. To address this challenge, we introduce a comprehensive framework to model and evaluate LLM episodic memory capabilities. Drawing inspiration from cognitive science, we develop a structured approach to represent episodic events, encapsulating temporal and spatial contexts, involved entities, and detailed descriptions. We synthesize a unique episodic memory benchmark, free from contamination, and release open source code and datasets to assess LLM performance across various recall and episodic reasoning tasks. Our evaluation of state-of-the-art models, including GPT-4 and Claude variants, Llama 3.1, and o1-mini, reveals that even the most advanced LLMs struggle with episodic memory tasks, particularly when dealing with multiple related events or complex spatio-temporal relationships -- even in contexts as short as 10k-100k tokens.
Submitted 20 January, 2025;
originally announced January 2025.
-
Rare Yet Popular: Evidence and Implications from Labeled Datasets for Network Anomaly Detection
Authors:
Jose Manuel Navarro,
Alexis Huet,
Dario Rossi
Abstract:
Anomaly detection research works generally propose algorithms or end-to-end systems that are designed to automatically discover outliers in a dataset or a stream. While literature abounds concerning algorithms or the definition of metrics for better evaluation, the quality of the ground truth against which they are evaluated is seldom questioned. In this paper, we present a systematic analysis of available public (and additionally our private) ground truth for anomaly detection in the context of network environments, where data is intrinsically temporal, multivariate and, in particular, exhibits spatial properties, which, to the best of our knowledge, we are the first to explore. Our analysis reveals that, while anomalies are, by definition, temporally rare events, their spatial characterization clearly shows that some types of anomalies are significantly more popular than others. We find that simple clustering can reduce the need for human labeling by a factor of 2x-10x, which we are the first to quantitatively analyze in the wild.
Submitted 18 November, 2022;
originally announced November 2022.
-
Local Evaluation of Time Series Anomaly Detection Algorithms
Authors:
Alexis Huet,
Jose Manuel Navarro,
Dario Rossi
Abstract:
In recent years, specific evaluation metrics for time series anomaly detection algorithms have been developed to handle the limitations of the classical precision and recall. However, such metrics are heuristically built as an aggregate of multiple desirable aspects, introduce parameters and wipe out the interpretability of the output. In this article, we first highlight the limitations of the classical precision/recall, as well as the main issues of the recent event-based metrics -- for instance, we show that an adversary algorithm can reach high precision and recall on almost any dataset under weak assumptions. To cope with the above problems, we propose a theoretically grounded, robust, parameter-free and interpretable extension to precision/recall metrics, based on the concept of "affiliation" between the ground truth and the prediction sets. Our metrics leverage measures of duration between ground truth and predictions, and thus have an intuitive interpretation. By further comparison against random sampling, we obtain a normalized precision/recall, quantifying how much a given set of results is better than a random baseline prediction. By construction, our approach keeps the evaluation local regarding ground truth events, enabling fine-grained visualization and interpretation of algorithmic results. We compare our proposal against various public time series anomaly detection datasets, algorithms and metrics. We further derive theoretical properties of the affiliation metrics that give explicit expectations about their behavior and ensure robustness against adversary strategies.
Submitted 27 June, 2022;
originally announced June 2022.
-
New asymptotic techniques for the partial wave cut-off method for calculating the QED one loop effective action
Authors:
Adolfo Huet,
Idrish Huet,
Octavio Cornejo
Abstract:
The Gel'fand-Yaglom theorem has been used to calculate the one-loop effective action in quantum field theory by means of the "partial-wave-cutoff method". This method works well for a wide class of background fields and is essentially exact. However, its implementation has been semi-analytical so far since it involves solving a non-linear ordinary differential equation for which solutions are in general unknown. Within the context of quantum electrodynamics (QED) and $O(2)\times O(3)$ symmetric backgrounds, we present two complementary asymptotic methods that provide approximate analytical solutions to this equation. We test these approximations for different background field configurations and mass regimes and demonstrate that the effective action can indeed be calculated with good accuracy using these asymptotic expressions. To further probe these methods, we analyze the massless limit of the effective action and obtain its divergence structure with respect to the radial suppression parameter of the background field, comparing our findings with previously reported results.
Submitted 27 November, 2021;
originally announced November 2021.
-
Human readable network troubleshooting based on anomaly detection and feature scoring
Authors:
Jose M. Navarro,
Alexis Huet,
Dario Rossi
Abstract:
Network troubleshooting is still a heavily human-intensive process. To reduce the time spent by human operators in the diagnosis process, we present a system based on (i) unsupervised learning methods for detecting anomalies in the time domain, (ii) an attention mechanism to rank features in the feature space and finally (iii) an expert knowledge module able to seamlessly incorporate previously collected domain-knowledge. In this paper, we thoroughly evaluate the performance of the full system and of its individual building blocks: particularly, we consider (i) 10 anomaly detection algorithms as well as (ii) 10 attention mechanisms, that comprehensively represent the current state of the art in the respective fields. Leveraging a unique collection of expert-labeled datasets worth several months of real router telemetry data, we perform a thorough performance evaluation contrasting practical results in constrained stream-mode settings, with the results achievable by an ideal oracle in academic settings. Our experimental evaluation shows that (i) the proposed system is effective in achieving high levels of agreement with the expert, and (ii) that even a simple statistical approach is able to extract useful information from expert knowledge gained in past cases, significantly improving troubleshooting performance.
Submitted 26 August, 2021;
originally announced August 2021.
-
On the low-energy limit of the QED N-photon amplitudes: part 2
Authors:
James P. Edwards,
Adolfo Huet,
Christian Schubert
Abstract:
In recent work, Gies and Karbstein have discovered that the two-loop Euler-Heisenberg Lagrangians for scalar and spinor QED have non-vanishing reducible contributions in addition to the well-studied irreducible ones. This invalidates previous applications of those Lagrangians to the computation of the two-loop $N$-photon amplitudes in the low energy limit. Here we compute the corrections to those amplitudes due to the reducible contributions.
Submitted 27 July, 2018;
originally announced July 2018.
-
Supersymmetric quantum electronic states in graphene under uniaxial strain
Authors:
Yajaira Concha Sanchez,
Adolfo Huet,
Alfredo Raya,
David Valenzuela
Abstract:
We study uniaxially strained graphene under the influence of non-uniform magnetic fields perpendicular to the material sample with a coordinate-independent strain tensor. For that purpose, we solve the Dirac equation with anisotropic Fermi velocity and explore the conditions upon which such an equation possesses a supersymmetric structure in the quantum mechanical sense through examples. Working in a Landau-like gauge, wave functions and energy eigenvalues are found analytically in terms of the magnetic field intensity, the anisotropy scales and other relevant parameters that shape the magnetic field profiles.
Submitted 8 June, 2018;
originally announced June 2018.
-
On the Vlasov equation for Schwinger pair production in a time-dependent electric field
Authors:
Adolfo Huet,
Sang Pyo Kim,
Christian Schubert
Abstract:
Schwinger pair creation in a purely time-dependent electric field can be described through a quantum Vlasov equation describing the time evolution of the single-particle momentum distribution function. This equation exists in two versions, both of which can be derived by a Bogoliubov transformation, but whose equivalence is not obvious. For the spinless case, we show here that the difference between these two evolution equations corresponds to the one between the "in-out" and "in-in" formalisms. We give a simple relation between the asymptotic distribution functions generated by the two Vlasov equations. As examples we discuss the Sauter and single-soliton field cases.
Submitted 25 January, 2015; v1 submitted 12 November, 2014;
originally announced November 2014.
-
Integral representations combining ladders and crossed-ladders
Authors:
F. Bastianelli,
A. Huet,
C. Schubert,
R. Thakur,
A. Weber
Abstract:
We use the worldline formalism to derive integral representations for three classes of amplitudes in scalar field theory: (i) the scalar propagator exchanging N momenta with a scalar background field, (ii) the "half-ladder" with N rungs in x-space, and (iii) the four-point ladder with N rungs in x-space as well as in (off-shell) momentum space. In each case we give a compact expression combining the N! Feynman diagrams contributing to the amplitude. As our main application, we reconsider the well-known case of two massive scalars interacting through the exchange of a massless scalar. Applying asymptotic estimates and a saddle-point approximation to the N-rung ladder plus crossed ladder diagrams, we derive a semi-analytic approximation formula for the lowest bound state mass in this model.
Submitted 30 May, 2014;
originally announced May 2014.
-
Full mass range analysis of the QED effective action for an O(2)xO(3) symmetric field
Authors:
Naser Ahmadiniaz,
Adolfo Huet,
Alfredo Raya,
Christian Schubert
Abstract:
An interesting class of background field configurations in quantum electrodynamics (QED) are the O(2)xO(3) symmetric fields, originally introduced by S.L. Adler in 1972. Those backgrounds have some instanton-like properties and yield a one-loop effective action that is highly nontrivial, but amenable to numerical calculation. Here, we use the recently developed "partial-wave-cutoff method" for a full mass range numerical analysis of the effective action for the "standard" O(2)xO(3) symmetric field, modified by a radial suppression factor. At large mass, we are able to match the asymptotics of the physically renormalized effective action against the leading two mass levels of the inverse mass expansion. For small masses, with a suitable choice of the renormalization scheme we obtain stable numerical results even in the massless limit. We analyze the N-point functions in this background and show that, even in the absence of the radial suppression factor, the two-point contribution to the effective action is the only obstacle to taking its massless limit. The standard O(2)xO(3) background leads to a chiral anomaly term in the effective action, and both our perturbative and nonperturbative results strongly suggest that the small-mass asymptotic behavior of the effective action is, after the subtraction of the two-point contribution, dominated by this anomaly term as the only source of a logarithmic mass dependence. This confirms a conjecture by M. Fry.
Submitted 7 May, 2013;
originally announced May 2013.
-
QED effective action for an O(2)xO(3) symmetric field in the full mass range
Authors:
N. Ahmadiniaz,
A. Huet,
A. Raya,
C. Schubert
Abstract:
An interesting class of background field configurations in QED are the O(2)xO(3) symmetric fields. Those backgrounds have some instanton-like properties and yield a one-loop effective action that is highly nontrivial but amenable to numerical calculation, for both scalar and spinor QED. Here we use the recently developed "partial-wave-cutoff method" for a numerical analysis of both effective actions in the full mass range. In particular, at large mass we are able to match the asymptotic behavior of the physically renormalized effective action against the leading two mass levels of the inverse mass (or heat kernel) expansion. At small mass we obtain good numerical results even in the massless case for the appropriately (unphysically) renormalized effective action after the removal of the chiral anomaly term through a small radial cutoff factor. In particular, we show that the effective action after this removal remains finite in the massless limit, which also provides indirect support for M. Fry's hypothesis that the QED effective action in this limit is dominated by the chiral anomaly term.
Submitted 23 January, 2013;
originally announced January 2013.
-
The Derivative Expansion at Small Mass for the Spinor Effective Action
Authors:
Gerald V. Dunne,
Adolfo Huet,
Jin Hur,
Hyunsoo Min
Abstract:
We study the small mass limit of the one-loop spinor effective action, comparing the derivative expansion approximation with exact numerical results that are obtained from an extension to spinor theories of the partial-wave-cutoff method. In this approach one can compute numerically the renormalized one-loop effective action for radially separable background gauge fields in spinor QED. We highlight an important difference between the small mass limit of the derivative expansion approximation for spinor and scalar theories.
Submitted 16 March, 2011;
originally announced March 2011.
-
New relations between spinor and scalar one-loop effective Lagrangians in constant background fields
Authors:
Adolfo Huet
Abstract:
Simple new relations are presented between the one-loop effective Lagrangians of spinor and scalar particles in constant curvature background fields, both electromagnetic and gravitational. These relations go beyond the well-known cases for self-dual background fields.
Submitted 5 April, 2010;
originally announced April 2010.
-
Closed-form weak-field expansion of two-loop Euler-Heisenberg Lagrangians
Authors:
G. V. Dunne,
A. Huet,
D. Rivera,
C. Schubert
Abstract:
We obtain closed-form expressions, in terms of the Faulhaber numbers, for the weak-field expansion coefficients of the two-loop Euler-Heisenberg effective Lagrangians in a magnetic or electric field. This follows from the observation that the magnetic worldline Green's function has a natural expansion in terms of the Faulhaber numbers.
Submitted 10 September, 2006;
originally announced September 2006.
-
Gauge Dependence of Mass and Condensate in Chirally Asymmetric Phase of Quenched QED3
Authors:
A. Bashir,
A. Huet,
A. Raya
Abstract:
We study three dimensional quenched Quantum Electrodynamics in the bare vertex approximation. We investigate the gauge dependence of the dynamically generated Euclidean mass of the fermion and the chiral condensate for a wide range of values of the covariant gauge parameter $ξ$. We find that (i) away from $ξ=0$, gauge dependence of the said quantities is considerably reduced without resorting to sophisticated vertex {\em ansatze}, (ii) wavefunction renormalization plays an important role in restoring gauge invariance and (iii) the Ward-Green-Takahashi identity seems to increase the gauge dependence when used in conjunction with some simplifying assumptions. In the Landau gauge, we also verify that our results are in agreement with those based upon dimensional regularization scheme within the numerical accuracy available.
Submitted 1 March, 2002;
originally announced March 2002.