Search | arXiv e-print repository

From Stability to Inconsistency: A Study of Moral Preferences in LLMs

Authors: Monika Jotautaite, Mary Phuong, Chatrik Singh Mangat, Maria Angelica Martinez

Abstract: As large language models (LLMs) increasingly integrate into our daily lives, it becomes crucial to understand their implicit biases and moral tendencies. To address this, we introduce a Moral Foundations LLM dataset (MFD-LLM) grounded in Moral Foundations Theory, which conceptualizes human morality through six core foundations. We propose a novel evaluation method that captures the full spectrum of LLMs' revealed moral preferences by answering a range of real-world moral dilemmas. Our findings reveal that state-of-the-art models have remarkably homogeneous value preferences, yet demonstrate a lack of consistency.

Submitted 8 April, 2025; originally announced April 2025.

arXiv:2501.13706 [pdf]

Analysis of Eccentric Coaxial Waveguides Filled with Lossy Anisotropic Media via Finite Difference

Authors: Raul O. Ribeiro, Maria A. Martinez, Guilherme S. Rosa, Rafael A. Penchel

Abstract: This study presents a finite difference method (FDM) to model the electromagnetic field propagation in eccentric coaxial waveguides filled with lossy uniaxially anisotropic media. The formulation utilizes conformal transformation to map the eccentric circular waveguide into an equivalent concentric one. In the concentric problem, we introduce a novel normalized Helmholtz equation to decouple TM and TE modes, and we solve this non-homogeneous partial differential equation using the finite difference in cylindrical coordinates. The proposed approach was validated against perturbation-based, spectral element-based, and finite-integration-based numerical solutions. The preliminary results show that our solution is superior in computational time. Furthermore, our FDM formulation can be extended with minimal adaptations to model complex media problems, such as metamaterial devices, optical fibers, and geophysical exploration sensors.

Submitted 23 January, 2025; originally announced January 2025.

Comments: This work was presented at the SBMO 2024 - XXI Brazilian Symposium on Microwaves and Optoelectronics. For more information about the conference, please visit https://www.sbmo.org.br/sbmo/2024/home

arXiv:2410.14627 [pdf, other]

CELI: Controller-Embedded Language Model Interactions

Authors: Jan-Samuel Wagner, Dave DeCaprio, Abishek Chiffon Muthu Raja, Jonathan M. Holman, Lauren K. Brady, Sky C. Cheung, Hosein Barzekar, Eric Yang, Mark Anthony Martinez II, David Soong, Sriram Sridhar, Han Si, Brandon W. Higgs, Hisham Hamadeh, Scott Ogden

Abstract: We introduce Controller-Embedded Language Model Interactions (CELI), a framework that integrates control logic directly within language model (LM) prompts, facilitating complex, multi-stage task execution. CELI addresses limitations of existing prompt engineering and workflow optimization techniques by embedding control logic directly within the operational context of language models, enabling dynamic adaptation to evolving task requirements. Our framework transfers control from the traditional programming execution environment to the LMs, allowing them to autonomously manage computational workflows while maintaining seamless interaction with external systems and functions. CELI supports arbitrary function calls with variable arguments, bridging the gap between LMs' adaptive reasoning capabilities and conventional software paradigms' structured control mechanisms. To evaluate CELI's versatility and effectiveness, we conducted case studies in two distinct domains: code generation (HumanEval benchmark) and multi-stage content generation (Wikipedia-style articles). The results demonstrate notable performance improvements across a range of domains. CELI achieved a 4.9 percentage point improvement over the best reported score of the baseline GPT-4 model on the HumanEval code generation benchmark. In multi-stage content generation, 94.4% of CELI-produced Wikipedia-style articles met or exceeded first draft quality when optimally configured, with 44.4% achieving high quality. These outcomes underscore CELI's potential for optimizing AI-driven workflows across diverse computational domains.

Submitted 18 October, 2024; originally announced October 2024.

Comments: 26 pages, 2 figures

MSC Class: 68T50; 68Q32; 68N19

ACM Class: I.2.6; I.2.7; D.2.2

arXiv:2410.06491 [pdf, other]

Honesty to Subterfuge: In-Context Reinforcement Learning Can Make Honest Models Reward Hack

Authors: Leo McKee-Reid, Christoph Sträter, Maria Angelica Martinez, Joe Needham, Mikita Balesni

Abstract: Previous work has shown that training "helpful-only" LLMs with reinforcement learning on a curriculum of gameable environments can lead models to generalize to egregious specification gaming, such as editing their own reward function or modifying task checklists to appear more successful. We show that gpt-4o, gpt-4o-mini, o1-preview, and o1-mini - frontier models trained to be helpful, harmless, and honest - can engage in specification gaming without training on a curriculum of tasks, purely from in-context iterative reflection (which we call in-context reinforcement learning, "ICRL"). We also show that using ICRL to generate highly-rewarded outputs for expert iteration (compared to the standard expert iteration reinforcement learning algorithm) may increase gpt-4o-mini's propensity to learn specification-gaming policies, generalizing (in very rare cases) to the most egregious strategy where gpt-4o-mini edits its own reward function. Our results point toward the strong ability of in-context reflection to discover rare specification-gaming strategies that models might not exhibit zero-shot or with normal training, highlighting the need for caution when relying on alignment of LLMs in zero-shot settings.

Submitted 8 October, 2024; originally announced October 2024.

Comments: 20 pages, 9 figures

arXiv:1907.03841 [pdf]

The Advent of Technological Singularity: a Formal Metric

Authors: Juan A. Lara, David Lizcano, María A. Martínez, Juan Pazos

Abstract: The Technological Singularity; that is, the possibility of achieving a General Artificial Intelligence (AGI) that surpasses human intelligence, is one of the vital paradigms of today's humanity. However, until now only opinions about its possibility and/or achievement were issued, therefore, in this work, a metric is presented, for the first time, to objectively measure the actual state in which the advent of technological singularity is found.

Submitted 25 June, 2019; originally announced July 2019.

Showing 1–5 of 5 results for author: Martínez, M A