Search | arXiv e-print repository

Dynamic Coalition Structure Detection in Natural Language-based Interactions

Authors: Abhishek N. Kulkarni, Andy Liu, Jean-Raphael Gaglione, Daniel Fried, Ufuk Topcu

Abstract: In strategic multi-agent sequential interactions, detecting dynamic coalition structures is crucial for understanding how self-interested agents coordinate to influence outcomes. However, natural-language-based interactions introduce unique challenges to coalition detection due to ambiguity over intents and difficulty in modeling players' subjective perspectives. We propose a new method that leverages recent advancements in large language models and game theory to predict dynamic multilateral coalition formation in Diplomacy, a strategic multi-agent game where agents negotiate coalitions using natural language. The method consists of two stages. The first stage extracts the set of agreements discussed by two agents in their private dialogue, by combining a parsing-based filtering function with a fine-tuned language model trained to predict player intents. In the second stage, we define a new metric using the concept of subjective rationalizability from hypergame theory to evaluate the expected value of an agreement for each player. We then compute this metric for each agreement identified in the first stage by assessing the strategic value of the agreement for both players and taking into account the subjective belief of one player that the second player would honor the agreement. We demonstrate that our method effectively detects potential coalition structures in online Diplomacy gameplay by assigning high values to agreements likely to be honored and low values to those likely to be violated. The proposed method provides foundational insights into coalition formation in multi-agent environments with language-based negotiation and offers key directions for future research on the analysis of complex natural language-based interactions between agents.

Submitted 22 February, 2025; originally announced February 2025.

Comments: Proceedings of the 24th International Conference on Autonomous Agents and Multiagent Systems (AAMAS 2025)

arXiv:2411.13569

Unconditionally stable symplectic integrators for the Navier-Stokes equations and other dissipative systems

Authors: Sutthikiat Sungkeetanon, Joseph S. Gaglione, Robert L. Chapman, Tyler M. Kelly, Howard A. Cushman, Blakeley H. Odom, Bryan MacGavin, Gafar A. Elamin, Nathan J. Washuta, Jonathan E. Crosmer, Adam C. DeVoria, John W. Sanders

Abstract: Symplectic integrators offer vastly superior performance over traditional numerical techniques for conservative dynamical systems, but their application to dissipative systems is inherently difficult due to dissipative systems' lack of symplectic structure. Leveraging the intrinsic variational structure of higher-order dynamics, this paper presents a general technique for applying existing symplectic integration schemes to dissipative systems, with particular emphasis on viscous fluids modeled by the Navier-Stokes equations. Two very simple such schemes are developed here. Not only are these schemes unconditionally stable for dissipative systems, they also outperform traditional methods with a similar degree of complexity in terms of accuracy for a given time step. For example, in the case of viscous flow between two infinite, flat plates, one of the schemes developed here is found to outperform both the implicit Euler method and the explicit fourth-order Runge-Kutta method in predicting the velocity profile. To the authors' knowledge, this is the very first time that a symplectic integration scheme has been applied successfully to the Navier-Stokes equations. We interpret the present success as direct empirical validation of the canonical Hamiltonian formulation of the Navier-Stokes problem recently published by Sanders et al. More sophisticated symplectic integration schemes are expected to exhibit even greater performance. It is hoped that these results will lead to improved numerical methods in computational fluid dynamics.

Submitted 12 November, 2024; originally announced November 2024.

Comments: 18 pages, 7 figures

arXiv:2402.07069

Using Large Language Models to Automate and Expedite Reinforcement Learning with Reward Machine

Authors: Shayan Meshkat Alsadat, Jean-Raphael Gaglione, Daniel Neider, Ufuk Topcu, Zhe Xu

Abstract: We present the LARL-RM (Large language model-generated Automaton for Reinforcement Learning with Reward Machine) algorithm, which encodes high-level knowledge into reinforcement learning using an automaton in order to expedite learning. Our method uses Large Language Models (LLMs) to obtain high-level domain-specific knowledge through prompt engineering, instead of providing the reinforcement learning algorithm directly with the high-level knowledge, which requires an expert to encode the automaton. We use chain-of-thought and few-shot methods for prompt engineering and demonstrate that our method works using these approaches. Additionally, LARL-RM allows for fully closed-loop reinforcement learning without the need for an expert to guide and supervise the learning, since LARL-RM can use the LLM directly to generate the required high-level knowledge for the task at hand. We also show the theoretical guarantee of our algorithm to converge to an optimal policy. We demonstrate that LARL-RM speeds up the convergence by 30% by implementing our method in two case studies.

Submitted 10 February, 2024; originally announced February 2024.

arXiv:2311.18472

doi 10.1007/JHEP08(2024)176

Searching for exclusive leptoquarks with the Nambu-Jona-Lasinio composite model at the LHC and HL-LHC

Authors: Sehar Ajmal, Jethro Gaglione, Alfredo Gurrola, Orlando Panella, Matteo Presilla, Francesco Romeo, Hao Sun, She-Sheng Xue

Abstract: We present a detailed study concerning a new physics scenario involving four-fermion operators of the Nambu-Jona-Lasinio type characterized by a strong-coupling ultraviolet fixed point where composite particles are formed as bound states of elementary fermions at the scale $Λ={\cal O}(\text{TeV})$. After implementing the model in the Universal FeynRules Output format, we focus on the phenomenology of the scalar leptoquarks at the LHC and the High-Luminosity option. Leptoquark particles have undergone extensive scrutiny in the literature and experimental searches, primarily relying on pair production and, more recently, incorporating single, t-channel, and lepton-induced processes. This study marks, for the first time, the examination of these production modes at varying jet multiplicities. Novel mechanisms emerge, enhancing the total production cross-section, especially for leptoquark couplings to higher fermion generations. A global strategy is devised to capture all final state particles produced in association with leptoquarks or originating from their decay, which we term "exclusive", in analogy to the nomenclature used in nuclear reactions. The assessment of the significance in current and future LHC runs, focusing on the case of leptoquark coupling to a muon - $\textit{c}$ quark pair, reveals superior sensitivity compared to ongoing searches. Given this heightened discovery potential, we advocate the incorporation of exclusive leptoquark searches in future investigations at the LHC.

Submitted 31 May, 2024; v1 submitted 30 November, 2023; originally announced November 2023.

Journal ref: JHEP 08 (2024) 176

arXiv:2309.10171

Specification-Driven Video Search via Foundation Models and Formal Verification

Authors: Yunhao Yang, Jean-Raphaël Gaglione, Sandeep Chinchali, Ufuk Topcu

Abstract: The increasing abundance of video data enables users to search for events of interest, e.g., emergency incidents. Meanwhile, it raises new concerns, such as the need for preserving privacy. Existing approaches to video search require either manual inspection or a deep learning model with massive training. We develop a method that uses recent advances in vision and language models, as well as formal methods, to search for events of interest in video clips automatically and efficiently. The method consists of an algorithm to map text-based event descriptions into linear temporal logic over finite traces (LTL$_f$) and an algorithm to construct an automaton encoding the video information. Then, the method formally verifies the automaton representing the video against the LTL$_f$ specifications and adds the pertinent video clips to the search result if the automaton satisfies the specifications. We provide qualitative and quantitative analysis to demonstrate the video-searching capability of the proposed method. It achieves over 90 percent precision in searching over privacy-sensitive videos and a state-of-the-art autonomous driving dataset.

Submitted 18 September, 2023; originally announced September 2023.

Comments: 12 pages, 18 figures
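
To make the LTL$_f$ checking step in the abstract above concrete, here is a minimal sketch of evaluating a finite-trace LTL formula on a single sequence of per-frame labels. It is a simplification: the paper verifies an automaton built from the video, whereas this sketch only checks one non-empty trace. The tuple encoding of formulas, the function name `holds`, and the propositions `crash` and `ambulance` are all hypothetical and are not taken from the paper.

```python
from typing import List, Set, Tuple, Union

Formula = Union[Tuple[str, str], tuple]  # nested tuples, e.g. ("F", ("ap", "p"))


def holds(f: Formula, trace: List[Set[str]], i: int = 0) -> bool:
    """Return True if formula f holds at position i of a non-empty finite trace."""
    op = f[0]
    n = len(trace)
    if op == "ap":    # atomic proposition is true if labeled at step i
        return f[1] in trace[i]
    if op == "not":
        return not holds(f[1], trace, i)
    if op == "and":
        return holds(f[1], trace, i) and holds(f[2], trace, i)
    if op == "or":
        return holds(f[1], trace, i) or holds(f[2], trace, i)
    if op == "X":     # strong next: a successor step must exist
        return i + 1 < n and holds(f[1], trace, i + 1)
    if op == "F":     # eventually, within the remaining steps
        return any(holds(f[1], trace, j) for j in range(i, n))
    if op == "G":     # always, for all remaining steps
        return all(holds(f[1], trace, j) for j in range(i, n))
    if op == "U":     # f[1] holds until f[2] holds, and f[2] must occur
        return any(
            holds(f[2], trace, j)
            and all(holds(f[1], trace, k) for k in range(i, j))
            for j in range(i, n)
        )
    raise ValueError(f"unknown operator: {op}")


# Hypothetical specification: every crash is eventually followed by an ambulance,
# written as G(not crash or F ambulance).
spec = ("G", ("or", ("not", ("ap", "crash")), ("F", ("ap", "ambulance"))))

clip_a = [set(), {"crash"}, set(), {"ambulance"}]  # assumed per-frame labels
clip_b = [{"crash"}, set()]

matches = [name for name, clip in [("a", clip_a), ("b", clip_b)] if holds(spec, clip)]
```

Under these assumptions only `clip_a` is added to `matches`, because `clip_b` ends before any ambulance appears.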

arXiv:2306.13732

Reinforcement Learning with Temporal-Logic-Based Causal Diagrams

Authors: Yash Paliwal, Rajarshi Roy, Jean-Raphaël Gaglione, Nasim Baharisangari, Daniel Neider, Xiaoming Duan, Ufuk Topcu, Zhe Xu

Abstract: We study a class of reinforcement learning (RL) tasks where the objective of the agent is to accomplish temporally extended goals. In this setting, a common approach is to represent the tasks as deterministic finite automata (DFA) and integrate them into the state-space for RL algorithms. However, while these machines model the reward function, they often overlook the causal knowledge about the environment. To address this limitation, we propose the Temporal-Logic-based Causal Diagram (TL-CD) in RL, which captures the temporal causal relationships between different properties of the environment. We exploit the TL-CD to devise an RL algorithm in which an agent requires significantly less exploration of the environment. To this end, based on a TL-CD and a task DFA, we identify configurations where the agent can determine the expected rewards early during an exploration. Through a series of case studies, we demonstrate the benefits of using TL-CDs, particularly the faster convergence of the algorithm to an optimal policy due to reduced exploration of the environment.

Submitted 23 June, 2023; originally announced June 2023.

arXiv:2305.17372

Reinforcement Learning With Reward Machines in Stochastic Games

Authors: Jueming Hu, Jean-Raphael Gaglione, Yanze Wang, Zhe Xu, Ufuk Topcu, Yongming Liu

Abstract: We investigate multi-agent reinforcement learning for stochastic games with complex tasks, where the reward functions are non-Markovian. We utilize reward machines to incorporate high-level knowledge of complex tasks. We develop an algorithm called Q-learning with reward machines for stochastic games (QRM-SG), to learn the best-response strategy at Nash equilibrium for each agent. In QRM-SG, we define the Q-function at a Nash equilibrium in augmented state space. The augmented state space integrates the state of the stochastic game and the state of reward machines. Each agent learns the Q-functions of all agents in the system. We prove that Q-functions learned in QRM-SG converge to the Q-functions at a Nash equilibrium if the stage game at each time step during learning has a global optimum point or a saddle point, and the agents update Q-functions based on the best-response strategy at this point. We use the Lemke-Howson method to derive the best-response strategy given current Q-functions. The three case studies show that QRM-SG can learn the best-response strategies effectively. QRM-SG learns the best-response strategies after around 7500 episodes in Case Study I, 1000 episodes in Case Study II, and 1500 episodes in Case Study III, while baseline methods such as Nash Q-learning and MADDPG fail to converge to the Nash equilibrium in all three case studies.

Submitted 28 August, 2023; v1 submitted 27 May, 2023; originally announced May 2023.
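
The augmented state space mentioned in the abstract above can be illustrated with a minimal single-agent sketch. This is not QRM-SG itself: it omits the stochastic game, the Q-functions, and the Nash equilibrium computation. The class `RewardMachine`, the helper `run`, and the toy task "see label a, then label b" are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, List, Tuple


@dataclass
class RewardMachine:
    """Finite-state machine that maps a sequence of labels to rewards."""
    initial: str
    # (machine state, label) -> (next machine state, reward)
    delta: Dict[Tuple[str, str], Tuple[str, float]] = field(default_factory=dict)

    def step(self, u: str, label: str) -> Tuple[str, float]:
        # Pairs not listed in delta self-loop with zero reward.
        return self.delta.get((u, label), (u, 0.0))


# Hypothetical task: observe "a" and then "b"; reward 1 only on completion.
rm = RewardMachine(
    initial="u0",
    delta={("u0", "a"): ("u1", 0.0), ("u1", "b"): ("u_done", 1.0)},
)


def run(machine: RewardMachine, trajectory: Iterable[Tuple[int, str]]):
    """Replay (environment state, label) pairs; return augmented states and return."""
    u, total = machine.initial, 0.0
    augmented: List[Tuple[int, str]] = []
    for s, label in trajectory:
        u, r = machine.step(u, label)
        total += r
        augmented.append((s, u))  # augmented state = (environment state, machine state)
    return augmented, total


aug_ok, ret_ok = run(rm, [(0, "a"), (1, "b")])    # correct order
aug_bad, ret_bad = run(rm, [(0, "b"), (1, "a")])  # wrong order
```

The reward depends on the order of labels, so it is non-Markovian in the environment state alone, but it becomes Markovian once the machine state is appended. That is the reason a Q-function can be defined over the augmented pair.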

arXiv:2212.01944

Automaton-Based Representations of Task Knowledge from Generative Language Models

Authors: Yunhao Yang, Jean-Raphaël Gaglione, Cyrus Neary, Ufuk Topcu

Abstract: Automaton-based representations of task knowledge play an important role in control and planning for sequential decision-making problems. However, obtaining the high-level task knowledge required to build such automata is often difficult. Meanwhile, large-scale generative language models (GLMs) can automatically generate relevant task knowledge. However, the textual outputs from GLMs cannot be formally verified or used for sequential decision-making. We propose a novel algorithm named GLM2FSA, which constructs a finite state automaton (FSA) encoding high-level task knowledge from a brief natural-language description of the task goal. GLM2FSA first sends queries to a GLM to extract task knowledge in textual form, and then it builds an FSA to represent this text-based knowledge. The proposed algorithm thus fills the gap between natural-language task descriptions and automaton-based representations, and the constructed FSA can be formally verified against user-defined specifications. We accordingly propose a method to iteratively refine the queries to the GLM based on the outcomes, e.g., counter-examples, from verification. We demonstrate GLM2FSA's ability to build and refine automaton-based representations of everyday tasks (e.g., crossing a road), and also of tasks that require highly-specialized knowledge (e.g., executing secure multi-party computation).

Submitted 9 August, 2023; v1 submitted 4 December, 2022; originally announced December 2022.

Comments: Submitted to JAIR

arXiv:2212.00916

Learning Temporal Logic Properties: an Overview of Two Recent Methods

Authors: Jean-Raphaël Gaglione, Rajarshi Roy, Nasim Baharisangari, Daniel Neider, Zhe Xu, Ufuk Topcu

Abstract: Learning linear temporal logic (LTL) formulas from examples labeled as positive or negative has found applications in inferring descriptions of system behavior. We summarize two methods to learn LTL formulas from examples in two different problem settings. The first method assumes noise in the labeling of the examples. For that, they define the problem of inferring an LTL formula that must be consistent with most but not all of the examples. The second method considers the other problem of inferring meaningful LTL formulas in the case where only positive examples are given. Hence, the first method addresses the robustness to noise, and the second method addresses the balance between conciseness and specificity (i.e., language minimality) of the inferred formula. The summarized methods propose different algorithms to solve the aforementioned problems, as well as to infer other descriptions of temporal properties, such as signal temporal logic or deterministic finite automata.

Submitted 1 December, 2022; originally announced December 2022.

Comments: Appears in Proceedings of AAAI FSS-22 Symposium "Lessons Learned for Autonomous Assessment of Machine Abilities (LLAAMA)"

ACM Class: I.2; F.4.3

arXiv:2209.02650

Learning Interpretable Temporal Properties from Positive Examples Only

Authors: Rajarshi Roy, Jean-Raphaël Gaglione, Nasim Baharisangari, Daniel Neider, Zhe Xu, Ufuk Topcu

Abstract: We consider the problem of explaining the temporal behavior of black-box systems using human-interpretable models. To this end, based on recent research trends, we rely on the fundamental yet interpretable models of deterministic finite automata (DFAs) and linear temporal logic (LTL) formulas. In contrast to most existing works for learning DFAs and LTL formulas, we rely on only positive examples. Our motivation is that negative examples are generally difficult to observe, in particular, from black-box systems. To learn meaningful models from positive examples only, we design algorithms that rely on conciseness and language minimality of models as regularizers. To this end, our algorithms adopt two approaches: a symbolic and a counterexample-guided one. While the symbolic approach exploits an efficient encoding of language minimality as a constraint satisfaction problem, the counterexample-guided one relies on generating suitable negative examples to prune the search. Both approaches provide us with effective algorithms with theoretical guarantees on the learned models. To assess the effectiveness of our algorithms, we evaluate all of them on synthetic data.

Submitted 2 March, 2023; v1 submitted 6 September, 2022; originally announced September 2022.

Comments: Full version of the paper that appeared in AAAI23

ACM Class: F.4.1; I.2.6

arXiv:2105.11545

Uncertainty-Aware Signal Temporal Logic Inference

Authors: Nasim Baharisangari, Jean-Raphaël Gaglione, Daniel Neider, Ufuk Topcu, Zhe Xu

Abstract: Temporal logic inference is the process of extracting formal descriptions of system behaviors from data in the form of temporal logic formulas. The existing temporal logic inference methods mostly neglect uncertainties in the data, which results in limited applicability of such methods in real-world deployments. In this paper, we first investigate the uncertainties associated with trajectories of a system and represent such uncertainties in the form of interval trajectories. We then propose two uncertainty-aware signal temporal logic (STL) inference approaches to classify the undesired behaviors and desired behaviors of a system. Instead of classifying finitely many trajectories, we classify infinitely many trajectories within the interval trajectories. In the first approach, we incorporate robust semantics of STL formulas with respect to an interval trajectory to quantify the margin at which an STL formula is satisfied or violated by the interval trajectory. The second approach relies on the first learning algorithm and exploits the decision tree to infer STL formulas to classify behaviors of a given system. The proposed approaches also work for non-separable data by optimizing the worst-case robustness in inferring an STL formula. Finally, we evaluate the performance of the proposed algorithms in two case studies, where the proposed algorithms show reductions in the computation time by up to four orders of magnitude in comparison with the sampling-based baseline algorithms (for a dataset with 800 sampled trajectories in total).

Submitted 30 May, 2021; v1 submitted 24 May, 2021; originally announced May 2021.

Comments: 11 pages, 7 figures, 2 tables

arXiv:2104.15083

Learning Linear Temporal Properties from Noisy Data: A MaxSAT Approach

Authors: Jean-Raphaël Gaglione, Daniel Neider, Rajarshi Roy, Ufuk Topcu, Zhe Xu

Abstract: We address the problem of inferring descriptions of system behavior using Linear Temporal Logic (LTL) from a finite set of positive and negative examples. Most of the existing approaches for solving such a task rely on predefined templates for guiding the structure of the inferred formula. The approaches that can infer arbitrary LTL formulas, on the other hand, are not robust to noise in the data. To alleviate such limitations, we devise two algorithms for inferring concise LTL formulas even in the presence of noise. Our first algorithm infers minimal LTL formulas by reducing the inference problem to a problem in maximum satisfiability and then using off-the-shelf MaxSAT solvers to find a solution. To the best of our knowledge, we are the first to incorporate the usage of MaxSAT solvers for inferring formulas in LTL. Our second learning algorithm relies on the first algorithm to derive a decision tree over LTL formulas based on a decision tree learning algorithm. We have implemented both our algorithms and verified that our algorithms are efficient in extracting concise LTL descriptions even in the presence of noise.

Submitted 24 June, 2021; v1 submitted 30 April, 2021; originally announced April 2021.

arXiv:1802.07763

doi 10.1103/PhysRevD.99.053003

Neutrino oscillations: ILL experiment revisited

Authors: B. K. Cogswell, D. J. Ernst, K. T. L. Ufheil, J. T. Gaglione, J. M. Malave

Abstract: The ILL experiment, one of the "reactor anomaly" experiments, is re-examined. ILL's baseline of 8.78 m is the shortest of the short baseline experiments, and it is the experiment that finds the largest fraction of the electron anti-neutrinos disappearing -- over 20%. Previous analyses, if they do not ignore the ILL experiment, use functional forms for chi-square which are either totally new and unjustified, are the magnitude chi-square (also termed a "rate analysis"), or utilize a spectral form for chi-square which double counts the systematic error. We do an analysis which utilizes the standard, conventional form for chi-square as well as a derived form for a spectral chi-square. Results for the ILL, Huber, and Daya Bay fluxes are given. We find that the implications of the ILL experiment in providing evidence for a sterile, fourth neutrino are significantly enhanced. Moreover, we find that the ILL experiment provides a set of choices for specific values of the mass-squared difference. The value of the mass-squared difference for the deepest minimum and its statistical significance, using the flux-independent spectral chi-square and the Huber or Daya Bay flux, are 0.92 eV$^2$ and 2.9 $σ$; for the conventional chi-square and the Daya Bay flux, 0.95 eV$^2$ and 3.0 $σ$; and for the conventional chi-square and Huber flux, 0.90 eV$^2$ and 3.3 $σ$. These significances are considerably larger than the 1.8 $σ$ found for the ILL experiment using a rate analysis.

Submitted 18 October, 2018; v1 submitted 21 February, 2018; originally announced February 2018.

Comments: 17 pages, 4 figures, and 2 tables

Journal ref: Phys. Rev. D 99, 053003 (2019)
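
For orientation on how a mass-squared difference and a baseline combine in the ILL entry above, the standard two-flavor short-baseline survival probability can be written as below. This is a textbook approximation offered as a sketch, not the chi-square analysis of the paper. The mixing angle $\theta$ is left symbolic, and the antineutrino energy of 4 MeV in the numerical line is an assumed illustrative value; only $L = 8.78$ m and $\Delta m^2 = 0.92$ eV$^2$ come from the abstract.

```latex
% Two-flavor electron antineutrino survival probability (standard approximation)
P_{\bar\nu_e \to \bar\nu_e}(L, E) \approx 1 - \sin^2(2\theta)\,
  \sin^2\!\left( 1.267\, \frac{\Delta m^2\,[\mathrm{eV}^2]\; L\,[\mathrm{m}]}{E\,[\mathrm{MeV}]} \right)

% Oscillation phase for L = 8.78 m, \Delta m^2 = 0.92 eV^2, E = 4 MeV (assumed)
1.267 \times \frac{0.92 \times 8.78}{4} \approx 2.56\ \mathrm{rad}
```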

Showing 1–13 of 13 results for author: Gaglione, J