-
InfoQuest: Evaluating Multi-Turn Dialogue Agents for Open-Ended Conversations with Hidden Context
Authors:
Bryan L. M. de Oliveira,
Luana G. B. Martins,
Bruno Brandão,
Luckeciano C. Melo
Abstract:
Large language models excel at following explicit instructions, but they often struggle with ambiguous or incomplete user requests, defaulting to verbose, generic responses instead of seeking clarification. We introduce InfoQuest, a multi-turn chat benchmark designed to evaluate how dialogue agents handle hidden context in open-ended user requests. This benchmark presents intentionally ambiguous scenarios that require models to engage in information-seeking dialogue by asking clarifying questions before providing appropriate responses. Our evaluation of both open and closed models reveals that, while proprietary models generally perform better, all current assistants struggle to gather critical information effectively. They often require multiple turns to infer user intent and frequently default to generic responses without proper clarification. We provide a systematic methodology for generating diverse scenarios and evaluating models' information-seeking capabilities, which can be leveraged to automatically generate data for self-improvement. We also offer insights into the current limitations of language models in handling ambiguous requests through multi-turn interactions.
Submitted 25 April, 2025; v1 submitted 17 February, 2025;
originally announced February 2025.
-
Sliding Puzzles Gym: A Scalable Benchmark for State Representation in Visual Reinforcement Learning
Authors:
Bryan L. M. de Oliveira,
Murilo L. da Luz,
Bruno Brandão,
Luana G. B. Martins,
Telma W. de L. Soares,
Luckeciano C. Melo
Abstract:
Learning effective visual representations enables agents to extract meaningful information from raw sensory inputs, which is essential for generalizing across different tasks. However, evaluating representation learning separately from policy learning remains a challenge with most reinforcement learning (RL) benchmarks. To address this gap, we introduce the Sliding Puzzles Gym (SPGym), a novel benchmark that reimagines the classic 8-tile puzzle with a visual observation space of images sourced from arbitrarily large datasets. SPGym provides precise control over representation complexity through visual diversity, allowing researchers to systematically scale the representation learning challenge while maintaining consistent environment dynamics. Despite the apparent simplicity of the task, our experiments with both model-free and model-based RL algorithms reveal fundamental limitations in current methods. As we increase visual diversity by expanding the pool of possible images, all tested algorithms show significant performance degradation, with even state-of-the-art methods struggling to generalize across different visual inputs while maintaining consistent puzzle-solving capabilities. These results highlight critical gaps in visual representation learning for RL and provide clear directions for improving robustness and generalization in decision-making systems.
Submitted 13 February, 2025; v1 submitted 17 October, 2024;
originally announced October 2024.
-
Bandwidth effects in stimulated Brillouin scattering driven by partially incoherent light
Authors:
B. Brandão,
J. E. Santos,
R. M. G. M. Trines,
R. Bingham,
L. O. Silva
Abstract:
A generalized Wigner-Moyal statistical theory of radiation is used to obtain a general dispersion relation for Stimulated Brillouin Scattering (SBS) driven by a broadband radiation field with arbitrary statistics. The monochromatic limit is recovered from our general result, reproducing the classic monochromatic dispersion relation. The behavior of the growth rate of the instability as a simultaneous function of the bandwidth of the pump wave, the intensity of the incident field and the wave number of the scattered wave is further explored by numerically solving the dispersion relation. Our results show that the growth rate of SBS can be reduced by 1/3 for a bandwidth of 0.3 nm, for typical experimental parameters.
Submitted 8 April, 2021;
originally announced April 2021.
-
MARS-Gym: A Gym framework to model, train, and evaluate Recommender Systems for Marketplaces
Authors:
Marlesson R. O. Santana,
Luckeciano C. Melo,
Fernando H. F. Camargo,
Bruno Brandão,
Anderson Soares,
Renan M. Oliveira,
Sandor Caetano
Abstract:
Recommender Systems are especially challenging for marketplaces since they must maximize user satisfaction while maintaining the healthiness and fairness of such ecosystems. In this context, we observed a lack of resources to design, train, and evaluate agents that learn by interacting within these environments. To this end, we propose MARS-Gym, an open-source framework to empower researchers and engineers to quickly build and evaluate Reinforcement Learning agents for recommendations in marketplaces. MARS-Gym addresses the whole development pipeline: data processing, model design and optimization, and multi-sided evaluation. We also provide the implementation of a diverse set of baseline agents, with a metrics-driven analysis of them on the Trivago marketplace dataset, to illustrate how to conduct a holistic assessment using the available metrics of recommendation, off-policy estimation, and fairness. With MARS-Gym, we expect to bridge the gap between academic research and production systems, as well as to facilitate the design of new algorithms and applications.
Submitted 30 September, 2020;
originally announced October 2020.
-
Measuring ligand-receptor binding kinetics and dynamics using k-space image correlation spectroscopy
Authors:
Hugo B. Brandao,
Hussain Sangji,
Elvis Pandzic,
Susanne Bechstedt,
Gary J. Brouhard,
Paul W. Wiseman
Abstract:
Accurate measurements of kinetic rate constants for interacting biomolecules are crucial for understanding the mechanisms underlying intracellular signalling pathways. The magnitude of binding rates plays a very important molecular regulatory role which can lead to very different cellular physiological responses under different conditions. Here, we extend the k-space image correlation spectroscopy (kICS) technique to study the kinetic binding rates of systems wherein: (a) fluorescently labelled, free ligands in solution interact with unlabelled, diffusing receptors in the plasma membrane and (b) labelled, diffusing receptors are allowed to bind/unbind and interconvert between two different diffusing states on the plasma membrane. We develop the necessary mathematical framework for the kICS analysis and demonstrate how to extract the relevant kinetic binding parameters of the underlying molecular system from fluorescence video-microscopy image time-series. Finally, by examining real data for two model experimental systems, we demonstrate how kICS can be a powerful tool to measure molecular transport coefficients and binding kinetics.
Submitted 25 July, 2013;
originally announced July 2013.