InfoQuest: Evaluating Multi-Turn Dialogue Agents for Open-Ended Conversations with Hidden Context

Authors: Bryan L. M. de Oliveira, Luana G. B. Martins, Bruno Brandão, Luckeciano C. Melo

Abstract: Large language models excel at following explicit instructions, but they often struggle with ambiguous or incomplete user requests, defaulting to verbose, generic responses instead of seeking clarification. We introduce InfoQuest, a multi-turn chat benchmark designed to evaluate how dialogue agents handle hidden context in open-ended user requests. This benchmark presents intentionally ambiguous scenarios that require models to engage in information-seeking dialogue by asking clarifying questions before providing appropriate responses. Our evaluation of both open and closed models reveals that, while proprietary models generally perform better, all current assistants struggle to gather critical information effectively. They often require multiple turns to infer user intent and frequently default to generic responses without proper clarification. We provide a systematic methodology for generating diverse scenarios and evaluating models' information-seeking capabilities, which can be leveraged to automatically generate data for self-improvement. We also offer insights into the current limitations of language models in handling ambiguous requests through multi-turn interactions.

Submitted 25 April, 2025; v1 submitted 17 February, 2025; originally announced February 2025.

arXiv:2410.14038

Sliding Puzzles Gym: A Scalable Benchmark for State Representation in Visual Reinforcement Learning

Authors: Bryan L. M. de Oliveira, Murilo L. da Luz, Bruno Brandão, Luana G. B. Martins, Telma W. de L. Soares, Luckeciano C. Melo

Abstract: Learning effective visual representations enables agents to extract meaningful information from raw sensory inputs, which is essential for generalizing across different tasks. However, evaluating representation learning separately from policy learning remains a challenge with most reinforcement learning (RL) benchmarks. To address this gap, we introduce the Sliding Puzzles Gym (SPGym), a novel benchmark that reimagines the classic 8-tile puzzle with a visual observation space of images sourced from arbitrarily large datasets. SPGym provides precise control over representation complexity through visual diversity, allowing researchers to systematically scale the representation learning challenge while maintaining consistent environment dynamics. Despite the apparent simplicity of the task, our experiments with both model-free and model-based RL algorithms reveal fundamental limitations in current methods. As we increase visual diversity by expanding the pool of possible images, all tested algorithms show significant performance degradation, with even state-of-the-art methods struggling to generalize across different visual inputs while maintaining consistent puzzle-solving capabilities. These results highlight critical gaps in visual representation learning for RL and provide clear directions for improving robustness and generalization in decision-making systems.

Submitted 13 February, 2025; v1 submitted 17 October, 2024; originally announced October 2024.

arXiv:2104.04097

doi 10.1088/1361-6587/ac11b5

Bandwidth effects in stimulated Brillouin scattering driven by partially incoherent light

Authors: B. Brandão, J. E. Santos, R. M. G. M. Trines, R. Bingham, L. O. Silva

Abstract: A generalized Wigner-Moyal statistical theory of radiation is used to obtain a general dispersion relation for Stimulated Brillouin Scattering (SBS) driven by a broadband radiation field with arbitrary statistics. The monochromatic limit is recovered from our general result, reproducing the classic monochromatic dispersion relation. The behavior of the growth rate of the instability as a simultaneous function of the bandwidth of the pump wave, the intensity of the incident field and the wave number of the scattered wave is further explored by numerically solving the dispersion relation. Our results show that the growth rate of SBS can be reduced by 1/3 for a bandwidth of 0.3 nm, for typical experimental parameters.

Submitted 8 April, 2021; originally announced April 2021.

Comments: 23 pages, 4 figures

arXiv:2010.07035

MARS-Gym: A Gym framework to model, train, and evaluate Recommender Systems for Marketplaces

Authors: Marlesson R. O. Santana, Luckeciano C. Melo, Fernando H. F. Camargo, Bruno Brandão, Anderson Soares, Renan M. Oliveira, Sandor Caetano

Abstract: Recommender Systems are especially challenging for marketplaces since they must maximize user satisfaction while maintaining the healthiness and fairness of such ecosystems. In this context, we observed a lack of resources to design, train, and evaluate agents that learn by interacting within these environments. For this matter, we propose MARS-Gym, an open-source framework to empower researchers and engineers to quickly build and evaluate Reinforcement Learning agents for recommendations in marketplaces. MARS-Gym addresses the whole development pipeline: data processing, model design and optimization, and multi-sided evaluation. We also provide the implementation of a diverse set of baseline agents, with a metrics-driven analysis of them in the Trivago marketplace dataset, to illustrate how to conduct a holistic assessment using the available metrics of recommendation, off-policy estimation, and fairness. With MARS-Gym, we expect to bridge the gap between academic research and production systems, as well as to facilitate the design of new algorithms and applications.

Submitted 30 September, 2020; originally announced October 2020.

Comments: 15 pages, 14 figures, see https://github.com/deeplearningbrasil/mars-gym

ACM Class: I.6.5; H.4.2

arXiv:1307.6870

doi 10.1016/j.ymeth.2013.07.042

Measuring ligand-receptor binding kinetics and dynamics using k-space image correlation spectroscopy

Authors: Hugo B. Brandao, Hussain Sangji, Elvis Pandzic, Susanne Bechstedt, Gary J. Brouhard, Paul W. Wiseman

Abstract: Accurate measurements of kinetic rate constants for interacting biomolecules are crucial for understanding the mechanisms underlying intracellular signalling pathways. The magnitude of binding rates plays a very important molecular regulatory role which can lead to very different cellular physiological responses under different conditions. Here, we extend the k-space image correlation spectroscopy (kICS) technique to study the kinetic binding rates of systems wherein: (a) fluorescently labelled, free ligands in solution interact with unlabelled, diffusing receptors in the plasma membrane and (b) labelled, diffusing receptors are allowed to bind/unbind and interconvert between two different diffusing states on the plasma membrane. We develop the necessary mathematical framework for the kICS analysis and demonstrate how to extract the relevant kinetic binding parameters of the underlying molecular system from fluorescence video-microscopy image time-series. Finally, by examining real data for two model experimental systems, we demonstrate how kICS can be a powerful tool to measure molecular transport coefficients and binding kinetics.

Submitted 25 July, 2013; originally announced July 2013.

Comments: 14 pages, 5 figures
