-
Hierarchical Lexical Graph for Enhanced Multi-Hop Retrieval
Authors:
Abdellah Ghassel,
Ian Robinson,
Gabriel Tanase,
Hal Cooper,
Bryan Thompson,
Zhen Han,
Vassilis N. Ioannidis,
Soji Adeshina,
Huzefa Rangwala
Abstract:
Retrieval-Augmented Generation (RAG) grounds large language models in external evidence, yet it still falters when answers must be pieced together across semantically distant documents. We close this gap with the Hierarchical Lexical Graph (HLG), a three-tier index that (i) traces every atomic proposition to its source, (ii) clusters propositions into latent topics, and (iii) links entities and relations to expose cross-document paths. On top of HLG we build two complementary, plug-and-play retrievers: StatementGraphRAG, which performs fine-grained entity-aware beam search over propositions for high-precision factoid questions, and TopicGraphRAG, which selects coarse topics before expanding along entity links to supply broad yet relevant context for exploratory queries. Additionally, existing benchmarks lack the complexity required to rigorously evaluate multi-hop summarization systems, often focusing on single-document queries or limited datasets. To address this, we introduce a synthetic dataset generation pipeline that curates realistic, multi-document question-answer pairs, enabling robust evaluation of multi-hop retrieval systems. Extensive experiments across five datasets demonstrate that our methods outperform naive chunk-based RAG, achieving an average relative improvement of 23.1% in retrieval recall and correctness. An open-source Python library is available at https://github.com/awslabs/graphrag-toolkit.
Submitted 9 June, 2025;
originally announced June 2025.
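The three tiers named in the abstract (provenance, topics, entity links) can be pictured with a toy in-memory structure. This is a minimal sketch for intuition only: the names `Statement`, `ToyHLG`, and `expand` are hypothetical, the topic labels are assigned by hand rather than clustered, and none of this is the graphrag-toolkit API or the paper's beam search.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Statement:
    text: str
    source: str           # tier (i): where the proposition came from
    topic: str            # tier (ii): hand-assigned topic label (assumption)
    entities: frozenset   # tier (iii): entities mentioned in the proposition


class ToyHLG:
    """Hypothetical toy index; not the authors' implementation."""

    def __init__(self, statements):
        self.statements = list(statements)
        self.by_topic = defaultdict(list)
        self.by_entity = defaultdict(list)
        for s in self.statements:
            self.by_topic[s.topic].append(s)
            for e in s.entities:
                self.by_entity[e].append(s)

    def expand(self, seed_entities, hops=2):
        """Collect statements reachable from the seeds via shared entities."""
        seen = set(seed_entities)
        frontier = set(seed_entities)
        found = []
        for _ in range(hops):
            next_frontier = set()
            for e in sorted(frontier):
                for s in self.by_entity[e]:
                    if s not in found:
                        found.append(s)
                    next_frontier |= s.entities - seen
            seen |= next_frontier
            frontier = next_frontier
        return found


# Made-up example statements spread over three documents.
graph = ToyHLG([
    Statement("Ada founded Acme.", "doc1", "companies", frozenset({"Ada", "Acme"})),
    Statement("Acme is based in Lyon.", "doc2", "locations", frozenset({"Acme", "Lyon"})),
    Statement("Bob likes tea.", "doc3", "misc", frozenset({"Bob"})),
])
two_hop_sources = [s.source for s in graph.expand({"Ada"}, hops=2)]
```

Starting from the entity "Ada", the second hop reaches a statement in a different document through the shared entity "Acme", which is the kind of cross-document path the abstract refers to.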
-
Investigating A Geometrical Solution to the Vergence-Accommodation Conflict for Targeted Movements in Virtual Reality
Authors:
Xiaoye Michael Wang,
Matthew Prenevost,
Aneesh Tarun,
Ian Robinson,
Michael Nitsche,
Gabby Resch,
Ali Mazalek,
Timothy N. Welsh
Abstract:
While virtual reality (VR) holds significant potential to revolutionize digital user interaction, how visual information is presented through VR head-mounted displays (HMDs) differs from naturalistic viewing and interactions in physical environments, leading to performance decrements. One critical challenge in VR development is the vergence-accommodation conflict (VAC), which arises due to the intrinsic constraints of approximating the natural viewing geometry through digital displays. Although various hardware and software solutions have been proposed to address VAC, no commercially viable option has been universally adopted by manufacturers. This paper presents and evaluates a software solution grounded in a vision-based geometrical model of VAC that mediates VAC's impact on movement in VR. This model predicts the impact of VAC as a constant offset to the vergence angle, distorting the binocular viewing geometry in a way that results in movement undershooting. In Experiment 1, a 3D pointing task validated the model's predictions and demonstrated that VAC primarily affects online movements involving real-time visual feedback. Experiment 2 implemented a shader program to rectify the effect of VAC, improving movement accuracy by approximately 30%. Overall, this work presents a practical approach to reducing the impact of VAC on HMD-based manual interactions, enhancing the user experience in virtual environments.
Submitted 29 May, 2025;
originally announced May 2025.
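The idea of a constant offset to the vergence angle can be illustrated with standard binocular geometry. The sketch below assumes symmetric fixation straight ahead, an interpupillary distance of 63 mm, and a positive offset; the function names and the sign convention are assumptions made for illustration, and the paper's actual model may be parameterised differently.

```python
import math


def vergence_angle(distance_m, ipd_m=0.063):
    """Vergence angle (radians) when fixating a point straight ahead."""
    return 2.0 * math.atan(ipd_m / (2.0 * distance_m))


def distance_from_vergence(angle_rad, ipd_m=0.063):
    """Inverse of vergence_angle: fixation distance implied by an angle."""
    return ipd_m / (2.0 * math.tan(angle_rad / 2.0))


def perceived_distance(distance_m, offset_rad, ipd_m=0.063):
    """Distance implied when a constant offset is added to the vergence angle.

    Assumption: a positive offset enlarges the angle, so the target
    appears nearer than it is, which would produce undershooting.
    """
    shifted = vergence_angle(distance_m, ipd_m) + offset_rad
    return distance_from_vergence(shifted, ipd_m)


# Hypothetical numbers: a target at 40 cm with a 0.5 degree offset.
target = 0.40
seen_at = perceived_distance(target, math.radians(0.5))
```

Because the same angular offset corresponds to a larger change in distance for far targets than for near ones, a correction derived from this geometry would need to depend on target depth rather than being a fixed shift.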
-
Roboflow100-VL: A Multi-Domain Object Detection Benchmark for Vision-Language Models
Authors:
Peter Robicheaux,
Matvei Popov,
Anish Madan,
Isaac Robinson,
Joseph Nelson,
Deva Ramanan,
Neehar Peri
Abstract:
Vision-language models (VLMs) trained on internet-scale data achieve remarkable zero-shot detection performance on common objects like car, truck, and pedestrian. However, state-of-the-art models still struggle to generalize to out-of-distribution classes, tasks and imaging modalities not typically found in their pre-training. Rather than simply re-training VLMs on more visual data, we argue that one should align VLMs to new concepts with annotation instructions containing a few visual examples and rich textual descriptions. To this end, we introduce Roboflow100-VL, a large-scale collection of 100 multi-modal object detection datasets with diverse concepts not commonly found in VLM pre-training. We evaluate state-of-the-art models on our benchmark in zero-shot, few-shot, semi-supervised, and fully-supervised settings, allowing for comparison across data regimes. Notably, we find that VLMs like GroundingDINO and Qwen2.5-VL achieve less than 2% zero-shot accuracy on challenging medical imaging datasets within Roboflow100-VL, demonstrating the need for few-shot concept alignment. Lastly, we discuss our recent CVPR 2025 Foundational FSOD competition and share insights from the community. Notably, the winning team significantly outperforms our baseline by 16.8 mAP! Our code and dataset are available at https://github.com/roboflow/rf100-vl/ and https://universe.roboflow.com/rf100-vl/
Submitted 16 June, 2025; v1 submitted 26 May, 2025;
originally announced May 2025.
-
Framing the Game: How Context Shapes LLM Decision-Making
Authors:
Isaac Robinson,
John Burden
Abstract:
Large Language Models (LLMs) are increasingly deployed across diverse contexts to support decision-making. While existing evaluations effectively probe latent model capabilities, they often overlook the impact of context framing on perceived rational decision-making. In this study, we introduce a novel evaluation framework that systematically varies evaluation instances across key features and procedurally generates vignettes to create highly varied scenarios. By analyzing decision-making patterns across different contexts with the same underlying game structure, we uncover significant contextual variability in LLM responses. Our findings demonstrate that this variability is largely predictable yet highly sensitive to framing effects. Our results underscore the need for dynamic, context-aware evaluation methodologies for real-world deployments.
Submitted 5 March, 2025;
originally announced March 2025.
-
From Independence of Clones to Composition Consistency: A Hierarchy of Barriers to Strategic Nomination
Authors:
Ratip Emin Berker,
Sílvia Casacuberta,
Isaac Robinson,
Christopher Ong,
Vincent Conitzer,
Edith Elkind
Abstract:
We study two axioms for social choice functions that capture the impact of similar candidates: independence of clones (IoC) and composition consistency (CC). We clarify the relationship between these axioms by observing that CC is strictly more demanding than IoC, and investigate whether common voting rules that are known to be independent of clones (such as STV, Ranked Pairs, Schulze, and Split Cycle) are composition-consistent. While for most of these rules the answer is negative, we identify a variant of Ranked Pairs that satisfies CC. Further, we show how to efficiently modify any (neutral) social choice function so that it satisfies CC, while maintaining its other desirable properties. Our transformation relies on the hierarchical representation of clone structures via PQ-trees. We extend our analysis to social preference functions. Finally, we interpret IoC and CC as measures of robustness against strategic manipulation by candidates, with IoC corresponding to strategy-proofness and CC corresponding to obvious strategy-proofness.
Submitted 24 February, 2025;
originally announced February 2025.
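For readers unfamiliar with clones: a set of candidates is a clone set when its members appear next to each other in every voter's ranking. The sketch below checks that condition and shows, on a made-up profile, the well-known vote-splitting effect that makes plurality fail independence of clones. The function names and the profile are illustrative assumptions, not taken from the paper.

```python
from collections import Counter


def is_clone_set(profile, candidates):
    """True if the candidates are ranked contiguously by every voter."""
    k = len(candidates)
    if k == 0:
        return False
    for ranking in profile:
        positions = [i for i, c in enumerate(ranking) if c in candidates]
        if len(positions) != k or positions[-1] - positions[0] != k - 1:
            return False
    return True


def plurality_winner(profile):
    """Candidate with the most first-place votes (ties broken alphabetically)."""
    tally = Counter(ranking[0] for ranking in profile)
    return max(sorted(tally), key=lambda c: tally[c])


# Made-up profile: 5 voters prefer A to B, 4 prefer B to A.
before = [("A", "B")] * 5 + [("B", "A")] * 4

# The same electorate after A is replaced by two clones, A1 and A2.
after = (
    [("A1", "A2", "B")] * 3
    + [("A2", "A1", "B")] * 2
    + [("B", "A1", "A2")] * 4
)
```

Here `plurality_winner(before)` is A, but after cloning the first-place votes split 3 and 2 between A1 and A2, so B wins with 4. A rule that is independent of clones would not let nominating A2 change the outcome in this way.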
-
Streamlining Equal Shares
Authors:
Sonja Kraiczy,
Isaac Robinson,
Edith Elkind
Abstract:
Participatory budgeting (PB) is a form of citizen participation that allows citizens to decide how public funds are spent. Through an election, citizens express their preferences on various projects (spending proposals). A voting mechanism then determines which projects will be approved. The Method of Equal Shares (MES) is the state-of-the-art algorithm for a proportional, voting-based approach to participatory budgeting and has been implemented in cities across Poland and Switzerland. A significant drawback of MES is that it is not exhaustive, meaning that it often leaves a portion of the budget unspent that could be used to fund additional projects. To address this, in practice the algorithm is combined with a completion heuristic, most often the "add-one" heuristic, which artificially increases the budget until a heuristically chosen threshold is reached. This heuristic is computationally inefficient and will become computationally impractical if PB is employed on a larger scale. We propose the more efficient add-opt heuristic for Exact Equal Shares (EES), a variation of MES that is known to retain many of its desirable properties. We solve the problem of identifying the next budget for which the outcome for EES changes in $O(mn)$ time for cardinal utilities and $O(m^2n)$ time for uniform utilities, where $m$ is the number of projects and $n$ is the number of voters. Our solution to this problem inspires the efficient add-opt heuristic, which bypasses the need to search through each intermediary budget. We perform comprehensive experiments on real-world PB instances from Pabulib and show that completed EES outcomes usually match the proportion of budget spent by completed MES outcomes. Furthermore, the add-opt heuristic matches the proportion of budget spent by add-one for EES.
Submitted 17 February, 2025;
originally announced February 2025.
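The non-exhaustiveness problem and the budget-inflating completion described above can be seen on a tiny instance. What follows is a simplified sketch of the Method of Equal Shares for approval ballots, in which supporters split a project's cost as evenly as their remaining shares allow, plus a naive add-one style loop. It assumes positive integer costs and an integer budget, uses hypothetical function names, and is neither EES nor the paper's add-opt heuristic.

```python
from fractions import Fraction


def equal_shares(costs, approvals, budget):
    """Simplified MES: costs maps project -> cost, approvals maps voter -> set."""
    share = {v: Fraction(budget, len(approvals)) for v in approvals}
    chosen, remaining = [], set(costs)
    while True:
        best = None
        for p in sorted(remaining):
            supporters = [v for v in approvals if p in approvals[v]]
            if not supporters or sum(share[v] for v in supporters) < costs[p]:
                continue  # supporters cannot afford p together
            # Smallest rho with sum(min(share, rho)) == cost, poorest first.
            paid, k = Fraction(0), len(supporters)
            for v in sorted(supporters, key=lambda v: share[v]):
                rho = (costs[p] - paid) / k
                if rho <= share[v]:
                    break
                paid += share[v]
                k -= 1
            if best is None or rho < best[0]:
                best = (rho, p, supporters)
        if best is None:
            return chosen
        rho, p, supporters = best
        for v in supporters:
            share[v] -= min(share[v], rho)
        chosen.append(p)
        remaining.remove(p)


def add_one_completion(costs, approvals, budget, step=1):
    """Raise a virtual budget by `step` per voter while the outcome stays affordable."""
    n, total = len(approvals), sum(costs.values())
    virtual = budget
    best = equal_shares(costs, approvals, virtual)
    while len(best) < len(costs) and virtual < n * total:
        virtual += step * n
        outcome = equal_shares(costs, approvals, virtual)
        if sum(costs[p] for p in outcome) > budget:
            break
        best = outcome
    return best


# Made-up instance: budget 12, four voters, so each starts with a share of 3.
costs = {"A": 8, "B": 4}
approvals = {"v1": {"A"}, "v2": {"A"}, "v3": {"A"}, "v4": {"B"}}
plain = equal_shares(costs, approvals, 12)
completed = add_one_completion(costs, approvals, 12)
```

On this instance the plain rule funds only A, because B's single supporter holds a share of 3 against a cost of 4, even though 4 units of the budget remain. The completion loop reruns the rule from scratch at every virtual budget, which is the kind of repeated work the abstract calls inefficient.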
-
Compliance Cards: Automated EU AI Act Compliance Analyses amidst a Complex AI Supply Chain
Authors:
Bill Marino,
Yaqub Chaudhary,
Yulu Pi,
Rui-Jie Yew,
Preslav Aleksandrov,
Carwyn Rahman,
William F. Shen,
Isaac Robinson,
Nicholas D. Lane
Abstract:
As the AI supply chain grows more complex, AI systems and models are increasingly likely to incorporate multiple internally- or externally-sourced components such as datasets and (pre-trained) models. In such cases, determining whether or not the aggregate AI system or model complies with the EU AI Act (AIA) requires a multi-step process in which compliance-related information about both the AI system or model and all its component parts is: (1) gathered, potentially from multiple arms-length sources; (2) harmonized, if necessary; (3) inputted into an analysis that looks across all of it to render a compliance prediction. Because this process is so complex and time-consuming, it threatens to overburden the limited compliance resources of the AI providers (i.e., developers) who bear much of the responsibility for complying with the AIA. It also renders rapid or real-time compliance analyses infeasible in many AI development scenarios where they would be beneficial to providers. To address these shortcomings, we introduce a complete system for automating provider-side AIA compliance analyses amidst a complex AI supply chain. This system has two key elements. First is an interlocking set of computational, multi-stakeholder transparency artifacts that capture AIA-specific metadata about both: (1) the provider's overall AI system or model; and (2) the datasets and pre-trained models it incorporates as components. Second is an algorithm that operates across all those artifacts to render a real-time prediction about whether or not the aggregate AI system or model complies with the AIA. All told, this system promises to dramatically facilitate and democratize provider-side AIA compliance analyses (and, perhaps by extension, provider-side AIA compliance).
Submitted 12 September, 2024; v1 submitted 20 June, 2024;
originally announced June 2024.
-
Obvious Independence of Clones
Authors:
Ratip Emin Berker,
Sílvia Casacuberta,
Christopher Ong,
Isaac Robinson
Abstract:
The Independence of Clones (IoC) criterion measures a voting rule's robustness to strategic nomination. Prior literature has established empirically that individuals may still submit costly, distortionary misreports even in strategy-proof (SP) settings, due to failure to recognize the SP property. The intersection of these issues motivates the search for mechanisms that are Obviously Independent of Clones (OIoC): where strategic nomination/exiting of clones obviously has no effect on the outcome. We construct a formal and intuitive definition of a voting rule being OIoC and examine five IoC rules to identify whether they satisfy OIoC.
Submitted 23 February, 2025; v1 submitted 10 October, 2022;
originally announced October 2022.
-
Interpretable Visualizations with Differentiating Embedding Networks
Authors:
Isaac Robinson
Abstract:
We present a visualization algorithm based on a novel unsupervised Siamese neural network training regime and loss function, called Differentiating Embedding Networks (DEN). The Siamese neural network finds differentiating or similar features between specific pairs of samples in a dataset, and uses these features to embed the dataset in a lower dimensional space where it can be visualized. Unlike existing visualization algorithms such as UMAP or $t$-SNE, DEN is parametric, meaning it can be interpreted by techniques such as SHAP. To interpret DEN, we create an end-to-end parametric clustering algorithm on top of the visualization, and then leverage SHAP scores to determine which features in the sample space are important for understanding the structures shown in the visualization based on the clusters found. We compare DEN visualizations with existing techniques on a variety of datasets, including image and scRNA-seq data. We then show that our clustering algorithm performs similarly to the state of the art despite not having prior knowledge of the number of clusters, and sets a new state of the art on FashionMNIST. Finally, we demonstrate finding differentiating features of a dataset. Code available at https://github.com/isaacrob/DEN
Submitted 11 June, 2020;
originally announced June 2020.
-
Tree-SNE: Hierarchical Clustering and Visualization Using t-SNE
Authors:
Isaac Robinson,
Emma Pierce-Hoffman
Abstract:
t-SNE and hierarchical clustering are popular methods of exploratory data analysis, particularly in biology. Building on recent advances in speeding up t-SNE and obtaining finer-grained structure, we combine the two to create tree-SNE, a hierarchical clustering and visualization algorithm based on stacked one-dimensional t-SNE embeddings. We also introduce alpha-clustering, which recommends the optimal cluster assignment, without foreknowledge of the number of clusters, based on cluster stability across multiple scales. We demonstrate the effectiveness of tree-SNE and alpha-clustering on images of handwritten digits, mass cytometry (CyTOF) data from blood cells, and single-cell RNA-sequencing (scRNA-seq) data from retinal cells. Furthermore, to demonstrate the validity of the visualization, we use alpha-clustering to obtain unsupervised clustering results competitive with the state of the art on several image data sets. Software is available at https://github.com/isaacrob/treesne.
Submitted 13 February, 2020;
originally announced February 2020.