Showing 1–12 of 12 results for author: Webb, T W

  1. arXiv:2506.14797

    cs.LG cs.AI

    Bound by semanticity: universal laws governing the generalization-identification tradeoff

    Authors: Marco Nurisso, Jesseba Fernando, Raj Deshpande, Alan Perotti, Raja Marjieh, Steven M. Frankland, Richard L. Lewis, Taylor W. Webb, Declan Campbell, Francesco Vaccarino, Jonathan D. Cohen, Giovanni Petri

    Abstract: Intelligent systems must deploy internal representations that are simultaneously structured -- to support broad generalization -- and selective -- to preserve input identity. We expose a fundamental limit on this tradeoff. For any model whose representational similarity between inputs decays with finite semantic resolution $\varepsilon$, we derive closed-form expressions that pin its probability o…

    Submitted 1 June, 2025; originally announced June 2025.

  2. arXiv:2505.21538

    cs.CV cs.AI

    Caption This, Reason That: VLMs Caught in the Middle

    Authors: Zihan Weng, Lucas Gomez, Taylor Whittington Webb, Pouya Bashivan

    Abstract: Vision-Language Models (VLMs) have shown remarkable progress in visual understanding in recent years. Yet, they still lag behind human capabilities in specific visual tasks such as counting or relational reasoning. To understand the underlying limitations, we adopt methodologies from cognitive science, analyzing VLM performance along core cognitive axes: Perception, Attention, and Memory. Using a…

    Submitted 24 May, 2025; originally announced May 2025.

  3. arXiv:2503.23125

    cs.CV cs.AI

    Evaluating Compositional Scene Understanding in Multimodal Generative Models

    Authors: Shuhao Fu, Andrew Jun Lee, Anna Wang, Ida Momennejad, Trevor Bihl, Hongjing Lu, Taylor W. Webb

    Abstract: The visual world is fundamentally compositional. Visual scenes are defined by the composition of objects and their relations. Hence, it is essential for computer vision systems to reflect and exploit this compositionality to achieve robust and generalizable scene understanding. While major strides have been made toward the development of general-purpose, multimodal generative models, including bot…

    Submitted 29 March, 2025; originally announced March 2025.

  4. arXiv:2411.00238

    cs.AI cs.CV cs.LG q-bio.NC

    Understanding the Limits of Vision Language Models Through the Lens of the Binding Problem

    Authors: Declan Campbell, Sunayana Rane, Tyler Giallanza, Nicolò De Sabbata, Kia Ghods, Amogh Joshi, Alexander Ku, Steven M. Frankland, Thomas L. Griffiths, Jonathan D. Cohen, Taylor W. Webb

    Abstract: Recent work has documented striking heterogeneity in the performance of state-of-the-art vision language models (VLMs), including both multimodal language models and text-to-image models. These models are able to describe and generate a diverse array of complex, naturalistic images, yet they exhibit surprising failures on basic multi-object reasoning tasks -- such as counting, localization, and si…

    Submitted 16 April, 2025; v1 submitted 31 October, 2024; originally announced November 2024.

  5. arXiv:2403.03458

    cs.CV cs.LG

    Slot Abstractors: Toward Scalable Abstract Visual Reasoning

    Authors: Shanka Subhra Mondal, Jonathan D. Cohen, Taylor W. Webb

    Abstract: Abstract visual reasoning is a characteristically human ability, allowing the identification of relational patterns that are abstracted away from object features, and the systematic generalization of those patterns to unseen problems. Recent work has demonstrated strong systematic generalization in visual reasoning tasks involving multi-object inputs, through the integration of slot-based methods…

    Submitted 2 June, 2024; v1 submitted 5 March, 2024; originally announced March 2024.

    Comments: 18 pages, 9 figures

  6. arXiv:2309.06629

    cs.AI cs.NE

    The Relational Bottleneck as an Inductive Bias for Efficient Abstraction

    Authors: Taylor W. Webb, Steven M. Frankland, Awni Altabaa, Simon Segert, Kamesh Krishnamurthy, Declan Campbell, Jacob Russin, Tyler Giallanza, Zack Dulberg, Randall O'Reilly, John Lafferty, Jonathan D. Cohen

    Abstract: A central challenge for cognitive science is to explain how abstract concepts are acquired from limited experience. This has often been framed in terms of a dichotomy between connectionist and symbolic cognitive models. Here, we highlight a recently emerging line of work that suggests a novel reconciliation of these approaches, by exploiting an inductive bias that we term the relational bottleneck…

    Submitted 1 May, 2024; v1 submitted 12 September, 2023; originally announced September 2023.

  7. arXiv:2306.02500

    cs.CV

    Systematic Visual Reasoning through Object-Centric Relational Abstraction

    Authors: Taylor W. Webb, Shanka Subhra Mondal, Jonathan D. Cohen

    Abstract: Human visual reasoning is characterized by an ability to identify abstract patterns from only a small number of examples, and to systematically generalize those patterns to novel inputs. This capacity depends in large part on our ability to represent complex visual inputs in terms of both objects and relations. Recent work in computer vision has introduced models with the capacity to extract objec…

    Submitted 10 November, 2023; v1 submitted 4 June, 2023; originally announced June 2023.

  8. arXiv:2209.15087

    cs.CV cs.AI

    Zero-shot visual reasoning through probabilistic analogical mapping

    Authors: Taylor W. Webb, Shuhao Fu, Trevor Bihl, Keith J. Holyoak, Hongjing Lu

    Abstract: Human reasoning is grounded in an ability to identify highly abstract commonalities governing superficially dissimilar visual inputs. Recent efforts to develop algorithms with this capacity have largely focused on approaches that require extensive direct training on visual reasoning tasks, and yield limited generalization to problems with novel content. In contrast, a long tradition of research in…

    Submitted 29 September, 2022; originally announced September 2022.

  9. arXiv:2110.04906

    cs.CV cs.AI cs.LG

    Operationalizing Convolutional Neural Network Architectures for Prohibited Object Detection in X-Ray Imagery

    Authors: Thomas W. Webb, Neelanjan Bhowmik, Yona Falinie A. Gaus, Toby P. Breckon

    Abstract: The recent advancement in deep Convolutional Neural Network (CNN) has brought insight into the automation of X-ray security screening for aviation security and beyond. Here, we explore the viability of two recent end-to-end object detection CNN architectures, Cascade R-CNN and FreeAnchor, for prohibited item detection by balancing processing time and the impact of image data compression from an op…

    Submitted 10 October, 2021; originally announced October 2021.

  10. arXiv:2012.14601

    cs.AI cs.LG cs.NE

    Emergent Symbols through Binding in External Memory

    Authors: Taylor W. Webb, Ishan Sinha, Jonathan D. Cohen

    Abstract: A key aspect of human intelligence is the ability to infer abstract rules directly from high-dimensional sensory data, and to do so given only a limited amount of training experience. Deep neural network algorithms have proven to be a powerful tool for learning directly from high-dimensional data, but currently lack this capacity for data-efficient induction of abstract rules, leading some to argu…

    Submitted 9 March, 2021; v1 submitted 28 December, 2020; originally announced December 2020.

  11. arXiv:2012.07172

    cs.AI cs.LG

    A Memory-Augmented Neural Network Model of Abstract Rule Learning

    Authors: Ishan Sinha, Taylor W. Webb, Jonathan D. Cohen

    Abstract: Human intelligence is characterized by a remarkable ability to infer abstract rules from experience and apply these rules to novel domains. As such, designing neural network algorithms with this capacity is an important step toward the development of deep learning systems with more human-like intelligence. However, doing so is a major outstanding challenge, one that some argue will require neural…

    Submitted 14 December, 2020; v1 submitted 13 December, 2020; originally announced December 2020.

  12. arXiv:2007.05059

    cs.CV

    Learning Representations that Support Extrapolation

    Authors: Taylor W. Webb, Zachary Dulberg, Steven M. Frankland, Alexander A. Petrov, Randall C. O'Reilly, Jonathan D. Cohen

    Abstract: Extrapolation -- the ability to make inferences that go beyond the scope of one's experiences -- is a hallmark of human intelligence. By contrast, the generalization exhibited by contemporary neural network algorithms is largely limited to interpolation between data points in their training corpora. In this paper, we consider the challenge of learning representations that support extrapolation. We…

    Submitted 6 September, 2023; v1 submitted 9 July, 2020; originally announced July 2020.

    Comments: ICML 2020