Showing 1–2 of 2 results for author: Marsili, D

Searching in archive cs.
  1. arXiv:2506.07731  [pdf, ps, other]

    cs.AI

    NeurIPS 2025 E2LM Competition: Early Training Evaluation of Language Models

    Authors: Mouadh Yagoubi, Yasser Dahou, Billel Mokeddem, Younes Belkada, Phuc H. Le-Khac, Basma El Amel Boussaha, Reda Alami, Jingwei Zuo, Damiano Marsili, Mugariya Farooq, Mounia Lalmas, Georgia Gkioxari, Patrick Gallinari, Philip Torr, Hakim Hacid

    Abstract: Existing benchmarks have proven effective for assessing the performance of fully trained large language models. However, we find striking differences in the early training stages of small models, where benchmarks often fail to provide meaningful or discriminative signals. To explore how these differences arise, this competition tackles the challenge of designing scientific knowledge evaluation tas…

    Submitted 9 June, 2025; originally announced June 2025.

  2. arXiv:2502.06787  [pdf, other]

    cs.CV

    Visual Agentic AI for Spatial Reasoning with a Dynamic API

    Authors: Damiano Marsili, Rohun Agrawal, Yisong Yue, Georgia Gkioxari

    Abstract: Visual reasoning -- the ability to interpret the visual world -- is crucial for embodied agents that operate within three-dimensional scenes. Progress in AI has led to vision and language models capable of answering questions from images. However, their performance declines when tasked with 3D spatial reasoning. To tackle the complexity of such reasoning problems, we introduce an agentic program s…

    Submitted 27 March, 2025; v1 submitted 10 February, 2025; originally announced February 2025.

    Comments: Project website: https://glab-caltech.github.io/vadar/