Showing 1–50 of 426 results for author: De, T

Searching in archive cs.
  1. arXiv:2506.10897  [pdf, ps, other]

    cs.AI

    GenPlanX. Generation of Plans and Execution

    Authors: Daniel Borrajo, Giuseppe Canonaco, Tomás de la Rosa, Alfredo Garrachón, Sriram Gopalakrishnan, Simerjot Kaur, Marianela Morales, Sunandita Patra, Alberto Pozanco, Keshav Ramani, Charese Smiley, Pietro Totis, Manuela Veloso

    Abstract: Classical AI Planning techniques generate sequences of actions for complex tasks. However, they lack the ability to understand planning tasks when provided using natural language. The advent of Large Language Models (LLMs) has introduced novel capabilities in human-computer interaction. In the context of planning tasks, LLMs have shown to be particularly good in interpreting human intents among ot…

    Submitted 12 June, 2025; originally announced June 2025.

  2. arXiv:2506.09381  [pdf]

    cs.CL

    Binary classification for perceived quality of headlines and links on worldwide news websites, 2018-2024

    Authors: Austin McCutcheon, Thiago E. A. de Oliveira, Aleksandr Zheleznov, Chris Brogly

    Abstract: The proliferation of online news enables potential widespread publication of perceived low-quality news headlines/links. As a result, we investigated whether it was possible to automatically distinguish perceived lower-quality news headlines/links from perceived higher-quality headlines/links. We evaluated twelve machine learning models on a binary, balanced dataset of 57,544,214 worldwide news we…

    Submitted 11 June, 2025; originally announced June 2025.

  3. arXiv:2506.01208  [pdf, ps, other]

    cs.LG

    Multiresolution Analysis and Statistical Thresholding on Dynamic Networks

    Authors: Raphaël Romero, Tijl De Bie, Nick Heard, Alexander Modell

    Abstract: Detecting structural change in dynamic network data has wide-ranging applications. Existing approaches typically divide the data into time bins, extract network features within each bin, and then compare these features over time. This introduces an inherent tradeoff between temporal resolution and the statistical stability of the extracted features. Despite this tradeoff, reminiscent of time-frequ…

    Submitted 1 June, 2025; originally announced June 2025.

  4. arXiv:2505.24784  [pdf, ps, other]

    cs.AI cs.LG stat.ML

    AXIOM: Learning to Play Games in Minutes with Expanding Object-Centric Models

    Authors: Conor Heins, Toon Van de Maele, Alexander Tschantz, Hampus Linander, Dimitrije Markovic, Tommaso Salvatori, Corrado Pezzato, Ozan Catal, Ran Wei, Magnus Koudahl, Marco Perin, Karl Friston, Tim Verbelen, Christopher Buckley

    Abstract: Current deep reinforcement learning (DRL) approaches achieve state-of-the-art performance in various domains, but struggle with data efficiency compared to human learning, which leverages core priors about objects and their interactions. Active inference offers a principled framework for integrating sensory information with prior knowledge to learn a world model and quantify the uncertainty of its…

    Submitted 30 May, 2025; originally announced May 2025.

    Comments: 10 pages main text, 4 figures, 2 tables; 25 pages supplementary material, 8 figures

  5. arXiv:2505.22114  [pdf, ps, other]

    cs.LG

    BiMi Sheets: Infosheets for bias mitigation methods

    Authors: MaryBeth Defrance, Guillaume Bied, Maarten Buyl, Jefrey Lijffijt, Tijl De Bie

    Abstract: Over the past 15 years, hundreds of bias mitigation methods have been proposed in the pursuit of fairness in machine learning (ML). However, algorithmic biases are domain-, task-, and model-specific, leading to a `portability trap': bias mitigation solutions in one context may not be appropriate in another. Thus, a myriad of design choices have to be made when creating a bias mitigation method, su…

    Submitted 28 May, 2025; originally announced May 2025.

  6. arXiv:2505.18008  [pdf, ps, other]

    math.OC cs.LG

    Deep Operator Neural Network Model Predictive Control

    Authors: Thomas Oliver de Jong, Khemraj Shukla, Mircea Lazar

    Abstract: In this paper, we consider the design of model predictive control (MPC) algorithms based on deep operator neural networks (DeepONets). These neural networks are capable of accurately approximating real and complex valued solutions of continuous time nonlinear systems without relying on recurrent architectures. The DeepONet architecture is made up of two feedforward neural networks: the branch netw…

    Submitted 23 May, 2025; originally announced May 2025.

  7. arXiv:2505.17592  [pdf, ps, other]

    astro-ph.IM cs.LG

    AstroMLab 4: Benchmark-Topping Performance in Astronomy Q&A with a 70B-Parameter Domain-Specialized Reasoning Model

    Authors: Tijmen de Haan, Yuan-Sen Ting, Tirthankar Ghosal, Tuan Dung Nguyen, Alberto Accomazzi, Emily Herron, Vanessa Lama, Rui Pan, Azton Wells, Nesar Ramachandra

    Abstract: General-purpose large language models, despite their broad capabilities, often struggle with specialized domain knowledge, a limitation particularly pronounced in more accessible, lower-parameter versions. This gap hinders their deployment as effective agents in demanding fields such as astronomy. Building on our prior work with AstroSage-8B, this study introduces AstroSage-70B, a significantly la…

    Submitted 23 May, 2025; originally announced May 2025.

  8. arXiv:2505.08906  [pdf]

    cs.PL cs.DC cs.PF

    Comparing Parallel Functional Array Languages: Programming and Performance

    Authors: David van Balen, Tiziano De Matteis, Clemens Grelck, Troels Henriksen, Aaron W. Hsu, Gabriele K. Keller, Thomas Koopman, Trevor L. McDonell, Cosmin Oancea, Sven-Bodo Scholz, Artjoms Sinkarovs, Tom Smeding, Phil Trinder, Ivo Gabe de Wolff, Alexandros Nikolaos Ziogas

    Abstract: Parallel functional array languages are an emerging class of programming languages that promise to combine low-effort parallel programming with good performance and performance portability. We systematically compare the designs and implementations of five different functional array languages: Accelerate, APL, DaCe, Futhark, and SaC. We demonstrate the expressiveness of functional array programming…

    Submitted 13 May, 2025; originally announced May 2025.

  9. arXiv:2505.07653  [pdf, ps, other]

    cs.CL

    JobHop: A Large-Scale Dataset of Career Trajectories

    Authors: Iman Johary, Raphael Romero, Alexandru C. Mara, Tijl De Bie

    Abstract: Understanding labor market dynamics is essential for policymakers, employers, and job seekers. However, comprehensive datasets that capture real-world career trajectories are scarce. In this paper, we introduce JobHop, a large-scale public dataset derived from anonymized resumes provided by VDAB, the public employment service in Flanders, Belgium. Utilizing Large Language Models (LLMs), we process…

    Submitted 12 May, 2025; originally announced May 2025.

  10. arXiv:2505.02402  [pdf, ps, other]

    cs.LG math.ST stat.ML

    A probabilistic view on Riemannian machine learning models for SPD matrices

    Authors: Thibault de Surrel, Florian Yger, Fabien Lotte, Sylvain Chevallier

    Abstract: The goal of this paper is to show how different machine learning tools on the Riemannian manifold $\mathcal{P}_d$ of Symmetric Positive Definite (SPD) matrices can be united under a probabilistic framework. For this, we will need several Gaussian distributions defined on $\mathcal{P}_d$. We will show how popular classifiers on $\mathcal{P}_d$ can be reinterpreted as Bayes Classifiers using these G…

    Submitted 5 May, 2025; originally announced May 2025.

  11. arXiv:2505.01001  [pdf, other]

    cs.MM cs.GR cs.HC

    Photoshop Batch Rendering Using Actions for Stylistic Video Editing

    Authors: Tessa De La Fuente

    Abstract: My project looks at an efficient workflow for creative image/video editing using Adobe Photoshop Actions tool and Batch Processing System. This innovative approach to video editing through Photoshop creates a fundamental shift to creative workflow management through the integration of industry-leading image manipulation with video editing techniques. Through systematic automation of Actions, users…

    Submitted 2 May, 2025; originally announced May 2025.

    Comments: 11 pages, 12 figures

  12. arXiv:2504.17890  [pdf, other]

    eess.SP cs.RO math.MG

    Quaternion Domain Super MDS for 3D Localization

    Authors: Keigo Masuoka, Takumi Takahashi, Giuseppe Thadeu Freitas de Abreu, Hideki Ochiai

    Abstract: We propose a novel low-complexity three-dimensional (3D) localization algorithm for wireless sensor networks, termed quaternion-domain super multidimensional scaling (QD-SMDS). This algorithm reformulates the conventional SMDS, which was originally developed in the real domain, into the quaternion domain. By representing 3D coordinates as quaternions, the method enables the construction of a rank-…

    Submitted 28 April, 2025; v1 submitted 24 April, 2025; originally announced April 2025.

    Comments: 5 pages, 9 figures, submitted to SPAWC2025

  13. arXiv:2504.14898  [pdf, other]

    stat.ML cs.LG

    Expected Free Energy-based Planning as Variational Inference

    Authors: Bert de Vries, Wouter Nuijten, Thijs van de Laar, Wouter Kouw, Sepideh Adamiat, Tim Nisslbeck, Mykola Lukashchuk, Hoang Minh Huu Nguyen, Marco Hidalgo Araya, Raphael Tresor, Thijs Jenneskens, Ivana Nikoloska, Raaja Ganapathy Subramanian, Bart van Erp, Dmitry Bagaev, Albert Podusenko

    Abstract: We address the problem of planning under uncertainty, where an agent must choose actions that not only achieve desired outcomes but also reduce uncertainty. Traditional methods often treat exploration and exploitation as separate objectives, lacking a unified inferential foundation. Active inference, grounded in the Free Energy Principle, provides such a foundation by minimizing Expected Free Ener…

    Submitted 23 April, 2025; v1 submitted 21 April, 2025; originally announced April 2025.

    Comments: 18 pages

  14. arXiv:2504.13597  [pdf, other]

    eess.IV cs.AI cs.CV

    FocusNet: Transformer-enhanced Polyp Segmentation with Local and Pooling Attention

    Authors: Jun Zeng, KC Santosh, Deepak Rajan Nayak, Thomas de Lange, Jonas Varkey, Tyler Berzin, Debesh Jha

    Abstract: Colonoscopy is vital in the early diagnosis of colorectal polyps. Regular screenings can effectively prevent benign polyps from progressing to CRC. While deep learning has made impressive strides in polyp segmentation, most existing models are trained on single-modality and single-center data, making them less effective in real-world clinical environments. To overcome these limitations, we propose…

    Submitted 18 April, 2025; originally announced April 2025.

    Comments: 9 pages, 6 figures

  15. arXiv:2504.12078  [pdf, other]

    cs.CV q-bio.QM

    Single-shot Star-convex Polygon-based Instance Segmentation for Spatially-correlated Biomedical Objects

    Authors: Trina De, Adrian Urbanski, Artur Yakimovich

    Abstract: Biomedical images often contain objects known to be spatially correlated or nested due to their inherent properties, leading to semantic relations. Examples include cell nuclei being nested within eukaryotic cells and colonies growing exclusively within their culture dishes. While these semantic relations bear key importance, detection tasks are often formulated independently, requiring multi-shot…

    Submitted 16 April, 2025; originally announced April 2025.

    Comments: 12 pages, 8 figures

    ACM Class: J.3; I.4

  16. arXiv:2504.10052  [pdf, other]

    eess.SP cs.IT

    Frequency Hopping Waveform Design for Secure Integrated Sensing and Communications

    Authors: Ali Khandan Boroujeni, Giuseppe Thadeu Freitas de Abreu, Stefan Köpsell, Ghazal Bagheri, Kuranage Roche Rayan Ranasinghe, Rafael F. Schaefer

    Abstract: We introduce a comprehensive approach to enhance the security, privacy, and sensing capabilities of integrated sensing and communications (ISAC) systems by leveraging random frequency agility (RFA) and random pulse repetition interval (PRI) agility (RPA) techniques. The combination of these techniques, which we refer to collectively as random frequency and PRI agility (RFPA), with channel reciproc…

    Submitted 14 April, 2025; originally announced April 2025.

    Comments: Submitted to the IEEE for possible publication

  17. arXiv:2504.07646  [pdf, other]

    cs.CL cs.AI

    On the Temporal Question-Answering Capabilities of Large Language Models Over Anonymized Data

    Authors: Alfredo Garrachón Ruiz, Tomás de la Rosa, Daniel Borrajo

    Abstract: The applicability of Large Language Models (LLMs) in temporal reasoning tasks over data that is not present during training is still a field that remains to be explored. In this paper we work on this topic, focusing on structured and semi-structured anonymized data. We not only develop a direct LLM pipeline, but also compare various methodologies and conduct an in-depth analysis. We identified and…

    Submitted 10 April, 2025; originally announced April 2025.

    Comments: 18 pages, 7 tables, 5 figures

  18. arXiv:2504.04511  [pdf, other]

    eess.SP cs.CR

    Post-Quantum Wireless-based Key Encapsulation Mechanism via CRYSTALS-Kyber for Resource-Constrained Devices

    Authors: M. A. González de la Torre, I. A. Morales Sandoval, Giuseppe Thadeu Freitas de Abreu, L. Hernández Encinas

    Abstract: We consider the problem of adapting a Post-Quantum cryptosystem to be used in resource-constrained devices, such as those typically used in Device-to-Device and Internet of Things systems. In particular, we propose leveraging the characteristics of wireless communications channels to minimize the complexity of implementation of a Post-Quantum public key encryption scheme, without diminishing its s…

    Submitted 6 April, 2025; originally announced April 2025.

  19. arXiv:2504.04275  [pdf, other]

    cs.CL

    negativas: a prototype for searching and classifying sentential negation in speech data

    Authors: Túlio Sousa de Gois, Paloma Batista Cardoso

    Abstract: Negation is a universal feature of natural languages. In Brazilian Portuguese, the most commonly used negation particle is não, which can scope over nouns or verbs. When it scopes over a verb, não can occur in three positions: pre-verbal (NEG1), double negation (NEG2), or post-verbal (NEG3), e.g., não gosto, não gosto não, gosto não ("I do not like it"). From a variationist perspective, these stru…

    Submitted 5 April, 2025; originally announced April 2025.

  20. arXiv:2504.03803  [pdf, other]

    cs.CL cs.CY cs.LG

    What Large Language Models Do Not Talk About: An Empirical Study of Moderation and Censorship Practices

    Authors: Sander Noels, Guillaume Bied, Maarten Buyl, Alexander Rogiers, Yousra Fettach, Jefrey Lijffijt, Tijl De Bie

    Abstract: Large Language Models (LLMs) are increasingly deployed as gateways to information, yet their content moderation practices remain underexplored. This work investigates the extent to which LLMs refuse to answer or omit information when prompted on political topics. To do so, we distinguish between hard censorship (i.e., generated refusals, error messages, or canned denial responses) and soft censors…

    Submitted 4 April, 2025; originally announced April 2025.

    Comments: 17 pages, 38 pages in total including appendix; 5 figures, 22 figures in appendix

  21. arXiv:2503.15173  [pdf, other]

    cs.NI eess.SP

    A Robust Routing Protocol for 5G Mesh Networks

    Authors: Niclas Führling, Ivan Alexander Morales Sandoval, Giuseppe Thadeu Freitas de Abreu

    Abstract: We consider a novel routing protocol suitable for ad-hoc networks with dynamically changing topologies, such as DECT 2020 NR (NR+) systems, which often lead to missing links between the nodes and thus, incomplete or inefficient routes. A key point of the proposed protocol is the combination of network discovery and matrix completion techniques, which allow the nodes to establish communication path…

    Submitted 19 March, 2025; originally announced March 2025.

  22. arXiv:2503.13274  [pdf, ps, other]

    cs.DS

    Parallel Minimum Cost Flow in Near-Linear Work and Square Root Depth for Dense Instances

    Authors: Jan van den Brand, Hossein Gholizadeh, Yonggang Jiang, Tijn de Vos

    Abstract: For $n$-vertex $m$-edge graphs with integer polynomially-bounded costs and capacities, we provide a randomized parallel algorithm for the minimum cost flow problem with $\tilde O(m+n^{1.5})$ work and $\tilde O(\sqrt{n})$ depth. On moderately dense graphs ($m>n^{1.5}$), our algorithm is the first one to achieve both near-linear work and sub-linear depth. Previous algorithms are either achieving al…

    Submitted 17 March, 2025; originally announced March 2025.

  23. arXiv:2503.11576  [pdf, other]

    cs.CV

    SmolDocling: An ultra-compact vision-language model for end-to-end multi-modal document conversion

    Authors: Ahmed Nassar, Andres Marafioti, Matteo Omenetti, Maksym Lysak, Nikolaos Livathinos, Christoph Auer, Lucas Morin, Rafael Teixeira de Lima, Yusik Kim, A. Said Gurbuz, Michele Dolfi, Miquel Farré, Peter W. J. Staar

    Abstract: We introduce SmolDocling, an ultra-compact vision-language model targeting end-to-end document conversion. Our model comprehensively processes entire pages by generating DocTags, a new universal markup format that captures all page elements in their full context with location. Unlike existing approaches that rely on large foundational models, or ensemble solutions that rely on handcrafted pipeline…

    Submitted 14 March, 2025; originally announced March 2025.

    Comments: 24 pages, 10 figures

  24. arXiv:2503.09330  [pdf, other]

    cs.LG cs.AI

    Group-robust Machine Unlearning

    Authors: Thomas De Min, Subhankar Roy, Stéphane Lathuilière, Elisa Ricci, Massimiliano Mancini

    Abstract: Machine unlearning is an emerging paradigm to remove the influence of specific training data (i.e., the forget set) from a model while preserving its knowledge of the rest of the data (i.e., the retain set). Previous approaches assume the forget data to be uniformly distributed from all training datapoints. However, if the data to unlearn is dominant in one group, we empirically show that performa…

    Submitted 12 March, 2025; originally announced March 2025.

    Comments: Work in progress

  25. arXiv:2503.08939  [pdf, other]

    cs.CV cs.AI

    KAN-Mixers: a new deep learning architecture for image classification

    Authors: Jorge Luiz dos Santos Canuto, Linnyer Beatrys Ruiz Aylon, Rodrigo Clemente Thom de Souza

    Abstract: Due to their effective performance, Convolutional Neural Network (CNN) and Vision Transformer (ViT) architectures have become the standard for solving computer vision tasks. Such architectures require large data sets and rely on convolution and self-attention operations. In 2021, MLP-Mixer emerged, an architecture that relies only on Multilayer Perceptron (MLP) and achieves extremely competitive r…

    Submitted 11 March, 2025; originally announced March 2025.

    Comments: 8 pages, 6 figures

  26. arXiv:2503.03446  [pdf, other]

    cs.CV cs.CY

    Biased Heritage: How Datasets Shape Models in Facial Expression Recognition

    Authors: Iris Dominguez-Catena, Daniel Paternain, Mikel Galar, MaryBeth Defrance, Maarten Buyl, Tijl De Bie

    Abstract: In recent years, the rapid development of artificial intelligence (AI) systems has raised concerns about our ability to ensure their fairness, that is, how to avoid discrimination based on protected characteristics such as gender, race, or age. While algorithmic fairness is well-studied in simple binary classification tasks on tabular data, its application to complex, real-world scenarios-such as…

    Submitted 5 March, 2025; originally announced March 2025.

    Comments: 17 pages, 7 figures

    ACM Class: I.2.10

  27. Sensing Movement: Contemporary Dance Workshops with People who are Blind or have Low Vision and Dance Teachers

    Authors: Madhuka Thisuri De Silva, Jim Smiley, Sarah Goodwin, Leona M Holloway, Matthew Butler

    Abstract: Dance teachers rely primarily on verbal instructions and visual demonstrations to convey key dance concepts and movement. These techniques, however, have limitations in supporting students who are blind or have low vision (BLV). This work explores the role technology can play in supporting instruction for BLV students, as well as improvisation with their instructor. Through a series of design work…

    Submitted 4 March, 2025; originally announced March 2025.

    Comments: Accepted to appear at ACM CHI Conference on Human Factors in Computing Systems (CHI '25), April 26 - May 1, 2025, Yokohama, Japan

  28. arXiv:2503.02688  [pdf, other]

    cs.DB

    A user-friendly SPARQL query editor powered by lightweight metadata

    Authors: Vincent Emonet, Ana-Claudia Sima, Tarcisio Mendes de Farias

    Abstract: SPARQL query editors often lack intuitive interfaces to aid SPARQL-savvy users to write queries. To address this issue, we propose an easy-to-deploy, triple store-agnostic and open-source query editor that offers three main features: (i) automatic query example rendering, (ii) precise autocomplete based on existing triple patterns including within SERVICE clauses, and (iii) a data-aware schema vis…

    Submitted 22 April, 2025; v1 submitted 4 March, 2025; originally announced March 2025.

  29. arXiv:2502.19293  [pdf, other]

    cs.CV

    Pathology Report Generation and Multimodal Representation Learning for Cutaneous Melanocytic Lesions

    Authors: Ruben T. Lucassen, Sander P. J. Moonemans, Tijn van de Luijtgaarden, Gerben E. Breimer, Willeke A. M. Blokx, Mitko Veta

    Abstract: Millions of melanocytic skin lesions are examined by pathologists each year, the majority of which concern common nevi (i.e., ordinary moles). While most of these lesions can be diagnosed in seconds, writing the corresponding pathology report is much more time-consuming. Automating part of the report writing could, therefore, alleviate the increasing workload of pathologists. In this work, we deve…

    Submitted 27 February, 2025; v1 submitted 26 February, 2025; originally announced February 2025.

    Comments: 11 pages, 2 figures. arXiv admin note: text overlap with arXiv:2502.19285

  30. arXiv:2502.19285  [pdf, ps, other]

    cs.CV

    On the Importance of Text Preprocessing for Multimodal Representation Learning and Pathology Report Generation

    Authors: Ruben T. Lucassen, Tijn van de Luijtgaarden, Sander P. J. Moonemans, Gerben E. Breimer, Willeke A. M. Blokx, Mitko Veta

    Abstract: Vision-language models in pathology enable multimodal case retrieval and automated report generation. Many of the models developed so far, however, have been trained on pathology reports that include information which cannot be inferred from paired whole slide images (e.g., patient history), potentially leading to hallucinated sentences in generated reports. To this end, we investigate how the sel…

    Submitted 6 June, 2025; v1 submitted 26 February, 2025; originally announced February 2025.

    Comments: 11 pages, 1 figure

  31. arXiv:2502.17364  [pdf, other]

    cs.CL cs.AI

    Bridging Gaps in Natural Language Processing for Yorùbá: A Systematic Review of a Decade of Progress and Prospects

    Authors: Toheeb A. Jimoh, Tabea De Wille, Nikola S. Nikolov

    Abstract: Natural Language Processing (NLP) is becoming a dominant subset of artificial intelligence as the need to help machines understand human language looks indispensable. Several NLP applications are ubiquitous, partly due to the myriads of datasets being churned out daily through mediums like social networking sites. However, the growing development has not been evident in most African languages due…

    Submitted 24 February, 2025; originally announced February 2025.

  32. arXiv:2502.16723  [pdf, ps, other]

    cs.DS cs.CC

    A Parameterized Complexity Analysis of Bounded Height Depth-first Search Trees

    Authors: Lars Jaffke, Paloma T. de Lima, Wojciech Nadara, Emmanuel Sam

    Abstract: Computing bounded depth decompositions is a bottleneck in many applications of the treedepth parameter. The fastest known algorithm, which is due to Reidl, Rossmanith, Sánchez Villaamil, and Sikdar [ICALP 2014], runs in $2^{\mathcal{O}(k^2)}\cdot n$ time and it is a big open problem whether the dependency on $k$ can be improved to $2^{o(k^2)}\cdot n^{\mathcal{O}(1)}$. We show that the related prob…

    Submitted 23 February, 2025; originally announced February 2025.

    MSC Class: 05C85

  33. arXiv:2502.09927  [pdf, other]

    cs.CV cs.AI

    Granite Vision: a lightweight, open-source multimodal model for enterprise Intelligence

    Authors: Granite Vision Team, Leonid Karlinsky, Assaf Arbelle, Abraham Daniels, Ahmed Nassar, Amit Alfassi, Bo Wu, Eli Schwartz, Dhiraj Joshi, Jovana Kondic, Nimrod Shabtay, Pengyuan Li, Roei Herzig, Shafiq Abedin, Shaked Perek, Sivan Harary, Udi Barzelay, Adi Raz Goldfarb, Aude Oliva, Ben Wieles, Bishwaranjan Bhattacharjee, Brandon Huang, Christoph Auer, Dan Gutfreund, David Beymer, et al. (38 additional authors not shown)

    Abstract: We introduce Granite Vision, a lightweight large language model with vision capabilities, specifically designed to excel in enterprise use cases, particularly in visual document understanding. Our model is trained on a comprehensive instruction-following dataset, including document-related tasks, such as content extraction from tables, charts, diagrams, sketches, and infographics, as well as gener…

    Submitted 14 February, 2025; originally announced February 2025.

  34. arXiv:2502.04818  [pdf, other]

    cs.LG math.DS nlin.AO nlin.CD

    Harnessing omnipresent oscillator networks as computational resource

    Authors: Thomas Geert de Jong, Hirofumi Notsu, Kohei Nakajima

    Abstract: Nature is pervaded with oscillatory behavior. In networks of coupled oscillators patterns can arise when the system synchronizes to an external input. Hence, these networks provide processing and memory of input. We present a universal framework for harnessing oscillator networks as computational resource. This reservoir computing framework is introduced by the ubiquitous model for phase-locking,…

    Submitted 21 February, 2025; v1 submitted 7 February, 2025; originally announced February 2025.

    MSC Class: 92B25 ACM Class: I.2.6

  35. arXiv:2502.02404  [pdf, other]

    cs.AR cs.DC

    FPGA Innovation Research in the Netherlands: Present Landscape and Future Outlook

    Authors: Nikolaos Alachiotis, Sjoerd van den Belt, Steven van der Vlugt, Reinier van der Walle, Mohsen Safari, Bruno Endres Forlin, Tiziano De Matteis, Zaid Al-Ars, Roel Jordans, António J. Sousa de Almeida, Federico Corradi, Christiaan Baaij, Ana-Lucia Varbanescu

    Abstract: FPGAs have transformed digital design by enabling versatile and customizable solutions that balance performance and power efficiency, yielding them essential for today's diverse computing challenges. Research in the Netherlands, both in academia and industry, plays a major role in developing new innovative FPGA solutions. This survey presents the current landscape of FPGA innovation research in th…

    Submitted 4 February, 2025; originally announced February 2025.

  36. arXiv:2502.01665  [pdf, other]

    eess.IV cs.CV

    Entropy-based measure of rock sample heterogeneity derived from micro-CT images

    Authors: Luan Coelho Vieira Silva, Júlio de Castro Vargas Fernandes, Felipe Belilaqua Foldes Guimarães, Pedro Henrique Braga Lisboa, Carlos Eduardo Menezes dos Anjos, Thais Fernandes de Matos, Marcelo Ramalho Albuquerque, Rodrigo Surmas, Alexandre Gonçalves Evsukoff

    Abstract: This study presents an automated method for objectively measuring rock heterogeneity via raw X-ray micro-computed tomography (micro-CT) images, thereby addressing the limitations of traditional methods, which are time-consuming, costly, and subjective. Unlike approaches that rely on image segmentation, the proposed method processes micro-CT images directly, identifying textural heterogeneity. The…

    Submitted 1 February, 2025; originally announced February 2025.

    Comments: 26 pages, 11 figures

  37. arXiv:2502.01512  [pdf, other]

    stat.ME cs.LG math.ST stat.ML

    Wrapped Gaussian on the manifold of Symmetric Positive Definite Matrices

    Authors: Thibault de Surrel, Fabien Lotte, Sylvain Chevallier, Florian Yger

    Abstract: Circular and non-flat data distributions are prevalent across diverse domains of data science, yet their specific geometric structures often remain underutilized in machine learning frameworks. A principled approach to accounting for the underlying geometry of such data is pivotal, particularly when extending statistical models, like the pervasive Gaussian distribution. In this work, we tackle tho…

    Submitted 27 May, 2025; v1 submitted 3 February, 2025; originally announced February 2025.

  38. arXiv:2502.00482  [pdf, other]

    cs.SE

    How Does Microservice Granularity Impact Energy Consumption and Performance? A Controlled Experiment

    Authors: Yiming Zhao, Tiziano De Matteis, Justus Bogner

    Abstract: Context: Microservice architectures are a widely used software deployment approach, with benefits regarding flexibility and scalability. However, their impact on energy consumption is poorly understood, and often overlooked in favor of performance and other quality attributes (QAs). One understudied concept in this area is microservice granularity, i.e., over how many services the system functiona…

    Submitted 1 February, 2025; originally announced February 2025.

    Comments: Accepted for publication at the International Conference on Software Architecture 2025 (ICSA'25, see https://conf.researchr.org/home/icsa-2025)

  39. arXiv:2501.17887  [pdf, other]

    cs.CL cs.CV cs.SE

    Docling: An Efficient Open-Source Toolkit for AI-driven Document Conversion

    Authors: Nikolaos Livathinos, Christoph Auer, Maksym Lysak, Ahmed Nassar, Michele Dolfi, Panos Vagenas, Cesar Berrospi Ramis, Matteo Omenetti, Kasper Dinkla, Yusik Kim, Shubham Gupta, Rafael Teixeira de Lima, Valery Weber, Lucas Morin, Ingmar Meijer, Viktor Kuropiatnyk, Peter W. J. Staar

    Abstract: We introduce Docling, an easy-to-use, self-contained, MIT-licensed, open-source toolkit for document conversion, that can parse several types of popular document formats into a unified, richly structured representation. It is powered by state-of-the-art specialized AI models for layout analysis (DocLayNet) and table structure recognition (TableFormer), and runs efficiently on commodity hardware in…

    Submitted 27 January, 2025; originally announced January 2025.

    Comments: Accepted to AAAI 25: Workshop on Open-Source AI for Mainstream Use

  40. arXiv:2501.14542  [pdf, other]

    cs.LO math.LO

    Ordinal Exponentiation in Homotopy Type Theory

    Authors: Tom de Jong, Nicolai Kraus, Fredrik Nordvall Forsberg, Chuangjie Xu

    Abstract: We present two seemingly different definitions of constructive ordinal exponentiation, where an ordinal is taken to be a transitive, extensional, and wellfounded order on a set. The first definition is abstract, uses suprema of ordinals, and is solely motivated by the expected equations. The second is more concrete, based on decreasing lists, and can be seen as a constructive version of a classica…

    Submitted 20 May, 2025; v1 submitted 24 January, 2025; originally announced January 2025.

    Comments: v2: Discussion of related work by R. J. Grayson; minor revisions following referee reports. To appear at LICS'25. v3: Updated arXiv abstract

  41. arXiv:2501.10219  [pdf, other]

    eess.SP cs.CV

    Robust Egoistic Rigid Body Localization

    Authors: Niclas Führling, Giuseppe Thadeu Freitas de Abreu, David González G., Osvaldo Gonsa

    Abstract: We consider a robust and self-reliant (or "egoistic") variation of the rigid body localization (RBL) problem, in which a primary rigid body seeks to estimate the pose (i.e., location and orientation) of another rigid body (or "target"), relative to its own, without the assistance of external infrastructure, without prior knowledge of the shape of the target, and taking into account the possibility…

    Submitted 17 January, 2025; originally announced January 2025.

  42. arXiv:2412.14073  [pdf, ps, other]

    cs.LO cs.AI

    A Computationally Grounded Framework for Cognitive Attitudes (extended version)

    Authors: Tiago de Lima, Emiliano Lorini, Elise Perrotin, François Schwarzentruber

    Abstract: We introduce a novel language for reasoning about agents' cognitive attitudes of both epistemic and motivational type. We interpret it by means of a computationally grounded semantics using belief bases. Our language includes five types of modal operators for implicit belief, complete attraction, complete repulsion, realistic attraction and realistic repulsion. We give an axiomatization and show t…

    Submitted 18 December, 2024; originally announced December 2024.

  43. arXiv:2412.13729  [pdf, other]

    cs.RO cs.HC cs.LG

    THÖR-MAGNI Act: Actions for Human Motion Modeling in Robot-Shared Industrial Spaces

    Authors: Tiago Rodrigues de Almeida, Tim Schreiter, Andrey Rudenko, Luigi Palmieiri, Johannes A. Stork, Achim J. Lilienthal

    Abstract: Accurate human activity and trajectory prediction are crucial for ensuring safe and reliable human-robot interactions in dynamic environments, such as industrial settings, with mobile robots. Datasets with fine-grained action labels for moving people in industrial environments with mobile robots are scarce, as most existing datasets focus on social navigation in public spaces. This paper introduce…

    Submitted 23 December, 2024; v1 submitted 18 December, 2024; originally announced December 2024.

    Comments: This paper has been accepted to the 20th edition of the IEEE/ACM International Conference on Human-Robot Interaction (HRI'25), which will be held in Melbourne, Australia on March 4-6, 2025. Code: https://github.com/tmralmeida/thor-magni-actions

  44. arXiv:2412.12744  [pdf, other]

    cs.CL cs.AI cs.LG

    Your Next State-of-the-Art Could Come from Another Domain: A Cross-Domain Analysis of Hierarchical Text Classification

    Authors: Nan Li, Bo Kang, Tijl De Bie

    Abstract: Text classification with hierarchical labels is a prevalent and challenging task in natural language processing. Examples include assigning ICD codes to patient records, tagging patents into IPC classes, assigning EUROVOC descriptors to European legal texts, and more. Despite its widespread applications, a comprehensive understanding of state-of-the-art methods across different domains has been la…

    Submitted 17 December, 2024; originally announced December 2024.

  45. arXiv:2412.07682  [pdf, other]

    cs.CL

    TRIM: Token Reduction and Inference Modeling for Cost-Effective Language Generation

    Authors: Alfredo Garrachón Ruiz, Tomás de la Rosa, Daniel Borrajo

    Abstract: The inference cost of Large Language Models (LLMs) is a significant challenge due to their computational demands, especially on tasks requiring long outputs. However, natural language often contains redundancy, which presents an opportunity for optimization. We have observed that LLMs can generate distilled language-concise outputs that retain essential meaning, when prompted appropriately. We prop…

    Submitted 18 December, 2024; v1 submitted 10 December, 2024; originally announced December 2024.

    Comments: 12 pages

  46. arXiv:2412.05467  [pdf, other]

    cs.LG cs.AI cs.SE

    The BrowserGym Ecosystem for Web Agent Research

    Authors: Thibault Le Sellier De Chezelles, Maxime Gasse, Alexandre Drouin, Massimo Caccia, Léo Boisvert, Megh Thakkar, Tom Marty, Rim Assouel, Sahar Omidi Shayegan, Lawrence Keunho Jang, Xing Han Lù, Ori Yoran, Dehan Kong, Frank F. Xu, Siva Reddy, Quentin Cappart, Graham Neubig, Ruslan Salakhutdinov, Nicolas Chapados, Alexandre Lacoste

    Abstract: The BrowserGym ecosystem addresses the growing need for efficient evaluation and benchmarking of web agents, particularly those leveraging automation and Large Language Models (LLMs). Many existing benchmarks suffer from fragmentation and inconsistent evaluation methodologies, making it challenging to achieve reliable comparisons and reproducible results. In an earlier work, Drouin et al. (2024) i…

    Submitted 28 February, 2025; v1 submitted 6 December, 2024; originally announced December 2024.

  47. arXiv:2411.19710  [pdf, other]

    cs.IR cs.LG

    Know Your RAG: Dataset Taxonomy and Generation Strategies for Evaluating RAG Systems

    Authors: Rafael Teixeira de Lima, Shubham Gupta, Cesar Berrospi, Lokesh Mishra, Michele Dolfi, Peter Staar, Panagiotis Vagenas

    Abstract: Retrieval Augmented Generation (RAG) systems are a widespread application of Large Language Models (LLMs) in the industry. While many tools exist empowering developers to build their own systems, measuring their performance locally, with datasets reflective of the system's use cases, is a technological challenge. Solutions to this problem range from non-specific and cheap (most public datasets) to…

    Submitted 29 November, 2024; originally announced November 2024.

    Comments: to be published in the 31st International Conference on Computational Linguistics (COLING 2025)

  48. arXiv:2411.17486  [pdf, ps, other]

    cs.LO

    Linear Realisability over nets: multiplicatives (long version)

    Authors: Adrien Ragot, Thomas Seiller, Lorenzo Tortora de Falco

    Abstract: We provide a new realisability model based on orthogonality for the multiplicative fragment of linear logic, both in presence of generalised axioms (MLL*) and in the standard case (MLL). The novelty is the definition of cut elimination for generalised axioms. We prove that our model is adequate and complete both for MLL* and MLL.

    Submitted 26 November, 2024; originally announced November 2024.

  49. Ética para LLMs: o compartilhamento de dados sociolinguísticos (Ethics for LLMs: the sharing of sociolinguistic data)

    Authors: Marta Deysiane Alves Faria Sousa, Raquel Meister Ko. Freitag, Túlio Sousa de Gois

    Abstract: The collection of speech data carried out in Sociolinguistics has the potential to enhance large language models due to its quality and representativeness. In this paper, we examine the ethical considerations associated with the gathering and dissemination of such data. Additionally, we outline strategies for addressing the sensitivity of speech data, as it may facilitate the identification of inf…

    Submitted 11 November, 2024; originally announced November 2024.

    Comments: in Portuguese language. Paper accepted to LAAI-Ethics 2024

  50. arXiv:2411.06837  [pdf, other]

    cs.CL

    Persuasion with Large Language Models: a Survey

    Authors: Alexander Rogiers, Sander Noels, Maarten Buyl, Tijl De Bie

    Abstract: The rapid rise of Large Language Models (LLMs) has created new disruptive possibilities for persuasive communication, by enabling fully-automated personalized and interactive content generation at an unprecedented scale. In this paper, we survey the research field of LLM-based persuasion that has emerged as a result. We begin by exploring the different modes in which LLM Systems are used to influe…

    Submitted 11 November, 2024; originally announced November 2024.