
Showing 1–27 of 27 results for author: Carvalho, D S

Searching in archive cs.
  1. arXiv:2506.20083

    cs.CL

    Bridging Compositional and Distributional Semantics: A Survey on Latent Semantic Geometry via AutoEncoder

    Authors: Yingji Zhang, Danilo S. Carvalho, André Freitas

    Abstract: Integrating compositional and symbolic properties into current distributional semantic spaces can enhance the interpretability, controllability, compositionality, and generalisation capabilities of Transformer-based auto-regressive language models (LMs). In this survey, we offer a novel perspective on latent space geometry through the lens of compositional semantics, a direction we refer to as \te…

    Submitted 26 June, 2025; v1 submitted 24 June, 2025; originally announced June 2025.

    Comments: In progress

  2. arXiv:2506.19418

    cs.CL

    Learning to Disentangle Latent Reasoning Rules with Language VAEs: A Systematic Study

    Authors: Yingji Zhang, Marco Valentino, Danilo S. Carvalho, André Freitas

    Abstract: Incorporating explicit reasoning rules within the latent space of language models (LMs) offers a promising pathway to enhance generalisation, interpretability, and controllability. While current Transformer-based language models have shown strong performance on Natural Language Inference (NLI) tasks, they often rely on memorisation rather than rule-based inference. This work investigates how reaso…

    Submitted 24 June, 2025; originally announced June 2025.

  3. arXiv:2505.17998

    cs.CL

    TRACE for Tracking the Emergence of Semantic Representations in Transformers

    Authors: Nura Aljaafari, Danilo S. Carvalho, André Freitas

    Abstract: Modern transformer models exhibit phase transitions during training, distinct shifts from memorisation to abstraction, but the mechanisms underlying these transitions remain poorly understood. Prior work has often focused on endpoint representations or isolated signals like curvature or mutual information, typically in symbolic or arithmetic domains, overlooking the emergence of linguistic structu…

    Submitted 23 May, 2025; originally announced May 2025.

  4. arXiv:2505.00004

    cs.CL cs.AI

    LangVAE and LangSpace: Building and Probing for Language Model VAEs

    Authors: Danilo S. Carvalho, Yingji Zhang, Harriet Unsworth, André Freitas

    Abstract: We present LangVAE, a novel framework for modular construction of variational autoencoders (VAEs) on top of pre-trained large language models (LLMs). Such language model VAEs can encode the knowledge of their pre-trained components into more compact and semantically disentangled representations. The representations obtained in this way can be analysed with the LangVAE companion framework: LangSpac…

    Submitted 29 March, 2025; originally announced May 2025.

  5. arXiv:2504.04110

    cs.AI cs.CL

    PEIRCE: Unifying Material and Formal Reasoning via LLM-Driven Neuro-Symbolic Refinement

    Authors: Xin Quan, Marco Valentino, Danilo S. Carvalho, Dhairya Dalal, André Freitas

    Abstract: A persistent challenge in AI is the effective integration of material and formal inference - the former concerning the plausibility and contextual relevance of arguments, while the latter focusing on their logical and structural validity. Large Language Models (LLMs), by virtue of their extensive pre-training on large textual corpora, exhibit strong capabilities in material inference. However, the…

    Submitted 5 April, 2025; originally announced April 2025.

    Comments: Demo paper. Work in progress

  6. arXiv:2502.11066

    cs.CL

    CARMA: Enhanced Compositionality in LLMs via Advanced Regularisation and Mutual Information Alignment

    Authors: Nura Aljaafari, Danilo S. Carvalho, André Freitas

    Abstract: Large language models (LLMs) struggle with compositional generalisation, limiting their ability to systematically combine learned components to interpret novel inputs. While architectural modifications, fine-tuning, and data augmentation improve compositionality, they often have limited adaptability, face scalability constraints, or yield diminishing returns on real data. To address this, we propo…

    Submitted 20 May, 2025; v1 submitted 16 February, 2025; originally announced February 2025.

    Comments: 19 pages, 8 figures, 8 tables

  7. arXiv:2410.12924

    cs.CL

    Interpreting token compositionality in LLMs: A robustness analysis

    Authors: Nura Aljaafari, Danilo S. Carvalho, André Freitas

    Abstract: Understanding the internal mechanisms of large language models (LLMs) is integral to enhancing their reliability, interpretability, and inference processes. We present Constituent-Aware Pooling (CAP), a methodology designed to analyse how LLMs process compositional linguistic structures. Grounded in principles of compositionality, mechanistic interpretability, and information theory, CAP systemati…

    Submitted 19 May, 2025; v1 submitted 16 October, 2024; originally announced October 2024.

    Comments: 23 pages, 3 Figures, 14 tables

  8. arXiv:2408.16779

    cs.CL cs.AI cs.LO

    Inductive Learning of Logical Theories with LLMs: An Expressivity-Graded Analysis

    Authors: João Pedro Gandarela, Danilo S. Carvalho, André Freitas

    Abstract: This work presents a novel systematic methodology to analyse the capabilities and limitations of Large Language Models (LLMs) with feedback from a formal inference engine, on logic theory induction. The analysis is complexity-graded w.r.t. rule dependency structure, allowing quantification of specific inference challenges on LLM performance. Integrating LLMs with formal methods is a promising fron…

    Submitted 14 January, 2025; v1 submitted 15 August, 2024; originally announced August 2024.

    ACM Class: I.2.7

  9. arXiv:2408.11827

    cs.CL

    The Mechanics of Conceptual Interpretation in GPT Models: Interpretative Insights

    Authors: Nura Aljaafari, Danilo S. Carvalho, André Freitas

    Abstract: Locating and editing knowledge in large language models (LLMs) is crucial for enhancing their accuracy, safety, and inference rationale. We introduce "concept editing", an innovative variation of knowledge editing that uncovers conceptualisation mechanisms within these models. Using the reverse dictionary task, inference tracing, and input abstraction, we analyse the Multi-Layer Perceptron (MLP)…

    Submitted 5 August, 2024; originally announced August 2024.

    Comments: 23 pages, 25 figures

  10. arXiv:2402.00723

    cs.CL

    Improving Semantic Control in Discrete Latent Spaces with Transformer Quantized Variational Autoencoders

    Authors: Yingji Zhang, Danilo S. Carvalho, Marco Valentino, Ian Pratt-Hartmann, Andre Freitas

    Abstract: Achieving precise semantic control over the latent spaces of Variational AutoEncoders (VAEs) holds significant value for downstream tasks in NLP as the underlying generative mechanisms could be better localised, explained and improved upon. Recent research, however, has struggled to achieve consistent results, primarily due to the inevitable loss of semantic information in the variational bottlene…

    Submitted 1 February, 2024; originally announced February 2024.

  11. arXiv:2312.13208

    cs.CL

    LlaMaVAE: Guiding Large Language Model Generation via Continuous Latent Sentence Spaces

    Authors: Yingji Zhang, Danilo S. Carvalho, Ian Pratt-Hartmann, André Freitas

    Abstract: Deep generative neural networks, such as Variational AutoEncoders (VAEs), offer an opportunity to better understand and control language models from the perspective of sentence-level latent spaces. To combine the controllability of VAE latent spaces with the state-of-the-art performance of recent large language models (LLMs), we present in this work LlaMaVAE, which combines expressive encoder and…

    Submitted 20 December, 2023; originally announced December 2023.

  12. arXiv:2311.08579

    cs.CL

    Graph-Induced Syntactic-Semantic Spaces in Transformer-Based Variational AutoEncoders

    Authors: Yingji Zhang, Marco Valentino, Danilo S. Carvalho, Ian Pratt-Hartmann, André Freitas

    Abstract: The injection of syntactic information in Variational AutoEncoders (VAEs) has been shown to result in an overall improvement of performances and generalisation. An effective strategy to achieve such a goal is to separate the encoding of distributional semantic features and syntactic structures into heterogeneous latent spaces via multi-task learning or dual encoder architectures. However, existing…

    Submitted 14 November, 2023; originally announced November 2023.

  13. arXiv:2309.16819

    cs.LG cs.AI

    Multi-Bellman operator for convergence of $Q$-learning with linear function approximation

    Authors: Diogo S. Carvalho, Pedro A. Santos, Francisco S. Melo

    Abstract: We study the convergence of $Q$-learning with linear function approximation. Our key contribution is the introduction of a novel multi-Bellman operator that extends the traditional Bellman operator. By exploring the properties of this operator, we identify conditions under which the projected multi-Bellman operator becomes contractive, providing improved fixed-point guarantees compared to the Bell…

    Submitted 28 September, 2023; originally announced September 2023.

  14. arXiv:2308.03581

    cs.CL

    Towards Controllable Natural Language Inference through Lexical Inference Types

    Authors: Yingji Zhang, Danilo S. Carvalho, Ian Pratt-Hartmann, Andre Freitas

    Abstract: Explainable natural language inference aims to provide a mechanism to produce explanatory (abductive) inference chains which ground claims to their supporting premises. A recent corpus called EntailmentBank strives to advance this task by explaining the answer to a question using an entailment tree \cite{dalvi2021explaining}. They employ the T5 model to directly generate the tree, which can explai…

    Submitted 24 November, 2024; v1 submitted 7 August, 2023; originally announced August 2023.

  15. arXiv:2305.07303

    cs.CL cs.LG

    Multi-Relational Hyperbolic Word Embeddings from Natural Language Definitions

    Authors: Marco Valentino, Danilo S. Carvalho, André Freitas

    Abstract: Natural language definitions possess a recursive, self-explanatory semantic structure that can support representation learning methods able to preserve explicit conceptual relations and constraints in the latent space. This paper presents a multi-relational model that explicitly leverages such a structure to derive word embeddings from definitions. By automatically extracting the relations linking…

    Submitted 16 February, 2024; v1 submitted 12 May, 2023; originally announced May 2023.

    Comments: Accepted at the 18th Conference of the European Chapter of the Association for Computational Linguistics (EACL 2024), camera-ready

  16. arXiv:2305.01713

    cs.CL cs.AI

    Learning Disentangled Semantic Spaces of Explanations via Invertible Neural Networks

    Authors: Yingji Zhang, Danilo S. Carvalho, André Freitas

    Abstract: Disentangled latent spaces usually have better semantic separability and geometrical properties, which leads to better interpretability and more controllable data generation. While this has been well investigated in Computer Vision, in tasks such as image disentanglement, in the NLP domain sentence disentanglement is still comparatively under-investigated. Most previous work has concentrated on d…

    Submitted 11 June, 2024; v1 submitted 2 May, 2023; originally announced May 2023.

    Comments: ACL 2024

  17. arXiv:2302.04785

    cs.AI cs.CE cs.CY

    Analysis of business process automation as linear time-invariant system network

    Authors: Mauricio Jacobo-Romero, Danilo S. Carvalho, Andre Freitas

    Abstract: In this work, we examined Business Process (BP) production as a signal; this novel approach explores a BP workflow as a linear time-invariant (LTI) system. We analysed BP productivity in the frequency domain; this standpoint examines how labour and capital act as BP input signals and how their fundamental frequencies affect BP production. Our research also proposes a simulation framework of a BP i…

    Submitted 9 February, 2023; originally announced February 2023.

    Comments: 16 pages, 24 figures

  18. arXiv:2212.04310

    cs.CL

    Montague semantics and modifier consistency measurement in neural language models

    Authors: Danilo S. Carvalho, Edoardo Manino, Julia Rozanova, Lucas Cordeiro, André Freitas

    Abstract: This work proposes a novel methodology for measuring compositional behavior in contemporary language embedding models. Specifically, we focus on adjectival modifier phenomena in adjective-noun phrases. In recent years, distributional language representation models have demonstrated great practical success. At the same time, the need for interpretability has elicited questions on their intrinsic pr…

    Submitted 18 December, 2024; v1 submitted 10 October, 2022; originally announced December 2022.

  19. arXiv:2210.06274

    cs.LG

    Centralized Training with Hybrid Execution in Multi-Agent Reinforcement Learning

    Authors: Pedro P. Santos, Diogo S. Carvalho, Miguel Vasco, Alberto Sardinha, Pedro A. Santos, Ana Paiva, Francisco S. Melo

    Abstract: We introduce hybrid execution in multi-agent reinforcement learning (MARL), a new paradigm in which agents aim to successfully complete cooperative tasks with arbitrary communication levels at execution time by taking advantage of information-sharing among the agents. Under hybrid execution, the communication level can range from a setting in which no communication is allowed between agents (fully…

    Submitted 5 June, 2023; v1 submitted 12 October, 2022; originally announced October 2022.

  20. arXiv:2210.06230

    cs.CL cs.AI

    Quasi-symbolic Semantic Geometry over Transformer-based Variational AutoEncoder

    Authors: Yingji Zhang, Danilo S. Carvalho, André Freitas

    Abstract: Formal/symbolic semantics can provide canonical, rigid controllability and interpretability to sentence representations due to their localisation or composition property. How can we deliver such property to the current distributional sentence representations to control and interpret the generation of language models (LMs)? In this work, we theoretically frame the sentence semanti…

    Submitted 1 July, 2025; v1 submitted 12 October, 2022; originally announced October 2022.

    Comments: CoNLL2025 (Best Paper nomination)

  21. arXiv:2210.02898

    cs.CL cs.AI

    Learning Disentangled Representations for Natural Language Definitions

    Authors: Danilo S. Carvalho, Giangiacomo Mercatali, Yingji Zhang, Andre Freitas

    Abstract: Disentangling the encodings of neural models is a fundamental aspect for improving interpretability, semantic control and downstream task performance in Natural Language Processing. Currently, most disentanglement methods are unsupervised or rely on synthetic datasets with known generative factors. We argue that recurrent syntactic and semantic regularities in textual data can be used to provide t…

    Submitted 15 February, 2023; v1 submitted 22 September, 2022; originally announced October 2022.

    Comments: Findings of EACL 2023

  22. arXiv:2210.01252

    cs.AI eess.SY

    Estimating productivity gains in digital automation

    Authors: Mauricio Jacobo-Romero, Danilo S. Carvalho, André Freitas

    Abstract: This paper proposes a novel productivity estimation model to evaluate the effects of adopting Artificial Intelligence (AI) components in a production chain. Our model provides evidence to address the "AI's" Solow's Paradox. We provide (i) theoretical and empirical evidence to explain Solow's dichotomy; (ii) a data-driven model to estimate and assess productivity variations; (iii) a methodology unde…

    Submitted 8 October, 2022; v1 submitted 3 October, 2022; originally announced October 2022.

    Comments: 11 pages and 9 figures

  23. arXiv:2203.03021

    cs.LG cs.MA stat.ML

    Hierarchically Structured Scheduling and Execution of Tasks in a Multi-Agent Environment

    Authors: Diogo S. Carvalho, Biswa Sengupta

    Abstract: In a warehouse environment, tasks appear dynamically. Consequently, a task management system that matches them with the workforce too early (e.g., weeks in advance) is necessarily sub-optimal. Also, the rapidly increasing size of the action space of such a system constitutes a significant problem for traditional schedulers. Reinforcement learning, however, is suited to deal with issues requiring m…

    Submitted 6 March, 2022; originally announced March 2022.

  24. arXiv:2111.11758

    cs.LG

    The Impact of Data Distribution on Q-learning with Function Approximation

    Authors: Pedro P. Santos, Diogo S. Carvalho, Alberto Sardinha, Francisco S. Melo

    Abstract: We study the interplay between the data distribution and Q-learning-based algorithms with function approximation. We provide a unified theoretical and empirical analysis as to how different properties of the data distribution influence the performance of Q-learning-based algorithms. We connect different lines of research, as well as validate and extend previous results. We start by reviewing theor…

    Submitted 10 February, 2023; v1 submitted 23 November, 2021; originally announced November 2021.

  25. arXiv:2102.07537

    cs.HC cs.AI

    CHARET: Character-centered Approach to Emotion Tracking in Stories

    Authors: Diogo S. Carvalho, Joana Campos, Manuel Guimarães, Ana Antunes, João Dias, Pedro A. Santos

    Abstract: Autonomous agents that can engage in social interactions with a human is the ultimate goal of a myriad of applications. A key challenge in the design of these applications is to define the social behavior of the agent, which requires extensive content creation. In this research, we explore how we can leverage current state-of-the-art tools to make inferences about the emotional state of a character in…

    Submitted 19 July, 2021; v1 submitted 15 February, 2021; originally announced February 2021.

    Comments: Preprint

  26. arXiv:1706.01038   

    cs.IR cs.CL

    Improving Legal Information Retrieval by Distributional Composition with Term Order Probabilities

    Authors: Danilo S. Carvalho, Duc-Vu Tran, Van-Khanh Tran, Le-Nguyen Minh

    Abstract: Legal professionals worldwide are currently trying to get up-to-pace with the explosive growth in legal document availability through digital means. This drives a need for high efficiency Legal Information Retrieval (IR) and Question Answering (QA) methods. The IR task in particular has a set of unique challenges that invite the use of semantic motivated NLP techniques. In this work, a two-stage m…

    Submitted 10 June, 2017; v1 submitted 4 June, 2017; originally announced June 2017.

    Comments: wrong version

  27. arXiv:1609.00799

    cs.IR cs.CL

    Lexical-Morphological Modeling for Legal Text Analysis

    Authors: Danilo S. Carvalho, Minh-Tien Nguyen, Tran Xuan Chien, Minh Le Nguyen

    Abstract: In the context of the Competition on Legal Information Extraction/Entailment (COLIEE), we propose a method comprising the necessary steps for finding relevant documents to a legal question and deciding on textual entailment evidence to provide a correct answer. The proposed method is based on the combination of several lexical and morphological characteristics, to build a language model and a set…

    Submitted 3 September, 2016; originally announced September 2016.

    Comments: 16 pages, 5 figures, Lecture notes in computer science: New Frontiers in Artificial Intelligence, 2016/03

    MSC Class: 14J30 (Primary); ACM Class: H.3, H.3.3, I.2.7