-
GaRAGe: A Benchmark with Grounding Annotations for RAG Evaluation
Authors:
Ionut-Teodor Sorodoc,
Leonardo F. R. Ribeiro,
Rexhina Blloshmi,
Christopher Davis,
Adrià de Gispert
Abstract:
We present GaRAGe, a large RAG benchmark with human-curated long-form answers and annotations of each grounding passage, allowing a fine-grained evaluation of whether LLMs can identify relevant grounding when generating RAG answers. Our benchmark contains 2366 questions of diverse complexity, dynamism, and topics, and includes over 35K annotated passages retrieved from both private document sets and the Web, to reflect real-world RAG use cases. This makes it an ideal test bed to evaluate an LLM's ability to identify only the relevant information necessary to compose a response, or provide a deflective response when there is insufficient information. Evaluations of multiple state-of-the-art LLMs on GaRAGe show that the models tend to over-summarise rather than (a) ground their answers strictly on the annotated relevant passages (reaching at most a Relevance-Aware Factuality Score of 60%), or (b) deflect when no relevant grounding is available (reaching at most 31% true positive rate in deflections). The F1 in attribution to relevant sources is at most 58.9%, and we show that performance is particularly reduced when answering time-sensitive questions and when having to draw knowledge from sparser private grounding sources.
Submitted 9 June, 2025;
originally announced June 2025.
-
CART-based Synthetic Tabular Data Generation for Imbalanced Regression
Authors:
António Pedro Pinheiro,
Rita P. Ribeiro
Abstract:
Handling imbalanced target distributions in regression tasks remains a significant challenge in tabular data settings where underrepresented regions can hinder model performance. Among data-level solutions, some proposals, such as random sampling and SMOTE-based approaches, propose adapting classification techniques to regression tasks. However, these methods typically rely on crisp, artificial thresholds over the target variable, a limitation inherited from classification settings that can introduce arbitrariness, often leading to non-intuitive and potentially misleading problem formulations. While recent generative models, such as GANs and VAEs, provide flexible sample synthesis, they come with high computational costs and limited interpretability. In this study, we propose adapting an existing CART-based synthetic data generation method, tailoring it for imbalanced regression. The new method integrates relevance and density-based mechanisms to guide sampling in sparse regions of the target space and employs a threshold-free, feature-driven generation process. Our experimental study focuses on the prediction of extreme target values across benchmark datasets. The results indicate that the proposed method is competitive with other resampling and generative strategies in terms of performance, while offering faster execution and greater transparency. These results highlight the method's potential as a transparent, scalable data-level strategy for improving regression models in imbalanced domains.
Submitted 3 June, 2025;
originally announced June 2025.
-
XRAG: Cross-lingual Retrieval-Augmented Generation
Authors:
Wei Liu,
Sony Trenous,
Leonardo F. R. Ribeiro,
Bill Byrne,
Felix Hieber
Abstract:
We propose XRAG, a novel benchmark designed to evaluate the generation abilities of LLMs in cross-lingual Retrieval-Augmented Generation (RAG) settings where the user language does not match the retrieval results. XRAG is constructed from recent news articles to ensure that its questions require external knowledge to be answered. It covers the real-world scenarios of monolingual and multilingual retrieval, and provides relevancy annotations for each retrieved document. Our novel dataset construction pipeline results in questions that require complex reasoning, as evidenced by the significant gap between human and LLM performance. Consequently, XRAG serves as a valuable benchmark for studying LLM reasoning abilities, even before considering the additional cross-lingual complexity. Experimental results on five LLMs uncover two previously unreported challenges in cross-lingual RAG: 1) in the monolingual retrieval setting, all evaluated models struggle with response language correctness; 2) in the multilingual retrieval setting, the main challenge lies in reasoning over retrieved information across languages rather than generation of non-English text.
Submitted 15 May, 2025;
originally announced May 2025.
-
Toward Advancing License Plate Super-Resolution in Real-World Scenarios: A Dataset and Benchmark
Authors:
Valfride Nascimento,
Gabriel E. Lima,
Rafael O. Ribeiro,
William Robson Schwartz,
Rayson Laroca,
David Menotti
Abstract:
Recent advancements in super-resolution for License Plate Recognition (LPR) have sought to address challenges posed by low-resolution (LR) and degraded images in surveillance, traffic monitoring, and forensic applications. However, existing studies have relied on private datasets and simplistic degradation models. To address this gap, we introduce UFPR-SR-Plates, a novel dataset containing 10,000 tracks with 100,000 paired low and high-resolution license plate images captured under real-world conditions. We establish a benchmark using multiple sequential LR and high-resolution (HR) images per vehicle -- five of each -- and two state-of-the-art models for super-resolution of license plates. We also investigate three fusion strategies to evaluate how combining predictions from a leading Optical Character Recognition (OCR) model for multiple super-resolved license plates enhances overall performance. Our findings demonstrate that super-resolution significantly boosts LPR performance, with further improvements observed when applying majority vote-based fusion techniques. Specifically, the Layout-Aware and Character-Driven Network (LCDNet) model combined with the Majority Vote by Character Position (MVCP) strategy led to the highest recognition rates, increasing from 1.7% with low-resolution images to 31.1% with super-resolution, and up to 44.7% when combining OCR outputs from five super-resolved images. These findings underscore the critical role of super-resolution and temporal information in enhancing LPR accuracy under real-world, adverse conditions. The proposed dataset is publicly available to support further research and can be accessed at: https://valfride.github.io/nascimento2024toward/
Submitted 9 May, 2025;
originally announced May 2025.
-
NeoQA: Evidence-based Question Answering with Generated News Events
Authors:
Max Glockner,
Xiang Jiang,
Leonardo F. R. Ribeiro,
Iryna Gurevych,
Markus Dreyer
Abstract:
Evaluating Retrieval-Augmented Generation (RAG) in large language models (LLMs) is challenging because benchmarks can quickly become stale. Questions initially requiring retrieval may become answerable from pretraining knowledge as newer models incorporate more recent information during pretraining, making it difficult to distinguish evidence-based reasoning from recall. We introduce NeoQA (News Events for Out-of-training Question Answering), a benchmark designed to address this issue. To construct NeoQA, we generated timelines and knowledge bases of fictional news events and entities along with news articles and Q&A pairs to prevent LLMs from leveraging pretraining knowledge, ensuring that no prior evidence exists in their training data. We propose our dataset as a new platform for evaluating evidence-based question answering, as it requires LLMs to generate responses exclusively from retrieved evidence and only when sufficient evidence is available. NeoQA enables controlled evaluation across various evidence scenarios, including cases with missing or misleading details. Our findings indicate that LLMs struggle to distinguish subtle mismatches between questions and evidence, and suffer from short-cut reasoning when key information required to answer a question is missing from the evidence, underscoring key limitations in evidence-based reasoning.
Submitted 9 May, 2025;
originally announced May 2025.
-
Hessian Riemannian Flow For Multi-Population Wardrop Equilibrium
Authors:
Tigran Bakaryan,
Christoph Aoun,
Ricardo de Lima Ribeiro,
Naira Hovakimyan,
Diogo Gomes
Abstract:
In this paper, we address the problem of optimizing flows on generalized graphs that feature multiple entry points and multiple populations, each with varying cost structures. We tackle this problem by considering the multi-population Wardrop equilibrium, defined through variational inequalities. We rigorously analyze the existence and uniqueness of the Wardrop equilibrium. Furthermore, we introduce an efficient numerical method to find the solution. In particular, we reformulate the equilibrium problem as a distributed optimization problem over subgraphs and introduce a novel Hessian Riemannian flow method, a Riemannian-manifold-projected Hessian flow, to efficiently compute a solution. Finally, we demonstrate the effectiveness of our approach through examples in urban traffic management, including routing for diverse vehicle types and strategies for minimizing emissions in congested environments.
Submitted 22 April, 2025;
originally announced April 2025.
-
Speaker Fuzzy Fingerprints: Benchmarking Text-Based Identification in Multiparty Dialogues
Authors:
Rui Ribeiro,
Luísa Coheur,
Joao P. Carvalho
Abstract:
Speaker identification using voice recordings leverages unique acoustic features, but this approach fails when only textual data is available. Few approaches have attempted to tackle the problem of identifying speakers solely from text, and the existing ones have primarily relied on traditional methods. In this work, we explore the use of fuzzy fingerprints from large pre-trained models to improve text-based speaker identification. We integrate speaker-specific tokens and context-aware modeling, demonstrating that conversational context significantly boosts accuracy, reaching 70.6% on the Friends dataset and 67.7% on the Big Bang Theory dataset. Additionally, we show that fuzzy fingerprints can approximate full fine-tuning performance with fewer hidden units, offering improved interpretability. Finally, we analyze ambiguous utterances and propose a mechanism to detect speaker-agnostic lines. Our findings highlight key challenges and provide insights for future improvements in text-based speaker identification.
Submitted 21 April, 2025;
originally announced April 2025.
-
Tuning-Free Personalized Alignment via Trial-Error-Explain In-Context Learning
Authors:
Hyundong Cho,
Karishma Sharma,
Nicolaas Jedema,
Leonardo F. R. Ribeiro,
Alessandro Moschitti,
Ravi Krishnan,
Jonathan May
Abstract:
Language models are aligned to the collective voice of many, resulting in generic outputs that do not align with specific users' styles. In this work, we present Trial-Error-Explain In-Context Learning (TICL), a tuning-free method that personalizes language models for text generation tasks with fewer than 10 examples per user. TICL iteratively expands an in-context learning prompt via a trial-error-explain process, adding model-generated negative samples and explanations that provide fine-grained guidance towards a specific user's style. TICL achieves favorable win rates on pairwise comparisons with LLM-as-a-judge up to 91.5% against the previous state-of-the-art and outperforms competitive tuning-free baselines for personalized alignment tasks of writing emails, essays and news articles. Both lexical and qualitative analyses show that the negative samples and explanations enable language models to learn stylistic context more effectively and overcome the bias towards structural and formal phrases observed in their zero-shot outputs. By front-loading inference compute to create a user-specific in-context learning prompt that does not require extra generation steps at test time, TICL presents a novel yet simple approach for personalized alignment.
Submitted 5 April, 2025; v1 submitted 13 February, 2025;
originally announced February 2025.
-
Histogram Approaches for Imbalanced Data Streams Regression
Authors:
Ehsan Aminian,
Rita P. Ribeiro,
Joao Gama
Abstract:
Imbalanced domains pose a significant challenge in real-world predictive analytics, particularly in the context of regression. While existing research has primarily focused on batch learning from static datasets, limited attention has been given to imbalanced regression in online learning scenarios. To address this gap, in prior work we proposed sampling strategies based on Chebyshev's inequality as the first methodologies designed explicitly for data streams. However, these approaches operated under the restrictive assumption that rare instances exclusively reside at distribution extremes. This study introduces histogram-based sampling strategies to overcome this constraint, proposing flexible solutions for imbalanced regression in evolving data streams. The proposed techniques -- Histogram-based Undersampling (HistUS) and Histogram-based Oversampling (HistOS) -- employ incremental online histograms to dynamically detect and prioritize rare instances across arbitrary regions of the target distribution, improving predictions for the rare cases. Comprehensive experiments on synthetic and real-world benchmarks demonstrate that HistUS and HistOS substantially improve rare-case prediction accuracy, outperforming baseline models while maintaining competitiveness with Chebyshev-based approaches.
Submitted 13 March, 2025; v1 submitted 29 January, 2025;
originally announced January 2025.
-
Analysis of Eccentric Coaxial Waveguides Filled with Lossy Anisotropic Media via Finite Difference
Authors:
Raul O. Ribeiro,
Maria A. Martinez,
Guilherme S. Rosa,
Rafael A. Penchel
Abstract:
This study presents a finite difference method (FDM) to model the electromagnetic field propagation in eccentric coaxial waveguides filled with lossy uniaxially anisotropic media. The formulation utilizes conformal transformation to map the eccentric circular waveguide into an equivalent concentric one. In the concentric problem, we introduce a novel normalized Helmholtz equation to decouple TM and TE modes, and we solve this non-homogeneous partial differential equation using the finite difference in cylindrical coordinates. The proposed approach was validated against perturbation-based, spectral element-based, and finite-integration-based numerical solutions. The preliminary results show that our solution is superior in computational time. Furthermore, our FDM formulation can be extended with minimal adaptations to model complex media problems, such as metamaterial devices, optical fibers, and geophysical exploration sensors.
Submitted 23 January, 2025;
originally announced January 2025.
-
Higher-Order Spectral Element Methods for Electromagnetic Modeling of Complex Anisotropic Waveguides
Authors:
Raul Oliveira Ribeiro
Abstract:
This research thesis presents a novel higher-order spectral element method (SEM) formulated in cylindrical coordinates for analyzing electromagnetic fields in waveguides filled with complex anisotropic media. In this study, we consider a large class of cylindrical waveguides: radially-bounded and radially-unbounded domains; homogeneous and inhomogeneous waveguides; concentric and non-concentric geometries; Hermitian and non-Hermitian anisotropic media tensors. This work explores different wave equation formulations for one-layer eccentric and multilayer cylindrical waveguides. For the first case, we can define a new normalized scalar Helmholtz equation for decoupling TM and TE modes, and for the second, a vectorial Helmholtz equation for hybrid modes in multilayered anisotropic structures. Additionally, we formulate a transformation optics (TO) framework to include non-symmetric and non-Hermitian media tensors for non-concentric multilayer waveguides. Lastly, we model excitation sources for logging sensors applied in geophysical problems using the fields obtained by SEM. We validate the proposed approach against analytical solutions, perturbation-based and mode-matching-based methods, finite-elements, and finite-integration numerical methods. Our technique obtains accurate results with fewer elements and degrees of freedom (DoF) than Cartesian-based SEM and ordinary finite-element approaches. To this end, we use higher-order two-dimensional basis functions associated with the zeros of the completed Lobatto polynomial to model the fields in each reference element. The convergence analysis demonstrates the absence of the Runge effect as the expansion order increases. Numerical results show that our formulation is efficient and accurate for modeling cylindrical waveguided geometries filled with complex media.
Submitted 12 November, 2024;
originally announced November 2024.
-
An Intelligent Native Network Slicing Security Architecture Empowered by Federated Learning
Authors:
Rodrigo Moreira,
Rodolfo S. Villaca,
Moises R. N. Ribeiro,
Joberto S. B. Martins,
Joao Henrique Correa,
Tereza C. Carvalho,
Flavio de Oliveira Silva
Abstract:
Network Slicing (NS) has transformed the landscape of resource sharing in networks, offering flexibility to support services and applications with highly variable requirements in areas such as the next-generation 5G/6G mobile networks (NGMN), vehicular networks, industrial Internet of Things (IoT), and verticals. Although significant research and experimentation have driven the development of network slicing, existing architectures often fall short in intrinsic architectural intelligent security capabilities. This paper proposes an architecture-intelligent security mechanism to improve the NS solutions. We conceived a security-native architecture that deploys intelligent microservices as federated agents based on machine learning, providing intra-slice and architectural operation security for the Slicing Future Internet Infrastructures (SFI2) reference architecture. It is noteworthy that federated learning approaches match the highly distributed modern microservice-based architectures, thus providing a unifying and scalable design choice for NS platforms addressing both service and security. Using ML-Agents and Security Agents, our approach identified Distributed Denial-of-Service (DDoS) and intrusion attacks within the slice using generic and non-intrusive telemetry records, achieving an average accuracy of approximately 95.60% in the network slicing architecture and 99.99% for the deployed slice (intra-slice). This result demonstrates the potential for leveraging architectural operational security and introduces a promising new research direction for network slicing architectures.
Submitted 4 October, 2024;
originally announced October 2024.
-
Learning When to Retrieve, What to Rewrite, and How to Respond in Conversational QA
Authors:
Nirmal Roy,
Leonardo F. R. Ribeiro,
Rexhina Blloshmi,
Kevin Small
Abstract:
Augmenting Large Language Models (LLMs) with information retrieval capabilities (i.e., Retrieval-Augmented Generation (RAG)) has proven beneficial for knowledge-intensive tasks. However, understanding users' contextual search intent when generating responses is an understudied topic for conversational question answering (QA). This conversational extension leads to additional concerns when compared to single-turn QA as it is more challenging for systems to comprehend conversational context and manage retrieved passages over multiple turns. In this work, we propose a method for enabling LLMs to decide when to retrieve in RAG settings given a conversational context. When retrieval is deemed necessary, the LLM then rewrites the conversation for passage retrieval and judges the relevance of returned passages before response generation. Operationally, we build on the single-turn SELF-RAG framework (Asai et al., 2023) and propose SELF-multi-RAG for conversational settings. SELF-multi-RAG demonstrates improved capabilities over single-turn variants with respect to retrieving relevant passages (by using summarized conversational context) and assessing the quality of generated responses. Experiments on three conversational QA datasets validate the enhanced response generation capabilities of SELF-multi-RAG, with improvements of ~13% measured by human annotation.
Submitted 23 September, 2024;
originally announced September 2024.
-
Speechworthy Instruction-tuned Language Models
Authors:
Hyundong Cho,
Nicolaas Jedema,
Leonardo F. R. Ribeiro,
Karishma Sharma,
Pedro Szekely,
Alessandro Moschitti,
Ruben Janssen,
Jonathan May
Abstract:
Current instruction-tuned language models are exclusively trained with textual preference data and thus are often not aligned with the unique requirements of other modalities, such as speech. To better align language models with the speech domain, we explore (i) prompting strategies grounded in radio-industry best practices and (ii) preference learning using a novel speech-based preference data of 20K samples, generated with a wide spectrum of prompts that induce varying dimensions of speech-suitability and labeled by annotators who listen to response pairs. Both human and automatic evaluation show that both prompting and preference learning increase the speech-suitability of popular instruction-tuned LLMs. Interestingly, we find that prompting and preference learning can be additive; combining them achieves the best win rates in head-to-head comparison, resulting in responses that are preferred or tied to the base model in 76.2% of comparisons on average. Lastly, we share lexical, syntactical, and qualitative analyses to showcase how each method contributes to improving the speech-suitability of generated responses.
Submitted 22 September, 2024;
originally announced September 2024.
-
Watchlist Challenge: 3rd Open-set Face Detection and Identification
Authors:
Furkan Kasım,
Terrance E. Boult,
Rensso Mora,
Bernardo Biesseck,
Rafael Ribeiro,
Jan Schlueter,
Tomáš Repák,
Rafael Henrique Vareto,
David Menotti,
William Robson Schwartz,
Manuel Günther
Abstract:
In the current landscape of biometrics and surveillance, the ability to accurately recognize faces in uncontrolled settings is paramount. The Watchlist Challenge addresses this critical need by focusing on face detection and open-set identification in real-world surveillance scenarios. This paper presents a comprehensive evaluation of participating algorithms, using the enhanced UnConstrained College Students (UCCS) dataset with new evaluation protocols. In total, four participants submitted four face detection and nine open-set face recognition systems. The evaluation demonstrates that while detection capabilities are generally robust, closed-set identification performance varies significantly, with models pre-trained on large-scale datasets showing superior performance. However, open-set scenarios require further improvement, especially at higher true positive identification rates, i.e., lower thresholds.
Submitted 11 September, 2024;
originally announced September 2024.
-
Multi-Feature Aggregation in Diffusion Models for Enhanced Face Super-Resolution
Authors:
Marcelo dos Santos,
Rayson Laroca,
Rafael O. Ribeiro,
João C. Neves,
David Menotti
Abstract:
Super-resolution algorithms often struggle with images from surveillance environments due to adverse conditions such as unknown degradation, variations in pose, irregular illumination, and occlusions. However, acquiring multiple images, even of low quality, is possible with surveillance cameras. In this work, we develop an algorithm based on diffusion models that utilize a low-resolution image combined with features extracted from multiple low-quality images to generate a super-resolved image while minimizing distortions in the individual's identity. Unlike other algorithms, our approach recovers facial features without explicitly providing attribute information or without the need to calculate a gradient of a function during the reconstruction process. To the best of our knowledge, this is the first time multi-features combined with low-resolution images are used as conditioners to generate more reliable super-resolution images using stochastic differential equations. The FFHQ dataset was employed for training, resulting in state-of-the-art performance in facial recognition and verification metrics when evaluated on the CelebA and Quis-Campi datasets. Our code is publicly available at https://github.com/marcelowds/fasr
Submitted 20 October, 2024; v1 submitted 27 August, 2024;
originally announced August 2024.
-
Enhancing License Plate Super-Resolution: A Layout-Aware and Character-Driven Approach
Authors:
Valfride Nascimento,
Rayson Laroca,
Rafael O. Ribeiro,
William Robson Schwartz,
David Menotti
Abstract:
Despite significant advancements in License Plate Recognition (LPR) through deep learning, most improvements rely on high-resolution images with clear characters. This scenario does not reflect real-world conditions where traffic surveillance often captures low-resolution and blurry images. Under these conditions, characters tend to blend with the background or neighboring characters, making accurate LPR challenging. To address this issue, we introduce a novel loss function, Layout and Character Oriented Focal Loss (LCOFL), which considers factors such as resolution, texture, and structural details, as well as the performance of the LPR task itself. We enhance character feature learning using deformable convolutions and shared weights in an attention module and employ a GAN-based training approach with an Optical Character Recognition (OCR) model as the discriminator to guide the super-resolution process. Our experimental results show significant improvements in character reconstruction quality, outperforming two state-of-the-art methods in both quantitative and qualitative measures. Our code is publicly available at https://github.com/valfride/lpsr-lacd
Submitted 20 October, 2024; v1 submitted 27 August, 2024;
originally announced August 2024.
-
From Brazilian Portuguese to European Portuguese
Authors:
João Sanches,
Rui Ribeiro,
Luísa Coheur
Abstract:
Brazilian Portuguese and European Portuguese are two varieties of the same language and, despite their close similarities, they exhibit several differences. However, there is a significant disproportion in the availability of resources between the two variants, with Brazilian Portuguese having more abundant resources. This inequity can impact the quality of translation services accessible to European Portuguese speakers. To address this issue, we propose the development of a Brazilian Portuguese to European Portuguese translation system, leveraging recent advancements in neural architectures and models. To evaluate the performance of such systems, we manually curated a gold test set comprising 500 sentences across five different topics. Each sentence in the gold test set has two distinct references, facilitating a straightforward evaluation of future translation models. We experimented with various models by fine-tuning existing Large Language Models using parallel data extracted from movie subtitles and TED Talks transcripts in both Brazilian and European Portuguese. Our evaluation involved the use of conventional automatic metrics as well as a human evaluation. In addition, all models were compared against ChatGPT 3.5 Turbo, which currently yields the best results.
Submitted 14 August, 2024;
originally announced August 2024.
-
FANTAstic SEquences and Where to Find Them: Faithful and Efficient API Call Generation through State-tracked Constrained Decoding and Reranking
Authors:
Zhuoer Wang,
Leonardo F. R. Ribeiro,
Alexandros Papangelis,
Rohan Mukherjee,
Tzu-Yen Wang,
Xinyan Zhao,
Arijit Biswas,
James Caverlee,
Angeliki Metallinou
Abstract:
API call generation is the cornerstone of large language models' tool-using ability that provides access to the larger world. However, existing supervised and in-context learning approaches suffer from high training costs, poor data efficiency, and generated API calls that can be unfaithful to the API documentation and the user's request. To address these limitations, we propose an output-side optimization approach called FANTASE. Two of the unique contributions of FANTASE are its State-Tracked Constrained Decoding (SCD) and Reranking components. SCD dynamically incorporates appropriate API constraints in the form of a Token Search Trie for efficient and guaranteed generation faithfulness with respect to the API documentation. The Reranking component efficiently brings in the supervised signal by leveraging a lightweight model as the discriminator to rerank the beam-searched candidate generations of the large language model. We demonstrate the superior performance of FANTASE in API call generation accuracy, inference efficiency, and context efficiency on the DSTC8 and API Bank datasets.
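The idea of constraining decoding with a token trie can be illustrated with a minimal sketch. This is not the FANTASE implementation: the API names, the token granularity, the toy scoring function, and the helpers `build_trie` and `constrained_greedy_decode` are all hypothetical, and the state tracking and beam search with reranking described in the abstract are omitted in favor of plain greedy decoding.

```python
END = "<end>"  # hypothetical end-of-sequence marker


def build_trie(sequences):
    """Build a nested-dict trie over the allowed token sequences."""
    root = {}
    for seq in sequences:
        node = root
        for token in seq:
            node = node.setdefault(token, {})
        node[END] = {}  # mark that a valid sequence may stop here
    return root


def constrained_greedy_decode(trie, score_fn, max_len=10):
    """Greedy decoding restricted to tokens that the trie allows next."""
    node, output = trie, []
    for _ in range(max_len):
        allowed = list(node.keys())
        if not allowed:
            break
        # Only tokens reachable from the current trie node are candidates.
        best = max(allowed, key=lambda tok: score_fn(output, tok))
        if best == END:
            break
        output.append(best)
        node = node[best]
    return output


# Hypothetical API signatures, tokenized by hand for illustration.
allowed_calls = [
    ["FindRestaurants", "(", "city", ")"],
    ["FindRestaurants", "(", "cuisine", ")"],
    ["BookTable", "(", "time", ")"],
]
# Toy stand-in for model scores; a real system would use LLM logits.
toy_scores = {"FindRestaurants": 2.0, "BookTable": 1.0,
              "cuisine": 3.0, "city": 1.0, "time": 5.0}

trie = build_trie(allowed_calls)
decoded = constrained_greedy_decode(
    trie, lambda prefix, tok: toy_scores.get(tok, 0.0))
```

In this toy run the highest-scoring token overall (`time`) is never emitted after `FindRestaurants (`, because the trie does not allow it in that position; that is the sense in which the output stays faithful to the allowed call patterns.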
Submitted 18 July, 2024;
originally announced July 2024.
-
Measuring Retrieval Complexity in Question Answering Systems
Authors:
Matteo Gabburo,
Nicolaas Paul Jedema,
Siddhant Garg,
Leonardo F. R. Ribeiro,
Alessandro Moschitti
Abstract:
In this paper, we investigate which questions are challenging for retrieval-based Question Answering (QA). We (i) propose retrieval complexity (RC), a novel metric conditioned on the completeness of retrieved documents, which measures the difficulty of answering questions, and (ii) propose an unsupervised pipeline to measure RC given an arbitrary retrieval system. Our proposed pipeline measures RC more accurately than alternative estimators, including LLMs, on six challenging QA benchmarks. Further investigation reveals that RC scores strongly correlate with both QA performance and expert judgment across five of the six studied benchmarks, indicating that RC is an effective measure of question difficulty. Subsequent categorization of high-RC questions shows that they span a broad set of question shapes, including multi-hop, compositional, and temporal QA, indicating that RC scores can categorize a new subset of complex questions. Our system can also have a major impact on retrieval-based systems by helping to identify more challenging questions on existing datasets.
Submitted 5 June, 2024;
originally announced June 2024.
-
Artificial Intelligence Approaches for Predictive Maintenance in the Steel Industry: A Survey
Authors:
Jakub Jakubowski,
Natalia Wojak-Strzelecka,
Rita P. Ribeiro,
Sepideh Pashami,
Szymon Bobek,
Joao Gama,
Grzegorz J Nalepa
Abstract:
Predictive Maintenance (PdM) has emerged as one of the pillars of Industry 4.0 and has become crucial for enhancing operational efficiency, making it possible to minimize downtime, extend equipment lifespan, and prevent failures. A wide range of PdM tasks can be performed using Artificial Intelligence (AI) methods, which often use data generated from industrial sensors. The steel industry, which is an important branch of the global economy, is one of the potential beneficiaries of this trend, given its large environmental footprint, the globalized nature of the market, and the demanding working conditions. This survey synthesizes the current state of knowledge in the field of AI-based PdM within the steel industry and is addressed to researchers and practitioners. We identified 219 articles related to this topic and formulated five research questions, allowing us to gain a global perspective on current trends and the main research gaps. We examined equipment and facilities subjected to PdM, determined common PdM approaches, and identified trends in the AI methods used to develop these solutions. We explored the characteristics of the data used in the surveyed articles and assessed the practical implications of the research presented there. Most of the research focuses on the blast furnace or hot rolling, using data from industrial sensors. Current trends show increasing interest in the domain, especially in the use of deep learning. The main challenges include implementing the proposed methods in a production environment, incorporating them into maintenance plans, and enhancing the accessibility and reproducibility of the research.
Submitted 21 May, 2024;
originally announced May 2024.
-
Aequitas Flow: Streamlining Fair ML Experimentation
Authors:
Sérgio Jesus,
Pedro Saleiro,
Inês Oliveira e Silva,
Beatriz M. Jorge,
Rita P. Ribeiro,
João Gama,
Pedro Bizarro,
Rayid Ghani
Abstract:
Aequitas Flow is an open-source framework and toolkit for end-to-end Fair Machine Learning (ML) experimentation and benchmarking in Python. This package fills integration gaps that exist in other fair ML packages. In addition to the existing audit capabilities in Aequitas, the Aequitas Flow module provides a pipeline for fairness-aware model training, hyperparameter optimization, and evaluation, enabling easy-to-use and rapid experiments and analysis of results. Aimed at ML practitioners and researchers, the framework offers implementations of methods, datasets, metrics, and standard interfaces for these components to improve extensibility. By facilitating the development of fair ML practices, Aequitas Flow aims to enhance the incorporation of fairness concepts in AI systems, making them more robust and fair.
Submitted 30 October, 2024; v1 submitted 9 May, 2024;
originally announced May 2024.
-
A Multilevel Strategy to Improve People Tracking in a Real-World Scenario
Authors:
Cristiano B. de Oliveira,
Joao C. Neves,
Rafael O. Ribeiro,
David Menotti
Abstract:
The Palácio do Planalto, office of the President of Brazil, was invaded by protesters on January 8, 2023. Surveillance videos taken from inside the building were subsequently released by the Brazilian Supreme Court for public scrutiny. We used segments of such footage to create the UFPR-Planalto801 dataset for people tracking and re-identification in a real-world scenario. This dataset consists of more than 500,000 images. This paper presents a tracking approach targeting this dataset. The proposed method relies on known state-of-the-art trackers combined in a multilevel hierarchy to correct the ID association over the trajectories. We evaluated our method using the IDF1, MOTA, MOTP and HOTA metrics. The results show improvements for every tracker used in the experiments, with the IDF1 score increasing by up to 9.5%.
Submitted 29 April, 2024;
originally announced April 2024.
-
A Neuro-Symbolic Explainer for Rare Events: A Case Study on Predictive Maintenance
Authors:
João Gama,
Rita P. Ribeiro,
Saulo Mastelini,
Narjes Davarid,
Bruno Veloso
Abstract:
Predictive Maintenance applications are increasingly complex, with interactions between many components. Black box models based on deep learning techniques are popular approaches due to their predictive accuracy. This paper proposes a neural-symbolic architecture that uses an online rule-learning algorithm to explain when the black box model predicts failures. The proposed system solves two problems in parallel: anomaly detection and explanation of the anomaly. For the first problem, we use an unsupervised state-of-the-art autoencoder. For the second problem, we train a rule-learning system that learns a mapping from the input features to the autoencoder reconstruction error. Both systems run online and in parallel. The autoencoder signals an alarm for the examples with a reconstruction error that exceeds a threshold. The causes of the alarm are hard for humans to understand because they result from a nonlinear combination of sensor data. The rule triggered by that example describes the relationship between the input features and the autoencoder reconstruction error. The rule explains the failure signal by indicating which sensors contribute to the alarm and allowing the identification of the component involved in the failure. The system can present global explanations for the black box model and local explanations for why the black box model predicts a failure. We evaluate the proposed system in a real-world case study of Metro do Porto and provide explanations that illustrate its benefits.
Submitted 21 April, 2024;
originally announced April 2024.
-
Super-Resolution Analysis for Landfill Waste Classification
Authors:
Matias Molina,
Rita P. Ribeiro,
Bruno Veloso,
João Gama
Abstract:
Illegal landfills are a critical issue due to their environmental, economic, and public health impacts. This study leverages aerial imagery for environmental crime monitoring. While advances in artificial intelligence and computer vision hold promise, the challenge lies in training models with high-resolution literature datasets and adapting them to open-access low-resolution images. Considering the substantial quality differences and limited annotation, this research explores the adaptability of models across these domains. Motivated by the necessity for a comprehensive evaluation of waste detection algorithms, it advocates cross-domain classification and super-resolution enhancement to analyze the impact of different image resolutions on waste classification as an evaluation to combat the proliferation of illegal landfills. We observed performance improvements by enhancing image quality but noted an influence on model sensitivity, necessitating careful threshold fine-tuning.
Submitted 2 April, 2024;
originally announced April 2024.
-
On the Role of Summary Content Units in Text Summarization Evaluation
Authors:
Marcel Nawrath,
Agnieszka Nowak,
Tristan Ratz,
Danilo C. Walenta,
Juri Opitz,
Leonardo F. R. Ribeiro,
João Sedoc,
Daniel Deutsch,
Simon Mille,
Yixin Liu,
Lining Zhang,
Sebastian Gehrmann,
Saad Mahamood,
Miruna Clinciu,
Khyathi Chandu,
Yufang Hou
Abstract:
At the heart of the Pyramid evaluation method for text summarization lie human written summary content units (SCUs). These SCUs are concise sentences that decompose a summary into small facts. Such SCUs can be used to judge the quality of a candidate summary, possibly partially automated via natural language inference (NLI) systems. Interestingly, with the aim to fully automate the Pyramid evaluation, Zhang and Bansal (2021) show that SCUs can be approximated by automatically generated semantic role triplets (STUs). However, several questions currently lack answers, in particular: i) Are there other ways of approximating SCUs that can offer advantages? ii) Under which conditions are SCUs (or their approximations) offering the most value? In this work, we examine two novel strategies to approximate SCUs: generating SCU approximations from AMR meaning representations (SMUs) and from large language models (SGUs), respectively. We find that while STUs and SMUs are competitive, the best approximation quality is achieved by SGUs. We also show through a simple sentence-decomposition baseline (SSUs) that SCUs (and their approximations) offer the most value when ranking short summaries, but may not help as much when ranking systems or longer summaries.
Submitted 2 April, 2024;
originally announced April 2024.
-
Logic-based Explanations for Linear Support Vector Classifiers with Reject Option
Authors:
Francisco Mateus Rocha Filho,
Thiago Alves Rocha,
Reginaldo Pereira Fernandes Ribeiro,
Ajalmar Rêgo da Rocha Neto
Abstract:
Support Vector Classifier (SVC) is a well-known Machine Learning (ML) model for linear classification problems. It can be used in conjunction with a reject option strategy to reject instances that are hard to classify correctly and delegate them to a specialist. This further increases the confidence of the model. Given this, obtaining an explanation of the cause of rejection is important to avoid blindly trusting the obtained results. While most of the related work has developed means to give such explanations for machine learning models, to the best of our knowledge none has done so for models with a reject option. We propose a logic-based approach with formal guarantees on the correctness and minimality of explanations for linear SVCs with reject option. We evaluate our approach by comparing it to Anchors, which is a heuristic algorithm for generating explanations. The results show that our proposed method gives shorter explanations with reduced time cost.
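A reject option for a linear classifier can be sketched as follows. This only illustrates the general idea of rejecting low-confidence instances; the symmetric `threshold` on the decision value and the function name `classify_with_reject` are assumptions made for illustration, and the paper may define its rejection region differently. The logic-based explanation method itself is not shown.

```python
def classify_with_reject(weights, bias, x, threshold=0.5):
    """Linear decision rule with a reject band around the hyperplane.

    Returns +1 or -1 when the decision value is far enough from zero,
    and "reject" when it falls inside the (assumed symmetric) band.
    """
    score = sum(w * xi for w, xi in zip(weights, x)) + bias
    if abs(score) < threshold:
        return "reject"  # too close to the boundary: delegate to a specialist
    return 1 if score > 0 else -1


# Hypothetical two-feature model for illustration.
w, b = (1.0, -1.0), 0.0
confident_positive = classify_with_reject(w, b, (2.0, 0.0))
confident_negative = classify_with_reject(w, b, (0.0, 2.0))
borderline = classify_with_reject(w, b, (1.2, 1.0))
```

Widening `threshold` rejects more instances and leaves fewer, more confident automatic decisions, which is the trade-off a reject option is meant to expose.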
Submitted 24 March, 2024;
originally announced March 2024.
-
Redex -> Coq: towards a theory of decidability of Redex's reduction semantics
Authors:
Mallku Soldevila,
Rodrigo Ribeiro,
Beta Ziliani
Abstract:
We propose the first steps in the development of a tool to automate the translation of Redex models into a (hopefully) semantically equivalent model in Coq, and to provide tactics to help in the certification of fundamental properties of such models. The work is heavily based on a model of Redex's semantics developed by Klein et al. By means of a simple generalization of the matching problem in Redex, we obtain an algorithm suitable for mechanization in Coq, for which we prove soundness properties and correspondence with the original solution proposed by Klein et al. In the process, we also adapt some parts of our mechanization to better prepare it for the future inclusion of Redex features absent in the present model, like its Kleene-star operator. Finally, we discuss future avenues of development that are enabled by this work.
Submitted 5 February, 2024;
originally announced February 2024.
-
Rarity of the infinite chains in the tree of numerical semigroups
Authors:
Maria Bras-Amorós,
Mariana Rosas Ribeiro
Abstract:
We prove that, for each fixed genus, the proportion of semigroups of that
genus belonging to infinite chains in the semigroup tree approaches 0 as
the genus grows to infinity. This means that most numerical semigroups
have a finite number of descendants in the semigroup tree. This problem
has been open since 2009.
Submitted 31 January, 2024;
originally announced January 2024.
-
Structural results for the Tree Builder Random Walk
Authors:
Janos Engländer,
Giulio Iacobelli,
Gábor Pete,
Rodrigo Ribeiro
Abstract:
We study the Tree Builder Random Walk: a randomly growing tree, built by a walker as she is walking around the tree. Namely, at each time $n$, she adds a leaf to her current vertex with probability $p_n \asymp n^{-\gamma}$, $\gamma \in (2/3,1]$, then moves to a uniform random neighbor on the possibly modified tree. We show that the tree process at its growth times, after a random finite number of steps, can be coupled to be identical to the Barabási-Albert preferential attachment tree model.
Thus, our TBRW-model is a local dynamics giving rise to the BA-model. The coupling also implies that many properties known for the BA-model, such as diameter and degree distribution, can be directly transferred to our TBRW-model, extending previous results.
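The walk dynamics described above can be simulated with a short sketch. Several choices here are assumptions rather than details from the paper: the function name `tree_builder_random_walk`, the initial tree (a single edge), the walker's starting vertex, and taking $p_n = n^{-\gamma}$ exactly, whereas the abstract only requires $p_n \asymp n^{-\gamma}$.

```python
import random


def tree_builder_random_walk(steps, gamma=0.8, seed=0):
    """Simulate the walk for `steps` time units; return adjacency lists."""
    rng = random.Random(seed)
    # Assumed initial condition: one edge, walker starts at vertex 0.
    adj = {0: [1], 1: [0]}
    position = 0
    for n in range(1, steps + 1):
        p_n = n ** (-gamma)  # assumed exact form of the leaf probability
        if rng.random() < p_n:
            # Attach a new leaf to the walker's current vertex.
            leaf = len(adj)
            adj[leaf] = [position]
            adj[position].append(leaf)
        # Move to a uniformly random neighbor on the (possibly modified) tree.
        position = rng.choice(adj[position])
    return adj


tree = tree_builder_random_walk(steps=1000, gamma=0.8, seed=0)
degrees = sorted((len(nbrs) for nbrs in tree.values()), reverse=True)
```

Inspecting `degrees` for long runs gives an informal way to compare the empirical degree sequence with what one would expect from a preferential attachment tree, though a sketch like this is no substitute for the coupling argument.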
Submitted 6 December, 2024; v1 submitted 30 November, 2023;
originally announced November 2023.
-
Generating Summaries with Controllable Readability Levels
Authors:
Leonardo F. R. Ribeiro,
Mohit Bansal,
Markus Dreyer
Abstract:
Readability refers to how easily a reader can understand a written text. Several factors affect the readability level, such as the complexity of the text, its subject matter, and the reader's background knowledge. Generating summaries based on different readability levels is critical for enabling knowledge consumption by diverse audiences. However, current text generation approaches lack refined control, resulting in texts that are not customized to readers' proficiency levels. In this work, we bridge this gap and study techniques to generate summaries at specified readability levels. Unlike previous methods that focus on a specific readability level (e.g., lay summarization), we generate summaries with fine-grained control over their readability. We develop three text generation techniques for controlling readability: (1) instruction-based readability control, (2) reinforcement learning to minimize the gap between requested and observed readability and (3) a decoding approach that uses lookahead to estimate the readability of upcoming decoding steps. We show that our generation methods significantly improve readability control on news summarization (CNN/DM dataset), as measured by various readability metrics and human judgement, establishing strong baselines for controllable readability in summarization.
Submitted 16 October, 2023;
originally announced October 2023.
-
Socially reactive navigation models for mobile robots in dynamic environments
Authors:
Ricarte Ribeiro,
Plinio Moreno
Abstract:
The objective of this work is to expand upon previous works, considering socially acceptable behaviours within robot navigation and interaction, and allow a robot to closely approach static and dynamic individuals or groups. The space models developed in this dissertation are adaptive, that is, capable of changing over time to accommodate the changing circumstances often present in a social environment. The parameters of the space model are adapted with the end goal of enabling a close interaction between humans and robots, taking into account not only the arrangement of the groups, but also the basic characteristics of the robot itself. This work also further develops a preexisting approach pose estimation algorithm in order to better guarantee the safety and comfort of the humans involved in the interaction, by taking into account basic human sensibilities. The algorithms are integrated into ROS's navigation system through the use of the costmap_2d and move_base packages. The space model adaptation is tested via comparative evaluation against previous algorithms through the use of datasets. The entire navigation system is then evaluated through both simulations (static and dynamic) and real-life situations (static). These experiments demonstrate that the developed space model and approach pose estimation algorithms are capable of enabling a robot to closely approach individual humans and groups, while maintaining considerations for their comfort and sensibilities.
Submitted 15 October, 2023;
originally announced October 2023.
-
Experiential-Informed Data Reconstruction for Fishery Sustainability and Policies in the Azores
Authors:
Brenda Nogueira,
Gui M. Menezes,
Nuno Moniz,
Rita P. Ribeiro
Abstract:
Fishery analysis is critical in maintaining the long-term sustainability of species and the livelihoods of millions of people who depend on fishing for food and income. The fishing gear, or metier, is a key factor significantly impacting marine habitats, selectively targeting species and fish sizes. Analysis of commercial catches or landings by metier in fishery stock assessment and management is crucial, providing robust estimates of fishing efforts and their impact on marine ecosystems. In this paper, we focus on a unique data set from the Azores' fishing data collection programs between 2010 and 2017, where little information on metiers is available and sparse throughout our timeline. Our main objective is to tackle the task of data set reconstruction, leveraging domain knowledge and machine learning methods to retrieve or associate metier-related information to each fish landing. We empirically validate the feasibility of this task using a diverse set of modeling approaches and demonstrate how it provides new insights into different fisheries' behavior and the impact of metiers over time, which are essential for future fish population assessments, management, and conservation efforts.
Submitted 13 October, 2024; v1 submitted 17 September, 2023;
originally announced September 2023.
-
Fuzzy Fingerprinting Transformer Language-Models for Emotion Recognition in Conversations
Authors:
Patrícia Pereira,
Rui Ribeiro,
Helena Moniz,
Luisa Coheur,
Joao Paulo Carvalho
Abstract:
Fuzzy Fingerprints have been successfully used as an interpretable text classification technique, but, like most other techniques, have been largely surpassed in performance by Large Pre-trained Language Models, such as BERT or RoBERTa. These models deliver state-of-the-art results in several Natural Language Processing tasks, namely Emotion Recognition in Conversations (ERC), but suffer from the lack of interpretability and explainability. In this paper, we propose to combine the two approaches to perform ERC, as a means to obtain simpler and more interpretable Large Language Models-based classifiers. We propose to feed the utterances and their previous conversational turns to a pre-trained RoBERTa, obtaining contextual embedding utterance representations, that are then supplied to an adapted Fuzzy Fingerprint classification module. We validate our approach on the widely used DailyDialog ERC benchmark dataset, in which we obtain state-of-the-art level results using a much lighter model.
Submitted 8 September, 2023;
originally announced September 2023.
-
Enhancing Network Slicing Architectures with Machine Learning, Security, Sustainability and Experimental Networks Integration
Authors:
Joberto S. B. Martins,
Tereza C. Carvalho,
Rodrigo Moreira,
Cristiano Both,
Adnei Donatti,
João H. Corrêa,
José A. Suruagy,
Sand L. Corrêa,
Antonio J. G. Abelem,
Moisés R. N. Ribeiro,
Jose-Marcos Nogueira,
Luiz C. S. Magalhães,
Juliano Wickboldt,
Tiago Ferreto,
Ricardo Mello,
Rafael Pasquini,
Marcos Schwarz,
Leobino N. Sampaio,
Daniel F. Macedo,
José F. de Rezende,
Kleber V. Cardoso,
Flávio O. Silva
Abstract:
Network Slicing (NS) is an essential technique extensively used in 5G networks computing strategies, mobile edge computing, mobile cloud computing, and verticals like the Internet of Vehicles and industrial IoT, among others. NS is foreseen as one of the leading enablers for 6G futuristic and highly demanding applications since it allows the optimization and customization of scarce and disputed resources among dynamic, demanding clients with highly distinct application requirements. Various standardization organizations, like 3GPP's proposal for new generation networks and state-of-the-art 5G/6G research projects, are proposing new NS architectures. However, new NS architectures have to deal with an extensive range of requirements that inherently result in having NS architecture proposals typically fulfilling the needs of specific sets of domains with commonalities. The Slicing Future Internet Infrastructures (SFI2) architecture proposal explores the gap resulting from the diversity of NS architectures target domains by proposing a new NS reference architecture with a defined focus on integrating experimental networks and enhancing the NS architecture with Machine Learning (ML) native optimizations, energy-efficient slicing, and slicing-tailored security functionalities. The SFI2 architectural main contribution includes the utilization of the slice-as-a-service paradigm for end-to-end orchestration of resources across multi-domains and multi-technology experimental networks. In addition, the SFI2 reference architecture instantiations will enhance the multi-domain and multi-technology integrated experimental network deployment with native ML optimization, energy-efficient aware slicing, and slicing-tailored security functionalities for the practical domain.
Submitted 18 July, 2023;
originally announced July 2023.
-
Reconstructing Spatiotemporal Data with C-VAEs
Authors:
Tiago F. R. Ribeiro,
Fernando Silva,
Rogério Luís de C. Costa
Abstract:
The continuous representation of spatiotemporal data commonly relies on using abstract data types, such as \textit{moving regions}, to represent entities whose shape and position continuously change over time. Creating this representation from discrete snapshots of real-world entities requires using interpolation methods to compute in-between data representations and estimate the position and shape of the object of interest at arbitrary temporal points. Existing region interpolation methods often fail to generate smooth and realistic representations of a region's evolution. However, recent advancements in deep learning techniques have revealed the potential of deep models trained on discrete observations to capture spatiotemporal dependencies through implicit feature learning.
In this work, we explore the capabilities of Conditional Variational Autoencoder (C-VAE) models to generate smooth and realistic representations of the spatiotemporal evolution of moving regions. We evaluate our proposed approach on a sparsely annotated dataset on the burnt area of a forest fire. We apply compression operations to sample from the dataset and use the C-VAE model and other commonly used interpolation algorithms to generate in-between region representations. To evaluate the performance of the methods, we compare their interpolation results with manually annotated data and regions generated by a U-Net model. We also assess the quality of generated data considering temporal consistency metrics.
The proposed C-VAE-based approach demonstrates competitive results in geometric similarity metrics. It also exhibits superior temporal consistency, suggesting that C-VAE models may be a viable alternative for modelling the spatiotemporal evolution of 2D moving regions.
Submitted 28 August, 2023; v1 submitted 12 July, 2023;
originally announced July 2023.
-
Mobility Strategy of Multi-Limbed Climbing Robots for Asteroid Exploration
Authors:
Warley F. R. Ribeiro,
Kentaro Uno,
Masazumi Imai,
Koki Murase,
Barış Can Yalçın,
Matteo El Hariry,
Miguel A. Olivares-Mendez,
Kazuya Yoshida
Abstract:
Mobility on asteroids by multi-limbed climbing robots is expected to help achieve our exploration goals in such challenging environments. We propose a mobility strategy to improve the locomotion safety of climbing robots in such harsh environments, which feature extremely low gravity and highly uneven terrain. Our method plans the gait by decoupling the base and limbs' movements and adjusting the main body pose to avoid ground collisions. The proposed approach includes a motion planner that reduces the reactions generated by the robot's movement by optimizing the swinging trajectory and distributing the momentum. Lower motion reactions decrease the pulling forces on the grippers, avoiding the slippage and flotation of the robot. Dynamic simulations and experiments demonstrate that the proposed method could improve the robot's mobility on the surface of asteroids.
Submitted 22 June, 2023; v1 submitted 13 June, 2023;
originally announced June 2023.
-
Explainable Predictive Maintenance
Authors:
Sepideh Pashami,
Slawomir Nowaczyk,
Yuantao Fan,
Jakub Jakubowski,
Nuno Paiva,
Narjes Davari,
Szymon Bobek,
Samaneh Jamshidi,
Hamid Sarmadi,
Abdallah Alabdallah,
Rita P. Ribeiro,
Bruno Veloso,
Moamar Sayed-Mouchaweh,
Lala Rajaoarisoa,
Grzegorz J. Nalepa,
João Gama
Abstract:
Explainable Artificial Intelligence (XAI) fills the role of a critical interface fostering interactions between sophisticated intelligent systems and diverse individuals, including data scientists, domain experts, end-users, and more. It aids in deciphering the intricate internal mechanisms of "black box" Machine Learning (ML) models, rendering the reasons behind their decisions more understandable. However, current research in XAI primarily focuses on two aspects: ways to facilitate user trust, and ways to debug and refine the ML model. The majority of it falls short of recognising the diverse types of explanations needed in broader contexts, as different users and varied application areas necessitate solutions tailored to their specific needs.
One such domain is Predictive Maintenance (PdM), an exploding area of research under the Industry 4.0 & 5.0 umbrella. This position paper highlights the gap between existing XAI methodologies and the specific requirements for explanations within industrial applications, particularly the Predictive Maintenance field. Despite explainability's crucial role, this subject remains a relatively under-explored area, making this paper a pioneering attempt to bring relevant challenges to the research community's attention. We provide an overview of predictive maintenance tasks and accentuate the need and varying purposes for corresponding explanations. We then list and describe XAI techniques commonly employed in the literature, discussing their suitability for PdM tasks. Finally, to make the ideas and claims more concrete, we demonstrate XAI applied in four specific industrial use cases: commercial vehicles, metro trains, steel plants, and wind farms, spotlighting areas requiring further research.
Submitted 8 June, 2023;
originally announced June 2023.
-
Learning to Reason over Scene Graphs: A Case Study of Finetuning GPT-2 into a Robot Language Model for Grounded Task Planning
Authors:
Georgia Chalvatzaki,
Ali Younes,
Daljeet Nandha,
An Le,
Leonardo F. R. Ribeiro,
Iryna Gurevych
Abstract:
Long-horizon task planning is essential for the development of intelligent assistive and service robots. In this work, we investigate the applicability of a smaller class of large language models (LLMs), specifically GPT-2, in robotic task planning by learning to decompose tasks into subgoal specifications for a planner to execute sequentially. Our method grounds the input of the LLM on the domain that is represented as a scene graph, enabling it to translate human requests into executable robot plans, thereby learning to reason over long-horizon tasks, as encountered in the ALFRED benchmark. We compare our approach with classical planning and baseline methods to examine the applicability and generalizability of LLM-based planners. Our findings suggest that the knowledge stored in an LLM can be effectively grounded to perform long-horizon task planning, demonstrating the promising potential for the future application of neuro-symbolic planning methods in robotics.
Submitted 12 May, 2023;
originally announced May 2023.
-
Embedding Aggregation for Forensic Facial Comparison
Authors:
Rafael Oliveira Ribeiro,
João C. R. Neves,
Arnout C. C. Ruifrok,
Flavio de Barros Vidal
Abstract:
In forensic facial comparison, questioned-source images are usually captured in uncontrolled environments, with non-uniform lighting, and from non-cooperative subjects. The poor quality of such material usually compromises its value as evidence in legal matters. On the other hand, in forensic casework, multiple images of the person of interest are usually available. In this paper, we propose to aggregate deep neural network embeddings from various images of the same person to improve performance in facial verification. We observe significant performance improvements, especially for very low-quality images. Further improvements are obtained by aggregating embeddings of more images and by applying quality-weighted aggregation. We demonstrate the benefits of this approach in forensic evaluation settings with the development and validation of score-based likelihood ratio systems and report improvements in Cllr of up to 95% (from 0.249 to 0.012) for CCTV images and of up to 96% (from 0.083 to 0.003) for social media images.
Submitted 29 April, 2023;
originally announced May 2023.
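The quality-weighted aggregation mentioned in the abstract of "Embedding Aggregation for Forensic Facial Comparison" can be illustrated with a minimal sketch. It assumes each image yields one embedding vector and one non-negative quality score; the function names, the L2 normalization, and the weighted mean are illustrative assumptions, not the authors' implementation.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def aggregate_embeddings(embeddings, qualities=None):
    """Combine several embeddings of one person into a single unit vector.

    `qualities` holds hypothetical per-image quality scores; when omitted,
    every image gets the same weight (plain averaging).
    """
    if not embeddings:
        raise ValueError("need at least one embedding")
    if qualities is None:
        qualities = [1.0] * len(embeddings)
    if len(qualities) != len(embeddings):
        raise ValueError("one quality score per embedding is required")
    if any(q < 0 for q in qualities) or sum(qualities) <= 0:
        raise ValueError("quality scores must be non-negative, not all zero")

    dim = len(embeddings[0])
    total = [0.0] * dim
    for emb, quality in zip(embeddings, qualities):
        unit = l2_normalize(emb)
        for i in range(dim):
            total[i] += quality * unit[i]
    # Re-normalize so the result can be compared with cosine similarity.
    return l2_normalize(total)


def cosine_similarity(a, b):
    """Cosine similarity between two vectors."""
    a, b = l2_normalize(a), l2_normalize(b)
    return sum(x * y for x, y in zip(a, b))
```

In such a setup the aggregated vector for the person of interest would be compared against a reference embedding with `cosine_similarity`, and the resulting score would feed the likelihood-ratio system.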
-
PGTask: Introducing the Task of Profile Generation from Dialogues
Authors:
Rui Ribeiro,
Joao P. Carvalho,
Luísa Coheur
Abstract:
Recent approaches have attempted to personalize dialogue systems by incorporating profile information into models. However, this knowledge is scarce and difficult to obtain, which makes the extraction/generation of profile information from dialogues a fundamental asset. To overcome this limitation, we introduce the Profile Generation Task (PGTask). We contribute a new dataset for this problem, comprising profile sentences aligned with related utterances, extracted from a corpus of dialogues. Furthermore, using state-of-the-art methods, we provide a benchmark for profile generation on this novel dataset. Our experiments reveal the challenges of profile generation, and we hope that this introduces a new research direction.
Submitted 26 August, 2023; v1 submitted 13 April, 2023;
originally announced April 2023.
-
Forecasting Large Realized Covariance Matrices: The Benefits of Factor Models and Shrinkage
Authors:
Rafael Alves,
Diego S. de Brito,
Marcelo C. Medeiros,
Ruy M. Ribeiro
Abstract:
We propose a model to forecast large realized covariance matrices of returns, applying it to the constituents of the S&P 500 daily. To address the curse of dimensionality, we decompose the return covariance matrix using standard firm-level factors (e.g., size, value, and profitability) and use sectoral restrictions in the residual covariance matrix. This restricted model is then estimated using vector heterogeneous autoregressive (VHAR) models with the least absolute shrinkage and selection operator (LASSO). Our methodology improves forecasting precision relative to standard benchmarks and leads to better estimates of minimum variance portfolios.
Submitted 22 March, 2023;
originally announced March 2023.
-
RAMP: Reaction-Aware Motion Planning of Multi-Legged Robots for Locomotion in Microgravity
Authors:
Warley F. R. Ribeiro,
Kentaro Uno,
Masazumi Imai,
Koki Murase,
Kazuya Yoshida
Abstract:
Robotic mobility in microgravity is necessary to expand human utilization and exploration of outer space. Bio-inspired multi-legged robots are a possible solution for safe and precise locomotion. However, a dynamic motion of a robot in microgravity can lead to failures due to gripper detachment caused by excessive motion reactions. We propose a novel Reaction-Aware Motion Planning (RAMP) to improve locomotion safety in microgravity, decreasing the risk of losing contact with the terrain surface by reducing the robot's momentum change. RAMP minimizes the swing momentum with a Low-Reaction Swing Trajectory (LRST) while distributing this momentum to the whole body, ensuring zero velocity for the supporting grippers and minimizing motion reactions. We verify the proposed approach with dynamic simulations indicating the capability of RAMP to generate a safe motion without detachment of the supporting grippers, resulting in the robot reaching its specified location. We further validate RAMP in experiments with an air-floating system, demonstrating a significant reduction in reaction forces and improved mobility in microgravity.
Submitted 19 January, 2023;
originally announced January 2023.
-
Turning the Tables: Biased, Imbalanced, Dynamic Tabular Datasets for ML Evaluation
Authors:
Sérgio Jesus,
José Pombal,
Duarte Alves,
André Cruz,
Pedro Saleiro,
Rita P. Ribeiro,
João Gama,
Pedro Bizarro
Abstract:
Evaluating new techniques on realistic datasets plays a crucial role in the development of ML research and its broader adoption by practitioners. In recent years, there has been a significant increase in publicly available unstructured data resources for computer vision and NLP tasks. However, tabular data -- which is prevalent in many high-stakes domains -- has been lagging behind. To bridge this gap, we present Bank Account Fraud (BAF), the first publicly available privacy-preserving, large-scale, realistic suite of tabular datasets. The suite was generated by applying state-of-the-art tabular data generation techniques on an anonymized, real-world bank account opening fraud detection dataset. This setting carries a set of challenges that are commonplace in real-world applications, including temporal dynamics and significant class imbalance. Additionally, to allow practitioners to stress test both performance and fairness of ML methods, each dataset variant of BAF contains specific types of data bias. With this resource, we aim to provide the research community with a more realistic, complete, and robust test bed to evaluate novel and existing methods.
Submitted 28 November, 2022; v1 submitted 23 November, 2022;
originally announced November 2022.
-
BLOOM: A 176B-Parameter Open-Access Multilingual Language Model
Authors:
BigScience Workshop,
Teven Le Scao,
Angela Fan,
Christopher Akiki,
Ellie Pavlick,
Suzana Ilić,
Daniel Hesslow,
Roman Castagné,
Alexandra Sasha Luccioni,
François Yvon,
Matthias Gallé,
Jonathan Tow,
Alexander M. Rush,
Stella Biderman,
Albert Webson,
Pawan Sasanka Ammanamanchi,
Thomas Wang,
Benoît Sagot,
Niklas Muennighoff,
Albert Villanova del Moral,
Olatunji Ruwase,
Rachel Bawden,
Stas Bekman,
Angelina McMillan-Major
, et al. (369 additional authors not shown)
Abstract:
Large language models (LLMs) have been shown to be able to perform new tasks based on a few demonstrations or natural language instructions. While these capabilities have led to widespread adoption, most LLMs are developed by resource-rich organizations and are frequently kept from the public. As a step towards democratizing this powerful technology, we present BLOOM, a 176B-parameter open-access language model designed and built thanks to a collaboration of hundreds of researchers. BLOOM is a decoder-only Transformer language model that was trained on the ROOTS corpus, a dataset comprising hundreds of sources in 46 natural and 13 programming languages (59 in total). We find that BLOOM achieves competitive performance on a wide variety of benchmarks, with stronger results after undergoing multitask prompted finetuning. To facilitate future research and applications using LLMs, we publicly release our models and code under the Responsible AI License.
Submitted 27 June, 2023; v1 submitted 9 November, 2022;
originally announced November 2022.
-
Incorporating Relevance Feedback for Information-Seeking Retrieval using Few-Shot Document Re-Ranking
Authors:
Tim Baumgärtner,
Leonardo F. R. Ribeiro,
Nils Reimers,
Iryna Gurevych
Abstract:
Pairing a lexical retriever with a neural re-ranking model has set state-of-the-art performance on large-scale information retrieval datasets. This pipeline covers scenarios like question answering or navigational queries; however, for information-seeking scenarios, users often provide information on whether a document is relevant to their query in the form of clicks or explicit feedback. Therefore, in this work, we explore how relevance feedback can be directly integrated into neural re-ranking models by adopting few-shot and parameter-efficient learning techniques. Specifically, we introduce a kNN approach that re-ranks documents based on their similarity with the query and the documents the user considers relevant. Further, we explore Cross-Encoder models that we pre-train using meta-learning and subsequently fine-tune for each query, training only on the feedback documents. To evaluate our different integration strategies, we transform four existing information retrieval datasets into the relevance feedback scenario. Extensive experiments demonstrate that integrating relevance feedback directly in neural re-ranking models improves their performance, and fusing lexical ranking with our best-performing neural re-ranker outperforms all other methods by 5.2 nDCG@20.
Submitted 19 October, 2022;
originally announced October 2022.
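The kNN idea in the abstract of "Incorporating Relevance Feedback for Information-Seeking Retrieval using Few-Shot Document Re-Ranking" (re-ranking by similarity to the query and to the documents the user marked relevant) can be sketched as follows. The scoring rule here, the mean cosine similarity to the query and all feedback vectors, is an assumption chosen for illustration; the abstract does not specify the exact formula, and `rerank_with_feedback` is a hypothetical name.

```python
import math


def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rerank_with_feedback(query_vec, feedback_vecs, candidates):
    """Re-rank candidate documents using the query plus relevance feedback.

    `candidates` maps a document id to its embedding. Each candidate is
    scored by its mean cosine similarity to the query vector and to every
    feedback vector (an assumed scoring rule, not the paper's).
    Returns (doc_id, score) pairs sorted from best to worst.
    """
    anchors = [query_vec] + list(feedback_vecs)
    scored = []
    for doc_id, vec in candidates.items():
        score = sum(cosine(vec, anchor) for anchor in anchors) / len(anchors)
        scored.append((doc_id, score))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```

With an empty feedback list the function falls back to ranking by query similarity alone, which makes the effect of adding feedback documents easy to compare.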
-
SUMBot: Summarizing Context in Open-Domain Dialogue Systems
Authors:
Rui Ribeiro,
Luísa Coheur
Abstract:
In this paper, we investigate the problem of including relevant information as context in open-domain dialogue systems. Most models struggle to identify and incorporate important knowledge from dialogues and simply use the entire turns as context, which increases the size of the input fed to the model with unnecessary information. Additionally, due to the input size limitation of a few hundred tokens of large pre-trained models, regions of the history are not included and informative parts from the dialogue may be omitted. To overcome this problem, we introduce a simple method that substitutes part of the context with a summary instead of the whole history, which increases the ability of models to keep track of all the previous relevant information. We show that the inclusion of a summary may improve the answer generation task and discuss some examples to further understand the system's weaknesses.
Submitted 12 October, 2022;
originally announced October 2022.
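The context substitution described in the SUMBot abstract (replacing part of the dialogue history with a summary) might look like the sketch below. The `summarize` callable stands in for whatever summarization model is used; `build_context`, `keep_last`, and the separator are hypothetical names introduced only for this illustration.

```python
def build_context(turns, summarize, keep_last=2, sep=" "):
    """Build a model input from a summary of older turns plus recent turns.

    `turns` is the dialogue history in chronological order. The most recent
    `keep_last` turns are kept verbatim; everything before them is replaced
    by the output of `summarize`, a caller-supplied function that maps a
    list of turns to a single string.
    """
    if keep_last < 0:
        raise ValueError("keep_last must be non-negative")
    split = max(len(turns) - keep_last, 0)
    older, recent = turns[:split], turns[split:]

    parts = []
    if older:
        # Only call the summarizer when there is history to compress.
        parts.append(summarize(older))
    parts.extend(recent)
    return sep.join(parts)
```

For short dialogues with no more than `keep_last` turns, the history is passed through unchanged and the summarizer is never called.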
-
Face Super-Resolution Using Stochastic Differential Equations
Authors:
Marcelo dos Santos,
Rayson Laroca,
Rafael O. Ribeiro,
João Neves,
Hugo Proença,
David Menotti
Abstract:
Diffusion models have proven effective for various applications such as image, audio, and graph generation. Other important applications are image super-resolution and the solution of inverse problems. More recently, some works have used stochastic differential equations (SDEs) to generalize diffusion models to continuous time. In this work, we introduce SDEs to generate super-resolution face images. To the best of our knowledge, this is the first time SDEs have been used for such an application. The proposed method provides better peak signal-to-noise ratio (PSNR), structural similarity index measure (SSIM), and consistency than existing super-resolution methods based on diffusion models. In particular, we also assess the potential application of this method for the face recognition task. A generic facial feature extractor is used to compare the super-resolution images with the ground truth, and superior results were obtained compared with other methods. Our code is publicly available at https://github.com/marcelowds/sr-sde
Submitted 24 September, 2022;
originally announced September 2022.
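For readers unfamiliar with the PSNR metric reported in "Face Super-Resolution Using Stochastic Differential Equations", the standard definition is 10 * log10(MAX^2 / MSE). The sketch below computes it for two images given as flat lists of pixel values; the function name and the default 8-bit peak value of 255 are assumptions for illustration and are not taken from the paper's code.

```python
import math


def psnr(reference, estimate, max_value=255.0):
    """Peak signal-to-noise ratio in decibels between two images.

    Both images are flat sequences of pixel intensities of equal length.
    Returns infinity when the images are identical (zero error).
    """
    if len(reference) != len(estimate) or not reference:
        raise ValueError("images must be non-empty and of equal size")
    # Mean squared error over all pixels.
    mse = sum((r - e) ** 2 for r, e in zip(reference, estimate)) / len(reference)
    if mse == 0.0:
        return math.inf
    return 10.0 * math.log10(max_value ** 2 / mse)
```

Higher values indicate that the super-resolved image is closer to the ground truth; SSIM, the other metric named in the abstract, measures structural similarity and is not covered by this sketch.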
-
UKP-SQuARE v2: Explainability and Adversarial Attacks for Trustworthy QA
Authors:
Rachneet Sachdeva,
Haritz Puerto,
Tim Baumgärtner,
Sewin Tariverdian,
Hao Zhang,
Kexin Wang,
Hossain Shaikh Saadi,
Leonardo F. R. Ribeiro,
Iryna Gurevych
Abstract:
Question Answering (QA) systems are increasingly deployed in applications where they support real-world decisions. However, state-of-the-art models rely on deep neural networks, which are difficult for humans to interpret. Inherently interpretable models or post hoc explainability methods can help users comprehend how a model arrives at its prediction and, if successful, increase their trust in the system. Furthermore, researchers can leverage these insights to develop new methods that are more accurate and less biased. In this paper, we introduce SQuARE v2, the new version of SQuARE, to provide an explainability infrastructure for comparing models based on methods such as saliency maps and graph-based explanations. While saliency maps are useful to inspect the importance of each input token for the model's prediction, graph-based explanations from external Knowledge Graphs enable the users to verify the reasoning behind the model prediction. In addition, we provide multiple adversarial attacks to compare the robustness of QA models. With these explainability methods and adversarial attacks, we aim to ease the research on trustworthy QA models. SQuARE is available on https://square.ukp-lab.de.
Submitted 20 October, 2022; v1 submitted 19 August, 2022;
originally announced August 2022.
-
A Benchmark dataset for predictive maintenance
Authors:
Bruno Veloso,
João Gama,
Rita P. Ribeiro,
Pedro M. Pereira
Abstract:
The paper describes the MetroPT data set, an outcome of an eXplainable Predictive Maintenance (XPM) project with an urban metro public transportation service in Porto, Portugal. The data was collected in 2022 with the aim of evaluating machine learning methods for online anomaly detection and failure prediction. By capturing several analog sensor signals (pressure, temperature, current consumption), digital signals (control signals, discrete signals), and GPS information (latitude, longitude, and speed), we provide a dataset that can be easily used to evaluate online machine learning methods. This dataset contains some interesting characteristics and can be a good benchmark for predictive maintenance models.
Submitted 18 July, 2022; v1 submitted 12 July, 2022;
originally announced July 2022.