
Showing 1–50 of 136 results for author: Oliveira, D

Searching in archive cs.
  1. arXiv:2507.07340  [pdf, ps, other]

    cs.CV

    Entity Re-identification in Visual Storytelling via Contrastive Reinforcement Learning

    Authors: Daniel A. P. Oliveira, David Martins de Matos

    Abstract: Visual storytelling systems, particularly large vision-language models, struggle to maintain character and object identity across frames, often failing to recognize when entities in different images represent the same individuals or objects, leading to inconsistent references and referential hallucinations. This occurs because models lack explicit training on when to establish entity connections a…

    Submitted 10 July, 2025; v1 submitted 9 July, 2025; originally announced July 2025.

    Comments: 7 pages

    ACM Class: I.2; I.4; I.5; I.7

  2. arXiv:2507.01048  [pdf, ps, other]

    cs.LG

    3W Dataset 2.0.0: a realistic and public dataset with rare undesirable real events in oil wells

    Authors: Ricardo Emanuel Vaz Vargas, Afrânio José de Melo Junior, Celso José Munaro, Cláudio Benevenuto de Campos Lima, Eduardo Toledo de Lima Junior, Felipe Muntzberg Barrocas, Flávio Miguel Varejão, Guilherme Fidelis Peixer, Igor de Melo Nery Oliveira, Jader Riso Barbosa Jr., Jaime Andrés Lozano Cadena, Jean Carlos Dias de Araújo, João Neuenschwander Escosteguy Carneiro, Lucas Gouveia Omena Lopes, Lucas Pereira de Gouveia, Mateus de Araujo Fernandes, Matheus Lima Scramignon, Patrick Marques Ciarelli, Rodrigo Castello Branco, Rogério Leite Alves Pinto

    Abstract: In the oil industry, undesirable events in oil wells can cause economic losses, environmental accidents, and human casualties. Solutions based on Artificial Intelligence and Machine Learning for Early Detection of such events have proven valuable for diverse applications across industries. In 2019, recognizing the importance and the lack of public datasets related to undesirable events in oil well…

    Submitted 25 June, 2025; originally announced July 2025.

    Comments: 21 pages, 10 figures, and 7 tables

  3. arXiv:2505.11391  [pdf, ps, other]

    eess.AS cs.SD

    LipDiffuser: Lip-to-Speech Generation with Conditional Diffusion Models

    Authors: Danilo de Oliveira, Julius Richter, Tal Peer, Timo Gerkmann

    Abstract: We present LipDiffuser, a conditional diffusion model for lip-to-speech generation synthesizing natural and intelligible speech directly from silent video recordings. Our approach leverages the magnitude-preserving ablated diffusion model (MP-ADM) architecture as a denoiser model. To effectively condition the model, we incorporate visual features using magnitude-preserving feature-wise linear modu…

    Submitted 26 May, 2025; v1 submitted 16 May, 2025; originally announced May 2025.

  4. arXiv:2505.10292  [pdf, ps, other]

    cs.CV cs.CL

    StoryReasoning Dataset: Using Chain-of-Thought for Scene Understanding and Grounded Story Generation

    Authors: Daniel A. P. Oliveira, David Martins de Matos

    Abstract: Visual storytelling systems struggle to maintain character identity across frames and link actions to appropriate subjects, frequently leading to referential hallucinations. These issues can be addressed through grounding of characters, objects, and other entities on the visual elements. We propose StoryReasoning, a dataset containing 4,178 stories derived from 52,016 movie images, with both struc…

    Submitted 15 May, 2025; originally announced May 2025.

    Comments: 31 pages, 14 figures

    ACM Class: I.2.10; I.2.7

  5. arXiv:2505.05216  [pdf, other]

    eess.AS cs.SD

    Normalize Everything: A Preconditioned Magnitude-Preserving Architecture for Diffusion-Based Speech Enhancement

    Authors: Julius Richter, Danilo de Oliveira, Timo Gerkmann

    Abstract: This paper presents a new framework for diffusion-based speech enhancement. Our method employs a Schroedinger bridge to transform the noisy speech distribution into the clean speech distribution. To stabilize and improve training, we employ time-dependent scalings of the inputs and outputs of the network, known as preconditioning. We consider two skip connection configurations, which either includ…

    Submitted 8 May, 2025; originally announced May 2025.

    Comments: Submitted to WASPAA 2025

  6. arXiv:2504.10405  [pdf, ps, other]

    cs.CL cs.AI cs.ET cs.HC

    Performance of Large Language Models in Supporting Medical Diagnosis and Treatment

    Authors: Diogo Sousa, Guilherme Barbosa, Catarina Rocha, Dulce Oliveira

    Abstract: The integration of Large Language Models (LLMs) into healthcare holds significant potential to enhance diagnostic accuracy and support medical treatment planning. These AI-driven systems can analyze vast datasets, assisting clinicians in identifying diseases, recommending treatments, and predicting patient outcomes. This study evaluates the performance of a range of contemporary LLMs, including bo…

    Submitted 14 April, 2025; originally announced April 2025.

    Comments: 21 pages, 6 figures, 4 tables. Acknowledgements: The authors acknowledge the support of the AITriage4SU Project (2024.07400.IACDC/2024), funded by the FCT (Foundation for Science and Technology), Portugal

    ACM Class: I.2.7; J.3

  7. arXiv:2503.23389  [pdf]

    cs.RO cond-mat.soft

    Proprioceptive multistable mechanical metamaterial via soft capacitive sensors

    Authors: Hugo de Souza Oliveira, Niloofar Saeedzadeh Khaanghah, Martijn Oetelmans, Niko Münzenrieder, Edoardo Milana

    Abstract: The technological transition from soft machines to soft robots necessarily passes through the integration of soft electronics and sensors. This allows for the establishment of feedback control systems while preserving the softness of the robot embodiment. Multistable mechanical metamaterials are excellent building blocks of soft machines, as their nonlinear response can be tuned by design to accom…

    Submitted 30 March, 2025; originally announced March 2025.

    Comments: 2024 IEEE International Flexible Electronics Technology Conference (IFETC)

  8. arXiv:2503.23375  [pdf, other]

    cs.RO cond-mat.soft

    Meta-Ori: monolithic meta-origami for nonlinear inflatable soft actuators

    Authors: Hugo de Souza Oliveira, Xin Li, Johannes Frey, Edoardo Milana

    Abstract: The nonlinear mechanical response of soft materials and slender structures is purposefully harnessed to program functions by design in soft robotic actuators, such as sequencing, amplified response, fast energy release, etc. However, typical designs of nonlinear actuators - e.g. balloons, inverted membranes, springs - have limited design parameters space and complex fabrication processes, hinderin…

    Submitted 30 March, 2025; originally announced March 2025.

    Comments: 8th IEEE-RAS International Conference on Soft Robotics

  9. arXiv:2503.10520  [pdf, other]

    cs.CV cs.AI cs.LG

    CountPath: Automating Fragment Counting in Digital Pathology

    Authors: Ana Beatriz Vieira, Maria Valente, Diana Montezuma, Tomé Albuquerque, Liliana Ribeiro, Domingos Oliveira, João Monteiro, Sofia Gonçalves, Isabel M. Pinto, Jaime S. Cardoso, Arlindo L. Oliveira

    Abstract: Quality control of medical images is a critical component of digital pathology, ensuring that diagnostic images meet required standards. A pre-analytical task within this process is the verification of the number of specimen fragments, a process that ensures that the number of fragments on a slide matches the number documented in the macroscopic report. This step is important to ensure that the sl…

    Submitted 13 March, 2025; originally announced March 2025.

    Comments: 10 pages, 3 figures

    ACM Class: I.2; I.4

  10. arXiv:2503.08321  [pdf, other]

    cs.CV

    i-WiViG: Interpretable Window Vision GNN

    Authors: Ivica Obadic, Dmitry Kangin, Dario Oliveira, Plamen P Angelov, Xiao Xiang Zhu

    Abstract: Deep learning models based on graph neural networks have emerged as a popular approach for solving computer vision problems. They encode the image into a graph structure and can be beneficial for efficiently capturing the long-range dependencies typically present in remote sensing imagery. However, an important drawback of these methods is their black-box nature which may hamper their wider usage…

    Submitted 11 March, 2025; originally announced March 2025.

  11. arXiv:2503.03599  [pdf, other]

    cs.CV cs.RO

    REGRACE: A Robust and Efficient Graph-based Re-localization Algorithm using Consistency Evaluation

    Authors: Débora N. P. Oliveira, Joshua Knights, Sebastián Barbas Laina, Simon Boche, Wolfram Burgard, Stefan Leutenegger

    Abstract: Loop closures are essential for correcting odometry drift and creating consistent maps, especially in the context of large-scale navigation. Current methods using dense point clouds for accurate place recognition do not scale well due to computationally expensive scan-to-scan comparisons. Alternative object-centric approaches are more efficient but often struggle with sensitivity to viewpoint vari…

    Submitted 5 March, 2025; originally announced March 2025.

    Comments: Submitted to IROS2025

  12. arXiv:2502.13898  [pdf, ps, other]

    cs.CV cs.CL

    GroundCap: A Visually Grounded Image Captioning Dataset

    Authors: Daniel A. P. Oliveira, Lourenço Teodoro, David Martins de Matos

    Abstract: Current image captioning systems lack the ability to link descriptive text to specific visual elements, making their outputs difficult to verify. While recent approaches offer some grounding capabilities, they cannot track object identities across multiple references or ground both actions and objects simultaneously. We propose a novel ID-based grounding system that enables consistent object refer…

    Submitted 25 June, 2025; v1 submitted 19 February, 2025; originally announced February 2025.

    Comments: 37 pages

    ACM Class: I.2.10; I.2.7

  13. arXiv:2502.12350  [pdf, other]

    cs.CE

    Mamute: high-performance computing for geophysical methods

    Authors: João B. Fernandes, Antônio D. S. Oliveira, Mateus C. A. T. Silva, Felipe H. Santos-da-Silva, Vitor H. M. Rodrigues, Kleiton A. Schneider, Calebe P. Bianchini, João M. de Araujo, Tiago Barros, Ítalo A. S. Assis, Samuel Xavier-de-Souza

    Abstract: Due to their high computational cost, geophysical applications are typically designed to run in large computing systems. Because of that, such applications must implement several high-performance techniques to use the computational resources better. In this paper, we present Mamute, a software that delivers wave equation-based geophysical methods. Mamute implements two geophysical methods: seismic…

    Submitted 17 February, 2025; originally announced February 2025.

    Comments: 24 pages, 6 figures, Journal

  14. arXiv:2501.08401  [pdf, ps, other]

    cs.DL cs.SI

    Navigating Gender Disparities in Communication Research Leadership: Academic Recognition, Career Development, and Compensation

    Authors: Diego F. M. Oliveira, Qian Huang

    Abstract: This study examines gender disparities in communication research through citation metrics, authorship patterns, team composition, and faculty salaries. Using data from 62,359 papers across 121 communication journals, we find that while female authors are increasingly represented, citation gaps persist, with sole-authored papers by women receiving fewer citations than those by men, especially in sm…

    Submitted 15 January, 2025; v1 submitted 14 January, 2025; originally announced January 2025.

  15. arXiv:2501.00556  [pdf, other]

    physics.flu-dyn cs.LG

    Finding the Underlying Viscoelastic Constitutive Equation via Universal Differential Equations and Differentiable Physics

    Authors: Elias C. Rodrigues, Roney L. Thompson, Dário A. B. Oliveira, Roberto F. Ausas

    Abstract: This research employs Universal Differential Equations (UDEs) alongside differentiable physics to model viscoelastic fluids, merging conventional differential equations, neural networks and numerical methods to reconstruct missing terms in constitutive models. This study focuses on analyzing four viscoelastic models: Upper Convected Maxwell (UCM), Johnson-Segalman, Giesekus, and Exponential Phan-T…

    Submitted 23 May, 2025; v1 submitted 31 December, 2024; originally announced January 2025.

  16. arXiv:2501.00049  [pdf, other]

    cs.CL cs.ET

    Seq2Seq Model-Based Chatbot with LSTM and Attention Mechanism for Enhanced User Interaction

    Authors: Lamya Benaddi, Charaf Ouaddi, Adnane Souha, Abdeslam Jakimi, Mohamed Rahouti, Mohammed Aledhari, Diogo Oliveira, Brahim Ouchao

    Abstract: A chatbot is an intelligent software application that automates conversations and engages users in natural language through messaging platforms. Leveraging artificial intelligence (AI), chatbots serve various functions, including customer service, information gathering, and casual conversation. Existing virtual assistant chatbots, such as ChatGPT and Gemini, demonstrate the potential of AI in Natu…

    Submitted 27 December, 2024; originally announced January 2025.

    Comments: The Third Workshop on Deployable AI at AAAI-2025

  17. arXiv:2411.09524  [pdf, other]

    cs.RO

    FlowNav: Combining Flow Matching and Depth Priors for Efficient Navigation

    Authors: Samiran Gode, Abhijeet Nayak, Débora N. P. Oliveira, Michael Krawez, Cordelia Schmid, Wolfram Burgard

    Abstract: Effective robot navigation in unseen environments is a challenging task that requires precise control actions at high frequencies. Recent advances have framed it as an image-goal-conditioned control problem, where the robot generates navigation actions using frontal RGB images. Current state-of-the-art methods in this area use diffusion policies to generate these control actions. Despite their pro…

    Submitted 3 March, 2025; v1 submitted 14 November, 2024; originally announced November 2024.

    Comments: Submitted to IROS'25. Previous version accepted at CoRL 2024 workshop on Learning Effective Abstractions for Planning (LEAP) and workshop on Differentiable Optimization Everywhere: Simulation, Estimation, Learning, and Control

  18. Understanding Code Understandability Improvements in Code Reviews

    Authors: Delano Oliveira, Reydne Santos, Benedito de Oliveira, Martin Monperrus, Fernando Castor, Fernanda Madeiral

    Abstract: Motivation: Code understandability is crucial in software development, as developers spend 58% to 70% of their time reading source code. Improving it can improve productivity and reduce maintenance costs. Problem: Experimental studies often identify factors influencing code understandability in controlled settings but overlook real-world influences like project culture, guidelines, and developers'…

    Submitted 12 November, 2024; v1 submitted 29 October, 2024; originally announced October 2024.

    Journal ref: IEEE Transactions on Software Engineering, 2024

  19. arXiv:2410.17834  [pdf, ps, other]

    eess.AS cs.LG cs.SD

    Non-intrusive Speech Quality Assessment with Diffusion Models Trained on Clean Speech

    Authors: Danilo de Oliveira, Julius Richter, Jean-Marie Lemercier, Simon Welker, Timo Gerkmann

    Abstract: Diffusion models have found great success in generating high quality, natural samples of speech, but their potential for density estimation for speech has so far remained largely unexplored. In this work, we leverage an unconditional diffusion model trained only on clean speech for the assessment of speech quality. We show that the quality of a speech utterance can be assessed by estimating the li…

    Submitted 13 June, 2025; v1 submitted 23 October, 2024; originally announced October 2024.

    Comments: Accepted at Interspeech 2025

  20. Workflows Community Summit 2024: Future Trends and Challenges in Scientific Workflows

    Authors: Rafael Ferreira da Silva, Deborah Bard, Kyle Chard, Shaun de Witt, Ian T. Foster, Tom Gibbs, Carole Goble, William Godoy, Johan Gustafsson, Utz-Uwe Haus, Stephen Hudson, Shantenu Jha, Laila Los, Drew Paine, Frédéric Suter, Logan Ward, Sean Wilkinson, Marcos Amaris, Yadu Babuji, Jonathan Bader, Riccardo Balin, Daniel Balouek, Sarah Beecroft, Khalid Belhajjame, Rajat Bhattarai, et al. (86 additional authors not shown)

    Abstract: The Workflows Community Summit gathered 111 participants from 18 countries to discuss emerging trends and challenges in scientific workflows, focusing on six key areas: time-sensitive workflows, AI-HPC convergence, multi-facility workflows, heterogeneous HPC environments, user experience, and FAIR computational workflows. The integration of AI and exascale computing has revolutionized scientific w…

    Submitted 18 October, 2024; originally announced October 2024.

    Report number: ORNL/TM-2024/3573

  21. arXiv:2410.04318  [pdf, other]

    cs.CY cs.HC

    Urban Computing for Climate and Environmental Justice: Early Perspectives From Two Research Initiatives

    Authors: Carolina Veiga, Ashish Sharma, Daniel de Oliveira, Marcos Lage, Fabio Miranda

    Abstract: The impacts of climate change are intensifying existing vulnerabilities and disparities within urban communities around the globe, as extreme weather events, including floods and heatwaves, are becoming more frequent and severe, disproportionately affecting low-income and underrepresented groups. Tackling these increasing challenges requires novel approaches that integrate expertise across multipl…

    Submitted 5 October, 2024; originally announced October 2024.

    Comments: Accepted at the Viz4Climate + Sustainability: IEEE VIS 2024 Workshop on Visualization for Climate Action and Sustainability (https://svs.gsfc.nasa.gov/events/2024/Viz4ClimateAndSustainability/)

  22. arXiv:2409.10753  [pdf, other]

    eess.AS cs.SD

    Investigating Training Objectives for Generative Speech Enhancement

    Authors: Julius Richter, Danilo de Oliveira, Timo Gerkmann

    Abstract: Generative speech enhancement has recently shown promising advancements in improving speech quality in noisy environments. Multiple diffusion-based frameworks exist, each employing distinct training objectives and learning techniques. This paper aims to explain the differences between these frameworks by focusing our investigation on score-based generative models and the Schrödinger bridge. We con…

    Submitted 18 January, 2025; v1 submitted 16 September, 2024; originally announced September 2024.

    Comments: Accepted at ICASSP 2025

  23. Computer Vision Model Compression Techniques for Embedded Systems: A Survey

    Authors: Alexandre Lopes, Fernando Pereira dos Santos, Diulhio de Oliveira, Mauricio Schiezaro, Helio Pedrini

    Abstract: Deep neural networks have consistently represented the state of the art in most computer vision problems. In these scenarios, larger and more complex models have demonstrated superior performance to smaller architectures, especially when trained with plenty of representative data. With the recent adoption of Vision Transformer (ViT) based architectures and advanced Convolutional Neural Networks (C…

    Submitted 15 August, 2024; originally announced August 2024.

    Journal ref: Computers & Graphics, Volume 123, October 2024, 104015

  24. MyoGestic: EMG Interfacing Framework for Decoding Multiple Spared Degrees of Freedom of the Hand in Individuals with Neural Lesions

    Authors: Raul C. Sîmpetru, Dominik I. Braun, Arndt U. Simon, Michael März, Vlad Cnejevici, Daniela Souza de Oliveira, Nico Weber, Jonas Walter, Jörg Franke, Daniel Höglinger, Cosima Prahm, Matthias Ponfick, Alessandro Del Vecchio

    Abstract: Restoring limb motor function in individuals with spinal cord injury (SCI), stroke, or amputation remains a critical challenge, one which affects millions worldwide. Recent studies show through surface electromyography (EMG) that spared motor neurons can still be voluntarily controlled, even without visible limb movement. These signals can be decoded and used for motor intent estimation; however,…

    Submitted 14 August, 2024; originally announced August 2024.

    Comments: 23 pages, 8 figures

    ACM Class: H.5.2; J.3; I.5.4; D.2.13

    Journal ref: Science Advances, 11, 2025, eads9150

  25. Curio: A Dataflow-Based Framework for Collaborative Urban Visual Analytics

    Authors: Gustavo Moreira, Maryam Hosseini, Carolina Veiga, Lucas Alexandre, Nicola Colaninno, Daniel de Oliveira, Nivan Ferreira, Marcos Lage, Fabio Miranda

    Abstract: Over the past decade, several urban visual analytics systems and tools have been proposed to tackle a host of challenges faced by cities, in areas as diverse as transportation, weather, and real estate. Many of these tools have been designed through collaborations with urban experts, aiming to distill intricate urban analysis workflows into interactive visualizations and interfaces. However, the d…

    Submitted 12 August, 2024; originally announced August 2024.

    Comments: Accepted at IEEE VIS 2024. Source code available at https://urbantk.org/curio

  26. arXiv:2407.18673  [pdf, other]

    cs.CV cs.LG

    A Survey on Cell Nuclei Instance Segmentation and Classification: Leveraging Context and Attention

    Authors: João D. Nunes, Diana Montezuma, Domingos Oliveira, Tania Pereira, Jaime S. Cardoso

    Abstract: Manually annotating nuclei from the gigapixel Hematoxylin and Eosin (H&E)-stained Whole Slide Images (WSIs) is a laborious and costly task, meaning automated algorithms for cell nuclei instance segmentation and classification could alleviate the workload of pathologists and clinical researchers and at the same time facilitate the automatic extraction of clinically interpretable features. But due t…

    Submitted 26 July, 2024; originally announced July 2024.

  27. arXiv:2407.11786  [pdf, other]

    cs.LG

    Cryptocurrency Price Forecasting Using XGBoost Regressor and Technical Indicators

    Authors: Abdelatif Hafid, Maad Ebrahim, Ali Alfatemi, Mohamed Rahouti, Diogo Oliveira

    Abstract: The rapid growth of the stock market has attracted many investors due to its potential for significant profits. However, predicting stock prices accurately is difficult because financial markets are complex and constantly changing. This is especially true for the cryptocurrency market, which is known for its extreme volatility, making it challenging for traders and investors to make wise and profi…

    Submitted 16 July, 2024; originally announced July 2024.

    Comments: 9 pages, 3 figures, 4 tables, submitted to the 43rd IEEE International Performance Computing and Communications Conference (IPCCC 2024)

  28. arXiv:2407.00941  [pdf, ps, other]

    cs.PL

    Full Iso-recursive Types

    Authors: Litao Zhou, Qianyong Wan, Bruno C. d. S. Oliveira

    Abstract: There are two well-known formulations of recursive types: iso-recursive and equi-recursive types. Abadi and Fiore [1996] have shown that iso- and equi-recursive types have the same expressive power. However, their encoding of equi-recursive types in terms of iso-recursive types requires explicit coercions. These coercions come with significant additional computational overhead, and complicate reas…

    Submitted 7 July, 2024; v1 submitted 30 June, 2024; originally announced July 2024.

    Comments: This work has been conditionally accepted to OOPSLA 2024

  29. arXiv:2406.03460  [pdf, other]

    eess.AS cs.LG cs.SD

    The PESQetarian: On the Relevance of Goodhart's Law for Speech Enhancement

    Authors: Danilo de Oliveira, Simon Welker, Julius Richter, Timo Gerkmann

    Abstract: To obtain improved speech enhancement models, researchers often focus on increasing performance according to specific instrumental metrics. However, when the same metric is used in a loss function to optimize models, it may be detrimental to aspects that the given metric does not see. The goal of this paper is to illustrate the risk of overfitting a speech enhancement model to the metric used for…

    Submitted 5 June, 2024; originally announced June 2024.

    Comments: Accepted at Interspeech 2024

  30. arXiv:2406.02748  [pdf, other]

    cs.CV cs.AI

    Story Generation from Visual Inputs: Techniques, Related Tasks, and Challenges

    Authors: Daniel A. P. Oliveira, Eugénio Ribeiro, David Martins de Matos

    Abstract: Creating engaging narratives from visual data is crucial for automated digital media consumption, assistive technologies, and interactive entertainment. This survey covers methodologies used in the generation of these narratives, focusing on their principles, strengths, and limitations. The survey also covers tasks related to automatic story generation, such as image and video captioning, and vi…

    Submitted 4 June, 2024; originally announced June 2024.

    ACM Class: I.2.7; I.2.10

  31. arXiv:2404.09768  [pdf, other]

    cs.CV

    Contrastive Pretraining for Visual Concept Explanations of Socioeconomic Outcomes

    Authors: Ivica Obadic, Alex Levering, Lars Pennig, Dario Oliveira, Diego Marcos, Xiaoxiang Zhu

    Abstract: Predicting socioeconomic indicators from satellite imagery with deep learning has become an increasingly popular research direction. Post-hoc concept-based explanations can be an important step towards broader adoption of these models in policy-making as they enable the interpretation of socioeconomic outcomes based on visual concepts that are intuitive to humans. In this paper, we study the inter…

    Submitted 13 June, 2024; v1 submitted 15 April, 2024; originally announced April 2024.

  32. arXiv:2403.15324  [pdf, other]

    cs.DC cs.DB

    ProvDeploy: Provenance-oriented Containerization of High Performance Computing Scientific Workflows

    Authors: Liliane Kunstmann, Débora Pina, Daniel de Oliveira, Marta Mattoso

    Abstract: Many existing scientific workflows require High Performance Computing environments to produce results in a timely manner. These workflows have several software library components and use different environments, making the deployment and execution of the software stack not trivial. This complexity increases if the user needs to add provenance data capture services to the workflow. This manuscript i…

    Submitted 25 March, 2024; v1 submitted 22 March, 2024; originally announced March 2024.

  33. arXiv:2403.08795  [pdf]

    cs.HC cs.CL

    Ontologia para monitorar a deficiência mental em seus déficts no processamento da informação por declínio cognitivo e evitar agressões psicológicas e físicas em ambientes educacionais com ajuda da I.A*

    Authors: Bruna Araújo de Castro Oliveira

    Abstract: The intention of this article is to propose the use of artificial intelligence to detect through analysis by UFO ontology the emergence of verbal and physical aggression related to psychosocial deficiencies and their provoking agents, in an attempt to prevent catastrophic consequences within school environments.

    Submitted 31 January, 2024; originally announced March 2024.

    Comments: In Portuguese. "Minha vez de falar sobre a realidade" (My turn to talk about reality)

  34. Opening the Black-Box: A Systematic Review on Explainable AI in Remote Sensing

    Authors: Adrian Höhl, Ivica Obadic, Miguel Ángel Fernández Torres, Hiba Najjar, Dario Oliveira, Zeynep Akata, Andreas Dengel, Xiao Xiang Zhu

    Abstract: In recent years, black-box machine learning approaches have become a dominant modeling paradigm for knowledge extraction in remote sensing. Despite the potential benefits of uncovering the inner workings of these models with explainable AI, a comprehensive overview summarizing the explainable AI methods used and their objectives, findings, and challenges in remote sensing applications is still mis…

    Submitted 6 November, 2024; v1 submitted 21 February, 2024; originally announced February 2024.

  35. arXiv:2402.13244  [pdf, other]

    cs.SI

    Are Fact-Checking Tools Helpful? An Exploration of the Usability of Google Fact Check

    Authors: Qiangeng Yang, Tess Christensen, Shlok Gilda, Juliana Fernandes, Daniela Oliveira, Ronald Wilson, Damon Woodard

    Abstract: Fact-checking-specific search tools such as Google Fact Check are a promising way to combat misinformation on social media, especially during events bringing significant social influence, such as the COVID-19 pandemic and the U.S. presidential elections. However, the usability of such an approach has not been thoroughly studied. We evaluated the performance of Google Fact Check by analyzing the re…

    Submitted 24 May, 2025; v1 submitted 20 February, 2024; originally announced February 2024.

    Comments: Accepted and presented at the 5th EAI International Conference on Data and Information in Online Environments (EAI DIONE 2024)

  36. arXiv:2401.03005  [pdf, other]

    physics.soc-ph cs.CV

    Evolution of urban areas and land surface temperature

    Authors: Sudipan Saha, Tushar Verma, Dario Augusto Borges Oliveira

    Abstract: With the global population on the rise, our cities have been expanding to accommodate the growing number of people. The expansion of cities generally leads to the engulfment of peripheral areas. However, such expansion of urban areas is likely to cause increment in areas with increased land surface temperature (LST). By considering each summer as a data point, we form LST multi-year time-series an…

    Submitted 5 January, 2024; originally announced January 2024.

  37. arXiv:2310.18660  [pdf, other]

    cs.CV cs.LG

    Foundation Models for Generalist Geospatial Artificial Intelligence

    Authors: Johannes Jakubik, Sujit Roy, C. E. Phillips, Paolo Fraccaro, Denys Godwin, Bianca Zadrozny, Daniela Szwarcman, Carlos Gomes, Gabby Nyirjesy, Blair Edwards, Daiki Kimura, Naomi Simumba, Linsong Chu, S. Karthik Mukkavilli, Devyani Lambhate, Kamal Das, Ranjini Bangalore, Dario Oliveira, Michal Muszynski, Kumar Ankur, Muthukumaran Ramasubramanian, Iksha Gurung, Sam Khallaghi, Hanxi, Li, et al. (8 additional authors not shown)

    Abstract: Significant progress in the development of highly adaptable and reusable Artificial Intelligence (AI) models is expected to have a significant impact on Earth science and remote sensing. Foundation models are pre-trained on large unlabeled datasets through self-supervision, and then fine-tuned for various downstream tasks with small labeled datasets. This paper introduces a first-of-a-kind framewo…

    Submitted 8 November, 2023; v1 submitted 28 October, 2023; originally announced October 2023.

  38. MCU-Wide Timing Side Channels and Their Detection

    Authors: Johannes Müller, Anna Lena Duque Antón, Lucas Deutschmann, Dino Mehmedagić, Cristiano Rodrigues, Daniel Oliveira, Keerthikumara Devarajegowda, Mohammad Rahmani Fadiheh, Sandro Pinto, Dominik Stoffel, Wolfgang Kunz

    Abstract: Microarchitectural timing side channels have been thoroughly investigated as a security threat in hardware designs featuring shared buffers (e.g., caches) or parallelism between attacker and victim task execution. However, contradicting common intuitions, recent activities demonstrate that this threat is real even in microcontroller SoCs without such features. In this paper, we describe SoC-wide t…

    Submitted 18 July, 2024; v1 submitted 22 September, 2023; originally announced September 2023.

    Comments: This version extends the work of the previous version and was accepted and presented at DAC'24

  39. arXiv:2309.09920  [pdf, other]

    eess.AS cs.LG cs.SD eess.SP

    Distilling HuBERT with LSTMs via Decoupled Knowledge Distillation

    Authors: Danilo de Oliveira, Timo Gerkmann

    Abstract: Much research effort is being applied to the task of compressing the knowledge of self-supervised models, which are powerful, yet large and memory consuming. In this work, we show that the original method of knowledge distillation (and its more recently proposed extension, decoupled knowledge distillation) can be applied to the task of distilling HuBERT. In contrast to methods that focus on distil…

    Submitted 18 September, 2023; originally announced September 2023.

    Comments: Submitted to ICASSP 2024

  40. ProWis: A Visual Approach for Building, Managing, and Analyzing Weather Simulation Ensembles at Runtime

    Authors: Carolina Veiga Ferreira de Souza, Suzanna Maria Bonnet, Daniel de Oliveira, Marcio Cataldi, Fabio Miranda, Marcos Lage

    Abstract: Weather forecasting is essential for decision-making and is usually performed using numerical modeling. Numerical weather models, in turn, are complex tools that require specialized training and laborious setup and are challenging even for weather experts. Moreover, weather simulations are data-intensive computations and may take hours to days to complete. When the simulation is finished, the expe…

    Submitted 9 August, 2023; originally announced August 2023.

    Comments: Accepted at IEEE VIS 2023

    Journal ref: Published in: IEEE Transactions on Visualization and Computer Graphics (Volume: 30, Issue: 1, January 2024)

  41. arXiv:2308.00997  [pdf, other]

    cs.DC cs.PF eess.SY

    IRQ Coloring and the Subtle Art of Mitigating Interrupt-generated Interference

    Authors: Diogo Costa, Luca Cuomo, Daniel Oliveira, Ida Maria Savino, Bruno Morelli, José Martins, Alessandro Biasci, Sandro Pinto

    Abstract: Integrating workloads with differing criticality levels presents a formidable challenge in achieving the stringent spatial and temporal isolation requirements imposed by safety-critical standards such as ISO 26262. The shift towards high-performance multicore platforms has been posing increasing issues to the so-called mixed-criticality systems (MCS) due to the reciprocal interference created by co…

    Submitted 2 August, 2023; originally announced August 2023.

    Comments: 10 pages, 9 figures, 2 tables

  42. arXiv:2307.02693  [pdf, other]

    cs.LG stat.ML

    Kernels, Data & Physics

    Authors: Francesco Cagnetta, Deborah Oliveira, Mahalakshmi Sabanayagam, Nikolaos Tsilivis, Julia Kempe

    Abstract: Lecture notes from the course given by Professor Julia Kempe at the summer school "Statistical physics of Machine Learning" in Les Houches. The notes discuss the so-called NTK approach to problems in machine learning, which consists of gaining an understanding of generally unsolvable problems by finding a tractable kernel formulation. The notes are mainly focused on practical applications such as…

    Submitted 5 July, 2023; originally announced July 2023.

    Comments: These are notes from the lecture of Julia Kempe given at the summer school "Statistical Physics & Machine Learning", that took place in Les Houches School of Physics in France from 4th to 29th July 2022

  43. arXiv:2306.03014  [pdf, other]

    eess.AS cs.LG cs.SD

    On the Behavior of Intrusive and Non-intrusive Speech Enhancement Metrics in Predictive and Generative Settings

    Authors: Danilo de Oliveira, Julius Richter, Jean-Marie Lemercier, Tal Peer, Timo Gerkmann

    Abstract: Since its inception, the field of deep speech enhancement has been dominated by predictive (discriminative) approaches, such as spectral mapping or masking. Recently, however, novel generative approaches have been applied to speech enhancement, attaining good denoising performance with high subjective quality scores. At the same time, advances in deep learning also allowed for the creation of neur…

    Submitted 5 June, 2023; originally announced June 2023.

    Comments: Submitted to ITG Conference on Speech Communication

  44. arXiv:2306.00007  [pdf, other]

    cs.CL cs.LG

    Datasets for Portuguese Legal Semantic Textual Similarity: Comparing weak supervision and an annotation process approaches

    Authors: Daniel da Silva Junior, Paulo Roberto dos S. Corval, Aline Paes, Daniel de Oliveira

    Abstract: The Brazilian judiciary has a large workload, resulting in a long time to finish legal proceedings. The Brazilian National Council of Justice has established in Resolution 469/2022 formal guidance for document and process digitalization, opening up the possibility of using automatic techniques to help with everyday tasks in the legal field, particularly in a large number of texts yielded on the routine…

    Submitted 29 May, 2023; originally announced June 2023.

  45. Leveraging Semantic Information for Efficient Self-Supervised Emotion Recognition with Audio-Textual Distilled Models

    Authors: Danilo de Oliveira, Navin Raj Prabhu, Timo Gerkmann

    Abstract: In large part due to their implicit semantic modeling, self-supervised learning (SSL) methods have significantly increased the performance of valence recognition in speech emotion recognition (SER) systems. Yet, their large size may often hinder practical implementations. In this work, we take HuBERT as an example of an SSL model and analyze the relevance of each of its layers for SER. We show tha…

    Submitted 30 May, 2023; originally announced May 2023.

    Comments: Accepted at Interspeech 2023

    Journal ref: Proc. Interspeech 2023

  46. An interpretable machine learning system for colorectal cancer diagnosis from pathology slides

    Authors: Pedro C. Neto, Diana Montezuma, Sara P. Oliveira, Domingos Oliveira, João Fraga, Ana Monteiro, João Monteiro, Liliana Ribeiro, Sofia Gonçalves, Stefan Reinhard, Inti Zlobec, Isabel M. Pinto, Jaime S. Cardoso

    Abstract: Considering the profound transformation affecting pathology practice, we aimed to develop a scalable artificial intelligence (AI) system to diagnose colorectal cancer from whole-slide images (WSI). For this, we propose a deep learning (DL) system that learns from weak labels, a sampling strategy that reduces the number of training samples by a factor of six without compromising performance, an app…

    Submitted 30 April, 2024; v1 submitted 6 January, 2023; originally announced January 2023.

    Comments: Accepted at npj Precision Oncology. Available at: https://www.nature.com/articles/s41698-024-00539-4

    Journal ref: npj Precis. Onc. 8, 56 (2024)

  47. Chronic pain patient narratives allow for the estimation of current pain intensity

    Authors: Diogo A. P. Nunes, Joana Ferreira-Gomes, Daniela Oliveira, Carlos Vaz, Sofia Pimenta, Fani Neto, David Martins de Matos

    Abstract: Chronic pain is a multi-dimensional experience, and pain intensity plays an important part, impacting the patient's emotional balance, psychology, and behaviour. Standard self-reporting tools, such as the Visual Analogue Scale for pain, fail to capture this burden. Moreover, this type of tool is susceptible to a degree of subjectivity, dependent on the patient's clear understanding of how to use it…

    Submitted 17 November, 2022; v1 submitted 31 October, 2022; originally announced October 2022.

    Comments: 29 pages, 6 figures, 7 tables

    ACM Class: I.2.7; I.5.4; J.3; J.4

  48. arXiv:2210.13167  [pdf, other]

    cs.CV

    Exploring Self-Attention for Crop-type Classification Explainability

    Authors: Ivica Obadic, Ribana Roscher, Dario Augusto Borges Oliveira, Xiao Xiang Zhu

    Abstract: Transformer models have become a promising approach for crop-type classification. Although their attention weights can be used to understand the relevant time points for crop disambiguation, the validity of these insights depends on how closely the attention weights approximate the actual workings of these black-box models, which is not always clear. In this paper, we introduce a novel explainabil…

    Submitted 20 April, 2025; v1 submitted 24 October, 2022; originally announced October 2022.

  49. arXiv:2210.11327  [pdf, other]

    cs.LG stat.ML

    Improving Data Quality with Training Dynamics of Gradient Boosting Decision Trees

    Authors: Moacir Antonelli Ponti, Lucas de Angelis Oliveira, Mathias Esteban, Valentina Garcia, Juan Martín Román, Luis Argerich

    Abstract: Real-world datasets contain incorrectly labeled instances that hamper the performance of the model and, in particular, the ability to generalize out of distribution. Also, each example might make a different contribution towards learning. This motivates studies to better understand the role of data instances with respect to their contribution to good metrics in models. In this paper we propose…

    Submitted 22 February, 2024; v1 submitted 20 October, 2022; originally announced October 2022.

  50. arXiv:2210.09969  [pdf, other]

    cs.CV cs.AI cs.CL

    Transfer-learning for video classification: Video Swin Transformer on multiple domains

    Authors: Daniel A. P. Oliveira, David Martins de Matos

    Abstract: The computer vision community has seen a shift from convolutional-based to pure transformer architectures for both image and video tasks. Training a transformer from zero for these tasks usually requires a lot of data and computational resources. Video Swin Transformer (VST) is a pure-transformer model developed for video classification which achieves state-of-the-art results in accuracy and effic…

    Submitted 28 March, 2025; v1 submitted 18 October, 2022; originally announced October 2022.

    Comments: 7 pages, 11 figures

    ACM Class: I.2.7; I.2.10; I.4.8; I.5.4