Search | arXiv e-print repository

Role of Friction on the Formation of Confined Granular Structures

Authors: Vinícius Pereira da S. Oliveira, Danilo S. Borges, Erick M. Franklin, Jorge Peixinho

Abstract: Metastable systems of fluidized grains can auto-defluidize after some time, the settling particles forming either a glass- or crystal-like structure. We carried out experiments using different polymer spheres, of known friction and roughness, fluidized in water. We show that the level of velocity fluctuations is higher for the high friction material. A diagram was obtained for the settled particles when the coefficient of friction is of the order of 0.1, and their structure is characterized through an analysis of the nearest neighbors' angles. For the lower friction values, we find that the number of defects is smaller, the contact chains being longer and aligned. Our results bring new insights for understanding the formation of glass- and crystal-like structures based on the material surface properties.

Submitted 20 May, 2025; originally announced May 2025.

Comments: 7 pages, 7 figures

arXiv:2505.13752 [pdf, other]

doi 10.3389/fspas.2025.1572313

Tracking Reentries of Starlink Satellites During the Rising Phase of Solar Cycle 25

Authors: Denny M. Oliveira, Eftyhia Zesta, Katherine Garcia-Sage

Abstract: The exponential increase of low-Earth orbit (LEO) satellites in the past 5 years has brought into intense focus the need for reliable monitoring and reentry prediction to safeguard from space collisions and ground debris impacts. However, LEO satellites fly within the upper atmosphere region that exerts significant drag forces to their orbits, reducing their lifetimes, and increasing collision risks during dynamic events, like geomagnetic storms. Such conditions can become more severe during geomagnetic storms, particularly during extreme events. In this work, we use two-line element (TLE) satellite tracking data to investigate geomagnetic activity effects on the reentries of 523 Starlink satellites from 2020 to 2024. This period coincides with the rising phase of solar cycle 25, which has shown itself to be more intense than the previous solar cycle. We derive satellite altitudes and velocities from TLE files and perform a superposed epoch analysis, the first with hundreds of similar satellites. Even with limitedly accurate TLE data, our results indisputably show that satellites reenter faster with higher geomagnetic activity. This is explained by the fastest orbital decay rates (in km/day) of the satellites caused by increased drag forces. We also find that prediction errors, defined as the difference between the epochs of actual reentries and predicted reentries at reference altitudes, increase with geomagnetic activity.
As a result, we clearly show that the intense solar activity of the current solar cycle has already had significant impacts on Starlink reentries. This is a very exciting time in satellite orbital drag research, since the number of satellites in LEO and solar activity are the highest ever observed in human history.

Submitted 19 May, 2025; originally announced May 2025.

Comments: 21 pages, 8 figures

Journal ref: Frontiers in Astronomy and Space Science, 2025
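
The abstract above mentions deriving satellite altitudes from TLE files and reporting orbital decay rates in km/day. The following is a minimal sketch of one common way to do that, not the authors' actual pipeline: it assumes a near-circular orbit, converts the TLE mean motion (revolutions per day) to a semi-major axis with Kepler's third law, and takes finite differences between epochs. The function names and the sample values are hypothetical.

```python
import math

MU_EARTH_KM3_S2 = 398600.4418   # Earth's gravitational parameter (km^3/s^2)
EARTH_RADIUS_KM = 6378.137      # Earth's equatorial radius (km)
SECONDS_PER_DAY = 86400.0


def altitude_from_mean_motion(rev_per_day):
    """Approximate mean altitude (km) from a TLE mean motion, assuming a circular two-body orbit."""
    n_rad_s = rev_per_day * 2.0 * math.pi / SECONDS_PER_DAY
    semi_major_axis_km = (MU_EARTH_KM3_S2 / n_rad_s ** 2) ** (1.0 / 3.0)
    return semi_major_axis_km - EARTH_RADIUS_KM


def decay_rates(epochs_days, altitudes_km):
    """Finite-difference altitude change (km/day) between consecutive epochs; negative means decay."""
    pairs = zip(zip(epochs_days, altitudes_km), zip(epochs_days[1:], altitudes_km[1:]))
    return [(h1 - h0) / (t1 - t0) for (t0, h0), (t1, h1) in pairs]


# Hypothetical mean-motion values (rev/day) at three epochs (days), for illustration only.
epochs = [0.0, 1.0, 2.0]
altitudes = [altitude_from_mean_motion(n) for n in (15.50, 15.52, 15.55)]
print(decay_rates(epochs, altitudes))
```

A rising mean motion corresponds to a shrinking orbit, so the printed rates are negative. TLE mean elements are only approximate, which is consistent with the abstract's remark about the limited accuracy of TLE data.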

arXiv:2505.11391 [pdf, ps, other]

LipDiffuser: Lip-to-Speech Generation with Conditional Diffusion Models

Authors: Danilo de Oliveira, Julius Richter, Tal Peer, Timo Gerkmann

Abstract: We present LipDiffuser, a conditional diffusion model for lip-to-speech generation synthesizing natural and intelligible speech directly from silent video recordings. Our approach leverages the magnitude-preserving ablated diffusion model (MP-ADM) architecture as a denoiser model. To effectively condition the model, we incorporate visual features using magnitude-preserving feature-wise linear modulation (MP-FiLM) alongside speaker embeddings. A neural vocoder then reconstructs the speech waveform from the generated mel-spectrograms. Evaluations on LRS3 and TCD-TIMIT demonstrate that LipDiffuser outperforms existing lip-to-speech baselines in perceptual speech quality and speaker similarity, while remaining competitive in downstream automatic speech recognition (ASR). These findings are also supported by a formal listening experiment. Extensive ablation studies and cross-dataset evaluation confirm the effectiveness and generalization capabilities of our approach.

Submitted 26 May, 2025; v1 submitted 16 May, 2025; originally announced May 2025.

arXiv:2505.10292 [pdf, ps, other]

StoryReasoning Dataset: Using Chain-of-Thought for Scene Understanding and Grounded Story Generation

Authors: Daniel A. P. Oliveira, David Martins de Matos

Abstract: Visual storytelling systems struggle to maintain character identity across frames and link actions to appropriate subjects, frequently leading to referential hallucinations. These issues can be addressed through grounding of characters, objects, and other entities on the visual elements. We propose StoryReasoning, a dataset containing 4,178 stories derived from 52,016 movie images, with both structured scene analyses and grounded stories. Each story maintains character and object consistency across frames while explicitly modeling multi-frame relationships through structured tabular representations. Our approach features cross-frame object re-identification using visual similarity and face recognition, chain-of-thought reasoning for explicit narrative modeling, and a grounding scheme that links textual elements to visual entities across multiple frames. We establish baseline performance by fine-tuning Qwen2.5-VL 7B, creating Qwen Storyteller, which performs end-to-end object detection, re-identification, and landmark detection while maintaining consistent object references throughout the story. Evaluation demonstrates a reduction from 4.06 to 3.56 (-12.3%) hallucinations on average per story when compared to a non-fine-tuned model.

Submitted 15 May, 2025; originally announced May 2025.

Comments: 31 pages, 14 figures

ACM Class: I.2.10; I.2.7

arXiv:2505.05216 [pdf, other]

Normalize Everything: A Preconditioned Magnitude-Preserving Architecture for Diffusion-Based Speech Enhancement

Authors: Julius Richter, Danilo de Oliveira, Timo Gerkmann

Abstract: This paper presents a new framework for diffusion-based speech enhancement. Our method employs a Schroedinger bridge to transform the noisy speech distribution into the clean speech distribution. To stabilize and improve training, we employ time-dependent scalings of the inputs and outputs of the network, known as preconditioning. We consider two skip connection configurations, which either include or omit the current process state in the denoiser's output, enabling the network to predict either environmental noise or clean speech. Each approach leads to improved performance on different speech enhancement metrics. To maintain stable magnitude levels and balance during training, we use a magnitude-preserving network architecture that normalizes all activations and network weights to unit length. Additionally, we propose learning the contribution of the noisy input within each network block for effective input conditioning. After training, we apply a method to approximate different exponential moving average (EMA) profiles and investigate their effects on the speech enhancement performance. In contrast to image generation tasks, where longer EMA lengths often enhance mode coverage, we observe that shorter EMA lengths consistently lead to better performance on standard speech enhancement metrics. Code, audio examples, and checkpoints are available online.

Submitted 8 May, 2025; originally announced May 2025.

Comments: Submitted to WASPAA 2025

arXiv:2504.10405 [pdf, ps, other]

Performance of Large Language Models in Supporting Medical Diagnosis and Treatment

Authors: Diogo Sousa, Guilherme Barbosa, Catarina Rocha, Dulce Oliveira

Abstract: The integration of Large Language Models (LLMs) into healthcare holds significant potential to enhance diagnostic accuracy and support medical treatment planning. These AI-driven systems can analyze vast datasets, assisting clinicians in identifying diseases, recommending treatments, and predicting patient outcomes. This study evaluates the performance of a range of contemporary LLMs, including both open-source and closed-source models, on the 2024 Portuguese National Exam for medical specialty access (PNA), a standardized medical knowledge assessment. Our results highlight considerable variation in accuracy and cost-effectiveness, with several models demonstrating performance exceeding human benchmarks for medical students on this specific task. We identify leading models based on a combined score of accuracy and cost, discuss the implications of reasoning methodologies like Chain-of-Thought, and underscore the potential for LLMs to function as valuable complementary tools aiding medical professionals in complex clinical decision-making.

Submitted 14 April, 2025; originally announced April 2025.

Comments: 21 pages, 6 figures, 4 tables. Acknowledgements: The authors acknowledge the support of the AITriage4SU Project (2024.07400.IACDC/2024), funded by the FCT (Foundation for Science and Technology), Portugal

ACM Class: I.2.7; J.3

arXiv:2503.23389 [pdf]

Proprioceptive multistable mechanical metamaterial via soft capacitive sensors

Authors: Hugo de Souza Oliveira, Niloofar Saeedzadeh Khaanghah, Martijn Oetelmans, Niko Münzenrieder, Edoardo Milana

Abstract: The technological transition from soft machines to soft robots necessarily passes through the integration of soft electronics and sensors. This allows for the establishment of feedback control systems while preserving the softness of the robot embodiment. Multistable mechanical metamaterials are excellent building blocks of soft machines, as their nonlinear response can be tuned by design to accomplish several functions. In this work, we present the integration of soft capacitive sensors in a multistable mechanical metamaterial, to enable proprioceptive sensing of state changes. The metamaterial is a periodic arrangement of 4 bistable unit cells. Each unit cell has an integrated capacitive sensor. Both the metastructure and the sensors are made of soft materials (TPU) and are 3D printed. Our preliminary results show that the capacitance variation of the sensors can be linked to state transitions of the metamaterial, by capturing the nonlinear deformation.

Submitted 30 March, 2025; originally announced March 2025.

Comments: 2024 IEEE International Flexible Electronics Technology Conference (IFETC)

arXiv:2503.23375 [pdf, other]

Meta-Ori: monolithic meta-origami for nonlinear inflatable soft actuators

Authors: Hugo de Souza Oliveira, Xin Li, Johannes Frey, Edoardo Milana

Abstract: The nonlinear mechanical response of soft materials and slender structures is purposefully harnessed to program functions by design in soft robotic actuators, such as sequencing, amplified response, fast energy release, etc. However, typical designs of nonlinear actuators - e.g. balloons, inverted membranes, springs - have limited design parameters space and complex fabrication processes, hindering the achievement of more elaborated functions. Mechanical metamaterials, on the other hand, have very large design parameter spaces, which allow fine-tuning of nonlinear behaviours. In this work, we present a novel approach to fabricate nonlinear inflatables based on metamaterials and origami (Meta-Ori) as monolithic parts that can be fully 3D printed via Fused Deposition Modeling (FDM) using thermoplastic polyurethane (TPU) commercial filaments. Our design consists of a metamaterial shell with cylindrical topology and nonlinear mechanical response combined with a Kresling origami inflatable acting as a pneumatic transmitter. We develop and release a design tool in the visual programming language Grasshopper to interactively design our Meta-Ori. We characterize the mechanical response of the metashell and the origami, and the nonlinear pressure-volume curve of the Meta-Ori inflatable and, lastly, we demonstrate the actuation sequencing of a bi-segment monolithic Meta-Ori soft actuator.

Submitted 30 March, 2025; originally announced March 2025.

Comments: 8th IEEE-RAS International Conference on Soft Robotics

arXiv:2503.20889 [pdf, other]

doi 10.1051/0004-6361/202453307

Feedback from low-to-moderate luminosity radio-AGN with MaNGA

Authors: Pranav Kukreti, Dominika Wylezalek, Marco Albán, Bruno Dall'Agnol de Oliveira

Abstract: Spatially resolved spectral studies of radio-AGN host galaxies have shown that these systems can impact the ionised gas on galactic scales. However, whether jet and radiation-driven feedback occurs simultaneously is still unclear. We select a large and representative sample of 806 radio-AGN from the MaNGA survey, from $L_\mathrm{1.4GHz}\approx10^{21}-10^{25}$W/Hz, and trace the warm ionised gas kinematics using the [OIII] emission line from the IFU spectra. We measure the [OIII] line width and compare it to the stellar velocity dispersion to determine the presence and location of the disturbed gas. We find most disturbed [OIII] kinematics and proportion of disturbed sources up to a radial distance of 0.25$R_{eff}$, when both radio and optical AGN are present in a source, and the radio luminosity is larger than $10^{23}$W/Hz. When either radio or optical-AGN are present, the impact on [OIII] is milder. Irrespective of the presence of an optical-AGN, we find significant differences in the feedback from high and low luminosity radio-AGN only up to a radial distance of 0.25$R_{eff}$. The presence of more kinematically disturbed warm ionised gas in the central region of radio-AGN host galaxies is related to both jets and radiation in these sources. We propose that in moderate radio luminosity AGN ($L_\mathrm{1.4GHz}\approx10^{23}-10^{25}$W/Hz) gas clouds pushed to high velocities by the jets (radiation) are driven to even higher velocities by the impact of radiation (jets) when both radio and optical-AGN are present.
At lower luminosities ($L_\mathrm{1.4GHz}\approx10^{21}-10^{23}$W/Hz), the correlation between the disturbed ionised gas and enhanced radio emission could either be due to wind-driven shocks powering the radio emission, or low-power jets disturbing the gas.

Submitted 26 March, 2025; originally announced March 2025.

Comments: Accepted for publication in Astronomy and Astrophysics

Journal ref: A&A 698, A99 (2025)

arXiv:2503.16306 [pdf, other]

The Paradox of Anti-Inductive Dice

Authors: Summer Eldridge, Ivo David de Oliveira, Yogev Shpilman

Abstract: We identify a new type of paradoxical behavior in dice, where the sum of independent rolls produces a deceptive sequence of dominance relations. We call these ``anti-inductive dice''. Consider a game with two players and two non-identical dice. Each rolls their die $k$ times, adding the results, and the player with the highest sum wins. For each $k$, this induces a dominance relation between dice, with $A[k]\succ B[k]$ if $A$ is more likely than $B$ to win after $k$ rolls, and vice versa. For certain classes of dice, the limiting behavior of these relations is well-established in the literature, but the transient behavior, the subject of this paper, is less well-understood. This transient behavior, even for dice with only 4 faces, contains an immensely rich parameter space with fractal-like behavior.

Submitted 20 March, 2025; originally announced March 2025.

Comments: 11 pages, 4 figures
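
The game described in the abstract above can be evaluated exactly for small $k$. The sketch below builds the distribution of the sum of $k$ rolls by repeated convolution and compares the two dice outcome by outcome. The dice used are hypothetical, chosen only for illustration and not taken from the paper, and the function names are assumptions.

```python
from collections import Counter


def sum_distribution(faces, k):
    """Count the ways each total can occur when a die with `faces` is rolled k times."""
    dist = Counter({0: 1})  # zero rolls: the only possible total is 0
    for _ in range(k):
        nxt = Counter()
        for total, ways in dist.items():
            for face in faces:
                nxt[total + face] += ways
        dist = nxt
    return dist


def dominance(a_faces, b_faces, k):
    """Return 'A', 'B', or 'tie' for the k-roll sum game between dice A and B."""
    dist_a = sum_distribution(a_faces, k)
    dist_b = sum_distribution(b_faces, k)
    a_wins = b_wins = 0
    for sum_a, ways_a in dist_a.items():
        for sum_b, ways_b in dist_b.items():
            if sum_a > sum_b:
                a_wins += ways_a * ways_b
            elif sum_b > sum_a:
                b_wins += ways_a * ways_b
    if a_wins > b_wins:
        return "A"
    if b_wins > a_wins:
        return "B"
    return "tie"


# Hypothetical 4-faced dice with equal means, for illustration only.
die_a = [3, 3, 3, 3]
die_b = [1, 1, 1, 9]
for k in range(1, 4):
    print(k, dominance(die_a, die_b, k))
```

With these dice, A is favoured for $k=1$ and $k=2$ while B is favoured for $k=3$, so the relation observed at small $k$ does not carry over to larger $k$.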

arXiv:2503.11499 [pdf, other]

Tactical Asset Allocation with Macroeconomic Regime Detection

Authors: Daniel Cunha Oliveira, Dylan Sandfelder, André Fujita, Xiaowen Dong, Mihai Cucuringu

Abstract: This paper extends the tactical asset allocation literature by incorporating regime modeling using techniques from machine learning. We propose a novel model that classifies current regimes, forecasts the distribution of future regimes, and integrates these forecasts with the historical performance of individual assets to optimize portfolio allocations. Utilizing a macroeconomic data set from the FRED-MD database, our approach employs a modified k-means algorithm to ensure consistent regime classification over time. We then leverage these regime predictions to estimate expected returns and volatilities, which are subsequently mapped into portfolio allocations using various sizing schemes. Our method outperforms traditional benchmarks such as equal-weight, buy-and-hold, and random regime models. Additionally, we are the first to apply a regime detection model from a large macroeconomic dataset to tactical asset allocation, demonstrating significant improvements in portfolio performance. Our work presents several key contributions, including a novel data-driven regime detection algorithm tailored for uncertainty in forecasted regimes and applying the FRED-MD data set for tactical asset allocation.

Submitted 21 March, 2025; v1 submitted 14 March, 2025; originally announced March 2025.

arXiv:2503.10520 [pdf, other]

CountPath: Automating Fragment Counting in Digital Pathology

Authors: Ana Beatriz Vieira, Maria Valente, Diana Montezuma, Tomé Albuquerque, Liliana Ribeiro, Domingos Oliveira, João Monteiro, Sofia Gonçalves, Isabel M. Pinto, Jaime S. Cardoso, Arlindo L. Oliveira

Abstract: Quality control of medical images is a critical component of digital pathology, ensuring that diagnostic images meet required standards. A pre-analytical task within this process is the verification of the number of specimen fragments, a process that ensures that the number of fragments on a slide matches the number documented in the macroscopic report. This step is important to ensure that the slides contain the appropriate diagnostic material from the grossing process, thereby guaranteeing the accuracy of subsequent microscopic examination and diagnosis. Traditionally, this assessment is performed manually, requiring significant time and effort while being subject to significant variability due to its subjective nature. To address these challenges, this study explores an automated approach to fragment counting using the YOLOv9 and Vision Transformer models. Our results demonstrate that the automated system achieves a level of performance comparable to expert assessments, offering a reliable and efficient alternative to manual counting. Additionally, we present findings on interobserver variability, showing that the automated approach achieves an accuracy of 86%, which falls within the range of variation observed among experts (82-88%), further supporting its potential for integration into routine pathology workflows.

Submitted 13 March, 2025; originally announced March 2025.

Comments: 10 pages, 3 figures

ACM Class: I.2; I.4

arXiv:2503.08321 [pdf, other]

i-WiViG: Interpretable Window Vision GNN

Authors: Ivica Obadic, Dmitry Kangin, Dario Oliveira, Plamen P Angelov, Xiao Xiang Zhu

Abstract: Deep learning models based on graph neural networks have emerged as a popular approach for solving computer vision problems. They encode the image into a graph structure and can be beneficial for efficiently capturing the long-range dependencies typically present in remote sensing imagery. However, an important drawback of these methods is their black-box nature which may hamper their wider usage in critical applications. In this work, we tackle the self-interpretability of the graph-based vision models by proposing our Interpretable Window Vision GNN (i-WiViG) approach, which provides explanations by automatically identifying the relevant subgraphs for the model prediction. This is achieved with window-based image graph processing that constrains the node receptive field to a local image region and by using a self-interpretable graph bottleneck that ranks the importance of the long-range relations between the image regions. We evaluate our approach to remote sensing classification and regression tasks, showing it achieves competitive performance while providing inherent and faithful explanations through the identified relations. Further, the quantitative evaluation reveals that our model reduces the infidelity of post-hoc explanations compared to other Vision GNN models, without sacrificing explanation sparsity.

Submitted 11 March, 2025; originally announced March 2025.

arXiv:2503.03599 [pdf, other]

REGRACE: A Robust and Efficient Graph-based Re-localization Algorithm using Consistency Evaluation

Authors: Débora N. P. Oliveira, Joshua Knights, Sebastián Barbas Laina, Simon Boche, Wolfram Burgard, Stefan Leutenegger

Abstract: Loop closures are essential for correcting odometry drift and creating consistent maps, especially in the context of large-scale navigation. Current methods using dense point clouds for accurate place recognition do not scale well due to computationally expensive scan-to-scan comparisons. Alternative object-centric approaches are more efficient but often struggle with sensitivity to viewpoint variation. In this work, we introduce REGRACE, a novel approach that addresses these challenges of scalability and perspective difference in re-localization by using LiDAR-based submaps. We introduce rotation-invariant features for each labeled object and enhance them with neighborhood context through a graph neural network. To identify potential revisits, we employ a scalable bag-of-words approach, pooling one learned global feature per submap. Additionally, we define a revisit with geometrical consistency cues rather than embedding distance, allowing us to recognize far-away loop closures. Our evaluations demonstrate that REGRACE achieves similar results compared to state-of-the-art place recognition and registration baselines while being twice as fast.

Submitted 5 March, 2025; originally announced March 2025.

Comments: Submitted to IROS2025

arXiv:2502.20442 [pdf, other]

doi 10.1051/0004-6361/202553668

JWST + ALMA ubiquitously discover companion systems within $\lesssim18\,$kpc around four $z$$\approx$3.5 luminous radio-loud AGN

Authors: Wuji Wang, Carlos De Breuck, Dominika Wylezalek, Joël Vernet, Matthew D. Lehnert, Daniel Stern, David S. N. Rupke, Nicole P. H. Nesvadba, Andrey Vayner, Nadia L. Zakamska, Lingrui Lin, Pranav Kukreti, Bruno Dall'Agnol de Oliveira, Julian T. Groth

Abstract: Mergers play important roles in galaxy evolution at and beyond Cosmic Noon ($z\sim3$). They are found to be a trigger of active galactic nuclei (AGN) activity and a process for growing stellar mass and black hole mass. High-$z$ radio galaxies (HzRGs=type-2 radio-loud AGN) are among the most massive galaxies known, and reside in dense environments on scales of tens of kiloparsecs to Megaparsecs. We present the first search for kpc-scale companions in a sample of four $z\sim3.5$ HzRGs, with many supporting datasets, using matched 0.2" resolution ALMA and JWST/NIRSpec integral field unit data. We discover a total of $\sim12$ companion systems within $\lesssim18\,$kpc across all four HzRG fields using two independent detection methods: peculiar [OIII]$4959,5007$ kinematics offset from the main (systemic) ionized gas component and [CII]$158\rm μm$ emitters. We examine the velocity fields of these companions and find evidence of disk rotation along with more complex motions. We estimate the dynamical masses of these nearby systems to be $M_{\rm dyn}\sim10^{9-11}\,M_{\odot}$, which may indicate a minor merger scenario. Our results indicate that these companions may be the trigger of the powerful radio-loud AGN. We discuss the roles of the discovered companion systems in galaxy evolution for these powerful jetted AGN and indicate that they may impede jet launch and deflect the jet.

Submitted 27 February, 2025; originally announced February 2025.

Comments: Accepted for publication in Astronomy & Astrophysics, 16 pages, 12 figures, and 2 tables in main text; also cool figures in appendix

Journal ref: A&A 696, A88 (2025)

arXiv:2502.15891 [pdf, other]

Counting communities in weighted Stochastic Block Models via semidefinite programming

Authors: Deborah Oliveira, Andressa Cerqueira, Roberto Oliveira

Abstract: We consider the problem of estimating the number of communities in a weighted balanced Stochastic Block Model. We construct hypothesis tests based on semidefinite programming and with a statistic coming from a GOE matrix to distinguish between any two candidate numbers of communities. This is possible due to a universality result for a semidefinite programming-based function that we also prove. The tests are then used to form a sequential test to estimate the number of communities. Furthermore, we also construct estimators of the communities themselves.

Submitted 21 February, 2025; originally announced February 2025.

Comments: This is a first draft. Comments are welcome

arXiv:2502.13898 [pdf, other]

GroundCap: A Visually Grounded Image Captioning Dataset

Authors: Daniel A. P. Oliveira, Lourenço Teodoro, David Martins de Matos

Abstract: Current image captioning systems lack the ability to link descriptive text to specific visual elements, making their outputs difficult to verify. While recent approaches offer some grounding capabilities, they cannot track object identities across multiple references or ground both actions and objects simultaneously. We propose a novel ID-based grounding system that enables consistent object reference tracking and action-object linking, and present GroundCap, a dataset containing 52,016 images from 77 movies, with 344 human-annotated and 52,016 automatically generated captions. Each caption is grounded on detected objects (132 classes) and actions (51 classes) using a tag system that maintains object identity while linking actions to the corresponding objects. Our approach features persistent object IDs for reference tracking, explicit action-object linking, and segmentation of background elements through K-means clustering. We propose gMETEOR, a metric combining caption quality with grounding accuracy, and establish baseline performance by fine-tuning Pixtral-12B. Human evaluation demonstrates our approach's effectiveness in producing verifiable descriptions with coherent object references.

Submitted 24 March, 2025; v1 submitted 19 February, 2025; originally announced February 2025.

Comments: 37 pages

ACM Class: I.2.10; I.2.7

arXiv:2502.12350 [pdf, other]

Mamute: high-performance computing for geophysical methods

Authors: João B. Fernandes, Antônio D. S. Oliveira, Mateus C. A. T. Silva, Felipe H. Santos-da-Silva, Vitor H. M. Rodrigues, Kleiton A. Schneider, Calebe P. Bianchini, João M. de Araujo, Tiago Barros, Ítalo A. S. Assis, Samuel Xavier-de-Souza

Abstract: Due to their high computational cost, geophysical applications are typically designed to run in large computing systems. Because of that, such applications must implement several high-performance techniques to use the computational resources better. In this paper, we present Mamute, a software that delivers wave equation-based geophysical methods. Mamute implements two geophysical methods: seismic modeling and full waveform inversion (FWI). It also supports high-performance strategies such as fault tolerance, automatic parallel looping scheduling, and distributed systems workload balancing. We demonstrate Mamute's operation using both seismic modeling and FWI. Mamute is a C++ software readily available under the MIT license.

Submitted 17 February, 2025; originally announced February 2025.

Comments: 24 pages, 6 figures, Journal

arXiv:2502.07073 [pdf, ps, other]

Hidden symmetries and the generic spectral setting of generalized laplacians on homogeneous spaces

Authors: Diego S. De Oliveira, Marcus A. M. Marrocos

Abstract: The purpose of this work is to establish the spectral setting of some generalized Laplace operators associated to a generic $G$-invariant metric on a compact homogeneous space $M=G/K$. We show that this generic spectral configuration depends on the $G$-isometries and on some certain hidden symmetries constructed in the adjacent structures of $M$ and of these operators.

Submitted 10 February, 2025; originally announced February 2025.

Comments: arXiv admin note: text overlap with arXiv:2501.18747

MSC Class: 35J05; 22C05; 22E46; 53C35

arXiv:2501.18747 [pdf, ps, other]

doi 10.1007/s00229-024-01567-x

A note about the generic irreducibility of the spectrum of the Laplacian on homogeneous spaces

Authors: Diego S. de Oliveira, Marcus A. M. Marrocos

Abstract: Petrecca and Röser (2018, \cite{Petrecca2019}), and Schueth (2017, \cite{Schueth2017}) had shown that for a generic $G$-invariant metric $g$ on certain compact homogeneous spaces $M=G/K$ (including symmetric spaces of rank 1 and some Lie groups), the spectrum of the Laplace-Beltrami operator $Δ_g$ was real $G$-simple. The same is not true for the complex version of $Δ_g$ when there is a presence of representations of complex or quaternionic type. We show that these types of representations induces a $Q_8$-action that commutes with the Laplacian in such way that $G$-properties of the real version of the operator have to be understood as $(Q_8 \times G)$-properties on its corresponding complex version. Also we argue that for symmetric spaces on rank $\geq 2$ there are algebraic symmetries on the corresponding root systems which relates distinct irreducible representations on the same eigenspace.

Submitted 30 January, 2025; originally announced January 2025.

MSC Class: 35J05; 22C05; 22E46; 53C35

Journal ref: Oliveira, D.S.d., Marrocos, M.A.M. A note about the generic irreducibility of the spectrum of the Laplacian on homogeneous spaces. manuscripta math. 175, 143–154 (2024)

arXiv:2501.08401 [pdf, ps, other]

Navigating Gender Disparities in Communication Research Leadership: Academic Recognition, Career Development, and Compensation

Authors: Diego F. M. Oliveira, Qian Huang

Abstract: This study examines gender disparities in communication research through citation metrics, authorship patterns, team composition, and faculty salaries. Using data from 62,359 papers across 121 communication journals, we find that while female authors are increasingly represented, citation gaps persist, with sole-authored papers by women receiving fewer citations than those by men, especially in smaller teams. Team composition analysis reveals a tendency toward gender homophily, with single-gender teams being more common. In top U.S. communication journals, female authors face underrepresentation and citation disparities favoring male authors. Salary analysis from leading U.S. public universities shows that female faculty earn lower salaries at the Assistant Professor level, though disparities lessen at higher ranks. These findings highlight the need for greater efforts to promote gender equity through inclusive collaboration, equitable citation practices, and fair compensation.

Submitted 15 January, 2025; v1 submitted 14 January, 2025; originally announced January 2025.

arXiv:2501.00556 [pdf, other]

Finding the Underlying Viscoelastic Constitutive Equation via Universal Differential Equations and Differentiable Physics

Authors: Elias C. Rodrigues, Roney L. Thompson, Dário A. B. Oliveira, Roberto F. Ausas

Abstract: This research employs Universal Differential Equations (UDEs) alongside differentiable physics to model viscoelastic fluids, merging conventional differential equations, neural networks and numerical methods to reconstruct missing terms in constitutive models. This study focuses on analyzing four viscoelastic models: Upper Convected Maxwell (UCM), Johnson-Segalman, Giesekus, and Exponential Phan-Thien-Tanner (ePTT), through the use of synthetic datasets. The methodology was tested across different experimental conditions, including oscillatory and startup flows. While the UDE framework effectively predicts shear and normal stresses for most models, it demonstrates some limitations when applied to the ePTT model. The findings underscore the potential of UDEs in fluid mechanics while identifying critical areas for methodological improvement. Also, a model distillation approach was employed to extract simplified models from complex ones, emphasizing the versatility and robustness of UDEs in rheological modeling.

Submitted 23 May, 2025; v1 submitted 31 December, 2024; originally announced January 2025.

arXiv:2501.00049 [pdf, other]

Seq2Seq Model-Based Chatbot with LSTM and Attention Mechanism for Enhanced User Interaction

Authors: Lamya Benaddi, Charaf Ouaddi, Adnane Souha, Abdeslam Jakimi, Mohamed Rahouti, Mohammed Aledhari, Diogo Oliveira, Brahim Ouchao

Abstract: A chatbot is an intelligent software application that automates conversations and engages users in natural language through messaging platforms. Leveraging artificial intelligence (AI), chatbots serve various functions, including customer service, information gathering, and casual conversation. Existing virtual assistant chatbots, such as ChatGPT and Gemini, demonstrate the potential of AI in Natural Language Processing (NLP). However, many current solutions rely on predefined APIs, which can result in vendor lock-in and high costs. To address these challenges, this work proposes a chatbot developed using a Sequence-to-Sequence (Seq2Seq) model with an encoder-decoder architecture that incorporates attention mechanisms and Long Short-Term Memory (LSTM) cells. By avoiding predefined APIs, this approach ensures flexibility and cost-effectiveness. The chatbot is trained, validated, and tested on a dataset specifically curated for the tourism sector in Draa-Tafilalet, Morocco. Key evaluation findings indicate that the proposed Seq2Seq model-based chatbot achieved high accuracies: approximately 99.58% in training, 98.03% in validation, and 94.12% in testing. These results demonstrate the chatbot's effectiveness in providing relevant and coherent responses within the tourism domain, highlighting the potential of specialized AI applications to enhance user experience and satisfaction in niche markets.

Submitted 27 December, 2024; originally announced January 2025.

Comments: The Third Workshop on Deployable AI at AAAI-2025

arXiv:2411.16945 [pdf, other]

Rare events for low energy domain in bouncing ball model

Authors: Edson D. Leonel, Diego F. M. Oliveira

Abstract: The probability distribution for multiple collisions observed in the chaotic low energy domain in the bouncing ball model is shown to be scaling invariant concerning the control parameters. The model considers the dynamics of a bouncing ball particle colliding elastically with two rigid walls. One is fixed, and the other one moves periodically in time. The dynamics is described by a two-dimensional mapping for the variables velocity of the particle and phase of the moving wall. For a specific combination of velocity and phase, the particle may experience a type of rare collision named successive collisions. We show that a power law describes the probability distribution of the multiple impacts and is scaling invariant to the control parameter.

Submitted 25 November, 2024; originally announced November 2024.

arXiv:2411.12928 [pdf, other]

Discussing a transition from bounded to unbounded energy in a time-dependent billiard

Authors: Anne Kétri P. da Fonseca, Felipe Augusto O. Silveira, Célia M. Kuwana, Diego F. M. Oliveira, Edson D. Leonel

Abstract: We revisit a time-dependent, oval-shaped billiard to investigate a phase transition from bounded to unbounded energy growth. In the static case, the phase space exhibits a mixed structure. The chaotic sea in the static scenario leads to average energy growth for a time-dependent boundary. However, inelastic collisions between the particle and the boundary limit this unbounded energy increase. This transition displays properties similar to continuous phase transitions in statistical mechanics, including scale invariance, interrelated critical exponents governed by scaling laws, and an order parameter/susceptibility approaching zero/infinity at the transition. Furthermore, the system exhibits an elementary excitation that promotes particle diffusion and lacks topological defects that provide modifications to the probability distribution function.

Submitted 19 November, 2024; originally announced November 2024.

arXiv:2411.12648 [pdf, ps, other]

Scaling invariance for the diffusion coefficient in a dissipative standard mapping

Authors: Edson D. Leonel, Celia M. Kuwana, Diego F. M. Oliveira

Abstract: The unbounded diffusion observed for the standard mapping in a regime of high nonlinearity is suppressed by dissipation due to the violation of Liouville's theorem. The diffusion coefficient becomes important for the description of scaling invariance particularly for the suppression of the unbounded action diffusion. When the dynamics start in the regime of low action, the diffusion coefficient remains constant for a long time, guaranteeing the diffusion for an ensemble of particles. Eventually, it evolves into a regime of decay, marking the suppression of particle action growth. We prove it is scaling invariant for the control parameters and the crossover time identifying the changeover from the constant domain, leading to diffusion, for a regime of decay marking the saturation of the diffusion, scales with the same critical exponent $z=-1$ for a transition from bounded to unbounded diffusion in a dissipative time dependent billiard system.

Submitted 19 November, 2024; originally announced November 2024.

arXiv:2411.09524 [pdf, other]

FlowNav: Combining Flow Matching and Depth Priors for Efficient Navigation

Authors: Samiran Gode, Abhijeet Nayak, Débora N. P. Oliveira, Michael Krawez, Cordelia Schmid, Wolfram Burgard

Abstract: Effective robot navigation in unseen environments is a challenging task that requires precise control actions at high frequencies. Recent advances have framed it as an image-goal-conditioned control problem, where the robot generates navigation actions using frontal RGB images. Current state-of-the-art methods in this area use diffusion policies to generate these control actions. Despite their promising results, these models are computationally expensive and suffer from weak perception. To address these limitations, we present FlowNav, a novel approach that uses a combination of Conditional Flow Matching (CFM) and depth priors from off-the-shelf foundation models to learn action policies for robot navigation. FlowNav is significantly more accurate at navigation and exploration than state-of-the-art methods. We validate our contributions using real robot experiments in multiple unseen environments, demonstrating improved navigation reliability and accuracy. We make the code and trained models publicly available.

Submitted 3 March, 2025; v1 submitted 14 November, 2024; originally announced November 2024.

Comments: Submitted to IROS'25. Previous version accepted at CoRL 2024 workshop on Learning Effective Abstractions for Planning (LEAP) and workshop on Differentiable Optimization Everywhere: Simulation, Estimation, Learning, and Control

arXiv:2411.02659 [pdf, other]

Scaling Laws and Convergence Dynamics in a Dissipative Kicked Rotator

Authors: Danilo S. Rando, Edson D. Leonel, Diego F. M. Oliveira

Abstract: The kicked rotator model is an essential paradigm in nonlinear dynamics, helping us understand the emergence of chaos and bifurcations in dynamical systems. In this study, we analyze a two-dimensional kicked rotator model considering a homogeneous and generalized function approach to describe the convergence dynamics towards a stationary state. By examining the behavior of critical exponents and scaling laws, we demonstrate the universal nature of convergence dynamics. Specifically, we highlight the significance of the period-doubling bifurcation, showing that the critical exponents governing the convergence dynamics are consistent with those seen in other models.

Submitted 4 November, 2024; originally announced November 2024.

arXiv:2411.01654 [pdf, other]

doi 10.3389/fspas.2024.1522139

The 10 October 2024 geomagnetic storm may have caused the premature reentry of a Starlink satellite

Authors: Denny M. Oliveira, Eftyhia Zesta, Dibyendu Nandy

Abstract: In this short communication, we qualitatively analyze possible effects of the 10 October 2024 geomagnetic storm on accelerating the reentry of a Starlink satellite from very low-Earth orbit (VLEO). The storm took place near the maximum of solar cycle (SC) 25, which has shown to be more intense than SC24. Based on preliminary geomagnetic indices, the 10 October 2024, along with the 10 May 2024, were the most intense events since the well-known Halloween storms of October/November 2003. By looking at a preliminary version of the Dst index and altitudes along with velocities extracted from two-line element (TLE) data of the Starlink-1089 (SL-1089) satellite, we observe a possible connection between storm main phase onset and a sharp decay of SL-1089. The satellite was predicted to reenter on 22 October, but it reentered on 12 October, 10 days before schedule. The sharp altitude decay of SL-1089 revealed by TLE data coincides with the storm main phase onset. We compare the de-orbiting altitudes of another three satellites during different geomagnetic conditions and observe that the day difference between actual and predicted reentries increases for periods with higher geomagnetic activity. Therefore, we call for future research to establish the eventual causal relationship between storm occurrence and satellite orbital decay. As predicted by previous works, SC25 is already producing extreme geomagnetic storms with unprecedented satellite orbital drag effects and consequences for current megaconstellations in VLEO.

Submitted 18 December, 2024; v1 submitted 3 November, 2024; originally announced November 2024.

Comments: 11 pages, 2 figures, 1 table

arXiv:2410.21990 [pdf, other]

doi 10.1109/tse.2024.3453783

Understanding Code Understandability Improvements in Code Reviews

Authors: Delano Oliveira, Reydne Santos, Benedito de Oliveira, Martin Monperrus, Fernando Castor, Fernanda Madeiral

Abstract: Motivation: Code understandability is crucial in software development, as developers spend 58% to 70% of their time reading source code. Improving it can improve productivity and reduce maintenance costs. Problem: Experimental studies often identify factors influencing code understandability in controlled settings but overlook real-world influences like project culture, guidelines, and developers' backgrounds. Ignoring these factors may yield results with limited external validity. Objective: This study investigates how developers enhance code understandability through code review comments, assuming that code reviewers are specialists in code quality. Method and Results: We analyzed 2,401 code review comments from Java open-source projects on GitHub, finding that over 42% focus on improving code understandability. We further examined 385 comments specifically related to this aspect and identified eight categories of concerns, such as inadequate documentation and poor identifiers. Notably, 83.9% of suggestions for improvement were accepted and integrated, with fewer than 1% later reverted. We identified various types of patches that enhance understandability, from simple changes like removing unused code to context-dependent improvements such as optimizing method calls. Additionally, we evaluated four well-known linters for their ability to flag these issues, finding they cover less than 30%, although many could be easily added as new rules.
Implications: Our findings encourage the development of tools to enhance code understandability, as accepted changes can serve as reliable training data for specialized machine-learning models. Our dataset supports this training and can inform the development of evidence-based code style guides. Data Availability: Our data is publicly available at https://codeupcrc.github.io.

Submitted 12 November, 2024; v1 submitted 29 October, 2024; originally announced October 2024.

Journal ref: IEEE Transactions on Software Engineering, 2024

arXiv:2410.17834 [pdf, other]

Non-intrusive Speech Quality Assessment with Diffusion Models Trained on Clean Speech

Authors: Danilo de Oliveira, Julius Richter, Jean-Marie Lemercier, Simon Welker, Timo Gerkmann

Abstract: Diffusion models have found great success in generating high quality, natural samples of speech, but their potential for density estimation for speech has so far remained largely unexplored. In this work, we leverage an unconditional diffusion model trained only on clean speech for the assessment of speech quality. We show that the quality of a speech utterance can be assessed by estimating the likelihood of a corresponding sample in the terminating Gaussian distribution, obtained via a deterministic noising process. The resulting method is purely unsupervised, trained only on clean speech, and therefore does not rely on annotations. Our diffusion-based approach leverages clean speech priors to assess quality based on how the input relates to the learned distribution of clean data. Our proposed log-likelihoods show promising results, correlating well with intrusive speech quality metrics such as POLQA and SI-SDR.

Submitted 23 October, 2024; originally announced October 2024.

arXiv:2410.14943 [pdf, other]

doi 10.5281/zenodo.13844758

Workflows Community Summit 2024: Future Trends and Challenges in Scientific Workflows

Authors: Rafael Ferreira da Silva, Deborah Bard, Kyle Chard, Shaun de Witt, Ian T. Foster, Tom Gibbs, Carole Goble, William Godoy, Johan Gustafsson, Utz-Uwe Haus, Stephen Hudson, Shantenu Jha, Laila Los, Drew Paine, Frédéric Suter, Logan Ward, Sean Wilkinson, Marcos Amaris, Yadu Babuji, Jonathan Bader, Riccardo Balin, Daniel Balouek, Sarah Beecroft, Khalid Belhajjame, Rajat Bhattarai , et al. (86 additional authors not shown)

Abstract: The Workflows Community Summit gathered 111 participants from 18 countries to discuss emerging trends and challenges in scientific workflows, focusing on six key areas: time-sensitive workflows, AI-HPC convergence, multi-facility workflows, heterogeneous HPC environments, user experience, and FAIR computational workflows. The integration of AI and exascale computing has revolutionized scientific workflows, enabling higher-fidelity models and complex, time-sensitive processes, while introducing challenges in managing heterogeneous environments and multi-facility data dependencies. The rise of large language models is driving computational demands to zettaflop scales, necessitating modular, adaptable systems and cloud-service models to optimize resource utilization and ensure reproducibility. Multi-facility workflows present challenges in data movement, curation, and overcoming institutional silos, while diverse hardware architectures require integrating workflow considerations into early system design and developing standardized resource management tools. The summit emphasized improving user experience in workflow systems and ensuring FAIR workflows to enhance collaboration and accelerate scientific discovery. Key recommendations include developing standardized metrics for time-sensitive workflows, creating frameworks for cloud-HPC integration, implementing distributed-by-design workflow modeling, establishing multi-facility authentication protocols, and accelerating AI integration in HPC workflow management.
The summit also called for comprehensive workflow benchmarks, workflow-specific UX principles, and a FAIR workflow maturity model, highlighting the need for continued collaboration in addressing the complex challenges posed by the convergence of AI, HPC, and multi-facility research environments.

Submitted 18 October, 2024; originally announced October 2024.

Report number: ORNL/TM-2024/3573

arXiv:2410.04318 [pdf, other]

Urban Computing for Climate and Environmental Justice: Early Perspectives From Two Research Initiatives

Authors: Carolina Veiga, Ashish Sharma, Daniel de Oliveira, Marcos Lage, Fabio Miranda

Abstract: The impacts of climate change are intensifying existing vulnerabilities and disparities within urban communities around the globe, as extreme weather events, including floods and heatwaves, are becoming more frequent and severe, disproportionately affecting low-income and underrepresented groups. Tackling these increasing challenges requires novel approaches that integrate expertise across multiple domains, including computer science, engineering, climate science, and public health. Urban computing can play a pivotal role in these efforts by integrating data from multiple sources to support decision-making and provide actionable insights into weather patterns, infrastructure weaknesses, and population vulnerabilities. However, the capacity to leverage technological advancements varies significantly between the Global South and Global North. In this paper, we present two multiyear, multidisciplinary projects situated in Chicago, USA and Niterói, Brazil, highlighting the opportunities and limitations of urban computing in these diverse contexts. Reflecting on our experiences, we then discuss the essential requirements, as well as existing gaps, for visual analytics tools that facilitate the understanding and mitigation of climate-related risks in urban environments.

Submitted 5 October, 2024; originally announced October 2024.

Comments: Accepted at the Viz4Climate + Sustainability: IEEE VIS 2024 Workshop on Visualization for Climate Action and Sustainability (https://svs.gsfc.nasa.gov/events/2024/Viz4ClimateAndSustainability/)

arXiv:2409.12675 [pdf, other]

Resource Management and Circuit Scheduling for Distributed Quantum Computing Interconnect Networks

Authors: Sima Bahrani, Romerson D. Oliveira, Juan Marcelo Parra-Ullauri, Rui Wang, Dimitra Simeonidou

Abstract: Distributed quantum computing (DQC) has emerged as a promising approach to overcome the scalability limitations of monolithic quantum processors in terms of computing capability. However, realising the full potential of DQC requires effective resource management and circuit scheduling. This involves efficiently assigning each circuit to an optimal subset of quantum processing units (QPUs), based on factors such as their computational power and connectivity. In heterogeneous DQC networks with arbitrary topologies and non-identical QPUs, this becomes a complex challenge. This paper addresses resource management in such settings, with a focus on computing resource allocation in a quantum data center. We propose circuit scheduling and resource allocation algorithms that combine heuristic methods with a Mixed-Integer Linear Programming (MILP) formulation. Our MILP model accounts for infidelities arising from inter-QPU communication. The algorithms consider key factors including network topology, QPU characteristics, and quantum circuit structure to make efficient scheduling and allocation decisions. Simulation results demonstrate that our approach significantly improves circuit execution time and resource utilisation, measured by makespan, throughput, and QPU usage, while also reducing inter-QPU communication, compared to a baseline random allocation strategy. This work provides valuable insights into resource management strategies for scalable and heterogeneous DQC systems.

Submitted 21 May, 2025; v1 submitted 19 September, 2024; originally announced September 2024.

Comments: 11 pages, 8 figures

arXiv:2409.10753 [pdf, other]

Investigating Training Objectives for Generative Speech Enhancement

Authors: Julius Richter, Danilo de Oliveira, Timo Gerkmann

Abstract: Generative speech enhancement has recently shown promising advancements in improving speech quality in noisy environments. Multiple diffusion-based frameworks exist, each employing distinct training objectives and learning techniques. This paper aims to explain the differences between these frameworks by focusing our investigation on score-based generative models and the Schrödinger bridge. We conduct a series of comprehensive experiments to compare their performance and highlight differing training behaviors. Furthermore, we propose a novel perceptual loss function tailored for the Schrödinger bridge framework, demonstrating enhanced performance and improved perceptual quality of the enhanced speech signals. All experimental code and pre-trained models are publicly available to facilitate further research and development in this domain.

Submitted 18 January, 2025; v1 submitted 16 September, 2024; originally announced September 2024.

Comments: Accepted at ICASSP 2025

arXiv:2408.11254 [pdf, other]

Mapping Chaos: Bifurcation Patterns and Shrimp Structures in the Ikeda Map

Authors: Diego F. M. Oliveira

Abstract: This study examines the dynamical properties of the Ikeda map, with a focus on bifurcations and chaotic behavior. We investigate how variations in dissipation parameters influence the system, uncovering shrimp-shaped structures that represent intricate transitions between regular and chaotic dynamics. Key findings include the analysis of period-doubling bifurcations and the onset of chaos. We utilize Lyapunov exponents to distinguish between stable and chaotic regions. These insights contribute to a deeper understanding of nonlinear and chaotic dynamics in optical systems.

Submitted 20 August, 2024; originally announced August 2024.
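
As a rough illustration of the Lyapunov-exponent diagnostic mentioned in the Ikeda map abstract above, the sketch below estimates the largest Lyapunov exponent by evolving a tangent vector along an orbit. It assumes the commonly quoted textbook form of the map with a single dissipation parameter u, which may differ from the parameterization used in the paper; the function names, initial condition, and iteration counts are illustrative choices only.

```python
import math

def ikeda_step(x, y, u):
    """One iteration of the textbook Ikeda map (assumed form, not taken from the paper)."""
    r2 = x * x + y * y
    t = 0.4 - 6.0 / (1.0 + r2)
    c, s = math.cos(t), math.sin(t)
    return 1.0 + u * (x * c - y * s), u * (x * s + y * c)

def ikeda_jacobian(x, y, u):
    """Analytic Jacobian of ikeda_step, returned as (j11, j12, j21, j22)."""
    r2 = x * x + y * y
    t = 0.4 - 6.0 / (1.0 + r2)
    c, s = math.cos(t), math.sin(t)
    dtdx = 12.0 * x / (1.0 + r2) ** 2
    dtdy = 12.0 * y / (1.0 + r2) ** 2
    a = -x * s - y * c  # derivative of (x cos t - y sin t) with respect to t
    b = x * c - y * s   # derivative of (x sin t + y cos t) with respect to t
    return (u * (c + a * dtdx), u * (-s + a * dtdy),
            u * (s + b * dtdx), u * (c + b * dtdy))

def largest_lyapunov(u, n_transient=1000, n_steps=20000):
    """Average logarithmic growth rate of a tangent vector along the orbit."""
    x, y = 0.1, 0.1  # arbitrary starting point (assumption)
    for _ in range(n_transient):  # discard the transient
        x, y = ikeda_step(x, y, u)
    vx, vy = 1.0, 0.0
    total = 0.0
    for _ in range(n_steps):
        j11, j12, j21, j22 = ikeda_jacobian(x, y, u)
        vx, vy = j11 * vx + j12 * vy, j21 * vx + j22 * vy
        norm = math.hypot(vx, vy)
        total += math.log(norm)
        vx, vy = vx / norm, vy / norm  # renormalize to avoid overflow
        x, y = ikeda_step(x, y, u)
    return total / n_steps
```

A positive estimate suggests chaotic dynamics and a negative one suggests a regular attractor; scanning such an estimate over a grid of two parameters is one common way parameter-space diagrams of this kind are produced.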

arXiv:2408.09960 [pdf, other]

Causality-Inspired Models for Financial Time Series Forecasting

Authors: Daniel Cunha Oliveira, Yutong Lu, Xi Lin, Mihai Cucuringu, Andre Fujita

Abstract: We introduce a novel framework to financial time series forecasting that leverages causality-inspired models to balance the trade-off between invariance to distributional changes and minimization of prediction errors. To the best of our knowledge, this is the first study to conduct a comprehensive comparative analysis among state-of-the-art causal discovery algorithms, benchmarked against non-causal feature selection techniques, in the application of forecasting asset returns. Empirical evaluations demonstrate the efficacy of our approach in yielding stable and accurate predictions, outperforming baseline models, particularly in tumultuous market conditions.

Submitted 19 August, 2024; originally announced August 2024.

arXiv:2408.08250 [pdf, other]

doi 10.1016/j.cag.2024.104015

Computer Vision Model Compression Techniques for Embedded Systems: A Survey

Authors: Alexandre Lopes, Fernando Pereira dos Santos, Diulhio de Oliveira, Mauricio Schiezaro, Helio Pedrini

Abstract: Deep neural networks have consistently represented the state of the art in most computer vision problems. In these scenarios, larger and more complex models have demonstrated superior performance to smaller architectures, especially when trained with plenty of representative data. With the recent adoption of Vision Transformer (ViT) based architectures and advanced Convolutional Neural Networks (CNNs), the total number of parameters of leading backbone architectures increased from 62M parameters in 2012 with AlexNet to 7B parameters in 2024 with AIM-7B. Consequently, deploying such deep architectures faces challenges in environments with processing and runtime constraints, particularly in embedded systems. This paper covers the main model compression techniques applied for computer vision tasks, enabling modern models to be used in embedded systems. We present the characteristics of compression subareas, compare different approaches, and discuss how to choose the best technique and expected variations when analyzing it on various embedded devices. We also share codes to assist researchers and new practitioners in overcoming initial implementation challenges for each subarea and present trends for Model Compression. Case studies for compression models are available at https://github.com/venturusbr/cv-model-compression.

Submitted 15 August, 2024; originally announced August 2024.

Journal ref: Computers & Graphics, Volume 123, October 2024, 104015

arXiv:2408.07817 [pdf]

doi 10.1126/sciadv.ads9150

MyoGestic: EMG Interfacing Framework for Decoding Multiple Spared Degrees of Freedom of the Hand in Individuals with Neural Lesions

Authors: Raul C. Sîmpetru, Dominik I. Braun, Arndt U. Simon, Michael März, Vlad Cnejevici, Daniela Souza de Oliveira, Nico Weber, Jonas Walter, Jörg Franke, Daniel Höglinger, Cosima Prahm, Matthias Ponfick, Alessandro Del Vecchio

Abstract: Restoring limb motor function in individuals with spinal cord injury (SCI), stroke, or amputation remains a critical challenge, one which affects millions worldwide. Recent studies show through surface electromyography (EMG) that spared motor neurons can still be voluntarily controlled, even without visible limb movement. These signals can be decoded and used for motor intent estimation; however, current wearable solutions lack the necessary hardware and software for intuitive interfacing of the spared degrees of freedom after neural injuries. To address these limitations, we developed a wireless, high-density EMG bracelet, coupled with a novel software framework, MyoGestic. Our system allows rapid and tailored adaptability of machine learning models to the needs of the users, facilitating real-time decoding of multiple spared distinctive degrees of freedom. In our study, we successfully decoded the motor intent from two participants with SCI, two with spinal stroke, and three amputees in real-time, achieving several controllable degrees of freedom within minutes after wearing the EMG bracelet. We provide a proof-of-concept that these decoded signals can be used to control a digitally rendered hand, a wearable orthosis, a prosthesis, or a 2D cursor. Our framework promotes a participant-centered approach, allowing immediate feedback integration, thus enhancing the iterative development of myocontrol algorithms. The proposed open-source software framework, MyoGestic, allows researchers and patients to focus on the augmentation and training of the spared degrees of freedom after neural lesions, thus potentially bridging the gap between research and clinical application and advancing the development of intuitive EMG interfaces for diverse neural lesions.

Submitted 14 August, 2024; originally announced August 2024.

Comments: 23 pages, 8 figures

ACM Class: H.5.2; J.3; I.5.4; D.2.13

Journal ref: Science Advances, 11, 2025, eads9150

arXiv:2408.06139 [pdf, other]

doi 10.1109/TVCG.2024.3456353

Curio: A Dataflow-Based Framework for Collaborative Urban Visual Analytics

Authors: Gustavo Moreira, Maryam Hosseini, Carolina Veiga, Lucas Alexandre, Nicola Colaninno, Daniel de Oliveira, Nivan Ferreira, Marcos Lage, Fabio Miranda

Abstract: Over the past decade, several urban visual analytics systems and tools have been proposed to tackle a host of challenges faced by cities, in areas as diverse as transportation, weather, and real estate. Many of these tools have been designed through collaborations with urban experts, aiming to distill intricate urban analysis workflows into interactive visualizations and interfaces. However, the design, implementation, and practical use of these tools still rely on siloed approaches, resulting in bespoke applications that are difficult to reproduce and extend. At the design level, these tools undervalue rich data workflows from urban experts, typically treating them only as data providers and evaluators. At the implementation level, they lack interoperability with other technical frameworks. At the practical use level, they tend to be narrowly focused on specific fields, inadvertently creating barriers to cross-domain collaboration. To address these gaps, we present Curio, a framework for collaborative urban visual analytics. Curio uses a dataflow model with multiple abstraction levels (code, grammar, GUI elements) to facilitate collaboration across the design and implementation of visual analytics components. The framework allows experts to intertwine data preprocessing, management, and visualization stages while tracking the provenance of code and visualizations. In collaboration with urban experts, we evaluate Curio through a diverse set of usage scenarios targeting urban accessibility, urban microclimate, and sunlight access. These scenarios use different types of data and domain methodologies to illustrate Curio's flexibility in tackling pressing societal challenges. Curio is available at https://urbantk.org/curio.

Submitted 12 August, 2024; originally announced August 2024.

Comments: Accepted at IEEE VIS 2024. Source code available at https://urbantk.org/curio

arXiv:2407.18673 [pdf, other]

A Survey on Cell Nuclei Instance Segmentation and Classification: Leveraging Context and Attention

Authors: João D. Nunes, Diana Montezuma, Domingos Oliveira, Tania Pereira, Jaime S. Cardoso

Abstract: Manually annotating nuclei from the gigapixel Hematoxylin and Eosin (H&E)-stained Whole Slide Images (WSIs) is a laborious and costly task, meaning automated algorithms for cell nuclei instance segmentation and classification could alleviate the workload of pathologists and clinical researchers and at the same time facilitate the automatic extraction of clinically interpretable features. But due to high intra- and inter-class variability of nuclei morphological and chromatic features, as well as H&E-stains susceptibility to artefacts, state-of-the-art algorithms cannot correctly detect and classify instances with the necessary performance. In this work, we hypothesise context and attention inductive biases in artificial neural networks (ANNs) could increase the generalization of algorithms for cell nuclei instance segmentation and classification. We conduct a thorough survey on context and attention methods for cell nuclei instance segmentation and classification from H&E-stained microscopy imaging, while providing a comprehensive discussion of the challenges being tackled with context and attention. Besides, we illustrate some limitations of current approaches and present ideas for future research. As a case study, we extend both a general instance segmentation and classification method (Mask-RCNN) and a tailored cell nuclei instance segmentation and classification model (HoVer-Net) with context- and attention-based mechanisms, and do a comparative analysis on a multi-centre colon nuclei identification and counting dataset. Although pathologists rely on context at multiple levels while paying attention to specific Regions of Interest (RoIs) when analysing and annotating WSIs, our findings suggest translating that domain knowledge into algorithm design is no trivial task, but to fully exploit these mechanisms, the scientific understanding of these methods should be addressed.

Submitted 26 July, 2024; originally announced July 2024.

arXiv:2407.11786 [pdf, other]

Cryptocurrency Price Forecasting Using XGBoost Regressor and Technical Indicators

Authors: Abdelatif Hafid, Maad Ebrahim, Ali Alfatemi, Mohamed Rahouti, Diogo Oliveira

Abstract: The rapid growth of the stock market has attracted many investors due to its potential for significant profits. However, predicting stock prices accurately is difficult because financial markets are complex and constantly changing. This is especially true for the cryptocurrency market, which is known for its extreme volatility, making it challenging for traders and investors to make wise and profitable decisions. This study introduces a machine learning approach to predict cryptocurrency prices. Specifically, we make use of important technical indicators such as Exponential Moving Average (EMA) and Moving Average Convergence Divergence (MACD) to train and feed the XGBoost regressor model. We demonstrate our approach through an analysis focusing on the closing prices of Bitcoin cryptocurrency. We evaluate the model's performance through various simulations, showing promising results that suggest its usefulness in aiding/guiding cryptocurrency traders and investors in dynamic market conditions.

Submitted 16 July, 2024; originally announced July 2024.

Comments: 9 pages, 3 figures, 4 tables, submitted to the 43rd IEEE International Performance Computing and Communications Conference (IPCCC 2024)
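
For readers unfamiliar with the two indicators named in the abstract above, the following minimal sketch computes an EMA and the MACD and signal lines from a list of closing prices. The spans 12, 26, and 9 are the conventional defaults and are an assumption here, as is seeding each EMA with the first value; the abstract does not state the settings the authors used, and the function names are illustrative.

```python
def ema(values, span):
    """Exponential moving average with smoothing factor 2 / (span + 1).

    The first output equals the first input (a common seeding choice, assumed here).
    """
    alpha = 2.0 / (span + 1.0)
    out = []
    for v in values:
        out.append(v if not out else alpha * v + (1.0 - alpha) * out[-1])
    return out

def macd(prices, fast=12, slow=26, signal=9):
    """Return (macd_line, signal_line) for a sequence of closing prices.

    macd_line is the fast EMA minus the slow EMA; signal_line is an EMA of macd_line.
    """
    fast_ema = ema(prices, fast)
    slow_ema = ema(prices, slow)
    macd_line = [f - s for f, s in zip(fast_ema, slow_ema)]
    signal_line = ema(macd_line, signal)
    return macd_line, signal_line
```

Columns like these could then serve as input features for a regressor, although how the features are windowed and combined in the paper is not described in the abstract.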

arXiv:2407.07665 [pdf]

doi 10.3847/1538-4357/ad9335

The Solar and Geomagnetic Storms in May 2024: A Flash Data Report

Authors: Hisashi Hayakawa, Yusuke Ebihara, Alexander Mishev, Sergey Koldobskiy, Kanya Kusano, Sabrina Bechet, Seiji Yashiro, Kazumasa Iwai, Atsuki Shinbori, Kalevi Mursula, Fusa Miyake, Daikou Shiota, Marcos V. D. Silveira, Robert Stuart, Denny M. Oliveira, Sachiko Akiyama, Kouji Ohnishi, Vincent Ledvina, Yoshizumi Miyoshi

Abstract: In May 2024, the scientific community observed intense solar eruptions that resulted in a great geomagnetic storm and auroral extension, highlighting the need to document and quantify these events. This study mainly focuses on their quantification. The source active region (AR 13664) evolved from 113 to 2761 millionths of the solar hemisphere between 4 May and 14 May. AR 13664's magnetic free energy surpassed 10^33 erg on 7 May, triggering 12 X-class flares on 8–15 May. Multiple interplanetary coronal mass ejections (ICMEs) were produced from this AR, accelerating solar energetic particles toward Earth. According to satellite and interplanetary scintillation data, at least 4 ICMEs erupted from 13664 eventually overcoming each other and combining. The shock arrival at 17:05 UT on 10 May significantly compressed the magnetosphere down to ~ 5.04 RE, and triggered a deep Forbush Decrease. GOES satellite data and ground-based neutron monitors confirmed a ground-level enhancement from 2 UT to 10 UT on 11 May 2024. The ICMEs induced exceptional geomagnetic storms, peaking at a Dst index of -412 nT at 2 UT on 11 May, marking the sixth-largest storm since 1957. The AE and AL indices showed great auroral extensions that located the AE/AL stations into the polar cap. We gathered auroral records at that time and reconstructed the equatorward boundary of the visual auroral oval to 29.8° invariant latitude. We compared naked-eye and camera auroral visibility, providing critical caveats on their difference. We also confirmed global enhancements of storm-enhanced density of the ionosphere.

Submitted 18 November, 2024; v1 submitted 10 July, 2024; originally announced July 2024.

Comments: 65 pages, 19 figures, and 2 tables. Accepted for publication in the Astrophysical Journal

Journal ref: The Astrophysical Journal, 2024

arXiv:2407.00941 [pdf, ps, other]

Full Iso-recursive Types

Authors: Litao Zhou, Qianyong Wan, Bruno C. d. S. Oliveira

Abstract: There are two well-known formulations of recursive types: iso-recursive and equi-recursive types. Abadi and Fiore [1996] have shown that iso- and equi-recursive types have the same expressive power. However, their encoding of equi-recursive types in terms of iso-recursive types requires explicit coercions. These coercions come with significant additional computational overhead, and complicate reasoning about the equivalence of the two formulations of recursive types. This paper proposes a generalization of iso-recursive types called full iso-recursive types. Full iso-recursive types allow encoding all programs with equi-recursive types without computational overhead. Instead of explicit term coercions, all type transformations are captured by computationally irrelevant casts, which can be erased at runtime without affecting the semantics of the program. Consequently, reasoning about the equivalence between the two approaches can be greatly simplified. We present a calculus called $λ^μ_{Fi}$, which extends the simply typed lambda calculus (STLC) with full iso-recursive types. The $λ^μ_{Fi}$ calculus is proved to be type sound, and shown to have the same expressive power as a calculus with equi-recursive types. We also extend our results to subtyping, and show that equi-recursive subtyping can be expressed in terms of iso-recursive subtyping with cast operators.

Submitted 7 July, 2024; v1 submitted 30 June, 2024; originally announced July 2024.

Comments: This work has been conditionally accepted to OOPSLA 2024

arXiv:2406.09927 [pdf, ps, other]

Index estimates for harmonic Gauss maps

Authors: Alcides de Carvalho, Marcos P. Cavalcante, Wagner Costa-Filho, Darlan de Oliveira

Abstract: Let $Σ$ denote a closed surface with constant mean curvature in $\mathbb{G}^3$, a 3-dimensional Lie group equipped with a bi-invariant metric. For such surfaces, there is a harmonic Gauss map which maps values to the unit sphere within the Lie algebra of $\mathbb{G}$. We prove that the energy index of the Gauss map of $Σ$ is bounded below by its topological genus. We also obtain index estimates in the case of complete non compact surfaces.

Submitted 14 June, 2024; originally announced June 2024.

Comments: 11 pages

arXiv:2406.03460 [pdf, other]

The PESQetarian: On the Relevance of Goodhart's Law for Speech Enhancement

Authors: Danilo de Oliveira, Simon Welker, Julius Richter, Timo Gerkmann

Abstract: To obtain improved speech enhancement models, researchers often focus on increasing performance according to specific instrumental metrics. However, when the same metric is used in a loss function to optimize models, it may be detrimental to aspects that the given metric does not see. The goal of this paper is to illustrate the risk of overfitting a speech enhancement model to the metric used for evaluation. For this, we introduce enhancement models that exploit the widely used PESQ measure. Our "PESQetarian" model achieves 3.82 PESQ on VB-DMD while scoring very poorly in a listening experiment. While the obtained PESQ value of 3.82 would imply "state-of-the-art" PESQ-performance on the VB-DMD benchmark, our examples show that when optimizing w.r.t. a metric, an isolated evaluation on the same metric may be misleading. Instead, other metrics should be included in the evaluation and the resulting performance predictions should be confirmed by listening.

Submitted 5 June, 2024; originally announced June 2024.

Comments: Accepted at Interspeech 2024

arXiv:2406.02748 [pdf, other]

Story Generation from Visual Inputs: Techniques, Related Tasks, and Challenges

Authors: Daniel A. P. Oliveira, Eugénio Ribeiro, David Martins de Matos

Abstract: Creating engaging narratives from visual data is crucial for automated digital media consumption, assistive technologies, and interactive entertainment. This survey covers methodologies used in the generation of these narratives, focusing on their principles, strengths, and limitations. The survey also covers tasks related to automatic story generation, such as image and video captioning, and visual question answering, as well as story generation without visual inputs. These tasks share common challenges with visual story generation and have served as inspiration for the techniques used in the field. We analyze the main datasets and evaluation metrics, providing a critical perspective on their limitations.

Submitted 4 June, 2024; originally announced June 2024.

ACM Class: I.2.7; I.2.10

arXiv:2405.13491 [pdf, other]

doi 10.1051/0004-6361/202450810

Euclid. I. Overview of the Euclid mission

Authors: Euclid Collaboration, Y. Mellier, Abdurro'uf, J. A. Acevedo Barroso, A. Achúcarro, J. Adamek, R. Adam, G. E. Addison, N. Aghanim, M. Aguena, V. Ajani, Y. Akrami, A. Al-Bahlawan, A. Alavi, I. S. Albuquerque, G. Alestas, G. Alguero, A. Allaoui, S. W. Allen, V. Allevato, A. V. Alonso-Tetilla, B. Altieri, A. Alvarez-Candal, S. Alvi, A. Amara, et al. (1115 additional authors not shown)

Abstract: The current standard model of cosmology successfully describes a variety of measurements, but the nature of its main ingredients, dark matter and dark energy, remains unknown. Euclid is a medium-class mission in the Cosmic Vision 2015-2025 programme of the European Space Agency (ESA) that will provide high-resolution optical imaging, as well as near-infrared imaging and spectroscopy, over about 14,000 deg^2 of extragalactic sky. In addition to accurate weak lensing and clustering measurements that probe structure formation over half of the age of the Universe, its primary probes for cosmology, these exquisite data will enable a wide range of science. This paper provides a high-level overview of the mission, summarising the survey characteristics, the various data-processing steps, and data products. We also highlight the main science objectives and expected performance.

Submitted 24 September, 2024; v1 submitted 22 May, 2024; originally announced May 2024.

Comments: Accepted for publication in the A&A special issue 'Euclid on Sky'

Journal ref: A&A 697, A1 (2025)

arXiv:2405.04647 [pdf, other]

doi 10.3389/fspas.2024.1392697

First direct observations of interplanetary shock impact angle effects on actual geomagnetically induced currents: The case of the Finnish natural gas pipeline system

Authors: Denny M. Oliveira, Eftyhia Zesta, Sergio Vidal-Luengo

Abstract: The impact of interplanetary (IP) shocks on the Earth's magnetosphere can greatly disturb the geomagnetic field and electric currents in the magnetosphere-ionosphere system. At high latitudes, the current systems most affected by the shocks are the auroral electrojet currents. These currents then generate ground geomagnetically induced currents (GICs) that couple with and are highly detrimental to ground artificial conductors including power transmission lines, oil/gas pipelines, railways, and submarine cables. Recent research has shown that the shock impact angle, the angle the shock normal vector performs with the Sun-Earth line, plays a major role in controlling the subsequent geomagnetic activity. More specifically, due to more symmetric magnetospheric compressions, nearly frontal shocks are usually more geoeffective than highly inclined shocks. In this study, we utilize a subset (332 events) of a shock list with more than 600 events to investigate, for the first time, shock impact angle effects on the subsequent GICs right after shock impact (compression effects) and several minutes after shock impact (substorm-like effects). We use GIC recordings from the Finnish natural gas pipeline performed near the Mäntsälä compression station in southern Finland. We find that GIC peaks (> 5 A) occurring after shock impacts are mostly caused by nearly frontal shocks and occur in the post-noon/dusk magnetic local time sector. These GIC peaks are presumably triggered by partial ring current intensifications in the dusk sector. On the other hand, more intense GIC peaks (> 20 A) generally occur several minutes after shock impacts and are located around the magnetic midnight terminator. These GIC peaks are most likely caused by intense energetic particle injections from the magnetotail which frequently occur during substorms.

Submitted 7 May, 2024; originally announced May 2024.

Comments: 29 pages, 9 figures

Journal ref: Frontiers in Astronomy and Space Science (2024)

arXiv:2405.04183 [pdf]

doi 10.3390/nano14090795

Super-suppression of long wavelength phonons in constricted nanoporous geometries

Authors: Alex Greaney, S. Aria Hosseini, Laura de Sousa Oliveira, Alathea Davies, Neophytos Neophytou

Abstract: In a typical semiconductor material, the majority of heat is carried by long wavelength, long mean-free-path phonons. Nanostructuring strategies to reduce thermal conductivity, a promising direction in the field of thermoelectrics, place scattering centers of size and spatial separation comparable to the mean-free-paths of the dominant phonons to selectively scatter them. The resultant thermal conductivity is in most cases well predicted using Matthiessen's rule. In general, however, long wavelength phonons are not as effectively scattered as the rest of the phonon spectrum. In this work, using large-scale Molecular Dynamics simulations, Non-Equilibrium Green's Function simulations, and Monte Carlo simulations, we show that specific nanoporous geometries, which create narrow constrictions in the passage of phonons, lead to anticorrelated heat currents in the phonon spectrum. This results in super-suppression of long-wavelength phonons due to heat trapping, and reductions in the thermal conductivity well below what is predicted by Matthiessen's rule.

Submitted 7 May, 2024; originally announced May 2024.

Comments: 25 pages, 7 figures

Journal ref: Nanomaterials 2024, 14(9), 795
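
The baseline that the abstract above compares against can be sketched in a few lines. Matthiessen's rule treats independent scattering mechanisms as adding in rate, so the effective mean free path is the reciprocal of the sum of reciprocals. The function name and the example numbers are illustrative assumptions and are not taken from the paper.

```python
def matthiessen(*mean_free_paths):
    """Combine independent scattering mean free paths: 1/L_eff = sum(1/L_i).

    All inputs must be positive and in the same unit (for example nm).
    """
    return 1.0 / sum(1.0 / m for m in mean_free_paths)

# Hypothetical numbers: a 100 nm intrinsic path and a 25 nm pore-limited path.
effective = matthiessen(100.0, 25.0)
```

The abstract reports that in constricted nanoporous geometries the thermal conductivity falls well below what this kind of estimate predicts, so the sketch shows only the reference rule, not the authors' result.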

Showing 1–50 of 284 results for author: Oliveira, D