-
The Galactic Pizza: Flat Rotation Curves in the Context of Cosmological Time-Energy Coupling
Authors:
Artur Novais,
André L. B. Ribeiro
Abstract:
The phenomenon of augmented gravity on the scale of galaxies, conventionally attributed to dark matter halos, is shown to possibly result from the incremental growth of galactic masses and radii over time. This approach elucidates the cosmological origins of the acceleration scale $a_0\approx cH_0/2π\approx10^{-10}$ms$^{-2}$ at which galaxy rotation curves deviate from Keplerian behavior, with no need for new particles or modifications to the laws of gravity, i.e., it constitutes a new explanatory path beyond Cold Dark Matter (CDM) and Modified Newtonian Dynamics (MOND). Once one formally equates the energy density of the universe to the critical value ($ρ=ρ_c$) and the cosmic age to the reciprocal of the Hubble parameter ($t=H^{-1}$), independently of the epoch of observation, the result is the Zero-Energy condition for the cosmic fluid's equation of state, with key repercussions for the study of dark energy since the observables can be explained in the absence of a cosmological constant. Furthermore, this mass-energy evolution framework is able to reconcile the success of CDM models in describing structure assembly at $z\lesssim6$ with the unexpected discovery of massive objects at $z\gtrsim10$. Models that feature a strong coupling between cosmic time and energy are favored by this analysis.
Submitted 31 May, 2025;
originally announced June 2025.
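As a quick check of the acceleration scale quoted in the abstract above, $a_0\approx cH_0/2π$, the sketch below evaluates the expression numerically. The value $H_0 = 70$ km s$^{-1}$ Mpc$^{-1}$ is an assumed round number chosen for illustration; the abstract does not state which value the authors adopt, and the function name is hypothetical.

```python
import math

C = 299_792_458.0                 # speed of light in m/s (exact SI value)
MPC_IN_M = 3.0856775814913673e22  # one megaparsec in metres


def acceleration_scale(h0_km_s_mpc: float) -> float:
    """Return a_0 = c * H_0 / (2*pi) in m/s^2, with H_0 given in km/s/Mpc."""
    h0_si = h0_km_s_mpc * 1.0e3 / MPC_IN_M  # convert H_0 to 1/s
    return C * h0_si / (2.0 * math.pi)


# ASSUMPTION: H_0 = 70 km/s/Mpc is an illustrative value, not from the abstract.
a0 = acceleration_scale(70.0)
print(f"a_0 = {a0:.2e} m/s^2")  # roughly 1.08e-10 m/s^2
```

With this assumed $H_0$ the result is of order $10^{-10}$ m s$^{-2}$, consistent with the order of magnitude stated in the abstract.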
-
Viability of general relativity and modified gravity cosmologies using high-redshift cosmic probes
Authors:
Fernanda Oliveira,
Bruno Ribeiro,
Wiliam S. Hipólito-Ricaldi,
Felipe Avila,
Armando Bernui
Abstract:
Several models based on General Relativity and Modified Gravity aim to reproduce the observed universe with precision comparable to the standard flat-$Λ$CDM model. In this study, we investigate the consistency of some of these models with current high-redshift cosmic data, assessing their ability to simultaneously describe both the background expansion and matter clustering, using measurements of the Hubble parameter $H(z)$, the luminosity distance $D_L(z)$, and the growth rate of structures $[fσ_8](z)$ through parametric and non-parametric methods. Our results indicate that background observables alone offer limited capacity to distinguish between models, while the inclusion of growth of structures data proves useful in revealing deviations, even if small. An $F(Q)$ model, the non-flat $Λ$CDM and the $ω$CDM emerge as alternatives well supported by data, closely matching the growth data and showing performance comparable to $Λ$CDM, as revealed by the Akaike Information Criterion. In contrast, $F(R)$ models are strongly disfavored compared to $Λ$CDM and $F(Q)$. These analyses illustrate the usefulness of both parametric and non-parametric approaches to explore the observational viability of alternative cosmological models.
Submitted 26 May, 2025;
originally announced May 2025.
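The abstract above ranks models with the Akaike Information Criterion. A minimal sketch of that kind of comparison follows, assuming Gaussian errors so that $-2\ln L_{\max}$ equals the minimum $χ^2$ up to a constant shared by all models. The model labels, $χ^2$ values, and parameter counts are made-up placeholders, not results from the paper.

```python
def aic(chi2_min: float, n_params: int) -> float:
    """AIC = chi2_min + 2k, valid for Gaussian likelihoods up to a constant."""
    return chi2_min + 2 * n_params


# ASSUMPTION: placeholder fits (chi2_min, number of free parameters).
# These numbers are invented for illustration only.
fits = {
    "model_A": (50.0, 2),
    "model_B": (49.0, 3),
}

scores = {name: aic(chi2, k) for name, (chi2, k) in fits.items()}
best = min(scores.values())
delta_aic = {name: score - best for name, score in scores.items()}

for name, delta in delta_aic.items():
    print(f"{name}: AIC = {scores[name]:.1f}, delta AIC = {delta:.1f}")
```

In this toy case the extra parameter of `model_B` lowers $χ^2$ by less than the penalty of 2 it incurs, so `model_A` has the lower AIC.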
-
clusttraj: A Solvent-Informed Clustering Tool for Molecular Modeling
Authors:
Rafael Bicudo Ribeiro,
Henrique Musseli Cezar
Abstract:
Clustering techniques are consolidated as a powerful strategy for analyzing the extensive data generated from molecular modeling. In particular, some tools have been developed to cluster configurations from classical simulations with a standard focus on individual units, ranging from small molecules to complex proteins. Since the standard approach includes computing the Root Mean Square Deviation (RMSD) of atomic positions, accounting for the permutation between atoms is crucial for optimizing the clustering procedure in the presence of identical molecules. To address this issue, we present the clusttraj program, a solvent-informed clustering package that fixes inflated RMSD values by finding the optimal pairing between configurations. The program combines reordering schemes with the Kabsch algorithm to minimize the RMSD of molecular configurations before running a hierarchical clustering protocol. By considering evaluation metrics, one can determine the ideal threshold in an automated fashion and compare the different linkage schemes available. The program capabilities are exemplified by considering solute-solvent systems ranging from pure water clusters to a solvated protein or a small solute in different solvents. As a result, we investigate the dependence on different parameters, such as the system size and reordering method, and also the representativeness of the cluster medoids for the characterization of optical properties. clusttraj is implemented as a Python library and can be employed to cluster generic ensembles of molecular configurations that go beyond solute-solvent systems.
Submitted 22 April, 2025; v1 submitted 21 April, 2025;
originally announced April 2025.
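The idea described in the clusttraj abstract above, reordering identical atoms before a Kabsch superposition so that the RMSD is not inflated by labeling, can be sketched as follows. This is not the clusttraj API: the function names are hypothetical, and the reordering shown is a plain Hungarian assignment on squared distances that assumes the two configurations are already roughly aligned and that all atoms are interchangeable.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment


def kabsch_rmsd(p: np.ndarray, q: np.ndarray) -> float:
    """RMSD between (N, 3) arrays after optimal rigid superposition of p onto q."""
    p_c = p - p.mean(axis=0)  # remove translation
    q_c = q - q.mean(axis=0)
    u, _, vt = np.linalg.svd(p_c.T @ q_c)  # SVD of the 3x3 covariance matrix
    d = np.sign(np.linalg.det(vt.T @ u.T))  # guard against improper rotations
    rot = vt.T @ np.diag([1.0, 1.0, d]) @ u.T
    diff = p_c @ rot.T - q_c
    return float(np.sqrt((diff ** 2).sum() / len(p)))


def reorder_hungarian(p: np.ndarray, q: np.ndarray) -> np.ndarray:
    """Permute the rows of q to minimise the summed squared distance to p."""
    cost = ((p[:, None, :] - q[None, :, :]) ** 2).sum(axis=2)
    _, cols = linear_sum_assignment(cost)
    return q[cols]


rng = np.random.default_rng(0)
p = rng.normal(size=(6, 3))           # toy configuration of six identical atoms
q = p[np.array([2, 0, 1, 5, 3, 4])]   # same geometry, atoms relabeled

rmsd_raw = kabsch_rmsd(p, q)                          # inflated by the relabeling
rmsd_fixed = kabsch_rmsd(p, reorder_hungarian(p, q))  # essentially zero
print(rmsd_raw, rmsd_fixed)
```

Pairwise RMSD values obtained this way could then be passed to a hierarchical clustering routine, which is the step the abstract describes next.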
-
Galaxy Mergers in a Fractal Cosmology
Authors:
Bruno J. Souza,
Osvaldo L. Santos-Pereira,
Marcelo B. Ribeiro
Abstract:
This work discusses the influence of galaxy mergers on the evolution of a parabolic Lemaître-Tolman-Bondi (LTB) cosmology with simultaneous big bang, endowed with two consecutive single fractal galaxy distribution systems possessing fractal dimension $D$. Based on recent empirical findings, it is assumed that the resulting galaxy mass from mergers can be expressed by a redshift-dependent decaying power law. The proposed cosmological model modifies the relativistic fractal number counts distribution by including a merger rate evolution that estimates the model's radial density. Numerical solutions for the first-order small-merger-rate approximation (SMRA) are found, and the results show that, for a fractal galaxy distribution having $D=1.5$ in the range $0.1<z<1.0$ and $D=0.5$ for $1<z<6$, as suggested by recent empirical findings, the SMRA allows a consistent description of the model for a merger rate power-law exponent up to $q=0.2$, considering a fractal galaxy distribution starting from the Local Group. Consistent values were also found up to $q=2.5$ and $z=7$ from a scale smaller than the Local Supercluster. These results show that galaxy mergers can be successfully incorporated into the dynamics of a parabolic LTB fractal cosmology.
Submitted 13 April, 2025;
originally announced April 2025.
-
Boosting Relational Deep Learning with Pretrained Tabular Models
Authors:
Veronica Lachi,
Antonio Longa,
Beatrice Bevilacqua,
Bruno Lepri,
Andrea Passerini,
Bruno Ribeiro
Abstract:
Relational databases, organized into tables connected by primary-foreign key relationships, are a common format for organizing data. Making predictions on relational data often involves transforming them into a flat tabular format through table joins and feature engineering, which serve as input to tabular methods. However, designing features that fully capture complex relational patterns remains challenging. Graph Neural Networks (GNNs) offer a compelling alternative by inherently modeling these relationships, but their time overhead during inference limits their applicability for real-time scenarios. In this work, we aim to bridge this gap by leveraging existing feature engineering efforts to enhance the efficiency of GNNs in relational databases. Specifically, we use GNNs to capture complex relationships within relational databases, patterns that are difficult to featurize, while employing engineered features to encode temporal information, thereby avoiding the need to retain the entire historical graph and enabling the use of smaller, more efficient graphs. Our LightRDL approach not only improves efficiency, but also outperforms existing models. Experimental results on the RelBench benchmark demonstrate that our framework achieves up to $33\%$ performance improvement and a $526\times$ inference speedup compared to GNNs, making it highly suitable for real-time inference.
Submitted 7 April, 2025;
originally announced April 2025.
-
Integrating Personality into Digital Humans: A Review of LLM-Driven Approaches for Virtual Reality
Authors:
Iago Alves Brito,
Julia Soares Dollis,
Fernanda Bufon Färber,
Pedro Schindler Freire Brasil Ribeiro,
Rafael Teixeira Sousa,
Arlindo Rodrigues Galvão Filho
Abstract:
The integration of large language models (LLMs) into virtual reality (VR) environments has opened new pathways for creating more immersive and interactive digital humans. By leveraging the generative capabilities of LLMs alongside multimodal outputs such as facial expressions and gestures, virtual agents can simulate human-like personalities and emotions, fostering richer and more engaging user experiences. This paper provides a comprehensive review of methods for enabling digital humans to adopt nuanced personality traits, exploring approaches such as zero-shot, few-shot, and fine-tuning. Additionally, it highlights the challenges of integrating LLM-driven personality traits into VR, including computational demands, latency issues, and the lack of standardized evaluation frameworks for multimodal interactions. By addressing these gaps, this work lays a foundation for advancing applications in education, therapy, and gaming, while fostering interdisciplinary collaboration to redefine human-computer interaction in VR.
Submitted 21 February, 2025;
originally announced March 2025.
-
Unveiling the Dynamics in Galaxy Clusters: The Hidden Role of Low-Luminosity Galaxies in Coma
Authors:
Alisson P. Costa,
Andre L. B. Ribeiro,
Flavio R. de M. Neto,
Juarez dos S. Junior
Abstract:
In this work, we study the Coma cluster, one of the richest and most well-known systems at low redshifts, to explore the importance of low-flux objects in the identification of cluster substructures. In addition, we conduct a study of the infall flow around Coma, considering the presence or absence of low-flux objects across the projected phase space of the cluster. Our results indicate that low-luminosity galaxies play a fundamental role in understanding the dynamical state of galaxy clusters. These galaxies, often overlooked because of their faint nature, serve as sensitive tracers of substructure dynamics and provide crucial insights into the cluster's evolutionary history. We show that not considering the low-flux objects present in clusters can lead to significant underestimates of the number of substructures in the most central parts, in the infall regions, and beyond, connecting to the large-scale structure up to a distance of approximately $8 R_{200}$ from the center of Coma.
Submitted 7 March, 2025; v1 submitted 5 March, 2025;
originally announced March 2025.
-
TRIX: A More Expressive Model for Zero-shot Domain Transfer in Knowledge Graphs
Authors:
Yucheng Zhang,
Beatrice Bevilacqua,
Mikhail Galkin,
Bruno Ribeiro
Abstract:
Fully inductive knowledge graph models can be trained on multiple domains and subsequently perform zero-shot knowledge graph completion (KGC) in new unseen domains. This is an important capability towards the goal of having foundation models for knowledge graphs. In this work, we introduce a more expressive and capable fully inductive model, dubbed TRIX, which not only yields strictly more expressive triplet embeddings (head entity, relation, tail entity) compared to state-of-the-art methods, but also introduces a new capability: directly handling both entity and relation prediction tasks in inductive settings. Empirically, we show that TRIX outperforms the state-of-the-art fully inductive models in zero-shot entity and relation predictions in new domains, and outperforms large-context LLMs in out-of-domain predictions. The source code is available at https://github.com/yuchengz99/TRIX.
Submitted 26 February, 2025;
originally announced February 2025.
-
EdgeEar: Efficient and Accurate Ear Recognition for Edge Devices
Authors:
Camile Lendering,
Bernardo Perrone Ribeiro,
Žiga Emeršič,
Peter Peer
Abstract:
Ear recognition is a contactless and unobtrusive biometric technique with applications across various domains. However, deploying high-performing ear recognition models on resource-constrained devices is challenging, limiting their applicability and widespread adoption. This paper introduces EdgeEar, a lightweight model based on a proposed hybrid CNN-transformer architecture to solve this problem. By incorporating low-rank approximations into specific linear layers, EdgeEar reduces its parameter count by a factor of 50 compared to the current state-of-the-art, bringing it below two million while maintaining competitive accuracy. Evaluation on the Unconstrained Ear Recognition Challenge (UERC2023) benchmark shows that EdgeEar achieves the lowest EER while significantly reducing computational costs. These findings demonstrate the feasibility of efficient and accurate ear recognition, which we believe will contribute to the wider adoption of ear biometrics.
Submitted 11 February, 2025;
originally announced February 2025.
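The EdgeEar abstract above attributes its parameter savings to low-rank approximations in specific linear layers. The arithmetic behind such a saving can be sketched as below: a dense $d_\text{in} \times d_\text{out}$ weight is replaced by two factors of rank $r$. The layer sizes and rank are hypothetical values picked for illustration and are not taken from the paper.

```python
def dense_params(d_in: int, d_out: int) -> int:
    """Weights in a full d_in x d_out linear layer (bias ignored)."""
    return d_in * d_out


def low_rank_params(d_in: int, d_out: int, rank: int) -> int:
    """Weights when W is factorised as (d_in x rank) @ (rank x d_out)."""
    return rank * (d_in + d_out)


# ASSUMPTION: illustrative layer shape and rank, not EdgeEar's actual settings.
d_in, d_out, rank = 512, 512, 16

full = dense_params(d_in, d_out)               # 262144
factored = low_rank_params(d_in, d_out, rank)  # 16384
print(f"dense: {full}, low-rank: {factored}, reduction: {full / factored:.1f}x")
```

The factorisation only saves parameters when the rank is smaller than $d_\text{in} d_\text{out} / (d_\text{in} + d_\text{out})$, which is 256 for the shape used here.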
-
DiTASK: Multi-Task Fine-Tuning with Diffeomorphic Transformations
Authors:
Krishna Sri Ipsit Mantri,
Carola-Bibiane Schönlieb,
Bruno Ribeiro,
Chaim Baskin,
Moshe Eliasof
Abstract:
Pre-trained Vision Transformers now serve as powerful tools for computer vision. Yet, efficiently adapting them for multiple tasks remains a challenge that arises from the need to modify the rich hidden representations encoded by the learned weight matrices, without inducing interference between tasks. Current parameter-efficient methods like LoRA, which apply low-rank updates, force tasks to compete within constrained subspaces, ultimately degrading performance. We introduce DiTASK, a novel Diffeomorphic Multi-Task Fine-Tuning approach that maintains pre-trained representations by preserving weight matrix singular vectors, while enabling task-specific adaptations through neural diffeomorphic transformations of the singular values. By following this approach, DiTASK enables both shared and task-specific feature modulations with minimal added parameters. Our theoretical analysis shows that DiTASK achieves full-rank updates during optimization, preserving the geometric structure of pre-trained features, and establishing a new paradigm for efficient multi-task learning (MTL). Our experiments on PASCAL MTL and NYUD show that DiTASK achieves state-of-the-art performance across four dense prediction tasks, using 75% fewer parameters than existing methods. Our code is available at https://github.com/ipsitmantri/DiTASK.
Submitted 1 June, 2025; v1 submitted 9 February, 2025;
originally announced February 2025.
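The mechanism named in the DiTASK abstract above, keeping the singular vectors of a pre-trained weight matrix fixed while transforming only its singular values, can be illustrated with a small NumPy sketch. The affine map used here is a stand-in: it is a simple strictly increasing (hence invertible) map, not the neural diffeomorphic transformation of the paper, and the function name and parameter values are hypothetical.

```python
import numpy as np


def modulate_singular_values(w: np.ndarray, scale: float, shift: float):
    """Return U diag(scale * s + shift) V^T together with old and new spectra."""
    u, s, vt = np.linalg.svd(w, full_matrices=False)
    s_new = scale * s + shift  # ASSUMPTION: affine stand-in, increasing for scale > 0
    return u @ np.diag(s_new) @ vt, s, s_new


rng = np.random.default_rng(0)
w = rng.normal(size=(8, 5))  # toy "pre-trained" weight matrix
w_new, s_old, s_new = modulate_singular_values(w, scale=1.2, shift=0.1)

# Only two scalars were introduced, yet the update touches every singular direction.
update_rank = np.linalg.matrix_rank(w_new - w)
print("rank of the update:", update_rank)
```

Because every singular value changes, the difference `w_new - w` has the full rank of the matrix, in contrast to an additive low-rank update whose rank is fixed in advance.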
-
A Cloud-native Agile approach to cyber platform prototyping and integration for astronomy: the ENGAGE SKA case
Authors:
Domingos Barbosa,
Diogo Regateiro,
João Paulo Barraca,
Dzianis Bartashevich,
Marco Bartolini,
Matteo di Carlo,
Piers Harding,
Dalmiro Maia,
Bruno Morgado,
Domingos Nunes,
Bruno Ribeiro,
Bruno Coelho,
Valério Ribeiro,
Allan K. de Almeida Jr,
Timothée Vaillant,
Uğur Yilmaz
Abstract:
The Square Kilometre Array (SKA) Observatory is gearing up for the formal construction of its two radio interferometers in Australia and South Africa after the end of the design and pre-construction phases. Agile methodologies, Cloud-native computing technologies, and DevOps software ideas are influencing the design of compute infrastructures that will be key to reducing the operational costs of SKA while improving the control and monitoring of the SKA antennas and ancillary systems, Correlators, HPC facilities, or related data centre tiered systems. These tools will likely include advanced power metering technologies, efficient distribution automation, and Network Operation Centres (NOC). SKA will become the world's largest radio telescope and is expected to achieve its first science by 2026. To cope with this dimension and complexity, a key part of this distributed Observatory is the overall software control and monitoring system embodied in the Observatory Management and Control (OMC) and the Services Teams, which requires specialized Agile Teams to assist in building software and cyber infrastructure using an Agile development environment that includes test automation, Continuous Integration, and Continuous Deployment. To manage such a large and distributed machine, the Agile approach was adopted for the core software package of the SKA Telescope, aimed at scheduling observations, controlling their execution, monitoring the telescope status, and ensuring scalability and reliability. Here, we report on the ENGAGE SKA cyberinfrastructure prototyping support to the SKA Agile Software Development Life Cycle (SDLC).
Submitted 6 February, 2025;
originally announced February 2025.
-
Offshore Wind Turbine Tower Design and Optimization: A Review and AI-Driven Future Directions
Authors:
João Alves Ribeiro,
Bruno Alves Ribeiro,
Francisco Pimenta,
Sérgio M. O. Tavares,
Jie Zhang,
Faez Ahmed
Abstract:
Offshore wind energy leverages the high intensity and consistency of oceanic winds, playing a key role in the transition to renewable energy. As energy demands grow, larger turbines are required to optimize power generation and reduce the Levelized Cost of Energy (LCoE), which represents the average cost of electricity over a project's lifetime. However, upscaling turbines introduces engineering challenges, particularly in the design of supporting structures, especially towers. These towers must support increased loads while maintaining structural integrity, cost-efficiency, and transportability, making them essential to offshore wind projects' success. This paper presents a comprehensive review of the latest advancements, challenges, and future directions driven by Artificial Intelligence (AI) in the design optimization of Offshore Wind Turbine (OWT) structures, with a focus on towers. It provides an in-depth background on key areas such as design types, load types, analysis methods, design processes, monitoring systems, Digital Twin (DT), software, standards, reference turbines, economic factors, and optimization techniques. Additionally, it includes a state-of-the-art review of optimization studies related to tower design, presenting a detailed examination of turbine, software, loads, optimization method, design variables and constraints, analysis, and findings, motivating future research to refine design approaches for effective turbine upscaling and improved efficiency. Lastly, the paper explores future directions where AI can revolutionize tower design optimization, enabling the development of efficient, scalable, and sustainable structures. By addressing the upscaling challenges and supporting the growth of renewable energy, this work contributes to shaping the future of offshore wind turbine towers and other supporting structures.
Submitted 28 December, 2024;
originally announced February 2025.
-
On the Effectiveness of Random Weights in Graph Neural Networks
Authors:
Thu Bui,
Carola-Bibiane Schönlieb,
Bruno Ribeiro,
Beatrice Bevilacqua,
Moshe Eliasof
Abstract:
Graph Neural Networks (GNNs) have achieved remarkable success across diverse tasks on graph-structured data, primarily through the use of learned weights in message passing layers. In this paper, we demonstrate that random weights can be surprisingly effective, achieving performance comparable to end-to-end training counterparts, across various tasks and datasets. Specifically, we show that by replacing learnable weights with random weights, GNNs can retain strong predictive power, while significantly reducing training time by up to 6$\times$ and memory usage by up to 3$\times$. Moreover, the random weights combined with our construction yield random graph propagation operators, which we show to reduce the problem of feature rank collapse in GNNs. These understandings and empirical results highlight random weights as a lightweight and efficient alternative, offering a compelling perspective on the design and training of GNN architectures.
Submitted 31 January, 2025;
originally announced February 2025.
-
CENSOR: Defense Against Gradient Inversion via Orthogonal Subspace Bayesian Sampling
Authors:
Kaiyuan Zhang,
Siyuan Cheng,
Guangyu Shen,
Bruno Ribeiro,
Shengwei An,
Pin-Yu Chen,
Xiangyu Zhang,
Ninghui Li
Abstract:
Federated learning collaboratively trains a neural network on a global server, where each local client receives the current global model weights and sends back parameter updates (gradients) based on its local private data. The process of sending these model updates may leak a client's private data. Existing gradient inversion attacks can exploit this vulnerability to recover private training instances from a client's gradient vectors. Recently, researchers have proposed advanced gradient inversion techniques that existing defenses struggle to handle effectively. In this work, we present a novel defense tailored for large neural network models. Our defense capitalizes on the high dimensionality of the model parameters to perturb gradients within a subspace orthogonal to the original gradient. By leveraging cold posteriors over orthogonal subspaces, our defense implements a refined gradient update mechanism. This enables the selection of an optimal gradient that not only safeguards against gradient inversion attacks but also maintains model utility. We conduct comprehensive experiments across three different datasets and evaluate our defense against various state-of-the-art attacks and defenses. Code is available at https://censor-gradient.github.io.
Submitted 26 January, 2025;
originally announced January 2025.
-
Scalable Out-of-distribution Robustness in the Presence of Unobserved Confounders
Authors:
Parjanya Prashant,
Seyedeh Baharan Khatami,
Bruno Ribeiro,
Babak Salimi
Abstract:
We consider the task of out-of-distribution (OOD) generalization, where the distribution shift is due to an unobserved confounder ($Z$) affecting both the covariates ($X$) and the labels ($Y$). In this setting, traditional assumptions of covariate and label shift are unsuitable due to the confounding, which introduces heterogeneity in the predictor, i.e., $\hat{Y} = f_Z(X)$. OOD generalization differs from traditional domain adaptation by not assuming access to the covariate distribution ($X^\text{te}$) of the test samples during training. These conditions create a challenging scenario for OOD robustness: (a) $Z^\text{tr}$ is an unobserved confounder during training, (b) $P^\text{te}(Z) \neq P^\text{tr}(Z)$, (c) $X^\text{te}$ is unavailable during training, and (d) the posterior predictive distribution depends on $P^\text{te}(Z)$, i.e., $\hat{Y} = E_{P^\text{te}(Z)}[f_Z(X)]$. In general, accurate predictions are unattainable in this scenario, and existing literature has proposed complex predictors based on identifiability assumptions that require multiple additional variables. Our work investigates a set of identifiability assumptions that tremendously simplify the predictor, whose resulting elegant simplicity outperforms existing approaches.
Submitted 29 November, 2024;
originally announced November 2024.
-
Vertical Validation: Evaluating Implicit Generative Models for Graphs on Thin Support Regions
Authors:
Mai Elkady,
Thu Bui,
Bruno Ribeiro,
David I. Inouye
Abstract:
There has been a growing excitement that implicit graph generative models could be used to design or discover new molecules for medicine or material design. Because these molecules have not been discovered, they naturally lie in unexplored or scarcely supported regions of the distribution of known molecules. However, prior evaluation methods for implicit graph generative models have focused on validating statistics computed from the thick support (e.g., mean and variance of a graph property). Therefore, there is a mismatch between the goal of generating novel graphs and the evaluation methods. To address this evaluation gap, we design a novel evaluation method called Vertical Validation (VV) that systematically creates thin support regions during the train-test splitting procedure and then reweights generated samples so that they can be compared to the held-out test data. This procedure can be seen as a generalization of the standard train-test procedure except that the splits are dependent on sample features. We demonstrate that our method can be used to perform model selection if performance on thin support regions is the desired goal. As a side benefit, we also show that our approach can better detect overfitting as exemplified by memorization.
Submitted 20 November, 2024;
originally announced November 2024.
-
Intermittency of a transitional airfoil flow with laminar separation bubble solved by the lattice-Boltzmann method
Authors:
Bernardo Luiz Ribeiro,
Cayan Dantas,
William Wolf
Abstract:
The flow over a NACA0012 airfoil at a moderate Reynolds number Re = 50,000 and angle of attack of alpha = 3 degrees is investigated using the lattice-Boltzmann method (LBM). The LBM solutions are computed in direct numerical simulation (DNS) mode, i.e., without a wall model. A validation is performed against a Navier-Stokes wall-resolved large eddy simulation, and good agreement is achieved between the different approaches, showing that the LBM can provide accurate solutions of boundary layers under a transitional regime, but with a significant computational cost reduction. A laminar separation bubble (LSB) forms over the suction side of the airfoil, leading to intermittent vortex shedding that impacts transition to turbulence and the generation of strong spanwise-coherent vortices. Different shedding patterns are observed, including the advection of single vortical structures and the pairing of two vortices, which may or may not break into finer turbulent scales. Such flow features are characterized by 2D and 3D events that directly impact the sound generation by the trailing edge. Frequency and amplitude modulations from the LSB lead to a noise spectrum with a main tone plus equidistant secondary tones, and a time-frequency analysis shows that the main tones may switch frequencies due to intermittency. This research advances the understanding of LSB behavior in transitional airfoil flows, which impacts the performance and noise generation of blades and propellers.
△ Less
Submitted 7 November, 2024;
originally announced November 2024.
-
Boosting the evolutionary picture of Cl 0024+17 and MS 0451-03: A case study at intermediate-redshift
Authors:
A. P. Costa,
A. L. B. Ribeiro,
R. R. de Carvalho,
J. A. Benevides
Abstract:
In this work we improve the dynamic-evolutionary framework of two massive clusters at intermediate redshifts: Cl 0024+17 at $z \sim 0.4$ and MS 0451-03 at $z \sim 0.5$. The spectroscopic galaxy members were selected from Moran et al. (2007a), which combines optical and UV imaging with spectroscopy. Using a set of dynamic estimators with different approaches, our results show that both Cl 0024+17 and MS 0451-03 are non-relaxed systems with distinct dynamical configurations. Cl 0024+17 exhibits disturbed kinematics, displaying significant gaps and a velocity dispersion profile suggesting a merger. This is confirmed by the presence of previously reported substructures and new ones identified in this study. MS 0451-03 appears less disturbed than Cl 0024+17, as indicated by the significant segregation between late and early-type galaxies, with the latter occupying more central regions of the projected phase-space. However, five previously unobserved substructures and non-Gaussianity in the velocity distribution indicate that MS 0451-03 is also out of equilibrium. In both clusters, there are substructures infalling onto the systems, indicating key moments in their assembly histories and potential effects on the pre-processing of galaxies within these subgroups. This is suggested by the high percentage of early-type galaxies outside $R_{200}$ (approximately $83\%$) in the case of Cl 0024+17. This work reinforces the importance of more detailed dynamical analysis of clusters to better characterize their evolutionary picture.
Submitted 13 November, 2024; v1 submitted 24 October, 2024;
originally announced October 2024.
-
Lorentzian correction for the evolution of the CMB temperature
Authors:
Artur Novais,
A. L. B. Ribeiro
Abstract:
Observational evidence consistently shows that the universe is spatially flat and undergoes Lorentzian time dilation as a function of redshift. In combination, such discoveries suggest that a Minkowskian description of cosmology might be technically viable. The thermal evolution that transpires in a conformal spacetime is herein derived. The description is constrained by the energy conservation of a unified cosmic fluid. The resulting model puts forth a Lorentzian correction for the temperature of the CMB as a function of redshift, which improves current data fitting without adding any free parameter. Furthermore, it sheds light upon the early galaxy formation problem: our model predicts up to 0.86 Gyr older objects within the first two billion years of the structure evolution in the universe.
Submitted 9 October, 2024;
originally announced October 2024.
-
Generalized inflation in the context of $κ$-deformed theories
Authors:
B. W. Ribeiro,
I. M. Macêdo,
F. C. Carvalho
Abstract:
A new inflationary scenario driven by a slowly-rolling homogeneous scalar field whose potential $V\left(\varphi\right)$ is given by a generalized exponential function is investigated. Within the {\it slow-roll} approximation we obtain the main predictions of the model and compare them with current data from cosmic microwave background and large-scale structure observations. We show that this single scalar field model admits a wider set of solutions than usual exponential scenarios and predicts acceptable values of the spectral index, running of the spectral index and tensor-to-scalar ratio for the remaining number of {\it e}-folds lying in the interval $N = 55 \pm 5$ and an energy scale on which $λ\geq \sqrt{2}$; in particular, we observe that the value of the model parameter $κ$ depends on the analysis. Finally, the primordial local non-Gaussianity is briefly discussed where we conclude that $k\gtrsim 0.02$ for $f_\text{NL}^\text{local} \ll 1$.
Submitted 16 September, 2024; v1 submitted 11 September, 2024;
originally announced September 2024.
-
Results from ON-OFF analysis of the Neutrinos-Angra detector
Authors:
E. Kemp,
W. V. Santos,
J. C. Anjos,
P. Chimenti,
L. F. G. Gonzalez,
G. P. Guedes,
H. P. Lima Jr.,
R. A. Nóbrega,
I. M. Pepe,
D. B. S. Ribeiro
Abstract:
The Neutrinos Angra Experiment, a water-based Cherenkov detector, is located at the Angra dos Reis nuclear power plant in Brazil. Designed to detect electron antineutrinos produced in the nuclear reactor, the primary objective of the experiment is to demonstrate the feasibility of monitoring reactor activity using an antineutrino detector. This effort aligns with the International Atomic Energy Agency (IAEA) program to identify potential and novel technologies applicable to nonproliferation safeguards. Operating on the surface presents challenges such as high noise rates, necessitating the development of very sensitive, yet small-scale detectors. These conditions make the Angra experiment an excellent platform for both developing the application and gaining expertise in new technologies and analysis methods. The detector employs a water-based target doped with gadolinium to enhance its sensitivity to antineutrinos. In this work, we describe the main features of the detector and the electronics chain, including front-end and data acquisition components. We detail the data acquisition strategies and the methodologies applied for signal processing and event selection. Preliminary physics results suggest that the detector can reliably monitor reactor operations by detecting the inverse beta decay induced by electron antineutrinos from the reactor.
Submitted 7 August, 2024; v1 submitted 29 July, 2024;
originally announced July 2024.
-
DiGRAF: Diffeomorphic Graph-Adaptive Activation Function
Authors:
Krishna Sri Ipsit Mantri,
Xinzhi Wang,
Carola-Bibiane Schönlieb,
Bruno Ribeiro,
Beatrice Bevilacqua,
Moshe Eliasof
Abstract:
In this paper, we propose a novel activation function tailored specifically for graph data in Graph Neural Networks (GNNs). Motivated by the need for graph-adaptive and flexible activation functions, we introduce DiGRAF, leveraging Continuous Piecewise-Affine Based (CPAB) transformations, which we augment with an additional GNN to learn a graph-adaptive diffeomorphic activation function in an end-to-end manner. In addition to its graph-adaptivity and flexibility, DiGRAF also possesses properties that are widely recognized as desirable for activation functions, such as differentiability, boundedness within the domain, and computational efficiency. We conduct an extensive set of experiments across diverse datasets and tasks, demonstrating a consistent and superior performance of DiGRAF compared to traditional and graph-specific activation functions, highlighting its effectiveness as an activation function for GNNs. Our code is available at https://github.com/ipsitmantri/DiGRAF.
Submitted 30 October, 2024; v1 submitted 2 July, 2024;
originally announced July 2024.
-
The Significance of Latent Data Divergence in Predicting System Degradation
Authors:
Miguel Fernandes,
Catarina Silva,
Alberto Cardoso,
Bernardete Ribeiro
Abstract:
Condition-Based Maintenance is pivotal in enabling the early detection of potential failures in engineering systems, where precise prediction of the Remaining Useful Life is essential for effective maintenance and operation. However, a predominant focus in the field centers on predicting the Remaining Useful Life using unprocessed or minimally processed data, frequently neglecting the intricate dynamics inherent in the dataset. In this work we introduce a novel methodology grounded in the analysis of statistical similarity within latent data from system components. Leveraging a specifically designed architecture based on a Vector Quantized Variational Autoencoder, we create a sequence of discrete vectors which is used to estimate system-specific priors. We infer the similarity between systems by evaluating the divergence of these priors, offering a nuanced understanding of individual system behaviors. The efficacy of our approach is demonstrated through experiments on the NASA commercial modular aero-propulsion system simulation (C-MAPSS) dataset. Our validation not only underscores the potential of our method in advancing the study of latent statistical divergence but also demonstrates its superiority over existing techniques.
Submitted 13 June, 2024;
originally announced June 2024.
-
Unlocking the Potential of Large Language Models for Clinical Text Anonymization: A Comparative Study
Authors:
David Pissarra,
Isabel Curioso,
João Alveira,
Duarte Pereira,
Bruno Ribeiro,
Tomás Souper,
Vasco Gomes,
André V. Carreiro,
Vitor Rolla
Abstract:
Automated clinical text anonymization has the potential to unlock the widespread sharing of textual health data for secondary usage while assuring patient privacy and safety. Despite the proposal of many complex and theoretically successful anonymization solutions in the literature, these techniques remain flawed. As such, clinical institutions are still reluctant to apply them for open access to their data. Recent advances in developing Large Language Models (LLMs) pose a promising opportunity to further the field, given their capability to perform various tasks. This paper proposes six new evaluation metrics tailored to the challenges of generative anonymization with LLMs. Moreover, we present a comparative study of LLM-based methods, testing them against two baseline techniques. Our results establish LLM-based models as a reliable alternative to common approaches, paving the way toward trustworthy anonymization of clinical text.
Submitted 29 May, 2024;
originally announced June 2024.
-
Prediction of soil fertility parameters using USB-microscope imagery and portable X-ray fluorescence spectrometry
Authors:
Shubhadip Dasgupta,
Satwik Pate,
Divya Rathore,
L. G. Divyanth,
Ayan Das,
Anshuman Nayak,
Subhadip Dey,
Asim Biswas,
David C. Weindorf,
Bin Li,
Sergio Henrique Godinho Silva,
Bruno Teixeira Ribeiro,
Sanjay Srivastava,
Somsubhra Chakraborty
Abstract:
This study investigated the use of portable X-ray fluorescence (PXRF) spectrometry and soil image analysis for rapid soil fertility assessment, with a focus on key indicators such as available boron (B), organic carbon (OC), available manganese (Mn), available sulfur (S), and the sulfur availability index (SAI). A total of 1,133 soil samples from diverse agro-climatic zones in Eastern India were analyzed. The research integrated color and texture features from microscopic soil images, PXRF data, and auxiliary soil variables (AVs) using a Random Forest model. Results showed that combining image features (IFs) with AVs significantly improved prediction accuracy for available B (R2 = 0.80) and OC (R2 = 0.88). A data fusion approach, incorporating IFs, AVs, and PXRF data, further enhanced predictions for available Mn and SAI, with R2 values of 0.72 and 0.70, respectively. The study highlights the potential of integrating these technologies to offer rapid, cost-effective soil testing methods, paving the way for more advanced predictive models and a deeper understanding of soil fertility. Future work should explore the application of deep learning models on a larger dataset, incorporating soils from a wider range of agro-climatic zones under field conditions.
Submitted 5 September, 2024; v1 submitted 17 April, 2024;
originally announced April 2024.
-
A Foundation Model for Zero-shot Logical Query Reasoning
Authors:
Mikhail Galkin,
Jincheng Zhou,
Bruno Ribeiro,
Jian Tang,
Zhaocheng Zhu
Abstract:
Complex logical query answering (CLQA) in knowledge graphs (KGs) goes beyond simple KG completion and aims at answering compositional queries comprised of multiple projections and logical operations. Existing CLQA methods that learn parameters bound to certain entity or relation vocabularies can only be applied to the graph they are trained on, which requires substantial training time before being deployed on a new graph. Here we present UltraQuery, the first foundation model for inductive reasoning that can zero-shot answer logical queries on any KG. The core idea of UltraQuery is to derive both projections and logical operations as vocabulary-independent functions which generalize to new entities and relations in any KG. With the projection operation initialized from a pre-trained inductive KG reasoning model, UltraQuery can solve CLQA on any KG after finetuning on a single dataset. Experimenting on 23 datasets, UltraQuery in the zero-shot inference mode shows competitive or better query answering performance than best available baselines and sets a new state of the art on 15 of them.
Submitted 1 October, 2024; v1 submitted 10 April, 2024;
originally announced April 2024.
-
Hybridization induced triplet superconductivity with $S^z=0$
Authors:
Edine Silva,
R. C. Bento Ribeiro,
Heron Caldas,
Mucio A. Continentino
Abstract:
The Kitaev superconducting chain is a model of spinless fermions with triplet-like superconductivity. It has raised interest since for some values of its parameters it presents a non-trivial topological phase that hosts Majorana fermions. The physical realization of a Kitaev chain is complicated by the scarcity of triplet superconductivity in real physical systems. Many proposals have been put forward to overcome this difficulty and fabricate artificial triplet superconducting chains. In this work we study a superconducting chain of spinful fermions forming Cooper pairs, in a triplet $S=1$ state, but with $S^z=0$. The motivation is that such pairing can be induced in chains that couple through an antisymmetric hybridization to an s-wave superconducting substrate. We study the nature of edge states and the topological properties of these chains. In the presence of a magnetic field the chain can sustain gapless superconductivity with pairs of Fermi points. The momentum space topology of these Fermi points is non-trivial, in the sense that they can only disappear by annihilating each other. For small magnetic fields, we find well defined degenerate edge modes with finite Zeeman energy. These modes are not symmetry protected and decay abruptly in the bulk as their energy merges with the continuum of excitations.
Submitted 20 March, 2024;
originally announced March 2024.
-
The Kormendy relation of cluster galaxies in PPS regions
Authors:
André L. B. Ribeiro,
Paulo A. A. Lopes,
Dailer F. Morell,
Christine C. Dantas,
Monyke H. S. Fonseca,
Beatriz G. Amarante,
Flávio R. Morais-Neto
Abstract:
We study a sample of 936 early-type galaxies located in 48 low-z regular galaxy clusters with $M_{200}\geq 10^{14}~ M_\odot$ at $z< 0.1$. We examine variations in the Kormendy relation (KR) according to their location in the projected phase space (PPS) of the clusters. We have used a combination of Bayesian statistical methods to identify possible differences between the fitted relations. Our results indicate that the overall KR is better fitted when we take into account the information about PPS regions. We also find that objects with time since infall $\geq 6.5$ Gyr have a significant statistical difference of the KR coefficients relative to objects that are more recent in the cluster environment. We show that giant central ellipticals are responsible for tilting the KR relation towards smaller slopes. These galaxies present a late growth probably due to cumulative preprocessing during infall, plus cannibalism and accretion of smaller stripped objects near the center of the clusters.
Submitted 15 February, 2024;
originally announced February 2024.
-
GraphMETRO: Mitigating Complex Graph Distribution Shifts via Mixture of Aligned Experts
Authors:
Shirley Wu,
Kaidi Cao,
Bruno Ribeiro,
James Zou,
Jure Leskovec
Abstract:
Graph data are inherently complex and heterogeneous, leading to a high natural diversity of distributional shifts. However, it remains unclear how to build machine learning architectures that generalize to the complex distributional shifts naturally occurring in the real world. Here, we develop GraphMETRO, a Graph Neural Network architecture that models natural diversity and captures complex distributional shifts. GraphMETRO employs a Mixture-of-Experts (MoE) architecture with a gating model and multiple expert models, where each expert model targets a specific distributional shift to produce a referential representation w.r.t. a reference model, and the gating model identifies shift components. Additionally, we design a novel objective that aligns the representations from different expert models to ensure reliable optimization. GraphMETRO achieves state-of-the-art results on four datasets from the GOOD benchmark, which is comprised of complex and natural real-world distribution shifts, improving by 67% and 4.2% on the WebKB and Twitch datasets. Code and data are available at https://github.com/Wuyxin/GraphMETRO.
Submitted 28 October, 2024; v1 submitted 7 December, 2023;
originally announced December 2023.
-
MIST: Defending Against Membership Inference Attacks Through Membership-Invariant Subspace Training
Authors:
Jiacheng Li,
Ninghui Li,
Bruno Ribeiro
Abstract:
In Membership Inference (MI) attacks, the adversary tries to determine whether an instance is used to train a machine learning (ML) model. MI attacks are a major privacy concern when using private data to train ML models. Most MI attacks in the literature take advantage of the fact that ML models are trained to fit the training data well, and thus have very low loss on training instances. Most defenses against MI attacks therefore try to make the model fit the training data less well. Doing so, however, generally results in lower accuracy. We observe that training instances have different degrees of vulnerability to MI attacks. Most instances will have low loss even when not included in training. For these instances, the model can fit them well without concerns of MI attacks. An effective defense only needs to (possibly implicitly) identify instances that are vulnerable to MI attacks and avoid overfitting them. A major challenge is how to achieve such an effect in an efficient training process. Leveraging two distinct recent advancements in representation learning, counterfactually-invariant representations and subspace learning methods, we introduce a novel Membership-Invariant Subspace Training (MIST) method to defend against MI attacks. MIST avoids overfitting the vulnerable instances without significant impact on other instances. We have conducted extensive experimental studies, comparing MIST with various other state-of-the-art (SOTA) MI defenses against several SOTA MI attacks. We find that MIST outperforms other defenses while resulting in minimal reduction in testing accuracy.
Submitted 29 May, 2024; v1 submitted 1 November, 2023;
originally announced November 2023.
-
Efficient Subgraph GNNs by Learning Effective Selection Policies
Authors:
Beatrice Bevilacqua,
Moshe Eliasof,
Eli Meirom,
Bruno Ribeiro,
Haggai Maron
Abstract:
Subgraph GNNs are provably expressive neural architectures that learn graph representations from sets of subgraphs. Unfortunately, their applicability is hampered by the computational complexity associated with performing message passing on many subgraphs. In this paper, we consider the problem of learning to select a small subset of the large set of possible subgraphs in a data-driven fashion. We first motivate the problem by proving that there are families of WL-indistinguishable graphs for which there exist efficient subgraph selection policies: small subsets of subgraphs that can already identify all the graphs within the family. We then propose a new approach, called Policy-Learn, that learns how to select subgraphs in an iterative manner. We prove that, unlike popular random policies and prior work addressing the same problem, our architecture is able to learn the efficient policies mentioned above. Our experimental results demonstrate that Policy-Learn outperforms existing baselines across a wide range of datasets.
Submitted 20 March, 2024; v1 submitted 30 October, 2023;
originally announced October 2023.
-
The Role of Groups in Galaxy Evolution: compelling evidence of pre-processing out to the turnaround radius of clusters
Authors:
P. A. A. Lopes,
A. L. B. Ribeiro,
D. Brambila
Abstract:
We present clear and direct evidence of the pre-processing effect of group galaxies falling into clusters in the local Universe ($z \lesssim 0.1$). We start with a sample of 238 clusters, from which we select 153 with N$_{200} \ge$ 20. We considered 1641 groups within the turnaround radius ($\sim$ 5$\times$R$_{200}$) of these 153 clusters. There are 6654 {\it individual cluster galaxies} and 4133 {\it group galaxies} within this radius. We considered two control samples of galaxies, in isolated groups and in the field. The first comprises 2601 galaxies within 1606 {\it isolated groups}, and the latter has 4273 field objects. The fraction of star forming galaxies in infalling groups has a distinct clustercentric behavior in comparison to the remaining cluster galaxies. Even at $5 \times $R$_{200}$ the {\it group galaxies} already show a reduced fraction of star forming objects. At this radius, the results for the {\it individual cluster galaxies} are actually compatible with the field. That is strong evidence that the group environment is effective at quenching star formation prior to the cluster arrival. The group star forming fraction remains roughly constant inwards, decreasing significantly only within the cluster R$_{200}$ radius. We have also found that the pre-processing effect depends on the group mass (indicated by the number of members). The effect is larger for more massive groups. However, it is significant even for pairs and triplets. Finally, we find evidence that the time scale required for morphological transformation is larger than the one for quenching.
Submitted 20 September, 2023;
originally announced September 2023.
-
A Multi-Task Perspective for Link Prediction with New Relation Types and Nodes
Authors:
Jincheng Zhou,
Beatrice Bevilacqua,
Bruno Ribeiro
Abstract:
The task of inductive link prediction in (discrete) attributed multigraphs infers missing attributed links (relations) between nodes in new test multigraphs. Traditional relational learning methods face the challenge of limited generalization to test multigraphs containing both novel nodes and novel relation types not seen in training. Recently, under the only assumption that all relation types share the same structural predictive patterns (single task), Gao et al. (2023) proposed a link prediction method using the theoretical concept of double equivariance (equivariance for nodes & relation types), in contrast to the (single) equivariance (only for nodes) used to design Graph Neural Networks (GNNs). In this work we further extend the double equivariance concept to multi-task double equivariance, where we define link prediction in attributed multigraphs that can have distinct and potentially conflicting predictive patterns for different sets of relation types (multiple tasks). Our empirical results on real-world datasets demonstrate that our approach can effectively generalize to test graphs with multi-task structures without access to additional information.
Submitted 4 December, 2023; v1 submitted 12 July, 2023;
originally announced July 2023.
-
Spin-Polarized Majorana Zero Modes in Proximitized Superconducting Penta-Silicene Nanoribbons
Authors:
R. C. Bento Ribeiro,
J. H. Correa,
L. S. Ricco,
I. A. Shelykh,
M. A. Continentino,
A. C. Seridonio,
M. Minissale,
G. L. Lay,
M. S. Figueira
Abstract:
We theoretically investigate the possibility of obtaining Majorana zero modes (MZMs) in penta-silicene nanoribbons (p-SiNRs) with induced \textit{p}-wave superconductivity. The model explicitly considers an external magnetic field perpendicularly applied to the nanoribbon plane, as well as an extrinsic Rashba spin-orbit coupling (RSOC), in addition to the first nearest neighbor hopping term and \textit{p}-wave superconducting pairing. By analyzing the dispersion relation profiles, we observe the successive closing and reopening of the induced superconducting gap with a single spin component, indicating a spin-polarized topological phase transition (TPT). Correspondingly, the plots of the energy spectrum versus the chemical potential reveal the existence of zero-energy states with a preferential spin orientation characterized by nonoverlapping wave functions localized at opposite ends of the superconducting p-SiNRs. These findings strongly suggest the emergence of topologically protected, spin-polarized MZMs at the ends of the p-SiNRs with induced \textit{p}-wave superconducting pairing, which can be realized by proximitizing the nanoribbon with an \textit{s}-wave superconductor, such as lead. The proposal paves the way for silicene-based Majorana devices hosting multiple MZMs with a well-defined spin orientation, with possible applications in fault-tolerant quantum computing platforms and Majorana spintronics.
Submitted 6 July, 2023;
originally announced July 2023.
-
Large-Scale Text Analysis Using Generative Language Models: A Case Study in Discovering Public Value Expressions in AI Patents
Authors:
Sergio Pelaez,
Gaurav Verma,
Barbara Ribeiro,
Philip Shapira
Abstract:
Labeling data is essential for training text classifiers but is often difficult to accomplish accurately, especially for complex and abstract concepts. Seeking an improved method, this paper employs a novel approach using a generative language model (GPT-4) to produce labels and rationales for large-scale text analysis. We apply this approach to the task of discovering public value expressions in US AI patents. We collect a database comprising 154,934 patent documents using an advanced Boolean query submitted to InnovationQ+. The results are merged with full patent text from the USPTO, resulting in 5.4 million sentences. We design a framework for identifying and labeling public value expressions in these AI patent sentences. A prompt for GPT-4 is developed which includes definitions, guidelines, examples, and rationales for text classification. We evaluate the quality of the labels and rationales produced by GPT-4 using BLEU scores and topic modeling and find that they are accurate, diverse, and faithful. These rationales also serve as a chain-of-thought for the model, a transparent mechanism for human verification, and support for human annotators to overcome cognitive limitations. We conclude that GPT-4 achieved a high level of recognition of public value theory from our framework, which it also uses to discover unseen public value expressions. We use the labels produced by GPT-4 to train BERT-based classifiers and predict sentences on the entire database, achieving high F1 scores for the 3-class (0.85) and 2-class classification (0.91) tasks. We discuss the implications of our approach for conducting large-scale text analyses with complex and abstract concepts and suggest that, with careful framework design and interactive human oversight, generative language models can offer significant advantages in quality and in reduced time and costs for producing labels and rationales.
Submitted 18 May, 2023; v1 submitted 17 May, 2023;
originally announced May 2023.
-
Examining transitional galaxies to understand the role of clusters and their dynamical status in galaxy quenching
Authors:
Douglas Brambila,
Paulo A. A. Lopes,
André L. B. Ribeiro,
Arianna Cortesi
Abstract:
In this work, we consider four different galaxy populations and two distinct global environments in the local Universe (z $\leq 0.11$) to investigate the evolution of transitional galaxies (such as star-forming spheroids and passive discs) across different environments. Our sample is composed of 3,899 galaxies within the R$_{200}$ radius of 231 clusters and 11,460 field galaxies. We also investigate the impact of the cluster's dynamic state, as well as the galaxy's location in the projected phase space diagram (PPS). We found that although the cluster environment as a whole influences galaxy evolution, the cluster dynamical state does not. Furthermore, star-forming galaxies represent recent cluster arrivals in comparison to passive galaxies (especially in the case of early-types). Among the ETGs, we find that the D$_n(4000)$ and H$_δ$ parameters indicate a smooth transition between the subpopulations. In particular, for the SF-ETGs, we detect a significant difference between field and cluster galaxies, as a function of stellar mass, for objects with Log $M_*$/M$_{\odot} > 10.5$. Analyzing the color gradient, the results point toward a picture where field galaxies are more likely to follow the monolithic scenario, while the cluster galaxies the hierarchical scenario. In particular, if we split the ETGs into lenticulars and ellipticals, we find that the steeper color gradients are more common for the lenticulars. Finally, our results indicate the need for galaxy pre-processing in smaller groups, before entering clusters.
Submitted 15 May, 2023;
originally announced May 2023.
-
Cosmological constraints on $R^2$-corrected Appleby-Battye model
Authors:
Bruno Ribeiro,
Armando Bernui,
Marcela Campista
Abstract:
Nowadays, efforts are being devoted to the study of alternative cosmological scenarios in which modifications of General Relativity have been proposed to explain the late cosmic acceleration without assuming the existence of dark energy. In this scenario, we investigate the $R^2$-AB model, which consists of an $f(R)$ model with only one extra free parameter, $b$, in addition to the 6 of the flat-$Λ$CDM. Regarding this model, it was already shown that a positive value for $b$ is required for the model to be consistent with Solar System tests; moreover, the condition for the existence of a de Sitter state requires $b \ge 1.6$. To impose observational constraints on the $R^2$-AB model we consider three datasets: 31 $H(z)$ measurements from Cosmic Chronometers (CC), 20 $[{fσ}_{8}](z)$ measurements from Redshift-Space Distortion (RSD), and the most recent type Ia Supernovae (SNe Ia) sample from Pantheon+. Next, we perform two different analyses: one using only SNe Ia data and one using the combined likelihood SNe+CC+RSD. The first one has provided $b=2.28^{+6.52}_{-0.55}$, while the second one $b=2.18^{+5.41}_{-0.55}$. In the first case it was necessary to set the absolute magnitude $M_B = -19.253$ from SH0ES collaboration, while in the second we did a marginalization over the Hubble constant $H_0$ in the normalized growth function. We have also observed that the $H_0-M_B$ degeneracy was broken by adding CC data to the SNe data. Additionally, we perform illustrative analyses that compare this $f(R)$ model with the flat-$Λ$CDM model, considering several values of the parameter $b$, for diverse cosmological functions like the Hubble function $H(z)$, the equation of state $w_{\rm eff}(z)$, the parametrized growth rate of cosmic structures $[fσ_8](z)$, and $σ_8(z)$. We conclude that the model fits the data well, but the parameter $b$ was not unambiguously constrained.
Submitted 12 September, 2024; v1 submitted 10 May, 2023;
originally announced May 2023.
-
Proximal Curriculum for Reinforcement Learning Agents
Authors:
Georgios Tzannetos,
Bárbara Gomes Ribeiro,
Parameswaran Kamalaruban,
Adish Singla
Abstract:
We consider the problem of curriculum design for reinforcement learning (RL) agents in contextual multi-task settings. Existing techniques on automatic curriculum design typically require domain-specific hyperparameter tuning or have limited theoretical underpinnings. To tackle these limitations, we design our curriculum strategy, ProCuRL, inspired by the pedagogical concept of Zone of Proximal Development (ZPD). ProCuRL captures the intuition that learning progress is maximized when picking tasks that are neither too hard nor too easy for the learner. We mathematically derive ProCuRL by analyzing two simple learning settings. We also present a practical variant of ProCuRL that can be directly integrated with deep RL frameworks with minimal hyperparameter tuning. Experimental results on a variety of domains demonstrate the effectiveness of our curriculum strategy over state-of-the-art baselines in accelerating the training process of deep RL agents.
Submitted 25 April, 2023;
originally announced April 2023.
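The intuition stated in the ProCuRL abstract above (pick tasks that are neither too hard nor too easy) can be illustrated with a toy selection rule. This is only a sketch under stated assumptions: the scoring function `p * (1 - p)`, the function name `pick_task`, and the example probabilities are illustrative choices made here, not the strategy derived in the paper.

```python
def pick_task(success_prob):
    """Pick the task with the largest p * (1 - p).

    success_prob maps a task name to the learner's estimated probability
    of success on that task (hypothetical inputs). The score p * (1 - p)
    is an assumed stand-in for "neither too hard nor too easy": it peaks
    at p = 0.5 and vanishes at p = 0 and p = 1.
    """
    return max(success_prob, key=lambda t: success_prob[t] * (1.0 - success_prob[t]))


# Hypothetical success estimates for three tasks.
probs = {"easy": 0.95, "medium": 0.55, "hard": 0.05}
chosen = pick_task(probs)  # "medium": 0.55 * 0.45 beats 0.95 * 0.05
```

Under this assumed score, tasks the learner almost always solves and tasks it almost never solves both rank low, which matches the Zone of Proximal Development picture described in the abstract.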
-
MetaPhysiCa: OOD Robustness in Physics-informed Machine Learning
Authors:
S Chandra Mouli,
Muhammad Ashraful Alam,
Bruno Ribeiro
Abstract:
A fundamental challenge in physics-informed machine learning (PIML) is the design of robust PIML methods for out-of-distribution (OOD) forecasting tasks. These OOD tasks require learning-to-learn from observations of the same (ODE) dynamical system with different unknown ODE parameters, and demand accurate forecasts even under out-of-support initial conditions and out-of-support ODE parameters. In this work we propose a solution for such tasks, which we define as a meta-learning procedure for causal structure discovery (including invariant risk minimization). Using three different OOD tasks, we empirically observe that the proposed approach significantly outperforms existing state-of-the-art PIML and deep learning methods.
Submitted 6 March, 2023;
originally announced March 2023.
-
Video Action Recognition Collaborative Learning with Dynamics via PSO-ConvNet Transformer
Authors:
Nguyen Huu Phong,
Bernardete Ribeiro
Abstract:
Recognizing human actions in video sequences, known as Human Action Recognition (HAR), is a challenging task in pattern recognition. While Convolutional Neural Networks (ConvNets) have shown remarkable success in image recognition, they are not always directly applicable to HAR, as temporal features are critical for accurate classification. In this paper, we propose a novel dynamic PSO-ConvNet model for learning actions in videos, building on our recent work in image recognition. Our approach leverages a framework where the weight vector of each neural network represents the position of a particle in phase space, and particles share their current weight vectors and gradient estimates of the loss function. To extend our approach to video, we integrate ConvNets with state-of-the-art temporal methods such as Transformer and Recurrent Neural Networks. Our experimental results on the UCF-101 dataset demonstrate substantial improvements of up to 9% in accuracy, which confirms the effectiveness of our proposed method. In addition, we conducted experiments on larger and more varied datasets, including Kinetics-400 and HMDB-51, and found that the results favor Collaborative Learning over Non-Collaborative Learning (Individual Learning). Overall, our dynamic PSO-ConvNet model provides a promising direction for improving HAR by better capturing the spatio-temporal dynamics of human actions in videos. The code is available at https://github.com/leonlha/Video-Action-Recognition-Collaborative-Learning-with-Dynamics-via-PSO-ConvNet-Transformer.
Submitted 21 September, 2023; v1 submitted 17 February, 2023;
originally announced February 2023.
-
Late growth of early-type galaxies in low-z massive clusters
Authors:
A. L. B. Ribeiro,
R. S. Nascimento,
D. F. Morell,
P. A. A. Lopes,
C. C. Dantas,
M. H. S. Fonseca
Abstract:
We study a sample of 936 early-type galaxies (ETGs) located in 48 low-z regular galaxy clusters with $M_{200}\geq 10^{14}~ M_\odot$ at $z< 0.1$. We examine variations in the concentration index, radius, and color gradient of ETGs as a function of their stellar mass and loci in the projected phase space (PPS) of the clusters. We aim to understand the environmental influence on the growth of ETGs according to the time since infall into their host clusters. Our analysis indicates a significant change in the behavior of the concentration index $C$ and color gradient around $M_{\ast} \approx 2\times 10^{11} ~M_\odot \equiv \tilde{M}_{\ast}$. Objects less massive than $ \tilde{M}_{\ast}$ present a slight growth of $C$ with $M_{\ast}$ with negative and approximately constant color gradients in all regions of the PPS. Objects more massive than $ \tilde{M}_{\ast}$ present a slight decrease of $C$ with $M_{\ast}$ with color gradients becoming less negative and approaching zero. We also find that objects more massive than $ \tilde{M}_{\ast}$, in all PPS regions, have smaller $R_{90}$ for a given $R_{50}$, suggesting a smaller external growth in these objects or even a shrinkage possibly due to tidal stripping. Finally, we estimate different dark matter fractions for galaxies in different regions of the PPS, with the ancient satellites having the largest fractions, $f_{DM}\approx$ 65%. These results favor a scenario where cluster ETGs experience environmental influence the longer they remain and the deeper into the gravitational potential they lie, indicating a combination of tidal stripping + harassment, which predominate during infall, followed by mergers + feedback effects affecting the late growth of ancient satellites and BCGs.
Submitted 8 February, 2023;
originally announced February 2023.
-
Double Equivariance for Inductive Link Prediction for Both New Nodes and New Relation Types
Authors:
Jincheng Zhou,
Yucheng Zhang,
Jianfei Gao,
Yangze Zhou,
Bruno Ribeiro
Abstract:
The task of fully inductive link prediction in knowledge graphs has gained significant attention, with various graph neural networks being proposed to address it. This task presents greater challenges than traditional inductive link prediction tasks with only new nodes, as models must be capable of zero-shot generalization to both unseen nodes and unseen relation types in the inference graph. Despite the development of novel models, a unifying theoretical understanding of their success remains elusive, and the limitations of these methods are not well-studied. In this work, we introduce the concept of double permutation-equivariant representations and demonstrate its necessity for effective performance in this task. We show that many existing models, despite their diverse architectural designs, conform to this framework. However, we also identify inherent limitations in double permutation-equivariant representations, which restrict these models' ability to learn effectively on datasets with varying characteristics. Our findings suggest that while double equivariance is necessary for meta-learning across knowledge graphs from different domains, it is not sufficient. There remains a fundamental gap between double permutation-equivariant models and the concept of foundation models designed to learn patterns across all domains.
Submitted 13 January, 2025; v1 submitted 2 February, 2023;
originally announced February 2023.
-
Causal Lifting and Link Prediction
Authors:
Leonardo Cotta,
Beatrice Bevilacqua,
Nesreen Ahmed,
Bruno Ribeiro
Abstract:
Existing causal models for link prediction assume an underlying set of inherent node factors -- an innate characteristic defined at the node's birth -- that governs the causal evolution of links in the graph. In some causal tasks, however, link formation is path-dependent: The outcome of link interventions depends on existing links. Unfortunately, these existing causal methods are not designed for path-dependent link formation, as the cascading functional dependencies between links (arising from path dependence) are either unidentifiable or require an impractical number of control variables. To overcome this, we develop the first causal model capable of dealing with path dependencies in link prediction. In this work we introduce the concept of causal lifting, an invariance in causal models of independent interest that, on graphs, allows the identification of causal link prediction queries using limited interventional data. Further, we show how structural pairwise embeddings exhibit lower bias and correctly represent the task's causal structure, as opposed to existing node embeddings, e.g., graph neural network node embeddings and matrix factorization. Finally, we validate our theoretical findings on three scenarios for causal link prediction tasks: knowledge base completion, covariance matrix estimation and consumer-product recommendations.
Submitted 27 July, 2023; v1 submitted 2 February, 2023;
originally announced February 2023.
-
Galaxy Distributions as Fractal Systems
Authors:
Sharon Teles,
Amanda R. Lopes,
Marcelo B. Ribeiro
Abstract:
This paper discusses whether large scale galaxy distribution samples containing almost one million objects can be characterized as fractal systems. The analysis performed by Teles et al. (2021; arXiv:2012.07164) on the UltraVISTA DR1 survey is extended here to the SPLASH and COSMOS2015 catalogs, hence adding 750k new galaxies with measured redshifts to the studied samples. The standard $Λ$CDM cosmology having $H_0=(70\pm5)$ km/s/Mpc and number density tools required for describing these galaxy distributions as single fractal systems with dimension $D$ are adopted. We use the luminosity distance $d_L$, redshift distance $d_z$ and galaxy area distance (transverse comoving distance) $d_G$ as relativistic distance definitions to derive galaxy number densities in the redshift interval $0.1\le z\le4$ at volume limited subsamples defined by absolute magnitudes in the K-band. Similar to the findings of Teles et al. (2021; arXiv:2012.07164), the results show two consecutive redshift scales where galaxy distribution data behave as single fractal structures. For $z<1$ we found $D=1.00\pm0.12$ for the SPLASH galaxies, and $D=1.39\pm0.19$ for the COSMOS2015. For $1\le z\le4$ we respectively found $D=0.83^{+0.36}_{-0.37}$ and $D=0.54^{+0.27}_{-0.26}$. These results were verified to be robust under the assumed Hubble constant uncertainty. Calculations considering blue and red galaxies subsamples in both surveys showed that the fractal dimensions of blue galaxies are basically unchanged, but the ones for the red galaxies changed mostly to smaller values, meaning that $D$ may be seen as a more intrinsic property of the distribution of objects in the Universe, therefore allowing for the fractal dimension to be used as a tool to study different populations of galaxies. All results confirm the decades-old theoretical prediction of a decrease in the fractal dimension for $z>1$.
Submitted 29 September, 2022;
originally announced September 2022.
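As a rough illustration of how a single fractal dimension $D$ can be read off number densities, as in the abstract above: for a single fractal system the cumulative count scales as $N \propto d^{D}$, so the volume number density scales as $d^{D-3}$ and $D$ follows from the slope of a log-log fit. The sketch below assumes exactly that power law and uses synthetic values; the function name `fit_fractal_dimension` and the data are hypothetical, and the paper's actual estimators, distance definitions, and error treatment are not reproduced here.

```python
import math


def fit_fractal_dimension(distances, densities):
    """Estimate D assuming density ∝ d**(D - 3).

    Performs an ordinary least-squares fit of log(density) against
    log(distance) and returns 3 + slope.
    """
    xs = [math.log(d) for d in distances]
    ys = [math.log(n) for n in densities]
    count = len(xs)
    mean_x = sum(xs) / count
    mean_y = sum(ys) / count
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return 3.0 + slope


# Synthetic, noise-free data generated with an assumed D = 1.4.
distances = [100.0, 200.0, 400.0, 800.0, 1600.0]
true_D = 1.4
densities = [5.0 * d ** (true_D - 3.0) for d in distances]
estimated_D = fit_fractal_dimension(distances, densities)
```

With real survey data the points scatter around the power law, so the fitted slope would carry an uncertainty; this sketch does not estimate one.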
-
Bias Challenges in Counterfactual Data Augmentation
Authors:
S Chandra Mouli,
Yangze Zhou,
Bruno Ribeiro
Abstract:
Deep learning models tend not to be out-of-distribution robust primarily due to their reliance on spurious features to solve the task. Counterfactual data augmentations provide a general way of (approximately) achieving representations that are counterfactual-invariant to spurious features, a requirement for out-of-distribution (OOD) robustness. In this work, we show that counterfactual data augmentations may not achieve the desired counterfactual-invariance if the augmentation is performed by a context-guessing machine, an abstract machine that guesses the most-likely context of a given input. We theoretically analyze the invariance imposed by such counterfactual data augmentations and describe an exemplar NLP task where counterfactual data augmentation by a context-guessing machine does not lead to robust OOD classifiers.
Submitted 13 September, 2022; v1 submitted 12 September, 2022;
originally announced September 2022.
-
Veritas: Answering Causal Queries from Video Streaming Traces
Authors:
Chandan Bothra,
Jianfei Gao,
Sanjay Rao,
Bruno Ribeiro
Abstract:
In this paper, we seek to answer what-if questions - i.e., given recorded data of an existing deployed networked system, what would be the performance impact if we changed the design of the system (a task also known as causal inference). We make three contributions. First, we expose the complexity of causal inference in the context of adaptive bit rate video streaming, a challenging domain where the network conditions during the session act as a sequence of latent and confounding variables, and a change at any point in the session has a cascading impact on the rest of the session. Second, we present Veritas, a novel framework that tackles causal reasoning for video streaming without resorting to randomised trials. Integral to Veritas is an easy to interpret domain-specific ML model (an embedded Hidden Markov Model) that relates the latent stochastic process (intrinsic bandwidth that the video session can achieve) to actual observations (download times) while exploiting control variables such as the TCP state (e.g., congestion window) observed at the start of the download of video chunks. We show through experiments on an emulation testbed that Veritas can answer both counterfactual queries (e.g., the performance of a completed video session had it used a different buffer size) and interventional queries (e.g., estimating the download time for every possible video quality choice for the next chunk in a session in progress). In doing so, Veritas achieves accuracy close to an ideal oracle, while significantly outperforming both a commonly used baseline approach, and Fugu (an off-the-shelf neural network) neither of which account for causal effects.
Submitted 26 August, 2022;
originally announced August 2022.
-
Metal content of the circumgalactic medium around star-forming galaxies at z $\sim$ 2.6 as revealed by the VIMOS Ultra-Deep Survey
Authors:
H. Méndez-Hernández,
P. Cassata,
E. Ibar,
R Amorín,
M. Aravena,
S. Bardelli,
O. Cucciati,
B. Garilli,
M. Giavalisco,
L. Guaita,
N. Hathi,
A. Koekemoer,
V. Le Brun,
B. C. Lemaux,
D. Maccagni,
B. Ribeiro,
L. Tasca,
N. Tejos,
R. Thomas,
L. Tresse,
D. Vergani,
G. Zamorani,
E. Zucca
Abstract:
The circumgalactic medium (CGM) is the location where the interplay between large-scale outflows and accretion onto galaxies occurs. Metals in different ionization states flowing between the circumgalactic and intergalactic mediums are affected by large galactic outflows and low-ionization state inflowing gas. Observational studies on their spatial distribution and their relation with galaxy properties may provide important constraints on models of galaxy formation and evolution. To provide new insights into the spatial distribution of the circumgalactic medium of star-forming galaxies, we select a sample of 238 close pairs at $1.5 < z <4.5$ ($\langle z\rangle\sim$2.6) from the VIMOS Ultra Deep Survey. We then generate composite spectra by co-adding spectra of $background$ galaxies that provide different sight-lines across the CGM to examine the spatial distribution of the gas located around these galaxies and investigate possible correlations between the strength of the low- and high-ionization absorption features with different galaxy properties. We detect C II, Si II, Si IV and C IV up to separations $\langle b \rangle=$ 172 kpc and 146 kpc. Our $W_{0}$ radial profiles suggest a potential redshift evolution for the CGM gas content producing these absorptions. We find a correlation between C II and C IV with star formation rate, stellar mass and trends with galaxy size estimated by the effective radius and azimuthal angle. Galaxies with high star formation rate show stronger C IV absorptions compared with star-forming galaxies with low SFR and low stellar mass. These results could be explained by stronger outflows, softer radiation fields unable to ionize high-ionization state lines or by the galactic fountain scenario where metal-rich gas ejected from previous star-formation episodes falls back to the galaxy.
Submitted 25 July, 2022; v1 submitted 17 June, 2022;
originally announced June 2022.
-
OOD Link Prediction Generalization Capabilities of Message-Passing GNNs in Larger Test Graphs
Authors:
Yangze Zhou,
Gitta Kutyniok,
Bruno Ribeiro
Abstract:
This work provides the first theoretical study on the ability of graph Message Passing Neural Networks (gMPNNs) -- such as Graph Neural Networks (GNNs) -- to perform inductive out-of-distribution (OOD) link prediction tasks, where deployment (test) graph sizes are larger than training graphs. We first prove non-asymptotic bounds showing that link predictors based on permutation-equivariant (structural) node embeddings obtained by gMPNNs can converge to a random guess as test graphs get larger. We then propose a theoretically-sound gMPNN that outputs structural pairwise (2-node) embeddings and prove non-asymptotic bounds showing that, as test graphs grow, these embeddings converge to embeddings of a continuous function that retains its ability to predict links OOD. Empirical results on random graphs show agreement with our theoretical results.
Submitted 9 October, 2022; v1 submitted 30 May, 2022;
originally announced May 2022.
-
Leading edge vortex formation and wake trajectory: Synthesizing measurements, analysis, and machine learning
Authors:
Howon Lee,
Nicholas Simone,
Yunxing Su,
Yuanhang Zhu,
Bernardo Luiz R. Ribeiro,
Jennifer A. Franck,
Kenneth Breuer
Abstract:
The strength and trajectory of a leading edge vortex (LEV) formed by a pitching-heaving hydrofoil (chord $c$) are studied. The LEV is identified using the $Q$-criterion method, which is calculated from the 2D velocity field obtained from PIV measurements. The relative angle of attack at mid-stroke, ${α_{T/4}} $, proves to be an effective method of combining heave amplitude ($h_0/c$), pitch amplitude ($θ_0$), and reduced frequency ($f^*$) into a single variable that predicts the maximum value of $Q$ over a wide range of operating conditions. Once the LEV separates from the foil, it travels downstream and rapidly weakens and diffuses. The downstream trajectory of the LEV has two characteristic shapes. At low values of ${α_{T/4}}$, it travels straight downstream after separating from the foil, while at higher values of ${α_{T/4}} $, an accompanying Trailing Edge Vortex (TEV) forms and the induced velocity generates a cross-stream component to the vortex trajectories. This behavior is accurately predicted using a potential flow model for the LEV and TEV. Supervised machine learning algorithms, namely Support Vector Regression and Gaussian Process Regression, are used to create regression models that predict the vortex strength, shape and trajectory during growth and after separation. The regression model successfully captures the features of two vortex regimes observed at different values of ${α_{T/4}} $. However, the predicted LEV trajectories are somewhat smoother than observed in the experiments. The strength of the vortex is often under-predicted. Both of these shortcomings may be attributed to the relatively small size of the training data set.
Submitted 25 May, 2022;
originally announced May 2022.
-
Action Recognition for American Sign Language
Authors:
Nguyen Huu Phong,
Bernardete Ribeiro
Abstract:
In this research, we present our findings on recognizing American Sign Language from a series of hand gestures. While most research in the literature focuses only on static handshapes, our work targets dynamic hand gestures. Since dynamic sign datasets are very few, we collect an initial dataset of 150 videos for 10 signs and an extension of 225 videos for 15 signs. We apply transfer learning models in combination with deep neural networks and background subtraction for videos in different temporal settings. Our preliminary results show that we can get an accuracy of $0.86$ and $0.71$, respectively, using DenseNet201 and LSTM with video sequences of 12 frames.
Submitted 20 May, 2022;
originally announced May 2022.