-
Optimising cryptocurrency portfolios through stable clustering of price correlation networks
Authors:
Ruixue Jing,
Ryota Kobayashi,
Luis Enrique Correa Rocha
Abstract:
The emerging cryptocurrency market presents unique challenges for investment due to its unregulated nature and inherent volatility. However, collective price movements can be explored to maximise profits with minimal risk using investment portfolios. In this paper, we develop a technical framework that utilises historical data on daily closing prices and integrates network analysis, price forecasting, and portfolio theory to identify cryptocurrencies for building profitable portfolios under uncertainty. Our method utilises the Louvain network community algorithm and consensus clustering to detect robust and temporally stable clusters of highly correlated cryptocurrencies, from which the chosen cryptocurrencies are selected. A price prediction step using the ARIMA model guarantees that the portfolio performs well for up to 14 days in the investment horizon. Empirical analysis over a 5-year period shows that despite the high volatility in the crypto market, hidden price patterns can be effectively utilised to generate consistently profitable, time-agnostic cryptocurrency portfolios.
Submitted 30 May, 2025;
originally announced May 2025.
-
Sentiment spreads, but topics do not, in COVID-19 discussions within the Belgian Reddit community
Authors:
Tim Van Wesemael,
Luis E. C. Rocha,
Tijs W. Alleman,
Jan M. Baetens
Abstract:
This study investigates how topics and sentiments on COVID-19 mitigation measures -- specifically lockdowns, mask mandates, and vaccinations -- spread through the Belgian Reddit community. We explore 655,642 posts created between 1 January 2020 and 30 June 2022. In line with previous studies for other countries and platforms, we find that the volume of posts on these topics can be tied to important external events, but not within-Reddit interactions. Sentiment, however, is influenced by the sentiment of previous posts, resulting in homophily and polarisation. We define a homophily measure and find values of 0.228, 0.198, and 0.133 for lockdowns, masks and vaccination, respectively. Additionally, we introduce a novel bounded confidence model that estimates internal sentiment of users from their expressed sentiment. The Wasserstein metric between the predicted and the observed sentiments takes values between 0.493 (vaccination) and 0.607 (lockdown). These results yield insight into the way the Belgian Reddit community experienced the pandemic, and which aspects influenced the topics discussed and their associated sentiment.
Submitted 26 May, 2025;
originally announced May 2025.
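
The Wasserstein comparison mentioned in this abstract can be illustrated with a minimal sketch. It assumes the 1-Wasserstein distance between two one-dimensional samples of equal size, in which case the distance reduces to the mean absolute difference between the sorted samples. The abstract does not state which variant the authors use, so the function name `wasserstein_1d` and the sample values below are hypothetical.

```python
def wasserstein_1d(xs, ys):
    """1-Wasserstein distance between two equal-size 1D samples.

    Assumption: both samples carry uniform weights and have the same
    length, so the optimal transport plan pairs sorted values in order.
    """
    if len(xs) != len(ys) or not xs:
        raise ValueError("samples must be non-empty and of equal size")
    pairs = zip(sorted(xs), sorted(ys))
    return sum(abs(a - b) for a, b in pairs) / len(xs)


# Hypothetical sentiment scores in [-1, 1], for illustration only.
predicted = [-0.5, 0.1, 0.4]
observed = [0.0, 0.3, -0.2]
distance = wasserstein_1d(predicted, observed)
```

A value of 0 means the two samples have identical distributions; larger values mean more "mass" has to be moved to turn one distribution into the other.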
-
Characterizing asymmetric and bimodal long-term financial return distributions through quantum walks
Authors:
Stijn De Backer,
Luis E. C. Rocha,
Jan Ryckebusch,
Koen Schoors
Abstract:
The analysis of logarithmic return distributions defined over large time scales is crucial for understanding the long-term dynamics of asset price movements. For large time scales of the order of two trading years, the anticipated Gaussian behavior of the returns often does not emerge, and their distributions often exhibit a high level of asymmetry and bimodality. These features are inadequately captured by the majority of classical models to address financial time series and return distributions. In the presented analysis, we use a model based on the discrete-time quantum walk to characterize the observed asymmetry and bimodality. The quantum walk distinguishes itself from a classical diffusion process by the occurrence of interference effects, which allows for the generation of bimodal and asymmetric probability distributions. By capturing the broader trends and patterns that emerge over extended periods, this analysis complements traditional short-term models and offers opportunities to more accurately describe the probabilistic structure underlying long-term financial decisions.
Submitted 19 May, 2025;
originally announced May 2025.
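
To make the interference effect described in this abstract concrete, here is a minimal sketch of a discrete-time quantum walk on a line. It assumes the textbook Hadamard coin and a walker that starts at the origin in a single coin state; it is not the authors' fitted model, and the function name `hadamard_walk` is hypothetical.

```python
import math


def hadamard_walk(steps):
    """Position distribution of a Hadamard quantum walk after `steps` steps.

    Assumptions: the "up" coin component moves one site left, the "down"
    component moves one site right, and the walker starts at position 0
    in the pure "up" state.
    """
    n = 2 * steps + 1          # positions -steps .. +steps
    up = [0j] * n
    down = [0j] * n
    up[steps] = 1 + 0j         # index `steps` is position 0
    s = 1 / math.sqrt(2)
    for _ in range(steps):
        new_up = [0j] * n
        new_down = [0j] * n
        for x in range(n):
            if up[x] == 0 and down[x] == 0:
                continue       # nothing to propagate from this site
            # Hadamard coin: mixes the two coin amplitudes.
            coin_up = s * (up[x] + down[x])
            coin_down = s * (up[x] - down[x])
            # Conditional shift: amplitudes from different paths add here,
            # which is where interference arises.
            new_up[x - 1] += coin_up
            new_down[x + 1] += coin_down
        up, down = new_up, new_down
    return {x - steps: abs(up[x]) ** 2 + abs(down[x]) ** 2 for x in range(n)}


probabilities = hadamard_walk(50)
mean_position = sum(x * p for x, p in probabilities.items())
```

With this initial state the distribution is skewed toward one side and concentrates in peaks away from the origin, in contrast to the symmetric, bell-shaped distribution of a classical random walk.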
-
Overcoming Obstacles: Challenges of Gender Inequality in Undergraduate ICT Programs
Authors:
Angelica Pereira Souza,
Anderson Uchôa,
Edna Dias Canedo,
Juliana Alves Pereira,
Claudia Pinto Pereira,
Larissa Rocha
Abstract:
Context: Gender inequality is a widely discussed issue across various sectors, including Information Technology and Communication (ICT). In Brazil, women represent less than 18% of ICT students in higher education. Prior studies highlight gender-related barriers that discourage women from staying in ICT. However, they provide limited insights into their perceptions as undergraduate students and the factors influencing their participation and confidence. Goal: This study explores the perceptions of women undergraduate students in ICT regarding gender inequality. Method: A survey of 402 women from 18 Brazilian states enrolled in ICT courses was conducted using a mixed-method approach, combining quantitative and qualitative analyses. Results: Women students reported experiencing discriminatory practices from peers and professors, both inside and outside the classroom. Gender stereotypes were found to undermine their self-confidence and self-esteem, occasionally leading to course discontinuation. Conclusions: Factors such as lack of representation, inappropriate jokes, isolation, mistrust, and difficulty being heard contribute to harmful outcomes, including reduced participation and reluctance to take leadership roles. Addressing these issues is essential to creating a safe and respectful learning environment for all students.
Submitted 2 May, 2025;
originally announced May 2025.
-
A thorough benchmark of automatic text classification: From traditional approaches to large language models
Authors:
Washington Cunha,
Leonardo Rocha,
Marcos André Gonçalves
Abstract:
Automatic text classification (ATC) has experienced remarkable advancements in the past decade, best exemplified by recent small and large language models (SLMs and LLMs), leveraged by Transformer architectures. Despite recent effectiveness improvements, a comprehensive cost-benefit analysis investigating whether the effectiveness gains of these recent approaches compensate for their much higher costs when compared to more traditional text classification approaches such as SVMs and Logistic Regression is still missing in the literature. In this context, this work's main contributions are twofold: (i) we provide a scientifically sound comparative analysis of the cost-benefit of twelve traditional and recent ATC solutions including five open LLMs, and (ii) we release a large benchmark comprising 22 datasets, including sentiment analysis and topic classification, with their (train-validation-test) partitions based on folded cross-validation procedures, along with documentation and code. The release of code, data, and documentation enables the community to replicate experiments and advance the field in a more scientifically sound manner. Our comparative experimental results indicate that LLMs outperform traditional approaches (by up to 26%, and by 7.1% on average) and SLMs (by up to 4.9%, and by 1.9% on average) in terms of effectiveness. However, LLMs incur significantly higher computational costs due to fine-tuning, being, on average, 590x and 8.5x slower than traditional methods and SLMs, respectively. The results suggest the following recommendations: (1) LLMs for applications that require the best possible effectiveness and can afford the costs; (2) traditional methods such as Logistic Regression and SVM for resource-limited applications or those that cannot afford the cost of tuning large LLMs; and (3) SLMs like RoBERTa for a near-optimal effectiveness-efficiency trade-off.
Submitted 2 April, 2025;
originally announced April 2025.
-
European Contributions to Fermilab Accelerator Upgrades and Facilities for the DUNE Experiment
Authors:
DUNE Collaboration,
A. Abed Abud,
R. Acciarri,
M. A. Acero,
M. R. Adames,
G. Adamov,
M. Adamowski,
D. Adams,
M. Adinolfi,
C. Adriano,
A. Aduszkiewicz,
J. Aguilar,
F. Akbar,
F. Alemanno,
N. S. Alex,
K. Allison,
M. Alrashed,
A. Alton,
R. Alvarez,
T. Alves,
A. Aman,
H. Amar,
P. Amedo,
J. Anderson,
D. A. Andrade
, et al. (1322 additional authors not shown)
Abstract:
The Proton Improvement Plan (PIP-II) to the FNAL accelerator chain and the Long-Baseline Neutrino Facility (LBNF) will provide the world's most intense neutrino beam to the Deep Underground Neutrino Experiment (DUNE) enabling a wide-ranging physics program. This document outlines the significant contributions made by European national laboratories and institutes towards realizing the first phase of the project with a 1.2 MW neutrino beam. Construction of this first phase is well underway. For DUNE Phase II, this will be closely followed by an upgrade of the beam power to > 2 MW, for which the European groups again have a key role and which will require the continued support of the European community for machine aspects of neutrino physics. Beyond the neutrino beam aspects, LBNF is also responsible for providing unique infrastructure to install and operate the DUNE neutrino detectors at FNAL and at the Sanford Underground Research Facility (SURF). The cryostats for the first two Liquid Argon Time Projection Chamber detector modules at SURF, a contribution of CERN to LBNF, are central to the success of the ongoing execution of DUNE Phase I. Likewise, successful and timely procurement of cryostats for two additional detector modules at SURF will be critical to the success of DUNE Phase II and the overall physics program. The DUNE Collaboration is submitting four main contributions to the 2026 Update of the European Strategy for Particle Physics process. This paper is being submitted to the 'Accelerator technologies' and 'Projects and Large Experiments' streams. Additional inputs related to the DUNE science program, DUNE detector technologies and R&D, and DUNE software and computing, are also being submitted to other streams.
Submitted 31 March, 2025;
originally announced March 2025.
-
DUNE Software and Computing Research and Development
Authors:
DUNE Collaboration,
A. Abed Abud,
R. Acciarri,
M. A. Acero,
M. R. Adames,
G. Adamov,
M. Adamowski,
D. Adams,
M. Adinolfi,
C. Adriano,
A. Aduszkiewicz,
J. Aguilar,
F. Akbar,
F. Alemanno,
N. S. Alex,
K. Allison,
M. Alrashed,
A. Alton,
R. Alvarez,
T. Alves,
A. Aman,
H. Amar,
P. Amedo,
J. Anderson,
D. A. Andrade
, et al. (1322 additional authors not shown)
Abstract:
The international collaboration designing and constructing the Deep Underground Neutrino Experiment (DUNE) at the Long-Baseline Neutrino Facility (LBNF) has developed a two-phase strategy toward the implementation of this leading-edge, large-scale science project. The ambitious physics program of Phase I and Phase II of DUNE is dependent upon deployment and utilization of significant computing resources, and successful research and development of software (both infrastructure and algorithmic) in order to achieve these scientific goals. This submission discusses the computing resources projections, infrastructure support, and software development needed for DUNE during the coming decades as an input to the European Strategy for Particle Physics Update for 2026. The DUNE collaboration is submitting four main contributions to the 2026 Update of the European Strategy for Particle Physics process. This submission to the 'Computing' stream focuses on DUNE software and computing. Additional inputs related to the DUNE science program, DUNE detector technologies and R&D, and European contributions to Fermilab accelerator upgrades and facilities for the DUNE experiment, are also being submitted to other streams.
Submitted 31 March, 2025;
originally announced March 2025.
-
The DUNE Phase II Detectors
Authors:
DUNE Collaboration,
A. Abed Abud,
R. Acciarri,
M. A. Acero,
M. R. Adames,
G. Adamov,
M. Adamowski,
D. Adams,
M. Adinolfi,
C. Adriano,
A. Aduszkiewicz,
J. Aguilar,
F. Akbar,
F. Alemanno,
N. S. Alex,
K. Allison,
M. Alrashed,
A. Alton,
R. Alvarez,
T. Alves,
A. Aman,
H. Amar,
P. Amedo,
J. Anderson,
D. A. Andrade
, et al. (1322 additional authors not shown)
Abstract:
The international collaboration designing and constructing the Deep Underground Neutrino Experiment (DUNE) at the Long-Baseline Neutrino Facility (LBNF) has developed a two-phase strategy for the implementation of this leading-edge, large-scale science project. The 2023 report of the US Particle Physics Project Prioritization Panel (P5) reaffirmed this vision and strongly endorsed DUNE Phase I and Phase II, as did the previous European Strategy for Particle Physics. The construction of DUNE Phase I is well underway. DUNE Phase II consists of a third and fourth far detector module, an upgraded near detector complex, and an enhanced > 2 MW beam. The fourth FD module is conceived as a 'Module of Opportunity', aimed at supporting the core DUNE science program while also expanding the physics opportunities with more advanced technologies. The DUNE collaboration is submitting four main contributions to the 2026 Update of the European Strategy for Particle Physics process. This submission to the 'Detector instrumentation' stream focuses on technologies and R&D for the DUNE Phase II detectors. Additional inputs related to the DUNE science program, DUNE software and computing, and European contributions to Fermilab accelerator upgrades and facilities for the DUNE experiment, are also being submitted to other streams.
Submitted 29 March, 2025;
originally announced March 2025.
-
The DUNE Science Program
Authors:
DUNE Collaboration,
A. Abed Abud,
R. Acciarri,
M. A. Acero,
M. R. Adames,
G. Adamov,
M. Adamowski,
D. Adams,
M. Adinolfi,
C. Adriano,
A. Aduszkiewicz,
J. Aguilar,
F. Akbar,
F. Alemanno,
N. S. Alex,
K. Allison,
M. Alrashed,
A. Alton,
R. Alvarez,
T. Alves,
A. Aman,
H. Amar,
P. Amedo,
J. Anderson,
D. A. Andrade
, et al. (1322 additional authors not shown)
Abstract:
The international collaboration designing and constructing the Deep Underground Neutrino Experiment (DUNE) at the Long-Baseline Neutrino Facility (LBNF) has developed a two-phase strategy for the implementation of this leading-edge, large-scale science project. The 2023 report of the US Particle Physics Project Prioritization Panel (P5) reaffirmed this vision and strongly endorsed DUNE Phase I and Phase II, as did the previous European Strategy for Particle Physics. The construction of DUNE Phase I is well underway. DUNE Phase II consists of a third and fourth far detector module, an upgraded near detector complex, and an enhanced > 2 MW beam. The fourth FD module is conceived as a 'Module of Opportunity', aimed at supporting the core DUNE science program while also expanding the physics opportunities with more advanced technologies. The DUNE collaboration is submitting four main contributions to the 2026 Update of the European Strategy for Particle Physics process. This submission to the 'Neutrinos and cosmic messengers', 'BSM physics' and 'Dark matter and dark sector' streams focuses on the physics program of DUNE. Additional inputs related to DUNE detector technologies and R&D, DUNE software and computing, and European contributions to Fermilab accelerator upgrades and facilities for the DUNE experiment, are also being submitted to other streams.
Submitted 29 March, 2025;
originally announced March 2025.
-
Detection of anomalies in cow activity using wavelet transform based features
Authors:
Valentin Guien,
Violaine Antoine,
Romain Lardy,
Isabelle Veissier,
Luis E C Rocha
Abstract:
In Precision Livestock Farming, detecting deviations from optimal or baseline values - i.e. anomalies in time series - is essential to allow corrective actions to be taken rapidly. Here we aim at detecting anomalies in 24h time series of cow activity, with a view to detecting cases of disease or oestrus. Deviations must be distinguished from noise, which can be very high in biological data. It is also important to detect the anomaly early, e.g. before a farmer would notice it visually. Here, we investigate the benefit of using wavelet transforms to denoise data and we assess the performance of an anomaly detection algorithm considering the timing of the detection. We developed features based on the comparisons between the wavelet transforms of the mean of the time series and the wavelet transforms of individual time series instances. We hypothesized that these features contribute to the detection of anomalies in periodic time series using a feature-based algorithm. We tested this hypothesis with two datasets representing cow activity, which typically follows a daily pattern but can deviate due to specific physiological or pathological conditions. We applied features derived from wavelet transforms as well as statistical features in an Isolation Forest algorithm. We measured the distance of detection between the days annotated as abnormal by animal caretakers and the days predicted as abnormal by the algorithm. The results show that wavelet-based features are among the features that contribute most to anomaly detection. They also show that detections are close to the annotated days, and often precede them. In conclusion, using wavelet transforms on time series of cow activity data helps to detect anomalies related to specific cow states. The detection is often obtained on days that precede the day annotated by caretakers, which offers the possibility of taking corrective actions at an early stage.
Submitted 28 February, 2025;
originally announced February 2025.
-
Neutrino Interaction Vertex Reconstruction in DUNE with Pandora Deep Learning
Authors:
DUNE Collaboration,
A. Abed Abud,
R. Acciarri,
M. A. Acero,
M. R. Adames,
G. Adamov,
M. Adamowski,
D. Adams,
M. Adinolfi,
C. Adriano,
A. Aduszkiewicz,
J. Aguilar,
F. Akbar,
F. Alemanno,
N. S. Alex,
K. Allison,
M. Alrashed,
A. Alton,
R. Alvarez,
T. Alves,
A. Aman,
H. Amar,
P. Amedo,
J. Anderson,
C. Andreopoulos
, et al. (1313 additional authors not shown)
Abstract:
The Pandora Software Development Kit and algorithm libraries perform reconstruction of neutrino interactions in liquid argon time projection chamber detectors. Pandora is the primary event reconstruction software used at the Deep Underground Neutrino Experiment, which will operate four large-scale liquid argon time projection chambers at the far detector site in South Dakota, producing high-resolution images of charged particles emerging from neutrino interactions. While these high-resolution images provide excellent opportunities for physics, the complex topologies require sophisticated pattern recognition capabilities to interpret signals from the detectors as physically meaningful objects that form the inputs to physics analyses. A critical component is the identification of the neutrino interaction vertex. Subsequent reconstruction algorithms use this location to identify the individual primary particles and ensure they each result in a separate reconstructed particle. A new vertex-finding procedure described in this article integrates a U-ResNet neural network performing hit-level classification into the multi-algorithm approach used by Pandora to identify the neutrino interaction vertex. The machine learning solution is seamlessly integrated into a chain of pattern-recognition algorithms. The technique substantially outperforms the previous BDT-based solution, with a more than 20% increase in the efficiency of sub-1 cm vertex reconstruction across all neutrino flavours.
Submitted 10 February, 2025;
originally announced February 2025.
-
Evaluating the Effectiveness of LLMs in Fixing Maintainability Issues in Real-World Projects
Authors:
Henrique Nunes,
Eduardo Figueiredo,
Larissa Rocha,
Sarah Nadi,
Fischer Ferreira,
Geanderson Esteves
Abstract:
Large Language Models (LLMs) have gained attention for addressing coding problems, but their effectiveness in fixing code maintainability remains unclear. This study evaluates LLMs' capability to resolve 127 maintainability issues from 10 GitHub repositories. We use zero-shot prompting for Copilot Chat and Llama 3.1, and few-shot prompting with Llama only. The LLM-generated solutions are assessed for compilation errors, test failures, and new maintainability problems. Llama with few-shot prompting successfully fixed 44.9% of the methods, while Copilot Chat and Llama zero-shot fixed 32.29% and 30%, respectively. However, most solutions introduced errors or new maintainability issues. We also conducted a human study with 45 participants to evaluate the readability of 51 LLM-generated solutions. The human study showed that 68.63% of participants observed improved readability. Overall, while LLMs show potential for fixing maintainability issues, their introduction of errors highlights their current limitations.
Submitted 4 February, 2025;
originally announced February 2025.
-
Engel's Theorem for Alternative and Special Jordan Superalgebras
Authors:
Isabel Hernández,
Laiz Valim da Rocha,
Rodrigo Lucas Rodrigues
Abstract:
In this paper, a nilpotency criterion is given for finite-dimensional alternative superalgebras, inspired by the celebrated Engel's Theorem for Lie algebras. As a consequence, a similar result is proved for finite-dimensional special Jordan superalgebras over a field $\mathbb{F}$ of characteristic not $2$, without restrictions on the cardinality of $\mathbb{F}$. In that case, the latter extends Engel's Theorem for Jordan superalgebras constructed by Okunev and Shestakov and it gives a partial positive answer to an open problem announced by Murakami et al. for Jordan superalgebras over finite fields. We also establish some connections between the concepts of graded-nil and nilpotent alternative superalgebras.
Submitted 29 January, 2025;
originally announced January 2025.
-
Translating and evaluating single-cell Boolean network interventions in the multiscale setting
Authors:
John Metzcar,
Katie Pletz,
Heber L. Rocha,
Jordan C Rozum
Abstract:
Intracellular networks process cellular-level information and control cell fate. They can be computationally modeled using Boolean networks, which are implicit-time causal models of discrete binary events. These networks can be embedded in computational agents to drive cellular behavior. To explore this integration, we construct a set of candidate interventions that induce apoptosis in a cell-survival network of a rare leukemia using exhaustive search simulation, stable motif control, and an individual-based mean field approach (IBMFA). Due to inherent algorithmic limitations, these interventions are most suitable for cell-level determinations, not the more realistic multicellular setting. To address these limitations, we treat the target control solutions as putative targets for therapeutic interventions and develop a pipeline to translate them to continuous-time multicellular, agent-based models. We set the discrete-to-continuous transitions between the Boolean network and multicellular model via thresholding and produce simple computational simulations designed to emulate situations in experimental and translational biology. These include a series of simulations: constant substrate gradients, global substrate pulses, and time-varying boundary conditions. We find that interventions that perform equally well in the implicit-time single-cell setting are separable in the multiscale setting in ability to impact population growth and spatial distribution. Further analysis shows that the population and spatial distribution differences arise from differences in internal dynamics (stable motif controls versus target controls) and network distance between the intervention and output nodes. This proof of concept work demonstrates the importance of accounting for internal dynamics in multicellular simulations as well as impacts on understanding of Boolean network control.
Submitted 27 January, 2025;
originally announced January 2025.
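
The discrete-to-continuous coupling described in this abstract can be sketched with a toy example. It assumes a hypothetical three-node Boolean network (`Drug`, `Survival`, `Apoptosis`) with synchronous updates and a hypothetical concentration cutoff of 0.5; neither the rules nor the threshold come from the paper, which studies a leukemia cell-survival network.

```python
def to_boolean(concentration, cutoff=0.5):
    """Threshold a continuous substrate level into a Boolean input."""
    return concentration >= cutoff


def update(state):
    """One synchronous update of the toy network (hypothetical rules)."""
    return {
        "Drug": state["Drug"],                 # external input, held fixed
        "Survival": not state["Drug"],         # the drug switches survival off
        "Apoptosis": not state["Survival"],    # apoptosis follows loss of survival
    }


def run(concentration, steps=3):
    """Threshold the substrate level, then iterate the Boolean network."""
    state = {
        "Drug": to_boolean(concentration),
        "Survival": True,
        "Apoptosis": False,
    }
    for _ in range(steps):
        state = update(state)
    return state


high_dose = run(0.8)   # concentration above the cutoff
low_dose = run(0.2)    # concentration below the cutoff
```

In this toy setup the intervention node sits two updates away from the output node, so `Apoptosis` switches on only after the second step. This loosely mirrors the abstract's point that network distance between intervention and output affects behaviour once time matters.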
-
Coherent deflection of atomic samples and positional mesoscopic superpositions
Authors:
L. F. Alves da Silva,
L. M. R. Rocha,
M. H. Y. Moussa
Abstract:
We present a protocol based on the interplay between superradiance and superabsorption to achieve the coherent deflection of an atomic sample due to the momentum transfer from the atoms to a cavity field. The coherent character of this momentum transfer, causing the atomic sample to deflect as a whole, follows from the collective nature of the atomic superradiant pulse and its superabsorption by the cavity field. The protocol is then used for the construction of positional mesoscopic atomic superpositions.
Submitted 11 February, 2025; v1 submitted 27 November, 2024;
originally announced November 2024.
-
Economic Integration of Africa in the 21st Century: Complex Network and Panel Regression Analysis
Authors:
Tekilu Tadesse Choramo,
Jemal Abafita,
Yerali Gandica,
Luis E C Rocha
Abstract:
Global and regional integration has grown significantly in recent decades, boosting intra-African trade and positively impacting national economies through trade diversification and sustainable development. However, existing measures of economic integration often fail to capture the complex interactions among trading partners. This study addresses this gap by using complex network analysis and dynamic panel regression techniques to identify factors driving economic integration in Africa, based on data from 2002 to 2019. The results show that economic development, institutional quality, regional trade agreements, human capital, FDI, and infrastructure positively influence a country's position in the African trade network. Conversely, trade costs, the global financial crisis, and overlapping regional memberships negatively affect network-based integration. Our findings suggest that enhancing a country's connectivity in the African trade network involves identifying key economic and institutional factors of trade partners and strategically focusing on continent-wide agreements rather than just regional ones to boost economic growth.
Submitted 28 October, 2024;
originally announced October 2024.
-
Compact object populations over cosmic time I. BOSSA: a Binary Object environment-Sensitive Sampling Algorithm
Authors:
L. M. de Sá,
A. Bernardo,
L. S. Rocha,
R. R. A. Bachega,
J. E. Horvath
Abstract:
Binary population synthesis (BPS) is an essential tool for extracting information about massive binary evolution from gravitational-wave (GW) detections of compact object mergers. It has been successfully used to constrain the most likely permutations of evolution models among hundreds of alternatives, while initial condition models, in contrast, have not yet received the same level of attention. Here, we introduce BOSSA, a detailed initial sampling code including a set of 192 initial condition permutations for BPS that capture both "invariant" and "varying" models, the latter accounting for a possible metallicity- and star formation rate (SFR)-dependence of the initial mass function (IMF); as well as correlations between the initial primary mass, orbital period, mass ratio and eccentricity of binaries. We include 24 metallicity-specific cosmic star formation history (cSFH) models and propose two alternate models for the mass-dependent binary fraction. We build a detailed pipeline for time-evolving BPS, such that each binary has well-defined initial conditions, and we are able to distinguish the contributions from populations of different ages. We discuss the meaning of the IMF for binaries and introduce a refined initial sampling procedure for component masses. We also discuss the treatment of higher-order multiple systems when normalizing a binary sample. In particular, we argue for how a consistent interpretation of the IMF implies that this is not the distribution from which any set of component masses should be independently drawn, and show how the individual IMF of primaries and companions is expected to deviate from the full IMF.
Submitted 15 October, 2024;
originally announced October 2024.
-
Compact object populations over cosmic time II. Compact object merger rates and masses over redshift from varying initial conditions
Authors:
L. M. de Sá,
L. S. Rocha,
A. Bernardo,
R. R. A. Bachega,
J. E. Horvath
Abstract:
We perform a first study of the impact of varying two components of the initial conditions in binary population synthesis of compact binary mergers - the initial mass function, which is made metallicity- and star formation rate-dependent, and the orbital parameter (orbital period, mass ratio and eccentricity) distributions, which are assumed to be correlated - within a larger grid of initial condition models also including alternatives for the primary mass-dependent binary fraction and the metallicity-specific cosmic star formation history. We generate the initial populations with the sampling code BOSSA and evolve them with the rapid population synthesis code COMPAS. We find strong suggestions that the main role of initial conditions models is to set the relative weights of key features defined by the evolution models. In the two models we compare, black hole-black hole (BHBH) mergers are the most strongly affected, which we connect to a shift from the common envelope to the stable Roche lobe overflow formation channels with decreasing redshift. We also characterize variations in the black hole-neutron star (BHNS) and neutron star-neutron star (NSNS) final parameter distributions. We obtain the merger rate evolution for BHBH, BHNS and NSNS mergers up to $z=10$, and find a variation by a factor of $\sim50-60$ in the local BHBH and BHNS merger rates, suggesting a more important contribution from initial conditions than previously thought, and calling for a complete exploration of the initial conditions model permutations.
Submitted 2 October, 2024;
originally announced October 2024.
-
The track-length extension fitting algorithm for energy measurement of interacting particles in liquid argon TPCs and its performance with ProtoDUNE-SP data
Authors:
DUNE Collaboration,
A. Abed Abud,
B. Abi,
R. Acciarri,
M. A. Acero,
M. R. Adames,
G. Adamov,
M. Adamowski,
D. Adams,
M. Adinolfi,
C. Adriano,
A. Aduszkiewicz,
J. Aguilar,
F. Akbar,
N. S. Alex,
K. Allison,
S. Alonso Monsalve,
M. Alrashed,
A. Alton,
R. Alvarez,
T. Alves,
H. Amar,
P. Amedo,
J. Anderson,
C. Andreopoulos
, et al. (1348 additional authors not shown)
Abstract:
This paper introduces a novel track-length extension fitting algorithm for measuring the kinetic energies of inelastically interacting particles in liquid argon time projection chambers (LArTPCs). The algorithm finds the most probable offset in track length for a track-like object by comparing the measured ionization density as a function of position with a theoretical prediction of the energy loss as a function of the energy, including models of electron recombination and detector response. The algorithm can be used to measure the energies of particles that interact before they stop, such as charged pions that are absorbed by argon nuclei. The algorithm's energy measurement resolutions and fractional biases are presented as functions of particle kinetic energy and number of track hits using samples of stopping secondary charged pions in data collected by the ProtoDUNE-SP detector, and also in a detailed simulation. Additional studies describe the impact of the dE/dx model on energy measurement performance. The method described in this paper to characterize the energy measurement performance can be repeated in any LArTPC experiment using stopping secondary charged pions.
Submitted 26 December, 2024; v1 submitted 26 September, 2024;
originally announced September 2024.
-
Critical Node Detection in Temporal Social Networks, Based on Global and Semi-local Centrality Measures
Authors:
Zahra Farahi,
Ali Kamandi,
Rooholah Abedian,
Luis Enrique Correa Rocha
Abstract:
Nodes that play strategic roles in networks are called critical or influential nodes. For example, in an epidemic, we can control the infection spread by isolating critical nodes; in marketing, we can use certain nodes as the initial spreaders aiming to reach the largest part of the network, or they can be selected for removal in targeted attacks to maximise the fragmentation of the network. In this study, we focus on critical node detection in temporal networks. We propose three new measures to identify the critical nodes in temporal networks: the temporal supracycle ratio, temporal semi-local integration, and temporal semi-local centrality. We analyse the performance of these measures based on their effect on the SIR epidemic model in three scenarios: isolating the influential nodes when an epidemic happens, using the influential nodes as seeds of the epidemic, or removing them to analyse the robustness of the network. We compare the results with existing centrality measures, particularly temporal betweenness, temporal centrality, and temporal degree deviation. The results show that the introduced measures help identify influential nodes more accurately. The proposed methods can be used to detect nodes that need to be isolated to reduce the spread of an epidemic or as initial nodes to speed up the dissemination of information.
Submitted 23 September, 2024;
originally announced September 2024.
-
DUNE Phase II: Scientific Opportunities, Detector Concepts, Technological Solutions
Authors:
DUNE Collaboration,
A. Abed Abud,
B. Abi,
R. Acciarri,
M. A. Acero,
M. R. Adames,
G. Adamov,
M. Adamowski,
D. Adams,
M. Adinolfi,
C. Adriano,
A. Aduszkiewicz,
J. Aguilar,
F. Akbar,
K. Allison,
S. Alonso Monsalve,
M. Alrashed,
A. Alton,
R. Alvarez,
T. Alves,
H. Amar,
P. Amedo,
J. Anderson,
C. Andreopoulos,
M. Andreotti
, et al. (1347 additional authors not shown)
Abstract:
The international collaboration designing and constructing the Deep Underground Neutrino Experiment (DUNE) at the Long-Baseline Neutrino Facility (LBNF) has developed a two-phase strategy toward the implementation of this leading-edge, large-scale science project. The 2023 report of the US Particle Physics Project Prioritization Panel (P5) reaffirmed this vision and strongly endorsed DUNE Phase I and Phase II, as did the European Strategy for Particle Physics. While the construction of the DUNE Phase I is well underway, this White Paper focuses on DUNE Phase II planning. DUNE Phase-II consists of a third and fourth far detector (FD) module, an upgraded near detector complex, and an enhanced 2.1 MW beam. The fourth FD module is conceived as a "Module of Opportunity", aimed at expanding the physics opportunities, in addition to supporting the core DUNE science program, with more advanced technologies. This document highlights the increased science opportunities offered by the DUNE Phase II near and far detectors, including long-baseline neutrino oscillation physics, neutrino astrophysics, and physics beyond the standard model. It describes the DUNE Phase II near and far detector technologies and detector design concepts that are currently under consideration. A summary of key R&D goals and prototyping phases needed to realize the Phase II detector technical designs is also provided. DUNE's Phase II detectors, along with the increased beam power, will complete the full scope of DUNE, enabling a multi-decadal program of groundbreaking science with neutrinos.
Submitted 22 August, 2024;
originally announced August 2024.
-
A new formulation for the collection and delivery problem of biomedical specimen
Authors:
Luis Aurelio Rocha,
Alena Otto,
Marc Goerigk
Abstract:
We study the collection and delivery problem of biomedical specimens (CDSP) with multiple trips, time windows, a homogeneous fleet, and the objective of minimizing total completion time of delivery requests. This is a prominent problem in healthcare logistics, where specimens (blood, plasma, urine, etc.) collected from patients in doctor's offices and hospitals are transported to a central laboratory for advanced analysis. To the best of our knowledge, available exact solution approaches for CDSP have been able to solve only small instances with up to 9 delivery requests. In this paper, we propose a two-index mixed-integer programming formulation that, when used with an off-the-shelf solver, results in a fast exact solution approach. Computational experiments on a benchmark data set confirm that the proposed formulation outperforms both the state-of-the-art model and the state-of-the-art metaheuristic from the literature, solving 80 out of 168 benchmark instances to optimality, including a significant number of instances with 100 delivery requests.
Submitted 19 August, 2024;
originally announced August 2024.
-
A Strategy to Combine 1stGen Transformers and Open LLMs for Automatic Text Classification
Authors:
Claudio M. V. de Andrade,
Washington Cunha,
Davi Reis,
Adriana Silvina Pagano,
Leonardo Rocha,
Marcos André Gonçalves
Abstract:
Transformer models have achieved state-of-the-art results, with Large Language Models (LLMs), an evolution of first-generation transformers (1stTR), being considered the cutting edge in several NLP tasks. However, the literature has yet to conclusively demonstrate that LLMs consistently outperform 1stTRs across all NLP tasks. This study compares three 1stTRs (BERT, RoBERTa, and BART) with two open LLMs (Llama 2 and Bloom) across 11 sentiment analysis datasets. The results indicate that open LLMs may moderately outperform or match 1stTRs in 8 out of 11 datasets but only when fine-tuned. Given this substantial cost for only moderate gains, the practical applicability of these models in cost-sensitive scenarios is questionable. In this context, a confidence-based strategy that seamlessly integrates 1stTRs with open LLMs based on prediction certainty is proposed. High-confidence documents are classified by the more cost-effective 1stTRs, while uncertain cases are handled by LLMs in zero-shot or few-shot modes, at a much lower cost than fine-tuned versions. Experiments in sentiment analysis demonstrate that our solution not only outperforms 1stTRs, zero-shot, and few-shot LLMs but also competes closely with fine-tuned LLMs at a fraction of the cost.
Submitted 18 August, 2024;
originally announced August 2024.
-
PATopics: An automatic framework to extract useful information from pharmaceutical patents documents
Authors:
Pablo Cecilio,
Antônio Perreira,
Juliana Santos Rosa Viegas,
Washington Cunha,
Felipe Viegas,
Elisa Tuler,
Fabiana Testa Moura de Carvalho Vicentini,
Leonardo Rocha
Abstract:
Pharmaceutical patents play an important role by protecting innovation from copying, and they also drive researchers to innovate, create new products, and promote disruptive innovations focusing on collective health. The study of patent management usually relies on an exhaustive manual search. This happens because patent documents are complex, with many details regarding the claims and the explanation of the invention's methodology and results. To mitigate the manual search, we proposed PATopics, a framework specially designed to extract relevant information from pharmaceutical patents. PATopics is composed of four building blocks that extract textual information from the patents, build relevant topics that are capable of summarizing the patents, correlate these topics with useful patent characteristics, and then summarize the information in a friendly web interface for end users. The general contributions of PATopics are its ability to centralize patents and to manage patents into groups based on their similarities. We extensively analyzed the framework using 4,832 pharmaceutical patents concerning 809 molecules patented by 478 companies. In our analysis, we evaluate the use of the framework considering the demands of three user profiles -- researchers, chemists, and companies. We also designed four real-world use cases to evaluate the framework's applicability. Our analysis showed how practical and helpful PATopics is in the pharmaceutical scenario.
Submitted 12 August, 2024;
originally announced August 2024.
-
First Measurement of the Total Inelastic Cross-Section of Positively-Charged Kaons on Argon at Energies Between 5.0 and 7.5 GeV
Authors:
DUNE Collaboration,
A. Abed Abud,
B. Abi,
R. Acciarri,
M. A. Acero,
M. R. Adames,
G. Adamov,
M. Adamowski,
D. Adams,
M. Adinolfi,
C. Adriano,
A. Aduszkiewicz,
J. Aguilar,
F. Akbar,
K. Allison,
S. Alonso Monsalve,
M. Alrashed,
A. Alton,
R. Alvarez,
T. Alves,
H. Amar,
P. Amedo,
J. Anderson,
C. Andreopoulos,
M. Andreotti
, et al. (1341 additional authors not shown)
Abstract:
ProtoDUNE Single-Phase (ProtoDUNE-SP) is a 770-ton liquid argon time projection chamber that operated in a hadron test beam at the CERN Neutrino Platform in 2018. We present a measurement of the total inelastic cross section of charged kaons on argon as a function of kaon energy using 6 and 7 GeV/$c$ beam momentum settings. The flux-weighted average of the extracted inelastic cross section at each beam momentum setting was measured to be 380$\pm$26 mbarns for the 6 GeV/$c$ setting and 379$\pm$35 mbarns for the 7 GeV/$c$ setting.
Submitted 1 August, 2024;
originally announced August 2024.
-
A Novel Two-Step Fine-Tuning Pipeline for Cold-Start Active Learning in Text Classification Tasks
Authors:
Fabiano Belém,
Washington Cunha,
Celso França,
Claudio Andrade,
Leonardo Rocha,
Marcos André Gonçalves
Abstract:
This is the first work to investigate the effectiveness of BERT-based contextual embeddings in active learning (AL) tasks on cold-start scenarios, where traditional fine-tuning is infeasible due to the absence of labeled data. Our primary contribution is the proposal of a more robust fine-tuning pipeline - DoTCAL - that diminishes the reliance on labeled data in AL using two steps: (1) fully leveraging unlabeled data through domain adaptation of the embeddings via masked language modeling and (2) further adjusting model weights using labeled data selected by AL. Our evaluation contrasts BERT-based embeddings with other prevalent text representation paradigms, including Bag of Words (BoW), Latent Semantic Indexing (LSI), and FastText, at two critical stages of the AL process: instance selection and classification. Experiments conducted on eight ATC benchmarks with varying AL budgets (number of labeled instances) and number of instances (about 5,000 to 300,000) demonstrate DoTCAL's superior effectiveness, achieving up to a 33% improvement in Macro-F1 while reducing labeling efforts by half compared to the traditional one-step method. We also found that in several tasks, BoW and LSI (due to information aggregation) produce results superior (up to 59%) to BERT, especially in low-budget scenarios and hard-to-classify tasks, which is quite surprising.
Submitted 24 July, 2024;
originally announced July 2024.
-
Supernova Pointing Capabilities of DUNE
Authors:
DUNE Collaboration,
A. Abed Abud,
B. Abi,
R. Acciarri,
M. A. Acero,
M. R. Adames,
G. Adamov,
M. Adamowski,
D. Adams,
M. Adinolfi,
C. Adriano,
A. Aduszkiewicz,
J. Aguilar,
B. Aimard,
F. Akbar,
K. Allison,
S. Alonso Monsalve,
M. Alrashed,
A. Alton,
R. Alvarez,
T. Alves,
H. Amar,
P. Amedo,
J. Anderson,
D. A. Andrade
, et al. (1340 additional authors not shown)
Abstract:
The determination of the direction of a stellar core collapse via its neutrino emission is crucial for the identification of the progenitor for a multimessenger follow-up. A highly effective method of reconstructing supernova directions within the Deep Underground Neutrino Experiment (DUNE) is introduced. The supernova neutrino pointing resolution is studied by simulating and reconstructing electron-neutrino charged-current absorption on $^{40}$Ar and elastic scattering of neutrinos on electrons. Procedures to reconstruct individual interactions, including a newly developed technique called "brems flipping", as well as the burst direction from an ensemble of interactions are described. Performance of the burst direction reconstruction is evaluated for supernovae happening at a distance of 10 kpc for a specific supernova burst flux model. The pointing resolution is found to be 3.4 degrees at 68% coverage for a perfect interaction-channel classification and a fiducial mass of 40 kton, and 6.6 degrees for a 10 kton fiducial mass respectively. Assuming a 4% rate of charged-current interactions being misidentified as elastic scattering, DUNE's burst pointing resolution is found to be 4.3 degrees (8.7 degrees) at 68% coverage.
Submitted 14 July, 2024;
originally announced July 2024.
-
Challenges and opportunities for digital twins in precision medicine: a complex systems perspective
Authors:
Manlio De Domenico,
Luca Allegri,
Guido Caldarelli,
Valeria d'Andrea,
Barbara Di Camillo,
Luis M. Rocha,
Jordan Rozum,
Riccardo Sbarbati,
Francesco Zambelli
Abstract:
The adoption of digital twins (DTs) in precision medicine is increasingly viable, propelled by extensive data collection and advancements in artificial intelligence (AI), alongside traditional biomedical methodologies. However, the reliance on black-box predictive models, which utilize large datasets, presents limitations that could impede the broader application of DTs in clinical settings. We argue that hypothesis-driven generative models, particularly multiscale modeling, are essential for boosting the clinical accuracy and relevance of DTs, thereby making a significant impact on healthcare innovation. This paper explores the transformative potential of DTs in healthcare, emphasizing their capability to simulate complex, interdependent biological processes across multiple scales. By integrating generative models with extensive datasets, we propose a scenario-based modeling approach that enables the exploration of diverse therapeutic strategies, thus supporting dynamic clinical decision-making. This method not only leverages advancements in data science and big data for improving disease treatment and prevention but also incorporates insights from complex systems and network science, quantitative biology, and digital medicine, promising substantial advancements in patient care.
Submitted 15 May, 2024;
originally announced May 2024.
-
Refinement of an Epilepsy Dictionary through Human Annotation of Health-related posts on Instagram
Authors:
Aehong Min,
Xuan Wang,
Rion Brattig Correia,
Jordan Rozum,
Wendy R. Miller,
Luis M. Rocha
Abstract:
We used a dictionary built from biomedical terminology extracted from various sources such as DrugBank, MedDRA, MedlinePlus, TCMGeneDIT, to tag more than 8 million Instagram posts by users who have mentioned an epilepsy-relevant drug at least once, between 2010 and early 2016. A random sample of 1,771 posts with 2,947 term matches was evaluated by human annotators to identify false-positives. OpenAI's GPT series models were compared against human annotation. Frequent terms with a high false-positive rate were removed from the dictionary. Analysis of the estimated false-positive rates of the annotated terms revealed 8 ambiguous terms (plus synonyms) used in Instagram posts, which were removed from the original dictionary. To study the effect of removing those terms, we constructed knowledge networks using the refined and the original dictionaries and performed an eigenvector-centrality analysis on both networks. We show that the refined dictionary thus produced leads to a significantly different rank of important terms, as measured by their eigenvector-centrality of the knowledge networks. Furthermore, the most important terms obtained after refinement are of greater medical relevance. In addition, we show that OpenAI's GPT series models fare worse than human annotators in this task.
Submitted 14 May, 2024;
originally announced May 2024.
-
CANA v1.0.0 and schematodes: efficient quantification of symmetry in Boolean automata
Authors:
Austin M. Marcus,
Jordan Rozum,
Herbert Sizek,
Luis M. Rocha
Abstract:
The biomolecular networks underpinning cell function exhibit canalization, or the buffering of fluctuations required to function in a noisy environment. One understudied putative mechanism for canalization is the functional equivalence of a biomolecular entity's regulators (e.g., among the transcription factors for a gene). In these discrete dynamical systems, activation and inhibition of biomolecular entities (e.g., transcription of genes) are modeled as the activity of coupled 2-state automata, and thus the equivalence of regulators can be studied using the theory of symmetry in discrete functions. To this end, we present a new exact algorithm for finding maximal symmetry groups among the inputs to discrete functions. We implement this algorithm in Rust as a Python package, schematodes. We include schematodes in the new CANA v1.0.0 release, an open source Python library for analyzing canalization in Boolean networks, which we also present here. We compare our exact method implemented in schematodes to the previously published inexact method used in earlier releases of CANA and find that schematodes significantly outperforms the prior method both in speed and accuracy. We also apply CANA v1.0.0 to study the symmetry properties of regulatory function from an ensemble of experimentally-supported Boolean networks from the Cell Collective. Using CANA v1.0.0, we find that the distribution of a previously reported symmetry parameter, $k_s/k$, is statistically significantly different in the Cell Collective than in random automata with the same in-degree and activation bias (Kolmogorov-Smirnov test, $p<0.001$). In particular, its spread is much wider than in our null model (IQR 0.31 vs IQR 0.20 with equal medians), demonstrating that the Cell Collective is enriched in functions with extreme symmetry or asymmetry.
Submitted 11 May, 2024;
originally announced May 2024.
-
Focused digital cohort selection from social media using the metric backbone of biomedical knowledge graphs
Authors:
Ziqi Guo,
Jack Felag,
Jordan C. Rozum,
Rion Brattig Correia,
Xuan Wang,
Luis M. Rocha
Abstract:
Social media data allows researchers to construct large digital cohorts to study the interplay between human behavior and medical treatment. Identifying the users most relevant to a specific health problem is, however, a challenge in that social media sites vary in the generality of their discourse. To filter relevant users on any social media, we have developed a general method and tested it on epilepsy discourse. We analyzed the text from posts by users who mention epilepsy drugs at least once in the general-purpose social media sites X and Instagram, the epilepsy-focused Reddit subgroup (r/Epilepsy), and the Epilepsy Foundation of America (EFA) forums. We used a curated medical terminology dictionary to generate a knowledge graph (KG) from each social media site, whereby nodes represent terms, and edge weights denote the strength of association between pairs of terms in the collected text. Our method is based on computing the metric backbone of each KG, which yields the subgraph of edges that participate in shortest paths. By comparing the subset of users who contribute to the backbone to the subset who do not, we show that epilepsy-focused social media users contribute to the KG backbone in much higher proportion than do general-purpose social media users. Furthermore, using human annotation of Instagram posts, we demonstrate that users who do not contribute to the backbone are much more likely to use dictionary terms in a manner inconsistent with their biomedical meaning and are rightly excluded from the cohort of interest.
Submitted 26 May, 2025; v1 submitted 11 May, 2024;
originally announced May 2024.
-
Comparative analysis of graph randomization: Tools, methods, pitfalls, and best practices
Authors:
Bart De Clerck,
Filip Van Utterbeeck,
Luis E. C. Rocha
Abstract:
Graph randomization techniques play a crucial role in network analysis, allowing researchers to assess the statistical significance of observed network properties and distinguish meaningful patterns from random fluctuations. In this survey we provide an overview of the graph randomization methods available in the most popular software tools for network analysis. We propose a comparative analysis of popular software tools to highlight their functionalities and limitations. Through case studies involving diverse graph types, we demonstrate how different randomization methods can lead to divergent conclusions, emphasizing the importance of careful method selection based on the characteristics of the observed network and the research question at hand. This survey proposes some guidelines for researchers and practitioners seeking to understand and utilize graph randomization techniques effectively in their network analysis projects.
Submitted 8 May, 2024;
originally announced May 2024.
-
myAURA: Personalized health library for epilepsy management via knowledge graph sparsification and visualization
Authors:
Rion Brattig Correia,
Jordan C. Rozum,
Leonard Cross,
Jack Felag,
Michael Gallant,
Ziqi Guo,
Bruce W. Herr II,
Aehong Min,
Deborah Stungis Rocha,
Xuan Wang,
Katy Börner,
Wendy Miller,
Luis M. Rocha
Abstract:
Objective: We report the development of the patient-centered myAURA application and suite of methods designed to aid epilepsy patients, caregivers, and researchers in making decisions about care and self-management.
Materials and Methods: myAURA rests on the federation of an unprecedented collection of heterogeneous data resources relevant to epilepsy, such as biomedical databases, social media, and electronic health records. A generalizable, open-source methodology was developed to compute a multi-layer knowledge graph linking all this heterogeneous data via the terms of a human-centered biomedical dictionary.
Results: The power of the approach is first exemplified in the study of the drug-drug interaction phenomenon. Furthermore, we employ a novel network sparsification methodology using the metric backbone of weighted graphs, which reveals the most important edges for inference, recommendation, and visualization, such as pharmacology factors patients discuss on social media. The network sparsification approach also allows us to extract focused digital cohorts from social media whose discourse is more relevant to epilepsy or other biomedical problems. Finally, we present our patient-centered design and pilot-testing of myAURA, including its user interface, based on focus groups and other stakeholder input.
Discussion: The ability to search and explore myAURA's heterogeneous data sources via a sparsified multi-layer knowledge graph, as well as the combination of those layers in a single map, are useful features for integrating relevant information for epilepsy.
Conclusion: Our stakeholder-driven, scalable approach to integrating traditional and non-traditional data sources enables biomedical discovery and data-powered patient self-management in epilepsy, and is generalizable to other chronic conditions.
Submitted 10 May, 2024; v1 submitted 8 May, 2024;
originally announced May 2024.
-
Epidemic risk perception and social interactions lead to awareness cascades on multiplex networks
Authors:
Tim Van Wesemael,
Luis E. C. Rocha,
Jan M. Baetens
Abstract:
The course of an epidemic is not only shaped by infection transmission over face-to-face contacts, but also by preventive behaviour caused by risk perception and social interactions. This study explores the dynamics of coupled awareness and biological infection spread within a two-layer multiplex network framework. One layer embodies face-to-face contacts, with a biological infection transmission following a simple contagion model, the SIR process. Awareness, modelled by the linear threshold model, a complex contagion, spreads over a social layer and induces behaviour that lowers the chance of a biological infection occurring. It may be provoked by the presence of either aware or infectious neighbours. We introduce a novel model combining these influences through a convex combination, creating a continuum between pure social contagion and local risk perception. Simulation of the model shows distinct effects arising from the awareness sources. Also, for convex combinations where both input sources are of importance, awareness cascades emerge that are not attributable to only one of these sources. Under these conditions, the combination of a small-world face-to-face layer and a scale-free social layer, but not vice versa, causes the extent of the infections to decrease with increasing transmission probability.
Submitted 19 February, 2025; v1 submitted 25 April, 2024;
originally announced April 2024.
-
Social clustering reinforces external influence on the majority opinion model
Authors:
Niels Van Santen,
Jan Ryckebusch,
Luis E. C. Rocha
Abstract:
Public opinion is subject to peer interaction via social networks and external pressure from the media, advertising, and other actors. In this paper, we study the interaction between external and peer influence on the stochastic opinion dynamics of a majority vote model. We introduce a model where agents update their opinions based on the combined influence of their local neighbourhood (peers) and an external actor in the transition rates. In the first model, the external influence is only felt by agents non-aligned with the external actor ("push strategy"). In the second model, agents are affected by external influence, independently of their opinions ("nudging strategy"). In both cases, the external influence increases the possible macroscopic outcomes. These outcomes are determined by the chosen influence strategy. We also find that the social network structure affects the opinion dynamics, with social clustering positively reinforcing the external influence whereas degree heterogeneity weakens the external forces. These findings are relevant to businesses and policy making, helping to understand how groups of individuals collectively react to external actors.
Submitted 30 July, 2024; v1 submitted 3 April, 2024;
originally announced April 2024.
-
On the potential of quantum walks for modeling financial return distributions
Authors:
Stijn De Backer,
Luis E. C. Rocha,
Jan Ryckebusch,
Koen Schoors
Abstract:
Accurate modeling of the temporal evolution of asset prices is crucial for understanding financial markets. We explore the potential of discrete-time quantum walks to model the evolution of asset prices. Return distributions obtained from a model based on the quantum walk algorithm are compared with those obtained from classical methodologies. We focus on specific limitations of the classical models, and illustrate that the quantum walk model possesses great flexibility in overcoming these. This includes the potential to generate asymmetric return distributions with complex market tendencies and higher probabilities for extreme events than in some of the classical models. Furthermore, the temporal evolution in the quantum walk possesses the potential to provide asset price dynamics.
Submitted 4 December, 2024; v1 submitted 28 March, 2024;
originally announced March 2024.
-
The ultrametric backbone is the union of all minimum spanning forests
Authors:
Jordan C Rozum,
Luis M Rocha
Abstract:
Minimum spanning trees and forests are powerful sparsification techniques that remove cycles from weighted graphs to minimize total edge weight while preserving node connectivity. They have applications in computer science, network science, and graph theory. Despite their utility and ubiquity, they have several limitations, including that they are only defined for undirected networks, they significantly alter dynamics on networks, and they do not generally preserve important network features such as shortest distances, shortest path distribution, and community structure. In contrast, distance backbones, which are subgraphs formed by all edges that obey a generalized triangle inequality, are well defined in both directed and undirected graphs and preserve those and other important network features. The backbone of a graph is defined with respect to a specified path-length operator that aggregates weights along a path to define its length, thereby associating a cost to indirect connections. The backbone is the union of all shortest paths between each pair of nodes according to the specified operator. One such operator, the max function, computes the length of a path as the largest weight of the edges that compose it (a weakest link criterion). It is the only operator that yields an algebraic structure for computing shortest paths that is consistent with De Morgan's laws. Applying this operator yields the ultrametric backbone of a graph in that (semi-triangular) edges whose weights are larger than the length of an indirect path connecting the same nodes (i.e., those that break the generalized triangle inequality based on max as a path-length operator) are removed. We show that the ultrametric backbone is the union of all minimum spanning forests in undirected graphs and provides a new generalization of minimum spanning trees to directed graphs.
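The max-based construction described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `ultrametric_backbone`, the edge-dictionary input format, and the use of a Floyd-Warshall style closure are assumptions made here for illustration, and only the undirected case is covered.

```python
from itertools import product


def ultrametric_backbone(n, edges):
    """Return the edges that survive when path length is the largest weight on the path.

    n     -- number of nodes, labelled 0..n-1 (hypothetical input format)
    edges -- dict mapping (u, v) to a weight, for an undirected graph
    """
    inf = float("inf")
    # d[i][j] will hold the smallest achievable "largest edge weight" over all i-j paths.
    d = [[inf] * n for _ in range(n)]
    for i in range(n):
        d[i][i] = 0.0
    for (u, v), w in edges.items():
        d[u][v] = min(d[u][v], w)
        d[v][u] = min(d[v][u], w)

    # Floyd-Warshall closure with (min, max) in place of (min, +).
    # product() varies its first element slowest, so k is the outer loop as required.
    for k, i, j in product(range(n), repeat=3):
        via_k = max(d[i][k], d[k][j])
        if via_k < d[i][j]:
            d[i][j] = via_k

    # An edge is kept unless some indirect path has a strictly smaller max-length.
    return {(u, v): w for (u, v), w in edges.items() if w <= d[u][v]}


# Triangle in which the heaviest edge is bypassed by a path with a smaller bottleneck.
triangle = {(0, 1): 1.0, (1, 2): 2.0, (0, 2): 3.0}
print(ultrametric_backbone(3, triangle))  # {(0, 1): 1.0, (1, 2): 2.0}
```

In the triangle, the edge (0, 2) has weight 3 while the path 0-1-2 has max-length 2, so it is removed. If that edge instead had weight 2, it would tie with the indirect path and stay in the backbone, which matches the two minimum spanning trees that exist in that case.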
Submitted 22 March, 2024; v1 submitted 19 March, 2024;
originally announced March 2024.
-
Performance of a modular ton-scale pixel-readout liquid argon time projection chamber
Authors:
DUNE Collaboration,
A. Abed Abud,
B. Abi,
R. Acciarri,
M. A. Acero,
M. R. Adames,
G. Adamov,
M. Adamowski,
D. Adams,
M. Adinolfi,
C. Adriano,
A. Aduszkiewicz,
J. Aguilar,
B. Aimard,
F. Akbar,
K. Allison,
S. Alonso Monsalve,
M. Alrashed,
A. Alton,
R. Alvarez,
T. Alves,
H. Amar,
P. Amedo,
J. Anderson,
D. A. Andrade
, et al. (1340 additional authors not shown)
Abstract:
The Module-0 Demonstrator is a single-phase 600 kg liquid argon time projection chamber operated as a prototype for the DUNE liquid argon near detector. Based on the ArgonCube design concept, Module-0 features a novel 80k-channel pixelated charge readout and advanced high-coverage photon detection system. In this paper, we present an analysis of an eight-day data set consisting of 25 million cosmic ray events collected in the spring of 2021. We use this sample to demonstrate the imaging performance of the charge and light readout systems as well as the signal correlations between the two. We also report argon purity and detector uniformity measurements, and provide comparisons to detector simulations.
Submitted 5 March, 2024;
originally announced March 2024.
-
Evolution of long-period compact radio sources driven by winds
Authors:
J. E. Horvath,
Lucas M. de Sá,
Lívia S. Rocha,
Gustavo Y. Chinen,
Lucas G. Barão,
Márcio G. B. de Avellar
Abstract:
In this work, we address the nature and evolution of long-period compact star sources, a class that has recently gained several unexpected members. The central hypothesis is that particle winds drive their evolution, being an important factor for these relatively old sources. We show the consistency of this picture and remark on some unsolved problems and caveats within it.
Submitted 9 February, 2024;
originally announced February 2024.
-
Dynamic Q-planning for Online UAV Path Planning in Unknown and Complex Environments
Authors:
Lidia Gianne Souza da Rocha,
Kenny Anderson Queiroz Caldas,
Marco Henrique Terra,
Fabio Ramos,
Kelen Cristiane Teixeira Vivaldini
Abstract:
Unmanned Aerial Vehicles need an online path planning capability to complete high-risk missions safely in unknown and complex environments. However, many algorithms reported in the literature may not return reliable trajectories to solve online problems in these scenarios. The Q-Learning algorithm, a Reinforcement Learning technique, can generate trajectories in real time and has demonstrated fast and reliable results. This technique, however, has the disadvantage that the number of iterations must be defined in advance. If this value is not well chosen, the algorithm will either take a long time or fail to return an optimal trajectory. Therefore, we propose a method to dynamically choose the number of iterations to obtain the best performance of Q-Learning. The proposed method is compared to the Q-Learning algorithm with a fixed number of iterations, A*, Rapidly-exploring Random Tree, and Particle Swarm Optimization. As a result, the proposed Q-learning algorithm demonstrates the efficacy and reliability of online path planning with a dynamic number of iterations to carry out online missions in unknown and complex environments.
Submitted 9 February, 2024;
originally announced February 2024.
-
Doping Liquid Argon with Xenon in ProtoDUNE Single-Phase: Effects on Scintillation Light
Authors:
DUNE Collaboration,
A. Abed Abud,
B. Abi,
R. Acciarri,
M. A. Acero,
M. R. Adames,
G. Adamov,
M. Adamowski,
D. Adams,
M. Adinolfi,
C. Adriano,
A. Aduszkiewicz,
J. Aguilar,
B. Aimard,
F. Akbar,
K. Allison,
S. Alonso Monsalve,
M. Alrashed,
A. Alton,
R. Alvarez,
H. Amar Es-sghir,
P. Amedo,
J. Anderson,
D. A. Andrade,
C. Andreopoulos
, et al. (1297 additional authors not shown)
Abstract:
Doping of liquid argon TPCs (LArTPCs) with a small concentration of xenon is a technique for light-shifting and facilitates the detection of the liquid argon scintillation light. In this paper, we present the results of the first doping test ever performed in a kiloton-scale LArTPC. From February to May 2020, we carried out this special run in the single-phase DUNE Far Detector prototype (ProtoDUNE-SP) at CERN, featuring 720 t of total liquid argon mass with 410 t of fiducial mass. A 5.4 ppm nitrogen contamination was present during the xenon doping campaign. The goal of the run was to measure the light and charge response of the detector to the addition of xenon, up to a concentration of 18.8 ppm. The main purpose was to test the possibility for reduction of non-uniformities in light collection, caused by deployment of photon detectors only within the anode planes. Light collection was analysed as a function of the xenon concentration, by using the pre-existing photon detection system (PDS) of ProtoDUNE-SP and an additional smaller set-up installed specifically for this run. In this paper we first summarize our current understanding of the argon-xenon energy transfer process and the impact of the presence of nitrogen in argon with and without xenon dopant. We then describe the key elements of ProtoDUNE-SP and the injection method deployed. Two dedicated photon detectors were able to collect the light produced by xenon and the total light. The ratio of these components was measured to be about 0.65 as 18.8 ppm of xenon were injected. We performed studies of the collection efficiency as a function of the distance between tracks and light detectors, demonstrating enhanced uniformity of response for the anode-mounted PDS. We also show that xenon doping can substantially recover light losses due to contamination of the liquid argon by nitrogen.
Submitted 2 August, 2024; v1 submitted 2 February, 2024;
originally announced February 2024.
-
Mass Distribution and Maximum Mass of Neutron Stars: Effects of Orbital Inclination Angle
Authors:
Lívia S. Rocha,
Jorge E. Horvath,
Lucas M. de Sá,
Gustavo Y. Chinen,
Lucas G. Barão,
Marcio G. B. de Avellar
Abstract:
Matter at ultra-high densities finds a physical realization inside neutron stars. One key property is their maximum mass, which has far-reaching implications for astrophysics and the equation of state of ultra dense matter. In this work, we employ Bayesian analysis to scrutinize the mass distribution and maximum mass threshold of galactic neutron stars. We compare two distinct models to assess the impact of assuming a uniform distribution for the most important quantity, the cosine of orbital inclination angles ($i$), which has been a common practice in previous analyses. This prevailing assumption yields a maximum mass of $2.25$~$M_\odot$ (2.15--3.32~$M_\odot$ within $90\%$ confidence), with a strong peak around the maximum value. However, in the second model, which indirectly includes observational constraints of $i$, the analysis supports a mass limit of $2.56^{+0.87}_{-0.58}~M_\odot$ ($2σ$ uncertainty), a result that points in the same direction as some recent results gathered from gravitational wave observations, although their statistics are still limited. This work stresses the importance of an accurate treatment of orbital inclination angles, and contributes to the ongoing debate about the maximum neutron star mass, further emphasizing the critical role of uncertainties in the individual neutron star mass determinations.
Submitted 20 December, 2023;
originally announced December 2023.
-
The DUNE Far Detector Vertical Drift Technology, Technical Design Report
Authors:
DUNE Collaboration,
A. Abed Abud,
B. Abi,
R. Acciarri,
M. A. Acero,
M. R. Adames,
G. Adamov,
M. Adamowski,
D. Adams,
M. Adinolfi,
C. Adriano,
A. Aduszkiewicz,
J. Aguilar,
B. Aimard,
F. Akbar,
K. Allison,
S. Alonso Monsalve,
M. Alrashed,
A. Alton,
R. Alvarez,
H. Amar,
P. Amedo,
J. Anderson,
D. A. Andrade,
C. Andreopoulos
, et al. (1304 additional authors not shown)
Abstract:
DUNE is an international experiment dedicated to addressing some of the questions at the forefront of particle physics and astrophysics, including the mystifying preponderance of matter over antimatter in the early universe. The dual-site experiment will employ an intense neutrino beam focused on a near and a far detector as it aims to determine the neutrino mass hierarchy and to make high-precision measurements of the PMNS matrix parameters, including the CP-violating phase. It will also stand ready to observe supernova neutrino bursts, and seeks to observe nucleon decay as a signature of a grand unified theory underlying the standard model.
The DUNE far detector implements liquid argon time-projection chamber (LArTPC) technology, and combines the many tens-of-kiloton fiducial mass necessary for rare event searches with the sub-centimeter spatial resolution required to image those events with high precision. The addition of a photon detection system enhances physics capabilities for all DUNE physics drivers and opens prospects for further physics explorations. Given its size, the far detector will be implemented as a set of modules, with LArTPC designs that differ from one another as newer technologies arise.
In the vertical drift LArTPC design, a horizontal cathode bisects the detector, creating two stacked drift volumes in which ionization charges drift towards anodes at either the top or bottom. The anodes are composed of perforated PCB layers with conductive strips, enabling reconstruction in 3D. Light-trap-style photon detection modules are placed both on the cryostat's side walls and on the central cathode where they are optically powered.
This Technical Design Report describes in detail the technical implementations of each subsystem of this LArTPC that, together with the other far detector modules and the near detector, will enable DUNE to achieve its physics goals.
Submitted 5 December, 2023;
originally announced December 2023.
-
Semi-metric topology characterizes epidemic spreading on complex networks
Authors:
David Soriano Paños,
Felipe Xavier Costa,
Luis M. Rocha
Abstract:
Network sparsification represents an essential tool to extract the core of interactions sustaining both network dynamics and their connectedness. In the case of infectious diseases, network sparsification methods remove irrelevant connections to unveil the primary subgraph driving the unfolding of epidemic outbreaks in real networks. In this paper, we explore the features determining whether the metric backbone, a subgraph capturing the structure of shortest paths across a network, allows reconstructing epidemic outbreaks. We find that both the relative size of the metric backbone, capturing the fraction of edges kept in such structure, and the distortion of semi-metric edges, quantifying how far those edges not included in the metric backbone are from their associated shortest path, shape the retrieval of Susceptible-Infected (SI) dynamics. We propose a new method to progressively dismantle networks relying on the semi-metric edge distortion, removing first those connections farther from those included in the metric backbone, i.e. those with highest semi-metric distortion values. We apply our method in both synthetic and real networks, finding that semi-metric distortion provides solid ground to preserve spreading dynamics and connectedness while sparsifying networks.
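The dismantling order described in this abstract can be sketched in a few lines of Python. This is a rough illustration under stated assumptions, not the paper's code: the distortion of an edge is taken here to be its weight divided by the shortest-path distance between its endpoints (the abstract does not give the formula), edge weights are assumed to be positive distances on an undirected graph, and the names `shortest_distances`, `semi_metric_distortion`, and `removal_order` are hypothetical.

```python
import heapq


def shortest_distances(adj, source):
    """Dijkstra from `source`; adj maps node -> {neighbour: positive distance}."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, w in adj[u].items():
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


def semi_metric_distortion(edges):
    """Assumed definition: edge weight divided by shortest-path distance (1.0 on the backbone)."""
    adj = {}
    for (u, v), w in edges.items():
        adj.setdefault(u, {})[v] = w
        adj.setdefault(v, {})[u] = w
    dist = {u: shortest_distances(adj, u) for u in adj}
    return {(u, v): w / dist[u][v] for (u, v), w in edges.items()}


def removal_order(edges, tol=1e-12):
    """Semi-metric edges sorted so that the most distorted edge is removed first."""
    s = semi_metric_distortion(edges)
    semi_metric = [e for e in edges if s[e] > 1.0 + tol]
    return sorted(semi_metric, key=lambda e: s[e], reverse=True)


# Hypothetical toy network: the chain a-b-c-d plus two longer shortcut edges.
toy = {("a", "b"): 1.0, ("b", "c"): 1.0, ("c", "d"): 1.0,
       ("a", "c"): 3.0, ("a", "d"): 4.0}
print(removal_order(toy))  # [('a', 'c'), ('a', 'd')]
```

In the toy network, the chain edges form the metric backbone. The edge (a, c) has distortion 3/2 and (a, d) has distortion 4/3, so (a, c) would be removed first.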
Submitted 24 November, 2023;
originally announced November 2023.
-
Can Offline Metrics Measure Explanation Goals? A Comparative Survey Analysis of Offline Explanation Metrics in Recommender Systems
Authors:
André Levi Zanon,
Marcelo Garcia Manzato,
Leonardo Rocha
Abstract:
Explanations in a Recommender System (RS) provide reasons for recommendations to users and can enhance transparency, persuasiveness, engagement, and trust, which are known as explanation goals. Evaluating the effectiveness of explanation algorithms offline remains challenging due to subjectivity. Initially, we conducted a literature review on current offline metrics, revealing that algorithms are often assessed with anecdotal evidence, offering convincing examples, or with metrics that don't align with human perception. We investigated whether, in explanations connecting interacted and recommended items based on shared content, the selection of item attributes and interacted items affects explanation goals. Metrics measuring the diversity and popularity of attributes and the recency of item interactions were used to evaluate explanations from three state-of-the-art agnostic algorithms across six recommendation systems. These offline metrics were compared with results from an online user study. Our findings reveal a trade-off: transparency and trust relate to popular properties, while engagement and persuasiveness are linked to diversified properties. This study contributes to the development of more robust evaluation methods for explanation algorithms in recommender systems.
Submitted 14 April, 2025; v1 submitted 22 October, 2023;
originally announced October 2023.
-
TPDR: A Novel Two-Step Transformer-based Product and Class Description Match and Retrieval Method
Authors:
Washington Cunha,
Celso França,
Leonardo Rocha,
Marcos André Gonçalves
Abstract:
There is a niche of companies responsible for intermediating the purchase of large batches of varied products for other companies, for which the main challenge is to perform product description standardization, i.e., matching an item described by a client with a product described in a catalog. The problem is complex since the client's product description may be: (1) potentially noisy; (2) short and uninformative (e.g., missing information about model and size); and (3) cross-language. In this paper, we formalize this problem as a ranking task: given an initial client product specification (IS, the query), return the most appropriate standardized descriptions (SD, the response). We propose TPDR, a two-step Transformer-based Product and Class Description Retrieval method that is able to explore the semantic correspondence between IS and SD by exploiting attention mechanisms and contrastive learning. First, TPDR employs the transformers as two encoders sharing the embedding vector space: one for encoding the IS and another for the SD, in which corresponding pairs (IS, SD) must be close in the vector space. Closeness is further enforced by a contrastive learning mechanism leveraging a specialized loss function. TPDR also exploits a (second) re-ranking step based on syntactic features that are very important for the exact matching (model, dimension) of certain products and that may have been neglected by the transformers. To evaluate our proposal, we consider 11 datasets from a real company, covering different application contexts. Our solution was able to retrieve the correct standardized product before the 5th ranking position in 71% of the cases and its correct category in the first position in 80% of the situations. Moreover, the effectiveness gains over purely syntactic or semantic baselines reach up to 3.7 times, solving cases that none of the approaches can solve in isolation.
Submitted 5 October, 2023;
originally announced October 2023.
-
Evidence of fractal structures in hadrons
Authors:
Rafael P. Baptista,
Lucas Q. Rocha,
D. P. Menezes,
Luis A. Trevisan,
Constantino Tsallis,
Airton Deppman
Abstract:
This study focuses on the presence of (multi)fractal structures in confined hadronic matter through the momentum distributions of mesons produced in proton-proton collisions between 23 GeV and 63 GeV. The analysis demonstrates that the $q$-exponential behaviour of the particle momentum distributions is consistent with fractal characteristics, exhibiting fractal structures in confined hadronic matter with features similar to those observed in the deconfined quark-gluon plasma (QGP) regime. Furthermore, the systematic analysis of meson production in hadronic collisions at energies below 1 TeV suggests that specific fractal parameters are universal, independently of confinement or deconfinement, while others may be influenced by the quark content of the produced meson. These results pave the way for further research exploring the implications of fractal structures on various physical distributions and offer insights into the nature of the phase transition between confined and deconfined regimes.
Submitted 31 August, 2023;
originally announced August 2023.
-
Osmosis drives explosions and methane release in Siberian permafrost
Authors:
Ana M. O. Morgado,
Luis A. M. Rocha,
Julyan H. E. Cartwright,
Silvana S. S. Cardoso
Abstract:
Mysterious craters, with anomalously high concentrations of methane, have formed in the Yamal and Taymyr peninsulas of Siberia since 2014. While thawing permafrost owing to climate warming promotes methane releases, it is unknown how such release might be associated with explosion and crater formation. A significant volume of surface ice-melt water can migrate downward driven by osmotic pressure associated with a cryopeg, a lens of salty water below. Overpressure reached at depth may lead to the cracking of the soil and subsequent decomposition of methane hydrates, with implications for the climate.
Submitted 25 September, 2024; v1 submitted 11 August, 2023;
originally announced August 2023.
-
Fast but multi-partisan: Bursts of communication increase opinion diversity in the temporal Deffuant model
Authors:
Fatemeh Zarei,
Yerali Gandica,
Luis Enrique Correa Rocha
Abstract:
Human interactions create social networks forming the backbone of societies. Individuals adjust their opinions by exchanging information through social interactions. Two recurrent questions are whether social structures promote opinion polarisation or consensus in societies and whether polarisation can be avoided, particularly on social media. In this paper, we hypothesise that not only network structure but also the timings of social interactions regulate the emergence of opinion clusters. We devise a temporal version of the Deffuant opinion model where pairwise interactions follow temporal patterns and show that burstiness alone is sufficient to prevent both consensus and polarisation by promoting the reinforcement of local opinions. Individuals self-organise into a multi-partisan society due to network clustering, but the diversity of opinion clusters further increases with burstiness, particularly when individuals have low tolerance and prefer to adjust to similar peers. The emergent opinion landscape is well-balanced regarding clusters' size, with a small fraction of individuals converging to extreme opinions. We thus argue that polarisation is more likely to emerge in social media than in offline social networks because of the relatively low social clustering observed online. Counter-intuitively, strengthening online social networks by increasing social redundancy may be an avenue to reduce polarisation and promote opinion diversity.
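As a rough illustration of the opinion dynamics this abstract builds on, the sketch below implements the classical (non-temporal) Deffuant bounded-confidence update. It is not the authors' temporal model: pairs are drawn uniformly at random rather than from a network with bursty interaction times, and the names `deffuant_step`, `simulate`, `epsilon` (tolerance) and `mu` (convergence rate) are assumptions chosen for this example.

```python
import random


def deffuant_step(opinions, i, j, epsilon, mu=0.5):
    """Apply one classical Deffuant update to agents i and j, in place.

    The two agents move towards each other only if their opinions
    differ by less than the tolerance `epsilon`.
    Returns True if the opinions were updated, False otherwise.
    """
    xi, xj = opinions[i], opinions[j]
    if abs(xi - xj) < epsilon:
        opinions[i] = xi + mu * (xj - xi)
        opinions[j] = xj + mu * (xi - xj)
        return True
    return False


def simulate(n_agents=100, n_steps=10_000, epsilon=0.2, mu=0.5, seed=0):
    """Run the model on a well-mixed population (assumption: no network,
    no temporal patterns; interacting pairs are picked uniformly at random)."""
    rng = random.Random(seed)
    opinions = [rng.random() for _ in range(n_agents)]  # opinions in [0, 1)
    for _ in range(n_steps):
        i, j = rng.sample(range(n_agents), 2)  # two distinct agents
        deffuant_step(opinions, i, j, epsilon, mu)
    return opinions
```

Lower values of `epsilon` correspond to the low-tolerance regime mentioned in the abstract. Reproducing the paper's findings would additionally require a network of contacts and bursty inter-event times, neither of which is modelled here.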
Submitted 28 July, 2023;
originally announced July 2023.
-
A network-based strategy of price correlations for optimal cryptocurrency portfolios
Authors:
Ruixue Jing,
Luis Enrique Correa Rocha
Abstract:
A cryptocurrency is a digital asset maintained by a decentralised system using cryptography. Investors in this emerging digital market are exploring the profitability potential of portfolios in place of single coins. Portfolios are particularly useful given that price forecasting in such a volatile market is challenging. The crypto market is a self-organised complex system where the complex inter-dependencies between the cryptocurrencies may be exploited to understand the market dynamics and build efficient portfolios. In this letter, we use network methods to identify highly decorrelated cryptocurrencies and create diversified portfolios using the Markowitz Portfolio Theory, agnostic to future market behaviour. The performance of our network-based portfolios is optimal with 46 coins and superior to benchmarks up to an investment horizon of 14 days, reaching up to 1,066% average expected return within 1 day, with reasonable associated risks. We also show that popular cryptocurrencies are typically not included in the optimal portfolios. Past price correlations reduce risk and may improve the performance of crypto portfolios in comparison to methodologies based exclusively on price auto-correlations. Short-term crypto investments may be competitive with traditional high-risk investments such as the stock market or commodity market but call for caution given the high variability of prices.
Submitted 5 April, 2023;
originally announced April 2023.