-
Snakemaker: Seamlessly transforming ad-hoc analyses into sustainable Snakemake workflows with generative AI
Authors:
Marco Masera,
Alessandro Leone,
Johannes Köster,
Ivan Molineris
Abstract:
Reproducibility and sustainability present significant challenges in bioinformatics software development, where rapidly evolving tools and complex workflows often result in short-lived or difficult-to-adapt pipelines. This paper introduces Snakemaker, a tool that leverages generative AI to help researchers build sustainable data analysis pipelines by converting unstructured code into well-defined Snakemake workflows. Snakemaker non-invasively tracks the work performed in the terminal by the researcher, analyzes execution patterns, and generates Snakemake workflows that can be integrated into existing pipelines. Snakemaker also supports the transformation of monolithic IPython notebooks into modular Snakemake pipelines, resolving the global state of the notebook into discrete, file-based interactions between rules. An integrated chat assistant provides users with fine-grained control through natural language instructions. Snakemaker generates high-quality Snakemake workflows by adhering to best practices, including Conda environment tracking, generic rule generation and loop unrolling. By lowering the barrier between prototype and production-quality code, Snakemaker addresses a critical gap in computational reproducibility for bioinformatics research.
Submitted 26 April, 2025;
originally announced May 2025.
-
Supervised Time Series Classification for Anomaly Detection in Subsea Engineering
Authors:
Ergys Çokaj,
Halvor Snersrud Gustad,
Andrea Leone,
Per Thomas Moe,
Lasse Moldestad
Abstract:
Time series classification is of significant importance in monitoring structural systems. In this work, we investigate the use of supervised machine learning classification algorithms on simulated data based on a physical system with two states: Intact and Broken. We provide a comprehensive discussion of the preprocessing of temporal data, using measures of statistical dispersion and dimension reduction techniques. We present an intuitive baseline method and discuss its efficiency. We conclude with a comparison of the various methods based on different performance metrics, showing the advantage of using machine learning techniques as a tool in decision making.
Submitted 12 March, 2024;
originally announced March 2024.
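The abstract above mentions preprocessing with measures of statistical dispersion and an intuitive baseline method, without specifying either. As a rough illustration only, the sketch below assumes a baseline that thresholds a single dispersion feature per window; the feature choice, the midpoint threshold, and the synthetic data are all assumptions made here, not the authors' method.

```python
import random
import statistics


def dispersion_features(window):
    """Return two simple dispersion measures for one window of samples."""
    return {
        "std": statistics.pstdev(window),
        "range": max(window) - min(window),
    }


def fit_threshold(intact_windows, broken_windows, key="std"):
    """Place a threshold halfway between the class means of one feature.

    Also reports whether the Broken class has the larger feature value.
    """
    intact_mean = statistics.mean(
        [dispersion_features(w)[key] for w in intact_windows]
    )
    broken_mean = statistics.mean(
        [dispersion_features(w)[key] for w in broken_windows]
    )
    return (intact_mean + broken_mean) / 2.0, broken_mean > intact_mean


def classify(window, threshold, broken_is_higher, key="std"):
    """Label one window by comparing its feature value with the threshold."""
    value = dispersion_features(window)[key]
    is_broken = value > threshold if broken_is_higher else value < threshold
    return "Broken" if is_broken else "Intact"


# Synthetic data (assumption for illustration): the Broken state is noisier.
rng = random.Random(0)
intact = [[rng.gauss(0.0, 1.0) for _ in range(200)] for _ in range(20)]
broken = [[rng.gauss(0.0, 3.0) for _ in range(200)] for _ in range(20)]

threshold, broken_is_higher = fit_threshold(intact, broken)
print(classify(intact[0], threshold, broken_is_higher))
print(classify(broken[0], threshold, broken_is_higher))
```

A single-feature threshold of this kind is useful mainly as a reference point: any learned classifier should be compared against it on the same performance metrics.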
-
Neural networks for the approximation of Euler's elastica
Authors:
Elena Celledoni,
Ergys Çokaj,
Andrea Leone,
Sigrid Leyendecker,
Davide Murari,
Brynjulf Owren,
Rodrigo T. Sato Martín de Almagro,
Martina Stavole
Abstract:
Euler's elastica is a classical model of flexible slender structures, relevant in many industrial applications. Static equilibrium equations can be derived via a variational principle. The accurate approximation of solutions of this problem can be challenging due to nonlinearity and constraints. Here we present two neural network-based approaches for the simulation of Euler's elastica. Starting from a data set of solutions of the discretised static equilibria, we train the neural networks to produce solutions for unseen boundary conditions. We present a $\textit{discrete}$ approach learning discrete solutions from the discrete data. We then consider a $\textit{continuous}$ approach using the same training data set, but learning continuous solutions to the problem. We present numerical evidence that the proposed neural networks can effectively approximate configurations of the planar Euler's elastica for a range of different boundary conditions.
Submitted 4 June, 2024; v1 submitted 1 December, 2023;
originally announced December 2023.
-
Learning Hamiltonians of constrained mechanical systems
Authors:
Elena Celledoni,
Andrea Leone,
Davide Murari,
Brynjulf Owren
Abstract:
Recently, there has been an increasing interest in modelling and computation of physical systems with neural networks. Hamiltonian systems are an elegant and compact formalism in classical mechanics, where the dynamics is fully determined by one scalar function, the Hamiltonian. The solution trajectories are often constrained to evolve on a submanifold of a linear vector space. In this work, we propose new approaches for the accurate approximation of the Hamiltonian function of constrained mechanical systems given sample data information of their solutions. We focus on the importance of the preservation of the constraints in the learning strategy by using both explicit Lie group integrators and other classical schemes.
Submitted 27 June, 2022; v1 submitted 31 January, 2022;
originally announced January 2022.
-
Dynamics of the N-fold Pendulum in the framework of Lie Group Integrators
Authors:
Elena Celledoni,
Ergys Çokaj,
Andrea Leone,
Davide Murari,
Brynjulf Owren
Abstract:
Since their introduction, Lie group integrators have become a method of choice in many application areas. Various formulations of these integrators exist, and in this work we focus on Runge--Kutta--Munthe--Kaas methods. First, we briefly introduce this class of integrators, considering some of the practical aspects of their implementation, such as adaptive time stepping. We then present some mathematical background that allows us to apply them to some families of Lagrangian mechanical systems. We conclude with an application to a nontrivial mechanical system: the N-fold 3D pendulum.
Submitted 25 September, 2021;
originally announced September 2021.
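To make the idea of a Lie group integrator concrete, the sketch below implements the Lie-Euler method on SO(3), commonly presented as the simplest (first-order) scheme of Runge-Kutta-Munthe-Kaas type. It is not the N-fold pendulum setup from the abstract above: the rotation group, the angular velocity function `omega`, and the step size are assumptions chosen for illustration. The point it shows is that each update is a group multiplication, so the iterate stays a rotation matrix up to round-off.

```python
import math


def hat(w):
    """Map a vector in R^3 to the skew-symmetric matrix in so(3)."""
    x, y, z = w
    return [[0.0, -z, y], [z, 0.0, -x], [-y, x, 0.0]]


def matmul(a, b):
    """Product of two 3x3 matrices stored as nested lists."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
        for i in range(3)
    ]


def expm_so3(w):
    """Rodrigues' formula for exp(hat(w)), with t = |w|:

    exp(hat(w)) = I + (sin t / t) K + ((1 - cos t) / t^2) K^2, K = hat(w).
    """
    theta = math.sqrt(sum(c * c for c in w))
    K = hat(w)
    K2 = matmul(K, K)
    if theta < 1e-12:
        a, b = 1.0, 0.5  # limits of the two coefficients as t -> 0
    else:
        a = math.sin(theta) / theta
        b = (1.0 - math.cos(theta)) / theta ** 2
    return [
        [(1.0 if i == j else 0.0) + a * K[i][j] + b * K2[i][j] for j in range(3)]
        for i in range(3)
    ]


def lie_euler_step(R, omega, h):
    """One Lie-Euler step for R' = R hat(omega(R)): R_next = R exp(h hat(omega(R)))."""
    w = omega(R)
    return matmul(R, expm_so3([h * c for c in w]))


def orthogonality_error(R):
    """Largest absolute entry of R^T R - I."""
    Rt = [[R[j][i] for j in range(3)] for i in range(3)]
    P = matmul(Rt, R)
    return max(
        abs(P[i][j] - (1.0 if i == j else 0.0))
        for i in range(3)
        for j in range(3)
    )


def omega(R):
    """Hypothetical state-dependent angular velocity, for illustration only."""
    return [1.0, 0.5 * R[0][0], -0.3 * R[1][2]]


R = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
for _ in range(1000):
    R = lie_euler_step(R, omega, 0.01)

print(orthogonality_error(R))
```

A classical explicit Euler update `R + h * R * hat(omega)` would drift away from the group at every step; composing with the exponential avoids that without any projection.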
-
Lie Group integrators for mechanical systems
Authors:
Elena Celledoni,
Ergys Çokaj,
Andrea Leone,
Davide Murari,
Brynjulf Owren
Abstract:
Since they were introduced in the 1990s, Lie group integrators have become a method of choice in many application areas. These include multibody dynamics, shape analysis, data science, image registration and biophysical simulations. Two important classes of intrinsic Lie group integrators are the Runge--Kutta--Munthe--Kaas methods and the commutator free Lie group integrators.
We give a short introduction to these classes of methods. The Hamiltonian framework is attractive for many mechanical problems, and in particular we shall consider Lie group integrators for problems on cotangent bundles of Lie groups where a number of different formulations are possible. There is a natural symplectic structure on such manifolds and through variational principles one may derive symplectic Lie group integrators. We also consider the practical aspects of the implementation of Lie group integrators, such as adaptive time stepping. The theory is illustrated by applying the methods to two nontrivial applications in mechanics. One is the N-fold spherical pendulum where we introduce the restriction of the adjoint action of the group $SE(3)$ to $TS^2$, the tangent bundle of the two-dimensional sphere. Finally, we show how Lie group integrators can be applied to model the controlled path of a payload being transported by two rotors. This problem is modeled on $\mathbb{R}^6\times \left(SO(3)\times \mathfrak{so}(3)\right)^2\times (TS^2)^2$ and put in a format where Lie group integrators can be applied.
Submitted 14 October, 2021; v1 submitted 25 February, 2021;
originally announced February 2021.
-
Virtual Compton Scattering and the Generalized Polarizabilities of the Proton at Q^2=0.92 and 1.76 GeV^2
Authors:
H. Fonvieille,
G. Laveissiere,
N. Degrande,
S. Jaminion,
C. Jutier,
L. Todor,
R. Di Salvo,
L. Van Hoorebeke,
L. C. Alexa,
B. D. Anderson,
K. A. Aniol,
K. Arundell,
G. Audit,
L. Auerbach,
F. T. Baker,
M. Baylac,
J. Berthot,
P. Y. Bertin,
W. Bertozzi,
L. Bimbot,
W. U. Boeglin,
E. J. Brash,
V. Breton,
H. Breuer,
E. Burtin
, et al. (139 additional authors not shown)
Abstract:
Virtual Compton Scattering (VCS) on the proton has been studied at Jefferson Lab using the exclusive photon electroproduction reaction (e p --> e p gamma). This paper gives a detailed account of the analysis which has led to the determination of the structure functions P_LL-P_TT/epsilon and P_LT, and the electric and magnetic generalized polarizabilities (GPs) alpha_E(Q^2) and beta_M(Q^2) at values of the four-momentum transfer squared Q^2 = 0.92 and 1.76 GeV^2. These data, together with the results of VCS experiments at lower momenta, help build a coherent picture of the electric and magnetic GPs of the proton over the full measured Q^2-range, and point to their non-trivial behavior.
Submitted 28 June, 2012; v1 submitted 15 May, 2012;
originally announced May 2012.
-
Decomposition of variance in terms of conditional means
Authors:
Alessandro Figa' Talamanca,
Angelo Guerriero,
Alberto Leone,
Gian Piero Mignoli,
Enrico Rogora
Abstract:
We test against two different sets of data an apparently new approach to the analysis of the variance of a numerical variable which depends on qualitative characters. We suggest that this approach be used to complement other existing techniques to study the interdependence of the variables involved. According to our method the variance is expressed as a sum of orthogonal components, obtained as differences of conditional means, with respect to the qualitative characters. The resulting expression for the variance depends on the ordering in which the characters are considered. We suggest an algorithm which leads to an ordering which is deemed natural. The first set of data concerns the score achieved by a population of students, on an entrance examination, based on a multiple choice test with 30 questions. In this case the qualitative characters are dyadic and correspond to a correct or incorrect answer to each question. The second set of data concerns the delay in obtaining the degree for a population of graduates of Italian universities. The variance in this case is analyzed with respect to a set of seven specific qualitative characters of the population studied (gender, previous education, working condition, parents' educational level, field of study, etc.).
Submitted 3 October, 2007;
originally announced October 2007.
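One way to read the decomposition described in the abstract above is as a telescoping sum: condition on the first character, then on the first two, and so on, and take the mean squared difference between successive conditional means as each component. The sketch below follows that reading, which is an assumption made here rather than the authors' stated algorithm; the toy data are invented, and the ordering procedure the abstract calls natural is not reproduced.

```python
from collections import defaultdict


def conditional_means(rows, ys, k):
    """For each row, the mean of y over all rows sharing its first k characters."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for row, y in zip(rows, ys):
        key = tuple(row[:k])
        sums[key] += y
        counts[key] += 1
    return [sums[tuple(row[:k])] / counts[tuple(row[:k])] for row in rows]


def decompose_variance(rows, ys):
    """Split the population variance of ys into one component per character.

    Component k is the mean squared difference between the conditional mean
    given the first k characters and the one given the first k - 1.
    The residual is the variance left within the finest groups.
    """
    n = len(ys)
    previous = conditional_means(rows, ys, 0)  # k = 0 gives the overall mean
    components = []
    for k in range(1, len(rows[0]) + 1):
        current = conditional_means(rows, ys, k)
        components.append(
            sum((c - p) ** 2 for c, p in zip(current, previous)) / n
        )
        previous = current
    residual = sum((y - p) ** 2 for y, p in zip(ys, previous)) / n
    return components, residual


def reorder(rows, order):
    """Return the rows with their characters permuted according to `order`."""
    return [tuple(row[i] for i in order) for row in rows]


# Toy data (assumption): two dyadic characters (a, b) and a numerical score.
rows = [(0, 0), (0, 1), (1, 0), (1, 1), (1, 1), (0, 1)]
ys = [1.0, 2.0, 4.0, 7.0, 8.0, 2.0]

components_ab, residual_ab = decompose_variance(rows, ys)
components_ba, residual_ba = decompose_variance(reorder(rows, (1, 0)), ys)

print(components_ab, residual_ab)
print(components_ba, residual_ba)
```

In both orderings the components and the residual add up to the same total variance, but the share attributed to each character changes, which matches the abstract's remark that the expression depends on the ordering.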
-
Probing the DeltaNN component of 3He
Authors:
G. M. Huber,
G. J. Lolos,
E. J. Brash,
S. Dumalski,
F. Farzanpay,
M. Iurescu,
Z. Papandreou,
A. Shinozaki,
A. Weinerman,
T. Emura,
H. Hirosawa,
K. Niwa,
H. Yamashita,
K. Maeda,
T. Terasawa,
H. Yamazaki,
S. Endo,
K. Miyamoto,
Y. Sumi,
G. Garino,
K. Maruyama,
A. Leone,
R. Perrino,
T. Maki,
A. Sasaki
, et al. (1 additional authors not shown)
Abstract:
The 3He(gamma,pi^+/- p) reactions were measured simultaneously over a tagged photon energy range of 800<E_gamma<1120 MeV, well above the Delta resonance region. An analysis was performed to kinematically isolate Delta knockout events from conventional Delta photoproduction events, and a statistically significant excess of pi+p events was identified, consistent with Delta++ knockout. Two methods were used to estimate the DeltaNN probability in the 3He ground state, corresponding to the observed knockout cross section. The first gave a lower probability limit of 1.5+/-0.6+/-0.5%; the second yielded an upper limit of about 2.6%.
Submitted 14 April, 2000; v1 submitted 1 December, 1999;
originally announced December 1999.
-
Subthreshold rho^0 photoproduction on 3He
Authors:
TAGX Collaboration,
M. A. Kagarlis,
Z. Papandreou,
G. M. Huber,
G. J. Lolos,
A. Shinozaki,
E. J. Brash,
F. Farzanpay,
M. Iurescu,
A. Weinerman,
G. Garino,
K. Maruyama,
O. Konno,
K. Maeda,
T. Terasawa,
H. Yamazaki,
T. Emura,
H. Hirosawa,
K. Niwa,
H. Yamashita,
S. Endo,
K. Miyamoto,
Y. Sumi,
A. Leone,
R. Perrino
, et al. (3 additional authors not shown)
Abstract:
A large reduction of the rho^0 mass in the nuclear medium is reported, inferred from dipion photoproduction spectra in the 1 GeV region, for the reaction 3He(gamma,pi+ pi-)X with a 10% duty factor tagged-photon beam and the TAGX multi-particle spectrometer. The energy range covered (800 < E(gamma) < 1120 MeV) lies mostly below the free rho^0 production threshold, a region which is believed sensitive to modifications of light vector-meson properties at nuclear-matter densities. The rho^0 masses extracted from the MC fitting of the data, m*(rho^0) = 642 +/- 40, 669 +/- 32, and 682 +/- 56 MeV/c^2 for E(gamma) in the 800-880, 880-960, and 960-1040 MeV regions respectively, are independently corroborated by a measured, assumption-free, kinematical observable. This mass shift, far exceeding current mean-field driven theoretical predictions, may be suggestive of rho^0 decay within the range of the nucleonic field.
Submitted 23 November, 1998;
originally announced November 1998.