-
Engineering Supercomputing Platforms for Biomolecular Applications
Authors:
Robert Welch,
Charles Laughton,
Oliver Henrich,
Tom Burnley,
Daniel Cole,
Alan Real,
Sarah Harris,
James Gebbie-Rayet
Abstract:
A range of computational biology software (GROMACS, AMBER, NAMD, LAMMPS, OpenMM, Psi4 and RELION) was benchmarked on a representative selection of HPC hardware, including AMD EPYC 7742 CPU nodes, NVIDIA V100 and AMD MI250X GPU nodes, and an NVIDIA GH200 testbed. The raw performance, power efficiency and data storage requirements of the software were evaluated for each HPC facility, along with qualitative factors such as the user experience and software environment. It was found that the diversity of methods used within computational biology means that there is no single type of HPC hardware that can optimally run every type of HPC job, and that diverse hardware is the only way to properly support all methods. New hardware, such as AMD GPUs and NVIDIA AI chips, is mostly compatible with existing methods, but is also more labour-intensive to support. GPUs offer the most efficient way to run most computational biology tasks, though some tasks still require CPUs. A fast HPC node running molecular dynamics can produce around 10 GB of data per day; however, most facilities and research institutions lack short-term and long-term means to store this data. Finally, as the HPC landscape has become more complex, deploying software and keeping HPC systems online has become more difficult. This situation could be improved through hiring/training in DevOps practices, expanding the consortium model to provide greater support to HPC system administrators, and implementing build frameworks/containerisation/virtualisation tools to allow users to configure their own software environment, rather than relying on centralised software installations.
Submitted 18 June, 2025;
originally announced June 2025.
-
The longest branches in a non-Markovian phylogenetic tree
Authors:
Sergey Bocharov,
Simon C. Harris,
Bastien Mallein
Abstract:
Consider a Bellman--Harris-type branching process, in which individuals evolve independently of one another, giving birth after a random time $T$ to a random number $L$ of children. In this article, we study the asymptotic behaviour of the length of the longest branches of this branching process at time $t$, both pendant branches (corresponding to individuals still alive at time $t$) and interior branches (corresponding to individuals dead before time $t$).
Submitted 16 October, 2024; v1 submitted 3 October, 2024;
originally announced October 2024.
-
The need to implement FAIR principles in biomolecular simulations
Authors:
Rommie Amaro,
Johan Åqvist,
Ivet Bahar,
Federica Battistini,
Adam Bellaiche,
Daniel Beltran,
Philip C. Biggin,
Massimiliano Bonomi,
Gregory R. Bowman,
Richard Bryce,
Giovanni Bussi,
Paolo Carloni,
David Case,
Andrea Cavalli,
Chie-En A. Chang,
Thomas E. Cheatham III,
Margaret S. Cheung,
Cris Chipot,
Lillian T. Chong,
Preeti Choudhary,
Gerardo Andres Cisneros,
Cecilia Clementi,
Rosana Collepardo-Guevara,
Peter Coveney,
Roberto Covino
, et al. (103 additional authors not shown)
Abstract:
This letter illustrates the opinion of the molecular dynamics (MD) community on the need to adopt a new FAIR paradigm for the use of molecular simulations. It highlights the necessity of a collaborative effort to create, establish, and sustain a database that allows findability, accessibility, interoperability, and reusability of molecular dynamics simulation data. Such a development would democratize the field and significantly improve the impact of MD simulations on life science research. This will transform our working paradigm, pushing the field to a new frontier. We invite you to support our initiative at the MDDB community (https://mddbr.eu/community/). Now published as: Amaro, R.E., et al. The need to implement FAIR principles in biomolecular simulations. Nat Methods (2025). https://doi.org/10.1038/s41592-025-02635-0
Submitted 3 April, 2025; v1 submitted 23 July, 2024;
originally announced July 2024.
-
Long edges in Galton-Watson trees
Authors:
Sergey Bocharov,
Simon C. Harris
Abstract:
In this article, we establish a number of results concerning the limiting behaviour of the longest edges in the genealogical tree generated by a continuous-time Galton-Watson (GW) process. Separately, we consider the large time behaviour of the longest pendant edges, the longest (strictly) interior edges, and the longest of all the edges. These results extend the special case of long pendant edges of birth-death processes established in Bocharov, Harris, Kominek, Mooers, and Steel [1].
Submitted 30 August, 2023;
originally announced August 2023.
-
Development of models for predicting Torsade de Pointes cardiac arrhythmias using perceptron neural networks
Authors:
Mohsen Sharifi,
Dan Buzatu,
Stephen Harris,
Jon Wilkes
Abstract:
Blockage of some ion channels, in particular the hERG cardiac potassium channel, delays cardiac repolarization and can induce arrhythmia. In some cases it leads to a potentially life-threatening arrhythmia known as Torsade de Pointes (TdP). Therefore, recognizing drugs with TdP risk is essential. Candidate drugs that are determined not to cause cardiac ion channel blockage are more likely to pass successfully through phase II and III clinical trials (and preclinical work) and not be withdrawn even later from the marketplace due to cardiotoxic effects. The objective of the present study is to develop an SAR model that can be used as an early screen for torsadogenic (causing TdP arrhythmias) potential in drug candidates. The method is performed using descriptors composed of atomic NMR chemical shifts and corresponding interatomic distances, which are combined into a 3D abstract space matrix. The method is called 3D-SDAR (3-dimensional spectral data-activity relationship) and can be interrogated to identify molecular features responsible for the activity, which can in turn yield simplified hERG toxicophores. A dataset of 55 hERG potassium channel inhibitors collected from Kramer et al., consisting of 32 drugs with TdP risk and 23 with no TdP risk, was used for training the 3D-SDAR model. An ANN model with multilayer perceptron was used to define collinearities among the independent 3D-SDAR features. A composite model from 200 random iterations with 25% of the molecules in each case yielded the following figures of merit: training, 99.2%; internal test sets, 66.7%; external (blind validation) test set, 68.4%. In the external test set, 70.3% of positive TdP drugs were correctly predicted. Moreover, toxicophores were generated from TdP drugs. 3D-SDAR was successfully used to build a predictive model for torsadogenic and non-torsadogenic drugs.
Submitted 3 October, 2017;
originally announced October 2017.
-
An anti-symmetric exclusion process for two particles on an infinite 1D lattice
Authors:
Jonathan R Potts,
Stephen Harris,
Luca Giuggioli
Abstract:
A system of two biased, mutually exclusive random walkers on an infinite 1D lattice is studied whereby the intrinsic bias of one particle is equal and opposite to that of the other. The propagator for this system is solved exactly and expressions for the mean displacement and mean square displacement (MSD) are found. Depending on the nature of the intrinsic bias, the system's behaviour displays two regimes, characterised by (i) the particles moving towards each other and (ii) away from each other, both qualitatively different from the case of no bias. The continuous-space limit of the propagator is found and is shown to solve a Fokker-Planck equation for two biased, mutually exclusive Brownian particles with equal and opposite drift velocity.
Submitted 10 October, 2011; v1 submitted 11 July, 2011;
originally announced July 2011.
-
Brownian walkers within subdiffusing territorial boundaries
Authors:
Luca Giuggioli,
Jonathan R. Potts,
Stephen Harris
Abstract:
Inspired by the collective phenomenon of territorial emergence, whereby animals move and interact through the scent marks they deposit, we study the dynamics of a 1D Brownian walker in a random environment consisting of confining boundaries that are themselves diffusing anomalously. We show how to reduce, in certain parameter regimes, the non-Markovian, many-body problem of territoriality to the analytically tractable one-body problem studied here. The mean square displacement (MSD) of the 1D Brownian walker within subdiffusing boundaries is calculated exactly and generalizes well-known results for immobile boundaries. Furthermore, under certain conditions, if the boundary dynamics are strongly subdiffusive, we show the appearance of an interesting non-monotonicity in the time dependence of the MSD, giving rise to transient negative diffusion.
Submitted 1 June, 2011; v1 submitted 4 February, 2011;
originally announced February 2011.
-
Quantum-assisted biomolecular modelling
Authors:
Sarah Harris,
Vivien M. Kendon
Abstract:
Our understanding of the physics of biological molecules, such as proteins and DNA, is limited because the approximations we usually apply to model inert materials are not in general applicable to soft, chemically inhomogeneous systems. The configurational complexity of biomolecules means the entropic contribution to the free energy is a significant factor in their behaviour, requiring detailed dynamical calculations to fully evaluate. Computer simulations capable of taking all interatomic interactions into account are therefore vital. However, even with the best current supercomputing facilities, we are unable to capture enough of the most interesting aspects of their behaviour to properly understand how they work. This limits our ability to design new molecules to treat diseases, for example. Progress in biomolecular simulation depends crucially on increasing the computing power available. Faster classical computers are in the pipeline, but these provide only incremental improvements. Quantum computing offers the possibility of performing huge numbers of calculations in parallel, when it becomes available. We discuss the current open questions in biomolecular simulation and how these might be addressed using quantum computation, and speculate on the future importance of quantum-assisted biomolecular modelling.
Submitted 12 July, 2010;
originally announced July 2010.