-
FORTRESS: Frontier Risk Evaluation for National Security and Public Safety
Authors:
Christina Q. Knight,
Kaustubh Deshpande,
Ved Sirdeshmukh,
Meher Mankikar,
Scale Red Team,
SEAL Research Team,
Julian Michael
Abstract:
The rapid advancement of large language models (LLMs) introduces dual-use capabilities that could both threaten and bolster national security and public safety (NSPS). Models implement safeguards to protect against potential misuse relevant to NSPS and allow for benign users to receive helpful information. However, current benchmarks often fail to test safeguard robustness to potential NSPS risks…
▽ More
The rapid advancement of large language models (LLMs) introduces dual-use capabilities that could both threaten and bolster national security and public safety (NSPS). Models implement safeguards to protect against potential misuse relevant to NSPS and allow for benign users to receive helpful information. However, current benchmarks often fail to test safeguard robustness to potential NSPS risks in an objective, robust way. We introduce FORTRESS: 500 expert-crafted adversarial prompts with instance-based rubrics of 4-7 binary questions for automated evaluation across 3 domains (unclassified information only): Chemical, Biological, Radiological, Nuclear and Explosive (CBRNE), Political Violence & Terrorism, and Criminal & Financial Illicit Activities, with 10 total subcategories across these domains. Each prompt-rubric pair has a corresponding benign version to test for model over-refusals. This evaluation of frontier LLMs' safeguard robustness reveals varying trade-offs between potential risks and model usefulness: Claude-3.5-Sonnet demonstrates a low average risk score (ARS) (14.09 out of 100) but the highest over-refusal score (ORS) (21.8 out of 100), while Gemini 2.5 Pro shows low over-refusal (1.4) but a high average potential risk (66.29). Deepseek-R1 has the highest ARS at 78.05, but the lowest ORS at only 0.06. Models such as o1 display a more even trade-off between potential risks and over-refusals (with an ARS of 21.69 and ORS of 5.2). To provide policymakers and researchers with a clear understanding of models' potential risks, we publicly release FORTRESS at https://huggingface.co/datasets/ScaleAI/fortress_public. We also maintain a private set for evaluation.
△ Less
Submitted 24 June, 2025; v1 submitted 17 June, 2025;
originally announced June 2025.
-
FeNN: A RISC-V vector processor for Spiking Neural Network acceleration
Authors:
Zainab Aizaz,
James C. Knight,
Thomas Nowotny
Abstract:
Spiking Neural Networks (SNNs) have the potential to drastically reduce the energy requirements of AI systems. However, mainstream accelerators like GPUs and TPUs are designed for the high arithmetic intensity of standard ANNs so are not well-suited to SNN simulation. FPGAs are well-suited to applications with low arithmetic intensity as they have high off-chip memory bandwidth and large amounts o…
▽ More
Spiking Neural Networks (SNNs) have the potential to drastically reduce the energy requirements of AI systems. However, mainstream accelerators like GPUs and TPUs are designed for the high arithmetic intensity of standard ANNs so are not well-suited to SNN simulation. FPGAs are well-suited to applications with low arithmetic intensity as they have high off-chip memory bandwidth and large amounts of on-chip memory. Here, we present a novel RISC-V-based soft vector processor (FeNN), tailored to simulating SNNs on FPGAs. Unlike most dedicated neuromorphic hardware, FeNN is fully programmable and designed to be integrated with applications running on standard computers from the edge to the cloud. We demonstrate that, by using stochastic rounding and saturation, FeNN can achieve high numerical precision with low hardware utilisation and that a single FeNN core can simulate an SNN classifier faster than both an embedded GPU and the Loihi neuromorphic system.
△ Less
Submitted 13 June, 2025;
originally announced June 2025.
-
A Red Teaming Roadmap Towards System-Level Safety
Authors:
Zifan Wang,
Christina Q. Knight,
Jeremy Kritz,
Willow E. Primack,
Julian Michael
Abstract:
Large Language Model (LLM) safeguards, which implement request refusals, have become a widely adopted mitigation strategy against misuse. At the intersection of adversarial machine learning and AI safety, safeguard red teaming has effectively identified critical vulnerabilities in state-of-the-art refusal-trained LLMs. However, in our view the many conference submissions on LLM red teaming do not,…
▽ More
Large Language Model (LLM) safeguards, which implement request refusals, have become a widely adopted mitigation strategy against misuse. At the intersection of adversarial machine learning and AI safety, safeguard red teaming has effectively identified critical vulnerabilities in state-of-the-art refusal-trained LLMs. However, in our view the many conference submissions on LLM red teaming do not, in aggregate, prioritize the right research problems. First, testing against clear product safety specifications should take a higher priority than abstract social biases or ethical principles. Second, red teaming should prioritize realistic threat models that represent the expanding risk landscape and what real attackers might do. Finally, we contend that system-level safety is a necessary step to move red teaming research forward, as AI models present new threats as well as affordances for threat mitigation (e.g., detection and banning of malicious users) once placed in a deployment context. Adopting these priorities will be necessary in order for red teaming research to adequately address the slate of new threats that rapid AI advances present today and will present in the very near future.
△ Less
Submitted 9 June, 2025; v1 submitted 30 May, 2025;
originally announced June 2025.
-
Constructive community race: full-density spiking neural network model drives neuromorphic computing
Authors:
Johanna Senk,
Anno C. Kurth,
Steve Furber,
Tobias Gemmeke,
Bruno Golosio,
Arne Heittmann,
James C. Knight,
Eric Müller,
Tobias Noll,
Thomas Nowotny,
Gorka Peraza Coppola,
Luca Peres,
Oliver Rhodes,
Andrew Rowley,
Johannes Schemmel,
Tim Stadtmann,
Tom Tetzlaff,
Gianmarco Tiddia,
Sacha J. van Albada,
José Villamar,
Markus Diesmann
Abstract:
The local circuitry of the mammalian brain is a focus of the search for generic computational principles because it is largely conserved across species and modalities. In 2014 a model was proposed representing all neurons and synapses of the stereotypical cortical microcircuit below $1\,\text{mm}^2$ of brain surface. The model reproduces fundamental features of brain activity but its impact remain…
▽ More
The local circuitry of the mammalian brain is a focus of the search for generic computational principles because it is largely conserved across species and modalities. In 2014 a model was proposed representing all neurons and synapses of the stereotypical cortical microcircuit below $1\,\text{mm}^2$ of brain surface. The model reproduces fundamental features of brain activity but its impact remained limited because of its computational demands. For theory and simulation, however, the model was a breakthrough because it removes uncertainties of downscaling, and larger models are less densely connected. This sparked a race in the neuromorphic computing community and the model became a de facto standard benchmark. Within a few years real-time performance was reached and surpassed at significantly reduced energy consumption. We review how the computational challenge was tackled by different simulation technologies and derive guidelines for the next generation of benchmarks and other domains of science.
△ Less
Submitted 2 June, 2025; v1 submitted 27 May, 2025;
originally announced May 2025.
-
Eventprop training for efficient neuromorphic applications
Authors:
Thomas Shoesmith,
James C. Knight,
Balázs Mészáros,
Jonathan Timcheck,
Thomas Nowotny
Abstract:
Neuromorphic computing can reduce the energy requirements of neural networks and holds the promise to `repatriate' AI workloads back from the cloud to the edge. However, training neural networks on neuromorphic hardware has remained elusive. Here, we instead present a pipeline for training spiking neural networks on GPUs, using the efficient event-driven Eventprop algorithm implemented in mlGeNN,…
▽ More
Neuromorphic computing can reduce the energy requirements of neural networks and holds the promise to `repatriate' AI workloads back from the cloud to the edge. However, training neural networks on neuromorphic hardware has remained elusive. Here, we instead present a pipeline for training spiking neural networks on GPUs, using the efficient event-driven Eventprop algorithm implemented in mlGeNN, and deploying them on Intel's Loihi 2 neuromorphic chip. Our benchmarking on keyword spotting tasks indicates that there is almost no loss in accuracy between GPU and Loihi 2 implementations and that classifying a sample on Loihi 2 is up to 10X faster and uses 200X less energy than on an NVIDIA Jetson Orin Nano.
△ Less
Submitted 6 March, 2025;
originally announced March 2025.
-
MOFA: Discovering Materials for Carbon Capture with a GenAI- and Simulation-Based Workflow
Authors:
Xiaoli Yan,
Nathaniel Hudson,
Hyun Park,
Daniel Grzenda,
J. Gregory Pauloski,
Marcus Schwarting,
Haochen Pan,
Hassan Harb,
Samuel Foreman,
Chris Knight,
Tom Gibbs,
Kyle Chard,
Santanu Chaudhuri,
Emad Tajkhorshid,
Ian Foster,
Mohamad Moosavi,
Logan Ward,
E. A. Huerta
Abstract:
We present MOFA, an open-source generative AI (GenAI) plus simulation workflow for high-throughput generation of metal-organic frameworks (MOFs) on large-scale high-performance computing (HPC) systems. MOFA addresses key challenges in integrating GPU-accelerated computing for GPU-intensive GenAI tasks, including distributed training and inference, alongside CPU- and GPU-optimized tasks for screeni…
▽ More
We present MOFA, an open-source generative AI (GenAI) plus simulation workflow for high-throughput generation of metal-organic frameworks (MOFs) on large-scale high-performance computing (HPC) systems. MOFA addresses key challenges in integrating GPU-accelerated computing for GPU-intensive GenAI tasks, including distributed training and inference, alongside CPU- and GPU-optimized tasks for screening and filtering AI-generated MOFs using molecular dynamics, density functional theory, and Monte Carlo simulations. These heterogeneous tasks are unified within an online learning framework that optimizes the utilization of available CPU and GPU resources across HPC systems. Performance metrics from a 450-node (14,400 AMD Zen 3 CPUs + 1800 NVIDIA A100 GPUs) supercomputer run demonstrate that MOFA achieves high-throughput generation of novel MOF structures, with CO$_2$ adsorption capacities ranking among the top 10 in the hypothetical MOF (hMOF) dataset. Furthermore, the production of high-quality MOFs exhibits a linear relationship with the number of nodes utilized. The modular architecture of MOFA will facilitate its integration into other scientific applications that dynamically combine GenAI with large-scale simulations.
△ Less
Submitted 17 January, 2025;
originally announced January 2025.
-
Efficient Event-based Delay Learning in Spiking Neural Networks
Authors:
Balázs Mészáros,
James C. Knight,
Thomas Nowotny
Abstract:
Spiking Neural Networks (SNNs) compute using sparse communication and are attracting increased attention as a more energy-efficient alternative to traditional Artificial Neural Networks~(ANNs). While standard ANNs are stateless, spiking neurons are stateful and hence intrinsically recurrent, making them well-suited for spatio-temporal tasks. However, the duration of this intrinsic memory is limite…
▽ More
Spiking Neural Networks (SNNs) compute using sparse communication and are attracting increased attention as a more energy-efficient alternative to traditional Artificial Neural Networks~(ANNs). While standard ANNs are stateless, spiking neurons are stateful and hence intrinsically recurrent, making them well-suited for spatio-temporal tasks. However, the duration of this intrinsic memory is limited by synaptic and membrane time constants. Delays are a powerful additional mechanism and, in this paper, we propose a novel event-based training method for SNNs with delays, grounded in the EventProp formalism which enables the calculation of exact gradients with respect to weights and delays. Our method supports multiple spikes per neuron and, to the best of our knowledge, is the first delay learning algorithm to be applied to recurrent SNNs. We evaluate our method on a simple sequence detection task, as well as the Yin-Yang, Spiking Heidelberg Digits, Spiking Speech Commands and Braille letter reading datasets, demonstrating that our algorithm can optimise delays from suboptimal initial conditions and enhance classification accuracy compared to architectures without delays. We also find that recurrent delays are particularly beneficial in small networks. Finally, we show that our approach uses less than half the memory of the current state-of-the-art delay-learning method and is up to 26x faster.
△ Less
Submitted 26 June, 2025; v1 submitted 13 January, 2025;
originally announced January 2025.
-
NeuroBench: A Framework for Benchmarking Neuromorphic Computing Algorithms and Systems
Authors:
Jason Yik,
Korneel Van den Berghe,
Douwe den Blanken,
Younes Bouhadjar,
Maxime Fabre,
Paul Hueber,
Weijie Ke,
Mina A Khoei,
Denis Kleyko,
Noah Pacik-Nelson,
Alessandro Pierro,
Philipp Stratmann,
Pao-Sheng Vincent Sun,
Guangzhi Tang,
Shenqi Wang,
Biyan Zhou,
Soikat Hasan Ahmed,
George Vathakkattil Joseph,
Benedetto Leto,
Aurora Micheli,
Anurag Kumar Mishra,
Gregor Lenz,
Tao Sun,
Zergham Ahmed,
Mahmoud Akl
, et al. (75 additional authors not shown)
Abstract:
Neuromorphic computing shows promise for advancing computing efficiency and capabilities of AI applications using brain-inspired principles. However, the neuromorphic research field currently lacks standardized benchmarks, making it difficult to accurately measure technological advancements, compare performance with conventional methods, and identify promising future research directions. Prior neu…
▽ More
Neuromorphic computing shows promise for advancing computing efficiency and capabilities of AI applications using brain-inspired principles. However, the neuromorphic research field currently lacks standardized benchmarks, making it difficult to accurately measure technological advancements, compare performance with conventional methods, and identify promising future research directions. Prior neuromorphic computing benchmark efforts have not seen widespread adoption due to a lack of inclusive, actionable, and iterative benchmark design and guidelines. To address these shortcomings, we present NeuroBench: a benchmark framework for neuromorphic computing algorithms and systems. NeuroBench is a collaboratively-designed effort from an open community of researchers across industry and academia, aiming to provide a representative structure for standardizing the evaluation of neuromorphic approaches. The NeuroBench framework introduces a common set of tools and systematic methodology for inclusive benchmark measurement, delivering an objective reference framework for quantifying neuromorphic approaches in both hardware-independent (algorithm track) and hardware-dependent (system track) settings. In this article, we outline tasks and guidelines for benchmarks across multiple application domains, and present initial performance baselines across neuromorphic and conventional approaches for both benchmark tracks. NeuroBench is intended to continually expand its benchmarks and features to foster and track the progress made by the research community.
△ Less
Submitted 14 January, 2025; v1 submitted 10 April, 2023;
originally announced April 2023.
-
Loss shaping enhances exact gradient learning with Eventprop in spiking neural networks
Authors:
Thomas Nowotny,
James P. Turner,
James C. Knight
Abstract:
Event-based machine learning promises more energy-efficient AI on future neuromorphic hardware. Here, we investigate how the recently discovered Eventprop algorithm for gradient descent on exact gradients in spiking neural networks can be scaled up to challenging keyword recognition benchmarks. We implemented Eventprop in the GPU-enhanced Neural Networks framework and used it for training recurren…
▽ More
Event-based machine learning promises more energy-efficient AI on future neuromorphic hardware. Here, we investigate how the recently discovered Eventprop algorithm for gradient descent on exact gradients in spiking neural networks can be scaled up to challenging keyword recognition benchmarks. We implemented Eventprop in the GPU-enhanced Neural Networks framework and used it for training recurrent spiking neural networks on the Spiking Heidelberg Digits and Spiking Speech Commands datasets. We found that learning depended strongly on the loss function and extended Eventprop to a wider class of loss functions to enable effective training. We then tested a large number of data augmentations and regularisations as well as exploring different network structures; and heterogeneous and trainable timescales. We found that when combined with two specific augmentations, the right regularisation and a delay line input, Eventprop networks with one recurrent layer achieved state-of-the-art performance on Spiking Heidelberg Digits and good accuracy on Spiking Speech Commands. In comparison to a leading surrogate-gradient-based SNN training method, our GeNN Eventprop implementation is 3X faster and uses 4X less memory. This work is a significant step towards a low-power neuromorphic alternative to current machine learning paradigms.
△ Less
Submitted 31 January, 2025; v1 submitted 2 December, 2022;
originally announced December 2022.
-
Improving Disentangled Representation Learning with the Beta Bernoulli Process
Authors:
Prashnna Kumar Gyawali,
Zhiyuan Li,
Cameron Knight,
Sandesh Ghimire,
B. Milan Horacek,
John Sapp,
Linwei Wang
Abstract:
To improve the ability of VAE to disentangle in the latent space, existing works mostly focus on enforcing independence among the learned latent factors. However, the ability of these models to disentangle often decreases as the complexity of the generative factors increases. In this paper, we investigate the little-explored effect of the modeling capacity of a posterior density on the disentangli…
▽ More
To improve the ability of VAE to disentangle in the latent space, existing works mostly focus on enforcing independence among the learned latent factors. However, the ability of these models to disentangle often decreases as the complexity of the generative factors increases. In this paper, we investigate the little-explored effect of the modeling capacity of a posterior density on the disentangling ability of the VAE. We note that the independence within and the complexity of the latent density are two different properties we constrain when regularizing the posterior density: while the former promotes the disentangling ability of VAE, the latter -- if overly limited -- creates an unnecessary competition with the data reconstruction objective in VAE. Therefore, if we preserve the independence but allow richer modeling capacity in the posterior density, we will lift this competition and thereby allow improved independence and data reconstruction at the same time. We investigate this theoretical intuition with a VAE that utilizes a non-parametric latent factor model, the Indian Buffet Process (IBP), as a latent density that is able to grow with the complexity of the data. Across three widely-used benchmark data sets and two clinical data sets little explored for disentangled learning, we qualitatively and quantitatively demonstrated the improved disentangling performance of IBP-VAE over the state of the art. In the latter two clinical data sets riddled with complex factors of variations, we further demonstrated that unsupervised disentangling of nuisance factors via IBP-VAE -- when combined with a supervised objective -- can not only improve task accuracy in comparison to relevant supervised deep architectures but also facilitate knowledge discovery related to task decision-making. A shorter version of this work will appear in the ICDM 2019 conference proceedings.
△ Less
Submitted 3 September, 2019;
originally announced September 2019.
-
Robotic-assisted Ultrasound for Fetal Imaging: Evolution from Single-arm to Dual-arm System
Authors:
Shuangyi Wang,
James Housden,
Yohan Noh,
Davinder Singh,
Anisha Singh,
Emily Skelton,
Jacqueline Matthew,
Cornelius Tan,
Junghwan Back,
Lukas Lindenroth,
Alberto Gomez,
Nicolas Toussaint,
Veronika Zimmer,
Caroline Knight,
Tara Fletcher,
David Lloyd,
John Simpson,
Dharmintra Pasupathy,
Hongbin Liu,
Kaspar Althoefer,
Joseph Hajnal,
Reza Razavi,
Kawal Rhode
Abstract:
The development of robotic-assisted extracorporeal ultrasound systems has a long history and a number of projects have been proposed since the 1990s focusing on different technical aspects. These aim to resolve the deficiencies of on-site manual manipulation of hand-held ultrasound probes. This paper presents the recent ongoing developments of a series of bespoke robotic systems, including both si…
▽ More
The development of robotic-assisted extracorporeal ultrasound systems has a long history and a number of projects have been proposed since the 1990s focusing on different technical aspects. These aim to resolve the deficiencies of on-site manual manipulation of hand-held ultrasound probes. This paper presents the recent ongoing developments of a series of bespoke robotic systems, including both single-arm and dual-arm versions, for a project known as intelligent Fetal Imaging and Diagnosis (iFIND). After a brief review of the development history of the extracorporeal ultrasound robotic system used for fetal and abdominal examinations, the specific aim of the iFIND robots, the design evolution, the implementation details of each version, and the initial clinical feedback of the iFIND robot series are presented. Based on the preliminary testing of these newly-proposed robots on 42 volunteers, the successful and re-liable working of the mechatronic systems were validated. Analysis of a participant questionnaire indicates a comfortable scanning experience for the volunteers and a good acceptance rate to being scanned by the robots.
△ Less
Submitted 10 April, 2019; v1 submitted 14 February, 2019;
originally announced February 2019.
-
Deep Generative Model with Beta Bernoulli Process for Modeling and Learning Confounding Factors
Authors:
Prashnna K Gyawali,
Cameron Knight,
Sandesh Ghimire,
B. Milan Horacek,
John L. Sapp,
Linwei Wang
Abstract:
While deep representation learning has become increasingly capable of separating task-relevant representations from other confounding factors in the data, two significant challenges remain. First, there is often an unknown and potentially infinite number of confounding factors coinciding in the data. Second, not all of these factors are readily observable. In this paper, we present a deep conditio…
▽ More
While deep representation learning has become increasingly capable of separating task-relevant representations from other confounding factors in the data, two significant challenges remain. First, there is often an unknown and potentially infinite number of confounding factors coinciding in the data. Second, not all of these factors are readily observable. In this paper, we present a deep conditional generative model that learns to disentangle a task-relevant representation from an unknown number of confounding factors that may grow infinitely. This is achieved by marrying the representational power of deep generative models with Bayesian non-parametric factor models, where a supervised deterministic encoder learns task-related representation and a probabilistic encoder with an Indian Buffet Process (IBP) learns the unknown number of unobservable confounding factors. We tested the presented model in two datasets: a handwritten digit dataset (MNIST) augmented with colored digits and a clinical ECG dataset with significant inter-subject variations and augmented with signal artifacts. These diverse data sets highlighted the ability of the presented model to grow with the complexity of the data and identify the absence or presence of unobserved confounding factors.
△ Less
Submitted 5 May, 2019; v1 submitted 31 October, 2018;
originally announced November 2018.
-
Standard Plane Detection in 3D Fetal Ultrasound Using an Iterative Transformation Network
Authors:
Yuanwei Li,
Bishesh Khanal,
Benjamin Hou,
Amir Alansary,
Juan J. Cerrolaza,
Matthew Sinclair,
Jacqueline Matthew,
Chandni Gupta,
Caroline Knight,
Bernhard Kainz,
Daniel Rueckert
Abstract:
Standard scan plane detection in fetal brain ultrasound (US) forms a crucial step in the assessment of fetal development. In clinical settings, this is done by manually manoeuvring a 2D probe to the desired scan plane. With the advent of 3D US, the entire fetal brain volume containing these standard planes can be easily acquired. However, manual standard plane identification in 3D volume is labour…
▽ More
Standard scan plane detection in fetal brain ultrasound (US) forms a crucial step in the assessment of fetal development. In clinical settings, this is done by manually manoeuvring a 2D probe to the desired scan plane. With the advent of 3D US, the entire fetal brain volume containing these standard planes can be easily acquired. However, manual standard plane identification in 3D volume is labour-intensive and requires expert knowledge of fetal anatomy. We propose a new Iterative Transformation Network (ITN) for the automatic detection of standard planes in 3D volumes. ITN uses a convolutional neural network to learn the relationship between a 2D plane image and the transformation parameters required to move that plane towards the location/orientation of the standard plane in the 3D volume. During inference, the current plane image is passed iteratively to the network until it converges to the standard plane location. We explore the effect of using different transformation representations as regression outputs of ITN. Under a multi-task learning framework, we introduce additional classification probability outputs to the network to act as confidence measures for the regressed transformation parameters in order to further improve the localisation accuracy. When evaluated on 72 US volumes of fetal brain, our method achieves an error of 3.83mm/12.7 degrees and 3.80mm/12.6 degrees for the transventricular and transcerebellar planes respectively and takes 0.46s per plane. Source code is publicly available at https://github.com/yuanwei1989/plane-detection.
△ Less
Submitted 6 October, 2018; v1 submitted 19 June, 2018;
originally announced June 2018.
-
Fast Multiple Landmark Localisation Using a Patch-based Iterative Network
Authors:
Yuanwei Li,
Amir Alansary,
Juan J. Cerrolaza,
Bishesh Khanal,
Matthew Sinclair,
Jacqueline Matthew,
Chandni Gupta,
Caroline Knight,
Bernhard Kainz,
Daniel Rueckert
Abstract:
We propose a new Patch-based Iterative Network (PIN) for fast and accurate landmark localisation in 3D medical volumes. PIN utilises a Convolutional Neural Network (CNN) to learn the spatial relationship between an image patch and anatomical landmark positions. During inference, patches are repeatedly passed to the CNN until the estimated landmark position converges to the true landmark location.…
▽ More
We propose a new Patch-based Iterative Network (PIN) for fast and accurate landmark localisation in 3D medical volumes. PIN utilises a Convolutional Neural Network (CNN) to learn the spatial relationship between an image patch and anatomical landmark positions. During inference, patches are repeatedly passed to the CNN until the estimated landmark position converges to the true landmark location. PIN is computationally efficient since the inference stage only selectively samples a small number of patches in an iterative fashion rather than a dense sampling at every location in the volume. Our approach adopts a multi-task learning framework that combines regression and classification to improve localisation accuracy. We extend PIN to localise multiple landmarks by using principal component analysis, which models the global anatomical relationships between landmarks. We have evaluated PIN using 72 3D ultrasound images from fetal screening examinations. PIN achieves quantitatively an average landmark localisation error of 5.59mm and a runtime of 0.44s to predict 10 landmarks per volume. Qualitatively, anatomical 2D standard scan planes derived from the predicted landmark locations are visually similar to the clinical ground truth. Source code is publicly available at https://github.com/yuanwei1989/landmark-detection.
△ Less
Submitted 6 October, 2018; v1 submitted 18 June, 2018;
originally announced June 2018.
-
Human-level Performance On Automatic Head Biometrics In Fetal Ultrasound Using Fully Convolutional Neural Networks
Authors:
Matthew Sinclair,
Christian F. Baumgartner,
Jacqueline Matthew,
Wenjia Bai,
Juan Cerrolaza Martinez,
Yuanwei Li,
Sandra Smith,
Caroline L. Knight,
Bernhard Kainz,
Jo Hajnal,
Andrew P. King,
Daniel Rueckert
Abstract:
Measurement of head biometrics from fetal ultrasonography images is of key importance in monitoring the healthy development of fetuses. However, the accurate measurement of relevant anatomical structures is subject to large inter-observer variability in the clinic. To address this issue, an automated method utilizing Fully Convolutional Networks (FCN) is proposed to determine measurements of fetal…
▽ More
Measurement of head biometrics from fetal ultrasonography images is of key importance in monitoring the healthy development of fetuses. However, the accurate measurement of relevant anatomical structures is subject to large inter-observer variability in the clinic. To address this issue, an automated method utilizing Fully Convolutional Networks (FCN) is proposed to determine measurements of fetal head circumference (HC) and biparietal diameter (BPD). An FCN was trained on approximately 2000 2D ultrasound images of the head with annotations provided by 45 different sonographers during routine screening examinations to perform semantic segmentation of the head. An ellipse is fitted to the resulting segmentation contours to mimic the annotation typically produced by a sonographer. The model's performance was compared with inter-observer variability, where two experts manually annotated 100 test images. Mean absolute model-expert error was slightly better than inter-observer error for HC (1.99mm vs 2.16mm), and comparable for BPD (0.61mm vs 0.59mm), as well as Dice coefficient (0.980 vs 0.980). Our results demonstrate that the model performs at a level similar to a human expert, and learns to produce accurate predictions from a large dataset annotated by many sonographers. Additionally, measurements are generated in near real-time at 15fps on a GPU, which could speed up clinical workflow for both skilled and trainee sonographers.
△ Less
Submitted 24 April, 2018;
originally announced April 2018.
-
Attention-Gated Networks for Improving Ultrasound Scan Plane Detection
Authors:
Jo Schlemper,
Ozan Oktay,
Liang Chen,
Jacqueline Matthew,
Caroline Knight,
Bernhard Kainz,
Ben Glocker,
Daniel Rueckert
Abstract:
In this work, we apply an attention-gated network to real-time automated scan plane detection for fetal ultrasound screening. Scan plane detection in fetal ultrasound is a challenging problem due the poor image quality resulting in low interpretability for both clinicians and automated algorithms. To solve this, we propose incorporating self-gated soft-attention mechanisms. A soft-attention mechan…
▽ More
In this work, we apply an attention-gated network to real-time automated scan plane detection for fetal ultrasound screening. Scan plane detection in fetal ultrasound is a challenging problem due the poor image quality resulting in low interpretability for both clinicians and automated algorithms. To solve this, we propose incorporating self-gated soft-attention mechanisms. A soft-attention mechanism generates a gating signal that is end-to-end trainable, which allows the network to contextualise local information useful for prediction. The proposed attention mechanism is generic and it can be easily incorporated into any existing classification architectures, while only requiring a few additional parameters. We show that, when the base network has a high capacity, the incorporated attention mechanism can provide efficient object localisation while improving the overall performance. When the base network has a low capacity, the method greatly outperforms the baseline approach and significantly reduces false positives. Lastly, the generated attention maps allow us to understand the model's reasoning process, which can also be used for weakly supervised object localisation.
△ Less
Submitted 15 April, 2018;
originally announced April 2018.
-
Optimizing the Performance of Reactive Molecular Dynamics Simulations for Multi-Core Architectures
Authors:
Hasan Metin Aktulga,
Christopher Knight,
Paul Coffman,
Kurt A. O'Hearn,
Tzu-Ray Shan,
Wei Jiang
Abstract:
Reactive molecular dynamics simulations are computationally demanding. Reaching spatial and temporal scales where interesting scientific phenomena can be observed requires efficient and scalable implementations on modern hardware. In this paper, we focus on optimizing the performance of the widely used LAMMPS/ReaxC package for multi-core architectures. As hybrid parallelism allows better leverage…
▽ More
Reactive molecular dynamics simulations are computationally demanding. Reaching spatial and temporal scales where interesting scientific phenomena can be observed requires efficient and scalable implementations on modern hardware. In this paper, we focus on optimizing the performance of the widely used LAMMPS/ReaxC package for multi-core architectures. As hybrid parallelism allows better leverage of the increasing on-node parallelism, we adopt thread parallelism in the construction of bonded and nonbonded lists, and in the computation of complex ReaxFF interactions. To mitigate the I/O overheads due to large volumes of trajectory data produced and to save users the burden of post-processing, we also develop a novel in-situ tool for molecular species analysis. We analyze the performance of the resulting ReaxC-OMP package on Mira, an IBM Blue Gene/Q supercomputer. For PETN systems of sizes ranging from 32 thousand to 16.6 million particles, we observe speedups in the range of 1.5-4.5x. We observe sustained performance improvements for up to 262,144 cores (1,048,576 processes) of Mira and a weak scaling efficiency of 91.5% in large simulations containing 16.6 million particles. The in-situ molecular species analysis tool incurs only insignificant overheads across various system sizes and run configurations.
△ Less
Submitted 23 June, 2017;
originally announced June 2017.
-
A Web-based Tool for Identifying Strategic Intervention Points in Complex Systems
Authors:
Sotiris Moschoyiannis,
Nicholas Elia,
Alexandra S. Penn,
David J. B. Lloyd,
Chris Knight
Abstract:
Steering a complex system towards a desired outcome is a challenging task. The lack of clarity on the system's exact architecture and the often scarce scientific data upon which to base the operationalisation of the dynamic rules that underpin the interactions between participant entities are two contributing factors. We describe an analytical approach that builds on Fuzzy Cognitive Mapping (FCM)…
▽ More
Steering a complex system towards a desired outcome is a challenging task. The lack of clarity on the system's exact architecture and the often scarce scientific data upon which to base the operationalisation of the dynamic rules that underpin the interactions between participant entities are two contributing factors. We describe an analytical approach that builds on Fuzzy Cognitive Mapping (FCM) to address the latter and represent the system as a complex network. We apply results from network controllability to address the former and determine minimal control configurations - subsets of factors, or system levers, which comprise points for strategic intervention in steering the system. We have implemented the combination of these techniques in an analytical tool that runs in the browser, and generates all minimal control configurations of a complex network. We demonstrate our approach by reporting on our experience of working alongside industrial, local-government, and NGO stakeholders in the Humber region, UK. Our results are applied to the decision-making process involved in the transition of the region to a bio-based economy.
△ Less
Submitted 1 August, 2016;
originally announced August 2016.
-
Monotonicity of Fitness Landscapes and Mutation Rate Control
Authors:
Roman V. Belavkin,
Alastair Channon,
Elizabeth Aston,
John Aston,
Rok Krasovec,
Christopher G. Knight
Abstract:
A common view in evolutionary biology is that mutation rates are minimised. However, studies in combinatorial optimisation and search have shown a clear advantage of using variable mutation rates as a control parameter to optimise the performance of evolutionary algorithms. Much biological theory in this area is based on Ronald Fisher's work, who used Euclidean geometry to study the relation betwe…
▽ More
A common view in evolutionary biology is that mutation rates are minimised. However, studies in combinatorial optimisation and search have shown a clear advantage of using variable mutation rates as a control parameter to optimise the performance of evolutionary algorithms. Much biological theory in this area is based on Ronald Fisher's work, who used Euclidean geometry to study the relation between mutation size and expected fitness of the offspring in infinite phenotypic spaces. Here we reconsider this theory based on the alternative geometry of discrete and finite spaces of DNA sequences. First, we consider the geometric case of fitness being isomorphic to distance from an optimum, and show how problems of optimal mutation rate control can be solved exactly or approximately depending on additional constraints of the problem. Then we consider the general case of fitness communicating only partial information about the distance. We define weak monotonicity of fitness landscapes and prove that this property holds in all landscapes that are continuous and open at the optimum. This theoretical result motivates our hypothesis that optimal mutation rate functions in such landscapes will increase when fitness decreases in some neighbourhood of an optimum, resembling the control functions derived in the geometric case. We test this hypothesis experimentally by analysing approximately optimal mutation rate control functions in 115 complete landscapes of binding scores between DNA sequences and transcription factors. Our findings support the hypothesis and find that the increase of mutation rate is more rapid in landscapes that are less monotonic (more rugged). We discuss the relevance of these findings to living organisms.
△ Less
Submitted 24 August, 2019; v1 submitted 3 September, 2012;
originally announced September 2012.