-
DepoRanker: A Web Tool to predict Klebsiella Depolymerases using Machine Learning
Authors:
George Wright,
Slawomir Michniewski,
Eleanor Jameson,
Fayyaz ul Amir Afsar Minhas
Abstract:
Background: Phage therapy shows promise for treating antibiotic-resistant Klebsiella infections. Identifying phage depolymerases that target Klebsiella capsular polysaccharides is crucial, as these capsules contribute to biofilm formation and virulence. However, homology-based searches have limitations in novel depolymerase discovery.
Objective: To develop a machine learning model for identifying and ranking potential phage depolymerases targeting Klebsiella.
Methods: We developed DepoRanker, a machine learning algorithm to rank proteins by their likelihood of being depolymerases. The model was experimentally validated on 5 newly characterized proteins and compared to BLAST.
Results: DepoRanker demonstrated superior performance to BLAST in identifying potential depolymerases. Experimental validation confirmed its predictive ability on novel proteins.
Conclusions: DepoRanker provides an accurate and functional tool to expedite depolymerase discovery for phage therapy against Klebsiella. It is available as a webserver and open-source software.
Availability: Webserver: https://deporanker.dcs.warwick.ac.uk/ Source code: https://github.com/wgrgwrght/deporanker
Submitted 27 January, 2025;
originally announced January 2025.
-
Roadmap on Neuromorphic Photonics
Authors:
Daniel Brunner,
Bhavin J. Shastri,
Mohammed A. Al Qadasi,
H. Ballani,
Sylvain Barbay,
Stefano Biasi,
Peter Bienstman,
Simon Bilodeau,
Wim Bogaerts,
Fabian Böhm,
G. Brennan,
Sonia Buckley,
Xinlun Cai,
Marcello Calvanese Strinati,
B. Canakci,
Benoit Charbonnier,
Mario Chemnitz,
Yitong Chen,
Stanley Cheung,
Jeff Chiles,
Suyeon Choi,
Demetrios N. Christodoulides,
Lukas Chrostowski,
J. Chu,
J. H. Clegg
, et al. (125 additional authors not shown)
Abstract:
This roadmap consolidates recent advances while exploring emerging applications, reflecting the remarkable diversity of hardware platforms, neuromorphic concepts, and implementation philosophies reported in the field. It emphasizes the critical role of cross-disciplinary collaboration in this rapidly evolving field.
Submitted 16 January, 2025; v1 submitted 14 January, 2025;
originally announced January 2025.
-
Training of Physical Neural Networks
Authors:
Ali Momeni,
Babak Rahmani,
Benjamin Scellier,
Logan G. Wright,
Peter L. McMahon,
Clara C. Wanjura,
Yuhang Li,
Anas Skalli,
Natalia G. Berloff,
Tatsuhiro Onodera,
Ilker Oguz,
Francesco Morichetti,
Philipp del Hougne,
Manuel Le Gallo,
Abu Sebastian,
Azalia Mirhoseini,
Cheng Zhang,
Danijela Marković,
Daniel Brunner,
Christophe Moser,
Sylvain Gigan,
Florian Marquardt,
Aydogan Ozcan,
Julie Grollier,
Andrea J. Liu
, et al. (3 additional authors not shown)
Abstract:
Physical neural networks (PNNs) are a class of neural-like networks that leverage the properties of physical systems to perform computation. While PNNs are so far a niche research area with small-scale laboratory demonstrations, they are arguably one of the most underappreciated important opportunities in modern AI. Could we train AI models 1000x larger than current ones? Could we do this and also have them perform inference locally and privately on edge devices, such as smartphones or sensors? Research over the past few years has shown that the answer to all these questions is likely "yes, with enough research": PNNs could one day radically change what is possible and practical for AI systems. To do this will however require rethinking both how AI models work, and how they are trained - primarily by considering the problems through the constraints of the underlying hardware physics. To train PNNs at large scale, many methods including backpropagation-based and backpropagation-free approaches are now being explored. These methods have various trade-offs, and so far no method has been shown to scale to the same scale and performance as the backpropagation algorithm widely used in deep learning today. However, this is rapidly changing, and a diverse ecosystem of training techniques provides clues for how PNNs may one day be utilized to create both more efficient realizations of current-scale AI models, and to enable unprecedented-scale models.
Submitted 5 June, 2024;
originally announced June 2024.
-
Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context
Authors:
Gemini Team,
Petko Georgiev,
Ving Ian Lei,
Ryan Burnell,
Libin Bai,
Anmol Gulati,
Garrett Tanzer,
Damien Vincent,
Zhufeng Pan,
Shibo Wang,
Soroosh Mariooryad,
Yifan Ding,
Xinyang Geng,
Fred Alcober,
Roy Frostig,
Mark Omernick,
Lexi Walker,
Cosmin Paduraru,
Christina Sorokin,
Andrea Tacchetti,
Colin Gaffney,
Samira Daruki,
Olcan Sercinoglu,
Zach Gleicher,
Juliette Love
, et al. (1112 additional authors not shown)
Abstract:
In this report, we introduce the Gemini 1.5 family of models, representing the next generation of highly compute-efficient multimodal models capable of recalling and reasoning over fine-grained information from millions of tokens of context, including multiple long documents and hours of video and audio. The family includes two new models: (1) an updated Gemini 1.5 Pro, which exceeds the February version on the great majority of capabilities and benchmarks; (2) Gemini 1.5 Flash, a more lightweight variant designed for efficiency with minimal regression in quality. Gemini 1.5 models achieve near-perfect recall on long-context retrieval tasks across modalities, improve the state-of-the-art in long-document QA, long-video QA and long-context ASR, and match or surpass Gemini 1.0 Ultra's state-of-the-art performance across a broad set of benchmarks. Studying the limits of Gemini 1.5's long-context ability, we find continued improvement in next-token prediction and near-perfect retrieval (>99%) up to at least 10M tokens, a generational leap over existing models such as Claude 3.0 (200k) and GPT-4 Turbo (128k). Finally, we highlight real-world use cases, such as Gemini 1.5 collaborating with professionals on completing their tasks achieving 26 to 75% time savings across 10 different job categories, as well as surprising new capabilities of large language models at the frontier; when given a grammar manual for Kalamang, a language with fewer than 200 speakers worldwide, the model learns to translate English to Kalamang at a similar level to a person who learned from the same content.
Submitted 16 December, 2024; v1 submitted 8 March, 2024;
originally announced March 2024.
-
Scaling on-chip photonic neural processors using arbitrarily programmable wave propagation
Authors:
Tatsuhiro Onodera,
Martin M. Stein,
Benjamin A. Ash,
Mandar M. Sohoni,
Melissa Bosch,
Ryotatsu Yanagimoto,
Marc Jankowski,
Timothy P. McKenna,
Tianyu Wang,
Gennady Shvets,
Maxim R. Shcherbakov,
Logan G. Wright,
Peter L. McMahon
Abstract:
On-chip photonic processors for neural networks have potential benefits in both speed and energy efficiency but have not yet reached the scale at which they can outperform electronic processors. The dominant paradigm for designing on-chip photonics is to make networks of relatively bulky discrete components connected by one-dimensional waveguides. A far more compact alternative is to avoid explicitly defining any components and instead sculpt the continuous substrate of the photonic processor to directly perform the computation using waves freely propagating in two dimensions. We propose and demonstrate a device whose refractive index as a function of space, $n(x,z)$, can be rapidly reprogrammed, allowing arbitrary control over the wave propagation in the device. Our device, a 2D-programmable waveguide, combines photoconductive gain with the electro-optic effect to achieve massively parallel modulation of the refractive index of a slab waveguide, with an index modulation depth of $10^{-3}$ and approximately $10^4$ programmable degrees of freedom. We used a prototype device with a functional area of $12\,\text{mm}^2$ to perform neural-network inference with up to 49-dimensional input vectors in a single pass, achieving 96% accuracy on vowel classification and 86% accuracy on $7 \times 7$-pixel MNIST handwritten-digit classification. This is a scale beyond that of previous photonic chips relying on discrete components, illustrating the benefit of the continuous-waves paradigm. In principle, with large enough chip area, the reprogrammability of the device's refractive index distribution enables the reconfigurable realization of any passive, linear photonic circuit or device. This promises the development of more compact and versatile photonic systems for a wide range of applications, including optical processing, smart sensing, spectroscopy, and optical communications.
Submitted 27 February, 2024;
originally announced February 2024.
-
Gemini: A Family of Highly Capable Multimodal Models
Authors:
Gemini Team,
Rohan Anil,
Sebastian Borgeaud,
Jean-Baptiste Alayrac,
Jiahui Yu,
Radu Soricut,
Johan Schalkwyk,
Andrew M. Dai,
Anja Hauth,
Katie Millican,
David Silver,
Melvin Johnson,
Ioannis Antonoglou,
Julian Schrittwieser,
Amelia Glaese,
Jilin Chen,
Emily Pitler,
Timothy Lillicrap,
Angeliki Lazaridou,
Orhan Firat,
James Molloy,
Michael Isard,
Paul R. Barham,
Tom Hennigan,
Benjamin Lee
, et al. (1326 additional authors not shown)
Abstract:
This report introduces a new family of multimodal models, Gemini, that exhibit remarkable capabilities across image, audio, video, and text understanding. The Gemini family consists of Ultra, Pro, and Nano sizes, suitable for applications ranging from complex reasoning tasks to on-device memory-constrained use-cases. Evaluation on a broad range of benchmarks shows that our most-capable Gemini Ultra model advances the state of the art in 30 of 32 of these benchmarks - notably being the first model to achieve human-expert performance on the well-studied exam benchmark MMLU, and improving the state of the art in every one of the 20 multimodal benchmarks we examined. We believe that the new capabilities of the Gemini family in cross-modal reasoning and language understanding will enable a wide variety of use cases. We discuss our approach toward post-training and deploying Gemini models responsibly to users through services including Gemini, Gemini Advanced, Google AI Studio, and Cloud Vertex AI.
Submitted 9 May, 2025; v1 submitted 18 December, 2023;
originally announced December 2023.
-
The hardware is the software
Authors:
Jeremie Laydevant,
Logan G. Wright,
Tianyu Wang,
Peter L. McMahon
Abstract:
Human brains and bodies are not hardware running software: the hardware is the software. We reason that because the microscopic physics of artificial-intelligence hardware and of human biological "hardware" is distinct, neuromorphic engineers need to be cautious (and yet also creative) in how we take inspiration from biological intelligence. We should focus primarily on principles and design ideas that respect -- and embrace -- the underlying hardware physics of non-biological intelligent systems, rather than abstracting it away. We see a major role for neuroscience in neuromorphic computing as identifying the physics-agnostic principles of biological intelligence -- that is, the principles of biological intelligence that can be gainfully adapted and applied to any physical hardware.
Submitted 20 October, 2023;
originally announced October 2023.
-
Training dynamic models using early exits for automatic speech recognition on resource-constrained devices
Authors:
George August Wright,
Umberto Cappellazzo,
Salah Zaiem,
Desh Raj,
Lucas Ondel Yang,
Daniele Falavigna,
Mohamed Nabih Ali,
Alessio Brutti
Abstract:
The ability to dynamically adjust the computational load of neural models during inference is crucial for on-device processing scenarios characterised by limited and time-varying computational resources. A promising solution is presented by early-exit architectures, in which additional exit branches are appended to intermediate layers of the encoder. In self-attention models for automatic speech recognition (ASR), early-exit architectures enable the development of dynamic models capable of adapting their size and architecture to varying levels of computational resources and ASR performance demands. Previous research on early-exiting ASR models has relied on pre-trained self-supervised models, fine-tuned with an early-exit loss. In this paper, we undertake an experimental comparison between fine-tuning pre-trained backbones and training models from scratch with the early-exiting objective. Experiments conducted on public datasets reveal that early-exit models trained from scratch not only preserve performance when using fewer encoder layers but also exhibit enhanced task accuracy compared to single-exit or pre-trained models. Furthermore, we explore an exit selection strategy grounded in posterior probabilities as an alternative to the conventional frame-based entropy approach. Results provide insights into the training dynamics of early-exit architectures for ASR models, particularly the efficacy of training strategies and exit selection methods.
Submitted 22 February, 2024; v1 submitted 18 September, 2023;
originally announced September 2023.
-
Quantum-limited stochastic optical neural networks operating at a few quanta per activation
Authors:
Shi-Yuan Ma,
Tianyu Wang,
Jérémie Laydevant,
Logan G. Wright,
Peter L. McMahon
Abstract:
Energy efficiency in computation is ultimately limited by noise, with quantum limits setting the fundamental noise floor. Analog physical neural networks hold promise for improved energy efficiency compared to digital electronic neural networks. However, they are typically operated in a relatively high-power regime so that the signal-to-noise ratio (SNR) is large, and the noise can be treated as a perturbation. We study optical neural networks where all layers except the last are operated in the limit that each neuron can be activated by just a single photon, and as a result the noise on neuron activations is no longer merely perturbative. We show that by using a physics-based probabilistic model of the neuron activations in training, it is possible to perform accurate machine-learning inference in spite of the extremely high shot noise (SNR ~ 1). We experimentally demonstrated MNIST handwritten-digit classification with a test accuracy of 98% using an optical neural network with a hidden layer operating in the single-photon regime; the optical energy used to perform the classification corresponds to just 0.038 photons per multiply-accumulate (MAC) operation. Our physics-aware stochastic training approach might also prove useful with non-optical ultra-low-power hardware.
Submitted 3 February, 2025; v1 submitted 28 July, 2023;
originally announced July 2023.
-
Optical Transformers
Authors:
Maxwell G. Anderson,
Shi-Yuan Ma,
Tianyu Wang,
Logan G. Wright,
Peter L. McMahon
Abstract:
The rapidly increasing size of deep-learning models has caused renewed and growing interest in alternatives to digital computers to dramatically reduce the energy cost of running state-of-the-art neural networks. Optical matrix-vector multipliers are best suited to performing computations with very large operands, which suggests that large Transformer models could be a good target for optical computing. To test this idea, we performed small-scale optical experiments with a prototype accelerator to demonstrate that Transformer operations can run on optical hardware despite noise and errors. Using simulations, validated by our experiments, we then explored the energy efficiency of optical implementations of Transformers and identified scaling laws for model performance with respect to optical energy usage. We found that the optical energy per multiply-accumulate (MAC) scales as $\frac{1}{d}$ where $d$ is the Transformer width, an asymptotic advantage over digital systems. We conclude that with well-engineered, large-scale optical hardware, it may be possible to achieve a $100 \times$ energy-efficiency advantage for running some of the largest current Transformer models, and that if both the models and the optical hardware are scaled to the quadrillion-parameter regime, optical computers could have a $>8,000\times$ energy-efficiency advantage over state-of-the-art digital-electronic processors that achieve 300 fJ/MAC. We analyzed how these results motivate and inform the construction of future optical accelerators along with optics-amenable deep-learning approaches. With assumptions about future improvements to electronics and Transformer quantization techniques (5$\times$ cheaper memory access, double the digital--analog conversion efficiency, and 4-bit precision), we estimated that optical computers' advantage against current 300-fJ/MAC digital processors could grow to $>100,000\times$.
Submitted 20 February, 2023;
originally announced February 2023.
-
Simulation-Driven Automated End-to-End Test and Oracle Inference
Authors:
Shreshth Tuli,
Kinga Bojarczuk,
Natalija Gucevska,
Mark Harman,
Xiao-Yu Wang,
Graham Wright
Abstract:
This is the first work to report on inferential testing at scale in industry. Specifically, it reports the experience of automated testing of integrity systems at Meta. We built an internal tool called ALPACAS for automated inference of end-to-end integrity tests. Integrity tests are designed to keep users safe online by checking that interventions take place when harmful behaviour occurs on a platform. ALPACAS infers not only the test input, but also the oracle, by observing production interventions to prevent harmful behaviour. This approach allows Meta to automate the process of generating integrity tests for its platforms, such as Facebook and Instagram, which consist of hundreds of millions of lines of production code. We outline the design and deployment of ALPACAS, and report results for its coverage, number of tests produced at each stage of the test inference process, and their pass rates. Specifically, we demonstrate that using ALPACAS significantly improves coverage from a manual test design for the particular aspect of integrity end-to-end testing it was applied to. Further, from a pool of 3 million data points, ALPACAS automatically yields 39 production-ready end-to-end integrity tests. We also report that the ALPACAS-inferred test suite enjoys exceptionally low flakiness for end-to-end testing with its average in-production pass rate of 99.84%.
Submitted 5 February, 2023;
originally announced February 2023.
-
Image sensing with multilayer, nonlinear optical neural networks
Authors:
Tianyu Wang,
Mandar M. Sohoni,
Logan G. Wright,
Martin M. Stein,
Shi-Yuan Ma,
Tatsuhiro Onodera,
Maxwell G. Anderson,
Peter L. McMahon
Abstract:
Optical imaging is commonly used for both scientific and technological applications across industry and academia. In image sensing, a measurement, such as of an object's position, is performed by computational analysis of a digitized image. An emerging image-sensing paradigm breaks this delineation between data collection and analysis by designing optical components to perform not imaging, but encoding. By optically encoding images into a compressed, low-dimensional latent space suitable for efficient post-analysis, these image sensors can operate with fewer pixels and fewer photons, allowing higher-throughput, lower-latency operation. Optical neural networks (ONNs) offer a platform for processing data in the analog, optical domain. ONN-based sensors have however been limited to linear processing, but nonlinearity is a prerequisite for depth, and multilayer NNs significantly outperform shallow NNs on many tasks. Here, we realize a multilayer ONN pre-processor for image sensing, using a commercial image intensifier as a parallel optoelectronic, optical-to-optical nonlinear activation function. We demonstrate that the nonlinear ONN pre-processor can achieve compression ratios of up to 800:1 while still enabling high accuracy across several representative computer-vision tasks, including machine-vision benchmarks, flow-cytometry image classification, and identification of objects in real scenes. In all cases we find that the ONN's nonlinearity and depth allowed it to outperform a purely linear ONN encoder. Although our experiments are specialized to ONN sensors for incoherent-light images, alternative ONN platforms should facilitate a range of ONN sensors. These ONN sensors may surpass conventional sensors by pre-processing optical information in spatial, temporal, and/or spectral dimensions, potentially with coherent and quantum qualities, all natively in the optical domain.
Submitted 27 July, 2022;
originally announced July 2022.
-
An optical neural network using less than 1 photon per multiplication
Authors:
Tianyu Wang,
Shi-Yuan Ma,
Logan G. Wright,
Tatsuhiro Onodera,
Brian Richard,
Peter L. McMahon
Abstract:
Deep learning has rapidly become a widespread tool in both scientific and commercial endeavors. Milestones of deep learning exceeding human performance have been achieved for a growing number of tasks over the past several years, across areas as diverse as game-playing, natural-language translation, and medical-image analysis. However, continued progress is increasingly hampered by the high energy costs associated with training and running deep neural networks on electronic processors. Optical neural networks have attracted attention as an alternative physical platform for deep learning, as it has been theoretically predicted that they can fundamentally achieve higher energy efficiency than neural networks deployed on conventional digital computers. Here, we experimentally demonstrate an optical neural network achieving 99% accuracy on handwritten-digit classification using ~3.2 detected photons per weight multiplication and ~90% accuracy using ~0.64 photons (~$2.4 \times 10^{-19}$ J of optical energy) per weight multiplication. This performance was achieved using a custom free-space optical processor that executes matrix-vector multiplications in a massively parallel fashion, with up to ~0.5 million scalar (weight) multiplications performed at the same time. Using commercially available optical components and standard neural-network training methods, we demonstrated that optical neural networks can operate near the standard quantum limit with extremely low optical powers and still achieve high accuracy. Our results provide a proof-of-principle for low-optical-power operation, and with careful system design including the surrounding electronics used for data storage and control, open up a path to realizing optical processors that require only $10^{-16}$ J total energy per scalar multiplication -- which is orders of magnitude more efficient than current digital processors.
Submitted 27 April, 2021;
originally announced April 2021.
-
Deep physical neural networks enabled by a backpropagation algorithm for arbitrary physical systems
Authors:
Logan G. Wright,
Tatsuhiro Onodera,
Martin M. Stein,
Tianyu Wang,
Darren T. Schachter,
Zoey Hu,
Peter L. McMahon
Abstract:
Deep neural networks have become a pervasive tool in science and engineering. However, modern deep neural networks' growing energy requirements now increasingly limit their scaling and broader use. We propose a radical alternative for implementing deep neural network models: Physical Neural Networks. We introduce a hybrid physical-digital algorithm called Physics-Aware Training to efficiently train sequences of controllable physical systems to act as deep neural networks. This method automatically trains the functionality of any sequence of real physical systems, directly, using backpropagation, the same technique used for modern deep neural networks. To illustrate their generality, we demonstrate physical neural networks with three diverse physical systems: optical, mechanical, and electrical. Physical neural networks may facilitate unconventional machine learning hardware that is orders of magnitude faster and more energy efficient than conventional electronic processors.
Submitted 27 April, 2021;
originally announced April 2021.
-
cuFINUFFT: a load-balanced GPU library for general-purpose nonuniform FFTs
Authors:
Yu-hsuan Shih,
Garrett Wright,
Joakim Andén,
Johannes Blaschke,
Alex H. Barnett
Abstract:
Nonuniform fast Fourier transforms dominate the computational cost in many applications including image reconstruction and signal processing. We thus present a general-purpose GPU-based CUDA library for type 1 (nonuniform to uniform) and type 2 (uniform to nonuniform) transforms in dimensions 2 and 3, in single or double precision. It achieves high performance for a given user-requested accuracy, regardless of the distribution of nonuniform points, via cache-aware point reordering, and load-balanced blocked spreading in shared memory. At low accuracies, this gives on-GPU throughputs around $10^9$ nonuniform points per second, and (even including host-device transfer) is typically 4-10$\times$ faster than the latest parallel CPU code FINUFFT (at 28 threads). It is competitive with two established GPU codes, being up to 90$\times$ faster at high accuracy and/or type 1 clustered point distributions. Finally we demonstrate a 5-12$\times$ speedup versus CPU in an X-ray diffraction 3D iterative reconstruction task at $10^{-12}$ accuracy, observing excellent multi-GPU weak scaling up to one rank per GPU.
Submitted 25 March, 2021; v1 submitted 16 February, 2021;
originally announced February 2021.
-
Estimating Uncertainty in Neural Networks for Cardiac MRI Segmentation: A Benchmark Study
Authors:
Matthew Ng,
Fumin Guo,
Labonny Biswas,
Steffen E. Petersen,
Stefan K. Piechnik,
Stefan Neubauer,
Graham Wright
Abstract:
Objective: Convolutional neural networks (CNNs) have demonstrated promise in automated cardiac magnetic resonance image segmentation. However, when using CNNs in a large real-world dataset, it is important to quantify segmentation uncertainty and identify segmentations which could be problematic. In this work, we performed a systematic study of Bayesian and non-Bayesian methods for estimating uncertainty in segmentation neural networks.
Methods: We evaluated Bayes by Backprop, Monte Carlo Dropout, Deep Ensembles, and Stochastic Segmentation Networks in terms of segmentation accuracy, probability calibration, uncertainty on out-of-distribution images, and segmentation quality control.
Results: We observed that Deep Ensembles outperformed the other methods except for images with heavy noise and blurring distortions. We showed that Bayes by Backprop is more robust to noise distortions while Stochastic Segmentation Networks are more resistant to blurring distortions. For segmentation quality control, we showed that segmentation uncertainty is correlated with segmentation accuracy for all the methods. With the incorporation of uncertainty estimates, we were able to reduce the percentage of poor segmentations to 5% by flagging 31--48% of the most uncertain segmentations for manual review, substantially lower than random review without using neural network uncertainty (reviewing 75--78% of all images).
Conclusion: This work provides a comprehensive evaluation of uncertainty estimation methods and shows that Deep Ensembles outperformed other methods in most cases.
Significance: Neural network uncertainty measures can help identify potentially inaccurate segmentations and alert users for manual review.
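The quality-control idea in the Results can be sketched as follows: rank segmentations by an uncertainty score, flag the most uncertain ones for manual review, and stop once the share of poor segmentations among the unflagged remainder falls to a target. This is a minimal illustration, not the paper's evaluation code. The function name, the per-image scalar uncertainty, the boolean `is_poor` labels, and the choice to measure the poor fraction over the unflagged images are all assumptions.

```python
def review_fraction_for_target(uncertainty, is_poor, target=0.05):
    """Smallest fraction of images to flag (most uncertain first) so that
    the fraction of poor segmentations among unflagged images is <= target.

    uncertainty: one scalar score per image (higher = less certain).
    is_poor:     one boolean per image (True = poor segmentation).
    """
    n = len(uncertainty)
    if n == 0:
        return 0.0
    # Indices sorted from most to least uncertain.
    order = sorted(range(n), key=lambda i: uncertainty[i], reverse=True)
    remaining_poor = sum(is_poor)
    for flagged in range(n + 1):
        remaining = n - flagged
        if remaining == 0 or remaining_poor / remaining <= target:
            return flagged / n
        # Flag the next most uncertain image and remove it from the pool.
        if is_poor[order[flagged]]:
            remaining_poor -= 1
    return 1.0

# Toy data: the two poor segmentations also carry the highest uncertainty.
scores = [0.9, 0.8, 0.1, 0.2, 0.3, 0.05, 0.15, 0.25, 0.35, 0.12]
poor = [True, True, False, False, False, False, False, False, False, False]
fraction = review_fraction_for_target(scores, poor, target=0.05)
```

In this toy case flagging the two most uncertain images (20% of the set) removes every poor segmentation. When uncertainty is uninformative, a much larger share has to be reviewed to reach the same target, which is the comparison the abstract draws against random review.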
Submitted 30 December, 2022; v1 submitted 31 December, 2020;
originally announced December 2020.
-
Improved Time Warp Edit Distance -- A Parallel Dynamic Program in Linear Memory
Authors:
Garrett Wright
Abstract:
Edit Distance is a classic family of dynamic programming problems, among which Time Warp Edit Distance refines the problem with the notion of a metric and temporal elasticity. A novel Improved Time Warp Edit Distance algorithm is presented that is both massively parallelizable and requires only linear storage. This method uses the procession of a three-diagonal band to cover the original dynamic program space. Every element of the diagonal update can be computed in parallel. The core method is a feature of the TWED Longest Common Subsequence data dependence and is applicable to dynamic programs that share similar band subproblem structure. The algorithm has been implemented as a CUDA C library with Python bindings. Speedups for challenging problems are phenomenal.
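The three-diagonal band can be illustrated on a simpler problem with the same data dependence. The sketch below computes plain Levenshtein distance (not TWED, which additionally uses timestamps and stiffness/penalty parameters) by sweeping anti-diagonals and keeping only three of them in memory. The function name and the unit costs are assumptions. The inner loop is written serially, but each cell on a diagonal reads only the two previous diagonals, so the cells could be updated in parallel.

```python
def wavefront_edit_distance(a, b):
    """Levenshtein distance via an anti-diagonal sweep in linear memory.

    Cell (i, j) lies on diagonal k = i + j and depends on
      (i-1, j)   and (i, j-1)  -> diagonal k-1
      (i-1, j-1)               -> diagonal k-2
    so three buffers indexed by i are sufficient.
    """
    n, m = len(a), len(b)
    prev2 = [0] * (n + 1)  # diagonal k-2
    prev1 = [0] * (n + 1)  # diagonal k-1
    cur = [0] * (n + 1)    # diagonal k
    for k in range(n + m + 1):
        lo, hi = max(0, k - m), min(n, k)
        # Every i in [lo, hi] is independent of the others on this diagonal.
        for i in range(lo, hi + 1):
            j = k - i
            if i == 0:
                cur[i] = j                  # insert j characters
            elif j == 0:
                cur[i] = i                  # delete i characters
            else:
                sub = 0 if a[i - 1] == b[j - 1] else 1
                cur[i] = min(
                    prev1[i - 1] + 1,       # from (i-1, j): deletion
                    prev1[i] + 1,           # from (i, j-1): insertion
                    prev2[i - 1] + sub,     # from (i-1, j-1): match/substitute
                )
        # Rotate buffers: the oldest diagonal is reused as scratch space.
        prev2, prev1, cur = prev1, cur, prev2
    return prev1[n]  # cell (n, m) sits on the last diagonal computed

distance = wavefront_edit_distance("kitten", "sitting")
```

Storage is three arrays of length `len(a) + 1` regardless of `len(b)`, in contrast to the full `(n + 1) x (m + 1)` table of the textbook formulation.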
Submitted 31 July, 2020;
originally announced July 2020.
-
A Robust Hyperviscosity Formulation for Stable RBF-FD Discretizations of Advection-Diffusion-Reaction Equations on Manifolds
Authors:
Varun Shankar,
Grady B. Wright,
Akil Narayan
Abstract:
We present a new hyperviscosity formulation for stabilizing radial basis function-finite difference (RBF-FD) discretizations of advection-diffusion-reaction equations on manifolds $\mathbb{M} \subset \mathbb{R}^3$ of co-dimension one. Our technique involves automatic addition of artificial hyperviscosity to damp out spurious modes in the differentiation matrices corresponding to surface gradients, in the process overcoming a technical limitation of a recently-developed Euclidean formulation. Like the Euclidean formulation, the manifold formulation relies on von Neumann stability analysis performed on auxiliary differential operators that mimic the spurious solution growth induced by RBF-FD differentiation matrices. We demonstrate high-order convergence rates on problems involving surface advection and surface advection-diffusion. Finally, we demonstrate the applicability of our formulation to advection-diffusion-reaction equations on manifolds described purely as point clouds. Our surface discretizations use the recently-developed RBF-LOI method, and with the addition of hyperviscosity, are now empirically high-order accurate, stable, and free of stagnation errors.
Submitted 24 April, 2020; v1 submitted 15 October, 2019;
originally announced October 2019.
-
Instance-Level Microtubule Tracking
Authors:
Samira Masoudi,
Afsaneh Razi,
Cameron H. G. Wright,
Jay C. Gatlin,
Ulas Bagci
Abstract:
We propose a new method of instance-level microtubule (MT) tracking in time-lapse image series using recurrent attention. Our novel deep learning algorithm segments individual MTs at each frame. Segmentation results from successive frames are used to assign correspondences among MTs. This ultimately generates a distinct path trajectory for each MT through the frames. Based on these trajectories, we estimate MT velocities. To validate our proposed technique, we conduct experiments using real and simulated data. We use statistics derived from real time-lapse series of MT gliding assays to simulate realistic MT time-lapse image series in our simulated data. This dataset is used for pre-training and hyperparameter optimization of our network before training on the real data. Our experimental results show that the proposed supervised learning algorithm drastically improves the precision of MT instance velocity estimation to 71.3% from the baseline result (29.3%). We also demonstrate how the inclusion of temporal information in our deep network can reduce the false negative rate from 67.8% (baseline) to 28.7% (proposed). Our findings in this work are expected to help biologists characterize the spatial arrangement of MTs, specifically the effects of MT-MT interactions.
Submitted 20 September, 2019; v1 submitted 17 January, 2019;
originally announced January 2019.
-
Credibility and Dynamics of Collective Attention
Authors:
Tanushree Mitra,
Graham Wright,
Eric Gilbert
Abstract:
Today, social media provide the means by which billions of people experience news and events happening around the world. However, the absence of traditional journalistic gatekeeping allows information to flow unencumbered through these platforms, often raising questions of veracity and credibility of the reported information. Here we ask: How do the dynamics of collective attention directed toward an event reported on social media vary with its perceived credibility? By examining the first large-scale, systematically tracked credibility database of public Twitter messages (47M messages corresponding to 1,138 real-world events over a span of three months), we established a relationship between the temporal dynamics of events reported on social media and their associated level of credibility judgments. Representing collective attention by the aggregate temporal signatures of an event reportage, we found that the amount of continued attention focused on an event provides information about its associated levels of perceived credibility. Events exhibiting sustained, intermittent bursts of attention were found to be associated with lower levels of perceived credibility. In other words, as more people showed interest during moments of transient collective attention, the associated uncertainty surrounding these events also increased.
Submitted 26 December, 2016;
originally announced December 2016.
-
A Radial Basis Function (RBF)-Finite Difference Method for the Simulation of Reaction-Diffusion Equations on Stationary Platelets within the Augmented Forcing Method
Authors:
Varun Shankar,
Grady B. Wright,
Aaron L. Fogelson,
Robert M. Kirby
Abstract:
We present a computational method for solving the coupled problem of chemical transport in a fluid (blood) with binding/unbinding of the chemical to/from cellular (platelet) surfaces in contact with the fluid, and with transport of the chemical on the cellular surfaces. The overall framework is the Augmented Forcing Point Method (AFM) (\emph{L. Yao and A.L. Fogelson, Simulations of chemical transport and reaction in a suspension of cells I: An augmented forcing point method for the stationary case, IJNMF (2012) 69, 1736-52.}) for solving fluid-phase transport in a region outside of a collection of cells suspended in the fluid. We introduce a novel Radial Basis Function-Finite Difference (RBF-FD) method to solve reaction-diffusion equations on the surface of each of a collection of 2D stationary platelets suspended in blood. Parametric RBFs are used to represent the geometry of the platelets and give accurate geometric information needed for the RBF-FD method. Symmetric Hermite-RBF interpolants are used for enforcing the boundary conditions on the fluid-phase chemical concentration, and their use removes a significant limitation of the original AFM. The efficacy of the new methods is shown through a series of numerical experiments; in particular, second-order convergence for the coupled problem is demonstrated.
Submitted 21 September, 2015; v1 submitted 19 October, 2013;
originally announced October 2013.