-
Agent-Based Emulation for Deploying Robot Swarm Behaviors
Authors:
Ricardo Vega,
Kevin Zhu,
Connor Mattson,
Daniel S. Brown,
Cameron Nowzari
Abstract:
Despite significant research, robotic swarms have yet to be useful in solving real-world problems, largely due to the difficulty of creating and controlling swarming behaviors in multi-agent systems. Traditional top-down approaches in which a desired emergent behavior is produced often require complex, resource-heavy robots, limiting their practicality. This paper introduces a bottom-up approach based on Embodied Agent-Based Modeling and Simulation, emphasizing the use of simple robots and identifying conditions that naturally lead to self-organized collective behaviors. Using the Reality-to-Simulation-to-Reality for Swarms (RSRS) process, we tightly integrate real-world experiments with simulations to reproduce known swarm behaviors and to discover a novel emergent behavior, without aiming to eliminate or even reduce the sim2real gap. This paper presents the development of an Agent-Based Embodiment and Emulation process that balances the importance of running physical swarming experiments against the prohibitively time-consuming process of setting up and running even a single experiment with 20+ robots, leveraging low-fidelity lightweight simulations to enable hypothesis formation that guides physical experiments. We demonstrate the usefulness of our methods by emulating two known behaviors from the literature and show a third behavior 'discovered' by accident.
Submitted 21 October, 2024;
originally announced October 2024.
-
Spiking Neural Networks as a Controller for Emergent Swarm Agents
Authors:
Kevin Zhu,
Connor Mattson,
Shay Snyder,
Ricardo Vega,
Daniel S. Brown,
Maryam Parsa,
Cameron Nowzari
Abstract:
Drones that can swarm and loiter in a certain area cost hundreds of dollars, but mosquitos can do the same and are essentially worthless. To control swarms of low-cost robots, researchers may end up spending countless hours brainstorming robot configurations and policies to "organically" create behaviors that do not need expensive sensors and perception. Existing research explores the possible emergent behaviors in swarms of robots with only a binary sensor and a simple but hand-picked controller structure. Even agents in this highly limited sensing, actuation, and computational capability class can exhibit relatively complex global behaviors such as aggregation, milling, and dispersal, but finding the local interaction rules that enable more collective behaviors remains a significant challenge. This paper investigates the feasibility of training spiking neural networks to find those local interaction rules that result in particular emergent behaviors. We focus on simulating a specific milling behavior already known to be producible using very simple binary sensing and acting agents. To do this, we use evolutionary algorithms to evolve not only the parameters (the weights, biases, and delays) of a spiking neural network, but also its structure. To create a baseline, we also show an evolutionary search strategy over the parameters for the incumbent hand-picked binary controller structure. Our simulations show that spiking neural networks can be evolved in binary sensing agents to form a mill.
Submitted 21 October, 2024;
originally announced October 2024.
-
Chattronics: using GPTs to assist in the design of data acquisition systems
Authors:
Jonathan Paul Driemeyer Brown,
Tiago Oliveira Weber
Abstract:
The usefulness of Large Language Models (LLMs) is being continuously tested in various fields. However, their intrinsic linguistic characteristic is still one of the limiting factors when applying these models to exact sciences. In this article, a novel approach to using Generative Pre-trained Transformers (GPTs) to assist in the design phase of data acquisition systems is presented. The solution is packaged in the form of an application that retains the conversational aspects of LLMs, in such a manner that the user must provide details on the desired project in order for the model to draft both a system-level architectural diagram and the block-level specifications, following a Top-Down methodology based on restrictions. To test this tool, two distinct user emulations were used, one of which uses an additional GPT model. In total, 4 different data acquisition projects were used in the testing phase, each with its own measurement requirements: angular position, temperature, acceleration and a fourth project with both pressure and superficial temperature measurements. After 160 test iterations, the study concludes that there is potential for these models to serve adequately as synthesis/assistant tools for data acquisition systems, but there are still technological limitations. The results show coherent architectures and topologies, but also that GPTs have difficulty considering all requirements simultaneously and often commit theoretical mistakes.
Submitted 23 September, 2024;
originally announced September 2024.
-
Diffusion and Multi-Domain Adaptation Methods for Eosinophil Segmentation
Authors:
Kevin Lin,
Donald Brown,
Sana Syed,
Adam Greene
Abstract:
Eosinophilic Esophagitis (EoE) represents a challenging condition for medical providers today. The cause is currently unknown, the impact on a patient's daily life is significant, and the condition is increasing in prevalence. Traditional approaches for medical image diagnosis such as standard deep learning algorithms are limited by the relatively small amount of data and difficulty in generalization. As a response, two methods have arisen that seem to perform well: Diffusion and Multi-Domain methods, with current research efforts favoring diffusion methods. For the EoE dataset, we discovered that a Multi-Domain Adversarial Network outperformed a diffusion-based method, with an FID of 42.56 compared to 50.65. Future work with diffusion methods should include a comparison with Multi-Domain adaptation methods to ensure that the best performance is achieved.
Submitted 17 March, 2024;
originally announced March 2024.
-
Uncertainty Quantification for Eosinophil Segmentation
Authors:
Kevin Lin,
Donald Brown,
Sana Syed,
Adam Greene
Abstract:
Eosinophilic Esophagitis (EoE) is an allergic condition increasing in prevalence. To diagnose EoE, pathologists must find 15 or more eosinophils within a single high-power field (400X magnification). Determining whether or not a patient has EoE can be an arduous process, and any medical imaging approaches used to assist diagnosis must consider both efficiency and precision. We propose an improvement of Adorno et al.'s approach for quantifying eosinophils using deep image segmentation. Our new approach leverages Monte Carlo Dropout, a common approach in deep learning to reduce overfitting, to provide uncertainty quantification on current deep learning models. The uncertainty can be visualized in an output image to evaluate model performance, provide insight into how deep learning algorithms function, and assist pathologists in identifying eosinophils.
Submitted 7 November, 2023; v1 submitted 28 September, 2023;
originally announced September 2023.
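The Monte Carlo Dropout idea mentioned in the abstract above can be illustrated with a minimal sketch. It is not the authors' segmentation model: the two-layer network, the weights `w1` and `w2`, the function name `mc_dropout_predict`, and the toy "pixel" features are all hypothetical. The sketch only shows the general mechanism, which is to keep dropout active at prediction time, repeat the forward pass, and read the spread of the outputs as an uncertainty estimate.

```python
import numpy as np

def mc_dropout_predict(x, w1, w2, drop_prob=0.5, n_passes=100, seed=0):
    """Run n_passes stochastic forward passes with dropout left on.

    Returns the per-output mean (prediction) and standard deviation
    (uncertainty estimate) across the passes.
    """
    rng = np.random.default_rng(seed)
    outputs = []
    for _ in range(n_passes):
        hidden = np.maximum(x @ w1, 0.0)               # ReLU hidden layer
        keep = rng.random(hidden.shape) >= drop_prob   # fresh dropout mask per pass
        hidden = hidden * keep / (1.0 - drop_prob)     # inverted-dropout scaling
        logits = hidden @ w2
        outputs.append(1.0 / (1.0 + np.exp(-logits)))  # sigmoid probability
    outputs = np.stack(outputs)
    return outputs.mean(axis=0), outputs.std(axis=0)

# Toy usage: 4 "pixels" with 3 features each and random (untrained) weights.
rng = np.random.default_rng(1)
x = rng.normal(size=(4, 3))
w1 = rng.normal(size=(3, 8))
w2 = rng.normal(size=(8, 1))
mean, std = mc_dropout_predict(x, w1, w2)
```

In an image setting, `std` would be reshaped to the image grid and displayed as an uncertainty map next to the predicted mask.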
-
Indirect Swarm Control: Characterization and Analysis of Emergent Swarm Behaviors
Authors:
Ricardo Vega,
Connor Mattson,
Daniel S. Brown,
Cameron Nowzari
Abstract:
Emergence and emergent behaviors are often defined as cases where changes in local interactions between agents at a lower level effectively change what occurs in the higher level of the system (i.e., the whole swarm) and its properties. However, the manner in which these collective emergent behaviors self-organize is less understood. The focus of this paper is on presenting a new framework for characterizing the conditions that lead to different macrostates and how to predict/analyze their macroscopic properties, allowing us to indirectly engineer the same behaviors from the bottom up by tuning their environmental conditions rather than local interaction rules. We then apply this framework to a simple system of binary sensing and acting agents as an example to see if a re-framing of this swarm problem can help us push the state of the art forward. By first creating some working definitions of macrostates in a particular swarm system, we show how agent-based modeling may be combined with control theory to enable a generalized understanding of controllable emergent processes without needing to simulate everything. Whereas phase diagrams can generally only be created through Monte Carlo simulations or sweeping through ranges of parameters in a simulator, we develop closed-form functions that can immediately produce them, revealing an infinite set of swarm parameter combinations that can lead to a specifically chosen self-organized behavior. While the exact methods are still under development, we believe that simply laying out a potential path, via a novel method, towards solutions that have evaded our traditional methods is worth considering. Our results are characterized through both simulations and real experiments on ground robots.
Submitted 28 March, 2024; v1 submitted 20 September, 2023;
originally announced September 2023.
-
Label-efficient Contrastive Learning-based model for nuclei detection and classification in 3D Cardiovascular Immunofluorescent Images
Authors:
Nazanin Moradinasab,
Rebecca A. Deaton,
Laura S. Shankman,
Gary K. Owens,
Donald E. Brown
Abstract:
Recently, deep learning-based methods have achieved promising performance in nuclei detection and classification applications. However, training deep learning-based methods requires a large amount of pixel-wise annotated data, which is time-consuming and labor-intensive to produce, especially in 3D images. An alternative approach is to adapt weak-annotation methods, such as labeling each nucleus with a point, but this method does not extend from 2D histopathology images (for which it was originally developed) to 3D immunofluorescent images. The reason is that 3D images contain multiple channels (z-axis) for nuclei and different markers separately, which makes training using point annotations difficult. To address this challenge, we propose the Label-efficient Contrastive learning-based (LECL) model to detect and classify various types of nuclei in 3D immunofluorescent images. Previous methods use Maximum Intensity Projection (MIP) to convert immunofluorescent images with multiple slices to 2D images, which can cause signals from different z-stacks to falsely appear associated with each other. To overcome this, we devised an Extended Maximum Intensity Projection (EMIP) approach that addresses this issue with MIP. Furthermore, we applied a Supervised Contrastive Learning (SCL) approach for weakly supervised settings. We conducted experiments on cardiovascular datasets and found that our proposed framework is effective and efficient in detecting and classifying various types of nuclei in 3D immunofluorescent images.
Submitted 14 January, 2024; v1 submitted 7 September, 2023;
originally announced September 2023.
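The Maximum Intensity Projection limitation described in the abstract above can be seen on a toy example. The sketch below uses a hypothetical 3-slice, 4x4 image stack and plain NumPy. It illustrates ordinary MIP only and makes no attempt to reproduce the authors' EMIP method.

```python
import numpy as np

# Hypothetical toy stack with shape (z, y, x): 3 slices of a 4x4 image.
stack = np.zeros((3, 4, 4))
stack[0, 1, 1] = 0.9   # nucleus signal in slice 0
stack[2, 1, 2] = 0.8   # marker signal in slice 2, at a neighbouring (y, x)

# Maximum intensity projection: keep the brightest value along z.
mip = stack.max(axis=0)

# Record which slice each projected value came from (lost in the 2D image).
source_slice = stack.argmax(axis=0)
```

In `mip`, the two signals sit in adjacent pixels and look associated, although `source_slice` shows they come from slices 0 and 2.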
-
Neural Volumetric Reconstruction for Coherent Synthetic Aperture Sonar
Authors:
Albert W. Reed,
Juhyeon Kim,
Thomas Blanford,
Adithya Pediredla,
Daniel C. Brown,
Suren Jayasuriya
Abstract:
Synthetic aperture sonar (SAS) measures a scene from multiple views in order to increase the resolution of reconstructed imagery. Image reconstruction methods for SAS coherently combine measurements to focus acoustic energy onto the scene. However, image formation is typically under-constrained due to a limited number of measurements and bandlimited hardware, which limits the capabilities of existing reconstruction methods. To help meet these challenges, we design an analysis-by-synthesis optimization that leverages recent advances in neural rendering to perform coherent SAS imaging. Our optimization enables us to incorporate physics-based constraints and scene priors into the image formation process. We validate our method on simulation and experimental results captured in both air and water. We demonstrate both quantitatively and qualitatively that our method typically produces better reconstructions than existing approaches. We share code and data for reproducibility.
Submitted 16 June, 2023;
originally announced June 2023.
-
Physics-Based Acoustic Holograms
Authors:
Antonio Stanziola,
Ben T. Cox,
Bradley E. Treeby,
Michael D. Brown
Abstract:
Advances in additive manufacturing have enabled the realisation of inexpensive, scalable, diffractive acoustic lenses that can be used to generate complex acoustic fields via phase and/or amplitude modulation. However, the design of these holograms relies on a thin-element approximation adapted from optics which can severely limit the fidelity of the realised acoustic field. Here, we introduce physics-based acoustic holograms with a complex internal structure. The structures are designed using a differentiable acoustic model with manufacturing constraints via optimisation of the acoustic property distribution within the hologram. The holograms can be fabricated simply and inexpensively using contemporary 3D printers. Experimental measurements demonstrate a significant improvement compared to conventional thin-element holograms.
Submitted 5 May, 2023;
originally announced May 2023.
-
Automated Time-frequency Domain Audio Crossfades using Graph Cuts
Authors:
Kyle Robinson,
Dan Brown
Abstract:
The problem of transitioning smoothly from one audio clip to another arises in many music consumption scenarios, especially as music consumption has moved from professionally curated and live-streamed radio to personal playback devices and services. We present the first steps toward a new method of automatically transitioning from one audio clip to another by discretizing the frequency spectrum into bins and then finding transition times for each bin. We phrase the problem as one of graph flow optimization, specifically min-cut/max-flow.
Submitted 30 January, 2023;
originally announced January 2023.
-
Approximate Extraction of Late-Time Returns via Morphological Component Analysis
Authors:
Geoff Goehle,
Benjamin Cowen,
Thomas E. Blanford,
J. Daniel Park,
Daniel C. Brown
Abstract:
A fundamental challenge in acoustic data processing is to separate a measured time series into relevant phenomenological components. A given measurement is typically assumed to be an additive mixture of myriad signals plus noise whose separation forms an ill-posed inverse problem. In the setting of sensing elastic objects using active sonar, we wish to separate the early-time returns (e.g., returns from the object's exterior geometry) from late-time returns caused by elastic or compressional wave coupling.
Under the framework of Morphological Component Analysis (MCA), we compare two separation models using the short-duration and long-duration responses as a proxy for early-time and late-time returns. Results are computed for Stanton's elastic cylinder model as well as on experimental data taken from an in-Air circular Synthetic Aperture Sonar (AirSAS) system, whose separated time series are formed into imagery. We find that MCA can be used to separate early-time and late-time responses in both cases without the use of time-gating. The separation process is demonstrated to be robust to noise and compatible with AirSAS image reconstruction. The best separation results are obtained with a flexible, but computationally intensive, frame-based signal model, while a faster Fourier Transform-based method is shown to have competitive performance.
Submitted 11 August, 2022;
originally announced August 2022.
-
Encoding Cardiopulmonary Exercise Testing Time Series as Images for Classification using Convolutional Neural Network
Authors:
Yash Sharma,
Nick Coronato,
Donald E. Brown
Abstract:
Exercise testing has been available for more than a half-century and is a remarkably versatile tool for obtaining diagnostic and prognostic information on patients for a range of diseases, especially cardiovascular and pulmonary. With rapid advancements in technology, wearables, and learning algorithms in the last decade, its scope has evolved. Specifically, Cardiopulmonary exercise testing (CPX) is one of the most commonly used laboratory tests for objective evaluation of exercise capacity and performance levels in patients. CPX provides a non-invasive, integrative assessment of the pulmonary, cardiovascular, and skeletal muscle systems involving the measurement of gas exchanges. However, its assessment is challenging, requiring the individual to process multiple time series data points, leading to simplification to peak values and slopes. But this simplification can discard the valuable trend information present in these time series. In this work, we encode the time series as images using the Gramian Angular Field and Markov Transition Field and use them with a convolutional neural network and attention pooling approach for the classification of heart failure and metabolic syndrome patients. Using GradCAMs, we highlight the discriminative features identified by the model.
Submitted 26 April, 2022;
originally announced April 2022.
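To make the Gramian Angular Field encoding named in the abstract above concrete, here is a minimal sketch of the commonly used summation variant. The abstract does not say which variant or preprocessing the authors use, so the min-max rescaling, the function name, and the three-sample input are assumptions for illustration only. The Markov Transition Field is not covered.

```python
import numpy as np

def gramian_angular_summation_field(series):
    """Encode a 1D series as a 2D image with entries cos(phi_i + phi_j).

    Assumes the series is not constant (max > min).
    """
    x = np.asarray(series, dtype=float)
    lo, hi = x.min(), x.max()
    scaled = 2.0 * (x - lo) / (hi - lo) - 1.0   # rescale to [-1, 1]
    scaled = np.clip(scaled, -1.0, 1.0)         # guard against rounding drift
    phi = np.arccos(scaled)                     # polar-angle encoding
    return np.cos(phi[:, None] + phi[None, :])  # pairwise angular sums

# Hypothetical three-sample series, for illustration only.
gaf = gramian_angular_summation_field([0.0, 0.5, 1.0])
```

A series of length n yields an n x n image, so a convolutional network can consume it like any other single-channel image.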
-
SINR: Deconvolving Circular SAS Images Using Implicit Neural Representations
Authors:
Albert Reed,
Thomas Blanford,
Daniel C. Brown,
Suren Jayasuriya
Abstract:
Circular synthetic aperture sonars (CSAS) capture multiple observations of a scene to reconstruct high-resolution images. We can characterize resolution by modeling CSAS imaging as the convolution between a scene's underlying point scattering distribution and a system-dependent point spread function (PSF). The PSF is a function of the transmitted waveform's bandwidth and determines a fixed degree of blurring on reconstructed imagery. In theory, deconvolution overcomes bandwidth limitations by reversing the PSF-induced blur and recovering the scene's scattering distribution. However, deconvolution is an ill-posed inverse problem and sensitive to noise. We propose a self-supervised pipeline (one that does not require training data) that leverages an implicit neural representation (INR) for deconvolving CSAS images. We highlight the performance of our SAS INR pipeline, which we call SINR, by implementing and comparing to existing deconvolution methods. Additionally, prior SAS deconvolution methods assume a spatially-invariant PSF, which we demonstrate yields subpar performance in practice. We provide theory and methods to account for a spatially-varying CSAS PSF, and demonstrate that doing so enables SINR to achieve superior deconvolution performance on simulated and real acoustic SAS data. We provide code to encourage reproducibility of research.
Submitted 16 October, 2022; v1 submitted 21 April, 2022;
originally announced April 2022.
-
Enveloped Sinusoid Parseval Frames
Authors:
Geoff Goehle,
Benjamin Cowen,
J. Daniel Park,
Daniel C. Brown
Abstract:
This paper presents a method of constructing Parseval frames from any collection of complex envelopes. The resulting Enveloped Sinusoid Parseval (ESP) frames can represent a wide variety of signal types as specified by their physical morphology. Since the ESP frame retains its Parseval property even when generated from a variety of envelopes, it is compatible with large scale and iterative optimization algorithms. ESP frames are constructed by applying time-shifted enveloping functions to the discrete Fourier Transform basis, and in this way are similar to the short-time Fourier Transform.
This work provides examples of ESP frame generation for both synthetic and experimentally measured signals. Furthermore, the frame's compatibility with distributed sparse optimization frameworks is demonstrated, and efficient implementation details are provided. Numerical experiments on acoustics data reveal that the flexibility of this method allows it to be simultaneously competitive with the STFT in time-frequency processing and also with Prony's Method for time-constant parameter estimation, surpassing the shortcomings of each individual technique.
Submitted 18 April, 2022;
originally announced April 2022.
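The Parseval property that the abstract above relies on can be checked numerically on a simple stand-in construction. The sketch below is not the authors' ESP frame. It assumes hypothetical raised-cosine envelopes normalized so that their squared magnitudes sum to one at every sample, and applies them to the orthonormal DFT basis. Under that assumption the coefficient energy equals the signal energy, and the same envelopes reconstruct the signal. All function names are illustrative.

```python
import numpy as np

def make_envelopes(n, n_env):
    """Hypothetical envelopes: circularly shifted raised-cosine bumps,
    normalized so that sum_k |w_k[t]|^2 == 1 at every sample t."""
    t = np.arange(n)
    centers = np.arange(n_env) * n / n_env
    d = np.abs(t[None, :] - centers[:, None])
    d = np.minimum(d, n - d)                  # circular distance to each centre
    width = 2.0 * n / n_env
    w = np.where(d < width, 0.5 * (1.0 + np.cos(np.pi * d / width)), 0.0)
    return w / np.sqrt((w ** 2).sum(axis=0, keepdims=True))

def analyze(x, w):
    """Inner products of x with every enveloped sinusoid (one row per envelope)."""
    return np.fft.fft(x[None, :] * np.conj(w), axis=1, norm="ortho")

def synthesize(coeffs, w):
    """Weighted sum of the frame vectors; inverts analyze() for a Parseval frame."""
    return (w * np.fft.ifft(coeffs, axis=1, norm="ortho")).sum(axis=0)

n, n_env = 64, 8
w = make_envelopes(n, n_env)
x = np.random.default_rng(0).normal(size=n)

coeffs = analyze(x, w)
energy_in = np.sum(np.abs(x) ** 2)
energy_coeffs = np.sum(np.abs(coeffs) ** 2)   # equals energy_in up to rounding
x_rec = synthesize(coeffs, w)
```

The frame here is redundant (8 x 64 coefficients for a 64-sample signal), yet analysis followed by synthesis returns the input without a separately computed dual frame, which is what makes Parseval frames convenient inside iterative solvers.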
-
Implicit Neural Representations for Deconvolving SAS Images
Authors:
Albert Reed,
Thomas Blanford,
Daniel C. Brown,
Suren Jayasuriya
Abstract:
Synthetic aperture sonar (SAS) image resolution is constrained by waveform bandwidth and array geometry. Specifically, the waveform bandwidth determines a point spread function (PSF) that blurs the locations of point scatterers in the scene. In theory, deconvolving the reconstructed SAS image with the scene PSF restores the original distribution of scatterers and yields sharper reconstructions. However, deconvolution is an ill-posed operation that is highly sensitive to noise. In this work, we leverage implicit neural representations (INRs), shown to be strong priors for the natural image space, to deconvolve SAS images. Importantly, our method does not require training data, as we perform our deconvolution through an analysis-by-synthesis optimization in a self-supervised fashion. We validate our method on simulated SAS data created with a point scattering model and real data captured with an in-air circular SAS. This work is an important first step towards applying neural networks for SAS image deconvolution.
Submitted 15 December, 2021;
originally announced December 2021.
-
Optimal Cost Design for Model Predictive Control
Authors:
Avik Jain,
Lawrence Chan,
Daniel S. Brown,
Anca D. Dragan
Abstract:
Many robotics domains use some form of nonconvex model predictive control (MPC) for planning, which sets a reduced time horizon, performs trajectory optimization, and replans at every step. The actual task typically requires a much longer horizon than is computationally tractable, and is specified via a cost function that accumulates over that full horizon. For instance, an autonomous car may have a cost function that makes a desired trade-off between efficiency, safety, and obeying traffic laws. In this work, we challenge the common assumption that the cost we optimize using MPC should be the same as the ground truth cost for the task (plus a terminal cost). MPC solvers can suffer from short planning horizons, local optima, and incorrect dynamics models, and, importantly, they fail to account for future replanning ability. Thus, we propose that in many tasks it could be beneficial to purposefully choose a different cost function for MPC to optimize: one that results in the MPC rollout having low ground truth cost, rather than the MPC planned trajectory. We formalize this as an optimal cost design problem, and propose a zeroth-order optimization-based approach that enables us to design optimal costs for an MPC planning robot in continuous MDPs. We test our approach in an autonomous driving domain where we find costs different from the ground truth that implicitly compensate for replanning, short horizon, incorrect dynamics models, and local minima issues. As an example, the learned cost incentivizes MPC to delay its decision until later, implicitly accounting for the fact that it will get more information in the future and be able to make a better decision. Code and videos available at https://sites.google.com/berkeley.edu/ocd-mpc/.
Submitted 9 June, 2021; v1 submitted 22 April, 2021;
originally announced April 2021.
-
Cluster-to-Conquer: A Framework for End-to-End Multi-Instance Learning for Whole Slide Image Classification
Authors:
Yash Sharma,
Aman Shrivastava,
Lubaina Ehsan,
Christopher A. Moskaluk,
Sana Syed,
Donald E. Brown
Abstract:
In recent years, the availability of digitized Whole Slide Images (WSIs) has enabled the use of deep learning-based computer vision techniques for automated disease diagnosis. However, WSIs present unique computational and algorithmic challenges. WSIs are gigapixel-sized ($\sim$100K pixels), making them infeasible to use directly for training deep neural networks. Also, often only slide-level labels are available for training, as detailed annotations are tedious and can be time-consuming for experts. Approaches using multiple-instance learning (MIL) frameworks have been shown to overcome these challenges. Current state-of-the-art approaches divide the learning framework into two decoupled parts: a convolutional neural network (CNN) for encoding the patches, followed by an independent aggregation approach for slide-level prediction. In this approach, the aggregation step has no bearing on the representations learned by the CNN encoder. We propose Cluster-to-Conquer (C2C), an end-to-end framework that clusters the patches from a WSI into ${k}$ groups, samples ${k}'$ patches from each group for training, and uses an adaptive attention mechanism for slide-level prediction. We demonstrate that dividing a WSI into clusters can improve the model training by exposing it to diverse discriminative features extracted from the patches. We regularize the clustering mechanism by introducing a KL-divergence loss between the attention weights of patches in a cluster and the uniform distribution. The framework is optimized end-to-end on slide-level cross-entropy, patch-level cross-entropy, and KL-divergence loss (Implementation: https://github.com/YashSharma/C2C).
Submitted 13 June, 2021; v1 submitted 19 March, 2021;
originally announced March 2021.
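As a rough illustration of the regularizer mentioned in the Cluster-to-Conquer abstract above (a KL-divergence term between the attention weights of patches in a cluster and the uniform distribution), the following minimal sketch computes that quantity in plain Python. The function name `kl_to_uniform`, the direction of the divergence (attention relative to uniform), and the normalization step are assumptions made for illustration; the abstract does not specify them, and the authors' implementation may differ.

```python
import math


def kl_to_uniform(attention_weights):
    """KL(a || u) for attention weights a and a uniform distribution u.

    Hypothetical helper: the divergence direction and the normalization
    are assumptions, not details taken from the paper.
    """
    n = len(attention_weights)
    total = sum(attention_weights)
    kl = 0.0
    for w in attention_weights:
        p = w / total  # normalize so the weights sum to 1
        if p > 0.0:  # 0 * log(0) is treated as 0
            kl += p * math.log(p * n)  # p * log(p / (1 / n))
    return kl


# Evenly spread attention within a cluster incurs no penalty.
uniform = kl_to_uniform([0.25, 0.25, 0.25, 0.25])
# Attention concentrated on one patch incurs the maximum penalty, log(n).
peaked = kl_to_uniform([1.0, 0.0, 0.0, 0.0])
```

Under these assumptions the term is zero when attention is spread evenly over the patches in a cluster and grows as attention concentrates on a few patches, which is the behavior a regularizer toward the uniform distribution would penalize.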
-
GPU Acceleration for Synthetic Aperture Sonar Image Reconstruction
Authors:
Isaac D. Gerg,
Daniel C. Brown,
Stephen G. Wagner,
Daniel Cook,
Brian N. O'Donnell,
Thomas Benson,
Thomas C. Montgomery
Abstract:
Synthetic aperture sonar (SAS) image reconstruction, or beamforming as it is often referred to within the SAS community, comprises a class of computationally intensive algorithms for creating coherent high-resolution imagery from successive spatially varying sonar pings. Image reconstruction is usually performed topside because of the large compute burden necessitated by the procedure. Historically, image reconstruction required significant assumptions in order to produce real-time imagery within an unmanned underwater vehicle's (UUV's) size, weight, and power (SWaP) constraints. However, these assumptions result in reduced image quality. In this work, we describe ASASIN, the Advanced Synthetic Aperture Sonar Imagining eNgine. ASASIN is a time domain backprojection image reconstruction suite utilizing graphics processing units (GPUs) allowing real-time operation on UUVs without sacrificing image quality. We describe several speedups employed in ASASIN allowing us to achieve this objective. Furthermore, ASASIN's signal processing chain is capable of producing 2D and 3D SAS imagery as we will demonstrate. Finally, we measure ASASIN's performance on a variety of GPUs and create a model capable of predicting performance. We demonstrate our model's usefulness in predicting run-time performance on desktop and embedded GPU hardware.
Submitted 25 May, 2021; v1 submitted 14 January, 2021;
originally announced January 2021.
-
Advancing Eosinophilic Esophagitis Diagnosis and Phenotype Assessment with Deep Learning Computer Vision
Authors:
William Adorno III,
Alexis Catalano,
Lubaina Ehsan,
Hans Vitzhum von Eckstaedt,
Barrett Barnes,
Emily McGowan,
Sana Syed,
Donald E. Brown
Abstract:
Eosinophilic Esophagitis (EoE) is an inflammatory esophageal disease which is increasing in prevalence. The diagnostic gold-standard involves manual review of a patient's biopsy tissue sample by a clinical pathologist for the presence of 15 or greater eosinophils within a single high-power field (400x magnification). Diagnosing EoE can be a cumbersome process with added difficulty for assessing the severity and progression of disease. We propose an automated approach for quantifying eosinophils using deep image segmentation. A U-Net model and post-processing system are applied to generate eosinophil-based statistics that can diagnose EoE as well as describe disease severity and progression. These statistics are captured in biopsies at the initial EoE diagnosis and are then compared with patient metadata: clinical and treatment phenotypes. The goal is to find linkages that could potentially guide treatment plans for new patients at their initial disease diagnosis. A deep image classification model is further applied to discover features other than eosinophils that can be used to diagnose EoE. This is the first study to utilize a deep learning computer vision approach for EoE diagnosis and to provide an automated process for tracking disease severity and progression.
Submitted 13 January, 2021;
originally announced January 2021.
-
Hand-drawn Symbol Recognition of Surgical Flowsheet Graphs with Deep Image Segmentation
Authors:
William Adorno III,
Angela Yi,
Marcel Durieux,
Donald Brown
Abstract:
Perioperative data are essential to investigating the causes of adverse surgical outcomes. In some low- to middle-income countries, these data are computationally inaccessible due to a lack of digitization of surgical flowsheets. In this paper, we present a deep image segmentation approach using a U-Net architecture that can detect hand-drawn symbols on a flowsheet graph. The segmentation mask outputs are post-processed with techniques unique to each symbol to convert them into numeric values. The U-Net method can detect, at the appropriate time intervals, the symbols for heart rate and blood pressure with over 99 percent accuracy. Over 95 percent of the predictions fall within an absolute error of five when compared to the actual value. The deep learning model outperformed template matching even with a small set of annotated images available for training.
Submitted 30 June, 2020;
originally announced June 2020.
-
HMIC: Hierarchical Medical Image Classification, A Deep Learning Approach
Authors:
Kamran Kowsari,
Rasoul Sali,
Lubaina Ehsan,
William Adorno,
Asad Ali,
Sean Moore,
Beatrice Amadi,
Paul Kelly,
Sana Syed,
Donald Brown
Abstract:
Image classification is central to the big data revolution in medicine. Improved information processing methods for diagnosis and classification of digital medical images have been shown to be successful via deep learning approaches. As this field is explored, there are limitations to the performance of traditional supervised classifiers. This paper outlines an approach that is different from the current medical image classification tasks that view the issue as multi-class classification. We performed a hierarchical classification using our Hierarchical Medical Image classification (HMIC) approach. HMIC uses stacks of deep learning models to give particular comprehension at each level of the clinical picture hierarchy. For testing our performance, we use small bowel biopsy images that contain three categories in the parent level (Celiac Disease, Environmental Enteropathy, and histologically normal controls). For the child level, Celiac Disease Severity is classified into 4 classes (I, IIIa, IIIb, and IIIc).
Submitted 23 June, 2020; v1 submitted 12 June, 2020;
originally announced June 2020.
-
Framing Effects on Strategic Information Design under Receiver Distrust and Unknown State
Authors:
Doris E. M. Brown,
Venkata Sriram Siddhardh Nadendla
Abstract:
Strategic information design is a framework where a sender designs information strategically to steer its receiver's decision towards a desired choice. Traditionally, such frameworks have always assumed that the sender and the receiver comprehend the state of the choice environment, and that the receiver always trusts the sender's signal. This paper deviates from these assumptions and re-investigates strategic information design in the presence of a distrustful receiver and when both sender and receiver cannot observe/comprehend the environment state space. Specifically, we assume that both sender and receiver have access to non-identical beliefs about choice rewards (with the sender's belief being more accurate), but not the environment state that determines these rewards. Furthermore, given that the receiver does not trust the sender, we also assume that the receiver updates its prior in a non-Bayesian manner. We evaluate the Stackelberg equilibrium and investigate the effects of information framing (i.e., sending the complete signal, or just the expected value of the signal) on the equilibrium. Furthermore, we also investigate trust dynamics at the receiver, under the assumption that the receiver minimizes regret in hindsight. Simulation results are presented to illustrate signaling effects and trust dynamics in strategic information design.
Submitted 21 July, 2021; v1 submitted 11 May, 2020;
originally announced May 2020.
-
Hierarchical Deep Convolutional Neural Networks for Multi-category Diagnosis of Gastrointestinal Disorders on Histopathological Images
Authors:
Rasoul Sali,
Sodiq Adewole,
Lubaina Ehsan,
Lee A. Denson,
Paul Kelly,
Beatrice C. Amadi,
Lori Holtz,
Syed Asad Ali,
Sean R. Moore,
Sana Syed,
Donald E. Brown
Abstract:
Deep convolutional neural networks (CNNs) have been successful for a wide range of computer vision tasks, including image classification. A specific area of application lies in digital pathology for pattern recognition in the tissue-based diagnosis of gastrointestinal (GI) diseases. This domain can utilize CNNs to translate histopathological images into precise diagnostics. This is challenging since these complex biopsies are heterogeneous and require multiple levels of assessment. This is mainly due to structural similarities in different parts of the GI tract and shared features among different gut diseases. Addressing this problem with a flat model that assumes all classes (parts of the gut and their diseases) are equally difficult to distinguish leads to an inadequate assessment of each class. Since the hierarchical model restricts classification error to each sub-class, it leads to a more informative model than a flat model. In this paper, we propose to apply hierarchical classification to biopsy images from different parts of the GI tract and the respective diseases within each. We embedded a class hierarchy into the plain VGGNet to take advantage of its layers' hierarchical structure. The proposed model was evaluated using an independent set of image patches from 373 whole slide images. The results indicate that the hierarchical model can achieve better results than the flat model for multi-category diagnosis of GI disorders using histopathological images.
Submitted 6 August, 2020; v1 submitted 8 May, 2020;
originally announced May 2020.
-
CeliacNet: Celiac Disease Severity Diagnosis on Duodenal Histopathological Images Using Deep Residual Networks
Authors:
Rasoul Sali,
Lubaina Ehsan,
Kamran Kowsari,
Marium Khan,
Christopher A. Moskaluk,
Sana Syed,
Donald E. Brown
Abstract:
Celiac Disease (CD) is a chronic autoimmune disease that affects the small intestine in genetically predisposed children and adults. Gluten exposure triggers an inflammatory cascade which leads to compromised intestinal barrier function. If unrecognized, this enteropathy can lead to anemia, decreased bone density, and, in longstanding cases, intestinal cancer. The prevalence of the disorder is 1% in the United States. An intestinal (duodenal) biopsy is considered the "gold standard" for diagnosis. Mild CD might go unnoticed due to non-specific clinical symptoms or mild histologic features. In our current work, we trained a model based on deep residual networks to diagnose CD severity using a histological scoring system called the modified Marsh score. The proposed model was evaluated using an independent set of 120 whole slide images from 15 CD patients and achieved an AUC greater than 0.96 in all classes. These results demonstrate the diagnostic power of the proposed model for CD severity classification using histological images.
Submitted 7 October, 2019;
originally announced October 2019.
-
Coupling Rendering and Generative Adversarial Networks for Artificial SAS Image Generation
Authors:
Albert Reed,
Isaac Gerg,
John McKay,
Daniel Brown,
David Williams,
Suren Jayasuriya
Abstract:
Acquisition of Synthetic Aperture Sonar (SAS) datasets is bottlenecked by the costly deployment of SAS imaging systems, and even when data acquisition is possible, the data is often skewed towards containing barren seafloor rather than objects of interest. We present a novel pipeline, called SAS GAN, which couples an optical renderer with a generative adversarial network (GAN) to synthesize realistic SAS images of targets on the seafloor. This coupling enables high levels of SAS image realism while enabling control over image geometry and parameters. We demonstrate qualitative results by presenting examples of images created with our pipeline. We also present quantitative results through the use of t-SNE and the Fréchet Inception Distance to argue that our generated SAS imagery potentially augments SAS datasets more effectively than an off-the-shelf GAN.
Submitted 2 October, 2019; v1 submitted 13 September, 2019;
originally announced September 2019.
-
Self-Attentive Adversarial Stain Normalization
Authors:
Aman Shrivastava,
Will Adorno,
Yash Sharma,
Lubaina Ehsan,
S. Asad Ali,
Sean R. Moore,
Beatrice C. Amadi,
Paul Kelly,
Sana Syed,
Donald E. Brown
Abstract:
Hematoxylin and Eosin (H&E) stained Whole Slide Images (WSIs) are utilized for biopsy visualization-based diagnostic and prognostic assessment of diseases. Variation in the H&E staining process across different lab sites can lead to significant variations in biopsy image appearance. These variations introduce an undesirable bias when the slides are examined by pathologists or used for training deep learning models. To reduce this bias, slides need to be translated to a common domain of stain appearance before analysis. We propose a Self-Attentive Adversarial Stain Normalization (SAASN) approach for the normalization of multiple stain appearances to a common domain. This unsupervised generative adversarial approach includes a self-attention mechanism for synthesizing images with finer detail while preserving the structural consistency of the biopsy features during translation. SAASN demonstrates consistent and superior performance compared to other popular stain normalization techniques on H&E stained duodenal biopsy image data.
Submitted 22 November, 2020; v1 submitted 4 September, 2019;
originally announced September 2019.
-
Deep Learning for Visual Recognition of Environmental Enteropathy and Celiac Disease
Authors:
Aman Shrivastava,
Karan Kant,
Saurav Sengupta,
Sung-Jun Kang,
Marium Khan,
Asad Ali,
Sean R. Moore,
Beatrice C. Amadi,
Paul Kelly,
Donald E. Brown,
Sana Syed
Abstract:
Physicians use biopsies to distinguish between different but histologically similar enteropathies. The range of syndromes and pathologies that could cause different gastrointestinal conditions makes this a difficult problem. Recently, deep learning has been used successfully in helping diagnose cancerous tissues in histopathological images. These successes motivated the research presented in this paper, which describes a deep learning approach that distinguishes among Celiac Disease (CD), Environmental Enteropathy (EE), and normal tissue from digitized duodenal biopsies. Experimental results show accuracies of over 90% for this approach. We also look into interpreting the neural network model using Gradient-weighted Class Activation Mappings and filter activations on input images to understand the visual explanations for the decisions made by the model.
Submitted 8 August, 2019;
originally announced August 2019.
-
Diagnosis of Celiac Disease and Environmental Enteropathy on Biopsy Images Using Color Balancing on Convolutional Neural Networks
Authors:
Kamran Kowsari,
Rasoul Sali,
Marium N. Khan,
William Adorno,
S. Asad Ali,
Sean R. Moore,
Beatrice C. Amadi,
Paul Kelly,
Sana Syed,
Donald E. Brown
Abstract:
Celiac Disease (CD) and Environmental Enteropathy (EE) are common causes of malnutrition and adversely impact normal childhood development. CD is an autoimmune disorder that is prevalent worldwide and is caused by an increased sensitivity to gluten. Gluten exposure damages the small intestinal epithelial barrier, resulting in nutrient malabsorption and childhood under-nutrition. EE also results in barrier dysfunction but is thought to be caused by an increased vulnerability to infections. EE has been implicated as the predominant cause of under-nutrition, oral vaccine failure, and impaired cognitive development in low- and middle-income countries. Both conditions require a tissue biopsy for diagnosis, and a major challenge of interpreting clinical biopsy images to differentiate between these gastrointestinal diseases is the striking histopathologic overlap between them. In the current study, we propose a convolutional neural network (CNN) to classify duodenal biopsy images from subjects with CD, EE, and healthy controls. We evaluated the performance of our proposed model using a large cohort containing 1000 biopsy images. Our evaluations show that the proposed model achieves an area under ROC of 0.99, 1.00, and 0.97 for CD, EE, and healthy controls, respectively. These results demonstrate the discriminative power of the proposed model in duodenal biopsy classification.
Submitted 9 October, 2019; v1 submitted 10 April, 2019;
originally announced April 2019.
-
Simulation and Testing Results for a Sub-Bottom Imaging Sonar
Authors:
Daniel C. Brown,
Shawn F. Johnson,
Cale F. Brownstead
Abstract:
The problem of detecting buried unexploded ordnance (UXO) is addressed with a sensor deployed from a shallow-draft surface vessel. This sonar system produces three-dimensional synthetic aperture sonar (SAS) imagery of both surficial and buried UXO across a range of environments. The sensor's hardware design was based in part upon data created using a hybrid modeling approach that combined results from separate environmental scattering and target scattering models. This hybrid model produced synthetic sensor data where the sensor/environment/target space could be modified to explore the expected operating conditions. The simulated data were also used to adapt a set of existing signal processing algorithms for formation of three-dimensional acoustic imagery.
Recently, the sonar system has been integrated onto a test platform, and experiments have been conducted at a trial site in the Foster Joseph Sayers Reservoir near Howard, PA. This test site has been prepared with several buried man-made objects. Initial results show that fully buried targets can be detected.
Submitted 4 October, 2018; v1 submitted 22 September, 2018;
originally announced September 2018.