-
Reducing Down(stream)time: Pretraining Molecular GNNs using Heterogeneous AI Accelerators
Authors:
Jenna A. Bilbrey,
Kristina M. Herman,
Henry Sprueill,
Sotiris S. Xantheas,
Payel Das,
Manuel Lopez Roldan,
Mike Kraus,
Hatem Helal,
Sutanay Choudhury
Abstract:
The demonstrated success of transfer learning has popularized approaches that involve pretraining models from massive data sources and subsequent finetuning towards a specific task. While such approaches have become the norm in fields such as natural language processing, implementation and evaluation of transfer learning approaches for chemistry are in the early stages. In this work, we demonstrate finetuning for downstream tasks on a graph neural network (GNN) trained over a molecular database containing 2.7 million water clusters. The use of Graphcore IPUs as an AI accelerator for training molecular GNNs reduces training time from a reported 2.7 days on 0.5M clusters to 1.2 hours on 2.7M clusters. Finetuning the pretrained model for downstream tasks of molecular dynamics and transfer to a different potential energy surface took only 8.3 hours and 28 minutes, respectively, on a single GPU.
Submitted 8 November, 2022;
originally announced November 2022.
-
On the Fourier transform of a quantitative trait: Implications for compressive sensing
Authors:
Stephen Doro,
Matthew A. Herman
Abstract:
This paper explores the genotype-phenotype relationship. It outlines conditions under which the dependence of a quantitative trait on the genome might be predictable, based on measurement of a limited subset of genotypes. It uses the theory of real-valued Boolean functions in a systematic way to translate trait data into the Fourier domain. Important trait features, such as the roughness of the trait landscape or the modularity of a trait, have a simple Fourier interpretation. Ruggedness at a gene location corresponds to high sensitivity to mutation, while a modular organization of gene activity reduces such sensitivity.
Traits where rugged loci are rare will naturally compress gene data in the Fourier domain, leading to a sparse representation of trait data, concentrated in identifiable, low-level coefficients. This Fourier representation of a trait organizes epistasis in a form which is isometric to the trait data. As Fourier matrices are known to be maximally incoherent with the standard basis, this permits employing compressive sensing techniques to work from data sets that are relatively small -- sometimes even of polynomial size -- compared to the exponentially large sets of possible genomes.
This theory provides a theoretical underpinning for systematic use of Boolean function machinery to dissect the dependency of a trait on the genome and environment.
Submitted 14 March, 2022; v1 submitted 4 January, 2021;
originally announced January 2021.
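The Fourier-domain view of a trait described in the abstract above can be illustrated with a small sketch. It assumes genotypes are binary strings and uses the Walsh-Hadamard transform with a 1/n normalization as the Boolean Fourier transform. The four loci, the effect sizes in `EFFECTS`, and the purely additive trait are hypothetical choices for illustration, not data or code from the paper. For such a trait with no epistasis, the only nonzero coefficients sit at levels 0 and 1, which is the kind of low-level sparsity the abstract refers to.

```python
from itertools import product


def walsh_hadamard(values):
    """Fast Walsh-Hadamard transform, normalized by 1/len(values)."""
    coeffs = list(values)
    n = len(coeffs)
    if n == 0 or n & (n - 1) != 0:
        raise ValueError("length must be a power of two")
    h = 1
    while h < n:
        # Butterfly step: combine pairs of entries that differ in one bit.
        for start in range(0, n, 2 * h):
            for i in range(start, start + h):
                a, b = coeffs[i], coeffs[i + h]
                coeffs[i], coeffs[i + h] = a + b, a - b
        h *= 2
    return [c / n for c in coeffs]


def level(index):
    """Level of a coefficient: the number of loci it involves."""
    return bin(index).count("1")


NUM_LOCI = 4
EFFECTS = [1.0, 0.5, -0.25, 2.0]  # hypothetical per-locus effect sizes


def additive_trait(genotype):
    """Hypothetical trait with no epistasis: a sum of per-locus effects."""
    return sum(e * g for e, g in zip(EFFECTS, genotype))


# Enumerate all 2^NUM_LOCI genotypes and transform the trait values.
genotypes = list(product([0, 1], repeat=NUM_LOCI))
trait = [additive_trait(g) for g in genotypes]
spectrum = walsh_hadamard(trait)

# Levels at which the spectrum is nonzero.
nonzero_levels = sorted(
    {level(i) for i, c in enumerate(spectrum) if abs(c) > 1e-12}
)
print(nonzero_levels)
```

Adding an interaction term between two loci to `additive_trait` would introduce a nonzero level-2 coefficient, which is how epistasis shows up in this representation.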
-
Pedestrian Behavior Prediction for Automated Driving: Requirements, Metrics, and Relevant Features
Authors:
Michael Herman,
Jörg Wagner,
Vishnu Prabhakaran,
Nicolas Möser,
Hanna Ziesche,
Waleed Ahmed,
Lutz Bürkle,
Ernst Kloppenburg,
Claudius Gläser
Abstract:
Automated vehicles require a comprehensive understanding of traffic situations to ensure safe and anticipatory driving. In this context, the prediction of pedestrians is particularly challenging as pedestrian behavior can be influenced by multiple factors. In this paper, we thoroughly analyze the requirements on pedestrian behavior prediction for automated driving via a system-level approach. To this end we investigate real-world pedestrian-vehicle interactions with human drivers. Based on human driving behavior we then derive appropriate reaction patterns of an automated vehicle and determine requirements for the prediction of pedestrians. This includes a novel metric tailored to measure prediction performance from a system-level perspective. The proposed metric is evaluated on a large-scale dataset comprising thousands of real-world pedestrian-vehicle interactions. We furthermore conduct an ablation study to evaluate the importance of different contextual cues and compare these results to ones obtained using established performance metrics for pedestrian prediction. Our results highlight the importance of a system-level approach to pedestrian behavior prediction.
Submitted 16 October, 2021; v1 submitted 15 December, 2020;
originally announced December 2020.
-
3DPIFCM Novel Algorithm for Segmentation of Noisy Brain MRI Images
Authors:
Arie Agranonik,
Maya Herman,
Mark Last
Abstract:
We present a novel algorithm named 3DPIFCM, for automatic segmentation of noisy MRI Brain images. The algorithm is an extension of a well-known IFCM (Improved Fuzzy C-Means) algorithm. It performs fuzzy segmentation and introduces a fitness function that is affected by proximity of the voxels and by the color intensity in 3D images. The 3DPIFCM algorithm uses PSO (Particle Swarm Optimization) in order to optimize the fitness function. In addition, the 3DPIFCM uses 3D features of near voxels to better adjust the noisy artifacts. In our experiments, we evaluate 3DPIFCM on T1 Brainweb dataset with noise levels ranging from 1% to 20% and on a synthetic dataset with ground truth both in 3D. The analysis of the segmentation results shows a significant improvement in the segmentation quality of up to 28% compared to two generic variants in noisy images and up to 60% when compared to the original FCM (Fuzzy C-Means).
Submitted 10 February, 2020; v1 submitted 5 February, 2020;
originally announced February 2020.
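For orientation, the baseline that the abstract above compares against, plain FCM (Fuzzy C-Means), can be sketched in a few lines. This is the textbook algorithm on one-dimensional intensities, not 3DPIFCM: it has no voxel-proximity term and no PSO step. The function names, the fuzzifier `m=2.0`, the iteration count, and the toy intensities are assumptions made for illustration.

```python
def fcm_memberships(values, centers, m=2.0):
    """Fuzzy membership of every value in every cluster (rows sum to 1)."""
    memberships = []
    for x in values:
        dists = [abs(x - c) for c in centers]
        if 0.0 in dists:
            # The value coincides with a center: give it full membership there.
            row = [1.0 if d == 0.0 else 0.0 for d in dists]
            total = sum(row)
            row = [r / total for r in row]
        else:
            inv = [d ** (-2.0 / (m - 1.0)) for d in dists]
            total = sum(inv)
            row = [v / total for v in inv]
        memberships.append(row)
    return memberships


def fcm(values, initial_centers, m=2.0, iterations=50):
    """Plain Fuzzy C-Means on scalar intensities."""
    centers = list(initial_centers)
    for _ in range(iterations):
        u = fcm_memberships(values, centers, m)
        # Each center becomes the membership-weighted mean of the values.
        centers = [
            sum((row[k] ** m) * x for row, x in zip(u, values))
            / sum(row[k] ** m for row in u)
            for k in range(len(centers))
        ]
    return centers, fcm_memberships(values, centers, m)


# Hypothetical intensities drawn from two tissue classes.
intensities = [0.0, 0.1, 0.2, 0.9, 1.0, 1.1]
centers, memberships = fcm(intensities, initial_centers=[0.0, 1.0])
print(centers)
```

Because each value keeps a graded membership in every cluster, noisy values near a class boundary are not forced into a hard label, which is the property the IFCM family builds on.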
-
Parallel 3DPIFCM Algorithm for Noisy Brain MRI Images
Authors:
Arie Agranonik,
Maya Herman,
Mark Last
Abstract:
In this paper we implemented the algorithm we developed in [1] called 3DPIFCM in a parallel environment by using CUDA on a GPU. In our previous work we introduced 3DPIFCM which performs segmentation of images in noisy conditions and uses particle swarm optimization for finding the optimal algorithm parameters to account for noise. This algorithm achieved state of the art segmentation accuracy when compared to FCM (Fuzzy C-Means), IFCMPSO (Improved Fuzzy C-Means with Particle Swarm Optimization), GAIFCM (Genetic Algorithm Improved Fuzzy C-Means) on noisy MRI images of an adult Brain.
When using a genetic algorithm or PSO (Particle Swarm Optimization) on a single machine for optimization we witnessed long execution times for practical clinical usage. Therefore, in the current paper our goal was to speed up the execution of 3DPIFCM by taking out parts of the algorithm and executing them as kernels on a GPU. The algorithm was implemented using the CUDA [13] framework from NVIDIA and experiments were performed on a server containing 64GB RAM, 8 cores and a TITAN X GPU with 3072 SP cores and 12GB of GPU memory.
Our results show that the parallel version of the algorithm performs up to 27x faster than the original sequential version and 68x faster than GAIFCM algorithm. We show that the speedup of the parallel version increases as we increase the size of the image due to better utilization of cores in the GPU. Also, we show a speedup of up to 5x in our Brainweb experiment compared to other generic variants such as IFCMPSO and GAIFCM.
Submitted 5 February, 2020;
originally announced February 2020.
-
Wasserstein Adversarial Imitation Learning
Authors:
Huang Xiao,
Michael Herman,
Joerg Wagner,
Sebastian Ziesche,
Jalal Etesami,
Thai Hong Linh
Abstract:
Imitation Learning describes the problem of recovering an expert policy from demonstrations. While inverse reinforcement learning approaches are known to be very sample-efficient in terms of expert demonstrations, they usually require problem-dependent reward functions or a (task-)specific reward-function regularization. In this paper, we show a natural connection between inverse reinforcement learning approaches and Optimal Transport, that enables more general reward functions with desirable properties (e.g., smoothness). Based on our observation, we propose a novel approach called Wasserstein Adversarial Imitation Learning. Our approach considers the Kantorovich potentials as a reward function and further leverages regularized optimal transport to enable large-scale applications. In several robotic experiments, our approach outperforms the baselines in terms of average cumulative rewards and shows a significant improvement in sample-efficiency, by requiring just one expert demonstration.
Submitted 19 June, 2019;
originally announced June 2019.
-
Human Motion Trajectory Prediction: A Survey
Authors:
Andrey Rudenko,
Luigi Palmieri,
Michael Herman,
Kris M. Kitani,
Dariu M. Gavrila,
Kai O. Arras
Abstract:
With growing numbers of intelligent autonomous systems in human environments, the ability of such systems to perceive, understand and anticipate human behavior becomes increasingly important. Specifically, predicting future positions of dynamic agents and planning considering such predictions are key tasks for self-driving vehicles, service robots and advanced surveillance systems. This paper provides a survey of human motion trajectory prediction. We review, analyze and structure a large selection of work from different communities and propose a taxonomy that categorizes existing methods based on the motion modeling approach and level of contextual information used. We provide an overview of the existing datasets and performance metrics. We discuss limitations of the state of the art and outline directions for further research.
Submitted 17 December, 2019; v1 submitted 15 May, 2019;
originally announced May 2019.
-
Functionally Modular and Interpretable Temporal Filtering for Robust Segmentation
Authors:
Jörg Wagner,
Volker Fischer,
Michael Herman,
Sven Behnke
Abstract:
The performance of autonomous systems heavily relies on their ability to generate a robust representation of the environment. Deep neural networks have greatly improved vision-based perception systems but still fail in challenging situations, e.g. sensor outages or heavy weather. These failures are often introduced by data-inherent perturbations, which significantly reduce the information provided to the perception system. We propose a functionally modularized temporal filter, which stabilizes an abstract feature representation of a single-frame segmentation model using information of previous time steps. Our filter module splits the filter task into multiple less complex and more interpretable subtasks. The basic structure of the filter is inspired by a Bayes estimator consisting of a prediction and an update step. To make the prediction more transparent, we implement it using a geometric projection and estimate its parameters. This additionally enables the decomposition of the filter task into static representation filtering and low-dimensional motion filtering. Our model can cope with missing frames and is trainable in an end-to-end fashion. Using photorealistic, synthetic video data, we show the ability of the proposed architecture to overcome data-inherent perturbations. The experiments especially highlight advantages introduced by an interpretable and explicit filter module.
Submitted 15 October, 2018; v1 submitted 9 October, 2018;
originally announced October 2018.
-
Hierarchical Recurrent Filtering for Fully Convolutional DenseNets
Authors:
Jörg Wagner,
Volker Fischer,
Michael Herman,
Sven Behnke
Abstract:
Generating a robust representation of the environment is a crucial ability of learning agents. Deep learning based methods have greatly improved perception systems but still fail in challenging situations. These failures are often not solvable on the basis of a single image. In this work, we present a parameter-efficient temporal filtering concept which extends an existing single-frame segmentation model to work with multiple frames. The resulting recurrent architecture temporally filters representations on all abstraction levels in a hierarchical manner, while decoupling temporal dependencies from scene representation. Using a synthetic dataset, we show the ability of our model to cope with data perturbations and highlight the importance of recurrent and hierarchical filtering.
Submitted 15 October, 2018; v1 submitted 5 October, 2018;
originally announced October 2018.
-
Inverse Reinforcement Learning with Simultaneous Estimation of Rewards and Dynamics
Authors:
Michael Herman,
Tobias Gindele,
Jörg Wagner,
Felix Schmitt,
Wolfram Burgard
Abstract:
Inverse Reinforcement Learning (IRL) describes the problem of learning an unknown reward function of a Markov Decision Process (MDP) from observed behavior of an agent. Since the agent's behavior originates in its policy and MDP policies depend on both the stochastic system dynamics as well as the reward function, the solution of the inverse problem is significantly influenced by both. Current IRL approaches assume that if the transition model is unknown, additional samples from the system's dynamics are accessible, or the observed behavior provides enough samples of the system's dynamics to solve the inverse problem accurately. These assumptions are often not satisfied. To overcome this, we present a gradient-based IRL approach that simultaneously estimates the system's dynamics. By solving the combined optimization problem, our approach takes into account the bias of the demonstrations, which stems from the generating policy. The evaluation on a synthetic MDP and a transfer learning task shows improvements regarding the sample efficiency as well as the accuracy of the estimated reward functions and transition models.
Submitted 13 April, 2016;
originally announced April 2016.
-
Managing Null Entries in Pairwise Comparisons
Authors:
W. W. Koczkodaj,
M. W. Herman,
M. Orlowski
Abstract:
This paper shows how to manage null entries in pairwise comparisons matrices. Although assessments can be imprecise, since subjective criteria are involved, the classical pairwise comparisons theory expects all of them to be available. In practice, some experts may not be able (or available) to provide all assessments. Therefore managing null entries is a necessary extension of the pairwise comparisons method. It is shown that certain null entries can be recovered on the basis of the transitivity property which each pairwise comparisons matrix is expected to satisfy.
Submitted 20 May, 2015;
originally announced May 2015.
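The transitivity-based recovery mentioned in the abstract above can be sketched as follows. In a consistent pairwise comparisons matrix, a[i][k] = a[i][j] * a[j][k], so a null entry can be estimated from any intermediate j for which both factors are known. The function name `fill_null_entries`, the use of `None` for null entries, and the geometric mean used to combine several candidate values are assumptions of this sketch, not necessarily the procedure in the paper.

```python
import math


def fill_null_entries(matrix):
    """Estimate null (None) entries from a[i][k] = a[i][j] * a[j][k]."""
    n = len(matrix)
    filled = [row[:] for row in matrix]
    for i in range(n):
        for k in range(n):
            if matrix[i][k] is not None:
                continue
            # Every intermediate j with both factors known gives a candidate.
            candidates = [
                matrix[i][j] * matrix[j][k]
                for j in range(n)
                if j not in (i, k)
                and matrix[i][j] is not None
                and matrix[j][k] is not None
            ]
            if candidates:
                # Assumption: combine candidates by their geometric mean.
                log_mean = sum(math.log(c) for c in candidates) / len(candidates)
                filled[i][k] = math.exp(log_mean)
    return filled


# Hypothetical 3x3 matrix: the expert did not compare items 0 and 2.
pairwise = [
    [1.0, 2.0, None],
    [0.5, 1.0, 3.0],
    [None, 1.0 / 3.0, 1.0],
]
filled = fill_null_entries(pairwise)
print(filled[0][2], filled[2][0])
```

An entry stays null when no intermediate item links the two alternatives, which matches the abstract's statement that only certain null entries can be recovered.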
-
A Monte Carlo Study of Pairwise Comparisons
Authors:
M. W. Herman,
W. W. Koczkodaj
Abstract:
Consistent approximations obtained by geometric means ($GM$) and the principal eigenvector ($EV$) turned out to be close enough for 1,000,000 not-so-inconsistent pairwise comparisons matrices. In this respect both methods are accurate enough for most practical applications. As the enclosed Table 1 demonstrates, the biggest difference between average deviations of $GM$ and $EV$ solutions is 0.00019 for the Euclidean metric and 0.00355 for the Tchebychev metric.
For practical applications, this precision is far better than expected. After all, we are talking, in most cases, about relative subjective comparisons, and one tenth of a percent is usually below our threshold of perception.
Submitted 7 May, 2015;
originally announced May 2015.
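The two weighting methods compared in the abstract above can be sketched for a single matrix. The sketch computes normalized weights from row geometric means ($GM$) and from the principal eigenvector ($EV$, obtained here by power iteration), then measures the Euclidean and Tchebychev distances between the two weight vectors. The 4x4 example, the 20% perturbation, and the comparison of weight vectors rather than of reconstructed matrices are simplifying assumptions and do not reproduce the paper's Monte Carlo setup.

```python
import math


def geometric_mean_weights(matrix):
    """Weights from normalized row geometric means (GM)."""
    gms = [math.prod(row) ** (1.0 / len(row)) for row in matrix]
    total = sum(gms)
    return [g / total for g in gms]


def eigenvector_weights(matrix, iterations=200):
    """Weights from the principal eigenvector (EV) via power iteration."""
    n = len(matrix)
    w = [1.0 / n] * n
    for _ in range(iterations):
        nxt = [sum(matrix[i][j] * w[j] for j in range(n)) for i in range(n)]
        total = sum(nxt)
        w = [v / total for v in nxt]
    return w


def matrix_from_weights(weights):
    """Fully consistent pairwise comparisons matrix a[i][j] = w[i] / w[j]."""
    return [[wi / wj for wj in weights] for wi in weights]


TRUE_WEIGHTS = [0.4, 0.3, 0.2, 0.1]  # hypothetical ground truth
consistent = matrix_from_weights(TRUE_WEIGHTS)

# Introduce a mild inconsistency in one comparison and its reciprocal.
perturbed = [row[:] for row in consistent]
perturbed[0][3] *= 1.2
perturbed[3][0] = 1.0 / perturbed[0][3]

gm = geometric_mean_weights(perturbed)
ev = eigenvector_weights(perturbed)
euclidean = math.sqrt(sum((a - b) ** 2 for a, b in zip(gm, ev)))
tchebychev = max(abs(a - b) for a, b in zip(gm, ev))
print(euclidean, tchebychev)
```

A 4x4 matrix is used because for 3x3 reciprocal matrices the two methods give identical weights, so no difference would be visible.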
-
General Deviants: An Analysis of Perturbations in Compressed Sensing
Authors:
Matthew A. Herman,
Thomas Strohmer
Abstract:
We analyze the Basis Pursuit recovery of signals with general perturbations. Previous studies have only considered partially perturbed observations Ax + e. Here, x is a signal which we wish to recover, A is a full-rank matrix with more columns than rows, and e is simple additive noise. Our model also incorporates perturbations E to the matrix A which result in multiplicative noise. This completely perturbed framework extends the prior work of Candes, Romberg and Tao on stable signal recovery from incomplete and inaccurate measurements. Our results show that, under suitable conditions, the stability of the recovered signal is limited by the noise level in the observation. Moreover, this accuracy is within a constant multiple of the best-case reconstruction using the technique of least squares. In the absence of additive noise numerical simulations essentially confirm that this error is a linear function of the relative perturbation.
△ Less
Submitted 16 July, 2009;
originally announced July 2009.
-
High-Resolution Radar via Compressed Sensing
Authors:
Matthew A. Herman,
Thomas Strohmer
Abstract:
A stylized compressed sensing radar is proposed in which the time-frequency plane is discretized into an N by N grid. Assuming the number of targets K is small (i.e., K much less than N^2), then we can transmit a sufficiently "incoherent" pulse and employ the techniques of compressed sensing to reconstruct the target scene. A theoretical upper bound on the sparsity K is presented. Numerical simulations verify that even better performance can be achieved in practice. This novel compressed sensing approach offers great potential for better resolution over classical radar.
△ Less
Submitted 22 December, 2008; v1 submitted 14 March, 2008;
originally announced March 2008.