-
FusionINN: Decomposable Image Fusion for Brain Tumor Monitoring
Authors:
Nishant Kumar,
Ziyan Tao,
Jaikirat Singh,
Yang Li,
Peiwen Sun,
Binghui Zhao,
Stefan Gumhold
Abstract:
Image fusion typically employs non-invertible neural networks to merge multiple source images into a single fused image. However, for clinical experts, solely relying on fused images may be insufficient for making diagnostic decisions, as the fusion mechanism blends features from source images, thereby making it difficult to interpret the underlying tumor pathology. We introduce FusionINN, a novel…
▽ More
Image fusion typically employs non-invertible neural networks to merge multiple source images into a single fused image. However, for clinical experts, solely relying on fused images may be insufficient for making diagnostic decisions, as the fusion mechanism blends features from source images, thereby making it difficult to interpret the underlying tumor pathology. We introduce FusionINN, a novel decomposable image fusion framework, capable of efficiently generating fused images and also decomposing them back to the source images. FusionINN is designed to be bijective by including a latent image alongside the fused image, while ensuring minimal transfer of information from the source images to the latent representation. To the best of our knowledge, we are the first to investigate the decomposability of fused images, which is particularly crucial for life-sensitive applications such as medical image fusion compared to other tasks like multi-focus or multi-exposure image fusion. Our extensive experimentation validates FusionINN over existing discriminative and generative fusion methods, both subjectively and objectively. Moreover, compared to a recent denoising diffusion-based fusion model, our approach offers faster and qualitatively better fusion results.
△ Less
Submitted 10 June, 2024; v1 submitted 23 March, 2024;
originally announced March 2024.
-
Quantile-based Maximum Likelihood Training for Outlier Detection
Authors:
Masoud Taghikhah,
Nishant Kumar,
Siniša Šegvić,
Abouzar Eslami,
Stefan Gumhold
Abstract:
Discriminative learning effectively predicts true object class for image classification. However, it often results in false positives for outliers, posing critical concerns in applications like autonomous driving and video surveillance systems. Previous attempts to address this challenge involved training image classifiers through contrastive learning using actual outlier data or synthesizing outl…
▽ More
Discriminative learning effectively predicts true object class for image classification. However, it often results in false positives for outliers, posing critical concerns in applications like autonomous driving and video surveillance systems. Previous attempts to address this challenge involved training image classifiers through contrastive learning using actual outlier data or synthesizing outliers for self-supervised learning. Furthermore, unsupervised generative modeling of inliers in pixel space has shown limited success for outlier detection. In this work, we introduce a quantile-based maximum likelihood objective for learning the inlier distribution to improve the outlier separation during inference. Our approach fits a normalizing flow to pre-trained discriminative features and detects the outliers according to the evaluated log-likelihood. The experimental evaluation demonstrates the effectiveness of our method as it surpasses the performance of the state-of-the-art unsupervised methods for outlier detection. The results are also competitive compared with a recent self-supervised approach for outlier detection. Our work allows to reduce dependency on well-sampled negative training data, which is especially important for domains like medical diagnostics or remote sensing.
△ Less
Submitted 2 June, 2024; v1 submitted 20 August, 2023;
originally announced October 2023.
-
Learning to reconstruct the bubble distribution with conductivity maps using Invertible Neural Networks and Error Diffusion
Authors:
Nishant Kumar,
Lukas Krause,
Thomas Wondrak,
Sven Eckert,
Kerstin Eckert,
Stefan Gumhold
Abstract:
Electrolysis is crucial for eco-friendly hydrogen production, but gas bubbles generated during the process hinder reactions, reduce cell efficiency, and increase energy consumption. Additionally, these gas bubbles cause changes in the conductivity inside the cell, resulting in corresponding variations in the induced magnetic field around the cell. Therefore, measuring these gas bubble-induced magn…
▽ More
Electrolysis is crucial for eco-friendly hydrogen production, but gas bubbles generated during the process hinder reactions, reduce cell efficiency, and increase energy consumption. Additionally, these gas bubbles cause changes in the conductivity inside the cell, resulting in corresponding variations in the induced magnetic field around the cell. Therefore, measuring these gas bubble-induced magnetic field fluctuations using external magnetic sensors and solving the inverse problem of Biot-Savart Law allows for estimating the conductivity in the cell and, thus, bubble size and location. However, determining high-resolution conductivity maps from only a few induced magnetic field measurements is an ill-posed inverse problem. To overcome this, we exploit Invertible Neural Networks (INNs) to reconstruct the conductivity field. Our qualitative results and quantitative evaluation using random error diffusion show that INN achieves far superior performance compared to Tikhonov regularization.
△ Less
Submitted 28 March, 2024; v1 submitted 4 July, 2023;
originally announced July 2023.
-
Towards Computational Performance Engineering for Unsupervised Concept Drift Detection -- Complexities, Benchmarking, Performance Analysis
Authors:
Elias Werner,
Nishant Kumar,
Matthias Lieber,
Sunna Torge,
Stefan Gumhold,
Wolfgang E. Nagel
Abstract:
Concept drift detection is crucial for many AI systems to ensure the system's reliability. These systems often have to deal with large amounts of data or react in real-time. Thus, drift detectors must meet computational requirements or constraints with a comprehensive performance evaluation. However, so far, the focus of developing drift detectors is on inference quality, e.g. accuracy, but not on…
▽ More
Concept drift detection is crucial for many AI systems to ensure the system's reliability. These systems often have to deal with large amounts of data or react in real-time. Thus, drift detectors must meet computational requirements or constraints with a comprehensive performance evaluation. However, so far, the focus of developing drift detectors is on inference quality, e.g. accuracy, but not on computational performance, such as runtime. Many of the previous works consider computational performance only as a secondary objective and do not have a benchmark for such evaluation. Hence, we propose and explain performance engineering for unsupervised concept drift detection that reflects on computational complexities, benchmarking, and performance analysis. We provide the computational complexities of existing unsupervised drift detectors and discuss why further computational performance investigations are required. Hence, we state and substantiate the aspects of a benchmark for unsupervised drift detection reflecting on inference quality and computational performance. Furthermore, we demonstrate performance analysis practices that have proven their effectiveness in High-Performance Computing, by tracing two drift detectors and displaying their performance data.
△ Less
Submitted 10 June, 2024; v1 submitted 17 April, 2023;
originally announced April 2023.
-
Normalizing Flow based Feature Synthesis for Outlier-Aware Object Detection
Authors:
Nishant Kumar,
Siniša Šegvić,
Abouzar Eslami,
Stefan Gumhold
Abstract:
Real-world deployment of reliable object detectors is crucial for applications such as autonomous driving. However, general-purpose object detectors like Faster R-CNN are prone to providing overconfident predictions for outlier objects. Recent outlier-aware object detection approaches estimate the density of instance-wide features with class-conditional Gaussians and train on synthesized outlier f…
▽ More
Real-world deployment of reliable object detectors is crucial for applications such as autonomous driving. However, general-purpose object detectors like Faster R-CNN are prone to providing overconfident predictions for outlier objects. Recent outlier-aware object detection approaches estimate the density of instance-wide features with class-conditional Gaussians and train on synthesized outlier features from their low-likelihood regions. However, this strategy does not guarantee that the synthesized outlier features will have a low likelihood according to the other class-conditional Gaussians. We propose a novel outlier-aware object detection framework that distinguishes outliers from inlier objects by learning the joint data distribution of all inlier classes with an invertible normalizing flow. The appropriate sampling of the flow model ensures that the synthesized outliers have a lower likelihood than inliers of all object classes, thereby modeling a better decision boundary between inlier and outlier objects. Our approach significantly outperforms the state-of-the-art for outlier-aware object detection on both image and video datasets. Code available at https://github.com/nish03/FFS
△ Less
Submitted 28 May, 2023; v1 submitted 1 February, 2023;
originally announced February 2023.
-
Challenging the Universal Representation of Deep Models for 3D Point Cloud Registration
Authors:
David Bojanić,
Kristijan Bartol,
Josep Forest,
Stefan Gumhold,
Tomislav Petković,
Tomislav Pribanić
Abstract:
Learning universal representations across different applications domain is an open research problem. In fact, finding universal architecture within the same application but across different types of datasets is still unsolved problem too, especially in applications involving processing 3D point clouds. In this work we experimentally test several state-of-the-art learning-based methods for 3D point…
▽ More
Learning universal representations across different applications domain is an open research problem. In fact, finding universal architecture within the same application but across different types of datasets is still unsolved problem too, especially in applications involving processing 3D point clouds. In this work we experimentally test several state-of-the-art learning-based methods for 3D point cloud registration against the proposed non-learning baseline registration method. The proposed method either outperforms or achieves comparable results w.r.t. learning based methods. In addition, we propose a dataset on which learning based methods have a hard time to generalize. Our proposed method and dataset, along with the provided experiments, can be used in further research in studying effective solutions for universal representations. Our source code is available at: github.com/DavidBoja/greedy-grid-search.
△ Less
Submitted 29 November, 2022;
originally announced November 2022.
-
Enhancing Fairness of Visual Attribute Predictors
Authors:
Tobias Hänel,
Nishant Kumar,
Dmitrij Schlesinger,
Mengze Li,
Erdem Ünal,
Abouzar Eslami,
Stefan Gumhold
Abstract:
The performance of deep neural networks for image recognition tasks such as predicting a smiling face is known to degrade with under-represented classes of sensitive attributes. We address this problem by introducing fairness-aware regularization losses based on batch estimates of Demographic Parity, Equalized Odds, and a novel Intersection-over-Union measure. The experiments performed on facial a…
▽ More
The performance of deep neural networks for image recognition tasks such as predicting a smiling face is known to degrade with under-represented classes of sensitive attributes. We address this problem by introducing fairness-aware regularization losses based on batch estimates of Demographic Parity, Equalized Odds, and a novel Intersection-over-Union measure. The experiments performed on facial and medical images from CelebA, UTKFace, and the SIIM-ISIC melanoma classification challenge show the effectiveness of our proposed fairness losses for bias mitigation as they improve model fairness while maintaining high classification performance. To the best of our knowledge, our work is the first attempt to incorporate these types of losses in an end-to-end training scheme for mitigating biases of visual attribute predictors. Our code is available at https://github.com/nish03/FVAP.
△ Less
Submitted 1 October, 2022; v1 submitted 7 July, 2022;
originally announced July 2022.
-
Parallel Compositing of Volumetric Depth Images for Interactive Visualization of Distributed Volumes at High Frame Rates
Authors:
Aryaman Gupta,
Pietro Incardona,
Anton Brock,
Guido Reina,
Steffen Frey,
Stefan Gumhold,
Ulrik Günther,
Ivo F. Sbalzarini
Abstract:
We present a parallel compositing algorithm for Volumetric Depth Images (VDIs) of large three-dimensional volume data. Large distributed volume data are routinely produced in both numerical simulations and experiments, yet it remains challenging to visualize them at smooth, interactive frame rates. VDIs are view-dependent piecewise constant representations of volume data that offer a potential sol…
▽ More
We present a parallel compositing algorithm for Volumetric Depth Images (VDIs) of large three-dimensional volume data. Large distributed volume data are routinely produced in both numerical simulations and experiments, yet it remains challenging to visualize them at smooth, interactive frame rates. VDIs are view-dependent piecewise constant representations of volume data that offer a potential solution. They are more compact and less expensive to render than the original data. So far, however, there is no method for generating VDIs from distributed data. We propose an algorithm that enables this by sort-last parallel generation and compositing of VDIs with automatically chosen content-adaptive parameters. The resulting composited VDI can then be streamed for remote display, providing responsive visualization of large, distributed volume data.
△ Less
Submitted 31 July, 2024; v1 submitted 29 June, 2022;
originally announced June 2022.
-
Efficient Raycasting of Volumetric Depth Images for Remote Visualization of Large Volumes at High Frame Rates
Authors:
Aryaman Gupta,
Ulrik Günther,
Pietro Incardona,
Guido Reina,
Steffen Frey,
Stefan Gumhold,
Ivo F. Sbalzarini
Abstract:
We present an efficient raycasting algorithm for rendering Volumetric Depth Images (VDIs), and we show how it can be used in a remote visualization setting with VDIs generated and streamed from a remote server. VDIs are compact view-dependent volume representations that enable interactive visualization of large volumes at high frame rates by decoupling viewpoint changes from expensive rendering ca…
▽ More
We present an efficient raycasting algorithm for rendering Volumetric Depth Images (VDIs), and we show how it can be used in a remote visualization setting with VDIs generated and streamed from a remote server. VDIs are compact view-dependent volume representations that enable interactive visualization of large volumes at high frame rates by decoupling viewpoint changes from expensive rendering calculations. However, current rendering approaches for VDIs struggle with achieving interactive frame rates at high image resolutions. Here, we exploit the properties of perspective projection to simplify intersections of rays with the view-dependent frustums in a VDI and leverage spatial smoothness in the volume data to minimize memory accesses. Benchmarks show that responsive frame rates can be achieved close to the viewpoint of generation for HD display resolutions, providing high-fidelity approximate renderings of Gigabyte-sized volumes. We also propose a method to subsample the VDI for preview rendering, maintaining high frame rates even for large viewpoint deviations. We provide our implementation as an extension of an established open-source visualization library.
△ Less
Submitted 27 July, 2023; v1 submitted 17 June, 2022;
originally announced June 2022.
-
Real-time Visualization of Stream-based Monitoring Data
Authors:
Jan Baumeister,
Bernd Finkbeiner,
Stefan Gumhold,
Malte Schledjewski
Abstract:
Stream-based runtime monitors are used in safety-critical applications such as Unmanned Aerial Systems (UAS) to compute comprehensive statistics and logical assessments of system health that provide the human operator with critical information in hand-over situations. In such applications, a visual display of the monitoring data can be much more helpful than the textual alerts provided by a more t…
▽ More
Stream-based runtime monitors are used in safety-critical applications such as Unmanned Aerial Systems (UAS) to compute comprehensive statistics and logical assessments of system health that provide the human operator with critical information in hand-over situations. In such applications, a visual display of the monitoring data can be much more helpful than the textual alerts provided by a more traditional user interface. This visualization requires extensive real-time data processing, which includes the synchronization of data from different streams, filtering and aggregation, and priorization and management of user attention. We present a visualization approach for the \rtlola monitoring framework. Our approach is based on the principle that the necessary data processing is the responsibility of the monitor itself, rather than the responsibility of some external visualization tool. We show how the various aspects of the data transformation can be described as RTLola stream equations and linked to the visualization component through a bidirectional synchronous interface. In our experience, this approach leads to highly informative visualizations as well as to understandable and easily maintainable monitoring code.
△ Less
Submitted 25 May, 2022;
originally announced May 2022.
-
InFlow: Robust outlier detection utilizing Normalizing Flows
Authors:
Nishant Kumar,
Pia Hanfeld,
Michael Hecht,
Michael Bussmann,
Stefan Gumhold,
Nico Hoffmann
Abstract:
Normalizing flows are prominent deep generative models that provide tractable probability distributions and efficient density estimation. However, they are well known to fail while detecting Out-of-Distribution (OOD) inputs as they directly encode the local features of the input representations in their latent space. In this paper, we solve this overconfidence issue of normalizing flows by demonst…
▽ More
Normalizing flows are prominent deep generative models that provide tractable probability distributions and efficient density estimation. However, they are well known to fail while detecting Out-of-Distribution (OOD) inputs as they directly encode the local features of the input representations in their latent space. In this paper, we solve this overconfidence issue of normalizing flows by demonstrating that flows, if extended by an attention mechanism, can reliably detect outliers including adversarial attacks. Our approach does not require outlier data for training and we showcase the efficiency of our method for OOD detection by reporting state-of-the-art performance in diverse experimental settings. Code available at https://github.com/ComputationalRadiationPhysics/InFlow .
△ Less
Submitted 16 November, 2021; v1 submitted 10 June, 2021;
originally announced June 2021.
-
FuseVis: Interpreting neural networks for image fusion using per-pixel saliency visualization
Authors:
Nishant Kumar,
Stefan Gumhold
Abstract:
Image fusion helps in merging two or more images to construct a more informative single fused image. Recently, unsupervised learning based convolutional neural networks (CNN) have been utilized for different types of image fusion tasks such as medical image fusion, infrared-visible image fusion for autonomous driving as well as multi-focus and multi-exposure image fusion for satellite imagery. How…
▽ More
Image fusion helps in merging two or more images to construct a more informative single fused image. Recently, unsupervised learning based convolutional neural networks (CNN) have been utilized for different types of image fusion tasks such as medical image fusion, infrared-visible image fusion for autonomous driving as well as multi-focus and multi-exposure image fusion for satellite imagery. However, it is challenging to analyze the reliability of these CNNs for the image fusion tasks since no groundtruth is available. This led to the use of a wide variety of model architectures and optimization functions yielding quite different fusion results. Additionally, due to the highly opaque nature of such neural networks, it is difficult to explain the internal mechanics behind its fusion results. To overcome these challenges, we present a novel real-time visualization tool, named FuseVis, with which the end-user can compute per-pixel saliency maps that examine the influence of the input image pixels on each pixel of the fused image. We trained several image fusion based CNNs on medical image pairs and then using our FuseVis tool, we performed case studies on a specific clinical application by interpreting the saliency maps from each of the fusion methods. We specifically visualized the relative influence of each input image on the predictions of the fused image and showed that some of the evaluated image fusion methods are better suited for the specific clinical application. To the best of our knowledge, currently, there is no approach for visual analysis of neural networks for image fusion. Therefore, this work opens up a new research direction to improve the interpretability of deep fusion networks. The FuseVis tool can also be adapted in other deep neural network based image processing applications to make them interpretable.
△ Less
Submitted 6 December, 2020;
originally announced December 2020.
-
Visualisation of Medical Image Fusion and Translation for Accurate Diagnosis of High Grade Gliomas
Authors:
Nishant Kumar,
Nico Hoffmann,
Matthias Kirsch,
Stefan Gumhold
Abstract:
The medical image fusion combines two or more modalities into a single view while medical image translation synthesizes new images and assists in data augmentation. Together, these methods help in faster diagnosis of high grade malignant gliomas. However, they might be untrustworthy due to which neurosurgeons demand a robust visualisation tool to verify the reliability of the fusion and translatio…
▽ More
The medical image fusion combines two or more modalities into a single view while medical image translation synthesizes new images and assists in data augmentation. Together, these methods help in faster diagnosis of high grade malignant gliomas. However, they might be untrustworthy due to which neurosurgeons demand a robust visualisation tool to verify the reliability of the fusion and translation results before they make pre-operative surgical decisions. In this paper, we propose a novel approach to compute a confidence heat map between the source-target image pair by estimating the information transfer from the source to the target image using the joint probability distribution of the two images. We evaluate several fusion and translation methods using our visualisation procedure and showcase its robustness in enabling neurosurgeons to make finer clinical decisions.
△ Less
Submitted 30 January, 2020; v1 submitted 26 January, 2020;
originally announced January 2020.
-
Reinforced Feature Points: Optimizing Feature Detection and Description for a High-Level Task
Authors:
Aritra Bhowmik,
Stefan Gumhold,
Carsten Rother,
Eric Brachmann
Abstract:
We address a core problem of computer vision: Detection and description of 2D feature points for image matching. For a long time, hand-crafted designs, like the seminal SIFT algorithm, were unsurpassed in accuracy and efficiency. Recently, learned feature detectors emerged that implement detection and description using neural networks. Training these networks usually resorts to optimizing low-leve…
▽ More
We address a core problem of computer vision: Detection and description of 2D feature points for image matching. For a long time, hand-crafted designs, like the seminal SIFT algorithm, were unsurpassed in accuracy and efficiency. Recently, learned feature detectors emerged that implement detection and description using neural networks. Training these networks usually resorts to optimizing low-level matching scores, often pre-defining sets of image patches which should or should not match, or which should or should not contain key points. Unfortunately, increased accuracy for these low-level matching scores does not necessarily translate to better performance in high-level vision tasks. We propose a new training methodology which embeds the feature detector in a complete vision pipeline, and where the learnable parameters are trained in an end-to-end fashion. We overcome the discrete nature of key point selection and descriptor matching using principles from reinforcement learning. As an example, we address the task of relative pose estimation between a pair of images. We demonstrate that the accuracy of a state-of-the-art learning-based feature detector can be increased when trained for the task it is supposed to solve at test time. Our training methodology poses little restrictions on the task to learn, and works for any architecture which predicts key point heat maps, and descriptors for key point locations.
△ Less
Submitted 20 March, 2020; v1 submitted 2 December, 2019;
originally announced December 2019.
-
Learning to Think Outside the Box: Wide-Baseline Light Field Depth Estimation with EPI-Shift
Authors:
Titus Leistner,
Hendrik Schilling,
Radek Mackowiak,
Stefan Gumhold,
Carsten Rother
Abstract:
We propose a method for depth estimation from light field data, based on a fully convolutional neural network architecture. Our goal is to design a pipeline which achieves highly accurate results for small- and wide-baseline light fields. Since light field training data is scarce, all learning-based approaches use a small receptive field and operate on small disparity ranges. In order to work with…
▽ More
We propose a method for depth estimation from light field data, based on a fully convolutional neural network architecture. Our goal is to design a pipeline which achieves highly accurate results for small- and wide-baseline light fields. Since light field training data is scarce, all learning-based approaches use a small receptive field and operate on small disparity ranges. In order to work with wide-baseline light fields, we introduce the idea of EPI-Shift: To virtually shift the light field stack which enables to retain a small receptive field, independent of the disparity range. In this way, our approach "learns to think outside the box of the receptive field". Our network performs joint classification of integer disparities and regression of disparity-offsets. A U-Net component provides excellent long-range smoothing. EPI-Shift considerably outperforms the state-of-the-art learning-based approaches and is on par with hand-crafted methods. We demonstrate this on a publicly available, synthetic, small-baseline benchmark and on large-baseline real-world recordings.
△ Less
Submitted 19 September, 2019;
originally announced September 2019.
-
A Proposed Framework for Interactive Virtual Reality In Situ Visualization of Parallel Numerical Simulations
Authors:
Aryaman Gupta,
Ulrik Günther,
Pietro Incardona,
Ata Deniz Aydin,
Raimund Dachselt,
Stefan Gumhold,
Ivo F. Sbalzarini
Abstract:
As computer simulations progress to increasingly complex, non-linear, and three-dimensional systems and phenomena, intuitive and immediate visualization of their results is becoming crucial. While Virtual Reality (VR) and Natural User Interfaces (NUIs) have been shown to improve understanding of complex 3D data, their application to live in situ visualization and computational steering is hampered…
▽ More
As computer simulations progress to increasingly complex, non-linear, and three-dimensional systems and phenomena, intuitive and immediate visualization of their results is becoming crucial. While Virtual Reality (VR) and Natural User Interfaces (NUIs) have been shown to improve understanding of complex 3D data, their application to live in situ visualization and computational steering is hampered by performance requirements. Here, we present the design of a software framework for interactive VR in situ visualization of parallel numerical simulations, as well as a working prototype implementation. Our design is targeted towards meeting the performance requirements for VR, and our work is packaged in a framework that allows for easy instrumentation of simulations. Our preliminary results inform about the technical feasibility of the architecture, as well as the challenges that remain.
△ Less
Submitted 6 September, 2019;
originally announced September 2019.
-
Structural Similarity based Anatomical and Functional Brain Imaging Fusion
Authors:
Nishant Kumar,
Nico Hoffmann,
Martin Oelschlägel,
Edmund Koch,
Matthias Kirsch,
Stefan Gumhold
Abstract:
Multimodal medical image fusion helps in combining contrasting features from two or more input imaging modalities to represent fused information in a single image. One of the pivotal clinical applications of medical image fusion is the merging of anatomical and functional modalities for fast diagnosis of malignant tissues. In this paper, we present a novel end-to-end unsupervised learning-based Co…
▽ More
Multimodal medical image fusion helps in combining contrasting features from two or more input imaging modalities to represent fused information in a single image. One of the pivotal clinical applications of medical image fusion is the merging of anatomical and functional modalities for fast diagnosis of malignant tissues. In this paper, we present a novel end-to-end unsupervised learning-based Convolutional Neural Network (CNN) for fusing the high and low frequency components of MRI-PET grayscale image pairs, publicly available at ADNI, by exploiting Structural Similarity Index (SSIM) as the loss function during training. We then apply color coding for the visualization of the fused image by quantifying the contribution of each input image in terms of the partial derivatives of the fused image. We find that our fusion and visualization approach results in better visual perception of the fused image, while also comparing favorably to previous methods when applying various quantitative assessment metrics.
△ Less
Submitted 18 September, 2019; v1 submitted 11 August, 2019;
originally announced August 2019.
-
Analysis of the Near-Wall Flow in a Turbine Cascade by Splat Visualization
Authors:
Baldwin Nsonga,
Gerik Scheuermann,
Stefan Gumhold,
Jordi Ventosa-Molina,
Denis Koschichow,
Jochen Fröhlich
Abstract:
Turbines are essential components of jet planes and power plants. Therefore, their efficiency and service life are of central engineering interest. In the case of jet planes or thermal power plants, the heating of the turbines due to the hot gas flow is critical. Besides effective cooling, it is a major goal of engineers to minimize heat transfer between gas flow and turbine by design. Since it is…
▽ More
Turbines are essential components of jet planes and power plants. Therefore, their efficiency and service life are of central engineering interest. In the case of jet planes or thermal power plants, the heating of the turbines due to the hot gas flow is critical. Besides effective cooling, it is a major goal of engineers to minimize heat transfer between gas flow and turbine by design. Since it is known that splat events have a substantial impact on the heat transfer between flow and immersed surfaces, we adapt a splat detection and visualization method to a turbine cascade simulation in this case study. Because splat events are small phenomena, we use a direct numerical simulation resolving the turbulence in the flow as the base of our analysis. The outcome shows promising insights into splat formation and its relation to vortex structures. This may lead to better turbine design in the future.
△ Less
Submitted 23 July, 2019;
originally announced July 2019.
-
scenery: Flexible Virtual Reality Visualization on the Java VM
Authors:
Ulrik Günther,
Tobias Pietzsch,
Aryaman Gupta,
Kyle I. S. Harrington,
Pavel Tomancak,
Stefan Gumhold,
Ivo F. Sbalzarini
Abstract:
Life science today involves computational analysis of a large amount and variety of data, such as volumetric data acquired by state-of-the-art microscopes, or mesh data from analysis of such data or simulations. Visualization is often the first step in making sense of data, and a crucial part of building and debugging analysis pipelines. It is therefore important that visualizations can be quickly…
▽ More
Life science today involves computational analysis of a large amount and variety of data, such as volumetric data acquired by state-of-the-art microscopes, or mesh data from analysis of such data or simulations. Visualization is often the first step in making sense of data, and a crucial part of building and debugging analysis pipelines. It is therefore important that visualizations can be quickly prototyped, as well as developed or embedded into full applications. In order to better judge spatiotemporal relationships, immersive hardware, such as Virtual or Augmented Reality (VR/AR) headsets and associated controllers are becoming invaluable tools. In this work we introduce scenery, a flexible VR/AR visualization framework for the Java VM that can handle mesh and large volumetric data, containing multiple views, timepoints, and color channels. scenery is free and open-source software, works on all major platforms, and uses the Vulkan or OpenGL rendering APIs. We introduce scenery's main features and example applications, such as its use in VR for microscopy, in the biomedical image analysis software Fiji, or for visualizing agent-based simulations.
△ Less
Submitted 22 April, 2020; v1 submitted 16 June, 2019;
originally announced June 2019.
-
Global Hypothesis Generation for 6D Object Pose Estimation
Authors:
Frank Michel,
Alexander Kirillov,
Eric Brachmann,
Alexander Krull,
Stefan Gumhold,
Bogdan Savchynskyy,
Carsten Rother
Abstract:
This paper addresses the task of estimating the 6D pose of a known 3D object from a single RGB-D image. Most modern approaches solve this task in three steps: i) Compute local features; ii) Generate a pool of pose-hypotheses; iii) Select and refine a pose from the pool. This work focuses on the second step. While all existing approaches generate the hypotheses pool via local reasoning, e.g. RANSAC…
▽ More
This paper addresses the task of estimating the 6D pose of a known 3D object from a single RGB-D image. Most modern approaches solve this task in three steps: i) Compute local features; ii) Generate a pool of pose-hypotheses; iii) Select and refine a pose from the pool. This work focuses on the second step. While all existing approaches generate the hypotheses pool via local reasoning, e.g. RANSAC or Hough-voting, we are the first to show that global reasoning is beneficial at this stage. In particular, we formulate a novel fully-connected Conditional Random Field (CRF) that outputs a very small number of pose-hypotheses. Despite the potential functions of the CRF being non-Gaussian, we give a new and efficient two-step optimization procedure, with some guarantees for optimality. We utilize our global hypotheses generation procedure to produce results that exceed state-of-the-art for the challenging "Occluded Object Dataset".
△ Less
Submitted 2 January, 2017; v1 submitted 7 December, 2016;
originally announced December 2016.
-
In situ, steerable, hardware-independent and data-structure agnostic visualization with ISAAC
Authors:
Alexander Matthes,
Axel Huebl,
René Widera,
Sebastian Grottel,
Stefan Gumhold,
Michael Bussmann
Abstract:
The computation power of supercomputers grows faster than the bandwidth of their storage and network. Especially applications using hardware accelerators like Nvidia GPUs cannot save enough data to be analyzed in a later step. There is a high risk of loosing important scientific information. We introduce the in situ template library ISAAC which enables arbitrary applications like scientific simula…
▽ More
The computation power of supercomputers grows faster than the bandwidth of their storage and network. Especially applications using hardware accelerators like Nvidia GPUs cannot save enough data to be analyzed in a later step. There is a high risk of loosing important scientific information. We introduce the in situ template library ISAAC which enables arbitrary applications like scientific simulations to live visualize their data without the need of deep copy operations or data transformation using the very same compute node and hardware accelerator the data is already residing on. Arbitrary meta data can be added to the renderings and user defined steering commands can be asynchronously sent back to the running application. Using an aggregating server, ISAAC streams the interactive visualization video and enables user to access their applications from everywhere.
△ Less
Submitted 28 November, 2016;
originally announced November 2016.
-
DSAC - Differentiable RANSAC for Camera Localization
Authors:
Eric Brachmann,
Alexander Krull,
Sebastian Nowozin,
Jamie Shotton,
Frank Michel,
Stefan Gumhold,
Carsten Rother
Abstract:
RANSAC is an important algorithm in robust optimization and a central building block for many computer vision applications. In recent years, traditionally hand-crafted pipelines have been replaced by deep learning pipelines, which can be trained in an end-to-end fashion. However, RANSAC has so far not been used as part of such deep learning pipelines, because its hypothesis selection procedure is…
▽ More
RANSAC is an important algorithm in robust optimization and a central building block for many computer vision applications. In recent years, traditionally hand-crafted pipelines have been replaced by deep learning pipelines, which can be trained in an end-to-end fashion. However, RANSAC has so far not been used as part of such deep learning pipelines, because its hypothesis selection procedure is non-differentiable. In this work, we present two different ways to overcome this limitation. The most promising approach is inspired by reinforcement learning, namely to replace the deterministic hypothesis selection by a probabilistic selection for which we can derive the expected loss w.r.t. to all learnable parameters. We call this approach DSAC, the differentiable counterpart of RANSAC. We apply DSAC to the problem of camera localization, where deep learning has so far failed to improve on traditional approaches. We demonstrate that by directly minimizing the expected loss of the output camera poses, robustly estimated by RANSAC, we achieve an increase in accuracy. In the future, any deep learning pipeline can use DSAC as a robust optimization component.
△ Less
Submitted 21 March, 2018; v1 submitted 17 November, 2016;
originally announced November 2016.
-
Learning Analysis-by-Synthesis for 6D Pose Estimation in RGB-D Images
Authors:
Alexander Krull,
Eric Brachmann,
Frank Michel,
Michael Ying Yang,
Stefan Gumhold,
Carsten Rother
Abstract:
Analysis-by-synthesis has been a successful approach for many tasks in computer vision, such as 6D pose estimation of an object in an RGB-D image which is the topic of this work. The idea is to compare the observation with the output of a forward process, such as a rendered image of the object of interest in a particular pose. Due to occlusion or complicated sensor noise, it can be difficult to pe…
▽ More
Analysis-by-synthesis has been a successful approach for many tasks in computer vision, such as 6D pose estimation of an object in an RGB-D image which is the topic of this work. The idea is to compare the observation with the output of a forward process, such as a rendered image of the object of interest in a particular pose. Due to occlusion or complicated sensor noise, it can be difficult to perform this comparison in a meaningful way. We propose an approach that "learns to compare", while taking these difficulties into account. This is done by describing the posterior density of a particular object pose with a convolutional neural network (CNN) that compares an observed and rendered image. The network is trained with the maximum likelihood paradigm. We observe empirically that the CNN does not specialize to the geometry or appearance of specific objects, and it can be used with objects of vastly different shapes and appearances, and in different backgrounds. Compared to state-of-the-art, we demonstrate a significant improvement on two different datasets which include a total of eleven objects, cluttered background, and heavy occlusion.
△ Less
Submitted 19 August, 2015;
originally announced August 2015.