-
SOLAS: Superpositioning an Optical Lens in Automotive Simulation
Authors:
Daniel Jakab,
Julian Barthel,
Alexander Braun,
Reenu Mohandas,
Brian Michael Deegan,
Mahendar Kumbham,
Dara Molloy,
Fiachra Collins,
Anthony Scanlan,
Ciarán Eising
Abstract:
Automotive Simulation is a potentially cost-effective strategy to identify and test corner case scenarios in automotive perception. Recent work has shown a significant shift in creating realistic synthetic data for road traffic scenarios using a video graphics engine. However, a gap exists in modeling realistic optical aberrations associated with cameras in automotive simulation. This paper builds on the concept from existing literature to model optical degradations in simulated environments using the Python-based ray-tracing library KrakenOS. As a novel pipeline, we degrade automotive fisheye simulation using an optical doublet with +/-2 deg Field of View (FOV), introducing realistic optical artifacts into two simulation images from SynWoodscape and Parallel Domain Woodscape. We evaluate KrakenOS by calculating the Root Mean Square Error (RMSE), which averaged around 0.023 across the RGB light spectrum compared to Ansys Zemax OpticStudio, an industrial benchmark for optical design and simulation. Lastly, we measure the image sharpness of the degraded simulation using the ISO12233:2023 Slanted Edge Method and show how both qualitative and measured results indicate the extent of the spatial variation in image sharpness from the periphery to the center of the degradations.
Submitted 16 January, 2025;
originally announced January 2025.
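The RMSE comparison mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say which quantity was compared between KrakenOS and Zemax OpticStudio, so the per-channel sample values below are hypothetical placeholders, and `rmse` is an illustrative helper, not code from the paper.

```python
import math

def rmse(a, b):
    """Root mean square error between two equal-length sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("inputs must be non-empty and the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))

# Hypothetical per-channel samples produced by two optical tools.
tool_a = {"R": [0.10, 0.50, 0.90], "G": [0.12, 0.55, 0.88], "B": [0.08, 0.45, 0.91]}
tool_b = {"R": [0.11, 0.48, 0.93], "G": [0.10, 0.56, 0.90], "B": [0.09, 0.47, 0.89]}

# One RMSE per colour channel, then the average across R, G and B.
per_channel = {c: rmse(tool_a[c], tool_b[c]) for c in tool_a}
mean_rmse = sum(per_channel.values()) / len(per_channel)
print(per_channel, mean_rmse)
```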
-
Automatic Bat Call Classification using Transformer Networks
Authors:
Frank Fundel,
Daniel A. Braun,
Sebastian Gottwald
Abstract:
Automatically identifying bat species from their echolocation calls is a difficult but important task for monitoring bats and the ecosystem they live in. Major challenges in automatic bat call identification are high call variability, similarities between species, interfering calls and lack of annotated data. Many currently available models suffer from relatively poor performance on real-life data due to being trained on single-call datasets and, moreover, are often too slow for real-time classification. Here, we propose a Transformer architecture for multi-label classification with potential applications in real-time classification scenarios. We train our model on synthetically generated multi-species recordings by merging multiple bat calls into a single recording with multiple simultaneous calls. Our approach achieves a single species accuracy of 88.92% (F1-score of 84.23%) and a multi species macro F1-score of 74.40% on our test set. In comparison to three other tools on the independent and publicly available dataset ChiroVox, our model achieves at least 25.82% better accuracy for single species classification and at least 6.9% better macro F1-score for multi species classification.
Submitted 20 September, 2023;
originally announced September 2023.
-
ECS -- an Interactive Tool for Data Quality Assurance
Authors:
Christian Sieberichs,
Simon Geerkens,
Alexander Braun,
Thomas Waschulzik
Abstract:
With the increasing capabilities of machine learning systems and their potential use in safety-critical systems, ensuring high-quality data is becoming increasingly important. In this paper we present a novel approach for the assurance of data quality. For this purpose, the mathematical basics are first discussed and the approach is presented using multiple examples. This results in the detection of data points with potentially harmful properties for use in safety-critical systems.
Submitted 17 July, 2023; v1 submitted 10 July, 2023;
originally announced July 2023.
-
ExAID: A Multimodal Explanation Framework for Computer-Aided Diagnosis of Skin Lesions
Authors:
Adriano Lucieri,
Muhammad Naseer Bajwa,
Stephan Alexander Braun,
Muhammad Imran Malik,
Andreas Dengel,
Sheraz Ahmed
Abstract:
One principal impediment in the successful deployment of AI-based Computer-Aided Diagnosis (CAD) systems in clinical workflows is their lack of transparent decision making. Although commonly used eXplainable AI methods provide some insight into opaque algorithms, such explanations are usually convoluted and not readily comprehensible except by highly trained experts. The explanation of decisions regarding the malignancy of skin lesions from dermoscopic images demands particular clarity, as the underlying medical problem definition is itself ambiguous. This work presents ExAID (Explainable AI for Dermatology), a novel framework for biomedical image analysis, providing multi-modal concept-based explanations consisting of easy-to-understand textual explanations supplemented by visual maps justifying the predictions. ExAID relies on Concept Activation Vectors to map human concepts to those learnt by arbitrary Deep Learning models in latent space, and Concept Localization Maps to highlight concepts in the input space. This identification of relevant concepts is then used to construct fine-grained textual explanations supplemented by concept-wise location information to provide comprehensive and coherent multi-modal explanations. All information is comprehensively presented in a diagnostic interface for use in clinical routines. An educational mode provides dataset-level explanation statistics and tools for data and model exploration to aid medical research and education. Through rigorous quantitative and qualitative evaluation of ExAID, we show the utility of multi-modal explanations for CAD-assisted scenarios even in case of wrong predictions. We believe that ExAID will provide dermatologists an effective screening tool that they both understand and trust. Moreover, it will be the basis for similar applications in other biomedical imaging fields.
Submitted 4 January, 2022;
originally announced January 2022.
-
A Comparison of Methods for OOV-word Recognition on a New Public Dataset
Authors:
Rudolf A. Braun,
Srikanth Madikeri,
Petr Motlicek
Abstract:
A common problem for automatic speech recognition systems is how to recognize words that they did not see during training. Currently there is no established method of evaluating different techniques for tackling this problem. We propose using the CommonVoice dataset to create test sets for multiple languages which have a high out-of-vocabulary (OOV) ratio relative to a training set and release a new tool for calculating relevant performance metrics. We then evaluate, within the context of a hybrid ASR system, how much better subword models are at recognizing OOVs, and how much benefit one can get from incorporating OOV-word information into an existing system by modifying WFSTs. Additionally, we propose a new method for modifying a subword-based language model so as to better recognize OOV-words. We showcase very large improvements in OOV-word recognition and make both the data and code available.
Submitted 16 July, 2021;
originally announced July 2021.
-
Very High Resolution Land Cover Mapping of Urban Areas at Global Scale with Convolutional Neural Networks
Authors:
Thomas Tilak,
Arnaud Braun,
David Chandler,
Nicolas David,
Sylvain Galopin,
Amélie Lombard,
Michaël Michaud,
Camille Parisel,
Matthieu Porte,
Marjorie Robert
Abstract:
This paper describes a methodology to produce a 7-class land cover map of urban areas from very high resolution images and limited noisy labeled data. The objective is to make a segmentation map of a large area (a French department) with the following classes: asphalt, bare soil, building, grassland, mineral material (permeable artificialized areas), forest and water from 20 cm aerial images and a Digital Height Model. We created a training dataset on a few areas of interest aggregating databases, semi-automatic classification, and manual annotation to get a complete ground truth in each class. A comparative study of different encoder-decoder architectures (U-Net, U-Net with Resnet encoders, Deeplab v3+) is presented with different loss functions. The final product is a highly valuable land cover map computed from model predictions stitched together, binarized, and refined before vectorization.
Submitted 12 May, 2020;
originally announced May 2020.
-
On Interpretability of Deep Learning based Skin Lesion Classifiers using Concept Activation Vectors
Authors:
Adriano Lucieri,
Muhammad Naseer Bajwa,
Stephan Alexander Braun,
Muhammad Imran Malik,
Andreas Dengel,
Sheraz Ahmed
Abstract:
Deep learning based medical image classifiers have shown remarkable prowess in various application areas like ophthalmology, dermatology, pathology, and radiology. However, the acceptance of these Computer-Aided Diagnosis (CAD) systems in real clinical setups is severely limited primarily because their decision-making process remains largely obscure. This work aims at elucidating a deep learning based medical image classifier by verifying that the model learns and utilizes similar disease-related concepts as described and employed by dermatologists. We used a well-trained and high performing neural network developed by REasoning for COmplex Data (RECOD) Lab for classification of three skin tumours, i.e. Melanocytic Naevi, Melanoma and Seborrheic Keratosis, and performed a detailed analysis on its latent space. Two well established and publicly available skin disease datasets, PH2 and derm7pt, are used for experimentation. Human understandable concepts are mapped to the RECOD image classification model with the help of Concept Activation Vectors (CAVs), introducing a novel training and significance testing paradigm for CAVs. Our results on an independent evaluation set clearly show that the classifier learns and encodes human understandable concepts in its latent representation. Additionally, TCAV scores (Testing with CAVs) suggest that the neural network indeed makes use of disease-related concepts in the correct way when making predictions. We anticipate that this work can not only increase the confidence of medical practitioners in CAD but also serve as a stepping stone for further development of CAV-based neural network interpretation methods.
Submitted 5 May, 2020;
originally announced May 2020.
-
Resolution and accuracy of non-linear regression of PSF with artificial neural networks
Authors:
Matthias Lehmann,
Christian Wittpahl,
Hatem Ben Zakour,
Alexander Braun
Abstract:
In a previous work we have demonstrated a novel numerical model for the point spread function (PSF) of an optical system that can efficiently model both experimental measurements and lens design simulations of the PSF. The novelty lies in the portability and the parameterization of this model, which allows for completely new ways to validate optical systems, which is especially interesting for mass production optics like in the automotive industry, but also for ophthalmology. The numerical basis for this model is a non-linear regression of the PSF with an artificial neural network (ANN). In this work we examine two important aspects of this model: the spatial resolution and the accuracy of the model. Measurement and simulation of a PSF can have a much higher resolution than the typical pixel size used in current camera sensors, especially those for the automotive industry. We discuss the influence this has on the topology of the ANN and the final application where the modeled PSF is actually used. Another important influence on the accuracy of the trained ANN is the error metric which is used during training. The PSF is a distinctly non-linear function, which varies strongly over field and defocus, but nonetheless exhibits strong symmetries and spatial relations. Therefore we examine different distance and similarity measures and discuss their influence on the modeling performance of the ANN.
Submitted 21 June, 2018;
originally announced June 2018.
-
Realistic Image Degradation with Measured PSF
Authors:
Christian Wittpahl,
Hatem Ben Zakour,
Matthias Lehmann,
Alexander Braun
Abstract:
Training autonomous vehicles requires lots of driving sequences in all situations\cite{zhao2016}. Typically a simulation environment (software-in-the-loop, SiL) accompanies real-world test drives to systematically vary environmental parameters. A missing piece in the optical model of those SiL simulations is the sharpness, given in linear system theory by the point-spread function (PSF) of the optical system. We present a novel numerical model for the PSF of an optical system that can efficiently model both experimental measurements and lens design simulations of the PSF. The numerical basis for this model is a non-linear regression of the PSF with an artificial neural network (ANN). The novelty lies in the portability and the parameterization of this model, which allows it to be applied in basically any conceivable optical simulation scenario, e.g. inserting a measured lens into a computer game to train autonomous vehicles. We present a lens measurement series, yielding a numerical function for the PSF that depends only on the parameters defocus, field and azimuth. By convolving existing images and videos with this PSF model we apply the measured lens as a transfer function, thereby generating an image as if it were seen with the measured lens itself. Applications of this method are in any optical scenario, but we focus on the context of autonomous driving, where the quality of the detection algorithms depends directly on the optical quality of the camera system used. With the parameterization of the optical model we present a method to validate the functional and safety limits of camera-based ADAS based on the real, measured lens actually used in the product.
Submitted 7 January, 2018;
originally announced January 2018.
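The step of convolving an image with a PSF can be sketched as follows. This is a minimal illustration under simplifying assumptions: a single fixed kernel is applied to the whole image, whereas the abstract describes a PSF that varies with defocus, field and azimuth. The function name `convolve2d`, the 3x3 box kernel, and the edge-clamping border rule are all hypothetical choices, not taken from the paper.

```python
def convolve2d(image, psf):
    """Convolve a 2-D grayscale image (list of rows) with a small PSF kernel.

    The kernel is normalised to unit sum so overall brightness is preserved.
    Borders are handled by clamping source coordinates to the image edge.
    """
    h, w = len(image), len(image[0])
    kh, kw = len(psf), len(psf[0])
    total = sum(sum(row) for row in psf)
    cy, cx = kh // 2, kw // 2
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0.0
            for j in range(kh):
                for i in range(kw):
                    # Convolution flips the kernel relative to correlation.
                    sy = min(max(y - (j - cy), 0), h - 1)
                    sx = min(max(x - (i - cx), 0), w - 1)
                    acc += image[sy][sx] * psf[j][i]
            out[y][x] = acc / total
    return out

# Hypothetical inputs: a single bright pixel blurred by a uniform 3x3 kernel.
point_source = [[0.0] * 5 for _ in range(5)]
point_source[2][2] = 1.0
box_psf = [[1.0] * 3 for _ in range(3)]
blurred = convolve2d(point_source, box_psf)
```

Blurring a point source reproduces the (normalised) kernel itself, which is a quick sanity check for any PSF convolution routine.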
-
Planning with Information-Processing Constraints and Model Uncertainty in Markov Decision Processes
Authors:
Jordi Grau-Moya,
Felix Leibfried,
Tim Genewein,
Daniel A. Braun
Abstract:
Information-theoretic principles for learning and acting have been proposed to solve particular classes of Markov Decision Problems. Mathematically, such approaches are governed by a variational free energy principle and allow solving MDP planning problems with information-processing constraints expressed in terms of a Kullback-Leibler divergence with respect to a reference distribution. Here we consider a generalization of such MDP planners by taking model uncertainty into account. As model uncertainty can also be formalized as an information-processing constraint, we can derive a unified solution from a single generalized variational principle. We provide a generalized value iteration scheme together with a convergence proof. As limit cases, this generalized scheme includes standard value iteration with a known model, Bayesian MDP planning, and robust planning. We demonstrate the benefits of this approach in a grid world simulation.
Submitted 7 April, 2016;
originally announced April 2016.
-
Information-Theoretic Bounded Rationality
Authors:
Pedro A. Ortega,
Daniel A. Braun,
Justin Dyer,
Kee-Eung Kim,
Naftali Tishby
Abstract:
Bounded rationality, that is, decision-making and planning under resource limitations, is widely regarded as an important open problem in artificial intelligence, reinforcement learning, computational neuroscience and economics. This paper offers a consolidated presentation of a theory of bounded rationality based on information-theoretic ideas. We provide a conceptual justification for using the free energy functional as the objective function for characterizing bounded-rational decisions. This functional possesses three crucial properties: it controls the size of the solution space; it has Monte Carlo planners that are exact, yet bypass the need for exhaustive search; and it captures model uncertainty arising from lack of evidence or from interacting with other agents having unknown intentions. We discuss the single-step decision-making case, and show how to extend it to sequential decisions using equivalence transformations. This extension yields a very general class of decision problems that encompass classical decision rules (e.g. EXPECTIMAX and MINIMAX) as limit cases, as well as trust- and risk-sensitive planning.
Submitted 21 December, 2015;
originally announced December 2015.
-
Free Energy and the Generalized Optimality Equations for Sequential Decision Making
Authors:
Pedro A. Ortega,
Daniel A. Braun
Abstract:
The free energy functional has recently been proposed as a variational principle for bounded rational decision-making, since it instantiates a natural trade-off between utility gains and information processing costs that can be axiomatically derived. Here we apply the free energy principle to general decision trees that include both adversarial and stochastic environments. We derive generalized sequential optimality equations that not only include the Bellman optimality equations as a limit case, but also lead to well-known decision-rules such as Expectimax, Minimax and Expectiminimax. We show how these decision-rules can be derived from a single free energy principle that assigns a resource parameter to each node in the decision tree. These resource parameters express a concrete computational cost that can be measured as the number of samples that are needed from the distribution that belongs to each node. The free energy principle therefore provides the normative basis for generalized optimality equations that account for both adversarial and stochastic environments.
Submitted 17 May, 2012;
originally announced May 2012.
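To illustrate how a single resource parameter per node can recover different decision rules, the sketch below evaluates a commonly used form of the free-energy value at one node, (1/beta) * log(sum_x p(x) * exp(beta * U(x))). The abstract does not give the equations, so this form, the function name `free_energy_value`, and the example utilities and probabilities are assumptions made for illustration; the paper should be consulted for the exact generalized optimality equations.

```python
import math

def free_energy_value(utilities, probs, beta):
    """Value of one decision-tree node under resource parameter beta.

    beta -> +inf approaches max(utilities)   (maximising node),
    beta -> 0    gives the expected utility  (chance node),
    beta -> -inf approaches min(utilities)   (adversarial node).
    """
    if beta == 0.0:
        return sum(p * u for p, u in zip(probs, utilities))
    # Keep only outcomes with non-zero prior probability.
    terms = [(p, beta * u) for p, u in zip(probs, utilities) if p > 0.0]
    # Log-sum-exp shift for numerical stability.
    shift = max(v for _, v in terms)
    total = sum(p * math.exp(v - shift) for p, v in terms)
    return (shift + math.log(total)) / beta

# Hypothetical node with three outcomes.
utilities = [1.0, 2.0, 4.0]
probs = [0.5, 0.25, 0.25]
for beta in (-50.0, 0.0, 50.0):
    print(beta, free_energy_value(utilities, probs, beta))
```

Applying this operator recursively from the leaves upward, with a different `beta` at each node, is one way to picture how Expectimax-like and Minimax-like rules could arise as limit cases.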