-
CliMB: An AI-enabled Partner for Clinical Predictive Modeling
Authors:
Evgeny Saveliev,
Tim Schubert,
Thomas Pouplin,
Vasilis Kosmoliaptsis,
Mihaela van der Schaar
Abstract:
Despite its significant promise and continuous technical advances, real-world applications of artificial intelligence (AI) remain limited. We attribute this to the "domain expert-AI-conundrum": while domain experts, such as clinician scientists, should be able to build predictive models such as risk scores, they face substantial barriers in accessing state-of-the-art (SOTA) tools. While automated machine learning (AutoML) has been proposed as a partner in clinical predictive modeling, many additional requirements need to be fulfilled to make machine learning accessible for clinician scientists.
To address this gap, we introduce CliMB, a no-code AI-enabled partner designed to empower clinician scientists to create predictive models using natural language. CliMB guides clinician scientists through the entire medical data science pipeline, thus empowering them to create predictive models from real-world data in just one conversation. CliMB also creates structured reports and interpretable visuals. In evaluations involving clinician scientists and systematic comparisons against a baseline GPT-4, CliMB consistently demonstrated superior performance in key areas such as planning, error prevention, code execution, and model performance. Moreover, in blinded assessments involving 45 clinicians from diverse specialties and career stages, more than 80% preferred CliMB over GPT-4. Overall, by providing a no-code interface with clear guidance and access to SOTA methods in the fields of data-centric AI, AutoML, and interpretable ML, CliMB empowers clinician scientists to build robust predictive models.
The proof-of-concept version of CliMB is available as open-source software on GitHub: https://github.com/vanderschaarlab/climb.
Submitted 25 November, 2024; v1 submitted 30 September, 2024;
originally announced October 2024.
-
Enhancing the analysis of murine neonatal ultrasonic vocalizations: Development, evaluation, and application of different mathematical models
Authors:
Rudolf Herdt,
Louisa Kinzel,
Johann Georg Maaß,
Marvin Walther,
Henning Fröhlich,
Tim Schubert,
Peter Maass,
Christian Patrick Schaaf
Abstract:
Rodents employ a broad spectrum of ultrasonic vocalizations (USVs) for social communication. As these vocalizations offer valuable insights into affective states, social interactions, and developmental stages of animals, various deep learning approaches have aimed to automate both the quantitative (detection) and qualitative (classification) analysis of USVs. Here, we present the first systematic evaluation of different types of neural networks for USV classification. We assessed various feedforward networks, including a custom-built, fully-connected network and convolutional neural network, different residual neural networks (ResNets), an EfficientNet, and a Vision Transformer (ViT). Paired with a refined, entropy-based detection algorithm (achieving recall of 94.9% and precision of 99.3%), the best architecture (achieving 86.79% accuracy) was integrated into a fully automated pipeline capable of analyzing extensive USV datasets with high reliability. Additionally, users can specify an individual minimum accuracy threshold based on their research needs. In this semi-automated setup, the pipeline selectively classifies calls with high pseudo-probability, leaving the rest for manual inspection. Our study focuses exclusively on neonatal USVs. As part of an ongoing phenotyping study, our pipeline has proven to be a valuable tool for identifying key differences in USVs produced by mice with autism-like behaviors.
Submitted 1 October, 2024; v1 submitted 17 May, 2024;
originally announced May 2024.
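The semi-automated setup in the abstract above (classify only calls with high pseudo-probability, leave the rest for manual inspection) can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function name `triage_calls`, the `cutoff` parameter, and the use of the largest class pseudo-probability as the confidence score are all assumptions. How a user-specified minimum accuracy maps to such a cutoff is not stated in the abstract and is not modeled here.

```python
from typing import List, Optional, Tuple


def triage_calls(
    probabilities: List[List[float]], cutoff: float
) -> Tuple[List[Optional[int]], List[int]]:
    """Split calls into auto-labeled and manual-review sets (hypothetical helper).

    probabilities: one list of per-class pseudo-probabilities per detected call.
    cutoff: minimum top-class pseudo-probability required to accept a label.
    Returns (labels, manual_indices); labels[i] is None for deferred calls.
    """
    labels: List[Optional[int]] = []
    manual_indices: List[int] = []
    for index, probs in enumerate(probabilities):
        # Index of the most probable class for this call.
        best_class = max(range(len(probs)), key=probs.__getitem__)
        if probs[best_class] >= cutoff:
            labels.append(best_class)
        else:
            # Not confident enough: defer this call to a human annotator.
            labels.append(None)
            manual_indices.append(index)
    return labels, manual_indices


example = [[0.9, 0.1], [0.55, 0.45], [0.2, 0.8]]
labels, manual = triage_calls(example, cutoff=0.75)
print(labels, manual)
```

Raising the cutoff sends more calls to manual inspection in exchange for fewer automatic mislabels.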
-
A Reinforcement Learning Environment for Directed Quantum Circuit Synthesis
Authors:
Michael Kölle,
Tom Schubert,
Philipp Altmann,
Maximilian Zorn,
Jonas Stein,
Claudia Linnhoff-Popien
Abstract:
With recent advancements in quantum computing technology, optimizing quantum circuits and ensuring reliable quantum state preparation have become increasingly vital. Traditional methods often demand extensive expertise and manual calculations, posing challenges as quantum circuits grow in qubit- and gate-count. Therefore, harnessing machine learning techniques to handle the growing variety of gate-to-qubit combinations is a promising approach. In this work, we introduce a comprehensive reinforcement learning environment for quantum circuit synthesis, where circuits are constructed utilizing gates from the Clifford+T gate set to prepare specific target states. Our experiments focus on exploring the relationship between the depth of synthesized quantum circuits and the circuit depths used for target initialization, as well as qubit count. We organize the environment configurations into multiple evaluation levels and include a range of well-known quantum states for benchmarking purposes. We also lay baselines for evaluating the environment using Proximal Policy Optimization. By applying the trained agents to benchmark tests, we demonstrated their ability to reliably design minimal quantum circuits for a selection of 2-qubit Bell states.
Submitted 13 January, 2024;
originally announced January 2024.
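As a concrete picture of the kind of target the abstract above mentions, a 2-qubit Bell state can be prepared from |00> with two gates from the Clifford+T set: a Hadamard on the first qubit followed by a CNOT. The sketch below simulates that textbook circuit with a plain state vector. It does not reproduce the paper's environment or agents; the function names and the amplitude ordering (|00>, |01>, |10>, |11>) are choices made for this illustration.

```python
import math
from typing import List


def apply_h_on_first_qubit(state: List[float]) -> List[float]:
    """Apply a Hadamard gate to the first qubit of a 2-qubit state."""
    s = 1.0 / math.sqrt(2.0)
    a00, a01, a10, a11 = state
    return [s * (a00 + a10), s * (a01 + a11), s * (a00 - a10), s * (a01 - a11)]


def apply_cnot(state: List[float]) -> List[float]:
    """Apply CNOT with the first qubit as control and the second as target."""
    a00, a01, a10, a11 = state
    # The target bit flips only where the control bit is 1: |10> <-> |11>.
    return [a00, a01, a11, a10]


# Start in |00>, then apply H followed by CNOT.
initial = [1.0, 0.0, 0.0, 0.0]
bell = apply_cnot(apply_h_on_first_qubit(initial))
print(bell)  # amplitudes of (|00> + |11>) / sqrt(2)
```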
-
A Foundational Framework and Methodology for Personalized Early and Timely Diagnosis
Authors:
Tim Schubert,
Richard W Peck,
Alexander Gimson,
Camelia Davtyan,
Mihaela van der Schaar
Abstract:
Early diagnosis of diseases holds the potential for deep transformation in healthcare by enabling better treatment options, improving long-term survival and quality of life, and reducing overall cost. With the advent of medical big data, advances in diagnostic tests as well as in machine learning and statistics, early or timely diagnosis seems within reach. Early diagnosis research often neglects the potential for optimizing individual diagnostic paths. To enable personalized early diagnosis, a foundational framework is needed that delineates the diagnosis process and systematically identifies the time-dependent value of various diagnostic tests for an individual patient given their unique characteristics. Here, we propose the first foundational framework for early and timely diagnosis. It builds on decision-theoretic approaches to outline the diagnosis process and integrates machine learning and statistical methodology for estimating the optimal personalized diagnostic path. To describe the proposed framework as well as possibly other frameworks, we provide essential definitions.
The development of a foundational framework is necessary for several reasons: 1) formalism provides clarity for the development of decision support tools; 2) observed information can be complemented with estimates of the future patient trajectory; 3) the net benefit of counterfactual diagnostic paths and associated uncertainties can be modeled for individuals; 4) 'early' and 'timely' diagnosis can be clearly defined; 5) a mechanism emerges for assessing the value of technologies in terms of their impact on personalized early diagnosis, resulting health outcomes and incurred costs.
Finally, we hope that this foundational framework will unlock the long-awaited potential of timely diagnosis and intervention, leading to improved outcomes for patients and higher cost-effectiveness for healthcare systems.
Submitted 26 November, 2023;
originally announced November 2023.
-
RTClean: Context-aware Tabular Data Cleaning using Real-time OFDs
Authors:
Daniel Del Gaudio,
Tim Schubert,
Mohamed Abdelaal
Abstract:
Machine learning now plays a key role in many applications, e.g., smart homes, smart medical assistance, and autonomous driving. A major challenge for these applications is maintaining high quality of the training and serving data. However, existing data cleaning methods cannot exploit context information, so they usually fail to track shifts in the data distributions or the associated error profiles. To overcome these limitations, we introduce a novel method for automated tabular data cleaning powered by dynamic functional dependency rules extracted from a live context model. As a proof of concept, we create a smart home use case to collect data while preserving the context information. Using two different data sets, our evaluations show that the proposed cleaning method outperforms a set of baseline methods in terms of detection and repair accuracy.
Submitted 9 February, 2023;
originally announced February 2023.
-
Measuring Presence in Augmented Reality Environments: Design and a First Test of a Questionnaire
Authors:
Holger Regenbrecht,
Thomas Schubert
Abstract:
Augmented Reality (AR) enriches a user's real environment by adding spatially aligned virtual objects (3D models, 2D textures, textual annotations, etc.) by means of special display technologies. These are either worn on the body or placed in the working environment. From a technical point of view, AR faces three major challenges: (1) to generate a high-quality rendering, (2) to precisely register (in position and orientation) the virtual objects (VOs) with the real environment, and (3) to do so in interactive real-time (Regenbrecht, Wagner, and Baratoff, 2002). The goal is to create the impression that the VOs are part of the real environment. Therefore, and similar to definitions of virtual reality (Steuer, 1992), it makes sense to define AR from a psychological point of view: Augmented Reality conveys the impression that VOs are present in the real environment. In order to evaluate how well this goal is reached, a psychological measurement of this type of presence is necessary. In the following, we will describe technological features of AR systems that make a special questionnaire version necessary, and describe our approach to the questionnaire development and the data collection strategy. Finally, we will present first results of the application of the questionnaire in a recent study with 385 participants.
Submitted 3 March, 2021;
originally announced March 2021.
-
Closed-Form Full Map Posteriors for Robot Localization with Lidar Sensors
Authors:
Lukas Luft,
Alexander Schaefer,
Tobias Schubert,
Wolfram Burgard
Abstract:
A popular class of lidar-based grid mapping algorithms computes for each map cell the probability that it reflects an incident laser beam. These algorithms typically determine the map as the set of reflection probabilities that maximizes the likelihood of the underlying laser data and do not compute the full posterior distribution over all possible maps. Thereby, they discard crucial information about the confidence of the estimate. The approach presented in this paper preserves this information by determining the full map posterior. In general, this problem is hard because distributions over real-valued quantities can possess infinitely many dimensions. However, for two state-of-the-art beam-based lidar models, our approach yields closed-form map posteriors that possess only two parameters per cell. Even better, these posteriors come for free, in the sense that they use the same parameters as the traditional approaches, without the need for additional computations. An important use case for grid maps is robot localization, which we formulate as Bayesian filtering based on the closed-form map posterior rather than based on a single map. The resulting measurement likelihoods can also be expressed in closed form. In simulations and extensive real-world experiments, we show that leveraging the full map posterior improves the localization accuracy compared to approaches that use the most likely map.
Submitted 23 October, 2019;
originally announced October 2019.
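The idea in the abstract above, a per-cell posterior with two parameters that reuses the counts a maximum-likelihood map already stores, can be illustrated with a standard conjugate model. The sketch assumes each beam crossing a cell is an independent Bernoulli trial (reflected or not) with a Beta prior on the reflection probability, so the posterior is again a Beta distribution. This model, the uniform prior, and the function names are assumptions made for illustration; the paper's own sensor models and derivation may differ.

```python
from typing import Tuple


def reflection_posterior(
    hits: int, misses: int, prior_a: float = 1.0, prior_b: float = 1.0
) -> Tuple[float, float]:
    """Beta posterior parameters for one cell (assumed Beta-Bernoulli model).

    hits: number of beams reflected by the cell.
    misses: number of beams that passed through the cell.
    """
    return prior_a + hits, prior_b + misses


def beta_mean_and_variance(a: float, b: float) -> Tuple[float, float]:
    """Mean and variance of a Beta(a, b) distribution."""
    total = a + b
    mean = a / total
    variance = (a * b) / (total * total * (total + 1.0))
    return mean, variance


# Two cells with the same maximum-likelihood estimate hits / (hits + misses) = 0.5
# but very different amounts of evidence.
for hits, misses in [(1, 1), (50, 50)]:
    a, b = reflection_posterior(hits, misses)
    mean, variance = beta_mean_and_variance(a, b)
    print(hits, misses, mean, variance)
```

Both cells have the same most likely reflection probability, but the posterior variance of the well-observed cell is much smaller. A single most-likely map discards this confidence information.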
-
ChromaPhy - A Living Wearable Connecting Humans and Their Environment
Authors:
Theresa Schubert
Abstract:
This research presents an artistic project aiming to make cyberfiction become reality and exemplifying a current trend in art and science collaborations. Chroma+Phy is a speculative design for a living wearable that combines the protoplasmic structure of the amoeboid acellular organism Physarum polycephalum and the chromatophores of the reptile Chameleon. The underpinning idea is that in a future far away or close, on planet earth or in outer space, humans will need some tools to help them in their social life and day-to-day routine. Chroma+Phy enhances the body aiming at humans in extreme habitats for an aggression-free and healthy life. Our approach will address actual issues of scientific discovery for society and catalyse idea translation through art and design experiments at frontiers of science.
Submitted 26 March, 2014;
originally announced March 2014.
-
Are motorways rational from slime mould's point of view?
Authors:
Andrew Adamatzky,
Selim Akl,
Ramon Alonso-Sanz,
Wesley van Dessel,
Zuwairie Ibrahim,
Andrew Ilachinski,
Jeff Jones,
Anne V. D. M. Kayem,
Genaro J. Martinez,
Pedro de Oliveira,
Mikhail Prokopenko,
Theresa Schubert,
Peter Sloot,
Emanuele Strano,
Xin-She Yang
Abstract:
We analyse the results of our experimental laboratory approximation of motorway networks with slime mould Physarum polycephalum. Motorway networks of fourteen geographical areas are considered: Australia, Africa, Belgium, Brazil, Canada, China, Germany, Iberia, Italy, Malaysia, Mexico, The Netherlands, UK, USA. For each geographical entity we represented major urban areas by oat flakes and inoculated the slime mould in a capital. After the slime mould spanned all urban areas with a network of its protoplasmic tubes, we extracted a generalised Physarum graph from the network and compared the graphs with an abstract motorway graph using the most common measures. The measures employed are the number of independent cycles, cohesion, shortest path lengths, diameter, the Harary index and the Randic index. We obtained a series of intriguing results, and found that the slime mould best approximates the motorway graphs of Belgium, Canada and China, and that for all entities studied the best match between Physarum and motorway graphs is detected by the Randic index (molecular branching index).
Submitted 13 March, 2012;
originally announced March 2012.
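For readers unfamiliar with the Randic index named in the abstract above, the standard definition is the sum, over all edges (u, v) of a graph, of 1 / sqrt(deg(u) * deg(v)). The sketch below computes it for a simple undirected graph given as an edge list. The function name and the toy graphs are illustrative and are not taken from the paper's data.

```python
import math
from collections import Counter
from typing import Hashable, Iterable, Tuple


def randic_index(edges: Iterable[Tuple[Hashable, Hashable]]) -> float:
    """Randic index of a simple undirected graph.

    Assumes each edge appears once and the graph has no self-loops.
    """
    edge_list = list(edges)
    # Count how many edges touch each node to obtain node degrees.
    degree: Counter = Counter()
    for u, v in edge_list:
        degree[u] += 1
        degree[v] += 1
    return sum(1.0 / math.sqrt(degree[u] * degree[v]) for u, v in edge_list)


path = [("a", "b"), ("b", "c")]                  # three nodes in a line
triangle = [("a", "b"), ("b", "c"), ("c", "a")]  # three nodes in a cycle
print(randic_index(path), randic_index(triangle))
```

Two graphs on the same set of nodes, such as a slime mould network and a motorway network over the same urban areas, could then be compared by the difference of their index values.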