Showing 1–50 of 161 results for author: Müller, P

  1. arXiv:2505.18175  [pdf, ps, other]

    eess.SP cs.AI cs.CV cs.HC cs.LG

    Evaluation in EEG Emotion Recognition: State-of-the-Art Review and Unified Framework

    Authors: Natia Kukhilava, Tatia Tsmindashvili, Rapael Kalandadze, Anchit Gupta, Sofio Katamadze, François Brémond, Laura M. Ferrari, Philipp Müller, Benedikt Emanuel Wirth

    Abstract: Electroencephalography-based Emotion Recognition (EEG-ER) has become a growing research area in recent years. Analyzing 216 papers published between 2018 and 2023, we uncover that the field lacks a unified evaluation protocol, which is essential to fairly define the state of the art, compare new approaches and to track the field's progress. We report the main inconsistencies between the used evalu…

    Submitted 14 May, 2025; originally announced May 2025.

    Comments: This work has been submitted to the IEEE for possible publication

  2. arXiv:2505.16725  [pdf, ps, other]

    cs.LG cs.CV

    Masked Conditioning for Deep Generative Models

    Authors: Phillip Mueller, Jannik Wiese, Sebastian Mueller, Lars Mikelsons

    Abstract: Datasets in engineering domains are often small, sparsely labeled, and contain numerical as well as categorical conditions. Additionally, computational resources are typically limited in practical applications which hinders the adoption of generative models for engineering tasks. We introduce a novel masked-conditioning approach, that enables generative models to work with sparse, mixed-type data.…

    Submitted 22 May, 2025; originally announced May 2025.

  3. arXiv:2505.06594  [pdf, ps, other]

    cs.CL cs.CV

    Integrating Video and Text: A Balanced Approach to Multimodal Summary Generation and Evaluation

    Authors: Galann Pennec, Zhengyuan Liu, Nicholas Asher, Philippe Muller, Nancy F. Chen

    Abstract: Vision-Language Models (VLMs) often struggle to balance visual and textual information when summarizing complex multimodal inputs, such as entire TV show episodes. In this paper, we propose a zero-shot video-to-text summarization approach that builds its own screenplay representation of an episode, effectively integrating key video moments, dialogue, and character information into a unified docume…

    Submitted 10 May, 2025; originally announced May 2025.

  4. arXiv:2504.18510  [pdf, other]

    cs.CV

    Examining the Impact of Optical Aberrations to Image Classification and Object Detection Models

    Authors: Patrick Müller, Alexander Braun, Margret Keuper

    Abstract: Deep neural networks (DNNs) have proven to be successful in various computer vision applications such that models even infer in safety-critical situations. Therefore, vision models have to behave in a robust way to disturbances such as noise or blur. While seminal benchmarks exist to evaluate model robustness to diverse corruptions, blur is often approximated in an overly simplistic way to model d…

    Submitted 25 April, 2025; originally announced April 2025.

    Comments: v1.0

  5. arXiv:2503.21911  [pdf, other]

    cs.CL cs.AI

    AutoPsyC: Automatic Recognition of Psychodynamic Conflicts from Semi-structured Interviews with Large Language Models

    Authors: Sayed Muddashir Hossain, Simon Ostermann, Patrick Gebhard, Cord Benecke, Josef van Genabith, Philipp Müller

    Abstract: Psychodynamic conflicts are persistent, often unconscious themes that shape a person's behaviour and experiences. Accurate diagnosis of psychodynamic conflicts is crucial for effective patient treatment and is commonly done via long, manually scored semi-structured interviews. Existing automated solutions for psychiatric diagnosis tend to focus on the recognition of broad disorder categories such…

    Submitted 27 March, 2025; originally announced March 2025.

  6. arXiv:2503.21691  [pdf, ps, other]

    cs.PL

    Place Capability Graphs: A General-Purpose Model of Rust's Ownership and Borrowing Guarantees

    Authors: Zachary Grannan, Aurel Bílý, Jonáš Fiala, Jasper Geer, Markus de Medeiros, Peter Müller, Alexander J. Summers

    Abstract: Rust's novel type system has proved an attractive target for verification and program analysis tools, due to the rich guarantees it provides for controlling aliasing and mutability. However, fully understanding, extracting and exploiting these guarantees is subtle and challenging: existing models for Rust's type checking either support a smaller idealised language disconnected from real-world Rust…

    Submitted 4 April, 2025; v1 submitted 27 March, 2025; originally announced March 2025.

  7. arXiv:2503.14002  [pdf, other]

    cs.CV cs.AI cs.LG

    MeshFleet: Filtered and Annotated 3D Vehicle Dataset for Domain Specific Generative Modeling

    Authors: Damian Boborzi, Phillip Mueller, Jonas Emrich, Dominik Schmid, Sebastian Mueller, Lars Mikelsons

    Abstract: Generative models have recently made remarkable progress in the field of 3D objects. However, their practical application in fields like engineering remains limited since they fail to deliver the accuracy, quality, and controllability needed for domain-specific tasks. Fine-tuning large generative models is a promising perspective for making these models available in these fields. Creating high-qua…

    Submitted 18 March, 2025; originally announced March 2025.

  8. arXiv:2503.05245  [pdf, other]

    eess.IV cs.CV

    L-FUSION: Laplacian Fetal Ultrasound Segmentation & Uncertainty Estimation

    Authors: Johanna P. Müller, Robert Wright, Thomas G. Day, Lorenzo Venturini, Samuel F. Budd, Hadrien Reynaud, Joseph V. Hajnal, Reza Razavi, Bernhard Kainz

    Abstract: Accurate analysis of prenatal ultrasound (US) is essential for early detection of developmental anomalies. However, operator dependency and technical limitations (e.g. intrinsic artefacts and effects, setting errors) can complicate image interpretation and the assessment of diagnostic uncertainty. We present L-FUSION (Laplacian Fetal US Segmentation with Integrated FoundatiON models), a framework…

    Submitted 12 March, 2025; v1 submitted 7 March, 2025; originally announced March 2025.

    Comments: Under Review

  9. arXiv:2410.07299  [pdf, other]

    cs.LG cs.AI cs.CV

    Towards Generalisable Time Series Understanding Across Domains

    Authors: Özgün Turgut, Philip Müller, Martin J. Menten, Daniel Rueckert

    Abstract: Recent breakthroughs in natural language processing and computer vision, driven by efficient pre-training on large datasets, have enabled foundation models to excel on a wide range of tasks. However, this potential has not yet been fully realised in time series analysis, as existing methods fail to address the heterogeneity in large time series corpora. Prevalent in domains ranging from medicine t…

    Submitted 31 January, 2025; v1 submitted 9 October, 2024; originally announced October 2024.

  10. arXiv:2409.17045  [pdf, other]

    cs.CV cs.AI

    GeoBiked: A Dataset with Geometric Features and Automated Labeling Techniques to Enable Deep Generative Models in Engineering Design

    Authors: Phillip Mueller, Sebastian Mueller, Lars Mikelsons

    Abstract: We provide a dataset for enabling Deep Generative Models (DGMs) in engineering design and propose methods to automate data labeling by utilizing large-scale foundation models. GeoBiked is curated to contain 4 355 bicycle images, annotated with structural and technical features and is used to investigate two automated labeling techniques: The utilization of consolidated latent features (Hyperfeatur…

    Submitted 22 May, 2025; v1 submitted 25 September, 2024; originally announced September 2024.

  11. arXiv:2409.09387  [pdf, other]

    eess.IV cs.CV

    Estimating Neural Orientation Distribution Fields on High Resolution Diffusion MRI Scans

    Authors: Mohammed Munzer Dwedari, William Consagra, Philip Müller, Özgün Turgut, Daniel Rueckert, Yogesh Rathi

    Abstract: The Orientation Distribution Function (ODF) characterizes key brain microstructural properties and plays an important role in understanding brain structural connectivity. Recent works introduced Implicit Neural Representation (INR) based approaches to form a spatially aware continuous estimate of the ODF field and demonstrated promising results in key tasks of interest when compared to conventiona…

    Submitted 14 September, 2024; originally announced September 2024.

    Comments: 16 pages, 8 figures, conference: Medical Image Computing and Computer-Assisted Intervention (MICCAI)

  12. arXiv:2409.08666  [pdf, other]

    cs.LG cs.AI

    Towards certifiable AI in aviation: landscape, challenges, and opportunities

    Authors: Hymalai Bello, Daniel Geißler, Lala Ray, Stefan Müller-Divéky, Peter Müller, Shannon Kittrell, Mengxi Liu, Bo Zhou, Paul Lukowicz

    Abstract: Artificial Intelligence (AI) methods are powerful tools for various domains, including critical fields such as avionics, where certification is required to achieve and maintain an acceptable level of safety. General solutions for safety-critical systems must address three main questions: Is it suitable? What drives the system's decisions? Is it robust to errors/attacks? This is more complex in AI…

    Submitted 13 September, 2024; originally announced September 2024.

  13. MultiMediate'24: Multi-Domain Engagement Estimation

    Authors: Philipp Müller, Michal Balazia, Tobias Baur, Michael Dietz, Alexander Heimerl, Anna Penzkofer, Dominik Schiller, François Brémond, Jan Alexandersson, Elisabeth André, Andreas Bulling

    Abstract: Estimating the momentary level of participant's engagement is an important prerequisite for assistive systems that support human interactions. Previous work has addressed this task in within-domain evaluation scenarios, i.e. training and testing on the same dataset. This is in contrast to real-life scenarios where domain shifts between training and testing data frequently occur. With MultiMediate'…

    Submitted 29 August, 2024; originally announced August 2024.

    Comments: arXiv admin note: text overlap with arXiv:2308.08256

  14. arXiv:2408.04420  [pdf, other]

    cs.CL

    Recognizing Emotion Regulation Strategies from Human Behavior with Large Language Models

    Authors: Philipp Müller, Alexander Heimerl, Sayed Muddashir Hossain, Lea Siegel, Jan Alexandersson, Patrick Gebhard, Elisabeth André, Tanja Schneeberger

    Abstract: Human emotions are often not expressed directly, but regulated according to internal processes and social display rules. For affective computing systems, an understanding of how users regulate their emotions can be highly useful, for example to provide feedback in job interview training, or in psychotherapeutic scenarios. However, at present no method to automatically classify different emotion re…

    Submitted 8 August, 2024; originally announced August 2024.

    Comments: Accepted to ACII'24

  15. arXiv:2408.03560  [pdf, other]

    cs.LG stat.ML

    In2Core: Leveraging Influence Functions for Coreset Selection in Instruction Finetuning of Large Language Models

    Authors: Ayrton San Joaquin, Bin Wang, Zhengyuan Liu, Nicholas Asher, Brian Lim, Philippe Muller, Nancy F. Chen

    Abstract: Despite advancements, fine-tuning Large Language Models (LLMs) remains costly due to the extensive parameter count and substantial data requirements for model generalization. Accessibility to computing resources remains a barrier for the open-source community. To address this challenge, we propose the In2Core algorithm, which selects a coreset by analyzing the correlation between training and eval…

    Submitted 2 October, 2024; v1 submitted 7 August, 2024; originally announced August 2024.

    Comments: EMNLP 2024 - Findings

  16. Formal Foundations for Translational Separation Logic Verifiers (extended version)

    Authors: Thibault Dardinier, Michael Sammler, Gaurav Parthasarathy, Alexander J. Summers, Peter Müller

    Abstract: Program verification tools are often implemented as front-end translations of an input program into an intermediate verification language (IVL) such as Boogie, GIL, Viper, or Why3. The resulting IVL program is then verified using an existing back-end verifier. A soundness proof for such a translational verifier needs to relate the input program and verification logic to the semantics of the IVL, w…

    Submitted 20 December, 2024; v1 submitted 29 July, 2024; originally announced July 2024.

    Comments: Extended version of POPL'25 paper

    Journal ref: Proc. ACM Program. Lang. 9, POPL, Article 20 (January 2025)

  17. arXiv:2407.19427  [pdf]

    cs.CY cs.HC

    The influence of Automated Decision-Making systems in the context of street-level bureaucrats' practices

    Authors: Manuel Portela, A. Paula Rodriguez Müller, Luca Tangi

    Abstract: In an era of digital governance, the use of automation for individual and cooperative work is increasing in public administrations (Tangi et al., 2022). Despite the promises of efficiency and cost reduction, automation could bring new challenges to the governance schemes. Regional, national, and local governments are taking measures to regulate and measure the impact of automated decision-making s…

    Submitted 28 July, 2024; originally announced July 2024.

  18. arXiv:2407.11104  [pdf, other]

    cs.LG cs.AI

    Exploring the Potentials and Challenges of Deep Generative Models in Product Design Conception

    Authors: Phillip Mueller, Lars Mikelsons

    Abstract: The synthesis of product design concepts stands at the crux of early-phase development processes for technical products, traditionally posing an intricate interdisciplinary challenge. The application of deep learning methods, particularly Deep Generative Models (DGMs), holds the promise of automating and streamlining manual iterations and therefore introducing heightened levels of innovation and e…

    Submitted 21 May, 2025; v1 submitted 15 July, 2024; originally announced July 2024.

  19. arXiv:2407.10592  [pdf, other]

    cs.CV

    InsertDiffusion: Identity Preserving Visualization of Objects through a Training-Free Diffusion Architecture

    Authors: Phillip Mueller, Jannik Wiese, Ioan Craciun, Lars Mikelsons

    Abstract: Recent advancements in image synthesis are fueled by the advent of large-scale diffusion models. Yet, integrating realistic object visualizations seamlessly into new or existing backgrounds without extensive training remains a challenge. This paper introduces InsertDiffusion, a novel, training-free diffusion architecture that efficiently embeds objects into images while preserving their structural…

    Submitted 15 July, 2024; originally announced July 2024.

  20. arXiv:2407.08410  [pdf, other]

    cs.AI

    Specialized curricula for training vision-language models in retinal image analysis

    Authors: Robbie Holland, Thomas R. P. Taylor, Christopher Holmes, Sophie Riedl, Julia Mai, Maria Patsiamanidi, Dimitra Mitsopoulou, Paul Hager, Philip Müller, Hendrik P. N. Scholl, Hrvoje Bogunović, Ursula Schmidt-Erfurth, Daniel Rueckert, Sobha Sivaprasad, Andrew J. Lotery, Martin J. Menten

    Abstract: Clinicians spend a significant amount of time reviewing medical images and transcribing their findings regarding patient diagnosis, referral and treatment in text form. Vision-language models (VLMs), which automatically interpret images and summarize their findings as text, have enormous potential to alleviate clinical workloads and increase patient access to high-quality medical care. While found…

    Submitted 24 February, 2025; v1 submitted 11 July, 2024; originally announced July 2024.

    Comments: Under review at npj Digital Medicine

  21. arXiv:2406.16611  [pdf, other]

    cs.CL cs.AI

    Evaluation of Language Models in the Medical Context Under Resource-Constrained Settings

    Authors: Andrea Posada, Daniel Rueckert, Felix Meissen, Philip Müller

    Abstract: Since the Transformer architecture emerged, language model development has grown, driven by their promising potential. Releasing these models into production requires properly understanding their behavior, particularly in sensitive domains like medicine. Despite this need, the medical literature still lacks practical assessment of pre-trained language models, which are especially valuable in setti…

    Submitted 23 October, 2024; v1 submitted 24 June, 2024; originally announced June 2024.

  22. arXiv:2406.14038  [pdf, other]

    cs.CV cs.AI

    Resource-efficient Medical Image Analysis with Self-adapting Forward-Forward Networks

    Authors: Johanna P. Müller, Bernhard Kainz

    Abstract: We introduce a fast Self-adapting Forward-Forward Network (SaFF-Net) for medical imaging analysis, mitigating power consumption and resource limitations, which currently primarily stem from the prevalent reliance on back-propagation for model training and fine-tuning. Building upon the recently proposed Forward-Forward Algorithm (FFA), we introduce the Convolutional Forward-Forward Algorithm (CFFA…

    Submitted 17 July, 2024; v1 submitted 20 June, 2024; originally announced June 2024.

    Comments: Accepted for MICCAI Workshop MLMI 2024

  23. arXiv:2406.04769  [pdf, other]

    eess.IV cs.CV cs.LG

    Diffusion-based Generative Image Outpainting for Recovery of FOV-Truncated CT Images

    Authors: Michelle Espranita Liman, Daniel Rueckert, Florian J. Fintelmann, Philip Müller

    Abstract: Field-of-view (FOV) recovery of truncated chest CT scans is crucial for accurate body composition analysis, which involves quantifying skeletal muscle and subcutaneous adipose tissue (SAT) on CT slices. This, in turn, enables disease prognostication. Here, we present a method for recovering truncated CT slices using generative image outpainting. We train a diffusion model and apply it to truncated…

    Submitted 26 September, 2024; v1 submitted 7 June, 2024; originally announced June 2024.

    Comments: Shared last authorship: Florian J. Fintelmann and Philip Müller

  24. ADESSE: Advice Explanations in Complex Repeated Decision-Making Environments

    Authors: Sören Schleibaum, Lu Feng, Sarit Kraus, Jörg P. Müller

    Abstract: In the evolving landscape of human-centered AI, fostering a synergistic relationship between humans and AI agents in decision-making processes stands as a paramount challenge. This work considers a problem setup where an intelligent agent comprising a neural network-based prediction component and a deep reinforcement learning component provides advice to a human decision-maker in complex repeated…

    Submitted 10 September, 2024; v1 submitted 31 May, 2024; originally announced May 2024.

    Journal ref: Proceedings of the Thirty-Third International Joint Conference on Artificial Intelligence (2024)

  25. arXiv:2405.10661  [pdf, other]

    cs.PL

    Verification Algorithms for Automated Separation Logic Verifiers

    Authors: Marco Eilers, Malte Schwerhoff, Peter Müller

    Abstract: Most automated program verifiers for separation logic use either symbolic execution or verification condition generation to extract proof obligations, which are then handed over to an SMT solver. Existing verification algorithms are designed to be sound, but differ in performance and completeness. These characteristics may also depend on the programs and properties to be verified. Consequently, de…

    Submitted 27 May, 2024; v1 submitted 17 May, 2024; originally announced May 2024.

  26. arXiv:2405.08372  [pdf, ps, other]

    cs.PL cs.LO

    Reasoning about Interior Mutability in Rust using Library-Defined Capabilities

    Authors: Federico Poli, Xavier Denis, Peter Müller, Alexander J. Summers

    Abstract: Existing automated verification techniques for safe Rust code rely on the strong type-system properties to reason about programs, especially to deduce which memory locations do not change (i.e., are framed) across function calls. However, these type guarantees do not hold in the presence of interior mutability (e.g., when interacting with any concurrent data structure). As a consequence, existing…

    Submitted 14 May, 2024; originally announced May 2024.

  27. arXiv:2405.06074  [pdf, other]

    cs.CR cs.NI cs.PL

    Protocols to Code: Formal Verification of a Next-Generation Internet Router

    Authors: João C. Pereira, Tobias Klenze, Sofia Giampietro, Markus Limbeck, Dionysios Spiliopoulos, Felix A. Wolf, Marco Eilers, Christoph Sprenger, David Basin, Peter Müller, Adrian Perrig

    Abstract: We present the first formally-verified Internet router, which is part of the SCION Internet architecture. SCION routers run a cryptographic protocol for secure packet forwarding in an adversarial environment. We verify both the protocol's network-wide security properties and low-level properties of its implementation. More precisely, we develop a series of protocol models by refinement in Isabelle…

    Submitted 9 May, 2024; originally announced May 2024.

  28. arXiv:2404.15770  [pdf, other]

    cs.CV cs.CL cs.LG

    ChEX: Interactive Localization and Region Description in Chest X-rays

    Authors: Philip Müller, Georgios Kaissis, Daniel Rueckert

    Abstract: Report generation models offer fine-grained textual interpretations of medical images like chest X-rays, yet they often lack interactivity (i.e. the ability to steer the generation process through user queries) and localized interpretability (i.e. visually grounding their predictions), which we deem essential for future adoption in clinical practice. While there have been efforts to tackle these i…

    Submitted 15 July, 2024; v1 submitted 24 April, 2024; originally announced April 2024.

    Comments: Accepted at ECCV 2024

  29. arXiv:2404.07622  [pdf, other]

    cs.CV cs.CL

    Language Models Meet Anomaly Detection for Better Interpretability and Generalizability

    Authors: Jun Li, Su Hwan Kim, Philip Müller, Lina Felsner, Daniel Rueckert, Benedikt Wiestler, Julia A. Schnabel, Cosmin I. Bercea

    Abstract: This research explores the integration of language models and unsupervised anomaly detection in medical imaging, addressing two key questions: (1) Can language models enhance the interpretability of anomaly detection maps? and (2) Can anomaly maps improve the generalizability of language models in open-set anomaly detection tasks? To investigate these questions, we introduce a new dataset for mult…

    Submitted 23 July, 2024; v1 submitted 11 April, 2024; originally announced April 2024.

    Comments: 13 pages, 7 figures. 5th International Workshop on Multiscale Multimodal Medical Imaging (MMMI 2024)

  30. arXiv:2404.03614  [pdf, ps, other]

    cs.PL

    Towards Trustworthy Automated Program Verifiers: Formally Validating Translations into an Intermediate Verification Language (extended version)

    Authors: Gaurav Parthasarathy, Thibault Dardinier, Benjamin Bonneau, Peter Müller, Alexander J. Summers

    Abstract: Automated program verifiers are typically implemented using an intermediate verification language (IVL), such as Boogie or Why3. A verifier front-end translates the input program and specification into an IVL program, while the back-end generates proof obligations for the IVL program and employs an SMT solver to discharge them. Soundness of such verifiers therefore requires that the front-end tran…

    Submitted 9 May, 2024; v1 submitted 4 April, 2024; originally announced April 2024.

    Comments: Extended version of PLDI 2024 publication

  31. arXiv:2404.03312  [pdf, other]

    cs.CL cs.SD eess.AS

    M3TCM: Multi-modal Multi-task Context Model for Utterance Classification in Motivational Interviews

    Authors: Sayed Muddashir Hossain, Jan Alexandersson, Philipp Müller

    Abstract: Accurate utterance classification in motivational interviews is crucial to automatically understand the quality and dynamics of client-therapist interaction, and it can serve as a key input for systems mediating such interactions. Motivational interviews exhibit three important characteristics. First, there are two distinct roles, namely client and therapist. Second, they are often highly emotiona…

    Submitted 4 April, 2024; originally announced April 2024.

    Comments: Accepted for publication at LREC-COLING'24

  32. arXiv:2403.18491  [pdf, other]

    cs.SE cs.PL

    Algorithmic Details behind the Predator Shape Analyser

    Authors: Kamil Dudka, Petr Muller, Petr Peringer, Veronika Šoková, Tomáš Vojnar

    Abstract: This chapter, which is an extended and revised version of the conference paper 'Predator: Byte-Precise Verification of Low-Level List Manipulation', concentrates on a detailed description of the algorithms behind the Predator shape analyser based on abstract interpretation and symbolic memory graphs. Predator is particularly suited for formal analysis and verification of sequential non-recursive C…

    Submitted 27 March, 2024; originally announced March 2024.

    Comments: Book chapter preview

  33. Weakly Supervised Object Detection in Chest X-Rays with Differentiable ROI Proposal Networks and Soft ROI Pooling

    Authors: Philip Müller, Felix Meissen, Georgios Kaissis, Daniel Rueckert

    Abstract: Weakly supervised object detection (WSup-OD) increases the usefulness and interpretability of image classification algorithms without requiring additional supervision. The successes of multiple instance learning in this task for natural images, however, do not translate well to medical images due to the very different characteristics of their objects (i.e. pathologies). In this work, we propose We…

    Submitted 19 February, 2024; originally announced February 2024.

  34. ReNeLiB: Real-time Neural Listening Behavior Generation for Socially Interactive Agents

    Authors: Daksitha Withanage Don, Philipp Müller, Fabrizio Nunnari, Elisabeth André, Patrick Gebhard

    Abstract: Flexible and natural nonverbal reactions to human behavior remain a challenge for socially interactive agents (SIAs) that are predominantly animated using hand-crafted rules. While recently proposed machine learning based approaches to conversational behavior generation are a promising way to address this challenge, they have not yet been employed in SIAs. The primary reason for this is the lack o…

    Submitted 12 February, 2024; originally announced February 2024.

    Comments: 8 pages, 11 figures, ICMI conference, project page https://daksitha.github.io/ReNeLib/

  35. arXiv:2401.07066  [pdf, other]

    cs.LG

    Classification of Volatile Organic Compounds by Differential Mobility Spectrometry Based on Continuity of Alpha Curves

    Authors: Anton Rauhameri, Angelo Robiños, Osmo Anttalainen, Timo Salpavaara, Jussi Rantala, Veikko Surakka, Pasi Kallio, Antti Vehkaoja, Philipp Müller

    Abstract: Background: Classification of volatile organic compounds (VOCs) is of interest in many fields. Examples include but are not limited to medicine, detection of explosives, and food quality control. Measurements collected with electronic noses can be used for classification and analysis of VOCs. One type of electronic noses that has seen considerable development in recent years is Differential Mobili…

    Submitted 13 March, 2024; v1 submitted 13 January, 2024; originally announced January 2024.

  36. arXiv:2312.11805  [pdf, other]

    cs.CL cs.AI cs.CV

    Gemini: A Family of Highly Capable Multimodal Models

    Authors: Gemini Team, Rohan Anil, Sebastian Borgeaud, Jean-Baptiste Alayrac, Jiahui Yu, Radu Soricut, Johan Schalkwyk, Andrew M. Dai, Anja Hauth, Katie Millican, David Silver, Melvin Johnson, Ioannis Antonoglou, Julian Schrittwieser, Amelia Glaese, Jilin Chen, Emily Pitler, Timothy Lillicrap, Angeliki Lazaridou, Orhan Firat, James Molloy, Michael Isard, Paul R. Barham, Tom Hennigan, Benjamin Lee, et al. (1326 additional authors not shown)

    Abstract: This report introduces a new family of multimodal models, Gemini, that exhibit remarkable capabilities across image, audio, video, and text understanding. The Gemini family consists of Ultra, Pro, and Nano sizes, suitable for applications ranging from complex reasoning tasks to on-device memory-constrained use-cases. Evaluation on a broad range of benchmarks shows that our most-capable Gemini Ultr…

    Submitted 9 May, 2025; v1 submitted 18 December, 2023; originally announced December 2023.

  37. arXiv:2311.18645  [pdf, other]

    cs.CV cs.AI

    Stochastic Vision Transformers with Wasserstein Distance-Aware Attention

    Authors: Franciskus Xaverius Erick, Mina Rezaei, Johanna Paula Müller, Bernhard Kainz

    Abstract: Self-supervised learning is one of the most promising approaches to acquiring knowledge from limited labeled data. Despite the substantial advancements made in recent years, self-supervised models have posed a challenge to practitioners, as they do not readily provide insight into the model's confidence and uncertainty. Tackling this issue is no simple feat, primarily due to the complexity involve…

    Submitted 30 November, 2023; originally announced November 2023.

  38. arXiv:2311.14452  [pdf, ps, other]

    cs.LO

    Refinement Proofs in Rust Using Ghost Locks

    Authors: Aurel Bílý, João C. Pereira, Jan Schär, Peter Müller

    Abstract: Refinement transforms an abstract system model into a concrete, executable program, such that properties established for the abstract model carry over to the concrete implementation. Refinement has been used successfully in the development of substantial verified systems. Nevertheless, existing refinement techniques have limitations that impede their practical usefulness. Some techniques generate…

    Submitted 24 November, 2023; originally announced November 2023.

    Comments: 21 pages, 3 figures, submitted to PLDI 2024

    MSC Class: 68Q60 ACM Class: F.3.1

  39. Whole Slide Multiple Instance Learning for Predicting Axillary Lymph Node Metastasis

    Authors: Glejdis Shkëmbi, Johanna P. Müller, Zhe Li, Katharina Breininger, Peter Schüffler, Bernhard Kainz

    Abstract: Breast cancer is a major concern for women's health globally, with axillary lymph node (ALN) metastasis identification being critical for prognosis evaluation and treatment guidance. This paper presents a deep learning (DL) classification pipeline for quantifying clinical information from digital core-needle biopsy (CNB) images, with one step less than existing methods. A publicly available datase…

    Submitted 6 October, 2023; originally announced October 2023.

    Comments: Accepted for MICCAI DEMI Workshop 2023

    Journal ref: Data Engineering in Medical Imaging. DEMI 2023. Lecture Notes in Computer Science, vol 14314. Springer, Cham

  40. arXiv:2309.02578  [pdf, other]

    cs.CV cs.LG

    Anatomy-Driven Pathology Detection on Chest X-rays

    Authors: Philip Müller, Felix Meissen, Johannes Brandt, Georgios Kaissis, Daniel Rueckert

    Abstract: Pathology detection and delineation enables the automatic interpretation of medical scans such as chest X-rays while providing a high level of explainability to support radiologists in making informed decisions. However, annotating pathology bounding boxes is a time-consuming task such that large public datasets for this purpose are scarce. Current approaches thus use weakly supervised object dete…

    Submitted 5 September, 2023; originally announced September 2023.

    Comments: Accepted at MICCAI 2023

  41. arXiv:2309.00550  [pdf, other]

    cs.IR

    NeMig -- A Bilingual News Collection and Knowledge Graph about Migration

    Authors: Andreea Iana, Mehwish Alam, Alexander Grote, Nevena Nikolajevic, Katharina Ludwig, Philipp Müller, Christof Weinhardt, Heiko Paulheim

    Abstract: News recommendation plays a critical role in shaping the public's worldviews through the way in which it filters and disseminates information about different topics. Given the crucial impact that media plays in opinion formation, especially for sensitive topics, understanding the effects of personalized recommendation beyond accuracy has become essential in today's digital society. In this work, w…

    Submitted 1 September, 2023; originally announced September 2023.

    Comments: Accepted at the 11th International Workshop on News Recommendation and Analytics (INRA 2023) in conjunction with ACM RecSys 2023

  42. arXiv:2308.15499  [pdf, other]

    cs.CV

    Classification robustness to common optical aberrations

    Authors: Patrick Müller, Alexander Braun, Margret Keuper

    Abstract: Computer vision using deep neural networks (DNNs) has brought about seminal changes in people's lives. Applications range from automotive, face recognition in the security industry, to industrial process monitoring. In some cases, DNNs infer even in safety-critical situations. Therefore, for practical applications, DNNs have to behave in a robust way to disturbances such as noise, pixelation, or b…

    Submitted 29 August, 2023; originally announced August 2023.

    Comments: ICCVW2023

  43. MultiMediate'23: Engagement Estimation and Bodily Behaviour Recognition in Social Interactions

    Authors: Philipp Müller, Michal Balazia, Tobias Baur, Michael Dietz, Alexander Heimerl, Dominik Schiller, Mohammed Guermal, Dominike Thomas, François Brémond, Jan Alexandersson, Elisabeth André, Andreas Bulling

    Abstract: Automatic analysis of human behaviour is a fundamental prerequisite for the creation of machines that can effectively interact with and support humans in social interactions. In MultiMediate'23, we address two key human social behaviour analysis tasks for the first time in a controlled challenge: engagement estimation and bodily behaviour recognition in social interactions. This paper describes t…

    Submitted 16 August, 2023; originally announced August 2023.

    Comments: ACM MultiMedia'23

  44. arXiv:2308.05764  [pdf, other]

    eess.SP cs.AI cs.CV cs.LG

    Unlocking the diagnostic potential of electrocardiograms through information transfer from cardiac magnetic resonance imaging

    Authors: Özgün Turgut, Philip Müller, Paul Hager, Suprosanna Shit, Sophie Starck, Martin J. Menten, Eimo Martens, Daniel Rueckert

    Abstract: Cardiovascular diseases (CVD) can be diagnosed using various diagnostic modalities. The electrocardiogram (ECG) is a cost-effective and widely available diagnostic aid that provides functional information of the heart. However, its ability to classify and spatially localise CVD is limited. In contrast, cardiac magnetic resonance (CMR) imaging provides detailed structural information of the heart a…

    Submitted 7 January, 2025; v1 submitted 9 August, 2023; originally announced August 2023.

  45. arXiv:2307.06614  [pdf, other]

    eess.IV cs.CV

    Interpretable 2D Vision Models for 3D Medical Images

    Authors: Alexander Ziller, Ayhan Can Erdur, Marwa Trigui, Alp Güvenir, Tamara T. Mueller, Philip Müller, Friederike Jungmann, Johannes Brandt, Jan Peeken, Rickmer Braren, Daniel Rueckert, Georgios Kaissis

    Abstract: Training Artificial Intelligence (AI) models on 3D images presents unique challenges compared to the 2D case: Firstly, the demand for computational resources is significantly higher, and secondly, the availability of large datasets for pre-training is often limited, impeding training success. This study proposes a simple approach of adapting 2D networks with an intermediate feature representation…

    Submitted 5 December, 2023; v1 submitted 13 July, 2023; originally announced July 2023.

  46. arXiv:2307.00899  [pdf, other]

    cs.CV

    Many tasks make light work: Learning to localise medical anomalies from multiple synthetic tasks

    Authors: Matthew Baugh, Jeremy Tan, Johanna P. Müller, Mischa Dombrowski, James Batten, Bernhard Kainz

    Abstract: There is a growing interest in single-class modelling and out-of-distribution detection as fully supervised machine learning models cannot reliably identify classes not included in their training. The long tail of infinitely many out-of-distribution classes in real-world scenarios, e.g., for screening, triage, and quality control, means that it is often necessary to train single-class models that…

    Submitted 3 July, 2023; originally announced July 2023.

    Comments: Early accepted to MICCAI 2023

  47. arXiv:2306.09269  [pdf, other]

    cs.CV cs.LG

    Zero-Shot Anomaly Detection with Pre-trained Segmentation Models

    Authors: Matthew Baugh, James Batten, Johanna P. Müller, Bernhard Kainz

    Abstract: This technical report outlines our submission to the zero-shot track of the Visual Anomaly and Novelty Detection (VAND) 2023 Challenge. Building on the performance of the WINCLIP framework, we aim to enhance the system's localization capabilities by integrating zero-shot segmentation models. In addition, we perform foreground instance segmentation which enables the model to focus on the relevant p…

    Submitted 15 June, 2023; originally announced June 2023.

    Comments: Ranked 3rd in zero-shot track of the Visual Anomaly and Novelty Detection (VAND) 2023 Challenge

  48. arXiv:2306.01656  [pdf, other]

    cs.CV cs.HC

    Backchannel Detection and Agreement Estimation from Video with Transformer Networks

    Authors: Ahmed Amer, Chirag Bhuvaneshwara, Gowtham K. Addluri, Mohammed M. Shaik, Vedant Bonde, Philipp Müller

    Abstract: Listeners use short interjections, so-called backchannels, to signify attention or express agreement. The automatic analysis of this behavior is of key importance for human conversation analysis and interactive conversational agents. Current state-of-the-art approaches for backchannel analysis from visual behavior make use of two types of features: features based on body pose and features based on…

    Submitted 2 June, 2023; originally announced June 2023.

    Comments: Accepted at IEEE IJCNN'23

  49. arXiv:2305.04502  [pdf, other]

    cs.LG cs.NE

    MO-DEHB: Evolutionary-based Hyperband for Multi-Objective Optimization

    Authors: Noor Awad, Ayushi Sharma, Philipp Muller, Janek Thomas, Frank Hutter

    Abstract: Hyperparameter optimization (HPO) is a powerful technique for automating the tuning of machine learning (ML) models. However, in many real-world applications, accuracy is only one of multiple performance criteria that must be considered. Optimizing these objectives simultaneously on a complex and diverse search space remains a challenging task. In this paper, we propose MO-DEHB, an effective and f…

    Submitted 11 May, 2023; v1 submitted 8 May, 2023; originally announced May 2023.

  50. Flexible K Nearest Neighbors Classifier: Derivation and Application for Ion-mobility Spectrometry-based Indoor Localization

    Authors: Philipp Müller

    Abstract: The K Nearest Neighbors (KNN) classifier is widely used in many fields such as fingerprint-based localization or medicine. It determines the class membership of an unlabelled sample based on the class memberships of the K labelled samples, the so-called nearest neighbors, that are closest to the unlabelled sample. The choice of K has been the topic of various studies and proposed KNN-variants. Yet no…

    Submitted 13 March, 2024; v1 submitted 20 April, 2023; originally announced April 2023.

    Comments: 11 pages, 3 figures, paper presented at the 2023 International Conference on Indoor Positioning and Indoor Navigation (IPIN)