-
Learning Wavelet-Sparse FDK for 3D Cone-Beam CT Reconstruction
Authors:
Yipeng Sun,
Linda-Sophie Schneider,
Chengze Ye,
Mingxuan Gu,
Siyuan Mei,
Siming Bayer,
Andreas Maier
Abstract:
Cone-Beam Computed Tomography (CBCT) is essential in medical imaging, and the Feldkamp-Davis-Kress (FDK) algorithm is a popular choice for reconstruction due to its efficiency. However, FDK is susceptible to noise and artifacts. While recent deep learning methods offer improved image quality, they often increase computational complexity and lack the interpretability of traditional methods. In this paper, we introduce an enhanced FDK-based neural network that maintains the classical algorithm's interpretability by selectively integrating trainable elements into the cosine weighting and filtering stages. Recognizing the challenge of a large parameter space inherent in 3D CBCT data, we leverage wavelet transformations to create sparse representations of the cosine weights and filters. This strategic sparsification reduces the parameter count by $93.75\%$ without compromising performance, accelerates convergence, and importantly, maintains the inference computational cost equivalent to the classical FDK algorithm. Our method not only ensures volumetric consistency and boosts robustness to noise, but is also designed for straightforward integration into existing CT reconstruction pipelines. This presents a pragmatic enhancement that can benefit clinical applications, particularly in environments with computational limitations.
Submitted 19 May, 2025;
originally announced May 2025.
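The parameter reduction described in the abstract above can be illustrated with a small sketch. It assumes a 1D Haar wavelet and four decomposition levels, neither of which is stated in the abstract; the function names and the toy filter length are hypothetical. Keeping only the coarsest coefficients as trainable values leaves 1/16 of the parameters, which is one way a 93.75% reduction can arise.

```python
import math

def haar_step(x):
    """One level of the orthonormal Haar transform (even-length input)."""
    s = 1.0 / math.sqrt(2.0)
    approx = [(x[i] + x[i + 1]) * s for i in range(0, len(x), 2)]
    detail = [(x[i] - x[i + 1]) * s for i in range(0, len(x), 2)]
    return approx, detail

def haar_inverse_step(approx, detail):
    """Invert one Haar level, doubling the signal length."""
    s = 1.0 / math.sqrt(2.0)
    out = []
    for a, d in zip(approx, detail):
        out.extend([(a + d) * s, (a - d) * s])
    return out

def expand_from_coarse(coarse, levels):
    """Rebuild a dense filter from coarse coefficients, treating all
    detail coefficients as zero (the assumed sparsification)."""
    x = list(coarse)
    for _ in range(levels):
        x = haar_inverse_step(x, [0.0] * len(x))
    return x

levels = 4                                # assumption: 4 levels, so 2**4 = 16x fewer values
trainable = [1.0, 0.5, 0.25, 0.0]         # hypothetical trainable coarse coefficients
dense_filter = expand_from_coarse(trainable, levels)
reduction = 1.0 - len(trainable) / len(dense_filter)
print(len(dense_filter), reduction)
```

Under these assumptions the dense filter could be expanded once after training, so inference would apply an ordinary dense filter, which is consistent with the abstract's statement that inference cost matches classical FDK.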
-
Filter2Noise: Interpretable Self-Supervised Single-Image Denoising for Low-Dose CT with Attention-Guided Bilateral Filtering
Authors:
Yipeng Sun,
Linda-Sophie Schneider,
Mingxuan Gu,
Siyuan Mei,
Chengze Ye,
Fabian Wagner,
Siming Bayer,
Andreas Maier
Abstract:
Effective denoising is crucial in low-dose CT to enhance subtle structures and low-contrast lesions while preventing diagnostic errors. Supervised methods struggle with limited paired datasets, and self-supervised approaches often require multiple noisy images and rely on deep networks like U-Net, offering little insight into the denoising mechanism. To address these challenges, we propose an interpretable self-supervised single-image denoising framework -- Filter2Noise (F2N). Our approach introduces an Attention-Guided Bilateral Filter that adapts to each noisy input through a lightweight module that predicts spatially varying filter parameters, which can be visualized and adjusted post-training for user-controlled denoising in specific regions of interest. To enable single-image training, we introduce a novel downsampling shuffle strategy with a new self-supervised loss function that extends the concept of Noise2Noise to a single image and addresses spatially correlated noise. On the Mayo Clinic 2016 low-dose CT dataset, F2N outperforms the leading self-supervised single-image method (ZS-N2N) by 4.59 dB PSNR while improving transparency, user control, and parametric efficiency. These features provide key advantages for medical applications that require precise and interpretable noise reduction. Our code is available at https://github.com/sypsyp97/Filter2Noise.git .
Submitted 18 April, 2025;
originally announced April 2025.
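To make the idea of spatially varying filter parameters in the abstract above concrete, here is a minimal sketch of a plain bilateral filter whose two sigmas are supplied per pixel. It is not the F2N implementation: in the paper the parameters are predicted by a lightweight module, whereas here the maps `sigma_spatial` and `sigma_range` are hypothetical hand-set inputs, and the function name and window radius are assumptions.

```python
import math

def bilateral_filter(image, sigma_spatial, sigma_range, radius=1):
    """Bilateral filter with per-pixel sigmas.

    image, sigma_spatial, sigma_range: 2D lists of equal shape.
    Each output pixel is a weighted mean of its neighbours, weighted by
    spatial distance and by intensity difference to the centre pixel.
    """
    h, w = len(image), len(image[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            ss = sigma_spatial[y][x]
            sr = sigma_range[y][x]
            total, norm = 0.0, 0.0
            for dy in range(-radius, radius + 1):
                for dx in range(-radius, radius + 1):
                    yy, xx = y + dy, x + dx
                    if 0 <= yy < h and 0 <= xx < w:
                        diff = image[yy][xx] - image[y][x]
                        weight = math.exp(
                            -(dx * dx + dy * dy) / (2.0 * ss * ss)
                            - (diff * diff) / (2.0 * sr * sr)
                        )
                        total += weight * image[yy][xx]
                        norm += weight
            out[y][x] = total / norm  # norm >= 1 because the centre weight is 1
    return out

# Toy image with a sharp vertical edge; a small range sigma preserves it.
image = [[0.0, 0.0, 100.0, 100.0] for _ in range(4)]
sigma_s = [[1.0] * 4 for _ in range(4)]
sigma_r = [[1.0] * 4 for _ in range(4)]
filtered = bilateral_filter(image, sigma_s, sigma_r)
```

Because the sigmas are explicit per-pixel maps, they can be inspected or edited directly, which is the kind of transparency and region-specific control the abstract describes.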
-
Coexistence of topologically trivial and non-trivial Yu-Shiba-Rusinov bands in magnetic atomic chains on a superconductor
Authors:
Bendegúz Nyári,
Philip Beck,
András Lászlóffy,
Lucas Schneider,
Krisztián Palotás,
László Szunyogh,
Roland Wiesendanger,
Jens Wiebe,
Balázs Újfalussy,
Levente Rózsa
Abstract:
Majorana zero modes (MZMs) have been proposed as a promising basis for Majorana qubits offering great potential for topological quantum computation. Such modes may form at the ends of a magnetic atomic chain on a superconductor. Typically only a single MZM may be present at one end of the chain, but symmetry may protect multiple MZMs at the same end. Here, we study the topological properties of Yu-Shiba-Rusinov (YSR) bands of excitations in Mn chains constructed on a Nb(110) and on a Ta(110) substrate using first-principles calculations and scanning tunneling microscopy and spectroscopy experiments. We demonstrate that even and odd YSR states with respect to mirroring on the symmetry plane containing the chain have different dispersions, and both of them may give rise to MZMs separately. Although the spin-orbit coupling leads to a hybridization between the bands, multiple MZMs may still exist due to the mirror symmetry. These findings highlight the influence of symmetries on interpreting the spectroscopic signatures of candidates for MZMs.
Submitted 10 April, 2025;
originally announced April 2025.
-
In silico clinical trials in drug development: a systematic review
Authors:
Bohua Chen,
Lucia Chantal Schneider,
Christian Röver,
Emmanuelle Comets,
Markus Christian Elze,
Andrew Hooker,
Joanna IntHout,
Anne-Sophie Jannot,
Daria Julkowska,
Yanis Mimouni,
Marina Savelieva,
Nigel Stallard,
Moreno Ursino,
Marc Vandemeulebroecke,
Sebastian Weber,
Martin Posch,
Sarah Zohar,
Tim Friede
Abstract:
In the context of clinical research, computational models have received increasing attention over the past decades. In this systematic review, we aimed to provide an overview of the role of so-called in silico clinical trials (ISCTs) in medical applications. Exemplary for the broad field of clinical medicine, we focused on in silico (IS) methods applied in drug development, sometimes also referred to as model informed drug development (MIDD). We searched PubMed and ClinicalTrials.gov for published articles and registered clinical trials related to ISCTs. We identified 202 articles and 48 trials, and of these, 76 articles and 19 trials were directly linked to drug development. We extracted information from all 202 articles and 48 clinical trials and conducted a more detailed review of the methods used in the 76 articles that are connected to drug development. Regarding application, most articles and trials focused on cancer and imaging related research while rare and pediatric diseases were only addressed in 18 and 4 studies, respectively. While some models were informed by combining mechanistic knowledge with clinical or preclinical (in-vivo or in-vitro) data, the majority of models were fully data-driven, illustrating that clinical data is a crucial part of the process of generating synthetic data in ISCTs. Regarding reproducibility, a more detailed analysis revealed that only 24% (18 out of 76) of the articles provided an open-source implementation of the applied models, and in only 20% of the articles the generated synthetic data were publicly available. Despite the widely raised interest, we also found that it is still uncommon for ISCTs to be part of a registered clinical trial and their application is restricted to specific diseases leaving potential benefits of ISCTs not fully exploited.
Submitted 18 March, 2025; v1 submitted 11 March, 2025;
originally announced March 2025.
-
Sleepless Nights, Sugary Days: Creating Synthetic Users with Health Conditions for Realistic Coaching Agent Interactions
Authors:
Taedong Yun,
Eric Yang,
Mustafa Safdari,
Jong Ha Lee,
Vaishnavi Vinod Kumar,
S. Sara Mahdavi,
Jonathan Amar,
Derek Peyton,
Reut Aharony,
Andreas Michaelides,
Logan Schneider,
Isaac Galatzer-Levy,
Yugang Jia,
John Canny,
Arthur Gretton,
Maja Matarić
Abstract:
We present an end-to-end framework for generating synthetic users for evaluating interactive agents designed to encourage positive behavior changes, such as in health and lifestyle coaching. The synthetic users are grounded in health and lifestyle conditions, specifically sleep and diabetes management in this study, to ensure realistic interactions with the health coaching agent. Synthetic users are created in two stages: first, structured data are generated grounded in real-world health and lifestyle factors in addition to basic demographics and behavioral attributes; second, full profiles of the synthetic users are developed conditioned on the structured data. Interactions between synthetic users and the coaching agent are simulated using generative agent-based models such as Concordia, or directly by prompting a language model. Using two independently-developed agents for sleep and diabetes coaching as case studies, the validity of this framework is demonstrated by analyzing the coaching agent's understanding of the synthetic users' needs and challenges. Finally, through multiple blinded evaluations of user-coach interactions by human experts, we demonstrate that our synthetic users with health and behavioral attributes more accurately portray real human users with the same attributes, compared to generic synthetic users not grounded in such attributes. The proposed framework lays the foundation for efficient development of conversational agents through extensive, realistic, and grounded simulated interactions.
Submitted 18 February, 2025;
originally announced February 2025.
-
Compressibility Analysis for the differentiable shift-variant Filtered Backprojection Model
Authors:
Chengze Ye,
Linda-Sophie Schneider,
Yipeng Sun,
Mareike Thies,
Andreas Maier
Abstract:
The differentiable shift-variant filtered backprojection (FBP) model enables the reconstruction of cone-beam computed tomography (CBCT) data for arbitrary non-circular trajectories. This method employs a deep learning technique to estimate the redundancy weights required for reconstruction, given knowledge of the specific trajectory at optimization time. However, computing the redundancy weight for each projection remains computationally intensive. This paper presents a novel approach to compress and optimize the differentiable shift-variant FBP model based on Principal Component Analysis (PCA). We apply PCA to the redundancy weights learned from sinusoidal trajectory projection data, revealing significant parameter redundancy in the original model. By integrating PCA directly into the differentiable shift-variant FBP reconstruction pipeline, we develop a method that decomposes the redundancy weight layer parameters into a trainable eigenvector matrix, compressed weights, and a mean vector. This innovative technique achieves a remarkable 97.25% reduction in trainable parameters without compromising reconstruction accuracy. As a result, our algorithm significantly decreases the complexity of the differentiable shift-variant FBP model and greatly improves training speed. These improvements make the model substantially more practical for real-world applications.
Submitted 20 January, 2025;
originally announced January 2025.
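The decomposition named in the abstract above (eigenvector matrix, compressed weights, mean vector) can be sketched as a low-rank factorisation. This is an illustrative sketch only: the shapes, the rank `k`, and the function names are hypothetical, and the sizes used in the parameter count are not taken from the paper, so the resulting ratio is not expected to reproduce the reported 97.25%.

```python
def reconstruct_weights(compressed, eigenvectors, mean):
    """Rebuild dense weights W (n x d) from a PCA-style factorisation.

    compressed:   n x k list of per-projection coefficients
    eigenvectors: k x d list of principal directions
    mean:         length-d mean vector
    W[r][j] = mean[j] + sum_i compressed[r][i] * eigenvectors[i][j]
    """
    k, d = len(eigenvectors), len(mean)
    return [
        [mean[j] + sum(row[i] * eigenvectors[i][j] for i in range(k)) for j in range(d)]
        for row in compressed
    ]

def parameter_counts(n, d, k):
    """Return (dense, factored) trainable-parameter counts."""
    dense = n * d
    factored = n * k + k * d + d
    return dense, factored

# Tiny worked example: 2 projections, 3 detector values, rank 1.
weights = reconstruct_weights(
    compressed=[[2.0], [0.0]],
    eigenvectors=[[1.0, 0.0, -1.0]],
    mean=[1.0, 1.0, 1.0],
)

# Hypothetical sizes, chosen only to show how the saving scales.
dense, factored = parameter_counts(n=400, d=1000, k=8)
print(weights, dense, factored)
```

The saving grows when the rank `k` is much smaller than both the number of projections and the number of weights per projection.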
-
Hyperband-based Bayesian Optimization for Black-box Prompt Selection
Authors:
Lennart Schneider,
Martin Wistuba,
Aaron Klein,
Jacek Golebiowski,
Giovanni Zappella,
Felice Antonio Merra
Abstract:
Optimal prompt selection is crucial for maximizing large language model (LLM) performance on downstream tasks. As the most powerful models are proprietary and can only be invoked via an API, users often manually refine prompts in a black-box setting by adjusting instructions and few-shot examples until they achieve good performance as measured on a validation set. Recent methods addressing static black-box prompt selection face significant limitations: They often fail to leverage the inherent structure of prompts, treating instructions and few-shot exemplars as a single block of text. Moreover, they often lack query-efficiency by evaluating prompts on all validation instances, or risk sub-optimal selection of a prompt by using random subsets of validation instances. We introduce HbBoPs, a novel Hyperband-based Bayesian optimization method for black-box prompt selection addressing these key limitations. Our approach combines a structural-aware deep kernel Gaussian Process to model prompt performance with Hyperband as a multi-fidelity scheduler to select the number of validation instances for prompt evaluations. The structural-aware modeling approach utilizes separate embeddings for instructions and few-shot exemplars, enhancing the surrogate model's ability to capture prompt performance and predict which prompt to evaluate next in a sample-efficient manner. Together with Hyperband as a multi-fidelity scheduler, we further enable query-efficiency by adaptively allocating resources across different fidelity levels, keeping the total number of validation instances on which prompts are evaluated low. Extensive evaluation across ten benchmarks and three LLMs demonstrates that HbBoPs outperforms state-of-the-art methods.
Submitted 10 December, 2024;
originally announced December 2024.
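The multi-fidelity scheduling idea in the abstract above can be sketched with successive halving, the routine that Hyperband runs repeatedly with different starting budgets. This is not HbBoPs: there is no Gaussian Process surrogate here, and the prompt names, the 0/1 validation scores, and the function names are all hypothetical toy data. The "budget" is the number of validation instances each prompt is scored on.

```python
def successive_halving(candidates, evaluate, min_budget, max_budget, eta=2):
    """Score all candidates on a small budget, keep the top 1/eta,
    multiply the budget by eta, and repeat.
    Returns (best_candidate, total_instance_evaluations)."""
    survivors = list(candidates)
    budget = min_budget
    total_evals = 0
    while True:
        scored = [(evaluate(c, budget), c) for c in survivors]
        total_evals += budget * len(survivors)
        scored.sort(key=lambda pair: pair[0], reverse=True)
        if len(survivors) == 1 or budget >= max_budget:
            return scored[0][1], total_evals
        keep = max(1, len(survivors) // eta)
        survivors = [c for _, c in scored[:keep]]
        budget = min(max_budget, budget * eta)

# Hypothetical per-instance correctness (1 = correct) on 8 validation instances.
validation_scores = {
    "p0": [1, 1, 1, 1, 1, 1, 1, 0],
    "p1": [1, 0, 1, 0, 1, 0, 1, 0],
    "p2": [0, 0, 0, 1, 0, 0, 0, 0],
    "p3": [1, 1, 0, 1, 1, 0, 1, 1],
}

def evaluate(prompt, n_instances):
    """Mean accuracy of a prompt on the first n_instances validation items."""
    scores = validation_scores[prompt][:n_instances]
    return sum(scores) / len(scores)

best, used = successive_halving(list(validation_scores), evaluate, min_budget=2, max_budget=8)
full_cost = len(validation_scores) * 8  # cost of scoring every prompt on every instance
print(best, used, full_cost)
```

In this toy run, weak prompts are discarded after only a few instances, so fewer instance-level evaluations are spent than when every prompt is scored on the full validation set.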
-
Magnetically-controlled Vortex Dynamics in a Ferromagnetic Superconductor
Authors:
Joseph Alec Wilcox,
Lukas Schneider,
Estefani Marchiori,
Vadim Plastovets,
Alexandre Buzdin,
Pardis Sahafi,
Andrew Jordan,
Raffi Budakian,
Tong Ren,
Ivan Veschunov,
Tsuyoshi Tamegai,
Sven Friedemann,
Martino Poggio,
Simon John Bending
Abstract:
Ferromagnetic superconductors are exceptionally rare because the strong ferromagnetic exchange field usually destroys singlet superconductivity. EuFe$_2$(As$_{1-x}$P$_x$)$_2$, an iron-based superconductor with a maximum critical temperature of 25 K, uniquely exhibits full coexistence with ferromagnetic order below $T_\mathrm{FM}$ $\simeq$ $19$ K. The interplay leads to narrowing of ferromagnetic domains at higher temperatures and spontaneous nucleation of vortices/antivortices at lower temperatures. Here we demonstrate how the underlying magnetic structure controls the superconducting vortex dynamics in applied magnetic fields. Just below $T_\mathrm{FM}$ we observe a pronounced peak in the creep activation energy, and magnetic force microscopy measurements reveal the presence of very closely-spaced ($w\ll \lambda$) vortex clusters. We attribute these observations to the formation of vortex polarons for which we present a theoretical description. In contrast, we link strong magnetic irreversibility at low temperatures to a critical current governed by giant flux creep over an activation barrier for vortex-antivortex annihilation near domain walls. Our work suggests new routes for the magnetic enhancement of vortex pinning with important applications in high-current conductors.
Submitted 29 April, 2025; v1 submitted 5 December, 2024;
originally announced December 2024.
-
Electricity Price Prediction Using Multi-Kernel Gaussian Process Regression Combined with Kernel-Based Support Vector Regression
Authors:
Abhinav Das,
Stephan Schlüter,
Lorenz Schneider
Abstract:
This paper presents a new hybrid model for predicting German electricity prices. The algorithm is based on combining Gaussian Process Regression (GPR) and Support Vector Regression (SVR). While GPR is a competent model for learning the stochastic pattern within the data and interpolation, its performance for out-of-sample data is not very promising. By choosing a suitable data-dependent covariance function, we can enhance the performance of GPR for the tested German hourly power prices. However, since the out-of-sample prediction depends on the training data, the prediction is vulnerable to noise and outliers. To overcome this issue, a separate prediction is made using SVR, which applies margin-based optimization, having an advantage in dealing with non-linear processes and outliers, since only certain necessary points (support vectors) in the training data are responsible for regression. Both individual predictions are later combined using the performance-based weight assignment method. A test on historic German power prices shows that this approach outperforms its chosen benchmarks such as the autoregressive exogenous model, the naive approach, as well as the long short-term memory approach of prediction.
Submitted 14 January, 2025; v1 submitted 28 November, 2024;
originally announced December 2024.
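The abstract above combines the GPR and SVR forecasts with a performance-based weight assignment but does not spell out the rule. The sketch below assumes one common choice, weights proportional to the inverse of each model's validation error; the function names, error values, and price forecasts are hypothetical, and the paper's actual weighting may differ.

```python
def inverse_error_weights(errors):
    """Weights proportional to 1/error, normalised to sum to 1.
    Assumes every error is strictly positive."""
    inverse = [1.0 / e for e in errors]
    total = sum(inverse)
    return [v / total for v in inverse]

def combine_forecasts(forecasts, weights):
    """Weighted average of several models' forecasts, hour by hour."""
    horizon = len(forecasts[0])
    return [
        sum(w * f[t] for w, f in zip(weights, forecasts))
        for t in range(horizon)
    ]

# Hypothetical validation errors (e.g. RMSE) for GPR and SVR.
weights = inverse_error_weights([2.0, 6.0])

# Hypothetical hourly price forecasts from each model.
gpr_forecast = [40.0, 44.0, 48.0]
svr_forecast = [36.0, 40.0, 52.0]
combined = combine_forecasts([gpr_forecast, svr_forecast], weights)
print(weights, combined)
```

With this rule the model that performed better on held-out data dominates the combination, while the weaker model still contributes.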
-
Non-local detection of coherent Yu-Shiba-Rusinov quantum projections
Authors:
Khai Ton That,
Chang Xu,
Ioannis Ioannidis,
Lucas Schneider,
Thore Posske,
Roland Wiesendanger,
Dirk K. Morr,
Jens Wiebe
Abstract:
Probing spatially confined quantum states from afar - a long-sought goal to minimize external interference - has been proposed to be achievable in condensed matter systems via coherent projection. The latter can be tailored by sculpting the eigenstates of the electron sea that surrounds the quantum state using atom-by-atom built cages, so-called quantum corrals. However, assuring the coherent nature of the projection, and manipulating its quantum composition, has remained an elusive goal. Here, we experimentally realize the coherent projection of a magnetic impurity-induced, Yu-Shiba-Rusinov quantum state using the eigenmodes of corrals on the surface of a superconductor, which enables us to manipulate the particle-hole composition of the projected state by tuning corral eigenmodes through the Fermi energy. Our results demonstrate a controlled non-local method for the detection of magnet-superconductor hybrid quantum states.
Submitted 21 October, 2024;
originally announced October 2024.
-
DRACO: Differentiable Reconstruction for Arbitrary CBCT Orbits
Authors:
Chengze Ye,
Linda-Sophie Schneider,
Yipeng Sun,
Mareike Thies,
Siyuan Mei,
Andreas Maier
Abstract:
This paper introduces a novel method for reconstructing cone beam computed tomography (CBCT) images for arbitrary orbits using a differentiable shift-variant filtered backprojection (FBP) neural network. Traditional CBCT reconstruction methods for arbitrary orbits, like iterative reconstruction algorithms, are computationally expensive and memory-intensive. The proposed method addresses these challenges by employing a shift-variant FBP algorithm optimized for arbitrary trajectories through a deep learning approach that adapts to a specific orbit geometry. This approach overcomes the limitations of existing techniques by integrating known operators into the learning model, minimizing the number of parameters, and improving the interpretability of the model. The proposed method is a significant advancement in interventional medical imaging, particularly for robotic C-arm CT systems, enabling faster and more accurate CBCT reconstructions with customized orbits. In particular, this method can also be used for the analytical reconstruction of non-continuous orbits like circular plus arc. The experimental results demonstrate that the proposed method significantly accelerates the reconstruction process compared to conventional iterative algorithms. It achieves comparable or superior image quality, as evidenced by metrics such as the mean squared error (MSE), the peak signal-to-noise ratio (PSNR), and the structural similarity index measure (SSIM). The validation experiments show that the method can handle data from different trajectories, demonstrating its flexibility and robustness across different scan geometries. Our method demonstrates a significant improvement, particularly for the sinusoidal trajectory, achieving a 38.6% reduction in MSE, a 7.7% increase in PSNR, and a 5.0% improvement in SSIM. Furthermore, the computation time for reconstruction was reduced by more than 97%.
Submitted 18 October, 2024;
originally announced October 2024.
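Two of the image-quality metrics named in the abstract above, MSE and PSNR, follow directly from their standard definitions and can be sketched in a few lines. The function names, the flattened-list image representation, and the `data_range` value are assumptions for illustration; SSIM is omitted because it needs windowed local statistics.

```python
import math

def mse(reference, estimate):
    """Mean squared error between two equally sized flattened images."""
    return sum((r - e) ** 2 for r, e in zip(reference, estimate)) / len(reference)

def psnr(reference, estimate, data_range):
    """Peak signal-to-noise ratio in dB: 10 * log10(data_range**2 / MSE).
    data_range is the span of valid intensities (an input assumption).
    Returns infinity when the images are identical."""
    error = mse(reference, estimate)
    if error == 0.0:
        return float("inf")
    return 10.0 * math.log10(data_range ** 2 / error)

def relative_change(baseline, value):
    """Signed relative change, e.g. -0.386 for a 38.6% reduction."""
    return (value - baseline) / baseline

# Hypothetical flattened ground truth and reconstruction.
truth = [0.0, 0.0, 10.0, 10.0]
recon = [1.0, -1.0, 9.0, 11.0]
print(mse(truth, recon), psnr(truth, recon, data_range=10.0))
```

Percentage improvements such as those quoted in the abstract are relative changes of these metrics against a baseline reconstruction, which `relative_change` computes.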
-
Modelling of measuring systems -- From white box models to cognitive approaches
Authors:
Nadine Schiering,
Sascha Eichstaedt,
Michael Heizmann,
Wolfgang Koch,
Linda-Sophie Schneider,
Stephan Scheele,
Klaus-Dieter Sommer
Abstract:
Mathematical models of measuring systems and processes play an essential role in metrology and practical measurements. They form the basis for understanding and evaluating measurements, their results and their trustworthiness. Classic analytical parametric modelling is based on largely complete knowledge of measurement technology and the measurement process. But due to digital transformation towards the Industrial Internet of Things (IIoT) with an increasing number of intensively and flexibly networked measurement systems and consequently ever larger amounts of data to be processed, data-based modelling approaches have gained enormous importance. This has led to new approaches in measurement technology and industry like Digital Twins, Self-X Approaches, Soft Sensor Technology and Data and Information Fusion. In the future, data-based modelling will be increasingly dominated by intelligent, cognitive systems. Evaluation of the accuracy, trustworthiness and functional uncertainty of the corresponding models is required.
This paper provides a concise overview of modelling in metrology from classical white box models to intelligent, cognitive data-driven solutions identifying advantages and limitations. Additionally, the approaches to merge trustworthiness and metrological uncertainty will be discussed.
Submitted 12 August, 2024;
originally announced August 2024.
-
The infrastructure powering IBM's Gen AI model development
Authors:
Talia Gershon,
Seetharami Seelam,
Brian Belgodere,
Milton Bonilla,
Lan Hoang,
Danny Barnett,
I-Hsin Chung,
Apoorve Mohan,
Ming-Hung Chen,
Lixiang Luo,
Robert Walkup,
Constantinos Evangelinos,
Shweta Salaria,
Marc Dombrowa,
Yoonho Park,
Apo Kayi,
Liran Schour,
Alim Alim,
Ali Sydney,
Pavlos Maniotis,
Laurent Schares,
Bernard Metzler,
Bengi Karacali-Akyamac,
Sophia Wen,
Tatsuhiro Chiba
, et al. (122 additional authors not shown)
Abstract:
AI Infrastructure plays a key role in the speed and cost-competitiveness of developing and deploying advanced AI models. The current demand for powerful AI infrastructure for model training is driven by the emergence of generative AI and foundational models, where on occasion thousands of GPUs must cooperate on a single training job for the model to be trained in a reasonable time. Delivering efficient and high-performing AI training requires an end-to-end solution that combines hardware, software and holistic telemetry to cater for multiple types of AI workloads. In this report, we describe IBM's hybrid cloud infrastructure that powers our generative AI model development. This infrastructure includes (1) Vela: an AI-optimized supercomputing capability directly integrated into the IBM Cloud, delivering scalable, dynamic, multi-tenant and geographically distributed infrastructure for large-scale model training and other AI workflow steps and (2) Blue Vela: a large-scale, purpose-built, on-premises hosting environment that is optimized to support our largest and most ambitious AI model training tasks. Vela provides IBM with the dual benefit of high performance for internal use along with the flexibility to adapt to an evolving commercial landscape. Blue Vela provides us with the benefits of rapid development of our largest and most ambitious models, as well as future-proofing against the evolving model landscape in the industry. Taken together, they provide IBM with the ability to rapidly innovate in the development of both AI models and commercial offerings.
Submitted 13 January, 2025; v1 submitted 7 July, 2024;
originally announced July 2024.
-
Imaging magnetic spiral phases, skyrmion clusters, and skyrmion displacements at the surface of bulk Cu$_2$OSeO$_3$
Authors:
E. Marchiori,
G. Romagnoli,
L. Schneider,
B. Gross,
P. Sahafi,
A. Jordan,
R. Budakian,
P. R. Baral,
A. Magrez,
J. S. White,
M. Poggio
Abstract:
Surfaces -- by breaking bulk symmetries, introducing roughness, or hosting defects -- can significantly influence magnetic order in magnetic materials. Determining their effect on the complex nanometer-scale phases present in certain non-centrosymmetric magnets is an outstanding problem requiring high-resolution magnetic microscopy. Here, we use scanning SQUID-on-tip microscopy to image the surfac…
▽ More
Surfaces -- by breaking bulk symmetries, introducing roughness, or hosting defects -- can significantly influence magnetic order in magnetic materials. Determining their effect on the complex nanometer-scale phases present in certain non-centrosymmetric magnets is an outstanding problem requiring high-resolution magnetic microscopy. Here, we use scanning SQUID-on-tip microscopy to image the surface of bulk Cu$_2$OSeO$_3$ at low temperature and in a magnetic field applied along $\left\langle100\right\rangle$. Real-space maps measured as a function of applied field reveal the microscopic structure of the magnetic phases and their transitions. In low applied field, we observe a magnetic texture consistent with an in-plane stripe phase, pointing to the existence of a distinct surface state. In the low-temperature skyrmion phase, the surface is populated by clusters of disordered skyrmions, which emerge from rupturing domains of the tilted spiral phase. Furthermore, we displace individual skyrmions from their pinning sites by applying an electric potential to the scanning probe, thereby demonstrating local skyrmion control at the surface of a magnetoelectric insulator.
Submitted 6 July, 2024;
originally announced July 2024.
-
Data-driven Modeling in Metrology -- A Short Introduction, Current Developments and Future Perspectives
Authors:
Linda-Sophie Schneider,
Patrick Krauss,
Nadine Schiering,
Christopher Syben,
Richard Schielein,
Andreas Maier
Abstract:
Mathematical models are vital to the field of metrology, playing a key role in the derivation of measurement results and the calculation of uncertainties from measurement data, informed by an understanding of the measurement process. These models generally represent the correlation between the quantity being measured and all other pertinent quantities. Such relationships are used to construct measurement systems that can interpret measurement data to generate conclusions and predictions about the measurement system itself. Classic models are typically analytical, built on fundamental physical principles. However, the rise of digital technology, expansive sensor networks, and high-performance computing hardware has led to a growing shift towards data-driven methodologies. This trend is especially prominent when dealing with large, intricate networked sensor systems in situations where there is limited expert understanding of the frequently changing real-world contexts. Here, we demonstrate the variety of opportunities that data-driven modeling presents, and how it has already been implemented in various real-world applications.
Submitted 24 June, 2024;
originally announced June 2024.
-
Towards a Personal Health Large Language Model
Authors:
Justin Cosentino,
Anastasiya Belyaeva,
Xin Liu,
Nicholas A. Furlotte,
Zhun Yang,
Chace Lee,
Erik Schenck,
Yojan Patel,
Jian Cui,
Logan Douglas Schneider,
Robby Bryant,
Ryan G. Gomes,
Allen Jiang,
Roy Lee,
Yun Liu,
Javier Perez,
Jameson K. Rogers,
Cathy Speed,
Shyam Tailor,
Megan Walker,
Jeffrey Yu,
Tim Althoff,
Conor Heneghan,
John Hernandez,
Mark Malhotra
, et al. (9 additional authors not shown)
Abstract:
In health, most large language model (LLM) research has focused on clinical tasks. However, mobile and wearable devices, which are rarely integrated into such tasks, provide rich, longitudinal data for personal health monitoring. Here we present Personal Health Large Language Model (PH-LLM), fine-tuned from Gemini for understanding and reasoning over numerical time-series personal health data. We created and curated three datasets that test 1) production of personalized insights and recommendations from sleep patterns, physical activity, and physiological responses, 2) expert domain knowledge, and 3) prediction of self-reported sleep outcomes. For the first task we designed 857 case studies in collaboration with domain experts to assess real-world scenarios in sleep and fitness. Through comprehensive evaluation of domain-specific rubrics, we observed that Gemini Ultra 1.0 and PH-LLM are not statistically different from expert performance in fitness and, while experts remain superior for sleep, fine-tuning PH-LLM provided significant improvements in using relevant domain knowledge and personalizing information for sleep insights. We evaluated PH-LLM domain knowledge using multiple choice sleep medicine and fitness examinations. PH-LLM achieved 79% on sleep and 88% on fitness, exceeding average scores from a sample of human experts. Finally, we trained PH-LLM to predict self-reported sleep quality outcomes from textual and multimodal encoding representations of wearable data, and demonstrate that multimodal encoding is required to match performance of specialized discriminative models. Although further development and evaluation are necessary in the safety-critical personal health domain, these results demonstrate both the broad knowledge and capabilities of Gemini models and the benefit of contextualizing physiological data for personal health applications as done with PH-LLM.
Submitted 10 June, 2024;
originally announced June 2024.
-
DualAD: Disentangling the Dynamic and Static World for End-to-End Driving
Authors:
Simon Doll,
Niklas Hanselmann,
Lukas Schneider,
Richard Schulz,
Marius Cordts,
Markus Enzweiler,
Hendrik P. A. Lensch
Abstract:
State-of-the-art approaches for autonomous driving integrate multiple sub-tasks of the overall driving task into a single pipeline that can be trained in an end-to-end fashion by passing latent representations between the different modules. In contrast to previous approaches that rely on a unified grid to represent the belief state of the scene, we propose dedicated representations to disentangle dynamic agents and static scene elements. This allows us to explicitly compensate for the effect of both ego and object motion between consecutive time steps and to flexibly propagate the belief state through time. Furthermore, dynamic objects can not only attend to the input camera images, but also directly benefit from the inferred static scene structure via a novel dynamic-static cross-attention. Extensive experiments on the challenging nuScenes benchmark demonstrate the benefits of the proposed dual-stream design, especially for modelling highly dynamic agents in the scene, and highlight the improved temporal consistency of our approach. Our method titled DualAD not only outperforms independently trained single-task networks, but also improves over previous state-of-the-art end-to-end models by a large margin on all tasks along the functional chain of driving.
Submitted 10 June, 2024;
originally announced June 2024.
-
Reshuffling Resampling Splits Can Improve Generalization of Hyperparameter Optimization
Authors:
Thomas Nagler,
Lennart Schneider,
Bernd Bischl,
Matthias Feurer
Abstract:
Hyperparameter optimization is crucial for obtaining peak performance of machine learning models. The standard protocol evaluates various hyperparameter configurations using a resampling estimate of the generalization error to guide optimization and select a final hyperparameter configuration. Without much evidence, paired resampling splits, i.e., either a fixed train-validation split or a fixed cross-validation scheme, are often recommended. We show that, surprisingly, reshuffling the splits for every configuration often improves the final model's generalization performance on unseen data. Our theoretical analysis explains how reshuffling affects the asymptotic behavior of the validation loss surface and provides a bound on the expected regret in the limiting regime. This bound connects the potential benefits of reshuffling to the signal and noise characteristics of the underlying optimization problem. We confirm our theoretical results in a controlled simulation study and demonstrate the practical usefulness of reshuffling in a large-scale, realistic hyperparameter optimization experiment. While reshuffling leads to test performances that are competitive with using fixed splits, it drastically improves results for a single train-validation holdout protocol and can often make holdout become competitive with standard CV while being computationally cheaper.
Submitted 7 November, 2024; v1 submitted 24 May, 2024;
originally announced May 2024.
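A minimal sketch of the contrast this abstract draws between a fixed train-validation split and a split that is reshuffled for every hyperparameter configuration. It is an illustration under stated assumptions, not the authors' experimental setup: the ridge-regression model, the synthetic data, and the helper names `holdout_split`, `ridge_val_mse`, and `select_lambda` are all hypothetical.

```python
import numpy as np

def holdout_split(n, val_fraction, rng):
    """Return (train_idx, val_idx) for one random holdout split."""
    perm = rng.permutation(n)
    n_val = int(round(val_fraction * n))
    return perm[n_val:], perm[:n_val]

def ridge_val_mse(X, y, lam, train_idx, val_idx):
    """Fit closed-form ridge regression on the training part, score MSE on the validation part."""
    Xt, yt = X[train_idx], y[train_idx]
    d = X.shape[1]
    w = np.linalg.solve(Xt.T @ Xt + lam * np.eye(d), Xt.T @ yt)
    resid = X[val_idx] @ w - y[val_idx]
    return float(np.mean(resid ** 2))

def select_lambda(X, y, lambdas, reshuffle, seed=0, val_fraction=0.2):
    """Pick the lambda with the lowest validation MSE.

    reshuffle=False: every configuration is scored on the same split.
    reshuffle=True:  a fresh split is drawn for every configuration.
    """
    rng = np.random.default_rng(seed)
    fixed = holdout_split(len(y), val_fraction, rng)
    scores, splits = [], []
    for lam in lambdas:
        split = holdout_split(len(y), val_fraction, rng) if reshuffle else fixed
        scores.append(ridge_val_mse(X, y, lam, *split))
        splits.append(split)
    best = lambdas[int(np.argmin(scores))]
    return best, scores, splits

# Synthetic regression problem (assumed purely for illustration).
data_rng = np.random.default_rng(42)
X = data_rng.normal(size=(200, 5))
y = X @ np.array([1.0, -2.0, 0.5, 0.0, 0.0]) + data_rng.normal(scale=0.5, size=200)
lambdas = [0.01, 0.1, 1.0, 10.0]

best_fixed, scores_fixed, splits_fixed = select_lambda(X, y, lambdas, reshuffle=False)
best_resh, scores_resh, splits_resh = select_lambda(X, y, lambdas, reshuffle=True)
```

The only difference between the two protocols is where the split is drawn: once before the loop, or once per configuration inside it. Whether reshuffling helps in a given problem depends on the signal and noise characteristics the abstract refers to; this sketch does not attempt to reproduce that analysis.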
-
Application of Gated Recurrent Units for CT Trajectory Optimization
Authors:
Yuedong Yuan,
Linda-Sophie Schneider,
Andreas Maier
Abstract:
Recent advances in computed tomography (CT) imaging, especially with dual-robot systems, have introduced new challenges for scan trajectory optimization. This paper presents a novel approach using Gated Recurrent Units (GRUs) to optimize CT scan trajectories. Our approach exploits the flexibility of robotic CT systems to select projections that enhance image quality by improving resolution and contrast while reducing scan time. We focus on cone-beam CT and employ several projection-based metrics, including absorption, pixel intensities, contrast-to-noise ratio, and data completeness. The GRU network aims to minimize data redundancy and maximize completeness with a limited number of projections. We validate our method using simulated data of a test specimen, focusing on a specific voxel of interest. The results show that the GRU-optimized scan trajectories can outperform traditional circular CT trajectories in terms of image quality metrics. For the used specimen, SSIM improves from 0.38 to 0.49 and CNR increases from 6.97 to 9.08. This finding suggests that the application of GRU in CT scan trajectory optimization can lead to more efficient, cost-effective, and high-quality imaging solutions.
Submitted 15 May, 2024;
originally announced May 2024.
-
ADA-Track++: End-to-End Multi-Camera 3D Multi-Object Tracking with Alternating Detection and Association
Authors:
Shuxiao Ding,
Lukas Schneider,
Marius Cordts,
Juergen Gall
Abstract:
Many query-based approaches for 3D Multi-Object Tracking (MOT) adopt the tracking-by-attention paradigm, utilizing track queries for identity-consistent detection and object queries for identity-agnostic track spawning. Tracking-by-attention, however, entangles detection and tracking queries in one embedding for both the detection and tracking task, which is sub-optimal. Other approaches resemble the tracking-by-detection paradigm and detect objects using decoupled track and detection queries followed by a subsequent association. These methods, however, do not leverage synergies between the detection and association task. Combining the strengths of both paradigms, we introduce ADA-Track++, a novel end-to-end framework for 3D MOT from multi-view cameras. We introduce a learnable data association module based on edge-augmented cross-attention, leveraging appearance and geometric features. We also propose an auxiliary token in this attention-based association module, which helps mitigate disproportionately high attention to incorrect association targets caused by attention normalization. Furthermore, we integrate this association module into the decoder layer of a DETR-based 3D detector, enabling simultaneous DETR-like query-to-image cross-attention for detection and query-to-query cross-attention for data association. By stacking these decoder layers, queries are refined for the detection and association task alternately, effectively harnessing the task dependencies. We evaluate our method on the nuScenes dataset and demonstrate the advantage of our approach compared to the two previous paradigms.
Submitted 13 December, 2024; v1 submitted 14 May, 2024;
originally announced May 2024.
-
Self-training superconducting neuromorphic circuits using reinforcement learning rules
Authors:
M. L. Schneider,
E. M. Jué,
M. R. Pufall,
K. Segall,
C. W. Anderson
Abstract:
Reinforcement learning algorithms are used in a wide range of applications, from gaming and robotics to autonomous vehicles. In this paper we describe a set of reinforcement learning-based local weight update rules and their implementation in superconducting hardware. Using SPICE circuit simulations, we implement a small-scale neural network with a learning time of order one nanosecond. This network can be trained to learn new functions simply by changing the target output for a given set of inputs, without the need for any external adjustments to the network. In this implementation the weights are adjusted based on the current state of the overall network response and locally stored information about the previous action. This removes the need to program explicit weight values in these networks, which is one of the primary challenges that analog hardware implementations of neural networks face. The adjustment of weights is based on a global reinforcement signal that obviates the need for circuitry to back-propagate errors.
Submitted 29 April, 2024;
originally announced April 2024.
-
EAGLE: An Edge-Aware Gradient Localization Enhanced Loss for CT Image Reconstruction
Authors:
Yipeng Sun,
Yixing Huang,
Linda-Sophie Schneider,
Mareike Thies,
Mingxuan Gu,
Siyuan Mei,
Siming Bayer,
Andreas Maier
Abstract:
Computed Tomography (CT) image reconstruction is crucial for accurate diagnosis, and deep learning approaches have demonstrated significant potential in improving reconstruction quality. However, the choice of loss function profoundly affects the reconstructed images. Traditional mean squared error loss often produces blurry images lacking fine details, while alternatives designed to improve on it may introduce structural artifacts or other undesirable effects. To address these limitations, we propose Eagle-Loss, a novel loss function designed to enhance the visual quality of CT image reconstructions. Eagle-Loss applies spectral analysis of localized features within gradient changes to enhance sharpness and well-defined edges. We evaluated Eagle-Loss on two public datasets across low-dose CT reconstruction and CT field-of-view extension tasks. Our results show that Eagle-Loss consistently improves the visual quality of reconstructed images, surpassing state-of-the-art methods across various network architectures. Code and data are available at https://github.com/sypsyp97/Eagle_Loss.
Submitted 15 March, 2024;
originally announced March 2024.
-
The variability patterns of the TeV blazar PG 1553+113 from a decade of MAGIC and multi-band observations
Authors:
MAGIC Collaboration,
H. Abe,
S. Abe,
J. Abhir,
V. A. Acciari,
I. Agudo,
T. Aniello,
S. Ansoldi,
L. A. Antonelli,
A. Arbet Engels,
C. Arcaro,
M. Artero,
K. Asano,
D. Baack,
A. Babić,
A. Baquero,
U. Barres de Almeida,
I. Batković,
J. Baxter,
J. Becerra González,
E. Bernardini,
J. Bernete,
A. Berti,
J. Besenrieder,
C. Bigongiari
, et al. (242 additional authors not shown)
Abstract:
PG 1553+113 is one of the few blazars with a convincing quasi-periodic emission in the gamma-ray band. The source is also a very high-energy (VHE; >100 GeV) gamma-ray emitter. To better understand its properties and identify the underlying physical processes driving its variability, the MAGIC Collaboration initiated a multiyear, multiwavelength monitoring campaign in 2015 involving the OVRO 40-m and Medicina radio telescopes, REM, KVA, and the MAGIC telescopes, Swift and Fermi satellites, and the WEBT network. The analysis presented in this paper uses data until 2017 and focuses on the characterization of the variability. The gamma-ray data show a (hint of a) periodic signal compatible with literature, but the X-ray and VHE gamma-ray data do not show statistical evidence for a periodic signal. In other bands, the data are compatible with the gamma-ray period, but with a relatively high p-value. The complex connection between the low and high-energy emission and the non-monochromatic modulation and changes in flux suggests that a simple one-zone model is unable to explain all the variability. Instead, a model including a periodic component along with multiple emission zones is required.
Submitted 4 March, 2024;
originally announced March 2024.
-
Deep Learning Computed Tomography based on the Defrise and Clack Algorithm
Authors:
Chengze Ye,
Linda-Sophie Schneider,
Yipeng Sun,
Andreas Maier
Abstract:
This study presents a novel approach for reconstructing cone beam computed tomography (CBCT) for specific orbits using known operator learning. Unlike traditional methods, this technique employs a filtered backprojection type (FBP-type) algorithm, which integrates a unique, adaptive filtering process. This process involves a series of operations, including weightings, differentiations, the 2D Radon transform, and backprojection. The filter is designed for a specific orbit geometry and is obtained using a data-driven approach based on deep learning. The approach efficiently learns and optimizes the orbit-related component of the filter. The method has demonstrated its ability through experimentation by successfully learning parameters from circular orbit projection data. Subsequently, the optimized parameters are used to reconstruct images, resulting in outcomes that closely resemble the analytical solution. This demonstrates the potential of the method to learn appropriate parameters from any specific orbit projection data and achieve reconstruction. The algorithm has demonstrated improvement, particularly in enhancing reconstruction speed and reducing memory usage for handling specific orbit reconstruction.
Submitted 1 March, 2024;
originally announced March 2024.
-
Integer Optimization of CT Trajectories using a Discrete Data Completeness Formulation
Authors:
Linda-Sophie Schneider,
Gabriel Herl,
Andreas Maier
Abstract:
X-ray computed tomography (CT) plays a key role in digitizing three-dimensional structures for a wide range of medical and industrial applications. Traditional CT systems often rely on standard circular and helical scan trajectories, which may not be optimal for challenging scenarios involving large objects, complex structures, or resource constraints. In response to these challenges, we are exploring the potential of twin robotic CT systems, which offer the flexibility to acquire projections from arbitrary views around the object of interest. Ensuring complete and mathematically sound reconstructions becomes critical in such systems. In this work, we present an integer programming-based CT trajectory optimization method. Utilizing discrete data completeness conditions, we formulate an optimization problem to select an optimized set of projections. This approach enforces data completeness and considers absorption-based metrics for reliability evaluation. We compare our method with an equidistant circular CT trajectory and a greedy approach. While greedy already performs well in some cases, we provide a way to improve greedy-based projection selection using an integer optimization approach. Our approach improves CT trajectories and quantifies the optimality of the solution in terms of an optimality gap.
Submitted 29 January, 2024;
originally announced February 2024.
-
High-resolution spectroscopy of proximity superconductivity in finite-size quantized surface states
Authors:
Lucas Schneider,
Christian von Bredow,
Howon Kim,
Khai That Ton,
Torben Hänke,
Jens Wiebe,
Roland Wiesendanger
Abstract:
Adding superconducting (SC) electron pairing via the proximity effect to pristinely non-superconducting materials can lead to a variety of interesting physical phenomena. Particular interest has recently focused on inducing SC into two-dimensional surface states (SSs), potentially also combined with non-trivial topology. We study the mechanism of proximity-induced SC into the Shockley-type SSs of the noble metals Ag(111) and Cu(111) grown on the elemental SC Nb(110) using scanning tunneling spectroscopy. The tunneling spectra exhibit an intriguing multitude of sharp states at low energies. Their appearance can be explained by Andreev bound states (ABS) formed by the weakly proximitized SSs subject to lateral finite-size confinement. We study systematically how the proximity gap in the bulk states of both Ag(111) and Cu(111) persists up to island thicknesses of several times the bulk coherence length of Nb. We find that even for thick islands, the SSs acquire a gap, with the gap size for Cu being consistently larger than for Ag. Based on this, we argue that the SC in the SS is not provided through direct overlap of the SS wavefunction with the SC host but can be understood to be mediated by step edges inducing electronic coupling to the bulk. Our work provides important input for the microscopic understanding of induced superconductivity in heterostructures and its spectral manifestation. Moreover, it lays the foundation for more complex SC heterostructures based on noble metals.
Submitted 13 February, 2024;
originally announced February 2024.
-
A 2D Sinogram-Based Approach to Defect Localization in Computed Tomography
Authors:
Yuzhong Zhou,
Linda-Sophie Schneider,
Fuxin Fan,
Andreas Maier
Abstract:
The rise of deep learning has introduced a transformative era in the field of image processing, particularly in the context of computed tomography. Deep learning has made a significant contribution to the field of industrial computed tomography. However, many defect detection algorithms are applied directly to the reconstructed domain, often disregarding the raw sensor data. This paper shifts the focus to the use of sinograms. Within this framework, we present a comprehensive three-step deep learning algorithm designed to identify and analyze defects within objects without resorting to image reconstruction. These three steps are defect segmentation, mask isolation, and defect analysis. We use a U-Net-based architecture for defect segmentation. Our method achieves an Intersection over Union (IoU) of 92.02% on our simulated data, with an average position error of 1.3 pixels for defect detection on a 512-pixel-wide detector.
Submitted 29 January, 2024;
originally announced January 2024.
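For readers unfamiliar with the Intersection over Union metric quoted in this abstract, the following is a small sketch of how it is commonly computed for binary segmentation masks. The mask shapes, the toy data, and the function name `intersection_over_union` are assumptions made for illustration; the paper's own evaluation code is not shown in the abstract.

```python
import numpy as np

def intersection_over_union(pred_mask, true_mask):
    """IoU of two binary masks of identical shape: |A and B| / |A or B|."""
    pred = np.asarray(pred_mask, dtype=bool)
    true = np.asarray(true_mask, dtype=bool)
    if pred.shape != true.shape:
        raise ValueError("masks must have the same shape")
    union = np.logical_or(pred, true).sum()
    if union == 0:
        # Both masks are empty; treating this as perfect agreement is a convention assumed here.
        return 1.0
    intersection = np.logical_and(pred, true).sum()
    return float(intersection) / float(union)

# Toy masks on a hypothetical 4-row sinogram with a 512-pixel-wide detector.
true_mask = np.zeros((4, 512), dtype=bool)
true_mask[:, 100:120] = True   # ground-truth defect trace, 20 pixels wide
pred_mask = np.zeros((4, 512), dtype=bool)
pred_mask[:, 105:125] = True   # prediction shifted by 5 pixels

iou = intersection_over_union(pred_mask, true_mask)
print(f"IoU = {iou:.2f}")  # overlap of 15 columns out of a 25-column union -> 0.60
```

In this toy case a 5-pixel shift of a 20-pixel-wide mask already lowers the IoU to 0.60, which gives a sense of how tight the overlap must be to reach values above 0.9.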
-
Data-Driven Filter Design in FBP: Transforming CT Reconstruction with Trainable Fourier Series
Authors:
Yipeng Sun,
Linda-Sophie Schneider,
Fuxin Fan,
Mareike Thies,
Mingxuan Gu,
Siyuan Mei,
Yuzhong Zhou,
Siming Bayer,
Andreas Maier
Abstract:
In this study, we introduce a Fourier series-based trainable filter for computed tomography (CT) reconstruction within the filtered backprojection (FBP) framework. This method overcomes the limitation in noise reduction by optimizing Fourier series coefficients to construct the filter, maintaining computational efficiency with minimal increment for the trainable parameters compared to other deep learning frameworks. Additionally, we propose a Gaussian edge-enhanced (GEE) loss function that prioritizes the $L_1$ norm of high-frequency magnitudes, effectively countering the blurring problems prevalent in mean squared error (MSE) approaches. The model's foundation in the FBP algorithm ensures excellent interpretability, as it relies on a data-driven filter with all other parameters derived through rigorous mathematical procedures. Designed as a plug-and-play solution, our Fourier series-based filter can be easily integrated into existing CT reconstruction models, making it an adaptable tool for a wide range of practical applications. Code and data are available at https://github.com/sypsyp97/Trainable-Fourier-Series.
Submitted 25 October, 2024; v1 submitted 29 January, 2024;
originally announced January 2024.
-
A gradient-based approach to fast and accurate head motion compensation in cone-beam CT
Authors:
Mareike Thies,
Fabian Wagner,
Noah Maul,
Haijun Yu,
Manuela Goldmann,
Linda-Sophie Schneider,
Mingxuan Gu,
Siyuan Mei,
Lukas Folle,
Alexander Preuhs,
Michael Manhart,
Andreas Maier
Abstract:
Cone-beam computed tomography (CBCT) systems, with their flexibility, present a promising avenue for direct point-of-care medical imaging, particularly in critical scenarios such as acute stroke assessment. However, the integration of CBCT into clinical workflows faces challenges, primarily linked to long scan duration resulting in patient motion during scanning and leading to image quality degradation in the reconstructed volumes. This paper introduces a novel approach to CBCT motion estimation using a gradient-based optimization algorithm, which leverages generalized derivatives of the backprojection operator for cone-beam CT geometries. Building on that, a fully differentiable target function is formulated which grades the quality of the current motion estimate in reconstruction space. We drastically accelerate motion estimation yielding a 19-fold speed-up compared to existing methods. Additionally, we investigate the architecture of networks used for quality metric regression and propose predicting voxel-wise quality maps, favoring autoencoder-like architectures over contracting ones. This modification improves gradient flow, leading to more accurate motion estimation. The presented method is evaluated through realistic experiments on head anatomy. It achieves a reduction in reprojection error from an initial average of 3mm to 0.61mm after motion compensation and consistently demonstrates superior performance compared to existing approaches. The analytic Jacobian for the backprojection operation, which is at the core of the proposed method, is made publicly available. In summary, this paper contributes to the advancement of CBCT integration into clinical workflows by proposing a robust motion estimation approach that enhances efficiency and accuracy, addressing critical challenges in time-sensitive scenarios.
Submitted 21 October, 2024; v1 submitted 17 January, 2024;
originally announced January 2024.
-
Exchange energy of the ferromagnetic electronic ground-state in a monolayer semiconductor
Authors:
Nadine Leisgang,
Dmitry Miserev,
Hinrich Mattiat,
Lukas Schneider,
Lukas Sponfeldner,
Kenji Watanabe,
Takashi Taniguchi,
Martino Poggio,
Richard J. Warburton
Abstract:
Mobile electrons in the semiconductor monolayer-MoS$_2$ form a ferromagnetic state at low temperature. The Fermi sea consists of two circles, one at the $K$-point, the other at the $\tilde{K}$-point, both with the same spin. Here, we present an optical experiment on gated MoS$_2$ at low electron-density in which excitons are injected with known spin and valley quantum numbers. The resulting trions are identified using a model which accounts for the injection process, the formation of antisymmetrized trion states, electron-hole scattering from one valley to the other, and recombination. The results are consistent with a complete spin polarization. From the splittings between different trion states, we measure the exchange energy, $Σ$, the energy required to flip a single spin within the ferromagnetic state, as well as the intervalley Coulomb exchange energy, $J$. We determine $Σ=11.2\,$meV and $J=5\,$meV at $n=1.5 \times 10^{12}\,$cm$^{-2}$, and find that $J$ depends strongly on the electron density, $n$.
Submitted 3 November, 2023;
originally announced November 2023.
-
Evaluating machine learning models in non-standard settings: An overview and new findings
Authors:
Roman Hornung,
Malte Nalenz,
Lennart Schneider,
Andreas Bender,
Ludwig Bothmann,
Bernd Bischl,
Thomas Augustin,
Anne-Laure Boulesteix
Abstract:
Estimating the generalization error (GE) of machine learning models is fundamental, with resampling methods being the most common approach. However, in non-standard settings, particularly those where observations are not independently and identically distributed, resampling using simple random data divisions may lead to biased GE estimates. This paper strives to present well-grounded guidelines for GE estimation in various such non-standard settings: clustered data, spatial data, unequal sampling probabilities, concept drift, and hierarchically structured outcomes. Our overview combines well-established methodologies with other existing methods that, to our knowledge, have not been frequently considered in these particular settings. A unifying principle among these techniques is that the test data used in each iteration of the resampling procedure should reflect the new observations to which the model will be applied, while the training data should be representative of the entire data set used to obtain the final model. Beyond providing an overview, we address literature gaps by conducting simulation studies. These studies assess the necessity of using GE-estimation methods tailored to the respective setting. Our findings corroborate the concern that standard resampling methods often yield biased GE estimates in non-standard settings, underscoring the importance of tailored GE estimation.
Submitted 23 October, 2023;
originally announced October 2023.
-
Learning Risk-Aware Quadrupedal Locomotion using Distributional Reinforcement Learning
Authors:
Lukas Schneider,
Jonas Frey,
Takahiro Miki,
Marco Hutter
Abstract:
Deployment in hazardous environments requires robots to understand the risks associated with their actions and movements to prevent accidents. Despite their importance, these risks are not explicitly modeled by currently deployed locomotion controllers for legged robots. In this work, we propose a risk-sensitive locomotion training method employing distributional reinforcement learning to consider safety explicitly. Instead of relying on a value expectation, we estimate the complete value distribution to account for uncertainty in the robot's interaction with the environment. The value distribution is consumed by a risk metric to extract risk-sensitive value estimates. These are integrated into Proximal Policy Optimization (PPO) to derive our method, Distributional Proximal Policy Optimization (DPPO). The risk preference, ranging from risk-averse to risk-seeking, can be controlled by a single parameter, which enables the robot's behavior to be adjusted dynamically. Importantly, our approach removes the need for additional reward function tuning to achieve risk sensitivity. We show emergent risk-sensitive locomotion behavior in simulation and on the quadrupedal robot ANYmal. Videos of the experiments and code are available at https://sites.google.com/leggedrobotics.com/risk-aware-locomotion.
Submitted 3 May, 2024; v1 submitted 25 September, 2023;
originally announced September 2023.
-
Monte-Carlo tree search with uncertainty propagation via optimal transport
Authors:
Tuan Dam,
Pascal Stenger,
Lukas Schneider,
Joni Pajarinen,
Carlo D'Eramo,
Odalric-Ambrym Maillard
Abstract:
This paper introduces a novel backup strategy for Monte-Carlo Tree Search (MCTS) designed for highly stochastic and partially observable Markov decision processes. We adopt a probabilistic approach, modeling both value and action-value nodes as Gaussian distributions. We introduce a novel backup operator that computes value nodes as the Wasserstein barycenter of their action-value children nodes; thus, propagating the uncertainty of the estimate across the tree to the root node. We study our novel backup operator when using a novel combination of $L^1$-Wasserstein barycenter with $α$-divergence, by drawing a notable connection to the generalized mean backup operator. We complement our probabilistic backup operator with two sampling strategies, based on optimistic selection and Thompson sampling, obtaining our Wasserstein MCTS algorithm. We provide theoretical guarantees of asymptotic convergence to the optimal policy, and an empirical evaluation on several stochastic and partially observable environments, where our approach outperforms well-known related baselines.
Submitted 19 September, 2023;
originally announced September 2023.
-
Evaluating Deep Learning-based Melanoma Classification using Immunohistochemistry and Routine Histology: A Three Center Study
Authors:
Christoph Wies,
Lucas Schneider,
Sarah Haggenmueller,
Tabea-Clara Bucher,
Sarah Hobelsberger,
Markus V. Heppt,
Gerardo Ferrara,
Eva I. Krieghoff-Henning,
Titus J. Brinker
Abstract:
Pathologists routinely use immunohistochemical (IHC)-stained tissue slides against MelanA in addition to hematoxylin and eosin (H&E)-stained slides to improve their accuracy in diagnosing melanomas. The use of diagnostic Deep Learning (DL)-based support systems for automated examination of tissue morphology and cellular composition has been well studied in standard H&E-stained tissue slides. In contrast, there are few studies that analyze IHC slides using DL. Therefore, we investigated the separate and joint performance of ResNets trained on MelanA and corresponding H&E-stained slides. The MelanA classifier achieved an area under receiver operating characteristics curve (AUROC) of 0.82 and 0.74 on out of distribution (OOD)-datasets, similar to the H&E-based benchmark classification of 0.81 and 0.75, respectively. A combined classifier using MelanA and H&E achieved AUROCs of 0.85 and 0.81 on the OOD datasets. DL MelanA-based assistance systems show the same performance as the benchmark H&E classification and may be improved by multi stain classification to assist pathologists in their clinical routine.
Submitted 8 September, 2023; v1 submitted 7 September, 2023;
originally announced September 2023.
-
Prediction of Diblock Copolymer Morphology via Machine Learning
Authors:
Hyun Park,
Boyuan Yu,
Juhae Park,
Ge Sun,
Emad Tajkhorshid,
Juan J. de Pablo,
Ludwig Schneider
Abstract:
A machine learning approach is presented to accelerate the computation of block polymer morphology evolution for large domains over long timescales. The strategy exploits the separation of characteristic times between coarse-grained particle evolution on the monomer scale and slow morphological evolution over mesoscopic scales. In contrast to empirical continuum models, the proposed approach learns stochastically driven defect annihilation processes directly from particle-based simulations. A UNet architecture that respects different boundary conditions is adopted, thereby allowing periodic and fixed substrate boundary conditions of arbitrary shape. Physical concepts are also introduced via the loss function and symmetries are incorporated via data augmentation. The model is validated using three different use cases. Explainable artificial intelligence methods are applied to visualize the morphology evolution over time. This approach enables the generation of large system sizes and long trajectories to investigate defect densities and their evolution under different types of confinement. As an application, we demonstrate the importance of accessing late-stage morphologies for understanding particle diffusion inside a single block. This work has implications for directed self-assembly and materials design in micro-electronics, battery materials, and membranes.
Submitted 31 August, 2023;
originally announced August 2023.
-
3DMOTFormer: Graph Transformer for Online 3D Multi-Object Tracking
Authors:
Shuxiao Ding,
Eike Rehder,
Lukas Schneider,
Marius Cordts,
Juergen Gall
Abstract:
Tracking 3D objects accurately and consistently is crucial for autonomous vehicles, enabling more reliable downstream tasks such as trajectory prediction and motion planning. Based on the substantial progress in object detection in recent years, the tracking-by-detection paradigm has become a popular choice due to its simplicity and efficiency. State-of-the-art 3D multi-object tracking (MOT) approaches typically rely on non-learned model-based algorithms such as Kalman Filter but require many manually tuned parameters. On the other hand, learning-based approaches face the problem of adapting the training to the online setting, leading to inevitable distribution mismatch between training and inference as well as suboptimal performance. In this work, we propose 3DMOTFormer, a learned geometry-based 3D MOT framework building upon the transformer architecture. We use an Edge-Augmented Graph Transformer to reason on the track-detection bipartite graph frame-by-frame and conduct data association via edge classification. To reduce the distribution mismatch between training and inference, we propose a novel online training strategy with an autoregressive and recurrent forward pass as well as sequential batch optimization. Using CenterPoint detections, our approach achieves 71.2% and 68.2% AMOTA on the nuScenes validation and test split, respectively. In addition, a trained 3DMOTFormer model generalizes well across different object detectors. Code is available at: https://github.com/dsx0511/3DMOTFormer.
Submitted 12 August, 2023;
originally announced August 2023.
-
Q(D)O-ES: Population-based Quality (Diversity) Optimisation for Post Hoc Ensemble Selection in AutoML
Authors:
Lennart Purucker,
Lennart Schneider,
Marie Anastacio,
Joeran Beel,
Bernd Bischl,
Holger Hoos
Abstract:
Automated machine learning (AutoML) systems commonly ensemble models post hoc to improve predictive performance, typically via greedy ensemble selection (GES). However, we believe that GES may not always be optimal, as it performs a simple deterministic greedy search. In this work, we introduce two novel population-based ensemble selection methods, QO-ES and QDO-ES, and compare them to GES. While QO-ES optimises solely for predictive performance, QDO-ES also considers the diversity of ensembles within the population, maintaining a diverse set of well-performing ensembles during optimisation based on ideas of quality diversity optimisation. The methods are evaluated using 71 classification datasets from the AutoML benchmark, demonstrating that QO-ES and QDO-ES often outrank GES, albeit only statistically significant on validation data. Our results further suggest that diversity can be beneficial for post hoc ensembling but also increases the risk of overfitting.
Submitted 2 August, 2023; v1 submitted 17 July, 2023;
originally announced July 2023.
-
Multi-Objective Optimization of Performance and Interpretability of Tabular Supervised Machine Learning Models
Authors:
Lennart Schneider,
Bernd Bischl,
Janek Thomas
Abstract:
We present a model-agnostic framework for jointly optimizing the predictive performance and interpretability of supervised machine learning models for tabular data. Interpretability is quantified via three measures: feature sparsity, interaction sparsity of features, and sparsity of non-monotone feature effects. By treating hyperparameter optimization of a machine learning algorithm as a multi-objective optimization problem, our framework allows for generating diverse models that trade off high performance and ease of interpretability in a single optimization run. Efficient optimization is achieved via augmentation of the search space of the learning algorithm by incorporating feature selection, interaction and monotonicity constraints into the hyperparameter search space. We demonstrate that the optimization problem effectively translates to finding the Pareto optimal set of groups of selected features that are allowed to interact in a model, along with finding their optimal monotonicity constraints and optimal hyperparameters of the learning algorithm itself. We then introduce a novel evolutionary algorithm that can operate efficiently on this augmented search space. In benchmark experiments, we show that our framework is capable of finding diverse models that are highly competitive or outperform state-of-the-art XGBoost or Explainable Boosting Machine models, both with respect to performance and interpretability.
Submitted 16 July, 2023;
originally announced July 2023.
-
S.T.A.R.-Track: Latent Motion Models for End-to-End 3D Object Tracking with Adaptive Spatio-Temporal Appearance Representations
Authors:
Simon Doll,
Niklas Hanselmann,
Lukas Schneider,
Richard Schulz,
Markus Enzweiler,
Hendrik P. A. Lensch
Abstract:
Following the tracking-by-attention paradigm, this paper introduces an object-centric, transformer-based framework for tracking in 3D. Traditional model-based tracking approaches incorporate the geometric effect of object- and ego motion between frames with a geometric motion model. Inspired by this, we propose S.T.A.R.-Track, which uses a novel latent motion model (LMM) to additionally adjust object queries to account for changes in viewing direction and lighting conditions directly in the latent space, while still modeling the geometric motion explicitly. Combined with a novel learnable track embedding that aids in modeling the existence probability of tracks, this results in a generic tracking framework that can be integrated with any query-based detector. Extensive experiments on the nuScenes benchmark demonstrate the benefits of our approach, showing state-of-the-art performance for DETR3D-based trackers while drastically reducing the number of identity switches of tracks at the same time.
Submitted 13 October, 2024; v1 submitted 30 June, 2023;
originally announced June 2023.
-
Task-based Generation of Optimized Projection Sets using Differentiable Ranking
Authors:
Linda-Sophie Schneider,
Mareike Thies,
Christopher Syben,
Richard Schielein,
Mathias Unberath,
Andreas Maier
Abstract:
We present a method for selecting valuable projections in computed tomography (CT) scans to enhance image reconstruction and diagnosis. The approach integrates two important factors, projection-based detectability and data completeness, into a single feed-forward neural network. The network evaluates the value of projections, processes them through a differentiable ranking function and makes the final selection using a straight-through estimator. Data completeness is ensured through the label provided during training. The approach eliminates the need for heuristically enforcing data completeness, which may exclude valuable projections. The method is evaluated on simulated data in a non-destructive testing scenario, where the aim is to maximize the reconstruction quality within a specified region of interest. We achieve comparable results to previous methods, laying the foundation for using reconstruction-based loss functions to learn the selection of projections.
Submitted 21 March, 2023;
originally announced March 2023.
-
A Three-Regime Theorem for Flow-Firing
Authors:
Sarah Brauner,
Galen Dorpalen-Barry,
Selvi Kara,
Caroline Klivans,
Lisa Schneider
Abstract:
Graphical chip-firing is a discrete dynamical system where chips are placed on the vertices of a graph and exchanged via simple firing moves. Recent work has sought to generalize chip-firing on graphs to higher dimensions, wherein graphs are replaced by cellular complexes and chip firing becomes flow-rerouting along the faces of the complex. Given such a system, it is natural to ask (1) whether this firing process terminates and (2) if it terminates uniquely (e.g. is confluent). In the graphical case, these questions were definitively answered by Bjorner--Lovasz--Shor, who developed three regimes which completely determine if a given system will terminate. Building on the work of Duval--Klivans--Martin and Felzenszwalb-Klivans, we answer these questions in a context called flow-firing, where the cellular complexes are 2-dimensional.
Submitted 5 March, 2025; v1 submitted 4 March, 2023;
originally announced March 2023.
-
Membership Inference Attack for Beluga Whales Discrimination
Authors:
Voncarlos Marcelo Araújo,
Sébastien Gambs,
Clément Chion,
Robert Michaud,
Léo Schneider,
Hadrien Lautraite
Abstract:
To efficiently monitor the growth and evolution of a particular wildlife population, one of the main fundamental challenges to address in animal ecology is the re-identification of individuals that have been previously encountered but also the discrimination between known and unknown individuals (the so-called "open-set problem"), which is the first step to realize before re-identification. In particular, in this work, we are interested in the discrimination within digital photos of beluga whales, which are known to be among the most challenging marine species to discriminate due to their lack of distinctive features. To tackle this problem, we propose a novel approach based on the use of Membership Inference Attacks (MIAs), which are normally used to assess the privacy risks associated with releasing a particular machine learning model. More precisely, we demonstrate that the problem of discriminating between known and unknown individuals can be solved efficiently using state-of-the-art approaches for MIAs. Extensive experiments on three benchmark datasets related to whales, two different neural network architectures, and three MIAs clearly demonstrate the performance of the approach. In addition, we have also designed a novel MIA strategy that we coined as ensemble MIA, which combines the outputs of different MIAs to increase the attack accuracy while diminishing the false positive rate. Overall, one of our main objectives is also to show that the research on privacy attacks can also be leveraged "for good" by helping to address practical challenges encountered in animal ecology.
Submitted 28 February, 2023;
originally announced February 2023.
-
Optimizing CT Scan Geometries With and Without Gradients
Authors:
Mareike Thies,
Fabian Wagner,
Noah Maul,
Laura Pfaff,
Linda-Sophie Schneider,
Christopher Syben,
Andreas Maier
Abstract:
In computed tomography (CT), the projection geometry used for data acquisition needs to be known precisely to obtain a clear reconstructed image. Rigid patient motion is a cause for misalignment between measured data and employed geometry. Commonly, such motion is compensated by solving an optimization problem that, e.g., maximizes the quality of the reconstructed image with respect to the projection geometry. So far, gradient-free optimization algorithms have been utilized to find the solution for this problem. Here, we show that gradient-based optimization algorithms are a possible alternative and compare the performance to their gradient-free counterparts on a benchmark motion compensation problem. Gradient-based algorithms converge substantially faster while being comparable to gradient-free algorithms in terms of capture range and robustness to the number of free parameters. Hence, gradient-based optimization is a viable alternative for the given type of problems.
Submitted 13 February, 2023;
originally announced February 2023.
-
Search for large topological gaps in atomic spin chains on proximitized superconducting heavy metal layers
Authors:
Philip Beck,
Bendegúz Nyári,
Lucas Schneider,
Levente Rózsa,
András Lászlóffy,
Krisztián Palotás,
László Szunyogh,
Balázs Ujfalussy,
Jens Wiebe,
Roland Wiesendanger
Abstract:
One-dimensional systems comprising s-wave superconductivity with meticulously tuned magnetism and spin-orbit coupling can realize topologically gapped superconductors hosting Majorana edge modes whose stability is determined by the gap's size. The ongoing quest for larger topological gaps evolved into a material science issue. However, for atomic spin chains on superconductor surfaces, the effect of the substrate's spin-orbit coupling on the system's topological gap size is largely unexplored. Here, we introduce an atomic layer of the heavy metal Au on Nb(110) which combines strong spin-orbit coupling and a large superconducting gap with a high crystallographic quality enabling the assembly of defect-free Fe chains using a scanning tunneling microscope tip. Scanning tunneling spectroscopy experiments and density functional theory calculations reveal ferromagnetic coupling and ungapped YSR bands in the chain despite the heavy substrate. By artificially imposing a spin spiral state, our calculations indicate a minigap opening and zero-energy edge state formation. The presented methodology paves the way towards a material screening of heavy metal layers on elemental superconductors for ideal systems hosting Majorana edge modes protected by large topological gaps.
Submitted 13 January, 2023;
originally announced January 2023.
-
PySAGES: flexible, advanced sampling methods accelerated with GPUs
Authors:
Pablo F. Zubieta Rico,
Ludwig Schneider,
Gustavo R. Pérez-Lemus,
Riccardo Alessandri,
Siva Dasetty,
Cintia A. Menéndez,
Yiheng Wu,
Yezhi Jin,
Yinan Xu,
Trung D. Nguyen,
John A. Parker,
Andrew L. Ferguson,
Jonathan K. Whitmer,
Juan J. de Pablo
Abstract:
Molecular simulations are an important tool for research in physics, chemistry, and biology. The capabilities of simulations can be greatly expanded by providing access to advanced sampling methods and techniques that permit calculation of the relevant underlying free energy landscapes. In this sense, software that can be seamlessly adapted to a broad range of complex systems is essential. Building on past efforts to provide open-source community supported software for advanced sampling, we introduce PySAGES, a Python implementation of the Software Suite for Advanced General Ensemble Simulations (SSAGES) that provides full GPU support for massively parallel applications of enhanced sampling methods such as adaptive biasing forces, harmonic bias, or forward flux sampling in the context of molecular dynamics simulations. By providing an intuitive interface that facilitates the management of a system's configuration, the inclusion of new collective variables, and the implementation of sophisticated free energy-based sampling methods, the PySAGES library serves as a general platform for the development and implementation of emerging simulation techniques. The capabilities, core features, and computational performance of this new tool are demonstrated with clear and concise examples pertaining to different classes of molecular systems. We anticipate that PySAGES will provide the scientific community with a robust and easily accessible platform to accelerate simulations, improve sampling, and enable facile estimation of free energies for a wide range of materials and processes.
Submitted 4 April, 2023; v1 submitted 12 January, 2023;
originally announced January 2023.
-
Gradient-Based Geometry Learning for Fan-Beam CT Reconstruction
Authors:
Mareike Thies,
Fabian Wagner,
Noah Maul,
Lukas Folle,
Manuela Meier,
Maximilian Rohleder,
Linda-Sophie Schneider,
Laura Pfaff,
Mingxuan Gu,
Jonas Utz,
Felix Denzinger,
Michael Manhart,
Andreas Maier
Abstract:
Incorporating computed tomography (CT) reconstruction operators into differentiable pipelines has proven beneficial in many applications. Such approaches usually focus on the projection data and keep the acquisition geometry fixed. However, precise knowledge of the acquisition geometry is essential for high quality reconstruction results. In this paper, the differentiable formulation of fan-beam CT reconstruction is extended to the acquisition geometry. This allows gradient information to be propagated from a loss function on the reconstructed image into the geometry parameters. As a proof-of-concept experiment, this idea is applied to rigid motion compensation. The cost function is parameterized by a trained neural network which regresses an image quality metric from the motion-affected reconstruction alone. Using the proposed method, we are the first to optimize such an autofocus-inspired algorithm based on analytical gradients. The algorithm achieves a reduction in MSE by 35.5 % and an improvement in SSIM by 12.6 % over the motion-affected reconstruction. Next to motion compensation, we see further use cases of our differentiable method for scanner calibration or hybrid techniques employing deep models.
Submitted 5 December, 2022;
originally announced December 2022.
-
Proximity superconductivity in atom-by-atom crafted quantum dots
Authors:
Lucas Schneider,
Khai That Ton,
Ioannis Ioannidis,
Jannis Neuhaus-Steinmetz,
Thore Posske,
Roland Wiesendanger,
Jens Wiebe
Abstract:
Gapless materials in electronic contact with superconductors acquire proximity-induced superconductivity in a region near the interface. Numerous proposals build on this addition of electron pairing to originally non-superconducting systems like ferromagnets and predict intriguing quantum phases of matter, including topological-, odd-frequency-, or nodal-point superconductivity. However, atomic-scale experimental investigations of the microscopic mechanisms leading to proximity-induced Cooper pairing in surface or interface states are missing. Here, we investigate the most miniature example of the proximity effect on only a single quantum level of a surface state confined in a quantum corral on a superconducting substrate, built atom-by-atom by a scanning tunneling microscope. Whenever an eigenmode of the corral is pitched close to the Fermi energy by adjusting the corral's size, a pair of particle-hole symmetric states enters the superconductor's gap. We identify the in-gap states as scattering resonances theoretically predicted 50 years ago by Machida and Shibata, which had so far eluded detection. We further show that the observed anticrossings of the in-gap states indicate proximity-induced pairing in the quantum corral's eigenmodes. Our results have direct consequences on the interpretation of in-gap states in unconventional or topological superconductors, corroborate concepts to induce superconductivity into a single quantum level and further pave the way towards superconducting artificial lattices.
Submitted 1 December, 2022;
originally announced December 2022.
-
Structural Knowledge Distillation for Object Detection
Authors:
Philip de Rijk,
Lukas Schneider,
Marius Cordts,
Dariu M. Gavrila
Abstract:
Knowledge Distillation (KD) is a well-known training paradigm in deep neural networks where knowledge acquired by a large teacher model is transferred to a small student. KD has proven to be an effective technique to significantly improve the student's performance for various tasks including object detection. As such, KD techniques mostly rely on guidance at the intermediate feature level, which is typically implemented by minimizing an lp-norm distance between teacher and student activations during training. In this paper, we propose a replacement for the pixel-wise independent lp-norm based on the structural similarity (SSIM). By taking into account additional contrast and structural cues, feature importance, correlation and spatial dependence in the feature space are considered in the loss formulation. Extensive experiments on MSCOCO demonstrate the effectiveness of our method across different training schemes and architectures. Our method adds little computational overhead, is straightforward to implement, and significantly outperforms the standard lp-norms. Moreover, more complex state-of-the-art KD methods using attention-based sampling mechanisms are outperformed, including a +3.5 AP gain using a Faster R-CNN R-50 compared to a vanilla model.
Submitted 23 November, 2022;
originally announced November 2022.
-
Entanglements via Slip-Springs with Soft, Coarse-Grained Models for Systems Having Explicit Liquid-Vapor Interfaces
Authors:
Ludwig Schneider,
Juan de Pablo
Abstract:
Recent advances in nano-rheology require that new methods and models be developed to describe the equilibrium and non-equilibrium properties of entangled polymeric materials and their interfaces at a molecular level of detail. In this work we present a Slip-Spring (SLSP) model capable of describing the dynamics of entangled polymers at interfaces, including explicit liquid-vapor and liquid-solid interfaces. The highly coarse-grained approach adopted with this model enables simulation of entire nano-rheological characterization systems within a particle-level base description. Many-body dissipative particle dynamics (MDPD) non-bonded interactions allow for explicit liquid-vapor interfaces, and a compensating potential within the SLSP model ensures unbiased descriptions of the shape of the liquid-vapor interface. The usefulness of the model has been illustrated by studying the deposition of polymer droplets onto a substrate, where it is shown that the wetting dynamics is strongly dependent on the degree of entanglement of the polymer. More generally, the model proposed here provides a foundation for the development of digital twins of experimentally relevant systems, including a new generation of nano-rheometers based on nano- or micro-droplet deformation.
Submitted 9 November, 2022;
originally announced November 2022.
-
Testing the topological nature of end states in antiferromagnetic atomic chains on superconductors
Authors:
Lucas Schneider,
Philip Beck,
Levente Rózsa,
Thore Posske,
Jens Wiebe,
Roland Wiesendanger
Abstract:
Edge states forming at the boundaries of topologically non-trivial phases of matter are promising candidates for future device applications because of their stability against local perturbations. Magnetically ordered spin chains proximitized by an s-wave superconductor are predicted to enter a topologically non-trivial mini-gapped phase with zero-energy Majorana modes (MMs) localized at their ends. However, the presence of non-topological end states mimicking MM properties can spoil their unambiguous observation. Here, we report on a method to experimentally decide on the MM nature of end states observed for the first time in antiferromagnetic spin chains. Using scanning tunneling spectroscopy, we find end states at either finite or near-zero energy in Mn chains on Nb(110) or Ta(110), respectively, within a large minigap. By introducing a locally perturbing defect on one end of the chain, the end state on this side splits off from zero-energy while the one on the other side doesn't - ruling out their MM origin. A minimal model shows that, while wide trivial minigaps hosting such conventional end states are easily achieved in antiferromagnetic spin chains, unrealistically large spin-orbit couplings are required to drive the system into the topologically nontrivial phase with MMs. The methodology of perturbing chains by local defects is a powerful tool to probe the stability of future candidate topological edge modes against local disorder.
Submitted 22 December, 2022; v1 submitted 1 November, 2022;
originally announced November 2022.