Machine learning for in-situ composition mapping in a self-driving magnetron sputtering system

Authors: Sanna Jarl, Jens Sjölund, Robert J. W. Frost, Anders Holst, Jonathan J. S. Scragg

Abstract: Self-driving labs (SDLs), employing automation and machine learning (ML) to accelerate experimental procedures, have enormous potential in the discovery of new materials. However, in thin film science, SDLs are mainly restricted to solution-based synthetic methods which are easier to automate but cannot access the broad chemical space of inorganic materials. This work presents an SDL based on magnetron co-sputtering. We are using combinatorial frameworks, obtaining accurate composition maps on multi-element, compositionally graded thin films. This normally requires time-consuming ex-situ analysis prone to systematic errors. We present a rapid and calibration-free in-situ, ML driven approach to produce composition maps for arbitrary source combinations and sputtering conditions. We develop a method to predict the composition distribution in a multi-element combinatorial thin film, using in-situ measurements from quartz-crystal microbalance sensors placed in a sputter chamber. For a given source, the sensor readings are learned as a function of the sputtering pressure and magnetron power, through active learning using Gaussian processes (GPs). The final GPs are combined with a geometric model of the deposition flux distribution in the chamber, which allows interpolation of the deposition rates from each source, at any position across the sample. We investigate several acquisition functions for the ML procedure.
A fully Bayesian GP - BALM (Bayesian active learning MacKay) - achieved the best performance, learning the deposition rates for a single source in 10 experiments. Prediction accuracy for co-sputtering composition distributions was verified experimentally. Our framework dramatically increases throughput by avoiding the need for extensive characterisation or calibration, thus demonstrating the potential of ML-guided SDLs to accelerate materials exploration.

Submitted 6 June, 2025; originally announced June 2025.

Comments: 24 pages, 10 figures. Submitted to the journal npj computational materials

ACM Class: I.2.1; J.2.8

arXiv:2505.16733 [pdf, ps, other]

Forward-only Diffusion Probabilistic Models

Authors: Ziwei Luo, Fredrik K. Gustafsson, Jens Sjölund, Thomas B. Schön

Abstract: This work presents a forward-only diffusion (FoD) approach for generative modelling. In contrast to traditional diffusion models that rely on a coupled forward-backward diffusion scheme, FoD directly learns data generation through a single forward diffusion process, yielding a simple yet efficient generative framework. The core of FoD is a state-dependent linear stochastic differential equation that involves a mean-reverting term in both the drift and diffusion functions. This mean-reversion property guarantees the convergence to clean data, naturally simulating a stochastic interpolation between source and target distributions. More importantly, FoD is analytically tractable and is trained using a simple stochastic flow matching objective, enabling a few-step non-Markov chain sampling during inference. The proposed FoD model, despite its simplicity, achieves competitive performance on various image-conditioned (e.g., image restoration) and unconditional generation tasks, demonstrating its effectiveness in generative modelling. Our code is available at https://github.com/Algolzw/FoD.

Submitted 22 May, 2025; originally announced May 2025.

Comments: Project page: https://algolzw.github.io/fod

arXiv:2505.04437 [pdf, other]

Probabilistic Zeeman-Doppler imaging of stellar magnetic fields: I. Analysis of tau Scorpii in the weak-field limit

Authors: Jennifer Rosina Andersson, Oleg Kochukhov, Zheng Zhao, Jens Sjölund

Abstract: Zeeman-Doppler imaging (ZDI) is used to study the surface magnetic field topology of stars, based on high-resolution spectropolarimetric time series observations. Multiple ZDI inversions have been conducted for the early B-type star tau Sco, which has been found to exhibit a weak but complex non-dipolar surface magnetic field. The classical ZDI framework suffers from a significant limitation in that it provides little to no reliable uncertainty quantification for the reconstructed magnetic field maps, with essentially all published results being confined to point estimates. To fill this gap, we propose a Bayesian framework for probabilistic ZDI. Here, the proposed framework is demonstrated on tau Sco in the weak-field limit. We propose three distinct statistical models, and use archival ESPaDOnS high-resolution Stokes V observations to carry out the probabilistic magnetic inversion in closed form. The surface magnetic field is parameterised by a high-dimensional spherical-harmonic expansion. By comparing three different prior distributions over the latent variables in the spherical-harmonic decomposition, our results showcase the ZDI sensitivity to various hyperparameters. The mean magnetic field maps are qualitatively similar to previously published point estimates, but analysis of the magnetic energy distribution indicates high uncertainty and higher energy content at low angular degrees l.
Our results effectively demonstrate that, for stars in the weak-field regime, reliable uncertainty quantification of recovered magnetic field maps can be obtained in closed form with natural assumptions on the statistical model. Future work will explore extending this framework beyond the weak-field approximation and incorporating prior uncertainty over multiple stellar parameters in more complex magnetic inversion problems.

Submitted 7 May, 2025; originally announced May 2025.

Comments: Accepted for publication in A&A

arXiv:2503.16978 [pdf, other]

Real-Time Diffusion Policies for Games: Enhancing Consistency Policies with Q-Ensembles

Authors: Ruoqi Zhang, Ziwei Luo, Jens Sjölund, Per Mattsson, Linus Gisslén, Alessandro Sestini

Abstract: Diffusion models have shown impressive performance in capturing complex and multi-modal action distributions for game agents, but their slow inference speed prevents practical deployment in real-time game environments. While consistency models offer a promising approach for one-step generation, they often suffer from training instability and performance degradation when applied to policy learning. In this paper, we present CPQE (Consistency Policy with Q-Ensembles), which combines consistency models with Q-ensembles to address these challenges. CPQE leverages uncertainty estimation through Q-ensembles to provide more reliable value function approximations, resulting in better training stability and improved performance compared to classic double Q-network methods. Our extensive experiments across multiple game scenarios demonstrate that CPQE achieves inference speeds of up to 60 Hz -- a significant improvement over state-of-the-art diffusion policies that operate at only 20 Hz -- while maintaining comparable performance to multi-step diffusion approaches. CPQE consistently outperforms state-of-the-art consistency model approaches, showing both higher rewards and enhanced training stability throughout the learning process. These results indicate that CPQE offers a practical solution for deploying diffusion-based policies in games and other real-time applications where both multi-modal behavior modeling and rapid inference are critical requirements.

Submitted 21 March, 2025; originally announced March 2025.

arXiv:2503.13077 [pdf, other]

Towards Better Sample Efficiency in Multi-Agent Reinforcement Learning via Exploration

Authors: Amir Baghi, Jens Sjölund, Joakim Bergdahl, Linus Gisslén, Alessandro Sestini

Abstract: Multi-agent reinforcement learning has shown promise in learning cooperative behaviors in team-based environments. However, such methods often demand extensive training time. For instance, the state-of-the-art method TiZero takes 40 days to train high-quality policies for a football environment. In this paper, we hypothesize that better exploration mechanisms can improve the sample efficiency of multi-agent methods. We propose two different approaches for better exploration in TiZero: a self-supervised intrinsic reward and a random network distillation bonus. Additionally, we introduce architectural modifications to the original algorithm to enhance TiZero's computational efficiency. We evaluate the sample efficiency of these approaches through extensive experiments. Our results show that random network distillation improves training sample efficiency by 18.8% compared to the original TiZero. Furthermore, we evaluate the qualitative behavior of the models produced by both variants against a heuristic AI, with the self-supervised reward encouraging possession and random network distillation leading to a more offensive performance. Our results highlight the applicability of our random network distillation variant in practical settings. Lastly, due to the nature of the proposed method, we acknowledge its use beyond football simulation, especially in environments with strong multi-agent and strategic aspects.

Submitted 17 March, 2025; originally announced March 2025.

Comments: 8 pages, 3 figures

arXiv:2502.21102 [pdf, other]

Minimal positive Markov realizations

Authors: Hamed Taghavian, Jens Sjölund

Abstract: Finding a positive state-space realization with the minimum dimension for a given transfer function is an open problem in control theory. In this paper, we focus on positive realizations in Markov form and propose a linear programming approach that computes them with a minimum dimension. Such minimum dimension of positive Markov realizations is an upper bound of the minimal positive realization dimension. However, we show that these two dimensions are equal for certain systems.

Submitted 1 April, 2025; v1 submitted 28 February, 2025; originally announced February 2025.

arXiv:2502.08292 [pdf, ps, other]

Navigating chemical design spaces for metal-ion batteries via machine-learning-guided phase-field simulations

Authors: Hamed Taghavian, Viktor Vanoppen, Erik Berg, Peter Broqvist, Jens Sjölund

Abstract: Metal anodes provide the highest energy density in batteries. However, they still suffer from electrode/electrolyte interface side reactions and dendrite growth, especially under fast-charging conditions. In this paper, we consider a phase-field model of electrodeposition in metal-anode batteries and provide a scalable, versatile framework for optimizing its chemical parameters. Our approach is based on Bayesian optimization and explores the parameter space with a high sample efficiency and a low computation complexity. We use this framework to find the optimal cell for suppressing dendrite growth and accelerating charging speed under constant voltage. We identify interfacial mobility as a key parameter, which should be maximized to inhibit dendrites without compromising the charging speed. The results are verified using extended simulations of dendrite evolution in charging half cells with lithium-metal anodes.

Submitted 29 May, 2025; v1 submitted 12 February, 2025; originally announced February 2025.

arXiv:2411.06887 [pdf, ps, other]

Symmetrizable systems

Authors: Hamed Taghavian, Jens Sjölund

Abstract: Transforming an asymmetric system into a symmetric system makes it possible to exploit the simplifying properties of symmetry in control problems. We define and characterize the family of symmetrizable systems, which can be transformed into symmetric systems by a linear transformation of their inputs and outputs. In the special case of complete symmetry, the set of symmetrizable systems is convex and verifiable by a semidefinite program. We show that a Khatri-Rao rank needs to be satisfied for a system to be symmetrizable and conclude that linear systems are generically neither symmetric nor symmetrizable.

Submitted 9 April, 2025; v1 submitted 11 November, 2024; originally announced November 2024.

arXiv:2410.11491 [pdf, other]

doi 10.1007/978-3-031-72069-7_66

Online learning in motion modeling for intra-interventional image sequences

Authors: Niklas Gunnarsson, Jens Sjölund, Peter Kimstrand, Thomas B. Schön

Abstract: Image monitoring and guidance during medical examinations can aid both diagnosis and treatment. However, the sampling frequency is often too low, which creates a need to estimate the missing images. We present a probabilistic motion model for sequential medical images, with the ability to both estimate motion between acquired images and forecast the motion ahead of time. The core is a low-dimensional temporal process based on a linear Gaussian state-space model with analytically tractable solutions for forecasting, simulation, and imputation of missing samples. The results, from two experiments on publicly available cardiac datasets, show reliable motion estimates and an improved forecasting performance using patient-specific adaptation by online learning.

Submitted 15 October, 2024; originally announced October 2024.

Comments: Medical Image Computing and Computer Assisted Intervention (MICCAI) 2024
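
The abstract of this entry mentions a linear Gaussian state-space model with tractable forecasting and imputation of missing samples. As a rough illustration only, the following Python sketch runs a scalar Kalman filter in which missing samples are skipped in the update step. It is not the authors' model: the function name `kalman_filter`, the scalar state, and the parameters `a`, `q`, `r`, `m0`, `p0` are all assumptions made for this example.

```python
def kalman_filter(observations, a=1.0, q=0.1, r=0.5, m0=0.0, p0=1.0):
    """Scalar Kalman filter for x_k = a*x_{k-1} + noise(q), y_k = x_k + noise(r).

    Entries of `observations` equal to None are treated as missing samples:
    only the prediction step is applied, so the returned mean is an imputed
    (or forecast) value and the variance grows.
    All parameter values are hypothetical defaults for illustration.
    """
    m, p = m0, p0
    means, variances = [], []
    for y in observations:
        # Prediction step: propagate mean and variance through the dynamics.
        m, p = a * m, a * a * p + q
        if y is not None:
            # Update step: correct the prediction with the acquired sample.
            gain = p / (p + r)
            m = m + gain * (y - m)
            p = (1.0 - gain) * p
        means.append(m)
        variances.append(p)
    return means, variances


if __name__ == "__main__":
    # One acquired sample followed by two missing ones.
    means, variances = kalman_filter([1.0, None, None])
    print(means, variances)
```

With `a = 1`, the mean stays constant over the missing steps while the variance increases by `q` per step, which is the expected behaviour when no new information arrives.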

arXiv:2409.10353 [pdf, other]

Taming Diffusion Models for Image Restoration: A Review

Authors: Ziwei Luo, Fredrik K. Gustafsson, Zheng Zhao, Jens Sjölund, Thomas B. Schön

Abstract: Diffusion models have achieved remarkable progress in generative modelling, particularly in enhancing image quality to conform to human preferences. Recently, these models have also been applied to low-level computer vision for photo-realistic image restoration (IR) in tasks such as image denoising, deblurring, dehazing, etc. In this review paper, we introduce key constructions in diffusion models and survey contemporary techniques that make use of diffusion models in solving general IR tasks. Furthermore, we point out the main challenges and limitations of existing diffusion-based IR frameworks and provide potential directions for future work.

Submitted 22 October, 2024; v1 submitted 16 September, 2024; originally announced September 2024.

Comments: Review paper; any comments and suggestions are most welcome!

arXiv:2409.09650 [pdf, other]

Conditional sampling within generative diffusion models

Authors: Zheng Zhao, Ziwei Luo, Jens Sjölund, Thomas B. Schön

Abstract: Generative diffusions are a powerful class of Monte Carlo samplers that leverage bridging Markov processes to approximate complex, high-dimensional distributions, such as those found in image processing and language models. Despite their success in these domains, an important open challenge remains: extending these techniques to sample from conditional distributions, as required in, for example, Bayesian inverse problems. In this paper, we present a comprehensive review of existing computational approaches to conditional sampling within generative diffusion models. Specifically, we highlight key methodologies that either utilise the joint distribution, or rely on (pre-trained) marginal distributions with explicit likelihoods, to construct conditional generative samplers.

Submitted 19 February, 2025; v1 submitted 15 September, 2024; originally announced September 2024.

arXiv:2409.08262 [pdf, other]

Learning incomplete factorization preconditioners for GMRES

Authors: Paul Häusner, Aleix Nieto Juscafresa, Jens Sjölund

Abstract: Incomplete LU factorizations of sparse matrices are widely used as preconditioners in Krylov subspace methods to speed up solving linear systems. Unfortunately, computing the preconditioner itself can be time-consuming and sensitive to hyper-parameters. Instead, we replace the hand-engineered algorithm with a graph neural network that is trained to approximate the matrix factorization directly. To apply the output of the neural network as a preconditioner, we propose an output activation function that guarantees that the predicted factorization is invertible. Further, applying a graph neural network architecture allows us to ensure that the output itself is sparse which is desirable from a computational standpoint. We theoretically analyze and empirically evaluate different loss functions to train the learned preconditioners and show their effectiveness in decreasing the number of GMRES iterations and improving the spectral properties on synthetic data. The code is available at https://github.com/paulhausner/neural-incomplete-factorization.

Submitted 11 December, 2024; v1 submitted 12 September, 2024; originally announced September 2024.

Comments: The first two authors contributed equally, Northern Lights Deep Learning Conference, 15 pages

arXiv:2405.13794 [pdf, other]

Conditioning diffusion models by explicit forward-backward bridging

Authors: Adrien Corenflos, Zheng Zhao, Simo Särkkä, Jens Sjölund, Thomas B. Schön

Abstract: Given an unconditional diffusion model targeting a joint model $π(x, y)$, using it to perform conditional simulation $π(x \mid y)$ is still largely an open question and is typically achieved by learning conditional drifts to the denoising SDE after the fact. In this work, we express \emph{exact} conditional simulation within the \emph{approximate} diffusion model as an inference problem on an augmented space corresponding to a partial SDE bridge. This perspective allows us to implement efficient and principled particle Gibbs and pseudo-marginal samplers marginally targeting the conditional distribution $π(x \mid y)$. Contrary to existing methodology, our methods do not introduce any additional approximation to the unconditional diffusion model aside from the Monte Carlo error. We showcase the benefits and drawbacks of our approach on a series of synthetic and real data examples.

Submitted 20 February, 2025; v1 submitted 22 May, 2024; originally announced May 2024.

Comments: In AISTATS 2025

arXiv:2405.03880 [pdf, other]

doi 10.1088/1361-6560/ad68bd

Efficient Radiation Treatment Planning based on Voxel Importance

Authors: Sebastian Mair, Anqi Fu, Jens Sjölund

Abstract: Radiation treatment planning involves optimization over a large number of voxels, many of which carry limited information about the clinical problem. We propose an approach to reduce the large optimization problem by only using a representative subset of informative voxels. This way, we drastically improve planning efficiency while maintaining the plan quality. Within an initial probing step, we pre-solve an easier optimization problem involving a simplified objective from which we derive an importance score per voxel. This importance score is then turned into a sampling distribution, which allows us to subsample a small set of informative voxels using importance sampling. By solving a - now reduced - version of the original optimization problem using this subset, we effectively reduce the problem's size and computational demands while accounting for regions where satisfactory dose deliveries are challenging. In contrast to other stochastic (sub-)sampling methods, our technique only requires a single probing and sampling step to define a reduced optimization problem. This problem can be efficiently solved using established solvers without the need of modifying or adapting them. Empirical experiments on open benchmark data highlight substantially reduced optimization times, up to 50 times faster than the original ones, for intensity-modulated radiation therapy (IMRT), all while upholding plan quality comparable to traditional methods.
Our novel approach has the potential to significantly accelerate radiation treatment planning by addressing its inherent computational challenges. We reduce the treatment planning time by reducing the size of the optimization problem rather than modifying and improving the optimization method. Our efforts are thus complementary to many previous developments.

Submitted 9 August, 2024; v1 submitted 6 May, 2024; originally announced May 2024.

Comments: 21 pages, 11 figures

Journal ref: Phys. Med. Biol. 69 (2024)
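
The abstract of this entry describes turning per-voxel importance scores into a sampling distribution and subsampling voxels by importance sampling. A minimal Python sketch of that general idea follows. It is not the authors' implementation: the names `importance_subsample` and `weighted_estimate` are hypothetical, the scores are assumed to be given and strictly positive, and the 1/(m·p) reweighting is the standard importance-sampling correction rather than a detail taken from the abstract.

```python
import random


def importance_subsample(scores, m, seed=0):
    """Draw m voxel indices with probability proportional to their scores.

    Returns (indices, weights), where each weight is 1 / (m * p_i) so that a
    weighted sum over the sample is an unbiased estimate of the full sum.
    Assumes every score is strictly positive (an assumption of this sketch).
    """
    total = sum(scores)
    probs = [s / total for s in scores]
    rng = random.Random(seed)
    # Sampling with replacement, proportional to the importance scores.
    indices = rng.choices(range(len(scores)), weights=probs, k=m)
    weights = [1.0 / (m * probs[i]) for i in indices]
    return indices, weights


def weighted_estimate(values, indices, weights):
    """Estimate sum(values) from the reweighted subsample."""
    return sum(w * values[i] for i, w in zip(indices, weights))


if __name__ == "__main__":
    # Hypothetical per-voxel objective contributions and importance scores.
    values = [0.2, 1.5, 3.0, 0.1, 2.2]
    scores = [0.3, 1.0, 2.5, 0.2, 2.0]
    idx, w = importance_subsample(scores, m=3)
    print(idx, weighted_estimate(values, idx, w), sum(values))
```

When the scores are exactly proportional to the per-voxel values, the weighted estimate reproduces the full sum; otherwise it is an unbiased but noisy estimate whose variance depends on how well the scores track the values.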

arXiv:2404.09732 [pdf, other]

Photo-Realistic Image Restoration in the Wild with Controlled Vision-Language Models

Authors: Ziwei Luo, Fredrik K. Gustafsson, Zheng Zhao, Jens Sjölund, Thomas B. Schön

Abstract: Though diffusion models have been successfully applied to various image restoration (IR) tasks, their performance is sensitive to the choice of training datasets. Typically, diffusion models trained in specific datasets fail to recover images that have out-of-distribution degradations. To address this problem, this work leverages a capable vision-language model and a synthetic degradation pipeline to learn image restoration in the wild (wild IR). More specifically, all low-quality images are simulated with a synthetic degradation pipeline that contains multiple common degradations such as blur, resize, noise, and JPEG compression. Then we introduce robust training for a degradation-aware CLIP model to extract enriched image content features to assist high-quality image restoration. Our base diffusion model is the image restoration SDE (IR-SDE). Built upon it, we further present a posterior sampling strategy for fast noise-free image generation. We evaluate our model on both synthetic and real-world degradation datasets. Moreover, experiments on the unified image restoration task illustrate that the proposed posterior sampling improves image generation quality for various degradations.

Submitted 15 April, 2024; originally announced April 2024.

Comments: CVPRW 2024; Code: https://github.com/Algolzw/daclip-uir

arXiv:2402.10206 [pdf, other]

Ising on the Graph: Task-specific Graph Subsampling via the Ising Model

Authors: Maria Bånkestad, Jennifer R. Andersson, Sebastian Mair, Jens Sjölund

Abstract: Reducing a graph while preserving its overall properties is an important problem with many applications. Typically, reduction approaches either remove edges (sparsification) or merge nodes (coarsening) in an unsupervised way with no specific downstream task in mind. In this paper, we present an approach for subsampling graph structures using an Ising model defined on either the nodes or edges and learning the external magnetic field of the Ising model using a graph neural network. Our approach is task-specific as it can learn how to reduce a graph for a specific downstream task in an end-to-end fashion without requiring a differentiable loss function for the task. We showcase the versatility of our approach on four distinct applications: image segmentation, explainability for graph classification, 3D shape sparsification, and sparse approximate matrix inverse determination.

Submitted 8 April, 2025; v1 submitted 15 February, 2024; originally announced February 2024.

Comments: 29 pages, 22 figures, accepted at the Learning on Graphs conference (LoG 2024)

arXiv:2402.04080 [pdf, other]

Entropy-regularized Diffusion Policy with Q-Ensembles for Offline Reinforcement Learning

Authors: Ruoqi Zhang, Ziwei Luo, Jens Sjölund, Thomas B. Schön, Per Mattsson

Abstract: This paper presents advanced techniques of training diffusion policies for offline reinforcement learning (RL). At the core is a mean-reverting stochastic differential equation (SDE) that transfers a complex action distribution into a standard Gaussian and then samples actions conditioned on the environment state with a corresponding reverse-time SDE, like a typical diffusion policy. We show that such an SDE has a solution that we can use to calculate the log probability of the policy, yielding an entropy regularizer that improves the exploration of offline datasets. To mitigate the impact of inaccurate value functions from out-of-distribution data points, we further propose to learn the lower confidence bound of Q-ensembles for more robust policy improvement. By combining the entropy-regularized diffusion policy with Q-ensembles in offline RL, our method achieves state-of-the-art performance on most tasks in D4RL benchmarks. Code is available at https://github.com/ruoqizzz/Entropy-Regularized-Diffusion-Policy-with-QEnsemble.

Submitted 8 January, 2025; v1 submitted 6 February, 2024; originally announced February 2024.
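
The abstract of this entry refers to a mean-reverting SDE that carries a distribution to a standard Gaussian and admits a closed-form solution. The Python sketch below illustrates that property with the simplest such process, an Ornstein-Uhlenbeck SDE dx = theta * (mu - x) dt + sigma dW with constant coefficients. This is an assumption made for illustration and not the paper's formulation, which may use time-dependent coefficients; the names `ou_marginal` and `sample_forward` are hypothetical.

```python
import math
import random


def ou_marginal(x0, t, theta=1.0, mu=0.0, sigma=None):
    """Closed-form mean and variance of x_t given x_0 for the OU process.

    By default sigma**2 = 2 * theta, so the stationary distribution is
    N(mu, 1), i.e. a standard Gaussian when mu = 0 (choice made for this sketch).
    """
    if sigma is None:
        sigma = math.sqrt(2.0 * theta)
    decay = math.exp(-theta * t)
    mean = mu + (x0 - mu) * decay
    var = sigma ** 2 / (2.0 * theta) * (1.0 - decay ** 2)
    return mean, var


def sample_forward(x0, t, rng, **kwargs):
    """Sample x_t directly from the closed-form marginal, without time stepping."""
    mean, var = ou_marginal(x0, t, **kwargs)
    return mean + math.sqrt(var) * rng.gauss(0.0, 1.0)


if __name__ == "__main__":
    rng = random.Random(0)
    for t in (0.0, 0.5, 2.0, 10.0):
        print(t, ou_marginal(3.0, t), sample_forward(3.0, t, rng))
```

Because the marginal is Gaussian with known mean and variance, log-densities along the forward process can be evaluated analytically in this toy setting, which is the kind of tractability the abstract alludes to.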

arXiv:2311.12566 [pdf, other]

Variational Elliptical Processes

Authors: Maria Bånkestad, Jens Sjölund, Jalil Taghia, Thomas B. Schön

Abstract: We present elliptical processes, a family of non-parametric probabilistic models that subsume Gaussian processes and Student's t processes. This generalization includes a range of new heavy-tailed behaviors while retaining computational tractability. Elliptical processes are based on a representation of elliptical distributions as a continuous mixture of Gaussian distributions. We parameterize this mixture distribution as a spline normalizing flow, which we train using variational inference. The proposed form of the variational posterior enables a sparse variational elliptical process applicable to large-scale problems. We highlight advantages compared to Gaussian processes through regression and classification experiments. Elliptical processes can supersede Gaussian processes in several settings, including cases where the likelihood is non-Gaussian or when accurate tail modeling is essential.

Submitted 21 November, 2023; originally announced November 2023.

Comments: 14 pages, 15 figures, appendix 9 pages

Journal ref: Transactions on Machine Learning Research, September 2023

arXiv:2310.19608 [pdf, other]

On Feynman--Kac training of partial Bayesian neural networks

Authors: Zheng Zhao, Sebastian Mair, Thomas B. Schön, Jens Sjölund

Abstract: Recently, partial Bayesian neural networks (pBNNs), which only consider a subset of the parameters to be stochastic, were shown to perform competitively with full Bayesian neural networks. However, pBNNs are often multi-modal in the latent variable space and thus challenging to approximate with parametric models. To address this problem, we propose an efficient sampling-based training strategy, wherein the training of a pBNN is formulated as simulating a Feynman--Kac model. We then describe variations of sequential Monte Carlo samplers that allow us to simultaneously estimate the parameters and the latent posterior distribution of this model at a tractable computational cost. Using various synthetic and real-world datasets we show that our proposed training scheme outperforms the state of the art in terms of predictive performance.

Submitted 27 February, 2024; v1 submitted 30 October, 2023; originally announced October 2023.

Comments: In AISTATS 2024

arXiv:2310.01018 [pdf, other]

Controlling Vision-Language Models for Multi-Task Image Restoration

Authors: Ziwei Luo, Fredrik K. Gustafsson, Zheng Zhao, Jens Sjölund, Thomas B. Schön

Abstract: Vision-language models such as CLIP have shown great impact on diverse downstream tasks for zero-shot or label-free predictions. However, when it comes to low-level vision such as image restoration their performance deteriorates dramatically due to corrupted inputs. In this paper, we present a degradation-aware vision-language model (DA-CLIP) to better transfer pretrained vision-language models to low-level vision tasks as a multi-task framework for image restoration. More specifically, DA-CLIP trains an additional controller that adapts the fixed CLIP image encoder to predict high-quality feature embeddings. By integrating the embedding into an image restoration network via cross-attention, we are able to pilot the model to learn a high-fidelity image reconstruction. The controller itself will also output a degradation feature that matches the real corruptions of the input, yielding a natural classifier for different degradation types. In addition, we construct a mixed degradation dataset with synthetic captions for DA-CLIP training. Our approach advances state-of-the-art performance on both degradation-specific and unified image restoration tasks, showing a promising direction of prompting image restoration with large-scale pretrained vision-language models. Our code is available at https://github.com/Algolzw/daclip-uir.

Submitted 28 February, 2024; v1 submitted 2 October, 2023; originally announced October 2023.

Comments: Accepted by ICLR 2024. Project page: https://algolzw.github.io/daclip-uir/index.html

arXiv:2309.15188 [pdf, other]

doi 10.5281/zenodo.7958513

ICML 2023 Topological Deep Learning Challenge: Design and Results

Authors: Mathilde Papillon, Mustafa Hajij, Helen Jenne, Johan Mathe, Audun Myers, Theodore Papamarkou, Tolga Birdal, Tamal Dey, Tim Doster, Tegan Emerson, Gurusankar Gopalakrishnan, Devendra Govil, Aldo Guzmán-Sáenz, Henry Kvinge, Neal Livesay, Soham Mukherjee, Shreyas N. Samaga, Karthikeyan Natesan Ramamurthy, Maneel Reddy Karri, Paul Rosen, Sophia Sanborn, Robin Walters, Jens Agerberg, Sadrodin Barikbin, Claudio Battiloro, et al. (31 additional authors not shown)

Abstract: This paper presents the computational challenge on topological deep learning that was hosted within the ICML 2023 Workshop on Topology and Geometry in Machine Learning. The competition asked participants to provide open-source implementations of topological neural networks from the literature by contributing to the Python packages TopoNetX (data processing) and TopoModelX (deep learning). The challenge attracted twenty-eight qualifying submissions in its two-month duration. This paper describes the design of the challenge and summarizes its main findings.

Submitted 18 January, 2024; v1 submitted 26 September, 2023; originally announced September 2023.

arXiv:2307.10187 [pdf, other]

Personalized Privacy Amplification via Importance Sampling

Authors: Dominik Fay, Sebastian Mair, Jens Sjölund

Abstract: For scalable machine learning on large data sets, subsampling a representative subset is a common approach for efficient model training. This is often achieved through importance sampling, whereby informative data points are sampled more frequently. In this paper, we examine the privacy properties of importance sampling, focusing on an individualized privacy analysis. We find that, in importance sampling, privacy is well aligned with utility but at odds with sample size. Based on this insight, we propose two approaches for constructing sampling distributions: one that optimizes the privacy-efficiency trade-off; and one based on a utility guarantee in the form of coresets. We evaluate both approaches empirically in terms of privacy, efficiency, and accuracy on the differentially private $k$-means problem. We observe that both approaches yield similar outcomes and consistently outperform uniform sampling across a wide range of data sets. Our code is available on GitHub: https://github.com/smair/personalized-privacy-amplification-via-importance-sampling

Submitted 28 March, 2025; v1 submitted 5 July, 2023; originally announced July 2023.

Comments: 28 pages, 7 figures

Journal ref: Transactions on Machine Learning Research (12/2024)

arXiv:2307.00141 [pdf, other]

Risk-sensitive Actor-free Policy via Convex Optimization

Authors: Ruoqi Zhang, Jens Sjölund

Abstract: Traditional reinforcement learning methods optimize agents without considering safety, potentially resulting in unintended consequences. In this paper, we propose an optimal actor-free policy that optimizes a risk-sensitive criterion based on the conditional value at risk. The risk-sensitive objective function is modeled using an input-convex neural network ensuring convexity with respect to the actions and enabling the identification of globally optimal actions through simple gradient-following methods. Experimental results demonstrate the efficacy of our approach in maintaining effective risk control.

Submitted 30 June, 2023; originally announced July 2023.

Comments: Accepted by The IJCAI-2023 AISafety and SafeRL Joint Workshop

arXiv:2305.16368 [pdf, other]

Neural incomplete factorization: learning preconditioners for the conjugate gradient method

Authors: Paul Häusner, Ozan Öktem, Jens Sjölund

Abstract: The convergence of the conjugate gradient method for solving large-scale and sparse linear equation systems depends on the spectral properties of the system matrix, which can be improved by preconditioning. In this paper, we develop a computationally efficient data-driven approach to accelerate the generation of effective preconditioners. We, therefore, replace the typically hand-engineered preconditioners by the output of graph neural networks. Our method generates an incomplete factorization of the matrix and is, therefore, referred to as neural incomplete factorization (NeuralIF). Optimizing the condition number of the linear system directly is computationally infeasible. Instead, we utilize a stochastic approximation of the Frobenius loss which only requires matrix-vector multiplications for efficient training. At the core of our method is a novel message-passing block, inspired by sparse matrix theory, that aligns with the objective of finding a sparse factorization of the matrix. We evaluate our proposed method on both synthetic problem instances and on problems arising from the discretization of the Poisson equation on varying domains. Our experiments show that by using data-driven preconditioners within the conjugate gradient method we are able to speed up the convergence of the iterative procedure. The code is available at https://github.com/paulhausner/neural-incomplete-factorization.

Submitted 24 October, 2024; v1 submitted 25 May, 2023; originally announced May 2023.

Comments: 26 pages, 8 figures, accepted in Transactions on Machine Learning Research (TMLR)

arXiv:2304.08291 [pdf, other]

Refusion: Enabling Large-Size Realistic Image Restoration with Latent-Space Diffusion Models

Authors: Ziwei Luo, Fredrik K. Gustafsson, Zheng Zhao, Jens Sjölund, Thomas B. Schön

Abstract: This work aims to improve the applicability of diffusion models in realistic image restoration. Specifically, we enhance the diffusion model in several aspects such as network architecture, noise level, denoising steps, training image size, and optimizer/scheduler. We show that tuning these hyperparameters allows us to achieve better performance on both distortion and perceptual scores. We also propose a U-Net based latent diffusion model which performs diffusion in a low-resolution latent space while preserving high-resolution information from the original input for the decoding process. Compared to the previous latent-diffusion model which trains a VAE-GAN to compress the image, our proposed U-Net compression strategy is significantly more stable and can recover highly accurate images without relying on adversarial optimization. Importantly, these modifications allow us to apply diffusion models to various image restoration tasks, including real-world shadow removal, HR non-homogeneous dehazing, stereo super-resolution, and bokeh effect transformation. By simply replacing the datasets and slightly changing the noise network, our model, named Refusion, is able to deal with large-size images (e.g., 6000 x 4000 x 3 in HR dehazing) and produces good results on all the above restoration problems. Our Refusion achieves the best perceptual performance in the NTIRE 2023 Image Shadow Removal Challenge and wins 2nd place overall.

Submitted 17 April, 2023; originally announced April 2023.

Comments: CVPRW 2023. Runner-up method in NTIRE 2023 Image Shadow Removal Challenge. Code is available at https://github.com/Algolzw/image-restoration-sde

arXiv:2301.13748 [pdf, other]

Archetypal Analysis++: Rethinking the Initialization Strategy

Authors: Sebastian Mair, Jens Sjölund

Abstract: Archetypal analysis is a matrix factorization method with convexity constraints. Due to local minima, a good initialization is essential, but frequently used initialization methods yield either sub-optimal starting points or are prone to get stuck in poor local minima. In this paper, we propose archetypal analysis++ (AA++), a probabilistic initialization strategy for archetypal analysis that sequentially samples points based on their influence on the objective function, similar to $k$-means++. In fact, we argue that $k$-means++ already approximates the proposed initialization method. Furthermore, we suggest adapting an efficient Monte Carlo approximation of $k$-means++ to AA++. In an extensive empirical evaluation of 15 real-world data sets of varying sizes and dimensionalities and considering two pre-processing strategies, we show that AA++ almost always outperforms all baselines, including the most frequently used ones.

Submitted 13 May, 2024; v1 submitted 31 January, 2023; originally announced January 2023.

Comments: 27 pages, 17 figures, accepted at the Transactions on Machine Learning Research

Journal ref: Transactions on Machine Learning Research (04/2024)

arXiv:2301.11699 [pdf, other]

Image Restoration with Mean-Reverting Stochastic Differential Equations

Authors: Ziwei Luo, Fredrik K. Gustafsson, Zheng Zhao, Jens Sjölund, Thomas B. Schön

Abstract: This paper presents a stochastic differential equation (SDE) approach for general-purpose image restoration. The key construction consists in a mean-reverting SDE that transforms a high-quality image into a degraded counterpart as a mean state with fixed Gaussian noise. Then, by simulating the corresponding reverse-time SDE, we are able to restore the origin of the low-quality image without relying on any task-specific prior knowledge. Crucially, the proposed mean-reverting SDE has a closed-form solution, allowing us to compute the ground truth time-dependent score and learn it with a neural network. Moreover, we propose a maximum likelihood objective to learn an optimal reverse trajectory that stabilizes the training and improves the restoration results. The experiments show that our proposed method achieves highly competitive performance in quantitative comparisons on image deraining, deblurring, and denoising, setting a new state-of-the-art on two deraining datasets. Finally, the general applicability of our approach is further demonstrated via qualitative results on image super-resolution, inpainting, and dehazing. Code is available at https://github.com/Algolzw/image-restoration-sde.

Submitted 31 May, 2023; v1 submitted 27 January, 2023; originally announced January 2023.

Comments: Accepted by ICML 2023; Project page: https://algolzw.github.io/ir-sde/index.html

arXiv:2301.01236 [pdf, other]

A Tutorial on Parametric Variational Inference

Authors: Jens Sjölund

Abstract: Variational inference uses optimization, rather than integration, to approximate the marginal likelihood, and thereby the posterior, in a Bayesian model. Thanks to advances in computational scalability made in the last decade, variational inference is now the preferred choice for many high-dimensional models and large datasets. This tutorial introduces variational inference from the parametric perspective that dominates these recent developments, in contrast to the mean-field perspective commonly found in other introductory texts.

Submitted 3 January, 2023; originally announced January 2023.

Comments: 9 pages

arXiv:2205.06306 [pdf, other]

doi 10.1109/TSP.2023.3245720

Probabilistic Estimation of Instantaneous Frequencies of Chirp Signals

Authors: Zheng Zhao, Simo Särkkä, Jens Sjölund, Thomas B. Schön

Abstract: We present a continuous-time probabilistic approach for estimating the chirp signal and its instantaneous frequency function when the true forms of these functions are not accessible. Our model represents these functions by non-linearly cascaded Gaussian processes represented as non-linear stochastic differential equations. The posterior distribution of the functions is then estimated with stochastic filters and smoothers. We compute a (posterior) Cramér--Rao lower bound for the Gaussian process model, and derive a theoretical upper bound for the estimation error in the mean squared sense. The experiments show that the proposed method outperforms a number of state-of-the-art methods on synthetic data. We also show that the method works out-of-the-box for two real-world datasets.

Submitted 13 February, 2023; v1 submitted 12 May, 2022; originally announced May 2022.

Comments: Accepted for publication in IEEE Transactions on Signal Processing

arXiv:2203.01921 [pdf, other]

NUQ: A Noise Metric for Diffusion MRI via Uncertainty Discrepancy Quantification

Authors: Shreyas Fadnavis, Jens Sjölund, Anders Eklund, Eleftherios Garyfallidis

Abstract: Diffusion MRI (dMRI) is the only non-invasive technique sensitive to tissue micro-architecture, which can, in turn, be used to reconstruct tissue microstructure and white matter pathways. The accuracy of such tasks is hampered by the low signal-to-noise ratio in dMRI. Today, the noise is characterized mainly by visual inspection of residual maps and estimated standard deviation. However, it is hard to estimate the impact of noise on downstream tasks based only on such qualitative assessments. To address this issue, we introduce a novel metric, Noise Uncertainty Quantification (NUQ), for quantitative image quality analysis in the absence of a ground truth reference image. NUQ uses a recent Bayesian formulation of dMRI models to estimate the uncertainty of microstructural measures. Specifically, NUQ uses the maximum mean discrepancy metric to compute a pooled quality score by comparing samples drawn from the posterior distribution of the microstructure measures. We show that NUQ allows a fine-grained analysis of noise, capturing details that are visually imperceptible. We perform qualitative and quantitative comparisons on real datasets, showing that NUQ generates consistent scores across different denoisers and acquisitions. Lastly, by using NUQ on a cohort of schizophrenics and controls, we quantify the substantial impact of denoising on group differences.

Submitted 4 March, 2022; v1 submitted 3 March, 2022; originally announced March 2022.

arXiv:2202.00264 [pdf, other]

Graph-based Neural Acceleration for Nonnegative Matrix Factorization

Authors: Jens Sjölund, Maria Bånkestad

Abstract: We describe a graph-based neural acceleration technique for nonnegative matrix factorization that builds upon a connection between matrices and bipartite graphs that is well-known in certain fields, e.g., sparse linear algebra, but has not yet been exploited to design graph neural networks for matrix computations. We first consider low-rank factorization more broadly and propose a graph representation of the problem suited for graph neural networks. Then, we focus on the task of nonnegative matrix factorization and propose a graph neural network that interleaves bipartite self-attention layers with updates based on the alternating direction method of multipliers. Our empirical evaluation on synthetic and two real-world datasets shows that we attain substantial acceleration, even though we only train in an unsupervised fashion on smaller synthetic instances.

Submitted 1 February, 2022; originally announced February 2022.

Comments: Authors contributed equally

arXiv:2112.04489 [pdf, other]

Learn2Reg: comprehensive multi-task medical image registration challenge, dataset and evaluation in the era of deep learning

Authors: Alessa Hering, Lasse Hansen, Tony C. W. Mok, Albert C. S. Chung, Hanna Siebert, Stephanie Häger, Annkristin Lange, Sven Kuckertz, Stefan Heldmann, Wei Shao, Sulaiman Vesal, Mirabela Rusu, Geoffrey Sonn, Théo Estienne, Maria Vakalopoulou, Luyi Han, Yunzhi Huang, Pew-Thian Yap, Mikael Brudfors, Yaël Balbastre, Samuel Joutard, Marc Modat, Gal Lifshitz, Dan Raviv, Jinxin Lv, et al. (28 additional authors not shown)

Abstract: Image registration is a fundamental medical image analysis task, and a wide variety of approaches have been proposed. However, only a few studies have comprehensively compared medical image registration approaches on a wide range of clinically relevant tasks. This limits the development of registration methods, the adoption of research advances into practice, and a fair benchmark across competing approaches. The Learn2Reg challenge addresses these limitations by providing a multi-task medical image registration data set for comprehensive characterisation of deformable registration algorithms. A continuous evaluation will be possible at https://learn2reg.grand-challenge.org. Learn2Reg covers a wide range of anatomies (brain, abdomen, and thorax), modalities (ultrasound, CT, MR), availability of annotations, as well as intra- and inter-patient registration evaluation. We established an easily accessible framework for training and validation of 3D registration methods, which enabled the compilation of results of over 65 individual method submissions from more than 20 unique teams. We used a complementary set of metrics, including robustness, accuracy, plausibility, and runtime, enabling unique insight into the current state-of-the-art of medical image registration. This paper describes datasets, tasks, evaluation methods and results of the challenge, as well as results of further analysis of transferability to new datasets, the importance of label supervision, and resulting bias. While no single approach worked best across all tasks, many methodological aspects could be identified that push the performance of medical image registration to new state-of-the-art performance. Furthermore, we demystified the common belief that conventional registration methods have to be much slower than deep-learning-based methods.

Submitted 7 October, 2022; v1 submitted 8 December, 2021; originally announced December 2021.

arXiv:2103.00930 [pdf, other]

doi 10.23919/FUSION49751.2022.9841369

Unsupervised dynamic modeling of medical image transformation

Authors: Niklas Gunnarsson, Peter Kimstrand, Jens Sjölund, Thomas B. Schön

Abstract: Spatiotemporal imaging has applications in e.g. cardiac diagnostics, surgical guidance, and radiotherapy monitoring. In this paper, we explain the temporal motion by identifying the underlying dynamics, only based on the sequential images. Our dynamical model maps the inputs of observed high-dimensional sequential images to a low-dimensional latent space wherein a linear relationship between a hidden state process and the lower-dimensional representation of the inputs holds. For this, we use a conditional variational auto-encoder (CVAE) to nonlinearly map the higher-dimensional image to a lower-dimensional space, wherein we model the dynamics with a linear Gaussian state-space model (LG-SSM). The model, a modified version of the Kalman variational auto-encoder, is end-to-end trainable, and the weights, both in the CVAE and LG-SSM, are simultaneously updated by maximizing the evidence lower bound of the marginal likelihood. In contrast to the original model, we explain the motion with a spatial transformation from one image to another. This results in sharper reconstructions and the possibility of transferring auxiliary information, such as segmentation, through the image sequence. Our experiments, on cardiac ultrasound time series, show that the dynamic model outperforms traditional image registration in execution time while achieving similar performance. Further, our model offers the possibility to impute and extrapolate for missing samples.

Submitted 7 November, 2022; v1 submitted 1 March, 2021; originally announced March 2021.

Comments: published in 2022 25th International Conference on Information Fusion (FUSION)

arXiv:2004.06567 [pdf, other]

Decentralized Differentially Private Segmentation with PATE

Authors: Dominik Fay, Jens Sjölund, Tobias J. Oechtering

Abstract: When it comes to preserving privacy in medical machine learning, two important considerations are (1) keeping data local to the institution and (2) avoiding inference of sensitive information from the trained model. These are often addressed using federated learning and differential privacy, respectively. However, the commonly used Federated Averaging algorithm requires a high degree of synchronization between participating institutions. For this reason, we turn our attention to Private Aggregation of Teacher Ensembles (PATE), where all local models can be trained independently without inter-institutional communication. The purpose of this paper is thus to explore how PATE -- originally designed for classification -- can best be adapted for semantic segmentation. To this end, we build low-dimensional representations of segmentation masks which the student can obtain through low-sensitivity queries to the private aggregator. On the Brain Tumor Segmentation (BraTS 2019) dataset, an Autoencoder-based PATE variant achieves a higher Dice coefficient for the same privacy guarantee than prior work based on noisy Federated Averaging.

Submitted 9 April, 2020; originally announced April 2020.

Comments: Under review for MICCAI 2020

arXiv:2003.10819 [pdf, ps, other]

Registration by tracking for sequential 2D MRI

Authors: Niklas Gunnarsson, Jens Sjölund, Thomas B. Schön

Abstract: Our anatomy is in constant motion. With modern MR imaging it is possible to record this motion in real-time during an ongoing radiation therapy session. In this paper we present an image registration method that exploits the sequential nature of 2D MR images to estimate the corresponding displacement field. The method employs several discriminative correlation filters that independently track specific points. Together with a sparse-to-dense interpolation scheme we can then estimate the displacement field. The discriminative correlation filters are trained online, and our method is modality agnostic. For the interpolation scheme we use a neural network with normalized convolutions that is trained using synthetic diffeomorphic displacement fields. The method is evaluated on a segmented cardiac dataset and when compared to two conventional methods we observe an improved performance. This improvement is especially pronounced when it comes to the detection of larger motions of small objects.

Submitted 24 March, 2020; originally announced March 2020.

Comments: Currently under review for a conference

arXiv:2003.07201 [pdf, ps, other]

The Elliptical Processes: a Family of Fat-tailed Stochastic Processes

Authors: Maria Bånkestad, Jens Sjölund, Jalil Taghia, Thomas Schön

Abstract: We present the elliptical processes -- a family of non-parametric probabilistic models that subsumes the Gaussian process and the Student-t process. This generalization includes a range of new fat-tailed behaviors yet retains computational tractability. We base the elliptical processes on a representation of elliptical distributions as a continuous mixture of Gaussian distributions and derive closed-form expressions for the marginal and conditional distributions. We perform numerical experiments on robust regression using an elliptical process defined by a piecewise constant mixing distribution, and show advantages compared with a Gaussian process. The elliptical processes may become a replacement for Gaussian processes in several settings, including when the likelihood is not Gaussian or when accurate tail modeling is critical.

Submitted 2 December, 2020; v1 submitted 13 March, 2020; originally announced March 2020.

arXiv:1908.06683 [pdf, other]

A unified representation network for segmentation with missing modalities

Authors: Kenneth Lau, Jonas Adler, Jens Sjölund

Abstract: Over the last few years machine learning has demonstrated groundbreaking results in many areas of medical image analysis, including segmentation. A key assumption, however, is that the train- and test distributions match. We study a realistic scenario where this assumption is clearly violated, namely segmentation with missing input modalities. We describe two neural network approaches that can handle a variable number of input modalities. The first is modality dropout: a simple but surprisingly effective modification of the training. The second is the unified representation network: a network architecture that maps a variable number of input modalities into a unified representation that can be used for downstream tasks such as segmentation. We demonstrate that modality dropout makes a standard segmentation network reasonably robust to missing modalities, but that the same network works even better if trained on the unified representation.

Submitted 19 August, 2019; originally announced August 2019.

arXiv:1806.03016 [pdf, other]

doi 10.1002/mp.13440

A linear programming approach to inverse planning in Gamma Knife radiosurgery

Authors: Jens Sjölund, Stella Riad, Marcus Hennix, Håkan Nordström

Abstract: Leksell Gamma Knife is a stereotactic radiosurgery system that allows fine-grained control of the delivered dose distribution. We describe a new inverse planning approach that both resolves shortcomings of earlier approaches and unlocks new capabilities. We fix the isocenter positions and perform sector-duration optimization using linear programming, and study the effect of beam-on time penalization on the trade-off between beam-on time and plan quality. We also describe two techniques that reduce the problem size and thus further reduce the solution time: dualization and representative subsampling. The beam-on time penalization reduces the beam-on time by a factor of 2-3 compared with the naive alternative. Dualization and representative subsampling each lead to optimization time-savings by a factor of 5-20. Overall, we find in a comparison with 75 clinical plans that we can always find plans with similar coverage and better selectivity and beam-on time. In 44 of these, we can even find a plan that also has better gradient index. On a standard GammaPlan workstation, the optimization times ranged from 2.3 to 26 s with a median time of 5.7 s. In conclusion, we present a combination of techniques that enables sector-duration optimization in a clinically feasible time frame.

Submitted 19 December, 2018; v1 submitted 8 June, 2018; originally announced June 2018.

Journal ref: Medical Physics, vol. 46, issue 4 (2019), pages 1533-1544

arXiv:1711.06002 [pdf, other]

doi 10.1016/j.neuroimage.2018.03.059

Bayesian uncertainty quantification in linear models for diffusion MRI

Authors: Jens Sjölund, Anders Eklund, Evren Özarslan, Magnus Herberthson, Maria Bånkestad, Hans Knutsson

Abstract: Diffusion MRI (dMRI) is a valuable tool in the assessment of tissue microstructure. By fitting a model to the dMRI signal it is possible to derive various quantitative features. Several of the most popular dMRI signal models are expansions in an appropriately chosen basis, where the coefficients are determined using some variation of least-squares. However, such approaches lack any notion of uncertainty, which could be valuable in e.g. group analyses. In this work, we use a probabilistic interpretation of linear least-squares methods to recast popular dMRI models as Bayesian ones. This makes it possible to quantify the uncertainty of any derived quantity. In particular, for quantities that are affine functions of the coefficients, the posterior distribution can be expressed in closed-form. We simulated measurements from single- and double-tensor models where the correct values of several quantities are known, to validate that the theoretically derived quantiles agree with those observed empirically. We included results from residual bootstrap for comparison and found good agreement. The validation employed several different models: Diffusion Tensor Imaging (DTI), Mean Apparent Propagator MRI (MAP-MRI) and Constrained Spherical Deconvolution (CSD). We also used in vivo data to visualize maps of quantitative features and corresponding uncertainties, and to show how our approach can be used in a group analysis to downweight subjects with high uncertainty.
In summary, we convert successful linear models for dMRI signal estimation to probabilistic models, capable of accurate uncertainty quantification.

Submitted 19 February, 2018; v1 submitted 16 November, 2017; originally announced November 2017.

Comments: Added results from a group analysis and a comparison with residual bootstrap

Journal ref: NeuroImage, 2018; 175:272-285

arXiv:1612.06741 [pdf]

doi 10.1371/journal.pone.0214238

Whole-brain diffusional variance decomposition (DIVIDE): Demonstration of technical feasibility at clinical MRI systems

Authors: Filip Szczepankiewicz, Jens Sjölund, Freddy Ståhlberg, Jimmy Lätt, Markus Nilsson

Abstract: Purpose: To assess the technical feasibility of whole-brain diffusional variance decomposition (DIVIDE) based on q-space trajectory encoding (QTE) at clinical MRI systems with varying performance. DIVIDE is used to separate diffusional heterogeneity into components that arise due to isotropic and anisotropic tissue structures. Methods: We designed imaging protocols for DIVIDE using numerically optimized gradient waveforms for diffusion encoding. Imaging was performed at systems with magnetic field strengths between 1.5 and 7 T, and gradient amplitudes between 33 and 80 mT/m. Technical feasibility was assessed from signal characteristics and quality of parameter maps in a single volunteer scanned at all systems. Results: The technical feasibility of QTE and DIVIDE was demonstrated at all systems. The system with the highest performance allowed whole-brain DIVIDE at 2 mm isotropic voxels. The system with the lowest performance required a spatial resolution of 2.5 × 2.5 × 4 mm³ to yield a sufficient signal-to-noise ratio. Conclusions: Whole-brain DIVIDE based on QTE is feasible at the investigated MRI systems. This demonstration indicates that tissue features beyond those accessible by conventional diffusion encoding may be explored on a wide range of MRI systems.

Submitted 20 December, 2016; originally announced December 2016.

Comments: 13 pages, 7 figures

Journal ref: PLoS ONE 14(3): e0214238, 2019

arXiv:1611.02869 [pdf, other]

doi 10.1109/ISBI.2017.7950634

Gaussian process regression can turn non-uniform and undersampled diffusion MRI data into diffusion spectrum imaging

Authors: Jens Sjölund, Anders Eklund, Evren Özarslan, Hans Knutsson

Abstract: We propose to use Gaussian process regression to accurately estimate the diffusion MRI signal at arbitrary locations in q-space. By estimating the signal on a grid, we can do synthetic diffusion spectrum imaging: reconstructing the ensemble averaged propagator (EAP) by an inverse Fourier transform. We also propose an alternative reconstruction method guaranteeing a nonnegative EAP that integrates to unity. The reconstruction is validated on data simulated from two Gaussians at various crossing angles. Moreover, we demonstrate on non-uniformly sampled in vivo data that the method is far superior to linear interpolation, and allows a drastic undersampling of the data with only a minor loss of accuracy. We envision the method as a potential replacement for standard diffusion spectrum imaging, in particular when acquisition time is limited.

Submitted 9 November, 2016; originally announced November 2016.

Comments: 5 pages

Journal ref: 2017 IEEE 14th International Symposium on Biomedical Imaging (ISBI 2017)

Showing 1–41 of 41 results for author: Sjölund, J