-
Dynamical Low-Rank Compression of Neural Networks with Robustness under Adversarial Attacks
Authors:
Steffen Schotthöfer,
H. Lexie Yang,
Stefan Schnake
Abstract:
Deployment of neural networks on resource-constrained devices demands models that are both compact and robust to adversarial inputs. However, compression and adversarial robustness often conflict. In this work, we introduce a dynamical low-rank training scheme enhanced with a novel spectral regularizer that controls the condition number of the low-rank core in each layer. This approach mitigates the sensitivity of compressed models to adversarial perturbations without sacrificing clean accuracy. The method is model- and data-agnostic, computationally efficient, and supports rank adaptivity to automatically compress the network at hand. Extensive experiments across standard architectures, datasets, and adversarial attacks show the regularized networks can achieve over 94% compression while recovering or improving adversarial accuracy relative to uncompressed baselines.
Submitted 12 May, 2025;
originally announced May 2025.
-
CaloChallenge 2022: A Community Challenge for Fast Calorimeter Simulation
Authors:
Claudius Krause,
Michele Faucci Giannelli,
Gregor Kasieczka,
Benjamin Nachman,
Dalila Salamani,
David Shih,
Anna Zaborowska,
Oz Amram,
Kerstin Borras,
Matthew R. Buckley,
Erik Buhmann,
Thorsten Buss,
Renato Paulo Da Costa Cardoso,
Anthony L. Caterini,
Nadezda Chernyavskaya,
Federico A. G. Corchia,
Jesse C. Cresswell,
Sascha Diefenbacher,
Etienne Dreyer,
Vijay Ekambaram,
Engin Eren,
Florian Ernst,
Luigi Favaro,
Matteo Franchini,
Frank Gaede, et al. (44 additional authors not shown)
Abstract:
We present the results of the "Fast Calorimeter Simulation Challenge 2022" - the CaloChallenge. We study state-of-the-art generative models on four calorimeter shower datasets of increasing dimensionality, ranging from a few hundred voxels to a few tens of thousands of voxels. The 31 individual submissions span a wide range of current popular generative architectures, including Variational AutoEncoders (VAEs), Generative Adversarial Networks (GANs), Normalizing Flows, Diffusion models, and models based on Conditional Flow Matching. We compare all submissions in terms of quality of generated calorimeter showers, as well as shower generation time and model size. To assess the quality we use a broad range of different metrics including differences in 1-dimensional histograms of observables, KPD/FPD scores, AUCs of binary classifiers, and the log-posterior of a multiclass classifier. The results of the CaloChallenge provide the most complete and comprehensive survey of cutting-edge approaches to calorimeter fast simulation to date. In addition, our work provides a uniquely detailed perspective on the important problem of how to evaluate generative models. As such, the results presented here should be applicable for other domains that use generative AI and require fast and faithful generation of samples in a large phase space.
Submitted 28 October, 2024;
originally announced October 2024.
-
On high-order/low-order and micro-macro methods for implicit time-stepping of the BGK model
Authors:
Cory Hauck,
M. Paul Laiu,
Stefan Schnake
Abstract:
In this paper, a high-order/low-order (HOLO) method is combined with a micro-macro (MM) decomposition to accelerate iterative solvers in fully implicit time-stepping of the BGK equation for gas dynamics. The MM formulation represents a kinetic distribution as the sum of a local Maxwellian and a perturbation. In highly collisional regimes, the perturbation away from initial and boundary layers is small and can be compressed to reduce the overall storage cost of the distribution. The convergence behavior of the MM methods, the usual HOLO method, and the standard source iteration method is analyzed on a linear BGK model. Both the HOLO and MM methods are implemented using a discontinuous Galerkin (DG) discretization in phase space, which naturally preserves the consistency between high- and low-order models required by the HOLO approach. The accuracy and performance of these methods are compared on the Sod shock tube problem and a sudden wall heating boundary layer problem. Overall, the results demonstrate the robustness of the MM and HOLO approaches and illustrate the compression benefits enabled by the MM formulation when the kinetic distribution is near equilibrium.
Submitted 1 October, 2024;
originally announced October 2024.
-
CaloPointFlow II Generating Calorimeter Showers as Point Clouds
Authors:
Simon Schnake,
Dirk Krücker,
Kerstin Borras
Abstract:
The simulation of calorimeter showers presents a significant computational challenge, impacting the efficiency and accuracy of particle physics experiments. While generative ML models have been effective in enhancing and accelerating the conventional physics simulation processes, their application has predominantly been constrained to fixed detector readout geometries. With CaloPointFlow we have presented one of the first models that can generate a calorimeter shower as a point cloud. This study describes CaloPointFlow II, which exhibits several significant improvements compared to its predecessor. This includes a novel dequantization technique, referred to as CDF-Dequantization, and a normalizing flow architecture, referred to as DeepSet-Flow. The new model was evaluated with the Fast Calorimeter Simulation Challenge (CaloChallenge) Dataset II and III.
Submitted 27 May, 2024; v1 submitted 23 March, 2024;
originally announced March 2024.
-
Sparse-grid Discontinuous Galerkin Methods for the Vlasov-Poisson-Lenard-Bernstein Model
Authors:
Stefan Schnake,
Coleman Kendrick,
Eirik Endeve,
Miroslav Stoyanov,
Steven Hahn,
Cory D Hauck,
David L Green,
Phil Snyder,
John Canik
Abstract:
Sparse-grid methods have recently gained interest in reducing the computational cost of solving high-dimensional kinetic equations. In this paper, we construct adaptive and hybrid sparse-grid methods for the Vlasov-Poisson-Lenard-Bernstein (VPLB) model. This model has applications to plasma physics and is simulated in two reduced geometries: a 0x3v space homogeneous geometry and a 1x3v slab geometry. We use the discontinuous Galerkin (DG) method as a base discretization due to its high-order accuracy and ability to preserve important structural properties of partial differential equations. We utilize a multiwavelet basis expansion to determine the sparse-grid basis and the adaptive mesh criteria. We analyze the proposed sparse-grid methods on a suite of three test problems by computing the savings afforded by sparse-grids in comparison to standard solutions of the DG method. The results are obtained using the adaptive sparse-grid discretization library ASGarD.
Submitted 9 February, 2024;
originally announced February 2024.
-
A semi-implicit dynamical low-rank discontinuous Galerkin method for space homogeneous kinetic equations. Part I: emission and absorption
Authors:
Peimeng Yin,
Eirik Endeve,
Cory D. Hauck,
Stefan R. Schnake
Abstract:
Dynamical low-rank approximation (DLRA) is an emerging tool for reducing computational costs and provides memory savings when solving high-dimensional problems. In this work, we propose and analyze a semi-implicit dynamical low-rank discontinuous Galerkin (DLR-DG) method for the space homogeneous kinetic equation with a relaxation operator, modeling the emission and absorption of particles by a background medium. Both DLRA and the DG scheme can be formulated as Galerkin equations. To ensure their consistency, a weighted DLRA is introduced so that the resulting DLR-DG solution is a solution to the fully discrete DG scheme in a subspace of the classical DG solution space. Similar to the classical DG method, we show that the proposed DLR-DG method is well-posed. We also identify conditions such that the DLR-DG solution converges to the equilibrium. Numerical results are presented to demonstrate the theoretical findings.
Submitted 10 August, 2023;
originally announced August 2023.
-
JetFlow: Generating Jets with Conditioned and Mass Constrained Normalising Flows
Authors:
Benno Käch,
Dirk Krücker,
Isabell Melzer-Pellmann,
Moritz Scham,
Simon Schnake,
Alexi Verney-Provatas
Abstract:
Fast data generation based on Machine Learning has become a major research topic in particle physics. This is mainly because the Monte Carlo simulation approach is computationally challenging for future colliders, which will have a significantly higher luminosity. The generation of collider data is similar to point cloud generation with complex correlations between the points.
In this study, the generation of jets with up to 30 constituents with Normalising Flows using Rational Quadratic Spline coupling layers is investigated. Without conditioning on the jet mass, our Normalising Flows are unable to model all correlations in data correctly, which is evident when comparing the invariant jet mass distributions between ground truth and generated data. Using the invariant mass as a condition for the coupling transformation enhances the performance on all tracked metrics. In addition, we demonstrate how to sample the original mass distribution by interpolating the empirical cumulative distribution function. Similarly, the variable number of constituents is taken care of by introducing an additional condition on the number of constituents in the jet.
Furthermore, we study the usefulness of including an additional mass constraint in the loss term. On the JetNet dataset, our model shows state-of-the-art performance combined with fast and stable training.
Submitted 29 November, 2022; v1 submitted 24 November, 2022;
originally announced November 2022.
-
A Predictor-Corrector Strategy for Adaptivity in Dynamical Low-Rank Approximations
Authors:
Cory Hauck,
Stefan Schnake
Abstract:
In this paper, we present a predictor-corrector strategy for constructing rank-adaptive dynamical low-rank approximations (DLRAs) of matrix-valued ODE systems. The strategy is a compromise between (i) low-rank step-truncation approaches that alternately evolve and compress solutions and (ii) strict DLRA approaches that augment the low-rank manifold using subspaces generated locally in time by the DLRA integrator. The strategy is based on an analysis of the error between a forward temporal update into the ambient full-rank space, which is typically computed in a step-truncation approach before re-compressing, and the standard DLRA update, which is forced to live in a low-rank manifold. We use this error, without requiring its full-rank representation, to correct the DLRA solution. A key ingredient for maintaining a low-rank representation of the error is a randomized singular value decomposition (SVD), which introduces some degree of stochastic variability into the implementation. The strategy is formulated and implemented in the context of discontinuous Galerkin spatial discretizations of partial differential equations and applied to several versions of DLRA methods found in the literature, as well as a new variant. Numerical experiments comparing the predictor-corrector strategy to other methods demonstrate robustness to overcome shortcomings of step-truncation or strict DLRA approaches: the former may require more memory than is strictly needed while the latter may miss transient solution features that cannot be recovered. The effect of randomization, tolerances, and other implementation parameters is also explored.
Submitted 7 September, 2022; v1 submitted 1 September, 2022;
originally announced September 2022.
-
Asymptotic Preserving Discontinuous Galerkin Methods for a Linear Boltzmann Semiconductor Model
Authors:
Victor DeCaria,
Cory Hauck,
Stefan Schnake
Abstract:
A key property of the linear Boltzmann semiconductor model is that as the collision frequency tends to infinity, the phase space density $f = f(x,v,t)$ converges to an isotropic function $M(v)ρ(x,t)$, called the drift-diffusion limit, where $M$ is a Maxwellian and the physical density $ρ$ satisfies a second-order parabolic PDE known as the drift-diffusion equation. Numerical approximations that mirror this property are said to be asymptotic preserving. In this paper we build two discontinuous Galerkin methods for the semiconductor model: one with the standard upwinding flux and the other with an $\varepsilon$-scaled Lax-Friedrichs flux, where 1/$\varepsilon$ is the scale of the collision frequency. We show that these schemes are uniformly stable in $\varepsilon$ and are asymptotic preserving. In particular, we discuss what properties the discrete Maxwellian must satisfy in order for the schemes to converge in $\varepsilon$ to an accurate $h$-approximation of the drift-diffusion limit. Discrete versions of the drift-diffusion equation and error estimates in several norms with respect to $\varepsilon$ and the spatial resolution are also included.
Submitted 19 April, 2023; v1 submitted 20 June, 2022;
originally announced June 2022.
-
$H^1$-norm error estimate for a nonstandard finite element approximation of second-order linear elliptic PDEs in non-divergence form
Authors:
Xiaobing Feng,
Stefan Schnake
Abstract:
This paper establishes the optimal $H^1$-norm error estimate for a nonstandard finite element method for approximating $H^2$ strong solutions of second order linear elliptic PDEs in non-divergence form with continuous coefficients. To circumvent the difficulty of lacking an effective duality argument for this class of PDEs, a new analysis technique is introduced; the crux of it is to establish an $H^1$-norm stability estimate for the finite element approximation operator which mimics a similar estimate for the underlying PDE operator recently established by the authors and its proof is based on a freezing coefficient technique and a topological argument. Moreover, both the $H^1$-norm stability and error estimate also hold for the linear finite element method.
Submitted 30 September, 2019;
originally announced September 2019.
-
Counting odd numbers in truncations of Pascal's triangle
Authors:
Robert G. Donnelly,
Molly W. Dunkum,
Courtney George,
Stefan Schnake
Abstract:
A "truncation" of Pascal's triangle is a triangular array of numbers that satisfies the usual Pascal recurrence but with a boundary condition that declares some terminal set of numbers along each row of the array to be zero. Presented here is a family of natural truncations of Pascal's triangle that generalize a kind of Catalan triangle. The numbers in each array are realized as differences of binomial coefficients, as counts of certain lattice paths and tableaux, and as entries of representing matrices for certain linear transformations of polynomial spaces. Lucas's theorem is applied to determine precisely those truncations for which the number of odd entries on each row is a power of two.
Submitted 25 July, 2018; v1 submitted 21 July, 2018;
originally announced July 2018.
-
Analysis of the Vanishing Moment Method and its Finite Element Approximations for Second-order Linear Elliptic PDEs in Non-divergence Form
Authors:
Xiaobing Feng,
Thomas Lewis,
Stefan Schnake
Abstract:
This paper is concerned with continuous and discrete approximations of $W^{2,p}$ strong solutions of second-order linear elliptic partial differential equations (PDEs) in non-divergence form. The continuous approximation of these equations is achieved through the Vanishing Moment Method (VMM), which adds a small biharmonic term to the PDE. Unlike the original second-order equation, the new fourth-order PDE is a natural fit for Galerkin-type methods, since its highest-order term is in divergence form. The well-posedness of the weak form of the perturbed fourth-order equation is shown, as well as error estimates for approximating the strong solution of the original second-order PDE. A $C^1$ finite element method is then proposed for the fourth-order equation, and the existence and uniqueness of its solutions as well as optimal error estimates in the $H^2$ norm are shown. Lastly, numerical tests are given to show the validity of the method.
Submitted 26 February, 2019; v1 submitted 17 January, 2018;
originally announced January 2018.
-
A Discontinuous Ritz Method for a Class of Calculus of Variations Problems
Authors:
Xiaobing Feng,
Stefan Schnake
Abstract:
This paper develops an analogue (or counterpart) to discontinuous Galerkin (DG) methods for approximating a general class of calculus of variations problems. The proposed method, called the discontinuous Ritz (DR) method, constructs a numerical solution by minimizing a discrete energy over DG function spaces. The discrete energy includes standard penalization terms as well as the DG finite element (DG-FE) numerical derivatives developed recently by Feng, Lewis, and Neilan in [Feng2013]. It is proved that the proposed DR method converges and that the DG-FE numerical derivatives exhibit a compactness property which is desirable and crucial for applying the proposed DR method to problems with more complex energy functionals. Numerical tests are provided on the classical $p$-Laplace problem to gauge the performance of the proposed DR method.
Submitted 17 January, 2018; v1 submitted 13 September, 2017;
originally announced September 2017.
-
An enhanced finite element method for a class of variational problems exhibiting the Lavrentiev gap phenomenon
Authors:
Xiaobing Feng,
Stefan Schnake
Abstract:
This paper develops an enhanced finite element method for approximating a class of variational problems which exhibit the Lavrentiev gap phenomenon in the sense that the minimum values of the energy functional have a nontrivial gap when the functional is minimized on spaces $W^{1,1}$ and $W^{1,\infty}$. To remedy the standard finite element method, which fails to converge for such variational problems, a simple and effective cut-off procedure is utilized to design the (enhanced finite element) discrete energy functional. In essence the proposed discrete energy functional curbs the gap phenomenon by capping the derivatives of its input on a scale of $O(h^{-α})$ (where $h$ denotes the mesh size) for some positive constant $α$. A sufficient condition is proposed for determining the problem-dependent parameter $α$. Extensive 1-D and 2-D numerical experiment results are provided to show the convergence behavior and the performance of the proposed enhanced finite element method.
Submitted 10 October, 2016;
originally announced October 2016.
-
Interior Penalty Discontinuous Galerkin Methods for Second Order Linear Non-Divergence Form Elliptic PDEs
Authors:
Xiaobing Feng,
Michael Neilan,
Stefan Schnake
Abstract:
This paper develops interior penalty discontinuous Galerkin (IP-DG) methods to approximate $W^{2,p}$ strong solutions of second order linear elliptic partial differential equations (PDEs) in non-divergence form with continuous coefficients. The proposed IP-DG methods are closely related to the IP-DG methods for advection-diffusion equations, and they are easy to implement on existing standard IP-DG software platforms. It is proved that the proposed IP-DG methods have unique solutions and converge with optimal rate to the $W^{2,p}$ strong solution in a discrete $W^{2,p}$-norm. The crux of the analysis is to establish a DG discrete counterpart of the Calderon-Zygmund estimate and to adapt a freezing coefficient technique used for the PDE analysis at the discrete level. As a byproduct of our analysis, we also establish broken $W^{1,p}$-norm error estimates for IP-DG approximations of constant coefficient elliptic PDEs. Numerical experiments are provided to gauge the performance of the proposed IP-DG methods and to validate the theoretical convergence results.
Submitted 13 May, 2016;
originally announced May 2016.