-
Geometric Jensen-Shannon Divergence Between Gaussian Measures On Hilbert Space
Authors:
Minh Ha Quang,
Frank Nielsen
Abstract:
This work studies the Geometric Jensen-Shannon divergence, based on the notion of geometric mean of probability measures, in the setting of Gaussian measures on an infinite-dimensional Hilbert space. On the set of all Gaussian measures equivalent to a fixed one, we present a closed form expression for this divergence that directly generalizes the finite-dimensional version. Using the notion of Log-Determinant divergences between positive definite unitized trace class operators, we then define a Regularized Geometric Jensen-Shannon divergence that is valid for any pair of Gaussian measures and that recovers the exact Geometric Jensen-Shannon divergence between two equivalent Gaussian measures when the regularization parameter tends to zero.
Submitted 12 June, 2025;
originally announced June 2025.
-
Weighting operators for sparsity regularization
Authors:
Ole Løseth Elvetun,
Bjørn Fredrik Nielsen,
Niranjana Sudheer
Abstract:
Standard regularization methods typically favor solutions which are in, or close to, the orthogonal complement of the null space of the forward operator/matrix $\mathsf{A}$. This particular biasedness might not be desirable in applications and can lead to severe challenges when $\mathsf{A}$ is non-injective.
We have therefore, in a series of papers, investigated how to "remedy" this fact, relative to a chosen basis and in a certain mathematical sense: Based on a weighting procedure, it turns out that it is possible to modify both Tikhonov and sparsity regularization such that each member of the chosen basis can be almost perfectly recovered from its image under $\mathsf{A}$. In particular, we have studied this problem for the task of using boundary data to identify the source term in an elliptic PDE. However, this weighting procedure involves $\mathsf{A}^\dagger \mathsf{A}$, where $\mathsf{A}^\dagger$ denotes the pseudo inverse of $\mathsf{A}$, and can thus be CPU-demanding and lead to undesirable error amplification.
We therefore, in this paper, study alternative weighting approaches and prove that some of the recovery results established for the methodology involving $\mathsf{A}$ hold for a broader class of weighting schemes. In fact, it turns out that "any" linear operator $\mathsf{B}$ has an associated proper weighting defined in terms of images under $\mathsf{B}\mathsf{A}$. We also present a series of numerical experiments, employing different choices of $\mathsf{B}$.
Submitted 8 May, 2025;
originally announced May 2025.
-
What is an inductive mean?
Authors:
Frank Nielsen
Abstract:
An inductive mean is a mean defined as a limit of a convergent sequence of other means. Historically, this notion of inductive means obtained as limits of sequences was pioneered independently by Lagrange and Gauss for defining the arithmetic-geometric mean. In this note, we first explain several generalizations of the scalar geometric mean to symmetric positive-definite matrices, and then present several inductive mean mechanisms for sets of symmetric positive-definite matrices.
Submitted 21 October, 2024;
originally announced October 2024.
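As a rough illustration of the scalar case mentioned in this abstract, the arithmetic-geometric mean can be sketched as the common limit of two coupled sequences. The function name `agm`, the stopping tolerance, and the iteration cap are assumptions made for this sketch and are not taken from the paper.

```python
import math


def agm(a: float, b: float, tol: float = 1e-15, max_iter: int = 100) -> float:
    """Arithmetic-geometric mean of two positive numbers (illustrative sketch)."""
    if a <= 0 or b <= 0:
        raise ValueError("this sketch only handles positive inputs")
    for _ in range(max_iter):
        # Stop once the two sequences agree to relative tolerance tol.
        if abs(a - b) <= tol * max(a, b):
            break
        # Replace the pair by its arithmetic mean and its geometric mean.
        a, b = (a + b) / 2.0, math.sqrt(a * b)
    return (a + b) / 2.0


if __name__ == "__main__":
    print(agm(1.0, 2.0))
```

The limit always lies between the geometric and the arithmetic mean of the inputs, and the iteration typically converges in a handful of steps.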
-
Estimating neural connection strengths from firing intervals
Authors:
Maren Bråthen Kristoffersen,
Bjørn Fredrik Nielsen,
Susanne Solem
Abstract:
We propose and analyse a procedure for using a standard activity-based neuron network model and firing data to compute the effective connection strengths between neurons in a network. We assume a Heaviside response function, that the external inputs are given and that the initial state of the neural activity is known. The associated forward operator for this problem, which maps given connection strengths to the time intervals of firing, is highly nonlinear. Nevertheless, it turns out that the inverse problem of determining the connection strengths can be solved in a rather transparent manner, only employing standard mathematical tools. In fact, it is sufficient to solve a system of decoupled ODEs, which yields a linear system of algebraic equations for determining the connection strengths. The nature of the inverse problem is investigated by studying some mathematical properties of the aforementioned linear system and by a series of numerical experiments. Finally, under an assumption preventing the effective contribution of the network to each neuron from staying at zero, we prove that the involved forward operator is continuous. Sufficient criteria on the external input ensuring that the needed assumption holds are also provided.
Submitted 11 February, 2025; v1 submitted 11 September, 2024;
originally announced September 2024.
-
Fictitious null spaces for improving the solution of injective inverse problems
Authors:
Ole Løseth Elvetun,
Kim Knudsen,
Bjørn Fredrik Nielsen
Abstract:
For linear ill-posed problems with nontrivial null spaces, Tikhonov regularization and truncated singular value decomposition (TSVD) typically yield solutions that are close to the minimum norm solution. Such a bias is not always desirable, and we have therefore in a series of papers developed a weighting procedure which produces solutions with a different and controlled bias. This methodology can also conveniently be invoked when sparsity regularization is employed.
The purpose of the present work is to study the potential use of this weighting applied to injective operators. The image under a compact operator of the singular vectors/functions associated with very small singular values will be almost zero. Consequently, one may regard these singular vectors/functions to constitute a basis for a fictitious null space that allows us to mimic the previous weighting procedure. It turns out that this regularization by weighting can improve the solution of injective inverse problems compared with more traditional approaches.
We present some analysis of this methodology and exemplify it numerically, using sparsity regularization, for three PDE-driven inverse problems: the inverse heat conduction problem, the Cauchy problem for Laplace's equation, and the (linearized) Electrical Impedance Tomography problem with experimental data.
Submitted 9 December, 2024; v1 submitted 30 August, 2024;
originally announced August 2024.
-
Optimal Transport with Tempered Exponential Measures
Authors:
Ehsan Amid,
Frank Nielsen,
Richard Nock,
Manfred K. Warmuth
Abstract:
In the field of optimal transport, two prominent subfields face each other: (i) unregularized optimal transport, "à-la-Kantorovich", which leads to extremely sparse plans but with algorithms that scale poorly, and (ii) entropic-regularized optimal transport, "à-la-Sinkhorn-Cuturi", which gets near-linear approximation algorithms but leads to maximally un-sparse plans. In this paper, we show that an extension of the latter to tempered exponential measures, a generalization of exponential families with indirect measure normalization, gets to a very convenient middle ground, with both very fast approximation algorithms and sparsity, which is under control up to sparsity patterns. In addition, our formulation fits naturally in the unbalanced optimal transport problem setting.
Submitted 16 February, 2024; v1 submitted 7 September, 2023;
originally announced September 2023.
-
Identifying the source term in the potential equation with weighted sparsity regularization
Authors:
Ole Løseth Elvetun,
Bjørn Fredrik Nielsen
Abstract:
We explore the possibility for using boundary measurements to recover a sparse source term f(x) in the potential equation. Employing weighted sparsity regularization and standard results for subgradients, we derive simple-to-check criteria which assure that a number of sinks (f(x) < 0) and sources (f(x) > 0) can be identified. Furthermore, we present two cases for which these criteria always are fulfilled: a) well-separated sources and sinks, and b) many sources or sinks located at the boundary plus one interior source/sink. Our approach is such that the linearity of the associated forward operator is preserved in the discrete formulation. The theory is therefore conveniently developed in terms of Euclidean spaces, and it can be applied to a wide range of problems. In particular, it can be applied to both isotropic and anisotropic cases. We present a series of numerical experiments. This work is motivated by the observation that standard methods typically suggest that internal sinks and sources are located close to the boundary.
Submitted 3 November, 2023; v1 submitted 8 December, 2022;
originally announced December 2022.
-
Variational Representations of Annealing Paths: Bregman Information under Monotonic Embedding
Authors:
Rob Brekelmans,
Frank Nielsen
Abstract:
Markov Chain Monte Carlo methods for sampling from complex distributions and estimating normalization constants often simulate samples from a sequence of intermediate distributions along an annealing path, which bridges between a tractable initial distribution and a target density of interest. Prior works have constructed annealing paths using quasi-arithmetic means, and interpreted the resulting intermediate densities as minimizing an expected divergence to the endpoints. To analyze these variational representations of annealing paths, we extend known results showing that the arithmetic mean over arguments minimizes the expected Bregman divergence to a single representative point. In particular, we obtain an analogous result for quasi-arithmetic means, when the inputs to the Bregman divergence are transformed under a monotonic embedding function. Our analysis highlights the interplay between quasi-arithmetic means, parametric families, and divergence functionals using the rho-tau representational Bregman divergence framework, and associates common divergence functionals with intermediate densities along an annealing path.
Submitted 6 February, 2024; v1 submitted 15 September, 2022;
originally announced September 2022.
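The known result that this abstract extends, namely that the arithmetic mean minimizes the expected Bregman divergence to a single representative point, can be checked numerically in a scalar setting. The sketch below assumes the generator F(t) = t log t - t and a small hand-picked sample; both choices, and all function names, are illustrative assumptions rather than anything taken from the paper.

```python
import math


def bregman_kl(x: float, y: float) -> float:
    """Scalar Bregman divergence B_F(x : y) for F(t) = t*log(t) - t, with x, y > 0."""
    return x * math.log(x / y) - x + y


def expected_divergence(points, c: float) -> float:
    """Average divergence from the sample points to a representative point c."""
    return sum(bregman_kl(x, c) for x in points) / len(points)


points = [0.5, 1.0, 2.5, 4.0]
mean = sum(points) / len(points)
best = expected_divergence(points, mean)

if __name__ == "__main__":
    # Any representative other than the arithmetic mean gives a larger value.
    for c in (0.5 * mean, 0.9 * mean, mean, 1.1 * mean, 2.0 * mean):
        print(c, expected_divergence(points, c))
```

The representative point sits in the second argument of the divergence here; with the point in the first argument the minimizer is in general a different mean.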
-
Box constraints and weighted sparsity regularization for identifying sources in elliptic PDEs
Authors:
Ole Løseth Elvetun,
Bjørn Fredrik Nielsen
Abstract:
We explore the possibility for using boundary data to identify sources in elliptic PDEs. Even though the associated forward operator has a large null space, it turns out that box constraints, combined with weighted sparsity regularization, can enable rather accurate recovery of sources with constant magnitude/strength. In addition, for sources with varying strength, the support of the inverse solution will be a subset of the support of the true source. We present both an analysis of the problem and a series of numerical experiments. Our work only addresses discretized problems.
The reason for introducing the weighting procedure is that standard (unweighted) sparsity regularization fails to provide adequate results for the source identification task considered in this paper. This investigation is also motivated by applications, e.g., recovering mass distributions from measurements of gravitational fields and inverse scattering. We develop the methodology and the analysis in terms of Euclidean spaces, and our results can therefore be applied to many problems. For example, the results are equally applicable to models involving the screened Poisson equation as to models using the Helmholtz equation, with both large and small wave numbers.
Submitted 3 March, 2023; v1 submitted 13 June, 2022;
originally announced June 2022.
-
A note on the $f$-divergences between multivariate location-scale families with either prescribed scale matrices or location parameters
Authors:
Frank Nielsen,
Kazuki Okamura
Abstract:
We first extend the result of Ali and Silvey [Journal of the Royal Statistical Society: Series B, 28.1 (1966), 131-142], who first reported that any $f$-divergence between two isotropic multivariate Gaussian distributions amounts to a corresponding strictly increasing scalar function of their corresponding Mahalanobis distance. We report sufficient conditions on the standard probability density function generating a multivariate location family and the function generator $f$ in order to generalize this result. This property is useful in practice as it allows one to compare exactly $f$-divergences between densities of these location families via their corresponding Mahalanobis distances, even when the $f$-divergences are not available in closed form, as is the case, for example, for the Jensen-Shannon divergence or the total variation distance between densities of a normal location family. Second, we consider $f$-divergences between densities of multivariate scale families: We recall Ali and Silvey's result that for normal scale families we get matrix spectral divergences, and we extend this result to densities of a scale family.
Submitted 30 May, 2022; v1 submitted 22 April, 2022;
originally announced April 2022.
-
Numerical approximation of the spectrum of self-adjoint continuously invertible operators
Authors:
Tomáš Gergelits,
Bjørn Fredrik Nielsen,
Zdeněk Strakoš
Abstract:
This paper deals with the generalized spectrum of continuously invertible linear operators defined on infinite dimensional Hilbert spaces. More precisely, we consider two bounded, coercive, and self-adjoint operators $\bc{A, B}: V\mapsto V^{\#}$, where $V^{\#}$ denotes the dual of $V$, and investigate the conditions under which the whole spectrum of $\bc{B}^{-1}\bc{A}:V\mapsto V$ can be approximated to an arbitrary accuracy by the eigenvalues of the finite dimensional discretization $\bc{B}_n^{-1}\bc{A}_n$. Since $\bc{B}^{-1}\bc{A}$ is continuously invertible, such an investigation cannot use the concept of uniform (normwise) convergence, and it relies instead on the pointwise (strong) convergence of $\bc{B}_n^{-1}\bc{A}_n$ to $\bc{B}^{-1}\bc{A}$.
The paper is motivated by operator preconditioning which is employed in the numerical solution of boundary value problems. In this context, $\bc{A}, \bc{B}: H_0^1(Ω) \mapsto H^{-1}(Ω)$ are the standard integral/functional representations of the differential operators $ -\nabla \cdot (k(x)\nabla u)$ and $-\nabla \cdot (g(x)\nabla u)$, respectively, and $k(x)$ and $g(x)$ are scalar coefficient functions. The investigated question differs from the eigenvalue problem studied in the numerical PDE literature which is based on the approximation of the eigenvalues within the framework of compact operators.
This work follows the path started by the two recent papers published in [SIAM J. Numer. Anal., 57 (2019), pp.~1369-1394 and 58 (2020), pp.~2193-2211] and addresses one of the open questions formulated at the end of the second paper.
Submitted 1 March, 2021;
originally announced March 2021.
-
On $f$-divergences between Cauchy distributions
Authors:
Frank Nielsen,
Kazuki Okamura
Abstract:
We prove that the $f$-divergences between univariate Cauchy distributions are all symmetric, and can be expressed as strictly increasing scalar functions of the symmetric chi-squared divergence. We report the corresponding scalar functions for the total variation distance, the Kullback-Leibler divergence, the squared Hellinger divergence, and the Jensen-Shannon divergence among others. Next, we give conditions to expand the $f$-divergences as converging infinite series of higher-order power chi divergences, and illustrate the criterion for converging Taylor series expressing the $f$-divergences between Cauchy distributions. We then show that the symmetric property of $f$-divergences holds for multivariate location-scale families with prescribed matrix scales provided that the standard density is even, which includes the cases of the multivariate normal and Cauchy families. However, the $f$-divergences between multivariate Cauchy densities with different scale matrices are shown to be asymmetric. Finally, we present several metrizations of $f$-divergences between univariate Cauchy distributions and further report geometric embedding properties of the Kullback-Leibler divergence.
Submitted 7 December, 2021; v1 submitted 29 January, 2021;
originally announced January 2021.
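The symmetry claim for univariate Cauchy distributions can be probed numerically for one $f$-divergence, the Kullback-Leibler divergence. The sketch below is only a sanity check under stated assumptions: the substitution x = tan(theta), the midpoint rule, the grid size, and the example parameters are choices made here and are not taken from the paper.

```python
import math


def cauchy_pdf(x: float, loc: float, scale: float) -> float:
    """Density of the univariate Cauchy distribution with the given location and scale."""
    return scale / (math.pi * (scale * scale + (x - loc) ** 2))


def kl_cauchy_numeric(l1: float, s1: float, l2: float, s2: float, n: int = 20000) -> float:
    """Approximate KL(p1 : p2) with a midpoint rule after substituting x = tan(theta)."""
    h = math.pi / n
    total = 0.0
    for i in range(n):
        theta = -math.pi / 2.0 + (i + 0.5) * h
        x = math.tan(theta)
        p = cauchy_pdf(x, l1, s1)
        q = cauchy_pdf(x, l2, s2)
        # dx = (1 + x^2) dtheta maps the real line onto a bounded interval.
        total += p * math.log(p / q) * (1.0 + x * x)
    return total * h


if __name__ == "__main__":
    forward = kl_cauchy_numeric(0.0, 1.0, 2.0, 3.0)
    backward = kl_cauchy_numeric(2.0, 3.0, 0.0, 1.0)
    print(forward, backward)  # the two values should agree up to quadrature error
```

The substitution keeps the transformed integrand bounded despite the heavy Cauchy tails, which is why a plain midpoint rule is adequate for this check.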
-
RAP-modulated Fluid Processes: First Passages and the Stationary Distribution
Authors:
Nigel G. Bean,
Giang T. Nguyen,
Bo F. Nielsen,
Oscar Peralta
Abstract:
We construct a stochastic fluid process with an underlying piecewise deterministic Markov process (PDMP) akin to the one used in the construction of the rational arrival process (RAP), which we call the RAP-modulated fluid process. As opposed to the classic stochastic fluid process driven by a Markov jump process, the underlying PDMP of a RAP-modulated fluid process has a continuous state space and is driven by matrix parameters which may not be related to an intensity matrix. Through novel techniques we show how well-known formulae associated to the classic stochastic fluid process, such as first passage probabilities and the stationary distribution of its queue, translate to its RAP-modulated counterpart.
Submitted 8 January, 2021;
originally announced January 2021.
-
Weighted sparsity regularization for source identification for elliptic PDEs
Authors:
Ole Løseth Elvetun,
Bjørn Fredrik Nielsen
Abstract:
This investigation is motivated by PDE-constrained optimization problems arising in connection with electrocardiograms (ECGs) and electroencephalography (EEG). Standard sparsity regularization does not necessarily produce adequate results for these applications because only boundary data/observations are available for the identification of the unknown source, which may be interior. We therefore study a weighted $\ell^1$-regularization technique for solving inverse problems when the forward operator has a significant null space. In particular, we prove that a sparse source, regardless of whether it is interior or located at the boundary, can be exactly recovered with this weighting procedure as the regularization parameter $α$ tends to zero. Our analysis is supported by numerical experiments for cases with one and several local sources. The theory is developed in terms of Euclidean spaces, and our results can therefore be applied to many problems.
Submitted 24 May, 2023; v1 submitted 21 December, 2020;
originally announced December 2020.
-
Modified Tikhonov regularization for identifying several sources
Authors:
Ole Løseth Elvetun,
Bjørn Fredrik Nielsen
Abstract:
We study whether a modified version of Tikhonov regularization can be used to identify several local sources from Dirichlet boundary data for a prototypical elliptic PDE. This paper extends the results presented in [5]. It turns out that the possibility of distinguishing between two, or more, sources depends on the smoothing properties of a second or fourth order PDE. Consequently, the geometry of the involved domain, as well as the position of the sources relative to the boundary of this domain, determines the identifiability.
We also present a uniqueness result for the identification of a single local source. This result is derived in terms of an abstract operator framework and is therefore not only applicable to the model problem studied in this paper. Our schemes yield quadratic optimization problems and can thus be solved with standard software tools. In addition to a theoretical investigation, this paper also contains several numerical experiments.
Submitted 9 November, 2020;
originally announced November 2020.
-
A regularization operator for source identification for elliptic PDEs
Authors:
Ole Løseth Elvetun,
Bjørn Fredrik Nielsen
Abstract:
We study a source identification problem for a prototypical elliptic PDE from Dirichlet boundary data. This problem is ill-posed, and the involved forward operator has a significant nullspace. Standard Tikhonov regularization yields solutions which approach the minimum $L^2$-norm least-squares solution as the regularization parameter tends to zero. We show that this approach 'always' suggests that the unknown local source is very close to the boundary of the domain of the PDE, regardless of the position of the true local source.
We propose an alternative regularization procedure, realized in terms of a novel regularization operator, which is better suited for identifying local sources positioned anywhere in the domain of the PDE. Our approach is motivated by the classical theory for Tikhonov regularization and yields a standard quadratic optimization problem. Since the new methodology is derived for an abstract operator equation, it can be applied to many other source identification problems. This paper contains several numerical experiments and an analysis of the new methodology.
Submitted 28 October, 2020; v1 submitted 19 May, 2020;
originally announced May 2020.
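The bias of standard Tikhonov regularization toward the minimum-norm solution can be seen in a toy problem that is far simpler than the PDE setting of the paper. The sketch below assumes a forward operator with a single row, for which the Tikhonov minimizer has a closed form; the function name, the operator, and the "true" source are hypothetical choices made only for illustration.

```python
def tikhonov_single_row(a, b: float, alpha: float):
    """Minimizer of (a.x - b)^2 + alpha*|x|^2 when the operator is the single row a."""
    norm_sq = sum(ai * ai for ai in a)
    # Setting the gradient to zero gives x = a * b / (|a|^2 + alpha).
    return [ai * b / (norm_sq + alpha) for ai in a]


a = [1.0, 1.0]        # the operator only observes the sum of the two unknowns
x_true = [2.0, 0.0]   # a source concentrated in the first component
b = sum(ai * xi for ai, xi in zip(a, x_true))

if __name__ == "__main__":
    for alpha in (1.0, 1e-3, 1e-9):
        print(alpha, tikhonov_single_row(a, b, alpha))
```

As alpha shrinks, the computed solution approaches [1, 1], the minimum-norm solution of the data equation, rather than the concentrated source that generated the data.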
-
Cumulant-free closed-form formulas for some common (dis)similarities between densities of an exponential family
Authors:
Frank Nielsen,
Richard Nock
Abstract:
It is well-known that the Bhattacharyya, Hellinger, Kullback-Leibler, $α$-divergences, and Jeffreys' divergences between densities belonging to the same exponential family have generic closed-form formulas relying on the strictly convex and real-analytic cumulant function characterizing the exponential family. In this work, we report (dis)similarity formulas which bypass the explicit use of the cumulant function and highlight the role of quasi-arithmetic means and their multivariate mean operator extensions. In practice, these cumulant-free formulas are handy when implementing these (dis)similarities using legacy Application Programming Interfaces (APIs) since our method requires only a partial canonical factorization of the densities of the considered exponential family.
Submitted 7 April, 2020; v1 submitted 5 March, 2020;
originally announced March 2020.
-
Generalized spectrum of second order differential operators
Authors:
Tomáš Gergelits,
Bjørn Fredrik Nielsen,
Zdeněk Strakoš
Abstract:
We analyze the spectrum of the operator $Δ^{-1} [\nabla \cdot (K\nabla u)]$, where $Δ$ denotes the Laplacian and $K=K(x,y)$ is a symmetric tensor. Our main result shows that this spectrum can be derived from the spectral decomposition $K=Q ΛQ^T$, where $Q=Q(x,y)$ is an orthogonal matrix and $Λ=Λ(x,y)$ is a diagonal matrix. More precisely, provided that $K$ is continuous, the spectrum equals the convex hull of the ranges of the diagonal function entries of $Λ$. The involved domain is assumed to be bounded and Lipschitz, and both homogeneous Dirichlet and homogeneous Neumann boundary conditions are considered. We study operators defined on infinite dimensional Sobolev spaces. Our theoretical investigations are illuminated by numerical experiments, using discretized problems.
The results presented in this paper extend previous analyses which have addressed elliptic differential operators with scalar coefficient functions. Our investigation is motivated by both preconditioning issues (efficient numerical computations) and the need to further develop the spectral theory of second order PDEs (core analysis).
Submitted 3 February, 2020;
originally announced February 2020.
-
On a generalization of the Jensen-Shannon divergence
Authors:
Frank Nielsen
Abstract:
The Jensen-Shannon divergence is a renowned bounded symmetrization of the Kullback-Leibler divergence which does not require probability densities to have matching supports. In this paper, we introduce a vector-skew generalization of the scalar $α$-Jensen-Bregman divergences and derive thereof the vector-skew $α$-Jensen-Shannon divergences. We study the properties of these novel divergences and show how to build parametric families of symmetric Jensen-Shannon-type divergences. Finally, we report an iterative algorithm to numerically compute the Jensen-Shannon-type centroids for a set of probability densities belonging to a mixture family: This includes the case of the Jensen-Shannon centroid of a set of categorical distributions or normalized histograms.
Submitted 19 December, 2019; v1 submitted 2 December, 2019;
originally announced December 2019.
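The properties of the ordinary Jensen-Shannon divergence stated at the start of this abstract (symmetric, bounded, and well defined for mismatched supports) can be illustrated for categorical distributions. The sketch below covers only the classical divergence, not the vector-skew generalization of the paper, and the function names are assumptions made here.

```python
import math


def kl(p, q) -> float:
    """Kullback-Leibler divergence between categorical distributions (natural log)."""
    # Terms with p_i = 0 contribute zero by the usual convention.
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def jensen_shannon(p, q) -> float:
    """Jensen-Shannon divergence: average KL divergence to the mid-point mixture."""
    m = [(pi + qi) / 2.0 for pi, qi in zip(p, q)]
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


if __name__ == "__main__":
    # Disjoint supports: KL would be infinite, but JS attains its upper bound log 2.
    print(jensen_shannon([1.0, 0.0], [0.0, 1.0]), math.log(2.0))
```

Because the mixture is positive wherever either input is, each logarithm inside `kl` is finite, which is why matching supports are not required.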
-
On the Kullback-Leibler divergence between location-scale densities
Authors:
Frank Nielsen
Abstract:
We show that the $f$-divergence between any two densities of potentially different location-scale families can be reduced to the calculation of the $f$-divergence between one standard density and another location-scale density. It follows that the $f$-divergence between two scale densities depends only on the scale ratio. We then report conditions on the standard distribution to get symmetric $f$-divergences: first, we prove that all $f$-divergences between densities of a location family are symmetric whenever the standard density is even, and second, we illustrate a generic symmetric property with the calculation of the Kullback-Leibler divergence between scale Cauchy distributions. Finally, we show that the minimum $f$-divergence of any query density of a location-scale family to another location-scale family is independent of the query location-scale parameters.
Submitted 15 February, 2021; v1 submitted 23 April, 2019;
originally announced April 2019.
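The scale-ratio claim in the abstract above can be checked on a simple special case. The sketch below uses the well-known closed form of the Kullback-Leibler divergence between zero-mean Gaussians, which form a scale family; this family and the helper name `kl_zero_mean_gaussians` are assumptions made for illustration and are not taken from the paper.

```python
import math

def kl_zero_mean_gaussians(s1, s2):
    """KL( N(0, s1^2) : N(0, s2^2) ) in nats, with standard deviations s1, s2 > 0."""
    ratio = s1 / s2
    # The expression depends on s1 and s2 only through their ratio.
    return -math.log(ratio) + 0.5 * ratio**2 - 0.5

# Rescaling both densities by the same factor leaves the divergence unchanged.
d_small = kl_zero_mean_gaussians(1.0, 2.0)
d_large = kl_zero_mean_gaussians(3.0, 6.0)
same = math.isclose(d_small, d_large)
```

Note that this Gaussian example is not symmetric in its arguments; the symmetry result quoted in the abstract concerns scale Cauchy distributions, which this sketch does not cover.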
-
The statistical Minkowski distances: Closed-form formula for Gaussian Mixture Models
Authors:
Frank Nielsen
Abstract:
The traditional Minkowski distances are induced by the corresponding Minkowski norms in real-valued vector spaces. In this work, we propose novel statistical symmetric distances based on Minkowski's inequality for probability densities belonging to Lebesgue spaces. These statistical Minkowski distances admit closed-form formulas for Gaussian mixture models when parameterized by integer exponents. This result extends to arbitrary mixtures of exponential families with natural parameter spaces being cones: this includes the binomial, the multinomial, the zero-centered Laplacian, the Gaussian and the Wishart mixtures, among others. We also derive a Minkowski diversity index of a normalized weighted set of probability distributions from Minkowski's inequality.
Submitted 17 January, 2019; v1 submitted 9 January, 2019;
originally announced January 2019.
-
Laplacian preconditioning of elliptic PDEs: Localization of the eigenvalues of the discretized operator
Authors:
Tomáš Gergelits,
Kent-André Mardal,
Bjørn Fredrik Nielsen,
Zdeněk Strakoš
Abstract:
In the paper \textit{Preconditioning by inverting the Laplacian; an analysis of the eigenvalues. IMA Journal of Numerical Analysis 29, 1 (2009), 24--42}, Nielsen, Hackbusch and Tveito study the operator generated by using the inverse of the Laplacian as preconditioner for second order elliptic PDEs $\nabla \cdot (k(x) \nabla u) = f$. They prove that the range of $k(x)$ is contained in the spectrum of the preconditioned operator, provided that $k$ is continuous. Their rigorous analysis only addresses mappings defined on infinite dimensional spaces, but the numerical experiments in the paper suggest that a similar property holds in the discrete case.
Motivated by this investigation, we analyze the eigenvalues of the matrix $\mathbf{L}^{-1}\mathbf{A}$, where $\mathbf{L}$ and $\mathbf{A}$ are the stiffness matrices associated with the Laplace operator and general second order elliptic operators, respectively. Without any assumption about the continuity of $k(x)$, we prove the existence of a one-to-one pairing between the eigenvalues of $\mathbf{L}^{-1}\mathbf{A}$ and the intervals determined by the images under $k(x)$ of the supports of the FE nodal basis functions. As a consequence, we can show that the nodal values of $k(x)$ yield accurate approximations of the eigenvalues of $\mathbf{L}^{-1}\mathbf{A}$. Our theoretical results are illuminated by several numerical experiments.
Submitted 11 September, 2018;
originally announced September 2018.
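To make the discrete setting of the abstract above concrete, the following is a small one-dimensional sketch, not the authors' experiments. It assumes piecewise linear (P1) elements on a uniform mesh of the unit interval with homogeneous Dirichlet conditions, a coefficient $k(x) = 1 + 10x^2$ sampled at element midpoints, and a hypothetical helper `p1_stiffness`. It assembles $\mathbf{A}$ and $\mathbf{L}$ and computes the eigenvalues of $\mathbf{L}^{-1}\mathbf{A}$, which by a Rayleigh quotient argument must lie between the smallest and largest sampled coefficient values.

```python
import numpy as np

def p1_stiffness(coeff_per_element, h):
    """Assemble the 1D P1 stiffness matrix on the interior nodes.

    Element e joins global nodes e and e+1; boundary nodes are dropped
    (homogeneous Dirichlet), so interior index = global index - 1.
    """
    n = len(coeff_per_element) - 1  # number of interior nodes
    M = np.zeros((n, n))
    for e, c in enumerate(coeff_per_element):
        for a in (e - 1, e):
            for b in (e - 1, e):
                if 0 <= a < n and 0 <= b < n:
                    M[a, b] += (c / h) * (1.0 if a == b else -1.0)
    return M

n_elements = 50
h = 1.0 / n_elements
midpoints = (np.arange(n_elements) + 0.5) * h
k_mid = 1.0 + 10.0 * midpoints**2           # assumed coefficient k(x)

A = p1_stiffness(k_mid, h)                   # variable-coefficient operator
L = p1_stiffness(np.ones(n_elements), h)     # Laplacian

# Eigenvalues of the preconditioned matrix; they are real in exact arithmetic.
eigs = np.sort(np.linalg.eigvals(np.linalg.solve(L, A)).real)

# Nodal values of k at the interior nodes, sorted for a side-by-side comparison.
k_nodes = np.sort(1.0 + 10.0 * (np.arange(1, n_elements) * h) ** 2)
max_gap = float(np.max(np.abs(eigs - k_nodes)))
```

Printing `eigs` next to `k_nodes` (or inspecting `max_gap`) gives a quick feel for how closely the nodal values track the eigenvalues on this particular mesh.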
-
Regularization of ill-posed point neuron models
Authors:
Bjørn Fredrik Nielsen
Abstract:
Point neuron models with a Heaviside firing rate function can be ill-posed. That is, the initial-condition-to-solution map might become discontinuous in finite time. If a Lipschitz continuous, but steep, firing rate function is employed, then standard ODE theory implies that such models are well-posed and can thus, approximately, be solved with finite precision arithmetic. We investigate whether the solution of this well-posed model converges to a solution of the ill-posed limit problem as the steepness parameter of the firing rate function tends to infinity. Our argument employs the Arzelà-Ascoli theorem and also yields the existence of a solution of the limit problem. However, we only obtain convergence of a subsequence of the regularized solutions. This is consistent with our finding that models with a Heaviside firing rate function can have several solutions. Our analysis assumes that the set of times at which the limit function, provided by the Arzelà-Ascoli theorem, equals the threshold value for firing has Lebesgue measure zero. If this assumption does not hold, we argue that the regularized solutions may not converge to a solution of the limit problem with a Heaviside firing rate function.
Submitted 1 March, 2017;
originally announced March 2017.
-
Robust preconditioners for PDE-constrained optimization with limited observations
Authors:
Kent-André Mardal,
Bjørn Fredrik Nielsen,
Magne Nordaas
Abstract:
Regularization-robust preconditioners for PDE-constrained optimization problems have been successfully developed. These methods, however, typically assume that observation data is available throughout the entire domain of the state equation. For many inverse problems, this is an unrealistic assumption. In this paper we propose and analyze preconditioners for PDE-constrained optimization problems with limited observation data, e.g., when observations are only available at the boundary of the solution domain. Our methods are robust with respect to both the regularization parameter and the mesh size. That is, the condition number of the preconditioned optimality system is uniformly bounded, independently of the size of these two parameters. We first consider a prototypical elliptic control problem and thereafter more general PDE-constrained optimization problems. Our theoretical findings are illuminated by several numerical results.
Submitted 22 June, 2015;
originally announced June 2015.
-
Medians and means in Finsler geometry
Authors:
Marc Arnaudon,
Frank Nielsen
Abstract:
We investigate existence and uniqueness of p-means and the median of a probability measure on a Finsler manifold, in relation with the convexity of the support of the measure. We prove that the p-mean is the limit point of a continuous time gradient flow. Under some additional condition, which is always satisfied for p larger than or equal to 2, a discretization of this path converges to the p-mean. This provides an algorithm for determining those Finsler center points.
Submitted 25 June, 2011; v1 submitted 28 November, 2010;
originally announced November 2010.
-
Staring at Economic Aggregators through Information Lenses
Authors:
Richard Nock,
Nicolas Sanz,
Fred Celimene,
Frank Nielsen
Abstract:
It is hard to exaggerate the role of economic aggregators -- functions that summarize numerous and/or heterogeneous data -- in economic models since the early 20th century. In many cases, as witnessed by the pioneering works of Cobb and Douglas, these functions were information quantities tailored to economic theories, i.e. they were built to fit economic phenomena. In this paper, we look at these functions from the complementary side: information. We use a recent toolbox built on top of a vast class of distortions coined by Bregman, whose application field rivals metrics' in various subfields of mathematics. This toolbox makes it possible to find the quality of an aggregator (for consumptions, prices, labor, capital, wages, etc.), from the standpoint of the information it carries. We prove a rather striking result.
From the informational standpoint, well-known economic aggregators do belong to the \textit{optimal} set. As common economic assumptions enter the analysis, this large set shrinks, and it essentially ends up \textit{exactly fitting} either CES, or Cobb-Douglas, or both. To summarize, in the relevant economic contexts, one could not have crafted a better aggregator from the information standpoint. We also discuss global economic behaviors of optimal information aggregators in general, and present a brief panorama of the links between economic and information aggregators.
Keywords: Economic Aggregators, CES, Cobb-Douglas, Bregman divergences
Submitted 2 January, 2008;
originally announced January 2008.