Search | arXiv e-print repository

Dyadic Factorization and Efficient Inversion of Sparse Positive Definite Matrices

Authors: Michał Kos, Krzysztof Podgórski, Hanqing Wu

Abstract: In inverting large sparse matrices, the key difficulty lies in effectively exploiting sparsity during the inversion process. One well-established strategy is nested dissection, which seeks a so-called sparse Cholesky factorization. We argue that the matrices for which such factors can be found are characterized by a hidden dyadic sparsity structure. This paper builds on that idea by proposing an efficient approach for inverting such matrices. The method consists of two independent steps: the first packs the matrix into a dyadic form, while the second performs a sparse (dyadic) Gram-Schmidt orthogonalization of the packed matrix. The novel packing procedure works by recovering block-tridiagonal structures, focusing on aggregating terms near the diagonal using the $l_1$-norm, which contrasts with traditional methods that prioritize minimizing bandwidth, i.e., the $l_\infty$-norm. The algorithm performs particularly well for matrices that can be packed into banded or dyadic forms that are moderately dense. Due to the properties of the $l_1$-norm, the packing step can be applied iteratively to reconstruct the hidden dyadic structure, which corresponds to the detection of separators in the nested dissection method. We explore the algebraic properties of dyadic-structured matrices and present an algebraic framework that allows for a unified mathematical treatment of both sparse factorization and efficient inversion of factors. For matrices with a dyadic structure, we introduce an optimal inversion algorithm and evaluate its computational complexity.
The proposed inversion algorithm and core algebraic operations for dyadic matrices are implemented in the R package DyadiCarma, utilizing Rcpp and RcppArmadillo for high-performance computing. An independent R-based matrix packing module, supported by C++ code, is also provided.

Submitted 12 May, 2025; originally announced May 2025.

MSC Class: 15A23 (Primary); 15A09; 68Q25; 68R10 (Secondary)
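
The abstract above contrasts bandwidth minimization (an $l_\infty$-type criterion) with aggregating terms near the diagonal in the $l_1$ sense. The following is a minimal sketch of that contrast, not the paper's packing procedure: it assumes, purely for illustration, that the $l_1$ criterion can be approximated by the sum of $|i-j|$ over the nonzero entries. The function names `pattern`, `distances`, `bandwidth`, and `l1_spread` are hypothetical.

```python
def pattern(n, pairs):
    """Symmetric 0/1 sparsity pattern with a full diagonal and the given off-diagonal pairs."""
    A = [[1 if i == j else 0 for j in range(n)] for i in range(n)]
    for i, j in pairs:
        A[i][j] = A[j][i] = 1
    return A

def distances(A):
    """Distance |i - j| from the diagonal for every nonzero entry."""
    return [abs(i - j) for i, row in enumerate(A) for j, v in enumerate(row) if v != 0]

def bandwidth(A):
    """l_infinity-type measure: the farthest nonzero entry from the diagonal."""
    return max(distances(A))

def l1_spread(A):
    """Assumed l_1-type measure: total distance of nonzero entries from the diagonal."""
    return sum(distances(A))

n = 5
tridiag = [(i, i + 1) for i in range(n - 1)]
# Tridiagonal pattern plus a single pair of corner entries.
corner = pattern(n, tridiag + [(0, n - 1)])
# Pentadiagonal pattern: smaller bandwidth, but more mass away from the diagonal.
penta = pattern(n, tridiag + [(i, i + 2) for i in range(n - 2)])

print(bandwidth(corner), l1_spread(corner))  # 4 16
print(bandwidth(penta), l1_spread(penta))    # 2 20
```

Under these assumptions the two criteria rank the patterns differently: the pentadiagonal pattern wins on bandwidth, while the tridiagonal pattern with one outlying pair wins on the $l_1$-type measure.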

arXiv:2311.17102

Splinets -- Orthogonal Splines and FDA for the Classification Problem

Authors: Rani Basna, Hiba Nassar, Krzysztof Podgórski

Abstract: This study introduces an efficient workflow for functional data analysis in classification problems, utilizing advanced orthogonal spline bases. The methodology is based on the flexible Splinets package, featuring a novel spline representation designed for enhanced data efficiency. Several innovative features contribute to this efficiency: 1) utilization of orthonormal spline bases, 2) consideration of spline support sets, and 3) data-driven knot selection. To illustrate this approach, we applied the workflow to the Fashion MNIST dataset. We demonstrate the classification process and highlight significant efficiency gains. Particularly noteworthy are the improvements that can be achieved through the 2D generalization of our methodology, especially in scenarios where data sparsity and dimension reduction are critical factors. A key advantage of our workflow is the projection operation into the space of splines with arbitrarily chosen knots, allowing for versatile functional data analysis associated with classification problems. Moreover, the study explores Splinets package features suited for functional data analysis. The algebra and calculus of splines use Taylor expansions at the knots within the support sets. Various orthonormalization techniques for B-splines are implemented, including the highly recommended dyadic method, which leads to the creation of splinets. Importantly, the locality of B-splines with respect to support sets is preserved in the corresponding splinet.
Using this locality, along with implemented algorithms, provides a powerful computational tool for functional data analysis.

Submitted 28 November, 2023; originally announced November 2023.

Comments: 21 pages. arXiv admin note: text overlap with arXiv:2102.00733

arXiv:2309.16402

Spline Based Methods for Functional Data on Multivariate Domains

Authors: Rani Basna, Hiba Nassar, Krzysztof Podgórski

Abstract: Functional data analysis is typically performed in two steps: first, functionally representing discrete observations, and then applying functional methods to the so-represented data. The initial choice of a functional representation may have a significant impact on the second phase of the analysis, as shown in recent research, where data-driven spline bases outperformed the predefined rigid choice of functional representation. The method chooses an initial functional basis by an efficient placement of the knots using a simple machine-learning algorithm. The approach does not apply directly when the data are defined on domains of dimension higher than one, such as images. The reason is that in higher dimensions the convenient and numerically efficient spline bases are obtained as tensor bases from 1D spline bases, which require knots located on a lattice. This does not allow for the flexible knot placement that was fundamental to the 1D approach. The goal of this research is to propose two modified approaches that circumvent the problem by coding the irregular knot selection into their densities and utilizing these densities through the topology of the spaces of splines. This allows for regular grids for the knots and thus facilitates using the spline tensor bases. The method is tested on 1D data, showing that its performance is comparable to or better than that of the previous methods.

Submitted 14 May, 2024; v1 submitted 28 September, 2023; originally announced September 2023.

Comments: 14 pages, 8 figures

arXiv:2103.07453

Machine Learning Assisted Orthonormal Basis Selection for Functional Data Analysis

Authors: Rani Basna, Hiba Nassar, Krzysztof Podgórski

Abstract: In implementations of functional data methods, the effect of the initial choice of an orthonormal basis has not received much attention in the past. Typically, several standard bases such as Fourier, wavelets, splines, etc. are considered to transform observed functional data, and a choice is made without any formal criteria indicating which of the bases is preferable for the initial transformation of the data into functions. In an attempt to address this issue, we propose a strictly data-driven method of orthogonal basis selection. The method uses recently introduced orthogonal spline bases called splinets, obtained by efficient orthogonalization of the B-splines. The algorithm learns from the data in the machine learning style to efficiently place knots. The optimality criterion is based on the average (per functional data point) mean square error and is utilized both in the learning algorithms and in comparison studies. The latter indicate efficiency that is particularly evident for sparse functional data and, to a lesser degree, in analyses of responses to complex physical systems.

Submitted 12 March, 2021; originally announced March 2021.

arXiv:2102.00733

Splinets -- splines through the Taylor expansion, their support sets and orthogonal bases

Authors: Krzysztof Podgórski

Abstract: A new representation of splines that targets efficiency in the analysis of functional data is implemented. The efficiency is achieved through two novel features: using the recently introduced orthonormal spline bases, the so-called splinets, and accounting for the spline support sets in the proposed spline object representation. The recently introduced orthogonal splinets are evaluated by dyadic orthogonalization of the $B$-splines. The package is built around the Splinets-object that represents a collection of splines. It treats splines as mathematical functions and contains information about the support sets and the values of the derivatives at the knots that uniquely define these functions. Algebra and calculus of splines utilize the Taylor expansions at the knots within the support sets. Several orthonormalization procedures of the $B$-splines are implemented, including the recommended dyadic method leading to the splinets. The method is based on a dyadic algorithm that can also be viewed as an efficient method of diagonalizing a band matrix. The locality of the $B$-splines in terms of the support sets is, to a great extent, preserved in the corresponding splinet. This, together with implemented algorithms utilizing locality of the supports, provides a valuable computational tool for functional data analysis. The benefits are particularly evident when the sparsity in the data plays an important role. Various diagnostic tools are provided that help maintain the stability of the computations.
Finally, a projection operation to the space of splines is implemented that facilitates functional data analysis. An example of a simple functional analysis of the data using the tools in the package is presented. The functionality of the package extends beyond the splines to piecewise polynomial functions, although the splines are its focus.

Submitted 26 September, 2024; v1 submitted 1 February, 2021; originally announced February 2021.

Comments: 27 pages, 8 figures, the R-package to be submitted to CRAN

MSC Class: 65D07; 65D10; 62H25

arXiv:2007.14220

Effective computations of joint excursion times for stationary Gaussian processes

Authors: Georg Lindgren, Krzysztof Podgorski, Igor Rychlik

Abstract: This work aims to popularize the method of computing the distribution of the excursion times for a Gaussian process that involves extended and multivariate Rice's formula. The approach was used in numerical implementations of the high-dimensional integration routine, and in earlier work it was shown that the computations are more effective and thus more precise than those based on Rice expansions. The joint distribution of excursion times is related to the distribution of the number of level crossings, a problem that can be attacked via the Rice series expansion, based on the moments of the number of crossings. Another point of attack is the "Independent Interval Approximation" (IIA), intensively studied for the persistence of physical systems. It treats the lengths of successive crossing intervals as statistically independent. A renewal-type argument leads to an expression that provides the approximate interval distribution via its Laplace transform. However, independence is not valid in typical situations. Even if it leads to acceptable results for the persistence exponent, a rigorous assessment of the approximation error is not available. Moreover, we show that the IIA approach cannot deliver properly defined probability distributions, and thus the method is limited to persistence studies. This paper presents an alternative approach that is more general and more accurate, yet relatively unknown. It is based on exact expressions for the probability density for one and for two successive excursion lengths.
The numerical routine RIND computes the densities using recent advances in scientific computing and is easily accessible via a simple Matlab interface. The result solves the problem of two-step excursion dependence for a general stationary differentiable Gaussian process. The work also offers some analytical results that explain the effectiveness of the implemented method.

Submitted 28 July, 2020; originally announced July 2020.

MSC Class: 60G15; 60G10; 58J65; 60G55; 62P35; 65C50; 65D30
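
For orientation, the classical (univariate) Rice formula that the abstract above builds on can be stated as follows. This is only the textbook special case, not the extended multivariate version used in the paper, and the notation ($r$, $\lambda_0$, $\lambda_2$, $\mu^+$, $\Phi$) is an assumption introduced here. For a zero-mean stationary differentiable Gaussian process $X(t)$ with covariance function $r(\tau)$, let $\lambda_0 = r(0)$ and $\lambda_2 = -r''(0)$. The expected number of upcrossings of level $u$ per unit time is

$$
\mu^+(u) = \frac{1}{2\pi}\sqrt{\frac{\lambda_2}{\lambda_0}}\,\exp\!\left(-\frac{u^2}{2\lambda_0}\right),
$$

and, for an ergodic process, the mean length of an excursion above $u$ is the fraction of time spent above $u$ divided by this rate:

$$
\frac{P(X(0) > u)}{\mu^+(u)} = \frac{1 - \Phi\!\left(u/\sqrt{\lambda_0}\right)}{\mu^+(u)},
$$

where $\Phi$ is the standard normal distribution function. At $u = 0$ this reduces to $\pi\sqrt{\lambda_0/\lambda_2}$. These expressions give only the mean; the full and joint distributions of excursion lengths are what the paper addresses.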

Showing 1–6 of 6 results for author: Podgórski, K