Gradient-Weighted, Data-Driven Normalization for Approximate Border Bases -- Concept and Computation
Authors: Hiroshi Kera, Achim Kehrein
Abstract:
This paper studies the concept and the computation of approximately vanishing ideals of a finite set of data points. By data points, we mean points that carry some uncertainty, which is a key motivation for the approximate treatment. A careful review of the existing border basis concept for an exact treatment motivates a new adaptation of the border basis concept for an approximate treatment. In the study of approximately vanishing polynomials, the normalization of polynomials plays a vital role. So far, the most common normalization in computational commutative algebra uses the coefficient norm of a polynomial. Inspired by recent developments in machine learning, the present paper proposes and studies the use of gradient-weighted normalization. The gradient-weighted semi-norm evaluates the gradient of a polynomial at the data points. This data-driven nature of gradient-weighted normalization yields two benefits: better stability against perturbation and, most significantly, invariance of border bases with respect to scaling the data points. Neither property is achieved with coefficient normalization. In particular, we present an example in which the lack of scaling invariance under coefficient normalization causes an approximate border basis computation to fail. This matters in practice because scaling the point set is often recommended as a preprocessing step. Further, we take an existing algorithm that uses coefficient normalization and show that it is easily adapted to gradient-weighted normalization. The analysis of the adapted algorithm requires only minor changes, and the time complexity remains the same. Finally, we present numerical experiments on three affine varieties to demonstrate the superior stability of our data-driven normalization over coefficient normalization. We obtain robustness to perturbations and invariance to scaling.
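The contrast between the two normalizations can be illustrated with a small numerical sketch. It is not taken from the paper: the point set, the circle polynomial, and all function names are hypothetical, and the gradient-weighted semi-norm is assumed here to be the square root of the sum of squared gradient norms over the data points, without any additional scaling constant. Under these assumptions, the sketch scales a noisy point set near the unit circle by a factor alpha and compares how the normalized residual of the correspondingly scaled circle polynomial behaves.

```python
import math

# Hypothetical data: points near the unit circle with small fixed perturbations.
BASE_POINTS = [
    (1.02, 0.00),
    (0.00, 0.97),
    (-0.99, 0.01),
    (0.01, -1.03),
    (0.70, 0.72),
]


def scale_points(points, alpha):
    """Scale every point by the factor alpha."""
    return [(alpha * x, alpha * y) for x, y in points]


def circle_value(point, radius):
    """Evaluate g(x, y) = x^2 + y^2 - radius^2 at a point."""
    x, y = point
    return x * x + y * y - radius * radius


def circle_gradient(point):
    """Gradient of g(x, y) = x^2 + y^2 - radius^2 (independent of radius)."""
    x, y = point
    return (2.0 * x, 2.0 * y)


def evaluation_norm(points, radius):
    """Euclidean norm of the vector of evaluations of g at the points."""
    return math.sqrt(sum(circle_value(p, radius) ** 2 for p in points))


def coefficient_norm(radius):
    """Euclidean norm of the coefficients of x^2, y^2, and the constant term."""
    return math.sqrt(1.0 + 1.0 + radius ** 4)


def gradient_weighted_seminorm(points):
    """Assumed definition: sqrt of the sum of squared gradient norms at the points."""
    return math.sqrt(
        sum(gx * gx + gy * gy for gx, gy in map(circle_gradient, points))
    )


def normalized_residual(alpha, normalization):
    """Residual of the scaled circle polynomial on the scaled points."""
    points = scale_points(BASE_POINTS, alpha)
    residual = evaluation_norm(points, radius=alpha)
    if normalization == "coefficient":
        return residual / coefficient_norm(alpha)
    return residual / gradient_weighted_seminorm(points)


for alpha in (0.1, 1.0, 10.0):
    print(
        alpha,
        normalized_residual(alpha, "coefficient"),
        normalized_residual(alpha, "gradient"),
    )
```

In this toy setting the gradient-weighted residual is exactly proportional to alpha, so a tolerance scaled by the same factor accepts or rejects the polynomial consistently. The coefficient-normalized residual does not scale linearly with alpha, so the same tolerance rule can give different answers on scaled data.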
Submitted 11 June, 2025;
originally announced June 2025.