-
Poset-Markov Channels: Capacity via Group Symmetry
Authors:
Eray Unsal Atay,
Eitan Levin,
Venkat Chandrasekaran,
Victoria Kostina
Abstract:
Computing channel capacity is in general intractable because it is given by the limit of a sequence of optimization problems whose dimensionality grows to infinity. As a result, constant-sized characterizations of feedback or non-feedback capacity are known for only a few classes of channels with memory. This paper introduces poset-causal channels, a new formalism of a communication channel in which channel inputs and outputs are indexed by the elements of a partially ordered set (poset). We develop a novel methodology that allows us to establish a single-letter upper bound on the feedback capacity of a subclass of poset-causal channels whose memory structure exhibits a Markov property and symmetry. The methodology is based on symmetry reduction in optimization. We instantiate our method on two channel models: the Noisy Output is The STate (NOST) channel, for which the bound is tight, and a new two-dimensional extension of it.
Submitted 24 June, 2025;
originally announced June 2025.
-
On Transferring Transferability: Towards a Theory for Size Generalization
Authors:
Eitan Levin,
Yuxin Ma,
Mateo Díaz,
Soledad Villar
Abstract:
Many modern learning tasks require models that can take inputs of varying sizes. Consequently, dimension-independent architectures have been proposed for domains where the inputs are graphs, sets, and point clouds. Recent work on graph neural networks has explored whether a model trained on low-dimensional data can transfer its performance to higher-dimensional inputs. We extend this body of work by introducing a general framework for transferability across dimensions. We show that transferability corresponds precisely to continuity in a limit space formed by identifying small problem instances with equivalent large ones. This identification is driven by the data and the learning task. We instantiate our framework on existing architectures, and implement the necessary changes to ensure their transferability. Finally, we provide principles for designing new transferable models. Numerical experiments support our findings.
Submitted 29 May, 2025;
originally announced May 2025.
-
Free Descriptions of Convex Sets
Authors:
Eitan Levin,
Venkat Chandrasekaran
Abstract:
Convex sets arising in a variety of applications are well-defined for every relevant dimension. Examples include the simplex and the spectraplex that correspond to probability distributions and to quantum states; combinatorial polytopes and their relaxations such as the cut polytope and the elliptope in integer programming; and unit balls of regularizers such as the $\ell_p$ and Schatten norms in inverse problems. Moreover, these sets are often specified using conic descriptions that can be obviously instantiated in any dimension. We develop a systematic framework to study such free descriptions of convex sets. We show that free descriptions arise from a recently-identified phenomenon in algebraic topology called representation stability, which relates invariants across dimensions in a sequence of group representations. Our framework yields structural results for free descriptions pertaining to the relations between the sets they describe across dimensions, extendability of a single set in a given dimension to a freely-described sequence, and continuous limits of such sequences. We also develop a procedure to obtain parametric families of freely-described convex sets whose structure is adapted to a given application; illustrations are provided via examples that arise in the literature as well as new families that are derived using our procedure. We demonstrate the utility of our framework in two contexts. First, we develop an algorithm for a free analog of the convex regression problem, where a convex function is fit to input-output data; by searching over our parametric families, we can fit a function to low-dimensional inputs and extend it to any other dimension. Second, we prove that many sequences of symmetric conic programs can be solved in constant time, which unifies and strengthens several results in the literature.
Submitted 11 June, 2024; v1 submitted 9 July, 2023;
originally announced July 2023.
-
Any-dimensional equivariant neural networks
Authors:
Eitan Levin,
Mateo Díaz
Abstract:
Traditional supervised learning aims to learn an unknown mapping by fitting a function to a set of input-output pairs with a fixed dimension. The fitted function is then defined on inputs of the same dimension. However, in many settings, the unknown mapping takes inputs in any dimension; examples include graph parameters defined on graphs of any size and physics quantities defined on an arbitrary number of particles. We leverage a newly-discovered phenomenon in algebraic topology, called representation stability, to define equivariant neural networks that can be trained with data in a fixed dimension and then extended to accept inputs in any dimension. Our approach is user-friendly, requiring only the network architecture and the groups for equivariance, and can be combined with any training procedure. We provide a simple open-source implementation of our methods and offer preliminary numerical experiments.
Submitted 29 April, 2024; v1 submitted 9 June, 2023;
originally announced June 2023.
-
The effect of smooth parametrizations on nonconvex optimization landscapes
Authors:
Eitan Levin,
Joe Kileel,
Nicolas Boumal
Abstract:
We develop new tools to study landscapes in nonconvex optimization. Given one optimization problem, we pair it with another by smoothly parametrizing the domain. This is either for practical purposes (e.g., to use smooth optimization algorithms with good guarantees) or for theoretical purposes (e.g., to reveal that the landscape satisfies a strict saddle property). In both cases, the central question is: how do the landscapes of the two problems relate? More precisely: how do desirable points such as local minima and critical points in one problem relate to those in the other problem? A key finding in this paper is that these relations are often determined by the parametrization itself, and are almost entirely independent of the cost function. Accordingly, we introduce a general framework to study parametrizations by their effect on landscapes. The framework enables us to obtain new guarantees for an array of problems, some of which were previously treated on a case-by-case basis in the literature. Applications include: optimizing low-rank matrices and tensors through factorizations; solving semidefinite programs via the Burer-Monteiro approach; training neural networks by optimizing their weights and biases; and quotienting out symmetries.
Submitted 4 March, 2024; v1 submitted 7 July, 2022;
originally announced July 2022.
-
Finding stationary points on bounded-rank matrices: A geometric hurdle and a smooth remedy
Authors:
Eitan Levin,
Joe Kileel,
Nicolas Boumal
Abstract:
We consider the problem of provably finding a stationary point of a smooth function to be minimized on the variety of bounded-rank matrices. This turns out to be unexpectedly delicate. We trace the difficulty back to a geometric obstacle: On a nonsmooth set, there may be sequences of points along which standard measures of stationarity tend to zero, but whose limit points are not stationary. We name such events apocalypses, as they can cause optimization algorithms to converge to non-stationary points. We illustrate this explicitly for an existing optimization algorithm on bounded-rank matrices. To provably find stationary points, we modify a trust-region method on a standard smooth parameterization of the variety. The method relies on the known fact that second-order stationary points on the parameter space map to stationary points on the variety. Our geometric observations and proposed algorithm generalize beyond bounded-rank matrices. We give a geometric characterization of apocalypses on general constraint sets, which implies that Clarke-regular sets do not admit apocalypses. Such sets include smooth manifolds, manifolds with boundaries, and convex sets. Our trust-region method supports parameterization by any complete Riemannian manifold.
Submitted 7 July, 2022; v1 submitted 8 July, 2021;
originally announced July 2021.
-
3D ab initio modeling in cryo-EM by autocorrelation analysis
Authors:
Eitan Levin,
Tamir Bendory,
Nicolas Boumal,
Joe Kileel,
Amit Singer
Abstract:
Single-Particle Reconstruction (SPR) in Cryo-Electron Microscopy (cryo-EM) is the task of estimating the 3D structure of a molecule from a set of noisy 2D projections, taken from unknown viewing directions. Many algorithms for SPR start from an initial reference molecule, and alternate between refining the estimated viewing angles given the molecule, and refining the molecule given the viewing angles. This scheme is called iterative refinement. Reliance on an initial, user-chosen reference introduces model bias, and poor initialization can lead to slow convergence. Furthermore, since no ground truth is available for an unsolved molecule, it is difficult to validate the obtained results. This creates the need for high quality ab initio models that can be quickly obtained from experimental data with minimal priors, and which can also be used for validation. We propose a procedure to obtain such an ab initio model directly from raw data using Kam's autocorrelation method. Kam's method has been known since 1980, but it leads to an underdetermined system, with missing orthogonal matrices. Until now, this system has been solved only for special cases, such as highly symmetric molecules or molecules for which a homologous structure was already available. In this paper, we show that knowledge of just two clean projections is sufficient to guarantee a unique solution to the system. This system is solved by an optimization-based heuristic. For the first time, we are then able to obtain a low-resolution ab initio model of an asymmetric molecule directly from raw data, without 2D class averaging and without tilting. Numerical results are presented on both synthetic and experimental data.
Submitted 7 January, 2018; v1 submitted 22 October, 2017;
originally announced October 2017.
-
Stopping criterion for iterative regularization of large-scale ill-posed problems using the Picard parameter
Authors:
Eitan Levin,
Alexander Y. Meltzer
Abstract:
We propose a new stopping criterion for Krylov subspace iterative regularization of large-scale ill-posed inverse problems. Our stopping criterion accurately filters the data using a generalization of the Picard parameter that was originally introduced for direct regularization of small-scale problems. In one dimension, we filter the data in the discrete Fourier transform (DFT) basis using the Picard parameter, which separates noise-dominated Fourier coefficients from the signal-dominated ones. For two-dimensional problems we propose a novel vectorization scheme of the Fourier coefficients of the data based on the Kronecker product structure of the two-dimensional DFT matrix, which effectively reduces the problem to one dimension. At each iteration we compute the distance between the data reconstructed from the iterated solution and the filtered data, terminating the iterations once this distance begins to increase or to level off. The accuracy and robustness of the proposed method are demonstrated by several numerical examples, and a MATLAB-based implementation is provided.
Submitted 13 July, 2017;
originally announced July 2017.
-
Estimation of the Regularization Parameter in Linear Discrete Ill-Posed Problems Using the Picard parameter
Authors:
Eitan Levin,
Alexander Y. Meltzer
Abstract:
Accurate determination of the regularization parameter in inverse problems still represents an analytical challenge, owing mainly to the considerable difficulty of separating the unknown noise from the signal. We present a new approach for determining the parameter for the general-form Tikhonov regularization of linear ill-posed problems. In our approach the parameter is found by approximate minimization of the distance between the unknown noiseless data and the data reconstructed from the regularized solution. We approximate this distance by employing the Picard parameter to separate the noise from the data in the coordinate system of the generalized SVD. A simple and reliable algorithm for the estimation of the Picard parameter enables accurate implementation of the above procedure. We demonstrate the effectiveness of our method on several numerical examples. A MATLAB-based implementation of the proposed algorithms can be found at https://www.weizmann.ac.il/condmat/superc/software/
Submitted 7 September, 2017; v1 submitted 4 July, 2016;
originally announced July 2016.
-
On the Classification of Algebras
Authors:
Alex S. E. Levin
Abstract:
We classify (possibly noncommutative) algebras of low rank over a domain R. We first review results for algebras of rank 2 and for finite-dimensional division algebras over the real numbers. These results motivate us to consider which algebras possess a standard involution. Our main result is that algebras of rank 3 are either commutative or possess a standard involution.
Submitted 23 December, 2013;
originally announced December 2013.