-
Non-identifiability distinguishes Neural Networks among Parametric Models
Authors:
Sourav Chatterjee,
Timothy Sudijono
Abstract:
One of the enduring problems surrounding neural networks is to identify the factors that differentiate them from traditional statistical models. We prove a pair of results which distinguish feedforward neural networks among parametric models at the population level, for regression tasks. Firstly, we prove that for any pair of random variables $(X,Y)$, neural networks always learn a nontrivial relationship between $X$ and $Y$, if one exists. Secondly, we prove that for reasonable smooth parametric models, under local and global identifiability conditions, there exists a nontrivial $(X,Y)$ pair for which the parametric model learns the constant predictor $\mathbb{E}[Y]$. Together, our results suggest that a lack of identifiability distinguishes neural networks among the class of smooth parametric models.
Submitted 26 May, 2025; v1 submitted 24 April, 2025;
originally announced April 2025.
-
Rigorous results for timelike Liouville field theory
Authors:
Sourav Chatterjee
Abstract:
Liouville field theory has long been a cornerstone of two-dimensional quantum field theory and quantum gravity, which has attracted much recent attention in the mathematics literature. Timelike Liouville field theory is a version of Liouville field theory where the kinetic term in the action appears with a negative sign, which makes it closer to a theory of quantum gravity than ordinary (spacelike) Liouville field theory. Making sense of this "wrong sign" requires a theory of Gaussian random variables with negative variance. Such a theory is developed in this paper, and is used to prove the timelike DOZZ formula for the $3$-point correlation function when the parameters satisfy the so-called "charge neutrality condition". Expressions are derived also for the $k$-point correlation functions for all $k\ge 3$, and it is shown that these functions approach the correct semiclassical limits as the coupling constant is sent to zero.
Submitted 15 April, 2025; v1 submitted 3 April, 2025;
originally announced April 2025.
-
Denjoy-Wolff like set for rational semigroups
Authors:
Subham Chatterjee,
Gorachand Chakraborty,
Tarun Kumar Chakra
Abstract:
In this paper, we introduce the concept of the Denjoy-Wolff like set for rational semigroups. We show that for finitely generated Abelian rational semigroups, the Denjoy-Wolff like set is countable. Some results concerning the Denjoy-Wolff like set and the Julia set are also discussed. Then we consider a special class of rational semigroups and discuss various properties of the Denjoy-Wolff like set for this class. We use the concept of the Denjoy-Wolff like set to classify the class into 3 sub-classes. We also show that any semigroup in this class can be partitioned into $k$ parts, where $k$ is the cardinality of the Denjoy-Wolff like set.
Submitted 20 February, 2025;
originally announced February 2025.
-
Robust construction of the incipient infinite cluster in high dimensional critical percolation
Authors:
Shirshendu Chatterjee,
Pranav Chinmay,
Jack Hanson,
Philippe Sosoe
Abstract:
We give a new construction of the incipient infinite cluster (IIC) associated with high-dimensional percolation in a broad setting and under minimal assumptions. Our arguments differ substantially from earlier constructions of the IIC; we do not directly use the machinery of the lace expansion or similar diagrammatic expansions. We show that the IIC may be constructed by conditioning on the cluster of a vertex being infinite in the supercritical regime $p > p_c$ and then taking $p \searrow p_c$. Furthermore, at criticality, we show that the IIC may be constructed by conditioning on a connection to an arbitrary distant set $V$, generalizing previous constructions where one conditions on a connection to a single distant vertex or the boundary of a large box.
The inputs to our proof are the asymptotics for the two-point function obtained by Hara, van der Hofstad, and Slade. Our construction thus applies in all dimensions for which those asymptotics are known, rather than an unspecified high dimension considered in previous works. The results in this paper will be instrumental in upcoming work related to structural properties and scaling limits of various objects involving high-dimensional percolation clusters at and near criticality.
Submitted 15 February, 2025;
originally announced February 2025.
-
Connections on a principal Lie groupoid bundle and representations up to homotopy
Authors:
Saikat Chatterjee,
Naga Arjun S J
Abstract:
A Lie groupoid principal $\mathbb{X}$ bundle is a surjective submersion $π\colon P\to M$ with an action of $\mathbb{X}$ on $P$ with certain additional conditions. This paper offers a suitable definition for the notion of a connection on such bundles. Although every Lie groupoid $\mathbb{X}$ has its associated Lie algebroid $A:=1^*\ker ds\to X_0$, it does not admit a natural action on its Lie algebroid. There is no natural action of $\mathbb{X}$ on $TP$ either. Choosing a connection $\mathbb{H}\subset TX_1$ on the Lie groupoid $\mathbb{X},$ and considering its induced action up to homotopy of $\mathbb{X}$ on the graded vector bundle $TX_0\oplus A,$ we prove the existence of a short exact sequence of diffeological groupoids over the discrete category $M$ (with appropriate vector space structures on the fibres) for the $\mathbb{X}$ bundle $π\colon P\to M.$ We introduce a notion of connection on the $\mathbb{X}$ bundle $π\colon P\to M,$ and show that such a connection $ω$ splits the sequence. Finally, we show that a connection pair $(ω, \mathbb{H})$ on the $\mathbb{X}$ bundle $π\colon P\to M$ is isomorphic to any other connection pair.
Submitted 21 April, 2025; v1 submitted 4 February, 2025;
originally announced February 2025.
-
A new approach to locally adaptive polynomial regression
Authors:
Sabyasachi Chatterjee,
Subhajit Goswami,
Soumendu Sundar Mukherjee
Abstract:
Adaptive bandwidth selection is a fundamental challenge in nonparametric regression. This paper introduces a new bandwidth selection procedure inspired by the optimality criteria for $\ell_0$-penalized regression. Although similar in spirit to Lepski's method and its variants in selecting the largest interval satisfying an admissibility criterion, our approach stems from a distinct philosophy, utilizing criteria based on $\ell_2$-norms of interval projections rather than explicit point and variance estimates. We obtain non-asymptotic risk bounds for the local polynomial regression methods based on our bandwidth selection procedure which adapt (near-)optimally to the local Hölder exponent of the underlying regression function simultaneously at all points in its domain. Furthermore, we show that there is a single ideal choice of a global tuning parameter in each case under which the above-mentioned local adaptivity holds. The optimal risks of our methods derive from the properties of solutions to a new "bandwidth selection equation" which is of independent interest. We believe that the principles underlying our approach provide a new perspective to the classical yet ever relevant problem of locally adaptive nonparametric regression.
Submitted 20 May, 2025; v1 submitted 27 December, 2024;
originally announced December 2024.
-
On the stability of solutions to random optimization problems under small perturbations
Authors:
Sourav Chatterjee,
Souvik Ray
Abstract:
Consider the Euclidean traveling salesman problem with $n$ random points on the plane. Suppose that one of the points is shifted to a new random location. This gives us a new optimal path. Consider such shifts for each of the $n$ points. Do we get $n$ very different optimal paths? In this article, we show that this is not the case - in fact, the number of truly different paths can be at most $\mathcal{O}(1)$ as $n\to \infty$. The proof is based on a general argument which allows us to prove similar stability results in a number of other settings, such as branching random walk, the Sherrington-Kirkpatrick model of mean-field spin glasses, the Edwards-Anderson model of short-range spin glasses, and the Wigner ensemble of random matrices.
Submitted 28 October, 2024;
originally announced October 2024.
-
Minmax Trend Filtering: Generalizations of Total Variation Denoising via a Local Minmax/Maxmin Formula
Authors:
Sabyasachi Chatterjee
Abstract:
Total Variation Denoising (TVD) is a fundamental denoising and smoothing method. In this article, we identify a new local minmax/maxmin formula producing two estimators which sandwich the univariate TVD estimator at every point. Operationally, this formula gives a local definition of TVD as a minmax/maxmin of a simple function of local averages. Moreover, we find that this minmax/maxmin formula is generalizable and can be used to define other TVD-like estimators. In this article we propose and study higher order polynomial versions of TVD which are defined pointwise lying between minmax and maxmin optimizations of penalized local polynomial regressions over intervals of different scales. These appear to be new nonparametric regression methods, different from usual Trend Filtering and any other existing method in the nonparametric regression toolbox. We call these estimators Minmax Trend Filtering (MTF). We show how the proposed local definition of the TVD/MTF estimator makes it tractable to bound pointwise estimation errors in terms of a local bias-variance-like trade-off. This type of local analysis of TVD/MTF is new and arguably simpler than existing analyses of TVD/Trend Filtering. In particular, apart from minimax rate optimality over bounded variation and piecewise polynomial classes, our pointwise estimation error bounds also enable us to derive local rates of convergence for (locally) Hölder smooth signals. These local rates offer a new pointwise explanation of local adaptivity of TVD/MTF instead of global (MSE) based justifications.
Submitted 10 April, 2025; v1 submitted 3 October, 2024;
originally announced October 2024.
-
Neural Networks Generalize on Low Complexity Data
Authors:
Sourav Chatterjee,
Timothy Sudijono
Abstract:
We show that feedforward neural networks with ReLU activation generalize on low complexity data, suitably defined. Given i.i.d. data generated from a simple programming language, the minimum description length (MDL) feedforward neural network which interpolates the data generalizes with high probability. We define this simple programming language, along with a notion of description length of such networks. We provide several examples on basic computational tasks, such as checking primality of a natural number. For primality testing, our theorem shows the following and more. Suppose that we draw an i.i.d. sample of $n$ numbers uniformly at random from $1$ to $N$. For each number $x_i$, let $y_i = 1$ if $x_i$ is a prime and $0$ if it is not. Then, the interpolating MDL network accurately answers, with error probability $1- O((\ln N)/n)$, whether a newly drawn number between $1$ and $N$ is a prime or not. Note that the network is not designed to detect primes; minimum description learning discovers a network which does so. Extensions to noisy data are also discussed, suggesting that MDL neural network interpolators can demonstrate tempered overfitting.
Submitted 30 June, 2025; v1 submitted 18 September, 2024;
originally announced September 2024.
-
A Vershik-Kerov theorem for wreath products
Authors:
Sourav Chatterjee,
Persi Diaconis
Abstract:
Let $G_{n,k}$ be the group of permutations of $\{1,2,\ldots, kn\}$ that permutes the first $k$ symbols arbitrarily, then the next $k$ symbols and so on through the last $k$ symbols. Finally the $n$ blocks of size $k$ are permuted in an arbitrary way. For $σ$ chosen uniformly in $G_{n,k}$, let $L_{n,k}$ be the length of the longest increasing subsequence in $σ$. For $k,n$ growing, we determine that the limiting mean of $L_{n,k}$ is asymptotic to $4\sqrt{nk}$. This is different from parallel variations of the Vershik-Kerov theorem for colored permutations.
Submitted 8 August, 2024;
originally announced August 2024.
-
$i$Trust: Trust-Region Optimisation with Ising Machines
Authors:
Sayantan Pramanik,
Kaumudibikash Goswami,
Sourav Chatterjee,
M Girish Chandra
Abstract:
In this work, we present a heretofore unseen application of Ising machines to perform trust region-based optimisation with box constraints. This is done by considering a specific form of opto-electronic oscillator-based coherent Ising machines with clipped transfer functions, and proposing appropriate modifications to facilitate trust-region optimisation. The enhancements include the inclusion of non-symmetric coupling and linear terms, modulation of noise, and compatibility with convex-projections to improve its convergence. The convergence of the modified Ising machine has been shown under the reasonable assumptions of convexity or invexity. The mathematical structures of the modified Ising machine and trust-region methods have been exploited to design a new trust-region method to effectively solve unconstrained optimisation problems in many scenarios, such as machine learning and optimisation of parameters in variational quantum algorithms. Hence, the proposition is useful for both classical and quantum-classical hybrid scenarios. Finally, the convergence of the Ising machine-based trust-region method has also been proven analytically, establishing the feasibility of the technique.
Submitted 6 June, 2024;
originally announced July 2024.
-
PriME: Privacy-aware Membership profile Estimation in networks
Authors:
Abhinav Chakraborty,
Sayak Chatterjee,
Sagnik Nandy
Abstract:
This paper presents a novel approach to estimating community membership probabilities for network vertices generated by the Degree Corrected Mixed Membership Stochastic Block Model while preserving individual edge privacy. Operating within the $\varepsilon$-edge local differential privacy framework, we introduce an optimal private algorithm based on a symmetric edge flip mechanism and spectral clustering for accurate estimation of vertex community memberships. We conduct a comprehensive analysis of the estimation risk and establish the optimality of our procedure by providing matching lower bounds to the minimax risk under privacy constraints. To validate our approach, we demonstrate its performance through numerical simulations and its practical application to real-world data. This work represents a significant step forward in balancing accurate community membership estimation with stringent privacy preservation in network data analysis.
Submitted 4 June, 2024;
originally announced June 2024.
-
On Hyperbolicity of Spirallike Circularlike domain
Authors:
Sanjoy Chatterjee,
Golam Mostafa Mondal
Abstract:
In this paper, we prove that a spirallike circularlike domain is Kobayashi hyperbolic if and only if its core is empty. In particular, we show that such a domain is Kobayashi hyperbolic if and only if it is (biholomorphic to) a bounded domain. We also propose a problem in this area.
Submitted 2 May, 2024;
originally announced May 2024.
-
Liouville Theory: An Introduction to Rigorous Approaches
Authors:
Sourav Chatterjee,
Edward Witten
Abstract:
In recent years, a surprisingly direct and simple rigorous understanding of quantum Liouville theory has developed. We aim here to make this material more accessible to physicists working on quantum field theory.
Submitted 17 December, 2024; v1 submitted 2 April, 2024;
originally announced April 2024.
-
A scaling limit of $\mathrm{SU}(2)$ lattice Yang-Mills-Higgs theory
Authors:
Sourav Chatterjee
Abstract:
The construction of non-Abelian Euclidean Yang-Mills theories in dimension four, as scaling limits of lattice Yang-Mills theories or otherwise, is a central open question of mathematical physics. This paper takes the following small step towards this goal. In any dimension $d\ge 2$, we construct a scaling limit of $\mathrm{SU}(2)$ lattice Yang-Mills theory coupled to a Higgs field transforming in the fundamental representation of $\mathrm{SU}(2)$. After unitary gauge fixing and taking the lattice spacing $\varepsilon\to 0$, and simultaneously taking the gauge coupling constant $g\to 0$ and the Higgs length $α\to \infty$ in such a manner that $αg$ is always equal to $c\varepsilon$ for some fixed $c$ and $g= O(\varepsilon^{50d})$, a stereographic projection of the gauge field is shown to converge to a scale-invariant massive Gaussian field. This gives the first construction of a scaling limit of a non-Abelian lattice Yang-Mills theory in a dimension higher than two, as well as the first rigorous proof of mass generation by the Higgs mechanism in such a theory. Analogous results are proved for $\mathrm{U}(1)$ theory as well. The question of constructing a non-Gaussian scaling limit remains open.
Submitted 4 December, 2024; v1 submitted 19 January, 2024;
originally announced January 2024.
-
Convergence Analysis of Opto-Electronic Oscillator based Coherent Ising Machines
Authors:
Sayantan Pramanik,
Sourav Chatterjee,
Harshkumar Oza
Abstract:
Ising machines are purported to be better at solving large-scale combinatorial optimisation problems than conventional von Neumann computers. However, these Ising machines are widely believed to be heuristics, whose promise is observed empirically rather than obtained theoretically. We bridge this gap by considering an opto-electronic oscillator based coherent Ising machine (OEO-CIM), and providing the first analytical proof that under reasonable assumptions, the OEO-CIM is not a heuristic approach. We find and prove bounds on its performance in terms of the expected difference between the objective value at the final iteration and the optimal one, and on the number of iterations required by it. In the process, we emphasise some of its limitations, such as the inability to handle asymmetric coupling between spins and the absence of an external magnetic field applied on them (both of which are necessary in many optimisation problems), along with some issues in its convergence. We overcome these limitations by proposing suitable adjustments and prove that the improved architecture is guaranteed to converge to the optimum of the relaxed objective function.
Submitted 7 December, 2023;
originally announced December 2023.
-
Discrete Dynamics and Supergeometry
Authors:
Subhobrata Chatterjee,
Andrew Waldron,
Cem Yetişmişoğlu
Abstract:
We formulate a geometric measurement theory of dynamical classical systems possessing both continuous and discrete degrees of freedom. The approach is covariant with respect to choices of clocks and canonically incorporates laboratories. The latter are embedded symplectic submanifolds of an odd-dimensional symplectic structure. When suitably defined, symplectic geometry in odd dimensions is exactly the structure needed for covariance. A fundamentally probabilistic viewpoint allows classical supergeometries to describe discrete dynamics. We solve the problem of how to construct probabilistic measures on supermanifolds given a (possibly odd dimensional) supersymplectic structure. This relies on a superanalog of the Hodge star for differential forms and a description of probabilities by convex cones. We also show how stochastic processes such as Markov chains can be described by supergeometry.
Submitted 9 November, 2023;
originally announced November 2023.
-
Spectral gap of nonreversible Markov chains
Authors:
Sourav Chatterjee
Abstract:
We define the spectral gap of a Markov chain on a finite state space as the second-smallest singular value of the generator of the chain, generalizing the usual definition of spectral gap for reversible chains. We then define the relaxation time of the chain as the inverse of this spectral gap, and show that this relaxation time can be characterized, for any Markov chain, as the time required for convergence of empirical averages. This relaxation time is related to the Cheeger constant and the mixing time of the chain through inequalities that are similar to the reversible case, and the path argument can be used to get upper bounds. Several examples are worked out. An interesting finding from the examples is that the time for convergence of empirical averages in nonreversible chains can often be substantially smaller than the mixing time.
Submitted 4 January, 2025; v1 submitted 16 October, 2023;
originally announced October 2023.
-
Neighbour Sum Patterns : Chessboards to Toroidal Worlds
Authors:
Sayan Dutta,
Ayanava Mandal,
Sohom Gupta,
Sourin Chatterjee
Abstract:
We say that a chessboard filled with integer entries satisfies the neighbour-sum property if the number appearing on each cell is the sum of entries in its neighbouring cells, where neighbours are cells sharing a common edge or vertex. We show that an $n\times n$ chessboard satisfies this property if and only if $n\equiv 5\pmod 6$. Existence of solutions is further investigated on rectangular and toroidal boards, as well as on Neumann neighbourhoods, including a nice connection to discrete harmonic functions. Constructions of solutions on infinite boards are also presented. Finally, answers to three dimensional analogues of these boards are explored using properties of cyclotomic polynomials and relevant ideas conjectured.
Submitted 17 December, 2024; v1 submitted 6 October, 2023;
originally announced October 2023.
-
A dynamic mean-field statistical model of academic collaboration
Authors:
Soumendu Sundar Mukherjee,
Tamojit Sadhukhan,
Shirshendu Chatterjee
Abstract:
There is empirical evidence that collaboration in academia has increased significantly during the past few decades, perhaps due to the breathtaking advancements in communication and technology during this period. Multi-author articles have become more frequent than single-author ones. Interdisciplinary collaboration is also on the rise. Although there have been several studies on the dynamical aspects of collaboration networks, systematic statistical models which theoretically explain various empirically observed features of such networks have been lacking. In this work, we propose a dynamic mean-field model and an associated estimation framework for academic collaboration networks. We primarily focus on how the degree of collaboration of a typical author, rather than the local structure of her collaboration network, changes over time. We consider several popular indices of collaboration from the literature and study their dynamics under the proposed model. In particular, we obtain exact formulae for the expectations and temporal rates of change of these indices. Through extensive simulation experiments, we demonstrate that the proposed model has enough flexibility to capture various phenomena characteristic of real-world collaboration networks. Using metadata on papers from the arXiv repository, we empirically study the mean-field collaboration dynamics in disciplines such as Computer Science, Mathematics and Physics.
Submitted 19 September, 2023;
originally announced September 2023.
-
Parallel transport on a Lie 2-group bundle over a Lie groupoid along Haefliger paths
Authors:
Saikat Chatterjee,
Adittya Chaudhuri
Abstract:
We prove a Lie 2-group torsor version of the well-known one-one correspondence between fibered categories and pseudofunctors. Consequently, we obtain a weak version of the principal Lie group bundle over a Lie groupoid. The correspondence also enables us to extend a particular class of principal 2-bundles to be defined over differentiable stacks. We show that the differential geometric connection structures introduced in the authors' previous work combine nicely with the underlying fibration structure of a principal 2-bundle over a Lie groupoid. This interrelation allows us to derive a notion of parallel transport in the framework of principal 2-bundles over Lie groupoids along a particular class of Haefliger paths. The corresponding parallel transport functor is shown to be smooth. We apply our results to examine the parallel transport on an associated VB-groupoid.
Submitted 11 September, 2023;
originally announced September 2023.
-
A characterization of bounded balanced convex domains in $\mathbb{C}^n$
Authors:
Sanjoy Chatterjee,
Golam Mostafa Mondal
Abstract:
In this paper, we investigate the characterization of balanced bounded convex domains in $\mathbb{C}^n$ in terms of the squeezing function. As an application, we provide a characterization of the polydisc in $\mathbb{C}^n$.
Submitted 13 February, 2024; v1 submitted 8 August, 2023;
originally announced August 2023.
-
Features of a spin glass in the random field Ising model
Authors:
Sourav Chatterjee
Abstract:
A longstanding open question in the theory of disordered systems is whether short-range models, such as the random field Ising model or the Edwards-Anderson model, can indeed have the famous properties that characterize mean-field spin glasses at nonzero temperature. This article shows that this is at least partially possible in the case of the random field Ising model. Consider the Ising model on a discrete $d$-dimensional cube under free boundary condition, subjected to a very weak i.i.d. random external field, where the field strength is inversely proportional to the square-root of the number of sites. It turns out that in $d\ge 2$ and at subcritical temperatures, this model has some of the key features of a mean-field spin glass. Namely, (a) the site overlap exhibits one step of replica symmetry breaking, (b) the quenched distribution of the overlap is non-self-averaging, and (c) the overlap has the Parisi ultrametric property. Furthermore, it is shown that for Gaussian disorder, replica symmetry does not break if the field strength is taken to be stronger than the one prescribed above, and non-self-averaging fails if it is weaker, showing that the above order of field strength is the only one that allows all three properties to hold. However, the model does not have two other features of mean-field models. Namely, (a) it does not satisfy the Ghirlanda-Guerra identities, and (b) it has only two pure states instead of many.
Submitted 7 March, 2024; v1 submitted 14 July, 2023;
originally announced July 2023.
-
A study of spirallike domains: polynomial convexity, Loewner chains and dense holomorphic curves
Authors:
Sanjoy Chatterjee,
Sushil Gorai
Abstract:
In this paper, we prove that the closure of a bounded pseudoconvex domain, which is spirallike with respect to a globally asymptotically stable holomorphic vector field, is polynomially convex. We also provide a necessary and sufficient condition, in terms of polynomial convexity, on a univalent function defined on a strongly convex domain for embedding it into a filtering Loewner chain. Next, we provide an application of our first result. We show that for any bounded pseudoconvex strictly spirallike domain $Ω$ in $\mathbb{C}^n$ and given any connected complex manifold $Y$, there exists a holomorphic map from the unit disc to the space of all holomorphic maps from $Ω$ to $Y$. This also yields the existence of an $\mathcal{O}(Ω, Y)$-universal map for any generalized translation on $Ω$, which, in turn, is connected to the hypercyclicity of certain composition operators on the space of manifold-valued holomorphic maps.
Submitted 11 July, 2023;
originally announced July 2023.
-
Enumerative Theory for the Tsetlin Library
Authors:
Sourav Chatterjee,
Persi Diaconis,
Gene B. Kim
Abstract:
The Tsetlin library is a well-studied Markov chain on the symmetric group $S_n$. It has stationary distribution $π(σ)$ the Luce model, a nonuniform distribution on $S_n$, which appears in psychology, horse race betting, and tournament poker. Simple enumerative questions, such as ``what is the distribution of the top $k$ cards?'' or ``what is the distribution of the bottom $k$ cards?'' are long open. We settle these questions and draw attention to a host of parallel questions on the extension to the chambers of a hyperplane arrangement.
Submitted 28 June, 2023;
originally announced June 2023.
-
Central Limit Theorem for Gram-Schmidt Random Walk Design
Authors:
Sabyasachi Chatterjee,
Partha S. Dey,
Subhajit Goswami
Abstract:
We prove a central limit theorem for the Horvitz-Thompson estimator based on the Gram-Schmidt Walk (GSW) design, recently developed in Harshaw et al. (2022). In particular, we consider the version of the GSW design which uses randomized pivot order, thereby answering an open question raised in the same article. We deduce this under minimal and global assumptions involving only the problem parameters such as the (sum) potential outcome vector and the covariate matrix. As an interesting consequence of our analysis, we also obtain the precise limiting variance of the estimator in terms of these parameters, which is smaller than the previously known upper bound. The main ingredients are a simplified skeletal process approximating the GSW design and concentration phenomena for random matrices obtained from random sampling using Stein's method for exchangeable pairs.
Submitted 5 June, 2023; v1 submitted 21 May, 2023;
originally announced May 2023.
-
Characterising Solutions of Anomalous Cancellation
Authors:
Satvik Saha,
Sohom Gupta,
Sayan Dutta,
Sourin Chatterjee
Abstract:
Anomalous cancellation of fractions is a mathematically inaccurate method where cancelling the common digits of the numerator and denominator correctly reduces the fraction. While it appears to be accidentally successful, the property of anomalous cancellation is intricately connected to the number of digits of the denominator as well as the base in which the fraction is represented. Previous work has mostly concerned three-digit solutions or specific properties of the same. This paper seeks general results regarding the structure of numbers that satisfy the cancellation property (denoted by $P^*_{\ell; k}$) and an estimate of the total number of solutions possible in a given base representation. In particular, interesting properties regarding the saturation of the number of solutions in general and in $p^n$ bases (where $p$ is a prime) are studied in detail.
Submitted 31 January, 2023;
originally announced February 2023.
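As a small illustration of the anomalous cancellation property described in the abstract above, the following sketch enumerates the simplest case only: two-digit fractions of the form "ab"/"bc" in a given base, where striking out the shared digit b leaves a/c with the same value. The function name `anomalous_two_digit` and the restriction to this digit pattern are assumptions made for the example; the paper's property $P^*_{\ell; k}$ is more general and is not reproduced here.

```python
from fractions import Fraction


def anomalous_two_digit(base=10):
    """List proper fractions n/d with n = 'ab' and d = 'bc' (digits in `base`)
    for which cancelling the shared digit b gives a/c equal to n/d."""
    found = []
    for a in range(1, base):
        for b in range(1, base):
            for c in range(1, base):
                n = a * base + b  # numerator with digits a, b
                d = b * base + c  # denominator with digits b, c
                # n < d skips trivial cases such as 11/11 and keeps proper fractions
                if n < d and Fraction(n, d) == Fraction(a, c):
                    found.append((n, d))
    return found


print(anomalous_two_digit(10))
```

In base 10 this recovers the four classical examples 16/64, 19/95, 26/65 and 49/98. Passing another `base` gives a quick way to see how the number of solutions depends on the base.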
-
Spin glass phase at zero temperature in the Edwards-Anderson model
Authors:
Sourav Chatterjee
Abstract:
While the analysis of mean-field spin glass models has seen tremendous progress in the last twenty years, lattice spin glasses have remained largely intractable. This article presents the solutions to a number of questions about the Edwards-Anderson model of short-range spin glasses (in all dimensions) that were raised in the physics literature many years ago. First, it is shown that the ground state is sensitive to small perturbations of the disorder, in the sense that a small amount of noise gives rise to a new ground state that is nearly orthogonal to the old one with respect to the site overlap inner product. Second, it is shown that one can overturn a macroscopic fraction of the spins in the ground state with an energy cost that is negligible compared to the size of the boundary of the overturned region - a feature that is believed to be typical of spin glasses but clearly absent in ferromagnets. The third result is that the boundary of the overturned region in dimension $d$ has fractal dimension strictly greater than $d-1$, confirming a prediction from physics. The fourth result is that the correlations between bonds in the ground state can decay at most like the inverse of the distance. This contrasts with the random field Ising model, where it has been shown recently that the correlation decays exponentially in distance in dimension two. The fifth result is that the expected size of the critical droplet of a bond grows at least like a power of the volume. Taken together, these results comprise the first mathematical proof of glassy behavior in a short-range spin glass model.
Submitted 28 February, 2023; v1 submitted 10 January, 2023;
originally announced January 2023.
-
AdverSAR: Adversarial Search and Rescue via Multi-Agent Reinforcement Learning
Authors:
Aowabin Rahman,
Arnab Bhattacharya,
Thiagarajan Ramachandran,
Sayak Mukherjee,
Himanshu Sharma,
Ted Fujimoto,
Samrat Chatterjee
Abstract:
Search and Rescue (SAR) missions in remote environments often employ autonomous multi-robot systems that learn, plan, and execute a combination of local single-robot control actions, group primitives, and global mission-oriented coordination and collaboration. Often, SAR coordination strategies are manually designed by human experts who can remotely control the multi-robot system and enable semi-autonomous operations. However, in remote environments where connectivity is limited and human intervention is often not possible, decentralized collaboration strategies are needed for fully-autonomous operations. Nevertheless, decentralized coordination may be ineffective in adversarial environments due to sensor noise, actuation faults, or manipulation of inter-agent communication data. In this paper, we propose an algorithmic approach based on adversarial multi-agent reinforcement learning (MARL) that allows robots to efficiently coordinate their strategies in the presence of adversarial inter-agent communications. In our setup, the objective of the multi-robot team is to discover targets strategically in an obstacle-strewn geographical area by minimizing the average time needed to find the targets. It is assumed that the robots have no prior knowledge of the target locations, and they can interact with only a subset of neighboring robots at any time. Based on the centralized training with decentralized execution (CTDE) paradigm in MARL, we utilize a hierarchical meta-learning framework to learn dynamic team-coordination modalities and discover emergent team behavior under complex cooperative-competitive scenarios. The effectiveness of our approach is demonstrated on a collection of prototype grid-world environments with different specifications of benign and adversarial agents, target locations, and agent rewards.
Submitted 20 December, 2022;
originally announced December 2022.
-
Online Distributed Algorithm for Optimal Power Flow problem with Regret Analysis
Authors:
Sushobhan Chatterjee,
Rachel Kalpana Kalaimani
Abstract:
We investigate the distributed DC-Optimal Power Flow (DC-OPF) problem for a dynamic and uncertain environment. The unpredictable supply of renewable resources and varying prices of the electricity market are a few factors responsible for the uncertainty. We propose to address this problem using the framework of online convex optimization, where the cost functions are not known a priori because of the uncertainty and are revealed only incrementally over time. We also consider a distributed setting, where each agent (generators and loads) in the power network is only privy to its own local objectives and constraints but can communicate with its neighbours. A distributed online algorithm is proposed based on the modified primal-dual approach. The performance of the online algorithm is evaluated using the regret (static) function, which is the difference between the actual cost incurred by employing the proposed algorithm and the optimal fixed decision in hindsight. Since we deal with a constrained optimization problem, analogous to the notion of regret, the accumulation of the constraint violation is also calculated at each step. We establish a sub-linear bound on the static regret and constraint violation under suitable assumptions on step-size and cost function. Finally, we use the standard IEEE-14 bus system to demonstrate the performance of our algorithm.
Submitted 9 August, 2023; v1 submitted 7 December, 2022;
originally announced December 2022.
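For reference, the static regret mentioned in the abstract above is usually written as follows in the online convex optimization literature. The symbols ($f_t$ for the cost revealed at round $t$, $x_t$ for the algorithm's decision, $\mathcal{X}$ for the feasible set, $g$ for the constraint function, $T$ for the horizon) are notation assumed here for illustration and may differ from the paper's own. The constraint-violation measure in particular is only one common choice.

```latex
% Static regret: cumulative cost of the algorithm minus the cost of the
% best fixed decision in hindsight.
\mathrm{Reg}(T) \;=\; \sum_{t=1}^{T} f_t(x_t) \;-\; \min_{x \in \mathcal{X}} \sum_{t=1}^{T} f_t(x)

% One common measure of accumulated constraint violation for constraints g(x) \le 0
% (the positive part is taken componentwise).
\mathrm{Viol}(T) \;=\; \Bigl\| \Bigl[ \sum_{t=1}^{T} g(x_t) \Bigr]_{+} \Bigr\|

% "Sub-linear" means both quantities grow more slowly than T:
\lim_{T \to \infty} \frac{\mathrm{Reg}(T)}{T} = 0,
\qquad
\lim_{T \to \infty} \frac{\mathrm{Viol}(T)}{T} = 0 .
```

Under this reading, a sub-linear bound says that the average per-round excess cost and the average per-round violation both vanish as the horizon grows.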
-
A survey of some recent developments in measures of association
Authors:
Sourav Chatterjee
Abstract:
This paper surveys some recent developments in measures of association related to a new coefficient of correlation introduced by the author. A straightforward extension of this coefficient to standard Borel spaces (which includes all Polish spaces), overlooked in the literature so far, is proposed at the end of the survey.
Submitted 9 August, 2023; v1 submitted 9 November, 2022;
originally announced November 2022.
-
Estimating large causal polytrees from small samples
Authors:
Sourav Chatterjee,
Mathukumalli Vidyasagar
Abstract:
We consider the problem of estimating a large causal polytree from a relatively small i.i.d. sample. This is motivated by the problem of determining causal structure when the number of variables is very large compared to the sample size, such as in gene regulatory networks. We give an algorithm that recovers the tree with high accuracy in such settings. The algorithm works under essentially no distributional or modeling assumptions other than some mild non-degeneracy conditions.
Submitted 17 August, 2024; v1 submitted 14 September, 2022;
originally announced September 2022.
-
Approximations on certain domains of $\mathbb{C}^{n}$
Authors:
Sanjoy Chatterjee,
Sushil Gorai
Abstract:
In this paper, we study the domains in $\mathbb{C}^n$ that are invariant under the positive flows of some globally defined, complete holomorphic vector field with a globally attracting fixed point at the origin. Our first result says that such a domain $Ω$ is always Runge. Next, with an additional assumption on the rate of convergence of the flow, we show that any biholomorphism $Φ\colon Ω\to Φ(Ω)$, with $Φ(Ω)$ Runge, can be approximated by automorphisms of $\mathbb{C}^{n}$ uniformly on compacts. This generalizes all earlier known theorems in this direction substantially, even when the vector field is linear. As an application of our approximation results, on such domains that are also complete hyperbolic, we show that any Loewner PDE in a complete hyperbolic domain $Ω$ admits an essentially unique univalent solution with values in $\mathbb{C}^n$. We also provide an approximation result for volume-preserving biholomorphisms on the above domains. We provide several examples of such domains.
Submitted 12 March, 2025; v1 submitted 23 August, 2022;
originally announced August 2022.
-
An invariance principle for the 1D KPZ equation
Authors:
Arka Adhikari,
Sourav Chatterjee
Abstract:
Consider a discrete one-dimensional random surface whose height at a point grows as a function of the heights at neighboring points plus an independent random noise. Assuming that this function is equivariant under constant shifts, symmetric in its arguments, and at least six times continuously differentiable in a neighborhood of the origin, we show that as the variance of the noise goes to zero, any such process converges to the Cole-Hopf solution of the 1D KPZ equation under a suitable scaling of space and time. This proves an invariance principle for the 1D KPZ equation, in the spirit of Donsker's invariance principle for Brownian motion.
Submitted 1 September, 2023; v1 submitted 4 August, 2022;
originally announced August 2022.
-
Concentration inequalities for correlated network-valued processes with applications to community estimation and changepoint analysis
Authors:
Sayak Chatterjee,
Shirshendu Chatterjee,
Soumendu Sundar Mukherjee,
Anirban Nath,
Sharmodeep Bhattacharyya
Abstract:
Network-valued time series are currently a common form of network data. However, the study of the aggregate behavior of network sequences generated from network-valued stochastic processes is relatively rare. Most of the existing research focuses on the simple setup where the networks are independent (or conditionally independent) across time, and all edges are updated synchronously at each time step. In this paper, we study the concentration properties of the aggregated adjacency matrix and the corresponding Laplacian matrix associated with network sequences generated from lazy network-valued stochastic processes, where edges update asynchronously, and each edge follows a lazy stochastic process for its updates independent of the other edges. We demonstrate the usefulness of these concentration results in proving consistency of standard estimators in community estimation and changepoint estimation problems. We also conduct a simulation study to demonstrate the effect of the laziness parameter, which controls the extent of temporal correlation, on the accuracy of community and changepoint estimation.
Submitted 2 August, 2022;
originally announced August 2022.
-
Some basic results on fuzzy strong $φ$-b-normed linear spaces
Authors:
Abhishikta Das,
T. Bag,
S. Chatterjee
Abstract:
In this paper, the definition of a fuzzy strong $φ$-b-normed linear space is given. Here the scalar function $|c|$ is replaced by a general function $φ(c)$, where $φ$ satisfies some properties. Some basic results on finite-dimensional fuzzy strong $φ$-b-normed linear spaces are studied.
Submitted 25 May, 2022;
originally announced May 2022.
-
A random walk on the Rado graph
Authors:
Sourav Chatterjee,
Persi Diaconis,
Laurent Miclo
Abstract:
The Rado graph, also known as the random graph $G(\infty, p)$, is a classical limit object for finite graphs. We study natural ball walks as a way of understanding the geometry of this graph. For the walk started at $i$, we show that order $\log_2^*i$ steps are sufficient, and for infinitely many $i$, necessary for convergence to stationarity. The proof involves an application of Hardy's inequality for trees.
Submitted 13 May, 2022;
originally announced May 2022.
-
DeepBayes -- an estimator for parameter estimation in stochastic nonlinear dynamical models
Authors:
Anubhab Ghosh,
Mohamed Abdalmoaty,
Saikat Chatterjee,
Håkan Hjalmarsson
Abstract:
Stochastic nonlinear dynamical systems are ubiquitous in modern, real-world applications. Yet, estimating the unknown parameters of stochastic, nonlinear dynamical models remains a challenging problem. The majority of existing methods employ maximum likelihood or Bayesian estimation. However, these methods suffer from some limitations, most notably the substantial computational time for inference coupled with limited flexibility in application. In this work, we propose DeepBayes estimators that leverage the power of deep recurrent neural networks in learning an estimator. The method consists of first training a recurrent neural network to minimize the mean-squared estimation error over a set of synthetically generated data using models drawn from the model set of interest. The a priori trained estimator can then be used directly for inference by evaluating the network with the estimation data. The deep recurrent neural network architectures can be trained offline and ensure significant time savings during inference. We experiment with two popular recurrent neural networks -- long short term memory network (LSTM) and gated recurrent unit (GRU). We demonstrate the applicability of our proposed method on different example models and perform detailed comparisons with state-of-the-art approaches. We also provide a study on a real-world nonlinear benchmark problem. The experimental evaluations show that the proposed approach is asymptotically as good as the Bayes estimator.
Submitted 4 May, 2022;
originally announced May 2022.
-
Spatially Adaptive Online Prediction of Piecewise Regular Functions
Authors:
Sabyasachi Chatterjee,
Subhajit Goswami
Abstract:
We consider the problem of estimating piecewise regular functions in an online setting, i.e., the data arrive sequentially and at any round our task is to predict the value of the true function at the next revealed point using the available data from past predictions. We propose a suitably modified version of a recently developed online learning algorithm called the sleeping experts aggregation algorithm. We show that this estimator satisfies oracle risk bounds simultaneously for all local regions of the domain. As concrete instantiations of the expert aggregation algorithm proposed here, we study an online mean aggregation and an online linear regression aggregation algorithm where experts correspond to the set of dyadic subrectangles of the domain. The resulting algorithms are near linear time computable in the sample size. We specifically focus on the performance of these online algorithms in the context of estimating piecewise polynomial and bounded variation function classes in the fixed design setup. The simultaneous oracle risk bounds we obtain for these estimators in this context provide new and improved (in certain aspects) guarantees even in the batch setting and are not available for the state of the art batch learning estimators.
Submitted 30 March, 2022;
originally announced March 2022.
-
Convergence of gradient descent for deep neural networks
Authors:
Sourav Chatterjee
Abstract:
This article presents a criterion for convergence of gradient descent to a global minimum, which is then used to show that gradient descent with proper initialization converges to a global minimum when training any feedforward neural network with smooth and strictly increasing activation functions, provided that the input dimension is greater than or equal to the number of data points. The main difference with prior work is that the width of the network can be a fixed number instead of growing as some multiple or power of the number of data points.
Submitted 17 December, 2022; v1 submitted 30 March, 2022;
originally announced March 2022.
-
Distributed Optimization of Average Consensus Containment with Multiple Stationary Leaders
Authors:
Sushobhan Chatterjee,
Rachel Kalpana Kalaimani
Abstract:
In this paper, we consider the problem of containment control of multi-agent systems with multiple stationary leaders, interacting over a directed network. While containment control refers to just ensuring that the follower agents reach the convex hull of the leaders' states, we focus on the problem where the followers achieve consensus on the average value of the leaders' states. We propose an algorithm that can be implemented in a distributed manner to achieve the above consensus among followers. Next, we optimize the convergence rate of the followers to the average consensus by proper choice of weights for the interaction graph. This optimization is also performed in a distributed manner using the Alternating Direction Method of Multipliers (ADMM). Finally, we complement our results by illustrating them with numerical examples.
Submitted 30 March, 2022;
originally announced March 2022.
-
Element-wise Estimation Error of Generalized Fused Lasso
Authors:
Teng Zhang,
Sabyasachi Chatterjee
Abstract:
The main result of this article is that we obtain an elementwise error bound for the Fused Lasso estimator for any general convex loss function $ρ$. We then focus on the special cases when either $ρ$ is the square loss function (for mean regression) or is the quantile loss function (for quantile regression) for which we derive new pointwise error bounds. Even though error bounds for the usual Fuse…
▽ More
The main result of this article is that we obtain an elementwise error bound for the Fused Lasso estimator for any general convex loss function $ρ$. We then focus on the special cases when either $ρ$ is the square loss function (for mean regression) or is the quantile loss function (for quantile regression) for which we derive new pointwise error bounds. Even though error bounds for the usual Fused Lasso estimator and its quantile version have been studied before; our bound appears to be new. This is because all previous works bound a global loss function like the sum of squared error, or a sum of Huber losses in the case of quantile regression in Padilla and Chatterjee (2021). Clearly, element wise bounds are stronger than global loss error bounds as it reveals how the loss behaves locally at each point. Our element wise error bound also has a clean and explicit dependence on the tuning parameter $λ$ which informs the user of a good choice of $λ$. In addition, our bound is nonasymptotic with explicit constants and is able to recover almost all the known results for Fused Lasso (both mean and quantile regression) with additional improvements in some cases.
Submitted 18 March, 2022; v1 submitted 8 March, 2022;
originally announced March 2022.
-
A Cross Validation Framework for Signal Denoising with Applications to Trend Filtering, Dyadic CART and Beyond
Authors:
Anamitra Chaudhuri,
Sabyasachi Chatterjee
Abstract:
This paper formulates a general cross validation framework for signal denoising. The general framework is then applied to nonparametric regression methods such as Trend Filtering and Dyadic CART. The resulting cross validated versions are then shown to attain nearly the same rates of convergence as are known for the optimally tuned analogues. No previous theoretical analyses existed for cross validated versions of Trend Filtering or Dyadic CART. To illustrate the generality of the framework, we also propose and study cross validated versions of two fundamental estimators: lasso for high dimensional linear regression and singular value thresholding for matrix estimation. Our general framework is inspired by the ideas in Chatterjee and Jafarov (2015) and is potentially applicable to a wide range of estimation methods which use tuning parameters.
Submitted 3 May, 2023; v1 submitted 7 January, 2022;
originally announced January 2022.
-
Fractional cyber-neural systems -- a brief survey
Authors:
Emily Reed,
Sarthak Chatterjee,
Guilherme Ramos,
Paul Bogdan,
Sérgio Pequito
Abstract:
Neurotechnology has made great strides in the last 20 years. However, we still have a long way to go to commercialize many of these technologies, as we lack a unified framework to study cyber-neural systems (CNS) that bring the hardware, software, and the neural system together. Dynamical systems play a key role in developing these technologies as they capture different aspects of the brain and provide insight into their function. Converging evidence suggests that fractional-order dynamical systems are advantageous in modeling neural systems because of their compact representation and accuracy in capturing the long-range memory exhibited in neural behavior. In this brief survey, we provide an overview of fractional CNS that entails fractional-order systems in the context of CNS. In particular, we introduce basic definitions required for the analysis and synthesis of fractional CNS, encompassing system identification, state estimation, and closed-loop control. Additionally, we provide an illustration of some applications in the context of CNS and draw some possible future research directions. Ultimately, advancements in these three areas will be critical in developing the next generation of CNS, which will improve people's quality of life.
Submitted 15 December, 2021;
originally announced December 2021.
-
A state space for 3D Euclidean Yang-Mills theories
Authors:
Sky Cao,
Sourav Chatterjee
Abstract:
It is believed that Euclidean Yang-Mills theories behave like the massless Gaussian free field (GFF) at short distances. This makes it impossible to define the main observables for these theories - the Wilson loop observables - in dimensions greater than two, because line integrals of the GFF do not exist in such dimensions. Taking forward a proposal of Charalambous and Gross, this article shows that it is possible to define Euclidean Yang-Mills theories on the 3D unit torus as "random distributional gauge orbits", provided that they indeed behave like the GFF in a certain sense. One of the main technical tools is the existence of the Yang-Mills heat flow on the 3D torus starting from GFF-like initial data, which is established in a companion paper. A key consequence of this construction is that under the GFF assumption, one can define a notion of "regularized Wilson loop observables" for Euclidean Yang-Mills theories on the 3D unit torus.
Submitted 19 November, 2023; v1 submitted 24 November, 2021;
originally announced November 2021.
-
The Yang-Mills heat flow with random distributional initial data
Authors:
Sky Cao,
Sourav Chatterjee
Abstract:
We construct local solutions to the Yang-Mills heat flow (in the DeTurck gauge) for a certain class of random distributional initial data, which includes the 3D Gaussian free field. The main idea, which goes back to work of Bourgain as well as work of Da Prato-Debussche, is to decompose the solution into a rougher linear part and a smoother nonlinear part, and to control the latter by probabilistic arguments. In a companion work, we use the main results of this paper to propose a way towards the construction of 3D Yang-Mills measures.
Submitted 25 August, 2022; v1 submitted 20 November, 2021;
originally announced November 2021.
-
Existence of stationary ballistic deposition on the infinite lattice
Authors:
Sourav Chatterjee
Abstract:
Ballistic deposition is one of the many models of interface growth that are believed to be in the KPZ universality class, but have so far proved to be largely intractable mathematically. In this model, blocks of size one fall independently as Poisson processes at each site on the $d$-dimensional lattice, and either attach themselves to the column growing at that site, or to the side of an adjacent column, whichever comes first. It is not hard to see that if we subtract off the height of the column at the origin from the heights of the other columns, the resulting interface process is Markovian. The main result of this article is that this Markov process has at least one invariant probability measure. We conjecture that the invariant measure is not unique, and provide some partial evidence.
Submitted 18 May, 2022; v1 submitted 19 October, 2021;
originally announced October 2021.
-
Regret Minimization in Isotonic, Heavy-Tailed Contextual Bandits via Adaptive Confidence Bands
Authors:
Sabyasachi Chatterjee,
Subhabrata Sen
Abstract:
In this paper we initiate a study of nonparametric contextual bandits under shape constraints on the mean reward function. Specifically, we study a setting where the context is one-dimensional, and the mean reward function is isotonic with respect to this context. We propose a policy for this problem and show that it attains minimax rate optimal regret. Moreover, we show that the same policy enjoys automatic adaptation; that is, for subclasses of the parameter space where the true mean reward functions are also piecewise constant with $k$ pieces, this policy remains minimax rate optimal simultaneously for all $k \geq 1$. Automatic adaptation phenomena are well-known for shape constrained problems in the offline setting; we show that such phenomena carry over to the online setting.
The main technical ingredient underlying our policy is a procedure to derive confidence bands for an underlying isotonic function using the isotonic quantile estimator. The confidence band we propose is valid under heavy tailed noise, and its average width goes to $0$ at an adaptively optimal rate. We consider this to be an independent contribution to the isotonic regression literature.
Submitted 19 October, 2021;
originally announced October 2021.
-
Quantile Regression by Dyadic CART
Authors:
Oscar Hernan Madrid Padilla,
Sabyasachi Chatterjee
Abstract:
In this paper we propose and study a version of the Dyadic Classification and Regression Trees (DCART) estimator from Donoho (1997) for (fixed design) quantile regression in general dimensions. We refer to this proposed estimator as the QDCART estimator. Just as for the mean regression version, we show that a) a fast dynamic programming based algorithm with computational complexity $O(N \log N)$ exists for computing the QDCART estimator and b) an oracle risk bound (trading off squared error and a complexity parameter of the true signal) holds for the QDCART estimator. This oracle risk bound then allows us to demonstrate that the QDCART estimator enjoys adaptively rate optimal estimation guarantees for piecewise constant and bounded variation function classes. In contrast to existing results for the DCART estimator, which require subgaussianity of the error distribution, our estimation guarantees do not need any restrictive tail decay assumptions on the error distribution. For instance, our results hold even when the error distribution has no first moment, as with the Cauchy distribution. Apart from the Dyadic CART method, we also consider other variant methods such as the Optimal Regression Tree (ORT) estimator introduced in Chatterjee and Goswami (2019). In particular, we also extend the ORT estimator to the quantile setting and establish that it enjoys analogous guarantees. Thus, this paper extends the scope of these globally optimal regression tree based methodologies to heavy tailed data. We then perform extensive numerical experiments on both simulated and real data which illustrate the usefulness of the proposed methods.
Submitted 16 October, 2021;
originally announced October 2021.
-
Local KPZ behavior under arbitrary scaling limits
Authors:
Sourav Chatterjee
Abstract:
One of the main difficulties in proving convergence of discrete models of surface growth to the Kardar-Parisi-Zhang (KPZ) equation in dimensions higher than one is that the correct way to take a scaling limit, so that the limit is nontrivial, is not known in a rigorous sense. To understand KPZ growth without being hindered by this issue, this article introduces a notion of "local KPZ behavior", which roughly means that the instantaneous growth of the surface at a point decomposes into the sum of a Laplacian term, a gradient squared term, a noise term that behaves like white noise, and a remainder term that is negligible compared to the other three terms and their sum. The main result is that for a general class of surfaces, which contains the model of directed polymers in a random environment as a special case, local KPZ behavior occurs under arbitrary scaling limits, in any dimension.
Submitted 29 July, 2022; v1 submitted 3 October, 2021;
originally announced October 2021.