-
Statistical Inference for Optimal Transport Maps: Recent Advances and Perspectives
Authors:
Sivaraman Balakrishnan,
Tudor Manole,
Larry Wasserman
Abstract:
In many applications of optimal transport (OT), the object of primary interest is the optimal transport map. This map rearranges mass from one probability distribution to another in the most efficient way possible by minimizing a specified cost. In this paper we review recent advances in estimating and developing limit theorems for the OT map, using samples from the underlying distributions. We also review parallel lines of work that establish similar results for special cases and variants of the basic OT setup. We conclude with a discussion of key directions for future research with the goal of providing practitioners with reliable inferential tools.
Submitted 23 June, 2025;
originally announced June 2025.
-
Testing Random Effects for Binomial Data
Authors:
Lucas Kania,
Larry Wasserman,
Sivaraman Balakrishnan
Abstract:
In modern scientific research, small-scale studies with limited participants are increasingly common. However, interpreting individual outcomes can be challenging, making it standard practice to combine data across studies using random effects to draw broader scientific conclusions. In this work, we introduce an optimal methodology for assessing the goodness of fit between a given reference distribution and the distribution of random effects arising from binomial counts.
Using the minimax framework, we characterize the smallest separation between the null and alternative hypotheses, called the critical separation, under the 1-Wasserstein distance that ensures the existence of a valid and powerful test. The optimal test combines a plug-in estimator of the Wasserstein distance with a debiased version of Pearson's chi-squared test.
We focus on meta-analyses, where a key question is whether multiple studies agree on a treatment's effectiveness before pooling data. That is, researchers must determine whether treatment effects are homogeneous across studies. We begin by analyzing scenarios with a specified reference effect, such as testing whether all studies show the treatment is effective 80% of the time, and describe how the critical separation depends on the reference effect. We then extend the analysis to homogeneity testing without a reference effect and construct an optimal test by debiasing Cochran's chi-squared test.
Finally, we illustrate how our proposed methodologies improve the construction of p-values and confidence intervals, with applications to assessing drug safety in the context of rare adverse outcomes and modeling political outcomes at the county level.
Submitted 17 April, 2025;
originally announced April 2025.
-
Stability Bounds for Smooth Optimal Transport Maps and their Statistical Implications
Authors:
Sivaraman Balakrishnan,
Tudor Manole
Abstract:
We study estimators of the optimal transport (OT) map between two probability distributions. We focus on plugin estimators derived from the OT map between estimates of the underlying distributions. We develop novel stability bounds for OT maps which generalize those in past work, and allow us to reduce the problem of optimally estimating the transport map to that of optimally estimating densities in the Wasserstein distance. In contrast, past work provided a partial connection between these problems and relied on regularity theory for the Monge-Ampere equation to bridge the gap, a step which required unnatural assumptions to obtain sharp guarantees. We also provide some new insights into the connections between stability bounds which arise in the analysis of plugin estimators and growth bounds for the semi-dual functional which arise in the analysis of Brenier potential-based estimators of the transport map. We illustrate the applicability of our new stability bounds by revisiting the smooth setting studied by Manole et al., analyzing two of their estimators under more general conditions. Critically, our bounds do not require smoothness or boundedness assumptions on the underlying measures. As an illustrative application, we develop and analyze a novel tuning parameter-free estimator for the OT map between two strongly log-concave distributions.
Submitted 17 February, 2025;
originally announced February 2025.
-
Rational points on the non-split Cartan modular curve of level 27 and quadratic Chabauty over number fields
Authors:
Jennifer S. Balakrishnan,
L. Alexander Betts,
Daniel Rayor Hast,
Aashraya Jha,
J. Steffen Müller
Abstract:
Thanks to work of Rouse, Sutherland, and Zureick-Brown, it is known exactly which subgroups of GL$_2(\mathbf{Z}_3)$ can occur as the image of the $3$-adic Galois representation attached to a non-CM elliptic curve over $\mathbf{Q}$, with a single exception: the normaliser of the non-split Cartan subgroup of level 27. In this paper, we complete the classification of 3-adic Galois images by showing that the normaliser of the non-split Cartan subgroup of level 27 cannot occur as a 3-adic Galois image of a non-CM elliptic curve.
Our proof proceeds via computing the $\mathbf{Q}(ζ_3)$-rational points on a certain smooth plane quartic curve $X'_H$ (arising as a quotient of the modular curve $X_{ns}^+(27)$) defined over $\mathbf{Q}(ζ_3)$ whose Jacobian has Mordell--Weil rank 6. To this end, we describe how to carry out the quadratic Chabauty method for a modular curve $X$ defined over a number field $F$, which, when applicable, determines a finite subset of $X(F\otimes\mathbf{Q}_p)$ in certain situations of larger Mordell--Weil rank than previously considered. Together with an analysis of local heights above 3, we apply this quadratic Chabauty method to determine $X'_H(\mathbf{Q}(ζ_3))$. This allows us to compute the set $X_{ns}^+(27)(\mathbf{Q})$, finishing the classification of 3-adic images of Galois.
Submitted 13 January, 2025;
originally announced January 2025.
-
A refined Chabauty--Coleman bound for surfaces
Authors:
Jennifer S. Balakrishnan,
Jerson Caro
Abstract:
Caro and Pasten gave an explicit upper bound on the number of rational points on a hyperbolic surface that is embedded in an abelian variety of rank at most one. We show how to use their method to produce a refined bound on the number of rational points on the surface $W_2 := C+C$ in the case of a hyperelliptic curve $C$ of genus $3$ over $\mathbb{Q}$. Combining this with work of Siksek, we use this to determine $W_2(\mathbb{Q})$ in a selection of examples.
Submitted 3 February, 2025; v1 submitted 6 January, 2025;
originally announced January 2025.
-
Stochastic interventions, sensitivity analysis, and optimal transport
Authors:
Alexander W. Levis,
Edward H. Kennedy,
Alec McClean,
Sivaraman Balakrishnan,
Larry Wasserman
Abstract:
Recent methodological research in causal inference has focused on effects of stochastic interventions, which assign treatment randomly, often according to subject-specific covariates. In this work, we demonstrate that the usual notion of stochastic interventions has a surprising property: when there is unmeasured confounding, bounds on their effects do not collapse when the policy approaches the observational regime. As an alternative, we propose to study generalized policies, treatment rules that can depend on covariates, the natural value of treatment, and auxiliary randomness. We show that certain generalized policy formulations can resolve the "non-collapsing" bound issue: bounds narrow to a point when the target treatment distribution approaches that in the observed data. Moreover, drawing connections to the theory of optimal transport, we characterize generalized policies that minimize worst-case bound width in various sensitivity analysis models, as well as corresponding sharp bounds on their causal effects. These optimal policies are new, and can have a more parsimonious interpretation compared to their usual stochastic policy analogues. Finally, we develop flexible, efficient, and robust estimators for the sharp nonparametric bounds that emerge from the framework.
Submitted 21 November, 2024;
originally announced November 2024.
-
Two-Sample Testing with a Graph-Based Total Variation Integral Probability Metric
Authors:
Alden Green,
Sivaraman Balakrishnan,
Ryan J. Tibshirani
Abstract:
We consider a novel multivariate nonparametric two-sample testing problem where, under the alternative, distributions $P$ and $Q$ are separated in an integral probability metric over functions of bounded total variation (TV IPM). We propose a new test, the graph TV test, which uses a graph-based approximation to the TV IPM as its test statistic. We show that this test, computed with an $\varepsilon$-neighborhood graph and calibrated by permutation, is minimax rate-optimal for detecting alternatives separated in the TV IPM. As an important special case, we show that this implies the graph TV test is optimal for detecting spatially localized alternatives, whereas the $χ^2$ test is provably suboptimal. Our theory is supported with numerical experiments on simulated and real data.
Submitted 23 September, 2024;
originally announced September 2024.
-
Shadow line distributions
Authors:
Jennifer S. Balakrishnan,
Mirela Çiperiani,
Barry Mazur,
Karl Rubin
Abstract:
Let $E$ be an elliptic curve over $\mathbb{Q}$ with Mordell--Weil rank $2$ and $p$ be an odd prime of good ordinary reduction. For every imaginary quadratic field $K$ satisfying the Heegner hypothesis, there is (subject to the Shafarevich--Tate conjecture) a line, i.e., a free $\mathbb{Z}_p$-submodule of rank $1$, in $ E(K)\otimes \mathbb{Z}_p$ given by universal norms coming from the Mordell--Weil groups of subfields of the anticyclotomic $\mathbb{Z}_p$-extension of $K$; we call it the {\it shadow line}. When the twist of $E$ by $K$ has analytic rank $1$, the shadow line is conjectured to lie in $E(\mathbb{Q})\otimes\mathbb{Z}_p$; we verify this computationally in all our examples. We study the distribution of shadow lines in $E(\mathbb{Q})\otimes\mathbb{Z}_p$ as $K$ varies, framing conjectures based on the computations we have made.
Submitted 12 May, 2025; v1 submitted 1 September, 2024;
originally announced September 2024.
-
Causal Inference with High-dimensional Discrete Covariates
Authors:
Zhenghao Zeng,
Sivaraman Balakrishnan,
Yanjun Han,
Edward H. Kennedy
Abstract:
When estimating causal effects from observational studies, researchers often need to adjust for many covariates to deconfound the non-causal relationship between exposure and outcome, among which many covariates are discrete. The behavior of commonly used estimators in the presence of many discrete covariates is not well understood since their properties are often analyzed under structural assumptions including sparsity and smoothness, which do not apply in discrete settings. In this work, we study the estimation of causal effects in a model where the covariates required for confounding adjustment are discrete but high-dimensional, meaning the number of categories $d$ is comparable with or even larger than sample size $n$. Specifically, we show the mean squared error of commonly used regression, weighting and doubly robust estimators is bounded by $\frac{d^2}{n^2}+\frac{1}{n}$. We then prove the minimax lower bound for the average treatment effect is of order $\frac{d^2}{n^2 \log^2 n}+\frac{1}{n}$, which characterizes the fundamental difficulty of causal effect estimation in the high-dimensional discrete setting, and shows the estimators mentioned above are rate-optimal up to log-factors. We further consider additional structures that can be exploited, namely effect homogeneity and prior knowledge of the covariate distribution, and propose new estimators that enjoy faster convergence rates of order $\frac{d}{n^2} + \frac{1}{n}$, which achieve consistency in a broader regime. The results are illustrated empirically via simulation studies.
Submitted 5 May, 2024; v1 submitted 30 April, 2024;
originally announced May 2024.
-
Double Cross-fit Doubly Robust Estimators: Beyond Series Regression
Authors:
Alec McClean,
Sivaraman Balakrishnan,
Edward H. Kennedy,
Larry Wasserman
Abstract:
Doubly robust estimators with cross-fitting have gained popularity in causal inference due to their favorable structure-agnostic error guarantees. However, when additional structure, such as Hölder smoothness, is available then more accurate "double cross-fit doubly robust" (DCDR) estimators can be constructed by splitting the training data and undersmoothing nuisance function estimators on independent samples. We study a DCDR estimator of the Expected Conditional Covariance, a functional of interest in causal inference and conditional independence testing. We first provide a structure-agnostic error analysis for the DCDR estimator with no assumptions on the nuisance functions or their estimators. Then, assuming the nuisance functions are Hölder smooth, but without assuming knowledge of the true smoothness level or the covariate density, we establish that DCDR estimators with several linear smoothers are $\sqrt{n}$-consistent and asymptotically normal under minimal conditions and achieve fast convergence rates in the non-$\sqrt{n}$ regime. When the covariate density and smoothnesses are known, we propose a minimax rate-optimal DCDR estimator based on undersmoothed kernel regression. Moreover, we show an undersmoothed DCDR estimator satisfies a slower-than-$\sqrt{n}$ central limit theorem, and that inference is possible even in the non-$\sqrt{n}$ regime. Finally, we support our theoretical results with simulations, providing intuition for double cross-fitting and undersmoothing, demonstrating where our estimator achieves $\sqrt{n}$-consistency while the usual "single cross-fit" estimator fails, and illustrating asymptotic normality for the undersmoothed DCDR estimator.
Submitted 7 May, 2025; v1 submitted 22 March, 2024;
originally announced March 2024.
-
Semi-Supervised U-statistics
Authors:
Ilmun Kim,
Larry Wasserman,
Sivaraman Balakrishnan,
Matey Neykov
Abstract:
Semi-supervised datasets are ubiquitous across diverse domains where obtaining fully labeled data is costly or time-consuming. The prevalence of such datasets has consistently driven the demand for new tools and methods that exploit the potential of unlabeled data. Responding to this demand, we introduce semi-supervised U-statistics enhanced by the abundance of unlabeled data, and investigate their statistical properties. We show that the proposed approach is asymptotically Normal and exhibits notable efficiency gains over classical U-statistics by effectively integrating various powerful prediction tools into the framework. To understand the fundamental difficulty of the problem, we derive minimax lower bounds in semi-supervised settings and showcase that our procedure is semi-parametrically efficient under regularity conditions. Moreover, tailored to bivariate kernels, we propose a refined approach that outperforms the classical U-statistic across all degeneracy regimes, and demonstrate its optimality properties. Simulation studies are conducted to corroborate our findings and to further demonstrate our framework.
Submitted 9 March, 2024; v1 submitted 29 February, 2024;
originally announced February 2024.
-
Central Limit Theorems for Smooth Optimal Transport Maps
Authors:
Tudor Manole,
Sivaraman Balakrishnan,
Jonathan Niles-Weed,
Larry Wasserman
Abstract:
One of the central objects in the theory of optimal transport is the Brenier map: the unique monotone transformation which pushes forward an absolutely continuous probability law onto any other given law. A line of recent work has analyzed $L^2$ convergence rates of plugin estimators of Brenier maps, which are defined as the Brenier map between density estimators of the underlying distributions. In this work, we show that such estimators satisfy a pointwise central limit theorem when the underlying laws are supported on the flat torus of dimension $d \geq 3$. We also derive a negative result, showing that these estimators do not converge weakly in $L^2$ when the dimension is sufficiently large. Our proofs hinge upon a quantitative linearization of the Monge-Ampère equation, which may be of independent interest. This result allows us to reduce our problem to that of deriving limit laws for the solution of a uniformly elliptic partial differential equation with a stochastic right-hand side, subject to periodic boundary conditions.
Submitted 16 September, 2024; v1 submitted 19 December, 2023;
originally announced December 2023.
-
Arbitrarily long strings of consecutive primes in special sets
Authors:
Sai Sanjeev Balakrishnan,
Félix Houde,
Vahagn Hovhannisyan,
Maryna Manskova,
Yiqing Wang
Abstract:
Let $F(x)$ be a function of the form $ \sum_{i=1}^r d_i x^{ρ_i}$ where $d_1,\ldots,d_r\in\mathbb{R}$, $0 \leq ρ_1 < \ldots < ρ_r,$ $ρ_r \not\in \mathbb{Z},ρ_i \in \mathbb{R}$ for $ 1 \leq i \leq r$ and $d_r\not=0$. We prove that sets of the form $\{ n \in \mathbb{N}: \{ F(n) \} \in U \}$ for any non-empty open set $U \subset [0,1)$ contain arbitrarily long strings of consecutive primes.
Submitted 30 November, 2023;
originally announced November 2023.
-
Conservative Inference for Counterfactuals
Authors:
Sivaraman Balakrishnan,
Edward Kennedy,
Larry Wasserman
Abstract:
In causal inference, the joint law of a set of counterfactual random variables is generally not identified. We show that a conservative version of the joint law - corresponding to the smallest treatment effect - is identified. Finding this law uses recent results from optimal transport theory. Under this conservative law we can bound causal effects and we may construct inferences for each individual's counterfactual dose-response curve. Intuitively, this is the flattest counterfactual curve for each subject that is consistent with the distribution of the observables. If the outcome is univariate then, under mild conditions, this curve is simply the quantile function of the counterfactual distribution that passes through the observed point. This curve corresponds to a nonparametric rank preserving structural model.
Submitted 19 October, 2023;
originally announced October 2023.
-
Causal Effect Estimation after Propensity Score Trimming with Continuous Treatments
Authors:
Zach Branson,
Edward H. Kennedy,
Sivaraman Balakrishnan,
Larry Wasserman
Abstract:
Propensity score trimming, which discards subjects with propensity scores below a threshold, is a common way to address positivity violations that complicate causal effect estimation. However, most works on trimming assume treatment is discrete and models for the outcome regression and propensity score are parametric. This work proposes nonparametric estimators for trimmed average causal effects in the case of continuous treatments based on efficient influence functions. For continuous treatments, an efficient influence function for a trimmed causal effect does not exist, due to a lack of pathwise differentiability induced by trimming and a continuous treatment. Thus, we target a smoothed version of the trimmed causal effect for which an efficient influence function exists. Our resulting estimators exhibit doubly-robust style guarantees, with error involving products or squares of errors for the outcome regression and propensity score, which allows for valid inference even when nonparametric models are used. Our results allow the trimming threshold to be fixed or defined as a quantile of the propensity score, such that confidence intervals incorporate uncertainty involved in threshold estimation. These findings are validated via simulation and an application, thereby showing how to efficiently-but-flexibly estimate trimmed causal effects with continuous treatments.
Submitted 29 July, 2024; v1 submitted 1 September, 2023;
originally announced September 2023.
-
Nearly Minimax Optimal Wasserstein Conditional Independence Testing
Authors:
Matey Neykov,
Larry Wasserman,
Ilmun Kim,
Sivaraman Balakrishnan
Abstract:
This paper is concerned with minimax conditional independence testing. In contrast to some previous works on the topic, which use the total variation distance to separate the null from the alternative, here we use the Wasserstein distance. In addition, we impose Wasserstein smoothness conditions which on bounded domains are weaker than the corresponding total variation smoothness imposed, for instance, by Neykov et al. [2021]. This added flexibility expands the distributions which are allowed under the null and the alternative to include distributions which may contain point masses for instance. We characterize the optimal rate of the critical radius of testing up to logarithmic factors. Our test statistic which nearly achieves the optimal critical radius is novel, and can be thought of as a weighted multi-resolution version of the U-statistic studied by Neykov et al. [2021].
Submitted 16 August, 2023;
originally announced August 2023.
-
Conditional Independence Testing for Discrete Distributions: Beyond $χ^2$- and $G$-tests
Authors:
Ilmun Kim,
Matey Neykov,
Sivaraman Balakrishnan,
Larry Wasserman
Abstract:
This paper is concerned with the problem of conditional independence testing for discrete data. In recent years, researchers have shed new light on this fundamental problem, emphasizing finite-sample optimality. The non-asymptotic viewpoint adopted in these works has led to novel conditional independence tests that enjoy certain optimality under various regimes. Despite their attractive theoretical properties, the considered tests are not necessarily practical, relying on a Poissonization trick and unspecified constants in their critical values. In this work, we attempt to bridge the gap between theory and practice by reproving optimality without Poissonization and calibrating tests using Monte Carlo permutations. Along the way, we also prove that classical asymptotic $χ^2$- and $G$-tests are notably sub-optimal in a high-dimensional regime, which justifies the demand for new tools. Our theoretical results are complemented by experiments on both simulated and real-world datasets. Accompanying this paper is an R package UCI that implements the proposed tests.
Submitted 28 October, 2023; v1 submitted 10 August, 2023;
originally announced August 2023.
-
Ogg's Torsion conjecture: Fifty years later
Authors:
Jennifer S. Balakrishnan,
Barry Mazur
Abstract:
Andrew Ogg's mathematical viewpoint has inspired an increasingly broad array of results and conjectures. His results and conjectures have earmarked fruitful turning points in our subject, and his influence has been such a gift to all of us.
Ogg's celebrated Torsion Conjecture -- as it relates to modular curves -- can be paraphrased as saying that rational points (on the modular curves that parametrize torsion points on elliptic curves) exist if and only if there is a good geometric reason for them to exist.
We give a survey of Ogg's Torsion Conjecture and the subsequent developments in our understanding of rational points on modular curves over the last fifty years.
Submitted 8 August, 2024; v1 submitted 10 July, 2023;
originally announced July 2023.
-
The Fundamental Limits of Structure-Agnostic Functional Estimation
Authors:
Sivaraman Balakrishnan,
Edward H. Kennedy,
Larry Wasserman
Abstract:
Many recent developments in causal inference, and functional estimation problems more generally, have been motivated by the fact that classical one-step (first-order) debiasing methods, or their more recent sample-split double machine-learning avatars, can outperform plugin estimators under surprisingly weak conditions. These first-order corrections improve on plugin estimators in a black-box fashion, and consequently are often used in conjunction with powerful off-the-shelf estimation methods. These first-order methods are however provably suboptimal in a minimax sense for functional estimation when the nuisance functions live in Hölder-type function spaces. This suboptimality of first-order debiasing has motivated the development of "higher-order" debiasing methods. The resulting estimators are, in some cases, provably optimal over Hölder-type spaces, but both the estimators which are minimax-optimal and their analyses are crucially tied to properties of the underlying function space.
In this paper we investigate the fundamental limits of structure-agnostic functional estimation, where relatively weak conditions are placed on the underlying nuisance functions. We show that there is a strong sense in which existing first-order methods are optimal. We achieve this goal by providing a formalization of the problem of functional estimation with black-box nuisance function estimates, and deriving minimax lower bounds for this problem. Our results highlight some clear tradeoffs in functional estimation -- if we wish to remain agnostic to the underlying nuisance function spaces, impose only high-level rate conditions, and maintain compatibility with black-box nuisance estimators then first-order methods are optimal. When we have an understanding of the structure of the underlying nuisance functions then carefully constructed higher-order estimators can outperform first-order estimators.
Submitted 7 June, 2025; v1 submitted 6 May, 2023;
originally announced May 2023.
-
Data-Driven Observability Decomposition with Koopman Operators for Optimization of Output Functions of Nonlinear Systems
Authors:
Shara Balakrishnan,
Aqib Hasnain,
Robert Egbert,
Enoch Yeung
Abstract:
When complex systems with nonlinear dynamics achieve an output performance objective, only a fraction of the state dynamics significantly impacts that output. Those minimal state dynamics can be identified using the differential geometric approach to the observability of nonlinear systems, but the theory is limited to only analytical systems. In this paper, we extend the notion of nonlinear observable decomposition to the more general class of data-informed systems. We employ Koopman operator theory, which encapsulates nonlinear dynamics in linear models, allowing us to bridge the gap between linear and nonlinear observability notions. We propose a new algorithm to learn Koopman operator representations that capture the system dynamics while ensuring that the output performance measure is in the span of its observables. We show that a transformation of this linear, output-inclusive Koopman model renders a new minimum Koopman representation. This representation embodies only the observable portion of the nonlinear observable decomposition of the original system. A prime application of this theory is to identify genes in biological systems that correspond to specific phenotypes, the performance measure. We simulate two biological gene networks and demonstrate that the observability of Koopman operators can successfully identify genes that drive each phenotype. We anticipate our novel system identification tool will effectively discover reduced gene networks that drive complex behaviors in biological systems.
Submitted 17 October, 2022;
originally announced October 2022.
-
Median Regularity and Honest Inference
Authors:
Arun Kumar Kuchibhotla,
Sivaraman Balakrishnan,
Larry Wasserman
Abstract:
We introduce a new notion of regularity of an estimator called median regularity. We prove that uniformly valid (honest) inference for a functional is possible if and only if there exists a median regular estimator of that functional. To our knowledge, such a notion of regularity that is necessary for uniformly valid inference is unavailable in the literature.
Submitted 6 June, 2022;
originally announced June 2022.
-
Minimax rates for heterogeneous causal effect estimation
Authors:
Edward H. Kennedy,
Sivaraman Balakrishnan,
James M. Robins,
Larry Wasserman
Abstract:
Estimation of heterogeneous causal effects - i.e., how effects of policies and treatments vary across subjects - is a fundamental task in causal inference. Many methods for estimating conditional average treatment effects (CATEs) have been proposed in recent years, but questions surrounding optimality have remained largely unanswered. In particular, a minimax theory of optimality has yet to be developed, with the minimax rate of convergence and construction of rate-optimal estimators remaining open problems. In this paper we derive the minimax rate for CATE estimation, in a Hölder-smooth nonparametric model, and present a new local polynomial estimator, giving high-level conditions under which it is minimax optimal. Our minimax lower bound is derived via a localized version of the method of fuzzy hypotheses, combining lower bound constructions for nonparametric regression and functional estimation. Our proposed estimator can be viewed as a local polynomial R-Learner, based on a localized modification of higher-order influence function methods. The minimax rate we find exhibits several interesting features, including a non-standard elbow phenomenon and an unusual interpolation between nonparametric regression and functional estimation rates. The latter quantifies how the CATE, as an estimand, can be viewed as a regression/functional hybrid.
△ Less
Submitted 22 December, 2023; v1 submitted 1 March, 2022;
originally announced March 2022.
-
Local permutation tests for conditional independence
Authors:
Ilmun Kim,
Matey Neykov,
Sivaraman Balakrishnan,
Larry Wasserman
Abstract:
In this paper, we investigate local permutation tests for testing conditional independence between two random vectors $X$ and $Y$ given $Z$. The local permutation test determines the significance of a test statistic by locally shuffling samples which share similar values of the conditioning variables $Z$, and it forms a natural extension of the usual permutation approach for unconditional independence testing. Despite its simplicity and empirical support, the theoretical underpinnings of the local permutation test remain unclear. Motivated by this gap, this paper aims to establish theoretical foundations of local permutation tests with a particular focus on binning-based statistics. We start by revisiting the hardness of conditional independence testing and provide an upper bound for the power of any valid conditional independence test, which holds when the probability of observing collisions in $Z$ is small. This negative result naturally motivates us to impose additional restrictions on the possible distributions under the null and alternate. To this end, we focus our attention on certain classes of smooth distributions and identify provably tight conditions under which the local permutation method is universally valid, i.e. it is valid when applied to any (binning-based) test statistic. To complement this result on type I error control, we also show that in some cases, a binning-based statistic calibrated via the local permutation method can achieve minimax optimal power. We also introduce a double-binning permutation strategy, which yields a valid test over less smooth null distributions than the typical single-binning method without compromising much power. Finally, we present simulation results to support our theoretical findings.
△ Less
Submitted 6 January, 2022; v1 submitted 21 December, 2021;
originally announced December 2021.
-
Minimax Optimal Regression over Sobolev Spaces via Laplacian Eigenmaps on Neighborhood Graphs
Authors:
Alden Green,
Sivaraman Balakrishnan,
Ryan J. Tibshirani
Abstract:
In this paper we study the statistical properties of Principal Components Regression with Laplacian Eigenmaps (PCR-LE), a method for nonparametric regression based on Laplacian Eigenmaps (LE). PCR-LE works by projecting a vector of observed responses ${\bf Y} = (Y_1,\ldots,Y_n)$ onto a subspace spanned by certain eigenvectors of a neighborhood graph Laplacian. We show that PCR-LE achieves minimax rates of convergence for random design regression over Sobolev spaces. Under sufficient smoothness conditions on the design density $p$, PCR-LE achieves the optimal rates for both estimation (where the optimal rate in squared $L^2$ norm is known to be $n^{-2s/(2s + d)}$) and goodness-of-fit testing ($n^{-4s/(4s + d)}$). We also show that PCR-LE is \emph{manifold adaptive}: that is, we consider the situation where the design is supported on a manifold of small intrinsic dimension $m$, and give upper bounds establishing that PCR-LE achieves the faster minimax estimation ($n^{-2s/(2s + m)}$) and testing ($n^{-4s/(4s + m)}$) rates of convergence. Interestingly, these rates are almost always much faster than the known rates of convergence of graph Laplacian eigenvectors to their population-level limits; in other words, for this problem regression with estimated features appears to be much easier, statistically speaking, than estimating the features themselves. We support these theoretical results with empirical evidence.
△ Less
Submitted 14 November, 2021;
originally announced November 2021.
-
Heavy-tailed Streaming Statistical Estimation
Authors:
Che-Ping Tsai,
Adarsh Prasad,
Sivaraman Balakrishnan,
Pradeep Ravikumar
Abstract:
We consider the task of heavy-tailed statistical estimation given streaming $p$-dimensional samples. This could also be viewed as stochastic optimization under heavy-tailed distributions, with an additional $O(p)$ space complexity constraint. We design a clipped stochastic gradient descent algorithm and provide an improved analysis, under a more nuanced condition on the noise of the stochastic gradients, which we show is critical when analyzing stochastic optimization problems arising from general statistical estimation problems. Our results guarantee convergence not just in expectation but with exponential concentration, and moreover do so using $O(1)$ batch size. We provide consequences of our results for mean estimation and linear regression. Finally, we provide empirical corroboration of our results and algorithms via synthetic experiments for mean estimation and linear regression.
△ Less
Submitted 25 February, 2022; v1 submitted 25 August, 2021;
originally announced August 2021.
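As a rough illustration of the clipped stochastic gradient descent idea in the abstract above, the following sketch estimates a mean from a stream with batch size 1 and $O(p)$ memory. The function names, the $1/(t+1)$ step size, and the fixed clipping level are assumptions made for this example, not the schedule analyzed in the paper.

```python
import numpy as np

def clip(vector, level):
    """Rescale `vector` so that its Euclidean norm is at most `level`."""
    norm = np.linalg.norm(vector)
    if norm <= level:
        return vector
    return vector * (level / norm)

def streaming_clipped_mean(stream, dim, clip_level):
    """One pass of clipped SGD on the loss 0.5 * ||theta - x||^2.

    Only the current iterate is stored, so memory is O(dim).
    The step size 1 / (t + 1) is an illustrative choice.
    """
    theta = np.zeros(dim)
    for t, sample in enumerate(stream):
        gradient = theta - np.asarray(sample, dtype=float)
        theta = theta - clip(gradient, clip_level) / (t + 1)
    return theta
```

With a very large clipping level the update is never clipped and the iterate is exactly the running sample mean; a small level caps how far any single heavy-tailed sample can move the estimate.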
-
Plugin Estimation of Smooth Optimal Transport Maps
Authors:
Tudor Manole,
Sivaraman Balakrishnan,
Jonathan Niles-Weed,
Larry Wasserman
Abstract:
We analyze a number of natural estimators for the optimal transport map between two distributions and show that they are minimax optimal. We adopt the plugin approach: our estimators are simply optimal couplings between measures derived from our observations, appropriately extended so that they define functions on $\mathbb{R}^d$. When the underlying map is assumed to be Lipschitz, we show that computing the optimal coupling between the empirical measures, and extending it using linear smoothers, already gives a minimax optimal estimator. When the underlying map enjoys higher regularity, we show that the optimal coupling between appropriate nonparametric density estimates yields faster rates. Our work also provides new bounds on the risk of corresponding plugin estimators for the quadratic Wasserstein distance, and we show how this problem relates to that of estimating optimal transport maps using stability arguments for smooth and strongly convex Brenier potentials. As an application of our results, we derive central limit theorems for plugin estimators of the squared Wasserstein distance, which are centered at their population counterpart when the underlying distributions have sufficiently smooth densities. In contrast to known central limit theorems for empirical estimators, this result easily lends itself to statistical inference for the quadratic Wasserstein distance.
△ Less
Submitted 16 June, 2024; v1 submitted 26 July, 2021;
originally announced July 2021.
-
The Effect of Sensor Fusion on Data-Driven Learning of Koopman Operators
Authors:
Shara Balakrishnan,
Aqib Hasnain,
Rob Egbert,
Enoch Yeung
Abstract:
Dictionary methods for system identification typically rely on one set of measurements to learn governing dynamics of a system. In this paper, we investigate how fusion of output measurements with state measurements affects the dictionary selection process in Koopman operator learning problems. While prior methods use dynamical conjugacy to show a direct link between Koopman eigenfunctions in two distinct data spaces (measurement channels), we explore the specific case where output measurements are nonlinear, non-invertible functions of the system state. This setup reflects the measurement constraints of many classes of physical systems, e.g., biological measurement data, where one type of measurement does not directly transform to another. We propose output constrained Koopman operators (OC-KOs) as a new framework to fuse two measurement sets. We show that OC-KOs are effective for sensor fusion by proving that when learning a Koopman operator, output measurement functions serve to constrain the space of potential Koopman observables and their eigenfunctions. Further, low-dimensional output measurements can be embedded to inform selection of Koopman dictionary functions for high-dimensional models. We propose two algorithms to identify OC-KO representations directly from data: a direct optimization method that uses state and output data simultaneously and a sequential optimization method. We prove a theorem to show that the solution spaces of the two optimization problems are equivalent. We illustrate these findings with a theoretical example and two numerical simulations.
△ Less
Submitted 29 June, 2021;
originally announced June 2021.
-
Minimax Optimal Regression over Sobolev Spaces via Laplacian Regularization on Neighborhood Graphs
Authors:
Alden Green,
Sivaraman Balakrishnan,
Ryan J. Tibshirani
Abstract:
In this paper we study the statistical properties of Laplacian smoothing, a graph-based approach to nonparametric regression. Under standard regularity conditions, we establish upper bounds on the error of the Laplacian smoothing estimator $\widehat{f}$, and a goodness-of-fit test also based on $\widehat{f}$. These upper bounds match the minimax optimal estimation and testing rates of convergence over the first-order Sobolev class $H^1(\mathcal{X})$, for $\mathcal{X}\subseteq \mathbb{R}^d$ and $1 \leq d < 4$; in the estimation problem, for $d = 4$, they are optimal modulo a $\log n$ factor. Additionally, we prove that Laplacian smoothing is manifold-adaptive: if $\mathcal{X} \subseteq \mathbb{R}^d$ is an $m$-dimensional manifold with $m < d$, then the error rate of Laplacian smoothing (in either estimation or testing) depends only on $m$, in the same way it would if $\mathcal{X}$ were a full-dimensional set in $\mathbb{R}^d$.
△ Less
Submitted 2 June, 2021;
originally announced June 2021.
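To make the Laplacian smoothing estimator in the abstract above concrete, here is a minimal sketch of one common form: minimize $\|Y - f\|^2 + \rho\, f^\top L f$ over vectors $f$, where $L$ is the Laplacian of a neighborhood graph on the design points, which gives $\widehat{f} = (I + \rho L)^{-1} Y$. The unweighted radius graph, the scaling of the penalty, and all function names are assumptions for illustration and may differ from the paper's definitions.

```python
import numpy as np

def neighborhood_graph_laplacian(points, radius):
    """Unnormalized Laplacian L = D - W of the graph joining points within `radius`."""
    points = np.asarray(points, dtype=float)
    if points.ndim == 1:
        points = points[:, None]
    diffs = points[:, None, :] - points[None, :, :]
    dists = np.linalg.norm(diffs, axis=-1)
    weights = (dists <= radius).astype(float)
    np.fill_diagonal(weights, 0.0)  # no self-loops
    return np.diag(weights.sum(axis=1)) - weights

def laplacian_smoothing(points, responses, radius, rho):
    """Solve (I + rho * L) f = Y, the minimizer of ||Y - f||^2 + rho * f' L f."""
    laplacian = neighborhood_graph_laplacian(points, radius)
    n = laplacian.shape[0]
    y = np.asarray(responses, dtype=float)
    return np.linalg.solve(np.eye(n) + rho * laplacian, y)
```

Because the constant vector lies in the null space of $L$, the fitted values keep the same average as the responses, and larger values of `rho` pull neighboring fitted values closer together.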
-
The HulC: Confidence Regions from Convex Hulls
Authors:
Arun Kumar Kuchibhotla,
Sivaraman Balakrishnan,
Larry Wasserman
Abstract:
We develop and analyze the HulC, an intuitive and general method for constructing confidence sets using the convex hull of estimates constructed from subsets of the data. Unlike classical methods which are based on estimating the (limiting) distribution of an estimator, the HulC is often simpler to use and effectively bypasses this step. In comparison to the bootstrap, the HulC requires fewer regularity conditions and succeeds in many examples where the bootstrap provably fails. Unlike subsampling, the HulC does not require knowledge of the rate of convergence of the estimators on which it is based. The validity of the HulC requires knowledge of the (asymptotic) median-bias of the estimators. We further analyze a variant of our basic method, called the Adaptive HulC, which is fully data-driven and estimates the median-bias using subsampling. We show that the Adaptive HulC retains the aforementioned strengths of the HulC. In certain cases where the underlying estimators are pathologically asymmetric the HulC and Adaptive HulC can fail to provide useful confidence sets. We propose a final variant, the Unimodal HulC, which can salvage the situation in cases where the distribution of the underlying estimator is (asymptotically) unimodal. We discuss these methods in the context of several challenging inferential problems which arise in parametric, semi-parametric, and non-parametric inference. Although our focus is on validity under weak regularity conditions, we also provide some general results on the width of the HulC confidence sets, showing that in many cases the HulC confidence sets have near-optimal width.
△ Less
Submitted 8 September, 2023; v1 submitted 30 May, 2021;
originally announced May 2021.
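The construction described in the abstract above can be sketched for a scalar parameter as follows. The sketch assumes a median-unbiased estimator: in that case the hull of $B$ independent estimates misses the target with probability $2^{1-B}$, so the smallest $B$ with $2^{1-B} \leq \alpha$ gives a (slightly conservative) level-$\alpha$ interval. The function names are hypothetical, and the refinements discussed in the abstract (known nonzero median bias, the Adaptive and Unimodal variants) are not included.

```python
import math
import numpy as np

def hulc_num_batches(alpha):
    """Smallest B with 2 ** (1 - B) <= alpha, assuming zero median bias."""
    return math.ceil(math.log2(2.0 / alpha))

def hulc_interval(data, estimator, alpha=0.05, seed=0):
    """Convex hull (min, max) of estimates computed on disjoint random batches."""
    data = np.asarray(data)
    num_batches = hulc_num_batches(alpha)
    rng = np.random.default_rng(seed)
    shuffled = rng.permutation(data)  # shuffles along the first axis
    batches = np.array_split(shuffled, num_batches)
    estimates = [estimator(batch) for batch in batches]
    return min(estimates), max(estimates)
```

For example, `hulc_interval(sample, np.median, alpha=0.05)` splits the sample into 6 batches and returns the smallest and largest batch medians. No variance estimate or limiting distribution is needed, which is the point made in the abstract.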
-
Minimax Optimal Conditional Density Estimation under Total Variation Smoothness
Authors:
Michael Li,
Matey Neykov,
Sivaraman Balakrishnan
Abstract:
This paper studies the minimax rate of nonparametric conditional density estimation under a weighted absolute value loss function in a multivariate setting. We first demonstrate that conditional density estimation is impossible if one only requires that $p_{X|Z}$ is smooth in $x$ for all values of $z$. This motivates us to consider a sub-class of absolutely continuous distributions, restricting the conditional density $p_{X|Z}(x|z)$ to not only be Hölder smooth in $x$, but also be total variation smooth in $z$. We propose a corresponding kernel-based estimator and prove that it achieves the minimax rate. We give some simple examples of densities satisfying our assumptions which imply that our results are not vacuous. Finally, we propose an estimator which achieves the minimax optimal rate adaptively, i.e., without the need to know the smoothness parameter values in advance. Crucially, both of our estimators (the adaptive and non-adaptive ones) impose no assumptions on the marginal density $p_Z$, and are not obtained as a ratio between two kernel smoothing estimators, which may sound like a go-to approach in this problem.
△ Less
Submitted 12 March, 2021;
originally announced March 2021.
-
Semiparametric counterfactual density estimation
Authors:
Edward H. Kennedy,
Sivaraman Balakrishnan,
Larry Wasserman
Abstract:
Causal effects are often characterized with averages, which can give an incomplete picture of the underlying counterfactual distributions. Here we consider estimating the entire counterfactual density and generic functionals thereof. We focus on two kinds of target parameters. The first is a density approximation, defined by a projection onto a finite-dimensional model using a generalized distance metric, which includes f-divergences as well as $L_p$ norms. The second is the distance between counterfactual densities, which can be used as a more nuanced effect measure than the mean difference, and as a tool for model selection. We study nonparametric efficiency bounds for these targets, giving results for smooth but otherwise generic models and distances. Importantly, we show how these bounds connect to means of particular non-trivial functions of counterfactuals, linking the problems of density and mean estimation. We go on to propose doubly robust-style estimators for the density approximations and distances, and study their rates of convergence, showing they can be optimally efficient in large nonparametric models. We also give analogous methods for model selection and aggregation, when many models may be available and of interest. Our results all hold for generic models and distances, but throughout we highlight what happens for particular choices, such as $L_2$ projections on linear models, and KL projections on exponential families. Finally we illustrate by estimating the density of CD4 count among patients with HIV, had all been treated with combination therapy versus zidovudine alone, as well as a density effect. Our results suggest combination therapy may have increased CD4 count most for high-risk patients. Our methods are implemented in the freely available R package npcausal on GitHub.
△ Less
Submitted 23 February, 2021;
originally announced February 2021.
-
Even values of Ramanujan's tau-function
Authors:
Jennifer S. Balakrishnan,
Ken Ono,
Wei-Lun Tsai
Abstract:
In the spirit of Lehmer's speculation that Ramanujan's tau-function never vanishes, it is natural to ask whether any given integer $\alpha$ is a value of $\tau(n)$. For odd $\alpha$, Murty, Murty, and Shorey proved that $\tau(n)\neq \alpha$ for sufficiently large $n$. Several recent papers have identified explicit examples of odd $\alpha$ which are not tau-values. Here we apply these results (most notably the recent work of Bennett, Gherga, Patel, and Siksek) to offer the first examples of even integers that are not tau-values. Namely, for primes $\ell$ we find that $$ \tau(n)\not \in \{ \pm 2\ell \ : \ 3\leq \ell< 100\} \cup \{\pm 2\ell^2 \ : \ 3\leq \ell <100\} \cup \{\pm 2\ell^3 \ : \ 3\leq \ell<100\ {\text {\rm with $\ell\neq 59$}}\}.$$ Moreover, we obtain such results for infinitely many powers of each prime $3\leq \ell<100$. As an example, for $\ell=97$ we prove that $$\tau(n)\not \in \{ 2\cdot 97^j \ : \ 1\leq j\not \equiv 0\pmod{44}\}\cup \{-2\cdot 97^j \ : \ j\geq 1\}.$$ The method of proof applies mutatis mutandis to newforms with residually reducible mod 2 Galois representation and is easily adapted to generic newforms with integer coefficients.
△ Less
Submitted 13 December, 2021; v1 submitted 29 January, 2021;
originally announced February 2021.
-
Quadratic Chabauty for modular curves: Algorithms and examples
Authors:
Jennifer S. Balakrishnan,
Netan Dogra,
Jan Steffen Müller,
Jan Tuitman,
Jan Vonk
Abstract:
We describe how the quadratic Chabauty method may be applied to explicitly determine the set of rational points on modular curves of genus $g>1$ whose Jacobians have Mordell--Weil rank $g$. This extends our previous work on the split Cartan curve of level 13 and allows us to consider modular curves that may have few known rational points or nontrivial local height contributions at primes of bad reduction. We illustrate our algorithms with a number of examples where we determine the set of rational points on several modular curves of genus 2 and 3: this includes Atkin--Lehner quotients $X_0^+(N)$ of prime level $N$, the curve $X_{S_4}(13)$, as well as a few other curves relevant to Mazur's Program B. We also describe the computation of rational points on the genus 6 non-split Cartan modular curve $X_{\textrm{ns}} ^+ (17)$.
△ Less
Submitted 7 March, 2023; v1 submitted 5 January, 2021;
originally announced January 2021.
-
Prediction of fitness in bacteria with causal jump dynamic mode decomposition
Authors:
Shara Balakrishnan,
Aqib Hasnain,
Nibodh Boddupalli,
Dennis M. Joshy,
Robert G. Egbert,
Enoch Yeung
Abstract:
In this paper, we consider the problem of learning a predictive model for population cell growth dynamics as a function of the media conditions. We first introduce a generic data-driven framework for training operator-theoretic models to predict cell growth rate. We then introduce the experimental design and data generated in this study, namely growth curves of Pseudomonas putida as a function of casein and glucose concentrations. We use a data-driven approach for model identification, specifically the nonlinear autoregressive (NAR) model, to represent the dynamics. We show theoretically that Hankel DMD can be used to obtain a solution of the NAR model. We show that it identifies a constrained NAR model; to obtain a more general solution, we define a causal state space system using 1-step, 2-step, ..., τ-step predictors of the NAR model and identify a Koopman operator for this model using extended dynamic mode decomposition. We call the hybrid scheme causal-jump dynamic mode decomposition, and we illustrate it on a growth profile or fitness prediction challenge as a function of different input growth conditions. We show that our model is able to recapitulate training growth curve data with 96.6% accuracy and predict test growth curve data with 91% accuracy.
Submitted 22 June, 2020;
originally announced June 2020.
-
Variants of Lehmer's speculation for newforms
Authors:
Jennifer S. Balakrishnan,
William Craig,
Ken Ono,
Wei-Lun Tsai
Abstract:
In the spirit of Lehmer's unresolved speculation on the nonvanishing of Ramanujan's tau-function, it is natural to ask whether a fixed integer is a value of $\tau(n)$ or is a Fourier coefficient $a_f(n)$ of any given newform $f(z)$. We offer a method, which applies to newforms with integer coefficients and trivial residual mod 2 Galois representation, that answers this question for odd integers. We determine infinitely many spaces for which the primes $3\leq \ell\leq 37$ are not absolute values of coefficients of newforms with integer coefficients. For $\tau(n)$ with $n>1$, we prove that $$\tau(n)\not \in \{\pm 1, \pm 3, \pm 5, \pm 7, \pm 13, \pm 17, -19, \pm 23, \pm 37, \pm 691\},$$ and assuming GRH we show for primes $\ell$ that $$\tau(n)\not \in \left \{ \pm \ell\ : \ 41\leq \ell\leq 97 \ {\textrm{with}}\ \left(\frac{\ell}{5}\right)=-1\right\} \cup \left \{ -11, -29, -31, -41, -59, -61, -71, -79, -89\right\}. $$ We also obtain sharp lower bounds for the number of prime factors of such newform coefficients. In the weight aspect, for powers of odd primes $\ell$, we prove that $\pm \ell^m$ is not a coefficient of any such newform $f$ with weight $2k>M^{\pm}(\ell,m)=O_{\ell}(m)$ and even level coprime to $\ell,$ where $M^{\pm}(\ell,m)$ is effectively computable.
Submitted 24 September, 2023; v1 submitted 20 May, 2020;
originally announced May 2020.
-
Variations of Lehmer's Conjecture for Ramanujan's tau-function
Authors:
Jennifer S. Balakrishnan,
William Craig,
Ken Ono
Abstract:
We consider natural variants of Lehmer's unresolved conjecture that Ramanujan's tau-function never vanishes. Namely, for $n>1$ we prove that $$τ(n)\not \in \{\pm 1, \pm 3, \pm 5, \pm 7, \pm 691\}.$$ This result is an example of general theorems for newforms with trivial mod 2 residual Galois representation, which will appear in forthcoming work of the authors with Wei-Lun Tsai. Ramanujan's well-known congruences for $τ(n)$ allow for the simplified proof in these special cases. We make use of the theory of Lucas sequences, the Chabauty-Coleman method for hyperelliptic curves, and facts about certain Thue equations.
Submitted 20 May, 2020;
originally announced May 2020.
-
Minimax optimality of permutation tests
Authors:
Ilmun Kim,
Sivaraman Balakrishnan,
Larry Wasserman
Abstract:
Permutation tests are widely used in statistics, providing a finite-sample guarantee on the type I error rate whenever the distribution of the samples under the null hypothesis is invariant to some rearrangement. Despite its increasing popularity and empirical success, theoretical properties of the permutation test, especially its power, have not been fully explored beyond simple cases. In this paper, we attempt to partly fill this gap by presenting a general non-asymptotic framework for analyzing the minimax power of the permutation test. The utility of our proposed framework is illustrated in the context of two-sample and independence testing under both discrete and continuous settings. In each setting, we introduce permutation tests based on U-statistics and study their minimax performance. We also develop exponential concentration bounds for permuted U-statistics based on a novel coupling idea, which may be of independent interest. Building on these exponential bounds, we introduce permutation tests which are adaptive to unknown smoothness parameters without losing much power. The proposed framework is further illustrated using more sophisticated test statistics including weighted U-statistics for multinomial testing and Gaussian kernel-based statistics for density testing. Finally, we provide some simulation results that further justify the permutation approach.
Submitted 25 May, 2022; v1 submitted 30 March, 2020;
originally announced March 2020.
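For readers unfamiliar with the basic procedure the abstract above analyzes, here is a minimal sketch of a two-sample permutation test. It is not the authors' construction: the statistic `mean_diff` is a simple stand-in for the U-statistics studied in the paper, and the names `permutation_p_value`, `n_perm`, and `seed` are illustrative assumptions.

```python
import random


def mean_diff(xs, ys):
    """Absolute difference of sample means (a stand-in test statistic)."""
    return abs(sum(xs) / len(xs) - sum(ys) / len(ys))


def permutation_p_value(xs, ys, statistic=mean_diff, n_perm=999, seed=0):
    """Monte Carlo permutation p-value for the two-sample problem.

    Under the null hypothesis the pooled sample is exchangeable, so the
    observed statistic is compared with its values on random relabelings.
    """
    rng = random.Random(seed)
    observed = statistic(xs, ys)
    pooled = list(xs) + list(ys)
    n = len(xs)
    exceed = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        if statistic(pooled[:n], pooled[n:]) >= observed:
            exceed += 1
    # The "+1" terms count the observed labeling itself, which keeps the
    # p-value valid in finite samples.
    return (1 + exceed) / (1 + n_perm)


if __name__ == "__main__":
    a = [0.1, 0.4, 0.35, 0.8, 0.2]
    b = [1.1, 0.9, 1.6, 1.3, 1.2]
    print(permutation_p_value(a, b))
```

Rejecting when the returned value is at most the nominal level gives the finite-sample type I error control mentioned in the abstract; the power analysis is the subject of the paper itself.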
-
Minimax Optimal Conditional Independence Testing
Authors:
Matey Neykov,
Sivaraman Balakrishnan,
Larry Wasserman
Abstract:
We consider the problem of conditional independence testing of $X$ and $Y$ given $Z$ where $X,Y$ and $Z$ are three real random variables and $Z$ is continuous. We focus on two main cases - when $X$ and $Y$ are both discrete, and when $X$ and $Y$ are both continuous. In view of recent results on conditional independence testing (Shah and Peters, 2018), one cannot hope to design non-trivial tests, which control the type I error for all absolutely continuous conditionally independent distributions, while still ensuring power against interesting alternatives. Consequently, we identify various, natural smoothness assumptions on the conditional distributions of $X,Y|Z=z$ as $z$ varies in the support of $Z$, and study the hardness of conditional independence testing under these smoothness assumptions. We derive matching lower and upper bounds on the critical radius of separation between the null and alternative hypotheses in the total variation metric. The tests we consider are easily implementable and rely on binning the support of the continuous variable $Z$. To complement these results, we provide a new proof of the hardness result of Shah and Peters.
Submitted 1 July, 2021; v1 submitted 9 January, 2020;
originally announced January 2020.
-
Universal Inference
Authors:
Larry Wasserman,
Aaditya Ramdas,
Sivaraman Balakrishnan
Abstract:
We propose a general method for constructing hypothesis tests and confidence sets that have finite sample guarantees without regularity conditions. We refer to such procedures as "universal." The method is very simple and is based on a modified version of the usual likelihood ratio statistic, that we call "the split likelihood ratio test" (split LRT). The method is especially appealing for irregular statistical models. Canonical examples include mixture models and models that arise in shape-constrained inference. Constructing tests and confidence sets for such models is notoriously difficult. Typical inference methods, like the likelihood ratio test, are not useful in these cases because they have intractable limiting distributions. In contrast, the method we suggest works for any parametric model and also for some nonparametric models. The split LRT can also be used with profile likelihoods to deal with nuisance parameters, and it can also be run sequentially to yield anytime-valid $p$-values and confidence sequences.
Submitted 19 October, 2022; v1 submitted 24 December, 2019;
originally announced December 2019.
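The split likelihood ratio test described in the abstract above can be illustrated on a toy problem. The sketch below assumes a Gaussian model with unit variance and a simple null hypothesis on the mean; both choices, along with the function names, are illustrative assumptions and do not come from the listing. The data are split in two halves, the parameter is estimated on one half, and the likelihood ratio is evaluated on the other half and compared with 1/alpha.

```python
import math
import random


def gaussian_loglik(data, theta):
    """Log-likelihood of N(theta, 1), up to an additive constant."""
    return -0.5 * sum((x - theta) ** 2 for x in data)


def split_lrt_rejects(data, theta0, alpha=0.05, seed=0):
    """Split LRT of H0: mean == theta0 in the assumed N(theta, 1) model."""
    rng = random.Random(seed)
    shuffled = list(data)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    d0, d1 = shuffled[:half], shuffled[half:]

    # Any estimator computed from d1 alone may be used; here, the MLE.
    theta_hat1 = sum(d1) / len(d1)

    # Likelihood ratio evaluated on the held-out half d0. The additive
    # constant dropped in gaussian_loglik cancels in the difference.
    log_ratio = gaussian_loglik(d0, theta_hat1) - gaussian_loglik(d0, theta0)

    # Reject when the ratio exceeds 1/alpha.
    return log_ratio > math.log(1.0 / alpha)


if __name__ == "__main__":
    rng = random.Random(1)
    sample = [rng.gauss(0.7, 1.0) for _ in range(200)]
    print(split_lrt_rejects(sample, theta0=0.0))
```

For a composite null, the denominator would be the likelihood maximized over the null set on the held-out half; this sketch covers only the simple-null case.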
-
Statistical Guarantees for Local Spectral Clustering on Random Neighborhood Graphs
Authors:
Alden Green,
Sivaraman Balakrishnan,
Ryan J. Tibshirani
Abstract:
We study the Personalized PageRank (PPR) algorithm, a local spectral method for clustering, which extracts clusters using locally-biased random walks around a given seed node. In contrast to previous work, we adopt a classical statistical learning setup, where we obtain samples from an unknown nonparametric distribution, and aim to identify sufficiently salient clusters. We introduce a trio of population-level functionals -- the normalized cut, conductance, and local spread, analogous to graph-based functionals of the same name -- and prove that PPR, run on a neighborhood graph, recovers clusters with small population normalized cut and large conductance and local spread. We apply our general theory to establish that PPR identifies connected regions of high density (density clusters) that satisfy a set of natural geometric conditions. We also show a converse result, that PPR can fail to recover geometrically poorly-conditioned density clusters, even asymptotically. Finally, we provide empirical support for our theory.
Submitted 22 December, 2021; v1 submitted 21 November, 2019;
originally announced November 2019.
-
Two recent p-adic approaches towards the (effective) Mordell conjecture
Authors:
Jennifer S. Balakrishnan,
Alex J. Best,
Francesca Bianchi,
Brian Lawrence,
J. Steffen Müller,
Nicholas Triantafillou,
Jan Vonk
Abstract:
We give an introductory account of two recent approaches towards an effective proof of the Mordell conjecture, due to Lawrence--Venkatesh and Kim. The latter method, which is usually called the method of Chabauty--Kim or non-abelian Chabauty in the literature, has the advantage that in some cases it has been turned into an effective method to determine the set of rational points on a curve, and we illustrate this by presenting three new examples of modular curves where this set can be determined.
Submitted 19 January, 2020; v1 submitted 28 October, 2019;
originally announced October 2019.
-
Explicit quadratic Chabauty over number fields
Authors:
Jennifer S. Balakrishnan,
Amnon Besser,
Francesca Bianchi,
J. Steffen Müller
Abstract:
We generalize the explicit quadratic Chabauty techniques for integral points on odd degree hyperelliptic curves and for rational points on genus 2 bielliptic curves to arbitrary number fields using restriction of scalars. This is achieved by combining equations coming from Siksek's extension of classical Chabauty with equations defined in terms of p-adic heights attached to independent continuous idele class characters. We give several examples to show the practicality of our methods.
Submitted 15 June, 2020; v1 submitted 10 October, 2019;
originally announced October 2019.
-
Steady state programming of controlled nonlinear systems via deep dynamic mode decomposition
Authors:
Aqib Hasnain,
Nibodh Boddupalli,
Shara Balakrishnan,
Enoch Yeung
Abstract:
This paper describes the optimal selection of a control policy to program the steady state of controlled nonlinear systems with hyperbolic fixed points. This work is motivated by the field of synthetic biology, in which saddle points are common (along with limit cycles), and the aim is to program cells to perform both digital and analog computation, though developing genetic digital computation has been the main focus. We frame the analog computing challenge as that of generating a steady state input-output function inside living cells. To program the steady state, a data-driven approach is taken wherein an approximation of the Koopman operator, identified via deep dynamic mode decomposition, is used to describe the dynamics of the system linearly. The new representation of the dynamics is then used to solve an optimization problem for the input which maximizes a direction in state space. Some added structure on the Koopman operator learning process for controlled systems is given for dynamics that are separable in the state and input. Finally, the methods are demonstrated on simulation examples of an incoherent feedforward loop and a combinatorial promoter system, two common network architectures seen in the field of synthetic biology.
Submitted 9 June, 2020; v1 submitted 29 September, 2019;
originally announced September 2019.
-
Minimax Confidence Intervals for the Sliced Wasserstein Distance
Authors:
Tudor Manole,
Sivaraman Balakrishnan,
Larry Wasserman
Abstract:
Motivated by the growing popularity of variants of the Wasserstein distance in statistics and machine learning, we study statistical inference for the Sliced Wasserstein distance--an easily computable variant of the Wasserstein distance. Specifically, we construct confidence intervals for the Sliced Wasserstein distance which have finite-sample validity under no assumptions or under mild moment assumptions. These intervals are adaptive in length to the regularity of the underlying distributions. We also bound the minimax risk of estimating the Sliced Wasserstein distance, and as a consequence establish that the lengths of our proposed confidence intervals are minimax optimal over appropriate distribution classes. To motivate the choice of these classes, we also study minimax rates of estimating a distribution under the Sliced Wasserstein distance. These theoretical findings are complemented with a simulation study demonstrating the deficiencies of the classical bootstrap, and the advantages of our proposed methods. We also show strong correspondences between our theoretical predictions and the adaptivity of our confidence interval lengths in simulations. We conclude by demonstrating the use of our confidence intervals in the setting of simulator-based likelihood-free inference. In this setting, contrasting popular approximate Bayesian computation methods, we develop uncertainty quantification methods with rigorous frequentist coverage guarantees.
Submitted 3 April, 2022; v1 submitted 17 September, 2019;
originally announced September 2019.
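To make "easily computable" concrete, the following sketch approximates a Sliced Wasserstein distance between two equal-size samples by averaging one-dimensional Wasserstein distances over random projection directions. It uses the common Monte Carlo definition, which may differ in detail from the variant analyzed in the paper, and it does not construct the confidence intervals the abstract describes. All function and parameter names are illustrative assumptions.

```python
import math
import random


def wasserstein_1d_pow(us, vs, p=2):
    """p-th power of the 1D Wasserstein-p distance for equal-size samples.

    In one dimension the optimal coupling matches order statistics.
    """
    us, vs = sorted(us), sorted(vs)
    return sum(abs(u - v) ** p for u, v in zip(us, vs)) / len(us)


def random_direction(dim, rng):
    """Uniform random unit vector, obtained by normalizing a Gaussian."""
    g = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm = math.sqrt(sum(c * c for c in g))
    return [c / norm for c in g]


def sliced_wasserstein(xs, ys, n_proj=200, p=2, seed=0):
    """Monte Carlo estimate of the Sliced Wasserstein-p distance."""
    rng = random.Random(seed)
    dim = len(xs[0])
    total = 0.0
    for _ in range(n_proj):
        theta = random_direction(dim, rng)
        px = [sum(t * c for t, c in zip(theta, x)) for x in xs]
        py = [sum(t * c for t, c in zip(theta, y)) for y in ys]
        total += wasserstein_1d_pow(px, py, p)
    return (total / n_proj) ** (1.0 / p)


if __name__ == "__main__":
    rng = random.Random(2)
    a = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(100)]
    b = [(rng.gauss(1, 1), rng.gauss(0, 1)) for _ in range(100)]
    print(sliced_wasserstein(a, b))
```

Each projection costs only a sort, which is the source of the computational advantage over the full multivariate Wasserstein distance.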
-
Path Length Bounds for Gradient Descent and Flow
Authors:
Chirag Gupta,
Sivaraman Balakrishnan,
Aaditya Ramdas
Abstract:
We derive bounds on the path length $ζ$ of gradient descent (GD) and gradient flow (GF) curves for various classes of smooth convex and nonconvex functions. Among other results, we prove that: (a) if the iterates are linearly convergent with factor $(1-c)$, then $ζ$ is at most $\mathcal{O}(1/c)$; (b) under the Polyak-Kurdyka-Lojasiewicz (PKL) condition, $ζ$ is at most $\mathcal{O}(\sqrt{κ})$, where $κ$ is the condition number, and at least $\widetilde{Ω}(\sqrt{d} \wedge κ^{1/4})$; (c) for quadratics, $ζ$ is $Θ(\min\{\sqrt{d},\sqrt{\log κ}\})$ and in some cases can be independent of $κ$; (d) assuming just convexity, $ζ$ can be at most $2^{4d\log d}$; (e) for separable quasiconvex functions, $ζ$ is $Θ(\sqrt{d})$. Thus, we advance current understanding of the properties of GD and GF curves beyond rates of convergence. We expect our techniques to facilitate future studies for other algorithms.
Submitted 19 March, 2021; v1 submitted 2 August, 2019;
originally announced August 2019.
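As a small illustration of the quantity studied in the abstract above, the sketch below measures the length of a gradient descent path numerically. It assumes that the path length is the sum of Euclidean distances between consecutive iterates; the abstract does not state the normalization used for $ζ$, so the numbers are not directly comparable with the stated bounds. The helper name `gd_path_length` and the quadratic objective are illustrative assumptions.

```python
import math


def gd_path_length(grad, x0, step, n_iters):
    """Run gradient descent and return (path length, final iterate).

    The path length is the sum of Euclidean distances between
    consecutive iterates.
    """
    x = list(x0)
    length = 0.0
    for _ in range(n_iters):
        g = grad(x)
        new_x = [xi - step * gi for xi, gi in zip(x, g)]
        length += math.dist(x, new_x)
        x = new_x
    return length, x


if __name__ == "__main__":
    # Ill-conditioned quadratic f(x) = 0.5 * (x1^2 + 10 * x2^2).
    grad = lambda x: [x[0], 10.0 * x[1]]
    length, x_final = gd_path_length(grad, [1.0, 1.0], step=0.1, n_iters=500)
    straight_line = math.dist([1.0, 1.0], [0.0, 0.0])
    print(length, straight_line)
```

Comparing the measured length with the straight-line distance to the minimizer shows how much the iterates wander, which is the kind of behavior a path-length bound controls beyond the convergence rate.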
-
Robust Nonparametric Regression under Huber's $ε$-contamination Model
Authors:
Simon S. Du,
Yining Wang,
Sivaraman Balakrishnan,
Pradeep Ravikumar,
Aarti Singh
Abstract:
We consider the non-parametric regression problem under Huber's $ε$-contamination model, in which an $ε$ fraction of observations are subject to arbitrary adversarial noise. We first show that a simple local binning median step can effectively remove the adversarial noise and that this median estimator is minimax optimal up to absolute constants over the Hölder function class with smoothness parameters smaller than or equal to 1. Furthermore, when the underlying function has higher smoothness, we show that, after using the local binning median as a preprocessing step to remove the adversarial noise, we can apply any non-parametric estimator on top of the medians. In particular, we show that local median binning followed by kernel smoothing or local polynomial regression achieves minimaxity over Hölder and Sobolev classes with arbitrary smoothness parameters. Our main proof technique is a decoupled analysis of adversarial noise and stochastic noise, which can potentially be applied to other robust estimation problems. We also provide numerical results to verify the effectiveness of our proposed methods.
Submitted 25 May, 2018;
originally announced May 2018.
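The local binning median step mentioned in the abstract above can be sketched as follows. This is a minimal illustration, assuming design points in [0, 1] and equal-width bins; the paper's actual bin-width choice and the subsequent smoothing step are not reproduced here, and the name `binned_medians` is hypothetical.

```python
import statistics


def binned_medians(xs, ys, n_bins):
    """Return (bin center, median response) for each non-empty bin.

    Assumes every x lies in [0, 1]. The median within a bin is unaffected
    by a minority of arbitrarily corrupted responses in that bin.
    """
    bins = [[] for _ in range(n_bins)]
    for x, y in zip(xs, ys):
        idx = min(int(x * n_bins), n_bins - 1)  # x == 1.0 goes to the last bin
        bins[idx].append(y)
    return [
        ((i + 0.5) / n_bins, statistics.median(b))
        for i, b in enumerate(bins)
        if b
    ]


if __name__ == "__main__":
    xs = [0.05, 0.10, 0.15, 0.60, 0.70, 0.80]
    ys = [1.0, 1.0, 1000.0, 2.0, 2.0, -500.0]  # two corrupted responses
    print(binned_medians(xs, ys, n_bins=2))
```

The resulting (center, median) pairs can then be passed to any standard smoother, which corresponds to the two-stage scheme the abstract describes for smoother function classes.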
-
Chabauty-Coleman experiments for genus 3 hyperelliptic curves
Authors:
Jennifer S. Balakrishnan,
Francesca Bianchi,
Victoria Cantoral-Farfán,
Mirela Çiperiani,
Anastassia Etropolski
Abstract:
We describe a computation of rational points on genus 3 hyperelliptic curves $C$ defined over $\mathbb{Q}$ whose Jacobians have Mordell-Weil rank 1. Using the method of Chabauty and Coleman, we present and implement an algorithm in Sage to compute the zero locus of two Coleman integrals and analyze the finite set of points cut out by the vanishing of these integrals. We run the algorithm on approximately 17,000 curves from a forthcoming database of genus 3 hyperelliptic curves and discuss some interesting examples where the zero set includes global points not found in $C(\mathbb{Q})$.
Submitted 8 May, 2018;
originally announced May 2018.
-
Optimization of Smooth Functions with Noisy Observations: Local Minimax Rates
Authors:
Yining Wang,
Sivaraman Balakrishnan,
Aarti Singh
Abstract:
We consider the problem of global optimization of an unknown non-convex smooth function with zeroth-order feedback. In this setup, an algorithm is allowed to adaptively query the underlying function at different locations and receives noisy evaluations of function values at the queried points (i.e. the algorithm has access to zeroth-order information). Optimization performance is evaluated by the expected difference of function values at the estimated optimum and the true optimum. In contrast to the classical optimization setup, first-order information such as gradients is not directly accessible to the optimization algorithm. We show that the classical minimax framework of analysis, which roughly characterizes the worst-case query complexity of an optimization algorithm in this setting, leads to excessively pessimistic results. We propose a local minimax framework to study the fundamental difficulty of optimizing smooth functions with adaptive function evaluations, which provides a refined picture of the intrinsic difficulty of zeroth-order optimization. We show that for functions with fast level set growth around the global minimum, carefully designed optimization algorithms can identify a near global minimizer with many fewer queries. For the special case of strongly convex and smooth functions, our implied convergence rates match the ones developed for zeroth-order convex optimization problems. At the other end of the spectrum, for worst-case smooth functions no algorithm can converge faster than the minimax rate of estimating the entire unknown function in the $\ell_\infty$-norm. We provide an intuitive and efficient algorithm that attains the derived upper error bounds.
Submitted 22 March, 2018;
originally announced March 2018.
-
Robust Multivariate Nonparametric Tests via Projection-Averaging
Authors:
Ilmun Kim,
Sivaraman Balakrishnan,
Larry Wasserman
Abstract:
In this work, we generalize the Cramér-von Mises statistic via projection-averaging to obtain a robust test for the multivariate two-sample problem. The proposed test is consistent against all fixed alternatives, robust to heavy-tailed data and minimax rate optimal against a certain class of alternatives. Our test statistic is completely free of tuning parameters and is computationally efficient even in high dimensions. When the dimension tends to infinity, the proposed test is shown to have comparable power to the existing high-dimensional mean tests under certain location models. As a by-product of our approach, we introduce a new metric called the angular distance which can be thought of as a robust alternative to the Euclidean distance. Using the angular distance, we connect the proposed method to the reproducing kernel Hilbert space approach. In addition to the Cramér-von Mises statistic, we demonstrate that the projection-averaging technique can be used to define robust, multivariate tests in many other problems.
Submitted 21 May, 2019; v1 submitted 1 March, 2018;
originally announced March 2018.
-
Explicit Chabauty-Kim for the Split Cartan Modular Curve of Level 13
Authors:
Jennifer S. Balakrishnan,
Netan Dogra,
J. Steffen Müller,
Jan Tuitman,
Jan Vonk
Abstract:
We extend the explicit quadratic Chabauty methods developed in previous work by the first two authors to the case of non-hyperelliptic curves. This results in an algorithm to compute the rational points on a curve of genus $g \ge 2$ over the rationals whose Jacobian has Mordell-Weil rank $g$ and Picard number greater than one, and which satisfies some additional conditions. This algorithm is then applied to the modular curve $X_{s}(13)$, completing the classification of non-CM elliptic curves over $\mathbf{Q}$ with split Cartan level structure due to Bilu-Parent and Bilu-Parent-Rebolledo.
Submitted 15 November, 2017;
originally announced November 2017.