-
Will the real Hardy-Ramanujan formula please stand up?
Authors:
Stephen DeSalvo
Abstract:
The Hardy-Ramanujan formula for the number of integer partitions of $n$ is one of the most popular results in partition theory. While the unabridged final formula has been celebrated as reflecting the genius of its authors, it has become all too common to attribute either some simplified version of the formula which is not as ingenious, or an alternative more elegant version which was expanded on afterwards by other authors. We attempt to provide a clear and compelling justification for distinguishing between the various formulas and simplifications, with a summarizing list of key take-aways in the final section.
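For orientation, the simplified leading-order asymptotic that is often quoted under the Hardy-Ramanujan name is $p(n) \sim \frac{1}{4n\sqrt{3}} e^{\pi\sqrt{2n/3}}$. The sketch below is an illustration only and is not taken from the paper: it compares that one-term approximation with exact values of $p(n)$ obtained from a standard dynamic program. The function names are hypothetical.

```python
import math


def partition_counts(limit):
    """Return [p(0), ..., p(limit)] using the classic 'coin change' dynamic program."""
    p = [1] + [0] * limit
    for part in range(1, limit + 1):
        # Allow parts of size `part`; p[total] accumulates partitions using parts <= part.
        for total in range(part, limit + 1):
            p[total] += p[total - part]
    return p


def leading_order_estimate(n):
    """One-term asymptotic exp(pi*sqrt(2n/3)) / (4*n*sqrt(3)); valid for n >= 1."""
    return math.exp(math.pi * math.sqrt(2 * n / 3)) / (4 * n * math.sqrt(3))


if __name__ == "__main__":
    exact = partition_counts(1000)
    for n in (10, 100, 1000):
        # Print n, the exact p(n), and the ratio estimate / exact.
        print(n, exact[n], leading_order_estimate(n) / exact[n])
```

The ratio approaches 1 only slowly as $n$ grows, which is one reason the one-term simplification should not be confused with the full series discussed in the abstract.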
Submitted 3 July, 2021; v1 submitted 15 March, 2020;
originally announced March 2020.
-
Attacks and alignments: rooks, set partitions, and permutations
Authors:
Richard Arratia,
Stephen DeSalvo
Abstract:
We consider uniformly random set partitions of size $n$ with exactly $k$ blocks, and uniformly random permutations of size $n$ with exactly $k$ cycles, under the regime where $n-k \sim t\sqrt{n}$, $t>0$. In this regime, there is a simple approximation for the entire process of component counts; in particular, the number of components of size 3 converges in distribution to Poisson with mean $\frac{2}{3}t^2$ for set partitions and mean $\frac{4}{3}t^2$ for permutations, and with high probability all other components have size one or two. These approximations are proved, with preasymptotic error bounds, using combinatorial bijections for placements of $r$ rooks on a triangular half of an $n\times n$ chess board, together with the Chen--Stein method for processes of indicator random variables.
Submitted 3 July, 2021; v1 submitted 10 July, 2018;
originally announced July 2018.
-
Limit shapes via bijections
Authors:
Stephen DeSalvo,
Igor Pak
Abstract:
We compute the limit shape for several classes of restricted integer partitions, where the restrictions are placed on the part sizes rather than the multiplicities. Our approach utilizes certain classes of bijections which map limit shapes continuously in the plane. We start with bijections outlined previously by the second author, and extend them to include limit shapes with different scaling functions.
Submitted 18 November, 2016;
originally announced November 2016.
-
A robust quantitative local central limit theorem with applications to enumerative combinatorics and random combinatorial structures
Authors:
Stephen DeSalvo,
Georg Menz
Abstract:
A useful heuristic in the understanding of large random combinatorial structures is the Arratia-Tavare principle, which describes an approximation to the joint distribution of component-sizes using independent random variables. The principle outlines conditions under which the total variation distance between the true joint distribution and the approximation should be small, and was successfully exploited by Pittel in the cases of integer partitions and set partitions. We provide sufficient conditions for this principle to be true in a general context, valid for certain discrete probability distributions which are $\textit{perturbed log-concave}$, via a quantitative local central limit theorem. We then use it to generalize some classical asymptotic statistics in combinatorial theory, as well as assert some new ones.
Submitted 24 October, 2016;
originally announced October 2016.
-
The probability of avoiding consecutive patterns in the Mallows distribution
Authors:
Harry Crane,
Stephen DeSalvo,
Sergi Elizalde
Abstract:
We use various combinatorial and probabilistic techniques to study growth rates for the probability that a random permutation from the Mallows distribution avoids consecutive patterns. The Mallows distribution behaves like a $q$-analogue of the uniform distribution by weighting each permutation $π$ by $q^{inv(π)}$, where $inv(π)$ is the number of inversions in $π$ and $q$ is a positive, real-valued parameter. We prove that the growth rate exists for all patterns and all $q>0$, and we generalize Goulden and Jackson's cluster method to keep track of the number of inversions in permutations avoiding a given consecutive pattern. Using singularity analysis, we approximate the growth rates for length-3 patterns, monotone patterns, and non-overlapping patterns starting with 1, and we compare growth rates between different patterns. We also use Stein's method to show that, under certain assumptions on $q$, the length of $σ$, and $inv(σ)$, the number of occurrences of a given pattern $σ$ is well approximated by the normal distribution.
Submitted 5 September, 2016;
originally announced September 2016.
-
Improvements to exact Boltzmann sampling using probabilistic divide-and-conquer and the recursive method
Authors:
Stephen DeSalvo
Abstract:
We demonstrate an approach for exact sampling of certain discrete combinatorial distributions, which is a hybrid of exact Boltzmann sampling and the recursive method, using probabilistic divide-and-conquer (PDC). The approach specializes to exact Boltzmann sampling in the trivial setting, and specializes to PDC deterministic second half in the first non-trivial application. A large class of examples is given for which this method broadly applies, and several examples are worked out explicitly.
Submitted 29 August, 2016;
originally announced August 2016.
-
Poisson and independent process approximation for random combinatorial structures with a given number of components, and near-universal behavior for low rank assemblies
Authors:
Richard Arratia,
Stephen DeSalvo
Abstract:
We give a general framework for approximations to combinatorial assemblies, especially suitable to the situation where the number $k$ of components is specified, in addition to the overall size $n$. This involves a Poisson process, which, with the appropriate choice of parameter, may be viewed as an extension of saddlepoint approximation.
We illustrate the use of this by analyzing the component structure when the rank and size are specified, and the rank, $r := n-k$, is small relative to $n$. There is near-universal behavior, in the sense that apart from cases where the exponential generating function has radius of convergence zero, for $\ell=1,2,\dots$, when $r \asymp n^α$ for fixed $α\in (\frac{\ell}{\ell+1}, \frac{\ell+1}{\ell+2})$, the size $L_1$ of the largest component converges in probability to $\ell+2$. Further, when $r \sim t\, n^{\ell/(\ell+1)}$ for a positive integer $\ell$, and $t \in (0,\infty)$, $\mathbb{P}\,(L_1 \in \{\ell+1,\ell+2\}) \to 1$, with the choice governed by a Poisson limit distribution for the number of components of size $\ell+2$. This was previously observed, for the case $\ell=1$ and the special cases of permutations and set partitions, using Chen-Stein approximations for the indicators of attacks and alignments, when rooks are placed randomly on a triangular board. The case $\ell=1$ is especially delicate, and was not handled by previous saddlepoint approximations.
Submitted 4 July, 2016; v1 submitted 15 June, 2016;
originally announced June 2016.
-
An Independent Process Approximation to Sparse Random Graphs with a Prescribed Number of Edges and Triangles
Authors:
Stephen DeSalvo,
M. Puck Rombach
Abstract:
We prove a $pre$-$asymptotic$ bound on the total variation distance between the uniform distribution over two types of undirected graphs with $n$ nodes. One distribution places a prescribed number of $k_T$ triangles and $k_S$ edges not involved in a triangle independently and uniformly over all possibilities, and the other is the uniform distribution over simple graphs with exactly $k_T$ triangles and $k_S$ edges not involved in a triangle. As a corollary, for $k_S = o(n)$ and $k_T = o(n)$ as $n$ tends to infinity, the total variation distance tends to $0$, at a rate that is given explicitly. Our main tool is Chen-Stein Poisson approximation, hence our bounds are explicit for all finite values of the parameters.
Submitted 29 September, 2015;
originally announced September 2015.
-
Pattern Avoidance for Random Permutations
Authors:
Harry Crane,
Stephen DeSalvo
Abstract:
Using techniques from Poisson approximation, we prove explicit error bounds on the number of permutations that avoid any pattern. Most generally, we bound the total variation distance between the joint distribution of pattern occurrences and a corresponding joint distribution of independent Bernoulli random variables, which as a corollary yields a Poisson approximation for the distribution of the number of occurrences of any pattern. We also investigate occurrences of consecutive patterns in random Mallows permutations, of which uniform random permutations are a special case. These bounds allow us to estimate the probability that a pattern occurs any number of times and, in particular, the probability that a random permutation avoids a given pattern.
Submitted 15 November, 2018; v1 submitted 25 September, 2015;
originally announced September 2015.
-
Random Sampling of Contingency Tables via Probabilistic Divide-and-Conquer
Authors:
Stephen DeSalvo,
James Y. Zhao
Abstract:
We present a new approach for random sampling of contingency tables of any size and constraints based on a recently introduced $\textit{probabilistic divide-and-conquer}$ technique. A simple exact sampling algorithm is presented for $2\times n$ tables, as well as a generalization where each entry of the table has a specified marginal distribution.
Submitted 29 February, 2016; v1 submitted 30 June, 2015;
originally announced July 2015.
-
Exact sampling algorithms for Latin squares and Sudoku matrices via probabilistic divide-and-conquer
Authors:
Stephen DeSalvo
Abstract:
We provide several algorithms for the exact, uniform random sampling of Latin squares and Sudoku matrices via probabilistic divide-and-conquer (PDC). Our approach divides the sample space into smaller pieces, samples each separately, and combines them in a manner which yields an exact sample from the target distribution. We demonstrate, in particular, a version of PDC in which one of the pieces is sampled using a brute force approach, which we dub $\textit{almost deterministic second half}$, as it is a generalization to a previous application of PDC for which one of the pieces is uniquely determined given the others.
Submitted 8 September, 2016; v1 submitted 1 February, 2015;
originally announced February 2015.
-
Probabilistic divide-and-conquer: deterministic second half
Authors:
Stephen DeSalvo
Abstract:
We present a probabilistic divide-and-conquer (PDC) method for \emph{exact} sampling of conditional distributions of the form $\mathcal{L}( {\bf X}\, |\, {\bf X} \in E)$, where ${\bf X}$ is a random variable on $\mathcal{X}$, a complete, separable metric space, and event $E$ with $\mathbb{P}(E) \geq 0$ is assumed to have sufficient regularity such that the conditional distribution exists and is unique up to almost sure equivalence. The PDC approach is to define a decomposition of $\mathcal{X}$ via sets $\mathcal{A}$ and $\mathcal{B}$ such that $\mathcal{X} = \mathcal{A} \times \mathcal{B}$, and sample from each separately. The deterministic second half approach is to select the sets $\mathcal{A}$ and $\mathcal{B}$ such that for each element $a\in \mathcal{A}$, there is only one element $b_a \in \mathcal{B}$ for which $(a,b_a)\in E$. We show how this simple approach provides non-trivial improvements to several conventional random sampling algorithms in combinatorics, and we demonstrate its versatility with applications to sampling from sufficiently regular conditional distributions.
Submitted 14 September, 2016; v1 submitted 24 November, 2014;
originally announced November 2014.
-
Completely effective error bounds for Stirling Numbers of the first and second kind via Poisson Approximation
Authors:
Richard Arratia,
Stephen DeSalvo
Abstract:
We provide completely effective error estimates for Stirling numbers of the first and second kind, denoted by $s(n,m)$ and $S(n,m)$, respectively. These bounds are useful for values of $m \geq n - O(\sqrt{n})$. An application of our Theorem 5 yields, for example, \[ s(10^{12},\ 10^{12}-2\times 10^6)/10^{35664464} \in [ 1.87669, 1.876982 ], \] \[ S(10^{12},\ 10^{12}-2\times 10^6)/10^{35664463} \in [ 1.30121, 1.306975 ]. \] The bounds are obtained via Chen-Stein Poisson approximation, using an interpretation of Stirling numbers as the number of ways of placing non-attacking rooks on a chess board.
As a corollary to Theorem 5, summarized in Proposition 1, we obtain two simple and explicit asymptotic formulas, one for each of $s(n,m)$ and $S(n,m)$, for the parametrization $m = n - t\, n^a$, $0 \leq a \leq \frac{1}{2}.$ These asymptotic formulas agree with the ones originally observed by Moser and Wyman in the range $0<a<\frac{1}{2}$, and they connect with a recent asymptotic expansion by Louchard for $\frac{1}{2}<a < 1$, hence filling the gap at $a = \frac{1}{2}$.
We also provide a generalization applicable to rook and file numbers.
Submitted 8 September, 2016; v1 submitted 11 April, 2014;
originally announced April 2014.
-
Log-Concavity of the Partition Function
Authors:
Stephen DeSalvo,
Igor Pak
Abstract:
We prove that the partition function $p(n)$ is log-concave for all $n>25$. We then extend the results to resolve two related conjectures by Chen. The proofs are based on Lehmer's estimates on the remainders of the Hardy--Ramanujan and the Rademacher series for $p(n)$.
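The log-concavity statement, $p(n)^2 \ge p(n-1)\,p(n+1)$, can be checked numerically for small $n$. The sketch below is only a finite sanity check under stated assumptions, not the proof technique of the paper (which relies on Lehmer's remainder estimates): it computes $p(n)$ with Euler's pentagonal number recurrence and lists the $n$ in a finite range where the inequality fails. The function names are hypothetical.

```python
def partition_numbers(limit):
    """Return [p(0), ..., p(limit)] via Euler's pentagonal number recurrence."""
    p = [0] * (limit + 1)
    p[0] = 1
    for n in range(1, limit + 1):
        total = 0
        k = 1
        while True:
            g1 = k * (3 * k - 1) // 2  # generalized pentagonal number for +k
            if g1 > n:
                break
            sign = 1 if k % 2 == 1 else -1
            total += sign * p[n - g1]
            g2 = k * (3 * k + 1) // 2  # generalized pentagonal number for -k
            if g2 <= n:
                total += sign * p[n - g2]
            k += 1
        p[n] = total
    return p


def log_concavity_failures(limit):
    """Return every n in [1, limit] with p(n)^2 < p(n-1) * p(n+1)."""
    p = partition_numbers(limit + 1)
    return [n for n in range(1, limit + 1) if p[n] ** 2 < p[n - 1] * p[n + 1]]


if __name__ == "__main__":
    print(log_concavity_failures(300))
```

Within this range the failures are confined to small $n$, none exceeding 25, which is consistent with the threshold $n>25$ stated in the abstract. A finite check of this kind says nothing about larger $n$.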
Submitted 4 July, 2014; v1 submitted 29 October, 2013;
originally announced October 2013.
-
On the Random Sampling of Pairs, with Pedestrian examples
Authors:
Richard Arratia,
Stephen DeSalvo
Abstract:
Suppose one desires to randomly sample a pair of objects such as socks, hoping to get a matching pair. Even in the simplest situation for sampling, which is sampling with replacement, the innocent phrase "the distribution of the color of a matching pair" is ambiguous. One interpretation is that we condition on the event of getting a match between two random socks; this corresponds to sampling two at a time, over and over without memory, until a matching pair is found. A second interpretation is to sample sequentially, one at a time, with memory, until the same color has been seen twice.
We study the difference between these two methods. The input is a discrete probability distribution on colors, describing what happens when one sock is sampled. There are two derived distributions --- the pair-color distributions under the two methods of getting a match. The output, a number we call the discrepancy of the input distribution, is the total variation distance between the two derived distributions.
It is easy to determine when the two pair-color distributions come out equal, that is, to determine which distributions have discrepancy zero, but hard to determine the largest possible discrepancy. We find the exact extreme for the case of two colors, by analyzing the roots of a fifth degree polynomial in one variable. We find the exact extreme for the case of three colors, by analyzing the 49 roots of a variety spanned by two seventh-degree polynomials in two variables. We give a plausible conjecture for the general situation of a finite number of colors, and give an exact computation of a constant which is a plausible candidate for the supremum of the discrepancy over all discrete probability distributions.
We briefly consider the more difficult case where the objects to be matched into pairs are of two different kinds, such as male-female or left-right.
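The two interpretations can be made concrete for a small number of colors. The following is a minimal sketch, assuming sampling with replacement from a finite color distribution: the first function conditions two independent draws on matching, the second draws one sock at a time until some color has appeared twice, and the discrepancy is the total variation distance between the two resulting pair-color distributions. The function names and the subset recursion are illustrative choices, not taken from the paper.

```python
from fractions import Fraction
from functools import lru_cache


def conditioned_pair_colors(probs):
    """Pair-color distribution when two independent draws are conditioned to match."""
    total = sum(p * p for p in probs)
    return [p * p / total for p in probs]


def sequential_pair_colors(probs):
    """Pair-color distribution when drawing one at a time until a color repeats."""
    k = len(probs)

    @lru_cache(maxsize=None)
    def finish(seen):
        # `seen` is a bitmask of the colors drawn exactly once so far.
        result = [Fraction(0)] * k
        for j, pj in enumerate(probs):
            if (seen >> j) & 1:
                result[j] += pj  # color j repeats, so the pair has color j
            else:
                sub = finish(seen | (1 << j))
                for i in range(k):
                    result[i] += pj * sub[i]
        return tuple(result)

    return list(finish(0))


def discrepancy(probs):
    """Total variation distance between the two pair-color distributions."""
    a = conditioned_pair_colors(probs)
    b = sequential_pair_colors(probs)
    return sum(abs(x - y) for x, y in zip(a, b)) / 2


if __name__ == "__main__":
    print(discrepancy([Fraction(1, 3), Fraction(2, 3)]))
```

Passing `Fraction` inputs keeps the arithmetic exact. The recursion visits every subset of colors, so it is only practical for a handful of colors.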
Submitted 1 June, 2013; v1 submitted 27 November, 2012;
originally announced November 2012.
-
Probabilistic divide-and-conquer: a new exact simulation method, with integer partitions as an example
Authors:
Richard Arratia,
Stephen DeSalvo
Abstract:
We propose a new method, probabilistic divide-and-conquer, for improving the success probability in rejection sampling. For the example of integer partitions, there is an ideal recursive scheme which improves the rejection cost from asymptotically order $n^{3/4}$ to a constant. We show other examples for which a non--recursive, one--time application of probabilistic divide-and-conquer removes a substantial fraction of the rejection sampling cost.
We also present a variation of probabilistic divide-and-conquer for generating i.i.d. samples that exploits features of the coupon collector's problem, in order to obtain a cost that is sublinear in the number of samples.
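The abstract does not spell out the baseline sampler whose rejection cost is being improved. A common choice, assumed here, is the independent-geometric construction usually attributed to Fristedt: draw multiplicities $Z_i$ of parts of size $i$ independently with $\mathbb{P}(Z_i = k) = (1-x^i)\,x^{ik}$, and accept only when $\sum_i i Z_i = n$. Conditional on acceptance the partition is uniform for any $x \in (0,1)$; the tilt $x = e^{-\pi/\sqrt{6n}}$ is the usual choice. The sketch below implements only this plain rejection baseline, not probabilistic divide-and-conquer itself, and its names are hypothetical.

```python
import math
import random


def sample_partition(n, rng):
    """Uniform random partition of n by plain rejection (assumed baseline sampler).

    Returns (counts, attempts), where counts maps part size -> multiplicity.
    """
    log_x = -math.pi / math.sqrt(6 * n)  # log of the tilting parameter x
    attempts = 0
    while True:
        attempts += 1
        counts = {}
        total = 0
        for i in range(1, n + 1):
            u = 1.0 - rng.random()  # uniform on (0, 1], avoids log(0)
            z = int(math.log(u) / (i * log_x))  # geometric: P(Z >= k) = x**(i*k)
            if z:
                counts[i] = z
                total += i * z
            if total > n:
                break  # overshoot, this attempt is already rejected
        if total == n:
            return counts, attempts


if __name__ == "__main__":
    counts, attempts = sample_partition(50, random.Random(0))
    print(counts, attempts)
```

The number of attempts grows with $n$, which is the rejection cost the abstract refers to.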
Submitted 23 November, 2015; v1 submitted 17 October, 2011;
originally announced October 2011.
-
On the singularity of random Bernoulli matrices - novel integer partitions and lower bound expansions
Authors:
Richard Arratia,
Stephen DeSalvo
Abstract:
We prove a lower bound expansion on the probability that a random $\pm 1$ matrix is singular, and conjecture that such expansions govern the actual probability of singularity. These expansions are based on naming the most likely, second most likely, and so on, ways that a Bernoulli matrix can be singular; the most likely way is to have a null vector of the form $e_i \pm e_j$, which corresponds to the integer partition 11, with two parts of size 1. The second most likely way is to have a null vector of the form $e_i \pm e_j \pm e_k \pm e_\ell$, which corresponds to the partition 1111. The fifth most likely way corresponds to the partition 21111.
We define and characterize the "novel partitions" which show up in this series. As a family, novel partitions suffice to detect singularity, i.e., any singular Bernoulli matrix has a left null vector whose underlying integer partition is novel. And, with respect to this property, the family of novel partitions is minimal.
We prove that the only novel partitions with six or fewer parts are 11, 1111, 21111, 111111, 221111, 311111, and 322111. We prove that there are fourteen novel partitions having seven parts.
We formulate a conjecture about which partitions are "first place and runners up," in relation to the Erdős-Littlewood-Offord bound.
We prove some bounds on the interaction between left and right null vectors.
Submitted 22 May, 2012; v1 submitted 13 May, 2011;
originally announced May 2011.