-
Markov chains arising from biased random derangements
Authors:
Poly H. da Silva,
Arash Jamshidpey,
Simon Tavaré
Abstract:
We explore the cycle types of a class of biased random derangements, described as a random game played by some children labeled $1,\ldots,n$. Children join the game one by one, in a random order, and randomly form some circles of size at least $2$, so that no child is left alone. The game gives rise to the cyclic decomposition of a random derangement, inducing an exchangeable random partition. The rate at which the circles are closed varies in time and, at each time $t$, depends on the number of individuals who have not played until $t$. A $\{0,1\}$-valued Markov chain $X^n$ records the cycle type of the corresponding random derangement, in that any $1$ represents a hand-grasping event that closes a circle. Using this, we study the cycle counts and sizes of the random derangements and their asymptotic behavior. We approximate the total variation distance between the reversed chain of $X^n$ and its weak limit $X^\infty$, as $n\to\infty$. We establish conditional (and push-forward) relations between $X^n$ and a generalization of the Feller coupling, given that no $11$-pattern ($1$-cycle) appears in the latter. We extend these relations to $X^\infty$ and apply them to investigate some asymptotic behaviors of $X^n$.
Submitted 24 November, 2022;
originally announced November 2022.
-
Another view of sequential sampling in the birth process with immigration
Authors:
Poly H. da Silva,
Arash Jamshidpey,
Simon Tavaré
Abstract:
Models of counts-of-counts data have been extensively used in the biological sciences, for example in cancer, population genetics, sampling theory and ecology. In this paper we explore properties of one model that is embedded into a continuous-time process and can describe the appearance of certain biological data such as covid DNA sequences in a database. More specifically, we consider an evolving model of counts-of-counts data that arises as the family size counts of samples taken sequentially from a Birth process with Immigration (BI). Here, each family represents a type or species, and the family size counts represent the type or species frequency spectrum in the population. We study the correlation of $S(a,b)$ and $S(c,d)$, the number of families observed in two disjoint time intervals $(a,b)$ and $(c,d)$. We find the expected sample variance and its asymptotics for $p$ consecutive sequential samples $\mathbf{S}_p:=(S(t_0,t_1),\dots, S(t_{p-1},t_p))$, for any given $0=t_0<t_1<\dots<t_p$. By conditioning on the sizes of the samples, we provide a connection between $\mathbf{S}_p$ and $p$ sequential samples of sizes $n_1,n_2,\dots,n_p$, drawn from a single run of a Chinese Restaurant Process. The properties of the latter were studied in da Silva et al. (2022). We show how the continuous-time framework helps to make asymptotic calculations easier than its discrete-time counterpart. As an application, for a specific choice of $t_1,t_2,\dots, t_p$, we revisit Fisher's 1943 multi-sampling problem and give another explanation of what Fisher's model could have meant in the world of sequential samples drawn from a BI process.
Submitted 29 November, 2022; v1 submitted 13 October, 2022;
originally announced October 2022.
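The discrete-time counterpart mentioned in the abstract above, sequential samples from a single run of a Chinese Restaurant Process, can be illustrated with a small simulation. This is only a sketch under stated assumptions: it uses the standard one-parameter CRP with parameter `theta`, and the helper names `crp_assignments` and `families_per_sample` are hypothetical, not taken from the paper.

```python
import random

def crp_assignments(n, theta, rng):
    """Seat n customers by the standard one-parameter CRP (assumed model).

    Customer i+1 opens a new table with probability theta / (theta + i),
    otherwise joins an existing table chosen proportionally to its size.
    Returns the table (family) index of each customer, in arrival order.
    """
    counts = []  # current size of each table
    seats = []   # table index of each customer
    for i in range(n):  # i customers are already seated
        if rng.random() < theta / (theta + i):
            counts.append(1)
            seats.append(len(counts) - 1)
        else:
            table = rng.choices(range(len(counts)), weights=counts)[0]
            counts[table] += 1
            seats.append(table)
    return seats

def families_per_sample(seats, sizes):
    """Split one CRP run into consecutive samples of the given sizes and
    count the distinct families seen in each sample."""
    result, start = [], 0
    for size in sizes:
        block = seats[start:start + size]
        result.append(len(set(block)))
        start += size
    return result

rng = random.Random(2022)
sizes = [5, 10, 20]
seats = crp_assignments(sum(sizes), theta=1.5, rng=rng)
counts_per_sample = families_per_sample(seats, sizes)
```

Here `counts_per_sample` plays the role of a vector of family counts for sequential samples of sizes $n_1, n_2, \dots, n_p$; the continuous-time BI model itself is not simulated.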
-
Escape from parsimony of a double-cut-and-join genome evolution process
Authors:
Mona Meghdari Miardan,
Arash Jamshidpey,
David Sankoff
Abstract:
We analyze models of genome evolution based on both restricted and unrestricted double-cut-and-join (DCJ) operations. We compare the number of operations along the evolutionary trajectory to the DCJ distance of the genome from its ancestor at each step, and determine at what point they diverge, that is, when the process escapes from parsimony. Adapting the method developed by Berestycki and Durrett (2006), we estimate the number of cycles in the breakpoint graph of a random genome at time $t$ and its ancestral genome by the number of tree components of an Erdős-Rényi random graph constructed from the model of evolution. In both models, the process on a genome of size $n$ is bound to its parsimonious estimate up to $t\approx n/2$ steps.
Submitted 27 September, 2021;
originally announced September 2021.
-
Random derangements and the Ewens Sampling Formula
Authors:
Poly H. da Silva,
Arash Jamshidpey,
Simon Tavaré
Abstract:
We study derangements of $\{1,2,\ldots,n\}$ under the Ewens distribution with parameter $θ$. We give the moments and marginal distributions of the cycle counts, the number of cycles, and asymptotic distributions for large $n$. We develop a $\{0,1\}$-valued non-homogeneous Markov chain with the property that the counts of lengths of spacings between the 1s have the derangement distribution. This chain, an analog of the so-called Feller Coupling, provides a simple way to simulate derangements in time independent of $θ$ for a given $n$ and linear in the size of the derangement.
Submitted 8 June, 2020;
originally announced June 2020.
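For orientation, the classical Feller coupling that the abstract above refers to can be sketched as follows. This is the standard construction for Ewens-distributed permutations (fixed points allowed), not the authors' derangement chain: independent Bernoulli variables $ξ_i$ with $P(ξ_i=1)=θ/(θ+i-1)$ are drawn for $i=1,\ldots,n$, a $1$ is appended at position $n+1$, and the spacings between consecutive 1s are read off as cycle lengths. The function name is hypothetical.

```python
import random

def feller_coupling_cycle_lengths(n, theta=1.0, rng=None):
    """Cycle lengths of an Ewens(theta) random permutation of size n >= 1,
    generated by the classical Feller coupling (1-cycles may occur)."""
    rng = rng or random.Random()
    # xi_i = 1 with probability theta / (theta + i - 1); xi_1 = 1 always.
    ones = [i for i in range(1, n + 1)
            if rng.random() < theta / (theta + i - 1)]
    ones.append(n + 1)  # closing 1 at position n + 1
    # Spacings between consecutive 1s are the cycle lengths.
    return [b - a for a, b in zip(ones, ones[1:])]

lengths = feller_coupling_cycle_lengths(10, theta=1.0, rng=random.Random(0))
```

The run time is linear in $n$ and needs one uniform variate per position. A spacing of length 1 corresponds to an adjacent pair of 1s, i.e. a fixed point, which a derangement must exclude.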
-
Algebraic Construction of Quasi-split Algebraic Tori
Authors:
Armin Jamshidpey,
Nicole Lemire,
Eric Schost
Abstract:
The main purpose of this work is to give a constructive proof for a particular case of the no-name lemma. Let $G$ be a finite group, $K$ be a field, $L$ be a permutation $G$-lattice and $K[L]$ be the group algebra of $L$ over $K$. The no-name lemma asserts that the invariant field of the quotient field of $K[L]$, $K(L)^G$, is a purely transcendental extension of $K^G$. In other words, there exist $y_1, \ldots , y_n$, algebraically independent over $K^G$, such that $K(L)^G \cong K^G(y_1, \ldots , y_n)$. We define elements $\lbrace y_1, \ldots, y_n \rbrace \subset K[L]^G$ with the desired properties, in the case when $G = \mathrm{Gal}(K/F)$ is the Galois group of a finite extension $K/F$, and $L$ is a sign permutation $G$-lattice.
Submitted 29 January, 2018;
originally announced January 2018.
-
Partial geodesics on symmetric groups endowed with breakpoint distance
Authors:
Poly H. da Silva,
Arash Jamshidpey,
David Sankoff
Abstract:
The notion of partial geodesic was introduced by Jamshidpey et al. in "Sets of medians in the non-geodesic pseudometric space of unsigned genomes with breakpoints", 2014. In this paper, we study the density of points on non-trivial partial geodesics between two permutations $ξ_1^{(n)}$ and $ξ_2^{(n)}$ chosen uniformly and independently at random from the symmetric group $S_n$, where $S_n$ is endowed with the breakpoint distance. For a permutation $π:= π_1 \ ... \ π_n$, any unordered pair $\{π_i , π_{i+1}\}$, for $i=1, ..., n-1$, is called an adjacency of $π$. The set of all adjacencies of $π$ is denoted by $\mathcal{A}_π$. Denote by $id^{(n)}$ the identity permutation, and let $I_n$ be an arbitrary subset of $\mathcal A_{id^{(n)}}$. We classify the set of all adjacencies of a permutation $π\in S_n$ into four types, with respect to $I_n$. Then, for a permutation $ξ^{(n)}$ chosen uniformly at random from $S_n$, we derive a convergence theorem for the normalized number (after dividing by $n$) of adjacencies of each type in $ξ^{(n)}$ with respect to $I_n$ (for some random or deterministic choices of $I_n$), as $n\rightarrow \infty$. We also apply this convergence theorem to find appropriate choices of $I_n$. A geodesic point of $u$ and $v$ in a pseudometric space $(S,ρ)$ is a point $w$ of the space such that $ρ(u,w)+ρ(w,v)=ρ(u,v)$. We find an upper bound for the number of permutations $x\in S_n$ for which there exists at least one non-trivial geodesic point between $id^{(n)}$ and $x$, far from both. This partially verifies the conjecture of Haghighi and Sankoff stated in "Medians seek the corners, and other conjectures", 2012; namely, we prove that, with high probability, there is no breakpoint median of two permutations $ξ_1^{(n)}$ and $ξ_2^{(n)}$ chosen uniformly and independently at random from $S_n$, far from both of them.
Submitted 15 January, 2018;
originally announced January 2018.
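The adjacency sets and the geodesic-point condition defined in the abstract above can be made concrete with a short sketch. One assumption is labeled here: the breakpoint distance is taken to be $n-1$ minus the number of common adjacencies, a common convention for unsigned linear permutations that the abstract itself does not spell out. All function names are hypothetical.

```python
def adjacencies(perm):
    """Set of unordered pairs {perm[i], perm[i+1]} of a permutation."""
    return {frozenset(pair) for pair in zip(perm, perm[1:])}

def breakpoint_distance(p, q):
    """Assumed convention: (n - 1) minus the number of shared adjacencies."""
    return len(p) - 1 - len(adjacencies(p) & adjacencies(q))

def is_geodesic_point(u, w, v):
    """True if w satisfies rho(u, w) + rho(w, v) == rho(u, v)."""
    return (breakpoint_distance(u, w) + breakpoint_distance(w, v)
            == breakpoint_distance(u, v))

u = (1, 2, 3, 4)
v = (2, 4, 1, 3)   # shares no adjacency with u
w = (1, 2, 4, 3)   # shares {1,2}, {3,4} with u and {2,4} with v
d_uv = breakpoint_distance(u, v)
on_geodesic = is_geodesic_point(u, w, v)
```

Under this convention a permutation and its reversal are at distance $0$, which is why the space is only a pseudometric space.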
-
Median inverse problem and approximating the number of $k$-median inverses of a permutation
Authors:
Poly H. da Silva,
Arash Jamshidpey,
David Sankoff
Abstract:
We introduce the "Median Inverse Problem" for metric spaces. In particular, given a permutation $π$ in the symmetric group $S_n$ (endowed with the breakpoint distance), we study the set of all $k$-subsets $\{x_1,...,x_k\}\subset S_n$ for which $π$ is a breakpoint median. The set of all $k$-tuples $(x_1,...,x_k)$ with this property is called the $k$-median inverse of $π$. By bounding the cardinality of this set from above, we provide an asymptotic upper bound for the probability that $π$ is a breakpoint median of $k$ permutations $ξ_1^{(n)},...,ξ_k^{(n)}$ chosen uniformly and independently at random from $S_n$.
Submitted 7 December, 2017;
originally announced December 2017.
-
An Ergodic Theorem for Fleming-Viot Models in Random Environments
Authors:
Arash Jamshidpey
Abstract:
The Fleming-Viot (FV) process is a measure-valued diffusion that models the evolution of type frequencies in a countable population which evolves under resampling (genetic drift), mutation, and selection. In the classic FV model, the fitness (strength) of types is given by a measurable function. In this paper, we introduce and study the Fleming-Viot process in random environment (FVRE), where by random environment we mean that the fitness of types is a stochastic process with càdlàg paths. We identify FVRE as the unique solution to a so-called quenched martingale problem and derive some of its properties via martingale and duality methods. We develop the duality methods for general time-inhomogeneous and quenched martingale problems. In fact, some important aspects of the duality relations only appear for time-inhomogeneous (and quenched) martingale problems. For example, we see that duals evolve backward in time with respect to the main Markov process, whose evolution is forward in time. Using a family of function-valued dual processes for FVRE, we prove that, as the number of individuals $N$ tends to $\infty$, the measure-valued Moran process $μ_N^{e_N}$ (with fitness process $e_N$) converges weakly in the Skorokhod topology of càdlàg functions to the FVRE process $μ^e$ (with fitness process $e$), if $e_N \rightarrow e$ a.s. in the Skorokhod topology of càdlàg functions. We also study the long-time behaviour of the FVRE process $(μ_t^e)_{t\geq 0}$ jointly with its fitness process $e=(e_t)_{t\geq 0}$ and prove that the joint FV-environment process $(μ_t^e,e_t)_{t\geq 0}$ is ergodic under the assumption of weak ergodicity of $e$.
Submitted 11 January, 2017;
originally announced January 2017.