-
Prüfer codes on vertex-colored rooted trees
Authors:
R. W. R. Darling,
Grant Fickes
Abstract:
Prüfer codes provide an encoding scheme for representing a vertex-labeled tree on $n$ vertices with a string of length $n-2$. Indeed, two labeled trees are isomorphic if and only if their Prüfer codes are identical, and this supplies a proof of Cayley's Theorem. Motivated by a graph decomposition of freight networks into a corpus of vertex-colored rooted trees, we extend the notion of Prüfer codes to that setting, i.e., trees without a unique labeling, by defining a canonical label for a vertex-colored rooted tree and incorporating vertex colors into our variation of the Prüfer code. Given a pair of trees, we prove properties of the vertex-colored Prüfer code (abbreviated VCPC) equivalent to (1) isomorphism between a pair of vertex-colored rooted trees, (2) the subtree relationship between vertex-colored rooted trees, and (3) when one vertex-colored rooted tree is isomorphic to a minor of another vertex-colored rooted tree.
Submitted 18 June, 2025;
originally announced June 2025.
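For readers unfamiliar with the classical construction that the abstract above extends, here is a minimal Python sketch of the ordinary Prüfer code for a vertex-labeled tree. It is not the vertex-colored variant (VCPC) from the paper. The function names and the convention that vertices are labeled $0, \ldots, n-1$ are assumptions made for illustration.

```python
import heapq
from collections import defaultdict


def prufer_encode(n, edges):
    """Classical Prüfer code of a labeled tree on vertices 0..n-1 (n >= 2).

    Illustrative sketch only: labels 0..n-1 are an assumed convention.
    """
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    # Min-heap of current leaves, so the smallest-labeled leaf is removed first.
    leaves = [v for v in range(n) if len(adj[v]) == 1]
    heapq.heapify(leaves)
    code = []
    for _ in range(n - 2):
        leaf = heapq.heappop(leaves)
        parent = next(iter(adj[leaf]))  # a leaf has exactly one neighbor
        code.append(parent)
        adj[parent].discard(leaf)
        if len(adj[parent]) == 1:       # the parent has just become a leaf
            heapq.heappush(leaves, parent)
    return code


def prufer_decode(code):
    """Rebuild the edge list of the tree on 0..len(code)+1 from its code."""
    n = len(code) + 2
    degree = [1] * n
    for v in code:
        degree[v] += 1
    leaves = [v for v in range(n) if degree[v] == 1]
    heapq.heapify(leaves)
    edges = []
    for v in code:
        leaf = heapq.heappop(leaves)
        edges.append((leaf, v))
        degree[v] -= 1
        if degree[v] == 1:
            heapq.heappush(leaves, v)
    # Exactly two leaves remain; they form the last edge.
    edges.append((heapq.heappop(leaves), heapq.heappop(leaves)))
    return edges


example_edges = [(0, 3), (1, 3), (2, 3), (3, 4), (4, 5)]
example_code = prufer_encode(6, example_edges)
```

The code has length $n-2$, and decoding recovers the same edge set, which is the bijection behind the proof of Cayley's Theorem mentioned in the abstract.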
-
Rank-based linkage I: triplet comparisons and oriented simplicial complexes
Authors:
R. W. R. Darling,
Will Grilliette,
Adam Logan
Abstract:
Rank-based linkage is a new tool for summarizing a collection $S$ of objects according to their relationships. These objects are not mapped to vectors, and ``similarity'' between objects need be neither numerical nor symmetrical. All an object needs to do is rank nearby objects by similarity to itself, using a Comparator which is transitive, but need not be consistent with any metric on the whole set. Call this a ranking system on $S$. Rank-based linkage is applied to the $K$-nearest neighbor digraph derived from a ranking system. Computations occur on a 2-dimensional abstract oriented simplicial complex whose faces are among the points, edges, and triangles of the line graph of the undirected $K$-nearest neighbor graph on $S$. In $|S| K^2$ steps it builds an edge-weighted linkage graph $(S, \mathcal{L}, σ)$ where $σ(\{x, y\})$ is called the in-sway between objects $x$ and $y$. Take $\mathcal{L}_t$ to be the links whose in-sway is at least $t$, and partition $S$ into components of the graph $(S, \mathcal{L}_t)$, for varying $t$. Rank-based linkage is a functor from a category of out-ordered digraphs to a category of partitioned sets, with the practical consequence that augmenting the set of objects in a rank-respectful way gives a fresh clustering which does not ``rip apart'' the previous one. The same holds for single linkage clustering in the metric space context, but not for typical optimization-based methods. Open combinatorial problems are presented in the last section.
Submitted 20 April, 2023; v1 submitted 4 February, 2023;
originally announced February 2023.
-
Proceedings of TDA: Applications of Topological Data Analysis to Data Science, Artificial Intelligence, and Machine Learning Workshop at SDM 2022
Authors:
R. W. R. Darling,
John A. Emanuello,
Emilie Purvine,
Ahmad Ridley
Abstract:
Topological Data Analysis (TDA) is a rigorous framework that borrows techniques from geometric and algebraic topology, category theory, and combinatorics in order to study the "shape" of complex high-dimensional data. Research in this area has grown significantly over the last several years, bringing a deeply rooted theory to bear on practical applications in areas such as genomics, natural language processing, medicine, cybersecurity, energy, and climate change. Within some of these areas, TDA has also been used to augment AI and ML techniques.
We believe there is further utility to be gained in this space that can be facilitated by a workshop bringing together experts (both theorists and practitioners) and non-experts. Currently there is an active community of pure mathematicians with research interests in developing and exploring the theoretical and computational aspects of TDA. Applied mathematicians and other practitioners are also present in the community but do not represent a majority. This speaks to the primary aim of this workshop, which is to grow a wider community of interest in TDA. By fostering meaningful exchanges between these groups, from across government, academia, and industry, we hope to create new synergies that can only come through building a mutual comprehensive awareness of the problem and solution spaces.
Submitted 14 April, 2022; v1 submitted 3 April, 2022;
originally announced April 2022.
-
Partitioned K-nearest neighbor local depth for scalable comparison-based learning
Authors:
Jacob D. Baron,
R. W. R. Darling,
J. Laylon Davis,
R. Pettit
Abstract:
A triplet comparison oracle on a set $S$ takes an object $x \in S$ and for any pair $\{y, z\} \subset S \setminus \{x\}$ declares which of $y$ and $z$ is more similar to $x$. Partitioned Local Depth (PaLD) supplies a principled non-parametric partitioning of $S$ under such triplet comparisons but needs $O(n^2 \log{n})$ oracle calls and $O(n^3)$ post-processing steps.
We introduce Partitioned Nearest Neighbors Local Depth (PaNNLD), a computationally tractable variant of PaLD leveraging the $K$-nearest neighbors digraph on $S$. PaNNLD needs only $O(n K \log{n})$ oracle calls, by replacing an oracle call by a coin flip when neither $y$ nor $z$ is adjacent to $x$ in the undirected version of the $K$-nearest neighbors digraph. By averaging over randomizations, PaNNLD subsequently requires (at best) only $O(n K^2)$ post-processing steps. Concentration of measure shows that the probability of randomization-induced error $δ$ in PaNNLD is no more than $2 e^{-δ^2 K^2}$.
Submitted 2 December, 2021; v1 submitted 19 August, 2021;
originally announced August 2021.
-
Hidden Ancestor Graphs: Models for Detagging Property Graphs
Authors:
R. W. R. Darling,
Gregory S. Clark,
J. D. Tucker
Abstract:
Consider a graph $G$ where each vertex is visibly labelled as a member of a distinct class, but also has a hidden binary state: wild or tame. Edges with end points in the same class are called agreement edges. Premise: an edge connecting vertices in different classes -- a conflict edge -- is allowed only when at least one end point is wild. Interpret wild status as readiness to form connections with any other vertex, regardless of class -- a form of class disaffiliation. The learning goal is to classify each vertex as wild or tame using its neighborhood data. In applications such as communications metadata, bio-informatics, retailing, or bibliography, adjacency in $G$ is typically created by paths of length two in a transactional bipartite graph $B$. Class labelling, imported from a reference data source, is typically assortative, so agreement edges predominate. Conflict edges represent observed behavior (from $B$) inconsistent with prior labelling of $V(G)$. Wild vertices are those whose label is uninformative. The hidden ancestor graph constitutes a natural model for generating agreement edges and conflict edges, depending on a latent tree structure. The model is able to manifest high clustering rates and heavy-tailed degree distributions typical of social and spatial networks. It can be fitted to graph data using a few measurable graph parameters, and supplies a natural statistical classifier for wild versus tame.
Submitted 13 December, 2023; v1 submitted 18 February, 2021;
originally announced February 2021.
-
K-Nearest Neighbor Approximation Via the Friend-of-a-Friend Principle
Authors:
Jacob D. Baron,
R. W. R. Darling
Abstract:
Suppose $V$ is an $n$-element set where for each $x \in V$, the elements of $V \setminus \{x\}$ are ranked by their similarity to $x$. The $K$-nearest neighbor graph is a directed graph including an arc from each $x$ to the $K$ points of $V \setminus \{x\}$ most similar to $x$. Constructive approximation to this graph using far fewer than $n^2$ comparisons is important for the analysis of large high-dimensional data sets. $K$-Nearest Neighbor Descent is a parameter-free heuristic where a sequence of graph approximations is constructed, in which second neighbors in one approximation are proposed as neighbors in the next. Run times in a test case fit an $O(n K^2 \log{n})$ pattern. This bound is rigorously justified for a similar algorithm, using range queries, when applied to a homogeneous Poisson process in suitable dimension. However the basic algorithm fails to achieve subquadratic complexity on sets whose similarity rankings arise from a ``generic'' linear order on the $\binom{n}{2}$ inter-point distances in a metric space.
Submitted 28 December, 2020; v1 submitted 20 August, 2019;
originally announced August 2019.
-
Anomaly Detection and Correction in Large Labeled Bipartite Graphs
Authors:
R. W. R. Darling,
Mark L. Velednitsky
Abstract:
Binary classification problems can be naturally modeled as bipartite graphs, where we attempt to classify right nodes based on their left adjacencies. We consider the case of labeled bipartite graphs in which some labels and edges are not trustworthy. Our goal is to reduce noise by identifying and fixing these labels and edges.
We first propose a geometric technique for generating random graph instances with untrustworthy labels and analyze the resulting graph properties. We focus on generating graphs which reflect real-world data, where degree and label frequencies follow power law distributions.
We review several algorithms for the problem of detection and correction, proposing novel extensions and making observations specific to the bipartite case. These algorithms range from math programming algorithms to discrete combinatorial algorithms to Bayesian approximation algorithms to machine learning algorithms.
We compare the performance of all these algorithms using several metrics and, based on our observations, identify the relative strengths and weaknesses of each individual algorithm.
Submitted 11 November, 2018;
originally announced November 2018.
-
The Four Point Permutation Test for Latent Block Structure in Incidence Matrices
Authors:
R W R Darling,
Cheyne Homberger
Abstract:
Transactional data may be represented as a bipartite graph $G:=(L \cup R, E)$, where $L$ denotes agents, $R$ denotes objects visible to many agents, and an edge in $E$ denotes an interaction between an agent and an object. Unsupervised learning seeks to detect block structures in the adjacency matrix $Z$ between $L$ and $R$, thus grouping together sets of agents with similar object interactions. New results on quasirandom permutations suggest a non-parametric \textbf{four point test} to measure the amount of block structure in $G$, with respect to vertex orderings on $L$ and $R$. Take disjoint 4-edge random samples, order these four edges by left endpoint, and count the relative frequencies of the $4!$ possible orderings of the right endpoint. When these orderings are equiprobable, the edge set $E$ corresponds to a quasirandom permutation $π$ of $|E|$ symbols. Total variation distance of the relative frequency vector away from the uniform distribution on 24 permutations measures the amount of block structure. Such a test statistic, based on $\lfloor |E|/4 \rfloor$ samples, is computable in $O(|E|/p)$ time on $p$ processors. Possibly block structure may be enhanced by precomputing \textbf{natural orders} on $L$ and $R$, related to the second eigenvector of graph Laplacians. In practice this takes $O(d |E|)$ time, where $d$ is the graph diameter. Five open problems are described.
Submitted 19 July, 2019; v1 submitted 3 October, 2018;
originally announced October 2018.
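The sampling procedure described in the four point test abstract above can be sketched in Python as follows. This is a single-processor illustration under stated assumptions, not the authors' implementation: the function name is hypothetical, endpoints are assumed to be integers giving positions in the vertex orderings on $L$ and $R$, and ties between endpoints (which the abstract does not address) are broken by the stable sort order within each sample.

```python
import random
from collections import Counter
from itertools import permutations


def four_point_statistic(edges, rng):
    """Total variation distance of 4-edge order patterns from uniform.

    edges: iterable of (left, right) pairs of comparable endpoints.
    rng:   a random.Random instance used to form disjoint random samples.
    Hypothetical helper for illustration; ties are broken by stable sorting.
    """
    edges = list(edges)
    rng.shuffle(edges)
    n_samples = len(edges) // 4          # floor(|E| / 4) disjoint samples
    if n_samples == 0:
        raise ValueError("need at least four edges")
    counts = Counter()
    for i in range(n_samples):
        sample = edges[4 * i: 4 * i + 4]
        sample.sort(key=lambda e: e[0])  # order the four edges by left endpoint
        rights = [r for _, r in sample]
        # Convert the right endpoints to their rank pattern in {0, 1, 2, 3}.
        order = sorted(range(4), key=lambda j: rights[j])
        pattern = [0] * 4
        for rank, j in enumerate(order):
            pattern[j] = rank
        counts[tuple(pattern)] += 1
    # Distance between observed frequencies and the uniform law on 4! patterns.
    return 0.5 * sum(
        abs(counts[p] / n_samples - 1 / 24) for p in permutations(range(4))
    )


# A perfectly monotone edge set always yields the identity pattern.
diagonal_tv = four_point_statistic([(i, i) for i in range(400)], random.Random(0))
```

For the diagonal edge set every sample produces the same pattern, so the statistic takes its maximum value $23/24$; values near $0$ correspond to the equiprobable case described in the abstract.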
-
The Combinatorial Data Fusion Problem in Conflicted-supervised Learning
Authors:
R. W. R. Darling,
David G. Harris,
Dev R. Phulara,
John A. Proos
Abstract:
The best merge problem in industrial data science generates instances where disparate data sources place incompatible relational structures on the same set $V$ of objects. Graph vertex labelling data may include (1) missing or erroneous labels, (2) assertions that two vertices carry the same (unspecified) label, and (3) prohibitions against some subset of vertices carrying the same label. Conflicted-supervised learning applies to cases where no labelling scheme satisfies (1), (2), and (3). Our rigorous formulation starts from a connected weighted graph $(V, E)$, and an independence system $\mathcal{S}$ on $V$, characterized by its circuits, called forbidden sets. Global incompatibility is expressed by the fact $V \notin \mathcal{S}$. Combinatorial data fusion seeks a subset $E_1 \subset E$ of maximum edge weight so that no vertex component of the subgraph $(V, E_1)$ contains any forbidden set. Multicut and multiway cut are special cases where all forbidden sets have cardinality two. The general case exhibits unintuitive properties, shown in counterexamples. The first in a series of papers concentrates on cases where $(V, E)$ is a tree, and presents an algorithm on general graphs, in which the combinatorial data fusion problem is transferred to the Gomory-Hu tree, where it is solved using greedy set cover. Experimental results are given.
Submitted 23 September, 2018;
originally announced September 2018.
-
Euclidean Embedding of the Poisson Weighted Infinite Tree and Application to Mobility Models
Authors:
R. W. R. Darling,
Robin Pemantle
Abstract:
Continuous time branching models are used to create random fractals in a Euclidean space, whose Hausdorff dimension is controlled by an input parameter. Finite realizations are applied in modelling the set of sites visited in models of human and animal mobility.
Submitted 23 May, 2018;
originally announced May 2018.
-
Rank deficiency in sparse random GF[2] matrices
Authors:
R. W. R. Darling,
Mathew D. Penrose,
Andrew R. Wade,
Sandy L. Zabell
Abstract:
Let $M$ be a random $m \times n$ matrix with binary entries and i.i.d. rows. The weight (i.e., number of ones) of a row has a specified probability distribution, with the row chosen uniformly at random given its weight. Let $N(n,m)$ denote the number of left null vectors in $\{0,1\}^m$ for $M$ (including the zero vector), where addition is mod 2. We take $n, m \to \infty$, with $m/n \to α > 0$, while the weight distribution may vary with $n$ but converges weakly to a limiting distribution on $\{3, 4, 5, \ldots\}$; let $W$ denote a variable with this limiting distribution. Identifying $M$ with a hypergraph on $n$ vertices, we define the 2-core of $M$ as the terminal state of an iterative algorithm that deletes every row incident to a column of degree 1.
We identify two thresholds $α^*$ and $\underline{α}$, and describe them analytically in terms of the distribution of $W$. Threshold $α^*$ marks the infimum of values of $α$ at which $n^{-1} \log{\mathbb{E} [N(n,m)]}$ converges to a positive limit, while $\underline{α}$ marks the infimum of values of $α$ at which there is a 2-core of non-negligible size compared to $n$ having more rows than non-empty columns.
We have $1/2 \leq α^* \leq \underline{α} \leq 1$, and typically these inequalities are strict; for example when $W = 3$ almost surely, numerics give $α^* = 0.88949\ldots$ and $\underline{α} = 0.91793\ldots$ (previous work on this model has mainly been concerned with such cases where $W$ is non-random). The threshold of values of $α$ for which $N(n,m) \geq 2$ in probability lies in $[α^*, \underline{α}]$ and is conjectured to equal $\underline{α}$.
The random row weight setting gives rise to interesting new phenomena not present in the non-random case that has been the focus of previous work.
Submitted 23 November, 2012;
originally announced November 2012.
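The 2-core defined in the abstract above, the terminal state after repeatedly deleting every row incident to a column of degree 1, can be sketched in Python as follows. The representation of each row as a set of column indices and the function name are assumptions made for illustration.

```python
from collections import defaultdict


def two_core(rows):
    """Indices of the rows that survive the peeling described above.

    rows: list of sets; rows[i] holds the column indices where row i has a one.
    Hypothetical helper: the sparse set representation is an assumption.
    """
    col_rows = defaultdict(set)          # column -> set of live rows touching it
    for i, row in enumerate(rows):
        for c in row:
            col_rows[c].add(i)
    alive = set(range(len(rows)))
    stack = [c for c, rs in col_rows.items() if len(rs) == 1]
    while stack:
        c = stack.pop()
        if len(col_rows[c]) != 1:
            continue                     # stale entry: column is already empty
        (i,) = col_rows[c]               # the unique row incident to column c
        alive.discard(i)
        for c2 in rows[i]:
            col_rows[c2].discard(i)
            if len(col_rows[c2]) == 1:   # deletion created a new degree-1 column
                stack.append(c2)
    return sorted(alive)


# Four rows in which every column has degree 3, plus one pendant row.
core_example = two_core([{0, 1, 2}, {0, 1, 3}, {0, 2, 3}, {1, 2, 3}, {3, 4, 5}])
```

In the example the pendant row is deleted because columns 4 and 5 have degree 1, and the remaining four rows form the 2-core.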
-
Maximum GCD Among Pairs of Random Integers
Authors:
R. W. R. Darling,
E. E. Pyle
Abstract:
Fix $α>0$, and sample $N$ integers uniformly at random from $\{1,2,\ldots ,\lfloor e^{αN}\rfloor \}$. Given $η>0$, the probability that the maximum of the pairwise GCDs lies between $N^{2-η}$ and $N^{2+η}$ converges to 1 as $N\to \infty $. More precise estimates are obtained. This is a Birthday Problem: two of the random integers are likely to share some prime factor of order $N^2/\log [N]$. The proof generalizes to any arithmetical semigroup where a suitable form of the Prime Number Theorem is valid.
Submitted 13 November, 2009;
originally announced November 2009.
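The sampling experiment in the abstract above is easy to try numerically. The following Python sketch is an illustration only: the function names are hypothetical, and $\lfloor e^{αN} \rfloor$ is computed in floating point, which is approximate for large $αN$ and overflows once $αN$ exceeds roughly 709.

```python
import math
import random
from itertools import combinations


def max_pairwise_gcd(values):
    """Largest gcd over all unordered pairs; needs at least two values."""
    return max(math.gcd(a, b) for a, b in combinations(values, 2))


def simulate_max_gcd(n, alpha, rng):
    """Sample n integers uniformly from {1, ..., floor(e^(alpha n))}.

    Returns the maximum pairwise gcd of the sample. The upper limit is a
    floating-point approximation (an assumption of this sketch).
    """
    upper = math.floor(math.exp(alpha * n))
    sample = [rng.randint(1, upper) for _ in range(n)]
    return max_pairwise_gcd(sample)


small_check = max_pairwise_gcd([12, 18, 35, 49])
simulated = simulate_max_gcd(30, 1.0, random.Random(0))
```

Repeating `simulate_max_gcd` for increasing $N$ and comparing the results with $N^2$ gives an informal view of the concentration stated in the abstract.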
-
Differential equation approximations for Markov chains
Authors:
R. W. R. Darling,
J. R. Norris
Abstract:
We formulate some simple conditions under which a Markov chain may be approximated by the solution to a differential equation, with quantifiable error probabilities. The role of a choice of coordinate functions for the Markov chain is emphasised. The general theory is illustrated in three examples: the classical stochastic epidemic, a population process model with fast and slow variables, and core-finding algorithms for large random hypergraphs.
Submitted 23 April, 2008; v1 submitted 17 October, 2007;
originally announced October 2007.
-
Structure of large random hypergraphs
Authors:
R. W. R. Darling,
J. R. Norris
Abstract:
The theme of this paper is the derivation of analytic formulae for certain large combinatorial structures. The formulae are obtained via fluid limits of pure jump-type Markov processes, established under simple conditions on the Laplace transforms of their Levy kernels. Furthermore, a related Gaussian approximation allows us to describe the randomness which may persist in the limit when certain parameters take critical values. Our method is quite general, but is applied here to vertex identifiability in random hypergraphs. A vertex v is identifiable in n steps if there is a hyperedge containing v all of whose other vertices are identifiable in fewer steps.
We say that a hyperedge is identifiable if every one of its vertices is identifiable. Our analytic formulae describe the asymptotics of the number of identifiable vertices and the number of identifiable hyperedges for a Poisson($β$) random hypergraph $Λ$ on a set $V$ of $N$ vertices, in the limit as $N \to \infty$. Here $β$ is a formal power series with nonnegative coefficients $β_0, β_1, \ldots$, and $(Λ(A))_{A \subseteq V}$ are independent Poisson random variables such that $Λ(A)$, the number of hyperedges on $A$, has mean $N β_j / \binom{N}{j}$ whenever $|A| = j$.
Submitted 22 March, 2005;
originally announced March 2005.
-
Continuous and discontinuous phase transitions in hypergraph processes
Authors:
R. W. R. Darling,
D. A. Levin,
J. R. Norris
Abstract:
Let V denote a set of N vertices. To construct a "hypergraph process", create a new hyperedge at each event time of a Poisson process; the cardinality K of this hyperedge is random, with arbitrary probability generating function r(x), except that we assume P(K=1) +P(K=2) > 0. Given K=k, the k vertices appearing in the new hyperedge are selected uniformly at random from V. Hyperedges of cardinality 1 are called patches, and serve as a way of selecting root vertices. Identifiable vertices are those which are reachable from these root vertices, in a strong sense which generalizes the notion of graph component. Hyperedges are also called identifiable if all of their vertices are identifiable. We use "fluid limit" scaling: hyperedges arrive at rate N, and we study structures of size O(1) and O(N). After division by N, numbers of identifiable vertices and reducible hyperedges exhibit phase transitions, which may be continuous or discontinuous depending on the shape of the structure function -log(1 - x)/r'(x), for x in (0,1). Both the case P(K=1) > 0 and the case P(K=1) = 0 < P(K=2) are considered; for the latter, a single extraneous patch is added to mark the root vertex.
Submitted 2 March, 2004; v1 submitted 24 December, 2003;
originally announced December 2003.
-
Fluid Limits of Pure Jump Markov Processes: a Practical Guide
Authors:
R. W. R. Darling
Abstract:
A rescaled Markov chain converges uniformly in probability to the solution of an ordinary differential equation, under carefully specified assumptions.
The presentation is much simpler than those in the outside literature.
The result may be used to build parsimonious models of large random or pseudo-random systems.
Submitted 23 December, 2002; v1 submitted 8 October, 2002;
originally announced October 2002.
-
Structure of large random hypergraphs
Authors:
R. W. R. Darling,
J. R. Norris
Abstract:
The theme of this paper is the derivation of analytic formulae for certain large combinatorial structures. The formulae are obtained via fluid limits of pure jump type Markov processes, established under simple conditions on the Laplace transforms of their Levy kernels. Furthermore, a related Gaussian approximation allows us to describe the randomness which may persist in the limit when certain parameters take critical values. Our method is quite general, but is applied here to vertex identifiability in random hypergraphs. A vertex v is identifiable in n steps if there is a hyperedge containing v all of whose other vertices are identifiable in fewer than n steps. We say that a hyperedge is identifiable if every one of its vertices is identifiable. Our analytic formulae describe the asymptotics of the number of identifiable vertices and the number of identifiable hyperedges for a Poisson random hypergraph on a set of N vertices, in the limit as N goes to infinity.
Submitted 16 January, 2004; v1 submitted 4 September, 2001;
originally announced September 2001.
-
Geometrically Intrinsic Nonlinear Recursive Filters II: Foundations
Authors:
R. W. R. Darling
Abstract:
This paper contains the technical foundations from stochastic differential geometry for the construction of geometrically intrinsic nonlinear recursive filters. A diffusion X on a manifold N is run for a time interval T, with a random initial condition. There is a single observation consisting of a nonlinear function of X(T), corrupted by noise, and with values in another manifold M. The noise covariance of X and the observation covariance themselves induce geometries on M and N, respectively. Using these geometries we compute approximate but coordinate-free formulas for the "best estimate" of X(T), given the observation, and its conditional variance. Calculations are based on use of Jacobi fields and of "intrinsic location parameters", a notion derived from the heat flow of harmonic mappings. When any nonlinearity is present, the resulting formulas are not the same as those for the continuous-discrete Extended Kalman Filter. A subsidiary result is a formula for computing approximately the "exponential barycenter" of a random variable S on a manifold, i.e. a point z such that the inverse image of S under the exponential map at z has mean zero in the tangent space at z.
Submitted 6 September, 1998;
originally announced September 1998.
-
Geometrically Intrinsic Nonlinear Recursive Filters I: Algorithms
Authors:
R. W. R. Darling
Abstract:
The Geometrically Intrinsic Nonlinear Recursive Filter, or GI Filter, is designed to estimate an arbitrary continuous-time Markov diffusion process X subject to nonlinear discrete-time observations. The GI Filter is fundamentally different from the much-used Extended Kalman Filter (EKF), and its second-order variants, even in the simplest nonlinear case, in that: (i) It uses a quadratic function of a vector observation to update the state, instead of the linear function used by the EKF. (ii) It is based on deeper geometric principles, which make the GI Filter coordinate-invariant. This implies, for example, that if a linear system were subjected to a nonlinear transformation f of the state-space and analyzed using the GI Filter, the resulting state estimates and conditional variances would be the push-forward under f of the Kalman Filter estimates for the untransformed system - a property which is not shared by the EKF or its second-order variants.
The noise covariance of X and the observation covariance themselves induce geometries on state space and observation space, respectively, and associated canonical connections. A sequel to this paper develops stochastic differential geometry results - based on "intrinsic location parameters", a notion derived from the heat flow of harmonic mappings - from which we derive the coordinate-free filter update formula. The present article presents the algorithm with reference to a specific example - the problem of tracking and intercepting a target, using sensors based on a moving missile. Computational experiments show that, when the observation function is highly nonlinear, there exist choices of the noise parameters at which the GI Filter significantly outperforms the EKF.
Submitted 6 September, 1998;
originally announced September 1998.
-
Intrinsic Location Parameter of a Diffusion Process
Authors:
R. W. R. Darling
Abstract:
For nonlinear functions f of a random vector Y, E[f(Y)] and f(E[Y]) usually differ. Consequently the mathematical expectation of Y is not intrinsic: when we change coordinate systems, it is not invariant. This article is about a fundamental and hitherto neglected property of random vectors of the form Y = f(X(t)), where X(t) is the value at time t of a diffusion process X: namely that there exists a measure of location, called the "intrinsic location parameter" (ILP), which coincides with mathematical expectation only in special cases, and which is invariant under change of coordinate systems. The construction uses martingales with respect to the intrinsic geometry of diffusion processes, and the heat flow of harmonic mappings. We compute formulas which could be useful to statisticians, engineers, and others who use diffusion process models; these have immediate application, discussed in a separate article, to the construction of an intrinsic nonlinear analog to the Kalman Filter. We present here a numerical simulation of a nonlinear SDE, showing how well the ILP formula tracks the mean of the SDE for a Euclidean geometry.
Submitted 6 September, 1998;
originally announced September 1998.
-
The Repeated Solicitation Model
Authors:
R. W. R. Darling
Abstract:
This paper presents a probabilistic analysis of what we call the "repeated solicitation model". To give a specific context, suppose B is a direct marketing company with a list of S sales prospects. At epoch 1, B sends a solicitation to every prospect on the list, and elicits X(1) replies. The company deletes the respondents from the list, and at epoch 2 sends a solicitation to the other prospects, of whom X(2) respond, and so on. This continues until an epoch n such that X(n) = 0, which we call epoch T, and then B makes no further solicitations. We seek (a) the probability distribution of T; (b) the distribution of the total number of respondents; (c) the expected total number of solicitations. All three quantities are explicitly computed, assuming that (i) prospects' response times are independent, and (ii) S is Poisson distributed.
Submitted 15 September, 1998; v1 submitted 11 August, 1998;
originally announced August 1998.