-
Inference for Log-Gaussian Cox Point Processes using Bayesian Deep Learning: Application to Human Oral Microbiome Image Data
Authors:
Shuwan Wang,
Christopher K. Wikle,
Athanasios C. Micheas,
Jessica L. Mark Welch,
Jacqueline R. Starr,
Kyu Ha Lee
Abstract:
It is common in nature to see aggregation of objects in space. Exploring the mechanism associated with the locations of such clustered observations can be essential to understanding the phenomenon, such as the source of spatial heterogeneity, or comparison to other event generating processes in the same domain. Log-Gaussian Cox processes (LGCPs) represent an important class of models for quantifying aggregation in a spatial point pattern. However, implementing likelihood-based Bayesian inference for such models presents many computational challenges, particularly in high dimensions. In this paper, we propose a novel likelihood-free inference approach for LGCPs using the recently developed BayesFlow approach, where invertible neural networks are employed to approximate the posterior distribution of the parameters of interest. BayesFlow is a neural simulation-based method based on "amortized" posterior estimation. That is, after an initial training procedure, fast feed-forward operations allow rapid posterior inference for any data within the same model family. Comprehensive numerical studies validate the reliability of the framework and show that BayesFlow achieves substantial computational gain in repeated application, especially for two-dimensional LGCPs. We demonstrate the utility and robustness of the method by applying it to two distinct oral microbial biofilm images.
Submitted 18 March, 2025; v1 submitted 17 February, 2025;
originally announced February 2025.
-
A Bayesian Multivariate Spatial Point Pattern Model: Application to Oral Microbiome FISH Image Data
Authors:
Kyu Ha Lee,
Brent A. Coull,
Suman Majumder,
Patrick J. La Riviere,
Jessica L. Mark Welch,
Jacqueline R. Starr
Abstract:
Advances in cellular imaging technologies, especially those based on fluorescence in situ hybridization (FISH), now allow detailed visualization of the spatial organization of human or bacterial cells. Quantifying this spatial organization is crucial for understanding the function of multicellular tissues or biofilms, with implications for human health and disease. To address the need for better methods to achieve such quantification, we propose a flexible multivariate point process model that characterizes and estimates complex spatial interactions among multiple cell types. The proposed Bayesian framework is appealing due to its unified estimation process and the ability to directly quantify uncertainty in key estimates of interest, such as those of inter-type correlation and the proportion of variance due to inter-type relationships. To ensure stable and interpretable estimation, we consider shrinkage priors for coefficients associated with latent processes. Model selection and comparison are conducted by using a deviance information criterion designed for models with latent variables, effectively balancing the risk of overfitting with that of oversimplifying key quantities. Furthermore, we develop a hierarchical modeling approach to integrate multiple image-specific estimates from a given subject, allowing inference at both the global and subject-specific levels. We apply the proposed method to microbial biofilm image data from the human tongue dorsum and find that specific taxon pairs, such as Streptococcus mitis-Streptococcus salivarius and Streptococcus mitis-Veillonella, exhibit strong positive spatial correlations, while others, such as Actinomyces-Rothia, show slight negative correlations. For most of the taxa, a substantial portion of spatial variance can be attributed to inter-taxon relationships.
Submitted 14 February, 2025;
originally announced February 2025.
-
Efficient Wait-Free Linearizable Implementations of Approximate Bounded Counters Using Read-Write Registers
Authors:
Colette Johnen,
Adnane Khattabi,
Alessia Milani,
Jennifer L. Welch
Abstract:
Relaxing the sequential specification of a shared object is a way to obtain an implementation with better performance compared to implementing the original specification. We apply this approach to the Counter object, under the assumption that the number of times the Counter is incremented in any execution is at most a known bound $m$. We consider the $k$-multiplicative-accurate Counter object, where each read operation returns an approximate value that is within a multiplicative factor $k$ of the accurate value. More specifically, a read is allowed to return an approximate value $x$ of the number $v$ of increments previously applied to the counter such that $v/k \le x \le vk$. We present three algorithms to implement this object in a wait-free linearizable manner in the shared memory model using read-write registers. All the algorithms have read operations whose worst-case step complexity improves exponentially on that for an exact $m$-bounded counter (which in turn improves exponentially on that for an exact unbounded counter). Two of the algorithms have read step complexity that is asymptotically optimal. The algorithms differ in their requirements on $k$, step complexity of the increment operation, and space complexity.
Submitted 21 February, 2024;
originally announced February 2024.
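The accuracy condition stated in the abstract above, $v/k \le x \le vk$, can be checked directly. The following is a minimal sketch of that condition only, not of the paper's wait-free algorithms; the function name `is_k_accurate` is hypothetical, and rejecting $k < 1$ is an assumption made here for illustration.

```python
from fractions import Fraction


def is_k_accurate(x, v, k):
    """Return True if x is within a multiplicative factor k of v.

    x: value returned by a read, v: true number of increments,
    k: accuracy parameter. Implements v/k <= x <= v*k.
    """
    if k < 1:
        # Assumption for this sketch: k < 1 would make the interval
        # empty for every v > 0, so it is rejected up front.
        raise ValueError("k must be at least 1")
    # Exact rational arithmetic avoids rounding errors at the boundaries.
    lower = Fraction(v) / Fraction(k)
    upper = Fraction(v) * Fraction(k)
    return lower <= x <= upper
```

For example, with $v = 10$ and $k = 2$, any returned value between 5 and 20 inclusive satisfies the condition.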
-
Multi-Valued Connected Consensus: A New Perspective on Crusader Agreement and Adopt-Commit
Authors:
Hagit Attiya,
Jennifer L. Welch
Abstract:
Algorithms to solve fault-tolerant consensus in asynchronous systems often rely on primitives such as crusader agreement, adopt-commit, and graded broadcast, which provide weaker agreement properties than consensus. Although these primitives have a similar flavor, they have been defined and implemented separately in ad hoc ways. We propose a new problem called connected consensus that has as special cases crusader agreement, adopt-commit, and graded broadcast, and generalizes them to handle multi-valued inputs. The generalization is accomplished by relating the problem to approximate agreement on graphs.
We present three algorithms for multi-valued connected consensus in asynchronous message-passing systems, one tolerating crash failures and two tolerating malicious (unauthenticated Byzantine) failures. We extend the definition of binding, a desirable property recently identified as supporting binary consensus algorithms that are correct against adaptive adversaries, to the multi-valued input case and show that all our algorithms satisfy the property. Our crash-resilient algorithm has failure-resilience and time complexity that we show are optimal. When restricted to the case of binary inputs, the algorithm has improved time complexity over prior algorithms. Our two algorithms for malicious failures trade off failure resilience and time complexity. The first algorithm has time complexity that we prove is optimal but worse failure-resilience, while the second has failure-resilience that we prove is optimal but worse time complexity. When restricted to the case of binary inputs, the time complexity (as well as resilience) of the second algorithm matches that of prior algorithms.
Submitted 8 August, 2023;
originally announced August 2023.
-
Multivariate cluster point process to quantify and explore multi-entity configurations: Application to biofilm image data
Authors:
Suman Majumder,
Brent A. Coull,
Jessica L. Mark Welch,
Patrick J. La Riviere,
Floyd E. Dewhirst,
Jacqueline R. Starr,
Kyu Ha Lee
Abstract:
Clusters of similar or dissimilar objects are encountered in many fields. Frequently used approaches treat the central object of each cluster as latent. Yet, often objects of one or more types cluster around objects of another type. Such arrangements are common in biomedical images of cells, in which nearby cell types likely interact. Quantifying spatial relationships may elucidate biological mechanisms. Parent-offspring statistical frameworks can be usefully applied even when central objects (parents) differ from peripheral ones (offspring). We propose the novel multivariate cluster point process (MCPP) to quantify multi-object (e.g., multi-cellular) arrangements. Unlike commonly used approaches, the MCPP exploits locations of the central parent object in clusters. It accounts for possibly multilayered, multivariate clustering. The model formulation requires specification of which object types function as cluster centers and which reside peripherally. If such information is unknown, the relative roles of object types may be explored by comparing fit of different models via the deviance information criterion (DIC). In simulated data, we compared DIC of a series of models; the MCPP correctly identified simulated relationships. It also produced more accurate and precise parameter estimates than the classical univariate Neyman-Scott process model. We also used the MCPP to quantify proposed configurations and explore new ones in human dental plaque biofilm image data. MCPP models quantified simultaneous clustering of Streptococcus and Porphyromonas around Corynebacterium and of Pasteurellaceae around Streptococcus and successfully captured hypothesized structures for all taxa. Further exploration suggested the presence of clustering between Fusobacterium and Leptotrichia, a previously unreported relationship.
Submitted 8 October, 2024; v1 submitted 8 February, 2022;
originally announced February 2022.
-
Blunting an Adversary Against Randomized Concurrent Programs with Linearizable Implementations
Authors:
Hagit Attiya,
Constantin Enea,
Jennifer L. Welch
Abstract:
Atomic shared objects, whose operations take place instantaneously, are a powerful abstraction for designing complex concurrent programs. Since they are not always available, they are typically substituted with software implementations. A prominent condition relating these implementations to their atomic specifications is linearizability, which preserves safety properties of the programs using them. However, linearizability does not preserve hyper-properties, which include probabilistic guarantees of randomized programs: an adversary can greatly amplify the probability of a bad outcome. This unwelcome behavior prevents modular reasoning, which is the key benefit provided by the use of linearizable object implementations. A more restrictive property, strong linearizability, does preserve hyper-properties, but it is impossible to achieve in many situations.
This paper suggests a novel approach to blunting the adversary's additional power that works even in cases where strong linearizability is not achievable. We show that a wide class of linearizable implementations, including well-known ones for registers and snapshots, can be modified to approximate the probabilistic guarantees of randomized programs when using atomic objects. The technical approach is to transform the algorithm of each method of an existing linearizable implementation by repeating a carefully chosen prefix of the method several times and then randomly choosing which repetition to use subsequently. We prove that the probability of a bad outcome decreases with the number of repetitions, approaching the probability attained when using atomic objects. The class of implementations to which our transformation applies includes the ABD implementation of a shared register using message-passing and the Afek et al. implementation of an atomic snapshot using single-writer registers.
Submitted 1 March, 2022; v1 submitted 29 June, 2021;
originally announced June 2021.
-
Store-Collect in the Presence of Continuous Churn with Application to Snapshots and Lattice Agreement
Authors:
Hagit Attiya,
Sweta Kumari,
Archit Somani,
Jennifer L. Welch
Abstract:
We present an algorithm for implementing a store-collect object in an asynchronous crash-prone message-passing dynamic system, where nodes continually enter and leave. The algorithm is very simple and efficient, requiring just one round trip for a store operation and two for a collect. We then show the versatility of the store-collect object for implementing churn-tolerant versions of useful data structures, while shielding the user from the complications of the underlying churn. In particular, we present elegant and efficient implementations of atomic snapshot and generalized lattice agreement objects that use store-collect.
Submitted 5 November, 2020; v1 submitted 17 March, 2020;
originally announced March 2020.
-
Byzantine-Tolerant Register in a System with Continuous Churn
Authors:
Saptaparni Kumar,
Jennifer L. Welch
Abstract:
A shared read/write register emulation provides the illusion of shared memory on top of message-passing models. The main hurdle with such emulations is dealing with server faults in the system. Several crash-tolerant register emulations in static systems require algorithms to replicate the value of the shared register onto a majority of servers. Majority correctness is necessary for such emulations. Byzantine faults are considered to be the worst kind of faults that can happen in any distributed system. Emulating a Byzantine-tolerant register requires replicating the register value onto more than two-thirds of the servers. Emulating a register in a dynamic system where servers and clients can enter and leave the system and be faulty is harder than in static systems. There are several crash-tolerant register emulations for dynamic systems. This paper presents the first emulation of a multi-reader multi-writer atomic register in a system that can withstand nodes continually entering and leaving, imposes no upper bound on the system size and can tolerate Byzantine servers. The algorithm works as long as the number of servers entering and leaving during a fixed time interval is at most a constant fraction of the system size at the beginning of the interval, and as long as the number of Byzantine servers in the system is at most f. Although our algorithm requires that there be a constant known upper bound on the number of Byzantine servers, this restriction is unavoidable, as we show that it is impossible to emulate an atomic register if the system size and maximum number of servers that can be Byzantine in the system are unknown to the nodes.
Submitted 13 October, 2019;
originally announced October 2019.
-
A Tight Lower Bound for Clock Synchronization in Odd-Ary M-Toroids
Authors:
Reginald Frank,
Jennifer L. Welch
Abstract:
Synchronizing clocks in a distributed system in which processes communicate through messages with uncertain delays is subject to inherent errors. Prior work has shown upper and lower bounds on the best synchronization achievable in a variety of network topologies and assumptions about the uncertainty on the message delays. However, until now there has not been a tight closed-form expression for the optimal synchronization in $k$-ary $m$-cubes with wraparound, where $k$ is odd. In this paper, we prove a lower bound of $\frac{1}{4}um\left(k-\frac{1}{k}\right)$, where $k$ is the (odd) number of processes in each of the $m$ dimensions, and $u$ is the uncertainty in delay on every link. Our lower bound matches the previously known upper bound.
Submitted 13 July, 2018;
originally announced July 2018.
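The closed-form bound $\frac{1}{4}um\left(k-\frac{1}{k}\right)$ from the abstract above is easy to evaluate for concrete parameters. The sketch below only computes the stated expression; the function name `toroid_sync_lower_bound` and the input validation are assumptions added here for illustration.

```python
from fractions import Fraction


def toroid_sync_lower_bound(u, m, k):
    """Evaluate (1/4) * u * m * (k - 1/k) exactly.

    u: uncertainty in delay on every link,
    m: number of dimensions,
    k: odd number of processes in each dimension.
    """
    if m < 1:
        raise ValueError("m must be a positive integer")
    if k < 1 or k % 2 == 0:
        # The abstract states the bound only for odd k.
        raise ValueError("k must be a positive odd integer")
    return Fraction(u) * m * (k - Fraction(1, k)) / 4
```

For instance, with $u = 1$, $m = 2$, and $k = 3$, the expression evaluates to $\frac{1}{4}\cdot 2\cdot\frac{8}{3} = \frac{4}{3}$.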
-
Simulating a Shared Register in a System that Never Stops Changing
Authors:
Hagit Attiya,
Hyun Chul Chung,
Faith Ellen,
Saptaparni Kumar,
Jennifer L. Welch
Abstract:
Simulating a shared register can mask the intricacies of designing algorithms for asynchronous message-passing systems subject to crash failures, since it allows them to run algorithms designed for the simpler shared-memory model. Typically such simulations replicate the value of the register in multiple servers and require readers and writers to communicate with a majority of servers. The success of this approach for static systems, where the set of nodes (readers, writers, and servers) is fixed, has motivated several similar simulations for dynamic systems, where nodes may enter and leave. However, existing simulations need to assume that the system eventually stops changing for a long enough period or that the system size is bounded. This paper presents the first simulation of an atomic read/write register in a crash-prone asynchronous system that can change size and withstand nodes continually entering and leaving. The simulation allows the system to keep changing, provided that the number of nodes entering and leaving during a fixed time interval is at most a constant fraction of the current system size. The simulation also tolerates node crashes as long as the number of failed nodes in the system is at most a constant fraction of the current system size.
Submitted 10 August, 2017;
originally announced August 2017.
-
Scheduling Sensors by Tiling Lattices
Authors:
Andreas Klappenecker,
Hyunyoung Lee,
Jennifer L. Welch
Abstract:
Suppose that wirelessly communicating sensors are placed in a regular fashion on the points of a lattice. Common communication protocols allow the sensors to broadcast messages at arbitrary times, which can lead to problems should two sensors broadcast at the same time. It is shown that one can exploit a tiling of the lattice to derive a deterministic periodic schedule for the broadcast communication of sensors that is guaranteed to be collision-free. The proposed schedule is shown to be optimal in the number of time slots.
Submitted 7 June, 2008;
originally announced June 2008.
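The idea in the abstract above, deriving a periodic broadcast schedule from a tiling of the lattice, can be illustrated with a small sketch. The abstract does not specify the lattice, the tile shape, or the interference model, so everything concrete here is an assumption: sensors sit on the integer lattice $\mathbb{Z}^2$, a broadcasting sensor at $(x, y)$ affects an $a \times b$ block of points anchored at $(x, y)$, and that block is used as the tile. The function names are hypothetical.

```python
from itertools import product


def slot(point, a, b):
    """Time slot for the sensor at `point`: its position inside its tile."""
    x, y = point
    # Python's % returns a non-negative result for positive a and b,
    # so this also works for negative coordinates.
    return (x % a) * b + (y % b)


def interference_range(point, a, b):
    """Assumed model: points affected when the sensor at `point` broadcasts."""
    x, y = point
    return {(x + dx, y + dy) for dx, dy in product(range(a), range(b))}


def is_collision_free(sensors, a, b):
    """Check that sensors sharing a slot have disjoint interference ranges."""
    by_slot = {}
    for p in sensors:
        by_slot.setdefault(slot(p, a, b), []).append(p)
    for group in by_slot.values():
        covered = set()
        for p in group:
            affected = interference_range(p, a, b)
            if covered & affected:
                return False  # two simultaneous broadcasts overlap
            covered |= affected
    return True
```

In this sketch, sensors that share a slot differ by a translation in $a\mathbb{Z} \times b\mathbb{Z}$, so their assumed interference blocks are translates of the same tile and cannot overlap. The schedule repeats every $a \cdot b$ slots, which is the number of points in the tile.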