-
A Novel Quantum Algorithm for Efficient Attractor Search in Gene Regulatory Networks
Authors:
Mirko Rossini,
Felix M. Weidner,
Joachim Ankerhold,
Hans A. Kestler
Abstract:
The description of gene interactions that constantly occur in the cellular environment is an extremely challenging task due to an immense number of degrees of freedom and incomplete knowledge about microscopic details. Hence, a coarse-grained and rather powerful modeling of such dynamics is provided by Boolean Networks (BNs). BNs are dynamical systems composed of Boolean agents and a record of their possible interactions over time. Stable states in these systems are called attractors, which are closely related to the cellular expression of biological phenotypes. Identifying the full set of attractors is, therefore, of substantial biological interest. However, for conventional high-performance computing, this problem is plagued by an exponential growth of the dynamic state space. Here, we demonstrate a novel quantum search algorithm inspired by Grover's algorithm to be implemented on quantum computing platforms. The algorithm performs an iterative suppression of states belonging to basins of previously discovered attractors from a uniform superposition, thus increasing the amplitudes of states in basins of yet unknown attractors. This approach guarantees that a new attractor state is measured with each iteration of the algorithm, an optimization not currently achieved by any other algorithm in the literature. Tests of its resistance to noise have also shown promising performance on devices from the current Noisy Intermediate-Scale Quantum (NISQ) computing era.
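To make the notions of attractor and basin concrete, the following is a minimal classical sketch, not the quantum algorithm described in the abstract. It enumerates every state of a small synchronous Boolean network and follows each trajectory until it repeats. The three-gene update rule `toy_update` and the function name `find_attractors` are hypothetical, chosen only for illustration. The loop over all 2^n states is the exponential cost the abstract refers to.

```python
from itertools import product


def find_attractors(update, n):
    """Exhaustively find all attractors of a synchronous Boolean network.

    `update` maps a state (tuple of n bits) to its successor state.
    Every one of the 2**n states is visited, so this only scales to small n.
    """
    attractors = set()
    for state in product((0, 1), repeat=n):
        seen = {}   # state -> index at which it first appeared on the path
        path = []
        while state not in seen:
            seen[state] = len(path)
            path.append(state)
            state = update(state)
        # The trajectory re-entered `state`: everything from there on is the cycle.
        cycle = path[seen[state]:]
        # Rotate so the smallest state comes first, giving one canonical form per cycle.
        k = cycle.index(min(cycle))
        attractors.add(tuple(cycle[k:] + cycle[:k]))
    return attractors


def toy_update(state):
    """Hypothetical 3-gene network: a' = b, b' = a, c' = a AND b."""
    a, b, c = state
    return (b, a, a & b)


if __name__ == "__main__":
    for attractor in sorted(find_attractors(toy_update, 3)):
        print(attractor)
```

For this toy network the search yields two fixed points and one cycle of length two; every other state lies in the basin of one of them.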
Submitted 16 August, 2024;
originally announced August 2024.
-
The Art of the Fugue: Minimizing Interleaving in Collaborative Text Editing
Authors:
Matthew Weidner,
Martin Kleppmann
Abstract:
Most existing algorithms for replicated lists, which are widely used in collaborative text editors, suffer from a problem: when two users concurrently insert text at the same position in the document, the merged outcome may interleave the inserted text passages, resulting in corrupted and potentially unreadable text. The problem has gone unnoticed for decades, and it affects both CRDTs and Operational Transformation. This paper defines maximal non-interleaving, our new correctness property for replicated lists. We introduce two related CRDT algorithms, Fugue and FugueMax, and prove that FugueMax satisfies maximal non-interleaving. We also implement our algorithms and demonstrate that Fugue offers performance comparable to state-of-the-art CRDT libraries for text editing.
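The interleaving problem can be illustrated with a deliberately naive sketch. It is not Fugue, FugueMax, or any published list CRDT. The scheme below is hypothetical: each inserted character is tagged with `(offset, replica_id)`, and replicas merge by sorting on that tag. Two concurrent insertions at the same position then come out interleaved character by character.

```python
def naive_insert(replica_id, text):
    """Tag each character with (offset within the insertion, replica id).

    Hypothetical identifier scheme, used only to show how interleaving arises.
    """
    return [((offset, replica_id), ch) for offset, ch in enumerate(text)]


def merge_by_offset(*insertions):
    """Merge by sorting on (offset, replica id): deterministic, but interleaves."""
    items = [item for insertion in insertions for item in insertion]
    items.sort(key=lambda item: item[0])
    return "".join(ch for _, ch in items)


def merge_by_replica(*insertions):
    """Merge by sorting on (replica id, offset): keeps each insertion contiguous."""
    items = [item for insertion in insertions for item in insertion]
    items.sort(key=lambda item: (item[0][1], item[0][0]))
    return "".join(ch for _, ch in items)


if __name__ == "__main__":
    alice = naive_insert("A", "Hello")
    bob = naive_insert("B", "World")
    print(merge_by_offset(alice, bob))   # interleaved characters
    print(merge_by_replica(alice, bob))  # contiguous passages
```

Both merge functions are deterministic, so all replicas converge on the same result; only the second keeps each user's passage readable. Ordering by replica alone handles just this one-position toy case and should not be read as a general solution.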
Submitted 17 November, 2023; v1 submitted 30 April, 2023;
originally announced May 2023.
-
For-Each Operations in Collaborative Apps
Authors:
Matthew Weidner,
Ria Pradeep,
Benito Geordie,
Heather Miller
Abstract:
Conflict-free Replicated Data Types (CRDTs) allow collaborative access to an app's data. We describe a novel CRDT operation, for-each on the list of CRDTs, and demonstrate its use in collaborative apps. Our for-each operation applies a given mutation to each element of a list, including elements inserted concurrently. This often preserves user intention in a way that would otherwise require custom CRDT algorithms. We give example applications of our for-each operation to collaborative rich-text, recipe, and slideshow editors.
Submitted 6 April, 2023;
originally announced April 2023.
-
Collabs: A Flexible and Performant CRDT Collaboration Framework
Authors:
Matthew Weidner,
Huairui Qi,
Maxime Kjaer,
Ria Pradeep,
Benito Geordie,
Yicheng Zhang,
Gregory Schare,
Xuan Tang,
Sicheng Xing,
Heather Miller
Abstract:
A collaboration framework is a distributed system that serves as the data layer for a collaborative app. Conflict-free Replicated Data Types (CRDTs) are a promising theoretical technique for implementing collaboration frameworks. However, existing frameworks are inflexible: they are often one-off implementations of research papers or only permit a restricted set of CRDT semantics, and they do not allow app-specific optimizations. Until now, there was no general framework that lets programmers mix, match, and modify CRDTs.
We solve this with Collabs, a CRDT-based collaboration framework that lets programmers implement their own CRDTs, either from scratch or by composing existing building blocks. Collabs prioritizes both semantic flexibility and performance flexibility: it allows arbitrary app-specific CRDT behaviors and optimizations, while still providing strong eventual consistency. We demonstrate Collabs's capabilities and programming model with example apps and CRDT implementations. We then show that a collaborative rich-text editor using Collabs's built-in CRDTs can scale to over 100 simultaneous users, unlike existing CRDT frameworks and Google Docs. Collabs also has lower end-to-end latency and server CPU usage than a popular Operational Transformation framework, with acceptable CRDT metadata overhead.
Submitted 13 October, 2023; v1 submitted 5 December, 2022;
originally announced December 2022.
-
Composing and Decomposing Op-Based CRDTs with Semidirect Products
Authors:
Matthew Weidner,
Heather Miller,
Christopher Meiklejohn
Abstract:
Operation-based Conflict-free Replicated Data Types (CRDTs) are eventually consistent replicated data types that automatically resolve conflicts between concurrent operations. Op-based CRDTs must be designed differently for each data type, and current designs use ad-hoc techniques to handle concurrent operations that do not naturally commute. We present a new construction, the semidirect product of op-based CRDTs, which combines the operations of two CRDTs into one while handling conflicts between their concurrent operations in a uniform way. We demonstrate the construction's utility by using it to construct novel CRDTs, as well as decomposing several existing CRDTs as semidirect products of simpler CRDTs. Although it reproduces common CRDT semantics, the semidirect product can be viewed as a restricted kind of operational transformation, thus forming a bridge between these two opposing techniques for constructing replicated data types.
Submitted 8 April, 2020;
originally announced April 2020.
-
On Decoding Cohen-Haeupler-Schulman Tree Codes
Authors:
Anand Kumar Narayanan,
Matthew Weidner
Abstract:
Tree codes, introduced by Schulman, are combinatorial structures essential to coding for interactive communication. An infinite family of tree codes with both rate and distance bounded by positive constants is called asymptotically good. Rate being constant is equivalent to the alphabet size being constant. Schulman proved that there are asymptotically good tree code families using the Lovász local lemma, yet their explicit construction remains an outstanding open problem. In a major breakthrough, Cohen, Haeupler and Schulman constructed explicit tree code families with constant distance, but over an alphabet polylogarithmic in the length. Our main result is a randomized polynomial-time decoding algorithm for these codes making novel use of the polynomial method. The number of errors corrected scales roughly as the block length to the three-fourths power, falling short of the constant fraction error correction guaranteed by the constant distance. We further present number-theoretic variants of Cohen-Haeupler-Schulman codes, all correcting a constant fraction of errors with polylogarithmic alphabet size. Towards efficiently correcting close to a constant fraction of errors, we propose a speculative convex optimization approach inspired by compressed sensing.
Submitted 16 September, 2019;
originally announced September 2019.
-
Subquadratic time encodable codes beating the Gilbert-Varshamov bound
Authors:
Anand Kumar Narayanan,
Matthew Weidner
Abstract:
We construct explicit algebraic geometry codes built from the Garcia-Stichtenoth function field tower beating the Gilbert-Varshamov bound for alphabet sizes at least 192. Messages are identified with functions in certain Riemann-Roch spaces associated with divisors supported on multiple places. Encoding amounts to evaluating these functions at degree one places. By exploiting algebraic structures particular to the Garcia-Stichtenoth tower, we devise an intricate deterministic ω/2 < 1.19 runtime exponent encoding and 1+ω/2 < 2.19 expected runtime exponent randomized (unique and list) decoding algorithms. Here ω < 2.373 is the matrix multiplication exponent. If ω = 2, as widely believed, the encoding and decoding runtimes are respectively nearly linear and nearly quadratic. Prior to this work, encoding (resp. decoding) times of code families beating the Gilbert-Varshamov bound were quadratic (resp. cubic) or worse.
Submitted 13 August, 2018; v1 submitted 28 December, 2017;
originally announced December 2017.
-
Fast OLAP Query Execution in Main Memory on Large Data in a Cluster
Authors:
Demian Hespe,
Martin Weidner,
Jonathan Dees,
Peter Sanders
Abstract:
Main memory column-stores have proven to be efficient for processing analytical queries. Still, there has been much less work in the context of clusters. Using only a single machine poses several restrictions: processing power and data volume are bounded by the number of cores and the main memory that fit on one tightly coupled system. To enable the processing of larger data sets, switching to a cluster becomes necessary. In this work, we explore techniques for efficient execution of analytical SQL queries on large amounts of data in a parallel database cluster while making maximal use of the available hardware. This includes precompiled query plans for efficient CPU utilization, full parallelization on single nodes and across the cluster, and efficient inter-node communication. We implement all features in a prototype for running a subset of TPC-H benchmark queries. We evaluate our implementation using a 128-node cluster running TPC-H queries with 30 000 gigabytes of uncompressed data.
Submitted 15 September, 2017;
originally announced September 2017.