-
Optimal-Depth Sorting Networks
Authors:
Daniel Bundala,
Michael Codish,
Luís Cruz-Filipe,
Peter Schneider-Kamp,
Jakub Závodný
Abstract:
We solve a 40-year-old open problem on the depth optimality of sorting networks. In 1973, Donald E. Knuth detailed, in Volume 3 of "The Art of Computer Programming", sorting networks of the smallest depth known at the time for n =< 16 inputs, quoting optimality for n =< 8. In 1989, Parberry proved the optimality of the networks with 9 =< n =< 10 inputs. In this article, we present a general techni…
▽ More
We solve a 40-year-old open problem on the depth optimality of sorting networks. In 1973, Donald E. Knuth detailed, in Volume 3 of "The Art of Computer Programming", sorting networks of the smallest depth known at the time for n =< 16 inputs, quoting optimality for n =< 8. In 1989, Parberry proved the optimality of the networks with 9 =< n =< 10 inputs. In this article, we present a general technique for obtaining such optimality results, and use it to prove the optimality of the remaining open cases of 11 =< n =< 16 inputs. We show how to exploit symmetry to construct a small set of two-layer networks on n inputs such that if there is a sorting network on n inputs of a given depth, then there is one whose first layers are in this set. For each network in the resulting set, we construct a propositional formula whose satisfiability is necessary for the existence of a sorting network of a given depth. Using an off-the-shelf SAT solver we show that the sorting networks listed by Knuth are optimal. For n =< 10 inputs, our algorithm is orders of magnitude faster than the prior ones.
△ Less
Submitted 17 December, 2014;
originally announced December 2014.
-
Optimal Sorting Networks
Authors:
Daniel Bundala,
Jakub Závodný
Abstract:
This paper settles the optimality of sorting networks given in The Art of Computer Programming vol. 3 more than 40 years ago. The book lists efficient sorting networks with n <= 16 inputs. In this paper we give general combinatorial arguments showing that if a sorting network with a given depth exists then there exists one with a special form. We then construct propositional formulas whose satisfi…
▽ More
This paper settles the optimality of sorting networks given in The Art of Computer Programming vol. 3 more than 40 years ago. The book lists efficient sorting networks with n <= 16 inputs. In this paper we give general combinatorial arguments showing that if a sorting network with a given depth exists then there exists one with a special form. We then construct propositional formulas whose satisfiability is necessary for the existence of such a network. Using a SAT solver we conclude that the listed networks have optimal depth. For n <= 10 inputs where optimality was known previously, our algorithm is four orders of magnitude faster than those in prior work.
△ Less
Submitted 22 December, 2013; v1 submitted 23 October, 2013;
originally announced October 2013.
-
Aggregation and Ordering in Factorised Databases
Authors:
Nurzhan Bakibayev,
Tomáš Kočiský,
Dan Olteanu,
Jakub Závodný
Abstract:
A common approach to data analysis involves understanding and manipulating succinct representations of data. In earlier work, we put forward a succinct representation system for relational data called factorised databases and reported on the main-memory query engine FDB for select-project-join queries on such databases.
In this paper, we extend FDB to support a larger class of practical queries…
▽ More
A common approach to data analysis involves understanding and manipulating succinct representations of data. In earlier work, we put forward a succinct representation system for relational data called factorised databases and reported on the main-memory query engine FDB for select-project-join queries on such databases.
In this paper, we extend FDB to support a larger class of practical queries with aggregates and ordering. This requires novel optimisation and evaluation techniques. We show how factorisation coupled with partial aggregation can effectively reduce the number of operations needed for query evaluation. We also show how factorisations of query results can support enumeration of tuples in desired orders as efficiently as listing them from the unfactorised, sorted results.
We experimentally observe that FDB can outperform off-the-shelf relational engines by orders of magnitude.
△ Less
Submitted 1 July, 2013;
originally announced July 2013.
-
FDB: A Query Engine for Factorised Relational Databases
Authors:
Nurzhan Bakibayev,
Dan Olteanu,
Jakub Závodný
Abstract:
Factorised databases are relational databases that use compact factorised representations at the physical layer to reduce data redundancy and boost query performance. This paper introduces FDB, an in-memory query engine for select-project-join queries on factorised databases. Key components of FDB are novel algorithms for query optimisation and evaluation that exploit the succinctness brought by d…
▽ More
Factorised databases are relational databases that use compact factorised representations at the physical layer to reduce data redundancy and boost query performance. This paper introduces FDB, an in-memory query engine for select-project-join queries on factorised databases. Key components of FDB are novel algorithms for query optimisation and evaluation that exploit the succinctness brought by data factorisation. Experiments show that for data sets with many-to-many relationships FDB can outperform relational engines by orders of magnitude.
△ Less
Submitted 12 March, 2012;
originally announced March 2012.
-
Factorised Representations of Query Results
Authors:
Dan Olteanu,
Jakub Zavodny
Abstract:
Query tractability has been traditionally defined as a function of input database and query sizes, or of both input and output sizes, where the query result is represented as a bag of tuples. In this report, we introduce a framework that allows to investigate tractability beyond this setting. The key insight is that, although the cardinality of a query result can be exponential, its structure can…
▽ More
Query tractability has been traditionally defined as a function of input database and query sizes, or of both input and output sizes, where the query result is represented as a bag of tuples. In this report, we introduce a framework that allows to investigate tractability beyond this setting. The key insight is that, although the cardinality of a query result can be exponential, its structure can be very regular and thus factorisable into a nested representation whose size is only polynomial in the size of both the input database and query.
For a given query result, there may be several equivalent representations, and we quantify the regularity of the result by its readability, which is the minimum over all its representations of the maximum number of occurrences of any tuple in that representation. We give a characterisation of select-project-join queries based on the bounds on readability of their results for any input database. We complement it with an algorithm that can find asymptotically optimal upper bounds and corresponding factorised representations.
△ Less
Submitted 5 April, 2011;
originally announced April 2011.