-
Branch Prediction Analysis of Morris-Pratt and Knuth-Morris-Pratt Algorithms
Authors:
Cyril Nicaud,
Carine Pivoteau,
Stéphane Vialette
Abstract:
We analyze the classical Morris-Pratt and Knuth-Morris-Pratt pattern matching algorithms through the lens of computer architecture, investigating the impact of incorporating a simple branch prediction mechanism into the model of computation. Assuming a fixed pattern and a random text, we derive precise estimates of the number of mispredictions these algorithms produce using local predictors. Our approach is based on automata theory and Markov chains, providing a foundation for the theoretical analysis of other text algorithms and more advanced branch prediction strategies.
Submitted 17 March, 2025;
originally announced March 2025.
-
Record-biased permutations and their permuton limit
Authors:
Mathilde Bouvel,
Cyril Nicaud,
Carine Pivoteau
Abstract:
In this article, we study a non-uniform distribution on permutations biased by their number of records that we call \emph{record-biased permutations}. We give several generative processes for record-biased permutations, explaining also how they can be used to devise efficient (linear) random samplers. For several classical permutation statistics, we obtain their expectation using the above generative processes, as well as their limit distributions in the regime that has a logarithmic number of records (as in the uniform case). Finally, increasing the bias to obtain a regime with an expected linear number of records, we establish the convergence of record-biased permutations to a deterministic permuton, which we fully characterize.
This model was introduced in our earlier work [N. Auger, M. Bouvel, C. Nicaud, C. Pivoteau, \emph{Analysis of Algorithms for Permutations Biased by Their Number of Records}, AofA 2016], in the context of realistic analysis of algorithms. We conduct here a more thorough study but with a theoretical perspective.
Submitted 3 September, 2024;
originally announced September 2024.
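One simple way to sample from a distribution in which a permutation has weight proportional to θ raised to its number of records is sketched below. It relies on the fact that a permutation is determined by the rank of each entry among the entries seen so far, and that position i holds a record exactly when that rank equals i, so the weight factorizes position by position. This is only an illustrative sketch: the function names are hypothetical, the decoding step is quadratic, and it is not claimed to be one of the generative processes or the linear sampler described in the abstract above.

```python
import random


def sample_record_biased(n, theta, rng=None):
    """Sample a permutation of 1..n with probability proportional to
    theta ** (number of records). Assumes theta > 0."""
    rng = rng or random.Random()
    ranks = []
    for i in range(1, n + 1):
        # Position i is a record (rank i) with probability theta / (theta + i - 1);
        # each of the other i - 1 ranks has probability 1 / (theta + i - 1).
        if rng.random() * (theta + i - 1) < theta:
            ranks.append(i)
        else:
            ranks.append(rng.randint(1, i - 1))
    # Decode ranks into a permutation, from the last position to the first:
    # the entry at position i is the ranks[i-1]-th smallest unused value.
    remaining = list(range(1, n + 1))
    perm = [0] * n
    for i in range(n, 0, -1):
        perm[i - 1] = remaining.pop(ranks[i - 1] - 1)
    return perm


def count_records(perm):
    """Number of left-to-right maxima of perm."""
    best, records = 0, 0
    for v in perm:
        if v > best:
            best, records = v, records + 1
    return records
```

With θ = 1 this reduces to a uniform random permutation; larger values of θ favour permutations with many records.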
-
On the Worst-Case Complexity of TimSort
Authors:
Nicolas Auger,
Vincent Jugé,
Cyril Nicaud,
Carine Pivoteau
Abstract:
TimSort is an intriguing sorting algorithm designed in 2002 for Python, whose worst-case complexity was announced, but not proved until our recent preprint. In fact, there are two slightly different versions of TimSort that are currently implemented in Python and in Java respectively. We propose a pedagogical and insightful proof that the Python version runs in $\mathcal{O}(n\log n)$. The approach we use in the analysis also applies to the Java version, although not without very involved technical details. As a byproduct of our study, we uncover a bug in the Java implementation that can cause the sorting method to fail during the execution. We also give a proof that Python's TimSort running time is in $\mathcal{O}(n + n\log ρ)$, where $ρ$ is the number of runs (i.e. maximal monotonic sequences), which is quite a natural parameter here and part of the explanation for the good behavior of TimSort on partially sorted inputs.
Submitted 7 July, 2019; v1 submitted 22 May, 2018;
originally announced May 2018.
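The parameter ρ in the abstract above can be made concrete with a small sketch that greedily splits an array into runs, taking each run to be either non-decreasing or strictly decreasing. The function name `run_lengths` is hypothetical, and the sketch deliberately omits the other ingredients of TimSort (extending short runs, the merge stack, galloping), so it only illustrates how ρ is counted.

```python
def run_lengths(a):
    """Greedy left-to-right decomposition of a into maximal runs, where a
    run is non-decreasing or strictly decreasing. Returns the run lengths."""
    runs = []
    i, n = 0, len(a)
    while i < n:
        j = i + 1
        if j < n and a[j] < a[i]:
            # Strictly decreasing run.
            while j < n and a[j] < a[j - 1]:
                j += 1
        else:
            # Non-decreasing run (also covers a final single element).
            while j < n and a[j] >= a[j - 1]:
                j += 1
        runs.append(j - i)
        i = j
    return runs
```

For instance, `run_lengths([1, 2, 3, 2, 1, 5, 6])` splits the input into runs of lengths 3, 2 and 2, so ρ = 3 for that array, while an already sorted array has ρ = 1.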
-
Analysis of Algorithms for Permutations Biased by Their Number of Records
Authors:
Nicolas Auger,
Mathilde Bouvel,
Cyril Nicaud,
Carine Pivoteau
Abstract:
The topic of the article is the parametric study of the complexity of algorithms on arrays of pairwise distinct integers. We introduce a model that takes into account the non-uniformness of data, which we call the Ewens-like distribution of parameter $θ$ for records on permutations: the weight $θ^r$ of a permutation depends on its number $r$ of records. We show that this model is meaningful for the notion of presortedness, while still being mathematically tractable. Our results describe the expected value of several classical permutation statistics in this model, and give the expected running time of three algorithms: the Insertion Sort, and two variants of the Min-Max search.
Submitted 10 May, 2016;
originally announced May 2016.
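To make the weight $θ^r$ from the abstract above concrete, the following sketch computes exact expectations of permutation statistics under that weighting by brute-force enumeration for small sizes. The helper names are hypothetical. The closed form used for the expected number of records, the sum of θ/(θ+i-1) for i from 1 to n, is not quoted from the paper; it follows from the observation that position i holds a record with probability θ/(θ+i-1) independently of the other positions, and the code checks it only on small cases. The number of inversions is included because it equals the number of element shifts performed by Insertion Sort.

```python
from fractions import Fraction
from itertools import permutations


def count_records(perm):
    """Number of left-to-right maxima of perm."""
    best, records = 0, 0
    for v in perm:
        if v > best:
            best, records = v, records + 1
    return records


def count_inversions(perm):
    """Number of pairs i < j with perm[i] > perm[j]."""
    n = len(perm)
    return sum(1 for i in range(n) for j in range(i + 1, n) if perm[i] > perm[j])


def expected(stat, n, theta):
    """Exact expectation of stat over permutations of 1..n, where each
    permutation has weight theta ** (number of records). Exponential time."""
    theta = Fraction(theta)
    total_weight = Fraction(0)
    weighted_sum = Fraction(0)
    for perm in permutations(range(1, n + 1)):
        w = theta ** count_records(perm)
        total_weight += w
        weighted_sum += w * stat(perm)
    return weighted_sum / total_weight


def expected_records_closed_form(n, theta):
    """Sum of theta / (theta + i - 1) for i = 1..n (see the lead-in)."""
    theta = Fraction(theta)
    return sum(theta / (theta + i - 1) for i in range(1, n + 1))
```

Using `Fraction` keeps the comparison between the enumeration and the closed form exact, at the cost of only being practical for very small n.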
-
An algorithm computing combinatorial specifications of permutation classes
Authors:
Frédérique Bassino,
Mathilde Bouvel,
Adeline Pierrot,
Carine Pivoteau,
Dominique Rossin
Abstract:
This article presents a methodology that automatically derives a combinatorial specification for a permutation class C, given its basis B of excluded patterns and the set of simple permutations in C, when these sets are both finite. This is achieved considering both pattern avoidance and pattern containment constraints in permutations. The obtained specification yields a system of equations satisfied by the generating function of C, this system being always positive and algebraic. It also yields a uniform random sampler of permutations in C. The method presented is fully algorithmic.
Submitted 31 October, 2016; v1 submitted 2 June, 2015;
originally announced June 2015.
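The notions of pattern containment and of a class defined by a basis B of excluded patterns, used in the abstract above, can be illustrated with a brute-force sketch. The function names are hypothetical and the approach is exponential; it has nothing to do with the specification-deriving algorithm of the article and is only meant to show what membership in such a class means.

```python
from itertools import combinations, permutations


def standardize(seq):
    """Replace the distinct values of seq by their ranks 1..k,
    preserving relative order."""
    order = sorted(seq)
    return tuple(order.index(x) + 1 for x in seq)


def contains(perm, pattern):
    """True if some subsequence of perm is order-isomorphic to pattern."""
    pattern = tuple(pattern)
    return any(
        standardize(sub) == pattern
        for sub in combinations(perm, len(pattern))
    )


def class_members(basis, n):
    """All permutations of 1..n that avoid every pattern in basis."""
    return [
        p
        for p in permutations(range(1, n + 1))
        if not any(contains(p, b) for b in basis)
    ]
```

As a sanity check, the permutations of size 4 avoiding 123 are counted by the Catalan number 14, which `len(class_members([(1, 2, 3)], 4))` reproduces.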