Search | arXiv e-print repository

arXiv:1911.06985 [pdf, ps, other]

Constructing the Bijective and the Extended Burrows-Wheeler Transform in Linear Time

Authors: Hideo Bannai, Juha Kärkkäinen, Dominik Köppl, Marcin Piątkowski

Abstract: The Burrows-Wheeler transform (BWT) is a permutation whose applications are prevalent in data compression and text indexing. The bijective BWT (BBWT) is a bijective variant of it. Although it is known that the BWT can be constructed in linear time for integer alphabets by using a linear time suffix array construction algorithm, it was up to now only conjectured that the BBWT can also be constructed in linear time. We confirm this conjecture by proposing a construction algorithm that is based on SAIS, improving the best known result of $O(n \lg n /\lg \lg n)$ time to linear.

Submitted 22 April, 2021; v1 submitted 16 November, 2019; originally announced November 2019.
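
For orientation, here is a minimal sketch of the BBWT computed directly from its usual definition. It is not the linear-time SAIS-based construction the abstract refers to: it simply factors the text into Lyndon words (Duval's algorithm), sorts all rotations of all factors by their infinite repetitions, and outputs the last letter of each rotation. The function names `lyndon_factorization` and `bbwt` are illustrative choices, and the running time is far from linear.

```python
def lyndon_factorization(s):
    """Duval's algorithm: split s into a non-increasing sequence of Lyndon words."""
    factors, i, n = [], 0, len(s)
    while i < n:
        j, k = i + 1, i
        while j < n and s[k] <= s[j]:
            k = i if s[k] < s[j] else k + 1
            j += 1
        while i <= k:
            factors.append(s[i:i + j - k])
            i += j - k
    return factors


def bbwt(s):
    """Naive bijective BWT: order rotations of the Lyndon factors by their infinite repetitions."""
    n = len(s)
    rotations = []
    for w in lyndon_factorization(s):
        for r in range(len(w)):
            rot = w[r:] + w[:r]
            # A prefix of length 2n of the infinite repetition of rot suffices to
            # order any two rotations, because each rotation has length at most n.
            key = (rot * (2 * n // len(rot) + 1))[:2 * n]
            rotations.append((key, rot[-1]))
    rotations.sort(key=lambda pair: pair[0])
    return "".join(last for _, last in rotations)
```

With this sketch, "banana" factors into the Lyndon words b, an, an, a, and `bbwt("banana")` evaluates to "annbaa".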

arXiv:1711.02910 [pdf, ps, other]

Run Compressed Rank/Select for Large Alphabets

Authors: José Fuentes-Sepúlveda, Juha Kärkkäinen, Dmitry Kosolobov, Simon J. Puglisi

Abstract: Given a string of length $n$ that is composed of $r$ runs of letters from the alphabet $\{0,1,\ldots,σ{-}1\}$ such that $2 \le σ\le r$, we describe a data structure that, provided $r \le n / \log^{ω(1)} n$, stores the string in $r\log\frac{nσ}{r} + o(r\log\frac{nσ}{r})$ bits and supports select and access queries in $O(\log\frac{\log(n/r)}{\log\log n})$ time and rank queries in $O(\log\frac{\log(nσ/r)}{\log\log n})$ time. We show that $r\log\frac{n(σ-1)}{r} - O(\log\frac{n}{r})$ bits are necessary for any such data structure and, thus, our solution is succinct. We also describe a data structure that uses $(1 + ε)r\log\frac{nσ}{r} + O(r)$ bits, where $ε> 0$ is an arbitrary constant, with the same query times but without the restriction $r \le n / \log^{ω(1)} n$. By simple reductions to the colored predecessor problem, we show that the query times are optimal in the important case $r \ge 2^{\log^δn}$, for an arbitrary constant $δ> 0$. We implement our solution and compare it with the state of the art, showing that the closest competitors consume 31-46% more space.

Submitted 26 February, 2018; v1 submitted 8 November, 2017; originally announced November 2017.

Comments: This research has received funding from the European Union's Horizon 2020 research and innovation programme under the Marie Sklodowska-Curie Actions H2020-MSCA-RISE-2015 BIRDS GA No. 690941. 10 pages, 1 figure, 4 tables; published in DCC'2018
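
To make the access and rank queries in the abstract above concrete, the following sketch stores a string as its runs and answers both queries by binary search over run starting positions. It is only a baseline under simple assumptions: it uses plain Python lists (a few machine words per run) and logarithmic-time searches, so it achieves neither the succinct space nor the query times stated in the abstract. The class name `RunLengthString` and its fields are hypothetical.

```python
from bisect import bisect_right
from itertools import groupby


class RunLengthString:
    """Stores a string as runs; access and rank use binary search over run starts."""

    def __init__(self, text):
        self.letters = []      # letter of each run, in text order
        self.starts = []       # starting position of each run
        self.per_letter = {}   # letter -> (starts of its runs, occurrences before each run)
        pos = 0
        for letter, group in groupby(text):
            length = len(list(group))
            self.letters.append(letter)
            self.starts.append(pos)
            starts, before = self.per_letter.setdefault(letter, ([], [0]))
            starts.append(pos)
            before.append(before[-1] + length)
            pos += length
        self.n = pos

    def access(self, i):
        """Return the letter at position i (0-based, 0 <= i < n)."""
        return self.letters[bisect_right(self.starts, i) - 1]

    def rank(self, letter, i):
        """Return the number of occurrences of letter in text[0:i]."""
        if letter not in self.per_letter:
            return 0
        starts, before = self.per_letter[letter]
        r = bisect_right(starts, i - 1)  # runs of this letter that start before position i
        if r == 0:
            return 0
        run_start = starts[r - 1]
        run_len = before[r] - before[r - 1]
        return before[r - 1] + min(i - run_start, run_len)
```

For the text "aaabbaab", which has four runs, `rank("a", 6)` counts the occurrences of "a" in "aaabba" and returns 4.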

arXiv:1611.08898 [pdf, other]

doi 10.4230/LIPIcs.STACS.2017.45

On the Size of Lempel-Ziv and Lyndon Factorizations

Authors: Juha Kärkkäinen, Dominik Kempa, Yuto Nakashima, Simon J. Puglisi, Arseny M. Shur

Abstract: Lyndon factorization and Lempel-Ziv (LZ) factorization are both important tools for analysing the structure and complexity of strings, but their combinatorial structure is very different. In this paper, we establish the first direct connection between the two by showing that while the Lyndon factorization can be bigger than the non-overlapping LZ factorization (which we demonstrate by describing a new, non-trivial family of strings) it is never more than twice the size.

Submitted 27 November, 2016; originally announced November 2016.

Comments: 12 pages

arXiv:1609.06378 [pdf, ps, other]

Linear-time string indexing and analysis in small space

Authors: Djamal Belazzougui, Fabio Cunial, Juha Kärkkäinen, Veli Mäkinen

Abstract: The field of succinct data structures has flourished over the last 16 years. Starting from the compressed suffix array (CSA) by Grossi and Vitter (STOC 2000) and the FM-index by Ferragina and Manzini (FOCS 2000), a number of generalizations and applications of string indexes based on the Burrows-Wheeler transform (BWT) have been developed, all taking an amount of space that is close to the input size in bits. In many large-scale applications, the construction of the index and its usage need to be considered as one unit of computation. Efficient string indexing and analysis in small space lies also at the core of a number of primitives in the data-intensive field of high-throughput DNA sequencing. We report the following advances in string indexing and analysis. We show that the BWT of a string $T\in \{1,\ldots,σ\}^n$ can be built in deterministic $O(n)$ time using just $O(n\logσ)$ bits of space, where $σ\leq n$. Within the same time and space budget, we can build an index based on the BWT that allows one to enumerate all the internal nodes of the suffix tree of $T$. Many fundamental string analysis problems can be mapped to such enumeration, and can thus be solved in deterministic $O(n)$ time and in $O(n\logσ)$ bits of space from the input string. We also show how to build many of the existing indexes based on the BWT, such as the CSA, the compressed suffix tree (CST), and the bidirectional BWT index, in randomized $O(n)$ time and in $O(n\logσ)$ bits of space.
The previously fastest construction algorithms for BWT, CSA and CST, which used $O(n\logσ)$ bits of space, took $O(n\log{\logσ})$ time for the first two structures, and $O(n\log^εn)$ time for the third, where $ε$ is any positive constant. Contrary to the state of the art, our bidirectional BWT index supports every operation in constant time per element in its output.

Submitted 20 September, 2016; originally announced September 2016.

Comments: Journal submission (52 pages, 2 figures)

arXiv:1606.04573 [pdf, ps, other]

String Inference from the LCP Array

Authors: Juha Kärkkäinen, Marcin Piątkowski, Simon J. Puglisi

Abstract: The suffix array, perhaps the most important data structure in modern string processing, is often augmented with the longest common prefix (LCP) array which stores the lengths of the LCPs for lexicographically adjacent suffixes of a string. Together the two arrays are roughly equivalent to the suffix tree with the LCP array representing the tree shape. In order to better understand the combinatorics of LCP arrays, we consider the problem of inferring a string from an LCP array, i.e., determining whether a given array of integers is a valid LCP array, and if it is, reconstructing some string or all strings with that LCP array. There are recent studies of inferring a string from a suffix tree shape but using significantly more information (in the form of suffix links) than is available in the LCP array. We provide two main results. (1) We describe two algorithms for inferring strings from an LCP array when we allow a generalized form of LCP array defined for a multiset of cyclic strings: a linear time algorithm for binary alphabet and a general algorithm with polynomial time complexity for a constant alphabet size. (2) We prove that determining whether a given integer array is a valid LCP array is NP-complete when we require more restricted forms of LCP array defined for a single cyclic or non-cyclic string or a multiset of non-cyclic strings. The result holds whether or not the alphabet is restricted to be binary.
In combination, the two results show that the generalized form of LCP array for a multiset of cyclic strings is fundamentally different from the other more restricted forms.

Submitted 23 February, 2017; v1 submitted 14 June, 2016; originally announced June 2016.

Comments: Added algorithm for general alphabets

ACM Class: F.2.2; G.2.1; G.2.2
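
The inference problem described in the abstract above can be illustrated with a brute-force sketch for the single non-cyclic string case. It builds the suffix array by sorting suffixes, computes the LCP array by direct comparison, and then enumerates every string over a small alphabet to find those with a given LCP array. The exhaustive search is exponential and is not one of the paper's algorithms. The convention that the first LCP entry is 0, and all function names, are assumptions made for this example.

```python
from itertools import product


def suffix_array(s):
    """Starting positions of the suffixes of s in lexicographic order (naive sort)."""
    return sorted(range(len(s)), key=lambda i: s[i:])


def lcp_array(s):
    """lcp[r] = length of the longest common prefix of the suffixes ranked r-1 and r; lcp[0] = 0."""
    sa = suffix_array(s)
    lcp = [0] * len(sa)
    for r in range(1, len(sa)):
        a, b = s[sa[r - 1]:], s[sa[r]:]
        k = 0
        while k < len(a) and k < len(b) and a[k] == b[k]:
            k += 1
        lcp[r] = k
    return lcp


def strings_with_lcp(target, alphabet="ab"):
    """Exhaustively list all strings over alphabet whose LCP array equals target."""
    n = len(target)
    candidates = ("".join(t) for t in product(alphabet, repeat=n))
    return [s for s in candidates if lcp_array(s) == list(target)]
```

For example, `lcp_array("banana")` is `[0, 1, 3, 0, 0, 2]`, while `strings_with_lcp([0, 5])` is empty because no string of length 2 has two suffixes sharing a prefix of length 5.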

arXiv:1605.09362 [pdf, other]

Document Retrieval on Repetitive String Collections

Authors: Travis Gagie, Aleksi Hartikainen, Kalle Karhu, Juha Kärkkäinen, Gonzalo Navarro, Simon J. Puglisi, Jouni Sirén

Abstract: Most of the fastest-growing string collections today are repetitive, that is, most of the constituent documents are similar to many others. As these collections keep growing, a key approach to handling them is to exploit their repetitiveness, which can reduce their space usage by orders of magnitude. We study the problem of indexing repetitive string collections in order to perform efficient document retrieval operations on them. Document retrieval problems are routinely solved by search engines on large natural language collections, but the techniques are less developed on generic string collections. The case of repetitive string collections is even less understood, and there are very few existing solutions. We develop two novel ideas, interleaved LCPs and precomputed document lists, that yield highly compressed indexes solving the problem of document listing (find all the documents where a string appears), top-$k$ document retrieval (find the $k$ documents where a string appears most often), and document counting (count the number of documents where a string appears). We also show that a classical data structure supporting the latter query becomes highly compressible on repetitive data. Finally, we show how the tools we developed can be combined to solve ranked conjunctive and disjunctive multi-term queries under the simple tf-idf model of relevance. We thoroughly evaluate the resulting techniques in various real-life repetitiveness scenarios, and recommend the best choices for each case.

Submitted 18 May, 2017; v1 submitted 30 May, 2016; originally announced May 2016.

Comments: This research has received funding from the European Union's Horizon 2020 research and innovation programme under the Marie Skłodowska-Curie Actions H2020-MSCA-RISE-2015 BIRDS GA No. 690941. Accepted to the Information Retrieval Journal

arXiv:1602.00329 [pdf, other]

doi 10.1007/978-3-319-38851-9_5

Lempel-Ziv Decoding in External Memory

Authors: Djamal Belazzougui, Juha Kärkkäinen, Dominik Kempa, Simon J. Puglisi

Abstract: Simple and fast decoding is one of the main advantages of LZ77-type text encoding used in many popular file compressors such as gzip and 7zip. With the recent introduction of external memory algorithms for Lempel-Ziv factorization there is a need for external memory LZ77 decoding but the standard algorithm makes random accesses to the text and cannot be trivially modified for external memory computation. We describe the first external memory algorithms for LZ77 decoding, prove that their I/O complexity is optimal, and demonstrate that they are very fast in practice, only about three times slower than in-memory decoding (when reading input and writing output is included in the time).

Submitted 31 January, 2016; originally announced February 2016.
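
The "standard algorithm" mentioned in the abstract above can be sketched in a few lines, which also shows where the random accesses come from: every copy phrase reads from an arbitrary earlier position of the output. This is the in-memory decoder, not the external memory algorithms of the paper. The phrase encoding used here is an assumption for illustration: each phrase is a pair `(pos, length)`, and a pair with `length == 0` is a literal whose first component holds the character.

```python
def lz77_decode(phrases):
    """Decode a list of LZ77 phrases given as (pos, length) pairs; length 0 marks a literal."""
    out = []
    for pos, length in phrases:
        if length == 0:
            out.append(pos)  # literal phrase: pos holds the character itself
        else:
            # Copy symbol by symbol so that a phrase may overlap the text it produces.
            for k in range(length):
                out.append(out[pos + k])  # random access into the already decoded text
    return "".join(out)
```

For example, `lz77_decode([("a", 0), ("b", 0), (0, 4), ("c", 0)])` yields "abababc"; the third phrase copies four symbols starting at position 0 and overlaps its own output.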

arXiv:1503.04045 [pdf, ps, other]

doi 10.1142/S0129054118400014

Diverse Palindromic Factorization is NP-Complete

Authors: Hideo Bannai, Travis Gagie, Shunsuke Inenaga, Juha Kärkkäinen, Dominik Kempa, Marcin Piątkowski, Simon J. Puglisi, Shiho Sugimoto

Abstract: We prove that it is NP-complete to decide whether a given string can be factored into palindromes that are each unique in the factorization.

Submitted 16 February, 2017; v1 submitted 13 March, 2015; originally announced March 2015.

arXiv:1412.0967 [pdf, other]

Queries on LZ-Bounded Encodings

Authors: Djamal Belazzougui, Travis Gagie, Paweł Gawrychowski, Juha Kärkkäinen, Alberto Ordóñez, Simon J. Puglisi, Yasuo Tabei

Abstract: We describe a data structure that stores a string $S$ in space similar to that of its Lempel-Ziv encoding and efficiently supports access, rank and select queries. These queries are fundamental for implementing succinct and compressed data structures, such as compressed trees and graphs. We show that our data structure can be built in a scalable manner and is both small and fast in practice compared to other data structures supporting such queries.

Submitted 2 December, 2014; originally announced December 2014.

arXiv:1409.6780 [pdf, other]

Document Counting in Practice

Authors: Travis Gagie, Aleksi Hartikainen, Juha Kärkkäinen, Gonzalo Navarro, Simon J. Puglisi, Jouni Sirén

Abstract: We address the problem of counting the number of strings in a collection where a given pattern appears, which has applications in information retrieval and data mining. Existing solutions are in a theoretical stage. We implement these solutions and develop some new variants, comparing them experimentally on various datasets. Our results not only show which are the best options for each situation and help discard practically unappealing solutions, but also uncover some unexpected compressibility properties of the best data structures. By taking advantage of these properties, we can reduce the size of the structures by a factor of 5–400, depending on the dataset.

Submitted 1 October, 2015; v1 submitted 23 September, 2014; originally announced September 2014.

Comments: This is a slightly extended version of the paper that was presented at DCC 2015. The implementations are available at http://jltsiren.kapsi.fi/rlcsa and https://github.com/ahartik/succinct

arXiv:1403.2431 [pdf, ps, other]

doi 10.1016/j.jda.2014.08.001

A Subquadratic Algorithm for Minimum Palindromic Factorization

Authors: Gabriele Fici, Travis Gagie, Juha Kärkkäinen, Dominik Kempa

Abstract: We give an $\mathcal{O}(n \log n)$-time, $\mathcal{O}(n)$-space algorithm for factoring a string into the minimum number of palindromic substrings. That is, given a string $S [1..n]$, in $\mathcal{O}(n \log n)$ time our algorithm returns the minimum number of palindromes $S_1,\ldots, S_\ell$ such that $S = S_1 \cdots S_\ell$. We also show that the time complexity is $\mathcal{O}(n)$ on average and $Ω(n\log n)$ in the worst case. The last result is based on a characterization of the palindromic structure of Zimin words.

Submitted 7 August, 2014; v1 submitted 10 March, 2014; originally announced March 2014.

Comments: Accepted for publication in Journal of Discrete Algorithms
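
As a baseline for the problem stated in the abstract above, the quadratic dynamic program below computes the minimum number of palindromes whose concatenation equals the input. It is deliberately the simple textbook approach, using quadratic time and space, and not the $\mathcal{O}(n \log n)$-time algorithm of the paper. The function name and the tables `is_pal` and `best` are illustrative.

```python
def min_palindromic_factorization(s):
    """Return the minimum number of palindromes whose concatenation is s (quadratic DP)."""
    n = len(s)
    # is_pal[i][j] is True when s[i:j+1] is a palindrome.
    is_pal = [[False] * n for _ in range(n)]
    for i in range(n - 1, -1, -1):
        for j in range(i, n):
            is_pal[i][j] = s[i] == s[j] and (j - i < 2 or is_pal[i + 1][j - 1])
    # best[j] is the minimum number of palindromes needed to factor the prefix s[:j].
    best = [0] + [n] * n
    for j in range(1, n + 1):
        for i in range(j):
            if is_pal[i][j - 1]:
                best[j] = min(best[j], best[i] + 1)
    return best[n]
```

For instance, "abaab" needs two palindromes ("a" followed by "baab"), whereas "abc" needs three.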

arXiv:1307.1428 [pdf, other]

doi 10.1109/DCC.2014.78

Lempel-Ziv Parsing in External Memory

Authors: Juha Kärkkäinen, Dominik Kempa, Simon J. Puglisi

Abstract: For decades, computing the LZ factorization (or LZ77 parsing) of a string has been a requisite and computationally intensive step in many diverse applications, including text indexing and data compression. Many algorithms for LZ77 parsing have been discovered over the years; however, despite the increasing need to apply LZ77 to massive data sets, no algorithm to date scales to inputs that exceed the size of internal memory. In this paper we describe the first algorithm for computing the LZ77 parsing in external memory. Our algorithm is fast in practice and will allow the next generation of text indexes to be realised for massive strings and string collections.

Submitted 4 July, 2013; originally announced July 2013.

Comments: 10 pages

arXiv:1302.1064 [pdf, other]

doi 10.1007/978-3-642-38527-8_14

Lightweight Lempel-Ziv Parsing

Authors: Juha Kärkkäinen, Dominik Kempa, Simon J. Puglisi

Abstract: We introduce a new approach to LZ77 factorization that uses O(n/d) words of working space and O(dn) time for any d >= 1 (for polylogarithmic alphabet sizes). We also describe carefully engineered implementations of alternative approaches to lightweight LZ77 factorization. Extensive experiments show that the new algorithm is superior in most cases, particularly at the lowest memory levels and for highly repetitive data. As a part of the algorithm, we describe new methods for computing matching statistics which may be of independent interest.

Submitted 6 February, 2013; v1 submitted 5 February, 2013; originally announced February 2013.

Comments: 12 pages

arXiv:1212.2952 [pdf, ps, other]

doi 10.1007/978-3-642-38905-4_19

Linear Time Lempel-Ziv Factorization: Simple, Fast, Small

Authors: Juha Kärkkäinen, Dominik Kempa, Simon J. Puglisi

Abstract: Computing the LZ factorization (or LZ77 parsing) of a string is a computational bottleneck in many diverse applications, including data compression, text indexing, and pattern discovery. We describe new linear time LZ factorization algorithms, some of which require only 2n log n + O(log n) bits of working space to factorize a string of length n. These are the most space efficient linear time algorithms to date, using n log n bits less space than any previous linear time algorithm. The algorithms are also practical, simple to implement, and very fast in practice.

Submitted 12 December, 2012; originally announced December 2012.
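
To show what an LZ factorization looks like, here is a naive greedy sketch that, at each position, searches all earlier starting positions for the longest match (overlaps with the current phrase are allowed). It is far slower than linear and is not one of the algorithms from the abstract above. The output encoding is an assumption for this example: a phrase is `(pos, length)`, and a new letter is emitted as `(letter, 0)`.

```python
def lz77_factorize(s):
    """Greedy LZ77 parsing by exhaustive search; returns (pos, length) or (letter, 0) phrases."""
    phrases, i, n = [], 0, len(s)
    while i < n:
        best_pos, best_len = 0, 0
        for j in range(i):  # try every earlier starting position as a copy source
            k = 0
            while i + k < n and s[j + k] == s[i + k]:
                k += 1
            if k > best_len:
                best_pos, best_len = j, k
        if best_len == 0:
            phrases.append((s[i], 0))  # letter with no earlier occurrence: literal phrase
            i += 1
        else:
            phrases.append((best_pos, best_len))
            i += best_len
    return phrases
```

On "abababc" the sketch produces four phrases: two literals, one copy of length 4 from position 0, and a final literal.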

arXiv:1111.1355 [pdf, ps, other]

A Compressed Self-Index for Genomic Databases

Authors: Travis Gagie, Juha Kärkkäinen, Yakov Nekrich, Simon J. Puglisi

Abstract: Advances in DNA sequencing technology will soon result in databases of thousands of genomes. Within a species, individuals' genomes are almost exact copies of each other; e.g., any two human genomes are 99.9% the same. Relative Lempel-Ziv (RLZ) compression takes advantage of this property: it stores the first genome uncompressed or as an FM-index, then compresses the other genomes with a variant of LZ77 that copies phrases only from the first genome. RLZ achieves good compression and supports fast random access; in this paper we show how to support fast search as well, thus obtaining an efficient compressed self-index.

Submitted 5 November, 2011; originally announced November 2011.

arXiv:1109.3954 [pdf, other]

A Faster Grammar-Based Self-Index

Authors: Travis Gagie, Paweł Gawrychowski, Juha Kärkkäinen, Yakov Nekrich, Simon J. Puglisi

Abstract: To store and search genomic databases efficiently, researchers have recently started building compressed self-indexes based on grammars. In this paper we show how, given a straight-line program with $r$ rules for a string $S [1..n]$ whose LZ77 parse consists of $z$ phrases, we can store a self-index for $S$ in $O(r + z \log \log n)$ space such that, given a pattern $P [1..m]$, we can list the $\mathrm{occ}$ occurrences of $P$ in $S$ in $O(m^2 + \mathrm{occ} \log \log n)$ time. If the straight-line program is balanced and we accept a small probability of building a faulty index, then we can reduce the $O(m^2)$ term to $O(m \log m)$. All previous self-indexes are larger or slower in the worst case.

Submitted 26 September, 2012; v1 submitted 19 September, 2011; originally announced September 2011.

Comments: journal version of LATA '12 paper

arXiv:1104.3810 [pdf, ps, other]

Fixed Block Compression Boosting in FM-Indexes

Authors: Juha Kärkkäinen, Simon J. Puglisi

Abstract: A compressed full-text self-index occupies space close to that of the compressed text and simultaneously allows fast pattern matching and random access to the underlying text. Among the best compressed self-indexes, in theory and in practice, are several members of the FM-index family. In this paper, we describe new FM-index variants that combine nice theoretical properties, simple implementation and improved practical performance. Our main result is a new technique called fixed block compression boosting, which is a simpler and faster alternative to optimal compression boosting and implicit compression boosting used in previous FM-indexes.

Submitted 19 April, 2011; originally announced April 2011.

arXiv:1011.3491 [pdf, other]

Pattern Kits

Authors: Travis Gagie, Kalle Karhu, Juha Kärkkäinen, Veli Mäkinen, Leena Salmela

Abstract: Suppose we have just performed searches in a self-index for two patterns $A$ and $B$ and now we want to search for their concatenation $A B$; how can we best make use of our previous computations? In this paper we consider this problem and, more generally, how we can store a dynamic library of patterns that we can easily manipulate in interesting ways. We give a space- and time-efficient data structure for this problem that is compatible with many of the best self-indexes.

Submitted 2 April, 2011; v1 submitted 15 November, 2010; originally announced November 2010.

arXiv:1011.3480 [pdf, ps, other]

Counting Colours in Compressed Strings

Authors: Travis Gagie, Juha Kärkkäinen

Abstract: Suppose we are asked to preprocess a string $s [1..n]$ such that later, given a substring's endpoints, we can quickly count how many distinct characters it contains. In this paper we give a data structure for this problem that takes $n H_0 (s) + O(n) + o(n H_0 (s))$ bits, where $H_0 (s)$ is the 0th-order empirical entropy of $s$, and answers queries in $O(\log^{1 + ε} n)$ time for any constant $ε> 0$. We also show how our data structure can be made partially dynamic.

Submitted 15 November, 2010; originally announced November 2010.

Showing 1–19 of 19 results for author: Karkkainen, J