
Showing 1–42 of 42 results for author: Tomescu, A

Searching in archive cs.
  1. arXiv:2504.20605  [pdf, other]

    cs.CL cs.AI cs.DL cs.LG

    TF1-EN-3M: Three Million Synthetic Moral Fables for Training Small, Open Language Models

    Authors: Mihai Nadas, Laura Diosan, Andrei Piscoran, Andreea Tomescu

    Abstract: Moral stories are a time-tested vehicle for transmitting values, yet modern NLP lacks a large, structured corpus that couples coherent narratives with explicit ethical lessons. We close this gap with TF1-EN-3M, the first open dataset of three million English-language fables generated exclusively by instruction-tuned models no larger than 8B parameters. Each story follows a six-slot scaffold (chara…

    Submitted 29 April, 2025; originally announced April 2025.

  2. arXiv:2503.14023  [pdf, other]

    cs.CL

    Synthetic Data Generation Using Large Language Models: Advances in Text and Code

    Authors: Mihai Nadas, Laura Diosan, Andreea Tomescu

    Abstract: Large language models (LLMs) have unlocked new possibilities for generating synthetic training data in both natural language and code. By producing artificial but task-relevant examples, these models can significantly augment or even replace real-world datasets, especially when labeled data is scarce or sensitive. This paper surveys recent advances in using LLMs to create synthetic text and code,…

    Submitted 18 March, 2025; originally announced March 2025.

    Comments: 21 pages, 3 tables, 64 references, preprint

  3. arXiv:2502.06459  [pdf, ps, other]

    cs.DS

    Maximum Coverage $k$-Antichains and Chains: A Greedy Approach

    Authors: Manuel Cáceres, Andreas Grigorjew, Wanchote Po Jiamjitrak, Alexandru I. Tomescu

    Abstract: Given an input acyclic digraph $G = (V,E)$ and a positive integer $k$, the problem of Maximum Coverage $k$-Antichains (resp., Chains) denoted as MA-$k$ (resp., MC-$k$) asks to find $k$ sets of pairwise unreachable vertices, known as antichains (resp., $k$ subsequences of paths, known as chains), maximizing the number of vertices covered by these antichains (resp. chains). While MC-$k$ has been rec…

    Submitted 10 February, 2025; originally announced February 2025.

  4. arXiv:2411.03871  [pdf, other]

    cs.DS math.OC q-bio.GN

    Safe Paths and Sequences for Scalable ILPs in RNA Transcript Assembly Problems

    Authors: Francisco Sena, Alexandru I. Tomescu

    Abstract: A common step at the core of many RNA transcript assembly tools is to find a set of weighted paths that best explain the weights of a DAG. While such problems easily become NP-hard, scalable solvers exist only for a basic error-free version of this problem, namely minimally decomposing a network flow into weighted paths. The main result of this paper is to show that we can achieve speedups of tw…

    Submitted 21 December, 2024; v1 submitted 6 November, 2024; originally announced November 2024.

  5. arXiv:2409.20278  [pdf, other]

    cs.DS

    Parameterised Approximation and Complexity of Minimum Flow Decompositions

    Authors: Andreas Grigorjew, Wanchote Jiamjitrak, Brendan Mumey, Alexandru I. Tomescu

    Abstract: Minimum flow decomposition (MFD) is the strongly NP-hard problem of finding a smallest set of integer weighted paths in a graph $G$ whose weighted sum is equal to a given flow $f$ on $G$. Despite its many practical applications, we lack an understanding of graph structures that make MFD easy or hard. In particular, it is not known whether a good approximation algorithm exists when the weights are…

    Submitted 30 September, 2024; originally announced September 2024.

  6. arXiv:2308.08960  [pdf, other]

    cs.DS

    Minimum Path Cover: The Power of Parameterization

    Authors: Manuel Cáceres, Brendan Mumey, Santeri Toivonen, Alexandru I. Tomescu

    Abstract: Computing a minimum path cover (MPC) of a directed acyclic graph (DAG) is a fundamental problem with a myriad of applications, including reachability. Although it is known how to solve the problem by a simple reduction to minimum flow, recent theoretical advances exploit this idea to obtain algorithms parameterized by the number of paths of an MPC, known as the width. These results obtain fast [Mä…

    Submitted 17 August, 2023; originally announced August 2023.

  7. arXiv:2301.13245  [pdf, other]

    cs.DS math.CO q-bio.GN

    A Safety Framework for Flow Decomposition Problems via Integer Linear Programming

    Authors: Fernando H. C. Dias, Manuel Caceres, Lucia Williams, Brendan Mumey, Alexandru I. Tomescu

    Abstract: Many important problems in Bioinformatics (e.g., assembly or multi-assembly) admit multiple solutions, while the final objective is to report only one. A common approach to deal with this uncertainty is finding safe partial solutions (e.g., contigs) which are common to all solutions. Previous research on safety has focused on polynomially-time solvable problems, whereas many successful and natural…

    Submitted 30 January, 2023; originally announced January 2023.

  8. arXiv:2211.09659  [pdf, ps, other]

    cs.DS

    Minimum Path Cover in Parameterized Linear Time

    Authors: Manuel Caceres, Massimo Cairo, Brendan Mumey, Romeo Rizzi, Alexandru I. Tomescu

    Abstract: A minimum path cover (MPC) of a directed acyclic graph (DAG) $G = (V,E)$ is a minimum-size set of paths that together cover all the vertices of the DAG. Computing an MPC is a basic polynomial problem, dating back to Dilworth's and Fulkerson's results in the 1950s. Since the size $k$ of an MPC (also known as the width) can be small in practical applications, research has also studied algorithms who…

    Submitted 17 November, 2022; originally announced November 2022.

    Comments: arXiv admin note: substantial text overlap with arXiv:2107.05717

  9. arXiv:2210.07530  [pdf, other]

    cs.DM math.CO q-bio.QM

    Cut paths and their remainder structure, with applications

    Authors: Massimo Cairo, Shahbaz Khan, Romeo Rizzi, Sebastian Schmidt, Alexandru I. Tomescu, Elia C. Zirondelli

    Abstract: In a strongly connected graph $G = (V,E)$, a cut arc (also called strong bridge) is an arc $e \in E$ whose removal makes the graph no longer strongly connected. Equivalently, there exist $u,v \in V$, such that all $u$-$v$ walks contain $e$. Cut arcs are a fundamental graph-theoretic notion, with countless applications, especially in reachability problems. In this paper we initiate the study of c…

    Submitted 14 October, 2022; originally announced October 2022.

  10. arXiv:2209.00042  [pdf, other]

    cs.DS math.CO math.OC q-bio.GN

    Minimum Flow Decomposition in Graphs with Cycles using Integer Linear Programming

    Authors: Fernando H. C. Dias, Lucia Williams, Brendan Mumey, Alexandru I. Tomescu

    Abstract: Minimum flow decomposition (MFD) -- the problem of finding a minimum set of weighted source-to-sink paths that perfectly decomposes a flow -- is a classical problem in Computer Science, and variants of it are powerful models in different fields such as Bioinformatics and Transportation. Even on acyclic graphs, the problem is NP-hard, and most practical solutions have been via heuristics or approxi…

    Submitted 16 January, 2023; v1 submitted 31 August, 2022; originally announced September 2022.

  11. arXiv:2208.08522  [pdf, other]

    cs.DS cs.DM

    Simplicity in Eulerian Circuits: Uniqueness and Safety

    Authors: Nidia Obscura Acosta, Alexandru I. Tomescu

    Abstract: An Eulerian circuit in a directed graph is one of the most fundamental Graph Theory notions. Detecting if a graph $G$ has a unique Eulerian circuit can be done in polynomial time via the BEST theorem by de Bruijn, van Aardenne-Ehrenfest, Smith and Tutte, 1941-1951 (involving counting arborescences), or via a tailored characterization by Pevzner, 1989 (involving computing the intersection graph of…

    Submitted 25 May, 2023; v1 submitted 17 August, 2022; originally announced August 2022.

    ACM Class: G.2.2

  12. arXiv:2207.02136  [pdf, other]

    cs.DS

    Width Helps and Hinders Splitting Flows

    Authors: Manuel Cáceres, Massimo Cairo, Andreas Grigorjew, Shahbaz Khan, Brendan Mumey, Romeo Rizzi, Alexandru I. Tomescu, Lucia Williams

    Abstract: Minimum flow decomposition (MFD) is the NP-hard problem of finding a smallest decomposition of a network flow/circulation $X$ on a directed graph $G$ into weighted source-to-sink paths whose superposition equals $X$. We show that, for acyclic graphs, considering the \emph{width} of the graph (the minimum number of paths needed to cover all of its edges) yields advances in our understanding of its…

    Submitted 9 May, 2023; v1 submitted 5 July, 2022; originally announced July 2022.

    Comments: A preliminary version was submitted to ESA 2022

  13. arXiv:2201.10372  [pdf, other]

    cs.DS q-bio.GN

    Safety and Completeness in Flow Decompositions for RNA Assembly

    Authors: Shahbaz Khan, Milla Kortelainen, Manuel Cáceres, Lucia Williams, Alexandru I. Tomescu

    Abstract: Decomposing a network flow into weighted paths has numerous applications. Some applications require any decomposition that is optimal w.r.t. some property such as number of paths, robustness, or length. Many bioinformatic applications require a specific decomposition where the paths correspond to some underlying data that generated the flow. For real inputs, no optimization criteria guarantees to…

    Submitted 25 January, 2022; originally announced January 2022.

    Comments: RECOMB 2022. arXiv admin note: text overlap with arXiv:2102.06480

  14. The Labeled Direct Product Optimally Solves String Problems on Graphs

    Authors: Nicola Rizzo, Alexandru I. Tomescu, Alberto Policriti

    Abstract: Suffix trees are an important data structure at the core of optimal solutions to many fundamental string problems, such as exact pattern matching, longest common substring, matching statistics, and longest repeated substring. Recent lines of research focused on extending some of these problems to vertex-labeled graphs, although using ad-hoc approaches which in some cases do not generalize to all i…

    Submitted 11 September, 2021; originally announced September 2021.

    Comments: 19 pages, 8 figures

    Journal ref: Algorithmica (2022) 1-26

  15. arXiv:2107.05717  [pdf, other]

    cs.DS

    Sparsifying, Shrinking and Splicing for Minimum Path Cover in Parameterized Linear Time

    Authors: Manuel Cáceres, Massimo Cairo, Brendan Mumey, Romeo Rizzi, Alexandru I. Tomescu

    Abstract: A minimum path cover (MPC) of a directed acyclic graph (DAG) $G = (V,E)$ is a minimum-size set of paths that together cover all the vertices of the DAG. Computing an MPC is a basic polynomial problem, dating back to Dilworth's and Fulkerson's results in the 1950s. Since the size $k$ of an MPC (also known as the width) can be small in practical applications, research has also studied algorithms who…

    Submitted 12 July, 2021; originally announced July 2021.

  16. arXiv:2102.12822  [pdf, other]

    cs.DS cs.CC

    Algorithms and Complexity on Indexing Founder Graphs

    Authors: Massimo Equi, Tuukka Norri, Jarno Alanko, Bastien Cazaux, Alexandru I. Tomescu, Veli Mäkinen

    Abstract: We study the problem of matching a string in a labeled graph. Previous research has shown that unless the Orthogonal Vectors Hypothesis (OVH) is false, one cannot solve this problem in strongly sub-quadratic time, nor index the graph in polynomial time to answer queries efficiently (Equi et al. ICALP 2019, SOFSEM 2021). These conditional lower-bounds cover even deterministic graphs with binary alp…

    Submitted 10 June, 2022; v1 submitted 25 February, 2021; originally announced February 2021.

    Comments: This is an extended full version of WABI 2020 paper (https://doi.org/10.4230/LIPIcs.WABI.2020.7), whose preprint is in arXiv:2005.09342, and of ISAAC 2021 paper (to appear)

    ACM Class: E.1; E.4; F.1.3; F.2.2

  17. arXiv:2102.09041  [pdf, ps, other]

    cs.DC

    Reaching Consensus for Asynchronous Distributed Key Generation

    Authors: Ittai Abraham, Philipp Jovanovic, Mary Maller, Sarah Meiklejohn, Gilad Stern, Alin Tomescu

    Abstract: We give a protocol for Asynchronous Distributed Key Generation (A-DKG) that is optimally resilient (can withstand $f<\frac{n}{3}$ faulty parties), has a constant expected number of rounds, has $\tilde{O}(n^3)$ expected communication complexity, and assumes only the existence of a PKI. Prior to our work, the best A-DKG protocols required $Ω(n)$ expected number of rounds, and $Ω(n^4)$ expected commu…

    Submitted 4 June, 2021; v1 submitted 17 February, 2021; originally announced February 2021.

  18. arXiv:2102.06480  [pdf, other]

    cs.DS

    Optimizing Safe Flow Decompositions in DAGs

    Authors: Shahbaz Khan, Alexandru I. Tomescu

    Abstract: Network flow is one of the most studied combinatorial optimization problems having innumerable applications. Any flow on a directed acyclic graph $G$ having $n$ vertices and $m$ edges can be decomposed into a set of $O(m)$ paths. In some applications, each solution (decomposition) corresponds to some particular data that generated the original flow. Given the possibility of multiple optimal soluti…

    Submitted 4 July, 2022; v1 submitted 12 February, 2021; originally announced February 2021.

    Comments: 16 pages, 6 figures, Accepted at ESA 2022

  19. arXiv:2011.12635  [pdf, other]

    cs.DM math.CO q-bio.GN

    The Hydrostructure: a Universal Framework for Safe and Complete Algorithms for Genome Assembly

    Authors: Massimo Cairo, Shahbaz Khan, Romeo Rizzi, Sebastian Schmidt, Alexandru I. Tomescu, Elia C. Zirondelli

    Abstract: Genome assembly is a fundamental problem in Bioinformatics, requiring to reconstruct a source genome from an assembly graph built from a set of reads (short strings sequenced from the genome). A notion of genome assembly solution is that of an arc-covering walk of the graph. Since assembly graphs admit many solutions, the goal is to find what is definitely present in all solutions, or what is safe…

    Submitted 2 November, 2021; v1 submitted 25 November, 2020; originally announced November 2020.

  20. arXiv:2007.07575  [pdf, ps, other]

    cs.DS

    A linear-time parameterized algorithm for computing the width of a DAG

    Authors: Manuel Cáceres, Massimo Cairo, Brendan Mumey, Romeo Rizzi, Alexandru I. Tomescu

    Abstract: The width $k$ of a directed acyclic graph (DAG) $G = (V, E)$ equals the largest number of pairwise non-reachable vertices. Computing the width dates back to Dilworth's and Fulkerson's results in the 1950s, and is doable in quadratic time in the worst case. Since $k$ can be small in practical applications, research has also studied algorithms whose complexity is parameterized on $k$. Despite these…

    Submitted 24 June, 2021; v1 submitted 15 July, 2020; originally announced July 2020.

  21. Safety in $s$-$t$ Paths, Trails and Walks

    Authors: Massimo Cairo, Shahbaz Khan, Romeo Rizzi, Sebastian Schmidt, Alexandru I. Tomescu

    Abstract: Given a directed graph $G$ and a pair of nodes $s$ and $t$, an \emph{$s$-$t$ bridge} of $G$ is an edge whose removal breaks all $s$-$t$ paths of $G$ (and thus appears in all $s$-$t$ paths). Computing all $s$-$t$ bridges of $G$ is a basic graph problem, solvable in linear time. In this paper, we consider a natural generalisation of this problem, with the notion of "safety" from bioinformatics. We…

    Submitted 17 July, 2020; v1 submitted 9 July, 2020; originally announced July 2020.

  22. arXiv:2006.15024  [pdf, other]

    cs.DS

    Computing all $s$-$t$ bridges and articulation points simplified

    Authors: Massimo Cairo, Shahbaz Khan, Romeo Rizzi, Sebastian Schmidt, Alexandru I. Tomescu, Elia Zirondelli

    Abstract: Given a directed graph $G$ and a pair of nodes $s$ and $t$, an $s$-$t$ bridge of $G$ is an edge whose removal breaks all $s$-$t$ paths of $G$. Similarly, an $s$-$t$ articulation point of $G$ is a node whose removal breaks all $s$-$t$ paths of $G$. Computing the sequence of all $s$-$t$ bridges of $G$ (as well as the $s$-$t$ articulation points) is a basic graph problem, solvable in linear time usin…

    Submitted 26 June, 2020; originally announced June 2020.

    Comments: 5 pages, 5 figures

  23. arXiv:2005.09342  [pdf, other]

    cs.DS

    Linear Time Construction of Indexable Founder Block Graphs

    Authors: Veli Mäkinen, Bastien Cazaux, Massimo Equi, Tuukka Norri, Alexandru I. Tomescu

    Abstract: We introduce a compact pangenome representation based on an optimal segmentation concept that aims to reconstruct founder sequences from a multiple sequence alignment (MSA). Such founder sequences have the feature that each row of the MSA is a recombination of the founders. Several linear time dynamic programming algorithms have been previously devised to optimize segmentations that induce founder…

    Submitted 19 May, 2020; originally announced May 2020.

    ACM Class: E.1; F.2.2; J.3

  24. arXiv:2002.10498  [pdf, other]

    cs.DM math.CO q-bio.GN

    Genome assembly, from practice to theory: safe, complete and linear-time

    Authors: Massimo Cairo, Romeo Rizzi, Alexandru I. Tomescu, Elia C. Zirondelli

    Abstract: Genome assembly asks to reconstruct an unknown string from many shorter substrings of it. Even though it is one of the key problems in Bioinformatics, it is generally lacking major theoretical advances. Its hardness stems both from practical issues (size and errors of real data), and from the fact that problem formulations inherently admit multiple solutions. Given these, at their core, most state…

    Submitted 8 November, 2020; v1 submitted 24 February, 2020; originally announced February 2020.

  25. arXiv:2002.00629  [pdf, other]

    cs.CC

    Graphs cannot be indexed in polynomial time for sub-quadratic time string matching, unless SETH fails

    Authors: Massimo Equi, Veli Mäkinen, Alexandru I. Tomescu

    Abstract: We consider the following string matching problem on a node-labeled graph $G=(V,E)$: given a pattern string $P$, decide whether there exists a path in $G$ whose concatenation of node labels equals $P$. This is a basic primitive in various problems in bioinformatics, graph databases, or networks. The hardness results of Backurs and Indyk (FOCS 2016) imply that this problem cannot be solved in bette…

    Submitted 4 March, 2020; v1 submitted 3 February, 2020; originally announced February 2020.

    ACM Class: E.1; F.1; F.2.2; G.2.2

  26. arXiv:1902.03560  [pdf, other]

    cs.CC cs.DS

    On the Complexity of Exact Pattern Matching in Graphs: Determinism and Zig-Zag Matching

    Authors: Massimo Equi, Roberto Grossi, Alexandru I. Tomescu, Veli Mäkinen

    Abstract: Exact pattern matching in labeled graphs is the problem of searching paths of a graph $G=(V,E)$ that spell the same string as the given pattern $P[1..m]$. This basic problem can be found at the heart of more complex operations on variation graphs in computational biology, query operations in graph databases, and analysis of heterogeneous networks, where the nodes of some paths must match a sequenc…

    Submitted 10 February, 2019; originally announced February 2019.

    Comments: Further developments on our previous work: arXiv:1901.05264

    ACM Class: E.1; F.1; F.2.2; G.2.2; H.2.3; H.2.8; H.3.3; J.3

  27. arXiv:1807.03720  [pdf, other]

    cs.CR

    sAVSS: Scalable Asynchronous Verifiable Secret Sharing in BFT Protocols

    Authors: Soumya Basu, Alin Tomescu, Ittai Abraham, Dahlia Malkhi, Michael K. Reiter, Emin Gün Sirer

    Abstract: This paper introduces a new way to incorporate verifiable secret sharing (VSS) schemes into Byzantine Fault Tolerance (BFT) protocols. This technique extends the threshold guarantee of classical Byzantine Fault Tolerant algorithms to include privacy as well. This provides applications with a powerful primitive: a threshold trusted third party, which simplifies many difficult problems such as a fai…

    Submitted 21 December, 2018; v1 submitted 10 July, 2018; originally announced July 2018.

  28. arXiv:1804.01626  [pdf, other]

    cs.DC

    SBFT: a Scalable and Decentralized Trust Infrastructure

    Authors: Guy Golan Gueta, Ittai Abraham, Shelly Grossman, Dahlia Malkhi, Benny Pinkas, Michael K. Reiter, Dragos-Adrian Seredinschi, Orr Tamir, Alin Tomescu

    Abstract: SBFT is a state of the art Byzantine fault tolerant permissioned blockchain system that addresses the challenges of scalability, decentralization and world-scale geo-replication. SBFT is optimized for decentralization and can easily handle more than 200 active replicas in a real world-scale deployment. We evaluate SBFT in a world-scale geo-replicated deployment with 209 replicas withstanding f=…

    Submitted 2 January, 2019; v1 submitted 4 April, 2018; originally announced April 2018.

  29. arXiv:1705.08754  [pdf, other]

    cs.DS

    Using Minimum Path Cover to Boost Dynamic Programming on DAGs: Co-Linear Chaining Extended

    Authors: Anna Kuosmanen, Topi Paavilainen, Travis Gagie, Rayan Chikhi, Alexandru I. Tomescu, Veli Mäkinen

    Abstract: Aligning sequencing reads on graph representations of genomes is an important ingredient of pan-genomics. Such approaches typically find a set of local anchors that indicate plausible matches between substrings of a read to subpaths of the graph. These anchor matches are then combined to form a (semi-local) alignment of the complete read on a subpath. Co-linear chaining is an algorithmically rigor…

    Submitted 29 January, 2018; v1 submitted 24 May, 2017; originally announced May 2017.

    ACM Class: G.2.2; F.2.2; J.3

  30. arXiv:1701.05492  [pdf, other]

    cs.DM cs.CC cs.DS math.CO q-bio.PE

    Perfect phylogenies via branchings in acyclic digraphs and a generalization of Dilworth's theorem

    Authors: Ademir Hujdurović, Edin Husić, Martin Milanič, Romeo Rizzi, Alexandru I. Tomescu

    Abstract: Motivated by applications in cancer genomics and following the work of Hajirasouliha and Raphael (WABI 2014), Hujdurović et al. (IEEE TCBB, to appear) introduced the minimum conflict-free row split (MCRS) problem: split each row of a given binary matrix into a bitwise OR of a set of rows so that the resulting matrix corresponds to a perfect phylogeny and has the minimum possible number of rows amo…

    Submitted 27 January, 2018; v1 submitted 19 January, 2017; originally announced January 2017.

    Comments: 29 pages, 10 figures, extended abstract appeared in Proceedings of WG 2017, full paper accepted for publication in ACM Transactions on Algorithms

  31. Hardness of Covering Alignment: Phase Transition in Post-Sequence Genomics

    Authors: Romeo Rizzi, Massimo Cairo, Veli Mäkinen, Alexandru I. Tomescu, Daniel Valenzuela

    Abstract: Covering alignment problems arise from recent developments in genomics; so called pan-genome graphs are replacing reference genomes, and advances in haplotyping enable full content of diploid genomes to be used as basis of sequence analysis. In this paper, we show that the computational complexity will change for natural extensions of alignments to pan-genome representations and to diploid genomes…

    Submitted 22 May, 2018; v1 submitted 15 November, 2016; originally announced November 2016.

    Journal ref: IEEE/ACM Trans. on Computational Biology and Bioinformatics, 30 April 2018

  32. arXiv:1601.02932  [pdf, other]

    q-bio.QM cs.DM cs.DS q-bio.GN

    Safe and complete contig assembly via omnitigs

    Authors: Alexandru I. Tomescu, Paul Medvedev

    Abstract: Contig assembly is the first stage that most assemblers solve when reconstructing a genome from a set of reads. Its output consists of contigs -- a set of strings that are promised to appear in any genome that could have generated the reads. From the introduction of contigs 20 years ago, assemblers have tried to obtain longer and longer contigs, but the following question was never solved: given a…

    Submitted 16 August, 2016; v1 submitted 12 January, 2016; originally announced January 2016.

    Comments: Full version of the paper in the proceedings of RECOMB 2016

  33. arXiv:1508.07820  [pdf, other]

    cs.DS

    Interval scheduling maximizing minimum coverage

    Authors: Veli Mäkinen, Valeria Staneva, Alexandru Tomescu, Daniel Valenzuela

    Abstract: In the classical interval scheduling type of problems, a set of $n$ jobs, characterized by their start and end time, need to be executed by a set of machines, under various constraints. In this paper we study a new variant in which the jobs need to be assigned to at most $k$ identical machines, such that the minimum number of machines that are busy at the same time is maximized. This is relevant i…

    Submitted 30 October, 2015; v1 submitted 31 August, 2015; originally announced August 2015.

  34. arXiv:1506.07675  [pdf, other]

    q-bio.PE cs.CC cs.DM cs.DS

    Complexity and algorithms for finding a perfect phylogeny from mixed tumor samples

    Authors: Ademir Hujdurović, Urša Kačar, Martin Milanič, Bernard Ries, Alexandru I. Tomescu

    Abstract: Recently, Hajirasouliha and Raphael (WABI 2014) proposed a model for deconvoluting mixed tumor samples measured from a collection of high-throughput sequencing reads. This is related to understanding tumor evolution and critical cancer mutations. In short, their formulation asks to split each row of a binary matrix so that the resulting matrix corresponds to a perfect phylogeny and has the minimum…

    Submitted 7 July, 2016; v1 submitted 25 June, 2015; originally announced June 2015.

    Comments: This is the extended version of Hujdurović et al, Finding a perfect phylogeny from mixed tumor samples, WABI 2015, DOI: 10.1007/978-3-662-48221-6_6

  35. Enumeration of the adjunctive hierarchy of hereditarily finite sets

    Authors: Giorgio Audrito, Alexandru I. Tomescu, Stephan Wagner

    Abstract: Hereditarily finite sets (sets which are finite and have only hereditarily finite sets as members) are basic mathematical and computational objects, and also stand at the basis of some programming languages. This raises the need for efficient representation of such sets, for example by numbers. In 2008, Kirby proposed an adjunctive hierarchy of hereditarily finite sets, based on the fact that they…

    Submitted 9 April, 2014; v1 submitted 10 September, 2013; originally announced September 2013.

    MSC Class: 03E20; 05A99; 03E05 ACM Class: F.4.1; G.2.1

  36. arXiv:1307.7811  [pdf, other]

    q-bio.QM cs.CE cs.DS

    A Novel Combinatorial Method for Estimating Transcript Expression with RNA-Seq: Bounding the Number of Paths

    Authors: Alexandru I. Tomescu, Anna Kuosmanen, Romeo Rizzi, Veli Mäkinen

    Abstract: RNA-Seq technology offers new high-throughput ways for transcript identification and quantification based on short reads, and has recently attracted great interest. The problem is usually modeled by a weighted splicing graph whose nodes stand for exons and whose edges stand for split alignments to the exons. The task consists of finding a number of paths, together with their expression levels, whi…

    Submitted 30 July, 2013; originally announced July 2013.

    Comments: Peer-reviewed and presented as part of the 13th Workshop on Algorithms in Bioinformatics (WABI2013)

  37. arXiv:1307.2347  [pdf, ps, other]

    cs.DS cs.DM math.CO

    Combinatorial decomposition approaches for efficient counting and random generation FPTASes

    Authors: Romeo Rizzi, Alexandru I. Tomescu

    Abstract: Given a combinatorial decomposition for a counting problem, we resort to the simple scheme of approximating large numbers by floating-point representations in order to obtain efficient Fully Polynomial Time Approximation Schemes (FPTASes) for it. The number of bits employed for the exponent and the mantissa will depend on the error parameter $0 < \varepsilon \leq 1$ and on the characteristics of t…

    Submitted 15 November, 2013; v1 submitted 9 July, 2013; originally announced July 2013.

  38. Motif matching using gapped patterns

    Authors: Emanuele Giaquinta, Kimmo Fredriksson, Szymon Grabowski, Alexandru I. Tomescu, Esko Ukkonen

    Abstract: We present new algorithms for the problem of multiple string matching of gapped patterns, where a gapped pattern is a sequence of strings such that there is a gap of fixed length between each two consecutive strings. The problem has applications in the discovery of transcription factor binding sites in DNA sequences when using generalized versions of the Position Weight Matrix model to describe tr…

    Submitted 7 July, 2014; v1 submitted 11 June, 2013; originally announced June 2013.

  39. arXiv:1304.5560  [pdf, ps, other]

    cs.DS

    Indexes for Jumbled Pattern Matching in Strings, Trees and Graphs

    Authors: Ferdinando Cicalese, Travis Gagie, Emanuele Giaquinta, Eduardo Sany Laber, Zsuzsanna Lipták, Romeo Rizzi, Alexandru I. Tomescu

    Abstract: We consider how to index strings, trees and graphs for jumbled pattern matching when we are asked to return a match if one exists. For example, we show how, given a tree containing two colours, we can build a quadratic-space index with which we can find a match in time proportional to the size of the match. We also show how we need only linear space if we are content with approximate matches.

    Submitted 19 April, 2013; originally announced April 2013.

  40. arXiv:1208.1640  [pdf, other]

    cs.GT cs.CC

    Graph Operations on Parity Games and Polynomial-Time Algorithms

    Authors: Christoph Dittmann, Stephan Kreutzer, Alexandru I. Tomescu

    Abstract: Parity games are games that are played on directed graphs whose vertices are labeled by natural numbers, called priorities. The players push a token along the edges of the digraph. The winner is determined by the parity of the greatest priority occurring infinitely often in this infinite play. A motivation for studying parity games comes from the area of formal verification of systems by model c…

    Submitted 8 August, 2012; originally announced August 2012.

  41. arXiv:1207.7184  [pdf, ps, other]

    cs.DM cs.CC

    Set graphs. II. Complexity of set graph recognition and similar problems

    Authors: Martin Milanič, Romeo Rizzi, Alexandru I. Tomescu

    Abstract: A graph $G$ is said to be a `set graph' if it admits an acyclic orientation that is also `extensional', in the sense that the out-neighborhoods of its vertices are pairwise distinct. Equivalently, a set graph is the underlying graph of the digraph representation of a hereditarily finite set. In this paper, we continue the study of set graphs and related topics, focusing on computational complexity…

    Submitted 31 July, 2012; originally announced July 2012.

  42. arXiv:1006.0902  [pdf, ps, other]

    cs.DM

    On cycles through two arcs in strong multipartite tournaments

    Authors: Alexandru I. Tomescu

    Abstract: A multipartite tournament is an orientation of a complete $c$-partite graph. In [L. Volkmann, A remark on cycles through an arc in strongly connected multipartite tournaments, Appl. Math. Lett. 20 (2007) 1148--1150], Volkmann proved that a strongly connected $c$-partite tournament with $c \ge 3$ contains an arc that belongs to a directed cycle of length $m$ for every $m \in \{3, 4, \ldots, c\}$. H…

    Submitted 4 June, 2010; originally announced June 2010.