
Showing 1–6 of 6 results for author: Fujishige, Y

  1. arXiv:2402.18090  [pdf, other]

    cs.DS cs.FL

    Computing Minimal Absent Words and Extended Bispecial Factors with CDAWG Space

    Authors: Shunsuke Inenaga, Takuya Mieno, Hiroki Arimura, Mitsuru Funakoshi, Yuta Fujishige

    Abstract: A string $w$ is said to be a minimal absent word (MAW) for a string $S$ if $w$ does not occur in $S$ and any proper substring of $w$ occurs in $S$. We focus on non-trivial MAWs which are of length at least 2. Finding such non-trivial MAWs for a given string is motivated for applications in bioinformatics and data compression. Fujishige et al. [TCS 2023] proposed a data structure of size $Θ(n)$ tha…

    Submitted 19 May, 2024; v1 submitted 28 February, 2024; originally announced February 2024.

    Comments: Accepted for IWOCA 2024

  2. arXiv:2310.06446  [pdf, other]

    cs.SE cs.LG

    Rule Mining for Correcting Classification Models

    Authors: Hirofumi Suzuki, Hiroaki Iwashita, Takuya Takagi, Yuta Fujishige, Satoshi Hara

    Abstract: Machine learning models need to be continually updated or corrected to ensure that the prediction accuracy remains consistently high. In this study, we consider scenarios where developers should be careful to change the prediction results by the model correction, such as when the model is part of a complex system or software. In such scenarios, the developers want to control the specification of t…

    Submitted 14 October, 2023; v1 submitted 10 October, 2023; originally announced October 2023.

  3. arXiv:2307.01428  [pdf, other]

    cs.DS cs.FL

    Linear-time Computation of DAWGs, Symmetric Indexing Structures, and MAWs for Integer Alphabets

    Authors: Yuta Fujishige, Yuki Tsujimaru, Shunsuke Inenaga, Hideo Bannai, Masayuki Takeda

    Abstract: The directed acyclic word graph (DAWG) of a string $y$ of length $n$ is the smallest (partial) DFA which recognizes all suffixes of $y$ with only $O(n)$ nodes and edges. In this paper, we show how to construct the DAWG for the input string $y$ from the suffix tree for $y$, in $O(n)$ time for integer alphabets of polynomial size in $n$. In so doing, we first describe a folklore algorithm which, giv…

    Submitted 3 July, 2023; originally announced July 2023.

    Comments: This is an extended version of the paper "Computing DAWGs and Minimal Absent Words in Linear Time for Integer Alphabets" from MFCS 2016

  4. arXiv:1909.02804  [pdf, ps, other]

    cs.DS

    Minimal Unique Substrings and Minimal Absent Words in a Sliding Window

    Authors: Takuya Mieno, Yuki Kuhara, Tooru Akagi, Yuta Fujishige, Yuto Nakashima, Shunsuke Inenaga, Hideo Bannai, Masayuki Takeda

    Abstract: A substring $u$ of a string $T$ is called a minimal unique substring (MUS) of $T$ if $u$ occurs exactly once in $T$ and any proper substring of $u$ occurs at least twice in $T$. A string $w$ is called a minimal absent word (MAW) of $T$ if $w$ does not occur in $T$ and any proper substring of $w$ occurs in $T$. In this paper, we study the problems of computing MUSs and MAWs in a sliding window over…

    Submitted 13 September, 2019; v1 submitted 6 September, 2019; originally announced September 2019.

  5. arXiv:1705.09779  [pdf, ps, other]

    cs.DS

    Linear-size CDAWG: new repetition-aware indexing and grammar compression

    Authors: Takuya Takagi, Keisuke Goto, Yuta Fujishige, Shunsuke Inenaga, Hiroki Arimura

    Abstract: In this paper, we propose a novel approach to combine compact directed acyclic word graphs (CDAWGs) and grammar-based compression. This leads us to an efficient self-index, called Linear-size CDAWGs (L-CDAWGs), which can be represented with $O(\tilde e_T \log n)$ bits of space allowing for $O(\log n)$-time random and $O(1)$-time sequential accesses to edge labels, and $O(m \log σ + occ)$-tim…

    Submitted 27 July, 2017; v1 submitted 27 May, 2017; originally announced May 2017.

    Comments: 12 pages, 2 figures

  6. arXiv:1703.04954  [pdf, other]

    cs.DS

    Faster STR-IC-LCS computation via RLE

    Authors: Keita Kuboi, Yuta Fujishige, Shunsuke Inenaga, Hideo Bannai, Masayuki Takeda

    Abstract: The constrained LCS problem asks one to find a longest common subsequence of two input strings $A$ and $B$ with some constraints. The STR-IC-LCS problem is a variant of the constrained LCS problem, where the solution must include a given constraint string $C$ as a substring. Given two strings $A$ and $B$ of respective lengths $M$ and $N$, and a constraint string $C$ of length at most…

    Submitted 15 March, 2017; originally announced March 2017.
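
Note on the definitions used in entries 1, 3 and 4: the MAW and MUS definitions quoted in those abstracts can be checked directly on small strings. The following is a minimal brute-force sketch that follows the definitions literally; it is not the linear-time or CDAWG-based method of any listed paper. The function names (`substrings`, `nontrivial_maws`, `minimal_unique_substrings`) and the example string are assumptions made for illustration.

```python
def substrings(s):
    """Return the set of all substrings of s, including the empty string."""
    return {s[i:j] for i in range(len(s) + 1) for j in range(i, len(s) + 1)}


def nontrivial_maws(s):
    """Brute-force minimal absent words of length >= 2 (illustrative only).

    A word a+u+b is a MAW of s exactly when a+u and u+b occur in s
    but a+u+b does not; checking these two maximal proper substrings
    is enough, since every other proper substring lies inside one of them.
    """
    facs = substrings(s)
    alphabet = sorted(set(s))  # letters absent from s cannot start or end a non-trivial MAW
    maws = set()
    for u in facs:
        for a in alphabet:
            if a + u not in facs:
                continue
            for b in alphabet:
                if u + b in facs and a + u + b not in facs:
                    maws.add(a + u + b)
    return sorted(maws)


def count_occurrences(t, u):
    """Count (possibly overlapping) occurrences of a non-empty u in t."""
    return sum(1 for i in range(len(t) - len(u) + 1) if t.startswith(u, i))


def minimal_unique_substrings(t):
    """Brute-force minimal unique substrings of t (illustrative only).

    u is a MUS when it occurs exactly once and both u[1:] and u[:-1]
    occur at least twice (single letters have only the empty string
    as a proper substring, so uniqueness alone suffices for them).
    """
    result = set()
    for u in substrings(t):
        if not u or count_occurrences(t, u) != 1:
            continue
        if len(u) == 1 or (
            count_occurrences(t, u[1:]) >= 2 and count_occurrences(t, u[:-1]) >= 2
        ):
            result.add(u)
    return sorted(result)


if __name__ == "__main__":
    example = "abaab"  # hypothetical input, not taken from the papers
    print(nontrivial_maws(example))            # ['aaa', 'aaba', 'bab', 'bb']
    print(minimal_unique_substrings(example))  # ['aa', 'ba']
```

This enumeration builds every substring explicitly, so it is only practical for very short inputs; the listed papers concern data structures that avoid exactly this cost.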