Fast Full Permuted Pattern Matching Algorithms on Multi-track Strings
Proceedings of the Prague Stringology Conference 2016 (PSC2016),
, pp.7-21
(2016), [peer-reviewed]
Event Date:
August 29-31, 2016
Abstract / 概要
A multi-track string is a tuple of strings of the same length. The full permuted pattern matching problem is, given two multi-track strings $\mathcal{T} = (t_1, t_2,\ldots, t_N)$ and $\mathcal{P} = (p_1, p_2,\ldots, p_N)$ such that $|p_1|= \cdots =|p_N| \leq |t_1| = \ldots =|t_N|$, to find all positions $i$ such that $\mathcal{P} = (t_{r_1}[i:i+m-1], \ldots, t_{r_N}[i:i+m-1])$ for some permutation $(r_1, \ldots, r_N)$ of $(1, \ldots, N)$, where $m=|p_1|$ and $t[i:j]$ denotes the substring of t from position $i$ to $j$. We propose new algorithms that perform full permuted pattern matching practically fast. The first and second algorithms are based on the Boyer-Moore algorithm and the Horspool algorithm, respectively. The third algorithm is based on the Aho-Corasick algorithm where we use a multi-track character instead of a single character in the so-called goto function. The fourth algorithm is an improvement of the multi-track Knuth-Morris-Pratt algorithm that uses an automaton instead of the failure function of the original algorithm. Our experiment results demonstrate that those algorithms perform permuted pattern matching faster than existing algorithms.