conference

Online Construction of Subsequence Automata for Multiple Texts.

Hiromasa Hoshino, Ayumi Shinohara, Masayuki Takeda, Setsuo Arikawa

International Symposium on String Processing and Information Retrieval (SPIRE2000), , pp.146-152 (2000), [peer-reviewed]
Event Date: September 27-29, 2000 @ A Coruña, Spain

Abstract / 概要

We consider a deterministic finite automaton which accepts all subsequences of a set of texts, called subsequence automaton. We show an online algorithm for constructing a subsequence automaton for a set of texts. It runs in O(|/spl Sigma/|(m+k)+N) time using O(|/spl Sigma/|m) space, where |/spl Sigma/| is the size of alphabet, m is the size of the resulting subsequence automaton, k is the number of texts, and N is the total length of texts. It can be used to preprocess a given set S of texts in such a way that for any query /spl omega/ /spl isin/ /spl Sigma/*, returns in O(|/spl omega/|) time the number of texts in S which contain /spl omega/ as a subsequence. We also show an upper bound of the size of automaton compared to the minimum automaton.