Online Construction of Subsequence Automata for Multiple Texts.
International Symposium on String Processing and Information Retrieval (SPIRE2000),
, pp.146-152
(2000), [peer-reviewed]
Event Date:
September 27-29, 2000
@ A Coruña, Spain
Abstract / 概要
We consider a deterministic finite automaton which accepts all subsequences of a set of texts, called subsequence automaton. We show an online algorithm for constructing a subsequence automaton for a set of texts. It runs in O(|/spl Sigma/|(m+k)+N) time using O(|/spl Sigma/|m) space, where |/spl Sigma/| is the size of alphabet, m is the size of the resulting subsequence automaton, k is the number of texts, and N is the total length of texts. It can be used to preprocess a given set S of texts in such a way that for any query /spl omega/ /spl isin/ /spl Sigma/*, returns in O(|/spl omega/|) time the number of texts in S which contain /spl omega/ as a subsequence. We also show an upper bound of the size of automaton compared to the minimum automaton.