String Processing on

2026-03-WALCOM

Thu, 01 Jan 2026 00:00:00 +0000

(Induced) Subgraph Isomorphism and Maximum Common (Induced) Subgraph are fundamental problems in graph pattern matching and similarity computation. In graphs derived from time-series data or protein structures, a natural total ordering of vertices often arises from…

SOFSEM2026

Thu, 01 Jan 2026 00:00:00 +0000

Given a set of results from range minimum queries (RMQs), our task is to construct a sequence that is consistent with the results of the queries. We study two types of RMQs: a value-based RMQ returns the minimum value and an index-based RMQ returns the index of the…

CIAC2025

Thu, 12 Jun 2025 00:00:00 +0000

The longest common subsequence (LCS) is a fundamental problem in string processing which has numerous algorithmic studies, extensions, and applications. A sequence $$u_1, \ldots , u_f$$…

CPM2025

Sun, 01 Jun 2025 00:00:00 +0000

Run-length straight-line programs (RLSLPs) are a technique for grammar-based compression, allowing any string to be represented with optimal space for $\delta$, the substring complexity of the string. We address the compressed pattern matching problem for RLSLPs: Given a compressed text in RLSLP format and an uncompressed pattern, determine if the pattern appears in the text. This paper proposes an algorithm that solves this problem in linear time with respect to the size of the grammar and the length of the pattern.

Acta-Informatica-2024

Thu, 01 Aug 2024 00:00:00 +0000

Given a text and a pattern over an alphabet, the classic exact matching problem searches for all occurrences of the pattern in the text. Unlike exact match

2024-01-ICALP

Mon, 01 Jul 2024 00:00:00 +0000

A parameterized string (p-string) is a string over an alphabet (Σ_s ∪ Σ_p), where Σ_s and Σ_p are disjoint alphabets for static symbols (s-symbols) and for parameter symbols (p-symbols), respectively. Two p-strings x and y are said to parameterized match (p-match) if and only if x can be transformed into y by applying a bijection on Σ_p to every occurrence of p-symbols in x. The indexing problem for p-matching is to preprocess a p-string T of length n so that we can efficiently find the occurrences of substrings of T that p-match with a given pattern. Let σ_s and σ_p be the numbers of distinct s-symbols and p-symbols that appear in T, respectively, and let σ = σ_s + σ_p. Extending the Burrows-Wheeler Transform (BWT) based index for exact string pattern matching, Ganguly et al. [SODA 2017] proposed parameterized BWTs (pBWTs) to design the first compact index for p-matching, and posed an open problem on how to construct the pBWT-based index in compact space, i.e., in O(n lg |Σ_s ∪ Σ_p|) bits of space. Hashimoto et al. [SPIRE 2022] showed how to construct the pBWT for T, under the assumption that Σ_s ∪ Σ_p = [0..O(σ)], in O(n lg σ) bits of space and O(n (σ_p lg n)/(lg lg n)) time in an online manner while reading the symbols of T from right to left. In this paper, we refine Hashimoto et al.’s algorithm to work in O(n lg σ) bits of space and O(n (lg σ_p lg n)/(lg lg n)) time under the more general assumption that Σ_s ∪ Σ_p = [0..n^{O(1)}]. Our result has an immediate application to constructing parameterized suffix arrays in O(n (lg σ_p lg n)/(lg lg n)) time and O(n lg σ) bits of working space. We also show that our data structure can support backward search, a core procedure of BWT-based indexes, at any stage of the online construction, making it the first compact index for p-matching that can be constructed in compact space and even in an online manner.

TCS2024-Diptarama

Mon, 01 Jul 2024 00:00:00 +0000

CPM2024

Mon, 24 Jun 2024 00:00:00 +0000

SOFSEM2024

Fri, 23 Feb 2024 00:00:00 +0000

SPIRE2023

Fri, 01 Sep 2023 00:00:00 +0000

The parameterized matching problem is a variant of string matching, which is to search for all parameterized occurrences of a pattern P in a text T. In considering matching algorithms, the combinatorial natures of strings, especially periodicity, play an important…

ICGI2023

Wed, 12 Jul 2023 00:00:00 +0000

This paper is concerned with the identification in the limit from positive data of substitutable context-free languages (CFLs) over infinite alphabets. [ClarkE07] showed that substitutable CFLs over finite alphabets are learnable in this learning paradigm. We show that substitutable CFLs generated by grammars whose production rules may have predicates that represent sets of potentially infinitely many terminal symbols in a compact manner are learnable if the terminal symbol sets represented by those predicates are learnable, under a certain condition. This can be seen as a result parallel to [ArgyrosDA2018]’s work (2018) that amplifies the query learnability of predicate classes to that of symbolic automata classes. Our result is the first that shows such amplification is possible for identifying some CFLs in the limit from positive data.

WALCOM2023-Kumagai

Wed, 01 Mar 2023 00:00:00 +0000

Position heaps are index structures of text strings used for the string matching problem. They are rooted trees whose edges and nodes are labeled and numbered, respectively. This paper is concerned with variants of the inverse problem of position heap construction…

2023-02-la-sympo-winter

Wed, 01 Feb 2023 00:00:00 +0000

SPIRE2022

Wed, 09 Nov 2022 00:00:00 +0000

Theoretical Computer Science 2022

Fri, 14 Oct 2022 00:00:00 +0000

Two strings $x$ and $y$ over $\Sigma \cup \Pi$ of equal length are said to parameterized match (p-match) if there is a renaming bijection $f:\Sigma \cup \Pi \rightarrow \Sigma \cup \Pi$ that is identity on $\Sigma$ and transforms $x$ to $y$ (or vice versa). The p-matching problem is to look for substrings in a text that p-match a given pattern. In this paper, we propose parameterized suffix automata (p-suffix automata) and parameterized directed acyclic word graphs (PDAWGs) which are the p-matching versions of suffix automata and DAWGs. While suffix automata and DAWGs are equivalent for standard strings, we show that p-suffix automata can have $\Theta(n^2)$ nodes and edges but PDAWGs have only $O(n)$ nodes and edges, where $n$ is the length of an input string. We also give an $O(n |\Pi| \log (|\Pi| + |\Sigma|))$-time $O(n)$-space algorithm that builds the PDAWG in a left-to-right online manner. As a byproduct, it is shown that the parameterized suffix tree for the reversed string can also be built in the same time and space, in a right-to-left online manner. This duality also leads us to two further efficient algorithms for p-matching: given the parameterized suffix tree for the reversal of the input string $T$, one can build the PDAWG of $T$ in $O(n)$ time in an offline manner; and one can perform bidirectional p-matching in $O(m \log (|\Pi|+|\Sigma|) + occ)$ time using $O(n)$ space, where $m$ denotes the pattern length and $occ$ is the number of pattern occurrences in the text $T$.
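
The p-match definition above can be checked directly with the classic prev-encoding idea: each parameter symbol is replaced by the distance to its previous occurrence (0 for a first occurrence), and static symbols are kept as they are. The sketch below is only a naive illustration of the definition, not the PDAWG construction; the function names and the `params` argument (standing in for $\Pi$) are assumptions made for this example.

```python
def prev_encode(s, params):
    """Replace each parameter symbol by the distance to its previous
    occurrence (0 if it has not occurred yet); keep static symbols."""
    last_seen = {}
    encoded = []
    for i, c in enumerate(s):
        if c in params:
            encoded.append(i - last_seen[c] if c in last_seen else 0)
            last_seen[c] = i
        else:
            encoded.append(c)
    return encoded


def p_match(x, y, params):
    """True if x and y have equal length and identical prev-encodings."""
    return len(x) == len(y) and prev_encode(x, params) == prev_encode(y, params)


if __name__ == "__main__":
    params = {"x", "y"}  # hypothetical parameter alphabet; 'a' and 'b' are static
    print(p_match("axbyx", "aybxy", params))
    print(p_match("axbxx", "axbyx", params))
```

Two strings of equal length have the same encoding exactly when some renaming of parameter symbols maps one to the other, which is why comparing encodings suffices here.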

CPM2022

Sun, 26 Jun 2022 00:00:00 +0000

2022-02-la-sympo-winter-dava

Wed, 02 Feb 2022 00:00:00 +0000

2022-02-la-sympo-winter-ichikawa

Wed, 02 Feb 2022 00:00:00 +0000

ICGI2023-Kaito

Mon, 23 Aug 2021 00:00:00 +0000

We propose a query learning algorithm for an extension of weighted finite automata (WFAs), named symbolic weighted finite automata (SWFAs), which can handle strings over infinite alphabets more efficiently. Based on the idea of symbolic finite automata, SWFAs generalize WFAs by allowing transitions to be functions from a possibly infinite alphabet to weights. Our algorithm can learn SWFAs if functions in transitions are also learnable by queries. We also investigate minimization and equivalence checking for SWFAs.

ICGI2023-kanazawa

Mon, 23 Aug 2021 00:00:00 +0000

We consider a generalization of the “dual” approach to distributional learning of context-free grammars, where each nonterminal $A$ is associated with a string set $X_A$ characterized by a finite set $C$ of contexts. Rather than letting $X_A$ be the set of all strings accepted by all contexts in $C$ as in previous works, we allow more flexible uses of the contexts in $C$, using some of them positively (contexts that accept the strings in $X_A$) and others negatively (contexts that do not accept any strings in $X_A$). The resulting more general algorithm works in essentially the same way as before, but on a larger class of context-free languages.

2020-12-comp

Fri, 04 Dec 2020 00:00:00 +0000

SPIRE2020

Tue, 13 Oct 2020 00:00:00 +0000

Covers are a kind of quasiperiodicity in strings. A string $C$ is a cover of another string $T$ if any position of $T$ is inside some occurrence of $C$ in $T$. The shortest and longest cover arrays of $T$ have the lengths of the shortest and longest covers of each prefix of $T$, respectively. The literature has proposed linear-time algorithms computing longest and shortest cover arrays taking border arrays as input. An equivalence relation $\approx$ over strings is called a substring consistent equivalence relation (SCER) iff $X \approx Y$ implies (1) $|X|=|Y|$ and (2) $X[i:j] \approx Y[i:j]$ for all $1 \leq i \leq j \leq |X|$. In this paper, we generalize the notion of covers for SCERs and prove that existing algorithms to compute the shortest cover array and the longest cover array of a string $T$ under the identity relation work for any SCER when given the accordingly generalized border arrays.
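
To make the definitions of a cover and of the shortest cover array concrete, here is a brute-force sketch under the identity relation. It is deliberately quadratic or worse and is not the linear-time border-array algorithm referred to in the abstract; the function names are assumptions for this illustration.

```python
def is_cover(c, t):
    """True if every position of t lies inside some occurrence of c in t."""
    m = len(c)
    if m == 0 or m > len(t):
        return False
    covered_until = 0  # positions [0, covered_until) are covered so far
    for i in range(len(t) - m + 1):
        if t.startswith(c, i):
            if i > covered_until:
                return False  # position covered_until is left uncovered
            covered_until = i + m
    return covered_until == len(t)


def shortest_cover_array(t):
    """For each prefix of t, the length of its shortest cover (brute force).
    A cover is always a prefix of the string it covers, so only prefixes
    are tried as candidates."""
    result = []
    for k in range(1, len(t) + 1):
        prefix = t[:k]
        result.append(next(l for l in range(1, k + 1) if is_cover(prefix[:l], prefix)))
    return result


if __name__ == "__main__":
    print(shortest_cover_array("ababa"))
```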

CPM2020-Funakoshi

Wed, 17 Jun 2020 00:00:00 +0000

The equidistant subsequence pattern matching problem is considered. Given a pattern string $P$ and a text string $T$, we say that $P$ is an equidistant subsequence of $T$ if $P$ is a subsequence of the text such that consecutive symbols of $P$ in the occurrence are equally spaced. We can consider the problem of equidistant subsequences as generalizations of (sub-)cadences. We give bit-parallel algorithms that yield $o(n^2)$ time algorithms for finding $k$-(sub-)cadences and equidistant subsequences. Furthermore, $O(n \log^2{n})$ and $O(n \log{n})$ time algorithms, respectively for equidistant and Abelian equidistant matching for the case $|P| = 3$, are shown. The algorithms make use of a technique that was recently introduced which can efficiently compute convolutions with linear constraints.
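
As a small illustration of the definition of an equidistant subsequence, the following naive sketch enumerates every start position and gap. It is a direct check of the definition, not the bit-parallel or convolution-based algorithms of the paper; the function name and the 0-based `(start, gap)` output format are assumptions.

```python
def equidistant_occurrences(text, pattern):
    """Return all (start, gap) pairs such that
    text[start + k * gap] == pattern[k] for every k, with gap >= 1."""
    n, m = len(text), len(pattern)
    if m < 2:
        raise ValueError("this sketch assumes a pattern of length at least 2")
    hits = []
    # The largest usable gap satisfies (m - 1) * gap <= n - 1.
    for gap in range(1, (n - 1) // (m - 1) + 1):
        for start in range(n - (m - 1) * gap):
            if all(text[start + k * gap] == pattern[k] for k in range(m)):
                hits.append((start, gap))
    return hits


if __name__ == "__main__":
    print(equidistant_occurrences("axbxc", "abc"))
```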

CPM2020-Köppl

Wed, 17 Jun 2020 00:00:00 +0000

One of the most well-known variants of the Burrows-Wheeler transform (BWT) [Burrows and Wheeler, 1994] is the bijective BWT (BBWT) [Gil and Scott, arXiv 2012], which applies the extended BWT (EBWT) [Mantaci et al., TCS 2007] to the multiset of Lyndon factors of a given text. Since the EBWT is invertible, the BBWT is a bijective transform in the sense that the inverse image of the EBWT restores this multiset of Lyndon factors such that the original text can be obtained by sorting these factors in non-increasing order. In this paper, we present algorithms constructing or inverting the BBWT in-place using quadratic time. We also present conversions from the BBWT to the BWT, or vice versa, either (a) in-place using quadratic time, or (b) in the run-length compressed setting using $O(n \lg{r} / \lg{\lg{r}})$ time with $O(r \lg{n})$ bits of space, where $r$ is the sum of character runs in the BWT and the BBWT.
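
The Lyndon factors mentioned above can be computed with Duval's algorithm. The sketch below shows only that factorization step and the fact that concatenating the factors in non-increasing order restores the text; it does not implement the EBWT, the BBWT, or the in-place algorithms of the paper. The function name is an assumption.

```python
def lyndon_factors(s):
    """Duval's algorithm: factor s into a lexicographically
    non-increasing sequence of Lyndon words."""
    n = len(s)
    factors = []
    i = 0
    while i < n:
        j, k = i + 1, i
        while j < n and s[k] <= s[j]:
            # On a strictly larger symbol restart the comparison,
            # otherwise keep extending the current periodic prefix.
            k = i if s[k] < s[j] else k + 1
            j += 1
        # Emit copies of the Lyndon word of length j - k.
        while i <= k:
            factors.append(s[i:i + j - k])
            i += j - k
    return factors


if __name__ == "__main__":
    factors = lyndon_factors("banana")
    print(factors)
    print("".join(sorted(factors, reverse=True)))
```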

CPM2020-Nakashima

Wed, 17 Jun 2020 00:00:00 +0000

Two strings $x$ and $y$ over $\Sigma \cup \Pi$ of equal length are said to parameterized match (p-match) if there is a renaming bijection $f:\Sigma \cup \Pi \to \Sigma \cup \Pi$ that is identity on $\Sigma$ and transforms $x$ to $y$ (or vice versa). The p-matching problem is to look for substrings in a text that p-match a given pattern. In this paper, we propose parameterized suffix automata (p-suffix automata) and parameterized directed acyclic word graphs (PDAWGs) which are the p-matching versions of suffix automata and DAWGs. While suffix automata and DAWGs are equivalent for standard strings, we show that p-suffix automata can have $\Theta(n^2)$ nodes and edges but PDAWGs have only $O(n)$ nodes and edges, where $n$ is the length of an input string. We also give an $O(n |\Pi| \log{(|\Pi| + |\Sigma|)})$-time $O(n)$-space algorithm that builds the PDAWG in a left-to-right online manner. As a byproduct, it is shown that the parameterized suffix tree for the reversed string can also be built in the same time and space, in a right-to-left online manner.

SEA2020

Tue, 16 Jun 2020 00:00:00 +0000

Given a text $T$ of length $n$ and a pattern $P$ of length $m$, the string matching problem is a task to find all occurrences of $P$ in $T$. In this study, we propose an algorithm that solves this problem in $O((n + m)q)$ time considering the distance between two adjacent occurrences of the same $q$-gram contained in $P$. We also propose a theoretical improvement of it which runs in $O(n + m)$ time, though it is not necessarily faster in practice. We compare the execution times of our and existing algorithms on various kinds of real and artificial datasets such as an English text, a genome sequence and a Fibonacci string. The experimental results show that our algorithm is as fast as the state-of-the-art algorithms in many cases, particularly when a pattern frequently appears in a text.

TCS2020-Narisada-gap

Mon, 06 Apr 2020 00:00:00 +0000

In this paper, we introduce new types of approximate palindromes called single-arm-gapped palindromes (shortly SAGPs). A SAGP contains a gap in either its left or right arm, which is in the form of either $wgucu^Rw^R$ or $wucu^Rgw^R$, where $w$ and $u$ are non-empty strings, $w^R$ and $u^R$ are respectively the reversed strings of $w$ and $u$, $g$ is a string called a gap, and $c$ is either a single character or the empty string. Here we call $wu$ and $u^Rw^R$ the arm of the SAGP, and $|wu|$ the length of the arm. We classify SAGPs into two groups: those which have $ucu^R$ as a maximal palindrome (type-1), and the others (type-2). We propose several algorithms to compute type-1 SAGPs with longest arms occurring in a given string, based on suffix arrays. Then, we propose a linear-time algorithm to compute all type-1 SAGPs with longest arms, based on suffix trees. Also, we show how to compute type-2 SAGPs with longest arms in linear time. We also perform some preliminary experiments to show the practical performance of the proposed methods.

TCS2020-Narisada-walk

Mon, 06 Apr 2020 00:00:00 +0000

We consider the problem of inferring an edge-labeled graph from the sequence of edge labels seen in a walk on that graph. It has been known that this problem is solvable in $O( n \log{n})$ time when the targets are path or cycle graphs. This paper presents an online algorithm for the problem of this restricted case that runs in $O ( n )$ time, based on Manacher’s algorithm for computing all the maximal palindromes in a string.

DCC2020

Wed, 25 Mar 2020 00:00:00 +0000

We propose a new approach for universal lossless text compression, based on grammar compression. In the literature, a target string $T$ has been compressed as a context-free grammar $G$ in Chomsky normal form satisfying $L(G) = T$. Such a grammar is often called a straight-line program (SLP). In this paper, we consider a probabilistic grammar $G$ that generates $T$, but not necessarily as a unique element of $L(G)$. In order to recover the original text $T$ unambiguously, we keep both the grammar G and the derivation tree of $T$ from the start symbol in $G$, in compressed form. We show some simple evidence that our proposal is indeed more efficient than SLPs for certain texts, both from theoretical and practical points of view.

2020-02-la-winter-nakashima

Fri, 07 Feb 2020 00:00:00 +0000

2020-02-la-winter-natsumi

Fri, 07 Feb 2020 00:00:00 +0000

SOFSEM2020

Wed, 22 Jan 2020 00:00:00 +0000

Given a text and a pattern over an alphabet, the classic exact matching problem searches for all occurrences of pattern P in text T. Unlike the exact matching problem, order-preserving pattern matching (OPPM) considers the relative order of elements, rather than their exact values. In this paper, we propose the first parallel algorithm for the OPPM problem. Our algorithm is based on the “duel-and-sweep” algorithm. For a pattern of length $m$ and a text of length $n$, our algorithm runs in $O(\log^3{m})$ time and $O(n \log^3{m})$ work on the Priority CRCW PRAM.
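
To illustrate what "relative order rather than exact values" means, the following sequential sketch compares rank patterns of every text window with that of the pattern. It is a naive baseline for the problem statement only, not the parallel duel-and-sweep algorithm; the function names are assumptions.

```python
def rank_pattern(seq):
    """Map each element to the rank of its value among the distinct
    values of seq, so equal values receive equal ranks."""
    order = {value: rank for rank, value in enumerate(sorted(set(seq)))}
    return [order[value] for value in seq]


def op_match_positions(text, pattern):
    """Return all 0-based positions i where text[i:i+m] has the same
    relative order as pattern (naive window-by-window comparison)."""
    m = len(pattern)
    target = rank_pattern(pattern)
    return [
        i
        for i in range(len(text) - m + 1)
        if rank_pattern(text[i:i + m]) == target
    ]


if __name__ == "__main__":
    print(op_match_positions([10, 20, 15, 30, 25, 40], [1, 3, 2]))
```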

2020-01-Algorithmica

Wed, 01 Jan 2020 00:00:00 +0000

We consider the construction of the suffix tree and the directed acyclic word graph (DAWG) indexing data structures for a collection $$\mathcal {T}$$ of te

2020-01-SOFSEM-Doctoral-Student-Research-Forum-2

Wed, 01 Jan 2020 00:00:00 +0000

2019-12-comp

Fri, 13 Dec 2019 00:00:00 +0000

2019-08-tsjc

Thu, 22 Aug 2019 00:00:00 +0000

2019-01-PSC

Thu, 01 Aug 2019 00:00:00 +0000

2019-05-comp

Fri, 10 May 2019 00:00:00 +0000

2019-01-Algorithms

Tue, 01 Jan 2019 00:00:00 +0000

A multi-track string is a tuple of strings of the same length. Given the pattern and text of two multi-track strings, the permuted pattern matching problem is to find the occurrence positions of all permutations of the pattern in the text. In this paper, we propose several algorithms for permuted pattern matching. Our first algorithm, which is based on the Knuth–Morris–Pratt (KMP) algorithm, has a fast theoretical computing time with O(mk) as the preprocessing time and O(nk log σ) as the matching time, where n, m, k, σ, and occ denote the length of the text, the length of the pattern, the number of strings in the multi-track, the alphabet size, and the number of occurrences of the pattern, respectively. We then improve the KMP-based algorithm by using an automaton, which has a better experimental running time. The next proposed algorithms are based on the Boyer–Moore algorithm and the Horspool algorithm that try to perform pattern matching. These algorithms are the fastest experimental algorithms. Furthermore, we propose an extension of the AC-automaton algorithm that can solve dictionary matching on multi-tracks, which is a task to find multiple multi-track patterns in a multi-track text. Finally, we propose filtering algorithms that can perform permuted pattern matching quickly in practice.

2019-01-GandALF

Tue, 01 Jan 2019 00:00:00 +0000

We propose a query learning algorithm for residual symbolic finite automata (RSFAs). Symbolic finite automata (SFAs) are finite automata whose transitions are labeled by predicates over a Boolean algebra, in which a big collection of characters leading to the same transition may be represented by a single predicate. Residual finite automata (RFAs) are a special type of non-deterministic finite automata which can be exponentially smaller than the minimum deterministic finite automata and have a favorable property for learning algorithms. RSFAs have the properties of both SFAs and RFAs and can have more succinct representation of transitions and fewer states than RFAs and deterministic SFAs accepting the same language. The implementation of our algorithm efficiently learns RSFAs over a huge alphabet and outperforms an existing learning algorithm for deterministic SFAs. The result also shows that the benefit of non-determinism in efficiency is even larger in learning SFAs than non-symbolic automata.

2019-01-J-Comput-Syst-Sci

Tue, 01 Jan 2019 00:00:00 +0000

Approaches based on the idea generically called distributional learning have been making great success in the algorithmic learning of several rich subclasses of context-free languages and their extensions. Those language classes are defined by properties concerning string-context relation. In this paper, we present a distributional learning algorithm for conjunctive grammars with the k-finite context property (k-fcp) for each natural number k. We also compare our result with the closely related work by Clark et al. (JMLR 2010) [5] on learning some context-free grammars using contextual binary feature grammars (cbfgs). We prove that the context-free grammars targeted by their algorithm have the k-fcp. Moreover, we show that every exact cbfg has the k-fcp, too, while not all of them are learnable by their algorithm. Clark et al. conjectured a learning algorithm for exact cbfgs should exist. This paper answers their conjecture in a positive way.

2019-01-Theor-Comput-Sci

Tue, 01 Jan 2019 00:00:00 +0000

Dictionary matching is the task of finding all occurrences of pattern strings in a set D (called a dictionary) in a text string T. The Aho–Corasick automaton (AC-automaton), which is built on D, is a fundamental data structure which enables us to solve the dictionary matching problem in O(d log σ) preprocessing time and O(n log σ + occ) matching time, where d is the total length of the patterns in the dictionary D, n is the length of the text, σ is the alphabet size, and occ is the total number of occurrences of all the patterns in the text. Dynamic dictionary matching is a variant where patterns may dynamically be inserted into and deleted from the dictionary D. This problem is called semi-dynamic dictionary matching if only insertions are allowed. In this paper, we propose two efficient algorithms that can solve both problems with some modifications. For a pattern of length m, our first algorithm supports insertions in O(m log σ + log d / log log d) time and pattern matching in O(n log σ + occ) time for the semi-dynamic setting. This algorithm also supports both insertions and deletions in O(σm + log d / log log d) time and pattern matching in O(n (log d / log log d + log σ) + occ (log d / log log d)) time for the dynamic dictionary matching problem by some modifications. This algorithm is based on the directed acyclic word graph (DAWG) of Blumer et al. (JACM 1987). Our second algorithm, which is based on the AC-automaton, supports insertions in O(m log σ + u_f + u_o) time for the semi-dynamic setting and supports both insertions and deletions in O(σm + u_f + u_o) time for the dynamic setting, where u_f and u_o respectively denote the numbers of states in which the failure function and the output function need to be updated. This algorithm performs pattern matching in O(n log σ + occ) time for both settings. Our algorithm achieves optimal update time for AC-automaton based methods over constant-size alphabets, since any algorithm which explicitly maintains the AC-automaton requires Ω(m + u_f + u_o) update time.

SPIRE2018

Thu, 11 Oct 2018 00:00:00 +0000

We consider the problem of inferring an edge-labeled graph from the sequence of edge labels seen in a walk of that graph. It has been known that this problem is solvable in $O(n\log n)$ time when the targets are path or cycle graphs. This paper presents an online algorithm for the problem of this restricted case that runs in $O(n)$ time, based on Manacher’s algorithm for computing all the maximal palindromes in a string.

2018-01-CIAA

Wed, 01 Aug 2018 00:00:00 +0000

A cryptarithm is a mathematical puzzle where given an arithmetic equation written with letters rather than numerals, a player must discover an assignment of numerals on letters that makes the equation hold true. In this paper, we propose a method to construct a DFA…

Received the 13th Noguchi Research Encouragement Award of the IPSJ Tohoku Branch

Wed, 27 Jun 2018 00:00:00 +0000

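On June 20, 2018, Assistant Professor Diptarama Hendrian of our laboratory received the 13th Noguchi Research Encouragement Award from the Tohoku Branch of the Information Processing Society of Japan (IPSJ).

This award is given to young researchers who are members of the Tohoku Branch and have published outstanding academic papers, with the aim of further encouraging research and development in the field of information processing.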

2018-01-SOFSEM

Thu, 01 Feb 2018 00:00:00 +0000

Given a text and a pattern over two types of symbols called constants and variables, the parameterized pattern matching problem is to find all occurrences of substrings of the text that the pattern matches by substituting a variable in the text for each variable in…

2018-01-SOFSEM-2

Thu, 01 Feb 2018 00:00:00 +0000

Given a text and a pattern over an alphabet, the classic exact matching problem searches for all occurrences of the pattern in the text. Unlike exact matching, order-preserving pattern matching (OPPM) considers the relative order of elements, rather than their real…

2017-gpw-nozaki

Sun, 12 Nov 2017 00:00:00 +0000

2017-01-CPM

Sat, 01 Jul 2017 00:00:00 +0000

We propose a new indexing structure for parameterized strings, called the parameterized position heap. The parameterized position heap is applicable to the parameterized pattern matching problem, where the pattern matches a substring of the text if there exists a bijective mapping from the symbols of the pattern to the symbols of the substring. We propose an online construction algorithm for the parameterized position heap of a text and show that our algorithm runs in linear time with respect to the text size. We also show that by using the parameterized position heap, we can find all occurrences of a pattern in the text in linear time with respect to the product of the pattern size and the alphabet size.

2017-01-LATA

Wed, 01 Mar 2017 00:00:00 +0000

We identify the properties of context-free grammars that exactly correspond to the behavior of the dual and primal versions of Clark and Yoshinaka’s distributional learning algorithm and call them the very weak finite context/kernel property. We show that the…

2017-01-SOFSEM-2

Sun, 01 Jan 2017 00:00:00 +0000

We introduce new types of approximate palindromes called single-arm-gapped palindromes (SAGPs). A SAGP contains a gap in either its left or right arm, which is in the form of either $$wguc…

2016-01-SPIRE

Sat, 01 Oct 2016 00:00:00 +0000

Given a set of pattern strings called a dictionary and a text string, dictionary matching is the problem to find the occurrences of the patterns on the text. Dynamic dictionary matching is dictionary matching where patterns may dynamically be inserted into and…

2016-01-PSC

Mon, 01 Aug 2016 00:00:00 +0000

2016-01-DCC

Fri, 01 Apr 2016 00:00:00 +0000

2016-01-Discret-Appl-Math

Fri, 01 Jan 2016 00:00:00 +0000

2016-01-SOFSEM-Student-Research-Forum-Papers-Pos

Fri, 01 Jan 2016 00:00:00 +0000

2015-09-paper-025

Tue, 01 Sep 2015 00:00:00 +0000

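For compression by substring enumeration, a lossless data compression method devised by Dube and Beaudoin in 2010, Kanai and Yokoo proposed an efficient implementation that makes good use of the Burrows-Wheeler transform matrix. This paper generalizes that implementation and presents extensions in two directions: (1) improving the compression ratio for byte-oriented sources by explicitly introducing a distinction between phases, and (2) handling strings over multi-valued alphabets directly, not only binary strings.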

2015-01-LATA

Sun, 01 Mar 2015 00:00:00 +0000

Approaches based on the idea generically called distributional learning have been making great success in the algorithmic learning of context-free languages and their extensions. We in this paper show that conjunctive grammars are also learnable by a distributional…

LA2015-winter

Fri, 30 Jan 2015 00:00:00 +0000

Takashi Katsura (桂敬史, third-year doctoral student) receives the Best Student Poster Award at SOFSEM2015

Thu, 29 Jan 2015 00:00:00 +0000

Takashi Katsura (桂敬史), a third-year doctoral student, received the Best Student Poster Award at SOFSEM 2015 (41st International Conference on Current Trends in Theory and Practice of Computer Science), an international conference held in the Czech Republic on January 24-29, 2015.

SOFSEM2015

Mon, 26 Jan 2015 00:00:00 +0000

A multi-set of $N$ strings of length $n$ is called a multi-track string. The permuted pattern matching is the problem that, given two multi-track strings $T = \{t_1, \ldots, t_N\}$ of length $n$ and $P = \{p_1, \ldots, p_N\}$ of length $m$, outputs all positions $i$ such that $\{p_1, \ldots, p_N\} = \{t_1[i : i + m - 1], \ldots, t_N[i : i + m - 1]\}$. We propose two new indexing structures for multi-track strings. One is a time-efficient structure for $T$ that needs $O(nN)$ space and enables us to solve the problem in $O(m^2 N + occ)$ time, where $occ$ is the number of occurrences of the pattern $P$ in the text $T$. The other is memory-efficient: it requires only $O(n)$ space, whereas the matching takes $O(m^2 N^2 + occ)$ time. We show that both of them can be constructed in $O(nN)$ time.
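
The problem statement above can be read as a multiset comparison at each text position. The following naive sketch checks exactly that by sorting the window substrings; it is not one of the proposed indexing structures, and the function name and 0-based positions are assumptions.

```python
def permuted_match_positions(text_tracks, pattern_tracks):
    """Return all 0-based positions i at which the multiset of length-m
    windows of the text tracks equals the multiset of pattern tracks."""
    if len(text_tracks) != len(pattern_tracks):
        raise ValueError("text and pattern must have the same number of tracks")
    n = len(text_tracks[0])
    m = len(pattern_tracks[0])
    target = sorted(pattern_tracks)  # sorting gives a canonical multiset form
    return [
        i
        for i in range(n - m + 1)
        if sorted(track[i:i + m] for track in text_tracks) == target
    ]


if __name__ == "__main__":
    print(permuted_match_positions(["abcab", "bcabc"], ["ca", "bc"]))
```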

2015-01-Inf-Comput

Thu, 01 Jan 2015 00:00:00 +0000

2014-09-paper-029

Tue, 02 Sep 2014 00:00:00 +0000

2014-01-DCC

Sat, 01 Mar 2014 00:00:00 +0000

2014-01-Mach-Learn-2

Wed, 01 Jan 2014 00:00:00 +0000

This paper describes several collapsed Bayesian methods, which work by first marginalizing out transition probabilities, for inferring several kinds of pro

2014-01-SOFSEM

Wed, 01 Jan 2014 00:00:00 +0000

Given two sets of strings and a similarity function on strings, similarity joins attempt to find all similar pairs of strings from each respective set. In this paper, we focus on similarity joins with respect to the edit distance, and propose a new metric called the…

2013-12-paper-032

Sat, 21 Dec 2013 00:00:00 +0000

Runner-up in the 3rd Hideo Aiso Cup FPGA Design Contest

Fri, 20 Sep 2013 00:00:00 +0000

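Team Snowdrop, whose members include Ryosuke Okuda (奥田 遼介; M2, Shinohara Laboratory, Department of System Information Sciences), took second place in the 3rd Hideo Aiso Cup FPGA Design Contest.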

2013-01-PSC

Sun, 01 Sep 2013 00:00:00 +0000

2013-07-paper-037

Wed, 17 Jul 2013 00:00:00 +0000

2013-04-n-57-2-03696

Wed, 24 Apr 2013 00:00:00 +0000

2013-04-paper-038

Wed, 24 Apr 2013 00:00:00 +0000

2013-01-SOFSEM

Tue, 01 Jan 2013 00:00:00 +0000

We propose a new variant of pattern matching on a multi-set of strings, or multi-tracks, called permuted-matching, that looks for occurrences of a multi-track pattern of length m with M tracks, in a multi-track text of length n with N tracks over Σ. We show that…

2012-10-paper-043

Wed, 31 Oct 2012 00:00:00 +0000

2012-01-SPIRE

Mon, 01 Oct 2012 00:00:00 +0000

A run (also called maximal repetition) in a word is a non-extendable repetition. Finding the maximum number ρ(n) of runs in a string of length n is a challenging problem. Although it is known that ρ(n) ≤ 1.029n for any n and there exists…

2012-09-paper-044

Mon, 03 Sep 2012 00:00:00 +0000

2012-01-High-Order-Symb-Comput

Sun, 01 Jan 2012 00:00:00 +0000

We propose an application of programming language techniques to lossless data compression, where tree data are compressed as functional programs that gener

2012-03-LATA

Sun, 01 Jan 2012 00:00:00 +0000

Recently several “distributional learning algorithms” have been proposed and have made great success in learning different subclasses of context-free grammars. The distributional learning models and exploits the relation between strings and contexts that…

2011-11-paper-046

Sat, 05 Nov 2011 00:00:00 +0000

2011-01-ALT

Sat, 01 Oct 2011 00:00:00 +0000

This paper demonstrates how existing distributional learning techniques for context-free grammars can be adapted to simple context-free tree grammars in a straightforward manner once the necessary notions and properties for string languages have been redefined for…

2011-01-PSC

Mon, 01 Aug 2011 00:00:00 +0000

2011-07-paper-049

Tue, 19 Jul 2011 00:00:00 +0000

2011-01-Developments-in-Language-Theory

Fri, 01 Jul 2011 00:00:00 +0000

Recent studies on grammatical inference have demonstrated the benefits of “distributional learning” for learning context-free and context-sensitive languages. Distributional learning models and exploits the relation between strings and contexts in the…

2011-01-LACL

Fri, 01 Jul 2011 00:00:00 +0000

Minimalist grammars (MGs) constitute a mildly context-sensitive formalism when being equipped with a particular locality condition (LC), the shortest move condition. In this format MGs define the same class of derivable string languages as…

2010-09-paper-050

Mon, 20 Sep 2010 00:00:00 +0000

2010-09-paper-051

Mon, 20 Sep 2010 00:00:00 +0000

2010-01-PSC

Wed, 01 Sep 2010 00:00:00 +0000

2010-04-paper-052

Thu, 22 Apr 2010 00:00:00 +0000

2010-03-paper-056

Tue, 09 Mar 2010 00:00:00 +0000

2010-01-Chic-J-Theor-Comput-Sci

Fri, 01 Jan 2010 00:00:00 +0000

2009-01-PSC

Tue, 01 Sep 2009 00:00:00 +0000

2009-07-paper-059

Mon, 20 Jul 2009 00:00:00 +0000

2009-07-paper-060

Mon, 20 Jul 2009 00:00:00 +0000

2009-05-cell-be

Thu, 28 May 2009 00:00:00 +0000

2009-01-LATA

Wed, 01 Apr 2009 00:00:00 +0000

We present a new series of run-rich strings, and give a new lower bound 0.94457567 of the maximum number of runs in a string. We also introduce the general conjecture about the asymptotic behavior of the numbers of runs in the strings defined by any recurrence formula,…

2009-03-paper-063

Mon, 02 Mar 2009 00:00:00 +0000

2009-02-paper-065

Sun, 01 Feb 2009 00:00:00 +0000

2009-02-paper-068

Sun, 01 Feb 2009 00:00:00 +0000

2009-01-Algorithms

Thu, 01 Jan 2009 00:00:00 +0000

We consider grammar-based text compression with longest first substitution (LFS), where non-overlapping occurrences of a longest repeating factor of the input text are replaced by a new non-terminal symbol. We present the first linear-time algorithm for LFS. Our algorithm employs a new data structure called sparse lazy suffix trees. We also deal with a more sophisticated version of LFS, called LFS2, that allows better compression. The first linear-time algorithm for LFS2 is also presented.

2009-01-CATS

Thu, 01 Jan 2009 00:00:00 +0000

2009-01-IEICE-Trans-Inf-Syst

Thu, 01 Jan 2009 00:00:00 +0000

2009-01-Int-J-Found-Comput-Sci

Thu, 01 Jan 2009 00:00:00 +0000

A substring w[i..j] of w is called a repetition of period p if w[k] = w[k + p] for any i ≤ k ≤ j - p. In particular, a maximal repetition, which can be extended neither to the left nor to the right, is called a run. The ratio of the length of the run to its period, i.e. (j - i + 1)/p, is called an exponent. The sum of exponents of runs in a string is of interest. The maximal value of the sum is still unknown, and the current upper bound is 2.9n given by Crochemore and Ilie, where n is the length of a string. In this paper we show a closed formula which exactly expresses the average value of it for any n and any alphabet size, and the limit of this value per unit length as n approaches infinity. For binary strings, the limit value is approximately 1.13103. We also show the average number of squares in a string of length n and its limit value.
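As an illustration of the definitions in this abstract, the following is a minimal brute-force sketch that lists the runs of a short string and sums their exponents. It is not the closed formula from the paper. It assumes the usual convention that a run has length at least twice its smallest period; the function names `runs` and `sum_of_exponents` are hypothetical, and indices are 0-based and inclusive.

```python
from fractions import Fraction


def runs(w):
    """Return all runs of w as (i, j, p): start, inclusive end, smallest period."""
    n = len(w)
    found = {}  # (i, j) -> smallest period seen for that interval
    for p in range(1, n // 2 + 1):
        k = 0
        while k + p < n:
            if w[k] != w[k + p]:
                k += 1
                continue
            i = k  # left end of a maximal stretch with w[k] == w[k + p]
            while k + p < n and w[k] == w[k + p]:
                k += 1
            j = k + p - 1  # right end of the maximal repetition of period p
            if j - i + 1 >= 2 * p:
                # Periods are tried in increasing order, so the first one wins.
                found.setdefault((i, j), p)
    return sorted((i, j, p) for (i, j), p in found.items())


def sum_of_exponents(w):
    """Sum of (run length / period) over all runs of w, as an exact fraction."""
    return sum(Fraction(j - i + 1, p) for i, j, p in runs(w))
```

For example, `runs("aabaabaa")` yields three runs of period 1 and one run of period 3 that spans the whole string, so the sum of exponents is 2 + 2 + 2 + 8/3. The sketch takes quadratic time and is only meant for checking small cases by hand.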

2008-01-PSC

Mon, 01 Sep 2008 00:00:00 +0000

2008-01-PSC-2

Mon, 01 Sep 2008 00:00:00 +0000

2008-03-paper-072

Sat, 01 Mar 2008 00:00:00 +0000

2008-01-SOFSEM

Tue, 01 Jan 2008 00:00:00 +0000

This paper studies two problems on compressed strings described in terms of straight line programs (SLPs). One is to compute the length of the longest common substring of two given SLP-compressed strings, and the other is to compute all palindromes of a given…

2008-01-Structure-Based-Compression-of-Complex-M

Tue, 01 Jan 2008 00:00:00 +0000

In this paper we study the problem of deciding whether a given compressed string contains a square. A string x is called a square if x = zz and z = u^k implies k = 1 and u = z. A string w is said to be square-free if no substring of w is a square. Many efficient algorithms to test whether a given string is square-free have been developed so far. However, very little is known about testing square-freeness of a given compressed string. In this paper, we give an O(max(n^2, n log^2 N))-time O(n^2)-space solution to test square-freeness of a given compressed string, where n and N are the sizes of a given compressed string and the corresponding decompressed string, respectively. Our input strings are compressed by balanced straight-line programs (BSLPs). We remark that BSLP has exponential compression, that is, N = O(2^n). Hence no decompress-then-test approaches can be better than our method in the worst case.
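For contrast with the compressed-domain method described in this abstract, here is a minimal sketch of the "test" half of a decompress-then-test baseline: a quadratic-time check on an uncompressed string. It is not the paper's algorithm, and the name `has_square` is hypothetical. It looks for any substring of the form zz with z non-empty, which is enough for deciding square-freeness because a non-primitive z = u^k already contains the square uu.

```python
def has_square(w):
    """Return True if some non-empty z exists such that zz is a substring of w."""
    n = len(w)
    for length in range(1, n // 2 + 1):
        matched = 0  # consecutive positions k with w[k] == w[k + length]
        for k in range(n - length):
            matched = matched + 1 if w[k] == w[k + length] else 0
            if matched == length:
                # w[k-length+1 .. k] equals w[k+1 .. k+length]: a square.
                return True
    return False
```

Since the decompressed length N can be exponential in the compressed size n, running this check after decompression can take time exponential in n, which is the gap the abstract refers to.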

COMP-2007-04

Fri, 20 Apr 2007 00:00:00 +0000

2005-01-FCT

Mon, 01 Aug 2005 00:00:00 +0000

Sequence comparison is a fundamental task in pattern matching. Its applications include file comparison, spelling correction, information retrieval, and computing (dis)similarities between biological sequences. A common scheme for sequence…

2005-01-WEA

Sun, 01 May 2005 00:00:00 +0000

The task of approximate string matching is to find all locations at which a pattern string p of length m matches a substring of a text string t of length n with at most k differences. It is common to use Levenshtein distance [5], which allows the differences to be…

2005-01-Int-J-Found-Comput-Sci

Sat, 01 Jan 2005 00:00:00 +0000

We study the fully compressed pattern matching problem (FCPM problem): Given [Formula: see text] and [Formula: see text] which are descriptions of text T and pattern P respectively, find the occurrences of P in T without decompressing [Formula: see text] or [Formula: see text]. This problem is rather challenging since patterns are also given in a compressed form. In this paper we present an FCPM algorithm for simple collage systems. Collage systems are a general framework representing various kinds of dictionary-based compressions in a uniform way, and simple collage systems are a subclass that includes LZW and LZ78 compressions. Collage systems are of the form [Formula: see text], where [Formula: see text] is a dictionary and [Formula: see text] is a sequence of variables from [Formula: see text]. Our FCPM algorithm performs in [Formula: see text] time, where [Formula: see text] and [Formula: see text]. This is faster than the previous best result of O(m^2 n^2) time.

2005-01-SOFSEM

Sat, 01 Jan 2005 00:00:00 +0000

The approximate string matching problem is to find all locations at which a query p of length m matches a substring of a text t of length n with at most k differences (insertions, deletions, substitutions). The fastest solutions in practice for this problem are the…

2005-01-Theor-Comput-Sci

Sat, 01 Jan 2005 00:00:00 +0000

We consider a deterministic finite automaton which accepts all subsequences of a set of texts, called a subsequence automaton. We show an online algorithm for constructing a subsequence automaton for a set of texts. It runs in O(|Σ|(m+k)+N) time using O(|Σ|m) space, where |Σ| is the size of the alphabet, m is the size of the resulting subsequence automaton, k is the number of texts, and N is the total length of the texts. It can be used to preprocess a given set S of texts in such a way that for any query ω ∈ Σ*, it returns in O(|ω|) time the number of texts in S which contain ω as a subsequence. We also show an upper bound on the size of the automaton compared to the minimum automaton.
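The query described in this abstract can be illustrated with a much simpler structure: one next-occurrence table per text, which acts as a subsequence automaton for that single text. This is a sketch under that simplification, not the merged automaton from the paper, so a query costs time proportional to the query length per text rather than in total. All function names are hypothetical.

```python
def build_next_table(text):
    """nxt[i][c] is the smallest index j >= i with text[j] == c (absent if none)."""
    nxt = [dict() for _ in range(len(text) + 1)]
    for i in range(len(text) - 1, -1, -1):
        nxt[i] = dict(nxt[i + 1])  # inherit occurrences to the right
        nxt[i][text[i]] = i        # the nearest occurrence of text[i] is i itself
    return nxt


def is_subsequence(nxt, query):
    """Follow one table transition per query symbol, matching greedily leftmost."""
    state = 0
    for c in query:
        j = nxt[state].get(c)
        if j is None:
            return False
        state = j + 1
    return True


def count_containing(texts, query):
    """Number of texts that contain query as a subsequence."""
    tables = [build_next_table(t) for t in texts]
    return sum(is_subsequence(t, query) for t in tables)
```

For instance, with the texts "abcab", "bca" and "cc", the query "ba" is a subsequence of the first two texts only. In practice the tables would be built once and reused across queries; they are rebuilt inside `count_containing` here only to keep the sketch short.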

2004-01-Developments-in-Language-Theory

Wed, 01 Dec 2004 00:00:00 +0000

There is a close relationship between formal language theory and data compression. Since the 1990s, various types of grammar-based text compression algorithms have been introduced. Given an input string, a grammar-based text compression algorithm constructs a…

2004-01-AIRS

Fri, 01 Oct 2004 00:00:00 +0000

There exist practical bit-parallel algorithms for several types of pair-wise string processing, such as longest common subsequence computation or approximate string matching. The bit-parallel algorithms typically use a size-σ table of match bit-vectors, where…

2004-01-ALT

Fri, 01 Oct 2004 00:00:00 +0000

Finding a good pattern which discriminates one set of strings from the other set is a critical task in knowledge discovery. In this paper, we review a series of our works concerning string pattern discovery. It includes theoretical analyses of learnabilities…

2004-01-Discovery-Science

Fri, 01 Oct 2004 00:00:00 +0000

We consider the problem of discovering the optimal pair of substring patterns with bounded distance α, from a given set S of strings. We study two kinds of pattern classes, one is in form…

2004-01-WABI

Wed, 01 Sep 2004 00:00:00 +0000

We consider the problem of finding the optimal pair of string patterns for discriminating between two sets of strings, i.e. finding the pair of patterns that is best with respect to some appropriate scoring function that gives higher scores to pattern pairs which…

2004-01-J-Bioinform-Comput-Biol

Thu, 01 Jan 2004 00:00:00 +0000

We present an efficient algorithm for detecting putative regulatory elements in the upstream DNA sequences of genes, using gene expression information obtained from microarray experiments. Based on a generalized suffix tree, our algorithm looks for motif patterns whose appearance in the upstream region is most correlated with the expression levels of the genes. We are able to find the optimal pattern, in time linear in the total length of the upstream sequences. We implement and apply our algorithm to publicly available microarray gene expression data, and show that our method is able to discover biologically significant motifs, including various motifs which have been reported previously using the same data set. We further discuss applications for which the efficiency of the method is essential, as well as possible extensions to our algorithm.

2004-01-J-Discrete-Algorithms

Thu, 01 Jan 2004 00:00:00 +0000

2003-01-SPIRE

Wed, 01 Oct 2003 00:00:00 +0000

Given a text, grammar-based compression is to construct a grammar that generates the text. There are many kinds of text compression techniques of this type. Each compression scheme is categorized as being either off-line or on-line, according to how a text is…

2003-10-paper-084

Wed, 01 Oct 2003 00:00:00 +0000

2003-01-MFCS-2

Fri, 01 Aug 2003 00:00:00 +0000

This paper introduces a new problem of inferring strings from graphs, and inferring strings from arrays. Given a graph G or an array A, we infer a string that suits the graph, or the array, under some condition. Firstly, we solve the problem of finding a string w…

2003-01-Discovery-Science

Wed, 01 Jan 2003 00:00:00 +0000

The classificatory power of a pattern is measured by how well it separates two given sets of strings. This paper gives practical algorithms to find the fixed/variable-length-don’t-care pattern (FVLDC pattern) and approximate FVLDC pattern which are most…

2003-01-Nord-J-Comput

Wed, 01 Jan 2003 00:00:00 +0000

2003-01-Theor-Comput-Sci-3

Wed, 01 Jan 2003 00:00:00 +0000

We introduce our research on compressed pattern matching technology that combines data compression and pattern matching. To show the results of this work, we explain the collage system proposed by Kida et al. in 2003 that is a unifying framework for compressed pattern matching, and we explain the Repair-VF method proposed by Yoshida and Kida in 2013 and the MR-Repair method proposed by Furuya et al. in 2019 as grammar compressions suitable for compressed pattern matching.

2002-01-Discovery-Science

Fri, 01 Nov 2002 00:00:00 +0000

A variable-length-don’t-care pattern (VLDC pattern) is an element of set Π = (Σ ∪ {⋆})*, where Σ is an alphabet and ⋆ is a wildcard matching any string in Σ*. Given two sets of strings, we consider the problem of finding…

2002-01-CPM

Mon, 01 Jul 2002 00:00:00 +0000

For a string w over an alphabet Σ, we consider a composite data structure called the all-suffixes directed acyclic word graph (ASDAWG). ASDAWG(w) has |w| + 1 initial nodes, and the dag induced by all reachable nodes from the k-th initial node conforms with…

2002-01-Progress-in-Discovery-Science

Tue, 01 Jan 2002 00:00:00 +0000

Finding a pattern which separates two sets is a critical task in discovery. Given two sets of strings, consider the problem to find a subsequence that is common to one set but never appears in the other set. The problem is known to be NP-complete. Episode pattern is…

2002-08-MFCS

Tue, 01 Jan 2002 00:00:00 +0000

The minimum all-suffixes directed acyclic word graph (MASDAWG) of a string w has |w| + 1 initial nodes, where the dag induced by all reachable nodes from the k-th initial node conforms with the DAWG of the k-th suffix of w. A new space-economical algorithm for the…

2002-09-SPIRE

Tue, 01 Jan 2002 00:00:00 +0000

Techniques in processing text files “as is” are presented, in which given text files are processed without modification. The compressed pattern matching problem, first defined by Amir and Benson (1992), is a good example of the “as-is”…

2001-01-ISAAC

Sat, 01 Dec 2001 00:00:00 +0000

A fragmentary pattern is a multiset of non-empty strings, and it matches a string w if all the strings in it occur within w without any overlaps. We study some fundamental issues on computational complexity related to the matching of fragmentary patterns. We show…

2001-01-Discovery-Science-2

Thu, 01 Nov 2001 00:00:00 +0000

Episode pattern is a generalized concept of subsequence pattern where the length of substring containing the subsequence is bounded. Given two sets of strings, consider an optimization problem to find a best episode pattern that is common to one set but not common in…

2001-01-SPIRE

Thu, 01 Nov 2001 00:00:00 +0000

2001-01-SPIRE-2

Thu, 01 Nov 2001 00:00:00 +0000

DCC2001

Wed, 28 Mar 2001 00:00:00 +0000

DCC2001-Kida

Tue, 27 Mar 2001 00:00:00 +0000

A fundamental problem on strings in the realm of approximate string matching is pattern matching with mismatches: Given a text t, a pattern p, and a number k, determine whether some substring of t has Hamming distance at most k to p; such a substring is called a k-match. As real-world texts often come in compressed form, we study the case of searching for a small pattern p in a text t that is compressed by a straight-line program. This grammar compression is popular in the string community, since it is mathematically elegant and unifies many practically relevant compression schemes such as the Lempel-Ziv family, dictionary methods, and others. We denote by m the length of p and by n the compressed size of t. While exact pattern matching, that is, the case k = 0, is known to be solvable in near-linear time O (n + m) [Jez TALG'15], despite considerable interest in the string community, the fastest known algorithm for pattern matching with mismatches runs in time [MATH HERE] [Gawrychowski, Straszak ISAAC'13], which is far from linear even for very small k. In this paper, we obtain an algorithm for pattern matching with mismatches running in time O((n + m) poly(k)). This is near-linear in the input size for any constant (or slightly superconstant) k. We obtain analogous running time for counting and enumerating all k-matches. Our algorithm is based on a new structural insight for approximate pattern matching, essentially showing that either the number of k-matches is very small or both text and pattern must be almost periodic. While intuitive and simple for exact matches, such a characterization is surprising when allowing k mismatches.

2001-07-CPM

Mon, 01 Jan 2001 00:00:00 +0000

Compressed pattern matching is one of the most active topics in string matching. The goal is to find all occurrences of a pattern in a compressed text without decompression. Various algorithms have been proposed depending on underlying compression methods in the…

SPIRE2000-Matsumoto

Thu, 28 Sep 2000 00:00:00 +0000

SPIRE2000-Hirao

Wed, 27 Sep 2000 00:00:00 +0000

Internal Pattern Matching (IPM) queries on a text $T$, given two fragments $X$ and $Y$ of $T$ such that $|Y|<2|X|$, ask to compute all exact occurrences of $X$ within $Y$. IPM queries have been introduced by Kociumaka, Radoszewski, Rytter, and Waleń [SODA'15 & SICOMP'24], who showed that they can be answered in $O(1)$ time using a data structure of size $O(n)$ and used this result to answer various queries about fragments of $T$. In this work, we study IPM queries on compressed and dynamic strings. Our result is an $O(\log n)$-time query algorithm applicable to any balanced recompression-based run-length straight-line program (RLSLP). In particular, one can use it on top of the RLSLP of Kociumaka, Navarro, and Prezza [IEEE TIT'23], whose size $O\big(\delta \log \frac{n\log \sigma}{\delta \log n}\big)$ is optimal (among all text representations) as a function of the text length $n$, the alphabet size $\sigma$, and the substring complexity $\delta$. Our procedure does not rely on any preprocessing of the underlying RLSLP, which makes it readily applicable on top of the dynamic strings data structure of Gawrychowski, Karczmarz, Kociumaka, Łącki, and Sankowski [SODA'18], which supports fully persistent updates in logarithmic time with high probability.

SPIRE2000-Hoshino

Wed, 27 Sep 2000 00:00:00 +0000

CPM2000

Thu, 01 Jun 2000 00:00:00 +0000

We apply the Boyer-Moore technique to compressed pattern matching for text string described in terms of collage system, which is a formal framework that captures various dictionary-based compression methods. For a subclass of collage systems that contain no…

CIAC2000

Wed, 01 Mar 2000 00:00:00 +0000

Byte pair encoding (BPE) is a simple universal text compression scheme. Decompression is very fast and requires small work space. Moreover, it is easy to decompress an arbitrary part of the original text. However, it has not been so popular since the compression is…

1999-01-SPIRE-CRIWG

Wed, 01 Sep 1999 00:00:00 +0000

1999-09-paper-089

Wed, 01 Sep 1999 00:00:00 +0000

1999-09-paper-090

Wed, 01 Sep 1999 00:00:00 +0000

CPM99-Shibata

Fri, 23 Jul 1999 00:00:00 +0000

In this paper we focus on the problem of compressed pattern matching for the text compression using antidictionaries, which is a new compression scheme proposed recently by Crochemore et al. (1998). We show an algorithm which preprocesses a pattern of length m and an…

CPM99-Kida

Thu, 22 Jul 1999 00:00:00 +0000

This paper considers the Shift-And approach to the problem of pattern matching in LZW compressed text, and gives a new algorithm that solves it. The algorithm is indeed fast when a pattern length is at most 32, or the word length. After an O(m + |Σ|) time and…

1998-10-lzw-compressed-text-matching

Thu, 01 Oct 1998 00:00:00 +0000

DCC98

Wed, 01 Apr 1998 00:00:00 +0000

CPM97

Tue, 01 Jul 1997 00:00:00 +0000

We show an efficient pattern-matching algorithm for strings that are succinctly described in terms of straight-line programs, in which the constants are symbols and the only operation is the concatenation. In this paper, both text T and pattern P are given by…

EuroCOLT97

Tue, 18 Mar 1997 00:00:00 +0000

A pattern is a finite string of constant and variable symbols. For k≥1, we denote by kμΠ the set of all patterns in which each variable symbol occurs at most k times. In particular, we abbreviate μΠ for k=1. The language L(π) of a…

1997-01-Nord-J-Comput

Wed, 01 Jan 1997 00:00:00 +0000

CPM95

Fri, 07 Jul 1995 00:00:00 +0000

We consider strings which are succinctly described. The description is in terms of straight-line programs in which the constants are symbols and the only operation is the concatenation. Such descriptions correspond to the systems of recurrences or to context-free…

1995-01-Electron-Colloquium-Comput-Complex

Sun, 01 Jan 1995 00:00:00 +0000

1993-07-bonsai

Sun, 11 Jul 1993 00:00:00 +0000

1991-01-Nonmonotonic-and-Inductive-Logic

Tue, 01 Jan 1991 00:00:00 +0000

Elementary formal system (EFS for short) is a kind of logic program directly dealing with character strings. In 1989, we proposed the class of variable-bounded EFS’s as a unifying framework for language learning. Responding to the proposal, several works have…

String Processing on

2026-03-WALCOM

SOFSEM2026

CIAC2025

CPM2025

Acta-Informatica-2024

2024-01-ICALP

TCS2024-Diptarama

CPM2024

SOFSEM2024

SPIRE2023

ICGI2023

WALCOM2023-Kumagai

2023-02-la-sympo-winter

SPIRE2022

Theoretical Computer Science 2022

CPM2022

2022-02-la-sympo-winter-dava

2022-02-la-sympo-winter-ichikawa

ICGI2023-Kaito

ICGI2023-kanazawa

2020-12-comp

SPIRE2020

CPM2020-Funakoshi

CPM2020-Köppl

CPM2020-Nakashima

SEA2020

TCS2020-Narisada-gap

TCS2020-Narisada-walk

DCC2020

2020-02-la-winter-nakashima

2020-02-la-winter-natsumi

SOFSEM2020

2020-01-Algorithmica

2020-01-SOFSEM-Doctoral-Student-Research-Forum-2

2019-12-comp

2019-08-tsjc

2019-01-PSC

2019-05-comp

2019-01-Algorithms

2019-01-GandALF

2019-01-J-Comput-Syst-Sci

2019-01-Theor-Comput-Sci

SPIRE2018

2018-01-CIAA

Received the 13th Noguchi Research Encouragement Award of the IPSJ Tohoku Branch

2018-01-SOFSEM

2018-01-SOFSEM-2

2017-gpw-nozaki

2017-01-CPM

2017-01-LATA

2017-01-SOFSEM-2

2016-01-SPIRE

2016-01-PSC

2016-01-DCC

2016-01-Discret-Appl-Math

2016-01-SOFSEM-Student-Research-Forum-Papers-Pos

2015-09-paper-025

2015-01-LATA

LA2015-winter

Takashi Katsura (桂敬史, third-year doctoral student) received the Best Student Poster Award at SOFSEM2015

SOFSEM2015

2015-01-Inf-Comput

2014-09-paper-029

2014-01-DCC

2014-01-Mach-Learn-2

2014-01-SOFSEM

2013-12-paper-032

Runner-up in the 3rd Hideo Aiso Cup FPGA Design Contest

2013-01-PSC

2013-07-paper-037

2013-04-n-57-2-03696

2013-04-paper-038

2013-01-SOFSEM

2012-10-paper-043

2012-01-SPIRE

2012-09-paper-044

2012-01-High-Order-Symb-Comput

2012-03-LATA

2011-11-paper-046

Received the 13th Noguchi Research Encouragement Award of the IPSJ Tohoku Branch

Runner-up in the 3rd Hideo Aiso Cup FPGA Design Contest