Are there improvements on Dana Angluin’s algorithm for learning regular sets

Problem Detail: In her 1987 seminal paper Dana Angluin presents a polynomial time algorithm for learning a DFA from membership queries and theory queries (counterexamples to a proposed DFA). She shows that if you are trying to learn a minimal DFA with $n$ states, and your largest countexample is of length $m$, then you need to make $O(mn^2)$ membership-queries and at most $n – 1$ theory-queries. Have there been significant improvements on the number of queries needed to learn a regular set?

References and Related Questions

Asked By : Artem Kaznatcheev

Answered By : Artem Kaznatcheev

In his answer on cstheory.SE, Lev Reyzin directed me to Robert Schapire’s thesis that improves the bound to $O(n^2 + nlog m)$ membership queries in section 5.4.5. The number of counterexample queries remains unchanged. The algorithm Schapire uses differs in what it does after a counterexample query.

Sketch of the improvement

At the highest level, Schapire forces $(S,E,T)$ from Angluin’s algorithm to have the extra condition that for a closed $(S,E,T)$ and each $s_1, s_2 in S$ if $s_1 neq s_2$ then $row(s_1) neq row(s_2)$. This guarantees that $|S| leq n$ and also makes the consistency property of Angluin’s algorithm trivial to satisfy. To ensure this, he has to handle the results of a counterexample differently. Given a counterexample $z$, Angluin simply added $z$ and all its prefixes to $S$. Schapire does something more subtle by instead adding a single element $e$ to $E$. This new $e$ will make $(S,E,T)$ to be not closed in Angluin’s sense and the update to get closure with introduce at least one new string to $S$ while keeping all rows distinct. The condition on $e$ is: $$exists s, s’ in S, a in Sigma quad text{s.t} quad row(s) = row(s’a) ; text{and} ; o(delta(q_0,se)) neq o(delta(q_0,s’ae))$$ Where $o$ is the output function, $q_0$ is the initial state, and $delta$ the update rule of the true ‘unknown’ DFA. In otherwords, $e$ must serve as a witness to distinguish the future of $s$ from $s’a$. To figure out this $e$ from $z$ we do a binary search to figure out a substring $r_i$ such that $z = p_ir_i$ and $0 leq |p_i| = i < |z|$ such that the behavior of our conjectured-machine differs based on one input character. In more detail, we let $s_i$ be the string corresponding to the state reached in our conjectured machine by following $p_i$. We use binary search (this is where the $log m$ comes from) to find an $k$ such that $o(delta(q_0,s_kr_k)) neq o(delta(q_0,s_{k+1}r_{k+1})$. In other words, $r_{k+1}$ distinguishes two states that our conjectured machines finds equivalent and thus satisfies the condition on $e$, so we add it to $E$.

Best Answer from StackOverflow

Question Source : http://cs.stackexchange.com/questions/118