References and Related Questions
- Dana Angluin (1987) “Learning Regular Sets from Queries and Counterexamples”, Infortmation and Computation 75: 87-106
- Lower bounds for learning in the membership query and counterexample model
Asked By : Artem Kaznatcheev
Answered By : Artem Kaznatcheev
Sketch of the improvement
At the highest level, Schapire forces $(S,E,T)$ from Angluin’s algorithm to have the extra condition that for a closed $(S,E,T)$ and each $s_1, s_2 in S$ if $s_1 neq s_2$ then $row(s_1) neq row(s_2)$. This guarantees that $|S| leq n$ and also makes the consistency property of Angluin’s algorithm trivial to satisfy. To ensure this, he has to handle the results of a counterexample differently. Given a counterexample $z$, Angluin simply added $z$ and all its prefixes to $S$. Schapire does something more subtle by instead adding a single element $e$ to $E$. This new $e$ will make $(S,E,T)$ to be not closed in Angluin’s sense and the update to get closure with introduce at least one new string to $S$ while keeping all rows distinct. The condition on $e$ is: $$exists s, s’ in S, a in Sigma quad text{s.t} quad row(s) = row(s’a) ; text{and} ; o(delta(q_0,se)) neq o(delta(q_0,s’ae))$$ Where $o$ is the output function, $q_0$ is the initial state, and $delta$ the update rule of the true ‘unknown’ DFA. In otherwords, $e$ must serve as a witness to distinguish the future of $s$ from $s’a$. To figure out this $e$ from $z$ we do a binary search to figure out a substring $r_i$ such that $z = p_ir_i$ and $0 leq |p_i| = i < |z|$ such that the behavior of our conjectured-machine differs based on one input character. In more detail, we let $s_i$ be the string corresponding to the state reached in our conjectured machine by following $p_i$. We use binary search (this is where the $log m$ comes from) to find an $k$ such that $o(delta(q_0,s_kr_k)) neq o(delta(q_0,s_{k+1}r_{k+1})$. In other words, $r_{k+1}$ distinguishes two states that our conjectured machines finds equivalent and thus satisfies the condition on $e$, so we add it to $E$.
Best Answer from StackOverflow
Question Source : http://cs.stackexchange.com/questions/118