Problem Detail: Given two sets $A,B$ of strings over alphabet $Sigma$, can we compute the smallest deterministic finite-state automaton (DFA) $M$ such that $A subseteq L(M)$ and $L(M) subseteq Sigma^*setminus B$? In other words, $A$ represents a set of positive examples. Every string in $A$ needs to be accepted by the DFA. $B$ represents a set of negative examples. No string in $B$ should be accepted by the DFA. Is there a way to solve this, perhaps using DFA minimization techniques? I could imagine creating a DFA-like automaton that has three kinds of states: accept states, reject states, and “don’t-care” states (any input that ends in a “don’t-care” state can be either accepted or rejected). But can we then find a way to minimize this to an ordinary DFA? You could think of this as the problem of learning a DFA, given positive and negative examples. This is inspired by Is regex golf NP-Complete?, which asks a similar questions for regexps instead of DFAs.
Asked By : D.W.
Answered By : Shaull
A DFA as you describe is called a separating DFA. There is some literature on this problem when $A$ and $B$ are regular languages, such as Learning Minimal Separating DFA’s for Compositional Verification, by Yu-Fang Chen, Azadeh Farzan, Edmund M. Clarke, Yih-Kuen Tsay, Bow-Yaw Wang Note that as @reinierpost states, without any restrictions on A and B, the problem may become undecidable.
Best Answer from StackOverflow
Question Source : http://cs.stackexchange.com/questions/19687