Problem Detail: I have pretty simple question, but still can’t find an answer just googling it. I’m trying to understand Chomsky Normal Form (CNF). There are three production rules:
- $A to BC$
- $A to alpha$
- $S to epsilon$
First two I understand. But last one $epsilon$ doesn’t makes sense for me. Why do we need this rule? What is use of having this?
Asked By : Igor Konoplyanko
Answered By : Yuval Filmus
The last rule is necessary for languages containing $epsilon$. A context-free grammar in which $S$ does not generate $epsilon$ can be converted to Chomsky normal form without the rule $S to epsilon$. Conversely, simple induction shows that all other rules do not generate $epsilon$, so this rule is needed for languages generating $epsilon$. The rule arises during the $epsilon$-elimination step. For each non-terminal generating $epsilon$, we “optionally” delete it from any production containing in in the right-hand side, and continue the process inductively (eliminating one non-terminal overtly generating $epsilon$ may reveal that under generates $epsilon$). Finally, it could happen that $epsilon$ is pushed all the way to the starting symbol $S$; there is no way to eliminate it there.
Best Answer from StackOverflow
Question Source : http://cs.stackexchange.com/questions/24012