Asked By : Andrea Tucci
Best Answer from StackOverflow
Question Source : http://cs.stackexchange.com/questions/2696
Answered By : codeah
Left Recursion
Let’s define a transformation that removes left recursion. Given $qquad displaystyle A to Aalpha_0 mid A alpha_1 mid dots mid A alpha_n mid B beta_0 mid B beta_1 mid dots mid Bbeta_n$, we remove left recursion like this: $qquad displaystyle begin{align} A_h &to B beta_0 mid B beta_1 mid dots mid B beta_n A_t &to alpha_0 mid alpha_1 mid dots mid alpha_n A_{t^+} &to A_t A_{t^+} mid A_t A &to A_h A_{t^+} mid A_h end{align}$ The generality of above is typically not given in compiler texts, but parsing texts like Grune & Jacobs do cover this. Left factoring can be applied to the above transformed grammar but will be just introduce extra rules that will not change the answer. So we will simplify the presentation without extra left factoring being performed. In this answer we will not cover indirect left recursion issues because we are only concerned with a single non-terminal’s rules. Note that indirect left recursion can be dealt with, though. (Open a separate question if that is important.)
Left Factoring
Removing left factoring is in most introductory compiler texts done like this. Given $qquad displaystyle A to x y mid x z$ left factoring yields: $qquad displaystyle begin{align} A_s &to y mid z A &to x A_s end{align}$ Now that’s perform the transformations in both ordering.
Left factoring first
Let’s left factor the question’s grammar $qquad displaystyle begin{align} F_s &to D S mid varepsilon F &to F B a mid c F_s end{align}$ and then remove the left recursion: $qquad displaystyle begin{align} F_h &to c F_s F_t &to B a F_{t^+} &to F_t F_{t^+} | F_t F_s &to D S mid varepsilon F &to F_h F_{t^+} mid F_h end{align}$
Removing left-recursion first
And for the other ordering, let’s remove left recursion from the question’s grammar $qquad displaystyle begin{align} F_h &to c D S mid c F_t &to B a F_{t^+} &to F_t F_{t^+} mid F_t F &to F_h F_{t^+} mid F_h end{align}$ and then left factor the terminal $c$: $qquad displaystyle begin{align} F_{hs} &to D S mid varepsilon F_h &to c F_{hs} F_t &to B a F_{t^+} &to F_t F_{t^+} mid F_t F &to F_h F_{t^+} mid F_h end{align}$ Aha, the resulting grammars are the same! In general, proving that two grammars are equivalent is undecidable. So if a series grammar transformations could influence the language that is recognized then, it would be disastrous.