Left recursion and left factoring — which one goes first?

Question Detail: if I have a grammar having a production that contains both left recursion and left factoring like $qquad displaystyle F to FBa mid cDS mid c$ which one has priority, left recursion or left factoring?

Asked By : Andrea Tucci
Best Answer from StackOverflow

Question Source : http://cs.stackexchange.com/questions/2696

Answered By : codeah

Transformations such as left factoring or removing left recursion do not have precedence rules. Obviously, the resulting grammars may be different but they will recognize the same language. The question’s example grammar is more difficult than the typical undergrad homework problem. So showing our work will be useful.

Left Recursion

Let’s define a transformation that removes left recursion. Given $qquad displaystyle A to Aalpha_0 mid A alpha_1 mid dots mid A alpha_n mid B beta_0 mid B beta_1 mid dots mid Bbeta_n$, we remove left recursion like this: $qquad displaystyle begin{align} A_h &to B beta_0 mid B beta_1 mid dots mid B beta_n A_t &to alpha_0 mid alpha_1 mid dots mid alpha_n A_{t^+} &to A_t A_{t^+} mid A_t A &to A_h A_{t^+} mid A_h end{align}$ The generality of above is typically not given in compiler texts, but parsing texts like Grune & Jacobs do cover this. Left factoring can be applied to the above transformed grammar but will be just introduce extra rules that will not change the answer. So we will simplify the presentation without extra left factoring being performed. In this answer we will not cover indirect left recursion issues because we are only concerned with a single non-terminal’s rules. Note that indirect left recursion can be dealt with, though. (Open a separate question if that is important.)

Left Factoring

Removing left factoring is in most introductory compiler texts done like this. Given $qquad displaystyle A to x y mid x z$ left factoring yields: $qquad displaystyle begin{align} A_s &to y mid z A &to x A_s end{align}$ Now that’s perform the transformations in both ordering.

Left factoring first

Let’s left factor the question’s grammar $qquad displaystyle begin{align} F_s &to D S mid varepsilon F &to F B a mid c F_s end{align}$ and then remove the left recursion: $qquad displaystyle begin{align} F_h &to c F_s F_t &to B a F_{t^+} &to F_t F_{t^+} | F_t F_s &to D S mid varepsilon F &to F_h F_{t^+} mid F_h end{align}$

Removing left-recursion first

And for the other ordering, let’s remove left recursion from the question’s grammar $qquad displaystyle begin{align} F_h &to c D S mid c F_t &to B a F_{t^+} &to F_t F_{t^+} mid F_t F &to F_h F_{t^+} mid F_h end{align}$ and then left factor the terminal $c$: $qquad displaystyle begin{align} F_{hs} &to D S mid varepsilon F_h &to c F_{hs} F_t &to B a F_{t^+} &to F_t F_{t^+} mid F_t F &to F_h F_{t^+} mid F_h end{align}$ Aha, the resulting grammars are the same! In general, proving that two grammars are equivalent is undecidable. So if a series grammar transformations could influence the language that is recognized then, it would be disastrous.