A computer must be able to make a deterministic choice between several productions during any given stage of a derivation. We note that non-deterministic choices can be rendered deterministic with the appropriate amount of lookahead. Unless otherwise stated, we will assume a lookahead of one token.
stmt -> ifstmt | other ifstmt -> if ( expr ) stmt | if ( expr ) stmt else stmt expr -> T | FIn this example, we do not immediately know which production to use to expand ifstmt. Left factoring is a way of transforming a grammar to be more suitable for predictive parsing.
Consider two A productions as follows:
A -> αβ1 A -> αβ2We don't know which A production to take until we reach either β1 or β2. We could rewrite the A productions as follows:
A -> αA' A' -> β1 | β2Algorithm 4.2 on p. 178 discusses Left factoring a grammar.
P#1: Left factor the above if grammar.
As we have discussed before, not all languages can be generated by a CFG. In particular, some language constructs cannot be specified using CFGs alone. Let's take a look at one. Consider the case in Pascal where every identifier must be previously declared. This abstracts to the language: L(G) = {sts | s is in (a | b)*}. We are saying that a string s must be declared before it is used.
Two major forms of parsers exist: (a) backtracking and (b) predictive.
P#2: Consider the following grammar:
S -> cBd B -> ab | aa) Is this grammar ambiguous?
b) Is this grammar left-recursive?
c) Show S=>+lmcad
Is backtracking necessary or not?
d) Can this grammar be left factored?
If so, do so and answer question c) again. If not, why not?
We begin by studying two top-down parsing algorithms: (a) recursive decent and (b) LL(1) parsing.
recursive decent parsing - is a top-down parsing method that requires no backtracking and works on grammars that have no left-recursion and are left factored.
Consider the following grammar:
expr -> expr op term | term op -> + | - term -> term mulop factor | factor mulop -> * factor -> ( expr ) | numa) What language does this grammar describe?
b) Is this grammar ambiguous?
c) Is this grammar immediate left recursive? If so, eliminate the immediate left recursion.
d) Can this grammar be left-factored? If so, left-factor the grammar.
LL(1) parsing - where the first L processes input left to right and the second L stands for leftmost derivation. The 1 is one symbol of lookahead.