Chapter 4 - Parse Tables

Must be viewed with IE 4.0 or later for greek symbols.

Parse Tables

Up to now we have talked about the concept of a top-down recursive decent parser and some of the problems that can arise based on the specification of the CFG. By carefully designing the grammar we can use the recursive decent method for many simple grammars. For more complicated grammars, we need more formal methods of parser construction using parsing tables. Also, it is possible to build nonrecursive predictive parsers as we will look at now.

LL(1) parsing uses a stack instead of recursion to parse an input string.

Consider the following grammar:

E  -> TE'
E' -> +TE' | e
T  -> FT'
T' -> *FT' | e
F  -> (E)  | id

Defn [Aho]: "If α is any string of grammar symbols, let FIRST(α) be the set of terminals that begin the strings derived from α. If α =>^*e, then e is also in FIRST(α)."

Defn [Aho]: "FOLLOW(N), for nonterminal N, is the set of terminals t that can appear immediately to the right of N in some sentential form, that is, the set of terminals t such that there exists a derivation of the form S =>^* αNtβ for some α and β."

Note1: It is possible that during the derivation, a symbol may have existed between N and t, but said symbol derived e.

Note2: If in some sentential form, N is the rightmost symbol, then $ is in the follow of A.

Let's examine the algorithm for FIRST(X) found on p. 189.

OK, back to our grammar:

E  -> TE'
E' -> +TE' | e
T  -> FT'
T' -> *FT' | e
F  -> (E)  | id

Compute each of the following:

FIRST(E)  = {                }

FIRST(E') = {                }

FIRST(T)  = {                }

FIRST(T') = {                }

FIRST(F)  = {                }

Our expression grammar for the compiler (part 4 of 7) will be the following:

exp        -> lvalue assignop exp | * unexp assignop exp |
              orterm
orterm     -> orterm || andterm | andterm
andterm    -> andterm && eqterm | eqterm
eqterm     -> eqterm equop relterm | relterm
relterm    -> relterm relop term | term
term       -> term addop factor | factor
factor     -> factor mulop unexp | unexp
unexp      -> lvalue autoop | & lvalue | * unexp |
              negop unexp | primary
primary    -> ( exp ) | lvalue | constant | func
lvalue     -> var | ( lvalue )
var        -> id | id [ exp ]
func       -> id | id ( list )
list       -> exp | list , exp

P#3: Identify FIRST(exp).

Note: FIRST(exp) is to be used in your top-down parser to identify the beginning of an expression or e

Must be viewed with IE 4.0 or later for greek symbols.

Algorithm for FOLLOW(A) for all nonterminals

The Follow rules are found on p. 189.

Back to our grammar one more time:

E  -> TE'
E' -> +TE' | e
T  -> FT'
T' -> *FT' | e
F  -> ( E )  | id

Compute each of the following:

FOLLOW(E)  = {                }

FOLLOW(E') = {                }

FOLLOW(T)  = {                }

FOLLOW(T') = {                }

FOLLOW(F)  = {                }

Problem: Compute FIRST(exp) and FOLLOW(exp) for our expression grammar.

Finally, how do we construct the LL(1) parse table TBL[N,t]?

Algorithm:

for (each nonterminal N and production possibility N -> α)
{
  for (each token t in the FIRST(α)
  {
    Add N -> α to TBL[N,t]

    if (e is an element of FIRST(α)
      for (each token a in the FOLLOW(N),
        Add N -> α to TBL[N,t]
  }
}

Douglas J. Ryan / ryandj@pacificu.edu