Lecture 13 - More Parsing
Top-down parsers
LL(1), recursive descent
- Input: read left to right
- Construct leftmost derivation
- 1 input symbol look ahead

Example
Grammar S -> % S % | & S & | $
Our language can be described mathematically as: {w $ w^R | w ∈ {%, &}^*}
Is this LL(1)?
Yes! Each alternative starts with a different terminal (%, &, or $), so as you push the leftmost derivation through, looking ahead one token is always enough to pick the production, and you end up with a deterministic parse.
The grammar S -> % S % | % & S % | $ is NOT LL(1), but rather LL(2): two alternatives start with %, so a second token of look ahead is needed to tell them apart.
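To make the "one token of look ahead" claim concrete, here is a minimal recursive descent sketch for the LL(1) grammar S -> % S % | & S & | $ from this example. The names parse, peek, match, and parse_s are illustrative and not from the lecture; the input is assumed to be a plain string of the characters %, &, and $.

```python
def parse(text):
    """Return True if text can be derived from S -> % S % | & S & | $."""
    pos = 0

    def peek():
        # One symbol of look ahead; None stands for end of input.
        return text[pos] if pos < len(text) else None

    def match(expected):
        nonlocal pos
        if peek() != expected:
            raise SyntaxError(f"expected {expected!r} at position {pos}")
        pos += 1

    def parse_s():
        look = peek()
        if look in ("%", "&"):
            # S -> % S %  or  S -> & S &, chosen by the look ahead alone.
            match(look)
            parse_s()
            match(look)
        elif look == "$":
            # S -> $
            match("$")
        else:
            raise SyntaxError(f"unexpected {look!r} at position {pos}")

    try:
        parse_s()
    except SyntaxError:
        return False
    return pos == len(text)  # reject trailing input
```

The parser never backtracks: each call to parse_s inspects exactly one symbol before committing to a production.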
Formally defining LL(1)
A -> α | β s.t. First(α) ∩ First(β) = ∅
But, how do we compute the First sets?
Left Recursion
Remember the expression grammar?
Goal -> Expr
Expr -> Expr + Term
| Expr - Term
| Term
Term -> Term * Factor
| Term / Factor
| Factor
Factor -> number
| id
Top-down parsers cannot handle left-recursive grammars
Formally, a grammar is left recursive if ∃ A ∈ NT such that ∃ a derivation A =>^+ Aα, for some string α ∈ (NT ∪ T)^+
Our expression grammar is left recursive
- This can lead to non-termination in a top-down parser
- For a top-down parser, any recursion must be right recursion
- We would like to convert the left recursion to right recursion
Non-termination is a bad property in any part of a compiler
Eliminating Left Recursion
To remove left recursion, we can transform the grammar
Consider a grammar fragment of the form
Fee -> Fee α
| β
where neither α nor β starts with Fee
We can rewrite this as
Fee -> β Fie
Fie -> α Fie
| ε
where Fie is a new non-terminal
This accepts the same language, but uses only right recursion
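The rewrite can be sketched in code. This is a minimal version that handles only immediate left recursion in a single non-terminal, as in the Fee/Fie fragment. The function name and the grammar encoding (each right-hand side is a list of symbols, with the empty list standing for ε) are assumptions made for illustration.

```python
def eliminate_immediate_left_recursion(nt, alternatives, new_nt):
    """Rewrite  nt -> nt α | β  as  nt -> β new_nt  and  new_nt -> α new_nt | ε.

    alternatives: list of right-hand sides, each a list of symbols.
    new_nt: name of the fresh non-terminal (Fie in the fragment above).
    Assumes every α is non-empty, i.e. there is no production nt -> nt.
    """
    # The α parts: what follows the leading nt in a left-recursive alternative.
    alphas = [rhs[1:] for rhs in alternatives if rhs and rhs[0] == nt]
    # The β parts: alternatives that do not start with nt.
    betas = [rhs for rhs in alternatives if not rhs or rhs[0] != nt]
    if not alphas:
        return {nt: alternatives}  # nothing to rewrite
    return {
        nt: [beta + [new_nt] for beta in betas],
        new_nt: [alpha + [new_nt] for alpha in alphas] + [[]],  # [] is ε
    }


# The Expr rule of the expression grammar.
expr_rules = [["Expr", "+", "Term"], ["Expr", "-", "Term"], ["Term"]]
rewritten = eliminate_immediate_left_recursion("Expr", expr_rules, "Expr'")
```

Applied to the Expr rule, this yields Expr -> Term Expr' and Expr' -> + Term Expr' | - Term Expr' | ε. The Term rule can be rewritten the same way.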
Roadmap (Where are we?)
We set out to study parsing
- Specifying syntax
- Context free grammars
- Ambiguity
- Top-down parsers
- Algorithm & its problem with left recursion
- Left-recursion removal
- Left factoring (will discuss later)
- Predictive top-down parsing
- The LL(1) condition
- Table-driven LL(1) parsers
- Recursive descent parsers
- Syntax directed translation (example)
Picking the "Right" production
If it picks the wrong production, a top-down parser may have to backtrack. The alternative is to look ahead in the input and use context to pick correctly
How much look ahead is needed?
- In general, an arbitrarily large amount
- Use the Cocke-Younger-Kasami algorithm or Earley's algorithm
Fortunately
- Large subclasses of CFGs can be parsed with limited look ahead
- Most programming language constructs fall in those subclasses
Among the interesting subclasses are LL(1) and LR(1) grammars
Predictive Parsing
Basic idea: Given A -> α | β, the parser should be able to choose between α and β
FIRST Sets
- For some rhs α ∈ G, define FIRST(α) as the set of tokens that appear as the first symbol in some string that derives from α
- That is, a ∈ FIRST(α) iff α =>^* aγ, for some γ
The LL(1) property: If A -> α and A -> β both appear in the grammar, we would like FIRST(α) ∩ FIRST(β) = ∅
- Note: This is almost correct, but not quite!
This would allow the parser to make a correct choice with a look ahead of exactly one symbol!
The FIRST Set - 1 Symbol Look ahead
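One common way to compute FIRST sets is a fixed-point iteration: keep adding symbols until nothing changes. The sketch below is not taken from the lecture. The function names, the grammar encoding (non-terminal mapped to a list of right-hand sides, with the empty list standing for ε), and the example grammar (the expression grammar after left-recursion removal) are all assumptions.

```python
EPS = "ε"


def first_of_string(symbols, first):
    """FIRST of a symbol sequence; symbols missing from `first` are terminals."""
    result = set()
    for sym in symbols:
        sym_first = first.get(sym, {sym})  # a terminal's FIRST set is itself
        result |= sym_first - {EPS}
        if EPS not in sym_first:
            return result
    result.add(EPS)  # every symbol can derive ε, or the sequence is empty
    return result


def compute_first(grammar):
    """Iterate until no FIRST set of any non-terminal grows."""
    first = {nt: set() for nt in grammar}
    changed = True
    while changed:
        changed = False
        for nt, alternatives in grammar.items():
            for rhs in alternatives:
                new = first_of_string(rhs, first)
                if not new <= first[nt]:
                    first[nt] |= new
                    changed = True
    return first


# Right-recursive expression grammar (assumed result of left-recursion removal).
GRAMMAR = {
    "Goal": [["Expr"]],
    "Expr": [["Term", "Expr'"]],
    "Expr'": [["+", "Term", "Expr'"], ["-", "Term", "Expr'"], []],
    "Term": [["Factor", "Term'"]],
    "Term'": [["*", "Factor", "Term'"], ["/", "Factor", "Term'"], []],
    "Factor": [["number"], ["id"]],
}
FIRST = compute_first(GRAMMAR)
```

For this grammar, FIRST(Expr') comes out as {+, -, ε}, and FIRST(Goal), FIRST(Expr), FIRST(Term), and FIRST(Factor) are all {number, id}.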

The FOLLOW Set - 1 Symbol
For a non-terminal A, define FOLLOW(A) as
FOLLOW(A) := the set of terminals that can appear immediately to the right of A in some sentential form
Thus, a non-terminal's FOLLOW set specifies the tokens that can legally appear after it; a terminal has no FOLLOW set
FOLLOW(A) = { a ∈ (T ∪ {eof}) | S eof =>^* α A a γ }
To build FOLLOW(X) for every non-terminal X
- Place eof in FOLLOW(<goal>)
- Iterate until no more terminals or eof can be added to any FOLLOW(X)
  - If A -> αBβ then put FIRST(β) - {ε} in FOLLOW(B)
  - If A -> αB then put FOLLOW(A) in FOLLOW(B)
  - If A -> αBβ and ε ∈ FIRST(β) then put FOLLOW(A) in FOLLOW(B)
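The steps above can be sketched as a fixed-point loop. This is a minimal version that assumes the FIRST sets of the non-terminals are already known and passed in. The function names, the grammar encoding (empty list for ε), and the example grammar (the expression grammar after left-recursion removal) are assumptions made for illustration.

```python
EPS, EOF = "ε", "eof"


def first_of_string(symbols, first):
    """FIRST of a symbol sequence; symbols missing from `first` are terminals."""
    result = set()
    for sym in symbols:
        sym_first = first.get(sym, {sym})
        result |= sym_first - {EPS}
        if EPS not in sym_first:
            return result
    result.add(EPS)  # every symbol can derive ε, or the sequence is empty
    return result


def compute_follow(grammar, goal, first):
    follow = {nt: set() for nt in grammar}
    follow[goal].add(EOF)  # place eof in FOLLOW(<goal>)
    changed = True
    while changed:  # iterate until nothing more can be added
        changed = False
        for a, alternatives in grammar.items():
            for rhs in alternatives:
                for i, b in enumerate(rhs):
                    if b not in grammar:
                        continue  # a terminal has no FOLLOW set
                    beta_first = first_of_string(rhs[i + 1:], first)
                    new = beta_first - {EPS}  # A -> αBβ: FIRST(β) - {ε}
                    if EPS in beta_first:  # β is empty or can derive ε
                        new |= follow[a]
                    if not new <= follow[b]:
                        follow[b] |= new
                        changed = True
    return follow


GRAMMAR = {
    "Goal": [["Expr"]],
    "Expr": [["Term", "Expr'"]],
    "Expr'": [["+", "Term", "Expr'"], ["-", "Term", "Expr'"], []],
    "Term": [["Factor", "Term'"]],
    "Term'": [["*", "Factor", "Term'"], ["/", "Factor", "Term'"], []],
    "Factor": [["number"], ["id"]],
}
# FIRST sets of the non-terminals, supplied by hand for this example.
FIRST = {
    "Goal": {"number", "id"}, "Expr": {"number", "id"},
    "Expr'": {"+", "-", EPS}, "Term": {"number", "id"},
    "Term'": {"*", "/", EPS}, "Factor": {"number", "id"},
}
FOLLOW = compute_follow(GRAMMAR, "Goal", FIRST)
```

The second and third rules collapse into one check here, because first_of_string returns {ε} for an empty β. For this grammar, FOLLOW(Expr') is {eof} and FOLLOW(Factor) is {+, -, *, /, eof}.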
If A -> α and A -> β and ε ∈ FIRST(α), then we need to ensure that FIRST(β) is disjoint from FOLLOW(A), too
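This refined condition can be checked mechanically for one non-terminal at a time. The sketch below assumes the FIRST set of each alternative and FOLLOW(A) are already known. The helper names first_plus and alternatives_are_ll1 are illustrative and not from the lecture.

```python
from itertools import combinations

EPS = "ε"


def first_plus(first_rhs, follow_a):
    """FIRST(rhs), extended with FOLLOW(A) when rhs can derive ε."""
    if EPS in first_rhs:
        return set(first_rhs) | set(follow_a)
    return set(first_rhs)


def alternatives_are_ll1(first_sets, follow_a):
    """first_sets holds FIRST(rhs) for each alternative of one non-terminal A."""
    augmented = [first_plus(f, follow_a) for f in first_sets]
    # Every pair of alternatives must be distinguishable by one token.
    return all(x.isdisjoint(y) for x, y in combinations(augmented, 2))


# Expr' -> + Term Expr' | - Term Expr' | ε, with FOLLOW(Expr') = {eof}
ok = alternatives_are_ll1([{"+"}, {"-"}, {EPS}], {"eof"})

# A -> a | ε where a can also follow A: one token cannot decide.
clash = alternatives_are_ll1([{"a"}, {EPS}], {"a"})
```

Here ok is True and clash is False. In the second case the look ahead a fits both A -> a and A -> ε.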
