Lecture 14
FIRST Set

To build FIRST(α) for α = X1 X2 … Xn:
- a ∈ FIRST(α) if a ∈ FIRST(Xi) and ε ∈ FIRST(Xj) for all 1 <= j < i
- ε ∈ FIRST(α) if ε ∈ FIRST(Xi) for all 1 <= i <= n
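The two rules above can be sketched in a few lines of Python. This is only an illustration: the marker `EPS`, the function name `first_of_string`, and the toy FIRST sets are assumptions, not part of the lecture.

```python
EPS = "ε"  # hypothetical marker for the empty string


def first_of_string(symbols, first):
    """FIRST(X1 X2 ... Xn), given the FIRST set of each individual symbol."""
    result = set()
    for x in symbols:
        # Xi is reachable only because every earlier Xj could derive ε.
        result |= first[x] - {EPS}
        if EPS not in first[x]:
            return result  # Xi cannot vanish, so later symbols stay hidden
    # Every Xi can derive ε (this also covers the empty string, n = 0).
    result.add(EPS)
    return result


# Assumed FIRST sets for a toy grammar: A -> a | ε, B -> b | ε, c is a terminal.
first = {"A": {"a", EPS}, "B": {"b", EPS}, "c": {"c"}}

print(sorted(first_of_string(["A", "B", "c"], first)))
print(sorted(first_of_string(["A", "B"], first)))
```

In the first call, c contributes because both A and B can derive ε; in the second, ε itself lands in the result because every symbol can vanish.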
FOLLOW Set
For a non-terminal A, define FOLLOW(A) as the set of terminals that can appear immediately to the right of A in some sentential form
Thus, a non-terminal's FOLLOW set specifies the tokens that can legally appear after it; a terminal has no FOLLOW set
FOLLOW(A) = { a ∈ (T ∪ {eof}) | S eof =>* α A a γ }
To build FOLLOW(X) for every non-terminal X:
- Place eof in FOLLOW(S), where S is the start symbol
- Iterate until no more terminals or eof can be added to any FOLLOW(X):
  - If A -> αBβ then
    - put FIRST(β) - {ε} in FOLLOW(B)
  - If A -> αB then
    - put FOLLOW(A) in FOLLOW(B)
  - If A -> αBβ and ε ∈ FIRST(β) then
    - put FOLLOW(A) in FOLLOW(B)
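The fixed-point iteration can be sketched as follows. The toy grammar, the dictionary representation (an empty list stands for ε), and all function names are assumptions made for illustration. FIRST is computed in the same block so that the example runs on its own.

```python
EPS, EOF = "ε", "eof"

# Hypothetical toy grammar: S -> A B, A -> a A | ε, B -> b B | ε
GRAMMAR = {
    "S": [["A", "B"]],
    "A": [["a", "A"], []],
    "B": [["b", "B"], []],
}


def first_of(symbols, first):
    """FIRST of a symbol string; a terminal's FIRST set is the terminal itself."""
    out = set()
    for x in symbols:
        fx = first.get(x, {x})
        out |= fx - {EPS}
        if EPS not in fx:
            return out
    out.add(EPS)
    return out


def compute_first(grammar):
    first = {nt: set() for nt in grammar}
    changed = True
    while changed:
        changed = False
        for nt, rhss in grammar.items():
            for rhs in rhss:
                new = first_of(rhs, first)
                if not new <= first[nt]:
                    first[nt] |= new
                    changed = True
    return first


def compute_follow(grammar, start, first):
    follow = {nt: set() for nt in grammar}
    follow[start].add(EOF)
    changed = True
    while changed:  # iterate until nothing more can be added
        changed = False
        for a, rhss in grammar.items():
            for rhs in rhss:
                for i, b in enumerate(rhs):
                    if b not in grammar:
                        continue  # terminals have no FOLLOW set
                    beta = first_of(rhs[i + 1:], first)
                    new = beta - {EPS}       # A -> αBβ: FIRST(β) - {ε}
                    if EPS in beta:          # β is empty or can derive ε
                        new |= follow[a]     # so FOLLOW(A) flows into FOLLOW(B)
                    if not new <= follow[b]:
                        follow[b] |= new
                        changed = True
    return follow


first = compute_first(GRAMMAR)
follow = compute_follow(GRAMMAR, "S", first)
for nt in GRAMMAR:
    print(nt, sorted(follow[nt]))
```

Treating the case A -> αB as "β is the empty string" lets one test, `EPS in beta`, cover both the second and the third rule.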
Predictive Parsing
If A -> α and A -> β and ε ∈ FIRST(α), then we need to ensure that FIRST(β) is disjoint from FOLLOW(A), too

This means that we need to update our LL(1) property to be:
A grammar is LL(1) iff A -> α and A -> β implies FIRST+(α) ∩ FIRST+(β) = ∅
- Notice we use FIRST+ instead of FIRST, in order to deal with ε: FIRST+(α) is FIRST(α) ∪ FOLLOW(A) if ε ∈ FIRST(α), and FIRST(α) otherwise
This allows the parser to make the correct choice with a lookahead of exactly one symbol!
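The LL(1) condition can be checked mechanically for one non-terminal at a time. The sketch below takes FIRST+(A -> α) to be FIRST(α), extended with FOLLOW(A) when ε ∈ FIRST(α). The function names and the hand-written FIRST and FOLLOW sets are assumptions for illustration.

```python
from itertools import combinations

EPS, EOF = "ε", "eof"


def first_plus(first_rhs, follow_lhs):
    """FIRST+(A -> α), given FIRST(α) and FOLLOW(A)."""
    if EPS in first_rhs:
        return first_rhs | follow_lhs
    return set(first_rhs)


def is_ll1(alternatives, follow_lhs):
    """alternatives: the FIRST set of each right-hand side of one non-terminal."""
    plus = [first_plus(f, follow_lhs) for f in alternatives]
    # Every pair of alternatives must have disjoint FIRST+ sets.
    return all(p.isdisjoint(q) for p, q in combinations(plus, 2))


# S -> aSb | ε with FIRST(aSb) = {a}, FIRST(ε) = {ε}, FOLLOW(S) = {b, eof}
print(is_ll1([{"a"}, {EPS}], {"b", EOF}))

# A -> a | ε where a can also follow A: one token cannot pick the rule
print(is_ll1([{"a"}, {EPS}], {"a"}))
```

The second call shows why FIRST alone is not enough: {a} and {ε} are disjoint, yet the ε alternative must also be chosen when the lookahead is a.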
Building Top Down Parsers
Building the complete table
- Need a row for every NT and a column for every T, plus "eof"
- Need an algorithm to build the table
Filling in TABLE[X,y], X ∈ NT, y ∈ T ∪ { eof }
- entry is the rule X -> β, if y ∈ FIRST+(β)
- entry is error otherwise
If any entry is defined multiple times, G is not LL(1)
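A minimal sketch of the table-filling step, assuming the FIRST+ sets are already known. The rule list, the dictionary-based table, and the name `build_table` are illustrative choices. The FIRST+ sets are written out by hand for the grammar S -> aSb | ε and list only terminals and eof, since ε is never a column.

```python
EOF = "eof"

# Each entry: (left-hand side, right-hand side, FIRST+ set); [] stands for ε.
RULES = [
    ("S", ["a", "S", "b"], {"a"}),
    ("S", [], {"b", EOF}),
]


def build_table(rules):
    table = {}
    for lhs, rhs, first_plus in rules:
        for y in first_plus:
            if (lhs, y) in table:
                # The entry would be defined twice, so the grammar is not LL(1).
                raise ValueError(f"not LL(1): conflict at TABLE[{lhs}, {y}]")
            table[(lhs, y)] = rhs
    return table  # keys that are absent play the role of the error entries


table = build_table(RULES)
for key in sorted(table):
    print(key, table[key])
```

Storing only the non-error entries keeps the table sparse; a lookup that finds no key corresponds to an error entry.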
LL(1) Skeleton Parser
token = next_token() // scanner call
push EOF onto Stack
push the start symbol, S, onto Stack
TOS = top of Stack
loop forever
    if TOS = EOF and token = EOF then
        break and report success
    else if TOS is a terminal then
        if TOS matches token then
            pop Stack // recognized TOS
            token = next_token()
        else report error looking for TOS
    else // TOS is a non-terminal symbol
        if TABLE[TOS, token] is A -> B1 B2 ... Bk then
            pop Stack // get rid of A
            push Bk, Bk-1, ..., B1 // in that order
        else report error expanding TOS
    TOS = top of Stack
LL(1) Parser Example
Table-driven LL(1) parser
| | a | b | eof | other |
|---|---|---|---|---|
| S | aSb | ε | ε | error |
How to parse input aaabbb?
Describe action as sequence of states (PDA stack content, remaining input, next action), use eof as bottom-of-stack marker
PDA stack content: [X, … Z], where Z is the TOS
Next actions: rule, or next input + pop, or error, or accept
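One way to produce such a sequence of states is to run a small table-driven parser and record each step. The sketch below hard-codes the table for S -> aSb | ε and treats each input character as a token. The names `TABLE`, `parse`, and the trace format are assumptions for illustration.

```python
EOF = "eof"
NONTERMINALS = {"S"}
TABLE = {
    ("S", "a"): ["a", "S", "b"],
    ("S", "b"): [],   # S -> ε
    ("S", EOF): [],   # S -> ε
}


def parse(text, start="S"):
    """Return a list of (stack, remaining input, action); raise SyntaxError on error."""
    tokens = list(text) + [EOF]
    pos = 0
    stack = [EOF, start]  # eof is the bottom-of-stack marker; TOS is stack[-1]
    trace = []
    while True:
        tos, token = stack[-1], tokens[pos]
        state = (list(stack), tokens[pos:])
        if tos == EOF and token == EOF:
            trace.append((*state, "accept"))
            return trace
        if tos not in NONTERMINALS:  # TOS is a terminal (or eof)
            if tos != token:
                raise SyntaxError(f"expected {tos}, found {token}")
            stack.pop()
            pos += 1
            trace.append((*state, f"match {token}"))
        else:  # TOS is a non-terminal: expand it using the table
            rhs = TABLE.get((tos, token))
            if rhs is None:
                raise SyntaxError(f"cannot expand {tos} on {token}")
            stack.pop()
            stack.extend(reversed(rhs))  # push Bk ... B1, so B1 ends up on top
            trace.append((*state, f"{tos} -> {' '.join(rhs) or 'ε'}"))


for stack, rest, action in parse("aaabbb"):
    print(stack, "".join(t for t in rest if t != EOF), action)
```

For aaabbb the parser expands S -> aSb three times, matching an a after each expansion, then applies S -> ε when it first sees b, matches the three b's, and accepts with only eof left on the stack.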

Recursive descent LL(1) parser
- Every NT is associated with a parsing procedure
- The parsing procedure for A β NT, proc A, is responsible to parse and consume any (token) string that can be derived from A; it may recursively call other parsing procedures
- The parser is invoked by calling proc S for start symbol S
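For the grammar S -> aSb | ε, a recursive descent parser needs a single procedure. The following sketch is one possible shape; the class `Parser`, the helper `match`, and the wrapper `accepts` are hypothetical names, and each input character is treated as a token.

```python
EOF = "eof"


class Parser:
    def __init__(self, text):
        self.tokens = list(text) + [EOF]
        self.pos = 0

    @property
    def token(self):
        """The single lookahead token."""
        return self.tokens[self.pos]

    def match(self, expected):
        if self.token != expected:
            raise SyntaxError(f"expected {expected}, found {self.token}")
        self.pos += 1

    def S(self):
        """proc S: consume any token string derivable from S."""
        if self.token == "a":            # FIRST+(S -> aSb) = {a}
            self.match("a")
            self.S()                     # recursive call for the nested S
            self.match("b")
        elif self.token in ("b", EOF):   # FIRST+(S -> ε) = {b, eof}
            pass                         # derive ε: consume nothing
        else:
            raise SyntaxError(f"unexpected token {self.token}")


def accepts(text):
    parser = Parser(text)
    try:
        parser.S()         # invoke the procedure for the start symbol
        parser.match(EOF)  # the whole input must have been consumed
        return True
    except SyntaxError:
        return False


print(accepts("aaabbb"), accepts("aab"))
```

Here the call stack of the host language plays the role of the explicit PDA stack in the table-driven version.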
Reminder: Left Recursion
Top-down parsers cannot handle left-recursive grammars
Our expression grammar is left recursive
- This can lead to non-termination in a top-down parser
- For a top-down parser, any recursion must be right recursion
- We would like to convert the left recursion to right recursion
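The usual conversion for immediate left recursion rewrites A -> A α | β as A -> β A' and A' -> α A' | ε. Below is a minimal sketch of that rewrite. The expression rule used as input is a typical left-recursive example chosen for illustration, not necessarily the lecture's own grammar, and the function name and the primed non-terminal name are assumptions.

```python
def eliminate_immediate_left_recursion(nt, alternatives):
    """alternatives: right-hand sides of `nt` as lists of symbols ([] is ε)."""
    # Split A -> A α | β into the α parts and the β parts.
    recursive = [rhs[1:] for rhs in alternatives if rhs and rhs[0] == nt]
    others = [rhs for rhs in alternatives if not rhs or rhs[0] != nt]
    if not recursive:
        return {nt: alternatives}  # nothing to do
    tail = nt + "'"  # hypothetical name for the new non-terminal
    return {
        nt: [beta + [tail] for beta in others],                 # A  -> β A'
        tail: [alpha + [tail] for alpha in recursive] + [[]],   # A' -> α A' | ε
    }


# Assumed input: Expr -> Expr + Term | Expr - Term | Term
rules = eliminate_immediate_left_recursion(
    "Expr", [["Expr", "+", "Term"], ["Expr", "-", "Term"], ["Term"]]
)
for lhs, rhss in rules.items():
    for rhs in rhss:
        print(lhs, "->", " ".join(rhs) or "ε")
```

This handles only immediate left recursion in a single non-terminal; indirect left recursion (A -> B ..., B -> A ...) needs a more general procedure.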
Left Factoring
What if my grammar does not have the LL(1) property? => Sometimes, we can transform the grammar
The algorithm

A graphical example:

Consider the following fragment of the expression grammar
Factor -> Identifier
| Identifier [ExprList]
| Identifier (ExprList)
After left factoring, it becomes
Factor -> Identifier Arguments
Arguments -> [ExprList]
| (ExprList)
| ε
This form has the same syntax, with the LL(1) property
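The Factor example can be reproduced with a small sketch of one round of left factoring: alternatives that begin with the same symbol are grouped, their longest common prefix is pulled out, and the differing tails move to a new non-terminal. The function names and the way new non-terminal names are supplied are assumptions for illustration, not the algorithm from the lecture slide.

```python
def common_prefix(seqs):
    """Longest prefix shared by all symbol lists in seqs."""
    prefix = []
    for column in zip(*seqs):
        if len(set(column)) != 1:
            break
        prefix.append(column[0])
    return prefix


def left_factor(nt, alternatives, new_names):
    """One round of left factoring for `nt`; new_names yields fresh non-terminals."""
    groups = {}
    for rhs in alternatives:
        groups.setdefault(rhs[0] if rhs else None, []).append(rhs)
    result = {nt: []}
    for head, group in groups.items():
        if head is None or len(group) == 1:
            result[nt].extend(group)  # nothing shared, keep as is
            continue
        prefix = common_prefix(group)
        name = next(new_names)
        result[nt].append(prefix + [name])
        result[name] = [rhs[len(prefix):] for rhs in group]  # [] stands for ε
    return result


factored = left_factor(
    "Factor",
    [
        ["Identifier"],
        ["Identifier", "[", "ExprList", "]"],
        ["Identifier", "(", "ExprList", ")"],
    ],
    iter(["Arguments"]),
)
for lhs, rhss in factored.items():
    for rhs in rhss:
        print(lhs, "->", " ".join(rhs) or "ε")
```

A single round may leave common prefixes inside the new non-terminal's alternatives, so in general the step is repeated until no two alternatives of any non-terminal start with the same symbol.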

LL(1) Grammars
Question: By eliminating left recursion and left factoring, can we transform an arbitrary CFG into a form that meets the LL(1) condition (and can therefore be parsed predictively with a single token of lookahead)?
Answer: Given a CFG that doesn't meet the LL(1) condition, it is undecidable whether or not an equivalent LL(1) grammar exists.