Lecture 18 - Context-Sensitive Analysis Part 2

Midterm #2

Regular Expressions
NFA and DFA
Regular Expressions to minimal DFA construction
CFG
- Derivations
- Parse Trees
- ambiguity
LL(1) parsing
- FIRST and FOLLOW sets
- Parse tables
- Recursive descent parsers
LR(0) parsing
- LR(0) items
- LR(0) canonical collection and its construction
- ACTION and GOTO tables
- Shift/reduce and reduce/reduce conflicts

Attribute Grammars, cont.

The Problems:

Copy rules increase cognitive overhead
Copy rules increase space requirements
- Need copies of attributes
Result is an attributed tree
- Must build the parse tree
- Either search tree for answers or copy them to the root

Addressing the Problem

What would a good programmer do?

Introduce a central repository for facts
Table names
- Field in table for loaded/not loaded state
Avoid all the copy rules, allocation & storage headaches
All inter-assignment attribute flow is through table
- Clean, efficient implementation
- Good techniques for implementing the table
- When it’s done, information is in the table
- Cures most of the problems
Unfortunately, this design violates the functional, AG paradigm
- Do we care?

The Realist’s Alternative

Ad-hoc syntax-directed translation

Associate pieces of code with each production
At each reduction, the corresponding code is executed
Allowing arbitrary code provides complete flexibility
- Includes ability to do tasteless and bad things

To make this work:

Need names for attributes of each symbol on lhs and rhs
- Typically, one attribute passed through parser + arbitrary code (structures, globals, …)
- Yacc introduced $$, $1, $2, …, $n, left to right
Need an evaluation scheme
- Fits nicely into LR(1) parsing algorithm

Project 2

You do not have to change the scanner (scan.l)
How to specify and use attributes in YACC?
- Define attributes as types in attr.h
- typedef struct info_node { int a; int b } infonode;
Include type attribute name in %union in parse.y
- %union {tokentype token; infonode myinfo; ... }
Assign attributes in parse.y to
- Terminals: %token <token> ID ICONST
- Non-terminals: %type <myinfo> block variables procdecls cmpdstmt

At each reduction, the corresponding code is executed.

Accessing attribute values in parse.y:
use $$, $1, $2, …, etc. notation:

block : variables procdecls {$2.b = $1.b + 1;} cmpdstmt
        { $$.a = $1.a + $2.a + $4.b; }

Implemented as

block : variables procdecls newsymbol cmpdstmt
        { $$.a = $1.a + $2.a + $4.b; }
newsymbol: ∈ {$2.b = $1.b + 1;}

Summary - is this really ad-hoc?

Relationship between practice and attribute grammars

Similarities

Both rules & actions associated with productions
Application order determined by tools
(Somewhat) abstract names for symbols

Differences:

Actions applied as a unit; not true for AG rules
Anything goes in ad-hoc actions; AG rules are (purely) functional
AG rules are higher level than ad-hoc actions

Types and Type Systems

Type: A set of values and meaningful operations on them

Types provide semantic “sanity checks” (consistency checks) and determine efficient implementations for data objects

Types help identify

errors
which operation to use for overloaded names and operators, or what type coercion to use (e.g.: 3.0 + 1)
identification of polymorphic functions

Type System

Type System: each language construct (operator, expression, statement, etc.) is associated with a type expression. The type system is a collection of rules for assigning type expressions to these constructs.

Type expressions for

basic types: integer, char, real, boolean, typeError
constructed types, e.g., one-dimensional arrays:
- array(lb, ub, elem_type), where elem_type is a type expression

A Type Checker implements a type system. It computes or “constructs” type expressions for each language construct