Lecture 3
Instruction Scheduling (Engineer’s View)
The Problem:
Given a code fragment (basic block) for some target machine and the latencies for each individual operation, reorder the operations to minimize execution time
The Concept:

The Task:
- Produce correct code
- Minimize wasted (idle) cycles
- Scheduler operates efficiently
Data Dependencies (stmt./instr. level)
Dependencies are defined on memory locations / registers
Statement / instruction b depends on statement / instruction a if one of the following holds:
- true or flow dependence: a writes a location / register that b later reads (RAW conflict)
- anti dependence: a reads a location / register that b later writes (WAR conflict)
- output dependence: a writes a location / register that b later writes (WAW conflict)
RAW = Read after Write, WAR = Write after Read, WAW = Write after Write
Dependencies define ORDER CONSTRAINTS that must be respected in order to generate correct code.
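The three dependence kinds can be checked mechanically from the read and write sets of two instructions. The following is a minimal sketch, not a definitive implementation: the `Instr` class and the `dependences` function are hypothetical names introduced for illustration, and only registers are tracked (memory locations are ignored).

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Instr:
    name: str
    reads: frozenset   # registers the instruction reads
    writes: frozenset  # registers the instruction writes

def dependences(a, b):
    """Kinds of dependence of b on a, assuming a precedes b in program order."""
    kinds = set()
    if a.writes & b.reads:
        kinds.add("RAW")   # true / flow dependence
    if a.reads & b.writes:
        kinds.add("WAR")   # anti dependence
    if a.writes & b.writes:
        kinds.add("WAW")   # output dependence
    return kinds

# loadAI r0,@w => r1   followed by   add r1,r1 => r1
a = Instr("a", frozenset({"r0"}), frozenset({"r1"}))
b = Instr("b", frozenset({"r1"}), frozenset({"r1"}))
print(sorted(dependences(a, b)))   # ['RAW', 'WAW']

# mult r1,r2 => r1   followed by   loadAI r0,@z => r2
d = Instr("d", frozenset({"r1", "r2"}), frozenset({"r1"}))
g = Instr("g", frozenset({"r0"}), frozenset({"r2"}))
print(sorted(dependences(d, g)))   # ['WAR']
```

This pairwise check ignores intervening writes, so it may report more dependences than a scheduler strictly needs; it is meant only to make the definitions concrete.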

Precedence / Dependence Graphs
Example latencies

To capture properties of the code, build a precedence/dependence graph G
- Nodes n in G are operations with type(n) and delay(n)
- There is an edge e = (n1, n2) in G if n2 depends on n1
a: loadAI r0,@w => r1
b: add r1,r1 => r1
c: loadAI r0,@x => r2
d: mult r1,r2 => r1
e: loadAI r0,@y => r3
f: mult r1,r3 => r1
g: loadAI r0,@z => r2
h: mult r1,r2 => r1
i: storeAI r1 => r0,@w
The Precedence Graph

All other dependencies (output & anti) are covered, i.e., are satisfied through the dependencies shown
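As a rough illustration of how the flow (RAW) edges of the graph could be derived from the code fragment, the sketch below tracks the last writer of each register. The tuple encoding of the block and the function name `flow_edges` are assumptions made for this example; memory locations and anti / output dependences are deliberately not modeled.

```python
# Each entry: (label, opcode, registers read, register written or None)
block = [
    ("a", "loadAI",  ["r0"],       "r1"),
    ("b", "add",     ["r1", "r1"], "r1"),
    ("c", "loadAI",  ["r0"],       "r2"),
    ("d", "mult",    ["r1", "r2"], "r1"),
    ("e", "loadAI",  ["r0"],       "r3"),
    ("f", "mult",    ["r1", "r3"], "r1"),
    ("g", "loadAI",  ["r0"],       "r2"),
    ("h", "mult",    ["r1", "r2"], "r1"),
    ("i", "storeAI", ["r1", "r0"], None),
]

def flow_edges(block):
    """Return (producer, consumer) pairs for register flow dependences."""
    last_writer = {}   # register -> label of the most recent writer
    edges = []
    for label, _opcode, reads, written in block:
        # dict.fromkeys removes duplicate operands while keeping their order
        for reg in dict.fromkeys(reads):
            if reg in last_writer:
                edges.append((last_writer[reg], label))
        if written is not None:
            last_writer[written] = label
    return edges

edges = flow_edges(block)
print(edges)
# [('a', 'b'), ('b', 'd'), ('c', 'd'), ('d', 'f'),
#  ('e', 'f'), ('f', 'h'), ('g', 'h'), ('h', 'i')]
```

r0 is never written inside the block, so it contributes no edges.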
The Big Picture
- Build a dependency graph, P
- Compute a priority function over the nodes in P
- Use list scheduling to construct a schedule, one cycle at a time
  - At most one instruction can be issued / scheduled per cycle
  - Use a set of operations that are ready
  - At each cycle
    - Choose a ready operation (priority based) and schedule it
    - Increment cycle
    - Update the ready set
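The steps above can be sketched as a small single-issue list scheduler. This is a minimal sketch under stated assumptions, not the definitive algorithm: the function name `list_schedule` and its parameters are hypothetical, the latencies (3 cycles for loadAI / storeAI, 1 for add, 2 for mult) are assumed values, the priorities are assumed to be longest latency-weighted path lengths under those latencies, and ties are broken alphabetically.

```python
def list_schedule(succs, delay, priority):
    """Greedy single-issue list scheduling; returns {operation: issue cycle}."""
    preds_left = {n: 0 for n in delay}          # unscheduled predecessors
    for n in succs:
        for s in succs[n]:
            preds_left[s] += 1
    earliest = {n: 1 for n in delay}            # first cycle operands are available
    ready = {n for n in delay if preds_left[n] == 0}
    schedule = {}
    cycle = 1
    while len(schedule) < len(delay):
        # Operations whose predecessors have all issued and completed
        candidates = [n for n in ready if earliest[n] <= cycle]
        if candidates:
            # Highest priority first; alphabetical order breaks ties
            op = min(candidates, key=lambda n: (-priority[n], n))
            ready.remove(op)
            schedule[op] = cycle
            for s in succs[op]:
                earliest[s] = max(earliest[s], cycle + delay[op])
                preds_left[s] -= 1
                if preds_left[s] == 0:
                    ready.add(s)                # update the ready set
        cycle += 1                              # an empty cycle is an idle slot
    return schedule

# Flow-dependence graph of the example block (edges point to dependents)
succs = {"a": ["b"], "b": ["d"], "c": ["d"], "d": ["f"], "e": ["f"],
         "f": ["h"], "g": ["h"], "h": ["i"], "i": []}
# Assumed latencies: loadAI/storeAI = 3, add = 1, mult = 2
delay = {"a": 3, "b": 1, "c": 3, "d": 2, "e": 3, "f": 2, "g": 3, "h": 2, "i": 3}
# Assumed priorities (longest latency-weighted path to the end of the block)
priority = {"a": 13, "b": 10, "c": 12, "d": 9, "e": 10,
            "f": 7, "g": 8, "h": 5, "i": 3}

schedule = list_schedule(succs, delay, priority)
print(sorted(schedule.items(), key=lambda kv: kv[1]))
# [('a', 1), ('c', 2), ('e', 3), ('b', 4), ('d', 5),
#  ('g', 6), ('f', 7), ('h', 9), ('i', 11)]
```

Under these assumptions, cycles 8 and 10 stay idle because h and i are still waiting for their operands.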
Local list scheduling
- The dominant algorithm for many years
- A greedy, heuristic, local technique
Scheduling Example
- Build the dependency graph (using the same one as above)
- Determine priorities: longest latency-weighted path
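One way to compute this priority is a memoized walk over the dependency graph: the priority of a node is its own delay plus the largest priority among its successors. The sketch below assumes latencies of 3 cycles for loadAI / storeAI, 1 for add, and 2 for mult; these values, like the names `succs`, `delay`, and `priority`, are illustrative assumptions rather than values taken from the text.

```python
from functools import lru_cache

# Flow-dependence graph of the example block (edges point to dependents)
succs = {"a": ["b"], "b": ["d"], "c": ["d"], "d": ["f"], "e": ["f"],
         "f": ["h"], "g": ["h"], "h": ["i"], "i": []}
# Assumed latencies: loadAI/storeAI = 3, add = 1, mult = 2
delay = {"a": 3, "b": 1, "c": 3, "d": 2, "e": 3, "f": 2, "g": 3, "h": 2, "i": 3}

@lru_cache(maxsize=None)
def priority(n):
    """Length of the longest latency-weighted path from n to the end of the block."""
    return delay[n] + max((priority(s) for s in succs[n]), default=0)

priorities = {n: priority(n) for n in succs}
print(priorities)
# {'a': 13, 'b': 10, 'c': 12, 'd': 9, 'e': 10, 'f': 7, 'g': 8, 'h': 5, 'i': 3}
```

With these numbers, a heads the critical path (13 cycles), so a priority-based scheduler would pick it first.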
