Compilers - Lecture 2

January 24, 2021

Backend - Register Allocation

Critical properties:

  • Produce correct code that uses k (or fewer) registers
  • Minimize added loads and stores
  • Minimize space used to hold spilled values
  • Operate efficiently: O(n), O(n log n), maybe O(n^2), but not O(2^n)

Instruction Scheduling

Motivation

  • Instruction latency (pipelining)
    • Instructions take several cycles to complete, but a new instruction can be issued every cycle
  • Instruction-level parallelism (VLIW, superscalar)
    • Execute multiple instructions per cycle

Issues

  • Reorder instructions to reduce execution time

  • Static schedule - insert NOPs to preserve correctness

  • Dynamic schedule - hardware pipeline stalls

  • Preserve correctness, improve performance

  • Interactions with other optimizations (register allocation!)

  • Note: After register allocation, the code contains real (physical) registers rather than virtual ones ==> a register may be redefined, i.e., reused for a different value

  • ILOC simulator “sim” is available on ilab at ~uli/cs415/ILOC_Simulator/sim
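The effect of reordering on execution time can be illustrated with a small cycle-count model. This is a minimal sketch in Python, assuming a single-issue, in-order machine on which an instruction stalls until its source registers are available. The latencies and the two instruction sequences are hypothetical values chosen for illustration, not taken from the lecture or from the ILOC simulator. The model tracks only register data dependences and ignores memory dependences and register reuse.

```python
# Hypothetical latencies in cycles (assumed for illustration only).
LATENCY = {"load": 3, "add": 1, "mult": 2, "store": 3}

def schedule_length(block):
    """Return the cycle in which the last instruction of the block completes.

    Each instruction is (opcode, source_registers, destination_register).
    One instruction issues per cycle, in order; an instruction stalls
    until every source register it reads has been produced.
    """
    ready = {}   # register -> first cycle in which its value is available
    cycle = 1    # earliest cycle in which the next instruction may issue
    finish = 0
    for op, srcs, dst in block:
        # Registers not defined in the block are live on entry (ready at cycle 1).
        issue = max([cycle] + [ready.get(s, 1) for s in srcs])
        done = issue + LATENCY[op] - 1
        if dst is not None:
            ready[dst] = done + 1
        finish = max(finish, done)
        cycle = issue + 1
    return finish

# Each add directly follows the load it depends on, so it must wait.
ORIGINAL = [
    ("load", ["r0"], "r1"),
    ("add", ["r1", "r1"], "r2"),
    ("load", ["r0"], "r3"),
    ("add", ["r3", "r3"], "r4"),
]

# Same instructions, with the two independent loads issued back to back.
REORDERED = [
    ("load", ["r0"], "r1"),
    ("load", ["r0"], "r3"),
    ("add", ["r1", "r1"], "r2"),
    ("add", ["r3", "r3"], "r4"),
]

print(schedule_length(ORIGINAL), schedule_length(REORDERED))
```

Under these assumed latencies the reordered sequence finishes earlier because the second load overlaps with the first one instead of waiting behind a stalled add.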

Local Instruction Scheduling

Readings: EaC 12.1-12.3, Appendix A (ILOC)

Definition: A basic block is a maximal-length segment of straight-line (i.e., branch-free) code. Control can only enter at the first instruction of the basic block and exit after the last instruction.

Local: within a single basic block

Global: across whole procedures/functions
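The basic block definition can be turned into a simple partitioning pass. This is a minimal sketch in Python under stated assumptions: instructions are represented as hypothetical (label, opcode) pairs, the set of branch opcodes is assumed for this example, and every label is treated as a possible branch target.

```python
BRANCH_OPS = {"jumpI", "cbr"}   # assumed branch opcodes for this sketch

def basic_blocks(code):
    """Split a list of (label, opcode) pairs into basic blocks.

    A new block starts at the first instruction, at every labeled
    instruction (a possible branch target), and after every branch.
    """
    blocks, current = [], []
    for label, op in code:
        if label is not None and current:
            blocks.append(current)   # a label starts a new block
            current = []
        current.append((label, op))
        if op in BRANCH_OPS:
            blocks.append(current)   # a branch ends the current block
            current = []
    if current:
        blocks.append(current)
    return blocks

# Hypothetical instruction stream used only to exercise the function.
CODE = [
    (None, "loadI"),
    (None, "add"),
    (None, "cbr"),
    ("L1", "add"),
    (None, "jumpI"),
    ("L2", "store"),
]

for block in basic_blocks(CODE):
    print(block)
```

If a label is never the target of any branch, this sketch still splits at it, so the resulting blocks are correct but not necessarily maximal.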

ILOC (Intermediate Language for Optimizing Compilers)

Instruction scheduling on basic blocks in “ILOC”

  • Pseudo-code for a simple, abstracted RISC machine
    • generated by the instruction selection process
  • Simple, compact data structures
  • Here: we only use a small subset of ILOC

Memory Model / Code Shape

  • Source code:
A = 5;
B = 6;
C = A + B;

Memory layout:

Assume A, B, C are integer values of 4 bytes

address(A) = 1024 + offset(A) = 1028
address(B) = 1024 + offset(B) = 1032
address(C) = 1024 + offset(C) = 1036

More generally: address(X) = base_address + offset(X)

This convention is used in activation records or stack frames. We use it here for consistency.
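The address computation above can be reproduced in a few lines. This is a minimal sketch in Python; the helper names are hypothetical, and the offsets 4, 8, and 12 are inferred from the addresses given in the text (1028, 1032, 1036 relative to base 1024).

```python
BASE_ADDRESS = 1024   # base address used in the example above
WORD_SIZE = 4         # A, B, C are 4-byte integers

def assign_offsets(names, first_offset=WORD_SIZE, size=WORD_SIZE):
    """Give each variable a consecutive offset, starting at first_offset."""
    return {name: first_offset + i * size for i, name in enumerate(names)}

def address(name, offsets, base=BASE_ADDRESS):
    """address(X) = base_address + offset(X)"""
    return base + offsets[name]

OFFSETS = assign_offsets(["A", "B", "C"])
for name in ("A", "B", "C"):
    print(name, address(name, OFFSETS))
```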

ILOC Code:

loadI 5 => r1
// compute address of A in r2
...
store r1 => r2 // content(A) = r1
loadI 6 => r3
// compute address of B in r4
...
store r3 => r4 // content(B) = r3
add r1, r3 => r5
// compute address of C in r6
...
store r5 => r6 // content(C) = r1 + r3

Is this code correct?

foo (var A, B)
  A = 5;
  B = 6;
  C = A + B;
end foo;

X = 1;
call foo(X,X);
print C;

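Incorrect for call-by-reference! In the call foo(X,X), A and B both name the location of X, so B = 6 also changes A and C must be 12. The ILOC code still holds 5 in r1 and stores 11.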
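The difference can be checked by simulating memory directly. This is a minimal sketch in Python, assuming memory is modeled as a dictionary from addresses to values and reference parameters are passed as addresses. The function names and the address 1040 chosen for X are hypothetical; 1036 is the address of C from the layout above.

```python
def foo_reference_semantics(mem, addr_a, addr_b, addr_c):
    """What the source program means under call-by-reference."""
    mem[addr_a] = 5
    mem[addr_b] = 6
    mem[addr_c] = mem[addr_a] + mem[addr_b]   # re-reads A and B from memory

def foo_register_code(mem, addr_a, addr_b, addr_c):
    """Mirrors the ILOC code: values stay in r1 and r3 across the stores."""
    r1 = 5
    mem[addr_a] = r1
    r3 = 6
    mem[addr_b] = r3
    r5 = r1 + r3                              # r1 is stale if A and B alias
    mem[addr_c] = r5

ADDR_C, ADDR_X = 1036, 1040                   # ADDR_X is an assumed address

expected = {ADDR_C: 0, ADDR_X: 1}
foo_reference_semantics(expected, ADDR_X, ADDR_X, ADDR_C)   # call foo(X, X)

actual = {ADDR_C: 0, ADDR_X: 1}
foo_register_code(actual, ADDR_X, ADDR_X, ADDR_C)           # call foo(X, X)

print(expected[ADDR_C], actual[ADDR_C])
```

When the two parameters refer to distinct locations, both versions store the same value in C; they differ only when A and B are aliases.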

Aliasing Problem

  • Aliasing: Two variables or source code names may refer to the same memory location

Examples:

  • Formal call-by-reference parameters a and b
  • Pointers a->f and b->f
  • Array elements a(i,j) and a(k,l) if i == k and j == l

Challenge: When is it safe to keep a variable’s value in a register across STORE instructions, i.e., while other STORE instructions are executed?

Memory Models

  • Register-register Model (We will use this one from now on)

    • Values that may safely reside in registers are assigned to a unique virtual register (alias analysis)
    • Register allocation/assignment maps virtual registers to a limited set of physical registers
    • Register allocation/assignment pass needed to make code “work”
  • Memory-Memory Model

    • All values reside in memory, and are only kept in registers as briefly as possible (load operands from memory, perform computation, store result into memory)
    • Register allocation/assignment has to try to identify cases where values can be safely kept in registers
    • Safety verification is hard at the low levels of program abstraction
    • Even without register allocation/assignment, code will “work”
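The two models can be contrasted by counting the memory operations each one emits for the same statements. This is a rough sketch in Python under stated assumptions: statements are hypothetical (dst, a, b) triples meaning dst = a + b, instructions are plain tuples rather than real ILOC, and the register-register version assumes alias analysis has already shown that every variable may safely live in a virtual register. It does not write final values back to memory.

```python
def memory_memory(stmts):
    """Load both operands, compute, and store the result for every statement."""
    code, n = [], 0
    for dst, a, b in stmts:
        ra, rb, rd = f"r{n + 1}", f"r{n + 2}", f"r{n + 3}"
        n += 3
        code.append(("load", a, ra))
        code.append(("load", b, rb))
        code.append(("add", ra, rb, rd))
        code.append(("store", rd, dst))
    return code

def register_register(stmts):
    """Keep each variable in its own virtual register (assumed to be safe)."""
    code, vreg = [], {}
    for dst, a, b in stmts:
        for name in (a, b):
            if name not in vreg:              # first use: load the value once
                vreg[name] = f"vr{len(vreg) + 1}"
                code.append(("load", name, vreg[name]))
        if dst not in vreg:
            vreg[dst] = f"vr{len(vreg) + 1}"
        code.append(("add", vreg[a], vreg[b], vreg[dst]))
    return code

def memory_ops(code):
    """Count the loads and stores in an instruction list."""
    return sum(1 for ins in code if ins[0] in ("load", "store"))

STMTS = [("C", "A", "B"), ("D", "C", "A")]    # C = A + B; D = C + A
print(memory_ops(memory_memory(STMTS)), memory_ops(register_register(STMTS)))
```

The register-register output uses an unbounded number of virtual registers, which is why a register allocation/assignment pass is still needed before the code can run on a real machine.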