Lecture 22 - Compiler Optimizations
Code Improvement
- Analyzes IR and rewrites (or transforms) IR
- Primary goal is to reduce running time of the compiled code
- May also improve space, power dissipation, energy consumption
- Must preserve the “meaning” of the code (approximations may be acceptable, i.e., there can be trade-offs in the quality of the outcome)
- Measured by values of named variables or produced output
The Optimizer (or Middle End)
Modern optimizers are structured as a series of passes
Typical transformations
- Discover & propagate some constant value
- Move a computation to a less frequently executed place (see the sketch after this list)
- Specialize some computation based on context
- Discover a redundant computation and remove it
- Remove useless or unreachable code
- Encode an idiom in some particularly efficient form
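One instance of moving a computation to a less frequently executed place is loop-invariant code motion. A minimal sketch in C (invented names; a compiler applies this on the IR, the source level is shown only for illustration):

// Before: x*y is recomputed on every iteration even though neither operand changes.
void fill_before(int *a, int n, int x, int y) {
    for (int i = 0; i < n; i++) {
        a[i] = x * y + i;
    }
}

// After: the invariant product is hoisted out of the loop and computed once.
void fill_after(int *a, int n, int x, int y) {
    int t = x * y;
    for (int i = 0; i < n; i++) {
        a[i] = t + i;
    }
}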
Benefits
How to assess whether a technique (transformation) will improve the overall program outcome or its (dynamic) execution:
- (S) - Safety
- Program semantics has to be preserved (true or false)
- (O) - Opportunity
- How often can the optimization be safely applied during the execution of the program (percentage)
- (P) - Profitability
- If the optimization is applied, what is the expected average improvement in the target metric (e.g., a speedup factor)?
Benefit (resulting execution time, as a percentage of the original) = (100 - O) + O/P, if S = true
Examples:
- The transformation “a” is safe and improves the execution time of 10% of the executed code by a factor of 5
- Benefit: execution time reduced to 92%
- The transformation “b” is not safe and improves the execution time of 40% of the executed code by a factor of 2
- Benefit is not defined
- If “b” were safe, benefit: execution time reduced to 80%
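A minimal sketch in C (helper names invented, not part of the lecture) that evaluates this formula for the two examples above:

#include <stdbool.h>
#include <stdio.h>

// Resulting execution time as a percentage of the original, given
// opportunity O (% of executed code affected) and profitability P
// (speedup factor on that portion). Only meaningful if the
// transformation is safe.
double resulting_time(bool safe, double O, double P) {
    if (!safe) return -1.0;          // benefit undefined for unsafe transformations
    return (100.0 - O) + O / P;      // unaffected part + sped-up part
}

int main(void) {
    printf("a: %.0f%%\n", resulting_time(true, 10.0, 5.0));  // 92%
    printf("b: %.0f%%\n", resulting_time(true, 40.0, 2.0));  // 80%, if “b” were safe
    return 0;
}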
Interactions
How do these optimizations interact?
A significant body of research tries to find the best sequence of optimizing transformations for different application domains. These transformations are not Church-Rosser, i.e., the particular order in which they are applied impacts the overall outcome.
Some of the optimizations are used as “clean-up” passes (e.g.: constant propagation, dead code elimination). This allows implementers of other transformations to use simpler algorithms and data abstractions that are easier to reason about.
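For example, once an earlier pass has turned a flag into a constant, constant propagation plus dead code elimination can tidy up what is left. A rough C sketch (names invented) of the before/after effect:

int expensive_check(void);   // hypothetical helper, declared only for illustration

// Before clean-up: an earlier pass has made `debug` a known constant.
int before(void) {
    int debug = 0;
    int r = 1;
    if (debug) {                 // constant propagation: condition is always false
        r = expensive_check();   // ...so this call is dead code
    }
    return r;
}

// After constant propagation and dead code elimination:
int after(void) {
    return 1;
}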
When you design an optimization pass, keep in mind that the program your pass is presented with may have run through many previous transformations, significantly changing the program’s code shape. Most likely, this code shape would not have been generated directly by any human programmer. Make sure your optimization pass’s algorithms and data structures can deal with “un-natural” shapes.
Commonality
What do these optimizations have in common?
- Their goal is to reduce the number of machine cycles needed to execute the program (reduce dynamic execution count)
Note: reducing dynamic execution cycles does not always imply reducing static program size. In fact, many optimizations increase the program size significantly. This in turn can have a negative impact on (dynamic) performance (e.g.: caches, failure of “standard” algorithms to generate good code).
Examples:
- Procedure in-lining
- Blocking for memory hierarchy
- Loop unrolling to increase basic block size
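For instance, loop unrolling reduces dynamic compare-and-branch overhead at the cost of a larger static loop body. A rough C sketch (invented names; a real compiler works on the IR and handles leftover iterations):

// Before: one add, one compare, and one branch per element.
void sum_before(const int *a, int n, int *out) {
    int s = 0;
    for (int i = 0; i < n; i++) {
        s += a[i];
    }
    *out = s;
}

// After unrolling by 4: the body is roughly 4x larger, but the
// compare/branch overhead is paid once per four elements.
// (Assumes n is a multiple of 4; a real compiler emits an epilogue
// loop for the remaining iterations.)
void sum_after(const int *a, int n, int *out) {
    int s = 0;
    for (int i = 0; i < n; i += 4) {
        s += a[i];
        s += a[i + 1];
        s += a[i + 2];
        s += a[i + 3];
    }
    *out = s;
}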
Optimization Goals
What other optimization goals are there?
- Performance
- Size of executable
- Power
- Energy
- Thermal
How do these different optimization goals interact?
- Does one optimization goal subsume another, or are they all different?
- Can one optimization goal conflict with another?
- (e.g.: power vs. performance, thermal vs. performance)
Scope / Granularity
Example: Discover and propagate some constant values (constant folding / propagation)
Local, global (intra-procedural), and inter-procedural optimization
Local
Local: Basic block within a procedure
a := 2
b := 3
c := a + b
print(c)
This gets optimized to:
a := 2
b := 3
c := 5
print(5)
Global
Global: Control flow between basic blocks within a procedure
if (...) then {
a := 2
b := 3
} else {
a := 3
b := 2
}
c := a + b
print(c)
This gets optimized to:
if (...) then {
a := 2
b := 3
} else {
a := 3
b := 2
}
c := 5
print(5)
Inter-procedural
Inter-procedural: Control flow across procedure calls
procedure foo(a,b) {
c := a + b // no side effects
return c;
}
procedure bar() {
...
c := foo(2, 3)
print(c)
d := foo(5,5)
print(d)
}
This gets optimized to:
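procedure foo(a,b) {
c := a + b // no side effects
return c;
}
procedure bar() {
...
c := 5
print(5)
d := 10
print(10)
}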