Lab 5: Static Single-Assignment Form


Overview

In this lab, you will build the static single-assignment (SSA) form for your Dragon compiler and use it to perform optimizations. SSA is a state-of-the-art IR used in production compilers such as GCC and LLVM.

This lab consists of two parts. In part A, you will implement the translation from CFG to SSA as well as the translation from SSA back to CFG. In part B, you will implement several classic data-flow analyses and optimizations on SSA.

Getting Started

First, check out the source code we have provided for lab5:

    $ git commit -am 'my solution to lab4'
    $ git fetch
    $ git checkout -b lab5 origin/lab5

These commands will first commit your changes to the lab4 branch of your local Git repository, then create a local lab5 branch and check out the remote lab5 branch into it.

Again, you will need to merge your code from lab4 into the new lab5 branch:

    $ git merge lab4

Do not forget to resolve any conflicts before committing to the local lab5 branch:

    $ git commit -am 'lab5 init'

You should first import the new lab5 code into your favorite IDE, and make sure the code compiles. There are a bunch of new files that you should browse through:

    cfg/lab5/*:    static single-assignment form and its operations

Hand-in Procedure

When you have finished your lab, zip your code and submit it to the online teaching system.


Part A: Static Single-Assignment Form (SSA)

The static single-assignment form (SSA) is an important compiler intermediate representation, in which each variable is assigned (statically) at most once. SSA is advantageous over other compiler IRs in that its single-assignment property makes program analysis and optimization simpler and more efficient. As a result, SSA has become the de facto IR in modern optimizing compilers for imperative, OO, and even functional languages.

In this part of the lab, you will build a static single-assignment form for your Dragon compiler, and conduct optimizations based on your SSA IR. To aid in your implementation, we have provided hints for most exercises, but keep in mind that these hints are not mandatory, and you are free (and encouraged) to propose your own ideas.

SSA Data Structure

To better reuse your existing code base, you will implement SSA by leveraging data structures already designed for the control-flow graph. This design decision makes the interfaces cleaner and more elegant, which simplifies subsequent compilation phases.

Exercise 1. Read the code defining the SSA data structures in the file cfg/lab5/Ssa.java. A basic block b might contain a list of φ-instructions, which reside at the top of a block. Syntactically, a φ-function
        x = φ((B_1, x_1), (B_2, x_2), ..., (B_p, x_p))
takes a list of tuples (B_i, x_i), 1 ≤ i ≤ p, as arguments and selects the variable x_j from these tuples as the value to assign to the left-hand side x when control flow reaches the current basic block from its predecessor B_j. You do not need to write any code for this exercise.
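
If it helps to have a mental model before reading the code, a φ-instruction can be viewed as a destination variable together with one argument per predecessor block. The following sketch is only an illustration with made-up names; the authoritative definitions (and the actual class and field names) are in cfg/lab5/Ssa.java.

    import java.util.List;

    // Rough mental model only -- NOT the actual classes in cfg/lab5/Ssa.java.
    class PhiSketch {
        String dst;                 // the x being assigned
        List<String> predLabels;    // B_1, ..., B_p: labels of the predecessor blocks
        List<String> args;          // x_1, ..., x_p: the argument matching each predecessor
    }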

Dominator Trees

You will construct the SSA form in the following five steps: 1) calculate dominators; 2) build dominator trees; 3) calculate dominance frontiers; 4) place φ-functions; and 5) rename variables.

In this part, you will finish the first two steps: you will first calculate dominators and then build a dominator tree for a given CFG.

Exercise 2. Calculate, for each node n in a control-flow graph g, its immediate dominator IDOM[n]. You should finish this exercise by filling in the method idom() in the file util/Graph.java.
Hint: you might extend the tree data structure to implement this algorithm.
Exercise 3. Calculate, for a control-flow graph g, its dominator tree t. You should finish this exercise by filling in the method dominatorTree() in the file util/Graph.java.
Hint: you should make use of the dominators DOM[n] already calculated for the node n, and think carefully about the basic data structure design to implement this algorithm.
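
If you are unsure where to start, here is a sketch of one way to derive IDOM[n] from the dominator sets DOM[n] and to build the dominator tree on top of it. It assumes nodes indexed 0..n-1 with node 0 as the entry and DOM[n] stored as a BitSet of node indices; these names and interfaces are only illustrative and do not match the actual util/Graph.java code.

    import java.util.ArrayList;
    import java.util.BitSet;
    import java.util.List;

    // Illustrative sketch only: nodes are indexed 0..n-1, node 0 is the entry,
    // and dom.get(i) is the already-computed dominator set DOM[i].
    class DomSketch {
        // IDOM[i] is the strict dominator of i that is dominated by every other
        // strict dominator of i; the entry node has no immediate dominator (-1).
        static int[] idom(int n, List<BitSet> dom) {
            int[] idom = new int[n];
            idom[0] = -1;
            for (int i = 1; i < n; i++) {
                BitSet strict = (BitSet) dom.get(i).clone();
                strict.clear(i);                          // strict dominators of i
                for (int d = strict.nextSetBit(0); d >= 0; d = strict.nextSetBit(d + 1)) {
                    boolean closest = true;
                    for (int e = strict.nextSetBit(0); e >= 0; e = strict.nextSetBit(e + 1))
                        if (e != d && !dom.get(d).get(e)) {   // some other strict dominator does not dominate d
                            closest = false;
                            break;
                        }
                    if (closest) { idom[i] = d; break; }
                }
            }
            return idom;
        }

        // The dominator tree simply makes every node a child of its immediate dominator.
        static List<List<Integer>> dominatorTree(int n, int[] idom) {
            List<List<Integer>> children = new ArrayList<>();
            for (int i = 0; i < n; i++) children.add(new ArrayList<>());
            for (int i = 1; i < n; i++) children.get(idom[i]).add(i);
            return children;
        }
    }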

Dominance Frontier

In this part of the lab, you will be implementing the third step to build an SSA, that is, you will calculate the dominance frontier for each node. The dominance frontier DF[n] of a node n is the set of all nodes d such that n dominates an immediate predecessor of d, but n does not strictly dominate d. Intuitively, the dominance frontier DF[n] for a node n is the set of nodes where n's dominance stops.

Exercise 4. Implement the algorithm calculating dominance frontiers DF[n] for any node n in the CFG. You should finish this exercise by filling in the method dominanceFrontier() in the file util/Graph.java.
Hint: to keep your implementation simple, you may use the following algorithm to calculate dominance frontiers:
    dominanceFrontier()
        foreach(node n ∈ CFG)
            DF[n] = ∅  // initialized to an empty set
        foreach(node n ∈ CFG)
            if(n has multiple predecessors)
                foreach(predecessor p of n)
                    runner = p
                    while(runner ≠ IDOM[n])
                        DF[runner] ⋃= {n}
                        runner = IDOM[runner]
where IDOM[n] represents the immediate dominator for the node n.
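
A direct Java transcription of this hint, using the same illustrative node indexing as in the earlier sketch (pred.get(i) holds the predecessors of node i, and idom[i] is the immediate dominator, -1 for the entry), might look as follows; the real code belongs in dominanceFrontier() in util/Graph.java and will use your own graph interface.

    import java.util.ArrayList;
    import java.util.HashSet;
    import java.util.List;
    import java.util.Set;

    // Illustrative transcription of the dominance-frontier hint above.
    class DfSketch {
        static List<Set<Integer>> dominanceFrontier(int n, List<List<Integer>> pred, int[] idom) {
            List<Set<Integer>> df = new ArrayList<>();
            for (int i = 0; i < n; i++)
                df.add(new HashSet<>());              // DF[i] starts empty
            for (int i = 0; i < n; i++) {
                if (pred.get(i).size() < 2)
                    continue;                         // only join points contribute
                for (int p : pred.get(i)) {
                    int runner = p;
                    // Walk from p up the IDOM chain until reaching IDOM[i]:
                    // every node visited gets i added to its dominance frontier.
                    while (runner != idom[i]) {
                        df.get(runner).add(i);
                        runner = idom[runner];
                    }
                }
            }
            return df;
        }
    }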

φ-function Placement

In this part, you will implement the fourth step to build an SSA: φ-function placement. The φ-function placement algorithm places φ-functions at the top of the appropriate blocks. Specifically, for a definition

x = ...

to a variable x in a basic block n, this algorithm will place a φ-function

x = φ(x, ..., x)

at the top of each block d in n's dominance frontier DF[n].

Exercise 5. Implement the algorithm to place the φ-functions, as described by Algorithm 5.4 in the reading material. You should finish this exercise by filling in the method placePhiFunctions() in the file cfg/lab5/BuildSsa.java.
Hint: for any basic block b, take care not to place φ-functions more than once for the same variable x.
Hint: φ-function placement has a cascading effect, that is, placing a φ-function for a variable x at a basic block b also makes b a definition block for x.
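
The following sketch shows the shape of the standard worklist formulation, reflecting both hints above. The parameter names and the way blocks and variables are represented here are assumptions for illustration only; the real code belongs in placePhiFunctions() in cfg/lab5/BuildSsa.java and will use your own IR types.

    import java.util.ArrayDeque;
    import java.util.Deque;
    import java.util.HashSet;
    import java.util.List;
    import java.util.Map;
    import java.util.Set;

    // Illustrative worklist-style φ-placement sketch.
    class PhiPlacementSketch {
        // vars:     all variables of the function
        // defSites: for each variable, the set of blocks that define it
        // df:       dominance frontiers, indexed by block
        // phiVars:  output -- for each block, the variables needing a φ there
        static void placePhis(Set<String> vars,
                              Map<String, Set<Integer>> defSites,
                              List<Set<Integer>> df,
                              Map<Integer, Set<String>> phiVars) {
            for (String a : vars) {
                Set<Integer> sites = defSites.getOrDefault(a, Set.of());
                Deque<Integer> work = new ArrayDeque<>(sites);
                Set<Integer> hasPhi = new HashSet<>();      // blocks already holding a φ for a
                while (!work.isEmpty()) {
                    int n = work.pop();
                    for (int y : df.get(n)) {
                        if (hasPhi.add(y)) {                // at most one φ per block and variable
                            phiVars.computeIfAbsent(y, k -> new HashSet<>()).add(a);
                            // Cascading effect: the new φ makes y a definition
                            // block for a, so y's frontier must be processed too.
                            if (!sites.contains(y))
                                work.push(y);
                        }
                    }
                }
            }
        }
    }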

Renaming the Variables

In this part of the lab, you will implement the fifth and last step to build an SSA: you will rename variables so that each variable has a unique definition.

Exercise 6. Implement the algorithm to rename the variables, as described by Algorithm 5.5 in the reading material. You should finish this exercise by filling in the method renameBlock() in the file cfg/lab5/BuildSsa.java.
Hint: think carefully about basic data structure design (e.g., variable versions and stacks).
Hint: to keep your implementation simple, you might want to use a functional programming style, that is, construct a new function from the old one instead of modifying the old one in place.
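
One possible way to organize the bookkeeping is sketched below: each variable carries a counter of versions and a stack of the versions currently in scope along the recursive dominator-tree walk. The class and method names are hypothetical; the real code belongs in renameBlock() in cfg/lab5/BuildSsa.java.

    import java.util.ArrayDeque;
    import java.util.Deque;
    import java.util.HashMap;
    import java.util.Map;

    // Illustrative bookkeeping for the renaming pass.
    class RenameState {
        private final Map<String, Integer> counter = new HashMap<>();
        private final Map<String, Deque<String>> stack = new HashMap<>();

        // A use of x is rewritten to the version on top of x's stack.
        String currentVersion(String x) {
            Deque<String> s = stack.get(x);
            return (s == null || s.isEmpty()) ? x : s.peek();
        }

        // A definition of x (including a φ destination) creates a fresh version
        // x_i and pushes it so later uses in dominated blocks see it.
        String freshVersion(String x) {
            int i = counter.merge(x, 1, Integer::sum);
            String xi = x + "_" + i;
            stack.computeIfAbsent(x, k -> new ArrayDeque<>()).push(xi);
            return xi;
        }

        // When the walk leaves a block, pop the versions that block pushed so
        // that sibling subtrees in the dominator tree see the correct versions.
        void popVersion(String x) {
            stack.get(x).pop();
        }
    }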

At this point, your compiler can convert all legal C programs to SSA form. Be sure to run regression tests on your implementation and fix any bugs.

Translation Out of SSA Forms

Modern machines do not support φ-functions directly, hence φ-functions must be eliminated before execution. This task is accomplished by translating out of SSA form.

Exercise 7. Implement the algorithm to translate a program out of SSA form. Specifically, to avoid the lost-copy problem and the swap problem, you might use the following algorithm:
    outSSA(ssa)
        splitCriticalEdges(ssa);
        foreach(basic block b, with p predecessors b1, ..., bp)
            suppose b contains m φ-functions:
                x1 = φ(x11, ..., x1p);
                ...
                xm = φ(xm1, ..., xmp);
            generate m fresh variables: t1, ..., tm;
            foreach(b's predecessor block bi, 1≤ i ≤p)
                append these assignment statements at the tail of block bi:
                    t1 = x1i
                    ...
                    tm = xmi
            append these assignment statements at the top of block b:
                x1 = t1
                ...
                xm = tm
Fill in the code in the file cfg/lab5/OutSsa.java to implement this algorithm and translate a program out of SSA form.
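
To make the shape of the algorithm concrete, here is a sketch over a deliberately simplified, hypothetical IR (statements kept as plain strings, no block terminators); it is not the representation used in cfg/lab5/Ssa.java, and the real code belongs in OutSsa.java.

    import java.util.ArrayList;
    import java.util.List;

    // Hypothetical, highly simplified IR used only for this sketch.
    class Phi {
        String dst;                                  // x_k
        List<String> args = new ArrayList<>();       // args.get(i) flows in from preds.get(i)
    }

    class Block {
        List<Phi> phis = new ArrayList<>();
        List<String> stmts = new ArrayList<>();
        List<Block> preds = new ArrayList<>();
    }

    class OutSsaSketch {
        static int fresh = 0;

        static void outSsa(List<Block> blocks) {
            // Critical edges are assumed to have been split already.
            for (Block b : blocks) {
                if (b.phis.isEmpty()) continue;
                List<String> temps = new ArrayList<>();      // one fresh temporary per φ
                for (int k = 0; k < b.phis.size(); k++)
                    temps.add("t" + fresh++);
                // In each predecessor b_i, copy the matching φ-argument into its
                // temporary (in a real IR, insert before b_i's terminator).
                for (int i = 0; i < b.preds.size(); i++) {
                    Block bi = b.preds.get(i);
                    for (int k = 0; k < b.phis.size(); k++)
                        bi.stmts.add(temps.get(k) + " = " + b.phis.get(k).args.get(i));
                }
                // At the top of b, copy the temporaries into the φ-destinations,
                // then drop the φ-functions.
                List<String> copies = new ArrayList<>();
                for (int k = 0; k < b.phis.size(); k++)
                    copies.add(b.phis.get(k).dst + " = " + temps.get(k));
                b.stmts.addAll(0, copies);
                b.phis.clear();
            }
        }
    }

Routing the values through the fresh temporaries t1, ..., tm is what avoids the lost-copy and swap problems: all φ-arguments are read into temporaries before any φ-destination is overwritten.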

Part B: SSA-based Optimizations

SSA makes compiler optimizations not only easier to engineer but also more efficient to execute, largely due to its single-assignment property. This property makes it much easier to calculate the data-flow information required to perform optimizations.

In this part of the lab, you will re-implement, on SSA, the copy propagation optimization that you previously implemented on the CFG, as well as a φ-function elimination optimization specific to SSA.

Copy Propagation

In SSA, given a copy statement x = y, any use of x can be replaced by y.

Exercise 8. Implement the copy propagation algorithm on SSA. You should finish this exercise by filling in the code in the file cfg/lab5/CopyProp.java.
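
One simple way to structure the pass, sketched below with hypothetical names, is to collect every copy x = y into a map from destination to source and then rewrite each use through that map, chasing chains of copies; this works precisely because SSA gives each variable a single definition (so copy chains cannot form cycles).

    import java.util.Map;

    // Sketch of the core substitution step for SSA copy propagation
    // (hypothetical helper; the real code goes into cfg/lab5/CopyProp.java).
    class CopyPropSketch {
        static String resolve(Map<String, String> copies, String use) {
            String v = use;
            while (copies.containsKey(v))    // e.g. x2 = y1; z3 = x2  ==>  z3 resolves to y1
                v = copies.get(v);
            return v;
        }

        public static void main(String[] args) {
            Map<String, String> copies = Map.of("x2", "y1", "z3", "x2");
            System.out.println(resolve(copies, "z3"));   // prints y1
        }
    }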

φ-function Elimination

In SSA, if a φ-function has the special form in which all of its arguments are identical:

x = φ(y, y, ..., y)

where y is a variable, this φ-function can be eliminated by rewriting it as a normal assignment to x:

x = y

Note that, more often than not, this special form of φ-function x = φ(y, y, ..., y) is generated by the copy propagation optimization, which substitutes φ-function arguments with other variables.

Similarly, the φ-function:

x = φ(c, c, ..., c)

can also be eliminated by rewriting it to an assignment statement:

x = c

where c is a constant.

Exercise 9. Implement the φ-function elimination optimization on SSA. You should finish this exercise by filling in the code in the file cfg/lab5/PhiElim.java.
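
The core check is small; here is a sketch with hypothetical names (the real code belongs in cfg/lab5/PhiElim.java and will use your own φ representation).

    import java.util.List;

    // Sketch of the check used by φ-function elimination: a φ whose arguments
    // are all the same variable or the same constant degenerates into a copy.
    class PhiElimSketch {
        // Returns the single repeated argument, or null if the φ is not trivial.
        static String trivialArg(List<String> args) {
            String first = args.get(0);
            for (String a : args)
                if (!a.equals(first))
                    return null;
            return first;    // rewrite "x = φ(a, ..., a)" into "x = a"
        }
    }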

Sparse Simple Constant Propagation

Challenge. Implement the sparse simple constant propagation optimization, using the lattice-based approach. You might refer to this paper or the reading material to get started.

Sparse Conditional Constant Propagation

Challenge. Implement the sparse conditional constant propagation optimization, using the lattice-based approach. You might refer to this paper or the reading material to get started.
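
Both challenges build on the same flat value lattice: ⊤ (value not yet determined), a single constant, and ⊥ (known not to be a constant), with φ-arguments merged by a meet operation. A minimal sketch of that lattice, with hypothetical names, is:

    // Sketch of the flat constant-propagation lattice:
    // TOP (undetermined) > constant c > BOTTOM (not a constant).
    class LatticeValue {
        enum Kind { TOP, CONST, BOTTOM }

        final Kind kind;
        final int value;                       // meaningful only when kind == CONST

        private LatticeValue(Kind k, int v) { kind = k; value = v; }

        static final LatticeValue TOP = new LatticeValue(Kind.TOP, 0);
        static final LatticeValue BOTTOM = new LatticeValue(Kind.BOTTOM, 0);
        static LatticeValue constant(int c) { return new LatticeValue(Kind.CONST, c); }

        // Meet of two lattice values, used when merging φ-arguments.
        LatticeValue meet(LatticeValue other) {
            if (kind == Kind.TOP) return other;
            if (other.kind == Kind.TOP) return this;
            if (kind == Kind.CONST && other.kind == Kind.CONST && value == other.value) return this;
            return BOTTOM;
        }
    }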

Hand-in

This completes the lab. Remember to hand in your solution to the online teaching system.