In this lab, you will design and implement a dominator tree algorithm, a foundational tool for code optimization. Additionally, you will develop a liveness analysis, a classical backward data flow analysis algorithm, and a reaching definitions analysis, a classical forward data flow analysis algorithm.
First check out the code skeleton for this lab:
$ git commit -am 'my solution to lab1' $ git fetch $ git checkout -b lab2 origin/lab2
which will commit your changes to 'lab1' branch to your local Git repository, and create and check out a new lab2 branch from the remote lab2 branch.
You will now need to merge the lab1 into the lab2:
$ git merge lab1
In some cases, Git may not be able to figure out how to merge your changes with the new lab assignment (e.g., if you modified some of the code that is changed in the second lab assignment). In that case, the git merge command will report you which files are conflicted, and you should first resolve the conflict (by editing the relevant files) and then commit the resulting files:
$ git commit -am 'lab2 init'
Lab 2 contains the following new source files, which you should browse through:
cfg/lab2/Dominator.java: dominator tree algorithm framework cfg/lab2/Liveness.java: liveness analyses framework cfg/lab2/ReachDefinition.java: reaching definition analyses framework
When you finished your lab, zip you code and submit to the online teaching system.
In program optimization and analysis, Dominator Tree is a tree structure used to represent the dominance relationship between nodes in a Control Flow Graph (CFG). It is an important tool in compiler design and static program analysis, especially in optimizations (such as loop optimization, dead code elimination) and data flow analysis.
Let DOM[n]
represent the dominator of node n
.
A simple and intuitive way to construct a dominating tree is to utilize
the following data-flow equation:
DOM[n] = {n} ⋃ (⋂p∈ pred[n] DOM[p])where
DOM[n0] = {n0}
for
the entry block n0
.
While the data-flow equation based algorithm is easy to engineer, it is
worst-time complexity is O(n2)
due to its use of
iterations to reach the fixpoint.
In this part, you should implement a more efficient algorithm to construct
dominator tree using Cooper's algorithm.
src/cfg/lab2/Dominator.java
and build a dominator tree from the control flow graph. You can refer to the
readings we provided and
this paper
to learn how Cooper's algorithm works.
A
liveness analysis
calculates, for each statement
and transfer in a function, the live-in and live-out variable sets.
In liveness analysis, you should calculate
in
and live_out
for the in
and out
separately; and the fix-point code should
calculate the in
and out
sets for
each basic block in a reverse topo-sort order.
But for other data-flow analysis, these details may be different.
cfg/lab2/Liveness.java
by implementing the
aforementioned algorithm.
In this part of the lab, You will implement a reaching definition analysis , which tracks the values of variables in a program. Through reaching definition analysis, we can detect the use of undefined variables in the program or perform constant propagation and copy propagation.
The reaching definition analysis algorithm we designed primarily consists of two steps: analyzing variable definition points and iteratively solving data flow equations.
The analysis of variable definition points is primarily performed to compute the information required for data flow equations. This step involves traversing each function and recording every definition point of variables within the function.
cfg/lab2/ReachDefinition.java
, then fill in the function
defsForFunction()
.
defSitesProp
to record variable
definition site.
Prior to solving data flow equations, we must also implement functions to
compute the gen/kill sets for each program statement. For each statement
d: t = x1 op x2
, the GEN set is computed as
Gen[d] = {d}
, and the KILL set is calculated by
Kill[d]=defs[t]-{d}
.
cfg/lab2/ReachDefinition.java
, then fill in the function
gkStm()
.
The solution of data flow equations typically employs a fixed-point
algorithm. This involves maintaining a state table where each iteration
recomputes the current state. If the state in the current iteration differs
from the previous iteration, the algorithm proceeds to the next round until
a stable convergence state—referred to as the fixed point—is reached.
It is noteworthy that reaching definition analysis is a forward
data flow analysis. For each statement s
, its In set
is the union of the Out sets from all its predecessor statements:
In[s] =
⋃p∈pred[s]Out[pred]
,
while its Out set is computed by
Out[s] =
Gen[s] ⋃ (In[s] - Kill[s])
.
doitFunction()
and complete the doitStm()
and doitTransfer()
functions to compute the In/Out sets for each program
statement/transfer.
Afterwards, you can validate your implementation by invoking the
test()
function within the embedded class UnitTest
.
This completes the lab. Remember to hand in your solution to the online teaching system.