In this lab, you will design and implement an abstract syntax tree as well as a type checker for Tiger. To implement an abstract syntax tree, you will design a bunch of Java classes, and add semantic actions to the parser to generate abstract syntax trees automatically. To implement the type checker, you will design symbol tables as well as type checking rules.
First check out the code skeleton for this lab:
$ git commit -am 'my solution to lab1' $ git checkout -b lab2 origin/lab2
which will commit your changes to 'lab1' branch to your local Git repository, and create and check out a new lab2 branch from the remote lab2 branch.
You will now need to merge the lab1 into the lab2:
$ git merge lab1
In some cases, Git may not be able to figure out how to merge your changes with the new lab assignment (e.g., if you modified some of the code that is changed in the second lab assignment). In that case, the git merge command will report you which files are conflicted, and you should first resolve the conflict (by editing the relevant files) and then commit the resulting files:
$ git commit -am 'lab2 init'
Lab 2 contains the following new source files, which you should browse through:
src/ast/*: abstract syntax tree data structure src/checker/*: symbol tables and the type checker
When you finished your lab, zip you code and submit to the online teaching system.
In this part of the lab, you will first design and implement an abstract syntax tree for Tiger.
ast/Ast.java
,
which encodes the abstract syntax for MiniJava.
Refer to the
MiniJava reference manual
for the MiniJava syntax.
Given the data structure definitions for the MiniJava syntax, one can (in principle) manually encode any MiniJava program. While this method is, at first glance, boring and error-prone to do in everyday programming practice, it does allow a direct representation of program structures, and also is the dominant programming styles in some functional languages, like Lisp or Scheme.
ast/SamplePrograms.java
, we have offered you some
code to encode several sample MiniJava program.
Read the code and make sure
you understand how those sample programs are represented.
Then write code to encode the MiniJava
program test/Sum.java
.
As you have done in lab 1, you will implement a pretty printer in this part, which makes use of the pattern matching feature in Java.
ast/PrettyPrinter.java
,
finish the pretty printer by filling in the missing code.
Do not forget to test your code when you finished, you can
start with the unit test code in ast/Test.java
.
The parser can construct the abstract syntax tree automatically by its semantic action attached with the parsing methods.
src/parser/Parser.java
to add
semantic actions to each parsing method. As the result of
parsing, your Tiger compiler will return an abstract
syntax tree for subsequent processing.
To this point, your Tiger compiler should parse all legal MiniJava programs, generate abstract syntax trees, and perform pretty printing on the trees. Do not forget to test your compiler:
$ java --enable-preview -cp bin Tiger -trace parser.Parser.parse <inputFile>
Fix any bugs before continuing.
A type checker inspects the abstract syntax trees for
well-formedness, according to the
MiniJava language specification which is in turn based on Java
specification.
Well-formedness guarantees that the input MiniJava program
obey all constraint rules.
For example, a variable must be
declared before its use; the "+"
operator
must apply to two operands of integer type; the methods being
invoked on an object must have
been defined in its class or some superclass; and so on.
In this part, you will be implementing a type checker for
Tiger.
Just like the pretty printing, a type checker performs some kind of (post-order) tree traversal. However, there is a key difference: the type checking traversal is context-sensitive, that is, it must record necessary context information. We can use dictionary data structures to record these information. For historical reason, such dictionaries are called symbol tables.
The design of the symbol table is highly-dependent on the language being compiled. For MiniJava, we have designed two symbol tables: 1) a class table; and 2) a method table. A class table maps a class name to its to instance variables and methods. And a method table map a method name to its parameters and locals variables.
src/checker/*
.
We have provided you all code, make sure to understand the overall
organization of these data structures.
To debug your compiler implementation, you often need to
dump the class table or method table. You can
run your Tiger compiler with:
$ java --enable-preview -cp bin Tiger -dump classtable -dump methodtable <inputFile>
A type checker performs a post-order traversal of an abstract syntax tree. That is, a type checker first inspects each subtree and calculates a type for that subtree, respectively. When the type checker returns to the root of the tree, it compares the types returned from all subtrees and guarantees those types can be combined properly into a result type before returning. For example, for the tree node e1+e2, the type checker first inspects the left subtree e1 to calculate a type, say t1, then inspects the right subtree e2 to calculate a type t2. At the type checker returns to the root node "+", it will check that both t1 and t2 are int, before returning a type int as the final result.
src/checker/Checker.java
,
by filling in the missing code.
In implementing the type checker, you can run the following command
$ java --enable-preview -cp bin Tiger -builtin SumRec.javain case that your parser has bugs.
To this point, your Tiger compiler can type check all MiniJava programs. Do not forget to test your compiler and fix any bugs before continuing.
A type checker should report type errors to the programmers and should recover from type errors to detect and report as many errors as possible. The error recovery strategy in Tiger's current implementation is preliminary in that it only reports one error before exiting, but does not try to recover from an error.
Java, like many static languages, obeys a declaration-before-use rule, meaning that each variable in a program should have a unique static declaration site before all its usages. However, a variable declaration may have not been used at all. For such scenarios, the variable is said to be an unused variable. A good compiler might generate warning like:
Warning: Variable 'x' is never used
which you might have observed in your Java IDE in this course.
Error: Variable 'x' might not have been initializedPropose a strategy to detect uninitialized variables, and implement your idea into your Tiger compiler.
This completes the lab. Remember to hand in your solution to the online teaching system.