Lab 2: Abstract Syntax Tree and Type Checker


Overview

In this lab, you will design and implement an abstract syntax tree as well as a type checker for Tiger. To implement an abstract syntax tree, you will design a bunch of Java classes, and add semantic actions to the parser to generate abstract syntax trees automatically. To implement the type checker, you will design and implement symbol tables as well as type checking rules.

Getting Started

First check out the code skeleton for this lab:

    $ git commit -am 'my solution to lab1'
    $ git checkout -b lab2 origin/lab2

which will commit your changes to 'lab1' branch to your local Git repository, and create and check out a new lab2 branch from the remote lab2 branch.

You will now need to merge the lab1 into the lab2:

    $ git merge lab1

In some cases, Git may not be able to figure out how to merge your changes with the new lab assignment (e.g., if you modified some of the code that is changed in the second lab assignment). In that case, the git merge command will report you which files are conflicted. To resolve the conflict, you should first edit the relevant files and then add the resulting files, before merging them:

    $ git add conflict-files
    $ git commit -am 'lab2 init'

Lab 2 contains the following new source files, which you should browse through:

    src/ast/*:     abstract syntax tree data structure
    src/checker/*: symbol tables and the type checker

Hand-in Procedure

When you finished your lab, zip you code and submit to the online teaching system.


Part A: Abstract Syntax Trees

In this part of the lab, you will first design and implement an abstract syntax tree for Tiger.

Program Representation as Trees

Given the definitions for the MiniJava language, we can design a group of classes to represent the MiniJava syntax.

Exercise 1. Read and understand the source code in the file ast/Ast.java, which encodes the abstract syntax for MiniJava. Refer to the MiniJava reference manual for the MiniJava syntax.

Pretty Printing

As you have done in lab 1, you will implement a pretty printer in this part, which makes use of the pattern matching feature in Java.

Exercise 2. Read the code in the file ast/Ast.java, finish the pretty printer by filling in the missing code in various pp methods. Do not forget to test your code when you finished.
Challenge! While the pretty printer you have implemented is powerful enough to process MiniJava code, it may be not that general and powerful. Study some general pretty printing algorithms and implement them into your Tiger compilers. For example, you may start with Wadler's printer.

Tree Generation

The parser can construct the abstract syntax tree automatically by the semantic actions attached to the parsing methods.

Exercise 3. Modify the parser src/parser/Parser.java to add semantic actions to each parsing method. As the result of parsing, your Tiger compiler will return an abstract syntax tree for subsequent processing.

To this point, your Tiger compiler should parse all legal MiniJava programs to generate abstract syntax trees. Meanwhile, your Tiger compiler can pretty print the generated trees. Do not forget to test your compiler:

    $ java -cp build/libs/tiger-1.0.jar Tiger -trace parser.Parser.parse <inputFile>

Fix any bugs before continuing.

Challenge! If you have tried parser generators from the previous lab, then you may generate abstract syntax trees from that parser by adding semantic actions into productions.

Part B: Type Checker

A type checker inspects the abstract syntax trees for well-formedness, according to the MiniJava language specification which is in turn based on Java specification. Well-formedness guarantees that the input MiniJava program obey all constraint rules. For example, a variable must be declared before its use; the "+" operator must apply to two operands of integer type; the methods being invoked on an object must have been defined in its class or some superclass; and so on. In this part, you will design and implement a type checker for Tiger.

Symbol Tables

Just like the pretty printing, a type checker performs some kind of (post-order) traversal of the abstract syntax tree. However, there is a key difference: the type checking traversal is context-sensitive, that is, it must record necessary context information. We can use dictionary data structures to record these information. For historical reason, such dictionaries are called symbol tables.

The design of the symbol table is highly-dependent on the language being compiled. For MiniJava, we design two symbol tables: 1) a class table, to maps a class name to its to instance variables and methods; and 2) a method table, to map a method name to its parameters and locals variables.

Exercise 4. Read the code defining symbol table data structures in src/checker/*. We have provided you all code, make sure to understand the overall organization of these symbol table data structures. To debug your compiler implementation, you often need to dump the class table or method table. You can run your Tiger compiler with:
 $ java -cp build/libs/tiger-1.0.jar Tiger -dump classtable
           -dump methodtable <inputFile>
to dump both the class table and the method table.

Implementing Type Checking Rules

A type checker performs a post-order traversal of an abstract syntax tree. That is, a type checker first inspects each subtree and calculates a type for that subtree, respectively. When the type checker returns to the root of the tree, it compares the types returned from all subtrees and check those types can be combined properly into a result type before returning. For example, for the tree node e1+e2, the type checker first inspects the left subtree e1 to calculate a type, say t1, then inspects the right subtree e2 to calculate a type t2. At the type checker returns to the root node "+", it checks that both types t1 and t2 are int, and returns a type int as the type of whole expression e1+e2.

Exercise 5. Finish the code in the type checker src/checker/Checker.java, by filling in the missing code.

To this point, your Tiger compiler can type check all MiniJava programs. Do not forget to test your compiler and fix any bugs before continuing.

Error Recovery

A type checker should report type errors to the programmers and should recover from type errors to detect and report as many errors as possible. The error recovery strategy in Tiger's current implementation is preliminary in that it only reports one error before exiting, but does not try to recover from an error.

Exercise 6. Propose a better error recovery strategy and implement it in your Tiger compiler.

Declaration and Usage

Java, like many static languages, obeys a declaration-before-use rule, meaning that each variable in a program should have a unique static declaration site before all its usages. However, a variable declaration may have not been used at all. For such scenarios, the variable is said to be an unused variable. A good compiler might generate warning like:

    Warning: Variable 'x' is never used

which you might have observed in your Java IDE in this course.

Exercise 7. Extend your Tiger compiler to issue warnings for unused variables.
Challenge. In well-behaved Java programs, every variable should have been initialized before its usages, otherwise the usage of such a variable might produce unpredictable results. For uninitialized variables, good compilers might generate the following error:
        Error: Variable 'x' might not have been initialized
Propose a strategy to detect uninitialized variables, and implement your idea into your Tiger compiler.

Hand-in

This completes the lab. Remember to hand in your solution to the online teaching system.