CS 654: Introduction to Compilers Spring 2006
In this assignment you will implement part of the static semantics of Cool. You will use the abstract syntax tree (AST) built by your parser from assignment 3 to check that a program is in conformance with the Cool specification. In this part of the static semantics, we will only check method bodies and will not handle the object-oriented features of Cool.
This assignment has much more room for design decisions than previous assignments. Your program is correct if it checks programs against the specification. There is no single ``right'' way to do the assignment, but there are many wrong ways. There are a number of standard practices which we think make life easier and we will try to convey them to you. We will suggest certain program structures and if you learn how to use them, they will make your task easier. Whatever you decide to do, be prepared to justify your solution.
You will need to refer to the typing rules, identifier scoping rules, and other restrictions of Cool as defined in the CoolAid. You will also need to add methods and data members to the AST class definitions for this phase. The functions that the tree package provides are documented in the handout for programming assignment 3 and in the header files cool-tree.h and tree.h. You can modify the class definitions by changing (adding things to) cool-tree.h.
To get the assignment type
make -f /afs/cs.uwm.edu/users/classes/cs654/assignments/PA4/Makefilein your PA4 directory in afs. This command copies a number of files to your directory, some of them with read-only permission. As usual, you should not modify files that are read-only. Please read and follow the directions in the README file.
The files that you will need to modify are:
These test files are extremely important: as with previous assignments you can get credit for testing a feature, even if you did not completely implement it. Conversely, if you do not test a feature adequately, you will lose credit in this part of the assignment. Explain your tests in these files and put any overall comments in the README file.
Make sure that the name and login of each group member is in the README file.
As usual, there are other files used in the assignment that are symbolically linked to your directory or are included from /afs/cs.uwm.edu/users/classes/cs654/include/PA4. You should not modify these files. Almost all of these files have been described in previous assignments. The exceptions are ast.flex and ast.y, which implement lexical analysis and parsing for a textual representation of Cool ASTs. Recall that there are two versions of coolc: a normal executable and a shell script that glues separate parser, semantic analysis, and code generation phases together via pipes. Each phase uses dumptype to print the AST to the pipe, and the next phase uses the AST parser to reconstruct the tree data structure.1
All software supplied with this assignment is supported on both linux and sparc machines. Remember to run make clean if you switch architectures.
You will need a working scanner and parser to test your semantic analyzer. There is a script mysemant1 that combines a scanner, parser, and a partial semantic analyzer (the parts you do not do) with your partial semantic analyzer. See the README file for instructions on how to use either your own components or the components from coolc. It is wise to test your semantic analyzer with the coolc scanner and parser at least once, because we will grade your semantic analyzer using coolc's scanner and parser.
We have provided you with a -s flag for debugging you semantic analyzer; see the README file for instructions on how to use this flag.
As a result of assignment 3, your parser builds abstract syntax trees. The file dumptype.cc illustrates how to traverse the AST and gather information from it. This algorithmic style--a recursive traversal of a complex tree structure--is very important, because it is a very natural way to structure many computations on ASTs.
Your programming task for this assignment is to 1) traverse the tree, 2) manage various pieces of information that you glean from the tree, and 3) use that information to enforce the semantics of Cool. One traversal of the AST is called a ``pass''. It is possible to do this part of the assignment in a single pass over the tree.
If you need new data members and member functions, add them directly in the cool-tree.h file.
A major portion of any semantic checker is the management of names. The specific problem is determining which declaration is in effect for each use of an identifier, especially when names can be reused. For example, if a let expression declares the variable i twice, then wherever i is referenced the semantics of the language specify which declaration is in effect. It is the job of the semantic checker to keep track of which declaration a name refers to.
As discussed in class, a symbol table is usually used to manage names. We provide a module to manage symbols and you are free to do with it as you like. Our module allows you to enter, exit and augment scopes as needed. You will need multiple symbol tables and multiple kinds of symbol tables. Our solution uses four kinds of symbol tables: classes, methods, attributes and objects.
The semant.h file we start you with includes partial definitions of
two classes: ClassTable and Environment.
The first holds the class namespace (which is global and flat).
Objects of the Environment class
are intended to hold the object namespace plus has access to the other
namespaces of the referencing environment indirectly:
an environment has a pointer to the class table.
Methods and attributes need to be stored in tables too, but we will
defer that part until the next assignment.
Besides the identifier self, which is implicitly bound in every class, there are four ways that an object name can be introduced in Cool:
Type names are used in six different kinds of places in the AST:
NULL for
this attribute.
Remember that neither classes, methods, nor attributes need be declared before use. Think about how this affects your analysis.
Type checking is another major function of the semantic analyzer. The semantic analyzer must check that valid types are declared where required. For example, the return types of methods must be declared. Using this information, the semantic analyzer must also verify that every expression has a valid type according to the type rules. The type rules are discussed in detail in the CoolAid and in lecture.
One difficult issue is what to do if an expression doesn't have a valid type according to the rules. First, an error message should be printed with the line number and a description of what went wrong. It is relatively easy to give informative error messages in the semantic analysis phase, because it is generally obvious what the error is. We expect you to give informative error messages. Second, the semantic analyzer should attempt to recover and continue. A good semantic analyzer will avoid cascading errors using any of several standard techniques. We do expect your semantic analyzer to recover, and it should also make some attempt to avoid cascading error messages. One technique is to assign a special type (compatible with all types) to any expression that cannot otherwise be given a type (we used this method in coolc).
The purpose of the semantic analyzer is not only to check for errors
but also to decorate the abstract syntax tree. All the attributes are defined
in cool-tree.h as private fields.
The following table
shows what attributes the code generator needs.
The ones marked with a footnote are already done for you.
The remaining ones must be assigned (with a minor exception concerning
binding).
All of these attributes are pointers of one sort or another. Many of
them are pointers to the AST nodes referred to by name in the node.
The ``bindings'' are pointers to VarBinding objects which describe
the different kinds of variable bindings. You need to compute the
bindings of those uses that do not refer to attributes.
The latter ones are already assigned for you.
So before doing name resolution for identifiers, check to see if the
binding is already non-null in which case you should leave it
unchanged.
The type of every
expression is a symbol from the ``id'' symbol table. The type of
no_expr nodes should be the symbol _no_type, which can
be accessed using global variable No_type.
The type of null nodes should be the symbol _void, which
can be accessed using global variable Void.
| constructor or phylum | attribute | meaning |
| program | obj_class | AST for class Object |
| int_class | AST for class Integer | |
| str_class | AST for class String | |
| bool_class | AST for class Boolean | |
| class | super |
AST for super class (or NULL for Object) |
| method | overrides |
AST of method being overridden (or NULL) |
| return_of_class | AST for class of return type | |
| attr | of_class | AST for class of type |
| formal | of_class | AST for class of type |
| branch | of_class | AST for class of type of branch variable |
| type | symbol for type of value of expression in branch | |
| Expression | type | symbol for type of value of expression (or NoType) |
| let | of_class | AST for class of type of declare variable |
| object | binding | description of variable being used |
| assign | binding | description of variable being assigned |
| dispatch | method |
AST for (statically selected) method |
| static_dispatch | method |
AST for method selected |
| new_ | of_class | AST for class being instantiated. |
For incorrect programs, the output of semantic analysis is error messages. You are expected to recover from all errors that this part of the semantic analyzer finds. You do not need to test your program on programs with inheritance-related errors. You are expected to produce complete and informative error messages. Assuming the inheritance is used correctly, the semantic checker should catch and report all semantic errors in the program. When in doubt, use coolc as a guide in determining what informative error messages should say. Your error messages need not be identical to those of coolc, but making them identical will make it easier for you to see if you have correctly implemented tests (and will make it easier to grade your project).
We have supplied you with a simple error reporting method semant_error. This routine takes a filename and the AST node where the error occurred, and it returns the error stream after it has printed an error header. The filename should be the file in which the error occurs. The Class_ nodes store the file in which the class was defined (recall that class definitions cannot be split across files). In an error message, the line number of the error message is obtained from the AST node where the error is detected and the file name is obtained from the enclosing class. We have provided you with a version of semant_error in class Environment which you should find useful. We have also provided you with semant_warning to make it easy to generate warning messages that do not prevent compilation. You are not required to implement any warnings.
For correct programs, the output is a decorated abstract syntax tree. You will be graded on whether your semantic phase correctly annotates ASTs with types and on whether your semantic phase works correctly with the coolc code generator. You may wish to compare your results with the solution:
grid.cs: coolc -Pps xxx.cl > xxx.sol-out grid.cs: mysemant1 xxx.cl > xxx.out grid.cs: diff xxx.sol-out xxx.out
You are also expected to program in good, structured style. You should spend some time thinking about the class definitions you will use.
In summary, your partial semantic analyzer should find all errors in
method bodies except for method lookup errors, and also find errors in
attribute types and method return types. You must set the
following attributes:
the ones named something_class, binding and
type, as described in the table.
The semantic analysis phase is by far the largest component of the compiler so far. Even divided into two parts, it is easy to get lost when doing it. Ask yourself: