CS 654: Introduction to Compilers Spring 2008

Programming Assignment 3
Due Thursday, March 6

Introduction

In this assignment you will write a parser for Cool. The assignment makes use of two tools: the parser generator bison and a package for generating and manipulating trees. You will not use the generating part of the package, but only the manipulators of a (already generated) tree definition we will give you. The output of your parser will be an abstract syntax tree (AST). The AST will be constructed using bison's semantic actions.

You certainly will need to refer to the syntactic structure of Cool, as described in the CoolAid reference manual. There is a manual for bison on the web page. There is also a section in the recommended textbook on yacc, which is a close relative of bison. The tree package is described in the Tour handout. You will need the tree package information for this and future assignments.

There is a lot of information in this handout and you need to know most of it to write a working parser. Please read the handout thoroughly.

Files and Directories

To get the assignment type

make -f /afs/cs.uwm.edu/users/classes/cs654/assignments/PA3/Makefile
in the PA3 directory of your AFS volume. This command copies a number of files to your directory, some of them with read-only permission. As usual, you should not modify files that are read-only. If you insist on making your own versions of these files, you are on your own. If your code doesn't compile with the original files (including the original Makefile) you will receive little credit, because your program will fail all of the automatically run test cases. Please read and follow the directions in the README file.

The files that you need to modify are:

The files that your code will link with and include, but that you should not modify, are:

Important: All software supplied with this assignment is supported on weise and pabst. Remember to run make clean if you switch architectures.

Testing the Parser

You will need a working scanner to test the parser. You may use either your own scanner or the coolc scanner. See the README file for instructions on how to build a working parser either way. Don't automatically assume that the scanner (whichever one you use!) is bug free--latent bugs in the scanner may cause mysterious problems in the parser.

Note that parser has a -p flag for debugging the parser; using this flag causes lots of information about what the parser is doing to be printed on stdout. In addition, the -l flag is available for generating debugging output from the scanner. Using these two flags together is sometimes useful for isolating strange behavior in the parser.

Once you are confident that your parser is working, try make mycoolc to build a complete compiler that includes your parser. You should test this compiler on both good and bad inputs to see if everything is working. Remember, bugs in your parser may manifest themselves anywhere.

Your parser will be graded using our lexical analyzer. Thus, even if you do most of the work using your own scanner you should test your parser with the coolc scanner before turning in the assignment. Leave you code in the PA3 directory of your AFS volume to turn in your assignment.

Parser Output

Your semantic actions should build an AST for the input. The root (and only the root) of the AST should be of type Program. The semantic action for the start symbol assigns the root of the AST to a global variable ast_root (also of type Program, of course). For programs that parse successfully, the output of parser is a listing of the AST.

For programs that have errors, the output is the error messages of the parser. We have supplied you with a function yyerror in cool.y that prints error messages in a standard format; please do not modify this function. If you write error productions (see page 53 of the textbook) to handle things such as missing semicolons, you should call yyerror with an informative message. You should not invoke yyerror for rules that use the error nonterminal because the bison-generated parser automatically invokes yyerror when it detects a problem. Do not use yyerrok or yyclearin.

Your parser need only work for programs contained in a single file--don't worry about compiling multiple files.

Error Handling

You should use bison's error pseudo-nonterminal to add error handling capabilities in the parser. The purpose of error is to permit the parser to continue after some anticipated error. It is not a panacea and the parser may become completely confused. See the bison documentation for how best to use error. In your README, describe which errors you attempt to catch. Your test file bad.cl should have some instances that illustrate the errors from which your parser can recover. To receive full credit, you parser should recover in at least the following situations:

You probably should only end up with four or five uses of the error nonterminal. Each rule that matches error must construct some legal tree to assign to $$, but the result will not be used, so it needed be anything interesting, or related to the input; it simply must be a legal tree of the legal type.

Do not be overly concerned about the the line numbers that appear in the error messages your parser generates. If your lexer is working correctly, the line number will generally be the line where the error occurred. For erroneous constructs broken across multiple lines, the line number will probably be the last line of the construct.

Do not write any rule using the ERROR terminal.


The Tree Package

There is an extensive discussion of the APS tree package for Cool abstract syntax trees in the Tour. You will need most of that information to write a working parser. Unfortunately, the handling of native was omitted. The three cases are as follows:

The fact that the native keyword is permitted only in basic.cl is handled by our scanner. The parser should not have any restrictions on the use of native.

Remarks

You may use bison's precedence declarations, but only for operators

%left
left-associative operators
%right
right-associative operators
%nonassoc
non-associative operators
Do not use precedence declarations blindly--do not respond to a shift-reduce conflict in your grammar by adding precedence rules until it goes away. If you find yourself making up precedences for things other than operators in expressions, you are probably doing something wrong.

You must declare bison ``types'' for your non-terminals and terminals that have attributes. For example, in the skeleton cool.y is the declaration:

%type <class_> class
This declaration says that the non-terminal class has type <class_>. The use of the word ``type'' is misleading here; what it really means is that the attribute for the non-terminal class is stored in the class_ member of the union declaration in cool.y, which has type Class_. By specifying the type
%type <member_name> X Y Z ...
you instruct bison that the attributes of non-terminals (or terminals) X, Y, and Z have a type appropriate for the member member_name of the union. All the union members and their types have similar names by design. These names are also often the same as the names of nonterminals. Do not let these similar names confuse you.

It is critical that you declare the correct types for the attributes of grammar symbols; failure to do so virtually guarantees that your parser won't work. You do not need to declare types for symbols of your grammar that do not have attributes. The g++ type checker complains if you use the tree constructors with the wrong type parameters. If you ignore the warnings, your program may crash when the constructor notices that it is being used incorrectly. Moreover, bison may complain if you make type errors. Heed any warnings. Don't be surprised if your program crashes when bison or g++ give warning messages.



John Tang Boyland 2008-02-14