This homework will have you modifying a Bison parser to implement the substring parsing algorithm by Bates and Lavie. The substring parser will be started at the first parse error and will continue until the end of the (Cool) file. You will use the CompSci 654 Cool 2004 parser. Do not distribute this (solution!) parser (or any of the other solution files) to anyone outside the class.
Bison produces table-driven LALR(1) parsers in an output file that has three logical parts:
%{
...%} sections as well as the actions. The former
code occurs at the beginning of the output file, whereas the
action code occurs in a large switch statement near the end of the file.#define'd values used for
keywords and multiple character tokens) and maps it into a
symbol number. The symbol numbers are:
yyrhs+YYLAST inclusive) and gives the default
action for this state. The action is either 0 (error) or a
rule number for a reduction.
YYNTBASE).
This nonterminal is the one resulting from a reduction.
It gives the new state we go to after a reduction.
yypact is indexed by state index and
tells the parser
what to do in this state. If the value is YYFLAG,
then the input is irrelevant, and we use the yydefact
table (see above). Otherwise, the value is added to
the input token (after it is translated into a symbol
index) and used to index into yytable which is
essentially a two-d array where rows are squashed on top of
each other for compression. To make sure we really have
a good entry, the index needs to pass three checks:
yycheck must have the same
input symbol as we were just looking at.
yyn gives the parse action:
-yyn is rule number.
yyn is new state.
yytable.
If the result passes the checks above (except that we check
that yycheck of the index is the state), the value
in the table is the new state. Otherwise, we use the
default goto table (see above).
yypact and yypgoto above).
/* yyresume: */
Read the paper by Bates and Lavie. Implementing the algorithm will require new data structures as well as code that uses them and the tables described above. You will need to declare structures to holds nodes, root lists and state sets. Use the simplest structures you can think of. You may use the STL if you like. Do not attempt to optimize.
Substring parsing is an
error-recovery technique. It should not be used unless a parse
error happens. Right after the point where an error message
is printed, at yyerrlab1, call your substring parser
which should start parsing with the token that caused the
error. Continue parsing (without invoking any actions). Whenever an
error occurs, start another parse with the token that caused the
previous parse to fail. When this is done, call YYABORT.
As an exception, if a token is read that does not occur anywhere in the grammar, then this token should be skipped before starting the substring parsing. The detection of this condition should be done in the startup-phase of the substring parser. When it happens, a non-empty string of such illegal tokens should be collected and accepted as a special-case substring with a special error message before normal substring parsing continues.
Your code will need to be inserted in bison.substring, copied from the normal bison.simple. The first task your substring parser will do is build the long goto table. Then it can start the process.
Get set up by running
make -f /afs/cs.uwm.edu/users/classes/cs754/src/homework2/MakefileThis command will copy in everything you need. It will create a file bison.substring that is a copy of the standard bison.simple. You will need to edit it to add the substring parser. Then make parsetest to get an executable. Test it on the examples that will be installed in
/afs/cs.uwm.edu/users/classes/cs754/src/homework2/.Report how big the forest stack gets: specifically
The .a libraries include the following relevant objects:
DEBUG defined and BASIC_COOL
defined to cs754/lib/basic.cl).my-own-libs that does this.
It assumes the ar command takes -cr flags.
You may need to modify it for your brand of Unix.
You should turn in the report in hard-copy, and leave the files
(especially bison.substring) in your AFS volume.