Midterm #1: Sample Questions

This page includes questions by the instructor (John Boyland) At least one of the (appropriate) questions will appear on the first midterm. The actual midterm will have four questions, although some may consist of a set of short-answer/review questions.

Review Questions

Any of the review questions from Chapters 1, 2 or 4 may appear on the midterm.

Definitions and Short Answers


Compilers and Interpreters

Use T and I diagrams to show how one can run a Cool program life.cool given the following artifacts:

  1. life.cool
  2. coolc.jar (a JVM program for compiling Cool to MIPS)
  3. spim.c (a C program to interpret MIPS code)
  4. gcc.x (a working C compiler)
  5. java.x (a working JVM interpreter)
Please also show the T or I diagrams of any intermediate artifacts. Make sure you only run programs on the "ground" (presumably ARM64) or balanced firmly on a running interpreter. You can't run anything that is an input of a compiler!

Regular Expressions

Construct a regular expression that matches the language of all strings of ones and zeroes which contain exactly 3 zeroes.

Automata for regular expressions

Give the NFA construction for the following regular expression. Then find a DFA for the same language: (0*1+)*0*

Same as above for (aa|bb)*.

Construct an automaton for the following regular expression for Ada integers. A nondeterministic automaton is OK.


Regular expressions for automata

Give a regular expression for the following automata where the set of states is {A,O}; A is the start state and O is the only final state. The transitions are written as triples:

The automaton can go from from to to when the input is any character in the set given ([ ] denotes a space character):
  1. First Automaton
  2. Second Automaton (where "e" means epsilon)

Constructing parse trees

Given the following grammar

a -> t + t
t -> f * f | f
construct a parse tree for 3*5+7. (What additions would be necessary to this grammar to let it parse 3*5*7? Is the grammar ambiguous now ?)

LR parsing tables

Given the following grammar, give the LR(0) parse tables (the characteristic finite-state machine (CFSM)):

S -> E
E -> E '+' E
E -> 'a'

List all the LR items associated with the following grammar. Then construct the CFSM.

      S -> expr
   expr -> expr '*' primary | primary
primary -> '7'


Floating point numbers

It has been proposed to add floating point numbers to Cool. The proposed form for floating point constants is that a single dot occurs anywhere within a non-empty string of digits (including at beginning or end) and that immediately following may appear an exponent of the following form: an ``e'' either upper or lower case, then an optional plus or minus sign and then a nonempty string of digits. Write a regular expression suitable for flex that matches exactly this set of floating-point constants. You may use flex definitions.

Cool ``if'' expressions

It has been proposed to making the else part of a conditional optional for Cool. Write the modified context-free grammar production(s). Do we get the ``dangling else'' problem? Explain!

Give a LR(1) context-free grammar for Cool match expressions.

Arrays in Cool

Write the extra productions to the syntax that would be required to use a nicer syntax for arrays in Cool. Make sure your syntax can handle expressions such as

	a[1] := a[i] + 1
What new tree node constructors would we need? Give their C prototypes, and give the bison actions for your new syntactic rules.


Regular Expressions

The following regular expression was proposed for C comments:
	"/*" [*]* [^*]* [*]* "*/" 
but it didn't match the following comment:
/* Here Y=2*x+1; */
and it ate up this entire line as a single comment:
/**/ y = "Hello, world\n"; /**/

Explain why the first comment isn't matched and the second ``comment'' was matched. Then fix the problem.

Some languages (such as Cool) have line comments; others (such as C) use bracketed comments (see above); C++ has both kinds. Compare and contrast the benefits and problems of each kind. In your answer touch on the following issues:

Does there exist a regular expression which describe non-empty strings containing the same number of a's and b's and no other characters? What about a context-free grammar ?

Finite State Automata

Describe the following DFA in both plain (short!) English and as a regular expression. There are four states {A,B,C,D}; A is the start state and D is the only final state:



Given the following grammar

  expr -> term | expr '+' term
  term -> factor | term '*' factor
factor -> INT | '(' expr ')'
is + left or right associative? What about *? Does * have higher or lower precedence than + ? Rewrite the grammar so that it accepts the smae language, but the associativity of + is reversed.


In language Drool, we have two operators: the dispatch operator ., and the assignment operator =, which is right-associative. The former has a higher precedence than the latter. Give a non-ambiguous CFG so that expressions such as the following are parsed correctly:

z = x = obj1.f1().f2()

Given the following context-free grammar

stmt : if_then | if_then_else | other_stmts
if_then : IF condition THEN stmt
if_then_else : IF condition THEN stmt ELSE stmt
You may assume that nonterminals other_stmts and condition have already been defined.
  1. Show that this context-free grammar is ambiguous.
  2. Rewrite the grammar so that it is not ambiguous, but still accepts the same language

Prove that the following grammar is ambiguous. Write a grammar that accepts the same language that is not ambiguous. Is the language inherently ambiguous? Is it regular?

S -> z S | Z
Z -> | Zz

LL Parsing

Given the following grammar:
stmtlist -> stmtlist stmt | stmt
stmt -> noun iverb adv | noun tverb noun
noun -> COMPILERS | I | YOU
iverb -> SLEEP | WORK
adv -> HARD | WELL
tverb -> LOVE | HATE | SEE
Is it LL(1) ? LL(k) for any k ? If so, give predict sets. If not, how can it be corrected to be LL(k)? LL(1)?

LR Parsing

Given the context-free gramar

expr1 : expr2 ':'
      | expr1 ';'
expr2 : expr2 '%' ID
      | ID
Is this grammar LL(1), LR(1).

Given the following grammar fragment for Cool:

expr : ...
     | IF '( expr ')' expr ELSE expr
     | expr '+' expr
     | ...
When the Cool grammar is processed by bison we get a shift-reduce conflict for '+':
state 146

    expr  ->  IF '(' expr ')' expr ELSE expr . (rule 35)
    expr  ->  expr . '+' expr 

    '+' 	shift, and go to state 92

    '+' 	[reduce using rule 35 (expr)]
    $default	reduce using rule 35 (expr)

Give a sample Cool expression that will lead the parser into such a state when parsing. In this case, what is the correct action for the parser to take?

Describe the ``dangling else'' problem of Pascal and C/C++. Is the problem we see here an instance of the ``dangling else'' problem?

If we don't correct the problem, how will bison resolve the conflict? Will this achieve the correct result for all inputs?

How can the problem be fixed if we do not rely on bison's default conflict resolution?

Attribute Grammars

Constant evaluation

Suppose we wish to write an attribute grammar to do constant evaluation on Cool ASTs for the purpose of replacing expressions with literals at compile time. A design of such an attribute grammar with two synthesized attributes isc (true if the expression is a constant boolean or integer) and val (the value if it was constant, where 0 = false and 1=true for boolean expressions). Write the rules for the following COOL AST nodes:

Evaluate your attribute grammar over the following example. Draw the AST and show the value of each attribute on every node. Make sure you follow your own rules!
if 3 < 2 then x*(65536/0) else 10*(65536/2) fi