Homework # 6
due 11/30/04

Static virtual function resolution

Virtual method calls can contribute substantial overhead to the execution of a program in a highly object-oriented language, such as Cool. For this assignment, you will implement two different kinds of static virtual function resolution by determining what are the possible classes of the receiving object. At a dispatch, we see if the limited set of classes mean that only one possible method is the result. At the same time, we will check to see if the receiver could be void. If the receiver will never be void and the target of the dispatch can be statically resolved, a dispatch node will be converted into a static dispatch. In a real compiler, we would want to make use the cases when only one of the two conditions can be verified, but for here, it is easiest to require both (since the code generation of static dispatch assumes that the receiver cannot be void).

Similarly, with the information about what classes an expression can evaluate to and whether void is possible, one can prune a case expression to remove branches that will not execute.

Class Hierarchy Analysis

Class hierarchy analysis uses the static type of the receiver to do static virtual function resolution. If none of the classes under the static type override the method, it can be statically resolved. In particular, if the static class of the receiver has no subclasses, it can be statically resolved.

However, as mentioned above, we also require that the receiver be guaranteed non-void in order to convert it into a static dispatch. For this task, we will use a simple local analysis:

0-CFA

(Not required for undergraduates.)

This analysis determines the set of types (classes) each expression could evaluate to using an iterative whole-program analysis. The $S(x)$ sets described below will need to be saved and then the inequations below are iterated until there are no changes in these sets. The $E(e)$ sets will be computed on-the-fly using (recursive) virtual methods.

Let $\mu = M(C,m)$ be the method named $m$ selected for an object of dynamic type $C$. Let $\sigma_\mu = \mathtt{self}$ be the defining occurrence of self in method $\mu$. Let $D(C)$ be the class/void for the default value of class $C$. We have $D(C) = C$ for $C\in\left\{{\mathtt{Integer},\mathtt{String},\mathtt{Boolean}}\right\}$ and ``void'' for all other classes. Let $S(x)$ be the least set for each defining occurrence $x$ (we assume all defining occurrences are unique), and $E(e)$ be the least set for each expression $e$ in the program such that $S$ and $E$ satisfy the following conditions:

To get the analysis started, we note that the constructor for class Main will be called:

\begin{displaymath}
S(\sigma_{\mu}) \supseteq \left\{{\texttt{Main}}\right\}
\end{displaymath}

where $\mu$ is the constructor for class Main).

For each attribute $\alpha=x:T$ in each class $C$:

\begin{displaymath}S(\alpha) \ni D(T) \end{displaymath}

For each expression body $b$ of method $\mu$ of class $C$ inheriting from class $C'$: if $S(\sigma_\mu)$ is non-empty (the method can be called) we have

\begin{displaymath}S(\mu) = E(b) \end{displaymath}

. Then (again, if the method could be called) for each expression $e$ somewhere within $b$:

$x$

\(E(e) = S(x)\)

$x$ := $e'$

\(S(x) \supseteq E(e')\), \(E(e) = E(e')\)

super.$m$($e_1$,...,$e_n$)

Let $\mu' = M(C',m)$ be $m$($x_1$:$T_1$,...,$x_n$:$T_n$):$T$ is $e'$ end; in \(S(\sigma_{\mu'}) \supseteq S(\sigma_\mu)\), \(S(x_1) \supseteq E(e_1)\), ... \(S(x_n) \supseteq E(e_n)\), \(E(e) = S(\mu')\)

NB: internally, a super calls will be represented by a static_dispatch node. The method $\mu'$ is known from the binding, you don't need to do the lookup $M(C',m)$.

Make sure that you use the stored value of $S(\mu')$ rather than computing it on the fly, which can lead to infinite recursion.

$e_0$.$m$($e_1$,...,$e_n$)

For each $c \in E(e_0)$ with $c \neq \texttt{void}$, let $\mu' = M(c,m)$ be $m$($x_1$:$T_1$,...,$x_n$:$T_n$):$T$ is $e'$ end; in \(S(\sigma_{\mu'}) \supseteq \left\{{c}\right\}\), \(S(x_1) \supseteq E(e_1)\), ... \(S(x_n) \supseteq E(e_n)\), \(E(e) \supseteq S(\mu')\)

NB: for dispatch nodes, the method binding is not useful. You will need to re-implement method lookup $M(C,m)$. And as with static_dispatch, you should use the stored value of $S(\mu')$.

new $c$($e_1$,...,$e_n$)

Let $\mu' = M(c,c)$ be $m$($x_1$:$T_1$,...,$x_n$:$T_n$) is $e'$ end; in \(S(\sigma_{\mu'}) \supseteq \left\{{c}\right\}\), \(S(x_1) \supseteq E(e_1)\), ... \(S(x_n) \supseteq E(e_n)\), \(E(e) = \left\{{c}\right\}\)

NB This case actually doesn't occur. It is folded into dispatch. Your analysis should always determine that these calls can be statically resolved.

if $e_0$ then $e_1$ else $e_2$

\(\mathit{ignored} = E(e_0),\) \(E(e) = E(e_1) \cup E(e_2)\)

while $e_0$ loop $e_1$ pool

\(\mathit{ignored} = E(e_0) \cup E(e_1),\) \(E(e) = \left\{{\mathtt{void}}\right\}\)

case $e_0$ od when $x_1:T_1$ do $e_1$ od ... when $x_n:T_n$ do $e_n$ od

For each $c \in E(e_0)$ with $c \neq \texttt{void}$, if exists $T_i = \min_{i} \left\{{T_i \mid c \leq T_i}\right\}$, then $E(e) \supseteq E(e_i)$ and $S(x_i) \supseteq \left\{{c}\right\}$. (Don't traverse branches unless they are selected!)

$e_1$; ...; $e_n$

\(\mathit{ignored} = E(e_i),\) \(E(e) = E(e_n)\)

declare $x$:$T$; begin $e_2$ end

\(S(x) \ni D(T)\), \(E(e) = E(e_2)\)

declare $x$:$T$ := $e_1$; begin $e_2$ end

\(S(x) \supseteq E(e_1)\), \(E(e) = E(e_2)\)

$e_1$ $o$ $e_2$

For $o \in\left\{{\mathtt{+},\mathtt{-},\mathtt{*},\mathtt{/}}\right\}$ \(E(e) = \left\{{\mathtt{Integer}}\right\}\)

For $o \in \left\{{\mathtt{<},\mathtt{<=},\mathtt{==}}\right\}$ \(E(e) = \left\{{\mathtt{Boolean}}\right\}\)

-$e_1$

\(E(e) = \left\{{\mathtt{Integer}}\right\}\)

not $e_1$

\(E(e) = \left\{{\mathtt{Boolean}}\right\}\)

$\mathit{literal}$

If literal is a string constants, \(E(e) = \left\{{\mathtt{String}}\right\}\). Similarly with boolean, integer and void literals.

native

This term may come one of the following sources:

Class Method $E(\mathtt{native})$
Object Object \(S(\sigma_{\mu})\)
Object abort \(\left\{{}\right\}\)
Object type_name \(\left\{{\mathtt{String}}\right\}\)
Object copy \(S(\sigma_{\mu})\)
IO out_string \(S(\sigma_{\mu})\)
IO out_int \(S(\sigma_{\mu})\)
IO in_string \(\left\{{\mathtt{String}}\right\}\)
IO in_int \(\left\{{\mathtt{Integer}}\right\}\)
Integer equals \(\left\{{\mathtt{Boolean}}\right\}\)
Boolean equals \(\left\{{\mathtt{Boolean}}\right\}\)
String equals \(\left\{{\mathtt{Boolean}}\right\}\)
String concat \(\left\{{\mathtt{String}}\right\}\)
String substr \(\left\{{\mathtt{String}}\right\}\)
Array Array \(S(\sigma_{\mu})\)
Array resize \(S(\sigma_{\mu})\)
Array get_item \(S(\texttt{obj})\cup\left\{{\mathtt{void}}\right\}\)
Array set_item \(S(\sigma_{\mu})\)

The occurrence of \( S(\texttt{obj}) \) in the rule for get_item refers the formal parameter of this name in set_item.

Implementing Optimizations

You will be given a skeleton optimizer that accepts analysis options. You should implement static class resolution and should use it to perform the following changes:

when clauses
If a ``when'' clause is impossible, it should be removed.
dispatch
If a dispatch is statically resolvable (only one possible method and the receiver cannot be void), it is replaced by a static dispatch. The binding will be to the method that the static resolution determines.
methods
If 0-CFA determines that a method will never be called, no code needs to be generated for it. As a close approximation to this, the body (of a non-native method!) should be replaced with a null node. (Not a no_expr node because that would mean it was native, and would get link errors.)
The skeleton optimizer mentions other optimizations (inlining and dead-let removal), but limit yourself to these. It sets the resolve_level global variable to select the kind of resolution desired. Do nothing is the level is 0. If it has value ``1'' (or any non-zero value for undergraduate students) then do CHA resolution. If it has anything higher for graduate students, do 0-CFA.

Optimization at the AST level requires changing the AST for expressions. This can be done by having a recursive function for optimizing expression ASTs that returns a new AST (or perhaps the same AST) and callers update their children using these routines.

The 0-CFA analysis requires that a set be stored for every $S$ set: We associate with every formal parameter, every attribute, every let bound variable, every `when' bound variable and every method body a set of classes, initially empty. The $E$ sets can be computed by recursive member functions. Each set may also contain an indication that the variable could be void. Then we make repeated passes over the program in which we compute the set for every expression and use this to update the $S$ sets involved. Once a fixed point is reached, we can use the information for a receiver to determine whether it is statically dispatchable. The analysis must be implemented to carry along possible void-ness as well.

Testing & Measurement

You need to write a test case (as one big file test.cl, for easy of testing) that checks for bugs in your implementations. Optimizations are very easy to do wrong, so testing is important.

Evaluate the results of optimization (or the non-optimized ``control'' case) in two ways:

You should use the benchmark files and inputs in

/afs/cs.uwm.edu/users/classes/cs754/src/homework6/benchmark
(For simplicity, each benchmark is a standalone Cool program.)
Please create a readable table summarizing your results in README.

Deliverables

To get started, type


make -f /afs/cs.uwm.edu/users/classes/cs754/src/homework6/Makefile
This will get a number of files. The files you will edit are:
cool-tree.h
Add fields and methods to AST nodes.
optimize.h
Define data structures and non-member functions.
optimize.cc
Implement the optimization. We include code to create class and method tables. This code will be useful, but you will need to call init_class_table yourself.
myopt
Change to test different options. (Undergrads will not need to change this file.)
test.cl
Test case for CHA and 0-CFA.
README
Your results.



John Tang Boyland
2004-11-18