Abstract Twelf is an implementation of the LF logical framework proposed as a vehicle for writing machinechecked proofs of type system theorems. This paper serves several purposes: it records experience from an outsider trying to learn the system; it contains an introduction to using Twelf; it can serve as a tutorial for learning Twelf. 
The “POPLmark” challenge is looking toward a time when papers presented at conferences such as POPL are routinely accompanied by machinechecked proofs. Having written several typetheory papers with extensive handwritten proofs, the idea of having machinechecked proofs is attractive. In particular, the ability to reuse proofs after minor variations is desirable. However learning to use a proof system is daunting. In this paper, I give my experience in starting to learn how to use Twelf (specifically version 1.5R1). At the time of writing, I had been using Twelf for little more than two weeks. There is a general paucity of information directed to beginning Twelf users interested in proving the soundness of type systems. Thus despite my meager knowledge, I felt that I should share my thoughts with others. I will do my best to correct any errors and misleading information in this text pointed out by Twelf masters.
Part of the POPLmark effort consists of posting solutions to the challenge problems. The Twelf solution was not only one of the submissions that came closest to meeting the challenge (remarkably, no solution fully addresses all parts of the challenge) but also was extensively commented.
To a neophyte such as myself, the main contending systems appear to be Isabelle/HOL and Twelf. Of the two, Isabelle/HOL is by far the better documented and user friendly. However, the ability to express proofs in formal language made Twelf much more attractive, whereas from the Isabelle documentation, it appeared a complete proof in Isabelle consists of a list of proof tactices to apply—not the proof itself. It seems that in a theorem proving system, there is less interest in the proof itself than in the fact that it is proved. From other information (in particular the slides by Christian Urban), it appears possible to express proofs in Isabelle’s metasyntax ISAR, but this aspect of Isabelle is, if possible, documented even worse than Twelf. The proofs produced by ISAR look more like natural language proofs, with all the benefits and problems that leads to. On the other hand, Twelf proofs look like Prolog programs. Perhaps the programmer in me is attracted to the latter, but in any case I first started investigating Twelf.
Twelf comes wih a “user guide” which includes more information than one is likely to find elsewhere. At the time I write this, the user guide has not been updated for Twelf 1.5 and so I used the user guide for Twelf 1.4.
The user guide acknoweldges that both it and the Twelf implementation are research products. The user guide was apparently written to serve as an introduction as well as reference manual, but shows signs of decay: the formal reference elements are actively maintained but the explanatory material is changed only when it is actively contradicted. Thus it would have one believe that theorem proving (as opposed to theorem checking) is a recent experimental addition to Twelf when it dates at least to Twelf 1.2 (1998), and apparently predates the theorem checker (first in Twelf 1.4) (2002).
The Twelf web page http://www.cs.cmu.edu/~twelf also contains links to Pfenning’s book Computation and Deduction and a tutorial by Appel. The tutorial show how to use the Twelf system to prove theorems in a logic embedded in Twelf, rather than showing how to use Twelf to prove theorems directly. The book has some valuable information and exercises in it but includes little on using either the theorem checker or theorem prover. There is also a Twelf Wiki http://fp.logosphere.cs.cmu.edu/twelf, at which questions can be posted and often get answered quickly. My thanks to all those who have provided introductory material for Twelf. My dependence on them and on Pierce’s Type Systems for Programming Languages will be visible in my examples.
If one wishes to use Twelf to prove type safety theorems, one will need to write the following aspects:
Twelf elegantly (but confusingly) uses the same syntax to define both parts of all three aspects (six logically separate entities):
name : type.
The Twelf theorem prover does support a special syntax for expressing the statement of theorems, and from experiments, it appears this syntax can also be used for manually constructed proofs. On the other hand, I have been informed that the syntax is subject to modification at any time and its use should not be encouraged.
Twelf as an executable system is most conveniently invoked from within Emacs, although it can also be executed from within SML. I was initialized confused because the “twelf server” doesn’t actually accept Twelf syntax. Rather Twelf syntax must be read indirectly, for example, using Emacs. The “twelf server” does has the ability to run “queries” interactively, as explained below.
To use this document as a tutorial, install Twelf according to directions (including installing the Emacs mode). Then create a fresh directory for sample files and create an empty file named source.cfg. Save it and then visit a file “test.elf” in which to write Twelf code. Start the server using MetaX twelfserverconfigure. Consent to any questions Emacs asks about configuring the Twelf server.
A new abstract nonterminal N is declared by writing “N : type.” where type is the special type for types. New productions for this nonterminal are declared as functions that return this nonterminal. Thus, a simple expression language can be declared as follows:
term : type.
true : term. false : term. if : term > term > term > term. zero : term. 
The zero term is included to ensure that the toy language has nontrivial typing. In Twelf, constants (everything declared so far) should not start with capital letters but may include a variety of symbols other than colons, periods or any kind of parenthesis. It would be legal to use 0 (ASCII zero) instead of zero.
To continue the tutorial, type in this “grammar” from memory in the Emacs buffer and then press ControlC ControlS. If you make a mistake in the syntax (such as forgetting a period), you will get an error and can use ControlC backquote to have the error highlighted. Correct the error and press ControlC ControlS again, until no errors are noted.
We also need to define the abstract syntax of types. The type system is very simple and thus can be declared as follows:
ty : type.
bool : ty. int : ty. 
Of course we can’t use type as the type of our minilanguage’s types!
On a sidenote: Twelf doesn’t include polymorphism, which can be annoying at times: every new kind of list type must be redeclared.
For the tutorial, add the declarations of the types to your Twelf file from memory.
As well as declarations, Twelf also permits one to write definitions. Definitions at the syntactic level function as syntactic sugar. For example, in order to add syntactic sugar for “and,” one can write:
%abbrev and : term > term > term = [B1] [B2] if B1 B2 false.

The squarebracketed variables are lambda bound variables. Variables should start with capital letters (although there are ways to avoid this).
The %abbrev tag is a strange requirement that I haven’t fully understood. Basically it means that the name “and” will be immediately replaced by its definition wherever it is used. It is needed if the uses of the variables are not “strict,” according to a technical definition of “strictness.” In this case, the uses are “strict.” I find however that even if all uses are strict, this tag is still obligatory to avoid cryptic errors in the proof checker.
Exercise 1 Write a definition for or. □
The term following the equal sign (=) looks a little bit like a functional language expression. However, without conditionals or recursion, possibilities are limited. In particular operations on terms (such as “size” or sets of free variables) must be defined using relations instead. There are very good theoretical reasons for this restriction which I will not attempt to explain.
The next step in defining a type system is to define the relations (judgments) that one will prove properties about. First one defines the type of the relation itself as a function that produces a “type.” For example, suppose we wish to define a smallstep evaluation relation . The arrow in the relation here is merely a convention, the relation could easily have be written . In any case, the relation consists of two terms and thus we write:
eval : term > term > type.

Typically a relation has several different inference rules (axioms). These may be written in a style reminiscent of Prolog. For example, evaluating an “if” expression has the following rule:
In Twelf (using abstract syntax), we write this as follows
eval/if : eval (if E E1 E2) (if E’ E1 E2)
< eval E E’. 
The inference rule must be given a unique name so that we can refer to it in proofs. Here the name is “eval/if” following a Twelf convention, separating teh relation name from the case name by a slash. The consequent of the judgment is given first and then the antedecents, each preceded by <. In this case, the only antecedent is a “recursive call” to eval. Twelf is also perfectly happy to have the inference rules written the other way around, using >:
eval/if : eval E E’ > eval (if E E1 E2) (if E’ E1 E2)

The original “backward” style is often preferred because it reflects Twelf’s operational semantics: order of evaluation. The forward style however makes proofs more natural since it reflects the order the antecedents are passed to the inference rule.
For the next step in the tutorial, write the declaration of “eval” and its first inference rule by memory and get Twelf to accept the whole file with ControlC ControlS.
Twelf is happy to accept E’ as a variable name. It has no conception that there is any connection between this variable and the variable E. However, for the human reader, meaningful variable names help comprehension. Variables have types. In most cases, Twelf can infer the types automatically, but if not, you can write (E:term) to force E to have type term. In the *twelfserver* window, you can see what type it assigns:
eval/if : {E:term} {E’:term} {E1:term} {E2:term}
eval E E’ > eval (if E E1 E2) (if E’ E1 E2). 
Notice that the variable’s type is presented in an abstraction introduced with curly braces (rather than square brackets). Technically this is because eval/if is given a “dependent type.” For practical purposes, you can just think of it as a way to explicitly indicate the type of a variable. Indeed you can use this syntax to define this inference rule which will then have six parameters rather than just two when you use it in proofs. In the way we originally typed it, the four variables are implicit parameters to the inference rule. We will return to this topic later.
Next we define the evaluation rules for ‘if” where the condition is fully evaluated:
eval/if_true : eval (if true X _) X.
eval/if_false : eval (if false _ X) X. 
These rules have no antecedents. The underscore (_) is an anonymous variable in the style of Prolog.
One can test the evaluation rules using Twelf’s query language. In Emacs, this can be done by visiting the *twelfserver* buffer and typing top at the end. This gets one a Prolog style “?” prompt at which one can type a query (followed by period) such as
? eval (if true false true) X.
Solving... X = false. More? 
One can type y to see what other possibilities there are, or just press return. Here there are no other possibilities. If in tutorial mode, try to see what expressions could evaluate to false. Twelf will give two independent possibilities. To leave the interactive query system in Emacs, press ControlC TAB.
Exercise 2 Add a new production iszero : term > term and evaluation rules for it that reflect the following inference rules:
Next, we define the typing relation: . Again the elements ⊢ and : are conventional, what we really have is a relation with two parts, a term and a (minilanguage) type:
of : term > ty > type.

The only nontrivial type rule is the one for if’s:
of/true : of true bool.
of/false : of false bool. of/zero : of zero int. of/if : of X bool > of Y T > of Z T > of (if X Y Z) T. 
Here I use the “forwards” style. Notice that the repeated use of the variable “T” ensures that each side of the “if” has the same type.
In tutorial mode, write these rules from memory and check that of/if has four implicit parameters.
Exercise 3 The inference rule for iszero is as follows:
Implement this inference rule in Twelf. □
In type soundness proofs based on smallstep semantics, progress means that any welltyped expression (with no free variables, but we don’t have these here) is either already a value or else can be evaluated one step further, where preservation means that if any welltyped expression can be evaluated, the resulting expression is also welltyped with the same type. In order to express progress, therefore, we need to have some notion of a “value,” a subset of the terms.
Since Twelf does not having subtyping, “valueness” must be defined as a relation:
is_value : term > type.
is_value/true : is_value true. is_value/false : is_value false. is_value/zero : is_value zero. 
Next, since the progress theorem says that the term is either a value or can be evaluated a further step, this “or” will need to be converted into an explicit relation that is true in either case:
not_stuck : term > type.
not_stuck/value : not_stuck X < is_value X. not_stuck/eval : not_stuck X < eval X X’. 
It has two cases, no matter how many different evaluation rules or values we have.
Twelf accepts a special syntax for theorems (or lemmas, but they are all called theorems in Twelf). Here is how we can express the statement of the progress theorem:
%theorem
progress : forall* {X} {T} forall {O:of X T} exists {NS:not_stuck X} true. 
Various parts can be omitted, but not repeated and not put in a different order. The keyword forall* has an asterisk because the variables thus mentioned are implicit—when using the theorem, one does not pass the corresponding terms as parameters. On the other hand, the next variable O is explicit with a relation type. The result is an explicit output parameter (result of the theorem) rather than input parameters (requirements of the theorem). Basically the theorem can be seen as a function that takes the input parameter (the typing fact) and returns the output parameter (the fact of nonstuckness). The theory behind Twelf is that if such a function is
then one has a valid theorem. A %theorem declaration succeeds as long as all variables are declared: a theorem is not allowed implicit free variables, implicit variables must be bound by forall*. Success says nothing as to whether the theorem is true or not!
To prove the theorem, one must either invoke the automatic theorem prover or else provide a proof. First, the theorem prover can be invoked using a %prove tag:
%prove 2 XT (progress XT _).

The number “2” is an indication of how hard you wish the theorem prover to try. Generally very simple inductive theorems can be proved with level 2, more complex ones with 3. I haven’t found it necessary to use larger numbers than 3.
In the %prove request, the XT is the variable on which induction should be tried. Its meaning is determined by the “call pattern” (progress XT _) in which the other parameter is elided with an underscore. The natural language equivalent to this request is “Prove the progress theorem by structural induction over the typing derivation.”
In tutorial mode, type in the theorem by memory and ask the theorem prover to prove it. In this case, (at least in Twelf 1.5R1), the theorem prover apparently diverges. After losing patience, press ControlC TAB to stop it. We will return to this theorem later after defining a helper lemma. In the mean time, comment out the %prove request by preceding it with %% (or by putting a space before the “p”, an interesting convention I have seen people use).
The %theorem declaration is essentially equivalent to the following declarations (one can see these in the *twelfserver* window):
progress : {X:term} {T:ty} of X T > not_stuck X > type.
%mode +{X:term} +{T:ty} +{O:of X T} {NS:not_stuck X} (progress O NS). 
The first part is a normal relation declaration. The %mode declaration indicates which variables are input “+” and which are output “”. The inputs are universally quantified (“for all”) and the outputs are existentially qualified (“there exists”). The mode declaration will cause further checks to occur when the proof is written.
The preservation theorem does not require a helper relation (as progress required “not_stuck”):
%theorem
preservation : forall* {X} {X’} {T} forall {O:of X T} {E:eval X X’} exists {O’:of X’ T} true. 
If in tutorial mode, type in this theorem and ask the theorem to prove it, again by induction on the typing derivation. In this case, the call pattern has three explicit parameters. It should succeed quickly.
As we have seen, the theorem prover is not always capable of inferring a proof. In that case, one can either prove the theorem by hand (see next section) or else “assert” the truth of the theorem. Assertions can only be done once you have set “unsafe” mode (type set unsafe true at the end of the *twelfserver* buffer). Assertions are written in the following way:
%assert (progress _ _).

An asserted theorem can be used to (automatically) prove later theorems.
An interesting start point for seeing how to write proofs would be to see the ones produced by the theorem checker. In Twelf 1.2, the theorem prover prints out the proof once it is found. In an unfortunate twist, this highly useful feature is disabled in later versions for technical reasons relating to higherorder syntax. I have been assured that this feature will be reimplemented soon; it is an embarrassment that the theorem prover in a system intended as a foundation for proofcarrying code won’t produce the proofs it creates.
In any case, we still need to discuss the format for a proof. As explained above, a theorem is a relation that is indeed a function from inputs to outputs. The theorem cases are thus very much like the clauses implementing the inference rules, except that we are not interested in the name of the case. Starting with the preservation theorem, we can thus handle the case where evaluation has performed eval/if_true:
 : preservation (of/if (of/true) YT ZT) (eval/if_true) YT.

The single hyphen (“”) is the uninteresting name of the clause (an alternate name might be eval/if_true, using a prefix hyphen to ensure we don’t shadow the previously defined relation case).
After the colon, we have the name of the theorem being proved and its “parameters.” The first parameter is the typing relation. Since we know that the term is an “if” (because the evaluation rule is eval/if_true), the typing relation must use of/if. Now, it would make intuitive sense if the typing relation included the type and the term, but in Twelf, the relation doesn’t include the pieces (its type does!), instead you must state how one knows the relation is true, in this case the antecedents of the TIF type rule. The first antecedent is the typing relation for the condition (the literal term “true” because we are doing the case for eval/if_true). This relation must have been produced by of/true, the only rule to assign a type to true.
The variables YT and ZT stand for the other two typing relations (not the subterms being typed!). Then the proof case states the evaluation rule, which is eval/if_true as explained before. It takes no parameters since the evaluation rule is unconditional, I have placed it (and the earlier unconditional rule of/true) in parentheses to emphasize their lack of parameters, but the parentheses are extraneous and may be omitted. The final parameter to preservation is the output parameter, the typing of the result. Since the result is the second child of the “if” term and its typing is the antecendent already bound as YT, we simply name the variable again. Of course the name YT is merely conventional. I would have chosen Y:T except that colons (unlike hyphens) may not appear in identifiers.
The case of eval/if_false is analogous:
 : preservation (of/if of/false YT ZT) eval/if_false ZT.

The case for EIF is more complex:
 : preservation (of/if Xbool YT ZT) (eval/if X>X’)
(of/if X’bool YT ZT) < preservation Xbool X>X’ X’bool. 
Unlike the previous cases, eval/if has an antecedent, which I have given the suggestive name X>X’. If we closed the rule before the <, we would have a case that had a valid type, but would nonetheless give an error:
Occurrence of variable X’bool in output () argument
not necessarily ground 
If in tutorial mode, try this out by replacing the line starting < with a single period.
The error comes from mode checking. The first two parameters to the theorem preservation are input parameters and thus XBool and X>X’ get values from the inputs, through pattern matching. But the last parameter requires us to “call” of/if. Of its three parameters, the second is not yet determined, hence the error. The required typing relation is obtained using a recursive call to the theorem, that is, an inductive use of the theorem being proved.
In general, one may need to call other theorems as well (and thus have multiple clauses preceded by <). Order is relevant because of mode checking. The backward syntax (that using <) is preferred because then evaluation order coincides with textual order. You will notice however, that Twelf uses the forward syntax when echoing the cases. When trying to troubleshoot errors, you will need to remember to mentally reverse the order when reading the generated output.
Exercise 4 Write the preservation cases for iszero evaluation. □
Once you think you’ve implemented all the cases, you can check that your theorem is “total” (that it covers all cases, and terminates successfully on all inputs):
%worlds () (preservation XT X>X’ %{=>}% X’T).
%total XT (preservation XT _ _). 
The %worlds declaration is required for reasons related to “higherorder abstract syntax.” The “()” is a “trivial” world. In my opinion, the trivial worlds declaration should have been the default. However, since there is no default, I make use of the requirement to state a call pattern by giving suggestive names to the parameters, separating the input parameters from the output parameters by a comment (%{=>}%). I have the habit of writing the %worlds declaration before writing the cases; it helps serve as a reminder of the purpose of the parameters. On the other hand, this habit may run into “autofreezing” problems which you may notice in a version other than Twelf 1.5R1.
The %total command requests that the theorem be checked using all the cases that Twelf is aware of. If you use the Emacs command ControlC ControlD to execute single Twelf lines at a time, then you must be careful to redeclare the theorem before checking the proof again or else any bad cases will still be present.
If in tutorial mode, check that the preservation theorem is total. Fix any mistakes in stating the cases.
The callpattern is used in the same way as for the theorem solver—the XT means that the theorem is proved inductively on the first “parameter.” Twelf only uses structural induction. If you want to write a proof inductively on the size of something, you will need to reify the size as a natural number and then do structural induction on the natural number. See the next section for more information on doing this.
Often the term “structural induction” is used only for structural induction on terms, but as we have seen, relations are basically indistinguishable from terms in Twelf. Again, it’s important to stress that the “children” of a relation are not the related terms, but rather the antecedents in the inference rule that proves the relation.
Twelf supports various kinds of mutual induction among parameters. See the “user guide” for a listing of different kinds of “termination orders.”
Moving ahead to the “progress” theorem, we write the “worlds” declaration and the trivial value cases:
%worlds () (progress XT %{=>}% X>X’).
 : progress (of/true) (not_stuck/value is_value/true).  : progress (of/false) (not_stuck/value is_value/false).  : progress (of/zero) (not_stuck/value is_value/zero). 
If in tutorial mode, type in a worlds declaration (no need to get fancy with the parameters) and these three cases from memory and give them to the Twelf server. Fix any errors.
The case for if must be handled differently. The problem is that we need to recursively check progress on the boolean condition and then depending on the result of this progress (the condition may be a value or may evaluate further) we need to do different actions. Unlike a handwritten proof, Twelf does not permit cases analysis inside another case analysis. Thus we need to perform what is called “output factoring” and declare a helper theorem on the output of our recursive “call”:
%theorem
progressif : forall* {X} {Y} {Z} forall {O:of X bool} {NS:not_stuck X} exists {NS’:not_stuck (if X Y Z)} true. 
Then we declare the trivial worlds for this theorem and give the cases:
%worlds () (progressif XT X>X’ %{=>}% IFX>_).
 : progressif _ (not_stuck/value is_value/true) (not_stuck/eval eval/if_true).  : progressif _ (not_stuck/value is_value/false) (not_stuck/eval eval/if_false).  : progressif _ (not_stuck/eval X>X’) (not_stuck/eval (eval/if X>X’)). 
In no case do we actually use the typing fact. The typing is needed only to avoid the need for a “zero” case. Twelf’s coverage checker will be able to figure out that there’s no way to give zero the type bool.
If in tutorial mode, type in these cases from memory, and then write a %total directive and check we have indeed proved the helper theorem.
Exercise 5 Write the progress proof cases for iszero. You will need to do “output factoring” again (write a helper lemma). □
We can now prove the theorem’s totality:
%total XT (progress XT _).

If in tutorial mode, check that the progress theorem is correct.
The proof checker and theorem prover are basically independent modules in Twelf and neither has knowledge of the other. In other words, the theorem prover is unable to use a theorem proved manually and a manual proof is unable to use theorems proved automatically. This is another embarrassment, only partially mitigated by a new “unsafe” tag %trustme (not in Twelf 1.5R1) for asserting theorems to the totality checker. I believe that rectifying this situation is also high priority to the Twelf developers.
The Twelf user guide mentions “constraint domains” and in particular extensions to support integers and rational numbers directly. This seems very attractive, but fails in a bad way if you are trying to prove theorems: at best one gets cryptic “definition violation” error messages, at worst unsoundness results. Do not use “constraint domains” in conjunction with proofs.
Since Twelf does not currently come with a prooffriendly definition of numbers, one needs to “roll one’s own.” I intend to make available a “signature” of useful theorems concerning natural numbers. The examples here come from that signature. This section no longer has explicit tutorial directions, but the reader is recommended to try out the examples.
Natural numbers are defined with a zero term and a successor term:
nat : type.
z : nat. s : nat > nat. 
We then can define addition and subtraction (the latter an abbreviation):
plus : nat > nat > nat > type.
plus/z : plus z Y Y. plus/s : plus (s X) Y (s Z) < plus X Y Z. %abbrev minus : nat > nat > nat > type = [X] [Y] [Z] plus Z Y X. 
Exercise 6 Define times and divide in a similar way. The case times_s will need two antecedents. □
Then we can prove that addition is associative:
%theorem
plusassociative : forall* {X:nat} {Y:nat} {Z:nat} {X’:nat} {Z’:nat} forall {P1:plus X Y X’} {P2:plus X’ Z Z’} exists {Y’:nat} {Q1:plus Y Z Y’} {Q2:plus X Y’ Z’} true. %prove 2 P1 (plusassociative P1 _ _ _ _). 
I use hyphens to separate words in theorem names to distinguish them from relation cases (separated by slashes) and from multiple word relations (separated by underscores).
Exercise 7 Write out a proof for associativity that the Twelf theorem checker will accept. You will need two cases: one for the case that we X is zero (the addition fact uses plus/z), where we don’t care what form Y (= X’) or Z has; and one for the case that X is not zero, and so the first addition fact must have been produced by plus/s, and thus the second addition fact must also have been produced by plus/s. The second case will use recursion (induction). □
Associativity, as it happens, is much easier to prove than commutativity:
Exercise 8 Write a statement of commutativity of addition and check that the theorem prover is unable to prove it. (One needs two inductive lemmas.) □
Indeed, as things get more complex, the theorem prover gets less useful. I found it was capable to proving most of the facts about addition but failed when attempting to prove facts about multiplication, even though the handwritten proofs were not that much longer. My guess is that the search space simply gets too big.
The associativity theorem presented above yields the addition fact Y+Z=Y’ as an output parameter. Sometimes, however, one already knows this fact. Doing the “obvious thing” will get an “output coverage error” when checking the theorem using the associativity theorem in this way. I will explain this problem with a new theorem that assumes the addition fact in question and uses the original associativity theorem as a lemma:
%theorem
plusassociative* : forall* {X} {Y} {Z} {X’} {Y’} {Z’} forall {P1:plus X Y X’} {P2:plus X’ Z Z’} {Q1:plus Y Z Y’} exists {Q2:plus X Y’ Z’} true. %worlds () (plusassociative* X+Y=X’ X’+Z=Z’ Y+Z=Y’ %{=>}% X+Y’=Z’).  : plusassociative* X+Y=X’ X’+Z=Z’ Y+Z=Y’ X+Y’=Z’ < plusassociative X+Y=X’ X’+Z=Z’ Y’ Y+Z=Y’ X+Y’=Z’. %total {} (plusassociative* _ _ _ _). 
The induction argument {} tells the proof checker there will be no induction. This “proof” passes the type checker and the input coverage checker but fails totality with the following error message:
Occurrence of variable Y’ in output () argument not free

The problem is not the occurrence of Y’ by itself, but the implied occurrence in Y+Z=Y’. This addition fact is produced by plusassociative, but we already have an addition fact. The Twelf system has no way of knowing that the facts are guaranteed to have the same output; the output of a relation is not necessarily determined by the inputs. Essentially, we know Y+Z=Y’ and the associativity theorem tells us Y+Z=Y’’ for a new variable Y’’. We could ignore the fact using the new variable, but then the final result of the theorem (X+Y’’=Z’) would unusable.
One solution is to prove the theorem over again from scratch, which is no great trouble since the theorem is easy to prove. On the other hand, these situations reoccur frequently in my experience. We need a way to show that Y’ and Y’’ are the same. Thus we need a new relation defining equality:
eq : nat > nat > type.
eq/ : eq N N. 
We don’t need to worry about input vs. output coverage here because the relation is not moded, only the theorems about the relations. (It is possible to mode relations too, but I do not recommend that).
Then we need a theorem that addition is deterministic: it can only produce one possibility. (Twelf gives a way to force a moded relation to be deterministic, but the determinism is only available at the level of relations, not at the level of theorems about relations and thus it is useless for proofs.)
The theorem here is made more general as it accepts “equal” addends as well as proving an equal result:
%theorem
plusdeterministic: forall* {N1} {N1’} {N2} {N2’} {N3} {N3’} forall {P:plus N1 N2 N3} {P’:plus N1’ N2’ N3’} {E1:eq N1 N1’} {E2:eq N2 N2’} exists {E3:eq N3 N3’} true. 
This sort of theorem is known as a “uniqueness lemma” in official Twelfspeak.
To prove this theorem, we need some additional lemmas: one that says if N_{1} = N_{2} then sN_{1} = sN_{2} and one for the converse. These proofs are very easy to prove since the only way to create an “eq” relation is one that makes the variables identical:
%theorem
succdeterministic : forall* {N1:nat} {N2:nat} forall {E:eq N1 N2} exists {F:eq (s N1) (s N2)} true. %worlds () (succdeterministic N1=N2 %{=>}% SN1=SN2).  : succdeterministic eq/ eq/. %total {} (succdeterministic _ _). %theorem succcancels : forall* {N1:nat} {N2:nat} forall {E:eq (s N1) (s N2)} exists {F:eq N1 N2} true. %worlds () (succcancels SN1=SN2 %{=>}% N1=N2).  : succcancels eq/ eq/. %total {} (succcancels _ _). 
These sort of proofs might seem strange: if the proof is so trivial, why is there a need for a lemma at all? The lemma is needed because case analysis only happens at the top level. The fact that there is only one case is global information (because someone could add another case at any time).
Exercise 9 Write a proof of plusdeterministic using these two lemmas. □
Next, we use this theorem in our ongoing proof of plusassociative*, but we only get so far:
 : plusassociative* X+Y=X’ X’+Z=Z’ Y+Z=Y’ X+Y’=Z’
< plusassociative X+Y=X’ X’+Z=Z’ Y’’ Y+Z=Y’’ X+Y’’=Z’ < plusdeterministic Y+Z=Y’’ Y+Z=Y’ eq/ eq/ Y’’=Y’ < ??? 
The problem is that we still have not shown than X+Y’=Z’ even though we know X+Y’’=Z’ and Y’’=Y’. The equality fact is not treated specially by Twelf. We need yet another lemma that says that an addition relation can be rewritten with equal variables:
%theorem
plusrespectseq : forall* {M1} {M2} {N1} {N2} {P1} {P2} forall {P:plus M1 N1 P1} {E1:eq M1 M2} {E2:eq N1 N2} {E3:eq P1 P2} exists {Q:plus M2 N2 P2} true. 
We will need one of these “theorems” for every relation on natural numbers, which is annoying. At least these theorems are very easy to prove since the only way to establish equality uses the same variables, as seen in succdeterministic and succcancels.
Exercise 10 Write the proof of this latest lemma. □
We can use this latest lemma to finally prove our alternate associativity theorem:
 : plusassociative* X+Y=X’ X’+Z=Z’ Y+Z=Y’ X+Y’=Z’
< plusassociative X+Y=X’ X’+Z=Z’ Y’’ Y+Z=Y’’ X+Y’’=Z’ < plusdeterministic Y+Z=Y’’ Y+Z=Y’ eq/ eq/ Y’’=Y’ < plusrespectseq X+Y’’=Z’ eq/ Y’’=Y’ eq/ X+Y’=Z’. 
The order of arguments to plusdeterministic is important—if we reversed the first two parameters, we get the relation eq Y’ Y’’, which can’t be used directly in plusrespectseq. It can however be proved that “eq” is a symmetric relation:
%theorem
eqsymmetric : forall* {M:nat} {N:nat} forall {E:eq M N} exists {F:eq N M} true. 
Exercise 11 Write the (utterly trivial) proof of this theorem. □
If instead of using plusassociative to prove plusassociative*, we wanted to use plusassociative* to prove plusassociative, we have a different problem. (Of course, I’m not speaking about trying to prove each circularly using the other!) Suppose we try:
 : plusassociative X+Y=X’ X’+Z=Z’ Y’ Y+Z=Y’ X+Y’=Z’
< plusassociative* X+Y=X’ X’+Z=Z’ Y+Z=Y’ X+Y’=Z’. 
Then we get the following error:
Occurrence of variable Y’ in input (+) argument
not necessarily ground 
We have not bound Y’ and indeed if we somehow bind this to something, we also have not bound Y+Z=Y’ (an input parameter to the alternative associativity theorem).
But Y’ is simply the addition of Y and Z, so perhaps one can write:
 : plusassociative (X+Y=X’:plus X Y X’) (X’+Z=Z’:plus X’ Z Z’)
Y’ Y+Z=Y’ X+Y’=Z’ < plus Y Z Y’ < plusassociative* X+Y=X’ X’+Z=Z’ Y+Z=Y’ X+Y’=Z’. 
But this attempt falls afoul of the lack of modes for plus, and then the lack of proven totality for the moded addition relation. Even if these problems were fixed, we still wouldn’t have a relation Y+Z=Y’, despite the fact that we are using it right there! Essentially we run into a level problem: the relation is an active participant in the proof, not an object. I myself don’t fully understand the distinction.
In summary, totality of a relation (such as plus) must be proved at the metalevel (as a theorem) before it can be used at the metalevel (in a proof). Thus one writes a theorem:
%theorem
plustotal*: forall {N1:nat} {N2:nat} exists {N3:nat} {P:plus N1 N2 N3} true. 
(Such a theorem is known as an “effectiveness lemma” in official Twelfspeak.) The theorem must have the number N1 as an explicit parameter since the proof will use induction on it, and the syntax of %total (or %prove) requires the inductive parameter to be explicit. The proof of this theorem essentially repeats the definition of the relation, lifted to the metalevel:
%worlds () (plustotal* N1 N2 N3 N1+N2=N3).
 : plustotal* z N N plus/z.  : plustotal* (s N1’) N2 (s N3) (plus/s P) < plustotal* N1’ N2 N3 P. %total N1 (plustotal* N1 _ _ _). 
We can use an abbreviation to avoid the need to mention the numbers:
%abbrev plustotal = plustotal* _ _ _.

This abbreviation is more convenient to use, hence the asterisk at the end of the original theorem’s name. Now we can write our proof of plusassociative as follows:
 : plusassociative X+Y=X’ X’+Z=Z’ Y’ Y+Z=Y’ X+Y’=Z’
< plustotal Y+Z=Y’ < plusassociative* X+Y=X’ X’+Z=Z’ Y+Z=Y’ X+Y’=Z’. 
Here Y’ gets a binding implicitly from Y+Z=Y’ when it is bound by plustotal.
As mentioned previous, Twelf’s totality checker requires that induction be structural. The structural nature of the reduction may be obscured by equality. The connection can be made available to Twelf using a %reduces declaration:
%reduces X = Y (eq X Y).

Similarly, we wish to able to use induction on natural numbers. To start with, we need a definition of when one natural number is greater than another:
gt : nat > nat > type.
gt/z : gt (s M) z. gt/s : gt (s M) (s N) < gt M N. 
This definition works fine to define “greater than” but is not much use for defining reduction, because the definition doesn’t show that gt M N always ensures that N is a subterm of M. Instead we need a definition constructed so that reduction is easily provable:
gt : nat > nat > type.
gt/1 : gt (s M) M. gt/> : gt (s M) N < gt M N. %reduces M < N (gt N M). 
From this new definition, it is possible for Twelf to determine that the first argument structurally includes the second.
Furthermore, as with the totality of “plus” before, the fact that gt is reductive does not carry over to theorems that manipulate gt relations. Instead, we need to lift the relation to the metalevel and prove reduction and totality at this level. (Totality is needed so that metagt can be “called” in a proof.)
%theorem
metagt : forall {M} {N} {G:gt M N} true. %worlds () (metagt _ _ _).  : metagt (s M) M (gt/1 M).  : metagt (s M) N (gt/> G) < metagt M N G. %total M (metagt M _ _). %reduces M < N (metagt N M _). 
As before, we basically have to repeat the definition of the relation at the metalevel. (The theorem itself is not interesting since it has no output parameters. Totality depends merely on termination.)
Once we have a greaterthan operation, it seems useful to be able to state the fact that given any two natural numbers, either they are equal, or one is greater than the other. As noted before, Twelf has no notion of alternation (this or that), and thus the condition must be reified in a new term family:
comparison : type.
less : comparison. equal : comparison. greater : comparison. 
Then we define a relation that maps any pair of natural numbers to a comparison:
comp : nat > nat > comparison > type.
comp/equal : comp z z equal. comp/less : comp z (s _) less. comp/greater : comp (s _) z greater. comp/recurse : comp (s M) (s N) C < comp M N C. 
Next, we must prove a metatotality theorem that comp is total so that Twelf knows it will always return something:
%theorem
comptotal* : forall {M} {N} exists {CMP} {P:(comp M N CMP)} true. %worlds () (comptotal* M N R MRN). 
Exercise 12 Complete the proof. As usual, each proof case will basically repeat a relation case, at the metalevel. Define an abbreviation to omit all but the last parameter. □
Finally, we need theorems to indicate that if comp assigns equal, less or greater to a particular pair of integers, then they are eq, reverse gt or gt, respectively:
%theorem
greaterimpliesgt : forall* {M} {N} forall {C:comp M N greater} exists {G:gt M N} true. %theorem lessimplieslt : forall* {M} {N} forall {C:comp M N less} exists {G:gt N M} true. %theorem equalimplieseq : forall* {M} {N} forall {C:comp M N equal} exists {E:eq M N} true. 
In order to prove the first, we need two lemmas:
%theorem
succimpliesgtz: forall {M} exists {G:gt (s M) z} true. %theorem succpreservesgt: forall* {M} {N} forall {G1:gt M N} exists {G2:gt (s M) (s N)} true. 
Exercise 13 Prove these two lemmas each by induction on their first explicit parameter. □
Exercise 14 Prove greaterimpliesgt using the two lemmas. Then prove lessimplieslt similarly. Finally write the proof of equalimplieseq using succdeterministic. □
Twelf does not have a concept of negation and thus one normally doesn’t prove things by contradiction. But Twelf does make it possible to reason forward from a manifest contradiction. For example, we can write a proof that if 0 equals 1, then any two numbers are equal:
%theorem
falseimplieseq : forall* {M} {N} forall {X:eq z (s z)} exists {E:eq M N} true. %prove 2 {} (falseimplieseq _ _). %worlds () (falseimplieseq _ _). %total {} (falseimplieseq _ _). 
Notice that this proof has no cases, which is fine, since there is no case where zero is equal to one.
If one is going to frequently use proofbycontradiction, it is helpful to define a canonical “false” abbreviation:
%abbrev false = eq z (s z).

Then one needs a lemma for each relation that says one can generate an instance of it given this impossible relation.
Exercise 15 State and prove a theorem that “false” implies any addition fact one may wish. □
Normally, it’s not necessary to work with contradictions because the case analysis already rules out the contradictions. However, sometimes the contradiction is not apparent immediately from the cases. In such a situation, we need to derive a contradiction such as 0 = 1 and from there to use a theorem such as the one just “proved” to generate the required relations for the proof. For instance, case analysis makes it clear that 0 > X is impossible, but not that X > Y contradicts Y > X. An inductive proof is needed to show this.
Exercise 16 Write a proof of the theorem that gt is “antisymmetric,” that is, if we have gt X Y and gt Y X then we can derive false. Hint: the induction will have no base case. □
Why is an inductive proof with no base case valid? Because it’s proving a contradiction: the induction terminates, it just terminates in a contradiction.
The Twelf emacs mode works pretty well at pointing you to errors. Often, however, the first error is the only one to make sense. After fixing it, you can try again (using ControlC ControlD to load just the Twelf declaration the cursor is sitting on).
Here are some of the errors I have encountered with some suggested troubleshooting tips:
In this section, I briefly mention some more advanced topics. Because I myself have not had experience in these areas, I have little to say. The reader should consult the resources mentioned at the start of this document.
Twelf has the ability to check mutually inductive theorems—multiple call patterns can be specified in a %total declaration.
One of interesting aspects of Twelf is the ability to dynamically add relations during a “call” to another relation. Dynamic additions can be used in many places where one might use “assert” and “retract” in Prolog.
Higherorder syntax, in which function terms are represented by functions at the Twelf level avoids the complications of explicit environment objects, but requires “regular contexts” (nontrivial worlds).
Twelf is wellsuited for writing typing proofs but the user needs some adjusting to its various idiosyncracies. There are number of notable shortcomings (especially the fact that the theorem prover does not produce a proof) that should and will be addressed as soon as possible. There are a number of other improvements (such as polymorphism, and the ability to use totality or reduction information at the metalevel) that would make Twelf more convenient to use.