Using the PCC Infrastructure

On line usage: Demo

This section discusses how to use the online Demo.

The following picture gives an overview of the infrastructure.

Configuration

On the top-level page of the Demo you can configure the producer and the consumer side, by entering hostname and a directory for both components. By default, both sides are running on our main server. If you have downloaded a complete bundle of the demo, including the web pages, and installed them locally, adjust hostname and directories accordingly. Note, that Java (preferably j2sdk 1.4.1) has to be available both at the consumer and producer side. Camelot (with lfd_infer for inferring space consumption) has to be available on the producer side. Isabelle has to be installed at the consumer side.

Producer

On the producer-side top-level page you will find a list of available programs on the left hand side of the screen. Select one of these programs and you will see the code in a textarea. You can modify this code if you wish. Three operations are possible on the programs

  • Performing space inference. This will run the inference on the space consumption of the program and list the results as a sequence of extended types, one for each Camelot function. See the Section called How to read LFD types on how to read these types.

  • Compile to Grail. This compiles the Camelot program down to Grail code. This might be useful if you want to check how high-level Camelot functions are compiled down to JVM level. Note that Grail corresponds 1-1 to JVM code, but the syntax has more of a functional flavour than JVM code. See the Section called How to read Grail code on how to read Grail programs.

  • Compile to JVM. This will compile the Camelot program down to JVM code, also produce a certificate and bundle the whole things up in a .jar file. On the next screen you'll see the components of the .jar file and you can transmit it to the consumer.

Consumer

After having transmitted code and certificate, they will be unpacked on the consumer-side, and the first page gives some explanation on the contents of the received data. Out of the JVM class file, the Grail code is reconstructed, and translated in a form digestable by Isabelle, which acts as proof checker. Out of the data present in code and certificate, several files are constructed (we call them consumer-side certificates). The main one is an Isabelle theory file with a list of lemmas, one for each Camelot function, stating that the body of this function fulfills its assertion, which is encoded in the certificate as an LFD type.

To check the certificate, press launch at the bottom of the page. On the next page you will see Isabelle processing the input files. Each of the lemmas, checking the Camelot functions, is processed, and at the end possible an overall resource statement that encodes the heap space consumption of the overall program

Consumer (Safety Policy)

On the consumer-side top-level page it is possible to set the safety policy for the consumer. At the moment, this is restricted to setting an upper bound to the permitted amount of heap consumption. If the certificate sent with the code includes an overall resource statement, this statement will be checked against the upper bound of heap consumption specified on this page.

A Guided Tour

Choose the default configuration at the top-level page of the Demo and click at continue as producer to go to the producer-side top-level page . From the list of programs, pick InsSortW. You'll now see the Camelot code for insertion sort in an editable textarea. Here is the main part of the code that you should see:

type iList = !Nil | Cons of int * iList

let ins a l = 
	match l with 
	  Nil -> Cons(a,Nil)
	| Cons(x,t)@_ -> 
	    if a < x then Cons(a,Cons(x,t))
		else Cons(x, ins a t)

let sort l = 
  match l with 
    Nil -> Nil
  | Cons(a,t)@_ -> ins a (sort t)

You can read it pretty much as standard ML (or OCaml) code (see the Gentle Introduction to Camelot for more details). The main novel construct is the @_ annotation. It turns the match statement into a destructive match, where the constructor, that is being matched against, will be returned to a global freelist. Therefore, in the ins function above, the heap space of the matched cons-cell is free again when performing that branch of the match. In the recursive case in else branch, this space is re-used when building the Cons-cell as result, and in total no heap space is used. In the base case in the then branch, 2 Cons cells are built, to that the overall heap consumption of ins is one heap cell. In the top-level sort function, again a destructive match is used to free the current Cons cell so that the overall heap consumption of sort is 0.

The heap space consumption, as explained above, can be inferred automatically by the compiler using the [HoJo03]Hofmann-Jost type system. Pick the option Space Inference and click submit. You'll see the results of the inference as a sequence of LFD types, one type for each Camelot function:

...
ins        : 1, int -> iList[0|int,#,0] -> iList[0|int,#,0], 0;
..
sort       : 0, iList[0|int,#,0] -> iList[0|int,#,0], 0;

From this type the heap usage is read as follows. The ins function needs 1 heap cell up front plus 0 heap cells for each element of the input list (read from the constant 1 at the beginning and the 0 in the type of the second argument). After execution, 0 heap cells plus 0 heap cells for each element in the result list are left-over (read from the trailing 0 and the 0 in the result type). Note that from these annotations the sort function doesn't consume any heap space.

Now, what happens if we change the code to use a purely functional version of sort? Go back to the producer top level, and in the displayed code change sort as follows:

let sort l = 
  match l with 
    Nil -> Nil
  | Cons(a,t) -> ins a (sort t)

Now, the Cons is still in use after the match, and each recursive call to sort will use 1 heap cell, namely the one used up by the call to ins. Running the inference on this code reflects this behaviour by showing these types:

ins        : 1, int -> iList[0|int,#,0] -> iList[0|int,#,0], 0;
...
sort       : 0, iList[0|int,#,1] -> iList[0|int,#,0], 0;

Now, sort uses 1 cell for each cell of the input list.

Off line usage

This section discusses how to use the infrastructure in an off-line mode, ie. having after having downloaded and installed the software. Details on installation are discussed in the Section called Installing the PCC Infrastructure. In this form, the reasoning infrastructure, in particular the bytecode logic for Grail, and the heap space logic put on top of it, are available to build other components on top of them.

Prerequisites

The following software is necessary to use the infrastructure in an off-line fashion:

  • Isabelle2003: The formalistion of the bytecode logic for Grail, its soundness and completeness proofs, and the formalisation of the heap space logic in the form of derived assertions, has all been done in Isabelle/HOL, and is currently tied to the specific version of Isabelle2003.

  • MRG: This is an Isabelle/HOL heap images, which includes the precompiled bytecode logic. Strictly speaking this is not necessary (you could use the raw theory files instead) but without it every validation of a certificate would have to redo eg. the soundness proof of the logic and therefore be very slow.

  • Camelot 4.0: To automatically generated certificates on heap consumption out of the Camelot code, version 4.0 of the Camelot compiler is needed. It can be downloaded as binary RPM or as a source bundle from the main Camelot page.

A snapshot of the infrastructure is available here.

If you want to build the Camelot compiler from source you additionally need the following software:

  • Moscow ML (to build the Camelot compile from source)

  • OCaml (to build LFD module, only; OCaml page)

  • lp_solve (to build LFD module, only; download)

Basic Usage

The snapshot of the infrastructure contains a complete progs directory, which includes Camelot compiler, gdf/gf assembler and dis-assembler, as well as the Bytecode Logic. The environment variable MRGLOCALBINDIR is used as installation directory (e.g. ~/bin) for the binaries, and MRGISABELLE is used as Isabelle command (e.g. emacs). You can rebuild the entire toolset by typing

  make clean 
  make local 

After unpacking the programs, go to progs/BytecodeLogic/HP/tests/lists. In this directory you will find several example Camelot programs. You can run the whole process of compilation, certficate generation and validation on your local machine by typing make cons-<prg> where <prg> is the name of the program without the .cmlt extension. This will pop-up an instance of Proof General, the emacs-based interface to Isabelle/HOL. It will start with the main theory file that needs to be proven. Before starting the proof, enable the option Proof General -> Options -> X-symbols and pick the MRG in the pop-up menu you get from Isabelle/Isar -> Logics.

The example in this section is an in-place reverse-append. The Camelot source code for this program is

let rev0 l acc =
  match l with Nil -> acc
             |Cons(h,t)@_ -> rev0 t (Cons(h,acc))

Now you can step through the proof. You'll find a series of lemmas of the form

lemma rev0_Correct: " Context |> snd (methtable RevW rev0_) : SPEC rev0_" 
 by (W rev0_pdefs)
This says that the body for the Camelot function rev0 (snd (methtable RevW rev0_)) fulfills the specification found in the specification table (SPEC rev0_). The proof of this lemma is done by calling a generic proof tactic (W), based on the heap space logic, and parameterised with the funnction definitions for the code (rev0_pdefs).

Probably more insightful is the following proof of the same lemma, which unwinds the steps of the tactic and systematically applies the rules of the heap space logic. The following reference page for the MRG logics provides a succinct summary of the main rules. See also the Section called The Structure of a Certificate. The file containing this step-by-step proof is RevW_C1.thy.

lemma rev0_Correct: " Context |> snd (methtable RevW rev0_) : SPEC rev0_" 
apply (simp add: rev0_pdefs)
(* translate it into a statement in the MRG (heap space) logic *)
apply (rule MRG_sound)
apply (rule MRG_Weak)
(* from now on, syntax-directed rules with on-the-fly simplifications *)
apply CALL
apply (simp add: dmp_defs)
apply (DOMS rev0_pdefs)
apply (rule MRG_Leti)
apply LEAF
apply (rule MRG_If)
apply CALL
apply (simp add: dmp_defs)
apply (DOMS rev0_pdefs)
apply LEAF
apply CALL
apply (simp add: dmp_defs)
apply (DOMS rev0_pdefs)
(* now we have a let, generated from a (destructive) match against a list; 
   use the special rule instead of generic let rule *)
apply (rule MRG_ListMatchD)
apply GT
apply (rule conjI)
apply simp+
apply (rule MRG_Letr)
(* the Make method generates a heap cell; we have a special rule for it *)
apply (rule MRG_MakeList)
apply GT
apply (rule conjI)
apply simp+
(* meeting the recursive call here: use rule for InvokeStatic; 
   supplied Context must contain the invocation *)
apply (rule MRG_InvStat)
apply (simp add: ctxt_def)
apply (simp add: meth_defs)
apply (simp add: meth_defs)
apply (simp add: meth_defs)
apply (simp add: meth_defs)
apply (simp add: meth_defs)
apply (simp add: DOM_def)
apply (simp add: DOM_def)
apply simp
apply simp
apply (simp add: rev0_pdefs)
apply (tactic {* RENA_tac 1 *})
apply simp
(* now only trivial sub-goals over sets and nats are left *)
apply simp+
apply fastsimp
done

The main phases of the proof can be summarised as follows:

  • The rule MRG_sound is used to reduce the lemma on the bytecode logic to a lemma in the MRG (heap space) logic. No rules of the bytecode logic have to be used (the names of the rules start with vdm_).

  • In the rest of the proof syntax-directed rules of the heap space logic are applied (the names start with MRG_; short-cuts for rules with simplification are in all-caps). Most of the simplifications done on the fly use Isabelles standard simplification procedures, extended with definitions for the code and definitions of specification tables, context and dominates relation.

  • When meeting an expression, that has been generated our of a high-level match command, a special rule for this case is used (in the proof MRG_ListMatchD).

  • When meeting the recursive call, the rule for InvokeStatic is used, which requires a specification for the invocation to be present in the context. Its other sub-goals are solved with simplifications and applying a rule allowing for renaming of parameters.

  • The remaining subgoals are simple side-conditions over sets, which are solved via standard simplifications and one fastsimp.

How to read LFD types

An LFD type extends a usual Hindley-Milner type with annotations, representing information on the space consumption of the code. Most notably, this space consumption is not given as a closed form expression over the sizes of the input arguments. Rather, weights are attached to the constructors of the data type, which state how much is per constructor is needed to ensure that the program succeeds.

As an example, let's look at an in-place reverse-append function, which can be written in Camelot like this:

let revAppend l acc =
  match l with Nil -> acc
             |Cons(h,t) -> revAppend t (Cons(h,acc))

Inference of LFD types reports the following type for revAppend:

revAppend: 0, ilist[0|int,#,1] -> ilist[0|int,#,0] -> ilist[0|int,#,0], 0

The basic type, is the expected type of illist -> ilist -> ilist. The annotations attached to ilist, represent the weights to the constuctors in illist, Nil and Cons, respectively. Thus, the overall space consumption of this function is 1 for each Cons cell plus 0 for the final Nil plus 0 as constant space needed, irrespective of the input. In other words, if n is the length of the input list, the space consumption will be 1 * n + 0. The annotations in the result type represent the space left over after the execution, which is 0 for each Cons cell of the result list plus 0 for each Nil cell plus 0 as a constant. In this case no heap space becomes free. In other examples, intermediate data structures might be unused after finishing the function and would show up in the annotations of the result type.

We can turn the above revAppend function into an in-place function, by turning the match operation into a destructive match, that allows to free the Cons cell, after it has been matched against. This is realised by attaching the @_ annotation to the Cons in the match.

let revAppend l acc =
  match l with Nil -> acc
             |Cons(h,t)@_ -> revAppend t (Cons(h,acc))

With this change, the type inference can indeed determine 0 heap consumption.

revAppend  : 0, ilist[0|int,#,0] -> ilist[0|int,#,0] -> ilist[0|int,#,0], 0;

For a detailed discussion of the LFD types and the type inference process, see ...

How to read Grail code

Grail (Guaranteed Resource Allocation Intermediate Language) is an abstraction over the JVM, covering its main components. It's sufficiently abstract to also allow for translations into .NET or more specific virtual machines. Yet, it is close enough to JVM machine code to guarantee a 1-to-1 translation. Therefore, we can recover the Grail code from the JVM class file transmitted in a distributed environment.

Grail makes several restrictions on the structure of the code: programs must be in ANF (administrative normal form, also known as let-normal form), which means that arguments to functions, and headers in conditionals must be variables; methods in the JVM are handled as tail-recursive functions in Grail; basic blocks in the JVM are treated as functions in Grail, with a restricted parameter passing convention: parameters and arguments must have the same name; thus a Call in Grail amounts to a jump instruction; finally, all Grail functions have to be lambda-lifted.

The code below is an example of the translation of an in-place list-reversal function in Camelot to Grail.

method static List rev (List l, List acc) =
let fun f(List l, List acc) =         // Local function declaration
      let val tag = getfield l TAG    // Access to object discriminator
      in if tag = 0 then acc
         else f1(l, acc)              // Conditional function call
      end
    fun f1(List l, List acc) =        // Another local function    
      let val h = getfield l HD         
          val t = getfield l TL
          val _ = putfield l TAG 1   // RHS with side effect
          val _ = putfield l HD h
          val _ = putfield l TL acc
          val acc = l
          val l = t
      in f(l,acc) end                 // Tail call satisfying parameter condition 
in f(l, acc) end                      // Main expression

Note that the basic blocks have been turned into functions, with f1 representing the else branch of the main function. Because the argument names must match, explicit assignments have to be used for parameter passing. This, however, is only necessary for functions. For methods, the usual parameter passing conventions apply.

The Structure of a Certificate

In essence, the certificate contains information on the heap consumption of all Camelot functions, i.e. methods in the Grail code. LFD inference is used to infer the heap consumption where possible. Calling the camelot compiler with options -C -lfd will generate a file <prg>Certificate.thy. The main part of this file is the definition of SPEC, which maps Camelot functions to specifications in the form of derived assertions. A derived assertion is the formalisation of an LFD type as an assertion in the Grail logic. It has the form DAss U n C T m, where U is a set of variables used (the arguments to the function), T is the result type, n,m are the constants from the LFD types, i.e. the constants in the space needed/released for/after execution. C is a context mapping variables to enriched types, which contain weights for each constructor, e.g. ListET 0 1 is a list with weight 0 for Nil and weight 1 for Cons. The sum of the weights in the context plus n represents the space needed for execution.

As an example, below is the derived assertion for revAppend as it appears in the certificate:

{ {l_17_, acc_, 0, [(l_17_, ListET 0 0),(acc_, ListET 0 0)] >> (ListET 0 0), 0 }
A functional (non-destructive) version would assign type ListET to l_17_. As with LFD types, the result type is also annotated with weights, representing, together with the final constant, the heap space left-over after execution.

Miscellaneous Hints

If you want to see in more detail how the resource lemmas are proven, use tactic flavour 6 by inserting the following line in your Camelot source code:

(*# TACTIC_FLAVOUR: "6" #*)

If you want to produce an overall resource statement (a "wrapper"), which phrases the resource statement in a more comprehensible version, add the following line to the Camelot source code:

(*# CERTIFIED_METHOD: "InvokeStatic RevW rev_ [RNarg l1]" #*)
Note that this works only for single argument functions, and it doesn't use a generic tactic in the proof. So this step might fail although the resource bounds of all functions are met (you can check this by validating the code without the above pragma).

In some cases, when the lfd inference detects a "space leak" in the program, the tactic will most likely not work. In this case the compiler prints out a warning and the consumer will also print a warning on the first page.

To call the camelot compiler directly in progs/BytecodeLogic/HP/tests/lists use the following options (for the settings in the online infrastructure see Demo/Htmls/pcc/producer/configure-lionel.php):

../../../../Camelot/src/camelot -n -C -thy -lfd -lfd-prg ../../../../LFD_inference/lfd_infer -lfd-opts "-olhs 4"   InsSortM.cmlt
(-C for certificate generation, -thy for .thy file generation). The reconstruction of Grail and then .thy code, as well as generation of the consumer-side certificates is done with
../../../../Grail/gf/src/gf -L 0 -T 0 -F 4  -t mrg -C -q  InsSortM
(-F specifies the tactic flavour, taken from the pragma; -t mrg uses the default MRG logic; -C for certificate generation).