OpenKnowledge Design Support

Dave Robertson
4th November 2008

1 Introduction

The OpenKnowledge system makes knowledge more easily shared in open systems by always sharing knowledge in the context of a formal model of the interaction process that has stimulated the knowledge sharing. The language we have chosen to represent the interaction process is LCC, so LCC specifications have to be produced from somewhere. In the long term we want to make that production process as fast and straightforward as possible because, for the OpenKnowledge approach to flourish, it is necessary for large numbers of potentially useful interactions to be described. This document describes three ways of facilitating this: structured design (Section 2), interaction model analysis (Section 3) and connecting to other systems of design (Section 4). The solutions we present in each of these areas are not definitive or exhaustive (applied methods seldom are) but they demonstrate what can be done. For each of the methods in each area we summarise the idea in general and link to detailed technical reports (plus source code where appropriate) on the specific way in which we developed it. This provides a resource down to coding level for those who want to replicate or extend our efforts.

2 Structured Design

2.1 Structure Editing for LCC

The most direct way to assist in constructing LCC is by providing an editor in which editing operations are based on structural patterns that are meaningful in terms of the engineering of the specification. For example a common pattern for specifying recursive roles is as follows:
a(F(A1...An),X) ::
     ( then a(F(A1...An-1,AnN),X))
     or
     null <- 
where: F(A1...An) is a role definition with functor F and arguments A1...An; the new argument AnN is derived as a consequence of the earlier definition in the recursive part of the definition; and the base case is determined by some constraint, .

Patterns like the one above provide skeletal definitions for LCC specifications that can then be elaborated using further editing operations. This view of structured design is similar to the idea of techniques editing for logic programs. A basic structure editor for LCC is described in detail, along with examples of the system in operation, in our technical report on techniques editing.

Structure editors of the sort described above provide assistance in design but assume that those being assisted are interested in manipulating the target language (in our case LCC) directly; the engineer is always aware that he or she is working on a LCC specification. There are other forms of editing where this need not be the case and in the next section we consider one of those.

2.2 Finite State Based Editing for LCC

LCC is a process language in which the definition of a process orders message passing events. Consequently, there is no explicit representation in LCC syntax of the space of states that can be encountered in an interaction; this space can be inferred from a LCC specification, rather than being directly described by it. Some systems of design, however, start from a finite state model of interactions, in which the different states of the interaction are explicitly represented and events in the interaction appear as transitions between these states. One such style of specification in the multi-agent systems community is that of electronic institutions. In an electronic institution the nodes represent the different states of the conversation and the directed arcs connecting the nodes are labelled with the actions that make the scene state evolve. The Electronic Institutions Development Environment (EIDE) is a tool for describing electronic institutions, with a translator that automatically produces LCC from the electronic institution specification. This means that designers who prefer a state-oriented rather than a process-oriented view of interaction can still contribute to OpenKnowledge.

The mechanism for translation from electronic institutions in EIDE to LCC is described in Section 5 of the OpenKnowledge report on visualiser components and visual authoring tools. The basic idea is that definitions for the sequencing in LCC definitions of roles correspond to traces through the finite state machine of the electronic institution. This translation does not preserve some of the distinctions made in an electronic institution model because electronic institutions have, as primitive, a concept of scene composition that is absent (for reasons of parsimony) in LCC. The relationship between electronic institutions and LCC is described in detail in the OpenKnowledge report on Ambient LCC.
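The correspondence between traces through a finite state machine and sequencing in a role definition can be illustrated with a minimal Python sketch. This is not the EIDE translator: the scene, its state names and its arc labels are hypothetical, loops are simply skipped, and scene composition is ignored. Each path from the initial to the final state becomes a "then" sequence, and alternative paths are joined with "or".

```python
from typing import Dict, Iterator, List, Tuple

# A scene as a finite state machine: state -> [(arc label, next state)].
FSM = Dict[str, List[Tuple[str, str]]]


def paths(fsm: FSM, state: str, final: str,
          seen: Tuple[str, ...] = ()) -> Iterator[List[str]]:
    """Yield the arc labels along every loop-free path from state to final."""
    if state == final:
        yield []
        return
    visited = seen + (state,)
    for label, nxt in fsm.get(state, []):
        if nxt in visited:  # assumption: loops are skipped in this sketch
            continue
        for tail in paths(fsm, nxt, final, visited):
            yield [label] + tail


def to_lcc_body(fsm: FSM, initial: str, final: str) -> str:
    """Render each path as a 'then' sequence and join the paths with 'or'."""
    return " or ".join(
        "( " + " then ".join(path) + " )" for path in paths(fsm, initial, final)
    )


# Hypothetical scene, seen from the point of view of a buyer role.
SCENE: FSM = {
    "s0": [("offer => a(seller, S)", "s1")],
    "s1": [("accept <= a(seller, S)", "s2"),
           ("reject <= a(seller, S)", "s2")],
}

BODY = to_lcc_body(SCENE, "s0", "s2")
print(BODY)
```

Enumerating whole paths duplicates shared prefixes (the offer appears in both alternatives); a real translator would be expected to factor these, which this sketch does not attempt.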

2.3 Generating OpenKnowledge Components (Groovy)

In Sections 2.1 and 2.2 we described systems to support the construction of LCC specifications. In order for interactions to do useful work, however, it is necessary to supply OpenKnowledge components (OKCs) that can be used by peers to satisfy the constraints in an interaction. Although component design inevitably involves some application programming, we can make this programming simpler by supplying a higher level, Java-compatible language targeted at component design. Groovy is an agile and dynamic language for the Java Virtual Machine. It builds upon the strengths of Java but has additional power features inspired by Python, Ruby and other scripting languages. Groovy also increases developer productivity by reducing scaffolding code. Because it integrates seamlessly with all existing Java objects and libraries, Groovy can be used to create OKCs that run within the OpenKnowledge kernel. Moreover, Java developers are able to use Groovy with an almost-zero learning curve, and it is also relatively easy for novice programmers to learn and use, which enlarges the group of programmers able to write OpenKnowledge components themselves. To demonstrate Groovy in use for component definition we applied it to one of our bioinformatics service coordination examples. The technical details of this (along with an illustrative example) are in our technical report on Groovy.

3 Interaction Model Analysis

Like specifications in all sophisticated process languages, LCC specifications can be complex, so it is not always easy for the designer of an interaction to be certain that the specification generates the interaction that he or she had in mind when writing it. To raise confidence that it does behave as intended it is useful to be able to explore the LCC behaviours prior to deploying the interaction on the OpenKnowledge system. There are numerous ways of doing this but here we explore three of these: the first is simulation via trace generation from the LCC specification; the second is temporal property checking of these traces; the third is the inclusion of a (virtual) real-time environment in the simulation.

3.1 Generating Behavioural Traces (Meta-Interpretation)

LCC is an executable specification language and the style of execution of LCC in the OpenKnowledge kernel is based on the idea of unfolding the clauses of each peer's role definition as a means of representing change in the state of the interaction. Although the kernel is Java based for portability, Prolog is a more elegant language in which to describe unfolding. For this reason the behavioural trace generator (and the other related LCC mechanisms in Section 3) are implemented as Prolog meta-interpreters. A detailed description of the process of unfolding to generate a trace is given in the LCC operational semantics definition. The basic idea, however, is that the meta-interpreter "walks" through the role definitions, sending messages in sequence and simulating concurrency via non-deterministic choice. For example, the LCC interaction model:
a(r1, X) ::
    ( m1 => a(r2, Y) or m2 => a(r2, Y) ) then
    M <= a(r2, Y).

a(r2, Y) ::
    ( m1 <= a(r1, X) then m3 => a(r1, X) ) or
    ( m2 <= a(r1, X) then m4 => a(r1, X) ).
would be capable of generating the following two traces for peer p1 in role r1 and peer p2 in role r2:
[m(p1, m1 => a(r2, p2)), m(p2, m1 <= a(r1, p1)), m(p2, m3 => a(r1, X)), m(p1, m3 <= a(r2, p2))]
[m(p1, m2 => a(r2, p2)), m(p2, m2 <= a(r1, p1)), m(p2, m4 => a(r1, X)), m(p1, m4 <= a(r2, p2))]
where each element of the trace above is either a message, M, being sent from peer S to peer R (m(S, M => R)) or is a message being received by peer R from peer S (m(R, M <= S)). The source code for generating traces from LCC specifications (along with some example specifications) can be downloaded as a zipped folder. This sort of simulator is useful both for exhaustively exploring the state space for interactions (a topic of the following section) and for running multiple simulations of interactions with random selections of events at choice points in the trace generation. This latter method was used in developing our peer rank algorithm (used in our bioinformatics testbed) because it gave us a way of rapidly running thousands of interactions in concert with the peer rank reputation mechanism without having to set up a much more complex (and less easily controlled) test harness for the OpenKnowledge kernel. The peer rank algorithm was eventually provided as a service for the kernel, but only after this initial testing phase. A description of peer ranking and some of the simulation results appears in our technical report on peer rank simulation. The full source code for the peer rank simulator, plus test interactions, can be downloaded as a zipped folder.
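The idea of unfolding role definitions to enumerate traces can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions, not the Prolog meta-interpreter: constraints are omitted, each role is played by exactly one peer, a send and its matching receive are recorded together, and the class and function names are hypothetical. Unlike the traces printed above, the sketch always writes the bound peer identifier in place of a role variable.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Send:
    msg: str
    role: str            # role of the intended recipient


@dataclass(frozen=True)
class Recv:
    msg: Optional[str]   # None stands for an unbound message variable
    role: str            # role of the expected sender


@dataclass(frozen=True)
class Then:
    first: object
    rest: object


@dataclass(frozen=True)
class Or:
    left: object
    right: object


DONE = None  # residual of a completed definition


def unfold(term):
    """Yield (event, residual) for every event the term can perform next."""
    if isinstance(term, (Send, Recv)):
        yield term, DONE
    elif isinstance(term, Then):
        for event, residual in unfold(term.first):
            yield event, (term.rest if residual is DONE
                          else Then(residual, term.rest))
    elif isinstance(term, Or):
        yield from unfold(term.left)
        yield from unfold(term.right)


def traces(state, roles, trace=()):
    """Enumerate complete traces; state maps each peer to its residual term."""
    if all(term is DONE for term in state.values()):
        yield list(trace)
        return
    for sender, s_term in state.items():
        for s_event, s_rest in unfold(s_term):
            if not isinstance(s_event, Send):
                continue
            for receiver, r_term in state.items():
                if receiver == sender or roles[receiver] != s_event.role:
                    continue
                for r_event, r_rest in unfold(r_term):
                    if (isinstance(r_event, Recv)
                            and r_event.role == roles[sender]
                            and r_event.msg in (None, s_event.msg)):
                        nxt = dict(state)
                        nxt[sender], nxt[receiver] = s_rest, r_rest
                        step = (
                            f"m({sender}, {s_event.msg} => "
                            f"a({roles[receiver]}, {receiver}))",
                            f"m({receiver}, {s_event.msg} <= "
                            f"a({roles[sender]}, {sender}))",
                        )
                        yield from traces(nxt, roles, trace + step)


# The two role definitions from the interaction model above.
R1 = Then(Or(Send("m1", "r2"), Send("m2", "r2")), Recv(None, "r2"))
R2 = Or(Then(Recv("m1", "r1"), Send("m3", "r1")),
        Then(Recv("m2", "r1"), Send("m4", "r1")))

ALL_TRACES = list(traces({"p1": R1, "p2": R2}, {"p1": "r1", "p2": "r2"}))
for t in ALL_TRACES:
    print(t)
```

Replacing the exhaustive loops with a random pick among the enabled send/receive pairs would give the randomised simulation mode mentioned above.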

3.2 Checking Temporal Properties of Interactions (Tabled Resolution)

In Section 3.1 our concern was to be able to generate individual traces corresponding to a permitted behaviour in an interaction specification. Sometimes, however, we are interested in knowing whether some temporal property can occur across the space of all interaction behaviours (for example if a particular message is always eventually followed by some other particular message; or if a given sequence can never occur). A basic temporal property checker that utilises a trace generator, as described in Section 3.1, applied to business process examples is described in our technical report on constraint verification. This provided a way of checking a number of properties of traces, each of which is given formally in Section 3.5 of our technical report.

Although our basic temporal property checker can analyse important properties of LCC interaction specifications, it explores the search space of possible interaction traces using a standard Prolog search strategy. This limits the efficiency of search space exploration because it involves a high proportion of redundant search. In OpenKnowledge we addressed this problem by using a Prolog system based on tabled resolution: XSB. This allowed us to perform more complex forms of property checking on larger LCC specifications. More surprisingly, it allowed us to perform limited but useful forms of property checking in only a few seconds of real time, which makes it possible to check LCC interaction models not only in advance of their deployment (the traditional approach) but also, in some circumstances, during their deployment. This provides a novel form of trust-related verification. The XSB-based property checker is described in detail in our technical report on runtime verification of trust models and the code used to implement it can be downloaded as source code.
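The two example properties mentioned in this section (one message always eventually followed by another, and a sequence that can never occur) can be checked over an enumerated set of traces with a short Python sketch. It illustrates the naive, enumerate-then-check style only, not the tabled XSB checker; the function names are hypothetical, traces are reduced to lists of message names, and "sequence" is assumed to mean events in the given order with other events possibly interleaved.

```python
from typing import Callable, Iterable, Sequence


def always_eventually_followed(trace: Sequence[str],
                               first: str, later: str) -> bool:
    """True if every occurrence of first has a subsequent occurrence of later."""
    pending = False
    for event in trace:
        if event == first:
            pending = True
        elif event == later:
            pending = False
    return not pending


def occurs_in_order(trace: Sequence[str], pattern: Sequence[str]) -> bool:
    """True if the pattern's events appear in the trace in the given order."""
    remaining = iter(trace)
    # Each any() consumes the iterator up to and including its match.
    return all(any(event == wanted for event in remaining)
               for wanted in pattern)


def holds_on_all(traces: Iterable[Sequence[str]],
                 check: Callable[[Sequence[str]], bool]) -> bool:
    """A property holds for the interaction if it holds on every trace."""
    return all(check(trace) for trace in traces)


# Messages sent in the two traces of the Section 3.1 example.
TRACES = [["m1", "m3"], ["m2", "m4"]]

M1_THEN_M3 = holds_on_all(
    TRACES, lambda t: always_eventually_followed(t, "m1", "m3"))
M1_THEN_M4 = holds_on_all(
    TRACES, lambda t: always_eventually_followed(t, "m1", "m4"))
NEVER_M1_M4 = not any(occurs_in_order(t, ["m1", "m4"]) for t in TRACES)

print(M1_THEN_M3, M1_THEN_M4, NEVER_M1_M4)
```

Checking every trace separately repeats work wherever traces share a prefix, which is the kind of redundancy that motivates a tabled approach.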

3.3 Virtual Environments (Unreal Tournament)

All of the analytical methods described so far in this section assume that the environment on which interactions are run need not be modelled as part of the analysis, other than in terms of constraints satisfied by the LCC interaction model (or, as in Section 3.2, by a combination of LCC and service specifications). In some cases, however, we are interested in detailed simulation of environments. This is especially the case when we want to assess performance of LCC as a coordination medium in real-time systems, where response times in a rapidly changing environment are of utmost importance. To perform these sorts of analyses one needs a simulator for the dynamic environment. A popular source of this sort of simulator is the computer gaming world, where commercial success has depended on providing semi-realistic environments in which to play. One of the standard gaming simulators is Unreal Tournament (UT), which is a popular gaming environment in its own right but also provides an accessible game engine that can be used by developers to introduce automated game playing agents ("bots" in UT jargon) into the game. It also provides a rich source of complex virtual environment topologies, courtesy of the environment design community that has built up around the game. We have built a means of linking a LCC interpreter to UT-bots so that LCC can be used to coordinate message passing between them. This allows us to use LCC to define collaborative strategies for game playing which we then test by playing teams of coordinated "LCC-enabled" bots against teams of individually superior but uncoordinated conventional bots. We have been able to produce remarkably fluid and effective team play by this means. A description of our most recent work with the Unreal Tournament environment is available as a Quicktime movie with a set of accompanying notes. The second half of this video shows the environment in action (note that the movie is a comparatively large file, 75MB, so may take several minutes to download). The bots that you see in this video are highly autonomous, using machine learning algorithms fed by data from the environment to develop their individual behaviours as the game proceeds. The LCC being used to coordinate them is simple and reactive (in fact the system uses a reactive subset of LCC for speed of response) as can be seen from the accompanying notes. In the long term, we hope that this sort of architecture could develop into a framework for behaviour-based software system development analogous to the subsumption architectures used to combine behavioural modules in robotic systems.

4 Connecting to Other Systems of Design

LCC is as lightweight and parsimonious as we can make it. This does not mean, however, that it is straightforward for every engineer to use. Application domains develop engineering cultures that often take as a focus particular styles of design, supported by task-specific design notations. Thousands of these have developed and many continue to be invented, so it is impossible to catalogue exhaustively the relationship of all these to LCC. Instead, we give examples below of ways to connect OpenKnowledge to established systems of design. First, in Section 4.1, we demonstrate the most direct route, via translation from a more traditional language to LCC. Then, in Section 4.2, we consider the case where the traditional language to which we wish to connect is providing a different functionality from that of LCC, so extension of LCC is required to embrace it. In Section 4.3 we explain the more radical alternative of writing, in LCC, an interpreter for an established language. Finally, in Section 4.4, we discuss the most radical step of all - to replace LCC in the OpenKnowledge kernel with an alternative process language.

4.1 Translation to LCC From Established Languages (UML, SCUFL)

4.1.1 Translation to LCC from UML Activity Diagrams

The LCC coordination calculus lends itself well to display in a graphical form. Its concepts of participants, messages and constraints are similar to those of the UML activity diagram, which has partitions, flows and activities. As UML is a common tool for the design of software and is well used in industry, providing a conversion between UML diagrams and LCC allows us to minimise the intellectual cost of joining the OpenKnowledge network for many industry software houses and developers.

UML editors are relatively common, yet they tend to focus less on data flow than data design. There are very few open source UML editors and those we found did not provide activity diagram support. Commercial products would be very hard to augment for LCC output without considerable cost and/or reverse engineering. So, as one of the support tools for the OpenKnowledge project we have implemented a basic UML activity diagram editor that has the option of outputting LCC code. The editor itself has been designed in a modular, extensible way so that other open-source developers could extend it such that it would handle other diagram types. However, we have, for now, only implemented the basic UML activity diagrams.

A screenshot of the editor appears below. Various node types are displayed on the left of the editor and they are added to the diagram by clicking their appropriate button. Once in the editing area, they can be dragged around and their properties edited by double clicking.

[Figure: UML diagram example]

The conversion is achieved by searching for the initial node (the black dot) and tracing the graph through to the final node (the black dot with the outer circle). Transitions are considered LCC sequence ("then") statements unless they cross the partition swimlanes, in which case they are rendered as message sending and receiving. Activities are translated as constraint satisfactions. The branch nodes are rendered as "or" branches in the LCC.

The LCC is rendered by traversing the graph and outputting the appropriate LCC for the node encountered. The LCC is buffered so that when "or" nodes are encountered, the LCC can be re-written. Back-tracing of the graph is done at various points to ensure that activities that occur after branches in other participants are rendered correctly. The LCC below shows the LCC export for the diagram shown in the UML diagram above:

// -----------------------------------------------------------------------------
// LCC File generated by UML-to-LCC exporter.
// test.lcc
// 24/10/2008
// -----------------------------------------------------------------------------
r( participant1, initial )
r( participant2, necessary )

// ============================================================ 
a( participant1, ID ) :: 
(
	null <- activity1() then
	msg() => a( participant2, Participant2ID ) then
	msg() <= a( participant2, Participant2ID ) then
	null <- activity3()
)
or
(
	null <- activity2a()
)

// ============================================================ 
a( participant2, ID ) :: 
	msg() <= a( participant1, Participant1ID ) then
	null <- activity2b() then
	msg() => a( participant1, Participant1ID )
The implementation of the basic UML editor and translator to LCC can be downloaded as source code.
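The conversion rules described in this section (activities become constraints, transitions within a partition become "then", transitions across swimlanes become a message send and a matching receive) can be sketched in Python for the simplest case. This is not the exporter's code: it is a hedged illustration that assumes a purely linear flow with no branch nodes, and the function name and the list-of-pairs input format are hypothetical.

```python
from typing import Dict, List, Tuple


def export_lcc(flow: List[Tuple[str, str]]) -> Dict[str, str]:
    """Translate a linear activity flow into one LCC clause per partition.

    flow lists (activity name, partition name) pairs in the order in which
    they are met when tracing from the initial node to the final node.
    """
    steps: Dict[str, List[str]] = {}
    for i, (activity, lane) in enumerate(flow):
        # An activity is rendered as a constraint to be satisfied.
        steps.setdefault(lane, []).append(f"null <- {activity}()")
        if i + 1 < len(flow):
            next_lane = flow[i + 1][1]
            if next_lane != lane:
                # A transition crossing a swimlane becomes a send on one
                # side and a matching receive on the other.
                steps[lane].append(
                    f"msg() => a( {next_lane}, {next_lane.capitalize()}ID )")
                steps.setdefault(next_lane, []).append(
                    f"msg() <= a( {lane}, {lane.capitalize()}ID )")
    # Steps within one partition are sequenced with "then".
    return {
        lane: f"a( {lane}, ID ) ::\n\t" + " then\n\t".join(body)
        for lane, body in steps.items()
    }


# The main (non-branching) path of the diagram exported above.
FLOW = [
    ("activity1", "participant1"),
    ("activity2b", "participant2"),
    ("activity3", "participant1"),
]

CLAUSES = export_lcc(FLOW)
for clause in CLAUSES.values():
    print(clause)
    print()
```

On this input the sketch reproduces the participant2 clause and the first alternative of the participant1 clause in the listing above; handling branch nodes would additionally require the buffering and back-tracing described earlier.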

4.1.2 Attaching to Existing Design Systems via Translation (Taverna)

One of the main testbeds for the OpenKnowledge system is in bioinformatics. Although this domain of application is comparatively new, there already exist accessible design tools for bioinformatics workflow, especially in Grid systems. One of the best known systems is Taverna (although there are others, such as Kepler and Triana). The Taverna system provides a visual editor for describing workflows to be enacted on a Grid system and produces a specification of the workflow in the SCUFL process language. For the sector of the bioinformatics community interested in using Taverna, the easiest way to connect to the OpenKnowledge system would be to continue to use their familiar design tools but, instead of using a Grid system for workflow enactment, to use the OpenKnowledge system. To make a switch to OpenKnowledge as straightforward as possible for this community we built an automatic translator from SCUFL to LCC. This does not disrupt in any way the current methods of use of the Taverna system (which we assume are well honed to the Taverna community) but simply provides an additional step beyond the traditional endpoint of Taverna specification (which is a workflow specification in SCUFL) to LCC. The SCUFL to LCC translator is described in detail in our technical report and the Java code used to implement it can be downloaded as source code.

4.2 Connecting LCC to Compatible Languages With Different Functionality (OWL-S)

In the previous section our approach to integrating with an existing design system was automatic translation, with the aim of altering established design practice as little as possible. Some systems of design, however, are oriented to a different problem than the one being addressed by OpenKnowledge and then the issue becomes one of complementarity: establishing connections such that the two systems can be used together to mutual advantage. An example of this is the OWL-S language for semantic web service specification.

OWL-S is essentially a typed language for specifying input/output interfaces to Web services. Its use of OWL as a type language gives it a strong connection to semantic web efforts. OWL-S, however, deliberately avoids prescribing any language in which the processes to choreograph services might be specified (so as to remain neutral to choices in choreography language). Conversely, LCC deliberately avoids commitment to a service specification language (so as to remain neutral to service specification infrastructure). LCC and OWL-S therefore tackle service choreography from different perspectives.

As an experiment in combining LCC and OWL-S, we built a prototype discovery system for services when enacting LCC interaction models. This involved adding type annotations to constraints in LCC interaction models and, from these annotations, automatically extracting service descriptions that could be matched to OWL-S service specifications using a Description Logic (DL) reasoner. The relationship between OWL-S and LCC is described in detail in our technical report and the Prolog code used to implement this OWL-S based discovery system for LCC (and the DL reasoner) can be downloaded as source code.

4.3 Writing an Interpreter for an Established Language in LCC (BPEL4WS)

In Section 4.1 we gave an example of translation as a means of linking LCC to a task-specific design system. In Section 4.2 we gave an example of connecting to a complementary task-specific language via annotations in LCC. We now introduce a third way to bring a task-specific language into the sphere of OpenKnowledge: by writing an interpreter for the language in LCC.

Although it is unconventional to use one protocol language as an interpreter for another, a similar idea - that one can use a declarative language to write a meta-interpreter for another language - is quite conventional in declarative programming. As an example, we chose the Business Process Execution Language for Web Services (BPEL4WS) which is an industrially used language for specifying business interaction protocols. Instead of writing a translator from BPEL4WS to LCC (the route we chose with UML and SCUFL in Section 4.1) we wrote a protocol in which the principal role is to act as a BPEL4WS interpreter. A BPEL4WS specification is given as a parameter to this role and enacting the role interprets the BPEL4WS specification to produce appropriate message passing and invocation of services. The LCC specification needed for this is, of course, complex but for BPEL4WS users this complexity is no more apparent than in any other system for enacting BPEL4WS. Conversely, from an LCC point of view, the BPEL4WS interpreter is just a normal (though complex) interaction model.

The LCC interpreter for BPEL4WS is described in detail in Chapter 5 of our technical report on enacting decentralised workflow and the same source also demonstrates, in Chapter 4, the more conventional route of translation from BPEL4WS to LCC.

4.4 Replacing LCC with an Established Language (WS-BPEL)

OpenKnowledge chose LCC as its core language because it provided a parsimonious yet powerful language with strong links to more abstract, generic process specification and declarative languages. Nevertheless, we have always recognised that other process languages could have been substituted for LCC and we have made the OpenKnowledge kernel system as independent as we could from specific choice of core language. This raises the possibility, for those who prefer a core language in some style other than LCC, to replace LCC with a process language of choice. We have demonstrated this with the Web Services Business Process Execution Language (WS-BPEL).

This approach to integrating OpenKnowledge with other design systems requires much deeper understanding of the OpenKnowledge system, since it requires the LCC interpreter in the OpenKnowledge kernel to be replaced with an interpreter for a different language. Once done, however, it allows the more traditional process language (in our case WS-BPEL) to take advantage of the peer to peer discovery methods and other OpenKnowledge infrastructure. For those deeply committed to a non-LCC process specification language, this could be a more attractive long-term option than translation or meta-interpretation. Our experience of replacing LCC with WS-BPEL is described in our technical report.