Enhance project logo

The Enhance project

Enhancing the Performance Predictability of Grid Applications with Patterns and Process Algebras

Anne Benoit, Murray Cole, Stephen Gilmore, Jane Hillston and Gagarine Yaikhom

Overview

One of the most promising technical innovations in present-day computing is the invention of Grid technologies which harness the computational and storage power of widely distributed collections of computers. Grid technologies are breaking new ground in e-Science where scientists across the globe can collaborate and share data sets, processing power and specialised scientific instruments to accelerate the pace of scientific development. Grid technologies are also opening new doors in e-Business where small or medium-scale businesses can now have access to supercompting power which was once the province of only the most wealthy banks and multi-nationals. Data and resource sharing schemes such as these are changing the basic ground rules of computing.

All such schemes raise difficult issues of resource allocation and scheduling (roughly speaking, how to decide which computer does what, and when, and how they interact) which are made all the more complex by the inherent unpredictability of resource availability and performance. For example, I may forget to switch on my PC, a supercomputer may be required for a more important task, the internet connections I need may be particularly busy.

The Enhance project aims to simplify the effective programming of such systems by exploiting and synthesizing results from two underlying research programmes. Stochastic process algebras such as PEPA are used to model the behaviour of concurrent systems in which some aspects of behaviour are not precisely predictable. Pattern based programming recognizes that many real applications draw from a range of well known solution paradigms and seeks to make it easy for an application developer to tailor such a paradigm to a specific problem without re-inventing the wheel. The choice of a particular paradigm carries with it considerable information about implied scheduling dependencies. By modelling these with PEPA, and thereby being able to include aspects of uncertainty which are inherent to Grid computing, we believe that we will be able to underpin systems which can make better scheduling and dynamic rescheduling decisions than less sophisticated approaches.

About the project

The vision of Grids as both tools and enabling fabrics for distributed collaborative computation continues to excite applications designers and challenge the Computer Science research community. While some of the issues raised are familiar from previous environments in distributed and parallel computing, many are new.

This project seeks to address the allied problems of programmability, cost predictability and scheduling by augmenting a component based Grid HPC framework with a mechanism for constructing applications as instantiations of pre-defined application patterns. Each pattern will encapsulate the logical behaviour of a familiar pattern of multi-component interaction, at a level of abstraction which simplifies the developer's task, without over constraining the selected implementation. The definition and manipulation of semantic and performance behaviour of components and patterns will be facilitated by the use of a performance oriented process algebra as a modelling tool.

Underpinning an application development process of the form illustrated below, our key insights are:

Application development

Ultimately, by construction of a prototype and experimentation with a number of case studies, we will demonstrate that this approach allows simple, intuitive construction of applications, and facilitates run-time scheduling and rescheduling with an effectiveness which surpasses that of less constrained methodologies.

(More information about Enhance.)

Case studies

In the Enhance project we have undertaken several case studies to check the practical usefulness of our methods. One of these relates to the on-line scheduling of a numerically-intensive algorithm which uses the Mean Value Analysis algorithm to compute metrics of a queueing network. The algorithm is a pipeline-structured parallel routine implemented using C, MPI and the Edinburgh skeletons library.

Stages in the pipeline

We sample the CPU availability and inter-node TCP communication latency of our Beowulf compute cluster, using the measurement facilities of the Network Weather Service. We then use the measurements so obtained to parameterise our PEPA model of a pipeline-structured parallel application.

The PEPA model

We solve the PEPA model using the PEPA Workbench, and derive performance metrics from the result which predict which will be the worst allocation of processes to processors, and which will be the best. By basing our performance predictions on measurements of network latency and compute load we are able to predict ahead-of-run those scheduling decisions which will lead to the shortest run-time and those which lead to the longest runtime. We then run the MVA application on the cluster and measure the run-times.

A graph of the best and worst mappings

In the present study we did not find a single instance where the performance predictions computed by the PEPA Workbench were misleading: predicted best cases always significantly outperformed predicted worst cases in practice. Strategic use of these methods will reduce the run-times of these parallel codes overall, making better use of heavily-loaded compute clusters.

Members of the Enhance project

Publications by Year

2004

2005

2006

Talks and presentations

Software

Funding

The Enhance project was funded by the Engineering and Physical Sciences Research council grant number GR/S21717/01. The project commenced on June 1st 2003 and ended on March 1st 2007.